AthenaeumRawTask — Developer Reference

Developer reference for AthenaeumRawTask: constructor parameters, usage examples, and implementation details.

Constructor Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| name | Optional[str] | Custom name for the task |
| job_context | JobContext | Context with Spark session and configuration |
| sync_dates | Optional[List[str]] | Specific dates to sync (YYYY-MM-DD format) |
| skip_ingestion | bool | Skip ingestion and only run crawler (default: False) |

Example Usage

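    from etl_lib.tasks.chains.athenaeum.AthenaeumRawTask import AthenaeumRawTask

    # write_to_catalog is not a named constructor parameter;
    # it is passed through **kwargs to the parent task.
    task = AthenaeumRawTask(
        job_context=job_context,
        write_to_catalog=True
    )
    task.run()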

Sync Specific Dates

    task = AthenaeumRawTask(
        job_context=job_context,
        sync_dates=["2024-01-15", "2024-01-16"]
    )
    task.run()
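
When the dates to sync form a contiguous range, the list can be generated rather than typed out. The sketch below uses a hypothetical helper, `date_range_strings`, which is not part of etl_lib. It assumes only what the parameter table states: that sync_dates takes a list of strings in YYYY-MM-DD format.

```python
from datetime import date, timedelta
from typing import List


def date_range_strings(start: date, end: date) -> List[str]:
    """Return every day from start to end (inclusive) as YYYY-MM-DD strings.

    Hypothetical helper for illustration; not part of etl_lib.
    """
    if end < start:
        raise ValueError("end must not be earlier than start")
    days = (end - start).days
    # date.isoformat() produces the YYYY-MM-DD form
    return [(start + timedelta(days=offset)).isoformat() for offset in range(days + 1)]


sync_dates = date_range_strings(date(2024, 1, 15), date(2024, 1, 16))
print(sync_dates)  # ['2024-01-15', '2024-01-16']
```

The resulting list could then be passed as sync_dates=sync_dates when constructing the task.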

Skip Ingestion

    task = AthenaeumRawTask(
        job_context=job_context,
        skip_ingestion=True
    )
    task.run()

Implementation Details

Subtask Initialization

    def __init__(self, *, job_context, sync_dates=None, skip_ingestion=False, **kwargs):
        subtasks = []

        # Conditionally add ingester
        if not skip_ingestion:
            subtasks.append(
                AthenaeumIngesterTask(
                    job_context=job_context,
                    sync_dates=sync_dates,
                    write_to_catalog=False
                )
            )

        # Always add crawler
        subtasks.append(
            AthenaeumCrawlerTask(
                job_context=job_context,
                write_to_catalog=False
            )
        )

        super().__init__(
            job_context=job_context,
            subtasks=subtasks,
            **kwargs
        )
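
The conditional composition above can be tried without Spark or etl_lib. The following is a minimal sketch that swaps the real subtasks for a stand-in: `StubTask` and `build_subtasks` are hypothetical names introduced here for illustration, and only the branching logic mirrors the constructor.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StubTask:
    """Stand-in for a real subtask; it only records how it was configured."""
    name: str
    sync_dates: Optional[List[str]] = None


def build_subtasks(sync_dates: Optional[List[str]] = None,
                   skip_ingestion: bool = False) -> List[StubTask]:
    """Mirror the branching in __init__: optional ingester, then the crawler."""
    subtasks: List[StubTask] = []
    if not skip_ingestion:
        # Only the ingester receives sync_dates
        subtasks.append(StubTask(name="ingester", sync_dates=sync_dates))
    # The crawler is always appended last
    subtasks.append(StubTask(name="crawler"))
    return subtasks


print([t.name for t in build_subtasks()])                     # ['ingester', 'crawler']
print([t.name for t in build_subtasks(skip_ingestion=True)])  # ['crawler']
```

Note that in this arrangement sync_dates has no effect when skip_ingestion is True, because the only subtask that receives it is never created.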

Back to process documentation: /processes/tasks/chains/athenaeum/athenaeum-raw-task
