AthenaeumRawTask — Developer Reference

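Developer reference for AthenaeumRawTask, a chain task that runs AthenaeumIngesterTask (unless ingestion is skipped) followed by AthenaeumCrawlerTask. It covers constructor parameters, usage examples, and implementation details.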

Constructor Parameters

Parameter	Type	Description
`name`	`Optional[str]`	Custom name for the task
`job_context`	`JobContext`	Context with Spark session and configuration (required, keyword-only)
`sync_dates`	`Optional[List[str]]`	Specific dates to sync, in YYYY-MM-DD format (default: None)
`skip_ingestion`	`bool`	Skip ingestion and run only the crawler (default: False)
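
Because `sync_dates` is a list of plain strings, it can help to check the format before constructing the task. The sketch below is an illustration only: `validate_sync_dates` is a hypothetical helper, not part of `etl_lib`, and this page does not say whether AthenaeumRawTask performs its own validation.

```python
from datetime import datetime
from typing import List, Optional


def validate_sync_dates(sync_dates: Optional[List[str]]) -> Optional[List[str]]:
    """Hypothetical helper: check that every entry is a YYYY-MM-DD date string."""
    if sync_dates is None:
        # None is passed through unchanged (the parameter is optional)
        return None
    for value in sync_dates:
        # strptime raises ValueError for malformed or impossible dates
        parsed = datetime.strptime(value, "%Y-%m-%d")
        # Reject non-padded forms such as "2024-1-5", which strptime accepts
        if parsed.strftime("%Y-%m-%d") != value:
            raise ValueError(f"expected YYYY-MM-DD, got {value!r}")
    return list(sync_dates)
```

The validated list can then be passed as `sync_dates=validate_sync_dates([...])`.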

Example Usage


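from etl_lib.tasks.chains.athenaeum.AthenaeumRawTask import AthenaeumRawTask

# job_context is an existing JobContext (Spark session and configuration)
task = AthenaeumRawTask(
	job_context=job_context,
	write_to_catalog=True  # not in the parameter table; accepted via **kwargs
)

task.run()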

Sync Specific Dates


task = AthenaeumRawTask(
	job_context=job_context,
	sync_dates=["2024-01-15", "2024-01-16"]
)
task.run()
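
When the dates form a contiguous range, the list can be generated rather than typed by hand. This is a minimal sketch using only the standard library; `date_range_strings` is a hypothetical helper name, not something provided by `etl_lib`.

```python
from datetime import date, timedelta
from typing import List


def date_range_strings(start: date, end: date) -> List[str]:
    """Hypothetical helper: inclusive list of YYYY-MM-DD strings from start to end."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    days = (end - start).days
    # date.isoformat() yields the YYYY-MM-DD form that sync_dates expects
    return [(start + timedelta(days=offset)).isoformat() for offset in range(days + 1)]


sync_dates = date_range_strings(date(2024, 1, 15), date(2024, 1, 16))
print(sync_dates)  # ['2024-01-15', '2024-01-16']
```

The result can be passed directly as `sync_dates=sync_dates` in the constructor call shown above.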

Skip Ingestion


task = AthenaeumRawTask(
	job_context=job_context,
	skip_ingestion=True
)
task.run()

Implementation Details

Subtask Initialization


def __init__(self, *, job_context, sync_dates=None, skip_ingestion=False, **kwargs):
	# Build the subtask list: ingester first (if enabled), then crawler
	subtasks = []

	# Conditionally add ingester
	if not skip_ingestion:
		subtasks.append(
			AthenaeumIngesterTask(
				job_context=job_context,
				sync_dates=sync_dates,
				write_to_catalog=False
			)
		)

	# Always add crawler
	subtasks.append(
		AthenaeumCrawlerTask(
			job_context=job_context,
			write_to_catalog=False
		)
	)

	# Remaining keyword arguments (e.g. name, write_to_catalog) go to the parent class
	super().__init__(
		job_context=job_context,
		subtasks=subtasks,
		**kwargs
	)
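
The conditional subtask selection above can be exercised without Spark or `etl_lib` by substituting stand-in classes. The following is a sketch under that assumption: `StubTask` and `build_subtasks` are hypothetical names introduced for illustration and only mirror the branching logic, not the real task behavior.

```python
from typing import List, Optional


class StubTask:
    """Stand-in for a real task; records its name and keyword arguments."""

    def __init__(self, name: str, **kwargs):
        self.name = name
        self.kwargs = kwargs


def build_subtasks(job_context, sync_dates: Optional[List[str]] = None,
                   skip_ingestion: bool = False) -> List[StubTask]:
    """Mirror the conditional logic of the constructor using stand-in tasks."""
    subtasks = []
    if not skip_ingestion:
        # Ingester is added only when ingestion is not skipped
        subtasks.append(StubTask("ingester", job_context=job_context,
                                 sync_dates=sync_dates, write_to_catalog=False))
    # Crawler is always added, after the ingester when both are present
    subtasks.append(StubTask("crawler", job_context=job_context, write_to_catalog=False))
    return subtasks


print([t.name for t in build_subtasks(job_context=None)])
# ['ingester', 'crawler']
print([t.name for t in build_subtasks(job_context=None, skip_ingestion=True)])
# ['crawler']
```

Note that `sync_dates` reaches only the ingester, so it has no effect when `skip_ingestion=True`.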

Back to process documentation: /processes/tasks/chains/athenaeum/athenaeum-raw-task