AthenaeumRawTask — Developer Reference
Developer reference for AthenaeumRawTask with constructor parameters, examples and implementation details.
Constructor Parameters
| Parameter | Type | Description |
|---|---|---|
| name | Optional[str] | Custom name for the task |
| job_context | JobContext | Context with Spark session and configuration |
| sync_dates | Optional[List[str]] | Specific dates to sync (YYYY-MM-DD format) |
| skip_ingestion | bool | Skip ingestion and only run crawler (default: False) |
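
Because `sync_dates` expects `YYYY-MM-DD` strings, it can help to build and check the list before constructing the task. The following is a minimal sketch: `build_sync_dates` and `validate_sync_dates` are hypothetical helpers written for illustration, not part of `etl_lib`, and the task may apply its own validation.

```python
import re
from datetime import date, datetime, timedelta
from typing import List

# Hypothetical helpers for illustration; they are not part of etl_lib.
_DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def build_sync_dates(start: date, end: date) -> List[str]:
    """Return every date from start to end (inclusive) as YYYY-MM-DD strings."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    days = (end - start).days
    return [(start + timedelta(days=offset)).isoformat() for offset in range(days + 1)]


def validate_sync_dates(sync_dates: List[str]) -> List[str]:
    """Raise ValueError unless every entry is a real calendar date in YYYY-MM-DD form."""
    for value in sync_dates:
        # Enforce zero-padded fields, which strptime alone does not require.
        if not _DATE_PATTERN.fullmatch(value):
            raise ValueError(f"expected YYYY-MM-DD, got {value!r}")
        # strptime rejects impossible dates such as 2024-02-30.
        datetime.strptime(value, "%Y-%m-%d")
    return sync_dates


sync_dates = validate_sync_dates(build_sync_dates(date(2024, 1, 15), date(2024, 1, 16)))
print(sync_dates)  # ['2024-01-15', '2024-01-16']
```

The resulting list can be passed directly as the `sync_dates` argument.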
Example Usage
from etl_lib.tasks.chains.athenaeum.AthenaeumRawTask import AthenaeumRawTask
task = AthenaeumRawTask(
    job_context=job_context,
    write_to_catalog=True
)
task.run()

Sync Specific Dates
task = AthenaeumRawTask(
    job_context=job_context,
    sync_dates=["2024-01-15", "2024-01-16"]
)
task.run()

Skip Ingestion
task = AthenaeumRawTask(
    job_context=job_context,
    skip_ingestion=True
)
task.run()

Implementation Details
Subtask Initialization
def __init__(self, *, job_context, sync_dates=None, skip_ingestion=False, **kwargs):
    subtasks = []
    # Conditionally add ingester
    if not skip_ingestion:
        subtasks.append(
            AthenaeumIngesterTask(
                job_context=job_context,
                sync_dates=sync_dates,
                write_to_catalog=False
            )
        )
    # Always add crawler
    subtasks.append(
        AthenaeumCrawlerTask(
            job_context=job_context,
            write_to_catalog=False
        )
    )
    super().__init__(
        job_context=job_context,
        subtasks=subtasks,
        **kwargs
    )
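
The conditional assembly in `__init__` can be exercised without Spark or `etl_lib` by swapping in stand-in classes. The sketch below is only an illustration: `StubTask`, `StubIngesterTask`, `StubCrawlerTask` and `build_subtasks` are hypothetical names that mirror the branching logic of the excerpt, not the real task classes.

```python
from typing import Any, List, Optional


class StubTask:
    """Stand-in for a real task class; it only records its keyword arguments."""

    def __init__(self, **kwargs: Any) -> None:
        self.kwargs = kwargs


class StubIngesterTask(StubTask):
    """Hypothetical placeholder for AthenaeumIngesterTask."""


class StubCrawlerTask(StubTask):
    """Hypothetical placeholder for AthenaeumCrawlerTask."""


def build_subtasks(
    job_context: Any,
    sync_dates: Optional[List[str]] = None,
    skip_ingestion: bool = False,
) -> List[StubTask]:
    """Assemble subtasks with the same branching as the __init__ excerpt."""
    subtasks: List[StubTask] = []
    # The ingester is added only when ingestion is not skipped.
    if not skip_ingestion:
        subtasks.append(
            StubIngesterTask(
                job_context=job_context,
                sync_dates=sync_dates,
                write_to_catalog=False,
            )
        )
    # The crawler is always added, and always last.
    subtasks.append(StubCrawlerTask(job_context=job_context, write_to_catalog=False))
    return subtasks


full = build_subtasks(job_context=None, sync_dates=["2024-01-15"])
crawl_only = build_subtasks(job_context=None, sync_dates=["2024-01-15"], skip_ingestion=True)
print([type(task).__name__ for task in full])        # ['StubIngesterTask', 'StubCrawlerTask']
print([type(task).__name__ for task in crawl_only])  # ['StubCrawlerTask']
```

As in the excerpt, `sync_dates` is passed only to the ingester, so in this sketch it has no effect when `skip_ingestion=True`.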