OperaRawTask — Developer Reference
Developer reference for OperaRawTask with constructor parameters, examples and implementation details.
Constructor Parameters
| Parameter | Type | Description |
|---|---|---|
| name | Optional[str] | Optional task name |
| job_context | JobContext | Job context with Spark, catalog and config |
| sync_dates | Optional[List[str]] | Specific dates to sync (YYYY-MM-DD) |
| skip_ingestion | bool | Skip the ingester step and only run the crawler |
| write_to_catalog | bool | Whether to write raw files to the catalog |
| incremental_crawl | Optional[bool] | Whether the crawler runs incrementally |
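Because sync_dates expects YYYY-MM-DD strings, a caller may want to check the values before constructing the task. The following is a minimal sketch of such a check; validate_sync_dates is a hypothetical helper introduced here for illustration and is not part of etl_lib.

```python
from datetime import datetime
from typing import List, Optional


def validate_sync_dates(sync_dates: Optional[List[str]]) -> Optional[List[str]]:
    """Return sync_dates unchanged if every entry is a zero-padded YYYY-MM-DD date.

    Hypothetical helper for illustration only; not part of etl_lib.
    """
    if sync_dates is None:
        # None means no specific dates were requested.
        return None
    for value in sync_dates:
        # strptime raises ValueError for impossible dates such as 2024-13-01.
        parsed = datetime.strptime(value, "%Y-%m-%d")
        # strptime tolerates missing zero padding, so round-trip to reject "2024-1-5".
        if parsed.strftime("%Y-%m-%d") != value:
            raise ValueError(f"sync date {value!r} is not zero-padded YYYY-MM-DD")
    return sync_dates


checked = validate_sync_dates(["2024-01-05", "2024-01-06"])
# The checked list could then be passed on, for example:
# OperaRawTask(job_context=job_context, sync_dates=checked)
```

Validating up front keeps a malformed date from surfacing only once the ingester step is already running.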
Provides / Requires
requires(): []
provides(): [RawReservationModel, RawRateModel, RawProfileModel]
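One way to read these two lists: an empty requires() means the task has no upstream dependencies, and provides() names the models that become available once it has run. The sketch below illustrates that reading with stand-in classes. OperaRawTaskStub and is_runnable are hypothetical names, and whether the real framework resolves dependencies this way is an assumption.

```python
from typing import List, Set, Type


# Empty stand-ins for the documented model classes (assumption: the real ones live in etl_lib).
class RawReservationModel: ...
class RawRateModel: ...
class RawProfileModel: ...


class OperaRawTaskStub:
    """Stand-in that returns the documented requires()/provides() values."""

    def requires(self) -> List[Type]:
        return []

    def provides(self) -> List[Type]:
        return [RawReservationModel, RawRateModel, RawProfileModel]


def is_runnable(task, available: Set[Type]) -> bool:
    """A task is runnable once every model it requires is already available."""
    return all(model in available for model in task.requires())


task = OperaRawTaskStub()
available: Set[Type] = set()

# With no requirements, the task is runnable even when nothing is available yet.
runnable_first = is_runnable(task, available)

# After the task runs, its provided models can satisfy downstream tasks.
available.update(task.provides())
```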
Example Usage
from etl_lib.tasks.chains.opera.OperaRawTask import OperaRawTask
op = OperaRawTask(job_context=job_context, skip_ingestion=False, write_to_catalog=True)
op.run()

Implementation Details
OperaRawTask builds its subtask list in the constructor. OperaIngesterTask is added only when skip_ingestion is false and receives sync_dates; OperaCrawlerTask is always added and receives incremental_crawl:
subtasks = []
if not skip_ingestion:
    subtasks.append(OperaIngesterTask(job_context=job_context, sync_dates=sync_dates))
subtasks.append(OperaCrawlerTask(job_context=job_context, incremental_crawl=incremental_crawl))
super().__init__(job_context=job_context, subtasks=subtasks, **kwargs)

Back to process documentation: [/processes/tasks/chains/opera/opera-raw-task](/processes/tasks/chains/opera/opera-raw-task)
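The subtask selection shown under Implementation Details can be tried in isolation with stand-in classes. This is a sketch, not the library code: the two task classes here are empty stubs that only record their arguments, build_subtasks is a hypothetical function name, and the default skip_ingestion=False is an assumption.

```python
from typing import Any, List, Optional


class OperaIngesterTask:
    """Stub standing in for the real ingester task; only stores its arguments."""

    def __init__(self, job_context: Any, sync_dates: Optional[List[str]] = None):
        self.job_context = job_context
        self.sync_dates = sync_dates


class OperaCrawlerTask:
    """Stub standing in for the real crawler task; only stores its arguments."""

    def __init__(self, job_context: Any, incremental_crawl: Optional[bool] = None):
        self.job_context = job_context
        self.incremental_crawl = incremental_crawl


def build_subtasks(
    job_context: Any,
    sync_dates: Optional[List[str]] = None,
    skip_ingestion: bool = False,  # assumed default
    incremental_crawl: Optional[bool] = None,
) -> list:
    """Mirror the documented selection: ingester is optional, crawler always runs."""
    subtasks = []
    if not skip_ingestion:
        subtasks.append(OperaIngesterTask(job_context=job_context, sync_dates=sync_dates))
    subtasks.append(OperaCrawlerTask(job_context=job_context, incremental_crawl=incremental_crawl))
    return subtasks


full = build_subtasks(job_context=object(), sync_dates=["2024-01-05"])
crawl_only = build_subtasks(job_context=object(), skip_ingestion=True, incremental_crawl=True)

print([type(t).__name__ for t in full])        # ['OperaIngesterTask', 'OperaCrawlerTask']
print([type(t).__name__ for t in crawl_only])  # ['OperaCrawlerTask']
```

Note that in this arrangement sync_dates only reaches the ingester, so it has no effect when skip_ingestion is true.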