OperaRawTask — Developer Reference

Developer reference for OperaRawTask with constructor parameters, examples and implementation details.

Constructor Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| name | Optional[str] | Optional task name |
| job_context | JobContext | Job context with Spark, catalog and config |
| sync_dates | Optional[List[str]] | Specific dates to sync (YYYY-MM-DD) |
| skip_ingestion | bool | Skip the ingester step and only run the crawler |
| write_to_catalog | bool | Whether to write raw files to the catalog |
| incremental_crawl | Optional[bool] | Whether the crawler runs incrementally |
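
The sync_dates parameter takes date strings in YYYY-MM-DD form. As a minimal sketch, such a list could be built for a trailing window of days as follows. The helper name `recent_sync_dates` is hypothetical and not part of etl_lib.

```python
from datetime import date, timedelta
from typing import List, Optional


def recent_sync_dates(days: int, end: Optional[date] = None) -> List[str]:
    """Return `days` consecutive dates ending at `end` (inclusive),
    oldest first, formatted as YYYY-MM-DD strings.

    Hypothetical helper for illustration only.
    """
    if days < 0:
        raise ValueError("days must be non-negative")
    end = end or date.today()
    start = end - timedelta(days=days - 1)
    # date.isoformat() yields the YYYY-MM-DD form the parameter expects.
    return [(start + timedelta(days=i)).isoformat() for i in range(days)]


# A fixed end date keeps the result reproducible.
dates = recent_sync_dates(3, end=date(2024, 3, 1))
print(dates)  # ['2024-02-28', '2024-02-29', '2024-03-01']
```

The resulting list could then be passed as sync_dates=dates when constructing OperaRawTask.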

Provides / Requires

  • requires() — []
  • provides() — [RawReservationModel, RawRateModel, RawProfileModel]
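
One plausible reading of these declarations is that a task with an empty requires() has no upstream model dependencies, and the models in provides() become available to later tasks. The sketch below checks a run order against declarations of that kind. `TaskSpec`, `check_order` and `HypotheticalCleanTask` are illustrative stand-ins, not etl_lib classes, and models are represented as plain strings; how etl_lib actually resolves dependencies is not documented here.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class TaskSpec:
    """Hypothetical stand-in holding a task's declared dependencies."""
    name: str
    requires: List[str] = field(default_factory=list)
    provides: List[str] = field(default_factory=list)


def check_order(tasks: List[TaskSpec]) -> List[str]:
    """Return one 'task: missing model' entry per unmet requirement."""
    available: Set[str] = set()
    problems: List[str] = []
    for task in tasks:
        for model in task.requires:
            if model not in available:
                problems.append(f"{task.name}: missing {model}")
        # Models become available only after the task has run.
        available.update(task.provides)
    return problems


opera_raw = TaskSpec(
    "OperaRawTask",
    requires=[],
    provides=["RawReservationModel", "RawRateModel", "RawProfileModel"],
)
# Assumed downstream consumer, invented for this example.
downstream = TaskSpec("HypotheticalCleanTask", requires=["RawReservationModel"])

print(check_order([opera_raw, downstream]))  # []
print(check_order([downstream, opera_raw]))
# ['HypotheticalCleanTask: missing RawReservationModel']
```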

Example Usage

    from etl_lib.tasks.chains.opera.OperaRawTask import OperaRawTask

    # job_context is an existing JobContext (Spark, catalog and config).
    op = OperaRawTask(job_context=job_context, skip_ingestion=False, write_to_catalog=True)
    op.run()

Implementation Details

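OperaRawTask builds its subtask list from the constructor arguments. OperaIngesterTask is added unless skip_ingestion is True, and it receives sync_dates. OperaCrawlerTask is always added and receives incremental_crawl. The list is then passed to the parent constructor: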

    subtasks = []
    if not skip_ingestion:
        subtasks.append(OperaIngesterTask(job_context=job_context, sync_dates=sync_dates))
    subtasks.append(OperaCrawlerTask(job_context=job_context, incremental_crawl=incremental_crawl))
    super().__init__(job_context=job_context, subtasks=subtasks, **kwargs)
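
To see the effect of skip_ingestion without a Spark environment, the selection logic can be mirrored with stand-in classes. This is a sketch only: `StubTask`, `StubIngesterTask`, `StubCrawlerTask` and `build_subtasks` are hypothetical names, and the default values shown are assumptions, since the real defaults are not listed on this page.

```python
from typing import List, Optional


class StubTask:
    """Stand-in for a real etl_lib task; only records its arguments."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs


class StubIngesterTask(StubTask):
    pass


class StubCrawlerTask(StubTask):
    pass


def build_subtasks(
    job_context,
    skip_ingestion: bool = False,          # assumed default
    sync_dates: Optional[List[str]] = None,
    incremental_crawl: Optional[bool] = None,
) -> List[StubTask]:
    """Mirror the subtask selection described above using stubs."""
    subtasks: List[StubTask] = []
    if not skip_ingestion:
        subtasks.append(StubIngesterTask(job_context=job_context, sync_dates=sync_dates))
    # The crawler is appended regardless of skip_ingestion.
    subtasks.append(StubCrawlerTask(job_context=job_context, incremental_crawl=incremental_crawl))
    return subtasks


full = build_subtasks(job_context=None, sync_dates=["2024-03-01"])
crawl_only = build_subtasks(job_context=None, skip_ingestion=True, incremental_crawl=True)

print([type(t).__name__ for t in full])        # ['StubIngesterTask', 'StubCrawlerTask']
print([type(t).__name__ for t in crawl_only])  # ['StubCrawlerTask']
```

Note that in this logic sync_dates only reaches the ingester, so it has no effect when skip_ingestion is True.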

Back to process documentation: [/processes/tasks/chains/opera/opera-raw-task](/processes/tasks/chains/opera/opera-raw-task)
