
Opera Raw Task

OperaRawTask is a composite task that orchestrates raw data ingestion and crawling for Opera chain properties. It runs two subtasks: OperaIngesterTask and OperaCrawlerTask.

Subtasks

  1. OperaIngesterTask (optional) — Downloads raw data from the Opera PMS API and stores files in a local or S3 cache; can be skipped when raw files already exist.
  2. OperaCrawlerTask — Parses downloaded files and loads them into Spark DataFrames.
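
The ordering of the two subtasks can be sketched as follows. This is a minimal illustration only, not the actual OperaRawTask implementation: the class name `CompositeRawTaskSketch` and the `ingest`/`crawl` callables are hypothetical stand-ins for OperaIngesterTask and OperaCrawlerTask.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class CompositeRawTaskSketch:
    """Hypothetical stand-in for a composite task; not the real OperaRawTask."""

    ingest: Callable[[], None]  # stands in for OperaIngesterTask (assumption)
    crawl: Callable[[], None]   # stands in for OperaCrawlerTask (assumption)
    skip_ingestion: bool = False
    executed: List[str] = field(default_factory=list)

    def run(self) -> List[str]:
        # Step 1 (optional): download raw files unless they already exist.
        if not self.skip_ingestion:
            self.ingest()
            self.executed.append("ingest")
        # Step 2: always parse whatever raw files are in the cache.
        self.crawl()
        self.executed.append("crawl")
        return self.executed


# With skip_ingestion=True only the crawl step runs.
task = CompositeRawTaskSketch(ingest=lambda: None, crawl=lambda: None, skip_ingestion=True)
print(task.run())
```

The point of the sketch is that skipping ingestion does not skip crawling: the crawler still runs against the raw files already in the cache.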

Configuration Options

  • skip_ingestion — Skip the ingestion/download step when raw files exist.
  • sync_dates — Specific dates to sync (optional).
  • incremental_crawl — Whether to perform an incremental crawl (useful for small sync ranges).
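
One way to picture how these options combine is a small options holder. This is a sketch under assumptions: the class `RawTaskOptionsSketch`, the field types, and the defaults are hypothetical, since this page does not state the types or default values that OperaRawTask actually uses.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence


@dataclass(frozen=True)
class RawTaskOptionsSketch:
    """Hypothetical options holder; types and defaults are assumptions."""

    skip_ingestion: bool = False
    sync_dates: Optional[Sequence[date]] = None  # None means no explicit dates given
    incremental_crawl: bool = False

    def describe(self) -> str:
        # Summarize which steps would run and over which dates.
        steps = "crawl only" if self.skip_ingestion else "ingest + crawl"
        mode = "incremental" if self.incremental_crawl else "full"
        if self.sync_dates is None:
            dates = "no explicit dates"
        else:
            dates = f"{len(self.sync_dates)} date(s)"
        return f"{steps}, {mode}, {dates}"


# A small re-sync of a single day against files that are already cached.
options = RawTaskOptionsSketch(
    skip_ingestion=True,
    sync_dates=[date(2024, 1, 1)],
    incremental_crawl=True,
)
print(options.describe())
```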

Models Provided

  • RawReservationModel — Raw reservation payloads.
  • RawRateModel — Daily rate payloads with revenue details.
  • RawProfileModel — Profile/guest payloads.

Example Usage

Used in the job entrypoint for Cheval Collection:

OperaRawTask(job_context=job_context, skip_ingestion=False)

This downloads the Opera raw files for the configured properties (unless skip_ingestion is True) and then parses them.

  • OperaIngesterTask (/processes/tasks/chains/opera/opera-ingester-task)
  • OperaCrawlerTask (/processes/tasks/chains/opera/opera-crawler-task)
  • CleanOperaTask