OperaCrawlerTask — Developer Reference

Developer reference for OperaCrawlerTask (S3/ingest → Iceberg crawler). Includes constructor parameters and core methods used to map files to models.

Constructor Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| name | Optional[str] | Optional task name |
| job_context | JobContext | Job context with Spark, catalog and config |

Provides / Requires

  • requires() — []
  • provides() — [RawReservationModel, RawRateModel, RawProfileModel]
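
How a scheduler consumes these two lists is not described on this page. As a rough sketch, assuming a task becomes runnable once every model it requires is available, the contract could be exercised like this. `CrawlerTaskSketch` and `can_run` are hypothetical names used only for illustration, and the model classes are empty stand-ins for the real ones.

```python
# Empty stand-ins for the real model classes, which are not shown on this page.
class RawReservationModel: ...
class RawRateModel: ...
class RawProfileModel: ...


class CrawlerTaskSketch:
    """Hypothetical stand-in that mirrors the requires()/provides() contract."""

    def requires(self) -> list:
        # The crawler reads raw files, so it depends on no upstream models.
        return []

    def provides(self) -> list:
        return [RawReservationModel, RawRateModel, RawProfileModel]


def can_run(task, available: set) -> bool:
    """Assumed rule: a task is runnable once all required models are available."""
    return all(model in available for model in task.requires())


task = CrawlerTaskSketch()
available = set()

# With an empty requires() list the crawler can run before anything else.
runnable_first = can_run(task, available)

# After it runs, its outputs become available to downstream tasks.
available.update(task.provides())
```

Because `requires()` is empty, a crawler of this shape would sit at the start of a chain under the assumed rule.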

Example Usage

    from etl_lib.tasks.chains.opera.OperaCrawlerTask import OperaCrawlerTask

    crawler = OperaCrawlerTask(job_context=job_context, write_to_catalog=True)
    crawler.run()

Implementation Details

Key methods:

  • get_files_path(table_name, property_id=None) — Return the S3 path for the given table, covering either all properties or a single property. Example:

        def get_files_path(self, table_name, property_id=None):
            chain_id = self.job_context.chain_id
            if property_id:
                return S3Path(f"/etl-gp-raw/ingest/{chain_id}/{table_name}/property_id={property_id}/")
            return S3Path(f"/etl-gp-raw/ingest/{chain_id}/{table_name}/")

  • get_model_for_table(table_name) — Map an ingest table name to its model class (e.g., daily_rates → RawRateModel).
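
The two methods above can be sketched together as a standalone example. This is a minimal illustration, not the task's actual implementation: only the `daily_rates` → `RawRateModel` pairing comes from this page, while the `reservations` and `profiles` table names, the `TABLE_TO_MODEL` dictionary, and the `build_ingest_prefix` helper are assumptions. The helper returns a plain string instead of an `S3Path` so the sketch runs without `etl_lib`.

```python
from typing import Optional


# Empty stand-ins for the real model classes, which are not shown on this page.
class RawReservationModel: ...
class RawRateModel: ...
class RawProfileModel: ...


# Only the daily_rates entry is documented; the other two table names are
# assumed purely for illustration.
TABLE_TO_MODEL = {
    "daily_rates": RawRateModel,
    "reservations": RawReservationModel,
    "profiles": RawProfileModel,
}


def get_model_for_table(table_name: str) -> type:
    """Return the model class for an ingest table, failing loudly if unknown."""
    try:
        return TABLE_TO_MODEL[table_name]
    except KeyError:
        raise ValueError(f"No model registered for table {table_name!r}") from None


def build_ingest_prefix(
    chain_id: str, table_name: str, property_id: Optional[str] = None
) -> str:
    """Build the prefix string that get_files_path wraps in S3Path."""
    prefix = f"/etl-gp-raw/ingest/{chain_id}/{table_name}/"
    if property_id:
        # Narrow the prefix to a single property partition.
        prefix += f"property_id={property_id}/"
    return prefix


model = get_model_for_table("daily_rates")
all_properties = build_ingest_prefix("opera", "daily_rates")
one_property = build_ingest_prefix("opera", "daily_rates", property_id="42")
```

Raising on an unknown table name is a design choice of this sketch; the real method may handle unmapped tables differently.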

Back to process documentation: [/processes/tasks/chains/opera/opera-raw-task](/processes/tasks/chains/opera/opera-raw-task)
