OperaCrawlerTask — Developer Reference
Developer reference for OperaCrawlerTask (S3/ingest → Iceberg crawler). Includes constructor parameters and core methods used to map files to models.
Constructor Parameters
| Parameter | Type | Description |
|---|---|---|
| name | Optional[str] | Optional task name |
| job_context | JobContext | Job context with Spark, catalog and config |
Provides / Requires
requires(): []
provides(): [RawReservationModel, RawRateModel, RawProfileModel]
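The contract above can be pictured as a small dependency check. This is only a sketch and not the etl_lib implementation: the stand-in model classes, `SketchCrawlerTask`, and the `can_run` helper are hypothetical names introduced for illustration, and the real scheduler may resolve dependencies differently.

```python
from typing import List, Set, Type


# Hypothetical stand-ins for the real model classes in etl_lib.
class RawReservationModel: ...
class RawRateModel: ...
class RawProfileModel: ...


class SketchCrawlerTask:
    """Illustrative task exposing the same requires/provides contract."""

    def requires(self) -> List[Type]:
        # The crawler reads raw ingest files, so it needs no upstream models.
        return []

    def provides(self) -> List[Type]:
        return [RawReservationModel, RawRateModel, RawProfileModel]


def can_run(task, available: Set[Type]) -> bool:
    """Hypothetical helper: True when every required model is available."""
    return all(model in available for model in task.requires())


task = SketchCrawlerTask()
available: Set[Type] = set()
ready = can_run(task, available)   # no requirements, so the task can run first
available.update(task.provides())  # downstream tasks may now require these models
```

Because `requires()` is empty, a task with this contract can sit at the start of a chain, and anything that requires one of the three raw models can be ordered after it.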
Example Usage
from etl_lib.tasks.chains.opera.OperaCrawlerTask import OperaCrawlerTask
crawler = OperaCrawlerTask(job_context=job_context, write_to_catalog=True)
crawler.run()
Implementation Details
Key methods:
get_files_path(table_name, property_id=None): Returns the S3 path for the given table, either for all properties or, when property_id is given, for a single property. Example:

    def get_files_path(self, table_name, property_id=None):
        chain_id = self.job_context.chain_id
        if property_id:
            return S3Path(f"/etl-gp-raw/ingest/{chain_id}/{table_name}/property_id={property_id}/")
        return S3Path(f"/etl-gp-raw/ingest/{chain_id}/{table_name}/")

get_model_for_table(table_name): Maps an ingest table name to its model class (e.g., daily_rates → RawRateModel).
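The two methods above can be tried outside the task with a standalone sketch. The assumptions are as follows: the model classes are empty stand-ins, the path is returned as a plain string rather than an `S3Path`, and `build_files_path` and `TABLE_TO_MODEL` are hypothetical names. Only the `daily_rates` → `RawRateModel` pair comes from this reference; the other two table names are placeholders, and the real lookup may not be a dictionary at all.

```python
from typing import Dict, Optional, Type


# Hypothetical stand-ins for the real model classes in etl_lib.
class RawReservationModel: ...
class RawRateModel: ...
class RawProfileModel: ...


# Only "daily_rates" is documented; the other keys are assumed for illustration.
TABLE_TO_MODEL: Dict[str, Type] = {
    "daily_rates": RawRateModel,
    "reservations": RawReservationModel,  # hypothetical table name
    "profiles": RawProfileModel,          # hypothetical table name
}


def get_model_for_table(table_name: str) -> Type:
    """Return the model class registered for an ingest table name."""
    try:
        return TABLE_TO_MODEL[table_name]
    except KeyError:
        raise ValueError(f"No model registered for ingest table {table_name!r}") from None


def build_files_path(chain_id: str, table_name: str,
                     property_id: Optional[str] = None) -> str:
    """Mirror the path layout of get_files_path, using a plain string."""
    base = f"/etl-gp-raw/ingest/{chain_id}/{table_name}/"
    if property_id:
        # Narrow the prefix to a single property partition.
        return f"{base}property_id={property_id}/"
    return base


all_props = build_files_path("chain_a", "daily_rates")
one_prop = build_files_path("chain_a", "daily_rates", "123")
```

With these inputs, `all_props` is `/etl-gp-raw/ingest/chain_a/daily_rates/` and `one_prop` is `/etl-gp-raw/ingest/chain_a/daily_rates/property_id=123/`. Raising on an unknown table name is a design choice of this sketch; the real method may handle unmapped tables differently.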
Back to process documentation: [/processes/tasks/chains/opera/opera-raw-task](/processes/tasks/chains/opera/opera-raw-task)