AthenaeumRawTask
The AthenaeumRawTask orchestrates the raw data acquisition for the Athenaeum hotel chain, running both ingestion and crawling subtasks to fetch reservation data from the source system.
Overview
This task handles the initial data acquisition stage by:
- Ingesting data from the Athenaeum API/system
- Crawling and parsing reservation data
- Producing raw models for downstream processing
- Supporting optional date-based sync strategies
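The flow above can be sketched as follows. This is a minimal illustration, not the actual implementation: the class `RawTaskSketch`, its methods, and the assumption that `sync_dates` is a list of `date` objects and `skip_ingestion` is a boolean flag are all hypothetical stand-ins for whatever AthenaeumRawTask really exposes.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class RawTaskSketch:
    """Hypothetical stand-in for AthenaeumRawTask; not the real class."""

    sync_dates: Optional[List[date]] = None  # assumed: None means "sync everything"
    skip_ingestion: bool = False             # assumed: reuse previously ingested data
    steps_run: List[str] = field(default_factory=list)

    def ingest(self) -> None:
        # Placeholder for AthenaeumIngesterTask: fetch raw booking data.
        if self.sync_dates is None:
            self.steps_run.append("ingest:all")
        else:
            for day in self.sync_dates:
                self.steps_run.append(f"ingest:{day.isoformat()}")

    def crawl(self) -> None:
        # Placeholder for AthenaeumCrawlerTask: parse ingested data into raw models.
        self.steps_run.append("crawl")

    def run(self) -> List[str]:
        # Ingestion runs first unless explicitly skipped; crawling always runs.
        if not self.skip_ingestion:
            self.ingest()
        self.crawl()
        return self.steps_run
```

In this sketch the crawler always runs, while ingestion is optional and can be narrowed to specific dates, which mirrors the ordering described above.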
Flow Diagram
Subtasks
AthenaeumIngesterTask
Connects to the Athenaeum system to fetch and ingest raw booking data.
AthenaeumCrawlerTask
Parses and transforms ingested data into standardized raw models.
Models
Requires
- None (initial data acquisition)
Provides
- RawReservationModel - Raw reservation data from Athenaeum
- RawRevenueModel - Raw revenue data from Athenaeum
Error Handling
The task handles these common failure scenarios:
- API connection failures
- Missing data for specific dates
- Parsing errors in the crawler
- Invalid date formats
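As a rough illustration of how these scenarios might be separated, the sketch below validates date strings and records per-date fetch outcomes. The function names, the `YYYY-MM-DD` format, and the convention that a fetcher returns `None` for a date with no data are assumptions made for the example; the real task may report these cases differently.

```python
from datetime import date, datetime
from typing import Callable, Dict, List, Optional, Tuple


def parse_sync_dates(values: List[str]) -> Tuple[List[date], List[str]]:
    """Split date strings into parsed dates and rejected inputs (assumes YYYY-MM-DD)."""
    valid: List[date] = []
    invalid: List[str] = []
    for value in values:
        try:
            valid.append(datetime.strptime(value, "%Y-%m-%d").date())
        except ValueError:
            invalid.append(value)  # invalid date format or impossible date
    return valid, invalid


def fetch_by_date(
    days: List[date], fetch: Callable[[date], Optional[dict]]
) -> Dict[str, list]:
    """Call a hypothetical fetcher per date and sort the outcomes into buckets."""
    outcome: Dict[str, list] = {"ok": [], "missing": [], "failed": []}
    for day in days:
        try:
            payload = fetch(day)
        except ConnectionError:
            outcome["failed"].append(day)   # API connection failure
            continue
        if payload is None:
            outcome["missing"].append(day)  # no data for this date
        else:
            outcome["ok"].append(day)
    return outcome
```

Keeping failed and missing dates in separate buckets makes it possible to retry connection failures without re-requesting dates that simply have no data.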
Downstream Tasks
The output of this task feeds into the cleaning stage (CleanAthenaeumTask).
Related Tasks
- CleanAthenaeumTask - Next step: cleaning
- Task - Base task class
Best Practices
- Use sync_dates for incremental updates to avoid re-ingesting all data
- Monitor ingestion logs for API errors or missing data
- Test with skip_ingestion when developing crawler changes
- Schedule appropriately based on Athenaeum data update frequency
- Handle API credentials securely through JobContext configuration
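For the incremental-update practice, a trailing window of dates is one simple way to build a value for sync_dates. The helper below is a hypothetical sketch; it assumes sync_dates accepts a list of `date` objects, which this page does not confirm.

```python
from datetime import date, timedelta
from typing import List


def recent_sync_dates(today: date, days_back: int) -> List[date]:
    """Return the dates from `days_back` days ago through `today`, oldest first."""
    if days_back < 0:
        raise ValueError("days_back must be non-negative")
    return [today - timedelta(days=offset) for offset in range(days_back, -1, -1)]
```

A small window that overlaps the previous run is a common choice, since it picks up late changes to recent reservations without re-ingesting the full history.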