GuestlineRawTask
The GuestlineRawTask orchestrates the raw data acquisition for Guestline-based chains (e.g., Queensway). It runs ingestion and crawling subtasks to fetch roompicks and personprofiles data, and writes the results into the ingest bucket and raw catalog tables for further processing.
Overview
This task handles the initial data acquisition stage by:
- Ingesting data from the Guestline API (per-property roompicks and global personprofiles)
- Crawling S3 ingest buckets and writing to Iceberg/raw tables
- Producing raw models for downstream processing
- Supporting optional date-based sync strategies
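The overall control flow can be pictured with a short sketch. This is only illustrative: the class layout, the `JobContext` fields, and the method names below are assumptions used to show the ingest-then-crawl ordering, not the real task implementation.

```python
# Minimal orchestration sketch; class and attribute names are hypothetical
# and only illustrate the ingest-then-crawl flow described above.
from dataclasses import dataclass, field


@dataclass
class JobContext:
    """Runtime configuration (illustrative fields, not the real schema)."""
    ingest_bucket: str = "s3://example-ingest-bucket"
    sync_dates: list[str] = field(default_factory=list)
    skip_ingestion: bool = False


class GuestlineRawTask:
    """Runs the Guestline ingestion and crawling subtasks in sequence."""

    def __init__(self, context: JobContext) -> None:
        self.context = context

    def run(self) -> None:
        if not self.context.skip_ingestion:
            self._ingest()  # fetch API payloads into the ingest bucket
        self._crawl()       # parse payloads and write raw models

    def _ingest(self) -> None:
        print(f"Ingesting Guestline data into {self.context.ingest_bucket}")

    def _crawl(self) -> None:
        print("Crawling ingest bucket and writing raw models")


if __name__ == "__main__":
    GuestlineRawTask(JobContext()).run()
```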
Flow Diagram
Guestline API → GuestlineIngesterTask → S3 ingest bucket → GuestlineCrawlerTask → raw models (RawRoompicksModel, RawPersonprofilesModel)
Subtasks
GuestlineIngesterTask
Connects to the Guestline API and downloads event logs for roompicks (property-specific) and personprofiles (global). This task writes payloads to the configured ingest bucket.
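As a rough illustration of what the ingester does, the sketch below fetches per-property roompicks and the global personprofiles feed, then writes each payload to an S3 ingest bucket. The endpoint paths, bucket name, and key layout are assumptions made for this example; the real task uses the configured Guestline gateway and ingest bucket.

```python
# Illustrative ingester sketch; the Guestline endpoints, parameters, and key
# layout shown here are assumptions, not the documented API.
import json

import boto3
import requests

INGEST_BUCKET = "example-ingest-bucket"          # assumed bucket name
API_BASE = "https://api.example-guestline.test"  # placeholder base URL


def ingest(property_ids: list[str], api_key: str) -> None:
    s3 = boto3.client("s3")
    headers = {"Authorization": f"Bearer {api_key}"}

    # Per-property roompicks event logs.
    for property_id in property_ids:
        resp = requests.get(
            f"{API_BASE}/properties/{property_id}/roompicks",
            headers=headers,
            timeout=30,
        )
        resp.raise_for_status()
        s3.put_object(
            Bucket=INGEST_BUCKET,
            Key=f"guestline/roompicks/{property_id}.json",
            Body=json.dumps(resp.json()),
        )

    # Personprofiles are global (not property-scoped).
    resp = requests.get(f"{API_BASE}/personprofiles", headers=headers, timeout=30)
    resp.raise_for_status()
    s3.put_object(
        Bucket=INGEST_BUCKET,
        Key="guestline/personprofiles.json",
        Body=json.dumps(resp.json()),
    )
```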
GuestlineCrawlerTask
Parses the S3 ingest payloads and transforms them into the raw models that are used by subsequent cleaner tasks.
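A minimal sketch of the crawling step, assuming JSON payloads in the ingest bucket: it lists the roompicks prefix, parses each payload, and flattens it into one row per event. The payload shape and the output handling are assumptions; the real task writes the rows into the Iceberg/raw tables through the project's raw models.

```python
# Illustrative crawler sketch; payload shape and output handling are assumed,
# since the real task writes to Iceberg/raw tables via the project's models.
import json

import boto3

INGEST_BUCKET = "example-ingest-bucket"  # assumed bucket name


def crawl_roompicks() -> list[dict]:
    """Parse roompicks payloads from the ingest bucket into flat raw rows."""
    s3 = boto3.client("s3")
    rows: list[dict] = []
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=INGEST_BUCKET, Prefix="guestline/roompicks/"):
        for obj in page.get("Contents", []):
            body = s3.get_object(Bucket=INGEST_BUCKET, Key=obj["Key"])["Body"].read()
            for event in json.loads(body):
                # One raw row per roompick event, as described above.
                rows.append(
                    {
                        "property_id": obj["Key"].split("/")[-1].removesuffix(".json"),
                        "event": event,
                    }
                )
    return rows
```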
Models
Requires
- None — initial data acquisition
Provides
- RawRoompicksModel - Raw Guestline roompick event data (one row per event)
- RawPersonprofilesModel - Raw Guestline personprofile payloads
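To make the shape of these outputs concrete, here is a hypothetical sketch of the two raw models. The field names are illustrative only and are not the actual model definitions.

```python
# Hypothetical shapes for the raw models; field names are illustrative only.
from dataclasses import dataclass


@dataclass
class RawRoompicksModel:
    """One row per Guestline roompick event."""
    property_id: str
    event_id: str
    payload: dict  # raw event payload as received from the API


@dataclass
class RawPersonprofilesModel:
    """Raw Guestline personprofile payload."""
    profile_id: str
    payload: dict
```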
Best Practices
- Use sync_dates to perform incremental ingests (see the configuration sketch after this list).
- For local development, skip_ingestion can be used to load local test data and iterate on crawler logic.
- Monitor API rate limits and gateway logs when running full ingestion jobs.
- Ensure API keys are provided via the JobContext configuration, and never hardcode secrets.
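A small configuration sketch tying these options together; the exact JobContext interface and field names are assumptions of this example, not the real configuration schema.

```python
# Configuration sketch; the field names mirror the options mentioned above,
# but the exact JobContext interface is an assumption of this example.
import os

context_config = {
    # Incremental ingest: only fetch events for these dates.
    "sync_dates": ["2024-01-01", "2024-01-02"],
    # Local development: reuse previously downloaded payloads, skip API calls.
    "skip_ingestion": True,
    # Secrets come from the environment, never from source code.
    "guestline_api_key": os.environ.get("GUESTLINE_API_KEY", ""),
}
```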
Downstream Tasks
The output of this task feeds into CleanGuestlineTask, which cleans the raw Guestline models produced here.
Related Tasks
- CleanGuestlineTask - The next step: clean raw Guestline data
- Task - Base task class