GuestlineRawTask

The GuestlineRawTask orchestrates raw data acquisition for Guestline-based chains (e.g., Queensway). It runs an ingestion subtask and a crawling subtask to fetch roompicks and personprofiles data, then writes that data to the raw tables for further processing.

Overview

This task handles the initial data acquisition stage by:

  • Ingesting data from the Guestline API (per-property roompicks and global personprofiles)
  • Crawling S3 ingest buckets and writing to Iceberg/raw tables
  • Producing raw models for downstream processing
  • Supporting optional date-based sync strategies
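The stages above can be sketched as a small orchestrator that runs ingestion first and crawling second. This is a minimal illustration, not the task's actual implementation: the RawTaskSketch class, its fields, and the assumption that skip_ingestion bypasses only the ingestion step are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RawTaskSketch:
    """Hypothetical stand-in for GuestlineRawTask: runs subtasks in order."""

    ingest: Callable[[], None]      # stands in for GuestlineIngesterTask
    crawl: Callable[[], None]       # stands in for GuestlineCrawlerTask
    skip_ingestion: bool = False    # assumed to bypass only the API ingestion
    executed: List[str] = field(default_factory=list)

    def run(self) -> List[str]:
        # Ingestion must finish before crawling, because the crawler
        # reads the payloads that the ingester wrote.
        if not self.skip_ingestion:
            self.ingest()
            self.executed.append("ingest")
        self.crawl()
        self.executed.append("crawl")
        return self.executed
```

Passing the subtasks in as callables keeps the ordering logic testable without touching the Guestline API or S3.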

Subtasks

GuestlineIngesterTask

Connects to the Guestline API and downloads event logs for roompicks (property-specific) and personprofiles (global). This task writes payloads to the configured ingest bucket.
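To illustrate the difference between property-specific and global datasets, the sketch below plans which objects a single run might write. The key layout, the plan_ingest_keys function, and its parameters are assumptions made for illustration; the real bucket structure is not described here.

```python
from typing import Dict, List


def plan_ingest_keys(property_ids: List[str], run_date: str) -> Dict[str, List[str]]:
    """Map each dataset to the object keys one run would write.

    The key layout is hypothetical and only shows the per-property
    versus global split.
    """
    # roompicks are fetched once per property, so there is one key per property.
    roompicks = [
        f"guestline/roompicks/property={pid}/{run_date}.json"
        for pid in property_ids
    ]
    # personprofiles are global, so a run writes a single key.
    personprofiles = [f"guestline/personprofiles/{run_date}.json"]
    return {"roompicks": roompicks, "personprofiles": personprofiles}
```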

GuestlineCrawlerTask

Parses the S3 ingest payloads and transforms them into the raw models that are used by subsequent cleaner tasks.
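A minimal sketch of the parsing step, assuming a hypothetical payload shape of {"events": [...]} and a hypothetical payload_to_rows helper. The actual Guestline payload format and the crawler's real interface may differ.

```python
import json
from typing import Any, Dict, List


def payload_to_rows(payload: str, property_id: str) -> List[Dict[str, Any]]:
    """Flatten one ingest payload into one row per event.

    Assumes a hypothetical payload shape: {"events": [{...}, ...]}.
    """
    document = json.loads(payload)
    rows = []
    for event in document.get("events", []):
        row = dict(event)                  # copy so the parsed payload is not mutated
        row["property_id"] = property_id   # tag each row with its source property
        rows.append(row)
    return rows
```

Emitting one dictionary per event mirrors the one-row-per-event shape of the raw roompick data.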

Models

Requires

  • None (this is the initial data acquisition stage)

Provides

  • RawRoompicksModel - Raw Guestline roompick event data (one row per event)
  • RawPersonprofilesModel - Raw Guestline personprofile payloads

Best Practices

  1. Use sync_dates to perform incremental ingests.
  2. For local development, set skip_ingestion to load local test data and iterate on crawler logic without calling the API.
  3. Monitor API rate limits and gateway logs when running full ingestion jobs.
  4. Provide API keys through the JobContext configuration; never hardcode secrets.
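The practices above can be sketched as a small configuration builder. Everything here is an assumption for illustration: the sync_date_range and build_job_config helpers, the GUESTLINE_API_KEY variable name, and the shape of the resulting dictionary are hypothetical and do not reflect the real JobContext API.

```python
from datetime import date, timedelta
from typing import Any, Dict, List, Mapping


def sync_date_range(start: date, end: date) -> List[str]:
    """Return the inclusive list of ISO dates for one incremental run."""
    if end < start:
        raise ValueError("end must not be before start")
    days = (end - start).days
    return [(start + timedelta(days=i)).isoformat() for i in range(days + 1)]


def build_job_config(
    env: Mapping[str, str],
    sync_dates: List[str],
    skip_ingestion: bool = False,
) -> Dict[str, Any]:
    """Assemble a hypothetical job configuration without hardcoding secrets."""
    # The key comes from the environment mapping (hypothetical variable name),
    # never from a literal in the source code.
    api_key = env.get("GUESTLINE_API_KEY")
    if not api_key and not skip_ingestion:
        raise KeyError("GUESTLINE_API_KEY is required unless skip_ingestion is set")
    return {
        "api_key": api_key,
        "sync_dates": sync_dates,
        "skip_ingestion": skip_ingestion,
    }
```

For example, sync_date_range(date(2024, 1, 30), date(2024, 2, 1)) yields three dates, which keeps an incremental run bounded instead of reloading the full history.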

Downstream Tasks

The output of this task feeds into the subsequent cleaner tasks.