
GuestlineCrawlerTask

The GuestlineCrawlerTask parses the files in the ingest bucket for Guestline chains and writes their contents to Iceberg/raw tables for further processing. It maps the physical ingest file layout to model classes and standardizes the table naming convention.

Overview

This task:

  • Maps S3 ingest paths to RawRoompicksModel and RawPersonprofilesModel
  • Reads JSON payloads from the ingest bucket
  • Writes them into Iceberg/raw tables for downstream cleaning tasks
  • Uses get_files_path to locate both per-property and global payloads

Implemented Logic

  • get_files_path(table_name, property_id=None) returns the ingest path for the chain and, when property_id is given, for that property
  • get_model_for_table(table_name) maps roompicks to RawRoompicksModel and personprofiles to RawPersonprofilesModel
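
The two helpers above can be sketched roughly as follows. This is a minimal illustration, not the task's actual code: the path layout, the INGEST_PREFIX constant, the error handling, and the placeholder model classes with their payload field are all assumptions made for the example. Only the function names, the property_id=None default, and the table-to-model mapping come from this page.

```python
from dataclasses import dataclass
from typing import Optional


# Placeholder model classes; the real models' fields are not documented here.
@dataclass
class RawRoompicksModel:
    payload: dict


@dataclass
class RawPersonprofilesModel:
    payload: dict


# Table name -> model class, as described in the list above.
_MODELS = {
    "roompicks": RawRoompicksModel,
    "personprofiles": RawPersonprofilesModel,
}

# Hypothetical prefix inside the ingest bucket (assumption).
INGEST_PREFIX = "guestline"


def get_model_for_table(table_name: str) -> type:
    """Return the model class registered for a raw table name."""
    try:
        return _MODELS[table_name]
    except KeyError:
        raise ValueError(f"No model registered for table {table_name!r}") from None


def get_files_path(table_name: str, property_id: Optional[str] = None) -> str:
    """Build the ingest path for a table.

    Assumed layout: global payloads live under <prefix>/<table>/ and
    per-property payloads under <prefix>/<property_id>/<table>/.
    """
    if property_id is None:
        return f"{INGEST_PREFIX}/{table_name}/"
    return f"{INGEST_PREFIX}/{property_id}/{table_name}/"
```

Under these assumptions, omitting property_id selects the global path, and an unknown table name fails fast instead of silently writing to an unexpected table.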

Models

Requires

  • None (the ingester writes the ingest files that this task reads)

Provides

  • RawRoompicksModel
  • RawPersonprofilesModel