GuestMatchingCheckTask

The GuestMatchingCheckTask ensures all guests have a guest_cluster_id for downstream processing. If the column is missing, it creates it by copying from guest_id.

Overview

This task serves as a safety check and fallback mechanism:

  • Validates that the guest_cluster_id column exists
  • Creates the column by copying guest_id if it is missing
  • Ensures a consistent schema for downstream tasks
  • Allows the pipeline to run even when guest matching is skipped
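
The check described above can be sketched in a few lines. This is a minimal illustration, not the task's actual implementation: the source does not show which data framework the pipeline uses, so a plain dict of columns stands in for the guest table, and the function name ensure_guest_cluster_id is hypothetical.

```python
from typing import Dict, List

# Hypothetical stand-in for the guest table: column name -> column values.
Table = Dict[str, List]


def ensure_guest_cluster_id(table: Table) -> Table:
    """Return a table that is guaranteed to have a guest_cluster_id column."""
    if "guest_cluster_id" in table:
        # Guest matching already produced the column: pass the data through.
        return table
    if "guest_id" not in table:
        raise KeyError("guest_id column is required to build guest_cluster_id")
    # Fallback: copy guest_id, so every guest becomes its own cluster.
    result = dict(table)
    result["guest_cluster_id"] = list(table["guest_id"])
    return result


guests = {"guest_id": ["g1", "g2", "g3"], "name": ["Ana", "Ben", "Cy"]}
checked = ensure_guest_cluster_id(guests)
print(checked["guest_cluster_id"])  # ['g1', 'g2', 'g3']
```

Copying guest_id treats each guest as a cluster of one, which is why downstream tasks can rely on the column whether or not matching ran.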

Notes

  • If ProcessedGuestModel is not available (e.g. guest matching was not run), the task will fall back to CleanGuestModel and copy guest_id to guest_cluster_id.
  • The task does not perform incremental merging itself; it is intended as a schema safety net for downstream tasks.
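
The fallback in the first note might look like the sketch below. It assumes, purely for illustration, that an unavailable ProcessedGuestModel is represented as None and that both models are dicts of columns; the function name select_guest_input is hypothetical and not part of the pipeline's API.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for a model's data: column name -> column values.
Table = Dict[str, List]


def select_guest_input(processed: Optional[Table], clean: Table) -> Table:
    """Prefer processed guest data; otherwise fall back to clean guest data."""
    # Assumption: None means ProcessedGuestModel is not available.
    source = processed if processed is not None else clean
    result = dict(source)
    if "guest_cluster_id" not in result:
        # No matching result present: copy guest_id to guest_cluster_id.
        result["guest_cluster_id"] = list(result["guest_id"])
    return result


clean_guests = {"guest_id": [101, 102]}
processed_guests = {"guest_id": [101, 102], "guest_cluster_id": [7, 7]}

# Guest matching was not run: cluster IDs mirror guest IDs.
print(select_guest_input(None, clean_guests)["guest_cluster_id"])  # [101, 102]
# Guest matching was run: existing cluster IDs are kept.
print(select_guest_input(processed_guests, clean_guests)["guest_cluster_id"])  # [7, 7]
```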

Flow Diagram

Models

Requires

  • ProcessedGuestModel - Guest data from previous processing steps (if it is unavailable, the task falls back to CleanGuestModel, as described in Notes)

Provides

  • ProcessedGuestModel - Guest data with guaranteed guest_cluster_id

Typical situations in which guest matching is skipped and this task provides the fallback:

  • When testing other parts of the pipeline without waiting for expensive guest matching
  • When appending new records that will be matched in a separate batch process
  • When guest duplication across records is unlikely (e.g., API-driven systems with unique IDs)
