GuestMatchingCheckTask
The GuestMatchingCheckTask ensures all guests have a guest_cluster_id for downstream processing. If the column is missing, it creates it by copying from guest_id.
Overview
This task serves as a safety check and fallback mechanism:
- Validates that the guest_cluster_id column exists
- Creates the column from guest_id if missing
- Ensures consistent schema for downstream tasks
- Allows pipeline to run even when guest matching is skipped
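The check-and-create behavior above can be sketched roughly as follows. This is a minimal illustration rather than the task's actual implementation: it assumes guest records are plain Python dictionaries, and the function name ensure_guest_cluster_id is hypothetical.

```python
from typing import Any


def ensure_guest_cluster_id(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return copies of the rows that all carry a guest_cluster_id key.

    Hypothetical helper: records are modelled as dicts, and the "column"
    is treated as missing when no row has the key.
    """
    column_exists = any("guest_cluster_id" in row for row in rows)
    if column_exists:
        # Column already present (e.g. guest matching ran): keep values as-is.
        return [dict(row) for row in rows]
    # Column missing: copy guest_id so each guest forms its own cluster.
    return [{**row, "guest_cluster_id": row["guest_id"]} for row in rows]
```

With this fallback, every guest is treated as a singleton cluster, so downstream tasks can read guest_cluster_id without caring whether matching actually ran.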
Notes
- If ProcessedGuestModel is not available (e.g. guest matching was not run), the task will fall back to CleanGuestModel and copy guest_id to guest_cluster_id.
- The task does not perform incremental merging itself; it is intended as a schema safety net for downstream tasks.
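The model fallback described in these notes might look something like the sketch below. It assumes a hypothetical in-memory mapping (available_models) from model name to rows; how the real pipeline resolves models is not specified here, and select_guest_rows is an illustrative name only.

```python
from typing import Any, Optional

Rows = list[dict[str, Any]]


def select_guest_rows(available_models: dict[str, Rows]) -> Rows:
    """Pick the guest rows to pass downstream (illustrative only)."""
    processed: Optional[Rows] = available_models.get("ProcessedGuestModel")
    if processed is not None:
        # Guest matching output exists: use it unchanged.
        return processed
    # Assumed fallback path: start from CleanGuestModel and copy guest_id.
    clean = available_models["CleanGuestModel"]
    return [{**row, "guest_cluster_id": row["guest_id"]} for row in clean]
```

Note that the sketch only selects a source and fills the column; consistent with the note above, it does no incremental merging.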
Flow Diagram
Models
Requires
ProcessedGuestModel - Guest data from previous processing steps
Provides
ProcessedGuestModel - Guest data with guaranteed guest_cluster_id
Skipping guest matching and relying on this fallback is reasonable in cases such as:
- When testing other parts of the pipeline without waiting for expensive guest matching.
- When appending new records that will be matched in a separate batch process.
- When guest duplication across records is unlikely (e.g., API-driven systems with unique IDs).
Related Tasks
- GuestMatchingTask - Actual guest matching
- ProcessingTask - Parent orchestrator
- GuestLoyaltyTask - Requires guest_cluster_id