Tasks Glossary

Tasks are the fundamental building blocks of the ETL pipeline. They encapsulate specific data processing operations and can be composed together to build complex workflows.

Core Task Classes

Task

The base abstract class that all tasks inherit from. Defines the task lifecycle, input/output models, and execution flow.
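
A minimal sketch of what such a base class could look like, assuming a simple "validate, then run" lifecycle. The names `TaskInput`, `TaskOutput`, `validate`, `run`, and `execute` are hypothetical and chosen for illustration only; the real class may be structured differently.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any


@dataclass
class TaskInput:
    """Hypothetical input model: named datasets handed to a task."""
    data: dict[str, Any] = field(default_factory=dict)


@dataclass
class TaskOutput:
    """Hypothetical output model: named datasets produced by a task."""
    data: dict[str, Any] = field(default_factory=dict)


class Task(ABC):
    """Illustrative base class; the method names are assumptions."""

    def validate(self, task_input: TaskInput) -> None:
        # Default validation accepts everything; subclasses may override.
        return None

    @abstractmethod
    def run(self, task_input: TaskInput) -> TaskOutput:
        """Perform the task's data processing."""

    def execute(self, task_input: TaskInput) -> TaskOutput:
        # Fixed execution flow: validate first, then run.
        self.validate(task_input)
        return self.run(task_input)


class UppercaseNamesTask(Task):
    """Toy subclass used only to exercise the lifecycle."""

    def run(self, task_input: TaskInput) -> TaskOutput:
        names = task_input.data.get("names", [])
        return TaskOutput(data={"names": [n.upper() for n in names]})
```

Keeping the flow in `execute` and the work in `run` means every subclass goes through the same lifecycle steps without repeating them.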

TaskGroup

A container for grouping multiple tasks together for organized execution.


Common Processing Tasks

Tasks used across all hotel chains for standard data processing operations.

GuestLoyaltyTask

Computes guest loyalty metrics including stays, lifetime value, booking patterns, and loyalty categories.

GuestMatchingCheckTask

Ensures every guest has a cluster ID for downstream processing by copying guest_id into cluster_id wherever cluster_id is missing.
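
The fallback can be sketched as follows. Plain Python dictionaries stand in for DataFrame rows here, and "missing" is assumed to mean an absent key or a `None` value; the function name `ensure_cluster_id` is hypothetical.

```python
def ensure_cluster_id(guests: list[dict]) -> list[dict]:
    """Return copies of the guest records in which a missing cluster_id
    falls back to the record's own guest_id."""
    fixed = []
    for guest in guests:
        record = dict(guest)  # copy so the input is left untouched
        if record.get("cluster_id") is None:
            record["cluster_id"] = record["guest_id"]
        fixed.append(record)
    return fixed


guests = [
    {"guest_id": "g1", "cluster_id": "c7"},
    {"guest_id": "g2", "cluster_id": None},
    {"guest_id": "g3"},
]
checked = ensure_cluster_id(guests)
```

After this step, an unmatched guest forms a cluster of one, so downstream tasks can group by `cluster_id` without special cases.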

GuestMatchingTask

Matches and deduplicates guests using the Splink library for record linkage.

ProcessingTask

Orchestrator task that runs all common processing subtasks in sequence.

ReplaceValuesTask

Replaces values in DataFrames based on configured mappings from job configuration.
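
A minimal sketch of mapping-driven replacement, assuming the job configuration supplies a `{column: {old_value: new_value}}` structure. That shape, the function name `replace_values`, and the sample column names are assumptions, and plain dictionaries stand in for DataFrame rows.

```python
def replace_values(rows: list[dict], column_mappings: dict[str, dict]) -> list[dict]:
    """Apply per-column value mappings; unmapped values pass through unchanged."""
    replaced = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are not mutated
        for column, mapping in column_mappings.items():
            if column in new_row:
                value = new_row[column]
                new_row[column] = mapping.get(value, value)
        replaced.append(new_row)
    return replaced


# Hypothetical configuration fragment.
mappings = {"channel": {"WEB": "Website", "OTA1": "Online Travel Agent"}}
rows = [
    {"reservation_id": "R1", "channel": "WEB"},
    {"reservation_id": "R2", "channel": "PHONE"},
]
cleaned = replace_values(rows, mappings)
```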

ReservationMetricsTask

Calculates key metrics for reservations including cancellation window, booking window, and stay nights.
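
As an illustration, the three metrics can be derived from reservation dates. The definitions below are assumptions based on common hotel usage rather than the task's confirmed logic: booking window as days from booking to arrival, cancellation window as days from cancellation to arrival, and stay nights as days from arrival to departure. The function and field names are hypothetical.

```python
from datetime import date
from typing import Optional


def reservation_metrics(
    booked_on: date,
    arrival: date,
    departure: date,
    cancelled_on: Optional[date] = None,
) -> dict:
    """Compute date-based metrics for one reservation (assumed definitions)."""
    return {
        # Lead time between making the booking and arriving.
        "booking_window": (arrival - booked_on).days,
        # Number of nights between arrival and departure.
        "stay_nights": (departure - arrival).days,
        # Days of notice given before arrival; None if never cancelled.
        "cancellation_window": (
            (arrival - cancelled_on).days if cancelled_on is not None else None
        ),
    }


metrics = reservation_metrics(
    booked_on=date(2025, 1, 1),
    arrival=date(2025, 1, 15),
    departure=date(2025, 1, 18),
    cancelled_on=date(2025, 1, 10),
)
```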

RoomMetricsTask

Calculates daily stay metrics per room including stay-day/night flags, length of stay, and revenue aggregations.


Reporting Tasks

Tasks that generate final reports for consumption by end users.

BookingRoomsReportTask

Aggregates daily room data to the reservation level, summing all charges per booking.
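
The aggregation amounts to a group-by-and-sum over daily rows. The sketch below uses plain dictionaries instead of DataFrames, and the column names `reservation_id`, `room_revenue`, and `other_charges` are assumptions made for the example.

```python
from collections import defaultdict


def aggregate_to_reservation(daily_rows: list[dict], charge_columns: list[str]) -> list[dict]:
    """Sum each charge column over all daily rows of the same reservation."""
    totals: dict = defaultdict(lambda: {c: 0.0 for c in charge_columns})
    for row in daily_rows:
        for column in charge_columns:
            totals[row["reservation_id"]][column] += row.get(column, 0.0)
    # One output row per reservation, in order of first appearance.
    return [{"reservation_id": rid, **charges} for rid, charges in totals.items()]


daily_rows = [
    {"reservation_id": "R1", "stay_date": "2025-01-15", "room_revenue": 100.0, "other_charges": 10.0},
    {"reservation_id": "R1", "stay_date": "2025-01-16", "room_revenue": 120.0, "other_charges": 0.0},
    {"reservation_id": "R2", "stay_date": "2025-01-15", "room_revenue": 90.0, "other_charges": 5.0},
]
report = aggregate_to_reservation(daily_rows, ["room_revenue", "other_charges"])
```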

DailyReportTask

Produces one row per stay date (per booking), showing detailed daily room activity.

GuestLoyaltyReportTask

Generates a guest loyalty report with KPIs and loyalty flags based on booking and stay history.

PickupReportTask

Generates a pickup and pace report that compares current bookings with historical snapshots.
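
One way to picture the comparison, assuming pickup is the change in rooms booked per stay date between a historical snapshot and the current state. This definition, the function name `pickup_by_stay_date`, and the input shape (stay date mapped to rooms booked) are assumptions, not the task's confirmed logic.

```python
def pickup_by_stay_date(current: dict[str, int], snapshot: dict[str, int]) -> dict[str, int]:
    """Return current minus snapshot rooms booked for every stay date seen in either input."""
    stay_dates = sorted(set(current) | set(snapshot))
    return {d: current.get(d, 0) - snapshot.get(d, 0) for d in stay_dates}


snapshot = {"2025-02-01": 40, "2025-02-02": 35}
current = {"2025-02-01": 46, "2025-02-02": 33, "2025-02-03": 12}
pickup = pickup_by_stay_date(current, snapshot)
```

A negative value indicates net cancellations for that stay date since the snapshot was taken.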

ReportsTask

Orchestrator task that runs all report generation subtasks.


Chain-Specific Tasks

Tasks that are specific to individual hotel chains.

Athenaeum

AthenaeumLegacyMergeTask

Merges legacy Athenaeum data (pre-2025) into processed DataFrames.

AthenaeumRawTask

Orchestrates both ingestion and crawling for Athenaeum data.

CleanAthenaeumTask

Cleans and transforms raw Athenaeum data into standardized guest, reservation, and room models.

Opera (Cheval Collection)

OperaRawTask

Orchestrates both ingestion and crawling for Opera data (Cheval Collection).

OperaIngesterTask

Downloads raw files for Opera.

OperaCrawlerTask

Parses raw Opera payloads and produces raw models used by the cleaner.

CleanOperaTask

Cleans and transforms Opera raw data into standardized guest, reservation, and room models.

OperaLegacyMergeTask

Merges legacy Opera data from S3 into processed DataFrames.

CMTPFixTask

Applies a targeted workaround for cancellations at Cheval Maison The Palm Dubai (CMTP).