GuestMatchingCheckTask — Developer Reference
This developer reference provides code examples and implementation details for GuestMatchingCheckTask.
Example Usage
Standard Usage
from etl_lib.tasks.processing.GuestMatchingCheckTask import GuestMatchingCheckTask
task = GuestMatchingCheckTask(
    job_context=job_context,
    write_to_catalog=True
)
task.run()

When Guest Matching is Skipped
from etl_lib.tasks.processing.ProcessingTask import ProcessingTask
# Skip GuestMatchingTask but ensure cluster IDs exist
task = ProcessingTask(
    job_context=job_context,
    skip_subtasks=["GuestMatchingTask"]
)
# GuestMatchingCheckTask will create cluster IDs from guest_id
task.run()

Implementation
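The rule this task enforces is small: if the guest data has no guest_cluster_id column, every guest becomes its own cluster by copying guest_id. As a rough illustration only, the sketch below applies that rule to a plain columnar dict instead of a Spark DataFrame, so it runs without a cluster. The Table alias and the ensure_guest_cluster_id helper are hypothetical names introduced for this example and are not part of etl_lib.

```python
from typing import Dict, List

# Hypothetical stand-in for a DataFrame: column name -> list of column values.
Table = Dict[str, List[object]]


def ensure_guest_cluster_id(table: Table) -> Table:
    """Return a table that is guaranteed to have a guest_cluster_id column.

    Illustrative helper, not an etl_lib API. If the column already exists the
    table is returned unchanged; otherwise guest_id is copied into it.
    """
    if "guest_cluster_id" in table:
        return table
    # No matching was run: treat every guest as a cluster of one.
    return {**table, "guest_cluster_id": list(table["guest_id"])}


# Guest matching was skipped, so there is no cluster column yet.
unmatched = {"guest_id": ["g1", "g2"], "email": ["a@example.com", "b@example.com"]}
# Guest matching already ran and merged both guests into one cluster.
matched = {"guest_id": ["g1", "g2"], "guest_cluster_id": ["c1", "c1"]}

print(ensure_guest_cluster_id(unmatched)["guest_cluster_id"])  # ['g1', 'g2']
print(ensure_guest_cluster_id(matched)["guest_cluster_id"])    # ['c1', 'c1']
```

The check is made on the column as a whole rather than row by row, mirroring the column test in the run method that follows, which does the same thing on a Spark DataFrame with withColumn.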
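# Assumed import: F is the conventional alias for pyspark.sql.functions
from pyspark.sql import functions as F

def run(self):
    # Prefer ProcessedGuestModel; fall back to CleanGuestModel if it is not present
    try:
        guest_df = self.get_df_from_input(ProcessedGuestModel)
    except Exception:
        guest_df = self.get_df_from_input(CleanGuestModel)

    # No cluster column means guest matching did not run: use guest_id as the cluster ID
    if "guest_cluster_id" not in guest_df.columns:
        print("Guest cluster id column not found. Copying guest_id to guest_cluster_id")
        guest_df = guest_df.withColumn(
            "guest_cluster_id",
            F.col("guest_id")
        )

    # Write the (possibly updated) DataFrame out as ProcessedGuestModel
    self.write_to_output(ProcessedGuestModel, guest_df)

See also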
- Developer Reference: /development/reference/tasks/processing/guest-matching-check-task
- Process documentation: /processes/tasks/processing/guest-matching-check-task