GuestMatchingCheckTask — Developer Reference

This developer reference provides code examples and implementation details for the GuestMatchingCheckTask.

Example Usage

Standard Usage

```python
from etl_lib.tasks.processing.GuestMatchingCheckTask import GuestMatchingCheckTask

task = GuestMatchingCheckTask(
    job_context=job_context,
    write_to_catalog=True
)
task.run()
```

When Guest Matching is Skipped

```python
from etl_lib.tasks.processing.ProcessingTask import ProcessingTask

# Skip GuestMatchingTask but ensure cluster IDs exist
task = ProcessingTask(
    job_context=job_context,
    skip_subtasks=["GuestMatchingTask"]
)

# GuestMatchingCheckTask will create cluster IDs from guest_id
task.run()
```

Implementation

```python
def run(self):
    # Fallback to CleanGuestModel if ProcessedGuestModel not present
    try:
        guest_df = self.get_df_from_input(ProcessedGuestModel)
    except Exception:
        guest_df = self.get_df_from_input(CleanGuestModel)

    if "guest_cluster_id" not in guest_df.columns:
        print("Guest cluster id column not found. Copying guest_id to guest_cluster_id")
        guest_df = guest_df.withColumn(
            "guest_cluster_id",
            F.col("guest_id")
        )

    self.write_to_output(ProcessedGuestModel, guest_df)
```
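The fallback logic in `run` can be illustrated without Spark or `etl_lib`. The following is a minimal sketch, not the library's implementation: `Table`, `pick_guest_table`, and `ensure_cluster_id` are hypothetical stand-ins for the DataFrame, `get_df_from_input`, and the `withColumn` step, and a plain dictionary keyed by model name is assumed in place of the task's inputs.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Table:
    """Hypothetical stand-in for a Spark DataFrame: column names plus rows."""
    columns: List[str]
    rows: List[Dict[str, object]]


def pick_guest_table(inputs: Dict[str, Table]) -> Table:
    """Prefer the processed guest table; fall back to the clean one."""
    try:
        return inputs["ProcessedGuestModel"]
    except KeyError:
        return inputs["CleanGuestModel"]


def ensure_cluster_id(table: Table) -> Table:
    """Return the table unchanged if guest_cluster_id exists, else copy guest_id into it."""
    if "guest_cluster_id" in table.columns:
        return table
    return Table(
        columns=table.columns + ["guest_cluster_id"],
        rows=[{**row, "guest_cluster_id": row["guest_id"]} for row in table.rows],
    )


# Only the clean table is available, as when guest matching was skipped.
inputs = {
    "CleanGuestModel": Table(
        columns=["guest_id", "email"],
        rows=[
            {"guest_id": "g1", "email": "a@example.com"},
            {"guest_id": "g2", "email": "b@example.com"},
        ],
    )
}

result = ensure_cluster_id(pick_guest_table(inputs))
for row in result.rows:
    print(row["guest_id"], row["guest_cluster_id"])
```

In this sketch every guest ends up in its own single-member cluster, which mirrors what copying `guest_id` into `guest_cluster_id` implies when no matching has been performed.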
