Guest Loyalty
Guest loyalty summarizes how often, how recently, and how valuable a guest is. It counts stays and nights, measures booking lead time, tracks first/last and upcoming stays, lists properties visited, and totals room revenue when available. It flags returning guests, buckets stay frequency, records the last booked property/channel/source/agent/segment/rate, and tags each reservation with the guest’s booking order.
Logic for Aggregating per guest cluster
Inputs
- Expects
res_df(reservations) andguest_dfto exist on the workflow context.res_dfshould contain at least:res_id,booking_date,check_in_date,check_out_date,status, andguest_id(orguest_cluster_idafter matching). Optional but commonly used columns:total_room_net,total_room_rate_net,stay_nights,property_id,channel,source,travel_agent,market_segment,room_rate_code, andcreated_timestamp.
Outputs
- Writes back an enriched
guest_dfaggregated byguest_cluster_idwith loyalty metrics such asstays,total_stay_nights,avg_stay_nights,lifetime_total_value,avg_total_net_revenue_per_stay,first_booking_date,last_booking_date,last_stayed_property_id,properties_visited(comma-separated),returning_guest,stays_category, and a set of last-booked fields (channel, source, travel agent, market segment, rate code). - Updates
res_dfby joining aguest_cluster_id(fromguest_df) and addingnth_booking(1-based bybooking_date) for reservations with status inconfirmed,checked_in, orchecked_out.
Key behaviours and notes
- Only reservations with status in [“confirmed”, “checked_in”, “checked_out”] are considered when computing loyalty aggregates and when calculating
time_between_bookings(the code filtersres_dfbefore joining guests). time_between_bookingsis computed using a windowedlag(booking_date)partitioned byguest_cluster_idand ordered bybooking_date.- Revenue metrics are only aggregated if
total_room_net/total_room_rate_netcolumns exist; the implementation conditionally extends aggregation expressions when these columns are present. properties_visitedis built by collecting a set ofproperty_idvalues for past stays and concatenating them with commas.stays_categorymaps counts into buckets:1,2,3,4-7,8-10,11-20,20+.
Edge cases and failure modes
- If
guest_cluster_idis missing onguest_df(for example when matching is skipped),Guest Matching Checkshould be used to copyguest_idintoguest_cluster_idto avoid null joins. Without aguest_cluster_idthe loyalty aggregation will not group guests correctly. - The code casts several aggregated longs into integers (e.g.,
stays,total_stay_nights,future_checkins) to ensure stable types for downstream consumers. - The
nth_bookingnumbering only includes reservations with an eligible status and non-nullguest_cluster_id. - Null or missing date columns will propagate nulls in booking window and stay-night related metrics; the code guards some computations with conditional
whenclauses but relies on valid dates for meaningful numeric values.
Example (high level)
Given a workflow context with res_df and guest_df:
- Running the Guest Loyalty runnable produces an updated
guest_dfkeyed byguest_cluster_idwith aggregated loyalty metrics and an updatedres_dfcontainingguest_cluster_idandnth_bookingfor eligible reservations.
See also
- Guest Matching — Splink-based record linkage that produces
guest_cluster_idused by loyalty. - Guest Matching Check — ensures a
guest_cluster_idexists when matching is skipped.
Last updated on