Room Metrics
- Joins room-stay rows with reservation status.
- Computes length of stay (days) where applicable.
- Sets per-stay-date flags for check-in day, check-out day, stay-day, and stay-night.
- Normalizes revenue components and aggregates per reservation.
- Adds reservation-level check-in/out dates and guest totals when missing.
- Cleans up temporary fields and updates the rooms dataset.
Inputs
- Expects `rooms_df` and `res_df` on the workflow context. `res_df` is used to provide the reservation `status` for each `res_id`. `rooms_df` should contain fields such as `res_id`, `room_stay_date`, `room_check_in_date`, `room_check_out_date`, `room_stay_date_rate_net`, `room_stay_date_fnb_net`, `room_stay_date_other_net`, `room_pm`, and guest counts (`room_guests_adults`, `room_guests_children`) when available.
Outputs
- Writes back an enriched `rooms_df` with computed columns: `room_length_of_stay`, `room_stay_date_total_net` (sum of rate/fnb/other), `room_stay_date_is_check_in_day`, `room_stay_date_is_check_out_day`, `room_stay_date_is_stay_day`, `room_stay_date_is_stay_night`, and aggregated reservation-level columns such as `check_in_date`, `check_out_date`, `guests_adults`, `guests_children`, `total_room_net`, `total_room_rate_net`, `total_room_fnb_net`, `total_room_other_net` when the corresponding day-level fields are present.
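As a rough illustration of two of the day-level columns, the following plain-Python sketch derives `room_stay_date_total_net` and `room_length_of_stay` for a single row represented as a dict. It is not the runnable's Spark code: the helper name `derive_day_values` is hypothetical, and the current/future-row restriction on length of stay is left out for brevity.

```python
from datetime import date


def derive_day_values(row):
    """Return derived values for one room-stay-date row (a plain dict).

    Hypothetical helper: it mirrors the documented rules, not the actual code.
    """
    def or_zero(value):
        # Treat a missing revenue component as 0.0 so the total is never null.
        return 0.0 if value is None else value

    derived = {
        "room_stay_date_total_net": (
            or_zero(row.get("room_stay_date_rate_net"))
            + or_zero(row.get("room_stay_date_fnb_net"))
            + or_zero(row.get("room_stay_date_other_net"))
        )
    }

    check_in = row.get("room_check_in_date")
    check_out = row.get("room_check_out_date")
    # Length of stay is only derived when both dates are present;
    # otherwise the key is omitted, i.e. the column is left unchanged.
    if check_in is not None and check_out is not None:
        derived["room_length_of_stay"] = abs((check_out - check_in).days)
    return derived


example = {
    "room_stay_date_rate_net": 100.0,
    "room_stay_date_fnb_net": None,
    "room_check_in_date": date(2024, 5, 12),
    "room_check_out_date": date(2024, 5, 15),
}
print(derive_day_values(example))
```

In Spark the same ideas would typically be expressed with column functions such as `coalesce` and `datediff`; the dict form is used here only to keep the sketch self-contained.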
Behaviour and business rules
- The runnable joins `rooms_df` with `res_df.select("res_id", "status")` (broadcast). It removes duplicates and repartitions by `res_id`.
- Length of stay (`room_length_of_stay`) is computed as the absolute datediff between `room_check_out_date` and `room_check_in_date` for current/future rows. If the necessary check-in/out fields are missing, that column is left unchanged.
- Day-level revenue components are coalesced to 0.0 to avoid nulls before aggregation.
- Stay-day and stay-night flags are computed using several exclusion rules:
  - Non-room bookings (based on booking type) are excluded.
  - Cancelled, no_show and waitlist statuses are excluded.
  - PM rooms (`room_pm`) are excluded.
  - Day-use stays (same check-in and check-out date) are handled specially: they can count as a stay-day but not a stay-night.
  - Check-out day for overnight stays is excluded from stay-day/stay-night.
  - Past dates may be excluded unless the reservation is in `checked_in`, `checked_out` or `in_house` status.
- Aggregations per `res_id` compute min/max check-in/out dates, sums of guest counts, and sums of revenue components, which are then joined back to the room rows.
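The exclusion rules for the stay-day and stay-night flags can be sketched as a single function over one row. This is a plain-Python approximation under several assumptions, not the runnable's implementation: the column name `booking_type`, the exact status spellings, the `today` parameter, and the reading of "past dates may be excluded" as "always excluded unless the status is realized" are all hypothetical.

```python
from datetime import date

# Assumed spellings of the status values named in the rules above.
EXCLUDED_STATUSES = {"cancelled", "no_show", "waitlist"}
REALIZED_STATUSES = {"checked_in", "checked_out", "in_house"}


def stay_flags(row, today):
    """Return (is_stay_day, is_stay_night) for one room-stay-date row."""
    stay_date = row.get("room_stay_date")
    check_in = row.get("room_check_in_date")
    check_out = row.get("room_check_out_date")
    status = row.get("status")

    # Assumption: rows with missing dates get no flags.
    if stay_date is None or check_in is None or check_out is None:
        return (False, False)

    eligible = (
        row.get("booking_type") == "ROOM"          # non-room bookings excluded
        and status not in EXCLUDED_STATUSES        # cancelled / no_show / waitlist
        and not row.get("room_pm", False)          # PM rooms excluded
        and (stay_date >= today or status in REALIZED_STATUSES)  # past dates
    )
    if not eligible:
        return (False, False)

    if check_in == check_out:
        # Day-use stay: may count as a stay-day, never as a stay-night.
        return (stay_date == check_in, False)

    # Overnight stay: every date from check-in up to, but not including,
    # the check-out day counts as both a stay-day and a stay-night.
    in_stay = check_in <= stay_date < check_out
    return (in_stay, in_stay)


example = {
    "booking_type": "ROOM",
    "status": "confirmed",
    "room_pm": False,
    "room_stay_date": date(2024, 5, 12),
    "room_check_in_date": date(2024, 5, 12),
    "room_check_out_date": date(2024, 5, 14),
}
print(stay_flags(example, today=date(2024, 5, 10)))  # (True, True)
```

Passing `today` explicitly keeps the function deterministic, which makes the past-date rule easy to test in isolation.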
Edge cases and notes
- If `status` is present in the joined DataFrame, it is dropped after use to avoid leaking temporary columns.
- The code uses `checkpoint(eager=True)` on intermediate DataFrames to limit lineage and improve stability for long pipelines.
- Booking type is inspected; rows that are not ROOM bookings are excluded from stay-day/night flags.
- Null or missing fields will result in missing derived values; the runnable is defensive but expects typical reservation and room-day fields to be present for meaningful results.
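To make the effect of missing fields concrete, here is a small plain-Python sketch of the per-`res_id` aggregation that mimics Spark's behaviour of skipping nulls in `min`, `max` and `sum`, and returning null when every input is null. The function name `aggregate_reservations` is hypothetical, and guest counts are left out because this page does not say at which granularity they are summed.

```python
from collections import defaultdict
from datetime import date


def aggregate_reservations(rows):
    """Group room-day rows (dicts) by res_id and compute reservation-level values."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["res_id"]].append(row)

    result = {}
    for res_id, group in grouped.items():
        # Nulls are skipped, as Spark's min/max aggregates do.
        check_ins = [r["room_check_in_date"] for r in group
                     if r.get("room_check_in_date") is not None]
        check_outs = [r["room_check_out_date"] for r in group
                      if r.get("room_check_out_date") is not None]
        result[res_id] = {
            # No usable date on any row means the aggregate stays missing.
            "check_in_date": min(check_ins) if check_ins else None,
            "check_out_date": max(check_outs) if check_outs else None,
            # Revenue was coalesced to 0.0 upstream, so the sum is never null.
            "total_room_net": sum(
                r.get("room_stay_date_total_net") or 0.0 for r in group
            ),
        }
    return result


rows = [
    {"res_id": "R1", "room_check_in_date": date(2024, 5, 12),
     "room_check_out_date": date(2024, 5, 14), "room_stay_date_total_net": 80.0},
    {"res_id": "R1", "room_check_in_date": date(2024, 5, 13),
     "room_check_out_date": date(2024, 5, 15), "room_stay_date_total_net": 120.0},
    {"res_id": "R2", "room_check_in_date": None,
     "room_check_out_date": None, "room_stay_date_total_net": None},
]
summary = aggregate_reservations(rows)
print(summary["R1"]["total_room_net"])  # 200.0
print(summary["R2"]["check_in_date"])   # None
```

Reservation `R2` shows the edge case: with no check-in or check-out date on any row, the reservation-level dates remain missing, while the revenue total falls back to 0.0.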