# Replace Values

Replaces values in a dataset using a configured mapping.
## Inputs

- Expects one or more of `res_df`, `rooms_df`, `guest_df` to be present on the workflow context. The runnable loops over these DataFrames and inspects every column for configured replacements.
## Configuration

- Mappings are provided through the job configuration via `get_value_replacements_for_column(column)` (the exact config key lives in the job context). The mapping should be a dictionary from original values to replacement values (`old_value -> new_value`).
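A hypothetical example of the mapping shape that `get_value_replacements_for_column` might return (the values here are illustrative, borrowed from the conceptual example below):

```python
# Hypothetical return value for get_value_replacements_for_column("status");
# keys are the original values, values are their replacements.
status_mapping = {
    "Cancelled": "cancelled",
    "No Show": "no_show",
}

# Values not present as keys are left untouched by the runnable.
assert status_mapping.get("Cancelled", "Cancelled") == "cancelled"
assert status_mapping.get("Checked In", "Checked In") == "Checked In"
```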
## Behaviour

- For each target DataFrame and for each column that has a replacement mapping:
  - The original value is preserved in a new column named `<column>_og`.
  - A small mapping DataFrame is built in Spark from the mapping dict and left-joined to the target DataFrame on the column value.
  - The column is replaced using `coalesce(new_val, column)`, so unmapped values are preserved.
- After processing all columns in a DataFrame, the code checkpoints the DataFrame eagerly and writes it back to the workflow context under the same attribute name.
## Outputs

- Updated DataFrames on the workflow context, with an `_og` column for each replaced field and values updated according to the provided mapping.
## Edge cases

- If a target DataFrame is not present on the context, it is skipped.
- If no mapping exists for a given column, it is left unchanged (no `_og` column is created).
- The implementation uses a join-based approach, which is robust for many mappings but may be less efficient for extremely large mapping dictionaries; consider a broadcast join or a UDF for very large mappings if performance becomes an issue.
## Example (conceptual)

- If the job config contains a mapping for `status`: `{"Cancelled": "cancelled", "No Show": "no_show"}`, then running Replace Values will produce a `status_og` column with the original values and replace `status` values according to the mapping.
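The conceptual example above can be traced with plain Python, modelling the coalesce fallback with `dict.get` (the row dicts here stand in for DataFrame rows):

```python
mapping = {"Cancelled": "cancelled", "No Show": "no_show"}
rows = [{"status": "Cancelled"}, {"status": "No Show"}, {"status": "Checked In"}]

for row in rows:
    row["status_og"] = row["status"]                           # preserve original
    row["status"] = mapping.get(row["status"], row["status"])  # replace or keep

assert rows[0] == {"status": "cancelled", "status_og": "Cancelled"}
assert rows[1] == {"status": "no_show", "status_og": "No Show"}
assert rows[2] == {"status": "Checked In", "status_og": "Checked In"}
```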