Missing-data

Coalescing joins in dplyr

Filling in missing data by joining

Edward Visel

4 minute read

When aggregating data, it is not uncommon to need to combine datasets containing identical non-key variables in varying states of completeness. There are various ways to accomplish this task. One possibility an coalescing join, a join in which missing values in x are filled with matching values from y. Such behavior does not exist in current dplyr joins, though it has been discussed, and so may someday. For now, let’s build an coalesce_join function.