A Gateway Drug to purrr

A lot of the data I work with uses numeric codes rather than text to describe features of each record. For example, financial data often has a fund code that represents the account’s source of dollars and an object code that signals what is bought (e.g. salaries, benefits, supplies). This is a little like the factor data type in R, which to the frustration of many modern analysts is internally an integer that mapped to a character label (which is a level) with a fixed number of possible values.

I am often looking at data stored like this:

fund_code	object_code	debit	credit
1000	2121	0	10000
1000	2122	1000	0

with the labels stored in another set of tables:

fund_code	fund_name
1000	General

and

object_code	object_name
2121	Social Security
2122	Life Insurance

Before purrr, I might have done a series of dplyr::left_join or merge to combine these data sets and get the labels in the same data.frame as my data.

But no longer!

Now, I can just create a list, add all the data to it, and use purrr:reduce to bring the data together. Incredibly convenient when up to 9 codes might exist for a single record!

# Assume each code-name pairing is in a CSV file in a directory
data_codes <- lapply(dir('codes/are/here/', full.names = TRUE ), 
                     readr::read_csv)
data_codes$transactions <- readr::read_csv('my_main_data_table.csv')
transactions <- purrr:reduce_right(data_codes, dplyr::left_join)