Functional Programming in R with purrr

I have been writing about the family of packages in the tidyverse. The tidyverse is a collection of R packages that share a common design philosophy and offer data science tools for data manipulation, exploration, and visualization. Today, I will talk about purrr.

As I enter the second-to-last semester of my Master of Science in Data Science program, I have become accustomed to writing the same kind of function call many times over for various analyses. But retyping nearly identical code invites mistakes, which then throw errors. Take the following code, for example:

aov_mpg <- aov(mpg ~ factor(cyl), data = mtcars)
summary(aov_mpg)

aov_disp <- aov(disp ~ factor(cyll), data = mtcars)  # typo: cyll
summary(aov_disp)

aov_hp <- aov(hp ~ factor(cyl), data = mrcars)       # typo: mrcars
summry(aov_hpp)                                      # typos: summry, aov_hpp

aov_wt <- aov(wt ~ factor(cyl), datas = mtcars)      # typo: datas
summary(aov_wt)

In the code chunk above, if you wanted to run the ANOVAs by number of gears instead of number of cylinders, you would have to go back and change factor(cyl) to factor(gear) four times! This is not very efficient, and you are likely to end up with mistakes like the ones above, because you have to type everything multiple times. It gets worse if you have to repeat the same call for hundreds of variables.
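The four calls above differ only in the response variable, so the shared part can be written once. Here is a minimal sketch in base R, assuming a hypothetical helper named fit_aov() (the helper and its argument names are illustrative and not part of any package):

```r
# Hypothetical helper: one-way ANOVA of `response` on factor(`group`).
# The function name and its arguments are assumptions made for illustration.
fit_aov <- function(response, group = "cyl", data = mtcars) {
  # reformulate() builds a formula such as mpg ~ factor(cyl) from strings
  f <- reformulate(sprintf("factor(%s)", group), response = response)
  aov(f, data = data)
}

summary(fit_aov("mpg"))                   # grouped by cylinders
summary(fit_aov("disp", group = "gear"))  # grouped by gears
```

With the formula built in one place, a misspelled column name can only occur at the call site, and switching the grouping variable becomes a single argument.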

This is where purrr comes in. purrr reduces this repetition by letting you iterate over inputs instead of copying and pasting code. Here we use purrr to run the same one-way ANOVAs for several dependent variables against one fixed independent variable. purrr requires less code, and if we were to change a variable, we would change it in the pipeline rather than in every call. That's the beauty of purrr.

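library(dplyr)       # mutate(), select(), %>%
library(purrr)       # map(), map_dfr()
library(broom)       # tidy()
library(knitr)       # kable()
library(kableExtra)  # kable_styling()

mtcars %>%
  mutate(cyl = factor(cyl)) %>%
  select(mpg, disp, hp) %>%
  # select() drops cyl, so the formula finds the numeric cyl in mtcars (hence df = 1 below)
  map(~ aov(.x ~ cyl, data = mtcars)) %>%
  map_dfr(~ tidy(.), .id = 'source') %>%
  mutate(p.value = round(p.value, 5)) %>%
  kable() %>%
  kable_styling()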
source  term       df        sumsq        meansq  statistic  p.value
mpg     cyl         1     817.7130     817.71295   79.56103        0
mpg     Residuals  30     308.3342      10.27781         NA       NA
disp    cyl         1  387454.0926  387454.09261  130.99888        0
disp    Residuals  30   88730.7021    2957.69007         NA       NA
hp      cyl         1  100984.1721  100984.17209   67.70993        0
hp      Residuals  30   44742.7029    1491.42343         NA       NA
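
To make the "change it once" idea concrete, here is a minimal sketch in which the grouping variable lives in a single string. The names group_var, responses, cars, and anova_table are illustrative assumptions, not part of the original analysis. Because the grouping column is converted to a factor in the data that aov() actually receives, the cyl term here has 2 degrees of freedom (three cylinder levels), so the numbers will differ from the table above, where df is 1.

```r
library(dplyr)
library(purrr)
library(broom)

group_var <- "cyl"                    # change this one string (e.g. to "gear") to regroup
responses <- c("mpg", "disp", "hp")   # dependent variables to test

# Convert the grouping column to a factor in the data passed to aov()
cars <- mtcars %>% mutate(across(all_of(group_var), factor))

anova_table <- responses %>%
  set_names() %>%                     # name each element so .id can label the rows
  map(~ aov(reformulate(group_var, response = .x), data = cars)) %>%
  map_dfr(tidy, .id = "source")

anova_table
```

Adding another dependent variable means adding one string to responses, and the rest of the pipeline stays untouched.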
Data Scientist

Saayed Alam creates machine learning products and occasionally gets philosophical.