Channel: NumberTheory

Using mutate from dplyr inside a function: getting around non-standard evaluation

To edit or add columns to a data.frame, you can use mutate from the dplyr package:

library(dplyr)
mtcars %>% mutate(new_column = mpg + wt)

Here, dplyr uses non-standard evaluation to resolve mpg and wt: it knows to look them up as columns of mtcars rather than as ordinary variables. This is nice for interactive use, but not so nice when mutate is called inside a function and the column names are inputs to that function.
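To make the contrast concrete, here is a small sketch comparing the mutate call with a base R version that uses standard evaluation, where the data frame and the column names (as strings) are spelled out explicitly. The object names nse_result and se_result are only illustrative.

```r
library(dplyr)

# Non-standard evaluation: mpg and wt are looked up as columns of mtcars
nse_result <- mtcars %>% mutate(new_column = mpg + wt)

# Standard evaluation in base R: the columns are addressed by string
se_result <- mtcars
se_result$new_column <- se_result[["mpg"]] + se_result[["wt"]]

# Both approaches compute the same values
all.equal(nse_result$new_column, se_result$new_column)
```

The base R version already works with strings, which is exactly what we would like mutate to do inside a function.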

The goal is to write a function f that takes the columns in mtcars you want to add up as strings, and executes mutate. Note that we also want to be able to set the new column name. A first naive approach might be:

f = function(col1, col2, new_col_name) {
    mtcars %>% mutate(new_col_name = col1 + col2)
}
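A quick way to see what goes wrong is to call the naive version and capture the result. This is a sketch: the names f_naive, result and named are made up for illustration, and tryCatch is only used so the script keeps running.

```r
library(dplyr)

f_naive <- function(col1, col2, new_col_name) {
    mtcars %>% mutate(new_col_name = col1 + col2)
}

# col1 and col2 are not columns of mtcars, so the strings "wt" and "mpg"
# themselves end up being added, which fails
result <- tryCatch(
    f_naive("wt", "mpg", "total"),
    error = function(e) e
)
inherits(result, "error")

# The left-hand side is taken literally as well: the new column is called
# new_col_name, regardless of what a variable of that name contains
new_col_name <- "total"
named <- mtcars %>% mutate(new_col_name = mpg + wt)
"new_col_name" %in% names(named)
"total" %in% names(named)
```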

The problem is that col1 and col2 are not replaced by the column names they contain; instead, dplyr looks for columns literally called col1 and col2 in mtcars. In addition, the name of the new column will be new_col_name, and not the content of new_col_name. To get around non-standard evaluation, you can use the lazyeval package. The following function does what we expect:

library(lazyeval)
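f = function(col1, col2, new_col_name) {
    # Build the expression <col1> + <col2> from the column names given as strings
    mutate_call = lazyeval::interp(~ a + b, a = as.name(col1), b = as.name(col2))
    # mutate_ uses standard evaluation; the name in the list becomes the column name
    mtcars %>% mutate_(.dots = setNames(list(mutate_call), new_col_name))
}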
head(f('wt', 'mpg', 'hahaaa'))
   mpg cyl disp  hp drat    wt  qsec vs am gear carb hahaaa
1 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4 23.620
2 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4 23.875
3 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1 25.120
4 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1 24.615
5 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2 22.140
6 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1 21.560
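To see what f builds internally, the two intermediate objects can be inspected on their own. This sketch simply repeats the body of f with the arguments from the call above assigned to plain variables; the name dots is introduced here for illustration only.

```r
library(lazyeval)

col1 <- "wt"
col2 <- "mpg"
new_col_name <- "hahaaa"

# interp() substitutes the symbols wt and mpg for the placeholders a and b
mutate_call <- lazyeval::interp(~ a + b, a = as.name(col1), b = as.name(col2))
print(mutate_call)

# setNames() attaches the desired column name to the expression
dots <- setNames(list(mutate_call), new_col_name)
names(dots)
```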

The important parts here are, given the call to f above:

  • lazyeval::interp(~ a + b, a = as.name(col1), b = as.name(col2)) creates the expression wt + mpg by substituting the column names for the placeholders a and b.
  • mutate_(.dots = ...) calls mutate_, the version of mutate that uses standard evaluation (SE), so the expression can be passed in as an object.
  • setNames(list(mutate_call), new_col_name) sets the output name to the content of new_col_name, i.e. hahaaa.
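
The underscore verbs such as mutate_ have since been deprecated in dplyr. As an alternative sketch, assuming dplyr 0.7.0 or later, the same function can be written with tidy evaluation: the .data pronoun looks up columns by string, and !! together with := sets the column name from a variable. The name f_tidy is made up here to keep it apart from f above.

```r
library(dplyr)

f_tidy <- function(col1, col2, new_col_name) {
    mtcars %>%
        # .data[[...]] fetches a column by its name given as a string;
        # !!new_col_name := ... uses the string's content as the column name
        mutate(!!new_col_name := .data[[col1]] + .data[[col2]])
}

res <- f_tidy("wt", "mpg", "hahaaa")
head(res)
```

This version needs no lazyeval and adds the same hahaaa column as f('wt', 'mpg', 'hahaaa').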
