R - Conditional mutate with data from two data frames

- August 25, 2014

I have 2 data frames. Number 1 contains new values for some of the rows in data frame 2 (data frame 2 has a lot more data than number 1). I have earlier used the following code for overwriting (from data frame 1 to data frame 2) specific column values based on the number in a column:

for(i in 1:nrow(dataset1)){
  sak.i <- dataset1$column1[i]
  rad.i <- which(dataset2$column1 == sak.i)
  dataset2$column2[rad.i] <- dataset1$column2[i]
  dataset2$column3[rad.i] <- dataset1$column3[i]
  ...
}

This works fine. However, I now want not to overwrite but to create a new column with the information. I want to insert the new value in the new column if rad.i matches, and otherwise use the value already present in the second data frame. I came up with this:

for(i in 1:nrow(dataset1)){
  sak.i <- dataset1$column1[i]
  rad.i <- which(dataset2$column1 == sak.i)
  mutate(new_column_name = ifelse(
    dataset2$column2[rad.i], dataset1$column2[i], dataset2$column2)
  )
  mutate(new_column_name2 = ifelse(
    dataset2$column3[rad.i], dataset1$column3[i], dataset2$column3)
  )
  ...
}

When I run this, I get the following error:

Error in mutate_(.data, .dots = compat_as_lazy_dots(...)) :
  argument ".data" is missing, with no default

I have read a bit about the error, but I cannot seem to isolate the problem.

Note: I want this to work for around 10 columns. Is there an easier way to do this? Do I have to write a mutate command for every column?

Example:

col11 <- as.character(4:7)
col21 <- c(0.03, 0.06, 1, 2)
col12 <- as.character(1:7)
col22 <- c(67,23,0.03,1,2,10,16)

dataframe1 <- cbind(col11, col21)
dataframe2 <- cbind(col12, col22)

Data frame 1:

col1  col2
4     0.03
5     0.06
6     1
7     2

Data frame 2:

col1  col2
1     67
2     23
3     0.03
4     1
5     2
6     10
7     16

Expected output:

col1  col2  col3
1     67    67
2     23    23
3     0.03  0.03
4     1     0.03
5     2     0.06
6     10    1
7     16    2

You can do it in 2 steps. First merge on col1, then replace the NAs, i.e.

final_d <- merge(d1, d2, by = 'col1', all = TRUE)
final_d$col2.x[is.na(final_d$col2.x)] <- final_d$col2.y[is.na(final_d$col2.x)]

which gives,

  col1 col2.x col2.y
1    1  67.00  67.00
2    2  23.00  23.00
3    3   0.03   0.03
4    4   0.03   1.00
5    5   0.06   2.00
6    6   1.00  10.00
7    7   2.00  16.00
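The question asks for the original col2 to be kept and a new col3 to be added, ideally for around 10 columns without repeating code. A minimal base R sketch of that idea follows. It is not the answerer's method: it swaps the merge for `match()`, it assumes the key values in `d1$col1` are unique, and the `cols` vector and the `_new` suffix are hypothetical names chosen for illustration.

```r
# Example data, same values as in the post
d1 <- data.frame(col1 = 4:7, col2 = c(0.03, 0.06, 1, 2))
d2 <- data.frame(col1 = 1:7, col2 = c(67, 23, 0.03, 1, 2, 10, 16))

# For each row of d2, the row of d1 with the same key (NA when d1 has none).
# Assumption: keys in d1$col1 are unique.
idx <- match(d2$col1, d1$col1)

# Hypothetical list of columns to process; with real data, list all ~10 here
cols <- c("col2")

for (col in cols) {
  new_name <- paste0(col, "_new")  # hypothetical naming scheme
  # Take the value from d1 where a match exists, otherwise keep d2's value
  d2[[new_name]] <- ifelse(is.na(idx), d2[[col]], d1[[col]][idx])
}

print(d2)
```

Because the loop works on column names, adding more columns only means extending `cols`, provided both data frames use the same names for them.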

Since you mention mutate, the dplyr version of the above would be,

d1 %>%
  full_join(d2, by = 'col1') %>%
  mutate(col2.x = replace(col2.x, is.na(col2.x), col2.y[is.na(col2.x)])) %>%
  arrange(col1)
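The error in the question arises because `mutate()` was called without a data frame as its first argument, so `.data` is missing. As a hedged variation on the dplyr approach, the sketch below keeps the original col2 and adds col3, matching the expected output in the question. It assumes dplyr is installed and that keys in `d1$col1` are unique; the intermediate name `col2_new` is hypothetical.

```r
library(dplyr)

# Example data, same values as in the post
d1 <- data.frame(col1 = 4:7, col2 = c(0.03, 0.06, 1, 2))
d2 <- data.frame(col1 = 1:7, col2 = c(67, 23, 0.03, 1, 2, 10, 16))

result <- d2 %>%
  # Rename d1's value column first so the join does not create .x/.y suffixes
  left_join(rename(d1, col2_new = col2), by = "col1") %>%
  # coalesce() takes the first non-NA value: the new one if present, else the old
  mutate(col3 = coalesce(col2_new, col2)) %>%
  # Drop the helper column
  select(-col2_new)

print(result)
```

Here the data frame is supplied to `mutate()` through the pipe, which is what the loop in the question lacked. `left_join()` keeps every row of d2 in its original order, so no `arrange()` is needed.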

Data

dput(d1)
structure(list(col1 = 4:7, col2 = c(0.03, 0.06, 1, 2)), .Names = c("col1",
"col2"), class = "data.frame", row.names = c(NA, -4L))

dput(d2)
structure(list(col1 = 1:7, col2 = c(67, 23, 0.03, 1, 2, 10, 16
)), .Names = c("col1", "col2"), class = "data.frame", row.names = c(NA,
-7L))
