R - Conditional mutate with data from two data frames
I have 2 data frames. Number 1 contains new values for some rows of data frame 2 (data frame 2 has a lot more data than number 1). I have earlier used the following code for overwriting (from data frame 1 to data frame 2) specific column values based on a number in a column:
    for(i in 1:nrow(dataset1)){
      sak.i <- dataset1$column1[i]
      rad.i <- which(dataset2$column1 == sak.i)
      dataset2$column2[rad.i] <- dataset1$column2[i]
      dataset2$column3[rad.i] <- dataset1$column3[i]
      ...
    }
This works fine. However, I now want not to overwrite but to create a new column with the information. I want to insert the new values into the new column when rad.i finds a match, and otherwise use the values already present in the second data frame. I came up with this:
    for(i in 1:nrow(dataset1)){
      sak.i <- dataset1$column1[i]
      rad.i <- which(dataset2$column1 == sak.i)
      mutate(new_column_name = ifelse(
        dataset2$column2[rad.i], dataset1$column2[i], dataset2$column2)
      )
      mutate(new_column_name2 = ifelse(
        dataset2$column3[rad.i], dataset1$column3[i], dataset2$column3)
      )
      ...
    }
When I run this I get the following error:

    Error in mutate_(.data, .dots = compat_as_lazy_dots(...)) :
      argument ".data" is missing, with no default

I have read a bit about the error, but I cannot seem to isolate the problem.
Note: I want this to work for around 10 columns. Is there an easier way to do this? Do I have to write a mutate command for every column?

Example:
    col11 <- as.character(4:7)
    col21 <- c(0.03, 0.06, 1, 2)
    col12 <- as.character(1:7)
    col22 <- c(67,23,0.03,1,2,10,16)
    dataframe1 <- cbind(col11, col21)
    dataframe2 <- cbind(col12, col22)

Data frame 1:

    col1 col2
    4    0.03
    5    0.06
    6    1
    7    2

Data frame 2:

    col1 col2
    1    67
    2    23
    3    0.03
    4    1
    5    2
    6    10
    7    16

Expected output:

    col1 col2 col3
    1    67   67
    2    23   23
    3    0.03 0.03
    4    1    0.03
    5    2    0.06
    6    10   1
    7    16   2
You can do it in 2 steps. First merge on col1, then replace the NA values, i.e.
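    # Full outer join on col1: col2.x comes from d1, col2.y from d2
    final_d <- merge(d1, d2, by = 'col1', all = TRUE)
    # Rows without a new value in d1 fall back to the value from d2
    final_d$col2.x[is.na(final_d$col2.x)] <- final_d$col2.y[is.na(final_d$col2.x)]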
which gives,
      col1 col2.x col2.y
    1    1  67.00  67.00
    2    2  23.00  23.00
    3    3   0.03   0.03
    4    4   0.03   1.00
    5    5   0.06   2.00
    6    6   1.00  10.00
    7    7   2.00  16.00
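The question also asks about handling around 10 columns without repeating the same line each time. One possible base R sketch uses `match()` to look up each key once and then loops over the column names. The vector `value_cols` and the `_new` suffix are illustrative choices, not something from the question, and the sketch assumes each `col1` value appears at most once in `d1`.

```r
# Example data, same values as in the Data section of this answer
d1 <- data.frame(col1 = 4:7, col2 = c(0.03, 0.06, 1, 2))
d2 <- data.frame(col1 = 1:7, col2 = c(67, 23, 0.03, 1, 2, 10, 16))

# Columns to update; with ~10 columns, list them all here (hypothetical name)
value_cols <- c("col2")

# For every row of d2, the position of its key in d1 (NA when there is no match)
idx <- match(d2$col1, d1$col1)
has_new <- !is.na(idx)

result <- d2
for (col in value_cols) {
  new_col <- paste0(col, "_new")                          # hypothetical naming scheme
  result[[new_col]] <- d2[[col]]                          # start from the existing values
  result[[new_col]][has_new] <- d1[[col]][idx[has_new]]   # overwrite only the matched rows
}

print(result)
```

Because the lookup is done once with `match()`, adding another column only means adding its name to `value_cols`.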
Since you mention mutate, the dplyr version of the merge approach would be,
    d1 %>%
      full_join(d2, by = 'col1') %>%
      mutate(col2.x = replace(col2.x, is.na(col2.x), col2.y[is.na(col2.x)])) %>%
      arrange(col1)
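As for the error in the question: `mutate()` expects a data frame as its first argument (directly or through the pipe), and the calls inside the loop pass none, hence the complaint about `.data`. If the goal is to keep the old `col2` and add a new `col3`, as in the expected output, a variant along these lines may be closer to what is wanted. It is only a sketch: the `.new` suffix and the name `col3` are choices made here for illustration.

```r
library(dplyr)

# Example data, same values as in the Data section of this answer
d1 <- data.frame(col1 = 4:7, col2 = c(0.03, 0.06, 1, 2))
d2 <- data.frame(col1 = 1:7, col2 = c(67, 23, 0.03, 1, 2, 10, 16))

result <- d2 %>%
  # Keep every row of d2; the column coming from d1 is renamed to col2.new
  left_join(d1, by = "col1", suffix = c("", ".new")) %>%
  # Take the new value where one exists, otherwise keep the old one
  mutate(col3 = coalesce(col2.new, col2)) %>%
  # Drop the helper column
  select(-col2.new)

print(result)
```

With more columns, each one needs its own `coalesce()` pair in the `mutate()` call, so this stays readable for a handful of columns but gets repetitive beyond that.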
Data

    dput(d1)
    structure(list(col1 = 4:7, col2 = c(0.03, 0.06, 1, 2)), .Names = c("col1",
    "col2"), class = "data.frame", row.names = c(NA, -4L))

    dput(d2)
    structure(list(col1 = 1:7, col2 = c(67, 23, 0.03, 1, 2, 10, 16)), .Names = c("col1",
    "col2"), class = "data.frame", row.names = c(NA, -7L))