similarity - How to find best resemblance between 1 row and the rest of dataframe in R?
How can I find the best resemblance between one particular row and the rest of the rows in a dataframe?
I'll try to explain what I mean. Take a look at this dataframe:
df <- structure(list(person = 1:5,
                     var1 = c(1L, 5L, 2L, 2L, 5L),
                     var2 = c(4L, 4L, 3L, 2L, 2L),
                     var3 = c(5L, 4L, 4L, 3L, 1L)),
                .names = c("person", "var1", "var2", "var3"),
                class = "data.frame", row.names = c(NA, -5L))
How can I find the best resemblance between person 1 (row 1) and the rest of the rows (persons) in the data frame? The output should look like this: person 1 stays in row 1, and the rest of the rows follow in order of best resemblance. The similarity measure I want to use is cosine or Pearson. I tried to solve this problem with functions from the arules package, but they didn't match my needs.
Any ideas, anyone?
Another idea is to define the cosine function manually and apply it to the data frame, i.e.
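# Cosine similarity between two numeric vectors
f1 <- function(x, y) {
  crossprod(x, y) / sqrt(crossprod(x) * crossprod(y))
}
# Keep row 1 first, then order rows 2..n by decreasing similarity to row 1
df[c(1, order(sapply(2:nrow(df), function(i) f1(unlist(df[1, -1]), unlist(df[i, -1]))), decreasing = TRUE) + 1), ]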
which gives,
  person var1 var2 var3
1      1    1    4    5
3      3    2    3    4
4      4    2    2    3
2      2    5    4    4
5      5    5    2    1
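Since the question also mentions Pearson, the same ordering idea can be sketched with base R's cor(), which computes the Pearson correlation by default. This is only a minimal sketch, not part of the original answer: the names m, pearson and ranked are made up for illustration, and it assumes every row has some variation across var1..var3 (cor() returns NA for a constant row).

```r
# Example data from the question
df <- data.frame(
  person = 1:5,
  var1 = c(1L, 5L, 2L, 2L, 5L),
  var2 = c(4L, 4L, 3L, 2L, 2L),
  var3 = c(5L, 4L, 4L, 3L, 1L)
)

# Numeric columns only (drop the person id), one row per person
m <- as.matrix(df[, -1])

# Pearson correlation of row 1 with each of the remaining rows
pearson <- apply(m[-1, , drop = FALSE], 1, function(r) cor(m[1, ], r))

# Keep row 1 first, then the other rows from most to least similar
ranked <- df[c(1, order(pearson, decreasing = TRUE) + 1), ]
ranked
```

For this particular data the Pearson ordering happens to match the cosine one (persons 1, 3, 4, 2, 5), but the two measures can disagree in general, because Pearson centers each row on its own mean before comparing.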