similarity - How to find best resemblance between 1 row and the rest of dataframe in R?

- April 25, 2010

How can I find the best resemblance between one particular row and the rest of the rows in a data frame?

I'll try to explain what I mean. Take a look at this data frame:

df <- structure(list(person = 1:5,
                     var1 = c(1L, 5L, 2L, 2L, 5L),
                     var2 = c(4L, 4L, 3L, 2L, 2L),
                     var3 = c(5L, 4L, 4L, 3L, 1L)),
                .Names = c("person", "var1", "var2", "var3"),
                class = "data.frame",
                row.names = c(NA, -5L))

How can I find the best resemblance between person 1 (row 1) and the rest of the rows (persons) in the data frame? The output should look like this: person 1 stays in row 1, and the remaining rows follow in order of best resemblance. The similarity measure I want to use is cosine or Pearson. I tried to solve the problem with functions from the arules package, but they didn't match my needs.

Does anyone have any ideas?

Another idea is to define the cosine function manually and apply it to the data frame, i.e.

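# Cosine similarity between two numeric vectors
f1 <- function(x, y) {
  crossprod(x, y) / sqrt(crossprod(x) * crossprod(y))
}

# Keep row 1 first, then order rows 2..n by decreasing similarity to row 1
# (the person column is dropped with -1; +1 maps positions back to row numbers)
df[c(1, order(sapply(2:nrow(df), function(i)
                     f1(unlist(df[1, -1]), unlist(df[i, -1]))),
              decreasing = TRUE) + 1), ]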

which gives,

  person var1 var2 var3
1      1    1    4    5
3      3    2    3    4
4      4    2    2    3
2      2    5    4    4
5      5    5    2    1
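
The question also mentions Pearson as an alternative measure. A minimal sketch of that variant, assuming base R's cor() is acceptable and that no row is constant (a constant row has zero variance, so its correlation would be NA). The names m, pearson and ranked are only illustrative and not part of the original answer:

```r
df <- data.frame(
  person = 1:5,
  var1 = c(1L, 5L, 2L, 2L, 5L),
  var2 = c(4L, 4L, 3L, 2L, 2L),
  var3 = c(5L, 4L, 4L, 3L, 1L)
)

# Numeric matrix of the variables only (the person id column is dropped)
m <- as.matrix(df[, -1])

# cor() correlates columns, so transpose to make each person a column,
# then take row 1 of the result without its self-correlation
pearson <- cor(t(m))[1, -1]

# Row 1 stays first; the remaining rows follow in decreasing similarity
ranked <- df[c(1, order(pearson, decreasing = TRUE) + 1), ]
print(ranked)
```

For this particular data the Pearson ranking comes out in the same order as the cosine one (persons 1, 3, 4, 2, 5), although the two measures can disagree in general because Pearson centers each row before comparing.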

