pandas - Python - Compare two columns of features, return values which are not common to both
I want to compare two columns of features ("a" and "b") and return the values that are not common to both. The columns have unequal numbers of rows, and values may occur more than once.
I tried:
a[np.logical_not(np.in1d(a,b))]
but it doesn't seem to work if len(b) > len(a).
Any suggestions?
IIUC, you are looking for the symmetric difference:
Source DFs:
In [41]: d1
Out[41]:
   a
0  a
1  b
2  c
3  x
4  d
5  l
6  z

In [42]: d2
Out[42]:
   b
0  b
1  a
2  d
3  c
4  y
NumPy solution:
In [43]: np.setdiff1d(np.union1d(d1.a, d2.b), np.intersect1d(d1.a, d2.b))
Out[43]: array(['l', 'x', 'y', 'z'], dtype=object)
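NumPy also provides np.setxor1d, which computes the set exclusive-or (the symmetric difference) in a single call. A minimal sketch, assuming the column values are as in the example frames; the array names a, b and only_in_one are illustrative, not from the original answer:

```python
import numpy as np

# Values assumed to match the example columns d1.a and d2.b.
a = np.array(["a", "b", "c", "x", "d", "l", "z"])
b = np.array(["b", "a", "d", "c", "y"])

# setxor1d returns the sorted, unique values that appear in exactly one input.
# Duplicates and differing lengths do not matter, since it works on unique values.
only_in_one = np.setxor1d(a, b)
print(only_in_one.tolist())  # ['l', 'x', 'y', 'z']
```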
pandas solution:
In [44]: pd.Index.symmetric_difference(pd.Index(d1.a), pd.Index(d2.b))
Out[44]: Index(['l', 'x', 'y', 'z'], dtype='object')
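As for why the attempt in the question seems to miss values: an in1d/isin mask on a only reports values of a that are absent from b, so anything that appears only in b is never returned, whatever the lengths are. A self-contained sketch that runs the check in both directions and combines the results; the frame contents are assumed to match the example above, and the names only_in_a, only_in_b and not_common are illustrative:

```python
import pandas as pd

# Example frames, assumed to match the ones shown above.
d1 = pd.DataFrame({"a": ["a", "b", "c", "x", "d", "l", "z"]})
d2 = pd.DataFrame({"b": ["b", "a", "d", "c", "y"]})

# One direction: values of d1.a that never occur in d2.b.
only_in_a = d1.loc[~d1["a"].isin(d2["b"]), "a"]
# Other direction: values of d2.b that never occur in d1.a.
only_in_b = d2.loc[~d2["b"].isin(d1["a"]), "b"]

# The union of both one-sided differences is the symmetric difference.
# Using sets also collapses values that occur more than once.
not_common = sorted(set(only_in_a) | set(only_in_b))
print(not_common)  # ['l', 'x', 'y', 'z']
```

The same result should come from the Index instance method, pd.Index(d1["a"]).symmetric_difference(pd.Index(d2["b"])).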