python - Retrieve all values in a column that are not present in another column
In pandas, I used to be able to take a dataframe column, compare it against a second dataframe column, and get the items missing from the second column like so:
notyetincluded = notyetincluded.loc[~notyetincluded["id"].isin(df_o["id"])]
However, this no longer works in the updated pandas (I get the error ValueError: Buffer dtype mismatch, expected 'Python object' but got 'long long'). How do I do that?
The part that seems to cause the breakage is this: notyetincluded["id"].isin(df_o["id"])
I don't know if this helps, but these columns store numbers like 4150, 5808, etc. They're 4 digits or less long.
For example:

notyetincluded:
0    5747
1    5746
2    5725
3    5722
4    5720
5    5707
Name: id, dtype: object

df_o:
24    5365
4     5720
15    5599
Name: id, dtype: int64
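Use df.astype(str): cast both columns to strings, then compare. The error comes from the two columns having different dtypes (object vs. int64), so converting both to the same type before calling isin avoids the mismatch.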
n = notyetincluded
notyetincluded = n[~n["id"].astype(str).isin(df_o["id"].astype(str))]
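
As a rough, self-contained sketch of the above, the snippet below rebuilds the two sample columns from the question (an object-dtype "id" column and an int64 one) and applies the string cast before isin. The variable names mask, missing, and missing_numeric are only illustrative. The second variant, converting the object column with pd.to_numeric instead of casting both sides to str, is not from the answer; it is an alternative that assumes every id really is a clean integer.

```python
import pandas as pd

# Sample data from the question: one "id" column is object dtype, the other int64.
notyetincluded = pd.DataFrame(
    {"id": [5747, 5746, 5725, 5722, 5720, 5707]}, dtype=object
)
df_o = pd.DataFrame({"id": [5365, 5720, 5599]}, index=[24, 4, 15])

# Approach from the answer: cast both sides to str so the dtypes match.
mask = notyetincluded["id"].astype(str).isin(df_o["id"].astype(str))
missing = notyetincluded.loc[~mask]

# Alternative (an assumption, not from the answer): make the object column
# numeric instead, which keeps the comparison on integers.
numeric_ids = pd.to_numeric(notyetincluded["id"])
missing_numeric = notyetincluded.loc[~numeric_ids.isin(df_o["id"])]

print(missing["id"].tolist())
```

Both variants keep the original index labels of the remaining rows; call reset_index(drop=True) on the result if a fresh 0..n index is wanted.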