python - Retrieve all values in a column that are not present in another column
In pandas, I used to be able to take a DataFrame column, compare it against a second DataFrame column, and get the items missing from the second column, like so:
notyetincluded = notyetincluded.loc[~notyetincluded["id"].isin(df_o["id"])]
However, this no longer works in the updated pandas (I get the error ValueError: Buffer dtype mismatch, expected 'Python object' but got 'long long'). How do I do that?
The part that seems to cause the breakage is this:
notyetincluded["id"].isin(df_o["id"])
I don't know if it helps, but these columns store numbers like 4150, 5808, etc. They're four digits long or fewer.
For example:
notyetincluded:
0    5747
1    5746
2    5725
3    5722
4    5720
5    5707
Name: id, dtype: object
df_o:
24    5365
4     5720
15    5599
Name: id, dtype: int64
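Use astype(str) to cast both columns to strings, then compare. The two "id" columns have different dtypes (object and int64), which is what the dtype mismatch error is complaining about.
n = notyetincluded
notyetincluded = n[~n["id"].astype(str).isin(df_o["id"].astype(str))]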
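To make the cast-then-compare approach concrete, here is a minimal, self-contained sketch built from the sample values in the question. It assumes the object-dtype "id" column holds the numbers as strings, which the question does not confirm. The names mask, missing, ids_numeric and missing_numeric are made up for illustration. The second half shows a possible alternative, converting the object column to numbers with pd.to_numeric rather than casting both sides to strings.

```python
import pandas as pd

# Sample data from the question. Assumption: the object-dtype column
# stores the ids as strings, while df_o stores them as int64.
notyetincluded = pd.DataFrame(
    {"id": ["5747", "5746", "5725", "5722", "5720", "5707"]}
)
df_o = pd.DataFrame({"id": [5365, 5720, 5599]}, index=[24, 4, 15])

# Cast both sides to str so that isin compares values of the same type.
mask = notyetincluded["id"].astype(str).isin(df_o["id"].astype(str))
missing = notyetincluded.loc[~mask]
print(missing["id"].tolist())  # ['5747', '5746', '5725', '5722', '5707']

# Possible alternative: convert the object column to numbers instead.
# errors="coerce" turns anything unparseable into NaN, which never matches.
ids_numeric = pd.to_numeric(notyetincluded["id"], errors="coerce")
missing_numeric = notyetincluded.loc[~ids_numeric.isin(df_o["id"])]
print(missing_numeric["id"].tolist())  # same rows as above
```

Either way, the row with id 5720 is dropped because it is the only value that also appears in df_o. Casting to strings leaves the original column untouched; the numeric route may be preferable if the ids are needed as integers later on.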