python - How to keep column MultiIndex values when merging pandas DataFrames -
i have 2 pandas dataframes, below:
df1 = pd.dataframe({('q1', 'subq1'):[1, 2, 3], ('q1', 'subq2'):[1, 2, 3], ('q2', 'subq1'):[1, 2, 3]}) df1['id'] = ['a', 'b', 'c'] df2 = pd.dataframe({'item_id': ['a', 'b', 'c'], 'url':['a.com', 'blah.com', 'company.com']})
df1
:
q1 q2 id subq1 subq2 subq1 0 1 1 1 1 2 2 2 b 2 3 3 3 c
df2
:
item_id url 0 a.com 1 b blah.com 2 c company.com
note df1
has columns hierarchical indexing (eg. ('q1', 'subq1')
) , normal indexing (eg. id
).
i want merge these 2 data frames on id
, item_id
fields. using:
result = pd.merge(df1, df2, left_on='id', right_on='item_id')
gives:
(q1, subq1) (q1, subq2) (q2, subq1) (id, ) item_id url 0 1 1 1 a.com 1 2 2 2 b b blah.com 2 3 3 3 c c company.com
as can see, merge works fine, multiindex has been lost , has reverted tuples. i've tried recreate multiindex using pd.multiindex.from_tuples
, in:
result.columns = pd.multiindex.from_tuples(result)
but causes problems item_id
, url
columns, taking first 2 characters of names:
q1 q2 id u subq1 subq2 subq1 t r 0 1 1 1 a.com 1 2 2 2 b b blah.com 2 3 3 3 c c company.com
converting columns in df2
one-element tuples (ie. ('item_id',)
rather 'item_id'
) makes no difference.
how can merge these 2 dataframes , keep multiindex properly? or alternatively, how can take result of merge , columns proper multiindex without mucking names of item_id
, url
columns?
if can't beat 'em, join 'em. (make both dataframes have same number of index levels before merging):
import pandas pd df1 = pd.dataframe({('q1', 'subq1'):[1, 2, 3], ('q1', 'subq2'):[1, 2, 3], ('q2', 'subq1'):[1, 2, 3]}) df1['id'] = ['a', 'b', 'c'] df2 = pd.dataframe({'item_id': ['a', 'b', 'c'], 'url':['a.com', 'blah.com', 'company.com']}) df2.columns = pd.multiindex.from_product([df2.columns, ['']]) result = pd.merge(df1, df2, left_on='id', right_on='item_id') print(result)
yields
q1 q2 id item_id url subq1 subq2 subq1 0 1 1 1 a.com 1 2 2 2 b b blah.com 2 3 3 3 c c company.com
this avoids userwarning
:
pandas/core/reshape/merge.py:551: userwarning: merging between different levels can give unintended result (2 levels on left, 1 on right)
wiki
Comments
Post a Comment