python - Pandas conditional creating columns issues -

- May 25, 2012

I have a sample data set:

import pandas as pd

df = {
    'columa': ['1a', 'ws rank', 'rank', 'ws rank', 'rank', 'drank'],
    'value': [1, 12, 34, 50, 3, 2]
}
df = pd.DataFrame(df)

1. I want to create a column 'hp' for the rows where 'columa' is 'ws rank', 'rank', or 'drank': if value is 1, hp is 25; if value is 2, hp is 24; and so on.
First I created a smaller dataset containing only those rows, because the real data set is big. Then I concatenated that dataset with the original dataset so that it includes the 'hp' column. When I concatenated the datasets there were duplicated rows. There must be an easier way.

My code:

dfrank = df[df["columa"].str.contains('ws rank|rank')]
dfrank['value'] = dfrank['value'].astype(int)
dfrank.loc[dfrank.value == 1, 'hp'] = 25
dfrank.loc[dfrank.value == 2, 'hp'] = 24
dfrank.loc[dfrank.value == 3, 'hp'] = 23
dfrank.loc[dfrank.value == 4, 'hp'] = 22
dfrank.loc[dfrank.value == 5, 'hp'] = 21
dfrank.loc[dfrank.value == 6, 'hp'] = 20
dfrank.loc[dfrank.value == 7, 'hp'] = 19
dfrank.loc[dfrank.value == 8, 'hp'] = 18
dfrank.loc[dfrank.value == 9, 'hp'] = 17
dfrank.loc[dfrank.value == 10, 'hp'] = 16
dfrank.loc[dfrank.value == 11, 'hp'] = 15
dfrank.loc[dfrank.value == 12, 'hp'] = 14
dfrank.loc[dfrank.value == 13, 'hp'] = 13
dfrank.loc[dfrank.value == 14, 'hp'] = 12
dfrank.loc[dfrank.value == 15, 'hp'] = 11
dfrank.loc[dfrank.value == 16, 'hp'] = 10
dfrank.loc[dfrank.value == 17, 'hp'] = 9
dfrank.loc[dfrank.value == 18, 'hp'] = 8
dfrank.loc[dfrank.value == 19, 'hp'] = 7
dfrank.loc[dfrank.value == 20, 'hp'] = 6
dfrank.loc[(dfrank.value > 20) & (dfrank.value <= 50), 'hp'] = 5

df2 = pd.concat([df, dfrank])

Is there an easier way to write these conditions? I also keep getting this warning message, even though I think I'm using the form it suggests:

SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  dfrank['value'] = dfrank['value'].astype(int)
h:/code/pythonscripts/python_work/dataset1.py:20: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  dfrank.loc[dfrank.value == 1, 'hp'] = 25
c:\users\amywang\appdata\local\continuum\anaconda3\lib\site-packages\pandas\core\indexing.py:477: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self.obj[item] = s

2. I want to create an 'hppoint' column that groups by the 'columa' values and sums the 'hp' values, but this didn't work and returned null:

df2['hppoint']=df2.groupby('columa')['hp'].sum()

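In pandas, selecting data from a DataFrame and storing it in a new variable can return a reference to the initial DataFrame rather than an independent object, which is why pandas warns when you assign to it. You should copy the selection and then use .loc on the new DataFrame, i.e.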

dfrank=df[df["columa"].str.contains('ws rank|rank')].copy()

This creates a new, independent DataFrame, so the .loc assignments operate on it rather than on a slice of the original.
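As a minimal sketch of this point, using the sample data from the question (only the single `.loc` assignment and the final check are added for illustration): after an explicit copy, assigning through `.loc` changes `dfrank` alone and should not trigger the warning.

```python
import pandas as pd

df = pd.DataFrame({
    'columa': ['1a', 'ws rank', 'rank', 'ws rank', 'rank', 'drank'],
    'value': [1, 12, 34, 50, 3, 2],
})

# .copy() makes dfrank an independent DataFrame, not a slice of df
dfrank = df[df['columa'].str.contains('ws rank|rank')].copy()

# Assigning through .loc now modifies only dfrank
dfrank.loc[dfrank['value'] == 2, 'hp'] = 24

print(dfrank)
print('hp' in df.columns)  # the original df has no 'hp' column
```

Note that the pattern also matches 'drank', because that string contains 'rank'.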

Since you want to map the data, you can get rid of a lot of lines by creating a dictionary and a mask and assigning with .loc. You can then fill the NaN values using fillna, i.e.

dicct = {1:25, 2:24, 3:23, 4:22, 5:21, 6:20, 7:19, 8:18, 9:17, 10:16,
         11:15, 12:14, 13:13, 14:12, 15:11, 16:10, 17:9, 18:8, 19:7, 20:6}
df['hp'] = 0
mask = df["columa"].str.contains('ws rank|rank')
df.loc[mask, 'hp'] = df.loc[mask, 'value'].map(dicct).fillna(5)

Output:

    columa  value    hp
0       1a    1.0   0.0
1  ws rank   14.0  12.0
2     rank    5.0  21.0
3  ws rank    5.0  21.0
4     rank   23.0   5.0
5    drank   24.0   5.0
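The mapping idea can also be tried end to end on the sample data from the question. The following is only a sketch under a few assumptions: the name `points` is made up here, the dictionary is built with a comprehension because the values 1 to 20 follow hp = 26 - value, and `Series.where` is swapped in for the `.loc` assignment so the column can be cast to integers in one step. As in the answer, `fillna(5)` gives 5 to every value outside 1 to 20, which is slightly broader than the question's "greater than 20 and at most 50" rule.

```python
import pandas as pd

df = pd.DataFrame({
    'columa': ['1a', 'ws rank', 'rank', 'ws rank', 'rank', 'drank'],
    'value': [1, 12, 34, 50, 3, 2],
})

# Values 1..20 map to 25..6, i.e. hp = 26 - value
points = {v: 26 - v for v in range(1, 21)}

# True for 'ws rank', 'rank' and also 'drank' (it contains 'rank')
mask = df['columa'].str.contains('ws rank|rank')

# Look up each value; anything not in the dictionary becomes NaN, then 5
hp = df['value'].map(points).fillna(5)

# Keep the looked-up hp on masked rows only, 0 everywhere else
df['hp'] = hp.where(mask, 0).astype(int)

print(df)
```

Because the new column is written straight onto `df`, no smaller dataset and no `pd.concat` are needed, so no duplicated rows appear.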

If you want to fill a new column with the groupby sum, you can use transform, i.e.

df['hppoint'] = df.groupby('columa')['hp'].transform('sum')

Output:

    columa  value    hp  hppoint
0       1a    1.0   0.0      0.0
1  ws rank   14.0  12.0     33.0
2     rank    5.0  21.0     26.0
3  ws rank    5.0  21.0     33.0
4     rank   23.0   5.0     26.0
5    drank   24.0   5.0      5.0
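A short sketch of why the direct assignment in the question came back empty, using the hp values from the output above (the names `per_group` and `misaligned` are made up for illustration): the plain groupby sum is indexed by the group labels, while the DataFrame is indexed by row numbers, so nothing lines up when it is assigned as a column. `transform` returns a result with the same index as the original rows.

```python
import pandas as pd

df = pd.DataFrame({
    'columa': ['ws rank', 'rank', 'ws rank', 'rank', 'drank'],
    'hp': [12, 21, 21, 5, 5],
})

# One row per group, indexed by the 'columa' labels
per_group = df.groupby('columa')['hp'].sum()
print(per_group)

# Column assignment aligns on the index; df's index is 0..4, so no
# label matches. reindex shows what that alignment would produce.
misaligned = per_group.reindex(df.index)
print(misaligned.isna().all())

# transform broadcasts each group's sum back onto that group's rows
df['hppoint'] = df.groupby('columa')['hp'].transform('sum')
print(df)
```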

Hope this helps.
