python - Mean of a grouped-by pandas dataframe -

- August 25, 2012

I need to calculate the mean per day of the columns duration and km, separately for rows where value == 1 and rows where value == 0.

df
Out[20]:
                        date  duration   km  value
0 2015-03-28 09:07:00.800001         0    0      0
1 2015-03-28 09:36:01.819998         1    2      1
2 2015-03-30 09:36:06.839997         1    3      1
3 2015-03-30 09:37:27.659997       NaN    5      0
4 2015-04-22 09:51:40.440003         3    7      0
5 2015-04-23 10:15:25.080002         0  NaN      1

How can I modify this solution in order to get the means duration_value0, duration_value1, km_value0 and km_value1?

df = df.set_index('date').groupby(pd.Grouper(freq='D')).mean().dropna(how='all')
print (df)
            duration   km
date
2015-03-28       0.5  1.0
2015-03-30       1.5  4.0
2015-04-22       3.0  7.0
2015-04-23       0.0  0.0

I think you are looking for a pivot table, i.e.

df.pivot_table(values=['duration','km'],columns=['value'],index=df['date'].dt.date,aggfunc='mean')

Output:

           duration        km
value             0    1    0    1
date
2015-03-28      0.0  1.0  0.0  2.0
2015-03-30      NaN  1.0  5.0  3.0
2015-04-22      3.0  NaN  7.0  NaN
2015-04-23      NaN  0.0  NaN  NaN
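For anyone who wants to reproduce the pivot table, here is a minimal, self-contained sketch. It rebuilds the sample frame from the question by hand and assumes the date column has already been parsed to datetime, since the `.dt` accessor needs that. The name `ndf` is only a label for the result.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; 'date' must be a datetime column.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2015-03-28 09:07:00.800001",
        "2015-03-28 09:36:01.819998",
        "2015-03-30 09:36:06.839997",
        "2015-03-30 09:37:27.659997",
        "2015-04-22 09:51:40.440003",
        "2015-04-23 10:15:25.080002",
    ]),
    "duration": [0, 1, 1, np.nan, 3, 0],
    "km": [0, 2, 3, 5, 7, np.nan],
    "value": [0, 1, 1, 0, 0, 1],
})

# One row per calendar day, one column per (measure, value) pair.
ndf = df.pivot_table(
    values=["duration", "km"],
    columns=["value"],
    index=df["date"].dt.date,
    aggfunc="mean",
)
print(ndf)
```

Missing values are skipped when the mean is taken, so a day/value combination with no usable rows shows up as NaN rather than 0.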

If you want flat column names such as duration0, duration1, ... you can use a list comprehension, i.e. if you store the pivot table in ndf:

ndf.columns = [i[0] + str(i[1]) for i in ndf.columns]

Output:

            duration0  duration1  km0  km1
date
2015-03-28        0.0        1.0  0.0  2.0
2015-03-30        NaN        1.0  5.0  3.0
2015-04-22        3.0        NaN  7.0  NaN
2015-04-23        NaN        0.0  NaN  NaN
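The question asked for names of the form duration_value0 rather than duration0. The same list-comprehension idea should cover that. A small sketch follows, using a hand-built stand-in for the pivot table (same column labels and values, with the dates written as plain strings for brevity):

```python
import numpy as np
import pandas as pd

# Stand-in for the pivot table: two-level columns (measure, value).
columns = pd.MultiIndex.from_product(
    [["duration", "km"], [0, 1]], names=[None, "value"]
)
ndf = pd.DataFrame(
    [
        [0.0, 1.0, 0.0, 2.0],
        [np.nan, 1.0, 5.0, 3.0],
        [3.0, np.nan, 7.0, np.nan],
        [np.nan, 0.0, np.nan, np.nan],
    ],
    index=pd.Index(
        ["2015-03-28", "2015-03-30", "2015-04-22", "2015-04-23"], name="date"
    ),
    columns=columns,
)

# Each column label is a tuple such as ("duration", 0);
# join the two parts into a single string like "duration_value0".
ndf.columns = [f"{name}_value{flag}" for name, flag in ndf.columns]
print(ndf.columns.tolist())
```

Unpacking the tuple into two names reads a little more clearly than indexing with i[0] and i[1], but both do the same thing.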
