python - Pandas Row Date Conditional Filter Prior to Groupby - MAXIFS/MINIFS -




i trying maxifs style calculations in pandas

i trying add column containing next (if exists) & last (if exists) flagged date each unique id

sample table: (trying next flag & last flag columns)

edit: form more generic case, if wanted perform function e.g ditinctcount on period <= row

the idea able apply custom functions applied filtered subset each id = row id , date <= row date (i have created pandas compatible row functions way slow)

table:

id      date        flag    next flag   last flag   flag2   uniqueflags  1       21-aug      0       nan         18-aug              1 1       20-aug      0       nan         18-aug              1 1       19-aug      0       nan         18-aug              1 1       18-aug      1       nan         18-aug            1 1       17-aug      0       18-aug      15-aug              1 1       16-aug      0       18-aug      15-aug              1 1       15-aug      1       18-aug      15-aug            1 1       14-aug      0       15-aug      nan                 0 1       13-aug      0       15-aug      nan                 0 2       21-aug      0       nan         19-aug              2 2       20-aug      0       nan         19-aug              2 2       19-aug      1       nan         19-aug            2 2       18-aug      0       19-aug      15-aug              1 2       17-aug      0       19-aug      15-aug              1 2       16-aug      0       19-aug      15-aug              1 2       15-aug      1       19-aug      15-aug      b       1 2       14-aug      0       15-aug      nan                 0 2       13-aug      0       15-aug      nan                 0 3       21-aug      0       nan         17-aug              1 3       20-aug      0       nan         17-aug              1 3       19-aug      0       nan         17-aug              1 3       18-aug      0       nan         17-aug              1 3       17-aug      1       nan         17-aug            1 3       16-aug      0       17-aug      nan                 0 3       15-aug      0       17-aug      nan                 0 3       14-aug      0       17-aug      nan                 0 3       13-aug      0       17-aug      nan                 0 

ive tried groupby cant dates <= row date whilst being each id

thanks

try ? break down steps , assuming sort_values column id , date

df['next flag']=np.nan df['last flag']=np.nan df.loc[(df.flag==1).shift().fillna(false),'next flag']=df.date.shift() df.loc[(df.flag==1).fillna(false),'last flag']=df.date df[['next flag','last flag']]=df.groupby('id').agg({'next flag':'ffill','last flag':'bfill'})       id    date  flag next flag last flag 0    1  21-aug     0       nan    18-aug 1    1  20-aug     0       nan    18-aug 2    1  19-aug     0       nan    18-aug 3    1  18-aug     1       nan    18-aug 4    1  17-aug     0    18-aug    15-aug 5    1  16-aug     0    18-aug    15-aug 6    1  15-aug     1    18-aug    15-aug 7    1  14-aug     0    15-aug       nan 8    1  13-aug     0    15-aug       nan 9    2  21-aug     0       nan    19-aug 10   2  20-aug     0       nan    19-aug 11   2  19-aug     1       nan    19-aug 12   2  18-aug     0    19-aug    15-aug 13   2  17-aug     0    19-aug    15-aug 14   2  16-aug     0    19-aug    15-aug 15   2  15-aug     1    19-aug    15-aug 16   2  14-aug     0    15-aug       nan 17   2  13-aug     0    15-aug       nan 18   3  21-aug     0       nan    17-aug 19   3  20-aug     0       nan    17-aug 20   3  19-aug     0       nan    17-aug 21   3  18-aug     0       nan    17-aug 22   3  17-aug     1       nan    17-aug 23   3  16-aug     0    17-aug       nan 24   3  15-aug     0    17-aug       nan 25   3  14-aug     0    17-aug       nan 26   3  13-aug     0    17-aug       nan 

edit: base on updated question

df=df.drop(['next flag','last flag','uniqueflags'],axis=1) df['next flag']=np.nan df['last flag']=np.nan df.loc[(df.flag==1).shift().fillna(false),'next flag']=df.date.shift() df.loc[(df.flag==1).fillna(false),'last flag']=df.date df.flag2=df.flag2.replace({'':np.nan}) df[['next flag','last flag','flag3']]=df.groupby('id').agg({'next flag':'ffill','last flag':'bfill','flag2':'bfill'}) df=df.sort_values(['id','date'],ascending=[true,true]) df['lol']=df1.flag3!=df1.flag3.shift() df.loc[df.flag3.isnull(),'lol']=false df['lol']=df['lol'].astype(int) df['uniqueflags']=df.groupby('id').lol.cumsum() df=df.sort_values(['id','date'],ascending=[true,false]).drop(['flag3','lol'],axis=1)   df out[1665]:      id    date  flag flag2 next flag last flag  uniqueflags 0    1  21-aug     0   nan       nan    18-aug            1 1    1  20-aug     0   nan       nan    18-aug            1 2    1  19-aug     0   nan       nan    18-aug            1 3    1  18-aug     1           nan    18-aug            1 4    1  17-aug     0   nan    18-aug    15-aug            1 5    1  16-aug     0   nan    18-aug    15-aug            1 6    1  15-aug     1        18-aug    15-aug            1 7    1  14-aug     0   nan    15-aug       nan            0 8    1  13-aug     0   nan    15-aug       nan            0 9    2  21-aug     0   nan       nan    19-aug            2 10   2  20-aug     0   nan       nan    19-aug            2 11   2  19-aug     1           nan    19-aug            2 12   2  18-aug     0   nan    19-aug    15-aug            1 13   2  17-aug     0   nan    19-aug    15-aug            1 14   2  16-aug     0   nan    19-aug    15-aug            1 15   2  15-aug     1     b    19-aug    15-aug            1 16   2  14-aug     0   nan    15-aug       nan            0 17   2  13-aug     0   nan    15-aug       nan            0 18   3  21-aug     0   nan       nan    17-aug            1 19   3  20-aug     0   nan       nan    17-aug            1 20   3  19-aug     0   nan       nan    17-aug            1 21   3  18-aug     0   nan       nan    17-aug            1 22   3  17-aug     1           nan    17-aug            1 23   3  16-aug     0   nan    17-aug       nan            0 24   3  15-aug     0   nan    17-aug       nan            0 25   3  14-aug     0   nan    17-aug       nan            0 26   3  13-aug     0   nan    17-aug       nan            0 




wiki

Comments

Popular posts from this blog

python - Read npy file directly from S3 StreamingBody -

kotlin - Out-projected type in generic interface prohibits the use of metod with generic parameter -

Asterisk AGI Python Script to Dialplan does not work -