Issue
This Content is from Stack Overflow. Question asked by Geosphere
I know this question has come up several times on Stack Overflow, but I still do not find the task trivial. See this and many other answers: Adding total row to pandas DataFrame groupby
Sample of my data (there are actually 25 columns for this but they are similar, only numerical):
owner player val1 val1 val3
A x 5.60 3.18 0.76
A y 12.08 15.95 -0.24
A z 0.03 0.05 -0.41
B x 0.02 0.01 2.06
B z 2.36 2.37 0.00
C x 0.16 0.15 0.05
C y 0.72 0.75 -0.04
D x 0.33 0.56 -0.41
My intended output is as follows, where for each owner the total is calculated
and placed as the first row in the subgroup.
owner player val1 val1 val3
A total 17.71 19.18 0.11
A x 5.60 3.18 0.76
A y 12.08 15.95 -0.24
A z 0.03 0.05 -0.41
B total 2.38 2.38 2.05
B x 0.02 0.01 2.06
B z 2.36 2.37 0.00
C total 0.88 0.90 0.01
C x 0.16 0.15 0.05
C y 0.72 0.75 -0.04
D total 0.33 0.56 -0.41
D x 0.33 0.56 -0.41
I tried an approach I also found on Stack Overflow that looked like what I was after, but I couldn't get it quite right.
def lambda_t(x):
    df = x.sort_values(['owner']).drop(['owner'], axis=1)
    df.loc['total'] = df.sum()
    return df

df.groupby(['owner']).apply(lambda_t)
In theory this looked promising, but the total is not placed where I want it (it ends up last in each group, not first). On top of that, the values in the player column are concatenated into a single string in the total row, so that column becomes really packed. The result also has a MultiIndex.
owner        player   val1   val1   val3
A     0      x        5.60   3.18   0.76
      1      y       12.08  15.95  -0.24
      2      z        0.03   0.05  -0.41
      total  xzy     17.71  19.18   0.11
.....
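The packed player value in the total row comes from summing a string column. Below is a minimal sketch of that behaviour, assuming the player column has object dtype. The frame `group` is a hypothetical stand-in for one owner's rows and is not part of the original code.

```python
import pandas as pd

# Hypothetical stand-in for the rows of a single owner.
group = pd.DataFrame({
    "player": pd.Series(["x", "y", "z"], dtype=object),
    "val1": [5.60, 12.08, 0.03],
})

# Summing an object (string) column joins the strings together.
joined = group["player"].sum()
print(joined)

# Restricting the sum to numeric columns leaves 'player' out of the totals.
numeric_total = group.sum(numeric_only=True)
print(numeric_total)
```

With `numeric_only=True` the total contains only the value columns, so the player label for the total row can be set explicitly afterwards.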
Apparently, dropping a level of the MultiIndex can help, but then I lose the 'total' label: it disappears.
df.groupby(['owner']).apply(lambda_t).droplevel(level=1)
owner player val1 val1 val3
A x 5.60 3.18 0.76
A y 12.08 15.95 -0.24
A z 0.03 0.05 -0.41
A xzy 17.71 19.18 0.11
Any ideas whether this is possible? From what I've seen, combining groupby, assign and loc does not let you order the rows properly.
Solution
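You can use the following (shown here on a DataFrame with columns Level, Company, Item and Value, not on the owner/player data above):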
m=df.groupby(['Level','Company','Item'])['Value'].sum().unstack()
m.assign(total=m.sum(1)).stack().to_frame('Value')
                    Value
Level Company Item       
1     X       a     100.0
              b     200.0
              total 300.0
      Y       a      35.0
              b     150.0
              c      35.0
              total 220.0
2     X       a      48.0
              b     100.0
              c      50.0
              total 198.0
      Y       a      80.0
              total  80.0
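The answer above appends the total as the last item of each group. For the owner/player data in the question, where the total should come first, one possible sketch is shown below. It is an alternative technique (groupby sum, concat, then a stable sort), not the answerer's method. The second value column is named `val2` here as an assumption, because the sample header repeats `val1`.

```python
import pandas as pd

# Sample data from the question; 'val2' is an assumed name for the second value column.
df = pd.DataFrame({
    "owner":  ["A", "A", "A", "B", "B", "C", "C", "D"],
    "player": ["x", "y", "z", "x", "z", "x", "y", "x"],
    "val1":   [5.60, 12.08, 0.03, 0.02, 2.36, 0.16, 0.72, 0.33],
    "val2":   [3.18, 15.95, 0.05, 0.01, 2.37, 0.15, 0.75, 0.56],
    "val3":   [0.76, -0.24, -0.41, 2.06, 0.00, 0.05, -0.04, -0.41],
})

# Per-owner totals over numeric columns only, so player strings are never concatenated.
totals = (
    df.groupby("owner", as_index=False)
      .sum(numeric_only=True)
      .assign(player="total")
)

# Place the totals before the detail rows, then sort by owner with a stable
# algorithm so each total row stays ahead of its own group's rows.
result = (
    pd.concat([totals, df], ignore_index=True)
      .sort_values("owner", kind="mergesort")
      .reset_index(drop=True)
      [df.columns]
)
print(result)
```

The stable sort is what keeps the ordering: rows with the same owner keep their relative order from the concatenated frame, where the totals come first. With 25 numeric columns the same code should apply unchanged, since the sum is taken over every numeric column.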
This question was asked on Stack Overflow by user12392864 and answered by anky. It is licensed under the terms of CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.