Issue
This Content is from Stack Overflow. Question asked by Patrik
I have a dataset in this form:
col_1 col_2 col_3 col_4 col_5
0 0 0 0 Nan
0 1 Nan 1 1
1 0 1 0 Nan
0 0 0 0 0
Now I want to create a new column such that, for any row, if any of the column values is 1, the new column is 1. For example, in the dataset above, the new column would be 1 for the second and third rows.
So, I tried this approach:
if ((df['col_1']==1) | (df['col_2']==1) | (df['col_3']==1) | (df['col_4']==1) | (df['col_5']==1)):
    df['new_column']=1
else:
    df['new_column']=0
This code gave me an error.
So, I tried a different approach.
lists = ['col_1','col_2','col_3','col_4','col_5']
for i in lists:
    if(df[i]==1):
        df['new_column']==1
    else:
        df['new_column']==0
This code is again giving me wrong values…
Can someone please help me solve this? I am a beginner in pandas and stuck on this problem.
Solution
Use comparison to 1 (with eq), aggregate with any per row (axis=1), then perform boolean indexing:
df.loc[df.eq(1).any(axis=1), 'new_col'] = 1
output:
   col_1  col_2  col_3  col_4  col_5  new_col
0      0      0      0      0    Nan      NaN
1      0      1    Nan      1      1      1.0
2      1      0      1      0    Nan      1.0
3      0      0      0      0      0      NaN
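To make the one-liner above easier to try, here is a minimal self-contained sketch. It assumes the "Nan" entries in the question are real missing values (np.nan); the intermediate name has_one is only illustrative and does not come from the original answer.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the question; np.nan stands in for the "Nan" cells (an assumption).
df = pd.DataFrame({
    "col_1": [0, 0, 1, 0],
    "col_2": [0, 1, 0, 0],
    "col_3": [0, np.nan, 1, 0],
    "col_4": [0, 1, 0, 0],
    "col_5": [np.nan, 1, np.nan, 0],
})

# Boolean mask: True for each row that holds at least one value equal to 1.
# Comparing NaN with 1 yields False, so missing values never count as a match.
has_one = df.eq(1).any(axis=1)

# Assign 1 only to the matching rows; the remaining rows are left as NaN.
df.loc[has_one, "new_col"] = 1
```

Because only the matching rows are assigned, the new column holds floats (1.0 and NaN) rather than integers.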
If you prefer a 0/1 output:
df['new'] = df.eq(1).any(axis=1).astype(int)
output:
   col_1  col_2  col_3  col_4  col_5  new
0      0      0      0      0    Nan    0
1      0      1    Nan      1      1    1
2      1      0      1      0    Nan    1
3      0      0      0      0      0    0
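If the DataFrame also contains columns that should not be checked, the same idea can probably be limited to an explicit list of columns. The sketch below assumes the data and column names from the question (with np.nan for the "Nan" cells); cols and message are illustrative names. It also shows why the original if statement fails: a whole Series has no single truth value, so pandas raises a ValueError.

```python
import numpy as np
import pandas as pd

# Sample data mirroring the question; np.nan stands in for the "Nan" cells (an assumption).
df = pd.DataFrame({
    "col_1": [0, 0, 1, 0],
    "col_2": [0, 1, 0, 0],
    "col_3": [0, np.nan, 1, 0],
    "col_4": [0, 1, 0, 0],
    "col_5": [np.nan, 1, np.nan, 0],
})

# Only these columns are checked; any other columns would be ignored.
cols = ["col_1", "col_2", "col_3", "col_4", "col_5"]

# Row-wise test for "any value equals 1", converted from True/False to 1/0.
df["new_column"] = df[cols].eq(1).any(axis=1).astype(int)

# Using a Series directly in `if` raises ValueError, which is the likely
# cause of the error reported in the question.
message = ""
try:
    if df["col_1"] == 1:
        pass
except ValueError as err:
    message = str(err)
```

The vectorized comparison evaluates every row at once, so no Python-level loop or if statement is needed.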
This question was asked on Stack Overflow by Patrik and answered by mozway. It is licensed under the terms of CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.