[SOLVED] Change column value on identical indexes


This Content is from Stack Overflow. Question asked by Dante van der Heijden

I have a dataframe with a column ‘Hotspot’ (shown as an image in the original post).

Sometimes the index is duplicated, and in that case I want to change the value in the column ‘Hotspot’. For index 0, the first prograding_feature_polygon_30 should become prograding_feature_polygon_30_1 and the second should become prograding_feature_polygon_30_2.

The same applies to index 1: the values should again become prograding_feature_polygon_30_1 and prograding_feature_polygon_30_2. And so on…

The index is not always duplicated; where it is not, the value in Hotspot should remain the same. Does anyone know an easy way to do this?




Data sample

import pandas as pd
import numpy as np

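# Index labels 0, 0, 1, 2, 2, 2, 3: labels 0 and 2 are duplicated
df = pd.DataFrame({'a': np.repeat([*'ABCD'], [2, 1, 3, 1]),
                   'b': [*range(7)]},
                  index=np.repeat(range(4), [2, 1, 3, 1]))

print(df)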


   a  b
0  A  0
0  A  1
1  B  2
2  C  3
2  C  4
2  C  5
3  D  6


For each duplicate in the index, we want to add a consecutive number to the values in column a. So, A_1, A_2 for index value 0, and C_1, C_2, C_3 for index value 2. Rows whose index value is not duplicated (1 and 3) should be unaffected.


# Append "_<n>" only on rows whose index label is duplicated; keep df.a otherwise
df.a = np.where(df.index.duplicated(keep=False),
                df.a + '_' + df.groupby(level=0).cumcount().add(1).astype(str),
                df.a)

print(df)


     a  b
0  A_1  0
0  A_2  1
1    B  2
2  C_1  3
2  C_2  4
2  C_3  5
3    D  6
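
Applied to the question's ‘Hotspot’ column, the same pattern might look like the sketch below. The frame here is a hypothetical stand-in, since the original data is only available as an image; the value prograding_feature_polygon_31 and the index labels are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's frame (the real data is not shown).
df = pd.DataFrame(
    {'Hotspot': ['prograding_feature_polygon_30',
                 'prograding_feature_polygon_30',
                 'prograding_feature_polygon_30',
                 'prograding_feature_polygon_30',
                 'prograding_feature_polygon_31']},  # assumed extra value
    index=[0, 0, 1, 1, 2],
)

# "_1", "_2", ... per index label, counted within each group of equal labels.
suffix = '_' + df.groupby(level=0).cumcount().add(1).astype(str)

# Only rows with a duplicated index label receive the suffix.
df['Hotspot'] = np.where(df.index.duplicated(keep=False),
                         df['Hotspot'] + suffix,
                         df['Hotspot'])

print(df)
```

The row at index 2 has a unique label, so its value is left as it was.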


  • Use df.index.duplicated with keep=False to get a boolean array that is True for every row whose index value occurs more than once (all occurrences, not only the repeats) and False otherwise.
  • Pass this array as the condition to np.where. Where it is True, we want df.a + consecutive number, else simply df.a.
  • Use df.groupby on the index (level=0), and apply .cumcount to enumerate the rows within each group. Use add(1) to start at 1 instead of 0. Finally, use astype(str) so that the counter can be concatenated with df.a.
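
To make the three steps above easier to follow, here is a small sketch that computes each intermediate piece separately on the sample frame. The variable names is_dup, counter and result are chosen for illustration only and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Same sample frame as in the answer.
df = pd.DataFrame({'a': np.repeat([*'ABCD'], [2, 1, 3, 1]),
                   'b': [*range(7)]},
                  index=np.repeat(range(4), [2, 1, 3, 1]))

# Step 1: True for every row whose index label occurs more than once.
is_dup = df.index.duplicated(keep=False)

# Step 2: 1-based position of each row within its index group, as strings.
counter = df.groupby(level=0).cumcount().add(1).astype(str)

# Step 3: append the counter only where the index label is duplicated.
result = np.where(is_dup, df['a'] + '_' + counter, df['a'])

print(is_dup)
print(counter.tolist())
print(result)
```

Note that counter is computed for every row, including the unique ones (where it is simply '1'); np.where is what discards it for rows with a unique index label.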

This question was asked on Stack Overflow by Dante van der Heijden and answered by ouroboros1. It is licensed under the terms of CC BY-SA 2.5 - CC BY-SA 3.0 - CC BY-SA 4.0.
