python – Rearranging a DataFrame's cells with the same column titles using Pandas

Posted by: admin, May 14, 2020

Question:

I have this DataFrame, which I read from an Excel file with pandas.read_excel():

ID  A   B   C   A   B   C   A   B   C
10  a1  b1  c1  a4  b4  c4  a7  b7  c7
20  a2  b2  c2  a5  b5  c5  a8  b8  c8
30  a3  b3  c3  a6  b6  c6  a9  b9  c9

How can I reshape it into df_1 like this?

ID   A   B   C
10   a1  b1  c1
20   a2  b2  c2
30   a3  b3  c3
10   a4  b4  c4
20   a5  b5  c5
30   a6  b6  c6
10   a7  b7  c7
20   a8  b8  c8
30   a9  b9  c9
Answers:

You can build a MultiIndex on the columns by numbering each duplicated column name with cumcount. The frame can then be reshaped with stack, followed by some cleanup with sort_index and reset_index:

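# Move ID into the index so it is left out of the reshape
df = df.set_index('ID')
# Number each repeated column name: first A -> 0, second A -> 1, ...
s = df.columns.to_series()
df.columns = [df.columns, s.groupby(s).cumcount()]

# Move the counter level into the rows, order by it, then drop it
df = df.stack().sort_index(level=1).reset_index(level=1, drop=True).reset_index()
print(df)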
   ID   A   B   C
0  10  a1  b1  c1
1  20  a2  b2  c2
2  30  a3  b3  c3
3  10  a4  b4  c4
4  20  a5  b5  c5
5  30  a6  b6  c6
6  10  a7  b7  c7
7  20  a8  b8  c8
8  30  a9  b9  c9
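
To try the approach above end to end, here is a minimal, self-contained sketch. The sample frame is built by hand instead of being read from Excel. It assumes the duplicate headers arrive renamed as A, A.1, A.2, ... (pandas readers commonly do this on load), so the numeric suffix is stripped first. If your frame already has plain repeated names, that step changes nothing. The names wide, names, counter and df_1 are illustrative only.

```python
import pandas as pd

# Hand-built stand-in for the Excel sheet. Assumption: the repeated headers
# have been renamed on read to A, A.1, A.2, ...
df = pd.DataFrame(
    [
        [10, "a1", "b1", "c1", "a4", "b4", "c4", "a7", "b7", "c7"],
        [20, "a2", "b2", "c2", "a5", "b5", "c5", "a8", "b8", "c8"],
        [30, "a3", "b3", "c3", "a6", "b6", "c6", "a9", "b9", "c9"],
    ],
    columns=["ID", "A", "B", "C", "A.1", "B.1", "C.1", "A.2", "B.2", "C.2"],
)

# Restore the original header names by dropping a trailing ".<digits>".
# Caution: this would also alter genuine names that end in ".<digits>".
df.columns = df.columns.str.replace(r"\.\d+$", "", regex=True)

wide = df.set_index("ID")

# Number each occurrence of a header: first A -> 0, second A -> 1, ...
names = pd.Series(wide.columns)
counter = names.groupby(names).cumcount()
wide.columns = pd.MultiIndex.from_arrays([names, counter])

# Stack the counter level into the rows, order by it, then discard it.
df_1 = (
    wide.stack()
    .sort_index(level=1)
    .reset_index(level=1, drop=True)
    .reset_index()
)
print(df_1)
```

Note that sort_index also orders the IDs within each block, so rows whose IDs were not already ascending in the sheet will be reordered.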

Answer:

Here’s another way, using a list comprehension and pd.concat:

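import pandas as pd

df1 = df.set_index('ID')
n = 3  # number of columns in each repeating group (here A, B, C)
# Slice the frame into n-column blocks and stack the blocks vertically
pd.concat([df1.iloc[:, i:i+n] for i in range(0, df1.shape[1], n)]).reset_index()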

Output:

   ID   A   B   C
0  10  a1  b1  c1
1  20  a2  b2  c2
2  30  a3  b3  c3
3  10  a4  b4  c4
4  20  a5  b5  c5
5  30  a6  b6  c6
6  10  a7  b7  c7
7  20  a8  b8  c8
8  30  a9  b9  c9
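
The block width does not have to be hard-coded. The sketch below is one possible variation, not part of the original answer. It derives n from the number of distinct header names, on the assumption that every group repeats the same headers in the same order. The sample frame and the names blocks and df_1 are illustrative.

```python
import pandas as pd

# Hand-built sample frame with genuinely repeated headers.
df = pd.DataFrame(
    [
        [10, "a1", "b1", "c1", "a4", "b4", "c4", "a7", "b7", "c7"],
        [20, "a2", "b2", "c2", "a5", "b5", "c5", "a8", "b8", "c8"],
        [30, "a3", "b3", "c3", "a6", "b6", "c6", "a9", "b9", "c9"],
    ],
    columns=["ID", "A", "B", "C", "A", "B", "C", "A", "B", "C"],
)
df1 = df.set_index("ID")

# Width of one block = number of distinct header names (A, B, C).
# Assumption: each block repeats the same headers in the same order.
n = df1.columns.nunique()
if df1.shape[1] % n != 0:
    raise ValueError("column count is not a multiple of the block width")

# Slice positionally into n-column blocks and stack them vertically.
blocks = [df1.iloc[:, i:i + n] for i in range(0, df1.shape[1], n)]
df_1 = pd.concat(blocks).reset_index()
print(df_1)
```

Because the slicing is positional, this version keeps the rows of each block in their original order and does not depend on the IDs being sorted.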