Read Excel File Pivot in Python

Posted by: admin, May 14, 2020

Questions:

I want to retrieve data from my Excel file.

Here are the data:
[image: screenshot of the source worksheet]

Here is the result I want to have:
[image: screenshot of the desired result]

And here is the code that I started to write:

from openpyxl import load_workbook

# Raw string so the backslashes in the Windows path are not read as escape sequences
wb = load_workbook(r'C:\Fichiers_Excel\Team.xlsx')
m = wb.active
# Retrieve the value of a certain cell
team_1_captain = m['A2'].value, m['B2'].value, m['D2'].value, m['E2'].value
team_1_player = m['A6'].value, m['B6'].value, m['D6'].value, m['E6'].value
# Captain row followed by the player row without its repeated team cell (A6)
team_1 = team_1_captain + team_1_player[1:]
print(team_1)
Answers:

Using pandas, you can reshape the data so that each team is a row with the captain and the player as columns, then join in the day values:

import pandas as pd

df = pd.read_excel("<path to your file>.xlsx")

teams = df.set_index(["Team", "Status"])["Name"].unstack()

cols = ["Monday", "Tuesday"]
days = []
for status in ["player", "captain"]:
    # First Monday/Tuesday values per team for this status
    per_team = df[df["Status"] == status].groupby("Team")[cols].first()
    days.append(per_team.rename(columns={col: "{}_{}".format(status, col) for col in cols}))

days = pd.concat(days, axis=1)

result = teams.join(days)
result = result[["captain", "captain_Monday", "captain_Tuesday", "player", "player_Monday", "player_Tuesday"]]
print(result)
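
As a variation on the answer above (not the answer's exact code), the same table can be sketched by unstacking all three value columns at once and then flattening the resulting two-level header. The sample rows are copied from the second answer on this page and stand in for the real worksheet, which may differ. The sketch assumes each team has exactly one captain and one player; with duplicates, `unstack` raises a `ValueError`.

```python
import pandas as pd

# Sample rows copied from the second answer on this page (an assumption;
# in practice df would come from pd.read_excel on the real file).
df = pd.DataFrame({
    "Team": [1, 3, 3, 2, 1, 2],
    "Name": ["Mike", "Mc", "Dany", "Tom", "Steve", "Hector"],
    "Status": ["captain", "captain", "player", "captain", "player", "player"],
    "Monday": [10, 5, 2, 11, 10, 10],
    "Tuesday": [7, 1, 1, 8, 5, 8],
})

# unstack() moves the "Status" index level into the columns, producing a
# two-level header such as ("Monday", "captain").
wide = df.set_index(["Team", "Status"])[["Name", "Monday", "Tuesday"]].unstack()

# Flatten the header: ("Name", "captain") -> "captain",
# ("Monday", "captain") -> "captain_Monday".
wide.columns = [
    status if col == "Name" else "{}_{}".format(status, col)
    for col, status in wide.columns
]

# Same column order as in the answer above.
ordered = ["captain", "captain_Monday", "captain_Tuesday",
           "player", "player_Monday", "player_Tuesday"]
wide = wide[ordered]
print(wide)
```

This avoids the loop and the separate join, at the cost of a slightly less obvious header-flattening step.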

Answer:

It’s fairly straightforward to do this with pandas.

For demonstration purposes I recreated your original worksheet and read it back in. I then merge the rows whose ‘Status’ column is ‘captain’ with the rows whose ‘Status’ column is ‘player’, matching them on ‘Team’, and write the resulting dataframe to a new xlsx file. The fully reproducible code and expected outputs are below.

import pandas as pd

df = pd.DataFrame({"Team": [ 1, 3, 3, 2, 1, 2],
                   "Name": ["Mike", "Mc", "Dany", "Tom", "Steve", "Hector"],
                   "Status": ["captain", "captain", "player", "captain", "player", "player"],
                   "Monday": [10, 5, 2, 11, 10, 10],
                   "Tuesday": [7, 1, 1, 8, 5, 8]})

writer = pd.ExcelWriter('start.xlsx', engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1', index=False)
writer.close()  # writer.save() was removed in pandas 2.0

df = pd.read_excel('start.xlsx')
captain_df = df[df['Status'] == 'captain']
player_df = df[df['Status'] == 'player']


final_df = captain_df.merge(player_df, how='left', on='Team' , suffixes=('_1', '_2'))
final_df = final_df.rename(columns={"Name_1": "Captain", "Name_2": "Player"})
final_df = final_df[['Team', 'Captain', 'Monday_1', 'Tuesday_1', 'Player', 'Monday_2', 'Tuesday_2']]

# worksheet.write() below is XlsxWriter API, so request that engine explicitly
writer = pd.ExcelWriter('finish.xlsx', engine='xlsxwriter')
final_df.to_excel(writer, sheet_name='Sheet1', index=False)
#optional rename of the column names in the next 6 lines since Monday and Tuesday are duplicate names
workbook  = writer.book
worksheet = writer.sheets['Sheet1']
worksheet.write('C1', 'Monday')
worksheet.write('D1', 'Tuesday')
worksheet.write('F1', 'Monday')
worksheet.write('G1', 'Tuesday')
writer.close()  # writer.save() was removed in pandas 2.0

Expected start.xlsx:

[image: screenshot of start.xlsx]

Expected finish.xlsx:

[image: screenshot of finish.xlsx]
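
The merge step can also be checked in memory without writing any Excel files. The following is a minimal sketch under a few assumptions: it reuses the sample rows from this answer, drops the ‘Status’ column before merging, and uses the suffixes `_captain` and `_player` (names chosen here for illustration) so that no duplicate headers need patching afterwards. The `validate="one_to_one"` argument is an optional safeguard that was not part of the original answer.

```python
import pandas as pd

# Sample rows copied from the answer above; the real worksheet may differ.
df = pd.DataFrame({
    "Team": [1, 3, 3, 2, 1, 2],
    "Name": ["Mike", "Mc", "Dany", "Tom", "Steve", "Hector"],
    "Status": ["captain", "captain", "player", "captain", "player", "player"],
    "Monday": [10, 5, 2, 11, 10, 10],
    "Tuesday": [7, 1, 1, 8, 5, 8],
})

# Split by status; "Status" is no longer needed once the rows are separated.
captain_df = df[df["Status"] == "captain"].drop(columns="Status")
player_df = df[df["Status"] == "player"].drop(columns="Status")

# validate="one_to_one" raises pandas.errors.MergeError if any team has
# more than one captain or more than one player.
merged = captain_df.merge(
    player_df,
    how="left",
    on="Team",
    suffixes=("_captain", "_player"),
    validate="one_to_one",
)

print(merged.columns.tolist())
print(merged)
```

Because the left merge keeps the captain rows in their original order, the teams appear in the order they are listed in the source data rather than sorted.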