python – Pull two columns from Excel and append key value pairs to dictionary

Posted by: admin April 23, 2020

Questions:

My apologies if similar questions have been asked — I dug through quite a few, but they did not match my specific issue.

Basically, I have an Excel spreadsheet with two columns: Name and Email. I’m using pandas to grab the two columns from the file. I want to grab the values from the columns in order and add them to a dictionary so that I can easily reference name and email pairs later on.

I currently have two functions in two files. One is my main file/function, and the other is a file named readExcel with a function named read:

# readExcel.py
import pandas as pd

def read(fileName: str, sheetName: str):
    f = pd.read_excel(fileName, sheet_name = sheetName)
    return f

# __main__.py
import readExcel as re

from pathlib import Path

def main():
    contacts = {}

    p = Path(__file__).with_name('contacts.xlsx')
    f = re.read(p, "Sheet1")

    for n in f["Name"]:
        for e in f["Email"]:
            contacts[n] = e

    print(contacts)

The issue I’m facing here is that the resulting dictionary is un-ordered, e.g., Bob Testerson: [email protected], Jim Tester: [email protected]

How would I go about properly ordering the data I’m pulling from the spreadsheet?

EDIT: Per request, I’ll add more information regarding the Excel file and preferred order.

The Excel file looks like this:
[Image: preview of the Excel file]

As for the ordering of the data, it seems it would be best done before adding it to the dictionary, but that’s not a requirement for me. Also, I don’t specifically care about the order in which the key/value pairs appear in the dictionary, but rather that the key/value pairs appear as they do in the Excel file, e.g.,

{
    "Jon Testerson": "[email protected]", 
    "Henry": "[email protected]", 
    "Bryce Testington": "[email protected]", 
    "Greg": "[email protected]", 
    "Jerry Testerfield": "[email protected]"
}

Answer:

Try this, using the pandas to_dict method. Just change the column names if you need to.

import pandas as pd

def read_excel(path_to_file):
    # Load the first sheet of the workbook into a DataFrame
    df = pd.read_excel(path_to_file)
    return df

def dataframe_to_dict(df, key_column, value_column):
    # Index by the key column, then map each key to its value in row order
    name_email_dict = df.set_index(key_column)[value_column].to_dict()
    return name_email_dict

if __name__ == "__main__":
    # Raw string so the backslashes in the Windows path are not treated as escapes
    path_to_file = r'C:\projects\scratchwork\excel_dict.xlsx'

    df = read_excel(path_to_file)
    name_email_dict = dataframe_to_dict(df, 'Name', 'Email')
    print(name_email_dict)
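
To see why the nested loops in the question misbehave and why pairing the columns row by row works, here is a minimal sketch that uses a small in-memory DataFrame in place of the spreadsheet. The names and example.com addresses are made up for illustration; they are not the asker's data.

```python
import pandas as pd

# Hypothetical stand-in for the spreadsheet (made-up names and addresses).
df = pd.DataFrame({
    "Name": ["Jon Testerson", "Henry", "Greg"],
    "Email": ["jon@example.com", "henry@example.com", "greg@example.com"],
})

# Nested loops as in the question: the inner loop runs to completion for
# every name, so each name ends up mapped to the LAST email in the column.
nested = {}
for n in df["Name"]:
    for e in df["Email"]:
        nested[n] = e

# Pairing the columns row by row with zip keeps each name with its own email.
zipped = dict(zip(df["Name"], df["Email"]))

# The set_index/to_dict approach produces the same mapping.
indexed = df.set_index("Name")["Email"].to_dict()

print(nested)
print(zipped)
print(zipped == indexed)
```

Both dict(zip(...)) and to_dict keep the pairs in row order. One caveat: if a name appears more than once, a dictionary can only keep one entry for it, so the later row overwrites the earlier one.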

Answer:

I’m sure there’s an easier way to do it, but I would put the data into a DataFrame and then use the sort_values method to sort it. This would look something like:

# readExcel.py
import pandas as pd

def read(fileName: str, sheetName: str):
  f = pd.read_excel(fileName, sheet_name = sheetName)
  return f

# __main__.py
import readExcel as re

from pathlib import Path

def main():
  p = Path(__file__).with_name('contacts.xlsx')

  # read() already returns a DataFrame, so no empty frame or append is needed
  # (DataFrame.append was removed in pandas 2.0)
  df = re.read(p, "Sheet1")

  print(df.sort_values(by=["Name","Email"]))

if __name__ == "__main__":
  main()

Again, this may not be the best way to do it, but it should work. If there is extra information on Sheet 1, then prior to the print I would do:

df = df[['Name','Email']]

This selects only the Name and Email columns.
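
As a rough sketch of what the column selection and the sort do, assume a sheet with one extra column. The Phone column and all values below are hypothetical, and the data is built in memory instead of being read from a file.

```python
import pandas as pd

# Hypothetical sheet contents with an extra column (made-up values).
df = pd.DataFrame({
    "Name": ["Jon Testerson", "Henry", "Greg"],
    "Email": ["jon@example.com", "henry@example.com", "greg@example.com"],
    "Phone": ["555-0100", "555-0101", "555-0102"],
})

# Keep only the two columns of interest; the extra column is dropped.
df = df[["Name", "Email"]]

# sort_values returns a new frame ordered alphabetically by Name, then Email.
sorted_df = df.sort_values(by=["Name", "Email"])

print(list(df.columns))
print(df["Name"].tolist())         # original row order
print(sorted_df["Name"].tolist())  # alphabetical order
```

Note that sorting orders the rows alphabetically instead of keeping the spreadsheet order. If the pairs should appear as they do in the Excel file, the sort can simply be skipped, since read_excel returns rows in sheet order.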