I am trying to pull data from an Excel sheet, apply a function to it (in this case median()), and create a histogram from the result.

Here is my code:

import pandas as pd
import matplotlib.pyplot as plt

pd.set_option('display.max_columns', 100000)
absent = pd.read_excel('Absenteeism_at_work.xls')
col = ['Distance from Residence to Work', 'Transportation expense', 'Month of absence', 'Social smoker',
       'Social drinker', 'Education']

# print(absent.loc[:741, col])

plt.title('The Mean')
plt.xlabel('Attribute of Absence')
# x = ['Distance', 'Trans Exp.', 'Month', 'Smoker', 'Drinker', 'Edu.']
x = absent.loc[:741, col].median()
x.plot(kind="bar", figsize=(5, 5))

# print(hist)
plt.show() # shows histogram in side-window

Here is the terminal output:

Distance from Residence to Work     26.0
Transportation expense             225.0
Month of absence                     6.0
Social smoker                        0.0
Social drinker                       1.0
Education                            1.0
dtype: float64

And most importantly, the incorrect histogram:

[Image: median histogram based on the data above (ignore the "The Mean" title)]

Shouldn’t ‘Social Smoker’ be showing as 0? Also, what is that extra bit of bar to the right of ‘Distance from Residence to Work’? Is this proper? Thank you!

How to & Answers:

Your two plots, x.plot(kind="bar", figsize=(5, 5)) and plt.hist(x), are getting combined into one figure.
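One way to keep the two plots apart is to give each its own figure and axes. The following is a minimal sketch, not the asker's exact script: it assumes plt.hist(x) was being called somewhere on the same figure, and it hard-codes the medians from the terminal output above so it runs without the Excel file. The names fig_bar, ax_bar, fig_hist and ax_hist, and the title strings, are illustrative.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Medians copied from the terminal output in the question, so this sketch
# does not need 'Absenteeism_at_work.xls' (assumption: same values as x).
x = pd.Series(
    {
        "Distance from Residence to Work": 26.0,
        "Transportation expense": 225.0,
        "Month of absence": 6.0,
        "Social smoker": 0.0,
        "Social drinker": 1.0,
        "Education": 1.0,
    }
)

# Figure 1: bar chart of the medians, drawn on its own axes.
fig_bar, ax_bar = plt.subplots(figsize=(5, 5))
x.plot(kind="bar", ax=ax_bar)
ax_bar.set_title("The Median")
ax_bar.set_xlabel("Attribute of Absence")
fig_bar.tight_layout()

# Figure 2: histogram of the same six values, on separate axes,
# so its bars can no longer land on top of the bar chart.
fig_hist, ax_hist = plt.subplots(figsize=(5, 5))
ax_hist.hist(x)
ax_hist.set_title("Histogram of the medians")

# plt.show()  # uncomment to display both figures
```

Passing ax= explicitly tells pandas which axes to draw on instead of letting it reuse whatever axes happen to be current. With the plots separated, the 'Social smoker' bar has height 0 and is simply not visible in the bar chart.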

x.plot(kind="bar", figsize=(5, 5)):

