python – Problems writing to file in pandas

Posted by: admin, April 23, 2020

Questions:

I’m currently trying to write an Excel file from a tr8 file using the pandas to_excel function. It writes the Excel file, but when I open it in Excel I cannot see the full data. Here is the code I use for the tr8 file:

output_file = pd.ExcelWriter('20131001103311.xlsx')
widths = [1, 8, 2, 4, 2, 5, 1, 5, 1, 5, 1, 5, 1, 5, 1, 5, 1, 5, 1, 5, 1, 5, 1, 5, 1, 5, 1, 5, 1, 10, 1]
df = pd.read_fwf('20131001103311.tr8', widths=widths, header=True)
df.columns = ['TIP. REG.', 'COD. EST.', 'TIP. INF.', 'AGNO', 'DEL', 'ENE', 'OBS', 'FEB', 'OBS', 'MAR', 'OBS', 'ABR',
              'OBS', 'MAY', 'OBS', 'JUN', 'OBS', 'JUL', 'OBS', 'AGO', 'OBS', 'SEP', 'OBS', 'OCT', 'OBS', 'NOV', 'OBS',
              'DIC', 'OBS', 'ESP.', 'TIP. DATO']
df.to_excel(output_file, '20131001103311')
output_file.save()
Answers:

I simplified your program down to 2 columns of data for testing:

import pandas as pd

output_file = pd.ExcelWriter('20131001103311.xlsx')

widths = [10, 10]
df = pd.read_fwf('20131001103311.tr8', widths=widths, header=True)

df.columns = ['TIP. REG.', 'COD. EST.']

df.to_excel(output_file, '20131001103311')
output_file.save()

And I ran it against the following fixed-width format (fwf) file:

$ cat 20131001103311.tr8
TIP. REG. COD. EST.
1         1000
2         300
3         7000
4         600
5         12345

I didn’t get any execution errors and the output looks like it should:

[Image: screenshot of the Excel output]

The first row of data is missing because header=True has been passed to read_fwf, so the first data row is consumed as a header row instead of being read as data.
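To make the effect of the header argument concrete, here is a minimal sketch that rebuilds the two-column sample in memory and parses it twice. It assumes that header=True was interpreted as the integer 1 by the pandas version used above (True equals 1 in Python); as far as I know, recent pandas versions reject a boolean header, so the sketch passes 0 and 1 explicitly. The names sample, read_sample, with_header_0 and with_header_1 are made up for this illustration.

```python
import io

import pandas as pd

# Rebuild the two-column sample file in memory, padding every field to 10 characters.
widths = [10, 10]
rows = [
    ("TIP. REG.", "COD. EST."),
    (1, 1000),
    (2, 300),
    (3, 7000),
    (4, 600),
    (5, 12345),
]
sample = "\n".join(
    "".join(f"{str(value):<{width}}" for value, width in zip(row, widths))
    for row in rows
)


def read_sample(header):
    """Parse the in-memory sample using the given header row index (hypothetical helper)."""
    return pd.read_fwf(io.StringIO(sample), widths=widths, header=header)


with_header_0 = read_sample(0)  # line 0 holds the column names, all 5 data rows survive
with_header_1 = read_sample(1)  # line 1 (the first data row) is consumed as the header

# Compare how many data rows each variant keeps.
print(len(with_header_0), len(with_header_1))
```

If the file really starts with a line of column names, header=0 keeps every data row; if it has no header line at all, header=None together with an explicit df.columns assignment avoids losing a row.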

So it doesn’t seem to be a pandas issue.

I would look at the columns in your fixed-width file. Perhaps print the DataFrame after reading it, to check that all the columns have been parsed correctly before you assign the names in df.columns.
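A rough sketch of that check, using a small in-memory stand-in for the .tr8 file (the real file and its 31 widths would go in its place). The variable names and the ValueError guard are my own additions for illustration, not part of the original program.

```python
import io

import pandas as pd

widths = [10, 10]
names = ["TIP. REG.", "COD. EST."]

# Stand-in for the .tr8 file: a header line plus two data rows, fields padded to 10 chars.
lines = [names, ["1", "1000"], ["2", "300"]]
text = "\n".join(
    "".join(field.ljust(width) for field, width in zip(line, widths))
    for line in lines
)

df = pd.read_fwf(io.StringIO(text), widths=widths, header=0)

# Inspect what was actually parsed before touching df.columns.
print(df.shape)
print(df.dtypes)
print(df.head())

# Renaming only makes sense if the parsed column count matches the supplied names.
if df.shape[1] != len(names):
    raise ValueError(f"parsed {df.shape[1]} columns, expected {len(names)}")
df.columns = names
```

If the shape or the dtypes look wrong here, the problem is in the widths or the header setting rather than in the Excel export.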

Update: Looking at the images of the input data and the output file that @jchavarro tried to upload, it looks like there may be an issue here. At least, the Excel output doesn’t match the DataFrame data, probably because of the repeated OBS columns.
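If the repeated OBS labels are indeed the trigger, one possible workaround on an affected version is to make the column names unique before writing. The sketch below is only that, a workaround under that assumption; make_unique is a hypothetical helper of mine and the numeric suffix scheme is arbitrary.

```python
from collections import Counter


def make_unique(names):
    """Return a copy of names where every repeated label gets a numeric suffix."""
    totals = Counter(names)  # how often each label occurs overall
    seen = Counter()         # how often each label has been seen so far
    unique = []
    for name in names:
        if totals[name] > 1:
            seen[name] += 1
            unique.append(f"{name}.{seen[name]}")
        else:
            unique.append(name)
    return unique


# Column names from the question, including the twelve repeated OBS labels.
columns = ['TIP. REG.', 'COD. EST.', 'TIP. INF.', 'AGNO', 'DEL', 'ENE', 'OBS', 'FEB', 'OBS', 'MAR', 'OBS', 'ABR',
           'OBS', 'MAY', 'OBS', 'JUN', 'OBS', 'JUL', 'OBS', 'AGO', 'OBS', 'SEP', 'OBS', 'OCT', 'OBS', 'NOV', 'OBS',
           'DIC', 'OBS', 'ESP.', 'TIP. DATO']

unique_columns = make_unique(columns)
print(unique_columns[5:9])
```

Assigning unique_columns to df.columns instead of the original list gives every column a distinct label, at the cost of labels that differ from the source format.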

Update 2: This is an issue. I’ve raised it on GitHub and submitted a fix.

Update 3: I created a fix for the above issue which has now been merged into the pandas master branch and which should be released as part of the 0.13 release.