I need to write a python script to read excel files, find each worksheet and then print these to pdf with the standard formating defined in the excel.
This gives me the ability to find the names of each worksheet.
import xlrd book = xlrd.open_workbook("myfile.xls") print "Worksheet name(s):", book.sheet_names()
This results in
Worksheet name(s): [u'Form 5', u'Form 3', u'988172 Adams Road', u'379562 Adams Road', u'32380 Adams Road', u'676422 Alderman Road', u'819631 Appleyard Road', u'280998 Appleyard Road', u'781656 Atkinson Road', u'949461 Barretts Lagoon Road', u'735284 Bilyana Road', u'674784 Bilyana Road', u'490894 Blackman Road', u'721026 Blackman Road']
Now I want to print each worksheet which starts with a number to a pdf.
So I can
worksheetList=book.sheet_names() for worksheet in worksheetList: if worksheet.find('Form')!=0: #this just leaves out worksheets with the word 'form' in it <function to print to pdf> book.sheet_by_name(worksheet) #what can I use for this?
or something similar to above…what can I use to achieve this?
The XLRD documentation is confusing it says
Formatting features not included in xlrd version 0.6.1: Miscellaneous
sheet-level and book-level items e.g. printing layout, screen panes
This collection of features, new in xlrd version 0.6.1, is intended to
provide the information needed to (1) display/render spreadsheet
contents (say) on a screen or in a PDF file
Which is true? can some other package be used to print to pdf?
This seems like the place to put this answer.
In the simplest form:
import win32com.client o = win32com.client.Dispatch("Excel.Application") o.Visible = False wb_path = r'c:\user\desktop\sample.xls' wb = o.Workbooks.Open(wb_path) ws_index_list = [1,4,5] #say you want to print these sheets path_to_pdf = r'C:\user\desktop\sample.pdf' wb.WorkSheets(ws_index_list).Select() wb.ActiveSheet.ExportAsFixedFormat(0, path_to_pdf)
Including a little formatting magic that scales to fit to a single page and sets the print area:
import win32com.client o = win32com.client.Dispatch("Excel.Application") o.Visible = False wb_path = r'c:\user\desktop\sample.xls' wb = o.Workbooks.Open(wb_path) ws_index_list = [1,4,5] #say you want to print these sheets path_to_pdf = r'C:\user\desktop\sample.pdf' print_area = 'A1:G50' for index in ws_index_list: #off-by-one so the user can start numbering the worksheets at 1 ws = wb.Worksheets[index - 1] ws.PageSetup.Zoom = False ws.PageSetup.FitToPagesTall = 1 ws.PageSetup.FitToPagesWide = 1 ws.PageSetup.PrintArea = print_area wb.WorkSheets(ws_index_list).Select() wb.ActiveSheet.ExportAsFixedFormat(0, path_to_pdf)
I have also started a module over a github if you want to look at that: https://github.com/spottedzebra/excel/blob/master/excel_to_pdf.py