I am using Python with xlwings to read a column of data in Excel 2013. Column
A is populated with numbers. To import this column into a Python list
py_list, I have the following code:
import xlwings as xw
wb = xw.Book('BookName.xlsm')
sht = wb.sheets['SheetName']  # a sheet is looked up on the workbook, not opened with xw.Book
py_list = sht.range('A2:A40').value
The above code works if the column data occupies
A2:A40. However, the column can keep growing, for example to
A2:A80. The cell after the last row of data is empty. The number of rows in this column is not known in advance.
How can I modify the code to detect the empty cell after the last row so that the whole range of data can be read?
I am open to using other Python libraries besides xlwings to read the Excel data. I am using Python 3.6.
I say this a lot about reading files in from CSV or Excel, but I would use pandas:
import pandas as pd
# can also index the sheet by name or fetch all sheets
# (the keyword is sheet_name in current pandas; older releases spelled it sheetname)
df = pd.read_excel('filename.xlsm', sheet_name=0)
mylist = df['column name'].tolist()
An alternative would be to use a dynamic formula, using something like OFFSET in Excel, instead of the fixed
'A2:A40', or perhaps a named range.
I know this is an old question, but you can also use openpyxl:
from openpyxl import load_workbook
wb = load_workbook("BookName.xlsx")  # workbook
ws = wb['SheetName']  # worksheet (get_sheet_by_name is deprecated)
column = ws['A']  # all cells in column A
column_list = [cell.value for cell in column]
After much trial and error, I will answer my own question.
The key to this question is finding out the number of rows in column A.
The number of rows can be found with this single line of xlwings:
rownum = sht.range('A1').end('down').last_cell.row
One needs to read the API documentation carefully to get the answer.
Once the number of rows is found, it is easy to figure out the rest.
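As a minimal sketch of "the rest", the row number can be turned into a range address and read in one call. The helper names column_address and read_column are hypothetical, not part of xlwings, and the sketch assumes the data starts at A2 (as in the question) and forms one contiguous block with no gaps.

```python
def column_address(column, first_row, last_row):
    """Build an A1-style address such as 'A2:A80'."""
    return f"{column}{first_row}:{column}{last_row}"


def read_column(sht, column="A", first_row=2):
    """Read a contiguous block of cells from `column`, starting at `first_row`.

    `sht` is assumed to be an xlwings Sheet, e.g. xw.Book(...).sheets[...].
    """
    # Same idea as Ctrl+Down in Excel: jump to the last filled cell of the block.
    last_row = sht.range(f"{column}{first_row}").end("down").row
    values = sht.range(column_address(column, first_row, last_row)).value
    # xlwings returns a scalar rather than a list for a single-cell range.
    return values if isinstance(values, list) else [values]


# Usage (requires Excel and xlwings; names taken from the question):
#   import xlwings as xw
#   sht = xw.Book('BookName.xlsm').sheets['SheetName']
#   py_list = read_column(sht)
```

Note that end('down') behaves like Ctrl+Down, so if only the first cell is filled it jumps far past the data; this sketch does not guard against that case.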
I found this to be the easiest way to create lists from entire columns in Excel, and it only takes the populated cells.
import pandas as pd
# Insert the complete path to the Excel file and the index of the worksheet
df = pd.read_excel("PATH.xlsx", sheet_name=0)
# Insert the name of the column as a string in brackets
list1 = list(df['Column Header 1'])
list2 = list(df['Column Header 2'])
print(list1)
print(list2)
I went through the xlwings documentation and didn't find anything like this, but you can always work around it:
temp = [x for x in xw.Range('A2:A200').value if x is not None]  # A200: just pick a row number that is large enough
Or try this, which stops at the first empty cell:
from itertools import takewhile
temp = list(takewhile(lambda x: x is not None, xw.Range('A2:A70').value))
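On line 2, I first tried something like the following. It is not valid Python: the filter of a comprehension cannot take an else, and as far as I know there is no way to break out of a lambda.
# invalid, shown only as the attempt that did not work (exit() could also be quit())
temp = [lambda x: x for x in xw.Range('D:D').values if x != None else exit()]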
An explicit loop over an iterator works instead:
temp = iter(xw.Range('A:A').value)
values = []
a = next(temp)  # assumes the first cell of data is at row 1
while a is not None:  # might also want to stop on zeros, '', etc.
    values.append(a)
    a = next(temp)
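The approaches in this answer differ when the column has gaps: filtering drops every empty cell, while takewhile (or the loop) stops at the first one. A small sketch with a plain list standing in for the values xlwings would return, where empty cells come back as None; the function names are hypothetical.

```python
from itertools import takewhile


def drop_blanks(values):
    """Keep every non-empty cell, even those after a gap."""
    return [v for v in values if v is not None]


def up_to_first_blank(values):
    """Keep cells only until the first empty one is reached."""
    return list(takewhile(lambda v: v is not None, values))


# Stand-in for something like xw.Range('A2:A7').value with gaps in the data.
column = [1.0, 2.0, None, 4.0, None, None]

print(drop_blanks(column))        # [1.0, 2.0, 4.0]
print(up_to_first_blank(column))  # [1.0, 2.0]
```

Which one is appropriate depends on whether an empty cell in the middle of the column should count as the end of the data.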