Home » excel » excel – Struggling a bit with openpyxl for Python 2.6

excel – Struggling a bit with openpyxl for Python 2.6

Posted by: admin April 23, 2020 Leave a comment

Questions:

Apologies if this has been asked elsewhere already.

I’ve imported openpyxl (v.1.6.1) for Python (2.6 in this case). So far it works great, but what I’m trying to do I can’t quite figure out.

I’ll list the xlsx file’s structure I need to read first. It’s essentially a standby roster which looks like this:

A1 B1 C1 D1
YYYY/MM/DD System1 System2 System3
A2 B2 C2 D2
2013/02/11 User A User B User C
A3 B3 C3 D3
2013/02/12 User D User E User F
A4 B4 C4 D4
2013/02/13 User G User H User I

What I need to do is first scan all cells in A (except for row 1, so A2 to A-infinity), then depending on if I can find today’s date in cells A2 to A-infinity, print the entire row associated with today’s date as well as the system that the user is associated with (so in this case B3, C3 and D3 as well as B1, C1 and D1).

I can paste what I have so far, but it’s not much:

import openpyxl  
from openpyxl import load_workbook

wb = load_workbook(filename = 'standby.xlsx', use_iterators = True)  
ws = wb.get_sheet_by_name(name = 'Sheet1') # ws is now an IterableWorksheet  
for row in ws.iter_rows(): # it brings a new method: iter_rows()  
    for cell in row:  
        print cell.internal_value  

With a standard text file I’d normally I’d do something like this:

textfile = os.open('textfile', 'r')
textfiler = textfile.readlines()
for line in textfile:
    if "today's date" in line:
        print line

I’m just not sure how to do what I’d do with a textfile with an xlsx file using openpyxl. Can anyone please give me a hint how to go about this?

I think what I need to do is scan everything under A (not sure how to do this), do a match for today’s date (the date thing I think I can figure out), then print out the entire row (Bxxx, Cxxx, Dxxx etc.) if today’s date has been found (not sure how to do this either).

Hopefully I explained my problem OK, but if not, please let me know and I’ll try again.

EDIT: Thanks to Glen, I think I’m a bit further than I was, bit still not quite there yet. Code I have so far:

import openpyxl
from openpyxl import load_workbook

def find_row(today, ws):
    for a in ws.iter_rows():
        if today == a.internal_value:
            return (a)

def main():
    wb = load_workbook(filename = 'standby.xlsx', use_iterators = True)
    ws = wb.get_sheet_by_name(name = 'Sheet1') # ws is now an IterableWorksheet
    today = '2013-02-12 00:00:00' #whatever date format you're using
    row = find_row(today, ws)
    print row

def test():
    wb = load_workbook(filename = r'standby.xlsx')
    sheet_ranges = wb.get_sheet_by_name(name = 'Sheet1')
    print sheet_ranges.cell('A2607').value # D18

if __name__ == '__main__':
    main()

Error I’m stuck at now is “AttributeError: ‘tuple’ object has no attribute ‘internal_value'” (still googling this one).

The def find_row section looked like this previously:

def find_row(today, ws):
    for a in ws.rows():
        if today == a.internal_value:
            return (a)

That served up the following:
NotImplementedError: use ‘iter_rows()’ instead

EDIT #2: Thanks to Glen Swinfield’s help and patience, I think I finally figured it out. Here’s what the code looks like right now (please excuse if it looks a bit messy, turns out there are quite a lot of columns in the spreadsheet):
import datetime
import openpyxl
from openpyxl import load_workbook

def find_row(today, ws):
    for a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16,a17,a18,a19,a20,a21,a22,a23,a24,a25,a26,a27,a28,a29,a30,a31,a32,a33,a34,a35,a36,a37,a38,a39,a40,a41,a42,a43,a44,a45,a46,a47,a48,a49,a50,a51,a52,a53,a54,a55,a56,a57,a58,a59,a60,a61,a62,a63,a64,a65,a66,a67,a68,a69,a70,a71,a72,a73,a74,a75,a76,a77,a78,a79,a80,a81,a82,a83,a84,a85,a86,a87,a88,a89,a90,a91,a92,a93 in ws.iter_rows():
        if today == a1.internal_value:
            print(a1.internal_value,a2.internal_value,a3.internal_value,a4.internal_value,a5.internal_value,a6.internal_value,a7.internal_value,a8.internal_value,a9.internal_value,a10.internal_value,a11.internal_value,a12.internal_value,a13.internal_value,a14.internal_value,a15.internal_value,a16.internal_value,a17.internal_value,a18.internal_value,a19.internal_value,a20.internal_value,a21.internal_value,a22.internal_value,a23.internal_value,a24.internal_value,a25.internal_value,a26.internal_value,a27.internal_value,a28.internal_value,a29.internal_value,a30.internal_value,a31.internal_value,a32.internal_value,a33.internal_value,a34.internal_value,a35.internal_value,a36.internal_value,a37.internal_value,a38.internal_value,a39.internal_value,a40.internal_value,a41.internal_value,a42.internal_value,a43.internal_value,a44.internal_value,a45.internal_value,a46.internal_value,a47.internal_value,a48.internal_value,a49.internal_value,a50.internal_value,a51.internal_value,a52.internal_value,a53.internal_value,a54.internal_value,a55.internal_value,a56.internal_value,a57.internal_value,a58.internal_value,a59.internal_value,a60.internal_value,a61.internal_value,a62.internal_value,a63.internal_value,a64.internal_value,a65.internal_value,a66.internal_value,a67.internal_value,a68.internal_value,a69.internal_value,a70.internal_value,a71.internal_value,a72.internal_value,a73.internal_value,a74.internal_value,a75.internal_value,a76.internal_value,a77.internal_value,a78.internal_value,a79.internal_value,a80.internal_value,a81.internal_value,a82.internal_value,a83.internal_value,a84.internal_value,a85.internal_value,a86.internal_value,a87.internal_value,a88.internal_value,a89.internal_value,a90.internal_value,a91.internal_value,a92.internal_value,a93.internal_value)

def main():
    wb = load_workbook(filename = 'standby.xlsx', use_iterators = True)
    ws = wb.get_sheet_by_name(name = 'Sheet1') # ws is now an IterableWorksheet
    today = datetime.datetime(2013, 02, 12, 0, 0) #whatever date format you're using
    row = find_row(today, ws)

def test():
    wb = load_workbook(filename = r'standby.xlsx')
    sheet_ranges = wb.get_sheet_by_name(name = 'Sheet1')
    print sheet_ranges.cell('A2607').value # D18

if __name__ == '__main__':
    main()
How to&Answers:

I can’t test this right now, but in principle you need iterate over each row, unpacking the cells as you go, then check if cell A has today’s date, if it does, return the row.

import openpyxl  
from openpyxl import load_workbook

def find_row(today, ws):
    for a,b,c,d in ws.rows():
        if today == a.internal_value:
            return (a, b, c, d)
    raise someException('row not found')

wb = load_workbook(filename = 'standby.xlsx', use_iterators = True)  
ws = wb.get_sheet_by_name(name = 'Sheet1') # ws is now an IterableWorksheet
today = '' #whatever date format you're using
try:
    row = find_row(today, ws) 
except someException:
    # handle exception

There are several ways of doing this in python, with a while/else loop for example, but this is the essence of it.

Updated responding to comment:

def find_row(today, ws):
    for a,b,c,d in ws.iter_rows():
        if today == a.internal_value:
            return (a, b, c, d)
    raise someException('row not found')

The use_iterators = True flag means you get an iterator, so you have to use iter_rows() instead of just .rows.