excel – Processing images in XLSX using Python

Posted by: admin May 14, 2020

Questions:

I have an xlsx that has two sheets: one has some data in G1:O25 (let’s call this “data”) and the other has some images inserted into cells in G1:O25 (let’s call this one “images”).

My goal is to use Python to filter the data using the images. I want a popup that shows me the image from cell G1 along with a checkbox or something similar to include/exclude that data point, and so on for each cell. Then I want to create a new sheet (“filtered data”) with the included data points.

I’m new to Python so bear with me, but I’ve figured out a couple things from searching:

  1. I can load the data into a list.
  2. xlsx files are actually zip files, so I can use zipfile and matplotlib to read the images from the subdirectories and display them.
  3. It shouldn’t be hard to add the checkbox thing and do the filtering.
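Point 2 can be sketched roughly as follows. This is a minimal sketch, assuming the embedded pictures sit under xl/media/ inside the archive (the usual location in an xlsx package); list_media and read_media are hypothetical helper names, not part of any library.

```python
import io
import zipfile


def list_media(xlsx_file):
    """Return the names of embedded media files, in archive order.

    xlsx_file may be a path or a binary file-like object.
    Assumption: pictures are stored under xl/media/.
    """
    with zipfile.ZipFile(xlsx_file) as zf:
        return [name for name in zf.namelist() if name.startswith("xl/media/")]


def read_media(xlsx_file, name):
    """Return the raw bytes of one embedded file."""
    with zipfile.ZipFile(xlsx_file) as zf:
        return zf.read(name)
```

The returned bytes can be wrapped in io.BytesIO and handed to an image library for display. Note that the archive order alone says nothing about which cell a picture belongs to.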

The issues I am having:

  1. Since openpyxl does not preserve images when reading and writing a workbook, I would lose the images when I append my “filtered data” sheet. Maybe there is a workaround, like saving to a separate sheet and using COM?
  2. Although I can load the images using the zip method, I lose the information about which cell each one is associated with. They are in a logical order inside the xlsx/zip file, but sometimes an image will be missing (say cell K11 does not have one), so I cannot just assume that image1.jpeg corresponds to cell G1 and so on. I am not sure where in the Excel file I can find the information associating images with their respective cells.
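A hedged note on issue 2: in the xlsx package, picture anchors are normally recorded in a drawing part (for example xl/drawings/drawing1.xml), where each anchor has a zero-based column and row in its xdr:from element, and the picture's r:embed id is resolved to a media file through the matching .rels part (xl/drawings/_rels/drawing1.xml.rels). The sketch below parses those two XML documents with the standard library only. The helper names image_cells and column_letters are hypothetical, and which drawing part belongs to which sheet is an assumption that has to be checked against the sheet's own .rels file.

```python
import xml.etree.ElementTree as ET

NS = {
    "xdr": "http://schemas.openxmlformats.org/drawingml/2006/spreadsheetDrawing",
    "a": "http://schemas.openxmlformats.org/drawingml/2006/main",
}
R_EMBED = "{http://schemas.openxmlformats.org/officeDocument/2006/relationships}embed"
REL_TAG = "{http://schemas.openxmlformats.org/package/2006/relationships}Relationship"


def column_letters(index):
    """Convert a zero-based column index to Excel letters (0 -> A, 26 -> AA)."""
    letters = ""
    index += 1
    while index:
        index, rem = divmod(index - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def image_cells(drawing_xml, rels_xml):
    """Map top-left cell address -> media target for each anchored picture.

    If two pictures share a top-left cell, the later one wins.
    """
    targets = {
        rel.get("Id"): rel.get("Target")
        for rel in ET.fromstring(rels_xml).iter(REL_TAG)
    }
    cells = {}
    root = ET.fromstring(drawing_xml)
    for tag in ("xdr:oneCellAnchor", "xdr:twoCellAnchor"):
        for anchor in root.findall(tag, NS):
            start = anchor.find("xdr:from", NS)
            blip = anchor.find(".//a:blip", NS)
            if start is None or blip is None:
                continue  # not a picture, or no cell anchor
            col = int(start.find("xdr:col", NS).text)
            row = int(start.find("xdr:row", NS).text)
            cells[f"{column_letters(col)}{row + 1}"] = targets.get(blip.get(R_EMBED))
    return cells
```

Both arguments can be the bytes returned by zipfile's read() for the two parts. The targets are relative paths such as ../media/image1.jpeg, so a cell with no picture simply has no entry in the result.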

Thank you in advance

How to & Answers:

As per “how to get the relative position of shapes within a worksheet”, in the Excel object model you get the cell under an image’s top-left corner from its .TopLeftCell property:

[Image: test pictures]

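import win32com.client

# Drive Excel through COM; requires Windows with Excel and pywin32 installed.
excel = win32com.client.Dispatch("Excel.Application")
wb = excel.Workbooks.Open("<path_to.xlsx>")
ws = wb.Sheets("Sheet1")
# Each picture is a Shape; TopLeftCell is the cell under its top-left corner.
for shape in ws.Shapes:
    print(shape.TopLeftCell.Address)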

prints:

$B$2
$B$5
$D$3
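To use these addresses for the filtering step, the loop can be turned into a lookup from cell to picture. The following is a minimal sketch, not a definitive implementation: shapes_by_cell is a hypothetical helper name, and it assumes the worksheet object exposes Shapes, Name and TopLeftCell.Address the way the COM object above does. Because it only relies on those attributes, it can be exercised without Excel by passing a stand-in object.

```python
def shapes_by_cell(worksheet):
    """Map each top-left cell address (e.g. "B2") to the names of the shapes there.

    Assumption: worksheet behaves like an Excel COM Worksheet, i.e. iterating
    worksheet.Shapes yields objects with .Name and .TopLeftCell.Address.
    """
    mapping = {}
    for shape in worksheet.Shapes:
        # Address is absolute by default ("$B$2"); drop the "$" signs.
        address = shape.TopLeftCell.Address.replace("$", "")
        mapping.setdefault(address, []).append(shape.Name)
    return mapping


# Hypothetical usage with the COM worksheet opened above:
#     cells = shapes_by_cell(ws)
#     "K11" in cells  # False when that cell has no picture
```

A list is kept per cell because nothing prevents two shapes from sharing the same top-left cell.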