Home » excel » java – Apache POI formula cell duplication very slow

java – Apache POI formula cell duplication very slow

Posted by: admin April 23, 2020 Leave a comment

Questions:

I’m generating an excel file using Apache POI 3.8 , and have the need to replicate some existing row n° times.

This because I have some complex formula which I use as a template to create new lines, replacing cell indexes with regexps.

The problem is that performance are awful, it takes 2h to generate some 4000 rows.

I’ve pinpointed the problem to be not in the regexp part, as I initially thought, but in the duplication of formula cells.

I actually use this to replicate formula cells:

case Cell.CELL_TYPE_FORMULA:
    newCell.setCellType(oldCell.getCellType());
    newCell.setCellFormula(oldCell.getCellFormula());
    break; 

If I copy the formula as text like this:

case Cell.CELL_TYPE_FORMULA:
    newCell.setCellType(Cell.CELL_TYPE_STRING);
    newCell.setCellValue("="+oldCell.getCellFormula());
    break;

it’s instead pretty fast, even with my regexp in place.

Anyway, this is an imperfect solution, because the formula has english keywords (ie IF()), when I need to write in italian format.

More, cells with formula inserted like that need to be forcefully re-evaluated in excel with something like “replace all -> ‘=’ with ‘='”.

The problem seems to rely in the setCellFormula(), because of the HSSFFormulaParser.parse().

What’s strange, is that parsing time seems to grow exponentially:

100 rows ->  6785ms
200 rows -> 23933ms
300 rows -> 51388ms
400 rows -> 88586ms

What it seems, is that each time I copy a formula, the POI library re-evaluates or re-parses or re-something all preceding rows.

Do anyone know how can solve this problem?
Thanks in advance.

How to&Answers:

Oh my…I think I found it…

Original was:

// If the row exist in destination, push down all rows by 1 else create a new row
if (newRow != null) {
    worksheet.shiftRows(destinationRowNum, worksheet.getLastRowNum(), 1);
} else {
    newRow = worksheet.createRow(destinationRowNum);
}

I’ve commented everything leaving only

newRow = worksheet.createRow(destinationRowNum);

And now I’m down to 60sec to process all rows!

Probably, there’s some dirt in my template which was causing POI to shift everything at each iteration.