optimization – Excel Formula Optimisation

Posted by: admin May 14, 2020

Questions:

I am no Excel expert, and after some research I have come up with this formula to compare two sets of the same data taken at different times. It displays the entries that are in the latest list of data but not in the old list.

This is my formula:

  {=IF(ROWS(L$4:L8)<=(SUMPRODUCT(--ISNA(MATCH($E$1:$E$2500,List1!$E$1:$E$2500,0)))),
    INDEX(E$1:E$2500,
    SMALL(IF(ISNA(MATCH($E$1:$E$2500&$F$1:$F$2500,List1!$E$1:$E$2500&List1!$F$1:$F$2500,0)),
    ROW($F$1:$F$2500)-ROW($F$1)+1),ROWS(L$4:L8))),"")}

Are there any optimisation techniques I could employ to speed up the calculation?

As requested, here is some example data (link to a spreadsheet):
https://docs.google.com/file/d/0B186C84TADzrMlpmelJoRHN2TVU/edit?usp=sharing

On this scaled-down version it's more efficient, but on my actual sheet with a lot more data it is slow.

Answers:

Well, I was playing around a bit and I think that this works the same, and without the first IF statement:

=IFERROR(INDEX(A$1:A$2500,SMALL(IF(ISNA(MATCH($A$1:$A$2500&$B$1:$B$2500,List1!$A$1:$A$2500&List1!$B$1:$B$2500,0)),ROW($B$1:$B$2500)-ROW($B$1)+1),ROWS(F$2:F2))),"")

That part in your sample data:

ROWS(F$2:F2)<=(SUMPRODUCT(--ISNA(MATCH($A$1:$A$2500,List1!$A$1:$A$2500,0))))

As I understand it, this only checks that the number of rows the formula has been filled down does not exceed the number of ‘new’ items. It doesn’t seem to serve any purpose, because when you drag the formula further than required you still get errors instead of the expected blank. So I thought it could be removed altogether (after first trying to substitute COUNTA() for it), with an IFERROR() wrapped around the part that fetches the details.

EDIT: Scratched that out. See barry houdini’s comment for the importance of those parts.

Next, you had this:

ROW($B$1:$B$2500)-ROW($B$1)+1

Because the range starts in row 1, -ROW($B$1)+1 always evaluates to 0, so I found no use for it and removed it altogether.
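
To see why the offset cancels here, and when it would not, it may help to look at what each piece evaluates to. The row-3 variant below is a hypothetical layout for illustration, not the one in the sample sheet:

ROW($B$1:$B$2500)                  returns {1;2;3;...;2500}
ROW($B$1:$B$2500)-ROW($B$1)+1      returns {1;2;3;...;2500} (the offset is -1+1 = 0)
ROW($B$3:$B$2500)                  returns {3;4;5;...;2500}
ROW($B$3:$B$2500)-ROW($B$3)+1      returns {1;2;3;...;2498} (the offset is -3+1 = -2)

INDEX() expects a position within its range rather than a sheet row number, so the offset only starts to matter once the data no longer begins in row 1.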

It’s still quite long and takes some time I guess, but I believe it should be faster than previously by a notch 🙂

Answer:

A relatively fast solution is to add a multi-cell array formula in a column alongside List 2:

{=MATCH($A$1:$A$16,List1!$A$1:$A$11,0)}

and filter the resultant output for #N/A.

(Or see Compare.Lists vs VLOOKUP for my commercial solution)

Answer:

Array formulas are slow. With thousands of them on a sheet, recalculation becomes very slow, so the key is to avoid array formulas altogether.

Here is how I would do it, using only simple formulas. It should be fast enough if you only have 2500 rows.

  • Columns F and H are “Keys”, created by concatenating your 2 columns (E and F in your original formula).
  • The first line of data is assumed to be on row 3.
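
One caveat with concatenated keys: plain joining can make different pairs collide, since "AB"&"C" and "A"&"BC" both give "ABC". A possible safeguard, assuming the two raw fields sit in hypothetical columns J and K and that "|" never occurs in the data, is to put a delimiter between them:

=J3&"|"&K3

The same delimiter would have to be used when building both the New and the Old keys.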

Data:

|   A   |      B      |    |  D |       E       |     F     |      |     H     |
| index | final value |    | ID | exist in Old? | Key (New) |      | Key (Old) |
--------------------------------------------------------------------------------
|   1   |    XXX-33   |    |  0 |      3        | OOD-06    |      | OOC-01    |
|   2   |    ZZZ-66   |    |  0 |      1        | OOC-01    |      | OOC-02    |
|   3   |    ZZZ-77   |    |  1 |     N/A       | XXX-33    |      | OOD-06    |
|   4   |             |    |  1 |      4        | OOE-01    |      | OOE-01    |
|   5   |             |    |  1 |      2        | OOC-02    |      | OOF-03    |
|   6   |             |    |  2 |     N/A       | ZZZ-66    |      |           |
|   7   |             |    |  3 |     N/A       | ZZZ-77    |      |           |

Column E “exist in Old?”: test if the new key (Column F) exists in the old list (Column H)

=MATCH(F3, $H$3:$H$2500, 0)

Column D “ID”: to increment by one whenever a new item is found

=IF(ISNA(E3), 1, 0)+IF(ISNUMBER(D2), D2, 0)  

The ISNUMBER test in the second part is only needed for the first data row, where D2 is a header rather than a number and adding it directly would cause an error.

Column A “index”: just a plain series starting from 1 (until the length of new list Column F)

Column B “final value”: to find the new key by matching column A to Column D.

=IF(A3>MAX($D$3:$D$2500), "", INDEX($F$3:$F$2500, MATCH(A3, $D$3:$D$2500, 0)))

This column B will be the list you want.

If it is still too slow, there are some dirty tricks to speed up the calculation, e.g. using a sorted list with MATCH( , , 1) instead of MATCH( , , 0).
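
As a rough sketch of that trick, assuming the old keys in column H are sorted in ascending order (the sample data above happens to be, but that is an assumption in general), column E could use two approximate matches in place of one exact match:

=IF(INDEX($H$3:$H$2500, MATCH(F3, $H$3:$H$2500, 1))=F3, MATCH(F3, $H$3:$H$2500, 1), NA())

MATCH( , , 1) does a binary search and returns the position of the largest value less than or equal to F3, so the INDEX comparison checks whether that value is really an exact hit. When it is not, NA() keeps the ISNA() test in column D working unchanged; a key smaller than every old key gives #N/A from MATCH itself.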