
excel – Performance impact of .End(xlDown) vs. .End(xlUp)

Posted by: admin April 23, 2020


Assume you have between 25,000 and 50,000 rows of data in each of columns 1 to 5,000; every column may have a different number of rows. All data is continuous, i.e. there are no empty rows within a column and no empty columns.

Consider the following code:

Dim i As Long
Dim Ws1 As Worksheet
Set Ws1 = ThisWorkbook.Worksheets(1)
Dim LastColumn As Long
Dim LastRow As Long

With Ws1
    LastColumn = .Cells(1, .Columns.Count).End(xlToLeft).Column
    For i = 1 To LastColumn
        LastRow = .Cells(.Rows.Count, i).End(xlUp).Row
        ThisWorkbook.Worksheets(2).Cells(i, 1).Value = "Column: " & i & " has " & LastRow & " rows."
    Next i
End With

In this code I’ve found the number of the last column and last row using .End(xlToLeft) and .End(xlUp), which look from the end of the sheet back toward the data.

My question is: do you incur a performance penalty for looking back from the end? And by extension, if your data is continuous, would it be better to use .End(xlToRight) and .End(xlDown), or is the difference negligible?
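For reference, the forward-looking variant of the same loop might look like the sketch below (assuming the same Ws1 worksheet as above). One caveat worth noting: .End(xlDown) stops at the cell above the first blank, so it only agrees with .End(xlUp) when the column really is continuous, and if a column contains only a single filled cell it jumps all the way to the bottom of the sheet:

Dim i As Long
Dim LastColumn As Long
Dim LastRow As Long

With Ws1
    ' xlToRight walks right from A1 to the last filled header cell
    LastColumn = .Cells(1, 1).End(xlToRight).Column
    For i = 1 To LastColumn
        ' xlDown walks down from row 1 to the cell above the first blank
        LastRow = .Cells(1, i).End(xlDown).Row
        ThisWorkbook.Worksheets(2).Cells(i, 1).Value = "Column: " & i & " has " & LastRow & " rows."
    Next i
End With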

Answer:

I decided to follow the excellent suggestion of @Davesexcel and do some timing experiments:

Sub time_test(m As Long, n As Long, k As Long)
    Dim i As Long, r As Long, sum As Long, lastrow As Long
    Dim start As Double, elapsed As Double

    Application.ScreenUpdating = False

    ' Fill m columns, each with a random number of rows (between n and k) of 1s
    lastrow = Range("A:A").Rows.Count
    Range(Cells(1, 1), Cells(lastrow, m)).ClearContents
    For i = 1 To m
        r = Application.WorksheetFunction.RandBetween(n, k)
        Cells(1, i).EntireColumn.ClearContents
        Range(Cells(1, i), Cells(r, i)).Value = 1
    Next i
    Debug.Print m & " columns initialized with " & n & " to " & k & " rows of data in each"
    Debug.Print "testing xlDown..."

    start = Timer
    For i = 1 To m
        ' every cell holds 1, so sum ends up counting the columns visited
        sum = sum + Cells(1, i).End(xlDown).Value
    Next i
    elapsed = Timer - start
    Debug.Print sum & " columns processed in " & elapsed & " seconds"

    sum = 0

    Debug.Print "testing xlUp..."
    start = Timer
    For i = 1 To m
        sum = sum + Cells(lastrow, i).End(xlUp).Value
    Next i
    elapsed = Timer - start
    Debug.Print sum & " columns processed in " & elapsed & " seconds"
    Application.ScreenUpdating = True

End Sub

Used like this:

Sub test()
    time_test 1000, 5000, 10000
End Sub

This produces the output:

1000 columns initialized with 5000 to 10000 rows of data in each
testing xlDown...
1000 columns processed in 0.1796875 seconds
testing xlUp...
1000 columns processed in 0.0625 seconds

which suggests that using xlUp is better.

On the other hand, if I run time_test 5000, 500, 1000

I get the output:

5000 columns initialized with 500 to 1000 rows of data in each
testing xlDown...
5000 columns processed in 0.08984375 seconds
testing xlUp...
5000 columns processed in 0.84375 seconds

in which the apparent advantage is flipped. I’ve experimented with a number of different choices for the parameters and am unable to get a clear signal.

If I try to run time_test 5000, 25000, 50000 (which is what you asked about), Excel crashes with an error message about lacking the resources to finish the task, before it even reaches the timing stage.

To sum up: any difference appears to be minor and may depend on the actual number of columns and rows involved. If you are in a situation where it might make a difference, you are probably running up against Excel's memory limits, in which case the difference between xlDown and xlUp is probably the least of your worries.
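One last reason xlUp is usually recommended regardless of speed is robustness: .End(xlDown) is derailed by any gap in the data, while .End(xlUp) from the bottom of the sheet is not. A minimal sketch (assuming a scratch worksheet you can overwrite):

With ThisWorkbook.Worksheets(1)
    .Range("A1:A5").Value = 1
    .Range("A3").ClearContents                        ' introduce a gap
    Debug.Print .Cells(1, 1).End(xlDown).Row          ' 2 - stops at the gap
    Debug.Print .Cells(.Rows.Count, 1).End(xlUp).Row  ' 5 - the true last row
End With

So even where xlDown times faster, it only returns the right answer when the continuity assumption in the question actually holds.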