I’m on Excel 2010, on an admittedly very large sheet (400k rows X 20 columns).
My code aims to:
- load the entire sheet into an array
- examine every row for a certain criteria
- rows which qualify are copied to another array
- finally return the second array back to another sheet
- the second array will end up being roughly 90% of the original
I wrote a definition of two variable arrays as variants
And tried to initialize them by copying the sheet’s content twice.
first copy works, but by the second one I hit an error of “Out of memory”.
Any ideas if there’s a workaround? or is this just a limitation of VBA/ Excel.
Is there a way to not pre-define / initialize the destination array, and instead, let it “grow” with every successful qualification of the criteria? (On a scale of this magnitude).
Sub CopyPending() Dim LastRow As Long Dim LastCol As Integer Dim AllRange() As Variant Dim CopyRange() As Variant Dim i As Long Dim x As Long Dim z As Long LastCol = 21 LastRow = ActiveSheet.UsedRange.Rows.Count AllRange = Range(Cells(2, 1), Cells(LastRow, LastCol)).Value CopyRange = Range(Cells(2, 1), Cells(LastRow, LastCol)).Value ''' ERROR TRIGGER i = 1 x = 1 z = 1 For i = LBound(AllRange) To UBound(AllRange) - 1 If AllRange(i, 7) = "TestCriteria" Then For z = 1 To LastCol CopyRange(x, z) = AllRange(i, z) Next z x = x + 1 End If Next i With Sheets(2) .Range(.Cells(2, 1), .Cells(x, LastCol)).Value = CopyRange End With End Sub
As comments on your post indicate, this error comes from shortage of working memory.
Each Variant type variable consumes 16 bytes, this is the reason your code require vast amount of memory. So one way to solve this problem is increase physical memory on your computer.
Other solution is filtering data by certain amount of rows.
Sub ProcessRows() Dim originalData() As Variant Dim maxRow as Long, currentRow as Long, incrementRow maxRow = ActiveSheet.Usedrange.Rows.Count currentRow =1 incrementRow=5000 While currentRow < maxRow Set originalData = Range(.Cells(currentRow,1),.Cells(currentRow+incrementRow-1,20) your process to filter data currentRow = currentRow +incrementRow Wend End Sub
Of course you can go with row by row approach, but I assume you use array variable to speed up your code, so I do not recommend to use row by row approach.
Working row by row is extremely slow so this is not a viable solution for such a large dataset.
Arrays are definately the way to go so the choice is between:
- Loading the data in batches then running your processing on a contiguous data set *(viable until large amounts of data – perhaps around 8M elements depending on your system)
- Loading the data in batches then running your processing on the batch only (viable for an arbitrary amount of data)
Edit: I see you are 400k * 20 which is pushing the boundaries of Option 1. You may have no choice but to refactor your code and load and process by batch (vs. load by batch then process together)
- This should be fine until very large datasets as the Out of Memory error is at first not from the size of the array itself but rather from reading from the worksheet.
- If you get an Out of Memory error from the size of the array itself, then:
- you will have no choice but to either use 64-bit Excel;
- Or (better) to refactor your procedure to process the data in chunks (Option 2 above).
The below loads the data in batches into a single array by recursively loading the data in batches. Try it – the benefits of still having one array at the end mean you don’t have to restructure the rest of your code.
Example of Option 1:
Option Explicit Sub example() Dim myCompletedataArr Dim myTestDataRange As Range Set myTestDataRange = ActiveSheet.UsedRange loadDataInBatches myTestDataRange, myCompletedataArr Debug.Assert False End Sub Sub loadDataInBatches(dataRange As Range, dataArr, Optional startRow As Long = 1, Optional rows As Long = 10000) Dim endRow As Long, i As Long, j As Long Dim dataArrLb1 As Long, dataArrLb2 As Long, batchArrLb1 As Long, batchArrLb2 As Long Dim batchArr, batchRange As Range If Not IsArray(dataArr) Then ReDim dataArr(0 To dataRange.rows.Count - 1, 0 To dataRange.Columns.Count - 1) End If 'otherwise assume dataArr is correctly dimensioned (for simplicity) endRow = WorksheetFunction.Min(startRow + rows - 1, dataRange.rows.Count) If endRow <= startRow Then Exit Sub Set batchRange = dataRange.rows(startRow & ":" & endRow) batchArr = batchRange.Value 'cache lower bounds as we use them a lot dataArrLb1 = LBound(dataArr, 1): dataArrLb2 = LBound(dataArr, 2) batchArrLb1 = LBound(batchArr, 1): batchArrLb2 = LBound(batchArr, 2) For i = batchArrLb1 To UBound(batchArr, 1) For j = batchArrLb2 To UBound(batchArr, 2) dataArr(startRow - 1 + i + dataArrLb1 - batchArrLb1, j + dataArrLb2 - batchArrLb2) = batchArr(i, j) Next j Next i Erase batchArr 'free up some memory before the recursive call loadDataInBatches dataRange, dataArr, endRow + 1, rows End Sub