excel – Most efficient way to change values conditionally in a range?

Posted by: admin May 14, 2020

Questions:

I want to run through a (large) range and replace certain values (those above a given max, below a given min, or equal to one particular character) with a given replacement value.

My first thought is to simply traverse each cell and check/replace when necessary. I have a feeling this procedure would be really slow though, and I’m curious if there’s a better way to accomplish this.

Any time I write code that does something similar to this in VBA, I watch each value being altered cell by cell, and it seems like there must be a better way. Thanks in advance.

edit:

I haven’t even written this implementation yet because I know what the result will be, and I would rather do something different if possible, but here’s what it would look like:

Dim cell As Range
'rng is the range to scan; condition and replacement_value are placeholders
For Each cell In rng.Cells
    If cell.Value = condition Then
        cell.Value = replacement_value
    End If
Next cell

Answers:

Make a formula in a separate column, and then copy/paste special, values only.

=IF(A2 > givenvalue; replace; IF(A2 < anothergivenvalue; anotherreplace; IF(A2 = "particularcharacterortext"; replaceonemore; A2)))

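Put the formula in an empty cell in an empty column, then drag it or copy/paste it down the entire column. (The semicolons are argument separators used under some regional settings; use commas if your Excel expects them.) After that, if the new values are OK, copy/paste values only back to the original position.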

Answer:

The following VBA code provides a simple framework that you can customize to meet your needs. It incorporates many of the optimizations that have been mentioned in the comments to your question, such as turning off screen updating and moving the comparison from the worksheet to an array.

You will notice that the macro does a rather large compare and replace. The data set I ran it on was 2.5 million random numbers between 1 and 1000 in the range A1:Y100000. If a number was greater than 250 and less than 500, I replaced it with 0. This required replacing 24.9 percent of all the numbers in the data set.

Sub ReplaceExample()

    Dim arr() As Variant
    Dim rng As Range
    Dim i As Long, _
        j As Long
    Dim floor As Long
    Dim ceiling As Long
    Dim replacement_value As Variant

    'assign the worksheet range to a variable
    Set rng = Worksheets("Sheet2").Range("A1:Y100000")
    floor = 250
    ceiling = 500
    replacement_value = 0

    'copy the values in the worksheet range to the array
    arr = rng.Value

    'turn off time-consuming external operations
    Application.ScreenUpdating = False
    Application.Calculation = xlCalculationManual
    Application.EnableEvents = False

    'loop through each element in the array
    For i = LBound(arr, 1) To UBound(arr, 1)
        For j = LBound(arr, 2) To UBound(arr, 2)
            'do the comparison of the value in an array element
            'with the criteria for replacing the value
            If arr(i, j) > floor And arr(i, j) < ceiling Then
                arr(i, j) = replacement_value
            End If
        Next j
    Next i

    'copy array back to worksheet range
    rng.Value = arr

    'turn screen updating, calculation, and events back on
    Application.ScreenUpdating = True
    Application.Calculation = xlCalculationAutomatic
    Application.EnableEvents = True

End Sub
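
The question asked about three conditions (above a max, below a min, or equal to one particular character), while the macro above tests a single numeric band. A minimal sketch of how the same array approach might cover all three is shown here. The procedure name, its parameters, and the choice to skip error and empty cells are assumptions for illustration, not part of the original answer. It also assumes a single contiguous range.

```vba
Option Explicit

' Hypothetical helper (not from the original answer): replaces values above
' maxValue, below minValue, or equal to badText with replacementValue.
' Assumes rng is one contiguous area.
Sub ReplaceOutOfRange(ByVal rng As Range, ByVal minValue As Double, _
                      ByVal maxValue As Double, ByVal badText As String, _
                      ByVal replacementValue As Variant)
    Dim arr As Variant
    Dim v As Variant
    Dim i As Long, j As Long

    ' A single-cell range returns a scalar rather than an array,
    ' so wrap it in a 1 x 1 array to keep the loop uniform
    If rng.Cells.CountLarge = 1 Then
        ReDim arr(1 To 1, 1 To 1)
        arr(1, 1) = rng.Value
    Else
        arr = rng.Value
    End If

    For i = LBound(arr, 1) To UBound(arr, 1)
        For j = LBound(arr, 2) To UBound(arr, 2)
            v = arr(i, j)
            If IsError(v) Or IsEmpty(v) Then
                ' leave error values and blank cells untouched (an assumption)
            ElseIf VarType(v) = vbString Then
                ' text cells: compare against the particular character or text
                If v = badText Then arr(i, j) = replacementValue
            ElseIf VarType(v) = vbDouble Or VarType(v) = vbCurrency Then
                ' numeric cells: compare against the limits
                If v > maxValue Or v < minValue Then arr(i, j) = replacementValue
            End If
        Next j
    Next i

    ' write everything back to the worksheet in one operation
    rng.Value = arr
End Sub
```

A call might look like ReplaceOutOfRange Worksheets("Sheet2").Range("A1:Y100000"), 250, 500, "x", 0, where the limits, the text "x", and the replacement 0 are placeholder values. Checking the value type first avoids a type-mismatch error when a text or error cell is compared with a number.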

I did some performance testing on different alternatives for coding this simple compare and replace, and I would expect the results to be consistent with VBA performance results reported by others. I ran each alternative 10 times, calculating the elapsed time for each run, and averaging the 10 elapsed times.
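
The answer does not say how the elapsed times were measured. One way such a test could be set up, as a rough sketch, is with VBA's built-in Timer function. The procedure name TimeReplaceExample is hypothetical, and the sketch assumes the ReplaceExample macro above is in the same project.

```vba
' Hypothetical timing harness (an assumption, not the author's actual test code)
Sub TimeReplaceExample()
    Const RUNS As Long = 10
    Dim run As Long
    Dim startTime As Double
    Dim total As Double

    For run = 1 To RUNS
        ' NOTE: restore the original test data here, otherwise every run
        ' after the first finds nothing left to replace
        startTime = Timer              ' seconds since midnight
        ReplaceExample
        total = total + (Timer - startTime)
    Next run

    ' print the average to the Immediate window
    Debug.Print "Average seconds per run: "; Format(total / RUNS, "0.00")
End Sub
```

Timer resets at midnight, so a run that spans midnight would give a negative elapsed time; for a quick comparison like this one that is usually acceptable.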

[Image: VBA performance results]

The results reveal the large impact that using arrays can have, especially when the data set is large. Compared to code that tested and changed worksheet cell values one by one, the array operation (copying the data set from the worksheet into an array, comparing and changing the array values, and then writing the array results back to the worksheet) reduced average run times in this case by 98 percent, from 3.6 minutes to 4 seconds.

While the optimizations that turned off external events made a noticeable difference in worksheet operations, with a 22 percent reduction in run times, those optimizations had very little impact when most of the computational work was array-based.
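
If you do switch these settings off for cell-by-cell worksheet code, it may be worth restoring whatever state the user had instead of forcing everything back to on and automatic. The following is a sketch of that pattern under stated assumptions: the procedure name is hypothetical, the middle section is a placeholder for your own code, and at least one workbook is assumed to be open.

```vba
' Hypothetical wrapper (not from the original answer): remembers the current
' application settings, switches them off, and restores them even on error
Sub RunWithFastSettings()
    Dim prevScreenUpdating As Boolean
    Dim prevCalculation As XlCalculation
    Dim prevEnableEvents As Boolean

    ' remember the settings in effect before the macro runs
    prevScreenUpdating = Application.ScreenUpdating
    prevCalculation = Application.Calculation
    prevEnableEvents = Application.EnableEvents

    On Error GoTo CleanUp
    Application.ScreenUpdating = False
    Application.Calculation = xlCalculationManual
    Application.EnableEvents = False

    ' ... worksheet work goes here (placeholder) ...

CleanUp:
    ' runs after normal completion and after an error
    Application.ScreenUpdating = prevScreenUpdating
    Application.Calculation = prevCalculation
    Application.EnableEvents = prevEnableEvents
    If Err.Number <> 0 Then
        Debug.Print "Error " & Err.Number & ": " & Err.Description
    End If
End Sub
```

This way a user who works in manual calculation mode is not silently switched to automatic, and an error partway through does not leave screen updating or events disabled.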