Home » excel » excel – Remove duplicates from array using VBA

excel – Remove duplicates from array using VBA

Posted by: admin March 9, 2020 Leave a comment

Questions:

Assume I have a block of data in Excel 2010, 100 rows by 3 columns.

Column C contains some duplicates, say it starts off as

1, 1, 1, 2, 3, 4, 5, ….. , 97, 98

Using VBA, I would like to remove the duplicate rows so I am left with 98 rows and 3 columns.

1, 2, 3, ….. , 97, 98

I know there is a button in Excel 2010 to do that but it inteferes with the rest of my code subsequently and gives incorrect results.

Furthermore, I would like to do it in arrays, then paste the results on the worksheet, rather than methods such as Application.Worksheetfunction.countif(.....

So something like:

Dim myarray() as Variant
myarray=cells(1,1).Currentregion.value

Dim a as Long

For a=1 to Ubound(myarray,1)

    'something here to 

Next a
How to&Answers:

I answered a similar question. Here is the code I used:

Dim dict As Object
Dim rowCount As Long
Dim strVal As String

Set dict = CreateObject("Scripting.Dictionary")

rowCount = Sheet1.Range("A1").CurrentRegion.Rows.Count

'you can change the loop condition to iterate through the array rows instead
Do While rowCount > 1
  strVal = Sheet1.Cells(rowCount, 1).Value2

  If dict.exists(strVal) Then
    Sheet1.Rows(rowCount).EntireRow.Delete
  Else
    'if doing this with an array, then add code in the Else block
    ' to assign values from this row to the array of unique values
    dict.Add strVal, 0
  End If

  rowCount = rowCount - 1
Loop

Set dict = Nothing

If you want to use an array, then loop through the elements with the same conditional (if/else) statements. If the item doesn’t exist in the dictionary, then you can add it to the dictionary and add the row values to another array.

Honestly, I think the most efficient way is to adapt code you’d get from the macro recorder. You can perform the above function in one line:

    Sheet1.UsedRange.RemoveDuplicates Columns:=3, Header:=xlYes

Answer:

Function eliminateDuplicate(poArr As Variant) As Variant
    Dim poArrNoDup()

    dupArrIndex = -1
    For i = LBound(poArr) To UBound(poArr)
        dupBool = False

        For j = LBound(poArr) To i
            If poArr(i) = poArr(j) And Not i = j Then
                dupBool = True
            End If
        Next j

        If dupBool = False Then
            dupArrIndex = dupArrIndex + 1
            ReDim Preserve poArrNoDup(dupArrIndex)
            poArrNoDup(dupArrIndex) = poArr(i)
        End If
    Next i

    eliminateDuplicate = poArrNoDup
End Function

Answer:

Dictionaries have a max of 255 items, so if you have more values you need to use a Collection. Unfortunately, the Collection object does not have a .Contains(a) or .Exists(a) method, but this function handles (fakes it) it nicely by using the Error numbers:

CORRECTION: Dictionaries do not have such a limit (thanks Zairja). I may have been using an Integer to iterate through my Dictionary. In any event, this function allows you to check Collections for item existence, so I’ll leave it here if it’s useful to anyone:

CollContainsItem(col As Collection, val As Variant) As Boolean

Dim itm As Variant
On Error Resume Next

    itm = col.Item(val)
    CollContainsItem = Not (Err.Number = 5 Or Err.Number = 9)

On Error GoTo 0

End Function

So if you do need a Collection, you could likely just replace

dict.Exists(strVal)

with

CollContainsItem(coll, strVal)

and replace

Set dict = CreateObject("Scripting.Dictionary")

with

Set coll = CreateObject("Scripting.Collection")

And use the rest of Zairja’s code. (I didn’t actually try it but it should be close)

Answer:

I know this is old, but here’s something I used to copy duplicate values to another range so that I could see them quickly to establish data integrity for a database I was standing up from various spreadsheets. To make the procedure delete the duplicates it would be as simple as replacing the dupRng lines with Cell.Delete Shift:=xlToLeft or something to that effect.

I haven’t tested that personally, but it should work.

Sub PartCompare()
    Dim partRng As Range, partArr() As Variant, i As Integer
    Dim Cell As Range, lrow As Integer

    lrow = ThisWorkbook.Worksheets("Sheet1").Cells(Rows.Count, 1).End(xlUp).Row
    i = 0

    Set partRng = ThisWorkbook.Worksheets("Sheet1").Range(Cells(1, 1), Cells(lrow, 1))

    For Each Cell In partRng.Cells
        ReDim Preserve partArr(i)
        partArr(i) = Cell.Value
        i = i + 1
    Next

    Dim dupRng As Range, j As Integer, x As Integer, c As Integer

    Set dupRng = ThisWorkbook.Worksheets("Sheet1").Range("D1")

    x = 0
    c = 1
    For Each Cell In partRng.Cells
        For j = c To UBound(partArr)
            If partArr(j) = Cell.Value Then
                dupRng.Offset(x, 0).Value = Cell.Value
                dupRng.Offset(x, 1).Value = Cell.Address()
                x = x + 1
                Exit For
            End If
        Next j
        c = c + 1
    Next Cell
End Sub

Answer:

Simple function to remove duplicates from a 1D array

Private Function DeDupeArray(vArray As Variant) As Variant
  Dim oDict As Object, i As Long
  Set oDict = CreateObject("Scripting.Dictionary")
  For i = LBound(vArray) To UBound(vArray)
    oDict(vArray(i)) = True
  Next
  DeDupeArray = oDict.keys()
End Function

Answer:

Answer from @RBILLC could be easily improved by adding an Exit For inside internal loop:

Function eliminateDuplicate(poArr As Variant) As Variant
    Dim poArrNoDup()

    dupArrIndex = -1
    For i = LBound(poArr) To UBound(poArr)
        dupBool = False

        For j = LBound(poArr) To i
            If poArr(i) = poArr(j) And Not i = j Then
                dupBool = True
                Exit For
            End If
        Next j

        If dupBool = False Then
            dupArrIndex = dupArrIndex + 1
            ReDim Preserve poArrNoDup(dupArrIndex)
            poArrNoDup(dupArrIndex) = poArr(i)
        End If
    Next i

    eliminateDuplicate = poArrNoDup
End Function