# Excel – Find which cells have the smallest Levenshtein distance

Posted by: admin, May 14, 2020

Questions:

I have this function, which quickly returns the Levenshtein distance between two strings:

```
Function Levenshtein(ByVal string1 As String, ByVal string2 As String) As Long

    Dim i As Long, j As Long
    Dim string1_length As Long
    Dim string2_length As Long
    Dim distance() As Long

    string1_length = Len(string1)
    string2_length = Len(string2)
    ' distance(i, j) = edits needed to turn the first i characters
    ' of string1 into the first j characters of string2
    ReDim distance(string1_length, string2_length)

    ' Turning a prefix into the empty string takes i deletions
    For i = 0 To string1_length
        distance(i, 0) = i
    Next

    ' Turning the empty string into a prefix takes j insertions
    For j = 0 To string2_length
        distance(0, j) = j
    Next

    For i = 1 To string1_length
        For j = 1 To string2_length
            If Asc(Mid$(string1, i, 1)) = Asc(Mid$(string2, j, 1)) Then
                ' Same character: no extra edit
                distance(i, j) = distance(i - 1, j - 1)
            Else
                ' Cheapest of deletion, insertion, substitution
                distance(i, j) = Application.WorksheetFunction.Min _
                    (distance(i - 1, j) + 1, _
                    distance(i, j - 1) + 1, _
                    distance(i - 1, j - 1) + 1)
            End If
        Next
    Next

    Levenshtein = distance(string1_length, string2_length)

End Function
```

I want to perform a fast comparison between all cells in the “A” column and return which ones have a “small” Levenshtein distance. How would I make all these comparisons?

Answers:

Do you want to find which combinations of strings have small Levenshtein distances, or just how similar or dissimilar each string is overall to all the other strings?

If it is the former, this should work fine: copy the strings and paste them transposed to create the column headers (as Dale commented), then compute the Levenshtein distance for each row/column pair in the grid. You can use conditional formatting to highlight the lowest results.

Or, if you want the matching strings themselves returned, you should be able to use this, which returns the pair whenever the distance is between 1 and 3:

```
=IF(AND(Levenshtein($A28,B$27)>0,Levenshtein($A28,B$27)<=3),$A28&"/"&B$27,"")
```

Just copy and paste unique values if you want the returned combinations in a single column.
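If you would rather skip the grid entirely, the same comparison can be sketched as a macro. This is a minimal sketch, not a tested solution: it assumes the `Levenshtein` function from the question is in the same VBA project, that the strings start in A1 on the active sheet with no error values, and that columns C and D are free for output. The name `ListClosePairs` and the threshold constant are hypothetical; the threshold of 3 simply mirrors the formula above.

```
' Sketch: list every pair of strings in column A whose Levenshtein
' distance is between 1 and maxDistance.
' Assumes the Levenshtein function from the question is available.
Sub ListClosePairs()
    Const maxDistance As Long = 3   ' assumed threshold, same as the formula
    Dim ws As Worksheet
    Dim values As Variant
    Dim lastRow As Long, i As Long, j As Long
    Dim d As Long, outRow As Long

    Set ws = ActiveSheet
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
    If lastRow < 2 Then Exit Sub    ' need at least two strings

    ' Read the column into an array once; cell-by-cell reads are slow
    values = ws.Range("A1:A" & lastRow).Value

    outRow = 1
    For i = 1 To lastRow - 1
        For j = i + 1 To lastRow    ' visit each unordered pair once
            d = Levenshtein(CStr(values(i, 1)), CStr(values(j, 1)))
            If d > 0 And d <= maxDistance Then
                ' Assumed output layout: pair in column C, distance in D
                ws.Cells(outRow, "C").Value = values(i, 1) & "/" & values(j, 1)
                ws.Cells(outRow, "D").Value = d
                outRow = outRow + 1
            End If
        Next j
    Next i
End Sub
```

Because the distance is symmetric, the inner loop starts at `i + 1`, which halves the number of calls compared with filling the full grid. The work still grows with the square of the number of rows, so expect it to be slow on very long columns.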

Good Luck.