Home » excel » arrays – Most efficient way to split a string up by a delimiter while ignoring certain instances of said delimiter using excel vba

arrays – Most efficient way to split a string up by a delimiter while ignoring certain instances of said delimiter using excel vba

Posted by: admin March 9, 2020 Leave a comment

Questions:

I have a very redundant piece of old code, which I created when I first discovered and decided to learn vba (and man have I come a long way). I was using this code to cycle through cells that contained multiple values delimited with commas. However, there are cases where I can’t simply use something such as the Split(string,",") function because some of the values have commas within that value (example value:[blah blah, so blah blah]). In instances where these brackets exist (they are around every value that has a comma within it) I devised the rather long-winded approach that is my old method to properly split the values up, dump them in an array and then proceed with my other tasks. But, now I’ve decided to revisit the code and fix the accuracy. Here is some background.

Sample Data that can be found in one cell:
Please Note: This is data that suppliers send us, we don’t have control over what they enter or how they enter it. This is a simple example to show the gist of how the data typically is provided in some cases

Available on 2 sides: Silkscreen,[full: color, covers entire face],Pad Print: One color,[heat transfer, may bleed]

Values are:

  • Available on 2 sides: Silkscreen
  • [full: color, covers entire face]
  • Pad Print: One color
  • [heat transfer, may bleed]

What I’m looking for:
I’m looking for a more efficient and simpler method to be able to split up values correctly (while keeping the brackets for the values that have them).

I believe I have managed to create a much more efficient and compact method to handle instances that don’t include brackets using the following code

New Code (Under Construction):
I’m having issues with not knowing how to efficiently and accurately split the cells with brackets

Sub Test()
    Dim rngXid As Range, RegularColons As New Collection, UpchargeColons As New Collection, additionals As Range, upcharges As Range, Colon, UpchargeColon
    Dim Values() As String, endRange As Long, xidMap As Object, xid As String, NumberofValues As Integer
    endRange = ActiveSheet.Range("A" & Rows.Count).End(xlUp).Row

    Set xidMap = getXidMap(ActiveSheet.Range("A2:A" & UsedRange.Rows.Count)) 'Map products for quicker navigation
    Set additionals = ActiveSheet.Range("AJ:AK"): Set upcharges = ActiveSheet.Range("CS:CT")
    Set RegularColons = FindAllMatches(additionals, ":") 'This returns all instances/cells that contain a colon in the specified columns
    If Not RegularColons Is Nothing Then
        For Each Colon In RegularColons
            xid = ActiveSheet.Range("A" & Colon.Row).Value
            If InStr(1, Colon.Value, "[") = 0 Then 'If no brackets then simply split
                Values = Split(Trim(Colon.Value), ",")
            Else
                'This is where I'm at a lose for a more effective method
                '-----------Populate Values array with Colon.Value while watching out for brackets--------
            End If
            Set rngXid = xidMap(xid).EntireRow.Columns(upcharges) 'set to this specific product
            For ColorLocation = LBound(Values) To UBound(Values) 'cycle through each value in Values array
                If Not InStr(1, Values(ColorLocation), ":") = 0 Then 'Only proceed if the value has a colon
                    Set UpchargeColons = FindAllMatches(rngXid, Values(ColorLocation)) 'Searching other columns for this value
                    If Not UpchargeColons Is Nothing Then
                        For Each UpchargeColon In UpchargeColons 'If found in other columns proceed to replace colon
                            UpchargeColon.Value = Replace(UpchargeColon.Value, ":", " ")
                            Log UpchargeColon.Range, "Removed Colon from Additional Color/Location Upcharge", "Corrected" 'This is a custom sub of mine to record the change
                        Next UpchargeColon
                    End If
                    Values(ColorLocation) = Replace(Values(ColorLocation), ":", " ")
                End If
            Next ColorLocation
            Log Colon.Range, "Removed Colon(s) from Additional Color/Location Value(s)", "Corrected"
        Next Colon
    End If
End Sub

I’ve been browsing possible ways to do this and the one that keeps sticking out is Regex, which admittedly I have absolutely no experience with although I have previously heard of it. So, I tried reading up on it a bit using sites like this and of course the msdn documentation. My observations/thoughts while trying to learn a little more about this method are:

  1. This is absolutely, terrifyingly complex and overwhelming. So much so I have to refrain from crawling into a corner, assuming fetal position, and bawling my eyes out.
  2. I can’t seem to find anything that would suggest how I could incorporate this function with my need to split the string appropriately, at least as far as I can see. But, it may just be that I get so overwhelmed by all the seemingly random symbol sequences

So, my question is:
What would be the most efficient way to accurately split up the values in cells that contain brackets?

How to&Answers:

Regular Expressions (aka “regex”) are indeed scary-looking, but they’re also a powerful tool, and VBA supports them, if you add a reference to the Microsoft VBScript Regular Expressions 5.5 library.

With it you can create a RegExp object, which gives you a MatchCollection, which is, well, a collection of Match objects.

Here’s how to use them:

Sub Test()
    Const value As String = _
    "Available on 2 sides: Silkscreen,[full: color, covers entire face],Pad Print: One color,[heat transfer, may bleed]"

    Const pattern As String = _
    "(\[[^\]]+\]|[^,]+)"

    Dim regex As New RegExp
    regex.Global = True
    regex.pattern = pattern

    Dim matches As MatchCollection
    Set matches = regex.Execute(value)

    Dim m As Match
    For Each m In matches
        Debug.Print Trim(m.value) 'value will preserve any leading/trailing spaces
    Next

End Sub

Notice the pattern is pretty much that in Laurel’s answer:

(\[[^\]]+\]|[^,]+)

By not specifying that you want to match a comma, you’re not matching it (whether it’s there or not) – hence, the above code outputs this:

Available on 2 sides: Silkscreen
[full: color, covers entire face]
Pad Print: One color
[heat transfer, may bleed]

You can easily iterate a MatchCollection to populate an array if you need one.

Answer:

There are other ways, but this regex seems to be pretty fast:

(\[[^\]]+\]|[^,]+),?

Explanation:

\[ and \] are escaped versions of [ and]

Essentially, it’s looking for a \[, gets all non brackets [^\]], then the \]. Otherwise |, it will get all the non commas [^,]. The surrounding () makes it a capture group. ,? means there may or may not be a comma.

Answer:

One way is to take the bracketed commas and replace them with Chr(184)‘s. These little guys look a lot like commas.

Once the bracketed commas have been replaced, you can use normal Split() Here is some code to do the replacement:

Sub parser()
    Dim s As String, s1 As String, s2 As String, pseudo As String
    Dim switch As Boolean, temp As String, CH As String

    pseudo = Chr(184)
    s1 = "["
    s2 = "]"
    s = [A1]
    switch = False
    temp = ""

    For i = 1 To Len(s)
        CH = Mid(s, i, 1)
        If CH = s1 Or CH = s2 Then switch = Not switch
        If switch Then CH = Replace(CH, ",", pseudo)
        temp = temp & CH
    Next i

    Range("A2").Value = temp
    MsgBox s & vbCrLf & temp
End Sub

enter image description here

Answer:

Function Splitter(s)
    Dim p As Long, b As Long, l As String
    Dim c As Long, s2 As String, arr, n

    If InStr(s, "[") = 0 Then
        arr = Split(s, ",")
    Else
        c = 0
        For p = 1 To Len(s)
            l = Mid(s, p, 1)
            If l = "," And c = 0 Then
                Mid(s, p, 1) = vbNull
            Else
                If l = "[" Then c = c + 1
                If l = "]" Then c = c - 1
            End If
        Next p
        arr = Split(s, vbNull)
    End If
    Splitter = arr
End Function