
excel – Extract and print desired values with regex leaving the rest without modification

Posted by: admin May 14, 2020

Questions:

I have an array, arr, with the contents shown in the code below:

Sub z()
Dim regex As Object, allMatches As Object, match As Object
Dim i As Long
Dim arr(1 To 16)
Dim str As String

arr(1) = "{ value:{90914793497} }"
arr(2) = "{ iPBAdd:{iPBV4Add:{192.168.1.15}} }"
arr(3) = "859272608"
arr(4) = "pocbh"
arr(5) = "0x00 01"
arr(6) = "{ iPAdd:{iPBAdd:{iPBV4Add:{192.168.33.1}}} }"
arr(7) = "TRUE"
arr(8) = "{ :{qRd:{-} dVGU:{1280} dVGD:{4224} cC:{rC} cT:{2018-09-21 14:05:30 -03:00 } uLI:{-} eInf:{qCI:{9} mRB:{-} rth:{5} aMBL:{-}}} }"
arr(9) = "2/21/2019 14:04"
arr(10) = "39"
arr(11) = "normalR"
arr(12) = "{ mSCause:{1} }"
arr(13) = "{ value:{677607185} }"
arr(14) = "GMT"
arr(15) = "-"
arr(16) = "{ :{GHH} }"

Set regex = CreateObject("vbscript.regexp")
With regex
    .Global = True
    .MultiLine = False
    .Pattern = "\b[\d\.\-: ]+\b"
End With

For i = 1 To 16
    Set allMatches = regex.Execute(arr(i))
    For Each match In allMatches
        If i = 8 Then
            str = str & "|" & match.Value
        Else
            str = match.Value
        End If
    Next
    If i = 8 Then
        Debug.Print Trim(Mid(str, 2, Application.Search(" -", str) - 2))
    Else
        Debug.Print Trim(str)
    End If
    str = ""
Next
End Sub

I want to extract all the values that are inside {}. Normally each array item has only one value inside {}, but arr(8) has several, and for that item I only want the values after dVGU and dVGD plus the date/time without the -03:00 offset.

My code almost works: it extracts the desired values, but I also want to print the items that have no match.

My current output is:

90914793497
192.168.1.15
859272608

01
192.168.33.1

1280|4224|2018-09-21 14:05:30
2019 14:04
39

1
677607185

and I would like the output to look like this:

90914793497
192.168.1.15
859272608
pocbh
0x00 01
192.168.33.1
TRUE
1280|4224|2018-09-21 14:05:30
2/21/2019 14:04
39
normalR
1
677607185
GMT
-
GHH

So the following values are missing from my output: pocbh, 0x00 01, TRUE, normalR, GMT, - and GHH.

How can I fix this?

Answers:

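You could simply check first whether the string contains a {. If it does not, set str to the array value as-is. For strings that do contain braces but produce no match with the numeric pattern (such as { :{GHH} }), fall back to a second pattern that also accepts letters.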

...
For i = 1 To 16
    If InStr(1, arr(i), "{") = 0 Then 
        str = arr(i)
    Else
        Set allMatches = regex.Execute(arr(i))
        For Each match In allMatches
            If i = 8 Then
                str = str & "|" & match.Value
            Else
                str = match.Value
            End If
        Next
        If allMatches.Count = 0 Then
            str = alternate_pattern(CStr(arr(i)))
        End If
    End If
    If i = 8 Then
        Debug.Print Trim(Mid(str, 2, Application.Search(" -", str) - 2))
    Else
        Debug.Print Trim(str)
    End If
    str = ""
Next
End Sub

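' Fallback for brace contents that contain letters (e.g. "{ :{GHH} }").
' \w matches letters, digits and underscore; the main pattern uses \d only.
Function alternate_pattern(str As String) As String
    Dim regex As Object, matches As Object
    Set regex = CreateObject("vbscript.regexp")
    With regex
        .Global = True
        .MultiLine = False
        .Pattern = "\b[\w\.\-: ]+\b"
    End With

    Set matches = regex.Execute(str)
    If matches.Count > 0 Then
        alternate_pattern = matches(0).Value   ' first match only
    Else
        alternate_pattern = str                ' nothing matched: return the input unchanged
    End If
End Function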

output:

90914793497
192.168.1.15
859272608
pocbh
0x00 01
192.168.33.1
TRUE
1280|4224|2018-09-21 14:05:30
2/21/2019 14:04
39
normalR
1
677607185
GMT
-
GHH

Note that there is most likely a better way to account for the final GHH. Your current regex pattern only matches digits (via \d) plus ., -, : and spaces, but in some cases the {} contain letters. That’s what alternate_pattern handles: it uses \w in place of \d.
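
To make the difference between the two character classes concrete, here is a small sketch. FirstMatch and CompareCharacterClasses are hypothetical names introduced only for this illustration; they are not part of the answer's code. It runs both patterns against the { :{GHH} } sample and prints the first match of each in brackets.

```vba
' Hypothetical helper (illustration only): returns the first match of
' pattern pat in string s, or an empty string when nothing matches.
Function FirstMatch(ByVal pat As String, ByVal s As String) As String
    Dim re As Object, ms As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Global = False
    re.Pattern = pat
    Set ms = re.Execute(s)
    If ms.Count > 0 Then FirstMatch = ms(0).Value
End Function

Sub CompareCharacterClasses()
    ' \d accepts digits only, so the letters GHH never match.
    Debug.Print "[" & FirstMatch("\b[\d\.\-: ]+\b", "{ :{GHH} }") & "]"   ' prints []
    ' \w accepts letters, digits and underscore, so GHH is found.
    Debug.Print "[" & FirstMatch("\b[\w\.\-: ]+\b", "{ :{GHH} }") & "]"   ' prints [GHH]
End Sub
```

Because \w is a superset of \d, the second pattern also matches everything the first one does, which is why it can serve as a fallback.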

I’m not that good with regex, but I’d think you could use alternation (|) or a group to handle both cases in a single pattern. The point is that this can surely be improved, but it does seem to work with the samples you have.
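
One possible way to fold both cases into a single pattern is to capture whatever sits inside the innermost {...} instead of listing the allowed characters. The sketch below is an assumption about how that could look, not the answer's method. ExtractBraced, its keys parameter and the TAIL constant are hypothetical names. The optional (?: [+-]\d{2}:\d{2})? part drops a trailing UTC offset such as -03:00.

```vba
' Hypothetical helper (not from the original answer).
' Returns s unchanged when it contains no "{"; otherwise returns the
' contents of the innermost {...} groups joined with "|".
' keys (optional) limits matches to named fields, e.g. "dVGU|dVGD|cT".
Function ExtractBraced(ByVal s As String, Optional ByVal keys As String = "") As String
    ' Captures the brace contents lazily, then skips an optional " -03:00"-style
    ' offset and any whitespace before the closing brace.
    Const TAIL As String = "\{([^{}]*?)(?: [+-]\d{2}:\d{2})?\s*\}"
    Dim re As Object, m As Object, result As String

    If InStr(1, s, "{") = 0 Then
        ExtractBraced = s
        Exit Function
    End If

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    If Len(keys) > 0 Then
        re.Pattern = "(?:" & keys & "):" & TAIL
    Else
        re.Pattern = TAIL
    End If

    For Each m In re.Execute(s)
        result = result & "|" & Trim(m.SubMatches(0))
    Next m

    ExtractBraced = Mid(result, 2)   ' drop the leading "|"
End Function
```

With this helper, the loop body could reduce to something like Debug.Print ExtractBraced(arr(i), IIf(i = 8, "dVGU|dVGD|cT", "")). For the sample data this should give 192.168.33.1 for arr(6), GHH for arr(16), and 1280|4224|2018-09-21 14:05:30 for arr(8), though it has only been reasoned through against the values shown in the question.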