Home » excel » excel – Finding first occurence of a term in VBA after another occurence of another term

excel – Finding first occurence of a term in VBA after another occurence of another term

Posted by: admin May 14, 2020 Leave a comment

Questions:

I am trying to extract information from an .xml document that has been copied in to MS Excel.

The structure is as follows (each line is in a different line in Excel):

<chapterid="chapter1">
    <value1>123</value1>
    <value2>456</value2>
</chapter>
<chapterid="chapter1">
    <value1>789</value1>
    <value2>012</value2>
</chapter>

I am trying, specifying value1 (or value2) and the chapterid, to find the first occurrence of value1 after <chapterid="chapter1">.

(It also must be able to run within different worksheets, hence the second variable in the below function. The active Worksheet in which I search is called mysheet, and I am trying to put the result in cells(2,2) in this sheet for now.)

Here is my code so far:

Sub extractreport()

Dim chapterid As String
Dim mysheet As String
Dim myvar as String

tradeid = "Chapter1"
mansheet = "Test2"
myvar = "value1"    

Worksheets(mysheet).Cells(2, 2).Value = FindContentAfter(chapterid, mysheet, myvar)

End Sub

My attempt at the Function is as follows:

Public Function FindContentAfter(chapterid As String, mysheet As String, myvar As String)

Dim foundcell As Range
Set wb = Workbooks("ThisWorkbook")
Dim rowfind As Range
Dim adjstring As String

adjstring = "<chapterid=" & Chr(34) & chapterid & Chr(34) & ">"
MsgBox adjstring 'searching for the beginning of the chapter, because the String "Chapter1" can occur in different `values` in other chapters.
Set rowfind = Worksheets(mysheet).Range("A:A").find(What:=adjstring, LookAt:=xlPart, MatchCase:=False, SearchFormat:=False) 'find row in which chapter begins

Set foundcell = Worksheets(mysheet).Range("A:A").find(What:=mystring, LookAt:=xlPart, MatchCase:=False, SearchFormat:=False, after:=Worksheets(mysheet).Range(rowfind)) 'find first myvar after the previously found row
If Not foundcell Is Nothing Then
    'MsgBox (mystring & " found in row: " & foundcell.Row)
Else
    'MsgBox (mystring & " not found")
End If
MsgBox "1    " & foundcell

FindContentAfter = Mid(foundcell.Value, InStr(foundcell.Value, ">") + 1, InStr(3, foundcell.Value, "<") - InStr(foundcell.Value, ">") - 1)

End Function

It works up to the defining and searching of adjstring. However, finding rowfind (as the beginning of the chapter) and giving this as a starting point to the search for myvar does not work.

The error is “Runtime Error 1004”. I tried various things, such as:

Set foundcell = Worksheets(mysheet).Range("A" & rowfind.Row() & ":A").find(What:=mystring, LookAt:=xlPart, MatchCase:=False, SearchFormat:=False)

, or Range(rowfind).Row(), but I cannot get over the issue.

How to&Answers:

If I understand your post correctly, you are trying to fetch the text between <value1>123</value1>, so in your case you are trying to get 123.

If that’s correct you will need to modify your formula to the formula below:

FindContentAfter = Mid(foundcell.Value, InStr(foundcell.Value, ">") + 1, InStrRev(foundcell.Value, "<") - InStr(foundcell.Value, ">") - 1).

Also, in your second Find, the parameter After is looking for a range, so in this case we use the rowfind Range object from our first Find.

Try modifying your code to the code below, explanation inside the code’s comments:

Option Explicit

Sub extractreport()

Dim chapterid As String, mysheet As String, myvar As String

chapterid = "Chapter1"
mysheet = "Test2"
myvar = "value1"

Worksheets(mysheet).Cells(2, 2).Value = FindContentAfter(chapterid, mysheet, myvar)

End Sub

'===================================================================================

Public Function FindContentAfter(chapterid As String, mysheet As String, myvar As String)

Dim foundcell As Range
Dim rowfind As Range
Dim adjstring As String

adjstring = "<chapterid=" & Chr(34) & chapterid & Chr(34) & ">"
MsgBox adjstring 'searching for the beginning of the chapter, because the String "Chapter1" can occur in different `values` in other chapters.

With Worksheets(mysheet)
    Set rowfind = .Range("A:A").Find(What:=adjstring, LookAt:=xlPart, MatchCase:=False, SearchFormat:=False) ' find the range where the chapter begins
    ' conmfirm Chapter was found
    If rowfind Is Nothing Then
        MsgBox "Unable to find " & adjstring & " in worksheet " & .Name
        Exit Function
    End If

    Set foundcell = .Range("A:A").Find(What:=myvar, After:=rowfind, LookAt:=xlPart, MatchCase:=False, SearchFormat:=False) 'find first myvar after the previously found row
    If Not foundcell Is Nothing Then
        MsgBox myvar & " found in row: " & foundcell.Row
    Else
        MsgBox myvar & " not found"
    End If
    MsgBox "1    " & foundcell
End With

FindContentAfter = Mid(foundcell.Value, InStr(foundcell.Value, ">") + 1, InStrRev(foundcell.Value, "<") - InStr(foundcell.Value, ">") - 1)

End Function

Answer:

Your main issue is with the (after) argument,

Change your code to the following :

Sub extractreport()

   Dim chapterid As String
   Dim mysheet As String
   Dim myvar As String

   chapterid = "Chapter1"
   mysheet = "Test2"
   myvar = "value1"

   Worksheets(mysheet).Cells(2, 2).Value = FindContentAfter(chapterid, mysheet, myvar)

End Sub


Public Function FindContentAfter(chapterid As String, mysheet As String, myvar As String)

   Dim wb As Workbook

   Dim foundcell As Range
   'Set wb = Workbooks("ThisWorkbook")
   Dim rowfind As Range
   Dim adjstring As String

   adjstring = "<chapterid=" & Chr(34) & chapterid & Chr(34) & ">"

   'MsgBox adjstring 'searching for the beginning of the chapter, because the String "Chapter1" can occur in different `values` in other chapters.
   Set rowfind = Worksheets(mysheet).Range("A:A").Find(What:=adjstring, Lookat:=xlPart, MatchCase:=False, SearchFormat:=False) 'find row in which chapter begins           

   Set foundcell = Worksheets(mysheet).Range("A:A").Find(What:=myvar, Lookat:=xlPart, MatchCase:=False, SearchFormat:=False, after:=rowfind)  'find first myvar after the previously found row

   If Not foundcell Is Nothing Then
       MsgBox (myvar & " found in row: " & foundcell.Row)
   Else
       MsgBox (myvar & " not found")
   End If
   MsgBox "1    " & foundcell

   'FindContentAfter = Mid(foundcell.Value, InStr(foundcell.Value, ">") + 1, InStr(3, foundcell.Value, "<") - InStr(foundcell.Value, ">") - 1)

End Function