Home » excel » excel – Why Object.Selectnodes(XPath) gets 1st node value if former node is empty

excel – Why Object.Selectnodes(XPath) gets 1st node value if former node is empty

Posted by: admin May 14, 2020 Leave a comment

Questions:

Why Object.SelectNodes(XPath) gets as 1st node value second node value if former node (real 1st node value) is empty.
Example below:

XML:

<?xml version="1.0" encoding="UTF-8"?>
<Document>
    <person>
    </person>
    <person>
           <name>Peter</name>
    </person>
</Document>

VBA code:

Dim j as Integer
Dim FileToOpen as Variant
FileToOpen = Application.GetOpenFilename(Filefilter:="XML Files (*.xml), *.xml", _
Title:="Choose XML document ", MultiSelect:=False)
Set XDoc = CreateObject("MSXML2.DOMDocument")
    XDoc.async = False: XDoc.validateOnParse = False
    XDoc.Load FileToOpen

    For j = 1 To 2
        Set tofields = XDoc.SelectNodes("//Document/person/name")
        If Not (tofields.Item(j)) Is Nothing Then
            Debug.Print tofields.Item(j).Text
        Else
            Debug.Print "Nothing"
        End If
    Next j

Result:

Peter
Nothing

Why is not “Nothing” on the first place in the result ? How to reach that ? If parent node does not include 1st child node, 1st iteration is omitted.
Thank you.

How to&Answers:

Enumeration in XMLDOM differs from XPath

XMLDOM syntax enumerates nodes as zero based items in a NodeList, i.e. starting from 0, whereas XPath expressions identifying subnodes start from 1 (e.g. calling a first name item "//name[1]"). This mistakes in considering 2 as the last item index instead of looping from .Item(0) to .Item(1) in your example code. You get the number of found items results via the node list’s .Length method (2 minus 1 results in 1 as the last index number, thus giving a name node series of 0 to 1).

Furthermore it’s recommended to reference MSXML2 version 6.0 (MSXML2.DOMDocument refers to the last stable version 3.0 used only for compatibility reasons).

Further hints assuming that you want loop through all persons in your XML document (the nodelist in the OP delivers one item only, as the name node exists only once):

  • The expressions xDoc.SelectNodes("//Document/person") or //person would search the defined node set at any hierarchy Level within a given node structure. So it’s less time consuming to use unambigously Set toFields = xDoc.DocumentElement.SelectNodes("person") in your case.
  • The following code example wouldn’t show the Nothing case, as the node list displays two name nodes only (For i = 0 To toFields.Length - 1). Just in order to check your original attempt you could enumerate up to three items by changing intently to For i = 0 To toFields.Length (i.e. 0 to 2).

Additional link

Analyze your XML structure via recursive calls; a working function can be found at Parse XML using XMLDOM.

Code example

    Dim xDoc As Object, toFields As Object
    Dim myName As String, i As Long
    Set xDoc = CreateObject("MSXML2.DOMDocument.6.0")   ' recommended version 6.0 (if late bound MSXML2)
    xDoc.async = False: xDoc.validateOnParse = False
    ' ...

    Set toFields = xDoc.DocumentElement.SelectNodes("person")
    For i = 0 To toFields.Length - 1
        If Not toFields.Item(i) Is Nothing Then
            If toFields.Item(i).HasChildNodes Then
               myName = toFields.Item(i).SelectSingleNode("name").Text
               Debug.Print i, IIf(Len(Trim(myName)) = 0, "**Empty name", myName)

            Else
               Debug.Print i, "**No name node**"
            End If
        Else
            Debug.Print i, "**Nothing**"            ' shouldn't be needed from 0 to .Length-1 items :-)
        End If
    Next i