I need to extract certain information from HTML using VBA.
This is the HTML from which I am trying to extract the location information alone.
<dl id="headline" class="demographic-info adr">
  <dt>Location</dt>
  <dd> <span class="locality"> Dallas/Fort Worth Area </span> </dd>
  <dt>Industry</dt>
  <dd class="industry"> Higher Education </dd>
In my Excel VBA, after opening the web page, I am using the following code to extract the information.
Dim openedpage As String
openedpage = iedoc1.getElementById("headline").innerText
However, this returns the following:
Location Dallas/Fort Worth Area Industry Higher Education
I just need to extract Dallas/Fort Worth Area as the output.
You're getting all the extra text because that is kinda what you asked for: the innerText of the parent element, which is everything inside of it.
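One way to narrow the lookup is to ask the "headline" element for its span children and read only the first one. This is a minimal sketch, not a definitive implementation. It assumes iedoc1 is the already-loaded document object from the question and that the locality span is the first span inside "headline", as in the HTML shown.

```vba
Dim openedpage As String

' Assumption: iedoc1 is the loaded HTML document from the question.
' getElementsByTagName returns a zero-based collection, so index 0
' is the first <span> inside the "headline" element.
' Trim strips any leading or trailing spaces around the text.
openedpage = Trim(iedoc1.getElementById("headline").getElementsByTagName("span")(0).innerText)

Debug.Print openedpage
```

If the page ever puts another span ahead of the locality one, the index would need to change, so it may be worth checking the span's className before trusting the result.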
The above code gets the content of the “headline” element, then finds all “span” tags inside of it. Looking at the list returned, it chooses the first instance and returns the innerText.
I always seem to get the index base wrong: the 1 in my example should have been a 0.