html – VBA Web Scraping using getElementsByClassName to get names and addresses

Posted by: admin May 14, 2020

Questions:

I’m trying to extract the clinic name and corresponding address for all the clinics from the following web page: https://medimap.ca/Location/Calgary,%20AB,%20Canada

I’m having trouble working out which element I should be drilling down into. All the clinic names share the class name “_1FLG5” and all the addresses share “_1-Gov”. However, when I run the code below, nothing happens: no errors, and nothing is written to the sheet.

I’m also unsure whether the index after .getElementsByClassName is correct. Because I want the inner text from the same row as the “_1FLG5” element, I used (0); because I want the text two rows below “_1-Gov”, I used (2).

Option Explicit

Sub GetClinicData()

    Dim objIE As InternetExplorer
    Dim clinicEle As Object
    Dim clinicAdd As Object

    Dim clinicName As String
    Dim address As String
    Dim y As Integer
    Dim x As Integer

    Set objIE = New InternetExplorer
    objIE.Visible = False

    objIE.navigate "https://medimap.ca/Location/Calgary,%20AB,%20Canada"
    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop

    y = 1

    For Each clinicEle In objIE.document.getElementsByClassName("_1FLG5")
        clinicName = clinicEle.getElementsByClassName("_1FLG5")(0).innerText
        Sheets("Sheet1").Range("A" & y).Value = clinicName
        y = y + 1
    Next

    x = 1

    For Each clinicAdd In objIE.document.getElementsByClassName("_1-Gov")
        clinicAdd = clinicAdd.getElementsByClassName("_1-Gov")(2).innerText
        Sheets("Sheet1").Range("B" & x).Value = clinicAdd
        x = x + 1
    Next


End Sub
Answers:

The content is loaded dynamically, so you need a wait condition to make sure it has loaded; otherwise your collections end up with a length of 0. I use querySelectorAll with the class names written as CSS selectors. It returns a nodeList, which you loop over with a For loop from 0 to .Length - 1. Ideally you should also add a timeout condition to the wait loop so that it cannot run forever if the elements never appear.

Option Explicit

'Late bound via CreateObject, so no reference is required.
'For early binding: VBE > Tools > References: Microsoft Internet Controls
Public Sub GetData()
    Dim ie As Object
    Set ie = CreateObject("InternetExplorer.Application")
    With ie
        .Visible = True
        .Navigate2 "https://medimap.ca/Location/Calgary,%20AB,%20Canada"

        While .Busy Or .readyState < 4: DoEvents: Wend

        Dim clinics As Object, addresses As Object, i As Long
        With .document

            'Poll until the dynamically loaded clinic elements appear
            Do
                DoEvents
                Set clinics = .querySelectorAll("._1FLG5")
                Set addresses = .querySelectorAll("._1-Gov")
            Loop While clinics.Length = 0

            For i = 0 To clinics.Length - 1
                With ThisWorkbook.Worksheets("Sheet1")
                    .Cells(i + 1, 1) = Trim$(clinics.item(i).innerText)
                    .Cells(i + 1, 2) = Trim$(addresses.item(i).innerText)
                End With
            Next
        End With
        .Quit
    End With
End Sub
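
The answer recommends a timeout on the wait loop, but the Do ... Loop above keeps polling until at least one clinic element appears. The following is a minimal sketch of one way to add that timeout, not the original author's code. The helper names TimedOut and WaitForNodes and the 10-second default are assumptions made for illustration; Timer, DoEvents and querySelectorAll are the only built-in calls used.

```vba
Option Explicit

' Hypothetical helper (not in the original answer): returns True once more
' than timeoutSeconds have elapsed since startTime, a value read from Timer.
Public Function TimedOut(ByVal startTime As Single, ByVal timeoutSeconds As Single) As Boolean
    Dim elapsed As Single
    elapsed = Timer - startTime
    ' Timer restarts at 0 at midnight, so correct a negative difference.
    If elapsed < 0 Then elapsed = elapsed + 86400!
    TimedOut = (elapsed > timeoutSeconds)
End Function

' Hypothetical helper: polls doc with the given CSS selector until at least
' one node matches or the timeout expires. The returned node list may be
' empty, so the caller should check .Length before using it.
Public Function WaitForNodes(ByVal doc As Object, ByVal selector As String, _
                             Optional ByVal timeoutSeconds As Single = 10) As Object
    Dim nodes As Object
    Dim startTime As Single

    startTime = Timer
    Do
        DoEvents   ' keep Excel responsive while waiting
        Set nodes = doc.querySelectorAll(selector)
    Loop While nodes.Length = 0 And Not TimedOut(startTime, timeoutSeconds)

    Set WaitForNodes = nodes
End Function
```

Under these assumptions, the polling loop in GetData could be replaced with Set clinics = WaitForNodes(ie.document, "._1FLG5") followed by Set addresses = ie.document.querySelectorAll("._1-Gov"), and the write-out skipped when clinics.Length = 0. If the two lists can differ in length, looping only up to the shorter one avoids reading an address that does not exist.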