excel – How to use a variable to represent a link?

Posted by: admin May 14, 2020

Question:

I recorded a macro and tried to adapt it using a For loop over the different links I want to scrape data from.

The problem is that VBA doesn't recognize my variable as a link. When I type a link directly into the code, it works, but I need data from 500 links, not just one.

Here is my code fragment:

Dim Link As String
Link = "https://coinmarketcap.com/currencies/bitcoin/historical-data/"
For i = 1 To 5
Link = Cells(i, 1)

     ActiveWorkbook.Queries.Add Name:="Table 0 (3)", Formula:= _
        "let" & Chr(13) & "" & Chr(10) & "    Quelle = Web.Page(Web.Contents(""https://coinmarketcap.com/currencies/ontology/historical-data/""))," & Chr(13) & "" & Chr(10) & "    Data0 = Quelle{0}[Data]," & Chr(13) & "" & Chr(10) & "    #""Geänderter Typ"" = Table.TransformColumnTypes(Data0,{{""Date"", type date}, {""Open*"", type number}, {""High"", type number}, {""Low"", type number}, {""Close**"", type number}, {""Volume"", type number}, {""Market Cap" & _
        """, type number}})" & Chr(13) & "" & Chr(10) & "in" & Chr(13) & "" & Chr(10) & "    #""Geänderter Typ"""
    With ActiveSheet.ListObjects.Add(SourceType:=0, Source:= _
        "OLEDB;Provider=Microsoft.Mashup.OleDb.1;Data Source=$Workbook$;Location=""Table 0 (3)"";Extended Properties=""""" _
        , Destination:=Range("$D$1")).QueryTable
        .CommandType = xlCmdSql
        .CommandText = Array("SELECT * FROM [Table 0 (3)]")
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = False
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .PreserveColumnInfo = True
        .ListObject.DisplayName = "Table_0__3"
        .Refresh BackgroundQuery:=False
    End With
Next

As soon as I replace the hard-coded link (""https://coinmarketcap.comblabla"") with the variable Link, I get an application-defined or object-defined error. When I dig deeper and click on the array, Excel tells me that the import "Link" is not connected to an export.
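The error message suggests that Link reached Power Query as a bare identifier instead of a quoted string, which is what happens when the variable name is typed inside the VBA string literal. Below is a minimal sketch of splicing the value in with &, assuming the URLs sit in A1:A5 of the active sheet as in the loop above. The helper name BuildWebQuery and the query names "Coin 1", "Coin 2", ... are hypothetical, and the type-conversion step of the recorded macro is left out for brevity.

```vba
Option Explicit

' Hypothetical helper: builds the M formula for one URL.
' Each doubled quote ("") becomes a single " in the M text,
' so the URL arrives in Power Query as a string literal.
Private Function BuildWebQuery(ByVal url As String) As String
    BuildWebQuery = _
        "let" & vbCrLf & _
        "    Quelle = Web.Page(Web.Contents(""" & url & """))," & vbCrLf & _
        "    Data0 = Quelle{0}[Data]" & vbCrLf & _
        "in" & vbCrLf & _
        "    Data0"
End Function

Public Sub AddQueriesFromCells()
    Dim i As Long
    Dim link As String

    For i = 1 To 5
        link = CStr(ActiveSheet.Cells(i, 1).Value)
        ' Query names must be unique within a workbook, so the loop
        ' index is appended (this naming scheme is an assumption).
        ActiveWorkbook.Queries.Add Name:="Coin " & i, Formula:=BuildWebQuery(link)
    Next i
End Sub
```

Each query would then be loaded with the ListObjects.Add call from the recorded macro, with Location, the SELECT statement and the Destination range adjusted per iteration so that they point at the matching query name and do not overlap.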

Answer:

You can get the main historic data table and the info above it with the code below. It is a little tricky and somewhat fragile, as a lot of it relies on the current page styling, which can change. The historic data part, which is an actual table, is more robust.

You can loop over new URLs picked from cells, for example, and simply put a Sheets.Add line at the start of each iteration so that you have a new ActiveSheet to write data to.
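The looping idea above might look like the following sketch. It assumes the URLs are listed in column A of the sheet that is active when the macro starts. ScrapeUrl is a hypothetical placeholder for whatever routine actually fetches the page; here it only writes the URL so that the loop can be run on its own.

```vba
Option Explicit

Public Sub LoopOverUrls()
    Dim src As Worksheet, ws As Worksheet
    Dim wb As Workbook
    Dim lastRow As Long, i As Long
    Dim url As String

    Set src = ActiveSheet              ' assumed: sheet holding the URLs in column A
    Set wb = src.Parent
    lastRow = src.Cells(src.Rows.Count, "A").End(xlUp).Row

    For i = 1 To lastRow
        url = Trim$(CStr(src.Cells(i, 1).Value))
        If Len(url) > 0 Then
            ' Add a fresh sheet at the end; it also becomes the ActiveSheet.
            Set ws = wb.Worksheets.Add(After:=wb.Worksheets(wb.Worksheets.Count))
            ScrapeUrl url, ws
        End If
    Next i
End Sub

' Hypothetical placeholder for the real scraping routine.
Private Sub ScrapeUrl(ByVal url As String, ByVal ws As Worksheet)
    ws.Range("A1").Value = url
End Sub
```

Keeping a reference to the source sheet before the first Sheets.Add matters, because after that call ActiveSheet no longer points at the list of URLs.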

The GetInfo code further below should be enough to get you started, depending on your requirements.


I get the top bit:

[Image: top bit]

using .Cells(1, 1) = IE.document.querySelector(".col-xs-6.col-sm-8.col-md-4.text-left").innerText. This is not very robust, because the document's styling could change. However, it is not an easy part of the page to access, and whichever method you choose, obtaining it is currently likely to be fragile. The leading "." in the selector denotes a class name: the .querySelector method of document applies the CSS selector .col-xs-6.col-sm-8.col-md-4.text-left and returns the first element that carries all four classes. That is the same as .getElementsByClassName("col-xs-6 col-sm-8 col-md-4 text-left")(0).


I get the middle bit:

[Image: middle bit]

with

Set aNodeList = IE.document.querySelectorAll("[class*='coin-summary'] div")

This uses the CSS selector [class*='coin-summary'] div, which matches the div elements inside any element whose class attribute contains the string 'coin-summary'.

That CSS selector matches several elements, so the .querySelectorAll method is used to return a NodeList, which is then traversed.

[Image: list returned by CSS selector]


I get the historic data at the end (which is an actual table) using the table tag:

Set hTable = .document.getElementsByTagName("table")(0)

I then traverse the rows, and cells within rows, of the table.


VBA:

Option Explicit
Public Sub GetInfo()
    Dim IE As Object
    Set IE = CreateObject("InternetExplorer.Application")
    Application.ScreenUpdating = False
    With IE
        .Visible = True
        .navigate "https://coinmarketcap.com/currencies/bitcoin/historical-data/"

        While .Busy Or .readyState < 4: DoEvents: Wend '<== Loop until loaded

        ' HTMLTable is early-bound: requires a reference to the Microsoft HTML Object Library
        Dim hTable As HTMLTable
        Set hTable = .document.getElementsByTagName("table")(0)

        Dim tSection As Object, tRow As Object, tCell As Object, tr As Object, td As Object, r As Long, c As Long, hBody As Object
        Dim headers(), headers2()
        headers = Array("Date", "Open*", "High", "Low", "Close**", "Volume", "Market Cap")
        headers2 = Array("Market Cap", "Volume (24h)", "Circulating Supply", "Max Supply")

        With ActiveSheet
            .Cells.ClearContents
            .Cells(1, 1) = IE.document.querySelector(".col-xs-6.col-sm-8.col-md-4.text-left").innerText
            Dim aNodeList As Object, i As Long, resumeRow As Long
            Set aNodeList = IE.document.querySelectorAll("[class*='coin-summary'] div")
            resumeRow = .Cells(.Rows.Count, "A").End(xlUp).Row + 2
            .Range("A" & resumeRow).Resize(1, UBound(headers2) + 1) = headers2

            For i = 0 To aNodeList.Length - 1
                .Cells(resumeRow + 1, i + 1) = aNodeList.item(i).innerText
            Next i

            r = .Cells(.Rows.Count, "A").End(xlUp).Row + 2

            .Cells(r, 1).Resize(1, UBound(headers) + 1) = headers
            Set hBody = hTable.getElementsByTagName("tbody")
            For Each tSection In hBody           'HTMLTableSection
                Set tRow = tSection.getElementsByTagName("tr") 'HTMLTableRow
                For Each tr In tRow
                    r = r + 1
                    Set tCell = tr.getElementsByTagName("td")
                    c = 1
                    For Each td In tCell         'DispHTMLElementCollection
                        .Cells(r, c).Value = td.innerText 'HTMLTableCell
                        c = c + 1
                    Next td

                Next tr
            Next tSection


        End With

        .Quit '<== Remember to quit application
        Application.ScreenUpdating = True
    End With
End Sub

Output in sheet (sample):

[Image: example output]


Some example data from page:

[Image: example data]

Answer:

This will get the data from that table.

Option Explicit
Sub Web_Table_Option_One()
    Dim xml    As Object
    Dim html   As Object
    Dim objTable As Object
    Dim result As String
    Dim lngTable As Long
    Dim lngRow As Long
    Dim lngCol As Long
    Dim ActRw As Long
    ' Request the page synchronously (third argument False = not asynchronous)
    Set xml = CreateObject("MSXML2.XMLHTTP.6.0")
    With xml
        .Open "GET", "https://coinmarketcap.com/currencies/bitcoin/historical-data/", False
        .send
    End With
    result = xml.responseText
    ' Parse the response text with a late-bound HTML document
    Set html = CreateObject("htmlfile")
    html.body.innerHTML = result
    ' Collect every <table> element on the page
    Set objTable = html.getElementsByTagName("Table")
    For lngTable = 0 To objTable.Length - 1
        For lngRow = 0 To objTable(lngTable).Rows.Length - 1
            For lngCol = 0 To objTable(lngTable).Rows(lngRow).Cells.Length - 1
                ThisWorkbook.Sheets("Sheet1").Cells(ActRw + lngRow + 1, lngCol + 1) = objTable(lngTable).Rows(lngRow).Cells(lngCol).innerText
            Next lngCol
        Next lngRow
        ActRw = ActRw + objTable(lngTable).Rows.Length + 1
    Next lngTable
End Sub

You can certainly loop through an array of URLs and process each one in turn. Where are these 500 URLs? If the pages are not structured like the one you provided, you may have your work cut out for you. Web sites normally differ a great deal from one another, and screen scraping is a highly customized process.
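As a rough sketch of that loop, the request above can be moved into a routine that takes the URL as a parameter. The names WriteTablesFromUrl and ScrapeUrlList are hypothetical, the two URLs are the ones that appear in the question, and the sketch assumes each page still serves its data as a plain HTML table in the response text.

```vba
Option Explicit

' Hypothetical helper: writes every <table> found at url to ws, starting at row 1.
Private Sub WriteTablesFromUrl(ByVal url As String, ByVal ws As Worksheet)
    Dim xml As Object, html As Object, tables As Object
    Dim t As Long, r As Long, c As Long, offsetRow As Long

    Set xml = CreateObject("MSXML2.XMLHTTP.6.0")
    xml.Open "GET", url, False
    xml.send
    If xml.Status <> 200 Then Exit Sub   ' skip URLs that do not load

    Set html = CreateObject("htmlfile")
    html.body.innerHTML = xml.responseText
    Set tables = html.getElementsByTagName("table")

    For t = 0 To tables.Length - 1
        For r = 0 To tables(t).Rows.Length - 1
            For c = 0 To tables(t).Rows(r).Cells.Length - 1
                ws.Cells(offsetRow + r + 1, c + 1).Value = _
                    tables(t).Rows(r).Cells(c).innerText
            Next c
        Next r
        ' Leave one blank row between consecutive tables
        offsetRow = offsetRow + tables(t).Rows.Length + 1
    Next t
End Sub

Public Sub ScrapeUrlList()
    Dim urls As Variant, i As Long, ws As Worksheet

    urls = Array( _
        "https://coinmarketcap.com/currencies/bitcoin/historical-data/", _
        "https://coinmarketcap.com/currencies/ontology/historical-data/")

    For i = LBound(urls) To UBound(urls)
        Set ws = ThisWorkbook.Worksheets.Add   ' one new sheet per URL
        WriteTablesFromUrl CStr(urls(i)), ws
    Next i
End Sub
```

For 500 links, the array could be filled from a worksheet range instead of being typed into the code.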