Home ยป excel ยป excel – How to get the name and link of first five news articles using VBA

excel – How to get the name and link of first five news articles using VBA

Posted by: admin May 14, 2020 Leave a comment


I’m sorry if this is a tough one but I’m stuck and could really use the help ๐Ÿ™‚

I want to be able to get the first five articles (article name will be hyperlinked to the article) and place them under their respective cells.

Here is my thought process on how to implement this:

1. I have a bunch of items in a row (ex. chicken, fish, cow)

2. The algorithm goes to Google

3. The algorithm searches based on the cell value (first iteration will be ‘chicken’)

4. The algorithm clicks on “News”

5. The algorithm clicks on “Tools” and then clicks “Past week”

6. The algorithm extracts the first five articles under the cell (for example, if chicken was in A1, the five articles will be in A2-A6). The cell will have the article name as a value, with a hyperlink to the actual article.

I don’t want VBA to actually open up a browser (I’ve seen other answers that implemented XMLHTTP to do this?)



    Dim url As String, lColumn As Integer, i As Long, v As Long
    Dim XMLHTTP As Object, html As Object, objResultDiv As Object, objH3 As Object, link As Object

For i = 1 To lastRow

'this is to get last column
lColumn = ws.Cells(i, Columns.Count).End(xlToLeft).Column

'searches google based on row'
url = "https://www.google.com/search?q=" & Cells(1, i)

'I don't know much about using XMLHTTP for vba online interaction but I found this online
Set XMLHTTP = CreateObject("MSXML2.serverXMLHTTP")
        XMLHTTP.Open "GET", url, False
        XMLHTTP.setRequestHeader "Content-Type", "text/xml"
        XMLHTTP.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0"

Set objCollection = IE.Document.getElementsByTagName("input")
v = 0
If objCollection(v).Name = "Tools" Then objectelement.Click
If objCollection(v).Name = "Last Week" Then objectelement.Click
Next i
End Sub

Thank you!

How to&Answers:

Something like the following should work for you

    Dim ws As Worksheet
    Dim LastColumn As Long, j As Long, noNewsItems As Long
    Dim query As String, niDateStr As String
    Dim xhr As MSXML2.XMLHTTP60
    Dim gXML As MSXML2.DOMDocument60
    Dim newsItems As IXMLDOMNodeList
    Dim nI As IXMLDOMElement
    Dim StartOfWeek As Date, EndOfWeek As Date, niDate As Date

    StartOfWeek = DateAdd("ww", -1, Date - (Weekday(Date, vbMonday) - 1))
    EndOfWeek = DateAdd("d", 6, StartOfWeek)

    Set xhr = New MSXML2.XMLHTTP60

    Set ws = ActiveSheet
    With ws
        LastColumn = .Rows(1).End(xlToLeft).Column
    End With

    For j = 1 To LastColumn
        query = "https://news.google.com/rss/search?q=" & ws.Cells(1, j).Value2

        With xhr
            .Open "GET", query, False
            Set gXML = .responseXML
            Set newsItems = gXML.SelectNodes(".//item")
            Debug.Print "Number of scraped items:", newsItems.Length
            noNewsItems = 0
            For Each nI In newsItems
                niDateStr = nI.ChildNodes(3).nodeTypedValue
                niDateStr = Mid(niDateStr, InStr(niDateStr, " ") + 1, InStrRev(niDateStr, " ") - 5)
                niDate = DateValue(niDateStr)
                If niDate >= StartOfWeek And niDate <= EndOfWeek Then
                    noNewsItems = noNewsItems + 1
                    Debug.Print nI.ChildNodes(0).nodeTypedValue, nI.ChildNodes(1).nodeTypedValue, nI.ChildNodes(3).nodeTypedValue

                    ws.Hyperlinks.Add anchor:=ws.Cells(1, j).Offset(noNewsItems, 0), Address:=nI.ChildNodes(1).nodeTypedValue, TextToDisplay:=nI.ChildNodes(0).nodeTypedValue
                End If
                If noNewsItems = 5 Then Exit For
            Next nI
        End With
    Next j
End Sub