Home » excel » excel – Web pages scraping with variable date vba

excel – Web pages scraping with variable date vba

Posted by: admin May 14, 2020 Leave a comment

Questions:

Looking for a solution to scrape/import web pages from an internal network and paste into excel.

Objective
Tuesday to Friday – To visit web page and import data for the current and previous day. On a Monday it needs to import data for the Current day and the previous 3 days (Sunday, Saturday and Friday).

I recorded the macro yesterday of copying the excel cell with the web address in it, pasting into the address field in New Web Query and going through the import process and repeating for the previous day.

This delivered the expected result but when I run the macro again this morning it returned the data for Yesterday and the previous day because the web address is hard coded.

I concatenate the start of the web address with the date elements and the address to get to webpage for today is in cell K2, previous day K3, -2 days K4 and -3 days K5.

The constant part of the web page address starts http:/…..prd03! followed by the variable yyyy!mm!dd

e.g http:/…..prd03!2018!07!12 for today
e.g http:/…..prd03!2018!07!11 for yesterday

tomorrow http:/…..prd03!2018!07!12 will be yesterday

Below is the code generated from the macro recording
End With

Application.CutCopyMode = False
Range("K2").Select
ActiveCell.FormulaR1C1 = _
    "http:....prd03!2018!07!11" 'can't show full address
Range("G9").Select
ActiveWorkbook.Worksheets.Add
ActiveSheet.Name = "Today"
With ActiveSheet.QueryTables.Add(Connection:= _
    "URL;http:....prd03!2018!07!11", _ ' the URL is hard coded
    Destination:=Range("$A$1"))
    .Name = "...prd03!2018!07!11" 'can't show full name
    .FieldNames = True
    .RowNumbers = False
    .FillAdjacentFormulas = False
    .PreserveFormatting = True
    .RefreshOnFileOpen = False
    .BackgroundQuery = True
    .RefreshStyle = xlInsertDeleteCells
    .SavePassword = False
    .SaveData = True
    .AdjustColumnWidth = True
    .RefreshPeriod = 0
    .WebSelectionType = xlEntirePage
    .WebFormatting = xlWebFormattingNone
    .WebPreFormattedTextToColumns = True
    .WebConsecutiveDelimitersAsOne = True
    .WebSingleBlockTextImport = False
    .WebDisableDateRecognition = False
    .WebDisableRedirections = False
    .Refresh BackgroundQuery:=False
End With

I’ve opened the webpage html source code and today’s date is shown as below

A TITLE=”Main page name” HREF=”/……!prd03!2018!07!12″>12

Any help you can provide would be appreciated. If further info is required please let me know.

How to&Answers:

In VBA you can code your URLs to include dates:

Dim fmtToday As String
Dim fmtYesterday As String
Dim fmtTwoDays As String
Dim fmtThreeDays As String
Dim BaseURL As String

BaseURL = "....prd03!" ' the first part of your url, change this to reflect your actual URL excluding http://

fmtToday = BaseURL & Format(Now, "yyyy!mm!dd") ' combine the BaseURL with the formated date
fmtYesterday = BaseURL & Format(Now - 1, "yyyy!mm!dd")  'combine the BaseURL with the formated date minus 1 day
fmtTwoDays = BaseURL & Format(Now - 2, "yyyy!mm!dd")  ' combine the BaseURL with the formated date minus 2 days
fmtThreeDays = BaseURL & Format(Now - 3, "yyyy!mm!dd")  ' combine the BaseURL with the formated date minus 3 days

Then you can reference them in your code:

Application.CutCopyMode = False

Range("K2").Value = "http://" & fmtToday

ActiveWorkbook.Worksheets.Add
ActiveSheet.Name = "Today"

With ActiveSheet.QueryTables.Add(Connection:= _
    "URL;http://" & fmtToday, _
    Destination:=Range("$A$1"))
    .Name = fmtToday
    .FieldNames = True
    .RowNumbers = False
    .FillAdjacentFormulas = False
    .PreserveFormatting = True
    .RefreshOnFileOpen = False
    .BackgroundQuery = True
    .RefreshStyle = xlInsertDeleteCells
    .SavePassword = False
    .SaveData = True
    .AdjustColumnWidth = True
    .RefreshPeriod = 0
    .WebSelectionType = xlEntirePage
    .WebFormatting = xlWebFormattingNone
    .WebPreFormattedTextToColumns = True
    .WebConsecutiveDelimitersAsOne = True
    .WebSingleBlockTextImport = False
    .WebDisableDateRecognition = False
    .WebDisableRedirections = False
    .Refresh BackgroundQuery:=False
End With

I have adjusted your code to work with fmtToday, to use it with previous days you will need to adjust your code accordingly.