I’m trying to get the scores from the soccer world cups since 1930, in this link are the match ups listed and by clicking in the team a new window pop up with the summary I’m looking for. My problem is that in the HTML source code I don’t see a
<a> tag with the href inside of it as usually is, i’m trying to simulate a click event with the following VBA code:
Sub Test() Dim URL As String Dim IE As InternetExplorer Dim HTMLdoc As HTMLDocument Dim TDelements As IHTMLElementCollection Dim TDelement As HTMLTableCell Dim i As Integer URL = "http://www.flashscore.com/soccer/world/world-cup-1990/results/" Set IE = New InternetExplorer With IE .Navigate URL .Visible = True While .Busy Or .ReadyState <> READYSTATE_COMPLETE: DoEvents: Wend Set HTMLdoc = .Document End With Set TDelements = HTMLdoc.getElementsByTagName("td") For Each TDelement In TDelements If TDelement.Title = "Click for match detail!" Then TDelement.Click End If Next IE.Quit End Sub
I’d do this one differently, as what you’re trying to achieve requires a massive overhead and might take you on a wild goose chase of links, actions, loops, and whatnot. Instead of emulating the actions, it’s much better emulating the results of the actions.
Take for example the Germany – Argentina match. When you click on it, it opens this pop-up:
Of particular note here is the URL of the pop-up. Intuitively enough, the URL has some weird combination that might possibly be a unique identifier. The first thing we check is the tag for that row in the table we just clicked.
Check the highlighted part: it resembles the address in our pop-up closely, albeit there are four (4) extra characters at the beginning. Collapsing the row and cross-checking against other rows, we come to the conclusion that the rows use a similar format:
Removing the g_1_ from the
id attribute of the
tr tag gives us the unique identifier for every match. I’ll be taking the sanitized
id from the Italy – England match and I will input it into the generic URL. We’ll do this in another tab as the pop-up’s address bar is locked:
Voila. The page is open in another tab, and from there you can get what you want from the match summary. You can even check the head-to-head by changing
Why does this matter?
Because it’s safer. Since you’re using Excel, you have the sheets at your control. The safest way is to record all these
id attributes from each row into, say, a column. Then loop over that column, concatenating it properly to produce the target URLs.
Then you loop over the URLs and get what you want from the EXACT pages you’re looking for. There’s no need anymore for the original link/table at this point since almost all the details are in the “summary” pages anyway.
Hope this helps and good luck.
As I feel a bit guilty not posting any code, here’s my attempt at producing the
id values for the World Cup section. Note that I took some liberties with the code, so modify it to suit your approach better. I will only be providing the way to get the
id values. All else is up to you.
Sub Test() Dim URL As String Dim IE As New InternetExplorer Dim HTMLdoc As HTMLDocument Dim dictObj As Object: Set dictObj = CreateObject("Scripting.Dictionary") Dim tRowID As String URL = "http://www.flashscore.com/soccer/world/world-cup-1990/results/" With IE .Navigate URL .Visible = True Do Until .ReadyState = READYSTATE_COMPLETE: DoEvents: Loop Set HTMLdoc = .Document End With With HTMLdoc 'Target the set of tables. Set tblSet = .getElementById("fs-results") 'Index starts at 0. World Cup Playoffs is (0), World Cup is (1), 'Qualifiers is (2), etc... Targeting tbody right away, as there is 'only one (1) tbody tag per row anyway. Set mTbl = tblSet.getElementsByTagName("tbody")(1) Set tRows = mTbl.getElementsByTagName("tr") With dictObj 'If if value is not yet in dictionary, store it. For Each tRow In tRows 'Remove the first four (4) characters. tRowID = Mid(tRow.getAttribute("id"), 5) If Not .Exists(tRowID) Then .Add tRowID, Empty End If Next tRow End With End With 'Print out each of the ID values. For Each Key In dictObj Debug.Print Key Next Key Set IE = Nothing End Sub
Which returns the following
id values for the World Cup table (not the Playoffs!):
04LfR6G7 6RNojjg8 nww8zBgL t0xCZU8R dKY0Pgu3 jsY4ODf9 K4YcDdti 2qM7lD3j bXvdGk2k tdU0F9He 65hR8YfM lUsW7hAS jBqaxXO8 YZY3yivF n1HjSn11 hvInTSoe rHUdQZPd xfS51jmd ncyhHVnq 4b34kglp O0kz7LJH lpQE9EuG UHlQLBvS fNQAAfQ9 CKpewD92 EmJrU8Wl KnT4ETW1 nRnnVlHr b3T12WXk jwUc3CIq n9RDMiPL pAjW714B GznS8sk5 GWR6BzB3 UXW8NX9F beX1CGec
Hope this helps you some more. Kindly make sure you read the comments. Good luck!