VBA Webscrape not picking up elmenents; pick up frames/tables?


VBA Webscrape not picking up elmenents; pick up frames/tables?



Tried asking this question. Didn't get many answers. Can't install things onto my work computer. https://stackoverflow.com/questions/29805065/vba-webscrape-not-picking-up-elements



Want to scrape a morningstar page into Excel with the code below. Problem is, it doesn't feed any real elements/data back. I actually just want the Dividend and cap gain distribution table really from that link I put into my_Page.



This is usually easiest way, but an entire page scrape way, AND Excel-->Data-->From Web DON'T work.



I've tried to use get elements by tag name and class before, but I failed at being able to do it in this case.This might be the way to go... Once again, just want that Dividend and Cap Gain distribution table. Not seeing any results in via the Debug.print



Working code below, just need to parse into excel. Updated attempt below:


Sub Macro1()


Dim IE As New InternetExplorer
IE.Visible = True
IE.navigate "http://quotes.morningstar.com/fund/fundquote/f?&t=ANNPX&culture=en_us&platform=RET&viewId1=2046632524&viewId2=3141452350&viewId3=3475652630"
Do
DoEvents
Loop Until IE.readyState = READYSTATE_COMPLETE
Dim doc As HTMLDocument
Set doc = IE.document



'For Each Table In doc.getElementsByClassName("gr_table_b1")

'For Each td In Table.getElementsByTagName("tr")
On Error Resume Next
For Each td In doc.getElementsByClassName("gr_table_row4")
Debug.Print td.Cells(5).innerText
'Debug.Print td.Cells(1).innerText
Next td
'Next Table


'IE.Quit
'Application.EnableEvents = True



End Sub




2 Answers
2



The content in question is contained within an iframe. You can see this by right clicking on that section of the sebsite, and selecting Inspect element. Looking up the tree, you'll see an iframe tag, containing the url of data. You should try to find that element, and extract that url (which is generated with js) and then open that page.


Inspect element





My current environment has an old version of IE, which does not render the page properly, so I cannot build something to actually do this.
– Degustaf
May 29 '15 at 15:51





iframe is this quotes.morningstar.com/fund/fundquote/…. My debug yields something now (It just says "YTD"). I think if I can get those tags correct I'll be in business now.
– pjhollow
May 29 '15 at 17:11





@pjhollow Just so you know. I got a different URL on my end. The viewIDs seam to be generated by the js. I'm not sure what will happen if you hard code those values.
– Degustaf
May 29 '15 at 17:29





Thanks for taking the time to look into this. Mind pasting what URL you are seeing?
– pjhollow
May 29 '15 at 17:30





@pjhollow quotes.morningstar.com/fund/fundquote/…
– Degustaf
May 29 '15 at 17:32




No frame to worry about. You only need the table id.



Webpage view:



Web view



Print out from code:



Code print out



VBA:


Option Explicit
Public Sub GetDivAndCapTable()
Dim ie As New InternetExplorer, hTable As HTMLTable
Const URL = "http://quotes.morningstar.com/fund/fundquote/f?&t=ANNPX&culture=en_us&platform=RET&viewId1=2046632524&viewId2=3141452350&viewId3=3475652630"
Application.ScreenUpdating = False
With ie
.Visible = True

.navigate URL

While .Busy Or .READYSTATE < 4: DoEvents: Wend

Set hTable = .document.getElementById("DividendAndCaptical")
WriteTable hTable, 1
Application.ScreenUpdating = True
.Quit
End With
End Sub

Public Sub WriteTable(ByRef hTable As HTMLTable, Optional ByVal startRow As Long = 1, Optional ByRef ws As Worksheet)

If ws Is Nothing Then Set ws = ActiveSheet

Dim tSection As Object, tRow As Object, tCell As Object, tr As Object, td As Object, R As Long, C As Long, tBody As Object
R = startRow
With ActiveSheet
Dim headers As Object, header As Object, columnCounter As Long
Set headers = hTable.getElementsByTagName("th")
For Each header In headers
columnCounter = columnCounter + 1
.Cells(startRow, columnCounter) = header.innerText
Next header
startRow = startRow + 1
Set tBody = hTable.getElementsByTagName("tbody")
For Each tSection In tBody 'HTMLTableSection
Set tRow = tSection.getElementsByTagName("tr") 'HTMLTableRow
For Each tr In tRow
Set tCell = tr.getElementsByTagName("td")
C = 1
For Each td In tCell 'DispHTMLElementCollection
.Cells(R, C).Value = td.innerText 'HTMLTableCell
C = C + 1
Next td
R = R + 1
Next tr
Next tSection
End With
End Sub






By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.

Comments

Popular posts from this blog

paramiko-expect timeout is happening after executing the command

how to run turtle graphics in Colaboratory

Export result set on Dbeaver to CSV