7 Replies - 1495 Views - Last Post: 04 October 2012 - 10:43 AM

#1 lar3ry

  • Coding Geezer

Reputation: 310
  • Posts: 1,290
  • Joined: 12-September 12

Having problems with InnerText

Posted 03 October 2012 - 07:42 PM

Perhaps I am not understanding InnerText. Here's my code.
    Public Sub getStuffFromPage(ByVal page As HtmlDocument, ByVal basepage As String)
        Dim theElementCollection As HtmlElementCollection = page.GetElementsByTagName("span")
        For Each curElement As HtmlElement In theElementCollection
            If curElement.GetAttribute("id") = "rxPower" Then
                Label1.Text = curElement.InnerText
            End If
        Next
    End Sub



In the HTML document I am processing, I have the following...
	<td class="meterCell" height="60">
		Rx Power: <b><span id="rxPower">-48.3</span> dBm</b>
	</td>


If I place a breakpoint on line 6 (Label1.Text = ...), it stops, and looking at curElement, I see:
curElement.Id = "rxPower"
curElement.OuterHtml = "<SPAN id=rxPower></SPAN>"
curElement.TagName = "SPAN"


but curElement.InnerText = Nothing.

Any ideas?

Edited to remove dead code I didn't notice.

This post has been edited by lar3ry: 03 October 2012 - 07:45 PM



Replies To: Having problems with InnerText

#2 lucky3

  • Friend lucky3 As IHelpable

Reputation: 231
  • Posts: 765
  • Joined: 19-October 11

Re: Having problems with InnerText

Posted 03 October 2012 - 09:35 PM

This should work. Perhaps you are catching some other span tag at the moment. If theElementCollection contains many results, try displaying them somewhere else (for example, append them to a TextBox with txtDisplay.Text &= curElement.InnerText or something like that). Each match overwrites Label1.Text, so you may be seeing only the last one, where curElement.InnerText = ""
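
A minimal sketch of that check, assuming a multiline TextBox named txtDisplay on the same form and a helper name (dumpSpans) that is only an example. It lists every span the loop visits, so an unexpected second match would show up:

```vbnet
' Debugging sketch, meant to sit inside the form class next to getStuffFromPage.
' txtDisplay (a multiline TextBox) and dumpSpans are assumed names, not from the original code.
Public Sub dumpSpans(ByVal page As HtmlDocument)
    txtDisplay.Clear()
    For Each curElement As HtmlElement In page.GetElementsByTagName("span")
        ' InnerText can be Nothing for an empty span, so substitute a visible placeholder.
        Dim value As String = If(curElement.InnerText, "(Nothing)")
        txtDisplay.AppendText(curElement.Id & " = " & value & Environment.NewLine)
    Next
End Sub
```

If only one line mentions rxPower and it shows (Nothing), the loop is finding the right element and the span really is empty at that moment.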

This post has been edited by lucky3: 03 October 2012 - 09:40 PM


#3 lar3ry

Re: Having problems with InnerText

Posted 03 October 2012 - 09:58 PM

lucky3, on 03 October 2012 - 09:35 PM, said:

This should work. Perhaps you are catching some other span tag at the moment. If theElementCollection contains many results, try displaying them somewhere else (for example, append them to a TextBox with txtDisplay.Text &= curElement.InnerText or something like that). Each match overwrites Label1.Text, so you may be seeing only the last one, where curElement.InnerText = ""

I thought of that, but the breakpoint at line 6 only stops once during the For Each loop.

I did discover something else that baffles me, and I don't know how I can get around it, short of asking the satellite modem manufacturer to rewrite their HTML.

If I look at WebBrowser1.DocumentText, the value I am trying to get is not there. In other words, the page source (as captured by Firefox) has Rx Power: <b><span id="rxPower">-48.3</span>&nbsp;dBm</b>, but DocumentText has Rx Power: <b><span id="rxPower"></span>&nbsp;dBm</b>, with the value inside the span missing.

That value is obviously refreshed by firmware in the modem, but I don't know how to get the text of the HTML in the same manner as I can get it with "View Page Source" in Firefox.

#4 lar3ry

Re: Having problems with InnerText

Posted 03 October 2012 - 10:21 PM

I'm beginning to think I can't get there from here.

I just tried a "View Source" with Internet Explorer, and the value is missing there too, yet Firefox definitely contains the value. Since the WebBrowser control is essentially Internet Explorer, I am wondering if it's impossible to get that value from it.

#5 lucky3

Re: Having problems with InnerText

Posted 04 October 2012 - 12:31 AM

I don't know if it will make a difference, but try fetching the page source with HttpWebRequest and HttpWebResponse instead of the WebBrowser control.

Use:
 Dim myUri As New Uri("http://somepage.html")
 Dim myHTMLCode As String = myUri.GetWebPageCode



Extension:
    ' Needs Imports System.IO and Imports System.Net, and must be declared inside a Module.
    <System.Runtime.CompilerServices.Extension()>
    Public Function GetWebPageCode(ByVal uri As Uri) As String
        Dim request As HttpWebRequest = DirectCast(WebRequest.Create(uri), HttpWebRequest)
        request.Timeout = 10000 ' milliseconds
        request.UserAgent = "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)"

        ' Using blocks dispose the response and the reader even if reading fails.
        ' Exceptions propagate to the caller unchanged.
        Using response As HttpWebResponse = DirectCast(request.GetResponse(), HttpWebResponse)
            Using reader As New StreamReader(response.GetResponseStream())
                Return reader.ReadToEnd()
            End Using
        End Using
    End Function



#6 lar3ry

Re: Having problems with InnerText

Posted 04 October 2012 - 09:20 AM

lucky3, on 04 October 2012 - 12:31 AM, said:

I don't know if that would make a difference, but try getting your code with HttpWebRequest and HttpWebResponse.


That was a good idea, but alas, it still gives me the same result. Everything is there except the actual values. I've never tried using that technique. Nice to be able to do it without the WebBrowser control. Thanks for that.

Perhaps if I explain the action of the page in a browser, someone might get an idea.

The page itself is served by my satellite modem. When I access it with my browser, it shows various parameters, such as signal strength, signal-to-noise ratio, number of bytes sent and received, and so on. The page gets updated about once per second. The entire page does not seem to refresh; only the numeric values do.

Further investigation turned up something a little surprising. The first time I did a "View Page Source" in Firefox, I got the numbers just fine. I tried it again this morning, and they did not show up. I tried it about 50 times, and the values did not show up. This leads me to believe that the values are refreshed, rendered, and then removed somehow. I don't know enough about HTML to guess how this might operate. It seems to me that the first time I tried "View Page Source" with Firefox, I just got lucky with the timing.

I'm really baffled.

#7 lucky3

Re: Having problems with InnerText

Posted 04 October 2012 - 09:49 AM

If it's not Flash (that would be obvious), then it's probably JavaScript. A script can insert content into the page after the HTML source has loaded, but I'm far from knowledgeable in it.
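
One way to test that guess is to compare what was downloaded with what the control holds a moment later. This is only a sketch: compareSourceAndDom is a made-up name, WebBrowser1 is the control from the earlier post, and my understanding (not verified here) is that DocumentText reflects the HTML as it was received while the Document object reflects the page as it stands now.

```vbnet
' Sketch: compare the downloaded HTML with the live document some time after loading.
' compareSourceAndDom is a hypothetical helper; WebBrowser1 is the control from the earlier post.
Private Sub compareSourceAndDom()
    If WebBrowser1.Document Is Nothing Then Return

    ' The HTML text held by the control (assumed to match what the server sent).
    Dim downloaded As String = WebBrowser1.DocumentText
    System.Diagnostics.Debug.WriteLine("Source has the span: " & downloaded.Contains("rxPower"))

    ' The same element looked up in the live document.
    Dim span As HtmlElement = WebBrowser1.Document.GetElementById("rxPower")
    If span Is Nothing Then
        System.Diagnostics.Debug.WriteLine("Live document: no element with id rxPower")
    Else
        System.Diagnostics.Debug.WriteLine("Live InnerText: " & If(span.InnerText, "(Nothing)"))
    End If
End Sub
```

If the live InnerText has a value while the source shows an empty span, a script is filling it in after load.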

#8 lar3ry

Re: Having problems with InnerText

Posted 04 October 2012 - 10:43 AM

Got it! Problem Solved!

It turns out that the page loads without the values in the source. A short time later, the values are filled in and sent to the browser. Whenever I navigated to the page, the values disappeared and then, shortly after, were filled in again.

So all I had to do was navigate to the page, start a timer on wb.DocumentCompleted, and then look at wb.Document when the timer fired. Sure enough, there were the values.

In the hope of helping someone with a similar problem, here's the test code.
After extracting the data from the modem status page, I navigate back to the basic status page, which does not keep sending any updated information to the browser.

    Private Sub Button1_Click(sender As System.Object, e As System.EventArgs) Handles Button1.Click
        wb.Navigate("http://192.168.100.1/?page=modemStatus")
    End Sub

    Private Sub wb_DocumentCompleted(sender As System.Object, e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles wb.DocumentCompleted
        ' Timer1 interval is 500 ms, which gives the page time to fill in its values.
        If Not wb.Document.Url.ToString.Contains("basicStatus") Then
            Timer1.Enabled = True
        End If
    End Sub

    Private Sub Timer1_Tick(sender As System.Object, e As System.EventArgs) Handles Timer1.Tick
        ' Stop the timer first so it fires only once per page load.
        Timer1.Enabled = False
        Dim page As HtmlDocument = wb.Document
        Dim pageURL As String = wb.Document.Url.ToString
        getStuffFromPage(page, pageURL)
    End Sub

    Public Sub getStuffFromPage(ByVal page As HtmlDocument, pageURL As String)
        Dim theElementCollection As HtmlElementCollection = page.GetElementsByTagName("span")
        If pageURL.Contains("modemStatus") Then
            For Each curElement As HtmlElement In theElementCollection
                If curElement.GetAttribute("id") = "rxPower" Then
                    Label1.Text = "Rx Power:  " & curElement.InnerText
                End If
                If curElement.GetAttribute("id") = "esNo" Then
                    Label2.Text = "SNR:  " & curElement.InnerText
                End If
            Next
        End If
        wb.Navigate("http://192.168.100.1/?page=basicStatus")
    End Sub
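
A possible refinement, sketched under assumptions and not tested against the modem: instead of trusting a single fixed delay, the Tick handler could keep polling until the span actually has text, and give up after a few tries. The names maxTries and triesLeft are illustrative, and this handler would replace the Timer1_Tick above rather than sit beside it.

```vbnet
' Sketch: poll until the page has filled in the span, with a retry limit.
' Assumes the same wb, Timer1 and Label1 as the test code; maxTries and triesLeft are made-up names.
Private Const maxTries As Integer = 10
Private triesLeft As Integer = maxTries   ' set back to maxTries in wb_DocumentCompleted before enabling Timer1

Private Sub Timer1_Tick(sender As System.Object, e As System.EventArgs) Handles Timer1.Tick
    If wb.Document Is Nothing Then Return

    triesLeft -= 1
    ' GetElementById returns Nothing when no element has that id.
    Dim span As HtmlElement = wb.Document.GetElementById("rxPower")

    If span IsNot Nothing AndAlso Not String.IsNullOrEmpty(span.InnerText) Then
        Timer1.Enabled = False                      ' value has arrived, stop polling
        Label1.Text = "Rx Power:  " & span.InnerText
    ElseIf triesLeft <= 0 Then
        Timer1.Enabled = False                      ' give up rather than poll forever
        Label1.Text = "Rx Power:  (no value)"
    End If
End Sub
```

With a 500 ms interval and ten tries, this waits at most about five seconds before giving up.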



Thanks again for the help.
