6 Replies - 993 Views - Last Post: 20 July 2011 - 07:04 PM

#1 kevin_911

Extracting User Comments from Web Page Source

Posted 07 July 2011 - 07:56 AM

Hi guys, I need a bit of help understanding how to extract user comments from sites such as forums
using the page source.

I am able to extract the page text with the following code:

Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        ' Load the URL typed into TextBox2
        WebBrowser1.Navigate(TextBox2.Text)
End Sub

Private Sub WebBrowser1_DocumentCompleted(ByVal sender As System.Object, ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
        ' Once the page has loaded, show the text of the body with all markup stripped
        TextBox1.Text = WebBrowser1.Document.Body.OuterText
End Sub


However, this gives me plain text without any metadata (or HTML), whereas I am mainly interested in extracting just the comments left by users.

Please advise :)

This post has been edited by kevin_911: 07 July 2011 - 07:57 AM



Replies To: Extracting User Comments from Web Page Source

#2 TADS

Posted 07 July 2011 - 10:03 AM

Take a look at this tutorial; watch parts 1 and 2. This is probably just what you are looking for: http://www.youtube.c...u/3/FpAvBOhDrYk
Hope it helps.

#3 kevin_911

Posted 07 July 2011 - 03:50 PM

TADS, on 07 July 2011 - 11:03 AM, said:

Take a look at this tutorial; watch parts 1 and 2. This is probably just what you are looking for: http://www.youtube.c...u/3/FpAvBOhDrYk
Hope it helps.


Thanks TADS for your kind reply. It surely was helpful!

I am still getting used to regex, so I am not sure how to go about
creating one that works on all forums. Most of the examples out there are written
for a particular website, whereas mine has to work on any forum.

For example, here is a fairly standard way of using a regex:

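   Private Sub TextBox1_TextChanged(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles TextBox1.TextChanged
        ' Strip line feeds so the pattern can match a tag that spans several lines
        Dim content As String = TextBox1.Text.Replace(vbLf, "")

        ' Capture the content attribute of the description meta tag.
        ' [^>]* and [^""]* keep the match inside a single tag instead of running to the end of the page.
        Dim regex As New System.Text.RegularExpressions.Regex("<meta[^>]*name=""description""[^>]*content=""([^""]*)""")

        For Each M As System.Text.RegularExpressions.Match In regex.Matches(content)
            ' Group 1 holds the text between the quotes of content="..."
            Label1.Text = "Description: " & M.Groups(1).Value
        Next
    End Sub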


In a web page's source I would probably look for a repetitive pattern, but I am not sure how that would work without knowing the patterns in advance.
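One possible way to look for a repeating pattern without knowing the site in advance is to count how often each class attribute value occurs in the page source; a value that repeats once per post is a candidate for the post container. The following is only a rough sketch under that assumption. The module and function names (RepeatedPatternSketch, CountClassNames, MostFrequentClass) are hypothetical, and on a real forum the most frequent class may well belong to something other than posts (buttons, icons), so the counts would still need to be inspected or filtered.

```vbnet
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module RepeatedPatternSketch
    ' Hypothetical helper: counts how often each class="..." value occurs in the HTML.
    Public Function CountClassNames(ByVal html As String) As Dictionary(Of String, Integer)
        Dim counts As New Dictionary(Of String, Integer)
        ' Capture the text between the quotes of every class attribute.
        For Each m As Match In Regex.Matches(html, "class=""([^""]*)""", RegexOptions.IgnoreCase)
            Dim name As String = m.Groups(1).Value
            If counts.ContainsKey(name) Then
                counts(name) = counts(name) + 1
            Else
                counts(name) = 1
            End If
        Next
        Return counts
    End Function

    ' Hypothetical helper: returns the class value seen most often, or Nothing if no class attribute exists.
    Public Function MostFrequentClass(ByVal html As String) As String
        Dim best As String = Nothing
        Dim bestCount As Integer = 0
        For Each pair As KeyValuePair(Of String, Integer) In CountClassNames(html)
            If pair.Value > bestCount Then
                best = pair.Key
                bestCount = pair.Value
            End If
        Next
        Return best
    End Function
End Module
```

The page source could be passed in from WebBrowser1.DocumentText once the document has finished loading. Counting attribute values with a regex is a heuristic for spotting candidates, not a reliable HTML parser.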

This post has been edited by kevin_911: 07 July 2011 - 03:58 PM


#4 kevin_911

Posted 07 July 2011 - 03:56 PM

Sorry for the repost guys :oops:

This post has been edited by kevin_911: 07 July 2011 - 03:58 PM


#5 kevin_911

Posted 08 July 2011 - 09:09 AM

Hi guys,

I have searched all over Google and still can't find an answer to my
problem. Regex just seems too complicated to use when the structure of a website is not known.

OK, so using this code:
 TextBox1.Text = WebBrowser1.Document.Body.InnerText


I get the result shown below. How can I remove the unwanted text from the
extracted output and keep only the comments? I only want the user info and their
comments.

Please advise :)

Quote

TechArena Community > Software > Software Development
View HTML Source in a VB.net Program
Become a Member!
Forgot your username/password?
User NameRemember Me?
Password


RegisterTagsActive TopicsRSSSearchSiteMap



Go to Page...


Tags: html, source code, vbnet


Sponsored Links






View HTML Source in a VB.net Program
Software Development


Thread Tools Search this Thread

#1 09-04-2009
Calast
Member Join Date: Apr 2009
Posts: 1

View HTML Source in a VB.net Program



I'm attempting to make a very basic HTML editor in VB for an assignment. and one of the requirements is that I be able to enter a URL and have it retrieve the HTML of whatever page I sent it to. however I'm not sure how I would accomplish this. any help would be appreciated


#2 09-04-2009
Kirtikumar
Member Join Date: Dec 2008
Posts: 323

Re: View HTML Source in a VB.net Program



Dim webResponse3 As System.Net.HttpWebResponse = Nothing
Dim webRequest3 As System.Net.HttpWebRequest = System.Net.HttpWebRequest.Create("http://www.yahoo.com")
Try
webResponse3 = DirectCast(webRequest3.GetResponse(), System.Net.HttpWebResponse)
Dim srResp As System.IO.StreamReader
srResp = New System.IO.StreamReader(webResponse3.GetResponseStream())
dim SOMESTRING as string
SOMESTRING = srResp.ReadToEnd
Catch ex As Exception

End Try


#3 09-04-2009
MindSpace
Member Join Date: Feb 2008
Posts: 1,832

Re: View HTML Source in a VB.net Program



This example will show you how use a string in VB to create PHP code.In order to do this, you need a string to store your PHP page and a function that I will list at the bottom of the page for you to put in a module. This code is written in VB.NET

Public Sub CreatePage(ByVal HTMLTitle As String, ByVal HTMLText As String, ByVal HTMLFileName As String)

Code:
Dim strFile As String

' ----------------------
' -- Prepare String --
' ----------------------
strFile = ""

' --------------------
' -- Write Starter --
' --------------------
strFile = "<html>" & vbNewLine
strFile = strFile & "<head>" & vbNewLine
strFile = strFile & "<title>" & HTMLTitle & "</title>" & vbNewLine
strFile = strFile & "</head><body>" & vbNewLine
strFile = strFile & HTMLText & vbNewLine
strFile = strFile & "</body></html>"
SaveTextToFile(strFile, "C:\" & HTMLFileName & ".html")

End SubNow we're done with the sub for creating the page, this is the only other snippet of code you will need, this needs to go into a
module of your program.

Public Function SaveTextToFile(ByVal strData As String, _ ByVal FullPath As String, _ Optional ByVal ErrInfo As String = "") As Boolean


Code:
Dim Contents As String
Dim Saved As Boolean = False
Dim objReader As IO.StreamWriter
Try

objReader = New IO.StreamWriter(FullPath)
objReader.Write(strData)
objReader.Close()
Saved = True
Catch Ex As Exception
ErrInfo = Ex.Message

End Try
Return Saved
End FunctionAs you can see, the code has a wide range of uses, such as taking old INI databases from your applications and being able to turn them into a useable webpage. Just take the strFile and add onto it with any HTML code or even PHP, all you'd have to do is change the ending on the save function inside the CreatePage sub.




TechArena Community > Software > Software Development




Inserting a new class in a dialog box (win32 API) Problem violation of the memory



Thread Tools
Show Printable Version
Email this Page

Search this Thread


Advanced Search



Similar Threads for: "View HTML Source in a VB.net Program"
ThreadThread StarterForumRepliesLast Post
Unable to view the source code in FirefoxSiketanTechnology & Internet514-04-2011 12:42 AM
Internet explorer 8: View message source in hotmailPrisciliaTechnology & Internet414-02-2011 09:57 AM
View HTML files in KindleGurdeepSPortable Devices422-03-2010 07:52 PM
Java program to retrieve html sourceAaliya SethSoftware Development512-01-2010 10:59 AM
How to View HTML Source Code in Word 2007RutajitWindows Software308-08-2009 12:50 PM



All times are GMT +5.5. The time now is 09:35 PM.


Contact Us - TechArena - Privacy Statement -


#6 kevin_911

Posted 09 July 2011 - 05:43 PM

I researched further and found how to get a particular user post from a forum using the
following code:

TextBox1.Text = WebBrowser1.Document.GetElementById("post-1670732").InnerText


But this requires me to enter the ID manually. How can I design the system to get
the IDs automatically on each visit?

Please advise guys :helpsmilie:
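One way to avoid typing each ID by hand might be to walk every element in the loaded document and keep those whose id starts with a common prefix. The sketch below assumes the forum gives each post an id of the form "post-" followed by a number, as in the "post-1670732" example above; other forums will use different ids, so the prefix is passed in as a parameter. The module and function names (PostExtractionSketch, IsPostId, GetPostTexts) are hypothetical.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Windows.Forms

Module PostExtractionSketch
    ' Hypothetical helper: True when the id is non-empty and starts with the given prefix.
    Public Function IsPostId(ByVal id As String, ByVal idPrefix As String) As Boolean
        Return Not String.IsNullOrEmpty(id) AndAlso
               id.StartsWith(idPrefix, StringComparison.OrdinalIgnoreCase)
    End Function

    ' Hypothetical helper: collects the InnerText of every element whose id matches the prefix.
    Public Function GetPostTexts(ByVal doc As HtmlDocument, ByVal idPrefix As String) As List(Of String)
        Dim posts As New List(Of String)
        If doc Is Nothing Then Return posts

        ' doc.All enumerates every element in the document, so no id has to be known up front.
        For Each element As HtmlElement In doc.All
            If IsPostId(element.Id, idPrefix) Then
                posts.Add(element.InnerText)
            End If
        Next
        Return posts
    End Function
End Module
```

From the DocumentCompleted handler this could be called as GetPostTexts(WebBrowser1.Document, "post-") and the results joined into TextBox1. If a forum also uses the same prefix on elements nested inside a post, the same text may be collected more than once, so the prefix would need to be made more specific.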

#7 kevin_911

Posted 20 July 2011 - 07:04 PM

Bumping this thread!

Just wondering if anybody has any ideas on how to achieve this?