
#1 tody4me

Website Scraping - Data Tables

Posted 22 March 2007 - 09:14 AM

I'm starting to venture out into web programming / data gathering and need some help on where best to look for answers. I wasn't sure if this is more of a .NET question or a software design question, so I posted it here.

I have a site for a shipment provider that I go to for updates on which orders were received, what was shipped, the expected ship date, and other miscellaneous data. I have written a program that takes the data from the site (via an Excel document for now) and uploads it into an Access database (moving to SQL shortly). I want to know if there is a way to scrape the data directly from the site (an HTML table, using a data grid viewer in ASP.NET written in VB, consumed in C#) so that I can bypass the copy/paste into Excel step. Has anyone here done any site scraping? If so, how easy or complex is it, and where did you find the information to do it? Basically, all I want to do is scrape the data from the HTML table into a data table, in which I would update anything that changed. Right now he's working on writing it as a web service, but I don't know that he'll have that done any time soon, and if this is something I could write easily, I would like to do that in the interim.

Thanks,

Replies To: Website Scraping - Data Tables

#2 salindor

Re: Website Scraping - Data Tables

Posted 22 March 2007 - 05:08 PM

I have not searched for any automated tools, but I'm extremely confident that what you're talking about isn't hard. Scan the HTML document for a table, then build your own 2D matrix from the rows and cells it contains. You should probably look for some common text (a known column header, for example) to check whether a particular table is worth scanning.
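A rough sketch of that scan, for illustration only. It is written in Python with the standard-library `html.parser` rather than .NET, and the names `TableCollector`, `find_table`, and the sample markup are made up for the example, not taken from the actual shipment site. It assumes flat (non-nested) tables and that the "common text" to look for is a header cell such as `Ship Date`.

```python
from html.parser import HTMLParser


class TableCollector(HTMLParser):
    """Collect every <table> in a page as a list of rows (a 2D matrix of cell text)."""

    def __init__(self):
        super().__init__()
        self.tables = []    # finished tables, each a list of rows
        self._rows = None   # rows of the table currently being read
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def _close_cell(self):
        # Store the text of the open cell, if any, in the current row.
        if self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
        self._cell = None

    def _close_row(self):
        # Store the open row, if any, in the current table.
        self._close_cell()
        if self._row is not None and self._rows is not None:
            self._rows.append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._rows = []
        elif tag == "tr" and self._rows is not None:
            self._close_row()       # tolerate a missing </tr>
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._close_cell()      # tolerate a missing </td>
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag == "tr":
            self._close_row()
        elif tag == "table" and self._rows is not None:
            self._close_row()
            self.tables.append(self._rows)
            self._rows = None


def find_table(html, marker):
    """Return the first table whose first row contains the cell text `marker`, else None."""
    parser = TableCollector()
    parser.feed(html)
    for table in parser.tables:
        if table and marker in table[0]:
            return table
    return None


# Hypothetical page: a navigation table followed by the table we care about.
SAMPLE = """
<table><tr><td>Home</td><td>Contact</td></tr></table>
<table>
  <tr><th>Order</th><th>Status</th><th>Ship Date</th></tr>
  <tr><td>1001</td><td>Shipped</td><td>2007-03-20</td></tr>
  <tr><td>1002</td><td>Received</td><td></td></tr>
</table>
"""

orders = find_table(SAMPLE, "Ship Date")
```

Here `orders` ends up as a list of rows, header first, which could then be compared against what is already in the database to decide what changed.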

The biggest problem you're going to face is one of context: on an HTML page with multiple tables, which table contains the information you want? Actually, strike that. If the table has an id, just scan for the id.
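If the table does carry an id, the lookup might look something like the sketch below. Again this is Python with the standard-library `html.parser`, purely as an illustration. The id `gridOrders`, the class `TableById`, and the helper `table_by_id` are hypothetical; use whatever id actually appears in the page source.

```python
from html.parser import HTMLParser


class TableById(HTMLParser):
    """Extract the rows of the <table> whose id attribute matches `table_id`."""

    def __init__(self, table_id):
        super().__init__()
        self.table_id = table_id
        self.rows = []
        self._depth = 0     # > 0 while inside the target table
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            if self._depth:
                self._depth += 1    # a table nested inside the target
            elif dict(attrs).get("id") == self.table_id:
                self._depth = 1
        elif self._depth == 1:
            if tag == "tr":
                self.rows.append([])
            elif tag in ("td", "th") and self.rows:
                self._cell = []

    def handle_data(self, data):
        # Text inside a nested table is folded into the enclosing cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "table" and self._depth:
            self._depth -= 1
        elif tag in ("td", "th") and self._depth == 1 and self._cell is not None:
            self.rows[-1].append("".join(self._cell).strip())
            self._cell = None


def table_by_id(html, table_id):
    """Return the target table as a list of rows, or [] if the id is not found."""
    parser = TableById(table_id)
    parser.feed(html)
    return parser.rows


# Hypothetical page with two tables; only the second has the id we want.
SAMPLE = """
<table id="nav"><tr><td>Home</td></tr></table>
<table id="gridOrders">
  <tr><th>Order</th><th>Status</th></tr>
  <tr><td>1001</td><td>Shipped</td></tr>
</table>
"""

rows = table_by_id(SAMPLE, "gridOrders")
```

Matching on the id sidesteps the "which table is it" question entirely, as long as the id stays stable between page loads.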

Anyway, I am not a .NET guru (at least not yet), but you posted over here, so I figure it's fair game ^^

Salindor