3 Replies - 1810 Views - Last Post: 18 November 2012 - 10:19 AM

#1 maxrolo

  • New D.I.C Head

Reputation: 0
  • Posts: 2
  • Joined: 16-November 12

web data scratching - server side

Posted 16 November 2012 - 09:14 AM

I am looking for advice on which language / environment / technology will best fit my objective. The project is straightforward. A user enters specific criteria in a web page, and on the server side, the server goes to work for him, initiating multiple web searches and scrapes, and then delivers the results to the user. The web sites scraped do not offer any specific APIs or web services.

In a desktop environment there are oodles of front-end applications like iMacros, but on the server side I am clueless. There may be 1000 or more simultaneous searches, and results must all be delivered instantly. I have to assume that each session will trigger its own crawler / scraper. Which technologies should be used server side? What are the options? Future scalability is obviously a factor.
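To make the "1000 or more simultaneous searches" part concrete, here is a minimal Python sketch of one possible shape for the server side: fan the searches out as asynchronous tasks and cap how many run at once, rather than starting one heavyweight crawler per session. This is an illustration under assumptions, not a recommendation of a specific stack. The names `run_searches`, `fake_fetch`, and `max_concurrent` are hypothetical, and `fake_fetch` only stands in for a real HTTP request plus HTML parsing.

```python
import asyncio


async def run_searches(queries, fetch, max_concurrent=100):
    """Run fetch(query) for every query, at most max_concurrent at a time."""
    limit = asyncio.Semaphore(max_concurrent)

    async def guarded(query):
        # The semaphore keeps a burst of sessions from opening
        # thousands of connections at the same moment.
        async with limit:
            try:
                return query, await fetch(query)
            except Exception as exc:
                # One failing site should not sink the other searches.
                return query, exc

    pairs = await asyncio.gather(*(guarded(q) for q in queries))
    return dict(pairs)


async def fake_fetch(query):
    # Hypothetical stand-in for a real request and scrape.
    await asyncio.sleep(0.01)
    return f"results for {query}"


results = asyncio.run(run_searches([f"q{i}" for i in range(1000)], fake_fetch))
print(len(results), "searches completed")
```

The cap is the design choice worth noticing: response time is then bounded by the slowest target site and the concurrency limit, not by the number of sessions.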


Replies To: web data scratching - server side

#2 modi123_1

  • Suitor #2



Reputation: 9363
  • Posts: 35,172
  • Joined: 12-June 08

Re: web data scratching - server side

Posted 16 November 2012 - 09:22 AM

The sites you are scraping... do they allow this sort of behavior?

#3 blackcompe

  • D.I.C Lover

Reputation: 1155
  • Posts: 2,535
  • Joined: 05-May 05

Re: web data scratching - server side

Posted 16 November 2012 - 10:00 AM

Quote

A user enters specific criteria in a web page, and on the server side, the server goes to work for him, initiating multiple web searches and scrapes, and then delivers the results to the user. The web sites scraped do not offer any specific APIs or web services.


A starting point is to read their TOS and robots.txt to see if they allow crawling.
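The robots.txt half of that check can be automated; the TOS still has to be read by a person. A minimal sketch using Python's standard `urllib.robotparser` follows. The helper name `allowed`, the user agent `mybot`, the `example.com` URLs, and the sample rules are all made up for illustration. A real crawler would download the file from the target site instead of using an inline string.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules: everything is open except /private/.
SAMPLE = """\
User-agent: *
Disallow: /private/
"""

print(allowed(SAMPLE, "mybot", "https://example.com/search?q=x"))
print(allowed(SAMPLE, "mybot", "https://example.com/private/data"))
```

Running this check before each fetch keeps the scraper inside whatever the site has published, even if the rules change later.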

#4 maxrolo

  • New D.I.C Head

Reputation: 0
  • Posts: 2
  • Joined: 16-November 12

Re: web data scratching - server side

Posted 18 November 2012 - 10:19 AM

Yes, they do allow crawling.
