6 Replies - 1242 Views - Last Post: 20 April 2014 - 07:19 PM

#1 CryptoMonkey

  • New D.I.C Head

Reputation: 0
  • Posts: 2
  • Joined: 19-April 14

Article/Media Scraping App

Posted 19 April 2014 - 05:04 PM

I want to create an app that scrapes Google and some other news sites for articles, images, and videos. The user enters a keyword, and the app scrapes news and media for that keyword and delivers the results throughout the day. Its main function would be to provide news updates on certain topics in a simplified manner. I'm relatively new to programming and not familiar with scrapers, so I'm not sure how this would work. How hard would this be for someone like me?

Also, would there be any legal implications of scraping data from other websites?

Thanks


Replies To: Article/Media Scraping App

#2 modi123_1

  • Suitor #2

Reputation: 9048
  • Posts: 33,970
  • Joined: 12-June 08

Re: Article/Media Scraping App

Posted 19 April 2014 - 05:06 PM

Quote

Also, would there be any legal implications of scraping data from other websites?

Yeah.. read their 'terms of use'.. typically bots and scraping are a no-no.

#3 CryptoMonkey

Re: Article/Media Scraping App

Posted 19 April 2014 - 05:12 PM

modi123_1, on 19 April 2014 - 05:06 PM, said:

Yeah.. read their 'terms of use'.. typically bots and scraping are a no-no.


Thanks for clearing that up.

If I happened to find a website that was lenient with this sort of thing, how hard would it be to implement?

#4 Skydiver

  • Code herder

Reputation: 3530
  • Posts: 10,933
  • Joined: 05-May 12

Re: Article/Media Scraping App

Posted 19 April 2014 - 07:49 PM

Naive screen scraping is easy. Making an intelligent indexer is much harder. For example, if the keyword you are looking for is the noun "taxes", you'll probably want to filter out pages where the word only appears in boilerplate such as "Taxes and shipping not included", and pages where "taxes" is used as a verb (e.g. "Installing Photoshop taxes the system"), but you'll want to keep pages that say "Remember to pay your taxes".
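
To make the difference concrete, here is a minimal Python sketch. The function names and the phrase list are made up for illustration and are not from any library. It compares a naive whole-word match with one that also rejects a known boilerplate phrase.

```python
import re

# Hypothetical boilerplate phrases to reject; a real list would be much longer.
BOILERPLATE_PHRASES = ("taxes and shipping not included",)


def naive_match(text, keyword):
    """Return True if keyword appears in text as a whole word (case-insensitive)."""
    pattern = r"\b" + re.escape(keyword) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


def filtered_match(text, keyword):
    """Naive match that also rejects text containing a known boilerplate phrase."""
    lowered = text.lower()
    if any(phrase in lowered for phrase in BOILERPLATE_PHRASES):
        return False
    return naive_match(text, keyword)


samples = [
    "Taxes and shipping not included",
    "Installing Photoshop taxes the system",
    "Remember to pay your taxes",
]

for sample in samples:
    print(naive_match(sample, "taxes"), filtered_match(sample, "taxes"), sample)
```

The phrase filter drops the "shipping" line, but the Photoshop sentence still gets through, because telling a noun from a verb takes something like part-of-speech tagging. That is the hard part.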

#5 modi123_1

Re: Article/Media Scraping App

Posted 19 April 2014 - 08:39 PM

Of course, the more legit option is checking out any API that these sites-that-allow-data-pilfering may (and should) have.

#6 Momerath

  • D.I.C Lover

Reputation: 1010
  • Posts: 2,444
  • Joined: 04-October 09

Re: Article/Media Scraping App

Posted 20 April 2014 - 02:15 AM

Isn't this what RSS is for?

#7 Skydiver

Re: Article/Media Scraping App

Posted 20 April 2014 - 07:19 PM

RSS doesn't guarantee that it will give you the full contents of new items. It will only notify you that there are new items, probably with a short blurb.
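
As a rough illustration, here is a Python sketch that filters an RSS 2.0 feed by keyword using only the standard library. The feed text, the example.com links, and the `items_matching` helper are made up for the example. A real app would download the feed from a site that publishes one.

```python
import xml.etree.ElementTree as ET

# Made-up RSS 2.0 feed for illustration; a real feed would be downloaded from a site.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item>
      <title>New tax rules announced</title>
      <link>http://example.com/articles/1</link>
      <description>A short blurb about the new rules...</description>
    </item>
    <item>
      <title>Local team wins final</title>
      <link>http://example.com/articles/2</link>
      <description>A short blurb about the match...</description>
    </item>
  </channel>
</rss>"""


def items_matching(feed_xml, keyword):
    """Return (title, link, blurb) for each item whose title or blurb contains keyword."""
    root = ET.fromstring(feed_xml)
    matches = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        blurb = item.findtext("description", default="")
        if keyword.lower() in (title + " " + blurb).lower():
            matches.append((title, link, blurb))
    return matches


for title, link, blurb in items_matching(SAMPLE_FEED, "tax"):
    print(title, "|", link, "|", blurb)
```

Each item here carries only a title, a link, and a blurb. To show the whole article the app would still have to fetch the linked page, which brings back the terms-of-use question.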
