2 Replies - 1814 Views - Last Post: 12 March 2012 - 08:18 AM Rate Topic: -----

#1 micnap  Icon User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 35
  • Joined: 23-November 11

Limiting depth of Breadth First Search

Posted 11 March 2012 - 09:03 PM

Hi,

I have a homework exercise I'm trying to complete.
I'm using this algorithm from here http://stackoverflow...ction-iterative to do a crawl of a small website. I have this working well in Python but am not allowed to post the actual code here per the course rules. But the assignment also has the requirement of limiting the depth of the search to n levels and they've given us the hint of Bread First Search. My thinking is that I need to somehow mark the depth of the link when it's added to urlsToVisit. But darned if I can figure out how. Can someone point me in the right direction?

//pseudocode:
var urlsToVisit = new StorageObject(); // Could be a queue (BFS) or stack(DFS). (probably with a database backing or something).
var visitedUrls = new List(); // List of visited URLs.

// initialization:
urlsToVisit.Add( rootUrl );

while(urlsToVisit.Count > 0) {
  var nextUrl = urlsToVisit.FetchAndRemoveNextUrl();
  var page = FetchPage(nextUrl);
  ProcessPage(page);
  visitedUrls.Add(nextUrl);
  var links = ParseLinks(page);
  foreach (var link in links)
     if (!visitedUrls.Contains(link))
        urlsToVisit.Add(link); 
}



Thanks,
Mickey

Is This A Good Question/Topic? 0
  • +

Replies To: Limiting depth of Breadth First Search

#2 atraub  Icon User is offline

  • Pythoneer
  • member icon

Reputation: 759
  • View blog
  • Posts: 2,010
  • Joined: 23-December 08

Re: Limiting depth of Breadth First Search

Posted 12 March 2012 - 08:17 AM

Admittedly, it's been a while since I've done graph algorithms, but it seems to me that you need a counter.

How about this: urlsToVisit currently hold a list of urls. What if instead, it held URL-depth tuples? The depth could be checked upon adding any URLS and if the depth were greater than 5, you simply wouldn't add it. Here's a quick rework of the pseudocode (that looks a lot more like Java than pseudocode ;) )

//pseudocode:
var urlTuplesToVisit = new StorageObject(); // Could be a queue (BFS) or stack(DFS). (probably with a database backing or something).
var visitedUrls = new List(); // List of visited URLs.
depthLimit = 5;

// initialization: This isn't proper Java, but you should get the pythonic-idea
urlTuplesToVisit.Add( (rootUrl,0) ); //url-depth tuple

while(urlTuplesToVisit.Count > 0) {
  var current = urlsToVisit.FetchAndRemoveNextUrlDepthTuple();
  var thisUrl = next[0]; //first element of tuple
  var thisDepth = next[1]; //second element of tuple
  var page = FetchPage(thisUrl);
  ProcessPage(page);
  visitedUrls.Add(thisUrl);
  var links = ParseLinks(page);
  foreach (var link in links)
     if ((!visitedUrls.Contains(link)) && thisDepth+1 < depthLimit)
        urlsToVisit.Add((link,thisDepth+1)); 
}


This post has been edited by atraub: 12 March 2012 - 08:19 AM

Was This Post Helpful? 0
  • +
  • -

#3 micnap  Icon User is offline

  • New D.I.C Head

Reputation: 0
  • View blog
  • Posts: 35
  • Joined: 23-November 11

Re: Limiting depth of Breadth First Search

Posted 12 March 2012 - 08:18 AM

Thanks atraub. I should have posted...I figured it out. Once the homework is graded, I'll post my version in Python.

Thanks for your help though.
Mickey
Was This Post Helpful? 0
  • +
  • -

Page 1 of 1