6 Replies - 320 Views - Last Post: 12 September 2017 - 07:10 PM

#1 Skydiver  Icon User is offline

  • Code herder
  • member icon

Reputation: 5826
  • View blog
  • Posts: 19,859
  • Joined: 05-May 12

Need graph theory help finding a term

Posted 12 September 2017 - 07:08 AM

If I knew what term or keyword to look for, I would likely be able to do my own research, but I don't even know what the word or phrase I should look for is.

The problem I'm trying to solve is to crawl through a set of web pages at work. I want to find the smallest complete set of pages that are all referenced by a primary index page. The purpose is to migrate the web pages elsewhere. For example: Index.htm points to SummaryDocA.htm. SummaryDocA.htm references DocA.htm, DocB.htm, SummaryDocB.htm, ChartX.png, Intranet.MyCompany.com/HomePage, and google.com. Index.htm, SummaryDocA.htm, DocA.htm, DocB.htm, SummaryDocB.htm, ChartX.png are all hosted within Intranet.MyCompany.com. I want to ignore the Intranet.MyCompany.com/Home and google.com links.

I'm thinking at the worse case is that I keep on building up a blacklist, but this can get pretty hairy because there is an estimated 2600+ web pages that I do want to keep. I think there is a graph theory solution to this problem, but I don't even know what term to use to start looking.

Is This A Good Question/Topic? 0
  • +

Replies To: Need graph theory help finding a term

#2 modi123_1  Icon User is online

  • Suitor #2
  • member icon



Reputation: 13400
  • View blog
  • Posts: 53,477
  • Joined: 12-June 08

Re: Need graph theory help finding a term

Posted 12 September 2017 - 07:10 AM

So similar to a depth first search for a loosely graphed tree?
Was This Post Helpful? 0
  • +
  • -

#3 Skydiver  Icon User is offline

  • Code herder
  • member icon

Reputation: 5826
  • View blog
  • Posts: 19,859
  • Joined: 05-May 12

Re: Need graph theory help finding a term

Posted 12 September 2017 - 10:33 AM

I was actually planning on doing breadth first. After each depth level, I was going to see what URLs are picked up and start throwing them into a blacklist so that they are not explored any further for the next depth level.

The idea of going depth first does have its appeal. I'll find out early if there are any URLs that should not have been explored.
Was This Post Helpful? 0
  • +
  • -

#4 Skydiver  Icon User is offline

  • Code herder
  • member icon

Reputation: 5826
  • View blog
  • Posts: 19,859
  • Joined: 05-May 12

Re: Need graph theory help finding a term

Posted 12 September 2017 - 02:38 PM

NVM. I finally got to see the sample data instead of relying on the client descriptions. All the data is nicely self-contained and I don't need to do any kind of crawling and analysis to find the core data that needs to be migrated.
Was This Post Helpful? 0
  • +
  • -

#5 modi123_1  Icon User is online

  • Suitor #2
  • member icon



Reputation: 13400
  • View blog
  • Posts: 53,477
  • Joined: 12-June 08

Re: Need graph theory help finding a term

Posted 12 September 2017 - 02:39 PM

Oh those clients!
Was This Post Helpful? 0
  • +
  • -

#6 macosxnerd101  Icon User is online

  • Games, Graphs, and Auctions
  • member icon




Reputation: 12135
  • View blog
  • Posts: 45,119
  • Joined: 27-December 08

Re: Need graph theory help finding a term

Posted 12 September 2017 - 05:14 PM

I'm a little late to the party, but I'm guessing you're looking for the transitive closure of the graph.
Was This Post Helpful? 1
  • +
  • -

#7 Skydiver  Icon User is offline

  • Code herder
  • member icon

Reputation: 5826
  • View blog
  • Posts: 19,859
  • Joined: 05-May 12

Re: Need graph theory help finding a term

Posted 12 September 2017 - 07:10 PM

Ah, Skienna... I'm not really loving my electronic copy of the second edition of his Algorithm Design Manual. He must have covered this in his book I must missed it on my Kindle. The paper copy of his first edition is buried somewhere in my attic. I suspect I would have had a better chance of stumbling across that flipping through physical pages.
Was This Post Helpful? 0
  • +
  • -

Page 1 of 1