Page 1 of 1

Data Structures: Kruskal's Algorithm Rate Topic: -----

#1 macosxnerd101  Icon User is online

  • Self-Trained Economist
  • member icon




Reputation: 10785
  • View blog
  • Posts: 40,161
  • Joined: 27-December 08

Posted 11 December 2010 - 03:24 PM

In this tutorial, we will examine Kruskal's algorithm for finding a Minimum Spanning Tree from a Graph. To begin, a minimum spanning tree is an acyclic graph that connects all the vertices in the Graph so that the sum of their weights is minimized. For this algorithm, the weights are distributed along the edges (or Connectors) rather than the Vertices.

What this algorithm does is order the Connectors by weight in ascending order, utilizing a Greedy approach by examining the Connector with the lowest weight first. It then places each Vertex as its own Tree, and stores a Forrest, or Collection of Trees. The Vertices also have their Connections severed; but as the Connections are stored elsewhere, this does not inhibit the algorithm. Rather, it allows us to construct to Minimum Spanning Tree without having to worry about other Connectors. While I have written both Graph and Tree implementations, I utilized my Graph implementation for this algorithm due to it providing random access to each Vertex rather than having to traverse the Tree to search for an element.

Once we have the data set up, the Connectors are traversed. As we are rebuilding a Tree, we have to be careful as to not create any cycles. So the way this is set up is that the Graphs with the Vertices the Connectors couple are found. If the two Graphs are the same, then a circuit would be created by re-establishing that Connection. So the Connection is simply ignored, and the next Connection is examined. However, if the two Graphs are different, the Connection is re-established between the two Vertices, linking the Graphs together. As such, the old Graph is no longer needed and is removed from the Forrest. If the Graph is continuous, a single Minimum Spanning Tree is left in the Forrest. However, if the Graph has discontinuities, then multiple Spanning Trees are left. For the implementation below, it was assumed that the Graph provided was continuous, so only the first Graph was returned as the Minimum Spanning Tree.

This algorithm operates in O(E log(E)) time, where E is the number of Connectors in the Graph, because of the O(n log(n)) nature of Mergesort used by Collections.sort(). It takes O(EV) time, where V is the number of Vertices, to actually build the Minimum Spanning Tree once the Forrest and Connectors are set up in their data structures. Since O(E log(E)) < O(EV), the Big-O comes out to O(E log(E)). This algorithm could also be described as O(E log(V)), as E can be no bigger than V2, and log(E) = 2 log(V) by a property of logarithms, leaving an O(E log(V)).
/***
     * @param g: The Graph to find the Minimum Spanning Tree for
     * @return Graph<E>: The Minimum Spanning Tree of g
     *
     * This implementation of Kruskal's algorithm is used to find the
     * Minimum Spanning Tree of a Graph. It starts by creating a Forest,
     * or Collection of Trees. These Trees have no children initially.
     * Their connections are re-established as the Minimum Spanning Tree
     * is constructed. Kruskal's algorithm utilizes a Greedy approach, building
     * the Minimum Spanning Tree based on the Connector distances from least to
     * greatest. Connectors that bridge two Trees are re-established in the
     * Minimum Spanning Tree. Those that do not are discarded, as they will
     * create a circuit, thus violating the definition of a Tree.
     * */
    public static <E> Graph<E> kruskal(Graph<E> g){

        //stores the Connectors to assemble the MST with
        ArrayList<Connector<E>> connections = new ArrayList<Connector<E>>();

        //stores the forrest of Trees to build the MST from
        ArrayList<Graph<E>> forrest = new ArrayList<Graph<E>>();

        //for each Node in the Graph
        for(int i = 0; i < g.count(); i++){
            Edge<E> temp = g.get(i);

            //add all the Connectors to the List
            connections.addAll(temp.getConnections());

            //then clear them from the Node
            //so the existing Node can be re-utilized to recreate the MST
            temp.getConnections().clear();

            //create a new Tree based on the current Node only
            //and add it to the forrest
            Graph<E> tree = new Graph<E>();
            tree.addEdge(temp);
            forrest.add(tree);
        }

        //sort the Connections for the Greedy approach
        Collections.sort(connections, new Comparator<Connector<E>>(){
            public int compare(Connector<E> one, Connector<E> two){
                return one.getDistance() - two.getDistance();
            }
        });

        //for each Connection
        for(Connector<E> c: connections){
            Edge<E> one = c.getOne();
            Edge<E> two = c.getTwo();

            Graph<E> g1 = null, g2 = null;

            //find the Graphs their edges are in
            for(int i = 0; i < forrest.size(); i++){
                if(forrest.get(i).contains(one))
                    g1 = forrest.get(i);
                if(forrest.get(i).contains(two))
                    g2 = forrest.get(i);
            }

            //if they are the same, then we aren't connecting
            //two Trees
            if(g1 == g2){
                 continue;
            }

            //otherwise, re-create the Connection
            one.connectTo(two, c.getDistance());

            //merge the Trees
            for(int j = 0; j < g2.count(); j++){
                g1.addEdge(g2.get(j));
            }

            //and remove the old Tree from the forrest
            if(forrest.size() > 1)
                forrest.remove(g2);
        }
        return forrest.get(0);
    }



Is This A Good Question/Topic? 3
  • +

Replies To: Data Structures: Kruskal's Algorithm

#2 ComputerAnalysis  Icon User is offline

  • D.I.C Head

Reputation: 8
  • View blog
  • Posts: 92
  • Joined: 29-June 09

Posted 20 December 2010 - 11:27 AM

Hey. Good job and the description of Kruskal's algorithm and nice implementation. However your analysis is wrong. Specifically your analysis goes wrong at this step - "O(Elog(E)) > O(VE)". Notice since E can be at most V^2, we have O(log(E))=O(log(V)). Thus we have
O(Elog(E)) = O(Elog(V)) < O(VE). So your implementation is actually O(VE), which is not optimal.
Was This Post Helpful? 0
  • +
  • -

Page 1 of 1