CSCI H343 Data Structures Fall 2024

Minimum Spanning Trees and Kruskal’s Algorithm

Suppose you’re a cable company and you are planning where to place cables so that you reach every house, but you’d like to minimize the amount of cable that you use.

Definition. A spanning tree is a subset of edges of a graph that form a tree and that connect all the vertices in the graph.

Definition. Given an edge-weighted graph G, the minimum spanning tree problem is to find a spanning tree of G whose total weight is less or equal to any other spanning tree. We write T₁ ≤ T₂ when the total weight of the edges in tree T₁ is less-than or equal to the total weight of the edges in tree T₂.

**Example graph for MST problem.**

**A MST for the above graph, with weight 13.**

**Another MST for the above graph, with weight 13.**

There are two popular algorithms for MST, I’ll cover the first. The second is in the textbook.

The two algorithms have a lot in common.

But how do we find a safe edge?

We’ll need some definitions to lead up to the answer.

Definitions. Suppose G = (V,E) is a graph with weighted edges.

The following theorem tells us that to find a safe edge, it suffices to find a light edge.

Theorem 23.1 Let A be a subset of some MST of G=(V,E). Let (S,V-S) be a cut of G that respects A. Let (u,v) be a light edge wrt. the cut. Then (u,v) is a safe edge.

Proof. Let T be a MST of G that includes A.

QED

We can think of MST algorithms as maintaining a forest of trees, where all the tree edges are in the set A. Initially, each vertex is in it’s own tree because A={}. At each step, we merge two trees into a single tree by identifying the lightest edge connecting them (such an edge is safe).

Kruskal’s Algorithm

Main idea: process all the edges from lightest to heaviest

Demo of Kruskal’s algorithm

**Example graph for MST problem.**

sorted list of edges:
A-1-B, B-2-E, C-2-F, A-3-E, D-4-E, A-4-D, D-4-B, C-4-E, E-7-F

Implementation Ideas

Implementation of Kruskal’s algorithm in Java

static <V> void kruskal_mst(EdgeGraph<V> G, 
                            Map<V,Map<V,Double>> weight,
                            ArrayList<Edge<V>> T, 
                            DisjointSets<V> sets)
{
    for (V v : G.vertices())
        sets.make_set(v);
    ArrayList<Edge<V>> edges = new ArrayList<Edge<V>>();
    for (Edge<V> e : G.edges())
        edges.add(e);
    // bucket sort O(n), mergesort O(n log n)			
    sort(edges, new CompareWeight<V>(weight));
    for (Edge<V> e : edges) {
        if (sets.find(e.source()) != sets.find(e.target())) {
            T.add(e);
            sets.union(e.source(), e.target());
        }
	}
}

Time complexity

The dominating cost is the sorting. So the overall time complexity is O(m log m) which can be instead stated as O(m log n) because m < n² and therefore log m < 2 log n.

Student group work

apply Kruskal’s to the following graph (didn’t get to this)

**Graph for MST student exercise.**

Solution: sorted edges:

	   A-1-E, G-1-H, B-1-D, D-1-H
	   F-2-G, A-2-B, 
	   A-3-C,
	   E-4-F, C-4-H, D-4-G
	   C-5-D, 
	   B-8-F

An MST of weight 11: (there are other MST’s)

**Solution for MST student exercise.**