CSCI H343 Data Structures Fall 2024

Union-Find and Incremental Connected Components

The Union-Find data structure (aka Disjoint Sets) is for efficiently representing members of an equivalence relation.

Overview

Tracking relatives

Disjoint Sets Interface

    interface DisjointSets<N> {
        void make_set(N x);
        N find(N x);
        N union(N x, N y);
    }

Linked-List Implementation

Tree Implementation

Maintain a separate tree/forest structure in which each tree contains one partition of elements with the root as the representative.

Each node in the tree has a parent pointer.

Student Exercise

Use the basic union-find to track relatives in the above example.

Application: solving equations on labeled trees

Example:

    F      =        F
   / \             / \
 X1   X2          G   G
                  |   |
                  X2  X3

Can also be written as terms:

F(X1, X2) = F(G(X2), G(X3))

The solution is

  X1 = G   X2 = G   X3 = X3
       |        |
       G        X3
       |
       X3

  X1=G(G(X3)), X2=G(X3), X3=X3

So we have

F(G(G(X3)), G(X3)) = F(G(G(X3)), G(X3))

Intuition behind how to solve the equations: propagate the equalities to sub-trees.

From the above equation we get the following two equations:

  X1 = G    X2 = G
       |         |
       X2        X3

In general, the process needs to keep track of several sets of things that are equal, that is, we need to partition all of the trees into sets containing equal trees.

Unification algorithm:

    class Node {
        String label;
        public Node[] kids;
        ...
    };
    class Equation {
        Equation(Node l, Node r) { lhs = l; rhs = r; }
        Node lhs; Node rhs;
    }

    static void
    unify(Node t1, Node t2, DisjointSets<Node> sets)
    {
        init(t1, sets); 
        init(t2, sets);
        LinkedList<Equation> equations = new LinkedList<Equation>();
        equations.add(new Equation(t1,t2));
        while (equations.size() != 0) {
            Equation e = equations.pop();
            Node u = sets.find(e.lhs);
            Node v = sets.find(e.rhs);
            if (u != v) {
                if (u instanceof Variable
                    || v instanceof Variable)
                    sets.union(u,v);
                else if (u.label.equals(v.label)) {
                    sets.union(u, v);
                    if (u.kids.length != v.kids.length)
                        return null;
                    for (int i = 0; i != u.kids.length; ++i)
                        equations.add(new Equation(u.kids[i], v.kids[i]));
                } else
                    return null;
            }
        }
        return sets;
    }

    static void init(Node t, DisjointSets<Node> sets) {
        sets.make_set(t);
        for (int i = 0; i != t.kids.length; ++i) {
            init(t.kids[i], sets);
        }
    }

Incremental Connected Components

Def. A connected component is a maximal subset of vertices C in an undirected graph such that for every u and v in C, u \Rightarrow v.

Example:

      a--b   e--f  h   j
      | /|   |     |
      |/ |   |     |
      c--d   g     i

      initial partitions  {a} {b} {c} {d} {e} {f} {g} {h} {i} {j}
       edge processed
       (b,d)              {a} {b,d} {c} {e} {f} {g} {h} {i} {j}
       (e,g)              {a} {b,d} {c} {e,g} {f} {h} {i} {j}
       (a,c)              {a,c} {b,d} {e,g} {f} {h} {i} {j}
       (h,i)              {a,c} {b,d} {e,g} {f} {h,i} {j}
       (a,b)              {a,c,b,d} {e,g} {f} {h,i} {j}
       (e,f)              {a,c,b,d} {e,g,f} {h,i} {j}
       (b,c)              {a,c,b,d} {e,g,f} {h,i} {j}

Student Exercise

Compute the connected components of the following graph.

    V = {a,b,c,d,e,f,g,h,i,j,k}
    E = { (d,i),(f,k),(g,i),(b,g),(a,h),(i,j),(d,k),(b,j),(d,f),(g,j),(a,e) }

    solution:

    {a,e,h}
    {c}
    {b,d,g,i,j,f,k }