Dynamic Connectivity Problem and Union-Find

Basic Definition

🔷 Dynamic Connectivity Problem

Given a graph where edges can be added or removed, how can we efficiently answer whether two nodes are connected (i.e., lie in the same connected component)?

There are two types:

Incremental connectivity: Only add edges over time.
Fully dynamic connectivity: Both add and remove edges.

🧠 Use Cases

Network connectivity (e.g., are two computers connected?)
Image processing (e.g., labeling connected regions)
Kruskal's algorithm for Minimum Spanning Trees
Dynamic grouping (e.g., social networks, clustering)

🔷 Union-Find Data Structure

Union-Find Data Structure also known as Disjoint Set Union (DSU), is ideal for solving the incremental version of the dynamic connectivity problem.

In Union-Find structure, we use identifier array id where:

each index represents an element.
id[i] is the component identifier for element i.

Before adding an edge, id[i] == i for all i. It means each element is in its own component initially:

id = [0, 1, 2, 3, ..., N-1]

🔷 Union-Find Operations

By adding an edge (p,q),

Union(p, q) and Find(p) will be called.

Union(p, q) (main operation) If p and q are already in the same group, do nothing. If p and q are in different groups, merge them.
Find(p) (update operation) Find and returns the identifier of the group containing p and update id array. The other name for identifier is root, representative, leader, or parent.

For finding connectivity

between p and q, we check whether p and q are in the same group.

Connected(p, q) (fix operation) This is equivalent to they have same identifier i.e find(p) == find(q).

Quick-Find

🔹 What is Quick-Find?

Quick-Find is one of the earliest and simplest approaches in the Union-Find data structure family.

# Union Operation
def union(p, q):
    pid = id[p]
    qid = id[q]
    for i in range(len(id)):
        if id[i] == pid:
            id[i] = qid

# Find Operation
def find(p):
    return id[p]

🧪 Quick-Find Implementation

class QuickFindUF:
    def __init__(self, n):
        # Create an array where each element is its own component
        self.id = list(range(n))

    def union(self, p, q):
        pid = self.find(p)
        qid = self.find(q)

        if pid == qid:
            return  # already connected

        # Change all entries from pid to qid
        for i in range(len(self.id)):
            if self.id[i] == pid:
                self.id[i] = qid

    def find(self, p):
        # Component identifier for p is just id[p]
        return self.id[p]

    def connected(self, p, q):
        # Check if two elements are in the same component
        return self.find(p) == self.find(q)

    def __str__(self):
        return str(self.id)

📌 Example Usage

uf = QuickFindUF(10)
print("Initial state:", uf)

uf.union(4, 3)
print("After union(4, 3):", uf)

uf.union(3, 8)
print("After union(3, 8):", uf)

print("Are 4 and 8 connected?", uf.connected(4, 8))  # True
print("Are 1 and 7 connected?", uf.connected(1, 7))  # False

uf.union(6, 5)
uf.union(9, 4)
uf.union(2, 1)
print("Final state:", uf)

🧾 Sample Output

Initial state: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
After union(4, 3): [0, 1, 2, 3, 3, 5, 6, 7, 8, 9]
After union(3, 8): [0, 1, 2, 8, 8, 5, 6, 7, 8, 9]
Are 4 and 8 connected? True
Are 1 and 7 connected? False
Final state: [0, 1, 2, 8, 8, 6, 6, 7, 8, 8]

🚀 Performance of Quick-find

Very fast find operation: Constant time O(1).
Slow union operation: Linear time O(N), because it may need to update many entries in the array.

Quick-Find is not suitable for large datasets, due to its inefficient union operation.

Quick-Union

🔹 What is Quick-Union?

Quick-Union is another way to solve the dynamic connectivity problem, but it improves on Quick-Find by representing each set as a tree.

Find the root of p by following parent pointers:

# Union Operation
def union(p, q):
    pid = find(p)
    qid = find(q)
    if pid == qid:
        return
    id[pid] = qid

# Find Operation
def find(p):
    while p != id[p]:
        p = id[p]
    return p

🧪 Quick-Union Implementation

class QuickUnionUF:
    def __init__(self, n):
        self.id = list(range(n))

    def union(self, p, q):
        pid = self.find(p)
        qid = self.find(q)
        if pid == qid:
            return
        self.id[pid] = qid

    def find(self, p): # Chase parent pointers until root
        while p != self.id[p]:
            p = self.id[p]
        return p

    def connected(self, p, q):
        return self.find(p) == self.find(q)

    def __str__(self):
        return str(self.id)

📌 Example Usage

uf = QuickUnionUF(10)
print("Initial state:", uf)

uf.union(4, 3)
print("After union(4, 3):", uf)

uf.union(3, 8)
print("After union(3, 8):", uf)

print("Are 4 and 8 connected?", uf.connected(4, 8))  # True
print("Are 1 and 7 connected?", uf.connected(1, 7))  # False

uf.union(6, 5)
uf.union(9, 4)
uf.union(2, 1)
print("Final state:", uf)

🧾 Sample Output

Initial state: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
After union(4, 3): [0, 1, 2, 3, 3, 5, 6, 7, 8, 9]
After union(3, 8): [0, 1, 2, 8, 3, 5, 6, 7, 8, 9]
Are 4 and 8 connected? True
Are 1 and 7 connected? False
Final state: [0, 1, 2, 8, 3, 6, 5, 7, 8, 8]

🚀 Performance

While union is fast, the tree can become tall, making find slow in the worst case (O(N)).

Weighted Quick-Union

🔹 What is Weighted Quick-Union?

Let's now explore Weighted Quick-Union, which improves on basic Quick-Union by keeping the trees balanced—resulting in much faster performance.

Weighted Quick-Union improves the Quick-Union algorithm by always attaching the smaller tree to the larger tree, preventing the formation of tall trees.

This drastically reduces the depth of the trees and speeds up the find() operation.

🧠 Key Ideas

We maintain an additional array size[]:

size[i] stores the number of nodes in the tree rooted at i.

When performing union(p, q):

Find the root of p and q
Attach the smaller tree to the larger one
Update the size[] array

🧪 Weighted Quick-Union Implementation

class WeightedQuickUnionUF:
    def __init__(self, n):
        self.id = list(range(n))     # parent links
        self.size = [1] * n          # size of each tree

    def union(self, p, q):
        pid = self.find(p)
        qid = self.find(q)
        if pid == qid:
            return

        # Make the smaller tree a subtree of the larger one
        if self.size[pid] < self.size[qid]:
            self.id[pid] = qid
            self.size[qid] += self.size[pid]
        else:
            self.id[qid] = pid
            self.size[pid] += self.size[qid]

    def find(self, p):
        while p != self.id[p]:
            p = self.id[p]
        return p

    def connected(self, p, q):
        return self.find(p) == self.find(q)

    def __str__(self):
        return f"id: {self.id}\nsize: {self.size}"

📌 Example Usage

uf = WeightedQuickUnionUF(10)
print("Initial state:")
print(uf)

uf.union(4, 3)
uf.union(3, 8)
uf.union(6, 5)
uf.union(9, 4)
uf.union(2, 1)
uf.union(5, 0)
uf.union(7, 2)
uf.union(6, 1)

print("\nFinal state:")
print(uf)

print("\nAre 8 and 9 connected?", uf.connected(8, 9))  # True
print("Are 1 and 0 connected?", uf.connected(1, 0))    # True

🧾 Sample Output

Initial state:
id: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
size: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

Final state:
id:   [6, 2, 6, 4, 3, 6, 6, 2, 3, 4]
size: [1, 1, 2, 1, 1, 1, 7, 1, 1, 1]

Are 8 and 9 connected? True
Are 1 and 0 connected? True

🚀 Performance

Union is still fast: nearly O(log N)
Trees stay balanced, making find() much faster than basic Quick-Union

⚠️ Still Not Optimal

Even though trees stay relatively shallow, you can still optimize it further using Path Compression, which flattens the tree during find().

Weighted Quick-Union with Path Compression

🔹 What is Path Compression?

Let’s now look at the fastest and most optimized version. This combines tree balancing (weighted union) with path compression to make both union() and find() operations run in near-constant time.

In path compression, during the find(p) operation, we flatten the tree by making each node point directly to the root.

This drastically reduces the tree depth, and future operations become much faster.

🧠 Key Idea (Path Compression during `find()`)

def find(p):
    while p != id[p]:
        id[p] = id[id[p]]  # Path compression: point to grandparent
        p = id[p]
    return p

This is called one-pass path compression (semi-flattening).

🧪 Weighted Quick-Union + Path Compression Implement

class WeightedQuickUnionPathUF:
    def __init__(self, n):
        self.id = list(range(n))     # parent links
        self.size = [1] * n          # size of each tree

    def union(self, p, q):
        pid = self.find(p)
        qid = self.find(q)
        if pid == qid:
            return

        # Weighted union: smaller tree goes under larger one
        if self.size[pid] < self.size[qid]:
            self.id[pid] = qid
            self.size[qid] += self.size[pid]
        else:
            self.id[qid] = pid
            self.size[pid] += self.size[qid]

    def find(self, p):
        while p != self.id[p]:
            # Path compression: point node to its grandparent
            self.id[p] = self.id[self.id[p]]
            p = self.id[p]
        return p

    def connected(self, p, q):
        return self.find(p) == self.find(q)

    def __str__(self):
        return f"id: {self.id}\nsize: {self.size}"

📌 Example Usage

uf = UnionFind(10)
uf.union(4, 3)
uf.union(3, 8)
uf.union(6, 5)
uf.union(9, 4)
uf.union(2, 1)
uf.union(5, 0)
uf.union(7, 2)
uf.union(6, 1)
uf.union(7, 3)

print("Are 0 and 9 connected?", uf.connected(0, 9))  # True
print("Are 1 and 8 connected?", uf.connected(1, 8))  # True

print("\nInternal State:")
print(uf)

🚀 Performance of Path compression

Time per operation: Nearly O(α(N)), where α(N) is the inverse Ackermann function, which grows extremely slowly.
For all practical inputs (billions of elements), α(N) ≤ 5

So this is essentially constant time for real-world usage.

Comparison Table

✅ Why It’s Powerful

Over time, we used techniques to make this more efficient:

Quick-Find (simple, slow)
Quick-Union (tree-based)
Weighted Union (balance the trees)
Path Compression (flatten trees for speed)

With Weighted Union + Path Compression, Union-Find can process millions of operations in nearly constant time — making it a key tool in competitive programming and graph algorithms.

🔍 Union-Find Algorithm Comparison Table

Algorithm	`find(p)` Time	`union(p, q)` Time	Time Complexity (per op)	Tree Balanced?	Path Compression?	Notes
Quick-Find	O(1)	O(N)	O(N)	❌ No	❌ No	Fast `find`, very slow `union`
Quick-Union	O(N)	O(N)	O(N)	❌ No	❌ No	Trees can become tall
Weighted Quick-Union	O(log N)	O(log N)	O(log N)	✅ Yes	❌ No	Keeps trees balanced
Weighted Quick-Union + Path Compression	~O(1)	~O(1)	O(α(N)) ≈ constant	✅ Yes	✅ Yes	Best in practice, near-constant time

🧠 Notes

N = number of elements
α(N) = Inverse Ackermann function — grows so slowly that for all practical inputs, it's ≤ 5
The final version (weighted + path compression) is used in high-performance applications and competitive programming.

🔺 Fully Dynamic Connectivity

When edge deletions are also allowed, Union-Find isn't enough. You need more advanced data structures:

Euler Tour Trees
Link-Cut Trees
Dynamic Trees (e.g., Sleator-Tarjan)
ETT + Level-based decomposition

These allow:

add_edge(u, v)
remove_edge(u, v)
connected(u, v)

All in sublinear or logarithmic time (e.g., O(log^2 n) amortized).

Basic Definition