- Published on
Dynamic Connectivity Problem and Union-Find
- Authors
Basic Definition
π· Dynamic Connectivity Problem
Given a graph where edges can be added or removed, how can we efficiently answer whether two nodes are connected (i.e., lie in the same connected component)?
There are two types:
- Incremental connectivity: Only add edges over time.
- Fully dynamic connectivity: Both add and remove edges.
π§ Use Cases
- Network connectivity (e.g., are two computers connected?)
- Image processing (e.g., labeling connected regions)
- Kruskal's algorithm for Minimum Spanning Trees
- Dynamic grouping (e.g., social networks, clustering)
π· Union-Find Data Structure
Union-Find Data Structure also known as Disjoint Set Union (DSU), is ideal for solving the incremental version of the dynamic connectivity problem.
In Union-Find structure, we use identifier array id
where:
- each index represents an element.
id[i]
is the component identifier for elementi
.
Before adding an edge, id[i] == i
for all i
. It means each element is in its own component initially:
id = [0, 1, 2, 3, ..., N-1]
π· Union-Find Operations
By adding an edge (p,q),
Union(p, q) and Find(p) will be called.
Union(p, q) (main operation) If
p
andq
are already in the same group, do nothing. Ifp
andq
are in different groups, merge them.Find(p) (update operation) Find and returns the identifier of the group containing
p
and updateid
array. The other name for identifier is root, representative, leader, or parent.
For finding connectivity
between p
and q
, we check whether p
and q
are in the same group.
- Connected(p, q) (fix operation) This is equivalent to they have same identifier i.e
find(p) == find(q)
.
Quick-Find
πΉ What is Quick-Find?
Quick-Find is one of the earliest and simplest approaches in the Union-Find data structure family.
# Union Operation
def union(p, q):
pid = id[p]
qid = id[q]
for i in range(len(id)):
if id[i] == pid:
id[i] = qid
# Find Operation
def find(p):
return id[p]
π§ͺ Quick-Find Implementation
class QuickFindUF:
def __init__(self, n):
# Create an array where each element is its own component
self.id = list(range(n))
def union(self, p, q):
pid = self.find(p)
qid = self.find(q)
if pid == qid:
return # already connected
# Change all entries from pid to qid
for i in range(len(self.id)):
if self.id[i] == pid:
self.id[i] = qid
def find(self, p):
# Component identifier for p is just id[p]
return self.id[p]
def connected(self, p, q):
# Check if two elements are in the same component
return self.find(p) == self.find(q)
def __str__(self):
return str(self.id)
π Example Usage
uf = QuickFindUF(10)
print("Initial state:", uf)
uf.union(4, 3)
print("After union(4, 3):", uf)
uf.union(3, 8)
print("After union(3, 8):", uf)
print("Are 4 and 8 connected?", uf.connected(4, 8)) # True
print("Are 1 and 7 connected?", uf.connected(1, 7)) # False
uf.union(6, 5)
uf.union(9, 4)
uf.union(2, 1)
print("Final state:", uf)
π§Ύ Sample Output
Initial state: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
After union(4, 3): [0, 1, 2, 3, 3, 5, 6, 7, 8, 9]
After union(3, 8): [0, 1, 2, 8, 8, 5, 6, 7, 8, 9]
Are 4 and 8 connected? True
Are 1 and 7 connected? False
Final state: [0, 1, 2, 8, 8, 6, 6, 7, 8, 8]
π Performance of Quick-find
Very fast
find
operation: Constant time O(1).Slow
union
operation: Linear time O(N), because it may need to update many entries in the array.
Quick-Find is not suitable for large datasets, due to its inefficient union operation.
Quick-Union
πΉ What is Quick-Union?
Quick-Union is another way to solve the dynamic connectivity problem, but it improves on Quick-Find by representing each set as a tree.
Find the root of p
by following parent pointers:
# Union Operation
def union(p, q):
pid = find(p)
qid = find(q)
if pid == qid:
return
id[pid] = qid
# Find Operation
def find(p):
while p != id[p]:
p = id[p]
return p
π§ͺ Quick-Union Implementation
class QuickUnionUF:
def __init__(self, n):
self.id = list(range(n))
def union(self, p, q):
pid = self.find(p)
qid = self.find(q)
if pid == qid:
return
self.id[pid] = qid
def find(self, p): # Chase parent pointers until root
while p != self.id[p]:
p = self.id[p]
return p
def connected(self, p, q):
return self.find(p) == self.find(q)
def __str__(self):
return str(self.id)
π Example Usage
uf = QuickUnionUF(10)
print("Initial state:", uf)
uf.union(4, 3)
print("After union(4, 3):", uf)
uf.union(3, 8)
print("After union(3, 8):", uf)
print("Are 4 and 8 connected?", uf.connected(4, 8)) # True
print("Are 1 and 7 connected?", uf.connected(1, 7)) # False
uf.union(6, 5)
uf.union(9, 4)
uf.union(2, 1)
print("Final state:", uf)
π§Ύ Sample Output
Initial state: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
After union(4, 3): [0, 1, 2, 3, 3, 5, 6, 7, 8, 9]
After union(3, 8): [0, 1, 2, 8, 3, 5, 6, 7, 8, 9]
Are 4 and 8 connected? True
Are 1 and 7 connected? False
Final state: [0, 1, 2, 8, 3, 6, 5, 7, 8, 8]
π Performance
While union
is fast, the tree can become tall, making find
slow in the worst case (O(N)).
Weighted Quick-Union
πΉ What is Weighted Quick-Union?
Let's now explore Weighted Quick-Union, which improves on basic Quick-Union by keeping the trees balancedβresulting in much faster performance.
Weighted Quick-Union improves the Quick-Union algorithm by always attaching the smaller tree to the larger tree, preventing the formation of tall trees.
This drastically reduces the depth of the trees and speeds up the find()
operation.
π§ Key Ideas
We maintain an additional array size[]
:
size[i]
stores the number of nodes in the tree rooted ati
.
When performing union(p, q)
:
- Find the root of
p
andq
- Attach the smaller tree to the larger one
- Update the
size[]
array
π§ͺ Weighted Quick-Union Implementation
class WeightedQuickUnionUF:
def __init__(self, n):
self.id = list(range(n)) # parent links
self.size = [1] * n # size of each tree
def union(self, p, q):
pid = self.find(p)
qid = self.find(q)
if pid == qid:
return
# Make the smaller tree a subtree of the larger one
if self.size[pid] < self.size[qid]:
self.id[pid] = qid
self.size[qid] += self.size[pid]
else:
self.id[qid] = pid
self.size[pid] += self.size[qid]
def find(self, p):
while p != self.id[p]:
p = self.id[p]
return p
def connected(self, p, q):
return self.find(p) == self.find(q)
def __str__(self):
return f"id: {self.id}\nsize: {self.size}"
π Example Usage
uf = WeightedQuickUnionUF(10)
print("Initial state:")
print(uf)
uf.union(4, 3)
uf.union(3, 8)
uf.union(6, 5)
uf.union(9, 4)
uf.union(2, 1)
uf.union(5, 0)
uf.union(7, 2)
uf.union(6, 1)
print("\nFinal state:")
print(uf)
print("\nAre 8 and 9 connected?", uf.connected(8, 9)) # True
print("Are 1 and 0 connected?", uf.connected(1, 0)) # True
π§Ύ Sample Output
Initial state:
id: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
size: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
Final state:
id: [6, 2, 6, 4, 3, 6, 6, 2, 3, 4]
size: [1, 1, 2, 1, 1, 1, 7, 1, 1, 1]
Are 8 and 9 connected? True
Are 1 and 0 connected? True
π Performance
- Union is still fast: nearly O(log N)
- Trees stay balanced, making
find()
much faster than basic Quick-Union
β οΈ Still Not Optimal
Even though trees stay relatively shallow, you can still optimize it further using Path Compression, which flattens the tree during find()
.
Weighted Quick-Union with Path Compression
πΉ What is Path Compression?
Letβs now look at the fastest and most optimized version. This combines tree balancing (weighted union) with path compression to make both union()
and find()
operations run in near-constant time.
In path compression, during the find(p)
operation, we flatten the tree by making each node point directly to the root.
This drastically reduces the tree depth, and future operations become much faster.
find()
)
π§ Key Idea (Path Compression during def find(p):
while p != id[p]:
id[p] = id[id[p]] # Path compression: point to grandparent
p = id[p]
return p
This is called one-pass path compression (semi-flattening).
π§ͺ Weighted Quick-Union + Path Compression Implement
class WeightedQuickUnionPathUF:
def __init__(self, n):
self.id = list(range(n)) # parent links
self.size = [1] * n # size of each tree
def union(self, p, q):
pid = self.find(p)
qid = self.find(q)
if pid == qid:
return
# Weighted union: smaller tree goes under larger one
if self.size[pid] < self.size[qid]:
self.id[pid] = qid
self.size[qid] += self.size[pid]
else:
self.id[qid] = pid
self.size[pid] += self.size[qid]
def find(self, p):
while p != self.id[p]:
# Path compression: point node to its grandparent
self.id[p] = self.id[self.id[p]]
p = self.id[p]
return p
def connected(self, p, q):
return self.find(p) == self.find(q)
def __str__(self):
return f"id: {self.id}\nsize: {self.size}"
π Example Usage
uf = UnionFind(10)
uf.union(4, 3)
uf.union(3, 8)
uf.union(6, 5)
uf.union(9, 4)
uf.union(2, 1)
uf.union(5, 0)
uf.union(7, 2)
uf.union(6, 1)
uf.union(7, 3)
print("Are 0 and 9 connected?", uf.connected(0, 9)) # True
print("Are 1 and 8 connected?", uf.connected(1, 8)) # True
print("\nInternal State:")
print(uf)
π Performance of Path compression
Time per operation: Nearly O(Ξ±(N)), where Ξ±(N) is the inverse Ackermann function, which grows extremely slowly.
For all practical inputs (billions of elements), Ξ±(N) β€ 5
So this is essentially constant time for real-world usage.
Comparison Table
β Why Itβs Powerful
Over time, we used techniques to make this more efficient:
- Quick-Find (simple, slow)
- Quick-Union (tree-based)
- Weighted Union (balance the trees)
- Path Compression (flatten trees for speed)
With Weighted Union + Path Compression, Union-Find can process millions of operations in nearly constant time β making it a key tool in competitive programming and graph algorithms.
π Union-Find Algorithm Comparison Table
Algorithm | find(p) Time | union(p, q) Time | Time Complexity (per op) | Tree Balanced? | Path Compression? | Notes |
---|---|---|---|---|---|---|
Quick-Find | O(1) | O(N) | O(N) | β No | β No | Fast find , very slow union |
Quick-Union | O(N) | O(N) | O(N) | β No | β No | Trees can become tall |
Weighted Quick-Union | O(log N) | O(log N) | O(log N) | β Yes | β No | Keeps trees balanced |
Weighted Quick-Union + Path Compression | ~O(1) | ~O(1) | O(Ξ±(N)) β constant | β Yes | β Yes | Best in practice, near-constant time |
π§ Notes
- N = number of elements
- Ξ±(N) = Inverse Ackermann function β grows so slowly that for all practical inputs, it's β€ 5
- The final version (weighted + path compression) is used in high-performance applications and competitive programming.
πΊ Fully Dynamic Connectivity
When edge deletions are also allowed, Union-Find isn't enough. You need more advanced data structures:
- Euler Tour Trees
- Link-Cut Trees
- Dynamic Trees (e.g., Sleator-Tarjan)
- ETT + Level-based decomposition
These allow:
add_edge(u, v)
remove_edge(u, v)
connected(u, v)
All in sublinear or logarithmic time (e.g., O(log^2 n)
amortized).