- Published on
Implementing Union in C++
- Authors
- Name
- 楊翔淳 Jimmy
Introduction
Union data structure is also called disjoint-set data structure. If the operation is required to partition a set into disjoint subsets, then using union is preferred. A popular implementation is disjoint-set forest
Operations
- Add new set
- Merge two sets
- Find the representative member of a set
Initialize sets
You have to initialize elements to sets that only contains itself, in order for them to be merged.
void MakeSet(Node* a) {
if (a->parent != nullptr) return;
a->parent = a;
a->rank = 0; // if union uses rank
a->size = 1; // if union uses size
}
Merge operation (Union)
When deciding if set A is merged into set B or the opposite way, we can check its rank(height of tree) or the size of the set. The set with smaller rank or smaller size is merged into the other set.
Example: merge using rank
void Union(Node* a, Node* b) {
Node* setA = Find(a);
Node* setB = Find(b);
if (setA == setB) { // already in the same union
return;
}
if (setA->rank > setB->rank) {
setB->parent = setA;
}
else if (setA->rank < setB->rank) {
setA->parent = setB;
}
else {
setB->parent = setA;
setA->rank++;
}
}
Find operation
The representative member (or the root) of the set has parent of itself. For a node in the set, follow the chain of parent pointers until reach the representative member.
Node* Find(Node* node) {
if (node != node->parent) {
return Find(node->parent);
}
else {
return node;
}
}
There are a few efficient implementations, here is an example of Path Splitting:
Node* Find(Node* node) {
while (node->parent != node) {
Node* tmp = node->parent;
node->parent = node->parent->parent;
node = tmp;
}
return node;
}
And there is Path Halfing:
Node* Find(Node* node) {
while (node->parent != node) {
node->parent = node->parent->parent;
node = node->parent;
}
return node;
}
Why path halfing works is that we know the parent of the root is also itself. Therefore, we can always jump two steps further when finding the root.
Further Reading
Kruskal's algorithm uses union to find minimum spanning tree.
(Leetcode) Regions Cut by slashes