A randomized algorithm for finding the Minimum Spanning Tree in O(m+n) expected time. Created by Karger, Klein, Tarjan.
Definitions & Properties
Forest - An undirected graph in which any two vertices are connected by at most one path. Can also be defined as an undirected graph with no cycles, or a graph with multiple trees as the components.
Cut Rule - For any cut of the graph, the minimum-weight edge that crosses the cut must be in the MST. This rule helps us determine what to add to our MST.
Cycle Rule - For any cycle in G, the heaviest edge on that cycle cannot be in the MST. This helps us determine what we can remove in constructing the MST.
F-heavy - Let F be a forest that is a subgraph of G. An edge e∈E(G) is said to be F-heavy if e creates a cycle when added to F, and is the heaviest edge in this cycle.
F-light - An edge which is not F-heavy is F-light.
Heavy & Light edges Properties
Edge e is F-light ⟺e∈MST(F∪{e}). This essentially means that we can include e when making the MST.
If T is an MST of G, then edge e∈E(G) is T-light if and only if e∈T.
For any forest F, the F-light edges contain the MST of the underlying graph G. In other words, an F-heavy edge is also heavy with respect to the MST of the entire graph.
The last point implies that if we find the F-heavy edges for some forest F that is a subgraph of G, they will be heavy with respect to the minimum spanning tree of G as well, hence we can discard them.
Hence, we can use a strategy of creating a random forest F, finding and discarding all the F-heavy edges, and keep repeating till you have n−1 edges left.
To use this strategy, we need 2 things
MST Verification - Given F, how can we quickly classify an edge as heavy or light?
Randomization - How do we ensure we find a forest with many F-heavy edges, ensuring that we discard many edges?
We’ll assume for now that we can output the set of all F-light edges of a forest F in time O(m+n).
Algorithm
Correctness
Claim -KKT returns MST(G)
We’ve already stated that removing the heavy edges of any forest F in a graph G doesn’t change the MST. Therefore, the MST of G′ will be the same as the MST of G2.
We can then add the contracted edges of Boruvka’s algorithm and get the MST by the cut rule.
Claims
1. E[#E1]=2m′
At step 1.3, we are randomly sampling E′, which is of size m′. Since we are picking each edge with probability, by linearity of expectation, the expected number of edges in E1 is 2m′
2. E[#E2]≤2n′
To prove this, we look at the construction of F1 in the perspective of Kruskal’s algorithm, making it more intuitive to analyze.
We sort all the edges in E′, and run a slightly modified Kruskal’s, iterating over edges e1,e2,…,e∣E′∣, creating F1.
Useful - An edge e is said to be useful in the current state if it connects two trees / doesn’t form a cycle.
Now, when we are iterating over ei, we first determine if the edge is useful or not. If it is useful, we then flip an unbiased coin, and include the edge ei in F1 if we get heads, else we skip it.
We note that this algorithm is equivalent to step 1.3, 1.4 in the KKT algorithm, hence we can analyze those steps indirectly by analyzing this modified Kruskal’s.
Number of heads required - Given there are n′ nodes, the maximum number of edges we can include is n′−1, hence the maximum number of heads we can get is n′−1 as well.
Expected number of coin flips - The probability of including a useful edge is 21, hence there is an expected number of 2(n′−1) flips before we get the maximum of n′−1 edges. This also implies the expected number of useful edges is 2(n′−1).
Now, we can easily claim that any non-useful edge is F-heavy, as it implies that it creates a cycle, and by the sorted order, it’s the largest edge on the cycle, hence it’s an F-heavy edge.
Therefore
F-light edges⊆useful edges
Hence, the expected number of F-light edges is bounded by 2(n′−1), proving the claim that E[#E2]≤2n′.
Expected Running Time
We claim that expected running time of the algorithm with m edges, n vertices is O(m+n).
Let TG be the expected running time on graph G,
Tm,n=G=(V,E),∣V∣=n,∣E∣=mmax{TG}
Referring to the KKT algorithm, we know that steps 1, 2, 3, 5, 7 all can be done in linear time, steps 4 and 6 are the ones we have to analyze.
We write the expected running time of steps 4 and 6 as TG1,TG2 respectively, and denote m1=#E1,m2=#E2.
Hence, we can write the expected running time TG bound as
We’ve already proven these properties Third Inequality: Properties Used -
n′≤8n,m′≤m
We know n′≤8n as we ran Boruvka’s thrice, and in each step, the number of vertices are at least halved in every step.
The second property is trivial as the number of edges reduce whenever we run Boruvka’s.
Hence, we’ve inductively proven that this bound holds, proving the claim that the expected running time is O(m+n).