KKT Algorithm

A randomized algorithm for finding the Minimum Spanning Tree in O(m+n)O(m + n) expected time. Created by Karger, Klein, Tarjan.

Definitions & Properties

  1. Forest - An undirected graph in which any two vertices are connected by at most one path. Can also be defined as an undirected graph with no cycles, or a graph with multiple trees as the components.
  2. Cut Rule - For any cut of the graph, the minimum-weight edge that crosses the cut must be in the MST. This rule helps us determine what to add to our MST.
  3. Cycle Rule - For any cycle in G, the heaviest edge on that cycle cannot be in the MST. This helps us determine what we can remove in constructing the MST.
  4. F-heavy - Let FF be a forest that is a subgraph of GG. An edge eE(G)e \in E(G) is said to be F-heavy if ee creates a cycle when added to FF, and is the heaviest edge in this cycle.
  5. F-light - An edge which is not F-heavy is F-light.

Heavy & Light edges Properties

  1. Edge ee is F-light     eMST(F{e})\iff e \in \operatorname{MST}(F \cup \{e\}). This essentially means that we can include ee when making the MST.
  2. If TT is an MST of GG, then edge eE(G)e \in E(G) is T-light if and only if eTe \in T.
  3. For any forest FF, the F-light edges contain the MST of the underlying graph GG. In other words, an F-heavy edge is also heavy with respect to the MST of the entire graph.

The last point implies that if we find the F-heavy edges for some forest FF that is a subgraph of GG, they will be heavy with respect to the minimum spanning tree of GG as well, hence we can discard them.

Hence, we can use a strategy of creating a random forest FF, finding and discarding all the F-heavy edges, and keep repeating till you have n1n - 1 edges left.

To use this strategy, we need 2 things

  1. MST Verification - Given FF, how can we quickly classify an edge as heavy or light?
  2. Randomization - How do we ensure we find a forest with many F-heavy edges, ensuring that we discard many edges?

We’ll assume for now that we can output the set of all F-light edges of a forest FF in time O(m+n)O(m + n).

Algorithm

The KKT Algorithm

Correctness

Claim - KKT\operatorname{KKT} returns MST(G)\operatorname{MST}(G)

We’ve already stated that removing the heavy edges of any forest FF in a graph GG doesn’t change the MST. Therefore, the MST of GG' will be the same as the MST of G2G_{2}.

We can then add the contracted edges of Boruvka’s algorithm and get the MST by the cut rule.

Claims

1. E[#E1]=m2\mathbb{E}[\#E_{1}] = \frac{m'}{2}

At step 1.3\texttt{1.3}, we are randomly sampling EE', which is of size mm'. Since we are picking each edge with probability, by linearity of expectation, the expected number of edges in E1E_{1} is m2\frac{m'}{2}

2. E[#E2]2n\mathbb{E}[\#E_{2}] \leq 2n'

To prove this, we look at the construction of F1F_{1} in the perspective of Kruskal’s algorithm, making it more intuitive to analyze.

We sort all the edges in EE', and run a slightly modified Kruskal’s, iterating over edges e1,e2,,eEe_{1}, e_{2}, \dots, e_{|E'|}, creating F1F_{1}.

Useful - An edge ee is said to be useful in the current state if it connects two trees / doesn’t form a cycle.

Now, when we are iterating over eie_{i}, we first determine if the edge is useful or not. If it is useful, we then flip an unbiased coin, and include the edge eie_{i} in F1F_{1} if we get heads, else we skip it.
We note that this algorithm is equivalent to step 1.3, 1.4\texttt{1.3, 1.4} in the KKT\operatorname{KKT} algorithm, hence we can analyze those steps indirectly by analyzing this modified Kruskal’s.

Number of heads required - Given there are nn' nodes, the maximum number of edges we can include is n1n' - 1, hence the maximum number of heads we can get is n1n' - 1 as well.

Expected number of coin flips - The probability of including a useful edge is 12\frac{1}{2}, hence there is an expected number of 2(n1)2(n' - 1) flips before we get the maximum of n1n' - 1 edges. This also implies the expected number of useful edges is 2(n1)2(n' - 1).

Now, we can easily claim that any non-useful edge is F-heavy, as it implies that it creates a cycle, and by the sorted order, it’s the largest edge on the cycle, hence it’s an F-heavy edge.
Therefore

F-light edgesuseful edges\text{F-light edges} \subseteq \text{useful edges}

Hence, the expected number of F-light edges is bounded by 2(n1)2(n' - 1), proving the claim that E[#E2]2n\mathbb{E}[\#E_{2}] \leq 2n'.

Expected Running Time

We claim that expected running time of the algorithm with mm edges, nn vertices is O(m+n)O(m + n).
Let TGT_{G} be the expected running time on graph GG,

Tm,n=maxG=(V,E),V=n,E=m{TG}T_{m,n} = \max_{G =(V, E), |V| =n, |E|=m} \{T_{G}\}

Referring to the KKT\operatorname{KKT} algorithm, we know that steps 1, 2, 3, 5, 7 all can be done in linear time, steps 4 and 6 are the ones we have to analyze.

We write the expected running time of steps 4 and 6 as TG1,TG2T_{G_{1}}, T_{G_{2}} respectively, and denote m1=#E1,m2=#E2m_{1}=\#E_{1}, m_{2}=\#E_{2}.
Hence, we can write the expected running time TGT_{G} bound as

TGc(m+n)+E[TG1+TG2]c(m+n)+E[Tm1,n]+E[Tm2,n]\begin{aligned} T_{G} \leq c(m + n) + \mathbb{E}[T_{G_{1}} + T_{G_{2}}] \leq c(m + n) + \mathbb{E}[T_{m_{1},n'}] + \mathbb{E}[T_{m_{2},n'}] \end{aligned}

We inductively assume

Tm,n2c(m+n)T_{m,n}\leq 2c(m + n)
TGc(m+n)+E[2c(m1+n)]+E[2c(m2+n)]c(m+n)+c(m+2n)+2c(2n+n)=c(m+m+n+8n)2c(m+n)\begin{aligned} T_{G} &\leq c(m + n) + \mathbb{E}[2c(m_{1} + n')] + \mathbb{E}[2c(m_{2} + n')]\\ &\leq c(m + n) + c(m' + 2n') + 2c(2n' + n') = c(m + m' + n + 8n')\\ &\leq 2c(m + n) \end{aligned}

Second Inequality: Properties Used -

E[m1]=12m,E[m2]2n\mathbb{E}[m_{1}] = \frac{1}{2}m', \mathbb{E}[m_{2}] \leq 2n'

We’ve already proven these properties
Third Inequality: Properties Used -

nn8,mmn' \leq \frac{n}{8}, m' \leq m

We know nn8n' \leq \frac{n}{8} as we ran Boruvka’s thrice, and in each step, the number of vertices are at least halved in every step.
The second property is trivial as the number of edges reduce whenever we run Boruvka’s.

Hence, we’ve inductively proven that this bound holds, proving the claim that the expected running time is O(m+n)O(m + n).

Published Feb 2, 2023