Kruskal's algorithm

2017-01-08 13:28:52 +02:00 · b0f75a819e
parent 2d74407966
commit b0f75a819e
1 changed files with 135 additions and 145 deletions
--- a/luku15.tex
+++ b/luku15.tex
@ -1,17 +1,15 @@
 \chapter{Spanning trees}
-\index{virittxvx puu@virittävä puu}
+\index{spanning tree}
-\key{Virittävä puu} on kokoelma
+A \key{spanning tree} is a set of edges of a graph
-verkon kaaria,
+such that there is a path between any two nodes
-joka kytkee kaikki
+in the graph using only the edges in the spanning tree.
-verkon solmut toisiinsa.
+Like trees in general, a spanning tree is
-Kuten puut yleensäkin,
+connected and acyclic.
-virittävä puu on yhtenäinen ja syklitön.
+Usually, there are many ways to construct a spanning tree.
-Virittävän puun muodostamiseen
-on yleensä monia tapoja.
-Esimerkiksi verkossa
+For example, in the graph
 \begin{center}
 \begin{tikzpicture}[scale=0.9]
 \node[draw, circle] (1) at (1.5,2) {$1$};
@ -30,7 +28,7 @@ Esimerkiksi verkossa
 \path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
 \end{tikzpicture}
 \end{center}
-yksi mahdollinen virittävä puu on seuraava:
+one possible spanning tree is as follows:
 \begin{center}
 \begin{tikzpicture}[scale=0.9]
 \node[draw, circle] (1) at (1.5,2) {$1$};
@ -47,13 +45,16 @@ yksi mahdollinen virittävä puu on seuraava:
 \end{tikzpicture}
 \end{center}
-Virittävän puun paino on siihen kuuluvien kaarten painojen summa.
+The weight of a spanning tree is the sum of the edge weights.
-Esimerkiksi yllä olevan puun paino on $3+5+9+3+2=22$.
+For example, the weight of the above spanning tree is
+$3+5+9+3+2=22$.
-\key{Pienin virittävä puu}
+\index{minimum spanning tree}
-on virittävä puu, jonka paino on mahdollisimman pieni.
+
-Yllä olevan verkon pienin virittävä puu
+A \key{minimum spanning tree}
-on painoltaan 20, ja sen voi muodostaa seuraavasti:
+is a spanning tree whose weight is as small as possible.
+The weight of a minimum spanning tree for the above graph
+is 20, and a tree can be constructed as follows:
 \begin{center}
 \begin{tikzpicture}[scale=0.9]
@ -75,10 +76,12 @@ on painoltaan 20, ja sen voi muodostaa seuraavasti:
 \end{tikzpicture}
 \end{center}
-Vastaavasti \key{suurin virittävä puu}
+\index{maximum spanning tree}
-on virittävä puu, jonka paino on mahdollisimman suuri.
+
-Yllä olevan verkon suurin virittävä puu on
+Correspondingly, a \key{maximum spanning tree}
-painoltaan 32:
+is a spanning tree whose weight is as large as possible.
+The weight of a maximum spanning tree for the
+above graph is 32:
 \begin{center}
 \begin{tikzpicture}[scale=0.9]
@ -99,53 +102,46 @@ painoltaan 32:
 \end{tikzpicture}
 \end{center}
-Huomaa, että voi olla monta erilaista
+Note that there may be several different ways
-tapaa muodostaa pienin tai
+to construct a minimum or maximum spanning tree,
-suurin virittävä puu, eli puut eivät ole yksikäsitteisiä.
+so the trees are not unique.
-Tässä luvussa tutustumme algoritmeihin,
+This chapter discusses algorithms that construct
-jotka muodostavat verkon pienimmän tai suurimman
+a minimum or maximum spanning tree for a graph.
-virittävän puun.
+It turns out that it is easy to find such spanning trees
-Osoittautuu, että virittävien puiden etsiminen
+because many greedy methods produce an optimal solution.
-on siinä mielessä helppo ongelma,
-että monenlaiset ahneet menetelmät tuottavat
-optimaalisen ratkaisun.
-Käymme läpi kaksi algoritmia, jotka molemmat valitsevat
+We will learn two algorithms that both construct the
-puuhun mukaan kaaria painojärjestyksessä.
+tree by choosing edges ordered by weights.
-Keskitymme pienimmän virittävän puun etsimiseen,
+We will focus on finding a minimum spanning tree,
-mutta samoilla algoritmeilla voi muodostaa myös suurimman virittävän
+but the same algorithms can be used for finding a
-puun käsittelemällä kaaret käänteisessä järjestyksessä.
+maximum spanning tree by processing the edges in reverse order.
-\section{Kruskalin algoritmi}
+\section{Kruskal's algorithm}
-\index{Kruskalin algoritmi@Kruskalin algoritmi}
+\index{Kruskal's algorithm}
-\key{Kruskalin algoritmi} aloittaa pienimmän
+In \key{Kruskal's algorithm}, the initial spanning tree
-virittävän
+contains no edges.
-puun muodostamisen tilanteesta,
+Then the algorithm adds edges to the tree
-jossa puussa ei ole yhtään kaaria.
+one at a time
-Sitten algoritmi alkaa lisätä
+in increasing order of their weights.
-puuhun kaaria järjestyksessä
+At each step, the algorithm includes an edge in the tree
-kevyimmästä raskaimpaan.
+if it doesn't create a cycle.
-Kunkin kaaren kohdalla
-algoritmi ottaa kaaren mukaan puuhun,
-jos tämä ei aiheuta sykliä.
-Kruskalin algoritmi pitää yllä
+Kruskal's algorithm maintains the components
-tietoa verkon komponenteista.
+in the tree.
-Aluksi jokainen solmu on omassa
+Initially, each node of the graph
-komponentissaan,
+is in its own component,
-ja komponentit yhdistyvät pikkuhiljaa
+and each edge added to the tree joins two components.
-algoritmin aikana puuhun tulevista kaarista.
+Finally, all nodes will be in the same component,
-Lopulta kaikki solmut ovat samassa
+and a minimum spanning tree has been found.
-komponentissa, jolloin pienin virittävä puu on valmis.
-\subsubsection{Esimerkki}
+\subsubsection{Example}
 \begin{samepage}
-Tarkastellaan Kruskalin algoritmin toimintaa
+Let's consider how Kruskal's algorithm processes the
-seuraavassa verkossa:
+following graph:
 \begin{center}
 \begin{tikzpicture}[scale=0.9]
 \node[draw, circle] (1) at (1.5,2) {$1$};
@ -167,13 +163,13 @@ seuraavassa verkossa:
 \end{samepage}
 \begin{samepage}
-Algoritmin ensimmäinen vaihe on
+The first step in the algorithm is to sort the
-järjestää verkon kaaret niiden painon mukaan.
+edges in increasing order of their weights.
-Tuloksena on seuraava lista:
+The result is the following list:
 \begin{tabular}{ll}
 \\
-kaari & paino \\
+edge & weight \\
 \hline
 5--6 & 2 \\
 1--2 & 3 \\
@ -187,11 +183,11 @@ kaari & paino \\
 \end{tabular}
 \end{samepage}
-Tämän jälkeen algoritmi käy listan läpi
+After this, the algorithm goes through the list
-ja lisää kaaren puuhun,
+and adds an edge to the tree if it joins
-jos se yhdistää kaksi erillistä komponenttia.
+two separate components.
-Aluksi jokainen solmu on omassa komponentissaan:
+Initially, each node is in its own component:
 \begin{center}
 \begin{tikzpicture}[scale=0.9]
@ -211,9 +207,9 @@ Aluksi jokainen solmu on omassa komponentissaan:
 %\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
 \end{tikzpicture}
 \end{center}
-Ensimmäinen virittävään puuhun lisättävä
+The first edge to be added to the tree is
-kaari on 5--6, joka yhdistää
+edge 5--6 that joins components
-komponentit $\{5\}$ ja $\{6\}$ komponentiksi $\{5,6\}$:
+$\{5\}$ and $\{6\}$ into component $\{5,6\}$:
 \begin{center}
 \begin{tikzpicture}
@ -234,8 +230,7 @@ komponentit $\{5\}$ ja $\{6\}$ komponentiksi $\{5,6\}$:
 %\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
 \end{tikzpicture}
 \end{center}
-Tämän jälkeen algoritmi lisää puuhun vastaavasti
+After this, edges 1--2, 3--6 and 1--5 are added in a similar way:
-kaaret 1--2, 3--6 ja 1--5:
 \begin{center}
 \begin{tikzpicture}[scale=0.9]
@ -257,18 +252,18 @@ kaaret 1--2, 3--6 ja 1--5:
 \end{tikzpicture}
 \end{center}
-Näiden lisäysten jälkeen monet
+After those steps, many components have been joined
-komponentit ovat yhdistyneet ja verkossa on kaksi
+and there are two components in the tree:
-komponenttia: $\{1,2,3,5,6\}$ ja $\{4\}$.
+$\{1,2,3,5,6\}$ and $\{4\}$.
-Seuraavaksi käsiteltävä kaari on 2--3,
+The next edge in the list is edge 2--3,
-mutta tämä kaari ei tule mukaan puuhun,
+but it will not be included in the tree because
-koska solmut 2 ja 3 ovat jo samassa komponentissa.
+nodes 2 and 3 are already in the same component.
-Vastaavasta syystä myöskään kaari 2--5 ei tule mukaan puuhun.
+For the same reason, edge 2--5 will not be added
+to the tree.
 \begin{samepage}
-Lopuksi puuhun tulee kaari 4--6,
+Finally, edge 4--6 will be included in the tree:
-joka luo yhden komponentin:
 \begin{center}
 \begin{tikzpicture}[scale=0.9]
@ -291,29 +286,26 @@ joka luo yhden komponentin:
 \end{center}
 \end{samepage}
-Tämän lisäyksen jälkeen algoritmi päättyy,
+After this, the algorithm terminates because
-koska kaikki solmut on kytketty toisiinsa kaarilla
+there is a path between any two nodes and
-ja verkko on yhtenäinen.
+the graph is connected.
-Tuloksena on verkon pienin virittävä puu,
+The resulting graph is a minimum spanning tree
-jonka paino on $2+3+3+5+7=20$.
+with weight $2+3+3+5+7=20$.
-\subsubsection{Miksi algoritmi toimii?}
+\subsubsection{Why does this work?}
-On hyvä kysymys, miksi Kruskalin algoritmi
+It's a good question why Kruskal's algorithm works.
-toimii aina eli miksi ahne strategia tuottaa
+Why does the greedy strategy guarantee that we
-varmasti pienimmän mahdollisen virittävän puun.
+will find a minimum spanning tree?
-Voimme perustella algoritmin toimivuuden
+Let's see what happens if the lightest edge in
-tekemällä vastaoletuksen, että pienimmässä
+the graph is not included in the minimum spanning tree.
-virittävässä puussa ei olisi verkon keveintä kaarta.
+For example, assume that a minimum spanning tree
-Oletetaan esimerkiksi, että äskeisen verkon
+for the above graph would not contain the edge
-pienimmässä virittävässä puussa ei olisi
+between nodes 5 and 6 with weight 2.
-2:n painoista kaarta solmujen 5 ja 6 välillä.
+We don't know exactly what the new minimum spanning tree
-Emme tiedä tarkalleen, millainen uusi pienin
+would look like, but still it has to contain some edges.
-virittävä puu olisi, mutta siinä täytyy olla
+Assume that the tree would be as follows:
-kuitenkin joukko kaaria.
-Oletetaan, että virittävä puu olisi
-vaikkapa seuraavanlainen:
 \begin{center}
 \begin{tikzpicture}[scale=0.9]
@ -332,12 +324,12 @@ vaikkapa seuraavanlainen:
 \end{tikzpicture}
 \end{center}
-Ei ole kuitenkaan mahdollista,
+However, it's not possible that the above tree
-että yllä oleva virittävä puu olisi todellisuudessa
+would actually be a minimum spanning tree for the graph.
-verkon pienin virittävä puu.
+The reason for this is that we can remove an edge
-Tämä johtuu siitä, että voimme poistaa siitä
+from it and replace it with the edge with weight 2.
-jonkin kaaren ja korvata sen 2:n painoisella kaarella.
+This produces a spanning tree whose weight is
-Tuloksena on virittävä puu, jonka paino on \emph{pienempi}:
+\emph{smaller}:
 \begin{center}
 \begin{tikzpicture}[scale=0.9]
@ -356,55 +348,53 @@ Tuloksena on virittävä puu, jonka paino on \emph{pienempi}:
 \end{tikzpicture}
 \end{center}
-Niinpä on aina optimaalinen ratkaisu valita pienimpään
+For this reason, it is always optimal to include the lightest edge
-virittävään puuhun verkon kevein kaari.
+in the minimum spanning tree.
-Vastaavalla tavalla voimme perustella
+Using a similar argument, we can show that we
-seuraavaksi keveimmän kaaren valinnan, jne.
+can also add the second lightest edge to the tree, and so on.
-Niinpä Kruskalin algoritmi toimii oikein ja
+Thus, Kruskal's algorithm works correctly and
-tuottaa aina pienimmän virittävän puun.
+always produces a minimum spanning tree.
-\subsubsection{Toteutus}
+\subsubsection{Implementation}
-Kruskalin algoritmi on mukavinta toteuttaa
+Kruskal's algorithm can be conveniently
-kaarilistan avulla. Algoritmin ensimmäinen vaihe
+implemented using an edge list.
-on järjestää kaaret painojärjestykseen,
+The first phase of the algorithm sorts the
-missä kuluu aikaa $O(m \log m)$.
+edges in $O(m \log m)$ time.
-Tämän jälkeen seuraa algoritmin toinen vaihe,
+After this, the second phase of the algorithm
-jossa listalta valitaan kaaret mukaan puuhun.
+builds the minimum spanning tree. 
-Algoritmin toinen vaihe rakentuu seuraavanlaisen silmukan ympärille:
+The second phase of the algorithm looks as follows:
 \begin{lstlisting}
 for (...) {
-  if (!sama(a,b)) liita(a,b);
+  if (!same(a,b)) unite(a,b);
 }
 \end{lstlisting}
-Silmukka käy läpi kaikki listan kaaret
+The loop goes through the edges in the list
-niin, että muuttujat $a$ ja $b$ ovat kulloinkin kaaren
+and always processes an edge $a$--$b$
-päissä olevat solmut.
+where $a$ and $b$ are two nodes.
-Koodi käyttää kahta funktiota:
+The code uses two functions:
-funktio \texttt{sama} tutkii,
+the function \texttt{same} determines
-ovatko solmut samassa komponentissa,
+if the nodes are in the same component,
-ja funktio \texttt{liita}
+and the function \texttt{unite}
-yhdistää kaksi komponenttia toisiinsa.
+joins two components into a single component.
-Ongelmana on, kuinka toteuttaa tehokkaasti
+The problem is how to efficiently implement
-funktiot \texttt{sama} ja \texttt{liita}.
+the functions \texttt{same} and \texttt{unite}.
-Yksi mahdollisuus on pitää yllä verkkoa tavallisesti
+One possibility is to maintain the graph
-ja toteuttaa funktio \texttt{sama} verkon läpikäyntinä.
+in a usual way and implement the function
-Tällöin kuitenkin funktion \texttt{sama}
+\texttt{same} as graph traversal.
-suoritus veisi aikaa $O(n+m)$,
+However, using this technique,
-mikä on hidasta, koska funktiota kutsutaan
+the running time of the function \texttt{same} would be $O(n+m)$,
-jokaisen kaaren kohdalla.
+and this would be slow because the function is
+called for each edge in the graph.
-Seuraavaksi esiteltävä union-find-rakenne
+We will solve the problem using a union-find structure
-ratkaisee asian.
+that implements both functions in $O(\log n)$ time.
-Se toteuttaa molemmat funktiot
+Thus, the time complexity of Kruskal's algorithm
-ajassa $O(\log n)$,
+will be only $O(m \log n)$ after sorting the edge list.
-jolloin Kruskalin algoritmin
-aikavaativuus on vain $O(m \log n)$
-kaarilistan järjestämisen jälkeen.
 \section{Union-find-rakenne}
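
The implementation outlined in the section added above (sort the edge list, then add each edge whose endpoints are not yet in the same component) could be sketched in C++ roughly as follows. This is only an illustration under stated assumptions, not the code from the chapter: the names `Edge`, `UnionFind`, `kruskal` and `example_edges` are hypothetical, the union-find uses union by size without path compression so that `same` and `unite` run in $O(\log n)$ time, and the weights of edges 2--3, 2--5 and 3--4 are assumed (5, 6 and 9) because the diff elides that part of the edge table.

```cpp
#include <algorithm>
#include <numeric>
#include <utility>
#include <vector>

// An undirected edge a--b with weight w (hypothetical helper type).
struct Edge {
    int a, b, w;
};

// Union-find with union by size; nodes are numbered 1..n.
struct UnionFind {
    std::vector<int> link, sz;

    explicit UnionFind(int n) : link(n + 1), sz(n + 1, 1) {
        // Initially, each node is the representative of its own component.
        std::iota(link.begin(), link.end(), 0);
    }

    // Follows links until the representative of x's component is reached.
    int find(int x) const {
        while (x != link[x]) x = link[x];
        return x;
    }

    // Returns true if a and b are in the same component.
    bool same(int a, int b) const { return find(a) == find(b); }

    // Joins the components of a and b, attaching the smaller to the larger.
    void unite(int a, int b) {
        a = find(a);
        b = find(b);
        if (a == b) return;
        if (sz[a] < sz[b]) std::swap(a, b);
        sz[a] += sz[b];
        link[b] = a;
    }
};

// Returns the total weight of a minimum spanning tree of a connected graph
// with nodes 1..n. With maximum == true, the edges are processed in
// decreasing order of weight, which gives a maximum spanning tree instead.
long long kruskal(int n, std::vector<Edge> edges, bool maximum = false) {
    // Phase 1: sort the edge list by weight, O(m log m).
    std::sort(edges.begin(), edges.end(),
              [maximum](const Edge& x, const Edge& y) {
                  return maximum ? x.w > y.w : x.w < y.w;
              });

    // Phase 2: add every edge that joins two separate components.
    UnionFind uf(n);
    long long total = 0;
    for (const Edge& e : edges) {
        if (!uf.same(e.a, e.b)) {
            uf.unite(e.a, e.b);
            total += e.w;
        }
    }
    return total;
}

// The example graph; the weights of 2--3, 2--5 and 3--4 are assumptions.
std::vector<Edge> example_edges() {
    return {{5, 6, 2}, {1, 2, 3}, {3, 6, 3}, {1, 5, 5},
            {2, 3, 5}, {2, 5, 6}, {4, 6, 7}, {3, 4, 9}};
}
```

Under these assumptions, `kruskal(6, example_edges())` returns 20 and `kruskal(6, example_edges(), true)` returns 32, which agrees with the minimum and maximum spanning tree weights quoted in the chapter text.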