Chapter 24 first version

2017-01-15 15:45:45 +02:00 · 2017-01-15 15:45:45 +02:00 · e2bf1c136b
commit e2bf1c136b
parent 452a2cfd3d
1 changed files with 117 additions and 116 deletions
--- a/luku24.tex
+++ b/luku24.tex
@@ -484,115 +484,114 @@ By calculating matrix powers efficiently,
 we can calculate in $O(n^3 \log m)$ time
 the distribution after $m$ steps.
-\section{Satunnaisalgoritmit}
+\section{Randomized algorithms}
-\index{satunnaisalgoritmi@satunnaisalgoritmi}
+\index{randomized algorithm}
-Joskus tehtävässä voi hyödyntää satunnaisuutta,
+Sometimes we can use randomness for solving a problem,
-vaikka tehtävä ei itsessään liittyisi todennäköisyyteen.
+even if the problem is not related to random events.
-\key{Satunnaisalgoritmi} on algoritmi, jonka toiminta
+A \key{randomized algorithm} is an algorithm that
-perustuu satunnaisuuteen.
+is based on randomness.
-\index{Monte Carlo -algoritmi}
+\index{Monte Carlo algorithm}
-\key{Monte Carlo -algoritmi} on satunnaisalgoritmi,
+A \key{Monte Carlo algorithm} is a randomized algorithm
-joka saattaa tuottaa joskus väärän tuloksen.
+that may sometimes give a wrong answer.
-Jotta algoritmi olisi käyttökelpoinen,
+For such an algorithm to be useful,
-väärän vastauksen todennäköisyyden tulee olla pieni.
+the probability of a wrong answer should be small.
-\index{Las Vegas -algoritmi}
+\index{Las Vegas algorithm}
-\key{Las Vegas -algoritmi} on satunnaisalgoritmi,
+A \key{Las Vegas algorithm} is a randomized algorithm
-joka tuottaa aina oikean tuloksen mutta jonka
+that always gives the correct answer,
-suoritusaika vaihtelee satunnaisesti.
+but its running time varies randomly.
-Tavoitteena on, että algoritmi toimisi nopeasti
+The goal is to design an algorithm that is
-suurella todennäköisyydellä.
+efficient with high probability.
-Tutustumme seuraavaksi kolmeen esimerkkitehtävään,
+Next we will go through three example problems that
-jotka voi ratkaista satunnaisuuden avulla.
+can be solved using randomness.
-\subsubsection{Järjestystunnusluku}
+\subsubsection{Order statistics}
-\index{järjestystunnusluku}
+\index{order statistic}
-Taulukon $k$. \key{järjestystunnusluku}
+The $k$th \key{order statistic} of an array
-on kohdassa $k$ oleva alkio,
+is the element at index $k$ after sorting
-kun alkiot järjestetään
+the array in increasing order.
-pienimmästä suurimpaan.
+It's easy to calculate any order statistic
-On helppoa laskea mikä tahansa
+in $O(n \log n)$ time by sorting the array,
-järjestystunnusluku ajassa $O(n \log n)$
+but is it really necessary to sort the entire array
-järjestämällä taulukko,
+just to find one element?
-mutta onko oikeastaan tarpeen järjestää koko taulukkoa?
-Osoittautuu, että järjestystunnusluvun
+It turns out that we can find order statistics
-voi etsiä satunnaisalgoritmilla ilman taulukon
+using a randomized algorithm without sorting the array.
-järjestämistä.
+The algorithm is a Las Vegas algorithm:
-Algoritmi on Las Vegas -tyyppinen:
+its expected running time is $O(n)$,
-sen aikavaativuus on yleensä $O(n)$,
+but $O(n^2)$ in the worst case.
-mutta pahimmassa tapauksessa $O(n^2)$.
-Algoritmi valitsee taulukosta satunnaisen alkion $x$
+The algorithm chooses a random element $x$
-ja siirtää $x$:ää pienemmät alkiot
+in the array, and moves elements smaller than $x$
-taulukon vasempaan osaan ja loput alkiot
+to the left part of the array,
-taulukon oikeaan osaan.
+and the other elements to the right part of the array.
-Tämä vie aikaa $O(n)$, kun taulukossa on $n$ alkiota.
+This takes $O(n)$ time when there are $n$ elements.
-Oletetaan, että vasemmassa osassa on $a$
+Assume that the left part contains $a$ elements
-alkiota ja oikeassa osassa on $b$ alkiota.
+and the right part contains $b$ elements.
-Nyt jos $a=k-1$, alkio $x$ on haluttu alkio.
+If $a=k-1$, element $x$ is the $k$th order statistic.
-Jos $a>k-1$, etsitään rekursiivisesti
+Otherwise, if $a>k-1$, we recursively find the $k$th order
-vasemmasta osasta, mikä on kohdassa $k$ oleva alkio.
+statistic for the left part,
-Jos taas $a<k-1$, etsitään rekursiivisesti
+and if $a<k-1$, we recursively find the $r$th order
-oikeasta osasta, mikä on kohdassa $k-a-1$ oleva alkio.
+statistic for the right part where $r=k-a-1$.
-Haku jatkuu vastaavalla tavalla rekursiivisesti,
+The search continues like this, until the element
-kunnes haluttu alkio on löytynyt.
+has been found.
-Kun alkiot $x$ valitaan satunnaisesti,
+When each element $x$ is chosen randomly,
-taulukon koko suunnilleen puolittuu
+the size of the array is roughly halved at each step,
-joka vaiheessa, joten kohdassa $k$ olevan
+so the time complexity of
-alkion etsiminen vie aikaa
+finding the $k$th order statistic is about
 \[n+n/2+n/4+n/8+\cdots=O(n).\]
-Algoritmin pahin tapaus on silti $O(n^2)$,
+The worst case for the algorithm is still $O(n^2)$,
-koska on mahdollista,
+because it is possible that $x$ is always chosen
-että $x$ valitaan sattumalta aina niin,
+in such a way that it's the smallest or largest
-että se on taulukon pienin alkio.
+element in the array.
-Silloin taulukko pienenee joka vaiheessa
+In this case, the size of the array decreases
-vain yhden alkion verran.
+only by one at each step.
-Tämän todennäköisyys on kuitenkin erittäin pieni,
+However, the probability of this is so small
-eikä näin tapahdu käytännössä.
+that it does not happen in practice.
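The selection procedure described above can be sketched in Python as follows. This is only an illustrative sketch, not code from the chapter: the function name `order_statistic` and the `rng` parameter are hypothetical, and, unlike the text, the sketch splits the array into three parts (smaller than, equal to, and larger than $x$) so that repeated values are handled without extra care.

```python
import random

def order_statistic(values, k, rng=random):
    """Return the k-th smallest element of values (k is 1-indexed).

    Hypothetical helper: a randomized selection sketch, assuming a
    three-way split so that duplicate values cannot cause a loop.
    """
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    items = list(values)  # work on a copy; the caller's list is not modified
    while True:
        x = rng.choice(items)                  # random element x
        left = [v for v in items if v < x]     # elements smaller than x
        equal = [v for v in items if v == x]   # x and its duplicates
        right = [v for v in items if v > x]    # elements larger than x
        a = len(left)
        if k <= a:
            items = left                       # continue in the left part
        elif k <= a + len(equal):
            return x                           # x is the k-th order statistic
        else:
            k -= a + len(equal)                # skip the left part and copies of x
            items = right                      # continue in the right part
```

For example, `order_statistic([7, 2, 9, 2, 5, 1], 3)` returns the third smallest value. The answer is always correct; only the number of rounds depends on the random choices, which is what makes this a Las Vegas algorithm.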
-\subsubsection{Matriisitulon tarkastaminen}
+\subsubsection{Verifying matrix multiplication}
-\index{matriisitulo@matriisitulo}
+\index{matrix multiplication}
-Seuraava tehtävämme on \emph{tarkastaa},
+Our next problem is to \emph{verify}
-päteekö matriisitulo $AB=C$, kun $A$, $B$ ja $C$
+whether $AB=C$ holds when $A$, $B$ and $C$
-ovat $n \times n$ -kokoisia matriiseja.
+are matrices of size $n \times n$.
-Tehtävän voi ratkaista laskemalla matriisitulon
+Of course, we can solve the problem
-$AB$ (perusalgoritmilla ajassa $O(n^3)$), mutta voisi toivoa,
+by calculating the product $AB$ again
-että ratkaisun tarkastaminen olisi helpompaa
+(in $O(n^3)$ time using the basic algorithm),
-kuin sen laskeminen alusta alkaen uudestaan.
+but one could hope that verifying the
+answer would be easier than calculating it again.
-Osoittautuu, että tehtävän voi ratkaista
+It turns out that we can solve the problem
-Monte Carlo -algoritmilla,
+using a Monte Carlo algorithm whose
-jonka aikavaativuus on vain $O(n^2)$.
+time complexity is only $O(n^2)$.
-Idea on yksinkertainen: valitaan satunnainen
+The idea is simple: we choose a random vector
-$n \times 1$ -matriisi $X$ ja lasketaan
+$X$ of $n$ elements, and calculate the matrices
-matriisit $ABX$ ja $CX$.
+$ABX$ and $CX$. If $ABX=CX$, we report that $AB=C$,
-Jos $ABX=CX$, ilmoitetaan, että $AB=C$,
+and otherwise we report that $AB \neq C$.
-ja muuten ilmoitetaan, että $AB \neq C$.
-Algoritmin aikavaativuus on $O(n^2)$,
+The time complexity of the algorithm is
-koska matriisien $ABX$ ja $CX$ laskeminen
+$O(n^2)$, because we can calculate the matrices
-vie aikaa $O(n^2)$.
+$ABX$ and $CX$ in $O(n^2)$ time.
-Matriisin $ABX$ tapauksessa laskennan
+We can calculate the matrix $ABX$ efficiently
-voi suorittaa osissa $A(BX)$, jolloin riittää
+using the representation $A(BX)$, so only two
-kertoa kahdesti $n \times n$- ja $n \times 1$-kokoiset
+multiplications of $n \times n$ and $n \times 1$
-matriisit.
+size matrices are needed.
-Algoritmin heikkoutena on, että on pieni mahdollisuus,
+The weakness of the algorithm is
-että algoritmi erehtyy, kun se ilmoittaa, että $AB=C$.
+that there is a small chance that the algorithm
-Esimerkiksi 
+makes a mistake when it reports that $AB=C$.
+For example,
 \[
 \begin{bmatrix}
  2 & 4 \\
@@ -604,7 +603,7 @@ Esimerkiksi
  7 & 4 \\
 \end{bmatrix},
 \]
-mutta
+but
 \[
 \begin{bmatrix}
  2 & 4 \\
@@ -624,21 +623,22 @@ mutta
  3 \\
 \end{bmatrix}.
 \]
-Käytännössä erehtymisen todennäköisyys on kuitenkin
+However, in practice, the probability that the
-pieni ja todennäköisyyttä voi pienentää lisää
+algorithm makes a mistake is small,
-tekemällä tarkastuksen usealla
+and we can decrease the probability by
-satunnaisella matriisilla $X$ ennen vastauksen
+verifying the result using multiple random vectors $X$
-$AB=C$ ilmoittamista.
+before reporting the answer $AB=C$.
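The verification idea above can be sketched in Python as follows. The names `mat_vec` and `probably_equal` and the `rounds` parameter are hypothetical, and the sketch makes one assumption that the text does not state: each entry of $X$ is drawn uniformly from $\{0,1\}$. Under that assumption, a single round detects $AB \neq C$ with probability at least $1/2$, so the chance of wrongly reporting $AB=C$ after `rounds` independent rounds is at most $2^{-\text{rounds}}$. Integer matrices are used so that the comparison is exact.

```python
import random

def mat_vec(M, v):
    """Multiply an n x n matrix (list of rows) by a vector in O(n^2) time."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def probably_equal(A, B, C, rounds=20, rng=random):
    """Monte Carlo check of AB = C using random 0/1 vectors (an assumption).

    Returns False only when AB != C for certain; returns True when no
    round found a difference, which may rarely be a wrong answer.
    """
    n = len(A)
    for _ in range(rounds):
        X = [rng.randint(0, 1) for _ in range(n)]  # random vector of n elements
        # Compute A(BX) with two matrix-vector products instead of forming AB.
        if mat_vec(A, mat_vec(B, X)) != mat_vec(C, X):
            return False                           # a witness: AB != C
    return True                                    # probably AB = C
```

Each round costs $O(n^2)$ time, so a constant number of rounds keeps the total at $O(n^2)$.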
-\subsubsection{Verkon värittäminen}
+\subsubsection{Graph coloring}
-\index{vxritys@väritys}
+\index{coloring}
-Annettuna on verkko, jossa on $n$ solmua ja $m$ kaarta.
+Given a graph that contains $n$ nodes and $m$ edges,
-Tehtävänä on etsiä tapa värittää verkon solmut kahdella värillä
+our task is to find a way to color the nodes
-niin, että ainakin $m/2$ kaaressa
+of the graph using two colors so that
-päätesolmut ovat eri väriset.
+for at least $m/2$ edges, the end nodes 
-Esimerkiksi verkossa
+have different colors.
+For example, in the graph
 \begin{center}
 \begin{tikzpicture}[scale=0.9]
 \node[draw, circle] (1) at (1,3) {$1$};
@@ -656,7 +656,7 @@ Esimerkiksi verkossa
 \path[draw,thick,-] (4) -- (5);
 \end{tikzpicture}
 \end{center}
-yksi kelvollinen väritys on seuraava:
+a valid coloring is as follows:
 \begin{center}
 \begin{tikzpicture}[scale=0.9]
 \node[draw, circle, fill=blue!40] (1) at (1,3) {$1$};
@@ -674,20 +674,21 @@ yksi kelvollinen väritys on seuraava:
 \path[draw,thick,-] (4) -- (5);
 \end{tikzpicture}
 \end{center}
-Yllä olevassa verkossa on 7 kaarta ja niistä 5:ssä
+The above graph contains 7 edges, and for 5 of them,
-päätesolmut ovat eri väriset,
+the end nodes have different colors,
-joten väritys on kelvollinen.
+so the coloring is valid.
-Tehtävä on mahdollista ratkaista Las Vegas -algoritmilla
+The problem can be solved using a Las Vegas algorithm
-muodostamalla satunnaisia värityksiä niin kauan,
+that generates random colorings until a valid coloring
-kunnes syntyy kelvollinen väritys.
+has been found.
-Satunnaisessa värityksessä jokaisen solmun väri on
+In a random coloring, the color of each node is
-valittu toisistaan riippumatta niin,
+independently chosen so that the probability of
-että kummankin värin todennäköisyys on $1/2$.
+each color is $1/2$.
-Satunnaisessa värityksessä todennäköisyys, että yksittäisen kaaren päätesolmut
+In a random coloring, the probability that the end nodes
-ovat eri väriset on $1/2$. Niinpä odotusarvo, monessako kaaressa
+of a single edge have different colors is $1/2$.
-päätesolmut ovat eri väriset, on $1/2 \cdot m = m/2$.
+Hence, the expected number of edges whose end nodes
-Koska satunnainen väritys on odotusarvoisesti kelvollinen,
+have different colors is $1/2 \cdot m = m/2$.
-jokin kelvollinen väritys löytyy käytännössä nopeasti.
+Since a random coloring is expected to be valid,
+we will find a valid coloring quickly in practice.
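The coloring procedure above can be sketched in Python as follows. The names `cut_size` and `random_valid_coloring` are hypothetical, colors are represented as 0 and 1, and the small graph in the usage example is made up for illustration; it is not the graph drawn in the chapter.

```python
import random

def cut_size(edges, color):
    """Count the edges whose end nodes have different colors."""
    return sum(1 for a, b in edges if color[a] != color[b])

def random_valid_coloring(nodes, edges, rng=random):
    """Las Vegas sketch: draw random colorings until one is valid.

    A coloring is valid when at least m/2 of the m edges have end
    nodes of different colors.
    """
    while True:
        # Each node independently gets color 0 or 1 with probability 1/2.
        color = {v: rng.randint(0, 1) for v in nodes}
        if 2 * cut_size(edges, color) >= len(edges):
            return color

# Hypothetical example graph: a cycle 1-2-3-4 with the extra edge 1-3.
example_edges = [(1, 2), (2, 3), (3, 4), (4, 1), (1, 3)]
example_coloring = random_valid_coloring([1, 2, 3, 4], example_edges)
```

The returned coloring is always valid, and only the number of attempts is random. Checking the condition as `2 * cut_size(...) >= len(edges)` avoids fractions when $m$ is odd.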