Chapter 2 first version

2016-12-29 20:51:57 +02:00 · 2016-12-29 20:51:57 +02:00 · b125c282a3
parent e64eb60b85
commit b125c282a3
1 changed files with 124 additions and 126 deletions
--- a/luku02.tex
+++ b/luku02.tex
@ -270,43 +270,43 @@ All the above time complexities except
 $O(2^n)$ and $O(n!)$ are polynomial.
 In practice, the constant $k$ is usually small,
 and therefore a polynomial time complexity
-means that the algorithm is \emph{efficient}.
+roughly means that the algorithm is \emph{efficient}.
 \index{NP-hard problem}
 Most algorithms in this book are polynomial.
 Still, there are many important problems for which
 no polynomial algorithm is known, i.e.,
-nobody knows how to solve the problem efficiently.
+nobody knows how to solve them efficiently.
 \key{NP-hard} problems are an important set
 of problems for which no polynomial algorithm is known.
-\section{Tehokkuuden arviointi}
+\section{Estimating efficiency}
-Aikavaativuuden hyötynä on,
+By calculating the time complexity,
-että sen avulla voi arvioida ennen algoritmin
+it is possible to check before the implementation that
-toteuttamista, onko algoritmi riittävän nopea
+an algorithm is efficient enough for the problem.
-tehtävän ratkaisemiseen.
+The starting point for the estimation is the fact that
-Lähtökohtana arviossa on, että nykyaikainen tietokone
+a modern computer can perform some hundreds of
-pystyy suorittamaan sekunnissa joitakin
+millions of operations in a second.
 satoja miljoonia koodissa olevia komentoja.
-Oletetaan esimerkiksi, että tehtävän aikaraja on 
+For example, assume that the time limit for
-yksi sekunti ja syötteen koko on $n=10^5$.
+a problem is one second and the input size is $n=10^5$.
-Jos algoritmin aikavaativuus on $O(n^2)$,
+If the time complexity is $O(n^2)$,
-algoritmi suorittaa noin $(10^5)^2=10^{10}$ komentoa.
+the algorithm will perform about $(10^5)^2=10^{10}$ operations.
-Tähän kuluu aikaa arviolta kymmeniä sekunteja,
+This should take some tens of seconds,
-joten algoritmi vaikuttaa liian hitaalta tehtävän ratkaisemiseen.
+so the algorithm seems to be too slow for solving the problem.
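The arithmetic above can be sketched as a small helper. This is only a back-of-the-envelope aid: the function name \texttt{estimate\_seconds} and the default speed of $5 \cdot 10^8$ operations per second are assumptions made for illustration (one possible reading of ``some hundreds of millions''), not measured values.

```cpp
// Rough running-time estimate: operation count divided by machine speed.
// ops_per_second is an assumed figure, not a benchmark result.
double estimate_seconds(double operations, double ops_per_second = 5e8) {
    return operations / ops_per_second;
}
```

Under this assumption, \texttt{estimate\_seconds(1e10)} gives 20 seconds for the $O(n^2)$ algorithm with $n=10^5$, far beyond a one-second limit, whereas an algorithm performing about $n \log_2 n \approx 1.7 \cdot 10^6$ operations stays well below it.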
-Käänteisesti syötteen koosta voi päätellä,
+On the other hand, given the input size,
-kuinka tehokasta algoritmia tehtävän laatija odottaa
+we can try to guess
-ratkaisijalta.
+the desired time complexity of the algorithm
-Seuraavassa taulukossa on joitakin hyödyllisiä arvioita,
+that solves the problem.
-jotka olettavat, että tehtävän aikaraja on yksi sekunti.
+The following table contains some useful estimates
 assuming that the time limit is one second.
 \begin{center}
 \begin{tabular}{ll}
-syötteen koko ($n$) & haluttu aikavaativuus \\
+input size ($n$) & desired time complexity \\
 \hline
 $n \le 10^{18}$ & $O(1)$ or $O(\log n)$ \\
 $n \le 10^{12}$ & $O(\sqrt n)$ \\
@ -318,41 +318,44 @@ $n \le 10$ & $O(n!)$ \\
 \end{tabular}
 \end{center}
-Esimerkiksi jos syötteen koko on $n=10^5$,
+For example, if the input size is $n=10^5$,
-tehtävän laatija odottaa luultavasti
+it is probably expected that the time
-algoritmia, jonka aikavaativuus on $O(n)$ tai $O(n \log n)$.
+complexity of the algorithm should be $O(n)$ or $O(n \log n)$.
-Tämä tieto helpottaa algoritmin suunnittelua,
+This information makes it easier to design an algorithm
-koska se rajaa pois monia lähestymistapoja,
+because it rules out approaches that would yield
-joiden tuloksena olisi hitaampi aikavaativuus.
+an algorithm with a worse time complexity.
-\index{vakiokerroin}
+\index{constant factor}
-Aikavaativuus ei kerro kuitenkaan kaikkea algoritmin
+Still, it is important to remember that a
-tehokkuudesta, koska se kätkee toteutuksessa olevat
+time complexity does not tell us everything about
-\key{vakiokertoimet}. Esimerkiksi aikavaativuuden $O(n)$
+the efficiency because it hides the \key{constant factors}.
-algoritmi voi tehdä käytännössä $n/2$ tai $5n$ operaatiota.
+For example, an algorithm that runs in $O(n)$ time
-Tällä on merkittävä vaikutus algoritmin
+may in practice perform $n/2$ or $5n$ operations.
-todelliseen ajankäyttöön.
+This has an important effect on the actual
 running time of the algorithm.
-\section{Suurin alitaulukon summa}
+\section{Maximum subarray sum}
-\index{suurin alitaulukon summa@suurin alitaulukon summa}
+\index{maximum subarray sum}
-Usein ohjelmointitehtävän ratkaisuun on monta
+There are often several possible algorithms
-luontevaa algoritmia, joiden aikavaativuudet eroavat.
+for solving a problem such that their
-Tutustumme seuraavaksi klassiseen ongelmaan,
+time complexities are different.
-jonka suoraviivaisen ratkaisun aikavaativuus on $O(n^3)$,
+This section discusses a classic problem that
-mutta algoritmia parantamalla aikavaativuudeksi
+has a straightforward $O(n^3)$ solution.
-tulee ensin $O(n^2)$ ja lopulta $O(n)$.
+However, by designing a better algorithm it
 is possible to solve the problem in $O(n^2)$
 time and even in $O(n)$ time.
-Annettuna on taulukko, jossa on $n$ kokonaislukua
+Given an array of $n$ integers $x_1,x_2,\ldots,x_n$,
-$x_1,x_2,\ldots,x_n$, ja tehtävänä on etsiä
+our task is to find the
-taulukon \key{suurin alitaulukon summa}
+\key{maximum subarray sum}, i.e.,
-eli mahdollisimman suuri summa
+the largest possible sum of numbers
-taulukon yhtenäisellä välillä.
+in a contiguous region in the array.
-Tehtävän kiinnostavuus on siinä, että taulukossa
+The problem is interesting because there may be
-saattaa olla negatiivisia lukuja.
+negative numbers in the array.
-Esimerkiksi taulukossa
+For example, in the array
 \begin{center}
 \begin{tikzpicture}[scale=0.7]
 \draw (0,0) grid (8,1);
@ -378,7 +381,7 @@ Esimerkiksi taulukossa
 \end{tikzpicture}
 \end{center}
 \begin{samepage}
-suurimman summan $10$ tuottaa seuraava alitaulukko:
+the following subarray produces the maximum sum $10$:
 \begin{center}
 \begin{tikzpicture}[scale=0.7]
 \fill[color=lightgray] (1,0) rectangle (6,1);
@ -406,14 +409,14 @@ suurimman summan $10$ tuottaa seuraava alitaulukko:
 \end{center}
 \end{samepage}
 \subsubsection{Solution 1}
-\subsubsection{Ratkaisu 1}
+A straightforward solution for the problem
-
+is to go through all possible ways to
-Suoraviivainen ratkaisu tehtävään on käydä
+select a subarray, calculate the sum of
-läpi kaikki tavat valita alitaulukko taulukosta,
+numbers in each subarray and maintain
-laskea jokaisesta vaihtoehdosta lukujen summa
+the maximum sum.
-ja pitää muistissa suurinta summaa.
+The following code implements this algorithm:
 Seuraava koodi toteuttaa tämän algoritmin:
 \begin{lstlisting}
 int p = 0;
@ -429,22 +432,24 @@ for (int a = 1; a <= n; a++) {
 cout << p << "\n";
 \end{lstlisting}
-Koodi olettaa, että luvut on tallennettu taulukkoon \texttt{x},
+The code assumes that the numbers are stored in array \texttt{x}
-jota indeksoidaan $1 \ldots n$.
+with indices $1 \ldots n$.
-Muuttujat $a$ ja $b$ valitsevat alitaulukon ensimmäisen
+Variables $a$ and $b$ select the first and last
-ja viimeisen luvun, ja alitaulukon summa lasketaan muuttujaan $s$.
+number in the subarray,
-Muuttujassa $p$ on puolestaan paras haun aikana löydetty summa.
+and the sum of the subarray is accumulated in variable $s$.
 Variable $p$ contains the maximum sum found during the search.
-Algoritmin aikavaativuus on $O(n^3)$, koska siinä on kolme
+The time complexity of the algorithm is $O(n^3)$
-sisäkkäistä silmukkaa ja jokainen silmukka käy läpi $O(n)$ lukua.
+because it consists of three nested loops and
 each loop contains $O(n)$ steps.
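Since the listing above is shown only in part in this diff, the following is a self-contained sketch of the same idea rather than the chapter's exact code. The function name \texttt{max\_subarray\_cubic} is hypothetical, and the sketch uses a 0-indexed \texttt{std::vector} instead of the 1-indexed array \texttt{x}. Like the listing, it starts from a best sum of 0, which corresponds to allowing an empty subarray.

```cpp
#include <algorithm>
#include <vector>

// Cubic sketch: try every pair (a, b) and sum the elements x[a..b].
// Indices are 0-based here, unlike the 1-based array in the text.
long long max_subarray_cubic(const std::vector<int>& x) {
    const int n = static_cast<int>(x.size());
    long long best = 0;  // 0 corresponds to the empty subarray
    for (int a = 0; a < n; a++) {          // first element of the subarray
        for (int b = a; b < n; b++) {      // last element of the subarray
            long long sum = 0;
            for (int k = a; k <= b; k++) { // recompute the sum from scratch
                sum += x[k];
            }
            best = std::max(best, sum);
        }
    }
    return best;
}
```

For the illustrative array $[-1, 2, 4, -3, 5, 2, -5, 2]$ (values assumed here, since the figure's contents are not visible in this excerpt), the function returns 10.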
-\subsubsection{Ratkaisu 2}
+\subsubsection{Solution 2}
-Äskeistä ratkaisua on helppoa tehostaa hankkiutumalla
+It is easy to make the first solution more efficient
-eroon sisimmästä silmukasta.
+by removing one loop.
-Tämä on mahdollista laskemalla summaa samalla,
+This is possible by updating the sum at the same
-kun alitaulukon oikea reuna liikkuu eteenpäin.
+time as the right end of the subarray moves.
-Tuloksena on seuraava koodi:
+The result is the following code:
 \begin{lstlisting}
 int p = 0;
@ -457,41 +462,35 @@ for (int a = 1; a <= n; a++) {
 }
 cout << p << "\n";
 \end{lstlisting}
-Tämän muutoksen jälkeen koodin aikavaativuus on $O(n^2)$.
+After this change, the time complexity is $O(n^2)$.
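A self-contained sketch of this two-loop variant might look as follows. The name \texttt{max\_subarray\_quadratic} is hypothetical and the indexing is 0-based; the key point is that the running sum is extended by one element each time the right end moves, instead of being recomputed.

```cpp
#include <algorithm>
#include <vector>

// Quadratic sketch: for each left end a, extend the right end b one step
// at a time and keep a running sum of x[a..b].
long long max_subarray_quadratic(const std::vector<int>& x) {
    const int n = static_cast<int>(x.size());
    long long best = 0;  // 0 corresponds to the empty subarray
    for (int a = 0; a < n; a++) {
        long long sum = 0;
        for (int b = a; b < n; b++) {
            sum += x[b];                 // sum of x[a..b], updated in O(1)
            best = std::max(best, sum);
        }
    }
    return best;
}
```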
-\subsubsection{Ratkaisu 3}
+\subsubsection{Solution 3}
-Yllättävää kyllä, tehtävään on olemassa myös
+Surprisingly, it is possible to solve the problem
-$O(n)$-aikainen ratkaisu eli koodista pystyy
+in $O(n)$ time, which means that we can remove
-karsimaan vielä yhden silmukan.
+one more loop.
-Ideana on laskea taulukon jokaiseen
+The idea is to calculate for each array index
-kohtaan, mikä on suurin alitaulukon
+the maximum sum of a subarray that ends at that index.
-summa, jos alitaulukko päättyy kyseiseen kohtaan.
+After this, the answer for the problem is the
-Tämän jälkeen ratkaisu tehtävään on suurin
+maximum of those sums.
 näistä summista.
-Tarkastellaan suurimman summan tuottavan
+Consider the subproblem of finding the maximum-sum subarray
-alitaulukon etsimistä,
+for a fixed ending index $k$.
-kun valittuna on alitaulukon loppukohta $k$.
+There are two possibilities:
 Vaihtoehtoja on kaksi:
 \begin{enumerate}
-\item Alitaulukossa on vain kohdassa $k$ oleva luku.
+\item The subarray only contains the element at index $k$.
-\item Alitaulukossa on ensin jokin kohtaan $k-1$ päättyvä alitaulukko
+\item The subarray consists of a subarray that ends
-ja sen jälkeen kohdassa $k$ oleva luku.
+at index $k-1$, followed by the element at index $k$.
 \end{enumerate}
-Koska tavoitteena on löytää alitaulukko,
+Our goal is to find a subarray with maximum sum,
-jonka lukujen summa on suurin,
+so in case 2 the subarray that ends at index $k-1$
-tapauksessa 2 myös kohtaan $k-1$ päättyvän
+should also have the maximum sum.
-alitaulukon tulee olla sellainen,
+Thus, we can solve the problem efficiently
-että sen summa on suurin.
+by calculating the maximum subarray sum
-Niinpä tehokas ratkaisu syntyy käymällä läpi
+for each ending index from left to right.
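The two cases can be summarized as a recurrence. Here $s_k$ is notation introduced only for this illustration: it denotes the maximum sum of a nonempty subarray that ends at index $k$.
\[
s_1 = x_1, \qquad s_k = \max(x_k,\; s_{k-1} + x_k) \quad (k \ge 2)
\]
The first argument of the maximum corresponds to case 1 and the second to case 2. The answer is then $\max_{1 \le k \le n} s_k$, and each $s_k$ depends only on $s_{k-1}$, so a single pass from left to right suffices.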
 kaikki alitaulukon loppukohdat järjestyksessä
 ja laskemalla jokaiseen kohtaan suurin
 mahdollinen kyseiseen kohtaan päättyvän alitaulukon summa.
 Seuraava koodi toteuttaa ratkaisun:
 The following code implements the solution:
 \begin{lstlisting}
 int p = 0, s = 0;
 for (int k = 1; k <= n; k++) {
@ -501,28 +500,28 @@ for (int k = 1; k <= n; k++) {
 cout << p << "\n";
 \end{lstlisting}
-Algoritmissa on vain yksi silmukka,
+The algorithm only contains one loop
-joka käy läpi taulukon luvut,
+that goes through the input,
-joten sen aikavaativuus on $O(n)$.
+so the time complexity is $O(n)$.
-Tämä on myös paras mahdollinen aikavaativuus,
+This is also the best possible time complexity,
-koska minkä tahansa algoritmin täytyy käydä
+because any algorithm for the problem
-läpi ainakin kerran taulukon sisältö.
+has to access all array elements at least once.
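As a self-contained sketch of the single-loop idea (the approach commonly known as Kadane's algorithm), assuming a 0-indexed \texttt{std::vector} and the hypothetical name \texttt{max\_subarray\_linear}:

```cpp
#include <algorithm>
#include <vector>

// Linear sketch: sum holds the best sum of a subarray ending at the
// current element; best holds the largest such sum seen so far.
long long max_subarray_linear(const std::vector<int>& x) {
    long long best = 0;  // 0 corresponds to the empty subarray
    long long sum = 0;
    for (int value : x) {
        // Either start a new subarray at this element (case 1)
        // or extend the best subarray ending just before it (case 2).
        sum = std::max<long long>(value, sum + value);
        best = std::max(best, sum);
    }
    return best;
}
```

On the illustrative array $[-1, 2, 4, -3, 5, 2, -5, 2]$ (values assumed for this example), the running value \texttt{sum} takes the values $-1, 2, 6, 3, 8, 10, 5, 7$, so the function returns 10.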
-\subsubsection{Tehokkuusvertailu}
+\subsubsection{Efficiency comparison}
-On kiinnostavaa tutkia, kuinka tehokkaita algoritmit
+It is interesting to study how efficient the
-ovat käytännössä.
+algorithms are in practice.
-Seuraava taulukko näyttää, kuinka nopeasti äskeiset
+The following table shows the running times
-ratkaisut toimivat eri $n$:n arvoilla
+of the above algorithms for different
-nykyaikaisella tietokoneella.
+values of $n$ on a modern computer.
-Jokaisessa testissä syöte on muodostettu satunnaisesti.
+In each test, the input was generated randomly.
-Ajankäyttöön ei ole laskettu syötteen lukemiseen
+The time needed for reading the input was not
-kuluvaa aikaa.
+measured.
 \begin{center}
 \begin{tabular}{rrrr}
-taulukon koko $n$ & ratkaisu 1 & ratkaisu 2 & ratkaisu 3 \\
+array size $n$ & solution 1 & solution 2 & solution 3 \\
 \hline
 $10^2$ & $0{,}0$ s & $0{,}0$ s & $0{,}0$ s \\
 $10^3$ & $0{,}1$ s & $0{,}0$ s & $0{,}0$ s \\
@ -533,13 +532,12 @@ $10^7$ & > $10,0$ s & > $10,0$ s & $0{,}0$ s \\
 \end{tabular}
 \end{center}
-Vertailu osoittaa,
+The comparison shows that all algorithms
-että pienillä syötteillä kaikki algoritmit
+are efficient when the input size is small,
-ovat tehokkaita,
+but larger inputs bring out remarkable
-mutta suuremmat syötteet tuovat esille
+differences in the running times of the algorithms.
-merkittäviä eroja algoritmien suoritusajassa.
+The $O(n^3)$ time solution 1 starts to slow down
-$O(n^3)$-aikainen ratkaisu 1 alkaa hidastua,
+when $n=10^3$, and the $O(n^2)$ time solution 2
-kun $n=10^3$, ja $O(n^2)$-aikainen ratkaisu 2
+starts to slow down when $n=10^4$.
-alkaa hidastua, kun $n=10^4$.
+Only the $O(n)$ time solution 3 solves
-Vain $O(n)$-aikainen ratkaisu 3 selvittää
+even the largest inputs instantly.
 suurimmatkin syötteet salamannopeasti.