35 changed files with 25770 additions and 3 deletions
--- a/README.md
+++ b/README.md
@ -1,4 +1,22 @@
-# cphb
+# Competitive Programmer's Handbook

-SOI adjusted Competitive Programmer's Handbook
-(see https://github.com/pllk/cphb for the original)
+Competitive Programmer's Handbook is a modern introduction to competitive programming.
+The book discusses programming tricks and algorithm design techniques relevant in competitive programming.
+
+## CSES Problem Set
+
+The CSES Problem Set contains a collection of competitive programming problems.
+You can practice the techniques presented in the book by solving the problems.
+
+https://cses.fi/problemset/
+
+## License
+
+The license of the book is Creative Commons BY-NC-SA.
+
+## Other books
+
+Guide to Competitive Programming is a printed book, published by Springer, based on Competitive Programmer's Handbook.
+There are also a Russian edition, Олимпиадное программирование (Olympiad Programming), and a Korean edition, 알고리즘 트레이닝: 프로그래밍 대회 입문 가이드.
+
+https://cses.fi/book/
--- a/book.pdf
+++ b/book.pdf
--- a/book.tex
+++ b/book.tex
@ -0,0 +1,131 @@
+\documentclass[twoside,12pt,a4paper,english]{book}
+
+%\includeonly{chapter04,list}
+
+\usepackage[english]{babel}
+\usepackage[utf8]{inputenc}
+\usepackage{listings}
+\usepackage[table]{xcolor}
+\usepackage{tikz}
+\usepackage{multicol}
+\usepackage{hyperref}
+\usepackage{array}
+\usepackage{microtype}
+
+\usepackage{fouriernc}
+\usepackage[T1]{fontenc}
+
+\usepackage{graphicx}
+\usepackage{framed}
+\usepackage{amssymb}
+\usepackage{amsmath}
+
+\usepackage{pifont}
+\usepackage{ifthen}
+\usepackage{makeidx}
+\usepackage{enumitem}
+
+\usepackage{titlesec}
+
+\usepackage{skak}
+\usepackage[scaled=0.95]{inconsolata}
+
+
+\usetikzlibrary{patterns,snakes}
+\pagestyle{plain}
+
+\definecolor{keywords}{HTML}{44548A}
+\definecolor{strings}{HTML}{00999A}
+\definecolor{comments}{HTML}{990000}
+
+\lstset{language=C++,frame=single,basicstyle=\ttfamily \small,showstringspaces=false,columns=flexible}
+\lstset{
+  literate={ö}{{\"o}}1
+           {ä}{{\"a}}1
+           {ü}{{\"u}}1
+}
+\lstset{xleftmargin=20pt,xrightmargin=5pt}
+\lstset{aboveskip=12pt,belowskip=8pt}
+
+\lstset{
+    commentstyle=\color{comments},
+    keywordstyle=\color{keywords},
+    stringstyle=\color{strings}
+}
+
+\date{Draft \today}
+
+\usepackage[a4paper,vmargin=30mm,hmargin=33mm,footskip=15mm]{geometry}
+
+\title{\Huge Competitive Programmer's Handbook}
+\author{\Large Antti Laaksonen}
+
+\makeindex
+\usepackage[totoc]{idxlayout}
+
+\titleformat{\subsubsection}
+{\normalfont\large\bfseries\sffamily}{\thesubsection}{1em}{}
+
+\begin{document}
+
+%\selectlanguage{finnish}
+
+%\setcounter{page}{1}
+%\pagenumbering{roman}
+
+\frontmatter
+\maketitle
+\setcounter{tocdepth}{1}
+\tableofcontents
+
+\include{preface}
+
+\mainmatter
+\pagenumbering{arabic}
+\setcounter{page}{1}
+
+\newcommand{\key}[1] {\textbf{#1}}
+
+\part{Basic techniques}
+\include{chapter01}
+\include{chapter02}
+\include{chapter03}
+\include{chapter04}
+\include{chapter05}
+\include{chapter06}
+\include{chapter07}
+\include{chapter08}
+\include{chapter09}
+\include{chapter10}
+\part{Graph algorithms}
+\include{chapter11}
+\include{chapter12}
+\include{chapter13}
+\include{chapter14}
+\include{chapter15}
+\include{chapter16}
+\include{chapter17}
+\include{chapter18}
+\include{chapter19}
+\include{chapter20}
+\part{Advanced topics}
+\include{chapter21}
+\include{chapter22}
+\include{chapter23}
+\include{chapter24}
+\include{chapter25}
+\include{chapter26}
+\include{chapter27}
+\include{chapter28}
+\include{chapter29}
+\include{chapter30}
+
+\cleardoublepage
+\phantomsection
+\addcontentsline{toc}{chapter}{Bibliography}
+\include{list}
+
+\cleardoublepage
+\printindex
+
+\end{document}
--- a/chapter01.tex
+++ b/chapter01.tex
@ -0,0 +1,990 @@
+\chapter{Introduction}
+
+Competitive programming combines two topics:
+(1) the design of algorithms and (2) the implementation of algorithms.
+
+The \key{design of algorithms} consists of problem solving
+and mathematical thinking.
+Skills for analyzing problems and solving them
+creatively are needed.
+An algorithm for solving a problem
+has to be both correct and efficient,
+and the core of the problem is often
+about inventing an efficient algorithm.
+
+Theoretical knowledge of algorithms
+is important to competitive programmers.
+Typically, a solution to a problem is
+a combination of well-known techniques and
+new insights.
+The techniques that appear in competitive programming
+also form the basis for the scientific research
+of algorithms.
+
+The \key{implementation of algorithms} requires good
+programming skills.
+In competitive programming, the solutions
+are graded by testing an implemented algorithm
+using a set of test cases.
+Thus, it is not enough that the idea of the
+algorithm is correct, but the implementation also
+has to be correct.
+
+A good coding style in contests is
+straightforward and concise.
+Programs should be written quickly,
+because there is not much time available.
+Unlike in traditional software engineering,
+the programs are short (usually at most a few
+hundred lines of code), and they do not need to 
+be maintained after the contest.
+
+\section{Programming languages}
+
+\index{programming language}
+
+At the moment, the most popular programming
+languages used in contests are C++, Python and Java.
+For example, in Google Code Jam 2017,
+among the best 3,000 participants,
+79 \% used C++,
+16 \% used Python and
+8 \% used Java \cite{goo17}.
+Some participants also used several languages.
+
+Many people think that C++ is the best choice
+for a competitive programmer,
+and C++ is nearly always available in
+contest systems.
+The benefits of using C++ are that
+it is a very efficient language and
+its standard library contains a 
+large collection
+of data structures and algorithms.
+
+On the other hand, it is good to
+master several languages and understand
+their strengths.
+For example, if large integers are needed
+in the problem,
+Python can be a good choice, because it
+contains built-in operations for
+calculating with large integers.
+Still, most problems in programming contests
+are set so that
+using a specific programming language
+is not an unfair advantage.
+
+All example programs in this book are written in C++,
+and the standard library's
+data structures and algorithms are often used.
+The programs follow the C++11 standard,
+which can be used in most contests nowadays.
+If you cannot program in C++ yet,
+now is a good time to start learning.
+
+\subsubsection{C++ code template}
+
+A typical C++ code template for competitive programming
+looks like this:
+
+\begin{lstlisting}
+#include <bits/stdc++.h>
+
+using namespace std;
+
+int main() {
+    // solution comes here
+}
+\end{lstlisting}
+
+The \texttt{\#include} line at the beginning
+of the code is a feature of the \texttt{g++} compiler
+that allows us to include the entire standard library.
+Thus, there is no need to separately include
+headers such as \texttt{iostream},
+\texttt{vector} and \texttt{algorithm};
+they are available automatically.
+
+The \texttt{using} line declares
+that the classes and functions
+of the standard library can be used directly
+in the code.
+Without the \texttt{using} line we would have
+to write, for example, \texttt{std::cout},
+but now it suffices to write \texttt{cout}.
+
+The code can be compiled using the following command:
+
+\begin{lstlisting}
+g++ -std=c++11 -O2 -Wall test.cpp -o test
+\end{lstlisting}
+
+This command produces a binary file \texttt{test}
+from the source code \texttt{test.cpp}.
+The compiler follows the C++11 standard
+(\texttt{-std=c++11}),
+optimizes the code (\texttt{-O2})
+and shows warnings about possible errors (\texttt{-Wall}).
+
+\section{Input and output}
+
+\index{input and output}
+
+In most contests, standard streams are used for
+reading input and writing output.
+In C++, the standard streams are
+\texttt{cin} for input and \texttt{cout} for output.
+In addition, the C functions
+\texttt{scanf} and \texttt{printf} can be used.
+
+The input for the program usually consists of
+numbers and strings that are separated with
+spaces and newlines.
+They can be read from the \texttt{cin} stream
+as follows:
+
+\begin{lstlisting}
+int a, b;
+string x;
+cin >> a >> b >> x;
+\end{lstlisting}
+
+This kind of code always works,
+assuming that there is at least one space
+or newline between consecutive elements in the input.
+For example, the above code can read
+both of the following inputs:
+\begin{lstlisting}
+123 456 monkey
+\end{lstlisting}
+\begin{lstlisting}
+123    456
+monkey
+\end{lstlisting}
+The \texttt{cout} stream is used for output
+as follows:
+\begin{lstlisting}
+int a = 123, b = 456;
+string x = "monkey";
+cout << a << " " << b << " " << x << "\n";
+\end{lstlisting}
+
+Input and output are sometimes
+a bottleneck in the program.
+The following lines at the beginning of the code
+make input and output more efficient:
+
+\begin{lstlisting}
+ios::sync_with_stdio(0);
+cin.tie(0);
+\end{lstlisting}
+
+Note that the newline \texttt{"\textbackslash n"}
+works faster than \texttt{endl},
+because \texttt{endl} always causes
+a flush operation.
+
+The C functions \texttt{scanf}
+and \texttt{printf} are an alternative
+to the C++ standard streams.
+They are usually a bit faster,
+but they are also more difficult to use.
+The following code reads two integers from the input:
+\begin{lstlisting}
+int a, b;
+scanf("%d %d", &a, &b);
+\end{lstlisting}
+The following code prints two integers:
+\begin{lstlisting}
+int a = 123, b = 456;
+printf("%d %d\n", a, b);
+\end{lstlisting}
+
+Sometimes the program should read a whole line
+from the input, possibly containing spaces.
+This can be accomplished by using the
+\texttt{getline} function:
+
+\begin{lstlisting}
+string s;
+getline(cin, s);
+\end{lstlisting}
+
+If the amount of data is unknown, the following
+loop is useful:
+\begin{lstlisting}
+while (cin >> x) {
+    // code
+}
+\end{lstlisting}
+This loop reads elements from the input
+one after another, until there is no
+more data available in the input.
+
+In some contest systems, files are used for
+input and output.
+An easy solution for this is to write
+the code as usual using standard streams,
+but add the following lines to the beginning of the code:
+\begin{lstlisting}
+freopen("input.txt", "r", stdin);
+freopen("output.txt", "w", stdout);
+\end{lstlisting}
+After this, the program reads the input from the file
+``input.txt'' and writes the output to the file
+``output.txt''.
+
+\section{Working with numbers}
+
+\index{integer}
+
+\subsubsection{Integers}
+
+The most used integer type in competitive programming
+is \texttt{int}, which is a 32-bit type with
+a value range of $-2^{31} \ldots 2^{31}-1$
+or about $-2 \cdot 10^9 \ldots 2 \cdot 10^9$.
+If the type \texttt{int} is not enough,
+the 64-bit type \texttt{long long} can be used.
+It has a value range of $-2^{63} \ldots 2^{63}-1$
+or about $-9 \cdot 10^{18} \ldots 9 \cdot 10^{18}$.
+
+The following code defines a
+\texttt{long long} variable:
+\begin{lstlisting}
+long long x = 123456789123456789LL;
+\end{lstlisting}
+The suffix \texttt{LL} means that the
+type of the number is \texttt{long long}.
+
+A common mistake when using the type \texttt{long long}
+is that the type \texttt{int} is still used somewhere
+in the code.
+For example, the following code contains
+a subtle error:
+
+\begin{lstlisting}
+int a = 123456789;
+long long b = a*a;
+cout << b << "\n"; // -1757895751
+\end{lstlisting}
+
+Even though the variable \texttt{b} is of type \texttt{long long},
+both numbers in the expression \texttt{a*a}
+are of type \texttt{int} and the result is
+also of type \texttt{int}.
+Because of this, the multiplication overflows before
+the result is stored in \texttt{b}, and \texttt{b} will
+contain a wrong result.
+The problem can be solved by changing the type
+of \texttt{a} to \texttt{long long} or
+by changing the expression to \texttt{(long long)a*a}.
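+
+To see the size of the problem, note that the exact product
+(worked out here only as an illustration) is
+\[123456789^2 = 15241578750190521 \approx 1.5 \cdot 10^{16},\]
+which is far above the largest \texttt{int} value
+$2^{31}-1 = 2147483647$ but well below the largest
+\texttt{long long} value $2^{63}-1 \approx 9 \cdot 10^{18}$.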
+
+Usually contest problems are set so that the
+type \texttt{long long} is enough.
+Still, it is good to know that
+the \texttt{g++} compiler also provides
+a 128-bit type \texttt{\_\_int128\_t}
+with a value range of
+$-2^{127} \ldots 2^{127}-1$ or about $-10^{38} \ldots 10^{38}$.
+However, this type is not available in all contest systems.
+
+\subsubsection{Modular arithmetic}
+
+\index{remainder}
+\index{modular arithmetic}
+
+We denote by $x \bmod m$ the remainder
+when $x$ is divided by $m$.
+For example, $17 \bmod 5 = 2$,
+because $17 = 3 \cdot 5 + 2$.
+
+Sometimes, the answer to a problem is a
+very large number but it is enough to
+output it ``modulo $m$'', i.e.,
+the remainder when the answer is divided by $m$
+(for example, ``modulo $10^9+7$'').
+The idea is that even if the actual answer
+is very large,
+it suffices to use the types
+\texttt{int} and \texttt{long long}.
+
+An important property of the remainder is that
+in addition, subtraction and multiplication,
+the remainder can be taken before the operation:
+
+\[
+\begin{array}{rcr}
+(a+b) \bmod m & = & (a \bmod m + b \bmod m) \bmod m \\
+(a-b) \bmod m & = & (a \bmod m - b \bmod m) \bmod m \\
+(a \cdot b) \bmod m & = & (a \bmod m \cdot b \bmod m) \bmod m
+\end{array}
+\]
+
+Thus, we can take the remainder after every operation
+and the numbers will never become too large.
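+
+As a small check of these rules (the numbers are chosen
+only for illustration), let $a=17$, $b=12$ and $m=5$,
+so that $a \bmod m = 2$ and $b \bmod m = 2$:
+\[
+\begin{array}{lclcl}
+(17+12) \bmod 5 & = & 29 \bmod 5 & = & 4 \\
+(2+2) \bmod 5 & = & 4 \bmod 5 & = & 4 \\
+(17 \cdot 12) \bmod 5 & = & 204 \bmod 5 & = & 4 \\
+(2 \cdot 2) \bmod 5 & = & 4 \bmod 5 & = & 4
+\end{array}
+\]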
+
+For example, the following code calculates $n!$,
+the factorial of $n$, modulo $m$:
+\begin{lstlisting}
+long long x = 1;
+for (int i = 2; i <= n; i++) {
+    x = (x*i)%m;
+}
+cout << x%m << "\n";
+\end{lstlisting}
+
+Usually we want the remainder to always
+be in the range $0 \ldots m-1$.
+However, in C++ and other languages,
+the remainder of a negative number
+is either zero or negative.
+An easy way to make sure there
+are no negative remainders is to first calculate
+the remainder as usual and then add $m$
+if the result is negative:
+\begin{lstlisting}
+x = x%m;
+if (x < 0) x += m;
+\end{lstlisting}
+However, this is only needed when there
+are subtractions in the code and the
+remainder may become negative.
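+
+For instance, assuming the C++11 rule that integer division
+truncates toward zero, the expression \texttt{-7\%5} evaluates to $-2$,
+because $-7 = (-1) \cdot 5 + (-2)$.
+Adding $m=5$ gives $-2+5=3$, which is the remainder
+in the range $0 \ldots 4$.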
+
+\subsubsection{Floating point numbers}
+
+\index{floating point number}
+
+The usual floating point types in
+competitive programming are
+the 64-bit \texttt{double}
+and \texttt{long double},
+which the \texttt{g++} compiler typically implements
+as an 80-bit type on x86 platforms.
+In most cases, \texttt{double} is enough,
+but \texttt{long double} is more accurate.
+
+The required precision of the answer
+is usually given in the problem statement.
+An easy way to output the answer is to use
+the \texttt{printf} function
+and give the number of decimal places
+in the formatting string.
+For example, the following code prints
+the value of $x$ with 9 decimal places:
+
+\begin{lstlisting}
+printf("%.9f\n", x);
+\end{lstlisting}
+
+A difficulty when using floating point numbers
+is that some numbers cannot be represented
+accurately as floating point numbers,
+and there will be rounding errors.
+For example, the result of the following code
+is surprising:
+
+\begin{lstlisting}
+double x = 0.3*3+0.1;
+printf("%.20f\n", x); // 0.99999999999999988898
+\end{lstlisting}
+
+Due to a rounding error,
+the value of \texttt{x} is a bit smaller than 1,
+while the correct value would be 1.
+
+It is risky to compare floating point numbers
+with the \texttt{==} operator,
+because it is possible that the values should be
+equal but they are not because of precision errors.
+A better way to compare floating point numbers
+is to assume that two numbers are equal
+if the difference between them is less than $\varepsilon$,
+where $\varepsilon$ is a small number.
+
+In practice, the numbers can be compared
+as follows ($\varepsilon=10^{-9}$):
+
+\begin{lstlisting}
+if (abs(a-b) < 1e-9) {
+    // a and b are equal
+}
+\end{lstlisting}
+
+Note that while floating point numbers are inaccurate,
+integers up to a certain limit can still be
+represented accurately.
+For example, using \texttt{double},
+it is possible to accurately represent all
+integers whose absolute value is at most $2^{53}$.
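+
+As an illustration (assuming the usual IEEE 754 format,
+where \texttt{double} has a 53-bit significand),
+\[2^{53} = 9007199254740992 \approx 9 \cdot 10^{15},\]
+and the next integer $2^{53}+1$ is the first positive integer
+that cannot be stored exactly in a \texttt{double}.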
+
+\section{Shortening code}
+
+Short code is ideal in competitive programming,
+because programs should be written
+as fast as possible.
+Because of this, competitive programmers often define
+shorter names for datatypes and other parts of code.
+
+\subsubsection{Type names}
+\index{typedef@\texttt{typedef}}
+Using the keyword \texttt{typedef}
+it is possible to give a shorter name
+to a datatype.
+For example, the name \texttt{long long} is long,
+so we can define a shorter name \texttt{ll}:
+\begin{lstlisting}
+typedef long long ll;
+\end{lstlisting}
+After this, the code
+\begin{lstlisting}
+long long a = 123456789;
+long long b = 987654321;
+cout << a*b << "\n";
+\end{lstlisting}
+can be shortened as follows:
+\begin{lstlisting}
+ll a = 123456789;
+ll b = 987654321;
+cout << a*b << "\n";
+\end{lstlisting}
+
+The keyword \texttt{typedef}
+can also be used with more complex types.
+For example, the following code gives
+the name \texttt{vi} for a vector of integers
+and the name \texttt{pi} for a pair
+that contains two integers.
+\begin{lstlisting}
+typedef vector<int> vi;
+typedef pair<int,int> pi;
+\end{lstlisting}
+
+\subsubsection{Macros}
+\index{macro}
+Another way to shorten code is to define
+\key{macros}.
+A macro means that certain strings in
+the code will be replaced before compilation.
+In C++, macros are defined using the
+\texttt{\#define} directive.
+
+For example, we can define the following macros:
+\begin{lstlisting}
+#define F first
+#define S second
+#define PB push_back
+#define MP make_pair
+\end{lstlisting}
+After this, the code
+\begin{lstlisting}
+v.push_back(make_pair(y1,x1));
+v.push_back(make_pair(y2,x2));
+int d = v[i].first+v[i].second;
+\end{lstlisting}
+can be shortened as follows:
+\begin{lstlisting}
+v.PB(MP(y1,x1));
+v.PB(MP(y2,x2));
+int d = v[i].F+v[i].S;
+\end{lstlisting}
+
+A macro can also have parameters
+which makes it possible to shorten loops and other
+structures.
+For example, we can define the following macro:
+\begin{lstlisting}
+#define REP(i,a,b) for (int i = a; i <= b; i++)
+\end{lstlisting}
+After this, the code
+\begin{lstlisting}
+for (int i = 1; i <= n; i++) {
+    search(i);
+}
+\end{lstlisting}
+can be shortened as follows:
+\begin{lstlisting}
+REP(i,1,n) {
+    search(i);
+}
+\end{lstlisting}
+
+Sometimes macros cause bugs that may be difficult
+to detect. For example, consider the following macro
+that calculates the square of a number:
+\begin{lstlisting}
+#define SQ(a) a*a
+\end{lstlisting}
+This macro \emph{does not} always work as expected.
+For example, the code
+\begin{lstlisting}
+cout << SQ(3+3) << "\n";
+\end{lstlisting}
+corresponds to the code
+\begin{lstlisting}
+cout << 3+3*3+3 << "\n"; // 15
+\end{lstlisting}
+
+A better version of the macro puts parentheses around
+each parameter and around the whole expression:
+\begin{lstlisting}
+#define SQ(a) ((a)*(a))
+\end{lstlisting}
+Now the code
+\begin{lstlisting}
+cout << SQ(3+3) << "\n";
+\end{lstlisting}
+corresponds to the code
+\begin{lstlisting}
+cout << ((3+3)*(3+3)) << "\n"; // 36
+\end{lstlisting}
+
+
+\section{Mathematics}
+
+Mathematics plays an important role in competitive
+programming, and it is not possible to become
+a successful competitive programmer without
+having good mathematical skills.
+This section discusses some important
+mathematical concepts and formulas that
+are needed later in the book.
+
+\subsubsection{Sum formulas}
+
+Each sum of the form
+\[\sum_{x=1}^n x^k = 1^k+2^k+3^k+\ldots+n^k,\]
+where $k$ is a positive integer,
+has a closed-form formula that is a
+polynomial of degree $k+1$.
+For example\footnote{\index{Faulhaber's formula}
+There is even a general formula for such sums, called \key{Faulhaber's formula},
+but it is too complex to be presented here.},
+\[\sum_{x=1}^n x = 1+2+3+\ldots+n = \frac{n(n+1)}{2}\]
+and
+\[\sum_{x=1}^n x^2 = 1^2+2^2+3^2+\ldots+n^2 = \frac{n(n+1)(2n+1)}{6}.\]
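+
+As a quick sanity check of both formulas (the value $n=4$
+is chosen only for illustration):
+\[1+2+3+4 = 10 = \frac{4 \cdot 5}{2}
+\hspace{10px}\textrm{and}\hspace{10px}
+1+4+9+16 = 30 = \frac{4 \cdot 5 \cdot 9}{6}.\]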
+
+An \key{arithmetic progression} is a \index{arithmetic progression}
+sequence of numbers
+where the difference between any two consecutive
+numbers is constant.
+For example,
+\[3, 7, 11, 15\]
+is an arithmetic progression with constant 4.
+The sum of an arithmetic progression can be calculated
+using the formula
+\[\underbrace{a + \cdots + b}_{n \,\, \textrm{numbers}} = \frac{n(a+b)}{2}\]
+where $a$ is the first number,
+$b$ is the last number and
+$n$ is the number of terms.
+For example,
+\[3+7+11+15=\frac{4 \cdot (3+15)}{2} = 36.\]
+The formula is based on the fact
+that the sum consists of $n$ numbers and
+the value of each number is $(a+b)/2$ on average.
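+
+One way to make this argument precise, sketched here as a
+supplementary derivation, is to write the sum $S$ twice,
+once in each direction, and add the two lines term by term:
+\[
+\begin{array}{rcl}
+S & = & a + (a+d) + \cdots + (b-d) + b \\
+S & = & b + (b-d) + \cdots + (a+d) + a \\
+2S & = & \underbrace{(a+b) + (a+b) + \cdots + (a+b)}_{n \,\, \textrm{terms}} = n(a+b)
+\end{array}
+\]
+Here $d$ denotes the constant difference (a symbol introduced
+only for this derivation); dividing by 2 gives the formula.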
+
+\index{geometric progression}
+A \key{geometric progression} is a sequence
+of numbers
+where the ratio between any two consecutive
+numbers is constant.
+For example,
+\[3,6,12,24\]
+is a geometric progression with constant 2.
+The sum of a geometric progression can be calculated
+using the formula
+\[a + ak + ak^2 + \cdots + b = \frac{bk-a}{k-1}\]
+where $a$ is the first number,
+$b$ is the last number and the
+ratio between consecutive numbers is $k$.
+For example,
+\[3+6+12+24=\frac{24 \cdot 2 - 3}{2-1} = 45.\]
+
+This formula can be derived as follows. Let
+\[ S = a + ak + ak^2 + \cdots + b .\]
+By multiplying both sides by $k$, we get
+\[ kS = ak + ak^2 + ak^3 + \cdots + bk,\]
+and solving the equation
+\[ kS-S = bk-a\]
+yields the formula.
+
+A special case of a sum of a geometric progression is the formula
+\[1+2+4+8+\ldots+2^{n-1}=2^n-1.\]
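+
+This follows from the general formula by substituting
+$a=1$, $b=2^{n-1}$ and $k=2$:
+\[\frac{bk-a}{k-1} = \frac{2^{n-1} \cdot 2 - 1}{2-1} = 2^n-1.\]
+For example, when $n=4$, we get $1+2+4+8 = 15 = 2^4-1$.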
+
+\index{harmonic sum}
+
+A \key{harmonic sum} is a sum of the form
+\[ \sum_{x=1}^n \frac{1}{x} = 1+\frac{1}{2}+\frac{1}{3}+\ldots+\frac{1}{n}.\]
+
+An upper bound for a harmonic sum is $\log_2(n)+1$.
+Namely, we can
+modify each term $1/k$ so that $k$ becomes
+the nearest power of two that does not exceed $k$.
+For example, when $n=6$, we can estimate
+the sum as follows:
+\[ 1+\frac{1}{2}+\frac{1}{3}+\frac{1}{4}+\frac{1}{5}+\frac{1}{6} \le
+1+\frac{1}{2}+\frac{1}{2}+\frac{1}{4}+\frac{1}{4}+\frac{1}{4}.\]
+This upper bound consists of
+$\lfloor \log_2(n) \rfloor+1 \le \log_2(n)+1$ parts
+($1$, $2 \cdot 1/2$, $4 \cdot 1/4$, etc.),
+and the value of each part is at most 1.
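+
+To get a feeling for how tight the bound is, we can work out
+the values for $n=6$ (rounded to three decimals):
+\[ 1+\frac{1}{2}+\cdots+\frac{1}{6} = 2.45
+\le 1 + 2 \cdot \frac{1}{2} + 3 \cdot \frac{1}{4} = 2.75
+\le \log_2(6)+1 \approx 3.585.\]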
+
+\subsubsection{Set theory}
+
+\index{set theory}
+\index{set}
+\index{intersection}
+\index{union}
+\index{difference}
+\index{subset}
+\index{universal set}
+\index{complement}
+
+A \key{set} is a collection of elements.
+For example, the set
+\[X=\{2,4,7\}\]
+contains elements 2, 4 and 7.
+The symbol $\emptyset$ denotes an empty set,
+and $|S|$ denotes the size of a set $S$,
+i.e., the number of elements in the set.
+For example, in the above set, $|X|=3$.
+
+If a set $S$ contains an element $x$,
+we write $x \in S$,
+and otherwise we write $x \notin S$.
+For example, in the above set
+\[4 \in X \hspace{10px}\textrm{and}\hspace{10px} 5 \notin X.\]
+
+\begin{samepage}
+New sets can be constructed using set operations:
+\begin{itemize}
+\item The \key{intersection} $A \cap B$ consists of elements
+that are in both $A$ and $B$.
+For example, if $A=\{1,2,5\}$ and $B=\{2,4\}$,
+then $A \cap B = \{2\}$.
+\item The \key{union} $A \cup B$ consists of elements
+that are in $A$ or $B$ or both.
+For example, if $A=\{3,7\}$ and $B=\{2,3,8\}$,
+then $A \cup B = \{2,3,7,8\}$.
+\item The \key{complement} $\bar A$ consists of elements
+that are not in $A$.
+The interpretation of a complement depends on
+the \key{universal set}, which contains all possible elements.
+For example, if $A=\{1,2,5,7\}$ and the universal set is
+$\{1,2,\ldots,10\}$, then $\bar A = \{3,4,6,8,9,10\}$.
+\item The \key{difference} $A \setminus B = A \cap \bar B$
+consists of elements that are in $A$ but not in $B$.
+Note that $B$ can contain elements that are not in $A$.
+For example, if $A=\{2,3,7,8\}$ and $B=\{3,5,8\}$,
+then $A \setminus B = \{2,7\}$.
+\end{itemize}
+\end{samepage}
+
+If each element of $A$ also belongs to $S$,
+we say that $A$ is a \key{subset} of $S$,
+denoted by $A \subset S$.
+A set $S$ always has $2^{|S|}$ subsets,
+including the empty set.
+For example, the subsets of the set $\{2,4,7\}$ are
+\begin{center}
+$\emptyset$,
+$\{2\}$, $\{4\}$, $\{7\}$, $\{2,4\}$, $\{2,7\}$, $\{4,7\}$ and $\{2,4,7\}$.
+\end{center}
+
+Some often used sets are
+$\mathbb{N}$ (natural numbers),
+$\mathbb{Z}$ (integers),
+$\mathbb{Q}$ (rational numbers) and
+$\mathbb{R}$ (real numbers).
+The set $\mathbb{N}$
+can be defined in two ways, depending
+on the situation:
+either $\mathbb{N}=\{0,1,2,\ldots\}$
+or $\mathbb{N}=\{1,2,3,\ldots\}$.
+
+We can also construct a set using a rule of the form
+\[\{f(n) : n \in S\},\]
+where $f(n)$ is some function.
+This set contains all elements of the form $f(n)$,
+where $n$ is an element in $S$.
+For example, the set
+\[X=\{2n : n \in \mathbb{Z}\}\]
+contains all even integers.
+
+\subsubsection{Logic}
+
+\index{logic}
+\index{negation}
+\index{conjunction}
+\index{disjunction}
+\index{implication}
+\index{equivalence}
+
+The value of a logical expression is either
+\key{true} (1) or \key{false} (0).
+The most important logical operators are
+$\lnot$ (\key{negation}),
+$\land$ (\key{conjunction}),
+$\lor$ (\key{disjunction}),
+$\Rightarrow$ (\key{implication}) and
+$\Leftrightarrow$ (\key{equivalence}).
+The following table shows the meanings of these operators:
+
+\begin{center}
+\begin{tabular}{rr|rrrrrr}
+$A$ & $B$ & $\lnot A$ & $\lnot B$ & $A \land B$ & $A \lor B$ & $A \Rightarrow B$ & $A \Leftrightarrow B$ \\
+\hline
+0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 \\
+0 & 1 & 1 & 0 & 0 & 1 & 1 & 0 \\
+1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\
+1 & 1 & 0 & 0 & 1 & 1 & 1 & 1 \\
+\end{tabular}
+\end{center}
+
+The expression $\lnot A$ has the opposite value of $A$.
+The expression $A \land B$ is true if both $A$ and $B$
+are true,
+and the expression $A \lor B$ is true if $A$ or $B$ or both
+are true.
+The expression $A \Rightarrow B$ is true
+if whenever $A$ is true, also $B$ is true.
+The expression $A \Leftrightarrow B$ is true
+if $A$ and $B$ are both true or both false.
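+
+As a small exercise with the table (an observation added here
+for illustration), the implication can be written using
+negation and disjunction:
+\[A \Rightarrow B \hspace{10px}\textrm{has the same value as}\hspace{10px} \lnot A \lor B.\]
+Indeed, $\lnot A \lor B$ is false only when $A=1$ and $B=0$,
+which is exactly the row where $A \Rightarrow B$ is 0.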
+
+\index{predicate}
+
+A \key{predicate} is an expression that is true or false
+depending on its parameters.
+Predicates are usually denoted by capital letters.
+For example, we can define a predicate $P(x)$
+that is true exactly when $x$ is a prime number.
+Using this definition, $P(7)$ is true but $P(8)$ is false.
+
+\index{quantifier}
+
+A \key{quantifier} connects a logical expression
+to the elements of a set.
+The most important quantifiers are
+$\forall$ (\key{for all}) and $\exists$ (\key{there is}).
+For example,
+\[\forall x (\exists y (y < x))\]
+means that for each element $x$ in the set,
+there is an element $y$ in the set
+such that $y$ is smaller than $x$.
+This is true in the set of integers,
+but false in the set of natural numbers.
+
+Using the notation described above,
+we can express many kinds of logical propositions.
+For example,
+\[\forall x ((x>1 \land \lnot P(x)) \Rightarrow (\exists a (\exists b (a > 1 \land b > 1 \land x = ab))))\]
+means that if a number $x$ is larger than 1
+and not a prime number,
+then there are numbers $a$ and $b$
+that are larger than $1$ and whose product is $x$.
+This proposition is true in the set of integers.
+
+\subsubsection{Functions}
+
+The function $\lfloor x \rfloor$ rounds the number $x$
+down to an integer, and the function
+$\lceil x \rceil$ rounds the number $x$
+up to an integer. For example,
+\[ \lfloor 3/2 \rfloor = 1 \hspace{10px} \textrm{and} \hspace{10px} \lceil 3/2 \rceil = 2.\]
+
+The functions $\min(x_1,x_2,\ldots,x_n)$
+and $\max(x_1,x_2,\ldots,x_n)$
+give the smallest and largest of values
+$x_1,x_2,\ldots,x_n$.
+For example,
+\[ \min(1,2,3)=1 \hspace{10px} \textrm{and} \hspace{10px} \max(1,2,3)=3.\]
+
+\index{factorial}
+
+The \key{factorial} $n!$ can be defined as the product
+\[n! = \prod_{x=1}^n x = 1 \cdot 2 \cdot 3 \cdot \ldots \cdot n\]
+or recursively
+\[
+\begin{array}{lcl}
+0! & = & 1 \\
+n! & = & n \cdot (n-1)! \\
+\end{array}
+\]
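+For example, unfolding the recursive definition for $n=4$ gives
+\[4! = 4 \cdot 3! = 4 \cdot 3 \cdot 2! = 4 \cdot 3 \cdot 2 \cdot 1! = 4 \cdot 3 \cdot 2 \cdot 1 \cdot 0! = 24.\]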
+
+\index{Fibonacci number}
+
+The \key{Fibonacci numbers}
+%\footnote{Fibonacci (c. 1175--1250) was an Italian mathematician.}
+arise in many situations.
+They can be defined recursively as follows:
+\[
+\begin{array}{lcl}
+f(0) & = & 0 \\
+f(1) & = & 1 \\
+f(n) & = & f(n-1)+f(n-2) \\
+\end{array}
+\]
+The first Fibonacci numbers are
+\[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, \ldots\]
+There is also a closed-form formula
+for calculating Fibonacci numbers, which is sometimes called
+\index{Binet's formula} \key{Binet's formula}:
+\[f(n)=\frac{(1 + \sqrt{5})^n - (1-\sqrt{5})^n}{2^n \sqrt{5}}.\]
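+
+As a sanity check of the formula (a worked case added for
+illustration), take $n=2$. Since
+$(1 \pm \sqrt{5})^2 = 6 \pm 2\sqrt{5}$, we get
+\[f(2)=\frac{(6+2\sqrt{5})-(6-2\sqrt{5})}{2^2 \sqrt{5}}
+=\frac{4\sqrt{5}}{4\sqrt{5}}=1,\]
+which agrees with the recursive definition.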
+
+\subsubsection{Logarithms}
+
+\index{logarithm}
+
+The \key{logarithm} of a number $x$
+is denoted $\log_k(x)$, where $k$ is the base
+of the logarithm.
+According to the definition,
+$\log_k(x)=a$ exactly when $k^a=x$.
+
+A useful property of logarithms is
+that $\log_k(x)$ equals the number of times
+we have to divide $x$ by $k$ before we reach 
+the number 1.
+For example, $\log_2(32)=5$
+because 5 divisions by 2 are needed:
+
+\[32 \rightarrow 16 \rightarrow 8 \rightarrow 4 \rightarrow 2 \rightarrow 1 \]
+
+Logarithms are often used in the analysis of
+algorithms, because many efficient algorithms
+halve something at each step.
+Hence, we can estimate the efficiency of such algorithms
+using logarithms.
+
+The logarithm of a product is
+\[\log_k(ab) = \log_k(a)+\log_k(b),\]
+and consequently,
+\[\log_k(x^n) = n \cdot \log_k(x).\]
+In addition, the logarithm of a quotient is
+\[\log_k\Big(\frac{a}{b}\Big) = \log_k(a)-\log_k(b).\]
+Another useful formula is
+\[\log_u(x) = \frac{\log_k(x)}{\log_k(u)},\]
+and using this, it is possible to calculate
+logarithms to any base if there is a way to
+calculate logarithms to some fixed base.
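+
+For example (the numbers are chosen only for illustration),
+a base-4 logarithm can be computed using base-2 logarithms:
+\[\log_4(64) = \frac{\log_2(64)}{\log_2(4)} = \frac{6}{2} = 3,\]
+which is correct because $4^3=64$.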
+
+\index{natural logarithm}
+
+The \key{natural logarithm} $\ln(x)$ of a number $x$
+is a logarithm whose base is $e \approx 2.71828$.
+Another property of logarithms is that
+the number of digits of an integer $x$ in base $b$ is
+$\lfloor \log_b(x)+1 \rfloor$.
+For example, the representation of
+$123$ in base $2$ is 1111011 and
+$\lfloor \log_2(123)+1 \rfloor = 7$.
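+Similarly, as a check in base 10, the number 123 has 3 digits and
+$\lfloor \log_{10}(123)+1 \rfloor = \lfloor 3.0899\ldots \rfloor = 3$.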
+
+\section{Contests and resources}
+
+\subsubsection{IOI}
+
+The International Olympiad in Informatics (IOI)
+is an annual programming contest for
+secondary school students.
+Each country is allowed to send a team of
+four students to the contest.
+There are usually about 300 participants
+from 80 countries.
+
+The IOI consists of two five-hour long contests.
+In both contests, the participants are asked to
+solve three algorithm tasks of various difficulty.
+The tasks are divided into subtasks,
+each of which has an assigned score.
+Even though the contestants are divided into teams,
+they compete as individuals.
+
+The IOI syllabus \cite{iois} regulates the topics
+that may appear in IOI tasks.
+Almost all the topics in the IOI syllabus
+are covered by this book.
+
+Participants for the IOI are selected through
+national contests.
+Before the IOI, many regional contests are organized,
+such as the Baltic Olympiad in Informatics (BOI),
+the Central European Olympiad in Informatics (CEOI)
+and the Asia-Pacific Informatics Olympiad (APIO).
+
+Some countries organize online practice contests
+for future IOI participants,
+such as the Croatian Open Competition in Informatics \cite{coci}
+and the USA Computing Olympiad \cite{usaco}.
+In addition, a large collection of problems from Polish contests
+is available online \cite{main}.
+
+\subsubsection{ICPC}
+
+The International Collegiate Programming Contest (ICPC)
+is an annual programming contest for university students.
+Each team in the contest consists of three students,
+and unlike in the IOI, the students work together;
+there is only one computer available for each team.
+
+The ICPC consists of several stages, and finally the
+best teams are invited to the World Finals.
+While there are tens of thousands of participants
+in the contest, there are only a small number\footnote{The exact number of final
+slots varies from year to year; in 2017, there were 133 final slots.} of final slots available,
+so even advancing to the finals
+is a great achievement in some regions.
+
+In each ICPC contest, the teams have five hours to
+solve about ten algorithm problems.
+A solution to a problem is accepted only if it solves
+all test cases efficiently.
+During the contest, competitors may view the results of other teams,
+but for the last hour the scoreboard is frozen and it
+is not possible to see the results of the last submissions.
+
+The topics that may appear at the ICPC are not as well
+specified as those at the IOI.
+In any case, it is clear that more knowledge is needed
+at the ICPC, especially more mathematical skills.
+
+\subsubsection{Online contests}
+
+There are also many online contests that are open for everybody.
+At the moment, the most active contest site is Codeforces,
+which organizes contests roughly once a week.
+On Codeforces, participants are divided into two divisions:
+beginners compete in Div2 and more experienced programmers in Div1.
+Other contest sites include AtCoder, CS Academy, HackerRank and Topcoder.
+
+Some companies organize online contests with onsite finals.
+Examples of such contests are Facebook Hacker Cup,
+Google Code Jam and Yandex.Algorithm.
+Of course, companies also use those contests for recruiting:
+performing well in a contest is a good way to prove one's skills.
+
+\subsubsection{Books}
+
+There are already some books (besides this book) that
+focus on competitive programming and algorithmic problem solving:
+
+\begin{itemize}
+\item S. S. Skiena and M. A. Revilla:
+\emph{Programming Challenges: The Programming Contest Training Manual} \cite{ski03}
+\item S. Halim and F. Halim:
+\emph{Competitive Programming 3: The New Lower Bound of Programming Contests} \cite{hal13}
+\item K. Diks et al.: \emph{Looking for a Challenge? The Ultimate Problem Set from
+the University of Warsaw Programming Competitions} \cite{dik12}
+\end{itemize}
+
+The first two books are intended for beginners,
+whereas the last book contains advanced material.
+
+Of course, general algorithm books are also suitable for
+competitive programmers.
+Some popular books are:
+
+\begin{itemize}
+\item T. H. Cormen, C. E. Leiserson, R. L. Rivest and C. Stein:
+\emph{Introduction to Algorithms} \cite{cor09}
+\item J. Kleinberg and É. Tardos:
+\emph{Algorithm Design} \cite{kle05}
+\item S. S. Skiena:
+\emph{The Algorithm Design Manual} \cite{ski08}
+\end{itemize}
--- a/chapter02.tex
+++ b/chapter02.tex
@ -0,0 +1,538 @@
+\chapter{Time complexity}
+
+\index{time complexity}
+
+The efficiency of algorithms is important in competitive programming.
+Usually, it is easy to design an algorithm
+that solves the problem slowly,
+but the real challenge is to invent a
+fast algorithm.
+If the algorithm is too slow, it will get only
+partial points or no points at all.
+
+The \key{time complexity} of an algorithm
+estimates how much time the algorithm will use
+for some input.
+The idea is to represent the efficiency
+as a function whose parameter is the size of the input.
+By calculating the time complexity,
+we can find out whether the algorithm is fast enough
+without implementing it.
+
+\section{Calculation rules}
+
+The time complexity of an algorithm
+is denoted $O(\cdots)$
+where the three dots represent some
+function.
+Usually, the variable $n$ denotes
+the input size.
+For example, if the input is an array of numbers,
+$n$ will be the size of the array,
+and if the input is a string,
+$n$ will be the length of the string.
+
+\subsubsection*{Loops}
+
+A common reason why an algorithm is slow is
+that it contains many loops that go through the input.
+The more nested loops the algorithm contains,
+the slower it is.
+If there are $k$ nested loops,
+the time complexity is $O(n^k)$.
+
+For example, the time complexity of the following code is $O(n)$:
+\begin{lstlisting}
+for (int i = 1; i <= n; i++) {
+    // code
+}
+\end{lstlisting}
+
+And the time complexity of the following code is $O(n^2)$:
+\begin{lstlisting}
+for (int i = 1; i <= n; i++) {
+    for (int j = 1; j <= n; j++) {
+        // code
+    }
+}
+\end{lstlisting}
+
+\subsubsection*{Order of magnitude}
+
+A time complexity does not tell us the exact number
+of times the code inside a loop is executed;
+it only shows the order of magnitude.
+In the following examples, the code inside the loop
+is executed $3n$, $n+5$ and $\lceil n/2 \rceil$ times,
+but the time complexity of each piece of code is $O(n)$.
+
+\begin{lstlisting}
+for (int i = 1; i <= 3*n; i++) {
+    // code
+}
+\end{lstlisting}
+
+\begin{lstlisting}
+for (int i = 1; i <= n+5; i++) {
+    // code
+}
+\end{lstlisting}
+
+\begin{lstlisting}
+for (int i = 1; i <= n; i += 2) {
+    // code
+}
+\end{lstlisting}
+
+As another example,
+the time complexity of the following code is $O(n^2)$:
+
+\begin{lstlisting}
+for (int i = 1; i <= n; i++) {
+    for (int j = i+1; j <= n; j++) {
+        // code
+    }
+}
+\end{lstlisting}
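+
+To see why, one can count the iterations exactly.
+For a fixed value of $i$, the inner loop runs $n-i$ times,
+so the code inside the loops is executed
+\[(n-1)+(n-2)+\cdots+1+0=\frac{n(n-1)}{2}\]
+times in total.
+This is roughly $n^2/2$, and since a time complexity
+ignores constant factors, the result is $O(n^2)$.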
+
+\subsubsection*{Phases}
+
+If the algorithm consists of consecutive phases,
+the total time complexity is the largest
+time complexity of a single phase.
+The reason for this is that the slowest
+phase is usually the bottleneck of the code.
+
+For example, the following code consists
+of three phases with time complexities
+$O(n)$, $O(n^2)$ and $O(n)$.
+Thus, the total time complexity is $O(n^2)$.
+
+\begin{lstlisting}
+for (int i = 1; i <= n; i++) {
+    // code
+}
+for (int i = 1; i <= n; i++) {
+    for (int j = 1; j <= n; j++) {
+        // code
+    }
+}
+for (int i = 1; i <= n; i++) {
+    // code
+}
+\end{lstlisting}
+
+\subsubsection*{Several variables}
+
+Sometimes the time complexity depends on
+several factors.
+In this case, the time complexity formula
+contains several variables.
+
+For example, the time complexity of the
+following code is $O(nm)$:
+
+\begin{lstlisting}
+for (int i = 1; i <= n; i++) {
+    for (int j = 1; j <= m; j++) {
+        // code
+    }
+}
+\end{lstlisting}
+
+\subsubsection*{Recursion}
+
+The time complexity of a recursive function
+depends on the number of times the function is called
+and the time complexity of a single call.
+The total time complexity is the product of
+these values.
+
+For example, consider the following function:
+\begin{lstlisting}
+void f(int n) {
+    if (n == 1) return;
+    f(n-1);
+}
+\end{lstlisting}
+The call $\texttt{f}(n)$ causes $n$ function calls,
+and the time complexity of each call is $O(1)$.
+Thus, the total time complexity is $O(n)$.
+
+As another example, consider the following function:
+\begin{lstlisting}
+void g(int n) {
+    if (n == 1) return;
+    g(n-1);
+    g(n-1);
+}
+\end{lstlisting}
+In this case each function call generates two other
+calls, except for $n=1$.
+Let us see what happens when $g$ is called
+with parameter $n$.
+The following table shows the function calls
+produced by this single call:
+\begin{center}
+\begin{tabular}{rr}
+function call & number of calls \\
+\hline
+$g(n)$ & 1 \\
+$g(n-1)$ & 2 \\
+$g(n-2)$ & 4 \\
+$\cdots$ & $\cdots$ \\
+$g(1)$ & $2^{n-1}$ \\
+\end{tabular}
+\end{center}
+Based on this, the time complexity is
+\[1+2+4+\cdots+2^{n-1} = 2^n-1 = O(2^n).\]
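+
+As a quick empirical check of this count, the function can be
+instrumented with a call counter.
+The following is only a sketch: the global variable \texttt{calls}
+and the helper \texttt{count\_calls} are hypothetical names
+introduced for illustration and are not part of the original function.
+
```cpp
// Hypothetical counter, added only to observe the number of calls.
long long calls = 0;

// Same recursion as in the text, with one extra line that counts calls.
void g(int n) {
    calls++;
    if (n == 1) return;
    g(n-1);
    g(n-1);
}

// Resets the counter, runs g(n) and reports how many calls were made.
long long count_calls(int n) {
    calls = 0;
    g(n);
    return calls;
}
```
+
+For small values of $n$, the value returned by \texttt{count\_calls}
+should agree with the formula $2^n-1$ derived above.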
+
+\section{Complexity classes}
+
+\index{complexity classes}
+
+The following list contains common time complexities
+of algorithms:
+
+\begin{description}
+\item[$O(1)$]
+\index{constant-time algorithm}
+The running time of a \key{constant-time} algorithm
+does not depend on the input size.
+A typical constant-time algorithm is a direct
+formula that calculates the answer.
+
+\item[$O(\log n)$]
+\index{logarithmic algorithm}
+A \key{logarithmic} algorithm often halves
+the input size at each step.
+The running time of such an algorithm
+is logarithmic, because
+$\log_2 n$ equals the number of times
+$n$ must be divided by 2 to get 1.
+
+\item[$O(\sqrt n)$]
+A \key{square root algorithm} is slower than
+$O(\log n)$ but faster than $O(n)$.
+A special property of square roots is that
+$\sqrt n = n/\sqrt n$, so the square root $\sqrt n$ lies,
+in some sense, in the middle of the input.
+
+\item[$O(n)$]
+\index{linear algorithm}
+A \key{linear} algorithm goes through the input
+a constant number of times.
+This is often the best possible time complexity,
+because it is usually necessary to access each
+input element at least once before
+reporting the answer.
+
+\item[$O(n \log n)$]
+This time complexity often indicates that the
+algorithm sorts the input,
+because the time complexity of efficient
+sorting algorithms is $O(n \log n)$.
+Another possibility is that the algorithm
+uses a data structure where each operation
+takes $O(\log n)$ time.
+
+\item[$O(n^2)$]
+\index{quadratic algorithm}
+A \key{quadratic} algorithm often contains
+two nested loops.
+It is possible to go through all pairs of
+the input elements in $O(n^2)$ time.
+
+\item[$O(n^3)$]
+\index{cubic algorithm}
+A \key{cubic} algorithm often contains
+three nested loops.
+It is possible to go through all triplets of
+the input elements in $O(n^3)$ time.
+
+\item[$O(2^n)$]
+This time complexity often indicates that
+the algorithm iterates through all
+subsets of the input elements.
+For example, the subsets of $\{1,2,3\}$ are
+$\emptyset$, $\{1\}$, $\{2\}$, $\{3\}$, $\{1,2\}$,
+$\{1,3\}$, $\{2,3\}$ and $\{1,2,3\}$.
+
+\item[$O(n!)$]
+This time complexity often indicates that
+the algorithm iterates through all
+permutations of the input elements.
+For example, the permutations of $\{1,2,3\}$ are
+$(1,2,3)$, $(1,3,2)$, $(2,1,3)$, $(2,3,1)$,
+$(3,1,2)$ and $(3,2,1)$.
+
+\end{description}
+
+\index{polynomial algorithm}
+An algorithm is \key{polynomial}
+if its time complexity is at most $O(n^k)$
+where $k$ is a constant.
+All the above time complexities except
+$O(2^n)$ and $O(n!)$ are polynomial.
+In practice, the constant $k$ is usually small,
+and therefore a polynomial time complexity
+roughly means that the algorithm is \emph{efficient}.
+
+\index{NP-hard problem}
+
+Most algorithms in this book are polynomial.
+Still, there are many important problems for which
+no polynomial algorithm is known, i.e.,
+nobody knows how to solve them efficiently.
+\key{NP-hard} problems are an important set
+of problems, for which no polynomial algorithm
+is known\footnote{A classic book on the topic is
+M. R. Garey's and D. S. Johnson's
+\emph{Computers and Intractability: A Guide to the Theory
+of NP-Completeness} \cite{gar79}.}.
+
+\section{Estimating efficiency}
+
+By calculating the time complexity of an algorithm,
+it is possible to check, before
+implementing the algorithm, that it is
+efficient enough for the problem.
+The starting point for estimations is the fact that
+a modern computer can perform some hundreds of
+millions of operations in a second.
+
+For example, assume that the time limit for
+a problem is one second and the input size is $n=10^5$.
+If the time complexity is $O(n^2)$,
+the algorithm will perform about $(10^5)^2=10^{10}$ operations.
+This should take at least some tens of seconds,
+so the algorithm seems to be too slow for solving the problem.
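+
+As a rough worked estimate, assume purely for illustration
+that the computer performs $5 \cdot 10^8$ operations per second
+(the actual figure depends on the machine and on what counts as an operation).
+Then the running time would be about
+\[\frac{10^{10}}{5 \cdot 10^8} = 20 \text{ seconds},\]
+which is far above the time limit of one second.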
+
+On the other hand, given the input size,
+we can try to \emph{guess}
+the required time complexity of the algorithm
+that solves the problem.
+The following table contains some useful estimates
+assuming a time limit of one second.
+
+\begin{center}
+\begin{tabular}{ll}
+input size & required time complexity \\
+\hline
+$n \le 10$ & $O(n!)$ \\
+$n \le 20$ & $O(2^n)$ \\
+$n \le 500$ & $O(n^3)$ \\
+$n \le 5000$ & $O(n^2)$ \\
+$n \le 10^6$ & $O(n \log n)$ or $O(n)$ \\
+$n$ is large & $O(1)$ or $O(\log n)$ \\
+\end{tabular}
+\end{center}
+
+For example, if the input size is $n=10^5$,
+it is probably expected that the time
+complexity of the algorithm is $O(n)$ or $O(n \log n)$.
+This information makes it easier to design the algorithm,
+because it rules out approaches that would yield
+an algorithm with a worse time complexity.
+
+\index{constant factor}
+
+Still, it is important to remember that a
+time complexity is only an estimate of efficiency,
+because it hides the \emph{constant factors}.
+For example, an algorithm that runs in $O(n)$ time
+may perform $n/2$ or $5n$ operations.
+This has an important effect on the actual
+running time of the algorithm.
+
+\section{Maximum subarray sum}
+
+\index{maximum subarray sum}
+
+There are often several possible algorithms
+for solving a problem, and their
+time complexities may differ.
+This section discusses a classic problem that
+has a straightforward $O(n^3)$ solution.
+However, by designing a better algorithm, it
+is possible to solve the problem in $O(n^2)$
+time and even in $O(n)$ time.
+
+Given an array of $n$ numbers,
+our task is to calculate the
+\key{maximum subarray sum}, i.e.,
+the largest possible sum of 
+a sequence of consecutive values
+in the array\footnote{J. Bentley's
+book \emph{Programming Pearls} \cite{ben86} made the problem popular.}.
+The problem is interesting when there may be
+negative values in the array.
+For example, in the array
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$-1$};
+\node at (1.5,0.5) {$2$};
+\node at (2.5,0.5) {$4$};
+\node at (3.5,0.5) {$-3$};
+\node at (4.5,0.5) {$5$};
+\node at (5.5,0.5) {$2$};
+\node at (6.5,0.5) {$-5$};
+\node at (7.5,0.5) {$2$};
+\end{tikzpicture}
+\end{center}
+\begin{samepage}
+the following subarray produces the maximum sum $10$:
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\fill[color=lightgray] (1,0) rectangle (6,1);
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$-1$};
+\node at (1.5,0.5) {$2$};
+\node at (2.5,0.5) {$4$};
+\node at (3.5,0.5) {$-3$};
+\node at (4.5,0.5) {$5$};
+\node at (5.5,0.5) {$2$};
+\node at (6.5,0.5) {$-5$};
+\node at (7.5,0.5) {$2$};
+\end{tikzpicture}
+\end{center}
+\end{samepage}
+
+We assume that an empty subarray is allowed,
+so the maximum subarray sum is always at least $0$.
+
+\subsubsection{Algorithm 1}
+
+A straightforward way to solve the problem
+is to go through all possible subarrays,
+calculate the sum of values in each subarray and maintain
+the maximum sum.
+The following code implements this algorithm:
+
+\begin{lstlisting}
+int best = 0;
+for (int a = 0; a < n; a++) {
+    for (int b = a; b < n; b++) {
+        int sum = 0;
+        for (int k = a; k <= b; k++) {
+            sum += array[k];
+        }
+        best = max(best,sum);
+    }
+}
+cout << best << "\n";
+\end{lstlisting}
+
+The variables \texttt{a} and \texttt{b} fix the first and
+last index of the subarray,
+and the sum of values is accumulated in the variable \texttt{sum}.
+The variable \texttt{best} contains the maximum sum found during the search.
+
+The time complexity of the algorithm is $O(n^3)$,
+because it consists of three nested loops 
+that go through the input.
+
+\subsubsection{Algorithm 2}
+
+It is easy to make Algorithm 1 more efficient
+by removing one loop from it.
+This is possible by updating the sum
+as the right end of the subarray moves.
+The result is the following code:
+
+\begin{lstlisting}
+int best = 0;
+for (int a = 0; a < n; a++) {
+    int sum = 0;
+    for (int b = a; b < n; b++) {
+        sum += array[b];
+        best = max(best,sum);
+    }
+}
+cout << best << "\n";
+\end{lstlisting}
+After this change, the time complexity is $O(n^2)$.
+
+\subsubsection{Algorithm 3}
+
+Surprisingly, it is possible to solve the problem
+in $O(n)$ time\footnote{In \cite{ben86}, this linear-time algorithm
+is attributed to J. B. Kadane, and the algorithm is sometimes
+called \index{Kadane's algorithm} \key{Kadane's algorithm}.}, which means
+that just one loop is enough.
+The idea is to calculate, for each array position,
+the maximum sum of a subarray that ends at that position.
+After this, the answer for the problem is the
+maximum of those sums.
+
+Consider the subproblem of finding the maximum-sum subarray
+that ends at position $k$.
+There are two possibilities:
+\begin{enumerate}
+\item The subarray only contains the element at position $k$.
+\item The subarray consists of a subarray that ends
+at position $k-1$, followed by the element at position $k$.
+\end{enumerate}
+
+In the latter case, since we want to
+find a subarray with maximum sum,
+the subarray that ends at position $k-1$
+should also have the maximum sum.
+Thus, we can solve the problem efficiently
+by calculating the maximum subarray sum
+for each ending position from left to right.
+
+The following code implements the algorithm:
+\begin{lstlisting}
+int best = 0, sum = 0;
+for (int k = 0; k < n; k++) {
+    sum = max(array[k],sum+array[k]);
+    best = max(best,sum);
+}
+cout << best << "\n";
+\end{lstlisting}
+
+The algorithm only contains one loop
+that goes through the input,
+so the time complexity is $O(n)$.
+This is also the best possible time complexity,
+because any algorithm for the problem
+has to examine all array elements at least once.
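+
+To make the comparison concrete, Algorithms 2 and 3 can be wrapped
+into functions and cross-checked against each other.
+This is a minimal sketch rather than a reference implementation:
+the function names are hypothetical, and \texttt{long long} is used
+for the sums as an extra precaution against overflow.
+
```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Algorithm 2 from the text: try every left end and extend to the right, O(n^2).
long long max_subarray_quadratic(const std::vector<int>& array) {
    long long best = 0;  // the empty subarray is allowed
    for (std::size_t a = 0; a < array.size(); a++) {
        long long sum = 0;
        for (std::size_t b = a; b < array.size(); b++) {
            sum += array[b];
            best = std::max(best, sum);
        }
    }
    return best;
}

// Algorithm 3 from the text: sum is the best sum of a subarray
// that ends at the current position, O(n).
long long max_subarray_linear(const std::vector<int>& array) {
    long long best = 0, sum = 0;
    for (int x : array) {
        sum = std::max<long long>(x, sum + x);
        best = std::max(best, sum);
    }
    return best;
}
```
+
+For the example array $[-1,2,4,-3,5,2,-5,2]$ both functions return $10$,
+and for an array that contains only negative values both return $0$,
+because the empty subarray is allowed.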
+
+\subsubsection{Efficiency comparison}
+
+It is interesting to study how efficient 
+algorithms are in practice.
+The following table shows the running times
+of the above algorithms for different
+values of $n$ on a modern computer.
+
+In each test, the input was generated randomly.
+The time needed for reading the input was not
+measured.
+
+\begin{center}
+\begin{tabular}{rrrr}
+array size $n$ & Algorithm 1 & Algorithm 2 & Algorithm 3 \\
+\hline
+$10^2$ & $0.0$ s & $0.0$ s & $0.0$ s \\
+$10^3$ & $0.1$ s & $0.0$ s & $0.0$ s \\
+$10^4$ & $> 10.0$ s & $0.1$ s & $0.0$ s \\
+$10^5$ & $> 10.0$ s & $5.3$ s & $0.0$ s \\
+$10^6$ & $> 10.0$ s & $> 10.0$ s & $0.0$ s \\
+$10^7$ & $> 10.0$ s & $> 10.0$ s & $0.0$ s \\
+\end{tabular}
+\end{center}
+
+The comparison shows that all algorithms
+are efficient when the input size is small,
+but larger inputs bring out remarkable
+differences in the running times of the algorithms.
+Algorithm 1 becomes slow
+when $n=10^4$, and Algorithm 2
+becomes slow when $n=10^5$.
+Only Algorithm 3 is able to process
+even the largest inputs instantly.
--- a/chapter03.tex
+++ b/chapter03.tex
@ -0,0 +1,863 @@
+\chapter{Sorting}
+
+\index{sorting}
+
+\key{Sorting}
+is a fundamental algorithm design problem.
+Many efficient algorithms
+use sorting as a subroutine,
+because it is often easier to process
+data if the elements are in a sorted order.
+
+For example, the problem ``does an array contain
+two equal elements?'' is easy to solve using sorting.
+If the array contains two equal elements,
+they will be next to each other after sorting,
+so it is easy to find them.
+Also, the problem ``what is the most frequent element
+in an array?'' can be solved similarly.
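+
+Both ideas can be sketched with the standard \texttt{sort} function.
+The function names below are hypothetical, and the second function
+assumes a non-empty array and resolves ties in favor of the smallest value;
+these are choices made for the illustration only.
+
```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Returns true if the vector contains two equal elements.
// The vector is taken by value, so the caller's data is not reordered.
bool has_duplicate(std::vector<int> v) {
    std::sort(v.begin(), v.end());
    for (std::size_t i = 1; i < v.size(); i++) {
        if (v[i-1] == v[i]) return true;  // equal elements are adjacent
    }
    return false;
}

// Returns the most frequent element of a non-empty vector.
// After sorting, equal elements form one consecutive run.
int most_frequent(std::vector<int> v) {
    std::sort(v.begin(), v.end());
    int best = v[0];
    std::size_t best_count = 0;
    for (std::size_t i = 0; i < v.size(); ) {
        std::size_t j = i;
        while (j < v.size() && v[j] == v[i]) j++;  // end of the current run
        if (j - i > best_count) {
            best_count = j - i;
            best = v[i];
        }
        i = j;
    }
    return best;
}
```
+
+In both functions the sorting step dominates the running time,
+so each of them works in $O(n \log n)$ time.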
+
+There are many algorithms for sorting, and they are
+also good examples of how to apply
+different algorithm design techniques.
+The efficient general sorting algorithms
+work in $O(n \log n)$ time,
+and many algorithms that use sorting
+as a subroutine also
+have this time complexity.
+
+\section{Sorting theory}
+
+The basic problem in sorting is as follows:
+\begin{framed}
+\noindent
+Given an array that contains $n$ elements,
+your task is to sort the elements
+in increasing order.
+\end{framed}
+\noindent
+For example, the array
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (8,1);
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$3$};
+\node at (2.5,0.5) {$8$};
+\node at (3.5,0.5) {$2$};
+\node at (4.5,0.5) {$9$};
+\node at (5.5,0.5) {$2$};
+\node at (6.5,0.5) {$5$};
+\node at (7.5,0.5) {$6$};
+\end{tikzpicture}
+\end{center}
+will be as follows after sorting:
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (8,1);
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$2$};
+\node at (2.5,0.5) {$2$};
+\node at (3.5,0.5) {$3$};
+\node at (4.5,0.5) {$5$};
+\node at (5.5,0.5) {$6$};
+\node at (6.5,0.5) {$8$};
+\node at (7.5,0.5) {$9$};
+\end{tikzpicture}
+\end{center}
+
+\subsubsection{$O(n^2)$ algorithms}
+
+\index{bubble sort}
+
+Simple algorithms for sorting an array
+work in $O(n^2)$ time.
+Such algorithms are short and usually
+consist of two nested loops.
+A famous $O(n^2)$ time sorting algorithm
+is \key{bubble sort} where the elements
+``bubble'' in the array according to their values.
+
+Bubble sort consists of $n$ rounds.
+On each round, the algorithm iterates through
+the elements of the array.
+Whenever two consecutive elements are found
+that are not in correct order,
+the algorithm swaps them.
+The algorithm can be implemented as follows:
+\begin{lstlisting}
+for (int i = 0; i < n; i++) {
+    for (int j = 0; j < n-1; j++) {
+        if (array[j] > array[j+1]) {
+            swap(array[j],array[j+1]);
+        }
+    }
+}
+\end{lstlisting}
+
+After the first round of the algorithm,
+the largest element will be in the correct position,
+and in general, after $k$ rounds, the $k$ largest
+elements will be in the correct positions.
+Thus, after $n$ rounds, the whole array
+will be sorted.
+
+For example, in the array
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$3$};
+\node at (2.5,0.5) {$8$};
+\node at (3.5,0.5) {$2$};
+\node at (4.5,0.5) {$9$};
+\node at (5.5,0.5) {$2$};
+\node at (6.5,0.5) {$5$};
+\node at (7.5,0.5) {$6$};
+\end{tikzpicture}
+\end{center}
+
+\noindent
+the first round of bubble sort swaps elements
+as follows:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (8,1);
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$3$};
+\node at (2.5,0.5) {$2$};
+\node at (3.5,0.5) {$8$};
+\node at (4.5,0.5) {$9$};
+\node at (5.5,0.5) {$2$};
+\node at (6.5,0.5) {$5$};
+\node at (7.5,0.5) {$6$};
+
+\draw[thick,<->] (3.5,-0.25) .. controls (3.25,-1.00) and (2.75,-1.00) .. (2.5,-0.25);
+\end{tikzpicture}
+\end{center}
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (8,1);
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$3$};
+\node at (2.5,0.5) {$2$};
+\node at (3.5,0.5) {$8$};
+\node at (4.5,0.5) {$2$};
+\node at (5.5,0.5) {$9$};
+\node at (6.5,0.5) {$5$};
+\node at (7.5,0.5) {$6$};
+
+\draw[thick,<->] (5.5,-0.25) .. controls (5.25,-1.00) and (4.75,-1.00) .. (4.5,-0.25);
+\end{tikzpicture}
+\end{center}
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (8,1);
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$3$};
+\node at (2.5,0.5) {$2$};
+\node at (3.5,0.5) {$8$};
+\node at (4.5,0.5) {$2$};
+\node at (5.5,0.5) {$5$};
+\node at (6.5,0.5) {$9$};
+\node at (7.5,0.5) {$6$};
+
+\draw[thick,<->] (6.5,-0.25) .. controls (6.25,-1.00) and (5.75,-1.00) .. (5.5,-0.25);
+\end{tikzpicture}
+\end{center}
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (8,1);
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$3$};
+\node at (2.5,0.5) {$2$};
+\node at (3.5,0.5) {$8$};
+\node at (4.5,0.5) {$2$};
+\node at (5.5,0.5) {$5$};
+\node at (6.5,0.5) {$6$};
+\node at (7.5,0.5) {$9$};
+
+\draw[thick,<->] (7.5,-0.25) .. controls (7.25,-1.00) and (6.75,-1.00) .. (6.5,-0.25);
+\end{tikzpicture}
+\end{center}
+
+\subsubsection{Inversions}
+
+\index{inversion}
+
+Bubble sort is an example of a sorting
+algorithm that always swaps \emph{consecutive}
+elements in the array.
+It turns out that the worst-case running time
+of such an algorithm is \emph{always}
+at least quadratic, because in the worst case,
+on the order of $n^2$ swaps are required for sorting the array.
+
+A useful concept when analyzing sorting
+algorithms is an \key{inversion}:
+a pair of array elements
+$(\texttt{array}[a],\texttt{array}[b])$ such that
+$a<b$ and $\texttt{array}[a]>\texttt{array}[b]$,
+i.e., the elements are in the wrong order.
+For example, the array
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (8,1);
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$2$};
+\node at (2.5,0.5) {$2$};
+\node at (3.5,0.5) {$6$};
+\node at (4.5,0.5) {$3$};
+\node at (5.5,0.5) {$5$};
+\node at (6.5,0.5) {$9$};
+\node at (7.5,0.5) {$8$};
+\end{tikzpicture}
+\end{center}
+has three inversions: $(6,3)$, $(6,5)$ and $(9,8)$.
+The number of inversions indicates
+how much work is needed to sort the array.
+An array is completely sorted when
+there are no inversions.
+On the other hand, if the array elements
+are in the reverse order,
+the number of inversions is the largest possible:
+\[1+2+\cdots+(n-1)=\frac{n(n-1)}{2} = O(n^2)\]
+
+Swapping a pair of consecutive elements that are
+in the wrong order removes exactly one inversion
+from the array.
+Hence, if a sorting algorithm can only
+swap consecutive elements, each swap removes
+at most one inversion, and the worst-case running time
+of the algorithm is at least quadratic.
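+
+The connection between inversions and swaps can be observed directly.
+The sketch below counts the inversions of an array by checking all pairs,
+and separately counts the swaps that bubble sort performs on the same array.
+The function names are hypothetical and chosen only for this illustration.
+
```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Counts pairs (a,b) with a < b and array[a] > array[b] in O(n^2) time.
long long count_inversions(const std::vector<int>& array) {
    long long inversions = 0;
    for (std::size_t a = 0; a < array.size(); a++) {
        for (std::size_t b = a + 1; b < array.size(); b++) {
            if (array[a] > array[b]) inversions++;
        }
    }
    return inversions;
}

// Runs bubble sort on a copy of the array and returns the number of swaps.
long long bubble_sort_swaps(std::vector<int> array) {
    long long swaps = 0;
    int n = (int)array.size();
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n-1; j++) {
            if (array[j] > array[j+1]) {
                std::swap(array[j], array[j+1]);
                swaps++;  // each such swap removes exactly one inversion
            }
        }
    }
    return swaps;
}
```
+
+For the example array $[1,2,2,6,3,5,9,8]$ both functions return $3$,
+and for the reversed array $[4,3,2,1]$ both return $4 \cdot 3/2 = 6$.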
+
+\subsubsection{$O(n \log n)$ algorithms}
+
+\index{merge sort}
+
+It is possible to sort an array efficiently
+in $O(n \log n)$ time using algorithms
+that are not limited to swapping consecutive elements.
+One such algorithm is \key{merge sort}\footnote{According to \cite{knu983},
+merge sort was invented by J. von Neumann in 1945.},
+which is based on recursion.
+
+Merge sort sorts a subarray \texttt{array}$[a \ldots b]$ as follows:
+
+\begin{enumerate}
+\item If $a=b$, do nothing, because the subarray is already sorted.
+\item Calculate the position of the middle element: $k=\lfloor (a+b)/2 \rfloor$.
+\item Recursively sort the subarray \texttt{array}$[a \ldots k]$.
+\item Recursively sort the subarray \texttt{array}$[k+1 \ldots b]$.
+\item \emph{Merge} the sorted subarrays \texttt{array}$[a \ldots k]$ and
+\texttt{array}$[k+1 \ldots b]$
+into a sorted subarray \texttt{array}$[a \ldots b]$.
+\end{enumerate}
+
+Merge sort is an efficient algorithm, because it
+halves the size of the subarray at each step.
+The recursion consists of $O(\log n)$ levels,
+and processing each level takes $O(n)$ time.
+Merging the subarrays \texttt{array}$[a \ldots k]$ and \texttt{array}$[k+1 \ldots b]$
+is possible in linear time, because they are already sorted.
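+
+The steps above can be turned into code fairly directly.
+The following is one possible sketch, not the only way to implement merge sort:
+it assumes a \texttt{vector} of integers, uses inclusive bounds $a$ and $b$
+as in the description, and merges through a temporary vector.
+
```cpp
#include <vector>

// Sorts the subarray array[a..b] (both bounds inclusive).
void merge_sort(std::vector<int>& array, int a, int b) {
    if (a >= b) return;            // zero or one element: already sorted
    int k = a + (b - a) / 2;       // middle position, same as floor((a+b)/2) here
    merge_sort(array, a, k);
    merge_sort(array, k+1, b);

    // Merge the sorted halves array[a..k] and array[k+1..b] in linear time.
    std::vector<int> merged;
    merged.reserve(b - a + 1);
    int i = a, j = k+1;
    while (i <= k && j <= b) {
        if (array[i] <= array[j]) merged.push_back(array[i++]);
        else merged.push_back(array[j++]);
    }
    while (i <= k) merged.push_back(array[i++]);
    while (j <= b) merged.push_back(array[j++]);

    // Copy the merged result back into the original positions.
    for (int t = 0; t < (int)merged.size(); t++) {
        array[a+t] = merged[t];
    }
}
```
+
+A whole vector \texttt{v} with $n$ elements would be sorted
+by the call \texttt{merge\_sort(v,0,n-1)}.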
+
+For example, consider sorting the following array:
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (8,1);
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$3$};
+\node at (2.5,0.5) {$6$};
+\node at (3.5,0.5) {$2$};
+\node at (4.5,0.5) {$8$};
+\node at (5.5,0.5) {$2$};
+\node at (6.5,0.5) {$5$};
+\node at (7.5,0.5) {$9$};
+\end{tikzpicture}
+\end{center}
+
+The array will be divided into two subarrays
+as follows:
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (4,1);
+\draw (5,0) grid (9,1);
+
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$3$};
+\node at (2.5,0.5) {$6$};
+\node at (3.5,0.5) {$2$};
+
+\node at (5.5,0.5) {$8$};
+\node at (6.5,0.5) {$2$};
+\node at (7.5,0.5) {$5$};
+\node at (8.5,0.5) {$9$};
+\end{tikzpicture}
+\end{center}
+
+Then, the subarrays will be sorted recursively
+as follows:
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (4,1);
+\draw (5,0) grid (9,1);
+
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$2$};
+\node at (2.5,0.5) {$3$};
+\node at (3.5,0.5) {$6$};
+
+\node at (5.5,0.5) {$2$};
+\node at (6.5,0.5) {$5$};
+\node at (7.5,0.5) {$8$};
+\node at (8.5,0.5) {$9$};
+\end{tikzpicture}
+\end{center}
+
+Finally, the algorithm merges the sorted
+subarrays and creates the final sorted array:
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (8,1);
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$2$};
+\node at (2.5,0.5) {$2$};
+\node at (3.5,0.5) {$3$};
+\node at (4.5,0.5) {$5$};
+\node at (5.5,0.5) {$6$};
+\node at (6.5,0.5) {$8$};
+\node at (7.5,0.5) {$9$};
+\end{tikzpicture}
+\end{center}
+
+\subsubsection{Sorting lower bound}
+
+Is it possible to sort an array faster
+than in $O(n \log n)$ time?
+It turns out that this is \emph{not} possible
+when we restrict ourselves to sorting algorithms
+that are based on comparing array elements.
+
+The lower bound for the time complexity
+can be proved by considering sorting
+as a process where each comparison of two elements
+gives more information about the contents of the array.
+The process creates the following tree:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) rectangle (3,1);
+\node at (1.5,0.5) {$x < y?$};
+
+\draw[thick,->] (1.5,0) -- (-2.5,-1.5);
+\draw[thick,->] (1.5,0) -- (5.5,-1.5);
+
+\draw (-4,-2.5) rectangle (-1,-1.5);
+\draw (4,-2.5) rectangle (7,-1.5);
+\node at (-2.5,-2) {$x < y?$};
+\node at (5.5,-2) {$x < y?$};
+
+\draw[thick,->] (-2.5,-2.5) -- (-4.5,-4);
+\draw[thick,->] (-2.5,-2.5) -- (-0.5,-4);
+\draw[thick,->] (5.5,-2.5) -- (3.5,-4);
+\draw[thick,->] (5.5,-2.5) -- (7.5,-4);
+
+\draw (-6,-5) rectangle (-3,-4);
+\draw (-2,-5) rectangle (1,-4);
+\draw (2,-5) rectangle (5,-4);
+\draw (6,-5) rectangle (9,-4);
+\node at (-4.5,-4.5) {$x < y?$};
+\node at (-0.5,-4.5) {$x < y?$};
+\node at (3.5,-4.5) {$x < y?$};
+\node at (7.5,-4.5) {$x < y?$};
+
+\draw[thick,->] (-4.5,-5) -- (-5.5,-6);
+\draw[thick,->] (-4.5,-5) -- (-3.5,-6);
+\draw[thick,->] (-0.5,-5) -- (0.5,-6);
+\draw[thick,->] (-0.5,-5) -- (-1.5,-6);
+\draw[thick,->] (3.5,-5) -- (2.5,-6);
+\draw[thick,->] (3.5,-5) -- (4.5,-6);
+\draw[thick,->] (7.5,-5) -- (6.5,-6);
+\draw[thick,->] (7.5,-5) -- (8.5,-6);
+\end{tikzpicture}
+\end{center}
+
+Here ``$x<y?$'' means that some elements
+$x$ and $y$ are compared.
+If $x<y$, the process continues to the left,
+and otherwise to the right.
+The results of the process are the possible
+ways to sort the array, a total of $n!$ ways.
+For this reason, the height of the tree
+must be at least
+\[ \log_2(n!) = \log_2(1)+\log_2(2)+\cdots+\log_2(n).\]
+We get a lower bound for this sum
+by choosing the last $n/2$ elements and
+changing the value of each element to $\log_2(n/2)$.
+This yields an estimate
+\[ \log_2(n!) \ge (n/2) \cdot \log_2(n/2),\]
+so the height of the tree, and hence the
+worst-case number of comparisons in any
+such sorting algorithm,
+is at least of the order $n \log n$.
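+
+As a small sanity check of the bound, consider $n=8$.
+There are $8!=40320$ possible orders, and since
+\[2^{15} = 32768 < 40320 \le 65536 = 2^{16},\]
+the height of the tree is at least $16$, i.e., at least $16$ comparisons
+are needed in the worst case.
+The estimate above gives
+\[(8/2) \cdot \log_2(8/2) = 4 \cdot 2 = 8,\]
+which is indeed a valid, though not tight, lower bound.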
+
+\subsubsection{Counting sort}
+
+\index{counting sort}
+
+The lower bound $n \log n$ does not apply to
+algorithms that do not compare array elements
+but use some other information.
+An example of such an algorithm is
+\key{counting sort}, which sorts an array in
+$O(n)$ time assuming that every element in the array
+is an integer between $0$ and $c$, where $c=O(n)$.
+
+The algorithm creates a \emph{bookkeeping} array,
+whose indices are elements of the original array.
+The algorithm iterates through the original array
+and calculates how many times each element
+appears in the array.
+\newpage
+
+For example, the array
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (8,1);
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$3$};
+\node at (2.5,0.5) {$6$};
+\node at (3.5,0.5) {$9$};
+\node at (4.5,0.5) {$9$};
+\node at (5.5,0.5) {$3$};
+\node at (6.5,0.5) {$5$};
+\node at (7.5,0.5) {$9$};
+\end{tikzpicture}
+\end{center}
+corresponds to the following bookkeeping array:
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (9,1);
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$0$};
+\node at (2.5,0.5) {$2$};
+\node at (3.5,0.5) {$0$};
+\node at (4.5,0.5) {$1$};
+\node at (5.5,0.5) {$1$};
+\node at (6.5,0.5) {$0$};
+\node at (7.5,0.5) {$0$};
+\node at (8.5,0.5) {$3$};
+
+\footnotesize
+
+\node at (0.5,1.5) {$1$};
+\node at (1.5,1.5) {$2$};
+\node at (2.5,1.5) {$3$};
+\node at (3.5,1.5) {$4$};
+\node at (4.5,1.5) {$5$};
+\node at (5.5,1.5) {$6$};
+\node at (6.5,1.5) {$7$};
+\node at (7.5,1.5) {$8$};
+\node at (8.5,1.5) {$9$};
+\end{tikzpicture}
+\end{center}
+
+For example, the value at position 3
+in the bookkeeping array is 2,
+because the element 3 appears 2 times
+in the original array.
+
+Construction of the bookkeeping array
+takes $O(n)$ time. After this, the sorted array
+can be created in $O(n)$ time because
+the number of occurrences of each element can be retrieved
+from the bookkeeping array.
+Thus, the total time complexity of counting
+sort is $O(n)$.
+
+Counting sort is a very efficient algorithm
+but it can only be used when the constant $c$
+is small enough, so that the array elements can
+be used as indices in the bookkeeping array.
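+
+A minimal sketch of counting sort is shown below.
+The function name and the parameter $c$ (the largest allowed value)
+are assumptions made for the illustration,
+and the code expects every element to lie between $0$ and $c$.
+
```cpp
#include <vector>

// Returns a sorted copy of array, assuming 0 <= array[i] <= c for every i.
std::vector<int> counting_sort(const std::vector<int>& array, int c) {
    // Bookkeeping array: count[x] is the number of times x appears.
    std::vector<int> count(c+1, 0);
    for (int x : array) count[x]++;

    // Output each value as many times as it was counted.
    std::vector<int> sorted;
    sorted.reserve(array.size());
    for (int value = 0; value <= c; value++) {
        for (int t = 0; t < count[value]; t++) {
            sorted.push_back(value);
        }
    }
    return sorted;
}
```
+
+The first loop takes $O(n)$ time and the second takes $O(n+c)$ time,
+which is $O(n)$ when $c=O(n)$.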
+
+\section{Sorting in C++}
+
+\index{sort@\texttt{sort}}
+
+It is almost never a good idea to use
+a home-made sorting algorithm
+in a contest, because there are good
+implementations available in programming languages.
+For example, the C++ standard library contains
+the function \texttt{sort} that can be easily used for
+sorting arrays and other data structures.
+
+There are many benefits in using a library function.
+First, it saves time because there is no need to
+implement the function.
+Second, the library implementation is
+well tested and efficient: it is unlikely
+that a home-made sorting function would be better.
+
+In this section we will see how to use the
+C++ \texttt{sort} function.
+The following code sorts
+a vector in increasing order:
+\begin{lstlisting}
+vector<int> v = {4,2,5,3,5,8,3};
+sort(v.begin(),v.end());
+\end{lstlisting}
+After the sorting, the contents of the
+vector will be
+$[2,3,3,4,5,5,8]$.
+The default sorting order is increasing,
+but a reverse order is possible as follows:
+\begin{lstlisting}
+sort(v.rbegin(),v.rend());
+\end{lstlisting}
+An ordinary array can be sorted as follows:
+\begin{lstlisting}
+int n = 7; // array size
+int a[] = {4,2,5,3,5,8,3};
+sort(a,a+n);
+\end{lstlisting}
+\newpage
+The following code sorts the string \texttt{s}:
+\begin{lstlisting}
+string s = "monkey";
+sort(s.begin(), s.end());
+\end{lstlisting}
+Sorting a string means that the characters
+of the string are sorted.
+For example, the string ``monkey'' becomes ``ekmnoy''.
+
+\subsubsection{Comparison operators}
+
+\index{comparison operator}
+
+The function \texttt{sort} requires that
+a \key{comparison operator} is defined for the data type
+of the elements to be sorted.
+When sorting, this operator will be used
+whenever it is necessary to find out the order of two elements.
+
+Most C++ data types have a built-in comparison operator,
+and elements of those types can be sorted automatically.
+For example, numbers are sorted according to their values
+and strings are sorted in alphabetical order.
+
+\index{pair@\texttt{pair}}
+
+Pairs (\texttt{pair}) are sorted primarily according to their
+first elements (\texttt{first}).
+However, if the first elements of two pairs are equal,
+they are sorted according to their second elements (\texttt{second}):
+\begin{lstlisting}
+vector<pair<int,int>> v;
+v.push_back({1,5});
+v.push_back({2,3});
+v.push_back({1,2});
+sort(v.begin(), v.end());
+\end{lstlisting}
+After this, the order of the pairs is
+$(1,2)$, $(1,5)$ and $(2,3)$.
+
+\index{tuple@\texttt{tuple}}
+
+In a similar way, tuples (\texttt{tuple})
+are sorted primarily by the first element,
+secondarily by the second element, etc.\footnote{Note that in some older compilers,
+the function \texttt{make\_tuple} has to be used to create a tuple instead of
+braces (for example, \texttt{make\_tuple(2,1,4)} instead of \texttt{\{2,1,4\}}).}:
+\begin{lstlisting}
+vector<tuple<int,int,int>> v;
+v.push_back({2,1,4});
+v.push_back({1,5,3});
+v.push_back({2,1,3});
+sort(v.begin(), v.end());
+\end{lstlisting}
+After this, the order of the tuples is
+$(1,5,3)$, $(2,1,3)$ and $(2,1,4)$.
+
+\subsubsection{User-defined structs}
+
+User-defined structs do not have a comparison
+operator automatically.
+The operator should be defined inside
+the struct as a function
+\texttt{operator<},
+whose parameter is another element of the same type.
+The operator should return \texttt{true}
+if the element is smaller than the parameter,
+and \texttt{false} otherwise.
+
+For example, the following struct \texttt{P}
+contains the x and y coordinates of a point.
+The comparison operator is defined so that
+the points are sorted primarily by the x coordinate
+and secondarily by the y coordinate.
+
+\begin{lstlisting}
+struct P {
+    int x, y;
+    bool operator<(const P &p) const {
+        if (x != p.x) return x < p.x;
+        else return y < p.y;
+    }
+};
+\end{lstlisting}
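+As a brief usage sketch, with point values made up for illustration,
+a vector of such structs can then be sorted
+without giving any extra arguments to \texttt{sort}:
+\begin{lstlisting}
+vector<P> v = {{2,3},{1,5},{1,2}};
+sort(v.begin(), v.end());
+// order of the points: (1,2), (1,5), (2,3)
+\end{lstlisting}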
+
+\subsubsection{Comparison functions}
+
+\index{comparison function}
+
+It is also possible to give an external
+\key{comparison function} to the \texttt{sort} function
+as a callback function.
+For example, the following comparison function \texttt{comp}
+sorts strings primarily by length and secondarily
+by alphabetical order:
+
+\begin{lstlisting}
+bool comp(const string &a, const string &b) {
+    if (a.size() != b.size()) return a.size() < b.size();
+    return a < b;
+}
+\end{lstlisting}
+Now a vector of strings can be sorted as follows:
+\begin{lstlisting}
+sort(v.begin(), v.end(), comp);
+\end{lstlisting}
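+For instance, with a small vector whose contents are
+made up for illustration, the result would be as follows:
+\begin{lstlisting}
+vector<string> v = {"ccc","a","bb","ab"};
+sort(v.begin(), v.end(), comp);
+// order of the strings: a, ab, bb, ccc
+\end{lstlisting}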
+
+\section{Binary search}
+
+\index{binary search}
+
+A general method for searching for an element
+in an array is to use a \texttt{for} loop
+that iterates through the elements of the array.
+For example, the following code searches for
+an element $x$ in an array:
+
+\begin{lstlisting}
+for (int i = 0; i < n; i++) {
+    if (array[i] == x) {
+        // x found at index i
+    }
+}
+\end{lstlisting}
+
+The time complexity of this approach is $O(n)$,
+because in the worst case, it is necessary to check
+all elements of the array.
+If the order of the elements is arbitrary,
+this is also the best possible approach, because
+there is no additional information available where
+in the array we should search for the element $x$.
+
+However, if the array is \emph{sorted},
+the situation is different.
+In this case it is possible to perform the
+search much faster, because the order of the
+elements in the array guides the search.
+The following \key{binary search} algorithm
+efficiently searches for an element in a sorted array
+in $O(\log n)$ time.
+
+\subsubsection{Method 1}
+
+The usual way to implement binary search
+resembles looking for a word in a dictionary.
+The search maintains an active region in the array,
+which initially contains all array elements.
+Then, a number of steps is performed,
+each of which halves the size of the region.
+
+At each step, the search checks the middle element
+of the active region.
+If the middle element is the target element,
+the search terminates.
+Otherwise, the search recursively continues
+to the left or right half of the region,
+depending on the value of the middle element.
+
+The above idea can be implemented as follows:
+\begin{lstlisting}
+int a = 0, b = n-1;
+while (a <= b) {
+    int k = (a+b)/2;
+    if (array[k] == x) {
+        // x found at index k
+    }
+    if (array[k] > x) b = k-1;
+    else a = k+1;
+}
+\end{lstlisting}
+
+In this implementation, the active region is $a \ldots b$,
+and initially the region is $0 \ldots n-1$.
+The algorithm halves the size of the region at each step,
+so the time complexity is $O(\log n)$.
+
+\subsubsection{Method 2}
+
+An alternative method to implement binary search
+is based on an efficient way to iterate through
+the elements of the array.
+The idea is to make jumps and slow the speed
+when we get closer to the target element.
+
+The search goes through the array from left to
+right, and the initial jump length is $n/2$.
+At each step, the jump length will be halved:
+first $n/4$, then $n/8$, $n/16$, etc., until
+finally the length is 1.
+After the jumps, either the target element has
+been found or we know that it does not appear in the array.
+
+The following code implements the above idea:
+\begin{lstlisting}
+int k = 0;
+for (int b = n/2; b >= 1; b /= 2) {
+    while (k+b < n && array[k+b] <= x) k += b;
+}
+if (array[k] == x) {
+    // x found at index k
+}
+\end{lstlisting}
+
+During the search, the variable $b$
+contains the current jump length.
+The time complexity of the algorithm is $O(\log n)$,
+because the code in the \texttt{while} loop
+is performed at most twice for each jump length.
+
+\subsubsection{C++ functions}
+
+The C++ standard library contains the following functions
+that are based on binary search and work in logarithmic time:
+
+\begin{itemize}
+\item \texttt{lower\_bound} returns a pointer to the
+first array element whose value is at least $x$.
+\item \texttt{upper\_bound} returns a pointer to the
+first array element whose value is larger than $x$.
+\item \texttt{equal\_range} returns both above pointers.
+\end{itemize}
+
+The functions assume that the array is sorted.
+If there is no such element, the pointer points to
+the element after the last array element.
+For example, the following code finds out whether
+an array contains an element with value $x$:
+
+\begin{lstlisting}
+auto k = lower_bound(array,array+n,x)-array;
+if (k < n && array[k] == x) {
+    // x found at index k
+}
+\end{lstlisting}
+
+Then, the following code counts the number of elements
+whose value is $x$:
+
+\begin{lstlisting}
+auto a = lower_bound(array, array+n, x);
+auto b = upper_bound(array, array+n, x);
+cout << b-a << "\n";
+\end{lstlisting}
+
+Using \texttt{equal\_range}, the code becomes shorter:
+
+\begin{lstlisting}
+auto r = equal_range(array, array+n, x);
+cout << r.second-r.first << "\n";
+\end{lstlisting}
+
+\subsubsection{Finding the smallest solution}
+
+An important use for binary search is
+to find the position where the value of a \emph{function} changes.
+Suppose that we wish to find the smallest value $k$
+that is a valid solution for a problem.
+We are given a function $\texttt{ok}(x)$
+that returns \texttt{true} if $x$ is a valid solution
+and \texttt{false} otherwise.
+In addition, we know that $\texttt{ok}(x)$ is \texttt{false}
+when $x<k$ and \texttt{true} when $x \ge k$.
+The situation looks as follows:
+
+\begin{center}
+\begin{tabular}{r|rrrrrrrr}
+$x$ & 0 & 1 & $\cdots$ & $k-1$ & $k$ & $k+1$ & $\cdots$ \\
+\hline
+$\texttt{ok}(x)$ & \texttt{false} & \texttt{false}
+& $\cdots$ & \texttt{false} & \texttt{true} & \texttt{true} & $\cdots$ \\
+\end{tabular}
+\end{center}
+
+\noindent
+Now, the value of $k$ can be found using binary search:
+
+\begin{lstlisting}
+int x = -1;
+for (int b = z; b >= 1; b /= 2) {
+    while (!ok(x+b)) x += b;
+}
+int k = x+1;
+\end{lstlisting}
+
+The search finds the largest value of $x$ for which
+$\texttt{ok}(x)$ is \texttt{false}.
+Thus, the next value $k=x+1$
+is the smallest possible value for which
+$\texttt{ok}(k)$ is \texttt{true}.
+The initial jump length $z$ has to be
+large enough, for example some value
+for which we know beforehand that $\texttt{ok}(z)$ is \texttt{true}.
+
+The algorithm calls the function \texttt{ok}
+$O(\log z)$ times, so the total time complexity
+depends on the function \texttt{ok}.
+For example, if the function works in $O(n)$ time,
+the total time complexity is $O(n \log z)$.
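+
+As a concrete illustration, suppose that we want to find
+the smallest integer $k$ for which $k^2 \ge n$.
+The following sketch uses this condition as the function \texttt{ok};
+the function name \texttt{smallest} and the values $n=30$ and $z=30$
+are assumptions made only for this example:
+\begin{lstlisting}
+long long n = 30;
+
+bool ok(long long x) {
+    return x*x >= n;
+}
+
+int smallest() {
+    int x = -1;
+    // z = 30 is large enough, because ok(30) is true
+    for (int b = 30; b >= 1; b /= 2) {
+        while (!ok(x+b)) x += b;
+    }
+    return x+1; // 6, because 5*5 < 30 <= 6*6
+}
+\end{lstlisting}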
+
+\subsubsection{Finding the maximum value}
+
+Binary search can also be used to find
+the maximum value for a function that is
+first increasing and then decreasing.
+Our task is to find a position $k$ such that
+
+\begin{itemize}
+\item
+$f(x)<f(x+1)$ when $x<k$, and
+\item
+$f(x)>f(x+1)$ when $x \ge k$.
+\end{itemize}
+
+The idea is to use binary search
+for finding the largest value of $x$
+for which $f(x)<f(x+1)$.
+This implies that $k=x+1$
+because $f(x+1)>f(x+2)$.
+The following code implements the search: 
+
+\begin{lstlisting}
+int x = -1;
+for (int b = z; b >= 1; b /= 2) {
+    while (f(x+b) < f(x+b+1)) x += b;
+}
+int k = x+1;
+\end{lstlisting}
+
+Note that unlike in the ordinary binary search,
+here it is not allowed that consecutive values
+of the function are equal.
+In this case it would not be possible to know
+how to continue the search.
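+
+As a small sketch of this search, assume that the values of the function
+are stored in an array \texttt{t}, so that $f(x)=\texttt{t}[x]$.
+The array contents below are made up for illustration,
+and the extra condition \texttt{x+b+1 < n} is added here
+only to keep the indices inside the array:
+\begin{lstlisting}
+vector<int> t = {1,3,6,8,7,4,2};
+int n = t.size();
+int x = -1;
+for (int b = n; b >= 1; b /= 2) {
+    while (x+b+1 < n && t[x+b] < t[x+b+1]) x += b;
+}
+int k = x+1; // 3, and t[3] = 8 is the maximum value
+\end{lstlisting}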
--- a/chapter04.tex
+++ b/chapter04.tex
@ -0,0 +1,794 @@
+\chapter{Data structures}
+
+\index{data structure}
+
+A \key{data structure} is a way to store
+data in the memory of a computer.
+It is important to choose an appropriate
+data structure for a problem,
+because each data structure has its own
+advantages and disadvantages.
+The crucial question is: which operations
+are efficient in the chosen data structure?
+
+This chapter introduces the most important
+data structures in the C++ standard library.
+It is a good idea to use the standard library
+whenever possible,
+because it will save a lot of time.
+Later in the book we will learn about more sophisticated
+data structures that are not available
+in the standard library.
+
+\section{Dynamic arrays}
+
+\index{dynamic array}
+\index{vector}
+
+A \key{dynamic array} is an array whose
+size can be changed during the execution
+of the program.
+The most popular dynamic array in C++ is
+the \texttt{vector} structure,
+which can be used almost like an ordinary array.
+
+The following code creates an empty vector and
+adds three elements to it:
+
+\begin{lstlisting}
+vector<int> v;
+v.push_back(3); // [3]
+v.push_back(2); // [3,2]
+v.push_back(5); // [3,2,5]
+\end{lstlisting}
+
+After this, the elements can be accessed like in an ordinary array:
+
+\begin{lstlisting}
+cout << v[0] << "\n"; // 3
+cout << v[1] << "\n"; // 2
+cout << v[2] << "\n"; // 5
+\end{lstlisting}
+
+The function \texttt{size} returns the number of elements in the vector.
+The following code iterates through
+the vector and prints all elements in it:
+
+\begin{lstlisting}
+for (int i = 0; i < v.size(); i++) {
+    cout << v[i] << "\n";
+}
+\end{lstlisting}
+
+\begin{samepage}
+A shorter way to iterate through a vector is as follows:
+
+\begin{lstlisting}
+for (auto x : v) {
+    cout << x << "\n";
+}
+\end{lstlisting}
+\end{samepage}
+
+The function \texttt{back} returns the last element
+in the vector, and
+the function \texttt{pop\_back} removes the last element:
+
+\begin{lstlisting}
+vector<int> v;
+v.push_back(5);
+v.push_back(2);
+cout << v.back() << "\n"; // 2
+v.pop_back();
+cout << v.back() << "\n"; // 5
+\end{lstlisting}
+
+The following code creates a vector with five elements:
+
+\begin{lstlisting}
+vector<int> v = {2,4,2,5,1};
+\end{lstlisting}
+
+Another way to create a vector is to give the number
+of elements and, optionally, the initial value for each element:
+
+\begin{lstlisting}
+// size 10, initial value 0
+vector<int> v(10);
+\end{lstlisting}
+\begin{lstlisting}
+// size 10, initial value 5
+vector<int> v(10, 5);
+\end{lstlisting}
+
+The internal implementation of a vector
+uses an ordinary array.
+If the size of the vector increases and
+the array becomes too small,
+a new array is allocated and all the
+elements are moved to the new array.
+However, this does not happen often and the
+amortized time complexity of
+\texttt{push\_back} is $O(1)$.
+
+\index{string}
+
+The \texttt{string} structure
+is also a dynamic array that can be used almost like a vector.
+In addition, there is special syntax for strings
+that is not available in other data structures.
+Strings can be combined using the \texttt{+} symbol.
+The function $\texttt{substr}(k,x)$ returns the substring
+that begins at position $k$ and has length $x$,
+and the function $\texttt{find}(\texttt{t})$ finds the position
+of the first occurrence of a substring \texttt{t}.
+
+The following code presents some string operations:
+
+\begin{lstlisting}
+string a = "hatti";
+string b = a+a;
+cout << b << "\n"; // hattihatti
+b[5] = 'v';
+cout << b << "\n"; // hattivatti
+string c = b.substr(3,4);
+cout << c << "\n"; // tiva
+\end{lstlisting}
+
+\section{Set structures}
+
+\index{set}
+
+A \key{set} is a data structure that
+maintains a collection of elements.
+The basic operations of sets are element
+insertion, search and removal.
+
+The C++ standard library contains two set
+implementations.
+The structure \texttt{set} is based on a balanced
+binary tree and its operations work in $O(\log n)$ time.
+The structure \texttt{unordered\_set} uses hashing,
+and its operations work in $O(1)$ time on average.
+
+The choice of which set implementation to use
+is often a matter of taste.
+The benefit of the \texttt{set} structure
+is that it maintains the order of the elements
+and provides functions that are not available
+in \texttt{unordered\_set}.
+On the other hand, \texttt{unordered\_set}
+can be more efficient.
+
+The following code creates a set
+that contains integers,
+and shows some of the operations.
+The function \texttt{insert} adds an element to the set,
+the function \texttt{count} returns the number of occurrences
+of an element in the set,
+and the function \texttt{erase} removes an element from the set.
+
+\begin{lstlisting}
+set<int> s;
+s.insert(3);
+s.insert(2);
+s.insert(5);
+cout << s.count(3) << "\n"; // 1
+cout << s.count(4) << "\n"; // 0
+s.erase(3);
+s.insert(4);
+cout << s.count(3) << "\n"; // 0
+cout << s.count(4) << "\n"; // 1
+\end{lstlisting}
+
+A set can be used mostly like a vector,
+but it is not possible to access
+the elements using the \texttt{[]} notation.
+The following code creates a set,
+prints the number of elements in it, and then
+iterates through all the elements:
+\begin{lstlisting}
+set<int> s = {2,5,6,8};
+cout << s.size() << "\n"; // 4
+for (auto x : s) {
+    cout << x << "\n";
+}
+\end{lstlisting}
+
+An important property of sets is
+that all their elements are \emph{distinct}.
+Thus, the function \texttt{count} always returns
+either 0 (the element is not in the set)
+or 1 (the element is in the set),
+and the function \texttt{insert} never adds
+an element to the set if it is
+already there.
+The following code illustrates this:
+
+\begin{lstlisting}
+set<int> s;
+s.insert(5);
+s.insert(5);
+s.insert(5);
+cout << s.count(5) << "\n"; // 1
+\end{lstlisting}
+
+C++ also contains the structures
+\texttt{multiset} and \texttt{unordered\_multiset}
+that otherwise work like \texttt{set}
+and \texttt{unordered\_set}
+but they can contain multiple instances of an element.
+For example, in the following code all three instances
+of the number 5 are added to a multiset:
+
+\begin{lstlisting}
+multiset<int> s;
+s.insert(5);
+s.insert(5);
+s.insert(5);
+cout << s.count(5) << "\n"; // 3
+\end{lstlisting}
+The function \texttt{erase} removes
+all instances of an element
+from a multiset:
+\begin{lstlisting}
+s.erase(5);
+cout << s.count(5) << "\n"; // 0
+\end{lstlisting}
+Often, only one instance should be removed.
+If the multiset again contains three instances of the number 5,
+this can be done as follows:
+\begin{lstlisting}
+s.erase(s.find(5));
+cout << s.count(5) << "\n"; // 2
+\end{lstlisting}
+
+\section{Map structures}
+
+\index{map}
+
+A \key{map} is a generalized array
+that consists of key-value pairs.
+While the keys in an ordinary array are always
+the consecutive integers $0,1,\ldots,n-1$,
+where $n$ is the size of the array,
+the keys in a map can be of any data type and
+they do not have to be consecutive values.
+
+The C++ standard library contains two map
+implementations that correspond to the set
+implementations: the structure
+\texttt{map} is based on a balanced
+binary tree and accessing elements
+takes $O(\log n)$ time,
+while the structure
+\texttt{unordered\_map} uses hashing
+and accessing elements takes $O(1)$ time on average.
+
+The following code creates a map
+where the keys are strings and the values are integers:
+
+\begin{lstlisting}
+map<string,int> m;
+m["monkey"] = 4;
+m["banana"] = 3;
+m["harpsichord"] = 9;
+cout << m["banana"] << "\n"; // 3
+\end{lstlisting}
+
+If the value of a key is requested
+but the map does not contain it,
+the key is automatically added to the map with
+a default value.
+For example, in the following code,
+the key ``aybabtu'' with value 0
+is added to the map.
+
+\begin{lstlisting}
+map<string,int> m;
+cout << m["aybabtu"] << "\n"; // 0
+\end{lstlisting}
+The function \texttt{count} checks
+if a key exists in a map:
+\begin{lstlisting}
+if (m.count("aybabtu")) {
+    // key exists
+}
+\end{lstlisting}
+The following code prints all the keys and values
+in a map:
+\begin{lstlisting}
+for (auto x : m) {
+    cout << x.first << " " << x.second << "\n";
+}
+\end{lstlisting}
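+Because a missing key is created with the default value 0,
+a map is convenient for counting occurrences.
+The following sketch, whose word list is made up for illustration,
+counts how many times each word appears:
+\begin{lstlisting}
+vector<string> words = {"banana","monkey","banana"};
+map<string,int> cnt;
+for (auto w : words) cnt[w]++;
+for (auto x : cnt) {
+    cout << x.first << " " << x.second << "\n";
+}
+// banana 2
+// monkey 1
+\end{lstlisting}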
+
+\section{Iterators and ranges}
+
+\index{iterator}
+
+Many functions in the C++ standard library
+operate with iterators.
+An \key{iterator} is a variable that points
+to an element in a data structure.
+
+The commonly used iterators \texttt{begin}
+and \texttt{end} define a range that contains
+all elements in a data structure.
+The iterator \texttt{begin} points to
+the first element in the data structure,
+and the iterator \texttt{end} points to
+the position \emph{after} the last element.
+The situation looks as follows:
+
+\begin{center}
+\begin{tabular}{llllllllll}
+\{ & 3, & 4, & 6, & 8, & 12, & 13, & 14, & 17 & \} \\
+& $\uparrow$ & & & & & & & & $\uparrow$ \\
+& \multicolumn{3}{l}{\texttt{s.begin()}} & & & & & & \texttt{s.end()} \\
+\end{tabular}
+\end{center}
+
+Note the asymmetry in the iterators:
+\texttt{s.begin()} points to an element in the data structure,
+while \texttt{s.end()} points outside the data structure.
+Thus, the range defined by the iterators is \emph{half-open}.
+
+\subsubsection{Working with ranges}
+
+Iterators are used in C++ standard library functions
+that are given a range of elements in a data structure.
+Usually, we want to process all elements in a
+data structure, so the iterators
+\texttt{begin} and \texttt{end} are given to the function.
+
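+For example, the following code sorts a vector
+using the function \texttt{sort},
+then reverses the order of the elements using the function
+\texttt{reverse}, and finally shuffles the order of
+the elements using the function \texttt{random\_shuffle}.
+Note that \texttt{random\_shuffle} was deprecated in C++14
+and removed in C++17; newer code uses the function
+\texttt{shuffle} with a random number generator instead.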
+
+\index{sort@\texttt{sort}}
+\index{reverse@\texttt{reverse}}
+\index{random\_shuffle@\texttt{random\_shuffle}}
+
+\begin{lstlisting}
+sort(v.begin(), v.end());
+reverse(v.begin(), v.end());
+random_shuffle(v.begin(), v.end());
+\end{lstlisting}
+
+These functions can also be used with an ordinary array.
+In this case, the functions are given pointers to the array
+instead of iterators:
+
+\newpage
+\begin{lstlisting}
+sort(a, a+n);
+reverse(a, a+n);
+random_shuffle(a, a+n);
+\end{lstlisting}
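+With compilers that follow C++17 or later,
+where \texttt{random\_shuffle} is not available,
+the shuffling can be done using the function \texttt{shuffle}.
+The following is only a sketch: the generator name \texttt{rng}
+and the way it is seeded are assumptions made for this example,
+and the header \texttt{<random>} is needed:
+\begin{lstlisting}
+random_device rd;
+mt19937 rng(rd()); // random number generator
+shuffle(v.begin(), v.end(), rng);
+shuffle(a, a+n, rng);
+\end{lstlisting}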
+
+\subsubsection{Set iterators}
+
+Iterators are often used to access
+elements of a set.
+The following code creates an iterator
+\texttt{it} that points to the smallest element in a set:
+\begin{lstlisting}
+set<int>::iterator it = s.begin();
+\end{lstlisting}
+A shorter way to write the code is as follows:
+\begin{lstlisting}
+auto it = s.begin();
+\end{lstlisting}
+The element to which an iterator points
+can be accessed using the \texttt{*} symbol.
+For example, the following code prints
+the first element in the set:
+
+\begin{lstlisting}
+auto it = s.begin();
+cout << *it << "\n";
+\end{lstlisting}
+
+Iterators can be moved using the operators
+\texttt{++} (forward) and \texttt{--} (backward),
+meaning that the iterator moves to the next
+or previous element in the set.
+
+The following code prints all the elements
+in increasing order:
+\begin{lstlisting}
+for (auto it = s.begin(); it != s.end(); it++) {
+    cout << *it << "\n";
+}
+\end{lstlisting}
+The following code prints the largest element in the set:
+\begin{lstlisting}
+auto it = s.end(); it--;
+cout << *it << "\n";
+\end{lstlisting}
+
+The function $\texttt{find}(x)$ returns an iterator
+that points to an element whose value is $x$.
+However, if the set does not contain $x$,
+the iterator will be \texttt{end}.
+
+\begin{lstlisting}
+auto it = s.find(x);
+if (it == s.end()) {
+    // x is not found
+}
+\end{lstlisting}
+
+The function $\texttt{lower\_bound}(x)$ returns
+an iterator to the smallest element in the set
+whose value is \emph{at least} $x$, and
+the function $\texttt{upper\_bound}(x)$
+returns an iterator to the smallest element in the set
+whose value is \emph{larger than} $x$.
+In both functions, if such an element does not exist,
+the return value is \texttt{end}.
+These functions are not supported by the
+\texttt{unordered\_set} structure which
+does not maintain the order of the elements.
+
+\begin{samepage}
+For example, the following code finds the element
+nearest to $x$:
+
+\begin{lstlisting}
+auto it = s.lower_bound(x);
+if (it == s.begin()) {
+    cout << *it << "\n";
+} else if (it == s.end()) {
+    it--;
+    cout << *it << "\n";
+} else {
+    int a = *it; it--;
+    int b = *it;
+    if (x-b < a-x) cout << b << "\n";
+    else cout << a << "\n";
+}
+\end{lstlisting}
+
+The code assumes that the set is not empty,
+and goes through all possible cases
+using an iterator \texttt{it}.
+First, the iterator points to the smallest
+element whose value is at least $x$.
+If \texttt{it} equals \texttt{begin},
+the corresponding element is nearest to $x$.
+If \texttt{it} equals \texttt{end},
+the largest element in the set is nearest to $x$.
+If none of the previous cases hold,
+the element nearest to $x$ is either the
+element that corresponds to \texttt{it} or the previous element.
+\end{samepage}
+
+\section{Other structures}
+
+\subsubsection{Bitset}
+
+\index{bitset}
+
+A \key{bitset} is an array
+in which each value is either 0 or 1.
+For example, the following code creates
+a bitset that contains 10 elements:
+\begin{lstlisting}
+bitset<10> s;
+s[1] = 1;
+s[3] = 1;
+s[4] = 1;
+s[7] = 1;
+cout << s[4] << "\n"; // 1
+cout << s[5] << "\n"; // 0
+\end{lstlisting}
+
+The benefit of using bitsets is that
+they require less memory than ordinary arrays,
+because each element in a bitset only
+uses one bit of memory.
+For example, 
+if $n$ bits are stored in an \texttt{int} array,
+$32n$ bits of memory will be used,
+but a corresponding bitset only requires $n$ bits of memory.
+In addition, the values of a bitset
+can be efficiently manipulated using
+bit operators, which makes it possible to
+optimize algorithms using bitsets.
+
+The following code shows another way to create the above bitset:
+\begin{lstlisting}
+bitset<10> s(string("0010011010")); // from right to left
+cout << s[4] << "\n"; // 1
+cout << s[5] << "\n"; // 0
+\end{lstlisting}
+
+The function \texttt{count} returns the number
+of ones in the bitset:
+
+\begin{lstlisting}
+bitset<10> s(string("0010011010"));
+cout << s.count() << "\n"; // 4
+\end{lstlisting}
+
+The following code shows examples of using bit operations:
+\begin{lstlisting}
+bitset<10> a(string("0010110110"));
+bitset<10> b(string("1011011000"));
+cout << (a&b) << "\n"; // 0010010000
+cout << (a|b) << "\n"; // 1011111110
+cout << (a^b) << "\n"; // 1001101110
+\end{lstlisting}
+
+\subsubsection{Deque}
+
+\index{deque}
+
+A \key{deque} is a dynamic array
+whose size can be efficiently
+changed at both ends of the array.
+Like a vector, a deque provides the functions
+\texttt{push\_back} and \texttt{pop\_back}, but
+it also includes the functions
+\texttt{push\_front} and \texttt{pop\_front}
+which are not available in a vector.
+
+A deque can be used as follows:
+\begin{lstlisting}
+deque<int> d;
+d.push_back(5); // [5]
+d.push_back(2); // [5,2]
+d.push_front(3); // [3,5,2]
+d.pop_back(); // [3,5]
+d.pop_front(); // [5]
+\end{lstlisting}
+
+The internal implementation of a deque
+is more complex than that of a vector,
+and for this reason, a deque is slower than a vector.
+Still, both adding and removing
+elements take $O(1)$ time on average at both ends.
+
+\subsubsection{Stack}
+
+\index{stack}
+
+A \key{stack}
+is a data structure that provides two
+$O(1)$ time operations:
+adding an element to the top,
+and removing an element from the top.
+It is only possible to access the top
+element of a stack.
+
+The following code shows how a stack can be used:
+\begin{lstlisting}
+stack<int> s;
+s.push(3);
+s.push(2);
+s.push(5);
+cout << s.top(); // 5
+s.pop();
+cout << s.top(); // 2
+\end{lstlisting}
+\subsubsection{Queue}
+
+\index{queue}
+
+A \key{queue} also
+provides two $O(1)$ time operations:
+adding an element to the end of the queue,
+and removing the first element in the queue.
+It is only possible to access the first
+and last element of a queue.
+
+The following code shows how a queue can be used:
+\begin{lstlisting}
+queue<int> q;
+q.push(3);
+q.push(2);
+q.push(5);
+cout << q.front(); // 3
+q.pop();
+cout << q.front(); // 2
+\end{lstlisting}
+
+\subsubsection{Priority queue}
+
+\index{priority queue}
+\index{heap}
+
+A \key{priority queue}
+maintains a set of elements.
+The supported operations are insertion and,
+depending on the type of the queue,
+retrieval and removal of
+either the minimum or maximum element.
+Insertion and removal take $O(\log n)$ time,
+and retrieval takes $O(1)$ time.
+
+While an ordered set efficiently supports
+all the operations of a priority queue,
+the benefit of using a priority queue is
+that it has smaller constant factors.
+A priority queue is usually implemented using
+a heap structure that is much simpler than a
+balanced binary tree used in an ordered set.
+
+\begin{samepage}
+By default, the elements in a C++
+priority queue are sorted in decreasing order,
+and it is possible to find and remove the
+largest element in the queue.
+The following code illustrates this:
+
+\begin{lstlisting}
+priority_queue<int> q;
+q.push(3);
+q.push(5);
+q.push(7);
+q.push(2);
+cout << q.top() << "\n"; // 7
+q.pop();
+cout << q.top() << "\n"; // 5
+q.pop();
+q.push(6);
+cout << q.top() << "\n"; // 6
+q.pop();
+\end{lstlisting}
+\end{samepage}
+
+If we want to create a priority queue
+that supports finding and removing
+the smallest element,
+we can do it as follows:
+
+\begin{lstlisting}
+priority_queue<int,vector<int>,greater<int>> q;
+\end{lstlisting}
+
+\subsubsection{Policy-based data structures}
+
+The \texttt{g++} compiler also supports
+some data structures that are not part
+of the C++ standard library.
+Such structures are called \emph{policy-based}
+data structures.
+To use these structures, the following lines
+must be added to the code:
+\begin{lstlisting}
+#include <ext/pb_ds/assoc_container.hpp>
+using namespace __gnu_pbds; 
+\end{lstlisting}
+After this, we can define a data structure \texttt{indexed\_set} that
+is like \texttt{set} but can be indexed like an array.
+The definition for \texttt{int} values is as follows:
+\begin{lstlisting}
+typedef tree<int,null_type,less<int>,rb_tree_tag,
+             tree_order_statistics_node_update> indexed_set; 
+\end{lstlisting}
+Now we can create a set as follows:
+\begin{lstlisting}
+indexed_set s;
+s.insert(2);
+s.insert(3);
+s.insert(7);
+s.insert(9);
+\end{lstlisting}
+The special feature of this set is that we have access to
+the indices that the elements would have in a sorted array.
+The function $\texttt{find\_by\_order}$ returns
+an iterator to the element at a given position:
+\begin{lstlisting}
+auto x = s.find_by_order(2);
+cout << *x << "\n"; // 7
+\end{lstlisting}
+And the function $\texttt{order\_of\_key}$
+returns the position of a given element:
+\begin{lstlisting}
+cout << s.order_of_key(7) << "\n"; // 2
+\end{lstlisting}
+If the element does not appear in the set,
+we get the position that the element would have
+in the set:
+\begin{lstlisting}
+cout << s.order_of_key(6) << "\n"; // 2
+cout << s.order_of_key(8) << "\n"; // 3
+\end{lstlisting}
+Both functions work in logarithmic time.
+
+\section{Comparison to sorting}
+
+It is often possible to solve a problem
+using either data structures or sorting.
+Sometimes there are remarkable differences
+in the actual efficiency of these approaches,
+which are not visible in their time complexities.
+
+Let us consider a problem where
+we are given two lists $A$ and $B$
+that both contain $n$ elements.
+Our task is to calculate the number of elements
+that belong to both of the lists.
+For example, for the lists
+\[A = [5,2,8,9] \hspace{10px} \textrm{and} \hspace{10px} B = [3,2,9,5],\]
+the answer is 3 because the numbers 2, 5
+and 9 belong to both of the lists.
+
+A straightforward solution to the problem is
+to go through all pairs of elements in $O(n^2)$ time,
+but next we will focus on
+more efficient algorithms.
+
+\subsubsection{Algorithm 1}
+
+We construct a set of the elements that appear in $A$,
+and after this, we iterate through the elements
+of $B$ and check for each element whether it
+also belongs to $A$.
+This is efficient because the elements of $A$
+are in a set.
+Using the \texttt{set} structure,
+the time complexity of the algorithm is $O(n \log n)$.
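+
+A possible implementation of this algorithm is sketched below.
+The function name \texttt{common\_set} is chosen only for illustration,
+and the code assumes that the elements of each list are distinct,
+as in the example above:
+\begin{lstlisting}
+int common_set(const vector<int> &A, const vector<int> &B) {
+    // store the elements of A in a set
+    set<int> s(A.begin(), A.end());
+    int c = 0;
+    // count the elements of B that also belong to A
+    for (int x : B) {
+        if (s.count(x)) c++;
+    }
+    return c;
+}
+\end{lstlisting}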
+
+\subsubsection{Algorithm 2}
+
+It is not necessary to maintain an ordered set,
+so instead of the \texttt{set} structure
+we can also use the \texttt{unordered\_set} structure.
+This is an easy way to make the algorithm
+more efficient, because we only have to change
+the underlying data structure.
+The expected time complexity of the new algorithm is $O(n)$.
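+
+As a rough sketch of Algorithms 1 and 2, the following function template
+counts the common elements using whichever set type it is given.
+The name \texttt{count\_common} and the template parameter \texttt{Set}
+are illustrative assumptions, and the code assumes that
+no value is repeated within $B$.
+
+\begin{lstlisting}
+#include <set>
+#include <unordered_set>
+#include <vector>
+using std::vector;
+
+// Counts the elements of b that also appear in a.
+// Set chooses the structure: std::set<int> corresponds to
+// Algorithm 1 and std::unordered_set<int> to Algorithm 2.
+template <class Set>
+int count_common(const vector<int>& a, const vector<int>& b) {
+    Set seen(a.begin(), a.end());      // build the set from A
+    int common = 0;
+    for (int x : b) {
+        if (seen.count(x)) common++;   // membership test for each element of B
+    }
+    return common;
+}
+\end{lstlisting}
+
+For the example lists $A = [5,2,8,9]$ and $B = [3,2,9,5]$,
+the function returns 3 with either set type.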
+
+\subsubsection{Algorithm 3}
+
+Instead of data structures, we can use sorting.
+First, we sort both lists $A$ and $B$.
+After this, we iterate through both the lists
+at the same time and find the common elements.
+The time complexity of sorting is $O(n \log n)$,
+and the rest of the algorithm works in $O(n)$ time,
+so the total time complexity is $O(n \log n)$.
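+
+Algorithm 3 could be sketched as follows.
+The function name \texttt{count\_common\_sorted} is an assumption made
+for illustration, and the code assumes that no value is repeated
+within a single list.
+
+\begin{lstlisting}
+#include <algorithm>
+#include <cstddef>
+#include <vector>
+using std::vector;
+
+// Counts common elements by sorting both lists and scanning them together.
+// The lists are taken by value, so the caller's data is not reordered.
+int count_common_sorted(vector<int> a, vector<int> b) {
+    std::sort(a.begin(), a.end());
+    std::sort(b.begin(), b.end());
+    std::size_t i = 0, j = 0;
+    int common = 0;
+    while (i < a.size() && j < b.size()) {
+        if (a[i] < b[j]) i++;          // a[i] cannot appear later in b
+        else if (a[i] > b[j]) j++;     // b[j] cannot appear later in a
+        else { common++; i++; j++; }   // same value in both lists
+    }
+    return common;
+}
+\end{lstlisting}
+
+After sorting, each step of the loop advances at least one of the two
+indices, which is why the scan itself takes only linear time.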
+
+\subsubsection{Efficiency comparison}
+
+The following table shows how efficient
+the above algorithms are when $n$ varies and
+the elements of the lists are random
+integers between $1 \ldots 10^9$:
+
+\begin{center}
+\begin{tabular}{rrrr}
+$n$ & Algorithm 1 & Algorithm 2 & Algorithm 3 \\
+\hline
+$10^6$ & $1.5$ s & $0.3$ s & $0.2$ s \\
+$2 \cdot 10^6$ & $3.7$ s & $0.8$ s & $0.3$ s \\
+$3 \cdot 10^6$ & $5.7$ s & $1.3$ s & $0.5$ s \\
+$4 \cdot 10^6$ & $7.7$ s & $1.7$ s & $0.7$ s \\
+$5 \cdot 10^6$ & $10.0$ s & $2.3$ s & $0.9$ s \\
+\end{tabular}
+\end{center}
+
+Algorithms 1 and 2 are identical except that
+they use different set structures.
+In this problem, this choice has an important effect on
+the running time, because Algorithm 2
+is 4–5 times faster than Algorithm 1.
+
+However, the most efficient algorithm is Algorithm 3
+which uses sorting.
+It only uses half the time compared to Algorithm 2.
+Interestingly, the time complexity of both
+Algorithm 1 and Algorithm 3 is $O(n \log n)$,
+but despite this, Algorithm 3 is roughly ten times faster.
+This can be explained by the fact that
+sorting is a simple procedure and it is done
+only once at the beginning of Algorithm 3,
+and the rest of the algorithm works in linear time.
+On the other hand,
+Algorithm 1 maintains a complex balanced binary tree
+during the whole algorithm.
--- a/chapter05.tex
+++ b/chapter05.tex
@ -0,0 +1,758 @@
+\chapter{Complete search}
+
+\key{Complete search}
+is a general method that can be used
+to solve almost any algorithm problem.
+The idea is to generate all possible
+solutions to the problem using brute force,
+and then select the best solution or count the
+number of solutions, depending on the problem.
+
+Complete search is a good technique
+if there is enough time to go through all the solutions,
+because the search is usually easy to implement
+and it always gives the correct answer.
+If complete search is too slow,
+other techniques, such as greedy algorithms or
+dynamic programming, may be needed.
+
+\section{Generating subsets}
+
+\index{subset}
+
+We first consider the problem of generating
+all subsets of a set of $n$ elements.
+For example, the subsets of $\{0,1,2\}$ are
+$\emptyset$, $\{0\}$, $\{1\}$, $\{2\}$, $\{0,1\}$,
+$\{0,2\}$, $\{1,2\}$ and $\{0,1,2\}$.
+There are two common methods to generate subsets:
+we can either perform a recursive search
+or exploit the bit representation of integers.
+
+\subsubsection{Method 1}
+
+An elegant way to go through all subsets
+of a set is to use recursion.
+The following function \texttt{search}
+generates the subsets of the set
+$\{0,1,\ldots,n-1\}$.
+The function maintains a vector \texttt{subset}
+that will contain the elements of each subset.
+The search begins when the function is called
+with parameter 0.
+
+\begin{lstlisting}
+void search(int k) {
+    if (k == n) {
+        // process subset
+    } else {
+        search(k+1);
+        subset.push_back(k);
+        search(k+1);
+        subset.pop_back();
+    }
+}
+\end{lstlisting}
+
+When the function \texttt{search}
+is called with parameter $k$,
+it decides whether to include the
+element $k$ in the subset or not,
+and in both cases,
+it then calls itself with parameter $k+1$.
+However, if $k=n$, the function notices that
+all elements have been processed
+and a subset has been generated.
+
+The following tree illustrates the function calls when $n=3$.
+We can always choose either the left branch
+($k$ is not included in the subset) or the right branch
+($k$ is included in the subset).
+
+\begin{center}
+\begin{tikzpicture}[scale=.45]
+  \begin{scope}
+    \small
+    \node at (0,0) {$\texttt{search}(0)$};
+
+    \node at (-8,-4) {$\texttt{search}(1)$};
+    \node at (8,-4) {$\texttt{search}(1)$};
+
+    \path[draw,thick,->] (0,0-0.5) -- (-8,-4+0.5);
+    \path[draw,thick,->] (0,0-0.5) -- (8,-4+0.5);
+
+    \node at (-12,-8) {$\texttt{search}(2)$};
+    \node at (-4,-8) {$\texttt{search}(2)$};
+    \node at (4,-8) {$\texttt{search}(2)$};
+    \node at (12,-8) {$\texttt{search}(2)$};
+
+    \path[draw,thick,->] (-8,-4-0.5) -- (-12,-8+0.5);
+    \path[draw,thick,->] (-8,-4-0.5) -- (-4,-8+0.5);
+    \path[draw,thick,->] (8,-4-0.5) -- (4,-8+0.5);
+    \path[draw,thick,->] (8,-4-0.5) -- (12,-8+0.5);
+
+    \node at (-14,-12) {$\texttt{search}(3)$};
+    \node at (-10,-12) {$\texttt{search}(3)$};
+    \node at (-6,-12) {$\texttt{search}(3)$};
+    \node at (-2,-12) {$\texttt{search}(3)$};
+    \node at (2,-12) {$\texttt{search}(3)$};
+    \node at (6,-12) {$\texttt{search}(3)$};
+    \node at (10,-12) {$\texttt{search}(3)$};
+    \node at (14,-12) {$\texttt{search}(3)$};
+
+    \node at (-14,-13.5) {$\emptyset$};
+    \node at (-10,-13.5) {$\{2\}$};
+    \node at (-6,-13.5) {$\{1\}$};
+    \node at (-2,-13.5) {$\{1,2\}$};
+    \node at (2,-13.5) {$\{0\}$};
+    \node at (6,-13.5) {$\{0,2\}$};
+    \node at (10,-13.5) {$\{0,1\}$};
+    \node at (14,-13.5) {$\{0,1,2\}$};
+
+
+    \path[draw,thick,->] (-12,-8-0.5) -- (-14,-12+0.5);
+    \path[draw,thick,->] (-12,-8-0.5) -- (-10,-12+0.5);
+    \path[draw,thick,->] (-4,-8-0.5) -- (-6,-12+0.5);
+    \path[draw,thick,->] (-4,-8-0.5) -- (-2,-12+0.5);
+    \path[draw,thick,->] (4,-8-0.5) -- (2,-12+0.5);
+    \path[draw,thick,->] (4,-8-0.5) -- (6,-12+0.5);
+    \path[draw,thick,->] (12,-8-0.5) -- (10,-12+0.5);
+    \path[draw,thick,->] (12,-8-0.5) -- (14,-12+0.5);
+\end{scope}
+\end{tikzpicture}
+\end{center}
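+
+To make the recursion concrete, here is a self-contained sketch
+that stores every generated subset instead of processing it in place.
+The global variables and the wrapper \texttt{generate\_subsets}
+are assumptions added for illustration.
+
+\begin{lstlisting}
+#include <vector>
+using std::vector;
+
+int n;                            // the set is {0,1,...,n-1}
+vector<int> subset;               // elements chosen so far
+vector<vector<int>> all_subsets;  // finished subsets in search order
+
+void search(int k) {
+    if (k == n) {
+        all_subsets.push_back(subset);  // "process" the subset by storing it
+    } else {
+        search(k+1);                    // left branch: k is not included
+        subset.push_back(k);
+        search(k+1);                    // right branch: k is included
+        subset.pop_back();
+    }
+}
+
+// Hypothetical wrapper that resets the state and runs the search.
+vector<vector<int>> generate_subsets(int size) {
+    n = size;
+    subset.clear();
+    all_subsets.clear();
+    search(0);
+    return all_subsets;
+}
+\end{lstlisting}
+
+For $n=3$, the subsets are stored in the same left-to-right order
+as the leaves of the tree above, starting with $\emptyset$ and
+ending with $\{0,1,2\}$.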
+
+\subsubsection{Method 2}
+
+Another way to generate subsets is based on
+the bit representation of integers.
+Each subset of a set of $n$ elements
+can be represented as a sequence of $n$ bits,
+which corresponds to an integer between $0 \ldots 2^n-1$.
+The ones in the bit sequence indicate
+which elements are included in the subset.
+
+The usual convention is that
+the last bit corresponds to element 0,
+the second last bit corresponds to element 1,
+and so on.
+For example, the bit representation of 25
+is 11001, which corresponds to the subset $\{0,3,4\}$.
+
+The following code goes through the subsets
+of a set of $n$ elements:
+
+\begin{lstlisting}
+for (int b = 0; b < (1<<n); b++) {
+    // process subset
+}
+\end{lstlisting}
+
+The following code shows how we can find
+the elements of a subset that corresponds to a bit sequence.
+When processing each subset,
+the code builds a vector that contains the
+elements in the subset.
+
+\begin{lstlisting}
+for (int b = 0; b < (1<<n); b++) {
+    vector<int> subset;
+    for (int i = 0; i < n; i++) {
+        if (b&(1<<i)) subset.push_back(i);
+    }
+}
+\end{lstlisting}
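+
+The inner loop can also be packaged as a small function.
+The following sketch uses the hypothetical name \texttt{subset\_of}
+and assumes that $n$ is small enough for the shift to fit in an \texttt{int}.
+
+\begin{lstlisting}
+#include <vector>
+using std::vector;
+
+// Returns the elements of the subset encoded by the bits of b.
+// Bit i, counted from the least significant bit, stands for element i.
+vector<int> subset_of(int b, int n) {
+    vector<int> subset;
+    for (int i = 0; i < n; i++) {
+        if (b & (1 << i)) subset.push_back(i);
+    }
+    return subset;
+}
+\end{lstlisting}
+
+For example, \texttt{subset\_of(25, 5)} yields the elements 0, 3 and 4,
+which matches the bit representation 11001 discussed above.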
+
+\section{Generating permutations}
+
+\index{permutation}
+
+Next we consider the problem of generating
+all permutations of a set of $n$ elements.
+For example, the permutations of $\{0,1,2\}$ are
+$(0,1,2)$, $(0,2,1)$, $(1,0,2)$, $(1,2,0)$,
+$(2,0,1)$ and $(2,1,0)$.
+Again, there are two approaches:
+we can either use recursion or go through the
+permutations iteratively.
+
+\subsubsection{Method 1}
+
+Like subsets, permutations can be generated
+using recursion.
+The following function \texttt{search} goes
+through the permutations of the set $\{0,1,\ldots,n-1\}$.
+The function builds a vector \texttt{permutation}
+that contains the permutation,
+and the search begins when the function is
+called without parameters.
+
+\begin{lstlisting}
+void search() {
+    if (permutation.size() == n) {
+        // process permutation
+    } else {
+        for (int i = 0; i < n; i++) {
+            if (chosen[i]) continue;
+            chosen[i] = true;
+            permutation.push_back(i);
+            search();
+            chosen[i] = false;
+            permutation.pop_back();
+        }
+    }
+}
+\end{lstlisting}
+
+Each function call adds a new element to
+\texttt{permutation}.
+The array \texttt{chosen} indicates which
+elements are already included in the permutation.
+If the size of \texttt{permutation} equals the size of the set,
+a permutation has been generated.
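+
+The function above relies on variables that are declared elsewhere.
+A self-contained sketch might look as follows; the global declarations
+and the wrapper \texttt{generate\_permutations} are assumptions
+added for illustration.
+
+\begin{lstlisting}
+#include <vector>
+using std::vector;
+
+int n;                                 // the set is {0,1,...,n-1}
+vector<int> permutation;               // permutation built so far
+vector<bool> chosen;                   // chosen[i]: is i already used?
+vector<vector<int>> all_permutations;  // finished permutations
+
+void search() {
+    if ((int)permutation.size() == n) {
+        all_permutations.push_back(permutation);
+    } else {
+        for (int i = 0; i < n; i++) {
+            if (chosen[i]) continue;
+            chosen[i] = true;
+            permutation.push_back(i);
+            search();
+            chosen[i] = false;         // undo the choice
+            permutation.pop_back();
+        }
+    }
+}
+
+// Hypothetical wrapper that resets the state and runs the search.
+vector<vector<int>> generate_permutations(int size) {
+    n = size;
+    permutation.clear();
+    chosen.assign(n, false);
+    all_permutations.clear();
+    search();
+    return all_permutations;
+}
+\end{lstlisting}
+
+Because the loop tries the elements in increasing order,
+the permutations are produced in lexicographic order.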
+
+\subsubsection{Method 2}
+
+\index{next\_permutation@\texttt{next\_permutation}}
+
+Another method for generating permutations
+is to begin with the permutation
+$(0,1,\ldots,n-1)$ and repeatedly
+use a function that constructs the next permutation
+in increasing order.
+The C++ standard library contains the function
+\texttt{next\_permutation} that can be used for this:
+
+\begin{lstlisting}
+vector<int> permutation;
+for (int i = 0; i < n; i++) {
+    permutation.push_back(i);
+}
+do {
+    // process permutation
+} while (next_permutation(permutation.begin(),permutation.end()));
+\end{lstlisting}
+
+\section{Backtracking}
+
+\index{backtracking}
+
+A \key{backtracking} algorithm
+begins with an empty solution
+and extends the solution step by step.
+The search recursively
+goes through all the different ways
+a solution can be constructed.
+
+\index{queen problem}
+
+As an example, consider the problem of
+calculating the number
+of ways $n$ queens can be placed on
+an $n \times n$ chessboard so that
+no two queens attack each other.
+For example, when $n=4$,
+there are two possible solutions:
+
+\begin{center}
+\begin{tikzpicture}[scale=.65]
+  \begin{scope}
+    \draw (0, 0) grid (4, 4);
+    \node at (1.5,3.5) {\symqueen};
+    \node at (3.5,2.5) {\symqueen};
+    \node at (0.5,1.5) {\symqueen};
+    \node at (2.5,0.5) {\symqueen};
+
+    \draw (6, 0) grid (10, 4);
+    \node at (6+2.5,3.5) {\symqueen};
+    \node at (6+0.5,2.5) {\symqueen};
+    \node at (6+3.5,1.5) {\symqueen};
+    \node at (6+1.5,0.5) {\symqueen};
+
+  \end{scope}
+\end{tikzpicture}
+\end{center}
+
+The problem can be solved using backtracking
+by placing queens on the board row by row.
+More precisely, exactly one queen will
+be placed on each row so that no queen attacks
+any of the queens placed before.
+A solution has been found when all
+$n$ queens have been placed on the board.
+
+For example, when $n=4$,
+some partial solutions generated by
+the backtracking algorithm are as follows:
+
+\begin{center}
+\begin{tikzpicture}[scale=.55]
+  \begin{scope}
+    \draw (0, 0) grid (4, 4);
+
+    \draw (-9, -6) grid (-5, -2);
+    \draw (-3, -6) grid (1, -2);
+    \draw (3, -6) grid (7, -2);
+    \draw (9, -6) grid (13, -2);
+
+    \node at (-9+0.5,-3+0.5) {\symqueen};
+    \node at (-3+1+0.5,-3+0.5) {\symqueen};
+    \node at (3+2+0.5,-3+0.5) {\symqueen};
+    \node at (9+3+0.5,-3+0.5) {\symqueen};
+
+    \draw (2,0) -- (-7,-2);
+    \draw (2,0) -- (-1,-2);
+    \draw (2,0) -- (5,-2);
+    \draw (2,0) -- (11,-2);
+
+    \draw (-11, -12) grid (-7, -8);
+    \draw (-6, -12) grid (-2, -8);
+    \draw (-1, -12) grid (3, -8);
+    \draw (4, -12) grid (8, -8);
+    \draw[white] (11, -12) grid (15, -8);
+    \node at (-11+1+0.5,-9+0.5) {\symqueen};
+    \node at (-6+1+0.5,-9+0.5) {\symqueen};
+    \node at (-1+1+0.5,-9+0.5) {\symqueen};
+    \node at (4+1+0.5,-9+0.5) {\symqueen};
+    \node at (-11+0+0.5,-10+0.5) {\symqueen};
+    \node at (-6+1+0.5,-10+0.5) {\symqueen};
+    \node at (-1+2+0.5,-10+0.5) {\symqueen};
+    \node at (4+3+0.5,-10+0.5) {\symqueen};
+
+    \draw (-1,-6) -- (-9,-8);
+    \draw (-1,-6) -- (-4,-8);
+    \draw (-1,-6) -- (1,-8);
+    \draw (-1,-6) -- (6,-8);
+
+    \node at (-9,-13) {illegal};
+    \node at (-4,-13) {illegal};
+    \node at (1,-13) {illegal};
+    \node at (6,-13) {valid};
+
+  \end{scope}
+\end{tikzpicture}
+\end{center}
+
+At the bottom level, the first three configurations
+are illegal, because the queens attack each other.
+However, the fourth configuration is valid
+and it can be extended to a complete solution by
+placing two more queens on the board.
+There is only one way to place the two remaining queens.
+
+\begin{samepage}
+The algorithm can be implemented as follows:
+\begin{lstlisting}
+void search(int y) {
+    if (y == n) {
+        count++;
+        return;
+    }
+    for (int x = 0; x < n; x++) {
+        if (column[x] || diag1[x+y] || diag2[x-y+n-1]) continue;
+        column[x] = diag1[x+y] = diag2[x-y+n-1] = 1;
+        search(y+1);
+        column[x] = diag1[x+y] = diag2[x-y+n-1] = 0;
+    }
+}
+\end{lstlisting}
+\end{samepage}
+The search begins by calling \texttt{search(0)}.
+The size of the board is $n \times n$,
+and the code stores the number of solutions
+in \texttt{count}.
+
+The code assumes that the rows and columns
+of the board are numbered from 0 to $n-1$.
+When the function \texttt{search} is
+called with parameter $y$,
+it places a queen on row $y$
+and then calls itself with parameter $y+1$.
+If $y=n$, a solution has been found
+and the variable \texttt{count} is increased by one.
+
+The array \texttt{column} keeps track of columns
+that contain a queen,
+and the arrays \texttt{diag1} and \texttt{diag2}
+keep track of diagonals.
+It is not allowed to add another queen to a
+column or diagonal that already contains a queen. 
+For example, the columns and diagonals of
+the $4 \times 4$ board are numbered as follows:
+
+\begin{center}
+\begin{tikzpicture}[scale=.65]
+  \begin{scope}
+    \draw (0-6, 0) grid (4-6, 4);
+    \node at (-6+0.5,3.5) {$0$};
+    \node at (-6+1.5,3.5) {$1$};
+    \node at (-6+2.5,3.5) {$2$};
+    \node at (-6+3.5,3.5) {$3$};
+    \node at (-6+0.5,2.5) {$0$};
+    \node at (-6+1.5,2.5) {$1$};
+    \node at (-6+2.5,2.5) {$2$};
+    \node at (-6+3.5,2.5) {$3$};
+    \node at (-6+0.5,1.5) {$0$};
+    \node at (-6+1.5,1.5) {$1$};
+    \node at (-6+2.5,1.5) {$2$};
+    \node at (-6+3.5,1.5) {$3$};
+    \node at (-6+0.5,0.5) {$0$};
+    \node at (-6+1.5,0.5) {$1$};
+    \node at (-6+2.5,0.5) {$2$};
+    \node at (-6+3.5,0.5) {$3$};
+
+    \draw (0, 0) grid (4, 4);
+    \node at (0.5,3.5) {$0$};
+    \node at (1.5,3.5) {$1$};
+    \node at (2.5,3.5) {$2$};
+    \node at (3.5,3.5) {$3$};
+    \node at (0.5,2.5) {$1$};
+    \node at (1.5,2.5) {$2$};
+    \node at (2.5,2.5) {$3$};
+    \node at (3.5,2.5) {$4$};
+    \node at (0.5,1.5) {$2$};
+    \node at (1.5,1.5) {$3$};
+    \node at (2.5,1.5) {$4$};
+    \node at (3.5,1.5) {$5$};
+    \node at (0.5,0.5) {$3$};
+    \node at (1.5,0.5) {$4$};
+    \node at (2.5,0.5) {$5$};
+    \node at (3.5,0.5) {$6$};
+
+    \draw (6, 0) grid (10, 4);
+    \node at (6.5,3.5) {$3$};
+    \node at (7.5,3.5) {$4$};
+    \node at (8.5,3.5) {$5$};
+    \node at (9.5,3.5) {$6$};
+    \node at (6.5,2.5) {$2$};
+    \node at (7.5,2.5) {$3$};
+    \node at (8.5,2.5) {$4$};
+    \node at (9.5,2.5) {$5$};
+    \node at (6.5,1.5) {$1$};
+    \node at (7.5,1.5) {$2$};
+    \node at (8.5,1.5) {$3$};
+    \node at (9.5,1.5) {$4$};
+    \node at (6.5,0.5) {$0$};
+    \node at (7.5,0.5) {$1$};
+    \node at (8.5,0.5) {$2$};
+    \node at (9.5,0.5) {$3$};
+
+    \node at (-4,-1) {\texttt{column}};
+    \node at (2,-1) {\texttt{diag1}};
+    \node at (8,-1) {\texttt{diag2}};
+
+  \end{scope}
+\end{tikzpicture}
+\end{center}
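+
+A complete version of the search could be set up as in the following sketch.
+The wrapper \texttt{count\_queens}, the counter name \texttt{solutions}
+and the use of vectors are assumptions made for illustration;
+the sketch also assumes $n \ge 1$.
+
+\begin{lstlisting}
+#include <vector>
+using std::vector;
+
+int n;                             // board size
+long long solutions;               // number of complete placements
+vector<int> column, diag1, diag2;  // 1 if the line already has a queen
+
+void search(int y) {
+    if (y == n) {
+        solutions++;
+        return;
+    }
+    for (int x = 0; x < n; x++) {
+        if (column[x] || diag1[x+y] || diag2[x-y+n-1]) continue;
+        column[x] = diag1[x+y] = diag2[x-y+n-1] = 1;
+        search(y+1);
+        column[x] = diag1[x+y] = diag2[x-y+n-1] = 0;
+    }
+}
+
+// Hypothetical wrapper: there are n columns and
+// 2n-1 diagonals in each direction.
+long long count_queens(int size) {
+    n = size;
+    solutions = 0;
+    column.assign(n, 0);
+    diag1.assign(2*n-1, 0);
+    diag2.assign(2*n-1, 0);
+    search(0);
+    return solutions;
+}
+\end{lstlisting}
+
+With this setup, \texttt{count\_queens(4)} gives 2,
+in agreement with the two solutions shown earlier.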
+
+Let $q(n)$ denote the number of ways
+to place $n$ queens on an $n \times n$ chessboard.
+The above backtracking
+algorithm tells us that, for example, $q(8)=92$.
+When $n$ increases, the search quickly becomes slow,
+because the number of solutions increases
+exponentially.
+For example, calculating $q(16)=14772512$
+using the above algorithm already takes about a minute
+on a modern computer\footnote{There is no known way to efficiently
+calculate larger values of $q(n)$. The current record is
+$q(27)=234907967154122528$, calculated in 2016 \cite{q27}.}.
+
+\section{Pruning the search}
+
+We can often optimize backtracking
+by pruning the search tree.
+The idea is to add ``intelligence'' to the algorithm
+so that it will notice as soon as possible
+if a partial solution cannot be extended
+to a complete solution.
+Such optimizations can have a tremendous
+effect on the efficiency of the search.
+
+Let us consider the problem
+of calculating the number of paths
+in an $n \times n$ grid from the upper-left corner
+to the lower-right corner such that the
+path visits each square exactly once.
+For example, in a $7 \times 7$ grid,
+there are 111712 such paths.
+One of the paths is as follows:
+
+\begin{center}
+\begin{tikzpicture}[scale=.55]
+  \begin{scope}
+    \draw (0, 0) grid (7, 7);
+    \draw[thick,->] (0.5,6.5) -- (0.5,4.5) -- (2.5,4.5) --
+          (2.5,3.5) -- (0.5,3.5) -- (0.5,0.5) --
+          (3.5,0.5) -- (3.5,1.5) -- (1.5,1.5) --
+          (1.5,2.5) -- (4.5,2.5) -- (4.5,0.5) --
+          (5.5,0.5) -- (5.5,3.5) -- (3.5,3.5) --
+          (3.5,5.5) -- (1.5,5.5) -- (1.5,6.5) --
+          (4.5,6.5) -- (4.5,4.5) -- (5.5,4.5) --
+          (5.5,6.5) -- (6.5,6.5) -- (6.5,0.5);
+  \end{scope}
+\end{tikzpicture}
+\end{center}
+
+We focus on the $7 \times 7$ case,
+because its level of difficulty is appropriate to our needs.
+We begin with a straightforward backtracking algorithm,
+and then optimize it step by step using observations
+of how the search can be pruned.
+After each optimization, we measure the running time
+of the algorithm and the number of recursive calls,
+so that we clearly see the effect of each
+optimization on the efficiency of the search.
+
+\subsubsection{Basic algorithm}
+
+The first version of the algorithm does not contain
+any optimizations. We simply use backtracking to generate
+all possible paths from the upper-left corner to
+the lower-right corner and count the number of such paths.
+
+\begin{itemize}
+\item
+running time: 483 seconds
+\item
+number of recursive calls: 76 billion
+\end{itemize}
+
+\subsubsection{Optimization 1}
+
+In any solution, we first move one step
+down or right.
+There are always two paths that 
+are symmetric
+about the diagonal of the grid
+after the first step.
+For example, the following paths are symmetric:
+
+\begin{center}
+\begin{tabular}{ccc}
+\begin{tikzpicture}[scale=.55]
+  \begin{scope}
+    \draw (0, 0) grid (7, 7);
+    \draw[thick,->] (0.5,6.5) -- (0.5,4.5) -- (2.5,4.5) --
+          (2.5,3.5) -- (0.5,3.5) -- (0.5,0.5) --
+          (3.5,0.5) -- (3.5,1.5) -- (1.5,1.5) --
+          (1.5,2.5) -- (4.5,2.5) -- (4.5,0.5) --
+          (5.5,0.5) -- (5.5,3.5) -- (3.5,3.5) --
+          (3.5,5.5) -- (1.5,5.5) -- (1.5,6.5) --
+          (4.5,6.5) -- (4.5,4.5) -- (5.5,4.5) --
+          (5.5,6.5) -- (6.5,6.5) -- (6.5,0.5);
+  \end{scope}
+\end{tikzpicture}
+& \hspace{20px}
+& 
+\begin{tikzpicture}[scale=.55]
+  \begin{scope}[yscale=1,xscale=-1,rotate=-90]
+    \draw (0, 0) grid (7, 7);
+    \draw[thick,->] (0.5,6.5) -- (0.5,4.5) -- (2.5,4.5) --
+          (2.5,3.5) -- (0.5,3.5) -- (0.5,0.5) --
+          (3.5,0.5) -- (3.5,1.5) -- (1.5,1.5) --
+          (1.5,2.5) -- (4.5,2.5) -- (4.5,0.5) --
+          (5.5,0.5) -- (5.5,3.5) -- (3.5,3.5) --
+          (3.5,5.5) -- (1.5,5.5) -- (1.5,6.5) --
+          (4.5,6.5) -- (4.5,4.5) -- (5.5,4.5) --
+          (5.5,6.5) -- (6.5,6.5) -- (6.5,0.5);
+  \end{scope}
+\end{tikzpicture}
+\end{tabular}
+\end{center}
+
+Hence, we can decide that we always first
+move one step down (or right),
+and finally multiply the number of solutions by two.
+
+\begin{itemize}
+\item
+running time: 244 seconds
+\item
+number of recursive calls: 38 billion
+\end{itemize}
+
+\subsubsection{Optimization 2}
+
+If the path reaches the lower-right square
+before it has visited all other squares of the grid,
+it is clear that
+it will not be possible to complete the solution.
+An example of this is the following path:
+
+\begin{center}
+\begin{tikzpicture}[scale=.55]
+  \begin{scope}
+    \draw (0, 0) grid (7, 7);
+    \draw[thick,->] (0.5,6.5) -- (0.5,4.5) -- (2.5,4.5) --
+          (2.5,3.5) -- (0.5,3.5) -- (0.5,0.5) --
+          (3.5,0.5) -- (3.5,1.5) -- (1.5,1.5) --
+          (1.5,2.5) -- (4.5,2.5) -- (4.5,0.5) --
+          (6.5,0.5);
+  \end{scope}
+\end{tikzpicture}
+\end{center}
+Using this observation, we can terminate the search
+immediately if we reach the lower-right square too early.
+\begin{itemize}
+\item
+running time: 119 seconds
+\item
+number of recursive calls: 20 billion
+\end{itemize}
+
+\subsubsection{Optimization 3}
+
+If the path touches a wall
+and can turn either left or right,
+the grid splits into two parts
+that contain unvisited squares.
+For example, in the following situation,
+the path can turn either left or right:
+
+\begin{center}
+\begin{tikzpicture}[scale=.55]
+  \begin{scope}
+    \draw (0, 0) grid (7, 7);
+    \draw[thick,->] (0.5,6.5) -- (0.5,4.5) -- (2.5,4.5) --
+          (2.5,3.5) -- (0.5,3.5) -- (0.5,0.5) --
+          (3.5,0.5) -- (3.5,1.5) -- (1.5,1.5) --
+          (1.5,2.5) -- (4.5,2.5) -- (4.5,0.5) --
+          (5.5,0.5) -- (5.5,6.5);
+  \end{scope}
+\end{tikzpicture}
+\end{center}
+In this case, we cannot visit all squares anymore,
+so we can terminate the search.
+This optimization is very useful:
+
+\begin{itemize}
+\item
+running time: 1.8 seconds
+\item
+number of recursive calls: 221 million
+\end{itemize}
+
+\subsubsection{Optimization 4}
+
+The idea of Optimization 3
+can be generalized:
+if the path cannot continue forward
+but can turn either left or right,
+the grid splits into two parts
+that both contain unvisited squares.
+For example, consider the following path:
+
+\begin{center}
+\begin{tikzpicture}[scale=.55]
+  \begin{scope}
+    \draw (0, 0) grid (7, 7);
+    \draw[thick,->] (0.5,6.5) -- (0.5,4.5) -- (2.5,4.5) --
+          (2.5,3.5) -- (0.5,3.5) -- (0.5,0.5) --
+          (3.5,0.5) -- (3.5,1.5) -- (1.5,1.5) --
+          (1.5,2.5) -- (4.5,2.5) -- (4.5,0.5) --
+          (5.5,0.5) -- (5.5,4.5) -- (3.5,4.5);
+  \end{scope}
+\end{tikzpicture}
+\end{center}
+It is clear that we cannot visit all squares anymore,
+so we can terminate the search.
+After this optimization, the search is
+very efficient:
+
+\begin{itemize}
+\item
+running time: 0.6 seconds
+\item
+number of recursive calls: 69 million
+\end{itemize}
+
+~\\
+Now is a good moment to stop optimizing
+the algorithm and see what we have achieved.
+The running time of the original algorithm
+was 483 seconds,
+and now after the optimizations,
+the running time is only 0.6 seconds.
+Thus, the algorithm became about 800 times
+faster after the optimizations.
+
+This is a usual phenomenon in backtracking,
+because the search tree is usually large
+and even simple observations can effectively
+prune the search.
+Especially useful are optimizations that
+occur during the first steps of the algorithm,
+i.e., at the top of the search tree.
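+
+The following sketch shows one way the search and two of the pruning
+rules might be written. It is not the program used for the measurements
+above: all names are assumptions, Optimization 1 is left out, and the
+split check is done by position, which also covers Optimization 3.
+
+\begin{lstlisting}
+#include <vector>
+using std::vector;
+
+int n;                         // side length of the grid
+bool prune;                    // apply the split check?
+long long calls;               // number of recursive calls made
+vector<vector<bool>> visited;
+
+// A square is free if it is inside the grid and not yet visited.
+bool is_free(int y, int x) {
+    return y >= 0 && y < n && x >= 0 && x < n && !visited[y][x];
+}
+
+// Counts the ways to complete a path that stands on (y,x) and has
+// visited `steps` squares, including (y,x) itself.
+long long search(int y, int x, int steps) {
+    calls++;
+    if (y == n-1 && x == n-1) {
+        // Optimization 2: reaching the corner early is a dead end.
+        return steps == n*n ? 1 : 0;
+    }
+    bool up = is_free(y-1, x), down = is_free(y+1, x);
+    bool left = is_free(y, x-1), right = is_free(y, x+1);
+    if (prune) {
+        // Optimization 4: blocked on two opposite sides but free on
+        // the other two, so the unvisited squares are split.
+        if (!up && !down && left && right) return 0;
+        if (!left && !right && up && down) return 0;
+    }
+    visited[y][x] = true;
+    long long total = 0;
+    if (up) total += search(y-1, x, steps+1);
+    if (down) total += search(y+1, x, steps+1);
+    if (left) total += search(y, x-1, steps+1);
+    if (right) total += search(y, x+1, steps+1);
+    visited[y][x] = false;
+    return total;
+}
+
+// Hypothetical wrapper: counts the paths in a size x size grid.
+long long count_paths(int size, bool use_pruning) {
+    n = size;
+    prune = use_pruning;
+    calls = 0;
+    visited.assign(n, vector<bool>(n, false));
+    return search(0, 0, 1);
+}
+\end{lstlisting}
+
+Running the function with and without pruning on a small grid and
+comparing the value of \texttt{calls} gives a feeling for how much
+of the search tree the check removes.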
+
+\section{Meet in the middle}
+
+\index{meet in the middle}
+
+\key{Meet in the middle} is a technique
+where the search space is divided into
+two parts of about equal size.
+A separate search is performed
+for both of the parts,
+and finally the results of the searches are combined.
+
+The technique can be used
+if there is an efficient way to combine the
+results of the searches.
+In such a situation, the two searches may require less
+time than one large search.
+Typically, we can turn a factor of $2^n$
+into a factor of $2^{n/2}$ using the meet in the
+middle technique.
+
+As an example, consider a problem where
+we are given a list of $n$ numbers and
+a number $x$,
+and we want to find out if it is possible
+to choose some numbers from the list so that
+their sum is $x$.
+For example, given the list $[2,4,5,9]$ and $x=15$,
+we can choose the numbers $[2,4,9]$ to get $2+4+9=15$.
+However, if $x=10$ for the same list,
+it is not possible to form the sum.
+
+A simple algorithm for the problem is to
+go through all subsets of the elements and
+check if the sum of any of the subsets is $x$.
+The running time of such an algorithm is $O(2^n)$,
+because there are $2^n$ subsets.
+However, using the meet in the middle technique,
+we can achieve a more efficient $O(2^{n/2})$ time algorithm\footnote{This
+idea was introduced in 1974 by E. Horowitz and S. Sahni \cite{hor74}.}.
+Note that $O(2^n)$ and $O(2^{n/2})$ are different
+complexities because $2^{n/2}$ equals $\sqrt{2^n}$.
+
+The idea is to divide the list into
+two lists $A$ and $B$ such that both
+lists contain about half of the numbers.
+The first search generates all subsets
+of $A$ and stores their sums to a list $S_A$.
+Correspondingly, the second search creates
+a list $S_B$ from $B$.
+After this, it suffices to check if it is possible
+to choose one element from $S_A$ and another
+element from $S_B$ such that their sum is $x$.
+This is possible exactly when there is a way to
+form the sum $x$ using the numbers of the original list.
+
+For example, suppose that the list is $[2,4,5,9]$ and $x=15$.
+First, we divide the list into $A=[2,4]$ and $B=[5,9]$.
+After this, we create lists
+$S_A=[0,2,4,6]$ and $S_B=[0,5,9,14]$.
+In this case, the sum $x=15$ is possible to form,
+because $S_A$ contains the sum $6$,
+$S_B$ contains the sum $9$, and $6+9=15$.
+This corresponds to the solution $[2,4,9]$.
+
+We can implement the algorithm so that
+its time complexity is $O(2^{n/2})$.
+First, we generate \emph{sorted} lists $S_A$ and $S_B$,
+which can be done in $O(2^{n/2})$ time using a merge-like technique.
+After this, since the lists are sorted,
+we can check in $O(2^{n/2})$ time if
+the sum $x$ can be created from $S_A$ and $S_B$.
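+
+One possible reading of the merge-like technique is sketched below:
+when a new number $a$ is added, the sorted list of sums is merged
+with a copy of itself shifted by $a$.
+The function names are assumptions made for illustration.
+
+\begin{lstlisting}
+#include <algorithm>
+#include <cstddef>
+#include <utility>
+#include <vector>
+using std::vector;
+
+// Returns all subset sums of v in sorted order.
+vector<long long> sorted_subset_sums(const vector<long long>& v) {
+    vector<long long> sums = {0};          // sum of the empty subset
+    for (long long a : v) {
+        vector<long long> shifted(sums);   // sums that also use a
+        for (long long& s : shifted) s += a;
+        vector<long long> merged(2 * sums.size());
+        std::merge(sums.begin(), sums.end(),
+                   shifted.begin(), shifted.end(), merged.begin());
+        sums = std::move(merged);
+    }
+    return sums;
+}
+
+// Checks whether some subset of nums has sum x.
+bool subset_sum_possible(const vector<long long>& nums, long long x) {
+    std::size_t half = nums.size() / 2;
+    vector<long long> a(nums.begin(), nums.begin() + half);
+    vector<long long> b(nums.begin() + half, nums.end());
+    vector<long long> sa = sorted_subset_sums(a);
+    vector<long long> sb = sorted_subset_sums(b);
+    // Scan sa upwards and sb downwards at the same time.
+    std::size_t i = 0, j = sb.size();
+    while (i < sa.size() && j > 0) {
+        long long s = sa[i] + sb[j-1];
+        if (s == x) return true;
+        if (s < x) i++;    // the sum is too small: take a larger sum from sa
+        else j--;          // the sum is too large: take a smaller sum from sb
+    }
+    return false;
+}
+\end{lstlisting}
+
+For the list $[2,4,5,9]$, the function reports that the sum 15
+can be formed and that the sum 10 cannot.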
--- a/chapter06.tex
+++ b/chapter06.tex
@ -0,0 +1,680 @@
+\chapter{Greedy algorithms}
+
+\index{greedy algorithm}
+
+A \key{greedy algorithm}
+constructs a solution to the problem
+by always making a choice that looks
+the best at the moment.
+A greedy algorithm never takes back
+its choices, but directly constructs
+the final solution.
+For this reason, greedy algorithms
+are usually very efficient.
+
+The difficulty in designing greedy algorithms
+is to find a greedy strategy
+that always produces an optimal solution
+to the problem.
+The locally optimal choices in a greedy
+algorithm should also be globally optimal.
+It is often difficult to argue that
+a greedy algorithm works.
+
+\section{Coin problem}
+
+As a first example, we consider a problem
+where we are given a set of coins
+and our task is to form a sum of money $n$
+using the coins.
+The values of the coins are
+$\texttt{coins}=\{c_1,c_2,\ldots,c_k\}$,
+and each coin can be used as many times as we want.
+What is the minimum number of coins needed?
+
+For example, if the coins are the euro coins (in cents)
+\[\{1,2,5,10,20,50,100,200\}\]
+and $n=520$,
+we need at least four coins.
+The optimal solution is to select coins
+$200+200+100+20$ whose sum is 520.
+
+\subsubsection{Greedy algorithm}
+
+A simple greedy algorithm for the problem
+always selects the largest possible coin,
+until the required sum of money has been constructed.
+This algorithm works in the example case,
+because we first select two 200 cent coins,
+then one 100 cent coin and finally one 20 cent coin.
+But does this algorithm always work?
+
+It turns out that if the coins are the euro coins,
+the greedy algorithm \emph{always} works, i.e.,
+it always produces a solution with the fewest
+possible number of coins.
+The correctness of the algorithm can be
+shown as follows:
+
+First, each coin 1, 5, 10, 50 and 100 appears
+at most once in an optimal solution,
+because if the
+solution contained two such coins,
+we could replace them by one coin and
+obtain a better solution.
+For example, if the solution contained
+coins $5+5$, we could replace them by coin $10$.
+
+In the same way, coins 2 and 20 appear
+at most twice in an optimal solution,
+because we could replace
+coins $2+2+2$ by coins $5+1$ and
+coins $20+20+20$ by coins $50+10$.
+Moreover, an optimal solution cannot contain
+coins $2+2+1$ or $20+20+10$,
+because we could replace them by coins $5$ and $50$.
+
+Using these observations,
+we can show for each coin $x$ that
+it is not possible to optimally construct
+a sum $x$ or any larger sum by only using coins
+that are smaller than $x$.
+For example, if $x=100$, the largest optimal
+sum using the smaller coins is $50+20+20+5+2+2=99$.
+Thus, the greedy algorithm that always selects
+the largest coin produces the optimal solution.
+
+This example shows that it can be difficult
+to argue that a greedy algorithm works,
+even if the algorithm itself is simple.
+
+\subsubsection{General case}
+
+In the general case, the coin set can contain any coins
+and the greedy algorithm \emph{does not} necessarily produce
+an optimal solution.
+
+We can prove that a greedy algorithm does not work
+by showing a counterexample
+where the algorithm gives a wrong answer.
+In this problem we can easily find a counterexample:
+if the coins are $\{1,3,4\}$ and the target sum
+is 6, the greedy algorithm produces the solution
+$4+1+1$ while the optimal solution is $3+3$.
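+
+The greedy strategy itself is short to write down.
+The sketch below uses the hypothetical name \texttt{greedy\_coins},
+assumes that all coin values are positive, and returns $-1$
+if the greedy choices cannot reach the target sum.
+
+\begin{lstlisting}
+#include <algorithm>
+#include <vector>
+using std::vector;
+
+// Returns the number of coins the greedy algorithm uses to form n,
+// or -1 if it fails to form the sum exactly.
+int greedy_coins(vector<int> coins, int n) {
+    std::sort(coins.rbegin(), coins.rend());  // largest coin first
+    int used = 0;
+    for (int c : coins) {
+        used += n / c;   // take coin c as many times as it fits
+        n %= c;
+    }
+    return n == 0 ? used : -1;
+}
+\end{lstlisting}
+
+With the euro coins and $n=520$ the function returns 4,
+whereas for the coins $\{1,3,4\}$ and $n=6$ it returns 3,
+one more than the optimal solution $3+3$.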
+
+It is not known if the general coin problem
+can be solved using any greedy algorithm\footnote{However, it is possible
+to \emph{check} in polynomial time
+if the greedy algorithm presented in this chapter works for
+a given set of coins \cite{pea05}.}.
+However, as we will see in Chapter 7,
+in some cases,
+the general problem can be efficiently
+solved using a dynamic
+programming algorithm that always gives the
+correct answer.
+
+\section{Scheduling}
+
+Many scheduling problems can be solved
+using greedy algorithms.
+A classic problem is as follows:
+Given $n$ events with their starting and ending
+times, find a schedule
+that includes as many events as possible.
+It is not possible to select an event partially.
+For example, consider the following events:
+\begin{center}
+\begin{tabular}{lll}
+event & starting time & ending time \\
+\hline
+$A$ & 1 & 3 \\
+$B$ & 2 & 5 \\
+$C$ & 3 & 9 \\
+$D$ & 6 & 8 \\
+\end{tabular}
+\end{center}
+In this case the maximum number of events is two.
+For example, we can select events $B$ and $D$
+as follows:
+\begin{center}
+\begin{tikzpicture}[scale=.4]
+  \begin{scope}
+    \draw (2, 0) rectangle (6, -1);
+    \draw[fill=lightgray] (4, -1.5) rectangle (10, -2.5);
+    \draw (6, -3) rectangle (18, -4);
+    \draw[fill=lightgray] (12, -4.5) rectangle (16, -5.5);
+    \node at (2.5,-0.5) {$A$};
+    \node at (4.5,-2) {$B$};
+    \node at (6.5,-3.5) {$C$};
+    \node at (12.5,-5) {$D$};
+  \end{scope}
+\end{tikzpicture}
+\end{center}
+
+It is possible to invent several greedy algorithms
+for the problem, but which of them works in every case?
+
+\subsubsection*{Algorithm 1}
+
+The first idea is to select as \emph{short}
+events as possible.
+In the example case this algorithm
+selects the following events:
+\begin{center}
+\begin{tikzpicture}[scale=.4]
+  \begin{scope}
+    \draw[fill=lightgray] (2, 0) rectangle (6, -1);
+    \draw (4, -1.5) rectangle (10, -2.5);
+    \draw (6, -3) rectangle (18, -4);
+    \draw[fill=lightgray] (12, -4.5) rectangle (16, -5.5);
+    \node at (2.5,-0.5) {$A$};
+    \node at (4.5,-2) {$B$};
+    \node at (6.5,-3.5) {$C$};
+    \node at (12.5,-5) {$D$};
+  \end{scope}
+\end{tikzpicture}
+\end{center}
+
+However, selecting short events is not always
+a correct strategy. For example, the algorithm fails
+in the following case:
+\begin{center}
+\begin{tikzpicture}[scale=.4]
+  \begin{scope}
+    \draw (1, 0) rectangle (7, -1);
+    \draw[fill=lightgray] (6, -1.5) rectangle (9, -2.5);
+    \draw (8, -3) rectangle (14, -4);
+  \end{scope}
+\end{tikzpicture}
+\end{center}
+If we select the short event, we can only select one event.
+However, it would be possible to select both long events.
+
+\subsubsection*{Algorithm 2}
+
+Another idea is to always select the next possible
+event that \emph{begins} as \emph{early} as possible.
+This algorithm selects the following events:
+\begin{center}
+\begin{tikzpicture}[scale=.4]
+  \begin{scope}
+    \draw[fill=lightgray] (2, 0) rectangle (6, -1);
+    \draw (4, -1.5) rectangle (10, -2.5);
+    \draw[fill=lightgray] (6, -3) rectangle (18, -4);
+    \draw (12, -4.5) rectangle (16, -5.5);
+    \node at (2.5,-0.5) {$A$};
+    \node at (4.5,-2) {$B$};
+    \node at (6.5,-3.5) {$C$};
+    \node at (12.5,-5) {$D$};
+  \end{scope}
+\end{tikzpicture}
+\end{center}
+
+However, we can also find a counterexample
+for this algorithm.
+For example, in the following case,
+the algorithm only selects one event:
+\begin{center}
+\begin{tikzpicture}[scale=.4]
+  \begin{scope}
+    \draw[fill=lightgray] (1, 0) rectangle (14, -1);
+    \draw (3, -1.5) rectangle (7, -2.5);
+    \draw (8, -3) rectangle (12, -4);
+  \end{scope}
+\end{tikzpicture}
+\end{center}
+If we select the first event, it is not possible
+to select any other events.
+However, it would be possible to select the
+other two events.
+
+\subsubsection*{Algorithm 3}
+
+The third idea is to always select the next
+possible event that \emph{ends} as \emph{early} as possible.
+This algorithm selects the following events: 
+\begin{center}
+\begin{tikzpicture}[scale=.4]
+  \begin{scope}
+    \draw[fill=lightgray] (2, 0) rectangle (6, -1);
+    \draw (4, -1.5) rectangle (10, -2.5);
+    \draw (6, -3) rectangle (18, -4);
+    \draw[fill=lightgray] (12, -4.5) rectangle (16, -5.5);
+    \node at (2.5,-0.5) {$A$};
+    \node at (4.5,-2) {$B$};
+    \node at (6.5,-3.5) {$C$};
+    \node at (12.5,-5) {$D$};
+  \end{scope}
+\end{tikzpicture}
+\end{center}
+
+It turns out that this algorithm
+\emph{always} produces an optimal solution.
+The reason for this is that it is always an optimal choice
+to first select an event that ends
+as early as possible.
+After this, it is an optimal choice
+to select the next event
+using the same strategy, etc.,
+until we cannot select any more events.
+
+One way to argue that the algorithm works
+is to consider
+what happens if we first select an event
+that ends later than the event that ends
+as early as possible.
+Now, we will have at most an equal number of
+choices for selecting the next event.
+Hence, selecting an event that ends later
+can never yield a better solution,
+and the greedy algorithm is correct.
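+
+As a minimal sketch of Algorithm 3, the following C++ function
+sorts the events by their ending times and then selects every event
+that begins no earlier than the end of the previously selected event.
+The function name \texttt{maxEvents} and the representation of an event
+as a (start, end) pair are assumptions made for this illustration,
+as is the convention, read off the figures above,
+that an event may begin at the moment the previous one ends.
+
```cpp
#include <algorithm>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical helper: each event is a (start, end) pair.
// Returns the maximum number of events that can be selected
// so that no two selected events overlap.
int maxEvents(std::vector<std::pair<int, int>> events) {
    // Sort by ending time, so the event that ends earliest comes first.
    std::sort(events.begin(), events.end(),
              [](const std::pair<int, int>& a, const std::pair<int, int>& b) {
                  return a.second < b.second;
              });
    int count = 0;
    int lastEnd = std::numeric_limits<int>::min();  // nothing selected yet
    for (const auto& event : events) {
        // Assumption: an event may begin exactly when the previous one ends.
        if (event.first >= lastEnd) {
            count++;
            lastEnd = event.second;
        }
    }
    return count;
}
```
+
+With the figure's coordinates $(2,6)$, $(4,10)$, $(6,18)$ and $(12,16)$
+for the events $A$, $B$, $C$ and $D$, the function returns 2,
+which corresponds to selecting $A$ and $D$.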
+
+\section{Tasks and deadlines}
+
+Let us now consider a problem where
+we are given $n$ tasks with durations and deadlines
+and our task is to choose an order to perform the tasks.
+For each task, we earn $d-x$ points
+where $d$ is the task's deadline
+and $x$ is the moment when we finish the task.
+What is the largest possible total score
+we can obtain?
+
+For example, suppose that the tasks are as follows:
+\begin{center}
+\begin{tabular}{lll}
+task & duration & deadline \\
+\hline
+$A$ & 4 & 2 \\
+$B$ & 3 & 5 \\
+$C$ & 2 & 7 \\
+$D$ & 4 & 5 \\
+\end{tabular}
+\end{center}
+In this case, an optimal schedule for the tasks
+is as follows:
+\begin{center}
+\begin{tikzpicture}[scale=.4]
+  \begin{scope}
+    \draw (0, 0) rectangle (4, -1);
+    \draw (4, 0) rectangle (10, -1);
+    \draw (10, 0) rectangle (18, -1);
+    \draw (18, 0) rectangle (26, -1);
+    \node at (0.5,-0.5) {$C$};
+    \node at (4.5,-0.5) {$B$};
+    \node at (10.5,-0.5) {$A$};
+    \node at (18.5,-0.5) {$D$};
+
+    \draw (0,1.5) -- (26,1.5);
+    \foreach \i in {0,2,...,26}
+    {
+        \draw (\i,1.25) -- (\i,1.75);
+    }
+    \footnotesize
+    \node at (0,2.5) {0};
+    \node at (10,2.5) {5};
+    \node at (20,2.5) {10};
+
+  \end{scope}
+\end{tikzpicture}
+\end{center}
+In this solution, $C$ yields 5 points,
+$B$ yields 0 points, $A$ yields $-7$ points
+and $D$ yields $-8$ points,
+so the total score is $-10$.
+
+Surprisingly, the optimal solution to the problem
+does not depend on the deadlines at all,
+but a correct greedy strategy is to simply
+perform the tasks \emph{sorted by their durations}
+in increasing order.
+The reason for this is that if we ever perform
+two tasks one after another such that the first task
+takes longer than the second task,
+we can obtain a better solution if we swap the tasks.
+For example, consider the following schedule:
+\begin{center}
+\begin{tikzpicture}[scale=.4]
+  \begin{scope}
+    \draw (0, 0) rectangle (8, -1);
+    \draw (8, 0) rectangle (12, -1);
+    \node at (0.5,-0.5) {$X$};
+    \node at (8.5,-0.5) {$Y$};
+
+\draw [decoration={brace}, decorate, line width=0.3mm] (7.75,-1.5) -- (0.25,-1.5);
+\draw [decoration={brace}, decorate, line width=0.3mm] (11.75,-1.5) -- (8.25,-1.5);
+
+\footnotesize
+\node at (4,-2.5) {$a$};
+\node at (10,-2.5) {$b$};
+
+  \end{scope}
+\end{tikzpicture}
+\end{center}
+Here $a>b$, so we should swap the tasks:
+\begin{center}
+\begin{tikzpicture}[scale=.4]
+  \begin{scope}
+    \draw (0, 0) rectangle (4, -1);
+    \draw (4, 0) rectangle (12, -1);
+    \node at (0.5,-0.5) {$Y$};
+    \node at (4.5,-0.5) {$X$};
+
+\draw [decoration={brace}, decorate, line width=0.3mm] (3.75,-1.5) -- (0.25,-1.5);
+\draw [decoration={brace}, decorate, line width=0.3mm] (11.75,-1.5) -- (4.25,-1.5);
+
+\footnotesize
+\node at (2,-2.5) {$b$};
+\node at (8,-2.5) {$a$};
+
+  \end{scope}
+\end{tikzpicture}
+\end{center}
+Now $X$ gives $b$ points less and $Y$ gives $a$ points more,
+so the total score increases by $a-b > 0$.
+In an optimal solution,
+for any two consecutive tasks,
+it must hold that the shorter task comes
+before the longer task.
+Thus, the tasks must be performed
+sorted by their durations.
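+
+The strategy can be sketched in C++ as follows.
+The function name \texttt{bestTotalScore} and the representation of a task
+as a (duration, deadline) pair are assumptions of this example.
+Sorting the pairs orders the tasks by duration,
+and ties between equal durations do not affect the total score.
+
```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical helper: each task is a (duration, deadline) pair.
// Returns the largest total score, where a task finished at time x
// with deadline d earns d - x points.
long long bestTotalScore(std::vector<std::pair<int, int>> tasks) {
    // Pairs compare by their first member first, i.e., by duration.
    std::sort(tasks.begin(), tasks.end());
    long long time = 0;   // moment when the current task finishes
    long long score = 0;
    for (const auto& task : tasks) {
        time += task.first;
        score += task.second - time;
    }
    return score;
}
```
+
+For the four example tasks above, the function processes them in the order
+$C$, $B$, $A$, $D$ and returns $-10$.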
+
+\section{Minimizing sums}
+
+We next consider a problem where
+we are given $n$ numbers $a_1,a_2,\ldots,a_n$
+and our task is to find a value $x$
+that minimizes the sum
+\[|a_1-x|^c+|a_2-x|^c+\cdots+|a_n-x|^c.\]
+We focus on the cases $c=1$ and $c=2$.
+
+\subsubsection{Case $c=1$}
+
+In this case, we should minimize the sum
+\[|a_1-x|+|a_2-x|+\cdots+|a_n-x|.\]
+For example, if the numbers are $[1,2,9,2,6]$,
+the best solution is to select $x=2$
+which produces the sum
+\[
+|1-2|+|2-2|+|9-2|+|2-2|+|6-2|=12.
+\]
+In the general case, the best choice for $x$
+is the \textit{median} of the numbers,
+i.e., the middle number after sorting.
+For example, the list $[1,2,9,2,6]$
+becomes $[1,2,2,6,9]$ after sorting,
+so the median is 2.
+
+The median is an optimal choice,
+because if $x$ is smaller than the median,
+the sum becomes smaller by increasing $x$,
+and if $x$ is larger than the median,
+the sum becomes smaller by decreasing $x$.
+Hence, the optimal solution is that $x$
+is the median.
+If $n$ is even and there are two medians,
+both medians and all values between them
+are optimal choices.
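+
+As an illustration, the following sketch computes the minimum sum
+for the case $c=1$ by picking a median as $x$.
+The function name \texttt{minAbsoluteSum} is hypothetical,
+and the sketch assumes that the input contains at least one number.
+It uses \texttt{std::nth\_element} to find a median without fully sorting.
+
```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

// Hypothetical helper: returns the minimum of |a_1-x| + ... + |a_n-x|
// over all x. Assumes that a is non-empty.
long long minAbsoluteSum(std::vector<long long> a) {
    std::size_t mid = a.size() / 2;
    // After this call, a[mid] is the value it would have in sorted order.
    std::nth_element(a.begin(), a.begin() + mid, a.end());
    long long x = a[mid];  // a median (the upper one if n is even)
    long long sum = 0;
    for (long long value : a) {
        sum += std::llabs(value - x);
    }
    return sum;
}
```
+
+For the numbers $[1,2,9,2,6]$ the function selects $x=2$ and returns 12.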
+
+\subsubsection{Case $c=2$}
+
+In this case, we should minimize the sum
+\[(a_1-x)^2+(a_2-x)^2+\cdots+(a_n-x)^2.\]
+For example, if the numbers are $[1,2,9,2,6]$,
+the best solution is to select $x=4$
+which produces the sum
+\[
+(1-4)^2+(2-4)^2+(9-4)^2+(2-4)^2+(6-4)^2=46.
+\]
+In the general case, the best choice for $x$
+is the \emph{average} of the numbers.
+In the example the average is $(1+2+9+2+6)/5=4$.
+This result can be derived by presenting
+the sum as follows:
+\[
+nx^2 - 2x(a_1+a_2+\cdots+a_n) + (a_1^2+a_2^2+\cdots+a_n^2)
+\]
+The last part does not depend on $x$,
+so we can ignore it.
+The remaining parts form a function
+$nx^2-2xs$ where $s=a_1+a_2+\cdots+a_n$.
+This is a parabola opening upwards
+with roots $x=0$ and $x=2s/n$,
+and the minimum value is the average
+of the roots $x=s/n$, i.e.,
+the average of the numbers $a_1,a_2,\ldots,a_n$.
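+
+The case $c=2$ can be checked numerically with a short sketch.
+The function names \texttt{bestSquareCenter} and \texttt{squareSum}
+are hypothetical, and the sketch assumes that $x$ may be any real number
+and that the input is non-empty.
+
```cpp
#include <numeric>
#include <vector>

// Hypothetical helper: the value x that minimizes
// (a_1-x)^2 + ... + (a_n-x)^2, i.e., the average. Assumes a is non-empty.
double bestSquareCenter(const std::vector<double>& a) {
    double s = std::accumulate(a.begin(), a.end(), 0.0);
    return s / a.size();
}

// Evaluates the sum of squared differences for a given x.
double squareSum(const std::vector<double>& a, double x) {
    double total = 0;
    for (double value : a) {
        total += (value - x) * (value - x);
    }
    return total;
}
```
+
+For the numbers $[1,2,9,2,6]$ the first function returns 4,
+and the second function evaluated at $x=4$ gives 46.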
+
+\section{Data compression}
+
+\index{data compression}
+\index{binary code}
+\index{codeword}
+
+A \key{binary code} assigns to each character
+of a string a \key{codeword} that consists of bits.
+We can \emph{compress} the string using the binary code
+by replacing each character by the
+corresponding codeword.
+For example, the following binary code
+assigns codewords to characters
+\texttt{A}–\texttt{D}:
+\begin{center}
+\begin{tabular}{rr}
+character & codeword \\
+\hline
+\texttt{A} & 00 \\
+\texttt{B} & 01 \\
+\texttt{C} & 10 \\
+\texttt{D} & 11 \\
+\end{tabular}
+\end{center}
+This is a \key{constant-length} code
+which means that the length of each
+codeword is the same.
+For example, we can compress the string
+\texttt{AABACDACA} as follows:
+\[00\,00\,01\,00\,10\,11\,00\,10\,00\]
+Using this code, the length of the compressed
+string is 18 bits.
+However, we can compress the string better
+if we use a \key{variable-length} code
+where codewords may have different lengths.
+Then we can give short codewords for
+characters that appear often
+and long codewords for characters
+that appear rarely.
+It turns out that an \key{optimal} code
+for the above string is as follows:
+\begin{center}
+\begin{tabular}{rr}
+character & codeword \\
+\hline
+\texttt{A} & 0 \\
+\texttt{B} & 110 \\
+\texttt{C} & 10 \\
+\texttt{D} & 111 \\
+\end{tabular}
+\end{center}
+An optimal code produces a compressed string
+that is as short as possible.
+In this case, the compressed string using
+the optimal code is
+\[0\,0\,110\,0\,10\,111\,0\,10\,0,\]
+so only 15 bits are needed instead of 18 bits.
+Thus, thanks to a better code it was possible to
+save 3 bits in the compressed string.
+
+We require that no codeword
+is a prefix of another codeword.
+For example, it is not allowed that a code
+would contain both codewords 10
+and 1011.
+The reason for this is that we want
+to be able to generate the original string
+from the compressed string.
+If a codeword could be a prefix of another codeword,
+this would not always be possible.
+For example, the following code is \emph{not} valid:
+\begin{center}
+\begin{tabular}{rr}
+character & codeword \\
+\hline
+\texttt{A} & 10 \\
+\texttt{B} & 11 \\
+\texttt{C} & 1011 \\
+\texttt{D} & 111 \\
+\end{tabular}
+\end{center}
+Using this code, it would not be possible to know
+if the compressed string 1011 corresponds to
+the string \texttt{AB} or the string \texttt{C}.
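+
+To illustrate why the prefix requirement makes decoding possible,
+the following sketch reads the compressed string bit by bit
+and outputs a character as soon as the bits read so far form a codeword.
+The function name \texttt{decode} and the use of a map from codewords
+to characters are assumptions of this example.
+The sketch silently ignores any leftover bits at the end of the input.
+
```cpp
#include <map>
#include <string>

// Hypothetical helper: decodes a bit string using a code given as a map
// from codeword to character. The result is only guaranteed to be the
// original string when no codeword is a prefix of another codeword.
std::string decode(const std::string& bits,
                   const std::map<std::string, char>& code) {
    std::string result;
    std::string current;  // bits read since the last complete codeword
    for (char bit : bits) {
        current += bit;
        auto it = code.find(current);
        if (it != code.end()) {
            result += it->second;
            current.clear();
        }
    }
    return result;
}
```
+
+With the optimal code above, the 15-bit string \texttt{001100101110100}
+decodes to \texttt{AABACDACA}.
+With the invalid code, the input \texttt{1011} decodes to \texttt{AB},
+so the character \texttt{C} can never be recovered.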
+
+\index{Huffman coding}
+
+\subsubsection{Huffman coding}
+
+\key{Huffman coding}\footnote{D. A. Huffman discovered this method
+when solving a university course assignment
+and published the algorithm in 1952 \cite{huf52}.} is a greedy algorithm
+that constructs an optimal code for
+compressing a given string.
+The algorithm builds a binary tree
+based on the frequencies of the characters
+in the string,
+and each character's codeword can be read
+by following a path from the root to
+the corresponding node.
+A move to the left corresponds to bit 0,
+and a move to the right corresponds to bit 1.
+
+Initially, each character of the string is
+represented by a node whose weight is the
+number of times the character occurs in the string.
+Then at each step two nodes with minimum weights
+are combined by creating
+a new node whose weight is the sum of the weights
+of the original nodes.
+The process continues until all nodes have been combined.
+
+Next we will see how Huffman coding creates
+the optimal code for the string
+\texttt{AABACDACA}.
+Initially, there are four nodes that correspond
+to the characters of the string:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (0,0) {$5$};
+\node[draw, circle] (2) at (2,0) {$1$};
+\node[draw, circle] (3) at (4,0) {$2$};
+\node[draw, circle] (4) at (6,0) {$1$};
+
+\node[color=blue] at (0,-0.75) {\texttt{A}};
+\node[color=blue] at (2,-0.75) {\texttt{B}};
+\node[color=blue] at (4,-0.75) {\texttt{C}};
+\node[color=blue] at (6,-0.75) {\texttt{D}};
+
+%\path[draw,thick,-] (4) -- (5);
+\end{tikzpicture}
+\end{center}
+The node that represents character \texttt{A}
+has weight 5 because character \texttt{A}
+appears 5 times in the string.
+The other weights have been calculated
+in the same way.
+
+The first step is to combine the nodes that
+correspond to characters \texttt{B} and \texttt{D},
+both with weight 1.
+The result is:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (0,0) {$5$};
+\node[draw, circle] (3) at (2,0) {$2$};
+\node[draw, circle] (2) at (4,0) {$1$};
+\node[draw, circle] (4) at (6,0) {$1$};
+\node[draw, circle] (5) at (5,1) {$2$};
+
+\node[color=blue] at (0,-0.75) {\texttt{A}};
+\node[color=blue] at (2,-0.75) {\texttt{C}};
+\node[color=blue] at (4,-0.75) {\texttt{B}};
+\node[color=blue] at (6,-0.75) {\texttt{D}};
+
+\node at (4.3,0.7) {0};
+\node at (5.7,0.7) {1};
+
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (4) -- (5);
+\end{tikzpicture}
+\end{center}
+After this, the nodes with weight 2 are combined:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,0) {$5$};
+\node[draw, circle] (3) at (3,1) {$2$};
+\node[draw, circle] (2) at (4,0) {$1$};
+\node[draw, circle] (4) at (6,0) {$1$};
+\node[draw, circle] (5) at (5,1) {$2$};
+\node[draw, circle] (6) at (4,2) {$4$};
+
+\node[color=blue] at (1,-0.75) {\texttt{A}};
+\node[color=blue] at (3,1-0.75) {\texttt{C}};
+\node[color=blue] at (4,-0.75) {\texttt{B}};
+\node[color=blue] at (6,-0.75) {\texttt{D}};
+
+\node at (4.3,0.7) {0};
+\node at (5.7,0.7) {1};
+\node at (3.3,1.7) {0};
+\node at (4.7,1.7) {1};
+
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (4) -- (5);
+\path[draw,thick,-] (3) -- (6);
+\path[draw,thick,-] (5) -- (6);
+\end{tikzpicture}
+\end{center}
+Finally, the two remaining nodes are combined:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (2,2) {$5$};
+\node[draw, circle] (3) at (3,1) {$2$};
+\node[draw, circle] (2) at (4,0) {$1$};
+\node[draw, circle] (4) at (6,0) {$1$};
+\node[draw, circle] (5) at (5,1) {$2$};
+\node[draw, circle] (6) at (4,2) {$4$};
+\node[draw, circle] (7) at (3,3) {$9$};
+
+\node[color=blue] at (2,2-0.75) {\texttt{A}};
+\node[color=blue] at (3,1-0.75) {\texttt{C}};
+\node[color=blue] at (4,-0.75) {\texttt{B}};
+\node[color=blue] at (6,-0.75) {\texttt{D}};
+
+\node at (4.3,0.7) {0};
+\node at (5.7,0.7) {1};
+\node at (3.3,1.7) {0};
+\node at (4.7,1.7) {1};
+\node at (2.3,2.7) {0};
+\node at (3.7,2.7) {1};
+
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (4) -- (5);
+\path[draw,thick,-] (3) -- (6);
+\path[draw,thick,-] (5) -- (6);
+\path[draw,thick,-] (1) -- (7);
+\path[draw,thick,-] (6) -- (7);
+\end{tikzpicture}
+\end{center}
+
+Now all nodes are in the tree, so the code is ready.
+The following codewords can be read from the tree:
+\begin{center}
+\begin{tabular}{rr}
+character & codeword \\
+\hline
+\texttt{A} & 0 \\
+\texttt{B} & 110 \\
+\texttt{C} & 10 \\
+\texttt{D} & 111 \\
+\end{tabular}
+\end{center}
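+
+The combining process can be sketched with a priority queue.
+The following function does not build the tree or the codewords.
+It only computes the length of the compressed string,
+using the fact that each combination makes every character occurrence
+below the new node one bit longer.
+The function name \texttt{huffmanCompressedLength} is hypothetical,
+and the handling of a string with only one distinct character
+(one bit per character) is an assumption of this sketch.
+
```cpp
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <vector>

// Hypothetical helper: length in bits of s compressed with a Huffman code.
long long huffmanCompressedLength(const std::string& s) {
    std::map<char, long long> freq;
    for (char c : s) freq[c]++;
    // Assumption: a single distinct character still needs one bit each.
    if (freq.size() == 1) return static_cast<long long>(s.size());

    // Min-heap of node weights.
    std::priority_queue<long long, std::vector<long long>,
                        std::greater<long long>> nodes;
    for (const auto& entry : freq) nodes.push(entry.second);

    long long total = 0;
    while (nodes.size() > 1) {
        // Combine the two nodes with minimum weights.
        long long a = nodes.top(); nodes.pop();
        long long b = nodes.top(); nodes.pop();
        // All a + b occurrences below the new node gain one bit.
        total += a + b;
        nodes.push(a + b);
    }
    return total;
}
```
+
+For the string \texttt{AABACDACA} the combinations create nodes
+with weights 2, 4 and 9, so the function returns $2+4+9=15$,
+which matches the length obtained with the codewords above.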
--- a/chapter07.tex
+++ b/chapter07.tex
--- a/chapter08.tex
+++ b/chapter08.tex
@ -0,0 +1,732 @@
+\chapter{Amortized analysis}
+
+\index{amortized analysis}
+
+The time complexity of an algorithm
+is often easy to analyze
+just by examining the structure
+of the algorithm:
+what loops the algorithm contains
+and how many times the loops are performed.
+However, sometimes a straightforward analysis
+does not give a true picture of the efficiency of the algorithm.
+
+\key{Amortized analysis} can be used to analyze
+algorithms that contain operations whose
+time complexity varies.
+The idea is to estimate the total time used for
+all such operations during the
+execution of the algorithm, instead of focusing
+on individual operations.
+
+\section{Two pointers method}
+
+\index{two pointers method}
+
+In the \key{two pointers method},
+two pointers are used to
+iterate through the array values.
+Both pointers can move in one direction only,
+which ensures that the algorithm works efficiently.
+Next we discuss two problems that can be solved
+using the two pointers method.
+
+\subsubsection{Subarray sum}
+
+As the first example,
+consider a problem where we are
+given an array of $n$ positive integers
+and a target sum $x$,
+and we want to find a subarray whose sum is $x$
+or report that there is no such subarray.
+
+For example, the array
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$3$};
+\node at (2.5,0.5) {$2$};
+\node at (3.5,0.5) {$5$};
+\node at (4.5,0.5) {$1$};
+\node at (5.5,0.5) {$1$};
+\node at (6.5,0.5) {$2$};
+\node at (7.5,0.5) {$3$};
+\end{tikzpicture}
+\end{center}
+contains a subarray whose sum is 8:
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\fill[color=lightgray] (2,0) rectangle (5,1);
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$3$};
+\node at (2.5,0.5) {$2$};
+\node at (3.5,0.5) {$5$};
+\node at (4.5,0.5) {$1$};
+\node at (5.5,0.5) {$1$};
+\node at (6.5,0.5) {$2$};
+\node at (7.5,0.5) {$3$};
+\end{tikzpicture}
+\end{center}
+
+This problem can be solved in
+$O(n)$ time by using the two pointers method.
+The idea is to maintain pointers that point to the
+first and last value of a subarray.
+On each turn, the left pointer moves one step
+to the right, and the right pointer moves to the right
+as long as the resulting subarray sum is at most $x$.
+If the sum becomes exactly $x$,
+a solution has been found.
+
+As an example, consider the following array
+and a target sum $x=8$:
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$3$};
+\node at (2.5,0.5) {$2$};
+\node at (3.5,0.5) {$5$};
+\node at (4.5,0.5) {$1$};
+\node at (5.5,0.5) {$1$};
+\node at (6.5,0.5) {$2$};
+\node at (7.5,0.5) {$3$};
+\end{tikzpicture}
+\end{center}
+
+The initial subarray contains the values
+1, 3 and 2 whose sum is 6:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\fill[color=lightgray] (0,0) rectangle (3,1);
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$3$};
+\node at (2.5,0.5) {$2$};
+\node at (3.5,0.5) {$5$};
+\node at (4.5,0.5) {$1$};
+\node at (5.5,0.5) {$1$};
+\node at (6.5,0.5) {$2$};
+\node at (7.5,0.5) {$3$};
+
+\draw[thick,->] (0.5,-0.7) -- (0.5,-0.1);
+\draw[thick,->] (2.5,-0.7) -- (2.5,-0.1);
+\end{tikzpicture}
+\end{center}
+
+Then, the left pointer moves one step to the right.
+The right pointer does not move, because otherwise
+the subarray sum would exceed $x$.
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\fill[color=lightgray] (1,0) rectangle (3,1);
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$3$};
+\node at (2.5,0.5) {$2$};
+\node at (3.5,0.5) {$5$};
+\node at (4.5,0.5) {$1$};
+\node at (5.5,0.5) {$1$};
+\node at (6.5,0.5) {$2$};
+\node at (7.5,0.5) {$3$};
+
+\draw[thick,->] (1.5,-0.7) -- (1.5,-0.1);
+\draw[thick,->] (2.5,-0.7) -- (2.5,-0.1);
+\end{tikzpicture}
+\end{center}
+
+Again, the left pointer moves one step to the right,
+and this time the right pointer moves two
+steps to the right.
+The subarray sum is $2+5+1=8$, so a subarray
+whose sum is $x$ has been found.
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\fill[color=lightgray] (2,0) rectangle (5,1);
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$3$};
+\node at (2.5,0.5) {$2$};
+\node at (3.5,0.5) {$5$};
+\node at (4.5,0.5) {$1$};
+\node at (5.5,0.5) {$1$};
+\node at (6.5,0.5) {$2$};
+\node at (7.5,0.5) {$3$};
+
+\draw[thick,->] (2.5,-0.7) -- (2.5,-0.1);
+\draw[thick,->] (4.5,-0.7) -- (4.5,-0.1);
+\end{tikzpicture}
+\end{center}
+
+The running time of the algorithm depends on
+the number of steps the right pointer moves.
+While there is no useful upper bound on how many steps the
+pointer can move on a \emph{single} turn,
+we know that the pointer moves \emph{a total of}
+$O(n)$ steps during the algorithm,
+because it only moves to the right.
+
+Since both the left and right pointer
+move $O(n)$ steps during the algorithm,
+the algorithm works in $O(n)$ time.
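+
+The method described above can be sketched in C++ as follows.
+The function name \texttt{findSubarray} and the convention of returning
+0-based inclusive positions, or $(-1,-1)$ if there is no solution,
+are assumptions of this example.
+The sketch relies on all array values being positive.
+
```cpp
#include <utility>
#include <vector>

// Hypothetical helper: returns 0-based inclusive positions (left, right)
// of a subarray whose sum is x, or (-1, -1) if no such subarray exists.
// Assumes that all values in a are positive.
std::pair<int, int> findSubarray(const std::vector<long long>& a, long long x) {
    int n = static_cast<int>(a.size());
    long long sum = 0;
    int right = 0;  // exclusive end: the subarray is a[left..right-1]
    for (int left = 0; left < n; left++) {
        // Move the right pointer as long as the sum is at most x.
        while (right < n && sum + a[right] <= x) {
            sum += a[right];
            right++;
        }
        if (right > left && sum == x) {
            return {left, right - 1};
        }
        if (right > left) {
            sum -= a[left];    // a[left] leaves the subarray
        } else {
            right = left + 1;  // empty subarray: keep right ahead of left
        }
    }
    return {-1, -1};
}
```
+
+For the example array and $x=8$, the function returns the positions
+$(2,4)$, i.e., the subarray $2+5+1$.
+Each pointer only moves to the right, so the loops perform
+$O(n)$ steps in total.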
+
+\subsubsection{2SUM problem}
+
+\index{2SUM problem}
+
+Another problem that can be solved using
+the two pointers method is the following problem,
+also known as the \key{2SUM problem}:
+given an array of $n$ numbers and
+a target sum $x$, find
+two array values such that their sum is $x$,
+or report that no such values exist.
+
+To solve the problem, we first
+sort the array values in increasing order.
+After that, we iterate through the array using
+two pointers.
+The left pointer starts at the first value
+and moves one step to the right on each turn.
+The right pointer begins at the last value
+and always moves to the left until the sum of the
+left and right value is at most $x$.
+If the sum is exactly $x$,
+a solution has been found.
+
+For example, consider the following array
+and a target sum $x=12$:
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$4$};
+\node at (2.5,0.5) {$5$};
+\node at (3.5,0.5) {$6$};
+\node at (4.5,0.5) {$7$};
+\node at (5.5,0.5) {$9$};
+\node at (6.5,0.5) {$9$};
+\node at (7.5,0.5) {$10$};
+\end{tikzpicture}
+\end{center}
+
+The initial positions of the pointers
+are as follows.
+The sum of the values is $1+10=11$,
+which is smaller than $x$.
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\fill[color=lightgray] (0,0) rectangle (1,1);
+\fill[color=lightgray] (7,0) rectangle (8,1);
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$4$};
+\node at (2.5,0.5) {$5$};
+\node at (3.5,0.5) {$6$};
+\node at (4.5,0.5) {$7$};
+\node at (5.5,0.5) {$9$};
+\node at (6.5,0.5) {$9$};
+\node at (7.5,0.5) {$10$};
+
+\draw[thick,->] (0.5,-0.7) -- (0.5,-0.1);
+\draw[thick,->] (7.5,-0.7) -- (7.5,-0.1);
+\end{tikzpicture}
+\end{center}
+
+Then the left pointer moves one step to the right.
+The right pointer moves three steps to the left,
+and the sum becomes $4+7=11$.
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\fill[color=lightgray] (1,0) rectangle (2,1);
+\fill[color=lightgray] (4,0) rectangle (5,1);
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$4$};
+\node at (2.5,0.5) {$5$};
+\node at (3.5,0.5) {$6$};
+\node at (4.5,0.5) {$7$};
+\node at (5.5,0.5) {$9$};
+\node at (6.5,0.5) {$9$};
+\node at (7.5,0.5) {$10$};
+
+\draw[thick,->] (1.5,-0.7) -- (1.5,-0.1);
+\draw[thick,->] (4.5,-0.7) -- (4.5,-0.1);
+\end{tikzpicture}
+\end{center}
+
+After this, the left pointer moves one step to the right again.
+The right pointer does not move, and a solution
+$5+7=12$ has been found.
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\fill[color=lightgray] (2,0) rectangle (3,1);
+\fill[color=lightgray] (4,0) rectangle (5,1);
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$4$};
+\node at (2.5,0.5) {$5$};
+\node at (3.5,0.5) {$6$};
+\node at (4.5,0.5) {$7$};
+\node at (5.5,0.5) {$9$};
+\node at (6.5,0.5) {$9$};
+\node at (7.5,0.5) {$10$};
+
+\draw[thick,->] (2.5,-0.7) -- (2.5,-0.1);
+\draw[thick,->] (4.5,-0.7) -- (4.5,-0.1);
+\end{tikzpicture}
+\end{center}
+
+The running time of the algorithm is
+$O(n \log n)$, because it first sorts
+the array in $O(n \log n)$ time,
+and then both pointers move $O(n)$ steps.
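+
+A minimal C++ sketch of this algorithm is shown below.
+The function name \texttt{twoSum} and the choice to return the two values
+in a vector, or an empty vector if no solution exists,
+are assumptions of this example.
+
```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: returns two values from different positions of a
// whose sum is x, or an empty vector if no such values exist.
std::vector<long long> twoSum(std::vector<long long> a, long long x) {
    std::sort(a.begin(), a.end());
    int right = static_cast<int>(a.size()) - 1;
    for (int left = 0; left < right; left++) {
        // Move the right pointer to the left until the sum is at most x.
        while (left < right && a[left] + a[right] > x) {
            right--;
        }
        if (left < right && a[left] + a[right] == x) {
            return {a[left], a[right]};
        }
    }
    return {};
}
```
+
+For the example array and $x=12$, the function returns the values 5 and 7.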
+
+Note that it is possible to solve the problem
+in another way in $O(n \log n)$ time using binary search.
+In such a solution, we iterate through the array
+and for each array value, we try to find another
+value that yields the sum $x$.
+This can be done by performing $n$ binary searches,
+each of which takes $O(\log n)$ time.
+
+\index{3SUM problem}
+A more difficult problem is
+the \key{3SUM problem}, which asks us to
+find \emph{three} array values
+whose sum is $x$.
+Using the idea of the above algorithm,
+this problem can be solved in $O(n^2)$ time\footnote{For a long time,
+it was thought that solving
+the 3SUM problem more efficiently than in $O(n^2)$ time
+would not be possible.
+However, in 2014, it turned out \cite{gro14}
+that this is not the case.}.
+Can you see how?
+
+\section{Nearest smaller elements}
+
+\index{nearest smaller elements}
+
+Amortized analysis is often used to
+estimate the number of operations
+performed on a data structure.
+The operations may be distributed unevenly so
+that most operations occur during a
+certain phase of the algorithm, but the total
+number of the operations is limited.
+
+As an example, consider the problem
+of finding for each array element
+the \key{nearest smaller element}, i.e.,
+the first smaller element that precedes the element
+in the array.
+It is possible that no such element exists,
+in which case the algorithm should report this.
+Next we will see how the problem can be
+efficiently solved using a stack structure.
+
+We go through the array from left to right
+and maintain a stack of array elements.
+At each array position, we remove elements from the stack
+until the top element is smaller than the
+current element, or the stack is empty.
+Then, we report that the top element is
+the nearest smaller element of the current element,
+or if the stack is empty, there is no such element.
+Finally, we add the current element to the stack.
+
+As an example, consider the following array:
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$3$};
+\node at (2.5,0.5) {$4$};
+\node at (3.5,0.5) {$2$};
+\node at (4.5,0.5) {$5$};
+\node at (5.5,0.5) {$3$};
+\node at (6.5,0.5) {$4$};
+\node at (7.5,0.5) {$2$};
+\end{tikzpicture}
+\end{center}
+
+First, the elements 1, 3 and 4 are added to the stack,
+because each element is larger than the previous element.
+Thus, the nearest smaller element of 4 is 3,
+and the nearest smaller element of 3 is 1.
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\fill[color=lightgray] (2,0) rectangle (3,1);
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$3$};
+\node at (2.5,0.5) {$4$};
+\node at (3.5,0.5) {$2$};
+\node at (4.5,0.5) {$5$};
+\node at (5.5,0.5) {$3$};
+\node at (6.5,0.5) {$4$};
+\node at (7.5,0.5) {$2$};
+
+\draw (0.2,0.2-1.2) rectangle (0.8,0.8-1.2);
+\draw (1.2,0.2-1.2) rectangle (1.8,0.8-1.2);
+\draw (2.2,0.2-1.2) rectangle (2.8,0.8-1.2);
+
+\node at (0.5,0.5-1.2) {$1$};
+\node at (1.5,0.5-1.2) {$3$};
+\node at (2.5,0.5-1.2) {$4$};
+
+\draw[->,thick] (0.8,0.5-1.2) -- (1.2,0.5-1.2);
+\draw[->,thick] (1.8,0.5-1.2) -- (2.2,0.5-1.2);
+\end{tikzpicture}
+\end{center}
+
+The next element 2 is smaller than the two top
+elements in the stack.
+Thus, the elements 3 and 4 are removed from the stack,
+and then the element 2 is added to the stack.
+Its nearest smaller element is 1:
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\fill[color=lightgray] (3,0) rectangle (4,1);
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$3$};
+\node at (2.5,0.5) {$4$};
+\node at (3.5,0.5) {$2$};
+\node at (4.5,0.5) {$5$};
+\node at (5.5,0.5) {$3$};
+\node at (6.5,0.5) {$4$};
+\node at (7.5,0.5) {$2$};
+
+\draw (0.2,0.2-1.2) rectangle (0.8,0.8-1.2);
+\draw (3.2,0.2-1.2) rectangle (3.8,0.8-1.2);
+
+\node at (0.5,0.5-1.2) {$1$};
+\node at (3.5,0.5-1.2) {$2$};
+
+\draw[->,thick] (0.8,0.5-1.2) -- (3.2,0.5-1.2);
+\end{tikzpicture}
+\end{center}
+
+Then, the element 5 is larger than the element 2,
+so it will be added to the stack, and
+its nearest smaller element is 2:
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\fill[color=lightgray] (4,0) rectangle (5,1);
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$3$};
+\node at (2.5,0.5) {$4$};
+\node at (3.5,0.5) {$2$};
+\node at (4.5,0.5) {$5$};
+\node at (5.5,0.5) {$3$};
+\node at (6.5,0.5) {$4$};
+\node at (7.5,0.5) {$2$};
+
+\draw (0.2,0.2-1.2) rectangle (0.8,0.8-1.2);
+\draw (3.2,0.2-1.2) rectangle (3.8,0.8-1.2);
+\draw (4.2,0.2-1.2) rectangle (4.8,0.8-1.2);
+
+\node at (0.5,0.5-1.2) {$1$};
+\node at (3.5,0.5-1.2) {$2$};
+\node at (4.5,0.5-1.2) {$5$};
+
+\draw[->,thick] (0.8,0.5-1.2) -- (3.2,0.5-1.2);
+\draw[->,thick] (3.8,0.5-1.2) -- (4.2,0.5-1.2);
+\end{tikzpicture}
+\end{center}
+
+After this, the element 5 is removed from the stack
+and the elements 3 and 4 are added to the stack:
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\fill[color=lightgray] (6,0) rectangle (7,1);
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$3$};
+\node at (2.5,0.5) {$4$};
+\node at (3.5,0.5) {$2$};
+\node at (4.5,0.5) {$5$};
+\node at (5.5,0.5) {$3$};
+\node at (6.5,0.5) {$4$};
+\node at (7.5,0.5) {$2$};
+
+\draw (0.2,0.2-1.2) rectangle (0.8,0.8-1.2);
+\draw (3.2,0.2-1.2) rectangle (3.8,0.8-1.2);
+\draw (5.2,0.2-1.2) rectangle (5.8,0.8-1.2);
+\draw (6.2,0.2-1.2) rectangle (6.8,0.8-1.2);
+
+\node at (0.5,0.5-1.2) {$1$};
+\node at (3.5,0.5-1.2) {$2$};
+\node at (5.5,0.5-1.2) {$3$};
+\node at (6.5,0.5-1.2) {$4$};
+
+\draw[->,thick] (0.8,0.5-1.2) -- (3.2,0.5-1.2);
+\draw[->,thick] (3.8,0.5-1.2) -- (5.2,0.5-1.2);
+\draw[->,thick] (5.8,0.5-1.2) -- (6.2,0.5-1.2);
+\end{tikzpicture}
+\end{center}
+
+Finally, all elements except 1 are removed
+from the stack and the last element 2
+is added to the stack:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\fill[color=lightgray] (7,0) rectangle (8,1);
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$3$};
+\node at (2.5,0.5) {$4$};
+\node at (3.5,0.5) {$2$};
+\node at (4.5,0.5) {$5$};
+\node at (5.5,0.5) {$3$};
+\node at (6.5,0.5) {$4$};
+\node at (7.5,0.5) {$2$};
+
+\draw (0.2,0.2-1.2) rectangle (0.8,0.8-1.2);
+\draw (7.2,0.2-1.2) rectangle (7.8,0.8-1.2);
+
+\node at (0.5,0.5-1.2) {$1$};
+\node at (7.5,0.5-1.2) {$2$};
+
+\draw[->,thick] (0.8,0.5-1.2) -- (7.2,0.5-1.2);
+\end{tikzpicture}
+\end{center}
+
+The efficiency of the algorithm depends on
+the total number of stack operations.
+If the current element is larger than
+the top element in the stack, it is directly
+added to the stack, which is efficient.
+However, sometimes the stack can contain several
+larger elements and it takes time to remove them.
+Still, each element is added \emph{exactly once} to the stack
+and removed \emph{at most once} from the stack.
+Thus, each element causes $O(1)$ stack operations,
+and the algorithm works in $O(n)$ time.
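+
+The algorithm can be sketched in C++ as follows.
+The function name \texttt{nearestSmaller} is hypothetical,
+and the sketch stores 0-based positions instead of values in the stack
+and reports $-1$ when no smaller element exists.
+Both choices are assumptions of this example.
+
```cpp
#include <stack>
#include <vector>

// Hypothetical helper: for each position i, returns the 0-based position
// of the nearest smaller element that precedes a[i], or -1 if none exists.
std::vector<int> nearestSmaller(const std::vector<int>& a) {
    std::vector<int> result(a.size(), -1);
    std::stack<int> positions;  // values at these positions are increasing
    for (int i = 0; i < static_cast<int>(a.size()); i++) {
        // Remove elements until the top element is smaller than a[i].
        while (!positions.empty() && a[positions.top()] >= a[i]) {
            positions.pop();
        }
        if (!positions.empty()) {
            result[i] = positions.top();
        }
        positions.push(i);
    }
    return result;
}
```
+
+For the example array $[1,3,4,2,5,3,4,2]$, the nearest smaller elements
+found by the function are, in order, none, 1, 3, 1, 2, 2, 3 and 1.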
+
+\section{Sliding window minimum}
+
+\index{sliding window}
+\index{sliding window minimum}
+
+A \key{sliding window} is a constant-size subarray
+that moves from left to right through the array.
+At each window position,
+we want to calculate some information
+about the elements inside the window.
+In this section, we focus on the problem
+of maintaining the \key{sliding window minimum},
+which means that
+we should report the smallest value inside each window.
+
+The sliding window minimum can be calculated
+using an idea similar to the one we used to calculate
+the nearest smaller elements.
+We maintain a queue
+where each element is larger than
+the previous element,
+and the first element
+always corresponds to the minimum element inside the window.
+After each window move,
+we remove elements from the end of the queue
+until the last queue element
+is smaller than the new window element,
+or the queue becomes empty.
+We also remove the first queue element
+if it is not inside the window anymore.
+Finally, we add the new window element
+to the end of the queue.
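+
+The steps above can be sketched in C++ using a double-ended queue.
+The function name \texttt{slidingWindowMinimum} is hypothetical,
+the queue stores 0-based positions so that we can tell
+when the first element has left the window,
+and the sketch assumes that the window size $k$ is at least 1.
+
```cpp
#include <deque>
#include <vector>

// Hypothetical helper: returns the minimum of every window of size k,
// from left to right. Assumes k >= 1.
std::vector<int> slidingWindowMinimum(const std::vector<int>& a, int k) {
    std::vector<int> result;
    std::deque<int> positions;  // values at these positions are increasing
    for (int i = 0; i < static_cast<int>(a.size()); i++) {
        // Remove elements from the end until the last one is smaller than a[i].
        while (!positions.empty() && a[positions.back()] >= a[i]) {
            positions.pop_back();
        }
        positions.push_back(i);
        // Remove the first element if it is not inside the window anymore.
        if (positions.front() <= i - k) {
            positions.pop_front();
        }
        // The window a[i-k+1..i] is complete once i >= k-1.
        if (i >= k - 1) {
            result.push_back(a[positions.front()]);
        }
    }
    return result;
}
```
+
+For the array $[2,1,4,5,3,4,1,2]$ and window size 4,
+the function returns the minimums 1, 1, 3, 1 and 1.
+Each position is added to the queue once and removed at most once,
+so the total time is $O(n)$.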
+
+As an example, consider the following array:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$2$};
+\node at (1.5,0.5) {$1$};
+\node at (2.5,0.5) {$4$};
+\node at (3.5,0.5) {$5$};
+\node at (4.5,0.5) {$3$};
+\node at (5.5,0.5) {$4$};
+\node at (6.5,0.5) {$1$};
+\node at (7.5,0.5) {$2$};
+\end{tikzpicture}
+\end{center}
+
+Suppose that the size of the sliding window is 4.
+At the first window position, the smallest value is 1:
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\fill[color=lightgray] (0,0) rectangle (4,1);
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$2$};
+\node at (1.5,0.5) {$1$};
+\node at (2.5,0.5) {$4$};
+\node at (3.5,0.5) {$5$};
+\node at (4.5,0.5) {$3$};
+\node at (5.5,0.5) {$4$};
+\node at (6.5,0.5) {$1$};
+\node at (7.5,0.5) {$2$};
+
+\draw (1.2,0.2-1.2) rectangle (1.8,0.8-1.2);
+\draw (2.2,0.2-1.2) rectangle (2.8,0.8-1.2);
+\draw (3.2,0.2-1.2) rectangle (3.8,0.8-1.2);
+
+\node at (1.5,0.5-1.2) {$1$};
+\node at (2.5,0.5-1.2) {$4$};
+\node at (3.5,0.5-1.2) {$5$};
+
+\draw[->,thick] (1.8,0.5-1.2) -- (2.2,0.5-1.2);
+\draw[->,thick] (2.8,0.5-1.2) -- (3.2,0.5-1.2);
+\end{tikzpicture}
+\end{center}
+
+Then the window moves one step right.
+The new element 3 is smaller than the elements
+4 and 5 in the queue, so the elements 4 and 5
+are removed from the queue
+and the element 3 is added to the queue.
+The smallest value is still 1.
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\fill[color=lightgray] (1,0) rectangle (5,1);
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$2$};
+\node at (1.5,0.5) {$1$};
+\node at (2.5,0.5) {$4$};
+\node at (3.5,0.5) {$5$};
+\node at (4.5,0.5) {$3$};
+\node at (5.5,0.5) {$4$};
+\node at (6.5,0.5) {$1$};
+\node at (7.5,0.5) {$2$};
+
+\draw (1.2,0.2-1.2) rectangle (1.8,0.8-1.2);
+\draw (4.2,0.2-1.2) rectangle (4.8,0.8-1.2);
+
+\node at (1.5,0.5-1.2) {$1$};
+\node at (4.5,0.5-1.2) {$3$};
+
+\draw[->,thick] (1.8,0.5-1.2) -- (4.2,0.5-1.2);
+\end{tikzpicture}
+\end{center}
+
+After this, the window moves again,
+and the smallest element 1
+does not belong to the window anymore.
+Thus, it is removed from the queue and the smallest
+value is now 3. Also the new element 4
+is added to the queue.
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\fill[color=lightgray] (2,0) rectangle (6,1);
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$2$};
+\node at (1.5,0.5) {$1$};
+\node at (2.5,0.5) {$4$};
+\node at (3.5,0.5) {$5$};
+\node at (4.5,0.5) {$3$};
+\node at (5.5,0.5) {$4$};
+\node at (6.5,0.5) {$1$};
+\node at (7.5,0.5) {$2$};
+
+\draw (4.2,0.2-1.2) rectangle (4.8,0.8-1.2);
+\draw (5.2,0.2-1.2) rectangle (5.8,0.8-1.2);
+
+\node at (4.5,0.5-1.2) {$3$};
+\node at (5.5,0.5-1.2) {$4$};
+
+\draw[->,thick] (4.8,0.5-1.2) -- (5.2,0.5-1.2);
+\end{tikzpicture}
+\end{center}
+
+The next new element 1 is smaller than all elements
+in the queue.
+Thus, all elements are removed from the queue
+and it will only contain the element 1:
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\fill[color=lightgray] (3,0) rectangle (7,1);
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$2$};
+\node at (1.5,0.5) {$1$};
+\node at (2.5,0.5) {$4$};
+\node at (3.5,0.5) {$5$};
+\node at (4.5,0.5) {$3$};
+\node at (5.5,0.5) {$4$};
+\node at (6.5,0.5) {$1$};
+\node at (7.5,0.5) {$2$};
+
+\draw (6.2,0.2-1.2) rectangle (6.8,0.8-1.2);
+
+\node at (6.5,0.5-1.2) {$1$};
+\end{tikzpicture}
+\end{center}
+
+Finally the window reaches its last position.
+The element 2 is added to the queue,
+but the smallest value inside the window
+is still 1.
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\fill[color=lightgray] (4,0) rectangle (8,1);
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$2$};
+\node at (1.5,0.5) {$1$};
+\node at (2.5,0.5) {$4$};
+\node at (3.5,0.5) {$5$};
+\node at (4.5,0.5) {$3$};
+\node at (5.5,0.5) {$4$};
+\node at (6.5,0.5) {$1$};
+\node at (7.5,0.5) {$2$};
+
+\draw (6.2,0.2-1.2) rectangle (6.8,0.8-1.2);
+\draw (7.2,0.2-1.2) rectangle (7.8,0.8-1.2);
+
+\node at (6.5,0.5-1.2) {$1$};
+\node at (7.5,0.5-1.2) {$2$};
+
+\draw[->,thick] (6.8,0.5-1.2) -- (7.2,0.5-1.2);
+\end{tikzpicture}
+\end{center}
+
+Since each array element
+is added to the queue exactly once and
+removed from the queue at most once,
+the algorithm works in $O(n)$ time.
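+
+The steps above can be sketched in C++ as follows.
+This is a minimal illustration rather than a definitive implementation:
+the function name \texttt{sliding\_min} and the choice to store
+array indices (instead of values) in a \texttt{std::deque}
+are assumptions made for this example.

```cpp
#include <deque>
#include <vector>

// Returns the minimum of every window of size k in the array a.
// Assumes 1 <= k <= a.size(); the name sliding_min is hypothetical.
std::vector<int> sliding_min(const std::vector<int>& a, int k) {
    std::deque<int> q;        // indices; their values increase from front to back
    std::vector<int> result;
    int n = a.size();
    for (int i = 0; i < n; i++) {
        // remove elements from the end that are not smaller than the new element
        while (!q.empty() && a[q.back()] >= a[i]) q.pop_back();
        // remove the first element if it is not inside the window anymore
        if (!q.empty() && q.front() <= i - k) q.pop_front();
        // add the new window element to the end of the queue
        q.push_back(i);
        // the first element is the minimum once the first window is complete
        if (i >= k - 1) result.push_back(a[q.front()]);
    }
    return result;
}
```

+Storing indices makes it easy to test whether the first queue element
+has left the window.
+For the example array and window size 4,
+the function returns the minima 1, 1, 3, 1 and 1.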
+
+
+
--- a/chapter09.tex
+++ b/chapter09.tex
--- a/chapter10.tex
+++ b/chapter10.tex
@ -0,0 +1,849 @@
+\chapter{Bit manipulation}
+
+All data in computer programs is internally stored as bits,
+i.e., as numbers 0 and 1.
+This chapter discusses the bit representation
+of integers, and shows examples
+of how to use bit operations.
+It turns out that there are many uses for
+bit manipulation in algorithm programming.
+
+\section{Bit representation}
+
+\index{bit representation}
+
+In programming, an $n$-bit integer is internally
+stored as a binary number that consists of $n$ bits.
+For example, the C++ type \texttt{int} is
+a 32-bit type, which means that every \texttt{int}
+number consists of 32 bits.
+
+Here is the bit representation of
+the \texttt{int} number 43: 
+\[00000000000000000000000000101011\]
+The bits in the representation are indexed from right to left.
+To convert a bit representation $b_k \cdots b_2 b_1 b_0$ into a number,
+we can use the formula
+\[b_k 2^k + \ldots + b_2 2^2 + b_1 2^1 + b_0 2^0.\]
+For example,
+\[1 \cdot 2^5 + 1 \cdot 2^3 + 1 \cdot 2^1 + 1 \cdot 2^0 = 43.\]
+
+The bit representation of a number is either
+\key{signed} or \key{unsigned}.
+Usually a signed representation is used,
+which means that both negative and positive
+numbers can be represented.
+A signed variable of $n$ bits can contain any
+integer between $-2^{n-1}$ and $2^{n-1}-1$.
+For example, the \texttt{int} type in C++ is
+a signed type, so an \texttt{int} variable can contain any
+integer between $-2^{31}$ and $2^{31}-1$.
+
+The first bit in a signed representation
+is the sign of the number (0 for nonnegative numbers
+and 1 for negative numbers), and
+the remaining $n-1$ bits encode the rest of the value.
+\key{Two's complement} is used, which means that the
+opposite number of a number is calculated by first
+inverting all the bits in the number,
+and then increasing the number by one.
+
+For example, the bit representation of
+the \texttt{int} number $-43$ is
+\[11111111111111111111111111010101.\]
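+
+The two's complement rule can be checked with a small sketch.
+The function name \texttt{twos\_complement\_negate} is hypothetical,
+and unsigned 32-bit arithmetic is used so that
+the bit pattern can be inspected without relying on signed overflow.

```cpp
#include <cstdint>

// Negates x as a 32-bit two's complement number:
// first invert all the bits, then increase the number by one.
// Unsigned arithmetic wraps around modulo 2^32, so this is well defined.
std::uint32_t twos_complement_negate(std::uint32_t x) {
    return ~x + 1u;
}
```

+For example, applying the function to 43 yields 4294967253,
+whose bit pattern is exactly the representation of $-43$ above.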
+
+In an unsigned representation, only nonnegative
+numbers can be used, but the upper bound for the values is larger.
+An unsigned variable of $n$ bits can contain any
+integer between $0$ and $2^n-1$.
+For example, in C++, an \texttt{unsigned int} variable
+can contain any integer between $0$ and $2^{32}-1$.
+
+There is a connection between the
+representations:
+a signed number $-x$ equals an unsigned number $2^n-x$.
+For example, the following code shows that
+the signed number $x=-43$ equals the unsigned
+number $y=2^{32}-43$:
+\begin{lstlisting}
+int x = -43;
+unsigned int y = x;
+cout << x << "\n"; // -43
+cout << y << "\n"; // 4294967253
+\end{lstlisting}
+
+If a number is larger than the upper bound
+of the bit representation, the number will overflow.
+In a signed representation,
+the next number after $2^{n-1}-1$ is $-2^{n-1}$,
+and in an unsigned representation,
+the next number after $2^n-1$ is $0$.
+For example, consider the following code:
+\begin{lstlisting}
+int x = 2147483647;
+cout << x << "\n"; // 2147483647
+x++;
+cout << x << "\n"; // -2147483648
+\end{lstlisting}
+
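+Initially, the value of $x$ is $2^{31}-1$.
+This is the largest value that can be stored
+in an \texttt{int} variable,
+so the next number after $2^{31}-1$ is $-2^{31}$.
+Note that the C++ standard leaves signed overflow undefined,
+so this wrap-around is what is typically observed in practice
+rather than a guaranteed result;
+unsigned overflow, in contrast, is guaranteed to wrap around.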
+
+
+\section{Bit operations}
+
+\newcommand\XOR{\mathbin{\char`\^}}
+
+\subsubsection{And operation}
+
+\index{and operation}
+
+The \key{and} operation $x$ \& $y$ produces a number
+that has one bits in positions where both
+$x$ and $y$ have one bits.
+For example, $22$ \& $26$ = 18, because
+
+\begin{center}
+\begin{tabular}{rrr}
+& 10110 & (22)\\
+\& & 11010 & (26) \\
+\hline
+ = & 10010 & (18) \\
+\end{tabular}
+\end{center}
+
+Using the and operation, we can check if a number
+$x$ is even because
+$x$ \& $1$ = 0 if $x$ is even, and
+$x$ \& $1$ = 1 if $x$ is odd.
+More generally, $x$ is divisible by $2^k$
+exactly when $x$ \& $(2^k-1)$ = 0.
+
+\subsubsection{Or operation}
+
+\index{or operation}
+
+The \key{or} operation $x$ | $y$ produces a number
+that has one bits in positions where at least one
+of $x$ and $y$ has a one bit.
+For example, $22$ | $26$ = 30, because
+
+\begin{center}
+\begin{tabular}{rrr}
+& 10110 & (22)\\
+| & 11010 & (26) \\
+\hline
+ = & 11110 & (30) \\
+\end{tabular}
+\end{center}
+
+\subsubsection{Xor operation}
+
+\index{xor operation}
+
+The \key{xor} operation $x$ $\XOR$ $y$ produces a number
+that has one bits in positions where exactly one
+of $x$ and $y$ has a one bit.
+For example, $22$ $\XOR$ $26$ = 12, because
+
+\begin{center}
+\begin{tabular}{rrr}
+& 10110 & (22)\\
+$\XOR$ & 11010 & (26) \\
+\hline
+ = & 01100 & (12) \\
+\end{tabular}
+\end{center}
+
+\subsubsection{Not operation}
+
+\index{not operation}
+
+The \key{not} operation \textasciitilde$x$
+produces a number where all the bits of $x$
+have been inverted.
+The formula \textasciitilde$x = -x-1$ holds;
+for example, \textasciitilde$29 = -30$.
+
+The result of the not operation at the bit level
+depends on the length of the bit representation,
+because the operation inverts all bits.
+For example, if the numbers are 32-bit
+\texttt{int} numbers, the result is as follows:
+
+\begin{center}
+\begin{tabular}{rrrr}
+$x$ & = & 29 &   00000000000000000000000000011101 \\
+\textasciitilde$x$ & = & $-30$ & 11111111111111111111111111100010 \\
+\end{tabular}
+\end{center}
+
+\subsubsection{Bit shifts}
+
+\index{bit shift}
+
+The left bit shift $x < < k$ appends $k$
+zero bits to the number,
+and the right bit shift $x > > k$
+removes the $k$ last bits from the number.
+For example, $14 < < 2 = 56$,
+because $14$ and $56$ correspond to 1110 and 111000.
+Similarly, $49 > > 3 = 6$,
+because $49$ and $6$ correspond to 110001 and 110.
+
+Note that $x < < k$
+corresponds to multiplying $x$ by $2^k$,
+and $x > > k$
+corresponds to dividing $x$ by $2^k$
+rounded down to an integer.
+
+\subsubsection{Applications}
+
+A number of the form $1 < < k$ has a one bit
+in position $k$ and all other bits are zero,
+so we can use such numbers to access single bits of numbers.
+In particular, the $k$th bit of a number $x$ is one
+exactly when $x$ \& $(1 < < k)$ is not zero.
+The following code prints the bit representation
+of an \texttt{int} number $x$:
+
+\begin{lstlisting}
+for (int i = 31; i >= 0; i--) {
+    if (x&(1<<i)) cout << "1";
+    else cout << "0";
+}
+\end{lstlisting}
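+
+As a variation of the loop above, the following sketch builds the
+bit representation as a string instead of printing it.
+The function name \texttt{to\_bits} is hypothetical, and the code
+assumes a 32-bit \texttt{unsigned int}; the unsigned constant
+\texttt{1u} is used so that shifting into bit 31 does not touch the sign bit.

```cpp
#include <string>

// Returns the 32-character bit representation of x,
// most significant bit first (assumes a 32-bit unsigned int).
std::string to_bits(unsigned int x) {
    std::string s;
    for (int i = 31; i >= 0; i--) {
        // bit i is one exactly when x & (1 << i) is not zero
        s += (x & (1u << i)) ? '1' : '0';
    }
    return s;
}
```

+For the number 43, the returned string consists of 26 zeros
+followed by 101011.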
+
+It is also possible to modify single bits
+of numbers using similar ideas.
+For example, the formula $x$ | $(1 < < k)$
+sets the $k$th bit of $x$ to one,
+the formula
+$x$ \& \textasciitilde $(1 < < k)$
+sets the $k$th bit of $x$ to zero,
+and the formula
+$x$ $\XOR$ $(1 < < k)$
+inverts the $k$th bit of $x$.
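+
+The formulas above can be wrapped into small helper functions.
+The names below are hypothetical, and unsigned numbers are used
+as an assumption to keep the shifts well defined for every bit position.

```cpp
// Helper functions for single bits (hypothetical names).
// All of them assume 0 <= k < 32 and a 32-bit unsigned int.

// sets the kth bit of x to one
unsigned int set_bit(unsigned int x, int k) { return x | (1u << k); }

// sets the kth bit of x to zero
unsigned int clear_bit(unsigned int x, int k) { return x & ~(1u << k); }

// inverts the kth bit of x
unsigned int flip_bit(unsigned int x, int k) { return x ^ (1u << k); }

// checks whether the kth bit of x is one
bool test_bit(unsigned int x, int k) { return (x & (1u << k)) != 0; }
```

+For example, setting bit 2 of 18 (10010) gives 22 (10110),
+and clearing the same bit of 22 gives 18 again.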
+
+The formula $x$ \& $(x-1)$ sets the last
+one bit of $x$ to zero,
+and the formula $x$ \& $-x$ sets all the
+one bits to zero, except for the last one bit.
+The formula $x$ | $(x-1)$
+inverts all the bits after the last one bit.
+Also note that a positive number $x$ is
+a power of two exactly when $x$ \& $(x-1) = 0$.
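+
+A short sketch may help to see what these formulas do.
+The function names are hypothetical; unsigned numbers are assumed,
+and $-x$ is written as \texttt{0u - x}, which has the same bits.

```cpp
// Tricks based on the last (lowest) one bit of x (hypothetical names).

// sets the last one bit of x to zero
unsigned int clear_last_one(unsigned int x) { return x & (x - 1); }

// keeps only the last one bit of x (same as x & -x)
unsigned int last_one(unsigned int x) { return x & (0u - x); }

// inverts all the bits after the last one bit
unsigned int fill_after_last_one(unsigned int x) { return x | (x - 1); }

// a positive number is a power of two exactly when x & (x-1) is zero
bool is_power_of_two(unsigned int x) { return x != 0 && (x & (x - 1)) == 0; }
```

+For example, for $x=40$ (101000) the first three functions
+return 32 (100000), 8 (001000) and 47 (101111).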
+
+\subsubsection*{Additional functions}
+
+The g++ compiler provides the following
+functions for counting bits:
+
+\begin{itemize}
+\item
+$\texttt{\_\_builtin\_clz}(x)$:
+the number of zeros at the beginning of the number
+\item
+$\texttt{\_\_builtin\_ctz}(x)$:
+the number of zeros at the end of the number
+\item
+$\texttt{\_\_builtin\_popcount}(x)$:
+the number of ones in the number
+\item
+$\texttt{\_\_builtin\_parity}(x)$:
+the parity (even or odd) of the number of ones
+\end{itemize}
+\begin{samepage}
+
+The functions can be used as follows:
+\begin{lstlisting}
+int x = 5328; // 00000000000000000001010011010000
+cout << __builtin_clz(x) << "\n"; // 19
+cout << __builtin_ctz(x) << "\n"; // 4
+cout << __builtin_popcount(x) << "\n"; // 5
+cout << __builtin_parity(x) << "\n"; // 1
+\end{lstlisting}
+\end{samepage}
+
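+While the above functions only support \texttt{int} numbers,
+there are also \texttt{long long} versions of
+the functions available with the suffix \texttt{ll}.
+Note that the results of \texttt{\_\_builtin\_clz} and
+\texttt{\_\_builtin\_ctz} are undefined when $x=0$.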
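+
+As an illustration of the \texttt{ll} versions, the following sketch
+wraps them into helper functions.
+The helper names are hypothetical, the code assumes g++ (or a compatible
+compiler) and a 64-bit \texttt{unsigned long long}.

```cpp
// Wrappers around the g++ builtins for 64-bit numbers (hypothetical names).

// number of one bits in x
int count_ones(unsigned long long x) { return __builtin_popcountll(x); }

// position of the highest one bit; x must not be zero
int highest_one_position(unsigned long long x) { return 63 - __builtin_clzll(x); }

// position of the lowest one bit; x must not be zero
int lowest_one_position(unsigned long long x) { return __builtin_ctzll(x); }
```

+For the number 5328 used above, the highest one bit is in position 12
+and the lowest one bit is in position 4.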
+
+\section{Representing sets}
+
+Every subset of a set
+$\{0,1,2,\ldots,n-1\}$
+can be represented as an $n$-bit integer
+whose one bits indicate which
+elements belong to the subset.
+This is an efficient way to represent sets,
+because every element requires only one bit of memory,
+and set operations can be implemented as bit operations.
+
+For example, since \texttt{int} is a 32-bit type,
+an \texttt{int} number can represent any subset
+of the set $\{0,1,2,\ldots,31\}$.
+The bit representation of the set $\{1,3,4,8\}$ is
+\[00000000000000000000000100011010,\]
+which corresponds to the number $2^8+2^4+2^3+2^1=282$.
+
+\subsubsection{Set implementation}
+
+The following code declares an \texttt{int}
+variable $x$ that can contain
+a subset of $\{0,1,2,\ldots,31\}$.
+After this, the code adds the elements 1, 3, 4 and 8
+to the set and prints the size of the set.
+\begin{lstlisting}
+int x = 0;
+x |= (1<<1);
+x |= (1<<3);
+x |= (1<<4);
+x |= (1<<8);
+cout << __builtin_popcount(x) << "\n"; // 4
+\end{lstlisting}
+Then, the following code prints all
+elements that belong to the set:
+\begin{lstlisting}
+for (int i = 0; i < 32; i++) {
+    if (x&(1<<i)) cout << i << " ";
+}
+// output: 1 3 4 8
+\end{lstlisting}
+
+\subsubsection{Set operations}
+
+Set operations can be implemented as bit operations as follows:
+
+\begin{center}
+\begin{tabular}{lll}
+& set syntax & bit syntax \\
+\hline
+intersection & $a \cap b$ & $a$ \& $b$ \\
+union & $a \cup b$ & $a$ | $b$ \\
+complement & $\bar a$ & \textasciitilde$a$ \\
+difference & $a \setminus b$ & $a$ \& (\textasciitilde$b$) \\
+\end{tabular}
+\end{center}
+
+For example, the following code first constructs
+the sets $x=\{1,3,4,8\}$ and $y=\{3,6,8,9\}$,
+and then constructs the set $z = x \cup y = \{1,3,4,6,8,9\}$:
+
+\begin{lstlisting}
+int x = (1<<1)|(1<<3)|(1<<4)|(1<<8);
+int y = (1<<3)|(1<<6)|(1<<8)|(1<<9);
+int z = x|y;
+cout << __builtin_popcount(z) << "\n"; // 6
+\end{lstlisting}
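+
+The other operations in the table can be tried in the same way.
+In the following sketch, the helper functions \texttt{make\_set} and
+\texttt{set\_elements} are hypothetical, and the elements are assumed to be
+between 0 and 30 so that the sign bit of \texttt{int} is never used.

```cpp
#include <vector>

// Builds the bit representation of a set (elements between 0 and 30).
int make_set(const std::vector<int>& elems) {
    int s = 0;
    for (int e : elems) s |= (1 << e);
    return s;
}

// Lists the elements of a set in increasing order.
std::vector<int> set_elements(int s) {
    std::vector<int> result;
    for (int i = 0; i < 31; i++) {
        if (s & (1 << i)) result.push_back(i);
    }
    return result;
}

// intersection: a & b, difference: a & (~b)
int set_intersection_bits(int a, int b) { return a & b; }
int set_difference_bits(int a, int b) { return a & (~b); }
```

+For the sets $x=\{1,3,4,8\}$ and $y=\{3,6,8,9\}$,
+the intersection is $\{3,8\}$ and the difference $x \setminus y$ is $\{1,4\}$.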
+
+\subsubsection{Iterating through subsets}
+
+The following code goes through
+the subsets of $\{0,1,\ldots,n-1\}$:
+
+\begin{lstlisting}
+for (int b = 0; b < (1<<n); b++) {
+    // process subset b
+}
+\end{lstlisting}
+The following code goes through
+the subsets with exactly $k$ elements:
+\begin{lstlisting}
+for (int b = 0; b < (1<<n); b++) {
+    if (__builtin_popcount(b) == k) {
+        // process subset b
+    }
+}
+\end{lstlisting}
+The following code goes through the subsets
+of a set $x$:
+\begin{lstlisting}
+int b = 0;
+do {
+    // process subset b
+} while (b=(b-x)&x);
+\end{lstlisting}
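+
+To see in which order the last loop visits the subsets,
+the following sketch collects them into a vector.
+The function name \texttt{subsets\_of} is hypothetical,
+and $x$ is assumed to be a nonnegative \texttt{int}.

```cpp
#include <vector>

// Returns all subsets of the set x in the order in which
// the loop b = (b-x) & x visits them (x must be nonnegative).
std::vector<int> subsets_of(int x) {
    std::vector<int> result;
    int b = 0;
    do {
        result.push_back(b);  // process subset b
    } while ((b = (b - x) & x) != 0);
    return result;
}
```

+For example, for $x=\{1,3\}$ (the number 10) the subsets are visited
+in the order 0, 2, 8, 10, i.e., $\emptyset$, $\{1\}$, $\{3\}$ and $\{1,3\}$.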
+
+\section{Bit optimizations}
+
+Many algorithms can be optimized using
+bit operations.
+Such optimizations do not change the
+time complexity of the algorithm,
+but they may have a large impact
+on the actual running time of the code.
+In this section we discuss examples
+of such situations.
+
+\subsubsection{Hamming distances}
+
+\index{Hamming distance}
+The \key{Hamming distance}
+$\texttt{hamming}(a,b)$ between two
+strings $a$ and $b$ of equal length is
+the number of positions where the strings differ.
+For example,
+\[\texttt{hamming}(01101,11001)=2.\]
+
+Consider the following problem: Given
+a list of $n$ bit strings, each of length $k$,
+calculate the minimum Hamming distance
+between two strings in the list.
+For example, the answer for $[00111,01101,11110]$
+is 2, because
+\begin{itemize}[noitemsep]
+\item $\texttt{hamming}(00111,01101)=2$,
+\item $\texttt{hamming}(00111,11110)=3$, and
+\item $\texttt{hamming}(01101,11110)=3$.
+\end{itemize}
+
+A straightforward way to solve the problem is
+to go through all pairs of strings and calculate
+their Hamming distances,
+which yields an $O(n^2 k)$ time algorithm.
+The following function can be used to
+calculate distances:
+\begin{lstlisting}
+int hamming(string a, string b) {
+    int d = 0;
+    for (int i = 0; i < k; i++) {
+        if (a[i] != b[i]) d++;
+    }
+    return d;
+}
+\end{lstlisting}
+
+However, if $k$ is small, we can optimize the code
+by storing the bit strings as integers and
+calculating the Hamming distances using bit operations.
+In particular, if $k \le 32$, we can just store
+the strings as \texttt{int} values and use the
+following function to calculate distances:
+\begin{lstlisting}
+int hamming(int a, int b) {
+    return __builtin_popcount(a^b);
+}
+\end{lstlisting}
+In the above function, the xor operation constructs
+a bit string that has one bits in positions
+where $a$ and $b$ differ.
+Then, the number of bits is calculated using
+the \texttt{\_\_builtin\_popcount} function.
+
+To compare the implementations, we generated
+a list of 10000 random bit strings of length 30.
+Using the first approach, the search took
+13.5 seconds, and after the bit optimization,
+it only took 0.5 seconds.
+Thus, the bit optimized code was almost
+30 times faster than the original code.
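+
+For completeness, the whole bit optimized search could look as follows.
+This is only a sketch: the function name \texttt{min\_hamming} is hypothetical,
+the list is assumed to contain at least two strings,
+and each string is assumed to be stored as an \texttt{int}.

```cpp
#include <algorithm>
#include <climits>
#include <vector>

// Hamming distance of two bit strings stored as integers (g++ builtin).
int hamming(int a, int b) {
    return __builtin_popcount(a ^ b);
}

// Minimum Hamming distance between two strings in the list.
// Goes through all O(n^2) pairs; assumes the list has at least two strings.
int min_hamming(const std::vector<int>& v) {
    int best = INT_MAX;
    int n = v.size();
    for (int i = 0; i < n; i++) {
        for (int j = i + 1; j < n; j++) {
            best = std::min(best, hamming(v[i], v[j]));
        }
    }
    return best;
}
```

+For the list $[00111,01101,11110]$, i.e., the numbers 7, 13 and 30,
+the function returns 2.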
+
+\subsubsection{Counting subgrids}
+
+As another example, consider the
+following problem:
+Given an $n \times n$ grid in which
+each square is either black (1) or white (0),
+calculate the number of subgrids
+whose corners are all black.
+For example, the grid
+\begin{center}
+\begin{tikzpicture}[scale=0.5]
+\fill[black] (1,1) rectangle (2,2);
+\fill[black] (1,4) rectangle (2,5);
+\fill[black] (4,1) rectangle (5,2);
+\fill[black] (4,4) rectangle (5,5);
+\fill[black] (1,3) rectangle (2,4);
+\fill[black] (2,3) rectangle (3,4);
+\fill[black] (2,1) rectangle (3,2);
+\fill[black] (0,2) rectangle (1,3);
+\draw (0,0) grid (5,5);
+\end{tikzpicture}
+\end{center}
+contains two such subgrids:
+\begin{center}
+\begin{tikzpicture}[scale=0.5]
+\fill[black] (1,1) rectangle (2,2);
+\fill[black] (1,4) rectangle (2,5);
+\fill[black] (4,1) rectangle (5,2);
+\fill[black] (4,4) rectangle (5,5);
+\fill[black] (1,3) rectangle (2,4);
+\fill[black] (2,3) rectangle (3,4);
+\fill[black] (2,1) rectangle (3,2);
+\fill[black] (0,2) rectangle (1,3);
+\draw (0,0) grid (5,5);
+
+\fill[black] (7+1,1) rectangle (7+2,2);
+\fill[black] (7+1,4) rectangle (7+2,5);
+\fill[black] (7+4,1) rectangle (7+5,2);
+\fill[black] (7+4,4) rectangle (7+5,5);
+\fill[black] (7+1,3) rectangle (7+2,4);
+\fill[black] (7+2,3) rectangle (7+3,4);
+\fill[black] (7+2,1) rectangle (7+3,2);
+\fill[black] (7+0,2) rectangle (7+1,3);
+\draw (7+0,0) grid (7+5,5);
+
+\draw[color=red,line width=1mm] (1,1) rectangle (3,4);
+\draw[color=red,line width=1mm] (7+1,1) rectangle (7+5,5);
+\end{tikzpicture}
+\end{center}
+
+There is an $O(n^3)$ time algorithm for solving the problem:
+go through all $O(n^2)$ pairs of rows and for each pair
+$(a,b)$ calculate the number of columns that contain a black
+square in both rows in $O(n)$ time.
+The following code assumes that $\texttt{color}[y][x]$
+denotes the color in row $y$ and column $x$:
+\begin{lstlisting}
+int count = 0;
+for (int i = 0; i < n; i++) {
+    if (color[a][i] == 1 && color[b][i] == 1) count++;
+}
+\end{lstlisting}
+Then, those columns
+account for $\texttt{count}(\texttt{count}-1)/2$ subgrids with black corners,
+because we can choose any two of them to form a subgrid.
+
+To optimize this algorithm, we divide the grid into blocks
+of columns such that each block consists of $N$
+consecutive columns. Then, each row is stored as
+a list of $N$-bit numbers that describe the colors
+of the squares. Now we can process $N$ columns at the same time
+using bit operations. In the following code,
+$\texttt{color}[y][k]$ represents
+a block of $N$ colors as bits.
+\begin{lstlisting}
+int count = 0;
+for (int i = 0; i <= n/N; i++) {
+    count += __builtin_popcount(color[a][i]&color[b][i]);
+}
+\end{lstlisting}
+The resulting algorithm works in $O(n^3/N)$ time.
+
+We generated a random grid of size $2500 \times 2500$
+and compared the original and bit optimized implementation.
+While the original code took $29.6$ seconds,
+the bit optimized version only took $3.1$ seconds
+with $N=32$ (\texttt{int} numbers) and $1.7$ seconds
+with $N=64$ (\texttt{long long} numbers).
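+
+The following sketch puts the pieces together for $N=64$.
+It is one possible arrangement, not necessarily the code used in the measurements:
+the function name \texttt{count\_subgrids} is hypothetical, and the grid is assumed
+to be given as $n$ strings of length $n$ that consist of characters 0 and 1.

```cpp
#include <string>
#include <vector>

// Counts the subgrids whose corners are all black ('1').
// Assumes grid has n rows and each row is a string of length n.
long long count_subgrids(const std::vector<std::string>& grid) {
    int n = grid.size();
    int blocks = (n + 63) / 64;  // number of 64-bit blocks per row
    std::vector<std::vector<unsigned long long>> color(
        n, std::vector<unsigned long long>(blocks, 0));
    // store each row as a list of 64-bit numbers
    for (int y = 0; y < n; y++) {
        for (int x = 0; x < n; x++) {
            if (grid[y][x] == '1') color[y][x / 64] |= 1ULL << (x % 64);
        }
    }
    long long total = 0;
    for (int a = 0; a < n; a++) {
        for (int b = a + 1; b < n; b++) {
            // columns that contain a black square in both rows
            long long count = 0;
            for (int i = 0; i < blocks; i++) {
                count += __builtin_popcountll(color[a][i] & color[b][i]);
            }
            total += count * (count - 1) / 2;
        }
    }
    return total;
}
```

+For the $5 \times 5$ example grid above, the function returns 2.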
+
+\section{Dynamic programming}
+
+Bit operations provide an efficient and convenient
+way to implement dynamic programming algorithms
+whose states contain subsets of elements,
+because such states can be stored as integers.
+Next we discuss examples of combining
+bit operations and dynamic programming.
+
+\subsubsection{Optimal selection}
+
+As a first example, consider the following problem:
+We are given the prices of $k$ products
+over $n$ days, and we want to buy each product
+exactly once.
+However, we are allowed to buy at most one product
+in a day.
+What is the minimum total price?
+For example, consider the following scenario ($k=3$ and $n=8$):
+\begin{center}
+\begin{tikzpicture}[scale=.65]
+    \draw (0, 0) grid (8,3);
+    \node at (-2.5,2.5) {product 0};
+    \node at (-2.5,1.5) {product 1};
+    \node at (-2.5,0.5) {product 2};
+
+    \foreach \x in {0,...,7}
+        {\node at (\x+0.5,3.5) {\x};}
+    \foreach \x/\v in {0/6,1/9,2/5,3/2,4/8,5/9,6/1,7/6}
+        {\node at (\x+0.5,2.5) {\v};}
+    \foreach \x/\v in {0/8,1/2,2/6,3/2,4/7,5/5,6/7,7/2}
+        {\node at (\x+0.5,1.5) {\v};}
+    \foreach \x/\v in {0/5,1/3,2/9,3/7,4/3,5/5,6/1,7/4}
+        {\node at (\x+0.5,0.5) {\v};}
+\end{tikzpicture}
+\end{center}
+In this scenario, the minimum total price is $5$:
+\begin{center}
+\begin{tikzpicture}[scale=.65]
+    \fill [color=lightgray] (1, 1) rectangle (2, 2);
+    \fill [color=lightgray] (3, 2) rectangle (4, 3);
+    \fill [color=lightgray] (6, 0) rectangle (7, 1);
+    \draw (0, 0) grid (8,3);
+    \node at (-2.5,2.5) {product 0};
+    \node at (-2.5,1.5) {product 1};
+    \node at (-2.5,0.5) {product 2};
+
+    \foreach \x in {0,...,7}
+        {\node at (\x+0.5,3.5) {\x};}
+    \foreach \x/\v in {0/6,1/9,2/5,3/2,4/8,5/9,6/1,7/6}
+        {\node at (\x+0.5,2.5) {\v};}
+    \foreach \x/\v in {0/8,1/2,2/6,3/2,4/7,5/5,6/7,7/2}
+        {\node at (\x+0.5,1.5) {\v};}
+    \foreach \x/\v in {0/5,1/3,2/9,3/7,4/3,5/5,6/1,7/4}
+        {\node at (\x+0.5,0.5) {\v};}
+\end{tikzpicture}
+\end{center}
+
+Let $\texttt{price}[x][d]$ denote the price of product $x$
+on day $d$.
+For example, in the above scenario $\texttt{price}[2][3] = 7$.
+Then, let $\texttt{total}(S,d)$ denote the minimum total
+price for buying a subset $S$ of products by day $d$.
+Using this function, the solution to the problem is
+$\texttt{total}(\{0 \ldots k-1\},n-1)$.
+
+First, $\texttt{total}(\emptyset,d) = 0$,
+because it does not cost anything to buy an empty set,
+and $\texttt{total}(\{x\},0) = \texttt{price}[x][0]$,
+because there is one way to buy one product on the first day.
+Then, the following recurrence can be used:
+\begin{equation*}
+\begin{split}
+\texttt{total}(S,d) = \min( & \texttt{total}(S,d-1), \\
+& \min_{x \in S} (\texttt{total}(S \setminus \{x\},d-1)+\texttt{price}[x][d]))
+\end{split}
+\end{equation*}
+This means that we either do not buy any product on day $d$
+or buy a product $x$ that belongs to $S$.
+In the latter case, we remove $x$ from $S$ and add the
+price of $x$ to the total price.
+
+The next step is to calculate the values of the function
+using dynamic programming.
+To store the function values, we declare an array
+\begin{lstlisting}
+int total[1<<K][N];
+\end{lstlisting}
+where $K$ and $N$ are suitably large constants.
+The first dimension of the array corresponds to a bit
+representation of a subset.
+
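+First, the cases where $d=0$ can be processed as follows.
+A subset of two or more products cannot be bought on a single day,
+so such values are first set to \texttt{INF}, a large constant
+(an assumption here; it must not overflow when a price is added to it).
+The value for the empty subset remains zero.
+\begin{lstlisting}
+for (int s = 1; s < (1<<k); s++) total[s][0] = INF;
+for (int x = 0; x < k; x++) {
+    total[1<<x][0] = price[x][0];
+}
+\end{lstlisting}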
+Then, the recurrence translates into the following code:
+\begin{lstlisting}
+for (int d = 1; d < n; d++) {
+    for (int s = 0; s < (1<<k); s++) {
+        total[s][d] = total[s][d-1];
+        for (int x = 0; x < k; x++) {
+            if (s&(1<<x)) {
+                total[s][d] = min(total[s][d],
+                                    total[s^(1<<x)][d-1]+price[x][d]);
+            }
+        }
+    }
+}
+\end{lstlisting}
+The time complexity of the algorithm is $O(n 2^k k)$.
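+
+The whole calculation can be sketched as a single function.
+The function name \texttt{min\_total\_price} and the constant \texttt{INF}
+are assumptions of this example, which also assumes that the prices are small
+enough that \texttt{INF} plus a price fits in an \texttt{int}.
+Subsets that cannot be bought by a given day keep the value \texttt{INF}.

```cpp
#include <algorithm>
#include <vector>

// price[x][d] is the price of product x on day d.
// Returns the minimum total price for buying each product exactly once,
// at most one product per day (at least INF if this is impossible).
int min_total_price(const std::vector<std::vector<int>>& price) {
    const int INF = 1000000000;  // assumed to exceed any real total
    int k = price.size();
    int n = price[0].size();
    std::vector<std::vector<int>> total(1 << k, std::vector<int>(n, INF));
    // buying nothing costs nothing
    for (int d = 0; d < n; d++) total[0][d] = 0;
    // on the first day, only single products can be bought
    for (int x = 0; x < k; x++) total[1 << x][0] = price[x][0];
    for (int d = 1; d < n; d++) {
        for (int s = 1; s < (1 << k); s++) {
            // do not buy any product on day d
            total[s][d] = total[s][d - 1];
            for (int x = 0; x < k; x++) {
                if (s & (1 << x)) {
                    // buy product x on day d
                    total[s][d] = std::min(total[s][d],
                        total[s ^ (1 << x)][d - 1] + price[x][d]);
                }
            }
        }
    }
    return total[(1 << k) - 1][n - 1];
}
```

+For the example scenario above, the function returns 5.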
+
+\subsubsection{From permutations to subsets}
+
+Using dynamic programming, it is often possible
+to change an iteration over permutations into
+an iteration over subsets\footnote{This technique was introduced in 1962
+by M. Held and R. M. Karp \cite{hel62}.}.
+The benefit of this is that
+$n!$, the number of permutations,
+is much larger than $2^n$, the number of subsets.
+For example, if $n=20$, then
+$n! \approx 2.4 \cdot 10^{18}$ and $2^n \approx 10^6$.
+Thus, for certain values of $n$,
+we can efficiently go through the subsets but not through the permutations.
+
+As an example, consider the following problem:
+There is an elevator with maximum weight $x$,
+and $n$ people with known weights
+who want to get from the ground floor
+to the top floor.
+What is the minimum number of rides needed
+if the people enter the elevator in an optimal order?
+
+For example, suppose that $x=10$, $n=5$
+and the weights are as follows:
+\begin{center}
+\begin{tabular}{ll}
+person & weight \\
+\hline
+0 & 2 \\
+1 & 3 \\
+2 & 3 \\
+3 & 5 \\
+4 & 6 \\
+\end{tabular}
+\end{center}
+In this case, the minimum number of rides is 2.
+One optimal order is $\{0,2,3,1,4\}$,
+which partitions the people into two rides:
+first $\{0,2,3\}$ (total weight 10),
+and then $\{1,4\}$ (total weight 9).
+
+The problem can be easily solved in $O(n! n)$ time
+by testing all possible permutations of $n$ people.
+However, we can use dynamic programming to get
+a more efficient $O(2^n n)$ time algorithm.
+The idea is to calculate for each subset of people
+two values: the minimum number of rides needed and
+the minimum weight of people who ride in the last group.
+
+Let $\texttt{weight}[p]$ denote the weight of
+person $p$.
+We define two functions:
+$\texttt{rides}(S)$ is the minimum number of
+rides for a subset $S$,
+and $\texttt{last}(S)$ is the minimum weight
+of the last ride.
+For example, in the above scenario
+\[ \texttt{rides}(\{1,3,4\})=2 \hspace{10px} \textrm{and}
+\hspace{10px} \texttt{last}(\{1,3,4\})=5,\]
+because the optimal rides are $\{1,4\}$ and $\{3\}$,
+and the second ride has weight 5.
+Of course, our final goal is to calculate the value
+of $\texttt{rides}(\{0 \ldots n-1\})$.
+
+We can calculate the values
+of the functions recursively and then apply
+dynamic programming.
+The idea is to go through all people
+who belong to $S$ and optimally
+choose the last person $p$ who enters the elevator.
+Each such choice yields a subproblem
+for a smaller subset of people.
+If $\texttt{last}(S \setminus \{p\})+\texttt{weight}[p] \le x$,
+we can add $p$ to the last ride.
+Otherwise, we have to reserve a new ride
+that initially only contains $p$.
+
+To implement dynamic programming,
+we declare an array
+\begin{lstlisting}
+pair<int,int> best[1<<N];
+\end{lstlisting}
+that contains for each subset $S$
+a pair $(\texttt{rides}(S),\texttt{last}(S))$.
+We set the value for an empty group as follows:
+\begin{lstlisting}
+best[0] = {1,0};
+\end{lstlisting}
+Then, we can fill the array as follows:
+
+\begin{lstlisting}
+for (int s = 1; s < (1<<n); s++) {
+    // initial value: n+1 rides are needed
+    best[s] = {n+1,0};
+    for (int p = 0; p < n; p++) {
+        if (s&(1<<p)) {
+            auto option = best[s^(1<<p)];
+            if (option.second+weight[p] <= x) {
+                // add p to an existing ride
+                option.second += weight[p];
+            } else {
+                // reserve a new ride for p
+                option.first++;
+                option.second = weight[p];
+            }
+            best[s] = min(best[s], option);
+        }
+    }
+}
+\end{lstlisting}
+Note that the above loop guarantees that
+for any two subsets $S_1$ and $S_2$
+such that $S_1 \subset S_2$, we process $S_1$ before $S_2$.
+Thus, the dynamic programming values are calculated in the
+correct order.
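+
+Wrapped into a function, the calculation could look as follows.
+The function name \texttt{min\_rides} is hypothetical, a \texttt{std::vector}
+replaces the fixed-size array, and every person is assumed to weigh at most $x$.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Returns the minimum number of rides for people with the given weights
// when the maximum weight of the elevator is x.
// Assumes that each weight is at most x.
int min_rides(const std::vector<int>& weight, int x) {
    int n = weight.size();
    // best[s] = (rides(s), last(s)) for each subset s
    std::vector<std::pair<int,int>> best(1 << n);
    best[0] = {1, 0};
    for (int s = 1; s < (1 << n); s++) {
        // initial value: n+1 rides are needed
        best[s] = {n + 1, 0};
        for (int p = 0; p < n; p++) {
            if (s & (1 << p)) {
                auto option = best[s ^ (1 << p)];
                if (option.second + weight[p] <= x) {
                    // add p to an existing ride
                    option.second += weight[p];
                } else {
                    // reserve a new ride for p
                    option.first++;
                    option.second = weight[p];
                }
                // pairs are compared first by rides, then by last weight
                best[s] = std::min(best[s], option);
            }
        }
    }
    return best[(1 << n) - 1].first;
}
```

+For the example weights 2, 3, 3, 5 and 6 with $x=10$, the function returns 2.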
+
+\subsubsection{Counting subsets}
+
+Our last problem in this chapter is as follows:
+Let $X=\{0 \ldots n-1\}$, and suppose that each subset $S \subset X$
+is assigned an integer $\texttt{value}[S]$.
+Our task is to calculate for each $S$
+\[\texttt{sum}(S) = \sum_{A \subset S} \texttt{value}[A],\]
+i.e., the sum of values of subsets of $S$.
+
+For example, suppose that $n=3$ and the values are as follows:
+\begin{multicols}{2}
+\begin{itemize}
+\item $\texttt{value}[\emptyset] = 3$
+\item $\texttt{value}[\{0\}] = 1$
+\item $\texttt{value}[\{1\}] = 4$
+\item $\texttt{value}[\{0,1\}] = 5$
+\item $\texttt{value}[\{2\}] = 5$
+\item $\texttt{value}[\{0,2\}] = 1$
+\item $\texttt{value}[\{1,2\}] = 3$
+\item $\texttt{value}[\{0,1,2\}] = 3$
+\end{itemize}
+\end{multicols}
+In this case, for example,
+\begin{equation*}
+\begin{split}
+\texttt{sum}(\{0,2\}) &= \texttt{value}[\emptyset]+\texttt{value}[\{0\}]+\texttt{value}[\{2\}]+\texttt{value}[\{0,2\}] \\ 
+                      &= 3 + 1 + 5 + 1 = 10.
+\end{split}
+\end{equation*}
+
+Because there are a total of $2^n$ subsets,
+one possible solution is to go through all
+pairs of subsets in $O(2^{2n})$ time.
+However, using dynamic programming, we
+can solve the problem in $O(2^n n)$ time.
+The idea is to focus on sums where the
+elements that may be removed from $S$ are restricted.
+
+Let $\texttt{partial}(S,k)$ denote the sum of
+values of subsets of $S$ with the restriction
+that only elements $0 \ldots k$
+may be removed from $S$.
+For example,
+\[\texttt{partial}(\{0,2\},1)=\texttt{value}[\{2\}]+\texttt{value}[\{0,2\}],\]
+because we may only remove elements $0 \ldots 1$.
+We can calculate values of \texttt{sum} using
+values of \texttt{partial}, because
+\[\texttt{sum}(S) = \texttt{partial}(S,n-1).\]
+The base cases for the function are
+\[\texttt{partial}(S,-1)=\texttt{value}[S],\]
+because in this case no elements can be removed from $S$.
+Then, in the general case we can use the following recurrence:
+\begin{equation*}
+    \texttt{partial}(S,k) = \begin{cases}
+               \texttt{partial}(S,k-1) & k \notin S \\
+               \texttt{partial}(S,k-1) + \texttt{partial}(S \setminus \{k\},k-1) & k \in S
+           \end{cases}
+\end{equation*}
+Here we focus on the element $k$.
+If $k \in S$, we have two options: we may either keep $k$ in $S$
+or remove it from $S$.
+
+There is a particularly clever way to implement the
+calculation of sums. We can declare an array
+\begin{lstlisting}
+int sum[1<<N];
+\end{lstlisting}
+that will contain the sum of each subset.
+The array is initialized as follows:
+\begin{lstlisting}
+for (int s = 0; s < (1<<n); s++) {
+    sum[s] = value[s];
+}
+\end{lstlisting}
+Then, we can fill the array as follows:
+\begin{lstlisting}
+for (int k = 0; k < n; k++) {
+    for (int s = 0; s < (1<<n); s++) {
+        if (s&(1<<k)) sum[s] += sum[s^(1<<k)];
+    }
+}
+\end{lstlisting}
+This code calculates the values of $\texttt{partial}(S,k)$
+for $k=0 \ldots n-1$ to the array \texttt{sum}.
+Since $\texttt{partial}(S,k)$ is always based on
+$\texttt{partial}(S,k-1)$, we can reuse the array
+\texttt{sum}, which yields a very efficient implementation.
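+
+The two loops can be combined into one function, sketched below.
+The function name \texttt{subset\_sums} is hypothetical;
+the values are assumed to be indexed by the bit representations of the subsets,
+and \texttt{long long} is used as a precaution against overflow.

```cpp
#include <vector>

// value[s] is the value of the subset whose bit representation is s
// (value must have 2^n elements).
// Returns an array where element s is the sum of values of subsets of s.
std::vector<long long> subset_sums(std::vector<long long> value, int n) {
    // the copy of value plays the role of the array sum
    for (int k = 0; k < n; k++) {
        for (int s = 0; s < (1 << n); s++) {
            // if k belongs to s, also count the subsets where k is removed
            if (s & (1 << k)) value[s] += value[s ^ (1 << k)];
        }
    }
    return value;
}
```

+With the example values above, listed in the order
+$\emptyset$, $\{0\}$, $\{1\}$, $\{0,1\}$, $\{2\}$, $\{0,2\}$, $\{1,2\}$, $\{0,1,2\}$,
+the element at index 5 (the set $\{0,2\}$) is 10.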
--- a/chapter11.tex
+++ b/chapter11.tex
@ -0,0 +1,764 @@
+\chapter{Basics of graphs}
+
+Many programming problems can be solved by
+modeling the problem as a graph problem
+and using an appropriate graph algorithm.
+A typical example of a graph is a network
+of roads and cities in a country.
+Sometimes, though, the graph is hidden
+in the problem and it may be difficult to detect it.
+
+This part of the book discusses graph algorithms,
+especially focusing on topics that
+are important in competitive programming.
+In this chapter, we go through concepts
+related to graphs,
+and study different ways to represent graphs in algorithms.
+
+\section{Graph terminology}
+
+\index{graph}
+\index{node}
+\index{edge}
+
+A \key{graph} consists of \key{nodes}
+and \key{edges}. In this book,
+the variable $n$ denotes the number of nodes
+in a graph, and the variable $m$ denotes
+the number of edges.
+The nodes are numbered
+using integers $1,2,\ldots,n$.
+
+For example, the following graph consists of 5 nodes and 7 edges:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,3) {$1$};
+\node[draw, circle] (2) at (4,3) {$2$};
+\node[draw, circle] (3) at (1,1) {$3$};
+\node[draw, circle] (4) at (4,1) {$4$};
+\node[draw, circle] (5) at (6,2) {$5$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (4);
+\path[draw,thick,-] (2) -- (4);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (4) -- (5);
+\end{tikzpicture}
+\end{center}
+
+\index{path}
+
+A \key{path} leads from node $a$ to node $b$
+through edges of the graph.
+The \key{length} of a path is the number of
+edges in it.
+For example, the above graph contains
+a path $1 \rightarrow 3 \rightarrow 4 \rightarrow 5$
+of length 3
+from node 1 to node 5:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,3) {$1$};
+\node[draw, circle] (2) at (4,3) {$2$};
+\node[draw, circle] (3) at (1,1) {$3$};
+\node[draw, circle] (4) at (4,1) {$4$};
+\node[draw, circle] (5) at (6,2) {$5$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (4);
+\path[draw,thick,-] (2) -- (4);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (4) -- (5);
+
+\path[draw=red,thick,->,line width=2pt] (1) -- (3);
+\path[draw=red,thick,->,line width=2pt] (3) -- (4);
+\path[draw=red,thick,->,line width=2pt] (4) -- (5);
+\end{tikzpicture}
+\end{center}
+
+\index{cycle}
+
+A path is a \key{cycle} if its first and last
+nodes are the same.
+For example, the above graph contains
+a cycle $1 \rightarrow 3 \rightarrow 4 \rightarrow 1$.
+A path is \key{simple} if each node appears
+at most once in the path.
+
+
+% 
+% \begin{itemize}
+% \item $1 \rightarrow 2 \rightarrow 5$ (length 2)
+% \item $1 \rightarrow 4 \rightarrow 5$ (length 2)
+% \item $1 \rightarrow 2 \rightarrow 4 \rightarrow 5$ (length 3)
+% \item $1 \rightarrow 3 \rightarrow 4 \rightarrow 5$ (length 3)
+% \item $1 \rightarrow 4 \rightarrow 2 \rightarrow 5$ (length 3)
+% \item $1 \rightarrow 3 \rightarrow 4 \rightarrow 2 \rightarrow 5$ (length 4)
+% \end{itemize}
+
+\subsubsection{Connectivity}
+
+\index{connected graph}
+
+A graph is \key{connected} if there is a path
+between any two nodes.
+For example, the following graph is connected:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,3) {$1$};
+\node[draw, circle] (2) at (4,3) {$2$};
+\node[draw, circle] (3) at (1,1) {$3$};
+\node[draw, circle] (4) at (4,1) {$4$};
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (2) -- (3);
+\path[draw,thick,-] (3) -- (4);
+\path[draw,thick,-] (2) -- (4);
+\end{tikzpicture}
+\end{center}
+
+The following graph is not connected,
+because it is not possible to get
+from node 4 to any other node:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,3) {$1$};
+\node[draw, circle] (2) at (4,3) {$2$};
+\node[draw, circle] (3) at (1,1) {$3$};
+\node[draw, circle] (4) at (4,1) {$4$};
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (2) -- (3);
+\end{tikzpicture}
+\end{center}
+
+\index{component}
+
+The connected parts of a graph are
+called its \key{components}.
+For example, the following graph
+contains three components:
+$\{1,\,2,\,3\}$,
+$\{4,\,5,\,6,\,7\}$ and
+$\{8\}$.
+\begin{center}
+\begin{tikzpicture}[scale=0.8]
+\node[draw, circle] (1) at (1,3) {$1$};
+\node[draw, circle] (2) at (4,3) {$2$};
+\node[draw, circle] (3) at (1,1) {$3$};
+
+\node[draw, circle] (6) at (6,1) {$6$};
+\node[draw, circle] (7) at (9,1) {$7$};
+\node[draw, circle] (4) at (6,3) {$4$};
+\node[draw, circle] (5) at (9,3) {$5$};
+
+\node[draw, circle] (8) at (11,2) {$8$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (2) -- (3);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (4) -- (5);
+\path[draw,thick,-] (5) -- (7);
+\path[draw,thick,-] (6) -- (7);
+\path[draw,thick,-] (6) -- (4);
+\end{tikzpicture}
+\end{center}
+
+\index{tree}
+
+A \key{tree} is a connected graph
+that consists of $n$ nodes and $n-1$ edges.
+There is a unique path
+between any two nodes of a tree.
+For example, the following graph is a tree:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,3) {$1$};
+\node[draw, circle] (2) at (4,3) {$2$};
+\node[draw, circle] (3) at (1,1) {$3$};
+\node[draw, circle] (4) at (4,1) {$4$};
+\node[draw, circle] (5) at (6,2) {$5$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+%\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (2) -- (4);
+%\path[draw,thick,-] (4) -- (5);
+\end{tikzpicture}
+\end{center}
+
+\subsubsection{Edge directions}
+
+\index{directed graph}
+
+A graph is \key{directed}
+if the edges can be traversed
+in one direction only.
+For example, the following graph is directed:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,3) {$1$};
+\node[draw, circle] (2) at (4,3) {$2$};
+\node[draw, circle] (3) at (1,1) {$3$};
+\node[draw, circle] (4) at (4,1) {$4$};
+\node[draw, circle] (5) at (6,2) {$5$};
+\path[draw,thick,->,>=latex] (1) -- (2);
+\path[draw,thick,->,>=latex] (2) -- (4);
+\path[draw,thick,->,>=latex] (2) -- (5);
+\path[draw,thick,->,>=latex] (4) -- (5);
+\path[draw,thick,->,>=latex] (4) -- (1);
+\path[draw,thick,->,>=latex] (3) -- (1);
+\end{tikzpicture}
+\end{center}
+
+The above graph contains
+a path $3 \rightarrow 1 \rightarrow 2 \rightarrow 5$
+from node $3$ to node $5$,
+but there is no path from node $5$ to node $3$.
+
+\subsubsection{Edge weights}
+
+\index{weighted graph}
+
+In a \key{weighted} graph, each edge is assigned
+a \key{weight}.
+The weights are often interpreted as edge lengths.
+For example, the following graph is weighted:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,3) {$1$};
+\node[draw, circle] (2) at (4,3) {$2$};
+\node[draw, circle] (3) at (1,1) {$3$};
+\node[draw, circle] (4) at (4,1) {$4$};
+\node[draw, circle] (5) at (6,2) {$5$};
+\path[draw,thick,-] (1) -- node[font=\small,label=above:5] {} (2);
+\path[draw,thick,-] (1) -- node[font=\small,label=left:1] {} (3);
+\path[draw,thick,-] (3) -- node[font=\small,label=below:7] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=above:7] {} (5);
+\path[draw,thick,-] (4) -- node[font=\small,label=below:3] {} (5);
+\end{tikzpicture}
+\end{center}
+
+The length of a path in a weighted graph
+is the sum of the edge weights on the path.
+For example, in the above graph,
+the length of the path
+$1 \rightarrow 2 \rightarrow 5$ is $12$,
+and the length of the path
+$1 \rightarrow 3 \rightarrow 4 \rightarrow 5$ is $11$.
+The latter path is the \key{shortest} path from node $1$ to node $5$.
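+
+The path lengths above can be checked with a short sketch.
+The function \texttt{path\_length} and the map-based storage of weights
+are illustrative assumptions, not a representation used elsewhere in this book;
+each undirected edge is stored once under the key $(a,b)$ with $a<b$.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <utility>
#include <vector>
using namespace std;

// Returns the length of the given path: the sum of the weights of its edges.
// weight[{a,b}] holds the weight of the undirected edge a-b with a < b.
int path_length(const vector<int>& path, const map<pair<int,int>,int>& weight) {
    int total = 0;
    for (size_t i = 0; i + 1 < path.size(); i++) {
        int a = min(path[i], path[i+1]);
        int b = max(path[i], path[i+1]);
        auto it = weight.find({a, b});
        assert(it != weight.end());   // the path must only use existing edges
        total += it->second;
    }
    return total;
}
```

+With the six edges of the example graph stored in the map,
+the function can be called with the node sequences
+$1,2,5$ and $1,3,4,5$ to compare the two paths.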
+
+\subsubsection{Neighbors and degrees}
+
+\index{neighbor}
+\index{degree}
+
+Two nodes are \key{neighbors} or \key{adjacent}
+if there is an edge between them.
+The \key{degree} of a node
+is the number of its neighbors.
+For example, in the following graph,
+the neighbors of node 2 are 1, 4 and 5,
+so its degree is 3.
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,3) {$1$};
+\node[draw, circle] (2) at (4,3) {$2$};
+\node[draw, circle] (3) at (1,1) {$3$};
+\node[draw, circle] (4) at (4,1) {$4$};
+\node[draw, circle] (5) at (6,2) {$5$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (4);
+\path[draw,thick,-] (2) -- (4);
+\path[draw,thick,-] (2) -- (5);
+%\path[draw,thick,-] (4) -- (5);
+\end{tikzpicture}
+\end{center}
+
+The sum of degrees in a graph is always $2m$,
+where $m$ is the number of edges,
+because each edge
+increases the degree of exactly two nodes by one.
+For this reason, the sum of degrees is always even.
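+
+As a minimal sketch of this argument, the hypothetical function
+\texttt{degrees} below counts the degree of every node from a list of
+undirected edges. It assumes a simple graph whose nodes are numbered
+$1,2,\ldots,n$; each edge adds one to the degree of both of its endpoints.

```cpp
#include <cassert>
#include <utility>
#include <vector>
using namespace std;

// Returns the degree of each node 1..n (index 0 is unused),
// given the undirected edges of a simple graph.
vector<int> degrees(int n, const vector<pair<int,int>>& edges) {
    vector<int> deg(n+1, 0);
    for (auto e : edges) {
        assert(1 <= e.first && e.first <= n);
        assert(1 <= e.second && e.second <= n);
        deg[e.first]++;    // each edge contributes one to both endpoints
        deg[e.second]++;
    }
    return deg;
}
```

+For the graph above, which has $m=6$ edges, the returned degrees
+sum to $2m=12$.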
+
+\index{regular graph}
+\index{complete graph}
+
+A graph is \key{regular} if the
+degree of every node is a constant $d$.
+A graph is \key{complete} if the
+degree of every node is $n-1$, i.e.,
+the graph contains all possible edges
+between the nodes.
+
+\index{indegree}
+\index{outdegree}
+
+In a directed graph, the \key{indegree}
+of a node is the number of edges
+that end at the node,
+and the \key{outdegree} of a node
+is the number of edges that start at the node.
+For example, in the following graph,
+the indegree of node 2 is 2,
+and the outdegree of node 2 is 1.
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,3) {$1$};
+\node[draw, circle] (2) at (4,3) {$2$};
+\node[draw, circle] (3) at (1,1) {$3$};
+\node[draw, circle] (4) at (4,1) {$4$};
+\node[draw, circle] (5) at (6,2) {$5$};
+
+\path[draw,thick,->,>=latex] (1) -- (2);
+\path[draw,thick,->,>=latex] (1) -- (3);
+\path[draw,thick,->,>=latex] (1) -- (4);
+\path[draw,thick,->,>=latex] (3) -- (4);
+\path[draw,thick,->,>=latex] (2) -- (4);
+\path[draw,thick,<-,>=latex] (2) -- (5);
+\end{tikzpicture}
+\end{center}
+
+\subsubsection{Colorings}
+
+\index{coloring}
+\index{bipartite graph}
+
+In a \key{coloring} of a graph,
+each node is assigned a color so that
+no adjacent nodes have the same color.
+
+A graph is \key{bipartite} if
+it is possible to color it using two colors.
+It turns out that a graph is bipartite
+exactly when it does not contain a cycle
+with an odd number of edges.
+For example, the graph
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,3) {$2$};
+\node[draw, circle] (2) at (4,3) {$3$};
+\node[draw, circle] (3) at (1,1) {$5$};
+\node[draw, circle] (4) at (4,1) {$6$};
+\node[draw, circle] (5) at (-2,1) {$4$};
+\node[draw, circle] (6) at (-2,3) {$1$};
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (3) -- (4);
+\path[draw,thick,-] (2) -- (4);
+\path[draw,thick,-] (3) -- (6);
+\path[draw,thick,-] (5) -- (6);
+\end{tikzpicture}
+\end{center}
+is bipartite, because it can be colored as follows:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle, fill=blue!40] (1) at (1,3) {$2$};
+\node[draw, circle, fill=red!40] (2) at (4,3) {$3$};
+\node[draw, circle, fill=red!40] (3) at (1,1) {$5$};
+\node[draw, circle, fill=blue!40] (4) at (4,1) {$6$};
+\node[draw, circle, fill=red!40] (5) at (-2,1) {$4$};
+\node[draw, circle, fill=blue!40] (6) at (-2,3) {$1$};
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (3) -- (4);
+\path[draw,thick,-] (2) -- (4);
+\path[draw,thick,-] (3) -- (6);
+\path[draw,thick,-] (5) -- (6);
+\end{tikzpicture}
+\end{center}
+However, the graph
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,3) {$2$};
+\node[draw, circle] (2) at (4,3) {$3$};
+\node[draw, circle] (3) at (1,1) {$5$};
+\node[draw, circle] (4) at (4,1) {$6$};
+\node[draw, circle] (5) at (-2,1) {$4$};
+\node[draw, circle] (6) at (-2,3) {$1$};
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (3) -- (4);
+\path[draw,thick,-] (2) -- (4);
+\path[draw,thick,-] (3) -- (6);
+\path[draw,thick,-] (5) -- (6);
+\path[draw,thick,-] (1) -- (6);
+\end{tikzpicture}
+\end{center}
+is not bipartite, because it is not possible to color
+the following cycle of three nodes using two colors:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,3) {$2$};
+\node[draw, circle] (2) at (4,3) {$3$};
+\node[draw, circle] (3) at (1,1) {$5$};
+\node[draw, circle] (4) at (4,1) {$6$};
+\node[draw, circle] (5) at (-2,1) {$4$};
+\node[draw, circle] (6) at (-2,3) {$1$};
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (3) -- (4);
+\path[draw,thick,-] (2) -- (4);
+\path[draw,thick,-] (3) -- (6);
+\path[draw,thick,-] (5) -- (6);
+\path[draw,thick,-] (1) -- (6);
+
+\path[draw=red,thick,-,line width=2pt] (1) -- (3);
+\path[draw=red,thick,-,line width=2pt] (3) -- (6);
+\path[draw=red,thick,-,line width=2pt] (6) -- (1);
+\end{tikzpicture}
+\end{center}
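+
+One possible way to test bipartiteness in code is to try to build a
+two-coloring with a graph traversal. The sketch below is an illustration
+under assumptions (the function name \texttt{is\_bipartite} and the
+edge-list input are not from the text): it colors each new node with the
+color opposite to that of its neighbor and reports failure if two adjacent
+nodes end up with the same color.

```cpp
#include <cassert>
#include <queue>
#include <utility>
#include <vector>
using namespace std;

// Returns true if the undirected graph with nodes 1..n can be colored
// with two colors so that adjacent nodes get different colors.
bool is_bipartite(int n, const vector<pair<int,int>>& edges) {
    vector<vector<int>> adj(n+1);
    for (auto e : edges) {
        assert(1 <= e.first && e.first <= n);
        assert(1 <= e.second && e.second <= n);
        adj[e.first].push_back(e.second);
        adj[e.second].push_back(e.first);
    }
    vector<int> color(n+1, -1);            // -1 means "not colored yet"
    for (int start = 1; start <= n; start++) {
        if (color[start] != -1) continue;  // component already colored
        color[start] = 0;
        queue<int> q;
        q.push(start);
        while (!q.empty()) {
            int s = q.front(); q.pop();
            for (int u : adj[s]) {
                if (color[u] == -1) {
                    color[u] = 1 - color[s];   // give the opposite color
                    q.push(u);
                } else if (color[u] == color[s]) {
                    return false;              // two adjacent nodes share a color
                }
            }
        }
    }
    return true;
}
```

+Called with the six edges of the first example graph, the function
+accepts the graph; after the edge between nodes 1 and 2 is added,
+it rejects the graph.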
+
+\subsubsection{Simplicity}
+
+\index{simple graph}
+
+A graph is \key{simple}
+if no edge starts and ends at the same node,
+and there are no multiple
+edges between two nodes.
+Often we assume that graphs are simple.
+For example, the following graph is \emph{not} simple:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,3) {$2$};
+\node[draw, circle] (2) at (4,3) {$3$};
+\node[draw, circle] (3) at (1,1) {$5$};
+\node[draw, circle] (4) at (4,1) {$6$};
+\node[draw, circle] (5) at (-2,1) {$4$};
+\node[draw, circle] (6) at (-2,3) {$1$};
+
+\path[draw,thick,-] (1) edge [bend right=20] (2);
+\path[draw,thick,-] (2) edge [bend right=20] (1);
+%\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (3) -- (4);
+\path[draw,thick,-] (2) -- (4);
+\path[draw,thick,-] (3) -- (6);
+\path[draw,thick,-] (5) -- (6);
+
+\tikzset{every loop/.style={in=135,out=190}}
+\path[draw,thick,-] (5) edge [loop left] (5);
+\end{tikzpicture}
+\end{center}
+
+\section{Graph representation}
+
+There are several ways to represent graphs
+in algorithms.
+The choice of a data structure
+depends on the size of the graph and
+the way the algorithm processes it.
+Next we will go through three common representations.
+
+\subsubsection{Adjacency list representation}
+
+\index{adjacency list}
+
+In the adjacency list representation,
+each node $x$ in the graph is assigned an \key{adjacency list}
+that consists of nodes
+to which there is an edge from $x$.
+Adjacency lists are the most popular
+way to represent graphs, and most algorithms can be
+efficiently implemented using them.
+
+A convenient way to store the adjacency lists is to declare
+an array of vectors as follows:
+\begin{lstlisting}
+vector<int> adj[N];
+\end{lstlisting}
+
+The constant $N$ is chosen so that all
+adjacency lists can be stored; since the nodes are
+numbered $1,2,\ldots,n$, it must be at least $n+1$.
+For example, the graph
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,3) {$1$};
+\node[draw, circle] (2) at (3,3) {$2$};
+\node[draw, circle] (3) at (5,3) {$3$};
+\node[draw, circle] (4) at (3,1) {$4$};
+
+\path[draw,thick,->,>=latex] (1) -- (2);
+\path[draw,thick,->,>=latex] (2) -- (3);
+\path[draw,thick,->,>=latex] (2) -- (4);
+\path[draw,thick,->,>=latex] (3) -- (4);
+\path[draw,thick,->,>=latex] (4) -- (1);
+\end{tikzpicture}
+\end{center}
+can be stored as follows:
+\begin{lstlisting}
+adj[1].push_back(2);
+adj[2].push_back(3);
+adj[2].push_back(4);
+adj[3].push_back(4);
+adj[4].push_back(1);
+\end{lstlisting}
+
+If the graph is undirected, it can be stored in a similar way,
+but each edge is added in both directions.
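+
+As a minimal sketch of the undirected case, the hypothetical helper
+\texttt{add\_edge} below (not part of the original text) adds an edge to
+the lists of both endpoints. The bound \texttt{N} is an illustrative value.

```cpp
#include <cassert>
#include <vector>
using namespace std;

const int N = 100;      // illustrative bound; must exceed the largest node number
vector<int> adj[N];     // adj[a] lists the neighbors of node a

// Hypothetical helper: stores the undirected edge a-b in both lists.
void add_edge(int a, int b) {
    assert(1 <= a && a < N && 1 <= b && b < N);
    adj[a].push_back(b);
    adj[b].push_back(a);
}
```

+After \texttt{add\_edge(1,2)}, node 2 appears in the list of node 1
+and node 1 appears in the list of node 2.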
+
+For a weighted graph, the structure can be extended
+as follows:
+
+\begin{lstlisting}
+vector<pair<int,int>> adj[N];
+\end{lstlisting}
+
+In this case, the adjacency list of node $a$
+contains the pair $(b,w)$
+whenever there is an edge from node $a$ to node $b$
+with weight $w$. For example, the graph
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,3) {$1$};
+\node[draw, circle] (2) at (3,3) {$2$};
+\node[draw, circle] (3) at (5,3) {$3$};
+\node[draw, circle] (4) at (3,1) {$4$};
+
+\path[draw,thick,->,>=latex] (1) -- node[font=\small,label=above:5] {} (2);
+\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=above:7] {} (3);
+\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=left:6] {} (4);
+\path[draw,thick,->,>=latex] (3) -- node[font=\small,label=right:5] {} (4);
+\path[draw,thick,->,>=latex] (4) -- node[font=\small,label=left:2] {} (1);
+\end{tikzpicture}
+\end{center}
+can be stored as follows:
+\begin{lstlisting}
+adj[1].push_back({2,5});
+adj[2].push_back({3,7});
+adj[2].push_back({4,6});
+adj[3].push_back({4,5});
+adj[4].push_back({1,2});
+\end{lstlisting}
+
+The benefit of using adjacency lists is that
+we can efficiently find the nodes to which
+we can move from a given node through an edge.
+For example, the following loop goes through all nodes
+to which we can move from node $s$:
+
+\begin{lstlisting}
+for (auto u : adj[s]) {
+    // process node u
+}
+\end{lstlisting}
+
+\subsubsection{Adjacency matrix representation}
+
+\index{adjacency matrix}
+
+An \key{adjacency matrix} is a two-dimensional array
+that indicates which edges the graph contains.
+We can efficiently check from an adjacency matrix
+if there is an edge between two nodes.
+The matrix can be stored as an array
+\begin{lstlisting}
+int adj[N][N];
+\end{lstlisting}
+where each value $\texttt{adj}[a][b]$ indicates
+whether the graph contains an edge from
+node $a$ to node $b$.
+If the edge is included in the graph,
+then $\texttt{adj}[a][b]=1$,
+and otherwise $\texttt{adj}[a][b]=0$.
+For example, the graph
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,3) {$1$};
+\node[draw, circle] (2) at (3,3) {$2$};
+\node[draw, circle] (3) at (5,3) {$3$};
+\node[draw, circle] (4) at (3,1) {$4$};
+
+\path[draw,thick,->,>=latex] (1) -- (2);
+\path[draw,thick,->,>=latex] (2) -- (3);
+\path[draw,thick,->,>=latex] (2) -- (4);
+\path[draw,thick,->,>=latex] (3) -- (4);
+\path[draw,thick,->,>=latex] (4) -- (1);
+\end{tikzpicture}
+\end{center}
+can be represented as follows:
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (4,4);
+\node at (0.5,0.5) {1};
+\node at (1.5,0.5) {0};
+\node at (2.5,0.5) {0};
+\node at (3.5,0.5) {0};
+\node at (0.5,1.5) {0};
+\node at (1.5,1.5) {0};
+\node at (2.5,1.5) {0};
+\node at (3.5,1.5) {1};
+\node at (0.5,2.5) {0};
+\node at (1.5,2.5) {0};
+\node at (2.5,2.5) {1};
+\node at (3.5,2.5) {1};
+\node at (0.5,3.5) {0};
+\node at (1.5,3.5) {1};
+\node at (2.5,3.5) {0};
+\node at (3.5,3.5) {0};
+\node at (-0.5,0.5) {4};
+\node at (-0.5,1.5) {3};
+\node at (-0.5,2.5) {2};
+\node at (-0.5,3.5) {1};
+\node at (0.5,4.5) {1};
+\node at (1.5,4.5) {2};
+\node at (2.5,4.5) {3};
+\node at (3.5,4.5) {4};
+\end{tikzpicture}
+\end{center}
+
+If the graph is weighted, the adjacency matrix
+representation can be extended so that
+the matrix contains the weight of the edge
+if the edge exists.
+Using this representation, the graph
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,3) {$1$};
+\node[draw, circle] (2) at (3,3) {$2$};
+\node[draw, circle] (3) at (5,3) {$3$};
+\node[draw, circle] (4) at (3,1) {$4$};
+
+\path[draw,thick,->,>=latex] (1) -- node[font=\small,label=above:5] {} (2);
+\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=above:7] {} (3);
+\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=left:6] {} (4);
+\path[draw,thick,->,>=latex] (3) -- node[font=\small,label=right:5] {} (4);
+\path[draw,thick,->,>=latex] (4) -- node[font=\small,label=left:2] {} (1);
+\end{tikzpicture}
+\end{center}
+\begin{samepage}
+corresponds to the following matrix:
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (4,4);
+\node at (0.5,0.5) {2};
+\node at (1.5,0.5) {0};
+\node at (2.5,0.5) {0};
+\node at (3.5,0.5) {0};
+\node at (0.5,1.5) {0};
+\node at (1.5,1.5) {0};
+\node at (2.5,1.5) {0};
+\node at (3.5,1.5) {5};
+\node at (0.5,2.5) {0};
+\node at (1.5,2.5) {0};
+\node at (2.5,2.5) {7};
+\node at (3.5,2.5) {6};
+\node at (0.5,3.5) {0};
+\node at (1.5,3.5) {5};
+\node at (2.5,3.5) {0};
+\node at (3.5,3.5) {0};
+\node at (-0.5,0.5) {4};
+\node at (-0.5,1.5) {3};
+\node at (-0.5,2.5) {2};
+\node at (-0.5,3.5) {1};
+\node at (0.5,4.5) {1};
+\node at (1.5,4.5) {2};
+\node at (2.5,4.5) {3};
+\node at (3.5,4.5) {4};
+\end{tikzpicture}
+\end{center}
+\end{samepage}
+
+The drawback of the adjacency matrix representation
+is that the matrix contains $n^2$ elements,
+and usually most of them are zero.
+For this reason, the representation cannot be used
+if the graph is large: for example, a graph of $10^5$ nodes
+would already require $10^{10}$ elements.
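+
+The following sketch shows one way to fill an adjacency matrix from a list
+of directed edges. The function name \texttt{build\_matrix} and the use of
+nested vectors instead of a fixed-size array are assumptions made for
+illustration; the matrix has $n+1$ rows and columns so that nodes
+$1,2,\ldots,n$ can be used directly as indices.

```cpp
#include <cassert>
#include <utility>
#include <vector>
using namespace std;

// Builds an (n+1) x (n+1) matrix so that nodes can be indexed 1..n directly.
vector<vector<int>> build_matrix(int n, const vector<pair<int,int>>& edges) {
    vector<vector<int>> mat(n+1, vector<int>(n+1, 0));
    for (auto e : edges) {
        assert(1 <= e.first && e.first <= n);
        assert(1 <= e.second && e.second <= n);
        mat[e.first][e.second] = 1;   // directed edge e.first -> e.second
    }
    return mat;
}
```

+Checking whether there is an edge from node $a$ to node $b$ is then
+a single lookup of the element at row $a$ and column $b$.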
+
+\subsubsection{Edge list representation}
+
+\index{edge list}
+
+An \key{edge list} contains all edges of a graph
+in some order.
+This is a convenient way to represent a graph
+if the algorithm processes all edges of the graph
+and does not need to find the edges that start
+at a given node.
+
+The edge list can be stored in a vector
+\begin{lstlisting}
+vector<pair<int,int>> edges;
+\end{lstlisting}
+where each pair $(a,b)$ denotes that
+there is an edge from node $a$ to node $b$.
+Thus, the graph
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,3) {$1$};
+\node[draw, circle] (2) at (3,3) {$2$};
+\node[draw, circle] (3) at (5,3) {$3$};
+\node[draw, circle] (4) at (3,1) {$4$};
+
+\path[draw,thick,->,>=latex] (1) -- (2);
+\path[draw,thick,->,>=latex] (2) -- (3);
+\path[draw,thick,->,>=latex] (2) -- (4);
+\path[draw,thick,->,>=latex] (3) -- (4);
+\path[draw,thick,->,>=latex] (4) -- (1);
+\end{tikzpicture}
+\end{center}
+can be represented as follows:
+\begin{lstlisting}
+edges.push_back({1,2});
+edges.push_back({2,3});
+edges.push_back({2,4});
+edges.push_back({3,4});
+edges.push_back({4,1});
+\end{lstlisting}
+
+\noindent
+If the graph is weighted, the structure can
+be extended as follows:
+\begin{lstlisting}
+vector<tuple<int,int,int>> edges;
+\end{lstlisting}
+Each element in this list is of the
+form $(a,b,w)$, which means that there
+is an edge from node $a$ to node $b$ with weight $w$.
+For example, the graph
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,3) {$1$};
+\node[draw, circle] (2) at (3,3) {$2$};
+\node[draw, circle] (3) at (5,3) {$3$};
+\node[draw, circle] (4) at (3,1) {$4$};
+
+\path[draw,thick,->,>=latex] (1) -- node[font=\small,label=above:5] {} (2);
+\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=above:7] {} (3);
+\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=left:6] {} (4);
+\path[draw,thick,->,>=latex] (3) -- node[font=\small,label=right:5] {} (4);
+\path[draw,thick,->,>=latex] (4) -- node[font=\small,label=left:2] {} (1);
+\end{tikzpicture}
+\end{center}
+\begin{samepage}
+can be represented as follows\footnote{In some older compilers, the function
+\texttt{make\_tuple} must be used instead of the braces (for example,
+\texttt{make\_tuple(1,2,5)} instead of \texttt{\{1,2,5\}}).}:
+\begin{lstlisting}
+edges.push_back({1,2,5});
+edges.push_back({2,3,7});
+edges.push_back({2,4,6});
+edges.push_back({3,4,5});
+edges.push_back({4,1,2});
+\end{lstlisting}
+\end{samepage}
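+
+A weighted edge list is typically processed by going through all its
+elements. As a sketch, the hypothetical function \texttt{total\_weight}
+below unpacks each element $(a,b,w)$ with \texttt{tie} and sums the
+weights; it assumes that nodes are numbered starting from 1.

```cpp
#include <cassert>
#include <tuple>
#include <vector>
using namespace std;

// Returns the sum of all edge weights in a weighted edge list.
long long total_weight(const vector<tuple<int,int,int>>& edges) {
    long long sum = 0;
    for (const auto& e : edges) {
        int a, b, w;
        tie(a, b, w) = e;           // unpack the edge (a,b,w)
        assert(a >= 1 && b >= 1);   // nodes are numbered from 1
        sum += w;
    }
    return sum;
}
```

+For the five edges of the example graph, the sum of the weights is
+$5+7+6+5+2=25$.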
--- a/chapter12.tex
+++ b/chapter12.tex
@ -0,0 +1,549 @@
+\chapter{Graph traversal}
+
+This chapter discusses two fundamental
+graph algorithms:
+depth-first search and breadth-first search.
+Both algorithms are given a starting
+node in the graph,
+and they visit all nodes that can be reached
+from the starting node.
+The difference between the algorithms is the order
+in which they visit the nodes.
+
+\section{Depth-first search}
+
+\index{depth-first search}
+
+\key{Depth-first search} (DFS)
+is a straightforward graph traversal technique.
+The algorithm begins at a starting node,
+and proceeds to all other nodes that are
+reachable from the starting node using
+the edges of the graph.
+
+Depth-first search always follows a single
+path in the graph as long as it finds
+new nodes.
+After this, it returns to previous
+nodes and begins to explore other parts of the graph.
+The algorithm keeps track of visited nodes,
+so that it processes each node only once.
+
+\subsubsection*{Example}
+
+Let us consider how depth-first search processes
+the following graph:
+\begin{center}
+\begin{tikzpicture}
+\node[draw, circle] (1) at (1,5) {$1$};
+\node[draw, circle] (2) at (3,5) {$2$};
+\node[draw, circle] (3) at (5,4) {$3$};
+\node[draw, circle] (4) at (1,3) {$4$};
+\node[draw, circle] (5) at (3,3) {$5$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (2) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (5);
+\path[draw,thick,-] (2) -- (5);
+\end{tikzpicture}
+\end{center}
+We may begin the search at any node of the graph;
+here we begin at node 1.
+
+The search first proceeds to node 2:
+\begin{center}
+\begin{tikzpicture}
+\node[draw, circle,fill=lightgray] (1) at (1,5) {$1$};
+\node[draw, circle,fill=lightgray] (2) at (3,5) {$2$};
+\node[draw, circle] (3) at (5,4) {$3$};
+\node[draw, circle] (4) at (1,3) {$4$};
+\node[draw, circle] (5) at (3,3) {$5$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (2) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (5);
+\path[draw,thick,-] (2) -- (5);
+
+\path[draw=red,thick,->,line width=2pt] (1) -- (2);
+\end{tikzpicture}
+\end{center}
+After this, nodes 3 and 5 will be visited:
+\begin{center}
+\begin{tikzpicture}
+\node[draw, circle,fill=lightgray] (1) at (1,5) {$1$};
+\node[draw, circle,fill=lightgray] (2) at (3,5) {$2$};
+\node[draw, circle,fill=lightgray] (3) at (5,4) {$3$};
+\node[draw, circle] (4) at (1,3) {$4$};
+\node[draw, circle,fill=lightgray] (5) at (3,3) {$5$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (2) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (5);
+\path[draw,thick,-] (2) -- (5);
+
+\path[draw=red,thick,->,line width=2pt] (1) -- (2);
+\path[draw=red,thick,->,line width=2pt] (2) -- (3);
+\path[draw=red,thick,->,line width=2pt] (3) -- (5);
+\end{tikzpicture}
+\end{center}
+The neighbors of node 5 are 2 and 3,
+but the search has already visited both of them,
+so it is time to return to the previous nodes.
+The neighbors of nodes 3 and 2
+have also been visited, so we next move
+from node 1 to node 4:
+\begin{center}
+\begin{tikzpicture}
+\node[draw, circle,fill=lightgray] (1) at (1,5) {$1$};
+\node[draw, circle,fill=lightgray] (2) at (3,5) {$2$};
+\node[draw, circle,fill=lightgray] (3) at (5,4) {$3$};
+\node[draw, circle,fill=lightgray] (4) at (1,3) {$4$};
+\node[draw, circle,fill=lightgray] (5) at (3,3) {$5$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (2) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (5);
+\path[draw,thick,-] (2) -- (5);
+
+\path[draw=red,thick,->,line width=2pt] (1) -- (4);
+\end{tikzpicture}
+\end{center}
+After this, the search terminates because it has visited
+all nodes.
+
+The time complexity of depth-first search is $O(n+m)$
+where $n$ is the number of nodes and $m$ is the
+number of edges,
+because the algorithm processes each node once
+and examines each edge a constant number of times.
+
+\subsubsection*{Implementation}
+
+Depth-first search can be conveniently
+implemented using recursion.
+The following function \texttt{dfs} begins
+a depth-first search at a given node.
+The function assumes that the graph is
+stored as adjacency lists in an array
+\begin{lstlisting}
+vector<int> adj[N];
+\end{lstlisting}
+and also maintains an array
+\begin{lstlisting}
+bool visited[N];
+\end{lstlisting}
+that keeps track of the visited nodes.
+Initially, each array value is \texttt{false},
+and when the search arrives at node $s$,
+the value of \texttt{visited}[$s$] becomes \texttt{true}.
+The function can be implemented as follows:
+\begin{lstlisting}
+void dfs(int s) {
+    if (visited[s]) return;
+    visited[s] = true;
+    // process node s
+    for (auto u: adj[s]) {
+        dfs(u);
+    }
+}
+\end{lstlisting}
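+
+To see the function at work, the following sketch runs it on the example
+graph of this section. The vector \texttt{visit\_order}, the function
+\texttt{dfs\_example} and the value of \texttt{N} are illustrative
+additions; the visiting order depends on the order in which the edges
+are added to the adjacency lists.

```cpp
#include <cassert>
#include <vector>
using namespace std;

const int N = 6;             // illustrative bound for nodes 1..5
vector<int> adj[N];
bool visited[N];
vector<int> visit_order;     // hypothetical addition: records the visiting order

void dfs(int s) {
    if (visited[s]) return;
    visited[s] = true;
    visit_order.push_back(s);   // "process node s"
    for (auto u : adj[s]) {
        dfs(u);
    }
}

// Builds the example graph and returns the order in which
// a search starting at node 1 visits the nodes.
vector<int> dfs_example() {
    int edges[5][2] = {{1,2},{2,3},{1,4},{3,5},{2,5}};
    for (auto& e : edges) {
        assert(e[0] < N && e[1] < N);
        adj[e[0]].push_back(e[1]);   // undirected: add both directions
        adj[e[1]].push_back(e[0]);
    }
    dfs(1);
    return visit_order;
}
```

+With the edges added in this order, the search visits the nodes
+in the same order as in the example: 1, 2, 3, 5 and 4.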
+
+\section{Breadth-first search}
+
+\index{breadth-first search}
+
+\key{Breadth-first search} (BFS) visits the nodes
+in increasing order of their distance
+from the starting node.
+Thus, we can calculate the distance
+from the starting node to all other
+nodes using breadth-first search.
+However, breadth-first search is more difficult
+to implement than depth-first search.
+
+Breadth-first search goes through the nodes
+one level after another.
+First the search explores the nodes whose
+distance from the starting node is 1,
+then the nodes whose distance is 2, and so on.
+This process continues until all nodes
+have been visited.
+
+\subsubsection*{Example}
+
+Let us consider how breadth-first search processes
+the following graph:
+
+\begin{center}
+\begin{tikzpicture}
+\node[draw, circle] (1) at (1,5) {$1$};
+\node[draw, circle] (2) at (3,5) {$2$};
+\node[draw, circle] (3) at (5,5) {$3$};
+\node[draw, circle] (4) at (1,3) {$4$};
+\node[draw, circle] (5) at (3,3) {$5$};
+\node[draw, circle] (6) at (5,3) {$6$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (2) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (6);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (5) -- (6);
+\end{tikzpicture}
+\end{center}
+Suppose that the search begins at node 1.
+First, we process all nodes that can be reached
+from node 1 using a single edge:
+\begin{center}
+\begin{tikzpicture}
+\node[draw, circle,fill=lightgray] (1) at (1,5) {$1$};
+\node[draw, circle,fill=lightgray] (2) at (3,5) {$2$};
+\node[draw, circle] (3) at (5,5) {$3$};
+\node[draw, circle,fill=lightgray] (4) at (1,3) {$4$};
+\node[draw, circle] (5) at (3,3) {$5$};
+\node[draw, circle] (6) at (5,3) {$6$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (2) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (6);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (5) -- (6);
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (2) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (2) -- (5);
+
+\path[draw=red,thick,->,line width=2pt] (1) -- (2);
+\path[draw=red,thick,->,line width=2pt] (1) -- (4);
+\end{tikzpicture}
+\end{center}
+After this, we proceed to nodes 3 and 5:
+\begin{center}
+\begin{tikzpicture}
+\node[draw, circle,fill=lightgray] (1) at (1,5) {$1$};
+\node[draw, circle,fill=lightgray] (2) at (3,5) {$2$};
+\node[draw, circle,fill=lightgray] (3) at (5,5) {$3$};
+\node[draw, circle,fill=lightgray] (4) at (1,3) {$4$};
+\node[draw, circle,fill=lightgray] (5) at (3,3) {$5$};
+\node[draw, circle] (6) at (5,3) {$6$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (2) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (6);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (5) -- (6);
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (2) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (2) -- (5);
+
+\path[draw=red,thick,->,line width=2pt] (2) -- (3);
+\path[draw=red,thick,->,line width=2pt] (2) -- (5);
+\end{tikzpicture}
+\end{center}
+Finally, we visit node 6:
+\begin{center}
+\begin{tikzpicture}
+\node[draw, circle,fill=lightgray] (1) at (1,5) {$1$};
+\node[draw, circle,fill=lightgray] (2) at (3,5) {$2$};
+\node[draw, circle,fill=lightgray] (3) at (5,5) {$3$};
+\node[draw, circle,fill=lightgray] (4) at (1,3) {$4$};
+\node[draw, circle,fill=lightgray] (5) at (3,3) {$5$};
+\node[draw, circle,fill=lightgray] (6) at (5,3) {$6$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (2) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (6);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (5) -- (6);
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (2) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (2) -- (5);
+
+\path[draw=red,thick,->,line width=2pt] (3) -- (6);
+\path[draw=red,thick,->,line width=2pt] (5) -- (6);
+\end{tikzpicture}
+\end{center}
+Now we have calculated the distances
+from the starting node to all nodes of the graph.
+The distances are as follows:
+
+\begin{tabular}{ll}
+\\
+node & distance \\
+\hline
+1 & 0 \\
+2 & 1 \\
+3 & 2 \\
+4 & 1 \\
+5 & 2 \\
+6 & 3 \\
+\\
+\end{tabular}
+
+As in depth-first search,
+the time complexity of breadth-first search
+is $O(n+m)$, where $n$ is the number of nodes
+and $m$ is the number of edges.
+
+\subsubsection*{Implementation}
+
+Breadth-first search is more difficult
+to implement than depth-first search,
+because the algorithm visits nodes
+in different parts of the graph.
+A typical implementation is based on
+a queue that contains nodes.
+At each step, the next node in the queue
+will be processed.
+
+The following code assumes that the graph is stored
+as adjacency lists and maintains the following
+data structures:
+\begin{lstlisting}
+queue<int> q;
+bool visited[N];
+int dist[N];
+\end{lstlisting}
+
+The queue \texttt{q}
+contains nodes to be processed
+in increasing order of their distance.
+New nodes are always added to the end
+of the queue, and the node at the beginning
+of the queue is the next node to be processed.
+The array \texttt{visited} indicates
+which nodes the search has already visited,
+and the array \texttt{dist} will contain the
+distances from the starting node to all nodes of the graph.
+
+The search can be implemented as follows,
+starting at node $x$:
+\begin{lstlisting}
+visited[x] = true;
+dist[x] = 0;
+q.push(x);
+while (!q.empty()) {
+    int s = q.front(); q.pop();
+    // process node s
+    for (auto u : adj[s]) {
+        if (visited[u]) continue;
+        visited[u] = true;
+        dist[u] = dist[s]+1;
+        q.push(u);
+    }
+}
+\end{lstlisting}
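+
+The same search can also be packaged as a self-contained function.
+The sketch below is one possible arrangement, not the only one:
+the name \texttt{bfs\_distances} and the edge-list parameter are assumptions,
+and the value $-1$ is used both as ``not visited'' and as the distance
+of nodes that cannot be reached.

```cpp
#include <cassert>
#include <queue>
#include <utility>
#include <vector>
using namespace std;

// Returns d[1..n], where d[v] is the number of edges on a shortest
// path from x to v, or -1 if v cannot be reached (index 0 is unused).
vector<int> bfs_distances(int n, const vector<pair<int,int>>& edges, int x) {
    assert(1 <= x && x <= n);
    vector<vector<int>> adj(n+1);
    for (auto e : edges) {
        adj[e.first].push_back(e.second);   // undirected: add both directions
        adj[e.second].push_back(e.first);
    }
    vector<int> d(n+1, -1);   // -1 doubles as "not visited"
    queue<int> q;
    d[x] = 0;
    q.push(x);
    while (!q.empty()) {
        int s = q.front(); q.pop();
        for (int u : adj[s]) {
            if (d[u] != -1) continue;       // already visited
            d[u] = d[s] + 1;
            q.push(u);
        }
    }
    return d;
}
```

+For the example graph and starting node 1, the returned values agree
+with the table of distances above.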
+
+\section{Applications}
+
+Using the graph traversal algorithms,
+we can check many properties of graphs.
+Usually, both depth-first search and
+breadth-first search may be used,
+but in practice, depth-first search
+is a better choice, because it is
+easier to implement.
+In the following applications we will
+assume that the graph is undirected.
+
+\subsubsection{Connectivity check}
+
+\index{connected graph}
+
+A graph is connected if there is a path
+between any two nodes of the graph.
+Thus, we can check if a graph is connected
+by starting at an arbitrary node and
+finding out if we can reach all other nodes.
+
+For example, in the graph
+\begin{center}
+\begin{tikzpicture}
+\node[draw, circle] (2) at (7,5) {$2$};
+\node[draw, circle] (1) at (3,5) {$1$};
+\node[draw, circle] (3) at (5,4) {$3$};
+\node[draw, circle] (5) at (7,3) {$5$};
+\node[draw, circle] (4) at (3,3) {$4$};
+
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (4);
+\path[draw,thick,-] (2) -- (5);
+\end{tikzpicture}
+\end{center}
+a depth-first search from node $1$ visits
+the following nodes:
+\begin{center}
+\begin{tikzpicture}
+\node[draw, circle] (2) at (7,5) {$2$};
+\node[draw, circle,fill=lightgray] (1) at (3,5) {$1$};
+\node[draw, circle,fill=lightgray] (3) at (5,4) {$3$};
+\node[draw, circle] (5) at (7,3) {$5$};
+\node[draw, circle,fill=lightgray] (4) at (3,3) {$4$};
+
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (4);
+\path[draw,thick,-] (2) -- (5);
+
+\path[draw=red,thick,->,line width=2pt] (1) -- (3);
+\path[draw=red,thick,->,line width=2pt] (3) -- (4);
+
+\end{tikzpicture}
+\end{center}
+
+Since the search did not visit all the nodes,
+we can conclude that the graph is not connected.
+In a similar way, we can also find all connected components
+of a graph by iterating through the nodes and always
+starting a new depth-first search if the current node
+does not belong to any component yet.
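+
+The component search described above might be sketched as follows.
+The names \texttt{dfs}, \texttt{find\_components} and \texttt{comp}
+are hypothetical, and the graph is assumed to be stored as adjacency lists
+with nodes numbered $1 \ldots n$.
+

```cpp
#include <vector>
using namespace std;

// Labels every node reachable from s with the component id c.
void dfs(int s, int c, const vector<vector<int>>& adj, vector<int>& comp) {
    if (comp[s] != 0) return;   // s already belongs to a component
    comp[s] = c;
    for (auto u : adj[s]) dfs(u, c, adj, comp);
}

// Returns the number of connected components.
// After the call, comp[s] is the id (1, 2, ...) of the component of node s.
int find_components(int n, const vector<vector<int>>& adj, vector<int>& comp) {
    comp.assign(n+1, 0);        // 0 means "no component yet"
    int components = 0;
    for (int s = 1; s <= n; s++) {
        if (comp[s] != 0) continue;
        components++;           // s starts a new component
        dfs(s, components, adj, comp);
    }
    return components;
}
```

+
+With this sketch, the graph is connected exactly when
+\texttt{find\_components} returns 1.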
+
+\subsubsection{Finding cycles}
+
+\index{cycle}
+
+A graph contains a cycle if during a graph traversal,
+we find a node whose neighbor (other than the
+previous node in the current path) has already been
+visited.
+For example, the graph
+\begin{center}
+\begin{tikzpicture}
+\node[draw, circle] (2) at (7,5) {$2$};
+\node[draw, circle] (1) at (3,5) {$1$};
+\node[draw, circle] (3) at (5,4) {$3$};
+\node[draw, circle] (5) at (7,3) {$5$};
+\node[draw, circle] (4) at (3,3) {$4$};
+
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (4);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (2) -- (3);
+\path[draw,thick,-] (3) -- (5);
+\end{tikzpicture}
+\end{center}
+contains two cycles and we can find one
+of them as follows:
+\begin{center}
+\begin{tikzpicture}
+\node[draw, circle,fill=lightgray] (2) at (7,5) {$2$};
+\node[draw, circle,fill=lightgray] (1) at (3,5) {$1$};
+\node[draw, circle,fill=lightgray] (3) at (5,4) {$3$};
+\node[draw, circle,fill=lightgray] (5) at (7,3) {$5$};
+\node[draw, circle] (4) at (3,3) {$4$};
+
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (4);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (2) -- (3);
+\path[draw,thick,-] (3) -- (5);
+
+\path[draw=red,thick,->,line width=2pt] (1) -- (3);
+\path[draw=red,thick,->,line width=2pt] (3) -- (2);
+\path[draw=red,thick,->,line width=2pt] (2) -- (5);
+
+\end{tikzpicture}
+\end{center}
+After moving from node 2 to node 5 we notice that
+the neighbor 3 of node 5 has already been visited.
+Thus, the graph contains a cycle that goes through node 3,
+for example, $3 \rightarrow 2 \rightarrow 5 \rightarrow 3$.
+
+Another way to find out whether a graph contains a cycle
+is to simply calculate the number of nodes and edges
+in every component.
+If a component contains $c$ nodes and no cycle,
+it must contain exactly $c-1$ edges
+(so it has to be a tree).
+If there are $c$ or more edges, the component
+surely contains a cycle.
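+
+The first approach, looking for a visited neighbor other than the
+previous node, could be sketched like this.
+The function name \texttt{has\_cycle} and the parameter \texttt{parent}
+are assumptions, and the sketch only works for simple graphs
+(no self-loops and no parallel edges).
+

```cpp
#include <vector>
using namespace std;

// Returns true if the search started at s finds a cycle.
// parent is the node from which the search arrived at s
// (use 0 for the starting node, since nodes are numbered from 1).
bool has_cycle(int s, int parent, const vector<vector<int>>& adj,
               vector<bool>& visited) {
    visited[s] = true;
    for (auto u : adj[s]) {
        if (u == parent) continue;      // the edge we just came along
        if (visited[u]) return true;    // visited neighbor closes a cycle
        if (has_cycle(u, s, adj, visited)) return true;
    }
    return false;
}
```

+
+To examine the whole graph, the function would be called once
+for every node that is still unvisited.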
+
+\subsubsection{Bipartiteness check}
+
+\index{bipartite graph}
+
+A graph is bipartite if its nodes can be colored
+using two colors so that there are no adjacent
+nodes with the same color.
+It is surprisingly easy to check if a graph
+is bipartite using graph traversal algorithms.
+
+The idea is to color the starting node blue,
+all its neighbors red, all their neighbors blue, and so on.
+If at some point of the search we notice that
+two adjacent nodes have the same color,
+this means that the graph is not bipartite.
+Otherwise the graph is bipartite and one coloring
+has been found.
+
+For example, the graph
+\begin{center}
+\begin{tikzpicture}
+\node[draw, circle] (2) at (5,5) {$2$};
+\node[draw, circle] (1) at (3,5) {$1$};
+\node[draw, circle] (3) at (7,4) {$3$};
+\node[draw, circle] (5) at (5,3) {$5$};
+\node[draw, circle] (4) at (3,3) {$4$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (5) -- (4);
+\path[draw,thick,-] (4) -- (1);
+\path[draw,thick,-] (2) -- (3);
+\path[draw,thick,-] (5) -- (3);
+\end{tikzpicture}
+\end{center}
+is not bipartite, because a search from node 1
+proceeds as follows:
+\begin{center}
+\begin{tikzpicture}
+\node[draw, circle,fill=red!40] (2) at (5,5) {$2$};
+\node[draw, circle,fill=blue!40] (1) at (3,5) {$1$};
+\node[draw, circle,fill=blue!40] (3) at (7,4) {$3$};
+\node[draw, circle,fill=red!40] (5) at (5,3) {$5$};
+\node[draw, circle] (4) at (3,3) {$4$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (5) -- (4);
+\path[draw,thick,-] (4) -- (1);
+\path[draw,thick,-] (2) -- (3);
+\path[draw,thick,-] (5) -- (3);
+
+\path[draw=red,thick,->,line width=2pt] (1) -- (2);
+\path[draw=red,thick,->,line width=2pt] (2) -- (3);
+\path[draw=red,thick,->,line width=2pt] (3) -- (5);
+\path[draw=red,thick,->,line width=2pt] (5) -- (2);
+\end{tikzpicture}
+\end{center}
+We notice that the color of both nodes 2 and 5
+is red, while they are adjacent nodes in the graph.
+Thus, the graph is not bipartite.
+
+This algorithm always works, because when there
+are only two colors available,
+the color of the starting node in a component
+determines the colors of all other nodes in the component.
+It does not make any difference whether the
+starting node is red or blue.
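+
+A possible sketch of the coloring idea, here based on breadth-first search,
+is shown below.
+The function name \texttt{is\_bipartite} and the encoding of the colors
+as the numbers 1 and 2 (with 0 for an uncolored node) are assumptions
+made for illustration.
+

```cpp
#include <queue>
#include <vector>
using namespace std;

// Returns true if the graph is bipartite. Nodes are numbered 1..n.
// On success, color[s] is 1 or 2 for every node s.
bool is_bipartite(int n, const vector<vector<int>>& adj, vector<int>& color) {
    color.assign(n+1, 0);               // 0 means "not colored yet"
    for (int x = 1; x <= n; x++) {
        if (color[x] != 0) continue;    // component already colored
        color[x] = 1;
        queue<int> q;
        q.push(x);
        while (!q.empty()) {
            int s = q.front(); q.pop();
            for (auto u : adj[s]) {
                if (color[u] == 0) {
                    color[u] = 3-color[s];  // give u the other color
                    q.push(u);
                } else if (color[u] == color[s]) {
                    return false;   // two adjacent nodes share a color
                }
            }
        }
    }
    return true;
}
```
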
+
+Note that in the general case,
+it is difficult to find out if the nodes
+in a graph can be colored using $k$ colors
+so that no adjacent nodes have the same color.
+Even when $k=3$, the problem is NP-hard,
+and no efficient algorithm is known.
--- a/chapter13.tex
+++ b/chapter13.tex
@ -0,0 +1,802 @@
+\chapter{Shortest paths}
+
+\index{shortest path}
+
+Finding a shortest path between two nodes
+of a graph
+is an important problem that has many
+practical applications.
+For example, a natural problem related to a road network
+is to calculate the shortest possible length of a route
+between two cities, given the lengths of the roads.
+
+In an unweighted graph, the length of a path equals
+the number of its edges, and we can
+simply use breadth-first search to find
+a shortest path.
+However, in this chapter we focus on
+weighted graphs
+where more sophisticated algorithms
+are needed
+for finding shortest paths.
+
+\section{Bellman–Ford algorithm}
+
+\index{Bellman–Ford algorithm}
+
+The \key{Bellman–Ford algorithm}\footnote{The algorithm is named after
+R. E. Bellman and L. R. Ford who published it independently
+in 1958 and 1956, respectively \cite{bel58,for56a}.} finds
+shortest paths from a starting node to all
+nodes of the graph.
+The algorithm can process all kinds of graphs,
+provided that the graph does not contain a
+cycle with negative length.
+If the graph contains a negative cycle,
+the algorithm can detect this.
+
+The algorithm keeps track of distances
+from the starting node to all nodes of the graph.
+Initially, the distance to the starting node is 0
+and the distance to all other nodes is infinite.
+The algorithm reduces the distances by finding
+edges that shorten the paths until it is not
+possible to reduce any distance.
+
+\subsubsection{Example}
+
+Let us consider how the Bellman–Ford algorithm
+works in the following graph:
+\begin{center}
+\begin{tikzpicture}
+\node[draw, circle] (1) at (1,3) {1};
+\node[draw, circle] (2) at (4,3) {2};
+\node[draw, circle] (3) at (1,1) {3};
+\node[draw, circle] (4) at (4,1) {4};
+\node[draw, circle] (5) at (6,2) {5};
+\node[color=red] at (1,3+0.55) {$0$};
+\node[color=red] at (4,3+0.55) {$\infty$};
+\node[color=red] at (1,1-0.55) {$\infty$};
+\node[color=red] at (4,1-0.55) {$\infty$};
+\node[color=red] at (6,2-0.55) {$\infty$};
+\path[draw,thick,-] (1) -- node[font=\small,label=above:5] {} (2);
+\path[draw,thick,-] (1) -- node[font=\small,label=left:3] {} (3);
+\path[draw,thick,-] (3) -- node[font=\small,label=below:1] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=left:3] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
+\path[draw,thick,-] (4) -- node[font=\small,label=below:2] {} (5);
+\path[draw,thick,-] (1) -- node[font=\small,label=above:7] {} (4);
+\end{tikzpicture}
+\end{center}
+Each node of the graph is assigned a distance.
+Initially, the distance to the starting node is 0,
+and the distance to all other nodes is infinite.
+
+The algorithm searches for edges that reduce distances.
+First, all edges from node 1 reduce distances:
+\begin{center}
+\begin{tikzpicture}
+\node[draw, circle] (1) at (1,3) {1};
+\node[draw, circle] (2) at (4,3) {2};
+\node[draw, circle] (3) at (1,1) {3};
+\node[draw, circle] (4) at (4,1) {4};
+\node[draw, circle] (5) at (6,2) {5};
+\node[color=red] at (1,3+0.55) {$0$};
+\node[color=red] at (4,3+0.55) {$5$};
+\node[color=red] at (1,1-0.55) {$3$};
+\node[color=red] at (4,1-0.55) {$7$};
+\node[color=red] at (6,2-0.55) {$\infty$};
+\path[draw,thick,-] (1) -- node[font=\small,label=above:5] {} (2);
+\path[draw,thick,-] (1) -- node[font=\small,label=left:3] {} (3);
+\path[draw,thick,-] (3) -- node[font=\small,label=below:1] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=left:3] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
+\path[draw,thick,-] (4) -- node[font=\small,label=below:2] {} (5);
+\path[draw,thick,-] (1) -- node[font=\small,label=above:7] {} (4);
+
+\path[draw=red,thick,->,line width=2pt] (1) -- (2);
+\path[draw=red,thick,->,line width=2pt] (1) -- (3);
+\path[draw=red,thick,->,line width=2pt] (1) -- (4);
+\end{tikzpicture}
+\end{center}
+After this, edges
+$2 \rightarrow 5$ and $3 \rightarrow 4$
+reduce distances:
+\begin{center}
+\begin{tikzpicture}
+\node[draw, circle] (1) at (1,3) {1};
+\node[draw, circle] (2) at (4,3) {2};
+\node[draw, circle] (3) at (1,1) {3};
+\node[draw, circle] (4) at (4,1) {4};
+\node[draw, circle] (5) at (6,2) {5};
+\node[color=red] at (1,3+0.55) {$0$};
+\node[color=red] at (4,3+0.55) {$5$};
+\node[color=red] at (1,1-0.55) {$3$};
+\node[color=red] at (4,1-0.55) {$4$};
+\node[color=red] at (6,2-0.55) {$7$};
+\path[draw,thick,-] (1) -- node[font=\small,label=above:5] {} (2);
+\path[draw,thick,-] (1) -- node[font=\small,label=left:3] {} (3);
+\path[draw,thick,-] (3) -- node[font=\small,label=below:1] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=left:3] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
+\path[draw,thick,-] (4) -- node[font=\small,label=below:2] {} (5);
+\path[draw,thick,-] (1) -- node[font=\small,label=above:7] {} (4);
+
+\path[draw=red,thick,->,line width=2pt] (2) -- (5);
+\path[draw=red,thick,->,line width=2pt] (3) -- (4);
+\end{tikzpicture}
+\end{center}
+Finally, there is one more change:
+\begin{center}
+\begin{tikzpicture}
+\node[draw, circle] (1) at (1,3) {1};
+\node[draw, circle] (2) at (4,3) {2};
+\node[draw, circle] (3) at (1,1) {3};
+\node[draw, circle] (4) at (4,1) {4};
+\node[draw, circle] (5) at (6,2) {5};
+\node[color=red] at (1,3+0.55) {$0$};
+\node[color=red] at (4,3+0.55) {$5$};
+\node[color=red] at (1,1-0.55) {$3$};
+\node[color=red] at (4,1-0.55) {$4$};
+\node[color=red] at (6,2-0.55) {$6$};
+\path[draw,thick,-] (1) -- node[font=\small,label=above:5] {} (2);
+\path[draw,thick,-] (1) -- node[font=\small,label=left:3] {} (3);
+\path[draw,thick,-] (3) -- node[font=\small,label=below:1] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=left:3] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
+\path[draw,thick,-] (4) -- node[font=\small,label=below:2] {} (5);
+\path[draw,thick,-] (1) -- node[font=\small,label=above:7] {} (4);
+
+\path[draw=red,thick,->,line width=2pt] (4) -- (5);
+\end{tikzpicture}
+\end{center}
+
+After this, no edge can reduce any distance.
+This means that the distances are final,
+and we have successfully
+calculated the shortest distances
+from the starting node to all nodes of the graph.
+
+For example, the shortest distance 6
+from node 1 to node 5 corresponds to
+the following path:
+
+\begin{center}
+\begin{tikzpicture}
+\node[draw, circle] (1) at (1,3) {1};
+\node[draw, circle] (2) at (4,3) {2};
+\node[draw, circle] (3) at (1,1) {3};
+\node[draw, circle] (4) at (4,1) {4};
+\node[draw, circle] (5) at (6,2) {5};
+\node[color=red] at (1,3+0.55) {$0$};
+\node[color=red] at (4,3+0.55) {$5$};
+\node[color=red] at (1,1-0.55) {$3$};
+\node[color=red] at (4,1-0.55) {$4$};
+\node[color=red] at (6,2-0.55) {$6$};
+\path[draw,thick,-] (1) -- node[font=\small,label=above:5] {} (2);
+\path[draw,thick,-] (1) -- node[font=\small,label=left:3] {} (3);
+\path[draw,thick,-] (3) -- node[font=\small,label=below:1] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=left:3] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
+\path[draw,thick,-] (4) -- node[font=\small,label=below:2] {} (5);
+\path[draw,thick,-] (1) -- node[font=\small,label=above:7] {} (4);
+
+\path[draw=red,thick,->,line width=2pt] (1) -- (3);
+\path[draw=red,thick,->,line width=2pt] (3) -- (4);
+\path[draw=red,thick,->,line width=2pt] (4) -- (5);
+\end{tikzpicture}
+\end{center}
+
+\subsubsection{Implementation}
+
+The following implementation of the
+Bellman–Ford algorithm determines the shortest distances
+from a node $x$ to all nodes of the graph.
+The code assumes that the graph is stored
+as an edge list \texttt{edges}
+that consists of tuples of the form $(a,b,w)$,
+meaning that there is an edge from node $a$ to node $b$
+with weight $w$.
+
+The algorithm consists of $n-1$ rounds,
+and on each round the algorithm goes through
+all edges of the graph and tries to
+reduce the distances.
+The algorithm constructs an array \texttt{distance}
+that will contain the distances from $x$
+to all nodes of the graph.
+The constant \texttt{INF} denotes an infinite distance.
+
+\begin{lstlisting}
+for (int i = 1; i <= n; i++) distance[i] = INF;
+distance[x] = 0;
+for (int i = 1; i <= n-1; i++) {
+    for (auto e : edges) {
+        int a, b, w;
+        tie(a, b, w) = e;
+        distance[b] = min(distance[b], distance[a]+w);
+    }
+}
+\end{lstlisting}
+
+The time complexity of the algorithm is $O(nm)$,
+because the algorithm consists of $n-1$ rounds and
+iterates through all $m$ edges during a round.
+If there are no negative cycles in the graph,
+all distances are final after $n-1$ rounds,
+because each shortest path can contain at most $n-1$ edges.
+
+In practice, the final distances can usually
+be found faster than in $n-1$ rounds.
+Thus, a possible way to make the algorithm more efficient
+is to stop the algorithm if no distance
+can be reduced during a round.
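+
+The early exit suggested above could be sketched as follows.
+The function name \texttt{bellman\_ford}, the flag \texttt{changed},
+the \texttt{long long} weights and the value chosen for \texttt{INF}
+are assumptions of this sketch.
+It also skips edges whose start node is still at infinite distance,
+which is an addition that keeps \texttt{INF} from being used in sums.
+

```cpp
#include <tuple>
#include <vector>
using namespace std;

const long long INF = 1e18;  // assumed sentinel for an infinite distance

// Bellman-Ford from node x that stops as soon as a round changes nothing.
// edges holds tuples (a,b,w); nodes are numbered 1..n.
vector<long long> bellman_ford(int n,
        const vector<tuple<int,int,long long>>& edges, int x) {
    vector<long long> dist(n+1, INF);
    dist[x] = 0;
    for (int i = 1; i <= n-1; i++) {
        bool changed = false;
        for (auto e : edges) {
            int a, b; long long w;
            tie(a, b, w) = e;
            if (dist[a] == INF) continue;   // a has not been reached yet
            if (dist[a]+w < dist[b]) {
                dist[b] = dist[a]+w;
                changed = true;
            }
        }
        if (!changed) break;    // no distance was reduced in this round
    }
    return dist;
}
```

+
+Since the function works with directed edges, an undirected edge
+has to be added to \texttt{edges} in both directions.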
+
+\subsubsection{Negative cycles}
+
+\index{negative cycle}
+
+The Bellman–Ford algorithm can also be used to
+check if the graph contains a cycle with negative length.
+For example, the graph
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (0,0) {$1$};
+\node[draw, circle] (2) at (2,1) {$2$};
+\node[draw, circle] (3) at (2,-1) {$3$};
+\node[draw, circle] (4) at (4,0) {$4$};
+
+\path[draw,thick,-] (1) -- node[font=\small,label=above:$3$] {} (2);
+\path[draw,thick,-] (2) -- node[font=\small,label=above:$1$] {} (4);
+\path[draw,thick,-] (1) -- node[font=\small,label=below:$5$] {} (3);
+\path[draw,thick,-] (3) -- node[font=\small,label=below:$-7$] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=right:$2$] {} (3);
+\end{tikzpicture}
+\end{center}
+\noindent
+contains a negative cycle
+$2 \rightarrow 3 \rightarrow 4 \rightarrow 2$
+with length $-4$.
+
+If the graph contains a negative cycle,
+any path that contains the cycle can be shortened
+arbitrarily many times by repeating the cycle
+again and again.
+Thus, the concept of a shortest path
+is not meaningful in this situation.
+
+A negative cycle can be detected
+using the Bellman–Ford algorithm by
+running the algorithm for $n$ rounds.
+If the last round reduces any distance,
+the graph contains a negative cycle.
+Note that this algorithm can be used to
+search for
+a negative cycle in the whole graph
+regardless of the starting node.
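+
+One way to sketch this check is shown below.
+The function name \texttt{has\_negative\_cycle} is hypothetical,
+and the sketch sets every initial distance to 0 instead of using
+a single starting node.
+This choice corresponds to an extra source node with zero-weight edges
+to all nodes, so that a negative cycle is found even if it cannot be
+reached from a particular node.
+The edges are treated as directed.
+

```cpp
#include <tuple>
#include <vector>
using namespace std;

// Returns true if the directed graph given by the edge tuples (a,b,w)
// contains a cycle with negative length. Nodes are numbered 1..n.
bool has_negative_cycle(int n,
        const vector<tuple<int,int,long long>>& edges) {
    // All distances start at 0 (a choice of this sketch), which makes
    // every node a starting node.
    vector<long long> dist(n+1, 0);
    for (int i = 1; i <= n; i++) {
        bool changed = false;
        for (auto e : edges) {
            int a, b; long long w;
            tie(a, b, w) = e;
            if (dist[a]+w < dist[b]) {
                dist[b] = dist[a]+w;
                changed = true;
            }
        }
        if (!changed) return false;  // distances are final, no such cycle
    }
    return true;    // round n still reduced a distance
}
```
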
+
+\subsubsection{SPFA algorithm}
+
+\index{SPFA algorithm}
+
+The \key{SPFA algorithm} (``Shortest Path Faster Algorithm'') \cite{fan94}
+is a variant of the Bellman–Ford algorithm
+that is often more efficient than the original algorithm.
+The SPFA algorithm does not go through all the edges on each round,
+but instead, it chooses the edges to be examined
+in a more intelligent way.
+
+The algorithm maintains a queue of nodes that might
+be used for reducing the distances.
+First, the algorithm adds the starting node $x$
+to the queue.
+Then, the algorithm always processes the
+first node in the queue, and when an edge
+$a \rightarrow b$ reduces a distance,
+node $b$ is added to the queue.
+% 
+% The following implementation uses a 
+% \texttt{queue} \texttt{q}.
+% In addition, an array \texttt{inqueue} indicates
+% if a node is already in the queue,
+% in which case the algorithm does not add
+% the node to the queue again.
+% 
+% \begin{lstlisting}
+% for (int i = 1; i <= n; i++) distance[i] = INF;
+% distance[x] = 0;
+% q.push(x);
+% while (!q.empty()) {
+%     int a = q.front(); q.pop();
+%     inqueue[a] = false;
+%     for (auto b : v[a]) {
+%         if (distance[a]+b.second < distance[b.first]) {
+%             distance[b.first] = distance[a]+b.second;
+%             if (!inqueue[b]) {q.push(b); inqueue[b] = true;}
+%         }
+%     }
+% }
+% \end{lstlisting}
+
+The efficiency of the SPFA algorithm depends
+on the structure of the graph:
+the algorithm is often efficient,
+but its worst case time complexity is still
+$O(nm)$ and it is possible to create inputs
+that make the algorithm as slow as the
+original Bellman–Ford algorithm.
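+
+The queue-based procedure could be sketched as follows.
+The function name \texttt{spfa}, the array \texttt{inqueue}
+(which prevents a node from being in the queue twice at the same time)
+and the adjacency list layout with pairs $(b,w)$ are assumptions of
+this sketch.
+It also assumes that no negative cycle can be reached from the
+starting node, because otherwise the loop would never end.
+

```cpp
#include <queue>
#include <utility>
#include <vector>
using namespace std;

const long long INF = 1e18;  // assumed sentinel for an infinite distance

// SPFA from node x. adj[a] holds a pair (b,w) for each edge from a to b
// with weight w; nodes are numbered 1..n.
vector<long long> spfa(int n,
        const vector<vector<pair<int,long long>>>& adj, int x) {
    vector<long long> dist(n+1, INF);
    vector<bool> inqueue(n+1, false);
    queue<int> q;
    dist[x] = 0;
    q.push(x);
    inqueue[x] = true;
    while (!q.empty()) {
        int a = q.front(); q.pop();
        inqueue[a] = false;
        for (auto e : adj[a]) {
            int b = e.first; long long w = e.second;
            if (dist[a]+w < dist[b]) {
                dist[b] = dist[a]+w;
                // b may now reduce other distances, so it is queued
                // unless it is already waiting in the queue
                if (!inqueue[b]) { q.push(b); inqueue[b] = true; }
            }
        }
    }
    return dist;
}
```
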
+
+\section{Dijkstra's algorithm}
+
+\index{Dijkstra's algorithm}
+
+\key{Dijkstra's algorithm}\footnote{E. W. Dijkstra published the algorithm in 1959 \cite{dij59};
+however, his original paper does not mention how to implement the algorithm efficiently.}
+finds shortest
+paths from the starting node to all nodes of the graph,
+like the Bellman–Ford algorithm.
+The benefit of Dijkstra's algorithm is that
+it is more efficient and can be used for
+processing large graphs.
+However, the algorithm requires that there
+are no negative weight edges in the graph.
+
+Like the Bellman–Ford algorithm,
+Dijkstra's algorithm maintains distances
+to the nodes and reduces them during the search.
+Dijkstra's algorithm is efficient, because
+it only processes
+each edge in the graph once, using the fact
+that there are no negative edges.
+
+\subsubsection{Example}
+
+Let us consider how Dijkstra's algorithm
+works in the following graph when the
+starting node is node 1:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,3) {3};
+\node[draw, circle] (2) at (4,3) {4};
+\node[draw, circle] (3) at (1,1) {2};
+\node[draw, circle] (4) at (4,1) {1};
+\node[draw, circle] (5) at (6,2) {5};
+
+\node[color=red] at (1,3+0.6) {$\infty$};
+\node[color=red] at (4,3+0.6) {$\infty$};
+\node[color=red] at (1,1-0.6) {$\infty$};
+\node[color=red] at (4,1-0.6) {$0$};
+\node[color=red] at (6,2-0.6) {$\infty$};
+
+\path[draw,thick,-] (1) -- node[font=\small,label=above:6] {} (2);
+\path[draw,thick,-] (1) -- node[font=\small,label=left:2] {} (3);
+\path[draw,thick,-] (3) -- node[font=\small,label=below:5] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=left:9] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
+\path[draw,thick,-] (4) -- node[font=\small,label=below:1] {} (5);
+\end{tikzpicture}
+\end{center}
+Like in the Bellman–Ford algorithm,
+initially the distance to the starting node is 0
+and the distance to all other nodes is infinite.
+
+At each step, Dijkstra's algorithm selects a node
+that has not been processed yet and whose distance
+is as small as possible.
+The first such node is node 1 with distance 0.
+
+When a node is selected, the algorithm
+goes through all edges that start at the node
+and reduces the distances using them:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,3) {3};
+\node[draw, circle] (2) at (4,3) {4};
+\node[draw, circle] (3) at (1,1) {2};
+\node[draw, circle, fill=lightgray] (4) at (4,1) {1};
+\node[draw, circle] (5) at (6,2) {5};
+
+\node[color=red] at (1,3+0.6) {$\infty$};
+\node[color=red] at (4,3+0.6) {$9$};
+\node[color=red] at (1,1-0.6) {$5$};
+\node[color=red] at (4,1-0.6) {$0$};
+\node[color=red] at (6,2-0.6) {$1$};
+
+\path[draw,thick,-] (1) -- node[font=\small,label=above:6] {} (2);
+\path[draw,thick,-] (1) -- node[font=\small,label=left:2] {} (3);
+\path[draw,thick,-] (3) -- node[font=\small,label=below:5] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=left:9] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
+\path[draw,thick,-] (4) -- node[font=\small,label=below:1] {} (5);
+
+\path[draw=red,thick,->,line width=2pt] (4) -- (2);
+\path[draw=red,thick,->,line width=2pt] (4) -- (3);
+\path[draw=red,thick,->,line width=2pt] (4) -- (5);
+\end{tikzpicture}
+\end{center}
+In this case,
+the edges from node 1 reduced the distances of
+nodes 2, 4 and 5, whose distances are now 5, 9 and 1.
+
+The next node to be processed is node 5 with distance 1.
+This reduces the distance to node 4 from 9 to 3:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,3) {3};
+\node[draw, circle] (2) at (4,3) {4};
+\node[draw, circle] (3) at (1,1) {2};
+\node[draw, circle, fill=lightgray] (4) at (4,1) {1};
+\node[draw, circle, fill=lightgray] (5) at (6,2) {5};
+
+\node[color=red] at (1,3+0.6) {$\infty$};
+\node[color=red] at (4,3+0.6) {$3$};
+\node[color=red] at (1,1-0.6) {$5$};
+\node[color=red] at (4,1-0.6) {$0$};
+\node[color=red] at (6,2-0.6) {$1$};
+
+\path[draw,thick,-] (1) -- node[font=\small,label=above:6] {} (2);
+\path[draw,thick,-] (1) -- node[font=\small,label=left:2] {} (3);
+\path[draw,thick,-] (3) -- node[font=\small,label=below:5] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=left:9] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
+\path[draw,thick,-] (4) -- node[font=\small,label=below:1] {} (5);
+
+\path[draw=red,thick,->,line width=2pt] (5) -- (2);
+\end{tikzpicture}
+\end{center}
+After this, the next node is node 4, which reduces
+the distance to node 3 to 9:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,3) {3};
+\node[draw, circle, fill=lightgray] (2) at (4,3) {4};
+\node[draw, circle] (3) at (1,1) {2};
+\node[draw, circle, fill=lightgray] (4) at (4,1) {1};
+\node[draw, circle, fill=lightgray] (5) at (6,2) {5};
+
+\node[color=red] at (1,3+0.6) {$9$};
+\node[color=red] at (4,3+0.6) {$3$};
+\node[color=red] at (1,1-0.6) {$5$};
+\node[color=red] at (4,1-0.6) {$0$};
+\node[color=red] at (6,2-0.6) {$1$};
+
+\path[draw,thick,-] (1) -- node[font=\small,label=above:6] {} (2);
+\path[draw,thick,-] (1) -- node[font=\small,label=left:2] {} (3);
+\path[draw,thick,-] (3) -- node[font=\small,label=below:5] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=left:9] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
+\path[draw,thick,-] (4) -- node[font=\small,label=below:1] {} (5);
+
+\path[draw=red,thick,->,line width=2pt] (2) -- (1);
+\end{tikzpicture}
+\end{center}
+
+A remarkable property of Dijkstra's algorithm is that
+whenever a node is selected, its distance is final.
+For example, at this point of the algorithm,
+the distances 0, 1 and 3 are the final distances
+to nodes 1, 5 and 4.
+
+After this, the algorithm processes the two
+remaining nodes, and the final distances are as follows:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle, fill=lightgray] (1) at (1,3) {3};
+\node[draw, circle, fill=lightgray] (2) at (4,3) {4};
+\node[draw, circle, fill=lightgray] (3) at (1,1) {2};
+\node[draw, circle, fill=lightgray] (4) at (4,1) {1};
+\node[draw, circle, fill=lightgray] (5) at (6,2) {5};
+
+\node[color=red] at (1,3+0.6) {$7$};
+\node[color=red] at (4,3+0.6) {$3$};
+\node[color=red] at (1,1-0.6) {$5$};
+\node[color=red] at (4,1-0.6) {$0$};
+\node[color=red] at (6,2-0.6) {$1$};
+
+\path[draw,thick,-] (1) -- node[font=\small,label=above:6] {} (2);
+\path[draw,thick,-] (1) -- node[font=\small,label=left:2] {} (3);
+\path[draw,thick,-] (3) -- node[font=\small,label=below:5] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=left:9] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
+\path[draw,thick,-] (4) -- node[font=\small,label=below:1] {} (5);
+\end{tikzpicture}
+\end{center}
+
+\subsubsection{Negative edges}
+
+The efficiency of Dijkstra's algorithm is
+based on the fact that the graph does not
+contain negative edges.
+If there is a negative edge,
+the algorithm may give incorrect results.
+As an example, consider the following graph:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (0,0) {$1$};
+\node[draw, circle] (2) at (2,1) {$2$};
+\node[draw, circle] (3) at (2,-1) {$3$};
+\node[draw, circle] (4) at (4,0) {$4$};
+
+\path[draw,thick,-] (1) -- node[font=\small,label=above:2] {} (2);
+\path[draw,thick,-] (2) -- node[font=\small,label=above:3] {} (4);
+\path[draw,thick,-] (1) -- node[font=\small,label=below:6] {} (3);
+\path[draw,thick,-] (3) -- node[font=\small,label=below:$-5$] {} (4);
+\end{tikzpicture}
+\end{center}
+\noindent
+The shortest path from node 1 to node 4 is
+$1 \rightarrow 3 \rightarrow 4$
+and its length is 1.
+However, Dijkstra's algorithm
+finds the path $1 \rightarrow 2 \rightarrow 4$
+by following the minimum weight edges.
+The algorithm does not take into account that
+on the other path, the weight $-5$
+compensates for the previous large weight $6$.
+
+\subsubsection{Implementation}
+
+The following implementation of Dijkstra's algorithm
+calculates the minimum distances from a node $x$
+to other nodes of the graph.
+The graph is stored as adjacency lists
+so that \texttt{adj[$a$]} contains a pair $(b,w)$
+whenever there is an edge from node $a$ to node $b$
+with weight $w$.
+
+An efficient implementation of Dijkstra's algorithm
+requires that it is possible to efficiently find the
+minimum distance node that has not been processed.
+An appropriate data structure for this is a priority queue
+that contains the nodes ordered by their distances.
+Using a priority queue, the next node to be processed
+can be retrieved in logarithmic time.
+
+In the following code, the priority queue
+\texttt{q} contains pairs of the form $(-d,x)$,
+meaning that the current distance to node $x$ is $d$.
+The array $\texttt{distance}$ contains the distance to
+each node, and the array $\texttt{processed}$ indicates
+whether a node has been processed.
+Initially the distance is $0$ to $x$ and $\infty$ to all other nodes.
+
+\begin{lstlisting}
+for (int i = 1; i <= n; i++) distance[i] = INF;
+distance[x] = 0;
+q.push({0,x});
+while (!q.empty()) {
+    int a = q.top().second; q.pop();
+    if (processed[a]) continue;
+    processed[a] = true;
+    for (auto u : adj[a]) {
+        int b = u.first, w = u.second;
+        if (distance[a]+w < distance[b]) {
+            distance[b] = distance[a]+w;
+            q.push({-distance[b],b});
+        }
+    }
+}
+\end{lstlisting}
+
+Note that the priority queue contains \emph{negative}
+distances to nodes.
+The reason for this is that the
+default version of the C++ priority queue finds maximum
+elements, while we want to find minimum elements.
+By using negative distances,
+we can directly use the default priority queue\footnote{Of
+course, we could also declare the priority queue as in Chapter 4.5
+and use positive distances, but the implementation would be a bit longer.}.
+Also note that there may be several instances of the same
+node in the priority queue; however, only the instance with the
+minimum distance will be processed.
+
+The time complexity of the above implementation is
+$O(n+m \log m)$, because the algorithm goes through
+all nodes of the graph and adds for each edge
+at most one distance to the priority queue.
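+
+For comparison, the variant with positive distances mentioned in the
+footnote might look roughly like this.
+The declaration with \texttt{greater} is one common way to obtain a
+priority queue that finds minimum elements; it is an assumption that
+this matches the declaration meant in the footnote.
+The function name \texttt{dijkstra}, the alias \texttt{State} and the
+value of \texttt{INF} are also hypothetical.
+

```cpp
#include <functional>
#include <queue>
#include <utility>
#include <vector>
using namespace std;

const long long INF = 1e18;  // assumed sentinel for an infinite distance
using State = pair<long long,int>;  // (distance, node)

// Dijkstra's algorithm from node x with positive distances in the queue.
// adj[a] holds a pair (b,w) for each edge from a to b with weight w >= 0.
vector<long long> dijkstra(int n,
        const vector<vector<pair<int,long long>>>& adj, int x) {
    vector<long long> dist(n+1, INF);
    vector<bool> processed(n+1, false);
    // greater<State> makes top() return the smallest pair
    priority_queue<State, vector<State>, greater<State>> q;
    dist[x] = 0;
    q.push({0, x});
    while (!q.empty()) {
        int a = q.top().second; q.pop();
        if (processed[a]) continue;     // outdated instance of node a
        processed[a] = true;
        for (auto u : adj[a]) {
            int b = u.first; long long w = u.second;
            if (dist[a]+w < dist[b]) {
                dist[b] = dist[a]+w;
                q.push({dist[b], b});
            }
        }
    }
    return dist;
}
```
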
+
+\section{Floyd–Warshall algorithm}
+
+\index{Floyd–Warshall algorithm}
+
+The \key{Floyd–Warshall algorithm}\footnote{The algorithm
+is named after R. W. Floyd and S. Warshall
+who published it independently in 1962 \cite{flo62,war62}.}
+provides an alternative way to approach the problem
+of finding shortest paths.
+Unlike the other algorithms of this chapter,
+it finds all shortest paths between the nodes
+in a single run.
+
+The algorithm maintains a two-dimensional array
+that contains distances between the nodes.
+First, distances are calculated only using
+direct edges between the nodes,
+and after this, the algorithm reduces distances
+by using intermediate nodes in paths.
+
+\subsubsection{Example}
+
+Let us consider how the Floyd–Warshall algorithm
+works in the following graph:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,3) {$3$};
+\node[draw, circle] (2) at (4,3) {$4$};
+\node[draw, circle] (3) at (1,1) {$2$};
+\node[draw, circle] (4) at (4,1) {$1$};
+\node[draw, circle] (5) at (6,2) {$5$};
+
+\path[draw,thick,-] (1) -- node[font=\small,label=above:7] {} (2);
+\path[draw,thick,-] (1) -- node[font=\small,label=left:2] {} (3);
+\path[draw,thick,-] (3) -- node[font=\small,label=below:5] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=left:9] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
+\path[draw,thick,-] (4) -- node[font=\small,label=below:1] {} (5);
+\end{tikzpicture}
+\end{center}
+
+Initially, the distance from each node to itself is $0$,
+and the distance between nodes $a$ and $b$ is $x$
+if there is an edge between nodes $a$ and $b$ with weight $x$.
+All other distances are infinite.
+
+In this graph, the initial array is as follows:
+\begin{center}
+\begin{tabular}{r|rrrrr}
+ & 1 & 2 & 3 & 4 & 5 \\
+\hline
+1 & 0 & 5 & $\infty$ & 9 & 1 \\
+2 & 5 & 0 & 2 & $\infty$ & $\infty$ \\
+3 & $\infty$ & 2 & 0 & 7 & $\infty$ \\
+4 & 9 & $\infty$ & 7 & 0 & 2 \\
+5 & 1 & $\infty$ & $\infty$ & 2 & 0 \\
+\end{tabular}
+\end{center}
+\vspace{10pt}
+The algorithm consists of consecutive rounds.
+On each round, the algorithm selects a new node
+that can act as an intermediate node in paths from now on,
+and distances are reduced using this node.
+
+On the first round, node 1 is the new intermediate node.
+There is a new path between nodes 2 and 4
+with length 14, because node 1 connects them.
+There is also a new path 
+between nodes 2 and 5 with length 6.
+
+\begin{center}
+\begin{tabular}{r|rrrrr}
+ & 1 & 2 & 3 & 4 & 5 \\
+\hline
+1 & 0 & 5 & $\infty$ & 9 & 1 \\
+2 & 5 & 0 & 2 & \textbf{14} & \textbf{6} \\
+3 & $\infty$ & 2 & 0 & 7 & $\infty$ \\
+4 & 9 & \textbf{14} & 7 & 0 & 2 \\
+5 & 1 & \textbf{6} & $\infty$ & 2 & 0 \\
+\end{tabular}
+\end{center}
+\vspace{10pt}
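+
+In general, if $d(a,b)$ denotes the current array entry for nodes
+$a$ and $b$ (a notation introduced here only for illustration),
+a round with intermediate node $k$ updates every entry as follows:
+\[ d(a,b) = \min(d(a,b),\, d(a,k)+d(k,b)) \]
+On the first round $k=1$, so for example
+$d(2,4) = \min(\infty, 5+9) = 14$ and
+$d(2,5) = \min(\infty, 5+1) = 6$,
+while $d(4,5) = \min(2, 9+1) = 2$ does not change.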
+
+On the second round, node 2 is the new intermediate node.
+This creates new paths between nodes 1 and 3
+and between nodes 3 and 5:
+
+\begin{center}
+\begin{tabular}{r|rrrrr}
+ & 1 & 2 & 3 & 4 & 5 \\
+\hline
+1 & 0 & 5 & \textbf{7} & 9 & 1 \\
+2 & 5 & 0 & 2 & 14 & 6 \\
+3 & \textbf{7} & 2 & 0 & 7 & \textbf{8} \\
+4 & 9 & 14 & 7 & 0 & 2 \\
+5 & 1 & 6 & \textbf{8} & 2 & 0 \\
+\end{tabular}
+\end{center}
+\vspace{10pt}
+
+On the third round, node 3 is the new intermediate node.
+There is a new path between nodes 2 and 4:
+
+\begin{center}
+\begin{tabular}{r|rrrrr}
+ & 1 & 2 & 3 & 4 & 5 \\
+\hline
+1 & 0 & 5 & 7 & 9 & 1 \\
+2 & 5 & 0 & 2 & \textbf{9} & 6 \\
+3 & 7 & 2 & 0 & 7 & 8 \\
+4 & 9 & \textbf{9} & 7 & 0 & 2 \\
+5 & 1 & 6 & 8 & 2 & 0 \\
+\end{tabular}
+\end{center}
+\vspace{10pt}
+
+The algorithm continues like this,
+until all nodes have been appointed intermediate nodes.
+After the algorithm has finished, the array contains
+the minimum distances between any two nodes:
+
+\begin{center}
+\begin{tabular}{r|rrrrr}
+ & 1 & 2 & 3 & 4 & 5 \\
+\hline
+1 & 0 & 5 & 7 & 3 & 1 \\
+2 & 5 & 0 & 2 & 8 & 6 \\
+3 & 7 & 2 & 0 & 7 & 8 \\
+4 & 3 & 8 & 7 & 0 & 2 \\
+5 & 1 & 6 & 8 & 2 & 0 \\
+\end{tabular}
+\end{center}
+
+For example, the array tells us that the
+shortest distance between nodes 2 and 4 is 8.
+This corresponds to the following path:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,3) {$3$};
+\node[draw, circle] (2) at (4,3) {$4$};
+\node[draw, circle] (3) at (1,1) {$2$};
+\node[draw, circle] (4) at (4,1) {$1$};
+\node[draw, circle] (5) at (6,2) {$5$};
+
+\path[draw,thick,-] (1) -- node[font=\small,label=above:7] {} (2);
+\path[draw,thick,-] (1) -- node[font=\small,label=left:2] {} (3);
+\path[draw,thick,-] (3) -- node[font=\small,label=below:5] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=left:9] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
+\path[draw,thick,-] (4) -- node[font=\small,label=below:1] {} (5);
+
+\path[draw=red,thick,->,line width=2pt] (3) -- (4);
+\path[draw=red,thick,->,line width=2pt] (4) -- (5);
+\path[draw=red,thick,->,line width=2pt] (5) -- (2);
+\end{tikzpicture}
+\end{center}
+
+\subsubsection{Implementation}
+
+The advantage of the
+Floyd–Warshall algorithm is that it is
+easy to implement.
+The following code constructs a
+distance matrix where $\texttt{distance}[a][b]$
+is the shortest distance between nodes $a$ and $b$.
+First, the algorithm initializes \texttt{distance}
+using the adjacency matrix \texttt{adj} of the graph:
+
+\begin{lstlisting}
+for (int i = 1; i <= n; i++) {
+    for (int j = 1; j <= n; j++) {
+        if (i == j) distance[i][j] = 0;
+        else if (adj[i][j]) distance[i][j] = adj[i][j];
+        else distance[i][j] = INF;
+    }
+}
+\end{lstlisting}
+After this, the shortest distances can be found as follows:
+\begin{lstlisting}
+for (int k = 1; k <= n; k++) {
+    for (int i = 1; i <= n; i++) {
+        for (int j = 1; j <= n; j++) {
+            distance[i][j] = min(distance[i][j],
+                                   distance[i][k]+distance[k][j]);
+        }
+    }
+}
+\end{lstlisting}
+
+The time complexity of the algorithm is $O(n^3)$,
+because it contains three nested loops
+that go through the nodes of the graph.
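+
+As a minimal sketch, the two code fragments above can be combined into
+one self-contained function and applied to the example graph of this section.
+The names \texttt{Matrix}, \texttt{floydWarshall} and \texttt{exampleGraph}
+are illustrative assumptions, and \texttt{INF} is chosen here so that
+the sum of two infinite distances still fits in a \texttt{long long}.
+
```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using Matrix = std::vector<std::vector<long long>>;

// Sentinel for "no path". INF + INF still fits in a long long,
// so the sum in the innermost loop cannot overflow.
const long long INF = 1000000000000000000LL;

// adj[a][b] is the weight of the edge between a and b, or 0 if there is
// no such edge. Nodes are numbered 1..n; row and column 0 are unused.
Matrix floydWarshall(const Matrix& adj) {
    assert(!adj.empty());
    int n = (int)adj.size() - 1;
    Matrix dist(n + 1, std::vector<long long>(n + 1, INF));

    // Initialization: 0 on the diagonal, edge weights where edges exist.
    for (int i = 1; i <= n; i++) {
        for (int j = 1; j <= n; j++) {
            if (i == j) dist[i][j] = 0;
            else if (adj[i][j]) dist[i][j] = adj[i][j];
        }
    }

    // Round k allows node k as an intermediate node.
    for (int k = 1; k <= n; k++) {
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= n; j++) {
                dist[i][j] = std::min(dist[i][j], dist[i][k] + dist[k][j]);
            }
        }
    }
    return dist;
}

// Adjacency matrix of the example graph used in this section.
Matrix exampleGraph() {
    Matrix adj(6, std::vector<long long>(6, 0));
    auto edge = [&](int a, int b, long long w) { adj[a][b] = adj[b][a] = w; };
    edge(1, 2, 5); edge(1, 4, 9); edge(1, 5, 1);
    edge(2, 3, 2); edge(3, 4, 7); edge(4, 5, 2);
    return adj;
}
```
+
+With these definitions, \texttt{floydWarshall(exampleGraph())[2][4]}
+evaluates to 8, which agrees with the final array above.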
+
+Since the implementation of the Floyd–Warshall
+algorithm is simple, the algorithm can be
+a good choice even if only a
+single shortest path in the graph is needed.
+However, the algorithm can only be used when the graph
+is so small that a cubic time complexity is fast enough.
--- a/chapter14.tex
+++ b/chapter14.tex
@ -0,0 +1,609 @@
+\chapter{Tree algorithms}
+
+\index{tree}
+
+A \key{tree} is a connected, acyclic graph
+that consists of $n$ nodes and $n-1$ edges.
+Removing any edge from a tree divides it
+into two components,
+and adding any edge to a tree creates a cycle.
+Moreover, there is always a unique path between any
+two nodes of a tree.
+
+For example, the following tree consists of 8 nodes and 7 edges:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (0,3) {$1$};
+\node[draw, circle] (2) at (2,3) {$4$};
+\node[draw, circle] (3) at (0,1) {$2$};
+\node[draw, circle] (4) at (2,1) {$3$};
+\node[draw, circle] (5) at (4,1) {$7$};
+\node[draw, circle] (6) at (-2,3) {$5$};
+\node[draw, circle] (7) at (-2,1) {$6$};
+\node[draw, circle] (8) at (-4,1) {$8$};
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (3) -- (6);
+\path[draw,thick,-] (3) -- (7);
+\path[draw,thick,-] (7) -- (8);
+\end{tikzpicture}
+\end{center}
+
+\index{leaf}
+
+The \key{leaves} of a tree are the nodes
+with degree 1, i.e., with only one neighbor.
+For example, the leaves of the above tree
+are nodes 3, 5, 7 and 8.
+
+\index{root}
+\index{rooted tree}
+
+In a \key{rooted} tree, one of the nodes
+is appointed the \key{root} of the tree,
+and all other nodes are
+placed underneath the root.
+For example, in the following tree,
+node 1 is the root node.
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (0,3) {$1$};
+\node[draw, circle] (4) at (2,1) {$4$};
+\node[draw, circle] (2) at (-2,1) {$2$};
+\node[draw, circle] (3) at (0,1) {$3$};
+\node[draw, circle] (7) at (2,-1) {$7$};
+\node[draw, circle] (5) at (-3,-1) {$5$};
+\node[draw, circle] (6) at (-1,-1) {$6$};
+\node[draw, circle] (8) at (-1,-3) {$8$};
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (2) -- (6);
+\path[draw,thick,-] (4) -- (7);
+\path[draw,thick,-] (6) -- (8);
+\end{tikzpicture}
+\end{center}
+\index{child}
+\index{parent}
+
+In a rooted tree, the \key{children} of a node
+are its lower neighbors, and the \key{parent} of a node
+is its upper neighbor.
+Each node has exactly one parent,
+except for the root that does not have a parent.
+For example, in the above tree,
+the children of node 2 are nodes 5 and 6,
+and its parent is node 1.
+
+\index{subtree}
+
+The structure of a rooted tree is \emph{recursive}:
+each node of the tree acts as the root of a \key{subtree}
+that contains the node itself and all nodes
+that are in the subtrees of its children.
+For example, in the above tree, the subtree of node 2
+consists of nodes 2, 5, 6 and 8:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (2) at (-2,1) {$2$};
+\node[draw, circle] (5) at (-3,-1) {$5$};
+\node[draw, circle] (6) at (-1,-1) {$6$};
+\node[draw, circle] (8) at (-1,-3) {$8$};
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (2) -- (6);
+\path[draw,thick,-] (6) -- (8);
+\end{tikzpicture}
+\end{center}
+
+\section{Tree traversal}
+
+General graph traversal algorithms
+can be used to traverse the nodes of a tree.
+However, the traversal of a tree is easier to implement than
+that of a general graph, because
+there are no cycles in the tree and it is not
+possible to reach a node from multiple directions.
+
+The typical way to traverse a tree is to start
+a depth-first search at an arbitrary node.
+The following recursive function can be used:
+
+\begin{lstlisting}
+void dfs(int s, int e) {
+    // process node s
+    for (auto u : adj[s]) {
+        if (u != e) dfs(u, s);
+    }
+}
+\end{lstlisting}
+
+The function is given two parameters: the current node $s$
+and the previous node $e$.
+The purpose of the parameter $e$ is to make sure
+that the search only moves to nodes
+that have not been visited yet.
+
+The following function call starts the search
+at node $x$:
+
+\begin{lstlisting}
+dfs(x, 0);
+\end{lstlisting}
+
+In the first call $e=0$, because there is no
+previous node, and the search may
+proceed in any direction in the tree.
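+
+The function above relies on adjacency lists \texttt{adj} that the text
+does not construct. The following sketch shows one possible way to build them
+from an edge list and to record the order in which the nodes are processed.
+The helper \texttt{traverse} and the vector \texttt{visitOrder} are
+hypothetical names, and nodes are assumed to be numbered $1 \ldots n$
+so that 0 can mark the missing previous node.
+
```cpp
#include <cassert>
#include <utility>
#include <vector>

std::vector<std::vector<int>> adj;  // adj[s] lists the neighbors of node s
std::vector<int> visitOrder;        // nodes in the order they are processed

void dfs(int s, int e) {
    visitOrder.push_back(s);        // "process node s": here we only record it
    for (auto u : adj[s]) {
        if (u != e) dfs(u, s);      // never step back to the previous node
    }
}

// Builds the adjacency lists of a tree with nodes 1..n and traverses it
// starting at node x. Node 0 is reserved as the "no previous node" marker.
std::vector<int> traverse(int n,
                          const std::vector<std::pair<int, int>>& edges,
                          int x) {
    assert(1 <= x && x <= n);
    adj.assign(n + 1, std::vector<int>());
    visitOrder.clear();
    for (const auto& edge : edges) {
        adj[edge.first].push_back(edge.second);
        adj[edge.second].push_back(edge.first);
    }
    dfs(x, 0);
    return visitOrder;
}
```
+
+For the eight-node tree at the beginning of this chapter, with the edges
+given in the order 1--4, 1--2, 1--3, 4--7, 2--5, 2--6, 6--8,
+a search started at node 1 processes the nodes in the order
+$[1,4,7,2,5,6,8,3]$. The order depends on the order of the adjacency lists.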
+
+\subsubsection{Dynamic programming}
+
+Dynamic programming can be used to calculate
+some information during a tree traversal.
+Using dynamic programming, we can, for example,
+calculate in $O(n)$ time for each node of a rooted tree the
+number of nodes in its subtree
+or the length of the longest path from the node
+to a leaf.
+
+As an example, let us calculate for each node $s$
+a value $\texttt{count}[s]$: the number of nodes in its subtree.
+The subtree contains the node itself and
+all nodes in the subtrees of its children,
+so we can calculate the number of nodes
+recursively using the following code:
+
+\begin{lstlisting}
+void dfs(int s, int e) {
+    count[s] = 1;
+    for (auto u : adj[s]) {
+        if (u == e) continue;
+        dfs(u, s);
+        count[s] += count[u];
+    }
+}
+\end{lstlisting}
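+
+To try the idea on the rooted example tree of this chapter, the function
+can be wrapped as in the following sketch. The array is called
+\texttt{subtreeSize} here instead of \texttt{count}, which is only a
+naming assumption to avoid a clash with \texttt{std::count}; the helper
+\texttt{subtreeSizes} is likewise hypothetical.
+
```cpp
#include <cassert>
#include <utility>
#include <vector>

std::vector<std::vector<int>> adj;
std::vector<int> subtreeSize;  // plays the role of count[] in the text

void dfs(int s, int e) {
    subtreeSize[s] = 1;                    // the node itself
    for (auto u : adj[s]) {
        if (u == e) continue;              // skip the parent
        dfs(u, s);
        subtreeSize[s] += subtreeSize[u];  // add the finished child subtree
    }
}

// Computes the subtree size of every node of a tree with nodes 1..n
// that is rooted at the given root. Index 0 of the result is unused.
std::vector<int> subtreeSizes(int n,
                              const std::vector<std::pair<int, int>>& edges,
                              int root) {
    assert(1 <= root && root <= n);
    adj.assign(n + 1, std::vector<int>());
    subtreeSize.assign(n + 1, 0);
    for (const auto& edge : edges) {
        adj[edge.first].push_back(edge.second);
        adj[edge.second].push_back(edge.first);
    }
    dfs(root, 0);
    return subtreeSize;
}
```
+
+For the rooted tree shown earlier (root 1), the value for node 2 is 4,
+because its subtree consists of nodes 2, 5, 6 and 8, and the value
+for the root is 8.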
+
+\section{Diameter}
+
+\index{diameter}
+
+The \key{diameter} of a tree
+is the maximum length of a path between two nodes.
+For example, consider the following tree:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (0,3) {$1$};
+\node[draw, circle] (2) at (2,3) {$4$};
+\node[draw, circle] (3) at (0,1) {$2$};
+\node[draw, circle] (4) at (2,1) {$3$};
+\node[draw, circle] (5) at (4,1) {$7$};
+\node[draw, circle] (6) at (-2,3) {$5$};
+\node[draw, circle] (7) at (-2,1) {$6$};
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (3) -- (6);
+\path[draw,thick,-] (3) -- (7);
+\end{tikzpicture}
+\end{center}
+The diameter of this tree is 4,
+which corresponds to the following path:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (0,3) {$1$};
+\node[draw, circle] (2) at (2,3) {$4$};
+\node[draw, circle] (3) at (0,1) {$2$};
+\node[draw, circle] (4) at (2,1) {$3$};
+\node[draw, circle] (5) at (4,1) {$7$};
+\node[draw, circle] (6) at (-2,3) {$5$};
+\node[draw, circle] (7) at (-2,1) {$6$};
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (3) -- (6);
+\path[draw,thick,-] (3) -- (7);
+
+\path[draw,thick,-,color=red,line width=2pt] (7) -- (3);
+\path[draw,thick,-,color=red,line width=2pt] (3) -- (1);
+\path[draw,thick,-,color=red,line width=2pt] (1) -- (2);
+\path[draw,thick,-,color=red,line width=2pt] (2) -- (5);
+\end{tikzpicture}
+\end{center}
+Note that there may be several maximum-length paths.
+In the above path, we could replace node 6 with node 5
+to obtain another path with length 4.
+
+Next we will discuss two $O(n)$ time algorithms
+for calculating the diameter of a tree.
+The first algorithm is based on dynamic programming,
+and the second algorithm uses two depth-first searches.
+
+\subsubsection{Algorithm 1}
+
+A general way to approach many tree problems
+is to first root the tree arbitrarily.
+After this, we can try to solve the problem
+separately for each subtree.
+Our first algorithm for calculating the diameter
+is based on this idea.
+
+An important observation is that every path
+in a rooted tree has a \emph{highest point}:
+the highest node that belongs to the path.
+Thus, we can calculate for each node the length
+of the longest path whose highest point is the node.
+One of those paths corresponds to the diameter of the tree.
+
+For example, in the following tree,
+node 1 is the highest point on the path
+that corresponds to the diameter:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (0,3) {$1$};
+\node[draw, circle] (2) at (2,1) {$4$};
+\node[draw, circle] (3) at (-2,1) {$2$};
+\node[draw, circle] (4) at (0,1) {$3$};
+\node[draw, circle] (5) at (2,-1) {$7$};
+\node[draw, circle] (6) at (-3,-1) {$5$};
+\node[draw, circle] (7) at (-1,-1) {$6$};
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (3) -- (6);
+\path[draw,thick,-] (3) -- (7);
+
+\path[draw,thick,-,color=red,line width=2pt] (7) -- (3);
+\path[draw,thick,-,color=red,line width=2pt] (3) -- (1);
+\path[draw,thick,-,color=red,line width=2pt] (1) -- (2);
+\path[draw,thick,-,color=red,line width=2pt] (2) -- (5);
+\end{tikzpicture}
+\end{center}
+
+We calculate for each node $x$ two values:
+\begin{itemize}
+\item $\texttt{toLeaf}(x)$: the maximum length of a path from $x$ to any leaf
+\item $\texttt{maxLength}(x)$: the maximum length of a path
+whose highest point is $x$
+\end{itemize}
+For example, in the above tree,
+$\texttt{toLeaf}(1)=2$, because there is a path
+$1 \rightarrow 2 \rightarrow 6$,
+and $\texttt{maxLength}(1)=4$,
+because there is a path
+$6 \rightarrow 2 \rightarrow 1 \rightarrow 4 \rightarrow 7$.
+In this case, $\texttt{maxLength}(1)$ equals the diameter.
+
+Dynamic programming can be used to calculate the above
+values for all nodes in $O(n)$ time.
+First, to calculate $\texttt{toLeaf}(x)$,
+we go through the children of $x$,
+choose a child $c$ with maximum $\texttt{toLeaf}(c)$
+and add one to this value.
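+Then, to calculate $\texttt{maxLength}(x)$,
+we choose two distinct children $a$ and $b$
+such that the sum $\texttt{toLeaf}(a)+\texttt{toLeaf}(b)$
+is maximum and add two to this sum.
+If $x$ has only one child $c$, then
+$\texttt{maxLength}(x)=\texttt{toLeaf}(c)+1$,
+and if $x$ is a leaf, both values are $0$.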
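+
+The calculation can be sketched as follows, assuming that the tree is
+rooted at node 1 and that nodes are numbered $1 \ldots n$.
+Instead of searching for the best pair of children afterwards, this sketch
+keeps the two largest values of $\texttt{toLeaf}(c)+1$ while going through
+the children, which is one of several possible ways to do it.
+The function name \texttt{diameter} and the variables \texttt{best1} and
+\texttt{best2} are illustrative.
+
```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

std::vector<std::vector<int>> adj;
std::vector<int> toLeaf, maxLength;

void dfs(int x, int parent) {
    // The two largest values of toLeaf(c) + 1 over the children c of x.
    // A missing child contributes 0, which covers nodes with < 2 children.
    int best1 = 0, best2 = 0;
    for (int c : adj[x]) {
        if (c == parent) continue;
        dfs(c, x);
        int len = toLeaf[c] + 1;
        if (len > best1) { best2 = best1; best1 = len; }
        else if (len > best2) best2 = len;
    }
    toLeaf[x] = best1;             // longest path from x down to a leaf
    maxLength[x] = best1 + best2;  // longest path whose highest point is x
}

// Returns the diameter of a tree with nodes 1..n given as an edge list.
int diameter(int n, const std::vector<std::pair<int, int>>& edges) {
    assert(n >= 1);
    adj.assign(n + 1, std::vector<int>());
    toLeaf.assign(n + 1, 0);
    maxLength.assign(n + 1, 0);
    for (const auto& edge : edges) {
        adj[edge.first].push_back(edge.second);
        adj[edge.second].push_back(edge.first);
    }
    dfs(1, 0);  // root the tree at node 1 (an arbitrary choice)
    return *std::max_element(maxLength.begin() + 1, maxLength.end());
}
```
+
+For the example tree, the sketch yields $\texttt{toLeaf}(1)=2$ and
+$\texttt{maxLength}(1)=4$, and the returned diameter is 4.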
+
+\subsubsection{Algorithm 2}
+
+Another efficient way to calculate the diameter
+of a tree is based on two depth-first searches.
+First, we choose an arbitrary node $a$ in the tree
+and find the farthest node $b$ from $a$.
+Then, we find the farthest node $c$ from $b$.
+The diameter of the tree is the distance between $b$ and $c$.
+
+In the following graph, $a$, $b$ and $c$ could be:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (0,3) {$1$};
+\node[draw, circle] (2) at (2,3) {$4$};
+\node[draw, circle] (3) at (0,1) {$2$};
+\node[draw, circle] (4) at (2,1) {$3$};
+\node[draw, circle] (5) at (4,1) {$7$};
+\node[draw, circle] (6) at (-2,3) {$5$};
+\node[draw, circle] (7) at (-2,1) {$6$};
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (3) -- (6);
+\path[draw,thick,-] (3) -- (7);
+\node[color=red] at (2,1.6) {$a$};
+\node[color=red] at (-2,1.6) {$b$};
+\node[color=red] at (4,1.6) {$c$};
+
+\path[draw,thick,-,color=red,line width=2pt] (7) -- (3);
+\path[draw,thick,-,color=red,line width=2pt] (3) -- (1);
+\path[draw,thick,-,color=red,line width=2pt] (1) -- (2);
+\path[draw,thick,-,color=red,line width=2pt] (2) -- (5);
+\end{tikzpicture}
+\end{center}
+
+This is an elegant method, but why does it work?
+
+It helps to draw the tree differently so that
+the path that corresponds to the diameter
+is horizontal, and all other
+nodes hang from it:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (2,1) {$1$};
+\node[draw, circle] (2) at (4,1) {$4$};
+\node[draw, circle] (3) at (0,1) {$2$};
+\node[draw, circle] (4) at (2,-1) {$3$};
+\node[draw, circle] (5) at (6,1) {$7$};
+\node[draw, circle] (6) at (0,-1) {$5$};
+\node[draw, circle] (7) at (-2,1) {$6$};
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (3) -- (6);
+\path[draw,thick,-] (3) -- (7);
+\node[color=red] at (2,-1.6) {$a$};
+\node[color=red] at (-2,1.6) {$b$};
+\node[color=red] at (6,1.6) {$c$};
+\node[color=red] at (2,1.6) {$x$};
+
+\path[draw,thick,-,color=red,line width=2pt] (7) -- (3);
+\path[draw,thick,-,color=red,line width=2pt] (3) -- (1);
+\path[draw,thick,-,color=red,line width=2pt] (1) -- (2);
+\path[draw,thick,-,color=red,line width=2pt] (2) -- (5);
+\end{tikzpicture}
+\end{center}
+
+Node $x$ indicates the place where the path
+from node $a$ joins the path that corresponds
+to the diameter.
+The farthest node from $a$
+is node $b$, node $c$ or some other node
+that is at least as far from node $x$.
+Thus, this node is always a valid choice for
+an endpoint of a path that corresponds to the diameter.
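+
+A possible implementation of the two searches is sketched below.
+The helper \texttt{farthest} is a hypothetical name; it returns both the
+distance to a farthest node and the node itself, and node 1 is assumed
+to play the role of the arbitrary starting node $a$.
+
```cpp
#include <cassert>
#include <utility>
#include <vector>

std::vector<std::vector<int>> adj;

// Returns {distance, node} for a node that is farthest from s
// when the search is not allowed to step back to node e.
std::pair<int, int> farthest(int s, int e) {
    std::pair<int, int> best = {0, s};
    for (int u : adj[s]) {
        if (u == e) continue;
        std::pair<int, int> cand = farthest(u, s);
        cand.first++;  // account for the edge s-u
        if (cand.first > best.first) best = cand;
    }
    return best;
}

// Returns the diameter of a tree with nodes 1..n given as an edge list.
int diameter(int n, const std::vector<std::pair<int, int>>& edges) {
    assert(n >= 1);
    adj.assign(n + 1, std::vector<int>());
    for (const auto& edge : edges) {
        adj[edge.first].push_back(edge.second);
        adj[edge.second].push_back(edge.first);
    }
    int a = 1;                      // arbitrary starting node
    int b = farthest(a, 0).second;  // first search: farthest node from a
    return farthest(b, 0).first;    // second search: distance from b to c
}
```
+
+Both searches visit every node once, so the total running time is $O(n)$.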
+
+\section{All longest paths}
+
+Our next problem is to calculate for every node
+in the tree the maximum length of a path
+that begins at the node.
+This can be seen as a generalization of the
+tree diameter problem, because the largest of those
+lengths equals the diameter of the tree.
+This problem can also be solved in $O(n)$ time.
+
+As an example, consider the following tree:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (0,0) {$1$};
+\node[draw, circle] (2) at (-1.5,-1) {$4$};
+\node[draw, circle] (3) at (2,0) {$2$};
+\node[draw, circle] (4) at (-1.5,1) {$3$};
+\node[draw, circle] (6) at (3.5,-1) {$6$};
+\node[draw, circle] (7) at (3.5,1) {$5$};
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (6);
+\path[draw,thick,-] (3) -- (7);
+\end{tikzpicture}
+\end{center}
+
+Let $\texttt{maxLength}(x)$ denote the maximum length
+of a path that begins at node $x$.
+For example, in the above tree,
+$\texttt{maxLength}(4)=3$, because there
+is a path $4 \rightarrow 1 \rightarrow 2 \rightarrow 6$.
+Here is a complete table of the values:
+\begin{center}
+\begin{tabular}{l|llllll}
+node $x$ & 1 & 2 & 3 & 4 & 5 & 6 \\
+$\texttt{maxLength}(x)$ & 2 & 2 & 3 & 3 & 3 & 3 \\
+\end{tabular}
+\end{center}
+
+As before, a good starting point
+is to root the tree arbitrarily:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (0,3) {$1$};
+\node[draw, circle] (2) at (2,1) {$4$};
+\node[draw, circle] (3) at (-2,1) {$2$};
+\node[draw, circle] (4) at (0,1) {$3$};
+\node[draw, circle] (6) at (-3,-1) {$5$};
+\node[draw, circle] (7) at (-1,-1) {$6$};
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (6);
+\path[draw,thick,-] (3) -- (7);
+\end{tikzpicture}
+\end{center}
+
+The first part of the problem is to calculate for every node $x$
+the maximum length of a path that begins at $x$ and goes through a child of $x$.
+For example, the longest path from node 1
+goes through its child 2:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (0,3) {$1$};
+\node[draw, circle] (2) at (2,1) {$4$};
+\node[draw, circle] (3) at (-2,1) {$2$};
+\node[draw, circle] (4) at (0,1) {$3$};
+\node[draw, circle] (6) at (-3,-1) {$5$};
+\node[draw, circle] (7) at (-1,-1) {$6$};
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (6);
+\path[draw,thick,-] (3) -- (7);
+
+\path[draw,thick,->,color=red,line width=2pt] (1) -- (3);
+\path[draw,thick,->,color=red,line width=2pt] (3) -- (6);
+\end{tikzpicture}
+\end{center}
+This part is easy to solve in $O(n)$ time, because we can use
+dynamic programming as we have done previously.
+
+Then, the second part of the problem is to calculate
+for every node $x$ the maximum length of a path
+that begins at $x$ and goes through its parent $p$.
+For example, the longest path
+from node 3 goes through its parent 1:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (0,3) {$1$};
+\node[draw, circle] (2) at (2,1) {$4$};
+\node[draw, circle] (3) at (-2,1) {$2$};
+\node[draw, circle] (4) at (0,1) {$3$};
+\node[draw, circle] (6) at (-3,-1) {$5$};
+\node[draw, circle] (7) at (-1,-1) {$6$};
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (6);
+\path[draw,thick,-] (3) -- (7);
+
+\path[draw,thick,->,color=red,line width=2pt] (4) -- (1);
+\path[draw,thick,->,color=red,line width=2pt] (1) -- (3);
+\path[draw,thick,->,color=red,line width=2pt] (3) -- (6);
+\end{tikzpicture}
+\end{center}
+
+At first glance, it seems that we should choose
+the longest path from $p$.
+However, this \emph{does not} always work,
+because the longest path from $p$
+may go through $x$.
+Here is an example of this situation:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (0,3) {$1$};
+\node[draw, circle] (2) at (2,1) {$4$};
+\node[draw, circle] (3) at (-2,1) {$2$};
+\node[draw, circle] (4) at (0,1) {$3$};
+\node[draw, circle] (6) at (-3,-1) {$5$};
+\node[draw, circle] (7) at (-1,-1) {$6$};
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (6);
+\path[draw,thick,-] (3) -- (7);
+
+\path[draw,thick,->,color=red,line width=2pt] (3) -- (1);
+\path[draw,thick,->,color=red,line width=2pt] (1) -- (2);
+\end{tikzpicture}
+\end{center}
+
+Still, we can solve the second part in
+$O(n)$ time by storing \emph{two} maximum lengths
+for each node $x$:
+\begin{itemize}
+\item $\texttt{maxLength}_1(x)$:
+the maximum length of a path from $x$
+\item $\texttt{maxLength}_2(x)$:
+the maximum length of a path from $x$
+in another direction than the first path
+\end{itemize}
+For example, in the above graph,
+$\texttt{maxLength}_1(1)=2$
+using the path $1 \rightarrow 2 \rightarrow 5$,
+and $\texttt{maxLength}_2(1)=1$
+using the path $1 \rightarrow 3$.
+
+Finally, if the path that corresponds to
+$\texttt{maxLength}_1(p)$ goes through $x$,
+we conclude that the maximum length is
+$\texttt{maxLength}_2(p)+1$,
+and otherwise the maximum length is
+$\texttt{maxLength}_1(p)+1$.
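+
+The two parts can be sketched with two depth-first searches,
+assuming that the tree is rooted at node 1.
+The arrays \texttt{len1} and \texttt{len2} stand for
+$\texttt{maxLength}_1$ and $\texttt{maxLength}_2$, and the extra array
+\texttt{dir1} (an assumption of this sketch) remembers the neighbor through
+which the path of length \texttt{len1} leaves the node.
+All names, including \texttt{offer} and \texttt{allLongestPaths},
+are illustrative.
+
```cpp
#include <cassert>
#include <utility>
#include <vector>

std::vector<std::vector<int>> adj;
std::vector<int> len1, len2, dir1;

// Offers node x a path of the given length that leaves x through `via`
// and updates the best and second best lengths (in different directions).
void offer(int x, int len, int via) {
    if (len > len1[x]) { len2[x] = len1[x]; len1[x] = len; dir1[x] = via; }
    else if (len > len2[x]) len2[x] = len;
}

// First part: longest paths that go through a child.
void dfsDown(int x, int p) {
    for (int c : adj[x]) {
        if (c == p) continue;
        dfsDown(c, x);
        offer(x, len1[c] + 1, c);
    }
}

// Second part: longest paths that go through the parent.
void dfsUp(int x, int p) {
    for (int c : adj[x]) {
        if (c == p) continue;
        // If the best path from x goes through c, use the second best one.
        int fromParent = (dir1[x] == c ? len2[x] : len1[x]) + 1;
        offer(c, fromParent, x);
        dfsUp(c, x);
    }
}

// Returns for every node 1..n the maximum length of a path that begins
// at the node. Index 0 of the result is unused.
std::vector<int> allLongestPaths(int n,
                                 const std::vector<std::pair<int, int>>& edges) {
    assert(n >= 1);
    adj.assign(n + 1, std::vector<int>());
    len1.assign(n + 1, 0);
    len2.assign(n + 1, 0);
    dir1.assign(n + 1, 0);
    for (const auto& edge : edges) {
        adj[edge.first].push_back(edge.second);
        adj[edge.second].push_back(edge.first);
    }
    dfsDown(1, 0);
    dfsUp(1, 0);
    return len1;
}
```
+
+For the example tree of this section, the returned values for nodes
+$1,2,\ldots,6$ are $2,2,3,3,3,3$, which matches the table above.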
+
+
+\section{Binary trees}
+
+\index{binary tree}
+
+\begin{samepage}
+A \key{binary tree} is a rooted tree
+where each node has a left and right subtree.
+It is possible that a subtree of a node is empty.
+Thus, every node in a binary tree has
+zero, one or two children.
+
+For example, the following tree is a binary tree:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (0,0) {$1$};
+\node[draw, circle] (2) at (-1.5,-1.5) {$2$};
+\node[draw, circle] (3) at (1.5,-1.5) {$3$};
+\node[draw, circle] (4) at (-3,-3) {$4$};
+\node[draw, circle] (5) at (0,-3) {$5$};
+\node[draw, circle] (6) at (-1.5,-4.5) {$6$};
+\node[draw, circle] (7) at (3,-3) {$7$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (2) -- (4);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (5) -- (6);
+\path[draw,thick,-] (3) -- (7);
+\end{tikzpicture}
+\end{center}
+\end{samepage}
+
+\index{pre-order}
+\index{in-order}
+\index{post-order}
+
+The nodes of a binary tree have three natural
+orderings that correspond to different ways to 
+recursively traverse the tree:
+
+\begin{itemize}
+\item \key{pre-order}: first process the root,
+then traverse the left subtree, then traverse the right subtree
+\item \key{in-order}: first traverse the left subtree,
+then process the root, then traverse the right subtree
+\item \key{post-order}: first traverse the left subtree,
+then traverse the right subtree, then process the root
+\end{itemize}
+
+For the above tree, the nodes in
+pre-order are
+$[1,2,4,5,6,3,7]$,
+in in-order $[4,2,6,5,1,3,7]$
+and in post-order $[4,6,5,2,7,3,1]$.
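+
+The three orderings can be produced by three nearly identical recursive
+functions, as in the following sketch. The representation is an assumption:
+\texttt{leftChild[x]} and \texttt{rightChild[x]} store the children of
+node $x$, and the value 0 denotes an empty subtree.
+
```cpp
#include <cassert>
#include <vector>

// leftChild[x] and rightChild[x] hold the children of node x;
// 0 means that the corresponding subtree is empty.
std::vector<int> leftChild, rightChild;

void preOrder(int x, std::vector<int>& out) {
    if (x == 0) return;
    out.push_back(x);              // root first
    preOrder(leftChild[x], out);
    preOrder(rightChild[x], out);
}

void inOrder(int x, std::vector<int>& out) {
    if (x == 0) return;
    inOrder(leftChild[x], out);
    out.push_back(x);              // root between the subtrees
    inOrder(rightChild[x], out);
}

void postOrder(int x, std::vector<int>& out) {
    if (x == 0) return;
    postOrder(leftChild[x], out);
    postOrder(rightChild[x], out);
    out.push_back(x);              // root last
}

// The binary tree shown above: node 6 is the left child of node 5,
// and node 7 is the right child of node 3.
void buildExampleTree() {
    //            node: -  1  2  3  4  5  6  7
    leftChild  =       {0, 2, 4, 0, 0, 6, 0, 0};
    rightChild =       {0, 3, 5, 7, 0, 0, 0, 0};
    assert(leftChild.size() == rightChild.size());
}
```
+
+After \texttt{buildExampleTree()}, calling each function with node 1
+and an empty vector yields the three orderings listed above.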
+
+If we know the pre-order and in-order
+of a tree, we can reconstruct the exact structure of the tree.
+For example, the above tree is the only possible tree
+with pre-order $[1,2,4,5,6,3,7]$ and
+in-order $[4,2,6,5,1,3,7]$.
+In a similar way, the post-order and in-order
+also determine the structure of a tree.
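+
+One way to see why the pre-order and in-order suffice is the following
+sketch, which assumes that all node labels are distinct.
+The first node of the pre-order is the root, and its position in the
+in-order splits the remaining nodes into the left and right subtree;
+the same argument is then applied recursively.
+To keep the sketch short, it outputs the post-order of the reconstructed
+tree instead of storing the tree. The names \texttt{build} and
+\texttt{postFromPreIn} are hypothetical.
+
```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

std::vector<int> pre, inord, post;
std::unordered_map<int, int> posIn;  // position of each label in the in-order
int nextPre;                         // next unused position in the pre-order

// Handles the subtree whose in-order is inord[lo..hi]
// and appends its nodes to post in post-order.
void build(int lo, int hi) {
    if (lo > hi) return;             // empty subtree
    int root = pre[nextPre++];       // the root comes first in the pre-order
    int mid = posIn[root];           // it splits the in-order into two parts
    build(lo, mid - 1);              // left subtree
    build(mid + 1, hi);              // right subtree
    post.push_back(root);
}

// Returns the post-order of the unique binary tree that has the given
// pre-order and in-order. Labels are assumed to be distinct.
std::vector<int> postFromPreIn(const std::vector<int>& preOrder,
                               const std::vector<int>& inOrder) {
    assert(preOrder.size() == inOrder.size());
    pre = preOrder;
    inord = inOrder;
    post.clear();
    posIn.clear();
    nextPre = 0;
    for (int i = 0; i < (int)inord.size(); i++) posIn[inord[i]] = i;
    build(0, (int)inord.size() - 1);
    return post;
}
```
+
+For the pre-order $[1,2,4,5,6,3,7]$ and the in-order $[4,2,6,5,1,3,7]$,
+the result is the post-order $[4,6,5,2,7,3,1]$ of the tree above.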
+
+However, the situation is different if we only know
+the pre-order and post-order of a tree.
+In this case, there may be more than one tree
+that matches the orderings.
+For example, in both of the trees
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (0,0) {$1$};
+\node[draw, circle] (2) at (-1.5,-1.5) {$2$};
+\path[draw,thick,-] (1) -- (2);
+
+\node[draw, circle] (1b) at (0+4,0) {$1$};
+\node[draw, circle] (2b) at (1.5+4,-1.5) {$2$};
+\path[draw,thick,-] (1b) -- (2b);
+\end{tikzpicture}
+\end{center}
+the pre-order is $[1,2]$ and the post-order is $[2,1]$,
+but the structures of the trees are different.
+
--- a/chapter15.tex
+++ b/chapter15.tex
@ -0,0 +1,712 @@
+\chapter{Spanning trees}
+
+\index{spanning tree}
+
+A \key{spanning tree} of a graph consists of
+all nodes of the graph and some of the
+edges of the graph so that there is a path
+between any two nodes.
+Like trees in general, spanning trees are
+connected and acyclic.
+Usually there are several ways to construct a spanning tree.
+
+For example, consider the following graph:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1.5,2) {$1$};
+\node[draw, circle] (2) at (3,3) {$2$};
+\node[draw, circle] (3) at (5,3) {$3$};
+\node[draw, circle] (4) at (6.5,2) {$4$};
+\node[draw, circle] (5) at (3,1) {$5$};
+\node[draw, circle] (6) at (5,1) {$6$};
+\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
+\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
+\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
+\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
+\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
+\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
+\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
+\end{tikzpicture}
+\end{center}
+One spanning tree for the graph is as follows:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1.5,2) {$1$};
+\node[draw, circle] (2) at (3,3) {$2$};
+\node[draw, circle] (3) at (5,3) {$3$};
+\node[draw, circle] (4) at (6.5,2) {$4$};
+\node[draw, circle] (5) at (3,1) {$5$};
+\node[draw, circle] (6) at (5,1) {$6$};
+\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
+\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
+\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
+\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
+\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
+\end{tikzpicture}
+\end{center}
+
+The weight of a spanning tree is the sum of its edge weights.
+For example, the weight of the above spanning tree is
+$3+5+9+3+2=22$.
+
+\index{minimum spanning tree}
+
+A \key{minimum spanning tree}
+is a spanning tree whose weight is as small as possible.
+The weight of a minimum spanning tree for the example graph
+is 20, and such a tree can be constructed as follows:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1.5,2) {$1$};
+\node[draw, circle] (2) at (3,3) {$2$};
+\node[draw, circle] (3) at (5,3) {$3$};
+\node[draw, circle] (4) at (6.5,2) {$4$};
+\node[draw, circle] (5) at (3,1) {$5$};
+\node[draw, circle] (6) at (5,1) {$6$};
+
+\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
+\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
+\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
+\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
+\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
+\end{tikzpicture}
+\end{center}
+
+\index{maximum spanning tree}
+
+In a similar way, a \key{maximum spanning tree}
+is a spanning tree whose weight is as large as possible.
+The weight of a maximum spanning tree for the
+example graph is 32:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1.5,2) {$1$};
+\node[draw, circle] (2) at (3,3) {$2$};
+\node[draw, circle] (3) at (5,3) {$3$};
+\node[draw, circle] (4) at (6.5,2) {$4$};
+\node[draw, circle] (5) at (3,1) {$5$};
+\node[draw, circle] (6) at (5,1) {$6$};
+\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
+\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
+\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
+\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
+\end{tikzpicture}
+\end{center}
+
+Note that a graph may have several
+minimum and maximum spanning trees,
+so the trees are not unique.
+
+It turns out that several greedy methods
+can be used to construct minimum and maximum
+spanning trees.
+In this chapter, we discuss two algorithms
+that process
+the edges of the graph ordered by their weights.
+We focus on finding minimum spanning trees,
+but the same algorithms can find
+maximum spanning trees by processing the edges in reverse order.
+
+\section{Kruskal's algorithm}
+
+\index{Kruskal's algorithm}
+
+In \key{Kruskal's algorithm}\footnote{The algorithm was published in 1956
+by J. B. Kruskal \cite{kru56}.}, the initial spanning tree
+only contains the nodes of the graph
+and does not contain any edges.
+Then the algorithm goes through the edges
+ordered by their weights, and always adds an edge
+to the tree if it does not create a cycle.
+
+The algorithm maintains the components
+of the tree.
+Initially, each node of the graph
+belongs to a separate component.
+Whenever an edge is added to the tree,
+two components are joined.
+Finally, all nodes belong to the same component,
+and a minimum spanning tree has been found.
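+
+The description above can be turned into a short sketch.
+The way the components are maintained is an assumption here:
+each component has a leader node, and two nodes belong to the same
+component exactly when they have the same leader.
+The names \texttt{Edge}, \texttt{findLeader} and \texttt{kruskal}
+are illustrative.
+
```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

struct Edge { int a, b, w; };

std::vector<int> leader;  // leader[x] leads towards the leader of x's component

// Returns the leader of the component that contains x.
int findLeader(int x) {
    while (x != leader[x]) {
        leader[x] = leader[leader[x]];  // shorten the chain while walking
        x = leader[x];
    }
    return x;
}

// Returns the edges of a minimum spanning tree of a connected graph
// with nodes 1..n. The edge list is taken by value because it is sorted.
std::vector<Edge> kruskal(int n, std::vector<Edge> edges) {
    assert(n >= 1);
    std::sort(edges.begin(), edges.end(),
              [](const Edge& x, const Edge& y) { return x.w < y.w; });
    leader.resize(n + 1);
    std::iota(leader.begin(), leader.end(), 0);  // every node is its own leader
    std::vector<Edge> tree;
    for (const Edge& e : edges) {
        int ra = findLeader(e.a), rb = findLeader(e.b);
        if (ra == rb) continue;  // same component: the edge would create a cycle
        leader[ra] = rb;         // join the two components
        tree.push_back(e);
    }
    return tree;
}
```
+
+For the example graph of this chapter, the sketch selects five edges
+whose total weight is 20. Edges of equal weight may be processed
+in either order, so the selected tree is not necessarily unique.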
+
+\subsubsection{Example}
+
+\begin{samepage}
+Let us consider how Kruskal's algorithm processes the
+following graph:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1.5,2) {$1$};
+\node[draw, circle] (2) at (3,3) {$2$};
+\node[draw, circle] (3) at (5,3) {$3$};
+\node[draw, circle] (4) at (6.5,2) {$4$};
+\node[draw, circle] (5) at (3,1) {$5$};
+\node[draw, circle] (6) at (5,1) {$6$};
+\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
+\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
+\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
+\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
+\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
+\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
+\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
+\end{tikzpicture}
+\end{center}
+\end{samepage}
+
+\begin{samepage}
+The first step of the algorithm is to sort the
+edges in increasing order of their weights.
+The result is the following list:
+
+\begin{tabular}{ll}
+\\
+edge & weight \\
+\hline
+5--6 & 2 \\
+1--2 & 3 \\
+3--6 & 3 \\
+1--5 & 5 \\
+2--3 & 5 \\
+2--5 & 6 \\
+4--6 & 7 \\
+3--4 & 9 \\
+\\
+\end{tabular}
+\end{samepage}
+
+After this, the algorithm goes through the list
+and adds each edge to the tree if it joins
+two separate components.
+
+Initially, each node is in its own component:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1.5,2) {$1$};
+\node[draw, circle] (2) at (3,3) {$2$};
+\node[draw, circle] (3) at (5,3) {$3$};
+\node[draw, circle] (4) at (6.5,2) {$4$};
+\node[draw, circle] (5) at (3,1) {$5$};
+\node[draw, circle] (6) at (5,1) {$6$};
+%\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
+%\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
+%\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
+%\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
+%\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
+%\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
+%\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
+%\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
+\end{tikzpicture}
+\end{center}
+The first edge to be added to the tree is
+the edge 5--6 that creates a component $\{5,6\}$
+by joining the components $\{5\}$ and $\{6\}$:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1.5,2) {$1$};
+\node[draw, circle] (2) at (3,3) {$2$};
+\node[draw, circle] (3) at (5,3) {$3$};
+\node[draw, circle] (4) at (6.5,2) {$4$};
+\node[draw, circle] (5) at (3,1) {$5$};
+\node[draw, circle] (6) at (5,1) {$6$};
+
+%\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
+%\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
+%\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
+%\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
+\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
+%\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
+%\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
+%\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
+\end{tikzpicture}
+\end{center}
+After this, the edges 1--2, 3--6 and 1--5 are added in a similar way:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1.5,2) {$1$};
+\node[draw, circle] (2) at (3,3) {$2$};
+\node[draw, circle] (3) at (5,3) {$3$};
+\node[draw, circle] (4) at (6.5,2) {$4$};
+\node[draw, circle] (5) at (3,1) {$5$};
+\node[draw, circle] (6) at (5,1) {$6$};
+
+\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
+%\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
+%\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
+\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
+\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
+%\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
+%\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
+\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
+\end{tikzpicture}
+\end{center}
+
+After those steps, most components have been joined
+and there are two components in the tree:
+$\{1,2,3,5,6\}$ and $\{4\}$.
+
+The next edge in the list is the edge 2--3,
+but it will not be included in the tree, because
+nodes 2 and 3 are already in the same component.
+For the same reason, the edge 2--5 will not be included in the tree.
+
+\begin{samepage}
+Finally, the edge 4--6 will be included in the tree:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1.5,2) {$1$};
+\node[draw, circle] (2) at (3,3) {$2$};
+\node[draw, circle] (3) at (5,3) {$3$};
+\node[draw, circle] (4) at (6.5,2) {$4$};
+\node[draw, circle] (5) at (3,1) {$5$};
+\node[draw, circle] (6) at (5,1) {$6$};
+
+\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
+%\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
+%\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
+\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
+\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
+\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
+%\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
+\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
+\end{tikzpicture}
+\end{center}
+\end{samepage}
+
+After this, the algorithm will not add any
+new edges, because the graph is connected
+and there is a path between any two nodes.
+The resulting graph is a minimum spanning tree
+with weight $2+3+3+5+7=20$.
+
+\subsubsection{Why does this work?}
+
+It is a good question why Kruskal's algorithm works.
+Why does the greedy strategy guarantee that we
+will find a minimum spanning tree?
+
+Let us see what happens if the minimum weight edge of
+the graph is \emph{not} included in the spanning tree.
+For example, suppose that a spanning tree
+for the previous graph would not contain the
+minimum weight edge 5--6.
+We do not know the exact structure of such a spanning tree,
+but in any case it has to contain some edges.
+Assume that the tree would be as follows:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1.5,2) {$1$};
+\node[draw, circle] (2) at (3,3) {$2$};
+\node[draw, circle] (3) at (5,3) {$3$};
+\node[draw, circle] (4) at (6.5,2) {$4$};
+\node[draw, circle] (5) at (3,1) {$5$};
+\node[draw, circle] (6) at (5,1) {$6$};
+
+\path[draw,thick,-,dashed] (1) -- (2);
+\path[draw,thick,-,dashed] (2) -- (5);
+\path[draw,thick,-,dashed] (2) -- (3);
+\path[draw,thick,-,dashed] (3) -- (4);
+\path[draw,thick,-,dashed] (4) -- (6);
+\end{tikzpicture}
+\end{center}
+
+However, it is not possible that the above tree
+would be a minimum spanning tree for the graph.
+The reason for this is that adding the minimum weight edge 5--6
+to the tree creates a cycle, and we can then remove another edge
+of that cycle, such as the edge 2--3 of weight 5.
+This produces a spanning tree whose weight is
+\emph{smaller}:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1.5,2) {$1$};
+\node[draw, circle] (2) at (3,3) {$2$};
+\node[draw, circle] (3) at (5,3) {$3$};
+\node[draw, circle] (4) at (6.5,2) {$4$};
+\node[draw, circle] (5) at (3,1) {$5$};
+\node[draw, circle] (6) at (5,1) {$6$};
+
+\path[draw,thick,-,dashed] (1) -- (2);
+\path[draw,thick,-,dashed] (2) -- (5);
+\path[draw,thick,-,dashed] (3) -- (4);
+\path[draw,thick,-,dashed] (4) -- (6);
+\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
+\end{tikzpicture}
+\end{center}
+
+For this reason, it is always optimal
+to include the minimum weight edge
+in the tree to produce a minimum spanning tree.
+Using a similar argument, we can show that it
+is also optimal to add the next edge in weight order
+to the tree, and so on.
+Hence, Kruskal's algorithm works correctly and
+always produces a minimum spanning tree.
+
+\subsubsection{Implementation}
+
+When implementing Kruskal's algorithm,
+it is convenient to use
+the edge list representation of the graph.
+The first phase of the algorithm sorts the
+edges in the list in $O(m \log m)$ time.
+After this, the second phase of the algorithm
+builds the minimum spanning tree as follows:
+
+\begin{lstlisting}
+for (...) {
+  if (!same(a,b)) unite(a,b);
+}
+\end{lstlisting}
+
+The loop goes through the edges in the list
+and always processes an edge $a$--$b$
+where $a$ and $b$ are two nodes.
+Two functions are needed:
+the function \texttt{same} determines
+if $a$ and $b$ are in the same component,
+and the function \texttt{unite}
+joins the components that contain $a$ and $b$.
+
+The problem is how to efficiently implement
+the functions \texttt{same} and \texttt{unite}.
+One possibility is to implement the function
+\texttt{same} as a graph traversal and check if
+we can get from node $a$ to node $b$.
+However, the time complexity of such a function
+would be $O(n+m)$
+and the resulting algorithm would be slow,
+because the function \texttt{same} will be called for each edge in the graph.
+
+We will solve the problem using a union-find structure
+that implements both functions in $O(\log n)$ time.
+Thus, the time complexity of Kruskal's algorithm
+will be $O(m \log n)$ after sorting the edge list.
+
+\section{Union-find structure}
+
+\index{union-find structure}
+
+A \key{union-find structure} maintains
+a collection of sets.
+The sets are disjoint, so no element
+belongs to more than one set.
+Two $O(\log n)$ time operations are supported:
+the \texttt{unite} operation joins two sets,
+and the \texttt{find} operation finds the representative
+of the set that contains a given element\footnote{The structure presented here
+was introduced in 1971 by J. E. Hopcroft and J. D. Ullman \cite{hop71}.
+Later, in 1975, R. E. Tarjan studied a more sophisticated variant
+of the structure \cite{tar75} that is discussed in many algorithm
+textbooks nowadays.}.
+
+\subsubsection{Structure}
+
+In a union-find structure, one element in each set
+is the representative of the set,
+and there is a chain from any other element of the
+set to the representative.
+For example, assume that the sets are
+$\{1,4,7\}$, $\{5\}$ and $\{2,3,6,8\}$:
+\begin{center}
+\begin{tikzpicture}
+\node[draw, circle] (1) at (0,-1) {$1$};
+\node[draw, circle] (2) at (7,0) {$2$};
+\node[draw, circle] (3) at (7,-1.5) {$3$};
+\node[draw, circle] (4) at (1,0) {$4$};
+\node[draw, circle] (5) at (4,0) {$5$};
+\node[draw, circle] (6) at (6,-2.5) {$6$};
+\node[draw, circle] (7) at (2,-1) {$7$};
+\node[draw, circle] (8) at (8,-2.5) {$8$};
+
+\path[draw,thick,->] (1) -- (4);
+\path[draw,thick,->] (7) -- (4);
+
+\path[draw,thick,->] (3) -- (2);
+\path[draw,thick,->] (6) -- (3);
+\path[draw,thick,->] (8) -- (3);
+
+\end{tikzpicture}
+\end{center}
+In this case the representatives
+of the sets are 4, 5 and 2.
+We can find the representative of any element
+by following the chain that begins at the element.
+For example, the element 2 is the representative
+for the element 6, because
+we follow the chain $6 \rightarrow 3 \rightarrow 2$.
+Two elements belong to the same set exactly when
+their representatives are the same.
+
+Two sets can be joined by connecting the
+representative of one set to the
+representative of the other set.
+For example, the sets
+$\{1,4,7\}$ and $\{2,3,6,8\}$
+can be joined as follows:
+\begin{center}
+\begin{tikzpicture}
+\node[draw, circle] (1) at (2,-1) {$1$};
+\node[draw, circle] (2) at (7,0) {$2$};
+\node[draw, circle] (3) at (7,-1.5) {$3$};
+\node[draw, circle] (4) at (3,0) {$4$};
+\node[draw, circle] (6) at (6,-2.5) {$6$};
+\node[draw, circle] (7) at (4,-1) {$7$};
+\node[draw, circle] (8) at (8,-2.5) {$8$};
+
+\path[draw,thick,->] (1) -- (4);
+\path[draw,thick,->] (7) -- (4);
+
+\path[draw,thick,->] (3) -- (2);
+\path[draw,thick,->] (6) -- (3);
+\path[draw,thick,->] (8) -- (3);
+
+\path[draw,thick,->] (4) -- (2);
+\end{tikzpicture}
+\end{center}
+
+The resulting set contains the elements
+$\{1,2,3,4,6,7,8\}$.
+From now on, the element 2 is the representative
+for the entire set, and the old representative 4
+points to the element 2.
+
+The efficiency of the union-find structure depends on
+how the sets are joined.
+It turns out that we can follow a simple strategy:
+always connect the representative of the
+\emph{smaller} set to the representative of the \emph{larger} set
+(or if the sets are of equal size,
+we can make an arbitrary choice).
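+Using this strategy, the length of any chain
+will be $O(\log n)$, because each time the chain
+of an element becomes one step longer,
+the size of the set that contains the element at least doubles.
+Thus, we can find the representative of any element
+efficiently by following the corresponding chain.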
+
+\subsubsection{Implementation}
+
+The union-find structure can be implemented
+using arrays.
+In the following implementation,
+the array \texttt{link} contains for each element
+the next element
+in the chain or the element itself if it is
+a representative,
+and the array \texttt{size} indicates for each representative
+the size of the corresponding set.
+
+Initially, each element belongs to a separate set:
+\begin{lstlisting}
+for (int i = 1; i <= n; i++) link[i] = i;
+for (int i = 1; i <= n; i++) size[i] = 1;
+\end{lstlisting}
+
+The function \texttt{find} returns
+the representative for an element $x$.
+The representative can be found by following
+the chain that begins at $x$.
+
+\begin{lstlisting}
+int find(int x) {
+    while (x != link[x]) x = link[x];
+    return x;
+}
+\end{lstlisting}
+
+The function \texttt{same} checks
+whether elements $a$ and $b$ belong to the same set.
+This can easily be done by using the
+function \texttt{find}:
+
+\begin{lstlisting}
+bool same(int a, int b) {
+    return find(a) == find(b);
+}
+\end{lstlisting}
+
+\begin{samepage}
+The function \texttt{unite} joins the sets
+that contain elements $a$ and $b$
+(the elements have to be in different sets).
+The function first finds the representatives
+of the sets and then connects the smaller
+set to the larger set.
+
+\begin{lstlisting}
+void unite(int a, int b) {
+    a = find(a);
+    b = find(b);
+    if (size[a] < size[b]) swap(a,b);
+    size[a] += size[b];
+    link[b] = a;
+}
+\end{lstlisting}
+\end{samepage}
+
+The time complexity of the function \texttt{find}
+is $O(\log n)$ assuming that the length of each
+chain is $O(\log n)$.
+In this case, the functions \texttt{same} and \texttt{unite}
+also work in $O(\log n)$ time.
+The function \texttt{unite} makes sure that the
+length of each chain is $O(\log n)$ by connecting
+the smaller set to the larger set.
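+
+As a rough sketch of how the pieces fit together,
+the following code combines the above functions
+with the loop of Kruskal's algorithm.
+The names \texttt{Edge}, \texttt{UnionFind} and \texttt{kruskal}
+are only illustrative assumptions, and the arrays are wrapped
+in a structure so that the names \texttt{link} and \texttt{size}
+do not collide with library names.
+The sketch assumes that the nodes are numbered $1,2,\ldots,n$
+and that the graph is connected.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// An undirected edge a--b of weight w (illustrative type).
struct Edge { int a, b, w; };

// Union-find structure with union by size; elements are 1..n.
struct UnionFind {
    std::vector<int> link, size;
    explicit UnionFind(int n) : link(n + 1), size(n + 1, 1) {
        std::iota(link.begin(), link.end(), 0);  // link[i] = i
    }
    int find(int x) {
        while (x != link[x]) x = link[x];
        return x;
    }
    bool same(int a, int b) { return find(a) == find(b); }
    void unite(int a, int b) {
        a = find(a);
        b = find(b);
        if (size[a] < size[b]) std::swap(a, b);
        size[a] += size[b];
        link[b] = a;
    }
};

// Returns the weight of a minimum spanning tree of a connected graph.
long long kruskal(int n, std::vector<Edge> edges) {
    // first phase: sort the edges by weight
    std::sort(edges.begin(), edges.end(),
              [](const Edge& x, const Edge& y) { return x.w < y.w; });
    // second phase: add each edge that joins two components
    UnionFind uf(n);
    long long total = 0;
    int used = 0;
    for (const Edge& e : edges) {
        if (!uf.same(e.a, e.b)) {
            uf.unite(e.a, e.b);
            total += e.w;
            used++;
        }
    }
    assert(used == n - 1);  // holds exactly when the graph is connected
    return total;
}
```

+For the example graph of this chapter, calling \texttt{kruskal}
+with $n=6$ and the eight edges of the graph returns 20,
+which is the weight of the tree found above.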
+
+\section{Prim's algorithm}
+
+\index{Prim's algorithm}
+
+\key{Prim's algorithm}\footnote{The algorithm is
+named after R. C. Prim who published it in 1957 \cite{pri57}.
+However, the same algorithm had already been discovered in 1930
+by V. Jarník.} is an alternative method
+for finding a minimum spanning tree.
+The algorithm first adds an arbitrary node
+to the tree.
+After this, the algorithm always chooses
+a minimum-weight edge that
+adds a new node to the tree.
+Finally, all nodes have been added to the tree
+and a minimum spanning tree has been found.
+
+Prim's algorithm resembles Dijkstra's algorithm.
+The difference is that Dijkstra's algorithm always
+selects an edge that leads to a new node whose distance from the starting
+node is minimum, but Prim's algorithm simply selects
+the minimum weight edge that adds a new node to the tree.
+
+\subsubsection{Example}
+
+Let us consider how Prim's algorithm works
+in the following graph:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1.5,2) {$1$};
+\node[draw, circle] (2) at (3,3) {$2$};
+\node[draw, circle] (3) at (5,3) {$3$};
+\node[draw, circle] (4) at (6.5,2) {$4$};
+\node[draw, circle] (5) at (3,1) {$5$};
+\node[draw, circle] (6) at (5,1) {$6$};
+\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
+\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
+\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
+\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
+\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
+\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
+\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
+
+%\path[draw=red,thick,-,line width=2pt] (5) -- (6);
+\end{tikzpicture}
+\end{center}
+Initially, there are no edges between the nodes:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1.5,2) {$1$};
+\node[draw, circle] (2) at (3,3) {$2$};
+\node[draw, circle] (3) at (5,3) {$3$};
+\node[draw, circle] (4) at (6.5,2) {$4$};
+\node[draw, circle] (5) at (3,1) {$5$};
+\node[draw, circle] (6) at (5,1) {$6$};
+%\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
+%\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
+%\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
+%\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
+%\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
+%\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
+%\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
+%\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
+\end{tikzpicture}
+\end{center}
+An arbitrary node can be the starting node,
+so let us choose node 1.
+First, we add node 2, which is connected by
+an edge of weight 3:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1.5,2) {$1$};
+\node[draw, circle] (2) at (3,3) {$2$};
+\node[draw, circle] (3) at (5,3) {$3$};
+\node[draw, circle] (4) at (6.5,2) {$4$};
+\node[draw, circle] (5) at (3,1) {$5$};
+\node[draw, circle] (6) at (5,1) {$6$};
+\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
+%\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
+%\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
+%\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
+%\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
+%\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
+%\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
+%\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
+\end{tikzpicture}
+\end{center}
+
+After this, there are two edges with weight 5,
+so we can add either node 3 or node 5 to the tree.
+Let us add node 3 first:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1.5,2) {$1$};
+\node[draw, circle] (2) at (3,3) {$2$};
+\node[draw, circle] (3) at (5,3) {$3$};
+\node[draw, circle] (4) at (6.5,2) {$4$};
+\node[draw, circle] (5) at (3,1) {$5$};
+\node[draw, circle] (6) at (5,1) {$6$};
+\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
+\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
+%\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
+%\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
+%\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
+%\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
+%\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
+%\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
+\end{tikzpicture}
+\end{center}
+
+\begin{samepage}
+The process continues until all nodes have been included in the tree:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1.5,2) {$1$};
+\node[draw, circle] (2) at (3,3) {$2$};
+\node[draw, circle] (3) at (5,3) {$3$};
+\node[draw, circle] (4) at (6.5,2) {$4$};
+\node[draw, circle] (5) at (3,1) {$5$};
+\node[draw, circle] (6) at (5,1) {$6$};
+\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
+\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
+%\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
+%\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
+\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
+\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
+%\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
+\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
+\end{tikzpicture}
+\end{center}
+\end{samepage}
+
+\subsubsection{Implementation}
+
+Like Dijkstra's algorithm, Prim's algorithm can be
+efficiently implemented using a priority queue.
+The priority queue should contain all nodes
+that can be connected to the current component using
+a single edge, in increasing order of the weights
+of the corresponding edges.
+
+The time complexity of Prim's algorithm is
+$O(n + m \log m)$, which equals the time complexity
+of Dijkstra's algorithm.
+In practice, Prim's and Kruskal's algorithms
+are both efficient, and the choice of the algorithm
+is a matter of taste.
+Still, most competitive programmers use Kruskal's algorithm.
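+
+The priority queue idea can be sketched as follows.
+This is only one possible way to write it, and the name \texttt{prim}
+and the adjacency list format are assumptions:
+\texttt{adj[a]} contains a pair $(b,w)$ for each edge $a$--$b$ of weight $w$,
+the nodes are numbered $1,2,\ldots,n$, and the graph is connected.
+In this sketch, a node may be in the queue several times,
+and outdated entries are skipped when they are removed from the queue.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Returns the weight of a minimum spanning tree of a connected graph.
// adj[a] holds pairs (b, w): an edge a--b of weight w.
long long prim(int n, const std::vector<std::vector<std::pair<int, int>>>& adj) {
    std::vector<bool> inTree(n + 1, false);
    // min-heap of (edge weight, node that the edge would add)
    std::priority_queue<std::pair<int, int>,
                        std::vector<std::pair<int, int>>,
                        std::greater<std::pair<int, int>>> q;
    q.push({0, 1});  // node 1 is the arbitrary starting node
    long long total = 0;
    int added = 0;
    while (!q.empty()) {
        std::pair<int, int> top = q.top();
        q.pop();
        int w = top.first, a = top.second;
        if (inTree[a]) continue;  // outdated entry
        inTree[a] = true;
        total += w;
        added++;
        for (const auto& e : adj[a]) {
            if (!inTree[e.first]) q.push({e.second, e.first});
        }
    }
    assert(added == n);  // holds exactly when the graph is connected
    return total;
}
```

+For the example graph above, the function returns 20,
+which is the same weight that Kruskal's algorithm produced.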
--- a/chapter16.tex
+++ b/chapter16.tex
@ -0,0 +1,708 @@
+\chapter{Directed graphs}
+
+In this chapter, we focus on two classes of directed graphs:
+\begin{itemize}
+\item \key{Acyclic graphs}:
+There are no cycles in the graph,
+so there is no path from any node to itself\footnote{Directed acyclic
+graphs are sometimes called DAGs.}.
+\item \key{Successor graphs}:
+The outdegree of each node is 1,
+so each node has a unique successor.
+\end{itemize}
+It turns out that in both cases,
+we can design efficient algorithms that are based
+on the special properties of the graphs.
+
+\section{Topological sorting}
+
+\index{topological sorting}
+\index{cycle}
+
+A \key{topological sort} is an ordering
+of the nodes of a directed graph
+such that if there is a path from node $a$ to node $b$,
+then node $a$ appears before node $b$ in the ordering.
+For example, for the graph
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,5) {$1$};
+\node[draw, circle] (2) at (3,5) {$2$};
+\node[draw, circle] (3) at (5,5) {$3$};
+\node[draw, circle] (4) at (1,3) {$4$};
+\node[draw, circle] (5) at (3,3) {$5$};
+\node[draw, circle] (6) at (5,3) {$6$};
+
+\path[draw,thick,->,>=latex] (1) -- (2);
+\path[draw,thick,->,>=latex] (2) -- (3);
+\path[draw,thick,->,>=latex] (4) -- (1);
+\path[draw,thick,->,>=latex] (4) -- (5);
+\path[draw,thick,->,>=latex] (5) -- (2);
+\path[draw,thick,->,>=latex] (5) -- (3);
+\path[draw,thick,->,>=latex] (3) -- (6);
+\end{tikzpicture}
+\end{center}
+one topological sort is
+$[4,1,5,2,3,6]$:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (-6,0) {$1$};
+\node[draw, circle] (2) at (-3,0) {$2$};
+\node[draw, circle] (3) at (-1.5,0) {$3$};
+\node[draw, circle] (4) at (-7.5,0) {$4$};
+\node[draw, circle] (5) at (-4.5,0) {$5$};
+\node[draw, circle] (6) at (-0,0) {$6$};
+
+\path[draw,thick,->,>=latex] (1) edge [bend right=30] (2);
+\path[draw,thick,->,>=latex] (2) -- (3);
+\path[draw,thick,->,>=latex] (4) -- (1);
+\path[draw,thick,->,>=latex] (4) edge [bend left=30] (5);
+\path[draw,thick,->,>=latex] (5) -- (2);
+\path[draw,thick,->,>=latex] (5) edge [bend left=30]  (3);
+\path[draw,thick,->,>=latex] (3) -- (6);
+\end{tikzpicture}
+\end{center}
+
+An acyclic graph always has a topological sort.
+However, if the graph contains a cycle,
+it is not possible to form a topological sort,
+because no node of the cycle can appear
+before the other nodes of the cycle in the ordering.
+It turns out that depth-first search can be used
+to both check if a directed graph contains a cycle
+and, if it does not contain a cycle, to construct a topological sort.
+
+\subsubsection{Algorithm}
+
+The idea is to go through the nodes of the graph
+and always begin a depth-first search at the current node
+if it has not been processed yet.
+During the searches, the nodes have three possible states:
+
+\begin{itemize}
+\item state 0: the node has not been processed (white)
+\item state 1: the node is under processing (light gray)
+\item state 2: the node has been processed (dark gray)
+\end{itemize}
+
+Initially, the state of each node is 0.
+When a search reaches a node for the first time,
+its state becomes 1.
+Finally, after all successors of the node have
+been processed, its state becomes 2.
+
+If the graph contains a cycle, we will find this out
+during the search, because sooner or later
+we will arrive at a node whose state is 1.
+In this case, it is not possible to construct a topological sort.
+
+If the graph does not contain a cycle, we can construct
+a topological sort by 
+adding each node to a list when the state of the node becomes 2.
+This list in reverse order is a topological sort.
+
+\subsubsection{Example 1}
+
+In the example graph, the search first proceeds
+from node 1 to node 6:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle,fill=gray!20] (1) at (1,5) {$1$};
+\node[draw, circle,fill=gray!20] (2) at (3,5) {$2$};
+\node[draw, circle,fill=gray!20] (3) at (5,5) {$3$};
+\node[draw, circle] (4) at (1,3) {$4$};
+\node[draw, circle] (5) at (3,3) {$5$};
+\node[draw, circle,fill=gray!80] (6) at (5,3) {$6$};
+
+\path[draw,thick,->,>=latex] (4) -- (1);
+\path[draw,thick,->,>=latex] (4) -- (5);
+\path[draw,thick,->,>=latex] (5) -- (2);
+\path[draw,thick,->,>=latex] (5) -- (3);
+%\path[draw,thick,->,>=latex] (3) -- (6);
+
+\path[draw=red,thick,->,line width=2pt] (1) -- (2);
+\path[draw=red,thick,->,line width=2pt] (2) -- (3);
+\path[draw=red,thick,->,line width=2pt] (3) -- (6);
+\end{tikzpicture}
+\end{center}
+
+Now node 6 has been processed, so it is added to the list.
+After this, nodes 3, 2 and 1 are also added to the list:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle,fill=gray!80] (1) at (1,5) {$1$};
+\node[draw, circle,fill=gray!80] (2) at (3,5) {$2$};
+\node[draw, circle,fill=gray!80] (3) at (5,5) {$3$};
+\node[draw, circle] (4) at (1,3) {$4$};
+\node[draw, circle] (5) at (3,3) {$5$};
+\node[draw, circle,fill=gray!80] (6) at (5,3) {$6$};
+
+\path[draw,thick,->,>=latex] (1) -- (2);
+\path[draw,thick,->,>=latex] (2) -- (3);
+\path[draw,thick,->,>=latex] (4) -- (1);
+\path[draw,thick,->,>=latex] (4) -- (5);
+\path[draw,thick,->,>=latex] (5) -- (2);
+\path[draw,thick,->,>=latex] (5) -- (3);
+\path[draw,thick,->,>=latex] (3) -- (6);
+\end{tikzpicture}
+\end{center}
+
+At this point, the list is $[6,3,2,1]$.
+The next search begins at node 4:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle,fill=gray!80] (1) at (1,5) {$1$};
+\node[draw, circle,fill=gray!80] (2) at (3,5) {$2$};
+\node[draw, circle,fill=gray!80] (3) at (5,5) {$3$};
+\node[draw, circle,fill=gray!20] (4) at (1,3) {$4$};
+\node[draw, circle,fill=gray!80] (5) at (3,3) {$5$};
+\node[draw, circle,fill=gray!80] (6) at (5,3) {$6$};
+
+\path[draw,thick,->,>=latex] (1) -- (2);
+\path[draw,thick,->,>=latex] (2) -- (3);
+\path[draw,thick,->,>=latex] (4) -- (1);
+%\path[draw,thick,->,>=latex] (4) -- (5);
+\path[draw,thick,->,>=latex] (5) -- (2);
+\path[draw,thick,->,>=latex] (5) -- (3);
+\path[draw,thick,->,>=latex] (3) -- (6);
+
+\path[draw=red,thick,->,line width=2pt] (4) -- (5);
+\end{tikzpicture}
+\end{center}
+
+Thus, the final list is $[6,3,2,1,5,4]$.
+We have processed all nodes, so a topological sort has
+been found.
+The topological sort is the reverse list
+$[4,5,1,2,3,6]$:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (3,0) {$1$};
+\node[draw, circle] (2) at (4.5,0) {$2$};
+\node[draw, circle] (3) at (6,0) {$3$};
+\node[draw, circle] (4) at (0,0) {$4$};
+\node[draw, circle] (5) at (1.5,0) {$5$};
+\node[draw, circle] (6) at (7.5,0) {$6$};
+
+\path[draw,thick,->,>=latex] (1) -- (2);
+\path[draw,thick,->,>=latex] (2) -- (3);
+\path[draw,thick,->,>=latex] (4) edge [bend left=30] (1);
+\path[draw,thick,->,>=latex] (4) -- (5);
+\path[draw,thick,->,>=latex] (5) edge [bend right=30] (2);
+\path[draw,thick,->,>=latex] (5) edge [bend right=40] (3);
+\path[draw,thick,->,>=latex] (3) -- (6);
+\end{tikzpicture}
+\end{center}
+
+Note that a topological sort is not unique,
+and there can be several topological sorts for a graph.
+
+\subsubsection{Example 2}
+
+Let us now consider a graph for which we
+cannot construct a topological sort,
+because the graph contains a cycle:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,5) {$1$};
+\node[draw, circle] (2) at (3,5) {$2$};
+\node[draw, circle] (3) at (5,5) {$3$};
+\node[draw, circle] (4) at (1,3) {$4$};
+\node[draw, circle] (5) at (3,3) {$5$};
+\node[draw, circle] (6) at (5,3) {$6$};
+
+\path[draw,thick,->,>=latex] (1) -- (2);
+\path[draw,thick,->,>=latex] (2) -- (3);
+\path[draw,thick,->,>=latex] (4) -- (1);
+\path[draw,thick,->,>=latex] (4) -- (5);
+\path[draw,thick,->,>=latex] (5) -- (2);
+\path[draw,thick,->,>=latex] (3) -- (5);
+\path[draw,thick,->,>=latex] (3) -- (6);
+\end{tikzpicture}
+\end{center}
+The search proceeds as follows:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle,fill=gray!20] (1) at (1,5) {$1$};
+\node[draw, circle,fill=gray!20] (2) at (3,5) {$2$};
+\node[draw, circle,fill=gray!20] (3) at (5,5) {$3$};
+\node[draw, circle] (4) at (1,3) {$4$};
+\node[draw, circle,fill=gray!20] (5) at (3,3) {$5$};
+\node[draw, circle] (6) at (5,3) {$6$};
+
+\path[draw,thick,->,>=latex] (4) -- (1);
+\path[draw,thick,->,>=latex] (4) -- (5);
+\path[draw,thick,->,>=latex] (3) -- (6);
+
+\path[draw=red,thick,->,line width=2pt] (1) -- (2);
+\path[draw=red,thick,->,line width=2pt] (2) -- (3);
+\path[draw=red,thick,->,line width=2pt] (3) -- (5);
+\path[draw=red,thick,->,line width=2pt] (5) -- (2);
+\end{tikzpicture}
+\end{center}
+The search reaches node 2 whose state is 1,
+which means that the graph contains a cycle.
+In this example, there is a cycle
+$2 \rightarrow 3 \rightarrow 5 \rightarrow 2$.
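+
+The algorithm of this section can be sketched as follows.
+The function names \texttt{dfs} and \texttt{topologicalSort}
+are illustrative assumptions, as is the convention that
+an empty list is returned when the graph contains a cycle.
+The nodes are assumed to be numbered $1,2,\ldots,n$,
+and \texttt{adj[x]} contains the successors of node $x$.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// state[x]: 0 = not processed, 1 = under processing, 2 = processed.
// Returns false if a cycle is found.
bool dfs(int x, const std::vector<std::vector<int>>& adj,
         std::vector<int>& state, std::vector<int>& order) {
    state[x] = 1;
    for (int y : adj[x]) {
        if (state[y] == 1) return false;  // reached a node under processing
        if (state[y] == 0 && !dfs(y, adj, state, order)) return false;
    }
    state[x] = 2;
    order.push_back(x);  // all successors of x have been processed
    return true;
}

// Returns a topological sort of the nodes 1..n,
// or an empty list if the graph contains a cycle.
std::vector<int> topologicalSort(int n, const std::vector<std::vector<int>>& adj) {
    assert((int)adj.size() == n + 1);  // index 0 is unused
    std::vector<int> state(n + 1, 0), order;
    for (int x = 1; x <= n; x++) {
        if (state[x] == 0 && !dfs(x, adj, state, order)) return {};
    }
    std::reverse(order.begin(), order.end());
    return order;
}
```

+If the successors of each node are stored in increasing order,
+the function returns $[4,5,1,2,3,6]$ for the graph of Example 1
+and an empty list for the graph of Example 2.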
+
+\section{Dynamic programming}
+
+If a directed graph is acyclic,
+dynamic programming can be applied to it.
+For example, we can efficiently solve the following
+problems concerning paths from a starting node
+to an ending node:
+
+\begin{itemize}
+\item how many different paths are there?
+\item what is the shortest/longest path?
+\item what is the minimum/maximum number of edges in a path?
+\item which nodes appear in every path?
+\end{itemize}
+
+\subsubsection{Counting the number of paths}
+
+As an example, let us calculate the number of paths
+from node 1 to node 6 in the following graph:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,5) {$1$};
+\node[draw, circle] (2) at (3,5) {$2$};
+\node[draw, circle] (3) at (5,5) {$3$};
+\node[draw, circle] (4) at (1,3) {$4$};
+\node[draw, circle] (5) at (3,3) {$5$};
+\node[draw, circle] (6) at (5,3) {$6$};
+
+\path[draw,thick,->,>=latex] (1) -- (2);
+\path[draw,thick,->,>=latex] (2) -- (3);
+\path[draw,thick,->,>=latex] (1) -- (4);
+\path[draw,thick,->,>=latex] (4) -- (5);
+\path[draw,thick,->,>=latex] (5) -- (2);
+\path[draw,thick,->,>=latex] (5) -- (3);
+\path[draw,thick,->,>=latex] (3) -- (6);
+\end{tikzpicture}
+\end{center}
+There are a total of three such paths:
+\begin{itemize}
+\item $1 \rightarrow 2 \rightarrow 3 \rightarrow 6$
+\item $1 \rightarrow 4 \rightarrow 5 \rightarrow 2 \rightarrow 3 \rightarrow 6$
+\item $1 \rightarrow 4 \rightarrow 5 \rightarrow 3 \rightarrow 6$
+\end{itemize}
+
+Let $\texttt{paths}(x)$ denote the number of paths from
+node 1 to node $x$.
+As a base case, $\texttt{paths}(1)=1$.
+Then, to calculate other values of $\texttt{paths}(x)$,
+we may use the recursion
+\[\texttt{paths}(x) = \texttt{paths}(a_1)+\texttt{paths}(a_2)+\cdots+\texttt{paths}(a_k)\]
+where $a_1,a_2,\ldots,a_k$ are the nodes from which there
+is an edge to $x$.
+Since the graph is acyclic, the values of $\texttt{paths}(x)$
+can be calculated in the order of a topological sort.
+A topological sort for the above graph is as follows:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (0,0) {$1$};
+\node[draw, circle] (2) at (4.5,0) {$2$};
+\node[draw, circle] (3) at (6,0) {$3$};
+\node[draw, circle] (4) at (1.5,0) {$4$};
+\node[draw, circle] (5) at (3,0) {$5$};
+\node[draw, circle] (6) at (7.5,0) {$6$};
+
+\path[draw,thick,->,>=latex] (1) edge [bend left=30] (2);
+\path[draw,thick,->,>=latex] (2) -- (3);
+\path[draw,thick,->,>=latex] (1) -- (4);
+\path[draw,thick,->,>=latex] (4) -- (5);
+\path[draw,thick,->,>=latex] (5) -- (2);
+\path[draw,thick,->,>=latex] (5) edge [bend right=30] (3);
+\path[draw,thick,->,>=latex] (3) -- (6);
+\end{tikzpicture}
+\end{center}
+Hence, the numbers of paths are as follows:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,5) {$1$};
+\node[draw, circle] (2) at (3,5) {$2$};
+\node[draw, circle] (3) at (5,5) {$3$};
+\node[draw, circle] (4) at (1,3) {$4$};
+\node[draw, circle] (5) at (3,3) {$5$};
+\node[draw, circle] (6) at (5,3) {$6$};
+
+\path[draw,thick,->,>=latex] (1) -- (2);
+\path[draw,thick,->,>=latex] (2) -- (3);
+\path[draw,thick,->,>=latex] (1) -- (4);
+\path[draw,thick,->,>=latex] (4) -- (5);
+\path[draw,thick,->,>=latex] (5) -- (2);
+\path[draw,thick,->,>=latex] (5) -- (3);
+\path[draw,thick,->,>=latex] (3) -- (6);
+
+\node[color=red] at (1,2.3) {$1$};
+\node[color=red] at (3,2.3) {$1$};
+\node[color=red] at (5,2.3) {$3$};
+\node[color=red] at (1,5.7) {$1$};
+\node[color=red] at (3,5.7) {$2$};
+\node[color=red] at (5,5.7) {$3$};
+\end{tikzpicture}
+\end{center}
+
+For example, to calculate the value of $\texttt{paths}(3)$,
+we can use the formula $\texttt{paths}(2)+\texttt{paths}(5)$,
+because there are edges from nodes 2 and 5
+to node 3.
+Since $\texttt{paths}(2)=2$ and $\texttt{paths}(5)=1$, we conclude that $\texttt{paths}(3)=3$.
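+
+A minimal sketch of this calculation is shown below.
+It assumes that a topological sort \texttt{order} is already available
+and that \texttt{adj[a]} contains the successors of node $a$;
+the function name \texttt{countPaths} is also an assumption.
+Instead of summing over the incoming edges of each node,
+the sketch goes through the nodes in topological order and
+adds the value of each node to its successors,
+which produces the same values.

```cpp
#include <cassert>
#include <vector>

// Returns paths[x] = the number of paths from node s to node x
// in an acyclic graph whose nodes are 1..n.
// order is a topological sort; adj[a] lists the successors of a.
std::vector<long long> countPaths(int n, int s, const std::vector<int>& order,
                                  const std::vector<std::vector<int>>& adj) {
    assert((int)order.size() == n);
    std::vector<long long> paths(n + 1, 0);
    paths[s] = 1;  // base case
    for (int a : order) {
        // paths[a] is final here, because all predecessors of a
        // appear before a in the topological sort
        for (int b : adj[a]) paths[b] += paths[a];
    }
    return paths;
}
```

+For the above graph, starting node 1 and the topological sort
+$[1,4,5,2,3,6]$, the values for nodes $1,2,\ldots,6$ are
+$1,2,3,1,1,3$.
+Since the number of paths can grow exponentially,
+the values may have to be calculated modulo some number in larger graphs.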
+
+\subsubsection{Extending Dijkstra's algorithm}
+
+\index{Dijkstra's algorithm}
+
+A by-product of Dijkstra's algorithm is a directed, acyclic
+graph that indicates for each node of the original graph
+the possible ways to reach the node using a shortest path
+from the starting node.
+Dynamic programming can be applied to that graph.
+For example, in the graph
+\begin{center}
+\begin{tikzpicture}
+\node[draw, circle] (1) at (0,0) {$1$};
+\node[draw, circle] (2) at (2,0) {$2$};
+\node[draw, circle] (3) at (0,-2) {$3$};
+\node[draw, circle] (4) at (2,-2) {$4$};
+\node[draw, circle] (5) at (4,-1) {$5$};
+
+\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
+\path[draw,thick,-] (1) -- node[font=\small,label=left:5] {} (3);
+\path[draw,thick,-] (2) -- node[font=\small,label=right:4] {} (4);
+\path[draw,thick,-] (2) -- node[font=\small,label=above:8] {} (5);
+\path[draw,thick,-] (3) -- node[font=\small,label=below:2] {} (4);
+\path[draw,thick,-] (4) -- node[font=\small,label=below:1] {} (5);
+\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (3);
+\end{tikzpicture}
+\end{center}
+the shortest paths from node 1 may use the following edges:
+\begin{center}
+\begin{tikzpicture}
+\node[draw, circle] (1) at (0,0) {$1$};
+\node[draw, circle] (2) at (2,0) {$2$};
+\node[draw, circle] (3) at (0,-2) {$3$};
+\node[draw, circle] (4) at (2,-2) {$4$};
+\node[draw, circle] (5) at (4,-1) {$5$};
+
+\path[draw,thick,->] (1) -- node[font=\small,label=above:3] {} (2);
+\path[draw,thick,->] (1) -- node[font=\small,label=left:5] {} (3);
+\path[draw,thick,->] (2) -- node[font=\small,label=right:4] {} (4);
+\path[draw,thick,->] (3) -- node[font=\small,label=below:2] {} (4);
+\path[draw,thick,->] (4) -- node[font=\small,label=below:1] {} (5);
+\path[draw,thick,->] (2) -- node[font=\small,label=above:2] {} (3);
+\end{tikzpicture}
+\end{center}
+
+Now we can, for example, calculate the number of
+shortest paths from node 1 to node 5
+using dynamic programming:
+\begin{center}
+\begin{tikzpicture}
+\node[draw, circle] (1) at (0,0) {$1$};
+\node[draw, circle] (2) at (2,0) {$2$};
+\node[draw, circle] (3) at (0,-2) {$3$};
+\node[draw, circle] (4) at (2,-2) {$4$};
+\node[draw, circle] (5) at (4,-1) {$5$};
+
+\path[draw,thick,->] (1) -- node[font=\small,label=above:3] {} (2);
+\path[draw,thick,->] (1) -- node[font=\small,label=left:5] {} (3);
+\path[draw,thick,->] (2) -- node[font=\small,label=right:4] {} (4);
+\path[draw,thick,->] (3) -- node[font=\small,label=below:2] {} (4);
+\path[draw,thick,->] (4) -- node[font=\small,label=below:1] {} (5);
+\path[draw,thick,->] (2) -- node[font=\small,label=above:2] {} (3);
+
+\node[color=red] at (0,0.7) {$1$};
+\node[color=red] at (2,0.7) {$1$};
+\node[color=red] at (0,-2.7) {$2$};
+\node[color=red] at (2,-2.7) {$3$};
+\node[color=red] at (4,-1.7) {$3$};
+\end{tikzpicture}
+\end{center}
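+
+The counting step can also be folded into Dijkstra's algorithm itself.
+The following is a minimal sketch under stated assumptions, not code from this book:
+the function name \texttt{countShortestPaths} and the adjacency list layout
+are chosen for the example, and all edge weights are assumed to be positive.
+When a node is reached with a strictly smaller distance, its counter is replaced,
+and when it is reached with an equal distance, the counter of the predecessor is added.

```cpp
#include <functional>
#include <queue>
#include <utility>
#include <vector>
using namespace std;

// adj[a] holds pairs (b, w): an edge from node a to node b with weight w.
// Nodes are numbered 1..n (assumption of this sketch); weights must be positive.
// Returns the number of distinct shortest paths from node s to node t.
long long countShortestPaths(int n, const vector<vector<pair<int,int>>>& adj,
                             int s, int t) {
    typedef pair<long long,int> Item;  // (distance, node)
    const long long INF = 1000000000000000000LL;
    vector<long long> dist(n+1, INF), cnt(n+1, 0);
    priority_queue<Item, vector<Item>, greater<Item>> q;
    dist[s] = 0;
    cnt[s] = 1;
    q.push(Item(0, s));
    while (!q.empty()) {
        Item top = q.top();
        q.pop();
        long long d = top.first;
        int a = top.second;
        if (d != dist[a]) continue;  // outdated queue entry
        for (const auto& e : adj[a]) {
            int b = e.first;
            long long w = e.second;
            if (dist[a] + w < dist[b]) {
                // strictly better: all paths to b now go through a
                dist[b] = dist[a] + w;
                cnt[b] = cnt[a];
                q.push(Item(dist[b], b));
            } else if (dist[a] + w == dist[b]) {
                // equally good: the paths through a are additional paths
                cnt[b] += cnt[a];
            }
        }
    }
    return cnt[t];
}
```

+For the example graph above, calling the function with $s=1$ and $t=5$
+returns 3, which agrees with the value shown next to node 5.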
+
+\subsubsection{Representing problems as graphs}
+
+In fact, any dynamic programming problem
+can be represented as a directed, acyclic graph.
+In such a graph, each node corresponds to a dynamic programming state
+and the edges indicate how the states depend on each other.
+
+As an example, consider the problem
+of forming a sum of money $n$
+using coins
+$\{c_1,c_2,\ldots,c_k\}$.
+In this problem, we can construct a graph where
+each node corresponds to a sum of money,
+and the edges show how the coins can be chosen.
+For example, for coins $\{1,3,4\}$ and $n=6$,
+the graph is as follows:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (0) at (0,0) {$0$};
+\node[draw, circle] (1) at (2,0) {$1$};
+\node[draw, circle] (2) at (4,0) {$2$};
+\node[draw, circle] (3) at (6,0) {$3$};
+\node[draw, circle] (4) at (8,0) {$4$};
+\node[draw, circle] (5) at (10,0) {$5$};
+\node[draw, circle] (6) at (12,0) {$6$};
+
+\path[draw,thick,->] (0) -- (1);
+\path[draw,thick,->] (1) -- (2);
+\path[draw,thick,->] (2) -- (3);
+\path[draw,thick,->] (3) -- (4);
+\path[draw,thick,->] (4) -- (5);
+\path[draw,thick,->] (5) -- (6);
+
+\path[draw,thick,->] (0) edge [bend right=30] (3);
+\path[draw,thick,->] (1) edge [bend right=30] (4);
+\path[draw,thick,->] (2) edge [bend right=30] (5);
+\path[draw,thick,->] (3) edge [bend right=30] (6);
+
+\path[draw,thick,->] (0) edge [bend left=30] (4);
+\path[draw,thick,->] (1) edge [bend left=30] (5);
+\path[draw,thick,->] (2) edge [bend left=30] (6);
+\end{tikzpicture}
+\end{center}
+
+Using this representation,
+the shortest path from node 0 to node $n$
+corresponds to a solution with the minimum number of coins,
+and the total number of paths from node 0 to node $n$
+equals the total number of solutions.
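+
+Both quantities can be computed by processing the nodes from left to right.
+The sketch below is only an illustration: the name \texttt{coinPaths}
+and the returned pair are assumptions of the example, and the coin values
+are assumed to be positive.
+Note that each path corresponds to an \emph{ordered} way to choose the coins.

```cpp
#include <algorithm>
#include <utility>
#include <vector>
using namespace std;

// The graph has nodes 0..n and an edge from x-c to x for every coin c.
// Returns {length of the shortest path from 0 to n,
//          number of paths from 0 to n}.
// If node n cannot be reached, the first value is INF and the second is 0.
pair<int,long long> coinPaths(int n, const vector<int>& coins) {
    const int INF = 1000000000;
    vector<int> best(n+1, INF);      // minimum number of coins for each sum
    vector<long long> ways(n+1, 0);  // number of paths to each node
    best[0] = 0;
    ways[0] = 1;
    for (int x = 1; x <= n; x++) {
        for (int c : coins) {
            if (x-c < 0) continue;
            if (best[x-c] != INF) best[x] = min(best[x], best[x-c]+1);
            ways[x] += ways[x-c];
        }
    }
    return make_pair(best[n], ways[n]);
}
```

+For coins $\{1,3,4\}$ and $n=6$, the function returns the pair $(2,9)$:
+the shortest path uses two edges ($0 \to 3 \to 6$),
+and there are 9 paths from node 0 to node 6.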
+
+\section{Successor paths}
+
+\index{successor graph}
+\index{functional graph}
+
+For the rest of the chapter,
+we will focus on \key{successor graphs}.
+In those graphs,
+the outdegree of each node is 1, i.e.,
+exactly one edge starts at each node.
+A successor graph consists of one or more
+components, each of which contains
+one cycle and some paths that lead to it.
+
+Successor graphs are sometimes called
+\key{functional graphs}.
+The reason for this is that any successor graph
+corresponds to a function that defines
+the edges of the graph.
+The parameter for the function is a node of the graph,
+and the function gives the successor of that node.
+
+For example, the function
+\begin{center}
+\begin{tabular}{r|rrrrrrrrr}
+$x$ & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\
+\hline
+$\texttt{succ}(x)$ & 3 & 5 & 7 & 6 & 2 & 2 & 1 & 6 & 3 \\
+\end{tabular}
+\end{center}
+defines the following graph:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (0,0) {$1$};
+\node[draw, circle] (2) at (2,0) {$2$};
+\node[draw, circle] (3) at (-2,0) {$3$};
+\node[draw, circle] (4) at (1,-3) {$4$};
+\node[draw, circle] (5) at (4,0) {$5$};
+\node[draw, circle] (6) at (2,-1.5) {$6$};
+\node[draw, circle] (7) at (-2,-1.5) {$7$};
+\node[draw, circle] (8) at (3,-3) {$8$};
+\node[draw, circle] (9) at (-4,0) {$9$};
+
+\path[draw,thick,->] (1) -- (3);
+\path[draw,thick,->] (2)  edge [bend left=40] (5);
+\path[draw,thick,->] (3) -- (7);
+\path[draw,thick,->] (4) -- (6);
+\path[draw,thick,->] (5)  edge [bend left=40] (2);
+\path[draw,thick,->] (6) -- (2);
+\path[draw,thick,->] (7) -- (1);
+\path[draw,thick,->] (8) -- (6);
+\path[draw,thick,->] (9) -- (3);
+\end{tikzpicture}
+\end{center}
+
+Since each node of a successor graph has a
+unique successor, we can also define a function $\texttt{succ}(x,k)$
+that gives the node that we will reach if
+we begin at node $x$ and walk $k$ steps forward.
+For example, in the above graph $\texttt{succ}(4,6)=2$,
+because we will reach node 2 by walking 6 steps from node 4:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (0,0) {$4$};
+\node[draw, circle] (2) at (1.5,0) {$6$};
+\node[draw, circle] (3) at (3,0) {$2$};
+\node[draw, circle] (4) at (4.5,0) {$5$};
+\node[draw, circle] (5) at (6,0) {$2$};
+\node[draw, circle] (6) at (7.5,0) {$5$};
+\node[draw, circle] (7) at (9,0) {$2$};
+
+\path[draw,thick,->] (1) -- (2);
+\path[draw,thick,->] (2) -- (3);
+\path[draw,thick,->] (3) -- (4);
+\path[draw,thick,->] (4) -- (5);
+\path[draw,thick,->] (5) -- (6);
+\path[draw,thick,->] (6) -- (7);
+\end{tikzpicture}
+\end{center}
+
+A straightforward way to calculate a value of $\texttt{succ}(x,k)$
+is to start at node $x$ and walk $k$ steps forward, which takes $O(k)$ time.
+However, using preprocessing, any value of $\texttt{succ}(x,k)$
+can be calculated in only $O(\log k)$ time.
+
+The idea is to precalculate all values of $\texttt{succ}(x,k)$ where
+$k$ is a power of two and at most $u$, where $u$ is
+the maximum number of steps we will ever walk.
+This can be efficiently done, because
+we can use the following recursion:
+
+\begin{equation*}
+    \texttt{succ}(x,k) = \begin{cases}
+               \texttt{succ}(x)              & k = 1\\
+               \texttt{succ}(\texttt{succ}(x,k/2),k/2)   & k > 1\\
+           \end{cases}
+\end{equation*}
+
+Precalculating the values takes $O(n \log u)$ time,
+because $O(\log u)$ values are calculated for each node.
+In the above graph, the first values are as follows:
+
+\begin{center}
+\begin{tabular}{r|rrrrrrrrr}
+$x$ & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\
+\hline
+$\texttt{succ}(x,1)$ & 3 & 5 & 7 & 6 & 2 & 2 & 1 & 6 & 3 \\
+$\texttt{succ}(x,2)$ & 7 & 2 & 1 & 2 & 5 & 5 & 3 & 2 & 7 \\
+$\texttt{succ}(x,4)$ & 3 & 2 & 7 & 2 & 5 & 5 & 1 & 2 & 3 \\
+$\texttt{succ}(x,8)$ & 7 & 2 & 1 & 2 & 5 & 5 & 3 & 2 & 7 \\
+$\cdots$ \\
+\end{tabular}
+\end{center}
+
+After this, any value of $\texttt{succ}(x,k)$ can be calculated
+by presenting the number of steps $k$ as a sum of powers of two.
+For example, if we want to calculate the value of $\texttt{succ}(x,11)$,
+we first form the representation $11=8+2+1$.
+Using that,
+\[\texttt{succ}(x,11)=\texttt{succ}(\texttt{succ}(\texttt{succ}(x,8),2),1).\]
+For example, in the previous graph
+\[\texttt{succ}(4,11)=\texttt{succ}(\texttt{succ}(\texttt{succ}(4,8),2),1)=5.\]
+
+Such a representation always consists of
+$O(\log k)$ parts, so calculating a value of $\texttt{succ}(x,k)$
+takes $O(\log k)$ time.
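+
+The precalculation and the queries can be sketched as follows.
+This is an illustrative implementation rather than code from this book:
+the structure \texttt{SuccessorTable} and its members are hypothetical names,
+nodes are assumed to be numbered $1,2,\ldots,n$,
+and \texttt{up[j][x]} stores $\texttt{succ}(x,2^j)$.

```cpp
#include <vector>
using namespace std;

struct SuccessorTable {
    // up[j][x] = succ(x, 2^j) for nodes x = 1..n
    vector<vector<int>> up;

    // succ[x] is the successor of node x (index 0 is unused);
    // maxSteps is u, the maximum number of steps we will ever walk.
    SuccessorTable(const vector<int>& succ, long long maxSteps) {
        int levels = 1;
        while ((1LL << levels) <= maxSteps) levels++;
        up.assign(levels, succ);  // level 0 is succ itself
        for (int j = 1; j < levels; j++) {
            for (size_t x = 1; x < succ.size(); x++) {
                // 2^j steps = 2^(j-1) steps followed by 2^(j-1) steps
                up[j][x] = up[j-1][up[j-1][x]];
            }
        }
    }

    // Returns succ(x, k) for 0 <= k <= maxSteps by walking
    // one precalculated jump for each one bit of k.
    int query(int x, long long k) const {
        for (int j = 0; j < (int)up.size(); j++) {
            if (k & (1LL << j)) x = up[j][x];
        }
        return x;
    }
};
```

+With the successor function of the example graph and $u=11$,
+\texttt{query(4,6)} returns 2 and \texttt{query(4,11)} returns 5,
+as calculated above.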
+
+\section{Cycle detection}
+
+\index{cycle}
+\index{cycle detection}
+
+Consider a successor graph that only contains
+a path that ends in a cycle.
+We may ask the following questions:
+if we begin our walk at the starting node,
+what is the first node in the cycle
+and how many nodes does the cycle contain?
+
+For example, in the graph
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (5) at (0,0) {$5$};
+\node[draw, circle] (4) at (-2,0) {$4$};
+\node[draw, circle] (6) at (-1,1.5) {$6$};
+\node[draw, circle] (3) at (-4,0) {$3$};
+\node[draw, circle] (2) at (-6,0) {$2$};
+\node[draw, circle] (1) at (-8,0) {$1$};
+
+\path[draw,thick,->] (1) -- (2);
+\path[draw,thick,->] (2) -- (3);
+\path[draw,thick,->] (3) -- (4);
+\path[draw,thick,->] (4) -- (5);
+\path[draw,thick,->] (5) -- (6);
+\path[draw,thick,->] (6) -- (4);
+\end{tikzpicture}
+\end{center}
+if we begin our walk at node 1,
+the first node that belongs to the cycle is node 4, and the cycle consists
+of three nodes (4, 5 and 6).
+
+A simple way to detect the cycle is to walk in the
+graph and keep track of
+all nodes that have been visited. Once a node is visited
+for the second time, we can conclude
+that the node is the first node in the cycle.
+This method works in $O(n)$ time and also uses
+$O(n)$ memory.
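+
+A minimal sketch of this simple method is shown below.
+The function name \texttt{findCycleSimple} and the array \texttt{seenAt}
+are assumptions of the example; instead of a plain visited flag,
+the array stores the step at which each node was first visited,
+which also gives the length of the cycle.

```cpp
#include <utility>
#include <vector>
using namespace std;

// succ[v] is the successor of node v (nodes 1..n, index 0 unused).
// Walks from node x and returns {first node of the cycle, cycle length}.
pair<int,int> findCycleSimple(const vector<int>& succ, int x) {
    // seenAt[v] = number of steps walked when v was first visited, or -1
    vector<int> seenAt(succ.size(), -1);
    int step = 0;
    while (seenAt[x] == -1) {
        seenAt[x] = step++;
        x = succ[x];
    }
    // x is the first node visited twice, so it is the first cycle node;
    // the steps walked since its first visit form exactly one round.
    return make_pair(x, step - seenAt[x]);
}
```

+For the graph above and starting node 1, the function returns the pair $(4,3)$.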
+
+However, there are better algorithms for cycle detection.
+The time complexity of such algorithms is still $O(n)$,
+but they only use $O(1)$ memory.
+This is an important improvement if $n$ is large.
+Next we will discuss Floyd's algorithm that
+achieves these properties.
+
+\subsubsection{Floyd's algorithm}
+
+\index{Floyd's algorithm}
+
+\key{Floyd's algorithm}\footnote{The idea of the algorithm is mentioned in \cite{knu982}
+and attributed to R. W. Floyd; however, it is not known if Floyd actually
+discovered the algorithm.} walks forward 
+in the graph using two pointers $a$ and $b$.
+Both pointers begin at a node $x$ that
+is the starting node of the graph.
+Then, on each turn, the pointer $a$ walks
+one step forward and the pointer $b$
+walks two steps forward.
+The process continues until
+the pointers meet each other:
+\begin{lstlisting}
+a = succ(x);
+b = succ(succ(x));
+while (a != b) {
+    a = succ(a);
+    b = succ(succ(b));
+}
+\end{lstlisting}
+
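+At this point, the pointer $a$ has walked $k$ steps
+and the pointer $b$ has walked $2k$ steps.
+Since both pointers are at the same node of the cycle
+and $b$ has walked $k$ steps more than $a$,
+the length of the cycle divides $k$.
+Thus, the first node that belongs to the cycle
+can be found by moving the pointer $a$ to node $x$
+and advancing both pointers one step at a time
+until they meet again.
+During this walk, the number of steps between the pointers
+is always a multiple of the cycle length,
+so they meet exactly when $a$ enters the cycle.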
+\begin{lstlisting}
+a = x;
+while (a != b) {
+    a = succ(a);
+    b = succ(b);
+}
+first = a;
+\end{lstlisting}
+
+After this, the length of the cycle
+can be calculated as follows:
+\begin{lstlisting}
+b = succ(a);
+length = 1;
+while (a != b) {
+    b = succ(b);
+    length++;
+}
+\end{lstlisting}
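+
+The three phases can be combined into a single function.
+The version below is a sketch that assumes the successor function
+is stored in an array \texttt{succ}; the function name \texttt{floyd}
+and the returned pair are choices made for this example.

```cpp
#include <utility>
#include <vector>
using namespace std;

// succ[v] is the successor of node v (nodes 1..n, index 0 unused).
// Returns {first node of the cycle, cycle length} for a walk from x.
pair<int,int> floyd(const vector<int>& succ, int x) {
    // Phase 1: a walks one step and b two steps per turn until they meet.
    int a = succ[x];
    int b = succ[succ[x]];
    while (a != b) {
        a = succ[a];
        b = succ[succ[b]];
    }
    // Phase 2: restart a from x; the pointers meet at the first cycle node.
    a = x;
    while (a != b) {
        a = succ[a];
        b = succ[b];
    }
    int first = a;
    // Phase 3: walk once around the cycle to measure its length.
    b = succ[a];
    int length = 1;
    while (a != b) {
        b = succ[b];
        length++;
    }
    return make_pair(first, length);
}
```

+For the example graph of this section and $x=1$, the function returns
+the pair $(4,3)$. Only the two pointers are stored, so the memory use is $O(1)$
+in addition to the successor function itself.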
--- a/chapter17.tex
+++ b/chapter17.tex
@ -0,0 +1,563 @@
+\chapter{Strong connectivity}
+
+\index{strongly connected graph}
+
+In a directed graph,
+the edges can be traversed in one direction only,
+so even if the graph is connected,
+this does not guarantee that there is
+a path from every node to every other node.
+For this reason, it is meaningful to define a new concept
+that requires more than connectivity.
+
+A graph is \key{strongly connected}
+if there is a path from any node to all
+other nodes in the graph.
+For example, in the following picture,
+the left graph is strongly connected
+while the right graph is not.
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,1) {$1$};
+\node[draw, circle] (2) at (3,1) {$2$};
+\node[draw, circle] (3) at (1,-1) {$3$};
+\node[draw, circle] (4) at (3,-1) {$4$};
+
+\path[draw,thick,->] (1) -- (2);
+\path[draw,thick,->] (2) -- (4);
+\path[draw,thick,->] (4) -- (3);
+\path[draw,thick,->] (3) -- (1);
+
+\node[draw, circle] (1b) at (6,1) {$1$};
+\node[draw, circle] (2b) at (8,1) {$2$};
+\node[draw, circle] (3b) at (6,-1) {$3$};
+\node[draw, circle] (4b) at (8,-1) {$4$};
+
+\path[draw,thick,->] (1b) -- (2b);
+\path[draw,thick,->] (2b) -- (4b);
+\path[draw,thick,->] (4b) -- (3b);
+\path[draw,thick,->] (1b) -- (3b);
+\end{tikzpicture}
+\end{center}
+
+The right graph is not strongly connected
+because, for example, there is no path
+from node 2 to node 1.
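+
+One standard way to test the definition, which is not described in this chapter,
+is to perform two depth-first searches from an arbitrary node:
+one in the original graph and one in the graph with all edges reversed.
+The graph is strongly connected exactly when both searches reach every node.
+The sketch below assumes nodes $1,2,\ldots,n$ with $n \ge 1$;
+the function names are hypothetical.

```cpp
#include <utility>
#include <vector>
using namespace std;

// Marks every node that can be reached from node s.
void dfsMark(int s, const vector<vector<int>>& adj, vector<bool>& seen) {
    seen[s] = true;
    for (int u : adj[s]) {
        if (!seen[u]) dfsMark(u, adj, seen);
    }
}

// edges contains pairs (a, b), each meaning a directed edge from a to b.
bool isStronglyConnected(int n, const vector<pair<int,int>>& edges) {
    vector<vector<int>> adj(n+1), radj(n+1);
    for (const auto& e : edges) {
        adj[e.first].push_back(e.second);
        radj[e.second].push_back(e.first);  // reversed edge
    }
    vector<bool> fromStart(n+1, false), toStart(n+1, false);
    dfsMark(1, adj, fromStart);  // nodes reachable from node 1
    dfsMark(1, radj, toStart);   // nodes from which node 1 is reachable
    for (int v = 1; v <= n; v++) {
        if (!fromStart[v] || !toStart[v]) return false;
    }
    return true;
}
```

+For the two graphs above, the function returns \texttt{true} for the left graph
+and \texttt{false} for the right graph.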
+
+\index{strongly connected component}
+\index{component graph}
+
+The \key{strongly connected components}
+of a graph divide the graph into strongly connected
+parts that are as large as possible.
+The strongly connected components form an
+acyclic \key{component graph} that represents
+the deep structure of the original graph.
+
+For example, for the graph
+\begin{center}
+\begin{tikzpicture}[scale=0.9,label distance=-2mm]
+\node[draw, circle] (1) at (-1,1) {$7$};
+\node[draw, circle] (2) at (-3,2) {$3$};
+\node[draw, circle] (4) at (-5,2) {$2$};
+\node[draw, circle] (6) at (-7,2) {$1$};
+\node[draw, circle] (3) at (-3,0) {$6$};
+\node[draw, circle] (5) at (-5,0) {$5$};
+\node[draw, circle] (7) at (-7,0) {$4$};
+
+\path[draw,thick,->] (2) -- (1);
+\path[draw,thick,->] (1) -- (3);
+\path[draw,thick,->] (3) -- (2);
+\path[draw,thick,->] (2) -- (4);
+\path[draw,thick,->] (3) -- (5);
+\path[draw,thick,->] (4) edge [bend left] (6);
+\path[draw,thick,->] (6) edge [bend left] (4);
+\path[draw,thick,->] (4) -- (5);
+\path[draw,thick,->] (5) -- (7);
+\path[draw,thick,->] (6) -- (7);
+\end{tikzpicture}
+\end{center}
+the strongly connected components are as follows:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (-1,1) {$7$};
+\node[draw, circle] (2) at (-3,2) {$3$};
+\node[draw, circle] (4) at (-5,2) {$2$};
+\node[draw, circle] (6) at (-7,2) {$1$};
+\node[draw, circle] (3) at (-3,0) {$6$};
+\node[draw, circle] (5) at (-5,0) {$5$};
+\node[draw, circle] (7) at (-7,0) {$4$};
+
+\path[draw,thick,->] (2) -- (1);
+\path[draw,thick,->] (1) -- (3);
+\path[draw,thick,->] (3) -- (2);
+\path[draw,thick,->] (2) -- (4);
+\path[draw,thick,->] (3) -- (5);
+\path[draw,thick,->] (4) edge [bend left] (6);
+\path[draw,thick,->] (6) edge [bend left] (4);
+\path[draw,thick,->] (4) -- (5);
+\path[draw,thick,->] (5) -- (7);
+\path[draw,thick,->] (6) -- (7);
+
+\draw [red,thick,dashed,line width=2pt] (-0.5,2.5) rectangle (-3.5,-0.5);
+\draw [red,thick,dashed,line width=2pt] (-4.5,2.5) rectangle (-7.5,1.5);
+\draw [red,thick,dashed,line width=2pt] (-4.5,0.5) rectangle (-5.5,-0.5);
+\draw [red,thick,dashed,line width=2pt] (-6.5,0.5) rectangle (-7.5,-0.5);
+\end{tikzpicture}
+\end{center}
+The corresponding component graph is as follows:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (-3,1) {$B$};
+\node[draw, circle] (2) at (-6,2) {$A$};
+\node[draw, circle] (3) at (-5,0) {$D$};
+\node[draw, circle] (4) at (-7,0) {$C$};
+
+\path[draw,thick,->] (1) -- (2);
+\path[draw,thick,->] (1) -- (3);
+\path[draw,thick,->] (2) -- (3);
+\path[draw,thick,->] (2) -- (4);
+\path[draw,thick,->] (3) -- (4);
+\end{tikzpicture}
+\end{center}
+The components are $A=\{1,2\}$,
+$B=\{3,6,7\}$, $C=\{4\}$ and $D=\{5\}$.
+
+A component graph is an acyclic, directed graph,
+so it is easier to process than the original graph.
+Since the graph does not contain cycles,
+we can always construct a topological sort and
+use dynamic programming techniques like those
+presented in Chapter 16.
+
+\section{Kosaraju's algorithm}
+
+\index{Kosaraju's algorithm}
+
+\key{Kosaraju's algorithm}\footnote{According to \cite{aho83},
+S. R. Kosaraju invented this algorithm in 1978
+but did not publish it. In 1981, the same algorithm was rediscovered
+and published by M. Sharir \cite{sha81}.} is an efficient
+method for finding the strongly connected components
+of a directed graph.
+The algorithm performs two depth-first searches:
+the first search constructs a list of nodes
+according to the structure of the graph,
+and the second search forms the strongly connected components.
+
+\subsubsection{Search 1}
+
+The first phase of Kosaraju's algorithm constructs
+a list of nodes in the order in which a
+depth-first search processes them.
+The algorithm goes through the nodes,
+and begins a depth-first search at each 
+unprocessed node.
+Each node will be added to the list
+after it has been processed.
+
+In the example graph, the nodes are processed
+in the following order:
+\begin{center}
+\begin{tikzpicture}[scale=0.9,label distance=-2mm]
+\node[draw, circle] (1) at (-1,1) {$7$};
+\node[draw, circle] (2) at (-3,2) {$3$};
+\node[draw, circle] (4) at (-5,2) {$2$};
+\node[draw, circle] (6) at (-7,2) {$1$};
+\node[draw, circle] (3) at (-3,0) {$6$};
+\node[draw, circle] (5) at (-5,0) {$5$};
+\node[draw, circle] (7) at (-7,0) {$4$};
+
+\node at (-7,2.75) {$1/8$};
+\node at (-5,2.75) {$2/7$};
+\node at (-3,2.75) {$9/14$};
+\node at (-7,-0.75) {$4/5$};
+\node at (-5,-0.75) {$3/6$};
+\node at (-3,-0.75) {$11/12$};
+\node at (-1,1.75) {$10/13$};
+
+\path[draw,thick,->] (2) -- (1);
+\path[draw,thick,->] (1) -- (3);
+\path[draw,thick,->] (3) -- (2);
+\path[draw,thick,->] (2) -- (4);
+\path[draw,thick,->] (3) -- (5);
+\path[draw,thick,->] (4) edge [bend left] (6);
+\path[draw,thick,->] (6) edge [bend left] (4);
+\path[draw,thick,->] (4) -- (5);
+\path[draw,thick,->] (5) -- (7);
+\path[draw,thick,->] (6) -- (7);
+\end{tikzpicture}
+\end{center}
+
+The notation $x/y$ means that
+processing the node started
+at time $x$ and finished at time $y$.
+Thus, the corresponding list is as follows:
+
+\begin{tabular}{ll}
+\\
+node & processing time \\
+\hline
+4 & 5 \\
+5 & 6 \\
+2 & 7 \\
+1 & 8 \\
+6 & 12 \\
+7 & 13 \\
+3 & 14 \\
+\\
+\end{tabular}
+% 
+% In the second phase of the algorithm,
+% the nodes will be processed
+% in reverse order: $[3,7,6,1,2,5,4]$.
+
+\subsubsection{Search 2}
+
+The second phase of the algorithm
+forms the strongly connected components
+of the graph.
+First, the algorithm reverses every
+edge in the graph.
+This guarantees that during the second search,
+we will always find strongly connected
+components that do not have extra nodes.
+
+After reversing the edges,
+the example graph is as follows:
+\begin{center}
+\begin{tikzpicture}[scale=0.9,label distance=-2mm]
+\node[draw, circle] (1) at (-1,1) {$7$};
+\node[draw, circle] (2) at (-3,2) {$3$};
+\node[draw, circle] (4) at (-5,2) {$2$};
+\node[draw, circle] (6) at (-7,2) {$1$};
+\node[draw, circle] (3) at (-3,0) {$6$};
+\node[draw, circle] (5) at (-5,0) {$5$};
+\node[draw, circle] (7) at (-7,0) {$4$};
+
+\path[draw,thick,<-] (2) -- (1);
+\path[draw,thick,<-] (1) -- (3);
+\path[draw,thick,<-] (3) -- (2);
+\path[draw,thick,<-] (2) -- (4);
+\path[draw,thick,<-] (3) -- (5);
+\path[draw,thick,<-] (4) edge [bend left] (6);
+\path[draw,thick,<-] (6) edge [bend left] (4);
+\path[draw,thick,<-] (4) -- (5);
+\path[draw,thick,<-] (5) -- (7);
+\path[draw,thick,<-] (6) -- (7);
+\end{tikzpicture}
+\end{center}
+
+After this, the algorithm goes through
+the list of nodes created by the first search,
+in \emph{reverse} order.
+If a node does not belong to a component,
+the algorithm creates a new component
+and starts a depth-first search
+that adds all new nodes found during the search
+to the new component.
+
+In the example graph, the first component
+begins at node 3:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9,label distance=-2mm]
+\node[draw, circle] (1) at (-1,1) {$7$};
+\node[draw, circle] (2) at (-3,2) {$3$};
+\node[draw, circle] (4) at (-5,2) {$2$};
+\node[draw, circle] (6) at (-7,2) {$1$};
+\node[draw, circle] (3) at (-3,0) {$6$};
+\node[draw, circle] (5) at (-5,0) {$5$};
+\node[draw, circle] (7) at (-7,0) {$4$};
+
+\path[draw,thick,<-] (2) -- (1);
+\path[draw,thick,<-] (1) -- (3);
+\path[draw,thick,<-] (3) -- (2);
+\path[draw,thick,<-] (2) -- (4);
+\path[draw,thick,<-] (3) -- (5);
+\path[draw,thick,<-] (4) edge [bend left] (6);
+\path[draw,thick,<-] (6) edge [bend left] (4);
+\path[draw,thick,<-] (4) -- (5);
+\path[draw,thick,<-] (5) -- (7);
+\path[draw,thick,<-] (6) -- (7);
+
+\draw [red,thick,dashed,line width=2pt] (-0.5,2.5) rectangle (-3.5,-0.5);
+\end{tikzpicture}
+\end{center}
+
+Note that since all edges are reversed,
+the component does not ``leak'' to other parts in the graph.
+
+\begin{samepage}
+The next nodes in the list are nodes 7 and 6,
+but they already belong to a component,
+so the next new component begins at node 1:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9,label distance=-2mm]
+\node[draw, circle] (1) at (-1,1) {$7$};
+\node[draw, circle] (2) at (-3,2) {$3$};
+\node[draw, circle] (4) at (-5,2) {$2$};
+\node[draw, circle] (6) at (-7,2) {$1$};
+\node[draw, circle] (3) at (-3,0) {$6$};
+\node[draw, circle] (5) at (-5,0) {$5$};
+\node[draw, circle] (7) at (-7,0) {$4$};
+
+\path[draw,thick,<-] (2) -- (1);
+\path[draw,thick,<-] (1) -- (3);
+\path[draw,thick,<-] (3) -- (2);
+\path[draw,thick,<-] (2) -- (4);
+\path[draw,thick,<-] (3) -- (5);
+\path[draw,thick,<-] (4) edge [bend left] (6);
+\path[draw,thick,<-] (6) edge [bend left] (4);
+\path[draw,thick,<-] (4) -- (5);
+\path[draw,thick,<-] (5) -- (7);
+\path[draw,thick,<-] (6) -- (7);
+
+\draw [red,thick,dashed,line width=2pt] (-0.5,2.5) rectangle (-3.5,-0.5);
+\draw [red,thick,dashed,line width=2pt] (-4.5,2.5) rectangle (-7.5,1.5);
+%\draw [red,thick,dashed,line width=2pt] (-4.5,0.5) rectangle (-5.5,-0.5);
+%\draw [red,thick,dashed,line width=2pt] (-6.5,0.5) rectangle (-7.5,-0.5);
+\end{tikzpicture}
+\end{center}
+\end{samepage}
+
+\begin{samepage}
+Finally, the algorithm processes nodes 5 and 4
+that create the remaining strongly connected components:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9,label distance=-2mm]
+\node[draw, circle] (1) at (-1,1) {$7$};
+\node[draw, circle] (2) at (-3,2) {$3$};
+\node[draw, circle] (4) at (-5,2) {$2$};
+\node[draw, circle] (6) at (-7,2) {$1$};
+\node[draw, circle] (3) at (-3,0) {$6$};
+\node[draw, circle] (5) at (-5,0) {$5$};
+\node[draw, circle] (7) at (-7,0) {$4$};
+
+\path[draw,thick,<-] (2) -- (1);
+\path[draw,thick,<-] (1) -- (3);
+\path[draw,thick,<-] (3) -- (2);
+\path[draw,thick,<-] (2) -- (4);
+\path[draw,thick,<-] (3) -- (5);
+\path[draw,thick,<-] (4) edge [bend left] (6);
+\path[draw,thick,<-] (6) edge [bend left] (4);
+\path[draw,thick,<-] (4) -- (5);
+\path[draw,thick,<-] (5) -- (7);
+\path[draw,thick,<-] (6) -- (7);
+
+\draw [red,thick,dashed,line width=2pt] (-0.5,2.5) rectangle (-3.5,-0.5);
+\draw [red,thick,dashed,line width=2pt] (-4.5,2.5) rectangle (-7.5,1.5);
+\draw [red,thick,dashed,line width=2pt] (-4.5,0.5) rectangle (-5.5,-0.5);
+\draw [red,thick,dashed,line width=2pt] (-6.5,0.5) rectangle (-7.5,-0.5);
+\end{tikzpicture}
+\end{center}
+\end{samepage}
+
+The time complexity of the algorithm is $O(n+m)$,
+where $n$ is the number of nodes and $m$ is the number of edges,
+because the algorithm
+performs two depth-first searches.
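+
+The two searches can be sketched in code as follows.
+This is one possible implementation, not the book's own:
+the structure \texttt{Kosaraju} and its member names are assumptions,
+nodes are numbered $1,2,\ldots,n$, and both searches are recursive.

```cpp
#include <vector>
using namespace std;

struct Kosaraju {
    int n;
    vector<vector<int>> adj, radj;  // original and reversed graph
    vector<bool> seen;
    vector<int> order;              // nodes in the order they finish (search 1)
    vector<int> comp;               // comp[v] = component of node v, 1..count

    Kosaraju(int nodes) : n(nodes), adj(nodes+1), radj(nodes+1) {}

    void addEdge(int a, int b) {
        adj[a].push_back(b);
        radj[b].push_back(a);
    }

    // Search 1: add each node to the list after it has been processed.
    void dfs1(int s) {
        seen[s] = true;
        for (int u : adj[s]) {
            if (!seen[u]) dfs1(u);
        }
        order.push_back(s);
    }

    // Search 2: collect all unassigned nodes reachable via reversed edges.
    void dfs2(int s, int c) {
        comp[s] = c;
        for (int u : radj[s]) {
            if (comp[u] == 0) dfs2(u, c);
        }
    }

    // Returns the number of strongly connected components.
    int run() {
        seen.assign(n+1, false);
        comp.assign(n+1, 0);
        order.clear();
        for (int v = 1; v <= n; v++) {
            if (!seen[v]) dfs1(v);
        }
        int count = 0;
        for (int i = n-1; i >= 0; i--) {  // the list in reverse order
            int v = order[i];
            if (comp[v] == 0) dfs2(v, ++count);
        }
        return count;
    }
};
```

+For the example graph, \texttt{run} finds four components:
+$\{3,6,7\}$, $\{1,2\}$, $\{5\}$ and $\{4\}$, in this order.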
+
+\section{2SAT problem}
+
+\index{2SAT problem}
+
+Strong connectivity is also linked with the
+\key{2SAT problem}\footnote{The algorithm presented here was
+introduced in \cite{asp79}.
+There is also another well-known linear-time algorithm \cite{eve75}
+that is based on backtracking.}.
+In this problem, we are given a logical formula
+\[
+(a_1 \lor b_1) \land (a_2 \lor b_2) \land \cdots \land (a_m \lor b_m),
+\]
+where each $a_i$ and $b_i$ is either a logical variable
+($x_1,x_2,\ldots,x_n$)
+or a negation of a logical variable
+($\lnot x_1, \lnot x_2, \ldots, \lnot x_n$).
+The symbols ``$\land$'' and ``$\lor$'' denote
+logical operators ``and'' and ``or''.
+Our task is to assign each variable a value
+so that the formula is true, or state
+that this is not possible.
+
+For example, the formula
+\[
+L_1 = (x_2 \lor \lnot x_1) \land
+      (\lnot x_1 \lor \lnot x_2) \land
+      (x_1 \lor x_3) \land
+      (\lnot x_2 \lor \lnot x_3) \land
+      (x_1 \lor x_4)
+\]
+is true when the variables are assigned as follows:
+
+\[
+\begin{cases}
+x_1 = \textrm{false} \\
+x_2 = \textrm{false} \\
+x_3 = \textrm{true} \\
+x_4 = \textrm{true} \\
+\end{cases}
+\]
+
+However, the formula
+\[
+L_2 = (x_1 \lor x_2) \land
+      (x_1 \lor \lnot x_2) \land
+      (\lnot x_1 \lor x_3) \land
+      (\lnot x_1 \lor \lnot x_3)
+\]
+is always false, regardless of how we
+assign the values.
+The reason for this is that we cannot
+choose a value for $x_1$
+without creating a contradiction.
+If $x_1$ is false, both $x_2$ and $\lnot x_2$
+should be true, which is impossible,
+and if $x_1$ is true, both $x_3$ and $\lnot x_3$
+should be true, which is also impossible.
+
+The 2SAT problem can be represented as a graph
+whose nodes correspond to
+variables $x_i$ and negations $\lnot x_i$,
+and edges determine the connections
+between the variables.
+Each pair $(a_i \lor b_i)$ generates two edges:
+$\lnot a_i \to b_i$ and $\lnot b_i \to a_i$.
+This means that if $a_i$ does not hold,
+$b_i$ must hold, and vice versa.
+
+The graph for the formula $L_1$ is:
+\\
+\begin{center}
+\begin{tikzpicture}[scale=1.0,minimum size=2pt]
+\node[draw, circle, inner sep=1.3pt] (1) at (1,2) {$\lnot x_3$};
+\node[draw, circle] (2) at (3,2) {$x_2$};
+\node[draw, circle, inner sep=1.3pt] (3) at (1,0) {$\lnot x_4$};
+\node[draw, circle] (4) at (3,0) {$x_1$};
+\node[draw, circle, inner sep=1.3pt] (5) at (5,2) {$\lnot x_1$};
+\node[draw, circle] (6) at (7,2) {$x_4$};
+\node[draw, circle, inner sep=1.3pt] (7) at (5,0) {$\lnot x_2$};
+\node[draw, circle] (8) at (7,0) {$x_3$};
+ 
+\path[draw,thick,->] (1) -- (4);
+\path[draw,thick,->] (4) -- (2);
+\path[draw,thick,->] (2) -- (1);
+\path[draw,thick,->] (3) -- (4);
+\path[draw,thick,->] (2) -- (5);
+\path[draw,thick,->] (4) -- (7);
+\path[draw,thick,->] (5) -- (6);
+\path[draw,thick,->] (5) -- (8);
+\path[draw,thick,->] (8) -- (7);
+\path[draw,thick,->] (7) -- (5);
+\end{tikzpicture}
+\end{center}
+And the graph for the formula $L_2$ is:
+\\
+\begin{center}
+\begin{tikzpicture}[scale=1.0,minimum size=2pt]
+\node[draw, circle] (1) at (1,2) {$x_3$};
+\node[draw, circle] (2) at (3,2) {$x_2$};
+\node[draw, circle, inner sep=1.3pt] (3) at (5,2) {$\lnot x_2$};
+\node[draw, circle, inner sep=1.3pt] (4) at (7,2) {$\lnot x_3$};
+\node[draw, circle, inner sep=1.3pt] (5) at (4,3.5) {$\lnot x_1$};
+\node[draw, circle] (6) at (4,0.5) {$x_1$};
+
+\path[draw,thick,->] (1) -- (5);
+\path[draw,thick,->] (4) -- (5);
+\path[draw,thick,->] (6) -- (1);
+\path[draw,thick,->] (6) -- (4);
+\path[draw,thick,->] (5) -- (2);
+\path[draw,thick,->] (5) -- (3);
+\path[draw,thick,->] (2) -- (6);
+\path[draw,thick,->] (3) -- (6);
+\end{tikzpicture}
+\end{center}
+
+The structure of the graph tells us whether
+it is possible to assign the values
+of the variables so
+that the formula is true.
+It turns out that this can be done
+exactly when there are no nodes
+$x_i$ and $\lnot x_i$ such that
+both nodes belong to the
+same strongly connected component.
+If there are such nodes,
+the graph contains
+a path from $x_i$ to $\lnot x_i$
+and also a path from $\lnot x_i$ to $x_i$,
+so $x_i$ and $\lnot x_i$ should have the same value,
+which is not possible.
+
+In the graph of the formula $L_1$
+there are no nodes $x_i$ and $\lnot x_i$
+such that both nodes 
+belong to the same strongly connected component,
+so a solution exists.
+In the graph of the formula $L_2$
+all nodes belong to the same strongly connected component,
+so a solution does not exist.
+
+If a solution exists, the values for the variables
+can be found by going through the nodes of the
+component graph in a reverse topological sort order.
+At each step, we process a component 
+that does not contain edges that lead to an
+unprocessed component.
+If the variables in the component
+have not been assigned values,
+their values are chosen so that every node
+in the component becomes true,
+and if they already have values,
+they remain unchanged.
+The process continues until each variable
+has been assigned a value.
+
+The component graph for the formula $L_1$ is as follows:
+\begin{center}
+\begin{tikzpicture}[scale=1.0]
+\node[draw, circle] (1) at (0,0) {$A$};
+\node[draw, circle] (2) at (2,0) {$B$};
+\node[draw, circle] (3) at (4,0) {$C$};
+\node[draw, circle] (4) at (6,0) {$D$};
+
+\path[draw,thick,->] (1) -- (2);
+\path[draw,thick,->] (2) -- (3);
+\path[draw,thick,->] (3) -- (4);
+\end{tikzpicture}
+\end{center}
+
+The components are
+$A = \{\lnot x_4\}$,
+$B = \{x_1, x_2, \lnot x_3\}$,
+$C = \{\lnot x_1, \lnot x_2, x_3\}$ and
+$D = \{x_4\}$.
+When constructing the solution,
+we first process the component $D$
+where $x_4$ becomes true.
+After this, we process the component $C$
+where $x_1$ and $x_2$ become false
+and $x_3$ becomes true.
+All variables have been assigned values,
+so the remaining components $A$ and $B$
+do not change the variables.
+
+Note that this method works, because the
+graph has a special structure:
+if there are paths from node $x_i$ to node $x_j$
+and from node $x_j$ to node $\lnot x_j$,
+then node $x_i$ never becomes true.
+The reason for this is that there is also
+a path from node $\lnot x_j$ to node $\lnot x_i$,
+and both $x_i$ and $x_j$ become false.
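+
+The whole method can be sketched as follows.
+The code is an illustration under several assumptions:
+the structure \texttt{TwoSat} and its members are hypothetical names,
+a literal is given as $i$ for $x_i$ and as $-i$ for $\lnot x_i$,
+and the components are found with Kosaraju's algorithm,
+which numbers them in a topological order of the component graph.
+Hence a node that belongs to a later component than its negation becomes true,
+which corresponds to processing the components in reverse topological order.

```cpp
#include <vector>
using namespace std;

struct TwoSat {
    int n;                          // variables x_1..x_n
    vector<vector<int>> adj, radj;  // implication graph and its reverse
    vector<bool> seen;
    vector<int> order, comp;

    TwoSat(int vars) : n(vars), adj(2*vars+2), radj(2*vars+2) {}

    // Literal i > 0 is x_i (node 2i); literal -i is its negation (node 2i+1).
    int node(int lit) const { return lit > 0 ? 2*lit : 2*(-lit)+1; }

    void addEdge(int u, int v) {
        adj[u].push_back(v);
        radj[v].push_back(u);
    }

    // Clause (a or b) generates the edges (not a -> b) and (not b -> a).
    void addClause(int a, int b) {
        addEdge(node(-a), node(b));
        addEdge(node(-b), node(a));
    }

    void dfs1(int s) {
        seen[s] = true;
        for (int u : adj[s]) if (!seen[u]) dfs1(u);
        order.push_back(s);
    }

    void dfs2(int s, int c) {
        comp[s] = c;
        for (int u : radj[s]) if (comp[u] == 0) dfs2(u, c);
    }

    // Returns false if no solution exists; otherwise value[i] is x_i.
    bool solve(vector<bool>& value) {
        int m = 2*n+2;
        seen.assign(m, false);
        comp.assign(m, 0);
        order.clear();
        for (int v = 2; v < m; v++) if (!seen[v]) dfs1(v);
        int count = 0;
        for (int i = (int)order.size()-1; i >= 0; i--) {
            if (comp[order[i]] == 0) dfs2(order[i], ++count);
        }
        value.assign(n+1, false);
        for (int i = 1; i <= n; i++) {
            if (comp[2*i] == comp[2*i+1]) return false;  // x_i and its negation together
            value[i] = comp[2*i] > comp[2*i+1];          // later component wins
        }
        return true;
    }
};
```

+For the formula $L_1$, the sketch produces the assignment given earlier
+($x_1$ and $x_2$ false, $x_3$ and $x_4$ true),
+and for the formula $L_2$ it reports that no solution exists.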
+
+\index{3SAT problem}
+
+A more difficult problem is the \key{3SAT problem},
+where each part of the formula is of the form
+$(a_i \lor b_i \lor c_i)$.
+This problem is NP-hard, so no efficient algorithm
+for solving the problem is known.
--- a/chapter18.tex
+++ b/chapter18.tex
--- a/chapter19.tex
+++ b/chapter19.tex
@ -0,0 +1,674 @@
+\chapter{Paths and circuits}
+
+This chapter focuses on two types of paths in graphs:
+\begin{itemize}
+\item An \key{Eulerian path} is a path that
+goes through each edge exactly once.
+\item A \key{Hamiltonian path} is a path
+that visits each node exactly once.
+\end{itemize}
+
+While Eulerian and Hamiltonian paths look like
+similar concepts at first glance,
+the computational problems related to them
+are very different.
+It turns out that there is a simple rule that
+determines whether a graph contains an Eulerian path,
+and there is also an efficient algorithm to
+find such a path if it exists.
+In contrast, checking the existence of a Hamiltonian path is an NP-hard
+problem, and no efficient algorithm is known for solving the problem.
+
+\section{Eulerian paths}
+
+\index{Eulerian path}
+
+An \key{Eulerian path}\footnote{L. Euler studied such paths in 1736
+when he solved the famous Königsberg bridge problem.
+This was the birth of graph theory.} is a path
+that goes exactly once through each edge of the graph.
+For example, the graph
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,5) {$1$};
+\node[draw, circle] (2) at (3,5) {$2$};
+\node[draw, circle] (3) at (5,4) {$3$};
+\node[draw, circle] (4) at (1,3) {$4$};
+\node[draw, circle] (5) at (3,3) {$5$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (2) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (5);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (4) -- (5);
+\end{tikzpicture}
+\end{center}
+has an Eulerian path from node 2 to node 5:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,5) {$1$};
+\node[draw, circle] (2) at (3,5) {$2$};
+\node[draw, circle] (3) at (5,4) {$3$};
+\node[draw, circle] (4) at (1,3) {$4$};
+\node[draw, circle] (5) at (3,3) {$5$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (2) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (5);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (4) -- (5);
+
+\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]north:1.}] {} (1);
+\path[draw=red,thick,->,line width=2pt] (1) -- node[font=\small,label={[red]left:2.}] {} (4);
+\path[draw=red,thick,->,line width=2pt] (4) -- node[font=\small,label={[red]south:3.}] {} (5);
+\path[draw=red,thick,->,line width=2pt] (5) -- node[font=\small,label={[red]left:4.}] {} (2);
+\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]north:5.}] {} (3);
+\path[draw=red,thick,->,line width=2pt] (3) -- node[font=\small,label={[red]south:6.}] {} (5);
+\end{tikzpicture}
+\end{center}
+\index{Eulerian circuit}
+An \key{Eulerian circuit}
+is an Eulerian path that starts and ends
+at the same node.
+For example, the graph
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,5) {$1$};
+\node[draw, circle] (2) at (3,5) {$2$};
+\node[draw, circle] (3) at (5,4) {$3$};
+\node[draw, circle] (4) at (1,3) {$4$};
+\node[draw, circle] (5) at (3,3) {$5$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (2) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (5);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (2) -- (4);
+\end{tikzpicture}
+\end{center}
+has an Eulerian circuit that starts and ends at node 1:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,5) {$1$};
+\node[draw, circle] (2) at (3,5) {$2$};
+\node[draw, circle] (3) at (5,4) {$3$};
+\node[draw, circle] (4) at (1,3) {$4$};
+\node[draw, circle] (5) at (3,3) {$5$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (2) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (5);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (2) -- (4);
+
+\path[draw=red,thick,->,line width=2pt] (1) -- node[font=\small,label={[red]left:1.}] {} (4);
+\path[draw=red,thick,->,line width=2pt] (4) -- node[font=\small,label={[red]south:2.}] {} (2);
+\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]right:3.}] {} (5);
+\path[draw=red,thick,->,line width=2pt] (5) -- node[font=\small,label={[red]south:4.}] {} (3);
+\path[draw=red,thick,->,line width=2pt] (3) -- node[font=\small,label={[red]north:5.}] {} (2);
+\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]north:6.}] {} (1);
+\end{tikzpicture}
+\end{center}
+
+\subsubsection{Existence}
+
+The existence of Eulerian paths and circuits
+depends on the degrees of the nodes.
+First, an undirected graph has an Eulerian path 
+exactly when all the edges
+belong to the same connected component and
+\begin{itemize}
+\item the degree of each node is even \emph{or}
+\item the degree of exactly two nodes is odd,
+and the degree of all other nodes is even.
+\end{itemize}
+
+In the first case, each Eulerian path is also an Eulerian circuit.
+In the second case, the odd-degree nodes are the starting
+and ending nodes of an Eulerian path which is not an Eulerian circuit.
+
+\begin{samepage}
+For example, in the graph
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,5) {$1$};
+\node[draw, circle] (2) at (3,5) {$2$};
+\node[draw, circle] (3) at (5,4) {$3$};
+\node[draw, circle] (4) at (1,3) {$4$};
+\node[draw, circle] (5) at (3,3) {$5$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (2) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (5);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (4) -- (5);
+\end{tikzpicture}
+\end{center}
+\end{samepage}
+nodes 1, 3 and 4 have a degree of 2,
+and nodes 2 and 5 have a degree of 3.
+Exactly two nodes have an odd degree,
+so there is an Eulerian path between nodes 2 and 5,
+but the graph does not contain an Eulerian circuit.
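+
+The conditions above translate into a short check.
+The following C++ sketch is an illustration only:
+the function name \texttt{eulerianType}, the edge-list input
+and the return values 0, 1 and 2 are assumptions made for this example.
+It counts the odd-degree nodes and uses a depth-first search to verify
+that all edges belong to the same connected component.
+
```cpp
#include <vector>
#include <utility>
using namespace std;

// Hypothetical helper: returns 2 if the undirected graph has an Eulerian
// circuit, 1 if it only has an Eulerian path, and 0 if it has neither.
// Nodes are numbered 1..n and the graph is given as a list of edges.
int eulerianType(int n, const vector<pair<int,int>>& edges) {
    vector<vector<int>> adj(n+1);
    for (auto e : edges) {
        adj[e.first].push_back(e.second);
        adj[e.second].push_back(e.first);
    }
    // find some node that has at least one edge
    int start = 0;
    for (int i = 1; i <= n; i++) {
        if (!adj[i].empty()) { start = i; break; }
    }
    if (start == 0) return 2; // no edges at all: the empty circuit
    // depth-first search from start using an explicit stack
    vector<bool> visited(n+1, false);
    vector<int> st = {start};
    visited[start] = true;
    while (!st.empty()) {
        int s = st.back(); st.pop_back();
        for (int u : adj[s]) {
            if (!visited[u]) { visited[u] = true; st.push_back(u); }
        }
    }
    int odd = 0;
    for (int i = 1; i <= n; i++) {
        // every node that has an edge must be reachable from start
        if (!adj[i].empty() && !visited[i]) return 0;
        if (adj[i].size() % 2 == 1) odd++;
    }
    if (odd == 0) return 2;
    if (odd == 2) return 1;
    return 0;
}
```
+
+For the graph above, the function returns 1,
+because only nodes 2 and 5 have an odd degree.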
+
+In a directed graph,
+we focus on indegrees and outdegrees
+of the nodes.
+A directed graph contains an Eulerian path
+exactly when all the edges belong to the same
+connected component and
+\begin{itemize}
+\item in each node, the indegree equals the outdegree, \emph{or}
+\item in one node, the indegree is one larger than the outdegree,
+in another node, the outdegree is one larger than the indegree,
+and in all other nodes, the indegree equals the outdegree.
+\end{itemize}
+
+In the first case, each Eulerian path
+is also an Eulerian circuit,
+and in the second case, the graph contains an Eulerian path
+that begins at the node whose outdegree is larger
+and ends at the node whose indegree is larger.
+
+For example, in the graph
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,5) {$1$};
+\node[draw, circle] (2) at (3,5) {$2$};
+\node[draw, circle] (3) at (5,4) {$3$};
+\node[draw, circle] (4) at (1,3) {$4$};
+\node[draw, circle] (5) at (3,3) {$5$};
+
+\path[draw,thick,->,>=latex] (1) -- (2);
+\path[draw,thick,->,>=latex] (2) -- (3);
+\path[draw,thick,->,>=latex] (4) -- (1);
+\path[draw,thick,->,>=latex] (3) -- (5);
+\path[draw,thick,->,>=latex] (2) -- (5);
+\path[draw,thick,->,>=latex] (5) -- (4);
+\end{tikzpicture}
+\end{center}
+nodes 1, 3 and 4 have both indegree 1 and outdegree 1,
+node 2 has indegree 1 and outdegree 2,
+and node 5 has indegree 2 and outdegree 1.
+Hence, the graph contains an Eulerian path
+from node 2 to node 5:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,5) {$1$};
+\node[draw, circle] (2) at (3,5) {$2$};
+\node[draw, circle] (3) at (5,4) {$3$};
+\node[draw, circle] (4) at (1,3) {$4$};
+\node[draw, circle] (5) at (3,3) {$5$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (2) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (5);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (4) -- (5);
+
+\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]north:1.}] {} (3);
+\path[draw=red,thick,->,line width=2pt] (3) -- node[font=\small,label={[red]south:2.}] {} (5);
+\path[draw=red,thick,->,line width=2pt] (5) -- node[font=\small,label={[red]south:3.}] {} (4);
+\path[draw=red,thick,->,line width=2pt] (4) -- node[font=\small,label={[red]left:4.}] {} (1);
+\path[draw=red,thick,->,line width=2pt] (1) -- node[font=\small,label={[red]north:5.}] {} (2);
+\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]left:6.}] {} (5);
+\end{tikzpicture}
+\end{center}
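+
+For a directed graph, the degree conditions can be checked
+by counting the indegree and outdegree of each node.
+The following is a minimal sketch; the function name
+\texttt{eulerianEndpoints} and its return convention are assumptions
+made for this example, and the code assumes that the caller has already
+verified that all edges belong to the same connected component.
+
```cpp
#include <vector>
#include <utility>
using namespace std;

// Hypothetical helper for a directed graph with nodes 1..n.
// Returns {start, finish} if the degrees allow an Eulerian path that is
// not a circuit, {0, 0} if the indegree equals the outdegree in every node,
// and {-1, -1} if the degree conditions fail.
// Assumption: connectivity has been checked separately.
pair<int,int> eulerianEndpoints(int n, const vector<pair<int,int>>& edges) {
    vector<int> indeg(n+1, 0), outdeg(n+1, 0);
    for (auto e : edges) {
        outdeg[e.first]++;
        indeg[e.second]++;
    }
    int start = 0, finish = 0;
    for (int i = 1; i <= n; i++) {
        if (outdeg[i] == indeg[i]) continue;
        if (outdeg[i] == indeg[i]+1 && start == 0) start = i;
        else if (indeg[i] == outdeg[i]+1 && finish == 0) finish = i;
        else return {-1, -1};
    }
    return {start, finish};
}
```
+
+For the directed graph above, the function returns the pair $(2,5)$:
+the path has to begin at node 2 and end at node 5.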
+
+\subsubsection{Hierholzer's algorithm}
+
+\index{Hierholzer's algorithm}
+
+\key{Hierholzer's algorithm}\footnote{The algorithm was published
+in 1873 after Hierholzer's death \cite{hie73}.} is an efficient
+method for constructing
+an Eulerian circuit.
+The algorithm consists of several rounds,
+each of which adds new edges to the circuit.
+Of course, we assume that the graph contains
+an Eulerian circuit; otherwise Hierholzer's
+algorithm cannot find it.
+
+First, the algorithm constructs a circuit that contains
+some (not necessarily all) of the edges of the graph.
+After this, the algorithm extends the circuit
+step by step by adding subcircuits to it.
+The process continues until all edges have been added
+to the circuit.
+
+The algorithm extends the circuit by always finding
+a node $x$ that belongs to the circuit but has
+an outgoing edge that is not included in the circuit.
+The algorithm constructs a new path from node $x$
+that only contains edges that are not yet in the circuit.
+Sooner or later,
+the path will return to node $x$,
+which creates a subcircuit.
+
+If the graph only contains an Eulerian path,
+we can still use Hierholzer's algorithm
+to find it by adding an extra edge to the graph
+and removing the edge after the circuit
+has been constructed.
+For example, in an undirected graph,
+we add the extra edge between the two
+odd-degree nodes.
+
+Next we will see how Hierholzer's algorithm
+constructs an Eulerian circuit for an undirected graph.
+
+\subsubsection{Example}
+
+\begin{samepage}
+Let us consider the following graph:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (3,5) {$1$};
+\node[draw, circle] (2) at (1,3) {$2$};
+\node[draw, circle] (3) at (3,3) {$3$};
+\node[draw, circle] (4) at (5,3) {$4$};
+\node[draw, circle] (5) at (1,1) {$5$};
+\node[draw, circle] (6) at (3,1) {$6$};
+\node[draw, circle] (7) at (5,1) {$7$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (2) -- (3);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (2) -- (6);
+\path[draw,thick,-] (3) -- (4);
+\path[draw,thick,-] (3) -- (6);
+\path[draw,thick,-] (4) -- (7);
+\path[draw,thick,-] (5) -- (6);
+\path[draw,thick,-] (6) -- (7);
+\end{tikzpicture}
+\end{center}
+\end{samepage}
+
+\begin{samepage}
+Suppose that the algorithm first creates a circuit
+that begins at node 1.
+A possible circuit is
+$1 \rightarrow 2 \rightarrow 3 \rightarrow 1$:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (3,5) {$1$};
+\node[draw, circle] (2) at (1,3) {$2$};
+\node[draw, circle] (3) at (3,3) {$3$};
+\node[draw, circle] (4) at (5,3) {$4$};
+\node[draw, circle] (5) at (1,1) {$5$};
+\node[draw, circle] (6) at (3,1) {$6$};
+\node[draw, circle] (7) at (5,1) {$7$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (2) -- (3);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (2) -- (6);
+\path[draw,thick,-] (3) -- (4);
+\path[draw,thick,-] (3) -- (6);
+\path[draw,thick,-] (4) -- (7);
+\path[draw,thick,-] (5) -- (6);
+\path[draw,thick,-] (6) -- (7);
+
+\path[draw=red,thick,->,line width=2pt] (1) -- node[font=\small,label={[red]north:1.}] {} (2);
+\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]north:2.}] {} (3);
+\path[draw=red,thick,->,line width=2pt] (3) -- node[font=\small,label={[red]east:3.}] {} (1);
+\end{tikzpicture}
+\end{center}
+\end{samepage}
+After this, the algorithm adds
+the subcircuit
+$2 \rightarrow 5 \rightarrow 6 \rightarrow 2$
+to the circuit:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (3,5) {$1$};
+\node[draw, circle] (2) at (1,3) {$2$};
+\node[draw, circle] (3) at (3,3) {$3$};
+\node[draw, circle] (4) at (5,3) {$4$};
+\node[draw, circle] (5) at (1,1) {$5$};
+\node[draw, circle] (6) at (3,1) {$6$};
+\node[draw, circle] (7) at (5,1) {$7$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (2) -- (3);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (2) -- (6);
+\path[draw,thick,-] (3) -- (4);
+\path[draw,thick,-] (3) -- (6);
+\path[draw,thick,-] (4) -- (7);
+\path[draw,thick,-] (5) -- (6);
+\path[draw,thick,-] (6) -- (7);
+
+\path[draw=red,thick,->,line width=2pt] (1) -- node[font=\small,label={[red]north:1.}] {} (2);
+\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]west:2.}] {} (5);
+\path[draw=red,thick,->,line width=2pt] (5) -- node[font=\small,label={[red]south:3.}] {} (6);
+\path[draw=red,thick,->,line width=2pt] (6) -- node[font=\small,label={[red]north:4.}] {} (2);
+\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]north:5.}] {} (3);
+\path[draw=red,thick,->,line width=2pt] (3) -- node[font=\small,label={[red]east:6.}] {} (1);
+\end{tikzpicture}
+\end{center}
+Finally, the algorithm adds the subcircuit
+$6 \rightarrow 3 \rightarrow 4 \rightarrow 7 \rightarrow 6$
+to the circuit:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (3,5) {$1$};
+\node[draw, circle] (2) at (1,3) {$2$};
+\node[draw, circle] (3) at (3,3) {$3$};
+\node[draw, circle] (4) at (5,3) {$4$};
+\node[draw, circle] (5) at (1,1) {$5$};
+\node[draw, circle] (6) at (3,1) {$6$};
+\node[draw, circle] (7) at (5,1) {$7$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (2) -- (3);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (2) -- (6);
+\path[draw,thick,-] (3) -- (4);
+\path[draw,thick,-] (3) -- (6);
+\path[draw,thick,-] (4) -- (7);
+\path[draw,thick,-] (5) -- (6);
+\path[draw,thick,-] (6) -- (7);
+
+\path[draw=red,thick,->,line width=2pt] (1) -- node[font=\small,label={[red]north:1.}] {} (2);
+\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]west:2.}] {} (5);
+\path[draw=red,thick,->,line width=2pt] (5) -- node[font=\small,label={[red]south:3.}] {} (6);
+\path[draw=red,thick,->,line width=2pt] (6) -- node[font=\small,label={[red]east:4.}] {} (3);
+\path[draw=red,thick,->,line width=2pt] (3) -- node[font=\small,label={[red]north:5.}] {} (4);
+\path[draw=red,thick,->,line width=2pt] (4) -- node[font=\small,label={[red]east:6.}] {} (7);
+\path[draw=red,thick,->,line width=2pt] (7) -- node[font=\small,label={[red]south:7.}] {} (6);
+\path[draw=red,thick,->,line width=2pt] (6) -- node[font=\small,label={[red]right:8.}] {} (2);
+\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]north:9.}] {} (3);
+\path[draw=red,thick,->,line width=2pt] (3) -- node[font=\small,label={[red]east:10.}] {} (1);
+\end{tikzpicture}
+\end{center}
+Now all edges are included in the circuit,
+so we have successfully constructed an Eulerian circuit.
+
+\section{Hamiltonian paths}
+
+\index{Hamiltonian path}
+
+A \key{Hamiltonian path}
+%\footnote{W. R. Hamilton (1805--1865) was an Irish mathematician.}
+is a path
+that visits each node of the graph exactly once.
+For example, the graph
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,5) {$1$};
+\node[draw, circle] (2) at (3,5) {$2$};
+\node[draw, circle] (3) at (5,4) {$3$};
+\node[draw, circle] (4) at (1,3) {$4$};
+\node[draw, circle] (5) at (3,3) {$5$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (2) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (5);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (4) -- (5);
+\end{tikzpicture}
+\end{center}
+contains a Hamiltonian path from node 1 to node 3:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,5) {$1$};
+\node[draw, circle] (2) at (3,5) {$2$};
+\node[draw, circle] (3) at (5,4) {$3$};
+\node[draw, circle] (4) at (1,3) {$4$};
+\node[draw, circle] (5) at (3,3) {$5$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (2) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (5);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (4) -- (5);
+
+\path[draw=red,thick,->,line width=2pt] (1) -- node[font=\small,label={[red]left:1.}] {} (4);
+\path[draw=red,thick,->,line width=2pt] (4) -- node[font=\small,label={[red]south:2.}] {} (5);
+\path[draw=red,thick,->,line width=2pt] (5) -- node[font=\small,label={[red]left:3.}] {} (2);
+\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]north:4.}] {} (3);
+\end{tikzpicture}
+\end{center}
+
+\index{Hamiltonian circuit}
+
+If a Hamiltonian path begins and ends at the same node,
+it is called a \key{Hamiltonian circuit}.
+The graph above also has a Hamiltonian circuit
+that begins and ends at node 1:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,5) {$1$};
+\node[draw, circle] (2) at (3,5) {$2$};
+\node[draw, circle] (3) at (5,4) {$3$};
+\node[draw, circle] (4) at (1,3) {$4$};
+\node[draw, circle] (5) at (3,3) {$5$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (2) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (5);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (4) -- (5);
+
+\path[draw=red,thick,->,line width=2pt] (1) -- node[font=\small,label={[red]north:1.}] {} (2);
+\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]north:2.}] {} (3);
+\path[draw=red,thick,->,line width=2pt] (3) -- node[font=\small,label={[red]south:3.}] {} (5);
+\path[draw=red,thick,->,line width=2pt] (5) -- node[font=\small,label={[red]south:4.}] {} (4);
+\path[draw=red,thick,->,line width=2pt] (4) -- node[font=\small,label={[red]left:5.}] {} (1);
+\end{tikzpicture}
+\end{center}
+
+\subsubsection{Existence}
+
+No efficient method is known for testing if a graph
+contains a Hamiltonian path, and the problem is NP-hard.
+Still, in some special cases, we can be certain
+that a graph contains a Hamiltonian path.
+
+A simple observation is that if the graph is complete,
+i.e., there is an edge between all pairs of nodes,
+it also contains a Hamiltonian path.
+Stronger results are also known:
+
+\begin{itemize}
+\item
+\index{Dirac's theorem}
+\key{Dirac's theorem}: %\cite{dir52}
+If the degree of each node is at least $n/2$,
+the graph contains a Hamiltonian path.
+\item
+\index{Ore's theorem}
+\key{Ore's theorem}: %\cite{ore60}
+If the sum of degrees of each non-adjacent pair of nodes
+is at least $n$,
+the graph contains a Hamiltonian path.
+\end{itemize}
+
+A common property in these theorems and other results is
+that they guarantee the existence of a Hamiltonian path
+if the graph has \emph{a large number} of edges.
+This makes sense, because the more edges the graph contains,
+the more possibilities there are to construct a Hamiltonian path.
+
+\subsubsection{Construction}
+
+Since no efficient way is known to check if a Hamiltonian
+path exists, no efficient method is known for constructing
+the path either, because otherwise
+we could just try to construct the path and see
+whether it exists.
+
+A simple way to search for a Hamiltonian path is
+to use a backtracking algorithm that goes through all
+possible ways to construct the path.
+The time complexity of such an algorithm is at least $O(n!)$,
+because there are $n!$ different ways to choose the order of $n$ nodes.
+
+A more efficient solution is based on dynamic programming
+(see Chapter 10.5).
+The idea is to calculate values
+of a function $\texttt{possible}(S,x)$,
+where $S$ is a subset of nodes and $x$
+is one of the nodes.
+The function indicates whether there is a Hamiltonian path
+that visits the nodes of $S$ and ends at node $x$.
+It is possible to implement this solution in $O(2^n n^2)$ time.
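+
+The dynamic programming idea can be sketched as follows.
+This is only one possible implementation: the function name
+\texttt{hamiltonianPath}, the adjacency matrix input and the
+numbering of nodes as $0,1,\ldots,n-1$ are assumptions made for
+this example. Each subset $S$ is represented as a bitmask.
+
```cpp
#include <vector>
using namespace std;

// Returns true if the graph contains a Hamiltonian path.
// adj[a][b] is true if there is an edge from node a to node b.
// Assumption: n is small (about 20 at most), because the table
// has 2^n * n entries.
bool hamiltonianPath(int n, const vector<vector<bool>>& adj) {
    // possible[S][x]: is there a path that visits exactly
    // the nodes of S and ends at node x?
    vector<vector<bool>> possible(1<<n, vector<bool>(n, false));
    for (int x = 0; x < n; x++) possible[1<<x][x] = true;
    for (int S = 1; S < (1<<n); S++) {
        for (int x = 0; x < n; x++) {
            if (!possible[S][x]) continue;
            // try to extend the path with a node y that is not in S
            for (int y = 0; y < n; y++) {
                if (S&(1<<y)) continue;
                if (adj[x][y]) possible[S|(1<<y)][y] = true;
            }
        }
    }
    // a Hamiltonian path visits all nodes and may end at any node
    for (int x = 0; x < n; x++) {
        if (possible[(1<<n)-1][x]) return true;
    }
    return false;
}
```
+
+There are $2^n n$ pairs $(S,x)$, and each of them is extended
+in $O(n)$ time, which gives the time complexity $O(2^n n^2)$.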
+
+\section{De Bruijn sequences}
+
+\index{De Bruijn sequence}
+
+A \key{De Bruijn sequence}
+is a string that contains
+every string of length $n$
+exactly once as a substring, for a fixed
+alphabet of $k$ characters.
+The length of such a string is 
+$k^n+n-1$ characters.
+For example, when $n=3$ and $k=2$,
+one De Bruijn sequence is
+\[0001011100.\]
+The substrings of this string are all
+combinations of three bits:
+000, 001, 010, 011, 100, 101, 110 and 111.
+
+It turns out that each De Bruijn sequence
+corresponds to an Eulerian path in a graph.
+The idea is to construct a graph where
+each node contains a string of $n-1$ characters
+and each edge adds one character to the string.
+The following graph corresponds to the above scenario:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.8]
+\node[draw, circle] (00) at (-3,0) {00};
+\node[draw, circle] (11) at (3,0) {11};
+\node[draw, circle] (01) at (0,2) {01};
+\node[draw, circle] (10) at (0,-2) {10};
+
+\path[draw,thick,->] (00) edge [bend left=20] node[font=\small,label=1] {} (01);
+\path[draw,thick,->] (01) edge [bend left=20] node[font=\small,label=1] {} (11);
+\path[draw,thick,->] (11) edge [bend left=20] node[font=\small,label=below:0] {} (10);
+\path[draw,thick,->] (10) edge [bend left=20] node[font=\small,label=below:0] {} (00);
+
+\path[draw,thick,->] (01) edge [bend left=30] node[font=\small,label=right:0] {} (10);
+\path[draw,thick,->] (10) edge [bend left=30] node[font=\small,label=left:1] {} (01);
+
+\path[draw,thick,-] (00) edge [loop left] node[font=\small,label=below:0] {} (00);
+\path[draw,thick,-] (11) edge [loop right] node[font=\small,label=below:1] {} (11);
+\end{tikzpicture}
+\end{center}
+
+An Eulerian path in this graph corresponds to a string
+that contains all strings of length $n$.
+The string contains the characters of the starting node
+and all characters of the edges.
+The starting node has $n-1$ characters
+and there are $k^n$ characters in the edges,
+so the length of the string is $k^n+n-1$.
+
+\section{Knight's tours}
+
+\index{knight's tour}
+
+A \key{knight's tour} is a sequence of moves
+of a knight on an $n \times n$ chessboard
+following the rules of chess such that the knight
+visits each square exactly once.
+A knight's tour is called a \emph{closed} tour
+if the knight finally returns to the starting square and
+otherwise it is called an \emph{open} tour.
+
+For example, here is an open knight's tour on a $5 \times 5$ board:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (5,5);
+\node at (0.5,4.5) {$1$};
+\node at (1.5,4.5) {$4$};
+\node at (2.5,4.5) {$11$};
+\node at (3.5,4.5) {$16$};
+\node at (4.5,4.5) {$25$};
+\node at (0.5,3.5) {$12$};
+\node at (1.5,3.5) {$17$};
+\node at (2.5,3.5) {$2$};
+\node at (3.5,3.5) {$5$};
+\node at (4.5,3.5) {$10$};
+\node at (0.5,2.5) {$3$};
+\node at (1.5,2.5) {$20$};
+\node at (2.5,2.5) {$7$};
+\node at (3.5,2.5) {$24$};
+\node at (4.5,2.5) {$15$};
+\node at (0.5,1.5) {$18$};
+\node at (1.5,1.5) {$13$};
+\node at (2.5,1.5) {$22$};
+\node at (3.5,1.5) {$9$};
+\node at (4.5,1.5) {$6$};
+\node at (0.5,0.5) {$21$};
+\node at (1.5,0.5) {$8$};
+\node at (2.5,0.5) {$19$};
+\node at (3.5,0.5) {$14$};
+\node at (4.5,0.5) {$23$};
+\end{tikzpicture}
+\end{center}
+
+A knight's tour corresponds to a Hamiltonian path in a graph
+whose nodes represent the squares of the board,
+and two nodes are connected with an edge if a knight
+can move between the squares according to the rules of chess.
+
+A natural way to construct a knight's tour is to use backtracking.
+The search can be made more efficient by using
+\emph{heuristics} that attempt to guide the knight so that
+a complete tour will be found quickly.
+
+\subsubsection{Warnsdorf's rule}
+
+\index{heuristic}
+\index{Warnsdorf's rule}
+
+\key{Warnsdorf's rule} is a simple and effective heuristic
+for finding a knight's tour\footnote{This heuristic was proposed
+in Warnsdorf's book \cite{war23} in 1823. There are
+also polynomial algorithms for finding knight's tours
+\cite{par97}, but they are more complicated.}.
+Using the rule, it is possible to efficiently construct a tour
+even on a large board.
+The idea is to always move the knight so that it ends up
+in a square where the number of possible moves is as
+\emph{small} as possible.
+
+For example, in the following situation, there are five
+possible squares to which the knight can move (squares $a \ldots e$):
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (5,5);
+\node at (0.5,4.5) {$1$};
+\node at (2.5,3.5) {$2$};
+\node at (4.5,4.5) {$a$};
+\node at (0.5,2.5) {$b$};
+\node at (4.5,2.5) {$e$};
+\node at (1.5,1.5) {$c$};
+\node at (3.5,1.5) {$d$};
+\end{tikzpicture}
+\end{center}
+In this situation, Warnsdorf's rule moves the knight to square $a$,
+because after this choice, there is only a single possible move.
+The other choices would move the knight to squares where
+there would be three moves available.
+
+
--- a/chapter20.tex
+++ b/chapter20.tex
--- a/chapter21.tex
+++ b/chapter21.tex
@ -0,0 +1,726 @@
+\chapter{Number theory}
+
+\index{number theory}
+
+\key{Number theory} is a branch of mathematics
+that studies integers.
+Number theory is a fascinating field,
+because many questions involving integers
+are very difficult to solve even if they
+seem simple at first glance.
+
+As an example, consider the following equation:
+\[x^3 + y^3 + z^3 = 33\]
+It is easy to find three real numbers $x$, $y$ and $z$
+that satisfy the equation.
+For example, we can choose
+\[
+\begin{array}{lcl}
+x = 3, \\
+y = \sqrt[3]{3}, \\
+z = \sqrt[3]{3}.\\
+\end{array}
+\]
+However, it was a long-standing open problem in number theory
+whether there are three
+\emph{integers} $x$, $y$ and $z$
+that satisfy the equation \cite{bec07};
+a solution was only found in 2019 by a large computer search.
+
+In this chapter, we will focus on basic concepts
+and algorithms in number theory.
+Throughout the chapter, we will assume that all numbers
+are integers, if not otherwise stated.
+
+\section{Primes and factors}
+
+\index{divisibility}
+\index{factor}
+\index{divisor}
+
+A number $a$ is called a \key{factor} or a \key{divisor} of a number $b$
+if $a$ divides $b$.
+If $a$ is a factor of $b$,
+we write $a \mid b$, and otherwise we write $a \nmid b$.
+For example, the factors of 24 are
+1, 2, 3, 4, 6, 8, 12 and 24.
+
+\index{prime}
+\index{prime decomposition}
+
+A number $n>1$ is a \key{prime}
+if its only positive factors are 1 and $n$.
+For example, 7, 19 and 41 are primes,
+but 35 is not a prime, because $5 \cdot 7 = 35$.
+For every number $n>1$, there is a unique
+\key{prime factorization}
+\[ n = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k},\]
+where $p_1,p_2,\ldots,p_k$ are distinct primes and
+$\alpha_1,\alpha_2,\ldots,\alpha_k$ are positive numbers.
+For example, the prime factorization for 84 is
+\[84 = 2^2 \cdot 3^1 \cdot 7^1.\]
+
+The \key{number of factors} of a number $n$ is
+\[\tau(n)=\prod_{i=1}^k (\alpha_i+1),\]
+because for each prime $p_i$, there are
+$\alpha_i+1$ ways to choose how many times
+it appears in the factor.
+For example, the number of factors
+of 84 is
+$\tau(84)=3 \cdot 2 \cdot 2 = 12$.
+The factors are
+1, 2, 3, 4, 6, 7, 12, 14, 21, 28, 42 and 84.
+
+The \key{sum of factors} of $n$ is
+\[\sigma(n)=\prod_{i=1}^k (1+p_i+\ldots+p_i^{\alpha_i}) = \prod_{i=1}^k \frac{p_i^{\alpha_i+1}-1}{p_i-1},\]
+where the latter formula is based on the geometric progression formula.
+For example, the sum of factors of 84 is
+\[\sigma(84)=\frac{2^3-1}{2-1} \cdot \frac{3^2-1}{3-1} \cdot \frac{7^2-1}{7-1} = 7 \cdot 4 \cdot 8 = 224.\]
+
+The \key{product of factors} of $n$ is
+\[\mu(n)=n^{\tau(n)/2},\]
+because we can form $\tau(n)/2$ pairs from the factors,
+each with product $n$.
+For example, the factors of 84
+produce the pairs
+$1 \cdot 84$, $2 \cdot 42$, $3 \cdot 28$, etc.,
+and the product of the factors is $\mu(84)=84^6=351298031616$.
+
+\index{perfect number}
+
+A number $n$ is called a \key{perfect number} if $n=\sigma(n)-n$,
+i.e., $n$ equals the sum of its factors
+between $1$ and $n-1$.
+For example, 28 is a perfect number,
+because $28=1+2+4+7+14$.
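+
+The formulas for $\tau(n)$ and $\sigma(n)$ can be evaluated
+while the prime factorization is being computed by trial division.
+The following C++ sketch illustrates this; the function name
+\texttt{factorStats} and the use of a pair as the return value
+are assumptions made for this example.
+
```cpp
#include <utility>
using namespace std;

// Returns {tau(n), sigma(n)}: the number and the sum of the factors of n.
// Assumption: n >= 1 and sigma(n) fits in a long long.
pair<long long,long long> factorStats(long long n) {
    long long tau = 1, sigma = 1;
    for (long long p = 2; p*p <= n; p++) {
        if (n%p != 0) continue;
        long long alpha = 0;  // exponent of the prime p
        long long power = 1;  // p^alpha
        long long series = 1; // 1 + p + ... + p^alpha
        while (n%p == 0) {
            n /= p;
            alpha++;
            power *= p;
            series += power;
        }
        tau *= alpha+1;
        sigma *= series;
    }
    if (n > 1) {
        // the remaining number is a prime whose exponent is 1
        tau *= 2;
        sigma *= 1+n;
    }
    return {tau, sigma};
}
```
+
+For $n=84$ the function returns the pair $(12,224)$.
+A number $n$ is perfect exactly when the second value equals $2n$;
+for example, for $n=28$ the function returns $(6,56)$.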
+
+\subsubsection{Number of primes}
+
+It is easy to show that there is an infinite number
+of primes.
+If the number of primes were finite,
+we could construct a set $P=\{p_1,p_2,\ldots,p_n\}$
+that would contain all the primes.
+For example, $p_1=2$, $p_2=3$, $p_3=5$, and so on.
+However, using $P$, we could form the number
+\[p_1 p_2 \cdots p_n+1\]
+that is larger than all elements in $P$
+and leaves remainder 1 when divided by any of them.
+Thus, it would have a prime factor that is not in $P$.
+This is a contradiction, and the number of primes
+has to be infinite.
+
+\subsubsection{Density of primes}
+
+The density of primes describes how frequently primes occur
+among the integers.
+Let $\pi(n)$ denote the number of primes between
+$1$ and $n$. For example, $\pi(10)=4$, because
+there are 4 primes between $1$ and $10$: 2, 3, 5 and 7.
+
+It is possible to show that
+\[\pi(n) \approx \frac{n}{\ln n},\]
+which means that primes are quite frequent.
+For example, the number of primes between
+$1$ and $10^6$ is $\pi(10^6)=78498$,
+and $10^6 / \ln 10^6 \approx 72382$.
+
+\subsubsection{Conjectures}
+
+There are many \emph{conjectures} involving primes.
+Most people think that the conjectures are true,
+but nobody has been able to prove them.
+For example, the following conjectures are famous:
+
+\begin{itemize}
+\index{Goldbach's conjecture}
+\item \key{Goldbach's conjecture}:
+Each even integer $n>2$ can be represented as a
+sum $n=a+b$ so that both $a$ and $b$ are primes.
+\index{twin prime}
+\item \key{Twin prime conjecture}:
+There is an infinite number of pairs
+of the form $\{p,p+2\}$,
+where both $p$ and $p+2$ are primes.
+\index{Legendre's conjecture}
+\item \key{Legendre's conjecture}:
+There is always a prime between numbers
+$n^2$ and $(n+1)^2$, where $n$ is any positive integer.
+\end{itemize}
+
+\subsubsection{Basic algorithms}
+
+If a number $n$ is not prime,
+it can be represented as a product $a \cdot b$,
+where $a \le \sqrt n$ or $b \le \sqrt n$,
+so it certainly has a factor between $2$ and $\lfloor \sqrt n \rfloor$.
+Using this observation, we can both test
+if a number is prime and find the prime factorization
+of a number in $O(\sqrt n)$ time.
+
+The following function \texttt{prime} checks
+if the given number $n$ is prime.
+The function attempts to divide $n$ by
+all numbers between $2$ and $\lfloor \sqrt n \rfloor$,
+and if none of them divides $n$, then $n$ is prime.
+
+\begin{lstlisting}
+bool prime(int n) {
+    if (n < 2) return false;
+    for (int x = 2; x*x <= n; x++) {
+        if (n%x == 0) return false;
+    }
+    return true;
+}
+\end{lstlisting}
+
+\noindent
+The following function \texttt{factors}
+constructs a vector that contains the prime
+factorization of $n$.
+The function divides $n$ by its prime factors,
+and adds them to the vector.
+The process ends when the remaining number $n$
+has no factors between $2$ and $\lfloor \sqrt n \rfloor$.
+If $n>1$, it is prime and the last factor.
+
+\begin{lstlisting}
+vector<int> factors(int n) {
+    vector<int> f;
+    for (int x = 2; x*x <= n; x++) {
+        while (n%x == 0) {
+            f.push_back(x);
+            n /= x;
+        }
+    }
+    if (n > 1) f.push_back(n);
+    return f;
+}
+\end{lstlisting}
+
+Note that each prime factor appears in the vector
+as many times as it divides the number.
+For example, $24=2^3 \cdot 3$,
+so the result of the function is $[2,2,2,3]$.
+
+\subsubsection{Sieve of Eratosthenes}
+
+\index{sieve of Eratosthenes}
+
+The \key{sieve of Eratosthenes}
+%\footnote{Eratosthenes (c. 276 BC -- c. 194 BC) was a Greek mathematician.}
+is a preprocessing
+algorithm that builds an array using which we
+can efficiently check if a given number in the range $2 \ldots n$
+is prime and, if it is not, find one prime factor of the number.
+
+The algorithm builds an array $\texttt{sieve}$
+whose positions $2,3,\ldots,n$ are used.
+The value $\texttt{sieve}[k]=0$ means
+that $k$ is prime,
+and the value $\texttt{sieve}[k] \neq 0$
+means that $k$ is not a prime and one
+of its prime factors is $\texttt{sieve}[k]$.
+
+The algorithm iterates through the numbers
+$2 \ldots n$ one by one.
+Whenever a new prime $x$ is found,
+the algorithm records that the multiples
+of $x$ ($2x,3x,4x,\ldots$) are not primes,
+because the number $x$ divides them.
+
+For example, if $n=20$, the array is as follows:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (19,1);
+
+\node at (0.5,0.5) {$0$};
+\node at (1.5,0.5) {$0$};
+\node at (2.5,0.5) {$2$};
+\node at (3.5,0.5) {$0$};
+\node at (4.5,0.5) {$3$};
+\node at (5.5,0.5) {$0$};
+\node at (6.5,0.5) {$2$};
+\node at (7.5,0.5) {$3$};
+\node at (8.5,0.5) {$5$};
+\node at (9.5,0.5) {$0$};
+\node at (10.5,0.5) {$3$};
+\node at (11.5,0.5) {$0$};
+\node at (12.5,0.5) {$7$};
+\node at (13.5,0.5) {$5$};
+\node at (14.5,0.5) {$2$};
+\node at (15.5,0.5) {$0$};
+\node at (16.5,0.5) {$3$};
+\node at (17.5,0.5) {$0$};
+\node at (18.5,0.5) {$5$};
+
+\footnotesize
+
+\node at (0.5,1.5) {$2$};
+\node at (1.5,1.5) {$3$};
+\node at (2.5,1.5) {$4$};
+\node at (3.5,1.5) {$5$};
+\node at (4.5,1.5) {$6$};
+\node at (5.5,1.5) {$7$};
+\node at (6.5,1.5) {$8$};
+\node at (7.5,1.5) {$9$};
+\node at (8.5,1.5) {$10$};
+\node at (9.5,1.5) {$11$};
+\node at (10.5,1.5) {$12$};
+\node at (11.5,1.5) {$13$};
+\node at (12.5,1.5) {$14$};
+\node at (13.5,1.5) {$15$};
+\node at (14.5,1.5) {$16$};
+\node at (15.5,1.5) {$17$};
+\node at (16.5,1.5) {$18$};
+\node at (17.5,1.5) {$19$};
+\node at (18.5,1.5) {$20$};
+
+\end{tikzpicture}
+\end{center}
+
+The following code implements the sieve of
+Eratosthenes.
+The code assumes that each element of
+\texttt{sieve} is initially zero.
+
+\begin{lstlisting}
+for (int x = 2; x <= n; x++) {
+    if (sieve[x]) continue;
+    for (int u = 2*x; u <= n; u += x) {
+        sieve[u] = x;
+    }
+}
+\end{lstlisting}
+
+The inner loop of the algorithm is executed
+$n/x$ times for each value of $x$.
+Thus, an upper bound for the running time
+of the algorithm is the harmonic sum
+\[\sum_{x=2}^n n/x = n/2 + n/3 + n/4 + \cdots + n/n = O(n \log n).\]
+
+\index{harmonic sum}
+
+In fact, the algorithm is more efficient,
+because the inner loop will be executed only if
+the number $x$ is prime.
+It can be shown that the running time of the
+algorithm is only $O(n \log \log n)$,
+a complexity very near to $O(n)$. 
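+
+Since $\texttt{sieve}[k]$ is a prime factor of $k$ whenever $k$
+is not prime, the array can also be used to factorize numbers quickly.
+The following C++ sketch shows one way to do this; the function names
+\texttt{buildSieve} and \texttt{factorize} are assumptions made
+for this example, and the first function repeats the loop above.
+
```cpp
#include <vector>
using namespace std;

// Builds the sieve array for the numbers 0..n.
vector<int> buildSieve(int n) {
    vector<int> sieve(n+1, 0);
    for (int x = 2; x <= n; x++) {
        if (sieve[x]) continue;
        for (int u = 2*x; u <= n; u += x) {
            sieve[u] = x;
        }
    }
    return sieve;
}

// Returns the prime factors of k using the sieve array.
// Assumption: 2 <= k <= n, where n is the size used to build the sieve.
vector<int> factorize(const vector<int>& sieve, int k) {
    vector<int> f;
    while (sieve[k] != 0) {
        // sieve[k] is a prime factor of k: record it and divide it out
        f.push_back(sieve[k]);
        k /= sieve[k];
    }
    f.push_back(k); // the remaining number is prime
    return f;
}
```
+
+Each step divides $k$ by at least 2, so a number is factorized
+in $O(\log k)$ time after the preprocessing.
+With the loop above, $\texttt{sieve}[k]$ ends up being the largest
+prime factor of $k$, so the factors are listed from largest to smallest:
+for $k=84$ the result is $[7,3,2,2]$.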
+
+\subsubsection{Euclid's algorithm}
+
+\index{greatest common divisor}
+\index{least common multiple}
+\index{Euclid's algorithm}
+
+The \key{greatest common divisor} of
+numbers $a$ and $b$, $\gcd(a,b)$,
+is the greatest number that divides both $a$ and $b$,
+and the \key{least common multiple} of
+$a$ and $b$, $\textrm{lcm}(a,b)$,
+is the smallest number that is divisible by
+both $a$ and $b$.
+For example,
+$\gcd(24,36)=12$ and
+$\textrm{lcm}(24,36)=72$.
+
+The greatest common divisor and the least common multiple
+are connected as follows:
+\[\textrm{lcm}(a,b)=\frac{ab}{\textrm{gcd}(a,b)}\]
+
+\key{Euclid's algorithm}\footnote{Euclid was a Greek mathematician who
+lived in about 300 BC. This is perhaps the first known algorithm in history.} provides an efficient way
+to find the greatest common divisor of two numbers.
+The algorithm is based on the following formula:
+\begin{equation*}
+    \textrm{gcd}(a,b) = \begin{cases}
+               a        & b = 0\\
+               \textrm{gcd}(b,a \bmod b) & b \neq 0\\
+           \end{cases}
+\end{equation*}
+
+For example,
+\[\textrm{gcd}(24,36) = \textrm{gcd}(36,24)
+= \textrm{gcd}(24,12) = \textrm{gcd}(12,0)=12.\]
+
+The algorithm can be implemented as follows:
+\begin{lstlisting}
+int gcd(int a, int b) {
+    if (b == 0) return a;
+    return gcd(b, a%b);
+}
+\end{lstlisting}
+
+It can be shown that Euclid's algorithm works
+in $O(\log n)$ time, where $n=\min(a,b)$.
+The worst case for the algorithm is
+the case when $a$ and $b$ are consecutive Fibonacci numbers.
+For example,
+\[\textrm{gcd}(13,8)=\textrm{gcd}(8,5)
+=\textrm{gcd}(5,3)=\textrm{gcd}(3,2)=\textrm{gcd}(2,1)=\textrm{gcd}(1,0)=1.\]
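+
+The connection between the greatest common divisor and
+the least common multiple also gives a simple way to calculate
+$\textrm{lcm}(a,b)$.
+The following is a minimal sketch, assuming that $a$ and $b$ are positive
+and that the result fits in a 64-bit integer;
+the name \texttt{lcm} and the 64-bit variant of \texttt{gcd}
+are choices made for this illustration.
+Dividing before multiplying keeps the intermediate value small.
+
+\begin{lstlisting}
+// Greatest common divisor by Euclid's algorithm (64-bit variant).
+long long gcd(long long a, long long b) {
+    if (b == 0) return a;
+    return gcd(b, a%b);
+}
+
+// Least common multiple, assuming a > 0 and b > 0.
+long long lcm(long long a, long long b) {
+    return a/gcd(a,b)*b; // divide first to limit overflow
+}
+\end{lstlisting}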
+
+\subsubsection{Euler's totient function}
+
+\index{coprime}
+\index{Euler's totient function}
+
+Numbers $a$ and $b$ are \key{coprime}
+if $\textrm{gcd}(a,b)=1$.
+\key{Euler's totient function} $\varphi(n)$
+%\footnote{Euler presented this function in 1763.}
+gives the number of integers between $1$ and $n$
+that are coprime to $n$.
+For example, $\varphi(12)=4$,
+because 1, 5, 7 and 11
+are coprime to 12.
+
+The value of $\varphi(n)$ can be calculated
+from the prime factorization of $n$
+using the formula
+\[ \varphi(n) = \prod_{i=1}^k p_i^{\alpha_i-1}(p_i-1). \]
+For example, $\varphi(12)=2^1 \cdot (2-1) \cdot 3^0 \cdot (3-1)=4$.
+Note that $\varphi(n)=n-1$ if $n$ is prime.
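+
+The product above equals $n$ multiplied by $(p_i-1)/p_i$
+for each distinct prime factor $p_i$,
+which suggests a direct way to calculate $\varphi(n)$.
+The following is a minimal sketch that finds the prime factors
+by trial division in $O(\sqrt n)$ time, assuming $n \ge 1$;
+the function name \texttt{phi} is an assumption made for this illustration.
+
+\begin{lstlisting}
+// Returns phi(n) for n >= 1 using trial division.
+long long phi(long long n) {
+    long long result = n;
+    for (long long p = 2; p*p <= n; p++) {
+        if (n%p != 0) continue;
+        while (n%p == 0) n /= p; // remove all copies of the prime p
+        result -= result/p;      // multiply result by (p-1)/p
+    }
+    // at most one prime factor larger than the square root remains
+    if (n > 1) result -= result/n;
+    return result;
+}
+\end{lstlisting}
+
+For example, \texttt{phi(12)} returns 4.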
+
+\section{Modular arithmetic}
+
+\index{modular arithmetic}
+
+In \key{modular arithmetic},
+the set of numbers is limited so
+that only numbers $0,1,2,\ldots,m-1$ are used,
+where $m$ is a constant.
+Each number $x$ is
+represented by the number $x \bmod m$:
+the remainder after dividing $x$ by $m$.
+For example, if $m=17$, then $75$
+is represented by $75 \bmod 17 = 7$.
+
+Often we can take remainders before doing
+calculations.
+In particular, the following formulas hold:
+\[
+\begin{array}{rcl}
+(x+y) \bmod m & = & (x \bmod m + y \bmod m) \bmod m \\
+(x-y) \bmod m & = & (x \bmod m - y \bmod m) \bmod m \\
+(x \cdot y) \bmod m & = & (x \bmod m \cdot y \bmod m) \bmod m \\
+x^n \bmod m & = & (x \bmod m)^n \bmod m \\
+\end{array}
+\]
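+
+When the subtraction formula is used in C++, some care is needed,
+because the remainder of a negative number is zero or negative.
+The following is a minimal sketch that always produces a value
+between $0$ and $m-1$, assuming $m>0$;
+the function name \texttt{submod} is an assumption made for this illustration.
+
+\begin{lstlisting}
+// Returns (x-y) mod m as a value between 0 and m-1, assuming m > 0.
+long long submod(long long x, long long y, long long m) {
+    long long r = (x%m - y%m) % m; // may be negative in C++
+    if (r < 0) r += m;
+    return r;
+}
+\end{lstlisting}
+
+For example, \texttt{submod(3,5,7)} returns 5,
+while the expression \texttt{(3-5)\%7} evaluates to $-2$.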
+
+\subsubsection{Modular exponentiation}
+
+There is often a need to efficiently calculate
+the value of $x^n \bmod m$.
+This can be done in $O(\log n)$ time
+using the following recursion:
+\begin{equation*}
+    x^n = \begin{cases}
+               1        & n = 0\\
+               x^{n/2} \cdot x^{n/2} & \text{$n$ is even}\\
+               x^{n-1} \cdot x & \text{$n$ is odd}
+           \end{cases}
+\end{equation*}
+
+It is important that in the case of an even $n$,
+the value of $x^{n/2}$ is calculated only once.
+This guarantees that the time complexity of the
+algorithm is $O(\log n)$, because $n$ is always halved
+when it is even.
+
+The following function calculates the value of
+$x^n \bmod m$:
+
+\begin{lstlisting}
+int modpow(int x, int n, int m) {
+    if (n == 0) return 1%m;
+    long long u = modpow(x,n/2,m);
+    u = (u*u)%m;
+    if (n%2 == 1) u = (u*x)%m;
+    return u;
+}
+\end{lstlisting}
+
+\subsubsection{Fermat's theorem and Euler's theorem}
+
+\index{Fermat's theorem}
+\index{Euler's theorem}
+
+\key{Fermat's theorem}
+%\footnote{Fermat discovered this theorem in 1640.}
+states that
+\[x^{m-1} \bmod m = 1\]
+when $m$ is prime and $x$ and $m$ are coprime.
+This also yields
+\[x^k \bmod m = x^{k \bmod (m-1)} \bmod m.\]
+More generally, \key{Euler's theorem}
+%\footnote{Euler published this theorem in 1763.}
+states that
+\[x^{\varphi(m)} \bmod m = 1\]
+when $x$ and $m$ are coprime.
+Fermat's theorem follows from Euler's theorem,
+because if $m$ is a prime, then $\varphi(m)=m-1$.
+
+\subsubsection{Modular inverse}
+
+\index{modular inverse}
+
+The inverse of $x$ modulo $m$
+is a number $x^{-1}$ such that
+\[ x x^{-1} \bmod m = 1. \]
+For example, if $x=6$ and $m=17$,
+then $x^{-1}=3$, because $6\cdot3 \bmod 17=1$.
+
+Using modular inverses, we can divide numbers
+modulo $m$, because division by $x$
+corresponds to multiplication by $x^{-1}$.
+For example, to evaluate the value of $36/6 \bmod 17$,
+we can use the formula $2 \cdot 3 \bmod 17$,
+because $36 \bmod 17 = 2$ and $6^{-1} \bmod 17 = 3$.
+
+However, a modular inverse does not always exist.
+For example, if $x=2$ and $m=4$, the equation
+\[ x x^{-1} \bmod m = 1 \]
+cannot be solved, because all multiples of 2
+are even and the remainder can never be 1 when $m=4$.
+It turns out that the value of $x^{-1} \bmod m$
+can be calculated exactly when $x$ and $m$ are coprime.
+
+If a modular inverse exists, it can be
+calculated using the formula
+\[
+x^{-1} = x^{\varphi(m)-1}.
+\]
+If $m$ is prime, the formula becomes
+\[
+x^{-1} = x^{m-2}.
+\]
+For example,
+\[6^{-1} \bmod 17 =6^{17-2} \bmod 17 = 3.\]
+
+This formula allows us to efficiently calculate
+modular inverses using the modular exponentiation algorithm.
+The formula can be derived using Euler's theorem.
+First, the modular inverse should satisfy the following equation:
+\[
+x x^{-1} \bmod m = 1.
+\]
+On the other hand, according to Euler's theorem,
+\[
+x^{\varphi(m)} \bmod m =  xx^{\varphi(m)-1} \bmod m = 1,
+\]
+so the numbers $x^{-1}$ and $x^{\varphi(m)-1}$ are equal.
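+
+As a sketch of how the formula can be used when $m$ is prime,
+the following code combines it with the modular exponentiation recursion.
+The function name \texttt{inverse} and the 64-bit types are assumptions
+made for this illustration; the code assumes that $m$ is a prime below $2^{31}$,
+so that the products fit in a \texttt{long long},
+and that $x$ is not divisible by $m$.
+
+\begin{lstlisting}
+// Calculates x^n mod m (the same recursion as above, 64-bit types).
+long long modpow(long long x, long long n, long long m) {
+    if (n == 0) return 1%m;
+    long long u = modpow(x,n/2,m);
+    u = (u*u)%m;
+    if (n%2 == 1) u = (u*(x%m))%m;
+    return u;
+}
+
+// Inverse of x modulo a prime m, assuming x is not divisible by m.
+long long inverse(long long x, long long m) {
+    return modpow(x,m-2,m);
+}
+\end{lstlisting}
+
+For example, \texttt{inverse(6,17)} returns 3.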
+
+\subsubsection{Computer arithmetic}
+
+In programming, unsigned integers are represented modulo $2^k$,
+where $k$ is the number of bits of the data type.
+A usual consequence of this is that a number wraps around
+if it becomes too large.
+
+For example, in C++, numbers of type \texttt{unsigned int}
+are represented modulo $2^{32}$.
+The following code declares an \texttt{unsigned int}
+variable whose value is $123456789$.
+After this, the value will be multiplied by itself,
+and the result is
+$123456789^2 \bmod 2^{32} = 2537071545$.
+
+\begin{lstlisting}
+unsigned int x = 123456789;
+cout << x*x << "\n"; // 2537071545
+\end{lstlisting}
+
+\section{Solving equations}
+
+\subsubsection*{Diophantine equations}
+
+\index{Diophantine equation}
+
+A \key{Diophantine equation}
+%\footnote{Diophantus of Alexandria was a Greek mathematician who lived in the 3th century.}
+is an equation of the form
+\[ ax + by = c, \]
+where $a$, $b$ and $c$ are constants
+and the values of $x$ and $y$ should be found.
+Each number in the equation has to be an integer.
+For example, one solution for the equation
+$5x+2y=11$ is $x=3$ and $y=-2$.
+
+\index{extended Euclid's algorithm}
+
+We can efficiently solve a Diophantine equation
+by using Euclid's algorithm.
+It turns out that we can extend Euclid's algorithm
+so that it will find numbers $x$ and $y$
+that satisfy the following equation:
+\[
+ax + by = \textrm{gcd}(a,b)
+\]
+
+A Diophantine equation can be solved if
+$c$ is divisible by
+$\textrm{gcd}(a,b)$,
+and otherwise it cannot be solved.
+
+As an example, let us find numbers $x$ and $y$
+that satisfy the following equation:
+\[
+39x + 15y = 12
+\]
+The equation can be solved, because
+$\textrm{gcd}(39,15)=3$ and $3 \mid 12$.
+When Euclid's algorithm calculates the
+greatest common divisor of 39 and 15,
+it produces the following sequence of function calls:
+\[
+\textrm{gcd}(39,15) = \textrm{gcd}(15,9)
+= \textrm{gcd}(9,6) = \textrm{gcd}(6,3)
+= \textrm{gcd}(3,0) = 3 \]
+This corresponds to the following equations:
+\[
+\begin{array}{lcl}
+39 - 2 \cdot 15 & = & 9 \\
+15 - 1 \cdot 9 & = & 6 \\
+9 - 1 \cdot 6 & = & 3 \\
+\end{array}
+\]
+Using these equations, we can derive
+\[
+39 \cdot 2 + 15 \cdot (-5) = 3
+\]
+and by multiplying this by 4, the result is
+\[
+39 \cdot 8 + 15 \cdot (-20) = 12,
+\]
+so a solution to the equation is
+$x=8$ and $y=-20$.
+
+A solution to a Diophantine equation is not unique,
+because we can form an infinite number of solutions
+if we know one solution.
+If a pair $(x,y)$ is a solution, then also all pairs
+\[(x+\frac{kb}{\textrm{gcd}(a,b)},y-\frac{ka}{\textrm{gcd}(a,b)})\]
+are solutions, where $k$ is any integer.
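+
+The extension of Euclid's algorithm can be sketched as follows.
+The function names \texttt{extgcd} and \texttt{solve} are assumptions
+made for this illustration, and the code assumes that
+$a$ and $b$ are non-negative and that all values fit in a \texttt{long long}.
+The idea is that if $bx_1 + (a \bmod b)y_1 = \textrm{gcd}(a,b)$,
+then $a y_1 + b (x_1 - \lfloor a/b \rfloor y_1) = \textrm{gcd}(a,b)$.
+
+\begin{lstlisting}
+// Finds x and y such that a*x + b*y = gcd(a,b) and returns gcd(a,b).
+long long extgcd(long long a, long long b, long long &x, long long &y) {
+    if (b == 0) {
+        x = 1; y = 0; // a*1 + 0*0 = a
+        return a;
+    }
+    long long x1, y1;
+    long long g = extgcd(b, a%b, x1, y1);
+    // b*x1 + (a%b)*y1 = g and a%b = a - (a/b)*b
+    x = y1;
+    y = x1 - (a/b)*y1;
+    return g;
+}
+
+// Solves a*x + b*y = c; returns false if there is no solution.
+bool solve(long long a, long long b, long long c,
+           long long &x, long long &y) {
+    long long g = extgcd(a, b, x, y);
+    if (g == 0 || c%g != 0) return false;
+    x *= c/g;
+    y *= c/g;
+    return true;
+}
+\end{lstlisting}
+
+For the equation $39x+15y=12$, the call \texttt{solve(39,15,12,x,y)}
+sets $x=8$ and $y=-20$, which is the solution derived above.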
+
+\subsubsection{Chinese remainder theorem}
+
+\index{Chinese remainder theorem}
+
+The \key{Chinese remainder theorem} solves
+a group of equations of the form
+\[
+\begin{array}{lcl}
+x & = & a_1 \bmod m_1 \\
+x & = & a_2 \bmod m_2 \\
+\cdots \\
+x & = & a_n \bmod m_n \\
+\end{array}
+\]
+where all pairs of $m_1,m_2,\ldots,m_n$ are coprime.
+
+Let $x^{-1}_m$ be the inverse of $x$ modulo $m$, and
+\[ X_k = \frac{m_1 m_2 \cdots m_n}{m_k}.\]
+Using this notation, a solution to the equations is
+\[x = a_1 X_1 {X_1}^{-1}_{m_1} + a_2 X_2 {X_2}^{-1}_{m_2} + \cdots + a_n X_n {X_n}^{-1}_{m_n}.\]
+In this solution, for each $k=1,2,\ldots,n$,
+\[a_k X_k {X_k}^{-1}_{m_k} \bmod m_k = a_k,\]
+because
+\[X_k {X_k}^{-1}_{m_k} \bmod m_k = 1.\]
+Since all other terms in the sum are divisible by $m_k$,
+they have no effect on the remainder,
+and $x \bmod m_k = a_k$.
+
+For example, a solution for
+\[
+\begin{array}{lcl}
+x & = & 3 \bmod 5 \\
+x & = & 4 \bmod 7 \\
+x & = & 2 \bmod 3 \\
+\end{array}
+\]
+is
+\[ 3 \cdot 21 \cdot 1 + 4 \cdot 15 \cdot 1 + 2 \cdot 35 \cdot 2 = 263.\]
+
+Once we have found a solution $x$,
+we can create an infinite number of other solutions,
+because all numbers of the form
+\[x+k \cdot m_1 m_2 \cdots m_n\]
+are solutions, where $k$ is any integer.
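+
+The construction can be sketched in code as follows.
+Since the moduli are not necessarily prime, the sketch calculates
+the inverses with the extended Euclid's algorithm
+instead of the formula $x^{m-2}$.
+The function names \texttt{extgcd}, \texttt{inverse} and \texttt{crt}
+are assumptions made for this illustration, and the code assumes
+non-negative values $a_k$, pairwise coprime moduli and a product
+$m_1 m_2 \cdots m_n$ small enough (say, below $10^9$)
+that the intermediate products fit in a \texttt{long long}.
+
+\begin{lstlisting}
+#include <vector>
+using namespace std;
+
+// Finds x and y such that a*x + b*y = gcd(a,b) and returns gcd(a,b).
+long long extgcd(long long a, long long b, long long &x, long long &y) {
+    if (b == 0) { x = 1; y = 0; return a; }
+    long long x1, y1;
+    long long g = extgcd(b, a%b, x1, y1);
+    x = y1;
+    y = x1 - (a/b)*y1;
+    return g;
+}
+
+// Inverse of x modulo m, assuming gcd(x,m) = 1.
+long long inverse(long long x, long long m) {
+    long long u, v;
+    extgcd(x%m, m, u, v);
+    return (u%m + m) % m;
+}
+
+// Returns the solution x with 0 <= x < m_1*m_2*...*m_n.
+long long crt(vector<long long> a, vector<long long> m) {
+    long long prod = 1;
+    for (long long mk : m) prod *= mk;
+    long long x = 0;
+    for (size_t k = 0; k < m.size(); k++) {
+        long long X = prod/m[k];
+        long long term = a[k]%m[k] * X % prod * inverse(X, m[k]) % prod;
+        x = (x+term) % prod;
+    }
+    return x;
+}
+\end{lstlisting}
+
+For the example above, \texttt{crt(\{3,4,2\},\{5,7,3\})} returns 53,
+which equals $263 \bmod 105$.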
+
+\section{Other results}
+
+\subsubsection{Lagrange's theorem}
+
+\index{Lagrange's theorem}
+
+\key{Lagrange's theorem}
+%\footnote{J.-L. Lagrange (1736--1813) was an Italian mathematician.}
+states that every positive integer
+can be represented as a sum of four squares, i.e.,
+$a^2+b^2+c^2+d^2$.
+For example, the number 123 can be represented
+as the sum $8^2+5^2+5^2+3^2$.
+
+\subsubsection{Zeckendorf's theorem}
+
+\index{Zeckendorf's theorem}
+\index{Fibonacci number}
+
+\key{Zeckendorf's theorem}
+%\footnote{E. Zeckendorf published the theorem in 1972 \cite{zec72}; however, this was not a new result.}
+states that every
+positive integer has a unique representation
+as a sum of Fibonacci numbers such that
+no two numbers are equal or consecutive
+Fibonacci numbers.
+For example, the number 74 can be represented
+as the sum $55+13+5+1$.
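+
+The theorem itself does not tell how to find the representation.
+One well-known approach, used in the following sketch, is a greedy algorithm
+that repeatedly chooses the largest Fibonacci number that still fits.
+After choosing a Fibonacci number, the remaining value is smaller
+than the previous Fibonacci number, so no two chosen numbers are consecutive.
+The function name \texttt{zeckendorf} is an assumption made for this
+illustration, and the code assumes that $n \ge 1$ is well below the
+limit of a \texttt{long long}.
+
+\begin{lstlisting}
+#include <vector>
+using namespace std;
+
+// Returns the Fibonacci numbers of the representation of n >= 1,
+// from largest to smallest.
+vector<long long> zeckendorf(long long n) {
+    vector<long long> fib = {1, 2}; // Fibonacci numbers 1, 2, 3, 5, 8, ...
+    while (fib.back() <= n) {
+        fib.push_back(fib[fib.size()-1] + fib[fib.size()-2]);
+    }
+    vector<long long> result;
+    for (int i = (int)fib.size()-1; i >= 0; i--) {
+        if (fib[i] <= n) { // greedy: take the largest number that fits
+            result.push_back(fib[i]);
+            n -= fib[i];
+        }
+    }
+    return result;
+}
+\end{lstlisting}
+
+For example, \texttt{zeckendorf(74)} returns the numbers $55,13,5,1$.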
+
+\subsubsection{Pythagorean triples}
+
+\index{Pythagorean triple}
+\index{Euclid's formula}
+
+A \key{Pythagorean triple} is a triple $(a,b,c)$
+that satisfies the Pythagorean theorem
+$a^2+b^2=c^2$, which means that there is a right triangle
+with side lengths $a$, $b$ and $c$.
+For example, $(3,4,5)$ is a Pythagorean triple.
+
+If $(a,b,c)$ is a Pythagorean triple,
+all triples of the form $(ka,kb,kc)$
+are also Pythagorean triples for any integer $k>1$.
+A Pythagorean triple is \emph{primitive} if
+$a$, $b$ and $c$ are coprime,
+and all Pythagorean triples can be constructed
+from primitive triples using a multiplier $k$.
+
+\key{Euclid's formula} can be used to produce
+all primitive Pythagorean triples.
+Each such triple is of the form
+\[(n^2-m^2,2nm,n^2+m^2),\]
+where $0<m<n$, $n$ and $m$ are coprime
+and at least one of $n$ and $m$ is even.
+For example, when $m=1$ and $n=2$, the formula
+produces the smallest Pythagorean triple
+\[(2^2-1^2,2\cdot2\cdot1,2^2+1^2)=(3,4,5).\]
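+
+Euclid's formula can be turned into a short generator.
+The following is a minimal sketch that lists the primitive triples
+whose largest value is at most a given limit;
+the function name \texttt{triples} and the parameter \texttt{limit}
+are assumptions made for this illustration.
+Since $n$ and $m$ are coprime, requiring that one of them is even
+is the same as requiring that $n-m$ is odd.
+
+\begin{lstlisting}
+#include <array>
+#include <vector>
+using namespace std;
+
+int gcd(int a, int b) {
+    if (b == 0) return a;
+    return gcd(b, a%b);
+}
+
+// Returns all primitive triples (a,b,c) with c <= limit.
+vector<array<int,3>> triples(int limit) {
+    vector<array<int,3>> result;
+    for (int n = 2; n*n+1 <= limit; n++) {
+        for (int m = 1; m < n && n*n+m*m <= limit; m++) {
+            if ((n-m)%2 == 0) continue;  // n and m have the same parity
+            if (gcd(n,m) != 1) continue; // n and m are not coprime
+            result.push_back({n*n-m*m, 2*n*m, n*n+m*m});
+        }
+    }
+    return result;
+}
+\end{lstlisting}
+
+For example, \texttt{triples(30)} returns the five triples
+$(3,4,5)$, $(5,12,13)$, $(15,8,17)$, $(7,24,25)$ and $(21,20,29)$.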
+
+\subsubsection{Wilson's theorem}
+
+\index{Wilson's theorem}
+
+\key{Wilson's theorem}
+%\footnote{J. Wilson (1741--1793) was an English mathematician.}
+states that a number $n$
+is prime exactly when
+\[(n-1)! \bmod n = n-1.\]
+For example, the number 11 is prime, because
+\[10! \bmod 11 = 10,\]
+and the number 12 is not prime, because
+\[11! \bmod 12 = 0 \neq 11.\]
+
+Hence, Wilson's theorem can be used to find out
+whether a number is prime. However, in practice, the theorem cannot be
+applied to large values of $n$, because it is difficult
+to calculate values of $(n-1)!$ when $n$ is large.
+
+
--- a/chapter22.tex
+++ b/chapter22.tex
@ -0,0 +1,925 @@
+\chapter{Combinatorics}
+
+\index{combinatorics}
+
+\key{Combinatorics} studies methods for counting
+combinations of objects.
+Usually, the goal is to find a way to
+count the combinations efficiently
+without generating each combination separately.
+
+As an example, consider the problem
+of counting the number of ways to
+represent an integer $n$ as a sum of positive integers.
+For example, there are 8 representations
+for $4$:
+\begin{multicols}{2}
+\begin{itemize}
+\item $1+1+1+1$
+\item $1+1+2$
+\item $1+2+1$
+\item $2+1+1$
+\item $2+2$
+\item $3+1$
+\item $1+3$
+\item $4$
+\end{itemize}
+\end{multicols}
+
+A combinatorial problem can often be solved
+using a recursive function.
+In this problem, we can define a function $f(n)$
+that gives the number of representations for $n$.
+For example, $f(4)=8$ according to the above example.
+The values of the function
+can be recursively calculated as follows:
+\begin{equation*}
+    f(n) = \begin{cases}
+               1               & n = 0\\
+               f(0)+f(1)+\cdots+f(n-1) & n > 0\\
+           \end{cases}
+\end{equation*}
+The base case is $f(0)=1$,
+because the empty sum represents the number 0.
+Then, if $n>0$, we consider all ways to
+choose the first number of the sum.
+If the first number is $k$,
+there are $f(n-k)$ representations
+for the remaining part of the sum.
+Thus, we calculate the sum of all values
+of the form $f(n-k)$ where $1 \le k \le n$.
+
+The first values for the function are:
+\[
+\begin{array}{lcl}
+f(0) & = & 1 \\
+f(1) & = & 1 \\
+f(2) & = & 2 \\
+f(3) & = & 4 \\
+f(4) & = & 8 \\
+\end{array}
+\]
+
+Sometimes, a recursive formula can be replaced
+with a closed-form formula.
+In this problem, when $n \ge 1$,
+\[
+f(n)=2^{n-1},
+\]
+which is based on the fact that there are $n-1$
+possible positions for +-signs in the sum
+and we can choose any subset of them.
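+
+The recursive formula can be evaluated with a simple loop.
+The following is a minimal sketch in which the function name
+\texttt{representations} is an assumption made for this illustration;
+it assumes $0 \le n \le 62$ so that the values fit in a \texttt{long long}.
+
+\begin{lstlisting}
+#include <vector>
+using namespace std;
+
+// Returns the values f(0),f(1),...,f(n) of the recursion above.
+vector<long long> representations(int n) {
+    vector<long long> f(n+1);
+    f[0] = 1; // the empty sum
+    for (int i = 1; i <= n; i++) {
+        for (int k = 1; k <= i; k++) {
+            f[i] += f[i-k]; // the first number of the sum is k
+        }
+    }
+    return f;
+}
+\end{lstlisting}
+
+The values produced by the loop agree with the closed-form
+formula $2^{n-1}$ for $n \ge 1$; for example, the value for $n=4$ is 8.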
+
+\section{Binomial coefficients}
+
+\index{binomial coefficient}
+
+The \key{binomial coefficient} ${n \choose k}$
+equals the number of ways we can choose a subset
+of $k$ elements from a set of $n$ elements.
+For example, ${5 \choose 3}=10$,
+because the set $\{1,2,3,4,5\}$
+has 10 subsets of 3 elements:
+\[ \{1,2,3\}, \{1,2,4\}, \{1,2,5\}, \{1,3,4\}, \{1,3,5\}, 
+\{1,4,5\}, \{2,3,4\}, \{2,3,5\}, \{2,4,5\}, \{3,4,5\} \]
+
+\subsubsection{Formula 1}
+
+Binomial coefficients can be
+recursively calculated as follows:
+
+\[
+{n \choose k}  =  {n-1 \choose k-1} + {n-1 \choose k}
+\]
+
+The idea is to fix an element $x$ in the set.
+If $x$ is included in the subset,
+we have to choose $k-1$
+elements from $n-1$ elements,
+and if $x$ is not included in the subset,
+we have to choose $k$ elements from $n-1$ elements.
+
+The base cases for the recursion are
+\[
+{n \choose 0}  =  {n \choose n} = 1,
+\]
+because there is always exactly
+one way to construct an empty subset
+and a subset that contains all the elements.
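+
+The recursion can be evaluated row by row, so that each value
+is calculated only once.
+The following is a minimal sketch in which the function name
+\texttt{binomials} is an assumption made for this illustration;
+it assumes $0 \le n \le 60$ so that the values fit in a \texttt{long long}.
+
+\begin{lstlisting}
+#include <vector>
+using namespace std;
+
+// Returns a table c where c[i][j] is the binomial coefficient
+// "i choose j" for 0 <= j <= i <= n.
+vector<vector<long long>> binomials(int n) {
+    vector<vector<long long>> c(n+1);
+    for (int i = 0; i <= n; i++) {
+        c[i].assign(i+1, 1); // base cases c[i][0] = c[i][i] = 1
+        for (int j = 1; j < i; j++) {
+            c[i][j] = c[i-1][j-1] + c[i-1][j];
+        }
+    }
+    return c;
+}
+\end{lstlisting}
+
+For example, the entry \texttt{c[5][3]} of the table
+returned by \texttt{binomials(5)} is 10.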
+
+\subsubsection{Formula 2}
+
+Another way to calculate binomial coefficients is as follows:
+\[
+{n \choose k}  =  \frac{n!}{k!(n-k)!}.
+\]
+
+There are $n!$ permutations of $n$ elements.
+We go through all permutations and always
+include the first $k$ elements of the permutation
+in the subset.
+Since the order of the elements in the subset
+and outside the subset does not matter,
+the result is divided by $k!$ and $(n-k)!$.
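+
+In practice, binomial coefficients are often needed modulo a prime,
+because the exact values grow quickly.
+In that case, the division in the formula can be replaced with
+a multiplication by a modular inverse, $x^{-1} = x^{m-2}$.
+The following is a minimal sketch, assuming the prime modulus $10^9+7$
+and $0 \le k \le n < 10^9+7$; the names \texttt{M}, \texttt{factorial}
+and \texttt{binomial} are assumptions made for this illustration.
+
+\begin{lstlisting}
+const long long M = 1000000007; // a prime modulus (assumed here)
+
+// Calculates x^n mod m.
+long long modpow(long long x, long long n, long long m) {
+    if (n == 0) return 1%m;
+    long long u = modpow(x,n/2,m);
+    u = (u*u)%m;
+    if (n%2 == 1) u = (u*(x%m))%m;
+    return u;
+}
+
+// Returns n! mod M.
+long long factorial(int n) {
+    long long f = 1;
+    for (int i = 2; i <= n; i++) f = (f*i)%M;
+    return f;
+}
+
+// Returns "n choose k" mod M for 0 <= k <= n < M.
+long long binomial(int n, int k) {
+    long long denominator = (factorial(k)*factorial(n-k))%M;
+    return factorial(n)*modpow(denominator,M-2,M)%M;
+}
+\end{lstlisting}
+
+If many coefficients are needed, the factorials could be
+precalculated into an array instead of recalculating them in each call.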
+
+\subsubsection{Properties}
+
+For binomial coefficients,
+\[
+{n \choose k}  =  {n \choose n-k},
+\]
+because we actually divide a set of $n$ elements into
+two subsets: the first contains $k$ elements
+and the second contains $n-k$ elements.
+
+The sum of binomial coefficients is
+\[
+{n \choose 0}+{n \choose 1}+{n \choose 2}+\ldots+{n \choose n}=2^n.
+\]
+
+The reason for the name ''binomial coefficient''
+can be seen when the binomial $(a+b)$ is raised to
+the $n$th power:
+
+\[ (a+b)^n =
+{n \choose 0} a^n b^0 + 
+{n \choose 1} a^{n-1} b^1 +
+\ldots + 
+{n \choose n-1} a^1 b^{n-1} +
+{n \choose n} a^0 b^n. \]
+
+\index{Pascal's triangle}
+
+Binomial coefficients also appear in
+\key{Pascal's triangle}
+where each value equals the sum of two
+above values:
+\begin{center}
+\begin{tikzpicture}
+\node at (0,0) {1};
+\node at (-0.5,-0.5) {1};
+\node at (0.5,-0.5) {1};
+\node at (-1,-1) {1};
+\node at (0,-1) {2};
+\node at (1,-1) {1};
+\node at (-1.5,-1.5) {1};
+\node at (-0.5,-1.5) {3};
+\node at (0.5,-1.5) {3};
+\node at (1.5,-1.5) {1};
+\node at (-2,-2) {1};
+\node at (-1,-2) {4};
+\node at (0,-2) {6};
+\node at (1,-2) {4};
+\node at (2,-2) {1};
+\node at (-2,-2.5) {$\ldots$};
+\node at (-1,-2.5) {$\ldots$};
+\node at (0,-2.5) {$\ldots$};
+\node at (1,-2.5) {$\ldots$};
+\node at (2,-2.5) {$\ldots$};
+\end{tikzpicture}
+\end{center}
+
+\subsubsection{Boxes and balls}
+
+''Boxes and balls'' is a useful model,
+where we count the ways to
+place $k$ balls in $n$ boxes.
+Let us consider three scenarios:
+
+\textit{Scenario 1}: Each box can contain
+at most one ball.
+For example, when $n=5$ and $k=2$,
+there are 10 solutions:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.5]
+\newcommand\lax[3]{
+\path[draw,thick,-] (#1-0.5,#2+0.5) -- (#1-0.5,#2-0.5) --
+                    (#1+0.5,#2-0.5) -- (#1+0.5,#2+0.5);
+\ifthenelse{\equal{#3}{1}}{\draw[fill=black] (#1,#2-0.3) circle (0.15);}{}
+\ifthenelse{\equal{#3}{2}}{\draw[fill=black] (#1-0.2,#2-0.3) circle (0.15);}{}
+\ifthenelse{\equal{#3}{2}}{\draw[fill=black] (#1+0.2,#2-0.3) circle (0.15);}{}
+}
+\newcommand\laa[7]{
+    \lax{#1}{#2}{#3}
+    \lax{#1+1.2}{#2}{#4}
+    \lax{#1+2.4}{#2}{#5}
+    \lax{#1+3.6}{#2}{#6}
+    \lax{#1+4.8}{#2}{#7}
+}
+
+\laa{0}{0}{1}{1}{0}{0}{0}
+\laa{0}{-2}{1}{0}{1}{0}{0}
+\laa{0}{-4}{1}{0}{0}{1}{0}
+\laa{0}{-6}{1}{0}{0}{0}{1}
+\laa{8}{0}{0}{1}{1}{0}{0}
+\laa{8}{-2}{0}{1}{0}{1}{0}
+\laa{8}{-4}{0}{1}{0}{0}{1}
+\laa{16}{0}{0}{0}{1}{1}{0}
+\laa{16}{-2}{0}{0}{1}{0}{1}
+\laa{16}{-4}{0}{0}{0}{1}{1}
+
+\end{tikzpicture}
+\end{center}
+
+In this scenario, the answer is directly the
+binomial coefficient ${n \choose k}$.
+
+\textit{Scenario 2}: A box can contain multiple balls.
+For example, when $n=5$ and $k=2$,
+there are 15 solutions:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.5]
+\newcommand\lax[3]{
+\path[draw,thick,-] (#1-0.5,#2+0.5) -- (#1-0.5,#2-0.5) --
+                    (#1+0.5,#2-0.5) -- (#1+0.5,#2+0.5);
+\ifthenelse{\equal{#3}{1}}{\draw[fill=black] (#1,#2-0.3) circle (0.15);}{}
+\ifthenelse{\equal{#3}{2}}{\draw[fill=black] (#1-0.2,#2-0.3) circle (0.15);}{}
+\ifthenelse{\equal{#3}{2}}{\draw[fill=black] (#1+0.2,#2-0.3) circle (0.15);}{}
+}
+\newcommand\laa[7]{
+    \lax{#1}{#2}{#3}
+    \lax{#1+1.2}{#2}{#4}
+    \lax{#1+2.4}{#2}{#5}
+    \lax{#1+3.6}{#2}{#6}
+    \lax{#1+4.8}{#2}{#7}
+}
+
+\laa{0}{0}{2}{0}{0}{0}{0}
+\laa{0}{-2}{1}{1}{0}{0}{0}
+\laa{0}{-4}{1}{0}{1}{0}{0}
+\laa{0}{-6}{1}{0}{0}{1}{0}
+\laa{0}{-8}{1}{0}{0}{0}{1}
+\laa{8}{0}{0}{2}{0}{0}{0}
+\laa{8}{-2}{0}{1}{1}{0}{0}
+\laa{8}{-4}{0}{1}{0}{1}{0}
+\laa{8}{-6}{0}{1}{0}{0}{1}
+\laa{8}{-8}{0}{0}{2}{0}{0}
+\laa{16}{0}{0}{0}{1}{1}{0}
+\laa{16}{-2}{0}{0}{1}{0}{1}
+\laa{16}{-4}{0}{0}{0}{2}{0}
+\laa{16}{-6}{0}{0}{0}{1}{1}
+\laa{16}{-8}{0}{0}{0}{0}{2}
+
+\end{tikzpicture}
+\end{center}
+
+The process of placing the balls in the boxes
+can be represented as a string
+that consists of symbols
+''o'' and ''$\rightarrow$''.
+Initially, assume that we are standing at the leftmost box.
+The symbol ''o'' means that we place a ball
+in the current box, and the symbol
+''$\rightarrow$'' means that we move to
+the next box to the right.
+
+Using this notation, each solution is a string
+that contains $k$ times the symbol ''o'' and
+$n-1$ times the symbol ''$\rightarrow$''.
+For example, the upper-right solution
+in the above picture corresponds to the string
+''$\rightarrow$ $\rightarrow$ o $\rightarrow$ o $\rightarrow$''.
+Thus, the number of solutions is
+${k+n-1 \choose k}$.
+
+\textit{Scenario 3}: Each box may contain at most one ball,
+and in addition, no two adjacent boxes may both contain a ball.
+For example, when $n=5$ and $k=2$,
+there are 6 solutions:
+
+
+\begin{center}
+\begin{tikzpicture}[scale=0.5]
+\newcommand\lax[3]{
+\path[draw,thick,-] (#1-0.5,#2+0.5) -- (#1-0.5,#2-0.5) --
+                    (#1+0.5,#2-0.5) -- (#1+0.5,#2+0.5);
+\ifthenelse{\equal{#3}{1}}{\draw[fill=black] (#1,#2-0.3) circle (0.15);}{}
+\ifthenelse{\equal{#3}{2}}{\draw[fill=black] (#1-0.2,#2-0.3) circle (0.15);}{}
+\ifthenelse{\equal{#3}{2}}{\draw[fill=black] (#1+0.2,#2-0.3) circle (0.15);}{}
+}
+\newcommand\laa[7]{
+    \lax{#1}{#2}{#3}
+    \lax{#1+1.2}{#2}{#4}
+    \lax{#1+2.4}{#2}{#5}
+    \lax{#1+3.6}{#2}{#6}
+    \lax{#1+4.8}{#2}{#7}
+}
+
+\laa{0}{0}{1}{0}{1}{0}{0}
+\laa{0}{-2}{1}{0}{0}{1}{0}
+\laa{8}{0}{1}{0}{0}{0}{1}
+\laa{8}{-2}{0}{1}{0}{1}{0}
+\laa{16}{0}{0}{1}{0}{0}{1}
+\laa{16}{-2}{0}{0}{1}{0}{1}
+\end{tikzpicture}
+\end{center}
+
+In this scenario, we can assume that
+$k$ balls are initially placed in boxes
+and there is an empty box between each
+pair of adjacent balls.
+The remaining task is to choose the
+positions for the remaining empty boxes.
+There are $n-2k+1$ such boxes and
+$k+1$ positions for them.
+Thus, using the formula of scenario 2,
+the number of solutions is
+${n-k+1 \choose n-2k+1}$.
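+
+As a quick check, substituting $n=5$ and $k=2$ into the formulas
+of the three scenarios gives the numbers of solutions shown in the pictures:
+\[
+{5 \choose 2} = 10, \qquad
+{2+5-1 \choose 2} = {6 \choose 2} = 15, \qquad
+{5-2+1 \choose 5-2 \cdot 2+1} = {4 \choose 2} = 6.
+\]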
+
+\subsubsection{Multinomial coefficients}
+
+\index{multinomial coefficient}
+
+The \key{multinomial coefficient}
+\[ {n \choose k_1,k_2,\ldots,k_m} = \frac{n!}{k_1! k_2! \cdots k_m!} \]
+equals the number of ways
+we can divide $n$ elements into subsets
+of sizes $k_1,k_2,\ldots,k_m$,
+where $k_1+k_2+\cdots+k_m=n$.
+Multinomial coefficients can be seen as a
+generalization of binomial coefficients;
+if $m=2$, the above formula
+corresponds to the binomial coefficient formula.
+
+\section{Catalan numbers}
+
+\index{Catalan number}
+
+The \key{Catalan number}
+%\footnote{E. C. Catalan (1814--1894) was a Belgian mathematician.}
+$C_n$ equals the
+number of valid
+parenthesis expressions that consist of
+$n$ left parentheses and $n$ right parentheses.
+
+For example, $C_3=5$, because
+we can construct the following parenthesis
+expressions using three
+left and right parentheses:
+
+\begin{itemize}[noitemsep]
+\item \texttt{()()()}
+\item \texttt{(())()}
+\item \texttt{()(())}
+\item \texttt{((()))}
+\item \texttt{(()())}
+\end{itemize}
+
+\subsubsection{Parenthesis expressions}
+
+\index{parenthesis expression}
+
+What exactly is a \emph{valid parenthesis expression}?
+The following rules precisely define all
+valid parenthesis expressions:
+
+\begin{itemize}
+\item An empty parenthesis expression is valid.
+\item If an expression $A$ is valid,
+then also the expression
+\texttt{(}$A$\texttt{)} is valid.
+\item If expressions $A$ and $B$ are valid,
+then also the expression $AB$ is valid.
+\end{itemize}
+
+Another way to characterize valid 
+parenthesis expressions is that if
+we choose any prefix of such an expression,
+it has to contain at least as many left
+parentheses as right parentheses.
+In addition, the complete expression has to
+contain an equal number of left and right
+parentheses.
+
+\subsubsection{Formula 1}
+
+Catalan numbers can be calculated using the formula
+\[ C_n = \sum_{i=0}^{n-1} C_{i} C_{n-i-1}.\]
+
+The sum goes through the ways to divide the
+expression into two parts
+such that both parts are valid
+expressions and the first part is as short as possible
+but not empty.
+For any $i$, the first part contains $i+1$ pairs
+of parentheses and the number of expressions
+is the product of the following values:
+
+\begin{itemize}
+\item $C_{i}$: the number of ways to construct an expression
+using the parentheses of the first part,
+not counting the outermost parentheses
+\item $C_{n-i-1}$: the number of ways to construct an
+expression using the parentheses of the second part
+\end{itemize}
+
+The base case is $C_0=1$,
+because we can construct an empty parenthesis
+expression using zero pairs of parentheses.
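+
+The formula leads directly to an $O(n^2)$ time algorithm.
+The following is a minimal sketch in which the function name
+\texttt{catalans} is an assumption made for this illustration;
+it assumes $0 \le n \le 35$ so that the values fit in a \texttt{long long}.
+
+\begin{lstlisting}
+#include <vector>
+using namespace std;
+
+// Returns the Catalan numbers C_0,C_1,...,C_n.
+vector<long long> catalans(int n) {
+    vector<long long> c(n+1);
+    c[0] = 1; // the empty expression
+    for (int j = 1; j <= n; j++) {
+        for (int i = 0; i < j; i++) {
+            c[j] += c[i]*c[j-i-1];
+        }
+    }
+    return c;
+}
+\end{lstlisting}
+
+For example, \texttt{catalans(5)} returns the values $1,1,2,5,14,42$.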
+
+\subsubsection{Formula 2}
+
+Catalan numbers can also be calculated
+using binomial coefficients:
+\[ C_n = \frac{1}{n+1} {2n \choose n}\]
+The formula can be explained as follows:
+
+There are a total of ${2n \choose n}$ ways
+to construct a (not necessarily valid)
+parenthesis expression that contains $n$ left
+parentheses and $n$ right parentheses.
+Let us calculate the number of such
+expressions that are \emph{not} valid.
+
+If a parenthesis expression is not valid,
+it has to contain a prefix where the
+number of right parentheses exceeds the
+number of left parentheses.
+The idea is to reverse each parenthesis
+that belongs to such a prefix.
+For example, the expression
+\texttt{())()(} contains a prefix \texttt{())},
+and after reversing the prefix,
+the expression becomes \texttt{)((()(}.
+
+The resulting expression consists of $n+1$
+left parentheses and $n-1$ right parentheses.
+The number of such expressions is ${2n \choose n+1}$,
+which equals the number of non-valid
+parenthesis expressions.
+Thus, the number of valid parenthesis
+expressions can be calculated using the formula
+\[{2n \choose n}-{2n \choose n+1} = {2n \choose n} - \frac{n}{n+1} {2n \choose n} = \frac{1}{n+1} {2n \choose n}.\]
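+
+The binomial formula can be evaluated in $O(n)$ time.
+The following is a minimal sketch that builds the value of
+${2n \choose n}$ as a running product;
+the function name \texttt{catalan} is an assumption made for this
+illustration, and the code assumes $0 \le n \le 30$ so that the
+intermediate products fit in a \texttt{long long}.
+
+\begin{lstlisting}
+// Returns the Catalan number C_n using the binomial formula.
+long long catalan(int n) {
+    long long b = 1; // after step i, b equals "n+i choose i"
+    for (int i = 1; i <= n; i++) {
+        b = b*(n+i)/i; // the division is always exact
+    }
+    return b/(n+1);
+}
+\end{lstlisting}
+
+For example, \texttt{catalan(3)} returns 5.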
+
+\subsubsection{Counting trees}
+
+Catalan numbers are also related to trees:
+
+\begin{itemize}
+\item there are $C_n$ binary trees of $n$ nodes
+\item there are $C_{n-1}$ rooted trees of $n$ nodes
+\end{itemize}
+\noindent
+For example, for $C_3=5$, the binary trees are
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\path[draw,thick,-] (0,0) -- (-1,-1);
+\path[draw,thick,-] (0,0) -- (1,-1);
+\draw[fill=white] (0,0) circle (0.3);
+\draw[fill=white] (-1,-1) circle (0.3);
+\draw[fill=white] (1,-1) circle (0.3);
+
+\path[draw,thick,-] (4,0) -- (4-0.75,-1) -- (4-1.5,-2);
+\draw[fill=white] (4,0) circle (0.3);
+\draw[fill=white] (4-0.75,-1) circle (0.3);
+\draw[fill=white] (4-1.5,-2) circle (0.3);
+
+\path[draw,thick,-] (6.5,0) -- (6.5-0.75,-1) -- (6.5-0,-2);
+\draw[fill=white] (6.5,0) circle (0.3);
+\draw[fill=white] (6.5-0.75,-1) circle (0.3);
+\draw[fill=white] (6.5-0,-2) circle (0.3);
+
+\path[draw,thick,-] (9,0) -- (9+0.75,-1) -- (9-0,-2);
+\draw[fill=white] (9,0) circle (0.3);
+\draw[fill=white] (9+0.75,-1) circle (0.3);
+\draw[fill=white] (9-0,-2) circle (0.3);
+
+\path[draw,thick,-] (11.5,0) -- (11.5+0.75,-1) -- (11.5+1.5,-2);
+\draw[fill=white] (11.5,0) circle (0.3);
+\draw[fill=white] (11.5+0.75,-1) circle (0.3);
+\draw[fill=white] (11.5+1.5,-2) circle (0.3);
+\end{tikzpicture}
+\end{center}
+and the rooted trees are
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\path[draw,thick,-] (0,0) -- (-1,-1);
+\path[draw,thick,-] (0,0) -- (0,-1);
+\path[draw,thick,-] (0,0) -- (1,-1);
+\draw[fill=white] (0,0) circle (0.3);
+\draw[fill=white] (-1,-1) circle (0.3);
+\draw[fill=white] (0,-1) circle (0.3);
+\draw[fill=white] (1,-1) circle (0.3);
+
+\path[draw,thick,-] (3,0) -- (3,-1) -- (3,-2) -- (3,-3);
+\draw[fill=white] (3,0) circle (0.3);
+\draw[fill=white] (3,-1) circle (0.3);
+\draw[fill=white] (3,-2) circle (0.3);
+\draw[fill=white] (3,-3) circle (0.3);
+
+\path[draw,thick,-] (6+0,0) -- (6-1,-1);
+\path[draw,thick,-] (6+0,0) -- (6+1,-1) -- (6+1,-2);
+\draw[fill=white] (6+0,0) circle (0.3);
+\draw[fill=white] (6-1,-1) circle (0.3);
+\draw[fill=white] (6+1,-1) circle (0.3);
+\draw[fill=white] (6+1,-2) circle (0.3);
+
+\path[draw,thick,-] (9+0,0) -- (9+1,-1);
+\path[draw,thick,-] (9+0,0) -- (9-1,-1) -- (9-1,-2);
+\draw[fill=white] (9+0,0) circle (0.3);
+\draw[fill=white] (9+1,-1) circle (0.3);
+\draw[fill=white] (9-1,-1) circle (0.3);
+\draw[fill=white] (9-1,-2) circle (0.3);
+
+\path[draw,thick,-] (12+0,0) -- (12+0,-1) -- (12-1,-2);
+\path[draw,thick,-] (12+0,0) -- (12+0,-1) -- (12+1,-2);
+\draw[fill=white] (12+0,0) circle (0.3);
+\draw[fill=white] (12+0,-1) circle (0.3);
+\draw[fill=white] (12-1,-2) circle (0.3);
+\draw[fill=white] (12+1,-2) circle (0.3);
+
+\end{tikzpicture}
+\end{center}
+
+\section{Inclusion-exclusion}
+
+\index{inclusion-exclusion}
+
+\key{Inclusion-exclusion} is a technique
+that can be used for counting the size
+of a union of sets when the sizes of
+the intersections are known, and vice versa.
+A simple example of the technique is the formula
+\[ |A \cup B| = |A| + |B| - |A \cap B|,\]
+where $A$ and $B$ are sets and $|X|$
+denotes the size of $X$.
+The formula can be illustrated as follows:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.8]
+
+\draw (0,0) circle (1.5);
+\draw (1.5,0) circle (1.5);
+
+\node at (-0.75,0) {\small $A$};
+\node at (2.25,0) {\small $B$};
+\node at (0.75,0) {\small $A \cap B$};
+
+\end{tikzpicture}
+\end{center}
+
+Our goal is to calculate
+the size of the union $A \cup B$
+that corresponds to the area of the region
+that belongs to at least one circle.
+The picture shows that we can calculate
+the area of $A \cup B$ by first summing the
+areas of $A$ and $B$ and then subtracting
+the area of $A \cap B$.
+
+The same idea can be applied when the number
+of sets is larger.
+When there are three sets, the inclusion-exclusion formula is
+\[ |A \cup B \cup C| = |A| + |B| + |C| - |A \cap B|  - |A \cap C|  - |B \cap C| + |A \cap B \cap C| \]
+and the corresponding picture is
+
+\begin{center}
+\begin{tikzpicture}[scale=0.8]
+
+\draw (0,0) circle (1.75);
+\draw (2,0) circle (1.75);
+\draw (1,1.5) circle (1.75);
+
+\node at (-0.75,-0.25) {\small $A$};
+\node at (2.75,-0.25) {\small $B$};
+\node at (1,2.5) {\small $C$};
+\node at (1,-0.5) {\small $A \cap B$};
+\node at (0,1.25) {\small $A \cap C$};
+\node at (2,1.25) {\small $B \cap C$};
+\node at (1,0.5) {\scriptsize $A \cap B \cap C$};
+
+\end{tikzpicture}
+\end{center}
+
+In the general case, the size of the 
+union $X_1 \cup X_2 \cup \cdots \cup X_n$
+can be calculated by going through all possible
+intersections that contain some of the sets $X_1,X_2,\ldots,X_n$.
+If the intersection contains an odd number of sets,
+its size is added to the answer,
+and otherwise its size is subtracted from the answer.
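+
+As an illustration of the general case, consider a hypothetical example
+where $X_i$ is the set of integers between $1$ and $n$
+that are divisible by a prime $p_i$.
+The intersection of some of the sets consists of the multiples
+of the product of the corresponding primes, so its size is easy to calculate.
+The following is a minimal sketch that goes through all non-empty
+subsets of the sets using bitmasks;
+the function name \texttt{countDivisible} is an assumption made for this
+illustration, and the code assumes a small number of distinct primes
+whose product fits in a \texttt{long long}.
+
+\begin{lstlisting}
+#include <vector>
+using namespace std;
+
+// Counts the integers between 1 and n that are divisible by
+// at least one of the given distinct primes.
+long long countDivisible(long long n, vector<long long> primes) {
+    int k = primes.size();
+    long long total = 0;
+    for (int mask = 1; mask < (1<<k); mask++) {
+        long long product = 1;
+        int sets = 0;
+        for (int i = 0; i < k; i++) {
+            if (mask&(1<<i)) {
+                product *= primes[i];
+                sets++;
+            }
+        }
+        // size of the intersection: multiples of the product
+        long long size = n/product;
+        if (sets%2 == 1) total += size;
+        else total -= size;
+    }
+    return total;
+}
+\end{lstlisting}
+
+For example, \texttt{countDivisible(100,\{2,3,5\})} calculates
+$50+33+20-16-10-6+3=74$.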
+
+Note that there are similar formulas
+for calculating
+the size of an intersection from the sizes of
+unions. For example,
+\[ |A \cap B| = |A| + |B| - |A \cup B|\]
+and
+\[ |A \cap B \cap C| = |A| + |B| + |C| - |A \cup B|  - |A \cup C|  - |B \cup C| + |A \cup B \cup C| .\]
+
+\subsubsection{Derangements}
+
+\index{derangement}
+
+As an example, let us count the number of \key{derangements}
+of elements $\{1,2,\ldots,n\}$, i.e., permutations
+where no element remains in its original place.
+For example, when $n=3$, there are
+two derangements: $(2,3,1)$ and $(3,1,2)$.
+
+One approach for solving the problem is to use
+inclusion-exclusion.
+Let $X_k$ be the set of permutations
+that contain the element $k$ at position $k$.
+For example, when $n=3$, the sets are as follows:
+\[
+\begin{array}{lcl}
+X_1 & = & \{(1,2,3),(1,3,2)\} \\
+X_2 & = & \{(1,2,3),(3,2,1)\} \\
+X_3 & = & \{(1,2,3),(2,1,3)\} \\
+\end{array}
+\]
+Using these sets, the number of derangements equals
+\[ n! - |X_1 \cup X_2 \cup \cdots \cup X_n|, \]
+so it suffices to calculate the size of the union.
+Using inclusion-exclusion, this reduces to
+calculating sizes of intersections which can be
+done efficiently.
+For example, when $n=3$, the size of
+$X_1 \cup X_2 \cup X_3$ is
+\[
+\begin{array}{lcl}
+ & & |X_1| + |X_2| + |X_3| - |X_1 \cap X_2|  - |X_1 \cap X_3|  - |X_2 \cap X_3| + |X_1 \cap X_2 \cap X_3| \\
+ & = & 2+2+2-1-1-1+1 \\
+ & = & 4, \\
+\end{array}
+\]
+so the number of solutions is $3!-4=2$.
+
+It turns out that the problem can also be solved
+without using inclusion-exclusion.
+Let $f(n)$ denote the number of derangements
+for $\{1,2,\ldots,n\}$. We can use the following
+recursive formula:
+
+\begin{equation*}
+    f(n) = \begin{cases}
+               0               & n = 1\\
+               1               & n = 2\\
+               (n-1)(f(n-2) + f(n-1)) & n>2 \\
+           \end{cases}
+\end{equation*}
+
+The formula can be derived by considering
+which element takes the place of the element 1
+in the derangement.
+There are $n-1$ ways to choose an element $x$
+that replaces the element 1.
+In each such choice, there are two options:
+
+\textit{Option 1:} We also replace the element $x$
+with the element 1.
+After this, the remaining task is to construct
+a derangement of $n-2$ elements.
+
+\textit{Option 2:} We replace the element $x$
+with some other element than 1.
+Now we have to construct a derangement
+of $n-1$ elements, because we cannot replace
+the element $x$ with the element $1$, and all other
+elements must be changed.
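+
+The recursion can be evaluated bottom-up.
+The following function is a minimal sketch;
+the name \texttt{derangements} is chosen here for illustration,
+and the result is assumed to fit in a 64-bit integer
+(which holds for $n \le 20$):
+\begin{lstlisting}
+// Number of derangements of {1,...,n} for n >= 1, using
+// f(1) = 0, f(2) = 1, f(n) = (n-1)(f(n-2) + f(n-1)).
+long long derangements(int n) {
+    long long a = 0, b = 1; // a = f(1), b = f(2)
+    if (n == 1) return a;
+    for (int i = 3; i <= n; i++) {
+        long long c = (i-1)*(a+b);
+        a = b;
+        b = c;
+    }
+    return b;
+}
+\end{lstlisting}
+For example, the function returns 2 for $n=3$ and 9 for $n=4$.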
+
+\section{Burnside's lemma}
+
+\index{Burnside's lemma}
+
+\key{Burnside's lemma}
+%\footnote{Actually, Burnside did not discover this lemma; he only mentioned it in his book \cite{bur97}.}
+can be used to count
+the number of combinations so that
+only one representative is counted
+for each group of symmetric combinations.
+Burnside's lemma states that the number of
+combinations is
+\[\sum_{k=1}^n \frac{c(k)}{n},\]
+where there are $n$ ways to change the
+position of a combination,
+and there are $c(k)$ combinations that
+remain unchanged when the $k$th way is applied.
+
+As an example, let us calculate the number of
+necklaces of $n$ pearls,
+where each pearl has $m$ possible colors.
+Two necklaces are symmetric if
+one can be obtained from the other by rotating it.
+For example, the necklace
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw[fill=white] (0,0) circle (1);
+\draw[fill=red] (0,1) circle (0.3);
+\draw[fill=blue] (1,0) circle (0.3);
+\draw[fill=red] (0,-1) circle (0.3);
+\draw[fill=green] (-1,0) circle (0.3);
+\end{tikzpicture}
+\end{center}
+has the following symmetric necklaces:
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw[fill=white] (0,0) circle (1);
+\draw[fill=red] (0,1) circle (0.3);
+\draw[fill=blue] (1,0) circle (0.3);
+\draw[fill=red] (0,-1) circle (0.3);
+\draw[fill=green] (-1,0) circle (0.3);
+
+\draw[fill=white] (4,0) circle (1);
+\draw[fill=green] (4+0,1) circle (0.3);
+\draw[fill=red] (4+1,0) circle (0.3);
+\draw[fill=blue] (4+0,-1) circle (0.3);
+\draw[fill=red] (4+-1,0) circle (0.3);
+
+\draw[fill=white] (8,0) circle (1);
+\draw[fill=red] (8+0,1) circle (0.3);
+\draw[fill=green] (8+1,0) circle (0.3);
+\draw[fill=red] (8+0,-1) circle (0.3);
+\draw[fill=blue] (8+-1,0) circle (0.3);
+
+\draw[fill=white] (12,0) circle (1);
+\draw[fill=blue] (12+0,1) circle (0.3);
+\draw[fill=red] (12+1,0) circle (0.3);
+\draw[fill=green] (12+0,-1) circle (0.3);
+\draw[fill=red] (12+-1,0) circle (0.3);
+\end{tikzpicture}
+\end{center}
+There are $n$ ways to change the position
+of a necklace,
+because we can rotate it
+$0,1,\ldots,n-1$ steps clockwise.
+If the number of steps is 0,
+all $m^n$ necklaces remain the same,
+and if the number of steps is 1,
+only the $m$ necklaces where each
+pearl has the same color remain the same.
+
+More generally, when the number of steps is $k$,
+a total of
+\[m^{\textrm{gcd}(k,n)}\]
+necklaces remain the same,
+where $\textrm{gcd}(k,n)$ is the greatest common
+divisor of $k$ and $n$.
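+The reason for this is that a necklace remains the same
+exactly when every pearl has the same color as
+the pearl $\textrm{gcd}(k,n)$ positions away from it,
+so the colors of $\textrm{gcd}(k,n)$ consecutive pearls
+can be chosen freely and they determine the other pearls.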
+Thus, according to Burnside's lemma,
+the number of necklaces is
+\[\sum_{i=0}^{n-1} \frac{m^{\textrm{gcd}(i,n)}}{n}. \]
+For example, the number of necklaces of length 4
+with 3 colors is
+\[\frac{3^4+3+3^2+3}{4} = 24. \]
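+
+The sum is straightforward to evaluate in code.
+The following sketch assumes that $m^n$ fits in a 64-bit integer;
+the function name \texttt{necklaces} is chosen here for illustration:
+\begin{lstlisting}
+#include <numeric>
+
+// Number of necklaces of n pearls and m colors,
+// counted up to rotation using Burnside's lemma.
+long long necklaces(int n, long long m) {
+    long long sum = 0;
+    for (int i = 0; i < n; i++) {
+        int g = std::gcd(i, n); // gcd(0,n) = n
+        long long p = 1;
+        for (int j = 0; j < g; j++) p *= m;
+        sum += p;
+    }
+    return sum/n;
+}
+\end{lstlisting}
+For example, \texttt{necklaces(4,3)} returns 24.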
+
+\section{Cayley's formula}
+
+\index{Cayley's formula}
+
+\key{Cayley's formula}
+% \footnote{While the formula is named after A. Cayley,
+% who studied it in 1889, it was discovered earlier by C. W. Borchardt in 1860.}
+states that
+there are $n^{n-2}$ labeled trees
+that contain $n$ nodes.
+The nodes are labeled $1,2,\ldots,n$,
+and two trees are different
+if either their structure or
+labeling is different.
+
+\begin{samepage}
+For example, when $n=4$, the number of labeled
+trees is $4^{4-2}=16$:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.8]
+\footnotesize
+
+\newcommand\puua[6]{
+\path[draw,thick,-] (#1,#2) -- (#1-1.25,#2-1.5);
+\path[draw,thick,-] (#1,#2) -- (#1,#2-1.5);
+\path[draw,thick,-] (#1,#2) -- (#1+1.25,#2-1.5);
+\node[draw, circle, fill=white] at (#1,#2) {#3};
+\node[draw, circle, fill=white] at (#1-1.25,#2-1.5) {#4};
+\node[draw, circle, fill=white] at (#1,#2-1.5) {#5};
+\node[draw, circle, fill=white] at (#1+1.25,#2-1.5) {#6};
+}
+\newcommand\puub[6]{
+\path[draw,thick,-] (#1,#2) -- (#1+1,#2);
+\path[draw,thick,-] (#1+1,#2) -- (#1+2,#2);
+\path[draw,thick,-] (#1+2,#2) -- (#1+3,#2);
+\node[draw, circle, fill=white] at (#1,#2) {#3};
+\node[draw, circle, fill=white] at (#1+1,#2) {#4};
+\node[draw, circle, fill=white] at (#1+2,#2) {#5};
+\node[draw, circle, fill=white] at (#1+3,#2) {#6};
+}
+
+\puua{0}{0}{1}{2}{3}{4}
+\puua{4}{0}{2}{1}{3}{4}
+\puua{8}{0}{3}{1}{2}{4}
+\puua{12}{0}{4}{1}{2}{3}
+
+\puub{0}{-3}{1}{2}{3}{4}
+\puub{4.5}{-3}{1}{2}{4}{3}
+\puub{9}{-3}{1}{3}{2}{4}
+\puub{0}{-4.5}{1}{3}{4}{2}
+\puub{4.5}{-4.5}{1}{4}{2}{3}
+\puub{9}{-4.5}{1}{4}{3}{2}
+\puub{0}{-6}{2}{1}{3}{4}
+\puub{4.5}{-6}{2}{1}{4}{3}
+\puub{9}{-6}{2}{3}{1}{4}
+\puub{0}{-7.5}{2}{4}{1}{3}
+\puub{4.5}{-7.5}{3}{1}{2}{4}
+\puub{9}{-7.5}{3}{2}{1}{4}
+\end{tikzpicture}
+\end{center}
+\end{samepage}
+
+Next we will see how Cayley's formula can
+be derived using Prüfer codes.
+
+\subsubsection{Prüfer code}
+
+\index{Prüfer code}
+
+A \key{Prüfer code}
+%\footnote{In 1918, H. Prüfer proved Cayley's theorem using Prüfer codes \cite{pru18}.}
+is a sequence of
+$n-2$ numbers that describes a labeled tree.
+The code is constructed by following a process
+that removes $n-2$ leaves from the tree.
+At each step, the leaf with the smallest label is removed,
+and the label of its only neighbor is added to the code.
+
+For example, let us calculate the Prüfer code
+of the following graph:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (2,3) {$1$};
+\node[draw, circle] (2) at (4,3) {$2$};
+\node[draw, circle] (3) at (2,1) {$3$};
+\node[draw, circle] (4) at (4,1) {$4$};
+\node[draw, circle] (5) at (5.5,2) {$5$};
+
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (4);
+\path[draw,thick,-] (2) -- (4);
+\path[draw,thick,-] (2) -- (5);
+\end{tikzpicture}
+\end{center}
+
+First we remove node 1 and add node 4 to the code:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+%\node[draw, circle] (1) at (2,3) {$1$};
+\node[draw, circle] (2) at (4,3) {$2$};
+\node[draw, circle] (3) at (2,1) {$3$};
+\node[draw, circle] (4) at (4,1) {$4$};
+\node[draw, circle] (5) at (5.5,2) {$5$};
+
+%\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (4);
+\path[draw,thick,-] (2) -- (4);
+\path[draw,thick,-] (2) -- (5);
+\end{tikzpicture}
+\end{center}
+
+Then we remove node 3 and add node 4 to the code:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+%\node[draw, circle] (1) at (2,3) {$1$};
+\node[draw, circle] (2) at (4,3) {$2$};
+%\node[draw, circle] (3) at (2,1) {$3$};
+\node[draw, circle] (4) at (4,1) {$4$};
+\node[draw, circle] (5) at (5.5,2) {$5$};
+
+%\path[draw,thick,-] (1) -- (4);
+%\path[draw,thick,-] (3) -- (4);
+\path[draw,thick,-] (2) -- (4);
+\path[draw,thick,-] (2) -- (5);
+\end{tikzpicture}
+\end{center}
+
+Finally we remove node 4 and add node 2 to the code:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+%\node[draw, circle] (1) at (2,3) {$1$};
+\node[draw, circle] (2) at (4,3) {$2$};
+%\node[draw, circle] (3) at (2,1) {$3$};
+%\node[draw, circle] (4) at (4,1) {$4$};
+\node[draw, circle] (5) at (5.5,2) {$5$};
+
+%\path[draw,thick,-] (1) -- (4);
+%\path[draw,thick,-] (3) -- (4);
+%\path[draw,thick,-] (2) -- (4);
+\path[draw,thick,-] (2) -- (5);
+\end{tikzpicture}
+\end{center}
+
+Thus, the Prüfer code of the graph is $[4,4,2]$.
+
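+We can construct a Prüfer code for any tree,
+and more importantly,
+the original tree can be reconstructed
+from a Prüfer code.
+In fact, every sequence of $n-2$ labels
+is the Prüfer code of exactly one tree.
+Hence, the number of labeled trees
+of $n$ nodes equals
+$n^{n-2}$, the number of sequences
+of length $n-2$ whose elements are
+chosen from the $n$ labels.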
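+
+The construction of a Prüfer code can be sketched in code as follows.
+This is a simple $O(n^2)$ version for illustration;
+the function name \texttt{prufer} and the edge list format
+are assumptions made here, and the input is assumed to be
+a tree whose nodes are labeled $1,2,\ldots,n$:
+\begin{lstlisting}
+#include <set>
+#include <utility>
+#include <vector>
+using namespace std;
+
+// Returns the Prufer code of a tree of n >= 2 nodes.
+vector<int> prufer(int n, const vector<pair<int,int>>& edges) {
+    vector<set<int>> adj(n+1);
+    for (auto [a,b] : edges) {
+        adj[a].insert(b);
+        adj[b].insert(a);
+    }
+    vector<int> code;
+    for (int step = 0; step < n-2; step++) {
+        // find the leaf with the smallest label
+        int leaf = 1;
+        while (adj[leaf].size() != 1) leaf++;
+        int x = *adj[leaf].begin(); // its only neighbor
+        code.push_back(x);
+        adj[x].erase(leaf);
+        adj[leaf].clear();
+    }
+    return code;
+}
+\end{lstlisting}
+For the tree in the above example, the function returns $[4,4,2]$.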
--- a/chapter23.tex
+++ b/chapter23.tex
@ -0,0 +1,856 @@
+\chapter{Matrices}
+
+\index{matrix}
+
+A \key{matrix} is a mathematical concept
+that corresponds to a two-dimensional array
+in programming. For example,
+\[
+A = 
+ \begin{bmatrix}
+  6 & 13 & 7 & 4 \\
+  7 & 0 & 8 & 2 \\
+  9 & 5 & 4 & 18 \\
+ \end{bmatrix}
+\]
+is a matrix of size $3 \times 4$, i.e.,
+it has 3 rows and 4 columns.
+The notation $[i,j]$ refers to
+the element in row $i$ and column $j$
+in a matrix.
+For example, in the above matrix,
+$A[2,3]=8$ and $A[3,1]=9$.
+
+\index{vector}
+
+A special case of a matrix is a \key{vector}
+that is a one-dimensional matrix of size $n \times 1$.
+For example,
+\[
+V =
+\begin{bmatrix}
+4 \\
+7 \\
+5 \\
+\end{bmatrix}
+\]
+is a vector that contains three elements.
+
+\index{transpose}
+
+The \key{transpose} $A^T$ of a matrix $A$
+is obtained when the rows and columns of $A$
+are swapped, i.e., $A^T[i,j]=A[j,i]$:
+\[
+A^T = 
+ \begin{bmatrix}
+  6 & 7 & 9 \\
+  13 & 0 & 5 \\
+  7 & 8 & 4 \\
+  4 & 2 & 18 \\
+ \end{bmatrix}
+\]
+
+\index{square matrix}
+
+A matrix is a \key{square matrix} if it
+has the same number of rows and columns.
+For example, the following matrix is a
+square matrix:
+
+\[
+S = 
+ \begin{bmatrix}
+  3 & 12 & 4  \\
+  5 & 9 & 15  \\
+  0 & 2 & 4 \\
+ \end{bmatrix}
+\]
+
+\section{Operations}
+
+The sum $A+B$ of matrices $A$ and $B$
+is defined if the matrices are of the same size.
+The result is a matrix where each element
+is the sum of the corresponding elements
+in $A$ and $B$.
+
+For example,
+\[
+ \begin{bmatrix}
+  6 & 1 & 4 \\
+  3 & 9 & 2 \\
+ \end{bmatrix}
++
+ \begin{bmatrix}
+  4 & 9 & 3 \\
+  8 & 1 & 3 \\
+ \end{bmatrix}
+=
+ \begin{bmatrix}
+  6+4 & 1+9 & 4+3 \\
+  3+8 & 9+1 & 2+3 \\
+ \end{bmatrix}
+=
+ \begin{bmatrix}
+  10 & 10 & 7 \\
+  11 & 10 & 5 \\
+ \end{bmatrix}.
+\]
+
+Multiplying a matrix $A$ by a value $x$ means
+that each element of $A$ is multiplied by $x$.
+For example,
+\[
+ 2 \cdot \begin{bmatrix}
+  6 & 1 & 4 \\
+  3 & 9 & 2 \\
+ \end{bmatrix}
+=
+ \begin{bmatrix}
+  2 \cdot 6 & 2\cdot1 & 2\cdot4 \\
+  2\cdot3 & 2\cdot9 & 2\cdot2 \\
+ \end{bmatrix}
+=
+ \begin{bmatrix}
+  12 & 2 & 8 \\
+  6 & 18 & 4 \\
+ \end{bmatrix}.
+\]
+
+\subsubsection{Matrix multiplication}
+
+\index{matrix multiplication}
+
+The product $AB$ of matrices $A$ and $B$
+is defined if $A$ is of size $a \times n$
+and $B$ is of size $n \times b$, i.e.,
+the width of $A$ equals the height of $B$.
+The result is a matrix of size $a \times b$
+whose elements are calculated using the formula
+\[
+AB[i,j] = \sum_{k=1}^n A[i,k] \cdot B[k,j].
+\]
+
+The idea is that each element of $AB$
+is a sum of products of elements of $A$ and $B$
+according to the following picture:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.5]
+\draw (0,0) grid (4,3);
+\draw (5,0) grid (10,3);
+\draw (5,4) grid (10,8);
+
+\node at (2,-1) {$A$};
+\node at (7.5,-1) {$AB$};
+\node at (11,6) {$B$};
+
+\draw[thick,->,red,line width=2pt] (0,1.5) -- (4,1.5);
+\draw[thick,->,red,line width=2pt] (6.5,8) -- (6.5,4);
+\draw[thick,red,line width=2pt] (6.5,1.5) circle (0.4);
+\end{tikzpicture}
+\end{center}
+
+For example,
+
+\[
+ \begin{bmatrix}
+  1 & 4 \\
+  3 & 9 \\
+  8 & 6 \\
+ \end{bmatrix}
+\cdot
+ \begin{bmatrix}
+  1 & 6 \\
+  2 & 9 \\
+ \end{bmatrix}
+=
+ \begin{bmatrix}
+  1 \cdot 1 + 4 \cdot 2 & 1 \cdot 6 + 4 \cdot 9 \\
+  3 \cdot 1 + 9 \cdot 2 & 3 \cdot 6 + 9 \cdot 9 \\
+  8 \cdot 1 + 6 \cdot 2 & 8 \cdot 6 + 6 \cdot 9 \\
+ \end{bmatrix}
+=
+ \begin{bmatrix}
+  9 & 42 \\
+  21 & 99 \\
+  20 & 102 \\
+ \end{bmatrix}.
+\]
+
+Matrix multiplication is associative,
+so $A(BC)=(AB)C$ holds,
+but it is not commutative,
+so $AB = BA$ does not usually hold.
+
+\index{identity matrix}
+
+An \key{identity matrix} is a square matrix
+where each element on the diagonal is 1
+and all other elements are 0.
+For example, the following matrix
+is the $3 \times 3$ identity matrix:
+\[
+ I = \begin{bmatrix}
+  1 & 0 & 0 \\
+  0 & 1 & 0 \\
+  0 & 0 & 1 \\
+ \end{bmatrix}
+\]
+
+\begin{samepage}
+Multiplying a matrix by an identity matrix
+does not change it. For example,
+\[
+ \begin{bmatrix}
+  1 & 0 & 0 \\
+  0 & 1 & 0 \\
+  0 & 0 & 1 \\
+ \end{bmatrix}
+\cdot
+ \begin{bmatrix}
+  1 & 4 \\
+  3 & 9 \\
+  8 & 6 \\
+ \end{bmatrix}
+=
+ \begin{bmatrix}
+  1 & 4 \\
+  3 & 9 \\
+  8 & 6 \\
+ \end{bmatrix} \hspace{10px} \textrm{and} \hspace{10px}
+ \begin{bmatrix}
+  1 & 4 \\
+  3 & 9 \\
+  8 & 6 \\
+ \end{bmatrix}
+\cdot
+ \begin{bmatrix}
+  1 & 0 \\
+  0 & 1 \\
+ \end{bmatrix}
+=
+ \begin{bmatrix}
+  1 & 4 \\
+  3 & 9 \\
+  8 & 6 \\
+ \end{bmatrix}.
+\]
+\end{samepage}
+
+Using a straightforward algorithm,
+we can calculate the product of
+two $n \times n$ matrices
+in $O(n^3)$ time.
+There are also more efficient algorithms
+for matrix multiplication\footnote{The first such
+algorithm was Strassen's algorithm,
+published in 1969 \cite{str69},
+whose time complexity is $O(n^{2.80735})$;
+the best current algorithm \cite{gal14}
+works in $O(n^{2.37286})$ time.},
+but they are mostly of theoretical interest
+and such algorithms are not necessary
+in competitive programming.
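+
+The straightforward algorithm follows directly from the formula.
+The following is a minimal sketch that assumes
+matrices are stored as zero-indexed vectors of rows
+and that the values do not overflow;
+the names \texttt{Matrix} and \texttt{multiply}
+are chosen here for illustration:
+\begin{lstlisting}
+#include <vector>
+using namespace std;
+
+typedef vector<vector<long long>> Matrix;
+
+// Product of an a x n matrix A and an n x b matrix B.
+Matrix multiply(const Matrix& A, const Matrix& B) {
+    int a = A.size(), n = B.size(), b = B[0].size();
+    Matrix C(a, vector<long long>(b, 0));
+    for (int i = 0; i < a; i++) {
+        for (int j = 0; j < b; j++) {
+            for (int k = 0; k < n; k++) {
+                C[i][j] += A[i][k]*B[k][j];
+            }
+        }
+    }
+    return C;
+}
+\end{lstlisting}
+When both matrices are of size $n \times n$,
+the three nested loops perform $n^3$ steps in total.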
+
+
+\subsubsection{Matrix power}
+
+\index{matrix power}
+
+The power $A^k$ of a matrix $A$ is defined
+if $A$ is a square matrix.
+The definition is based on matrix multiplication:
+\[ A^k = \underbrace{A \cdot A \cdot A \cdots A}_{\textrm{$k$ times}} \]
+For example,
+
+\[
+ \begin{bmatrix}
+  2 & 5 \\
+  1 & 4 \\
+ \end{bmatrix}^3 =
+ \begin{bmatrix}
+  2 & 5 \\
+  1 & 4 \\
+ \end{bmatrix} \cdot
+ \begin{bmatrix}
+  2 & 5 \\
+  1 & 4 \\
+ \end{bmatrix} \cdot
+ \begin{bmatrix}
+  2 & 5 \\
+  1 & 4 \\
+ \end{bmatrix} =
+ \begin{bmatrix}
+  48 & 165 \\
+  33 & 114 \\
+ \end{bmatrix}.
+\]
+In addition, $A^0$ is an identity matrix. For example,
+\[
+ \begin{bmatrix}
+  2 & 5 \\
+  1 & 4 \\
+ \end{bmatrix}^0 =
+ \begin{bmatrix}
+  1 & 0 \\
+  0 & 1 \\
+ \end{bmatrix}.
+\]
+
+The matrix $A^k$ can be efficiently calculated
+in $O(n^3 \log k)$ time using the
+algorithm in Chapter 21.2. For example,
+\[
+ \begin{bmatrix}
+  2 & 5 \\
+  1 & 4 \\
+ \end{bmatrix}^8 =
+ \begin{bmatrix}
+  2 & 5 \\
+  1 & 4 \\
+ \end{bmatrix}^4 \cdot
+ \begin{bmatrix}
+  2 & 5 \\
+  1 & 4 \\
+ \end{bmatrix}^4.
+\]
+
+\subsubsection{Determinant}
+
+\index{determinant}
+
+The \key{determinant} $\det(A)$ of a matrix $A$
+is defined if $A$ is a square matrix.
+If $A$ is of size $1 \times 1$,
+then $\det(A)=A[1,1]$.
+The determinant of a larger matrix is
+calculated recursively using the formula \index{cofactor}
+\[\det(A)=\sum_{j=1}^n A[1,j] C[1,j],\]
+where $C[i,j]$ is the \key{cofactor} of $A$
+at $[i,j]$.
+The cofactor is calculated using the formula
+\[C[i,j] = (-1)^{i+j} \det(M[i,j]),\]
+where $M[i,j]$ is obtained by removing
+row $i$ and column $j$ from $A$.
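+Due to the coefficient $(-1)^{i+j}$ in the cofactor,
+the determinants $\det(M[1,j])$ are included in the sum
+with alternating signs: the first with a positive sign,
+the second with a negative sign, and so on.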
+For example,
+\[
+\det(
+ \begin{bmatrix}
+  3 & 4 \\
+  1 & 6 \\
+ \end{bmatrix}
+) = 3 \cdot 6 - 4 \cdot 1 = 14 
+\]
+and
+\[
+\det(
+ \begin{bmatrix}
+  2 & 4 & 3 \\
+  5 & 1 & 6 \\
+  7 & 2 & 4 \\
+ \end{bmatrix}
+) = 
+2 \cdot
+\det(
+ \begin{bmatrix}
+  1 & 6 \\
+  2 & 4 \\
+ \end{bmatrix}
+)
+-4 \cdot
+\det(
+ \begin{bmatrix}
+  5 & 6 \\
+  7 & 4 \\
+ \end{bmatrix}
+)
++3 \cdot
+\det(
+ \begin{bmatrix}
+  5 & 1 \\
+  7 & 2 \\
+ \end{bmatrix}
+) = 81.
+\]
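+Written out, the $2 \times 2$ determinants in the expansion are
+$1 \cdot 4 - 6 \cdot 2 = -8$,
+$5 \cdot 4 - 6 \cdot 7 = -22$ and
+$5 \cdot 2 - 1 \cdot 7 = 3$,
+so the sum is
+\[
+2 \cdot (-8) - 4 \cdot (-22) + 3 \cdot 3 = -16 + 88 + 9 = 81.
+\]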
+
+\index{inverse matrix}
+
+The determinant of $A$ tells us
+whether there is an \key{inverse matrix}
+$A^{-1}$ such that $A \cdot A^{-1} = I$,
+where $I$ is an identity matrix.
+It turns out that $A^{-1}$ exists
+exactly when $\det(A) \neq 0$,
+and it can be calculated using the formula
+
+\[A^{-1}[i,j] = \frac{C[j,i]}{\det(A)}.\]
+
+For example,
+
+\[
+\underbrace{
+ \begin{bmatrix}
+  2 & 4 & 3\\
+  5 & 1 & 6\\
+  7 & 2 & 4\\
+ \end{bmatrix}
+}_{A}
+\cdot
+\underbrace{
+ \frac{1}{81}
+ \begin{bmatrix}
+   -8 & -10 & 21 \\
+   22 & -13 & 3 \\
+   3 & 24 & -18 \\
+ \end{bmatrix}
+}_{A^{-1}}
+=
+\underbrace{
+ \begin{bmatrix}
+  1 & 0 & 0 \\
+  0 & 1 & 0 \\
+  0 & 0 & 1 \\
+ \end{bmatrix}
+}_{I}.
+\]
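+The product can be verified entry by entry.
+For example, multiplying the first row of $A$ by the first
+and second columns of the integer matrix gives
+\[
+\begin{array}{lcl}
+2 \cdot (-8) + 4 \cdot 22 + 3 \cdot 3 & = & 81 \\
+2 \cdot (-10) + 4 \cdot (-13) + 3 \cdot 24 & = & 0, \\
+\end{array}
+\]
+so after dividing by $\det(A)=81$, the first row of the product
+begins with the values 1 and 0.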
+
+\section{Linear recurrences}
+
+\index{linear recurrence}
+
+A \key{linear recurrence}
+is a function $f(n)$
+whose initial values are
+$f(0),f(1),\ldots,f(k-1)$
+and larger values
+are calculated recursively using the formula
+\[f(n) = c_1 f(n-1) + c_2 f(n-2) + \ldots + c_k f (n-k),\]
+where $c_1,c_2,\ldots,c_k$ are constant coefficients.
+
+Dynamic programming can be used to calculate
+any value of $f(n)$ in $O(kn)$ time by calculating
+all values of $f(0),f(1),\ldots,f(n)$ one after another.
+However, if $k$ is small, it is possible to calculate
+$f(n)$ much more efficiently in $O(k^3 \log n)$
+time using matrix operations.
+
+\subsubsection{Fibonacci numbers}
+
+\index{Fibonacci number}
+
+A simple example of a linear recurrence is the
+following function that defines the Fibonacci numbers:
+\[
+\begin{array}{lcl}
+f(0) & = & 0 \\
+f(1) & = & 1 \\
+f(n) & = & f(n-1)+f(n-2) \\
+\end{array}
+\]
+In this case, $k=2$ and $c_1=c_2=1$.
+
+\begin{samepage}
+To efficiently calculate Fibonacci numbers,
+we represent the
+Fibonacci formula as a
+square matrix $X$ of size $2 \times 2$,
+for which the following holds:
+\[ X \cdot
+ \begin{bmatrix}
+  f(i) \\
+  f(i+1) \\
+ \end{bmatrix}
+=
+ \begin{bmatrix}
+  f(i+1) \\
+  f(i+2) \\
+ \end{bmatrix}
+ \]
+Thus, values $f(i)$ and $f(i+1)$ are given as
+''input'' for $X$,
+and $X$ calculates values $f(i+1)$ and $f(i+2)$
+from them.
+It turns out that such a matrix is
+
+\[ X = 
+ \begin{bmatrix}
+  0 & 1 \\
+  1 & 1 \\
+ \end{bmatrix}.
+\]
+\end{samepage}
+\noindent
+For example,
+\[
+ \begin{bmatrix}
+  0 & 1 \\
+  1 & 1 \\
+ \end{bmatrix}
+\cdot
+ \begin{bmatrix}
+  f(5) \\
+  f(6) \\
+ \end{bmatrix}
+=
+ \begin{bmatrix}
+  0 & 1 \\
+  1 & 1 \\
+ \end{bmatrix}
+\cdot
+ \begin{bmatrix}
+  5 \\
+  8 \\
+ \end{bmatrix}
+=
+ \begin{bmatrix}
+  8 \\
+  13 \\
+ \end{bmatrix}
+=
+ \begin{bmatrix}
+  f(6) \\
+  f(7) \\
+ \end{bmatrix}.
+\]
+Thus, we can calculate $f(n)$ using the formula
+\[
+ \begin{bmatrix}
+  f(n) \\
+  f(n+1) \\
+ \end{bmatrix}
+=
+X^n \cdot
+ \begin{bmatrix}
+  f(0) \\
+  f(1) \\
+ \end{bmatrix}
+=
+ \begin{bmatrix}
+  0 & 1 \\
+  1 & 1 \\
+ \end{bmatrix}^n
+\cdot
+ \begin{bmatrix}
+  0 \\
+  1 \\
+ \end{bmatrix}.
+\]
+The value of $X^n$ can be calculated in
+$O(\log n)$ time,
+so the value of $f(n)$ can also be calculated
+in $O(\log n)$ time.
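+
+As a concrete sketch, the following code calculates $f(n)$
+using powers of $X$.
+It assumes that the result is wanted modulo a constant
+(here $10^9+7$, an assumption made for this example,
+since Fibonacci numbers grow quickly);
+the names \texttt{M2}, \texttt{mul} and \texttt{fib}
+are chosen here for illustration:
+\begin{lstlisting}
+#include <array>
+using namespace std;
+
+typedef array<array<long long,2>,2> M2;
+const long long MOD = 1000000007;
+
+// Product of two 2 x 2 matrices modulo MOD.
+M2 mul(const M2& a, const M2& b) {
+    M2 c = {{{0,0},{0,0}}};
+    for (int i = 0; i < 2; i++)
+        for (int j = 0; j < 2; j++)
+            for (int k = 0; k < 2; k++)
+                c[i][j] = (c[i][j]+a[i][k]*b[k][j])%MOD;
+    return c;
+}
+
+// f(n) modulo MOD in O(log n) time.
+long long fib(long long n) {
+    M2 r = {{{1,0},{0,1}}}; // identity
+    M2 x = {{{0,1},{1,1}}}; // the matrix X
+    while (n > 0) {
+        if (n%2 == 1) r = mul(r, x);
+        x = mul(x, x);
+        n /= 2;
+    }
+    // X^n * [0,1]^T = [f(n),f(n+1)]^T, so f(n) = r[0][1]
+    return r[0][1];
+}
+\end{lstlisting}
+For example, \texttt{fib(7)} returns 13.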
+
+\subsubsection{General case}
+
+Let us now consider the general case where
+$f(n)$ is any linear recurrence.
+Again, our goal is to construct a matrix $X$
+for which
+
+\[ X \cdot
+ \begin{bmatrix}
+  f(i) \\
+  f(i+1) \\
+  \vdots \\
+  f(i+k-1) \\
+ \end{bmatrix}
+=
+ \begin{bmatrix}
+  f(i+1) \\
+  f(i+2) \\
+  \vdots \\
+  f(i+k) \\
+ \end{bmatrix}.
+\]
+Such a matrix is
+\[
+X =
+ \begin{bmatrix}
+  0 & 1 & 0 & 0 & \cdots & 0 \\
+  0 & 0 & 1 & 0 & \cdots & 0 \\
+  0 & 0 & 0 & 1 & \cdots & 0 \\
+  \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
+  0 & 0 & 0 & 0 & \cdots & 1 \\
+  c_k & c_{k-1} & c_{k-2} & c_{k-3} & \cdots & c_1 \\
+ \end{bmatrix}.
+\]
+In the first $k-1$ rows, each element is 0
+except that one element is 1.
+These rows replace $f(i)$ with $f(i+1)$,
+$f(i+1)$ with $f(i+2)$, and so on.
+The last row contains the coefficients of the recurrence
+to calculate the new value $f(i+k)$.
+
+\begin{samepage}
+Now, $f(n)$ can be calculated in
+$O(k^3 \log n)$ time using the formula
+\[
+ \begin{bmatrix}
+  f(n) \\
+  f(n+1) \\
+  \vdots \\
+  f(n+k-1) \\
+ \end{bmatrix}
+=
+X^n \cdot
+ \begin{bmatrix}
+  f(0) \\
+  f(1) \\
+  \vdots \\
+  f(k-1) \\
+ \end{bmatrix}.
+\]
+\end{samepage}
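+
+As an illustration (this recurrence is chosen here as an example),
+consider $f(n)=f(n-1)+f(n-2)+f(n-3)$
+with initial values $f(0)=0$, $f(1)=0$ and $f(2)=1$.
+Now $k=3$ and $c_1=c_2=c_3=1$, so
+\[
+X =
+ \begin{bmatrix}
+  0 & 1 & 0 \\
+  0 & 0 & 1 \\
+  1 & 1 & 1 \\
+ \end{bmatrix}
+\hspace{10px} \textrm{and} \hspace{10px}
+X \cdot
+ \begin{bmatrix}
+  0 \\
+  0 \\
+  1 \\
+ \end{bmatrix}
+=
+ \begin{bmatrix}
+  0 \\
+  1 \\
+  1 \\
+ \end{bmatrix}
+=
+ \begin{bmatrix}
+  f(1) \\
+  f(2) \\
+  f(3) \\
+ \end{bmatrix}.
+\]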
+
+\section{Graphs and matrices}
+
+\subsubsection{Counting paths}
+
+The powers of an adjacency matrix of a graph
+have an interesting property.
+When $V$ is an adjacency matrix of an unweighted graph,
+the matrix $V^n$ contains the numbers of paths of
+$n$ edges between the nodes in the graph.
+
+For example, for the graph
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,3) {$1$};
+\node[draw, circle] (2) at (1,1) {$4$};
+\node[draw, circle] (3) at (3,3) {$2$};
+\node[draw, circle] (4) at (5,3) {$3$};
+\node[draw, circle] (5) at (3,1) {$5$};
+\node[draw, circle] (6) at (5,1) {$6$};
+
+\path[draw,thick,->,>=latex] (1) -- (2);
+\path[draw,thick,->,>=latex] (2) -- (3);
+\path[draw,thick,->,>=latex] (3) -- (1);
+\path[draw,thick,->,>=latex] (4) -- (3);
+\path[draw,thick,->,>=latex] (3) -- (5);
+\path[draw,thick,->,>=latex] (3) -- (6);
+\path[draw,thick,->,>=latex] (6) -- (4);
+\path[draw,thick,->,>=latex] (6) -- (5);
+\end{tikzpicture}
+\end{center}
+the adjacency matrix is
+\[
+V= \begin{bmatrix}
+  0 & 0 & 0 & 1 & 0 & 0 \\
+  1 & 0 & 0 & 0 & 1 & 1 \\
+  0 & 1 & 0 & 0 & 0 & 0 \\
+  0 & 1 & 0 & 0 & 0 & 0 \\
+  0 & 0 & 0 & 0 & 0 & 0 \\
+  0 & 0 & 1 & 0 & 1 & 0 \\
+ \end{bmatrix}.
+\]
+Now, for example, the matrix
+\[
+V^4= \begin{bmatrix}
+  0 & 0 & 1 & 1 & 1 & 0 \\
+  2 & 0 & 0 & 0 & 2 & 2 \\
+  0 & 2 & 0 & 0 & 0 & 0 \\
+  0 & 2 & 0 & 0 & 0 & 0 \\
+  0 & 0 & 0 & 0 & 0 & 0 \\
+  0 & 0 & 1 & 1 & 1 & 0 \\
+ \end{bmatrix}
+\]
+contains the numbers of paths of 4 edges
+between the nodes.
+For example, $V^4[2,5]=2$,
+because there are two paths of 4 edges
+from node 2 to node 5:
+$2 \rightarrow 1 \rightarrow 4 \rightarrow 2 \rightarrow 5$
+and 
+$2 \rightarrow 6 \rightarrow 3 \rightarrow 2 \rightarrow 5$.
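+
+The reason for this property is that a path of $a+b$ edges
+from node $i$ to node $j$ consists of a path of $a$ edges
+from $i$ to some node $k$ followed by a path of $b$ edges
+from $k$ to $j$, which matches the formula for matrix multiplication:
+\[
+V^{a+b}[i,j] = \sum_{k} V^a[i,k] \cdot V^b[k,j].
+\]
+In the above example, the second row of $V^2$ is $[0,0,1,1,1,0]$
+and the fifth column of $V^2$ is $[0,1,1,1,0,0]$, so
+\[
+V^4[2,5] = 1 \cdot 1 + 1 \cdot 1 + 1 \cdot 0 = 2,
+\]
+where only the terms with $k=3,4,5$ are shown,
+because the other terms are zero.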
+
+\subsubsection{Shortest paths}
+
+Using a similar idea in a weighted graph,
+we can calculate for each pair of nodes the minimum
+length of a path
+between them that contains exactly $n$ edges.
+To calculate this, we have to define matrix multiplication
+in a new way, so that we do not calculate the numbers
+of paths but minimize the lengths of paths.
+
+\begin{samepage}
+As an example, consider the following graph:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,3) {$1$};
+\node[draw, circle] (2) at (1,1) {$4$};
+\node[draw, circle] (3) at (3,3) {$2$};
+\node[draw, circle] (4) at (5,3) {$3$};
+\node[draw, circle] (5) at (3,1) {$5$};
+\node[draw, circle] (6) at (5,1) {$6$};
+
+\path[draw,thick,->,>=latex] (1) -- node[font=\small,label=left:4] {} (2);
+\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=left:1] {} (3);
+\path[draw,thick,->,>=latex] (3) -- node[font=\small,label=north:2] {} (1);
+\path[draw,thick,->,>=latex] (4) -- node[font=\small,label=north:4] {} (3);
+\path[draw,thick,->,>=latex] (3) -- node[font=\small,label=left:1] {} (5);
+\path[draw,thick,->,>=latex] (3) -- node[font=\small,label=left:2] {} (6);
+\path[draw,thick,->,>=latex] (6) -- node[font=\small,label=right:3] {} (4);
+\path[draw,thick,->,>=latex] (6) -- node[font=\small,label=below:2] {} (5);
+\end{tikzpicture}
+\end{center}
+\end{samepage}
+
+Let us construct an adjacency matrix where
+$\infty$ means that an edge does not exist,
+and other values correspond to edge weights.
+The matrix is
+\[
+V= \begin{bmatrix}
+  \infty & \infty & \infty & 4 & \infty & \infty \\
+  2 & \infty & \infty & \infty & 1 & 2 \\
+  \infty & 4 & \infty & \infty & \infty & \infty \\
+  \infty & 1 & \infty & \infty & \infty & \infty \\
+  \infty & \infty & \infty & \infty & \infty & \infty \\
+  \infty & \infty & 3 & \infty & 2 & \infty \\
+ \end{bmatrix}.
+\]
+
+Instead of the formula
+\[
+AB[i,j] = \sum_{k=1}^n A[i,k] \cdot B[k,j]
+\]
+we now use the formula
+\[
+AB[i,j] = \min_{k=1}^n (A[i,k] + B[k,j])
+\]
+for matrix multiplication, so we calculate
+a minimum instead of a sum,
+and a sum of elements instead of a product.
+After this modification,
+matrix powers correspond to
+shortest paths in the graph.
+
+For example, as
+\[
+V^4= \begin{bmatrix}
+  \infty & \infty & 10 & 11 & 9 & \infty \\
+  9 & \infty & \infty & \infty & 8 & 9 \\
+  \infty & 11 & \infty & \infty & \infty & \infty \\
+  \infty & 8 & \infty & \infty & \infty & \infty \\
+  \infty & \infty & \infty & \infty & \infty & \infty \\
+  \infty & \infty & 12 & 13 & 11 & \infty \\
+ \end{bmatrix},
+\]
+we can conclude that the minimum length of a path
+of 4 edges
+from node 2 to node 5 is 8.
+Such a path is
+$2 \rightarrow 1 \rightarrow 4 \rightarrow 2 \rightarrow 5$.
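+
+The modified multiplication can be sketched in code as follows.
+The sketch assumes zero-indexed square matrices and uses
+a large constant to represent $\infty$;
+the names \texttt{Matrix}, \texttt{INF} and \texttt{minplus}
+are chosen here for illustration:
+\begin{lstlisting}
+#include <algorithm>
+#include <vector>
+using namespace std;
+
+typedef vector<vector<long long>> Matrix;
+const long long INF = 1e18;
+
+// Product of two n x n matrices where a sum is replaced
+// by a minimum and a product is replaced by a sum.
+Matrix minplus(const Matrix& A, const Matrix& B) {
+    int n = A.size();
+    Matrix C(n, vector<long long>(n, INF));
+    for (int i = 0; i < n; i++) {
+        for (int j = 0; j < n; j++) {
+            for (int k = 0; k < n; k++) {
+                // skip missing paths so that INF stays INF
+                if (A[i][k] == INF || B[k][j] == INF) continue;
+                C[i][j] = min(C[i][j], A[i][k]+B[k][j]);
+            }
+        }
+    }
+    return C;
+}
+\end{lstlisting}
+Applying the function first to $V$ and $V$, and then
+to the result and itself, gives $V^4$.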
+
+\subsubsection{Kirchhoff's theorem}
+
+\index{Kirchhoff's theorem}
+\index{spanning tree}
+
+\key{Kirchhoff's theorem}
+%\footnote{G. R. Kirchhoff (1824--1887) was a German physicist.}
+provides a way
+to calculate the number of spanning trees
+of a graph as a determinant of a special matrix.
+For example, the graph
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,3) {$1$};
+\node[draw, circle] (2) at (3,3) {$2$};
+\node[draw, circle] (3) at (1,1) {$3$};
+\node[draw, circle] (4) at (3,1) {$4$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (3) -- (4);
+\path[draw,thick,-] (1) -- (4);
+\end{tikzpicture}
+\end{center}
+has three spanning trees:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1a) at (1,3) {$1$};
+\node[draw, circle] (2a) at (3,3) {$2$};
+\node[draw, circle] (3a) at (1,1) {$3$};
+\node[draw, circle] (4a) at (3,1) {$4$};
+
+\path[draw,thick,-] (1a) -- (2a);
+%\path[draw,thick,-] (1a) -- (3a);
+\path[draw,thick,-] (3a) -- (4a);
+\path[draw,thick,-] (1a) -- (4a);
+
+\node[draw, circle] (1b) at (1+4,3) {$1$};
+\node[draw, circle] (2b) at (3+4,3) {$2$};
+\node[draw, circle] (3b) at (1+4,1) {$3$};
+\node[draw, circle] (4b) at (3+4,1) {$4$};
+
+\path[draw,thick,-] (1b) -- (2b);
+\path[draw,thick,-] (1b) -- (3b);
+%\path[draw,thick,-] (3b) -- (4b);
+\path[draw,thick,-] (1b) -- (4b);
+
+\node[draw, circle] (1c) at (1+8,3) {$1$};
+\node[draw, circle] (2c) at (3+8,3) {$2$};
+\node[draw, circle] (3c) at (1+8,1) {$3$};
+\node[draw, circle] (4c) at (3+8,1) {$4$};
+
+\path[draw,thick,-] (1c) -- (2c);
+\path[draw,thick,-] (1c) -- (3c);
+\path[draw,thick,-] (3c) -- (4c);
+%\path[draw,thick,-] (1c) -- (4c);
+\end{tikzpicture}
+\end{center}
+\index{Laplacian matrix}
+To calculate the number of spanning trees,
+we construct a \key{Laplacian matrix} $L$,
+where $L[i,i]$ is the degree of node $i$
+and $L[i,j]=-1$ if there is an edge between
+nodes $i$ and $j$, and otherwise $L[i,j]=0$.
+The Laplacian matrix for the above graph is as follows:
+\[
+L= \begin{bmatrix}
+  3 & -1 & -1 & -1 \\
+  -1 & 1 & 0 & 0 \\
+  -1 & 0 & 2 & -1 \\
+  -1 & 0 & -1 & 2 \\
+ \end{bmatrix}
+\]
+
+It can be shown that
+the number of spanning trees equals
+the determinant of a matrix that is obtained
+when we remove row $i$ and column $i$ from $L$
+for any $i$.
+For example, if we remove the first row
+and column, the result is
+
+\[ \det(
+\begin{bmatrix}
+  1 & 0 & 0 \\
+  0 & 2 & -1 \\
+  0 & -1 & 2 \\
+ \end{bmatrix}
+) =3.\]
+The determinant is always the same,
+regardless of which $i$ we choose.
+More generally, if we remove row $i$ and column $j$,
+the determinant is the same up to the sign $(-1)^{i+j}$.
+
+Note that Cayley's formula in Chapter 22.5 is
+a special case of Kirchhoff's theorem,
+because in a complete graph of $n$ nodes,
+removing the first row and column from $L$
+yields a matrix of size $(n-1) \times (n-1)$ for which
+
+\[ \det(
+\begin{bmatrix}
+  n-1 & -1 & \cdots & -1 \\
+  -1 & n-1 & \cdots & -1 \\
+  \vdots & \vdots & \ddots & \vdots \\
+  -1 & -1 & \cdots & n-1 \\
+ \end{bmatrix}
+) =n^{n-2}.\]
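+For example, when $n=4$, the matrix that remains after removing
+one row and one column is of size $3 \times 3$, and
+expanding along its first row gives
+\[ \det(
+\begin{bmatrix}
+  3 & -1 & -1 \\
+  -1 & 3 & -1 \\
+  -1 & -1 & 3 \\
+ \end{bmatrix}
+) = 3 \cdot 8 + 1 \cdot (-4) - 1 \cdot 4 = 16 = 4^{4-2}.\]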
+
+
+
--- a/chapter24.tex
+++ b/chapter24.tex
@ -0,0 +1,689 @@
+\chapter{Probability}
+
+\index{probability}
+
+A \key{probability} is a real number between $0$ and $1$
+that indicates how probable an event is.
+If an event is certain to happen,
+its probability is 1,
+and if an event is impossible,
+its probability is 0.
+The probability of an event is denoted $P(\cdots)$
+where the three dots describe the event.
+
+For example, when throwing a dice,
+the outcome is an integer between $1$ and $6$,
+and the probability of each outcome is $1/6$.
+Thus, we can calculate the following probabilities:
+
+\begin{itemize}[noitemsep]
+\item $P(\textrm{''the outcome is 4''})=1/6$
+\item $P(\textrm{''the outcome is not 6''})=5/6$
+\item $P(\textrm{''the outcome is even''})=1/2$
+\end{itemize}
+
+\section{Calculation}
+
+To calculate the probability of an event,
+we can either use combinatorics
+or simulate the process that generates the event.
+As an example, let us calculate the probability
+of drawing three cards with the same value
+from a shuffled deck of cards
+(for example, $\spadesuit 8$, $\clubsuit 8$ and $\diamondsuit 8$).
+
+\subsubsection*{Method 1}
+
+We can calculate the probability using the formula
+
+\[\frac{\textrm{number of desired outcomes}}{\textrm{total number of outcomes}}.\]
+
+In this problem, the desired outcomes are those
+in which the value of each card is the same.
+There are $13 {4 \choose 3}$ such outcomes,
+because there are $13$ possibilities for the
+value of the cards and ${4 \choose 3}$ ways to
+choose $3$ suits from $4$ possible suits.
+
+There are a total of ${52 \choose 3}$ outcomes,
+because we choose 3 cards from 52 cards.
+Thus, the probability of the event is
+
+\[\frac{13 {4 \choose 3}}{{52 \choose 3}} = \frac{1}{425}.\]
+
+\subsubsection*{Method 2}
+
+Another way to calculate the probability is
+to simulate the process that generates the event.
+In this example, we draw three cards, so the process
+consists of three steps.
+We require that each step of the process is successful.
+
+Drawing the first card certainly succeeds,
+because there are no restrictions.
+The second step succeeds with probability $3/51$,
+because there are 51 cards left and 3 of them
+have the same value as the first card.
+In a similar way, the third step succeeds with probability $2/50$.
+
+The probability that the entire process succeeds is
+
+\[1 \cdot \frac{3}{51} \cdot \frac{2}{50} = \frac{1}{425}.\]
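+
+The result can also be checked by going through
+all ways to choose three cards.
+The following sketch assumes that the cards are numbered
+$0,1,\ldots,51$ so that card $c$ has value $c \bmod 13$
+(a convention chosen here for illustration),
+and the function name \texttt{count\_same} is hypothetical:
+\begin{lstlisting}
+#include <utility>
+using namespace std;
+
+// Returns (number of desired outcomes, total number of
+// outcomes) for drawing three cards from a deck of 52 cards.
+pair<int,int> count_same() {
+    int same = 0, total = 0;
+    for (int a = 0; a < 52; a++) {
+        for (int b = a+1; b < 52; b++) {
+            for (int c = b+1; c < 52; c++) {
+                total++;
+                if (a%13 == b%13 && b%13 == c%13) same++;
+            }
+        }
+    }
+    return {same, total};
+}
+\end{lstlisting}
+The function returns the pair $(52,22100)$,
+and $52/22100=1/425$.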
+
+\section{Events}
+
+An event in probability theory can be represented as a set
+\[A \subset X,\]
+where $X$ contains all possible outcomes
+and $A$ is a subset of outcomes.
+For example, when throwing a dice, the outcomes are
+\[X = \{1,2,3,4,5,6\}.\]
+Now, for example, the event ''the outcome is even''
+corresponds to the set
+\[A = \{2,4,6\}.\]
+
+Each outcome $x$ is assigned a probability $p(x)$.
+Then, the probability $P(A)$ of an event
+$A$ can be calculated as a sum
+of probabilities of outcomes using the formula
+\[P(A) = \sum_{x \in A} p(x).\]
+For example, when throwing a dice,
+$p(x)=1/6$ for each outcome $x$,
+so the probability of the event
+''the outcome is even'' is
+\[p(2)+p(4)+p(6)=1/2.\]
+
+The total probability of the outcomes in $X$ must
+be 1, i.e., $P(X)=1$.
+
+Since the events in probability theory are sets,
+we can manipulate them using standard set operations:
+
+\begin{itemize}
+\item The \key{complement} $\bar A$ means
+''$A$ does not happen''.
+For example, when throwing a dice, 
+the complement of $A=\{2,4,6\}$ is
+$\bar A = \{1,3,5\}$.
+\item The \key{union} $A \cup B$ means
+''$A$ or $B$ happens''.
+For example, the union of
+$A=\{2,5\}$
+and $B=\{4,5,6\}$ is
+$A \cup B = \{2,4,5,6\}$.
+\item The \key{intersection} $A \cap B$ means
+''$A$ and $B$ happen''.
+For example, the intersection of
+$A=\{2,5\}$ and $B=\{4,5,6\}$ is
+$A \cap B = \{5\}$.
+\end{itemize}
+
+\subsubsection{Complement}
+
+The probability of the complement
+$\bar A$ is calculated using the formula
+\[P(\bar A)=1-P(A).\]
+
+Sometimes, we can solve a problem easily
+using complements by solving the opposite problem.
+For example, the probability of getting
+at least one six when throwing a dice ten times is
+\[1-(5/6)^{10}.\]
+
+Here $5/6$ is the probability that the outcome
+of a single throw is not six, and
+$(5/6)^{10}$ is the probability that none of
+the ten throws is a six.
+The complement of this is the answer to the problem.
+
+\subsubsection{Union}
+
+The probability of the union $A \cup B$
+is calculated using the formula
+\[P(A \cup B)=P(A)+P(B)-P(A \cap B).\]
+For example, when throwing a dice,
+the union of the events
+\[A=\textrm{''the outcome is even''}\]
+and
+\[B=\textrm{''the outcome is less than 4''}\]
+is
+\[A \cup B=\textrm{''the outcome is even or less than 4''},\]
+and its probability is
+\[P(A \cup B) = P(A)+P(B)-P(A \cap B)=1/2+1/2-1/6=5/6.\]
+
+If the events $A$ and $B$ are \key{disjoint}, i.e.,
+$A \cap B$ is empty,
+the probability of the event $A \cup B$ is simply
+
+\[P(A \cup B)=P(A)+P(B).\]
+
+\subsubsection{Conditional probability}
+
+\index{conditional probability}
+
+The \key{conditional probability}
+\[P(A | B) = \frac{P(A \cap B)}{P(B)}\]
+is the probability of $A$
+assuming that $B$ happens.
+Hence, when calculating the
+probability of $A$, we only consider the outcomes
+that also belong to $B$.
+
+Using the events $A$ (''the outcome is even'') and
+$B$ (''the outcome is less than 4'') from the previous example,
+\[P(A | B)= 1/3,\]
+because the outcomes of $B$ are
+$\{1,2,3\}$, and one of them is even.
+This is the probability of an even outcome
+if we know that the outcome is between 1 and 3.
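+
+As a check, the same value can be obtained directly from the definition.
+For the events $A=\{2,4,6\}$ and $B=\{1,2,3\}$ we have
+$A \cap B = \{2\}$, so
+\[P(A | B) = \frac{P(A \cap B)}{P(B)} = \frac{1/6}{1/2} = \frac{1}{3}.\]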
+
+\subsubsection{Intersection}
+
+\index{independence}
+
+Using conditional probability,
+the probability of the intersection
+$A \cap B$ can be calculated using the formula
+\[P(A \cap B)=P(A)P(B|A).\]
+Events $A$ and $B$ are \key{independent} if
+\[P(A|B)=P(A) \hspace{10px}\textrm{and}\hspace{10px} P(B|A)=P(B),\]
+which means that the fact that $B$ happens does not
+change the probability of $A$, and vice versa.
+In this case, the probability of the intersection is
+\[P(A \cap B)=P(A)P(B).\]
+For example, when drawing a card from a deck, the events
+\[A = \textrm{''the suit is clubs''}\]
+and
+\[B = \textrm{''the value is four''}\]
+are independent. Hence the event
+\[A \cap B = \textrm{''the card is the four of clubs''}\]
+happens with probability
+\[P(A \cap B)=P(A)P(B)=1/4 \cdot 1/13 = 1/52.\]
+
+\section{Random variables}
+
+\index{random variable}
+
+A \key{random variable} is a value that is generated
+by a random process.
+For example, when throwing two dice,
+a possible random variable is
+\[X=\textrm{''the sum of the outcomes''}.\]
+For example, if the outcomes are $[4,6]$
+(meaning that we first throw a four and then a six),
+then the value of $X$ is 10.
+
+We denote by $P(X=x)$ the probability that
+the value of a random variable $X$ is $x$.
+For example, when throwing two dice,
+$P(X=10)=3/36$,
+because the total number of outcomes is 36
+and there are three possible ways to obtain
+the sum 10: $[4,6]$, $[5,5]$ and $[6,4]$.
+
+\subsubsection{Expected value}
+
+\index{expected value}
+
+The \key{expected value} $E[X]$ indicates the
+average value of a random variable $X$.
+The expected value can be calculated as the sum
+\[\sum_x P(X=x)x,\]
+where $x$ goes through all possible values of $X$.
+
+For example, when throwing a dice,
+the expected outcome is
+\[1/6 \cdot 1 + 1/6 \cdot 2 + 1/6 \cdot 3 + 1/6 \cdot 4 + 1/6 \cdot 5 + 1/6 \cdot 6 = 7/2.\]
+
+A useful property of expected values is \key{linearity}.
+It means that the sum
+$E[X_1+X_2+\cdots+X_n]$
+always equals the sum
+$E[X_1]+E[X_2]+\cdots+E[X_n]$.
+This formula holds even if random variables
+depend on each other.
+
+For example, when throwing two dice,
+the expected sum is
+\[E[X_1+X_2]=E[X_1]+E[X_2]=7/2+7/2=7.\]
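+
+To illustrate that independence is not needed, consider the following
+small example (the variable $Y$ is introduced here only for illustration).
+Let $X$ be the outcome of a single throw and let $Y=7-X$,
+so $Y$ is completely determined by $X$.
+Then $X+Y$ is always 7, and linearity gives the same value:
+\[E[X+Y]=E[X]+E[Y]=7/2+(7-7/2)=7.\]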
+
+Let us now consider a problem where
+$n$ balls are randomly placed in $n$ boxes,
+and our task is to calculate the expected
+number of empty boxes.
+Each ball has an equal probability to
+be placed in any of the boxes.
+For example, if $n=2$, the possibilities
+are as follows:
+\begin{center}
+\begin{tikzpicture}
+\draw (0,0) rectangle (1,1);
+\draw (1.2,0) rectangle (2.2,1);
+\draw (3,0) rectangle (4,1);
+\draw (4.2,0) rectangle (5.2,1);
+\draw (6,0) rectangle (7,1);
+\draw (7.2,0) rectangle (8.2,1);
+\draw (9,0) rectangle (10,1);
+\draw (10.2,0) rectangle (11.2,1);
+
+\draw[fill=blue] (0.5,0.2) circle (0.1);
+\draw[fill=red] (1.7,0.2) circle (0.1);
+\draw[fill=red] (3.5,0.2) circle (0.1);
+\draw[fill=blue] (4.7,0.2) circle (0.1);
+\draw[fill=blue] (6.25,0.2) circle (0.1);
+\draw[fill=red] (6.75,0.2) circle (0.1);
+\draw[fill=blue] (10.45,0.2) circle (0.1);
+\draw[fill=red] (10.95,0.2) circle (0.1);
+\end{tikzpicture}
+\end{center}
+In this case, the expected number of
+empty boxes is
+\[\frac{0+0+1+1}{4} = \frac{1}{2}.\]
+In the general case, the probability that a
+single box is empty is
+\[\Big(\frac{n-1}{n}\Big)^n,\]
+because no ball should be placed in it.
+Hence, using linearity, the expected number of
+empty boxes is
+\[n \cdot \Big(\frac{n-1}{n}\Big)^n.\]
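+
+The linearity step can be written out as follows.
+Let $X_j$ be 1 if box $j$ is empty and 0 otherwise
+(these indicator variables are notation introduced for this derivation).
+The number of empty boxes is $X_1+X_2+\cdots+X_n$, and
+$E[X_j]=P(X_j=1)=\big(\frac{n-1}{n}\big)^n$, so
+\[E[X_1+X_2+\cdots+X_n]=\sum_{j=1}^n E[X_j] = n \cdot \Big(\frac{n-1}{n}\Big)^n.\]
+For $n=2$ this gives $2 \cdot (1/2)^2 = 1/2$,
+which agrees with the value calculated above.
+Note that the variables $X_j$ are not independent,
+but linearity does not require independence.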
+
+\subsubsection{Distributions}
+
+\index{distribution}
+
+The \key{distribution} of a random variable $X$
+shows the probability of each value that
+$X$ may have.
+The distribution consists of values $P(X=x)$.
+For example, when throwing two dice,
+the distribution for their sum is:
+\begin{center}
+\small {
+\begin{tabular}{r|rrrrrrrrrrr}
+$x$ & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 \\
+$P(X=x)$ & $1/36$ & $2/36$ & $3/36$ & $4/36$ & $5/36$ & $6/36$ & $5/36$ & $4/36$ & $3/36$ & $2/36$ & $1/36$ \\
+\end{tabular}
+}
+\end{center}
+
+\index{uniform distribution}
+In a \key{uniform distribution},
+the random variable $X$ has $n$ possible
+values $a,a+1,\ldots,b$ and the probability of each value is $1/n$.
+For example, when throwing a dice,
+$a=1$, $b=6$ and $P(X=x)=1/6$ for each value $x$.
+
+The expected value of $X$ in a uniform distribution is
+\[E[X] = \frac{a+b}{2}.\]
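+
+This follows from the definition of the expected value,
+using $n=b-a+1$ and the sum of an arithmetic progression:
+\[E[X]=\sum_{x=a}^{b} \frac{1}{n} \cdot x = \frac{1}{n} \cdot \frac{n(a+b)}{2} = \frac{a+b}{2}.\]
+For example, for a dice $E[X]=(1+6)/2=7/2$.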
+
+\index{binomial distribution}
+In a \key{binomial distribution}, $n$ attempts
+are made
+and the probability that a single attempt succeeds
+is $p$.
+The random variable $X$ counts the number of
+successful attempts,
+and the probability of a value $x$ is
+\[P(X=x)=p^x (1-p)^{n-x} {n \choose x},\]
+where $p^x$ and $(1-p)^{n-x}$ correspond to
+successful and unsuccessful attempts,
+and ${n \choose x}$ is the number of ways
+we can choose which $x$ of the $n$ attempts succeed.
+
+For example, when throwing a dice ten times,
+the probability of throwing a six exactly
+three times is $(1/6)^3 (5/6)^7 {10 \choose 3}$.
+
+The expected value of $X$ in a binomial distribution is
+\[E[X] = pn.\]
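+
+One way to see this is to use linearity.
+Let $X_i$ be 1 if attempt $i$ succeeds and 0 otherwise
+(notation introduced for this derivation), so $E[X_i]=p$ and
+\[E[X]=E[X_1+X_2+\cdots+X_n]=E[X_1]+E[X_2]+\cdots+E[X_n]=pn.\]
+For example, when throwing a dice ten times,
+the expected number of sixes is $10 \cdot 1/6 = 5/3$.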
+
+\index{geometric distribution}
+In a \key{geometric distribution},
+the probability that an attempt succeeds is $p$,
+and we continue until the first success happens.
+The random variable $X$ counts the number
+of attempts needed, and the probability of
+a value $x$ is
+\[P(X=x)=(1-p)^{x-1} p,\]
+where $(1-p)^{x-1}$ corresponds to the unsuccessful attempts
+and $p$ corresponds to the first successful attempt.
+
+For example, if we throw a dice until we throw a six,
+the probability that the number of throws
+is exactly 4 is $(5/6)^3 \cdot 1/6$.
+
+The expected value of $X$ in a geometric distribution is
+\[E[X]=\frac{1}{p}.\]
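+
+A short way to derive this is to condition on the first attempt
+(a sketch, assuming $p>0$).
+With probability $p$ the first attempt succeeds and we are done
+after one attempt; otherwise we have used one attempt and
+the situation is the same as in the beginning. Hence
+\[E[X]=p \cdot 1 + (1-p)(1+E[X]),\]
+which simplifies to $pE[X]=1$, i.e., $E[X]=1/p$.
+For example, the expected number of throws
+needed to throw a six is $1/(1/6)=6$.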
+
+\section{Markov chains}
+
+\index{Markov chain}
+
+A \key{Markov chain}
+% \footnote{A. A. Markov (1856--1922)
+% was a Russian mathematician.}
+is a random process
+that consists of states and transitions between them.
+For each state, we know the probabilities
+for moving to other states.
+A Markov chain can be represented as a graph
+whose nodes are states and edges are transitions.
+
+As an example, consider a problem
+where we are in floor 1 of an $n$-floor building.
+At each step, we randomly walk either one floor
+up or one floor down, except that we always
+walk one floor up from floor 1 and one floor down
+from floor $n$.
+What is the probability of being in floor $m$
+after $k$ steps?
+
+In this problem, each floor of the building
+corresponds to a state in a Markov chain.
+For example, if $n=5$, the graph is as follows:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (0,0) {$1$};
+\node[draw, circle] (2) at (2,0) {$2$};
+\node[draw, circle] (3) at (4,0) {$3$};
+\node[draw, circle] (4) at (6,0) {$4$};
+\node[draw, circle] (5) at (8,0) {$5$};
+
+\path[draw,thick,->] (1) edge [bend left=40] node[font=\small,label=$1$] {} (2);
+\path[draw,thick,->] (2) edge [bend left=40] node[font=\small,label=$1/2$] {} (3);
+\path[draw,thick,->] (3) edge [bend left=40] node[font=\small,label=$1/2$] {} (4);
+\path[draw,thick,->] (4) edge [bend left=40] node[font=\small,label=$1/2$] {} (5);
+
+\path[draw,thick,->] (5) edge [bend left=40] node[font=\small,label=below:$1$] {} (4);
+\path[draw,thick,->] (4) edge [bend left=40] node[font=\small,label=below:$1/2$] {} (3);
+\path[draw,thick,->] (3) edge [bend left=40] node[font=\small,label=below:$1/2$] {} (2);
+\path[draw,thick,->] (2) edge [bend left=40] node[font=\small,label=below:$1/2$] {} (1);
+
+%\path[draw,thick,->] (1) edge [bend left=40] node[font=\small,label=below:$1$] {} (2);
+\end{tikzpicture}
+\end{center}
+
+The probability distribution
+of a Markov chain is a vector
+$[p_1,p_2,\ldots,p_n]$, where $p_k$ is the
+probability that the current state is $k$.
+The formula $p_1+p_2+\cdots+p_n=1$ always holds.
+
+In the above scenario, the initial distribution is
+$[1,0,0,0,0]$, because we always begin in floor 1.
+The next distribution is $[0,1,0,0,0]$,
+because we can only move from floor 1 to floor 2.
+After this, we can either move one floor up
+or one floor down, so the next distribution is
+$[1/2,0,1/2,0,0]$, and so on.
+
+An efficient way to simulate the walk in
+a Markov chain is to use dynamic programming.
+The idea is to maintain the probability distribution,
+and at each step go through all possible
+transitions between states.
+Using this method, we can simulate
+a walk of $k$ steps in $O(n^2 k)$ time.
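+
+For the building example with $n=5$, one step of this method
+can be written as follows, where $p'_i$ denotes
+the probability of floor $i$ after the step
+(the notation $p'_i$ is introduced here for illustration):
+\[
+\begin{array}{lcl}
+p'_1 & = & p_2/2 \\
+p'_2 & = & p_1 + p_3/2 \\
+p'_3 & = & p_2/2 + p_4/2 \\
+p'_4 & = & p_3/2 + p_5 \\
+p'_5 & = & p_4/2 \\
+\end{array}
+\]
+For example, the distribution $[1/2,0,1/2,0,0]$
+becomes $[0,3/4,0,1/4,0]$.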
+
+The transitions of a Markov chain can also be
+represented as a matrix that updates the
+probability distribution.
+In the above scenario, the matrix is
+
+\[ 
+ \begin{bmatrix}
+  0 & 1/2 & 0 & 0 & 0 \\
+  1 & 0 & 1/2 & 0 & 0 \\
+  0 & 1/2 & 0 & 1/2 & 0 \\
+  0 & 0 & 1/2 & 0 & 1 \\
+  0 & 0 & 0 & 1/2 & 0 \\
+ \end{bmatrix}.
+\]
+
+When we multiply a probability distribution by this matrix,
+we get the new distribution after moving one step.
+For example, we can move from the distribution
+$[1,0,0,0,0]$ to the distribution
+$[0,1,0,0,0]$ as follows:
+
+\[ 
+ \begin{bmatrix}
+  0 & 1/2 & 0 & 0 & 0 \\
+  1 & 0 & 1/2 & 0 & 0 \\
+  0 & 1/2 & 0 & 1/2 & 0 \\
+  0 & 0 & 1/2 & 0 & 1 \\
+  0 & 0 & 0 & 1/2 & 0 \\
+ \end{bmatrix}
+ \begin{bmatrix}
+  1 \\
+  0 \\
+  0 \\
+  0 \\
+  0 \\
+ \end{bmatrix}
+=
+ \begin{bmatrix}
+  0 \\
+  1 \\
+  0 \\
+  0 \\
+  0 \\
+ \end{bmatrix}.
+\]
+
+By calculating matrix powers efficiently,
+we can calculate the distribution after $k$ steps
+in $O(n^3 \log k)$ time.
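+
+In other words, if $M$ is the above matrix and $v$ is the initial
+distribution as a column vector (the names $M$, $v$ and $s$ are
+introduced here for illustration), the distribution after
+$s$ steps is $M^s v$. The power can be calculated
+with the usual repeated squaring:
+\[
+M^s = \begin{cases}
+I & s = 0\\
+M^{s/2} \cdot M^{s/2} & \textrm{$s$ is even and $s>0$}\\
+M^{s-1} \cdot M & \textrm{$s$ is odd}\\
+\end{cases}
+\]
+Each matrix multiplication takes $O(n^3)$ time with the basic
+algorithm, and $O(\log s)$ multiplications are needed.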
+
+\section{Randomized algorithms}
+
+\index{randomized algorithm}
+
+Sometimes we can use randomness for solving a problem,
+even if the problem is not related to probabilities.
+A \key{randomized algorithm} is an algorithm that
+is based on randomness.
+
+\index{Monte Carlo algorithm}
+
+A \key{Monte Carlo algorithm} is a randomized algorithm
+that may sometimes give a wrong answer.
+For such an algorithm to be useful,
+the probability of a wrong answer should be small.
+
+\index{Las Vegas algorithm}
+
+A \key{Las Vegas algorithm} is a randomized algorithm
+that always gives the correct answer,
+but its running time varies randomly.
+The goal is to design an algorithm that is
+efficient with high probability.
+
+Next we will go through three example problems that
+can be solved using randomness.
+
+\subsubsection{Order statistics}
+
+\index{order statistic}
+
+The $k$th \key{order statistic} of an array
+is the element at position $k$ after sorting
+the array in increasing order.
+It is easy to calculate any order statistic
+in $O(n \log n)$ time by first sorting the array,
+but is it really necessary to sort the entire array
+just to find one element?
+
+It turns out that we can find order statistics
+using a randomized algorithm without sorting the array.
+The algorithm, called \key{quickselect}\footnote{In 1961,
+C. A. R. Hoare published two algorithms that
+are efficient on average: \index{quicksort} \index{quickselect}
+\key{quicksort} \cite{hoa61a} for sorting arrays and
+\key{quickselect} \cite{hoa61b} for finding order statistics.}, is a Las Vegas algorithm:
+its running time is usually $O(n)$
+but $O(n^2)$ in the worst case.
+
+The algorithm chooses a random element $x$
+of the array, and moves elements smaller than $x$
+to the left part of the array,
+and all other elements to the right part of the array.
+This takes $O(n)$ time when there are $n$ elements.
+Assume that the left part contains $a$ elements
+and the right part contains $b$ elements.
+If $a=k$ (counting positions from 0), element $x$ is the $k$th order statistic.
+Otherwise, if $a>k$, we recursively find the $k$th order
+statistic for the left part,
+and if $a<k$, we recursively find the $r$th order
+statistic for the right part where $r=k-a$.
+The search continues in a similar way, until the element
+has been found.
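+
+As a small illustration (the array and the random choices are
+made up for this example, and positions are counted from 0),
+consider finding the order statistic $k=4$ of the array
+$[7,2,9,1,5,8]$:
+\begin{itemize}[noitemsep]
+\item Suppose that $x=5$ is chosen. The left part is $[2,1]$
+and the right part is $[7,9,5,8]$, so $a=2$.
+Since $a<k$, we continue in the right part with $r=4-2=2$.
+\item Suppose that $x=8$ is chosen next. The left part is $[7,5]$
+and the right part is $[9,8]$, so $a=2$.
+Since $a=r$, the answer is $8$.
+\end{itemize}
+Indeed, the sorted array is $[1,2,5,7,8,9]$,
+and the element at position 4 is 8.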
+
+When each element $x$ is randomly chosen,
+the size of the array about halves at each step,
+so the time complexity for
+finding the $k$th order statistic is about
+\[n+n/2+n/4+n/8+\cdots < 2n = O(n).\]
+
+The worst case of the algorithm still requires $O(n^2)$ time,
+because it is possible that $x$ is always chosen
+in such a way that it is one of the smallest or largest
+elements in the array and $O(n)$ steps are needed.
+However, the probability of this is so small
+that it never happens in practice.
+
+\subsubsection{Verifying matrix multiplication}
+
+\index{matrix multiplication}
+
+Our next problem is to \emph{verify}
+if $AB=C$ holds when $A$, $B$ and $C$
+are matrices of size $n \times n$.
+Of course, we can solve the problem
+by calculating the product $AB$ again
+(in $O(n^3)$ time using the basic algorithm),
+but one could hope that verifying the
+answer would be easier than calculating it from scratch.
+
+It turns out that we can solve the problem
+using a Monte Carlo algorithm\footnote{R. M. Freivalds published
+this algorithm in 1977 \cite{fre77}, and it is sometimes
+called \index{Freivalds' algorithm} \key{Freivalds' algorithm}.} whose
+time complexity is only $O(n^2)$.
+The idea is simple: we choose a random vector
+$X$ of $n$ elements, and calculate the matrices
+$ABX$ and $CX$. If $ABX=CX$, we report that $AB=C$,
+and otherwise we report that $AB \neq C$.
+
+The time complexity of the algorithm is
+$O(n^2)$, because we can calculate the matrices
+$ABX$ and $CX$ in $O(n^2)$ time.
+We can calculate the matrix $ABX$ efficiently
+by using the representation $A(BX)$, so only two
+multiplications of $n \times n$ and $n \times 1$
+size matrices are needed.
+
+The drawback of the algorithm is
+that there is a small chance that the algorithm
+makes a mistake when it reports that $AB=C$.
+For example, 
+\[
+ \begin{bmatrix}
+  6 & 8 \\
+  1 & 3 \\
+ \end{bmatrix}
+\neq
+ \begin{bmatrix}
+  8 & 7 \\
+  3 & 2 \\
+ \end{bmatrix},
+\]
+but
+\[
+ \begin{bmatrix}
+  6 & 8 \\
+  1 & 3 \\
+ \end{bmatrix}
+ \begin{bmatrix}
+  3 \\
+  6 \\
+ \end{bmatrix}
+=
+ \begin{bmatrix}
+  8 & 7 \\
+  3 & 2 \\
+ \end{bmatrix}
+ \begin{bmatrix}
+  3 \\
+  6 \\
+ \end{bmatrix}.
+\]
+However, in practice, the probability that the
+algorithm makes a mistake is small,
+and we can decrease the probability by
+verifying the result using multiple random vectors $X$
+before reporting that $AB=C$.
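+
+The error probability can be bounded if we fix how $X$ is chosen.
+The following is a sketch under the assumption that each element
+of $X$ is chosen independently and uniformly from $\{0,1\}$;
+the text above does not fix the distribution, so this is only one common choice.
+Let $D=AB-C$. If $AB \neq C$, some element $d_{ij}$ of $D$ is nonzero.
+For any fixed values of the other elements of $X$,
+element $i$ of $DX$ is $d_{ij}x_j$ plus a constant,
+so at most one of the two possible values of $x_j$ makes it zero. Hence
+\[P(ABX=CX) \le 1/2,\]
+and after $t$ independent vectors the probability that
+every test passes is at most $(1/2)^t$.
+In the above example, if the first matrix plays the role of $AB$
+and the second the role of $C$, then
+\[
+ D =
+ \begin{bmatrix}
+  -2 & 1 \\
+  -2 & 1 \\
+ \end{bmatrix},
+\]
+so $DX$ is zero exactly when $x_2=2x_1$,
+which holds for $X=[3,6]$.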
+
+\subsubsection{Graph coloring}
+
+\index{coloring}
+
+Given a graph that contains $n$ nodes and $m$ edges,
+our task is to find a way to color the nodes
+of the graph using two colors so that
+for at least $m/2$ edges, the endpoints 
+have different colors.
+For example, in the graph
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (1,3) {$1$};
+\node[draw, circle] (2) at (4,3) {$2$};
+\node[draw, circle] (3) at (1,1) {$3$};
+\node[draw, circle] (4) at (4,1) {$4$};
+\node[draw, circle] (5) at (6,2) {$5$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (4);
+\path[draw,thick,-] (2) -- (4);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (4) -- (5);
+\end{tikzpicture}
+\end{center}
+a valid coloring is as follows:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle, fill=blue!40] (1) at (1,3) {$1$};
+\node[draw, circle, fill=red!40] (2) at (4,3) {$2$};
+\node[draw, circle, fill=red!40] (3) at (1,1) {$3$};
+\node[draw, circle, fill=blue!40] (4) at (4,1) {$4$};
+\node[draw, circle, fill=blue!40] (5) at (6,2) {$5$};
+
+\path[draw,thick,-] (1) -- (2);
+\path[draw,thick,-] (1) -- (3);
+\path[draw,thick,-] (1) -- (4);
+\path[draw,thick,-] (3) -- (4);
+\path[draw,thick,-] (2) -- (4);
+\path[draw,thick,-] (2) -- (5);
+\path[draw,thick,-] (4) -- (5);
+\end{tikzpicture}
+\end{center}
+The above graph contains 7 edges, and for 5 of them,
+the endpoints have different colors,
+so the coloring is valid.
+
+The problem can be solved using a Las Vegas algorithm
+that generates random colorings until a valid coloring
+has been found.
+In a random coloring, the color of each node is
+independently chosen so that the probability of
+both colors is $1/2$.
+
+In a random coloring, the probability that the endpoints
+of a single edge have different colors is $1/2$.
+Hence, the expected number of edges whose endpoints
+have different colors is $m/2$.
+Since the expected value already reaches the required bound $m/2$,
+valid colorings must exist,
+and in practice we will quickly find one.
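+
+A rough bound on the number of attempts can be derived as follows
+(a sketch; the symbols $Y$ and $q$ are introduced here).
+Let $Y$ be the number of edges whose endpoints have different colors,
+and let $q$ be the probability that a random coloring is valid.
+Since $Y$ is an integer, an invalid coloring has $Y \le (m-1)/2$,
+and always $Y \le m$. Thus
+\[\frac{m}{2} = E[Y] \le q \cdot m + (1-q) \cdot \frac{m-1}{2},\]
+which gives $q \ge 1/(m+1)$.
+The number of attempts follows a geometric distribution,
+so the expected number of attempts is at most $m+1$.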
+
--- a/chapter25.tex
+++ b/chapter25.tex
@ -0,0 +1,801 @@
+\chapter{Game theory}
+
+In this chapter, we will focus on two-player
+games that do not contain random elements.
+Our goal is to find a strategy that we can
+follow to win the game
+no matter what the opponent does,
+if such a strategy exists.
+
+It turns out that there is a general strategy
+for such games,
+and we can analyze the games using the \key{nim theory}.
+First, we will analyze simple games where
+players remove sticks from heaps,
+and after this, we will generalize the strategy
+used in those games to other games.
+
+\section{Game states}
+
+Let us consider a game where there is initially
+a heap of $n$ sticks.
+Players $A$ and $B$ move alternately,
+and player $A$ begins.
+On each move, the player has to remove
+1, 2 or 3 sticks from the heap,
+and the player who removes the last stick wins the game.
+
+For example, if $n=10$, the game may proceed as follows:
+\begin{itemize}[noitemsep]
+\item Player $A$ removes 2 sticks (8 sticks left).
+\item Player $B$ removes 3 sticks (5 sticks left).
+\item Player $A$ removes 1 stick (4 sticks left).
+\item Player $B$ removes 2 sticks (2 sticks left).
+\item Player $A$ removes 2 sticks and wins.
+\end{itemize}
+
+This game consists of states $0,1,2,\ldots,n$,
+where the number of the state corresponds to
+the number of sticks left.
+
+\subsubsection{Winning and losing states}
+
+\index{winning state}
+\index{losing state}
+
+A \key{winning state} is a state where
+the player will win the game if they
+play optimally,
+and a \key{losing state} is a state
+where the player will lose the game if the
+opponent plays optimally.
+It turns out that we can classify all states
+of a game so that each state is either
+a winning state or a losing state.
+
+In the above game, state 0 is clearly a
+losing state, because the player cannot make
+any moves.
+States 1, 2 and 3 are winning states,
+because we can remove 1, 2 or 3 sticks
+and win the game.
+State 4, in turn, is a losing state,
+because any move leads to a state that
+is a winning state for the opponent.
+
+More generally, if there is a move that leads
+from the current state to a losing state,
+the current state is a winning state,
+and otherwise the current state is a losing state.
+Using this observation, we can classify all states
+of a game starting with losing states where
+there are no possible moves.
+
+The states $0 \ldots 15$ of the above game
+can be classified as follows
+($W$ denotes a winning state and $L$ denotes a losing state):
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (16,1);
+
+\node at (0.5,0.5) {$L$};
+\node at (1.5,0.5) {$W$};
+\node at (2.5,0.5) {$W$};
+\node at (3.5,0.5) {$W$};
+\node at (4.5,0.5) {$L$};
+\node at (5.5,0.5) {$W$};
+\node at (6.5,0.5) {$W$};
+\node at (7.5,0.5) {$W$};
+\node at (8.5,0.5) {$L$};
+\node at (9.5,0.5) {$W$};
+\node at (10.5,0.5) {$W$};
+\node at (11.5,0.5) {$W$};
+\node at (12.5,0.5) {$L$};
+\node at (13.5,0.5) {$W$};
+\node at (14.5,0.5) {$W$};
+\node at (15.5,0.5) {$W$};
+
+\footnotesize
+\node at (0.5,1.4) {$0$};
+\node at (1.5,1.4) {$1$};
+\node at (2.5,1.4) {$2$};
+\node at (3.5,1.4) {$3$};
+\node at (4.5,1.4) {$4$};
+\node at (5.5,1.4) {$5$};
+\node at (6.5,1.4) {$6$};
+\node at (7.5,1.4) {$7$};
+\node at (8.5,1.4) {$8$};
+\node at (9.5,1.4) {$9$};
+\node at (10.5,1.4) {$10$};
+\node at (11.5,1.4) {$11$};
+\node at (12.5,1.4) {$12$};
+\node at (13.5,1.4) {$13$};
+\node at (14.5,1.4) {$14$};
+\node at (15.5,1.4) {$15$};
+\end{tikzpicture}
+\end{center}
+
+It is easy to analyze this game:
+a state $k$ is a losing state if $k$ is
+divisible by 4, and otherwise it
+is a winning state.
+An optimal way to play the game is
+to always choose a move after which
+the number of sticks in the heap
+is divisible by 4.
+Eventually, there are no sticks left and
+the opponent has lost.
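+
+Concretely, in a state $k$ that is not divisible by 4,
+such a move is to remove $k \bmod 4$ sticks,
+which is always 1, 2 or 3.
+For example, $10 \bmod 4 = 2$, so in state 10
+we remove 2 sticks and move to state 8,
+which is also the first move of player $A$ in the example game above.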
+
+Of course, this strategy requires that
+the number of sticks is \emph{not} divisible by 4
+when it is our move.
+If it is, there is nothing we can do,
+and the opponent will win the game if
+they play optimally.
+
+\subsubsection{State graph}
+
+Let us now consider another stick game,
+where in each state $k$, it is allowed to remove
+any number $x$ of sticks such that $x$
+is smaller than $k$ and divides $k$.
+For example, in state 8 we may remove
+1, 2 or 4 sticks, but in state 7 the only
+allowed move is to remove 1 stick.
+
+The following picture shows the states
+$1 \ldots 9$ of the game as a \key{state graph},
+whose nodes are the states and edges are the moves between them:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (0,0) {$1$};
+\node[draw, circle] (2) at (2,0) {$2$};
+\node[draw, circle] (3) at (3.5,-1) {$3$};
+\node[draw, circle] (4) at (1.5,-2) {$4$};
+\node[draw, circle] (5) at (3,-2.75) {$5$};
+\node[draw, circle] (6) at (2.5,-4.5) {$6$};
+\node[draw, circle] (7) at (0.5,-3.25) {$7$};
+\node[draw, circle] (8) at (-1,-4) {$8$};
+\node[draw, circle] (9) at (1,-5.5) {$9$};
+
+\path[draw,thick,->,>=latex] (2) -- (1);
+\path[draw,thick,->,>=latex] (3) edge [bend right=20] (2);
+\path[draw,thick,->,>=latex] (4) edge [bend left=20] (2);
+\path[draw,thick,->,>=latex] (4) edge [bend left=20] (3);
+\path[draw,thick,->,>=latex] (5) edge [bend right=20] (4);
+\path[draw,thick,->,>=latex] (6) edge [bend left=20] (5);
+\path[draw,thick,->,>=latex] (6) edge [bend left=20] (4);
+\path[draw,thick,->,>=latex] (6) edge [bend right=40] (3);
+\path[draw,thick,->,>=latex] (7) edge [bend right=20] (6);
+\path[draw,thick,->,>=latex] (8) edge [bend right=20] (7);
+\path[draw,thick,->,>=latex] (8) edge [bend right=20] (6);
+\path[draw,thick,->,>=latex] (8) edge [bend left=20] (4);
+\path[draw,thick,->,>=latex] (9) edge [bend left=20] (8);
+\path[draw,thick,->,>=latex] (9) edge [bend right=20] (6);
+\end{tikzpicture}
+\end{center}
+
+The final state in this game is always state 1,
+which is a losing state, because there are no
+valid moves.
+The classification of states $1 \ldots 9$
+is as follows:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (1,0) grid (10,1);
+
+\node at (1.5,0.5) {$L$};
+\node at (2.5,0.5) {$W$};
+\node at (3.5,0.5) {$L$};
+\node at (4.5,0.5) {$W$};
+\node at (5.5,0.5) {$L$};
+\node at (6.5,0.5) {$W$};
+\node at (7.5,0.5) {$L$};
+\node at (8.5,0.5) {$W$};
+\node at (9.5,0.5) {$L$};
+
+\footnotesize
+\node at (1.5,1.4) {$1$};
+\node at (2.5,1.4) {$2$};
+\node at (3.5,1.4) {$3$};
+\node at (4.5,1.4) {$4$};
+\node at (5.5,1.4) {$5$};
+\node at (6.5,1.4) {$6$};
+\node at (7.5,1.4) {$7$};
+\node at (8.5,1.4) {$8$};
+\node at (9.5,1.4) {$9$};
+\end{tikzpicture}
+\end{center}
+
+Surprisingly, in this game,
+all even-numbered states are winning states,
+and all odd-numbered states are losing states.
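+
+The pattern can be explained by induction on $k$
+(a short argument added here for completeness).
+State 1 is a losing state.
+If $k$ is odd, every divisor $x$ of $k$ is odd,
+so every move leads to an even state $k-x$.
+If $k$ is even, we can remove $x=1$ stick and move to
+the odd state $k-1$.
+Hence, assuming that the claim holds for all smaller states,
+an odd state only has moves to winning states,
+and an even state has a move to a losing state.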
+
+\section{Nim game}
+
+\index{nim game}
+
+The \key{nim game} is a simple game that
+has an important role in game theory,
+because many other games can be played using
+the same strategy.
+First, we focus on nim,
+and then we generalize the strategy
+to other games.
+
+There are $n$ heaps in nim,
+and each heap contains some number of sticks.
+The players move alternately,
+and on each turn, the player chooses
+a heap that still contains sticks
+and removes any number of sticks from it.
+The winner is the player who removes the last stick.
+
+The states in nim are of the form
+$[x_1,x_2,\ldots,x_n]$,
+where $x_k$ denotes the number of sticks in heap $k$.
+For example, $[10,12,5]$ is a game where
+there are three heaps with 10, 12 and 5 sticks.
+The state $[0,0,\ldots,0]$ is a losing state,
+because it is not possible to remove any sticks,
+and this is always the final state.
+
+\subsubsection{Analysis}
+\index{nim sum}
+
+It turns out that we can easily classify
+any nim state by calculating
+the \key{nim sum} $s = x_1 \oplus x_2 \oplus \cdots \oplus x_n$,
+where $\oplus$ is the xor operation\footnote{The optimal strategy
+for nim was published in 1901 by C. L. Bouton \cite{bou01}.}.
+The states whose nim sum is 0 are losing states,
+and all other states are winning states.
+For example, the nim sum of
+$[10,12,5]$ is $10 \oplus 12 \oplus 5 = 3$,
+so the state is a winning state.
+
+But how is the nim sum related to the nim game?
+We can explain this by looking at how the nim
+sum changes when the nim state changes.
+
+\textit{Losing states:}
+The final state $[0,0,\ldots,0]$ is a losing state,
+and its nim sum is 0, as expected.
+In other losing states, any move leads to
+a winning state, because when a single value $x_k$ changes,
+the nim sum also changes, so the nim sum
+is different from 0 after the move.
+
+\textit{Winning states:}
+We can move to a losing state if
+there is any heap $k$ for which $x_k \oplus s < x_k$.
+In this case, we can remove sticks from
+heap $k$ so that it will contain $x_k \oplus s$ sticks,
+which will lead to a losing state.
+There is always such a heap, where $x_k$
+has a one bit at the position of the leftmost
+one bit of $s$.
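+
+To see why this works, let $b$ be the position of the leftmost one bit
+of $s$ (the symbol $b$ is introduced here).
+The values $x_k$ and $x_k \oplus s$ agree in all bits to the left of $b$,
+and bit $b$ is one in $x_k$ but zero in $x_k \oplus s$,
+so $x_k \oplus s < x_k$.
+After the move, the nim sum is
+\[s \oplus x_k \oplus (x_k \oplus s) = 0.\]
+Such a heap exists, because bit $b$ of $s$ is one,
+so an odd number of heaps have a one bit at position $b$.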
+
+As an example, consider the state $[10,12,5]$.
+This state is a winning state,
+because its nim sum is 3.
+Thus, there has to be a move which
+leads to a losing state.
+Next we will find such a move.
+
+The nim sum of the state is as follows:
+
+\begin{center}
+\begin{tabular}{r|r}
+10 & \texttt{1010} \\
+12 & \texttt{1100} \\
+5 & \texttt{0101} \\
+\hline
+3 & \texttt{0011} \\
+\end{tabular}
+\end{center}
+
+In this case, the heap with 10 sticks
+is the only heap that has a one bit
+at the position of the leftmost
+one bit of the nim sum:
+
+\begin{center}
+\begin{tabular}{r|r}
+10 & \texttt{10\underline{1}0} \\
+12 & \texttt{1100} \\
+5 & \texttt{0101} \\
+\hline
+3 & \texttt{00\underline{1}1} \\
+\end{tabular}
+\end{center}
+
+The new size of the heap has to be
+$10 \oplus 3 = 9$,
+so we will remove just one stick.
+After this, the state will be $[9,12,5]$,
+which is a losing state:
+
+\begin{center}
+\begin{tabular}{r|r}
+9 & \texttt{1001} \\
+12 & \texttt{1100} \\
+5 & \texttt{0101} \\
+\hline
+0 & \texttt{0000} \\
+\end{tabular}
+\end{center}
+
+\subsubsection{Misère game}
+
+\index{misère game}
+
+In a \key{misère game}, the goal of the game
+is the opposite,
+so the player who removes the last stick
+loses the game.
+It turns out that the misère nim game can be
+optimally played almost like the standard nim game.
+
+The idea is to first play the misère game
+like the standard game, but change the strategy
+at the end of the game.
+The new strategy will be introduced in a situation
+where each heap would contain at most one stick
+after the next move.
+
+In the standard game, we should choose a move
+after which there is an even number of heaps with one stick.
+However, in the misère game, we choose a move so that
+there is an odd number of heaps with one stick.
+
+This strategy works because a state where the
+strategy changes always appears in the game,
+and this state is a winning state, because
+it contains exactly one heap that has more than one stick
+so the nim sum is not 0.
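+
+As a small illustration (the state is chosen here for the example),
+consider the state $[1,1,3]$, where any move in the third heap
+that leaves at most one stick triggers the change of strategy.
+In the standard game, we move to $[1,1,0]$,
+where two heaps have one stick,
+and in the misère game, we move to $[1,1,1]$,
+where three heaps have one stick.
+In both cases the remaining moves are forced:
+each move removes one heap, so the parity of the
+number of heaps decides who removes the last stick.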
+
+\section{Sprague–Grundy theorem}
+
+\index{Sprague–Grundy theorem}
+
+The \key{Sprague–Grundy theorem}\footnote{The theorem was
+independently discovered by R. Sprague \cite{spr35} and P. M. Grundy \cite{gru39}.} generalizes the
+strategy used in nim to all games that fulfil
+the following requirements:
+
+\begin{itemize}[noitemsep]
+\item There are two players who move alternately.
+\item The game consists of states, and the possible moves
+in a state do not depend on whose turn it is.
+\item The game ends when a player cannot make a move.
+\item The game surely ends sooner or later.
+\item The players have complete information about
+the states and allowed moves, and there is no randomness in the game.
+\end{itemize}
+The idea is to calculate for each game state
+a Grundy number that corresponds to the number of
+sticks in a nim heap.
+When we know the Grundy numbers of all states,
+we can play the game like the nim game.
+
+\subsubsection{Grundy numbers}
+
+\index{Grundy number}
+\index{mex function}
+
+The \key{Grundy number} of a game state is
+\[\textrm{mex}(\{g_1,g_2,\ldots,g_n\}),\]
+where $g_1,g_2,\ldots,g_n$ are the Grundy numbers of the
+states to which we can move,
+and the mex function gives the smallest
+nonnegative number that is not in the set.
+For example, $\textrm{mex}(\{0,1,3\})=2$.
+If there are no possible moves in a state,
+its Grundy number is 0, because
+$\textrm{mex}(\emptyset)=0$.
+
+For example, in the state graph
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (0,0) {\phantom{0}};
+\node[draw, circle] (2) at (2,0) {\phantom{0}};
+\node[draw, circle] (3) at (4,0) {\phantom{0}};
+\node[draw, circle] (4) at (1,-2) {\phantom{0}};
+\node[draw, circle] (5) at (3,-2) {\phantom{0}};
+\node[draw, circle] (6) at (5,-2) {\phantom{0}};
+
+\path[draw,thick,->,>=latex] (2) -- (1);
+\path[draw,thick,->,>=latex] (3) -- (2);
+\path[draw,thick,->,>=latex] (5) -- (4);
+\path[draw,thick,->,>=latex] (6) -- (5);
+\path[draw,thick,->,>=latex] (4) -- (1);
+\path[draw,thick,->,>=latex] (4) -- (2);
+\path[draw,thick,->,>=latex] (5) -- (2);
+\path[draw,thick,->,>=latex] (6) -- (2);
+\end{tikzpicture}
+\end{center}
+the Grundy numbers are as follows:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\node[draw, circle] (1) at (0,0) {0};
+\node[draw, circle] (2) at (2,0) {1};
+\node[draw, circle] (3) at (4,0) {0};
+\node[draw, circle] (4) at (1,-2) {2};
+\node[draw, circle] (5) at (3,-2) {0};
+\node[draw, circle] (6) at (5,-2) {2};
+
+\path[draw,thick,->,>=latex] (2) -- (1);
+\path[draw,thick,->,>=latex] (3) -- (2);
+\path[draw,thick,->,>=latex] (5) -- (4);
+\path[draw,thick,->,>=latex] (6) -- (5);
+\path[draw,thick,->,>=latex] (4) -- (1);
+\path[draw,thick,->,>=latex] (4) -- (2);
+\path[draw,thick,->,>=latex] (5) -- (2);
+\path[draw,thick,->,>=latex] (6) -- (2);
+\end{tikzpicture}
+\end{center}
+The Grundy number of a losing state is 0,
+and the Grundy number of a winning state is
+a positive number.
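+
+As another example, consider the game in the beginning of the chapter
+where 1, 2 or 3 sticks are removed from a single heap.
+Let $g(k)$ denote the Grundy number of state $k$
+(this notation is introduced here for illustration). Then
+\[
+\begin{array}{lcl}
+g(0) & = & \textrm{mex}(\emptyset) = 0 \\
+g(1) & = & \textrm{mex}(\{0\}) = 1 \\
+g(2) & = & \textrm{mex}(\{0,1\}) = 2 \\
+g(3) & = & \textrm{mex}(\{0,1,2\}) = 3 \\
+g(4) & = & \textrm{mex}(\{1,2,3\}) = 0 \\
+\end{array}
+\]
+and the pattern repeats, so $g(k) = k \bmod 4$.
+The states whose Grundy number is 0 are exactly the
+states divisible by 4, i.e., the losing states.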
+
+The Grundy number of a state corresponds to
+the number of sticks in a nim heap.
+If the Grundy number is 0, we can only move to
+states whose Grundy numbers are positive,
+and if the Grundy number is $x>0$, we can move
+to states whose Grundy numbers include all numbers
+$0,1,\ldots,x-1$.
+
+As an example, consider a game where
+the players move a figure in a maze.
+Each square in the maze is either floor or wall.
+On each turn, the player has to move
+the figure some number
+of steps left or up.
+The winner of the game is the player who
+makes the last move.
+
+The following picture shows a possible initial state
+of the game, where @ denotes the figure and *
+denotes a square where it can move.
+
+\begin{center}
+\begin{tikzpicture}[scale=.65]
+  \begin{scope}
+    \fill [color=black] (0, 1) rectangle (1, 2);
+    \fill [color=black] (0, 3) rectangle (1, 4);
+    \fill [color=black] (2, 2) rectangle (3, 3);
+    \fill [color=black] (2, 4) rectangle (3, 5);
+    \fill [color=black] (4, 3) rectangle (5, 4);
+
+    \draw (0, 0) grid (5, 5);
+    
+    \node at (4.5,0.5) {@};
+    \node at (3.5,0.5) {*};
+    \node at (2.5,0.5) {*};
+    \node at (1.5,0.5) {*};
+    \node at (0.5,0.5) {*};
+    \node at (4.5,1.5) {*};
+    \node at (4.5,2.5) {*};
+    
+  \end{scope}
+\end{tikzpicture}
+\end{center}
+
+The states of the game are all floor squares
+of the maze.
+In the above maze, the Grundy numbers
+are as follows:
+
+\begin{center}
+\begin{tikzpicture}[scale=.65]
+  \begin{scope}
+    \fill [color=black] (0, 1) rectangle (1, 2);
+    \fill [color=black] (0, 3) rectangle (1, 4);
+    \fill [color=black] (2, 2) rectangle (3, 3);
+    \fill [color=black] (2, 4) rectangle (3, 5);
+    \fill [color=black] (4, 3) rectangle (5, 4);
+
+    \draw (0, 0) grid (5, 5);
+    
+    \node at (0.5,4.5) {0};
+    \node at (1.5,4.5) {1};
+    \node at (2.5,4.5) {};
+    \node at (3.5,4.5) {0};
+    \node at (4.5,4.5) {1};
+
+    \node at (0.5,3.5) {};
+    \node at (1.5,3.5) {0};
+    \node at (2.5,3.5) {1};
+    \node at (3.5,3.5) {2};
+    \node at (4.5,3.5) {};
+
+    \node at (0.5,2.5) {0};
+    \node at (1.5,2.5) {2};
+    \node at (2.5,2.5) {};
+    \node at (3.5,2.5) {1};
+    \node at (4.5,2.5) {0};
+
+    \node at (0.5,1.5) {};
+    \node at (1.5,1.5) {3};
+    \node at (2.5,1.5) {0};
+    \node at (3.5,1.5) {4};
+    \node at (4.5,1.5) {1};
+
+    \node at (0.5,0.5) {0};
+    \node at (1.5,0.5) {4};
+    \node at (2.5,0.5) {1};
+    \node at (3.5,0.5) {3};
+    \node at (4.5,0.5) {2};
+  \end{scope}
+\end{tikzpicture}
+\end{center}
+
+Thus, each state of the maze game
+corresponds to a heap in the nim game.
+For example, the Grundy number for
+the lower-right square is 2,
+so it is a winning state.
+We can reach a losing state and
+win the game by moving
+either four steps left or
+two steps up.
+
+Note that unlike in the original nim game,
+it may be possible to move to a state whose
+Grundy number is larger than the Grundy number
+of the current state.
+However, the opponent can always choose a move
+that cancels such a move, so it is not possible
+to escape from a losing state.
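+
+The Grundy numbers of the maze can be computed mechanically.
+The following is a minimal sketch, assuming that the maze is given
+as rows from top to bottom where \texttt{.} is floor and \texttt{\#} is wall;
+the names \texttt{mex} and \texttt{mazeGrundy} are chosen here for illustration.

```cpp
#include <string>
#include <vector>
using namespace std;

// Smallest nonnegative integer that does not appear in values.
int mex(const vector<int>& values) {
    vector<bool> seen(values.size() + 1, false);
    for (int v : values) {
        if (v >= 0 && v < (int)seen.size()) seen[v] = true;
    }
    int m = 0;
    while (seen[m]) m++;
    return m;
}

// Grundy numbers for the maze game. Wall squares get the value -1.
vector<vector<int>> mazeGrundy(const vector<string>& maze) {
    int h = maze.size(), w = maze[0].size();
    vector<vector<int>> g(h, vector<int>(w, -1));
    // Rows go from top to bottom and columns from left to right, so every
    // square reachable by moving left or up has already been processed.
    for (int r = 0; r < h; r++) {
        for (int c = 0; c < w; c++) {
            if (maze[r][c] == '#') continue;
            vector<int> next;
            // moves to the left stop at the first wall
            for (int k = c - 1; k >= 0 && maze[r][k] != '#'; k--) next.push_back(g[r][k]);
            // moves up stop at the first wall
            for (int k = r - 1; k >= 0 && maze[k][c] != '#'; k--) next.push_back(g[k][c]);
            g[r][c] = mex(next);
        }
    }
    return g;
}
```

+For the maze above, the rows are
+\texttt{..\#..}, \texttt{\#...\#}, \texttt{..\#..}, \texttt{\#....} and \texttt{.....},
+and the value computed for the lower-right square is 2.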
+
+\subsubsection{Subgames}
+
+Next we will assume that our game consists
+of subgames, and on each turn, the player
+first chooses a subgame and then a move in the subgame.
+The game ends when it is not possible to make any move
+in any subgame.
+
+In this case, the Grundy number of a game
+is the nim sum of the Grundy numbers of the subgames.
+The game can be played like a nim game by calculating
+all Grundy numbers for subgames and then their nim sum.
+
+As an example, consider a game that consists
+of three mazes.
+In this game, on each turn, the player chooses one
+of the mazes and then moves the figure in the maze.
+Assume that the initial state of the game is as follows:
+
+\begin{center}
+\begin{tabular}{ccc}
+\begin{tikzpicture}[scale=.55]
+  \begin{scope}
+    \fill [color=black] (0, 1) rectangle (1, 2);
+    \fill [color=black] (0, 3) rectangle (1, 4);
+    \fill [color=black] (2, 2) rectangle (3, 3);
+    \fill [color=black] (2, 4) rectangle (3, 5);
+    \fill [color=black] (4, 3) rectangle (5, 4);
+
+    \draw (0, 0) grid (5, 5);
+
+    \node at (4.5,0.5) {@};
+
+    \end{scope}
+\end{tikzpicture}
+&
+\begin{tikzpicture}[scale=.55]
+  \begin{scope}
+    \fill [color=black] (1, 1) rectangle (2, 3);
+    \fill [color=black] (2, 3) rectangle (3, 4);
+    \fill [color=black] (4, 4) rectangle (5, 5);
+
+    \draw (0, 0) grid (5, 5);
+    
+    \node at (4.5,0.5) {@};
+
+  \end{scope}
+\end{tikzpicture}
+&
+\begin{tikzpicture}[scale=.55]
+  \begin{scope}
+    \fill [color=black] (1, 1) rectangle (4, 4);
+
+    \draw (0, 0) grid (5, 5);
+    
+    \node at (4.5,0.5) {@};
+  \end{scope}
+\end{tikzpicture}
+\end{tabular}
+\end{center}
+
+The Grundy numbers for the mazes are as follows:
+
+\begin{center}
+\begin{tabular}{ccc}
+\begin{tikzpicture}[scale=.55]
+  \begin{scope}
+    \fill [color=black] (0, 1) rectangle (1, 2);
+    \fill [color=black] (0, 3) rectangle (1, 4);
+    \fill [color=black] (2, 2) rectangle (3, 3);
+    \fill [color=black] (2, 4) rectangle (3, 5);
+    \fill [color=black] (4, 3) rectangle (5, 4);
+
+    \draw (0, 0) grid (5, 5);
+
+    \node at (0.5,4.5) {0};
+    \node at (1.5,4.5) {1};
+    \node at (2.5,4.5) {};
+    \node at (3.5,4.5) {0};
+    \node at (4.5,4.5) {1};
+
+    \node at (0.5,3.5) {};
+    \node at (1.5,3.5) {0};
+    \node at (2.5,3.5) {1};
+    \node at (3.5,3.5) {2};
+    \node at (4.5,3.5) {};
+
+    \node at (0.5,2.5) {0};
+    \node at (1.5,2.5) {2};
+    \node at (2.5,2.5) {};
+    \node at (3.5,2.5) {1};
+    \node at (4.5,2.5) {0};
+
+    \node at (0.5,1.5) {};
+    \node at (1.5,1.5) {3};
+    \node at (2.5,1.5) {0};
+    \node at (3.5,1.5) {4};
+    \node at (4.5,1.5) {1};
+
+    \node at (0.5,0.5) {0};
+    \node at (1.5,0.5) {4};
+    \node at (2.5,0.5) {1};
+    \node at (3.5,0.5) {3};
+    \node at (4.5,0.5) {2};
+    \end{scope}
+\end{tikzpicture}
+&
+\begin{tikzpicture}[scale=.55]
+  \begin{scope}
+    \fill [color=black] (1, 1) rectangle (2, 3);
+    \fill [color=black] (2, 3) rectangle (3, 4);
+    \fill [color=black] (4, 4) rectangle (5, 5);
+
+    \draw (0, 0) grid (5, 5);
+
+    \node at (0.5,4.5) {0};
+    \node at (1.5,4.5) {1};
+    \node at (2.5,4.5) {2};
+    \node at (3.5,4.5) {3};
+    \node at (4.5,4.5) {};
+
+    \node at (0.5,3.5) {1};
+    \node at (1.5,3.5) {0};
+    \node at (2.5,3.5) {};
+    \node at (3.5,3.5) {0};
+    \node at (4.5,3.5) {1};
+
+    \node at (0.5,2.5) {2};
+    \node at (1.5,2.5) {};
+    \node at (2.5,2.5) {0};
+    \node at (3.5,2.5) {1};
+    \node at (4.5,2.5) {2};
+
+    \node at (0.5,1.5) {3};
+    \node at (1.5,1.5) {};
+    \node at (2.5,1.5) {1};
+    \node at (3.5,1.5) {2};
+    \node at (4.5,1.5) {0};
+
+    \node at (0.5,0.5) {4};
+    \node at (1.5,0.5) {0};
+    \node at (2.5,0.5) {2};
+    \node at (3.5,0.5) {5};
+    \node at (4.5,0.5) {3};
+  \end{scope}
+\end{tikzpicture}
+&
+\begin{tikzpicture}[scale=.55]
+  \begin{scope}
+    \fill [color=black] (1, 1) rectangle (4, 4);
+
+    \draw (0, 0) grid (5, 5);
+
+    \node at (0.5,4.5) {0};
+    \node at (1.5,4.5) {1};
+    \node at (2.5,4.5) {2};
+    \node at (3.5,4.5) {3};
+    \node at (4.5,4.5) {4};
+
+    \node at (0.5,3.5) {1};
+    \node at (1.5,3.5) {};
+    \node at (2.5,3.5) {};
+    \node at (3.5,3.5) {};
+    \node at (4.5,3.5) {0};
+
+    \node at (0.5,2.5) {2};
+    \node at (1.5,2.5) {};
+    \node at (2.5,2.5) {};
+    \node at (3.5,2.5) {};
+    \node at (4.5,2.5) {1};
+
+    \node at (0.5,1.5) {3};
+    \node at (1.5,1.5) {};
+    \node at (2.5,1.5) {};
+    \node at (3.5,1.5) {};
+    \node at (4.5,1.5) {2};
+
+    \node at (0.5,0.5) {4};
+    \node at (1.5,0.5) {0};
+    \node at (2.5,0.5) {1};
+    \node at (3.5,0.5) {2};
+    \node at (4.5,0.5) {3};
+  \end{scope}
+\end{tikzpicture}
+\end{tabular}
+\end{center}
+
+In the initial state, the nim sum of the Grundy numbers
+is $2 \oplus 3 \oplus 3 = 2$, so
+the first player can win the game.
+One optimal move is to move two steps up
+in the first maze, which produces the nim sum
+$0 \oplus 3 \oplus 3 = 0$.
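+
+The choice of a winning move can be sketched in code as follows.
+The function names \texttt{nimSum} and \texttt{winningMove} are hypothetical,
+and the code only tells which subgame to move in and which Grundy number
+to aim for; finding the actual move inside the subgame is left to the caller.

```cpp
#include <utility>
#include <vector>
using namespace std;

// Nim sum (bitwise xor) of the Grundy numbers of the subgames.
int nimSum(const vector<int>& grundy) {
    int x = 0;
    for (int g : grundy) x ^= g;
    return x;
}

// Returns {i, t}: moving in subgame i to a state whose Grundy number is t
// makes the nim sum 0. Returns {-1, -1} if the nim sum is already 0.
pair<int,int> winningMove(const vector<int>& grundy) {
    int x = nimSum(grundy);
    if (x == 0) return {-1, -1};
    for (int i = 0; i < (int)grundy.size(); i++) {
        int target = grundy[i] ^ x;
        // a state with Grundy number g can reach every smaller Grundy number
        if (target < grundy[i]) return {i, target};
    }
    return {-1, -1}; // not reached when x != 0
}
```

+With the Grundy numbers $2$, $3$ and $3$, the sketch selects the first maze
+and the target Grundy number $0$, which matches the move described above.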
+
+\subsubsection{Grundy's game}
+
+Sometimes a move in a game divides the game
+into subgames that are independent of each other.
+In this case, the Grundy number of the game is
+
+\[\textrm{mex}(\{g_1, g_2, \ldots, g_n \}),\]
+where $n$ is the number of possible moves and
+\[g_k = a_{k,1} \oplus a_{k,2} \oplus \ldots \oplus a_{k,m},\]
+where move $k$ generates subgames with
+Grundy numbers $a_{k,1},a_{k,2},\ldots,a_{k,m}$.
+
+\index{Grundy's game}
+
+An example of such a game is \key{Grundy's game}.
+Initially, there is a single heap that contains $n$ sticks.
+On each turn, the player chooses a heap and divides
+it into two nonempty heaps such that the heaps
+are of different size.
+The player who makes the last move wins the game.
+
+Let $f(n)$ be the Grundy number of a heap
+that contains $n$ sticks.
+The Grundy number can be calculated by going
+through all ways to divide the heap into
+two heaps.
+For example, when $n=8$, the possibilities
+are $1+7$, $2+6$ and $3+5$, so
+\[f(8)=\textrm{mex}(\{f(1) \oplus f(7), f(2) \oplus f(6), f(3) \oplus f(5)\}).\]
+
+In this game, the value of $f(n)$ is based on the values
+of $f(1),\ldots,f(n-1)$.
+The base cases are $f(1)=f(2)=0$,
+because it is not possible to divide the heaps
+of 1 and 2 sticks.
+The first Grundy numbers are:
+\[
+\begin{array}{lcl}
+f(1) & = & 0 \\
+f(2) & = & 0 \\
+f(3) & = & 1 \\
+f(4) & = & 0 \\
+f(5) & = & 2 \\
+f(6) & = & 1 \\
+f(7) & = & 0 \\
+f(8) & = & 2 \\
+\end{array}
+\]
+The Grundy number for $n=8$ is 2,
+so it is possible to win the game.
+The winning move is to create heaps
+$1+7$, because $f(1) \oplus f(7) = 0$.
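+
+The recurrence can be evaluated directly.
+The following sketch (the function name \texttt{grundysGame} is an assumption)
+computes $f(1),\ldots,f(n)$ by trying every split $a+b$ with $a<b$;
+it is a straightforward implementation, not an optimized one.

```cpp
#include <set>
#include <vector>
using namespace std;

// Returns f[0..n], where f[k] is the Grundy number of a heap of k sticks
// in Grundy's game (index 0 is unused and stays 0).
vector<int> grundysGame(int n) {
    vector<int> f(n + 1, 0);
    // heaps of 1 and 2 sticks cannot be divided, so f[1] = f[2] = 0
    for (int k = 3; k <= n; k++) {
        set<int> reachable;
        // divide k into a + (k - a) with 1 <= a < k - a
        for (int a = 1; 2 * a < k; a++) {
            reachable.insert(f[a] ^ f[k - a]);
        }
        int m = 0;
        while (reachable.count(m)) m++; // mex of the reachable values
        f[k] = m;
    }
    return f;
}
```

+Calling \texttt{grundysGame(8)} yields the values listed in the table above.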
+
--- a/chapter26.tex
+++ b/chapter26.tex
--- a/chapter27.tex
+++ b/chapter27.tex
@ -0,0 +1,559 @@
+\chapter{Square root algorithms}
+
+\index{square root algorithm}
+
+A \key{square root algorithm} is an algorithm
+that has a square root in its time complexity.
+A square root can be seen as a ``poor man's logarithm'':
+the complexity $O(\sqrt n)$ is better than $O(n)$
+but worse than $O(\log n)$.
+In any case, many square root algorithms are fast and usable in practice.
+
+As an example, consider the problem of
+creating a data structure that supports
+two operations on an array:
+modifying an element at a given position
+and calculating the sum of elements in the given range.
+We have previously solved the problem using
+binary indexed trees and segment trees,
+which support both operations in $O(\log n)$ time.
+However, now we will solve the problem
+in another way using a square root structure
+that allows us to modify elements in $O(1)$ time
+and calculate sums in $O(\sqrt n)$ time.
+
+The idea is to divide the array into \emph{blocks}
+of size $\sqrt n$ and to store for each block
+the sum of the elements inside the block.
+For example, an array of 16 elements will be
+divided into blocks of 4 elements as follows:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (16,1);
+
+\draw (0,1) rectangle (4,2);
+\draw (4,1) rectangle (8,2);
+\draw (8,1) rectangle (12,2);
+\draw (12,1) rectangle (16,2);
+
+\node at (0.5, 0.5) {5};
+\node at (1.5, 0.5) {8};
+\node at (2.5, 0.5) {6};
+\node at (3.5, 0.5) {3};
+\node at (4.5, 0.5) {2};
+\node at (5.5, 0.5) {7};
+\node at (6.5, 0.5) {2};
+\node at (7.5, 0.5) {6};
+\node at (8.5, 0.5) {7};
+\node at (9.5, 0.5) {1};
+\node at (10.5, 0.5) {7};
+\node at (11.5, 0.5) {5};
+\node at (12.5, 0.5) {6};
+\node at (13.5, 0.5) {2};
+\node at (14.5, 0.5) {3};
+\node at (15.5, 0.5) {2};
+
+\node at (2, 1.5) {22};
+\node at (6, 1.5) {17};
+\node at (10, 1.5) {20};
+\node at (14, 1.5) {13};
+
+\end{tikzpicture}
+\end{center}
+
+In this structure,
+it is easy to modify array elements,
+because we only need to update
+the sum of a single block
+after each modification,
+which can be done in $O(1)$ time.
+For example, the following picture shows
+how the value of an element and
+the sum of the corresponding block change:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\fill[color=lightgray] (5,0) rectangle (6,1);
+\draw (0,0) grid (16,1);
+
+\fill[color=lightgray] (4,1) rectangle (8,2);
+\draw (0,1) rectangle (4,2);
+\draw (4,1) rectangle (8,2);
+\draw (8,1) rectangle (12,2);
+\draw (12,1) rectangle (16,2);
+
+\node at (0.5, 0.5) {5};
+\node at (1.5, 0.5) {8};
+\node at (2.5, 0.5) {6};
+\node at (3.5, 0.5) {3};
+\node at (4.5, 0.5) {2};
+\node at (5.5, 0.5) {5};
+\node at (6.5, 0.5) {2};
+\node at (7.5, 0.5) {6};
+\node at (8.5, 0.5) {7};
+\node at (9.5, 0.5) {1};
+\node at (10.5, 0.5) {7};
+\node at (11.5, 0.5) {5};
+\node at (12.5, 0.5) {6};
+\node at (13.5, 0.5) {2};
+\node at (14.5, 0.5) {3};
+\node at (15.5, 0.5) {2};
+
+\node at (2, 1.5) {22};
+\node at (6, 1.5) {15};
+\node at (10, 1.5) {20};
+\node at (14, 1.5) {13};
+
+\end{tikzpicture}
+\end{center}
+
+Then, to calculate the sum of elements in a range,
+we divide the range into three parts such that 
+the sum consists of values of single elements
+and sums of blocks between them:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\fill[color=lightgray] (3,0) rectangle (4,1);
+\fill[color=lightgray] (12,0) rectangle (13,1);
+\fill[color=lightgray] (13,0) rectangle (14,1);
+\draw (0,0) grid (16,1);
+
+\fill[color=lightgray] (4,1) rectangle (8,2);
+\fill[color=lightgray] (8,1) rectangle (12,2);
+\draw (0,1) rectangle (4,2);
+\draw (4,1) rectangle (8,2);
+\draw (8,1) rectangle (12,2);
+\draw (12,1) rectangle (16,2);
+
+\node at (0.5, 0.5) {5};
+\node at (1.5, 0.5) {8};
+\node at (2.5, 0.5) {6};
+\node at (3.5, 0.5) {3};
+\node at (4.5, 0.5) {2};
+\node at (5.5, 0.5) {5};
+\node at (6.5, 0.5) {2};
+\node at (7.5, 0.5) {6};
+\node at (8.5, 0.5) {7};
+\node at (9.5, 0.5) {1};
+\node at (10.5, 0.5) {7};
+\node at (11.5, 0.5) {5};
+\node at (12.5, 0.5) {6};
+\node at (13.5, 0.5) {2};
+\node at (14.5, 0.5) {3};
+\node at (15.5, 0.5) {2};
+
+\node at (2, 1.5) {22};
+\node at (6, 1.5) {15};
+\node at (10, 1.5) {20};
+\node at (14, 1.5) {13};
+
+\draw [decoration={brace}, decorate, line width=0.5mm] (14,-0.25) -- (3,-0.25);
+
+\end{tikzpicture}
+\end{center}
+
+Since the number of single elements is $O(\sqrt n)$
+and the number of blocks is also $O(\sqrt n)$,
+the sum query takes $O(\sqrt n)$ time.
+The purpose of the block size $\sqrt n$ is
+that it \emph{balances} two things:
+the array is divided into $\sqrt n$ blocks,
+each of which contains $\sqrt n$ elements.
+
+In practice, it is not necessary to use the
+exact value of $\sqrt n$ as a parameter,
+and instead we may use parameters $k$ and $n/k$ where $k$ is
+different from $\sqrt n$.
+The optimal parameter depends on the problem and input.
+For example, if an algorithm often goes
+through the blocks but rarely inspects
+single elements inside the blocks,
+it may be a good idea to divide the array into
+$k < \sqrt n$ blocks, each of which contains $n/k > \sqrt n$
+elements.
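+
+The structure described above can be sketched as follows.
+This is a minimal version under the assumption that positions are 0-indexed
+and that the block size is $\lfloor \sqrt n \rfloor$;
+the name \texttt{SqrtSum} and its members are chosen here for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>
using namespace std;

struct SqrtSum {
    int n, k;                 // k is the block size (about sqrt(n))
    vector<long long> value;  // array elements
    vector<long long> block;  // block[j] = sum of the elements in block j

    SqrtSum(const vector<long long>& a) : n(a.size()), value(a) {
        k = max(1, (int)sqrt((double)n));
        block.assign((n + k - 1) / k, 0);
        for (int i = 0; i < n; i++) block[i / k] += value[i];
    }

    // Sets the element at position i to x in O(1) time.
    void update(int i, long long x) {
        block[i / k] += x - value[i];
        value[i] = x;
    }

    // Returns the sum of the elements at positions a..b (inclusive).
    long long sum(int a, int b) const {
        long long s = 0;
        int i = a;
        while (i <= b) {
            if (i % k == 0 && i + k - 1 <= b) {
                s += block[i / k]; // the whole block is inside the range
                i += k;
            } else {
                s += value[i];     // single element near a border
                i++;
            }
        }
        return s;
    }
};
```

+In the example array, after the element at position 5 has been changed from 7 to 5,
+the sum of positions $3 \ldots 13$ is $3+15+20+6+2=46$.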
+
+\section{Combining algorithms}
+
+In this section we discuss two square root algorithms
+that are based on combining two algorithms into one algorithm.
+In both cases, we could use either of the algorithms
+without the other
+and solve the problem in $O(n^2)$ time.
+However, by combining the algorithms, the running
+time is only $O(n \sqrt n)$.
+
+\subsubsection{Case processing}
+
+Suppose that we are given a two-dimensional
+grid that contains $n$ cells.
+Each cell is assigned a letter,
+and our task is to find two cells
+with the same letter whose distance is minimum,
+where the distance between cells
+$(x_1,y_1)$ and $(x_2,y_2)$ is $|x_1-x_2|+|y_1-y_2|$.
+For example, consider the following grid:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\node at (0.5,0.5) {A};
+\node at (0.5,1.5) {B};
+\node at (0.5,2.5) {C};
+\node at (0.5,3.5) {A};
+\node at (1.5,0.5) {C};
+\node at (1.5,1.5) {D};
+\node at (1.5,2.5) {E};
+\node at (1.5,3.5) {F};
+\node at (2.5,0.5) {B};
+\node at (2.5,1.5) {A};
+\node at (2.5,2.5) {G};
+\node at (2.5,3.5) {B};
+\node at (3.5,0.5) {D};
+\node at (3.5,1.5) {F};
+\node at (3.5,2.5) {E};
+\node at (3.5,3.5) {A};
+\draw (0,0) grid (4,4);
+\end{tikzpicture}
+\end{center}
+In this case, the minimum distance is 2 between the two 'E' letters.
+
+We can solve the problem by considering each letter separately.
+Using this approach, the new problem is to calculate
+the minimum distance
+between two cells with a \emph{fixed} letter $c$.
+We focus on two algorithms for this:
+
+\emph{Algorithm 1:} Go through all pairs of cells with letter $c$,
+and calculate the minimum distance between such cells.
+This will take $O(k^2)$ time where $k$ is the number of cells with letter $c$.
+
+\emph{Algorithm 2:} Perform a breadth-first search that simultaneously
+starts at each cell with letter $c$. The minimum distance between
+two cells with letter $c$ will be calculated in $O(n)$ time.
+
+One way to solve the problem is to choose either of the
+algorithms and use it for all letters.
+If we use Algorithm 1, the running time is $O(n^2)$,
+because all cells may contain the same letter,
+and in this case $k=n$.
+Similarly, if we use Algorithm 2, the running time is $O(n^2)$,
+because all cells may have different letters,
+and in this case $n$ searches are needed.
+
+However, we can \emph{combine} the two algorithms and
+use different algorithms for different letters
+depending on how many times each letter appears in the grid.
+Assume that a letter $c$ appears $k$ times.
+If $k \le \sqrt n$, we use Algorithm 1, and if $k > \sqrt n$,
+we use Algorithm 2.
+It turns out that by doing this, the total running time
+of the algorithm is only $O(n \sqrt n)$.
+
+First, suppose that we use Algorithm 1 for a letter $c$.
+Since $c$ appears at most $\sqrt n$ times in the grid,
+we compare each cell with letter $c$ $O(\sqrt n)$ times
+with other cells.
+Thus, the time used for processing all such cells is $O(n \sqrt n)$.
+Then, suppose that we use Algorithm 2 for a letter $c$.
+There are at most $\sqrt n$ such letters,
+so processing those letters also takes $O(n \sqrt n)$ time.
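+
+Both bounds can be written out explicitly.
+Let $k_c$ denote the number of cells with letter $c$
+(a symbol introduced here only for this calculation), so that $\sum_c k_c = n$.
+For the letters processed by Algorithm 1 we have $k_c \le \sqrt n$, and thus
+\[\sum_{c:\,k_c \le \sqrt n} k_c^2 \;\le\; \sqrt n \sum_{c} k_c \;=\; n \sqrt n.\]
+For the letters processed by Algorithm 2 we have $k_c > \sqrt n$,
+so if there are $t$ such letters, then
+\[t \sqrt n \;<\; \sum_{c:\,k_c > \sqrt n} k_c \;\le\; n,\]
+which gives $t < \sqrt n$ searches, each of which takes $O(n)$ time.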
+
+\subsubsection{Batch processing}
+
+Our next problem also deals with
+a two-dimensional grid that contains $n$ cells.
+Initially, each cell except one is white.
+We perform $n-1$ operations, each of which first
+calculates the minimum distance from a given white cell
+to a black cell, and then paints the white cell black.
+
+For example, consider the following operation:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\fill[color=black] (1,1) rectangle (2,2);
+\fill[color=black] (3,1) rectangle (4,2);
+\fill[color=black] (0,3) rectangle (1,4);
+\node at (2.5,3.5) {*};
+\draw (0,0) grid (4,4);
+\end{tikzpicture}
+\end{center}
+
+First, we calculate the minimum distance
+from the white cell marked with * to a black cell.
+The minimum distance is 2, because we can move
+two steps left to a black cell.
+Then, we paint the white cell black:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\fill[color=black] (1,1) rectangle (2,2);
+\fill[color=black] (3,1) rectangle (4,2);
+\fill[color=black] (0,3) rectangle (1,4);
+\fill[color=black] (2,3) rectangle (3,4);
+\draw (0,0) grid (4,4);
+\end{tikzpicture}
+\end{center}
+
+Consider the following two algorithms:
+
+\emph{Algorithm 1:} Use breadth-first search
+to calculate
+for each white cell the distance to the nearest black cell.
+This takes $O(n)$ time, and after the search,
+we can find the minimum distance from any white cell
+to a black cell in $O(1)$ time.
+
+\emph{Algorithm 2:} Maintain a list of cells that have been
+painted black, go through this list at each operation
+and then add a new cell to the list.
+An operation takes $O(k)$ time where $k$ is the length of the list.
+
+We combine the above algorithms by
+dividing the operations into
+$O(\sqrt n)$ \emph{batches}, each of which consists
+of $O(\sqrt n)$ operations.
+At the beginning of each batch,
+we perform Algorithm 1.
+Then, we use Algorithm 2 to process the operations
+in the batch.
+We clear the list of Algorithm 2 between
+the batches.
+At each operation,
+the minimum distance to a black cell
+is either the distance calculated by Algorithm 1
+or the distance calculated by Algorithm 2.
+
+The resulting algorithm works in
+$O(n \sqrt n)$ time.
+First, Algorithm 1 is performed $O(\sqrt n)$ times,
+and each search works in $O(n)$ time.
+Second, when using Algorithm 2 in a batch,
+the list contains $O(\sqrt n)$ cells
+(because we clear the list between the batches)
+and each operation takes $O(\sqrt n)$ time.
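+
+The combination can be sketched in code as follows.
+This is an illustrative version, assuming that cells are given as
+(row, column) pairs, that the grid has no obstacles and that the batch size
+is the square root of the number of operations;
+the names \texttt{nearestBlack} and \texttt{paintCells} are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdlib>
#include <queue>
#include <utility>
#include <vector>
using namespace std;

typedef pair<int,int> Cell; // (row, column)

// Algorithm 1: multi-source breadth-first search from all black cells.
vector<vector<int>> nearestBlack(const vector<vector<bool>>& black) {
    int h = black.size(), w = black[0].size();
    vector<vector<int>> dist(h, vector<int>(w, -1));
    queue<Cell> q;
    for (int r = 0; r < h; r++)
        for (int c = 0; c < w; c++)
            if (black[r][c]) { dist[r][c] = 0; q.push({r, c}); }
    const int dr[] = {1, -1, 0, 0}, dc[] = {0, 0, 1, -1};
    while (!q.empty()) {
        Cell cur = q.front(); q.pop();
        for (int d = 0; d < 4; d++) {
            int r = cur.first + dr[d], c = cur.second + dc[d];
            if (r < 0 || r >= h || c < 0 || c >= w || dist[r][c] != -1) continue;
            dist[r][c] = dist[cur.first][cur.second] + 1;
            q.push({r, c});
        }
    }
    return dist;
}

// Processes the operations in order and returns the answer to each of them.
vector<int> paintCells(int h, int w, Cell firstBlack, const vector<Cell>& ops) {
    vector<vector<bool>> black(h, vector<bool>(w, false));
    black[firstBlack.first][firstBlack.second] = true;
    int m = ops.size();
    int batch = max(1, (int)ceil(sqrt((double)m)));
    vector<vector<int>> dist;
    vector<Cell> recent; // cells painted black during the current batch
    vector<int> answers;
    for (int i = 0; i < m; i++) {
        if (i % batch == 0) {
            // a new batch begins: recompute distances and clear the list
            dist = nearestBlack(black);
            recent.clear();
        }
        Cell p = ops[i];
        int best = dist[p.first][p.second];
        for (Cell b : recent) // Algorithm 2: scan the list of recent cells
            best = min(best, abs(p.first - b.first) + abs(p.second - b.second));
        answers.push_back(best);
        recent.push_back(p);
        black[p.first][p.second] = true;
    }
    return answers;
}
```

+Each answer is the smaller of the distance found by the search at the
+beginning of the batch and the distance to a cell painted during the batch.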
+
+\section{Integer partitions}
+
+Some square root algorithms are based on
+the following observation:
+if a positive integer $n$ is represented as
+a sum of positive integers,
+such a sum always contains at most
+$O(\sqrt n)$ \emph{distinct} numbers.
+The reason for this is that to construct
+a sum that contains a maximum number of distinct
+numbers, we should choose \emph{small} numbers.
+If we choose the numbers $1,2,\ldots,k$,
+the resulting sum is
+\[\frac{k(k+1)}{2}.\]
+Thus, the maximum number of distinct numbers is $k = O(\sqrt n)$.
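+
+To make the bound explicit, note that
+a sum of $k$ distinct positive integers is at least $1+2+\cdots+k$, so
+\[\frac{k(k+1)}{2} \le n \quad \Rightarrow \quad k \le \frac{\sqrt{8n+1}-1}{2} < \sqrt{2n}.\]
+For example, when $n=10$, the bound gives $k \le 4$,
+which is attained by the sum $1+2+3+4=10$.
+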
+Next we will discuss two problems that can be solved
+efficiently using this observation.
+
+\subsubsection{Knapsack}
+
+Suppose that we are given a list of integer weights
+whose sum is $n$.
+Our task is to find out all sums that can be formed using
+a subset of the weights. For example, if the weights are
+$\{1,3,3\}$, the possible sums are as follows:
+
+\begin{itemize}[noitemsep]
+\item $0$ (empty set)
+\item $1$
+\item $3$
+\item $1+3=4$
+\item $3+3=6$
+\item $1+3+3=7$
+\end{itemize}
+
+Using the standard knapsack approach (see Chapter 7.4),
+the problem can be solved as follows:
+we define a function $\texttt{possible}(x,k)$ whose value is 1
+if the sum $x$ can be formed using the first $k$ weights,
+and 0 otherwise.
+Since the sum of the weights is $n$,
+there are at most $n$ weights and
+all values of the function can be calculated
+in $O(n^2)$ time using dynamic programming.
+
+However, we can make the algorithm more efficient
+by using the fact that there are at most $O(\sqrt n)$
+\emph{distinct} weights.
+Thus, we can process the weights in groups
+that consist of equal weights.
+We can process each group
+in $O(n)$ time, which yields an $O(n \sqrt n)$ time algorithm.
+
+The idea is to use an array that records the sums of weights
+that can be formed using the groups processed so far.
+The array contains $n+1$ elements, indexed $0,1,\ldots,n$: element $k$ is 1 if the sum
+$k$ can be formed and 0 otherwise.
+To process a group of weights, we scan the array
+from left to right and record the new sums of weights that
+can be formed using this group and the previous groups.
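+
+One possible way to process a group in $O(n)$ time is sketched below.
+The sketch assumes that all weights are positive, and the helper array
+\texttt{used} (a name chosen here) stores for each sum the smallest number
+of copies of the current weight that are needed to form it.

```cpp
#include <map>
#include <vector>
using namespace std;

// Returns possible[0..n], where n is the sum of the weights and
// possible[x] is true if some subset of the weights has the sum x.
vector<bool> subsetSums(const vector<int>& weights) {
    int n = 0;
    map<int,int> group; // weight -> number of copies
    for (int w : weights) { n += w; group[w]++; }

    vector<bool> possible(n + 1, false);
    possible[0] = true; // the empty set
    for (const auto& g : group) {
        int w = g.first, copies = g.second;
        // used[x] = smallest number of copies of w needed to form x
        // in addition to the previous groups, or -1 if x cannot be formed
        vector<int> used(n + 1, -1);
        for (int x = 0; x <= n; x++) {
            if (possible[x]) {
                used[x] = 0;
            } else if (x >= w && used[x - w] != -1 && used[x - w] < copies) {
                used[x] = used[x - w] + 1;
                possible[x] = true;
            }
        }
    }
    return possible;
}
```

+For the weights $\{1,3,3\}$, the result marks exactly the sums
+$0$, $1$, $3$, $4$, $6$ and $7$ as possible.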
+
+\subsubsection{String construction}
+
+Given a string \texttt{s} of length $n$
+and a set of strings $D$ whose total length is $m$,
+consider the problem of counting the number of ways
+\texttt{s} can be formed as a concatenation of strings in $D$.
+For example,
+if $\texttt{s}=\texttt{ABAB}$ and
+$D=\{\texttt{A},\texttt{B},\texttt{AB}\}$,
+there are 4 ways:
+
+\begin{itemize}[noitemsep]
+\item $\texttt{A}+\texttt{B}+\texttt{A}+\texttt{B}$
+\item $\texttt{AB}+\texttt{A}+\texttt{B}$
+\item $\texttt{A}+\texttt{B}+\texttt{AB}$
+\item $\texttt{AB}+\texttt{AB}$
+\end{itemize}
+
+We can solve the problem using dynamic programming:
+Let $\texttt{count}(k)$ denote the number of ways to construct the prefix
+$\texttt{s}[0 \ldots k]$ using the strings in $D$.
+Now $\texttt{count}(n-1)$ gives the answer to the problem,
+and we can solve the problem in $O(n^2)$ time
+using a trie structure.
+
+However, we can solve the problem more efficiently
+by using string hashing and the fact that there
+are at most $O(\sqrt m)$ distinct string lengths in $D$.
+First, we construct a set $H$ that contains all
+hash values of the strings in $D$.
+Then, when calculating a value of $\texttt{count}(k)$,
+we go through all values of $p$
+such that there is a string of length $p$ in $D$,
+calculate the hash value of $\texttt{s}[k-p+1 \ldots k]$
+and check if it belongs to $H$.
+Since there are at most $O(\sqrt m)$ distinct string lengths,
+this results in an algorithm whose running time is $O(n \sqrt m)$.
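+
+The hashing approach can be sketched as follows.
+Several details are assumptions made for this sketch:
+polynomial hashing modulo $2^{64}$ with an arbitrary constant \texttt{BASE},
+nonempty strings in $D$, a result that fits in \texttt{long long},
+and an array \texttt{ways} indexed by prefix \emph{length} instead of last position.
+Hash collisions are possible in principle, and the \texttt{set} lookups
+add a logarithmic factor to the running time.

```cpp
#include <set>
#include <string>
#include <vector>
using namespace std;

typedef unsigned long long H;
const H BASE = 911382323ULL; // arbitrary odd multiplier (an assumption)

// Number of ways to form s as a concatenation of strings in dict.
// Hash values are computed modulo 2^64 (unsigned overflow).
long long countWays(const string& s, const vector<string>& dict) {
    int n = s.size();
    // prefix[i] = hash of the first i characters of s, power[i] = BASE^i
    vector<H> prefix(n + 1, 0), power(n + 1, 1);
    for (int i = 0; i < n; i++) {
        prefix[i + 1] = prefix[i] * BASE + (unsigned char)s[i] + 1;
        power[i + 1] = power[i] * BASE;
    }
    set<H> hashes;    // hash values of the strings in dict
    set<int> lengths; // distinct string lengths in dict
    for (const string& t : dict) {
        if (t.empty()) continue;
        H h = 0;
        for (char ch : t) h = h * BASE + (unsigned char)ch + 1;
        hashes.insert(h);
        lengths.insert(t.size());
    }
    // ways[k] = number of ways to form the prefix of length k
    vector<long long> ways(n + 1, 0);
    ways[0] = 1;
    for (int k = 1; k <= n; k++) {
        for (int p : lengths) {
            if (p > k) break; // lengths are visited in increasing order
            // hash of the substring of length p that ends at position k-1
            H h = prefix[k] - prefix[k - p] * power[p];
            if (hashes.count(h)) ways[k] += ways[k - p];
        }
    }
    return ways[n];
}
```

+For $\texttt{s}=\texttt{ABAB}$ and $D=\{\texttt{A},\texttt{B},\texttt{AB}\}$,
+the function returns 4.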
+
+\section{Mo's algorithm}
+
+\index{Mo's algorithm}
+
+\key{Mo's algorithm}\footnote{According to \cite{cod15}, this algorithm
+is named after Mo Tao, a Chinese competitive programmer, but
+the technique has appeared earlier in the literature \cite{ken06}.}
+can be used in many problems
+that require processing range queries in 
+a \emph{static} array, i.e., the array values
+do not change between the queries.
+In each query, we are given a range $[a,b]$,
+and we should calculate a value based on the
+array elements between positions $a$ and $b$.
+Since the array is static,
+the queries can be processed in any order,
+and Mo's algorithm
+processes the queries in a special order which guarantees
+that the algorithm works efficiently.
+
+Mo's algorithm maintains an \emph{active range}
+of the array, and the answer to a query
+concerning the active range is known at each moment.
+The algorithm processes the queries one by one,
+and always moves the endpoints of the
+active range by inserting and removing elements.
+The time complexity of the algorithm is
+$O(n \sqrt n f(n))$ where the array contains
+$n$ elements, there are $n$ queries
+and each insertion and removal of an element
+takes $O(f(n))$ time.
+
+The trick in Mo's algorithm is the order
+in which the queries are processed:
+The array is divided into blocks of $k=O(\sqrt n)$
+elements, and a query $[a_1,b_1]$
+is processed before a query $[a_2,b_2]$
+if either 
+\begin{itemize}
+\item $\lfloor a_1/k \rfloor < \lfloor a_2/k \rfloor$ or
+\item $\lfloor a_1/k \rfloor = \lfloor a_2/k \rfloor$ and $b_1 < b_2$.
+\end{itemize}
+
+Thus, all queries whose left endpoints are
+in a certain block are processed one after another
+sorted according to their right endpoints.
+Using this order, the algorithm
+only performs $O(n \sqrt n)$ operations,
+because the left endpoint moves
+$O(n)$ times $O(\sqrt n)$ steps,
+and the right endpoint moves
+$O(\sqrt n)$ times $O(n)$ steps. Thus, both
+endpoints move a total of $O(n \sqrt n)$ steps during the algorithm.
+
+\subsubsection*{Example}
+
+As an example, consider a problem
+where we are given a set of queries,
+each of them corresponding to a range in an array,
+and our task is to calculate for each query
+the number of \emph{distinct} elements in the range.
+
+In Mo's algorithm, the queries are always sorted
+in the same way, but it depends on the problem
+how the answer to the query is maintained.
+In this problem, we can maintain an array 
+\texttt{count} where $\texttt{count}[x]$
+indicates the number of times an element $x$
+occurs in the active range.
+
+When we move from one query to another query,
+the active range changes.
+For example, if the current range is
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\fill[color=lightgray] (1,0) rectangle (5,1);
+\draw (0,0) grid (9,1);
+\node at (0.5, 0.5) {4};
+\node at (1.5, 0.5) {2};
+\node at (2.5, 0.5) {5};
+\node at (3.5, 0.5) {4};
+\node at (4.5, 0.5) {2};
+\node at (5.5, 0.5) {4};
+\node at (6.5, 0.5) {3};
+\node at (7.5, 0.5) {3};
+\node at (8.5, 0.5) {4};
+\end{tikzpicture}
+\end{center}
+and the next range is
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\fill[color=lightgray] (2,0) rectangle (7,1);
+\draw (0,0) grid (9,1);
+\node at (0.5, 0.5) {4};
+\node at (1.5, 0.5) {2};
+\node at (2.5, 0.5) {5};
+\node at (3.5, 0.5) {4};
+\node at (4.5, 0.5) {2};
+\node at (5.5, 0.5) {4};
+\node at (6.5, 0.5) {3};
+\node at (7.5, 0.5) {3};
+\node at (8.5, 0.5) {4};
+\end{tikzpicture}
+\end{center}
+there will be three steps:
+the left endpoint moves one step to the right,
+and the right endpoint moves two steps to the right.
+
+After each step, the array \texttt{count}
+needs to be updated.
+After adding an element $x$,
+we increase the value of 
+$\texttt{count}[x]$ by 1,
+and if $\texttt{count}[x]=1$ after this,
+we also increase the answer to the query by 1.
+Similarly, after removing an element $x$,
+we decrease the value of 
+$\texttt{count}[x]$ by 1,
+and if $\texttt{count}[x]=0$ after this,
+we also decrease the answer to the query by 1.
+
+In this problem, the time needed to perform
+each step is $O(1)$, so the total time complexity
+of the algorithm is $O(n \sqrt n)$.
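+
+The whole algorithm for this problem can be sketched as follows.
+The sketch assumes 0-indexed inclusive ranges, array values between
+$0$ and \texttt{maxValue}, and a block size of $\lfloor \sqrt n \rfloor$;
+the names \texttt{Query} and \texttt{distinctInRanges} are chosen for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>
using namespace std;

struct Query { int a, b, index; };

// For each query [a,b], returns the number of distinct values in the range.
// The answers are returned in the original order of the queries.
vector<int> distinctInRanges(const vector<int>& arr, vector<Query> queries, int maxValue) {
    int n = arr.size();
    int k = max(1, (int)sqrt((double)n)); // block size
    // sort by the block of the left endpoint, then by the right endpoint
    sort(queries.begin(), queries.end(), [k](const Query& p, const Query& q) {
        if (p.a / k != q.a / k) return p.a / k < q.a / k;
        return p.b < q.b;
    });
    vector<int> count(maxValue + 1, 0), result(queries.size());
    int distinct = 0;
    auto insertAt = [&](int i) { if (++count[arr[i]] == 1) distinct++; };
    auto removeAt = [&](int i) { if (--count[arr[i]] == 0) distinct--; };
    int left = 0, right = -1; // the active range is empty at first
    for (const Query& q : queries) {
        // extend the active range first, then shrink it
        while (left > q.a) insertAt(--left);
        while (right < q.b) insertAt(++right);
        while (left < q.a) removeAt(left++);
        while (right > q.b) removeAt(right--);
        result[q.index] = distinct;
    }
    return result;
}
```

+For the array in the pictures above, the range of positions $1 \ldots 4$
+contains 3 distinct elements and the range $2 \ldots 6$ contains 4.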
--- a/chapter28.tex
+++ b/chapter28.tex
--- a/chapter29.tex
+++ b/chapter29.tex
@ -0,0 +1,782 @@
+\chapter{Geometry}
+
+\index{geometry}
+
+In geometric problems, it is often challenging
+to find a way to approach the problem so that
+the solution to the problem can be conveniently implemented
+and the number of special cases is small.
+
+As an example, consider a problem where
+we are given the vertices of a quadrilateral
+(a polygon that has four vertices),
+and our task is to calculate its area.
+For example, a possible input for the problem is as follows:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.45]
+
+\draw[fill] (6,2) circle [radius=0.1];
+\draw[fill] (5,6) circle [radius=0.1];
+\draw[fill] (2,5) circle [radius=0.1];
+\draw[fill] (1,1) circle [radius=0.1];
+\draw[thick] (6,2) -- (5,6) -- (2,5) -- (1,1) -- (6,2);
+\end{tikzpicture}
+\end{center}
+One way to approach the problem is to divide
+the quadrilateral into two triangles by a straight
+line between two opposite vertices:
+\begin{center}
+\begin{tikzpicture}[scale=0.45]
+
+\draw[fill] (6,2) circle [radius=0.1];
+\draw[fill] (5,6) circle [radius=0.1];
+\draw[fill] (2,5) circle [radius=0.1];
+\draw[fill] (1,1) circle [radius=0.1];
+
+\draw[thick] (6,2) -- (5,6) -- (2,5) -- (1,1) -- (6,2);
+\draw[dashed,thick] (2,5) -- (6,2);
+\end{tikzpicture}
+\end{center}
+After this, it suffices to sum the areas
+of the triangles.
+The area of a triangle can be calculated,
+for example, using \key{Heron's formula}
+%\footnote{Heron of Alexandria (c. 10--70) was a Greek mathematician.}
+\[ \sqrt{s (s-a) (s-b) (s-c)},\]
+where $a$, $b$ and $c$ are the lengths
+of the triangle's sides and
+$s=(a+b+c)/2$.
+\index{Heron's formula}
+
+This is a possible way to solve the problem,
+but there is one pitfall:
+how to divide the quadrilateral into triangles?
+It turns out that sometimes we cannot just pick
+two arbitrary opposite vertices.
+For example, in the following situation,
+the division line is \emph{outside} the quadrilateral:
+\begin{center}
+\begin{tikzpicture}[scale=0.45]
+
+\draw[fill] (6,2) circle [radius=0.1];
+\draw[fill] (3,2) circle [radius=0.1];
+\draw[fill] (2,5) circle [radius=0.1];
+\draw[fill] (1,1) circle [radius=0.1];
+\draw[thick] (6,2) -- (3,2) -- (2,5) -- (1,1) -- (6,2);
+
+\draw[dashed,thick] (2,5) -- (6,2);
+\end{tikzpicture}
+\end{center}
+However, another way to draw the line works:
+\begin{center}
+\begin{tikzpicture}[scale=0.45]
+
+\draw[fill] (6,2) circle [radius=0.1];
+\draw[fill] (3,2) circle [radius=0.1];
+\draw[fill] (2,5) circle [radius=0.1];
+\draw[fill] (1,1) circle [radius=0.1];
+\draw[thick] (6,2) -- (3,2) -- (2,5) -- (1,1) -- (6,2);
+
+\draw[dashed,thick] (3,2) -- (1,1);
+\end{tikzpicture}
+\end{center}
+It is clear for a human which of the lines is the correct
+choice, but the situation is difficult for a computer.
+                           
+However, it turns out that we can solve the problem using
+another method that is more convenient for a programmer.
+Namely, there is a general formula
+\[\frac{1}{2} |x_1y_2-x_2y_1+x_2y_3-x_3y_2+x_3y_4-x_4y_3+x_4y_1-x_1y_4|,\]
+that calculates the area of a quadrilateral
+whose vertices, listed in order around the boundary, are
+$(x_1,y_1)$,
+$(x_2,y_2)$,
+$(x_3,y_3)$ and
+$(x_4,y_4)$.
+This formula is easy to implement, there are no special
+cases, and we can even generalize the formula
+to \emph{all} polygons.
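+
+A sketch of the computation for a general polygon is shown below.
+The sum of the cross terms equals twice the signed area, so the function
+(named \texttt{doubledArea} here for illustration) returns twice the area
+as an integer to keep the calculation exact; the area is half of the result.
+The vertices are assumed to be listed in order around the boundary.

```cpp
#include <cstdlib>
#include <utility>
#include <vector>
using namespace std;

// Twice the area of a polygon whose vertices are given in order
// around the boundary (clockwise or counterclockwise).
long long doubledArea(const vector<pair<long long,long long>>& p) {
    long long sum = 0;
    int n = p.size();
    for (int i = 0; i < n; i++) {
        int j = (i + 1) % n; // the next vertex, wrapping around at the end
        sum += p[i].first * p[j].second - p[j].first * p[i].second;
    }
    return llabs(sum);
}
```

+For the first quadrilateral above, with vertices $(6,2)$, $(5,6)$, $(2,5)$ and $(1,1)$,
+the result is 32, so the area is 16.
+For the second one, with vertices $(6,2)$, $(3,2)$, $(2,5)$ and $(1,1)$,
+the result is 10, so the area is 5.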
+
+\section{Complex numbers}
+
+\index{complex number}
+\index{point}
+\index{vector}
+
+A \key{complex number} is a number of the form $x+y i$,
+where $i = \sqrt{-1}$ is the \key{imaginary unit}.
+A geometric interpretation of a complex number is
+that it represents a two-dimensional point $(x,y)$
+or a vector from the origin to a point $(x,y)$.
+
+For example, $4+2i$ corresponds to the
+following point and vector:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.45]
+
+\draw[->,thick] (-5,0)--(5,0);
+\draw[->,thick] (0,-5)--(0,5);
+
+\draw[fill] (4,2) circle [radius=0.1];
+\draw[->,thick] (0,0)--(4-0.1,2-0.1);
+
+\node at (4,2.8) {$(4,2)$};
+\end{tikzpicture}
+\end{center}
+
+\index{complex@\texttt{complex}}
+
+The C++ complex number class \texttt{complex} is
+useful when solving geometric problems.
+Using the class we can represent points and vectors
+as complex numbers, and the class contains tools
+that are useful in geometry.
+
+In the following code, \texttt{C} is the type of
+a coordinate and \texttt{P} is the type of a point or a vector.
+In addition, the code defines macros \texttt{X} and \texttt{Y}
+that can be used to refer to x and y coordinates.
+
+\begin{lstlisting}
+#include <complex>
+using namespace std;
+
+typedef long long C;   // coordinate type
+typedef complex<C> P;  // point or vector
+#define X real()
+#define Y imag()
+\end{lstlisting}
+
+For example, the following code defines a point $p=(4,2)$
+and prints its x and y coordinates:
+
+\begin{lstlisting}
+P p = {4,2};
+cout << p.X << " " << p.Y << "\n"; // 4 2
+\end{lstlisting}
+
+The following code defines vectors $v=(3,1)$ and $u=(2,2)$,
+and after that calculates the sum $s=v+u$.
+
+\begin{lstlisting}
+P v = {3,1};
+P u = {2,2};
+P s = v+u;
+cout << s.X << " " << s.Y << "\n"; // 5 3
+\end{lstlisting}
+
+In practice,
+an appropriate coordinate type is usually
+\texttt{long long} (integer) or \texttt{long double}
+(real number).
+It is a good idea to use integers whenever possible,
+because calculations with integers are exact.
+If real numbers are needed,
+precision errors should be taken into account
+when comparing numbers.
+A safe way to check if real numbers $a$ and $b$ are equal
+is to compare them using $|a-b|<\epsilon$,
+where $\epsilon$ is a small number (for example, $\epsilon=10^{-9}$).
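+
+The comparison can be wrapped in a small helper function.
+The following is a minimal sketch; the names \texttt{EPS} and
+\texttt{almost\_equal} are chosen here for illustration, and the
+tolerance $10^{-9}$ is only the example value from above, so it may
+need to be adjusted to the magnitude of the numbers in a problem.
+
```cpp
#include <cmath>

// Illustrative tolerance; 1e-9 is the example value from the text.
const long double EPS = 1e-9L;

// Returns true if a and b differ by less than EPS.
// (Hypothetical helper name, not part of any library.)
bool almost_equal(long double a, long double b) {
    return std::fabs(a - b) < EPS;
}
```
+
+For example, \texttt{almost\_equal(0.1L + 0.2L, 0.3L)} holds,
+even though the two values need not be bitwise identical.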
+
+\subsubsection*{Functions}
+
+In the following examples, the coordinate type is
+\texttt{long double}.
+
+The function $\texttt{abs}(v)$ calculates the length
+$|v|$ of a vector $v=(x,y)$
+using the formula $\sqrt{x^2+y^2}$.
+The function can also be used for
+calculating the distance between points
+$(x_1,y_1)$ and $(x_2,y_2)$,
+because that distance equals the length
+of the vector $(x_2-x_1,y_2-y_1)$.
+
+The following code calculates the distance
+between points $(4,2)$ and $(3,-1)$:
+\begin{lstlisting}
+P a = {4,2};
+P b = {3,-1};
+cout << abs(b-a) << "\n"; // 3.16228
+\end{lstlisting}
+
+The function $\texttt{arg}(v)$ calculates the
+angle of a vector $v=(x,y)$ with respect to the x axis.
+The function gives the angle in radians,
+where $r$ radians equals $180 r/\pi$ degrees.
+The angle of a vector that points to the right is 0,
+and angles decrease clockwise and increase
+counterclockwise.
+
+The function $\texttt{polar}(s,a)$ constructs a vector
+whose length is $s$ and whose angle is $a$.
+A vector can be rotated by an angle $a$
+by multiplying it by a vector with length 1 and angle $a$.
+
+The following code calculates the angle of
+the vector $(4,2)$, rotates it $1/2$ radians
+counterclockwise, and then calculates the angle again:
+
+\begin{lstlisting}
+P v = {4,2};
+cout << arg(v) << "\n"; // 0.463648
+v *= polar(1.0,0.5);
+cout << arg(v) << "\n"; // 0.963648
+\end{lstlisting}
+
+\section{Points and lines}
+
+\index{cross product}
+
+The \key{cross product} $a \times b$ of vectors
+$a=(x_1,y_1)$ and $b=(x_2,y_2)$ is calculated
+using the formula $x_1 y_2 - x_2 y_1$.
+The cross product tells us whether $b$
+turns left (positive value), does not turn (zero)
+or turns right (negative value)
+when it is placed directly after $a$.
+
+The following picture illustrates the above cases:
+\begin{center}
+\begin{tikzpicture}[scale=0.45]
+
+\draw[->,thick] (0,0)--(4,2);
+\draw[->,thick] (4,2)--(4+1,2+2);
+
+\node at (2.5,0.5) {$a$};
+\node at (5,2.5) {$b$};
+
+\node at (3,-2) {$a \times b = 6$};
+
+\draw[->,thick] (8+0,0)--(8+4,2);
+\draw[->,thick] (8+4,2)--(8+4+2,2+1);
+
+\node at (8+2.5,0.5) {$a$};
+\node at (8+5,1.5) {$b$};
+
+\node at (8+3,-2) {$a \times b = 0$};
+
+\draw[->,thick] (16+0,0)--(16+4,2);
+\draw[->,thick] (16+4,2)--(16+4+2,2-1);
+
+\node at (16+2.5,0.5) {$a$};
+\node at (16+5,2.5) {$b$};
+
+\node at (16+3,-2) {$a \times b = -8$};
+\end{tikzpicture}
+\end{center}
+
+\noindent
+For example, in the first case
+$a=(4,2)$ and $b=(1,2)$.
+The following code calculates the cross product
+using the class \texttt{complex}:
+
+\begin{lstlisting}
+P a = {4,2};
+P b = {1,2};
+C p = (conj(a)*b).Y; // 6
+\end{lstlisting}
+
+The above code works, because
+the function \texttt{conj} negates the y coordinate
+of a vector,
+and when the vectors $(x_1,-y_1)$ and $(x_2,y_2)$
+are multiplied together, the y coordinate
+of the result is $x_1 y_2 - x_2 y_1$.
+
+\subsubsection{Point location}
+
+Cross products can be used to test
+whether a point is located on the left or right
+side of a line.
+Assume that the line goes through points
+$s_1$ and $s_2$, we are looking from $s_1$
+to $s_2$ and the point is $p$.
+
+For example, in the following picture,
+$p$ is on the left side of the line:
+\begin{center}
+\begin{tikzpicture}[scale=0.45]
+\draw[dashed,thick,->] (0,-3)--(12,6);
+\draw[fill] (4,0) circle [radius=0.1];
+\draw[fill] (8,3) circle [radius=0.1];
+\draw[fill] (5,3) circle [radius=0.1];
+\node at (4,-1) {$s_1$};
+\node at (8,2) {$s_2$};
+\node at (5,4) {$p$};
+\end{tikzpicture}
+\end{center}
+
+The cross product $(p-s_1) \times (p-s_2)$
+tells us the location of the point $p$.
+If the cross product is positive,
+$p$ is located on the left side,
+and if the cross product is negative,
+$p$ is located on the right side.
+Finally, if the cross product is zero,
+points $s_1$, $s_2$ and $p$ are on the same line.
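+
+The test can be sketched as follows.
+The helper names \texttt{cross} and \texttt{side} are illustrative,
+and the cross product is computed directly from the formula
+$x_1 y_2 - x_2 y_1$ instead of using \texttt{conj}.
+As elsewhere in this chapter, the sketch instantiates \texttt{complex}
+with \texttt{long long}; common compilers accept this, although the
+C++ standard only specifies \texttt{complex} for floating-point types.
+
```cpp
#include <complex>

typedef long long C;
typedef std::complex<C> P;

// Cross product a x b = x1*y2 - x2*y1, as defined in the text.
C cross(P a, P b) {
    return a.real() * b.imag() - b.real() * a.imag();
}

// Hypothetical helper: returns 1 if p is on the left side of the line
// when looking from s1 to s2, -1 if it is on the right side, and 0 if
// the three points are on the same line.
int side(P s1, P s2, P p) {
    C c = cross(p - s1, p - s2);
    return (c > 0) - (c < 0);
}
```
+
+For the picture above, \texttt{side(P(4,0), P(8,3), P(5,3))}
+returns 1, because the cross product is 9.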
+
+\subsubsection{Line segment intersection}
+
+\index{line segment intersection}
+
+Next we consider the problem of testing
+whether two line segments
+$ab$ and $cd$ intersect. The possible cases are:
+
+\textit{Case 1:}
+The line segments are on the same line
+and they overlap each other.
+In this case, there is an infinite number of
+intersection points.
+For example, in the following picture,
+all points between $c$ and $b$ are
+intersection points:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\draw (1.5,1.5)--(6,3);
+\draw (0,1)--(4.5,2.5);
+\draw[fill] (0,1) circle [radius=0.05];
+\node at (0,0.5) {$a$};
+\draw[fill] (1.5,1.5) circle [radius=0.05];
+\node at (6,2.5) {$d$};
+\draw[fill] (4.5,2.5) circle [radius=0.05];
+\node at (1.5,1) {$c$};
+\draw[fill] (6,3) circle [radius=0.05];
+\node at (4.5,2) {$b$};
+\end{tikzpicture}
+\end{center}
+
+In this case, we can use cross products to
+check if all points are on the same line.
+After this, we can sort the points and check
+whether the line segments overlap each other.
+
+\textit{Case 2:}
+The line segments have a common vertex
+that is the only intersection point.
+For example, in the following picture the
+intersection point is $b=c$:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\draw (0,0)--(4,2);
+\draw (4,2)--(6,1);
+\draw[fill] (0,0) circle [radius=0.05];
+\draw[fill] (4,2) circle [radius=0.05];
+\draw[fill] (6,1) circle [radius=0.05];
+
+\node at (0,0.5) {$a$};
+\node at (4,2.5) {$b=c$};
+\node at (6,1.5) {$d$};
+\end{tikzpicture}
+\end{center}
+
+This case is easy to check, because
+there are only four possibilities
+for the intersection point:
+$a=c$, $a=d$, $b=c$ and $b=d$.
+
+\textit{Case 3:}
+There is exactly one intersection point
+that is not a vertex of any line segment.
+In the following picture, the point $p$
+is the intersection point:
+\begin{center}
+\begin{tikzpicture}[scale=0.9]
+\draw (0,1)--(6,3);
+\draw (2,4)--(4,0);
+\draw[fill] (0,1) circle [radius=0.05];
+\node at (0,0.5) {$c$};
+\draw[fill] (6,3) circle [radius=0.05];
+\node at (6,2.5) {$d$};
+\draw[fill] (2,4) circle [radius=0.05];
+\node at (1.5,3.5) {$a$};
+\draw[fill] (4,0) circle [radius=0.05];
+\node at (4,-0.4) {$b$};
+\draw[fill] (3,2) circle [radius=0.05];
+\node at (3,1.5) {$p$};
+\end{tikzpicture}
+\end{center}
+
+In this case, the line segments intersect
+exactly when points $c$ and $d$ are
+on different sides of the line through $a$ and $b$,
+and points $a$ and $b$ are on different
+sides of the line through $c$ and $d$.
+We can use cross products to check this.
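+
+The three cases can be combined into one test, sketched below
+under some assumptions.
+The names \texttt{side}, \texttt{in\_box} and
+\texttt{segments\_intersect} are illustrative.
+Instead of sorting the points in Case 1, the sketch checks whether
+an endpoint that is collinear with the other segment lies inside
+that segment's bounding box; this also reports an intersection when
+an endpoint of one segment touches the interior of the other.
+
```cpp
#include <algorithm>
#include <complex>

typedef long long C;
typedef std::complex<C> P;

C cross(P a, P b) {
    return a.real() * b.imag() - b.real() * a.imag();
}

// 1: p is left of the line s1 -> s2, -1: right, 0: on the line.
int side(P s1, P s2, P p) {
    C c = cross(p - s1, p - s2);
    return (c > 0) - (c < 0);
}

// True if p is inside the bounding box of segment ab.
// Only meaningful when p is already known to be on the line ab.
bool in_box(P a, P b, P p) {
    return std::min(a.real(), b.real()) <= p.real() &&
           p.real() <= std::max(a.real(), b.real()) &&
           std::min(a.imag(), b.imag()) <= p.imag() &&
           p.imag() <= std::max(a.imag(), b.imag());
}

// Hypothetical helper: true if segments ab and cd share a point.
bool segments_intersect(P a, P b, P c, P d) {
    int s1 = side(a, b, c), s2 = side(a, b, d);
    int s3 = side(c, d, a), s4 = side(c, d, b);
    // Case 3: each segment's endpoints are on different sides
    // of the line through the other segment.
    if (s1 * s2 < 0 && s3 * s4 < 0) return true;
    // Cases 1 and 2: some endpoint lies on the other segment.
    if (s1 == 0 && in_box(a, b, c)) return true;
    if (s2 == 0 && in_box(a, b, d)) return true;
    if (s3 == 0 && in_box(c, d, a)) return true;
    if (s4 == 0 && in_box(c, d, b)) return true;
    return false;
}
```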
+
+\subsubsection{Point distance from a line}
+
+Another feature of cross products is that
+the area of a triangle can be calculated
+using the formula
+\[\frac{| (a-c) \times (b-c) |}{2},\]
+where $a$, $b$ and $c$ are the vertices of the triangle.
+Using this fact, we can derive a formula
+for calculating the shortest distance between a point and a line.
+For example, in the following picture $d$ is the
+shortest distance between the point $p$ and the line
+that is defined by the points $s_1$ and $s_2$:
+\begin{center}
+\begin{tikzpicture}[scale=0.75]
+\draw (-2,-1)--(6,3);
+\draw[dashed] (1,4)--(2.40,1.2);
+\node at (0,-0.5) {$s_1$};
+\node at (4,1.5) {$s_2$};
+\node at (0.5,4) {$p$};
+\node at (2,2.7) {$d$};
+\draw[fill] (0,0) circle [radius=0.05];
+\draw[fill] (4,2) circle [radius=0.05];
+\draw[fill] (1,4) circle [radius=0.05];
+\end{tikzpicture}
+\end{center}
+
+The area of the triangle whose vertices are
+$s_1$, $s_2$ and $p$ can be calculated in two ways:
+it is both
+$\frac{1}{2} |s_2-s_1| d$ and
+$\frac{1}{2} |(s_1-p) \times (s_2-p)|$.
+Thus, the shortest distance is
+\[ d = \frac{|(s_1-p) \times (s_2-p)|}{|s_2-s_1|} .\]
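+
+A minimal sketch of the distance calculation follows.
+It uses \texttt{long double} coordinates, the function name
+\texttt{line\_distance} is illustrative, and it assumes that
+$s_1 \neq s_2$.
+The absolute value of the cross product is taken, so that the
+result is nonnegative regardless of the side on which $p$ lies.
+
```cpp
#include <cmath>
#include <complex>

typedef long double C;
typedef std::complex<C> P;

C cross(P a, P b) {
    return a.real() * b.imag() - b.real() * a.imag();
}

// Hypothetical helper: shortest distance between the point p and
// the line through s1 and s2 (assumes s1 != s2).
C line_distance(P s1, P s2, P p) {
    return std::fabs(cross(s1 - p, s2 - p)) / std::abs(s2 - s1);
}
```
+
+For the picture above, where $s_1=(0,0)$, $s_2=(4,2)$ and $p=(1,4)$,
+the cross product is 14 and $|s_2-s_1|=\sqrt{20}$,
+so the distance is $14/\sqrt{20} \approx 3.13$.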
+
+\subsubsection{Point inside a polygon}
+
+Let us now consider the problem of
+testing whether a point is located inside or outside
+a polygon.
+For example, in the following picture point $a$
+is inside the polygon and point $b$ is outside
+the polygon.
+
+\begin{center}
+\begin{tikzpicture}[scale=0.75]
+%\draw (0,0)--(2,-2)--(3,1)--(5,1)--(2,3)--(1,2)--(-1,2)--(1,4)--(-2,4)--(-2,1)--(-3,3)--(-4,0)--(0,0);
+\draw (0,0)--(2,2)--(5,1)--(2,3)--(1,2)--(-1,2)--(1,4)--(-2,4)--(-2,1)--(-3,3)--(-4,0)--(0,0);
+
+\draw[fill] (-3,1) circle [radius=0.05];
+\node at (-3,0.5) {$a$};
+\draw[fill] (1,3) circle [radius=0.05];
+\node at (1,2.5) {$b$};
+\end{tikzpicture}
+\end{center}
+
+A convenient way to solve the problem is to
+send a \emph{ray} from the point in an arbitrary direction
+and count the number of times it touches
+the boundary of the polygon.
+If the number is odd,
+the point is inside the polygon,
+and if the number is even,
+the point is outside the polygon.
+
+\begin{samepage}
+For example, we could send the following rays:
+\begin{center}
+\begin{tikzpicture}[scale=0.75]
+\draw (0,0)--(2,2)--(5,1)--(2,3)--(1,2)--(-1,2)--(1,4)--(-2,4)--(-2,1)--(-3,3)--(-4,0)--(0,0);
+
+\draw[fill] (-3,1) circle [radius=0.05];
+\node at (-3,0.5) {$a$};
+\draw[fill] (1,3) circle [radius=0.05];
+\node at (1,2.5) {$b$};
+
+\draw[dashed,->] (-3,1)--(-6,0);
+\draw[dashed,->] (-3,1)--(0,5);
+
+\draw[dashed,->] (1,3)--(3.5,0);
+\draw[dashed,->] (1,3)--(3,4);
+\end{tikzpicture}
+\end{center}
+\end{samepage}
+
+The rays from $a$ touch the boundary of the polygon
+1 and 3 times,
+so $a$ is inside the polygon.
+Correspondingly, the rays from $b$
+touch the boundary 0 and 2 times,
+so $b$ is outside the polygon.
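+
+One possible implementation is sketched below.
+It always sends the ray horizontally to the right, and it counts an
+edge only if exactly one of its endpoints is strictly above the ray,
+so that a ray passing through a vertex is handled consistently.
+The function name \texttt{inside} is illustrative, the vertices are
+assumed to be listed in boundary order, and the result for points
+that lie exactly on the boundary is not meaningful in this sketch.
+
```cpp
#include <complex>
#include <utility>
#include <vector>

typedef long long C;
typedef std::complex<C> P;

C cross(P a, P b) {
    return a.real() * b.imag() - b.real() * a.imag();
}

// Hypothetical helper: true if q is inside the polygon
// (q is assumed not to lie on the boundary).
bool inside(const std::vector<P>& poly, P q) {
    int n = poly.size();
    bool in = false;
    for (int i = 0; i < n; i++) {
        P a = poly[i], b = poly[(i + 1) % n];
        // The edge matters only if exactly one endpoint is
        // strictly above the horizontal line through q.
        if ((a.imag() > q.imag()) != (b.imag() > q.imag())) {
            // Direct the edge upwards: a is the lower endpoint.
            if (a.imag() > b.imag()) std::swap(a, b);
            // The ray to the right touches the edge exactly when
            // q is on the left side of the upward edge.
            if (cross(b - a, q - a) > 0) in = !in;
        }
    }
    return in;
}
```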
+
+\section{Polygon area}
+
+A general formula for calculating the area
+of a polygon, sometimes called the \key{shoelace formula},
+is as follows: \index{shoelace formula}
+\[\frac{1}{2} |\sum_{i=1}^{n-1} (p_i \times p_{i+1})| =
+\frac{1}{2} |\sum_{i=1}^{n-1} (x_i y_{i+1} - x_{i+1} y_i)|, \]
+where the vertices are
+$p_1=(x_1,y_1)$, $p_2=(x_2,y_2)$, $\ldots$, $p_n=(x_n,y_n)$
+in such an order that
+$p_i$ and $p_{i+1}$ are adjacent vertices on the boundary
+of the polygon,
+and the first and last vertices are the same, i.e., $p_1=p_n$.
+
+For example, the area of the polygon
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\filldraw (4,1.4) circle (2pt);
+\filldraw (7,3.4) circle (2pt);
+\filldraw (5,5.4) circle (2pt);
+\filldraw (2,4.4) circle (2pt);
+\filldraw (4,3.4) circle (2pt);
+\node (1) at (4,1) {(4,1)};
+\node (2) at (7.2,3) {(7,3)};
+\node (3) at (5,5.8) {(5,5)};
+\node (4) at (2,4) {(2,4)};
+\node (5) at (3.5,3) {(4,3)};
+\path[draw] (4,1.4) -- (7,3.4) -- (5,5.4) -- (2,4.4) -- (4,3.4) -- (4,1.4);
+\end{tikzpicture}
+\end{center}
+is
+\[\frac{|(2\cdot5-5\cdot4)+(5\cdot3-7\cdot5)+(7\cdot1-4\cdot3)+(4\cdot3-4\cdot1)+(4\cdot4-2\cdot3)|}{2} = 17/2.\]
+
+The idea of the formula is to go through trapezoids
+that have a side of the polygon as one side
+and whose opposite side lies on the horizontal line $y=0$.
+For example:
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\path[draw,fill=lightgray] (5,5.4) -- (7,3.4) -- (7,0) -- (5,0) -- (5,5.4);
+\filldraw (4,1.4) circle (2pt);
+\filldraw (7,3.4) circle (2pt);
+\filldraw (5,5.4) circle (2pt);
+\filldraw (2,4.4) circle (2pt);
+\filldraw (4,3.4) circle (2pt);
+\node (1) at (4,1) {(4,1)};
+\node (2) at (7.2,3) {(7,3)};
+\node (3) at (5,5.8) {(5,5)};
+\node (4) at (2,4) {(2,4)};
+\node (5) at (3.5,3) {(4,3)};
+\path[draw] (4,1.4) -- (7,3.4) -- (5,5.4) -- (2,4.4) -- (4,3.4) -- (4,1.4);
+\draw (0,0) -- (10,0);
+\end{tikzpicture}
+\end{center}
+The area of such a trapezoid is
+\[(x_{i+1}-x_{i}) \frac{y_i+y_{i+1}}{2},\]
+where the vertices of the polygon are $p_i$ and $p_{i+1}$.
+If $x_{i+1}>x_{i}$, the area is positive,
+and if $x_{i+1}<x_{i}$, the area is negative.
+
+The area of the polygon is the sum of areas of
+all such trapezoids, which yields the formula
+\[|\sum_{i=1}^{n-1} (x_{i+1}-x_{i}) \frac{y_i+y_{i+1}}{2}| =
+\frac{1}{2} |\sum_{i=1}^{n-1} (x_i y_{i+1} - x_{i+1} y_i)|.\]
+
+Note that the absolute value of the sum is taken,
+because the value of the sum may be positive or negative,
+depending on whether we walk clockwise or counterclockwise
+along the boundary of the polygon.
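+
+The formula can be sketched in code as follows.
+The function name \texttt{twice\_area} is illustrative.
+The sketch returns twice the area, so that the result stays an
+integer, and it takes each vertex only once and wraps around with
+a modulo operation instead of repeating the first vertex at the end.
+
```cpp
#include <complex>
#include <cstdlib>
#include <vector>

typedef long long C;
typedef std::complex<C> P;

// Hypothetical helper: twice the area of the polygon whose vertices
// are listed in boundary order (first vertex not repeated).
C twice_area(const std::vector<P>& p) {
    int n = p.size();
    C sum = 0;
    for (int i = 0; i < n; i++) {
        P a = p[i], b = p[(i + 1) % n];
        sum += a.real() * b.imag() - b.real() * a.imag();
    }
    // The sign depends on the direction of the walk.
    return std::llabs(sum);
}
```
+
+For the example polygon above, the function returns 17,
+which corresponds to the area $17/2$.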
+
+\subsubsection{Pick's theorem}
+
+\index{Pick's theorem}
+
+\key{Pick's theorem} provides another way to calculate
+the area of a polygon provided that all vertices 
+of the polygon have integer coordinates.
+According to Pick's theorem, the area of the polygon is
+\[ a + b/2 -1,\]
+where $a$ is the number of integer points inside the polygon
+and $b$ is the number of integer points on the boundary of the polygon.
+
+For example, the area of the polygon
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\filldraw (4,1.4) circle (2pt);
+\filldraw (7,3.4) circle (2pt);
+\filldraw (5,5.4) circle (2pt);
+\filldraw (2,4.4) circle (2pt);
+\filldraw (4,3.4) circle (2pt);
+\node (1) at (4,1) {(4,1)};
+\node (2) at (7.2,3) {(7,3)};
+\node (3) at (5,5.8) {(5,5)};
+\node (4) at (2,4) {(2,4)};
+\node (5) at (3.5,3) {(4,3)};
+\path[draw] (4,1.4) -- (7,3.4) -- (5,5.4) -- (2,4.4) -- (4,3.4) -- (4,1.4);
+
+\filldraw (2,4.4) circle (2pt);
+\filldraw (3,4.4) circle (2pt);
+\filldraw (4,4.4) circle (2pt);
+\filldraw (5,4.4) circle (2pt);
+\filldraw (6,4.4) circle (2pt);
+
+\filldraw (4,3.4) circle (2pt);
+\filldraw (5,3.4) circle (2pt);
+\filldraw (6,3.4) circle (2pt);
+\filldraw (7,3.4) circle (2pt);
+
+\filldraw (4,2.4) circle (2pt);
+\filldraw (5,2.4) circle (2pt);
+\end{tikzpicture}
+\end{center}
+is $6+7/2-1=17/2$.
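+
+Pick's theorem is often used in the other direction: the area is
+calculated with the shoelace formula and the theorem then gives
+the number of interior points.
+The following sketch illustrates this under the assumption that the
+polygon is simple and has integer vertices; all function names are
+illustrative.
+It relies on the fact that a side from $(x_1,y_1)$ to $(x_2,y_2)$
+contains $\gcd(|x_2-x_1|,|y_2-y_1|)$ integer points when one of its
+endpoints is not counted.
+
```cpp
#include <complex>
#include <cstdlib>
#include <vector>

typedef long long C;
typedef std::complex<C> P;

C gcd_ll(C a, C b) {
    return b == 0 ? a : gcd_ll(b, a % b);
}

// Number of integer points on the boundary of the polygon.
C boundary_points(const std::vector<P>& p) {
    int n = p.size();
    C b = 0;
    for (int i = 0; i < n; i++) {
        P d = p[(i + 1) % n] - p[i];
        b += gcd_ll(std::llabs(d.real()), std::llabs(d.imag()));
    }
    return b;
}

// Number of integer points strictly inside the polygon.
// From area = a + b/2 - 1 it follows that a = (2*area - b + 2) / 2.
C interior_points(const std::vector<P>& p) {
    int n = p.size();
    C sum = 0;
    for (int i = 0; i < n; i++) {
        P u = p[i], v = p[(i + 1) % n];
        sum += u.real() * v.imag() - v.real() * u.imag();
    }
    C twice_area = std::llabs(sum);
    return (twice_area - boundary_points(p) + 2) / 2;
}
```
+
+For the example polygon, the functions return 7 boundary points
+and 6 interior points.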
+
+\section{Distance functions}
+
+\index{distance function}
+\index{Euclidean distance}
+\index{Manhattan distance}
+
+A \key{distance function} defines the distance between
+two points.
+The usual distance function is the
+\key{Euclidean distance} where the distance between
+points $(x_1,y_1)$ and $(x_2,y_2)$ is
+\[\sqrt{(x_2-x_1)^2+(y_2-y_1)^2}.\]
+An alternative distance function is the
+\key{Manhattan distance}
+where the distance between points
+$(x_1,y_1)$ and $(x_2,y_2)$ is
+\[|x_1-x_2|+|y_1-y_2|.\]
+\begin{samepage}
+For example, consider the following picture:
+\begin{center}
+\begin{tikzpicture}
+
+\draw[fill] (2,1) circle [radius=0.05];
+\draw[fill] (5,2) circle [radius=0.05];
+
+\node at (2,0.5) {$(2,1)$};
+\node at (5,1.5) {$(5,2)$};
+
+\draw[dashed] (2,1) -- (5,2);
+
+\draw[fill] (5+2,1) circle [radius=0.05];
+\draw[fill] (5+5,2) circle [radius=0.05];
+
+\node at (5+2,0.5) {$(2,1)$};
+\node at (5+5,1.5) {$(5,2)$};
+
+\draw[dashed] (5+2,1) -- (5+2,2);
+\draw[dashed] (5+2,2) -- (5+5,2);
+
+\node at (3.5,-0.5) {Euclidean distance};
+\node at (5+3.5,-0.5) {Manhattan distance};
+\end{tikzpicture}
+\end{center}
+\end{samepage}
+The Euclidean distance between the points is
+\[\sqrt{(5-2)^2+(2-1)^2}=\sqrt{10}\]
+and the Manhattan distance is
+\[|5-2|+|2-1|=4.\]
+The following picture shows regions that are within a distance of 1
+from the center point, using the Euclidean and Manhattan distances:
+\begin{center}
+\begin{tikzpicture}
+
+\draw[fill=gray!20] (0,0) circle [radius=1];
+\draw[fill] (0,0) circle [radius=0.05];
+
+\node at (0,-1.5) {Euclidean distance};
+
+\draw[fill=gray!20] (5+0,1) -- (5-1,0) -- (5+0,-1) -- (5+1,0) -- (5+0,1);
+\draw[fill] (5,0) circle [radius=0.05];
+\node at (5,-1.5) {Manhattan distance};
+\end{tikzpicture}
+\end{center}
+
+\subsubsection{Rotating coordinates}
+
+Some problems are easier to solve if
+Manhattan distances are used instead of Euclidean distances.
+As an example, consider a problem where we are given
+$n$ points in the two-dimensional plane
+and our task is to calculate the maximum Manhattan
+distance between any two points.
+
+For example, consider the following set of points:
+\begin{center}
+\begin{tikzpicture}[scale=0.65]
+\draw[color=gray] (-1,-1) grid (4,4);
+
+\filldraw (0,2) circle (2.5pt);
+\filldraw (3,3) circle (2.5pt);
+\filldraw (1,0) circle (2.5pt);
+\filldraw (3,1) circle (2.5pt);
+
+\node at (0,1.5) {$A$};
+\node at (3,2.5) {$C$};
+\node at (1,-0.5) {$B$};
+\node at (3,0.5) {$D$};
+\end{tikzpicture}
+\end{center}
+The maximum Manhattan distance is 5
+between points $B$ and $C$:
+\begin{center}
+\begin{tikzpicture}[scale=0.65]
+\draw[color=gray] (-1,-1) grid (4,4);
+
+\filldraw (0,2) circle (2.5pt);
+\filldraw (3,3) circle (2.5pt);
+\filldraw (1,0) circle (2.5pt);
+\filldraw (3,1) circle (2.5pt);
+
+\node at (0,1.5) {$A$};
+\node at (3,2.5) {$C$};
+\node at (1,-0.5) {$B$};
+\node at (3,0.5) {$D$};
+
+\path[draw=red,thick,line width=2pt] (1,0) -- (1,3) -- (3,3);
+\end{tikzpicture}
+\end{center}
+
+A useful technique related to Manhattan distances
+is to rotate all coordinates 45 degrees so that
+a point $(x,y)$ becomes $(x+y,y-x)$.
+For example, after rotating the above points,
+the result is:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.6]
+\draw[color=gray] (0,-3) grid (7,3);
+
+\filldraw (2,2) circle (2.5pt);
+\filldraw (6,0) circle (2.5pt);
+\filldraw (1,-1) circle (2.5pt);
+\filldraw (4,-2) circle (2.5pt);
+
+\node at (2,1.5) {$A$};
+\node at (6,-0.5) {$C$};
+\node at (1,-1.5) {$B$};
+\node at (4,-2.5) {$D$};
+\end{tikzpicture}
+\end{center}
+And the maximum distance is as follows:
+\begin{center}
+\begin{tikzpicture}[scale=0.6]
+\draw[color=gray] (0,-3) grid (7,3);
+
+\filldraw (2,2) circle (2.5pt);
+\filldraw (6,0) circle (2.5pt);
+\filldraw (1,-1) circle (2.5pt);
+\filldraw (4,-2) circle (2.5pt);
+
+\node at (2,1.5) {$A$};
+\node at (6,-0.5) {$C$};
+\node at (1,-1.5) {$B$};
+\node at (4,-2.5) {$D$};
+
+\path[draw=red,thick,line width=2pt] (1,-1) -- (4,2) -- (6,0);
+\end{tikzpicture}
+\end{center}
+
+Consider two points $p_1=(x_1,y_1)$ and $p_2=(x_2,y_2)$ whose rotated
+coordinates are $p'_1=(x'_1,y'_1)$ and $p'_2=(x'_2,y'_2)$.
+Now there are two ways to express the Manhattan distance
+between $p_1$ and $p_2$:
+\[|x_1-x_2|+|y_1-y_2| = \max(|x'_1-x'_2|,|y'_1-y'_2|)\]
+
+For example, if $p_1=(1,0)$ and $p_2=(3,3)$,
+the rotated coordinates are $p'_1=(1,-1)$ and $p'_2=(6,0)$
+and the Manhattan distance is
+\[|1-3|+|0-3| = \max(|1-6|,|-1-0|) = 5.\]
+
+The rotated coordinates provide a simple way
+to operate with Manhattan distances, because we can
+consider x and y coordinates separately.
+To maximize the Manhattan distance between two points,
+we should find two points whose
+rotated coordinates maximize the value of
+\[\max(|x'_1-x'_2|,|y'_1-y'_2|).\]
+This is easy, because either the horizontal or vertical
+difference of the rotated coordinates has to be maximum.
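+
+A minimal sketch of this idea follows.
+The function name \texttt{max\_manhattan} is illustrative, and the
+sketch assumes that there is at least one point.
+It only needs the minimum and maximum of both rotated coordinates,
+so it runs in $O(n)$ time.
+
```cpp
#include <algorithm>
#include <complex>
#include <vector>

typedef long long C;
typedef std::complex<C> P;

// Hypothetical helper: maximum Manhattan distance between any two
// of the given points (assumes that pts is not empty).
C max_manhattan(const std::vector<P>& pts) {
    C min_u = pts[0].real() + pts[0].imag(), max_u = min_u;
    C min_v = pts[0].imag() - pts[0].real(), max_v = min_v;
    for (const P& p : pts) {
        C u = p.real() + p.imag();  // rotated x coordinate x+y
        C v = p.imag() - p.real();  // rotated y coordinate y-x
        min_u = std::min(min_u, u);
        max_u = std::max(max_u, u);
        min_v = std::min(min_v, v);
        max_v = std::max(max_v, v);
    }
    // Either the horizontal or the vertical difference is maximum.
    return std::max(max_u - min_u, max_v - min_v);
}
```
+
+For the points $A$, $B$, $C$ and $D$ above, the function returns 5.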
--- a/chapter30.tex
+++ b/chapter30.tex
@ -0,0 +1,847 @@
+\chapter{Sweep line algorithms}
+
+\index{sweep line}
+
+Many geometric problems can be solved using
+\key{sweep line} algorithms.
+The idea in such algorithms is to represent
+an instance of the problem as a set of events that correspond
+to points in the plane.
+The events are processed in increasing order
+according to their x or y coordinates.
+
+As an example, consider the following problem:
+There is a company that has $n$ employees,
+and we know for each employee their arrival and
+leaving times on a certain day.
+Our task is to calculate the maximum number of
+employees that were in the office at the same time.
+
+The problem can be solved by modeling the situation
+so that each employee is assigned two events that
+correspond to their arrival and leaving times.
+After sorting the events, we go through them
+and keep track of the number of people in the office.
+For example, the table
+\begin{center}
+\begin{tabular}{ccc}
+person & arrival time & leaving time \\
+\hline
+John & 10 & 15 \\
+Maria & 6 & 12 \\
+Peter & 14 & 16 \\
+Lisa & 5 & 13 \\
+\end{tabular}
+\end{center}
+corresponds to the following events:
+\begin{center}
+\begin{tikzpicture}[scale=0.6]
+\draw (0,0) rectangle (17,-6.5);
+\path[draw,thick,-] (10,-1) -- (15,-1);
+\path[draw,thick,-] (6,-2.5) -- (12,-2.5);
+\path[draw,thick,-] (14,-4) -- (16,-4);
+\path[draw,thick,-] (5,-5.5) -- (13,-5.5);
+
+\draw[fill] (10,-1) circle [radius=0.05];
+\draw[fill] (15,-1) circle [radius=0.05];
+\draw[fill] (6,-2.5) circle [radius=0.05];
+\draw[fill] (12,-2.5) circle [radius=0.05];
+\draw[fill] (14,-4) circle [radius=0.05];
+\draw[fill] (16,-4) circle [radius=0.05];
+\draw[fill] (5,-5.5) circle [radius=0.05];
+\draw[fill] (13,-5.5) circle [radius=0.05];
+
+\node at (2,-1) {John};
+\node at (2,-2.5) {Maria};
+\node at (2,-4) {Peter};
+\node at (2,-5.5) {Lisa};
+\end{tikzpicture}
+\end{center}
+We go through the events from left to right
+and maintain a counter.
+Whenever a person arrives, we increase
+the value of the counter by one,
+and when a person leaves,
+we decrease the value of the counter by one.
+The answer to the problem is the maximum
+value of the counter during the algorithm.
+
+In the example, the events are processed as follows:
+\begin{center}
+\begin{tikzpicture}[scale=0.6]
+\path[draw,thick,->] (0.5,0.5) -- (16.5,0.5);
+\draw (0,0) rectangle (17,-6.5);
+\path[draw,thick,-] (10,-1) -- (15,-1);
+\path[draw,thick,-] (6,-2.5) -- (12,-2.5);
+\path[draw,thick,-] (14,-4) -- (16,-4);
+\path[draw,thick,-] (5,-5.5) -- (13,-5.5);
+
+\draw[fill] (10,-1) circle [radius=0.05];
+\draw[fill] (15,-1) circle [radius=0.05];
+\draw[fill] (6,-2.5) circle [radius=0.05];
+\draw[fill] (12,-2.5) circle [radius=0.05];
+\draw[fill] (14,-4) circle [radius=0.05];
+\draw[fill] (16,-4) circle [radius=0.05];
+\draw[fill] (5,-5.5) circle [radius=0.05];
+\draw[fill] (13,-5.5) circle [radius=0.05];
+
+\node at (2,-1) {John};
+\node at (2,-2.5) {Maria};
+\node at (2,-4) {Peter};
+\node at (2,-5.5) {Lisa};
+
+\path[draw,dashed] (10,0)--(10,-6.5);
+\path[draw,dashed] (15,0)--(15,-6.5);
+\path[draw,dashed] (6,0)--(6,-6.5);
+\path[draw,dashed] (12,0)--(12,-6.5);
+\path[draw,dashed] (14,0)--(14,-6.5);
+\path[draw,dashed] (16,0)--(16,-6.5);
+\path[draw,dashed] (5,0)--(5,-6.5);
+\path[draw,dashed] (13,0)--(13,-6.5);
+
+\node at (10,-7) {$+$};
+\node at (15,-7) {$-$};
+\node at (6,-7) {$+$};
+\node at (12,-7) {$-$};
+\node at (14,-7) {$+$};
+\node at (16,-7) {$-$};
+\node at (5,-7) {$+$};
+\node at (13,-7) {$-$};
+
+\node at (10,-8) {$3$};
+\node at (15,-8) {$1$};
+\node at (6,-8) {$2$};
+\node at (12,-8) {$2$};
+\node at (14,-8) {$2$};
+\node at (16,-8) {$0$};
+\node at (5,-8) {$1$};
+\node at (13,-8) {$1$};
+\end{tikzpicture}
+\end{center}
+The symbols $+$ and $-$ indicate whether the
+value of the counter increases or decreases,
+and the value of the counter is shown below.
+The maximum value of the counter is 3
+between John's arrival and Maria's leaving.
+
+The running time of the algorithm is $O(n \log n)$,
+because sorting the events takes $O(n \log n)$ time
+and the rest of the algorithm takes $O(n)$ time.
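+
+The algorithm can be sketched as follows.
+The function name \texttt{max\_employees} is illustrative.
+The sketch additionally assumes that if one person leaves at the
+same moment as another arrives, the leaving event is processed
+first; the text does not specify how such ties should be handled.
+
```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical helper: times[i] = (arrival time, leaving time).
int max_employees(const std::vector<std::pair<int, int>>& times) {
    // Each event is (time, change of the counter).
    std::vector<std::pair<int, int>> events;
    for (const auto& t : times) {
        events.push_back({t.first, +1});
        events.push_back({t.second, -1});
    }
    // Sorting pairs puts -1 before +1 when the times are equal.
    std::sort(events.begin(), events.end());
    int counter = 0, best = 0;
    for (const auto& e : events) {
        counter += e.second;
        best = std::max(best, counter);
    }
    return best;
}
```
+
+For the table above, the function returns 3.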
+
+\section{Intersection points}
+
+\index{intersection point}
+
+Given a set of $n$ line segments, each of them being either
+horizontal or vertical, consider the problem of
+counting the total number of intersection points.
+For example, when the line segments are
+\begin{center}
+\begin{tikzpicture}[scale=0.5]
+\path[draw,thick,-] (0,2) -- (5,2);
+\path[draw,thick,-] (1,4) -- (6,4);
+\path[draw,thick,-] (6,3) -- (10,3);
+\path[draw,thick,-] (2,1) -- (2,6);
+\path[draw,thick,-] (8,2) -- (8,5);
+\end{tikzpicture}
+\end{center}
+there are three intersection points:
+\begin{center}
+\begin{tikzpicture}[scale=0.5]
+\path[draw,thick,-] (0,2) -- (5,2);
+\path[draw,thick,-] (1,4) -- (6,4);
+\path[draw,thick,-] (6,3) -- (10,3);
+\path[draw,thick,-] (2,1) -- (2,6);
+\path[draw,thick,-] (8,2) -- (8,5);
+
+\draw[fill] (2,2) circle [radius=0.15];
+\draw[fill] (2,4) circle [radius=0.15];
+\draw[fill] (8,3) circle [radius=0.15];
+
+\end{tikzpicture}
+\end{center}
+
+It is easy to solve the problem in $O(n^2)$ time,
+because we can go through all possible pairs of line segments
+and check if they intersect.
+However, we can solve the problem more efficiently
+in $O(n \log n)$ time using a sweep line algorithm
+and a range query data structure.
+
+The idea is to process the endpoints of the line
+segments from left to right and 
+focus on three types of events:
+\begin{enumerate}[noitemsep]
+\item[(1)] horizontal segment begins
+\item[(2)] horizontal segment ends
+\item[(3)] vertical segment
+\end{enumerate}
+
+The following events correspond to the example:
+\begin{center}
+\begin{tikzpicture}[scale=0.6]
+\path[draw,dashed] (0,2) -- (5,2);
+\path[draw,dashed] (1,4) -- (6,4);
+\path[draw,dashed] (6,3) -- (10,3);
+\path[draw,dashed] (2,1) -- (2,6);
+\path[draw,dashed] (8,2) -- (8,5);
+
+\node at (0,2) {$1$};
+\node at (5,2) {$2$};
+\node at (1,4) {$1$};
+\node at (6,4) {$2$};
+\node at (6,3) {$1$};
+\node at (10,3) {$2$};
+
+\node at (2,3.5) {$3$};
+\node at (8,3.5) {$3$};
+\end{tikzpicture}
+\end{center}
+
+We go through the events from left to right
+and use a data structure that maintains a set of
+y coordinates where there is an active horizontal segment.
+At event 1, we add the y coordinate of the segment
+to the set, and at event 2, we remove the
+y coordinate from the set.
+
+Intersection points are calculated at event 3.
+When there is a vertical segment between y coordinates
+$y_1$ and $y_2$, we count the number of active
+horizontal segments whose y coordinate is between
+$y_1$ and $y_2$, and add this number to the total
+number of intersection points.
+
+To store y coordinates of horizontal segments,
+we can use a binary indexed tree or a segment tree,
+possibly with index compression.
+When such structures are used, processing each event
+takes $O(\log n)$ time, so the total running
+time of the algorithm is $O(n \log n)$.
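+
+The following sketch shows one way to implement the algorithm
+with a binary indexed tree and index compression.
+The types \texttt{Segment} and \texttt{Event} and the function name
+\texttt{count\_intersections} are illustrative.
+The sketch assumes that horizontal segments do not overlap each
+other and that vertical segments do not overlap each other, and it
+processes events with the same x coordinate in the order 1, 3, 2,
+so that segments touching at an endpoint are counted as
+intersecting.
+
```cpp
#include <algorithm>
#include <vector>

// Horizontal if y1 == y2, otherwise assumed vertical (x1 == x2).
struct Segment { int x1, y1, x2, y2; };

// type 0: horizontal begins, 1: vertical, 2: horizontal ends
struct Event { int x, type, lo, hi; };

long long count_intersections(const std::vector<Segment>& segs) {
    // Index compression of the y coordinates.
    std::vector<int> ys;
    for (const auto& s : segs) { ys.push_back(s.y1); ys.push_back(s.y2); }
    std::sort(ys.begin(), ys.end());
    ys.erase(std::unique(ys.begin(), ys.end()), ys.end());
    auto idx = [&](int y) -> int {
        return int(std::lower_bound(ys.begin(), ys.end(), y) - ys.begin()) + 1;
    };

    // Binary indexed tree over the compressed y coordinates.
    std::vector<int> tree(ys.size() + 1, 0);
    auto add = [&](int k, int v) {
        for (; k < (int)tree.size(); k += k & -k) tree[k] += v;
    };
    auto sum = [&](int k) -> int {
        int s = 0;
        for (; k > 0; k -= k & -k) s += tree[k];
        return s;
    };

    std::vector<Event> events;
    for (const auto& s : segs) {
        if (s.y1 == s.y2) {
            events.push_back(Event{std::min(s.x1, s.x2), 0, idx(s.y1), 0});
            events.push_back(Event{std::max(s.x1, s.x2), 2, idx(s.y1), 0});
        } else {
            events.push_back(Event{s.x1, 1, idx(std::min(s.y1, s.y2)),
                                   idx(std::max(s.y1, s.y2))});
        }
    }
    std::sort(events.begin(), events.end(),
              [](const Event& a, const Event& b) -> bool {
                  return a.x != b.x ? a.x < b.x : a.type < b.type;
              });

    long long total = 0;
    for (const auto& e : events) {
        if (e.type == 0) add(e.lo, 1);        // segment becomes active
        else if (e.type == 2) add(e.lo, -1);  // segment is removed
        else total += sum(e.hi) - sum(e.lo - 1);
    }
    return total;
}
```
+
+For the five line segments of the example, the function returns 3.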
+
+\section{Closest pair problem}
+
+\index{closest pair}
+
+Given a set of $n$ points, our next problem is
+to find two points whose Euclidean distance is minimum.
+For example, if the points are
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0)--(12,0)--(12,4)--(0,4)--(0,0);
+
+\draw (1,2) circle [radius=0.1];
+\draw (3,1) circle [radius=0.1];
+\draw (4,3) circle [radius=0.1];
+\draw (5.5,1.5) circle [radius=0.1];
+\draw (6,2.5) circle [radius=0.1];
+\draw (7,1) circle [radius=0.1];
+\draw (9,1.5) circle [radius=0.1];
+\draw (10,2) circle [radius=0.1];
+\draw (1.5,3.5) circle [radius=0.1];
+\draw (1.5,1) circle [radius=0.1];
+\draw (2.5,3) circle [radius=0.1];
+\draw (4.5,1.5) circle [radius=0.1];
+\draw (5.25,0.5) circle [radius=0.1];
+\draw (6.5,2) circle [radius=0.1];
+\end{tikzpicture}
+\end{center}
+\begin{samepage}
+we should find the following points:
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0)--(12,0)--(12,4)--(0,4)--(0,0);
+
+\draw (1,2) circle [radius=0.1];
+\draw (3,1) circle [radius=0.1];
+\draw (4,3) circle [radius=0.1];
+\draw (5.5,1.5) circle [radius=0.1];
+\draw[fill] (6,2.5) circle [radius=0.1];
+\draw (7,1) circle [radius=0.1];
+\draw (9,1.5) circle [radius=0.1];
+\draw (10,2) circle [radius=0.1];
+\draw (1.5,3.5) circle [radius=0.1];
+\draw (1.5,1) circle [radius=0.1];
+\draw (2.5,3) circle [radius=0.1];
+\draw (4.5,1.5) circle [radius=0.1];
+\draw (5.25,0.5) circle [radius=0.1];
+\draw[fill] (6.5,2) circle [radius=0.1];
+\end{tikzpicture}
+\end{center}
+\end{samepage}
+
+This is another example of a problem
+that can be solved in $O(n \log n)$ time
+using a sweep line algorithm\footnote{Besides this approach,
+there is also an
+$O(n \log n)$ time divide-and-conquer algorithm \cite{sha75}
+that divides the points into two sets and recursively
+solves the problem for both sets.}.
+We go through the points from left to right
+and maintain a value $d$: the minimum distance
+between two points seen so far.
+At each point, we find the nearest point to the left.
+If the distance is less than $d$, it is the
+new minimum distance and we update
+the value of $d$.
+
+If the current point is $(x,y)$
+and there is a point to the left
+within a distance of less than $d$,
+the x coordinate of such a point must
+be in the range $[x-d,x]$ and the y coordinate
+must be in the range $[y-d,y+d]$.
+Thus, it suffices to only consider points
+that are located in those ranges,
+which makes the algorithm efficient.
+
+For example, in the following picture, the
+region marked with dashed lines contains
+the points that can be within a distance of $d$
+from the active point:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0)--(12,0)--(12,4)--(0,4)--(0,0);
+
+\draw (1,2) circle [radius=0.1];
+\draw (3,1) circle [radius=0.1];
+\draw (4,3) circle [radius=0.1];
+\draw (5.5,1.5) circle [radius=0.1];
+\draw (6,2.5) circle [radius=0.1];
+\draw (7,1) circle [radius=0.1];
+\draw (9,1.5) circle [radius=0.1];
+\draw (10,2) circle [radius=0.1];
+\draw (1.5,3.5) circle [radius=0.1];
+\draw (1.5,1) circle [radius=0.1];
+\draw (2.5,3) circle [radius=0.1];
+\draw (4.5,1.5) circle [radius=0.1];
+\draw (5.25,0.5) circle [radius=0.1];
+\draw[fill] (6.5,2) circle [radius=0.1];
+
+\draw[dashed] (6.5,0.75)--(6.5,3.25);
+\draw[dashed] (5.25,0.75)--(5.25,3.25);
+\draw[dashed] (5.25,0.75)--(6.5,0.75);
+\draw[dashed] (5.25,3.25)--(6.5,3.25);
+
+\draw [decoration={brace}, decorate, line width=0.3mm] (5.25,3.5) -- (6.5,3.5);
+\node at (5.875,4) {$d$};
+\draw [decoration={brace}, decorate, line width=0.3mm] (6.75,3.25) -- (6.75,2);
+\node at (7.25,2.625) {$d$};
+\end{tikzpicture}
+\end{center}
+
+The efficiency of the algorithm is based on the fact
+that the region always contains
+only $O(1)$ points.
+We can go through those points in $O(\log n)$ time
+by maintaining a set of points whose x coordinate
+is in the range $[x-d,x]$, in increasing order according
+to their y coordinates.
+
+The time complexity of the algorithm is $O(n \log n)$,
+because we go through $n$ points and
+find for each point the nearest point to the left
+in $O(\log n)$ time.
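+
+A possible implementation is sketched below.
+The type \texttt{Pt} and the function name \texttt{closest\_pair}
+are illustrative, and the sketch assumes that there are at least
+two points (otherwise it returns infinity).
+The active points are stored in a \texttt{set} as pairs $(y,x)$,
+so that they are ordered according to their y coordinates.
+
```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <set>
#include <utility>
#include <vector>

typedef std::pair<long double, long double> Pt;  // (x, y)

// Hypothetical helper: minimum Euclidean distance between two points.
long double closest_pair(std::vector<Pt> pts) {
    const long double INF = std::numeric_limits<long double>::infinity();
    std::sort(pts.begin(), pts.end());  // from left to right
    std::set<Pt> active;                // stored as (y, x)
    long double d = INF;
    std::size_t left = 0;
    for (std::size_t i = 0; i < pts.size(); i++) {
        long double x = pts[i].first, y = pts[i].second;
        // Remove points whose x coordinate is less than x - d.
        while (left < i && pts[left].first < x - d) {
            active.erase(Pt(pts[left].second, pts[left].first));
            left++;
        }
        // Go through active points whose y is in [y - d, y + d].
        auto it = active.lower_bound(Pt(y - d, -INF));
        for (; it != active.end() && it->first <= y + d; ++it) {
            d = std::min(d, std::hypot(x - it->second, y - it->first));
        }
        active.insert(Pt(y, x));
    }
    return d;
}
```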
+
+\section{Convex hull problem}
+
+A \key{convex hull} is the smallest convex polygon
+that contains all points of a given set.
+Convexity means that a line segment between
+any two vertices of the polygon is completely
+inside the polygon.
+
+\begin{samepage}
+For example, for the points
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) circle [radius=0.1];
+\draw (4,-1) circle [radius=0.1];
+\draw (7,1) circle [radius=0.1];
+\draw (6,3) circle [radius=0.1];
+\draw (2,4) circle [radius=0.1];
+\draw (0,2) circle [radius=0.1];
+
+\draw (1,1) circle [radius=0.1];
+\draw (2,2) circle [radius=0.1];
+\draw (3,2) circle [radius=0.1];
+\draw (4,0) circle [radius=0.1];
+\draw (4,3) circle [radius=0.1];
+\draw (5,2) circle [radius=0.1];
+\draw (6,1) circle [radius=0.1];
+\end{tikzpicture}
+\end{center}
+\end{samepage}
+the convex hull is as follows:
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0)--(4,-1)--(7,1)--(6,3)--(2,4)--(0,2)--(0,0);
+
+\draw (0,0) circle [radius=0.1];
+\draw (4,-1) circle [radius=0.1];
+\draw (7,1) circle [radius=0.1];
+\draw (6,3) circle [radius=0.1];
+\draw (2,4) circle [radius=0.1];
+\draw (0,2) circle [radius=0.1];
+
+\draw (1,1) circle [radius=0.1];
+\draw (2,2) circle [radius=0.1];
+\draw (3,2) circle [radius=0.1];
+\draw (4,0) circle [radius=0.1];
+\draw (4,3) circle [radius=0.1];
+\draw (5,2) circle [radius=0.1];
+\draw (6,1) circle [radius=0.1];
+\end{tikzpicture}
+\end{center}
+
+\index{Andrew's algorithm}
+
+\key{Andrew's algorithm} \cite{and79} provides
+an easy way to
+construct the convex hull for a set of points
+in $O(n \log n)$ time.
+The algorithm first locates the leftmost
+and rightmost points, and then
+constructs the convex hull in two parts:
+first the upper hull and then the lower hull.
+Both parts are similar, so we can focus on
+constructing the upper hull.
+
+First, we sort the points primarily according to
+x coordinates and secondarily according to y coordinates.
+After this, we go through the points and
+add each point to the hull.
+Each time after adding a point to the hull,
+we make sure that the last line segment
+in the hull does not turn left.
+As long as it turns left, we repeatedly remove the
+second-to-last point from the hull.
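+
+The steps above can be sketched in code as follows.
+The function name \texttt{convex\_hull} is illustrative.
+The sketch performs the removal just before a new point is added,
+which has the same effect as removing the second-to-last point
+afterwards, and it also removes points that lie on a straight line
+between two hull vertices.
+The lower hull is built by a second pass over the points in
+reverse order, so the hull is returned in clockwise order
+starting from the leftmost point.
+
```cpp
#include <algorithm>
#include <complex>
#include <cstddef>
#include <vector>

typedef long long C;
typedef std::complex<C> P;

C cross(P a, P b) {
    return a.real() * b.imag() - b.real() * a.imag();
}

// Hypothetical helper: vertices of the convex hull in clockwise order.
std::vector<P> convex_hull(std::vector<P> pts) {
    // Sort primarily by x and secondarily by y; drop duplicates.
    std::sort(pts.begin(), pts.end(), [](const P& a, const P& b) -> bool {
        if (a.real() != b.real()) return a.real() < b.real();
        return a.imag() < b.imag();
    });
    pts.erase(std::unique(pts.begin(), pts.end()), pts.end());
    if (pts.size() < 3) return pts;

    std::vector<P> hull;
    // Pass 0 builds the upper hull, pass 1 the lower hull.
    for (int pass = 0; pass < 2; pass++) {
        std::size_t start = hull.size();
        for (const P& p : pts) {
            // Remove the last point while the path through it
            // to p turns left or goes straight.
            while (hull.size() >= start + 2 &&
                   cross(hull[hull.size() - 1] - hull[hull.size() - 2],
                         p - hull[hull.size() - 1]) >= 0) {
                hull.pop_back();
            }
            hull.push_back(p);
        }
        hull.pop_back();  // this point begins the next pass
        std::reverse(pts.begin(), pts.end());
    }
    return hull;
}
```
+
+For the example set of points, the function returns the six
+vertices $(0,0)$, $(0,2)$, $(2,4)$, $(6,3)$, $(7,1)$ and $(4,-1)$.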
+
+The following pictures show how
+Andrew's algorithm works:
+\\
+\begin{tabular}{ccccccc}
+\\
+\begin{tikzpicture}[scale=0.3]
+\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
+\draw (0,0) circle [radius=0.1];
+\draw (4,-1) circle [radius=0.1];
+\draw (7,1) circle [radius=0.1];
+\draw (6,3) circle [radius=0.1];
+\draw (2,4) circle [radius=0.1];
+\draw (0,2) circle [radius=0.1];
+
+\draw (1,1) circle [radius=0.1];
+\draw (2,2) circle [radius=0.1];
+\draw (3,2) circle [radius=0.1];
+\draw (4,0) circle [radius=0.1];
+\draw (4,3) circle [radius=0.1];
+\draw (5,2) circle [radius=0.1];
+\draw (6,1) circle [radius=0.1];
+
+\draw (0,0)--(0,2);
+\end{tikzpicture}
+& \hspace{0.1cm} &
+\begin{tikzpicture}[scale=0.3]
+\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
+\draw (0,0) circle [radius=0.1];
+\draw (4,-1) circle [radius=0.1];
+\draw (7,1) circle [radius=0.1];
+\draw (6,3) circle [radius=0.1];
+\draw (2,4) circle [radius=0.1];
+\draw (0,2) circle [radius=0.1];
+
+\draw (1,1) circle [radius=0.1];
+\draw (2,2) circle [radius=0.1];
+\draw (3,2) circle [radius=0.1];
+\draw (4,0) circle [radius=0.1];
+\draw (4,3) circle [radius=0.1];
+\draw (5,2) circle [radius=0.1];
+\draw (6,1) circle [radius=0.1];
+
+\draw (0,0)--(0,2)--(1,1);
+\end{tikzpicture}
+& \hspace{0.1cm} &
+\begin{tikzpicture}[scale=0.3]
+\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
+\draw (0,0) circle [radius=0.1];
+\draw (4,-1) circle [radius=0.1];
+\draw (7,1) circle [radius=0.1];
+\draw (6,3) circle [radius=0.1];
+\draw (2,4) circle [radius=0.1];
+\draw (0,2) circle [radius=0.1];
+
+\draw (1,1) circle [radius=0.1];
+\draw (2,2) circle [radius=0.1];
+\draw (3,2) circle [radius=0.1];
+\draw (4,0) circle [radius=0.1];
+\draw (4,3) circle [radius=0.1];
+\draw (5,2) circle [radius=0.1];
+\draw (6,1) circle [radius=0.1];
+
+\draw (0,0)--(0,2)--(1,1)--(2,2);
+\end{tikzpicture}
+& \hspace{0.1cm} &
+\begin{tikzpicture}[scale=0.3]
+\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
+\draw (0,0) circle [radius=0.1];
+\draw (4,-1) circle [radius=0.1];
+\draw (7,1) circle [radius=0.1];
+\draw (6,3) circle [radius=0.1];
+\draw (2,4) circle [radius=0.1];
+\draw (0,2) circle [radius=0.1];
+
+\draw (1,1) circle [radius=0.1];
+\draw (2,2) circle [radius=0.1];
+\draw (3,2) circle [radius=0.1];
+\draw (4,0) circle [radius=0.1];
+\draw (4,3) circle [radius=0.1];
+\draw (5,2) circle [radius=0.1];
+\draw (6,1) circle [radius=0.1];
+
+\draw (0,0)--(0,2)--(2,2);
+\end{tikzpicture}
+\\
+1 & & 2 & & 3 & & 4 \\
+\end{tabular}
+\\
+\begin{tabular}{ccccccc}
+\begin{tikzpicture}[scale=0.3]
+\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
+\draw (0,0) circle [radius=0.1];
+\draw (4,-1) circle [radius=0.1];
+\draw (7,1) circle [radius=0.1];
+\draw (6,3) circle [radius=0.1];
+\draw (2,4) circle [radius=0.1];
+\draw (0,2) circle [radius=0.1];
+
+\draw (1,1) circle [radius=0.1];
+\draw (2,2) circle [radius=0.1];
+\draw (3,2) circle [radius=0.1];
+\draw (4,0) circle [radius=0.1];
+\draw (4,3) circle [radius=0.1];
+\draw (5,2) circle [radius=0.1];
+\draw (6,1) circle [radius=0.1];
+
+\draw (0,0)--(0,2)--(2,2)--(2,4);
+\end{tikzpicture}
+& \hspace{0.1cm} &
+\begin{tikzpicture}[scale=0.3]
+\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
+\draw (0,0) circle [radius=0.1];
+\draw (4,-1) circle [radius=0.1];
+\draw (7,1) circle [radius=0.1];
+\draw (6,3) circle [radius=0.1];
+\draw (2,4) circle [radius=0.1];
+\draw (0,2) circle [radius=0.1];
+
+\draw (1,1) circle [radius=0.1];
+\draw (2,2) circle [radius=0.1];
+\draw (3,2) circle [radius=0.1];
+\draw (4,0) circle [radius=0.1];
+\draw (4,3) circle [radius=0.1];
+\draw (5,2) circle [radius=0.1];
+\draw (6,1) circle [radius=0.1];
+
+\draw (0,0)--(0,2)--(2,4);
+\end{tikzpicture}
+& \hspace{0.1cm} &
+\begin{tikzpicture}[scale=0.3]
+\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
+\draw (0,0) circle [radius=0.1];
+\draw (4,-1) circle [radius=0.1];
+\draw (7,1) circle [radius=0.1];
+\draw (6,3) circle [radius=0.1];
+\draw (2,4) circle [radius=0.1];
+\draw (0,2) circle [radius=0.1];
+
+\draw (1,1) circle [radius=0.1];
+\draw (2,2) circle [radius=0.1];
+\draw (3,2) circle [radius=0.1];
+\draw (4,0) circle [radius=0.1];
+\draw (4,3) circle [radius=0.1];
+\draw (5,2) circle [radius=0.1];
+\draw (6,1) circle [radius=0.1];
+
+\draw (0,0)--(0,2)--(2,4)--(3,2);
+\end{tikzpicture}
+& \hspace{0.1cm} &
+\begin{tikzpicture}[scale=0.3]
+\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
+\draw (0,0) circle [radius=0.1];
+\draw (4,-1) circle [radius=0.1];
+\draw (7,1) circle [radius=0.1];
+\draw (6,3) circle [radius=0.1];
+\draw (2,4) circle [radius=0.1];
+\draw (0,2) circle [radius=0.1];
+
+\draw (1,1) circle [radius=0.1];
+\draw (2,2) circle [radius=0.1];
+\draw (3,2) circle [radius=0.1];
+\draw (4,0) circle [radius=0.1];
+\draw (4,3) circle [radius=0.1];
+\draw (5,2) circle [radius=0.1];
+\draw (6,1) circle [radius=0.1];
+
+\draw (0,0)--(0,2)--(2,4)--(3,2)--(4,-1);
+\end{tikzpicture}
+\\
+5 & & 6 & & 7 & & 8 \\
+\end{tabular}
+\\
+\begin{tabular}{ccccccc}
+\begin{tikzpicture}[scale=0.3]
+\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
+\draw (0,0) circle [radius=0.1];
+\draw (4,-1) circle [radius=0.1];
+\draw (7,1) circle [radius=0.1];
+\draw (6,3) circle [radius=0.1];
+\draw (2,4) circle [radius=0.1];
+\draw (0,2) circle [radius=0.1];
+
+\draw (1,1) circle [radius=0.1];
+\draw (2,2) circle [radius=0.1];
+\draw (3,2) circle [radius=0.1];
+\draw (4,0) circle [radius=0.1];
+\draw (4,3) circle [radius=0.1];
+\draw (5,2) circle [radius=0.1];
+\draw (6,1) circle [radius=0.1];
+
+\draw (0,0)--(0,2)--(2,4)--(3,2)--(4,-1)--(4,0);
+\end{tikzpicture}
+& \hspace{0.1cm} &
+\begin{tikzpicture}[scale=0.3]
+\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
+\draw (0,0) circle [radius=0.1];
+\draw (4,-1) circle [radius=0.1];
+\draw (7,1) circle [radius=0.1];
+\draw (6,3) circle [radius=0.1];
+\draw (2,4) circle [radius=0.1];
+\draw (0,2) circle [radius=0.1];
+
+\draw (1,1) circle [radius=0.1];
+\draw (2,2) circle [radius=0.1];
+\draw (3,2) circle [radius=0.1];
+\draw (4,0) circle [radius=0.1];
+\draw (4,3) circle [radius=0.1];
+\draw (5,2) circle [radius=0.1];
+\draw (6,1) circle [radius=0.1];
+
+\draw (0,0)--(0,2)--(2,4)--(3,2)--(4,0);
+\end{tikzpicture}
+& \hspace{0.1cm} &
+\begin{tikzpicture}[scale=0.3]
+\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
+\draw (0,0) circle [radius=0.1];
+\draw (4,-1) circle [radius=0.1];
+\draw (7,1) circle [radius=0.1];
+\draw (6,3) circle [radius=0.1];
+\draw (2,4) circle [radius=0.1];
+\draw (0,2) circle [radius=0.1];
+
+\draw (1,1) circle [radius=0.1];
+\draw (2,2) circle [radius=0.1];
+\draw (3,2) circle [radius=0.1];
+\draw (4,0) circle [radius=0.1];
+\draw (4,3) circle [radius=0.1];
+\draw (5,2) circle [radius=0.1];
+\draw (6,1) circle [radius=0.1];
+
+\draw (0,0)--(0,2)--(2,4)--(3,2)--(4,0)--(4,3);
+\end{tikzpicture}
+& \hspace{0.1cm} &
+\begin{tikzpicture}[scale=0.3]
+\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
+\draw (0,0) circle [radius=0.1];
+\draw (4,-1) circle [radius=0.1];
+\draw (7,1) circle [radius=0.1];
+\draw (6,3) circle [radius=0.1];
+\draw (2,4) circle [radius=0.1];
+\draw (0,2) circle [radius=0.1];
+
+\draw (1,1) circle [radius=0.1];
+\draw (2,2) circle [radius=0.1];
+\draw (3,2) circle [radius=0.1];
+\draw (4,0) circle [radius=0.1];
+\draw (4,3) circle [radius=0.1];
+\draw (5,2) circle [radius=0.1];
+\draw (6,1) circle [radius=0.1];
+
+\draw (0,0)--(0,2)--(2,4)--(3,2)--(4,3);
+\end{tikzpicture}
+\\
+9 & & 10 & & 11 & & 12 \\
+\end{tabular}
+\\
+\begin{tabular}{ccccccc}
+\begin{tikzpicture}[scale=0.3]
+\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
+\draw (0,0) circle [radius=0.1];
+\draw (4,-1) circle [radius=0.1];
+\draw (7,1) circle [radius=0.1];
+\draw (6,3) circle [radius=0.1];
+\draw (2,4) circle [radius=0.1];
+\draw (0,2) circle [radius=0.1];
+
+\draw (1,1) circle [radius=0.1];
+\draw (2,2) circle [radius=0.1];
+\draw (3,2) circle [radius=0.1];
+\draw (4,0) circle [radius=0.1];
+\draw (4,3) circle [radius=0.1];
+\draw (5,2) circle [radius=0.1];
+\draw (6,1) circle [radius=0.1];
+
+\draw (0,0)--(0,2)--(2,4)--(4,3);
+\end{tikzpicture}
+& \hspace{0.1cm} &
+\begin{tikzpicture}[scale=0.3]
+\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
+\draw (0,0) circle [radius=0.1];
+\draw (4,-1) circle [radius=0.1];
+\draw (7,1) circle [radius=0.1];
+\draw (6,3) circle [radius=0.1];
+\draw (2,4) circle [radius=0.1];
+\draw (0,2) circle [radius=0.1];
+
+\draw (1,1) circle [radius=0.1];
+\draw (2,2) circle [radius=0.1];
+\draw (3,2) circle [radius=0.1];
+\draw (4,0) circle [radius=0.1];
+\draw (4,3) circle [radius=0.1];
+\draw (5,2) circle [radius=0.1];
+\draw (6,1) circle [radius=0.1];
+
+\draw (0,0)--(0,2)--(2,4)--(4,3)--(5,2);
+\end{tikzpicture}
+& \hspace{0.1cm} &
+\begin{tikzpicture}[scale=0.3]
+\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
+\draw (0,0) circle [radius=0.1];
+\draw (4,-1) circle [radius=0.1];
+\draw (7,1) circle [radius=0.1];
+\draw (6,3) circle [radius=0.1];
+\draw (2,4) circle [radius=0.1];
+\draw (0,2) circle [radius=0.1];
+
+\draw (1,1) circle [radius=0.1];
+\draw (2,2) circle [radius=0.1];
+\draw (3,2) circle [radius=0.1];
+\draw (4,0) circle [radius=0.1];
+\draw (4,3) circle [radius=0.1];
+\draw (5,2) circle [radius=0.1];
+\draw (6,1) circle [radius=0.1];
+
+\draw (0,0)--(0,2)--(2,4)--(4,3)--(5,2)--(6,1);
+\end{tikzpicture}
+& \hspace{0.1cm} &
+\begin{tikzpicture}[scale=0.3]
+\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
+\draw (0,0) circle [radius=0.1];
+\draw (4,-1) circle [radius=0.1];
+\draw (7,1) circle [radius=0.1];
+\draw (6,3) circle [radius=0.1];
+\draw (2,4) circle [radius=0.1];
+\draw (0,2) circle [radius=0.1];
+
+\draw (1,1) circle [radius=0.1];
+\draw (2,2) circle [radius=0.1];
+\draw (3,2) circle [radius=0.1];
+\draw (4,0) circle [radius=0.1];
+\draw (4,3) circle [radius=0.1];
+\draw (5,2) circle [radius=0.1];
+\draw (6,1) circle [radius=0.1];
+
+\draw (0,0)--(0,2)--(2,4)--(4,3)--(5,2)--(6,1)--(6,3);
+\end{tikzpicture}
+\\
+13 & & 14 & & 15 & & 16 \\
+\end{tabular}
+\\
+\begin{tabular}{ccccccc}
+\begin{tikzpicture}[scale=0.3]
+\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
+\draw (0,0) circle [radius=0.1];
+\draw (4,-1) circle [radius=0.1];
+\draw (7,1) circle [radius=0.1];
+\draw (6,3) circle [radius=0.1];
+\draw (2,4) circle [radius=0.1];
+\draw (0,2) circle [radius=0.1];
+
+\draw (1,1) circle [radius=0.1];
+\draw (2,2) circle [radius=0.1];
+\draw (3,2) circle [radius=0.1];
+\draw (4,0) circle [radius=0.1];
+\draw (4,3) circle [radius=0.1];
+\draw (5,2) circle [radius=0.1];
+\draw (6,1) circle [radius=0.1];
+
+\draw (0,0)--(0,2)--(2,4)--(4,3)--(5,2)--(6,3);
+\end{tikzpicture}
+& \hspace{0.1cm} &
+\begin{tikzpicture}[scale=0.3]
+\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
+\draw (0,0) circle [radius=0.1];
+\draw (4,-1) circle [radius=0.1];
+\draw (7,1) circle [radius=0.1];
+\draw (6,3) circle [radius=0.1];
+\draw (2,4) circle [radius=0.1];
+\draw (0,2) circle [radius=0.1];
+
+\draw (1,1) circle [radius=0.1];
+\draw (2,2) circle [radius=0.1];
+\draw (3,2) circle [radius=0.1];
+\draw (4,0) circle [radius=0.1];
+\draw (4,3) circle [radius=0.1];
+\draw (5,2) circle [radius=0.1];
+\draw (6,1) circle [radius=0.1];
+
+\draw (0,0)--(0,2)--(2,4)--(4,3)--(6,3);
+\end{tikzpicture}
+& \hspace{0.1cm} &
+\begin{tikzpicture}[scale=0.3]
+\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
+\draw (0,0) circle [radius=0.1];
+\draw (4,-1) circle [radius=0.1];
+\draw (7,1) circle [radius=0.1];
+\draw (6,3) circle [radius=0.1];
+\draw (2,4) circle [radius=0.1];
+\draw (0,2) circle [radius=0.1];
+
+\draw (1,1) circle [radius=0.1];
+\draw (2,2) circle [radius=0.1];
+\draw (3,2) circle [radius=0.1];
+\draw (4,0) circle [radius=0.1];
+\draw (4,3) circle [radius=0.1];
+\draw (5,2) circle [radius=0.1];
+\draw (6,1) circle [radius=0.1];
+
+\draw (0,0)--(0,2)--(2,4)--(6,3);
+\end{tikzpicture}
+& \hspace{0.1cm} &
+\begin{tikzpicture}[scale=0.3]
+\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
+\draw (0,0) circle [radius=0.1];
+\draw (4,-1) circle [radius=0.1];
+\draw (7,1) circle [radius=0.1];
+\draw (6,3) circle [radius=0.1];
+\draw (2,4) circle [radius=0.1];
+\draw (0,2) circle [radius=0.1];
+
+\draw (1,1) circle [radius=0.1];
+\draw (2,2) circle [radius=0.1];
+\draw (3,2) circle [radius=0.1];
+\draw (4,0) circle [radius=0.1];
+\draw (4,3) circle [radius=0.1];
+\draw (5,2) circle [radius=0.1];
+\draw (6,1) circle [radius=0.1];
+
+\draw (0,0)--(0,2)--(2,4)--(6,3)--(7,1);
+\end{tikzpicture}
+\\
+17 & & 18 & & 19 & & 20
+\end{tabular}
+
+
+
+
--- a/list.tex
+++ b/list.tex
@ -0,0 +1,388 @@
+\begin{thebibliography}{99}
+
+\bibitem{aho83}
+  A. V. Aho, J. E. Hopcroft and J. D. Ullman.
+  \emph{Data Structures and Algorithms},
+  Addison-Wesley, 1983.
+
+\bibitem{ahu91}
+  R. K. Ahuja and J. B. Orlin.
+  Distance directed augmenting path algorithms for maximum flow and parametric maximum flow problems.
+  \emph{Naval Research Logistics}, 38(3):413--430, 1991.
+
+\bibitem{and79}
+  A. M. Andrew.
+  Another efficient algorithm for convex hulls in two dimensions.
+  \emph{Information Processing Letters}, 9(5):216--219, 1979.
+
+\bibitem{asp79}
+  B. Aspvall, M. F. Plass and R. E. Tarjan.
+  A linear-time algorithm for testing the truth of certain quantified boolean formulas.
+  \emph{Information Processing Letters}, 8(3):121--123, 1979.
+
+\bibitem{bel58}
+  R. Bellman.
+  On a routing problem.
+  \emph{Quarterly of Applied Mathematics}, 16(1):87--90, 1958.
+
+\bibitem{bec07}
+  M. Beck, E. Pine, W. Tarrant and K. Y. Jensen.
+  New integer representations as the sum of three cubes.
+  \emph{Mathematics of Computation}, 76(259):1683--1690, 2007.
+
+\bibitem{ben00}
+  M. A. Bender and M. Farach-Colton.
+  The LCA problem revisited. In
+  \emph{Latin American Symposium on Theoretical Informatics}, 88--94, 2000.
+
+\bibitem{ben86}
+  J. Bentley.
+  \emph{Programming Pearls}.
+  Addison-Wesley, 1999 (2nd edition).
+
+\bibitem{ben80}
+  J. Bentley and D. Wood.
+  An optimal worst case algorithm for reporting intersections of rectangles.
+  \emph{IEEE Transactions on Computers}, C-29(7):571--577, 1980.
+
+\bibitem{bou01}
+  C. L. Bouton.
+  Nim, a game with a complete mathematical theory.
+  \emph{Annals of Mathematics}, 3(1/4):35--39, 1901.
+
+% \bibitem{bur97}
+%   W. Burnside.
+%   \emph{Theory of Groups of Finite Order},
+%   Cambridge University Press, 1897.
+
+\bibitem{coci}
+  Croatian Open Competition in Informatics, \url{http://hsin.hr/coci/}
+
+\bibitem{cod15}
+  Codeforces: On ``Mo's algorithm'',
+  \url{http://codeforces.com/blog/entry/20032}
+
+\bibitem{cor09}
+  T. H. Cormen, C. E. Leiserson, R. L. Rivest and C. Stein.
+  \emph{Introduction to Algorithms}, MIT Press, 2009 (3rd edition).
+
+\bibitem{dij59}
+  E. W. Dijkstra.
+  A note on two problems in connexion with graphs.
+  \emph{Numerische Mathematik}, 1(1):269--271, 1959.
+
+\bibitem{dik12}
+  K. Diks et al.
+  \emph{Looking for a Challenge? The Ultimate Problem Set from
+  the University of Warsaw Programming Competitions}, University of Warsaw, 2012.
+
+% \bibitem{dil50}
+%   R. P. Dilworth.
+%   A decomposition theorem for partially ordered sets.
+%   \emph{Annals of Mathematics}, 51(1):161--166, 1950.
+
+% \bibitem{dir52}
+%   G. A. Dirac.
+%   Some theorems on abstract graphs.
+%   \emph{Proceedings of the London Mathematical Society}, 3(1):69--81, 1952.
+
+\bibitem{dim15}
+  M. Dima and R. Ceterchi.
+  Efficient range minimum queries using binary indexed trees.
+  \emph{Olympiads in Informatics}, 9(1):39--44, 2015.
+
+\bibitem{edm65}
+  J. Edmonds.
+  Paths, trees, and flowers.
+  \emph{Canadian Journal of Mathematics}, 17(3):449--467, 1965.
+
+\bibitem{edm72}
+  J. Edmonds and R. M. Karp.
+  Theoretical improvements in algorithmic efficiency for network flow problems.
+  \emph{Journal of the ACM}, 19(2):248--264, 1972.
+
+\bibitem{eve75}
+  S. Even, A. Itai and A. Shamir.
+  On the complexity of time table and multi-commodity flow problems.
+  \emph{16th Annual Symposium on Foundations of Computer Science}, 184--193, 1975.
+
+\bibitem{fan94}
+  D. Fanding.
+  A faster algorithm for shortest-path -- SPFA.
+  \emph{Journal of Southwest Jiaotong University}, 2, 1994.
+
+\bibitem{fen94}
+  P. M. Fenwick.
+  A new data structure for cumulative frequency tables.
+  \emph{Software: Practice and Experience}, 24(3):327--336, 1994.
+
+\bibitem{fis06}
+  J. Fischer and V. Heun.
+  Theoretical and practical improvements on the RMQ-problem, with applications to LCA and LCE.
+  In \emph{Annual Symposium on Combinatorial Pattern Matching}, 36--48, 2006.
+
+\bibitem{flo62}
+  R. W. Floyd.
+  Algorithm 97: shortest path.
+  \emph{Communications of the ACM}, 5(6):345, 1962.
+
+\bibitem{for56a}
+  L. R. Ford.
+  Network flow theory.
+  RAND Corporation, Santa Monica, California, 1956.
+
+\bibitem{for56}
+  L. R. Ford and D. R. Fulkerson.
+  Maximal flow through a network.
+  \emph{Canadian Journal of Mathematics}, 8(3):399--404, 1956.
+
+\bibitem{fre77}
+  R. Freivalds.
+  Probabilistic machines can use less running time.
+  In \emph{IFIP Congress}, 839--842, 1977.
+
+\bibitem{gal14}
+  F. Le Gall.
+  Powers of tensors and fast matrix multiplication.
+  In \emph{Proceedings of the 39th International Symposium on Symbolic and Algebraic Computation},
+  296--303, 2014.
+
+\bibitem{gar79}
+  M. R. Garey and D. S. Johnson.
+  \emph{Computers and Intractability:
+  A Guide to the Theory of NP-Completeness},
+  W. H. Freeman and Company, 1979.
+
+\bibitem{goo17}
+  Google Code Jam Statistics (2017),
+  \url{https://www.go-hero.net/jam/17}
+
+\bibitem{gro14}
+  A. Grønlund and S. Pettie.
+  Threesomes, degenerates, and love triangles.
+  In \emph{Proceedings of the 55th Annual Symposium on Foundations of Computer Science},
+  621--630, 2014.
+
+\bibitem{gru39}
+  P. M. Grundy.
+  Mathematics and games.
+  \emph{Eureka}, 2(5):6--8, 1939.
+
+\bibitem{gus97}
+  D. Gusfield.
+  \emph{Algorithms on Strings, Trees and Sequences:
+  Computer Science and Computational Biology},
+  Cambridge University Press, 1997.
+
+% \bibitem{hal35}
+%   P. Hall.
+%   On representatives of subsets.
+%   \emph{Journal London Mathematical Society} 10(1):26--30, 1935.
+
+\bibitem{hal13}
+  S. Halim and F. Halim.
+  \emph{Competitive Programming 3: The New Lower Bound of Programming Contests}, 2013.
+
+\bibitem{hel62}
+  M. Held and R. M. Karp.
+  A dynamic programming approach to sequencing problems.
+  \emph{Journal of the Society for Industrial and Applied Mathematics}, 10(1):196--210, 1962.
+
+\bibitem{hie73}
+  C. Hierholzer and C. Wiener.
+  Über die Möglichkeit, einen Linienzug ohne Wiederholung und ohne Unterbrechung zu umfahren.
+  \emph{Mathematische Annalen}, 6(1):30--32, 1873.
+
+\bibitem{hoa61a}
+  C. A. R. Hoare.
+  Algorithm 64: Quicksort.
+  \emph{Communications of the ACM}, 4(7):321, 1961.
+
+\bibitem{hoa61b}
+  C. A. R. Hoare.
+  Algorithm 65: Find.
+  \emph{Communications of the ACM}, 4(7):321--322, 1961.
+
+\bibitem{hop71}
+  J. E. Hopcroft and J. D. Ullman.
+  A linear list merging algorithm.
+  Technical report, Cornell University, 1971.
+
+\bibitem{hor74}
+  E. Horowitz and S. Sahni.
+  Computing partitions with applications to the knapsack problem.
+  \emph{Journal of the ACM}, 21(2):277--292, 1974.
+
+\bibitem{huf52}
+  D. A. Huffman.
+  A method for the construction of minimum-redundancy codes.
+  \emph{Proceedings of the IRE}, 40(9):1098--1101, 1952.
+
+\bibitem{iois}
+  The International Olympiad in Informatics Syllabus,
+  \url{https://people.ksp.sk/~misof/ioi-syllabus/}
+
+\bibitem{kar87}
+  R. M. Karp and M. O. Rabin.
+  Efficient randomized pattern-matching algorithms.
+  \emph{IBM Journal of Research and Development}, 31(2):249--260, 1987.
+
+\bibitem{kas61}
+  P. W. Kasteleyn.  
+  The statistics of dimers on a lattice: I. The number of dimer arrangements on a quadratic lattice.
+  \emph{Physica}, 27(12):1209--1225, 1961.
+
+\bibitem{ken06}
+  C. Kent, G. M. Landau and M. Ziv-Ukelson.
+  On the complexity of sparse exon assembly.
+  \emph{Journal of Computational Biology}, 13(5):1013--1027, 2006.
+
+
+\bibitem{kle05}
+  J. Kleinberg and É. Tardos.
+  \emph{Algorithm Design}, Pearson, 2005.
+
+\bibitem{knu982}
+  D. E. Knuth.
+  \emph{The Art of Computer Programming. Volume 2: Seminumerical Algorithms}, Addison-Wesley, 1998 (3rd edition).
+
+\bibitem{knu983}
+  D. E. Knuth.
+  \emph{The Art of Computer Programming. Volume 3: Sorting and Searching}, Addison-Wesley, 1998 (2nd edition).
+
+% \bibitem{kon31}
+%   D. Kőnig.
+%   Gráfok és mátrixok.
+%   \emph{Matematikai és Fizikai Lapok}, 38(1):116--119, 1931.
+
+\bibitem{kru56}
+  J. B. Kruskal.
+  On the shortest spanning subtree of a graph and the traveling salesman problem.
+  \emph{Proceedings of the American Mathematical Society}, 7(1):48--50, 1956.
+
+\bibitem{lev66}
+  V. I. Levenshtein.
+  Binary codes capable of correcting deletions, insertions, and reversals.
+  \emph{Soviet Physics Doklady}, 10(8):707--710, 1966.
+
+\bibitem{mai84}
+  M. G. Main and R. J. Lorentz.
+  An $O(n \log n)$ algorithm for finding all repetitions in a string.
+  \emph{Journal of Algorithms}, 5(3):422--432, 1984.
+
+% \bibitem{ore60}
+%   Ø. Ore.
+%   Note on Hamilton circuits.
+%   \emph{The American Mathematical Monthly}, 67(1):55, 1960.
+
+\bibitem{pac13}
+  J. Pachocki and J. Radoszewski.
+  Where to use and how not to use polynomial string hashing.
+  \emph{Olympiads in Informatics}, 7(1):90--100, 2013.
+
+\bibitem{par97}
+  I. Parberry.
+  An efficient algorithm for the Knight's tour problem.
+  \emph{Discrete Applied Mathematics}, 73(3):251--260, 1997.
+
+% \bibitem{pic99}
+%   G. Pick.
+%   Geometrisches zur Zahlenlehre.
+%   \emph{Sitzungsberichte des deutschen naturwissenschaftlich-medicinischen Vereines
+%   für Böhmen "Lotos" in Prag. (Neue Folge)}, 19:311--319, 1899.
+
+\bibitem{pea05}
+  D. Pearson.
+  A polynomial-time algorithm for the change-making problem.
+  \emph{Operations Research Letters}, 33(3):231--234, 2005.
+
+\bibitem{pri57}
+  R. C. Prim.
+  Shortest connection networks and some generalizations.
+  \emph{Bell System Technical Journal}, 36(6):1389--1401, 1957.
+
+% \bibitem{pru18}
+%   H. Prüfer.
+%   Neuer Beweis eines Satzes über Permutationen.
+%   \emph{Arch. Math. Phys}, 27:742--744, 1918.
+
+\bibitem{q27}
+  27-Queens Puzzle: Massively Parallel Enumeration and Solution Counting.
+  \url{https://github.com/preusser/q27}
+
+\bibitem{sha75}
+  M. I. Shamos and D. Hoey.
+  Closest-point problems.
+  In \emph{Proceedings of the 16th Annual Symposium on Foundations of Computer Science}, 151--162, 1975.
+
+\bibitem{sha81}
+  M. Sharir.
+  A strong-connectivity algorithm and its applications in data flow analysis.
+  \emph{Computers \& Mathematics with Applications}, 7(1):67--72, 1981.
+
+\bibitem{ski08}
+  S. S. Skiena.
+  \emph{The Algorithm Design Manual}, Springer, 2008 (2nd edition).
+
+\bibitem{ski03}
+  S. S. Skiena and M. A. Revilla.
+  \emph{Programming Challenges: The Programming Contest Training Manual},
+  Springer, 2003.
+
+\bibitem{main}
+  SZKOpuł, \url{https://szkopul.edu.pl/}
+
+\bibitem{spr35}
+  R. Sprague.
+  Über mathematische Kampfspiele.
+  \emph{Tohoku Mathematical Journal}, 41:438--444, 1935.
+
+\bibitem{sta06}
+  P. Stańczyk.
+  \emph{Algorytmika praktyczna w konkursach Informatycznych},
+  MSc thesis, University of Warsaw, 2006.
+
+\bibitem{str69}
+  V. Strassen.
+  Gaussian elimination is not optimal.
+  \emph{Numerische Mathematik}, 13(4):354--356, 1969.
+
+\bibitem{tar75}
+  R. E. Tarjan.
+  Efficiency of a good but not linear set union algorithm.
+  \emph{Journal of the ACM}, 22(2):215--225, 1975.
+
+\bibitem{tar79}
+  R. E. Tarjan.
+  Applications of path compression on balanced trees.
+  \emph{Journal of the ACM}, 26(4):690--715, 1979.
+
+\bibitem{tar84}
+  R. E. Tarjan and U. Vishkin.
+  Finding biconnected components and computing tree functions in logarithmic parallel time.
+  In \emph{Proceedings of the 25th Annual Symposium on Foundations of Computer Science}, 12--20, 1984.
+
+\bibitem{tem61}
+  H. N. V. Temperley and M. E. Fisher.
+  Dimer problem in statistical mechanics -- an exact result.
+  \emph{Philosophical Magazine}, 6(68):1061--1063, 1961.
+
+\bibitem{usaco}
+  USA Computing Olympiad, \url{http://www.usaco.org/}
+
+\bibitem{war23}
+  H. C. von Warnsdorf.
+  \emph{Des Rösselsprunges einfachste und allgemeinste Lösung}.
+  Schmalkalden, 1823.
+
+\bibitem{war62}
+  S. Warshall.
+  A theorem on boolean matrices.
+  \emph{Journal of the ACM}, 9(1):11--12, 1962.
+
+% \bibitem{zec72}
+%   E. Zeckendorf.
+%   Représentation des nombres naturels par une somme de nombres de Fibonacci ou de nombres de Lucas.
+%   \emph{Bull. Soc. Roy. Sci. Liege}, 41:179--182, 1972.
+
+\end{thebibliography}
--- a/preface.tex
+++ b/preface.tex
@ -0,0 +1,33 @@
+\chapter*{Preface}
+\markboth{\MakeUppercase{Preface}}{}
+\addcontentsline{toc}{chapter}{Preface}
+
+The purpose of this book is to give you
+a thorough introduction to competitive programming.
+It is assumed that you already
+know the basics of programming, but no previous
+background in competitive programming is needed.
+
+The book is especially intended for
+students who want to learn algorithms and
+possibly participate in
+the International Olympiad in Informatics (IOI) or
+in the International Collegiate Programming Contest (ICPC).
+Of course, the book is also suitable for 
+anybody else interested in competitive programming.
+
+It takes a long time to become a good competitive
+programmer, but it is also an opportunity to learn a lot.
+You can be sure that you will get
+a good general understanding of algorithms
+if you spend time reading the book,
+solving problems and taking part in contests.
+
+The book is under continuous development.
+You can always send feedback on the book to
+\texttt{ahslaaks@cs.helsinki.fi}.
+
+\begin{flushright}
+Helsinki, August 2019 \\
+Antti Laaksonen
+\end{flushright}