diff --git a/README.md b/README.md index d87cb7f..53d7c87 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,22 @@ -# cphb +# Competitive Programmer's Handbook -SOI adjusted Competitive Programmer's Handbook -(see https://github.com/pllk/cphb for the original) \ No newline at end of file +Competitive Programmer's Handbook is a modern introduction to competitive programming. +The book discusses programming tricks and algorithm design techniques relevant in competitive programming. + +## CSES Problem Set + +The CSES Problem Set contains a collection of competitive programming problems. +You can practice the techniques presented in the book by solving the problems. + +https://cses.fi/problemset/ + +## License + +The license of the book is Creative Commons BY-NC-SA. + +## Other books + +Guide to Competitive Programming is a printed book, published by Springer, based on Competitive Programmer's Handbook. +There is also a Russian edition Олимпиадное программирование (Olympiad Programming) and a Korean edition 알고리즘 트레이닝: 프로그래밍 대회 입문 가이드. 
+ +https://cses.fi/book/ diff --git a/book.pdf b/book.pdf new file mode 100644 index 0000000..7ddd8e9 Binary files /dev/null and b/book.pdf differ diff --git a/book.tex b/book.tex new file mode 100644 index 0000000..42c2193 --- /dev/null +++ b/book.tex @@ -0,0 +1,131 @@ +\documentclass[twoside,12pt,a4paper,english]{book} + +%\includeonly{chapter04,list} + +\usepackage[english]{babel} +\usepackage[utf8]{inputenc} +\usepackage{listings} +\usepackage[table]{xcolor} +\usepackage{tikz} +\usepackage{multicol} +\usepackage{hyperref} +\usepackage{array} +\usepackage{microtype} + +\usepackage{fouriernc} +\usepackage[T1]{fontenc} + +\usepackage{graphicx} +\usepackage{framed} +\usepackage{amssymb} +\usepackage{amsmath} + +\usepackage{pifont} +\usepackage{ifthen} +\usepackage{makeidx} +\usepackage{enumitem} + +\usepackage{titlesec} + +\usepackage{skak} +\usepackage[scaled=0.95]{inconsolata} + + +\usetikzlibrary{patterns,snakes} +\pagestyle{plain} + +\definecolor{keywords}{HTML}{44548A} +\definecolor{strings}{HTML}{00999A} +\definecolor{comments}{HTML}{990000} + +\lstset{language=C++,frame=single,basicstyle=\ttfamily \small,showstringspaces=false,columns=flexible} +\lstset{ + literate={ö}{{\"o}}1 + {ä}{{\"a}}1 + {ü}{{\"u}}1 +} +\lstset{xleftmargin=20pt,xrightmargin=5pt} +\lstset{aboveskip=12pt,belowskip=8pt} + +\lstset{ + commentstyle=\color{comments}, + keywordstyle=\color{keywords}, + stringstyle=\color{strings} +} + +\date{Draft \today} + +\usepackage[a4paper,vmargin=30mm,hmargin=33mm,footskip=15mm]{geometry} + +\title{\Huge Competitive Programmer's Handbook} +\author{\Large Antti Laaksonen} + +\makeindex +\usepackage[totoc]{idxlayout} + +\titleformat{\subsubsection} +{\normalfont\large\bfseries\sffamily}{\thesubsection}{1em}{} + +\begin{document} + +%\selectlanguage{finnish} + +%\setcounter{page}{1} +%\pagenumbering{roman} + +\frontmatter +\maketitle +\setcounter{tocdepth}{1} +\tableofcontents + +\include{preface} + +\mainmatter +\pagenumbering{arabic} +\setcounter{page}{1} 
+ +\newcommand{\key}[1] {\textbf{#1}} + +\part{Basic techniques} +\include{chapter01} +\include{chapter02} +\include{chapter03} +\include{chapter04} +\include{chapter05} +\include{chapter06} +\include{chapter07} +\include{chapter08} +\include{chapter09} +\include{chapter10} +\part{Graph algorithms} +\include{chapter11} +\include{chapter12} +\include{chapter13} +\include{chapter14} +\include{chapter15} +\include{chapter16} +\include{chapter17} +\include{chapter18} +\include{chapter19} +\include{chapter20} +\part{Advanced topics} +\include{chapter21} +\include{chapter22} +\include{chapter23} +\include{chapter24} +\include{chapter25} +\include{chapter26} +\include{chapter27} +\include{chapter28} +\include{chapter29} +\include{chapter30} + +\cleardoublepage +\phantomsection +\addcontentsline{toc}{chapter}{Bibliography} +\include{list} + +\cleardoublepage +\printindex + +\end{document} \ No newline at end of file diff --git a/chapter01.tex b/chapter01.tex new file mode 100644 index 0000000..60ae11a --- /dev/null +++ b/chapter01.tex @@ -0,0 +1,990 @@ +\chapter{Introduction} + +Competitive programming combines two topics: +(1) the design of algorithms and (2) the implementation of algorithms. + +The \key{design of algorithms} consists of problem solving +and mathematical thinking. +Skills for analyzing problems and solving them +creatively are needed. +An algorithm for solving a problem +has to be both correct and efficient, +and the core of the problem is often +about inventing an efficient algorithm. + +Theoretical knowledge of algorithms +is important to competitive programmers. +Typically, a solution to a problem is +a combination of well-known techniques and +new insights. +The techniques that appear in competitive programming +also form the basis for the scientific research +of algorithms. + +The \key{implementation of algorithms} requires good +programming skills. 
+In competitive programming, the solutions +are graded by testing an implemented algorithm +using a set of test cases. +Thus, it is not enough that the idea of the +algorithm is correct, but the implementation also +has to be correct. + +A good coding style in contests is +straightforward and concise. +Programs should be written quickly, +because there is not much time available. +Unlike in traditional software engineering, +the programs are short (usually at most a few +hundred lines of code), and they do not need to +be maintained after the contest. + +\section{Programming languages} + +\index{programming language} + +At the moment, the most popular programming +languages used in contests are C++, Python and Java. +For example, in Google Code Jam 2017, +among the best 3,000 participants, +79 \% used C++, +16 \% used Python and +8 \% used Java \cite{goo17}. +Some participants also used several languages. + +Many people think that C++ is the best choice +for a competitive programmer, +and C++ is nearly always available in +contest systems. +The benefits of using C++ are that +it is a very efficient language and +its standard library contains a +large collection +of data structures and algorithms. + +On the other hand, it is good to +master several languages and understand +their strengths. +For example, if large integers are needed +in the problem, +Python can be a good choice, because it +contains built-in operations for +calculating with large integers. +Still, most problems in programming contests +are set so that +using a specific programming language +is not an unfair advantage. + +All example programs in this book are written in C++, +and the standard library's +data structures and algorithms are often used. +The programs follow the C++11 standard, +which can be used in most contests nowadays. +If you cannot program in C++ yet, +now is a good time to start learning. 
+
+\subsubsection{C++ code template}
+
+A typical C++ code template for competitive programming
+looks like this:
+
+\begin{lstlisting}
+#include <bits/stdc++.h>
+
+using namespace std;
+
+int main() {
+    // solution comes here
+}
+\end{lstlisting}
+
+The \texttt{\#include} line at the beginning
+of the code is a feature of the \texttt{g++} compiler
+that allows us to include the entire standard library.
+Thus, there is no need to separately include
+libraries such as \texttt{iostream},
+\texttt{vector} and \texttt{algorithm},
+but rather they are available automatically.
+
+The \texttt{using} line declares
+that the classes and functions
+of the standard library can be used directly
+in the code.
+Without the \texttt{using} line we would have
+to write, for example, \texttt{std::cout},
+but now it suffices to write \texttt{cout}.
+
+The code can be compiled using the following command:
+
+\begin{lstlisting}
+g++ -std=c++11 -O2 -Wall test.cpp -o test
+\end{lstlisting}
+
+This command produces a binary file \texttt{test}
+from the source code \texttt{test.cpp}.
+The compiler follows the C++11 standard
+(\texttt{-std=c++11}),
+optimizes the code (\texttt{-O2})
+and shows warnings about possible errors (\texttt{-Wall}).
+
+\section{Input and output}
+
+\index{input and output}
+
+In most contests, standard streams are used for
+reading input and writing output.
+In C++, the standard streams are
+\texttt{cin} for input and \texttt{cout} for output.
+In addition, the C functions
+\texttt{scanf} and \texttt{printf} can be used.
+
+The input for the program usually consists of
+numbers and strings that are separated with
+spaces and newlines.
+They can be read from the \texttt{cin} stream
+as follows:
+
+\begin{lstlisting}
+int a, b;
+string x;
+cin >> a >> b >> x;
+\end{lstlisting}
+
+This kind of code always works,
+assuming that there is at least one space
+or newline between each element in the input.
+For example, the above code can read +both of the following inputs: +\begin{lstlisting} +123 456 monkey +\end{lstlisting} +\begin{lstlisting} +123 456 +monkey +\end{lstlisting} +The \texttt{cout} stream is used for output +as follows: +\begin{lstlisting} +int a = 123, b = 456; +string x = "monkey"; +cout << a << " " << b << " " << x << "\n"; +\end{lstlisting} + +Input and output is sometimes +a bottleneck in the program. +The following lines at the beginning of the code +make input and output more efficient: + +\begin{lstlisting} +ios::sync_with_stdio(0); +cin.tie(0); +\end{lstlisting} + +Note that the newline \texttt{"\textbackslash n"} +works faster than \texttt{endl}, +because \texttt{endl} always causes +a flush operation. + +The C functions \texttt{scanf} +and \texttt{printf} are an alternative +to the C++ standard streams. +They are usually a bit faster, +but they are also more difficult to use. +The following code reads two integers from the input: +\begin{lstlisting} +int a, b; +scanf("%d %d", &a, &b); +\end{lstlisting} +The following code prints two integers: +\begin{lstlisting} +int a = 123, b = 456; +printf("%d %d\n", a, b); +\end{lstlisting} + +Sometimes the program should read a whole line +from the input, possibly containing spaces. +This can be accomplished by using the +\texttt{getline} function: + +\begin{lstlisting} +string s; +getline(cin, s); +\end{lstlisting} + +If the amount of data is unknown, the following +loop is useful: +\begin{lstlisting} +while (cin >> x) { + // code +} +\end{lstlisting} +This loop reads elements from the input +one after another, until there is no +more data available in the input. + +In some contest systems, files are used for +input and output. 
+An easy solution for this is to write +the code as usual using standard streams, +but add the following lines to the beginning of the code: +\begin{lstlisting} +freopen("input.txt", "r", stdin); +freopen("output.txt", "w", stdout); +\end{lstlisting} +After this, the program reads the input from the file +''input.txt'' and writes the output to the file +''output.txt''. + +\section{Working with numbers} + +\index{integer} + +\subsubsection{Integers} + +The most used integer type in competitive programming +is \texttt{int}, which is a 32-bit type with +a value range of $-2^{31} \ldots 2^{31}-1$ +or about $-2 \cdot 10^9 \ldots 2 \cdot 10^9$. +If the type \texttt{int} is not enough, +the 64-bit type \texttt{long long} can be used. +It has a value range of $-2^{63} \ldots 2^{63}-1$ +or about $-9 \cdot 10^{18} \ldots 9 \cdot 10^{18}$. + +The following code defines a +\texttt{long long} variable: +\begin{lstlisting} +long long x = 123456789123456789LL; +\end{lstlisting} +The suffix \texttt{LL} means that the +type of the number is \texttt{long long}. + +A common mistake when using the type \texttt{long long} +is that the type \texttt{int} is still used somewhere +in the code. +For example, the following code contains +a subtle error: + +\begin{lstlisting} +int a = 123456789; +long long b = a*a; +cout << b << "\n"; // -1757895751 +\end{lstlisting} + +Even though the variable \texttt{b} is of type \texttt{long long}, +both numbers in the expression \texttt{a*a} +are of type \texttt{int} and the result is +also of type \texttt{int}. +Because of this, the variable \texttt{b} will +contain a wrong result. +The problem can be solved by changing the type +of \texttt{a} to \texttt{long long} or +by changing the expression to \texttt{(long long)a*a}. + +Usually contest problems are set so that the +type \texttt{long long} is enough. 
+Still, it is good to know that +the \texttt{g++} compiler also provides +a 128-bit type \texttt{\_\_int128\_t} +with a value range of +$-2^{127} \ldots 2^{127}-1$ or about $-10^{38} \ldots 10^{38}$. +However, this type is not available in all contest systems. + +\subsubsection{Modular arithmetic} + +\index{remainder} +\index{modular arithmetic} + +We denote by $x \bmod m$ the remainder +when $x$ is divided by $m$. +For example, $17 \bmod 5 = 2$, +because $17 = 3 \cdot 5 + 2$. + +Sometimes, the answer to a problem is a +very large number but it is enough to +output it ''modulo $m$'', i.e., +the remainder when the answer is divided by $m$ +(for example, ''modulo $10^9+7$''). +The idea is that even if the actual answer +is very large, +it suffices to use the types +\texttt{int} and \texttt{long long}. + +An important property of the remainder is that +in addition, subtraction and multiplication, +the remainder can be taken before the operation: + +\[ +\begin{array}{rcr} +(a+b) \bmod m & = & (a \bmod m + b \bmod m) \bmod m \\ +(a-b) \bmod m & = & (a \bmod m - b \bmod m) \bmod m \\ +(a \cdot b) \bmod m & = & (a \bmod m \cdot b \bmod m) \bmod m +\end{array} +\] + +Thus, we can take the remainder after every operation +and the numbers will never become too large. + +For example, the following code calculates $n!$, +the factorial of $n$, modulo $m$: +\begin{lstlisting} +long long x = 1; +for (int i = 2; i <= n; i++) { + x = (x*i)%m; +} +cout << x%m << "\n"; +\end{lstlisting} + +Usually we want the remainder to always +be between $0\ldots m-1$. +However, in C++ and other languages, +the remainder of a negative number +is either zero or negative. +An easy way to make sure there +are no negative remainders is to first calculate +the remainder as usual and then add $m$ +if the result is negative: +\begin{lstlisting} +x = x%m; +if (x < 0) x += m; +\end{lstlisting} +However, this is only needed when there +are subtractions in the code and the +remainder may become negative. 
+ +\subsubsection{Floating point numbers} + +\index{floating point number} + +The usual floating point types in +competitive programming are +the 64-bit \texttt{double} +and, as an extension in the \texttt{g++} compiler, +the 80-bit \texttt{long double}. +In most cases, \texttt{double} is enough, +but \texttt{long double} is more accurate. + +The required precision of the answer +is usually given in the problem statement. +An easy way to output the answer is to use +the \texttt{printf} function +and give the number of decimal places +in the formatting string. +For example, the following code prints +the value of $x$ with 9 decimal places: + +\begin{lstlisting} +printf("%.9f\n", x); +\end{lstlisting} + +A difficulty when using floating point numbers +is that some numbers cannot be represented +accurately as floating point numbers, +and there will be rounding errors. +For example, the result of the following code +is surprising: + +\begin{lstlisting} +double x = 0.3*3+0.1; +printf("%.20f\n", x); // 0.99999999999999988898 +\end{lstlisting} + +Due to a rounding error, +the value of \texttt{x} is a bit smaller than 1, +while the correct value would be 1. + +It is risky to compare floating point numbers +with the \texttt{==} operator, +because it is possible that the values should be +equal but they are not because of precision errors. +A better way to compare floating point numbers +is to assume that two numbers are equal +if the difference between them is less than $\varepsilon$, +where $\varepsilon$ is a small number. + +In practice, the numbers can be compared +as follows ($\varepsilon=10^{-9}$): + +\begin{lstlisting} +if (abs(a-b) < 1e-9) { + // a and b are equal +} +\end{lstlisting} + +Note that while floating point numbers are inaccurate, +integers up to a certain limit can still be +represented accurately. +For example, using \texttt{double}, +it is possible to accurately represent all +integers whose absolute value is at most $2^{53}$. 
+
+\section{Shortening code}
+
+Short code is ideal in competitive programming,
+because programs should be written
+as fast as possible.
+Because of this, competitive programmers often define
+shorter names for datatypes and other parts of code.
+
+\subsubsection{Type names}
+\index{typedef@\texttt{typedef}}
+Using the command \texttt{typedef}
+it is possible to give a shorter name
+to a datatype.
+For example, the name \texttt{long long} is long,
+so we can define a shorter name \texttt{ll}:
+\begin{lstlisting}
+typedef long long ll;
+\end{lstlisting}
+After this, the code
+\begin{lstlisting}
+long long a = 123456789;
+long long b = 987654321;
+cout << a*b << "\n";
+\end{lstlisting}
+can be shortened as follows:
+\begin{lstlisting}
+ll a = 123456789;
+ll b = 987654321;
+cout << a*b << "\n";
+\end{lstlisting}
+
+The command \texttt{typedef}
+can also be used with more complex types.
+For example, the following code gives
+the name \texttt{vi} for a vector of integers
+and the name \texttt{pi} for a pair
+that contains two integers.
+\begin{lstlisting}
+typedef vector<int> vi;
+typedef pair<int,int> pi;
+\end{lstlisting}
+
+\subsubsection{Macros}
+\index{macro}
+Another way to shorten code is to define
+\key{macros}.
+A macro means that certain strings in
+the code will be changed before the compilation.
+In C++, macros are defined using the
+\texttt{\#define} keyword.
+
+For example, we can define the following macros:
+\begin{lstlisting}
+#define F first
+#define S second
+#define PB push_back
+#define MP make_pair
+\end{lstlisting}
+After this, the code
+\begin{lstlisting}
+v.push_back(make_pair(y1,x1));
+v.push_back(make_pair(y2,x2));
+int d = v[i].first+v[i].second;
+\end{lstlisting}
+can be shortened as follows:
+\begin{lstlisting}
+v.PB(MP(y1,x1));
+v.PB(MP(y2,x2));
+int d = v[i].F+v[i].S;
+\end{lstlisting}
+
+A macro can also have parameters,
+which makes it possible to shorten loops and other
+structures.
+For example, we can define the following macro: +\begin{lstlisting} +#define REP(i,a,b) for (int i = a; i <= b; i++) +\end{lstlisting} +After this, the code +\begin{lstlisting} +for (int i = 1; i <= n; i++) { + search(i); +} +\end{lstlisting} +can be shortened as follows: +\begin{lstlisting} +REP(i,1,n) { + search(i); +} +\end{lstlisting} + +Sometimes macros cause bugs that may be difficult +to detect. For example, consider the following macro +that calculates the square of a number: +\begin{lstlisting} +#define SQ(a) a*a +\end{lstlisting} +This macro \emph{does not} always work as expected. +For example, the code +\begin{lstlisting} +cout << SQ(3+3) << "\n"; +\end{lstlisting} +corresponds to the code +\begin{lstlisting} +cout << 3+3*3+3 << "\n"; // 15 +\end{lstlisting} + +A better version of the macro is as follows: +\begin{lstlisting} +#define SQ(a) (a)*(a) +\end{lstlisting} +Now the code +\begin{lstlisting} +cout << SQ(3+3) << "\n"; +\end{lstlisting} +corresponds to the code +\begin{lstlisting} +cout << (3+3)*(3+3) << "\n"; // 36 +\end{lstlisting} + + +\section{Mathematics} + +Mathematics plays an important role in competitive +programming, and it is not possible to become +a successful competitive programmer without +having good mathematical skills. +This section discusses some important +mathematical concepts and formulas that +are needed later in the book. + +\subsubsection{Sum formulas} + +Each sum of the form +\[\sum_{x=1}^n x^k = 1^k+2^k+3^k+\ldots+n^k,\] +where $k$ is a positive integer, +has a closed-form formula that is a +polynomial of degree $k+1$. 
+For example\footnote{\index{Faulhaber's formula} +There is even a general formula for such sums, called \key{Faulhaber's formula}, +but it is too complex to be presented here.}, +\[\sum_{x=1}^n x = 1+2+3+\ldots+n = \frac{n(n+1)}{2}\] +and +\[\sum_{x=1}^n x^2 = 1^2+2^2+3^2+\ldots+n^2 = \frac{n(n+1)(2n+1)}{6}.\] + +An \key{arithmetic progression} is a \index{arithmetic progression} +sequence of numbers +where the difference between any two consecutive +numbers is constant. +For example, +\[3, 7, 11, 15\] +is an arithmetic progression with constant 4. +The sum of an arithmetic progression can be calculated +using the formula +\[\underbrace{a + \cdots + b}_{n \,\, \textrm{numbers}} = \frac{n(a+b)}{2}\] +where $a$ is the first number, +$b$ is the last number and +$n$ is the amount of numbers. +For example, +\[3+7+11+15=\frac{4 \cdot (3+15)}{2} = 36.\] +The formula is based on the fact +that the sum consists of $n$ numbers and +the value of each number is $(a+b)/2$ on average. + +\index{geometric progression} +A \key{geometric progression} is a sequence +of numbers +where the ratio between any two consecutive +numbers is constant. +For example, +\[3,6,12,24\] +is a geometric progression with constant 2. +The sum of a geometric progression can be calculated +using the formula +\[a + ak + ak^2 + \cdots + b = \frac{bk-a}{k-1}\] +where $a$ is the first number, +$b$ is the last number and the +ratio between consecutive numbers is $k$. +For example, +\[3+6+12+24=\frac{24 \cdot 2 - 3}{2-1} = 45.\] + +This formula can be derived as follows. Let +\[ S = a + ak + ak^2 + \cdots + b .\] +By multiplying both sides by $k$, we get +\[ kS = ak + ak^2 + ak^3 + \cdots + bk,\] +and solving the equation +\[ kS-S = bk-a\] +yields the formula. 
+ +A special case of a sum of a geometric progression is the formula +\[1+2+4+8+\ldots+2^{n-1}=2^n-1.\] + +\index{harmonic sum} + +A \key{harmonic sum} is a sum of the form +\[ \sum_{x=1}^n \frac{1}{x} = 1+\frac{1}{2}+\frac{1}{3}+\ldots+\frac{1}{n}.\] + +An upper bound for a harmonic sum is $\log_2(n)+1$. +Namely, we can +modify each term $1/k$ so that $k$ becomes +the nearest power of two that does not exceed $k$. +For example, when $n=6$, we can estimate +the sum as follows: +\[ 1+\frac{1}{2}+\frac{1}{3}+\frac{1}{4}+\frac{1}{5}+\frac{1}{6} \le +1+\frac{1}{2}+\frac{1}{2}+\frac{1}{4}+\frac{1}{4}+\frac{1}{4}.\] +This upper bound consists of $\log_2(n)+1$ parts +($1$, $2 \cdot 1/2$, $4 \cdot 1/4$, etc.), +and the value of each part is at most 1. + +\subsubsection{Set theory} + +\index{set theory} +\index{set} +\index{intersection} +\index{union} +\index{difference} +\index{subset} +\index{universal set} +\index{complement} + +A \key{set} is a collection of elements. +For example, the set +\[X=\{2,4,7\}\] +contains elements 2, 4 and 7. +The symbol $\emptyset$ denotes an empty set, +and $|S|$ denotes the size of a set $S$, +i.e., the number of elements in the set. +For example, in the above set, $|X|=3$. + +If a set $S$ contains an element $x$, +we write $x \in S$, +and otherwise we write $x \notin S$. +For example, in the above set +\[4 \in X \hspace{10px}\textrm{and}\hspace{10px} 5 \notin X.\] + +\begin{samepage} +New sets can be constructed using set operations: +\begin{itemize} +\item The \key{intersection} $A \cap B$ consists of elements +that are in both $A$ and $B$. +For example, if $A=\{1,2,5\}$ and $B=\{2,4\}$, +then $A \cap B = \{2\}$. +\item The \key{union} $A \cup B$ consists of elements +that are in $A$ or $B$ or both. +For example, if $A=\{3,7\}$ and $B=\{2,3,8\}$, +then $A \cup B = \{2,3,7,8\}$. +\item The \key{complement} $\bar A$ consists of elements +that are not in $A$. 
+The interpretation of a complement depends on
+the \key{universal set}, which contains all possible elements.
+For example, if $A=\{1,2,5,7\}$ and the universal set is
+$\{1,2,\ldots,10\}$, then $\bar A = \{3,4,6,8,9,10\}$.
+\item The \key{difference} $A \setminus B = A \cap \bar B$
+consists of elements that are in $A$ but not in $B$.
+Note that $B$ can contain elements that are not in $A$.
+For example, if $A=\{2,3,7,8\}$ and $B=\{3,5,8\}$,
+then $A \setminus B = \{2,7\}$.
+\end{itemize}
+\end{samepage}
+
+If each element of $A$ also belongs to $S$,
+we say that $A$ is a \key{subset} of $S$,
+denoted by $A \subset S$.
+A set $S$ always has $2^{|S|}$ subsets,
+including the empty set.
+For example, the subsets of the set $\{2,4,7\}$ are
+\begin{center}
+$\emptyset$,
+$\{2\}$, $\{4\}$, $\{7\}$, $\{2,4\}$, $\{2,7\}$, $\{4,7\}$ and $\{2,4,7\}$.
+\end{center}
+
+Some often used sets are
+$\mathbb{N}$ (natural numbers),
+$\mathbb{Z}$ (integers),
+$\mathbb{Q}$ (rational numbers) and
+$\mathbb{R}$ (real numbers).
+The set $\mathbb{N}$
+can be defined in two ways, depending
+on the situation:
+either $\mathbb{N}=\{0,1,2,\ldots\}$
+or $\mathbb{N}=\{1,2,3,\ldots\}$.
+
+We can also construct a set using a rule of the form
+\[\{f(n) : n \in S\},\]
+where $f(n)$ is some function.
+This set contains all elements of the form $f(n)$,
+where $n$ is an element in $S$.
+For example, the set
+\[X=\{2n : n \in \mathbb{Z}\}\]
+contains all even integers.
+
+\subsubsection{Logic}
+
+\index{logic}
+\index{negation}
+\index{conjunction}
+\index{disjunction}
+\index{implication}
+\index{equivalence}
+
+The value of a logical expression is either
+\key{true} (1) or \key{false} (0).
+The most important logical operators are
+$\lnot$ (\key{negation}),
+$\land$ (\key{conjunction}),
+$\lor$ (\key{disjunction}),
+$\Rightarrow$ (\key{implication}) and
+$\Leftrightarrow$ (\key{equivalence}).
+The following table shows the meanings of these operators: + +\begin{center} +\begin{tabular}{rr|rrrrrrr} +$A$ & $B$ & $\lnot A$ & $\lnot B$ & $A \land B$ & $A \lor B$ & $A \Rightarrow B$ & $A \Leftrightarrow B$ \\ +\hline +0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ +0 & 1 & 1 & 0 & 0 & 1 & 1 & 0 \\ +1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\ +1 & 1 & 0 & 0 & 1 & 1 & 1 & 1 \\ +\end{tabular} +\end{center} + +The expression $\lnot A$ has the opposite value of $A$. +The expression $A \land B$ is true if both $A$ and $B$ +are true, +and the expression $A \lor B$ is true if $A$ or $B$ or both +are true. +The expression $A \Rightarrow B$ is true +if whenever $A$ is true, also $B$ is true. +The expression $A \Leftrightarrow B$ is true +if $A$ and $B$ are both true or both false. + +\index{predicate} + +A \key{predicate} is an expression that is true or false +depending on its parameters. +Predicates are usually denoted by capital letters. +For example, we can define a predicate $P(x)$ +that is true exactly when $x$ is a prime number. +Using this definition, $P(7)$ is true but $P(8)$ is false. + +\index{quantifier} + +A \key{quantifier} connects a logical expression +to the elements of a set. +The most important quantifiers are +$\forall$ (\key{for all}) and $\exists$ (\key{there is}). +For example, +\[\forall x (\exists y (y < x))\] +means that for each element $x$ in the set, +there is an element $y$ in the set +such that $y$ is smaller than $x$. +This is true in the set of integers, +but false in the set of natural numbers. + +Using the notation described above, +we can express many kinds of logical propositions. +For example, +\[\forall x ((x>1 \land \lnot P(x)) \Rightarrow (\exists a (\exists b (a > 1 \land b > 1 \land x = ab))))\] +means that if a number $x$ is larger than 1 +and not a prime number, +then there are numbers $a$ and $b$ +that are larger than $1$ and whose product is $x$. +This proposition is true in the set of integers. 
+ +\subsubsection{Functions} + +The function $\lfloor x \rfloor$ rounds the number $x$ +down to an integer, and the function +$\lceil x \rceil$ rounds the number $x$ +up to an integer. For example, +\[ \lfloor 3/2 \rfloor = 1 \hspace{10px} \textrm{and} \hspace{10px} \lceil 3/2 \rceil = 2.\] + +The functions $\min(x_1,x_2,\ldots,x_n)$ +and $\max(x_1,x_2,\ldots,x_n)$ +give the smallest and largest of values +$x_1,x_2,\ldots,x_n$. +For example, +\[ \min(1,2,3)=1 \hspace{10px} \textrm{and} \hspace{10px} \max(1,2,3)=3.\] + +\index{factorial} + +The \key{factorial} $n!$ can be defined +\[\prod_{x=1}^n x = 1 \cdot 2 \cdot 3 \cdot \ldots \cdot n\] +or recursively +\[ +\begin{array}{lcl} +0! & = & 1 \\ +n! & = & n \cdot (n-1)! \\ +\end{array} +\] + +\index{Fibonacci number} + +The \key{Fibonacci numbers} +%\footnote{Fibonacci (c. 1175--1250) was an Italian mathematician.} +arise in many situations. +They can be defined recursively as follows: +\[ +\begin{array}{lcl} +f(0) & = & 0 \\ +f(1) & = & 1 \\ +f(n) & = & f(n-1)+f(n-2) \\ +\end{array} +\] +The first Fibonacci numbers are +\[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, \ldots\] +There is also a closed-form formula +for calculating Fibonacci numbers, which is sometimes called +\index{Binet's formula} \key{Binet's formula}: +\[f(n)=\frac{(1 + \sqrt{5})^n - (1-\sqrt{5})^n}{2^n \sqrt{5}}.\] + +\subsubsection{Logarithms} + +\index{logarithm} + +The \key{logarithm} of a number $x$ +is denoted $\log_k(x)$, where $k$ is the base +of the logarithm. +According to the definition, +$\log_k(x)=a$ exactly when $k^a=x$. + +A useful property of logarithms is +that $\log_k(x)$ equals the number of times +we have to divide $x$ by $k$ before we reach +the number 1. +For example, $\log_2(32)=5$ +because 5 divisions by 2 are needed: + +\[32 \rightarrow 16 \rightarrow 8 \rightarrow 4 \rightarrow 2 \rightarrow 1 \] + +Logarithms are often used in the analysis of +algorithms, because many efficient algorithms +halve something at each step. 
+Hence, we can estimate the efficiency of such algorithms +using logarithms. + +The logarithm of a product is +\[\log_k(ab) = \log_k(a)+\log_k(b),\] +and consequently, +\[\log_k(x^n) = n \cdot \log_k(x).\] +In addition, the logarithm of a quotient is +\[\log_k\Big(\frac{a}{b}\Big) = \log_k(a)-\log_k(b).\] +Another useful formula is +\[\log_u(x) = \frac{\log_k(x)}{\log_k(u)},\] +and using this, it is possible to calculate +logarithms to any base if there is a way to +calculate logarithms to some fixed base. + +\index{natural logarithm} + +The \key{natural logarithm} $\ln(x)$ of a number $x$ +is a logarithm whose base is $e \approx 2.71828$. +Another property of logarithms is that +the number of digits of an integer $x$ in base $b$ is +$\lfloor \log_b(x)+1 \rfloor$. +For example, the representation of +$123$ in base $2$ is 1111011 and +$\lfloor \log_2(123)+1 \rfloor = 7$. + +\section{Contests and resources} + +\subsubsection{IOI} + +The International Olympiad in Informatics (IOI) +is an annual programming contest for +secondary school students. +Each country is allowed to send a team of +four students to the contest. +There are usually about 300 participants +from 80 countries. + +The IOI consists of two five-hour long contests. +In both contests, the participants are asked to +solve three algorithm tasks of various difficulty. +The tasks are divided into subtasks, +each of which has an assigned score. +Even if the contestants are divided into teams, +they compete as individuals. + +The IOI syllabus \cite{iois} regulates the topics +that may appear in IOI tasks. +Almost all the topics in the IOI syllabus +are covered by this book. + +Participants for the IOI are selected through +national contests. +Before the IOI, many regional contests are organized, +such as the Baltic Olympiad in Informatics (BOI), +the Central European Olympiad in Informatics (CEOI) +and the Asia-Pacific Informatics Olympiad (APIO). 
+ +Some countries organize online practice contests +for future IOI participants, +such as the Croatian Open Competition in Informatics \cite{coci} +and the USA Computing Olympiad \cite{usaco}. +In addition, a large collection of problems from Polish contests +is available online \cite{main}. + +\subsubsection{ICPC} + +The International Collegiate Programming Contest (ICPC) +is an annual programming contest for university students. +Each team in the contest consists of three students, +and unlike in the IOI, the students work together; +there is only one computer available for each team. + +The ICPC consists of several stages, and finally the +best teams are invited to the World Finals. +While there are tens of thousands of participants +in the contest, there are only a small number\footnote{The exact number of final +slots varies from year to year; in 2017, there were 133 final slots.} of final slots available, +so even advancing to the finals +is a great achievement in some regions. + +In each ICPC contest, the teams have five hours of time to +solve about ten algorithm problems. +A solution to a problem is accepted only if it solves +all test cases efficiently. +During the contest, competitors may view the results of other teams, +but for the last hour the scoreboard is frozen and it +is not possible to see the results of the last submissions. + +The topics that may appear at the ICPC are not so well +specified as those at the IOI. +In any case, it is clear that more knowledge is needed +at the ICPC, especially more mathematical skills. + +\subsubsection{Online contests} + +There are also many online contests that are open for everybody. +At the moment, the most active contest site is Codeforces, +which organizes contests about weekly. +In Codeforces, participants are divided into two divisions: +beginners compete in Div2 and more experienced programmers in Div1. +Other contest sites include AtCoder, CS Academy, HackerRank and Topcoder. 
+ +Some companies organize online contests with onsite finals. +Examples of such contests are Facebook Hacker Cup, +Google Code Jam and Yandex.Algorithm. +Of course, companies also use those contests for recruiting: +performing well in a contest is a good way to prove one's skills. + +\subsubsection{Books} + +There are already some books (besides this book) that +focus on competitive programming and algorithmic problem solving: + +\begin{itemize} +\item S. S. Skiena and M. A. Revilla: +\emph{Programming Challenges: The Programming Contest Training Manual} \cite{ski03} +\item S. Halim and F. Halim: +\emph{Competitive Programming 3: The New Lower Bound of Programming Contests} \cite{hal13} +\item K. Diks et al.: \emph{Looking for a Challenge? The Ultimate Problem Set from +the University of Warsaw Programming Competitions} \cite{dik12} +\end{itemize} + +The first two books are intended for beginners, +whereas the last book contains advanced material. + +Of course, general algorithm books are also suitable for +competitive programmers. +Some popular books are: + +\begin{itemize} +\item T. H. Cormen, C. E. Leiserson, R. L. Rivest and C. Stein: +\emph{Introduction to Algorithms} \cite{cor09} +\item J. Kleinberg and É. Tardos: +\emph{Algorithm Design} \cite{kle05} +\item S. S. Skiena: +\emph{The Algorithm Design Manual} \cite{ski08} +\end{itemize} diff --git a/chapter02.tex b/chapter02.tex new file mode 100644 index 0000000..d062d67 --- /dev/null +++ b/chapter02.tex @@ -0,0 +1,538 @@ +\chapter{Time complexity} + +\index{time complexity} + +The efficiency of algorithms is important in competitive programming. +Usually, it is easy to design an algorithm +that solves the problem slowly, +but the real challenge is to invent a +fast algorithm. +If the algorithm is too slow, it will get only +partial points or no points at all. + +The \key{time complexity} of an algorithm +estimates how much time the algorithm will use +for some input. 
+The idea is to represent the efficiency +as a function whose parameter is the size of the input. +By calculating the time complexity, +we can find out whether the algorithm is fast enough +without implementing it. + +\section{Calculation rules} + +The time complexity of an algorithm +is denoted $O(\cdots)$ +where the three dots represent some +function. +Usually, the variable $n$ denotes +the input size. +For example, if the input is an array of numbers, +$n$ will be the size of the array, +and if the input is a string, +$n$ will be the length of the string. + +\subsubsection*{Loops} + +A common reason why an algorithm is slow is +that it contains many loops that go through the input. +The more nested loops the algorithm contains, +the slower it is. +If there are $k$ nested loops, +the time complexity is $O(n^k)$. + +For example, the time complexity of the following code is $O(n)$: +\begin{lstlisting} +for (int i = 1; i <= n; i++) { + // code +} +\end{lstlisting} + +And the time complexity of the following code is $O(n^2)$: +\begin{lstlisting} +for (int i = 1; i <= n; i++) { + for (int j = 1; j <= n; j++) { + // code + } +} +\end{lstlisting} + +\subsubsection*{Order of magnitude} + +A time complexity does not tell us the exact number +of times the code inside a loop is executed, +but it only shows the order of magnitude. +In the following examples, the code inside the loop +is executed $3n$, $n+5$ and $\lceil n/2 \rceil$ times, +but the time complexity of each code is $O(n)$. 
+ +\begin{lstlisting} +for (int i = 1; i <= 3*n; i++) { + // code +} +\end{lstlisting} + +\begin{lstlisting} +for (int i = 1; i <= n+5; i++) { + // code +} +\end{lstlisting} + +\begin{lstlisting} +for (int i = 1; i <= n; i += 2) { + // code +} +\end{lstlisting} + +As another example, +the time complexity of the following code is $O(n^2)$: + +\begin{lstlisting} +for (int i = 1; i <= n; i++) { + for (int j = i+1; j <= n; j++) { + // code + } +} +\end{lstlisting} + +\subsubsection*{Phases} + +If the algorithm consists of consecutive phases, +the total time complexity is the largest +time complexity of a single phase. +The reason for this is that the slowest +phase is usually the bottleneck of the code. + +For example, the following code consists +of three phases with time complexities +$O(n)$, $O(n^2)$ and $O(n)$. +Thus, the total time complexity is $O(n^2)$. + +\begin{lstlisting} +for (int i = 1; i <= n; i++) { + // code +} +for (int i = 1; i <= n; i++) { + for (int j = 1; j <= n; j++) { + // code + } +} +for (int i = 1; i <= n; i++) { + // code +} +\end{lstlisting} + +\subsubsection*{Several variables} + +Sometimes the time complexity depends on +several factors. +In this case, the time complexity formula +contains several variables. + +For example, the time complexity of the +following code is $O(nm)$: + +\begin{lstlisting} +for (int i = 1; i <= n; i++) { + for (int j = 1; j <= m; j++) { + // code + } +} +\end{lstlisting} + +\subsubsection*{Recursion} + +The time complexity of a recursive function +depends on the number of times the function is called +and the time complexity of a single call. +The total time complexity is the product of +these values. + +For example, consider the following function: +\begin{lstlisting} +void f(int n) { + if (n == 1) return; + f(n-1); +} +\end{lstlisting} +The call $\texttt{f}(n)$ causes $n$ function calls, +and the time complexity of each call is $O(1)$. +Thus, the total time complexity is $O(n)$. 
+ +As another example, consider the following function: +\begin{lstlisting} +void g(int n) { + if (n == 1) return; + g(n-1); + g(n-1); +} +\end{lstlisting} +In this case each function call generates two other +calls, except for $n=1$. +Let us see what happens when $g$ is called +with parameter $n$. +The following table shows the function calls +produced by this single call: +\begin{center} +\begin{tabular}{rr} +function call & number of calls \\ +\hline +$g(n)$ & 1 \\ +$g(n-1)$ & 2 \\ +$g(n-2)$ & 4 \\ +$\cdots$ & $\cdots$ \\ +$g(1)$ & $2^{n-1}$ \\ +\end{tabular} +\end{center} +Based on this, the time complexity is +\[1+2+4+\cdots+2^{n-1} = 2^n-1 = O(2^n).\] + +\section{Complexity classes} + +\index{complexity classes} + +The following list contains common time complexities +of algorithms: + +\begin{description} +\item[$O(1)$] +\index{constant-time algorithm} +The running time of a \key{constant-time} algorithm +does not depend on the input size. +A typical constant-time algorithm is a direct +formula that calculates the answer. + +\item[$O(\log n)$] +\index{logarithmic algorithm} +A \key{logarithmic} algorithm often halves +the input size at each step. +The running time of such an algorithm +is logarithmic, because +$\log_2 n$ equals the number of times +$n$ must be divided by 2 to get 1. + +\item[$O(\sqrt n)$] +A \key{square root algorithm} is slower than +$O(\log n)$ but faster than $O(n)$. +A special property of square roots is that +$\sqrt n = n/\sqrt n$, so the square root $\sqrt n$ lies, +in some sense, in the middle of the input. + +\item[$O(n)$] +\index{linear algorithm} +A \key{linear} algorithm goes through the input +a constant number of times. +This is often the best possible time complexity, +because it is usually necessary to access each +input element at least once before +reporting the answer. 
+ +\item[$O(n \log n)$] +This time complexity often indicates that the +algorithm sorts the input, +because the time complexity of efficient +sorting algorithms is $O(n \log n)$. +Another possibility is that the algorithm +uses a data structure where each operation +takes $O(\log n)$ time. + +\item[$O(n^2)$] +\index{quadratic algorithm} +A \key{quadratic} algorithm often contains +two nested loops. +It is possible to go through all pairs of +the input elements in $O(n^2)$ time. + +\item[$O(n^3)$] +\index{cubic algorithm} +A \key{cubic} algorithm often contains +three nested loops. +It is possible to go through all triplets of +the input elements in $O(n^3)$ time. + +\item[$O(2^n)$] +This time complexity often indicates that +the algorithm iterates through all +subsets of the input elements. +For example, the subsets of $\{1,2,3\}$ are +$\emptyset$, $\{1\}$, $\{2\}$, $\{3\}$, $\{1,2\}$, +$\{1,3\}$, $\{2,3\}$ and $\{1,2,3\}$. + +\item[$O(n!)$] +This time complexity often indicates that +the algorithm iterates through all +permutations of the input elements. +For example, the permutations of $\{1,2,3\}$ are +$(1,2,3)$, $(1,3,2)$, $(2,1,3)$, $(2,3,1)$, +$(3,1,2)$ and $(3,2,1)$. + +\end{description} + +\index{polynomial algorithm} +An algorithm is \key{polynomial} +if its time complexity is at most $O(n^k)$ +where $k$ is a constant. +All the above time complexities except +$O(2^n)$ and $O(n!)$ are polynomial. +In practice, the constant $k$ is usually small, +and therefore a polynomial time complexity +roughly means that the algorithm is \emph{efficient}. + +\index{NP-hard problem} + +Most algorithms in this book are polynomial. +Still, there are many important problems for which +no polynomial algorithm is known, i.e., +nobody knows how to solve them efficiently. +\key{NP-hard} problems are an important set +of problems, for which no polynomial algorithm +is known\footnote{A classic book on the topic is +M. R. Garey's and D. S. 
Johnson's +\emph{Computers and Intractability: A Guide to the Theory +of NP-Completeness} \cite{gar79}.}. + +\section{Estimating efficiency} + +By calculating the time complexity of an algorithm, +it is possible to check, before +implementing the algorithm, that it is +efficient enough for the problem. +The starting point for estimations is the fact that +a modern computer can perform some hundreds of +millions of operations in a second. + +For example, assume that the time limit for +a problem is one second and the input size is $n=10^5$. +If the time complexity is $O(n^2)$, +the algorithm will perform about $(10^5)^2=10^{10}$ operations. +This should take at least some tens of seconds, +so the algorithm seems to be too slow for solving the problem. + +On the other hand, given the input size, +we can try to \emph{guess} +the required time complexity of the algorithm +that solves the problem. +The following table contains some useful estimates +assuming a time limit of one second. + +\begin{center} +\begin{tabular}{ll} +input size & required time complexity \\ +\hline +$n \le 10$ & $O(n!)$ \\ +$n \le 20$ & $O(2^n)$ \\ +$n \le 500$ & $O(n^3)$ \\ +$n \le 5000$ & $O(n^2)$ \\ +$n \le 10^6$ & $O(n \log n)$ or $O(n)$ \\ +$n$ is large & $O(1)$ or $O(\log n)$ \\ +\end{tabular} +\end{center} + +For example, if the input size is $n=10^5$, +it is probably expected that the time +complexity of the algorithm is $O(n)$ or $O(n \log n)$. +This information makes it easier to design the algorithm, +because it rules out approaches that would yield +an algorithm with a worse time complexity. + +\index{constant factor} + +Still, it is important to remember that a +time complexity is only an estimate of efficiency, +because it hides the \emph{constant factors}. +For example, an algorithm that runs in $O(n)$ time +may perform $n/2$ or $5n$ operations. +This has an important effect on the actual +running time of the algorithm. 
+ +\section{Maximum subarray sum} + +\index{maximum subarray sum} + +There are often several possible algorithms +for solving a problem such that their +time complexities are different. +This section discusses a classic problem that +has a straightforward $O(n^3)$ solution. +However, by designing a better algorithm, it +is possible to solve the problem in $O(n^2)$ +time and even in $O(n)$ time. + +Given an array of $n$ numbers, +our task is to calculate the +\key{maximum subarray sum}, i.e., +the largest possible sum of +a sequence of consecutive values +in the array\footnote{J. Bentley's +book \emph{Programming Pearls} \cite{ben86} made the problem popular.}. +The problem is interesting when there may be +negative values in the array. +For example, in the array +\begin{center} +\begin{tikzpicture}[scale=0.7] +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$-1$}; +\node at (1.5,0.5) {$2$}; +\node at (2.5,0.5) {$4$}; +\node at (3.5,0.5) {$-3$}; +\node at (4.5,0.5) {$5$}; +\node at (5.5,0.5) {$2$}; +\node at (6.5,0.5) {$-5$}; +\node at (7.5,0.5) {$2$}; +\end{tikzpicture} +\end{center} +\begin{samepage} +the following subarray produces the maximum sum $10$: +\begin{center} +\begin{tikzpicture}[scale=0.7] +\fill[color=lightgray] (1,0) rectangle (6,1); +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$-1$}; +\node at (1.5,0.5) {$2$}; +\node at (2.5,0.5) {$4$}; +\node at (3.5,0.5) {$-3$}; +\node at (4.5,0.5) {$5$}; +\node at (5.5,0.5) {$2$}; +\node at (6.5,0.5) {$-5$}; +\node at (7.5,0.5) {$2$}; +\end{tikzpicture} +\end{center} +\end{samepage} + +We assume that an empty subarray is allowed, +so the maximum subarray sum is always at least $0$. + +\subsubsection{Algorithm 1} + +A straightforward way to solve the problem +is to go through all possible subarrays, +calculate the sum of values in each subarray and maintain +the maximum sum. 
+The following code implements this algorithm: + +\begin{lstlisting} +int best = 0; +for (int a = 0; a < n; a++) { + for (int b = a; b < n; b++) { + int sum = 0; + for (int k = a; k <= b; k++) { + sum += array[k]; + } + best = max(best,sum); + } +} +cout << best << "\n"; +\end{lstlisting} + +The variables \texttt{a} and \texttt{b} fix the first and +last index of the subarray, +and the sum of values is calculated to the variable \texttt{sum}. +The variable \texttt{best} contains the maximum sum found during the search. + +The time complexity of the algorithm is $O(n^3)$, +because it consists of three nested loops +that go through the input. + +\subsubsection{Algorithm 2} + +It is easy to make Algorithm 1 more efficient +by removing one loop from it. +This is possible by calculating the sum at the same +time when the right end of the subarray moves. +The result is the following code: + +\begin{lstlisting} +int best = 0; +for (int a = 0; a < n; a++) { + int sum = 0; + for (int b = a; b < n; b++) { + sum += array[b]; + best = max(best,sum); + } +} +cout << best << "\n"; +\end{lstlisting} +After this change, the time complexity is $O(n^2)$. + +\subsubsection{Algorithm 3} + +Surprisingly, it is possible to solve the problem +in $O(n)$ time\footnote{In \cite{ben86}, this linear-time algorithm +is attributed to J. B. Kadane, and the algorithm is sometimes +called \index{Kadane's algorithm} \key{Kadane's algorithm}.}, which means +that just one loop is enough. +The idea is to calculate, for each array position, +the maximum sum of a subarray that ends at that position. +After this, the answer for the problem is the +maximum of those sums. + +Consider the subproblem of finding the maximum-sum subarray +that ends at position $k$. +There are two possibilities: +\begin{enumerate} +\item The subarray only contains the element at position $k$. +\item The subarray consists of a subarray that ends +at position $k-1$, followed by the element at position $k$. 
+\end{enumerate} + +In the latter case, since we want to +find a subarray with maximum sum, +the subarray that ends at position $k-1$ +should also have the maximum sum. +Thus, we can solve the problem efficiently +by calculating the maximum subarray sum +for each ending position from left to right. + +The following code implements the algorithm: +\begin{lstlisting} +int best = 0, sum = 0; +for (int k = 0; k < n; k++) { + sum = max(array[k],sum+array[k]); + best = max(best,sum); +} +cout << best << "\n"; +\end{lstlisting} + +The algorithm only contains one loop +that goes through the input, +so the time complexity is $O(n)$. +This is also the best possible time complexity, +because any algorithm for the problem +has to examine all array elements at least once. + +\subsubsection{Efficiency comparison} + +It is interesting to study how efficient +algorithms are in practice. +The following table shows the running times +of the above algorithms for different +values of $n$ on a modern computer. + +In each test, the input was generated randomly. +The time needed for reading the input was not +measured. + +\begin{center} +\begin{tabular}{rrrr} +array size $n$ & Algorithm 1 & Algorithm 2 & Algorithm 3 \\ +\hline +$10^2$ & $0.0$ s & $0.0$ s & $0.0$ s \\ +$10^3$ & $0.1$ s & $0.0$ s & $0.0$ s \\ +$10^4$ & > $10.0$ s & $0.1$ s & $0.0$ s \\ +$10^5$ & > $10.0$ s & $5.3$ s & $0.0$ s \\ +$10^6$ & > $10.0$ s & > $10.0$ s & $0.0$ s \\ +$10^7$ & > $10.0$ s & > $10.0$ s & $0.0$ s \\ +\end{tabular} +\end{center} + +The comparison shows that all algorithms +are efficient when the input size is small, +but larger inputs bring out remarkable +differences in the running times of the algorithms. +Algorithm 1 becomes slow +when $n=10^4$, and Algorithm 2 +becomes slow when $n=10^5$. +Only Algorithm 3 is able to process +even the largest inputs instantly. 
diff --git a/chapter03.tex b/chapter03.tex new file mode 100644 index 0000000..00b095d --- /dev/null +++ b/chapter03.tex @@ -0,0 +1,863 @@ +\chapter{Sorting} + +\index{sorting} + +\key{Sorting} +is a fundamental algorithm design problem. +Many efficient algorithms +use sorting as a subroutine, +because it is often easier to process +data if the elements are in a sorted order. + +For example, the problem ''does an array contain +two equal elements?'' is easy to solve using sorting. +If the array contains two equal elements, +they will be next to each other after sorting, +so it is easy to find them. +Also, the problem ''what is the most frequent element +in an array?'' can be solved similarly. + +There are many algorithms for sorting, and they are +also good examples of how to apply +different algorithm design techniques. +The efficient general sorting algorithms +work in $O(n \log n)$ time, +and many algorithms that use sorting +as a subroutine also +have this time complexity. + +\section{Sorting theory} + +The basic problem in sorting is as follows: +\begin{framed} +\noindent +Given an array that contains $n$ elements, +your task is to sort the elements +in increasing order. 
+\end{framed} +\noindent +For example, the array +\begin{center} +\begin{tikzpicture}[scale=0.7] +\draw (0,0) grid (8,1); +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$3$}; +\node at (2.5,0.5) {$8$}; +\node at (3.5,0.5) {$2$}; +\node at (4.5,0.5) {$9$}; +\node at (5.5,0.5) {$2$}; +\node at (6.5,0.5) {$5$}; +\node at (7.5,0.5) {$6$}; +\end{tikzpicture} +\end{center} +will be as follows after sorting: +\begin{center} +\begin{tikzpicture}[scale=0.7] +\draw (0,0) grid (8,1); +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$2$}; +\node at (2.5,0.5) {$2$}; +\node at (3.5,0.5) {$3$}; +\node at (4.5,0.5) {$5$}; +\node at (5.5,0.5) {$6$}; +\node at (6.5,0.5) {$8$}; +\node at (7.5,0.5) {$9$}; +\end{tikzpicture} +\end{center} + +\subsubsection{$O(n^2)$ algorithms} + +\index{bubble sort} + +Simple algorithms for sorting an array +work in $O(n^2)$ time. +Such algorithms are short and usually +consist of two nested loops. +A famous $O(n^2)$ time sorting algorithm +is \key{bubble sort} where the elements +''bubble'' in the array according to their values. + +Bubble sort consists of $n$ rounds. +On each round, the algorithm iterates through +the elements of the array. +Whenever two consecutive elements are found +that are not in correct order, +the algorithm swaps them. +The algorithm can be implemented as follows: +\begin{lstlisting} +for (int i = 0; i < n; i++) { + for (int j = 0; j < n-1; j++) { + if (array[j] > array[j+1]) { + swap(array[j],array[j+1]); + } + } +} +\end{lstlisting} + +After the first round of the algorithm, +the largest element will be in the correct position, +and in general, after $k$ rounds, the $k$ largest +elements will be in the correct positions. +Thus, after $n$ rounds, the whole array +will be sorted. 
+ +For example, in the array + +\begin{center} +\begin{tikzpicture}[scale=0.7] +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$3$}; +\node at (2.5,0.5) {$8$}; +\node at (3.5,0.5) {$2$}; +\node at (4.5,0.5) {$9$}; +\node at (5.5,0.5) {$2$}; +\node at (6.5,0.5) {$5$}; +\node at (7.5,0.5) {$6$}; +\end{tikzpicture} +\end{center} + +\noindent +the first round of bubble sort swaps elements +as follows: + +\begin{center} +\begin{tikzpicture}[scale=0.7] +\draw (0,0) grid (8,1); +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$3$}; +\node at (2.5,0.5) {$2$}; +\node at (3.5,0.5) {$8$}; +\node at (4.5,0.5) {$9$}; +\node at (5.5,0.5) {$2$}; +\node at (6.5,0.5) {$5$}; +\node at (7.5,0.5) {$6$}; + +\draw[thick,<->] (3.5,-0.25) .. controls (3.25,-1.00) and (2.75,-1.00) .. (2.5,-0.25); +\end{tikzpicture} +\end{center} + +\begin{center} +\begin{tikzpicture}[scale=0.7] +\draw (0,0) grid (8,1); +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$3$}; +\node at (2.5,0.5) {$2$}; +\node at (3.5,0.5) {$8$}; +\node at (4.5,0.5) {$2$}; +\node at (5.5,0.5) {$9$}; +\node at (6.5,0.5) {$5$}; +\node at (7.5,0.5) {$6$}; + +\draw[thick,<->] (5.5,-0.25) .. controls (5.25,-1.00) and (4.75,-1.00) .. (4.5,-0.25); +\end{tikzpicture} +\end{center} + +\begin{center} +\begin{tikzpicture}[scale=0.7] +\draw (0,0) grid (8,1); +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$3$}; +\node at (2.5,0.5) {$2$}; +\node at (3.5,0.5) {$8$}; +\node at (4.5,0.5) {$2$}; +\node at (5.5,0.5) {$5$}; +\node at (6.5,0.5) {$9$}; +\node at (7.5,0.5) {$6$}; + +\draw[thick,<->] (6.5,-0.25) .. controls (6.25,-1.00) and (5.75,-1.00) .. (5.5,-0.25); +\end{tikzpicture} +\end{center} + +\begin{center} +\begin{tikzpicture}[scale=0.7] +\draw (0,0) grid (8,1); +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$3$}; +\node at (2.5,0.5) {$2$}; +\node at (3.5,0.5) {$8$}; +\node at (4.5,0.5) {$2$}; +\node at (5.5,0.5) {$5$}; +\node at (6.5,0.5) {$6$}; +\node at (7.5,0.5) {$9$}; + +\draw[thick,<->] (7.5,-0.25) .. 
controls (7.25,-1.00) and (6.75,-1.00) .. (6.5,-0.25);
+\end{tikzpicture}
+\end{center}
+
+\subsubsection{Inversions}
+
+\index{inversion}
+
+Bubble sort is an example of a sorting
+algorithm that always swaps \emph{consecutive}
+elements in the array.
+It turns out that the time complexity
+of such an algorithm is \emph{always}
+at least $O(n^2)$, because in the worst case,
+$O(n^2)$ swaps are required for sorting the array.
+
+A useful concept when analyzing sorting
+algorithms is an \key{inversion}:
+a pair of array elements
+$(\texttt{array}[a],\texttt{array}[b])$ such that
+$a < b$ and $\texttt{array}[a] > \texttt{array}[b]$,
+i.e., the elements are in the wrong order.
+For example, the array
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (8,1);
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$2$};
+\node at (2.5,0.5) {$2$};
+\node at (3.5,0.5) {$6$};
+\node at (4.5,0.5) {$3$};
+\node at (5.5,0.5) {$5$};
+\node at (6.5,0.5) {$9$};
+\node at (7.5,0.5) {$8$};
+\end{tikzpicture}
+\end{center}
+has three inversions: $(6,3)$, $(6,5)$ and $(9,8)$.
+The number of inversions indicates
+how much work is needed to sort the array.
+An array is completely sorted when
+there are no inversions.
+On the other hand, if the array elements
+are in the reverse order,
+the number of inversions is the largest possible:
+\[1+2+\cdots+(n-1)=\frac{n(n-1)}{2} = O(n^2)\]
+
+Swapping a pair of consecutive elements that are
+in the wrong order removes exactly one inversion
+from the array.
+Hence, if a sorting algorithm can only
+swap consecutive elements, each swap removes
+at most one inversion, and the time complexity
+of the algorithm is at least $O(n^2)$.
+
+\subsubsection{$O(n \log n)$ algorithms}
+
+\index{merge sort}
+
+It is possible to sort an array efficiently
+in $O(n \log n)$ time using algorithms
+that are not limited to swapping consecutive elements.
+One such algorithm is \key{merge sort}\footnote{According to \cite{knu983},
+merge sort was invented by J.
von Neumann in 1945.}, +which is based on recursion. + +Merge sort sorts a subarray \texttt{array}$[a \ldots b]$ as follows: + +\begin{enumerate} +\item If $a=b$, do not do anything, because the subarray is already sorted. +\item Calculate the position of the middle element: $k=\lfloor (a+b)/2 \rfloor$. +\item Recursively sort the subarray \texttt{array}$[a \ldots k]$. +\item Recursively sort the subarray \texttt{array}$[k+1 \ldots b]$. +\item \emph{Merge} the sorted subarrays \texttt{array}$[a \ldots k]$ and +\texttt{array}$[k+1 \ldots b]$ +into a sorted subarray \texttt{array}$[a \ldots b]$. +\end{enumerate} + +Merge sort is an efficient algorithm, because it +halves the size of the subarray at each step. +The recursion consists of $O(\log n)$ levels, +and processing each level takes $O(n)$ time. +Merging the subarrays \texttt{array}$[a \ldots k]$ and \texttt{array}$[k+1 \ldots b]$ +is possible in linear time, because they are already sorted. + +For example, consider sorting the following array: +\begin{center} +\begin{tikzpicture}[scale=0.7] +\draw (0,0) grid (8,1); +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$3$}; +\node at (2.5,0.5) {$6$}; +\node at (3.5,0.5) {$2$}; +\node at (4.5,0.5) {$8$}; +\node at (5.5,0.5) {$2$}; +\node at (6.5,0.5) {$5$}; +\node at (7.5,0.5) {$9$}; +\end{tikzpicture} +\end{center} + +The array will be divided into two subarrays +as follows: +\begin{center} +\begin{tikzpicture}[scale=0.7] +\draw (0,0) grid (4,1); +\draw (5,0) grid (9,1); + +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$3$}; +\node at (2.5,0.5) {$6$}; +\node at (3.5,0.5) {$2$}; + +\node at (5.5,0.5) {$8$}; +\node at (6.5,0.5) {$2$}; +\node at (7.5,0.5) {$5$}; +\node at (8.5,0.5) {$9$}; +\end{tikzpicture} +\end{center} + +Then, the subarrays will be sorted recursively +as follows: +\begin{center} +\begin{tikzpicture}[scale=0.7] +\draw (0,0) grid (4,1); +\draw (5,0) grid (9,1); + +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$2$}; +\node at (2.5,0.5) {$3$}; +\node 
at (3.5,0.5) {$6$}; + +\node at (5.5,0.5) {$2$}; +\node at (6.5,0.5) {$5$}; +\node at (7.5,0.5) {$8$}; +\node at (8.5,0.5) {$9$}; +\end{tikzpicture} +\end{center} + +Finally, the algorithm merges the sorted +subarrays and creates the final sorted array: +\begin{center} +\begin{tikzpicture}[scale=0.7] +\draw (0,0) grid (8,1); +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$2$}; +\node at (2.5,0.5) {$2$}; +\node at (3.5,0.5) {$3$}; +\node at (4.5,0.5) {$5$}; +\node at (5.5,0.5) {$6$}; +\node at (6.5,0.5) {$8$}; +\node at (7.5,0.5) {$9$}; +\end{tikzpicture} +\end{center} + +\subsubsection{Sorting lower bound} + +Is it possible to sort an array faster +than in $O(n \log n)$ time? +It turns out that this is \emph{not} possible +when we restrict ourselves to sorting algorithms +that are based on comparing array elements. + +The lower bound for the time complexity +can be proved by considering sorting +as a process where each comparison of two elements +gives more information about the contents of the array. 
+The process creates the following tree:
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) rectangle (3,1);
+\node at (1.5,0.5) {$x < y?$};
+
+\draw[thick,->] (1.5,0) -- (-2.5,-1.5);
+\draw[thick,->] (1.5,0) -- (5.5,-1.5);
+
+\draw (-4,-2.5) rectangle (-1,-1.5);
+\draw (4,-2.5) rectangle (7,-1.5);
+\node at (-2.5,-2) {$x < y?$};
+\node at (5.5,-2) {$x < y?$};
+
+\draw[thick,->] (-2.5,-2.5) -- (-4.5,-4);
+\draw[thick,->] (-2.5,-2.5) -- (-0.5,-4);
+\draw[thick,->] (5.5,-2.5) -- (3.5,-4);
+\draw[thick,->] (5.5,-2.5) -- (7.5,-4);
+
+\draw (-6,-5) rectangle (-3,-4);
+\draw (-2,-5) rectangle (1,-4);
+\draw (2,-5) rectangle (5,-4);
+\draw (6,-5) rectangle (9,-4);
+\node at (-4.5,-4.5) {$x < y?$};
+\node at (-0.5,-4.5) {$x < y?$};
+\node at (3.5,-4.5) {$x < y?$};
+\node at (7.5,-4.5) {$x < y?$};
+
+\draw[thick,->] (-4.5,-5) -- (-5.5,-6);
+\draw[thick,->] (-4.5,-5) -- (-3.5,-6);
+\draw[thick,->] (-0.5,-5) -- (0.5,-6);
+\draw[thick,->] (-0.5,-5) -- (-1.5,-6);
+\draw[thick,->] (3.5,-5) -- (2.5,-6);
+\draw[thick,->] (3.5,-5) -- (4.5,-6);
+\draw[thick,->] (7.5,-5) -- (6.5,-6);
+\draw[thick,->] (7.5,-5) -- (8.5,-6);
+\end{tikzpicture}
+\end{center}
+
+Here ''$x < y?$'' means that some elements
+$x$ and $y$ are compared.
+If $x < y$, the process continues to the left,
+and otherwise to the right.
+The results of the process are the possible
+ways to sort the array, a total of $n!$ ways.
+For this reason, the height of the tree
+must be at least
+\[ \log_2(n!) = \log_2(1)+\log_2(2)+\cdots+\log_2(n).\]
+We get a lower bound for this sum
+by choosing the last $n/2$ elements and
+changing the value of each element to $\log_2(n/2)$.
+This yields an estimate
+\[ \log_2(n!) \ge (n/2) \cdot \log_2(n/2),\]
+so the height of the tree and the minimum
+possible number of steps in a sorting
+algorithm in the worst case is at least $n \log n$.
+
+\subsubsection{Counting sort}
+
+\index{counting sort}
+
+The lower bound $n \log n$ does not apply to
+algorithms that do not compare array elements
+but use some other information.
+An example of such an algorithm is
+\key{counting sort} that sorts an array in
+$O(n)$ time assuming that every element in the array
+is an integer between $0 \ldots c$ and $c=O(n)$.
+
+The algorithm creates a \emph{bookkeeping} array,
+whose indices are elements of the original array.
+The algorithm iterates through the original array
+and calculates how many times each element
+appears in the array.
+Construction of the bookkeeping array takes $O(n)$ time.
+After this, the sorted array can be created in $O(n)$ time,
+because the number of occurrences of each element can be
+retrieved from the bookkeeping array.
+Thus, the total time complexity of counting sort is $O(n)$.
+
+Counting sort is a very efficient algorithm,
+but it can only be used when the constant $c$
+is small enough, so that the array elements can
+be used as indices in the bookkeeping array.
+
+\section{Sorting in C++}
+
+\index{sort@\texttt{sort}}
+
+It is almost never a good idea to use
+a home-made sorting algorithm in a contest,
+because there are good implementations
+available in programming languages.
+For example, the C++ standard library contains
+the function \texttt{sort} that can be easily used for
+sorting arrays and other data structures.
+
+There are many benefits in using a library function.
+First, it saves time because there is no need to
+implement the function.
+Second, the library implementation is
+certainly correct and efficient: it is not probable
+that a home-made sorting function would be better.
+
+The following code sorts
+a vector that contains integers:
+\begin{lstlisting}
+vector<int> v = {4,2,5,3,5,8,3};
+sort(v.begin(),v.end());
+\end{lstlisting}
+After the sorting, the contents of the
+vector will be
+$[2,3,3,4,5,5,8]$.
+The default sorting order is increasing,
+but a reverse order is possible as follows:
+\begin{lstlisting}
+sort(v.rbegin(),v.rend());
+\end{lstlisting}
+An ordinary array can be sorted as follows:
+\begin{lstlisting}
+int n = 7; // array size
+int a[] = {4,2,5,3,5,8,3};
+sort(a,a+n);
+\end{lstlisting}
+\newpage
+The following code sorts the string \texttt{s}:
+\begin{lstlisting}
+string s = "monkey";
+sort(s.begin(), s.end());
+\end{lstlisting}
+Sorting a string means that the characters
+of the string are sorted.
+For example, the string ''monkey'' becomes ''ekmnoy''.
+
+\subsubsection{Comparison operators}
+
+\index{comparison operator}
+
+The function \texttt{sort} requires that
+a \key{comparison operator} is defined for the data type
+of the elements to be sorted.
+When sorting, this operator will be used
+whenever it is necessary to find out the order of two elements.
+
+Most C++ data types have a built-in comparison operator,
+and elements of those types can be sorted automatically.
+For example, numbers are sorted according to their values
+and strings are sorted in alphabetical order.
+
+\index{pair@\texttt{pair}}
+
+Pairs (\texttt{pair}) are sorted primarily according to their
+first elements (\texttt{first}).
+However, if the first elements of two pairs are equal,
+they are sorted according to their second elements (\texttt{second}):
+\begin{lstlisting}
+vector<pair<int,int>> v;
+v.push_back({1,5});
+v.push_back({2,3});
+v.push_back({1,2});
+sort(v.begin(), v.end());
+\end{lstlisting}
+After this, the order of the pairs is
+$(1,2)$, $(1,5)$ and $(2,3)$.
+
+\index{tuple@\texttt{tuple}}
+
+In a similar way, tuples (\texttt{tuple})
+are sorted primarily by the first element,
+secondarily by the second element, etc.\footnote{Note that in some older compilers,
+the function \texttt{make\_tuple} has to be used to create a tuple instead of
+braces (for example, \texttt{make\_tuple(2,1,4)} instead of \texttt{\{2,1,4\}}).}:
+\begin{lstlisting}
+vector<tuple<int,int,int>> v;
+v.push_back({2,1,4});
+v.push_back({1,5,3});
+v.push_back({2,1,3});
+sort(v.begin(), v.end());
+\end{lstlisting}
+After this, the order of the tuples is
+$(1,5,3)$, $(2,1,3)$ and $(2,1,4)$.
+
+\subsubsection{User-defined structs}
+
+User-defined structs do not have a comparison
+operator automatically.
+The operator should be defined inside
+the struct as a function
+\texttt{operator<},
+whose parameter is another element of the same type.
+The operator should return \texttt{true}
+if the element is smaller than the parameter,
+and \texttt{false} otherwise.
+ +For example, the following struct \texttt{P} +contains the x and y coordinates of a point. +The comparison operator is defined so that +the points are sorted primarily by the x coordinate +and secondarily by the y coordinate. + +\begin{lstlisting} +struct P { + int x, y; + bool operator<(const P &p) { + if (x != p.x) return x < p.x; + else return y < p.y; + } +}; +\end{lstlisting} + +\subsubsection{Comparison functions} + +\index{comparison function} + +It is also possible to give an external +\key{comparison function} to the \texttt{sort} function +as a callback function. +For example, the following comparison function \texttt{comp} +sorts strings primarily by length and secondarily +by alphabetical order: + +\begin{lstlisting} +bool comp(string a, string b) { + if (a.size() != b.size()) return a.size() < b.size(); + return a < b; +} +\end{lstlisting} +Now a vector of strings can be sorted as follows: +\begin{lstlisting} +sort(v.begin(), v.end(), comp); +\end{lstlisting} + +\section{Binary search} + +\index{binary search} + +A general method for searching for an element +in an array is to use a \texttt{for} loop +that iterates through the elements of the array. +For example, the following code searches for +an element $x$ in an array: + +\begin{lstlisting} +for (int i = 0; i < n; i++) { + if (array[i] == x) { + // x found at index i + } +} +\end{lstlisting} + +The time complexity of this approach is $O(n)$, +because in the worst case, it is necessary to check +all elements of the array. +If the order of the elements is arbitrary, +this is also the best possible approach, because +there is no additional information available where +in the array we should search for the element $x$. + +However, if the array is \emph{sorted}, +the situation is different. +In this case it is possible to perform the +search much faster, because the order of the +elements in the array guides the search. 
+The following \key{binary search} algorithm +efficiently searches for an element in a sorted array +in $O(\log n)$ time. + +\subsubsection{Method 1} + +The usual way to implement binary search +resembles looking for a word in a dictionary. +The search maintains an active region in the array, +which initially contains all array elements. +Then, a number of steps is performed, +each of which halves the size of the region. + +At each step, the search checks the middle element +of the active region. +If the middle element is the target element, +the search terminates. +Otherwise, the search recursively continues +to the left or right half of the region, +depending on the value of the middle element. + +The above idea can be implemented as follows: +\begin{lstlisting} +int a = 0, b = n-1; +while (a <= b) { + int k = (a+b)/2; + if (array[k] == x) { + // x found at index k + } + if (array[k] > x) b = k-1; + else a = k+1; +} +\end{lstlisting} + +In this implementation, the active region is $a \ldots b$, +and initially the region is $0 \ldots n-1$. +The algorithm halves the size of the region at each step, +so the time complexity is $O(\log n)$. + +\subsubsection{Method 2} + +An alternative method to implement binary search +is based on an efficient way to iterate through +the elements of the array. +The idea is to make jumps and slow the speed +when we get closer to the target element. + +The search goes through the array from left to +right, and the initial jump length is $n/2$. +At each step, the jump length will be halved: +first $n/4$, then $n/8$, $n/16$, etc., until +finally the length is 1. +After the jumps, either the target element has +been found or we know that it does not appear in the array. 
+ +The following code implements the above idea: +\begin{lstlisting} +int k = 0; +for (int b = n/2; b >= 1; b /= 2) { + while (k+b < n && array[k+b] <= x) k += b; +} +if (array[k] == x) { + // x found at index k +} +\end{lstlisting} + +During the search, the variable $b$ +contains the current jump length. +The time complexity of the algorithm is $O(\log n)$, +because the code in the \texttt{while} loop +is performed at most twice for each jump length. + +\subsubsection{C++ functions} + +The C++ standard library contains the following functions +that are based on binary search and work in logarithmic time: + +\begin{itemize} +\item \texttt{lower\_bound} returns a pointer to the +first array element whose value is at least $x$. +\item \texttt{upper\_bound} returns a pointer to the +first array element whose value is larger than $x$. +\item \texttt{equal\_range} returns both above pointers. +\end{itemize} + +The functions assume that the array is sorted. +If there is no such element, the pointer points to +the element after the last array element. +For example, the following code finds out whether +an array contains an element with value $x$: + +\begin{lstlisting} +auto k = lower_bound(array,array+n,x)-array; +if (k < n && array[k] == x) { + // x found at index k +} +\end{lstlisting} + +Then, the following code counts the number of elements +whose value is $x$: + +\begin{lstlisting} +auto a = lower_bound(array, array+n, x); +auto b = upper_bound(array, array+n, x); +cout << b-a << "\n"; +\end{lstlisting} + +Using \texttt{equal\_range}, the code becomes shorter: + +\begin{lstlisting} +auto r = equal_range(array, array+n, x); +cout << r.second-r.first << "\n"; +\end{lstlisting} + +\subsubsection{Finding the smallest solution} + +An important use for binary search is +to find the position where the value of a \emph{function} changes. +Suppose that we wish to find the smallest value $k$ +that is a valid solution for a problem. 
+We are given a function $\texttt{ok}(x)$
+that returns \texttt{true} if $x$ is a valid solution
+and \texttt{false} otherwise.
+In addition, we know that $\texttt{ok}(x)$ is \texttt{false}
+when $x<k$ and \texttt{true} when $x \ge k$.
+Now, the value of $k$ can be found using binary search:
+
+\begin{lstlisting}
+int x = -1;
+for (int b = z; b >= 1; b /= 2) {
+    while (!ok(x+b)) x += b;
+}
+int k = x+1;
+\end{lstlisting}
+
+The search finds the largest value of $x$ for which
+$\texttt{ok}(x)$ is \texttt{false}.
+Thus, the next value $k=x+1$
+is the smallest possible value for which
+$\texttt{ok}(k)$ is \texttt{true}.
+The initial jump length $z$ has to be
+large enough, for example some value
+for which we know beforehand that $\texttt{ok}(z)$ is \texttt{true}.
+
+The algorithm calls the function \texttt{ok}
+$O(\log z)$ times, so the total time complexity
+depends on the function \texttt{ok}.
+For example, if the function works in $O(n)$ time,
+the total time complexity is $O(n \log z)$.
+
+\subsubsection{Finding the maximum value}
+
+Binary search can also be used to find
+the maximum value for a function that is
+first increasing and then decreasing.
+Our task is to find a position $k$ such that
+
+\begin{itemize}
+\item
+$f(x)<f(x+1)$ when $x<k$, and
+\item
+$f(x)>f(x+1)$ when $x \ge k$.
+\end{itemize}
+
+The idea is to use binary search
+for finding the largest value of $x$
+for which $f(x)<f(x+1)$.
+This implies that $k=x+1$
+because $f(x+1)>f(x+2)$.
+The following code implements the search:
+
+\begin{lstlisting}
+int x = -1;
+for (int b = z; b >= 1; b /= 2) {
+    while (f(x+b) < f(x+b+1)) x += b;
+}
+int k = x+1;
+\end{lstlisting}
+
+Note that unlike in the ordinary binary search,
+here it is not allowed that consecutive values
+of the function are equal.
+In this case it would not be possible to know
+how to continue the search.
diff --git a/chapter04.tex b/chapter04.tex
new file mode 100644
index 0000000..9ae6bdc
--- /dev/null
+++ b/chapter04.tex
@@ -0,0 +1,794 @@
+\chapter{Data structures}
+
+\index{data structure}
+
+A \key{data structure} is a way to store
+data in the memory of a computer.
+It is important to choose an appropriate
+data structure for a problem,
+because each data structure has its own
+advantages and disadvantages.
+The crucial question is: which operations
+are efficient in the chosen data structure?
+
+This chapter introduces the most important
+data structures in the C++ standard library.
+It is a good idea to use the standard library
+whenever possible,
+because it will save a lot of time.
+Later in the book we will learn about more sophisticated
+data structures that are not available
+in the standard library.
+
+\section{Dynamic arrays}
+
+\index{dynamic array}
+\index{vector}
+
+A \key{dynamic array} is an array whose
+size can be changed during the execution
+of the program.
+The most popular dynamic array in C++ is
+the \texttt{vector} structure,
+which can be used almost like an ordinary array.
+
+The following code creates an empty vector and
+adds three elements to it:
+
+\begin{lstlisting}
+vector<int> v;
+v.push_back(3); // [3]
+v.push_back(2); // [3,2]
+v.push_back(5); // [3,2,5]
+\end{lstlisting}
+
+After this, the elements can be accessed like in an ordinary array:
+
+\begin{lstlisting}
+cout << v[0] << "\n"; // 3
+cout << v[1] << "\n"; // 2
+cout << v[2] << "\n"; // 5
+\end{lstlisting}
+
+The function \texttt{size} returns the number of elements in the vector.
+The following code iterates through
+the vector and prints all elements in it:
+
+\begin{lstlisting}
+for (int i = 0; i < v.size(); i++) {
+    cout << v[i] << "\n";
+}
+\end{lstlisting}
+
+\begin{samepage}
+A shorter way to iterate through a vector is as follows:
+
+\begin{lstlisting}
+for (auto x : v) {
+    cout << x << "\n";
+}
+\end{lstlisting}
+\end{samepage}
+
+The function \texttt{back} returns the last element
+in the vector, and
+the function \texttt{pop\_back} removes the last element:
+
+\begin{lstlisting}
+vector<int> v;
+v.push_back(5);
+v.push_back(2);
+cout << v.back() << "\n"; // 2
+v.pop_back();
+cout << v.back() << "\n"; // 5
+\end{lstlisting}
+
+The following code creates a vector with five elements:
+
+\begin{lstlisting}
+vector<int> v = {2,4,2,5,1};
+\end{lstlisting}
+
+Another way to create a vector is to give the number
+of elements and the initial value for each element:
+
+\begin{lstlisting}
+// size 10, initial value 0
+vector<int> v(10);
+\end{lstlisting}
+\begin{lstlisting}
+// size 10, initial value 5
+vector<int> v(10, 5);
+\end{lstlisting}
+
+The internal implementation of a vector
+uses an ordinary array.
+If the size of the vector increases and
+the array becomes too small,
+a new array is allocated and all the
+elements are moved to the new array.
+However, this does not happen often and the
+average time complexity of
+\texttt{push\_back} is $O(1)$.
+
+\index{string}
+
+The \texttt{string} structure
+is also a dynamic array that can be used almost like a vector.
+In addition, there is special syntax for strings
+that is not available in other data structures.
+Strings can be combined using the \texttt{+} symbol.
+The function $\texttt{substr}(k,x)$ returns the substring
+that begins at position $k$ and has length $x$,
+and the function $\texttt{find}(\texttt{t})$ finds the position
+of the first occurrence of a substring \texttt{t}.
+
+The following code presents some string operations:
+
+\begin{lstlisting}
+string a = "hatti";
+string b = a+a;
+cout << b << "\n"; // hattihatti
+b[5] = 'v';
+cout << b << "\n"; // hattivatti
+string c = b.substr(3,4);
+cout << c << "\n"; // tiva
+\end{lstlisting}
+
+\section{Set structures}
+
+\index{set}
+
+A \key{set} is a data structure that
+maintains a collection of elements.
+The basic operations of sets are element
+insertion, search and removal.
+
+The C++ standard library contains two set
+implementations:
+the structure \texttt{set} is based on a balanced
+binary tree and its operations work in $O(\log n)$ time,
+while the structure \texttt{unordered\_set} uses hashing,
+and its operations work in $O(1)$ time on average.
+
+The choice of which set implementation to use
+is often a matter of taste.
+The benefit of the \texttt{set} structure
+is that it maintains the order of the elements
+and provides functions that are not available
+in \texttt{unordered\_set}.
+On the other hand, \texttt{unordered\_set}
+can be more efficient.
+
+The following code creates a set
+that contains integers,
+and shows some of the operations.
+The function \texttt{insert} adds an element to the set,
+the function \texttt{count} returns the number of occurrences
+of an element in the set,
+and the function \texttt{erase} removes an element from the set.
+
+\begin{lstlisting}
+set<int> s;
+s.insert(3);
+s.insert(2);
+s.insert(5);
+cout << s.count(3) << "\n"; // 1
+cout << s.count(4) << "\n"; // 0
+s.erase(3);
+s.insert(4);
+cout << s.count(3) << "\n"; // 0
+cout << s.count(4) << "\n"; // 1
+\end{lstlisting}
+
+A set can be used mostly like a vector,
+but it is not possible to access
+the elements using the \texttt{[]} notation.
+The following code creates a set,
+prints the number of elements in it, and then
+iterates through all the elements:
+\begin{lstlisting}
+set<int> s = {2,5,6,8};
+cout << s.size() << "\n"; // 4
+for (auto x : s) {
+    cout << x << "\n";
+}
+\end{lstlisting}
+
+An important property of sets is
+that all their elements are \emph{distinct}.
+Thus, the function \texttt{count} always returns
+either 0 (the element is not in the set)
+or 1 (the element is in the set),
+and the function \texttt{insert} never adds
+an element to the set if it is
+already there.
+The following code illustrates this:
+
+\begin{lstlisting}
+set<int> s;
+s.insert(5);
+s.insert(5);
+s.insert(5);
+cout << s.count(5) << "\n"; // 1
+\end{lstlisting}
+
+C++ also contains the structures
+\texttt{multiset} and \texttt{unordered\_multiset}
+that otherwise work like \texttt{set}
+and \texttt{unordered\_set}
+but they can contain multiple instances of an element.
+For example, in the following code all three instances
+of the number 5 are added to a multiset:
+
+\begin{lstlisting}
+multiset<int> s;
+s.insert(5);
+s.insert(5);
+s.insert(5);
+cout << s.count(5) << "\n"; // 3
+\end{lstlisting}
+The function \texttt{erase} removes
+all instances of an element
+from a multiset:
+\begin{lstlisting}
+s.erase(5);
+cout << s.count(5) << "\n"; // 0
+\end{lstlisting}
+Often, only one instance should be removed,
+which can be done as follows:
+\begin{lstlisting}
+s.erase(s.find(5));
+cout << s.count(5) << "\n"; // 2
+\end{lstlisting}
+
+\section{Map structures}
+
+\index{map}
+
+A \key{map} is a generalized array
+that consists of key-value-pairs.
+While the keys in an ordinary array are always
+the consecutive integers $0,1,\ldots,n-1$,
+where $n$ is the size of the array,
+the keys in a map can be of any data type and
+they do not have to be consecutive values.
+
+The C++ standard library contains two map
+implementations that correspond to the set
+implementations: the structure
+\texttt{map} is based on a balanced
+binary tree and accessing elements
+takes $O(\log n)$ time,
+while the structure
+\texttt{unordered\_map} uses hashing
+and accessing elements takes $O(1)$ time on average.
+
+The following code creates a map
+where the keys are strings and the values are integers:
+
+\begin{lstlisting}
+map<string,int> m;
+m["monkey"] = 4;
+m["banana"] = 3;
+m["harpsichord"] = 9;
+cout << m["banana"] << "\n"; // 3
+\end{lstlisting}
+
+If the value of a key is requested
+but the map does not contain it,
+the key is automatically added to the map with
+a default value.
+For example, in the following code,
+the key ``aybabtu'' with value 0
+is added to the map.
+
+\begin{lstlisting}
+map<string,int> m;
+cout << m["aybabtu"] << "\n"; // 0
+\end{lstlisting}
+The function \texttt{count} checks
+if a key exists in a map:
+\begin{lstlisting}
+if (m.count("aybabtu")) {
+    // key exists
+}
+\end{lstlisting}
+The following code prints all the keys and values
+in a map:
+\begin{lstlisting}
+for (auto x : m) {
+    cout << x.first << " " << x.second << "\n";
+}
+\end{lstlisting}
+
+\section{Iterators and ranges}
+
+\index{iterator}
+
+Many functions in the C++ standard library
+operate with iterators.
+An \key{iterator} is a variable that points
+to an element in a data structure.
+
+The often used iterators \texttt{begin}
+and \texttt{end} define a range that contains
+all elements in a data structure.
+The iterator \texttt{begin} points to
+the first element in the data structure,
+and the iterator \texttt{end} points to
+the position \emph{after} the last element.
+The situation looks as follows: + +\begin{center} +\begin{tabular}{llllllllll} +\{ & 3, & 4, & 6, & 8, & 12, & 13, & 14, & 17 & \} \\ +& $\uparrow$ & & & & & & & & $\uparrow$ \\ +& \multicolumn{3}{l}{\texttt{s.begin()}} & & & & & & \texttt{s.end()} \\ +\end{tabular} +\end{center} + +Note the asymmetry in the iterators: +\texttt{s.begin()} points to an element in the data structure, +while \texttt{s.end()} points outside the data structure. +Thus, the range defined by the iterators is \emph{half-open}. + +\subsubsection{Working with ranges} + +Iterators are used in C++ standard library functions +that are given a range of elements in a data structure. +Usually, we want to process all elements in a +data structure, so the iterators +\texttt{begin} and \texttt{end} are given for the function. + +For example, the following code sorts a vector +using the function \texttt{sort}, +then reverses the order of the elements using the function +\texttt{reverse}, and finally shuffles the order of +the elements using the function \texttt{random\_shuffle}. + +\index{sort@\texttt{sort}} +\index{reverse@\texttt{reverse}} +\index{random\_shuffle@\texttt{random\_shuffle}} + +\begin{lstlisting} +sort(v.begin(), v.end()); +reverse(v.begin(), v.end()); +random_shuffle(v.begin(), v.end()); +\end{lstlisting} + +These functions can also be used with an ordinary array. +In this case, the functions are given pointers to the array +instead of iterators: + +\newpage +\begin{lstlisting} +sort(a, a+n); +reverse(a, a+n); +random_shuffle(a, a+n); +\end{lstlisting} + +\subsubsection{Set iterators} + +Iterators are often used to access +elements of a set. 
+The following code creates an iterator
+\texttt{it} that points to the smallest element in a set:
+\begin{lstlisting}
+set<int>::iterator it = s.begin();
+\end{lstlisting}
+A shorter way to write the code is as follows:
+\begin{lstlisting}
+auto it = s.begin();
+\end{lstlisting}
+The element to which an iterator points
+can be accessed using the \texttt{*} symbol.
+For example, the following code prints
+the first element in the set:
+
+\begin{lstlisting}
+auto it = s.begin();
+cout << *it << "\n";
+\end{lstlisting}
+
+Iterators can be moved using the operators
+\texttt{++} (forward) and \texttt{--} (backward),
+meaning that the iterator moves to the next
+or previous element in the set.
+
+The following code prints all the elements
+in increasing order:
+\begin{lstlisting}
+for (auto it = s.begin(); it != s.end(); it++) {
+    cout << *it << "\n";
+}
+\end{lstlisting}
+The following code prints the largest element in the set:
+\begin{lstlisting}
+auto it = s.end(); it--;
+cout << *it << "\n";
+\end{lstlisting}
+
+The function $\texttt{find}(x)$ returns an iterator
+that points to an element whose value is $x$.
+However, if the set does not contain $x$,
+the iterator will be \texttt{end}.
+
+\begin{lstlisting}
+auto it = s.find(x);
+if (it == s.end()) {
+    // x is not found
+}
+\end{lstlisting}
+
+The function $\texttt{lower\_bound}(x)$ returns
+an iterator to the smallest element in the set
+whose value is \emph{at least} $x$, and
+the function $\texttt{upper\_bound}(x)$
+returns an iterator to the smallest element in the set
+whose value is \emph{larger than} $x$.
+In both functions, if such an element does not exist,
+the return value is \texttt{end}.
+These functions are not supported by the
+\texttt{unordered\_set} structure, which
+does not maintain the order of the elements.
+ +\begin{samepage} +For example, the following code finds the element +nearest to $x$: + +\begin{lstlisting} +auto it = s.lower_bound(x); +if (it == s.begin()) { + cout << *it << "\n"; +} else if (it == s.end()) { + it--; + cout << *it << "\n"; +} else { + int a = *it; it--; + int b = *it; + if (x-b < a-x) cout << b << "\n"; + else cout << a << "\n"; +} +\end{lstlisting} + +The code assumes that the set is not empty, +and goes through all possible cases +using an iterator \texttt{it}. +First, the iterator points to the smallest +element whose value is at least $x$. +If \texttt{it} equals \texttt{begin}, +the corresponding element is nearest to $x$. +If \texttt{it} equals \texttt{end}, +the largest element in the set is nearest to $x$. +If none of the previous cases hold, +the element nearest to $x$ is either the +element that corresponds to \texttt{it} or the previous element. +\end{samepage} + +\section{Other structures} + +\subsubsection{Bitset} + +\index{bitset} + +A \key{bitset} is an array +whose each value is either 0 or 1. +For example, the following code creates +a bitset that contains 10 elements: +\begin{lstlisting} +bitset<10> s; +s[1] = 1; +s[3] = 1; +s[4] = 1; +s[7] = 1; +cout << s[4] << "\n"; // 1 +cout << s[5] << "\n"; // 0 +\end{lstlisting} + +The benefit of using bitsets is that +they require less memory than ordinary arrays, +because each element in a bitset only +uses one bit of memory. +For example, +if $n$ bits are stored in an \texttt{int} array, +$32n$ bits of memory will be used, +but a corresponding bitset only requires $n$ bits of memory. +In addition, the values of a bitset +can be efficiently manipulated using +bit operators, which makes it possible to +optimize algorithms using bit sets. 
+
+The following code shows another way to create the above bitset:
+\begin{lstlisting}
+bitset<10> s(string("0010011010")); // from right to left
+cout << s[4] << "\n"; // 1
+cout << s[5] << "\n"; // 0
+\end{lstlisting}
+
+The function \texttt{count} returns the number
+of ones in the bitset:
+
+\begin{lstlisting}
+bitset<10> s(string("0010011010"));
+cout << s.count() << "\n"; // 4
+\end{lstlisting}
+
+The following code shows examples of using bit operations:
+\begin{lstlisting}
+bitset<10> a(string("0010110110"));
+bitset<10> b(string("1011011000"));
+cout << (a&b) << "\n"; // 0010010000
+cout << (a|b) << "\n"; // 1011111110
+cout << (a^b) << "\n"; // 1001101110
+\end{lstlisting}
+
+\subsubsection{Deque}
+
+\index{deque}
+
+A \key{deque} is a dynamic array
+whose size can be efficiently
+changed at both ends of the array.
+Like a vector, a deque provides the functions
+\texttt{push\_back} and \texttt{pop\_back}, but
+it also includes the functions
+\texttt{push\_front} and \texttt{pop\_front}
+which are not available in a vector.
+
+A deque can be used as follows:
+\begin{lstlisting}
+deque<int> d;
+d.push_back(5); // [5]
+d.push_back(2); // [5,2]
+d.push_front(3); // [3,5,2]
+d.pop_back(); // [3,5]
+d.pop_front(); // [5]
+\end{lstlisting}
+
+The internal implementation of a deque
+is more complex than that of a vector,
+and for this reason, a deque is slower than a vector.
+Still, both adding and removing
+elements take $O(1)$ time on average at both ends.
+
+\subsubsection{Stack}
+
+\index{stack}
+
+A \key{stack}
+is a data structure that provides two
+$O(1)$ time operations:
+adding an element to the top,
+and removing an element from the top.
+It is only possible to access the top
+element of a stack.
+
+The following code shows how a stack can be used:
+\begin{lstlisting}
+stack<int> s;
+s.push(3);
+s.push(2);
+s.push(5);
+cout << s.top(); // 5
+s.pop();
+cout << s.top(); // 2
+\end{lstlisting}
+\subsubsection{Queue}
+
+\index{queue}
+
+A \key{queue} also
+provides two $O(1)$ time operations:
+adding an element to the end of the queue,
+and removing the first element in the queue.
+It is only possible to access the first
+and last element of a queue.
+
+The following code shows how a queue can be used:
+\begin{lstlisting}
+queue<int> q;
+q.push(3);
+q.push(2);
+q.push(5);
+cout << q.front(); // 3
+q.pop();
+cout << q.front(); // 2
+\end{lstlisting}
+
+\subsubsection{Priority queue}
+
+\index{priority queue}
+\index{heap}
+
+A \key{priority queue}
+maintains a set of elements.
+The supported operations are insertion and,
+depending on the type of the queue,
+retrieval and removal of
+either the minimum or maximum element.
+Insertion and removal take $O(\log n)$ time,
+and retrieval takes $O(1)$ time.
+
+While an ordered set efficiently supports
+all the operations of a priority queue,
+the benefit of using a priority queue is
+that it has smaller constant factors.
+A priority queue is usually implemented using
+a heap structure that is much simpler than a
+balanced binary tree used in an ordered set.
+
+\begin{samepage}
+By default, the elements in a C++
+priority queue are sorted in decreasing order,
+and it is possible to find and remove the
+largest element in the queue.
+The following code illustrates this:
+
+\begin{lstlisting}
+priority_queue<int> q;
+q.push(3);
+q.push(5);
+q.push(7);
+q.push(2);
+cout << q.top() << "\n"; // 7
+q.pop();
+cout << q.top() << "\n"; // 5
+q.pop();
+q.push(6);
+cout << q.top() << "\n"; // 6
+q.pop();
+\end{lstlisting}
+\end{samepage}
+
+If we want to create a priority queue
+that supports finding and removing
+the smallest element,
+we can do it as follows:
+
+\begin{lstlisting}
+priority_queue<int,vector<int>,greater<int>> q;
+\end{lstlisting}
+
+\subsubsection{Policy-based data structures}
+
+The \texttt{g++} compiler also supports
+some data structures that are not part
+of the C++ standard library.
+Such structures are called \emph{policy-based}
+data structures.
+To use these structures, the following lines
+must be added to the code:
+\begin{lstlisting}
+#include <ext/pb_ds/assoc_container.hpp>
+using namespace __gnu_pbds;
+\end{lstlisting}
+After this, we can define a data structure \texttt{indexed\_set} that
+is like \texttt{set} but can be indexed like an array.
+The definition for \texttt{int} values is as follows:
+\begin{lstlisting}
+typedef tree<int,null_type,less<int>,rb_tree_tag,
+             tree_order_statistics_node_update> indexed_set;
+\end{lstlisting}
+Now we can create a set as follows:
+\begin{lstlisting}
+indexed_set s;
+s.insert(2);
+s.insert(3);
+s.insert(7);
+s.insert(9);
+\end{lstlisting}
+The speciality of this set is that we have access to
+the indices that the elements would have in a sorted array.
+The function $\texttt{find\_by\_order}$ returns
+an iterator to the element at a given position:
+\begin{lstlisting}
+auto x = s.find_by_order(2);
+cout << *x << "\n"; // 7
+\end{lstlisting}
+And the function $\texttt{order\_of\_key}$
+returns the position of a given element:
+\begin{lstlisting}
+cout << s.order_of_key(7) << "\n"; // 2
+\end{lstlisting}
+If the element does not appear in the set,
+we get the position that the element would have
+in the set:
+\begin{lstlisting}
+cout << s.order_of_key(6) << "\n"; // 2
+cout << s.order_of_key(8) << "\n"; // 3
+\end{lstlisting}
+Both functions work in logarithmic time.
+
+\section{Comparison to sorting}
+
+It is often possible to solve a problem
+using either data structures or sorting.
+Sometimes there are remarkable differences
+in the actual efficiency of these approaches,
+which may be hidden in their time complexities.
+
+Let us consider a problem where
+we are given two lists $A$ and $B$
+that both contain $n$ elements.
+Our task is to calculate the number of elements
+that belong to both of the lists.
+For example, for the lists
+\[A = [5,2,8,9] \hspace{10px} \textrm{and} \hspace{10px} B = [3,2,9,5],\]
+the answer is 3 because the numbers 2, 5
+and 9 belong to both of the lists.
+
+A straightforward solution to the problem is
+to go through all pairs of elements in $O(n^2)$ time,
+but next we will focus on
+more efficient algorithms.
+
+\subsubsection{Algorithm 1}
+
+We construct a set of the elements that appear in $A$,
+and after this, we iterate through the elements
+of $B$ and check for each element whether it
+also belongs to $A$.
+This is efficient because the elements of $A$
+are in a set.
+Using the \texttt{set} structure,
+the time complexity of the algorithm is $O(n \log n)$.
+
+\subsubsection{Algorithm 2}
+
+It is not necessary to maintain an ordered set,
+so instead of the \texttt{set} structure
+we can also use the \texttt{unordered\_set} structure.
+This is an easy way to make the algorithm +more efficient, because we only have to change +the underlying data structure. +The time complexity of the new algorithm is $O(n)$. + +\subsubsection{Algorithm 3} + +Instead of data structures, we can use sorting. +First, we sort both lists $A$ and $B$. +After this, we iterate through both the lists +at the same time and find the common elements. +The time complexity of sorting is $O(n \log n)$, +and the rest of the algorithm works in $O(n)$ time, +so the total time complexity is $O(n \log n)$. + +\subsubsection{Efficiency comparison} + +The following table shows how efficient +the above algorithms are when $n$ varies and +the elements of the lists are random +integers between $1 \ldots 10^9$: + +\begin{center} +\begin{tabular}{rrrr} +$n$ & Algorithm 1 & Algorithm 2 & Algorithm 3 \\ +\hline +$10^6$ & $1.5$ s & $0.3$ s & $0.2$ s \\ +$2 \cdot 10^6$ & $3.7$ s & $0.8$ s & $0.3$ s \\ +$3 \cdot 10^6$ & $5.7$ s & $1.3$ s & $0.5$ s \\ +$4 \cdot 10^6$ & $7.7$ s & $1.7$ s & $0.7$ s \\ +$5 \cdot 10^6$ & $10.0$ s & $2.3$ s & $0.9$ s \\ +\end{tabular} +\end{center} + +Algorithms 1 and 2 are equal except that +they use different set structures. +In this problem, this choice has an important effect on +the running time, because Algorithm 2 +is 4–5 times faster than Algorithm 1. + +However, the most efficient algorithm is Algorithm 3 +which uses sorting. +It only uses half the time compared to Algorithm 2. +Interestingly, the time complexity of both +Algorithm 1 and Algorithm 3 is $O(n \log n)$, +but despite this, Algorithm 3 is ten times faster. +This can be explained by the fact that +sorting is a simple procedure and it is done +only once at the beginning of Algorithm 3, +and the rest of the algorithm works in linear time. +On the other hand, +Algorithm 1 maintains a complex balanced binary tree +during the whole algorithm. 
diff --git a/chapter05.tex b/chapter05.tex new file mode 100644 index 0000000..9fec183 --- /dev/null +++ b/chapter05.tex @@ -0,0 +1,758 @@ +\chapter{Complete search} + +\key{Complete search} +is a general method that can be used +to solve almost any algorithm problem. +The idea is to generate all possible +solutions to the problem using brute force, +and then select the best solution or count the +number of solutions, depending on the problem. + +Complete search is a good technique +if there is enough time to go through all the solutions, +because the search is usually easy to implement +and it always gives the correct answer. +If complete search is too slow, +other techniques, such as greedy algorithms or +dynamic programming, may be needed. + +\section{Generating subsets} + +\index{subset} + +We first consider the problem of generating +all subsets of a set of $n$ elements. +For example, the subsets of $\{0,1,2\}$ are +$\emptyset$, $\{0\}$, $\{1\}$, $\{2\}$, $\{0,1\}$, +$\{0,2\}$, $\{1,2\}$ and $\{0,1,2\}$. +There are two common methods to generate subsets: +we can either perform a recursive search +or exploit the bit representation of integers. + +\subsubsection{Method 1} + +An elegant way to go through all subsets +of a set is to use recursion. +The following function \texttt{search} +generates the subsets of the set +$\{0,1,\ldots,n-1\}$. +The function maintains a vector \texttt{subset} +that will contain the elements of each subset. +The search begins when the function is called +with parameter 0. 
+
+\begin{lstlisting}
+void search(int k) {
+    if (k == n) {
+        // process subset
+    } else {
+        search(k+1);
+        subset.push_back(k);
+        search(k+1);
+        subset.pop_back();
+    }
+}
+\end{lstlisting}
+
+When the function \texttt{search}
+is called with parameter $k$,
+it decides whether to include the
+element $k$ in the subset or not,
+and in both cases,
+then calls itself with parameter $k+1$.
+However, if $k=n$, the function notices that
+all elements have been processed
+and a subset has been generated.
+
+The following tree illustrates the function calls when $n=3$.
+We can always choose either the left branch
+($k$ is not included in the subset) or the right branch
+($k$ is included in the subset).
+
+\begin{center}
+\begin{tikzpicture}[scale=.45]
+  \begin{scope}
+    \small
+    \node at (0,0) {$\texttt{search}(0)$};
+
+    \node at (-8,-4) {$\texttt{search}(1)$};
+    \node at (8,-4) {$\texttt{search}(1)$};
+
+    \path[draw,thick,->] (0,0-0.5) -- (-8,-4+0.5);
+    \path[draw,thick,->] (0,0-0.5) -- (8,-4+0.5);
+
+    \node at (-12,-8) {$\texttt{search}(2)$};
+    \node at (-4,-8) {$\texttt{search}(2)$};
+    \node at (4,-8) {$\texttt{search}(2)$};
+    \node at (12,-8) {$\texttt{search}(2)$};
+
+    \path[draw,thick,->] (-8,-4-0.5) -- (-12,-8+0.5);
+    \path[draw,thick,->] (-8,-4-0.5) -- (-4,-8+0.5);
+    \path[draw,thick,->] (8,-4-0.5) -- (4,-8+0.5);
+    \path[draw,thick,->] (8,-4-0.5) -- (12,-8+0.5);
+
+    \node at (-14,-12) {$\texttt{search}(3)$};
+    \node at (-10,-12) {$\texttt{search}(3)$};
+    \node at (-6,-12) {$\texttt{search}(3)$};
+    \node at (-2,-12) {$\texttt{search}(3)$};
+    \node at (2,-12) {$\texttt{search}(3)$};
+    \node at (6,-12) {$\texttt{search}(3)$};
+    \node at (10,-12) {$\texttt{search}(3)$};
+    \node at (14,-12) {$\texttt{search}(3)$};
+
+    \node at (-14,-13.5) {$\emptyset$};
+    \node at (-10,-13.5) {$\{2\}$};
+    \node at (-6,-13.5) {$\{1\}$};
+    \node at (-2,-13.5) {$\{1,2\}$};
+    \node at (2,-13.5) {$\{0\}$};
+    \node at (6,-13.5) {$\{0,2\}$};
+    \node at (10,-13.5) {$\{0,1\}$};
+    \node at
(14,-13.5) {$\{0,1,2\}$};
+
+
+    \path[draw,thick,->] (-12,-8-0.5) -- (-14,-12+0.5);
+    \path[draw,thick,->] (-12,-8-0.5) -- (-10,-12+0.5);
+    \path[draw,thick,->] (-4,-8-0.5) -- (-6,-12+0.5);
+    \path[draw,thick,->] (-4,-8-0.5) -- (-2,-12+0.5);
+    \path[draw,thick,->] (4,-8-0.5) -- (2,-12+0.5);
+    \path[draw,thick,->] (4,-8-0.5) -- (6,-12+0.5);
+    \path[draw,thick,->] (12,-8-0.5) -- (10,-12+0.5);
+    \path[draw,thick,->] (12,-8-0.5) -- (14,-12+0.5);
+\end{scope}
+\end{tikzpicture}
+\end{center}
+
+\subsubsection{Method 2}
+
+Another way to generate subsets is based on
+the bit representation of integers.
+Each subset of a set of $n$ elements
+can be represented as a sequence of $n$ bits,
+which corresponds to an integer between $0 \ldots 2^n-1$.
+The ones in the bit sequence indicate
+which elements are included in the subset.
+
+The usual convention is that
+the last bit corresponds to element 0,
+the second last bit corresponds to element 1,
+and so on.
+For example, the bit representation of 25
+is 11001, which corresponds to the subset $\{0,3,4\}$.
+
+The following code goes through the subsets
+of a set of $n$ elements:
+
+\begin{lstlisting}
+for (int b = 0; b < (1<<n); b++) {
+    // process subset b
+}
+\end{lstlisting}
+
+The following code shows how we can find
+the elements of a subset that corresponds to a bit sequence.
+When processing each subset,
+the code builds a vector that contains the
+elements in the subset.
+
+\begin{lstlisting}
+for (int b = 0; b < (1<<n); b++) {
+    vector<int> subset;
+    for (int i = 0; i < n; i++) {
+        if (b&(1<<i)) subset.push_back(i);
+    }
+}
+\end{lstlisting}
+
+\section{Generating permutations}
+
+\index{permutation}
+\index{next\_permutation@\texttt{next\_permutation}}
+
+Next we consider the problem of generating
+all permutations of a set of $n$ elements.
+For example, the permutations of $\{0,1,2\}$ are
+$(0,1,2)$, $(0,2,1)$, $(1,0,2)$, $(1,2,0)$,
+$(2,0,1)$ and $(2,1,0)$.
+
+A convenient way to go through all permutations
+is to begin with the permutation
+$\{0,1,\ldots,n-1\}$ and repeatedly use the
+C++ standard library function \texttt{next\_permutation}
+that constructs the next permutation in increasing order:
+
+\begin{lstlisting}
+vector<int> permutation;
+for (int i = 0; i < n; i++) {
+    permutation.push_back(i);
+}
+do {
+    // process permutation
+} while (next_permutation(permutation.begin(),permutation.end()));
+\end{lstlisting}
+
+\section{Backtracking}
+
+\index{backtracking}
+
+A \key{backtracking} algorithm
+begins with an empty solution
+and extends the solution step by step.
+The search recursively
+goes through all the different ways in which
+a solution can be constructed.
+
+\index{queen problem}
+
+As an example, consider the problem of
+calculating the number
+of ways $n$ queens can be placed on
+an $n \times n$ chessboard so that
+no two queens attack each other.
+For example, when $n=4$, +there are two possible solutions: + +\begin{center} +\begin{tikzpicture}[scale=.65] + \begin{scope} + \draw (0, 0) grid (4, 4); + \node at (1.5,3.5) {\symqueen}; + \node at (3.5,2.5) {\symqueen}; + \node at (0.5,1.5) {\symqueen}; + \node at (2.5,0.5) {\symqueen}; + + \draw (6, 0) grid (10, 4); + \node at (6+2.5,3.5) {\symqueen}; + \node at (6+0.5,2.5) {\symqueen}; + \node at (6+3.5,1.5) {\symqueen}; + \node at (6+1.5,0.5) {\symqueen}; + + \end{scope} +\end{tikzpicture} +\end{center} + +The problem can be solved using backtracking +by placing queens to the board row by row. +More precisely, exactly one queen will +be placed on each row so that no queen attacks +any of the queens placed before. +A solution has been found when all +$n$ queens have been placed on the board. + +For example, when $n=4$, +some partial solutions generated by +the backtracking algorithm are as follows: + +\begin{center} +\begin{tikzpicture}[scale=.55] + \begin{scope} + \draw (0, 0) grid (4, 4); + + \draw (-9, -6) grid (-5, -2); + \draw (-3, -6) grid (1, -2); + \draw (3, -6) grid (7, -2); + \draw (9, -6) grid (13, -2); + + \node at (-9+0.5,-3+0.5) {\symqueen}; + \node at (-3+1+0.5,-3+0.5) {\symqueen}; + \node at (3+2+0.5,-3+0.5) {\symqueen}; + \node at (9+3+0.5,-3+0.5) {\symqueen}; + + \draw (2,0) -- (-7,-2); + \draw (2,0) -- (-1,-2); + \draw (2,0) -- (5,-2); + \draw (2,0) -- (11,-2); + + \draw (-11, -12) grid (-7, -8); + \draw (-6, -12) grid (-2, -8); + \draw (-1, -12) grid (3, -8); + \draw (4, -12) grid (8, -8); + \draw[white] (11, -12) grid (15, -8); + \node at (-11+1+0.5,-9+0.5) {\symqueen}; + \node at (-6+1+0.5,-9+0.5) {\symqueen}; + \node at (-1+1+0.5,-9+0.5) {\symqueen}; + \node at (4+1+0.5,-9+0.5) {\symqueen}; + \node at (-11+0+0.5,-10+0.5) {\symqueen}; + \node at (-6+1+0.5,-10+0.5) {\symqueen}; + \node at (-1+2+0.5,-10+0.5) {\symqueen}; + \node at (4+3+0.5,-10+0.5) {\symqueen}; + + \draw (-1,-6) -- (-9,-8); + \draw (-1,-6) -- (-4,-8); + \draw (-1,-6) -- 
(1,-8);
+    \draw (-1,-6) -- (6,-8);
+
+    \node at (-9,-13) {illegal};
+    \node at (-4,-13) {illegal};
+    \node at (1,-13) {illegal};
+    \node at (6,-13) {valid};
+
+  \end{scope}
+\end{tikzpicture}
+\end{center}
+
+At the bottom level, the first three configurations
+are illegal, because the queens attack each other.
+However, the fourth configuration is valid
+and it can be extended to a complete solution by
+placing two more queens on the board.
+There is only one way to place the two remaining queens.
+
+\begin{samepage}
+The algorithm can be implemented as follows:
+\begin{lstlisting}
+void search(int y) {
+  if (y == n) {
+    count++;
+    return;
+  }
+  for (int x = 0; x < n; x++) {
+    if (column[x] || diag1[x+y] || diag2[x-y+n-1]) continue;
+    column[x] = diag1[x+y] = diag2[x-y+n-1] = 1;
+    search(y+1);
+    column[x] = diag1[x+y] = diag2[x-y+n-1] = 0;
+  }
+}
+\end{lstlisting}
+\end{samepage}
+The search begins by calling \texttt{search(0)}.
+The size of the board is $n \times n$,
+and the code counts the number of solutions
+in \texttt{count}.
+
+The code assumes that the rows and columns
+of the board are numbered from 0 to $n-1$.
+When the function \texttt{search} is
+called with parameter $y$,
+it places a queen on row $y$
+and then calls itself with parameter $y+1$.
+However, if $y=n$, a solution has been found,
+and the variable \texttt{count} is increased by one.
+
+The array \texttt{column} keeps track of columns
+that contain a queen,
+and the arrays \texttt{diag1} and \texttt{diag2}
+keep track of diagonals.
+It is not allowed to add another queen to a
+column or diagonal that already contains a queen.
+For example, the columns and diagonals of +the $4 \times 4$ board are numbered as follows: + +\begin{center} +\begin{tikzpicture}[scale=.65] + \begin{scope} + \draw (0-6, 0) grid (4-6, 4); + \node at (-6+0.5,3.5) {$0$}; + \node at (-6+1.5,3.5) {$1$}; + \node at (-6+2.5,3.5) {$2$}; + \node at (-6+3.5,3.5) {$3$}; + \node at (-6+0.5,2.5) {$0$}; + \node at (-6+1.5,2.5) {$1$}; + \node at (-6+2.5,2.5) {$2$}; + \node at (-6+3.5,2.5) {$3$}; + \node at (-6+0.5,1.5) {$0$}; + \node at (-6+1.5,1.5) {$1$}; + \node at (-6+2.5,1.5) {$2$}; + \node at (-6+3.5,1.5) {$3$}; + \node at (-6+0.5,0.5) {$0$}; + \node at (-6+1.5,0.5) {$1$}; + \node at (-6+2.5,0.5) {$2$}; + \node at (-6+3.5,0.5) {$3$}; + + \draw (0, 0) grid (4, 4); + \node at (0.5,3.5) {$0$}; + \node at (1.5,3.5) {$1$}; + \node at (2.5,3.5) {$2$}; + \node at (3.5,3.5) {$3$}; + \node at (0.5,2.5) {$1$}; + \node at (1.5,2.5) {$2$}; + \node at (2.5,2.5) {$3$}; + \node at (3.5,2.5) {$4$}; + \node at (0.5,1.5) {$2$}; + \node at (1.5,1.5) {$3$}; + \node at (2.5,1.5) {$4$}; + \node at (3.5,1.5) {$5$}; + \node at (0.5,0.5) {$3$}; + \node at (1.5,0.5) {$4$}; + \node at (2.5,0.5) {$5$}; + \node at (3.5,0.5) {$6$}; + + \draw (6, 0) grid (10, 4); + \node at (6.5,3.5) {$3$}; + \node at (7.5,3.5) {$4$}; + \node at (8.5,3.5) {$5$}; + \node at (9.5,3.5) {$6$}; + \node at (6.5,2.5) {$2$}; + \node at (7.5,2.5) {$3$}; + \node at (8.5,2.5) {$4$}; + \node at (9.5,2.5) {$5$}; + \node at (6.5,1.5) {$1$}; + \node at (7.5,1.5) {$2$}; + \node at (8.5,1.5) {$3$}; + \node at (9.5,1.5) {$4$}; + \node at (6.5,0.5) {$0$}; + \node at (7.5,0.5) {$1$}; + \node at (8.5,0.5) {$2$}; + \node at (9.5,0.5) {$3$}; + + \node at (-4,-1) {\texttt{column}}; + \node at (2,-1) {\texttt{diag1}}; + \node at (8,-1) {\texttt{diag2}}; + + \end{scope} +\end{tikzpicture} +\end{center} + +Let $q(n)$ denote the number of ways +to place $n$ queens on an $n \times n$ chessboard. +The above backtracking +algorithm tells us that, for example, $q(8)=92$. 
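For completeness, the routine above can be packaged into a small self-contained program. The array sizes, the helper \texttt{count\_queens} and the variable name \texttt{solutions} are choices of this sketch, not part of the book's code:

\begin{lstlisting}
#include <cstdio>

// Self-contained version of the search described above.
// The fixed array sizes assume n <= 10 (an assumption of this sketch).
int n;
long long solutions;
bool column[10], diag1[19], diag2[19];

void search(int y) {
  if (y == n) {
    solutions++;
    return;
  }
  for (int x = 0; x < n; x++) {
    if (column[x] || diag1[x+y] || diag2[x-y+n-1]) continue;
    column[x] = diag1[x+y] = diag2[x-y+n-1] = true;
    search(y+1);
    column[x] = diag1[x+y] = diag2[x-y+n-1] = false;
  }
}

long long count_queens(int size) {
  n = size;
  solutions = 0;
  search(0);  // backtracking restores all arrays to false
  return solutions;
}

int main() {
  printf("q(8) = %lld\n", count_queens(8));  // prints q(8) = 92
}
\end{lstlisting}

Since the backtracking always restores the marker arrays, \texttt{count\_queens} can be called repeatedly for different board sizes.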
+When $n$ increases, the search quickly becomes slow,
+because the number of solutions increases
+exponentially.
+For example, calculating $q(16)=14772512$
+using the above algorithm already takes about a minute
+on a modern computer\footnote{There is no known way to efficiently
+calculate larger values of $q(n)$. The current record is
+$q(27)=234907967154122528$, calculated in 2016 \cite{q27}.}.
+
+\section{Pruning the search}
+
+We can often optimize backtracking
+by pruning the search tree.
+The idea is to add ``intelligence'' to the algorithm
+so that it will notice as soon as possible
+if a partial solution cannot be extended
+to a complete solution.
+Such optimizations can have a tremendous
+effect on the efficiency of the search.
+
+Let us consider the problem
+of calculating the number of paths
+in an $n \times n$ grid from the upper-left corner
+to the lower-right corner such that the
+path visits each square exactly once.
+For example, in a $7 \times 7$ grid,
+there are 111712 such paths.
+One of the paths is as follows:
+
+\begin{center}
+\begin{tikzpicture}[scale=.55]
+  \begin{scope}
+    \draw (0, 0) grid (7, 7);
+    \draw[thick,->] (0.5,6.5) -- (0.5,4.5) -- (2.5,4.5) --
+          (2.5,3.5) -- (0.5,3.5) -- (0.5,0.5) --
+          (3.5,0.5) -- (3.5,1.5) -- (1.5,1.5) --
+          (1.5,2.5) -- (4.5,2.5) -- (4.5,0.5) --
+          (5.5,0.5) -- (5.5,3.5) -- (3.5,3.5) --
+          (3.5,5.5) -- (1.5,5.5) -- (1.5,6.5) --
+          (4.5,6.5) -- (4.5,4.5) -- (5.5,4.5) --
+          (5.5,6.5) -- (6.5,6.5) -- (6.5,0.5);
+  \end{scope}
+\end{tikzpicture}
+\end{center}
+
+We focus on the $7 \times 7$ case,
+because its level of difficulty is appropriate to our needs.
+We begin with a straightforward backtracking algorithm,
+and then optimize it step by step using observations
+of how the search can be pruned.
+After each optimization, we measure the running time
+of the algorithm and the number of recursive calls,
+so that we clearly see the effect of each
+optimization on the efficiency of the search.
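The book does not list the code of the basic algorithm, so the following is only a sketch of what it might look like; the names \texttt{count\_paths} and \texttt{MAX} are our own. To keep the run fast, \texttt{main} uses a $5 \times 5$ board instead of the $7 \times 7$ board discussed in the text:

\begin{lstlisting}
#include <cstdio>

// Plain backtracking: extend the path square by square and count
// the paths that visit every square and end in the lower-right corner.
const int MAX = 7;
int n;                   // board size
bool visited[MAX][MAX];
long long paths;

void search(int y, int x, int left) {
  if (y < 0 || y >= n || x < 0 || x >= n || visited[y][x]) return;
  visited[y][x] = true;
  // left = number of squares still to be visited, including this one
  if (left == 1 && y == n-1 && x == n-1) paths++;
  search(y+1, x, left-1);
  search(y-1, x, left-1);
  search(y, x+1, left-1);
  search(y, x-1, left-1);
  visited[y][x] = false;
}

long long count_paths(int size) {
  n = size;
  paths = 0;
  search(0, 0, n*n);
  return paths;
}

int main() {
  printf("%lld\n", count_paths(5));
}
\end{lstlisting}

Note that for even board sizes the answer is 0: coloring the board like a chessboard, both corners have the same color, but a path through an even number of squares must end on the opposite color.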
+ +\subsubsection{Basic algorithm} + +The first version of the algorithm does not contain +any optimizations. We simply use backtracking to generate +all possible paths from the upper-left corner to +the lower-right corner and count the number of such paths. + +\begin{itemize} +\item +running time: 483 seconds +\item +number of recursive calls: 76 billion +\end{itemize} + +\subsubsection{Optimization 1} + +In any solution, we first move one step +down or right. +There are always two paths that +are symmetric +about the diagonal of the grid +after the first step. +For example, the following paths are symmetric: + +\begin{center} +\begin{tabular}{ccc} +\begin{tikzpicture}[scale=.55] + \begin{scope} + \draw (0, 0) grid (7, 7); + \draw[thick,->] (0.5,6.5) -- (0.5,4.5) -- (2.5,4.5) -- + (2.5,3.5) -- (0.5,3.5) -- (0.5,0.5) -- + (3.5,0.5) -- (3.5,1.5) -- (1.5,1.5) -- + (1.5,2.5) -- (4.5,2.5) -- (4.5,0.5) -- + (5.5,0.5) -- (5.5,3.5) -- (3.5,3.5) -- + (3.5,5.5) -- (1.5,5.5) -- (1.5,6.5) -- + (4.5,6.5) -- (4.5,4.5) -- (5.5,4.5) -- + (5.5,6.5) -- (6.5,6.5) -- (6.5,0.5); + \end{scope} +\end{tikzpicture} +& \hspace{20px} +& +\begin{tikzpicture}[scale=.55] + \begin{scope}[yscale=1,xscale=-1,rotate=-90] + \draw (0, 0) grid (7, 7); + \draw[thick,->] (0.5,6.5) -- (0.5,4.5) -- (2.5,4.5) -- + (2.5,3.5) -- (0.5,3.5) -- (0.5,0.5) -- + (3.5,0.5) -- (3.5,1.5) -- (1.5,1.5) -- + (1.5,2.5) -- (4.5,2.5) -- (4.5,0.5) -- + (5.5,0.5) -- (5.5,3.5) -- (3.5,3.5) -- + (3.5,5.5) -- (1.5,5.5) -- (1.5,6.5) -- + (4.5,6.5) -- (4.5,4.5) -- (5.5,4.5) -- + (5.5,6.5) -- (6.5,6.5) -- (6.5,0.5); + \end{scope} +\end{tikzpicture} +\end{tabular} +\end{center} + +Hence, we can decide that we always first +move one step down (or right), +and finally multiply the number of solutions by two. 
+ +\begin{itemize} +\item +running time: 244 seconds +\item +number of recursive calls: 38 billion +\end{itemize} + +\subsubsection{Optimization 2} + +If the path reaches the lower-right square +before it has visited all other squares of the grid, +it is clear that +it will not be possible to complete the solution. +An example of this is the following path: + +\begin{center} +\begin{tikzpicture}[scale=.55] + \begin{scope} + \draw (0, 0) grid (7, 7); + \draw[thick,->] (0.5,6.5) -- (0.5,4.5) -- (2.5,4.5) -- + (2.5,3.5) -- (0.5,3.5) -- (0.5,0.5) -- + (3.5,0.5) -- (3.5,1.5) -- (1.5,1.5) -- + (1.5,2.5) -- (4.5,2.5) -- (4.5,0.5) -- + (6.5,0.5); + \end{scope} +\end{tikzpicture} +\end{center} +Using this observation, we can terminate the search +immediately if we reach the lower-right square too early. +\begin{itemize} +\item +running time: 119 seconds +\item +number of recursive calls: 20 billion +\end{itemize} + +\subsubsection{Optimization 3} + +If the path touches a wall +and can turn either left or right, +the grid splits into two parts +that contain unvisited squares. +For example, in the following situation, +the path can turn either left or right: + +\begin{center} +\begin{tikzpicture}[scale=.55] + \begin{scope} + \draw (0, 0) grid (7, 7); + \draw[thick,->] (0.5,6.5) -- (0.5,4.5) -- (2.5,4.5) -- + (2.5,3.5) -- (0.5,3.5) -- (0.5,0.5) -- + (3.5,0.5) -- (3.5,1.5) -- (1.5,1.5) -- + (1.5,2.5) -- (4.5,2.5) -- (4.5,0.5) -- + (5.5,0.5) -- (5.5,6.5); + \end{scope} +\end{tikzpicture} +\end{center} +In this case, we cannot visit all squares anymore, +so we can terminate the search. +This optimization is very useful: + +\begin{itemize} +\item +running time: 1.8 seconds +\item +number of recursive calls: 221 million +\end{itemize} + +\subsubsection{Optimization 4} + +The idea of Optimization 3 +can be generalized: +if the path cannot continue forward +but can turn either left or right, +the grid splits into two parts +that both contain unvisited squares. 
+For example, consider the following path: + +\begin{center} +\begin{tikzpicture}[scale=.55] + \begin{scope} + \draw (0, 0) grid (7, 7); + \draw[thick,->] (0.5,6.5) -- (0.5,4.5) -- (2.5,4.5) -- + (2.5,3.5) -- (0.5,3.5) -- (0.5,0.5) -- + (3.5,0.5) -- (3.5,1.5) -- (1.5,1.5) -- + (1.5,2.5) -- (4.5,2.5) -- (4.5,0.5) -- + (5.5,0.5) -- (5.5,4.5) -- (3.5,4.5); + \end{scope} +\end{tikzpicture} +\end{center} +It is clear that we cannot visit all squares anymore, +so we can terminate the search. +After this optimization, the search is +very efficient: + +\begin{itemize} +\item +running time: 0.6 seconds +\item +number of recursive calls: 69 million +\end{itemize} + +~\\ +Now is a good moment to stop optimizing +the algorithm and see what we have achieved. +The running time of the original algorithm +was 483 seconds, +and now after the optimizations, +the running time is only 0.6 seconds. +Thus, the algorithm became nearly 1000 times +faster after the optimizations. + +This is a usual phenomenon in backtracking, +because the search tree is usually large +and even simple observations can effectively +prune the search. +Especially useful are optimizations that +occur during the first steps of the algorithm, +i.e., at the top of the search tree. + +\section{Meet in the middle} + +\index{meet in the middle} + +\key{Meet in the middle} is a technique +where the search space is divided into +two parts of about equal size. +A separate search is performed +for both of the parts, +and finally the results of the searches are combined. + +The technique can be used +if there is an efficient way to combine the +results of the searches. +In such a situation, the two searches may require less +time than one large search. +Typically, we can turn a factor of $2^n$ +into a factor of $2^{n/2}$ using the meet in the +middle technique. 
+
+As an example, consider a problem where
+we are given a list of $n$ numbers and
+a number $x$,
+and we want to find out if it is possible
+to choose some numbers from the list so that
+their sum is $x$.
+For example, given the list $[2,4,5,9]$ and $x=15$,
+we can choose the numbers $[2,4,9]$ to get $2+4+9=15$.
+However, if $x=10$ for the same list,
+it is not possible to form the sum.
+
+A simple algorithm for the problem is to
+go through all subsets of the elements and
+check if the sum of any of the subsets is $x$.
+The running time of such an algorithm is $O(2^n)$,
+because there are $2^n$ subsets.
+However, using the meet in the middle technique,
+we can achieve a more efficient $O(2^{n/2})$ time algorithm\footnote{This
+idea was introduced in 1974 by E. Horowitz and S. Sahni \cite{hor74}.}.
+Note that $O(2^n)$ and $O(2^{n/2})$ are different
+complexities because $2^{n/2}$ equals $\sqrt{2^n}$.
+
+The idea is to divide the list into
+two lists $A$ and $B$ such that both
+lists contain about half of the numbers.
+The first search generates all subsets
+of $A$ and stores their sums in a list $S_A$.
+Correspondingly, the second search creates
+a list $S_B$ from $B$.
+After this, it suffices to check if it is possible
+to choose one element from $S_A$ and another
+element from $S_B$ such that their sum is $x$.
+This is possible exactly when there is a way to
+form the sum $x$ using the numbers of the original list.
+
+For example, suppose that the list is $[2,4,5,9]$ and $x=15$.
+First, we divide the list into $A=[2,4]$ and $B=[5,9]$.
+After this, we create lists
+$S_A=[0,2,4,6]$ and $S_B=[0,5,9,14]$.
+In this case, the sum $x=15$ is possible to form,
+because $S_A$ contains the sum $6$,
+$S_B$ contains the sum $9$, and $6+9=15$.
+This corresponds to the solution $[2,4,9]$.
+
+We can implement the algorithm so that
+its time complexity is $O(2^{n/2})$.
+First, we generate \emph{sorted} lists $S_A$ and $S_B$,
+which can be done in $O(2^{n/2})$ time using a merge-like technique.
+After this, since the lists are sorted,
+we can check in $O(2^{n/2})$ time if
+the sum $x$ can be created from $S_A$ and $S_B$.
\ No newline at end of file
diff --git a/chapter06.tex b/chapter06.tex
new file mode 100644
index 0000000..326d63c
--- /dev/null
+++ b/chapter06.tex
@@ -0,0 +1,680 @@
+\chapter{Greedy algorithms}
+
+\index{greedy algorithm}
+
+A \key{greedy algorithm}
+constructs a solution to the problem
+by always making a choice that looks
+the best at the moment.
+A greedy algorithm never takes back
+its choices, but directly constructs
+the final solution.
+For this reason, greedy algorithms
+are usually very efficient.
+
+The difficulty in designing greedy algorithms
+is to find a greedy strategy
+that always produces an optimal solution
+to the problem.
+The locally optimal choices in a greedy
+algorithm should also be globally optimal.
+It is often difficult to argue that
+a greedy algorithm works.
+
+\section{Coin problem}
+
+As a first example, we consider a problem
+where we are given a set of coins
+and our task is to form a sum of money $n$
+using the coins.
+The values of the coins are
+$\texttt{coins}=\{c_1,c_2,\ldots,c_k\}$,
+and each coin can be used as many times as we want.
+What is the minimum number of coins needed?
+
+For example, if the coins are the euro coins (in cents)
+\[\{1,2,5,10,20,50,100,200\}\]
+and $n=520$,
+we need at least four coins.
+The optimal solution is to select coins
+$200+200+100+20$ whose sum is 520.
+
+\subsubsection{Greedy algorithm}
+
+A simple greedy algorithm for the problem
+always selects the largest possible coin,
+until the required sum of money has been constructed.
+This algorithm works in the example case,
+because we first select two 200 cent coins,
+then one 100 cent coin and finally one 20 cent coin.
+But does this algorithm always work?
+
+It turns out that if the coins are the euro coins,
+the greedy algorithm \emph{always} works, i.e.,
+it always produces a solution with the smallest
+possible number of coins.
+The correctness of the algorithm can be
+shown as follows:
+
+First, each coin 1, 5, 10, 50 and 100 appears
+at most once in an optimal solution,
+because if the
+solution contained two such coins,
+we could replace them by one coin and
+obtain a better solution.
+For example, if the solution contained
+coins $5+5$, we could replace them by coin $10$.
+
+In the same way, coins 2 and 20 appear
+at most twice in an optimal solution,
+because we could replace
+coins $2+2+2$ by coins $5+1$ and
+coins $20+20+20$ by coins $50+10$.
+Moreover, an optimal solution cannot contain
+coins $2+2+1$ or $20+20+10$,
+because we could replace them by coins $5$ and $50$.
+
+Using these observations,
+we can show for each coin $x$ that
+it is not possible to optimally construct
+a sum $x$ or any larger sum by only using coins
+that are smaller than $x$.
+For example, if $x=100$, the largest optimal
+sum using the smaller coins is $50+20+20+5+2+2=99$.
+Thus, the greedy algorithm that always selects
+the largest coin produces the optimal solution.
+
+This example shows that it can be difficult
+to argue that a greedy algorithm works,
+even if the algorithm itself is simple.
+
+\subsubsection{General case}
+
+In the general case, the coin set can contain any coins
+and the greedy algorithm \emph{does not} necessarily produce
+an optimal solution.
+
+We can prove that a greedy algorithm does not work
+by showing a counterexample
+where the algorithm gives a wrong answer.
+In this problem we can easily find a counterexample:
+if the coins are $\{1,3,4\}$ and the target sum
+is 6, the greedy algorithm produces the solution
+$4+1+1$ while the optimal solution is $3+3$.
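The counterexample can also be checked mechanically. The following sketch (our own illustration, with made-up names such as \texttt{greedy\_coins}) compares the greedy choice with a brute-force optimum for the coins $\{1,3,4\}$:

\begin{lstlisting}
#include <algorithm>
#include <cassert>
#include <vector>
using namespace std;

// Greedy strategy: repeatedly take the largest coin that fits.
int greedy_coins(vector<int> coins, int n) {
  sort(coins.rbegin(), coins.rend());
  int used = 0;
  for (int c : coins) {
    while (n >= c) { n -= c; used++; }
  }
  return used;
}

// Brute force: try every coin as the first one (fine for tiny sums).
int optimal_coins(const vector<int>& coins, int n) {
  if (n == 0) return 0;
  int best = 1000000000;
  for (int c : coins) {
    if (c <= n) best = min(best, 1 + optimal_coins(coins, n - c));
  }
  return best;
}

int main() {
  vector<int> coins = {1, 3, 4};
  assert(greedy_coins(coins, 6) == 3);   // greedy picks 4+1+1
  assert(optimal_coins(coins, 6) == 2);  // optimum is 3+3
}
\end{lstlisting}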
+ +It is not known if the general coin problem +can be solved using any greedy algorithm\footnote{However, it is possible +to \emph{check} in polynomial time +if the greedy algorithm presented in this chapter works for +a given set of coins \cite{pea05}.}. +However, as we will see in Chapter 7, +in some cases, +the general problem can be efficiently +solved using a dynamic +programming algorithm that always gives the +correct answer. + +\section{Scheduling} + +Many scheduling problems can be solved +using greedy algorithms. +A classic problem is as follows: +Given $n$ events with their starting and ending +times, find a schedule +that includes as many events as possible. +It is not possible to select an event partially. +For example, consider the following events: +\begin{center} +\begin{tabular}{lll} +event & starting time & ending time \\ +\hline +$A$ & 1 & 3 \\ +$B$ & 2 & 5 \\ +$C$ & 3 & 9 \\ +$D$ & 6 & 8 \\ +\end{tabular} +\end{center} +In this case the maximum number of events is two. +For example, we can select events $B$ and $D$ +as follows: +\begin{center} +\begin{tikzpicture}[scale=.4] + \begin{scope} + \draw (2, 0) rectangle (6, -1); + \draw[fill=lightgray] (4, -1.5) rectangle (10, -2.5); + \draw (6, -3) rectangle (18, -4); + \draw[fill=lightgray] (12, -4.5) rectangle (16, -5.5); + \node at (2.5,-0.5) {$A$}; + \node at (4.5,-2) {$B$}; + \node at (6.5,-3.5) {$C$}; + \node at (12.5,-5) {$D$}; + \end{scope} +\end{tikzpicture} +\end{center} + +It is possible to invent several greedy algorithms +for the problem, but which of them works in every case? + +\subsubsection*{Algorithm 1} + +The first idea is to select as \emph{short} +events as possible. 
+In the example case this algorithm +selects the following events: +\begin{center} +\begin{tikzpicture}[scale=.4] + \begin{scope} + \draw[fill=lightgray] (2, 0) rectangle (6, -1); + \draw (4, -1.5) rectangle (10, -2.5); + \draw (6, -3) rectangle (18, -4); + \draw[fill=lightgray] (12, -4.5) rectangle (16, -5.5); + \node at (2.5,-0.5) {$A$}; + \node at (4.5,-2) {$B$}; + \node at (6.5,-3.5) {$C$}; + \node at (12.5,-5) {$D$}; + \end{scope} +\end{tikzpicture} +\end{center} + +However, selecting short events is not always +a correct strategy. For example, the algorithm fails +in the following case: +\begin{center} +\begin{tikzpicture}[scale=.4] + \begin{scope} + \draw (1, 0) rectangle (7, -1); + \draw[fill=lightgray] (6, -1.5) rectangle (9, -2.5); + \draw (8, -3) rectangle (14, -4); + \end{scope} +\end{tikzpicture} +\end{center} +If we select the short event, we can only select one event. +However, it would be possible to select both long events. + +\subsubsection*{Algorithm 2} + +Another idea is to always select the next possible +event that \emph{begins} as \emph{early} as possible. +This algorithm selects the following events: +\begin{center} +\begin{tikzpicture}[scale=.4] + \begin{scope} + \draw[fill=lightgray] (2, 0) rectangle (6, -1); + \draw (4, -1.5) rectangle (10, -2.5); + \draw[fill=lightgray] (6, -3) rectangle (18, -4); + \draw (12, -4.5) rectangle (16, -5.5); + \node at (2.5,-0.5) {$A$}; + \node at (4.5,-2) {$B$}; + \node at (6.5,-3.5) {$C$}; + \node at (12.5,-5) {$D$}; + \end{scope} +\end{tikzpicture} +\end{center} + +However, we can find a counterexample +also for this algorithm. 
+For example, in the following case, +the algorithm only selects one event: +\begin{center} +\begin{tikzpicture}[scale=.4] + \begin{scope} + \draw[fill=lightgray] (1, 0) rectangle (14, -1); + \draw (3, -1.5) rectangle (7, -2.5); + \draw (8, -3) rectangle (12, -4); + \end{scope} +\end{tikzpicture} +\end{center} +If we select the first event, it is not possible +to select any other events. +However, it would be possible to select the +other two events. + +\subsubsection*{Algorithm 3} + +The third idea is to always select the next +possible event that \emph{ends} as \emph{early} as possible. +This algorithm selects the following events: +\begin{center} +\begin{tikzpicture}[scale=.4] + \begin{scope} + \draw[fill=lightgray] (2, 0) rectangle (6, -1); + \draw (4, -1.5) rectangle (10, -2.5); + \draw (6, -3) rectangle (18, -4); + \draw[fill=lightgray] (12, -4.5) rectangle (16, -5.5); + \node at (2.5,-0.5) {$A$}; + \node at (4.5,-2) {$B$}; + \node at (6.5,-3.5) {$C$}; + \node at (12.5,-5) {$D$}; + \end{scope} +\end{tikzpicture} +\end{center} + +It turns out that this algorithm +\emph{always} produces an optimal solution. +The reason for this is that it is always an optimal choice +to first select an event that ends +as early as possible. +After this, it is an optimal choice +to select the next event +using the same strategy, etc., +until we cannot select any more events. + +One way to argue that the algorithm works +is to consider +what happens if we first select an event +that ends later than the event that ends +as early as possible. +Now, we will have at most an equal number of +choices how we can select the next event. +Hence, selecting an event that ends later +can never yield a better solution, +and the greedy algorithm is correct. + +\section{Tasks and deadlines} + +Let us now consider a problem where +we are given $n$ tasks with durations and deadlines +and our task is to choose an order to perform the tasks. 
+For each task, we earn $d-x$ points +where $d$ is the task's deadline +and $x$ is the moment when we finish the task. +What is the largest possible total score +we can obtain? + +For example, suppose that the tasks are as follows: +\begin{center} +\begin{tabular}{lll} +task & duration & deadline \\ +\hline +$A$ & 4 & 2 \\ +$B$ & 3 & 5 \\ +$C$ & 2 & 7 \\ +$D$ & 4 & 5 \\ +\end{tabular} +\end{center} +In this case, an optimal schedule for the tasks +is as follows: +\begin{center} +\begin{tikzpicture}[scale=.4] + \begin{scope} + \draw (0, 0) rectangle (4, -1); + \draw (4, 0) rectangle (10, -1); + \draw (10, 0) rectangle (18, -1); + \draw (18, 0) rectangle (26, -1); + \node at (0.5,-0.5) {$C$}; + \node at (4.5,-0.5) {$B$}; + \node at (10.5,-0.5) {$A$}; + \node at (18.5,-0.5) {$D$}; + + \draw (0,1.5) -- (26,1.5); + \foreach \i in {0,2,...,26} + { + \draw (\i,1.25) -- (\i,1.75); + } + \footnotesize + \node at (0,2.5) {0}; + \node at (10,2.5) {5}; + \node at (20,2.5) {10}; + + \end{scope} +\end{tikzpicture} +\end{center} +In this solution, $C$ yields 5 points, +$B$ yields 0 points, $A$ yields $-7$ points +and $D$ yields $-8$ points, +so the total score is $-10$. + +Surprisingly, the optimal solution to the problem +does not depend on the deadlines at all, +but a correct greedy strategy is to simply +perform the tasks \emph{sorted by their durations} +in increasing order. +The reason for this is that if we ever perform +two tasks one after another such that the first task +takes longer than the second task, +we can obtain a better solution if we swap the tasks. 
+For example, consider the following schedule: +\begin{center} +\begin{tikzpicture}[scale=.4] + \begin{scope} + \draw (0, 0) rectangle (8, -1); + \draw (8, 0) rectangle (12, -1); + \node at (0.5,-0.5) {$X$}; + \node at (8.5,-0.5) {$Y$}; + +\draw [decoration={brace}, decorate, line width=0.3mm] (7.75,-1.5) -- (0.25,-1.5); +\draw [decoration={brace}, decorate, line width=0.3mm] (11.75,-1.5) -- (8.25,-1.5); + +\footnotesize +\node at (4,-2.5) {$a$}; +\node at (10,-2.5) {$b$}; + + \end{scope} +\end{tikzpicture} +\end{center} +Here $a>b$, so we should swap the tasks: +\begin{center} +\begin{tikzpicture}[scale=.4] + \begin{scope} + \draw (0, 0) rectangle (4, -1); + \draw (4, 0) rectangle (12, -1); + \node at (0.5,-0.5) {$Y$}; + \node at (4.5,-0.5) {$X$}; + +\draw [decoration={brace}, decorate, line width=0.3mm] (3.75,-1.5) -- (0.25,-1.5); +\draw [decoration={brace}, decorate, line width=0.3mm] (11.75,-1.5) -- (4.25,-1.5); + +\footnotesize +\node at (2,-2.5) {$b$}; +\node at (8,-2.5) {$a$}; + + \end{scope} +\end{tikzpicture} +\end{center} +Now $X$ gives $b$ points less and $Y$ gives $a$ points more, +so the total score increases by $a-b > 0$. +In an optimal solution, +for any two consecutive tasks, +it must hold that the shorter task comes +before the longer task. +Thus, the tasks must be performed +sorted by their durations. + +\section{Minimizing sums} + +We next consider a problem where +we are given $n$ numbers $a_1,a_2,\ldots,a_n$ +and our task is to find a value $x$ +that minimizes the sum +\[|a_1-x|^c+|a_2-x|^c+\cdots+|a_n-x|^c.\] +We focus on the cases $c=1$ and $c=2$. + +\subsubsection{Case $c=1$} + +In this case, we should minimize the sum +\[|a_1-x|+|a_2-x|+\cdots+|a_n-x|.\] +For example, if the numbers are $[1,2,9,2,6]$, +the best solution is to select $x=2$ +which produces the sum +\[ +|1-2|+|2-2|+|9-2|+|2-2|+|6-2|=12. +\] +In the general case, the best choice for $x$ +is the \textit{median} of the numbers, +i.e., the middle number after sorting. 
+For example, the list $[1,2,9,2,6]$
+becomes $[1,2,2,6,9]$ after sorting,
+so the median is 2.
+
+The median is an optimal choice,
+because if $x$ is smaller than the median,
+the sum becomes smaller by increasing $x$,
+and if $x$ is larger than the median,
+the sum becomes smaller by decreasing $x$.
+Hence, the optimal solution is that $x$
+is the median.
+If $n$ is even and there are two medians,
+both medians and all values between them
+are optimal choices.
+
+\subsubsection{Case $c=2$}
+
+In this case, we should minimize the sum
+\[(a_1-x)^2+(a_2-x)^2+\cdots+(a_n-x)^2.\]
+For example, if the numbers are $[1,2,9,2,6]$,
+the best solution is to select $x=4$
+which produces the sum
+\[
+(1-4)^2+(2-4)^2+(9-4)^2+(2-4)^2+(6-4)^2=46.
+\]
+In the general case, the best choice for $x$
+is the \emph{average} of the numbers.
+In the example the average is $(1+2+9+2+6)/5=4$.
+This result can be derived by presenting
+the sum as follows:
+\[
+nx^2 - 2x(a_1+a_2+\cdots+a_n) + (a_1^2+a_2^2+\cdots+a_n^2)
+\]
+The last part does not depend on $x$,
+so we can ignore it.
+The remaining parts form a function
+$nx^2-2xs$ where $s=a_1+a_2+\cdots+a_n$.
+This is a parabola opening upwards
+with roots $x=0$ and $x=2s/n$,
+and the minimum value is the average
+of the roots $x=s/n$, i.e.,
+the average of the numbers $a_1,a_2,\ldots,a_n$.
+
+\section{Data compression}
+
+\index{data compression}
+\index{binary code}
+\index{codeword}
+
+A \key{binary code} assigns to each character
+of a string a \key{codeword} that consists of bits.
+We can \emph{compress} the string using the binary code
+by replacing each character by the
+corresponding codeword.
+For example, the following binary code +assigns codewords for characters +\texttt{A}–\texttt{D}: +\begin{center} +\begin{tabular}{rr} +character & codeword \\ +\hline +\texttt{A} & 00 \\ +\texttt{B} & 01 \\ +\texttt{C} & 10 \\ +\texttt{D} & 11 \\ +\end{tabular} +\end{center} +This is a \key{constant-length} code +which means that the length of each +codeword is the same. +For example, we can compress the string +\texttt{AABACDACA} as follows: +\[00\,00\,01\,00\,10\,11\,00\,10\,00\] +Using this code, the length of the compressed +string is 18 bits. +However, we can compress the string better +if we use a \key{variable-length} code +where codewords may have different lengths. +Then we can give short codewords for +characters that appear often +and long codewords for characters +that appear rarely. +It turns out that an \key{optimal} code +for the above string is as follows: +\begin{center} +\begin{tabular}{rr} +character & codeword \\ +\hline +\texttt{A} & 0 \\ +\texttt{B} & 110 \\ +\texttt{C} & 10 \\ +\texttt{D} & 111 \\ +\end{tabular} +\end{center} +An optimal code produces a compressed string +that is as short as possible. +In this case, the compressed string using +the optimal code is +\[0\,0\,110\,0\,10\,111\,0\,10\,0,\] +so only 15 bits are needed instead of 18 bits. +Thus, thanks to a better code it was possible to +save 3 bits in the compressed string. + +We require that no codeword +is a prefix of another codeword. +For example, it is not allowed that a code +would contain both codewords 10 +and 1011. +The reason for this is that we want +to be able to generate the original string +from the compressed string. +If a codeword could be a prefix of another codeword, +this would not always be possible. 
+For example, the following code is \emph{not} valid: +\begin{center} +\begin{tabular}{rr} +character & codeword \\ +\hline +\texttt{A} & 10 \\ +\texttt{B} & 11 \\ +\texttt{C} & 1011 \\ +\texttt{D} & 111 \\ +\end{tabular} +\end{center} +Using this code, it would not be possible to know +if the compressed string 1011 corresponds to +the string \texttt{AB} or the string \texttt{C}. + +\index{Huffman coding} + +\subsubsection{Huffman coding} + +\key{Huffman coding}\footnote{D. A. Huffman discovered this method +when solving a university course assignment +and published the algorithm in 1952 \cite{huf52}.} is a greedy algorithm +that constructs an optimal code for +compressing a given string. +The algorithm builds a binary tree +based on the frequencies of the characters +in the string, +and each character's codeword can be read +by following a path from the root to +the corresponding node. +A move to the left corresponds to bit 0, +and a move to the right corresponds to bit 1. + +Initially, each character of the string is +represented by a node whose weight is the +number of times the character occurs in the string. +Then at each step two nodes with minimum weights +are combined by creating +a new node whose weight is the sum of the weights +of the original nodes. +The process continues until all nodes have been combined. + +Next we will see how Huffman coding creates +the optimal code for the string +\texttt{AABACDACA}. 
+Initially, there are four nodes that correspond +to the characters of the string: + +\begin{center} +\begin{tikzpicture}[scale=0.9] +\node[draw, circle] (1) at (0,0) {$5$}; +\node[draw, circle] (2) at (2,0) {$1$}; +\node[draw, circle] (3) at (4,0) {$2$}; +\node[draw, circle] (4) at (6,0) {$1$}; + +\node[color=blue] at (0,-0.75) {\texttt{A}}; +\node[color=blue] at (2,-0.75) {\texttt{B}}; +\node[color=blue] at (4,-0.75) {\texttt{C}}; +\node[color=blue] at (6,-0.75) {\texttt{D}}; + +%\path[draw,thick,-] (4) -- (5); +\end{tikzpicture} +\end{center} +The node that represents character \texttt{A} +has weight 5 because character \texttt{A} +appears 5 times in the string. +The other weights have been calculated +in the same way. + +The first step is to combine the nodes that +correspond to characters \texttt{B} and \texttt{D}, +both with weight 1. +The result is: +\begin{center} +\begin{tikzpicture}[scale=0.9] +\node[draw, circle] (1) at (0,0) {$5$}; +\node[draw, circle] (3) at (2,0) {$2$}; +\node[draw, circle] (2) at (4,0) {$1$}; +\node[draw, circle] (4) at (6,0) {$1$}; +\node[draw, circle] (5) at (5,1) {$2$}; + +\node[color=blue] at (0,-0.75) {\texttt{A}}; +\node[color=blue] at (2,-0.75) {\texttt{C}}; +\node[color=blue] at (4,-0.75) {\texttt{B}}; +\node[color=blue] at (6,-0.75) {\texttt{D}}; + +\node at (4.3,0.7) {0}; +\node at (5.7,0.7) {1}; + +\path[draw,thick,-] (2) -- (5); +\path[draw,thick,-] (4) -- (5); +\end{tikzpicture} +\end{center} +After this, the nodes with weight 2 are combined: +\begin{center} +\begin{tikzpicture}[scale=0.9] +\node[draw, circle] (1) at (1,0) {$5$}; +\node[draw, circle] (3) at (3,1) {$2$}; +\node[draw, circle] (2) at (4,0) {$1$}; +\node[draw, circle] (4) at (6,0) {$1$}; +\node[draw, circle] (5) at (5,1) {$2$}; +\node[draw, circle] (6) at (4,2) {$4$}; + +\node[color=blue] at (1,-0.75) {\texttt{A}}; +\node[color=blue] at (3,1-0.75) {\texttt{C}}; +\node[color=blue] at (4,-0.75) {\texttt{B}}; +\node[color=blue] at (6,-0.75) {\texttt{D}}; + 
+\node at (4.3,0.7) {0}; +\node at (5.7,0.7) {1}; +\node at (3.3,1.7) {0}; +\node at (4.7,1.7) {1}; + +\path[draw,thick,-] (2) -- (5); +\path[draw,thick,-] (4) -- (5); +\path[draw,thick,-] (3) -- (6); +\path[draw,thick,-] (5) -- (6); +\end{tikzpicture} +\end{center} +Finally, the two remaining nodes are combined: +\begin{center} +\begin{tikzpicture}[scale=0.9] +\node[draw, circle] (1) at (2,2) {$5$}; +\node[draw, circle] (3) at (3,1) {$2$}; +\node[draw, circle] (2) at (4,0) {$1$}; +\node[draw, circle] (4) at (6,0) {$1$}; +\node[draw, circle] (5) at (5,1) {$2$}; +\node[draw, circle] (6) at (4,2) {$4$}; +\node[draw, circle] (7) at (3,3) {$9$}; + +\node[color=blue] at (2,2-0.75) {\texttt{A}}; +\node[color=blue] at (3,1-0.75) {\texttt{C}}; +\node[color=blue] at (4,-0.75) {\texttt{B}}; +\node[color=blue] at (6,-0.75) {\texttt{D}}; + +\node at (4.3,0.7) {0}; +\node at (5.7,0.7) {1}; +\node at (3.3,1.7) {0}; +\node at (4.7,1.7) {1}; +\node at (2.3,2.7) {0}; +\node at (3.7,2.7) {1}; + +\path[draw,thick,-] (2) -- (5); +\path[draw,thick,-] (4) -- (5); +\path[draw,thick,-] (3) -- (6); +\path[draw,thick,-] (5) -- (6); +\path[draw,thick,-] (1) -- (7); +\path[draw,thick,-] (6) -- (7); +\end{tikzpicture} +\end{center} + +Now all nodes are in the tree, so the code is ready. +The following codewords can be read from the tree: +\begin{center} +\begin{tabular}{rr} +character & codeword \\ +\hline +\texttt{A} & 0 \\ +\texttt{B} & 110 \\ +\texttt{C} & 10 \\ +\texttt{D} & 111 \\ +\end{tabular} +\end{center} diff --git a/chapter07.tex b/chapter07.tex new file mode 100644 index 0000000..70fd873 --- /dev/null +++ b/chapter07.tex @@ -0,0 +1,1049 @@ +\chapter{Dynamic programming} + +\index{dynamic programming} + +\key{Dynamic programming} +is a technique that combines the correctness +of complete search and the efficiency +of greedy algorithms. +Dynamic programming can be applied if the +problem can be divided into overlapping subproblems +that can be solved independently. 

There are two uses for dynamic programming:

\begin{itemize}
\item
\key{Finding an optimal solution}:
We want to find a solution that is
as large as possible or as small as possible.
\item
\key{Counting the number of solutions}:
We want to calculate the total number of
possible solutions.
\end{itemize}

We will first see how dynamic programming can
be used to find an optimal solution,
and then we will use the same idea for
counting the solutions.

Understanding dynamic programming is a milestone
in every competitive programmer's career.
While the basic idea is simple,
the challenge is how to apply
dynamic programming to different problems.
This chapter introduces a set of classic problems
that are a good starting point.

\section{Coin problem}

We first focus on a problem that we
have already seen in Chapter 6:
Given a set of coin values $\texttt{coins} = \{c_1,c_2,\ldots,c_k\}$
and a target sum of money $n$, our task is to
form the sum $n$ using as few coins as possible.

In Chapter 6, we solved the problem using a
greedy algorithm that always chooses the largest
possible coin.
The greedy algorithm works, for example,
when the coins are the euro coins,
but in the general case the greedy algorithm
does not necessarily produce an optimal solution.

It is now time to solve the problem efficiently
using dynamic programming, so that the algorithm
works for any coin set.
The dynamic programming
algorithm is based on a recursive function
that goes through all the ways to
form the sum, like a brute force algorithm.
However, the dynamic programming
algorithm is efficient because
it uses \emph{memoization} and
calculates the answer to each subproblem only once.

\subsubsection{Recursive formulation}

The idea in dynamic programming is to
formulate the problem recursively so
that the solution to the problem can be
calculated from solutions to smaller
subproblems.
+In the coin problem, a natural recursive +problem is as follows: +what is the smallest number of coins +required to form a sum $x$? + +Let $\texttt{solve}(x)$ +denote the minimum +number of coins required for a sum $x$. +The values of the function depend on the +values of the coins. +For example, if $\texttt{coins} = \{1,3,4\}$, +the first values of the function are as follows: + +\[ +\begin{array}{lcl} +\texttt{solve}(0) & = & 0 \\ +\texttt{solve}(1) & = & 1 \\ +\texttt{solve}(2) & = & 2 \\ +\texttt{solve}(3) & = & 1 \\ +\texttt{solve}(4) & = & 1 \\ +\texttt{solve}(5) & = & 2 \\ +\texttt{solve}(6) & = & 2 \\ +\texttt{solve}(7) & = & 2 \\ +\texttt{solve}(8) & = & 2 \\ +\texttt{solve}(9) & = & 3 \\ +\texttt{solve}(10) & = & 3 \\ +\end{array} +\] + +For example, $\texttt{solve}(10)=3$, +because at least 3 coins are needed +to form the sum 10. +The optimal solution is $3+3+4=10$. + +The essential property of $\texttt{solve}$ is +that its values can be +recursively calculated from its smaller values. +The idea is to focus on the \emph{first} +coin that we choose for the sum. +For example, in the above scenario, +the first coin can be either 1, 3 or 4. +If we first choose coin 1, +the remaining task is to form the sum 9 +using the minimum number of coins, +which is a subproblem of the original problem. +Of course, the same applies to coins 3 and 4. +Thus, we can use the following recursive formula +to calculate the minimum number of coins: +\begin{equation*} +\begin{split} +\texttt{solve}(x) = \min( & \texttt{solve}(x-1)+1, \\ + & \texttt{solve}(x-3)+1, \\ + & \texttt{solve}(x-4)+1). +\end{split} +\end{equation*} +The base case of the recursion is $\texttt{solve}(0)=0$, +because no coins are needed to form an empty sum. 
+For example, +\[ \texttt{solve}(10) = \texttt{solve}(7)+1 = \texttt{solve}(4)+2 = \texttt{solve}(0)+3 = 3.\] + +Now we are ready to give a general recursive function +that calculates the minimum number of +coins needed to form a sum $x$: +\begin{equation*} + \texttt{solve}(x) = \begin{cases} + \infty & x < 0\\ + 0 & x = 0\\ + \min_{c \in \texttt{coins}} \texttt{solve}(x-c)+1 & x > 0 \\ + \end{cases} +\end{equation*} + +First, if $x<0$, the value is $\infty$, +because it is impossible to form a negative +sum of money. +Then, if $x=0$, the value is $0$, +because no coins are needed to form an empty sum. +Finally, if $x>0$, the variable $c$ goes through +all possibilities how to choose the first coin +of the sum. + +Once a recursive function that solves the problem +has been found, +we can directly implement a solution in C++ +(the constant \texttt{INF} denotes infinity): + +\begin{lstlisting} +int solve(int x) { + if (x < 0) return INF; + if (x == 0) return 0; + int best = INF; + for (auto c : coins) { + best = min(best, solve(x-c)+1); + } + return best; +} +\end{lstlisting} + +Still, this function is not efficient, +because there may be an exponential number of ways +to construct the sum. +However, next we will see how to make the +function efficient using a technique called memoization. + +\subsubsection{Using memoization} + +\index{memoization} + +The idea of dynamic programming is to use +\key{memoization} to efficiently calculate +values of a recursive function. +This means that the values of the function +are stored in an array after calculating them. +For each parameter, the value of the function +is calculated recursively only once, and after this, +the value can be directly retrieved from the array. + +In this problem, we use arrays +\begin{lstlisting} +bool ready[N]; +int value[N]; +\end{lstlisting} + +where $\texttt{ready}[x]$ indicates +whether the value of $\texttt{solve}(x)$ has been calculated, +and if it is, $\texttt{value}[x]$ +contains this value. 
The constant $N$ has been chosen so
that all required values fit in the arrays.

Now the function can be efficiently
implemented as follows:

\begin{lstlisting}
int solve(int x) {
    if (x < 0) return INF;
    if (x == 0) return 0;
    if (ready[x]) return value[x];
    int best = INF;
    for (auto c : coins) {
        best = min(best, solve(x-c)+1);
    }
    value[x] = best;
    ready[x] = true;
    return best;
}
\end{lstlisting}

The function handles the base cases
$x<0$ and $x=0$ as previously.
Then the function checks from
$\texttt{ready}[x]$ whether the value of
$\texttt{solve}(x)$ has already been stored
in $\texttt{value}[x]$,
and if it has, the function directly returns it.
Otherwise the function calculates the value
of $\texttt{solve}(x)$
recursively and stores it in $\texttt{value}[x]$.

This function works efficiently,
because the answer for each parameter $x$
is calculated recursively only once.
After a value of $\texttt{solve}(x)$ has been stored in $\texttt{value}[x]$,
it can be efficiently retrieved whenever the
function is called again with the parameter $x$.
The time complexity of the algorithm is $O(nk)$,
where $n$ is the target sum and $k$ is the number of coins.

Note that we can also \emph{iteratively}
construct the array \texttt{value} using
a loop that simply calculates all the values
of $\texttt{solve}$ for parameters $0 \ldots n$:
\begin{lstlisting}
value[0] = 0;
for (int x = 1; x <= n; x++) {
    value[x] = INF;
    for (auto c : coins) {
        if (x-c >= 0) {
            value[x] = min(value[x], value[x-c]+1);
        }
    }
}
\end{lstlisting}

In fact, most competitive programmers prefer this
implementation, because it is shorter and has
lower constant factors.
From now on, we also use iterative implementations
in our examples.
Still, it is often easier to think about
dynamic programming solutions
in terms of recursive functions.
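To tie the pieces together, the iterative computation can be packaged into a self-contained function (an illustrative sketch, not the book's code; the function name is ours). With $\texttt{coins}=\{1,3,4\}$ it reproduces the table above, e.g. a minimum of 3 coins for the sum 10:

```cpp
#include <bits/stdc++.h>
using namespace std;

// minimum number of coins needed for the sum n, or -1 if impossible
int min_coins(const vector<int>& coins, int n) {
    const int INF = 1e9;
    vector<int> value(n + 1, INF);
    value[0] = 0; // no coins are needed for an empty sum
    for (int x = 1; x <= n; x++) {
        for (int c : coins) {
            if (x - c >= 0) value[x] = min(value[x], value[x - c] + 1);
        }
    }
    return value[n] >= INF ? -1 : value[n];
}
```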
+ + +\subsubsection{Constructing a solution} + +Sometimes we are asked both to find the value +of an optimal solution and to give +an example how such a solution can be constructed. +In the coin problem, for example, +we can declare another array +that indicates for +each sum of money the first coin +in an optimal solution: +\begin{lstlisting} +int first[N]; +\end{lstlisting} +Then, we can modify the algorithm as follows: +\begin{lstlisting} +value[0] = 0; +for (int x = 1; x <= n; x++) { + value[x] = INF; + for (auto c : coins) { + if (x-c >= 0 && value[x-c]+1 < value[x]) { + value[x] = value[x-c]+1; + first[x] = c; + } + } +} +\end{lstlisting} +After this, the following code can be used to +print the coins that appear in an optimal solution for +the sum $n$: +\begin{lstlisting} +while (n > 0) { + cout << first[n] << "\n"; + n -= first[n]; +} +\end{lstlisting} + +\subsubsection{Counting the number of solutions} + +Let us now consider another version +of the coin problem where our task is to +calculate the total number of ways +to produce a sum $x$ using the coins. +For example, if $\texttt{coins}=\{1,3,4\}$ and +$x=5$, there are a total of 6 ways: + +\begin{multicols}{2} +\begin{itemize} +\item $1+1+1+1+1$ +\item $1+1+3$ +\item $1+3+1$ +\item $3+1+1$ +\item $1+4$ +\item $4+1$ +\end{itemize} +\end{multicols} + +Again, we can solve the problem recursively. +Let $\texttt{solve}(x)$ denote the number of ways +we can form the sum $x$. +For example, if $\texttt{coins}=\{1,3,4\}$, +then $\texttt{solve}(5)=6$ and the recursive formula is +\begin{equation*} +\begin{split} +\texttt{solve}(x) = & \texttt{solve}(x-1) + \\ + & \texttt{solve}(x-3) + \\ + & \texttt{solve}(x-4) . 
+\end{split} +\end{equation*} + +Then, the general recursive function is as follows: +\begin{equation*} + \texttt{solve}(x) = \begin{cases} + 0 & x < 0\\ + 1 & x = 0\\ + \sum_{c \in \texttt{coins}} \texttt{solve}(x-c) & x > 0 \\ + \end{cases} +\end{equation*} + +If $x<0$, the value is 0, because there are no solutions. +If $x=0$, the value is 1, because there is only one way +to form an empty sum. +Otherwise we calculate the sum of all values +of the form $\texttt{solve}(x-c)$ where $c$ is in \texttt{coins}. + +The following code constructs an array +$\texttt{count}$ such that +$\texttt{count}[x]$ equals +the value of $\texttt{solve}(x)$ +for $0 \le x \le n$: + +\begin{lstlisting} +count[0] = 1; +for (int x = 1; x <= n; x++) { + for (auto c : coins) { + if (x-c >= 0) { + count[x] += count[x-c]; + } + } +} +\end{lstlisting} + +Often the number of solutions is so large +that it is not required to calculate the exact number +but it is enough to give the answer modulo $m$ +where, for example, $m=10^9+7$. +This can be done by changing the code so that +all calculations are done modulo $m$. +In the above code, it suffices to add the line +\begin{lstlisting} + count[x] %= m; +\end{lstlisting} +after the line +\begin{lstlisting} + count[x] += count[x-c]; +\end{lstlisting} + +Now we have discussed all basic +ideas of dynamic programming. +Since dynamic programming can be used +in many different situations, +we will now go through a set of problems +that show further examples about the +possibilities of dynamic programming. + +\section{Longest increasing subsequence} + +\index{longest increasing subsequence} + +Our first problem is to find the +\key{longest increasing subsequence} +in an array of $n$ elements. +This is a maximum-length +sequence of array elements +that goes from left to right, +and each element in the sequence is larger +than the previous element. 
+For example, in the array + +\begin{center} +\begin{tikzpicture}[scale=0.7] +\draw (0,0) grid (8,1); +\node at (0.5,0.5) {$6$}; +\node at (1.5,0.5) {$2$}; +\node at (2.5,0.5) {$5$}; +\node at (3.5,0.5) {$1$}; +\node at (4.5,0.5) {$7$}; +\node at (5.5,0.5) {$4$}; +\node at (6.5,0.5) {$8$}; +\node at (7.5,0.5) {$3$}; + +\footnotesize +\node at (0.5,1.4) {$0$}; +\node at (1.5,1.4) {$1$}; +\node at (2.5,1.4) {$2$}; +\node at (3.5,1.4) {$3$}; +\node at (4.5,1.4) {$4$}; +\node at (5.5,1.4) {$5$}; +\node at (6.5,1.4) {$6$}; +\node at (7.5,1.4) {$7$}; +\end{tikzpicture} +\end{center} +the longest increasing subsequence +contains 4 elements: +\begin{center} +\begin{tikzpicture}[scale=0.7] +\fill[color=lightgray] (1,0) rectangle (2,1); +\fill[color=lightgray] (2,0) rectangle (3,1); +\fill[color=lightgray] (4,0) rectangle (5,1); +\fill[color=lightgray] (6,0) rectangle (7,1); +\draw (0,0) grid (8,1); +\node at (0.5,0.5) {$6$}; +\node at (1.5,0.5) {$2$}; +\node at (2.5,0.5) {$5$}; +\node at (3.5,0.5) {$1$}; +\node at (4.5,0.5) {$7$}; +\node at (5.5,0.5) {$4$}; +\node at (6.5,0.5) {$8$}; +\node at (7.5,0.5) {$3$}; + +\draw[thick,->] (1.5,-0.25) .. controls (1.75,-1.00) and (2.25,-1.00) .. (2.4,-0.25); +\draw[thick,->] (2.6,-0.25) .. controls (3.0,-1.00) and (4.0,-1.00) .. (4.4,-0.25); +\draw[thick,->] (4.6,-0.25) .. controls (5.0,-1.00) and (6.0,-1.00) .. (6.5,-0.25); + +\footnotesize +\node at (0.5,1.4) {$0$}; +\node at (1.5,1.4) {$1$}; +\node at (2.5,1.4) {$2$}; +\node at (3.5,1.4) {$3$}; +\node at (4.5,1.4) {$4$}; +\node at (5.5,1.4) {$5$}; +\node at (6.5,1.4) {$6$}; +\node at (7.5,1.4) {$7$}; +\end{tikzpicture} +\end{center} + +Let $\texttt{length}(k)$ denote +the length of the +longest increasing subsequence +that ends at position $k$. +Thus, if we calculate all values of +$\texttt{length}(k)$ where $0 \le k \le n-1$, +we will find out the length of the +longest increasing subsequence. 
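As a side note (not part of the book's presentation at this point), the \emph{length} of the longest increasing subsequence can also be found in $O(n \log n)$ time with binary search. The following illustrative sketch maintains, for each length, the smallest possible tail value of an increasing subsequence of that length:

```cpp
#include <bits/stdc++.h>
using namespace std;

// length of the longest strictly increasing subsequence
int lis_length(const vector<int>& a) {
    vector<int> tail; // tail[i] = smallest tail of an increasing
                      // subsequence of length i+1 seen so far
    for (int x : a) {
        auto it = lower_bound(tail.begin(), tail.end(), x);
        if (it == tail.end()) tail.push_back(x); // extend the longest one
        else *it = x; // found a smaller tail for this length
    }
    return (int)tail.size();
}
```

For the array $[6,2,5,1,7,4,8,3]$ above this returns 4.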
For example, the values of the function
for the above array are as follows:
\[
\begin{array}{lcl}
\texttt{length}(0) & = & 1 \\
\texttt{length}(1) & = & 1 \\
\texttt{length}(2) & = & 2 \\
\texttt{length}(3) & = & 1 \\
\texttt{length}(4) & = & 3 \\
\texttt{length}(5) & = & 2 \\
\texttt{length}(6) & = & 4 \\
\texttt{length}(7) & = & 2 \\
\end{array}
\]

For example, $\texttt{length}(6)=4$,
because the longest increasing subsequence
that ends at position 6 consists of 4 elements.

To calculate a value of $\texttt{length}(k)$,
we should find a position $i<k$
such that $\texttt{array}[i]<\texttt{array}[k]$
and $\texttt{length}(i)$ is as large as possible.
Then we know that
$\texttt{length}(k)=\texttt{length}(i)+1$,
because this is an optimal way to add
$\texttt{array}[k]$ to a subsequence.
However, if there is no such position $i$,
then $\texttt{length}(k)=1$,
which means that the subsequence only contains
$\texttt{array}[k]$.

Since all values of the function can be calculated
from its smaller values, we can use dynamic programming.
In the following code, the values of the function
are stored in an array \texttt{length}:

\begin{lstlisting}
for (int k = 0; k < n; k++) {
    length[k] = 1;
    for (int i = 0; i < k; i++) {
        if (array[i] < array[k]) {
            length[k] = max(length[k], length[i]+1);
        }
    }
}
\end{lstlisting}

This code works in $O(n^2)$ time,
because it consists of two nested loops.
It is also possible to implement the
dynamic programming calculation more efficiently
in $O(n \log n)$ time using binary search.

\section{Paths in a grid}

Our next problem is to find a path
from the upper-left corner to
the lower-right corner
of an $n \times n$ grid, such that
we only move down and right.
Each square contains a positive integer,
and the path should be constructed so
that the sum of the values along
the path is as large as possible.

Let $\texttt{sum}(y,x)$ denote the maximum
sum on a path
from the upper-left corner
to square $(y,x)$.
Since the path can arrive at square $(y,x)$
either from square $(y,x-1)$ or square $(y-1,x)$,
the values of \texttt{sum} can be
recursively calculated as follows:
\[ \texttt{sum}(y,x) = \max(\texttt{sum}(y,x-1),\texttt{sum}(y-1,x))+\texttt{value}[y][x] \]
Assuming that the rows and columns are numbered
from $1$ to $n$, and $\texttt{sum}(y,x)=0$
when $y=0$ or $x=0$,
the following code calculates all the sums:
\begin{lstlisting}
for (int y = 1; y <= n; y++) {
    for (int x = 1; x <= n; x++) {
        sum[y][x] = max(sum[y][x-1],sum[y-1][x])+value[y][x];
    }
}
\end{lstlisting}
The time complexity of the algorithm is $O(n^2)$.

\section{Knapsack problems}

\index{knapsack}

The term \key{knapsack} refers to problems where
a set of objects is given, and
subsets with some properties
have to be found.
Knapsack problems can often be solved
using dynamic programming.

In this section, we focus on the following
problem: Given a list of weights
$[w_1,w_2,\ldots,w_n]$,
determine all sums that can be constructed
using the weights.
For example, if the weights are
$[1,3,3,5]$, all sums between $0$ and $12$
are possible, except $2$ and $10$.
For instance, the sum $7$ is possible because we
can select the weights $[1,3,3]$.

To solve the problem, we focus on subproblems
where we only use the first $k$ weights
to construct sums.
Let $\texttt{possible}(x,k)=\textrm{true}$ if
we can construct a sum $x$
using the first $k$ weights,
and otherwise $\texttt{possible}(x,k)=\textrm{false}$.
The values of the function can be recursively
calculated as follows:
\[ \texttt{possible}(x,k) = \texttt{possible}(x-w_k,k-1) \lor \texttt{possible}(x,k-1) \]
The formula is based on the fact that we can
either use or not use the weight $w_k$ in the sum.
If we use $w_k$, the remaining task is to
form the sum $x-w_k$ using the first $k-1$ weights,
and if we do not use $w_k$,
the remaining task is to form the sum $x$
using the first $k-1$ weights.
As the base cases,
\begin{equation*}
    \texttt{possible}(x,0) = \begin{cases}
               \textrm{true}    & x = 0\\
               \textrm{false}   & x \neq 0 \\
           \end{cases}
\end{equation*}
because if no weights are used,
we can only form the sum $0$.

Let $W$ denote the total sum of the weights.
The following $O(nW)$ time
dynamic programming solution
corresponds to the recursive function:
\begin{lstlisting}
possible[0][0] = true;
for (int k = 1; k <= n; k++) {
    for (int x = 0; x <= W; x++) {
        if (x-w[k] >= 0) possible[x][k] |= possible[x-w[k]][k-1];
        possible[x][k] |= possible[x][k-1];
    }
}
\end{lstlisting}

However, here is a better implementation that only uses
a one-dimensional array $\texttt{possible}[x]$
that indicates whether we can construct a subset with sum $x$.
The trick is to update the array from right to left for
each new weight:
\begin{lstlisting}
possible[0] = true;
for (int k = 1; k <= n; k++) {
    for (int x = W-w[k]; x >= 0; x--) {
        if (possible[x]) possible[x+w[k]] = true;
    }
}
\end{lstlisting}

Note that the general idea presented here can be used
in many knapsack problems.
For example, if we are given objects with weights and values,
we can determine for each weight sum the maximum value
sum of a subset.

\section{Edit distance}

\index{edit distance}
\index{Levenshtein distance}

The \key{edit distance} or \key{Levenshtein distance}\footnote{The distance
is named after V. I. Levenshtein who studied it in connection with binary codes \cite{lev66}.}
is the minimum number of editing operations
needed to transform a string
into another string.
The allowed editing operations are as follows:
\begin{itemize}
\item insert a character (e.g. \texttt{ABC} $\rightarrow$ \texttt{ABCA})
\item remove a character (e.g. \texttt{ABC} $\rightarrow$ \texttt{AC})
\item modify a character (e.g.
\texttt{ABC} $\rightarrow$ \texttt{ADC}) +\end{itemize} + +For example, the edit distance between +\texttt{LOVE} and \texttt{MOVIE} is 2, +because we can first perform the operation + \texttt{LOVE} $\rightarrow$ \texttt{MOVE} +(modify) and then the operation +\texttt{MOVE} $\rightarrow$ \texttt{MOVIE} +(insert). +This is the smallest possible number of operations, +because it is clear that only one operation is not enough. + +Suppose that we are given a string \texttt{x} +of length $n$ and a string \texttt{y} of length $m$, +and we want to calculate the edit distance between +\texttt{x} and \texttt{y}. +To solve the problem, we define a function +$\texttt{distance}(a,b)$ that gives the +edit distance between prefixes +$\texttt{x}[0 \ldots a]$ and $\texttt{y}[0 \ldots b]$. +Thus, using this function, the edit distance +between \texttt{x} and \texttt{y} equals $\texttt{distance}(n-1,m-1)$. + +We can calculate values of \texttt{distance} +as follows: +\begin{equation*} +\begin{split} +\texttt{distance}(a,b) = \min(& \texttt{distance}(a,b-1)+1, \\ + & \texttt{distance}(a-1,b)+1, \\ + & \texttt{distance}(a-1,b-1)+\texttt{cost}(a,b)). +\end{split} +\end{equation*} +Here $\texttt{cost}(a,b)=0$ if $\texttt{x}[a]=\texttt{y}[b]$, +and otherwise $\texttt{cost}(a,b)=1$. +The formula considers the following ways to +edit the string \texttt{x}: +\begin{itemize} +\item $\texttt{distance}(a,b-1)$: insert a character at the end of \texttt{x} +\item $\texttt{distance}(a-1,b)$: remove the last character from \texttt{x} +\item $\texttt{distance}(a-1,b-1)$: match or modify the last character of \texttt{x} +\end{itemize} +In the two first cases, one editing operation is needed +(insert or remove). +In the last case, if $\texttt{x}[a]=\texttt{y}[b]$, +we can match the last characters without editing, +and otherwise one editing operation is needed (modify). 
+ +The following table shows the values of \texttt{distance} +in the example case: +\begin{center} +\begin{tikzpicture}[scale=.65] + \begin{scope} + %\fill [color=lightgray] (5, -3) rectangle (6, -4); + \draw (1, -1) grid (7, -6); + + \node at (0.5,-2.5) {\texttt{L}}; + \node at (0.5,-3.5) {\texttt{O}}; + \node at (0.5,-4.5) {\texttt{V}}; + \node at (0.5,-5.5) {\texttt{E}}; + + \node at (2.5,-0.5) {\texttt{M}}; + \node at (3.5,-0.5) {\texttt{O}}; + \node at (4.5,-0.5) {\texttt{V}}; + \node at (5.5,-0.5) {\texttt{I}}; + \node at (6.5,-0.5) {\texttt{E}}; + + \node at (1.5,-1.5) {$0$}; + \node at (1.5,-2.5) {$1$}; + \node at (1.5,-3.5) {$2$}; + \node at (1.5,-4.5) {$3$}; + \node at (1.5,-5.5) {$4$}; + \node at (2.5,-1.5) {$1$}; + \node at (2.5,-2.5) {$1$}; + \node at (2.5,-3.5) {$2$}; + \node at (2.5,-4.5) {$3$}; + \node at (2.5,-5.5) {$4$}; + \node at (3.5,-1.5) {$2$}; + \node at (3.5,-2.5) {$2$}; + \node at (3.5,-3.5) {$1$}; + \node at (3.5,-4.5) {$2$}; + \node at (3.5,-5.5) {$3$}; + \node at (4.5,-1.5) {$3$}; + \node at (4.5,-2.5) {$3$}; + \node at (4.5,-3.5) {$2$}; + \node at (4.5,-4.5) {$1$}; + \node at (4.5,-5.5) {$2$}; + \node at (5.5,-1.5) {$4$}; + \node at (5.5,-2.5) {$4$}; + \node at (5.5,-3.5) {$3$}; + \node at (5.5,-4.5) {$2$}; + \node at (5.5,-5.5) {$2$}; + \node at (6.5,-1.5) {$5$}; + \node at (6.5,-2.5) {$5$}; + \node at (6.5,-3.5) {$4$}; + \node at (6.5,-4.5) {$3$}; + \node at (6.5,-5.5) {$2$}; + \end{scope} +\end{tikzpicture} +\end{center} + +The lower-right corner of the table +tells us that the edit distance between +\texttt{LOVE} and \texttt{MOVIE} is 2. +The table also shows how to construct +the shortest sequence of editing operations. 
+In this case the path is as follows: + +\begin{center} +\begin{tikzpicture}[scale=.65] + \begin{scope} + \draw (1, -1) grid (7, -6); + + \node at (0.5,-2.5) {\texttt{L}}; + \node at (0.5,-3.5) {\texttt{O}}; + \node at (0.5,-4.5) {\texttt{V}}; + \node at (0.5,-5.5) {\texttt{E}}; + + \node at (2.5,-0.5) {\texttt{M}}; + \node at (3.5,-0.5) {\texttt{O}}; + \node at (4.5,-0.5) {\texttt{V}}; + \node at (5.5,-0.5) {\texttt{I}}; + \node at (6.5,-0.5) {\texttt{E}}; + + \node at (1.5,-1.5) {$0$}; + \node at (1.5,-2.5) {$1$}; + \node at (1.5,-3.5) {$2$}; + \node at (1.5,-4.5) {$3$}; + \node at (1.5,-5.5) {$4$}; + \node at (2.5,-1.5) {$1$}; + \node at (2.5,-2.5) {$1$}; + \node at (2.5,-3.5) {$2$}; + \node at (2.5,-4.5) {$3$}; + \node at (2.5,-5.5) {$4$}; + \node at (3.5,-1.5) {$2$}; + \node at (3.5,-2.5) {$2$}; + \node at (3.5,-3.5) {$1$}; + \node at (3.5,-4.5) {$2$}; + \node at (3.5,-5.5) {$3$}; + \node at (4.5,-1.5) {$3$}; + \node at (4.5,-2.5) {$3$}; + \node at (4.5,-3.5) {$2$}; + \node at (4.5,-4.5) {$1$}; + \node at (4.5,-5.5) {$2$}; + \node at (5.5,-1.5) {$4$}; + \node at (5.5,-2.5) {$4$}; + \node at (5.5,-3.5) {$3$}; + \node at (5.5,-4.5) {$2$}; + \node at (5.5,-5.5) {$2$}; + \node at (6.5,-1.5) {$5$}; + \node at (6.5,-2.5) {$5$}; + \node at (6.5,-3.5) {$4$}; + \node at (6.5,-4.5) {$3$}; + \node at (6.5,-5.5) {$2$}; + + \path[draw=red,thick,-,line width=2pt] (6.5,-5.5) -- (5.5,-4.5); + \path[draw=red,thick,-,line width=2pt] (5.5,-4.5) -- (4.5,-4.5); + \path[draw=red,thick,->,line width=2pt] (4.5,-4.5) -- (1.5,-1.5); + \end{scope} +\end{tikzpicture} +\end{center} + +The last characters of \texttt{LOVE} and \texttt{MOVIE} +are equal, so the edit distance between them +equals the edit distance between \texttt{LOV} and \texttt{MOVI}. +We can use one editing operation to remove the +character \texttt{I} from \texttt{MOVI}. +Thus, the edit distance is one larger than +the edit distance between \texttt{LOV} and \texttt{MOV}, etc. 
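The recursive formula translates directly into a table-filling implementation; the following is an illustrative sketch (the function and variable names are ours), where \texttt{dist[a][b]} is the edit distance between the first $a$ characters of \texttt{x} and the first $b$ characters of \texttt{y}:

```cpp
#include <bits/stdc++.h>
using namespace std;

int edit_distance(const string& x, const string& y) {
    int n = x.size(), m = y.size();
    // dist[a][b] = edit distance between the first a characters
    // of x and the first b characters of y
    vector<vector<int>> dist(n + 1, vector<int>(m + 1));
    for (int a = 0; a <= n; a++) dist[a][0] = a; // remove all characters
    for (int b = 0; b <= m; b++) dist[0][b] = b; // insert all characters
    for (int a = 1; a <= n; a++) {
        for (int b = 1; b <= m; b++) {
            int cost = (x[a-1] == y[b-1]) ? 0 : 1;
            dist[a][b] = min({dist[a][b-1] + 1,        // insert
                              dist[a-1][b] + 1,        // remove
                              dist[a-1][b-1] + cost}); // match or modify
        }
    }
    return dist[n][m];
}
```

For \texttt{LOVE} and \texttt{MOVIE} this returns 2, the value in the lower-right corner of the table above.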
+ +\section{Counting tilings} + +Sometimes the states of a dynamic programming solution +are more complex than fixed combinations of numbers. +As an example, +consider the problem of calculating +the number of distinct ways to +fill an $n \times m$ grid using +$1 \times 2$ and $2 \times 1$ size tiles. +For example, one valid solution +for the $4 \times 7$ grid is +\begin{center} +\begin{tikzpicture}[scale=.65] + \draw (0,0) grid (7,4); + \draw[fill=gray] (0+0.2,0+0.2) rectangle (2-0.2,1-0.2); + \draw[fill=gray] (2+0.2,0+0.2) rectangle (4-0.2,1-0.2); + \draw[fill=gray] (4+0.2,0+0.2) rectangle (6-0.2,1-0.2); + \draw[fill=gray] (0+0.2,1+0.2) rectangle (2-0.2,2-0.2); + \draw[fill=gray] (2+0.2,1+0.2) rectangle (4-0.2,2-0.2); + \draw[fill=gray] (1+0.2,2+0.2) rectangle (3-0.2,3-0.2); + \draw[fill=gray] (1+0.2,3+0.2) rectangle (3-0.2,4-0.2); + \draw[fill=gray] (4+0.2,3+0.2) rectangle (6-0.2,4-0.2); + + \draw[fill=gray] (0+0.2,2+0.2) rectangle (1-0.2,4-0.2); + \draw[fill=gray] (3+0.2,2+0.2) rectangle (4-0.2,4-0.2); + \draw[fill=gray] (6+0.2,2+0.2) rectangle (7-0.2,4-0.2); + \draw[fill=gray] (4+0.2,1+0.2) rectangle (5-0.2,3-0.2); + \draw[fill=gray] (5+0.2,1+0.2) rectangle (6-0.2,3-0.2); + \draw[fill=gray] (6+0.2,0+0.2) rectangle (7-0.2,2-0.2); + +\end{tikzpicture} +\end{center} +and the total number of solutions is 781. + +The problem can be solved using dynamic programming +by going through the grid row by row. +Each row in a solution can be represented as a +string that contains $m$ characters from the set +$\{\sqcap, \sqcup, \sqsubset, \sqsupset \}$. 
+For example, the above solution consists of four rows +that correspond to the following strings: +\begin{itemize} +\item +$\sqcap \sqsubset \sqsupset \sqcap \sqsubset \sqsupset \sqcap$ +\item +$\sqcup \sqsubset \sqsupset \sqcup \sqcap \sqcap \sqcup$ +\item +$\sqsubset \sqsupset \sqsubset \sqsupset \sqcup \sqcup \sqcap$ +\item +$\sqsubset \sqsupset \sqsubset \sqsupset \sqsubset \sqsupset \sqcup$ +\end{itemize} + +Let $\texttt{count}(k,x)$ denote the number of ways to +construct a solution for rows $1 \ldots k$ +of the grid such that string $x$ corresponds to row $k$. +It is possible to use dynamic programming here, +because the state of a row is constrained +only by the state of the previous row. + +A solution is valid if row $1$ does not contain +the character $\sqcup$, +row $n$ does not contain the character $\sqcap$, +and all consecutive rows are \emph{compatible}. +For example, the rows +$\sqcup \sqsubset \sqsupset \sqcup \sqcap \sqcap \sqcup$ and +$\sqsubset \sqsupset \sqsubset \sqsupset \sqcup \sqcup \sqcap$ +are compatible, while the rows +$\sqcap \sqsubset \sqsupset \sqcap \sqsubset \sqsupset \sqcap$ and +$\sqsubset \sqsupset \sqsubset \sqsupset \sqsubset \sqsupset \sqcup$ +are not compatible. + +Since a row consists of $m$ characters and there are +four choices for each character, the number of distinct +rows is at most $4^m$. +Thus, the time complexity of the solution is +$O(n 4^{2m})$ because we can go through the +$O(4^m)$ possible states for each row, +and for each state, there are $O(4^m)$ +possible states for the previous row. +In practice, it is a good idea to rotate the grid +so that the shorter side has length $m$, +because the factor $4^{2m}$ dominates the time complexity. + +It is possible to make the solution more efficient +by using a more compact representation for the rows. +It turns out that it is sufficient to know which +columns of the previous row contain the upper square +of a vertical tile. 
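This compact representation can be turned into code. The following illustrative sketch (not the book's implementation) encodes a row as a bitmask whose set bits mark the columns where a vertical tile extends into the next row, and enumerates the tilings of each row recursively:

```cpp
#include <bits/stdc++.h>
using namespace std;

// number of tilings of an n x m grid with 1 x 2 and 2 x 1 tiles
long long count_tilings(int n, int m) {
    vector<long long> cur(1 << m, 0), nxt(1 << m, 0);
    cur[0] = 1;
    // fill one row column by column; "filled" marks cells already
    // covered, "down" marks vertical tiles entering the next row
    function<void(int, int, int, long long)> fill_row =
        [&](int j, int filled, int down, long long ways) {
            if (j == m) { nxt[down] += ways; return; }
            if (filled & (1 << j)) { fill_row(j + 1, filled, down, ways); return; }
            // start a vertical tile that continues into the next row
            fill_row(j + 1, filled | (1 << j), down | (1 << j), ways);
            // or place a horizontal tile on columns j and j+1
            if (j + 1 < m && !(filled & (1 << (j + 1))))
                fill_row(j + 2, filled | (3 << j), down, ways);
        };
    for (int row = 0; row < n; row++) {
        fill(nxt.begin(), nxt.end(), 0);
        for (int mask = 0; mask < (1 << m); mask++)
            if (cur[mask]) fill_row(0, mask, 0, cur[mask]);
        cur = nxt;
    }
    return cur[0]; // nothing may stick out below the last row
}
```

For the $4 \times 7$ grid this returns 781, matching the count mentioned above.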
+Thus, we can represent a row using only characters +$\sqcap$ and $\Box$, where $\Box$ is a combination +of characters +$\sqcup$, $\sqsubset$ and $\sqsupset$. +Using this representation, there are only +$2^m$ distinct rows and the time complexity is +$O(n 2^{2m})$. + +As a final note, there is also a surprising direct formula +for calculating the number of tilings\footnote{Surprisingly, +this formula was discovered in 1961 by two research teams \cite{kas61,tem61} +that worked independently.}: +\[ \prod_{a=1}^{\lceil n/2 \rceil} \prod_{b=1}^{\lceil m/2 \rceil} 4 \cdot (\cos^2 \frac{\pi a}{n + 1} + \cos^2 \frac{\pi b}{m+1})\] +This formula is very efficient, because it calculates +the number of tilings in $O(nm)$ time, +but since the answer is a product of real numbers, +a problem when using the formula is +how to store the intermediate results accurately. + + diff --git a/chapter08.tex b/chapter08.tex new file mode 100644 index 0000000..9407e46 --- /dev/null +++ b/chapter08.tex @@ -0,0 +1,732 @@ +\chapter{Amortized analysis} + +\index{amortized analysis} + +The time complexity of an algorithm +is often easy to analyze +just by examining the structure +of the algorithm: +what loops does the algorithm contain +and how many times the loops are performed. +However, sometimes a straightforward analysis +does not give a true picture of the efficiency of the algorithm. + +\key{Amortized analysis} can be used to analyze +algorithms that contain operations whose +time complexity varies. +The idea is to estimate the total time used to +all such operations during the +execution of the algorithm, instead of focusing +on individual operations. + +\section{Two pointers method} + +\index{two pointers method} + +In the \key{two pointers method}, +two pointers are used to +iterate through the array values. +Both pointers can move to one direction only, +which ensures that the algorithm works efficiently. 
+Next we discuss two problems that can be solved +using the two pointers method. + +\subsubsection{Subarray sum} + +As the first example, +consider a problem where we are +given an array of $n$ positive integers +and a target sum $x$, +and we want to find a subarray whose sum is $x$ +or report that there is no such subarray. + +For example, the array +\begin{center} +\begin{tikzpicture}[scale=0.7] +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$3$}; +\node at (2.5,0.5) {$2$}; +\node at (3.5,0.5) {$5$}; +\node at (4.5,0.5) {$1$}; +\node at (5.5,0.5) {$1$}; +\node at (6.5,0.5) {$2$}; +\node at (7.5,0.5) {$3$}; +\end{tikzpicture} +\end{center} +contains a subarray whose sum is 8: +\begin{center} +\begin{tikzpicture}[scale=0.7] +\fill[color=lightgray] (2,0) rectangle (5,1); +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$3$}; +\node at (2.5,0.5) {$2$}; +\node at (3.5,0.5) {$5$}; +\node at (4.5,0.5) {$1$}; +\node at (5.5,0.5) {$1$}; +\node at (6.5,0.5) {$2$}; +\node at (7.5,0.5) {$3$}; +\end{tikzpicture} +\end{center} + +This problem can be solved in +$O(n)$ time by using the two pointers method. +The idea is to maintain pointers that point to the +first and last value of a subarray. +On each turn, the left pointer moves one step +to the right, and the right pointer moves to the right +as long as the resulting subarray sum is at most $x$. +If the sum becomes exactly $x$, +a solution has been found. 
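This scan can be sketched in code as follows (an illustrative implementation, not the book's; here the right pointer drives the loop and the left pointer follows, which is an equivalent way to organize the same movements):

```cpp
#include <bits/stdc++.h>
using namespace std;

// true if some subarray of the positive array a has sum x
bool subarray_sum(const vector<int>& a, int x) {
    long long sum = 0;
    int left = 0;
    for (int right = 0; right < (int)a.size(); right++) {
        sum += a[right];                  // extend the subarray to the right
        while (sum > x) sum -= a[left++]; // shrink it from the left
        if (sum == x) return true;
    }
    return false;
}
```

For the array above and $x=8$ the function returns \texttt{true} (the subarray $[2,5,1]$).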
+ +As an example, consider the following array +and a target sum $x=8$: +\begin{center} +\begin{tikzpicture}[scale=0.7] +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$3$}; +\node at (2.5,0.5) {$2$}; +\node at (3.5,0.5) {$5$}; +\node at (4.5,0.5) {$1$}; +\node at (5.5,0.5) {$1$}; +\node at (6.5,0.5) {$2$}; +\node at (7.5,0.5) {$3$}; +\end{tikzpicture} +\end{center} + +The initial subarray contains the values +1, 3 and 2 whose sum is 6: + +\begin{center} +\begin{tikzpicture}[scale=0.7] +\fill[color=lightgray] (0,0) rectangle (3,1); +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$3$}; +\node at (2.5,0.5) {$2$}; +\node at (3.5,0.5) {$5$}; +\node at (4.5,0.5) {$1$}; +\node at (5.5,0.5) {$1$}; +\node at (6.5,0.5) {$2$}; +\node at (7.5,0.5) {$3$}; + +\draw[thick,->] (0.5,-0.7) -- (0.5,-0.1); +\draw[thick,->] (2.5,-0.7) -- (2.5,-0.1); +\end{tikzpicture} +\end{center} + +Then, the left pointer moves one step to the right. +The right pointer does not move, because otherwise +the subarray sum would exceed $x$. + +\begin{center} +\begin{tikzpicture}[scale=0.7] +\fill[color=lightgray] (1,0) rectangle (3,1); +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$3$}; +\node at (2.5,0.5) {$2$}; +\node at (3.5,0.5) {$5$}; +\node at (4.5,0.5) {$1$}; +\node at (5.5,0.5) {$1$}; +\node at (6.5,0.5) {$2$}; +\node at (7.5,0.5) {$3$}; + +\draw[thick,->] (1.5,-0.7) -- (1.5,-0.1); +\draw[thick,->] (2.5,-0.7) -- (2.5,-0.1); +\end{tikzpicture} +\end{center} + +Again, the left pointer moves one step to the right, +and this time the right pointer moves three +steps to the right. +The subarray sum is $2+5+1=8$, so a subarray +whose sum is $x$ has been found. 
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\fill[color=lightgray] (2,0) rectangle (5,1);
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$3$};
+\node at (2.5,0.5) {$2$};
+\node at (3.5,0.5) {$5$};
+\node at (4.5,0.5) {$1$};
+\node at (5.5,0.5) {$1$};
+\node at (6.5,0.5) {$2$};
+\node at (7.5,0.5) {$3$};
+
+\draw[thick,->] (2.5,-0.7) -- (2.5,-0.1);
+\draw[thick,->] (4.5,-0.7) -- (4.5,-0.1);
+\end{tikzpicture}
+\end{center}
+
+The running time of the algorithm depends on
+the number of steps the right pointer moves.
+While there is no useful upper bound on how many steps the
+pointer can move on a \emph{single} turn,
+we know that the pointer moves \emph{a total of}
+$O(n)$ steps during the algorithm,
+because it only moves to the right.
+
+Since both the left and right pointer
+move $O(n)$ steps during the algorithm,
+the algorithm works in $O(n)$ time.
+
+\subsubsection{2SUM problem}
+
+\index{2SUM problem}
+
+Another problem that can be solved using
+the two pointers method is the following problem,
+also known as the \key{2SUM problem}:
+given an array of $n$ numbers and
+a target sum $x$, find
+two array values such that their sum is $x$,
+or report that no such values exist.
+
+To solve the problem, we first
+sort the array values in increasing order.
+After that, we iterate through the array using
+two pointers.
+The left pointer starts at the first value
+and moves one step to the right on each turn.
+The right pointer begins at the last value
+and always moves to the left until the sum of the
+left and right value is at most $x$.
+If the sum is exactly $x$,
+a solution has been found.
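+
+The method can be sketched in code as follows.
+This is only a sketch (the name \texttt{twoSum} and the
+boolean return value are illustrative), written in the common
+form where exactly one of the pointers moves on each turn;
+this is equivalent to the description above, where the right
+pointer may move several steps at once.
+
+\begin{lstlisting}
+// returns true if two values in a have sum exactly x
+bool twoSum(vector<int> a, long long x) {
+    sort(a.begin(), a.end());
+    int left = 0, right = (int)a.size()-1;
+    while (left < right) {
+        long long sum = (long long)a[left]+a[right];
+        if (sum == x) return true;
+        if (sum > x) right--; // sum too large
+        else left++;          // sum too small
+    }
+    return false;
+}
+\end{lstlisting}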
+
+For example, consider the following array
+and a target sum $x=12$:
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$4$};
+\node at (2.5,0.5) {$5$};
+\node at (3.5,0.5) {$6$};
+\node at (4.5,0.5) {$7$};
+\node at (5.5,0.5) {$9$};
+\node at (6.5,0.5) {$9$};
+\node at (7.5,0.5) {$10$};
+\end{tikzpicture}
+\end{center}
+
+The initial positions of the pointers
+are as follows.
+The sum of the values is $1+10=11$,
+which is smaller than $x$.
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\fill[color=lightgray] (0,0) rectangle (1,1);
+\fill[color=lightgray] (7,0) rectangle (8,1);
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$4$};
+\node at (2.5,0.5) {$5$};
+\node at (3.5,0.5) {$6$};
+\node at (4.5,0.5) {$7$};
+\node at (5.5,0.5) {$9$};
+\node at (6.5,0.5) {$9$};
+\node at (7.5,0.5) {$10$};
+
+\draw[thick,->] (0.5,-0.7) -- (0.5,-0.1);
+\draw[thick,->] (7.5,-0.7) -- (7.5,-0.1);
+\end{tikzpicture}
+\end{center}
+
+Then the left pointer moves one step to the right.
+The right pointer moves three steps to the left,
+and the sum becomes $4+7=11$.
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\fill[color=lightgray] (1,0) rectangle (2,1);
+\fill[color=lightgray] (4,0) rectangle (5,1);
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$4$};
+\node at (2.5,0.5) {$5$};
+\node at (3.5,0.5) {$6$};
+\node at (4.5,0.5) {$7$};
+\node at (5.5,0.5) {$9$};
+\node at (6.5,0.5) {$9$};
+\node at (7.5,0.5) {$10$};
+
+\draw[thick,->] (1.5,-0.7) -- (1.5,-0.1);
+\draw[thick,->] (4.5,-0.7) -- (4.5,-0.1);
+\end{tikzpicture}
+\end{center}
+
+After this, the left pointer moves one step to the right again.
+The right pointer does not move, and a solution
+$5+7=12$ has been found.
+
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\fill[color=lightgray] (2,0) rectangle (3,1);
+\fill[color=lightgray] (4,0) rectangle (5,1);
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$4$};
+\node at (2.5,0.5) {$5$};
+\node at (3.5,0.5) {$6$};
+\node at (4.5,0.5) {$7$};
+\node at (5.5,0.5) {$9$};
+\node at (6.5,0.5) {$9$};
+\node at (7.5,0.5) {$10$};
+
+\draw[thick,->] (2.5,-0.7) -- (2.5,-0.1);
+\draw[thick,->] (4.5,-0.7) -- (4.5,-0.1);
+\end{tikzpicture}
+\end{center}
+
+The running time of the algorithm is
+$O(n \log n)$, because it first sorts
+the array in $O(n \log n)$ time,
+and then both pointers move $O(n)$ steps.
+
+Note that it is possible to solve the problem
+in another way in $O(n \log n)$ time using binary search.
+In such a solution, we iterate through the array
+and for each array value, we try to find another
+value that yields the sum $x$.
+This can be done by performing $n$ binary searches,
+each of which takes $O(\log n)$ time.
+
+\index{3SUM problem}
+A more difficult problem is
+the \key{3SUM problem} that asks us to
+find \emph{three} array values
+whose sum is $x$.
+Using the idea of the above algorithm,
+this problem can be solved in $O(n^2)$ time\footnote{For a long time,
+it was thought that solving
+the 3SUM problem more efficiently than in $O(n^2)$ time
+would not be possible.
+However, in 2014, it turned out \cite{gro14}
+that this is not the case.}.
+Can you see how?
+
+\section{Nearest smaller elements}
+
+\index{nearest smaller elements}
+
+Amortized analysis is often used to
+estimate the number of operations
+performed on a data structure.
+The operations may be distributed unevenly so
+that most operations occur during a
+certain phase of the algorithm, but the total
+number of operations is limited.
+
+As an example, consider the problem
+of finding for each array element
+the \key{nearest smaller element}, i.e.,
+the first smaller element that precedes the element
+in the array.
+It is possible that no such element exists, +in which case the algorithm should report this. +Next we will see how the problem can be +efficiently solved using a stack structure. + +We go through the array from left to right +and maintain a stack of array elements. +At each array position, we remove elements from the stack +until the top element is smaller than the +current element, or the stack is empty. +Then, we report that the top element is +the nearest smaller element of the current element, +or if the stack is empty, there is no such element. +Finally, we add the current element to the stack. + +As an example, consider the following array: +\begin{center} +\begin{tikzpicture}[scale=0.7] +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$3$}; +\node at (2.5,0.5) {$4$}; +\node at (3.5,0.5) {$2$}; +\node at (4.5,0.5) {$5$}; +\node at (5.5,0.5) {$3$}; +\node at (6.5,0.5) {$4$}; +\node at (7.5,0.5) {$2$}; +\end{tikzpicture} +\end{center} + +First, the elements 1, 3 and 4 are added to the stack, +because each element is larger than the previous element. +Thus, the nearest smaller element of 4 is 3, +and the nearest smaller element of 3 is 1. +\begin{center} +\begin{tikzpicture}[scale=0.7] +\fill[color=lightgray] (2,0) rectangle (3,1); +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$3$}; +\node at (2.5,0.5) {$4$}; +\node at (3.5,0.5) {$2$}; +\node at (4.5,0.5) {$5$}; +\node at (5.5,0.5) {$3$}; +\node at (6.5,0.5) {$4$}; +\node at (7.5,0.5) {$2$}; + +\draw (0.2,0.2-1.2) rectangle (0.8,0.8-1.2); +\draw (1.2,0.2-1.2) rectangle (1.8,0.8-1.2); +\draw (2.2,0.2-1.2) rectangle (2.8,0.8-1.2); + +\node at (0.5,0.5-1.2) {$1$}; +\node at (1.5,0.5-1.2) {$3$}; +\node at (2.5,0.5-1.2) {$4$}; + +\draw[->,thick] (0.8,0.5-1.2) -- (1.2,0.5-1.2); +\draw[->,thick] (1.8,0.5-1.2) -- (2.2,0.5-1.2); +\end{tikzpicture} +\end{center} + +The next element 2 is smaller than the two top +elements in the stack. 
+Thus, the elements 3 and 4 are removed from the stack, +and then the element 2 is added to the stack. +Its nearest smaller element is 1: +\begin{center} +\begin{tikzpicture}[scale=0.7] +\fill[color=lightgray] (3,0) rectangle (4,1); +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$3$}; +\node at (2.5,0.5) {$4$}; +\node at (3.5,0.5) {$2$}; +\node at (4.5,0.5) {$5$}; +\node at (5.5,0.5) {$3$}; +\node at (6.5,0.5) {$4$}; +\node at (7.5,0.5) {$2$}; + +\draw (0.2,0.2-1.2) rectangle (0.8,0.8-1.2); +\draw (3.2,0.2-1.2) rectangle (3.8,0.8-1.2); + +\node at (0.5,0.5-1.2) {$1$}; +\node at (3.5,0.5-1.2) {$2$}; + +\draw[->,thick] (0.8,0.5-1.2) -- (3.2,0.5-1.2); +\end{tikzpicture} +\end{center} + +Then, the element 5 is larger than the element 2, +so it will be added to the stack, and +its nearest smaller element is 2: +\begin{center} +\begin{tikzpicture}[scale=0.7] +\fill[color=lightgray] (4,0) rectangle (5,1); +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$3$}; +\node at (2.5,0.5) {$4$}; +\node at (3.5,0.5) {$2$}; +\node at (4.5,0.5) {$5$}; +\node at (5.5,0.5) {$3$}; +\node at (6.5,0.5) {$4$}; +\node at (7.5,0.5) {$2$}; + +\draw (0.2,0.2-1.2) rectangle (0.8,0.8-1.2); +\draw (3.2,0.2-1.2) rectangle (3.8,0.8-1.2); +\draw (4.2,0.2-1.2) rectangle (4.8,0.8-1.2); + +\node at (0.5,0.5-1.2) {$1$}; +\node at (3.5,0.5-1.2) {$2$}; +\node at (4.5,0.5-1.2) {$5$}; + +\draw[->,thick] (0.8,0.5-1.2) -- (3.2,0.5-1.2); +\draw[->,thick] (3.8,0.5-1.2) -- (4.2,0.5-1.2); +\end{tikzpicture} +\end{center} + +After this, the element 5 is removed from the stack +and the elements 3 and 4 are added to the stack: +\begin{center} +\begin{tikzpicture}[scale=0.7] +\fill[color=lightgray] (6,0) rectangle (7,1); +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$3$}; +\node at (2.5,0.5) {$4$}; +\node at (3.5,0.5) {$2$}; +\node at (4.5,0.5) {$5$}; +\node at (5.5,0.5) {$3$}; +\node at (6.5,0.5) {$4$}; +\node at (7.5,0.5) {$2$}; + 
+\draw (0.2,0.2-1.2) rectangle (0.8,0.8-1.2); +\draw (3.2,0.2-1.2) rectangle (3.8,0.8-1.2); +\draw (5.2,0.2-1.2) rectangle (5.8,0.8-1.2); +\draw (6.2,0.2-1.2) rectangle (6.8,0.8-1.2); + +\node at (0.5,0.5-1.2) {$1$}; +\node at (3.5,0.5-1.2) {$2$}; +\node at (5.5,0.5-1.2) {$3$}; +\node at (6.5,0.5-1.2) {$4$}; + +\draw[->,thick] (0.8,0.5-1.2) -- (3.2,0.5-1.2); +\draw[->,thick] (3.8,0.5-1.2) -- (5.2,0.5-1.2); +\draw[->,thick] (5.8,0.5-1.2) -- (6.2,0.5-1.2); +\end{tikzpicture} +\end{center} + +Finally, all elements except 1 are removed +from the stack and the last element 2 +is added to the stack: + +\begin{center} +\begin{tikzpicture}[scale=0.7] +\fill[color=lightgray] (7,0) rectangle (8,1); +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$3$}; +\node at (2.5,0.5) {$4$}; +\node at (3.5,0.5) {$2$}; +\node at (4.5,0.5) {$5$}; +\node at (5.5,0.5) {$3$}; +\node at (6.5,0.5) {$4$}; +\node at (7.5,0.5) {$2$}; + +\draw (0.2,0.2-1.2) rectangle (0.8,0.8-1.2); +\draw (7.2,0.2-1.2) rectangle (7.8,0.8-1.2); + +\node at (0.5,0.5-1.2) {$1$}; +\node at (7.5,0.5-1.2) {$2$}; + +\draw[->,thick] (0.8,0.5-1.2) -- (7.2,0.5-1.2); +\end{tikzpicture} +\end{center} + +The efficiency of the algorithm depends on +the total number of stack operations. +If the current element is larger than +the top element in the stack, it is directly +added to the stack, which is efficient. +However, sometimes the stack can contain several +larger elements and it takes time to remove them. +Still, each element is added \emph{exactly once} to the stack +and removed \emph{at most once} from the stack. +Thus, each element causes $O(1)$ stack operations, +and the algorithm works in $O(n)$ time. + +\section{Sliding window minimum} + +\index{sliding window} +\index{sliding window minimum} + +A \key{sliding window} is a constant-size subarray +that moves from left to right through the array. 
+At each window position, +we want to calculate some information +about the elements inside the window. +In this section, we focus on the problem +of maintaining the \key{sliding window minimum}, +which means that +we should report the smallest value inside each window. + +The sliding window minimum can be calculated +using a similar idea that we used to calculate +the nearest smaller elements. +We maintain a queue +where each element is larger than +the previous element, +and the first element +always corresponds to the minimum element inside the window. +After each window move, +we remove elements from the end of the queue +until the last queue element +is smaller than the new window element, +or the queue becomes empty. +We also remove the first queue element +if it is not inside the window anymore. +Finally, we add the new window element +to the end of the queue. + +As an example, consider the following array: + +\begin{center} +\begin{tikzpicture}[scale=0.7] +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$2$}; +\node at (1.5,0.5) {$1$}; +\node at (2.5,0.5) {$4$}; +\node at (3.5,0.5) {$5$}; +\node at (4.5,0.5) {$3$}; +\node at (5.5,0.5) {$4$}; +\node at (6.5,0.5) {$1$}; +\node at (7.5,0.5) {$2$}; +\end{tikzpicture} +\end{center} + +Suppose that the size of the sliding window is 4. 
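+
+In code, the method can be sketched as follows.
+This sketch stores array \emph{positions} in a deque
+(the name \texttt{windowMinima} is illustrative;
+$k$ denotes the window size):
+
+\begin{lstlisting}
+// returns the minimum of every window of size k in a
+vector<int> windowMinima(const vector<int>& a, int k) {
+    vector<int> result;
+    deque<int> q; // positions, values in increasing order
+    for (int i = 0; i < (int)a.size(); i++) {
+        // remove elements from the end of the queue that
+        // are not smaller than the new window element
+        while (!q.empty() && a[q.back()] >= a[i]) q.pop_back();
+        q.push_back(i);
+        // remove the first element if it left the window
+        if (q.front() <= i-k) q.pop_front();
+        if (i >= k-1) result.push_back(a[q.front()]);
+    }
+    return result;
+}
+\end{lstlisting}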
+At the first window position, the smallest value is 1: +\begin{center} +\begin{tikzpicture}[scale=0.7] +\fill[color=lightgray] (0,0) rectangle (4,1); +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$2$}; +\node at (1.5,0.5) {$1$}; +\node at (2.5,0.5) {$4$}; +\node at (3.5,0.5) {$5$}; +\node at (4.5,0.5) {$3$}; +\node at (5.5,0.5) {$4$}; +\node at (6.5,0.5) {$1$}; +\node at (7.5,0.5) {$2$}; + +\draw (1.2,0.2-1.2) rectangle (1.8,0.8-1.2); +\draw (2.2,0.2-1.2) rectangle (2.8,0.8-1.2); +\draw (3.2,0.2-1.2) rectangle (3.8,0.8-1.2); + +\node at (1.5,0.5-1.2) {$1$}; +\node at (2.5,0.5-1.2) {$4$}; +\node at (3.5,0.5-1.2) {$5$}; + +\draw[->,thick] (1.8,0.5-1.2) -- (2.2,0.5-1.2); +\draw[->,thick] (2.8,0.5-1.2) -- (3.2,0.5-1.2); +\end{tikzpicture} +\end{center} + +Then the window moves one step right. +The new element 3 is smaller than the elements +4 and 5 in the queue, so the elements 4 and 5 +are removed from the queue +and the element 3 is added to the queue. +The smallest value is still 1. +\begin{center} +\begin{tikzpicture}[scale=0.7] +\fill[color=lightgray] (1,0) rectangle (5,1); +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$2$}; +\node at (1.5,0.5) {$1$}; +\node at (2.5,0.5) {$4$}; +\node at (3.5,0.5) {$5$}; +\node at (4.5,0.5) {$3$}; +\node at (5.5,0.5) {$4$}; +\node at (6.5,0.5) {$1$}; +\node at (7.5,0.5) {$2$}; + +\draw (1.2,0.2-1.2) rectangle (1.8,0.8-1.2); +\draw (4.2,0.2-1.2) rectangle (4.8,0.8-1.2); + +\node at (1.5,0.5-1.2) {$1$}; +\node at (4.5,0.5-1.2) {$3$}; + +\draw[->,thick] (1.8,0.5-1.2) -- (4.2,0.5-1.2); +\end{tikzpicture} +\end{center} + +After this, the window moves again, +and the smallest element 1 +does not belong to the window anymore. +Thus, it is removed from the queue and the smallest +value is now 3. Also the new element 4 +is added to the queue. 
+\begin{center} +\begin{tikzpicture}[scale=0.7] +\fill[color=lightgray] (2,0) rectangle (6,1); +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$2$}; +\node at (1.5,0.5) {$1$}; +\node at (2.5,0.5) {$4$}; +\node at (3.5,0.5) {$5$}; +\node at (4.5,0.5) {$3$}; +\node at (5.5,0.5) {$4$}; +\node at (6.5,0.5) {$1$}; +\node at (7.5,0.5) {$2$}; + +\draw (4.2,0.2-1.2) rectangle (4.8,0.8-1.2); +\draw (5.2,0.2-1.2) rectangle (5.8,0.8-1.2); + +\node at (4.5,0.5-1.2) {$3$}; +\node at (5.5,0.5-1.2) {$4$}; + +\draw[->,thick] (4.8,0.5-1.2) -- (5.2,0.5-1.2); +\end{tikzpicture} +\end{center} + +The next new element 1 is smaller than all elements +in the queue. +Thus, all elements are removed from the queue +and it will only contain the element 1: +\begin{center} +\begin{tikzpicture}[scale=0.7] +\fill[color=lightgray] (3,0) rectangle (7,1); +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$2$}; +\node at (1.5,0.5) {$1$}; +\node at (2.5,0.5) {$4$}; +\node at (3.5,0.5) {$5$}; +\node at (4.5,0.5) {$3$}; +\node at (5.5,0.5) {$4$}; +\node at (6.5,0.5) {$1$}; +\node at (7.5,0.5) {$2$}; + +\draw (6.2,0.2-1.2) rectangle (6.8,0.8-1.2); + +\node at (6.5,0.5-1.2) {$1$}; +\end{tikzpicture} +\end{center} + +Finally the window reaches its last position. +The element 2 is added to the queue, +but the smallest value inside the window +is still 1. 
+\begin{center} +\begin{tikzpicture}[scale=0.7] +\fill[color=lightgray] (4,0) rectangle (8,1); +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$2$}; +\node at (1.5,0.5) {$1$}; +\node at (2.5,0.5) {$4$}; +\node at (3.5,0.5) {$5$}; +\node at (4.5,0.5) {$3$}; +\node at (5.5,0.5) {$4$}; +\node at (6.5,0.5) {$1$}; +\node at (7.5,0.5) {$2$}; + +\draw (6.2,0.2-1.2) rectangle (6.8,0.8-1.2); +\draw (7.2,0.2-1.2) rectangle (7.8,0.8-1.2); + +\node at (6.5,0.5-1.2) {$1$}; +\node at (7.5,0.5-1.2) {$2$}; + +\draw[->,thick] (6.8,0.5-1.2) -- (7.2,0.5-1.2); +\end{tikzpicture} +\end{center} + +Since each array element +is added to the queue exactly once and +removed from the queue at most once, +the algorithm works in $O(n)$ time. + + + diff --git a/chapter09.tex b/chapter09.tex new file mode 100644 index 0000000..2b34438 --- /dev/null +++ b/chapter09.tex @@ -0,0 +1,1403 @@ +\chapter{Range queries} + +\index{range query} +\index{sum query} +\index{minimum query} +\index{maximum query} + +In this chapter, we discuss data structures +that allow us to efficiently process range queries. +In a \key{range query}, +our task is to calculate a value +based on a subarray of an array. 
+Typical range queries are: +\begin{itemize} +\item $\texttt{sum}_q(a,b)$: calculate the sum of values in range $[a,b]$ +\item $\texttt{min}_q(a,b)$: find the minimum value in range $[a,b]$ +\item $\texttt{max}_q(a,b)$: find the maximum value in range $[a,b]$ +\end{itemize} + +For example, consider the range $[3,6]$ in the following array: +\begin{center} +\begin{tikzpicture}[scale=0.7] +\fill[color=lightgray] (3,0) rectangle (7,1); +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$3$}; +\node at (2.5,0.5) {$8$}; +\node at (3.5,0.5) {$4$}; +\node at (4.5,0.5) {$6$}; +\node at (5.5,0.5) {$1$}; +\node at (6.5,0.5) {$3$}; +\node at (7.5,0.5) {$4$}; + +\footnotesize +\node at (0.5,1.4) {$0$}; +\node at (1.5,1.4) {$1$}; +\node at (2.5,1.4) {$2$}; +\node at (3.5,1.4) {$3$}; +\node at (4.5,1.4) {$4$}; +\node at (5.5,1.4) {$5$}; +\node at (6.5,1.4) {$6$}; +\node at (7.5,1.4) {$7$}; +\end{tikzpicture} +\end{center} +In this case, $\texttt{sum}_q(3,6)=14$, +$\texttt{min}_q(3,6)=1$ and $\texttt{max}_q(3,6)=6$. + +A simple way to process range queries is to use +a loop that goes through all array values in the range. +For example, the following function can be +used to process sum queries on an array: + +\begin{lstlisting} +int sum(int a, int b) { + int s = 0; + for (int i = a; i <= b; i++) { + s += array[i]; + } + return s; +} +\end{lstlisting} + +This function works in $O(n)$ time, +where $n$ is the size of the array. +Thus, we can process $q$ queries in $O(nq)$ +time using the function. +However, if both $n$ and $q$ are large, this approach +is slow. Fortunately, it turns out that there are +ways to process range queries much more efficiently. + +\section{Static array queries} + +We first focus on a situation where +the array is \emph{static}, i.e., +the array values are never updated between the queries. +In this case, it suffices to construct +a static data structure that tells us +the answer for any possible query. 
+ +\subsubsection{Sum queries} + +\index{prefix sum array} + +We can easily process +sum queries on a static array +by constructing a \key{prefix sum array}. +Each value in the prefix sum array equals +the sum of values in the original array up to that position, +i.e., the value at position $k$ is $\texttt{sum}_q(0,k)$. +The prefix sum array can be constructed in $O(n)$ time. + +For example, consider the following array: +\begin{center} +\begin{tikzpicture}[scale=0.7] +%\fill[color=lightgray] (3,0) rectangle (7,1); +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$3$}; +\node at (2.5,0.5) {$4$}; +\node at (3.5,0.5) {$8$}; +\node at (4.5,0.5) {$6$}; +\node at (5.5,0.5) {$1$}; +\node at (6.5,0.5) {$4$}; +\node at (7.5,0.5) {$2$}; + +\footnotesize +\node at (0.5,1.4) {$0$}; +\node at (1.5,1.4) {$1$}; +\node at (2.5,1.4) {$2$}; +\node at (3.5,1.4) {$3$}; +\node at (4.5,1.4) {$4$}; +\node at (5.5,1.4) {$5$}; +\node at (6.5,1.4) {$6$}; +\node at (7.5,1.4) {$7$}; +\end{tikzpicture} +\end{center} +The corresponding prefix sum array is as follows: +\begin{center} +\begin{tikzpicture}[scale=0.7] +%\fill[color=lightgray] (3,0) rectangle (7,1); +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$4$}; +\node at (2.5,0.5) {$8$}; +\node at (3.5,0.5) {$16$}; +\node at (4.5,0.5) {$22$}; +\node at (5.5,0.5) {$23$}; +\node at (6.5,0.5) {$27$}; +\node at (7.5,0.5) {$29$}; + + +\footnotesize +\node at (0.5,1.4) {$0$}; +\node at (1.5,1.4) {$1$}; +\node at (2.5,1.4) {$2$}; +\node at (3.5,1.4) {$3$}; +\node at (4.5,1.4) {$4$}; +\node at (5.5,1.4) {$5$}; +\node at (6.5,1.4) {$6$}; +\node at (7.5,1.4) {$7$}; +\end{tikzpicture} +\end{center} +Since the prefix sum array contains all values +of $\texttt{sum}_q(0,k)$, +we can calculate any value of +$\texttt{sum}_q(a,b)$ in $O(1)$ time as follows: +\[ \texttt{sum}_q(a,b) = \texttt{sum}_q(0,b) - \texttt{sum}_q(0,a-1)\] +By defining $\texttt{sum}_q(0,-1)=0$, +the above formula also holds when 
$a=0$. + +For example, consider the range $[3,6]$: +\begin{center} +\begin{tikzpicture}[scale=0.7] +\fill[color=lightgray] (3,0) rectangle (7,1); +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$3$}; +\node at (2.5,0.5) {$4$}; +\node at (3.5,0.5) {$8$}; +\node at (4.5,0.5) {$6$}; +\node at (5.5,0.5) {$1$}; +\node at (6.5,0.5) {$4$}; +\node at (7.5,0.5) {$2$}; + +\footnotesize +\node at (0.5,1.4) {$0$}; +\node at (1.5,1.4) {$1$}; +\node at (2.5,1.4) {$2$}; +\node at (3.5,1.4) {$3$}; +\node at (4.5,1.4) {$4$}; +\node at (5.5,1.4) {$5$}; +\node at (6.5,1.4) {$6$}; +\node at (7.5,1.4) {$7$}; +\end{tikzpicture} +\end{center} +In this case $\texttt{sum}_q(3,6)=8+6+1+4=19$. +This sum can be calculated from +two values of the prefix sum array: +\begin{center} +\begin{tikzpicture}[scale=0.7] +\fill[color=lightgray] (2,0) rectangle (3,1); +\fill[color=lightgray] (6,0) rectangle (7,1); +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$4$}; +\node at (2.5,0.5) {$8$}; +\node at (3.5,0.5) {$16$}; +\node at (4.5,0.5) {$22$}; +\node at (5.5,0.5) {$23$}; +\node at (6.5,0.5) {$27$}; +\node at (7.5,0.5) {$29$}; + +\footnotesize +\node at (0.5,1.4) {$0$}; +\node at (1.5,1.4) {$1$}; +\node at (2.5,1.4) {$2$}; +\node at (3.5,1.4) {$3$}; +\node at (4.5,1.4) {$4$}; +\node at (5.5,1.4) {$5$}; +\node at (6.5,1.4) {$6$}; +\node at (7.5,1.4) {$7$}; +\end{tikzpicture} +\end{center} +Thus, $\texttt{sum}_q(3,6)=\texttt{sum}_q(0,6)-\texttt{sum}_q(0,2)=27-8=19$. + +It is also possible to generalize this idea +to higher dimensions. +For example, we can construct a two-dimensional +prefix sum array that can be used to calculate +the sum of any rectangular subarray in $O(1)$ time. +Each sum in such an array corresponds to +a subarray +that begins at the upper-left corner of the array. 
+
+\begin{samepage}
+The following picture illustrates the idea:
+\begin{center}
+\begin{tikzpicture}[scale=0.54]
+\draw[fill=lightgray] (3,2) rectangle (7,5);
+\draw (0,0) grid (10,7);
+\node[anchor=center] at (6.5, 2.5) {$A$};
+\node[anchor=center] at (2.5, 2.5) {$B$};
+\node[anchor=center] at (6.5, 5.5) {$C$};
+\node[anchor=center] at (2.5, 5.5) {$D$};
+\end{tikzpicture}
+\end{center}
+\end{samepage}
+
+The sum of the gray subarray can be calculated
+using the formula
+\[S(A) - S(B) - S(C) + S(D),\]
+where $S(X)$ denotes the sum of values
+in a rectangular
+subarray from the upper-left corner
+to the position of $X$.
+
+\subsubsection{Minimum queries}
+
+\index{sparse table}
+
+Minimum queries are more difficult to process
+than sum queries.
+Still, there is a quite simple
+$O(n \log n)$ time preprocessing
+method after which we can answer any minimum
+query in $O(1)$ time\footnote{This technique
+was introduced in \cite{ben00} and is sometimes
+called the \key{sparse table} method.
+There are also more sophisticated techniques \cite{fis06} where
+the preprocessing time is only $O(n)$, but such algorithms
+are not needed in competitive programming.}.
+Note that since minimum and maximum queries can
+be processed similarly,
+we can focus on minimum queries.
+
+The idea is to precalculate all values of
+$\texttt{min}_q(a,b)$ where
+$b-a+1$ (the length of the range) is a power of two.
+For example, for the array + +\begin{center} +\begin{tikzpicture}[scale=0.7] +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$3$}; +\node at (2.5,0.5) {$4$}; +\node at (3.5,0.5) {$8$}; +\node at (4.5,0.5) {$6$}; +\node at (5.5,0.5) {$1$}; +\node at (6.5,0.5) {$4$}; +\node at (7.5,0.5) {$2$}; + +\footnotesize +\node at (0.5,1.4) {$0$}; +\node at (1.5,1.4) {$1$}; +\node at (2.5,1.4) {$2$}; +\node at (3.5,1.4) {$3$}; +\node at (4.5,1.4) {$4$}; +\node at (5.5,1.4) {$5$}; +\node at (6.5,1.4) {$6$}; +\node at (7.5,1.4) {$7$}; +\end{tikzpicture} +\end{center} +the following values are calculated: + +\begin{center} +\begin{tabular}{ccc} + +\begin{tabular}{lll} +$a$ & $b$ & $\texttt{min}_q(a,b)$ \\ +\hline +0 & 0 & 1 \\ +1 & 1 & 3 \\ +2 & 2 & 4 \\ +3 & 3 & 8 \\ +4 & 4 & 6 \\ +5 & 5 & 1 \\ +6 & 6 & 4 \\ +7 & 7 & 2 \\ +\end{tabular} + +& + +\begin{tabular}{lll} +$a$ & $b$ & $\texttt{min}_q(a,b)$ \\ +\hline +0 & 1 & 1 \\ +1 & 2 & 3 \\ +2 & 3 & 4 \\ +3 & 4 & 6 \\ +4 & 5 & 1 \\ +5 & 6 & 1 \\ +6 & 7 & 2 \\ +\\ +\end{tabular} + +& + +\begin{tabular}{lll} +$a$ & $b$ & $\texttt{min}_q(a,b)$ \\ +\hline +0 & 3 & 1 \\ +1 & 4 & 3 \\ +2 & 5 & 1 \\ +3 & 6 & 1 \\ +4 & 7 & 1 \\ +0 & 7 & 1 \\ +\\ +\\ +\end{tabular} + +\end{tabular} +\end{center} + +The number of precalculated values is $O(n \log n)$, +because there are $O(\log n)$ range lengths +that are powers of two. +The values can be calculated efficiently +using the recursive formula +\[\texttt{min}_q(a,b) = \min(\texttt{min}_q(a,a+w-1),\texttt{min}_q(a+w,b)),\] +where $b-a+1$ is a power of two and $w=(b-a+1)/2$. +Calculating all those values takes $O(n \log n)$ time. + +After this, any value of $\texttt{min}_q(a,b)$ can be calculated +in $O(1)$ time as a minimum of two precalculated values. +Let $k$ be the largest power of two that does not exceed $b-a+1$. 
+We can calculate the value of $\texttt{min}_q(a,b)$ using the formula +\[\texttt{min}_q(a,b) = \min(\texttt{min}_q(a,a+k-1),\texttt{min}_q(b-k+1,b)).\] +In the above formula, the range $[a,b]$ is represented +as the union of the ranges $[a,a+k-1]$ and $[b-k+1,b]$, both of length $k$. + +As an example, consider the range $[1,6]$: +\begin{center} +\begin{tikzpicture}[scale=0.7] +\fill[color=lightgray] (1,0) rectangle (7,1); +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$3$}; +\node at (2.5,0.5) {$4$}; +\node at (3.5,0.5) {$8$}; +\node at (4.5,0.5) {$6$}; +\node at (5.5,0.5) {$1$}; +\node at (6.5,0.5) {$4$}; +\node at (7.5,0.5) {$2$}; + +\footnotesize +\node at (0.5,1.4) {$0$}; +\node at (1.5,1.4) {$1$}; +\node at (2.5,1.4) {$2$}; +\node at (3.5,1.4) {$3$}; +\node at (4.5,1.4) {$4$}; +\node at (5.5,1.4) {$5$}; +\node at (6.5,1.4) {$6$}; +\node at (7.5,1.4) {$7$}; +\end{tikzpicture} +\end{center} +The length of the range is 6, +and the largest power of two that does +not exceed 6 is 4. 
+Thus the range $[1,6]$ is +the union of the ranges $[1,4]$ and $[3,6]$: +\begin{center} +\begin{tikzpicture}[scale=0.7] +\fill[color=lightgray] (1,0) rectangle (5,1); +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$3$}; +\node at (2.5,0.5) {$4$}; +\node at (3.5,0.5) {$8$}; +\node at (4.5,0.5) {$6$}; +\node at (5.5,0.5) {$1$}; +\node at (6.5,0.5) {$4$}; +\node at (7.5,0.5) {$2$}; + +\footnotesize +\node at (0.5,1.4) {$0$}; +\node at (1.5,1.4) {$1$}; +\node at (2.5,1.4) {$2$}; +\node at (3.5,1.4) {$3$}; +\node at (4.5,1.4) {$4$}; +\node at (5.5,1.4) {$5$}; +\node at (6.5,1.4) {$6$}; +\node at (7.5,1.4) {$7$}; +\end{tikzpicture} +\end{center} +\begin{center} +\begin{tikzpicture}[scale=0.7] +\fill[color=lightgray] (3,0) rectangle (7,1); +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$3$}; +\node at (2.5,0.5) {$4$}; +\node at (3.5,0.5) {$8$}; +\node at (4.5,0.5) {$6$}; +\node at (5.5,0.5) {$1$}; +\node at (6.5,0.5) {$4$}; +\node at (7.5,0.5) {$2$}; + + +\footnotesize +\node at (0.5,1.4) {$0$}; +\node at (1.5,1.4) {$1$}; +\node at (2.5,1.4) {$2$}; +\node at (3.5,1.4) {$3$}; +\node at (4.5,1.4) {$4$}; +\node at (5.5,1.4) {$5$}; +\node at (6.5,1.4) {$6$}; +\node at (7.5,1.4) {$7$}; +\end{tikzpicture} +\end{center} +Since $\texttt{min}_q(1,4)=3$ and $\texttt{min}_q(3,6)=1$, +we conclude that $\texttt{min}_q(1,6)=1$. + +\section{Binary indexed tree} + +\index{binary indexed tree} +\index{Fenwick tree} + +A \key{binary indexed tree} or a \key{Fenwick tree}\footnote{The +binary indexed tree structure was presented by P. M. Fenwick in 1994 \cite{fen94}.} +can be seen as a dynamic variant of a prefix sum array. +It supports two $O(\log n)$ time operations on an array: +processing a range sum query and updating a value. + +The advantage of a binary indexed tree is +that it allows us to efficiently update +array values between sum queries. 
+This would not be possible using a prefix sum array,
+because after each update, it would be necessary to build the
+whole prefix sum array again in $O(n)$ time.
+
+\subsubsection{Structure}
+
+Even though the name of the structure is a binary indexed \emph{tree},
+it is usually represented as an array.
+In this section we assume that all arrays are one-indexed,
+because it makes the implementation easier.
+
+Let $p(k)$ denote the largest power of two that
+divides $k$.
+We store a binary indexed tree as an array \texttt{tree}
+such that
+\[ \texttt{tree}[k] = \texttt{sum}_q(k-p(k)+1,k),\]
+i.e., each position $k$ contains the sum of values
+in a range of the original array whose length is $p(k)$
+and that ends at position $k$.
+For example, since $p(6)=2$, $\texttt{tree}[6]$
+contains the value of $\texttt{sum}_q(5,6)$.
+
+As an example, consider the following array:
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$3$};
+\node at (2.5,0.5) {$4$};
+\node at (3.5,0.5) {$8$};
+\node at (4.5,0.5) {$6$};
+\node at (5.5,0.5) {$1$};
+\node at (6.5,0.5) {$4$};
+\node at (7.5,0.5) {$2$};
+
+\footnotesize
+\node at (0.5,1.4) {$1$};
+\node at (1.5,1.4) {$2$};
+\node at (2.5,1.4) {$3$};
+\node at (3.5,1.4) {$4$};
+\node at (4.5,1.4) {$5$};
+\node at (5.5,1.4) {$6$};
+\node at (6.5,1.4) {$7$};
+\node at (7.5,1.4) {$8$};
+\end{tikzpicture}
+\end{center}
+
+The corresponding binary indexed tree is as follows:
+\begin{center}
+\begin{tikzpicture}[scale=0.7]
+\draw (0,0) grid (8,1);
+
+\node at (0.5,0.5) {$1$};
+\node at (1.5,0.5) {$4$};
+\node at (2.5,0.5) {$4$};
+\node at (3.5,0.5) {$16$};
+\node at (4.5,0.5) {$6$};
+\node at (5.5,0.5) {$7$};
+\node at (6.5,0.5) {$4$};
+\node at (7.5,0.5) {$29$};
+
+\footnotesize
+\node at (0.5,1.4) {$1$};
+\node at (1.5,1.4) {$2$};
+\node at (2.5,1.4) {$3$};
+\node at (3.5,1.4) {$4$};
+\node at (4.5,1.4) {$5$};
+\node at (5.5,1.4) {$6$};
+\node at (6.5,1.4) {$7$};
+\node at
(7.5,1.4) {$8$}; +\end{tikzpicture} +\end{center} + +The following picture shows more clearly +how each value in the binary indexed tree +corresponds to a range in the original array: + +\begin{center} +\begin{tikzpicture}[scale=0.7] +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$4$}; +\node at (2.5,0.5) {$4$}; +\node at (3.5,0.5) {$16$}; +\node at (4.5,0.5) {$6$}; +\node at (5.5,0.5) {$7$}; +\node at (6.5,0.5) {$4$}; +\node at (7.5,0.5) {$29$}; + +\footnotesize +\node at (0.5,1.4) {$1$}; +\node at (1.5,1.4) {$2$}; +\node at (2.5,1.4) {$3$}; +\node at (3.5,1.4) {$4$}; +\node at (4.5,1.4) {$5$}; +\node at (5.5,1.4) {$6$}; +\node at (6.5,1.4) {$7$}; +\node at (7.5,1.4) {$8$}; + +\draw[->,thick] (0.5,-0.9) -- (0.5,-0.1); +\draw[->,thick] (2.5,-0.9) -- (2.5,-0.1); +\draw[->,thick] (4.5,-0.9) -- (4.5,-0.1); +\draw[->,thick] (6.5,-0.9) -- (6.5,-0.1); +\draw[->,thick] (1.5,-1.9) -- (1.5,-0.1); +\draw[->,thick] (5.5,-1.9) -- (5.5,-0.1); +\draw[->,thick] (3.5,-2.9) -- (3.5,-0.1); +\draw[->,thick] (7.5,-3.9) -- (7.5,-0.1); + +\draw (0,-1) -- (1,-1) -- (1,-1.5) -- (0,-1.5) -- (0,-1); +\draw (2,-1) -- (3,-1) -- (3,-1.5) -- (2,-1.5) -- (2,-1); +\draw (4,-1) -- (5,-1) -- (5,-1.5) -- (4,-1.5) -- (4,-1); +\draw (6,-1) -- (7,-1) -- (7,-1.5) -- (6,-1.5) -- (6,-1); +\draw (0,-2) -- (2,-2) -- (2,-2.5) -- (0,-2.5) -- (0,-2); +\draw (4,-2) -- (6,-2) -- (6,-2.5) -- (4,-2.5) -- (4,-2); +\draw (0,-3) -- (4,-3) -- (4,-3.5) -- (0,-3.5) -- (0,-3); +\draw (0,-4) -- (8,-4) -- (8,-4.5) -- (0,-4.5) -- (0,-4); +\end{tikzpicture} +\end{center} + +Using a binary indexed tree, +any value of $\texttt{sum}_q(1,k)$ +can be calculated in $O(\log n)$ time, +because a range $[1,k]$ can always be divided into +$O(\log n)$ ranges whose sums are stored in the tree. 
+ +For example, the range $[1,7]$ consists of +the following ranges: +\begin{center} +\begin{tikzpicture}[scale=0.7] +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$4$}; +\node at (2.5,0.5) {$4$}; +\node at (3.5,0.5) {$16$}; +\node at (4.5,0.5) {$6$}; +\node at (5.5,0.5) {$7$}; +\node at (6.5,0.5) {$4$}; +\node at (7.5,0.5) {$29$}; + +\footnotesize +\node at (0.5,1.4) {$1$}; +\node at (1.5,1.4) {$2$}; +\node at (2.5,1.4) {$3$}; +\node at (3.5,1.4) {$4$}; +\node at (4.5,1.4) {$5$}; +\node at (5.5,1.4) {$6$}; +\node at (6.5,1.4) {$7$}; +\node at (7.5,1.4) {$8$}; + +\draw[->,thick] (0.5,-0.9) -- (0.5,-0.1); +\draw[->,thick] (2.5,-0.9) -- (2.5,-0.1); +\draw[->,thick] (4.5,-0.9) -- (4.5,-0.1); +\draw[->,thick] (6.5,-0.9) -- (6.5,-0.1); +\draw[->,thick] (1.5,-1.9) -- (1.5,-0.1); +\draw[->,thick] (5.5,-1.9) -- (5.5,-0.1); +\draw[->,thick] (3.5,-2.9) -- (3.5,-0.1); +\draw[->,thick] (7.5,-3.9) -- (7.5,-0.1); + +\draw (0,-1) -- (1,-1) -- (1,-1.5) -- (0,-1.5) -- (0,-1); +\draw (2,-1) -- (3,-1) -- (3,-1.5) -- (2,-1.5) -- (2,-1); +\draw (4,-1) -- (5,-1) -- (5,-1.5) -- (4,-1.5) -- (4,-1); +\draw[fill=lightgray] (6,-1) -- (7,-1) -- (7,-1.5) -- (6,-1.5) -- (6,-1); +\draw (0,-2) -- (2,-2) -- (2,-2.5) -- (0,-2.5) -- (0,-2); +\draw[fill=lightgray] (4,-2) -- (6,-2) -- (6,-2.5) -- (4,-2.5) -- (4,-2); +\draw[fill=lightgray] (0,-3) -- (4,-3) -- (4,-3.5) -- (0,-3.5) -- (0,-3); +\draw (0,-4) -- (8,-4) -- (8,-4.5) -- (0,-4.5) -- (0,-4); +\end{tikzpicture} +\end{center} +Thus, we can calculate the corresponding sum as follows: +\[\texttt{sum}_q(1,7)=\texttt{sum}_q(1,4)+\texttt{sum}_q(5,6)+\texttt{sum}_q(7,7)=16+7+4=27\] + +To calculate the value of $\texttt{sum}_q(a,b)$ where $a>1$, +we can use the same trick that we used with prefix sum arrays: +\[ \texttt{sum}_q(a,b) = \texttt{sum}_q(1,b) - \texttt{sum}_q(1,a-1).\] +Since we can calculate both $\texttt{sum}_q(1,b)$ +and $\texttt{sum}_q(1,a-1)$ in $O(\log n)$ time, +the total time complexity is $O(\log n)$. 
+ +Then, after updating a value in the original array, +several values in the binary indexed tree +should be updated. +For example, if the value at position 3 changes, +the sums of the following ranges change: +\begin{center} +\begin{tikzpicture}[scale=0.7] +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$1$}; +\node at (1.5,0.5) {$4$}; +\node at (2.5,0.5) {$4$}; +\node at (3.5,0.5) {$16$}; +\node at (4.5,0.5) {$6$}; +\node at (5.5,0.5) {$7$}; +\node at (6.5,0.5) {$4$}; +\node at (7.5,0.5) {$29$}; + +\footnotesize +\node at (0.5,1.4) {$1$}; +\node at (1.5,1.4) {$2$}; +\node at (2.5,1.4) {$3$}; +\node at (3.5,1.4) {$4$}; +\node at (4.5,1.4) {$5$}; +\node at (5.5,1.4) {$6$}; +\node at (6.5,1.4) {$7$}; +\node at (7.5,1.4) {$8$}; + +\draw[->,thick] (0.5,-0.9) -- (0.5,-0.1); +\draw[->,thick] (2.5,-0.9) -- (2.5,-0.1); +\draw[->,thick] (4.5,-0.9) -- (4.5,-0.1); +\draw[->,thick] (6.5,-0.9) -- (6.5,-0.1); +\draw[->,thick] (1.5,-1.9) -- (1.5,-0.1); +\draw[->,thick] (5.5,-1.9) -- (5.5,-0.1); +\draw[->,thick] (3.5,-2.9) -- (3.5,-0.1); +\draw[->,thick] (7.5,-3.9) -- (7.5,-0.1); + +\draw (0,-1) -- (1,-1) -- (1,-1.5) -- (0,-1.5) -- (0,-1); +\draw[fill=lightgray] (2,-1) -- (3,-1) -- (3,-1.5) -- (2,-1.5) -- (2,-1); +\draw (4,-1) -- (5,-1) -- (5,-1.5) -- (4,-1.5) -- (4,-1); +\draw (6,-1) -- (7,-1) -- (7,-1.5) -- (6,-1.5) -- (6,-1); +\draw (0,-2) -- (2,-2) -- (2,-2.5) -- (0,-2.5) -- (0,-2); +\draw (4,-2) -- (6,-2) -- (6,-2.5) -- (4,-2.5) -- (4,-2); +\draw[fill=lightgray] (0,-3) -- (4,-3) -- (4,-3.5) -- (0,-3.5) -- (0,-3); +\draw[fill=lightgray] (0,-4) -- (8,-4) -- (8,-4.5) -- (0,-4.5) -- (0,-4); +\end{tikzpicture} +\end{center} + +Since each array element belongs to $O(\log n)$ +ranges in the binary indexed tree, +it suffices to update $O(\log n)$ values in the tree. + +\subsubsection{Implementation} + +The operations of a binary indexed tree can be +efficiently implemented using bit operations. 
+The key fact needed is that we can
+calculate any value of $p(k)$ using the formula
+\[p(k) = k \& -k.\]
+
+The following function calculates the value of $\texttt{sum}_q(1,k)$:
+\begin{lstlisting}
+int sum(int k) {
+    int s = 0;
+    while (k >= 1) {
+        s += tree[k];
+        k -= k&-k;
+    }
+    return s;
+}
+\end{lstlisting}
+
+The following function increases the
+array value at position $k$ by $x$
+($x$ can be positive or negative):
+\begin{lstlisting}
+void add(int k, int x) {
+    while (k <= n) {
+        tree[k] += x;
+        k += k&-k;
+    }
+}
+\end{lstlisting}
+
+The time complexity of both functions is
+$O(\log n)$, because the functions access $O(\log n)$
+values in the binary indexed tree, and each move
+to the next position takes $O(1)$ time.
+
+\section{Segment tree}
+
+\index{segment tree}
+
+A \key{segment tree}\footnote{The bottom-up implementation in this chapter corresponds to
+that in \cite{sta06}. Similar structures were used
+in the late 1970s to solve geometric problems \cite{ben80}.} is a data structure
+that supports two operations:
+processing a range query and
+updating an array value.
+Segment trees can support
+sum queries, minimum and maximum queries, and many other
+queries so that both operations work in $O(\log n)$ time.
+
+Compared to a binary indexed tree,
+the advantage of a segment tree is that it is
+a more general data structure.
+While binary indexed trees only support
+sum queries\footnote{In fact, using \emph{two} binary
+indexed trees it is possible to support minimum queries \cite{dim15},
+but this is more complicated than using a segment tree.},
+segment trees also support other queries.
+On the other hand, a segment tree requires more
+memory and is a bit more difficult to implement.
+
+\subsubsection{Structure}
+
+A segment tree is a binary tree
+such that the nodes on the bottom level of the tree
+correspond to the array elements,
+and the other nodes
+contain information needed for processing range queries.
+ +In this section, we assume that the size +of the array is a power of two and zero-based +indexing is used, because it is convenient to build +a segment tree for such an array. +If the size of the array is not a power of two, +we can always append extra elements to it. + +We will first discuss segment trees that support sum queries. +As an example, consider the following array: +\begin{center} +\begin{tikzpicture}[scale=0.7] +\draw (0,0) grid (8,1); + +\node at (0.5,0.5) {$5$}; +\node at (1.5,0.5) {$8$}; +\node at (2.5,0.5) {$6$}; +\node at (3.5,0.5) {$3$}; +\node at (4.5,0.5) {$2$}; +\node at (5.5,0.5) {$7$}; +\node at (6.5,0.5) {$2$}; +\node at (7.5,0.5) {$6$}; + +\footnotesize +\node at (0.5,1.4) {$0$}; +\node at (1.5,1.4) {$1$}; +\node at (2.5,1.4) {$2$}; +\node at (3.5,1.4) {$3$}; +\node at (4.5,1.4) {$4$}; +\node at (5.5,1.4) {$5$}; +\node at (6.5,1.4) {$6$}; +\node at (7.5,1.4) {$7$}; +\end{tikzpicture} +\end{center} +The corresponding segment tree is as follows: +\begin{center} +\begin{tikzpicture}[scale=0.7] +\draw (0,0) grid (8,1); + +\node[anchor=center] at (0.5, 0.5) {5}; +\node[anchor=center] at (1.5, 0.5) {8}; +\node[anchor=center] at (2.5, 0.5) {6}; +\node[anchor=center] at (3.5, 0.5) {3}; +\node[anchor=center] at (4.5, 0.5) {2}; +\node[anchor=center] at (5.5, 0.5) {7}; +\node[anchor=center] at (6.5, 0.5) {2}; +\node[anchor=center] at (7.5, 0.5) {6}; + +\node[draw, circle] (a) at (1,2.5) {13}; +\path[draw,thick,-] (a) -- (0.5,1); +\path[draw,thick,-] (a) -- (1.5,1); +\node[draw, circle,minimum size=22pt] (b) at (3,2.5) {9}; +\path[draw,thick,-] (b) -- (2.5,1); +\path[draw,thick,-] (b) -- (3.5,1); +\node[draw, circle,minimum size=22pt] (c) at (5,2.5) {9}; +\path[draw,thick,-] (c) -- (4.5,1); +\path[draw,thick,-] (c) -- (5.5,1); +\node[draw, circle,minimum size=22pt] (d) at (7,2.5) {8}; +\path[draw,thick,-] (d) -- (6.5,1); +\path[draw,thick,-] (d) -- (7.5,1); + +\node[draw, circle] (i) at (2,4.5) {22}; +\path[draw,thick,-] (i) -- (a); 
+\path[draw,thick,-] (i) -- (b); +\node[draw, circle] (j) at (6,4.5) {17}; +\path[draw,thick,-] (j) -- (c); +\path[draw,thick,-] (j) -- (d); + +\node[draw, circle] (m) at (4,6.5) {39}; +\path[draw,thick,-] (m) -- (i); +\path[draw,thick,-] (m) -- (j); +\end{tikzpicture} +\end{center} + +Each internal tree node +corresponds to an array range +whose size is a power of two. +In the above tree, the value of each internal +node is the sum of the corresponding array values, +and it can be calculated as the sum of +the values of its left and right child node. + +It turns out that any range $[a,b]$ +can be divided into $O(\log n)$ ranges +whose values are stored in tree nodes. +For example, consider the range [2,7]: +\begin{center} +\begin{tikzpicture}[scale=0.7] +\fill[color=gray!50] (2,0) rectangle (8,1); +\draw (0,0) grid (8,1); + +\node[anchor=center] at (0.5, 0.5) {5}; +\node[anchor=center] at (1.5, 0.5) {8}; +\node[anchor=center] at (2.5, 0.5) {6}; +\node[anchor=center] at (3.5, 0.5) {3}; +\node[anchor=center] at (4.5, 0.5) {2}; +\node[anchor=center] at (5.5, 0.5) {7}; +\node[anchor=center] at (6.5, 0.5) {2}; +\node[anchor=center] at (7.5, 0.5) {6}; + +\footnotesize +\node at (0.5,1.4) {$0$}; +\node at (1.5,1.4) {$1$}; +\node at (2.5,1.4) {$2$}; +\node at (3.5,1.4) {$3$}; +\node at (4.5,1.4) {$4$}; +\node at (5.5,1.4) {$5$}; +\node at (6.5,1.4) {$6$}; +\node at (7.5,1.4) {$7$}; +\end{tikzpicture} +\end{center} +Here $\texttt{sum}_q(2,7)=6+3+2+7+2+6=26$. 
+In this case, the following two tree nodes +correspond to the range: +\begin{center} +\begin{tikzpicture}[scale=0.7] +\draw (0,0) grid (8,1); + +\node[anchor=center] at (0.5, 0.5) {5}; +\node[anchor=center] at (1.5, 0.5) {8}; +\node[anchor=center] at (2.5, 0.5) {6}; +\node[anchor=center] at (3.5, 0.5) {3}; +\node[anchor=center] at (4.5, 0.5) {2}; +\node[anchor=center] at (5.5, 0.5) {7}; +\node[anchor=center] at (6.5, 0.5) {2}; +\node[anchor=center] at (7.5, 0.5) {6}; + +\node[draw, circle] (a) at (1,2.5) {13}; +\path[draw,thick,-] (a) -- (0.5,1); +\path[draw,thick,-] (a) -- (1.5,1); +\node[draw, circle,fill=gray!50,minimum size=22pt] (b) at (3,2.5) {9}; +\path[draw,thick,-] (b) -- (2.5,1); +\path[draw,thick,-] (b) -- (3.5,1); +\node[draw, circle,minimum size=22pt] (c) at (5,2.5) {9}; +\path[draw,thick,-] (c) -- (4.5,1); +\path[draw,thick,-] (c) -- (5.5,1); +\node[draw, circle,minimum size=22pt] (d) at (7,2.5) {8}; +\path[draw,thick,-] (d) -- (6.5,1); +\path[draw,thick,-] (d) -- (7.5,1); + +\node[draw, circle] (i) at (2,4.5) {22}; +\path[draw,thick,-] (i) -- (a); +\path[draw,thick,-] (i) -- (b); +\node[draw, circle,fill=gray!50] (j) at (6,4.5) {17}; +\path[draw,thick,-] (j) -- (c); +\path[draw,thick,-] (j) -- (d); + +\node[draw, circle] (m) at (4,6.5) {39}; +\path[draw,thick,-] (m) -- (i); +\path[draw,thick,-] (m) -- (j); +\end{tikzpicture} +\end{center} +Thus, another way to calculate the sum is $9+17=26$. + +When the sum is calculated using nodes +located as high as possible in the tree, +at most two nodes on each level +of the tree are needed. +Hence, the total number of nodes +is $O(\log n)$. + +After an array update, +we should update all nodes +whose value depends on the updated value. +This can be done by traversing the path +from the updated array element to the top node +and updating the nodes along the path. 
+ +The following picture shows which tree nodes +change if the array value 7 changes: + +\begin{center} +\begin{tikzpicture}[scale=0.7] +\fill[color=gray!50] (5,0) rectangle (6,1); +\draw (0,0) grid (8,1); + +\node[anchor=center] at (0.5, 0.5) {5}; +\node[anchor=center] at (1.5, 0.5) {8}; +\node[anchor=center] at (2.5, 0.5) {6}; +\node[anchor=center] at (3.5, 0.5) {3}; +\node[anchor=center] at (4.5, 0.5) {2}; +\node[anchor=center] at (5.5, 0.5) {7}; +\node[anchor=center] at (6.5, 0.5) {2}; +\node[anchor=center] at (7.5, 0.5) {6}; + +\node[draw, circle] (a) at (1,2.5) {13}; +\path[draw,thick,-] (a) -- (0.5,1); +\path[draw,thick,-] (a) -- (1.5,1); +\node[draw, circle,minimum size=22pt] (b) at (3,2.5) {9}; +\path[draw,thick,-] (b) -- (2.5,1); +\path[draw,thick,-] (b) -- (3.5,1); +\node[draw, circle,minimum size=22pt,fill=gray!50] (c) at (5,2.5) {9}; +\path[draw,thick,-] (c) -- (4.5,1); +\path[draw,thick,-] (c) -- (5.5,1); +\node[draw, circle,minimum size=22pt] (d) at (7,2.5) {8}; +\path[draw,thick,-] (d) -- (6.5,1); +\path[draw,thick,-] (d) -- (7.5,1); + +\node[draw, circle] (i) at (2,4.5) {22}; +\path[draw,thick,-] (i) -- (a); +\path[draw,thick,-] (i) -- (b); +\node[draw, circle,fill=gray!50] (j) at (6,4.5) {17}; +\path[draw,thick,-] (j) -- (c); +\path[draw,thick,-] (j) -- (d); + +\node[draw, circle,fill=gray!50] (m) at (4,6.5) {39}; +\path[draw,thick,-] (m) -- (i); +\path[draw,thick,-] (m) -- (j); +\end{tikzpicture} +\end{center} + +The path from bottom to top +always consists of $O(\log n)$ nodes, +so each update changes $O(\log n)$ nodes in the tree. + +\subsubsection{Implementation} + +We store a segment tree as an array +of $2n$ elements where $n$ is the size of +the original array and a power of two. +The tree nodes are stored from top to bottom: +$\texttt{tree}[1]$ is the top node, +$\texttt{tree}[2]$ and $\texttt{tree}[3]$ +are its children, and so on. 
+Finally, the values from $\texttt{tree}[n]$ +to $\texttt{tree}[2n-1]$ correspond to +the values of the original array +on the bottom level of the tree. + +For example, the segment tree +\begin{center} +\begin{tikzpicture}[scale=0.7] +\draw (0,0) grid (8,1); + +\node[anchor=center] at (0.5, 0.5) {5}; +\node[anchor=center] at (1.5, 0.5) {8}; +\node[anchor=center] at (2.5, 0.5) {6}; +\node[anchor=center] at (3.5, 0.5) {3}; +\node[anchor=center] at (4.5, 0.5) {2}; +\node[anchor=center] at (5.5, 0.5) {7}; +\node[anchor=center] at (6.5, 0.5) {2}; +\node[anchor=center] at (7.5, 0.5) {6}; + +\node[draw, circle] (a) at (1,2.5) {13}; +\path[draw,thick,-] (a) -- (0.5,1); +\path[draw,thick,-] (a) -- (1.5,1); +\node[draw, circle,minimum size=22pt] (b) at (3,2.5) {9}; +\path[draw,thick,-] (b) -- (2.5,1); +\path[draw,thick,-] (b) -- (3.5,1); +\node[draw, circle,minimum size=22pt] (c) at (5,2.5) {9}; +\path[draw,thick,-] (c) -- (4.5,1); +\path[draw,thick,-] (c) -- (5.5,1); +\node[draw, circle,minimum size=22pt] (d) at (7,2.5) {8}; +\path[draw,thick,-] (d) -- (6.5,1); +\path[draw,thick,-] (d) -- (7.5,1); + +\node[draw, circle] (i) at (2,4.5) {22}; +\path[draw,thick,-] (i) -- (a); +\path[draw,thick,-] (i) -- (b); +\node[draw, circle] (j) at (6,4.5) {17}; +\path[draw,thick,-] (j) -- (c); +\path[draw,thick,-] (j) -- (d); + +\node[draw, circle] (m) at (4,6.5) {39}; +\path[draw,thick,-] (m) -- (i); +\path[draw,thick,-] (m) -- (j); +\end{tikzpicture} +\end{center} +is stored as follows: +\begin{center} +\begin{tikzpicture}[scale=0.7] +\draw (0,0) grid (15,1); + +\node at (0.5,0.5) {$39$}; +\node at (1.5,0.5) {$22$}; +\node at (2.5,0.5) {$17$}; +\node at (3.5,0.5) {$13$}; +\node at (4.5,0.5) {$9$}; +\node at (5.5,0.5) {$9$}; +\node at (6.5,0.5) {$8$}; +\node at (7.5,0.5) {$5$}; +\node at (8.5,0.5) {$8$}; +\node at (9.5,0.5) {$6$}; +\node at (10.5,0.5) {$3$}; +\node at (11.5,0.5) {$2$}; +\node at (12.5,0.5) {$7$}; +\node at (13.5,0.5) {$2$}; +\node at (14.5,0.5) {$6$}; + +\footnotesize 
+\node at (0.5,1.4) {$1$}; +\node at (1.5,1.4) {$2$}; +\node at (2.5,1.4) {$3$}; +\node at (3.5,1.4) {$4$}; +\node at (4.5,1.4) {$5$}; +\node at (5.5,1.4) {$6$}; +\node at (6.5,1.4) {$7$}; +\node at (7.5,1.4) {$8$}; +\node at (8.5,1.4) {$9$}; +\node at (9.5,1.4) {$10$}; +\node at (10.5,1.4) {$11$}; +\node at (11.5,1.4) {$12$}; +\node at (12.5,1.4) {$13$}; +\node at (13.5,1.4) {$14$}; +\node at (14.5,1.4) {$15$}; +\end{tikzpicture} +\end{center} +Using this representation, +the parent of $\texttt{tree}[k]$ +is $\texttt{tree}[\lfloor k/2 \rfloor]$, +and its children are $\texttt{tree}[2k]$ +and $\texttt{tree}[2k+1]$. +Note that this implies that the position of a node +is even if it is a left child and odd if it is a right child. + +The following function +calculates the value of $\texttt{sum}_q(a,b)$: +\begin{lstlisting} +int sum(int a, int b) { + a += n; b += n; + int s = 0; + while (a <= b) { + if (a%2 == 1) s += tree[a++]; + if (b%2 == 0) s += tree[b--]; + a /= 2; b /= 2; + } + return s; +} +\end{lstlisting} +The function maintains a range +that is initially $[a+n,b+n]$. +Then, at each step, the range is moved +one level higher in the tree, +and before that, the values of the nodes that do not +belong to the higher range are added to the sum. + +The following function increases the array value +at position $k$ by $x$: +\begin{lstlisting} +void add(int k, int x) { + k += n; + tree[k] += x; + for (k /= 2; k >= 1; k /= 2) { + tree[k] = tree[2*k]+tree[2*k+1]; + } +} +\end{lstlisting} +First the function updates the value +at the bottom level of the tree. +After this, the function updates the values of all +internal tree nodes, until it reaches +the top node of the tree. + +Both the above functions work +in $O(\log n)$ time, because a segment tree +of $n$ elements consists of $O(\log n)$ levels, +and the functions move one level higher +in the tree at each step. 
+ +\subsubsection{Other queries} + +Segment trees can support all range queries +where it is possible to divide a range into two parts, +calculate the answer separately for both parts +and then efficiently combine the answers. +Examples of such queries are +minimum and maximum, greatest common divisor, +and bit operations and, or and xor. + +For example, the following segment tree +supports minimum queries: + +\begin{center} +\begin{tikzpicture}[scale=0.7] +\draw (0,0) grid (8,1); + +\node[anchor=center] at (0.5, 0.5) {5}; +\node[anchor=center] at (1.5, 0.5) {8}; +\node[anchor=center] at (2.5, 0.5) {6}; +\node[anchor=center] at (3.5, 0.5) {3}; +\node[anchor=center] at (4.5, 0.5) {1}; +\node[anchor=center] at (5.5, 0.5) {7}; +\node[anchor=center] at (6.5, 0.5) {2}; +\node[anchor=center] at (7.5, 0.5) {6}; + +\node[draw, circle,minimum size=22pt] (a) at (1,2.5) {5}; +\path[draw,thick,-] (a) -- (0.5,1); +\path[draw,thick,-] (a) -- (1.5,1); +\node[draw, circle,minimum size=22pt] (b) at (3,2.5) {3}; +\path[draw,thick,-] (b) -- (2.5,1); +\path[draw,thick,-] (b) -- (3.5,1); +\node[draw, circle,minimum size=22pt] (c) at (5,2.5) {1}; +\path[draw,thick,-] (c) -- (4.5,1); +\path[draw,thick,-] (c) -- (5.5,1); +\node[draw, circle,minimum size=22pt] (d) at (7,2.5) {2}; +\path[draw,thick,-] (d) -- (6.5,1); +\path[draw,thick,-] (d) -- (7.5,1); + +\node[draw, circle,minimum size=22pt] (i) at (2,4.5) {3}; +\path[draw,thick,-] (i) -- (a); +\path[draw,thick,-] (i) -- (b); +\node[draw, circle,minimum size=22pt] (j) at (6,4.5) {1}; +\path[draw,thick,-] (j) -- (c); +\path[draw,thick,-] (j) -- (d); + +\node[draw, circle,minimum size=22pt] (m) at (4,6.5) {1}; +\path[draw,thick,-] (m) -- (i); +\path[draw,thick,-] (m) -- (j); +\end{tikzpicture} +\end{center} + +In this case, every tree node contains +the smallest value in the corresponding +array range. +The top node of the tree contains the smallest +value in the whole array. 
+The operations can be implemented like previously, +but instead of sums, minima are calculated. + +The structure of a segment tree also allows us +to use binary search for locating array elements. +For example, if the tree supports minimum queries, +we can find the position of an element +with the smallest value in $O(\log n)$ time. + +For example, in the above tree, an +element with the smallest value 1 can be found +by traversing a path downwards from the top node: + +\begin{center} +\begin{tikzpicture}[scale=0.7] +\draw (0,0) grid (8,1); + +\node[anchor=center] at (0.5, 0.5) {5}; +\node[anchor=center] at (1.5, 0.5) {8}; +\node[anchor=center] at (2.5, 0.5) {6}; +\node[anchor=center] at (3.5, 0.5) {3}; +\node[anchor=center] at (4.5, 0.5) {1}; +\node[anchor=center] at (5.5, 0.5) {7}; +\node[anchor=center] at (6.5, 0.5) {2}; +\node[anchor=center] at (7.5, 0.5) {6}; + +\node[draw, circle,minimum size=22pt] (a) at (1,2.5) {5}; +\path[draw,thick,-] (a) -- (0.5,1); +\path[draw,thick,-] (a) -- (1.5,1); +\node[draw, circle,minimum size=22pt] (b) at (3,2.5) {3}; +\path[draw,thick,-] (b) -- (2.5,1); +\path[draw,thick,-] (b) -- (3.5,1); +\node[draw, circle,minimum size=22pt] (c) at (5,2.5) {1}; +\path[draw,thick,-] (c) -- (4.5,1); +\path[draw,thick,-] (c) -- (5.5,1); +\node[draw, circle,minimum size=22pt] (d) at (7,2.5) {2}; +\path[draw,thick,-] (d) -- (6.5,1); +\path[draw,thick,-] (d) -- (7.5,1); + +\node[draw, circle,minimum size=22pt] (i) at (2,4.5) {3}; +\path[draw,thick,-] (i) -- (a); +\path[draw,thick,-] (i) -- (b); +\node[draw, circle,minimum size=22pt] (j) at (6,4.5) {1}; +\path[draw,thick,-] (j) -- (c); +\path[draw,thick,-] (j) -- (d); + +\node[draw, circle,minimum size=22pt] (m) at (4,6.5) {1}; +\path[draw,thick,-] (m) -- (i); +\path[draw,thick,-] (m) -- (j); + +\path[draw=red,thick,->,line width=2pt] (m) -- (j); +\path[draw=red,thick,->,line width=2pt] (j) -- (c); +\path[draw=red,thick,->,line width=2pt] (c) -- (4.5,1); +\end{tikzpicture} +\end{center} + 
+
+\section{Additional techniques}
+
+\subsubsection{Index compression}
+
+A limitation in data structures that
+are built upon an array is that
+the elements are indexed using
+consecutive integers.
+Difficulties arise when large indices
+are needed.
+For example, if we wish to use the index $10^9$,
+the array should contain $10^9$
+elements, which would require too much memory.
+
+\index{index compression}
+
+However, we can often bypass this limitation
+by using \key{index compression},
+where the original indices are replaced
+with indices $1,2,3,$ etc.
+This can be done if we know all the indices
+needed during the algorithm beforehand.
+
+The idea is to replace each original index $x$
+with $c(x)$ where $c$ is a function that
+compresses the indices.
+We require that the order of the indices
+does not change, so if $a<b$,
+then $c(a)<c(b)$.
+This allows us to conveniently perform queries
+even if the indices are compressed.
+
+\subsubsection{Bit shifts}
+
+The left bit shift $x < < k$ appends $k$
+zero bits to the number,
+and the right bit shift $x > > k$
+removes the $k$ last bits from the number.
+For example, $14 < < 2 = 56$,
+because $14$ and $56$ correspond to 1110 and 111000.
+Similarly, $49 > > 3 = 6$,
+because $49$ and $6$ correspond to 110001 and 110.
+
+Note that $x < < k$
+corresponds to multiplying $x$ by $2^k$,
+and $x > > k$
+corresponds to dividing $x$ by $2^k$
+rounded down to an integer.
+
+\subsubsection{Applications}
+
+A number of the form $1 < < k$ has a one bit
+in position $k$ and all other bits are zero,
+so we can use such numbers to access single bits of numbers.
+In particular, the $k$th bit of a number is one
+exactly when $x$ \& $(1 < < k)$ is not zero.
+The following code prints the bit representation
+of an \texttt{int} number $x$:
+
+\begin{lstlisting}
+for (int i = 31; i >= 0; i--) {
+    if (x&(1<<i)) cout << "1";
+    else cout << "0";
+}
+\end{lstlisting}