Compare commits: 7b0a21413d...f269ae3919

No commits in common. "7b0a21413d86d7e2498a02e10442b6d970068da7" and "f269ae391910742788ac0d6626df1e221281f191" have entirely different histories.

README.md (24 changes)
@@ -1,4 +1,22 @@
-# cphb
-
-SOI adjusted Competitive Programmer's Handbook
-(see https://github.com/pllk/cphb for the original)
+# Competitive Programmer's Handbook
+
+Competitive Programmer's Handbook is a modern introduction to competitive programming.
+The book discusses programming tricks and algorithm design techniques relevant in competitive programming.
+
+## CSES Problem Set
+
+The CSES Problem Set contains a collection of competitive programming problems.
+You can practice the techniques presented in the book by solving the problems.
+
+https://cses.fi/problemset/
+
+## License
+
+The license of the book is Creative Commons BY-NC-SA.
+
+## Other books
+
+Guide to Competitive Programming is a printed book, published by Springer, based on Competitive Programmer's Handbook.
+There is also a Russian edition Олимпиадное программирование (Olympiad Programming) and a Korean edition 알고리즘 트레이닝: 프로그래밍 대회 입문 가이드.
+
+https://cses.fi/book/
@@ -0,0 +1,131 @@
\documentclass[twoside,12pt,a4paper,english]{book}

%\includeonly{chapter04,list}

\usepackage[english]{babel}
\usepackage[utf8]{inputenc}
\usepackage{listings}
\usepackage[table]{xcolor}
\usepackage{tikz}
\usepackage{multicol}
\usepackage{hyperref}
\usepackage{array}
\usepackage{microtype}

\usepackage{fouriernc}
\usepackage[T1]{fontenc}

\usepackage{graphicx}
\usepackage{framed}
\usepackage{amssymb}
\usepackage{amsmath}

\usepackage{pifont}
\usepackage{ifthen}
\usepackage{makeidx}
\usepackage{enumitem}

\usepackage{titlesec}

\usepackage{skak}
\usepackage[scaled=0.95]{inconsolata}


\usetikzlibrary{patterns,snakes}
\pagestyle{plain}

\definecolor{keywords}{HTML}{44548A}
\definecolor{strings}{HTML}{00999A}
\definecolor{comments}{HTML}{990000}

\lstset{language=C++,frame=single,basicstyle=\ttfamily \small,showstringspaces=false,columns=flexible}
\lstset{
literate={ö}{{\"o}}1
{ä}{{\"a}}1
{ü}{{\"u}}1
}
\lstset{xleftmargin=20pt,xrightmargin=5pt}
\lstset{aboveskip=12pt,belowskip=8pt}

\lstset{
commentstyle=\color{comments},
keywordstyle=\color{keywords},
stringstyle=\color{strings}
}

\date{Draft \today}

\usepackage[a4paper,vmargin=30mm,hmargin=33mm,footskip=15mm]{geometry}

\title{\Huge Competitive Programmer's Handbook}
\author{\Large Antti Laaksonen}

\makeindex
\usepackage[totoc]{idxlayout}

\titleformat{\subsubsection}
{\normalfont\large\bfseries\sffamily}{\thesubsection}{1em}{}

\begin{document}

%\selectlanguage{finnish}

%\setcounter{page}{1}
%\pagenumbering{roman}

\frontmatter
\maketitle
\setcounter{tocdepth}{1}
\tableofcontents

\include{preface}

\mainmatter
\pagenumbering{arabic}
\setcounter{page}{1}

\newcommand{\key}[1] {\textbf{#1}}

\part{Basic techniques}
\include{chapter01}
\include{chapter02}
\include{chapter03}
\include{chapter04}
\include{chapter05}
\include{chapter06}
\include{chapter07}
\include{chapter08}
\include{chapter09}
\include{chapter10}
\part{Graph algorithms}
\include{chapter11}
\include{chapter12}
\include{chapter13}
\include{chapter14}
\include{chapter15}
\include{chapter16}
\include{chapter17}
\include{chapter18}
\include{chapter19}
\include{chapter20}
\part{Advanced topics}
\include{chapter21}
\include{chapter22}
\include{chapter23}
\include{chapter24}
\include{chapter25}
\include{chapter26}
\include{chapter27}
\include{chapter28}
\include{chapter29}
\include{chapter30}

\cleardoublepage
\phantomsection
\addcontentsline{toc}{chapter}{Bibliography}
\include{list}

\cleardoublepage
\printindex

\end{document}
@@ -0,0 +1,990 @@
|
||||||
|
\chapter{Introduction}
|
||||||
|
|
||||||
|
Competitive programming combines two topics:
|
||||||
|
(1) the design of algorithms and (2) the implementation of algorithms.
|
||||||
|
|
||||||
|
The \key{design of algorithms} consists of problem solving
|
||||||
|
and mathematical thinking.
|
||||||
|
Skills for analyzing problems and solving them
|
||||||
|
creatively are needed.
|
||||||
|
An algorithm for solving a problem
|
||||||
|
has to be both correct and efficient,
|
||||||
|
and the core of the problem is often
|
||||||
|
about inventing an efficient algorithm.
|
||||||
|
|
||||||
|
Theoretical knowledge of algorithms
|
||||||
|
is important to competitive programmers.
|
||||||
|
Typically, a solution to a problem is
|
||||||
|
a combination of well-known techniques and
|
||||||
|
new insights.
|
||||||
|
The techniques that appear in competitive programming
|
||||||
|
also form the basis for the scientific research
|
||||||
|
of algorithms.
|
||||||
|
|
||||||
|
The \key{implementation of algorithms} requires good
|
||||||
|
programming skills.
|
||||||
|
In competitive programming, the solutions
|
||||||
|
are graded by testing an implemented algorithm
|
||||||
|
using a set of test cases.
|
||||||
|
Thus, it is not enough that the idea of the
|
||||||
|
algorithm is correct, but the implementation also
|
||||||
|
has to be correct.
|
||||||
|
|
||||||
|
A good coding style in contests is
|
||||||
|
straightforward and concise.
|
||||||
|
Programs should be written quickly,
|
||||||
|
because there is not much time available.
|
||||||
|
Unlike in traditional software engineering,
|
||||||
|
the programs are short (usually at most a few
|
||||||
|
hundred lines of code), and they do not need to
|
||||||
|
be maintained after the contest.
|
||||||
|
|
||||||
|
\section{Programming languages}
|
||||||
|
|
||||||
|
\index{programming language}
|
||||||
|
|
||||||
|
At the moment, the most popular programming
|
||||||
|
languages used in contests are C++, Python and Java.
|
||||||
|
For example, in Google Code Jam 2017,
|
||||||
|
among the best 3,000 participants,
|
||||||
|
79 \% used C++,
|
||||||
|
16 \% used Python and
|
||||||
|
8 \% used Java \cite{goo17}.
|
||||||
|
Some participants also used several languages.
|
||||||
|
|
||||||
|
Many people think that C++ is the best choice
|
||||||
|
for a competitive programmer,
|
||||||
|
and C++ is nearly always available in
|
||||||
|
contest systems.
|
||||||
|
The benefits of using C++ are that
|
||||||
|
it is a very efficient language and
|
||||||
|
its standard library contains a
|
||||||
|
large collection
|
||||||
|
of data structures and algorithms.
|
||||||
|
|
||||||
|
On the other hand, it is good to
|
||||||
|
master several languages and understand
|
||||||
|
their strengths.
|
||||||
|
For example, if large integers are needed
|
||||||
|
in the problem,
|
||||||
|
Python can be a good choice, because it
|
||||||
|
contains built-in operations for
|
||||||
|
calculating with large integers.
|
||||||
|
Still, most problems in programming contests
|
||||||
|
are set so that
|
||||||
|
using a specific programming language
|
||||||
|
is not an unfair advantage.
|
||||||
|
|
||||||
|
All example programs in this book are written in C++,
|
||||||
|
and the standard library's
|
||||||
|
data structures and algorithms are often used.
|
||||||
|
The programs follow the C++11 standard,
|
||||||
|
which can be used in most contests nowadays.
|
||||||
|
If you cannot program in C++ yet,
|
||||||
|
now is a good time to start learning.
|
||||||
|
|
||||||
|
\subsubsection{C++ code template}
|
||||||
|
|
||||||
|
A typical C++ code template for competitive programming
|
||||||
|
looks like this:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
#include <bits/stdc++.h>
|
||||||
|
|
||||||
|
using namespace std;
|
||||||
|
|
||||||
|
int main() {
|
||||||
|
// solution comes here
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
The \texttt{\#include} line at the beginning
|
||||||
|
of the code is a feature of the \texttt{g++} compiler
|
||||||
|
that allows us to include the entire standard library.
|
||||||
|
Thus, it is not needed to separately include
|
||||||
|
libraries such as \texttt{iostream},
|
||||||
|
\texttt{vector} and \texttt{algorithm},
|
||||||
|
but rather they are available automatically.
|
||||||
|
|
||||||
|
The \texttt{using} line declares
|
||||||
|
that the classes and functions
|
||||||
|
of the standard library can be used directly
|
||||||
|
in the code.
|
||||||
|
Without the \texttt{using} line we would have
|
||||||
|
to write, for example, \texttt{std::cout},
|
||||||
|
but now it suffices to write \texttt{cout}.
|
||||||
|
|
||||||
|
The code can be compiled using the following command:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
g++ -std=c++11 -O2 -Wall test.cpp -o test
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
This command produces a binary file \texttt{test}
|
||||||
|
from the source code \texttt{test.cpp}.
|
||||||
|
The compiler follows the C++11 standard
|
||||||
|
(\texttt{-std=c++11}),
|
||||||
|
optimizes the code (\texttt{-O2})
|
||||||
|
and shows warnings about possible errors (\texttt{-Wall}).
|
||||||
|
|
||||||
|
\section{Input and output}
|
||||||
|
|
||||||
|
\index{input and output}
|
||||||
|
|
||||||
|
In most contests, standard streams are used for
|
||||||
|
reading input and writing output.
|
||||||
|
In C++, the standard streams are
|
||||||
|
\texttt{cin} for input and \texttt{cout} for output.
|
||||||
|
In addition, the C functions
|
||||||
|
\texttt{scanf} and \texttt{printf} can be used.
|
||||||
|
|
||||||
|
The input for the program usually consists of
|
||||||
|
numbers and strings that are separated with
|
||||||
|
spaces and newlines.
|
||||||
|
They can be read from the \texttt{cin} stream
|
||||||
|
as follows:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
int a, b;
|
||||||
|
string x;
|
||||||
|
cin >> a >> b >> x;
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
This kind of code always works,
|
||||||
|
assuming that there is at least one space
|
||||||
|
or newline between each element in the input.
|
||||||
|
For example, the above code can read
|
||||||
|
both of the following inputs:
|
||||||
|
\begin{lstlisting}
|
||||||
|
123 456 monkey
|
||||||
|
\end{lstlisting}
|
||||||
|
\begin{lstlisting}
|
||||||
|
123 456
|
||||||
|
monkey
|
||||||
|
\end{lstlisting}
|
||||||
|
The \texttt{cout} stream is used for output
|
||||||
|
as follows:
|
||||||
|
\begin{lstlisting}
|
||||||
|
int a = 123, b = 456;
|
||||||
|
string x = "monkey";
|
||||||
|
cout << a << " " << b << " " << x << "\n";
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
Input and output is sometimes
|
||||||
|
a bottleneck in the program.
|
||||||
|
The following lines at the beginning of the code
|
||||||
|
make input and output more efficient:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
ios::sync_with_stdio(0);
|
||||||
|
cin.tie(0);
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
Note that the newline \texttt{"\textbackslash n"}
|
||||||
|
works faster than \texttt{endl},
|
||||||
|
because \texttt{endl} always causes
|
||||||
|
a flush operation.
|
||||||
|
|
||||||
|
The C functions \texttt{scanf}
|
||||||
|
and \texttt{printf} are an alternative
|
||||||
|
to the C++ standard streams.
|
||||||
|
They are usually a bit faster,
|
||||||
|
but they are also more difficult to use.
|
||||||
|
The following code reads two integers from the input:
|
||||||
|
\begin{lstlisting}
|
||||||
|
int a, b;
|
||||||
|
scanf("%d %d", &a, &b);
|
||||||
|
\end{lstlisting}
|
||||||
|
The following code prints two integers:
|
||||||
|
\begin{lstlisting}
|
||||||
|
int a = 123, b = 456;
|
||||||
|
printf("%d %d\n", a, b);
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
Sometimes the program should read a whole line
|
||||||
|
from the input, possibly containing spaces.
|
||||||
|
This can be accomplished by using the
|
||||||
|
\texttt{getline} function:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
string s;
|
||||||
|
getline(cin, s);
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
If the amount of data is unknown, the following
|
||||||
|
loop is useful:
|
||||||
|
\begin{lstlisting}
|
||||||
|
while (cin >> x) {
|
||||||
|
// code
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
This loop reads elements from the input
|
||||||
|
one after another, until there is no
|
||||||
|
more data available in the input.
|
||||||
|
|
||||||
|
In some contest systems, files are used for
|
||||||
|
input and output.
|
||||||
|
An easy solution for this is to write
|
||||||
|
the code as usual using standard streams,
|
||||||
|
but add the following lines to the beginning of the code:
|
||||||
|
\begin{lstlisting}
|
||||||
|
freopen("input.txt", "r", stdin);
|
||||||
|
freopen("output.txt", "w", stdout);
|
||||||
|
\end{lstlisting}
|
||||||
|
After this, the program reads the input from the file
|
||||||
|
``input.txt'' and writes the output to the file
``output.txt''.
|
||||||
|
|
||||||
|
\section{Working with numbers}
|
||||||
|
|
||||||
|
\index{integer}
|
||||||
|
|
||||||
|
\subsubsection{Integers}
|
||||||
|
|
||||||
|
The most used integer type in competitive programming
|
||||||
|
is \texttt{int}, which is a 32-bit type with
|
||||||
|
a value range of $-2^{31} \ldots 2^{31}-1$
|
||||||
|
or about $-2 \cdot 10^9 \ldots 2 \cdot 10^9$.
|
||||||
|
If the type \texttt{int} is not enough,
|
||||||
|
the 64-bit type \texttt{long long} can be used.
|
||||||
|
It has a value range of $-2^{63} \ldots 2^{63}-1$
|
||||||
|
or about $-9 \cdot 10^{18} \ldots 9 \cdot 10^{18}$.
|
||||||
|
|
||||||
|
The following code defines a
|
||||||
|
\texttt{long long} variable:
|
||||||
|
\begin{lstlisting}
|
||||||
|
long long x = 123456789123456789LL;
|
||||||
|
\end{lstlisting}
|
||||||
|
The suffix \texttt{LL} means that the
|
||||||
|
type of the number is \texttt{long long}.
|
||||||
|
|
||||||
|
A common mistake when using the type \texttt{long long}
|
||||||
|
is that the type \texttt{int} is still used somewhere
|
||||||
|
in the code.
|
||||||
|
For example, the following code contains
|
||||||
|
a subtle error:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
int a = 123456789;
|
||||||
|
long long b = a*a;
|
||||||
|
cout << b << "\n"; // -1757895751
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
Even though the variable \texttt{b} is of type \texttt{long long},
|
||||||
|
both numbers in the expression \texttt{a*a}
|
||||||
|
are of type \texttt{int} and the result is
|
||||||
|
also of type \texttt{int}.
|
||||||
|
Because of this, the variable \texttt{b} will
|
||||||
|
contain a wrong result.
|
||||||
|
The problem can be solved by changing the type
|
||||||
|
of \texttt{a} to \texttt{long long} or
|
||||||
|
by changing the expression to \texttt{(long long)a*a}.
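
As a sketch, a corrected version of the above code could look as follows;
it prints the exact square of \texttt{a}:
\begin{lstlisting}
int a = 123456789;
long long b = (long long)a*a;
cout << b << "\n"; // 15241578750190521
\end{lstlisting}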
|
||||||
|
|
||||||
|
Usually contest problems are set so that the
|
||||||
|
type \texttt{long long} is enough.
|
||||||
|
Still, it is good to know that
|
||||||
|
the \texttt{g++} compiler also provides
|
||||||
|
a 128-bit type \texttt{\_\_int128\_t}
|
||||||
|
with a value range of
|
||||||
|
$-2^{127} \ldots 2^{127}-1$ or about $-10^{38} \ldots 10^{38}$.
|
||||||
|
However, this type is not available in all contest systems.
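
For example, a possible sketch of using this type with \texttt{g++} is
shown below. Note that the standard streams cannot print an
\texttt{\_\_int128\_t} value directly, so the result here is reduced
modulo $10^9+7$ before printing:
\begin{lstlisting}
long long a = 123456789123456789LL;
long long b = 987654321987654321LL;
__int128_t c = (__int128_t)a*b; // the product needs 128 bits
cout << (long long)(c%1000000007) << "\n";
\end{lstlisting}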
|
||||||
|
|
||||||
|
\subsubsection{Modular arithmetic}
|
||||||
|
|
||||||
|
\index{remainder}
|
||||||
|
\index{modular arithmetic}
|
||||||
|
|
||||||
|
We denote by $x \bmod m$ the remainder
|
||||||
|
when $x$ is divided by $m$.
|
||||||
|
For example, $17 \bmod 5 = 2$,
|
||||||
|
because $17 = 3 \cdot 5 + 2$.
|
||||||
|
|
||||||
|
Sometimes, the answer to a problem is a
|
||||||
|
very large number but it is enough to
|
||||||
|
output it ``modulo $m$'', i.e.,
the remainder when the answer is divided by $m$
(for example, ``modulo $10^9+7$'').
|
||||||
|
The idea is that even if the actual answer
|
||||||
|
is very large,
|
||||||
|
it suffices to use the types
|
||||||
|
\texttt{int} and \texttt{long long}.
|
||||||
|
|
||||||
|
An important property of the remainder is that
|
||||||
|
in addition, subtraction and multiplication,
|
||||||
|
the remainder can be taken before the operation:
|
||||||
|
|
||||||
|
\[
|
||||||
|
\begin{array}{rcr}
|
||||||
|
(a+b) \bmod m & = & (a \bmod m + b \bmod m) \bmod m \\
|
||||||
|
(a-b) \bmod m & = & (a \bmod m - b \bmod m) \bmod m \\
|
||||||
|
(a \cdot b) \bmod m & = & (a \bmod m \cdot b \bmod m) \bmod m
|
||||||
|
\end{array}
|
||||||
|
\]
|
||||||
|
|
||||||
|
Thus, we can take the remainder after every operation
|
||||||
|
and the numbers will never become too large.
|
||||||
|
|
||||||
|
For example, the following code calculates $n!$,
|
||||||
|
the factorial of $n$, modulo $m$:
|
||||||
|
\begin{lstlisting}
|
||||||
|
long long x = 1;
|
||||||
|
for (int i = 2; i <= n; i++) {
|
||||||
|
x = (x*i)%m;
|
||||||
|
}
|
||||||
|
cout << x%m << "\n";
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
Usually we want the remainder to always
|
||||||
|
be between $0\ldots m-1$.
|
||||||
|
However, in C++ and other languages,
|
||||||
|
the remainder of a negative number
|
||||||
|
is either zero or negative.
|
||||||
|
An easy way to make sure there
|
||||||
|
are no negative remainders is to first calculate
|
||||||
|
the remainder as usual and then add $m$
|
||||||
|
if the result is negative:
|
||||||
|
\begin{lstlisting}
|
||||||
|
x = x%m;
|
||||||
|
if (x < 0) x += m;
|
||||||
|
\end{lstlisting}
|
||||||
|
However, this is only needed when there
|
||||||
|
are subtractions in the code and the
|
||||||
|
remainder may become negative.
|
||||||
|
|
||||||
|
\subsubsection{Floating point numbers}
|
||||||
|
|
||||||
|
\index{floating point number}
|
||||||
|
|
||||||
|
The usual floating point types in
|
||||||
|
competitive programming are
|
||||||
|
the 64-bit \texttt{double}
|
||||||
|
and, as an extension in the \texttt{g++} compiler,
|
||||||
|
the 80-bit \texttt{long double}.
|
||||||
|
In most cases, \texttt{double} is enough,
|
||||||
|
but \texttt{long double} is more accurate.
|
||||||
|
|
||||||
|
The required precision of the answer
|
||||||
|
is usually given in the problem statement.
|
||||||
|
An easy way to output the answer is to use
|
||||||
|
the \texttt{printf} function
|
||||||
|
and give the number of decimal places
|
||||||
|
in the formatting string.
|
||||||
|
For example, the following code prints
|
||||||
|
the value of $x$ with 9 decimal places:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
printf("%.9f\n", x);
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
A difficulty when using floating point numbers
|
||||||
|
is that some numbers cannot be represented
|
||||||
|
accurately as floating point numbers,
|
||||||
|
and there will be rounding errors.
|
||||||
|
For example, the result of the following code
|
||||||
|
is surprising:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
double x = 0.3*3+0.1;
|
||||||
|
printf("%.20f\n", x); // 0.99999999999999988898
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
Due to a rounding error,
|
||||||
|
the value of \texttt{x} is a bit smaller than 1,
|
||||||
|
while the correct value would be 1.
|
||||||
|
|
||||||
|
It is risky to compare floating point numbers
|
||||||
|
with the \texttt{==} operator,
|
||||||
|
because it is possible that the values should be
|
||||||
|
equal but they are not because of precision errors.
|
||||||
|
A better way to compare floating point numbers
|
||||||
|
is to assume that two numbers are equal
|
||||||
|
if the difference between them is less than $\varepsilon$,
|
||||||
|
where $\varepsilon$ is a small number.
|
||||||
|
|
||||||
|
In practice, the numbers can be compared
|
||||||
|
as follows ($\varepsilon=10^{-9}$):
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
if (abs(a-b) < 1e-9) {
|
||||||
|
// a and b are equal
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
Note that while floating point numbers are inaccurate,
|
||||||
|
integers up to a certain limit can still be
|
||||||
|
represented accurately.
|
||||||
|
For example, using \texttt{double},
|
||||||
|
it is possible to accurately represent all
|
||||||
|
integers whose absolute value is at most $2^{53}$.
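
The following sketch illustrates the limit: $2^{53}$ has an exact
representation as a \texttt{double}, but $2^{53}+1$ is rounded to the
nearest representable value, which is again $2^{53}$:
\begin{lstlisting}
double x = (double)(1LL<<53);
double y = (double)((1LL<<53)+1);
printf("%.0f %.0f\n", x, y); // 9007199254740992 9007199254740992
\end{lstlisting}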
|
||||||
|
|
||||||
|
\section{Shortening code}
|
||||||
|
|
||||||
|
Short code is ideal in competitive programming,
|
||||||
|
because programs should be written
|
||||||
|
as fast as possible.
|
||||||
|
Because of this, competitive programmers often define
|
||||||
|
shorter names for datatypes and other parts of code.
|
||||||
|
|
||||||
|
\subsubsection{Type names}
|
||||||
|
\index{typedef@\texttt{typedef}}
|
||||||
|
Using the command \texttt{typedef}
|
||||||
|
it is possible to give a shorter name
|
||||||
|
to a datatype.
|
||||||
|
For example, the name \texttt{long long} is long,
|
||||||
|
so we can define a shorter name \texttt{ll}:
|
||||||
|
\begin{lstlisting}
|
||||||
|
typedef long long ll;
|
||||||
|
\end{lstlisting}
|
||||||
|
After this, the code
|
||||||
|
\begin{lstlisting}
|
||||||
|
long long a = 123456789;
|
||||||
|
long long b = 987654321;
|
||||||
|
cout << a*b << "\n";
|
||||||
|
\end{lstlisting}
|
||||||
|
can be shortened as follows:
|
||||||
|
\begin{lstlisting}
|
||||||
|
ll a = 123456789;
|
||||||
|
ll b = 987654321;
|
||||||
|
cout << a*b << "\n";
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
The command \texttt{typedef}
|
||||||
|
can also be used with more complex types.
|
||||||
|
For example, the following code gives
|
||||||
|
the name \texttt{vi} for a vector of integers
|
||||||
|
and the name \texttt{pi} for a pair
|
||||||
|
that contains two integers.
|
||||||
|
\begin{lstlisting}
|
||||||
|
typedef vector<int> vi;
|
||||||
|
typedef pair<int,int> pi;
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
\subsubsection{Macros}
|
||||||
|
\index{macro}
|
||||||
|
Another way to shorten code is to define
|
||||||
|
\key{macros}.
|
||||||
|
A macro means that certain strings in
|
||||||
|
the code will be changed before the compilation.
|
||||||
|
In C++, macros are defined using the
|
||||||
|
\texttt{\#define} keyword.
|
||||||
|
|
||||||
|
For example, we can define the following macros:
|
||||||
|
\begin{lstlisting}
|
||||||
|
#define F first
|
||||||
|
#define S second
|
||||||
|
#define PB push_back
|
||||||
|
#define MP make_pair
|
||||||
|
\end{lstlisting}
|
||||||
|
After this, the code
|
||||||
|
\begin{lstlisting}
|
||||||
|
v.push_back(make_pair(y1,x1));
|
||||||
|
v.push_back(make_pair(y2,x2));
|
||||||
|
int d = v[i].first+v[i].second;
|
||||||
|
\end{lstlisting}
|
||||||
|
can be shortened as follows:
|
||||||
|
\begin{lstlisting}
|
||||||
|
v.PB(MP(y1,x1));
|
||||||
|
v.PB(MP(y2,x2));
|
||||||
|
int d = v[i].F+v[i].S;
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
A macro can also have parameters
|
||||||
|
which makes it possible to shorten loops and other
|
||||||
|
structures.
|
||||||
|
For example, we can define the following macro:
|
||||||
|
\begin{lstlisting}
|
||||||
|
#define REP(i,a,b) for (int i = a; i <= b; i++)
|
||||||
|
\end{lstlisting}
|
||||||
|
After this, the code
|
||||||
|
\begin{lstlisting}
|
||||||
|
for (int i = 1; i <= n; i++) {
|
||||||
|
search(i);
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
can be shortened as follows:
|
||||||
|
\begin{lstlisting}
|
||||||
|
REP(i,1,n) {
|
||||||
|
search(i);
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
Sometimes macros cause bugs that may be difficult
|
||||||
|
to detect. For example, consider the following macro
|
||||||
|
that calculates the square of a number:
|
||||||
|
\begin{lstlisting}
|
||||||
|
#define SQ(a) a*a
|
||||||
|
\end{lstlisting}
|
||||||
|
This macro \emph{does not} always work as expected.
|
||||||
|
For example, the code
|
||||||
|
\begin{lstlisting}
|
||||||
|
cout << SQ(3+3) << "\n";
|
||||||
|
\end{lstlisting}
|
||||||
|
corresponds to the code
|
||||||
|
\begin{lstlisting}
|
||||||
|
cout << 3+3*3+3 << "\n"; // 15
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
A better version of the macro is as follows:
|
||||||
|
\begin{lstlisting}
|
||||||
|
#define SQ(a) (a)*(a)
|
||||||
|
\end{lstlisting}
|
||||||
|
Now the code
|
||||||
|
\begin{lstlisting}
|
||||||
|
cout << SQ(3+3) << "\n";
|
||||||
|
\end{lstlisting}
|
||||||
|
corresponds to the code
|
||||||
|
\begin{lstlisting}
|
||||||
|
cout << (3+3)*(3+3) << "\n"; // 36
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
|
||||||
|
\section{Mathematics}
|
||||||
|
|
||||||
|
Mathematics plays an important role in competitive
|
||||||
|
programming, and it is not possible to become
|
||||||
|
a successful competitive programmer without
|
||||||
|
having good mathematical skills.
|
||||||
|
This section discusses some important
|
||||||
|
mathematical concepts and formulas that
|
||||||
|
are needed later in the book.
|
||||||
|
|
||||||
|
\subsubsection{Sum formulas}
|
||||||
|
|
||||||
|
Each sum of the form
|
||||||
|
\[\sum_{x=1}^n x^k = 1^k+2^k+3^k+\ldots+n^k,\]
|
||||||
|
where $k$ is a positive integer,
|
||||||
|
has a closed-form formula that is a
|
||||||
|
polynomial of degree $k+1$.
|
||||||
|
For example\footnote{\index{Faulhaber's formula}
|
||||||
|
There is even a general formula for such sums, called \key{Faulhaber's formula},
|
||||||
|
but it is too complex to be presented here.},
|
||||||
|
\[\sum_{x=1}^n x = 1+2+3+\ldots+n = \frac{n(n+1)}{2}\]
|
||||||
|
and
|
||||||
|
\[\sum_{x=1}^n x^2 = 1^2+2^2+3^2+\ldots+n^2 = \frac{n(n+1)(2n+1)}{6}.\]
|
||||||
|
|
||||||
|
An \key{arithmetic progression} is a \index{arithmetic progression}
|
||||||
|
sequence of numbers
|
||||||
|
where the difference between any two consecutive
|
||||||
|
numbers is constant.
|
||||||
|
For example,
|
||||||
|
\[3, 7, 11, 15\]
|
||||||
|
is an arithmetic progression with constant 4.
|
||||||
|
The sum of an arithmetic progression can be calculated
|
||||||
|
using the formula
|
||||||
|
\[\underbrace{a + \cdots + b}_{n \,\, \textrm{numbers}} = \frac{n(a+b)}{2}\]
|
||||||
|
where $a$ is the first number,
|
||||||
|
$b$ is the last number and
|
||||||
|
$n$ is the amount of numbers.
|
||||||
|
For example,
|
||||||
|
\[3+7+11+15=\frac{4 \cdot (3+15)}{2} = 36.\]
|
||||||
|
The formula is based on the fact
|
||||||
|
that the sum consists of $n$ numbers and
|
||||||
|
the value of each number is $(a+b)/2$ on average.
|
||||||
|
|
||||||
|
\index{geometric progression}
|
||||||
|
A \key{geometric progression} is a sequence
|
||||||
|
of numbers
|
||||||
|
where the ratio between any two consecutive
|
||||||
|
numbers is constant.
|
||||||
|
For example,
|
||||||
|
\[3,6,12,24\]
|
||||||
|
is a geometric progression with constant 2.
|
||||||
|
The sum of a geometric progression can be calculated
|
||||||
|
using the formula
|
||||||
|
\[a + ak + ak^2 + \cdots + b = \frac{bk-a}{k-1}\]
|
||||||
|
where $a$ is the first number,
|
||||||
|
$b$ is the last number and the
|
||||||
|
ratio between consecutive numbers is $k$.
|
||||||
|
For example,
|
||||||
|
\[3+6+12+24=\frac{24 \cdot 2 - 3}{2-1} = 45.\]
|
||||||
|
|
||||||
|
This formula can be derived as follows. Let
|
||||||
|
\[ S = a + ak + ak^2 + \cdots + b .\]
|
||||||
|
By multiplying both sides by $k$, we get
|
||||||
|
\[ kS = ak + ak^2 + ak^3 + \cdots + bk,\]
|
||||||
|
and solving the equation
|
||||||
|
\[ kS-S = bk-a\]
|
||||||
|
yields the formula.
|
||||||
|
|
||||||
|
A special case of a sum of a geometric progression is the formula
|
||||||
|
\[1+2+4+8+\ldots+2^{n-1}=2^n-1.\]
|
||||||
|
|
||||||
|
\index{harmonic sum}
|
||||||
|
|
||||||
|
A \key{harmonic sum} is a sum of the form
|
||||||
|
\[ \sum_{x=1}^n \frac{1}{x} = 1+\frac{1}{2}+\frac{1}{3}+\ldots+\frac{1}{n}.\]
|
||||||
|
|
||||||
|
An upper bound for a harmonic sum is $\log_2(n)+1$.
|
||||||
|
Namely, we can
|
||||||
|
modify each term $1/k$ so that $k$ becomes
|
||||||
|
the nearest power of two that does not exceed $k$.
|
||||||
|
For example, when $n=6$, we can estimate
|
||||||
|
the sum as follows:
|
||||||
|
\[ 1+\frac{1}{2}+\frac{1}{3}+\frac{1}{4}+\frac{1}{5}+\frac{1}{6} \le
|
||||||
|
1+\frac{1}{2}+\frac{1}{2}+\frac{1}{4}+\frac{1}{4}+\frac{1}{4}.\]
|
||||||
|
This upper bound consists of $\log_2(n)+1$ parts
|
||||||
|
($1$, $2 \cdot 1/2$, $4 \cdot 1/4$, etc.),
|
||||||
|
and the value of each part is at most 1.
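
Harmonic sums often appear when analyzing algorithms.
For example, in the following sketch the inner loop runs about $n/k$ times
for each value of $k$, so the total number of iterations is about
$n(1+\frac{1}{2}+\frac{1}{3}+\ldots+\frac{1}{n})$,
which by the above bound is at most about $n(\log_2(n)+1)$:
\begin{lstlisting}
for (int k = 1; k <= n; k++) {
    for (int x = k; x <= n; x += k) {
        // code
    }
}
\end{lstlisting}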
|
||||||
|
|
||||||
|
\subsubsection{Set theory}
|
||||||
|
|
||||||
|
\index{set theory}
|
||||||
|
\index{set}
|
||||||
|
\index{intersection}
|
||||||
|
\index{union}
|
||||||
|
\index{difference}
|
||||||
|
\index{subset}
|
||||||
|
\index{universal set}
|
||||||
|
\index{complement}
|
||||||
|
|
||||||
|
A \key{set} is a collection of elements.
|
||||||
|
For example, the set
|
||||||
|
\[X=\{2,4,7\}\]
|
||||||
|
contains elements 2, 4 and 7.
|
||||||
|
The symbol $\emptyset$ denotes an empty set,
|
||||||
|
and $|S|$ denotes the size of a set $S$,
|
||||||
|
i.e., the number of elements in the set.
|
||||||
|
For example, in the above set, $|X|=3$.
|
||||||
|
|
||||||
|
If a set $S$ contains an element $x$,
|
||||||
|
we write $x \in S$,
|
||||||
|
and otherwise we write $x \notin S$.
|
||||||
|
For example, in the above set
|
||||||
|
\[4 \in X \hspace{10px}\textrm{and}\hspace{10px} 5 \notin X.\]
|
||||||
|
|
||||||
|
\begin{samepage}
|
||||||
|
New sets can be constructed using set operations:
|
||||||
|
\begin{itemize}
|
||||||
|
\item The \key{intersection} $A \cap B$ consists of elements
|
||||||
|
that are in both $A$ and $B$.
|
||||||
|
For example, if $A=\{1,2,5\}$ and $B=\{2,4\}$,
|
||||||
|
then $A \cap B = \{2\}$.
|
||||||
|
\item The \key{union} $A \cup B$ consists of elements
|
||||||
|
that are in $A$ or $B$ or both.
|
||||||
|
For example, if $A=\{3,7\}$ and $B=\{2,3,8\}$,
|
||||||
|
then $A \cup B = \{2,3,7,8\}$.
|
||||||
|
\item The \key{complement} $\bar A$ consists of elements
|
||||||
|
that are not in $A$.
|
||||||
|
The interpretation of a complement depends on
|
||||||
|
the \key{universal set}, which contains all possible elements.
|
||||||
|
For example, if $A=\{1,2,5,7\}$ and the universal set is
|
||||||
|
$\{1,2,\ldots,10\}$, then $\bar A = \{3,4,6,8,9,10\}$.
|
||||||
|
\item The \key{difference} $A \setminus B = A \cap \bar B$
|
||||||
|
consists of elements that are in $A$ but not in $B$.
|
||||||
|
Note that $B$ can contain elements that are not in $A$.
|
||||||
|
For example, if $A=\{2,3,7,8\}$ and $B=\{3,5,8\}$,
|
||||||
|
then $A \setminus B = \{2,7\}$.
|
||||||
|
\end{itemize}
|
||||||
|
\end{samepage}
|
||||||
|
|
||||||
|
If each element of $A$ also belongs to $S$,
|
||||||
|
we say that $A$ is a \key{subset} of $S$,
|
||||||
|
denoted by $A \subset S$.
|
||||||
|
A set $S$ always has $2^{|S|}$ subsets,
|
||||||
|
including the empty set.
|
||||||
|
For example, the subsets of the set $\{2,4,7\}$ are
|
||||||
|
\begin{center}
|
||||||
|
$\emptyset$,
|
||||||
|
$\{2\}$, $\{4\}$, $\{7\}$, $\{2,4\}$, $\{2,7\}$, $\{4,7\}$ and $\{2,4,7\}$.
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Some often used sets are
|
||||||
|
$\mathbb{N}$ (natural numbers),
|
||||||
|
$\mathbb{Z}$ (integers),
|
||||||
|
$\mathbb{Q}$ (rational numbers) and
|
||||||
|
$\mathbb{R}$ (real numbers).
|
||||||
|
The set $\mathbb{N}$
|
||||||
|
can be defined in two ways, depending
|
||||||
|
on the situation:
|
||||||
|
either $\mathbb{N}=\{0,1,2,\ldots\}$
|
||||||
|
or $\mathbb{N}=\{1,2,3,\ldots\}$.
|
||||||
|
|
||||||
|
We can also construct a set using a rule of the form
|
||||||
|
\[\{f(n) : n \in S\},\]
|
||||||
|
where $f(n)$ is some function.
|
||||||
|
This set contains all elements of the form $f(n)$,
|
||||||
|
where $n$ is an element in $S$.
|
||||||
|
For example, the set
|
||||||
|
\[X=\{2n : n \in \mathbb{Z}\}\]
|
||||||
|
contains all even integers.
|
||||||
|
|
||||||
|
\subsubsection{Logic}
|
||||||
|
|
||||||
|
\index{logic}
|
||||||
|
\index{negation}
|
||||||
|
\index{conjunction}
|
||||||
|
\index{disjunction}
|
||||||
|
\index{implication}
|
||||||
|
\index{equivalence}
|
||||||
|
|
||||||
|
The value of a logical expression is either
|
||||||
|
\key{true} (1) or \key{false} (0).
|
||||||
|
The most important logical operators are
|
||||||
|
$\lnot$ (\key{negation}),
|
||||||
|
$\land$ (\key{conjunction}),
|
||||||
|
$\lor$ (\key{disjunction}),
|
||||||
|
$\Rightarrow$ (\key{implication}) and
|
||||||
|
$\Leftrightarrow$ (\key{equivalence}).
|
||||||
|
The following table shows the meanings of these operators:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tabular}{rr|rrrrrrr}
|
||||||
|
$A$ & $B$ & $\lnot A$ & $\lnot B$ & $A \land B$ & $A \lor B$ & $A \Rightarrow B$ & $A \Leftrightarrow B$ \\
|
||||||
|
\hline
|
||||||
|
0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 \\
|
||||||
|
0 & 1 & 1 & 0 & 0 & 1 & 1 & 0 \\
|
||||||
|
1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\
|
||||||
|
1 & 1 & 0 & 0 & 1 & 1 & 1 & 1 \\
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
The expression $\lnot A$ has the opposite value of $A$.
|
||||||
|
The expression $A \land B$ is true if both $A$ and $B$
|
||||||
|
are true,
|
||||||
|
and the expression $A \lor B$ is true if $A$ or $B$ or both
|
||||||
|
are true.
|
||||||
|
The expression $A \Rightarrow B$ is true
|
||||||
|
if whenever $A$ is true, also $B$ is true.
|
||||||
|
The expression $A \Leftrightarrow B$ is true
|
||||||
|
if $A$ and $B$ are both true or both false.
|
||||||
|
|
||||||
|
\index{predicate}
|
||||||
|
|
||||||
|
A \key{predicate} is an expression that is true or false
|
||||||
|
depending on its parameters.
|
||||||
|
Predicates are usually denoted by capital letters.
|
||||||
|
For example, we can define a predicate $P(x)$
|
||||||
|
that is true exactly when $x$ is a prime number.
|
||||||
|
Using this definition, $P(7)$ is true but $P(8)$ is false.
|
||||||
|
|
||||||
|
\index{quantifier}
|
||||||
|
|
||||||
|
A \key{quantifier} connects a logical expression
|
||||||
|
to the elements of a set.
|
||||||
|
The most important quantifiers are
|
||||||
|
$\forall$ (\key{for all}) and $\exists$ (\key{there is}).
|
||||||
|
For example,
|
||||||
|
\[\forall x (\exists y (y < x))\]
|
||||||
|
means that for each element $x$ in the set,
|
||||||
|
there is an element $y$ in the set
|
||||||
|
such that $y$ is smaller than $x$.
|
||||||
|
This is true in the set of integers,
|
||||||
|
but false in the set of natural numbers.
|
||||||
|
|
||||||
|
Using the notation described above,
|
||||||
|
we can express many kinds of logical propositions.
|
||||||
|
For example,
|
||||||
|
\[\forall x ((x>1 \land \lnot P(x)) \Rightarrow (\exists a (\exists b (a > 1 \land b > 1 \land x = ab))))\]
|
||||||
|
means that if a number $x$ is larger than 1
|
||||||
|
and not a prime number,
|
||||||
|
then there are numbers $a$ and $b$
|
||||||
|
that are larger than $1$ and whose product is $x$.
|
||||||
|
This proposition is true in the set of integers.
|
||||||
|
|
||||||
|
\subsubsection{Functions}
|
||||||
|
|
||||||
|
The function $\lfloor x \rfloor$ rounds the number $x$
|
||||||
|
down to an integer, and the function
|
||||||
|
$\lceil x \rceil$ rounds the number $x$
|
||||||
|
up to an integer. For example,
|
||||||
|
\[ \lfloor 3/2 \rfloor = 1 \hspace{10px} \textrm{and} \hspace{10px} \lceil 3/2 \rceil = 2.\]
|
||||||
|
|
||||||
|
The functions $\min(x_1,x_2,\ldots,x_n)$
|
||||||
|
and $\max(x_1,x_2,\ldots,x_n)$
|
||||||
|
give the smallest and largest of values
|
||||||
|
$x_1,x_2,\ldots,x_n$.
|
||||||
|
For example,
|
||||||
|
\[ \min(1,2,3)=1 \hspace{10px} \textrm{and} \hspace{10px} \max(1,2,3)=3.\]
|
||||||
|
|
||||||
|
\index{factorial}
|
||||||
|
|
||||||
|
The \key{factorial} $n!$ can be defined as
|
||||||
|
\[\prod_{x=1}^n x = 1 \cdot 2 \cdot 3 \cdot \ldots \cdot n\]
|
||||||
|
or recursively
|
||||||
|
\[
|
||||||
|
\begin{array}{lcl}
|
||||||
|
0! & = & 1 \\
|
||||||
|
n! & = & n \cdot (n-1)! \\
|
||||||
|
\end{array}
|
||||||
|
\]
|
||||||
|
|
||||||
|
\index{Fibonacci number}
|
||||||
|
|
||||||
|
The \key{Fibonacci numbers}
|
||||||
|
%\footnote{Fibonacci (c. 1175--1250) was an Italian mathematician.}
|
||||||
|
arise in many situations.
|
||||||
|
They can be defined recursively as follows:
|
||||||
|
\[
|
||||||
|
\begin{array}{lcl}
|
||||||
|
f(0) & = & 0 \\
|
||||||
|
f(1) & = & 1 \\
|
||||||
|
f(n) & = & f(n-1)+f(n-2) \\
|
||||||
|
\end{array}
|
||||||
|
\]
|
||||||
|
The first Fibonacci numbers are
|
||||||
|
\[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, \ldots\]
|
||||||
|
There is also a closed-form formula
|
||||||
|
for calculating Fibonacci numbers, which is sometimes called
|
||||||
|
\index{Binet's formula} \key{Binet's formula}:
|
||||||
|
\[f(n)=\frac{(1 + \sqrt{5})^n - (1-\sqrt{5})^n}{2^n \sqrt{5}}.\]
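
For example, for $n=3$ the formula gives
\[f(3)=\frac{(1+\sqrt{5})^3-(1-\sqrt{5})^3}{2^3 \sqrt{5}}
      =\frac{(16+8\sqrt{5})-(16-8\sqrt{5})}{8\sqrt{5}}
      =\frac{16\sqrt{5}}{8\sqrt{5}}=2,\]
which agrees with the recursive definition.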
|
||||||
|
|
||||||
|
\subsubsection{Logarithms}
|
||||||
|
|
||||||
|
\index{logarithm}
|
||||||
|
|
||||||
|
The \key{logarithm} of a number $x$
|
||||||
|
is denoted $\log_k(x)$, where $k$ is the base
|
||||||
|
of the logarithm.
|
||||||
|
According to the definition,
|
||||||
|
$\log_k(x)=a$ exactly when $k^a=x$.
|
||||||
|
|
||||||
|
A useful property of logarithms is
|
||||||
|
that $\log_k(x)$ equals the number of times
|
||||||
|
we have to divide $x$ by $k$ before we reach
|
||||||
|
the number 1.
|
||||||
|
For example, $\log_2(32)=5$
|
||||||
|
because 5 divisions by 2 are needed:
|
||||||
|
|
||||||
|
\[32 \rightarrow 16 \rightarrow 8 \rightarrow 4 \rightarrow 2 \rightarrow 1 \]
|
||||||
|
|
||||||
|
Logarithms are often used in the analysis of
|
||||||
|
algorithms, because many efficient algorithms
|
||||||
|
halve something at each step.
|
||||||
|
Hence, we can estimate the efficiency of such algorithms
|
||||||
|
using logarithms.
|
||||||
|
|
||||||
|
The logarithm of a product is
|
||||||
|
\[\log_k(ab) = \log_k(a)+\log_k(b),\]
|
||||||
|
and consequently,
|
||||||
|
\[\log_k(x^n) = n \cdot \log_k(x).\]
|
||||||
|
In addition, the logarithm of a quotient is
|
||||||
|
\[\log_k\Big(\frac{a}{b}\Big) = \log_k(a)-\log_k(b).\]
|
||||||
|
Another useful formula is
|
||||||
|
\[\log_u(x) = \frac{\log_k(x)}{\log_k(u)},\]
|
||||||
|
and using this, it is possible to calculate
|
||||||
|
logarithms to any base if there is a way to
|
||||||
|
calculate logarithms to some fixed base.
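
As a sketch in C++, the change of base formula can be used together with
the standard \texttt{log2} function to compute a logarithm to an
arbitrary base $u$ (the helper name \texttt{logu} is only illustrative):
\begin{lstlisting}
double logu(double x, double u) {
    // log_u(x) = log_2(x)/log_2(u)
    return log2(x)/log2(u);
}
\end{lstlisting}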
|
||||||
|
|
||||||
|
\index{natural logarithm}
|
||||||
|
|
||||||
|
The \key{natural logarithm} $\ln(x)$ of a number $x$
|
||||||
|
is a logarithm whose base is $e \approx 2.71828$.
|
||||||
|
Another property of logarithms is that
|
||||||
|
the number of digits of an integer $x$ in base $b$ is
|
||||||
|
$\lfloor \log_b(x)+1 \rfloor$.
|
||||||
|
For example, the representation of
|
||||||
|
$123$ in base $2$ is 1111011 and
|
||||||
|
$\lfloor \log_2(123)+1 \rfloor = 7$.
|
||||||
|
|
||||||
|
\section{Contests and resources}
|
||||||
|
|
||||||
|
\subsubsection{IOI}
|
||||||
|
|
||||||
|
The International Olympiad in Informatics (IOI)
|
||||||
|
is an annual programming contest for
|
||||||
|
secondary school students.
|
||||||
|
Each country is allowed to send a team of
|
||||||
|
four students to the contest.
|
||||||
|
There are usually about 300 participants
|
||||||
|
from 80 countries.
|
||||||
|
|
||||||
|
The IOI consists of two five-hour long contests.
|
||||||
|
In both contests, the participants are asked to
|
||||||
|
solve three algorithm tasks of various difficulty.
|
||||||
|
The tasks are divided into subtasks,
|
||||||
|
each of which has an assigned score.
|
||||||
|
Even if the contestants are divided into teams,
|
||||||
|
they compete as individuals.
|
||||||
|
|
||||||
|
The IOI syllabus \cite{iois} regulates the topics
|
||||||
|
that may appear in IOI tasks.
|
||||||
|
Almost all the topics in the IOI syllabus
|
||||||
|
are covered by this book.
|
||||||
|
|
||||||
|
Participants for the IOI are selected through
|
||||||
|
national contests.
|
||||||
|
Before the IOI, many regional contests are organized,
|
||||||
|
such as the Baltic Olympiad in Informatics (BOI),
|
||||||
|
the Central European Olympiad in Informatics (CEOI)
|
||||||
|
and the Asia-Pacific Informatics Olympiad (APIO).
|
||||||
|
|
||||||
|
Some countries organize online practice contests
|
||||||
|
for future IOI participants,
|
||||||
|
such as the Croatian Open Competition in Informatics \cite{coci}
|
||||||
|
and the USA Computing Olympiad \cite{usaco}.
|
||||||
|
In addition, a large collection of problems from Polish contests
|
||||||
|
is available online \cite{main}.
|
||||||
|
|
||||||
|
\subsubsection{ICPC}
|
||||||
|
|
||||||
|
The International Collegiate Programming Contest (ICPC)
|
||||||
|
is an annual programming contest for university students.
|
||||||
|
Each team in the contest consists of three students,
|
||||||
|
and unlike in the IOI, the students work together;
|
||||||
|
there is only one computer available for each team.
|
||||||
|
|
||||||
|
The ICPC consists of several stages, and finally the
|
||||||
|
best teams are invited to the World Finals.
|
||||||
|
While there are tens of thousands of participants
|
||||||
|
in the contest, there are only a small number\footnote{The exact number of final
|
||||||
|
slots varies from year to year; in 2017, there were 133 final slots.} of final slots available,
|
||||||
|
so even advancing to the finals
|
||||||
|
is a great achievement in some regions.
|
||||||
|
|
||||||
|
In each ICPC contest, the teams have five hours of time to
|
||||||
|
solve about ten algorithm problems.
|
||||||
|
A solution to a problem is accepted only if it solves
|
||||||
|
all test cases efficiently.
|
||||||
|
During the contest, competitors may view the results of other teams,
|
||||||
|
but for the last hour the scoreboard is frozen and it
|
||||||
|
is not possible to see the results of the last submissions.
|
||||||
|
|
||||||
|
The topics that may appear at the ICPC are not so well
|
||||||
|
specified as those at the IOI.
|
||||||
|
In any case, it is clear that more knowledge is needed
|
||||||
|
at the ICPC, especially more mathematical skills.
|
||||||
|
|
||||||
|
\subsubsection{Online contests}
|
||||||
|
|
||||||
|
There are also many online contests that are open for everybody.
|
||||||
|
At the moment, the most active contest site is Codeforces,
|
||||||
|
which organizes contests about weekly.
|
||||||
|
In Codeforces, participants are divided into two divisions:
|
||||||
|
beginners compete in Div2 and more experienced programmers in Div1.
|
||||||
|
Other contest sites include AtCoder, CS Academy, HackerRank and Topcoder.
|
||||||
|
|
||||||
|
Some companies organize online contests with onsite finals.
|
||||||
|
Examples of such contests are Facebook Hacker Cup,
|
||||||
|
Google Code Jam and Yandex.Algorithm.
|
||||||
|
Of course, companies also use those contests for recruiting:
|
||||||
|
performing well in a contest is a good way to prove one's skills.
|
||||||
|
|
||||||
|
\subsubsection{Books}
|
||||||
|
|
||||||
|
There are already some books (besides this book) that
|
||||||
|
focus on competitive programming and algorithmic problem solving:
|
||||||
|
|
||||||
|
\begin{itemize}
|
||||||
|
\item S. S. Skiena and M. A. Revilla:
|
||||||
|
\emph{Programming Challenges: The Programming Contest Training Manual} \cite{ski03}
|
||||||
|
\item S. Halim and F. Halim:
|
||||||
|
\emph{Competitive Programming 3: The New Lower Bound of Programming Contests} \cite{hal13}
|
||||||
|
\item K. Diks et al.: \emph{Looking for a Challenge? The Ultimate Problem Set from
|
||||||
|
the University of Warsaw Programming Competitions} \cite{dik12}
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
The first two books are intended for beginners,
|
||||||
|
whereas the last book contains advanced material.
|
||||||
|
|
||||||
|
Of course, general algorithm books are also suitable for
|
||||||
|
competitive programmers.
|
||||||
|
Some popular books are:
|
||||||
|
|
||||||
|
\begin{itemize}
|
||||||
|
\item T. H. Cormen, C. E. Leiserson, R. L. Rivest and C. Stein:
|
||||||
|
\emph{Introduction to Algorithms} \cite{cor09}
|
||||||
|
\item J. Kleinberg and É. Tardos:
|
||||||
|
\emph{Algorithm Design} \cite{kle05}
|
||||||
|
\item S. S. Skiena:
|
||||||
|
\emph{The Algorithm Design Manual} \cite{ski08}
|
||||||
|
\end{itemize}
|
|
@@ -0,0 +1,538 @@
|
||||||
|
\chapter{Time complexity}
|
||||||
|
|
||||||
|
\index{time complexity}
|
||||||
|
|
||||||
|
The efficiency of algorithms is important in competitive programming.
|
||||||
|
Usually, it is easy to design an algorithm
|
||||||
|
that solves the problem slowly,
|
||||||
|
but the real challenge is to invent a
|
||||||
|
fast algorithm.
|
||||||
|
If the algorithm is too slow, it will get only
|
||||||
|
partial points or no points at all.
|
||||||
|
|
||||||
|
The \key{time complexity} of an algorithm
|
||||||
|
estimates how much time the algorithm will use
|
||||||
|
for some input.
|
||||||
|
The idea is to represent the efficiency
|
||||||
|
as a function whose parameter is the size of the input.
|
||||||
|
By calculating the time complexity,
|
||||||
|
we can find out whether the algorithm is fast enough
|
||||||
|
without implementing it.
|
||||||
|
|
||||||
|
\section{Calculation rules}
|
||||||
|
|
||||||
|
The time complexity of an algorithm
|
||||||
|
is denoted $O(\cdots)$
|
||||||
|
where the three dots represent some
|
||||||
|
function.
|
||||||
|
Usually, the variable $n$ denotes
|
||||||
|
the input size.
|
||||||
|
For example, if the input is an array of numbers,
|
||||||
|
$n$ will be the size of the array,
|
||||||
|
and if the input is a string,
|
||||||
|
$n$ will be the length of the string.
|
||||||
|
|
||||||
|
\subsubsection*{Loops}
|
||||||
|
|
||||||
|
A common reason why an algorithm is slow is
|
||||||
|
that it contains many loops that go through the input.
|
||||||
|
The more nested loops the algorithm contains,
|
||||||
|
the slower it is.
|
||||||
|
If there are $k$ nested loops,
|
||||||
|
the time complexity is $O(n^k)$.
|
||||||
|
|
||||||
|
For example, the time complexity of the following code is $O(n)$:
|
||||||
|
\begin{lstlisting}
|
||||||
|
for (int i = 1; i <= n; i++) {
|
||||||
|
// code
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
And the time complexity of the following code is $O(n^2)$:
|
||||||
|
\begin{lstlisting}
|
||||||
|
for (int i = 1; i <= n; i++) {
|
||||||
|
for (int j = 1; j <= n; j++) {
|
||||||
|
// code
|
||||||
|
}
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
\subsubsection*{Order of magnitude}
|
||||||
|
|
||||||
|
A time complexity does not tell us the exact number
|
||||||
|
of times the code inside a loop is executed,
|
||||||
|
but it only shows the order of magnitude.
|
||||||
|
In the following examples, the code inside the loop
|
||||||
|
is executed $3n$, $n+5$ and $\lceil n/2 \rceil$ times,
|
||||||
|
but the time complexity of each code is $O(n)$.
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
for (int i = 1; i <= 3*n; i++) {
|
||||||
|
// code
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
for (int i = 1; i <= n+5; i++) {
|
||||||
|
// code
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
for (int i = 1; i <= n; i += 2) {
|
||||||
|
// code
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
As another example,
|
||||||
|
the time complexity of the following code is $O(n^2)$:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
for (int i = 1; i <= n; i++) {
|
||||||
|
for (int j = i+1; j <= n; j++) {
|
||||||
|
// code
|
||||||
|
}
|
||||||
|
}
|
||||||
|
\end{lstlisting}
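
To see why the time complexity is $O(n^2)$, note that for each value of
$i$ the inner loop runs $n-i$ times, so the total number of executions of
the inner code is
\[(n-1)+(n-2)+\cdots+1+0=\frac{n(n-1)}{2},\]
which is of the order $n^2$.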
|
||||||
|
|
||||||
|
\subsubsection*{Phases}
|
||||||
|
|
||||||
|
If the algorithm consists of consecutive phases,
|
||||||
|
the total time complexity is the largest
|
||||||
|
time complexity of a single phase.
|
||||||
|
The reason for this is that the slowest
|
||||||
|
phase is usually the bottleneck of the code.
|
||||||
|
|
||||||
|
For example, the following code consists
|
||||||
|
of three phases with time complexities
|
||||||
|
$O(n)$, $O(n^2)$ and $O(n)$.
|
||||||
|
Thus, the total time complexity is $O(n^2)$.
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
for (int i = 1; i <= n; i++) {
|
||||||
|
// code
|
||||||
|
}
|
||||||
|
for (int i = 1; i <= n; i++) {
|
||||||
|
for (int j = 1; j <= n; j++) {
|
||||||
|
// code
|
||||||
|
}
|
||||||
|
}
|
||||||
|
for (int i = 1; i <= n; i++) {
|
||||||
|
// code
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
\subsubsection*{Several variables}
|
||||||
|
|
||||||
|
Sometimes the time complexity depends on
|
||||||
|
several factors.
|
||||||
|
In this case, the time complexity formula
|
||||||
|
contains several variables.
|
||||||
|
|
||||||
|
For example, the time complexity of the
|
||||||
|
following code is $O(nm)$:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
for (int i = 1; i <= n; i++) {
|
||||||
|
for (int j = 1; j <= m; j++) {
|
||||||
|
// code
|
||||||
|
}
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
\subsubsection*{Recursion}
|
||||||
|
|
||||||
|
The time complexity of a recursive function
|
||||||
|
depends on the number of times the function is called
|
||||||
|
and the time complexity of a single call.
|
||||||
|
The total time complexity is the product of
|
||||||
|
these values.
|
||||||
|
|
||||||
|
For example, consider the following function:
|
||||||
|
\begin{lstlisting}
|
||||||
|
void f(int n) {
|
||||||
|
if (n == 1) return;
|
||||||
|
f(n-1);
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
The call $\texttt{f}(n)$ causes $n$ function calls,
|
||||||
|
and the time complexity of each call is $O(1)$.
|
||||||
|
Thus, the total time complexity is $O(n)$.
|
||||||
|
|
||||||
|
As another example, consider the following function:
|
||||||
|
\begin{lstlisting}
|
||||||
|
void g(int n) {
|
||||||
|
if (n == 1) return;
|
||||||
|
g(n-1);
|
||||||
|
g(n-1);
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
In this case each function call generates two other
|
||||||
|
calls, except for $n=1$.
|
||||||
|
Let us see what happens when $g$ is called
|
||||||
|
with parameter $n$.
|
||||||
|
The following table shows the function calls
|
||||||
|
produced by this single call:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tabular}{rr}
|
||||||
|
function call & number of calls \\
|
||||||
|
\hline
|
||||||
|
$g(n)$ & 1 \\
|
||||||
|
$g(n-1)$ & 2 \\
|
||||||
|
$g(n-2)$ & 4 \\
|
||||||
|
$\cdots$ & $\cdots$ \\
|
||||||
|
$g(1)$ & $2^{n-1}$ \\
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
|
||||||
|
Based on this, the time complexity is
|
||||||
|
\[1+2+4+\cdots+2^{n-1} = 2^n-1 = O(2^n).\]
|
||||||
|
|
||||||
|
\section{Complexity classes}
|
||||||
|
|
||||||
|
\index{complexity classes}
|
||||||
|
|
||||||
|
The following list contains common time complexities
|
||||||
|
of algorithms:
|
||||||
|
|
||||||
|
\begin{description}
|
||||||
|
\item[$O(1)$]
|
||||||
|
\index{constant-time algorithm}
|
||||||
|
The running time of a \key{constant-time} algorithm
|
||||||
|
does not depend on the input size.
|
||||||
|
A typical constant-time algorithm is a direct
|
||||||
|
formula that calculates the answer.
|
||||||
|
|
||||||
|
\item[$O(\log n)$]
|
||||||
|
\index{logarithmic algorithm}
|
||||||
|
A \key{logarithmic} algorithm often halves
|
||||||
|
the input size at each step.
|
||||||
|
The running time of such an algorithm
|
||||||
|
is logarithmic, because
|
||||||
|
$\log_2 n$ equals the number of times
|
||||||
|
$n$ must be divided by 2 to get 1.
|
||||||
|
|
||||||
|
\item[$O(\sqrt n)$]
|
||||||
|
A \key{square root algorithm} is slower than
|
||||||
|
$O(\log n)$ but faster than $O(n)$.
|
||||||
|
A special property of square roots is that
|
||||||
|
$\sqrt n = n/\sqrt n$, so the square root $\sqrt n$ lies,
|
||||||
|
in some sense, in the middle of the input (see the sketch after this list).
|
||||||
|
|
||||||
|
\item[$O(n)$]
|
||||||
|
\index{linear algorithm}
|
||||||
|
A \key{linear} algorithm goes through the input
|
||||||
|
a constant number of times.
|
||||||
|
This is often the best possible time complexity,
|
||||||
|
because it is usually necessary to access each
|
||||||
|
input element at least once before
|
||||||
|
reporting the answer.
|
||||||
|
|
||||||
|
\item[$O(n \log n)$]
|
||||||
|
This time complexity often indicates that the
|
||||||
|
algorithm sorts the input,
|
||||||
|
because the time complexity of efficient
|
||||||
|
sorting algorithms is $O(n \log n)$.
|
||||||
|
Another possibility is that the algorithm
|
||||||
|
uses a data structure where each operation
|
||||||
|
takes $O(\log n)$ time.
|
||||||
|
|
||||||
|
\item[$O(n^2)$]
|
||||||
|
\index{quadratic algorithm}
|
||||||
|
A \key{quadratic} algorithm often contains
|
||||||
|
two nested loops.
|
||||||
|
It is possible to go through all pairs of
|
||||||
|
the input elements in $O(n^2)$ time.
|
||||||
|
|
||||||
|
\item[$O(n^3)$]
|
||||||
|
\index{cubic algorithm}
|
||||||
|
A \key{cubic} algorithm often contains
|
||||||
|
three nested loops.
|
||||||
|
It is possible to go through all triplets of
|
||||||
|
the input elements in $O(n^3)$ time.
|
||||||
|
|
||||||
|
\item[$O(2^n)$]
|
||||||
|
This time complexity often indicates that
|
||||||
|
the algorithm iterates through all
|
||||||
|
subsets of the input elements.
|
||||||
|
For example, the subsets of $\{1,2,3\}$ are
|
||||||
|
$\emptyset$, $\{1\}$, $\{2\}$, $\{3\}$, $\{1,2\}$,
|
||||||
|
$\{1,3\}$, $\{2,3\}$ and $\{1,2,3\}$.
|
||||||
|
|
||||||
|
\item[$O(n!)$]
|
||||||
|
This time complexity often indicates that
|
||||||
|
the algorithm iterates through all
|
||||||
|
permutations of the input elements.
|
||||||
|
For example, the permutations of $\{1,2,3\}$ are
|
||||||
|
$(1,2,3)$, $(1,3,2)$, $(2,1,3)$, $(2,3,1)$,
|
||||||
|
$(3,1,2)$ and $(3,2,1)$.
|
||||||
|
|
||||||
|
\end{description}
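
As a sketch of a square root algorithm, assuming that $n$ is a positive
integer stored in a \texttt{long long} variable, the following code
counts the divisors of $n$ by only going up to $\sqrt n$, because the
divisors come in pairs $i$ and $n/i$:
\begin{lstlisting}
long long divisors = 0;
for (long long i = 1; i*i <= n; i++) {
    if (n%i == 0) {
        divisors += 2;            // both i and n/i divide n
        if (i*i == n) divisors--; // i and n/i are the same divisor
    }
}
\end{lstlisting}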
|
||||||
|
|
||||||
|
\index{polynomial algorithm}
|
||||||
|
An algorithm is \key{polynomial}
|
||||||
|
if its time complexity is at most $O(n^k)$
|
||||||
|
where $k$ is a constant.
|
||||||
|
All the above time complexities except
|
||||||
|
$O(2^n)$ and $O(n!)$ are polynomial.
|
||||||
|
In practice, the constant $k$ is usually small,
|
||||||
|
and therefore a polynomial time complexity
|
||||||
|
roughly means that the algorithm is \emph{efficient}.
|
||||||
|
|
||||||
|
\index{NP-hard problem}
|
||||||
|
|
||||||
|
Most algorithms in this book are polynomial.
|
||||||
|
Still, there are many important problems for which
|
||||||
|
no polynomial algorithm is known, i.e.,
|
||||||
|
nobody knows how to solve them efficiently.
|
||||||
|
\key{NP-hard} problems are an important set
|
||||||
|
of problems, for which no polynomial algorithm
|
||||||
|
is known\footnote{A classic book on the topic is
|
||||||
|
M. R. Garey's and D. S. Johnson's
|
||||||
|
\emph{Computers and Intractability: A Guide to the Theory
|
||||||
|
of NP-Completeness} \cite{gar79}.}.
|
||||||
|
|
||||||
|
\section{Estimating efficiency}

By calculating the time complexity of an algorithm,
it is possible to check, before
implementing the algorithm, that it is
efficient enough for the problem.
The starting point for estimations is the fact that
a modern computer can perform some hundreds of
millions of operations in a second.

For example, assume that the time limit for
a problem is one second and the input size is $n=10^5$.
If the time complexity is $O(n^2)$,
the algorithm will perform about $(10^5)^2=10^{10}$ operations.
This should take at least some tens of seconds,
so the algorithm seems to be too slow for solving the problem.

On the other hand, given the input size,
we can try to \emph{guess}
the required time complexity of the algorithm
that solves the problem.
The following table contains some useful estimates
assuming a time limit of one second.

\begin{center}
\begin{tabular}{ll}
input size & required time complexity \\
\hline
$n \le 10$ & $O(n!)$ \\
$n \le 20$ & $O(2^n)$ \\
$n \le 500$ & $O(n^3)$ \\
$n \le 5000$ & $O(n^2)$ \\
$n \le 10^6$ & $O(n \log n)$ or $O(n)$ \\
$n$ is large & $O(1)$ or $O(\log n)$ \\
\end{tabular}
\end{center}

For example, if the input size is $n=10^5$,
it is probably expected that the time
complexity of the algorithm is $O(n)$ or $O(n \log n)$.
This information makes it easier to design the algorithm,
because it rules out approaches that would yield
an algorithm with a worse time complexity.

\index{constant factor}

Still, it is important to remember that a
time complexity is only an estimate of efficiency,
because it hides the \emph{constant factors}.
For example, an algorithm that runs in $O(n)$ time
may perform $n/2$ or $5n$ operations.
This has an important effect on the actual
running time of the algorithm.

\section{Maximum subarray sum}

\index{maximum subarray sum}

There are often several possible algorithms
for solving a problem such that their
time complexities are different.
This section discusses a classic problem that
has a straightforward $O(n^3)$ solution.
However, by designing a better algorithm, it
is possible to solve the problem in $O(n^2)$
time and even in $O(n)$ time.

Given an array of $n$ numbers,
our task is to calculate the
\key{maximum subarray sum}, i.e.,
the largest possible sum of
a sequence of consecutive values
in the array\footnote{J. Bentley's
book \emph{Programming Pearls} \cite{ben86} made the problem popular.}.
The problem is interesting when there may be
negative values in the array.
For example, in the array
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$-1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$-3$};
\node at (4.5,0.5) {$5$};
\node at (5.5,0.5) {$2$};
\node at (6.5,0.5) {$-5$};
\node at (7.5,0.5) {$2$};
\end{tikzpicture}
\end{center}
\begin{samepage}
the following subarray produces the maximum sum $10$:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (1,0) rectangle (6,1);
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$-1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$-3$};
\node at (4.5,0.5) {$5$};
\node at (5.5,0.5) {$2$};
\node at (6.5,0.5) {$-5$};
\node at (7.5,0.5) {$2$};
\end{tikzpicture}
\end{center}
\end{samepage}

We assume that an empty subarray is allowed,
so the maximum subarray sum is always at least $0$.

\subsubsection{Algorithm 1}

A straightforward way to solve the problem
is to go through all possible subarrays,
calculate the sum of values in each subarray and maintain
the maximum sum.
The following code implements this algorithm:

\begin{lstlisting}
int best = 0;
for (int a = 0; a < n; a++) {
    for (int b = a; b < n; b++) {
        int sum = 0;
        for (int k = a; k <= b; k++) {
            sum += array[k];
        }
        best = max(best,sum);
    }
}
cout << best << "\n";
\end{lstlisting}

The variables \texttt{a} and \texttt{b} fix the first and
last index of the subarray,
and the sum of the values is accumulated in the variable \texttt{sum}.
The variable \texttt{best} contains the maximum sum found during the search.

The time complexity of the algorithm is $O(n^3)$,
because it consists of three nested loops
that go through the input.

\subsubsection{Algorithm 2}

It is easy to make Algorithm 1 more efficient
by removing one loop from it.
This is possible by calculating the sum at the same
time as the right end of the subarray moves.
The result is the following code:

\begin{lstlisting}
int best = 0;
for (int a = 0; a < n; a++) {
    int sum = 0;
    for (int b = a; b < n; b++) {
        sum += array[b];
        best = max(best,sum);
    }
}
cout << best << "\n";
\end{lstlisting}
After this change, the time complexity is $O(n^2)$.

\subsubsection{Algorithm 3}

Surprisingly, it is possible to solve the problem
in $O(n)$ time\footnote{In \cite{ben86}, this linear-time algorithm
is attributed to J. B. Kadane, and the algorithm is sometimes
called \index{Kadane's algorithm} \key{Kadane's algorithm}.}, which means
that just one loop is enough.
The idea is to calculate, for each array position,
the maximum sum of a subarray that ends at that position.
After this, the answer for the problem is the
maximum of those sums.

Consider the subproblem of finding the maximum-sum subarray
that ends at position $k$.
There are two possibilities:
\begin{enumerate}
\item The subarray only contains the element at position $k$.
\item The subarray consists of a subarray that ends
at position $k-1$, followed by the element at position $k$.
\end{enumerate}

In the latter case, since we want to
find a subarray with maximum sum,
the subarray that ends at position $k-1$
should also have the maximum sum.
Thus, we can solve the problem efficiently
by calculating the maximum subarray sum
for each ending position from left to right.

The following code implements the algorithm:
\begin{lstlisting}
int best = 0, sum = 0;
for (int k = 0; k < n; k++) {
    sum = max(array[k],sum+array[k]);
    best = max(best,sum);
}
cout << best << "\n";
\end{lstlisting}

The algorithm only contains one loop
that goes through the input,
so the time complexity is $O(n)$.
This is also the best possible time complexity,
because any algorithm for the problem
has to examine all array elements at least once.

\subsubsection{Efficiency comparison}

It is interesting to study how efficient
algorithms are in practice.
The following table shows the running times
of the above algorithms for different
values of $n$ on a modern computer.

In each test, the input was generated randomly.
The time needed for reading the input was not
measured.

\begin{center}
\begin{tabular}{rrrr}
array size $n$ & Algorithm 1 & Algorithm 2 & Algorithm 3 \\
\hline
$10^2$ & $0.0$ s & $0.0$ s & $0.0$ s \\
$10^3$ & $0.1$ s & $0.0$ s & $0.0$ s \\
$10^4$ & > $10.0$ s & $0.1$ s & $0.0$ s \\
$10^5$ & > $10.0$ s & $5.3$ s & $0.0$ s \\
$10^6$ & > $10.0$ s & > $10.0$ s & $0.0$ s \\
$10^7$ & > $10.0$ s & > $10.0$ s & $0.0$ s \\
\end{tabular}
\end{center}

The comparison shows that all algorithms
are efficient when the input size is small,
but larger inputs bring out remarkable
differences in the running times of the algorithms.
Algorithm 1 becomes slow
when $n=10^4$, and Algorithm 2
becomes slow when $n=10^5$.
Only Algorithm 3 is able to process
even the largest inputs instantly.

\chapter{Sorting}

\index{sorting}

\key{Sorting}
is a fundamental algorithm design problem.
Many efficient algorithms
use sorting as a subroutine,
because it is often easier to process
data if the elements are in a sorted order.

For example, the problem ``does an array contain
two equal elements?'' is easy to solve using sorting.
If the array contains two equal elements,
they will be next to each other after sorting,
so it is easy to find them.
Also, the problem ``what is the most frequent element
in an array?'' can be solved similarly.
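
For example, one possible way to check whether two equal elements exist
is to sort the array and then compare consecutive elements
(a sketch, assuming the input is in a vector \texttt{v}):
\begin{lstlisting}
sort(v.begin(), v.end());
bool equal = false;
for (int i = 1; i < v.size(); i++) {
    // after sorting, equal elements are next to each other
    if (v[i] == v[i-1]) equal = true;
}
// now equal tells whether the array contains two equal elements
\end{lstlisting}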

There are many algorithms for sorting, and they are
also good examples of how to apply
different algorithm design techniques.
The efficient general sorting algorithms
work in $O(n \log n)$ time,
and many algorithms that use sorting
as a subroutine also
have this time complexity.

\section{Sorting theory}

The basic problem in sorting is as follows:
\begin{framed}
\noindent
Given an array that contains $n$ elements,
your task is to sort the elements
in increasing order.
\end{framed}
\noindent
For example, the array
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$8$};
\node at (3.5,0.5) {$2$};
\node at (4.5,0.5) {$9$};
\node at (5.5,0.5) {$2$};
\node at (6.5,0.5) {$5$};
\node at (7.5,0.5) {$6$};
\end{tikzpicture}
\end{center}
will be as follows after sorting:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$2$};
\node at (3.5,0.5) {$3$};
\node at (4.5,0.5) {$5$};
\node at (5.5,0.5) {$6$};
\node at (6.5,0.5) {$8$};
\node at (7.5,0.5) {$9$};
\end{tikzpicture}
\end{center}

\subsubsection{$O(n^2)$ algorithms}

\index{bubble sort}

Simple algorithms for sorting an array
work in $O(n^2)$ time.
Such algorithms are short and usually
consist of two nested loops.
A famous $O(n^2)$ time sorting algorithm
is \key{bubble sort} where the elements
``bubble'' in the array according to their values.

Bubble sort consists of $n$ rounds.
On each round, the algorithm iterates through
the elements of the array.
Whenever two consecutive elements are found
that are not in the correct order,
the algorithm swaps them.
The algorithm can be implemented as follows:
\begin{lstlisting}
for (int i = 0; i < n; i++) {
    for (int j = 0; j < n-1; j++) {
        if (array[j] > array[j+1]) {
            swap(array[j],array[j+1]);
        }
    }
}
\end{lstlisting}

After the first round of the algorithm,
the largest element will be in the correct position,
and in general, after $k$ rounds, the $k$ largest
elements will be in the correct positions.
Thus, after $n$ rounds, the whole array
will be sorted.

For example, in the array

\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$8$};
\node at (3.5,0.5) {$2$};
\node at (4.5,0.5) {$9$};
\node at (5.5,0.5) {$2$};
\node at (6.5,0.5) {$5$};
\node at (7.5,0.5) {$6$};
\end{tikzpicture}
\end{center}

\noindent
the first round of bubble sort swaps elements
as follows:

\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$2$};
\node at (3.5,0.5) {$8$};
\node at (4.5,0.5) {$9$};
\node at (5.5,0.5) {$2$};
\node at (6.5,0.5) {$5$};
\node at (7.5,0.5) {$6$};

\draw[thick,<->] (3.5,-0.25) .. controls (3.25,-1.00) and (2.75,-1.00) .. (2.5,-0.25);
\end{tikzpicture}
\end{center}

\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$2$};
\node at (3.5,0.5) {$8$};
\node at (4.5,0.5) {$2$};
\node at (5.5,0.5) {$9$};
\node at (6.5,0.5) {$5$};
\node at (7.5,0.5) {$6$};

\draw[thick,<->] (5.5,-0.25) .. controls (5.25,-1.00) and (4.75,-1.00) .. (4.5,-0.25);
\end{tikzpicture}
\end{center}

\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$2$};
\node at (3.5,0.5) {$8$};
\node at (4.5,0.5) {$2$};
\node at (5.5,0.5) {$5$};
\node at (6.5,0.5) {$9$};
\node at (7.5,0.5) {$6$};

\draw[thick,<->] (6.5,-0.25) .. controls (6.25,-1.00) and (5.75,-1.00) .. (5.5,-0.25);
\end{tikzpicture}
\end{center}

\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$2$};
\node at (3.5,0.5) {$8$};
\node at (4.5,0.5) {$2$};
\node at (5.5,0.5) {$5$};
\node at (6.5,0.5) {$6$};
\node at (7.5,0.5) {$9$};

\draw[thick,<->] (7.5,-0.25) .. controls (7.25,-1.00) and (6.75,-1.00) .. (6.5,-0.25);
\end{tikzpicture}
\end{center}

\subsubsection{Inversions}

\index{inversion}

Bubble sort is an example of a sorting
algorithm that always swaps \emph{consecutive}
elements in the array.
It turns out that the time complexity
of such an algorithm is \emph{always}
at least $O(n^2)$, because in the worst case,
$O(n^2)$ swaps are required for sorting the array.

A useful concept when analyzing sorting
algorithms is an \key{inversion}:
a pair of array elements
$(\texttt{array}[a],\texttt{array}[b])$ such that
$a<b$ and $\texttt{array}[a]>\texttt{array}[b]$,
i.e., the elements are in the wrong order.
For example, the array
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$2$};
\node at (3.5,0.5) {$6$};
\node at (4.5,0.5) {$3$};
\node at (5.5,0.5) {$5$};
\node at (6.5,0.5) {$9$};
\node at (7.5,0.5) {$8$};
\end{tikzpicture}
\end{center}
has three inversions: $(6,3)$, $(6,5)$ and $(9,8)$.
The number of inversions indicates
how much work is needed to sort the array.
An array is completely sorted when
there are no inversions.
On the other hand, if the array elements
are in the reverse order,
the number of inversions is the largest possible:
\[1+2+\cdots+(n-1)=\frac{n(n-1)}{2} = O(n^2)\]

Swapping a pair of consecutive elements that are
in the wrong order removes exactly one inversion
from the array.
Hence, if a sorting algorithm can only
swap consecutive elements, each swap removes
at most one inversion, and the time complexity
of the algorithm is at least $O(n^2)$.
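
For concreteness, one possible way to count the inversions directly
is to check every pair of positions, which takes $O(n^2)$ time
(a sketch, assuming the elements are in \texttt{array} with $n$ elements):
\begin{lstlisting}
int inversions = 0;
for (int a = 0; a < n; a++) {
    for (int b = a+1; b < n; b++) {
        // the pair is an inversion if the elements are in the wrong order
        if (array[a] > array[b]) inversions++;
    }
}
cout << inversions << "\n";
\end{lstlisting}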

\subsubsection{$O(n \log n)$ algorithms}

\index{merge sort}

It is possible to sort an array efficiently
in $O(n \log n)$ time using algorithms
that are not limited to swapping consecutive elements.
One such algorithm is \key{merge sort}\footnote{According to \cite{knu983},
merge sort was invented by J. von Neumann in 1945.},
which is based on recursion.

Merge sort sorts a subarray \texttt{array}$[a \ldots b]$ as follows:

\begin{enumerate}
\item If $a=b$, do not do anything, because the subarray is already sorted.
\item Calculate the position of the middle element: $k=\lfloor (a+b)/2 \rfloor$.
\item Recursively sort the subarray \texttt{array}$[a \ldots k]$.
\item Recursively sort the subarray \texttt{array}$[k+1 \ldots b]$.
\item \emph{Merge} the sorted subarrays \texttt{array}$[a \ldots k]$ and
\texttt{array}$[k+1 \ldots b]$
into a sorted subarray \texttt{array}$[a \ldots b]$.
\end{enumerate}

Merge sort is an efficient algorithm, because it
halves the size of the subarray at each step.
The recursion consists of $O(\log n)$ levels,
and processing each level takes $O(n)$ time.
Merging the subarrays \texttt{array}$[a \ldots k]$ and \texttt{array}$[k+1 \ldots b]$
is possible in linear time, because they are already sorted.
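
One possible way to turn the above steps into code is sketched below;
this version merges the two halves through a temporary vector
(the function and its name are only illustrative):
\begin{lstlisting}
void mergeSort(int array[], int a, int b) {
    if (a == b) return; // a subarray of one element is already sorted
    int k = (a+b)/2;
    mergeSort(array, a, k);
    mergeSort(array, k+1, b);
    // merge the sorted subarrays array[a..k] and array[k+1..b]
    vector<int> t;
    int i = a, j = k+1;
    while (i <= k || j <= b) {
        if (j > b || (i <= k && array[i] <= array[j])) t.push_back(array[i++]);
        else t.push_back(array[j++]);
    }
    for (int x = 0; x < t.size(); x++) array[a+x] = t[x];
}
\end{lstlisting}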

For example, consider sorting the following array:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$6$};
\node at (3.5,0.5) {$2$};
\node at (4.5,0.5) {$8$};
\node at (5.5,0.5) {$2$};
\node at (6.5,0.5) {$5$};
\node at (7.5,0.5) {$9$};
\end{tikzpicture}
\end{center}

The array will be divided into two subarrays
as follows:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (4,1);
\draw (5,0) grid (9,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$6$};
\node at (3.5,0.5) {$2$};

\node at (5.5,0.5) {$8$};
\node at (6.5,0.5) {$2$};
\node at (7.5,0.5) {$5$};
\node at (8.5,0.5) {$9$};
\end{tikzpicture}
\end{center}

Then, the subarrays will be sorted recursively
as follows:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (4,1);
\draw (5,0) grid (9,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$3$};
\node at (3.5,0.5) {$6$};

\node at (5.5,0.5) {$2$};
\node at (6.5,0.5) {$5$};
\node at (7.5,0.5) {$8$};
\node at (8.5,0.5) {$9$};
\end{tikzpicture}
\end{center}

Finally, the algorithm merges the sorted
subarrays and creates the final sorted array:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$2$};
\node at (3.5,0.5) {$3$};
\node at (4.5,0.5) {$5$};
\node at (5.5,0.5) {$6$};
\node at (6.5,0.5) {$8$};
\node at (7.5,0.5) {$9$};
\end{tikzpicture}
\end{center}

\subsubsection{Sorting lower bound}

Is it possible to sort an array faster
than in $O(n \log n)$ time?
It turns out that this is \emph{not} possible
when we restrict ourselves to sorting algorithms
that are based on comparing array elements.

The lower bound for the time complexity
can be proved by considering sorting
as a process where each comparison of two elements
gives more information about the contents of the array.
The process creates the following tree:

\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) rectangle (3,1);
\node at (1.5,0.5) {$x < y?$};

\draw[thick,->] (1.5,0) -- (-2.5,-1.5);
\draw[thick,->] (1.5,0) -- (5.5,-1.5);

\draw (-4,-2.5) rectangle (-1,-1.5);
\draw (4,-2.5) rectangle (7,-1.5);
\node at (-2.5,-2) {$x < y?$};
\node at (5.5,-2) {$x < y?$};

\draw[thick,->] (-2.5,-2.5) -- (-4.5,-4);
\draw[thick,->] (-2.5,-2.5) -- (-0.5,-4);
\draw[thick,->] (5.5,-2.5) -- (3.5,-4);
\draw[thick,->] (5.5,-2.5) -- (7.5,-4);

\draw (-6,-5) rectangle (-3,-4);
\draw (-2,-5) rectangle (1,-4);
\draw (2,-5) rectangle (5,-4);
\draw (6,-5) rectangle (9,-4);
\node at (-4.5,-4.5) {$x < y?$};
\node at (-0.5,-4.5) {$x < y?$};
\node at (3.5,-4.5) {$x < y?$};
\node at (7.5,-4.5) {$x < y?$};

\draw[thick,->] (-4.5,-5) -- (-5.5,-6);
\draw[thick,->] (-4.5,-5) -- (-3.5,-6);
\draw[thick,->] (-0.5,-5) -- (0.5,-6);
\draw[thick,->] (-0.5,-5) -- (-1.5,-6);
\draw[thick,->] (3.5,-5) -- (2.5,-6);
\draw[thick,->] (3.5,-5) -- (4.5,-6);
\draw[thick,->] (7.5,-5) -- (6.5,-6);
\draw[thick,->] (7.5,-5) -- (8.5,-6);
\end{tikzpicture}
\end{center}

Here ``$x<y?$'' means that some elements
$x$ and $y$ are compared.
If $x<y$, the process continues to the left,
and otherwise to the right.
The results of the process are the possible
ways to sort the array, a total of $n!$ ways.
For this reason, the height of the tree
must be at least
\[ \log_2(n!) = \log_2(1)+\log_2(2)+\cdots+\log_2(n).\]
We get a lower bound for this sum
by choosing the last $n/2$ elements and
changing the value of each element to $\log_2(n/2)$.
This yields an estimate
\[ \log_2(n!) \ge (n/2) \cdot \log_2(n/2),\]
so the height of the tree and the minimum
possible number of steps in a sorting
algorithm in the worst case
is at least $n \log n$.

\subsubsection{Counting sort}

\index{counting sort}

The lower bound $n \log n$ does not apply to
algorithms that do not compare array elements
but use some other information.
An example of such an algorithm is
\key{counting sort} that sorts an array in
$O(n)$ time assuming that every element in the array
is an integer in the range $0 \ldots c$, where $c=O(n)$.

The algorithm creates a \emph{bookkeeping} array,
whose indices are elements of the original array.
The algorithm iterates through the original array
and calculates how many times each element
appears in the array.
\newpage

For example, the array
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$6$};
\node at (3.5,0.5) {$9$};
\node at (4.5,0.5) {$9$};
\node at (5.5,0.5) {$3$};
\node at (6.5,0.5) {$5$};
\node at (7.5,0.5) {$9$};
\end{tikzpicture}
\end{center}
corresponds to the following bookkeeping array:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (9,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$0$};
\node at (2.5,0.5) {$2$};
\node at (3.5,0.5) {$0$};
\node at (4.5,0.5) {$1$};
\node at (5.5,0.5) {$1$};
\node at (6.5,0.5) {$0$};
\node at (7.5,0.5) {$0$};
\node at (8.5,0.5) {$3$};

\footnotesize

\node at (0.5,1.5) {$1$};
\node at (1.5,1.5) {$2$};
\node at (2.5,1.5) {$3$};
\node at (3.5,1.5) {$4$};
\node at (4.5,1.5) {$5$};
\node at (5.5,1.5) {$6$};
\node at (6.5,1.5) {$7$};
\node at (7.5,1.5) {$8$};
\node at (8.5,1.5) {$9$};
\end{tikzpicture}
\end{center}

For example, the value at position 3
in the bookkeeping array is 2,
because the element 3 appears 2 times
in the original array.

Construction of the bookkeeping array
takes $O(n)$ time. After this, the sorted array
can be created in $O(n)$ time because
the number of occurrences of each element can be retrieved
from the bookkeeping array.
Thus, the total time complexity of counting
sort is $O(n)$.

Counting sort is a very efficient algorithm
but it can only be used when the constant $c$
is small enough, so that the array elements can
be used as indices in the bookkeeping array.
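
A possible implementation of the idea could look as follows
(a sketch, assuming the elements are in \texttt{array}, every value is
in the range $0 \ldots c$, and the result is written to an array \texttt{sorted} of size $n$):
\begin{lstlisting}
vector<int> bookkeeping(c+1, 0);
for (int i = 0; i < n; i++) {
    bookkeeping[array[i]]++;
}
// go through the values in increasing order and output each value
// as many times as it appears in the original array
int pos = 0;
for (int x = 0; x <= c; x++) {
    for (int j = 0; j < bookkeeping[x]; j++) {
        sorted[pos++] = x;
    }
}
\end{lstlisting}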

\section{Sorting in C++}

\index{sort@\texttt{sort}}

It is almost never a good idea to use
a home-made sorting algorithm
in a contest, because there are good
implementations available in programming languages.
For example, the C++ standard library contains
the function \texttt{sort} that can be easily used for
sorting arrays and other data structures.

There are many benefits in using a library function.
First, it saves time because there is no need to
implement the function.
Second, the library implementation is
certainly correct and efficient: it is unlikely
that a home-made sorting function would be better.

In this section we will see how to use the
C++ \texttt{sort} function.
The following code sorts
a vector in increasing order:
\begin{lstlisting}
vector<int> v = {4,2,5,3,5,8,3};
sort(v.begin(),v.end());
\end{lstlisting}
After the sorting, the contents of the
vector will be
$[2,3,3,4,5,5,8]$.
The default sorting order is increasing,
but a reverse order is possible as follows:
\begin{lstlisting}
sort(v.rbegin(),v.rend());
\end{lstlisting}
An ordinary array can be sorted as follows:
\begin{lstlisting}
int n = 7; // array size
int a[] = {4,2,5,3,5,8,3};
sort(a,a+n);
\end{lstlisting}
\newpage
The following code sorts the string \texttt{s}:
\begin{lstlisting}
string s = "monkey";
sort(s.begin(), s.end());
\end{lstlisting}
Sorting a string means that the characters
of the string are sorted.
For example, the string ``monkey'' becomes ``ekmnoy''.

\subsubsection{Comparison operators}

\index{comparison operator}

The function \texttt{sort} requires that
a \key{comparison operator} is defined for the data type
of the elements to be sorted.
When sorting, this operator will be used
whenever it is necessary to find out the order of two elements.

Most C++ data types have a built-in comparison operator,
and elements of those types can be sorted automatically.
For example, numbers are sorted according to their values
and strings are sorted in alphabetical order.

\index{pair@\texttt{pair}}

Pairs (\texttt{pair}) are sorted primarily according to their
first elements (\texttt{first}).
However, if the first elements of two pairs are equal,
they are sorted according to their second elements (\texttt{second}):
\begin{lstlisting}
vector<pair<int,int>> v;
v.push_back({1,5});
v.push_back({2,3});
v.push_back({1,2});
sort(v.begin(), v.end());
\end{lstlisting}
After this, the order of the pairs is
$(1,2)$, $(1,5)$ and $(2,3)$.

\index{tuple@\texttt{tuple}}

In a similar way, tuples (\texttt{tuple})
are sorted primarily by the first element,
secondarily by the second element, etc.\footnote{Note that in some older compilers,
the function \texttt{make\_tuple} has to be used to create a tuple instead of
braces (for example, \texttt{make\_tuple(2,1,4)} instead of \texttt{\{2,1,4\}}).}:
\begin{lstlisting}
vector<tuple<int,int,int>> v;
v.push_back({2,1,4});
v.push_back({1,5,3});
v.push_back({2,1,3});
sort(v.begin(), v.end());
\end{lstlisting}
After this, the order of the tuples is
$(1,5,3)$, $(2,1,3)$ and $(2,1,4)$.

\subsubsection{User-defined structs}

User-defined structs do not have a comparison
operator automatically.
The operator should be defined inside
the struct as a function
\texttt{operator<},
whose parameter is another element of the same type.
The operator should return \texttt{true}
if the element is smaller than the parameter,
and \texttt{false} otherwise.

For example, the following struct \texttt{P}
contains the x and y coordinates of a point.
The comparison operator is defined so that
the points are sorted primarily by the x coordinate
and secondarily by the y coordinate.

\begin{lstlisting}
struct P {
    int x, y;
    bool operator<(const P &p) const {
        if (x != p.x) return x < p.x;
        else return y < p.y;
    }
};
\end{lstlisting}

\subsubsection{Comparison functions}

\index{comparison function}

It is also possible to give an external
\key{comparison function} to the \texttt{sort} function
as a callback function.
For example, the following comparison function \texttt{comp}
sorts strings primarily by length and secondarily
by alphabetical order:

\begin{lstlisting}
bool comp(string a, string b) {
    if (a.size() != b.size()) return a.size() < b.size();
    return a < b;
}
\end{lstlisting}
Now a vector of strings can be sorted as follows:
\begin{lstlisting}
sort(v.begin(), v.end(), comp);
\end{lstlisting}

\section{Binary search}

\index{binary search}

A general method for searching for an element
in an array is to use a \texttt{for} loop
that iterates through the elements of the array.
For example, the following code searches for
an element $x$ in an array:

\begin{lstlisting}
for (int i = 0; i < n; i++) {
    if (array[i] == x) {
        // x found at index i
    }
}
\end{lstlisting}

The time complexity of this approach is $O(n)$,
because in the worst case, it is necessary to check
all elements of the array.
If the order of the elements is arbitrary,
this is also the best possible approach, because
there is no additional information available where
in the array we should search for the element $x$.

However, if the array is \emph{sorted},
the situation is different.
In this case it is possible to perform the
search much faster, because the order of the
elements in the array guides the search.
The following \key{binary search} algorithm
efficiently searches for an element in a sorted array
in $O(\log n)$ time.

\subsubsection{Method 1}

The usual way to implement binary search
resembles looking for a word in a dictionary.
The search maintains an active region in the array,
which initially contains all array elements.
Then, a number of steps is performed,
each of which halves the size of the region.

At each step, the search checks the middle element
of the active region.
If the middle element is the target element,
the search terminates.
Otherwise, the search recursively continues
to the left or right half of the region,
depending on the value of the middle element.

The above idea can be implemented as follows:
\begin{lstlisting}
int a = 0, b = n-1;
while (a <= b) {
    int k = (a+b)/2;
    if (array[k] == x) {
        // x found at index k
    }
    if (array[k] > x) b = k-1;
    else a = k+1;
}
\end{lstlisting}

In this implementation, the active region is $a \ldots b$,
and initially the region is $0 \ldots n-1$.
The algorithm halves the size of the region at each step,
so the time complexity is $O(\log n)$.

\subsubsection{Method 2}

An alternative method to implement binary search
is based on an efficient way to iterate through
the elements of the array.
The idea is to make jumps and slow down
when we get closer to the target element.

The search goes through the array from left to
right, and the initial jump length is $n/2$.
At each step, the jump length will be halved:
first $n/4$, then $n/8$, $n/16$, etc., until
finally the length is 1.
After the jumps, either the target element has
been found or we know that it does not appear in the array.

The following code implements the above idea:
\begin{lstlisting}
int k = 0;
for (int b = n/2; b >= 1; b /= 2) {
    while (k+b < n && array[k+b] <= x) k += b;
}
if (array[k] == x) {
    // x found at index k
}
\end{lstlisting}

During the search, the variable $b$
contains the current jump length.
The time complexity of the algorithm is $O(\log n)$,
because the code in the \texttt{while} loop
is performed at most twice for each jump length.

\subsubsection{C++ functions}

The C++ standard library contains the following functions
that are based on binary search and work in logarithmic time:

\begin{itemize}
\item \texttt{lower\_bound} returns a pointer to the
first array element whose value is at least $x$.
\item \texttt{upper\_bound} returns a pointer to the
first array element whose value is larger than $x$.
\item \texttt{equal\_range} returns both above pointers.
\end{itemize}

The functions assume that the array is sorted.
If there is no such element, the pointer points to
the position after the last array element.
For example, the following code finds out whether
an array contains an element with value $x$:

\begin{lstlisting}
auto k = lower_bound(array,array+n,x)-array;
if (k < n && array[k] == x) {
    // x found at index k
}
\end{lstlisting}

Then, the following code counts the number of elements
whose value is $x$:

\begin{lstlisting}
auto a = lower_bound(array, array+n, x);
auto b = upper_bound(array, array+n, x);
cout << b-a << "\n";
\end{lstlisting}

Using \texttt{equal\_range}, the code becomes shorter:

\begin{lstlisting}
auto r = equal_range(array, array+n, x);
cout << r.second-r.first << "\n";
\end{lstlisting}

\subsubsection{Finding the smallest solution}

An important use for binary search is
to find the position where the value of a \emph{function} changes.
Suppose that we wish to find the smallest value $k$
that is a valid solution for a problem.
We are given a function $\texttt{ok}(x)$
that returns \texttt{true} if $x$ is a valid solution
and \texttt{false} otherwise.
In addition, we know that $\texttt{ok}(x)$ is \texttt{false}
when $x<k$ and \texttt{true} when $x \ge k$.
The situation looks as follows:

\begin{center}
\begin{tabular}{r|rrrrrrrr}
$x$ & 0 & 1 & $\cdots$ & $k-1$ & $k$ & $k+1$ & $\cdots$ \\
\hline
$\texttt{ok}(x)$ & \texttt{false} & \texttt{false}
& $\cdots$ & \texttt{false} & \texttt{true} & \texttt{true} & $\cdots$ \\
\end{tabular}
\end{center}

\noindent
Now, the value of $k$ can be found using binary search:

\begin{lstlisting}
int x = -1;
for (int b = z; b >= 1; b /= 2) {
    while (!ok(x+b)) x += b;
}
int k = x+1;
\end{lstlisting}

The search finds the largest value of $x$ for which
$\texttt{ok}(x)$ is \texttt{false}.
Thus, the next value $k=x+1$
is the smallest possible value for which
$\texttt{ok}(k)$ is \texttt{true}.
The initial jump length $z$ has to be
large enough, for example some value
for which we know beforehand that $\texttt{ok}(z)$ is \texttt{true}.

The algorithm calls the function \texttt{ok}
$O(\log z)$ times, so the total time complexity
depends on the function \texttt{ok}.
For example, if the function works in $O(n)$ time,
the total time complexity is $O(n \log z)$.
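
As a hypothetical example of the pattern, the following sketch finds the
smallest integer $k$ with $k^2 \ge n$: here $\texttt{ok}(x)$ simply checks the
condition $x^2 \ge n$, and the initial jump length $z=n$ is certainly large
enough when $n \ge 1$.
\begin{lstlisting}
long long n = 50;
auto ok = [&](long long x) { return x*x >= n; };
long long x = -1;
for (long long b = n; b >= 1; b /= 2) {
    while (!ok(x+b)) x += b;
}
long long k = x+1; // now k = 8, because 7*7 < 50 and 8*8 >= 50
\end{lstlisting}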

\subsubsection{Finding the maximum value}

Binary search can also be used to find
the maximum value for a function that is
first increasing and then decreasing.
Our task is to find a position $k$ such that

\begin{itemize}
\item
$f(x)<f(x+1)$ when $x<k$, and
\item
$f(x)>f(x+1)$ when $x \ge k$.
\end{itemize}

The idea is to use binary search
for finding the largest value of $x$
for which $f(x)<f(x+1)$.
This implies that $k=x+1$
because $f(x+1)>f(x+2)$.
The following code implements the search:

\begin{lstlisting}
int x = -1;
for (int b = z; b >= 1; b /= 2) {
    while (f(x+b) < f(x+b+1)) x += b;
}
int k = x+1;
\end{lstlisting}

Note that unlike in the ordinary binary search,
here consecutive values of the function
are not allowed to be equal.
In this case it would not be possible to know
how to continue the search.

\chapter{Data structures}

\index{data structure}

A \key{data structure} is a way to store
data in the memory of a computer.
It is important to choose an appropriate
data structure for a problem,
because each data structure has its own
advantages and disadvantages.
The crucial question is: which operations
are efficient in the chosen data structure?

This chapter introduces the most important
data structures in the C++ standard library.
It is a good idea to use the standard library
whenever possible,
because it will save a lot of time.
Later in the book we will learn about more sophisticated
data structures that are not available
in the standard library.

\section{Dynamic arrays}

\index{dynamic array}
\index{vector}

A \key{dynamic array} is an array whose
size can be changed during the execution
of the program.
The most popular dynamic array in C++ is
the \texttt{vector} structure,
which can be used almost like an ordinary array.

The following code creates an empty vector and
adds three elements to it:

\begin{lstlisting}
vector<int> v;
v.push_back(3); // [3]
v.push_back(2); // [3,2]
v.push_back(5); // [3,2,5]
\end{lstlisting}

After this, the elements can be accessed like in an ordinary array:

\begin{lstlisting}
cout << v[0] << "\n"; // 3
cout << v[1] << "\n"; // 2
cout << v[2] << "\n"; // 5
\end{lstlisting}

The function \texttt{size} returns the number of elements in the vector.
The following code iterates through
the vector and prints all elements in it:

\begin{lstlisting}
for (int i = 0; i < v.size(); i++) {
    cout << v[i] << "\n";
}
\end{lstlisting}

\begin{samepage}
A shorter way to iterate through a vector is as follows:

\begin{lstlisting}
for (auto x : v) {
    cout << x << "\n";
}
\end{lstlisting}
\end{samepage}

The function \texttt{back} returns the last element
in the vector, and
the function \texttt{pop\_back} removes the last element:

\begin{lstlisting}
vector<int> v;
v.push_back(5);
v.push_back(2);
cout << v.back() << "\n"; // 2
v.pop_back();
cout << v.back() << "\n"; // 5
\end{lstlisting}

The following code creates a vector with five elements:

\begin{lstlisting}
vector<int> v = {2,4,2,5,1};
\end{lstlisting}

Another way to create a vector is to give the number
of elements and the initial value for each element:

\begin{lstlisting}
// size 10, initial value 0
vector<int> v(10);
\end{lstlisting}
\begin{lstlisting}
// size 10, initial value 5
vector<int> v(10, 5);
\end{lstlisting}

The internal implementation of a vector
uses an ordinary array.
If the size of the vector increases and
the array becomes too small,
a new array is allocated and all the
elements are moved to the new array.
However, this does not happen often and the
average time complexity of
\texttt{push\_back} is $O(1)$.
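
The reallocation can be observed through the member function
\texttt{capacity}, which returns the current size of the internal array;
the exact growth pattern is implementation-dependent, but it is typically
a doubling scheme or similar:
\begin{lstlisting}
vector<int> v;
for (int i = 0; i < 10; i++) {
    v.push_back(i);
    // the size grows by one on every step, the capacity only occasionally
    cout << v.size() << " " << v.capacity() << "\n";
}
\end{lstlisting}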

\index{string}

The \texttt{string} structure
is also a dynamic array that can be used almost like a vector.
In addition, there is special syntax for strings
that is not available in other data structures.
Strings can be combined using the \texttt{+} symbol.
The function $\texttt{substr}(k,x)$ returns the substring
that begins at position $k$ and has length $x$,
and the function $\texttt{find}(\texttt{t})$ finds the position
of the first occurrence of a substring \texttt{t}.

The following code presents some string operations:

\begin{lstlisting}
string a = "hatti";
string b = a+a;
cout << b << "\n"; // hattihatti
b[5] = 'v';
cout << b << "\n"; // hattivatti
string c = b.substr(3,4);
cout << c << "\n"; // tiva
\end{lstlisting}

\section{Set structures}

\index{set}

A \key{set} is a data structure that
maintains a collection of elements.
The basic operations of sets are element
insertion, search and removal.

The C++ standard library contains two set
implementations:
The structure \texttt{set} is based on a balanced
binary tree and its operations work in $O(\log n)$ time.
The structure \texttt{unordered\_set} uses hashing,
and its operations work in $O(1)$ time on average.

The choice of which set implementation to use
is often a matter of taste.
The benefit of the \texttt{set} structure
is that it maintains the order of the elements
and provides functions that are not available
in \texttt{unordered\_set}.
On the other hand, \texttt{unordered\_set}
can be more efficient.

The following code creates a set
that contains integers,
and shows some of the operations.
The function \texttt{insert} adds an element to the set,
the function \texttt{count} returns the number of occurrences
of an element in the set,
and the function \texttt{erase} removes an element from the set.

\begin{lstlisting}
set<int> s;
s.insert(3);
s.insert(2);
s.insert(5);
cout << s.count(3) << "\n"; // 1
cout << s.count(4) << "\n"; // 0
s.erase(3);
s.insert(4);
cout << s.count(3) << "\n"; // 0
cout << s.count(4) << "\n"; // 1
\end{lstlisting}

A set can be used mostly like a vector,
but it is not possible to access
the elements using the \texttt{[]} notation.
The following code creates a set,
prints the number of elements in it, and then
iterates through all the elements:
\begin{lstlisting}
set<int> s = {2,5,6,8};
cout << s.size() << "\n"; // 4
for (auto x : s) {
    cout << x << "\n";
}
\end{lstlisting}

An important property of sets is
that all their elements are \emph{distinct}.
Thus, the function \texttt{count} always returns
either 0 (the element is not in the set)
or 1 (the element is in the set),
and the function \texttt{insert} never adds
an element to the set if it is
already there.
The following code illustrates this:

\begin{lstlisting}
set<int> s;
s.insert(5);
s.insert(5);
s.insert(5);
cout << s.count(5) << "\n"; // 1
\end{lstlisting}

C++ also contains the structures
\texttt{multiset} and \texttt{unordered\_multiset}
that otherwise work like \texttt{set}
and \texttt{unordered\_set}
but they can contain multiple instances of an element.
For example, in the following code all three instances
of the number 5 are added to a multiset:

\begin{lstlisting}
multiset<int> s;
s.insert(5);
s.insert(5);
s.insert(5);
cout << s.count(5) << "\n"; // 3
\end{lstlisting}
The function \texttt{erase} removes
all instances of an element
from a multiset:
\begin{lstlisting}
s.erase(5);
cout << s.count(5) << "\n"; // 0
\end{lstlisting}
Often, only one instance should be removed,
which can be done as follows:
\begin{lstlisting}
s.erase(s.find(5));
cout << s.count(5) << "\n"; // 2
\end{lstlisting}

\section{Map structures}

\index{map}

A \key{map} is a generalized array
that consists of key-value pairs.
While the keys in an ordinary array are always
the consecutive integers $0,1,\ldots,n-1$,
where $n$ is the size of the array,
the keys in a map can be of any data type and
they do not have to be consecutive values.

The C++ standard library contains two map
implementations that correspond to the set
implementations: the structure
\texttt{map} is based on a balanced
binary tree and accessing elements
takes $O(\log n)$ time,
while the structure
\texttt{unordered\_map} uses hashing
and accessing elements takes $O(1)$ time on average.

The following code creates a map
where the keys are strings and the values are integers:

\begin{lstlisting}
map<string,int> m;
m["monkey"] = 4;
m["banana"] = 3;
m["harpsichord"] = 9;
cout << m["banana"] << "\n"; // 3
\end{lstlisting}

If the value of a key is requested
but the map does not contain it,
the key is automatically added to the map with
a default value.
For example, in the following code,
the key ``aybabtu'' with value 0
is added to the map.

\begin{lstlisting}
map<string,int> m;
cout << m["aybabtu"] << "\n"; // 0
\end{lstlisting}
The function \texttt{count} checks
if a key exists in a map:
\begin{lstlisting}
if (m.count("aybabtu")) {
    // key exists
}
\end{lstlisting}
The following code prints all the keys and values
in a map:
\begin{lstlisting}
for (auto x : m) {
    cout << x.first << " " << x.second << "\n";
}
\end{lstlisting}

\section{Iterators and ranges}

\index{iterator}

Many functions in the C++ standard library
operate with iterators.
An \key{iterator} is a variable that points
to an element in a data structure.

The often used iterators \texttt{begin}
and \texttt{end} define a range that contains
all elements in a data structure.
The iterator \texttt{begin} points to
the first element in the data structure,
and the iterator \texttt{end} points to
the position \emph{after} the last element.
The situation looks as follows:

\begin{center}
\begin{tabular}{llllllllll}
\{ & 3, & 4, & 6, & 8, & 12, & 13, & 14, & 17 & \} \\
& $\uparrow$ & & & & & & & & $\uparrow$ \\
& \multicolumn{3}{l}{\texttt{s.begin()}} & & & & & & \texttt{s.end()} \\
\end{tabular}
\end{center}

Note the asymmetry in the iterators:
\texttt{s.begin()} points to an element in the data structure,
while \texttt{s.end()} points outside the data structure.
Thus, the range defined by the iterators is \emph{half-open}.

\subsubsection{Working with ranges}

Iterators are used in C++ standard library functions
that are given a range of elements in a data structure.
Usually, we want to process all elements in a
data structure, so the iterators
\texttt{begin} and \texttt{end} are given to the function.

For example, the following code sorts a vector
using the function \texttt{sort},
then reverses the order of the elements using the function
\texttt{reverse}, and finally shuffles the order of
the elements using the function \texttt{random\_shuffle}.

\index{sort@\texttt{sort}}
\index{reverse@\texttt{reverse}}
\index{random\_shuffle@\texttt{random\_shuffle}}

\begin{lstlisting}
sort(v.begin(), v.end());
reverse(v.begin(), v.end());
random_shuffle(v.begin(), v.end());
\end{lstlisting}

These functions can also be used with an ordinary array.
In this case, the functions are given pointers to the array
instead of iterators:

\newpage
\begin{lstlisting}
sort(a, a+n);
reverse(a, a+n);
random_shuffle(a, a+n);
\end{lstlisting}

\subsubsection{Set iterators}
|
||||||
|
|
||||||
|
Iterators are often used to access
|
||||||
|
elements of a set.
|
||||||
|
The following code creates an iterator
|
||||||
|
\texttt{it} that points to the smallest element in a set:
|
||||||
|
\begin{lstlisting}
|
||||||
|
set<int>::iterator it = s.begin();
|
||||||
|
\end{lstlisting}
|
||||||
|
A shorter way to write the code is as follows:
|
||||||
|
\begin{lstlisting}
|
||||||
|
auto it = s.begin();
|
||||||
|
\end{lstlisting}
|
||||||
|
The element to which an iterator points
|
||||||
|
can be accessed using the \texttt{*} symbol.
|
||||||
|
For example, the following code prints
|
||||||
|
the first element in the set:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
auto it = s.begin();
|
||||||
|
cout << *it << "\n";
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
Iterators can be moved using the operators
|
||||||
|
\texttt{++} (forward) and \texttt{--} (backward),
|
||||||
|
meaning that the iterator moves to the next
|
||||||
|
or previous element in the set.
|
||||||
|
|
||||||
|
The following code prints all the elements
|
||||||
|
in increasing order:
|
||||||
|
\begin{lstlisting}
|
||||||
|
for (auto it = s.begin(); it != s.end(); it++) {
|
||||||
|
cout << *it << "\n";
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
The following code prints the largest element in the set:
|
||||||
|
\begin{lstlisting}
|
||||||
|
auto it = s.end(); it--;
|
||||||
|
cout << *it << "\n";
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
The function $\texttt{find}(x)$ returns an iterator
|
||||||
|
that points to an element whose value is $x$.
|
||||||
|
However, if the set does not contain $x$,
|
||||||
|
the iterator will be \texttt{end}.
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
auto it = s.find(x);
|
||||||
|
if (it == s.end()) {
|
||||||
|
// x is not found
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
The function $\texttt{lower\_bound}(x)$ returns
|
||||||
|
an iterator to the smallest element in the set
|
||||||
|
whose value is \emph{at least} $x$, and
|
||||||
|
the function $\texttt{upper\_bound}(x)$
|
||||||
|
returns an iterator to the smallest element in the set
|
||||||
|
whose value is \emph{larger than} $x$.
|
||||||
|
In both functions, if such an element does not exist,
|
||||||
|
the return value is \texttt{end}.
|
||||||
|
These functions are not supported by the
|
||||||
|
\texttt{unordered\_set} structure which
|
||||||
|
does not maintain the order of the elements.
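
For example, if the set is $\{3,5,8\}$, the functions
work as follows:
\begin{lstlisting}
set<int> s = {3,5,8};
auto it1 = s.lower_bound(5); // points to 5
auto it2 = s.upper_bound(5); // points to 8
auto it3 = s.lower_bound(9); // s.end(), no such element
\end{lstlisting}
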
|
||||||
|
|
||||||
|
\begin{samepage}
|
||||||
|
For example, the following code finds the element
|
||||||
|
nearest to $x$:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
auto it = s.lower_bound(x);
|
||||||
|
if (it == s.begin()) {
|
||||||
|
cout << *it << "\n";
|
||||||
|
} else if (it == s.end()) {
|
||||||
|
it--;
|
||||||
|
cout << *it << "\n";
|
||||||
|
} else {
|
||||||
|
int a = *it; it--;
|
||||||
|
int b = *it;
|
||||||
|
if (x-b < a-x) cout << b << "\n";
|
||||||
|
else cout << a << "\n";
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
The code assumes that the set is not empty,
|
||||||
|
and goes through all possible cases
|
||||||
|
using an iterator \texttt{it}.
|
||||||
|
First, the iterator points to the smallest
|
||||||
|
element whose value is at least $x$.
|
||||||
|
If \texttt{it} equals \texttt{begin},
|
||||||
|
the corresponding element is nearest to $x$.
|
||||||
|
If \texttt{it} equals \texttt{end},
|
||||||
|
the largest element in the set is nearest to $x$.
|
||||||
|
If none of the previous cases hold,
|
||||||
|
the element nearest to $x$ is either the
|
||||||
|
element that corresponds to \texttt{it} or the previous element.
|
||||||
|
\end{samepage}
|
||||||
|
|
||||||
|
\section{Other structures}
|
||||||
|
|
||||||
|
\subsubsection{Bitset}
|
||||||
|
|
||||||
|
\index{bitset}
|
||||||
|
|
||||||
|
A \key{bitset} is an array
|
||||||
|
in which each value is either 0 or 1.
|
||||||
|
For example, the following code creates
|
||||||
|
a bitset that contains 10 elements:
|
||||||
|
\begin{lstlisting}
|
||||||
|
bitset<10> s;
|
||||||
|
s[1] = 1;
|
||||||
|
s[3] = 1;
|
||||||
|
s[4] = 1;
|
||||||
|
s[7] = 1;
|
||||||
|
cout << s[4] << "\n"; // 1
|
||||||
|
cout << s[5] << "\n"; // 0
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
The benefit of using bitsets is that
|
||||||
|
they require less memory than ordinary arrays,
|
||||||
|
because each element in a bitset only
|
||||||
|
uses one bit of memory.
|
||||||
|
For example,
|
||||||
|
if $n$ bits are stored in an \texttt{int} array,
|
||||||
|
$32n$ bits of memory will be used,
|
||||||
|
but a corresponding bitset only requires $n$ bits of memory.
|
||||||
|
In addition, the values of a bitset
|
||||||
|
can be efficiently manipulated using
|
||||||
|
bit operators, which makes it possible to
|
||||||
|
optimize algorithms using bit sets.
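
As a sketch of one such optimization, the following code marks
every sum that can be formed from the values 3, 5 and 7;
a single shift and \texttt{or} operation handles all sums at
the same time. The bitset size 100 is just a safe upper bound
for this example.
\begin{lstlisting}
vector<int> values = {3,5,7};
bitset<100> reach; // reach[s] = 1 if sum s can be formed
reach[0] = 1;
for (int x : values) {
    reach |= reach << x;
}
cout << reach[15] << "\n"; // 1 (3+5+7=15)
cout << reach[4] << "\n"; // 0
\end{lstlisting}
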
|
||||||
|
|
||||||
|
The following code shows another way to create the above bitset:
|
||||||
|
\begin{lstlisting}
|
||||||
|
bitset<10> s(string("0010011010")); // from right to left
|
||||||
|
cout << s[4] << "\n"; // 1
|
||||||
|
cout << s[5] << "\n"; // 0
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
The function \texttt{count} returns the number
|
||||||
|
of ones in the bitset:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
bitset<10> s(string("0010011010"));
|
||||||
|
cout << s.count() << "\n"; // 4
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
The following code shows examples of using bit operations:
|
||||||
|
\begin{lstlisting}
|
||||||
|
bitset<10> a(string("0010110110"));
|
||||||
|
bitset<10> b(string("1011011000"));
|
||||||
|
cout << (a&b) << "\n"; // 0010010000
|
||||||
|
cout << (a|b) << "\n"; // 1011111110
|
||||||
|
cout << (a^b) << "\n"; // 1001101110
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
\subsubsection{Deque}
|
||||||
|
|
||||||
|
\index{deque}
|
||||||
|
|
||||||
|
A \key{deque} is a dynamic array
|
||||||
|
whose size can be efficiently
|
||||||
|
changed at both ends of the array.
|
||||||
|
Like a vector, a deque provides the functions
|
||||||
|
\texttt{push\_back} and \texttt{pop\_back}, but
|
||||||
|
it also includes the functions
|
||||||
|
\texttt{push\_front} and \texttt{pop\_front}
|
||||||
|
which are not available in a vector.
|
||||||
|
|
||||||
|
A deque can be used as follows:
|
||||||
|
\begin{lstlisting}
|
||||||
|
deque<int> d;
|
||||||
|
d.push_back(5); // [5]
|
||||||
|
d.push_back(2); // [5,2]
|
||||||
|
d.push_front(3); // [3,5,2]
|
||||||
|
d.pop_back(); // [3,5]
|
||||||
|
d.pop_front(); // [5]
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
The internal implementation of a deque
|
||||||
|
is more complex than that of a vector,
|
||||||
|
and for this reason, a deque is slower than a vector.
|
||||||
|
Still, both adding and removing
|
||||||
|
elements take $O(1)$ time on average at both ends.
|
||||||
|
|
||||||
|
\subsubsection{Stack}
|
||||||
|
|
||||||
|
\index{stack}
|
||||||
|
|
||||||
|
A \key{stack}
|
||||||
|
is a data structure that provides two
|
||||||
|
$O(1)$ time operations:
|
||||||
|
adding an element to the top,
|
||||||
|
and removing an element from the top.
|
||||||
|
It is only possible to access the top
|
||||||
|
element of a stack.
|
||||||
|
|
||||||
|
The following code shows how a stack can be used:
|
||||||
|
\begin{lstlisting}
|
||||||
|
stack<int> s;
|
||||||
|
s.push(3);
|
||||||
|
s.push(2);
|
||||||
|
s.push(5);
|
||||||
|
cout << s.top(); // 5
|
||||||
|
s.pop();
|
||||||
|
cout << s.top(); // 2
|
||||||
|
\end{lstlisting}
|
||||||
|
\subsubsection{Queue}
|
||||||
|
|
||||||
|
\index{queue}
|
||||||
|
|
||||||
|
A \key{queue} also
|
||||||
|
provides two $O(1)$ time operations:
|
||||||
|
adding an element to the end of the queue,
|
||||||
|
and removing the first element in the queue.
|
||||||
|
It is only possible to access the first
|
||||||
|
and last element of a queue.
|
||||||
|
|
||||||
|
The following code shows how a queue can be used:
|
||||||
|
\begin{lstlisting}
|
||||||
|
queue<int> q;
|
||||||
|
q.push(3);
|
||||||
|
q.push(2);
|
||||||
|
q.push(5);
|
||||||
|
cout << q.front(); // 3
|
||||||
|
q.pop();
|
||||||
|
cout << q.front(); // 2
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
\subsubsection{Priority queue}
|
||||||
|
|
||||||
|
\index{priority queue}
|
||||||
|
\index{heap}
|
||||||
|
|
||||||
|
A \key{priority queue}
|
||||||
|
maintains a set of elements.
|
||||||
|
The supported operations are insertion and,
|
||||||
|
depending on the type of the queue,
|
||||||
|
retrieval and removal of
|
||||||
|
either the minimum or maximum element.
|
||||||
|
Insertion and removal take $O(\log n)$ time,
|
||||||
|
and retrieval takes $O(1)$ time.
|
||||||
|
|
||||||
|
While an ordered set efficiently supports
|
||||||
|
all the operations of a priority queue,
|
||||||
|
the benefit of using a priority queue is
|
||||||
|
that it has smaller constant factors.
|
||||||
|
A priority queue is usually implemented using
|
||||||
|
a heap structure that is much simpler than a
|
||||||
|
balanced binary tree used in an ordered set.
|
||||||
|
|
||||||
|
\begin{samepage}
|
||||||
|
By default, the elements in a C++
|
||||||
|
priority queue are sorted in decreasing order,
|
||||||
|
and it is possible to find and remove the
|
||||||
|
largest element in the queue.
|
||||||
|
The following code illustrates this:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
priority_queue<int> q;
|
||||||
|
q.push(3);
|
||||||
|
q.push(5);
|
||||||
|
q.push(7);
|
||||||
|
q.push(2);
|
||||||
|
cout << q.top() << "\n"; // 7
|
||||||
|
q.pop();
|
||||||
|
cout << q.top() << "\n"; // 5
|
||||||
|
q.pop();
|
||||||
|
q.push(6);
|
||||||
|
cout << q.top() << "\n"; // 6
|
||||||
|
q.pop();
|
||||||
|
\end{lstlisting}
|
||||||
|
\end{samepage}
|
||||||
|
|
||||||
|
If we want to create a priority queue
|
||||||
|
that supports finding and removing
|
||||||
|
the smallest element,
|
||||||
|
we can do it as follows:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
priority_queue<int,vector<int>,greater<int>> q;
|
||||||
|
\end{lstlisting}
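
For example, with this definition the smallest element is
on top of the queue:
\begin{lstlisting}
priority_queue<int,vector<int>,greater<int>> q;
q.push(3);
q.push(5);
q.push(2);
cout << q.top() << "\n"; // 2
q.pop();
cout << q.top() << "\n"; // 3
\end{lstlisting}
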
|
||||||
|
|
||||||
|
\subsubsection{Policy-based data structures}
|
||||||
|
|
||||||
|
The \texttt{g++} compiler also supports
|
||||||
|
some data structures that are not part
|
||||||
|
of the C++ standard library.
|
||||||
|
Such structures are called \emph{policy-based}
|
||||||
|
data structures.
|
||||||
|
To use these structures, the following lines
|
||||||
|
must be added to the code:
|
||||||
|
\begin{lstlisting}
|
||||||
|
#include <ext/pb_ds/assoc_container.hpp>
|
||||||
|
using namespace __gnu_pbds;
|
||||||
|
\end{lstlisting}
|
||||||
|
After this, we can define a data structure \texttt{indexed\_set} that
|
||||||
|
is like \texttt{set} but can be indexed like an array.
|
||||||
|
The definition for \texttt{int} values is as follows:
|
||||||
|
\begin{lstlisting}
|
||||||
|
typedef tree<int,null_type,less<int>,rb_tree_tag,
|
||||||
|
tree_order_statistics_node_update> indexed_set;
|
||||||
|
\end{lstlisting}
|
||||||
|
Now we can create a set as follows:
|
||||||
|
\begin{lstlisting}
|
||||||
|
indexed_set s;
|
||||||
|
s.insert(2);
|
||||||
|
s.insert(3);
|
||||||
|
s.insert(7);
|
||||||
|
s.insert(9);
|
||||||
|
\end{lstlisting}
|
||||||
|
The speciality of this set is that we have access to
|
||||||
|
the indices that the elements would have in a sorted array.
|
||||||
|
The function $\texttt{find\_by\_order}$ returns
|
||||||
|
an iterator to the element at a given position:
|
||||||
|
\begin{lstlisting}
|
||||||
|
auto x = s.find_by_order(2);
|
||||||
|
cout << *x << "\n"; // 7
|
||||||
|
\end{lstlisting}
|
||||||
|
And the function $\texttt{order\_of\_key}$
|
||||||
|
returns the position of a given element:
|
||||||
|
\begin{lstlisting}
|
||||||
|
cout << s.order_of_key(7) << "\n"; // 2
|
||||||
|
\end{lstlisting}
|
||||||
|
If the element does not appear in the set,
|
||||||
|
we get the position that the element would have
|
||||||
|
in the set:
|
||||||
|
\begin{lstlisting}
|
||||||
|
cout << s.order_of_key(6) << "\n"; // 2
|
||||||
|
cout << s.order_of_key(8) << "\n"; // 3
|
||||||
|
\end{lstlisting}
|
||||||
|
Both functions work in logarithmic time.
|
||||||
|
|
||||||
|
\section{Comparison to sorting}
|
||||||
|
|
||||||
|
It is often possible to solve a problem
|
||||||
|
using either data structures or sorting.
|
||||||
|
Sometimes there are remarkable differences
|
||||||
|
in the actual efficiency of these approaches,
|
||||||
|
which may be hidden in their time complexities.
|
||||||
|
|
||||||
|
Let us consider a problem where
|
||||||
|
we are given two lists $A$ and $B$
|
||||||
|
that both contain $n$ elements.
|
||||||
|
Our task is to calculate the number of elements
|
||||||
|
that belong to both of the lists.
|
||||||
|
For example, for the lists
|
||||||
|
\[A = [5,2,8,9] \hspace{10px} \textrm{and} \hspace{10px} B = [3,2,9,5],\]
|
||||||
|
the answer is 3 because the numbers 2, 5
|
||||||
|
and 9 belong to both of the lists.
|
||||||
|
|
||||||
|
A straightforward solution to the problem is
|
||||||
|
to go through all pairs of elements in $O(n^2)$ time,
|
||||||
|
but next we will focus on
|
||||||
|
more efficient algorithms.
|
||||||
|
|
||||||
|
\subsubsection{Algorithm 1}
|
||||||
|
|
||||||
|
We construct a set of the elements that appear in $A$,
|
||||||
|
and after this, we iterate through the elements
|
||||||
|
of $B$ and check for each element whether it
|
||||||
|
also belongs to $A$.
|
||||||
|
This is efficient because the elements of $A$
|
||||||
|
are in a set.
|
||||||
|
Using the \texttt{set} structure,
|
||||||
|
the time complexity of the algorithm is $O(n \log n)$.
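
A minimal sketch of the algorithm, assuming that the lists are
stored in vectors \texttt{A} and \texttt{B} and that the
elements within each list are distinct, is as follows:
\begin{lstlisting}
set<int> s(A.begin(), A.end());
int common = 0;
for (int x : B) {
    if (s.count(x)) common++;
}
cout << common << "\n";
\end{lstlisting}
The same sketch works for Algorithm 2 below with \texttt{set}
replaced by \texttt{unordered\_set}.
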
|
||||||
|
|
||||||
|
\subsubsection{Algorithm 2}
|
||||||
|
|
||||||
|
It is not necessary to maintain an ordered set,
|
||||||
|
so instead of the \texttt{set} structure
|
||||||
|
we can also use the \texttt{unordered\_set} structure.
|
||||||
|
This is an easy way to make the algorithm
|
||||||
|
more efficient, because we only have to change
|
||||||
|
the underlying data structure.
|
||||||
|
The time complexity of the new algorithm is $O(n)$.
|
||||||
|
|
||||||
|
\subsubsection{Algorithm 3}
|
||||||
|
|
||||||
|
Instead of data structures, we can use sorting.
|
||||||
|
First, we sort both lists $A$ and $B$.
|
||||||
|
After this, we iterate through both the lists
|
||||||
|
at the same time and find the common elements.
|
||||||
|
The time complexity of sorting is $O(n \log n)$,
|
||||||
|
and the rest of the algorithm works in $O(n)$ time,
|
||||||
|
so the total time complexity is $O(n \log n)$.
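
A minimal sketch, under the same assumptions as above
(both lists contain $n$ elements), is as follows:
\begin{lstlisting}
sort(A.begin(), A.end());
sort(B.begin(), B.end());
int common = 0, i = 0, j = 0;
while (i < n && j < n) {
    if (A[i] < B[j]) i++;
    else if (A[i] > B[j]) j++;
    else {common++; i++; j++;}
}
cout << common << "\n";
\end{lstlisting}
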
|
||||||
|
|
||||||
|
\subsubsection{Efficiency comparison}
|
||||||
|
|
||||||
|
The following table shows how efficient
|
||||||
|
the above algorithms are when $n$ varies and
|
||||||
|
the elements of the lists are random
|
||||||
|
integers between $1 \ldots 10^9$:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tabular}{rrrr}
|
||||||
|
$n$ & Algorithm 1 & Algorithm 2 & Algorithm 3 \\
|
||||||
|
\hline
|
||||||
|
$10^6$ & $1.5$ s & $0.3$ s & $0.2$ s \\
|
||||||
|
$2 \cdot 10^6$ & $3.7$ s & $0.8$ s & $0.3$ s \\
|
||||||
|
$3 \cdot 10^6$ & $5.7$ s & $1.3$ s & $0.5$ s \\
|
||||||
|
$4 \cdot 10^6$ & $7.7$ s & $1.7$ s & $0.7$ s \\
|
||||||
|
$5 \cdot 10^6$ & $10.0$ s & $2.3$ s & $0.9$ s \\
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Algorithms 1 and 2 are equal except that
|
||||||
|
they use different set structures.
|
||||||
|
In this problem, this choice has an important effect on
|
||||||
|
the running time, because Algorithm 2
|
||||||
|
is 4–5 times faster than Algorithm 1.
|
||||||
|
|
||||||
|
However, the most efficient algorithm is Algorithm 3
|
||||||
|
which uses sorting.
|
||||||
|
It only uses half the time compared to Algorithm 2.
|
||||||
|
Interestingly, the time complexity of both
|
||||||
|
Algorithm 1 and Algorithm 3 is $O(n \log n)$,
|
||||||
|
but despite this, Algorithm 3 is about ten times faster.
|
||||||
|
This can be explained by the fact that
|
||||||
|
sorting is a simple procedure and it is done
|
||||||
|
only once at the beginning of Algorithm 3,
|
||||||
|
and the rest of the algorithm works in linear time.
|
||||||
|
On the other hand,
|
||||||
|
Algorithm 1 maintains a complex balanced binary tree
|
||||||
|
during the whole algorithm.
\chapter{Complete search}
|
||||||
|
|
||||||
|
\key{Complete search}
|
||||||
|
is a general method that can be used
|
||||||
|
to solve almost any algorithm problem.
|
||||||
|
The idea is to generate all possible
|
||||||
|
solutions to the problem using brute force,
|
||||||
|
and then select the best solution or count the
|
||||||
|
number of solutions, depending on the problem.
|
||||||
|
|
||||||
|
Complete search is a good technique
|
||||||
|
if there is enough time to go through all the solutions,
|
||||||
|
because the search is usually easy to implement
|
||||||
|
and it always gives the correct answer.
|
||||||
|
If complete search is too slow,
|
||||||
|
other techniques, such as greedy algorithms or
|
||||||
|
dynamic programming, may be needed.
|
||||||
|
|
||||||
|
\section{Generating subsets}
|
||||||
|
|
||||||
|
\index{subset}
|
||||||
|
|
||||||
|
We first consider the problem of generating
|
||||||
|
all subsets of a set of $n$ elements.
|
||||||
|
For example, the subsets of $\{0,1,2\}$ are
|
||||||
|
$\emptyset$, $\{0\}$, $\{1\}$, $\{2\}$, $\{0,1\}$,
|
||||||
|
$\{0,2\}$, $\{1,2\}$ and $\{0,1,2\}$.
|
||||||
|
There are two common methods to generate subsets:
|
||||||
|
we can either perform a recursive search
|
||||||
|
or exploit the bit representation of integers.
|
||||||
|
|
||||||
|
\subsubsection{Method 1}
|
||||||
|
|
||||||
|
An elegant way to go through all subsets
|
||||||
|
of a set is to use recursion.
|
||||||
|
The following function \texttt{search}
|
||||||
|
generates the subsets of the set
|
||||||
|
$\{0,1,\ldots,n-1\}$.
|
||||||
|
The function maintains a vector \texttt{subset}
|
||||||
|
that will contain the elements of each subset.
|
||||||
|
The search begins when the function is called
|
||||||
|
with parameter 0.
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
void search(int k) {
|
||||||
|
if (k == n) {
|
||||||
|
// process subset
|
||||||
|
} else {
|
||||||
|
search(k+1);
|
||||||
|
subset.push_back(k);
|
||||||
|
search(k+1);
|
||||||
|
subset.pop_back();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
When the function \texttt{search}
|
||||||
|
is called with parameter $k$,
|
||||||
|
it decides whether to include the
|
||||||
|
element $k$ in the subset or not,
|
||||||
|
and in both cases,
|
||||||
|
then calls itself with parameter $k+1$.
|
||||||
|
However, if $k=n$, the function notices that
|
||||||
|
all elements have been processed
|
||||||
|
and a subset has been generated.
|
||||||
|
|
||||||
|
The following tree illustrates the function calls when $n=3$.
|
||||||
|
We can always choose either the left branch
|
||||||
|
($k$ is not included in the subset) or the right branch
|
||||||
|
($k$ is included in the subset).
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=.45]
|
||||||
|
\begin{scope}
|
||||||
|
\small
|
||||||
|
\node at (0,0) {$\texttt{search}(0)$};
|
||||||
|
|
||||||
|
\node at (-8,-4) {$\texttt{search}(1)$};
|
||||||
|
\node at (8,-4) {$\texttt{search}(1)$};
|
||||||
|
|
||||||
|
\path[draw,thick,->] (0,0-0.5) -- (-8,-4+0.5);
|
||||||
|
\path[draw,thick,->] (0,0-0.5) -- (8,-4+0.5);
|
||||||
|
|
||||||
|
\node at (-12,-8) {$\texttt{search}(2)$};
|
||||||
|
\node at (-4,-8) {$\texttt{search}(2)$};
|
||||||
|
\node at (4,-8) {$\texttt{search}(2)$};
|
||||||
|
\node at (12,-8) {$\texttt{search}(2)$};
|
||||||
|
|
||||||
|
\path[draw,thick,->] (-8,-4-0.5) -- (-12,-8+0.5);
|
||||||
|
\path[draw,thick,->] (-8,-4-0.5) -- (-4,-8+0.5);
|
||||||
|
\path[draw,thick,->] (8,-4-0.5) -- (4,-8+0.5);
|
||||||
|
\path[draw,thick,->] (8,-4-0.5) -- (12,-8+0.5);
|
||||||
|
|
||||||
|
\node at (-14,-12) {$\texttt{search}(3)$};
|
||||||
|
\node at (-10,-12) {$\texttt{search}(3)$};
|
||||||
|
\node at (-6,-12) {$\texttt{search}(3)$};
|
||||||
|
\node at (-2,-12) {$\texttt{search}(3)$};
|
||||||
|
\node at (2,-12) {$\texttt{search}(3)$};
|
||||||
|
\node at (6,-12) {$\texttt{search}(3)$};
|
||||||
|
\node at (10,-12) {$\texttt{search}(3)$};
|
||||||
|
\node at (14,-12) {$\texttt{search}(3)$};
|
||||||
|
|
||||||
|
\node at (-14,-13.5) {$\emptyset$};
|
||||||
|
\node at (-10,-13.5) {$\{2\}$};
|
||||||
|
\node at (-6,-13.5) {$\{1\}$};
|
||||||
|
\node at (-2,-13.5) {$\{1,2\}$};
|
||||||
|
\node at (2,-13.5) {$\{0\}$};
|
||||||
|
\node at (6,-13.5) {$\{0,2\}$};
|
||||||
|
\node at (10,-13.5) {$\{0,1\}$};
|
||||||
|
\node at (14,-13.5) {$\{0,1,2\}$};
|
||||||
|
|
||||||
|
|
||||||
|
\path[draw,thick,->] (-12,-8-0.5) -- (-14,-12+0.5);
|
||||||
|
\path[draw,thick,->] (-12,-8-0.5) -- (-10,-12+0.5);
|
||||||
|
\path[draw,thick,->] (-4,-8-0.5) -- (-6,-12+0.5);
|
||||||
|
\path[draw,thick,->] (-4,-8-0.5) -- (-2,-12+0.5);
|
||||||
|
\path[draw,thick,->] (4,-8-0.5) -- (2,-12+0.5);
|
||||||
|
\path[draw,thick,->] (4,-8-0.5) -- (6,-12+0.5);
|
||||||
|
\path[draw,thick,->] (12,-8-0.5) -- (10,-12+0.5);
|
||||||
|
\path[draw,thick,->] (12,-8-0.5) -- (14,-12+0.5);
|
||||||
|
\end{scope}
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
\subsubsection{Method 2}
|
||||||
|
|
||||||
|
Another way to generate subsets is based on
|
||||||
|
the bit representation of integers.
|
||||||
|
Each subset of a set of $n$ elements
|
||||||
|
can be represented as a sequence of $n$ bits,
|
||||||
|
which corresponds to an integer between $0 \ldots 2^n-1$.
|
||||||
|
The ones in the bit sequence indicate
|
||||||
|
which elements are included in the subset.
|
||||||
|
|
||||||
|
The usual convention is that
|
||||||
|
the last bit corresponds to element 0,
|
||||||
|
the second last bit corresponds to element 1,
|
||||||
|
and so on.
|
||||||
|
For example, the bit representation of 25
|
||||||
|
is 11001, which corresponds to the subset $\{0,3,4\}$.
|
||||||
|
|
||||||
|
The following code goes through the subsets
|
||||||
|
of a set of $n$ elements:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
for (int b = 0; b < (1<<n); b++) {
|
||||||
|
// process subset
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
The following code shows how we can find
|
||||||
|
the elements of a subset that corresponds to a bit sequence.
|
||||||
|
When processing each subset,
|
||||||
|
the code builds a vector that contains the
|
||||||
|
elements in the subset.
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
for (int b = 0; b < (1<<n); b++) {
|
||||||
|
vector<int> subset;
|
||||||
|
for (int i = 0; i < n; i++) {
|
||||||
|
if (b&(1<<i)) subset.push_back(i);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
\section{Generating permutations}
|
||||||
|
|
||||||
|
\index{permutation}
|
||||||
|
|
||||||
|
Next we consider the problem of generating
|
||||||
|
all permutations of a set of $n$ elements.
|
||||||
|
For example, the permutations of $\{0,1,2\}$ are
|
||||||
|
$(0,1,2)$, $(0,2,1)$, $(1,0,2)$, $(1,2,0)$,
|
||||||
|
$(2,0,1)$ and $(2,1,0)$.
|
||||||
|
Again, there are two approaches:
|
||||||
|
we can either use recursion or go through the
|
||||||
|
permutations iteratively.
|
||||||
|
|
||||||
|
\subsubsection{Method 1}
|
||||||
|
|
||||||
|
Like subsets, permutations can be generated
|
||||||
|
using recursion.
|
||||||
|
The following function \texttt{search} goes
|
||||||
|
through the permutations of the set $\{0,1,\ldots,n-1\}$.
|
||||||
|
The function builds a vector \texttt{permutation}
|
||||||
|
that contains the permutation,
|
||||||
|
and the search begins when the function is
|
||||||
|
called without parameters.
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
void search() {
|
||||||
|
if (permutation.size() == n) {
|
||||||
|
// process permutation
|
||||||
|
} else {
|
||||||
|
for (int i = 0; i < n; i++) {
|
||||||
|
if (chosen[i]) continue;
|
||||||
|
chosen[i] = true;
|
||||||
|
permutation.push_back(i);
|
||||||
|
search();
|
||||||
|
chosen[i] = false;
|
||||||
|
permutation.pop_back();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
Each function call adds a new element to
|
||||||
|
\texttt{permutation}.
|
||||||
|
The array \texttt{chosen} indicates which
|
||||||
|
elements are already included in the permutation.
|
||||||
|
If the size of \texttt{permutation} equals the size of the set,
|
||||||
|
a permutation has been generated.
|
||||||
|
|
||||||
|
\subsubsection{Method 2}
|
||||||
|
|
||||||
|
\index{next\_permutation@\texttt{next\_permutation}}
|
||||||
|
|
||||||
|
Another method for generating permutations
|
||||||
|
is to begin with the permutation
|
||||||
|
$\{0,1,\ldots,n-1\}$ and repeatedly
|
||||||
|
use a function that constructs the next permutation
|
||||||
|
in increasing order.
|
||||||
|
The C++ standard library contains the function
|
||||||
|
\texttt{next\_permutation} that can be used for this:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
vector<int> permutation;
|
||||||
|
for (int i = 0; i < n; i++) {
|
||||||
|
permutation.push_back(i);
|
||||||
|
}
|
||||||
|
do {
|
||||||
|
// process permutation
|
||||||
|
} while (next_permutation(permutation.begin(),permutation.end()));
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
\section{Backtracking}
|
||||||
|
|
||||||
|
\index{backtracking}
|
||||||
|
|
||||||
|
A \key{backtracking} algorithm
|
||||||
|
begins with an empty solution
|
||||||
|
and extends the solution step by step.
|
||||||
|
The search recursively
|
||||||
|
goes through all different ways how
|
||||||
|
a solution can be constructed.
|
||||||
|
|
||||||
|
\index{queen problem}
|
||||||
|
|
||||||
|
As an example, consider the problem of
|
||||||
|
calculating the number
|
||||||
|
of ways $n$ queens can be placed on
|
||||||
|
an $n \times n$ chessboard so that
|
||||||
|
no two queens attack each other.
|
||||||
|
For example, when $n=4$,
|
||||||
|
there are two possible solutions:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=.65]
|
||||||
|
\begin{scope}
|
||||||
|
\draw (0, 0) grid (4, 4);
|
||||||
|
\node at (1.5,3.5) {\symqueen};
|
||||||
|
\node at (3.5,2.5) {\symqueen};
|
||||||
|
\node at (0.5,1.5) {\symqueen};
|
||||||
|
\node at (2.5,0.5) {\symqueen};
|
||||||
|
|
||||||
|
\draw (6, 0) grid (10, 4);
|
||||||
|
\node at (6+2.5,3.5) {\symqueen};
|
||||||
|
\node at (6+0.5,2.5) {\symqueen};
|
||||||
|
\node at (6+3.5,1.5) {\symqueen};
|
||||||
|
\node at (6+1.5,0.5) {\symqueen};
|
||||||
|
|
||||||
|
\end{scope}
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
The problem can be solved using backtracking
|
||||||
|
by placing queens on the board row by row.
|
||||||
|
More precisely, exactly one queen will
|
||||||
|
be placed on each row so that no queen attacks
|
||||||
|
any of the queens placed before.
|
||||||
|
A solution has been found when all
|
||||||
|
$n$ queens have been placed on the board.
|
||||||
|
|
||||||
|
For example, when $n=4$,
|
||||||
|
some partial solutions generated by
|
||||||
|
the backtracking algorithm are as follows:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=.55]
|
||||||
|
\begin{scope}
|
||||||
|
\draw (0, 0) grid (4, 4);
|
||||||
|
|
||||||
|
\draw (-9, -6) grid (-5, -2);
|
||||||
|
\draw (-3, -6) grid (1, -2);
|
||||||
|
\draw (3, -6) grid (7, -2);
|
||||||
|
\draw (9, -6) grid (13, -2);
|
||||||
|
|
||||||
|
\node at (-9+0.5,-3+0.5) {\symqueen};
|
||||||
|
\node at (-3+1+0.5,-3+0.5) {\symqueen};
|
||||||
|
\node at (3+2+0.5,-3+0.5) {\symqueen};
|
||||||
|
\node at (9+3+0.5,-3+0.5) {\symqueen};
|
||||||
|
|
||||||
|
\draw (2,0) -- (-7,-2);
|
||||||
|
\draw (2,0) -- (-1,-2);
|
||||||
|
\draw (2,0) -- (5,-2);
|
||||||
|
\draw (2,0) -- (11,-2);
|
||||||
|
|
||||||
|
\draw (-11, -12) grid (-7, -8);
|
||||||
|
\draw (-6, -12) grid (-2, -8);
|
||||||
|
\draw (-1, -12) grid (3, -8);
|
||||||
|
\draw (4, -12) grid (8, -8);
|
||||||
|
\draw[white] (11, -12) grid (15, -8);
|
||||||
|
\node at (-11+1+0.5,-9+0.5) {\symqueen};
|
||||||
|
\node at (-6+1+0.5,-9+0.5) {\symqueen};
|
||||||
|
\node at (-1+1+0.5,-9+0.5) {\symqueen};
|
||||||
|
\node at (4+1+0.5,-9+0.5) {\symqueen};
|
||||||
|
\node at (-11+0+0.5,-10+0.5) {\symqueen};
|
||||||
|
\node at (-6+1+0.5,-10+0.5) {\symqueen};
|
||||||
|
\node at (-1+2+0.5,-10+0.5) {\symqueen};
|
||||||
|
\node at (4+3+0.5,-10+0.5) {\symqueen};
|
||||||
|
|
||||||
|
\draw (-1,-6) -- (-9,-8);
|
||||||
|
\draw (-1,-6) -- (-4,-8);
|
||||||
|
\draw (-1,-6) -- (1,-8);
|
||||||
|
\draw (-1,-6) -- (6,-8);
|
||||||
|
|
||||||
|
\node at (-9,-13) {illegal};
|
||||||
|
\node at (-4,-13) {illegal};
|
||||||
|
\node at (1,-13) {illegal};
|
||||||
|
\node at (6,-13) {valid};
|
||||||
|
|
||||||
|
\end{scope}
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
At the bottom level, the three first configurations
|
||||||
|
are illegal, because the queens attack each other.
|
||||||
|
However, the fourth configuration is valid
|
||||||
|
and it can be extended to a complete solution by
|
||||||
|
placing two more queens on the board.
|
||||||
|
There is only one way to place the two remaining queens.
|
||||||
|
|
||||||
|
\begin{samepage}
|
||||||
|
The algorithm can be implemented as follows:
|
||||||
|
\begin{lstlisting}
|
||||||
|
void search(int y) {
|
||||||
|
if (y == n) {
|
||||||
|
count++;
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
for (int x = 0; x < n; x++) {
|
||||||
|
if (column[x] || diag1[x+y] || diag2[x-y+n-1]) continue;
|
||||||
|
column[x] = diag1[x+y] = diag2[x-y+n-1] = 1;
|
||||||
|
search(y+1);
|
||||||
|
column[x] = diag1[x+y] = diag2[x-y+n-1] = 0;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
\end{samepage}
|
||||||
|
The search begins by calling \texttt{search(0)}.
|
||||||
|
The size of the board is $n \times n$,
|
||||||
|
and the code calculates the number of solutions
in the variable \texttt{count}.
|
||||||
|
|
||||||
|
The code assumes that the rows and columns
|
||||||
|
of the board are numbered from 0 to $n-1$.
|
||||||
|
When the function \texttt{search} is
|
||||||
|
called with parameter $y$,
|
||||||
|
it places a queen on row $y$
|
||||||
|
and then calls itself with parameter $y+1$.
|
||||||
|
Then, if $y=n$, a solution has been found
|
||||||
|
and the variable \texttt{count} is increased by one.
|
||||||
|
|
||||||
|
The array \texttt{column} keeps track of columns
|
||||||
|
that contain a queen,
|
||||||
|
and the arrays \texttt{diag1} and \texttt{diag2}
|
||||||
|
keep track of diagonals.
|
||||||
|
It is not allowed to add another queen to a
|
||||||
|
column or diagonal that already contains a queen.
|
||||||
|
For example, the columns and diagonals of
|
||||||
|
the $4 \times 4$ board are numbered as follows:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=.65]
|
||||||
|
\begin{scope}
|
||||||
|
\draw (0-6, 0) grid (4-6, 4);
|
||||||
|
\node at (-6+0.5,3.5) {$0$};
|
||||||
|
\node at (-6+1.5,3.5) {$1$};
|
||||||
|
\node at (-6+2.5,3.5) {$2$};
|
||||||
|
\node at (-6+3.5,3.5) {$3$};
|
||||||
|
\node at (-6+0.5,2.5) {$0$};
|
||||||
|
\node at (-6+1.5,2.5) {$1$};
|
||||||
|
\node at (-6+2.5,2.5) {$2$};
|
||||||
|
\node at (-6+3.5,2.5) {$3$};
|
||||||
|
\node at (-6+0.5,1.5) {$0$};
|
||||||
|
\node at (-6+1.5,1.5) {$1$};
|
||||||
|
\node at (-6+2.5,1.5) {$2$};
|
||||||
|
\node at (-6+3.5,1.5) {$3$};
|
||||||
|
\node at (-6+0.5,0.5) {$0$};
|
||||||
|
\node at (-6+1.5,0.5) {$1$};
|
||||||
|
\node at (-6+2.5,0.5) {$2$};
|
||||||
|
\node at (-6+3.5,0.5) {$3$};
|
||||||
|
|
||||||
|
\draw (0, 0) grid (4, 4);
|
||||||
|
\node at (0.5,3.5) {$0$};
|
||||||
|
\node at (1.5,3.5) {$1$};
|
||||||
|
\node at (2.5,3.5) {$2$};
|
||||||
|
\node at (3.5,3.5) {$3$};
|
||||||
|
\node at (0.5,2.5) {$1$};
|
||||||
|
\node at (1.5,2.5) {$2$};
|
||||||
|
\node at (2.5,2.5) {$3$};
|
||||||
|
\node at (3.5,2.5) {$4$};
|
||||||
|
\node at (0.5,1.5) {$2$};
|
||||||
|
\node at (1.5,1.5) {$3$};
|
||||||
|
\node at (2.5,1.5) {$4$};
|
||||||
|
\node at (3.5,1.5) {$5$};
|
||||||
|
\node at (0.5,0.5) {$3$};
|
||||||
|
\node at (1.5,0.5) {$4$};
|
||||||
|
\node at (2.5,0.5) {$5$};
|
||||||
|
\node at (3.5,0.5) {$6$};
|
||||||
|
|
||||||
|
\draw (6, 0) grid (10, 4);
|
||||||
|
\node at (6.5,3.5) {$3$};
|
||||||
|
\node at (7.5,3.5) {$4$};
|
||||||
|
\node at (8.5,3.5) {$5$};
|
||||||
|
\node at (9.5,3.5) {$6$};
|
||||||
|
\node at (6.5,2.5) {$2$};
|
||||||
|
\node at (7.5,2.5) {$3$};
|
||||||
|
\node at (8.5,2.5) {$4$};
|
||||||
|
\node at (9.5,2.5) {$5$};
|
||||||
|
\node at (6.5,1.5) {$1$};
|
||||||
|
\node at (7.5,1.5) {$2$};
|
||||||
|
\node at (8.5,1.5) {$3$};
|
||||||
|
\node at (9.5,1.5) {$4$};
|
||||||
|
\node at (6.5,0.5) {$0$};
|
||||||
|
\node at (7.5,0.5) {$1$};
|
||||||
|
\node at (8.5,0.5) {$2$};
|
||||||
|
\node at (9.5,0.5) {$3$};
|
||||||
|
|
||||||
|
\node at (-4,-1) {\texttt{column}};
|
||||||
|
\node at (2,-1) {\texttt{diag1}};
|
||||||
|
\node at (8,-1) {\texttt{diag2}};
|
||||||
|
|
||||||
|
\end{scope}
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Let $q(n)$ denote the number of ways
|
||||||
|
to place $n$ queens on an $n \times n$ chessboard.
|
||||||
|
The above backtracking
|
||||||
|
algorithm tells us that, for example, $q(8)=92$.
|
||||||
|
When $n$ increases, the search quickly becomes slow,
|
||||||
|
because the number of solutions increases
|
||||||
|
exponentially.
|
||||||
|
For example, calculating $q(16)=14772512$
|
||||||
|
using the above algorithm already takes about a minute
|
||||||
|
on a modern computer\footnote{There is no known way to efficiently
|
||||||
|
calculate larger values of $q(n)$. The current record is
|
||||||
|
$q(27)=234907967154122528$, calculated in 2016 \cite{q27}.}.
|
||||||
|
|
||||||
|
\section{Pruning the search}
|
||||||
|
|
||||||
|
We can often optimize backtracking
|
||||||
|
by pruning the search tree.
|
||||||
|
The idea is to add ``intelligence'' to the algorithm
|
||||||
|
so that it will notice as soon as possible
|
||||||
|
if a partial solution cannot be extended
|
||||||
|
to a complete solution.
|
||||||
|
Such optimizations can have a tremendous
|
||||||
|
effect on the efficiency of the search.
|
||||||
|
|
||||||
|
Let us consider the problem
|
||||||
|
of calculating the number of paths
|
||||||
|
in an $n \times n$ grid from the upper-left corner
|
||||||
|
to the lower-right corner such that the
|
||||||
|
path visits each square exactly once.
|
||||||
|
For example, in a $7 \times 7$ grid,
|
||||||
|
there are 111712 such paths.
|
||||||
|
One of the paths is as follows:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=.55]
|
||||||
|
\begin{scope}
|
||||||
|
\draw (0, 0) grid (7, 7);
|
||||||
|
\draw[thick,->] (0.5,6.5) -- (0.5,4.5) -- (2.5,4.5) --
|
||||||
|
(2.5,3.5) -- (0.5,3.5) -- (0.5,0.5) --
|
||||||
|
(3.5,0.5) -- (3.5,1.5) -- (1.5,1.5) --
|
||||||
|
(1.5,2.5) -- (4.5,2.5) -- (4.5,0.5) --
|
||||||
|
(5.5,0.5) -- (5.5,3.5) -- (3.5,3.5) --
|
||||||
|
(3.5,5.5) -- (1.5,5.5) -- (1.5,6.5) --
|
||||||
|
(4.5,6.5) -- (4.5,4.5) -- (5.5,4.5) --
|
||||||
|
(5.5,6.5) -- (6.5,6.5) -- (6.5,0.5);
|
||||||
|
\end{scope}
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
We focus on the $7 \times 7$ case,
|
||||||
|
because its level of difficulty is appropriate to our needs.
|
||||||
|
We begin with a straightforward backtracking algorithm,
|
||||||
|
and then optimize it step by step using observations
|
||||||
|
of how the search can be pruned.
|
||||||
|
After each optimization, we measure the running time
|
||||||
|
of the algorithm and the number of recursive calls,
|
||||||
|
so that we clearly see the effect of each
|
||||||
|
optimization on the efficiency of the search.
|
||||||
|
|
||||||
|
\subsubsection{Basic algorithm}
|
||||||
|
|
||||||
|
The first version of the algorithm does not contain
|
||||||
|
any optimizations. We simply use backtracking to generate
|
||||||
|
all possible paths from the upper-left corner to
|
||||||
|
the lower-right corner and count the number of such paths.
|
||||||
|
|
||||||
|
\begin{itemize}
|
||||||
|
\item
|
||||||
|
running time: 483 seconds
|
||||||
|
\item
|
||||||
|
number of recursive calls: 76 billion
|
||||||
|
\end{itemize}
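
For reference, a minimal sketch of such an unoptimized search
(not necessarily the exact code that was measured above)
could look as follows:
\begin{lstlisting}
const int n = 7;
bool visited[n][n];
long long paths = 0;

void search(int y, int x, int left) {
    if (y < 0 || y >= n || x < 0 || x >= n || visited[y][x]) return;
    if (y == n-1 && x == n-1) {
        if (left == 1) paths++; // every square has been visited
        return;
    }
    visited[y][x] = true;
    search(y-1,x,left-1); search(y+1,x,left-1);
    search(y,x-1,left-1); search(y,x+1,left-1);
    visited[y][x] = false;
}
\end{lstlisting}
The search is started with \texttt{search(0,0,n*n)}, where the
parameter \texttt{left} counts the squares that are still
unvisited, and the answer appears in \texttt{paths}.
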
|
||||||
|
|
||||||
|
\subsubsection{Optimization 1}
|
||||||
|
|
||||||
|
In any solution, we first move one step
|
||||||
|
down or right.
|
||||||
|
There are always two paths that
|
||||||
|
are symmetric
|
||||||
|
about the diagonal of the grid
|
||||||
|
after the first step.
|
||||||
|
For example, the following paths are symmetric:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tabular}{ccc}
|
||||||
|
\begin{tikzpicture}[scale=.55]
|
||||||
|
\begin{scope}
|
||||||
|
\draw (0, 0) grid (7, 7);
|
||||||
|
\draw[thick,->] (0.5,6.5) -- (0.5,4.5) -- (2.5,4.5) --
|
||||||
|
(2.5,3.5) -- (0.5,3.5) -- (0.5,0.5) --
|
||||||
|
(3.5,0.5) -- (3.5,1.5) -- (1.5,1.5) --
|
||||||
|
(1.5,2.5) -- (4.5,2.5) -- (4.5,0.5) --
|
||||||
|
(5.5,0.5) -- (5.5,3.5) -- (3.5,3.5) --
|
||||||
|
(3.5,5.5) -- (1.5,5.5) -- (1.5,6.5) --
|
||||||
|
(4.5,6.5) -- (4.5,4.5) -- (5.5,4.5) --
|
||||||
|
(5.5,6.5) -- (6.5,6.5) -- (6.5,0.5);
|
||||||
|
\end{scope}
|
||||||
|
\end{tikzpicture}
|
||||||
|
& \hspace{20px}
|
||||||
|
&
|
||||||
|
\begin{tikzpicture}[scale=.55]
|
||||||
|
\begin{scope}[yscale=1,xscale=-1,rotate=-90]
|
||||||
|
\draw (0, 0) grid (7, 7);
|
||||||
|
\draw[thick,->] (0.5,6.5) -- (0.5,4.5) -- (2.5,4.5) --
|
||||||
|
(2.5,3.5) -- (0.5,3.5) -- (0.5,0.5) --
|
||||||
|
(3.5,0.5) -- (3.5,1.5) -- (1.5,1.5) --
|
||||||
|
(1.5,2.5) -- (4.5,2.5) -- (4.5,0.5) --
|
||||||
|
(5.5,0.5) -- (5.5,3.5) -- (3.5,3.5) --
|
||||||
|
(3.5,5.5) -- (1.5,5.5) -- (1.5,6.5) --
|
||||||
|
(4.5,6.5) -- (4.5,4.5) -- (5.5,4.5) --
|
||||||
|
(5.5,6.5) -- (6.5,6.5) -- (6.5,0.5);
|
||||||
|
\end{scope}
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Hence, we can decide that we always first
|
||||||
|
move one step down (or right),
|
||||||
|
and finally multiply the number of solutions by two.
|
||||||
|
|
||||||
|
\begin{itemize}
|
||||||
|
\item
|
||||||
|
running time: 244 seconds
|
||||||
|
\item
|
||||||
|
number of recursive calls: 38 billion
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
\subsubsection{Optimization 2}
|
||||||
|
|
||||||
|
If the path reaches the lower-right square
|
||||||
|
before it has visited all other squares of the grid,
|
||||||
|
it is clear that
|
||||||
|
it will not be possible to complete the solution.
|
||||||
|
An example of this is the following path:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=.55]
|
||||||
|
\begin{scope}
|
||||||
|
\draw (0, 0) grid (7, 7);
|
||||||
|
\draw[thick,->] (0.5,6.5) -- (0.5,4.5) -- (2.5,4.5) --
|
||||||
|
(2.5,3.5) -- (0.5,3.5) -- (0.5,0.5) --
|
||||||
|
(3.5,0.5) -- (3.5,1.5) -- (1.5,1.5) --
|
||||||
|
(1.5,2.5) -- (4.5,2.5) -- (4.5,0.5) --
|
||||||
|
(6.5,0.5);
|
||||||
|
\end{scope}
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
Using this observation, we can terminate the search
|
||||||
|
immediately if we reach the lower-right square too early.
|
||||||
|
\begin{itemize}
|
||||||
|
\item
|
||||||
|
running time: 119 seconds
|
||||||
|
\item
|
||||||
|
number of recursive calls: 20 billion
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
\subsubsection{Optimization 3}
|
||||||
|
|
||||||
|
If the path touches a wall
|
||||||
|
and can turn either left or right,
|
||||||
|
the grid splits into two parts
|
||||||
|
that contain unvisited squares.
|
||||||
|
For example, in the following situation,
|
||||||
|
the path can turn either left or right:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=.55]
|
||||||
|
\begin{scope}
|
||||||
|
\draw (0, 0) grid (7, 7);
|
||||||
|
\draw[thick,->] (0.5,6.5) -- (0.5,4.5) -- (2.5,4.5) --
|
||||||
|
(2.5,3.5) -- (0.5,3.5) -- (0.5,0.5) --
|
||||||
|
(3.5,0.5) -- (3.5,1.5) -- (1.5,1.5) --
|
||||||
|
(1.5,2.5) -- (4.5,2.5) -- (4.5,0.5) --
|
||||||
|
(5.5,0.5) -- (5.5,6.5);
|
||||||
|
\end{scope}
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
In this case, we cannot visit all squares anymore,
|
||||||
|
so we can terminate the search.
|
||||||
|
This optimization is very useful:
|
||||||
|
|
||||||
|
\begin{itemize}
|
||||||
|
\item
|
||||||
|
running time: 1.8 seconds
|
||||||
|
\item
|
||||||
|
number of recursive calls: 221 million
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
\subsubsection{Optimization 4}
|
||||||
|
|
||||||
|
The idea of Optimization 3
|
||||||
|
can be generalized:
|
||||||
|
if the path cannot continue forward
|
||||||
|
but can turn either left or right,
|
||||||
|
the grid splits into two parts
|
||||||
|
that both contain unvisited squares.
|
||||||
|
For example, consider the following path:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=.55]
|
||||||
|
\begin{scope}
|
||||||
|
\draw (0, 0) grid (7, 7);
|
||||||
|
\draw[thick,->] (0.5,6.5) -- (0.5,4.5) -- (2.5,4.5) --
|
||||||
|
(2.5,3.5) -- (0.5,3.5) -- (0.5,0.5) --
|
||||||
|
(3.5,0.5) -- (3.5,1.5) -- (1.5,1.5) --
|
||||||
|
(1.5,2.5) -- (4.5,2.5) -- (4.5,0.5) --
|
||||||
|
(5.5,0.5) -- (5.5,4.5) -- (3.5,4.5);
|
||||||
|
\end{scope}
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
It is clear that we cannot visit all squares anymore,
|
||||||
|
so we can terminate the search.
|
||||||
|
After this optimization, the search is
|
||||||
|
very efficient:
|
||||||
|
|
||||||
|
\begin{itemize}
|
||||||
|
\item
|
||||||
|
running time: 0.6 seconds
|
||||||
|
\item
|
||||||
|
number of recursive calls: 69 million
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
~\\
|
||||||
|
Now is a good moment to stop optimizing
|
||||||
|
the algorithm and see what we have achieved.
|
||||||
|
The running time of the original algorithm
|
||||||
|
was 483 seconds,
|
||||||
|
and now after the optimizations,
|
||||||
|
the running time is only 0.6 seconds.
|
||||||
|
Thus, the algorithm became nearly 1000 times
|
||||||
|
faster after the optimizations.
|
||||||
|
|
||||||
|
This is a usual phenomenon in backtracking,
|
||||||
|
because the search tree is usually large
|
||||||
|
and even simple observations can effectively
|
||||||
|
prune the search.
|
||||||
|
Especially useful are optimizations that
|
||||||
|
occur during the first steps of the algorithm,
|
||||||
|
i.e., at the top of the search tree.
|
||||||
|
|
||||||
|
\section{Meet in the middle}
|
||||||
|
|
||||||
|
\index{meet in the middle}
|
||||||
|
|
||||||
|
\key{Meet in the middle} is a technique
|
||||||
|
where the search space is divided into
|
||||||
|
two parts of about equal size.
|
||||||
|
A separate search is performed
|
||||||
|
for both of the parts,
|
||||||
|
and finally the results of the searches are combined.
|
||||||
|
|
||||||
|
The technique can be used
|
||||||
|
if there is an efficient way to combine the
|
||||||
|
results of the searches.
|
||||||
|
In such a situation, the two searches may require less
|
||||||
|
time than one large search.
|
||||||
|
Typically, we can turn a factor of $2^n$
|
||||||
|
into a factor of $2^{n/2}$ using the meet in the
|
||||||
|
middle technique.
|
||||||
|
|
||||||
|
As an example, consider a problem where
|
||||||
|
we are given a list of $n$ numbers and
|
||||||
|
a number $x$,
|
||||||
|
and we want to find out if it is possible
|
||||||
|
to choose some numbers from the list so that
|
||||||
|
their sum is $x$.
|
||||||
|
For example, given the list $[2,4,5,9]$ and $x=15$,
|
||||||
|
we can choose the numbers $[2,4,9]$ to get $2+4+9=15$.
|
||||||
|
However, if $x=10$ for the same list,
|
||||||
|
it is not possible to form the sum.
|
||||||
|
|
||||||
|
A simple algorithm for the problem is to
|
||||||
|
go through all subsets of the elements and
|
||||||
|
check if the sum of any of the subsets is $x$.
|
||||||
|
The running time of such an algorithm is $O(2^n)$,
|
||||||
|
because there are $2^n$ subsets.
|
||||||
|
However, using the meet in the middle technique,
|
||||||
|
we can achieve a more efficient $O(2^{n/2})$ time algorithm\footnote{This
|
||||||
|
idea was introduced in 1974 by E. Horowitz and S. Sahni \cite{hor74}.}.
|
||||||
|
Note that $O(2^n)$ and $O(2^{n/2})$ are different
|
||||||
|
complexities because $2^{n/2}$ equals $\sqrt{2^n}$.
|
||||||
|
|
||||||
|
The idea is to divide the list into
|
||||||
|
two lists $A$ and $B$ such that both
|
||||||
|
lists contain about half of the numbers.
|
||||||
|
The first search generates all subsets
|
||||||
|
of $A$ and stores their sums to a list $S_A$.
|
||||||
|
Correspondingly, the second search creates
|
||||||
|
a list $S_B$ from $B$.
|
||||||
|
After this, it suffices to check if it is possible
|
||||||
|
to choose one element from $S_A$ and another
|
||||||
|
element from $S_B$ such that their sum is $x$.
|
||||||
|
This is possible exactly when there is a way to
|
||||||
|
form the sum $x$ using the numbers of the original list.
|
||||||
|
|
||||||
|
For example, suppose that the list is $[2,4,5,9]$ and $x=15$.
|
||||||
|
First, we divide the list into $A=[2,4]$ and $B=[5,9]$.
|
||||||
|
After this, we create lists
|
||||||
|
$S_A=[0,2,4,6]$ and $S_B=[0,5,9,14]$.
|
||||||
|
In this case, the sum $x=15$ is possible to form,
|
||||||
|
because $S_A$ contains the sum $6$,
|
||||||
|
$S_B$ contains the sum $9$, and $6+9=15$.
|
||||||
|
This corresponds to the solution $[2,4,9]$.
|
||||||
|
|
||||||
|
We can implement the algorithm so that
|
||||||
|
its time complexity is $O(2^{n/2})$.
|
||||||
|
First, we generate \emph{sorted} lists $S_A$ and $S_B$,
|
||||||
|
which can be done in $O(2^{n/2})$ time using a merge-like technique.
|
||||||
|
After this, since the lists are sorted,
|
||||||
|
we can check in $O(2^{n/2})$ time if
|
||||||
|
the sum $x$ can be created from $S_A$ and $S_B$.
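
The following is a minimal sketch of the technique; the
function name \texttt{sumPossible} and the other names are
chosen only for this example. For simplicity, the sketch sorts
$S_B$ and uses binary search, which adds a logarithmic factor
compared to the merge-like technique mentioned above.
\begin{lstlisting}
vector<long long> subsetSums(vector<int> v) {
    vector<long long> sums;
    int k = v.size();
    for (int b = 0; b < (1<<k); b++) {
        long long s = 0;
        for (int i = 0; i < k; i++) {
            if (b&(1<<i)) s += v[i];
        }
        sums.push_back(s);
    }
    return sums;
}

bool sumPossible(vector<int> numbers, long long x) {
    int h = numbers.size()/2;
    vector<int> A(numbers.begin(), numbers.begin()+h);
    vector<int> B(numbers.begin()+h, numbers.end());
    vector<long long> SA = subsetSums(A), SB = subsetSums(B);
    sort(SB.begin(), SB.end());
    for (long long a : SA) {
        if (binary_search(SB.begin(), SB.end(), x-a)) return true;
    }
    return false;
}
\end{lstlisting}
For example, \texttt{sumPossible(\{2,4,5,9\}, 15)} returns
\texttt{true}, corresponding to the solution $[2,4,9]$.
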
\chapter{Greedy algorithms}
|
||||||
|
|
||||||
|
\index{greedy algorithm}
|
||||||
|
|
||||||
|
A \key{greedy algorithm}
|
||||||
|
constructs a solution to the problem
|
||||||
|
by always making a choice that looks
|
||||||
|
the best at the moment.
|
||||||
|
A greedy algorithm never takes back
|
||||||
|
its choices, but directly constructs
|
||||||
|
the final solution.
|
||||||
|
For this reason, greedy algorithms
|
||||||
|
are usually very efficient.
|
||||||
|
|
||||||
|
The difficulty in designing greedy algorithms
|
||||||
|
is to find a greedy strategy
|
||||||
|
that always produces an optimal solution
|
||||||
|
to the problem.
|
||||||
|
The locally optimal choices in a greedy
|
||||||
|
algorithm should also be globally optimal.
|
||||||
|
It is often difficult to argue that
|
||||||
|
a greedy algorithm works.
|
||||||
|
|
||||||
|
\section{Coin problem}
|
||||||
|
|
||||||
|
As a first example, we consider a problem
|
||||||
|
where we are given a set of coins
|
||||||
|
and our task is to form a sum of money $n$
|
||||||
|
using the coins.
|
||||||
|
The values of the coins are
|
||||||
|
$\texttt{coins}=\{c_1,c_2,\ldots,c_k\}$,
|
||||||
|
and each coin can be used as many times as we want.
|
||||||
|
What is the minimum number of coins needed?
|
||||||
|
|
||||||
|
For example, if the coins are the euro coins (in cents)
|
||||||
|
\[\{1,2,5,10,20,50,100,200\}\]
|
||||||
|
and $n=520$,
|
||||||
|
we need at least four coins.
|
||||||
|
The optimal solution is to select coins
|
||||||
|
$200+200+100+20$ whose sum is 520.
|
||||||
|
|
||||||
|
\subsubsection{Greedy algorithm}
|
||||||
|
|
||||||
|
A simple greedy algorithm for the problem
|
||||||
|
always selects the largest possible coin,
|
||||||
|
until the required sum of money has been constructed.
|
||||||
|
This algorithm works in the example case,
|
||||||
|
because we first select two 200 cent coins,
|
||||||
|
then one 100 cent coin and finally one 20 cent coin.
|
||||||
|
But does this algorithm always work?
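
As a concrete reference, the strategy can be sketched as
follows, assuming that the coin values are given in decreasing
order and that the sum can always be formed (which holds for
the euro coins, because they include a 1 cent coin):
\begin{lstlisting}
vector<int> coins = {200,100,50,20,10,5,2,1};
int n = 520, count = 0;
for (int c : coins) {
    while (n >= c) {
        n -= c;
        count++;
    }
}
cout << count << "\n"; // 4
\end{lstlisting}
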
|
||||||
|
|
||||||
|
It turns out that if the coins are the euro coins,
|
||||||
|
the greedy algorithm \emph{always} works, i.e.,
|
||||||
|
it always produces a solution with the fewest
|
||||||
|
possible number of coins.
|
||||||
|
The correctness of the algorithm can be
|
||||||
|
shown as follows:
|
||||||
|
|
||||||
|
First, each coin 1, 5, 10, 50 and 100 appears
|
||||||
|
at most once in an optimal solution,
|
||||||
|
because if the
solution contained two such coins,
|
||||||
|
we could replace them by one coin and
|
||||||
|
obtain a better solution.
|
||||||
|
For example, if the solution contained
coins $5+5$, we could replace them with the coin $10$.
|
||||||
|
|
||||||
|
In the same way, coins 2 and 20 appear
|
||||||
|
at most twice in an optimal solution,
|
||||||
|
because we could replace
|
||||||
|
coins $2+2+2$ by coins $5+1$ and
|
||||||
|
coins $20+20+20$ by coins $50+10$.
|
||||||
|
Moreover, an optimal solution cannot contain
|
||||||
|
coins $2+2+1$ or $20+20+10$,
|
||||||
|
because we could replace them by coins $5$ and $50$.
|
||||||
|
|
||||||
|
Using these observations,
|
||||||
|
we can show for each coin $x$ that
|
||||||
|
it is not possible to optimally construct
|
||||||
|
a sum $x$ or any larger sum by only using coins
|
||||||
|
that are smaller than $x$.
|
||||||
|
For example, if $x=100$, the largest optimal
|
||||||
|
sum using the smaller coins is $50+20+20+5+2+2=99$.
|
||||||
|
Thus, the greedy algorithm that always selects
|
||||||
|
the largest coin produces the optimal solution.
|
||||||
|
|
||||||
|
This example shows that it can be difficult
|
||||||
|
to argue that a greedy algorithm works,
|
||||||
|
even if the algorithm itself is simple.
|
||||||
|
|
||||||
|
\subsubsection{General case}
|
||||||
|
|
||||||
|
In the general case, the coin set can contain any coins
|
||||||
|
and the greedy algorithm \emph{does not} necessarily produce
|
||||||
|
an optimal solution.
|
||||||
|
|
||||||
|
We can prove that a greedy algorithm does not work
|
||||||
|
by showing a counterexample
|
||||||
|
where the algorithm gives a wrong answer.
|
||||||
|
In this problem we can easily find a counterexample:
|
||||||
|
if the coins are $\{1,3,4\}$ and the target sum
|
||||||
|
is 6, the greedy algorithm produces the solution
|
||||||
|
$4+1+1$ while the optimal solution is $3+3$.
|
||||||
|
|
||||||
|
It is not known if the general coin problem
|
||||||
|
can be solved using any greedy algorithm\footnote{However, it is possible
|
||||||
|
to \emph{check} in polynomial time
|
||||||
|
if the greedy algorithm presented in this chapter works for
|
||||||
|
a given set of coins \cite{pea05}.}.
|
||||||
|
However, as we will see in Chapter 7,
|
||||||
|
in some cases,
|
||||||
|
the general problem can be efficiently
|
||||||
|
solved using a dynamic
|
||||||
|
programming algorithm that always gives the
|
||||||
|
correct answer.
|
||||||
|
|
||||||
|
\section{Scheduling}
|
||||||
|
|
||||||
|
Many scheduling problems can be solved
|
||||||
|
using greedy algorithms.
|
||||||
|
A classic problem is as follows:
|
||||||
|
Given $n$ events with their starting and ending
|
||||||
|
times, find a schedule
|
||||||
|
that includes as many events as possible.
|
||||||
|
It is not possible to select an event partially.
|
||||||
|
For example, consider the following events:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tabular}{lll}
|
||||||
|
event & starting time & ending time \\
|
||||||
|
\hline
|
||||||
|
$A$ & 1 & 3 \\
|
||||||
|
$B$ & 2 & 5 \\
|
||||||
|
$C$ & 3 & 9 \\
|
||||||
|
$D$ & 6 & 8 \\
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
|
||||||
|
In this case the maximum number of events is two.
|
||||||
|
For example, we can select events $B$ and $D$
|
||||||
|
as follows:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=.4]
|
||||||
|
\begin{scope}
|
||||||
|
\draw (2, 0) rectangle (6, -1);
|
||||||
|
\draw[fill=lightgray] (4, -1.5) rectangle (10, -2.5);
|
||||||
|
\draw (6, -3) rectangle (18, -4);
|
||||||
|
\draw[fill=lightgray] (12, -4.5) rectangle (16, -5.5);
|
||||||
|
\node at (2.5,-0.5) {$A$};
|
||||||
|
\node at (4.5,-2) {$B$};
|
||||||
|
\node at (6.5,-3.5) {$C$};
|
||||||
|
\node at (12.5,-5) {$D$};
|
||||||
|
\end{scope}
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
It is possible to invent several greedy algorithms
|
||||||
|
for the problem, but which of them works in every case?
|
||||||
|
|
||||||
|
\subsubsection*{Algorithm 1}
|
||||||
|
|
||||||
|
The first idea is to select events that are
as \emph{short} as possible.
|
||||||
|
In the example case this algorithm
|
||||||
|
selects the following events:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=.4]
|
||||||
|
\begin{scope}
|
||||||
|
\draw[fill=lightgray] (2, 0) rectangle (6, -1);
|
||||||
|
\draw (4, -1.5) rectangle (10, -2.5);
|
||||||
|
\draw (6, -3) rectangle (18, -4);
|
||||||
|
\draw[fill=lightgray] (12, -4.5) rectangle (16, -5.5);
|
||||||
|
\node at (2.5,-0.5) {$A$};
|
||||||
|
\node at (4.5,-2) {$B$};
|
||||||
|
\node at (6.5,-3.5) {$C$};
|
||||||
|
\node at (12.5,-5) {$D$};
|
||||||
|
\end{scope}
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
However, selecting short events is not always
|
||||||
|
a correct strategy. For example, the algorithm fails
|
||||||
|
in the following case:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=.4]
|
||||||
|
\begin{scope}
|
||||||
|
\draw (1, 0) rectangle (7, -1);
|
||||||
|
\draw[fill=lightgray] (6, -1.5) rectangle (9, -2.5);
|
||||||
|
\draw (8, -3) rectangle (14, -4);
|
||||||
|
\end{scope}
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
If we select the short event, we can only select one event.
|
||||||
|
However, it would be possible to select both long events.
|
||||||
|
|
||||||
|
\subsubsection*{Algorithm 2}
|
||||||
|
|
||||||
|
Another idea is to always select the next possible
|
||||||
|
event that \emph{begins} as \emph{early} as possible.
|
||||||
|
This algorithm selects the following events:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=.4]
|
||||||
|
\begin{scope}
|
||||||
|
\draw[fill=lightgray] (2, 0) rectangle (6, -1);
|
||||||
|
\draw (4, -1.5) rectangle (10, -2.5);
|
||||||
|
\draw[fill=lightgray] (6, -3) rectangle (18, -4);
|
||||||
|
\draw (12, -4.5) rectangle (16, -5.5);
|
||||||
|
\node at (2.5,-0.5) {$A$};
|
||||||
|
\node at (4.5,-2) {$B$};
|
||||||
|
\node at (6.5,-3.5) {$C$};
|
||||||
|
\node at (12.5,-5) {$D$};
|
||||||
|
\end{scope}
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
However, we can find a counterexample
|
||||||
|
also for this algorithm.
|
||||||
|
For example, in the following case,
|
||||||
|
the algorithm only selects one event:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=.4]
|
||||||
|
\begin{scope}
|
||||||
|
\draw[fill=lightgray] (1, 0) rectangle (14, -1);
|
||||||
|
\draw (3, -1.5) rectangle (7, -2.5);
|
||||||
|
\draw (8, -3) rectangle (12, -4);
|
||||||
|
\end{scope}
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
If we select the first event, it is not possible
|
||||||
|
to select any other events.
|
||||||
|
However, it would be possible to select the
|
||||||
|
other two events.
|
||||||
|
|
||||||
|
\subsubsection*{Algorithm 3}
|
||||||
|
|
||||||
|
The third idea is to always select the next
|
||||||
|
possible event that \emph{ends} as \emph{early} as possible.
|
||||||
|
This algorithm selects the following events:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=.4]
|
||||||
|
\begin{scope}
|
||||||
|
\draw[fill=lightgray] (2, 0) rectangle (6, -1);
|
||||||
|
\draw (4, -1.5) rectangle (10, -2.5);
|
||||||
|
\draw (6, -3) rectangle (18, -4);
|
||||||
|
\draw[fill=lightgray] (12, -4.5) rectangle (16, -5.5);
|
||||||
|
\node at (2.5,-0.5) {$A$};
|
||||||
|
\node at (4.5,-2) {$B$};
|
||||||
|
\node at (6.5,-3.5) {$C$};
|
||||||
|
\node at (12.5,-5) {$D$};
|
||||||
|
\end{scope}
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
It turns out that this algorithm
|
||||||
|
\emph{always} produces an optimal solution.
|
||||||
|
The reason for this is that it is always an optimal choice
|
||||||
|
to first select an event that ends
|
||||||
|
as early as possible.
|
||||||
|
After this, it is an optimal choice
|
||||||
|
to select the next event
|
||||||
|
using the same strategy, etc.,
|
||||||
|
until we cannot select any more events.
|
||||||
|
|
||||||
|
One way to argue that the algorithm works
|
||||||
|
is to consider
|
||||||
|
what happens if we first select an event
|
||||||
|
that ends later than the event that ends
|
||||||
|
as early as possible.
|
||||||
|
Now, we will have at most an equal number of
|
||||||
|
choices for selecting the next event.
|
||||||
|
Hence, selecting an event that ends later
|
||||||
|
can never yield a better solution,
|
||||||
|
and the greedy algorithm is correct.
|
||||||
|
|
||||||
|
\section{Tasks and deadlines}
|
||||||
|
|
||||||
|
Let us now consider a problem where
|
||||||
|
we are given $n$ tasks with durations and deadlines
|
||||||
|
and our task is to choose an order to perform the tasks.
|
||||||
|
For each task, we earn $d-x$ points
|
||||||
|
where $d$ is the task's deadline
|
||||||
|
and $x$ is the moment when we finish the task.
|
||||||
|
What is the largest possible total score
|
||||||
|
we can obtain?
|
||||||
|
|
||||||
|
For example, suppose that the tasks are as follows:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tabular}{lll}
|
||||||
|
task & duration & deadline \\
|
||||||
|
\hline
|
||||||
|
$A$ & 4 & 2 \\
|
||||||
|
$B$ & 3 & 5 \\
|
||||||
|
$C$ & 2 & 7 \\
|
||||||
|
$D$ & 4 & 5 \\
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
|
||||||
|
In this case, an optimal schedule for the tasks
|
||||||
|
is as follows:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=.4]
|
||||||
|
\begin{scope}
|
||||||
|
\draw (0, 0) rectangle (4, -1);
|
||||||
|
\draw (4, 0) rectangle (10, -1);
|
||||||
|
\draw (10, 0) rectangle (18, -1);
|
||||||
|
\draw (18, 0) rectangle (26, -1);
|
||||||
|
\node at (0.5,-0.5) {$C$};
|
||||||
|
\node at (4.5,-0.5) {$B$};
|
||||||
|
\node at (10.5,-0.5) {$A$};
|
||||||
|
\node at (18.5,-0.5) {$D$};
|
||||||
|
|
||||||
|
\draw (0,1.5) -- (26,1.5);
|
||||||
|
\foreach \i in {0,2,...,26}
|
||||||
|
{
|
||||||
|
\draw (\i,1.25) -- (\i,1.75);
|
||||||
|
}
|
||||||
|
\footnotesize
|
||||||
|
\node at (0,2.5) {0};
|
||||||
|
\node at (10,2.5) {5};
|
||||||
|
\node at (20,2.5) {10};
|
||||||
|
|
||||||
|
\end{scope}
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
In this solution, $C$ yields 5 points,
|
||||||
|
$B$ yields 0 points, $A$ yields $-7$ points
|
||||||
|
and $D$ yields $-8$ points,
|
||||||
|
so the total score is $-10$.
|
||||||
|
|
||||||
|
Surprisingly, the optimal solution to the problem
|
||||||
|
does not depend on the deadlines at all,
|
||||||
|
but a correct greedy strategy is to simply
|
||||||
|
perform the tasks \emph{sorted by their durations}
|
||||||
|
in increasing order.
|
||||||
|
The reason for this is that if we ever perform
|
||||||
|
two tasks one after another such that the first task
|
||||||
|
takes longer than the second task,
|
||||||
|
we can obtain a better solution if we swap the tasks.
|
||||||
|
For example, consider the following schedule:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=.4]
|
||||||
|
\begin{scope}
|
||||||
|
\draw (0, 0) rectangle (8, -1);
|
||||||
|
\draw (8, 0) rectangle (12, -1);
|
||||||
|
\node at (0.5,-0.5) {$X$};
|
||||||
|
\node at (8.5,-0.5) {$Y$};
|
||||||
|
|
||||||
|
\draw [decoration={brace}, decorate, line width=0.3mm] (7.75,-1.5) -- (0.25,-1.5);
|
||||||
|
\draw [decoration={brace}, decorate, line width=0.3mm] (11.75,-1.5) -- (8.25,-1.5);
|
||||||
|
|
||||||
|
\footnotesize
|
||||||
|
\node at (4,-2.5) {$a$};
|
||||||
|
\node at (10,-2.5) {$b$};
|
||||||
|
|
||||||
|
\end{scope}
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
Here $a>b$, so we should swap the tasks:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=.4]
|
||||||
|
\begin{scope}
|
||||||
|
\draw (0, 0) rectangle (4, -1);
|
||||||
|
\draw (4, 0) rectangle (12, -1);
|
||||||
|
\node at (0.5,-0.5) {$Y$};
|
||||||
|
\node at (4.5,-0.5) {$X$};
|
||||||
|
|
||||||
|
\draw [decoration={brace}, decorate, line width=0.3mm] (3.75,-1.5) -- (0.25,-1.5);
|
||||||
|
\draw [decoration={brace}, decorate, line width=0.3mm] (11.75,-1.5) -- (4.25,-1.5);
|
||||||
|
|
||||||
|
\footnotesize
|
||||||
|
\node at (2,-2.5) {$b$};
|
||||||
|
\node at (8,-2.5) {$a$};
|
||||||
|
|
||||||
|
\end{scope}
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
Now $X$ gives $b$ points less and $Y$ gives $a$ points more,
|
||||||
|
so the total score increases by $a-b > 0$.
|
||||||
|
In an optimal solution,
|
||||||
|
for any two consecutive tasks,
|
||||||
|
it must hold that the shorter task comes
|
||||||
|
before the longer task.
|
||||||
|
Thus, the tasks must be performed
|
||||||
|
sorted by their durations.
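A short sketch of this strategy (a hypothetical helper, assuming that
each task is given as a (duration, deadline) pair) is as follows:
\begin{lstlisting}
// sketch: maximum total score for tasks given as (duration, deadline) pairs
long long maxScore(vector<pair<long long,long long>> tasks) {
    // perform the tasks sorted by duration in increasing order
    sort(tasks.begin(), tasks.end());
    long long time = 0, score = 0;
    for (auto& t : tasks) {
        time += t.first;           // moment when this task is finished
        score += t.second - time;  // d - x points
    }
    return score;
}
\end{lstlisting}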
|
||||||
|
|
||||||
|
\section{Minimizing sums}
|
||||||
|
|
||||||
|
We next consider a problem where
|
||||||
|
we are given $n$ numbers $a_1,a_2,\ldots,a_n$
|
||||||
|
and our task is to find a value $x$
|
||||||
|
that minimizes the sum
|
||||||
|
\[|a_1-x|^c+|a_2-x|^c+\cdots+|a_n-x|^c.\]
|
||||||
|
We focus on the cases $c=1$ and $c=2$.
|
||||||
|
|
||||||
|
\subsubsection{Case $c=1$}
|
||||||
|
|
||||||
|
In this case, we should minimize the sum
|
||||||
|
\[|a_1-x|+|a_2-x|+\cdots+|a_n-x|.\]
|
||||||
|
For example, if the numbers are $[1,2,9,2,6]$,
|
||||||
|
the best solution is to select $x=2$
|
||||||
|
which produces the sum
|
||||||
|
\[
|
||||||
|
|1-2|+|2-2|+|9-2|+|2-2|+|6-2|=12.
|
||||||
|
\]
|
||||||
|
In the general case, the best choice for $x$
|
||||||
|
is the \textit{median} of the numbers,
|
||||||
|
i.e., the middle number after sorting.
|
||||||
|
For example, the list $[1,2,9,2,6]$
|
||||||
|
becomes $[1,2,2,6,9]$ after sorting,
|
||||||
|
so the median is 2.
|
||||||
|
|
||||||
|
The median is an optimal choice,
|
||||||
|
because if $x$ is smaller than the median,
|
||||||
|
the sum becomes smaller by increasing $x$,
|
||||||
|
and if $x$ is larger than the median,
|
||||||
|
the sum becomes smaller by decreasing $x$.
|
||||||
|
Hence, the optimal solution is that $x$
|
||||||
|
is the median.
|
||||||
|
If $n$ is even and there are two medians,
|
||||||
|
both medians and all values between them
|
||||||
|
are optimal choices.
|
||||||
|
|
||||||
|
\subsubsection{Case $c=2$}
|
||||||
|
|
||||||
|
In this case, we should minimize the sum
|
||||||
|
\[(a_1-x)^2+(a_2-x)^2+\cdots+(a_n-x)^2.\]
|
||||||
|
For example, if the numbers are $[1,2,9,2,6]$,
|
||||||
|
the best solution is to select $x=4$
|
||||||
|
which produces the sum
|
||||||
|
\[
|
||||||
|
(1-4)^2+(2-4)^2+(9-4)^2+(2-4)^2+(6-4)^2=46.
|
||||||
|
\]
|
||||||
|
In the general case, the best choice for $x$
|
||||||
|
is the \emph{average} of the numbers.
|
||||||
|
In the example the average is $(1+2+9+2+6)/5=4$.
|
||||||
|
This result can be derived by presenting
|
||||||
|
the sum as follows:
|
||||||
|
\[
|
||||||
|
nx^2 - 2x(a_1+a_2+\cdots+a_n) + (a_1^2+a_2^2+\cdots+a_n^2)
|
||||||
|
\]
|
||||||
|
The last part does not depend on $x$,
|
||||||
|
so we can ignore it.
|
||||||
|
The remaining parts form a function
|
||||||
|
$nx^2-2xs$ where $s=a_1+a_2+\cdots+a_n$.
|
||||||
|
This is a parabola opening upwards
|
||||||
|
with roots $x=0$ and $x=2s/n$,
|
||||||
|
and the minimum value is the average
|
||||||
|
of the roots $x=s/n$, i.e.,
|
||||||
|
the average of the numbers $a_1,a_2,\ldots,a_n$.
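The following sketch (not part of the original text; it assumes that the
numbers are stored in a vector \texttt{a} of size $n$) computes an
optimal $x$ for both cases:
\begin{lstlisting}
// sketch: optimal x for c = 1 (a median) and c = 2 (the average)
sort(a.begin(), a.end());
long long x1 = a[n/2];   // a median; for even n, any value between
                         // the two middle elements is also optimal
double sum = 0;
for (int i = 0; i < n; i++) sum += a[i];
double x2 = sum/n;       // the average minimizes the sum of squares
\end{lstlisting}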
|
||||||
|
|
||||||
|
\section{Data compression}
|
||||||
|
|
||||||
|
\index{data compression}
|
||||||
|
\index{binary code}
|
||||||
|
\index{codeword}
|
||||||
|
|
||||||
|
A \key{binary code} assigns for each character
|
||||||
|
of a string a \key{codeword} that consists of bits.
|
||||||
|
We can \emph{compress} the string using the binary code
|
||||||
|
by replacing each character by the
|
||||||
|
corresponding codeword.
|
||||||
|
For example, the following binary code
|
||||||
|
assigns codewords for characters
|
||||||
|
\texttt{A}–\texttt{D}:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tabular}{rr}
|
||||||
|
character & codeword \\
|
||||||
|
\hline
|
||||||
|
\texttt{A} & 00 \\
|
||||||
|
\texttt{B} & 01 \\
|
||||||
|
\texttt{C} & 10 \\
|
||||||
|
\texttt{D} & 11 \\
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
|
||||||
|
This is a \key{constant-length} code
|
||||||
|
which means that the length of each
|
||||||
|
codeword is the same.
|
||||||
|
For example, we can compress the string
|
||||||
|
\texttt{AABACDACA} as follows:
|
||||||
|
\[00\,00\,01\,00\,10\,11\,00\,10\,00\]
|
||||||
|
Using this code, the length of the compressed
|
||||||
|
string is 18 bits.
|
||||||
|
However, we can compress the string better
|
||||||
|
if we use a \key{variable-length} code
|
||||||
|
where codewords may have different lengths.
|
||||||
|
Then we can give short codewords for
|
||||||
|
characters that appear often
|
||||||
|
and long codewords for characters
|
||||||
|
that appear rarely.
|
||||||
|
It turns out that an \key{optimal} code
|
||||||
|
for the above string is as follows:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tabular}{rr}
|
||||||
|
character & codeword \\
|
||||||
|
\hline
|
||||||
|
\texttt{A} & 0 \\
|
||||||
|
\texttt{B} & 110 \\
|
||||||
|
\texttt{C} & 10 \\
|
||||||
|
\texttt{D} & 111 \\
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
|
||||||
|
An optimal code produces a compressed string
|
||||||
|
that is as short as possible.
|
||||||
|
In this case, the compressed string using
|
||||||
|
the optimal code is
|
||||||
|
\[0\,0\,110\,0\,10\,111\,0\,10\,0,\]
|
||||||
|
so only 15 bits are needed instead of 18 bits.
|
||||||
|
Thus, thanks to a better code it was possible to
|
||||||
|
save 3 bits in the compressed string.
|
||||||
|
|
||||||
|
We require that no codeword
|
||||||
|
is a prefix of another codeword.
|
||||||
|
For example, it is not allowed that a code
|
||||||
|
would contain both codewords 10
|
||||||
|
and 1011.
|
||||||
|
The reason for this is that we want
|
||||||
|
to be able to generate the original string
|
||||||
|
from the compressed string.
|
||||||
|
If a codeword could be a prefix of another codeword,
|
||||||
|
this would not always be possible.
|
||||||
|
For example, the following code is \emph{not} valid:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tabular}{rr}
|
||||||
|
character & codeword \\
|
||||||
|
\hline
|
||||||
|
\texttt{A} & 10 \\
|
||||||
|
\texttt{B} & 11 \\
|
||||||
|
\texttt{C} & 1011 \\
|
||||||
|
\texttt{D} & 111 \\
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
|
||||||
|
Using this code, it would not be possible to know
|
||||||
|
if the compressed string 1011 corresponds to
|
||||||
|
the string \texttt{AB} or the string \texttt{C}.
|
||||||
|
|
||||||
|
\index{Huffman coding}
|
||||||
|
|
||||||
|
\subsubsection{Huffman coding}
|
||||||
|
|
||||||
|
\key{Huffman coding}\footnote{D. A. Huffman discovered this method
|
||||||
|
when solving a university course assignment
|
||||||
|
and published the algorithm in 1952 \cite{huf52}.} is a greedy algorithm
|
||||||
|
that constructs an optimal code for
|
||||||
|
compressing a given string.
|
||||||
|
The algorithm builds a binary tree
|
||||||
|
based on the frequencies of the characters
|
||||||
|
in the string,
|
||||||
|
and each character's codeword can be read
|
||||||
|
by following a path from the root to
|
||||||
|
the corresponding node.
|
||||||
|
A move to the left corresponds to bit 0,
|
||||||
|
and a move to the right corresponds to bit 1.
|
||||||
|
|
||||||
|
Initially, each character of the string is
|
||||||
|
represented by a node whose weight is the
|
||||||
|
number of times the character occurs in the string.
|
||||||
|
Then at each step two nodes with minimum weights
|
||||||
|
are combined by creating
|
||||||
|
a new node whose weight is the sum of the weights
|
||||||
|
of the original nodes.
|
||||||
|
The process continues until all nodes have been combined.
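A common way to implement the combining step is to use a priority queue.
The following sketch (a hypothetical helper that only computes the length
of the compressed string, not the actual codewords) illustrates the idea:
\begin{lstlisting}
// sketch: total number of bits in the compressed string,
// given the frequencies of the characters
long long huffmanLength(vector<long long> freq) {
    priority_queue<long long, vector<long long>,
                   greater<long long>> q;
    for (auto f : freq) q.push(f);
    long long bits = 0;
    while (q.size() > 1) {
        long long a = q.top(); q.pop();
        long long b = q.top(); q.pop();
        // each merge adds one bit to the codeword of every
        // character below the new node
        bits += a+b;
        q.push(a+b);
    }
    return bits;
}
\end{lstlisting}
For the frequencies $5,1,2,1$ of the above string the function returns 15,
which matches the length of the compressed string.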
|
||||||
|
|
||||||
|
Next we will see how Huffman coding creates
|
||||||
|
the optimal code for the string
|
||||||
|
\texttt{AABACDACA}.
|
||||||
|
Initially, there are four nodes that correspond
|
||||||
|
to the characters of the string:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (0,0) {$5$};
|
||||||
|
\node[draw, circle] (2) at (2,0) {$1$};
|
||||||
|
\node[draw, circle] (3) at (4,0) {$2$};
|
||||||
|
\node[draw, circle] (4) at (6,0) {$1$};
|
||||||
|
|
||||||
|
\node[color=blue] at (0,-0.75) {\texttt{A}};
|
||||||
|
\node[color=blue] at (2,-0.75) {\texttt{B}};
|
||||||
|
\node[color=blue] at (4,-0.75) {\texttt{C}};
|
||||||
|
\node[color=blue] at (6,-0.75) {\texttt{D}};
|
||||||
|
|
||||||
|
%\path[draw,thick,-] (4) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
The node that represents character \texttt{A}
|
||||||
|
has weight 5 because character \texttt{A}
|
||||||
|
appears 5 times in the string.
|
||||||
|
The other weights have been calculated
|
||||||
|
in the same way.
|
||||||
|
|
||||||
|
The first step is to combine the nodes that
|
||||||
|
correspond to characters \texttt{B} and \texttt{D},
|
||||||
|
both with weight 1.
|
||||||
|
The result is:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (0,0) {$5$};
|
||||||
|
\node[draw, circle] (3) at (2,0) {$2$};
|
||||||
|
\node[draw, circle] (2) at (4,0) {$1$};
|
||||||
|
\node[draw, circle] (4) at (6,0) {$1$};
|
||||||
|
\node[draw, circle] (5) at (5,1) {$2$};
|
||||||
|
|
||||||
|
\node[color=blue] at (0,-0.75) {\texttt{A}};
|
||||||
|
\node[color=blue] at (2,-0.75) {\texttt{C}};
|
||||||
|
\node[color=blue] at (4,-0.75) {\texttt{B}};
|
||||||
|
\node[color=blue] at (6,-0.75) {\texttt{D}};
|
||||||
|
|
||||||
|
\node at (4.3,0.7) {0};
|
||||||
|
\node at (5.7,0.7) {1};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (4) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
After this, the nodes with weight 2 are combined:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,0) {$5$};
|
||||||
|
\node[draw, circle] (3) at (3,1) {$2$};
|
||||||
|
\node[draw, circle] (2) at (4,0) {$1$};
|
||||||
|
\node[draw, circle] (4) at (6,0) {$1$};
|
||||||
|
\node[draw, circle] (5) at (5,1) {$2$};
|
||||||
|
\node[draw, circle] (6) at (4,2) {$4$};
|
||||||
|
|
||||||
|
\node[color=blue] at (1,-0.75) {\texttt{A}};
|
||||||
|
\node[color=blue] at (3,1-0.75) {\texttt{C}};
|
||||||
|
\node[color=blue] at (4,-0.75) {\texttt{B}};
|
||||||
|
\node[color=blue] at (6,-0.75) {\texttt{D}};
|
||||||
|
|
||||||
|
\node at (4.3,0.7) {0};
|
||||||
|
\node at (5.7,0.7) {1};
|
||||||
|
\node at (3.3,1.7) {0};
|
||||||
|
\node at (4.7,1.7) {1};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (4) -- (5);
|
||||||
|
\path[draw,thick,-] (3) -- (6);
|
||||||
|
\path[draw,thick,-] (5) -- (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
Finally, the two remaining nodes are combined:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (2,2) {$5$};
|
||||||
|
\node[draw, circle] (3) at (3,1) {$2$};
|
||||||
|
\node[draw, circle] (2) at (4,0) {$1$};
|
||||||
|
\node[draw, circle] (4) at (6,0) {$1$};
|
||||||
|
\node[draw, circle] (5) at (5,1) {$2$};
|
||||||
|
\node[draw, circle] (6) at (4,2) {$4$};
|
||||||
|
\node[draw, circle] (7) at (3,3) {$9$};
|
||||||
|
|
||||||
|
\node[color=blue] at (2,2-0.75) {\texttt{A}};
|
||||||
|
\node[color=blue] at (3,1-0.75) {\texttt{C}};
|
||||||
|
\node[color=blue] at (4,-0.75) {\texttt{B}};
|
||||||
|
\node[color=blue] at (6,-0.75) {\texttt{D}};
|
||||||
|
|
||||||
|
\node at (4.3,0.7) {0};
|
||||||
|
\node at (5.7,0.7) {1};
|
||||||
|
\node at (3.3,1.7) {0};
|
||||||
|
\node at (4.7,1.7) {1};
|
||||||
|
\node at (2.3,2.7) {0};
|
||||||
|
\node at (3.7,2.7) {1};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (4) -- (5);
|
||||||
|
\path[draw,thick,-] (3) -- (6);
|
||||||
|
\path[draw,thick,-] (5) -- (6);
|
||||||
|
\path[draw,thick,-] (1) -- (7);
|
||||||
|
\path[draw,thick,-] (6) -- (7);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Now all nodes are in the tree, so the code is ready.
|
||||||
|
The following codewords can be read from the tree:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tabular}{rr}
|
||||||
|
character & codeword \\
|
||||||
|
\hline
|
||||||
|
\texttt{A} & 0 \\
|
||||||
|
\texttt{B} & 110 \\
|
||||||
|
\texttt{C} & 10 \\
|
||||||
|
\texttt{D} & 111 \\
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
|
File diff suppressed because it is too large
|
@ -0,0 +1,732 @@
|
||||||
|
\chapter{Amortized analysis}
|
||||||
|
|
||||||
|
\index{amortized analysis}
|
||||||
|
|
||||||
|
The time complexity of an algorithm
|
||||||
|
is often easy to analyze
|
||||||
|
just by examining the structure
|
||||||
|
of the algorithm:
|
||||||
|
what loops the algorithm contains and how many times they are performed.
|
||||||
|
However, sometimes a straightforward analysis
|
||||||
|
does not give a true picture of the efficiency of the algorithm.
|
||||||
|
|
||||||
|
\key{Amortized analysis} can be used to analyze
|
||||||
|
algorithms that contain operations whose
|
||||||
|
time complexity varies.
|
||||||
|
The idea is to estimate the total time used for
|
||||||
|
all such operations during the
|
||||||
|
execution of the algorithm, instead of focusing
|
||||||
|
on individual operations.
|
||||||
|
|
||||||
|
\section{Two pointers method}
|
||||||
|
|
||||||
|
\index{two pointers method}
|
||||||
|
|
||||||
|
In the \key{two pointers method},
|
||||||
|
two pointers are used to
|
||||||
|
iterate through the array values.
|
||||||
|
Both pointers can only move in one direction,
|
||||||
|
which ensures that the algorithm works efficiently.
|
||||||
|
Next we discuss two problems that can be solved
|
||||||
|
using the two pointers method.
|
||||||
|
|
||||||
|
\subsubsection{Subarray sum}
|
||||||
|
|
||||||
|
As the first example,
|
||||||
|
consider a problem where we are
|
||||||
|
given an array of $n$ positive integers
|
||||||
|
and a target sum $x$,
|
||||||
|
and we want to find a subarray whose sum is $x$
|
||||||
|
or report that there is no such subarray.
|
||||||
|
|
||||||
|
For example, the array
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\draw (0,0) grid (8,1);
|
||||||
|
|
||||||
|
\node at (0.5,0.5) {$1$};
|
||||||
|
\node at (1.5,0.5) {$3$};
|
||||||
|
\node at (2.5,0.5) {$2$};
|
||||||
|
\node at (3.5,0.5) {$5$};
|
||||||
|
\node at (4.5,0.5) {$1$};
|
||||||
|
\node at (5.5,0.5) {$1$};
|
||||||
|
\node at (6.5,0.5) {$2$};
|
||||||
|
\node at (7.5,0.5) {$3$};
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
contains a subarray whose sum is 8:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\fill[color=lightgray] (2,0) rectangle (5,1);
|
||||||
|
\draw (0,0) grid (8,1);
|
||||||
|
|
||||||
|
\node at (0.5,0.5) {$1$};
|
||||||
|
\node at (1.5,0.5) {$3$};
|
||||||
|
\node at (2.5,0.5) {$2$};
|
||||||
|
\node at (3.5,0.5) {$5$};
|
||||||
|
\node at (4.5,0.5) {$1$};
|
||||||
|
\node at (5.5,0.5) {$1$};
|
||||||
|
\node at (6.5,0.5) {$2$};
|
||||||
|
\node at (7.5,0.5) {$3$};
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
This problem can be solved in
|
||||||
|
$O(n)$ time by using the two pointers method.
|
||||||
|
The idea is to maintain pointers that point to the
|
||||||
|
first and last value of a subarray.
|
||||||
|
On each turn, the left pointer moves one step
|
||||||
|
to the right, and the right pointer moves to the right
|
||||||
|
as long as the resulting subarray sum is at most $x$.
|
||||||
|
If the sum becomes exactly $x$,
|
||||||
|
a solution has been found.
|
||||||
|
|
||||||
|
As an example, consider the following array
|
||||||
|
and a target sum $x=8$:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\draw (0,0) grid (8,1);
|
||||||
|
|
||||||
|
\node at (0.5,0.5) {$1$};
|
||||||
|
\node at (1.5,0.5) {$3$};
|
||||||
|
\node at (2.5,0.5) {$2$};
|
||||||
|
\node at (3.5,0.5) {$5$};
|
||||||
|
\node at (4.5,0.5) {$1$};
|
||||||
|
\node at (5.5,0.5) {$1$};
|
||||||
|
\node at (6.5,0.5) {$2$};
|
||||||
|
\node at (7.5,0.5) {$3$};
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
The initial subarray contains the values
|
||||||
|
1, 3 and 2 whose sum is 6:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\fill[color=lightgray] (0,0) rectangle (3,1);
|
||||||
|
\draw (0,0) grid (8,1);
|
||||||
|
|
||||||
|
\node at (0.5,0.5) {$1$};
|
||||||
|
\node at (1.5,0.5) {$3$};
|
||||||
|
\node at (2.5,0.5) {$2$};
|
||||||
|
\node at (3.5,0.5) {$5$};
|
||||||
|
\node at (4.5,0.5) {$1$};
|
||||||
|
\node at (5.5,0.5) {$1$};
|
||||||
|
\node at (6.5,0.5) {$2$};
|
||||||
|
\node at (7.5,0.5) {$3$};
|
||||||
|
|
||||||
|
\draw[thick,->] (0.5,-0.7) -- (0.5,-0.1);
|
||||||
|
\draw[thick,->] (2.5,-0.7) -- (2.5,-0.1);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Then, the left pointer moves one step to the right.
|
||||||
|
The right pointer does not move, because otherwise
|
||||||
|
the subarray sum would exceed $x$.
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\fill[color=lightgray] (1,0) rectangle (3,1);
|
||||||
|
\draw (0,0) grid (8,1);
|
||||||
|
|
||||||
|
\node at (0.5,0.5) {$1$};
|
||||||
|
\node at (1.5,0.5) {$3$};
|
||||||
|
\node at (2.5,0.5) {$2$};
|
||||||
|
\node at (3.5,0.5) {$5$};
|
||||||
|
\node at (4.5,0.5) {$1$};
|
||||||
|
\node at (5.5,0.5) {$1$};
|
||||||
|
\node at (6.5,0.5) {$2$};
|
||||||
|
\node at (7.5,0.5) {$3$};
|
||||||
|
|
||||||
|
\draw[thick,->] (1.5,-0.7) -- (1.5,-0.1);
|
||||||
|
\draw[thick,->] (2.5,-0.7) -- (2.5,-0.1);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Again, the left pointer moves one step to the right,
|
||||||
|
and this time the right pointer moves three
|
||||||
|
steps to the right.
|
||||||
|
The subarray sum is $2+5+1=8$, so a subarray
|
||||||
|
whose sum is $x$ has been found.
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\fill[color=lightgray] (2,0) rectangle (5,1);
|
||||||
|
\draw (0,0) grid (8,1);
|
||||||
|
|
||||||
|
\node at (0.5,0.5) {$1$};
|
||||||
|
\node at (1.5,0.5) {$3$};
|
||||||
|
\node at (2.5,0.5) {$2$};
|
||||||
|
\node at (3.5,0.5) {$5$};
|
||||||
|
\node at (4.5,0.5) {$1$};
|
||||||
|
\node at (5.5,0.5) {$1$};
|
||||||
|
\node at (6.5,0.5) {$2$};
|
||||||
|
\node at (7.5,0.5) {$3$};
|
||||||
|
|
||||||
|
\draw[thick,->] (2.5,-0.7) -- (2.5,-0.1);
|
||||||
|
\draw[thick,->] (4.5,-0.7) -- (4.5,-0.1);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
The running time of the algorithm depends on
|
||||||
|
the number of steps the right pointer moves.
|
||||||
|
While there is no useful upper bound on how many steps the
|
||||||
|
pointer can move on a \emph{single} turn,
|
||||||
|
we know that the pointer moves \emph{a total of}
|
||||||
|
$O(n)$ steps during the algorithm,
|
||||||
|
because it only moves to the right.
|
||||||
|
|
||||||
|
Since both the left and right pointer
|
||||||
|
move $O(n)$ steps during the algorithm,
|
||||||
|
the algorithm works in $O(n)$ time.
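A sketch of the method (assuming the array is stored in a vector
\texttt{a} of positive integers; here the right pointer is moved by the
outer loop, which is an equivalent way to write the same idea) is as
follows:
\begin{lstlisting}
// sketch: is there a subarray of a whose sum is exactly x?
bool subarraySum(const vector<int>& a, long long x) {
    long long sum = 0;
    int left = 0;
    for (int right = 0; right < (int)a.size(); right++) {
        sum += a[right];                   // extend the subarray to the right
        while (sum > x) sum -= a[left++];  // shrink it from the left
        if (sum == x) return true;
    }
    return false;
}
\end{lstlisting}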
|
||||||
|
|
||||||
|
\subsubsection{2SUM problem}
|
||||||
|
|
||||||
|
\index{2SUM problem}
|
||||||
|
|
||||||
|
Another problem that can be solved using
|
||||||
|
the two pointers method is the following problem,
|
||||||
|
also known as the \key{2SUM problem}:
|
||||||
|
given an array of $n$ numbers and
|
||||||
|
a target sum $x$, find
|
||||||
|
two array values such that their sum is $x$,
|
||||||
|
or report that no such values exist.
|
||||||
|
|
||||||
|
To solve the problem, we first
|
||||||
|
sort the array values in increasing order.
|
||||||
|
After that, we iterate through the array using
|
||||||
|
two pointers.
|
||||||
|
The left pointer starts at the first value
|
||||||
|
and moves one step to the right on each turn.
|
||||||
|
The right pointer begins at the last value
|
||||||
|
and always moves to the left until the sum of the
|
||||||
|
left and right value is at most $x$.
|
||||||
|
If the sum is exactly $x$,
|
||||||
|
a solution has been found.
|
||||||
|
|
||||||
|
For example, consider the following array
|
||||||
|
and a target sum $x=12$:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\draw (0,0) grid (8,1);
|
||||||
|
|
||||||
|
\node at (0.5,0.5) {$1$};
|
||||||
|
\node at (1.5,0.5) {$4$};
|
||||||
|
\node at (2.5,0.5) {$5$};
|
||||||
|
\node at (3.5,0.5) {$6$};
|
||||||
|
\node at (4.5,0.5) {$7$};
|
||||||
|
\node at (5.5,0.5) {$9$};
|
||||||
|
\node at (6.5,0.5) {$9$};
|
||||||
|
\node at (7.5,0.5) {$10$};
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
The initial positions of the pointers
|
||||||
|
are as follows.
|
||||||
|
The sum of the values is $1+10=11$
|
||||||
|
which is smaller than $x$.
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\fill[color=lightgray] (0,0) rectangle (1,1);
|
||||||
|
\fill[color=lightgray] (7,0) rectangle (8,1);
|
||||||
|
\draw (0,0) grid (8,1);
|
||||||
|
|
||||||
|
\node at (0.5,0.5) {$1$};
|
||||||
|
\node at (1.5,0.5) {$4$};
|
||||||
|
\node at (2.5,0.5) {$5$};
|
||||||
|
\node at (3.5,0.5) {$6$};
|
||||||
|
\node at (4.5,0.5) {$7$};
|
||||||
|
\node at (5.5,0.5) {$9$};
|
||||||
|
\node at (6.5,0.5) {$9$};
|
||||||
|
\node at (7.5,0.5) {$10$};
|
||||||
|
|
||||||
|
\draw[thick,->] (0.5,-0.7) -- (0.5,-0.1);
|
||||||
|
\draw[thick,->] (7.5,-0.7) -- (7.5,-0.1);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Then the left pointer moves one step to the right.
|
||||||
|
The right pointer moves three steps to the left,
|
||||||
|
and the sum becomes $4+7=11$.
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\fill[color=lightgray] (1,0) rectangle (2,1);
|
||||||
|
\fill[color=lightgray] (4,0) rectangle (5,1);
|
||||||
|
\draw (0,0) grid (8,1);
|
||||||
|
|
||||||
|
\node at (0.5,0.5) {$1$};
|
||||||
|
\node at (1.5,0.5) {$4$};
|
||||||
|
\node at (2.5,0.5) {$5$};
|
||||||
|
\node at (3.5,0.5) {$6$};
|
||||||
|
\node at (4.5,0.5) {$7$};
|
||||||
|
\node at (5.5,0.5) {$9$};
|
||||||
|
\node at (6.5,0.5) {$9$};
|
||||||
|
\node at (7.5,0.5) {$10$};
|
||||||
|
|
||||||
|
\draw[thick,->] (1.5,-0.7) -- (1.5,-0.1);
|
||||||
|
\draw[thick,->] (4.5,-0.7) -- (4.5,-0.1);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
After this, the left pointer moves one step to the right again.
|
||||||
|
The right pointer does not move, and a solution
|
||||||
|
$5+7=12$ has been found.
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\fill[color=lightgray] (2,0) rectangle (3,1);
|
||||||
|
\fill[color=lightgray] (4,0) rectangle (5,1);
|
||||||
|
\draw (0,0) grid (8,1);
|
||||||
|
|
||||||
|
\node at (0.5,0.5) {$1$};
|
||||||
|
\node at (1.5,0.5) {$4$};
|
||||||
|
\node at (2.5,0.5) {$5$};
|
||||||
|
\node at (3.5,0.5) {$6$};
|
||||||
|
\node at (4.5,0.5) {$7$};
|
||||||
|
\node at (5.5,0.5) {$9$};
|
||||||
|
\node at (6.5,0.5) {$9$};
|
||||||
|
\node at (7.5,0.5) {$10$};
|
||||||
|
|
||||||
|
\draw[thick,->] (2.5,-0.7) -- (2.5,-0.1);
|
||||||
|
\draw[thick,->] (4.5,-0.7) -- (4.5,-0.1);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
The running time of the algorithm is
|
||||||
|
$O(n \log n)$, because it first sorts
|
||||||
|
the array in $O(n \log n)$ time,
|
||||||
|
and then both pointers move $O(n)$ steps.
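For reference, a sketch of the solution (a hypothetical helper that
returns positions in the \emph{sorted} array, or $(-1,-1)$ if there is
no solution) could look as follows:
\begin{lstlisting}
// sketch: two pointers solution for the 2SUM problem
pair<int,int> twoSum(vector<int> a, long long x) {
    sort(a.begin(), a.end());
    int left = 0, right = (int)a.size()-1;
    while (left < right) {
        long long sum = (long long)a[left]+a[right];
        if (sum == x) return {left, right};
        if (sum < x) left++;   // the sum is too small
        else right--;          // the sum is too large
    }
    return {-1, -1};
}
\end{lstlisting}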
|
||||||
|
|
||||||
|
Note that it is possible to solve the problem
|
||||||
|
in another way in $O(n \log n)$ time using binary search.
|
||||||
|
In such a solution, we iterate through the array
|
||||||
|
and for each array value, we try to find another
|
||||||
|
value that yields the sum $x$.
|
||||||
|
This can be done by performing $n$ binary searches,
|
||||||
|
each of which takes $O(\log n)$ time.
|
||||||
|
|
||||||
|
\index{3SUM problem}
|
||||||
|
A more difficult problem is
|
||||||
|
the \key{3SUM problem} that asks to
|
||||||
|
find \emph{three} array values
|
||||||
|
whose sum is $x$.
|
||||||
|
Using the idea of the above algorithm,
|
||||||
|
this problem can be solved in $O(n^2)$ time\footnote{For a long time,
|
||||||
|
it was thought that solving
|
||||||
|
the 3SUM problem more efficiently than in $O(n^2)$ time
|
||||||
|
would not be possible.
|
||||||
|
However, in 2014, it turned out \cite{gro14}
|
||||||
|
that this is not the case.}.
|
||||||
|
Can you see how?
|
||||||
|
|
||||||
|
\section{Nearest smaller elements}
|
||||||
|
|
||||||
|
\index{nearest smaller elements}
|
||||||
|
|
||||||
|
Amortized analysis is often used to
|
||||||
|
estimate the number of operations
|
||||||
|
performed on a data structure.
|
||||||
|
The operations may be distributed unevenly so
|
||||||
|
that most operations occur during a
|
||||||
|
certain phase of the algorithm, but the total
|
||||||
|
number of operations is limited.
|
||||||
|
|
||||||
|
As an example, consider the problem
|
||||||
|
of finding for each array element
|
||||||
|
the \key{nearest smaller element}, i.e.,
|
||||||
|
the first smaller element that precedes the element
|
||||||
|
in the array.
|
||||||
|
It is possible that no such element exists,
|
||||||
|
in which case the algorithm should report this.
|
||||||
|
Next we will see how the problem can be
|
||||||
|
efficiently solved using a stack structure.
|
||||||
|
|
||||||
|
We go through the array from left to right
|
||||||
|
and maintain a stack of array elements.
|
||||||
|
At each array position, we remove elements from the stack
|
||||||
|
until the top element is smaller than the
|
||||||
|
current element, or the stack is empty.
|
||||||
|
Then, we report that the top element is
|
||||||
|
the nearest smaller element of the current element,
|
||||||
|
or if the stack is empty, there is no such element.
|
||||||
|
Finally, we add the current element to the stack.
|
||||||
|
|
||||||
|
As an example, consider the following array:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\draw (0,0) grid (8,1);
|
||||||
|
|
||||||
|
\node at (0.5,0.5) {$1$};
|
||||||
|
\node at (1.5,0.5) {$3$};
|
||||||
|
\node at (2.5,0.5) {$4$};
|
||||||
|
\node at (3.5,0.5) {$2$};
|
||||||
|
\node at (4.5,0.5) {$5$};
|
||||||
|
\node at (5.5,0.5) {$3$};
|
||||||
|
\node at (6.5,0.5) {$4$};
|
||||||
|
\node at (7.5,0.5) {$2$};
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
First, the elements 1, 3 and 4 are added to the stack,
|
||||||
|
because each element is larger than the previous element.
|
||||||
|
Thus, the nearest smaller element of 4 is 3,
|
||||||
|
and the nearest smaller element of 3 is 1.
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\fill[color=lightgray] (2,0) rectangle (3,1);
|
||||||
|
\draw (0,0) grid (8,1);
|
||||||
|
|
||||||
|
\node at (0.5,0.5) {$1$};
|
||||||
|
\node at (1.5,0.5) {$3$};
|
||||||
|
\node at (2.5,0.5) {$4$};
|
||||||
|
\node at (3.5,0.5) {$2$};
|
||||||
|
\node at (4.5,0.5) {$5$};
|
||||||
|
\node at (5.5,0.5) {$3$};
|
||||||
|
\node at (6.5,0.5) {$4$};
|
||||||
|
\node at (7.5,0.5) {$2$};
|
||||||
|
|
||||||
|
\draw (0.2,0.2-1.2) rectangle (0.8,0.8-1.2);
|
||||||
|
\draw (1.2,0.2-1.2) rectangle (1.8,0.8-1.2);
|
||||||
|
\draw (2.2,0.2-1.2) rectangle (2.8,0.8-1.2);
|
||||||
|
|
||||||
|
\node at (0.5,0.5-1.2) {$1$};
|
||||||
|
\node at (1.5,0.5-1.2) {$3$};
|
||||||
|
\node at (2.5,0.5-1.2) {$4$};
|
||||||
|
|
||||||
|
\draw[->,thick] (0.8,0.5-1.2) -- (1.2,0.5-1.2);
|
||||||
|
\draw[->,thick] (1.8,0.5-1.2) -- (2.2,0.5-1.2);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
The next element 2 is smaller than the two top
|
||||||
|
elements in the stack.
|
||||||
|
Thus, the elements 3 and 4 are removed from the stack,
|
||||||
|
and then the element 2 is added to the stack.
|
||||||
|
Its nearest smaller element is 1:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\fill[color=lightgray] (3,0) rectangle (4,1);
|
||||||
|
\draw (0,0) grid (8,1);
|
||||||
|
|
||||||
|
\node at (0.5,0.5) {$1$};
|
||||||
|
\node at (1.5,0.5) {$3$};
|
||||||
|
\node at (2.5,0.5) {$4$};
|
||||||
|
\node at (3.5,0.5) {$2$};
|
||||||
|
\node at (4.5,0.5) {$5$};
|
||||||
|
\node at (5.5,0.5) {$3$};
|
||||||
|
\node at (6.5,0.5) {$4$};
|
||||||
|
\node at (7.5,0.5) {$2$};
|
||||||
|
|
||||||
|
\draw (0.2,0.2-1.2) rectangle (0.8,0.8-1.2);
|
||||||
|
\draw (3.2,0.2-1.2) rectangle (3.8,0.8-1.2);
|
||||||
|
|
||||||
|
\node at (0.5,0.5-1.2) {$1$};
|
||||||
|
\node at (3.5,0.5-1.2) {$2$};
|
||||||
|
|
||||||
|
\draw[->,thick] (0.8,0.5-1.2) -- (3.2,0.5-1.2);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Then, the element 5 is larger than the element 2,
|
||||||
|
so it will be added to the stack, and
|
||||||
|
its nearest smaller element is 2:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\fill[color=lightgray] (4,0) rectangle (5,1);
|
||||||
|
\draw (0,0) grid (8,1);
|
||||||
|
|
||||||
|
\node at (0.5,0.5) {$1$};
|
||||||
|
\node at (1.5,0.5) {$3$};
|
||||||
|
\node at (2.5,0.5) {$4$};
|
||||||
|
\node at (3.5,0.5) {$2$};
|
||||||
|
\node at (4.5,0.5) {$5$};
|
||||||
|
\node at (5.5,0.5) {$3$};
|
||||||
|
\node at (6.5,0.5) {$4$};
|
||||||
|
\node at (7.5,0.5) {$2$};
|
||||||
|
|
||||||
|
\draw (0.2,0.2-1.2) rectangle (0.8,0.8-1.2);
|
||||||
|
\draw (3.2,0.2-1.2) rectangle (3.8,0.8-1.2);
|
||||||
|
\draw (4.2,0.2-1.2) rectangle (4.8,0.8-1.2);
|
||||||
|
|
||||||
|
\node at (0.5,0.5-1.2) {$1$};
|
||||||
|
\node at (3.5,0.5-1.2) {$2$};
|
||||||
|
\node at (4.5,0.5-1.2) {$5$};
|
||||||
|
|
||||||
|
\draw[->,thick] (0.8,0.5-1.2) -- (3.2,0.5-1.2);
|
||||||
|
\draw[->,thick] (3.8,0.5-1.2) -- (4.2,0.5-1.2);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
After this, the element 5 is removed from the stack
|
||||||
|
and the elements 3 and 4 are added to the stack:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\fill[color=lightgray] (6,0) rectangle (7,1);
|
||||||
|
\draw (0,0) grid (8,1);
|
||||||
|
|
||||||
|
\node at (0.5,0.5) {$1$};
|
||||||
|
\node at (1.5,0.5) {$3$};
|
||||||
|
\node at (2.5,0.5) {$4$};
|
||||||
|
\node at (3.5,0.5) {$2$};
|
||||||
|
\node at (4.5,0.5) {$5$};
|
||||||
|
\node at (5.5,0.5) {$3$};
|
||||||
|
\node at (6.5,0.5) {$4$};
|
||||||
|
\node at (7.5,0.5) {$2$};
|
||||||
|
|
||||||
|
\draw (0.2,0.2-1.2) rectangle (0.8,0.8-1.2);
|
||||||
|
\draw (3.2,0.2-1.2) rectangle (3.8,0.8-1.2);
|
||||||
|
\draw (5.2,0.2-1.2) rectangle (5.8,0.8-1.2);
|
||||||
|
\draw (6.2,0.2-1.2) rectangle (6.8,0.8-1.2);
|
||||||
|
|
||||||
|
\node at (0.5,0.5-1.2) {$1$};
|
||||||
|
\node at (3.5,0.5-1.2) {$2$};
|
||||||
|
\node at (5.5,0.5-1.2) {$3$};
|
||||||
|
\node at (6.5,0.5-1.2) {$4$};
|
||||||
|
|
||||||
|
\draw[->,thick] (0.8,0.5-1.2) -- (3.2,0.5-1.2);
|
||||||
|
\draw[->,thick] (3.8,0.5-1.2) -- (5.2,0.5-1.2);
|
||||||
|
\draw[->,thick] (5.8,0.5-1.2) -- (6.2,0.5-1.2);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Finally, all elements except 1 are removed
|
||||||
|
from the stack and the last element 2
|
||||||
|
is added to the stack:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\fill[color=lightgray] (7,0) rectangle (8,1);
|
||||||
|
\draw (0,0) grid (8,1);
|
||||||
|
|
||||||
|
\node at (0.5,0.5) {$1$};
|
||||||
|
\node at (1.5,0.5) {$3$};
|
||||||
|
\node at (2.5,0.5) {$4$};
|
||||||
|
\node at (3.5,0.5) {$2$};
|
||||||
|
\node at (4.5,0.5) {$5$};
|
||||||
|
\node at (5.5,0.5) {$3$};
|
||||||
|
\node at (6.5,0.5) {$4$};
|
||||||
|
\node at (7.5,0.5) {$2$};
|
||||||
|
|
||||||
|
\draw (0.2,0.2-1.2) rectangle (0.8,0.8-1.2);
|
||||||
|
\draw (7.2,0.2-1.2) rectangle (7.8,0.8-1.2);
|
||||||
|
|
||||||
|
\node at (0.5,0.5-1.2) {$1$};
|
||||||
|
\node at (7.5,0.5-1.2) {$2$};
|
||||||
|
|
||||||
|
\draw[->,thick] (0.8,0.5-1.2) -- (7.2,0.5-1.2);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
The efficiency of the algorithm depends on
|
||||||
|
the total number of stack operations.
|
||||||
|
If the current element is larger than
|
||||||
|
the top element in the stack, it is directly
|
||||||
|
added to the stack, which is efficient.
|
||||||
|
However, sometimes the stack can contain several
|
||||||
|
larger elements and it takes time to remove them.
|
||||||
|
Still, each element is added \emph{exactly once} to the stack
|
||||||
|
and removed \emph{at most once} from the stack.
|
||||||
|
Thus, each element causes $O(1)$ stack operations,
|
||||||
|
and the algorithm works in $O(n)$ time.
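The following sketch (not from the original text; it returns, for each
position, the \emph{index} of the nearest smaller element to the left,
or $-1$ if there is none) shows one way to implement the algorithm:
\begin{lstlisting}
// sketch: nearest smaller elements using a stack of indices
vector<int> nearestSmaller(const vector<int>& a) {
    int n = a.size();
    vector<int> res(n);
    stack<int> st; // indices whose values increase from bottom to top
    for (int i = 0; i < n; i++) {
        // remove elements that are not smaller than the current element
        while (!st.empty() && a[st.top()] >= a[i]) st.pop();
        res[i] = st.empty() ? -1 : st.top();
        st.push(i);
    }
    return res;
}
\end{lstlisting}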
|
||||||
|
|
||||||
|
\section{Sliding window minimum}
|
||||||
|
|
||||||
|
\index{sliding window}
|
||||||
|
\index{sliding window minimum}
|
||||||
|
|
||||||
|
A \key{sliding window} is a constant-size subarray
|
||||||
|
that moves from left to right through the array.
|
||||||
|
At each window position,
|
||||||
|
we want to calculate some information
|
||||||
|
about the elements inside the window.
|
||||||
|
In this section, we focus on the problem
|
||||||
|
of maintaining the \key{sliding window minimum},
|
||||||
|
which means that
|
||||||
|
we should report the smallest value inside each window.
|
||||||
|
|
||||||
|
The sliding window minimum can be calculated
|
||||||
|
using a similar idea that we used to calculate
|
||||||
|
the nearest smaller elements.
|
||||||
|
We maintain a queue
|
||||||
|
where each element is larger than
|
||||||
|
the previous element,
|
||||||
|
and the first element
|
||||||
|
always corresponds to the minimum element inside the window.
|
||||||
|
After each window move,
|
||||||
|
we remove elements from the end of the queue
|
||||||
|
until the last queue element
|
||||||
|
is smaller than the new window element,
|
||||||
|
or the queue becomes empty.
|
||||||
|
We also remove the first queue element
|
||||||
|
if it is not inside the window anymore.
|
||||||
|
Finally, we add the new window element
|
||||||
|
to the end of the queue.
|
||||||
|
|
||||||
|
As an example, consider the following array:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\draw (0,0) grid (8,1);
|
||||||
|
|
||||||
|
\node at (0.5,0.5) {$2$};
|
||||||
|
\node at (1.5,0.5) {$1$};
|
||||||
|
\node at (2.5,0.5) {$4$};
|
||||||
|
\node at (3.5,0.5) {$5$};
|
||||||
|
\node at (4.5,0.5) {$3$};
|
||||||
|
\node at (5.5,0.5) {$4$};
|
||||||
|
\node at (6.5,0.5) {$1$};
|
||||||
|
\node at (7.5,0.5) {$2$};
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Suppose that the size of the sliding window is 4.
|
||||||
|
At the first window position, the smallest value is 1:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\fill[color=lightgray] (0,0) rectangle (4,1);
|
||||||
|
\draw (0,0) grid (8,1);
|
||||||
|
|
||||||
|
\node at (0.5,0.5) {$2$};
|
||||||
|
\node at (1.5,0.5) {$1$};
|
||||||
|
\node at (2.5,0.5) {$4$};
|
||||||
|
\node at (3.5,0.5) {$5$};
|
||||||
|
\node at (4.5,0.5) {$3$};
|
||||||
|
\node at (5.5,0.5) {$4$};
|
||||||
|
\node at (6.5,0.5) {$1$};
|
||||||
|
\node at (7.5,0.5) {$2$};
|
||||||
|
|
||||||
|
\draw (1.2,0.2-1.2) rectangle (1.8,0.8-1.2);
|
||||||
|
\draw (2.2,0.2-1.2) rectangle (2.8,0.8-1.2);
|
||||||
|
\draw (3.2,0.2-1.2) rectangle (3.8,0.8-1.2);
|
||||||
|
|
||||||
|
\node at (1.5,0.5-1.2) {$1$};
|
||||||
|
\node at (2.5,0.5-1.2) {$4$};
|
||||||
|
\node at (3.5,0.5-1.2) {$5$};
|
||||||
|
|
||||||
|
\draw[->,thick] (1.8,0.5-1.2) -- (2.2,0.5-1.2);
|
||||||
|
\draw[->,thick] (2.8,0.5-1.2) -- (3.2,0.5-1.2);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Then the window moves one step right.
|
||||||
|
The new element 3 is smaller than the elements
|
||||||
|
4 and 5 in the queue, so the elements 4 and 5
|
||||||
|
are removed from the queue
|
||||||
|
and the element 3 is added to the queue.
|
||||||
|
The smallest value is still 1.
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\fill[color=lightgray] (1,0) rectangle (5,1);
|
||||||
|
\draw (0,0) grid (8,1);
|
||||||
|
|
||||||
|
\node at (0.5,0.5) {$2$};
|
||||||
|
\node at (1.5,0.5) {$1$};
|
||||||
|
\node at (2.5,0.5) {$4$};
|
||||||
|
\node at (3.5,0.5) {$5$};
|
||||||
|
\node at (4.5,0.5) {$3$};
|
||||||
|
\node at (5.5,0.5) {$4$};
|
||||||
|
\node at (6.5,0.5) {$1$};
|
||||||
|
\node at (7.5,0.5) {$2$};
|
||||||
|
|
||||||
|
\draw (1.2,0.2-1.2) rectangle (1.8,0.8-1.2);
|
||||||
|
\draw (4.2,0.2-1.2) rectangle (4.8,0.8-1.2);
|
||||||
|
|
||||||
|
\node at (1.5,0.5-1.2) {$1$};
|
||||||
|
\node at (4.5,0.5-1.2) {$3$};
|
||||||
|
|
||||||
|
\draw[->,thick] (1.8,0.5-1.2) -- (4.2,0.5-1.2);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
After this, the window moves again,
|
||||||
|
and the smallest element 1
|
||||||
|
does not belong to the window anymore.
|
||||||
|
Thus, it is removed from the queue and the smallest
|
||||||
|
value is now 3. Also the new element 4
|
||||||
|
is added to the queue.
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\fill[color=lightgray] (2,0) rectangle (6,1);
|
||||||
|
\draw (0,0) grid (8,1);
|
||||||
|
|
||||||
|
\node at (0.5,0.5) {$2$};
|
||||||
|
\node at (1.5,0.5) {$1$};
|
||||||
|
\node at (2.5,0.5) {$4$};
|
||||||
|
\node at (3.5,0.5) {$5$};
|
||||||
|
\node at (4.5,0.5) {$3$};
|
||||||
|
\node at (5.5,0.5) {$4$};
|
||||||
|
\node at (6.5,0.5) {$1$};
|
||||||
|
\node at (7.5,0.5) {$2$};
|
||||||
|
|
||||||
|
\draw (4.2,0.2-1.2) rectangle (4.8,0.8-1.2);
|
||||||
|
\draw (5.2,0.2-1.2) rectangle (5.8,0.8-1.2);
|
||||||
|
|
||||||
|
\node at (4.5,0.5-1.2) {$3$};
|
||||||
|
\node at (5.5,0.5-1.2) {$4$};
|
||||||
|
|
||||||
|
\draw[->,thick] (4.8,0.5-1.2) -- (5.2,0.5-1.2);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
The next new element 1 is smaller than all elements
|
||||||
|
in the queue.
|
||||||
|
Thus, all elements are removed from the queue
|
||||||
|
and it will only contain the element 1:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\fill[color=lightgray] (3,0) rectangle (7,1);
|
||||||
|
\draw (0,0) grid (8,1);
|
||||||
|
|
||||||
|
\node at (0.5,0.5) {$2$};
|
||||||
|
\node at (1.5,0.5) {$1$};
|
||||||
|
\node at (2.5,0.5) {$4$};
|
||||||
|
\node at (3.5,0.5) {$5$};
|
||||||
|
\node at (4.5,0.5) {$3$};
|
||||||
|
\node at (5.5,0.5) {$4$};
|
||||||
|
\node at (6.5,0.5) {$1$};
|
||||||
|
\node at (7.5,0.5) {$2$};
|
||||||
|
|
||||||
|
\draw (6.2,0.2-1.2) rectangle (6.8,0.8-1.2);
|
||||||
|
|
||||||
|
\node at (6.5,0.5-1.2) {$1$};
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Finally the window reaches its last position.
|
||||||
|
The element 2 is added to the queue,
|
||||||
|
but the smallest value inside the window
|
||||||
|
is still 1.
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\fill[color=lightgray] (4,0) rectangle (8,1);
|
||||||
|
\draw (0,0) grid (8,1);
|
||||||
|
|
||||||
|
\node at (0.5,0.5) {$2$};
|
||||||
|
\node at (1.5,0.5) {$1$};
|
||||||
|
\node at (2.5,0.5) {$4$};
|
||||||
|
\node at (3.5,0.5) {$5$};
|
||||||
|
\node at (4.5,0.5) {$3$};
|
||||||
|
\node at (5.5,0.5) {$4$};
|
||||||
|
\node at (6.5,0.5) {$1$};
|
||||||
|
\node at (7.5,0.5) {$2$};
|
||||||
|
|
||||||
|
\draw (6.2,0.2-1.2) rectangle (6.8,0.8-1.2);
|
||||||
|
\draw (7.2,0.2-1.2) rectangle (7.8,0.8-1.2);
|
||||||
|
|
||||||
|
\node at (6.5,0.5-1.2) {$1$};
|
||||||
|
\node at (7.5,0.5-1.2) {$2$};
|
||||||
|
|
||||||
|
\draw[->,thick] (6.8,0.5-1.2) -- (7.2,0.5-1.2);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Since each array element
|
||||||
|
is added to the queue exactly once and
|
||||||
|
removed from the queue at most once,
|
||||||
|
the algorithm works in $O(n)$ time.
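A sketch of the algorithm (a hypothetical helper; it assumes $k \le n$
and returns the minimum of every window of size $k$) can be written with
a deque of indices:
\begin{lstlisting}
// sketch: sliding window minimum using a deque of indices
vector<int> slidingMin(const vector<int>& a, int k) {
    int n = a.size();
    deque<int> q; // indices whose values increase from front to back
    vector<int> res;
    for (int i = 0; i < n; i++) {
        // remove elements that are not smaller than the new element
        while (!q.empty() && a[q.back()] >= a[i]) q.pop_back();
        q.push_back(i);
        // remove the front element if it is outside the window
        if (q.front() <= i-k) q.pop_front();
        if (i >= k-1) res.push_back(a[q.front()]);
    }
    return res;
}
\end{lstlisting}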
|
||||||
|
|
||||||
|
|
||||||
|
|
File diff suppressed because it is too large
|
@ -0,0 +1,849 @@
|
||||||
|
\chapter{Bit manipulation}
|
||||||
|
|
||||||
|
All data in computer programs is internally stored as bits,
|
||||||
|
i.e., as numbers 0 and 1.
|
||||||
|
This chapter discusses the bit representation
|
||||||
|
of integers, and shows examples
|
||||||
|
of how to use bit operations.
|
||||||
|
It turns out that there are many uses for
|
||||||
|
bit manipulation in algorithm programming.
|
||||||
|
|
||||||
|
\section{Bit representation}
|
||||||
|
|
||||||
|
\index{bit representation}
|
||||||
|
|
||||||
|
In programming, an $n$ bit integer is internally
|
||||||
|
stored as a binary number that consists of $n$ bits.
|
||||||
|
For example, the C++ type \texttt{int} is
|
||||||
|
a 32-bit type, which means that every \texttt{int}
|
||||||
|
number consists of 32 bits.
|
||||||
|
|
||||||
|
Here is the bit representation of
|
||||||
|
the \texttt{int} number 43:
|
||||||
|
\[00000000000000000000000000101011\]
|
||||||
|
The bits in the representation are indexed from right to left.
|
||||||
|
To convert a bit representation $b_k \cdots b_2 b_1 b_0$ into a number,
|
||||||
|
we can use the formula
|
||||||
|
\[b_k 2^k + \ldots + b_2 2^2 + b_1 2^1 + b_0 2^0.\]
|
||||||
|
For example,
|
||||||
|
\[1 \cdot 2^5 + 1 \cdot 2^3 + 1 \cdot 2^1 + 1 \cdot 2^0 = 43.\]
|
||||||
|
|
||||||
|
The bit representation of a number is either
|
||||||
|
\key{signed} or \key{unsigned}.
|
||||||
|
Usually a signed representation is used,
|
||||||
|
which means that both negative and positive
|
||||||
|
numbers can be represented.
|
||||||
|
A signed variable of $n$ bits can contain any
|
||||||
|
integer between $-2^{n-1}$ and $2^{n-1}-1$.
|
||||||
|
For example, the \texttt{int} type in C++ is
|
||||||
|
a signed type, so an \texttt{int} variable can contain any
|
||||||
|
integer between $-2^{31}$ and $2^{31}-1$.
|
||||||
|
|
||||||
|
The first bit in a signed representation
|
||||||
|
is the sign of the number (0 for nonnegative numbers
|
||||||
|
and 1 for negative numbers), and
|
||||||
|
the remaining $n-1$ bits contain the magnitude of the number.
|
||||||
|
\key{Two's complement} is used, which means that the
|
||||||
|
opposite number of a number is calculated by first
|
||||||
|
inverting all the bits in the number,
|
||||||
|
and then increasing the number by one.
|
||||||
|
|
||||||
|
For example, the bit representation of
|
||||||
|
the \texttt{int} number $-43$ is
|
||||||
|
\[11111111111111111111111111010101.\]
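As a small check of this rule (not in the original text), the opposite
number can be computed directly with the operations described above:
\begin{lstlisting}
int x = 43;
int y = ~x+1;        // invert all bits, then increase by one
cout << y << "\n";   // -43
\end{lstlisting}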
|
||||||
|
|
||||||
|
In an unsigned representation, only nonnegative
|
||||||
|
numbers can be used, but the upper bound for the values is larger.
|
||||||
|
An unsigned variable of $n$ bits can contain any
|
||||||
|
integer between $0$ and $2^n-1$.
|
||||||
|
For example, in C++, an \texttt{unsigned int} variable
|
||||||
|
can contain any integer between $0$ and $2^{32}-1$.
|
||||||
|
|
||||||
|
There is a connection between the
|
||||||
|
representations:
|
||||||
|
a signed number $-x$ equals an unsigned number $2^n-x$.
|
||||||
|
For example, the following code shows that
|
||||||
|
the signed number $x=-43$ equals the unsigned
|
||||||
|
number $y=2^{32}-43$:
|
||||||
|
\begin{lstlisting}
|
||||||
|
int x = -43;
|
||||||
|
unsigned int y = x;
|
||||||
|
cout << x << "\n"; // -43
|
||||||
|
cout << y << "\n"; // 4294967253
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
If a number is larger than the upper bound
|
||||||
|
of the bit representation, the number will overflow.
|
||||||
|
In a signed representation,
|
||||||
|
the next number after $2^{n-1}-1$ is $-2^{n-1}$,
|
||||||
|
and in an unsigned representation,
|
||||||
|
the next number after $2^n-1$ is $0$.
|
||||||
|
For example, consider the following code:
|
||||||
|
\begin{lstlisting}
|
||||||
|
int x = 2147483647;
|
||||||
|
cout << x << "\n"; // 2147483647
|
||||||
|
x++;
|
||||||
|
cout << x << "\n"; // -2147483648
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
Initially, the value of $x$ is $2^{31}-1$.
|
||||||
|
This is the largest value that can be stored
|
||||||
|
in an \texttt{int} variable,
|
||||||
|
so the next number after $2^{31}-1$ is $-2^{31}$.
|
||||||
|
|
||||||
|
|
||||||
|
\section{Bit operations}
|
||||||
|
|
||||||
|
\newcommand\XOR{\mathbin{\char`\^}}
|
||||||
|
|
||||||
|
\subsubsection{And operation}
|
||||||
|
|
||||||
|
\index{and operation}
|
||||||
|
|
||||||
|
The \key{and} operation $x$ \& $y$ produces a number
|
||||||
|
that has one bits in positions where both
|
||||||
|
$x$ and $y$ have one bits.
|
||||||
|
For example, $22$ \& $26$ = 18, because
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tabular}{rrr}
|
||||||
|
& 10110 & (22)\\
|
||||||
|
\& & 11010 & (26) \\
|
||||||
|
\hline
|
||||||
|
= & 10010 & (18) \\
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Using the and operation, we can check if a number
|
||||||
|
$x$ is even because
|
||||||
|
$x$ \& $1$ = 0 if $x$ is even, and
|
||||||
|
$x$ \& $1$ = 1 if $x$ is odd.
|
||||||
|
More generally, $x$ is divisible by $2^k$
|
||||||
|
exactly when $x$ \& $(2^k-1)$ = 0.
|
||||||
|
|
||||||
|
\subsubsection{Or operation}
|
||||||
|
|
||||||
|
\index{or operation}
|
||||||
|
|
||||||
|
The \key{or} operation $x$ | $y$ produces a number
|
||||||
|
that has one bits in positions where at least one
|
||||||
|
of $x$ and $y$ have one bits.
|
||||||
|
For example, $22$ | $26$ = 30, because
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tabular}{rrr}
|
||||||
|
& 10110 & (22)\\
|
||||||
|
| & 11010 & (26) \\
|
||||||
|
\hline
|
||||||
|
= & 11110 & (30) \\
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
\subsubsection{Xor operation}
|
||||||
|
|
||||||
|
\index{xor operation}
|
||||||
|
|
||||||
|
The \key{xor} operation $x$ $\XOR$ $y$ produces a number
|
||||||
|
that has one bits in positions where exactly one
|
||||||
|
of $x$ and $y$ have one bits.
|
||||||
|
For example, $22$ $\XOR$ $26$ = 12, because
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tabular}{rrr}
|
||||||
|
& 10110 & (22)\\
|
||||||
|
$\XOR$ & 11010 & (26) \\
|
||||||
|
\hline
|
||||||
|
= & 01100 & (12) \\
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
\subsubsection{Not operation}
|
||||||
|
|
||||||
|
\index{not operation}
|
||||||
|
|
||||||
|
The \key{not} operation \textasciitilde$x$
|
||||||
|
produces a number where all the bits of $x$
|
||||||
|
have been inverted.
|
||||||
|
The formula \textasciitilde$x = -x-1$ holds,
|
||||||
|
for example, \textasciitilde$29 = -30$.
|
||||||
|
|
||||||
|
The result of the not operation at the bit level
|
||||||
|
depends on the length of the bit representation,
|
||||||
|
because the operation inverts all bits.
|
||||||
|
For example, if the numbers are 32-bit
|
||||||
|
\texttt{int} numbers, the result is as follows:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tabular}{rrrr}
|
||||||
|
$x$ & = & 29 & 00000000000000000000000000011101 \\
|
||||||
|
\textasciitilde$x$ & = & $-30$ & 11111111111111111111111111100010 \\
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
\subsubsection{Bit shifts}
|
||||||
|
|
||||||
|
\index{bit shift}
|
||||||
|
|
||||||
|
The left bit shift $x < < k$ appends $k$
|
||||||
|
zero bits to the number,
|
||||||
|
and the right bit shift $x > > k$
|
||||||
|
removes the $k$ last bits from the number.
|
||||||
|
For example, $14 < < 2 = 56$,
|
||||||
|
because $14$ and $56$ correspond to 1110 and 111000.
|
||||||
|
Similarly, $49 > > 3 = 6$,
|
||||||
|
because $49$ and $6$ correspond to 110001 and 110.
|
||||||
|
|
||||||
|
Note that $x < < k$
|
||||||
|
corresponds to multiplying $x$ by $2^k$,
|
||||||
|
and $x > > k$
|
||||||
|
corresponds to dividing $x$ by $2^k$
|
||||||
|
rounded down to an integer.
|
||||||
|
|
||||||
|
\subsubsection{Applications}
|
||||||
|
|
||||||
|
A number of the form $1 < < k$ has a one bit
|
||||||
|
in position $k$ and all other bits are zero,
|
||||||
|
so we can use such numbers to access single bits of numbers.
|
||||||
|
In particular, the $k$th bit of a number is one
|
||||||
|
exactly when $x$ \& $(1 < < k)$ is not zero.
|
||||||
|
The following code prints the bit representation
|
||||||
|
of an \texttt{int} number $x$:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
for (int i = 31; i >= 0; i--) {
|
||||||
|
if (x&(1<<i)) cout << "1";
|
||||||
|
else cout << "0";
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
It is also possible to modify single bits
|
||||||
|
of numbers using similar ideas.
|
||||||
|
For example, the formula $x$ | $(1 < < k)$
|
||||||
|
sets the $k$th bit of $x$ to one,
|
||||||
|
the formula
|
||||||
|
$x$ \& \textasciitilde $(1 < < k)$
|
||||||
|
sets the $k$th bit of $x$ to zero,
|
||||||
|
and the formula
|
||||||
|
$x$ $\XOR$ $(1 < < k)$
|
||||||
|
inverts the $k$th bit of $x$.
|
||||||
|
|
||||||
|
The formula $x$ \& $(x-1)$ sets the last
|
||||||
|
one bit of $x$ to zero,
|
||||||
|
and the formula $x$ \& $-x$ sets all the
|
||||||
|
one bits to zero, except for the last one bit.
|
||||||
|
The formula $x$ | $(x-1)$
|
||||||
|
inverts all the bits after the last one bit.
|
||||||
|
Also note that a positive number $x$ is
|
||||||
|
a power of two exactly when $x$ \& $(x-1) = 0$.
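The following lines (an illustrative sketch, not from the text)
demonstrate these formulas for the value $x=52$,
whose bit representation is 110100:
\begin{lstlisting}
int x = 52;                          // 110100
cout << (x&(x-1)) << "\n";           // 48 = 110000
cout << (x&-x) << "\n";              // 4  = 000100
cout << (x|(x-1)) << "\n";           // 55 = 110111
cout << ((x&(x-1)) == 0) << "\n";    // 0: 52 is not a power of two
\end{lstlisting}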
|
||||||
|
|
||||||
|
\subsubsection*{Additional functions}
|
||||||
|
|
||||||
|
The g++ compiler provides the following
|
||||||
|
functions for counting bits:
|
||||||
|
|
||||||
|
\begin{itemize}
|
||||||
|
\item
|
||||||
|
$\texttt{\_\_builtin\_clz}(x)$:
|
||||||
|
the number of zeros at the beginning of the number
|
||||||
|
\item
|
||||||
|
$\texttt{\_\_builtin\_ctz}(x)$:
|
||||||
|
the number of zeros at the end of the number
|
||||||
|
\item
|
||||||
|
$\texttt{\_\_builtin\_popcount}(x)$:
|
||||||
|
the number of ones in the number
|
||||||
|
\item
|
||||||
|
$\texttt{\_\_builtin\_parity}(x)$:
|
||||||
|
the parity (even or odd) of the number of ones
|
||||||
|
\end{itemize}
|
||||||
|
\begin{samepage}
|
||||||
|
|
||||||
|
The functions can be used as follows:
|
||||||
|
\begin{lstlisting}
|
||||||
|
int x = 5328; // 00000000000000000001010011010000
|
||||||
|
cout << __builtin_clz(x) << "\n"; // 19
|
||||||
|
cout << __builtin_ctz(x) << "\n"; // 4
|
||||||
|
cout << __builtin_popcount(x) << "\n"; // 5
|
||||||
|
cout << __builtin_parity(x) << "\n"; // 1
|
||||||
|
\end{lstlisting}
|
||||||
|
\end{samepage}
|
||||||
|
|
||||||
|
While the above functions only support \texttt{int} numbers,
|
||||||
|
there are also \texttt{long long} versions of
|
||||||
|
the functions available with the suffix \texttt{ll}.
|
||||||
|
|
||||||
|
\section{Representing sets}
|
||||||
|
|
||||||
|
Every subset of a set
|
||||||
|
$\{0,1,2,\ldots,n-1\}$
|
||||||
|
can be represented as an $n$ bit integer
|
||||||
|
whose one bits indicate which
|
||||||
|
elements belong to the subset.
|
||||||
|
This is an efficient way to represent sets,
|
||||||
|
because every element requires only one bit of memory,
|
||||||
|
and set operations can be implemented as bit operations.
|
||||||
|
|
||||||
|
For example, since \texttt{int} is a 32-bit type,
|
||||||
|
an \texttt{int} number can represent any subset
|
||||||
|
of the set $\{0,1,2,\ldots,31\}$.
|
||||||
|
The bit representation of the set $\{1,3,4,8\}$ is
|
||||||
|
\[00000000000000000000000100011010,\]
|
||||||
|
which corresponds to the number $2^8+2^4+2^3+2^1=282$.
|
||||||
|
|
||||||
|
\subsubsection{Set implementation}
|
||||||
|
|
||||||
|
The following code declares an \texttt{int}
|
||||||
|
variable $x$ that can contain
|
||||||
|
a subset of $\{0,1,2,\ldots,31\}$.
|
||||||
|
After this, the code adds the elements 1, 3, 4 and 8
|
||||||
|
to the set and prints the size of the set.
|
||||||
|
\begin{lstlisting}
|
||||||
|
int x = 0;
|
||||||
|
x |= (1<<1);
|
||||||
|
x |= (1<<3);
|
||||||
|
x |= (1<<4);
|
||||||
|
x |= (1<<8);
|
||||||
|
cout << __builtin_popcount(x) << "\n"; // 4
|
||||||
|
\end{lstlisting}
|
||||||
|
Then, the following code prints all
|
||||||
|
elements that belong to the set:
|
||||||
|
\begin{lstlisting}
|
||||||
|
for (int i = 0; i < 32; i++) {
|
||||||
|
if (x&(1<<i)) cout << i << " ";
|
||||||
|
}
|
||||||
|
// output: 1 3 4 8
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
\subsubsection{Set operations}
|
||||||
|
|
||||||
|
Set operations can be implemented as follows as bit operations:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tabular}{lll}
|
||||||
|
& set syntax & bit syntax \\
|
||||||
|
\hline
|
||||||
|
intersection & $a \cap b$ & $a$ \& $b$ \\
|
||||||
|
union & $a \cup b$ & $a$ | $b$ \\
|
||||||
|
complement & $\bar a$ & \textasciitilde$a$ \\
|
||||||
|
difference & $a \setminus b$ & $a$ \& (\textasciitilde$b$) \\
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
For example, the following code first constructs
|
||||||
|
the sets $x=\{1,3,4,8\}$ and $y=\{3,6,8,9\}$,
|
||||||
|
and then constructs the set $z = x \cup y = \{1,3,4,6,8,9\}$:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
int x = (1<<1)|(1<<3)|(1<<4)|(1<<8);
|
||||||
|
int y = (1<<3)|(1<<6)|(1<<8)|(1<<9);
|
||||||
|
int z = x|y;
|
||||||
|
cout << __builtin_popcount(z) << "\n"; // 6
|
||||||
|
\end{lstlisting}
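
The other operations in the table can be used in the same way.
For example, the following sketch computes the sizes of the
intersection $x \cap y = \{3,8\}$ and the difference
$x \setminus y = \{1,4\}$ for the same sets:
\begin{lstlisting}
cout << __builtin_popcount(x&y) << "\n";    // 2
cout << __builtin_popcount(x&(~y)) << "\n"; // 2
\end{lstlisting}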
|
||||||
|
|
||||||
|
\subsubsection{Iterating through subsets}
|
||||||
|
|
||||||
|
The following code goes through
|
||||||
|
the subsets of $\{0,1,\ldots,n-1\}$:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
for (int b = 0; b < (1<<n); b++) {
|
||||||
|
// process subset b
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
The following code goes through
|
||||||
|
the subsets with exactly $k$ elements:
|
||||||
|
\begin{lstlisting}
|
||||||
|
for (int b = 0; b < (1<<n); b++) {
|
||||||
|
if (__builtin_popcount(b) == k) {
|
||||||
|
// process subset b
|
||||||
|
}
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
The following code goes through the subsets
|
||||||
|
of a set $x$:
|
||||||
|
\begin{lstlisting}
|
||||||
|
int b = 0;
|
||||||
|
do {
|
||||||
|
// process subset b
|
||||||
|
} while (b=(b-x)&x);
|
||||||
|
\end{lstlisting}
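
For example, if $x$ corresponds to the set $\{1,3\}$,
the loop processes the subsets in the order
$\emptyset$, $\{1\}$, $\{3\}$, $\{1,3\}$:
\begin{lstlisting}
int x = (1<<1)|(1<<3);
int b = 0;
do {
    cout << b << " "; // output: 0 2 8 10
} while (b=(b-x)&x);
\end{lstlisting}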
|
||||||
|
|
||||||
|
\section{Bit optimizations}
|
||||||
|
|
||||||
|
Many algorithms can be optimized using
|
||||||
|
bit operations.
|
||||||
|
Such optimizations do not change the
|
||||||
|
time complexity of the algorithm,
|
||||||
|
but they may have a large impact
|
||||||
|
on the actual running time of the code.
|
||||||
|
In this section we discuss examples
|
||||||
|
of such situations.
|
||||||
|
|
||||||
|
\subsubsection{Hamming distances}
|
||||||
|
|
||||||
|
\index{Hamming distance}
|
||||||
|
The \key{Hamming distance}
|
||||||
|
$\texttt{hamming}(a,b)$ between two
|
||||||
|
strings $a$ and $b$ of equal length is
|
||||||
|
the number of positions where the strings differ.
|
||||||
|
For example,
|
||||||
|
\[\texttt{hamming}(01101,11001)=2.\]
|
||||||
|
|
||||||
|
Consider the following problem: Given
|
||||||
|
a list of $n$ bit strings, each of length $k$,
|
||||||
|
calculate the minimum Hamming distance
|
||||||
|
between two strings in the list.
|
||||||
|
For example, the answer for $[00111,01101,11110]$
|
||||||
|
is 2, because
|
||||||
|
\begin{itemize}[noitemsep]
|
||||||
|
\item $\texttt{hamming}(00111,01101)=2$,
|
||||||
|
\item $\texttt{hamming}(00111,11110)=3$, and
|
||||||
|
\item $\texttt{hamming}(01101,11110)=3$.
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
A straightforward way to solve the problem is
|
||||||
|
to go through all pairs of strings and calculate
|
||||||
|
their Hamming distances,
|
||||||
|
which yields an $O(n^2 k)$ time algorithm.
|
||||||
|
The following function can be used to
|
||||||
|
calculate distances:
|
||||||
|
\begin{lstlisting}
|
||||||
|
int hamming(string a, string b) {
|
||||||
|
int d = 0;
|
||||||
|
for (int i = 0; i < k; i++) {
|
||||||
|
if (a[i] != b[i]) d++;
|
||||||
|
}
|
||||||
|
return d;
|
||||||
|
}
|
||||||
|
\end{lstlisting}
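
A full search over all pairs could then be sketched as follows,
assuming that the strings are stored in a vector \texttt{v}
(the name \texttt{v} is only used in this sketch):
\begin{lstlisting}
int best = k; // a distance is never larger than k
for (int i = 0; i < n; i++) {
    for (int j = i+1; j < n; j++) {
        best = min(best, hamming(v[i], v[j]));
    }
}
cout << best << "\n";
\end{lstlisting}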
|
||||||
|
|
||||||
|
However, if $k$ is small, we can optimize the code
|
||||||
|
by storing the bit strings as integers and
|
||||||
|
calculating the Hamming distances using bit operations.
|
||||||
|
In particular, if $k \le 32$, we can just store
|
||||||
|
the strings as \texttt{int} values and use the
|
||||||
|
following function to calculate distances:
|
||||||
|
\begin{lstlisting}
|
||||||
|
int hamming(int a, int b) {
|
||||||
|
return __builtin_popcount(a^b);
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
In the above function, the xor operation constructs
|
||||||
|
a bit string that has one bits in positions
|
||||||
|
where $a$ and $b$ differ.
|
||||||
|
Then, the number of one bits is calculated using
|
||||||
|
the \texttt{\_\_builtin\_popcount} function.
|
||||||
|
|
||||||
|
To compare the implementations, we generated
|
||||||
|
a list of 10000 random bit strings of length 30.
|
||||||
|
Using the first approach, the search took
|
||||||
|
13.5 seconds, and after the bit optimization,
|
||||||
|
it only took 0.5 seconds.
|
||||||
|
Thus, the bit optimized code was almost
|
||||||
|
30 times faster than the original code.
|
||||||
|
|
||||||
|
\subsubsection{Counting subgrids}
|
||||||
|
|
||||||
|
As another example, consider the
|
||||||
|
following problem:
|
||||||
|
Given an $n \times n$ grid where
|
||||||
|
each square is either black (1) or white (0),
|
||||||
|
calculate the number of subgrids
|
||||||
|
whose all corners are black.
|
||||||
|
For example, the grid
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.5]
|
||||||
|
\fill[black] (1,1) rectangle (2,2);
|
||||||
|
\fill[black] (1,4) rectangle (2,5);
|
||||||
|
\fill[black] (4,1) rectangle (5,2);
|
||||||
|
\fill[black] (4,4) rectangle (5,5);
|
||||||
|
\fill[black] (1,3) rectangle (2,4);
|
||||||
|
\fill[black] (2,3) rectangle (3,4);
|
||||||
|
\fill[black] (2,1) rectangle (3,2);
|
||||||
|
\fill[black] (0,2) rectangle (1,3);
|
||||||
|
\draw (0,0) grid (5,5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
contains two such subgrids:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.5]
|
||||||
|
\fill[black] (1,1) rectangle (2,2);
|
||||||
|
\fill[black] (1,4) rectangle (2,5);
|
||||||
|
\fill[black] (4,1) rectangle (5,2);
|
||||||
|
\fill[black] (4,4) rectangle (5,5);
|
||||||
|
\fill[black] (1,3) rectangle (2,4);
|
||||||
|
\fill[black] (2,3) rectangle (3,4);
|
||||||
|
\fill[black] (2,1) rectangle (3,2);
|
||||||
|
\fill[black] (0,2) rectangle (1,3);
|
||||||
|
\draw (0,0) grid (5,5);
|
||||||
|
|
||||||
|
\fill[black] (7+1,1) rectangle (7+2,2);
|
||||||
|
\fill[black] (7+1,4) rectangle (7+2,5);
|
||||||
|
\fill[black] (7+4,1) rectangle (7+5,2);
|
||||||
|
\fill[black] (7+4,4) rectangle (7+5,5);
|
||||||
|
\fill[black] (7+1,3) rectangle (7+2,4);
|
||||||
|
\fill[black] (7+2,3) rectangle (7+3,4);
|
||||||
|
\fill[black] (7+2,1) rectangle (7+3,2);
|
||||||
|
\fill[black] (7+0,2) rectangle (7+1,3);
|
||||||
|
\draw (7+0,0) grid (7+5,5);
|
||||||
|
|
||||||
|
\draw[color=red,line width=1mm] (1,1) rectangle (3,4);
|
||||||
|
\draw[color=red,line width=1mm] (7+1,1) rectangle (7+5,5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
There is an $O(n^3)$ time algorithm for solving the problem:
|
||||||
|
go through all $O(n^2)$ pairs of rows and for each pair
|
||||||
|
$(a,b)$ calculate the number of columns that contain a black
|
||||||
|
square in both rows in $O(n)$ time.
|
||||||
|
The following code assumes that $\texttt{color}[y][x]$
|
||||||
|
denotes the color in row $y$ and column $x$:
|
||||||
|
\begin{lstlisting}
|
||||||
|
int count = 0;
|
||||||
|
for (int i = 0; i < n; i++) {
|
||||||
|
if (color[a][i] == 1 && color[b][i] == 1) count++;
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
Then, those columns
|
||||||
|
account for $\texttt{count}(\texttt{count}-1)/2$ subgrids with black corners,
|
||||||
|
because we can choose any two of them to form a subgrid.
|
||||||
|
|
||||||
|
To optimize this algorithm, we divide the grid into blocks
|
||||||
|
of columns such that each block consists of $N$
|
||||||
|
consecutive columns. Then, each row is stored as
|
||||||
|
a list of $N$-bit numbers that describe the colors
|
||||||
|
of the squares. Now we can process $N$ columns at the same time
|
||||||
|
using bit operations. In the following code,
|
||||||
|
$\texttt{color}[y][k]$ represents
|
||||||
|
a block of $N$ colors as bits.
|
||||||
|
\begin{lstlisting}
|
||||||
|
int count = 0;
|
||||||
|
for (int i = 0; i <= n/N; i++) {
|
||||||
|
count += __builtin_popcount(color[a][i]&color[b][i]);
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
The resulting algorithm works in $O(n^3/N)$ time.
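
For completeness, the packed representation could be built,
for example, as follows; here \texttt{grid} is assumed to contain
the original colors and \texttt{color} is assumed to be
zero-initialized (both names are only used in this sketch):
\begin{lstlisting}
const int N = 32;
for (int y = 0; y < n; y++) {
    for (int x = 0; x < n; x++) {
        if (grid[y][x] == 1) color[y][x/N] |= 1<<(x%N);
    }
}
\end{lstlisting}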
|
||||||
|
|
||||||
|
We generated a random grid of size $2500 \times 2500$
|
||||||
|
and compared the original and bit optimized implementation.
|
||||||
|
While the original code took $29.6$ seconds,
|
||||||
|
the bit optimized version only took $3.1$ seconds
|
||||||
|
with $N=32$ (\texttt{int} numbers) and $1.7$ seconds
|
||||||
|
with $N=64$ (\texttt{long long} numbers).
|
||||||
|
|
||||||
|
\section{Dynamic programming}
|
||||||
|
|
||||||
|
Bit operations provide an efficient and convenient
|
||||||
|
way to implement dynamic programming algorithms
|
||||||
|
whose states contain subsets of elements,
|
||||||
|
because such states can be stored as integers.
|
||||||
|
Next we discuss examples of combining
|
||||||
|
bit operations and dynamic programming.
|
||||||
|
|
||||||
|
\subsubsection{Optimal selection}
|
||||||
|
|
||||||
|
As a first example, consider the following problem:
|
||||||
|
We are given the prices of $k$ products
|
||||||
|
over $n$ days, and we want to buy each product
|
||||||
|
exactly once.
|
||||||
|
However, we are allowed to buy at most one product
|
||||||
|
in a day.
|
||||||
|
What is the minimum total price?
|
||||||
|
For example, consider the following scenario ($k=3$ and $n=8$):
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=.65]
|
||||||
|
\draw (0, 0) grid (8,3);
|
||||||
|
\node at (-2.5,2.5) {product 0};
|
||||||
|
\node at (-2.5,1.5) {product 1};
|
||||||
|
\node at (-2.5,0.5) {product 2};
|
||||||
|
|
||||||
|
\foreach \x in {0,...,7}
|
||||||
|
{\node at (\x+0.5,3.5) {\x};}
|
||||||
|
\foreach \x/\v in {0/6,1/9,2/5,3/2,4/8,5/9,6/1,7/6}
|
||||||
|
{\node at (\x+0.5,2.5) {\v};}
|
||||||
|
\foreach \x/\v in {0/8,1/2,2/6,3/2,4/7,5/5,6/7,7/2}
|
||||||
|
{\node at (\x+0.5,1.5) {\v};}
|
||||||
|
\foreach \x/\v in {0/5,1/3,2/9,3/7,4/3,5/5,6/1,7/4}
|
||||||
|
{\node at (\x+0.5,0.5) {\v};}
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
In this scenario, the minimum total price is $5$:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=.65]
|
||||||
|
\fill [color=lightgray] (1, 1) rectangle (2, 2);
|
||||||
|
\fill [color=lightgray] (3, 2) rectangle (4, 3);
|
||||||
|
\fill [color=lightgray] (6, 0) rectangle (7, 1);
|
||||||
|
\draw (0, 0) grid (8,3);
|
||||||
|
\node at (-2.5,2.5) {product 0};
|
||||||
|
\node at (-2.5,1.5) {product 1};
|
||||||
|
\node at (-2.5,0.5) {product 2};
|
||||||
|
|
||||||
|
\foreach \x in {0,...,7}
|
||||||
|
{\node at (\x+0.5,3.5) {\x};}
|
||||||
|
\foreach \x/\v in {0/6,1/9,2/5,3/2,4/8,5/9,6/1,7/6}
|
||||||
|
{\node at (\x+0.5,2.5) {\v};}
|
||||||
|
\foreach \x/\v in {0/8,1/2,2/6,3/2,4/7,5/5,6/7,7/2}
|
||||||
|
{\node at (\x+0.5,1.5) {\v};}
|
||||||
|
\foreach \x/\v in {0/5,1/3,2/9,3/7,4/3,5/5,6/1,7/4}
|
||||||
|
{\node at (\x+0.5,0.5) {\v};}
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Let $\texttt{price}[x][d]$ denote the price of product $x$
|
||||||
|
on day $d$.
|
||||||
|
For example, in the above scenario $\texttt{price}[2][3] = 7$.
|
||||||
|
Then, let $\texttt{total}(S,d)$ denote the minimum total
|
||||||
|
price for buying a subset $S$ of products by day $d$.
|
||||||
|
Using this function, the solution to the problem is
|
||||||
|
$\texttt{total}(\{0 \ldots k-1\},n-1)$.
|
||||||
|
|
||||||
|
First, $\texttt{total}(\emptyset,d) = 0$,
|
||||||
|
because it does not cost anything to buy an empty set,
|
||||||
|
and $\texttt{total}(\{x\},0) = \texttt{price}[x][0]$,
|
||||||
|
because there is one way to buy one product on the first day.
|
||||||
|
Then, the following recurrence can be used:
|
||||||
|
\begin{equation*}
|
||||||
|
\begin{split}
|
||||||
|
\texttt{total}(S,d) = \min( & \texttt{total}(S,d-1), \\
|
||||||
|
& \min_{x \in S} (\texttt{total}(S \setminus x,d-1)+\texttt{price}[x][d]))
|
||||||
|
\end{split}
|
||||||
|
\end{equation*}
|
||||||
|
This means that we either do not buy any product on day $d$
|
||||||
|
or buy a product $x$ that belongs to $S$.
|
||||||
|
In the latter case, we remove $x$ from $S$ and add the
|
||||||
|
price of $x$ to the total price.
|
||||||
|
|
||||||
|
The next step is to calculate the values of the function
|
||||||
|
using dynamic programming.
|
||||||
|
To store the function values, we declare an array
|
||||||
|
\begin{lstlisting}
|
||||||
|
int total[1<<K][N];
|
||||||
|
\end{lstlisting}
|
||||||
|
where $K$ and $N$ are suitably large constants.
|
||||||
|
The first dimension of the array corresponds to a bit
|
||||||
|
representation of a subset.
|
||||||
|
|
||||||
|
First, the cases where $d=0$ can be processed as follows:
|
||||||
|
\begin{lstlisting}
|
||||||
|
for (int x = 0; x < k; x++) {
|
||||||
|
total[1<<x][0] = price[x][0];
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
Then, the recurrence translates into the following code:
|
||||||
|
\begin{lstlisting}
|
||||||
|
for (int d = 1; d < n; d++) {
|
||||||
|
for (int s = 0; s < (1<<k); s++) {
|
||||||
|
total[s][d] = total[s][d-1];
|
||||||
|
for (int x = 0; x < k; x++) {
|
||||||
|
if (s&(1<<x)) {
|
||||||
|
total[s][d] = min(total[s][d],
|
||||||
|
total[s^(1<<x)][d-1]+price[x][d]);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
The time complexity of the algorithm is $O(n 2^k k)$.
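
After the loops, the answer discussed above is the value stored
for the full set of products on the last day:
\begin{lstlisting}
cout << total[(1<<k)-1][n-1] << "\n"; // minimum total price
\end{lstlisting}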
|
||||||
|
|
||||||
|
\subsubsection{From permutations to subsets}
|
||||||
|
|
||||||
|
Using dynamic programming, it is often possible
|
||||||
|
to change an iteration over permutations into
|
||||||
|
an iteration over subsets\footnote{This technique was introduced in 1962
|
||||||
|
by M. Held and R. M. Karp \cite{hel62}.}.
|
||||||
|
The benefit of this is that
|
||||||
|
$n!$, the number of permutations,
|
||||||
|
is much larger than $2^n$, the number of subsets.
|
||||||
|
For example, if $n=20$, then
|
||||||
|
$n! \approx 2.4 \cdot 10^{18}$ and $2^n \approx 10^6$.
|
||||||
|
Thus, for certain values of $n$,
|
||||||
|
we can efficiently go through the subsets but not through the permutations.
|
||||||
|
|
||||||
|
As an example, consider the following problem:
|
||||||
|
There is an elevator with maximum weight $x$,
|
||||||
|
and $n$ people with known weights
|
||||||
|
who want to get from the ground floor
|
||||||
|
to the top floor.
|
||||||
|
What is the minimum number of rides needed
|
||||||
|
if the people enter the elevator in an optimal order?
|
||||||
|
|
||||||
|
For example, suppose that $x=10$, $n=5$
|
||||||
|
and the weights are as follows:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tabular}{ll}
|
||||||
|
person & weight \\
|
||||||
|
\hline
|
||||||
|
0 & 2 \\
|
||||||
|
1 & 3 \\
|
||||||
|
2 & 3 \\
|
||||||
|
3 & 5 \\
|
||||||
|
4 & 6 \\
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
|
||||||
|
In this case, the minimum number of rides is 2.
|
||||||
|
One optimal order is $\{0,2,3,1,4\}$,
|
||||||
|
which partitions the people into two rides:
|
||||||
|
first $\{0,2,3\}$ (total weight 10),
|
||||||
|
and then $\{1,4\}$ (total weight 9).
|
||||||
|
|
||||||
|
The problem can be easily solved in $O(n! n)$ time
|
||||||
|
by testing all possible permutations of $n$ people.
|
||||||
|
However, we can use dynamic programming to get
|
||||||
|
a more efficient $O(2^n n)$ time algorithm.
|
||||||
|
The idea is to calculate for each subset of people
|
||||||
|
two values: the minimum number of rides needed and
|
||||||
|
the minimum weight of people who ride in the last group.
|
||||||
|
|
||||||
|
Let $\texttt{weight}[p]$ denote the weight of
|
||||||
|
person $p$.
|
||||||
|
We define two functions:
|
||||||
|
$\texttt{rides}(S)$ is the minimum number of
|
||||||
|
rides for a subset $S$,
|
||||||
|
and $\texttt{last}(S)$ is the minimum weight
|
||||||
|
of the last ride.
|
||||||
|
For example, in the above scenario
|
||||||
|
\[ \texttt{rides}(\{1,3,4\})=2 \hspace{10px} \textrm{and}
|
||||||
|
\hspace{10px} \texttt{last}(\{1,3,4\})=5,\]
|
||||||
|
because the optimal rides are $\{1,4\}$ and $\{3\}$,
|
||||||
|
and the second ride has weight 5.
|
||||||
|
Of course, our final goal is to calculate the value
|
||||||
|
of $\texttt{rides}(\{0 \ldots n-1\})$.
|
||||||
|
|
||||||
|
We can calculate the values
|
||||||
|
of the functions recursively and then apply
|
||||||
|
dynamic programming.
|
||||||
|
The idea is to go through all people
|
||||||
|
who belong to $S$ and optimally
|
||||||
|
choose the last person $p$ who enters the elevator.
|
||||||
|
Each such choice yields a subproblem
|
||||||
|
for a smaller subset of people.
|
||||||
|
If $\texttt{last}(S \setminus p)+\texttt{weight}[p] \le x$,
|
||||||
|
we can add $p$ to the last ride.
|
||||||
|
Otherwise, we have to reserve a new ride
|
||||||
|
that initially only contains $p$.
|
||||||
|
|
||||||
|
To implement dynamic programming,
|
||||||
|
we declare an array
|
||||||
|
\begin{lstlisting}
|
||||||
|
pair<int,int> best[1<<N];
|
||||||
|
\end{lstlisting}
|
||||||
|
that contains for each subset $S$
|
||||||
|
a pair $(\texttt{rides}(S),\texttt{last}(S))$.
|
||||||
|
We set the value for an empty group as follows:
|
||||||
|
\begin{lstlisting}
|
||||||
|
best[0] = {1,0};
|
||||||
|
\end{lstlisting}
|
||||||
|
Then, we can fill the array as follows:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
for (int s = 1; s < (1<<n); s++) {
|
||||||
|
// initial value: n+1 rides are needed
|
||||||
|
best[s] = {n+1,0};
|
||||||
|
for (int p = 0; p < n; p++) {
|
||||||
|
if (s&(1<<p)) {
|
||||||
|
auto option = best[s^(1<<p)];
|
||||||
|
if (option.second+weight[p] <= x) {
|
||||||
|
// add p to an existing ride
|
||||||
|
option.second += weight[p];
|
||||||
|
} else {
|
||||||
|
// reserve a new ride for p
|
||||||
|
option.first++;
|
||||||
|
option.second = weight[p];
|
||||||
|
}
|
||||||
|
best[s] = min(best[s], option);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
Note that the above loop guarantees that
|
||||||
|
for any two subsets $S_1$ and $S_2$
|
||||||
|
such that $S_1 \subset S_2$, we process $S_1$ before $S_2$.
|
||||||
|
Thus, the dynamic programming values are calculated in the
|
||||||
|
correct order.
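
Finally, the minimum number of rides is the first component of the
pair stored for the full set of people:
\begin{lstlisting}
cout << best[(1<<n)-1].first << "\n"; // minimum number of rides
\end{lstlisting}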
|
||||||
|
|
||||||
|
\subsubsection{Counting subsets}
|
||||||
|
|
||||||
|
Our last problem in this chapter is as follows:
|
||||||
|
Let $X=\{0 \ldots n-1\}$, and each subset $S \subset X$
|
||||||
|
is assigned an integer $\texttt{value}[S]$.
|
||||||
|
Our task is to calculate for each $S$
|
||||||
|
\[\texttt{sum}(S) = \sum_{A \subset S} \texttt{value}[A],\]
|
||||||
|
i.e., the sum of values of subsets of $S$.
|
||||||
|
|
||||||
|
For example, suppose that $n=3$ and the values are as follows:
|
||||||
|
\begin{multicols}{2}
|
||||||
|
\begin{itemize}
|
||||||
|
\item $\texttt{value}[\emptyset] = 3$
|
||||||
|
\item $\texttt{value}[\{0\}] = 1$
|
||||||
|
\item $\texttt{value}[\{1\}] = 4$
|
||||||
|
\item $\texttt{value}[\{0,1\}] = 5$
|
||||||
|
\item $\texttt{value}[\{2\}] = 5$
|
||||||
|
\item $\texttt{value}[\{0,2\}] = 1$
|
||||||
|
\item $\texttt{value}[\{1,2\}] = 3$
|
||||||
|
\item $\texttt{value}[\{0,1,2\}] = 3$
|
||||||
|
\end{itemize}
|
||||||
|
\end{multicols}
|
||||||
|
In this case, for example,
|
||||||
|
\begin{equation*}
|
||||||
|
\begin{split}
|
||||||
|
\texttt{sum}(\{0,2\}) &= \texttt{value}[\emptyset]+\texttt{value}[\{0\}]+\texttt{value}[\{2\}]+\texttt{value}[\{0,2\}] \\
|
||||||
|
&= 3 + 1 + 5 + 1 = 10.
|
||||||
|
\end{split}
|
||||||
|
\end{equation*}
|
||||||
|
|
||||||
|
Because there are a total of $2^n$ subsets,
|
||||||
|
one possible solution is to go through all
|
||||||
|
pairs of subsets in $O(2^{2n})$ time.
|
||||||
|
However, using dynamic programming, we
|
||||||
|
can solve the problem in $O(2^n n)$ time.
|
||||||
|
The idea is to focus on sums where the
|
||||||
|
elements that may be removed from $S$ are restricted.
|
||||||
|
|
||||||
|
Let $\texttt{partial}(S,k)$ denote the sum of
|
||||||
|
values of subsets of $S$ with the restriction
|
||||||
|
that only elements $0 \ldots k$
|
||||||
|
may be removed from $S$.
|
||||||
|
For example,
|
||||||
|
\[\texttt{partial}(\{0,2\},1)=\texttt{value}[\{2\}]+\texttt{value}[\{0,2\}],\]
|
||||||
|
because we may only remove elements $0 \ldots 1$.
|
||||||
|
We can calculate values of \texttt{sum} using
|
||||||
|
values of \texttt{partial}, because
|
||||||
|
\[\texttt{sum}(S) = \texttt{partial}(S,n-1).\]
|
||||||
|
The base cases for the function are
|
||||||
|
\[\texttt{partial}(S,-1)=\texttt{value}[S],\]
|
||||||
|
because in this case no elements can be removed from $S$.
|
||||||
|
Then, in the general case we can use the following recurrence:
|
||||||
|
\begin{equation*}
|
||||||
|
\texttt{partial}(S,k) = \begin{cases}
|
||||||
|
\texttt{partial}(S,k-1) & k \notin S \\
|
||||||
|
\texttt{partial}(S,k-1) + \texttt{partial}(S \setminus \{k\},k-1) & k \in S
|
||||||
|
\end{cases}
|
||||||
|
\end{equation*}
|
||||||
|
Here we focus on the element $k$.
|
||||||
|
If $k \in S$, we have two options: we may either keep $k$ in $S$
|
||||||
|
or remove it from $S$.
|
||||||
|
|
||||||
|
There is a particularly clever way to implement the
|
||||||
|
calculation of sums. We can declare an array
|
||||||
|
\begin{lstlisting}
|
||||||
|
int sum[1<<N];
|
||||||
|
\end{lstlisting}
|
||||||
|
that will contain the sum of each subset.
|
||||||
|
The array is initialized as follows:
|
||||||
|
\begin{lstlisting}
|
||||||
|
for (int s = 0; s < (1<<n); s++) {
|
||||||
|
sum[s] = value[s];
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
Then, we can fill the array as follows:
|
||||||
|
\begin{lstlisting}
|
||||||
|
for (int k = 0; k < n; k++) {
|
||||||
|
for (int s = 0; s < (1<<n); s++) {
|
||||||
|
if (s&(1<<k)) sum[s] += sum[s^(1<<k)];
|
||||||
|
}
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
This code calculates the values of $\texttt{partial}(S,k)$
|
||||||
|
for $k=0 \ldots n-1$ into the array \texttt{sum}.
|
||||||
|
Since $\texttt{partial}(S,k)$ is always based on
|
||||||
|
$\texttt{partial}(S,k-1)$, we can reuse the array
|
||||||
|
\texttt{sum}, which yields a very efficient implementation.
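
As a quick check against the example above, after the loops the
entry of \texttt{sum} for the set $\{0,2\}$ should equal 10:
\begin{lstlisting}
cout << sum[(1<<0)|(1<<2)] << "\n"; // 10
\end{lstlisting}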
|
|
|
||||||
|
\chapter{Basics of graphs}
|
||||||
|
|
||||||
|
Many programming problems can be solved by
|
||||||
|
modeling the problem as a graph problem
|
||||||
|
and using an appropriate graph algorithm.
|
||||||
|
A typical example of a graph is a network
|
||||||
|
of roads and cities in a country.
|
||||||
|
Sometimes, though, the graph is hidden
|
||||||
|
in the problem and it may be difficult to detect it.
|
||||||
|
|
||||||
|
This part of the book discusses graph algorithms,
|
||||||
|
especially focusing on topics that
|
||||||
|
are important in competitive programming.
|
||||||
|
In this chapter, we go through concepts
|
||||||
|
related to graphs,
|
||||||
|
and study different ways to represent graphs in algorithms.
|
||||||
|
|
||||||
|
\section{Graph terminology}
|
||||||
|
|
||||||
|
\index{graph}
|
||||||
|
\index{node}
|
||||||
|
\index{edge}
|
||||||
|
|
||||||
|
A \key{graph} consists of \key{nodes}
|
||||||
|
and \key{edges}. In this book,
|
||||||
|
the variable $n$ denotes the number of nodes
|
||||||
|
in a graph, and the variable $m$ denotes
|
||||||
|
the number of edges.
|
||||||
|
The nodes are numbered
|
||||||
|
using integers $1,2,\ldots,n$.
|
||||||
|
|
||||||
|
For example, the following graph consists of 5 nodes and 7 edges:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (4,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (1,1) {$3$};
|
||||||
|
\node[draw, circle] (4) at (4,1) {$4$};
|
||||||
|
\node[draw, circle] (5) at (6,2) {$5$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (4) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
\index{path}
|
||||||
|
|
||||||
|
A \key{path} leads from node $a$ to node $b$
|
||||||
|
through edges of the graph.
|
||||||
|
The \key{length} of a path is the number of
|
||||||
|
edges in it.
|
||||||
|
For example, the above graph contains
|
||||||
|
a path $1 \rightarrow 3 \rightarrow 4 \rightarrow 5$
|
||||||
|
of length 3
|
||||||
|
from node 1 to node 5:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (4,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (1,1) {$3$};
|
||||||
|
\node[draw, circle] (4) at (4,1) {$4$};
|
||||||
|
\node[draw, circle] (5) at (6,2) {$5$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (4) -- (5);
|
||||||
|
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (1) -- (3);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (3) -- (4);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (4) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
\index{cycle}
|
||||||
|
|
||||||
|
A path is a \key{cycle} if the first and last
|
||||||
|
nodes are the same.
|
||||||
|
For example, the above graph contains
|
||||||
|
a cycle $1 \rightarrow 3 \rightarrow 4 \rightarrow 1$.
|
||||||
|
A path is \key{simple} if each node appears
|
||||||
|
at most once in the path.
|
||||||
|
|
||||||
|
|
||||||
|
%
|
||||||
|
% \begin{itemize}
|
||||||
|
% \item $1 \rightarrow 2 \rightarrow 5$ (length 2)
|
||||||
|
% \item $1 \rightarrow 4 \rightarrow 5$ (length 2)
|
||||||
|
% \item $1 \rightarrow 2 \rightarrow 4 \rightarrow 5$ (length 3)
|
||||||
|
% \item $1 \rightarrow 3 \rightarrow 4 \rightarrow 5$ (length 3)
|
||||||
|
% \item $1 \rightarrow 4 \rightarrow 2 \rightarrow 5$ (length 3)
|
||||||
|
% \item $1 \rightarrow 3 \rightarrow 4 \rightarrow 2 \rightarrow 5$ (length 4)
|
||||||
|
% \end{itemize}
|
||||||
|
|
||||||
|
\subsubsection{Connectivity}
|
||||||
|
|
||||||
|
\index{connected graph}
|
||||||
|
|
||||||
|
A graph is \key{connected} if there is a path
|
||||||
|
between any two nodes.
|
||||||
|
For example, the following graph is connected:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (4,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (1,1) {$3$};
|
||||||
|
\node[draw, circle] (4) at (4,1) {$4$};
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\path[draw,thick,-] (3) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (4);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
The following graph is not connected,
|
||||||
|
because it is not possible to get
|
||||||
|
from node 4 to any other node:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (4,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (1,1) {$3$};
|
||||||
|
\node[draw, circle] (4) at (4,1) {$4$};
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
\index{component}
|
||||||
|
|
||||||
|
The connected parts of a graph are
|
||||||
|
called its \key{components}.
|
||||||
|
For example, the following graph
|
||||||
|
contains three components:
|
||||||
|
$\{1,\,2,\,3\}$,
|
||||||
|
$\{4,\,5,\,6,\,7\}$ and
|
||||||
|
$\{8\}$.
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.8]
|
||||||
|
\node[draw, circle] (1) at (1,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (4,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (1,1) {$3$};
|
||||||
|
|
||||||
|
\node[draw, circle] (6) at (6,1) {$6$};
|
||||||
|
\node[draw, circle] (7) at (9,1) {$7$};
|
||||||
|
\node[draw, circle] (4) at (6,3) {$4$};
|
||||||
|
\node[draw, circle] (5) at (9,3) {$5$};
|
||||||
|
|
||||||
|
\node[draw, circle] (8) at (11,2) {$8$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (4) -- (5);
|
||||||
|
\path[draw,thick,-] (5) -- (7);
|
||||||
|
\path[draw,thick,-] (6) -- (7);
|
||||||
|
\path[draw,thick,-] (6) -- (4);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
\index{tree}
|
||||||
|
|
||||||
|
A \key{tree} is a connected graph
|
||||||
|
that consists of $n$ nodes and $n-1$ edges.
|
||||||
|
There is a unique path
|
||||||
|
between any two nodes of a tree.
|
||||||
|
For example, the following graph is a tree:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (4,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (1,1) {$3$};
|
||||||
|
\node[draw, circle] (4) at (4,1) {$4$};
|
||||||
|
\node[draw, circle] (5) at (6,2) {$5$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
%\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (2) -- (4);
|
||||||
|
%\path[draw,thick,-] (4) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
\subsubsection{Edge directions}
|
||||||
|
|
||||||
|
\index{directed graph}
|
||||||
|
|
||||||
|
A graph is \key{directed}
|
||||||
|
if the edges can be traversed
|
||||||
|
in one direction only.
|
||||||
|
For example, the following graph is directed:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (4,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (1,1) {$3$};
|
||||||
|
\node[draw, circle] (4) at (4,1) {$4$};
|
||||||
|
\node[draw, circle] (5) at (6,2) {$5$};
|
||||||
|
\path[draw,thick,->,>=latex] (1) -- (2);
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- (4);
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- (5);
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- (5);
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- (1);
|
||||||
|
\path[draw,thick,->,>=latex] (3) -- (1);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
The above graph contains
|
||||||
|
a path $3 \rightarrow 1 \rightarrow 2 \rightarrow 5$
|
||||||
|
from node $3$ to node $5$,
|
||||||
|
but there is no path from node $5$ to node $3$.
|
||||||
|
|
||||||
|
\subsubsection{Edge weights}
|
||||||
|
|
||||||
|
\index{weighted graph}
|
||||||
|
|
||||||
|
In a \key{weighted} graph, each edge is assigned
|
||||||
|
a \key{weight}.
|
||||||
|
The weights are often interpreted as edge lengths.
|
||||||
|
For example, the following graph is weighted:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (4,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (1,1) {$3$};
|
||||||
|
\node[draw, circle] (4) at (4,1) {$4$};
|
||||||
|
\node[draw, circle] (5) at (6,2) {$5$};
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:5] {} (2);
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=left:1] {} (3);
|
||||||
|
\path[draw,thick,-] (3) -- node[font=\small,label=below:7] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=above:7] {} (5);
|
||||||
|
\path[draw,thick,-] (4) -- node[font=\small,label=below:3] {} (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
The length of a path in a weighted graph
|
||||||
|
is the sum of the edge weights on the path.
|
||||||
|
For example, in the above graph,
|
||||||
|
the length of the path
|
||||||
|
$1 \rightarrow 2 \rightarrow 5$ is $12$,
|
||||||
|
and the length of the path
|
||||||
|
$1 \rightarrow 3 \rightarrow 4 \rightarrow 5$ is $11$.
|
||||||
|
The latter path is the \key{shortest} path from node $1$ to node $5$.
|
||||||
|
|
||||||
|
\subsubsection{Neighbors and degrees}
|
||||||
|
|
||||||
|
\index{neighbor}
|
||||||
|
\index{degree}
|
||||||
|
|
||||||
|
Two nodes are \key{neighbors} or \key{adjacent}
|
||||||
|
if there is an edge between them.
|
||||||
|
The \key{degree} of a node
|
||||||
|
is the number of its neighbors.
|
||||||
|
For example, in the following graph,
|
||||||
|
the neighbors of node 2 are 1, 4 and 5,
|
||||||
|
so its degree is 3.
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (4,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (1,1) {$3$};
|
||||||
|
\node[draw, circle] (4) at (4,1) {$4$};
|
||||||
|
\node[draw, circle] (5) at (6,2) {$5$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
%\path[draw,thick,-] (4) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
The sum of degrees in a graph is always $2m$,
|
||||||
|
where $m$ is the number of edges,
|
||||||
|
because each edge
|
||||||
|
increases the degree of exactly two nodes by one.
|
||||||
|
For this reason, the sum of degrees is always even.
|
||||||
|
|
||||||
|
\index{regular graph}
|
||||||
|
\index{complete graph}
|
||||||
|
|
||||||
|
A graph is \key{regular} if the
|
||||||
|
degree of every node is a constant $d$.
|
||||||
|
A graph is \key{complete} if the
|
||||||
|
degree of every node is $n-1$, i.e.,
|
||||||
|
the graph contains all possible edges
|
||||||
|
between the nodes.
|
||||||
|
|
||||||
|
\index{indegree}
|
||||||
|
\index{outdegree}
|
||||||
|
|
||||||
|
In a directed graph, the \key{indegree}
|
||||||
|
of a node is the number of edges
|
||||||
|
that end at the node,
|
||||||
|
and the \key{outdegree} of a node
|
||||||
|
is the number of edges that start at the node.
|
||||||
|
For example, in the following graph,
|
||||||
|
the indegree of node 2 is 2,
|
||||||
|
and the outdegree of node 2 is 1.
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (4,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (1,1) {$3$};
|
||||||
|
\node[draw, circle] (4) at (4,1) {$4$};
|
||||||
|
\node[draw, circle] (5) at (6,2) {$5$};
|
||||||
|
|
||||||
|
\path[draw,thick,->,>=latex] (1) -- (2);
|
||||||
|
\path[draw,thick,->,>=latex] (1) -- (3);
|
||||||
|
\path[draw,thick,->,>=latex] (1) -- (4);
|
||||||
|
\path[draw,thick,->,>=latex] (3) -- (4);
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- (4);
|
||||||
|
\path[draw,thick,<-,>=latex] (2) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
\subsubsection{Colorings}
|
||||||
|
|
||||||
|
\index{coloring}
|
||||||
|
\index{bipartite graph}
|
||||||
|
|
||||||
|
In a \key{coloring} of a graph,
|
||||||
|
each node is assigned a color so that
|
||||||
|
no adjacent nodes have the same color.
|
||||||
|
|
||||||
|
A graph is \key{bipartite} if
|
||||||
|
it is possible to color it using two colors.
|
||||||
|
It turns out that a graph is bipartite
|
||||||
|
exactly when it does not contain a cycle
|
||||||
|
with an odd number of edges.
|
||||||
|
For example, the graph
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,3) {$2$};
|
||||||
|
\node[draw, circle] (2) at (4,3) {$3$};
|
||||||
|
\node[draw, circle] (3) at (1,1) {$5$};
|
||||||
|
\node[draw, circle] (4) at (4,1) {$6$};
|
||||||
|
\node[draw, circle] (5) at (-2,1) {$4$};
|
||||||
|
\node[draw, circle] (6) at (-2,3) {$1$};
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (3) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (6);
|
||||||
|
\path[draw,thick,-] (5) -- (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
is bipartite, because it can be colored as follows:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle, fill=blue!40] (1) at (1,3) {$2$};
|
||||||
|
\node[draw, circle, fill=red!40] (2) at (4,3) {$3$};
|
||||||
|
\node[draw, circle, fill=red!40] (3) at (1,1) {$5$};
|
||||||
|
\node[draw, circle, fill=blue!40] (4) at (4,1) {$6$};
|
||||||
|
\node[draw, circle, fill=red!40] (5) at (-2,1) {$4$};
|
||||||
|
\node[draw, circle, fill=blue!40] (6) at (-2,3) {$1$};
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (3) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (6);
|
||||||
|
\path[draw,thick,-] (5) -- (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
However, the graph
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,3) {$2$};
|
||||||
|
\node[draw, circle] (2) at (4,3) {$3$};
|
||||||
|
\node[draw, circle] (3) at (1,1) {$5$};
|
||||||
|
\node[draw, circle] (4) at (4,1) {$6$};
|
||||||
|
\node[draw, circle] (5) at (-2,1) {$4$};
|
||||||
|
\node[draw, circle] (6) at (-2,3) {$1$};
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (3) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (6);
|
||||||
|
\path[draw,thick,-] (5) -- (6);
|
||||||
|
\path[draw,thick,-] (1) -- (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
is not bipartite, because it is not possible to color
|
||||||
|
the following cycle of three nodes using two colors:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,3) {$2$};
|
||||||
|
\node[draw, circle] (2) at (4,3) {$3$};
|
||||||
|
\node[draw, circle] (3) at (1,1) {$5$};
|
||||||
|
\node[draw, circle] (4) at (4,1) {$6$};
|
||||||
|
\node[draw, circle] (5) at (-2,1) {$4$};
|
||||||
|
\node[draw, circle] (6) at (-2,3) {$1$};
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (3) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (6);
|
||||||
|
\path[draw,thick,-] (5) -- (6);
|
||||||
|
\path[draw,thick,-] (1) -- (6);
|
||||||
|
|
||||||
|
\path[draw=red,thick,-,line width=2pt] (1) -- (3);
|
||||||
|
\path[draw=red,thick,-,line width=2pt] (3) -- (6);
|
||||||
|
\path[draw=red,thick,-,line width=2pt] (6) -- (1);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
\subsubsection{Simplicity}
|
||||||
|
|
||||||
|
\index{simple graph}
|
||||||
|
|
||||||
|
A graph is \key{simple}
|
||||||
|
if no edge starts and ends at the same node,
|
||||||
|
and there are no multiple
|
||||||
|
edges between two nodes.
|
||||||
|
Often we assume that graphs are simple.
|
||||||
|
For example, the following graph is \emph{not} simple:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,3) {$2$};
|
||||||
|
\node[draw, circle] (2) at (4,3) {$3$};
|
||||||
|
\node[draw, circle] (3) at (1,1) {$5$};
|
||||||
|
\node[draw, circle] (4) at (4,1) {$6$};
|
||||||
|
\node[draw, circle] (5) at (-2,1) {$4$};
|
||||||
|
\node[draw, circle] (6) at (-2,3) {$1$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) edge [bend right=20] (2);
|
||||||
|
\path[draw,thick,-] (2) edge [bend right=20] (1);
|
||||||
|
%\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (3) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (6);
|
||||||
|
\path[draw,thick,-] (5) -- (6);
|
||||||
|
|
||||||
|
\tikzset{every loop/.style={in=135,out=190}}
|
||||||
|
\path[draw,thick,-] (5) edge [loop left] (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
\section{Graph representation}
|
||||||
|
|
||||||
|
There are several ways to represent graphs
|
||||||
|
in algorithms.
|
||||||
|
The choice of a data structure
|
||||||
|
depends on the size of the graph and
|
||||||
|
the way the algorithm processes it.
|
||||||
|
Next we will go through three common representations.
|
||||||
|
|
||||||
|
\subsubsection{Adjacency list representation}
|
||||||
|
|
||||||
|
\index{adjacency list}
|
||||||
|
|
||||||
|
In the adjacency list representation,
|
||||||
|
each node $x$ in the graph is assigned an \key{adjacency list}
|
||||||
|
that consists of nodes
|
||||||
|
to which there is an edge from $x$.
|
||||||
|
Adjacency lists are the most popular
|
||||||
|
way to represent graphs, and most algorithms can be
|
||||||
|
efficiently implemented using them.
|
||||||
|
|
||||||
|
A convenient way to store the adjacency lists is to declare
|
||||||
|
an array of vectors as follows:
|
||||||
|
\begin{lstlisting}
|
||||||
|
vector<int> adj[N];
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
The constant $N$ is chosen so that all
|
||||||
|
adjacency lists can be stored.
|
||||||
|
For example, the graph
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,3) {$3$};
|
||||||
|
\node[draw, circle] (4) at (3,1) {$4$};
|
||||||
|
|
||||||
|
\path[draw,thick,->,>=latex] (1) -- (2);
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- (3);
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- (4);
|
||||||
|
\path[draw,thick,->,>=latex] (3) -- (4);
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- (1);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
can be stored as follows:
|
||||||
|
\begin{lstlisting}
|
||||||
|
adj[1].push_back(2);
|
||||||
|
adj[2].push_back(3);
|
||||||
|
adj[2].push_back(4);
|
||||||
|
adj[3].push_back(4);
|
||||||
|
adj[4].push_back(1);
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
If the graph is undirected, it can be stored in a similar way,
|
||||||
|
but each edge is added in both directions.
|
||||||
|
|
||||||
|
For a weighted graph, the structure can be extended
|
||||||
|
as follows:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
vector<pair<int,int>> adj[N];
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
In this case, the adjacency list of node $a$
|
||||||
|
contains the pair $(b,w)$
|
||||||
|
whenever there is an edge from node $a$ to node $b$
|
||||||
|
with weight $w$. For example, the graph
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,3) {$3$};
|
||||||
|
\node[draw, circle] (4) at (3,1) {$4$};
|
||||||
|
|
||||||
|
\path[draw,thick,->,>=latex] (1) -- node[font=\small,label=above:5] {} (2);
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=above:7] {} (3);
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=left:6] {} (4);
|
||||||
|
\path[draw,thick,->,>=latex] (3) -- node[font=\small,label=right:5] {} (4);
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- node[font=\small,label=left:2] {} (1);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
can be stored as follows:
|
||||||
|
\begin{lstlisting}
|
||||||
|
adj[1].push_back({2,5});
|
||||||
|
adj[2].push_back({3,7});
|
||||||
|
adj[2].push_back({4,6});
|
||||||
|
adj[3].push_back({4,5});
|
||||||
|
adj[4].push_back({1,2});
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
The benefit of using adjacency lists is that
|
||||||
|
we can efficiently find the nodes to which
|
||||||
|
we can move from a given node through an edge.
|
||||||
|
For example, the following loop goes through all nodes
|
||||||
|
to which we can move from node $s$:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
for (auto u : adj[s]) {
|
||||||
|
// process node u
|
||||||
|
}
|
||||||
|
\end{lstlisting}
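
In the weighted representation, each element of the adjacency list
is a pair, so a corresponding loop could be sketched as follows:
\begin{lstlisting}
for (auto u : adj[s]) {
    int b = u.first, w = u.second;
    // there is an edge from s to b with weight w
}
\end{lstlisting}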
|
||||||
|
|
||||||
|
\subsubsection{Adjacency matrix representation}
|
||||||
|
|
||||||
|
\index{adjacency matrix}
|
||||||
|
|
||||||
|
An \key{adjacency matrix} is a two-dimensional array
|
||||||
|
that indicates which edges the graph contains.
|
||||||
|
We can efficiently check from an adjacency matrix
|
||||||
|
if there is an edge between two nodes.
|
||||||
|
The matrix can be stored as an array
|
||||||
|
\begin{lstlisting}
|
||||||
|
int adj[N][N];
|
||||||
|
\end{lstlisting}
|
||||||
|
where each value $\texttt{adj}[a][b]$ indicates
|
||||||
|
whether the graph contains an edge from
|
||||||
|
node $a$ to node $b$.
|
||||||
|
If the edge is included in the graph,
|
||||||
|
then $\texttt{adj}[a][b]=1$,
|
||||||
|
and otherwise $\texttt{adj}[a][b]=0$.
|
||||||
|
For example, the graph
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,3) {$3$};
|
||||||
|
\node[draw, circle] (4) at (3,1) {$4$};
|
||||||
|
|
||||||
|
\path[draw,thick,->,>=latex] (1) -- (2);
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- (3);
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- (4);
|
||||||
|
\path[draw,thick,->,>=latex] (3) -- (4);
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- (1);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
can be represented as follows:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\draw (0,0) grid (4,4);
|
||||||
|
\node at (0.5,0.5) {1};
|
||||||
|
\node at (1.5,0.5) {0};
|
||||||
|
\node at (2.5,0.5) {0};
|
||||||
|
\node at (3.5,0.5) {0};
|
||||||
|
\node at (0.5,1.5) {0};
|
||||||
|
\node at (1.5,1.5) {0};
|
||||||
|
\node at (2.5,1.5) {0};
|
||||||
|
\node at (3.5,1.5) {1};
|
||||||
|
\node at (0.5,2.5) {0};
|
||||||
|
\node at (1.5,2.5) {0};
|
||||||
|
\node at (2.5,2.5) {1};
|
||||||
|
\node at (3.5,2.5) {1};
|
||||||
|
\node at (0.5,3.5) {0};
|
||||||
|
\node at (1.5,3.5) {1};
|
||||||
|
\node at (2.5,3.5) {0};
|
||||||
|
\node at (3.5,3.5) {0};
|
||||||
|
\node at (-0.5,0.5) {4};
|
||||||
|
\node at (-0.5,1.5) {3};
|
||||||
|
\node at (-0.5,2.5) {2};
|
||||||
|
\node at (-0.5,3.5) {1};
|
||||||
|
\node at (0.5,4.5) {1};
|
||||||
|
\node at (1.5,4.5) {2};
|
||||||
|
\node at (2.5,4.5) {3};
|
||||||
|
\node at (3.5,4.5) {4};
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
If the graph is weighted, the adjacency matrix
|
||||||
|
representation can be extended so that
|
||||||
|
the matrix contains the weight of the edge
|
||||||
|
if the edge exists.
|
||||||
|
Using this representation, the graph
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,3) {$3$};
|
||||||
|
\node[draw, circle] (4) at (3,1) {$4$};
|
||||||
|
|
||||||
|
\path[draw,thick,->,>=latex] (1) -- node[font=\small,label=above:5] {} (2);
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=above:7] {} (3);
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=left:6] {} (4);
|
||||||
|
\path[draw,thick,->,>=latex] (3) -- node[font=\small,label=right:5] {} (4);
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- node[font=\small,label=left:2] {} (1);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
\begin{samepage}
|
||||||
|
corresponds to the following matrix:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\draw (0,0) grid (4,4);
|
||||||
|
\node at (0.5,0.5) {2};
|
||||||
|
\node at (1.5,0.5) {0};
|
||||||
|
\node at (2.5,0.5) {0};
|
||||||
|
\node at (3.5,0.5) {0};
|
||||||
|
\node at (0.5,1.5) {0};
|
||||||
|
\node at (1.5,1.5) {0};
|
||||||
|
\node at (2.5,1.5) {0};
|
||||||
|
\node at (3.5,1.5) {5};
|
||||||
|
\node at (0.5,2.5) {0};
|
||||||
|
\node at (1.5,2.5) {0};
|
||||||
|
\node at (2.5,2.5) {7};
|
||||||
|
\node at (3.5,2.5) {6};
|
||||||
|
\node at (0.5,3.5) {0};
|
||||||
|
\node at (1.5,3.5) {5};
|
||||||
|
\node at (2.5,3.5) {0};
|
||||||
|
\node at (3.5,3.5) {0};
|
||||||
|
\node at (-0.5,0.5) {4};
|
||||||
|
\node at (-0.5,1.5) {3};
|
||||||
|
\node at (-0.5,2.5) {2};
|
||||||
|
\node at (-0.5,3.5) {1};
|
||||||
|
\node at (0.5,4.5) {1};
|
||||||
|
\node at (1.5,4.5) {2};
|
||||||
|
\node at (2.5,4.5) {3};
|
||||||
|
\node at (3.5,4.5) {4};
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
\end{samepage}
|
||||||
|
|
||||||
|
The drawback of the adjacency matrix representation
|
||||||
|
is that the matrix contains $n^2$ elements,
|
||||||
|
and usually most of them are zero.
|
||||||
|
For this reason, the representation cannot be used
|
||||||
|
if the graph is large.
|
||||||
|
|
||||||
|
\subsubsection{Edge list representation}
|
||||||
|
|
||||||
|
\index{edge list}
|
||||||
|
|
||||||
|
An \key{edge list} contains all edges of a graph
|
||||||
|
in some order.
|
||||||
|
This is a convenient way to represent a graph
|
||||||
|
if the algorithm processes all edges of the graph
|
||||||
|
and there is no need to find edges that start
|
||||||
|
at a given node.
|
||||||
|
|
||||||
|
The edge list can be stored in a vector
|
||||||
|
\begin{lstlisting}
|
||||||
|
vector<pair<int,int>> edges;
|
||||||
|
\end{lstlisting}
|
||||||
|
where each pair $(a,b)$ denotes that
|
||||||
|
there is an edge from node $a$ to node $b$.
|
||||||
|
Thus, the graph
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,3) {$3$};
|
||||||
|
\node[draw, circle] (4) at (3,1) {$4$};
|
||||||
|
|
||||||
|
\path[draw,thick,->,>=latex] (1) -- (2);
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- (3);
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- (4);
|
||||||
|
\path[draw,thick,->,>=latex] (3) -- (4);
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- (1);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
can be represented as follows:
|
||||||
|
\begin{lstlisting}
|
||||||
|
edges.push_back({1,2});
|
||||||
|
edges.push_back({2,3});
|
||||||
|
edges.push_back({2,4});
|
||||||
|
edges.push_back({3,4});
|
||||||
|
edges.push_back({4,1});
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
\noindent
|
||||||
|
If the graph is weighted, the structure can
|
||||||
|
be extended as follows:
|
||||||
|
\begin{lstlisting}
|
||||||
|
vector<tuple<int,int,int>> edges;
|
||||||
|
\end{lstlisting}
|
||||||
|
Each element in this list is of the
|
||||||
|
form $(a,b,w)$, which means that there
|
||||||
|
is an edge from node $a$ to node $b$ with weight $w$.
|
||||||
|
For example, the graph
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,3) {$3$};
|
||||||
|
\node[draw, circle] (4) at (3,1) {$4$};
|
||||||
|
|
||||||
|
\path[draw,thick,->,>=latex] (1) -- node[font=\small,label=above:5] {} (2);
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=above:7] {} (3);
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=left:6] {} (4);
|
||||||
|
\path[draw,thick,->,>=latex] (3) -- node[font=\small,label=right:5] {} (4);
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- node[font=\small,label=left:2] {} (1);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
\begin{samepage}
|
||||||
|
can be represented as follows\footnote{In some older compilers, the function
|
||||||
|
\texttt{make\_tuple} must be used instead of the braces (for example,
|
||||||
|
\texttt{make\_tuple(1,2,5)} instead of \texttt{\{1,2,5\}}).}:
|
||||||
|
\begin{lstlisting}
|
||||||
|
edges.push_back({1,2,5});
|
||||||
|
edges.push_back({2,3,7});
|
||||||
|
edges.push_back({2,4,6});
|
||||||
|
edges.push_back({3,4,5});
|
||||||
|
edges.push_back({4,1,2});
|
||||||
|
\end{lstlisting}
|
||||||
|
\end{samepage}
|
|
|
||||||
|
\chapter{Graph traversal}
|
||||||
|
|
||||||
|
This chapter discusses two fundamental
|
||||||
|
graph algorithms:
|
||||||
|
depth-first search and breadth-first search.
|
||||||
|
Both algorithms are given a starting
|
||||||
|
node in the graph,
|
||||||
|
and they visit all nodes that can be reached
|
||||||
|
from the starting node.
|
||||||
|
The difference in the algorithms is the order
|
||||||
|
in which they visit the nodes.
|
||||||
|
|
||||||
|
\section{Depth-first search}
|
||||||
|
|
||||||
|
\index{depth-first search}
|
||||||
|
|
||||||
|
\key{Depth-first search} (DFS)
|
||||||
|
is a straightforward graph traversal technique.
|
||||||
|
The algorithm begins at a starting node,
|
||||||
|
and proceeds to all other nodes that are
|
||||||
|
reachable from the starting node using
|
||||||
|
the edges of the graph.
|
||||||
|
|
||||||
|
Depth-first search always follows a single
|
||||||
|
path in the graph as long as it finds
|
||||||
|
new nodes.
|
||||||
|
After this, it returns to previous
|
||||||
|
nodes and begins to explore other parts of the graph.
|
||||||
|
The algorithm keeps track of visited nodes,
|
||||||
|
so that it processes each node only once.
|
||||||
|
|
||||||
|
\subsubsection*{Example}
|
||||||
|
|
||||||
|
Let us consider how depth-first search processes
|
||||||
|
the following graph:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\node[draw, circle] (1) at (1,5) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,5) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,4) {$3$};
|
||||||
|
\node[draw, circle] (4) at (1,3) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,3) {$5$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (5);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
We may begin the search at any node of the graph;
|
||||||
|
now we will begin the search at node 1.
|
||||||
|
|
||||||
|
The search first proceeds to node 2:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\node[draw, circle,fill=lightgray] (1) at (1,5) {$1$};
|
||||||
|
\node[draw, circle,fill=lightgray] (2) at (3,5) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,4) {$3$};
|
||||||
|
\node[draw, circle] (4) at (1,3) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,3) {$5$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (5);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (1) -- (2);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
After this, nodes 3 and 5 will be visited:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\node[draw, circle,fill=lightgray] (1) at (1,5) {$1$};
|
||||||
|
\node[draw, circle,fill=lightgray] (2) at (3,5) {$2$};
|
||||||
|
\node[draw, circle,fill=lightgray] (3) at (5,4) {$3$};
|
||||||
|
\node[draw, circle] (4) at (1,3) {$4$};
|
||||||
|
\node[draw, circle,fill=lightgray] (5) at (3,3) {$5$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (5);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (1) -- (2);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (2) -- (3);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (3) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
The neighbors of node 5 are 2 and 3,
|
||||||
|
but the search has already visited both of them,
|
||||||
|
so it is time to return to the previous nodes.
|
||||||
|
The neighbors of nodes 3 and 2 have also
been visited, so we next move
|
||||||
|
from node 1 to node 4:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\node[draw, circle,fill=lightgray] (1) at (1,5) {$1$};
|
||||||
|
\node[draw, circle,fill=lightgray] (2) at (3,5) {$2$};
|
||||||
|
\node[draw, circle,fill=lightgray] (3) at (5,4) {$3$};
|
||||||
|
\node[draw, circle,fill=lightgray] (4) at (1,3) {$4$};
|
||||||
|
\node[draw, circle,fill=lightgray] (5) at (3,3) {$5$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (5);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (1) -- (4);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
After this, the search terminates because it has visited
|
||||||
|
all nodes.
|
||||||
|
|
||||||
|
The time complexity of depth-first search is $O(n+m)$
|
||||||
|
where $n$ is the number of nodes and $m$ is the
|
||||||
|
number of edges,
|
||||||
|
because the algorithm processes each node and edge once.
|
||||||
|
|
||||||
|
\subsubsection*{Implementation}
|
||||||
|
|
||||||
|
Depth-first search can be conveniently
|
||||||
|
implemented using recursion.
|
||||||
|
The following function \texttt{dfs} begins
|
||||||
|
a depth-first search at a given node.
|
||||||
|
The function assumes that the graph is
|
||||||
|
stored as adjacency lists in an array
|
||||||
|
\begin{lstlisting}
|
||||||
|
vector<int> adj[N];
|
||||||
|
\end{lstlisting}
|
||||||
|
and also maintains an array
|
||||||
|
\begin{lstlisting}
|
||||||
|
bool visited[N];
|
||||||
|
\end{lstlisting}
|
||||||
|
that keeps track of the visited nodes.
|
||||||
|
Initially, each array value is \texttt{false},
|
||||||
|
and when the search arrives at node $s$,
|
||||||
|
the value of \texttt{visited}[$s$] becomes \texttt{true}.
|
||||||
|
The function can be implemented as follows:
|
||||||
|
\begin{lstlisting}
|
||||||
|
void dfs(int s) {
|
||||||
|
if (visited[s]) return;
|
||||||
|
visited[s] = true;
|
||||||
|
// process node s
|
||||||
|
for (auto u: adj[s]) {
|
||||||
|
dfs(u);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
\section{Breadth-first search}
|
||||||
|
|
||||||
|
\index{breadth-first search}
|
||||||
|
|
||||||
|
\key{Breadth-first search} (BFS) visits the nodes
|
||||||
|
in increasing order of their distance
|
||||||
|
from the starting node.
|
||||||
|
Thus, we can calculate the distance
|
||||||
|
from the starting node to all other
|
||||||
|
nodes using breadth-first search.
|
||||||
|
However, breadth-first search is more difficult
|
||||||
|
to implement than depth-first search.
|
||||||
|
|
||||||
|
Breadth-first search goes through the nodes
|
||||||
|
one level after another.
|
||||||
|
First the search explores the nodes whose
|
||||||
|
distance from the starting node is 1,
|
||||||
|
then the nodes whose distance is 2, and so on.
|
||||||
|
This process continues until all nodes
|
||||||
|
have been visited.
|
||||||
|
|
||||||
|
\subsubsection*{Example}
|
||||||
|
|
||||||
|
Let us consider how breadth-first search processes
|
||||||
|
the following graph:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\node[draw, circle] (1) at (1,5) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,5) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,5) {$3$};
|
||||||
|
\node[draw, circle] (4) at (1,3) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,3) {$5$};
|
||||||
|
\node[draw, circle] (6) at (5,3) {$6$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (6);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (5) -- (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
Suppose that the search begins at node 1.
|
||||||
|
First, we process all nodes that can be reached
|
||||||
|
from node 1 using a single edge:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\node[draw, circle,fill=lightgray] (1) at (1,5) {$1$};
|
||||||
|
\node[draw, circle,fill=lightgray] (2) at (3,5) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,5) {$3$};
|
||||||
|
\node[draw, circle,fill=lightgray] (4) at (1,3) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,3) {$5$};
|
||||||
|
\node[draw, circle] (6) at (5,3) {$6$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (6);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (5) -- (6);
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (1) -- (2);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (1) -- (4);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
After this, we proceed to nodes 3 and 5:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\node[draw, circle,fill=lightgray] (1) at (1,5) {$1$};
|
||||||
|
\node[draw, circle,fill=lightgray] (2) at (3,5) {$2$};
|
||||||
|
\node[draw, circle,fill=lightgray] (3) at (5,5) {$3$};
|
||||||
|
\node[draw, circle,fill=lightgray] (4) at (1,3) {$4$};
|
||||||
|
\node[draw, circle,fill=lightgray] (5) at (3,3) {$5$};
|
||||||
|
\node[draw, circle] (6) at (5,3) {$6$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (6);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (5) -- (6);
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (2) -- (3);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (2) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
Finally, we visit node 6:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\node[draw, circle,fill=lightgray] (1) at (1,5) {$1$};
|
||||||
|
\node[draw, circle,fill=lightgray] (2) at (3,5) {$2$};
|
||||||
|
\node[draw, circle,fill=lightgray] (3) at (5,5) {$3$};
|
||||||
|
\node[draw, circle,fill=lightgray] (4) at (1,3) {$4$};
|
||||||
|
\node[draw, circle,fill=lightgray] (5) at (3,3) {$5$};
|
||||||
|
\node[draw, circle,fill=lightgray] (6) at (5,3) {$6$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (6);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (5) -- (6);
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (3) -- (6);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (5) -- (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
Now we have calculated the distances
|
||||||
|
from the starting node to all nodes of the graph.
|
||||||
|
The distances are as follows:
|
||||||
|
|
||||||
|
\begin{tabular}{ll}
|
||||||
|
\\
|
||||||
|
node & distance \\
|
||||||
|
\hline
|
||||||
|
1 & 0 \\
|
||||||
|
2 & 1 \\
|
||||||
|
3 & 2 \\
|
||||||
|
4 & 1 \\
|
||||||
|
5 & 2 \\
|
||||||
|
6 & 3 \\
|
||||||
|
\\
|
||||||
|
\end{tabular}
|
||||||
|
|
||||||
|
Like in depth-first search,
|
||||||
|
the time complexity of breadth-first search
|
||||||
|
is $O(n+m)$, where $n$ is the number of nodes
|
||||||
|
and $m$ is the number of edges.
|
||||||
|
|
||||||
|
\subsubsection*{Implementation}
|
||||||
|
|
||||||
|
Breadth-first search is more difficult
|
||||||
|
to implement than depth-first search,
|
||||||
|
because the algorithm visits nodes
|
||||||
|
in different parts of the graph.
|
||||||
|
A typical implementation is based on
|
||||||
|
a queue that contains nodes.
|
||||||
|
At each step, the next node in the queue
|
||||||
|
will be processed.
|
||||||
|
|
||||||
|
The following code assumes that the graph is stored
|
||||||
|
as adjacency lists and maintains the following
|
||||||
|
data structures:
|
||||||
|
\begin{lstlisting}
|
||||||
|
queue<int> q;
|
||||||
|
bool visited[N];
|
||||||
|
int distance[N];
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
The queue \texttt{q}
|
||||||
|
contains nodes to be processed
|
||||||
|
in increasing order of their distance.
|
||||||
|
New nodes are always added to the end
|
||||||
|
of the queue, and the node at the beginning
|
||||||
|
of the queue is the next node to be processed.
|
||||||
|
The array \texttt{visited} indicates
|
||||||
|
which nodes the search has already visited,
|
||||||
|
and the array \texttt{distance} will contain the
|
||||||
|
distances from the starting node to all nodes of the graph.
|
||||||
|
|
||||||
|
The search can be implemented as follows,
|
||||||
|
starting at node $x$:
|
||||||
|
\begin{lstlisting}
|
||||||
|
visited[x] = true;
|
||||||
|
distance[x] = 0;
|
||||||
|
q.push(x);
|
||||||
|
while (!q.empty()) {
|
||||||
|
int s = q.front(); q.pop();
|
||||||
|
// process node s
|
||||||
|
for (auto u : adj[s]) {
|
||||||
|
if (visited[u]) continue;
|
||||||
|
visited[u] = true;
|
||||||
|
distance[u] = distance[s]+1;
|
||||||
|
q.push(u);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
\section{Applications}
|
||||||
|
|
||||||
|
Using the graph traversal algorithms,
|
||||||
|
we can check many properties of graphs.
|
||||||
|
Usually, both depth-first search and
|
||||||
|
breadth-first search may be used,
|
||||||
|
but in practice, depth-first search
|
||||||
|
is a better choice, because it is
|
||||||
|
easier to implement.
|
||||||
|
In the following applications we will
|
||||||
|
assume that the graph is undirected.
|
||||||
|
|
||||||
|
\subsubsection{Connectivity check}
|
||||||
|
|
||||||
|
\index{connected graph}
|
||||||
|
|
||||||
|
A graph is connected if there is a path
|
||||||
|
between any two nodes of the graph.
|
||||||
|
Thus, we can check if a graph is connected
|
||||||
|
by starting at an arbitrary node and
|
||||||
|
finding out if we can reach all other nodes.
|
||||||
|
|
||||||
|
For example, in the graph
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\node[draw, circle] (2) at (7,5) {$2$};
|
||||||
|
\node[draw, circle] (1) at (3,5) {$1$};
|
||||||
|
\node[draw, circle] (3) at (5,4) {$3$};
|
||||||
|
\node[draw, circle] (5) at (7,3) {$5$};
|
||||||
|
\node[draw, circle] (4) at (3,3) {$4$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
a depth-first search from node $1$ visits
|
||||||
|
the following nodes:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\node[draw, circle] (2) at (7,5) {$2$};
|
||||||
|
\node[draw, circle,fill=lightgray] (1) at (3,5) {$1$};
|
||||||
|
\node[draw, circle,fill=lightgray] (3) at (5,4) {$3$};
|
||||||
|
\node[draw, circle] (5) at (7,3) {$5$};
|
||||||
|
\node[draw, circle,fill=lightgray] (4) at (3,3) {$4$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (1) -- (3);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (3) -- (4);
|
||||||
|
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Since the search did not visit all the nodes,
|
||||||
|
we can conclude that the graph is not connected.
|
||||||
|
In a similar way, we can also find all connected components
|
||||||
|
of a graph by iterating through the nodes and always
|
||||||
|
starting a new depth-first search if the current node
|
||||||
|
does not belong to any component yet.
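One possible sketch of this idea, reusing the \texttt{adj} and \texttt{visited}
arrays defined earlier (the array \texttt{comp} and the function names are
only illustrative):
\begin{lstlisting}
int comp[N];        // component number of each node
int components = 0; // number of components found so far

void dfs2(int s, int c) {
    if (visited[s]) return;
    visited[s] = true;
    comp[s] = c;
    for (auto u : adj[s]) {
        dfs2(u, c);
    }
}

void findComponents() {
    for (int i = 1; i <= n; i++) {
        if (!visited[i]) {
            components++;
            dfs2(i, components);
        }
    }
}
\end{lstlisting}
After calling \texttt{findComponents}, two nodes belong to the same
component exactly when they have the same value in \texttt{comp}.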
|
||||||
|
|
||||||
|
\subsubsection{Finding cycles}
|
||||||
|
|
||||||
|
\index{cycle}
|
||||||
|
|
||||||
|
A graph contains a cycle if during a graph traversal,
|
||||||
|
we find a node whose neighbor (other than the
|
||||||
|
previous node in the current path) has already been
|
||||||
|
visited.
|
||||||
|
For example, the graph
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\node[draw, circle] (2) at (7,5) {$2$};
|
||||||
|
\node[draw, circle] (1) at (3,5) {$1$};
|
||||||
|
\node[draw, circle] (3) at (5,4) {$3$};
|
||||||
|
\node[draw, circle] (5) at (7,3) {$5$};
|
||||||
|
\node[draw, circle] (4) at (3,3) {$4$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\path[draw,thick,-] (3) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
contains two cycles and we can find one
|
||||||
|
of them as follows:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\node[draw, circle,fill=lightgray] (2) at (7,5) {$2$};
|
||||||
|
\node[draw, circle,fill=lightgray] (1) at (3,5) {$1$};
|
||||||
|
\node[draw, circle,fill=lightgray] (3) at (5,4) {$3$};
|
||||||
|
\node[draw, circle,fill=lightgray] (5) at (7,3) {$5$};
|
||||||
|
\node[draw, circle] (4) at (3,3) {$4$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\path[draw,thick,-] (3) -- (5);
|
||||||
|
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (1) -- (3);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (3) -- (2);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (2) -- (5);
|
||||||
|
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
After moving from node 2 to node 5 we notice that
|
||||||
|
the neighbor 3 of node 5 has already been visited.
|
||||||
|
Thus, the graph contains a cycle that goes through node 3,
|
||||||
|
for example, $3 \rightarrow 2 \rightarrow 5 \rightarrow 3$.
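One possible sketch of such a traversal for a simple undirected graph,
again using the \texttt{adj} and \texttt{visited} arrays
(the function name and the flag are only illustrative):
\begin{lstlisting}
bool cycle = false;

void dfsCycle(int s, int e) {
    visited[s] = true;
    for (auto u : adj[s]) {
        if (u == e) continue;         // skip the edge to the previous node
        if (visited[u]) cycle = true; // visited neighbor: a cycle exists
        else dfsCycle(u, s);
    }
}
\end{lstlisting}
Calling \texttt{dfsCycle($x$,0)} for an unvisited node $x$ sets
\texttt{cycle} to \texttt{true} exactly when the component of $x$
contains a cycle.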
|
||||||
|
|
||||||
|
Another way to find out whether a graph contains a cycle
|
||||||
|
is to simply calculate the number of nodes and edges
|
||||||
|
in every component.
|
||||||
|
If a component contains $c$ nodes and no cycle,
|
||||||
|
it must contain exactly $c-1$ edges
|
||||||
|
(so it has to be a tree).
|
||||||
|
If there are $c$ or more edges, the component
|
||||||
|
surely contains a cycle.
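As a sketch of this second idea, a depth-first search can count the nodes
and edges of a component (the names are only illustrative):
\begin{lstlisting}
int nodes = 0, edges = 0;

void dfsCount(int s) {
    if (visited[s]) return;
    visited[s] = true;
    nodes++;
    edges += adj[s].size(); // each edge of the component is counted twice
    for (auto u : adj[s]) dfsCount(u);
}
\end{lstlisting}
After calling \texttt{dfsCount($x$)} for an unvisited node $x$,
the component of $x$ contains a cycle exactly when
$\texttt{edges}/2 \ge \texttt{nodes}$.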
|
||||||
|
|
||||||
|
\subsubsection{Bipartiteness check}
|
||||||
|
|
||||||
|
\index{bipartite graph}
|
||||||
|
|
||||||
|
A graph is bipartite if its nodes can be colored
|
||||||
|
using two colors so that there are no adjacent
|
||||||
|
nodes with the same color.
|
||||||
|
It is surprisingly easy to check if a graph
|
||||||
|
is bipartite using graph traversal algorithms.
|
||||||
|
|
||||||
|
The idea is to color the starting node blue,
|
||||||
|
all its neighbors red, all their neighbors blue, and so on.
|
||||||
|
If at some point of the search we notice that
|
||||||
|
two adjacent nodes have the same color,
|
||||||
|
this means that the graph is not bipartite.
|
||||||
|
Otherwise the graph is bipartite and one coloring
|
||||||
|
has been found.
|
||||||
|
|
||||||
|
For example, the graph
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\node[draw, circle] (2) at (5,5) {$2$};
|
||||||
|
\node[draw, circle] (1) at (3,5) {$1$};
|
||||||
|
\node[draw, circle] (3) at (7,4) {$3$};
|
||||||
|
\node[draw, circle] (5) at (5,3) {$5$};
|
||||||
|
\node[draw, circle] (4) at (3,3) {$4$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (5) -- (4);
|
||||||
|
\path[draw,thick,-] (4) -- (1);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\path[draw,thick,-] (5) -- (3);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
is not bipartite, because a search from node 1
|
||||||
|
proceeds as follows:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\node[draw, circle,fill=red!40] (2) at (5,5) {$2$};
|
||||||
|
\node[draw, circle,fill=blue!40] (1) at (3,5) {$1$};
|
||||||
|
\node[draw, circle,fill=blue!40] (3) at (7,4) {$3$};
|
||||||
|
\node[draw, circle,fill=red!40] (5) at (5,3) {$5$};
|
||||||
|
\node[draw, circle] (4) at (3,3) {$4$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (5) -- (4);
|
||||||
|
\path[draw,thick,-] (4) -- (1);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\path[draw,thick,-] (5) -- (3);
|
||||||
|
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (1) -- (2);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (2) -- (3);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (3) -- (5);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (5) -- (2);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
We notice that the color of both nodes 2 and 5
|
||||||
|
is red, while they are adjacent nodes in the graph.
|
||||||
|
Thus, the graph is not bipartite.
|
||||||
|
|
||||||
|
This algorithm always works, because when there
|
||||||
|
are only two colors available,
|
||||||
|
the color of the starting node in a component
|
||||||
|
determines the colors of all other nodes in the component.
|
||||||
|
It does not make any difference whether the
|
||||||
|
starting node is red or blue.
|
||||||
|
|
||||||
|
Note that in the general case,
|
||||||
|
it is difficult to find out if the nodes
|
||||||
|
in a graph can be colored using $k$ colors
|
||||||
|
so that no adjacent nodes have the same color.
|
||||||
|
Even when $k=3$, no efficient algorithm is known;
in fact, the problem is NP-hard.
|
|
|
||||||
|
\chapter{Shortest paths}
|
||||||
|
|
||||||
|
\index{shortest path}
|
||||||
|
|
||||||
|
Finding a shortest path between two nodes
|
||||||
|
of a graph
|
||||||
|
is an important problem that has many
|
||||||
|
practical applications.
|
||||||
|
For example, a natural problem related to a road network
|
||||||
|
is to calculate the shortest possible length of a route
|
||||||
|
between two cities, given the lengths of the roads.
|
||||||
|
|
||||||
|
In an unweighted graph, the length of a path equals
|
||||||
|
the number of its edges, and we can
|
||||||
|
simply use breadth-first search to find
|
||||||
|
a shortest path.
|
||||||
|
However, in this chapter we focus on
|
||||||
|
weighted graphs
|
||||||
|
where more sophisticated algorithms
|
||||||
|
are needed
|
||||||
|
for finding shortest paths.
|
||||||
|
|
||||||
|
\section{Bellman–Ford algorithm}
|
||||||
|
|
||||||
|
\index{Bellman–Ford algorithm}
|
||||||
|
|
||||||
|
The \key{Bellman–Ford algorithm}\footnote{The algorithm is named after
|
||||||
|
R. E. Bellman and L. R. Ford who published it independently
|
||||||
|
in 1958 and 1956, respectively \cite{bel58,for56a}.} finds
|
||||||
|
shortest paths from a starting node to all
|
||||||
|
nodes of the graph.
|
||||||
|
The algorithm can process all kinds of graphs,
|
||||||
|
provided that the graph does not contain a
|
||||||
|
cycle with negative length.
|
||||||
|
If the graph contains a negative cycle,
|
||||||
|
the algorithm can detect this.
|
||||||
|
|
||||||
|
The algorithm keeps track of distances
|
||||||
|
from the starting node to all nodes of the graph.
|
||||||
|
Initially, the distance to the starting node is 0
|
||||||
|
and the distance to all other nodes is infinite.
|
||||||
|
The algorithm reduces the distances by finding
|
||||||
|
edges that shorten the paths until it is not
|
||||||
|
possible to reduce any distance.
|
||||||
|
|
||||||
|
\subsubsection{Example}
|
||||||
|
|
||||||
|
Let us consider how the Bellman–Ford algorithm
|
||||||
|
works in the following graph:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\node[draw, circle] (1) at (1,3) {1};
|
||||||
|
\node[draw, circle] (2) at (4,3) {2};
|
||||||
|
\node[draw, circle] (3) at (1,1) {3};
|
||||||
|
\node[draw, circle] (4) at (4,1) {4};
|
||||||
|
\node[draw, circle] (5) at (6,2) {5};
|
||||||
|
\node[color=red] at (1,3+0.55) {$0$};
|
||||||
|
\node[color=red] at (4,3+0.55) {$\infty$};
|
||||||
|
\node[color=red] at (1,1-0.55) {$\infty$};
|
||||||
|
\node[color=red] at (4,1-0.55) {$\infty$};
|
||||||
|
\node[color=red] at (6,2-0.55) {$\infty$};
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:5] {} (2);
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=left:3] {} (3);
|
||||||
|
\path[draw,thick,-] (3) -- node[font=\small,label=below:1] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=left:3] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
|
||||||
|
\path[draw,thick,-] (4) -- node[font=\small,label=below:2] {} (5);
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:7] {} (4);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
Each node of the graph is assigned a distance.
|
||||||
|
Initially, the distance to the starting node is 0,
|
||||||
|
and the distance to all other nodes is infinite.
|
||||||
|
|
||||||
|
The algorithm searches for edges that reduce distances.
|
||||||
|
First, all edges from node 1 reduce distances:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\node[draw, circle] (1) at (1,3) {1};
|
||||||
|
\node[draw, circle] (2) at (4,3) {2};
|
||||||
|
\node[draw, circle] (3) at (1,1) {3};
|
||||||
|
\node[draw, circle] (4) at (4,1) {4};
|
||||||
|
\node[draw, circle] (5) at (6,2) {5};
|
||||||
|
\node[color=red] at (1,3+0.55) {$0$};
|
||||||
|
\node[color=red] at (4,3+0.55) {$5$};
|
||||||
|
\node[color=red] at (1,1-0.55) {$3$};
|
||||||
|
\node[color=red] at (4,1-0.55) {$7$};
|
||||||
|
\node[color=red] at (6,2-0.55) {$\infty$};
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:5] {} (2);
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=left:3] {} (3);
|
||||||
|
\path[draw,thick,-] (3) -- node[font=\small,label=below:1] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=left:3] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
|
||||||
|
\path[draw,thick,-] (4) -- node[font=\small,label=below:2] {} (5);
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:7] {} (4);
|
||||||
|
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (1) -- (2);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (1) -- (3);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (1) -- (4);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
After this, edges
|
||||||
|
$2 \rightarrow 5$ and $3 \rightarrow 4$
|
||||||
|
reduce distances:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\node[draw, circle] (1) at (1,3) {1};
|
||||||
|
\node[draw, circle] (2) at (4,3) {2};
|
||||||
|
\node[draw, circle] (3) at (1,1) {3};
|
||||||
|
\node[draw, circle] (4) at (4,1) {4};
|
||||||
|
\node[draw, circle] (5) at (6,2) {5};
|
||||||
|
\node[color=red] at (1,3+0.55) {$0$};
|
||||||
|
\node[color=red] at (4,3+0.55) {$5$};
|
||||||
|
\node[color=red] at (1,1-0.55) {$3$};
|
||||||
|
\node[color=red] at (4,1-0.55) {$4$};
|
||||||
|
\node[color=red] at (6,2-0.55) {$7$};
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:5] {} (2);
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=left:3] {} (3);
|
||||||
|
\path[draw,thick,-] (3) -- node[font=\small,label=below:1] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=left:3] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
|
||||||
|
\path[draw,thick,-] (4) -- node[font=\small,label=below:2] {} (5);
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:7] {} (4);
|
||||||
|
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (2) -- (5);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (3) -- (4);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
Finally, there is one more change:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\node[draw, circle] (1) at (1,3) {1};
|
||||||
|
\node[draw, circle] (2) at (4,3) {2};
|
||||||
|
\node[draw, circle] (3) at (1,1) {3};
|
||||||
|
\node[draw, circle] (4) at (4,1) {4};
|
||||||
|
\node[draw, circle] (5) at (6,2) {5};
|
||||||
|
\node[color=red] at (1,3+0.55) {$0$};
|
||||||
|
\node[color=red] at (4,3+0.55) {$5$};
|
||||||
|
\node[color=red] at (1,1-0.55) {$3$};
|
||||||
|
\node[color=red] at (4,1-0.55) {$4$};
|
||||||
|
\node[color=red] at (6,2-0.55) {$6$};
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:5] {} (2);
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=left:3] {} (3);
|
||||||
|
\path[draw,thick,-] (3) -- node[font=\small,label=below:1] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=left:3] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
|
||||||
|
\path[draw,thick,-] (4) -- node[font=\small,label=below:2] {} (5);
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:7] {} (4);
|
||||||
|
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (4) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
After this, no edge can reduce any distance.
|
||||||
|
This means that the distances are final,
|
||||||
|
and we have successfully
|
||||||
|
calculated the shortest distances
|
||||||
|
from the starting node to all nodes of the graph.
|
||||||
|
|
||||||
|
For example, the shortest distance 6
|
||||||
|
from node 1 to node 5 corresponds to
|
||||||
|
the following path:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\node[draw, circle] (1) at (1,3) {1};
|
||||||
|
\node[draw, circle] (2) at (4,3) {2};
|
||||||
|
\node[draw, circle] (3) at (1,1) {3};
|
||||||
|
\node[draw, circle] (4) at (4,1) {4};
|
||||||
|
\node[draw, circle] (5) at (6,2) {5};
|
||||||
|
\node[color=red] at (1,3+0.55) {$0$};
|
||||||
|
\node[color=red] at (4,3+0.55) {$5$};
|
||||||
|
\node[color=red] at (1,1-0.55) {$3$};
|
||||||
|
\node[color=red] at (4,1-0.55) {$4$};
|
||||||
|
\node[color=red] at (6,2-0.55) {$6$};
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:5] {} (2);
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=left:3] {} (3);
|
||||||
|
\path[draw,thick,-] (3) -- node[font=\small,label=below:1] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=left:3] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
|
||||||
|
\path[draw,thick,-] (4) -- node[font=\small,label=below:2] {} (5);
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:7] {} (4);
|
||||||
|
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (1) -- (3);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (3) -- (4);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (4) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
\subsubsection{Implementation}
|
||||||
|
|
||||||
|
The following implementation of the
|
||||||
|
Bellman–Ford algorithm determines the shortest distances
|
||||||
|
from a node $x$ to all nodes of the graph.
|
||||||
|
The code assumes that the graph is stored
|
||||||
|
as an edge list \texttt{edges}
|
||||||
|
that consists of tuples of the form $(a,b,w)$,
|
||||||
|
meaning that there is an edge from node $a$ to node $b$
|
||||||
|
with weight $w$.
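For example, the edge list could be declared as follows
(this is only one possible choice of container):
\begin{lstlisting}
vector<tuple<int,int,int>> edges;
// edges.push_back(make_tuple(a,b,w)) for each edge from a to b with weight w
\end{lstlisting}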
|
||||||
|
|
||||||
|
The algorithm consists of $n-1$ rounds,
|
||||||
|
and on each round the algorithm goes through
|
||||||
|
all edges of the graph and tries to
|
||||||
|
reduce the distances.
|
||||||
|
The algorithm constructs an array \texttt{distance}
|
||||||
|
that will contain the distances from $x$
|
||||||
|
to all nodes of the graph.
|
||||||
|
The constant \texttt{INF} denotes an infinite distance.
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
for (int i = 1; i <= n; i++) distance[i] = INF;
|
||||||
|
distance[x] = 0;
|
||||||
|
for (int i = 1; i <= n-1; i++) {
|
||||||
|
for (auto e : edges) {
|
||||||
|
int a, b, w;
|
||||||
|
tie(a, b, w) = e;
|
||||||
|
distance[b] = min(distance[b], distance[a]+w);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
The time complexity of the algorithm is $O(nm)$,
|
||||||
|
because the algorithm consists of $n-1$ rounds and
|
||||||
|
iterates through all $m$ edges during a round.
|
||||||
|
If there are no negative cycles in the graph,
|
||||||
|
all distances are final after $n-1$ rounds,
|
||||||
|
because each shortest path can contain at most $n-1$ edges.
|
||||||
|
|
||||||
|
In practice, the final distances can usually
|
||||||
|
be found faster than in $n-1$ rounds.
|
||||||
|
Thus, a possible way to make the algorithm more efficient
|
||||||
|
is to stop the algorithm if no distance
|
||||||
|
can be reduced during a round.
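A possible sketch of this optimization on top of the above code
(the flag name is only illustrative):
\begin{lstlisting}
for (int i = 1; i <= n-1; i++) {
    bool changed = false;
    for (auto e : edges) {
        int a, b, w;
        tie(a, b, w) = e;
        if (distance[a]+w < distance[b]) {
            distance[b] = distance[a]+w;
            changed = true;
        }
    }
    if (!changed) break; // all distances are final
}
\end{lstlisting}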
|
||||||
|
|
||||||
|
\subsubsection{Negative cycles}
|
||||||
|
|
||||||
|
\index{negative cycle}
|
||||||
|
|
||||||
|
The Bellman–Ford algorithm can also be used to
|
||||||
|
check if the graph contains a cycle with negative length.
|
||||||
|
For example, the graph
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (0,0) {$1$};
|
||||||
|
\node[draw, circle] (2) at (2,1) {$2$};
|
||||||
|
\node[draw, circle] (3) at (2,-1) {$3$};
|
||||||
|
\node[draw, circle] (4) at (4,0) {$4$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:$3$] {} (2);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=above:$1$] {} (4);
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=below:$5$] {} (3);
|
||||||
|
\path[draw,thick,-] (3) -- node[font=\small,label=below:$-7$] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=right:$2$] {} (3);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
\noindent
|
||||||
|
contains a negative cycle
|
||||||
|
$2 \rightarrow 3 \rightarrow 4 \rightarrow 2$
|
||||||
|
with length $-4$.
|
||||||
|
|
||||||
|
If the graph contains a negative cycle,
|
||||||
|
we can shorten any path that contains the cycle
infinitely many times by repeating the cycle
|
||||||
|
again and again.
|
||||||
|
Thus, the concept of a shortest path
|
||||||
|
is not meaningful in this situation.
|
||||||
|
|
||||||
|
A negative cycle can be detected
|
||||||
|
using the Bellman–Ford algorithm by
|
||||||
|
running the algorithm for $n$ rounds.
|
||||||
|
If the last round reduces any distance,
|
||||||
|
the graph contains a negative cycle.
|
||||||
|
Note that this algorithm can be used to
|
||||||
|
search for
|
||||||
|
a negative cycle in the whole graph
|
||||||
|
regardless of the starting node.
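A possible sketch of this check on top of the earlier implementation,
assuming that \texttt{INF} is large but does not overflow when an edge
weight is added to it (the flag name is only illustrative):
\begin{lstlisting}
bool negativeCycle = false;
for (int i = 1; i <= n; i++) {
    for (auto e : edges) {
        int a, b, w;
        tie(a, b, w) = e;
        if (distance[a]+w < distance[b]) {
            distance[b] = distance[a]+w;
            // a reduction on round n means a negative cycle
            if (i == n) negativeCycle = true;
        }
    }
}
\end{lstlisting}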
|
||||||
|
|
||||||
|
\subsubsection{SPFA algorithm}
|
||||||
|
|
||||||
|
\index{SPFA algorithm}
|
||||||
|
|
||||||
|
The \key{SPFA algorithm} (``Shortest Path Faster Algorithm'') \cite{fan94}
|
||||||
|
is a variant of the Bellman–Ford algorithm
|
||||||
|
that is often more efficient than the original algorithm.
|
||||||
|
The SPFA algorithm does not go through all the edges on each round,
|
||||||
|
but instead, it chooses the edges to be examined
|
||||||
|
in a more intelligent way.
|
||||||
|
|
||||||
|
The algorithm maintains a queue of nodes that might
|
||||||
|
be used for reducing the distances.
|
||||||
|
First, the algorithm adds the starting node $x$
|
||||||
|
to the queue.
|
||||||
|
Then, the algorithm always processes the
|
||||||
|
first node in the queue, and when an edge
|
||||||
|
$a \rightarrow b$ reduces a distance,
|
||||||
|
node $b$ is added to the queue.
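A possible sketch of the algorithm, assuming the weighted adjacency list
representation that is used with Dijkstra's algorithm later in this chapter
(each element of \texttt{adj[$a$]} is a pair $(b,w)$); the array
\texttt{inqueue} tells whether a node is currently in the queue,
so that it is not added twice:
\begin{lstlisting}
queue<int> q;
bool inqueue[N];

for (int i = 1; i <= n; i++) distance[i] = INF;
distance[x] = 0;
q.push(x); inqueue[x] = true;
while (!q.empty()) {
    int a = q.front(); q.pop();
    inqueue[a] = false;
    for (auto u : adj[a]) {
        int b = u.first, w = u.second;
        if (distance[a]+w < distance[b]) {
            distance[b] = distance[a]+w;
            if (!inqueue[b]) {q.push(b); inqueue[b] = true;}
        }
    }
}
\end{lstlisting}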
|
||||||
|
%
|
||||||
|
% The following implementation uses a
|
||||||
|
% \texttt{queue} \texttt{q}.
|
||||||
|
% In addition, an array \texttt{inqueue} indicates
|
||||||
|
% if a node is already in the queue,
|
||||||
|
% in which case the algorithm does not add
|
||||||
|
% the node to the queue again.
|
||||||
|
%
|
||||||
|
% \begin{lstlisting}
|
||||||
|
% for (int i = 1; i <= n; i++) distance[i] = INF;
|
||||||
|
% distance[x] = 0;
|
||||||
|
% q.push(x);
|
||||||
|
% while (!q.empty()) {
|
||||||
|
% int a = q.front(); q.pop();
|
||||||
|
% inqueue[a] = false;
|
||||||
|
% for (auto b : v[a]) {
|
||||||
|
% if (distance[a]+b.second < distance[b.first]) {
|
||||||
|
% distance[b.first] = distance[a]+b.second;
|
||||||
|
% if (!inqueue[b]) {q.push(b); inqueue[b] = true;}
|
||||||
|
% }
|
||||||
|
% }
|
||||||
|
% }
|
||||||
|
% \end{lstlisting}
|
||||||
|
|
||||||
|
The efficiency of the SPFA algorithm depends
|
||||||
|
on the structure of the graph:
|
||||||
|
the algorithm is often efficient,
|
||||||
|
but its worst case time complexity is still
|
||||||
|
$O(nm)$ and it is possible to create inputs
|
||||||
|
that make the algorithm as slow as the
|
||||||
|
original Bellman–Ford algorithm.
|
||||||
|
|
||||||
|
\section{Dijkstra's algorithm}
|
||||||
|
|
||||||
|
\index{Dijkstra's algorithm}
|
||||||
|
|
||||||
|
\key{Dijkstra's algorithm}\footnote{E. W. Dijkstra published the algorithm in 1959 \cite{dij59};
|
||||||
|
however, his original paper does not mention how to implement the algorithm efficiently.}
|
||||||
|
finds shortest
|
||||||
|
paths from the starting node to all nodes of the graph,
|
||||||
|
like the Bellman–Ford algorithm.
|
||||||
|
The benefit of Dijkstra's algorithm is that
|
||||||
|
it is more efficient and can be used for
|
||||||
|
processing large graphs.
|
||||||
|
However, the algorithm requires that there
|
||||||
|
are no negative weight edges in the graph.
|
||||||
|
|
||||||
|
Like the Bellman–Ford algorithm,
|
||||||
|
Dijkstra's algorithm maintains distances
|
||||||
|
to the nodes and reduces them during the search.
|
||||||
|
Dijkstra's algorithm is efficient, because
|
||||||
|
it only processes
|
||||||
|
each edge in the graph once, using the fact
|
||||||
|
that there are no negative edges.
|
||||||
|
|
||||||
|
\subsubsection{Example}
|
||||||
|
|
||||||
|
Let us consider how Dijkstra's algorithm
|
||||||
|
works in the following graph when the
|
||||||
|
starting node is node 1:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,3) {3};
|
||||||
|
\node[draw, circle] (2) at (4,3) {4};
|
||||||
|
\node[draw, circle] (3) at (1,1) {2};
|
||||||
|
\node[draw, circle] (4) at (4,1) {1};
|
||||||
|
\node[draw, circle] (5) at (6,2) {5};
|
||||||
|
|
||||||
|
\node[color=red] at (1,3+0.6) {$\infty$};
|
||||||
|
\node[color=red] at (4,3+0.6) {$\infty$};
|
||||||
|
\node[color=red] at (1,1-0.6) {$\infty$};
|
||||||
|
\node[color=red] at (4,1-0.6) {$0$};
|
||||||
|
\node[color=red] at (6,2-0.6) {$\infty$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:6] {} (2);
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=left:2] {} (3);
|
||||||
|
\path[draw,thick,-] (3) -- node[font=\small,label=below:5] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=left:9] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
|
||||||
|
\path[draw,thick,-] (4) -- node[font=\small,label=below:1] {} (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
Like in the Bellman–Ford algorithm,
|
||||||
|
initially the distance to the starting node is 0
|
||||||
|
and the distance to all other nodes is infinite.
|
||||||
|
|
||||||
|
At each step, Dijkstra's algorithm selects a node
|
||||||
|
that has not been processed yet and whose distance
|
||||||
|
is as small as possible.
|
||||||
|
The first such node is node 1 with distance 0.
|
||||||
|
|
||||||
|
When a node is selected, the algorithm
|
||||||
|
goes through all edges that start at the node
|
||||||
|
and reduces the distances using them:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,3) {3};
|
||||||
|
\node[draw, circle] (2) at (4,3) {4};
|
||||||
|
\node[draw, circle] (3) at (1,1) {2};
|
||||||
|
\node[draw, circle, fill=lightgray] (4) at (4,1) {1};
|
||||||
|
\node[draw, circle] (5) at (6,2) {5};
|
||||||
|
|
||||||
|
\node[color=red] at (1,3+0.6) {$\infty$};
|
||||||
|
\node[color=red] at (4,3+0.6) {$9$};
|
||||||
|
\node[color=red] at (1,1-0.6) {$5$};
|
||||||
|
\node[color=red] at (4,1-0.6) {$0$};
|
||||||
|
\node[color=red] at (6,2-0.6) {$1$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:6] {} (2);
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=left:2] {} (3);
|
||||||
|
\path[draw,thick,-] (3) -- node[font=\small,label=below:5] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=left:9] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
|
||||||
|
\path[draw,thick,-] (4) -- node[font=\small,label=below:1] {} (5);
|
||||||
|
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (4) -- (2);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (4) -- (3);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (4) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
In this case,
|
||||||
|
the edges from node 1 reduced the distances of
|
||||||
|
nodes 2, 4 and 5, whose distances are now 5, 9 and 1.
|
||||||
|
|
||||||
|
The next node to be processed is node 5 with distance 1.
|
||||||
|
This reduces the distance to node 4 from 9 to 3:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\node[draw, circle] (1) at (1,3) {3};
|
||||||
|
\node[draw, circle] (2) at (4,3) {4};
|
||||||
|
\node[draw, circle] (3) at (1,1) {2};
|
||||||
|
\node[draw, circle, fill=lightgray] (4) at (4,1) {1};
|
||||||
|
\node[draw, circle, fill=lightgray] (5) at (6,2) {5};
|
||||||
|
|
||||||
|
\node[color=red] at (1,3+0.6) {$\infty$};
|
||||||
|
\node[color=red] at (4,3+0.6) {$3$};
|
||||||
|
\node[color=red] at (1,1-0.6) {$5$};
|
||||||
|
\node[color=red] at (4,1-0.6) {$0$};
|
||||||
|
\node[color=red] at (6,2-0.6) {$1$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:6] {} (2);
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=left:2] {} (3);
|
||||||
|
\path[draw,thick,-] (3) -- node[font=\small,label=below:5] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=left:9] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
|
||||||
|
\path[draw,thick,-] (4) -- node[font=\small,label=below:1] {} (5);
|
||||||
|
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (5) -- (2);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
After this, the next node is node 4, which reduces
|
||||||
|
the distance to node 3 to 9:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,3) {3};
|
||||||
|
\node[draw, circle, fill=lightgray] (2) at (4,3) {4};
|
||||||
|
\node[draw, circle] (3) at (1,1) {2};
|
||||||
|
\node[draw, circle, fill=lightgray] (4) at (4,1) {1};
|
||||||
|
\node[draw, circle, fill=lightgray] (5) at (6,2) {5};
|
||||||
|
|
||||||
|
\node[color=red] at (1,3+0.6) {$9$};
|
||||||
|
\node[color=red] at (4,3+0.6) {$3$};
|
||||||
|
\node[color=red] at (1,1-0.6) {$5$};
|
||||||
|
\node[color=red] at (4,1-0.6) {$0$};
|
||||||
|
\node[color=red] at (6,2-0.6) {$1$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:6] {} (2);
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=left:2] {} (3);
|
||||||
|
\path[draw,thick,-] (3) -- node[font=\small,label=below:5] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=left:9] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
|
||||||
|
\path[draw,thick,-] (4) -- node[font=\small,label=below:1] {} (5);
|
||||||
|
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (2) -- (1);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
A remarkable property in Dijkstra's algorithm is that
|
||||||
|
whenever a node is selected, its distance is final.
|
||||||
|
For example, at this point of the algorithm,
|
||||||
|
the distances 0, 1 and 3 are the final distances
|
||||||
|
to nodes 1, 5 and 4.
|
||||||
|
|
||||||
|
After this, the algorithm processes the two
|
||||||
|
remaining nodes, and the final distances are as follows:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle, fill=lightgray] (1) at (1,3) {3};
|
||||||
|
\node[draw, circle, fill=lightgray] (2) at (4,3) {4};
|
||||||
|
\node[draw, circle, fill=lightgray] (3) at (1,1) {2};
|
||||||
|
\node[draw, circle, fill=lightgray] (4) at (4,1) {1};
|
||||||
|
\node[draw, circle, fill=lightgray] (5) at (6,2) {5};
|
||||||
|
|
||||||
|
\node[color=red] at (1,3+0.6) {$7$};
|
||||||
|
\node[color=red] at (4,3+0.6) {$3$};
|
||||||
|
\node[color=red] at (1,1-0.6) {$5$};
|
||||||
|
\node[color=red] at (4,1-0.6) {$0$};
|
||||||
|
\node[color=red] at (6,2-0.6) {$1$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:6] {} (2);
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=left:2] {} (3);
|
||||||
|
\path[draw,thick,-] (3) -- node[font=\small,label=below:5] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=left:9] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
|
||||||
|
\path[draw,thick,-] (4) -- node[font=\small,label=below:1] {} (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
\subsubsection{Negative edges}
|
||||||
|
|
||||||
|
The efficiency of Dijkstra's algorithm is
|
||||||
|
based on the fact that the graph does not
|
||||||
|
contain negative edges.
|
||||||
|
If there is a negative edge,
|
||||||
|
the algorithm may give incorrect results.
|
||||||
|
As an example, consider the following graph:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (0,0) {$1$};
|
||||||
|
\node[draw, circle] (2) at (2,1) {$2$};
|
||||||
|
\node[draw, circle] (3) at (2,-1) {$3$};
|
||||||
|
\node[draw, circle] (4) at (4,0) {$4$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:2] {} (2);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=above:3] {} (4);
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=below:6] {} (3);
|
||||||
|
\path[draw,thick,-] (3) -- node[font=\small,label=below:$-5$] {} (4);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
\noindent
|
||||||
|
The shortest path from node 1 to node 4 is
|
||||||
|
$1 \rightarrow 3 \rightarrow 4$
|
||||||
|
and its length is 1.
|
||||||
|
However, Dijkstra's algorithm
|
||||||
|
finds the path $1 \rightarrow 2 \rightarrow 4$
|
||||||
|
by following the minimum weight edges.
|
||||||
|
The algorithm does not take into account that
|
||||||
|
on the other path, the weight $-5$
|
||||||
|
compensates for the previous large weight $6$.
|
||||||
|
|
||||||
|
\subsubsection{Implementation}
|
||||||
|
|
||||||
|
The following implementation of Dijkstra's algorithm
|
||||||
|
calculates the minimum distances from a node $x$
|
||||||
|
to other nodes of the graph.
|
||||||
|
The graph is stored as adjacency lists
|
||||||
|
so that \texttt{adj[$a$]} contains a pair $(b,w)$
|
||||||
|
whenever there is an edge from node $a$ to node $b$
|
||||||
|
with weight $w$.
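For example, such an adjacency list could be declared as follows
(one possible choice):
\begin{lstlisting}
vector<pair<int,int>> adj[N];
// adj[a].push_back({b,w}) for each edge from a to b with weight w
\end{lstlisting}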
|
||||||
|
|
||||||
|
An efficient implementation of Dijkstra's algorithm
|
||||||
|
requires that it is possible to efficiently find the
|
||||||
|
minimum distance node that has not been processed.
|
||||||
|
An appropriate data structure for this is a priority queue
|
||||||
|
that contains the nodes ordered by their distances.
|
||||||
|
Using a priority queue, the next node to be processed
|
||||||
|
can be retrieved in logarithmic time.
|
||||||
|
|
||||||
|
In the following code, the priority queue
|
||||||
|
\texttt{q} contains pairs of the form $(-d,x)$,
|
||||||
|
meaning that the current distance to node $x$ is $d$.
|
||||||
|
The array $\texttt{distance}$ contains the distance to
|
||||||
|
each node, and the array $\texttt{processed}$ indicates
|
||||||
|
whether a node has been processed.
|
||||||
|
Initially the distance is $0$ to $x$ and $\infty$ to all other nodes.
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
for (int i = 1; i <= n; i++) distance[i] = INF;
|
||||||
|
distance[x] = 0;
|
||||||
|
q.push({0,x});
|
||||||
|
while (!q.empty()) {
|
||||||
|
int a = q.top().second; q.pop();
|
||||||
|
if (processed[a]) continue;
|
||||||
|
processed[a] = true;
|
||||||
|
for (auto u : adj[a]) {
|
||||||
|
int b = u.first, w = u.second;
|
||||||
|
if (distance[a]+w < distance[b]) {
|
||||||
|
distance[b] = distance[a]+w;
|
||||||
|
q.push({-distance[b],b});
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
Note that the priority queue contains \emph{negative}
|
||||||
|
distances to nodes.
|
||||||
|
The reason for this is that the
|
||||||
|
default version of the C++ priority queue finds maximum
|
||||||
|
elements, while we want to find minimum elements.
|
||||||
|
By using negative distances,
|
||||||
|
we can directly use the default priority queue\footnote{Of
|
||||||
|
course, we could also declare the priority queue as in Chapter 4.5
|
||||||
|
and use positive distances, but the implementation would be a bit longer.}.
|
||||||
|
Also note that there may be several instances of the same
|
||||||
|
node in the priority queue; however, only the instance with the
|
||||||
|
minimum distance will be processed.
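For example, the priority queue of the above code could be declared as a
default max-heap of pairs, while the alternative mentioned in the footnote
would use a min-heap with positive distances; both declarations are only
sketches, and only one of them is needed:
\begin{lstlisting}
// default max-heap, used above with negated distances
priority_queue<pair<int,int>> q;

// alternative: min-heap, used with positive distances
priority_queue<pair<int,int>, vector<pair<int,int>>,
               greater<pair<int,int>>> q2;
\end{lstlisting}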
|
||||||
|
|
||||||
|
The time complexity of the above implementation is
|
||||||
|
$O(n+m \log m)$, because the algorithm goes through
|
||||||
|
all nodes of the graph and adds for each edge
|
||||||
|
at most one distance to the priority queue.
|
||||||
|
|
||||||
|
\section{Floyd–Warshall algorithm}
|
||||||
|
|
||||||
|
\index{Floyd–Warshall algorithm}
|
||||||
|
|
||||||
|
The \key{Floyd–Warshall algorithm}\footnote{The algorithm
|
||||||
|
is named after R. W. Floyd and S. Warshall
|
||||||
|
who published it independently in 1962 \cite{flo62,war62}.}
|
||||||
|
provides an alternative way to approach the problem
|
||||||
|
of finding shortest paths.
|
||||||
|
Unlike the other algorithms of this chapter,
|
||||||
|
it finds all shortest paths between the nodes
|
||||||
|
in a single run.
|
||||||
|
|
||||||
|
The algorithm maintains a two-dimensional array
|
||||||
|
that contains distances between the nodes.
|
||||||
|
First, distances are calculated only using
|
||||||
|
direct edges between the nodes,
|
||||||
|
and after this, the algorithm reduces distances
|
||||||
|
by using intermediate nodes in paths.
|
||||||
|
|
||||||
|
\subsubsection{Example}
|
||||||
|
|
||||||
|
Let us consider how the Floyd–Warshall algorithm
|
||||||
|
works in the following graph:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,3) {$3$};
|
||||||
|
\node[draw, circle] (2) at (4,3) {$4$};
|
||||||
|
\node[draw, circle] (3) at (1,1) {$2$};
|
||||||
|
\node[draw, circle] (4) at (4,1) {$1$};
|
||||||
|
\node[draw, circle] (5) at (6,2) {$5$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:7] {} (2);
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=left:2] {} (3);
|
||||||
|
\path[draw,thick,-] (3) -- node[font=\small,label=below:5] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=left:9] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
|
||||||
|
\path[draw,thick,-] (4) -- node[font=\small,label=below:1] {} (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Initially, the distance from each node to itself is $0$,
|
||||||
|
and the distance between nodes $a$ and $b$ is $x$
|
||||||
|
if there is an edge between nodes $a$ and $b$ with weight $x$.
|
||||||
|
All other distances are infinite.
|
||||||
|
|
||||||
|
In this graph, the initial array is as follows:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tabular}{r|rrrrr}
|
||||||
|
& 1 & 2 & 3 & 4 & 5 \\
|
||||||
|
\hline
|
||||||
|
1 & 0 & 5 & $\infty$ & 9 & 1 \\
|
||||||
|
2 & 5 & 0 & 2 & $\infty$ & $\infty$ \\
|
||||||
|
3 & $\infty$ & 2 & 0 & 7 & $\infty$ \\
|
||||||
|
4 & 9 & $\infty$ & 7 & 0 & 2 \\
|
||||||
|
5 & 1 & $\infty$ & $\infty$ & 2 & 0 \\
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
|
||||||
|
\vspace{10pt}
|
||||||
|
The algorithm consists of consecutive rounds.
|
||||||
|
On each round, the algorithm selects a new node
|
||||||
|
that can act as an intermediate node in paths from now on,
|
||||||
|
and distances are reduced using this node.
|
||||||
|
|
||||||
|
On the first round, node 1 is the new intermediate node.
|
||||||
|
There is a new path between nodes 2 and 4
|
||||||
|
with length 14, because node 1 connects them.
|
||||||
|
There is also a new path
|
||||||
|
between nodes 2 and 5 with length 6.
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tabular}{r|rrrrr}
|
||||||
|
& 1 & 2 & 3 & 4 & 5 \\
|
||||||
|
\hline
|
||||||
|
1 & 0 & 5 & $\infty$ & 9 & 1 \\
|
||||||
|
2 & 5 & 0 & 2 & \textbf{14} & \textbf{6} \\
|
||||||
|
3 & $\infty$ & 2 & 0 & 7 & $\infty$ \\
|
||||||
|
4 & 9 & \textbf{14} & 7 & 0 & 2 \\
|
||||||
|
5 & 1 & \textbf{6} & $\infty$ & 2 & 0 \\
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
|
||||||
|
\vspace{10pt}
|
||||||
|
|
||||||
|
On the second round, node 2 is the new intermediate node.
|
||||||
|
This creates new paths between nodes 1 and 3
|
||||||
|
and between nodes 3 and 5:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tabular}{r|rrrrr}
|
||||||
|
& 1 & 2 & 3 & 4 & 5 \\
|
||||||
|
\hline
|
||||||
|
1 & 0 & 5 & \textbf{7} & 9 & 1 \\
|
||||||
|
2 & 5 & 0 & 2 & 14 & 6 \\
|
||||||
|
3 & \textbf{7} & 2 & 0 & 7 & \textbf{8} \\
|
||||||
|
4 & 9 & 14 & 7 & 0 & 2 \\
|
||||||
|
5 & 1 & 6 & \textbf{8} & 2 & 0 \\
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
|
||||||
|
\vspace{10pt}
|
||||||
|
|
||||||
|
On the third round, node 3 is the new intermediate node.
|
||||||
|
There is a new path between nodes 2 and 4:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tabular}{r|rrrrr}
|
||||||
|
& 1 & 2 & 3 & 4 & 5 \\
|
||||||
|
\hline
|
||||||
|
1 & 0 & 5 & 7 & 9 & 1 \\
|
||||||
|
2 & 5 & 0 & 2 & \textbf{9} & 6 \\
|
||||||
|
3 & 7 & 2 & 0 & 7 & 8 \\
|
||||||
|
4 & 9 & \textbf{9} & 7 & 0 & 2 \\
|
||||||
|
5 & 1 & 6 & 8 & 2 & 0 \\
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
|
||||||
|
\vspace{10pt}
|
||||||
|
|
||||||
|
The algorithm continues like this,
|
||||||
|
until all nodes have been appointed intermediate nodes.
|
||||||
|
After the algorithm has finished, the array contains
|
||||||
|
the minimum distances between any two nodes:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tabular}{r|rrrrr}
|
||||||
|
& 1 & 2 & 3 & 4 & 5 \\
|
||||||
|
\hline
|
||||||
|
1 & 0 & 5 & 7 & 3 & 1 \\
|
||||||
|
2 & 5 & 0 & 2 & 8 & 6 \\
|
||||||
|
3 & 7 & 2 & 0 & 7 & 8 \\
|
||||||
|
4 & 3 & 8 & 7 & 0 & 2 \\
|
||||||
|
5 & 1 & 6 & 8 & 2 & 0 \\
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
For example, the array tells us that the
|
||||||
|
shortest distance between nodes 2 and 4 is 8.
|
||||||
|
This corresponds to the following path:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,3) {$3$};
|
||||||
|
\node[draw, circle] (2) at (4,3) {$4$};
|
||||||
|
\node[draw, circle] (3) at (1,1) {$2$};
|
||||||
|
\node[draw, circle] (4) at (4,1) {$1$};
|
||||||
|
\node[draw, circle] (5) at (6,2) {$5$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:7] {} (2);
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=left:2] {} (3);
|
||||||
|
\path[draw,thick,-] (3) -- node[font=\small,label=below:5] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=left:9] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
|
||||||
|
\path[draw,thick,-] (4) -- node[font=\small,label=below:1] {} (5);
|
||||||
|
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (3) -- (4);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (4) -- (5);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (5) -- (2);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
\subsubsection{Implementation}
|
||||||
|
|
||||||
|
The advantage of the
|
||||||
|
Floyd–Warshall algorithm is that it is
|
||||||
|
easy to implement.
|
||||||
|
The following code constructs a
|
||||||
|
distance matrix where $\texttt{distance}[a][b]$
|
||||||
|
is the shortest distance between nodes $a$ and $b$.
|
||||||
|
First, the algorithm initializes \texttt{distance}
|
||||||
|
using the adjacency matrix \texttt{adj} of the graph:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
for (int i = 1; i <= n; i++) {
|
||||||
|
for (int j = 1; j <= n; j++) {
|
||||||
|
if (i == j) distance[i][j] = 0;
|
||||||
|
else if (adj[i][j]) distance[i][j] = adj[i][j];
|
||||||
|
else distance[i][j] = INF;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
After this, the shortest distances can be found as follows:
|
||||||
|
\begin{lstlisting}
|
||||||
|
for (int k = 1; k <= n; k++) {
|
||||||
|
for (int i = 1; i <= n; i++) {
|
||||||
|
for (int j = 1; j <= n; j++) {
|
||||||
|
distance[i][j] = min(distance[i][j],
|
||||||
|
distance[i][k]+distance[k][j]);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
The time complexity of the algorithm is $O(n^3)$,
|
||||||
|
because it contains three nested loops
|
||||||
|
that go through the nodes of the graph.
|
||||||
|
|
||||||
|
Since the implementation of the Floyd–Warshall
|
||||||
|
algorithm is simple, the algorithm can be
|
||||||
|
a good choice even if it is only needed to find a
|
||||||
|
single shortest path in the graph.
|
||||||
|
However, the algorithm can only be used when the graph
|
||||||
|
is so small that a cubic time complexity is fast enough.
|
|
|
||||||
|
\chapter{Tree algorithms}
|
||||||
|
|
||||||
|
\index{tree}
|
||||||
|
|
||||||
|
A \key{tree} is a connected, acyclic graph
|
||||||
|
that consists of $n$ nodes and $n-1$ edges.
|
||||||
|
Removing any edge from a tree divides it
|
||||||
|
into two components,
|
||||||
|
and adding any edge to a tree creates a cycle.
|
||||||
|
Moreover, there is always a unique path between any
|
||||||
|
two nodes of a tree.
|
||||||
|
|
||||||
|
For example, the following tree consists of 8 nodes and 7 edges:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (0,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (2,3) {$4$};
|
||||||
|
\node[draw, circle] (3) at (0,1) {$2$};
|
||||||
|
\node[draw, circle] (4) at (2,1) {$3$};
|
||||||
|
\node[draw, circle] (5) at (4,1) {$7$};
|
||||||
|
\node[draw, circle] (6) at (-2,3) {$5$};
|
||||||
|
\node[draw, circle] (7) at (-2,1) {$6$};
|
||||||
|
\node[draw, circle] (8) at (-4,1) {$8$};
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (3) -- (6);
|
||||||
|
\path[draw,thick,-] (3) -- (7);
|
||||||
|
\path[draw,thick,-] (7) -- (8);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
\index{leaf}
|
||||||
|
|
||||||
|
The \key{leaves} of a tree are the nodes
|
||||||
|
with degree 1, i.e., with only one neighbor.
|
||||||
|
For example, the leaves of the above tree
|
||||||
|
are nodes 3, 5, 7 and 8.
|
||||||
|
|
||||||
|
\index{root}
|
||||||
|
\index{rooted tree}
|
||||||
|
|
||||||
|
In a \key{rooted} tree, one of the nodes
|
||||||
|
is appointed the \key{root} of the tree,
|
||||||
|
and all other nodes are
|
||||||
|
placed underneath the root.
|
||||||
|
For example, in the following tree,
|
||||||
|
node 1 is the root node.
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (0,3) {$1$};
|
||||||
|
\node[draw, circle] (4) at (2,1) {$4$};
|
||||||
|
\node[draw, circle] (2) at (-2,1) {$2$};
|
||||||
|
\node[draw, circle] (3) at (0,1) {$3$};
|
||||||
|
\node[draw, circle] (7) at (2,-1) {$7$};
|
||||||
|
\node[draw, circle] (5) at (-3,-1) {$5$};
|
||||||
|
\node[draw, circle] (6) at (-1,-1) {$6$};
|
||||||
|
\node[draw, circle] (8) at (-1,-3) {$8$};
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (2) -- (6);
|
||||||
|
\path[draw,thick,-] (4) -- (7);
|
||||||
|
\path[draw,thick,-] (6) -- (8);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
\index{child}
|
||||||
|
\index{parent}
|
||||||
|
|
||||||
|
In a rooted tree, the \key{children} of a node
|
||||||
|
are its lower neighbors, and the \key{parent} of a node
|
||||||
|
is its upper neighbor.
|
||||||
|
Each node has exactly one parent,
|
||||||
|
except for the root that does not have a parent.
|
||||||
|
For example, in the above tree,
|
||||||
|
the children of node 2 are nodes 5 and 6,
|
||||||
|
and its parent is node 1.
|
||||||
|
|
||||||
|
\index{subtree}
|
||||||
|
|
||||||
|
The structure of a rooted tree is \emph{recursive}:
|
||||||
|
each node of the tree acts as the root of a \key{subtree}
|
||||||
|
that contains the node itself and all nodes
|
||||||
|
that are in the subtrees of its children.
|
||||||
|
For example, in the above tree, the subtree of node 2
|
||||||
|
consists of nodes 2, 5, 6 and 8:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (2) at (-2,1) {$2$};
|
||||||
|
\node[draw, circle] (5) at (-3,-1) {$5$};
|
||||||
|
\node[draw, circle] (6) at (-1,-1) {$6$};
|
||||||
|
\node[draw, circle] (8) at (-1,-3) {$8$};
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (2) -- (6);
|
||||||
|
\path[draw,thick,-] (6) -- (8);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
\section{Tree traversal}
|
||||||
|
|
||||||
|
General graph traversal algorithms
|
||||||
|
can be used to traverse the nodes of a tree.
|
||||||
|
However, the traversal of a tree is easier to implement than
|
||||||
|
that of a general graph, because
|
||||||
|
there are no cycles in the tree and it is not
|
||||||
|
possible to reach a node from multiple directions.
|
||||||
|
|
||||||
|
The typical way to traverse a tree is to start
|
||||||
|
a depth-first search at an arbitrary node.
|
||||||
|
The following recursive function can be used:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
void dfs(int s, int e) {
|
||||||
|
// process node s
|
||||||
|
for (auto u : adj[s]) {
|
||||||
|
if (u != e) dfs(u, s);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
The function is given two parameters: the current node $s$
|
||||||
|
and the previous node $e$.
|
||||||
|
The purpose of the parameter $e$ is to make sure
|
||||||
|
that the search only moves to nodes
|
||||||
|
that have not been visited yet.
|
||||||
|
|
||||||
|
The following function call starts the search
|
||||||
|
at node $x$:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
dfs(x, 0);
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
In the first call $e=0$, because there is no
|
||||||
|
previous node, and it is allowed
|
||||||
|
to proceed to any direction in the tree.
|
||||||
|
|
||||||
|
\subsubsection{Dynamic programming}
|
||||||
|
|
||||||
|
Dynamic programming can be used to calculate
|
||||||
|
some information during a tree traversal.
|
||||||
|
Using dynamic programming, we can, for example,
|
||||||
|
calculate in $O(n)$ time for each node of a rooted tree the
|
||||||
|
number of nodes in its subtree
|
||||||
|
or the length of the longest path from the node
|
||||||
|
to a leaf.
|
||||||
|
|
||||||
|
As an example, let us calculate for each node $s$
|
||||||
|
a value $\texttt{count}[s]$: the number of nodes in its subtree.
|
||||||
|
The subtree contains the node itself and
|
||||||
|
all nodes in the subtrees of its children,
|
||||||
|
so we can calculate the number of nodes
|
||||||
|
recursively using the following code:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
void dfs(int s, int e) {
|
||||||
|
count[s] = 1;
|
||||||
|
for (auto u : adj[s]) {
|
||||||
|
if (u == e) continue;
|
||||||
|
dfs(u, s);
|
||||||
|
count[s] += count[u];
|
||||||
|
}
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
\section{Diameter}
|
||||||
|
|
||||||
|
\index{diameter}
|
||||||
|
|
||||||
|
The \key{diameter} of a tree
|
||||||
|
is the maximum length of a path between two nodes.
|
||||||
|
For example, consider the following tree:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (0,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (2,3) {$4$};
|
||||||
|
\node[draw, circle] (3) at (0,1) {$2$};
|
||||||
|
\node[draw, circle] (4) at (2,1) {$3$};
|
||||||
|
\node[draw, circle] (5) at (4,1) {$7$};
|
||||||
|
\node[draw, circle] (6) at (-2,3) {$5$};
|
||||||
|
\node[draw, circle] (7) at (-2,1) {$6$};
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (3) -- (6);
|
||||||
|
\path[draw,thick,-] (3) -- (7);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
The diameter of this tree is 4,
|
||||||
|
which corresponds to the following path:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (0,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (2,3) {$4$};
|
||||||
|
\node[draw, circle] (3) at (0,1) {$2$};
|
||||||
|
\node[draw, circle] (4) at (2,1) {$3$};
|
||||||
|
\node[draw, circle] (5) at (4,1) {$7$};
|
||||||
|
\node[draw, circle] (6) at (-2,3) {$5$};
|
||||||
|
\node[draw, circle] (7) at (-2,1) {$6$};
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (3) -- (6);
|
||||||
|
\path[draw,thick,-] (3) -- (7);
|
||||||
|
|
||||||
|
\path[draw,thick,-,color=red,line width=2pt] (7) -- (3);
|
||||||
|
\path[draw,thick,-,color=red,line width=2pt] (3) -- (1);
|
||||||
|
\path[draw,thick,-,color=red,line width=2pt] (1) -- (2);
|
||||||
|
\path[draw,thick,-,color=red,line width=2pt] (2) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
Note that there may be several maximum-length paths.
|
||||||
|
In the above path, we could replace node 6 with node 5
|
||||||
|
to obtain another path with length 4.
|
||||||
|
|
||||||
|
Next we will discuss two $O(n)$ time algorithms
|
||||||
|
for calculating the diameter of a tree.
|
||||||
|
The first algorithm is based on dynamic programming,
|
||||||
|
and the second algorithm uses two depth-first searches.
|
||||||
|
|
||||||
|
\subsubsection{Algorithm 1}
|
||||||
|
|
||||||
|
A general way to approach many tree problems
|
||||||
|
is to first root the tree arbitrarily.
|
||||||
|
After this, we can try to solve the problem
|
||||||
|
separately for each subtree.
|
||||||
|
Our first algorithm for calculating the diameter
|
||||||
|
is based on this idea.
|
||||||
|
|
||||||
|
An important observation is that every path
|
||||||
|
in a rooted tree has a \emph{highest point}:
|
||||||
|
the highest node that belongs to the path.
|
||||||
|
Thus, we can calculate for each node the length
|
||||||
|
of the longest path whose highest point is the node.
|
||||||
|
One of those paths corresponds to the diameter of the tree.
|
||||||
|
|
||||||
|
For example, in the following tree,
|
||||||
|
node 1 is the highest point on the path
|
||||||
|
that corresponds to the diameter:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (0,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (2,1) {$4$};
|
||||||
|
\node[draw, circle] (3) at (-2,1) {$2$};
|
||||||
|
\node[draw, circle] (4) at (0,1) {$3$};
|
||||||
|
\node[draw, circle] (5) at (2,-1) {$7$};
|
||||||
|
\node[draw, circle] (6) at (-3,-1) {$5$};
|
||||||
|
\node[draw, circle] (7) at (-1,-1) {$6$};
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (3) -- (6);
|
||||||
|
\path[draw,thick,-] (3) -- (7);
|
||||||
|
|
||||||
|
\path[draw,thick,-,color=red,line width=2pt] (7) -- (3);
|
||||||
|
\path[draw,thick,-,color=red,line width=2pt] (3) -- (1);
|
||||||
|
\path[draw,thick,-,color=red,line width=2pt] (1) -- (2);
|
||||||
|
\path[draw,thick,-,color=red,line width=2pt] (2) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
We calculate for each node $x$ two values:
|
||||||
|
\begin{itemize}
|
||||||
|
\item $\texttt{toLeaf}(x)$: the maximum length of a path from $x$ to any leaf
|
||||||
|
\item $\texttt{maxLength}(x)$: the maximum length of a path
|
||||||
|
whose highest point is $x$
|
||||||
|
\end{itemize}
|
||||||
|
For example, in the above tree,
|
||||||
|
$\texttt{toLeaf}(1)=2$, because there is a path
|
||||||
|
$1 \rightarrow 2 \rightarrow 6$,
|
||||||
|
and $\texttt{maxLength}(1)=4$,
|
||||||
|
because there is a path
|
||||||
|
$6 \rightarrow 2 \rightarrow 1 \rightarrow 4 \rightarrow 7$.
|
||||||
|
In this case, $\texttt{maxLength}(1)$ equals the diameter.
|
||||||
|
|
||||||
|
Dynamic programming can be used to calculate the above
|
||||||
|
values for all nodes in $O(n)$ time.
|
||||||
|
First, to calculate $\texttt{toLeaf}(x)$,
|
||||||
|
we go through the children of $x$,
|
||||||
|
choose a child $c$ with maximum $\texttt{toLeaf}(c)$
|
||||||
|
and add one to this value.
|
||||||
|
Then, to calculate $\texttt{maxLength}(x)$,
|
||||||
|
we choose two distinct children $a$ and $b$
|
||||||
|
such that the sum $\texttt{toLeaf}(a)+\texttt{toLeaf}(b)$
|
||||||
|
is maximum and add two to this sum.
|
||||||
|
|
||||||
|
\subsubsection{Algorithm 2}
|
||||||
|
|
||||||
|
Another efficient way to calculate the diameter
|
||||||
|
of a tree is based on two depth-first searches.
|
||||||
|
First, we choose an arbitrary node $a$ in the tree
|
||||||
|
and find the farthest node $b$ from $a$.
|
||||||
|
Then, we find the farthest node $c$ from $b$.
|
||||||
|
The diameter of the tree is the distance between $b$ and $c$.
|
||||||
|
|
||||||
|
In the following graph, $a$, $b$ and $c$ could be:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (0,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (2,3) {$4$};
|
||||||
|
\node[draw, circle] (3) at (0,1) {$2$};
|
||||||
|
\node[draw, circle] (4) at (2,1) {$3$};
|
||||||
|
\node[draw, circle] (5) at (4,1) {$7$};
|
||||||
|
\node[draw, circle] (6) at (-2,3) {$5$};
|
||||||
|
\node[draw, circle] (7) at (-2,1) {$6$};
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (3) -- (6);
|
||||||
|
\path[draw,thick,-] (3) -- (7);
|
||||||
|
\node[color=red] at (2,1.6) {$a$};
|
||||||
|
\node[color=red] at (-2,1.6) {$b$};
|
||||||
|
\node[color=red] at (4,1.6) {$c$};
|
||||||
|
|
||||||
|
\path[draw,thick,-,color=red,line width=2pt] (7) -- (3);
|
||||||
|
\path[draw,thick,-,color=red,line width=2pt] (3) -- (1);
|
||||||
|
\path[draw,thick,-,color=red,line width=2pt] (1) -- (2);
|
||||||
|
\path[draw,thick,-,color=red,line width=2pt] (2) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
This is an elegant method, but why does it work?
|
||||||
|
|
||||||
|
It helps to draw the tree differently so that
|
||||||
|
the path that corresponds to the diameter
|
||||||
|
is horizontal, and all other
|
||||||
|
nodes hang from it:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (2,1) {$1$};
|
||||||
|
\node[draw, circle] (2) at (4,1) {$4$};
|
||||||
|
\node[draw, circle] (3) at (0,1) {$2$};
|
||||||
|
\node[draw, circle] (4) at (2,-1) {$3$};
|
||||||
|
\node[draw, circle] (5) at (6,1) {$7$};
|
||||||
|
\node[draw, circle] (6) at (0,-1) {$5$};
|
||||||
|
\node[draw, circle] (7) at (-2,1) {$6$};
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (3) -- (6);
|
||||||
|
\path[draw,thick,-] (3) -- (7);
|
||||||
|
\node[color=red] at (2,-1.6) {$a$};
|
||||||
|
\node[color=red] at (-2,1.6) {$b$};
|
||||||
|
\node[color=red] at (6,1.6) {$c$};
|
||||||
|
\node[color=red] at (2,1.6) {$x$};
|
||||||
|
|
||||||
|
\path[draw,thick,-,color=red,line width=2pt] (7) -- (3);
|
||||||
|
\path[draw,thick,-,color=red,line width=2pt] (3) -- (1);
|
||||||
|
\path[draw,thick,-,color=red,line width=2pt] (1) -- (2);
|
||||||
|
\path[draw,thick,-,color=red,line width=2pt] (2) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Node $x$ indicates the place where the path
|
||||||
|
from node $a$ joins the path that corresponds
|
||||||
|
to the diameter.
|
||||||
|
The farthest node from $a$
|
||||||
|
is node $b$, node $c$ or some other node
|
||||||
|
that is at least as far from node $x$.
|
||||||
|
Thus, this node is always a valid choice for
|
||||||
|
an endpoint of a path that corresponds to the diameter.
|
||||||
|
|
||||||
|
\section{All longest paths}
|
||||||
|
|
||||||
|
Our next problem is to calculate for every node
|
||||||
|
in the tree the maximum length of a path
|
||||||
|
that begins at the node.
|
||||||
|
This can be seen as a generalization of the
|
||||||
|
tree diameter problem, because the largest of those
|
||||||
|
lengths equals the diameter of the tree.
|
||||||
|
Also this problem can be solved in $O(n)$ time.
|
||||||
|
|
||||||
|
As an example, consider the following tree:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (0,0) {$1$};
|
||||||
|
\node[draw, circle] (2) at (-1.5,-1) {$4$};
|
||||||
|
\node[draw, circle] (3) at (2,0) {$2$};
|
||||||
|
\node[draw, circle] (4) at (-1.5,1) {$3$};
|
||||||
|
\node[draw, circle] (6) at (3.5,-1) {$6$};
|
||||||
|
\node[draw, circle] (7) at (3.5,1) {$5$};
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (6);
|
||||||
|
\path[draw,thick,-] (3) -- (7);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Let $\texttt{maxLength}(x)$ denote the maximum length
|
||||||
|
of a path that begins at node $x$.
|
||||||
|
For example, in the above tree,
|
||||||
|
$\texttt{maxLength}(4)=3$, because there
|
||||||
|
is a path $4 \rightarrow 1 \rightarrow 2 \rightarrow 6$.
|
||||||
|
Here is a complete table of the values:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tabular}{l|lllllll}
|
||||||
|
node $x$ & 1 & 2 & 3 & 4 & 5 & 6 \\
|
||||||
|
$\texttt{maxLength}(x)$ & 2 & 2 & 3 & 3 & 3 & 3 \\
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Also in this problem, a good starting point
|
||||||
|
for solving the problem is to root the tree arbitrarily:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (0,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (2,1) {$4$};
|
||||||
|
\node[draw, circle] (3) at (-2,1) {$2$};
|
||||||
|
\node[draw, circle] (4) at (0,1) {$3$};
|
||||||
|
\node[draw, circle] (6) at (-3,-1) {$5$};
|
||||||
|
\node[draw, circle] (7) at (-1,-1) {$6$};
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (6);
|
||||||
|
\path[draw,thick,-] (3) -- (7);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
The first part of the problem is to calculate for every node $x$
|
||||||
|
the maximum length of a path that goes through a child of $x$.
|
||||||
|
For example, the longest path from node 1
|
||||||
|
goes through its child 2:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (0,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (2,1) {$4$};
|
||||||
|
\node[draw, circle] (3) at (-2,1) {$2$};
|
||||||
|
\node[draw, circle] (4) at (0,1) {$3$};
|
||||||
|
\node[draw, circle] (6) at (-3,-1) {$5$};
|
||||||
|
\node[draw, circle] (7) at (-1,-1) {$6$};
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (6);
|
||||||
|
\path[draw,thick,-] (3) -- (7);
|
||||||
|
|
||||||
|
\path[draw,thick,->,color=red,line width=2pt] (1) -- (3);
|
||||||
|
\path[draw,thick,->,color=red,line width=2pt] (3) -- (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
This part is easy to solve in $O(n)$ time, because we can use
|
||||||
|
dynamic programming as we have done previously.
|
||||||
|
|
||||||
|
Then, the second part of the problem is to calculate
|
||||||
|
for every node $x$ the maximum length of a path
|
||||||
|
through its parent $p$.
|
||||||
|
For example, the longest path
|
||||||
|
from node 3 goes through its parent 1:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (0,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (2,1) {$4$};
|
||||||
|
\node[draw, circle] (3) at (-2,1) {$2$};
|
||||||
|
\node[draw, circle] (4) at (0,1) {$3$};
|
||||||
|
\node[draw, circle] (6) at (-3,-1) {$5$};
|
||||||
|
\node[draw, circle] (7) at (-1,-1) {$6$};
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (6);
|
||||||
|
\path[draw,thick,-] (3) -- (7);
|
||||||
|
|
||||||
|
\path[draw,thick,->,color=red,line width=2pt] (4) -- (1);
|
||||||
|
\path[draw,thick,->,color=red,line width=2pt] (1) -- (3);
|
||||||
|
\path[draw,thick,->,color=red,line width=2pt] (3) -- (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
At first glance, it seems that we should choose
|
||||||
|
the longest path from $p$.
|
||||||
|
However, this \emph{does not} always work,
|
||||||
|
because the longest path from $p$
|
||||||
|
may go through $x$.
|
||||||
|
Here is an example of this situation:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (0,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (2,1) {$4$};
|
||||||
|
\node[draw, circle] (3) at (-2,1) {$2$};
|
||||||
|
\node[draw, circle] (4) at (0,1) {$3$};
|
||||||
|
\node[draw, circle] (6) at (-3,-1) {$5$};
|
||||||
|
\node[draw, circle] (7) at (-1,-1) {$6$};
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (6);
|
||||||
|
\path[draw,thick,-] (3) -- (7);
|
||||||
|
|
||||||
|
\path[draw,thick,->,color=red,line width=2pt] (3) -- (1);
|
||||||
|
\path[draw,thick,->,color=red,line width=2pt] (1) -- (2);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Still, we can solve the second part in
|
||||||
|
$O(n)$ time by storing \emph{two} maximum lengths
|
||||||
|
for each node $x$:
|
||||||
|
\begin{itemize}
|
||||||
|
\item $\texttt{maxLength}_1(x)$:
|
||||||
|
the maximum length of a path from $x$
|
||||||
|
\item $\texttt{maxLength}_2(x)$
|
||||||
|
the maximum length of a path from $x$
|
||||||
|
in another direction than the first path
|
||||||
|
\end{itemize}
|
||||||
|
For example, in the above graph,
|
||||||
|
$\texttt{maxLength}_1(1)=2$
|
||||||
|
using the path $1 \rightarrow 2 \rightarrow 5$,
|
||||||
|
and $\texttt{maxLength}_2(1)=1$
|
||||||
|
using the path $1 \rightarrow 3$.
|
||||||
|
|
||||||
|
Finally, if the path that corresponds to
|
||||||
|
$\texttt{maxLength}_1(p)$ goes through $x$,
|
||||||
|
we conclude that the maximum length is
|
||||||
|
$\texttt{maxLength}_2(p)+1$,
|
||||||
|
and otherwise the maximum length is
|
||||||
|
$\texttt{maxLength}_1(p)+1$.
|
||||||
|
|
||||||
|
|
||||||
|
\section{Binary trees}
|
||||||
|
|
||||||
|
\index{binary tree}
|
||||||
|
|
||||||
|
\begin{samepage}
|
||||||
|
A \key{binary tree} is a rooted tree
|
||||||
|
where each node has a left and right subtree.
|
||||||
|
It is possible that a subtree of a node is empty.
|
||||||
|
Thus, every node in a binary tree has
|
||||||
|
zero, one or two children.
|
||||||
|
|
||||||
|
For example, the following tree is a binary tree:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (0,0) {$1$};
|
||||||
|
\node[draw, circle] (2) at (-1.5,-1.5) {$2$};
|
||||||
|
\node[draw, circle] (3) at (1.5,-1.5) {$3$};
|
||||||
|
\node[draw, circle] (4) at (-3,-3) {$4$};
|
||||||
|
\node[draw, circle] (5) at (0,-3) {$5$};
|
||||||
|
\node[draw, circle] (6) at (-1.5,-4.5) {$6$};
|
||||||
|
\node[draw, circle] (7) at (3,-3) {$7$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (2) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (5) -- (6);
|
||||||
|
\path[draw,thick,-] (3) -- (7);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
\end{samepage}
|
||||||
|
|
||||||
|
\index{pre-order}
|
||||||
|
\index{in-order}
|
||||||
|
\index{post-order}
|
||||||
|
|
||||||
|
The nodes of a binary tree have three natural
|
||||||
|
orderings that correspond to different ways to
|
||||||
|
recursively traverse the tree:
|
||||||
|
|
||||||
|
\begin{itemize}
|
||||||
|
\item \key{pre-order}: first process the root,
|
||||||
|
then traverse the left subtree, then traverse the right subtree
|
||||||
|
\item \key{in-order}: first traverse the left subtree,
|
||||||
|
then process the root, then traverse the right subtree
|
||||||
|
\item \key{post-order}: first traverse the left subtree,
|
||||||
|
then traverse the right subtree, then process the root
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
For the above tree, the nodes in
|
||||||
|
pre-order are
|
||||||
|
$[1,2,4,5,6,3,7]$,
|
||||||
|
in in-order $[4,2,6,5,1,3,7]$
|
||||||
|
and in post-order $[4,6,5,2,7,3,1]$.
|
||||||
|
|
||||||
|
If we know the pre-order and in-order
|
||||||
|
of a tree, we can reconstruct the exact structure of the tree.
|
||||||
|
For example, the above tree is the only possible tree
|
||||||
|
with pre-order $[1,2,4,5,6,3,7]$ and
|
||||||
|
in-order $[4,2,6,5,1,3,7]$.
|
||||||
|
In a similar way, the post-order and in-order
|
||||||
|
also determine the structure of a tree.
|
||||||
|
|
||||||
|
However, the situation is different if we only know
|
||||||
|
the pre-order and post-order of a tree.
|
||||||
|
In this case, there may be more than one tree
|
||||||
|
that match the orderings.
|
||||||
|
For example, in both of the trees
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (0,0) {$1$};
|
||||||
|
\node[draw, circle] (2) at (-1.5,-1.5) {$2$};
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
|
||||||
|
\node[draw, circle] (1b) at (0+4,0) {$1$};
|
||||||
|
\node[draw, circle] (2b) at (1.5+4,-1.5) {$2$};
|
||||||
|
\path[draw,thick,-] (1b) -- (2b);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
the pre-order is $[1,2]$ and the post-order is $[2,1]$,
|
||||||
|
but the structures of the trees are different.
|
||||||
|
|
|
@ -0,0 +1,712 @@
|
||||||
|
\chapter{Spanning trees}
|
||||||
|
|
||||||
|
\index{spanning tree}
|
||||||
|
|
||||||
|
A \key{spanning tree} of a graph consists of
|
||||||
|
all nodes of the graph and some of the
|
||||||
|
edges of the graph so that there is a path
|
||||||
|
between any two nodes.
|
||||||
|
Like trees in general, spanning trees are
|
||||||
|
connected and acyclic.
|
||||||
|
Usually there are several ways to construct a spanning tree.
|
||||||
|
|
||||||
|
For example, consider the following graph:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1.5,2) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,3) {$3$};
|
||||||
|
\node[draw, circle] (4) at (6.5,2) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,1) {$5$};
|
||||||
|
\node[draw, circle] (6) at (5,1) {$6$};
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
|
||||||
|
\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
|
||||||
|
\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
|
||||||
|
\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
|
||||||
|
\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
One spanning tree for the graph is as follows:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1.5,2) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,3) {$3$};
|
||||||
|
\node[draw, circle] (4) at (6.5,2) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,1) {$5$};
|
||||||
|
\node[draw, circle] (6) at (5,1) {$6$};
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
|
||||||
|
\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
|
||||||
|
\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
|
||||||
|
\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
The weight of a spanning tree is the sum of its edge weights.
|
||||||
|
For example, the weight of the above spanning tree is
|
||||||
|
$3+5+9+3+2=22$.
|
||||||
|
|
||||||
|
\index{minimum spanning tree}
|
||||||
|
|
||||||
|
A \key{minimum spanning tree}
|
||||||
|
is a spanning tree whose weight is as small as possible.
|
||||||
|
The weight of a minimum spanning tree for the example graph
|
||||||
|
is 20, and such a tree can be constructed as follows:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1.5,2) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,3) {$3$};
|
||||||
|
\node[draw, circle] (4) at (6.5,2) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,1) {$5$};
|
||||||
|
\node[draw, circle] (6) at (5,1) {$6$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
|
||||||
|
\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
|
||||||
|
\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
|
||||||
|
\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
\index{maximum spanning tree}
|
||||||
|
|
||||||
|
In a similar way, a \key{maximum spanning tree}
|
||||||
|
is a spanning tree whose weight is as large as possible.
|
||||||
|
The weight of a maximum spanning tree for the
|
||||||
|
example graph is 32:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1.5,2) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,3) {$3$};
|
||||||
|
\node[draw, circle] (4) at (6.5,2) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,1) {$5$};
|
||||||
|
\node[draw, circle] (6) at (5,1) {$6$};
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
|
||||||
|
\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
|
||||||
|
\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Note that a graph may have several
|
||||||
|
minimum and maximum spanning trees,
|
||||||
|
so the trees are not unique.
|
||||||
|
|
||||||
|
It turns out that several greedy methods
|
||||||
|
can be used to construct minimum and maximum
|
||||||
|
spanning trees.
|
||||||
|
In this chapter, we discuss two algorithms
|
||||||
|
that process
|
||||||
|
the edges of the graph ordered by their weights.
|
||||||
|
We focus on finding minimum spanning trees,
|
||||||
|
but the same algorithms can find
|
||||||
|
maximum spanning trees by processing the edges in reverse order.
|
||||||
|
|
||||||
|
\section{Kruskal's algorithm}
|
||||||
|
|
||||||
|
\index{Kruskal's algorithm}
|
||||||
|
|
||||||
|
In \key{Kruskal's algorithm}\footnote{The algorithm was published in 1956
|
||||||
|
by J. B. Kruskal \cite{kru56}.}, the initial spanning tree
|
||||||
|
only contains the nodes of the graph
|
||||||
|
and does not contain any edges.
|
||||||
|
Then the algorithm goes through the edges
|
||||||
|
ordered by their weights, and always adds an edge
|
||||||
|
to the tree if it does not create a cycle.
|
||||||
|
|
||||||
|
The algorithm maintains the components
|
||||||
|
of the tree.
|
||||||
|
Initially, each node of the graph
|
||||||
|
belongs to a separate component.
|
||||||
|
Always when an edge is added to the tree,
|
||||||
|
two components are joined.
|
||||||
|
Finally, all nodes belong to the same component,
|
||||||
|
and a minimum spanning tree has been found.
|
||||||
|
|
||||||
|
\subsubsection{Example}
|
||||||
|
|
||||||
|
\begin{samepage}
|
||||||
|
Let us consider how Kruskal's algorithm processes the
|
||||||
|
following graph:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1.5,2) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,3) {$3$};
|
||||||
|
\node[draw, circle] (4) at (6.5,2) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,1) {$5$};
|
||||||
|
\node[draw, circle] (6) at (5,1) {$6$};
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
|
||||||
|
\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
|
||||||
|
\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
|
||||||
|
\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
|
||||||
|
\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
\end{samepage}
|
||||||
|
|
||||||
|
\begin{samepage}
|
||||||
|
The first step of the algorithm is to sort the
|
||||||
|
edges in increasing order of their weights.
|
||||||
|
The result is the following list:
|
||||||
|
|
||||||
|
\begin{tabular}{ll}
|
||||||
|
\\
|
||||||
|
edge & weight \\
|
||||||
|
\hline
|
||||||
|
5--6 & 2 \\
|
||||||
|
1--2 & 3 \\
|
||||||
|
3--6 & 3 \\
|
||||||
|
1--5 & 5 \\
|
||||||
|
2--3 & 5 \\
|
||||||
|
2--5 & 6 \\
|
||||||
|
4--6 & 7 \\
|
||||||
|
3--4 & 9 \\
|
||||||
|
\\
|
||||||
|
\end{tabular}
|
||||||
|
\end{samepage}
|
||||||
|
|
||||||
|
After this, the algorithm goes through the list
|
||||||
|
and adds each edge to the tree if it joins
|
||||||
|
two separate components.
|
||||||
|
|
||||||
|
Initially, each node is in its own component:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1.5,2) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,3) {$3$};
|
||||||
|
\node[draw, circle] (4) at (6.5,2) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,1) {$5$};
|
||||||
|
\node[draw, circle] (6) at (5,1) {$6$};
|
||||||
|
%\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
|
||||||
|
%\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
|
||||||
|
%\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
|
||||||
|
%\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
|
||||||
|
%\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
|
||||||
|
%\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
|
||||||
|
%\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
|
||||||
|
%\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
The first edge to be added to the tree is
|
||||||
|
the edge 5--6 that creates a component $\{5,6\}$
|
||||||
|
by joining the components $\{5\}$ and $\{6\}$:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\node[draw, circle] (1) at (1.5,2) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,3) {$3$};
|
||||||
|
\node[draw, circle] (4) at (6.5,2) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,1) {$5$};
|
||||||
|
\node[draw, circle] (6) at (5,1) {$6$};
|
||||||
|
|
||||||
|
%\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
|
||||||
|
%\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
|
||||||
|
%\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
|
||||||
|
%\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
|
||||||
|
\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
|
||||||
|
%\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
|
||||||
|
%\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
|
||||||
|
%\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
After this, the edges 1--2, 3--6 and 1--5 are added in a similar way:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1.5,2) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,3) {$3$};
|
||||||
|
\node[draw, circle] (4) at (6.5,2) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,1) {$5$};
|
||||||
|
\node[draw, circle] (6) at (5,1) {$6$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
|
||||||
|
%\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
|
||||||
|
%\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
|
||||||
|
\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
|
||||||
|
%\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
|
||||||
|
%\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
|
||||||
|
\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
After those steps, most components have been joined
|
||||||
|
and there are two components in the tree:
|
||||||
|
$\{1,2,3,5,6\}$ and $\{4\}$.
|
||||||
|
|
||||||
|
The next edge in the list is the edge 2--3,
|
||||||
|
but it will not be included in the tree, because
|
||||||
|
nodes 2 and 3 are already in the same component.
|
||||||
|
For the same reason, the edge 2--5 will not be included in the tree.
|
||||||
|
|
||||||
|
\begin{samepage}
|
||||||
|
Finally, the edge 4--6 will be included in the tree:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1.5,2) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,3) {$3$};
|
||||||
|
\node[draw, circle] (4) at (6.5,2) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,1) {$5$};
|
||||||
|
\node[draw, circle] (6) at (5,1) {$6$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
|
||||||
|
%\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
|
||||||
|
%\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
|
||||||
|
\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
|
||||||
|
\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
|
||||||
|
%\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
|
||||||
|
\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
\end{samepage}
|
||||||
|
|
||||||
|
After this, the algorithm will not add any
|
||||||
|
new edges, because the graph is connected
|
||||||
|
and there is a path between any two nodes.
|
||||||
|
The resulting graph is a minimum spanning tree
|
||||||
|
with weight $2+3+3+5+7=20$.
|
||||||
|
|
||||||
|
\subsubsection{Why does this work?}
|
||||||
|
|
||||||
|
It is a good question why Kruskal's algorithm works.
|
||||||
|
Why does the greedy strategy guarantee that we
|
||||||
|
will find a minimum spanning tree?
|
||||||
|
|
||||||
|
Let us see what happens if the minimum weight edge of
|
||||||
|
the graph is \emph{not} included in the spanning tree.
|
||||||
|
For example, suppose that a spanning tree
|
||||||
|
for the previous graph would not contain the
|
||||||
|
minimum weight edge 5--6.
|
||||||
|
We do not know the exact structure of such a spanning tree,
|
||||||
|
but in any case it has to contain some edges.
|
||||||
|
Assume that the tree would be as follows:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1.5,2) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,3) {$3$};
|
||||||
|
\node[draw, circle] (4) at (6.5,2) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,1) {$5$};
|
||||||
|
\node[draw, circle] (6) at (5,1) {$6$};
|
||||||
|
|
||||||
|
\path[draw,thick,-,dashed] (1) -- (2);
|
||||||
|
\path[draw,thick,-,dashed] (2) -- (5);
|
||||||
|
\path[draw,thick,-,dashed] (2) -- (3);
|
||||||
|
\path[draw,thick,-,dashed] (3) -- (4);
|
||||||
|
\path[draw,thick,-,dashed] (4) -- (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
However, it is not possible that the above tree
|
||||||
|
would be a minimum spanning tree for the graph.
|
||||||
|
The reason for this is that we can remove an edge
|
||||||
|
from the tree and replace it with the minimum weight edge 5--6.
|
||||||
|
This produces a spanning tree whose weight is
|
||||||
|
\emph{smaller}:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1.5,2) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,3) {$3$};
|
||||||
|
\node[draw, circle] (4) at (6.5,2) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,1) {$5$};
|
||||||
|
\node[draw, circle] (6) at (5,1) {$6$};
|
||||||
|
|
||||||
|
\path[draw,thick,-,dashed] (1) -- (2);
|
||||||
|
\path[draw,thick,-,dashed] (2) -- (5);
|
||||||
|
\path[draw,thick,-,dashed] (3) -- (4);
|
||||||
|
\path[draw,thick,-,dashed] (4) -- (6);
|
||||||
|
\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
For this reason, it is always optimal
|
||||||
|
to include the minimum weight edge
|
||||||
|
in the tree to produce a minimum spanning tree.
|
||||||
|
Using a similar argument, we can show that it
|
||||||
|
is also optimal to add the next edge in weight order
|
||||||
|
to the tree, and so on.
|
||||||
|
Hence, Kruskal's algorithm works correctly and
|
||||||
|
always produces a minimum spanning tree.
|
||||||
|
|
||||||
|
\subsubsection{Implementation}
|
||||||
|
|
||||||
|
When implementing Kruskal's algorithm,
|
||||||
|
it is convenient to use
|
||||||
|
the edge list representation of the graph.
|
||||||
|
The first phase of the algorithm sorts the
|
||||||
|
edges in the list in $O(m \log m)$ time.
|
||||||
|
After this, the second phase of the algorithm
|
||||||
|
builds the minimum spanning tree as follows:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
for (...) {
|
||||||
|
if (!same(a,b)) unite(a,b);
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
The loop goes through the edges in the list
|
||||||
|
and always processes an edge $a$--$b$
|
||||||
|
where $a$ and $b$ are two nodes.
|
||||||
|
Two functions are needed:
|
||||||
|
the function \texttt{same} determines
|
||||||
|
if $a$ and $b$ are in the same component,
|
||||||
|
and the function \texttt{unite}
|
||||||
|
joins the components that contain $a$ and $b$.
|
||||||
|
|
||||||
|
The problem is how to efficiently implement
|
||||||
|
the functions \texttt{same} and \texttt{unite}.
|
||||||
|
One possibility is to implement the function
|
||||||
|
\texttt{same} as a graph traversal and check if
|
||||||
|
we can get from node $a$ to node $b$.
|
||||||
|
However, the time complexity of such a function
|
||||||
|
would be $O(n+m)$
|
||||||
|
and the resulting algorithm would be slow,
|
||||||
|
because the function \texttt{same} will be called for each edge in the graph.
|
||||||
|
|
||||||
|
We will solve the problem using a union-find structure
|
||||||
|
that implements both functions in $O(\log n)$ time.
|
||||||
|
Thus, the time complexity of Kruskal's algorithm
|
||||||
|
will be $O(m \log n)$ after sorting the edge list.
|
||||||
|
|
||||||
|
\section{Union-find structure}
|
||||||
|
|
||||||
|
\index{union-find structure}
|
||||||
|
|
||||||
|
A \key{union-find structure} maintains
|
||||||
|
a collection of sets.
|
||||||
|
The sets are disjoint, so no element
|
||||||
|
belongs to more than one set.
|
||||||
|
Two $O(\log n)$ time operations are supported:
|
||||||
|
the \texttt{unite} operation joins two sets,
|
||||||
|
and the \texttt{find} operation finds the representative
|
||||||
|
of the set that contains a given element\footnote{The structure presented here
|
||||||
|
was introduced in 1971 by J. D. Hopcroft and J. D. Ullman \cite{hop71}.
|
||||||
|
Later, in 1975, R. E. Tarjan studied a more sophisticated variant
|
||||||
|
of the structure \cite{tar75} that is discussed in many algorithm
|
||||||
|
textbooks nowadays.}.
|
||||||
|
|
||||||
|
\subsubsection{Structure}
|
||||||
|
|
||||||
|
In a union-find structure, one element in each set
|
||||||
|
is the representative of the set,
|
||||||
|
and there is a chain from any other element of the
|
||||||
|
set to the representative.
|
||||||
|
For example, assume that the sets are
|
||||||
|
$\{1,4,7\}$, $\{5\}$ and $\{2,3,6,8\}$:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\node[draw, circle] (1) at (0,-1) {$1$};
|
||||||
|
\node[draw, circle] (2) at (7,0) {$2$};
|
||||||
|
\node[draw, circle] (3) at (7,-1.5) {$3$};
|
||||||
|
\node[draw, circle] (4) at (1,0) {$4$};
|
||||||
|
\node[draw, circle] (5) at (4,0) {$5$};
|
||||||
|
\node[draw, circle] (6) at (6,-2.5) {$6$};
|
||||||
|
\node[draw, circle] (7) at (2,-1) {$7$};
|
||||||
|
\node[draw, circle] (8) at (8,-2.5) {$8$};
|
||||||
|
|
||||||
|
\path[draw,thick,->] (1) -- (4);
|
||||||
|
\path[draw,thick,->] (7) -- (4);
|
||||||
|
|
||||||
|
\path[draw,thick,->] (3) -- (2);
|
||||||
|
\path[draw,thick,->] (6) -- (3);
|
||||||
|
\path[draw,thick,->] (8) -- (3);
|
||||||
|
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
In this case the representatives
|
||||||
|
of the sets are 4, 5 and 2.
|
||||||
|
We can find the representative of any element
|
||||||
|
by following the chain that begins at the element.
|
||||||
|
For example, the element 2 is the representative
|
||||||
|
for the element 6, because
|
||||||
|
we follow the chain $6 \rightarrow 3 \rightarrow 2$.
|
||||||
|
Two elements belong to the same set exactly when
|
||||||
|
their representatives are the same.
|
||||||
|
|
||||||
|
Two sets can be joined by connecting the
|
||||||
|
representative of one set to the
|
||||||
|
representative of the other set.
|
||||||
|
For example, the sets
|
||||||
|
$\{1,4,7\}$ and $\{2,3,6,8\}$
|
||||||
|
can be joined as follows:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\node[draw, circle] (1) at (2,-1) {$1$};
|
||||||
|
\node[draw, circle] (2) at (7,0) {$2$};
|
||||||
|
\node[draw, circle] (3) at (7,-1.5) {$3$};
|
||||||
|
\node[draw, circle] (4) at (3,0) {$4$};
|
||||||
|
\node[draw, circle] (6) at (6,-2.5) {$6$};
|
||||||
|
\node[draw, circle] (7) at (4,-1) {$7$};
|
||||||
|
\node[draw, circle] (8) at (8,-2.5) {$8$};
|
||||||
|
|
||||||
|
\path[draw,thick,->] (1) -- (4);
|
||||||
|
\path[draw,thick,->] (7) -- (4);
|
||||||
|
|
||||||
|
\path[draw,thick,->] (3) -- (2);
|
||||||
|
\path[draw,thick,->] (6) -- (3);
|
||||||
|
\path[draw,thick,->] (8) -- (3);
|
||||||
|
|
||||||
|
\path[draw,thick,->] (4) -- (2);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
The resulting set contains the elements
|
||||||
|
$\{1,2,3,4,6,7,8\}$.
|
||||||
|
From this on, the element 2 is the representative
|
||||||
|
for the entire set and the old representative 4
|
||||||
|
points to the element 2.
|
||||||
|
|
||||||
|
The efficiency of the union-find structure depends on
|
||||||
|
how the sets are joined.
|
||||||
|
It turns out that we can follow a simple strategy:
|
||||||
|
always connect the representative of the
|
||||||
|
\emph{smaller} set to the representative of the \emph{larger} set
|
||||||
|
(or if the sets are of equal size,
|
||||||
|
we can make an arbitrary choice).
|
||||||
|
Using this strategy, the length of any chain
|
||||||
|
will be $O(\log n)$, so we can
|
||||||
|
find the representative of any element
|
||||||
|
efficiently by following the corresponding chain.
|
||||||
|
|
||||||
|
\subsubsection{Implementation}
|
||||||
|
|
||||||
|
The union-find structure can be implemented
|
||||||
|
using arrays.
|
||||||
|
In the following implementation,
|
||||||
|
the array \texttt{link} contains for each element
|
||||||
|
the next element
|
||||||
|
in the chain or the element itself if it is
|
||||||
|
a representative,
|
||||||
|
and the array \texttt{size} indicates for each representative
|
||||||
|
the size of the corresponding set.
|
||||||
|
|
||||||
|
Initially, each element belongs to a separate set:
|
||||||
|
\begin{lstlisting}
|
||||||
|
for (int i = 1; i <= n; i++) link[i] = i;
|
||||||
|
for (int i = 1; i <= n; i++) size[i] = 1;
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
The function \texttt{find} returns
|
||||||
|
the representative for an element $x$.
|
||||||
|
The representative can be found by following
|
||||||
|
the chain that begins at $x$.
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
int find(int x) {
|
||||||
|
while (x != link[x]) x = link[x];
|
||||||
|
return x;
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
The function \texttt{same} checks
|
||||||
|
whether elements $a$ and $b$ belong to the same set.
|
||||||
|
This can easily be done by using the
|
||||||
|
function \texttt{find}:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
bool same(int a, int b) {
|
||||||
|
return find(a) == find(b);
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
\begin{samepage}
|
||||||
|
The function \texttt{unite} joins the sets
|
||||||
|
that contain elements $a$ and $b$
|
||||||
|
(the elements have to be in different sets).
|
||||||
|
The function first finds the representatives
|
||||||
|
of the sets and then connects the smaller
|
||||||
|
set to the larger set.
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
void unite(int a, int b) {
|
||||||
|
a = find(a);
|
||||||
|
b = find(b);
|
||||||
|
if (size[a] < size[b]) swap(a,b);
|
||||||
|
size[a] += size[b];
|
||||||
|
link[b] = a;
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
\end{samepage}
|
||||||
|
|
||||||
|
The time complexity of the function \texttt{find}
|
||||||
|
is $O(\log n)$ assuming that the length of each
|
||||||
|
chain is $O(\log n)$.
|
||||||
|
In this case, the functions \texttt{same} and \texttt{unite}
|
||||||
|
also work in $O(\log n)$ time.
|
||||||
|
The function \texttt{unite} makes sure that the
|
||||||
|
length of each chain is $O(\log n)$ by connecting
|
||||||
|
the smaller set to the larger set.
|
||||||
|
|
||||||
|
\section{Prim's algorithm}
|
||||||
|
|
||||||
|
\index{Prim's algorithm}
|
||||||
|
|
||||||
|
\key{Prim's algorithm}\footnote{The algorithm is
|
||||||
|
named after R. C. Prim who published it in 1957 \cite{pri57}.
|
||||||
|
However, the same algorithm was discovered already in 1930
|
||||||
|
by V. Jarník.} is an alternative method
|
||||||
|
for finding a minimum spanning tree.
|
||||||
|
The algorithm first adds an arbitrary node
|
||||||
|
to the tree.
|
||||||
|
After this, the algorithm always chooses
|
||||||
|
a minimum-weight edge that
|
||||||
|
adds a new node to the tree.
|
||||||
|
Finally, all nodes have been added to the tree
|
||||||
|
and a minimum spanning tree has been found.
|
||||||
|
|
||||||
|
Prim's algorithm resembles Dijkstra's algorithm.
|
||||||
|
The difference is that Dijkstra's algorithm always
|
||||||
|
selects an edge whose distance from the starting
|
||||||
|
node is minimum, but Prim's algorithm simply selects
|
||||||
|
the minimum weight edge that adds a new node to the tree.
|
||||||
|
|
||||||
|
\subsubsection{Example}
|
||||||
|
|
||||||
|
Let us consider how Prim's algorithm works
|
||||||
|
in the following graph:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1.5,2) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,3) {$3$};
|
||||||
|
\node[draw, circle] (4) at (6.5,2) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,1) {$5$};
|
||||||
|
\node[draw, circle] (6) at (5,1) {$6$};
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
|
||||||
|
\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
|
||||||
|
\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
|
||||||
|
\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
|
||||||
|
\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
|
||||||
|
|
||||||
|
%\path[draw=red,thick,-,line width=2pt] (5) -- (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
Initially, there are no edges between the nodes:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1.5,2) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,3) {$3$};
|
||||||
|
\node[draw, circle] (4) at (6.5,2) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,1) {$5$};
|
||||||
|
\node[draw, circle] (6) at (5,1) {$6$};
|
||||||
|
%\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
|
||||||
|
%\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
|
||||||
|
%\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
|
||||||
|
%\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
|
||||||
|
%\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
|
||||||
|
%\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
|
||||||
|
%\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
|
||||||
|
%\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
An arbitrary node can be the starting node,
|
||||||
|
so let us choose node 1.
|
||||||
|
First, we add node 2 that is connected by
|
||||||
|
an edge of weight 3:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1.5,2) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,3) {$3$};
|
||||||
|
\node[draw, circle] (4) at (6.5,2) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,1) {$5$};
|
||||||
|
\node[draw, circle] (6) at (5,1) {$6$};
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
|
||||||
|
%\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
|
||||||
|
%\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
|
||||||
|
%\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
|
||||||
|
%\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
|
||||||
|
%\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
|
||||||
|
%\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
|
||||||
|
%\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
After this, there are two edges with weight 5,
|
||||||
|
so we can add either node 3 or node 5 to the tree.
|
||||||
|
Let us add node 3 first:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1.5,2) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,3) {$3$};
|
||||||
|
\node[draw, circle] (4) at (6.5,2) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,1) {$5$};
|
||||||
|
\node[draw, circle] (6) at (5,1) {$6$};
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
|
||||||
|
%\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
|
||||||
|
%\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
|
||||||
|
%\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
|
||||||
|
%\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
|
||||||
|
%\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
|
||||||
|
%\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
\begin{samepage}
|
||||||
|
The process continues until all nodes have been included in the tree:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1.5,2) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,3) {$3$};
|
||||||
|
\node[draw, circle] (4) at (6.5,2) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,1) {$5$};
|
||||||
|
\node[draw, circle] (6) at (5,1) {$6$};
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
|
||||||
|
%\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
|
||||||
|
%\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
|
||||||
|
\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
|
||||||
|
\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
|
||||||
|
%\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
|
||||||
|
\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
\end{samepage}
|
||||||
|
|
||||||
|
\subsubsection{Implementation}
|
||||||
|
|
||||||
|
Like Dijkstra's algorithm, Prim's algorithm can be
|
||||||
|
efficiently implemented using a priority queue.
|
||||||
|
The priority queue should contain all nodes
|
||||||
|
that can be connected to the current component using
|
||||||
|
a single edge, in increasing order of the weights
|
||||||
|
of the corresponding edges.
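A minimal sketch of such an implementation follows; it assumes that the graph
is stored as adjacency lists so that \texttt{adj[$a$]} contains a pair $(w,b)$
for each edge of weight $w$ between nodes $a$ and $b$, and that \texttt{N}
is an upper bound for the number of nodes:
\begin{lstlisting}
vector<pair<int,int>> adj[N]; // assumed: adj[a] contains pairs (w,b)
bool added[N];                // added[a] is true if a is in the tree

long long prim() {
    long long total = 0;      // total weight of the spanning tree
    priority_queue<pair<int,int>> q;
    q.push({0,1});            // start building the tree from node 1
    while (!q.empty()) {
        auto [w,a] = q.top(); q.pop();
        if (added[a]) continue;
        added[a] = true;
        total += -w;          // weights are stored as negatives in q
        for (auto [ew,b] : adj[a]) {
            if (!added[b]) q.push({-ew,b});
        }
    }
    return total;
}
\end{lstlisting}
Storing negative weights turns the default max-heap
\texttt{priority\_queue} into a min-heap.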
|
||||||
|
|
||||||
|
The time complexity of Prim's algorithm is
$O(n + m \log m)$, which equals the time complexity
of Dijkstra's algorithm.
|
||||||
|
In practice, Prim's and Kruskal's algorithms
|
||||||
|
are both efficient, and the choice of the algorithm
|
||||||
|
is a matter of taste.
|
||||||
|
Still, most competitive programmers use Kruskal's algorithm.
|
|
|
||||||
|
\chapter{Directed graphs}
|
||||||
|
|
||||||
|
In this chapter, we focus on two classes of directed graphs:
|
||||||
|
\begin{itemize}
|
||||||
|
\item \key{Acyclic graphs}:
|
||||||
|
There are no cycles in the graph,
|
||||||
|
so there is no path from any node to itself\footnote{Directed acyclic
|
||||||
|
graphs are sometimes called DAGs.}.
|
||||||
|
\item \key{Successor graphs}:
|
||||||
|
The outdegree of each node is 1,
|
||||||
|
so each node has a unique successor.
|
||||||
|
\end{itemize}
|
||||||
|
It turns out that in both cases,
|
||||||
|
we can design efficient algorithms that are based
|
||||||
|
on the special properties of the graphs.
|
||||||
|
|
||||||
|
\section{Topological sorting}
|
||||||
|
|
||||||
|
\index{topological sorting}
|
||||||
|
\index{cycle}
|
||||||
|
|
||||||
|
A \key{topological sort} is an ordering
|
||||||
|
of the nodes of a directed graph
|
||||||
|
such that if there is a path from node $a$ to node $b$,
|
||||||
|
then node $a$ appears before node $b$ in the ordering.
|
||||||
|
For example, for the graph
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,5) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,5) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,5) {$3$};
|
||||||
|
\node[draw, circle] (4) at (1,3) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,3) {$5$};
|
||||||
|
\node[draw, circle] (6) at (5,3) {$6$};
|
||||||
|
|
||||||
|
\path[draw,thick,->,>=latex] (1) -- (2);
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- (3);
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- (1);
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- (5);
|
||||||
|
\path[draw,thick,->,>=latex] (5) -- (2);
|
||||||
|
\path[draw,thick,->,>=latex] (5) -- (3);
|
||||||
|
\path[draw,thick,->,>=latex] (3) -- (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
one topological sort is
|
||||||
|
$[4,1,5,2,3,6]$:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (-6,0) {$1$};
|
||||||
|
\node[draw, circle] (2) at (-3,0) {$2$};
|
||||||
|
\node[draw, circle] (3) at (-1.5,0) {$3$};
|
||||||
|
\node[draw, circle] (4) at (-7.5,0) {$4$};
|
||||||
|
\node[draw, circle] (5) at (-4.5,0) {$5$};
|
||||||
|
\node[draw, circle] (6) at (-0,0) {$6$};
|
||||||
|
|
||||||
|
\path[draw,thick,->,>=latex] (1) edge [bend right=30] (2);
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- (3);
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- (1);
|
||||||
|
\path[draw,thick,->,>=latex] (4) edge [bend left=30] (5);
|
||||||
|
\path[draw,thick,->,>=latex] (5) -- (2);
|
||||||
|
\path[draw,thick,->,>=latex] (5) edge [bend left=30] (3);
|
||||||
|
\path[draw,thick,->,>=latex] (3) -- (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
An acyclic graph always has a topological sort.
|
||||||
|
However, if the graph contains a cycle,
|
||||||
|
it is not possible to form a topological sort,
|
||||||
|
because no node of the cycle can appear
|
||||||
|
before the other nodes of the cycle in the ordering.
|
||||||
|
It turns out that depth-first search can be used
|
||||||
|
to both check if a directed graph contains a cycle
|
||||||
|
and, if it does not contain a cycle, to construct a topological sort.
|
||||||
|
|
||||||
|
\subsubsection{Algorithm}
|
||||||
|
|
||||||
|
The idea is to go through the nodes of the graph
|
||||||
|
and always begin a depth-first search at the current node
|
||||||
|
if it has not been processed yet.
|
||||||
|
During the searches, the nodes have three possible states:
|
||||||
|
|
||||||
|
\begin{itemize}
|
||||||
|
\item state 0: the node has not been processed (white)
|
||||||
|
\item state 1: the node is under processing (light gray)
|
||||||
|
\item state 2: the node has been processed (dark gray)
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
Initially, the state of each node is 0.
|
||||||
|
When a search reaches a node for the first time,
|
||||||
|
its state becomes 1.
|
||||||
|
Finally, after all successors of the node have
|
||||||
|
been processed, its state becomes 2.
|
||||||
|
|
||||||
|
If the graph contains a cycle, we will find this out
|
||||||
|
during the search, because sooner or later
|
||||||
|
we will arrive at a node whose state is 1.
|
||||||
|
In this case, it is not possible to construct a topological sort.
|
||||||
|
|
||||||
|
If the graph does not contain a cycle, we can construct
|
||||||
|
a topological sort by
|
||||||
|
adding each node to a list when the state of the node becomes 2.
|
||||||
|
This list in reverse order is a topological sort.
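The following is a minimal sketch of the algorithm, assuming that the graph
is stored as adjacency lists \texttt{adj} and that \texttt{N} is a suitable
upper bound for the number of nodes:
\begin{lstlisting}
vector<int> adj[N];    // adj[x] contains the nodes that x has an edge to
int state[N];          // 0 = not processed, 1 = under processing, 2 = processed
vector<int> order;     // nodes in the order in which their processing finishes
bool cycle = false;    // becomes true if the graph contains a cycle

void dfs(int x) {
    state[x] = 1;
    for (int y : adj[x]) {
        if (state[y] == 1) cycle = true;   // reached a node under processing
        else if (state[y] == 0) dfs(y);
    }
    state[x] = 2;
    order.push_back(x);
}
\end{lstlisting}
After calling \texttt{dfs} for every node whose state is 0, the reverse of
\texttt{order} is a topological sort, provided that \texttt{cycle} is false.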
|
||||||
|
|
||||||
|
\subsubsection{Example 1}
|
||||||
|
|
||||||
|
In the example graph, the search first proceeds
|
||||||
|
from node 1 to node 6:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle,fill=gray!20] (1) at (1,5) {$1$};
|
||||||
|
\node[draw, circle,fill=gray!20] (2) at (3,5) {$2$};
|
||||||
|
\node[draw, circle,fill=gray!20] (3) at (5,5) {$3$};
|
||||||
|
\node[draw, circle] (4) at (1,3) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,3) {$5$};
|
||||||
|
\node[draw, circle,fill=gray!80] (6) at (5,3) {$6$};
|
||||||
|
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- (1);
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- (5);
|
||||||
|
\path[draw,thick,->,>=latex] (5) -- (2);
|
||||||
|
\path[draw,thick,->,>=latex] (5) -- (3);
|
||||||
|
%\path[draw,thick,->,>=latex] (3) -- (6);
|
||||||
|
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (1) -- (2);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (2) -- (3);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (3) -- (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Now node 6 has been processed, so it is added to the list.
|
||||||
|
After this, nodes 3, 2 and 1 are also added to the list:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle,fill=gray!80] (1) at (1,5) {$1$};
|
||||||
|
\node[draw, circle,fill=gray!80] (2) at (3,5) {$2$};
|
||||||
|
\node[draw, circle,fill=gray!80] (3) at (5,5) {$3$};
|
||||||
|
\node[draw, circle] (4) at (1,3) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,3) {$5$};
|
||||||
|
\node[draw, circle,fill=gray!80] (6) at (5,3) {$6$};
|
||||||
|
|
||||||
|
\path[draw,thick,->,>=latex] (1) -- (2);
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- (3);
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- (1);
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- (5);
|
||||||
|
\path[draw,thick,->,>=latex] (5) -- (2);
|
||||||
|
\path[draw,thick,->,>=latex] (5) -- (3);
|
||||||
|
\path[draw,thick,->,>=latex] (3) -- (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
At this point, the list is $[6,3,2,1]$.
|
||||||
|
The next search begins at node 4:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle,fill=gray!80] (1) at (1,5) {$1$};
|
||||||
|
\node[draw, circle,fill=gray!80] (2) at (3,5) {$2$};
|
||||||
|
\node[draw, circle,fill=gray!80] (3) at (5,5) {$3$};
|
||||||
|
\node[draw, circle,fill=gray!20] (4) at (1,3) {$4$};
|
||||||
|
\node[draw, circle,fill=gray!80] (5) at (3,3) {$5$};
|
||||||
|
\node[draw, circle,fill=gray!80] (6) at (5,3) {$6$};
|
||||||
|
|
||||||
|
\path[draw,thick,->,>=latex] (1) -- (2);
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- (3);
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- (1);
|
||||||
|
%\path[draw,thick,->,>=latex] (4) -- (5);
|
||||||
|
\path[draw,thick,->,>=latex] (5) -- (2);
|
||||||
|
\path[draw,thick,->,>=latex] (5) -- (3);
|
||||||
|
\path[draw,thick,->,>=latex] (3) -- (6);
|
||||||
|
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (4) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Thus, the final list is $[6,3,2,1,5,4]$.
|
||||||
|
We have processed all nodes, so a topological sort has
|
||||||
|
been found.
|
||||||
|
The topological sort is the reverse list
|
||||||
|
$[4,5,1,2,3,6]$:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (3,0) {$1$};
|
||||||
|
\node[draw, circle] (2) at (4.5,0) {$2$};
|
||||||
|
\node[draw, circle] (3) at (6,0) {$3$};
|
||||||
|
\node[draw, circle] (4) at (0,0) {$4$};
|
||||||
|
\node[draw, circle] (5) at (1.5,0) {$5$};
|
||||||
|
\node[draw, circle] (6) at (7.5,0) {$6$};
|
||||||
|
|
||||||
|
\path[draw,thick,->,>=latex] (1) -- (2);
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- (3);
|
||||||
|
\path[draw,thick,->,>=latex] (4) edge [bend left=30] (1);
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- (5);
|
||||||
|
\path[draw,thick,->,>=latex] (5) edge [bend right=30] (2);
|
||||||
|
\path[draw,thick,->,>=latex] (5) edge [bend right=40] (3);
|
||||||
|
\path[draw,thick,->,>=latex] (3) -- (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Note that a topological sort is not necessarily unique:
a graph can have several topological sorts.
|
||||||
|
|
||||||
|
\subsubsection{Example 2}
|
||||||
|
|
||||||
|
Let us now consider a graph for which we
|
||||||
|
cannot construct a topological sort,
|
||||||
|
because the graph contains a cycle:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,5) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,5) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,5) {$3$};
|
||||||
|
\node[draw, circle] (4) at (1,3) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,3) {$5$};
|
||||||
|
\node[draw, circle] (6) at (5,3) {$6$};
|
||||||
|
|
||||||
|
\path[draw,thick,->,>=latex] (1) -- (2);
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- (3);
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- (1);
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- (5);
|
||||||
|
\path[draw,thick,->,>=latex] (5) -- (2);
|
||||||
|
\path[draw,thick,->,>=latex] (3) -- (5);
|
||||||
|
\path[draw,thick,->,>=latex] (3) -- (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
The search proceeds as follows:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle,fill=gray!20] (1) at (1,5) {$1$};
|
||||||
|
\node[draw, circle,fill=gray!20] (2) at (3,5) {$2$};
|
||||||
|
\node[draw, circle,fill=gray!20] (3) at (5,5) {$3$};
|
||||||
|
\node[draw, circle] (4) at (1,3) {$4$};
|
||||||
|
\node[draw, circle,fill=gray!20] (5) at (3,3) {$5$};
|
||||||
|
\node[draw, circle] (6) at (5,3) {$6$};
|
||||||
|
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- (1);
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- (5);
|
||||||
|
\path[draw,thick,->,>=latex] (3) -- (6);
|
||||||
|
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (1) -- (2);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (2) -- (3);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (3) -- (5);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (5) -- (2);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
The search reaches node 2 whose state is 1,
|
||||||
|
which means that the graph contains a cycle.
|
||||||
|
In this example, there is a cycle
|
||||||
|
$2 \rightarrow 3 \rightarrow 5 \rightarrow 2$.
|
||||||
|
|
||||||
|
\section{Dynamic programming}
|
||||||
|
|
||||||
|
If a directed graph is acyclic,
|
||||||
|
dynamic programming can be applied to it.
|
||||||
|
For example, we can efficiently solve the following
|
||||||
|
problems concerning paths from a starting node
|
||||||
|
to an ending node:
|
||||||
|
|
||||||
|
\begin{itemize}
|
||||||
|
\item how many different paths are there?
|
||||||
|
\item what is the shortest/longest path?
|
||||||
|
\item what is the minimum/maximum number of edges in a path?
|
||||||
|
\item which nodes certainly appear in any path?
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
\subsubsection{Counting the number of paths}
|
||||||
|
|
||||||
|
As an example, let us calculate the number of paths
|
||||||
|
from node 1 to node 6 in the following graph:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,5) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,5) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,5) {$3$};
|
||||||
|
\node[draw, circle] (4) at (1,3) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,3) {$5$};
|
||||||
|
\node[draw, circle] (6) at (5,3) {$6$};
|
||||||
|
|
||||||
|
\path[draw,thick,->,>=latex] (1) -- (2);
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- (3);
|
||||||
|
\path[draw,thick,->,>=latex] (1) -- (4);
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- (5);
|
||||||
|
\path[draw,thick,->,>=latex] (5) -- (2);
|
||||||
|
\path[draw,thick,->,>=latex] (5) -- (3);
|
||||||
|
\path[draw,thick,->,>=latex] (3) -- (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
There are a total of three such paths:
|
||||||
|
\begin{itemize}
|
||||||
|
\item $1 \rightarrow 2 \rightarrow 3 \rightarrow 6$
|
||||||
|
\item $1 \rightarrow 4 \rightarrow 5 \rightarrow 2 \rightarrow 3 \rightarrow 6$
|
||||||
|
\item $1 \rightarrow 4 \rightarrow 5 \rightarrow 3 \rightarrow 6$
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
Let $\texttt{paths}(x)$ denote the number of paths from
|
||||||
|
node 1 to node $x$.
|
||||||
|
As a base case, $\texttt{paths}(1)=1$.
|
||||||
|
Then, to calculate other values of $\texttt{paths}(x)$,
|
||||||
|
we may use the recursion
|
||||||
|
\[\texttt{paths}(x) = \texttt{paths}(a_1)+\texttt{paths}(a_2)+\cdots+\texttt{paths}(a_k)\]
|
||||||
|
where $a_1,a_2,\ldots,a_k$ are the nodes from which there
|
||||||
|
is an edge to $x$.
|
||||||
|
Since the graph is acyclic, the values of $\texttt{paths}(x)$
|
||||||
|
can be calculated in the order of a topological sort.
|
||||||
|
A topological sort for the above graph is as follows:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (0,0) {$1$};
|
||||||
|
\node[draw, circle] (2) at (4.5,0) {$2$};
|
||||||
|
\node[draw, circle] (3) at (6,0) {$3$};
|
||||||
|
\node[draw, circle] (4) at (1.5,0) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,0) {$5$};
|
||||||
|
\node[draw, circle] (6) at (7.5,0) {$6$};
|
||||||
|
|
||||||
|
\path[draw,thick,->,>=latex] (1) edge [bend left=30] (2);
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- (3);
|
||||||
|
\path[draw,thick,->,>=latex] (1) -- (4);
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- (5);
|
||||||
|
\path[draw,thick,->,>=latex] (5) -- (2);
|
||||||
|
\path[draw,thick,->,>=latex] (5) edge [bend right=30] (3);
|
||||||
|
\path[draw,thick,->,>=latex] (3) -- (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
Hence, the numbers of paths are as follows:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,5) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,5) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,5) {$3$};
|
||||||
|
\node[draw, circle] (4) at (1,3) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,3) {$5$};
|
||||||
|
\node[draw, circle] (6) at (5,3) {$6$};
|
||||||
|
|
||||||
|
\path[draw,thick,->,>=latex] (1) -- (2);
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- (3);
|
||||||
|
\path[draw,thick,->,>=latex] (1) -- (4);
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- (5);
|
||||||
|
\path[draw,thick,->,>=latex] (5) -- (2);
|
||||||
|
\path[draw,thick,->,>=latex] (5) -- (3);
|
||||||
|
\path[draw,thick,->,>=latex] (3) -- (6);
|
||||||
|
|
||||||
|
\node[color=red] at (1,2.3) {$1$};
|
||||||
|
\node[color=red] at (3,2.3) {$1$};
|
||||||
|
\node[color=red] at (5,2.3) {$3$};
|
||||||
|
\node[color=red] at (1,5.7) {$1$};
|
||||||
|
\node[color=red] at (3,5.7) {$2$};
|
||||||
|
\node[color=red] at (5,5.7) {$3$};
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
For example, to calculate the value of $\texttt{paths}(3)$,
|
||||||
|
we can use the formula $\texttt{paths}(2)+\texttt{paths}(5)$,
|
||||||
|
because there are edges from nodes 2 and 5
|
||||||
|
to node 3.
|
||||||
|
Since $\texttt{paths}(2)=2$ and $\texttt{paths}(5)=1$, we conclude that $\texttt{paths}(3)=3$.
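As a sketch, the calculation can be written as follows, assuming that the
vector \texttt{order} contains a topological sort of the nodes, that
\texttt{adj} stores the adjacency lists of the graph, and that \texttt{N}
is an upper bound for the number of nodes:
\begin{lstlisting}
long long paths[N];   // paths[x] = number of paths from node 1 to node x
paths[1] = 1;         // base case
for (int x : order) { // order: the nodes in a topological sort
    for (int b : adj[x]) {
        paths[b] += paths[x];  // extend every path ending at x by the edge to b
    }
}
\end{lstlisting}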
|
||||||
|
|
||||||
|
\subsubsection{Extending Dijkstra's algorithm}
|
||||||
|
|
||||||
|
\index{Dijkstra's algorithm}
|
||||||
|
|
||||||
|
A by-product of Dijkstra's algorithm is a directed, acyclic
|
||||||
|
graph that indicates for each node of the original graph
|
||||||
|
the possible ways to reach the node using a shortest path
|
||||||
|
from the starting node.
|
||||||
|
Dynamic programming can be applied to that graph.
|
||||||
|
For example, in the graph
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\node[draw, circle] (1) at (0,0) {$1$};
|
||||||
|
\node[draw, circle] (2) at (2,0) {$2$};
|
||||||
|
\node[draw, circle] (3) at (0,-2) {$3$};
|
||||||
|
\node[draw, circle] (4) at (2,-2) {$4$};
|
||||||
|
\node[draw, circle] (5) at (4,-1) {$5$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
|
||||||
|
\path[draw,thick,-] (1) -- node[font=\small,label=left:5] {} (3);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=right:4] {} (4);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=above:8] {} (5);
|
||||||
|
\path[draw,thick,-] (3) -- node[font=\small,label=below:2] {} (4);
|
||||||
|
\path[draw,thick,-] (4) -- node[font=\small,label=below:1] {} (5);
|
||||||
|
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (3);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
the shortest paths from node 1 may use the following edges:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\node[draw, circle] (1) at (0,0) {$1$};
|
||||||
|
\node[draw, circle] (2) at (2,0) {$2$};
|
||||||
|
\node[draw, circle] (3) at (0,-2) {$3$};
|
||||||
|
\node[draw, circle] (4) at (2,-2) {$4$};
|
||||||
|
\node[draw, circle] (5) at (4,-1) {$5$};
|
||||||
|
|
||||||
|
\path[draw,thick,->] (1) -- node[font=\small,label=above:3] {} (2);
|
||||||
|
\path[draw,thick,->] (1) -- node[font=\small,label=left:5] {} (3);
|
||||||
|
\path[draw,thick,->] (2) -- node[font=\small,label=right:4] {} (4);
|
||||||
|
\path[draw,thick,->] (3) -- node[font=\small,label=below:2] {} (4);
|
||||||
|
\path[draw,thick,->] (4) -- node[font=\small,label=below:1] {} (5);
|
||||||
|
\path[draw,thick,->] (2) -- node[font=\small,label=above:2] {} (3);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Now we can, for example, calculate the number of
|
||||||
|
shortest paths from node 1 to node 5
|
||||||
|
using dynamic programming:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\node[draw, circle] (1) at (0,0) {$1$};
|
||||||
|
\node[draw, circle] (2) at (2,0) {$2$};
|
||||||
|
\node[draw, circle] (3) at (0,-2) {$3$};
|
||||||
|
\node[draw, circle] (4) at (2,-2) {$4$};
|
||||||
|
\node[draw, circle] (5) at (4,-1) {$5$};
|
||||||
|
|
||||||
|
\path[draw,thick,->] (1) -- node[font=\small,label=above:3] {} (2);
|
||||||
|
\path[draw,thick,->] (1) -- node[font=\small,label=left:5] {} (3);
|
||||||
|
\path[draw,thick,->] (2) -- node[font=\small,label=right:4] {} (4);
|
||||||
|
\path[draw,thick,->] (3) -- node[font=\small,label=below:2] {} (4);
|
||||||
|
\path[draw,thick,->] (4) -- node[font=\small,label=below:1] {} (5);
|
||||||
|
\path[draw,thick,->] (2) -- node[font=\small,label=above:2] {} (3);
|
||||||
|
|
||||||
|
\node[color=red] at (0,0.7) {$1$};
|
||||||
|
\node[color=red] at (2,0.7) {$1$};
|
||||||
|
\node[color=red] at (0,-2.7) {$2$};
|
||||||
|
\node[color=red] at (2,-2.7) {$3$};
|
||||||
|
\node[color=red] at (4,-1.7) {$3$};
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
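One possible way to carry out this calculation, as a sketch: assume that
Dijkstra's algorithm has already computed the distances from node 1 into an
array \texttt{dist}, and that \texttt{adj[$a$]} contains pairs $(w,b)$.
Since all weights are positive, processing the nodes in increasing order of
distance processes them in a topological order of the shortest-path graph:
\begin{lstlisting}
long long cnt[N];     // cnt[x] = number of shortest paths from node 1 to x
cnt[1] = 1;
vector<int> nodes(n);
iota(nodes.begin(), nodes.end(), 1);   // nodes 1,2,...,n
sort(nodes.begin(), nodes.end(), [](int a, int b) {
    return dist[a] < dist[b];          // increasing distance
});
for (int a : nodes) {
    for (auto [w,b] : adj[a]) {
        // the edge belongs to the shortest-path graph
        if (dist[a]+w == dist[b]) cnt[b] += cnt[a];
    }
}
\end{lstlisting}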
|
||||||
|
|
||||||
|
\subsubsection{Representing problems as graphs}
|
||||||
|
|
||||||
|
Actually, any dynamic programming problem
|
||||||
|
can be represented as a directed, acyclic graph.
|
||||||
|
In such a graph, each node corresponds to a dynamic programming state
|
||||||
|
and the edges indicate how the states depend on each other.
|
||||||
|
|
||||||
|
As an example, consider the problem
|
||||||
|
of forming a sum of money $n$
|
||||||
|
using coins
|
||||||
|
$\{c_1,c_2,\ldots,c_k\}$.
|
||||||
|
In this problem, we can construct a graph where
|
||||||
|
each node corresponds to a sum of money,
|
||||||
|
and the edges show how the coins can be chosen.
|
||||||
|
For example, for coins $\{1,3,4\}$ and $n=6$,
|
||||||
|
the graph is as follows:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (0) at (0,0) {$0$};
|
||||||
|
\node[draw, circle] (1) at (2,0) {$1$};
|
||||||
|
\node[draw, circle] (2) at (4,0) {$2$};
|
||||||
|
\node[draw, circle] (3) at (6,0) {$3$};
|
||||||
|
\node[draw, circle] (4) at (8,0) {$4$};
|
||||||
|
\node[draw, circle] (5) at (10,0) {$5$};
|
||||||
|
\node[draw, circle] (6) at (12,0) {$6$};
|
||||||
|
|
||||||
|
\path[draw,thick,->] (0) -- (1);
|
||||||
|
\path[draw,thick,->] (1) -- (2);
|
||||||
|
\path[draw,thick,->] (2) -- (3);
|
||||||
|
\path[draw,thick,->] (3) -- (4);
|
||||||
|
\path[draw,thick,->] (4) -- (5);
|
||||||
|
\path[draw,thick,->] (5) -- (6);
|
||||||
|
|
||||||
|
\path[draw,thick,->] (0) edge [bend right=30] (3);
|
||||||
|
\path[draw,thick,->] (1) edge [bend right=30] (4);
|
||||||
|
\path[draw,thick,->] (2) edge [bend right=30] (5);
|
||||||
|
\path[draw,thick,->] (3) edge [bend right=30] (6);
|
||||||
|
|
||||||
|
\path[draw,thick,->] (0) edge [bend left=30] (4);
|
||||||
|
\path[draw,thick,->] (1) edge [bend left=30] (5);
|
||||||
|
\path[draw,thick,->] (2) edge [bend left=30] (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Using this representation,
|
||||||
|
the shortest path from node 0 to node $n$
|
||||||
|
corresponds to a solution with the minimum number of coins,
|
||||||
|
and the total number of paths from node 0 to node $n$
|
||||||
|
equals the total number of solutions.
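As a small sketch of how both quantities can be computed directly on this
graph, assuming that the coin values are stored in the vector \texttt{coins}:
\begin{lstlisting}
vector<long long> cnt(n+1);     // cnt[x] = number of paths from node 0 to x
vector<int> dist(n+1, 1e9);     // dist[x] = length of the shortest path to x
cnt[0] = 1; dist[0] = 0;
for (int x = 0; x <= n; x++) {  // 0,1,...,n is a topological order
    for (int c : coins) {
        if (x+c <= n) {
            cnt[x+c] += cnt[x];
            dist[x+c] = min(dist[x+c], dist[x]+1);
        }
    }
}
\end{lstlisting}
Here \texttt{dist[n]} is the minimum number of coins and \texttt{cnt[n]}
is the number of paths from node 0 to node $n$, i.e., the number of solutions.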
|
||||||
|
|
||||||
|
\section{Successor paths}
|
||||||
|
|
||||||
|
\index{successor graph}
|
||||||
|
\index{functional graph}
|
||||||
|
|
||||||
|
For the rest of the chapter,
|
||||||
|
we will focus on \key{successor graphs}.
|
||||||
|
In those graphs,
|
||||||
|
the outdegree of each node is 1, i.e.,
|
||||||
|
exactly one edge starts at each node.
|
||||||
|
A successor graph consists of one or more
|
||||||
|
components, each of which contains
|
||||||
|
one cycle and some paths that lead to it.
|
||||||
|
|
||||||
|
Successor graphs are sometimes called
|
||||||
|
\key{functional graphs}.
|
||||||
|
The reason for this is that any successor graph
|
||||||
|
corresponds to a function that defines
|
||||||
|
the edges of the graph.
|
||||||
|
The parameter for the function is a node of the graph,
|
||||||
|
and the function gives the successor of that node.
|
||||||
|
|
||||||
|
For example, the function
|
||||||
|
\begin{center}
|
||||||
|
\begin{tabular}{r|rrrrrrrrr}
|
||||||
|
$x$ & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\
|
||||||
|
\hline
|
||||||
|
$\texttt{succ}(x)$ & 3 & 5 & 7 & 6 & 2 & 2 & 1 & 6 & 3 \\
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
|
||||||
|
defines the following graph:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (0,0) {$1$};
|
||||||
|
\node[draw, circle] (2) at (2,0) {$2$};
|
||||||
|
\node[draw, circle] (3) at (-2,0) {$3$};
|
||||||
|
\node[draw, circle] (4) at (1,-3) {$4$};
|
||||||
|
\node[draw, circle] (5) at (4,0) {$5$};
|
||||||
|
\node[draw, circle] (6) at (2,-1.5) {$6$};
|
||||||
|
\node[draw, circle] (7) at (-2,-1.5) {$7$};
|
||||||
|
\node[draw, circle] (8) at (3,-3) {$8$};
|
||||||
|
\node[draw, circle] (9) at (-4,0) {$9$};
|
||||||
|
|
||||||
|
\path[draw,thick,->] (1) -- (3);
|
||||||
|
\path[draw,thick,->] (2) edge [bend left=40] (5);
|
||||||
|
\path[draw,thick,->] (3) -- (7);
|
||||||
|
\path[draw,thick,->] (4) -- (6);
|
||||||
|
\path[draw,thick,->] (5) edge [bend left=40] (2);
|
||||||
|
\path[draw,thick,->] (6) -- (2);
|
||||||
|
\path[draw,thick,->] (7) -- (1);
|
||||||
|
\path[draw,thick,->] (8) -- (6);
|
||||||
|
\path[draw,thick,->] (9) -- (3);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Since each node of a successor graph has a
|
||||||
|
unique successor, we can also define a function $\texttt{succ}(x,k)$
|
||||||
|
that gives the node that we will reach if
|
||||||
|
we begin at node $x$ and walk $k$ steps forward.
|
||||||
|
For example, in the above graph $\texttt{succ}(4,6)=2$,
|
||||||
|
because we will reach node 2 by walking 6 steps from node 4:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (0,0) {$4$};
|
||||||
|
\node[draw, circle] (2) at (1.5,0) {$6$};
|
||||||
|
\node[draw, circle] (3) at (3,0) {$2$};
|
||||||
|
\node[draw, circle] (4) at (4.5,0) {$5$};
|
||||||
|
\node[draw, circle] (5) at (6,0) {$2$};
|
||||||
|
\node[draw, circle] (6) at (7.5,0) {$5$};
|
||||||
|
\node[draw, circle] (7) at (9,0) {$2$};
|
||||||
|
|
||||||
|
\path[draw,thick,->] (1) -- (2);
|
||||||
|
\path[draw,thick,->] (2) -- (3);
|
||||||
|
\path[draw,thick,->] (3) -- (4);
|
||||||
|
\path[draw,thick,->] (4) -- (5);
|
||||||
|
\path[draw,thick,->] (5) -- (6);
|
||||||
|
\path[draw,thick,->] (6) -- (7);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
A straightforward way to calculate a value of $\texttt{succ}(x,k)$
|
||||||
|
is to start at node $x$ and walk $k$ steps forward, which takes $O(k)$ time.
|
||||||
|
However, using preprocessing, any value of $\texttt{succ}(x,k)$
|
||||||
|
can be calculated in only $O(\log k)$ time.
|
||||||
|
|
||||||
|
The idea is to precalculate all values of $\texttt{succ}(x,k)$ where
|
||||||
|
$k$ is a power of two and at most $u$,
the maximum number of steps we will ever walk.
|
||||||
|
This can be efficiently done, because
|
||||||
|
we can use the following recursion:
|
||||||
|
|
||||||
|
\begin{equation*}
|
||||||
|
\texttt{succ}(x,k) = \begin{cases}
|
||||||
|
\texttt{succ}(x) & k = 1\\
|
||||||
|
\texttt{succ}(\texttt{succ}(x,k/2),k/2) & k > 1\\
|
||||||
|
\end{cases}
|
||||||
|
\end{equation*}
|
||||||
|
|
||||||
|
Precalculating the values takes $O(n \log u)$ time,
|
||||||
|
because $O(\log u)$ values are calculated for each node.
|
||||||
|
In the above graph, the first values are as follows:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tabular}{r|rrrrrrrrr}
|
||||||
|
$x$ & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\
|
||||||
|
\hline
|
||||||
|
$\texttt{succ}(x,1)$ & 3 & 5 & 7 & 6 & 2 & 2 & 1 & 6 & 3 \\
|
||||||
|
$\texttt{succ}(x,2)$ & 7 & 2 & 1 & 2 & 5 & 5 & 3 & 2 & 7 \\
|
||||||
|
$\texttt{succ}(x,4)$ & 3 & 2 & 7 & 2 & 5 & 5 & 1 & 2 & 3 \\
|
||||||
|
$\texttt{succ}(x,8)$ & 7 & 2 & 1 & 2 & 5 & 5 & 3 & 2 & 7 \\
|
||||||
|
$\cdots$ \\
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
After this, any value of $\texttt{succ}(x,k)$ can be calculated
|
||||||
|
by presenting the number of steps $k$ as a sum of powers of two.
|
||||||
|
For example, if we want to calculate the value of $\texttt{succ}(x,11)$,
|
||||||
|
we first form the representation $11=8+2+1$.
|
||||||
|
Using that,
|
||||||
|
\[\texttt{succ}(x,11)=\texttt{succ}(\texttt{succ}(\texttt{succ}(x,8),2),1).\]
|
||||||
|
For example, in the previous graph
|
||||||
|
\[\texttt{succ}(4,11)=\texttt{succ}(\texttt{succ}(\texttt{succ}(4,8),2),1)=5.\]
|
||||||
|
|
||||||
|
Such a representation always consists of
|
||||||
|
$O(\log k)$ parts, so calculating a value of $\texttt{succ}(x,k)$
|
||||||
|
takes $O(\log k)$ time.
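A minimal sketch of the technique follows; it assumes that
$\texttt{succ}[0][x]$ has been initialized with the successor of each node
$x$, that the nodes are numbered $1,2,\ldots,n$, and that \texttt{LOG} is
chosen so that $2^{\texttt{LOG}} > u$:
\begin{lstlisting}
int succ[LOG][N];   // succ[j][x] = the node reached from x after 2^j steps

void precompute() {
    for (int j = 1; j < LOG; j++) {
        for (int x = 1; x <= n; x++) {
            succ[j][x] = succ[j-1][succ[j-1][x]];
        }
    }
}

int walk(int x, long long k) {   // returns succ(x,k)
    for (int j = 0; j < LOG; j++) {
        if (k & (1LL << j)) x = succ[j][x];
    }
    return x;
}
\end{lstlisting}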
|
||||||
|
|
||||||
|
\section{Cycle detection}
|
||||||
|
|
||||||
|
\index{cycle}
|
||||||
|
\index{cycle detection}
|
||||||
|
|
||||||
|
Consider a successor graph that only contains
|
||||||
|
a path that ends in a cycle.
|
||||||
|
We may ask the following questions:
|
||||||
|
if we begin our walk at the starting node,
|
||||||
|
what is the first node in the cycle
|
||||||
|
and how many nodes does the cycle contain?
|
||||||
|
|
||||||
|
For example, in the graph
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (5) at (0,0) {$5$};
|
||||||
|
\node[draw, circle] (4) at (-2,0) {$4$};
|
||||||
|
\node[draw, circle] (6) at (-1,1.5) {$6$};
|
||||||
|
\node[draw, circle] (3) at (-4,0) {$3$};
|
||||||
|
\node[draw, circle] (2) at (-6,0) {$2$};
|
||||||
|
\node[draw, circle] (1) at (-8,0) {$1$};
|
||||||
|
|
||||||
|
\path[draw,thick,->] (1) -- (2);
|
||||||
|
\path[draw,thick,->] (2) -- (3);
|
||||||
|
\path[draw,thick,->] (3) -- (4);
|
||||||
|
\path[draw,thick,->] (4) -- (5);
|
||||||
|
\path[draw,thick,->] (5) -- (6);
|
||||||
|
\path[draw,thick,->] (6) -- (4);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
if we begin our walk at node 1,
|
||||||
|
the first node that belongs to the cycle is node 4, and the cycle consists
|
||||||
|
of three nodes (4, 5 and 6).
|
||||||
|
|
||||||
|
A simple way to detect the cycle is to walk in the
|
||||||
|
graph and keep track of
|
||||||
|
all nodes that have been visited. Once a node is visited
|
||||||
|
for the second time, we can conclude
|
||||||
|
that the node is the first node in the cycle.
|
||||||
|
This method works in $O(n)$ time and also uses
|
||||||
|
$O(n)$ memory.
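For example, assuming that \texttt{succ(x)} returns the successor of node $x$
and that \texttt{N} is an upper bound for the number of nodes, this idea can
be implemented as follows:
\begin{lstlisting}
bool visited[N];        // visited[x] is true if x has been seen during the walk

int firstInCycle(int x) {
    while (!visited[x]) {
        visited[x] = true;
        x = succ(x);
    }
    return x;           // the first node that is reached for the second time
}
\end{lstlisting}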
|
||||||
|
|
||||||
|
However, there are better algorithms for cycle detection.
|
||||||
|
The time complexity of such algorithms is still $O(n)$,
|
||||||
|
but they only use $O(1)$ memory.
|
||||||
|
This is an important improvement if $n$ is large.
|
||||||
|
Next we will discuss Floyd's algorithm that
|
||||||
|
achieves these properties.
|
||||||
|
|
||||||
|
\subsubsection{Floyd's algorithm}
|
||||||
|
|
||||||
|
\index{Floyd's algorithm}
|
||||||
|
|
||||||
|
\key{Floyd's algorithm}\footnote{The idea of the algorithm is mentioned in \cite{knu982}
|
||||||
|
and attributed to R. W. Floyd; however, it is not known if Floyd actually
|
||||||
|
discovered the algorithm.} walks forward
|
||||||
|
in the graph using two pointers $a$ and $b$.
|
||||||
|
Both pointers begin at a node $x$ that
|
||||||
|
is the starting node of the graph.
|
||||||
|
Then, on each turn, the pointer $a$ walks
|
||||||
|
one step forward and the pointer $b$
|
||||||
|
walks two steps forward.
|
||||||
|
The process continues until
|
||||||
|
the pointers meet each other:
|
||||||
|
\begin{lstlisting}
|
||||||
|
a = succ(x);
|
||||||
|
b = succ(succ(x));
|
||||||
|
while (a != b) {
|
||||||
|
a = succ(a);
|
||||||
|
b = succ(succ(b));
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
At this point, the pointer $a$ has walked $k$ steps
|
||||||
|
and the pointer $b$ has walked $2k$ steps,
|
||||||
|
so the length of the cycle divides $k$.
|
||||||
|
Thus, the first node that belongs to the cycle
|
||||||
|
can be found by moving the pointer $a$ to node $x$
|
||||||
|
and advancing the pointers
|
||||||
|
step by step until they meet again.
|
||||||
|
\begin{lstlisting}
|
||||||
|
a = x;
|
||||||
|
while (a != b) {
|
||||||
|
a = succ(a);
|
||||||
|
b = succ(b);
|
||||||
|
}
|
||||||
|
first = a;
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
After this, the length of the cycle
|
||||||
|
can be calculated as follows:
|
||||||
|
\begin{lstlisting}
|
||||||
|
b = succ(a);
|
||||||
|
length = 1;
|
||||||
|
while (a != b) {
|
||||||
|
b = succ(b);
|
||||||
|
length++;
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
|
|
||||||
|
\chapter{Strong connectivity}
|
||||||
|
|
||||||
|
\index{strongly connected graph}
|
||||||
|
|
||||||
|
In a directed graph,
|
||||||
|
the edges can be traversed in one direction only,
|
||||||
|
so even if the graph is connected,
|
||||||
|
this does not guarantee that there is
a path from one node to another.
|
||||||
|
For this reason, it is meaningful to define a new concept
|
||||||
|
that requires more than connectivity.
|
||||||
|
|
||||||
|
A graph is \key{strongly connected}
|
||||||
|
if there is a path from any node to all
|
||||||
|
other nodes in the graph.
|
||||||
|
For example, in the following picture,
|
||||||
|
the left graph is strongly connected
|
||||||
|
while the right graph is not.
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,1) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,1) {$2$};
|
||||||
|
\node[draw, circle] (3) at (1,-1) {$3$};
|
||||||
|
\node[draw, circle] (4) at (3,-1) {$4$};
|
||||||
|
|
||||||
|
\path[draw,thick,->] (1) -- (2);
|
||||||
|
\path[draw,thick,->] (2) -- (4);
|
||||||
|
\path[draw,thick,->] (4) -- (3);
|
||||||
|
\path[draw,thick,->] (3) -- (1);
|
||||||
|
|
||||||
|
\node[draw, circle] (1b) at (6,1) {$1$};
|
||||||
|
\node[draw, circle] (2b) at (8,1) {$2$};
|
||||||
|
\node[draw, circle] (3b) at (6,-1) {$3$};
|
||||||
|
\node[draw, circle] (4b) at (8,-1) {$4$};
|
||||||
|
|
||||||
|
\path[draw,thick,->] (1b) -- (2b);
|
||||||
|
\path[draw,thick,->] (2b) -- (4b);
|
||||||
|
\path[draw,thick,->] (4b) -- (3b);
|
||||||
|
\path[draw,thick,->] (1b) -- (3b);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
The right graph is not strongly connected
|
||||||
|
because, for example, there is no path
|
||||||
|
from node 2 to node 1.
|
||||||
|
|
||||||
|
\index{strongly connected component}
|
||||||
|
\index{component graph}
|
||||||
|
|
||||||
|
The \key{strongly connected components}
|
||||||
|
of a graph divide the graph into strongly connected
|
||||||
|
parts that are as large as possible.
|
||||||
|
The strongly connected components form an
|
||||||
|
acyclic \key{component graph} that represents
|
||||||
|
the deep structure of the original graph.
|
||||||
|
|
||||||
|
For example, for the graph
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9,label distance=-2mm]
|
||||||
|
\node[draw, circle] (1) at (-1,1) {$7$};
|
||||||
|
\node[draw, circle] (2) at (-3,2) {$3$};
|
||||||
|
\node[draw, circle] (4) at (-5,2) {$2$};
|
||||||
|
\node[draw, circle] (6) at (-7,2) {$1$};
|
||||||
|
\node[draw, circle] (3) at (-3,0) {$6$};
|
||||||
|
\node[draw, circle] (5) at (-5,0) {$5$};
|
||||||
|
\node[draw, circle] (7) at (-7,0) {$4$};
|
||||||
|
|
||||||
|
\path[draw,thick,->] (2) -- (1);
|
||||||
|
\path[draw,thick,->] (1) -- (3);
|
||||||
|
\path[draw,thick,->] (3) -- (2);
|
||||||
|
\path[draw,thick,->] (2) -- (4);
|
||||||
|
\path[draw,thick,->] (3) -- (5);
|
||||||
|
\path[draw,thick,->] (4) edge [bend left] (6);
|
||||||
|
\path[draw,thick,->] (6) edge [bend left] (4);
|
||||||
|
\path[draw,thick,->] (4) -- (5);
|
||||||
|
\path[draw,thick,->] (5) -- (7);
|
||||||
|
\path[draw,thick,->] (6) -- (7);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
the strongly connected components are as follows:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (-1,1) {$7$};
|
||||||
|
\node[draw, circle] (2) at (-3,2) {$3$};
|
||||||
|
\node[draw, circle] (4) at (-5,2) {$2$};
|
||||||
|
\node[draw, circle] (6) at (-7,2) {$1$};
|
||||||
|
\node[draw, circle] (3) at (-3,0) {$6$};
|
||||||
|
\node[draw, circle] (5) at (-5,0) {$5$};
|
||||||
|
\node[draw, circle] (7) at (-7,0) {$4$};
|
||||||
|
|
||||||
|
\path[draw,thick,->] (2) -- (1);
|
||||||
|
\path[draw,thick,->] (1) -- (3);
|
||||||
|
\path[draw,thick,->] (3) -- (2);
|
||||||
|
\path[draw,thick,->] (2) -- (4);
|
||||||
|
\path[draw,thick,->] (3) -- (5);
|
||||||
|
\path[draw,thick,->] (4) edge [bend left] (6);
|
||||||
|
\path[draw,thick,->] (6) edge [bend left] (4);
|
||||||
|
\path[draw,thick,->] (4) -- (5);
|
||||||
|
\path[draw,thick,->] (5) -- (7);
|
||||||
|
\path[draw,thick,->] (6) -- (7);
|
||||||
|
|
||||||
|
\draw [red,thick,dashed,line width=2pt] (-0.5,2.5) rectangle (-3.5,-0.5);
|
||||||
|
\draw [red,thick,dashed,line width=2pt] (-4.5,2.5) rectangle (-7.5,1.5);
|
||||||
|
\draw [red,thick,dashed,line width=2pt] (-4.5,0.5) rectangle (-5.5,-0.5);
|
||||||
|
\draw [red,thick,dashed,line width=2pt] (-6.5,0.5) rectangle (-7.5,-0.5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
The corresponding component graph is as follows:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (-3,1) {$B$};
|
||||||
|
\node[draw, circle] (2) at (-6,2) {$A$};
|
||||||
|
\node[draw, circle] (3) at (-5,0) {$D$};
|
||||||
|
\node[draw, circle] (4) at (-7,0) {$C$};
|
||||||
|
|
||||||
|
\path[draw,thick,->] (1) -- (2);
|
||||||
|
\path[draw,thick,->] (1) -- (3);
|
||||||
|
\path[draw,thick,->] (2) -- (3);
|
||||||
|
\path[draw,thick,->] (2) -- (4);
|
||||||
|
\path[draw,thick,->] (3) -- (4);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
The components are $A=\{1,2\}$,
|
||||||
|
$B=\{3,6,7\}$, $C=\{4\}$ and $D=\{5\}$.
|
||||||
|
|
||||||
|
A component graph is an acyclic, directed graph,
|
||||||
|
so it is easier to process than the original graph.
|
||||||
|
Since the graph does not contain cycles,
|
||||||
|
we can always construct a topological sort and
|
||||||
|
use dynamic programming techniques like those
|
||||||
|
presented in Chapter 16.
|
||||||
|
|
||||||
|
\section{Kosaraju's algorithm}
|
||||||
|
|
||||||
|
\index{Kosaraju's algorithm}
|
||||||
|
|
||||||
|
\key{Kosaraju's algorithm}\footnote{According to \cite{aho83},
|
||||||
|
S. R. Kosaraju invented this algorithm in 1978
|
||||||
|
but did not publish it. In 1981, the same algorithm was rediscovered
|
||||||
|
and published by M. Sharir \cite{sha81}.} is an efficient
|
||||||
|
method for finding the strongly connected components
|
||||||
|
of a directed graph.
|
||||||
|
The algorithm performs two depth-first searches:
|
||||||
|
the first search constructs a list of nodes
|
||||||
|
according to the structure of the graph,
|
||||||
|
and the second search forms the strongly connected components.
|
||||||
|
|
||||||
|
\subsubsection{Search 1}
|
||||||
|
|
||||||
|
The first phase of Kosaraju's algorithm constructs
|
||||||
|
a list of nodes in the order in which a
|
||||||
|
depth-first search processes them.
|
||||||
|
The algorithm goes through the nodes,
|
||||||
|
and begins a depth-first search at each
|
||||||
|
unprocessed node.
|
||||||
|
Each node will be added to the list
|
||||||
|
after it has been processed.
|
||||||
|
|
||||||
|
In the example graph, the nodes are processed
|
||||||
|
in the following order:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9,label distance=-2mm]
|
||||||
|
\node[draw, circle] (1) at (-1,1) {$7$};
|
||||||
|
\node[draw, circle] (2) at (-3,2) {$3$};
|
||||||
|
\node[draw, circle] (4) at (-5,2) {$2$};
|
||||||
|
\node[draw, circle] (6) at (-7,2) {$1$};
|
||||||
|
\node[draw, circle] (3) at (-3,0) {$6$};
|
||||||
|
\node[draw, circle] (5) at (-5,0) {$5$};
|
||||||
|
\node[draw, circle] (7) at (-7,0) {$4$};
|
||||||
|
|
||||||
|
\node at (-7,2.75) {$1/8$};
|
||||||
|
\node at (-5,2.75) {$2/7$};
|
||||||
|
\node at (-3,2.75) {$9/14$};
|
||||||
|
\node at (-7,-0.75) {$4/5$};
|
||||||
|
\node at (-5,-0.75) {$3/6$};
|
||||||
|
\node at (-3,-0.75) {$11/12$};
|
||||||
|
\node at (-1,1.75) {$10/13$};
|
||||||
|
|
||||||
|
\path[draw,thick,->] (2) -- (1);
|
||||||
|
\path[draw,thick,->] (1) -- (3);
|
||||||
|
\path[draw,thick,->] (3) -- (2);
|
||||||
|
\path[draw,thick,->] (2) -- (4);
|
||||||
|
\path[draw,thick,->] (3) -- (5);
|
||||||
|
\path[draw,thick,->] (4) edge [bend left] (6);
|
||||||
|
\path[draw,thick,->] (6) edge [bend left] (4);
|
||||||
|
\path[draw,thick,->] (4) -- (5);
|
||||||
|
\path[draw,thick,->] (5) -- (7);
|
||||||
|
\path[draw,thick,->] (6) -- (7);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
The notation $x/y$ means that
|
||||||
|
processing the node started
|
||||||
|
at time $x$ and finished at time $y$.
|
||||||
|
Thus, the corresponding list is as follows:
|
||||||
|
|
||||||
|
\begin{tabular}{ll}
|
||||||
|
\\
|
||||||
|
node & processing time \\
|
||||||
|
\hline
|
||||||
|
4 & 5 \\
|
||||||
|
5 & 6 \\
|
||||||
|
2 & 7 \\
|
||||||
|
1 & 8 \\
|
||||||
|
6 & 12 \\
|
||||||
|
7 & 13 \\
|
||||||
|
3 & 14 \\
|
||||||
|
\\
|
||||||
|
\end{tabular}
|
||||||
|
%
|
||||||
|
% In the second phase of the algorithm,
|
||||||
|
% the nodes will be processed
|
||||||
|
% in reverse order: $[3,7,6,1,2,5,4]$.
|
||||||
|
|
||||||
|
\subsubsection{Search 2}
|
||||||
|
|
||||||
|
The second phase of the algorithm
|
||||||
|
forms the strongly connected components
|
||||||
|
of the graph.
|
||||||
|
First, the algorithm reverses every
|
||||||
|
edge in the graph.
|
||||||
|
This guarantees that during the second search,
|
||||||
|
we will always find strongly connected
|
||||||
|
components that do not have extra nodes.
|
||||||
|
|
||||||
|
After reversing the edges,
|
||||||
|
the example graph is as follows:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9,label distance=-2mm]
|
||||||
|
\node[draw, circle] (1) at (-1,1) {$7$};
|
||||||
|
\node[draw, circle] (2) at (-3,2) {$3$};
|
||||||
|
\node[draw, circle] (4) at (-5,2) {$2$};
|
||||||
|
\node[draw, circle] (6) at (-7,2) {$1$};
|
||||||
|
\node[draw, circle] (3) at (-3,0) {$6$};
|
||||||
|
\node[draw, circle] (5) at (-5,0) {$5$};
|
||||||
|
\node[draw, circle] (7) at (-7,0) {$4$};
|
||||||
|
|
||||||
|
\path[draw,thick,<-] (2) -- (1);
|
||||||
|
\path[draw,thick,<-] (1) -- (3);
|
||||||
|
\path[draw,thick,<-] (3) -- (2);
|
||||||
|
\path[draw,thick,<-] (2) -- (4);
|
||||||
|
\path[draw,thick,<-] (3) -- (5);
|
||||||
|
\path[draw,thick,<-] (4) edge [bend left] (6);
|
||||||
|
\path[draw,thick,<-] (6) edge [bend left] (4);
|
||||||
|
\path[draw,thick,<-] (4) -- (5);
|
||||||
|
\path[draw,thick,<-] (5) -- (7);
|
||||||
|
\path[draw,thick,<-] (6) -- (7);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
After this, the algorithm goes through
|
||||||
|
the list of nodes created by the first search,
|
||||||
|
in \emph{reverse} order.
|
||||||
|
If a node does not belong to a component,
|
||||||
|
the algorithm creates a new component
|
||||||
|
and starts a depth-first search
|
||||||
|
that adds all new nodes found during the search
|
||||||
|
to the new component.
|
||||||
|
|
||||||
|
In the example graph, the first component
|
||||||
|
begins at node 3:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9,label distance=-2mm]
|
||||||
|
\node[draw, circle] (1) at (-1,1) {$7$};
|
||||||
|
\node[draw, circle] (2) at (-3,2) {$3$};
|
||||||
|
\node[draw, circle] (4) at (-5,2) {$2$};
|
||||||
|
\node[draw, circle] (6) at (-7,2) {$1$};
|
||||||
|
\node[draw, circle] (3) at (-3,0) {$6$};
|
||||||
|
\node[draw, circle] (5) at (-5,0) {$5$};
|
||||||
|
\node[draw, circle] (7) at (-7,0) {$4$};
|
||||||
|
|
||||||
|
\path[draw,thick,<-] (2) -- (1);
|
||||||
|
\path[draw,thick,<-] (1) -- (3);
|
||||||
|
\path[draw,thick,<-] (3) -- (2);
|
||||||
|
\path[draw,thick,<-] (2) -- (4);
|
||||||
|
\path[draw,thick,<-] (3) -- (5);
|
||||||
|
\path[draw,thick,<-] (4) edge [bend left] (6);
|
||||||
|
\path[draw,thick,<-] (6) edge [bend left] (4);
|
||||||
|
\path[draw,thick,<-] (4) -- (5);
|
||||||
|
\path[draw,thick,<-] (5) -- (7);
|
||||||
|
\path[draw,thick,<-] (6) -- (7);
|
||||||
|
|
||||||
|
\draw [red,thick,dashed,line width=2pt] (-0.5,2.5) rectangle (-3.5,-0.5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Note that since all edges are reversed,
|
||||||
|
the component does not ``leak'' to other parts of the graph.
|
||||||
|
|
||||||
|
\begin{samepage}
|
||||||
|
The next nodes in the list are nodes 7 and 6,
|
||||||
|
but they already belong to a component,
|
||||||
|
so the next new component begins at node 1:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9,label distance=-2mm]
|
||||||
|
\node[draw, circle] (1) at (-1,1) {$7$};
|
||||||
|
\node[draw, circle] (2) at (-3,2) {$3$};
|
||||||
|
\node[draw, circle] (4) at (-5,2) {$2$};
|
||||||
|
\node[draw, circle] (6) at (-7,2) {$1$};
|
||||||
|
\node[draw, circle] (3) at (-3,0) {$6$};
|
||||||
|
\node[draw, circle] (5) at (-5,0) {$5$};
|
||||||
|
\node[draw, circle] (7) at (-7,0) {$4$};
|
||||||
|
|
||||||
|
\path[draw,thick,<-] (2) -- (1);
|
||||||
|
\path[draw,thick,<-] (1) -- (3);
|
||||||
|
\path[draw,thick,<-] (3) -- (2);
|
||||||
|
\path[draw,thick,<-] (2) -- (4);
|
||||||
|
\path[draw,thick,<-] (3) -- (5);
|
||||||
|
\path[draw,thick,<-] (4) edge [bend left] (6);
|
||||||
|
\path[draw,thick,<-] (6) edge [bend left] (4);
|
||||||
|
\path[draw,thick,<-] (4) -- (5);
|
||||||
|
\path[draw,thick,<-] (5) -- (7);
|
||||||
|
\path[draw,thick,<-] (6) -- (7);
|
||||||
|
|
||||||
|
\draw [red,thick,dashed,line width=2pt] (-0.5,2.5) rectangle (-3.5,-0.5);
|
||||||
|
\draw [red,thick,dashed,line width=2pt] (-4.5,2.5) rectangle (-7.5,1.5);
|
||||||
|
%\draw [red,thick,dashed,line width=2pt] (-4.5,0.5) rectangle (-5.5,-0.5);
|
||||||
|
%\draw [red,thick,dashed,line width=2pt] (-6.5,0.5) rectangle (-7.5,-0.5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
\end{samepage}
|
||||||
|
|
||||||
|
\begin{samepage}
|
||||||
|
Finally, the algorithm processes nodes 5 and 4
|
||||||
|
that create the remaining strongly connected components:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9,label distance=-2mm]
|
||||||
|
\node[draw, circle] (1) at (-1,1) {$7$};
|
||||||
|
\node[draw, circle] (2) at (-3,2) {$3$};
|
||||||
|
\node[draw, circle] (4) at (-5,2) {$2$};
|
||||||
|
\node[draw, circle] (6) at (-7,2) {$1$};
|
||||||
|
\node[draw, circle] (3) at (-3,0) {$6$};
|
||||||
|
\node[draw, circle] (5) at (-5,0) {$5$};
|
||||||
|
\node[draw, circle] (7) at (-7,0) {$4$};
|
||||||
|
|
||||||
|
\path[draw,thick,<-] (2) -- (1);
|
||||||
|
\path[draw,thick,<-] (1) -- (3);
|
||||||
|
\path[draw,thick,<-] (3) -- (2);
|
||||||
|
\path[draw,thick,<-] (2) -- (4);
|
||||||
|
\path[draw,thick,<-] (3) -- (5);
|
||||||
|
\path[draw,thick,<-] (4) edge [bend left] (6);
|
||||||
|
\path[draw,thick,<-] (6) edge [bend left] (4);
|
||||||
|
\path[draw,thick,<-] (4) -- (5);
|
||||||
|
\path[draw,thick,<-] (5) -- (7);
|
||||||
|
\path[draw,thick,<-] (6) -- (7);
|
||||||
|
|
||||||
|
\draw [red,thick,dashed,line width=2pt] (-0.5,2.5) rectangle (-3.5,-0.5);
|
||||||
|
\draw [red,thick,dashed,line width=2pt] (-4.5,2.5) rectangle (-7.5,1.5);
|
||||||
|
\draw [red,thick,dashed,line width=2pt] (-4.5,0.5) rectangle (-5.5,-0.5);
|
||||||
|
\draw [red,thick,dashed,line width=2pt] (-6.5,0.5) rectangle (-7.5,-0.5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
\end{samepage}
|
||||||
|
|
||||||
|
The time complexity of the algorithm is $O(n+m)$,
|
||||||
|
because the algorithm
|
||||||
|
performs two depth-first searches.
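The algorithm can be implemented, for example, as follows; the sketch assumes
that \texttt{adj} contains the adjacency lists of the graph, \texttt{radj}
the adjacency lists of the reversed graph, and that the nodes are numbered
$1,2,\ldots,n$:
\begin{lstlisting}
vector<int> adj[N], radj[N]; // original and reversed adjacency lists
vector<int> order;           // nodes in order of increasing finishing time
bool visited[N];
int comp[N];                 // comp[x] = component id of node x (0 = none yet)

void dfs1(int x) {           // search 1: record the processing order
    visited[x] = true;
    for (int y : adj[x]) if (!visited[y]) dfs1(y);
    order.push_back(x);
}

void dfs2(int x, int id) {   // search 2: collect one component
    comp[x] = id;
    for (int y : radj[x]) if (comp[y] == 0) dfs2(y, id);
}

void kosaraju() {
    for (int x = 1; x <= n; x++) if (!visited[x]) dfs1(x);
    int id = 0;
    for (int i = n-1; i >= 0; i--) {   // process the list in reverse order
        if (comp[order[i]] == 0) dfs2(order[i], ++id);
    }
}
\end{lstlisting}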
|
||||||
|
|
||||||
|
\section{2SAT problem}
|
||||||
|
|
||||||
|
\index{2SAT problem}
|
||||||
|
|
||||||
|
Strong connectivity is also linked with the
|
||||||
|
\key{2SAT problem}\footnote{The algorithm presented here was
|
||||||
|
introduced in \cite{asp79}.
|
||||||
|
There is also another well-known linear-time algorithm \cite{eve75}
|
||||||
|
that is based on backtracking.}.
|
||||||
|
In this problem, we are given a logical formula
|
||||||
|
\[
|
||||||
|
(a_1 \lor b_1) \land (a_2 \lor b_2) \land \cdots \land (a_m \lor b_m),
|
||||||
|
\]
|
||||||
|
where each $a_i$ and $b_i$ is either a logical variable
|
||||||
|
($x_1,x_2,\ldots,x_n$)
|
||||||
|
or a negation of a logical variable
|
||||||
|
($\lnot x_1, \lnot x_2, \ldots, \lnot x_n$).
|
||||||
|
The symbols ``$\land$'' and ``$\lor$'' denote
logical operators ``and'' and ``or''.
|
||||||
|
Our task is to assign each variable a value
|
||||||
|
so that the formula is true, or state
|
||||||
|
that this is not possible.
|
||||||
|
|
||||||
|
For example, the formula
|
||||||
|
\[
|
||||||
|
L_1 = (x_2 \lor \lnot x_1) \land
|
||||||
|
(\lnot x_1 \lor \lnot x_2) \land
|
||||||
|
(x_1 \lor x_3) \land
|
||||||
|
(\lnot x_2 \lor \lnot x_3) \land
|
||||||
|
(x_1 \lor x_4)
|
||||||
|
\]
|
||||||
|
is true when the variables are assigned as follows:
|
||||||
|
|
||||||
|
\[
|
||||||
|
\begin{cases}
|
||||||
|
x_1 = \textrm{false} \\
|
||||||
|
x_2 = \textrm{false} \\
|
||||||
|
x_3 = \textrm{true} \\
|
||||||
|
x_4 = \textrm{true} \\
|
||||||
|
\end{cases}
|
||||||
|
\]
|
||||||
|
|
||||||
|
However, the formula
|
||||||
|
\[
|
||||||
|
L_2 = (x_1 \lor x_2) \land
|
||||||
|
(x_1 \lor \lnot x_2) \land
|
||||||
|
(\lnot x_1 \lor x_3) \land
|
||||||
|
(\lnot x_1 \lor \lnot x_3)
|
||||||
|
\]
|
||||||
|
is always false, regardless of how we
|
||||||
|
assign the values.
|
||||||
|
The reason for this is that we cannot
|
||||||
|
choose a value for $x_1$
|
||||||
|
without creating a contradiction.
|
||||||
|
If $x_1$ is false, both $x_2$ and $\lnot x_2$
|
||||||
|
should be true which is impossible,
|
||||||
|
and if $x_1$ is true, both $x_3$ and $\lnot x_3$
|
||||||
|
should be true which is also impossible.
|
||||||
|
|
||||||
|
The 2SAT problem can be represented as a graph
|
||||||
|
whose nodes correspond to
|
||||||
|
variables $x_i$ and negations $\lnot x_i$,
|
||||||
|
and edges determine the connections
|
||||||
|
between the variables.
|
||||||
|
Each pair $(a_i \lor b_i)$ generates two edges:
|
||||||
|
$\lnot a_i \to b_i$ and $\lnot b_i \to a_i$.
|
||||||
|
This means that if $a_i$ does not hold,
|
||||||
|
$b_i$ must hold, and vice versa.
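As a sketch, the edges could be constructed in code as follows; assume that
the variables are numbered $1,2,\ldots,n$, that the literal $x_i$ corresponds
to node $2i$ and $\lnot x_i$ to node $2i+1$, and that \texttt{adj} contains
the adjacency lists of the implication graph:
\begin{lstlisting}
// node number of a (possibly negated) variable
int lit(int i, bool neg) { return 2*i + (neg ? 1 : 0); }

// add the clause (a or b); nega and negb tell whether the literals are negated
void addClause(int a, bool nega, int b, bool negb) {
    // if the first literal is false, the second must be true, and vice versa
    adj[lit(a,!nega)].push_back(lit(b,negb));
    adj[lit(b,!negb)].push_back(lit(a,nega));
}
\end{lstlisting}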
|
||||||
|
|
||||||
|
The graph for the formula $L_1$ is:
|
||||||
|
\\
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=1.0,minimum size=2pt]
|
||||||
|
\node[draw, circle, inner sep=1.3pt] (1) at (1,2) {$\lnot x_3$};
|
||||||
|
\node[draw, circle] (2) at (3,2) {$x_2$};
|
||||||
|
\node[draw, circle, inner sep=1.3pt] (3) at (1,0) {$\lnot x_4$};
|
||||||
|
\node[draw, circle] (4) at (3,0) {$x_1$};
|
||||||
|
\node[draw, circle, inner sep=1.3pt] (5) at (5,2) {$\lnot x_1$};
|
||||||
|
\node[draw, circle] (6) at (7,2) {$x_4$};
|
||||||
|
\node[draw, circle, inner sep=1.3pt] (7) at (5,0) {$\lnot x_2$};
|
||||||
|
\node[draw, circle] (8) at (7,0) {$x_3$};
|
||||||
|
|
||||||
|
\path[draw,thick,->] (1) -- (4);
|
||||||
|
\path[draw,thick,->] (4) -- (2);
|
||||||
|
\path[draw,thick,->] (2) -- (1);
|
||||||
|
\path[draw,thick,->] (3) -- (4);
|
||||||
|
\path[draw,thick,->] (2) -- (5);
|
||||||
|
\path[draw,thick,->] (4) -- (7);
|
||||||
|
\path[draw,thick,->] (5) -- (6);
|
||||||
|
\path[draw,thick,->] (5) -- (8);
|
||||||
|
\path[draw,thick,->] (8) -- (7);
|
||||||
|
\path[draw,thick,->] (7) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
And the graph for the formula $L_2$ is:
|
||||||
|
\\
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=1.0,minimum size=2pt]
|
||||||
|
\node[draw, circle] (1) at (1,2) {$x_3$};
|
||||||
|
\node[draw, circle] (2) at (3,2) {$x_2$};
|
||||||
|
\node[draw, circle, inner sep=1.3pt] (3) at (5,2) {$\lnot x_2$};
|
||||||
|
\node[draw, circle, inner sep=1.3pt] (4) at (7,2) {$\lnot x_3$};
|
||||||
|
\node[draw, circle, inner sep=1.3pt] (5) at (4,3.5) {$\lnot x_1$};
|
||||||
|
\node[draw, circle] (6) at (4,0.5) {$x_1$};
|
||||||
|
|
||||||
|
\path[draw,thick,->] (1) -- (5);
|
||||||
|
\path[draw,thick,->] (4) -- (5);
|
||||||
|
\path[draw,thick,->] (6) -- (1);
|
||||||
|
\path[draw,thick,->] (6) -- (4);
|
||||||
|
\path[draw,thick,->] (5) -- (2);
|
||||||
|
\path[draw,thick,->] (5) -- (3);
|
||||||
|
\path[draw,thick,->] (2) -- (6);
|
||||||
|
\path[draw,thick,->] (3) -- (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
The structure of the graph tells us whether
|
||||||
|
it is possible to assign the values
|
||||||
|
of the variables so
|
||||||
|
that the formula is true.
|
||||||
|
It turns out that this can be done
|
||||||
|
exactly when there are no nodes
|
||||||
|
$x_i$ and $\lnot x_i$ such that
|
||||||
|
both nodes belong to the
|
||||||
|
same strongly connected component.
|
||||||
|
If there are such nodes,
|
||||||
|
the graph contains
|
||||||
|
a path from $x_i$ to $\lnot x_i$
|
||||||
|
and also a path from $\lnot x_i$ to $x_i$,
|
||||||
|
so both $x_i$ and $\lnot x_i$ should be true
|
||||||
|
which is not possible.
|
||||||
|
|
||||||
|
In the graph of the formula $L_1$
|
||||||
|
there are no nodes $x_i$ and $\lnot x_i$
|
||||||
|
such that both nodes
|
||||||
|
belong to the same strongly connected component,
|
||||||
|
so a solution exists.
|
||||||
|
In the graph of the formula $L_2$
|
||||||
|
all nodes belong to the same strongly connected component,
|
||||||
|
so a solution does not exist.
|
||||||
|
|
||||||
|
If a solution exists, the values for the variables
|
||||||
|
can be found by going through the nodes of the
|
||||||
|
component graph in a reverse topological sort order.
|
||||||
|
At each step, we process a component
|
||||||
|
that does not contain edges that lead to an
|
||||||
|
unprocessed component.
|
||||||
|
If the variables in the component
|
||||||
|
have not been assigned values,
|
||||||
|
their values will be determined
|
||||||
|
according to the values in the component,
|
||||||
|
and if they already have values,
|
||||||
|
they remain unchanged.
|
||||||
|
The process continues until each variable
|
||||||
|
has been assigned a value.
|
||||||
|
|
||||||
|
The component graph for the formula $L_1$ is as follows:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=1.0]
|
||||||
|
\node[draw, circle] (1) at (0,0) {$A$};
|
||||||
|
\node[draw, circle] (2) at (2,0) {$B$};
|
||||||
|
\node[draw, circle] (3) at (4,0) {$C$};
|
||||||
|
\node[draw, circle] (4) at (6,0) {$D$};
|
||||||
|
|
||||||
|
\path[draw,thick,->] (1) -- (2);
|
||||||
|
\path[draw,thick,->] (2) -- (3);
|
||||||
|
\path[draw,thick,->] (3) -- (4);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
The components are
|
||||||
|
$A = \{\lnot x_4\}$,
|
||||||
|
$B = \{x_1, x_2, \lnot x_3\}$,
|
||||||
|
$C = \{\lnot x_1, \lnot x_2, x_3\}$ and
|
||||||
|
$D = \{x_4\}$.
|
||||||
|
When constructing the solution,
|
||||||
|
we first process the component $D$
|
||||||
|
where $x_4$ becomes true.
|
||||||
|
After this, we process the component $C$
|
||||||
|
where $x_1$ and $x_2$ become false
|
||||||
|
and $x_3$ becomes true.
|
||||||
|
All variables have been assigned values,
|
||||||
|
so the remaining components $A$ and $B$
|
||||||
|
do not change the variables.
|
||||||
|
|
||||||
|
Note that this method works, because the
|
||||||
|
graph has a special structure:
|
||||||
|
if there are paths from node $x_i$ to node $x_j$
|
||||||
|
and from node $x_j$ to node $\lnot x_j$,
|
||||||
|
then node $x_i$ never becomes true.
|
||||||
|
The reason for this is that there is also
|
||||||
|
a path from node $\lnot x_j$ to node $\lnot x_i$,
|
||||||
|
and both $x_i$ and $x_j$ become false.
|
||||||
|
|
||||||
|
\index{3SAT problem}
|
||||||
|
|
||||||
|
A more difficult problem is the \key{3SAT problem},
|
||||||
|
where each part of the formula is of the form
|
||||||
|
$(a_i \lor b_i \lor c_i)$.
|
||||||
|
This problem is NP-hard, so no efficient algorithm
|
||||||
|
for solving the problem is known.
|
|
||||||
|
\chapter{Paths and circuits}
|
||||||
|
|
||||||
|
This chapter focuses on two types of paths in graphs:
|
||||||
|
\begin{itemize}
|
||||||
|
\item An \key{Eulerian path} is a path that
|
||||||
|
goes through each edge exactly once.
|
||||||
|
\item A \key{Hamiltonian path} is a path
|
||||||
|
that visits each node exactly once.
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
While Eulerian and Hamiltonian paths look like
|
||||||
|
similar concepts at first glance,
|
||||||
|
the computational problems related to them
|
||||||
|
are very different.
|
||||||
|
It turns out that there is a simple rule that
|
||||||
|
determines whether a graph contains an Eulerian path,
|
||||||
|
and there is also an efficient algorithm to
|
||||||
|
find such a path if it exists.
|
||||||
|
In contrast, checking the existence of a Hamiltonian path is an NP-hard
|
||||||
|
problem, and no efficient algorithm is known for solving the problem.
|
||||||
|
|
||||||
|
\section{Eulerian paths}
|
||||||
|
|
||||||
|
\index{Eulerian path}
|
||||||
|
|
||||||
|
An \key{Eulerian path}\footnote{L. Euler studied such paths in 1736
|
||||||
|
when he solved the famous Königsberg bridge problem.
|
||||||
|
This was the birth of graph theory.} is a path
|
||||||
|
that goes exactly once through each edge of the graph.
|
||||||
|
For example, the graph
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,5) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,5) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,4) {$3$};
|
||||||
|
\node[draw, circle] (4) at (1,3) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,3) {$5$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (5);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (4) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
has an Eulerian path from node 2 to node 5:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,5) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,5) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,4) {$3$};
|
||||||
|
\node[draw, circle] (4) at (1,3) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,3) {$5$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (5);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (4) -- (5);
|
||||||
|
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]north:1.}] {} (1);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (1) -- node[font=\small,label={[red]left:2.}] {} (4);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (4) -- node[font=\small,label={[red]south:3.}] {} (5);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (5) -- node[font=\small,label={[red]left:4.}] {} (2);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]north:5.}] {} (3);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (3) -- node[font=\small,label={[red]south:6.}] {} (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
\index{Eulerian circuit}
|
||||||
|
An \key{Eulerian circuit}
|
||||||
|
is an Eulerian path that starts and ends
|
||||||
|
at the same node.
|
||||||
|
For example, the graph
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,5) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,5) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,4) {$3$};
|
||||||
|
\node[draw, circle] (4) at (1,3) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,3) {$5$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (5);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (2) -- (4);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
has an Eulerian circuit that starts and ends at node 1:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,5) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,5) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,4) {$3$};
|
||||||
|
\node[draw, circle] (4) at (1,3) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,3) {$5$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (5);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (2) -- (4);
|
||||||
|
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (1) -- node[font=\small,label={[red]left:1.}] {} (4);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (4) -- node[font=\small,label={[red]south:2.}] {} (2);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]right:3.}] {} (5);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (5) -- node[font=\small,label={[red]south:4.}] {} (3);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (3) -- node[font=\small,label={[red]north:5.}] {} (2);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]north:6.}] {} (1);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
\subsubsection{Existence}
|
||||||
|
|
||||||
|
The existence of Eulerian paths and circuits
|
||||||
|
depends on the degrees of the nodes.
|
||||||
|
First, an undirected graph has an Eulerian path
|
||||||
|
exactly when all the edges
|
||||||
|
belong to the same connected component and
|
||||||
|
\begin{itemize}
|
||||||
|
\item the degree of each node is even \emph{or}
|
||||||
|
\item the degree of exactly two nodes is odd,
|
||||||
|
and the degree of all other nodes is even.
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
In the first case, each Eulerian path is also an Eulerian circuit.
|
||||||
|
In the second case, the odd-degree nodes are the starting
|
||||||
|
and ending nodes of an Eulerian path which is not an Eulerian circuit.
|
||||||
|
|
||||||
|
\begin{samepage}
|
||||||
|
For example, in the graph
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,5) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,5) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,4) {$3$};
|
||||||
|
\node[draw, circle] (4) at (1,3) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,3) {$5$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (5);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (4) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
\end{samepage}
|
||||||
|
nodes 1, 3 and 4 have a degree of 2,
|
||||||
|
and nodes 2 and 5 have a degree of 3.
|
||||||
|
Exactly two nodes have an odd degree,
|
||||||
|
so there is an Eulerian path between nodes 2 and 5,
|
||||||
|
but the graph does not contain an Eulerian circuit.
|
||||||
|
|
||||||
|
In a directed graph,
|
||||||
|
we focus on indegrees and outdegrees
|
||||||
|
of the nodes.
|
||||||
|
A directed graph contains an Eulerian path
|
||||||
|
exactly when all the edges belong to the same
|
||||||
|
connected component and
|
||||||
|
\begin{itemize}
|
||||||
|
\item in each node, the indegree equals the outdegree, \emph{or}
|
||||||
|
\item in one node, the indegree is one larger than the outdegree,
|
||||||
|
in another node, the outdegree is one larger than the indegree,
|
||||||
|
and in all other nodes, the indegree equals the outdegree.
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
In the first case, each Eulerian path
|
||||||
|
is also an Eulerian circuit,
|
||||||
|
and in the second case, the graph contains an Eulerian path
|
||||||
|
that begins at the node whose outdegree is larger
|
||||||
|
and ends at the node whose indegree is larger.
|
||||||
|
|
||||||
|
For example, in the graph
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,5) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,5) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,4) {$3$};
|
||||||
|
\node[draw, circle] (4) at (1,3) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,3) {$5$};
|
||||||
|
|
||||||
|
\path[draw,thick,->,>=latex] (1) -- (2);
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- (3);
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- (1);
|
||||||
|
\path[draw,thick,->,>=latex] (3) -- (5);
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- (5);
|
||||||
|
\path[draw,thick,->,>=latex] (5) -- (4);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
nodes 1, 3 and 4 have both indegree 1 and outdegree 1,
|
||||||
|
node 2 has indegree 1 and outdegree 2,
|
||||||
|
and node 5 has indegree 2 and outdegree 1.
|
||||||
|
Hence, the graph contains an Eulerian path
|
||||||
|
from node 2 to node 5:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,5) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,5) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,4) {$3$};
|
||||||
|
\node[draw, circle] (4) at (1,3) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,3) {$5$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (5);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (4) -- (5);
|
||||||
|
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]north:1.}] {} (3);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (3) -- node[font=\small,label={[red]south:2.}] {} (5);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (5) -- node[font=\small,label={[red]south:3.}] {} (4);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (4) -- node[font=\small,label={[red]left:4.}] {} (1);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (1) -- node[font=\small,label={[red]north:5.}] {} (2);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]left:6.}] {} (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
\subsubsection{Hierholzer's algorithm}
|
||||||
|
|
||||||
|
\index{Hierholzer's algorithm}
|
||||||
|
|
||||||
|
\key{Hierholzer's algorithm}\footnote{The algorithm was published
|
||||||
|
in 1873 after Hierholzer's death \cite{hie73}.} is an efficient
|
||||||
|
method for constructing
|
||||||
|
an Eulerian circuit.
|
||||||
|
The algorithm consists of several rounds,
|
||||||
|
each of which adds new edges to the circuit.
|
||||||
|
Of course, we assume that the graph contains
|
||||||
|
an Eulerian circuit; otherwise Hierholzer's
|
||||||
|
algorithm cannot find it.
|
||||||
|
|
||||||
|
First, the algorithm constructs a circuit that contains
|
||||||
|
some (not necessarily all) of the edges of the graph.
|
||||||
|
After this, the algorithm extends the circuit
|
||||||
|
step by step by adding subcircuits to it.
|
||||||
|
The process continues until all edges have been added
|
||||||
|
to the circuit.
|
||||||
|
|
||||||
|
The algorithm extends the circuit by always finding
|
||||||
|
a node $x$ that belongs to the circuit but has
|
||||||
|
an outgoing edge that is not included in the circuit.
|
||||||
|
The algorithm constructs a new path from node $x$
|
||||||
|
that only contains edges that are not yet in the circuit.
|
||||||
|
Sooner or later,
|
||||||
|
the path will return to node $x$,
|
||||||
|
which creates a subcircuit.
|
||||||
|
|
||||||
|
If the graph only contains an Eulerian path,
|
||||||
|
we can still use Hierholzer's algorithm
|
||||||
|
to find it by adding an extra edge to the graph
|
||||||
|
and removing the edge after the circuit
|
||||||
|
has been constructed.
|
||||||
|
For example, in an undirected graph,
|
||||||
|
we add the extra edge between the two
|
||||||
|
odd-degree nodes.
|
||||||
|
|
||||||
|
Next we will see how Hierholzer's algorithm
|
||||||
|
constructs an Eulerian circuit for an undirected graph.
|
||||||
|
|
||||||
|
\subsubsection{Example}
|
||||||
|
|
||||||
|
\begin{samepage}
|
||||||
|
Let us consider the following graph:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (3,5) {$1$};
|
||||||
|
\node[draw, circle] (2) at (1,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (3,3) {$3$};
|
||||||
|
\node[draw, circle] (4) at (5,3) {$4$};
|
||||||
|
\node[draw, circle] (5) at (1,1) {$5$};
|
||||||
|
\node[draw, circle] (6) at (3,1) {$6$};
|
||||||
|
\node[draw, circle] (7) at (5,1) {$7$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (2) -- (6);
|
||||||
|
\path[draw,thick,-] (3) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (6);
|
||||||
|
\path[draw,thick,-] (4) -- (7);
|
||||||
|
\path[draw,thick,-] (5) -- (6);
|
||||||
|
\path[draw,thick,-] (6) -- (7);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
\end{samepage}
|
||||||
|
|
||||||
|
\begin{samepage}
|
||||||
|
Suppose that the algorithm first creates a circuit
|
||||||
|
that begins at node 1.
|
||||||
|
A possible circuit is
|
||||||
|
$1 \rightarrow 2 \rightarrow 3 \rightarrow 1$:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (3,5) {$1$};
|
||||||
|
\node[draw, circle] (2) at (1,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (3,3) {$3$};
|
||||||
|
\node[draw, circle] (4) at (5,3) {$4$};
|
||||||
|
\node[draw, circle] (5) at (1,1) {$5$};
|
||||||
|
\node[draw, circle] (6) at (3,1) {$6$};
|
||||||
|
\node[draw, circle] (7) at (5,1) {$7$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (2) -- (6);
|
||||||
|
\path[draw,thick,-] (3) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (6);
|
||||||
|
\path[draw,thick,-] (4) -- (7);
|
||||||
|
\path[draw,thick,-] (5) -- (6);
|
||||||
|
\path[draw,thick,-] (6) -- (7);
|
||||||
|
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (1) -- node[font=\small,label={[red]north:1.}] {} (2);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]north:2.}] {} (3);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (3) -- node[font=\small,label={[red]east:3.}] {} (1);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
\end{samepage}
|
||||||
|
After this, the algorithm adds
|
||||||
|
the subcircuit
|
||||||
|
$2 \rightarrow 5 \rightarrow 6 \rightarrow 2$
|
||||||
|
to the circuit:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (3,5) {$1$};
|
||||||
|
\node[draw, circle] (2) at (1,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (3,3) {$3$};
|
||||||
|
\node[draw, circle] (4) at (5,3) {$4$};
|
||||||
|
\node[draw, circle] (5) at (1,1) {$5$};
|
||||||
|
\node[draw, circle] (6) at (3,1) {$6$};
|
||||||
|
\node[draw, circle] (7) at (5,1) {$7$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (2) -- (6);
|
||||||
|
\path[draw,thick,-] (3) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (6);
|
||||||
|
\path[draw,thick,-] (4) -- (7);
|
||||||
|
\path[draw,thick,-] (5) -- (6);
|
||||||
|
\path[draw,thick,-] (6) -- (7);
|
||||||
|
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (1) -- node[font=\small,label={[red]north:1.}] {} (2);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]west:2.}] {} (5);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (5) -- node[font=\small,label={[red]south:3.}] {} (6);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (6) -- node[font=\small,label={[red]north:4.}] {} (2);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]north:5.}] {} (3);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (3) -- node[font=\small,label={[red]east:6.}] {} (1);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
Finally, the algorithm adds the subcircuit
|
||||||
|
$6 \rightarrow 3 \rightarrow 4 \rightarrow 7 \rightarrow 6$
|
||||||
|
to the circuit:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (3,5) {$1$};
|
||||||
|
\node[draw, circle] (2) at (1,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (3,3) {$3$};
|
||||||
|
\node[draw, circle] (4) at (5,3) {$4$};
|
||||||
|
\node[draw, circle] (5) at (1,1) {$5$};
|
||||||
|
\node[draw, circle] (6) at (3,1) {$6$};
|
||||||
|
\node[draw, circle] (7) at (5,1) {$7$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (2) -- (6);
|
||||||
|
\path[draw,thick,-] (3) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (6);
|
||||||
|
\path[draw,thick,-] (4) -- (7);
|
||||||
|
\path[draw,thick,-] (5) -- (6);
|
||||||
|
\path[draw,thick,-] (6) -- (7);
|
||||||
|
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (1) -- node[font=\small,label={[red]north:1.}] {} (2);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]west:2.}] {} (5);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (5) -- node[font=\small,label={[red]south:3.}] {} (6);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (6) -- node[font=\small,label={[red]east:4.}] {} (3);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (3) -- node[font=\small,label={[red]north:5.}] {} (4);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (4) -- node[font=\small,label={[red]east:6.}] {} (7);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (7) -- node[font=\small,label={[red]south:7.}] {} (6);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (6) -- node[font=\small,label={[red]right:8.}] {} (2);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]north:9.}] {} (3);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (3) -- node[font=\small,label={[red]east:10.}] {} (1);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
Now all edges are included in the circuit,
|
||||||
|
so we have successfully constructed an Eulerian circuit.
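
The following is a sketch of an iterative implementation of Hierholzer's
algorithm for a \emph{directed} graph; it assumes that
\texttt{adj}[$u$] lists the successors of node $u$ and that an Eulerian
circuit exists.
In an undirected graph, each edge additionally has to be marked as used
when it is traversed in either direction.
\begin{lstlisting}
vector<int> eulerianCircuit(vector<vector<int>>& adj, int start) {
    vector<int> pos(adj.size(), 0), circuit, st = {start};
    while (!st.empty()) {
        int u = st.back();
        if (pos[u] < (int)adj[u].size()) {
            // follow an unused edge, extending the current path
            st.push_back(adj[u][pos[u]++]);
        } else {
            // no unused edges are left at u: add u to the circuit
            circuit.push_back(u);
            st.pop_back();
        }
    }
    reverse(circuit.begin(), circuit.end());
    return circuit;
}
\end{lstlisting}
The vector \texttt{circuit} then contains the nodes of an Eulerian circuit
that begins and ends at node \texttt{start}.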
|
||||||
|
|
||||||
|
\section{Hamiltonian paths}
|
||||||
|
|
||||||
|
\index{Hamiltonian path}
|
||||||
|
|
||||||
|
A \key{Hamiltonian path}
|
||||||
|
%\footnote{W. R. Hamilton (1805--1865) was an Irish mathematician.}
|
||||||
|
is a path
|
||||||
|
that visits each node of the graph exactly once.
|
||||||
|
For example, the graph
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,5) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,5) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,4) {$3$};
|
||||||
|
\node[draw, circle] (4) at (1,3) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,3) {$5$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (5);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (4) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
contains a Hamiltonian path from node 1 to node 3:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,5) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,5) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,4) {$3$};
|
||||||
|
\node[draw, circle] (4) at (1,3) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,3) {$5$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (5);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (4) -- (5);
|
||||||
|
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (1) -- node[font=\small,label={[red]left:1.}] {} (4);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (4) -- node[font=\small,label={[red]south:2.}] {} (5);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (5) -- node[font=\small,label={[red]left:3.}] {} (2);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]north:4.}] {} (3);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
\index{Hamiltonian circuit}
|
||||||
|
|
||||||
|
If a Hamiltonian path begins and ends at the same node,
|
||||||
|
it is called a \key{Hamiltonian circuit}.
|
||||||
|
The graph above also has a Hamiltonian circuit
|
||||||
|
that begins and ends at node 1:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,5) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,5) {$2$};
|
||||||
|
\node[draw, circle] (3) at (5,4) {$3$};
|
||||||
|
\node[draw, circle] (4) at (1,3) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,3) {$5$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (2) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (5);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (4) -- (5);
|
||||||
|
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (1) -- node[font=\small,label={[red]north:1.}] {} (2);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]north:2.}] {} (3);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (3) -- node[font=\small,label={[red]south:3.}] {} (5);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (5) -- node[font=\small,label={[red]south:4.}] {} (4);
|
||||||
|
\path[draw=red,thick,->,line width=2pt] (4) -- node[font=\small,label={[red]left:5.}] {} (1);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
\subsubsection{Existence}
|
||||||
|
|
||||||
|
No efficient method is known for testing if a graph
|
||||||
|
contains a Hamiltonian path, and the problem is NP-hard.
|
||||||
|
Still, in some special cases, we can be certain
|
||||||
|
that a graph contains a Hamiltonian path.
|
||||||
|
|
||||||
|
A simple observation is that if the graph is complete,
|
||||||
|
i.e., there is an edge between all pairs of nodes,
|
||||||
|
it also contains a Hamiltonian path.
|
||||||
|
Stronger results have also been achieved:
|
||||||
|
|
||||||
|
\begin{itemize}
|
||||||
|
\item
|
||||||
|
\index{Dirac's theorem}
|
||||||
|
\key{Dirac's theorem}: %\cite{dir52}
|
||||||
|
If the degree of each node is at least $n/2$,
|
||||||
|
the graph contains a Hamiltonian path.
|
||||||
|
\item
|
||||||
|
\index{Ore's theorem}
|
||||||
|
\key{Ore's theorem}: %\cite{ore60}
|
||||||
|
If the sum of degrees of each non-adjacent pair of nodes
|
||||||
|
is at least $n$,
|
||||||
|
the graph contains a Hamiltonian path.
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
A common property in these theorems and other results is
|
||||||
|
that they guarantee the existence of a Hamiltonian path
|
||||||
|
if the graph has \emph{a large number} of edges.
|
||||||
|
This makes sense, because the more edges the graph contains,
|
||||||
|
the more possibilities there are to construct a Hamiltonian path.
|
||||||
|
|
||||||
|
\subsubsection{Construction}
|
||||||
|
|
||||||
|
Since there is no efficient way to check if a Hamiltonian
|
||||||
|
path exists, it is clear that there is also no method
|
||||||
|
to efficiently construct the path, because otherwise
|
||||||
|
we could just try to construct the path and see
|
||||||
|
whether it exists.
|
||||||
|
|
||||||
|
A simple way to search for a Hamiltonian path is
|
||||||
|
to use a backtracking algorithm that goes through all
|
||||||
|
possible ways to construct the path.
|
||||||
|
The time complexity of such an algorithm is at least $O(n!)$,
|
||||||
|
because there are $n!$ different ways to choose the order of $n$ nodes.
|
||||||
|
|
||||||
|
A more efficient solution is based on dynamic programming
|
||||||
|
(see Chapter 10.5).
|
||||||
|
The idea is to calculate values
|
||||||
|
of a function $\texttt{possible}(S,x)$,
|
||||||
|
where $S$ is a subset of nodes and $x$
|
||||||
|
is one of the nodes of $S$.
|
||||||
|
The function indicates whether there is a Hamiltonian path
|
||||||
|
that visits the nodes of $S$ and ends at node $x$.
|
||||||
|
It is possible to implement this solution in $O(2^n n^2)$ time.
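
As a sketch, the dynamic programming solution can be implemented as follows,
assuming that the nodes are numbered $0,1,\ldots,n-1$ and that
\texttt{adj}[$x$] lists the nodes adjacent to node $x$:
\begin{lstlisting}
// possible[S][x]: is there a path that visits exactly the
// nodes of the subset S (a bitmask) and ends at node x?
vector<vector<bool>> possible(1<<n, vector<bool>(n, false));
for (int x = 0; x < n; x++) possible[1<<x][x] = true;
for (int S = 1; S < (1<<n); S++) {
    for (int x = 0; x < n; x++) {
        if (!possible[S][x]) continue;
        for (int y : adj[x]) {
            if (!(S&(1<<y))) possible[S|(1<<y)][y] = true;
        }
    }
}
// a Hamiltonian path exists if possible[(1<<n)-1][x]
// is true for some node x
\end{lstlisting}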
|
||||||
|
|
||||||
|
\section{De Bruijn sequences}
|
||||||
|
|
||||||
|
\index{De Bruijn sequence}
|
||||||
|
|
||||||
|
A \key{De Bruijn sequence}
|
||||||
|
is a string that contains
|
||||||
|
every string of length $n$
|
||||||
|
exactly once as a substring, for a fixed
|
||||||
|
alphabet of $k$ characters.
|
||||||
|
The length of such a string is
|
||||||
|
$k^n+n-1$ characters.
|
||||||
|
For example, when $n=3$ and $k=2$,
|
||||||
|
an example of a De Bruijn sequence is
|
||||||
|
\[0001011100.\]
|
||||||
|
The substrings of this string are all
|
||||||
|
combinations of three bits:
|
||||||
|
000, 001, 010, 011, 100, 101, 110 and 111.
|
||||||
|
|
||||||
|
It turns out that each De Bruijn sequence
|
||||||
|
corresponds to an Eulerian path in a graph.
|
||||||
|
The idea is to construct a graph where
|
||||||
|
each node contains a string of $n-1$ characters
|
||||||
|
and each edge adds one character to the string.
|
||||||
|
The following graph corresponds to the above scenario:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.8]
|
||||||
|
\node[draw, circle] (00) at (-3,0) {00};
|
||||||
|
\node[draw, circle] (11) at (3,0) {11};
|
||||||
|
\node[draw, circle] (01) at (0,2) {01};
|
||||||
|
\node[draw, circle] (10) at (0,-2) {10};
|
||||||
|
|
||||||
|
\path[draw,thick,->] (00) edge [bend left=20] node[font=\small,label=1] {} (01);
|
||||||
|
\path[draw,thick,->] (01) edge [bend left=20] node[font=\small,label=1] {} (11);
|
||||||
|
\path[draw,thick,->] (11) edge [bend left=20] node[font=\small,label=below:0] {} (10);
|
||||||
|
\path[draw,thick,->] (10) edge [bend left=20] node[font=\small,label=below:0] {} (00);
|
||||||
|
|
||||||
|
\path[draw,thick,->] (01) edge [bend left=30] node[font=\small,label=right:0] {} (10);
|
||||||
|
\path[draw,thick,->] (10) edge [bend left=30] node[font=\small,label=left:1] {} (01);
|
||||||
|
|
||||||
|
\path[draw,thick,-] (00) edge [loop left] node[font=\small,label=below:0] {} (00);
|
||||||
|
\path[draw,thick,-] (11) edge [loop right] node[font=\small,label=below:1] {} (11);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
An Eulerian path in this graph corresponds to a string
|
||||||
|
that contains all strings of length $n$.
|
||||||
|
The string contains the characters of the starting node
|
||||||
|
and all characters of the edges.
|
||||||
|
The starting node has $n-1$ characters
|
||||||
|
and there are $k^n$ characters in the edges,
|
||||||
|
so the length of the string is $k^n+n-1$.
|
||||||
|
|
||||||
|
\section{Knight's tours}
|
||||||
|
|
||||||
|
\index{knight's tour}
|
||||||
|
|
||||||
|
A \key{knight's tour} is a sequence of moves
|
||||||
|
of a knight on an $n \times n$ chessboard
|
||||||
|
following the rules of chess such that the knight
|
||||||
|
visits each square exactly once.
|
||||||
|
A knight's tour is called a \emph{closed} tour
|
||||||
|
if the knight finally returns to the starting square and
|
||||||
|
otherwise it is called an \emph{open} tour.
|
||||||
|
|
||||||
|
For example, here is an open knight's tour on a $5 \times 5$ board:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\draw (0,0) grid (5,5);
|
||||||
|
\node at (0.5,4.5) {$1$};
|
||||||
|
\node at (1.5,4.5) {$4$};
|
||||||
|
\node at (2.5,4.5) {$11$};
|
||||||
|
\node at (3.5,4.5) {$16$};
|
||||||
|
\node at (4.5,4.5) {$25$};
|
||||||
|
\node at (0.5,3.5) {$12$};
|
||||||
|
\node at (1.5,3.5) {$17$};
|
||||||
|
\node at (2.5,3.5) {$2$};
|
||||||
|
\node at (3.5,3.5) {$5$};
|
||||||
|
\node at (4.5,3.5) {$10$};
|
||||||
|
\node at (0.5,2.5) {$3$};
|
||||||
|
\node at (1.5,2.5) {$20$};
|
||||||
|
\node at (2.5,2.5) {$7$};
|
||||||
|
\node at (3.5,2.5) {$24$};
|
||||||
|
\node at (4.5,2.5) {$15$};
|
||||||
|
\node at (0.5,1.5) {$18$};
|
||||||
|
\node at (1.5,1.5) {$13$};
|
||||||
|
\node at (2.5,1.5) {$22$};
|
||||||
|
\node at (3.5,1.5) {$9$};
|
||||||
|
\node at (4.5,1.5) {$6$};
|
||||||
|
\node at (0.5,0.5) {$21$};
|
||||||
|
\node at (1.5,0.5) {$8$};
|
||||||
|
\node at (2.5,0.5) {$19$};
|
||||||
|
\node at (3.5,0.5) {$14$};
|
||||||
|
\node at (4.5,0.5) {$23$};
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
A knight's tour corresponds to a Hamiltonian path in a graph
|
||||||
|
whose nodes represent the squares of the board,
|
||||||
|
and two nodes are connected with an edge if a knight
|
||||||
|
can move between the squares according to the rules of chess.
|
||||||
|
|
||||||
|
A natural way to construct a knight's tour is to use backtracking.
|
||||||
|
The search can be made more efficient by using
|
||||||
|
\emph{heuristics} that attempt to guide the knight so that
|
||||||
|
a complete tour will be found quickly.
|
||||||
|
|
||||||
|
\subsubsection{Warnsdorf's rule}
|
||||||
|
|
||||||
|
\index{heuristic}
|
||||||
|
\index{Warnsdorf's rule}
|
||||||
|
|
||||||
|
\key{Warnsdorf's rule} is a simple and effective heuristic
|
||||||
|
for finding a knight's tour\footnote{This heuristic was proposed
|
||||||
|
in Warnsdorf's book \cite{war23} in 1823. There are
|
||||||
|
also polynomial algorithms for finding knight's tours
|
||||||
|
\cite{par97}, but they are more complicated.}.
|
||||||
|
Using the rule, it is possible to efficiently construct a tour
|
||||||
|
even on a large board.
|
||||||
|
The idea is to always move the knight so that it ends up
|
||||||
|
in a square where the number of possible moves is as
|
||||||
|
\emph{small} as possible.
|
||||||
|
|
||||||
|
For example, in the following situation, there are five
|
||||||
|
possible squares to which the knight can move (squares $a \ldots e$):
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\draw (0,0) grid (5,5);
|
||||||
|
\node at (0.5,4.5) {$1$};
|
||||||
|
\node at (2.5,3.5) {$2$};
|
||||||
|
\node at (4.5,4.5) {$a$};
|
||||||
|
\node at (0.5,2.5) {$b$};
|
||||||
|
\node at (4.5,2.5) {$e$};
|
||||||
|
\node at (1.5,1.5) {$c$};
|
||||||
|
\node at (3.5,1.5) {$d$};
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
In this situation, Warnsdorf's rule moves the knight to square $a$,
|
||||||
|
because after this choice, there is only a single possible move.
|
||||||
|
The other choices would move the knight to squares where
|
||||||
|
there would be three moves available.
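
The following is a sketch of a backtracking search that applies
Warnsdorf's rule; the board size, the starting square and the function
names are choices of the sketch.
\begin{lstlisting}
int n = 5;                                // board size
vector<vector<int>> board(n, vector<int>(n, 0));
int dy[] = {1,1,-1,-1,2,2,-2,-2};
int dx[] = {2,-2,2,-2,1,-1,1,-1};

// is (y,x) inside the board and still unvisited?
bool ok(int y, int x) {
    return 0 <= y && y < n && 0 <= x && x < n && board[y][x] == 0;
}
// number of moves available from square (y,x)
int onward(int y, int x) {
    int c = 0;
    for (int d = 0; d < 8; d++) if (ok(y+dy[d], x+dx[d])) c++;
    return c;
}
bool tour(int y, int x, int step) {
    board[y][x] = step;
    if (step == n*n) return true;
    vector<pair<int,int>> moves;
    for (int d = 0; d < 8; d++)
        if (ok(y+dy[d], x+dx[d]))
            moves.push_back({onward(y+dy[d], x+dx[d]), d});
    // Warnsdorf's rule: try the squares with the fewest onward moves first
    sort(moves.begin(), moves.end());
    for (auto [c, d] : moves)
        if (tour(y+dy[d], x+dx[d], step+1)) return true;
    board[y][x] = 0;                      // backtrack
    return false;
}
\end{lstlisting}
Calling \texttt{tour(0,0,1)} fills \texttt{board} with the order in which
the squares are visited; if the greedy choice ever fails, the search
backtracks and tries the remaining moves.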
|
||||||
|
|
||||||
|
|
|
@ -0,0 +1,726 @@
|
||||||
|
\chapter{Number theory}
|
||||||
|
|
||||||
|
\index{number theory}
|
||||||
|
|
||||||
|
\key{Number theory} is a branch of mathematics
|
||||||
|
that studies integers.
|
||||||
|
Number theory is a fascinating field,
|
||||||
|
because many questions involving integers
|
||||||
|
are very difficult to solve even if they
|
||||||
|
seem simple at first glance.
|
||||||
|
|
||||||
|
As an example, consider the following equation:
|
||||||
|
\[x^3 + y^3 + z^3 = 33\]
|
||||||
|
It is easy to find three real numbers $x$, $y$ and $z$
|
||||||
|
that satisfy the equation.
|
||||||
|
For example, we can choose
|
||||||
|
\[
|
||||||
|
\begin{array}{lcl}
|
||||||
|
x = 3, \\
|
||||||
|
y = \sqrt[3]{3}, \\
|
||||||
|
z = \sqrt[3]{3}.\\
|
||||||
|
\end{array}
|
||||||
|
\]
|
||||||
|
However, it is an open problem in number theory
|
||||||
|
if there are any three
|
||||||
|
\emph{integers} $x$, $y$ and $z$
|
||||||
|
that would satisfy the equation \cite{bec07}.
|
||||||
|
|
||||||
|
In this chapter, we will focus on basic concepts
|
||||||
|
and algorithms in number theory.
|
||||||
|
Throughout the chapter, we will assume that all numbers
|
||||||
|
are integers, if not otherwise stated.
|
||||||
|
|
||||||
|
\section{Primes and factors}
|
||||||
|
|
||||||
|
\index{divisibility}
|
||||||
|
\index{factor}
|
||||||
|
\index{divisor}
|
||||||
|
|
||||||
|
A number $a$ is called a \key{factor} or a \key{divisor} of a number $b$
|
||||||
|
if $a$ divides $b$.
|
||||||
|
If $a$ is a factor of $b$,
|
||||||
|
we write $a \mid b$, and otherwise we write $a \nmid b$.
|
||||||
|
For example, the factors of 24 are
|
||||||
|
1, 2, 3, 4, 6, 8, 12 and 24.
|
||||||
|
|
||||||
|
\index{prime}
|
||||||
|
\index{prime decomposition}
|
||||||
|
|
||||||
|
A number $n>1$ is a \key{prime}
|
||||||
|
if its only positive factors are 1 and $n$.
|
||||||
|
For example, 7, 19 and 41 are primes,
|
||||||
|
but 35 is not a prime, because $5 \cdot 7 = 35$.
|
||||||
|
For every number $n>1$, there is a unique
|
||||||
|
\key{prime factorization}
|
||||||
|
\[ n = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k},\]
|
||||||
|
where $p_1,p_2,\ldots,p_k$ are distinct primes and
|
||||||
|
$\alpha_1,\alpha_2,\ldots,\alpha_k$ are positive integers.
|
||||||
|
For example, the prime factorization for 84 is
|
||||||
|
\[84 = 2^2 \cdot 3^1 \cdot 7^1.\]
|
||||||
|
|
||||||
|
The \key{number of factors} of a number $n$ is
|
||||||
|
\[\tau(n)=\prod_{i=1}^k (\alpha_i+1),\]
|
||||||
|
because for each prime $p_i$, there are
|
||||||
|
$\alpha_i+1$ ways to choose how many times
|
||||||
|
it appears in a factor.
|
||||||
|
For example, the number of factors
|
||||||
|
of 84 is
|
||||||
|
$\tau(84)=3 \cdot 2 \cdot 2 = 12$.
|
||||||
|
The factors are
|
||||||
|
1, 2, 3, 4, 6, 7, 12, 14, 21, 28, 42 and 84.
|
||||||
|
|
||||||
|
The \key{sum of factors} of $n$ is
|
||||||
|
\[\sigma(n)=\prod_{i=1}^k (1+p_i+\ldots+p_i^{\alpha_i}) = \prod_{i=1}^k \frac{p_i^{\alpha_i+1}-1}{p_i-1},\]
|
||||||
|
where the latter formula is based on the geometric progression formula.
|
||||||
|
For example, the sum of factors of 84 is
|
||||||
|
\[\sigma(84)=\frac{2^3-1}{2-1} \cdot \frac{3^2-1}{3-1} \cdot \frac{7^2-1}{7-1} = 7 \cdot 4 \cdot 8 = 224.\]
|
||||||
|
|
||||||
|
The \key{product of factors} of $n$ is
|
||||||
|
\[\mu(n)=n^{\tau(n)/2},\]
|
||||||
|
because we can form $\tau(n)/2$ pairs from the factors,
|
||||||
|
each with product $n$.
|
||||||
|
For example, the factors of 84
|
||||||
|
produce the pairs
|
||||||
|
$1 \cdot 84$, $2 \cdot 42$, $3 \cdot 28$, etc.,
|
||||||
|
and the product of the factors is $\mu(84)=84^6=351298031616$.
|
||||||
|
|
||||||
|
\index{perfect number}
|
||||||
|
|
||||||
|
A number $n$ is called a \key{perfect number} if $n=\sigma(n)-n$,
|
||||||
|
i.e., $n$ equals the sum of its factors
|
||||||
|
between $1$ and $n-1$.
|
||||||
|
For example, 28 is a perfect number,
|
||||||
|
because $28=1+2+4+7+14$.
|
||||||
|
|
||||||
|
\subsubsection{Number of primes}
|
||||||
|
|
||||||
|
It is easy to show that there is an infinite number
|
||||||
|
of primes.
|
||||||
|
If the number of primes were finite,
|
||||||
|
we could construct a set $P=\{p_1,p_2,\ldots,p_n\}$
|
||||||
|
that would contain all the primes.
|
||||||
|
For example, $p_1=2$, $p_2=3$, $p_3=5$, and so on.
|
||||||
|
However, using $P$, we could form the number
|
||||||
|
\[p_1 p_2 \cdots p_n+1\]
|
||||||
|
that is larger than all elements in $P$ and is not divisible by any of them, so it would have a prime factor outside $P$.
|
||||||
|
This is a contradiction, and the number of primes
|
||||||
|
has to be infinite.
|
||||||
|
|
||||||
|
\subsubsection{Density of primes}
|
||||||
|
|
||||||
|
The density of primes describes how frequently primes occur
|
||||||
|
among the numbers.
|
||||||
|
Let $\pi(n)$ denote the number of primes between
|
||||||
|
$1$ and $n$. For example, $\pi(10)=4$, because
|
||||||
|
there are 4 primes between $1$ and $10$: 2, 3, 5 and 7.
|
||||||
|
|
||||||
|
It is possible to show that
|
||||||
|
\[\pi(n) \approx \frac{n}{\ln n},\]
|
||||||
|
which means that primes are quite frequent.
|
||||||
|
For example, the number of primes between
|
||||||
|
$1$ and $10^6$ is $\pi(10^6)=78498$,
|
||||||
|
and $10^6 / \ln 10^6 \approx 72382$.
|
||||||
|
|
||||||
|
\subsubsection{Conjectures}
|
||||||
|
|
||||||
|
There are many \emph{conjectures} involving primes.
|
||||||
|
Most people think that the conjectures are true,
|
||||||
|
but nobody has been able to prove them.
|
||||||
|
For example, the following conjectures are famous:
|
||||||
|
|
||||||
|
\begin{itemize}
|
||||||
|
\index{Goldbach's conjecture}
|
||||||
|
\item \key{Goldbach's conjecture}:
|
||||||
|
Each even integer $n>2$ can be represented as a
|
||||||
|
sum $n=a+b$ so that both $a$ and $b$ are primes.
|
||||||
|
\index{twin prime}
|
||||||
|
\item \key{Twin prime conjecture}:
|
||||||
|
There is an infinite number of pairs
|
||||||
|
of the form $\{p,p+2\}$,
|
||||||
|
where both $p$ and $p+2$ are primes.
|
||||||
|
\index{Legendre's conjecture}
|
||||||
|
\item \key{Legendre's conjecture}:
|
||||||
|
There is always a prime between numbers
|
||||||
|
$n^2$ and $(n+1)^2$, where $n$ is any positive integer.
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
\subsubsection{Basic algorithms}
|
||||||
|
|
||||||
|
If a number $n$ is not prime,
|
||||||
|
it can be represented as a product $a \cdot b$,
|
||||||
|
where $a \le \sqrt n$ or $b \le \sqrt n$,
|
||||||
|
so it certainly has a factor between $2$ and $\lfloor \sqrt n \rfloor$.
|
||||||
|
Using this observation, we can both test
|
||||||
|
if a number is prime and find the prime factorization
|
||||||
|
of a number in $O(\sqrt n)$ time.
|
||||||
|
|
||||||
|
The following function \texttt{prime} checks
|
||||||
|
if the given number $n$ is prime.
|
||||||
|
The function attempts to divide $n$ by
|
||||||
|
all numbers between $2$ and $\lfloor \sqrt n \rfloor$,
|
||||||
|
and if none of them divides $n$, then $n$ is prime.
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
bool prime(int n) {
|
||||||
|
if (n < 2) return false;
|
||||||
|
for (int x = 2; x*x <= n; x++) {
|
||||||
|
if (n%x == 0) return false;
|
||||||
|
}
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
\noindent
|
||||||
|
The following function \texttt{factors}
|
||||||
|
constructs a vector that contains the prime
|
||||||
|
factorization of $n$.
|
||||||
|
The function divides $n$ by its prime factors,
|
||||||
|
and adds them to the vector.
|
||||||
|
The process ends when the remaining number $n$
|
||||||
|
has no factors between $2$ and $\lfloor \sqrt n \rfloor$.
|
||||||
|
If $n>1$, it is prime and the last factor.
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
vector<int> factors(int n) {
|
||||||
|
vector<int> f;
|
||||||
|
for (int x = 2; x*x <= n; x++) {
|
||||||
|
while (n%x == 0) {
|
||||||
|
f.push_back(x);
|
||||||
|
n /= x;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
if (n > 1) f.push_back(n);
|
||||||
|
return f;
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
Note that each prime factor appears in the vector
|
||||||
|
as many times as it divides the number.
|
||||||
|
For example, $24=2^3 \cdot 3$,
|
||||||
|
so the result of the function is $[2,2,2,3]$.
|
||||||
|
|
||||||
|
\subsubsection{Sieve of Eratosthenes}
|
||||||
|
|
||||||
|
\index{sieve of Eratosthenes}
|
||||||
|
|
||||||
|
The \key{sieve of Eratosthenes}
|
||||||
|
%\footnote{Eratosthenes (c. 276 BC -- c. 194 BC) was a Greek mathematician.}
|
||||||
|
is a preprocessing
|
||||||
|
algorithm that builds an array with which we
|
||||||
|
can efficiently check if a given number between $2 \ldots n$
|
||||||
|
is prime and, if it is not, find one prime factor of the number.
|
||||||
|
|
||||||
|
The algorithm builds an array $\texttt{sieve}$
|
||||||
|
whose positions $2,3,\ldots,n$ are used.
|
||||||
|
The value $\texttt{sieve}[k]=0$ means
|
||||||
|
that $k$ is prime,
|
||||||
|
and the value $\texttt{sieve}[k] \neq 0$
|
||||||
|
means that $k$ is not a prime and one
|
||||||
|
of its prime factors is $\texttt{sieve}[k]$.
|
||||||
|
|
||||||
|
The algorithm iterates through the numbers
|
||||||
|
$2 \ldots n$ one by one.
|
||||||
|
Whenever a new prime $x$ is found,
|
||||||
|
the algorithm records that the multiples
|
||||||
|
of $x$ ($2x,3x,4x,\ldots$) are not primes,
|
||||||
|
because the number $x$ divides them.
|
||||||
|
|
||||||
|
For example, if $n=20$, the array is as follows:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\draw (0,0) grid (19,1);
|
||||||
|
|
||||||
|
\node at (0.5,0.5) {$0$};
|
||||||
|
\node at (1.5,0.5) {$0$};
|
||||||
|
\node at (2.5,0.5) {$2$};
|
||||||
|
\node at (3.5,0.5) {$0$};
|
||||||
|
\node at (4.5,0.5) {$3$};
|
||||||
|
\node at (5.5,0.5) {$0$};
|
||||||
|
\node at (6.5,0.5) {$2$};
|
||||||
|
\node at (7.5,0.5) {$3$};
|
||||||
|
\node at (8.5,0.5) {$5$};
|
||||||
|
\node at (9.5,0.5) {$0$};
|
||||||
|
\node at (10.5,0.5) {$3$};
|
||||||
|
\node at (11.5,0.5) {$0$};
|
||||||
|
\node at (12.5,0.5) {$7$};
|
||||||
|
\node at (13.5,0.5) {$5$};
|
||||||
|
\node at (14.5,0.5) {$2$};
|
||||||
|
\node at (15.5,0.5) {$0$};
|
||||||
|
\node at (16.5,0.5) {$3$};
|
||||||
|
\node at (17.5,0.5) {$0$};
|
||||||
|
\node at (18.5,0.5) {$5$};
|
||||||
|
|
||||||
|
\footnotesize
|
||||||
|
|
||||||
|
\node at (0.5,1.5) {$2$};
|
||||||
|
\node at (1.5,1.5) {$3$};
|
||||||
|
\node at (2.5,1.5) {$4$};
|
||||||
|
\node at (3.5,1.5) {$5$};
|
||||||
|
\node at (4.5,1.5) {$6$};
|
||||||
|
\node at (5.5,1.5) {$7$};
|
||||||
|
\node at (6.5,1.5) {$8$};
|
||||||
|
\node at (7.5,1.5) {$9$};
|
||||||
|
\node at (8.5,1.5) {$10$};
|
||||||
|
\node at (9.5,1.5) {$11$};
|
||||||
|
\node at (10.5,1.5) {$12$};
|
||||||
|
\node at (11.5,1.5) {$13$};
|
||||||
|
\node at (12.5,1.5) {$14$};
|
||||||
|
\node at (13.5,1.5) {$15$};
|
||||||
|
\node at (14.5,1.5) {$16$};
|
||||||
|
\node at (15.5,1.5) {$17$};
|
||||||
|
\node at (16.5,1.5) {$18$};
|
||||||
|
\node at (17.5,1.5) {$19$};
|
||||||
|
\node at (18.5,1.5) {$20$};
|
||||||
|
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
The following code implements the sieve of
|
||||||
|
Eratosthenes.
|
||||||
|
The code assumes that each element of
|
||||||
|
\texttt{sieve} is initially zero.
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
for (int x = 2; x <= n; x++) {
|
||||||
|
if (sieve[x]) continue;
|
||||||
|
for (int u = 2*x; u <= n; u += x) {
|
||||||
|
sieve[u] = x;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
The inner loop of the algorithm is executed
|
||||||
|
$n/x$ times for each value of $x$.
|
||||||
|
Thus, an upper bound for the running time
|
||||||
|
of the algorithm is the harmonic sum
|
||||||
|
\[\sum_{x=2}^n n/x = n/2 + n/3 + n/4 + \cdots + n/n = O(n \log n).\]
|
||||||
|
|
||||||
|
\index{harmonic sum}
|
||||||
|
|
||||||
|
In fact, the algorithm is more efficient,
|
||||||
|
because the inner loop will be executed only if
|
||||||
|
the number $x$ is prime.
|
||||||
|
It can be shown that the running time of the
|
||||||
|
algorithm is only $O(n \log \log n)$,
|
||||||
|
a complexity very near to $O(n)$.
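
As a sketch of how the array can be used, the following function
factorizes a number $k \le n$ by repeatedly dividing it by the stored
prime factor; unlike the \texttt{factors} function above, the factors
are not necessarily produced in increasing order.
\begin{lstlisting}
vector<int> sieveFactors(int k) {
    vector<int> f;
    while (sieve[k]) {          // k is not prime
        f.push_back(sieve[k]);
        k /= sieve[k];
    }
    if (k > 1) f.push_back(k);  // the remaining part is prime
    return f;
}
\end{lstlisting}
For example, with the array above, \texttt{sieveFactors(20)} produces the
factors $[5,2,2]$.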
|
||||||
|
|
||||||
|
\subsubsection{Euclid's algorithm}
|
||||||
|
|
||||||
|
\index{greatest common divisor}
|
||||||
|
\index{least common multiple}
|
||||||
|
\index{Euclid's algorithm}
|
||||||
|
|
||||||
|
The \key{greatest common divisor} of
|
||||||
|
numbers $a$ and $b$, $\gcd(a,b)$,
|
||||||
|
is the greatest number that divides both $a$ and $b$,
|
||||||
|
and the \key{least common multiple} of
|
||||||
|
$a$ and $b$, $\textrm{lcm}(a,b)$,
|
||||||
|
is the smallest number that is divisible by
|
||||||
|
both $a$ and $b$.
|
||||||
|
For example,
|
||||||
|
$\gcd(24,36)=12$ and
|
||||||
|
$\textrm{lcm}(24,36)=72$.
|
||||||
|
|
||||||
|
The greatest common divisor and the least common multiple
|
||||||
|
are connected as follows:
|
||||||
|
\[\textrm{lcm}(a,b)=\frac{ab}{\textrm{gcd}(a,b)}\]
|
||||||
|
|
||||||
|
\key{Euclid's algorithm}\footnote{Euclid was a Greek mathematician who
|
||||||
|
lived in about 300 BC. This is perhaps the first known algorithm in history.} provides an efficient way
|
||||||
|
to find the greatest common divisor of two numbers.
|
||||||
|
The algorithm is based on the following formula:
|
||||||
|
\begin{equation*}
|
||||||
|
\textrm{gcd}(a,b) = \begin{cases}
|
||||||
|
a & b = 0\\
|
||||||
|
\textrm{gcd}(b,a \bmod b) & b \neq 0\\
|
||||||
|
\end{cases}
|
||||||
|
\end{equation*}
|
||||||
|
|
||||||
|
For example,
|
||||||
|
\[\textrm{gcd}(24,36) = \textrm{gcd}(36,24)
|
||||||
|
= \textrm{gcd}(24,12) = \textrm{gcd}(12,0)=12.\]
|
||||||
|
|
||||||
|
The algorithm can be implemented as follows:
|
||||||
|
\begin{lstlisting}
|
||||||
|
int gcd(int a, int b) {
|
||||||
|
if (b == 0) return a;
|
||||||
|
return gcd(b, a%b);
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
It can be shown that Euclid's algorithm works
|
||||||
|
in $O(\log n)$ time, where $n=\min(a,b)$.
|
||||||
|
The worst case for the algorithm is
|
||||||
|
the case when $a$ and $b$ are consecutive Fibonacci numbers.
|
||||||
|
For example,
|
||||||
|
\[\textrm{gcd}(13,8)=\textrm{gcd}(8,5)
|
||||||
|
=\textrm{gcd}(5,3)=\textrm{gcd}(3,2)=\textrm{gcd}(2,1)=\textrm{gcd}(1,0)=1.\]
|
||||||
|
|
||||||
|
\subsubsection{Euler's totient function}
|
||||||
|
|
||||||
|
\index{coprime}
|
||||||
|
\index{Euler's totient function}
|
||||||
|
|
||||||
|
Numbers $a$ and $b$ are \key{coprime}
|
||||||
|
if $\textrm{gcd}(a,b)=1$.
|
||||||
|
\key{Euler's totient function} $\varphi(n)$
|
||||||
|
%\footnote{Euler presented this function in 1763.}
|
||||||
|
gives the number of integers coprime to $n$
|
||||||
|
between $1$ and $n$.
|
||||||
|
For example, $\varphi(12)=4$,
|
||||||
|
because 1, 5, 7 and 11
|
||||||
|
are coprime to 12.
|
||||||
|
|
||||||
|
The value of $\varphi(n)$ can be calculated
|
||||||
|
from the prime factorization of $n$
|
||||||
|
using the formula
|
||||||
|
\[ \varphi(n) = \prod_{i=1}^k p_i^{\alpha_i-1}(p_i-1). \]
|
||||||
|
For example, $\varphi(12)=2^1 \cdot (2-1) \cdot 3^0 \cdot (3-1)=4$.
|
||||||
|
Note that $\varphi(n)=n-1$ if $n$ is prime.
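
As a sketch, the value of $\varphi(n)$ can be computed with trial division
using the equivalent form $\varphi(n) = n \prod_{i=1}^k (1-1/p_i)$:
\begin{lstlisting}
int phi(int n) {
    int result = n;
    for (int x = 2; x*x <= n; x++) {
        if (n%x == 0) {
            result -= result/x;        // multiply the result by (1-1/x)
            while (n%x == 0) n /= x;
        }
    }
    if (n > 1) result -= result/n;     // the remaining part is a prime factor
    return result;
}
\end{lstlisting}
For example, \texttt{phi(12)} returns 4.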
|
||||||
|
|
||||||
|
\section{Modular arithmetic}
|
||||||
|
|
||||||
|
\index{modular arithmetic}
|
||||||
|
|
||||||
|
In \key{modular arithmetic},
|
||||||
|
the set of numbers is limited so
|
||||||
|
that only numbers $0,1,2,\ldots,m-1$ are used,
|
||||||
|
where $m$ is a constant.
|
||||||
|
Each number $x$ is
|
||||||
|
represented by the number $x \bmod m$:
|
||||||
|
the remainder after dividing $x$ by $m$.
|
||||||
|
For example, if $m=17$, then $75$
|
||||||
|
is represented by $75 \bmod 17 = 7$.
|
||||||
|
|
||||||
|
Often we can take remainders before doing
|
||||||
|
calculations.
|
||||||
|
In particular, the following formulas hold:
|
||||||
|
\[
|
||||||
|
\begin{array}{rcl}
|
||||||
|
(x+y) \bmod m & = & (x \bmod m + y \bmod m) \bmod m \\
|
||||||
|
(x-y) \bmod m & = & (x \bmod m - y \bmod m) \bmod m \\
|
||||||
|
(x \cdot y) \bmod m & = & (x \bmod m \cdot y \bmod m) \bmod m \\
|
||||||
|
x^n \bmod m & = & (x \bmod m)^n \bmod m \\
|
||||||
|
\end{array}
|
||||||
|
\]
|
||||||
|
|
||||||
|
\subsubsection{Modular exponentiation}
|
||||||
|
|
||||||
|
There is often a need to efficiently calculate
|
||||||
|
the value of $x^n \bmod m$.
|
||||||
|
This can be done in $O(\log n)$ time
|
||||||
|
using the following recursion:
|
||||||
|
\begin{equation*}
|
||||||
|
x^n = \begin{cases}
|
||||||
|
1 & n = 0\\
|
||||||
|
x^{n/2} \cdot x^{n/2} & \text{$n$ is even}\\
|
||||||
|
x^{n-1} \cdot x & \text{$n$ is odd}
|
||||||
|
\end{cases}
|
||||||
|
\end{equation*}
|
||||||
|
|
||||||
|
It is important that in the case of an even $n$,
|
||||||
|
the value of $x^{n/2}$ is calculated only once.
|
||||||
|
This guarantees that the time complexity of the
|
||||||
|
algorithm is $O(\log n)$, because $n$ is always halved
|
||||||
|
when it is even.
|
||||||
|
|
||||||
|
The following function calculates the value of
|
||||||
|
$x^n \bmod m$:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
int modpow(int x, int n, int m) {
|
||||||
|
if (n == 0) return 1%m;
|
||||||
|
long long u = modpow(x,n/2,m);
|
||||||
|
u = (u*u)%m;
|
||||||
|
if (n%2 == 1) u = (u*x)%m;
|
||||||
|
return u;
|
||||||
|
}
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
\subsubsection{Fermat's theorem and Euler's theorem}
|
||||||
|
|
||||||
|
\index{Fermat's theorem}
|
||||||
|
\index{Euler's theorem}
|
||||||
|
|
||||||
|
\key{Fermat's theorem}
|
||||||
|
%\footnote{Fermat discovered this theorem in 1640.}
|
||||||
|
states that
|
||||||
|
\[x^{m-1} \bmod m = 1\]
|
||||||
|
when $m$ is prime and $x$ and $m$ are coprime.
|
||||||
|
This also yields
|
||||||
|
\[x^k \bmod m = x^{k \bmod (m-1)} \bmod m.\]
|
||||||
|
More generally, \key{Euler's theorem}
|
||||||
|
%\footnote{Euler published this theorem in 1763.}
|
||||||
|
states that
|
||||||
|
\[x^{\varphi(m)} \bmod m = 1\]
|
||||||
|
when $x$ and $m$ are coprime.
|
||||||
|
Fermat's theorem follows from Euler's theorem,
|
||||||
|
because if $m$ is a prime, then $\varphi(m)=m-1$.
|
||||||
|
|
||||||
|
\subsubsection{Modular inverse}
|
||||||
|
|
||||||
|
\index{modular inverse}
|
||||||
|
|
||||||
|
The inverse of $x$ modulo $m$
|
||||||
|
is a number $x^{-1}$ such that
|
||||||
|
\[ x x^{-1} \bmod m = 1. \]
|
||||||
|
For example, if $x=6$ and $m=17$,
|
||||||
|
then $x^{-1}=3$, because $6\cdot3 \bmod 17=1$.
|
||||||
|
|
||||||
|
Using modular inverses, we can divide numbers
|
||||||
|
modulo $m$, because division by $x$
|
||||||
|
corresponds to multiplication by $x^{-1}$.
|
||||||
|
For example, to evaluate the value of $36/6 \bmod 17$,
|
||||||
|
we can use the formula $2 \cdot 3 \bmod 17$,
|
||||||
|
because $36 \bmod 17 = 2$ and $6^{-1} \bmod 17 = 3$.
|
||||||
|
|
||||||
|
However, a modular inverse does not always exist.
|
||||||
|
For example, if $x=2$ and $m=4$, the equation
|
||||||
|
\[ x x^{-1} \bmod m = 1 \]
|
||||||
|
cannot be solved, because all multiples of 2
|
||||||
|
are even and the remainder can never be 1 when $m=4$.
|
||||||
|
It turns out that the value of $x^{-1} \bmod m$
|
||||||
|
can be calculated exactly when $x$ and $m$ are coprime.
|
||||||
|
|
||||||
|
If a modular inverse exists, it can be
|
||||||
|
calculated using the formula
|
||||||
|
\[
|
||||||
|
x^{-1} = x^{\varphi(m)-1}.
|
||||||
|
\]
|
||||||
|
If $m$ is prime, the formula becomes
|
||||||
|
\[
|
||||||
|
x^{-1} = x^{m-2}.
|
||||||
|
\]
|
||||||
|
For example,
|
||||||
|
\[6^{-1} \bmod 17 =6^{17-2} \bmod 17 = 3.\]
|
||||||
|
|
||||||
|
This formula allows us to efficiently calculate
|
||||||
|
modular inverses using the modular exponentiation algorithm.
|
||||||
|
The formula can be derived using Euler's theorem.
|
||||||
|
First, the modular inverse should satisfy the following equation:
|
||||||
|
\[
|
||||||
|
x x^{-1} \bmod m = 1.
|
||||||
|
\]
|
||||||
|
On the other hand, according to Euler's theorem,
|
||||||
|
\[
|
||||||
|
x^{\varphi(m)} \bmod m = xx^{\varphi(m)-1} \bmod m = 1,
|
||||||
|
\]
|
||||||
|
so the numbers $x^{-1}$ and $x^{\varphi(m)-1}$ are equal.
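
As a sketch, when $m$ is prime, a modular inverse can be computed with the
\texttt{modpow} function presented earlier in this chapter:
\begin{lstlisting}
int modinv(int x, int m) {
    // for a general m coprime to x, the exponent
    // would be phi(m)-1 instead of m-2
    return modpow(x, m-2, m);
}
\end{lstlisting}
For example, \texttt{modinv(6,17)} returns 3.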
|
||||||
|
|
||||||
|
\subsubsection{Computer arithmetic}
|
||||||
|
|
||||||
|
In programming, unsigned integers are represented modulo $2^k$,
|
||||||
|
where $k$ is the number of bits of the data type.
|
||||||
|
A usual consequence of this is that a number wraps around
|
||||||
|
if it becomes too large.
|
||||||
|
|
||||||
|
For example, in C++, numbers of type \texttt{unsigned int}
|
||||||
|
are represented modulo $2^{32}$.
|
||||||
|
The following code declares an \texttt{unsigned int}
|
||||||
|
variable whose value is $123456789$.
|
||||||
|
After this, the value will be multiplied by itself,
|
||||||
|
and the result is
|
||||||
|
$123456789^2 \bmod 2^{32} = 2537071545$.
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
unsigned int x = 123456789;
|
||||||
|
cout << x*x << "\n"; // 2537071545
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
\section{Solving equations}
|
||||||
|
|
||||||
|
\subsubsection{Diophantine equations}
|
||||||
|
|
||||||
|
\index{Diophantine equation}
|
||||||
|
|
||||||
|
A \key{Diophantine equation}
|
||||||
|
%\footnote{Diophantus of Alexandria was a Greek mathematician who lived in the 3th century.}
|
||||||
|
is an equation of the form
|
||||||
|
\[ ax + by = c, \]
|
||||||
|
where $a$, $b$ and $c$ are constants
|
||||||
|
and the values of $x$ and $y$ should be found.
|
||||||
|
Each number in the equation has to be an integer.
|
||||||
|
For example, one solution for the equation
|
||||||
|
$5x+2y=11$ is $x=3$ and $y=-2$.
|
||||||
|
|
||||||
|
\index{extended Euclid's algorithm}
|
||||||
|
|
||||||
|
We can efficiently solve a Diophantine equation
|
||||||
|
by using Euclid's algorithm.
|
||||||
|
It turns out that we can extend Euclid's algorithm
|
||||||
|
so that it will find numbers $x$ and $y$
|
||||||
|
that satisfy the following equation:
|
||||||
|
\[
|
||||||
|
ax + by = \textrm{gcd}(a,b)
|
||||||
|
\]
|
||||||
|
|
||||||
|
A Diophantine equation can be solved if
|
||||||
|
$c$ is divisible by
|
||||||
|
$\textrm{gcd}(a,b)$,
|
||||||
|
and otherwise it cannot be solved.
|
||||||
|
|
||||||
|
As an example, let us find numbers $x$ and $y$
|
||||||
|
that satisfy the following equation:
|
||||||
|
\[
|
||||||
|
39x + 15y = 12
|
||||||
|
\]
|
||||||
|
The equation can be solved, because
|
||||||
|
$\textrm{gcd}(39,15)=3$ and $3 \mid 12$.
|
||||||
|
When Euclid's algorithm calculates the
|
||||||
|
greatest common divisor of 39 and 15,
|
||||||
|
it produces the following sequence of function calls:
|
||||||
|
\[
|
||||||
|
\textrm{gcd}(39,15) = \textrm{gcd}(15,9)
|
||||||
|
= \textrm{gcd}(9,6) = \textrm{gcd}(6,3)
|
||||||
|
= \textrm{gcd}(3,0) = 3 \]
|
||||||
|
This corresponds to the following equations:
|
||||||
|
\[
|
||||||
|
\begin{array}{lcl}
|
||||||
|
39 - 2 \cdot 15 & = & 9 \\
|
||||||
|
15 - 1 \cdot 9 & = & 6 \\
|
||||||
|
9 - 1 \cdot 6 & = & 3 \\
|
||||||
|
\end{array}
|
||||||
|
\]
|
||||||
|
Using these equations, we can derive
|
||||||
|
\[
|
||||||
|
39 \cdot 2 + 15 \cdot (-5) = 3
|
||||||
|
\]
|
||||||
|
and by multiplying this by 4, the result is
|
||||||
|
\[
|
||||||
|
39 \cdot 8 + 15 \cdot (-20) = 12,
|
||||||
|
\]
|
||||||
|
so a solution to the equation is
|
||||||
|
$x=8$ and $y=-20$.
|
||||||
|
|
||||||
|
A solution to a Diophantine equation is not unique,
|
||||||
|
because we can form an infinite number of solutions
|
||||||
|
if we know one solution.
|
||||||
|
If a pair $(x,y)$ is a solution, then also all pairs
|
||||||
|
\[(x+\frac{kb}{\textrm{gcd}(a,b)},y-\frac{ka}{\textrm{gcd}(a,b)})\]
|
||||||
|
are solutions, where $k$ is any integer.
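
As a sketch, the extended algorithm can be implemented recursively as
follows; the function returns a triple $(g,x,y)$ such that
$ax+by=g=\textrm{gcd}(a,b)$.
\begin{lstlisting}
tuple<long long,long long,long long> extgcd(long long a, long long b) {
    if (b == 0) return {a, 1, 0};
    auto [g, x, y] = extgcd(b, a%b);
    // b*x + (a%b)*y = g, and a%b = a - (a/b)*b
    return {g, y, x - (a/b)*y};
}
\end{lstlisting}
For example, \texttt{extgcd(39,15)} returns $(3,2,-5)$, and multiplying
$x$ and $y$ by $12/3=4$ gives the solution $x=8$ and $y=-20$ found above.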
|
||||||
|
|
||||||
|
\subsubsection{Chinese remainder theorem}
|
||||||
|
|
||||||
|
\index{Chinese remainder theorem}
|
||||||
|
|
||||||
|
The \key{Chinese remainder theorem} solves
|
||||||
|
a group of equations of the form
|
||||||
|
\[
|
||||||
|
\begin{array}{lcl}
|
||||||
|
x & = & a_1 \bmod m_1 \\
|
||||||
|
x & = & a_2 \bmod m_2 \\
|
||||||
|
\cdots \\
|
||||||
|
x & = & a_n \bmod m_n \\
|
||||||
|
\end{array}
|
||||||
|
\]
|
||||||
|
where the moduli $m_1,m_2,\ldots,m_n$ are pairwise coprime.
|
||||||
|
|
||||||
|
Let $x^{-1}_m$ be the inverse of $x$ modulo $m$, and
|
||||||
|
\[ X_k = \frac{m_1 m_2 \cdots m_n}{m_k}.\]
|
||||||
|
Using this notation, a solution to the equations is
|
||||||
|
\[x = a_1 X_1 {X_1}^{-1}_{m_1} + a_2 X_2 {X_2}^{-1}_{m_2} + \cdots + a_n X_n {X_n}^{-1}_{m_n}.\]
|
||||||
|
In this solution, for each $k=1,2,\ldots,n$,
|
||||||
|
\[a_k X_k {X_k}^{-1}_{m_k} \bmod m_k = a_k,\]
|
||||||
|
because
|
||||||
|
\[X_k {X_k}^{-1}_{m_k} \bmod m_k = 1.\]
|
||||||
|
Since all other terms in the sum are divisible by $m_k$,
|
||||||
|
they have no effect on the remainder,
|
||||||
|
and $x \bmod m_k = a_k$.
|
||||||
|
|
||||||
|
For example, a solution for
|
||||||
|
\[
|
||||||
|
\begin{array}{lcl}
|
||||||
|
x & = & 3 \bmod 5 \\
|
||||||
|
x & = & 4 \bmod 7 \\
|
||||||
|
x & = & 2 \bmod 3 \\
|
||||||
|
\end{array}
|
||||||
|
\]
|
||||||
|
is
|
||||||
|
\[ 3 \cdot 21 \cdot 1 + 4 \cdot 15 \cdot 1 + 2 \cdot 35 \cdot 2 = 263.\]
|
||||||
|
|
||||||
|
Once we have found a solution $x$,
|
||||||
|
we can create an infinite number of other solutions,
|
||||||
|
because all numbers of the form
|
||||||
|
\[x+m_1 m_2 \cdots m_n\]
|
||||||
|
are solutions.
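
As a sketch, the formula can be implemented as follows; it assumes that
the moduli are pairwise coprime and that the intermediate products fit in
a \texttt{long long}, and it uses the \texttt{extgcd} function from the
previous sketch to compute the inverses.
The function returns the solution reduced modulo $m_1 m_2 \cdots m_n$.
\begin{lstlisting}
long long crt(vector<long long> a, vector<long long> m) {
    long long M = 1;
    for (long long x : m) M *= x;
    long long res = 0;
    for (int k = 0; k < (int)a.size(); k++) {
        long long Xk = M/m[k];
        auto [g, p, q] = extgcd(Xk%m[k], m[k]);
        long long inv = (p%m[k]+m[k])%m[k];  // inverse of Xk modulo m[k]
        long long term = a[k]%m[k];
        term = term*Xk%M;
        term = term*inv%M;
        res = (res+term)%M;
    }
    return res;
}
\end{lstlisting}
For the example above, \texttt{crt(\{3,4,2\},\{5,7,3\})} returns
$53 = 263 \bmod 105$.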
|
||||||
|
|
||||||
|
\section{Other results}
|
||||||
|
|
||||||
|
\subsubsection{Lagrange's theorem}
|
||||||
|
|
||||||
|
\index{Lagrange's theorem}
|
||||||
|
|
||||||
|
\key{Lagrange's theorem}
|
||||||
|
%\footnote{J.-L. Lagrange (1736--1813) was an Italian mathematician.}
|
||||||
|
states that every positive integer
|
||||||
|
can be represented as a sum of four squares, i.e.,
|
||||||
|
$a^2+b^2+c^2+d^2$.
|
||||||
|
For example, the number 123 can be represented
|
||||||
|
as the sum $8^2+5^2+5^2+3^2$.
|
||||||
|
|
||||||
|
\subsubsection{Zeckendorf's theorem}
|
||||||
|
|
||||||
|
\index{Zeckendorf's theorem}
|
||||||
|
\index{Fibonacci number}
|
||||||
|
|
||||||
|
\key{Zeckendorf's theorem}
|
||||||
|
%\footnote{E. Zeckendorf published the theorem in 1972 \cite{zec72}; however, this was not a new result.}
|
||||||
|
states that every
|
||||||
|
positive integer has a unique representation
|
||||||
|
as a sum of Fibonacci numbers such that
|
||||||
|
no two numbers are equal or consecutive
|
||||||
|
Fibonacci numbers.
|
||||||
|
For example, the number 74 can be represented
|
||||||
|
as the sum $55+13+5+1$.
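
The representation can be found greedily by always subtracting the largest
Fibonacci number that does not exceed the remaining value.
A small sketch of this idea (the function name is only an example):
\begin{lstlisting}
// a sketch: greedily builds the Zeckendorf representation of n >= 1
vector<long long> zeckendorf(long long n) {
    vector<long long> fib = {1, 2};
    while (fib[fib.size()-1]+fib[fib.size()-2] <= n)
        fib.push_back(fib[fib.size()-1]+fib[fib.size()-2]);
    vector<long long> terms;
    for (int i = fib.size()-1; i >= 0 && n > 0; i--) {
        if (fib[i] <= n) {terms.push_back(fib[i]); n -= fib[i];}
    }
    return terms;
}
\end{lstlisting}
For example, \texttt{zeckendorf(74)} produces the terms $55$, $13$, $5$ and $1$.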
|
||||||
|
|
||||||
|
\subsubsection{Pythagorean triples}
|
||||||
|
|
||||||
|
\index{Pythagorean triple}
|
||||||
|
\index{Euclid's formula}
|
||||||
|
|
||||||
|
A \key{Pythagorean triple} is a triple $(a,b,c)$
|
||||||
|
that satisfies the Pythagorean theorem
|
||||||
|
$a^2+b^2=c^2$, which means that there is a right triangle
|
||||||
|
with side lengths $a$, $b$ and $c$.
|
||||||
|
For example, $(3,4,5)$ is a Pythagorean triple.
|
||||||
|
|
||||||
|
If $(a,b,c)$ is a Pythagorean triple,
|
||||||
|
all triples of the form $(ka,kb,kc)$,
where $k>1$ is an integer, are also Pythagorean triples.
|
||||||
|
A Pythagorean triple is \emph{primitive} if
|
||||||
|
$a$, $b$ and $c$ are coprime,
|
||||||
|
and all Pythagorean triples can be constructed
|
||||||
|
from primitive triples using a multiplier $k$.
|
||||||
|
|
||||||
|
\key{Euclid's formula} can be used to produce
|
||||||
|
all primitive Pythagorean triples.
|
||||||
|
Each such triple is of the form
|
||||||
|
\[(n^2-m^2,2nm,n^2+m^2),\]
|
||||||
|
where $0<m<n$, $n$ and $m$ are coprime
|
||||||
|
and at least one of $n$ and $m$ is even.
|
||||||
|
For example, when $m=1$ and $n=2$, the formula
|
||||||
|
produces the smallest Pythagorean triple
|
||||||
|
\[(2^2-1^2,2\cdot2\cdot1,2^2+1^2)=(3,4,5).\]
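
As a sketch, the formula can be used to generate all primitive triples
whose hypotenuse $n^2+m^2$ is at most a given limit
(the names are only examples):
\begin{lstlisting}
// a sketch: prints all primitive triples with n*n+m*m <= limit
int gcd(int a, int b) {return b ? gcd(b, a%b) : a;}
void primitiveTriples(int limit) {
    for (int n = 2; n*n+1 <= limit; n++) {
        for (int m = 1; m < n; m++) {
            if (n*n+m*m > limit) break;
            if (gcd(n, m) != 1 || n%2 == m%2) continue;
            cout << n*n-m*m << " " << 2*n*m << " " << n*n+m*m << "\n";
        }
    }
}
\end{lstlisting}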
|
||||||
|
|
||||||
|
\subsubsection{Wilson's theorem}
|
||||||
|
|
||||||
|
\index{Wilson's theorem}
|
||||||
|
|
||||||
|
\key{Wilson's theorem}
|
||||||
|
%\footnote{J. Wilson (1741--1793) was an English mathematician.}
|
||||||
|
states that a number $n$
|
||||||
|
is prime exactly when
|
||||||
|
\[(n-1)! \bmod n = n-1.\]
|
||||||
|
For example, the number 11 is prime, because
|
||||||
|
\[10! \bmod 11 = 10,\]
|
||||||
|
and the number 12 is not prime, because
|
||||||
|
\[11! \bmod 12 = 0 \neq 11.\]
|
||||||
|
|
||||||
|
Hence, Wilson's theorem can be used to find out
|
||||||
|
whether a number is prime. However, in practice, the theorem cannot be
|
||||||
|
applied to large values of $n$, because it is difficult
|
||||||
|
to calculate values of $(n-1)!$ when $n$ is large.
|
||||||
|
|
||||||
|
\chapter{Combinatorics}
|
||||||
|
|
||||||
|
\index{combinatorics}
|
||||||
|
|
||||||
|
\key{Combinatorics} studies methods for counting
|
||||||
|
combinations of objects.
|
||||||
|
Usually, the goal is to find a way to
|
||||||
|
count the combinations efficiently
|
||||||
|
without generating each combination separately.
|
||||||
|
|
||||||
|
As an example, consider the problem
|
||||||
|
of counting the number of ways to
|
||||||
|
represent an integer $n$ as a sum of positive integers.
|
||||||
|
For example, there are 8 representations
|
||||||
|
for $4$:
|
||||||
|
\begin{multicols}{2}
|
||||||
|
\begin{itemize}
|
||||||
|
\item $1+1+1+1$
|
||||||
|
\item $1+1+2$
|
||||||
|
\item $1+2+1$
|
||||||
|
\item $2+1+1$
|
||||||
|
\item $2+2$
|
||||||
|
\item $3+1$
|
||||||
|
\item $1+3$
|
||||||
|
\item $4$
|
||||||
|
\end{itemize}
|
||||||
|
\end{multicols}
|
||||||
|
|
||||||
|
A combinatorial problem can often be solved
|
||||||
|
using a recursive function.
|
||||||
|
In this problem, we can define a function $f(n)$
|
||||||
|
that gives the number of representations for $n$.
|
||||||
|
For example, $f(4)=8$ according to the above example.
|
||||||
|
The values of the function
|
||||||
|
can be recursively calculated as follows:
|
||||||
|
\begin{equation*}
|
||||||
|
f(n) = \begin{cases}
|
||||||
|
1 & n = 0\\
|
||||||
|
f(0)+f(1)+\cdots+f(n-1) & n > 0\\
|
||||||
|
\end{cases}
|
||||||
|
\end{equation*}
|
||||||
|
The base case is $f(0)=1$,
|
||||||
|
because the empty sum represents the number 0.
|
||||||
|
Then, if $n>0$, we consider all ways to
|
||||||
|
choose the first number of the sum.
|
||||||
|
If the first number is $k$,
|
||||||
|
there are $f(n-k)$ representations
|
||||||
|
for the remaining part of the sum.
|
||||||
|
Thus, we calculate the sum of all values
|
||||||
|
of the form $f(n-k)$, where $1 \le k \le n$.
|
||||||
|
|
||||||
|
The first values for the function are:
|
||||||
|
\[
|
||||||
|
\begin{array}{lcl}
|
||||||
|
f(0) & = & 1 \\
|
||||||
|
f(1) & = & 1 \\
|
||||||
|
f(2) & = & 2 \\
|
||||||
|
f(3) & = & 4 \\
|
||||||
|
f(4) & = & 8 \\
|
||||||
|
\end{array}
|
||||||
|
\]
|
||||||
|
|
||||||
|
Sometimes, a recursive formula can be replaced
|
||||||
|
with a closed-form formula.
|
||||||
|
In this problem,
|
||||||
|
\[
|
||||||
|
f(n)=2^{n-1},
|
||||||
|
\]
|
||||||
|
which is based on the fact that there are $n-1$
|
||||||
|
possible positions for +-signs in the sum
|
||||||
|
and we can choose any subset of them.
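
As a sketch, the recursive calculation can be written as follows
(the function name is only an example); for $n \ge 1$ the result
equals $2^{n-1}$:
\begin{lstlisting}
// a sketch: counts the representations of n as an ordered sum
long long countSums(int n) {
    vector<long long> f(n+1, 0);
    f[0] = 1;
    for (int i = 1; i <= n; i++) {
        for (int j = 0; j < i; j++) f[i] += f[j];
    }
    return f[n];
}
\end{lstlisting}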
|
||||||
|
|
||||||
|
\section{Binomial coefficients}
|
||||||
|
|
||||||
|
\index{binomial coefficient}
|
||||||
|
|
||||||
|
The \key{binomial coefficient} ${n \choose k}$
|
||||||
|
equals the number of ways we can choose a subset
|
||||||
|
of $k$ elements from a set of $n$ elements.
|
||||||
|
For example, ${5 \choose 3}=10$,
|
||||||
|
because the set $\{1,2,3,4,5\}$
|
||||||
|
has 10 subsets of 3 elements:
|
||||||
|
\[ \{1,2,3\}, \{1,2,4\}, \{1,2,5\}, \{1,3,4\}, \{1,3,5\},
|
||||||
|
\{1,4,5\}, \{2,3,4\}, \{2,3,5\}, \{2,4,5\}, \{3,4,5\} \]
|
||||||
|
|
||||||
|
\subsubsection{Formula 1}
|
||||||
|
|
||||||
|
Binomial coefficients can be
|
||||||
|
recursively calculated as follows:
|
||||||
|
|
||||||
|
\[
|
||||||
|
{n \choose k} = {n-1 \choose k-1} + {n-1 \choose k}
|
||||||
|
\]
|
||||||
|
|
||||||
|
The idea is to fix an element $x$ in the set.
|
||||||
|
If $x$ is included in the subset,
|
||||||
|
we have to choose $k-1$
|
||||||
|
elements from $n-1$ elements,
|
||||||
|
and if $x$ is not included in the subset,
|
||||||
|
we have to choose $k$ elements from $n-1$ elements.
|
||||||
|
|
||||||
|
The base cases for the recursion are
|
||||||
|
\[
|
||||||
|
{n \choose 0} = {n \choose n} = 1,
|
||||||
|
\]
|
||||||
|
because there is always exactly
|
||||||
|
one way to construct an empty subset
|
||||||
|
and a subset that contains all the elements.
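
The recursive formula directly gives a dynamic programming sketch that
tabulates the binomial coefficients (the array name and the limit are only
examples; since the values grow quickly, they are often computed modulo
some number instead):
\begin{lstlisting}
// a sketch: tabulates binomial coefficients with the recursive formula
// (the values fit in long long roughly up to n = 60)
const int N = 60;
long long C[N+1][N+1];
void binomial() {
    for (int n = 0; n <= N; n++) {
        C[n][0] = C[n][n] = 1;
        for (int k = 1; k < n; k++)
            C[n][k] = C[n-1][k-1]+C[n-1][k];
    }
}
\end{lstlisting}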
|
||||||
|
|
||||||
|
\subsubsection{Formula 2}
|
||||||
|
|
||||||
|
Another way to calculate binomial coefficients is as follows:
|
||||||
|
\[
|
||||||
|
{n \choose k} = \frac{n!}{k!(n-k)!}.
|
||||||
|
\]
|
||||||
|
|
||||||
|
There are $n!$ permutations of $n$ elements.
|
||||||
|
We go through all permutations and always
|
||||||
|
include the first $k$ elements of the permutation
|
||||||
|
in the subset.
|
||||||
|
Since the order of the elements in the subset
|
||||||
|
and outside the subset does not matter,
|
||||||
|
the result is divided by $k!$ and $(n-k)!$.
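
The factorials in the formula grow very quickly, so a direct evaluation
easily overflows. One possible sketch is to multiply and divide alternately,
so that every intermediate result is itself a binomial coefficient and thus
an integer (the function name is only an example):
\begin{lstlisting}
// a sketch: computes {n choose k} by multiplying and dividing step by
// step; after i steps the result equals {n-k+i choose i}
long long binom(long long n, long long k) {
    long long result = 1;
    for (long long i = 1; i <= k; i++) {
        result = result*(n-k+i)/i;
    }
    return result;
}
\end{lstlisting}
For example, \texttt{binom(5,3)} calculates $1 \cdot 3 = 3$,
$3 \cdot 4 / 2 = 6$ and $6 \cdot 5 / 3 = 10$.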
|
||||||
|
|
||||||
|
\subsubsection{Properties}
|
||||||
|
|
||||||
|
For binomial coefficients,
|
||||||
|
\[
|
||||||
|
{n \choose k} = {n \choose n-k},
|
||||||
|
\]
|
||||||
|
because we actually divide a set of $n$ elements into
|
||||||
|
two subsets: the first contains $k$ elements
|
||||||
|
and the second contains $n-k$ elements.
|
||||||
|
|
||||||
|
The sum of binomial coefficients is
|
||||||
|
\[
|
||||||
|
{n \choose 0}+{n \choose 1}+{n \choose 2}+\ldots+{n \choose n}=2^n.
|
||||||
|
\]
|
||||||
|
|
||||||
|
The reason for the name ''binomial coefficient''
|
||||||
|
can be seen when the binomial $(a+b)$ is raised to
|
||||||
|
the $n$th power:
|
||||||
|
|
||||||
|
\[ (a+b)^n =
|
||||||
|
{n \choose 0} a^n b^0 +
|
||||||
|
{n \choose 1} a^{n-1} b^1 +
|
||||||
|
\ldots +
|
||||||
|
{n \choose n-1} a^1 b^{n-1} +
|
||||||
|
{n \choose n} a^0 b^n. \]
|
||||||
|
|
||||||
|
\index{Pascal's triangle}
|
||||||
|
|
||||||
|
Binomial coefficients also appear in
|
||||||
|
\key{Pascal's triangle}
|
||||||
|
where each value equals the sum of two
|
||||||
|
above values:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}{0.9}
|
||||||
|
\node at (0,0) {1};
|
||||||
|
\node at (-0.5,-0.5) {1};
|
||||||
|
\node at (0.5,-0.5) {1};
|
||||||
|
\node at (-1,-1) {1};
|
||||||
|
\node at (0,-1) {2};
|
||||||
|
\node at (1,-1) {1};
|
||||||
|
\node at (-1.5,-1.5) {1};
|
||||||
|
\node at (-0.5,-1.5) {3};
|
||||||
|
\node at (0.5,-1.5) {3};
|
||||||
|
\node at (1.5,-1.5) {1};
|
||||||
|
\node at (-2,-2) {1};
|
||||||
|
\node at (-1,-2) {4};
|
||||||
|
\node at (0,-2) {6};
|
||||||
|
\node at (1,-2) {4};
|
||||||
|
\node at (2,-2) {1};
|
||||||
|
\node at (-2,-2.5) {$\ldots$};
|
||||||
|
\node at (-1,-2.5) {$\ldots$};
|
||||||
|
\node at (0,-2.5) {$\ldots$};
|
||||||
|
\node at (1,-2.5) {$\ldots$};
|
||||||
|
\node at (2,-2.5) {$\ldots$};
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
\subsubsection{Boxes and balls}
|
||||||
|
|
||||||
|
''Boxes and balls'' is a useful model,
|
||||||
|
where we count the ways to
|
||||||
|
place $k$ balls in $n$ boxes.
|
||||||
|
Let us consider three scenarios:
|
||||||
|
|
||||||
|
\textit{Scenario 1}: Each box can contain
|
||||||
|
at most one ball.
|
||||||
|
For example, when $n=5$ and $k=2$,
|
||||||
|
there are 10 solutions:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.5]
|
||||||
|
\newcommand\lax[3]{
|
||||||
|
\path[draw,thick,-] (#1-0.5,#2+0.5) -- (#1-0.5,#2-0.5) --
|
||||||
|
(#1+0.5,#2-0.5) -- (#1+0.5,#2+0.5);
|
||||||
|
\ifthenelse{\equal{#3}{1}}{\draw[fill=black] (#1,#2-0.3) circle (0.15);}{}
|
||||||
|
\ifthenelse{\equal{#3}{2}}{\draw[fill=black] (#1-0.2,#2-0.3) circle (0.15);}{}
|
||||||
|
\ifthenelse{\equal{#3}{2}}{\draw[fill=black] (#1+0.2,#2-0.3) circle (0.15);}{}
|
||||||
|
}
|
||||||
|
\newcommand\laa[7]{
|
||||||
|
\lax{#1}{#2}{#3}
|
||||||
|
\lax{#1+1.2}{#2}{#4}
|
||||||
|
\lax{#1+2.4}{#2}{#5}
|
||||||
|
\lax{#1+3.6}{#2}{#6}
|
||||||
|
\lax{#1+4.8}{#2}{#7}
|
||||||
|
}
|
||||||
|
|
||||||
|
\laa{0}{0}{1}{1}{0}{0}{0}
|
||||||
|
\laa{0}{-2}{1}{0}{1}{0}{0}
|
||||||
|
\laa{0}{-4}{1}{0}{0}{1}{0}
|
||||||
|
\laa{0}{-6}{1}{0}{0}{0}{1}
|
||||||
|
\laa{8}{0}{0}{1}{1}{0}{0}
|
||||||
|
\laa{8}{-2}{0}{1}{0}{1}{0}
|
||||||
|
\laa{8}{-4}{0}{1}{0}{0}{1}
|
||||||
|
\laa{16}{0}{0}{0}{1}{1}{0}
|
||||||
|
\laa{16}{-2}{0}{0}{1}{0}{1}
|
||||||
|
\laa{16}{-4}{0}{0}{0}{1}{1}
|
||||||
|
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
In this scenario, the answer is directly the
|
||||||
|
binomial coefficient ${n \choose k}$.
|
||||||
|
|
||||||
|
\textit{Scenario 2}: A box can contain multiple balls.
|
||||||
|
For example, when $n=5$ and $k=2$,
|
||||||
|
there are 15 solutions:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.5]
|
||||||
|
\newcommand\lax[3]{
|
||||||
|
\path[draw,thick,-] (#1-0.5,#2+0.5) -- (#1-0.5,#2-0.5) --
|
||||||
|
(#1+0.5,#2-0.5) -- (#1+0.5,#2+0.5);
|
||||||
|
\ifthenelse{\equal{#3}{1}}{\draw[fill=black] (#1,#2-0.3) circle (0.15);}{}
|
||||||
|
\ifthenelse{\equal{#3}{2}}{\draw[fill=black] (#1-0.2,#2-0.3) circle (0.15);}{}
|
||||||
|
\ifthenelse{\equal{#3}{2}}{\draw[fill=black] (#1+0.2,#2-0.3) circle (0.15);}{}
|
||||||
|
}
|
||||||
|
\newcommand\laa[7]{
|
||||||
|
\lax{#1}{#2}{#3}
|
||||||
|
\lax{#1+1.2}{#2}{#4}
|
||||||
|
\lax{#1+2.4}{#2}{#5}
|
||||||
|
\lax{#1+3.6}{#2}{#6}
|
||||||
|
\lax{#1+4.8}{#2}{#7}
|
||||||
|
}
|
||||||
|
|
||||||
|
\laa{0}{0}{2}{0}{0}{0}{0}
|
||||||
|
\laa{0}{-2}{1}{1}{0}{0}{0}
|
||||||
|
\laa{0}{-4}{1}{0}{1}{0}{0}
|
||||||
|
\laa{0}{-6}{1}{0}{0}{1}{0}
|
||||||
|
\laa{0}{-8}{1}{0}{0}{0}{1}
|
||||||
|
\laa{8}{0}{0}{2}{0}{0}{0}
|
||||||
|
\laa{8}{-2}{0}{1}{1}{0}{0}
|
||||||
|
\laa{8}{-4}{0}{1}{0}{1}{0}
|
||||||
|
\laa{8}{-6}{0}{1}{0}{0}{1}
|
||||||
|
\laa{8}{-8}{0}{0}{2}{0}{0}
|
||||||
|
\laa{16}{0}{0}{0}{1}{1}{0}
|
||||||
|
\laa{16}{-2}{0}{0}{1}{0}{1}
|
||||||
|
\laa{16}{-4}{0}{0}{0}{2}{0}
|
||||||
|
\laa{16}{-6}{0}{0}{0}{1}{1}
|
||||||
|
\laa{16}{-8}{0}{0}{0}{0}{2}
|
||||||
|
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
The process of placing the balls in the boxes
|
||||||
|
can be represented as a string
|
||||||
|
that consists of symbols
|
||||||
|
''o'' and ''$\rightarrow$''.
|
||||||
|
Initially, assume that we are standing at the leftmost box.
|
||||||
|
The symbol ''o'' means that we place a ball
|
||||||
|
in the current box, and the symbol
|
||||||
|
''$\rightarrow$'' means that we move to
|
||||||
|
the next box to the right.
|
||||||
|
|
||||||
|
Using this notation, each solution is a string
|
||||||
|
that contains $k$ times the symbol ''o'' and
|
||||||
|
$n-1$ times the symbol ''$\rightarrow$''.
|
||||||
|
For example, the upper-right solution
|
||||||
|
in the above picture corresponds to the string
|
||||||
|
''$\rightarrow$ $\rightarrow$ o $\rightarrow$ o $\rightarrow$''.
|
||||||
|
Thus, the number of solutions is
|
||||||
|
${k+n-1 \choose k}$.
|
||||||
|
|
||||||
|
\textit{Scenario 3}: Each box may contain at most one ball,
|
||||||
|
and in addition, no two adjacent boxes may both contain a ball.
|
||||||
|
For example, when $n=5$ and $k=2$,
|
||||||
|
there are 6 solutions:
|
||||||
|
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.5]
|
||||||
|
\newcommand\lax[3]{
|
||||||
|
\path[draw,thick,-] (#1-0.5,#2+0.5) -- (#1-0.5,#2-0.5) --
|
||||||
|
(#1+0.5,#2-0.5) -- (#1+0.5,#2+0.5);
|
||||||
|
\ifthenelse{\equal{#3}{1}}{\draw[fill=black] (#1,#2-0.3) circle (0.15);}{}
|
||||||
|
\ifthenelse{\equal{#3}{2}}{\draw[fill=black] (#1-0.2,#2-0.3) circle (0.15);}{}
|
||||||
|
\ifthenelse{\equal{#3}{2}}{\draw[fill=black] (#1+0.2,#2-0.3) circle (0.15);}{}
|
||||||
|
}
|
||||||
|
\newcommand\laa[7]{
|
||||||
|
\lax{#1}{#2}{#3}
|
||||||
|
\lax{#1+1.2}{#2}{#4}
|
||||||
|
\lax{#1+2.4}{#2}{#5}
|
||||||
|
\lax{#1+3.6}{#2}{#6}
|
||||||
|
\lax{#1+4.8}{#2}{#7}
|
||||||
|
}
|
||||||
|
|
||||||
|
\laa{0}{0}{1}{0}{1}{0}{0}
|
||||||
|
\laa{0}{-2}{1}{0}{0}{1}{0}
|
||||||
|
\laa{8}{0}{1}{0}{0}{0}{1}
|
||||||
|
\laa{8}{-2}{0}{1}{0}{1}{0}
|
||||||
|
\laa{16}{0}{0}{1}{0}{0}{1}
|
||||||
|
\laa{16}{-2}{0}{0}{1}{0}{1}
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
In this scenario, we can assume that
|
||||||
|
$k$ balls are initially placed in boxes
|
||||||
|
and there is an empty box between each
|
||||||
|
two adjacent boxes.
|
||||||
|
The remaining task is to choose the
|
||||||
|
positions for the remaining empty boxes.
|
||||||
|
There are $n-2k+1$ such boxes and
|
||||||
|
$k+1$ positions for them.
|
||||||
|
Thus, using the formula of scenario 2,
|
||||||
|
the number of solutions is
|
||||||
|
${n-k+1 \choose n-2k+1}$.
|
||||||
|
|
||||||
|
\subsubsection{Multinomial coefficients}
|
||||||
|
|
||||||
|
\index{multinomial coefficient}
|
||||||
|
|
||||||
|
The \key{multinomial coefficient}
|
||||||
|
\[ {n \choose k_1,k_2,\ldots,k_m} = \frac{n!}{k_1! k_2! \cdots k_m!}, \]
|
||||||
|
equals the number of ways
|
||||||
|
we can divide $n$ elements into subsets
|
||||||
|
of sizes $k_1,k_2,\ldots,k_m$,
|
||||||
|
where $k_1+k_2+\cdots+k_m=n$.
|
||||||
|
Multinomial coefficients can be seen as a
|
||||||
|
generalization of binomial coefficients;
|
||||||
|
if $m=2$, the above formula
|
||||||
|
corresponds to the binomial coefficient formula.
|
||||||
|
|
||||||
|
\section{Catalan numbers}
|
||||||
|
|
||||||
|
\index{Catalan number}
|
||||||
|
|
||||||
|
The \key{Catalan number}
|
||||||
|
%\footnote{E. C. Catalan (1814--1894) was a Belgian mathematician.}
|
||||||
|
$C_n$ equals the
|
||||||
|
number of valid
|
||||||
|
parenthesis expressions that consist of
|
||||||
|
$n$ left parentheses and $n$ right parentheses.
|
||||||
|
|
||||||
|
For example, $C_3=5$, because
|
||||||
|
we can construct the following parenthesis
|
||||||
|
expressions using three
|
||||||
|
left and right parentheses:
|
||||||
|
|
||||||
|
\begin{itemize}[noitemsep]
|
||||||
|
\item \texttt{()()()}
|
||||||
|
\item \texttt{(())()}
|
||||||
|
\item \texttt{()(())}
|
||||||
|
\item \texttt{((()))}
|
||||||
|
\item \texttt{(()())}
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
\subsubsection{Parenthesis expressions}
|
||||||
|
|
||||||
|
\index{parenthesis expression}
|
||||||
|
|
||||||
|
What exactly is a \emph{valid parenthesis expression}?
|
||||||
|
The following rules precisely define all
|
||||||
|
valid parenthesis expressions:
|
||||||
|
|
||||||
|
\begin{itemize}
|
||||||
|
\item An empty parenthesis expression is valid.
|
||||||
|
\item If an expression $A$ is valid,
|
||||||
|
then also the expression
|
||||||
|
\texttt{(}$A$\texttt{)} is valid.
|
||||||
|
\item If expressions $A$ and $B$ are valid,
|
||||||
|
then also the expression $AB$ is valid.
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
Another way to characterize valid
|
||||||
|
parenthesis expressions is that if
|
||||||
|
we choose any prefix of such an expression,
|
||||||
|
it has to contain at least as many left
|
||||||
|
parentheses as right parentheses.
|
||||||
|
In addition, the complete expression has to
|
||||||
|
contain an equal number of left and right
|
||||||
|
parentheses.
|
||||||
|
|
||||||
|
\subsubsection{Formula 1}
|
||||||
|
|
||||||
|
Catalan numbers can be calculated using the formula
|
||||||
|
\[ C_n = \sum_{i=0}^{n-1} C_{i} C_{n-i-1}.\]
|
||||||
|
|
||||||
|
The sum goes through the ways to divide the
|
||||||
|
expression into two parts
|
||||||
|
such that both parts are valid
|
||||||
|
expressions and the first part is as short as possible
|
||||||
|
but not empty.
|
||||||
|
For any $i$, the first part contains $i+1$ pairs
|
||||||
|
of parentheses and the number of expressions
|
||||||
|
is the product of the following values:
|
||||||
|
|
||||||
|
\begin{itemize}
|
||||||
|
\item $C_{i}$: the number of ways to construct an expression
|
||||||
|
using the parentheses of the first part,
|
||||||
|
not counting the outermost parentheses
|
||||||
|
\item $C_{n-i-1}$: the number of ways to construct an
|
||||||
|
expression using the parentheses of the second part
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
The base case is $C_0=1$,
|
||||||
|
because we can construct an empty parenthesis
|
||||||
|
expression using zero pairs of parentheses.
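
A direct sketch of this recursion is the following (the function name is
only an example; in practice the values are usually computed modulo some
number):
\begin{lstlisting}
// a sketch: computes the Catalan numbers C[0..n] with the recursion
vector<long long> catalan(int n) {
    vector<long long> C(n+1, 0);
    C[0] = 1;
    for (int i = 1; i <= n; i++) {
        for (int j = 0; j < i; j++) C[i] += C[j]*C[i-j-1];
    }
    return C;
}
\end{lstlisting}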
|
||||||
|
|
||||||
|
\subsubsection{Formula 2}
|
||||||
|
|
||||||
|
Catalan numbers can also be calculated
|
||||||
|
using binomial coefficients:
|
||||||
|
\[ C_n = \frac{1}{n+1} {2n \choose n}\]
|
||||||
|
The formula can be explained as follows:
|
||||||
|
|
||||||
|
There are a total of ${2n \choose n}$ ways
|
||||||
|
to construct a (not necessarily valid)
|
||||||
|
parenthesis expression that contains $n$ left
|
||||||
|
parentheses and $n$ right parentheses.
|
||||||
|
Let us calculate the number of such
|
||||||
|
expressions that are \emph{not} valid.
|
||||||
|
|
||||||
|
If a parenthesis expression is not valid,
|
||||||
|
it has to contain a prefix where the
|
||||||
|
number of right parentheses exceeds the
|
||||||
|
number of left parentheses.
|
||||||
|
The idea is to reverse each parenthesis
|
||||||
|
that belongs to such a prefix.
|
||||||
|
For example, the expression
|
||||||
|
\texttt{())()(} contains a prefix \texttt{())},
|
||||||
|
and after reversing the prefix,
|
||||||
|
the expression becomes \texttt{)((()(}.
|
||||||
|
|
||||||
|
The resulting expression consists of $n+1$
|
||||||
|
left parentheses and $n-1$ right parentheses.
|
||||||
|
The number of such expressions is ${2n \choose n+1}$,
|
||||||
|
which equals the number of non-valid
|
||||||
|
parenthesis expressions.
|
||||||
|
Thus, the number of valid parenthesis
|
||||||
|
expressions can be calculated using the formula
|
||||||
|
\[{2n \choose n}-{2n \choose n+1} = {2n \choose n} - \frac{n}{n+1} {2n \choose n} = \frac{1}{n+1} {2n \choose n}.\]
|
||||||
|
|
||||||
|
\subsubsection{Counting trees}
|
||||||
|
|
||||||
|
Catalan numbers are also related to trees:
|
||||||
|
|
||||||
|
\begin{itemize}
|
||||||
|
\item there are $C_n$ binary trees of $n$ nodes
|
||||||
|
\item there are $C_{n-1}$ rooted trees of $n$ nodes
|
||||||
|
\end{itemize}
|
||||||
|
\noindent
|
||||||
|
For example, for $C_3=5$, the binary trees are
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\path[draw,thick,-] (0,0) -- (-1,-1);
|
||||||
|
\path[draw,thick,-] (0,0) -- (1,-1);
|
||||||
|
\draw[fill=white] (0,0) circle (0.3);
|
||||||
|
\draw[fill=white] (-1,-1) circle (0.3);
|
||||||
|
\draw[fill=white] (1,-1) circle (0.3);
|
||||||
|
|
||||||
|
\path[draw,thick,-] (4,0) -- (4-0.75,-1) -- (4-1.5,-2);
|
||||||
|
\draw[fill=white] (4,0) circle (0.3);
|
||||||
|
\draw[fill=white] (4-0.75,-1) circle (0.3);
|
||||||
|
\draw[fill=white] (4-1.5,-2) circle (0.3);
|
||||||
|
|
||||||
|
\path[draw,thick,-] (6.5,0) -- (6.5-0.75,-1) -- (6.5-0,-2);
|
||||||
|
\draw[fill=white] (6.5,0) circle (0.3);
|
||||||
|
\draw[fill=white] (6.5-0.75,-1) circle (0.3);
|
||||||
|
\draw[fill=white] (6.5-0,-2) circle (0.3);
|
||||||
|
|
||||||
|
\path[draw,thick,-] (9,0) -- (9+0.75,-1) -- (9-0,-2);
|
||||||
|
\draw[fill=white] (9,0) circle (0.3);
|
||||||
|
\draw[fill=white] (9+0.75,-1) circle (0.3);
|
||||||
|
\draw[fill=white] (9-0,-2) circle (0.3);
|
||||||
|
|
||||||
|
\path[draw,thick,-] (11.5,0) -- (11.5+0.75,-1) -- (11.5+1.5,-2);
|
||||||
|
\draw[fill=white] (11.5,0) circle (0.3);
|
||||||
|
\draw[fill=white] (11.5+0.75,-1) circle (0.3);
|
||||||
|
\draw[fill=white] (11.5+1.5,-2) circle (0.3);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
and the rooted trees are
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\path[draw,thick,-] (0,0) -- (-1,-1);
|
||||||
|
\path[draw,thick,-] (0,0) -- (0,-1);
|
||||||
|
\path[draw,thick,-] (0,0) -- (1,-1);
|
||||||
|
\draw[fill=white] (0,0) circle (0.3);
|
||||||
|
\draw[fill=white] (-1,-1) circle (0.3);
|
||||||
|
\draw[fill=white] (0,-1) circle (0.3);
|
||||||
|
\draw[fill=white] (1,-1) circle (0.3);
|
||||||
|
|
||||||
|
\path[draw,thick,-] (3,0) -- (3,-1) -- (3,-2) -- (3,-3);
|
||||||
|
\draw[fill=white] (3,0) circle (0.3);
|
||||||
|
\draw[fill=white] (3,-1) circle (0.3);
|
||||||
|
\draw[fill=white] (3,-2) circle (0.3);
|
||||||
|
\draw[fill=white] (3,-3) circle (0.3);
|
||||||
|
|
||||||
|
\path[draw,thick,-] (6+0,0) -- (6-1,-1);
|
||||||
|
\path[draw,thick,-] (6+0,0) -- (6+1,-1) -- (6+1,-2);
|
||||||
|
\draw[fill=white] (6+0,0) circle (0.3);
|
||||||
|
\draw[fill=white] (6-1,-1) circle (0.3);
|
||||||
|
\draw[fill=white] (6+1,-1) circle (0.3);
|
||||||
|
\draw[fill=white] (6+1,-2) circle (0.3);
|
||||||
|
|
||||||
|
\path[draw,thick,-] (9+0,0) -- (9+1,-1);
|
||||||
|
\path[draw,thick,-] (9+0,0) -- (9-1,-1) -- (9-1,-2);
|
||||||
|
\draw[fill=white] (9+0,0) circle (0.3);
|
||||||
|
\draw[fill=white] (9+1,-1) circle (0.3);
|
||||||
|
\draw[fill=white] (9-1,-1) circle (0.3);
|
||||||
|
\draw[fill=white] (9-1,-2) circle (0.3);
|
||||||
|
|
||||||
|
\path[draw,thick,-] (12+0,0) -- (12+0,-1) -- (12-1,-2);
|
||||||
|
\path[draw,thick,-] (12+0,0) -- (12+0,-1) -- (12+1,-2);
|
||||||
|
\draw[fill=white] (12+0,0) circle (0.3);
|
||||||
|
\draw[fill=white] (12+0,-1) circle (0.3);
|
||||||
|
\draw[fill=white] (12-1,-2) circle (0.3);
|
||||||
|
\draw[fill=white] (12+1,-2) circle (0.3);
|
||||||
|
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
\section{Inclusion-exclusion}
|
||||||
|
|
||||||
|
\index{inclusion-exclusion}
|
||||||
|
|
||||||
|
\key{Inclusion-exclusion} is a technique
|
||||||
|
that can be used for counting the size
|
||||||
|
of a union of sets when the sizes of
|
||||||
|
the intersections are known, and vice versa.
|
||||||
|
A simple example of the technique is the formula
|
||||||
|
\[ |A \cup B| = |A| + |B| - |A \cap B|,\]
|
||||||
|
where $A$ and $B$ are sets and $|X|$
|
||||||
|
denotes the size of $X$.
|
||||||
|
The formula can be illustrated as follows:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.8]
|
||||||
|
|
||||||
|
\draw (0,0) circle (1.5);
|
||||||
|
\draw (1.5,0) circle (1.5);
|
||||||
|
|
||||||
|
\node at (-0.75,0) {\small $A$};
|
||||||
|
\node at (2.25,0) {\small $B$};
|
||||||
|
\node at (0.75,0) {\small $A \cap B$};
|
||||||
|
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Our goal is to calculate
|
||||||
|
the size of the union $A \cup B$
|
||||||
|
that corresponds to the area of the region
|
||||||
|
that belongs to at least one circle.
|
||||||
|
The picture shows that we can calculate
|
||||||
|
the area of $A \cup B$ by first summing the
|
||||||
|
areas of $A$ and $B$ and then subtracting
|
||||||
|
the area of $A \cap B$.
|
||||||
|
|
||||||
|
The same idea can be applied when the number
|
||||||
|
of sets is larger.
|
||||||
|
When there are three sets, the inclusion-exclusion formula is
|
||||||
|
\[ |A \cup B \cup C| = |A| + |B| + |C| - |A \cap B| - |A \cap C| - |B \cap C| + |A \cap B \cap C| \]
|
||||||
|
and the corresponding picture is
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.8]
|
||||||
|
|
||||||
|
\draw (0,0) circle (1.75);
|
||||||
|
\draw (2,0) circle (1.75);
|
||||||
|
\draw (1,1.5) circle (1.75);
|
||||||
|
|
||||||
|
\node at (-0.75,-0.25) {\small $A$};
|
||||||
|
\node at (2.75,-0.25) {\small $B$};
|
||||||
|
\node at (1,2.5) {\small $C$};
|
||||||
|
\node at (1,-0.5) {\small $A \cap B$};
|
||||||
|
\node at (0,1.25) {\small $A \cap C$};
|
||||||
|
\node at (2,1.25) {\small $B \cap C$};
|
||||||
|
\node at (1,0.5) {\scriptsize $A \cap B \cap C$};
|
||||||
|
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
In the general case, the size of the
|
||||||
|
union $X_1 \cup X_2 \cup \cdots \cup X_n$
|
||||||
|
can be calculated by going through all possible
|
||||||
|
intersections that contain some of the sets $X_1,X_2,\ldots,X_n$.
|
||||||
|
If the intersection contains an odd number of sets,
|
||||||
|
its size is added to the answer,
|
||||||
|
and otherwise its size is subtracted from the answer.
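
The general formula can be written as a short sketch using bit masks,
assuming that the size of the intersection corresponding to every non-empty
subset \texttt{s} of the sets is available in \texttt{inter[s]}
(the names are only examples):
\begin{lstlisting}
// a sketch of the general formula: inter[s] is the size of the
// intersection of the sets chosen by the bit mask s (s != 0)
long long unionSize(int n, vector<long long>& inter) {
    long long total = 0;
    for (int s = 1; s < (1 << n); s++) {
        if (__builtin_popcount(s)%2 == 1) total += inter[s];
        else total -= inter[s];
    }
    return total;
}
\end{lstlisting}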
|
||||||
|
|
||||||
|
Note that there are similar formulas
|
||||||
|
for calculating
|
||||||
|
the size of an intersection from the sizes of
|
||||||
|
unions. For example,
|
||||||
|
\[ |A \cap B| = |A| + |B| - |A \cup B|\]
|
||||||
|
and
|
||||||
|
\[ |A \cap B \cap C| = |A| + |B| + |C| - |A \cup B| - |A \cup C| - |B \cup C| + |A \cup B \cup C| .\]
|
||||||
|
|
||||||
|
\subsubsection{Derangements}
|
||||||
|
|
||||||
|
\index{derangement}
|
||||||
|
|
||||||
|
As an example, let us count the number of \key{derangements}
|
||||||
|
of elements $\{1,2,\ldots,n\}$, i.e., permutations
|
||||||
|
where no element remains in its original place.
|
||||||
|
For example, when $n=3$, there are
|
||||||
|
two derangements: $(2,3,1)$ and $(3,1,2)$.
|
||||||
|
|
||||||
|
One approach for solving the problem is to use
|
||||||
|
inclusion-exclusion.
|
||||||
|
Let $X_k$ be the set of permutations
|
||||||
|
that contain the element $k$ at position $k$.
|
||||||
|
For example, when $n=3$, the sets are as follows:
|
||||||
|
\[
|
||||||
|
\begin{array}{lcl}
|
||||||
|
X_1 & = & \{(1,2,3),(1,3,2)\} \\
|
||||||
|
X_2 & = & \{(1,2,3),(3,2,1)\} \\
|
||||||
|
X_3 & = & \{(1,2,3),(2,1,3)\} \\
|
||||||
|
\end{array}
|
||||||
|
\]
|
||||||
|
Using these sets, the number of derangements equals
|
||||||
|
\[ n! - |X_1 \cup X_2 \cup \cdots \cup X_n|, \]
|
||||||
|
so it suffices to calculate the size of the union.
|
||||||
|
Using inclusion-exclusion, this reduces to
|
||||||
|
calculating sizes of intersections which can be
|
||||||
|
done efficiently.
|
||||||
|
For example, when $n=3$, the size of
|
||||||
|
$|X_1 \cup X_2 \cup X_3|$ is
|
||||||
|
\[
|
||||||
|
\begin{array}{lcl}
|
||||||
|
& & |X_1| + |X_2| + |X_3| - |X_1 \cap X_2| - |X_1 \cap X_3| - |X_2 \cap X_3| + |X_1 \cap X_2 \cap X_3| \\
|
||||||
|
& = & 2+2+2-1-1-1+1 \\
|
||||||
|
& = & 4, \\
|
||||||
|
\end{array}
|
||||||
|
\]
|
||||||
|
so the number of solutions is $3!-4=2$.
|
||||||
|
|
||||||
|
It turns out that the problem can also be solved
|
||||||
|
without using inclusion-exclusion.
|
||||||
|
Let $f(n)$ denote the number of derangements
|
||||||
|
for $\{1,2,\ldots,n\}$. We can use the following
|
||||||
|
recursive formula:
|
||||||
|
|
||||||
|
\begin{equation*}
|
||||||
|
f(n) = \begin{cases}
|
||||||
|
0 & n = 1\\
|
||||||
|
1 & n = 2\\
|
||||||
|
(n-1)(f(n-2) + f(n-1)) & n>2 \\
|
||||||
|
\end{cases}
|
||||||
|
\end{equation*}
|
||||||
|
|
||||||
|
The formula can be derived by considering
|
||||||
|
the possibilities for how the element 1 changes
|
||||||
|
in the derangement.
|
||||||
|
There are $n-1$ ways to choose an element $x$
|
||||||
|
that replaces the element 1.
|
||||||
|
In each such choice, there are two options:
|
||||||
|
|
||||||
|
\textit{Option 1:} We also replace the element $x$
|
||||||
|
with the element 1.
|
||||||
|
After this, the remaining task is to construct
|
||||||
|
a derangement of $n-2$ elements.
|
||||||
|
|
||||||
|
\textit{Option 2:} We replace the element $x$
|
||||||
|
with some other element than 1.
|
||||||
|
Now we have to construct a derangement
|
||||||
|
of $n-1$ elements, because we cannot replace
|
||||||
|
the element $x$ with the element $1$, and all other
|
||||||
|
elements must be changed.
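
A sketch of this recursion (the function name is only an example; for large
$n$ the values are computed modulo some number):
\begin{lstlisting}
// a sketch of the recursion: counts the derangements of n >= 1 elements
long long derangements(int n) {
    vector<long long> f(n+1, 0);
    if (n >= 2) f[2] = 1;
    for (int i = 3; i <= n; i++) f[i] = (i-1)*(f[i-2]+f[i-1]);
    return f[n];
}
\end{lstlisting}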
|
||||||
|
|
||||||
|
\section{Burnside's lemma}
|
||||||
|
|
||||||
|
\index{Burnside's lemma}
|
||||||
|
|
||||||
|
\key{Burnside's lemma}
|
||||||
|
%\footnote{Actually, Burnside did not discover this lemma; he only mentioned it in his book \cite{bur97}.}
|
||||||
|
can be used to count
|
||||||
|
the number of combinations so that
|
||||||
|
only one representative is counted
|
||||||
|
for each group of symmetric combinations.
|
||||||
|
Burnside's lemma states that the number of
|
||||||
|
combinations is
|
||||||
|
\[\sum_{k=1}^n \frac{c(k)}{n},\]
|
||||||
|
where there are $n$ ways to change the
|
||||||
|
position of a combination,
|
||||||
|
and there are $c(k)$ combinations that
|
||||||
|
remain unchanged when the $k$th way is applied.
|
||||||
|
|
||||||
|
As an example, let us calculate the number of
|
||||||
|
necklaces of $n$ pearls,
|
||||||
|
where each pearl has $m$ possible colors.
|
||||||
|
Two necklaces are symmetric if they are
|
||||||
|
similar after rotating them.
|
||||||
|
For example, the necklace
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\draw[fill=white] (0,0) circle (1);
|
||||||
|
\draw[fill=red] (0,1) circle (0.3);
|
||||||
|
\draw[fill=blue] (1,0) circle (0.3);
|
||||||
|
\draw[fill=red] (0,-1) circle (0.3);
|
||||||
|
\draw[fill=green] (-1,0) circle (0.3);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
has the following symmetric necklaces:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\draw[fill=white] (0,0) circle (1);
|
||||||
|
\draw[fill=red] (0,1) circle (0.3);
|
||||||
|
\draw[fill=blue] (1,0) circle (0.3);
|
||||||
|
\draw[fill=red] (0,-1) circle (0.3);
|
||||||
|
\draw[fill=green] (-1,0) circle (0.3);
|
||||||
|
|
||||||
|
\draw[fill=white] (4,0) circle (1);
|
||||||
|
\draw[fill=green] (4+0,1) circle (0.3);
|
||||||
|
\draw[fill=red] (4+1,0) circle (0.3);
|
||||||
|
\draw[fill=blue] (4+0,-1) circle (0.3);
|
||||||
|
\draw[fill=red] (4+-1,0) circle (0.3);
|
||||||
|
|
||||||
|
\draw[fill=white] (8,0) circle (1);
|
||||||
|
\draw[fill=red] (8+0,1) circle (0.3);
|
||||||
|
\draw[fill=green] (8+1,0) circle (0.3);
|
||||||
|
\draw[fill=red] (8+0,-1) circle (0.3);
|
||||||
|
\draw[fill=blue] (8+-1,0) circle (0.3);
|
||||||
|
|
||||||
|
\draw[fill=white] (12,0) circle (1);
|
||||||
|
\draw[fill=blue] (12+0,1) circle (0.3);
|
||||||
|
\draw[fill=red] (12+1,0) circle (0.3);
|
||||||
|
\draw[fill=green] (12+0,-1) circle (0.3);
|
||||||
|
\draw[fill=red] (12+-1,0) circle (0.3);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
There are $n$ ways to change the position
|
||||||
|
of a necklace,
|
||||||
|
because we can rotate it
|
||||||
|
$0,1,\ldots,n-1$ steps clockwise.
|
||||||
|
If the number of steps is 0,
|
||||||
|
all $m^n$ necklaces remain the same,
|
||||||
|
and if the number of steps is 1,
|
||||||
|
only the $m$ necklaces where each
|
||||||
|
pearl has the same color remain the same.
|
||||||
|
|
||||||
|
More generally, when the number of steps is $k$,
|
||||||
|
a total of
|
||||||
|
\[m^{\textrm{gcd}(k,n)}\]
|
||||||
|
necklaces remain the same,
|
||||||
|
where $\textrm{gcd}(k,n)$ is the greatest common
|
||||||
|
divisor of $k$ and $n$.
|
||||||
|
The reason for this is that blocks
|
||||||
|
of pearls of size $\textrm{gcd}(k,n)$
|
||||||
|
will replace each other.
|
||||||
|
Thus, according to Burnside's lemma,
|
||||||
|
the number of necklaces is
|
||||||
|
\[\sum_{i=0}^{n-1} \frac{m^{\textrm{gcd}(i,n)}}{n}. \]
|
||||||
|
For example, the number of necklaces of length 4
|
||||||
|
with 3 colors is
|
||||||
|
\[\frac{3^4+3+3^2+3}{4} = 24. \]
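
A direct sketch of this calculation follows (the function name is only an
example; overflow is ignored, so $m^n$ is assumed to fit in a
\texttt{long long}):
\begin{lstlisting}
// a sketch of Burnside's lemma for necklaces: n pearls, m colors
long long gcd(long long a, long long b) {return b ? gcd(b, a%b) : a;}
long long necklaces(long long n, long long m) {
    long long sum = 0;
    for (long long i = 0; i < n; i++) {
        long long g = gcd(i, n), power = 1;
        for (long long j = 0; j < g; j++) power *= m; // m^gcd(i,n)
        sum += power;
    }
    return sum/n;
}
\end{lstlisting}
For example, \texttt{necklaces(4,3)} returns 24, in agreement with the
calculation above.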
|
||||||
|
|
||||||
|
\section{Cayley's formula}
|
||||||
|
|
||||||
|
\index{Cayley's formula}
|
||||||
|
|
||||||
|
\key{Cayley's formula}
|
||||||
|
% \footnote{While the formula is named after A. Cayley,
|
||||||
|
% who studied it in 1889, it was discovered earlier by C. W. Borchardt in 1860.}
|
||||||
|
states that
|
||||||
|
there are $n^{n-2}$ labeled trees
|
||||||
|
that contain $n$ nodes.
|
||||||
|
The nodes are labeled $1,2,\ldots,n$,
|
||||||
|
and two trees are different
|
||||||
|
if either their structure or
|
||||||
|
labeling is different.
|
||||||
|
|
||||||
|
\begin{samepage}
|
||||||
|
For example, when $n=4$, the number of labeled
|
||||||
|
trees is $4^{4-2}=16$:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.8]
|
||||||
|
\footnotesize
|
||||||
|
|
||||||
|
\newcommand\puua[6]{
|
||||||
|
\path[draw,thick,-] (#1,#2) -- (#1-1.25,#2-1.5);
|
||||||
|
\path[draw,thick,-] (#1,#2) -- (#1,#2-1.5);
|
||||||
|
\path[draw,thick,-] (#1,#2) -- (#1+1.25,#2-1.5);
|
||||||
|
\node[draw, circle, fill=white] at (#1,#2) {#3};
|
||||||
|
\node[draw, circle, fill=white] at (#1-1.25,#2-1.5) {#4};
|
||||||
|
\node[draw, circle, fill=white] at (#1,#2-1.5) {#5};
|
||||||
|
\node[draw, circle, fill=white] at (#1+1.25,#2-1.5) {#6};
|
||||||
|
}
|
||||||
|
\newcommand\puub[6]{
|
||||||
|
\path[draw,thick,-] (#1,#2) -- (#1+1,#2);
|
||||||
|
\path[draw,thick,-] (#1+1,#2) -- (#1+2,#2);
|
||||||
|
\path[draw,thick,-] (#1+2,#2) -- (#1+3,#2);
|
||||||
|
\node[draw, circle, fill=white] at (#1,#2) {#3};
|
||||||
|
\node[draw, circle, fill=white] at (#1+1,#2) {#4};
|
||||||
|
\node[draw, circle, fill=white] at (#1+2,#2) {#5};
|
||||||
|
\node[draw, circle, fill=white] at (#1+3,#2) {#6};
|
||||||
|
}
|
||||||
|
|
||||||
|
\puua{0}{0}{1}{2}{3}{4}
|
||||||
|
\puua{4}{0}{2}{1}{3}{4}
|
||||||
|
\puua{8}{0}{3}{1}{2}{4}
|
||||||
|
\puua{12}{0}{4}{1}{2}{3}
|
||||||
|
|
||||||
|
\puub{0}{-3}{1}{2}{3}{4}
|
||||||
|
\puub{4.5}{-3}{1}{2}{4}{3}
|
||||||
|
\puub{9}{-3}{1}{3}{2}{4}
|
||||||
|
\puub{0}{-4.5}{1}{3}{4}{2}
|
||||||
|
\puub{4.5}{-4.5}{1}{4}{2}{3}
|
||||||
|
\puub{9}{-4.5}{1}{4}{3}{2}
|
||||||
|
\puub{0}{-6}{2}{1}{3}{4}
|
||||||
|
\puub{4.5}{-6}{2}{1}{4}{3}
|
||||||
|
\puub{9}{-6}{2}{3}{1}{4}
|
||||||
|
\puub{0}{-7.5}{2}{4}{1}{3}
|
||||||
|
\puub{4.5}{-7.5}{3}{1}{2}{4}
|
||||||
|
\puub{9}{-7.5}{3}{2}{1}{4}
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
\end{samepage}
|
||||||
|
|
||||||
|
Next we will see how Cayley's formula can
|
||||||
|
be derived using Prüfer codes.
|
||||||
|
|
||||||
|
\subsubsection{Prüfer code}
|
||||||
|
|
||||||
|
\index{Prüfer code}
|
||||||
|
|
||||||
|
A \key{Prüfer code}
|
||||||
|
%\footnote{In 1918, H. Prüfer proved Cayley's theorem using Prüfer codes \cite{pru18}.}
|
||||||
|
is a sequence of
|
||||||
|
$n-2$ numbers that describes a labeled tree.
|
||||||
|
The code is constructed by following a process
|
||||||
|
that removes $n-2$ leaves from the tree.
|
||||||
|
At each step, the leaf with the smallest label is removed,
|
||||||
|
and the label of its only neighbor is added to the code.
|
||||||
|
|
||||||
|
For example, let us calculate the Prüfer code
|
||||||
|
of the following graph:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (2,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (4,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (2,1) {$3$};
|
||||||
|
\node[draw, circle] (4) at (4,1) {$4$};
|
||||||
|
\node[draw, circle] (5) at (5.5,2) {$5$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
First we remove node 1 and add node 4 to the code:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
%\node[draw, circle] (1) at (2,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (4,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (2,1) {$3$};
|
||||||
|
\node[draw, circle] (4) at (4,1) {$4$};
|
||||||
|
\node[draw, circle] (5) at (5.5,2) {$5$};
|
||||||
|
|
||||||
|
%\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Then we remove node 3 and add node 4 to the code:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
%\node[draw, circle] (1) at (2,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (4,3) {$2$};
|
||||||
|
%\node[draw, circle] (3) at (2,1) {$3$};
|
||||||
|
\node[draw, circle] (4) at (4,1) {$4$};
|
||||||
|
\node[draw, circle] (5) at (5.5,2) {$5$};
|
||||||
|
|
||||||
|
%\path[draw,thick,-] (1) -- (4);
|
||||||
|
%\path[draw,thick,-] (3) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Finally we remove node 4 and add node 2 to the code:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
%\node[draw, circle] (1) at (2,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (4,3) {$2$};
|
||||||
|
%\node[draw, circle] (3) at (2,1) {$3$};
|
||||||
|
%\node[draw, circle] (4) at (4,1) {$4$};
|
||||||
|
\node[draw, circle] (5) at (5.5,2) {$5$};
|
||||||
|
|
||||||
|
%\path[draw,thick,-] (1) -- (4);
|
||||||
|
%\path[draw,thick,-] (3) -- (4);
|
||||||
|
%\path[draw,thick,-] (2) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Thus, the Prüfer code of the graph is $[4,4,2]$.
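
As a sketch, the process can be implemented for a tree given as adjacency
lists over nodes $1,2,\ldots,n$ (the names are only examples; a \texttt{set}
keeps the current leaves so that the leaf with the smallest label is always
removed first):
\begin{lstlisting}
// a sketch: builds the Prufer code of a tree with nodes 1..n (n >= 3)
// given as an adjacency list adj[1..n]
vector<int> prufer(int n, vector<vector<int>>& adj) {
    vector<int> deg(n+1);
    set<int> leaves;
    for (int i = 1; i <= n; i++) {
        deg[i] = adj[i].size();
        if (deg[i] == 1) leaves.insert(i);
    }
    vector<bool> removed(n+1, false);
    vector<int> code;
    for (int step = 0; step < n-2; step++) {
        int leaf = *leaves.begin();
        leaves.erase(leaves.begin());
        removed[leaf] = true;
        for (int next : adj[leaf]) {
            if (removed[next]) continue;
            code.push_back(next);            // the only remaining neighbor
            if (--deg[next] == 1) leaves.insert(next);
        }
    }
    return code;
}
\end{lstlisting}
For the tree above, the function produces the code $[4,4,2]$.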
|
||||||
|
|
||||||
|
We can construct a Prüfer code for any tree,
|
||||||
|
and more importantly,
|
||||||
|
the original tree can be reconstructed
|
||||||
|
from a Prüfer code.
|
||||||
|
Hence, the number of labeled trees
|
||||||
|
of $n$ nodes equals
|
||||||
|
$n^{n-2}$, which is the number of possible Prüfer codes:
a code consists of $n-2$ values, each between $1$ and $n$.
\chapter{Matrices}
|
||||||
|
|
||||||
|
\index{matrix}
|
||||||
|
|
||||||
|
A \key{matrix} is a mathematical concept
|
||||||
|
that corresponds to a two-dimensional array
|
||||||
|
in programming. For example,
|
||||||
|
\[
|
||||||
|
A =
|
||||||
|
\begin{bmatrix}
|
||||||
|
6 & 13 & 7 & 4 \\
|
||||||
|
7 & 0 & 8 & 2 \\
|
||||||
|
9 & 5 & 4 & 18 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
\]
|
||||||
|
is a matrix of size $3 \times 4$, i.e.,
|
||||||
|
it has 3 rows and 4 columns.
|
||||||
|
The notation $[i,j]$ refers to
|
||||||
|
the element in row $i$ and column $j$
|
||||||
|
in a matrix.
|
||||||
|
For example, in the above matrix,
|
||||||
|
$A[2,3]=8$ and $A[3,1]=9$.
|
||||||
|
|
||||||
|
\index{vector}
|
||||||
|
|
||||||
|
A special case of a matrix is a \key{vector}
|
||||||
|
that is a one-dimensional matrix of size $n \times 1$.
|
||||||
|
For example,
|
||||||
|
\[
|
||||||
|
V =
|
||||||
|
\begin{bmatrix}
|
||||||
|
4 \\
|
||||||
|
7 \\
|
||||||
|
5 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
\]
|
||||||
|
is a vector that contains three elements.
|
||||||
|
|
||||||
|
\index{transpose}
|
||||||
|
|
||||||
|
The \key{transpose} $A^T$ of a matrix $A$
|
||||||
|
is obtained when the rows and columns of $A$
|
||||||
|
are swapped, i.e., $A^T[i,j]=A[j,i]$:
|
||||||
|
\[
|
||||||
|
A^T =
|
||||||
|
\begin{bmatrix}
|
||||||
|
6 & 7 & 9 \\
|
||||||
|
13 & 0 & 5 \\
|
||||||
|
7 & 8 & 4 \\
|
||||||
|
4 & 2 & 18 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
\]
|
||||||
|
|
||||||
|
\index{square matrix}
|
||||||
|
|
||||||
|
A matrix is a \key{square matrix} if it
|
||||||
|
has the same number of rows and columns.
|
||||||
|
For example, the following matrix is a
|
||||||
|
square matrix:
|
||||||
|
|
||||||
|
\[
|
||||||
|
S =
|
||||||
|
\begin{bmatrix}
|
||||||
|
3 & 12 & 4 \\
|
||||||
|
5 & 9 & 15 \\
|
||||||
|
0 & 2 & 4 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
\]
|
||||||
|
|
||||||
|
\section{Operations}
|
||||||
|
|
||||||
|
The sum $A+B$ of matrices $A$ and $B$
|
||||||
|
is defined if the matrices are of the same size.
|
||||||
|
The result is a matrix where each element
|
||||||
|
is the sum of the corresponding elements
|
||||||
|
in $A$ and $B$.
|
||||||
|
|
||||||
|
For example,
|
||||||
|
\[
|
||||||
|
\begin{bmatrix}
|
||||||
|
6 & 1 & 4 \\
|
||||||
|
3 & 9 & 2 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
+
|
||||||
|
\begin{bmatrix}
|
||||||
|
4 & 9 & 3 \\
|
||||||
|
8 & 1 & 3 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
=
|
||||||
|
\begin{bmatrix}
|
||||||
|
6+4 & 1+9 & 4+3 \\
|
||||||
|
3+8 & 9+1 & 2+3 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
=
|
||||||
|
\begin{bmatrix}
|
||||||
|
10 & 10 & 7 \\
|
||||||
|
11 & 10 & 5 \\
|
||||||
|
\end{bmatrix}.
|
||||||
|
\]
|
||||||
|
|
||||||
|
Multiplying a matrix $A$ by a value $x$ means
|
||||||
|
that each element of $A$ is multiplied by $x$.
|
||||||
|
For example,
|
||||||
|
\[
|
||||||
|
2 \cdot \begin{bmatrix}
|
||||||
|
6 & 1 & 4 \\
|
||||||
|
3 & 9 & 2 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
=
|
||||||
|
\begin{bmatrix}
|
||||||
|
2 \cdot 6 & 2\cdot1 & 2\cdot4 \\
|
||||||
|
2\cdot3 & 2\cdot9 & 2\cdot2 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
=
|
||||||
|
\begin{bmatrix}
|
||||||
|
12 & 2 & 8 \\
|
||||||
|
6 & 18 & 4 \\
|
||||||
|
\end{bmatrix}.
|
||||||
|
\]
|
||||||
|
|
||||||
|
\subsubsection{Matrix multiplication}
|
||||||
|
|
||||||
|
\index{matrix multiplication}
|
||||||
|
|
||||||
|
The product $AB$ of matrices $A$ and $B$
|
||||||
|
is defined if $A$ is of size $a \times n$
|
||||||
|
and $B$ is of size $n \times b$, i.e.,
|
||||||
|
the width of $A$ equals the height of $B$.
|
||||||
|
The result is a matrix of size $a \times b$
|
||||||
|
whose elements are calculated using the formula
|
||||||
|
\[
|
||||||
|
AB[i,j] = \sum_{k=1}^n A[i,k] \cdot B[k,j].
|
||||||
|
\]
|
||||||
|
|
||||||
|
The idea is that each element of $AB$
|
||||||
|
is a sum of products of elements of $A$ and $B$
|
||||||
|
according to the following picture:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.5]
|
||||||
|
\draw (0,0) grid (4,3);
|
||||||
|
\draw (5,0) grid (10,3);
|
||||||
|
\draw (5,4) grid (10,8);
|
||||||
|
|
||||||
|
\node at (2,-1) {$A$};
|
||||||
|
\node at (7.5,-1) {$AB$};
|
||||||
|
\node at (11,6) {$B$};
|
||||||
|
|
||||||
|
\draw[thick,->,red,line width=2pt] (0,1.5) -- (4,1.5);
|
||||||
|
\draw[thick,->,red,line width=2pt] (6.5,8) -- (6.5,4);
|
||||||
|
\draw[thick,red,line width=2pt] (6.5,1.5) circle (0.4);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
For example,
|
||||||
|
|
||||||
|
\[
|
||||||
|
\begin{bmatrix}
|
||||||
|
1 & 4 \\
|
||||||
|
3 & 9 \\
|
||||||
|
8 & 6 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
\cdot
|
||||||
|
\begin{bmatrix}
|
||||||
|
1 & 6 \\
|
||||||
|
2 & 9 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
=
|
||||||
|
\begin{bmatrix}
|
||||||
|
1 \cdot 1 + 4 \cdot 2 & 1 \cdot 6 + 4 \cdot 9 \\
|
||||||
|
3 \cdot 1 + 9 \cdot 2 & 3 \cdot 6 + 9 \cdot 9 \\
|
||||||
|
8 \cdot 1 + 6 \cdot 2 & 8 \cdot 6 + 6 \cdot 9 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
=
|
||||||
|
\begin{bmatrix}
|
||||||
|
9 & 42 \\
|
||||||
|
21 & 99 \\
|
||||||
|
20 & 102 \\
|
||||||
|
\end{bmatrix}.
|
||||||
|
\]
|
||||||
|
|
||||||
|
Matrix multiplication is associative,
|
||||||
|
so $A(BC)=(AB)C$ holds,
|
||||||
|
but it is not commutative,
|
||||||
|
so $AB = BA$ does not usually hold.
|
||||||
|
|
||||||
|
\index{identity matrix}
|
||||||
|
|
||||||
|
An \key{identity matrix} is a square matrix
|
||||||
|
where each element on the diagonal is 1
|
||||||
|
and all other elements are 0.
|
||||||
|
For example, the following matrix
|
||||||
|
is the $3 \times 3$ identity matrix:
|
||||||
|
\[
|
||||||
|
I = \begin{bmatrix}
|
||||||
|
1 & 0 & 0 \\
|
||||||
|
0 & 1 & 0 \\
|
||||||
|
0 & 0 & 1 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
\]
|
||||||
|
|
||||||
|
\begin{samepage}
|
||||||
|
Multiplying a matrix by an identity matrix
|
||||||
|
does not change it. For example,
|
||||||
|
\[
|
||||||
|
\begin{bmatrix}
|
||||||
|
1 & 0 & 0 \\
|
||||||
|
0 & 1 & 0 \\
|
||||||
|
0 & 0 & 1 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
\cdot
|
||||||
|
\begin{bmatrix}
|
||||||
|
1 & 4 \\
|
||||||
|
3 & 9 \\
|
||||||
|
8 & 6 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
=
|
||||||
|
\begin{bmatrix}
|
||||||
|
1 & 4 \\
|
||||||
|
3 & 9 \\
|
||||||
|
8 & 6 \\
|
||||||
|
\end{bmatrix} \hspace{10px} \textrm{and} \hspace{10px}
|
||||||
|
\begin{bmatrix}
|
||||||
|
1 & 4 \\
|
||||||
|
3 & 9 \\
|
||||||
|
8 & 6 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
\cdot
|
||||||
|
\begin{bmatrix}
|
||||||
|
1 & 0 \\
|
||||||
|
0 & 1 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
=
|
||||||
|
\begin{bmatrix}
|
||||||
|
1 & 4 \\
|
||||||
|
3 & 9 \\
|
||||||
|
8 & 6 \\
|
||||||
|
\end{bmatrix}.
|
||||||
|
\]
|
||||||
|
\end{samepage}
|
||||||
|
|
||||||
|
Using a straightforward algorithm,
|
||||||
|
we can calculate the product of
|
||||||
|
two $n \times n$ matrices
|
||||||
|
in $O(n^3)$ time.
|
||||||
|
There are also more efficient algorithms
|
||||||
|
for matrix multiplication\footnote{The first such
|
||||||
|
algorithm was Strassen's algorithm,
|
||||||
|
published in 1969 \cite{str69},
|
||||||
|
whose time complexity is $O(n^{2.80735})$;
|
||||||
|
the best current algorithm \cite{gal14}
|
||||||
|
works in $O(n^{2.37286})$ time.},
|
||||||
|
but they are mostly of theoretical interest
|
||||||
|
and such algorithms are not necessary
|
||||||
|
in competitive programming.
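
A sketch of the straightforward algorithm follows (representing a matrix as
a \texttt{vector} of rows is only one possible choice):
\begin{lstlisting}
// a sketch: multiplies two n x n matrices in O(n^3) time
vector<vector<long long>> mul(vector<vector<long long>>& A,
                              vector<vector<long long>>& B) {
    int n = A.size();
    vector<vector<long long>> C(n, vector<long long>(n, 0));
    for (int i = 0; i < n; i++) {
        for (int k = 0; k < n; k++) {
            for (int j = 0; j < n; j++) C[i][j] += A[i][k]*B[k][j];
        }
    }
    return C;
}
\end{lstlisting}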
|
||||||
|
|
||||||
|
|
||||||
|
\subsubsection{Matrix power}
|
||||||
|
|
||||||
|
\index{matrix power}
|
||||||
|
|
||||||
|
The power $A^k$ of a matrix $A$ is defined
|
||||||
|
if $A$ is a square matrix.
|
||||||
|
The definition is based on matrix multiplication:
|
||||||
|
\[ A^k = \underbrace{A \cdot A \cdot A \cdots A}_{\textrm{$k$ times}} \]
|
||||||
|
For example,
|
||||||
|
|
||||||
|
\[
|
||||||
|
\begin{bmatrix}
|
||||||
|
2 & 5 \\
|
||||||
|
1 & 4 \\
|
||||||
|
\end{bmatrix}^3 =
|
||||||
|
\begin{bmatrix}
|
||||||
|
2 & 5 \\
|
||||||
|
1 & 4 \\
|
||||||
|
\end{bmatrix} \cdot
|
||||||
|
\begin{bmatrix}
|
||||||
|
2 & 5 \\
|
||||||
|
1 & 4 \\
|
||||||
|
\end{bmatrix} \cdot
|
||||||
|
\begin{bmatrix}
|
||||||
|
2 & 5 \\
|
||||||
|
1 & 4 \\
|
||||||
|
\end{bmatrix} =
|
||||||
|
\begin{bmatrix}
|
||||||
|
48 & 165 \\
|
||||||
|
33 & 114 \\
|
||||||
|
\end{bmatrix}.
|
||||||
|
\]
|
||||||
|
In addition, $A^0$ is an identity matrix. For example,
|
||||||
|
\[
|
||||||
|
\begin{bmatrix}
|
||||||
|
2 & 5 \\
|
||||||
|
1 & 4 \\
|
||||||
|
\end{bmatrix}^0 =
|
||||||
|
\begin{bmatrix}
|
||||||
|
1 & 0 \\
|
||||||
|
0 & 1 \\
|
||||||
|
\end{bmatrix}.
|
||||||
|
\]
|
||||||
|
|
||||||
|
The matrix $A^k$ can be efficiently calculated
|
||||||
|
in $O(n^3 \log k)$ time using the
|
||||||
|
algorithm in Chapter 21.2. For example,
|
||||||
|
\[
|
||||||
|
\begin{bmatrix}
|
||||||
|
2 & 5 \\
|
||||||
|
1 & 4 \\
|
||||||
|
\end{bmatrix}^8 =
|
||||||
|
\begin{bmatrix}
|
||||||
|
2 & 5 \\
|
||||||
|
1 & 4 \\
|
||||||
|
\end{bmatrix}^4 \cdot
|
||||||
|
\begin{bmatrix}
|
||||||
|
2 & 5 \\
|
||||||
|
1 & 4 \\
|
||||||
|
\end{bmatrix}^4.
|
||||||
|
\]
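
The following sketch computes $A^k$ with repeated squaring, reusing the
\texttt{mul} function above, so that only $O(\log k)$ matrix multiplications
are needed:
\begin{lstlisting}
// a sketch: computes A^k with repeated squaring using mul above
vector<vector<long long>> matpow(vector<vector<long long>> A, long long k) {
    int n = A.size();
    vector<vector<long long>> R(n, vector<long long>(n, 0));
    for (int i = 0; i < n; i++) R[i][i] = 1; // identity matrix = A^0
    while (k > 0) {
        if (k&1) R = mul(R, A);
        A = mul(A, A);
        k >>= 1;
    }
    return R;
}
\end{lstlisting}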
|
||||||
|
|
||||||
|
\subsubsection{Determinant}
|
||||||
|
|
||||||
|
\index{determinant}
|
||||||
|
|
||||||
|
The \key{determinant} $\det(A)$ of a matrix $A$
|
||||||
|
is defined if $A$ is a square matrix.
|
||||||
|
If $A$ is of size $1 \times 1$,
|
||||||
|
then $\det(A)=A[1,1]$.
|
||||||
|
The determinant of a larger matrix is
|
||||||
|
calculated recursively using the formula \index{cofactor}
|
||||||
|
\[\det(A)=\sum_{j=1}^n A[1,j] C[1,j],\]
|
||||||
|
where $C[i,j]$ is the \key{cofactor} of $A$
|
||||||
|
at $[i,j]$.
|
||||||
|
The cofactor is calculated using the formula
|
||||||
|
\[C[i,j] = (-1)^{i+j} \det(M[i,j]),\]
|
||||||
|
where $M[i,j]$ is obtained by removing
|
||||||
|
row $i$ and column $j$ from $A$.
|
||||||
|
Due to the coefficient $(-1)^{i+j}$ in the cofactor,
the subdeterminants are alternately
added to and subtracted from the sum.
|
||||||
|
For example,
|
||||||
|
\[
|
||||||
|
\det(
|
||||||
|
\begin{bmatrix}
|
||||||
|
3 & 4 \\
|
||||||
|
1 & 6 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
) = 3 \cdot 6 - 4 \cdot 1 = 14
|
||||||
|
\]
|
||||||
|
and
|
||||||
|
\[
|
||||||
|
\det(
|
||||||
|
\begin{bmatrix}
|
||||||
|
2 & 4 & 3 \\
|
||||||
|
5 & 1 & 6 \\
|
||||||
|
7 & 2 & 4 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
) =
|
||||||
|
2 \cdot
|
||||||
|
\det(
|
||||||
|
\begin{bmatrix}
|
||||||
|
1 & 6 \\
|
||||||
|
2 & 4 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
)
|
||||||
|
-4 \cdot
|
||||||
|
\det(
|
||||||
|
\begin{bmatrix}
|
||||||
|
5 & 6 \\
|
||||||
|
7 & 4 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
)
|
||||||
|
+3 \cdot
|
||||||
|
\det(
|
||||||
|
\begin{bmatrix}
|
||||||
|
5 & 1 \\
|
||||||
|
7 & 2 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
) = 81.
|
||||||
|
\]
|
||||||
|
|
||||||
|
\index{inverse matrix}
|
||||||
|
|
||||||
|
The determinant of $A$ tells us
|
||||||
|
whether there is an \key{inverse matrix}
|
||||||
|
$A^{-1}$ such that $A \cdot A^{-1} = I$,
|
||||||
|
where $I$ is an identity matrix.
|
||||||
|
It turns out that $A^{-1}$ exists
|
||||||
|
exactly when $\det(A) \neq 0$,
|
||||||
|
and it can be calculated using the formula
|
||||||
|
|
||||||
|
\[A^{-1}[i,j] = \frac{C[j,i]}{\det(A)}.\]
|
||||||
|
|
||||||
|
For example,
|
||||||
|
|
||||||
|
\[
|
||||||
|
\underbrace{
|
||||||
|
\begin{bmatrix}
|
||||||
|
2 & 4 & 3\\
|
||||||
|
5 & 1 & 6\\
|
||||||
|
7 & 2 & 4\\
|
||||||
|
\end{bmatrix}
|
||||||
|
}_{A}
|
||||||
|
\cdot
|
||||||
|
\underbrace{
|
||||||
|
\frac{1}{81}
|
||||||
|
\begin{bmatrix}
|
||||||
|
-8 & -10 & 21 \\
|
||||||
|
22 & -13 & 3 \\
|
||||||
|
3 & 24 & -18 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
}_{A^{-1}}
|
||||||
|
=
|
||||||
|
\underbrace{
|
||||||
|
\begin{bmatrix}
|
||||||
|
1 & 0 & 0 \\
|
||||||
|
0 & 1 & 0 \\
|
||||||
|
0 & 0 & 1 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
}_{I}.
|
||||||
|
\]
|
||||||
|
|
||||||
|
\section{Linear recurrences}
|
||||||
|
|
||||||
|
\index{linear recurrence}
|
||||||
|
|
||||||
|
A \key{linear recurrence}
|
||||||
|
is a function $f(n)$
|
||||||
|
whose initial values are
|
||||||
|
$f(0),f(1),\ldots,f(k-1)$
|
||||||
|
and larger values
|
||||||
|
are calculated recursively using the formula
|
||||||
|
\[f(n) = c_1 f(n-1) + c_2 f(n-2) + \ldots + c_k f (n-k),\]
|
||||||
|
where $c_1,c_2,\ldots,c_k$ are constant coefficients.
|
||||||
|
|
||||||
|
Dynamic programming can be used to calculate
|
||||||
|
any value of $f(n)$ in $O(kn)$ time by calculating
|
||||||
|
all values of $f(0),f(1),\ldots,f(n)$ one after another.
|
||||||
|
However, if $k$ is small, it is possible to calculate
|
||||||
|
$f(n)$ much more efficiently in $O(k^3 \log n)$
|
||||||
|
time using matrix operations.
|
||||||
|
|
||||||
|
\subsubsection{Fibonacci numbers}
|
||||||
|
|
||||||
|
\index{Fibonacci number}
|
||||||
|
|
||||||
|
A simple example of a linear recurrence is the
|
||||||
|
following function that defines the Fibonacci numbers:
|
||||||
|
\[
|
||||||
|
\begin{array}{lcl}
|
||||||
|
f(0) & = & 0 \\
|
||||||
|
f(1) & = & 1 \\
|
||||||
|
f(n) & = & f(n-1)+f(n-2) \\
|
||||||
|
\end{array}
|
||||||
|
\]
|
||||||
|
In this case, $k=2$ and $c_1=c_2=1$.
|
||||||
|
|
||||||
|
\begin{samepage}
|
||||||
|
To efficiently calculate Fibonacci numbers,
|
||||||
|
we represent the
|
||||||
|
Fibonacci formula as a
|
||||||
|
square matrix $X$ of size $2 \times 2$,
|
||||||
|
for which the following holds:
|
||||||
|
\[ X \cdot
|
||||||
|
\begin{bmatrix}
|
||||||
|
f(i) \\
|
||||||
|
f(i+1) \\
|
||||||
|
\end{bmatrix}
|
||||||
|
=
|
||||||
|
\begin{bmatrix}
|
||||||
|
f(i+1) \\
|
||||||
|
f(i+2) \\
|
||||||
|
\end{bmatrix}
|
||||||
|
\]
|
||||||
|
Thus, values $f(i)$ and $f(i+1)$ are given as
|
||||||
|
''input'' for $X$,
|
||||||
|
and $X$ calculates values $f(i+1)$ and $f(i+2)$
|
||||||
|
from them.
|
||||||
|
It turns out that such a matrix is
|
||||||
|
|
||||||
|
\[ X =
|
||||||
|
\begin{bmatrix}
|
||||||
|
0 & 1 \\
|
||||||
|
1 & 1 \\
|
||||||
|
\end{bmatrix}.
|
||||||
|
\]
|
||||||
|
\end{samepage}
|
||||||
|
\noindent
|
||||||
|
For example,
|
||||||
|
\[
|
||||||
|
\begin{bmatrix}
|
||||||
|
0 & 1 \\
|
||||||
|
1 & 1 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
\cdot
|
||||||
|
\begin{bmatrix}
|
||||||
|
f(5) \\
|
||||||
|
f(6) \\
|
||||||
|
\end{bmatrix}
|
||||||
|
=
|
||||||
|
\begin{bmatrix}
|
||||||
|
0 & 1 \\
|
||||||
|
1 & 1 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
\cdot
|
||||||
|
\begin{bmatrix}
|
||||||
|
5 \\
|
||||||
|
8 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
=
|
||||||
|
\begin{bmatrix}
|
||||||
|
8 \\
|
||||||
|
13 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
=
|
||||||
|
\begin{bmatrix}
|
||||||
|
f(6) \\
|
||||||
|
f(7) \\
|
||||||
|
\end{bmatrix}.
|
||||||
|
\]
|
||||||
|
Thus, we can calculate $f(n)$ using the formula
|
||||||
|
\[
|
||||||
|
\begin{bmatrix}
|
||||||
|
f(n) \\
|
||||||
|
f(n+1) \\
|
||||||
|
\end{bmatrix}
|
||||||
|
=
|
||||||
|
X^n \cdot
|
||||||
|
\begin{bmatrix}
|
||||||
|
f(0) \\
|
||||||
|
f(1) \\
|
||||||
|
\end{bmatrix}
|
||||||
|
=
|
||||||
|
\begin{bmatrix}
|
||||||
|
0 & 1 \\
|
||||||
|
1 & 1 \\
|
||||||
|
\end{bmatrix}^n
|
||||||
|
\cdot
|
||||||
|
\begin{bmatrix}
|
||||||
|
0 \\
|
||||||
|
1 \\
|
||||||
|
\end{bmatrix}.
|
||||||
|
\]
|
||||||
|
The value of $X^n$ can be calculated in
|
||||||
|
$O(\log n)$ time,
|
||||||
|
so the value of $f(n)$ can also be calculated
|
||||||
|
in $O(\log n)$ time.
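
As a sketch, the $n$th Fibonacci number can now be computed with the
\texttt{matpow} function above (overflow is ignored; in practice the values
are usually computed modulo some number):
\begin{lstlisting}
// a sketch: the nth Fibonacci number via the matrix X and matpow above
long long fib(long long n) {
    vector<vector<long long>> X = {{0, 1}, {1, 1}};
    vector<vector<long long>> P = matpow(X, n);
    // [f(n), f(n+1)] = X^n * [f(0), f(1)] = X^n * [0, 1]
    return P[0][1];
}
\end{lstlisting}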
|
||||||
|
|
||||||
|
\subsubsection{General case}
|
||||||
|
|
||||||
|
Let us now consider the general case where
|
||||||
|
$f(n)$ is any linear recurrence.
|
||||||
|
Again, our goal is to construct a matrix $X$
|
||||||
|
for which
|
||||||
|
|
||||||
|
\[ X \cdot
|
||||||
|
\begin{bmatrix}
|
||||||
|
f(i) \\
|
||||||
|
f(i+1) \\
|
||||||
|
\vdots \\
|
||||||
|
f(i+k-1) \\
|
||||||
|
\end{bmatrix}
|
||||||
|
=
|
||||||
|
\begin{bmatrix}
|
||||||
|
f(i+1) \\
|
||||||
|
f(i+2) \\
|
||||||
|
\vdots \\
|
||||||
|
f(i+k) \\
|
||||||
|
\end{bmatrix}.
|
||||||
|
\]
|
||||||
|
Such a matrix is
|
||||||
|
\[
|
||||||
|
X =
|
||||||
|
\begin{bmatrix}
|
||||||
|
0 & 1 & 0 & 0 & \cdots & 0 \\
|
||||||
|
0 & 0 & 1 & 0 & \cdots & 0 \\
|
||||||
|
0 & 0 & 0 & 1 & \cdots & 0 \\
|
||||||
|
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
|
||||||
|
0 & 0 & 0 & 0 & \cdots & 1 \\
|
||||||
|
c_k & c_{k-1} & c_{k-2} & c_{k-3} & \cdots & c_1 \\
|
||||||
|
\end{bmatrix}.
|
||||||
|
\]
|
||||||
|
In the first $k-1$ rows, each element is 0
|
||||||
|
except that one element is 1.
|
||||||
|
These rows replace $f(i)$ with $f(i+1)$,
|
||||||
|
$f(i+1)$ with $f(i+2)$, and so on.
|
||||||
|
The last row contains the coefficients of the recurrence
|
||||||
|
to calculate the new value $f(i+k)$.
|
||||||
|
|
||||||
|
\begin{samepage}
|
||||||
|
Now, $f(n)$ can be calculated in
|
||||||
|
$O(k^3 \log n)$ time using the formula
|
||||||
|
\[
|
||||||
|
\begin{bmatrix}
|
||||||
|
f(n) \\
|
||||||
|
f(n+1) \\
|
||||||
|
\vdots \\
|
||||||
|
f(n+k-1) \\
|
||||||
|
\end{bmatrix}
|
||||||
|
=
|
||||||
|
X^n \cdot
|
||||||
|
\begin{bmatrix}
|
||||||
|
f(0) \\
|
||||||
|
f(1) \\
|
||||||
|
\vdots \\
|
||||||
|
f(k-1) \\
|
||||||
|
\end{bmatrix}.
|
||||||
|
\]
|
||||||
|
\end{samepage}
|
||||||
|
|
||||||
|
\section{Graphs and matrices}
|
||||||
|
|
||||||
|
\subsubsection{Counting paths}
|
||||||
|
|
||||||
|
The powers of an adjacency matrix of a graph
|
||||||
|
have an interesting property.
|
||||||
|
When $V$ is an adjacency matrix of an unweighted graph,
|
||||||
|
the matrix $V^n$ contains the numbers of paths of
|
||||||
|
$n$ edges between the nodes in the graph.
|
||||||
|
|
||||||
|
For example, for the graph
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (1,1) {$4$};
|
||||||
|
\node[draw, circle] (3) at (3,3) {$2$};
|
||||||
|
\node[draw, circle] (4) at (5,3) {$3$};
|
||||||
|
\node[draw, circle] (5) at (3,1) {$5$};
|
||||||
|
\node[draw, circle] (6) at (5,1) {$6$};
|
||||||
|
|
||||||
|
\path[draw,thick,->,>=latex] (1) -- (2);
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- (3);
|
||||||
|
\path[draw,thick,->,>=latex] (3) -- (1);
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- (3);
|
||||||
|
\path[draw,thick,->,>=latex] (3) -- (5);
|
||||||
|
\path[draw,thick,->,>=latex] (3) -- (6);
|
||||||
|
\path[draw,thick,->,>=latex] (6) -- (4);
|
||||||
|
\path[draw,thick,->,>=latex] (6) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
the adjacency matrix is
|
||||||
|
\[
|
||||||
|
V= \begin{bmatrix}
|
||||||
|
0 & 0 & 0 & 1 & 0 & 0 \\
|
||||||
|
1 & 0 & 0 & 0 & 1 & 1 \\
|
||||||
|
0 & 1 & 0 & 0 & 0 & 0 \\
|
||||||
|
0 & 1 & 0 & 0 & 0 & 0 \\
|
||||||
|
0 & 0 & 0 & 0 & 0 & 0 \\
|
||||||
|
0 & 0 & 1 & 0 & 1 & 0 \\
|
||||||
|
\end{bmatrix}.
|
||||||
|
\]
|
||||||
|
Now, for example, the matrix
|
||||||
|
\[
|
||||||
|
V^4= \begin{bmatrix}
|
||||||
|
0 & 0 & 1 & 1 & 1 & 0 \\
|
||||||
|
2 & 0 & 0 & 0 & 2 & 2 \\
|
||||||
|
0 & 2 & 0 & 0 & 0 & 0 \\
|
||||||
|
0 & 2 & 0 & 0 & 0 & 0 \\
|
||||||
|
0 & 0 & 0 & 0 & 0 & 0 \\
|
||||||
|
0 & 0 & 1 & 1 & 1 & 0 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
\]
|
||||||
|
contains the numbers of paths of 4 edges
|
||||||
|
between the nodes.
|
||||||
|
For example, $V^4[2,5]=2$,
|
||||||
|
because there are two paths of 4 edges
|
||||||
|
from node 2 to node 5:
|
||||||
|
$2 \rightarrow 1 \rightarrow 4 \rightarrow 2 \rightarrow 5$
|
||||||
|
and
|
||||||
|
$2 \rightarrow 6 \rightarrow 3 \rightarrow 2 \rightarrow 5$.
|
||||||
|
|
||||||
|
\subsubsection{Shortest paths}
|
||||||
|
|
||||||
|
Using a similar idea in a weighted graph,
|
||||||
|
we can calculate for each pair of nodes the minimum
|
||||||
|
length of a path
|
||||||
|
between them that contains exactly $n$ edges.
|
||||||
|
To calculate this, we have to define matrix multiplication
|
||||||
|
in a new way, so that we do not calculate the numbers
|
||||||
|
of paths but minimize the lengths of paths.
|
||||||
|
|
||||||
|
\begin{samepage}
|
||||||
|
As an example, consider the following graph:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (1,1) {$4$};
|
||||||
|
\node[draw, circle] (3) at (3,3) {$2$};
|
||||||
|
\node[draw, circle] (4) at (5,3) {$3$};
|
||||||
|
\node[draw, circle] (5) at (3,1) {$5$};
|
||||||
|
\node[draw, circle] (6) at (5,1) {$6$};
|
||||||
|
|
||||||
|
\path[draw,thick,->,>=latex] (1) -- node[font=\small,label=left:4] {} (2);
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=left:1] {} (3);
|
||||||
|
\path[draw,thick,->,>=latex] (3) -- node[font=\small,label=north:2] {} (1);
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- node[font=\small,label=north:4] {} (3);
|
||||||
|
\path[draw,thick,->,>=latex] (3) -- node[font=\small,label=left:1] {} (5);
|
||||||
|
\path[draw,thick,->,>=latex] (3) -- node[font=\small,label=left:2] {} (6);
|
||||||
|
\path[draw,thick,->,>=latex] (6) -- node[font=\small,label=right:3] {} (4);
|
||||||
|
\path[draw,thick,->,>=latex] (6) -- node[font=\small,label=below:2] {} (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
\end{samepage}
|
||||||
|
|
||||||
|
Let us construct an adjacency matrix where
|
||||||
|
$\infty$ means that an edge does not exist,
|
||||||
|
and other values correspond to edge weights.
|
||||||
|
The matrix is
|
||||||
|
\[
|
||||||
|
V= \begin{bmatrix}
|
||||||
|
\infty & \infty & \infty & 4 & \infty & \infty \\
|
||||||
|
2 & \infty & \infty & \infty & 1 & 2 \\
|
||||||
|
\infty & 4 & \infty & \infty & \infty & \infty \\
|
||||||
|
\infty & 1 & \infty & \infty & \infty & \infty \\
|
||||||
|
\infty & \infty & \infty & \infty & \infty & \infty \\
|
||||||
|
\infty & \infty & 3 & \infty & 2 & \infty \\
|
||||||
|
\end{bmatrix}.
|
||||||
|
\]
|
||||||
|
|
||||||
|
Instead of the formula
|
||||||
|
\[
|
||||||
|
AB[i,j] = \sum_{k=1}^n A[i,k] \cdot B[k,j]
|
||||||
|
\]
|
||||||
|
we now use the formula
|
||||||
|
\[
|
||||||
|
AB[i,j] = \min_{k=1}^n A[i,k] + B[k,j]
|
||||||
|
\]
|
||||||
|
for matrix multiplication, so we calculate
|
||||||
|
a minimum instead of a sum,
|
||||||
|
and a sum of elements instead of a product.
|
||||||
|
After this modification,
|
||||||
|
matrix powers correspond to
|
||||||
|
shortest paths in the graph.
|
||||||
|
|
||||||
|
For example, as
|
||||||
|
\[
|
||||||
|
V^4= \begin{bmatrix}
|
||||||
|
\infty & \infty & 10 & 11 & 9 & \infty \\
|
||||||
|
9 & \infty & \infty & \infty & 8 & 9 \\
|
||||||
|
\infty & 11 & \infty & \infty & \infty & \infty \\
|
||||||
|
\infty & 8 & \infty & \infty & \infty & \infty \\
|
||||||
|
\infty & \infty & \infty & \infty & \infty & \infty \\
|
||||||
|
\infty & \infty & 12 & 13 & 11 & \infty \\
|
||||||
|
\end{bmatrix},
|
||||||
|
\]
|
||||||
|
we can conclude that the minimum length of a path
|
||||||
|
of 4 edges
|
||||||
|
from node 2 to node 5 is 8.
|
||||||
|
Such a path is
|
||||||
|
$2 \rightarrow 1 \rightarrow 4 \rightarrow 2 \rightarrow 5$.
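
The following sketch shows one possible way to implement the modified
multiplication in C++ (the constant \texttt{INF} and the function names are only illustrative);
it prints the minimum length of a path of 4 edges from node 2 to node 5 in the above graph, which is 8:
\begin{lstlisting}
#include <bits/stdc++.h>
using namespace std;
const long long INF = 1e18;
typedef vector<vector<long long>> Matrix;

// "min-plus" product: C[i][j] = min over k of A[i][k]+B[k][j]
Matrix minplus(const Matrix& A, const Matrix& B) {
    int n = A.size();
    Matrix C(n, vector<long long>(n, INF));
    for (int i = 0; i < n; i++)
        for (int k = 0; k < n; k++) {
            if (A[i][k] == INF) continue;
            for (int j = 0; j < n; j++)
                if (B[k][j] != INF)
                    C[i][j] = min(C[i][j], A[i][k]+B[k][j]);
        }
    return C;
}

// minimum length of a path with exactly p edges between all node pairs
Matrix shortest(Matrix V, long long p) {
    Matrix R = V; p--;                 // R equals V^1 to begin with
    while (p > 0) {
        if (p & 1) R = minplus(R, V);
        V = minplus(V, V);
        p >>= 1;
    }
    return R;
}

int main() {
    Matrix V = {
        {INF,INF,INF,4,INF,INF},
        {2,INF,INF,INF,1,2},
        {INF,4,INF,INF,INF,INF},
        {INF,1,INF,INF,INF,INF},
        {INF,INF,INF,INF,INF,INF},
        {INF,INF,3,INF,2,INF}};
    cout << shortest(V,4)[1][4] << "\n"; // 8: path 2->1->4->2->5
}
\end{lstlisting}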
|
||||||
|
|
||||||
|
\subsubsection{Kirchhoff's theorem}
|
||||||
|
|
||||||
|
\index{Kirchhoff's theorem}
|
||||||
|
\index{spanning tree}
|
||||||
|
|
||||||
|
\key{Kirchhoff's theorem}
|
||||||
|
%\footnote{G. R. Kirchhoff (1824--1887) was a German physicist.}
|
||||||
|
provides a way
|
||||||
|
to calculate the number of spanning trees
|
||||||
|
of a graph as a determinant of a special matrix.
|
||||||
|
For example, the graph
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (3,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (1,1) {$3$};
|
||||||
|
\node[draw, circle] (4) at (3,1) {$4$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (3) -- (4);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
has three spanning trees:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1a) at (1,3) {$1$};
|
||||||
|
\node[draw, circle] (2a) at (3,3) {$2$};
|
||||||
|
\node[draw, circle] (3a) at (1,1) {$3$};
|
||||||
|
\node[draw, circle] (4a) at (3,1) {$4$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1a) -- (2a);
|
||||||
|
%\path[draw,thick,-] (1a) -- (3a);
|
||||||
|
\path[draw,thick,-] (3a) -- (4a);
|
||||||
|
\path[draw,thick,-] (1a) -- (4a);
|
||||||
|
|
||||||
|
\node[draw, circle] (1b) at (1+4,3) {$1$};
|
||||||
|
\node[draw, circle] (2b) at (3+4,3) {$2$};
|
||||||
|
\node[draw, circle] (3b) at (1+4,1) {$3$};
|
||||||
|
\node[draw, circle] (4b) at (3+4,1) {$4$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1b) -- (2b);
|
||||||
|
\path[draw,thick,-] (1b) -- (3b);
|
||||||
|
%\path[draw,thick,-] (3b) -- (4b);
|
||||||
|
\path[draw,thick,-] (1b) -- (4b);
|
||||||
|
|
||||||
|
\node[draw, circle] (1c) at (1+8,3) {$1$};
|
||||||
|
\node[draw, circle] (2c) at (3+8,3) {$2$};
|
||||||
|
\node[draw, circle] (3c) at (1+8,1) {$3$};
|
||||||
|
\node[draw, circle] (4c) at (3+8,1) {$4$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1c) -- (2c);
|
||||||
|
\path[draw,thick,-] (1c) -- (3c);
|
||||||
|
\path[draw,thick,-] (3c) -- (4c);
|
||||||
|
%\path[draw,thick,-] (1c) -- (4c);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
\index{Laplacean matrix}
|
||||||
|
To calculate the number of spanning trees,
|
||||||
|
we construct a \key{Laplacean matrix} $L$,
|
||||||
|
where $L[i,i]$ is the degree of node $i$
|
||||||
|
and $L[i,j]=-1$ if there is an edge between
|
||||||
|
nodes $i$ and $j$, and otherwise $L[i,j]=0$.
|
||||||
|
The Laplacean matrix for the above graph is as follows:
|
||||||
|
\[
|
||||||
|
L= \begin{bmatrix}
|
||||||
|
3 & -1 & -1 & -1 \\
|
||||||
|
-1 & 1 & 0 & 0 \\
|
||||||
|
-1 & 0 & 2 & -1 \\
|
||||||
|
-1 & 0 & -1 & 2 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
\]
|
||||||
|
|
||||||
|
It can be shown that
|
||||||
|
the number of spanning trees equals
|
||||||
|
the determinant of a matrix that is obtained
|
||||||
|
when we remove any row and any column from $L$.
|
||||||
|
For example, if we remove the first row
|
||||||
|
and column, the result is
|
||||||
|
|
||||||
|
\[ \det(
|
||||||
|
\begin{bmatrix}
|
||||||
|
1 & 0 & 0 \\
|
||||||
|
0 & 2 & -1 \\
|
||||||
|
0 & -1 & 2 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
) =3.\]
|
||||||
|
The determinant is always the same,
|
||||||
|
regardless of which row and column we remove from $L$.
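
As a rough sketch, the number of spanning trees could be computed in C++ as follows
(the function name \texttt{spanningTrees} is only illustrative; the determinant is computed here
with floating point numbers and rounded, which is only accurate for small graphs, so an exact
implementation would use fractions or modular arithmetic instead):
\begin{lstlisting}
#include <bits/stdc++.h>
using namespace std;

// number of spanning trees via Kirchhoff's theorem:
// determinant of the Laplacean matrix with one row and column removed
long long spanningTrees(int n, vector<pair<int,int>> edges) {
    vector<vector<double>> L(n, vector<double>(n, 0));
    for (auto [a,b] : edges) {           // nodes numbered 0..n-1
        L[a][a]++; L[b][b]++;
        L[a][b]--; L[b][a]--;
    }
    // remove row 0 and column 0, then compute the determinant
    int m = n-1;
    vector<vector<double>> M(m, vector<double>(m));
    for (int i = 0; i < m; i++)
        for (int j = 0; j < m; j++) M[i][j] = L[i+1][j+1];
    double det = 1;
    for (int i = 0; i < m; i++) {
        int p = i;                        // choose a pivot row
        for (int r = i; r < m; r++)
            if (abs(M[r][i]) > abs(M[p][i])) p = r;
        if (abs(M[p][i]) < 1e-9) return 0;
        if (p != i) { swap(M[p], M[i]); det = -det; }
        det *= M[i][i];
        for (int r = i+1; r < m; r++) {
            double f = M[r][i]/M[i][i];
            for (int c = i; c < m; c++) M[r][c] -= f*M[i][c];
        }
    }
    return llround(det);
}

int main() {
    // the example graph: edges 1-2, 1-3, 3-4, 1-4 (0-indexed here)
    cout << spanningTrees(4, {{0,1},{0,2},{2,3},{0,3}}) << "\n"; // 3
}
\end{lstlisting}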
|
||||||
|
|
||||||
|
Note that Cayley's formula in Chapter 22.5 is
|
||||||
|
a special case of Kirchhoff's theorem,
|
||||||
|
because in a complete graph of $n$ nodes
|
||||||
|
|
||||||
|
\[ \det(
|
||||||
|
\begin{bmatrix}
|
||||||
|
n-1 & -1 & \cdots & -1 \\
|
||||||
|
-1 & n-1 & \cdots & -1 \\
|
||||||
|
\vdots & \vdots & \ddots & \vdots \\
|
||||||
|
-1 & -1 & \cdots & n-1 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
) =n^{n-2}.\]
|
||||||
|
|
||||||
|
|
||||||
|

|
||||||
|
\chapter{Probability}
|
||||||
|
|
||||||
|
\index{probability}
|
||||||
|
|
||||||
|
A \key{probability} is a real number between $0$ and $1$
|
||||||
|
that indicates how probable an event is.
|
||||||
|
If an event is certain to happen,
|
||||||
|
its probability is 1,
|
||||||
|
and if an event is impossible,
|
||||||
|
its probability is 0.
|
||||||
|
The probability of an event is denoted $P(\cdots)$
|
||||||
|
where the three dots describe the event.
|
||||||
|
|
||||||
|
For example, when throwing a dice,
|
||||||
|
the outcome is an integer between $1$ and $6$,
|
||||||
|
and the probability of each outcome is $1/6$.
|
||||||
|
For example, we can calculate the following probabilities:
|
||||||
|
|
||||||
|
\begin{itemize}[noitemsep]
|
||||||
|
\item $P(\textrm{''the outcome is 4''})=1/6$
|
||||||
|
\item $P(\textrm{''the outcome is not 6''})=5/6$
|
||||||
|
\item $P(\textrm{''the outcome is even''})=1/2$
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
\section{Calculation}
|
||||||
|
|
||||||
|
To calculate the probability of an event,
|
||||||
|
we can either use combinatorics
|
||||||
|
or simulate the process that generates the event.
|
||||||
|
As an example, let us calculate the probability
|
||||||
|
of drawing three cards with the same value
|
||||||
|
from a shuffled deck of cards
|
||||||
|
(for example, $\spadesuit 8$, $\clubsuit 8$ and $\diamondsuit 8$).
|
||||||
|
|
||||||
|
\subsubsection*{Method 1}
|
||||||
|
|
||||||
|
We can calculate the probability using the formula
|
||||||
|
|
||||||
|
\[\frac{\textrm{number of desired outcomes}}{\textrm{total number of outcomes}}.\]
|
||||||
|
|
||||||
|
In this problem, the desired outcomes are those
|
||||||
|
in which the value of each card is the same.
|
||||||
|
There are $13 {4 \choose 3}$ such outcomes,
|
||||||
|
because there are $13$ possibilities for the
|
||||||
|
value of the cards and ${4 \choose 3}$ ways to
|
||||||
|
choose $3$ suits from $4$ possible suits.
|
||||||
|
|
||||||
|
There are a total of ${52 \choose 3}$ outcomes,
|
||||||
|
because we choose 3 cards from 52 cards.
|
||||||
|
Thus, the probability of the event is
|
||||||
|
|
||||||
|
\[\frac{13 {4 \choose 3}}{{52 \choose 3}} = \frac{1}{425}.\]
|
||||||
|
|
||||||
|
\subsubsection*{Method 2}
|
||||||
|
|
||||||
|
Another way to calculate the probability is
|
||||||
|
to simulate the process that generates the event.
|
||||||
|
In this example, we draw three cards, so the process
|
||||||
|
consists of three steps.
|
||||||
|
We require that each step of the process is successful.
|
||||||
|
|
||||||
|
Drawing the first card certainly succeeds,
|
||||||
|
because there are no restrictions.
|
||||||
|
The second step succeeds with probability $3/51$,
|
||||||
|
because there are 51 cards left and 3 of them
|
||||||
|
have the same value as the first card.
|
||||||
|
In a similar way, the third step succeeds with probability $2/50$.
|
||||||
|
|
||||||
|
The probability that the entire process succeeds is
|
||||||
|
|
||||||
|
\[1 \cdot \frac{3}{51} \cdot \frac{2}{50} = \frac{1}{425}.\]
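
As a quick sanity check, the probability can also be estimated by simulation.
The following sketch (all names illustrative) shuffles a deck repeatedly and counts how often
the first three cards have the same value; the result should be close to $1/425 \approx 0.00235$:
\begin{lstlisting}
#include <bits/stdc++.h>
using namespace std;

int main() {
    // estimate the probability that three random cards have the same value
    mt19937 rng(42);
    vector<int> deck(52);
    for (int i = 0; i < 52; i++) deck[i] = i/4;  // card value 0..12, suit ignored
    long long trials = 1000000, hits = 0;
    for (long long t = 0; t < trials; t++) {
        shuffle(deck.begin(), deck.end(), rng);
        if (deck[0] == deck[1] && deck[1] == deck[2]) hits++;
    }
    cout << (double)hits/trials << "\n";  // close to 1/425 = 0.00235...
}
\end{lstlisting}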
|
||||||
|
|
||||||
|
\section{Events}
|
||||||
|
|
||||||
|
An event in probability theory can be represented as a set
|
||||||
|
\[A \subset X,\]
|
||||||
|
where $X$ contains all possible outcomes
|
||||||
|
and $A$ is a subset of outcomes.
|
||||||
|
For example, when throwing a dice, the outcomes are
|
||||||
|
\[X = \{1,2,3,4,5,6\}.\]
|
||||||
|
Now, for example, the event ''the outcome is even''
|
||||||
|
corresponds to the set
|
||||||
|
\[A = \{2,4,6\}.\]
|
||||||
|
|
||||||
|
Each outcome $x$ is assigned a probability $p(x)$.
|
||||||
|
Then, the probability $P(A)$ of an event
|
||||||
|
$A$ can be calculated as a sum
|
||||||
|
of probabilities of outcomes using the formula
|
||||||
|
\[P(A) = \sum_{x \in A} p(x).\]
|
||||||
|
For example, when throwing a dice,
|
||||||
|
$p(x)=1/6$ for each outcome $x$,
|
||||||
|
so the probability of the event
|
||||||
|
''the outcome is even'' is
|
||||||
|
\[p(2)+p(4)+p(6)=1/2.\]
|
||||||
|
|
||||||
|
The total probability of the outcomes in $X$ must
|
||||||
|
be 1, i.e., $P(X)=1$.
|
||||||
|
|
||||||
|
Since the events in probability theory are sets,
|
||||||
|
we can manipulate them using standard set operations:
|
||||||
|
|
||||||
|
\begin{itemize}
|
||||||
|
\item The \key{complement} $\bar A$ means
|
||||||
|
''$A$ does not happen''.
|
||||||
|
For example, when throwing a dice,
|
||||||
|
the complement of $A=\{2,4,6\}$ is
|
||||||
|
$\bar A = \{1,3,5\}$.
|
||||||
|
\item The \key{union} $A \cup B$ means
|
||||||
|
''$A$ or $B$ happen''.
|
||||||
|
For example, the union of
|
||||||
|
$A=\{2,5\}$
|
||||||
|
and $B=\{4,5,6\}$ is
|
||||||
|
$A \cup B = \{2,4,5,6\}$.
|
||||||
|
\item The \key{intersection} $A \cap B$ means
|
||||||
|
''$A$ and $B$ happen''.
|
||||||
|
For example, the intersection of
|
||||||
|
$A=\{2,5\}$ and $B=\{4,5,6\}$ is
|
||||||
|
$A \cap B = \{5\}$.
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
\subsubsection{Complement}
|
||||||
|
|
||||||
|
The probability of the complement
|
||||||
|
$\bar A$ is calculated using the formula
|
||||||
|
\[P(\bar A)=1-P(A).\]
|
||||||
|
|
||||||
|
Sometimes, we can solve a problem easily
|
||||||
|
using complements by solving the opposite problem.
|
||||||
|
For example, the probability of getting
|
||||||
|
at least one six when throwing a dice ten times is
|
||||||
|
\[1-(5/6)^{10}.\]
|
||||||
|
|
||||||
|
Here $5/6$ is the probability that the outcome
|
||||||
|
of a single throw is not six, and
|
||||||
|
$(5/6)^{10}$ is the probability that none of
|
||||||
|
the ten throws is a six.
|
||||||
|
The complement of this is the answer to the problem.
|
||||||
|
|
||||||
|
\subsubsection{Union}
|
||||||
|
|
||||||
|
The probability of the union $A \cup B$
|
||||||
|
is calculated using the formula
|
||||||
|
\[P(A \cup B)=P(A)+P(B)-P(A \cap B).\]
|
||||||
|
For example, when throwing a dice,
|
||||||
|
the union of the events
|
||||||
|
\[A=\textrm{''the outcome is even''}\]
|
||||||
|
and
|
||||||
|
\[B=\textrm{''the outcome is less than 4''}\]
|
||||||
|
is
|
||||||
|
\[A \cup B=\textrm{''the outcome is even or less than 4''},\]
|
||||||
|
and its probability is
|
||||||
|
\[P(A \cup B) = P(A)+P(B)-P(A \cap B)=1/2+1/2-1/6=5/6.\]
|
||||||
|
|
||||||
|
If the events $A$ and $B$ are \key{disjoint}, i.e.,
|
||||||
|
$A \cap B$ is empty,
|
||||||
|
the probability of the event $A \cup B$ is simply
|
||||||
|
|
||||||
|
\[P(A \cup B)=P(A)+P(B).\]
|
||||||
|
|
||||||
|
\subsubsection{Conditional probability}
|
||||||
|
|
||||||
|
\index{conditional probability}
|
||||||
|
|
||||||
|
The \key{conditional probability}
|
||||||
|
\[P(A | B) = \frac{P(A \cap B)}{P(B)}\]
|
||||||
|
is the probability of $A$
|
||||||
|
assuming that $B$ happens.
|
||||||
|
Hence, when calculating the
|
||||||
|
probability of $A$, we only consider the outcomes
|
||||||
|
that also belong to $B$.
|
||||||
|
|
||||||
|
Using the previous sets,
|
||||||
|
\[P(A | B)= 1/3,\]
|
||||||
|
because the outcomes of $B$ are
|
||||||
|
$\{1,2,3\}$, and one of them is even.
|
||||||
|
This is the probability of an even outcome
|
||||||
|
if we know that the outcome is between $1$ and $3$.
|
||||||
|
|
||||||
|
\subsubsection{Intersection}
|
||||||
|
|
||||||
|
\index{independence}
|
||||||
|
|
||||||
|
Using conditional probability,
|
||||||
|
the probability of the intersection
|
||||||
|
$A \cap B$ can be calculated using the formula
|
||||||
|
\[P(A \cap B)=P(A)P(B|A).\]
|
||||||
|
Events $A$ and $B$ are \key{independent} if
|
||||||
|
\[P(A|B)=P(A) \hspace{10px}\textrm{and}\hspace{10px} P(B|A)=P(B),\]
|
||||||
|
which means that the fact that $B$ happens does not
|
||||||
|
change the probability of $A$, and vice versa.
|
||||||
|
In this case, the probability of the intersection is
|
||||||
|
\[P(A \cap B)=P(A)P(B).\]
|
||||||
|
For example, when drawing a card from a deck, the events
|
||||||
|
\[A = \textrm{''the suit is clubs''}\]
|
||||||
|
and
|
||||||
|
\[B = \textrm{''the value is four''}\]
|
||||||
|
are independent. Hence the event
|
||||||
|
\[A \cap B = \textrm{''the card is the four of clubs''}\]
|
||||||
|
happens with probability
|
||||||
|
\[P(A \cap B)=P(A)P(B)=1/4 \cdot 1/13 = 1/52.\]
|
||||||
|
|
||||||
|
\section{Random variables}
|
||||||
|
|
||||||
|
\index{random variable}
|
||||||
|
|
||||||
|
A \key{random variable} is a value that is generated
|
||||||
|
by a random process.
|
||||||
|
For example, when throwing two dice,
|
||||||
|
a possible random variable is
|
||||||
|
\[X=\textrm{''the sum of the outcomes''}.\]
|
||||||
|
For example, if the outcomes are $[4,6]$
|
||||||
|
(meaning that we first throw a four and then a six),
|
||||||
|
then the value of $X$ is 10.
|
||||||
|
|
||||||
|
We denote by $P(X=x)$ the probability that
|
||||||
|
the value of a random variable $X$ is $x$.
|
||||||
|
For example, when throwing two dice,
|
||||||
|
$P(X=10)=3/36$,
|
||||||
|
because the total number of outcomes is 36
|
||||||
|
and there are three possible ways to obtain
|
||||||
|
the sum 10: $[4,6]$, $[5,5]$ and $[6,4]$.
|
||||||
|
|
||||||
|
\subsubsection{Expected value}
|
||||||
|
|
||||||
|
\index{expected value}
|
||||||
|
|
||||||
|
The \key{expected value} $E[X]$ indicates the
|
||||||
|
average value of a random variable $X$.
|
||||||
|
The expected value can be calculated as the sum
|
||||||
|
\[\sum_x P(X=x)x,\]
|
||||||
|
where $x$ goes through all possible values of $X$.
|
||||||
|
|
||||||
|
For example, when throwing a dice,
|
||||||
|
the expected outcome is
|
||||||
|
\[1/6 \cdot 1 + 1/6 \cdot 2 + 1/6 \cdot 3 + 1/6 \cdot 4 + 1/6 \cdot 5 + 1/6 \cdot 6 = 7/2.\]
|
||||||
|
|
||||||
|
A useful property of expected values is \key{linearity}.
|
||||||
|
It means that the sum
|
||||||
|
$E[X_1+X_2+\cdots+X_n]$
|
||||||
|
always equals the sum
|
||||||
|
$E[X_1]+E[X_2]+\cdots+E[X_n]$.
|
||||||
|
This formula holds even if random variables
|
||||||
|
depend on each other.
|
||||||
|
|
||||||
|
For example, when throwing two dice,
|
||||||
|
the expected sum is
|
||||||
|
\[E[X_1+X_2]=E[X_1]+E[X_2]=7/2+7/2=7.\]
|
||||||
|
|
||||||
|
Let us now consider a problem where
|
||||||
|
$n$ balls are randomly placed in $n$ boxes,
|
||||||
|
and our task is to calculate the expected
|
||||||
|
number of empty boxes.
|
||||||
|
Each ball has an equal probability to
|
||||||
|
be placed in any of the boxes.
|
||||||
|
For example, if $n=2$, the possibilities
|
||||||
|
are as follows:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}
|
||||||
|
\draw (0,0) rectangle (1,1);
|
||||||
|
\draw (1.2,0) rectangle (2.2,1);
|
||||||
|
\draw (3,0) rectangle (4,1);
|
||||||
|
\draw (4.2,0) rectangle (5.2,1);
|
||||||
|
\draw (6,0) rectangle (7,1);
|
||||||
|
\draw (7.2,0) rectangle (8.2,1);
|
||||||
|
\draw (9,0) rectangle (10,1);
|
||||||
|
\draw (10.2,0) rectangle (11.2,1);
|
||||||
|
|
||||||
|
\draw[fill=blue] (0.5,0.2) circle (0.1);
|
||||||
|
\draw[fill=red] (1.7,0.2) circle (0.1);
|
||||||
|
\draw[fill=red] (3.5,0.2) circle (0.1);
|
||||||
|
\draw[fill=blue] (4.7,0.2) circle (0.1);
|
||||||
|
\draw[fill=blue] (6.25,0.2) circle (0.1);
|
||||||
|
\draw[fill=red] (6.75,0.2) circle (0.1);
|
||||||
|
\draw[fill=blue] (10.45,0.2) circle (0.1);
|
||||||
|
\draw[fill=red] (10.95,0.2) circle (0.1);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
In this case, the expected number of
|
||||||
|
empty boxes is
|
||||||
|
\[\frac{0+0+1+1}{4} = \frac{1}{2}.\]
|
||||||
|
In the general case, the probability that a
|
||||||
|
single box is empty is
|
||||||
|
\[\Big(\frac{n-1}{n}\Big)^n,\]
|
||||||
|
because no ball should be placed in it.
|
||||||
|
Hence, using linearity, the expected number of
|
||||||
|
empty boxes is
|
||||||
|
\[n \cdot \Big(\frac{n-1}{n}\Big)^n.\]
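
The formula can be checked with a short simulation such as the following sketch
(the parameters and names are only illustrative):
\begin{lstlisting}
#include <bits/stdc++.h>
using namespace std;

int main() {
    int n = 10;
    mt19937 rng(1);
    uniform_int_distribution<int> box(0, n-1);
    long long trials = 100000, totalEmpty = 0;
    for (long long t = 0; t < trials; t++) {
        vector<int> cnt(n, 0);
        for (int i = 0; i < n; i++) cnt[box(rng)]++;    // place n balls randomly
        totalEmpty += count(cnt.begin(), cnt.end(), 0); // empty boxes in this trial
    }
    cout << (double)totalEmpty/trials << "\n";          // simulated expectation
    cout << n*pow((n-1.0)/n, n) << "\n";                // formula n*((n-1)/n)^n
}
\end{lstlisting}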
|
||||||
|
|
||||||
|
\subsubsection{Distributions}
|
||||||
|
|
||||||
|
\index{distribution}
|
||||||
|
|
||||||
|
The \key{distribution} of a random variable $X$
|
||||||
|
shows the probability of each value that
|
||||||
|
$X$ may have.
|
||||||
|
The distribution consists of values $P(X=x)$.
|
||||||
|
For example, when throwing two dice,
|
||||||
|
the distribution for their sum is:
|
||||||
|
\begin{center}
|
||||||
|
\small {
|
||||||
|
\begin{tabular}{r|rrrrrrrrrrrrr}
|
||||||
|
$x$ & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 \\
|
||||||
|
$P(X=x)$ & $1/36$ & $2/36$ & $3/36$ & $4/36$ & $5/36$ & $6/36$ & $5/36$ & $4/36$ & $3/36$ & $2/36$ & $1/36$ \\
|
||||||
|
\end{tabular}
|
||||||
|
}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
\index{uniform distribution}
|
||||||
|
In a \key{uniform distribution},
|
||||||
|
the random variable $X$ has $n$ possible
|
||||||
|
values $a,a+1,\ldots,b$ and the probability of each value is $1/n$.
|
||||||
|
For example, when throwing a dice,
|
||||||
|
$a=1$, $b=6$ and $P(X=x)=1/6$ for each value $x$.
|
||||||
|
|
||||||
|
The expected value of $X$ in a uniform distribution is
|
||||||
|
\[E[X] = \frac{a+b}{2}.\]
|
||||||
|
|
||||||
|
\index{binomial distribution}
|
||||||
|
In a \key{binomial distribution}, $n$ attempts
|
||||||
|
are made
|
||||||
|
and the probability that a single attempt succeeds
|
||||||
|
is $p$.
|
||||||
|
The random variable $X$ counts the number of
|
||||||
|
successful attempts,
|
||||||
|
and the probability of a value $x$ is
|
||||||
|
\[P(X=x)=p^x (1-p)^{n-x} {n \choose x},\]
|
||||||
|
where $p^x$ and $(1-p)^{n-x}$ correspond to
|
||||||
|
successful and unsuccessful attempts,
|
||||||
|
and ${n \choose x}$ is the number of ways
|
||||||
|
we can choose the order of the attempts.
|
||||||
|
|
||||||
|
For example, when throwing a dice ten times,
|
||||||
|
the probability of throwing a six exactly
|
||||||
|
three times is $(1/6)^3 (5/6)^7 {10 \choose 3}$.
|
||||||
|
|
||||||
|
The expected value of $X$ in a binomial distribution is
|
||||||
|
\[E[X] = pn.\]
|
||||||
|
|
||||||
|
\index{geometric distribution}
|
||||||
|
In a \key{geometric distribution},
|
||||||
|
the probability that an attempt succeeds is $p$,
|
||||||
|
and we continue until the first success happens.
|
||||||
|
The random variable $X$ counts the number
|
||||||
|
of attempts needed, and the probability of
|
||||||
|
a value $x$ is
|
||||||
|
\[P(X=x)=(1-p)^{x-1} p,\]
|
||||||
|
where $(1-p)^{x-1}$ corresponds to the unsuccessful attempts
|
||||||
|
and $p$ corresponds to the first successful attempt.
|
||||||
|
|
||||||
|
For example, if we throw a dice until we throw a six,
|
||||||
|
the probability that the number of throws
|
||||||
|
is exactly 4 is $(5/6)^3 1/6$.
|
||||||
|
|
||||||
|
The expected value of $X$ in a geometric distribution is
|
||||||
|
\[E[X]=\frac{1}{p}.\]
|
||||||
|
|
||||||
|
\section{Markov chains}
|
||||||
|
|
||||||
|
\index{Markov chain}
|
||||||
|
|
||||||
|
A \key{Markov chain}
|
||||||
|
% \footnote{A. A. Markov (1856--1922)
|
||||||
|
% was a Russian mathematician.}
|
||||||
|
is a random process
|
||||||
|
that consists of states and transitions between them.
|
||||||
|
For each state, we know the probabilities
|
||||||
|
for moving to other states.
|
||||||
|
A Markov chain can be represented as a graph
|
||||||
|
whose nodes are states and edges are transitions.
|
||||||
|
|
||||||
|
As an example, consider a problem
|
||||||
|
where we start on floor 1 of an $n$-floor building.
|
||||||
|
At each step, we randomly walk either one floor
|
||||||
|
up or one floor down, except that we always
|
||||||
|
walk one floor up from floor 1 and one floor down
|
||||||
|
from floor $n$.
|
||||||
|
What is the probability of being on floor $m$
|
||||||
|
after $k$ steps?
|
||||||
|
|
||||||
|
In this problem, each floor of the building
|
||||||
|
corresponds to a state in a Markov chain.
|
||||||
|
For example, if $n=5$, the graph is as follows:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (0,0) {$1$};
|
||||||
|
\node[draw, circle] (2) at (2,0) {$2$};
|
||||||
|
\node[draw, circle] (3) at (4,0) {$3$};
|
||||||
|
\node[draw, circle] (4) at (6,0) {$4$};
|
||||||
|
\node[draw, circle] (5) at (8,0) {$5$};
|
||||||
|
|
||||||
|
\path[draw,thick,->] (1) edge [bend left=40] node[font=\small,label=$1$] {} (2);
|
||||||
|
\path[draw,thick,->] (2) edge [bend left=40] node[font=\small,label=$1/2$] {} (3);
|
||||||
|
\path[draw,thick,->] (3) edge [bend left=40] node[font=\small,label=$1/2$] {} (4);
|
||||||
|
\path[draw,thick,->] (4) edge [bend left=40] node[font=\small,label=$1/2$] {} (5);
|
||||||
|
|
||||||
|
\path[draw,thick,->] (5) edge [bend left=40] node[font=\small,label=below:$1$] {} (4);
|
||||||
|
\path[draw,thick,->] (4) edge [bend left=40] node[font=\small,label=below:$1/2$] {} (3);
|
||||||
|
\path[draw,thick,->] (3) edge [bend left=40] node[font=\small,label=below:$1/2$] {} (2);
|
||||||
|
\path[draw,thick,->] (2) edge [bend left=40] node[font=\small,label=below:$1/2$] {} (1);
|
||||||
|
|
||||||
|
%\path[draw,thick,->] (1) edge [bend left=40] node[font=\small,label=below:$1$] {} (2);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
The probability distribution
|
||||||
|
of a Markov chain is a vector
|
||||||
|
$[p_1,p_2,\ldots,p_n]$, where $p_k$ is the
|
||||||
|
probability that the current state is $k$.
|
||||||
|
The formula $p_1+p_2+\cdots+p_n=1$ always holds.
|
||||||
|
|
||||||
|
In the above scenario, the initial distribution is
|
||||||
|
$[1,0,0,0,0]$, because we always begin on floor 1.
|
||||||
|
The next distribution is $[0,1,0,0,0]$,
|
||||||
|
because we can only move from floor 1 to floor 2.
|
||||||
|
After this, we can either move one floor up
|
||||||
|
or one floor down, so the next distribution is
|
||||||
|
$[1/2,0,1/2,0,0]$, and so on.
|
||||||
|
|
||||||
|
An efficient way to simulate the walk in
|
||||||
|
a Markov chain is to use dynamic programming.
|
||||||
|
The idea is to maintain the probability distribution,
|
||||||
|
and at each step go through all the possible moves.
|
||||||
|
Using this method, we can simulate
|
||||||
|
a walk of $m$ steps in $O(n^2 m)$ time.
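
The following sketch shows this dynamic programming for the above building scenario
(the variable names are only illustrative); in this particular chain each state has at most
two transitions, so one step takes only $O(n)$ time:
\begin{lstlisting}
#include <bits/stdc++.h>
using namespace std;

int main() {
    int n = 5;                     // number of floors
    int m = 3;                     // number of steps to simulate
    vector<double> p(n, 0.0);
    p[0] = 1.0;                    // we begin on floor 1 (index 0)
    for (int step = 0; step < m; step++) {
        vector<double> q(n, 0.0);
        for (int i = 0; i < n; i++) {
            if (p[i] == 0) continue;
            if (i == 0) q[1] += p[i];            // from floor 1 we always go up
            else if (i == n-1) q[n-2] += p[i];   // from floor n we always go down
            else { q[i-1] += p[i]/2; q[i+1] += p[i]/2; }
        }
        p = q;
    }
    for (double x : p) cout << x << " ";   // distribution after m steps
    cout << "\n";
}
\end{lstlisting}
After three steps, this prints the distribution $[0, 3/4, 0, 1/4, 0]$.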
|
||||||
|
|
||||||
|
The transitions of a Markov chain can also be
|
||||||
|
represented as a matrix that updates the
|
||||||
|
probability distribution.
|
||||||
|
In the above scenario, the matrix is
|
||||||
|
|
||||||
|
\[
|
||||||
|
\begin{bmatrix}
|
||||||
|
0 & 1/2 & 0 & 0 & 0 \\
|
||||||
|
1 & 0 & 1/2 & 0 & 0 \\
|
||||||
|
0 & 1/2 & 0 & 1/2 & 0 \\
|
||||||
|
0 & 0 & 1/2 & 0 & 1 \\
|
||||||
|
0 & 0 & 0 & 1/2 & 0 \\
|
||||||
|
\end{bmatrix}.
|
||||||
|
\]
|
||||||
|
|
||||||
|
When we multiply a probability distribution by this matrix,
|
||||||
|
we get the new distribution after moving one step.
|
||||||
|
For example, we can move from the distribution
|
||||||
|
$[1,0,0,0,0]$ to the distribution
|
||||||
|
$[0,1,0,0,0]$ as follows:
|
||||||
|
|
||||||
|
\[
|
||||||
|
\begin{bmatrix}
|
||||||
|
0 & 1/2 & 0 & 0 & 0 \\
|
||||||
|
1 & 0 & 1/2 & 0 & 0 \\
|
||||||
|
0 & 1/2 & 0 & 1/2 & 0 \\
|
||||||
|
0 & 0 & 1/2 & 0 & 1 \\
|
||||||
|
0 & 0 & 0 & 1/2 & 0 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
\begin{bmatrix}
|
||||||
|
1 \\
|
||||||
|
0 \\
|
||||||
|
0 \\
|
||||||
|
0 \\
|
||||||
|
0 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
=
|
||||||
|
\begin{bmatrix}
|
||||||
|
0 \\
|
||||||
|
1 \\
|
||||||
|
0 \\
|
||||||
|
0 \\
|
||||||
|
0 \\
|
||||||
|
\end{bmatrix}.
|
||||||
|
\]
|
||||||
|
|
||||||
|
By calculating matrix powers efficiently,
|
||||||
|
we can calculate the distribution after $m$ steps
|
||||||
|
in $O(n^3 \log m)$ time.
|
||||||
|
|
||||||
|
\section{Randomized algorithms}
|
||||||
|
|
||||||
|
\index{randomized algorithm}
|
||||||
|
|
||||||
|
Sometimes we can use randomness for solving a problem,
|
||||||
|
even if the problem is not related to probabilities.
|
||||||
|
A \key{randomized algorithm} is an algorithm that
|
||||||
|
is based on randomness.
|
||||||
|
|
||||||
|
\index{Monte Carlo algorithm}
|
||||||
|
|
||||||
|
A \key{Monte Carlo algorithm} is a randomized algorithm
|
||||||
|
that may sometimes give a wrong answer.
|
||||||
|
For such an algorithm to be useful,
|
||||||
|
the probability of a wrong answer should be small.
|
||||||
|
|
||||||
|
\index{Las Vegas algorithm}
|
||||||
|
|
||||||
|
A \key{Las Vegas algorithm} is a randomized algorithm
|
||||||
|
that always gives the correct answer,
|
||||||
|
but its running time varies randomly.
|
||||||
|
The goal is to design an algorithm that is
|
||||||
|
efficient with high probability.
|
||||||
|
|
||||||
|
Next we will go through three example problems that
|
||||||
|
can be solved using randomness.
|
||||||
|
|
||||||
|
\subsubsection{Order statistics}
|
||||||
|
|
||||||
|
\index{order statistic}
|
||||||
|
|
||||||
|
The $k$th \key{order statistic} of an array
|
||||||
|
is the element at position $k$ after sorting
|
||||||
|
the array in increasing order.
|
||||||
|
It is easy to calculate any order statistic
|
||||||
|
in $O(n \log n)$ time by first sorting the array,
|
||||||
|
but is it really necessary to sort the entire array
just to find one element?
|
||||||
|
|
||||||
|
It turns out that we can find order statistics
|
||||||
|
using a randomized algorithm without sorting the array.
|
||||||
|
The algorithm, called \key{quickselect}\footnote{In 1961,
|
||||||
|
C. A. R. Hoare published two algorithms that
|
||||||
|
are efficient on average: \index{quicksort} \index{quickselect}
|
||||||
|
\key{quicksort} \cite{hoa61a} for sorting arrays and
|
||||||
|
\key{quickselect} \cite{hoa61b} for finding order statistics.}, is a Las Vegas algorithm:
|
||||||
|
its running time is usually $O(n)$
|
||||||
|
but $O(n^2)$ in the worst case.
|
||||||
|
|
||||||
|
The algorithm chooses a random element $x$
|
||||||
|
of the array, and moves elements smaller than $x$
|
||||||
|
to the left part of the array,
|
||||||
|
and all other elements to the right part of the array.
|
||||||
|
This takes $O(n)$ time when there are $n$ elements.
|
||||||
|
Assume that the left part contains $a$ elements
|
||||||
|
and the right part contains $b$ elements.
|
||||||
|
If $a=k$, element $x$ is the $k$th order statistic.
|
||||||
|
Otherwise, if $a>k$, we recursively find the $k$th order
|
||||||
|
statistic for the left part,
|
||||||
|
and if $a<k$, we recursively find the $r$th order
|
||||||
|
statistic for the right part where $r=k-a$.
|
||||||
|
The search continues in a similar way, until the element
|
||||||
|
has been found.
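
A possible implementation sketch of quickselect in C++ is as follows
(the partitioning can be done in many ways; this version and its names are only illustrative).
Note that the C++ standard library also provides \texttt{nth\_element}, which typically uses a similar idea:
\begin{lstlisting}
#include <bits/stdc++.h>
using namespace std;

mt19937 rng(12345);

// returns the k-th order statistic (0-indexed) of t[l..r]
int quickselect(vector<int>& t, int l, int r, int k) {
    if (l == r) return t[l];
    int x = t[l + rng()%(r-l+1)];           // random pivot element
    int i = l, j = r;
    while (i <= j) {                        // partition around x
        while (t[i] < x) i++;
        while (t[j] > x) j--;
        if (i <= j) swap(t[i++], t[j--]);
    }
    if (k <= j) return quickselect(t, l, j, k);
    if (k >= i) return quickselect(t, i, r, k);
    return x;                               // k lies in the block equal to x
}

int main() {
    vector<int> t = {3,7,2,9,4,1,8};
    cout << quickselect(t, 0, t.size()-1, 3) << "\n"; // 4th smallest element: 4
    // the standard library provides a similar routine:
    nth_element(t.begin(), t.begin()+3, t.end());
    cout << t[3] << "\n";                             // also 4
}
\end{lstlisting}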
|
||||||
|
|
||||||
|
When the element $x$ is always chosen randomly,
the size of the array roughly halves at each step on average,
|
||||||
|
so the time complexity for
|
||||||
|
finding the $k$th order statistic is about
|
||||||
|
\[n+n/2+n/4+n/8+\cdots < 2n = O(n).\]
|
||||||
|
|
||||||
|
The worst case of the algorithm still requires $O(n^2)$ time,
|
||||||
|
because it is possible that $x$ is always chosen
|
||||||
|
in such a way that it is one of the smallest or largest
|
||||||
|
elements in the array and $O(n)$ steps are needed.
|
||||||
|
However, the probability of this is so small
that it practically never happens.
|
||||||
|
|
||||||
|
\subsubsection{Verifying matrix multiplication}
|
||||||
|
|
||||||
|
\index{matrix multiplication}
|
||||||
|
|
||||||
|
Our next problem is to \emph{verify}
|
||||||
|
if $AB=C$ holds when $A$, $B$ and $C$
|
||||||
|
are matrices of size $n \times n$.
|
||||||
|
Of course, we can solve the problem
|
||||||
|
by calculating the product $AB$ again
|
||||||
|
(in $O(n^3)$ time using the basic algorithm),
|
||||||
|
but one could hope that verifying the
answer would be easier than calculating it from scratch.
|
||||||
|
|
||||||
|
It turns out that we can solve the problem
|
||||||
|
using a Monte Carlo algorithm\footnote{R. M. Freivalds published
|
||||||
|
this algorithm in 1977 \cite{fre77}, and it is sometimes
|
||||||
|
called \index{Freivalds' algorithm} \key{Freivalds' algorithm}.} whose
|
||||||
|
time complexity is only $O(n^2)$.
|
||||||
|
The idea is simple: we choose a random vector
|
||||||
|
$X$ of $n$ elements, and calculate the matrices
|
||||||
|
$ABX$ and $CX$. If $ABX=CX$, we report that $AB=C$,
|
||||||
|
and otherwise we report that $AB \neq C$.
|
||||||
|
|
||||||
|
The time complexity of the algorithm is
|
||||||
|
$O(n^2)$, because we can calculate the matrices
|
||||||
|
$ABX$ and $CX$ in $O(n^2)$ time.
|
||||||
|
We can calculate the matrix $ABX$ efficiently
|
||||||
|
by using the representation $A(BX)$, so only two
|
||||||
|
multiplications of $n \times n$ and $n \times 1$
|
||||||
|
size matrices are needed.
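
A sketch of the algorithm in C++ could look as follows (the function names are only illustrative;
the random vectors here contain only values 0 and 1, and the test is repeated a few times
to decrease the probability of a mistake):
\begin{lstlisting}
#include <bits/stdc++.h>
using namespace std;
typedef vector<vector<long long>> Matrix;

// multiply an n x n matrix by a vector of length n
vector<long long> mulVec(const Matrix& A, const vector<long long>& x) {
    int n = A.size();
    vector<long long> y(n, 0);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) y[i] += A[i][j]*x[j];
    return y;
}

// Monte Carlo test: returns false only if AB != C is detected
bool probablyEqual(const Matrix& A, const Matrix& B, const Matrix& C,
                   int rounds = 20) {
    int n = A.size();
    mt19937 rng(random_device{}());
    for (int t = 0; t < rounds; t++) {
        vector<long long> x(n);
        for (auto& v : x) v = rng()%2;                 // random 0/1 vector
        if (mulVec(A, mulVec(B, x)) != mulVec(C, x))   // compare A(BX) and CX
            return false;
    }
    return true;   // AB = C with high probability
}

int main() {
    Matrix A = {{6,8},{1,3}}, C = {{8,7},{3,2}};
    Matrix I = {{1,0},{0,1}};
    cout << probablyEqual(A, I, C) << "\n"; // A*I != C, so this almost surely prints 0
}
\end{lstlisting}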
|
||||||
|
|
||||||
|
The drawback of the algorithm is
|
||||||
|
that there is a small chance that the algorithm
|
||||||
|
makes a mistake when it reports that $AB=C$.
|
||||||
|
For example,
|
||||||
|
\[
|
||||||
|
\begin{bmatrix}
|
||||||
|
6 & 8 \\
|
||||||
|
1 & 3 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
\neq
|
||||||
|
\begin{bmatrix}
|
||||||
|
8 & 7 \\
|
||||||
|
3 & 2 \\
|
||||||
|
\end{bmatrix},
|
||||||
|
\]
|
||||||
|
but
|
||||||
|
\[
|
||||||
|
\begin{bmatrix}
|
||||||
|
6 & 8 \\
|
||||||
|
1 & 3 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
\begin{bmatrix}
|
||||||
|
3 \\
|
||||||
|
6 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
=
|
||||||
|
\begin{bmatrix}
|
||||||
|
8 & 7 \\
|
||||||
|
3 & 2 \\
|
||||||
|
\end{bmatrix}
|
||||||
|
\begin{bmatrix}
|
||||||
|
3 \\
|
||||||
|
6 \\
|
||||||
|
\end{bmatrix}.
|
||||||
|
\]
|
||||||
|
However, in practice, the probability that the
|
||||||
|
algorithm makes a mistake is small,
|
||||||
|
and we can decrease the probability by
|
||||||
|
verifying the result using multiple random vectors $X$
|
||||||
|
before reporting that $AB=C$.
|
||||||
|
|
||||||
|
\subsubsection{Graph coloring}
|
||||||
|
|
||||||
|
\index{coloring}
|
||||||
|
|
||||||
|
Given a graph that contains $n$ nodes and $m$ edges,
|
||||||
|
our task is to find a way to color the nodes
|
||||||
|
of the graph using two colors so that
|
||||||
|
for at least $m/2$ edges, the endpoints
|
||||||
|
have different colors.
|
||||||
|
For example, in the graph
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (1,3) {$1$};
|
||||||
|
\node[draw, circle] (2) at (4,3) {$2$};
|
||||||
|
\node[draw, circle] (3) at (1,1) {$3$};
|
||||||
|
\node[draw, circle] (4) at (4,1) {$4$};
|
||||||
|
\node[draw, circle] (5) at (6,2) {$5$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (4) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
a valid coloring is as follows:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle, fill=blue!40] (1) at (1,3) {$1$};
|
||||||
|
\node[draw, circle, fill=red!40] (2) at (4,3) {$2$};
|
||||||
|
\node[draw, circle, fill=red!40] (3) at (1,1) {$3$};
|
||||||
|
\node[draw, circle, fill=blue!40] (4) at (4,1) {$4$};
|
||||||
|
\node[draw, circle, fill=blue!40] (5) at (6,2) {$5$};
|
||||||
|
|
||||||
|
\path[draw,thick,-] (1) -- (2);
|
||||||
|
\path[draw,thick,-] (1) -- (3);
|
||||||
|
\path[draw,thick,-] (1) -- (4);
|
||||||
|
\path[draw,thick,-] (3) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (4);
|
||||||
|
\path[draw,thick,-] (2) -- (5);
|
||||||
|
\path[draw,thick,-] (4) -- (5);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
The above graph contains 7 edges, and for 5 of them,
|
||||||
|
the endpoints have different colors,
|
||||||
|
so the coloring is valid.
|
||||||
|
|
||||||
|
The problem can be solved using a Las Vegas algorithm
|
||||||
|
that generates random colorings until a valid coloring
|
||||||
|
has been found.
|
||||||
|
In a random coloring, the color of each node is
|
||||||
|
independently chosen so that the probability of
|
||||||
|
both colors is $1/2$.
|
||||||
|
|
||||||
|
In a random coloring, the probability that the endpoints
|
||||||
|
of a single edge have different colors is $1/2$.
|
||||||
|
Hence, the expected number of edges whose endpoints
|
||||||
|
have different colors is $m/2$.
|
||||||
|
Since a random coloring satisfies the requirement in expectation,
a valid coloring always exists, and in practice we quickly find one by trying random colorings.
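
The algorithm can be sketched in C++ as follows (the graph is the example above,
and the names are only illustrative):
\begin{lstlisting}
#include <bits/stdc++.h>
using namespace std;

int main() {
    int n = 5;
    vector<pair<int,int>> edges = {{1,2},{1,3},{1,4},{3,4},{2,4},{2,5},{4,5}};
    int m = edges.size();
    mt19937 rng(random_device{}());
    vector<int> color(n+1);
    while (true) {
        for (int i = 1; i <= n; i++) color[i] = rng()%2;  // random coloring
        int good = 0;
        for (auto [a,b] : edges)
            if (color[a] != color[b]) good++;
        if (2*good >= m) break;   // at least m/2 edges have differently colored endpoints
    }
    for (int i = 1; i <= n; i++) cout << color[i] << " ";
    cout << "\n";
}
\end{lstlisting}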
|
||||||
|
|
||||||
|
\chapter{Game theory}
|
||||||
|
|
||||||
|
In this chapter, we will focus on two-player
|
||||||
|
games that do not contain random elements.
|
||||||
|
Our goal is to find a strategy that we can
|
||||||
|
follow to win the game
|
||||||
|
no matter what the opponent does,
|
||||||
|
if such a strategy exists.
|
||||||
|
|
||||||
|
It turns out that there is a general strategy
|
||||||
|
for such games,
|
||||||
|
and we can analyze the games using the \key{nim theory}.
|
||||||
|
First, we will analyze simple games where
|
||||||
|
players remove sticks from heaps,
|
||||||
|
and after this, we will generalize the strategy
|
||||||
|
used in those games to other games.
|
||||||
|
|
||||||
|
\section{Game states}
|
||||||
|
|
||||||
|
Let us consider a game where there is initially
|
||||||
|
a heap of $n$ sticks.
|
||||||
|
Players $A$ and $B$ move alternately,
|
||||||
|
and player $A$ begins.
|
||||||
|
On each move, the player has to remove
|
||||||
|
1, 2 or 3 sticks from the heap,
|
||||||
|
and the player who removes the last stick wins the game.
|
||||||
|
|
||||||
|
For example, if $n=10$, the game may proceed as follows:
|
||||||
|
\begin{itemize}[noitemsep]
|
||||||
|
\item Player $A$ removes 2 sticks (8 sticks left).
|
||||||
|
\item Player $B$ removes 3 sticks (5 sticks left).
|
||||||
|
\item Player $A$ removes 1 stick (4 sticks left).
|
||||||
|
\item Player $B$ removes 2 sticks (2 sticks left).
|
||||||
|
\item Player $A$ removes 2 sticks and wins.
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
This game consists of states $0,1,2,\ldots,n$,
|
||||||
|
where the number of the state corresponds to
|
||||||
|
the number of sticks left.
|
||||||
|
|
||||||
|
\subsubsection{Winning and losing states}
|
||||||
|
|
||||||
|
\index{winning state}
|
||||||
|
\index{losing state}
|
||||||
|
|
||||||
|
A \key{winning state} is a state where
|
||||||
|
the player will win the game if they
|
||||||
|
play optimally,
|
||||||
|
and a \key{losing state} is a state
|
||||||
|
where the player will lose the game if the
|
||||||
|
opponent plays optimally.
|
||||||
|
It turns out that we can classify all states
|
||||||
|
of a game so that each state is either
|
||||||
|
a winning state or a losing state.
|
||||||
|
|
||||||
|
In the above game, state 0 is clearly a
|
||||||
|
losing state, because the player cannot make
|
||||||
|
any moves.
|
||||||
|
States 1, 2 and 3 are winning states,
|
||||||
|
because we can remove 1, 2 or 3 sticks
|
||||||
|
and win the game.
|
||||||
|
State 4, in turn, is a losing state,
|
||||||
|
because any move leads to a state that
|
||||||
|
is a winning state for the opponent.
|
||||||
|
|
||||||
|
More generally, if there is a move that leads
|
||||||
|
from the current state to a losing state,
|
||||||
|
the current state is a winning state,
|
||||||
|
and otherwise the current state is a losing state.
|
||||||
|
Using this observation, we can classify all states
|
||||||
|
of a game starting with losing states where
|
||||||
|
there are no possible moves.
|
||||||
|
|
||||||
|
The states $0 \ldots 15$ of the above game
|
||||||
|
can be classified as follows
|
||||||
|
($W$ denotes a winning state and $L$ denotes a losing state):
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\draw (0,0) grid (16,1);
|
||||||
|
|
||||||
|
\node at (0.5,0.5) {$L$};
|
||||||
|
\node at (1.5,0.5) {$W$};
|
||||||
|
\node at (2.5,0.5) {$W$};
|
||||||
|
\node at (3.5,0.5) {$W$};
|
||||||
|
\node at (4.5,0.5) {$L$};
|
||||||
|
\node at (5.5,0.5) {$W$};
|
||||||
|
\node at (6.5,0.5) {$W$};
|
||||||
|
\node at (7.5,0.5) {$W$};
|
||||||
|
\node at (8.5,0.5) {$L$};
|
||||||
|
\node at (9.5,0.5) {$W$};
|
||||||
|
\node at (10.5,0.5) {$W$};
|
||||||
|
\node at (11.5,0.5) {$W$};
|
||||||
|
\node at (12.5,0.5) {$L$};
|
||||||
|
\node at (13.5,0.5) {$W$};
|
||||||
|
\node at (14.5,0.5) {$W$};
|
||||||
|
\node at (15.5,0.5) {$W$};
|
||||||
|
|
||||||
|
\footnotesize
|
||||||
|
\node at (0.5,1.4) {$0$};
|
||||||
|
\node at (1.5,1.4) {$1$};
|
||||||
|
\node at (2.5,1.4) {$2$};
|
||||||
|
\node at (3.5,1.4) {$3$};
|
||||||
|
\node at (4.5,1.4) {$4$};
|
||||||
|
\node at (5.5,1.4) {$5$};
|
||||||
|
\node at (6.5,1.4) {$6$};
|
||||||
|
\node at (7.5,1.4) {$7$};
|
||||||
|
\node at (8.5,1.4) {$8$};
|
||||||
|
\node at (9.5,1.4) {$9$};
|
||||||
|
\node at (10.5,1.4) {$10$};
|
||||||
|
\node at (11.5,1.4) {$11$};
|
||||||
|
\node at (12.5,1.4) {$12$};
|
||||||
|
\node at (13.5,1.4) {$13$};
|
||||||
|
\node at (14.5,1.4) {$14$};
|
||||||
|
\node at (15.5,1.4) {$15$};
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
It is easy to analyze this game:
|
||||||
|
a state $k$ is a losing state if $k$ is
|
||||||
|
divisible by 4, and otherwise it
|
||||||
|
is a winning state.
|
||||||
|
An optimal way to play the game is
|
||||||
|
to always choose a move after which
|
||||||
|
the number of sticks in the heap
|
||||||
|
is divisible by 4.
|
||||||
|
Finally, there are no sticks left and
|
||||||
|
the opponent has lost.
|
||||||
|
|
||||||
|
Of course, this strategy requires that
|
||||||
|
the number of sticks is \emph{not} divisible by 4
|
||||||
|
when it is our move.
|
||||||
|
If it is, there is nothing we can do,
|
||||||
|
and the opponent will win the game if
|
||||||
|
they play optimally.
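
The classification above can be reproduced with a simple dynamic programming sketch
such as the following (the names are only illustrative):
\begin{lstlisting}
#include <bits/stdc++.h>
using namespace std;

int main() {
    int n = 15;
    // win[k] is true if the state with k sticks is a winning state
    vector<bool> win(n+1, false);          // state 0 has no moves, so it is losing
    for (int k = 1; k <= n; k++)
        for (int x = 1; x <= 3 && x <= k; x++)
            if (!win[k-x]) win[k] = true;  // a move to a losing state makes k winning
    for (int k = 0; k <= n; k++)
        cout << k << ": " << (win[k] ? "W" : "L") << "\n";
}
\end{lstlisting}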
|
||||||
|
|
||||||
|
\subsubsection{State graph}
|
||||||
|
|
||||||
|
Let us now consider another stick game,
|
||||||
|
where in each state $k$, it is allowed to remove
|
||||||
|
any number $x$ of sticks such that $x$
|
||||||
|
is smaller than $k$ and divides $k$.
|
||||||
|
For example, in state 8 we may remove
|
||||||
|
1, 2 or 4 sticks, but in state 7 the only
|
||||||
|
allowed move is to remove 1 stick.
|
||||||
|
|
||||||
|
The following picture shows the states
|
||||||
|
$1 \ldots 9$ of the game as a \key{state graph},
|
||||||
|
whose nodes are the states and edges are the moves between them:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (0,0) {$1$};
|
||||||
|
\node[draw, circle] (2) at (2,0) {$2$};
|
||||||
|
\node[draw, circle] (3) at (3.5,-1) {$3$};
|
||||||
|
\node[draw, circle] (4) at (1.5,-2) {$4$};
|
||||||
|
\node[draw, circle] (5) at (3,-2.75) {$5$};
|
||||||
|
\node[draw, circle] (6) at (2.5,-4.5) {$6$};
|
||||||
|
\node[draw, circle] (7) at (0.5,-3.25) {$7$};
|
||||||
|
\node[draw, circle] (8) at (-1,-4) {$8$};
|
||||||
|
\node[draw, circle] (9) at (1,-5.5) {$9$};
|
||||||
|
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- (1);
|
||||||
|
\path[draw,thick,->,>=latex] (3) edge [bend right=20] (2);
|
||||||
|
\path[draw,thick,->,>=latex] (4) edge [bend left=20] (2);
|
||||||
|
\path[draw,thick,->,>=latex] (4) edge [bend left=20] (3);
|
||||||
|
\path[draw,thick,->,>=latex] (5) edge [bend right=20] (4);
|
||||||
|
\path[draw,thick,->,>=latex] (6) edge [bend left=20] (5);
|
||||||
|
\path[draw,thick,->,>=latex] (6) edge [bend left=20] (4);
|
||||||
|
\path[draw,thick,->,>=latex] (6) edge [bend right=40] (3);
|
||||||
|
\path[draw,thick,->,>=latex] (7) edge [bend right=20] (6);
|
||||||
|
\path[draw,thick,->,>=latex] (8) edge [bend right=20] (7);
|
||||||
|
\path[draw,thick,->,>=latex] (8) edge [bend right=20] (6);
|
||||||
|
\path[draw,thick,->,>=latex] (8) edge [bend left=20] (4);
|
||||||
|
\path[draw,thick,->,>=latex] (9) edge [bend left=20] (8);
|
||||||
|
\path[draw,thick,->,>=latex] (9) edge [bend right=20] (6);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
The final state in this game is always state 1,
|
||||||
|
which is a losing state, because there are no
|
||||||
|
valid moves.
|
||||||
|
The classification of states $1 \ldots 9$
|
||||||
|
is as follows:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\draw (1,0) grid (10,1);
|
||||||
|
|
||||||
|
\node at (1.5,0.5) {$L$};
|
||||||
|
\node at (2.5,0.5) {$W$};
|
||||||
|
\node at (3.5,0.5) {$L$};
|
||||||
|
\node at (4.5,0.5) {$W$};
|
||||||
|
\node at (5.5,0.5) {$L$};
|
||||||
|
\node at (6.5,0.5) {$W$};
|
||||||
|
\node at (7.5,0.5) {$L$};
|
||||||
|
\node at (8.5,0.5) {$W$};
|
||||||
|
\node at (9.5,0.5) {$L$};
|
||||||
|
|
||||||
|
\footnotesize
|
||||||
|
\node at (1.5,1.4) {$1$};
|
||||||
|
\node at (2.5,1.4) {$2$};
|
||||||
|
\node at (3.5,1.4) {$3$};
|
||||||
|
\node at (4.5,1.4) {$4$};
|
||||||
|
\node at (5.5,1.4) {$5$};
|
||||||
|
\node at (6.5,1.4) {$6$};
|
||||||
|
\node at (7.5,1.4) {$7$};
|
||||||
|
\node at (8.5,1.4) {$8$};
|
||||||
|
\node at (9.5,1.4) {$9$};
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Surprisingly, in this game,
|
||||||
|
all even-numbered states are winning states,
|
||||||
|
and all odd-numbered states are losing states.
|
||||||
|
|
||||||
|
\section{Nim game}
|
||||||
|
|
||||||
|
\index{nim game}
|
||||||
|
|
||||||
|
The \key{nim game} is a simple game that
|
||||||
|
has an important role in game theory,
|
||||||
|
because many other games can be played using
|
||||||
|
the same strategy.
|
||||||
|
First, we focus on nim,
|
||||||
|
and then we generalize the strategy
|
||||||
|
to other games.
|
||||||
|
|
||||||
|
There are $n$ heaps in nim,
|
||||||
|
and each heap contains some number of sticks.
|
||||||
|
The players move alternately,
|
||||||
|
and on each turn, the player chooses
|
||||||
|
a heap that still contains sticks
|
||||||
|
and removes any number of sticks from it.
|
||||||
|
The winner is the player who removes the last stick.
|
||||||
|
|
||||||
|
The states in nim are of the form
|
||||||
|
$[x_1,x_2,\ldots,x_n]$,
|
||||||
|
where $x_k$ denotes the number of sticks in heap $k$.
|
||||||
|
For example, $[10,12,5]$ is a game where
|
||||||
|
there are three heaps with 10, 12 and 5 sticks.
|
||||||
|
The state $[0,0,\ldots,0]$ is a losing state,
|
||||||
|
because it is not possible to remove any sticks,
|
||||||
|
and this is always the final state.
|
||||||
|
|
||||||
|
\subsubsection{Analysis}
|
||||||
|
\index{nim sum}
|
||||||
|
|
||||||
|
It turns out that we can easily classify
|
||||||
|
any nim state by calculating
|
||||||
|
the \key{nim sum} $s = x_1 \oplus x_2 \oplus \cdots \oplus x_n$,
|
||||||
|
where $\oplus$ is the xor operation\footnote{The optimal strategy
|
||||||
|
for nim was published in 1901 by C. L. Bouton \cite{bou01}.}.
|
||||||
|
The states whose nim sum is 0 are losing states,
|
||||||
|
and all other states are winning states.
|
||||||
|
For example, the nim sum of
|
||||||
|
$[10,12,5]$ is $10 \oplus 12 \oplus 5 = 3$,
|
||||||
|
so the state is a winning state.
|
||||||
|
|
||||||
|
But how is the nim sum related to the nim game?
|
||||||
|
We can explain this by looking at how the nim
|
||||||
|
sum changes when the nim state changes.
|
||||||
|
|
||||||
|
\textit{Losing states:}
|
||||||
|
The final state $[0,0,\ldots,0]$ is a losing state,
|
||||||
|
and its nim sum is 0, as expected.
|
||||||
|
In other losing states, any move leads to
|
||||||
|
a winning state, because when a single value $x_k$ changes,
|
||||||
|
the nim sum also changes, so the nim sum
|
||||||
|
is different from 0 after the move.
|
||||||
|
|
||||||
|
\textit{Winning states:}
|
||||||
|
We can move to a losing state if
|
||||||
|
there is any heap $k$ for which $x_k \oplus s < x_k$.
|
||||||
|
In this case, we can remove sticks from
|
||||||
|
heap $k$ so that it will contain $x_k \oplus s$ sticks,
|
||||||
|
which will lead to a losing state.
|
||||||
|
There is always such a heap, where $x_k$
|
||||||
|
has a one bit at the position of the leftmost
|
||||||
|
one bit of $s$.
|
||||||
|
|
||||||
|
As an example, consider the state $[10,12,5]$.
|
||||||
|
This state is a winning state,
|
||||||
|
because its nim sum is 3.
|
||||||
|
Thus, there has to be a move which
|
||||||
|
leads to a losing state.
|
||||||
|
Next, we will find such a move.
|
||||||
|
|
||||||
|
The nim sum of the state is as follows:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tabular}{r|r}
|
||||||
|
10 & \texttt{1010} \\
|
||||||
|
12 & \texttt{1100} \\
|
||||||
|
5 & \texttt{0101} \\
|
||||||
|
\hline
|
||||||
|
3 & \texttt{0011} \\
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
In this case, the heap with 10 sticks
|
||||||
|
is the only heap that has a one bit
|
||||||
|
at the position of the leftmost
|
||||||
|
one bit of the nim sum:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tabular}{r|r}
|
||||||
|
10 & \texttt{10\underline{1}0} \\
|
||||||
|
12 & \texttt{1100} \\
|
||||||
|
5 & \texttt{0101} \\
|
||||||
|
\hline
|
||||||
|
3 & \texttt{00\underline{1}1} \\
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
The new size of the heap has to be
|
||||||
|
$10 \oplus 3 = 9$,
|
||||||
|
so we will remove just one stick.
|
||||||
|
After this, the state will be $[9,12,5]$,
|
||||||
|
which is a losing state:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tabular}{r|r}
|
||||||
|
9 & \texttt{1001} \\
|
||||||
|
12 & \texttt{1100} \\
|
||||||
|
5 & \texttt{0101} \\
|
||||||
|
\hline
|
||||||
|
0 & \texttt{0000} \\
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
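
The following sketch (names illustrative) computes the nim sum of a state and,
if the state is a winning state, finds a move that leads to a losing state;
for the state $[10,12,5]$ it suggests removing one stick from the first heap, as above:
\begin{lstlisting}
#include <bits/stdc++.h>
using namespace std;

int main() {
    vector<long long> x = {10, 12, 5};        // heap sizes
    long long s = 0;
    for (long long v : x) s ^= v;             // nim sum
    if (s == 0) {
        cout << "losing state\n";
    } else {
        for (int k = 0; k < (int)x.size(); k++) {
            if ((x[k]^s) < x[k]) {            // such a heap always exists
                cout << "remove " << x[k]-(x[k]^s)
                     << " sticks from heap " << k+1 << "\n";
                break;                        // the opponent is left in a losing state
            }
        }
    }
}
\end{lstlisting}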
|
||||||
|
|
||||||
|
\subsubsection{Misère game}
|
||||||
|
|
||||||
|
\index{misère game}
|
||||||
|
|
||||||
|
In a \key{misère game}, the goal of the game
|
||||||
|
is the opposite,
|
||||||
|
so the player who removes the last stick
|
||||||
|
loses the game.
|
||||||
|
It turns out that the misère nim game can be
|
||||||
|
optimally played almost like the standard nim game.
|
||||||
|
|
||||||
|
The idea is to first play the misère game
|
||||||
|
like the standard game, but change the strategy
|
||||||
|
at the end of the game.
|
||||||
|
The new strategy will be introduced in a situation
|
||||||
|
where each heap would contain at most one stick
|
||||||
|
after the next move.
|
||||||
|
|
||||||
|
In the standard game, we should choose a move
|
||||||
|
after which there is an even number of heaps with one stick.
|
||||||
|
However, in the misère game, we choose a move so that
|
||||||
|
there is an odd number of heaps with one stick.
|
||||||
|
|
||||||
|
This strategy works because a state where the
|
||||||
|
strategy changes always appears in the game,
|
||||||
|
and this state is a winning state, because
|
||||||
|
it contains exactly one heap that has more than one stick,
so the nim sum is not 0.
|
||||||
|
|
||||||
|
\section{Sprague–Grundy theorem}
|
||||||
|
|
||||||
|
\index{Sprague–Grundy theorem}
|
||||||
|
|
||||||
|
The \key{Sprague–Grundy theorem}\footnote{The theorem was
|
||||||
|
independently discovered by R. Sprague \cite{spr35} and P. M. Grundy \cite{gru39}.} generalizes the
|
||||||
|
strategy used in nim to all games that fulfil
|
||||||
|
the following requirements:
|
||||||
|
|
||||||
|
\begin{itemize}[noitemsep]
|
||||||
|
\item There are two players who move alternately.
|
||||||
|
\item The game consists of states, and the possible moves
|
||||||
|
in a state do not depend on whose turn it is.
|
||||||
|
\item The game ends when a player cannot make a move.
|
||||||
|
\item The game surely ends sooner or later.
|
||||||
|
\item The players have complete information about
|
||||||
|
the states and allowed moves, and there is no randomness in the game.
|
||||||
|
\end{itemize}
|
||||||
|
The idea is to calculate for each game state
|
||||||
|
a Grundy number that corresponds to the number of
|
||||||
|
sticks in a nim heap.
|
||||||
|
When we know the Grundy numbers of all states,
|
||||||
|
we can play the game like the nim game.
|
||||||
|
|
||||||
|
\subsubsection{Grundy numbers}
|
||||||
|
|
||||||
|
\index{Grundy number}
|
||||||
|
\index{mex function}
|
||||||
|
|
||||||
|
The \key{Grundy number} of a game state is
|
||||||
|
\[\textrm{mex}(\{g_1,g_2,\ldots,g_n\}),\]
|
||||||
|
where $g_1,g_2,\ldots,g_n$ are the Grundy numbers of the
|
||||||
|
states to which we can move,
|
||||||
|
and the mex function gives the smallest
|
||||||
|
nonnegative number that is not in the set.
|
||||||
|
For example, $\textrm{mex}(\{0,1,3\})=2$.
|
||||||
|
If there are no possible moves in a state,
|
||||||
|
its Grundy number is 0, because
|
||||||
|
$\textrm{mex}(\emptyset)=0$.
|
||||||
|
|
||||||
|
For example, in the state graph
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (0,0) {\phantom{0}};
|
||||||
|
\node[draw, circle] (2) at (2,0) {\phantom{0}};
|
||||||
|
\node[draw, circle] (3) at (4,0) {\phantom{0}};
|
||||||
|
\node[draw, circle] (4) at (1,-2) {\phantom{0}};
|
||||||
|
\node[draw, circle] (5) at (3,-2) {\phantom{0}};
|
||||||
|
\node[draw, circle] (6) at (5,-2) {\phantom{0}};
|
||||||
|
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- (1);
|
||||||
|
\path[draw,thick,->,>=latex] (3) -- (2);
|
||||||
|
\path[draw,thick,->,>=latex] (5) -- (4);
|
||||||
|
\path[draw,thick,->,>=latex] (6) -- (5);
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- (1);
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- (2);
|
||||||
|
\path[draw,thick,->,>=latex] (5) -- (2);
|
||||||
|
\path[draw,thick,->,>=latex] (6) -- (2);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
the Grundy numbers are as follows:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\node[draw, circle] (1) at (0,0) {0};
|
||||||
|
\node[draw, circle] (2) at (2,0) {1};
|
||||||
|
\node[draw, circle] (3) at (4,0) {0};
|
||||||
|
\node[draw, circle] (4) at (1,-2) {2};
|
||||||
|
\node[draw, circle] (5) at (3,-2) {0};
|
||||||
|
\node[draw, circle] (6) at (5,-2) {2};
|
||||||
|
|
||||||
|
\path[draw,thick,->,>=latex] (2) -- (1);
|
||||||
|
\path[draw,thick,->,>=latex] (3) -- (2);
|
||||||
|
\path[draw,thick,->,>=latex] (5) -- (4);
|
||||||
|
\path[draw,thick,->,>=latex] (6) -- (5);
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- (1);
|
||||||
|
\path[draw,thick,->,>=latex] (4) -- (2);
|
||||||
|
\path[draw,thick,->,>=latex] (5) -- (2);
|
||||||
|
\path[draw,thick,->,>=latex] (6) -- (2);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
The Grundy number of a losing state is 0,
|
||||||
|
and the Grundy number of a winning state is
|
||||||
|
a positive number.
|
||||||
|
|
||||||
|
The Grundy number of a state corresponds to
|
||||||
|
the number of sticks in a nim heap.
|
||||||
|
If the Grundy number is 0, we can only move to
|
||||||
|
states whose Grundy numbers are positive,
|
||||||
|
and if the Grundy number is $x>0$, we can move
|
||||||
|
to states whose Grundy numbers include all numbers
|
||||||
|
$0,1,\ldots,x-1$.
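
In code, Grundy numbers of an arbitrary state graph can be computed directly from the definition using memoized recursion. The following sketch assumes that the states are numbered $0,1,\ldots$ and that \texttt{graph[s]} lists the states to which we can move from state $s$ (both names are assumptions of this sketch):
\begin{lstlisting}
vector<vector<int>> graph;   // graph[s]: states to which we can move from state s
vector<int> memo;            // memoized Grundy numbers, -1 = not yet computed

int grundy(int s) {
    if (memo[s] != -1) return memo[s];
    set<int> values;
    for (int u : graph[s]) values.insert(grundy(u));
    int g = 0;
    while (values.count(g)) g++;   // mex: smallest nonnegative value not in the set
    return memo[s] = g;
}
\end{lstlisting}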
|
||||||
|
|
||||||
|
As an example, consider a game where
|
||||||
|
the players move a figure in a maze.
|
||||||
|
Each square in the maze is either floor or wall.
|
||||||
|
On each turn, the player has to move
|
||||||
|
the figure some number
|
||||||
|
of steps left or up.
|
||||||
|
The winner of the game is the player who
|
||||||
|
makes the last move.
|
||||||
|
|
||||||
|
The following picture shows a possible initial state
|
||||||
|
of the game, where @ denotes the figure and *
|
||||||
|
denotes a square where it can move.
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=.65]
|
||||||
|
\begin{scope}
|
||||||
|
\fill [color=black] (0, 1) rectangle (1, 2);
|
||||||
|
\fill [color=black] (0, 3) rectangle (1, 4);
|
||||||
|
\fill [color=black] (2, 2) rectangle (3, 3);
|
||||||
|
\fill [color=black] (2, 4) rectangle (3, 5);
|
||||||
|
\fill [color=black] (4, 3) rectangle (5, 4);
|
||||||
|
|
||||||
|
\draw (0, 0) grid (5, 5);
|
||||||
|
|
||||||
|
\node at (4.5,0.5) {@};
|
||||||
|
\node at (3.5,0.5) {*};
|
||||||
|
\node at (2.5,0.5) {*};
|
||||||
|
\node at (1.5,0.5) {*};
|
||||||
|
\node at (0.5,0.5) {*};
|
||||||
|
\node at (4.5,1.5) {*};
|
||||||
|
\node at (4.5,2.5) {*};
|
||||||
|
|
||||||
|
\end{scope}
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
The states of the game are all floor squares
|
||||||
|
of the maze.
|
||||||
|
In the above maze, the Grundy numbers
|
||||||
|
are as follows:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=.65]
|
||||||
|
\begin{scope}
|
||||||
|
\fill [color=black] (0, 1) rectangle (1, 2);
|
||||||
|
\fill [color=black] (0, 3) rectangle (1, 4);
|
||||||
|
\fill [color=black] (2, 2) rectangle (3, 3);
|
||||||
|
\fill [color=black] (2, 4) rectangle (3, 5);
|
||||||
|
\fill [color=black] (4, 3) rectangle (5, 4);
|
||||||
|
|
||||||
|
\draw (0, 0) grid (5, 5);
|
||||||
|
|
||||||
|
\node at (0.5,4.5) {0};
|
||||||
|
\node at (1.5,4.5) {1};
|
||||||
|
\node at (2.5,4.5) {};
|
||||||
|
\node at (3.5,4.5) {0};
|
||||||
|
\node at (4.5,4.5) {1};
|
||||||
|
|
||||||
|
\node at (0.5,3.5) {};
|
||||||
|
\node at (1.5,3.5) {0};
|
||||||
|
\node at (2.5,3.5) {1};
|
||||||
|
\node at (3.5,3.5) {2};
|
||||||
|
\node at (4.5,3.5) {};
|
||||||
|
|
||||||
|
\node at (0.5,2.5) {0};
|
||||||
|
\node at (1.5,2.5) {2};
|
||||||
|
\node at (2.5,2.5) {};
|
||||||
|
\node at (3.5,2.5) {1};
|
||||||
|
\node at (4.5,2.5) {0};
|
||||||
|
|
||||||
|
\node at (0.5,1.5) {};
|
||||||
|
\node at (1.5,1.5) {3};
|
||||||
|
\node at (2.5,1.5) {0};
|
||||||
|
\node at (3.5,1.5) {4};
|
||||||
|
\node at (4.5,1.5) {1};
|
||||||
|
|
||||||
|
\node at (0.5,0.5) {0};
|
||||||
|
\node at (1.5,0.5) {4};
|
||||||
|
\node at (2.5,0.5) {1};
|
||||||
|
\node at (3.5,0.5) {3};
|
||||||
|
\node at (4.5,0.5) {2};
|
||||||
|
\end{scope}
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Thus, each state of the maze game
|
||||||
|
corresponds to a heap in the nim game.
|
||||||
|
For example, the Grundy number for
|
||||||
|
the lower-right square is 2,
|
||||||
|
so it is a winning state.
|
||||||
|
We can reach a losing state and
|
||||||
|
win the game by moving
|
||||||
|
either four steps left or
|
||||||
|
two steps up.
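
The Grundy numbers of such a maze can be computed with two nested loops. The following sketch assumes a $5 \times 5$ grid where \texttt{isFloor[y][x]} tells whether a square is floor (here every square is initialized as floor, and the walls of the actual maze should be filled in), row $0$ is the top row, and moving up or left decreases $y$ or $x$:
\begin{lstlisting}
int n = 5;                                        // grid size of the example maze
vector<vector<bool>> isFloor(n, vector<bool>(n, true));
vector<vector<int>> grundy(n, vector<int>(n, 0));
for (int y = 0; y < n; y++) {
    for (int x = 0; x < n; x++) {
        if (!isFloor[y][x]) continue;             // walls are not game states
        set<int> values;
        // squares reachable by moving left, until a wall or the border
        for (int i = x-1; i >= 0 && isFloor[y][i]; i--) values.insert(grundy[y][i]);
        // squares reachable by moving up (row 0 is the top row)
        for (int i = y-1; i >= 0 && isFloor[i][x]; i--) values.insert(grundy[i][x]);
        int g = 0;
        while (values.count(g)) g++;              // mex of the collected values
        grundy[y][x] = g;
    }
}
\end{lstlisting}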
|
||||||
|
|
||||||
|
Note that unlike in the original nim game,
|
||||||
|
it may be possible to move to a state whose
|
||||||
|
Grundy number is larger than the Grundy number
|
||||||
|
of the current state.
|
||||||
|
However, the opponent can always choose a move
|
||||||
|
that cancels such a move, so it is not possible
|
||||||
|
to escape from a losing state.
|
||||||
|
|
||||||
|
\subsubsection{Subgames}
|
||||||
|
|
||||||
|
Next we will assume that our game consists
|
||||||
|
of subgames, and on each turn, the player
|
||||||
|
first chooses a subgame and then a move in the subgame.
|
||||||
|
The game ends when it is not possible to make any move
|
||||||
|
in any subgame.
|
||||||
|
|
||||||
|
In this case, the Grundy number of a game
|
||||||
|
is the nim sum of the Grundy numbers of the subgames.
|
||||||
|
The game can be played like a nim game by calculating
|
||||||
|
all Grundy numbers for subgames and then their nim sum.
|
||||||
|
|
||||||
|
As an example, consider a game that consists
|
||||||
|
of three mazes.
|
||||||
|
In this game, on each turn, the player chooses one
|
||||||
|
of the mazes and then moves the figure in the maze.
|
||||||
|
Assume that the initial state of the game is as follows:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tabular}{ccc}
|
||||||
|
\begin{tikzpicture}[scale=.55]
|
||||||
|
\begin{scope}
|
||||||
|
\fill [color=black] (0, 1) rectangle (1, 2);
|
||||||
|
\fill [color=black] (0, 3) rectangle (1, 4);
|
||||||
|
\fill [color=black] (2, 2) rectangle (3, 3);
|
||||||
|
\fill [color=black] (2, 4) rectangle (3, 5);
|
||||||
|
\fill [color=black] (4, 3) rectangle (5, 4);
|
||||||
|
|
||||||
|
\draw (0, 0) grid (5, 5);
|
||||||
|
|
||||||
|
\node at (4.5,0.5) {@};
|
||||||
|
|
||||||
|
\end{scope}
|
||||||
|
\end{tikzpicture}
|
||||||
|
&
|
||||||
|
\begin{tikzpicture}[scale=.55]
|
||||||
|
\begin{scope}
|
||||||
|
\fill [color=black] (1, 1) rectangle (2, 3);
|
||||||
|
\fill [color=black] (2, 3) rectangle (3, 4);
|
||||||
|
\fill [color=black] (4, 4) rectangle (5, 5);
|
||||||
|
|
||||||
|
\draw (0, 0) grid (5, 5);
|
||||||
|
|
||||||
|
\node at (4.5,0.5) {@};
|
||||||
|
|
||||||
|
\end{scope}
|
||||||
|
\end{tikzpicture}
|
||||||
|
&
|
||||||
|
\begin{tikzpicture}[scale=.55]
|
||||||
|
\begin{scope}
|
||||||
|
\fill [color=black] (1, 1) rectangle (4, 4);
|
||||||
|
|
||||||
|
\draw (0, 0) grid (5, 5);
|
||||||
|
|
||||||
|
\node at (4.5,0.5) {@};
|
||||||
|
\end{scope}
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
The Grundy numbers for the mazes are as follows:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tabular}{ccc}
|
||||||
|
\begin{tikzpicture}[scale=.55]
|
||||||
|
\begin{scope}
|
||||||
|
\fill [color=black] (0, 1) rectangle (1, 2);
|
||||||
|
\fill [color=black] (0, 3) rectangle (1, 4);
|
||||||
|
\fill [color=black] (2, 2) rectangle (3, 3);
|
||||||
|
\fill [color=black] (2, 4) rectangle (3, 5);
|
||||||
|
\fill [color=black] (4, 3) rectangle (5, 4);
|
||||||
|
|
||||||
|
\draw (0, 0) grid (5, 5);
|
||||||
|
|
||||||
|
\node at (0.5,4.5) {0};
|
||||||
|
\node at (1.5,4.5) {1};
|
||||||
|
\node at (2.5,4.5) {};
|
||||||
|
\node at (3.5,4.5) {0};
|
||||||
|
\node at (4.5,4.5) {1};
|
||||||
|
|
||||||
|
\node at (0.5,3.5) {};
|
||||||
|
\node at (1.5,3.5) {0};
|
||||||
|
\node at (2.5,3.5) {1};
|
||||||
|
\node at (3.5,3.5) {2};
|
||||||
|
\node at (4.5,3.5) {};
|
||||||
|
|
||||||
|
\node at (0.5,2.5) {0};
|
||||||
|
\node at (1.5,2.5) {2};
|
||||||
|
\node at (2.5,2.5) {};
|
||||||
|
\node at (3.5,2.5) {1};
|
||||||
|
\node at (4.5,2.5) {0};
|
||||||
|
|
||||||
|
\node at (0.5,1.5) {};
|
||||||
|
\node at (1.5,1.5) {3};
|
||||||
|
\node at (2.5,1.5) {0};
|
||||||
|
\node at (3.5,1.5) {4};
|
||||||
|
\node at (4.5,1.5) {1};
|
||||||
|
|
||||||
|
\node at (0.5,0.5) {0};
|
||||||
|
\node at (1.5,0.5) {4};
|
||||||
|
\node at (2.5,0.5) {1};
|
||||||
|
\node at (3.5,0.5) {3};
|
||||||
|
\node at (4.5,0.5) {2};
|
||||||
|
\end{scope}
|
||||||
|
\end{tikzpicture}
|
||||||
|
&
|
||||||
|
\begin{tikzpicture}[scale=.55]
|
||||||
|
\begin{scope}
|
||||||
|
\fill [color=black] (1, 1) rectangle (2, 3);
|
||||||
|
\fill [color=black] (2, 3) rectangle (3, 4);
|
||||||
|
\fill [color=black] (4, 4) rectangle (5, 5);
|
||||||
|
|
||||||
|
\draw (0, 0) grid (5, 5);
|
||||||
|
|
||||||
|
\node at (0.5,4.5) {0};
|
||||||
|
\node at (1.5,4.5) {1};
|
||||||
|
\node at (2.5,4.5) {2};
|
||||||
|
\node at (3.5,4.5) {3};
|
||||||
|
\node at (4.5,4.5) {};
|
||||||
|
|
||||||
|
\node at (0.5,3.5) {1};
|
||||||
|
\node at (1.5,3.5) {0};
|
||||||
|
\node at (2.5,3.5) {};
|
||||||
|
\node at (3.5,3.5) {0};
|
||||||
|
\node at (4.5,3.5) {1};
|
||||||
|
|
||||||
|
\node at (0.5,2.5) {2};
|
||||||
|
\node at (1.5,2.5) {};
|
||||||
|
\node at (2.5,2.5) {0};
|
||||||
|
\node at (3.5,2.5) {1};
|
||||||
|
\node at (4.5,2.5) {2};
|
||||||
|
|
||||||
|
\node at (0.5,1.5) {3};
|
||||||
|
\node at (1.5,1.5) {};
|
||||||
|
\node at (2.5,1.5) {1};
|
||||||
|
\node at (3.5,1.5) {2};
|
||||||
|
\node at (4.5,1.5) {0};
|
||||||
|
|
||||||
|
\node at (0.5,0.5) {4};
|
||||||
|
\node at (1.5,0.5) {0};
|
||||||
|
\node at (2.5,0.5) {2};
|
||||||
|
\node at (3.5,0.5) {5};
|
||||||
|
\node at (4.5,0.5) {3};
|
||||||
|
\end{scope}
|
||||||
|
\end{tikzpicture}
|
||||||
|
&
|
||||||
|
\begin{tikzpicture}[scale=.55]
|
||||||
|
\begin{scope}
|
||||||
|
\fill [color=black] (1, 1) rectangle (4, 4);
|
||||||
|
|
||||||
|
\draw (0, 0) grid (5, 5);
|
||||||
|
|
||||||
|
\node at (0.5,4.5) {0};
|
||||||
|
\node at (1.5,4.5) {1};
|
||||||
|
\node at (2.5,4.5) {2};
|
||||||
|
\node at (3.5,4.5) {3};
|
||||||
|
\node at (4.5,4.5) {4};
|
||||||
|
|
||||||
|
\node at (0.5,3.5) {1};
|
||||||
|
\node at (1.5,3.5) {};
|
||||||
|
\node at (2.5,3.5) {};
|
||||||
|
\node at (3.5,3.5) {};
|
||||||
|
\node at (4.5,3.5) {0};
|
||||||
|
|
||||||
|
\node at (0.5,2.5) {2};
|
||||||
|
\node at (1.5,2.5) {};
|
||||||
|
\node at (2.5,2.5) {};
|
||||||
|
\node at (3.5,2.5) {};
|
||||||
|
\node at (4.5,2.5) {1};
|
||||||
|
|
||||||
|
\node at (0.5,1.5) {3};
|
||||||
|
\node at (1.5,1.5) {};
|
||||||
|
\node at (2.5,1.5) {};
|
||||||
|
\node at (3.5,1.5) {};
|
||||||
|
\node at (4.5,1.5) {2};
|
||||||
|
|
||||||
|
\node at (0.5,0.5) {4};
|
||||||
|
\node at (1.5,0.5) {0};
|
||||||
|
\node at (2.5,0.5) {1};
|
||||||
|
\node at (3.5,0.5) {2};
|
||||||
|
\node at (4.5,0.5) {3};
|
||||||
|
\end{scope}
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
In the initial state, the nim sum of the Grundy numbers
|
||||||
|
is $2 \oplus 3 \oplus 3 = 2$, so
|
||||||
|
the first player can win the game.
|
||||||
|
One optimal move is to move two steps up
|
||||||
|
in the first maze, which produces the nim sum
|
||||||
|
$0 \oplus 3 \oplus 3 = 0$.
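
In code, the nim sum is a plain XOR of the subgame Grundy numbers; for the example above:
\begin{lstlisting}
vector<int> g = {2, 3, 3};     // Grundy numbers of the three mazes
int nimSum = 0;
for (int x : g) nimSum ^= x;   // nimSum = 2: nonzero, so the position is winning
\end{lstlisting}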
|
||||||
|
|
||||||
|
\subsubsection{Grundy's game}
|
||||||
|
|
||||||
|
Sometimes a move in a game divides the game
|
||||||
|
into subgames that are independent of each other.
|
||||||
|
In this case, the Grundy number of the game is
|
||||||
|
|
||||||
|
\[\textrm{mex}(\{g_1, g_2, \ldots, g_n \}),\]
|
||||||
|
where $n$ is the number of possible moves and
|
||||||
|
\[g_k = a_{k,1} \oplus a_{k,2} \oplus \ldots \oplus a_{k,m},\]
|
||||||
|
where move $k$ generates subgames with
|
||||||
|
Grundy numbers $a_{k,1},a_{k,2},\ldots,a_{k,m}$.
|
||||||
|
|
||||||
|
\index{Grundy's game}
|
||||||
|
|
||||||
|
An example of such a game is \key{Grundy's game}.
|
||||||
|
Initially, there is a single heap that contains $n$ sticks.
|
||||||
|
On each turn, the player chooses a heap and divides
|
||||||
|
it into two nonempty heaps such that the heaps
|
||||||
|
are of different size.
|
||||||
|
The player who makes the last move wins the game.
|
||||||
|
|
||||||
|
Let $f(n)$ be the Grundy number of a heap
|
||||||
|
that contains $n$ sticks.
|
||||||
|
The Grundy number can be calculated by going
|
||||||
|
through all ways to divide the heap into
|
||||||
|
two heaps.
|
||||||
|
For example, when $n=8$, the possibilities
|
||||||
|
are $1+7$, $2+6$ and $3+5$, so
|
||||||
|
\[f(8)=\textrm{mex}(\{f(1) \oplus f(7), f(2) \oplus f(6), f(3) \oplus f(5)\}).\]
|
||||||
|
|
||||||
|
In this game, the value of $f(n)$ is based on the values
|
||||||
|
of $f(1),\ldots,f(n-1)$.
|
||||||
|
The base cases are $f(1)=f(2)=0$,
|
||||||
|
because it is not possible to divide the heaps
|
||||||
|
of 1 and 2 sticks.
|
||||||
|
The first Grundy numbers are:
|
||||||
|
\[
|
||||||
|
\begin{array}{lcl}
|
||||||
|
f(1) & = & 0 \\
|
||||||
|
f(2) & = & 0 \\
|
||||||
|
f(3) & = & 1 \\
|
||||||
|
f(4) & = & 0 \\
|
||||||
|
f(5) & = & 2 \\
|
||||||
|
f(6) & = & 1 \\
|
||||||
|
f(7) & = & 0 \\
|
||||||
|
f(8) & = & 2 \\
|
||||||
|
\end{array}
|
||||||
|
\]
|
||||||
|
The Grundy number for $n=8$ is 2,
|
||||||
|
so it is possible to win the game.
|
||||||
|
The winning move is to create heaps
|
||||||
|
$1+7$, because $f(1) \oplus f(7) = 0$.
|
||||||
|
|
|
||||||
|
\chapter{Square root algorithms}
|
||||||
|
|
||||||
|
\index{square root algorithm}
|
||||||
|
|
||||||
|
A \key{square root algorithm} is an algorithm
|
||||||
|
that has a square root in its time complexity.
|
||||||
|
A square root can be seen as a ``poor man's logarithm'':
|
||||||
|
the complexity $O(\sqrt n)$ is better than $O(n)$
|
||||||
|
but worse than $O(\log n)$.
|
||||||
|
In any case, many square root algorithms are fast and usable in practice.
|
||||||
|
|
||||||
|
As an example, consider the problem of
|
||||||
|
creating a data structure that supports
|
||||||
|
two operations on an array:
|
||||||
|
modifying an element at a given position
|
||||||
|
and calculating the sum of elements in the given range.
|
||||||
|
We have previously solved the problem using
|
||||||
|
binary indexed and segment trees,
|
||||||
|
which support both operations in $O(\log n)$ time.
|
||||||
|
However, now we will solve the problem
|
||||||
|
in another way using a square root structure
|
||||||
|
that allows us to modify elements in $O(1)$ time
|
||||||
|
and calculate sums in $O(\sqrt n)$ time.
|
||||||
|
|
||||||
|
The idea is to divide the array into \emph{blocks}
|
||||||
|
of size $\sqrt n$ so that each block contains
|
||||||
|
the sum of elements inside the block.
|
||||||
|
For example, an array of 16 elements will be
|
||||||
|
divided into blocks of 4 elements as follows:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\draw (0,0) grid (16,1);
|
||||||
|
|
||||||
|
\draw (0,1) rectangle (4,2);
|
||||||
|
\draw (4,1) rectangle (8,2);
|
||||||
|
\draw (8,1) rectangle (12,2);
|
||||||
|
\draw (12,1) rectangle (16,2);
|
||||||
|
|
||||||
|
\node at (0.5, 0.5) {5};
|
||||||
|
\node at (1.5, 0.5) {8};
|
||||||
|
\node at (2.5, 0.5) {6};
|
||||||
|
\node at (3.5, 0.5) {3};
|
||||||
|
\node at (4.5, 0.5) {2};
|
||||||
|
\node at (5.5, 0.5) {7};
|
||||||
|
\node at (6.5, 0.5) {2};
|
||||||
|
\node at (7.5, 0.5) {6};
|
||||||
|
\node at (8.5, 0.5) {7};
|
||||||
|
\node at (9.5, 0.5) {1};
|
||||||
|
\node at (10.5, 0.5) {7};
|
||||||
|
\node at (11.5, 0.5) {5};
|
||||||
|
\node at (12.5, 0.5) {6};
|
||||||
|
\node at (13.5, 0.5) {2};
|
||||||
|
\node at (14.5, 0.5) {3};
|
||||||
|
\node at (15.5, 0.5) {2};
|
||||||
|
|
||||||
|
\node at (2, 1.5) {21};
|
||||||
|
\node at (6, 1.5) {17};
|
||||||
|
\node at (10, 1.5) {20};
|
||||||
|
\node at (14, 1.5) {13};
|
||||||
|
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
In this structure,
|
||||||
|
it is easy to modify array elements,
|
||||||
|
because we only need to update
|
||||||
|
the sum of a single block
|
||||||
|
after each modification,
|
||||||
|
which can be done in $O(1)$ time.
|
||||||
|
For example, the following picture shows
|
||||||
|
how the value of an element and
|
||||||
|
the sum of the corresponding block change:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\fill[color=lightgray] (5,0) rectangle (6,1);
|
||||||
|
\draw (0,0) grid (16,1);
|
||||||
|
|
||||||
|
\fill[color=lightgray] (4,1) rectangle (8,2);
|
||||||
|
\draw (0,1) rectangle (4,2);
|
||||||
|
\draw (4,1) rectangle (8,2);
|
||||||
|
\draw (8,1) rectangle (12,2);
|
||||||
|
\draw (12,1) rectangle (16,2);
|
||||||
|
|
||||||
|
\node at (0.5, 0.5) {5};
|
||||||
|
\node at (1.5, 0.5) {8};
|
||||||
|
\node at (2.5, 0.5) {6};
|
||||||
|
\node at (3.5, 0.5) {3};
|
||||||
|
\node at (4.5, 0.5) {2};
|
||||||
|
\node at (5.5, 0.5) {5};
|
||||||
|
\node at (6.5, 0.5) {2};
|
||||||
|
\node at (7.5, 0.5) {6};
|
||||||
|
\node at (8.5, 0.5) {7};
|
||||||
|
\node at (9.5, 0.5) {1};
|
||||||
|
\node at (10.5, 0.5) {7};
|
||||||
|
\node at (11.5, 0.5) {5};
|
||||||
|
\node at (12.5, 0.5) {6};
|
||||||
|
\node at (13.5, 0.5) {2};
|
||||||
|
\node at (14.5, 0.5) {3};
|
||||||
|
\node at (15.5, 0.5) {2};
|
||||||
|
|
||||||
|
\node at (2, 1.5) {21};
|
||||||
|
\node at (6, 1.5) {15};
|
||||||
|
\node at (10, 1.5) {20};
|
||||||
|
\node at (14, 1.5) {13};
|
||||||
|
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Then, to calculate the sum of elements in a range,
|
||||||
|
we divide the range into three parts such that
|
||||||
|
the sum consists of values of single elements
|
||||||
|
and sums of blocks between them:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\fill[color=lightgray] (3,0) rectangle (4,1);
|
||||||
|
\fill[color=lightgray] (12,0) rectangle (13,1);
|
||||||
|
\fill[color=lightgray] (13,0) rectangle (14,1);
|
||||||
|
\draw (0,0) grid (16,1);
|
||||||
|
|
||||||
|
\fill[color=lightgray] (4,1) rectangle (8,2);
|
||||||
|
\fill[color=lightgray] (8,1) rectangle (12,2);
|
||||||
|
\draw (0,1) rectangle (4,2);
|
||||||
|
\draw (4,1) rectangle (8,2);
|
||||||
|
\draw (8,1) rectangle (12,2);
|
||||||
|
\draw (12,1) rectangle (16,2);
|
||||||
|
|
||||||
|
\node at (0.5, 0.5) {5};
|
||||||
|
\node at (1.5, 0.5) {8};
|
||||||
|
\node at (2.5, 0.5) {6};
|
||||||
|
\node at (3.5, 0.5) {3};
|
||||||
|
\node at (4.5, 0.5) {2};
|
||||||
|
\node at (5.5, 0.5) {5};
|
||||||
|
\node at (6.5, 0.5) {2};
|
||||||
|
\node at (7.5, 0.5) {6};
|
||||||
|
\node at (8.5, 0.5) {7};
|
||||||
|
\node at (9.5, 0.5) {1};
|
||||||
|
\node at (10.5, 0.5) {7};
|
||||||
|
\node at (11.5, 0.5) {5};
|
||||||
|
\node at (12.5, 0.5) {6};
|
||||||
|
\node at (13.5, 0.5) {2};
|
||||||
|
\node at (14.5, 0.5) {3};
|
||||||
|
\node at (15.5, 0.5) {2};
|
||||||
|
|
||||||
|
\node at (2, 1.5) {21};
|
||||||
|
\node at (6, 1.5) {15};
|
||||||
|
\node at (10, 1.5) {20};
|
||||||
|
\node at (14, 1.5) {13};
|
||||||
|
|
||||||
|
\draw [decoration={brace}, decorate, line width=0.5mm] (14,-0.25) -- (3,-0.25);
|
||||||
|
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Since the number of single elements is $O(\sqrt n)$
|
||||||
|
and the number of blocks is also $O(\sqrt n)$,
|
||||||
|
the sum query takes $O(\sqrt n)$ time.
|
||||||
|
The purpose of the block size $\sqrt n$ is
|
||||||
|
that it \emph{balances} two things:
|
||||||
|
the array is divided into $\sqrt n$ blocks,
|
||||||
|
each of which contains $\sqrt n$ elements.
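
A minimal sketch of the structure could look as follows; here \texttt{a} is the array, \texttt{blocks[i]} stores the sum of block $i$, and the block size \texttt{k} is chosen near $\sqrt n$ (all names are assumptions of this sketch, and $n$ is assumed to be divisible by $k$):
\begin{lstlisting}
int n = 16, k = 4;                       // array size and block size (about sqrt(n))
vector<long long> a(n), blocks(n/k, 0);

void update(int i, long long x) {        // set a[i] = x in O(1) time
    blocks[i/k] += x-a[i];
    a[i] = x;
}

long long sum(int l, int r) {            // sum of a[l..r] in O(sqrt(n)) time
    long long s = 0;
    while (l <= r && l%k != 0) s += a[l++];          // single elements on the left
    while (l+k-1 <= r) {s += blocks[l/k]; l += k;}   // whole blocks
    while (l <= r) s += a[l++];                      // single elements on the right
    return s;
}
\end{lstlisting}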
|
||||||
|
|
||||||
|
In practice, it is not necessary to use the
|
||||||
|
exact value of $\sqrt n$ as a parameter,
|
||||||
|
and instead we may use parameters $k$ and $n/k$ where $k$ is
|
||||||
|
different from $\sqrt n$.
|
||||||
|
The optimal parameter depends on the problem and input.
|
||||||
|
For example, if an algorithm often goes
|
||||||
|
through the blocks but rarely inspects
|
||||||
|
single elements inside the blocks,
|
||||||
|
it may be a good idea to divide the array into
|
||||||
|
$k < \sqrt n$ blocks, each of which contains $n/k > \sqrt n$
|
||||||
|
elements.
|
||||||
|
|
||||||
|
\section{Combining algorithms}
|
||||||
|
|
||||||
|
In this section we discuss two square root algorithms
|
||||||
|
that are based on combining two algorithms into one algorithm.
|
||||||
|
In both cases, we could use either of the algorithms
|
||||||
|
without the other
|
||||||
|
and solve the problem in $O(n^2)$ time.
|
||||||
|
However, by combining the algorithms, the running
|
||||||
|
time is only $O(n \sqrt n)$.
|
||||||
|
|
||||||
|
\subsubsection{Case processing}
|
||||||
|
|
||||||
|
Suppose that we are given a two-dimensional
|
||||||
|
grid that contains $n$ cells.
|
||||||
|
Each cell is assigned a letter,
|
||||||
|
and our task is to find two cells
|
||||||
|
with the same letter whose distance is minimum,
|
||||||
|
where the distance between cells
|
||||||
|
$(x_1,y_1)$ and $(x_2,y_2)$ is $|x_1-x_2|+|y_1-y_2|$.
|
||||||
|
For example, consider the following grid:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\node at (0.5,0.5) {A};
|
||||||
|
\node at (0.5,1.5) {B};
|
||||||
|
\node at (0.5,2.5) {C};
|
||||||
|
\node at (0.5,3.5) {A};
|
||||||
|
\node at (1.5,0.5) {C};
|
||||||
|
\node at (1.5,1.5) {D};
|
||||||
|
\node at (1.5,2.5) {E};
|
||||||
|
\node at (1.5,3.5) {F};
|
||||||
|
\node at (2.5,0.5) {B};
|
||||||
|
\node at (2.5,1.5) {A};
|
||||||
|
\node at (2.5,2.5) {G};
|
||||||
|
\node at (2.5,3.5) {B};
|
||||||
|
\node at (3.5,0.5) {D};
|
||||||
|
\node at (3.5,1.5) {F};
|
||||||
|
\node at (3.5,2.5) {E};
|
||||||
|
\node at (3.5,3.5) {A};
|
||||||
|
\draw (0,0) grid (4,4);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
In this case, the minimum distance is 2 between the two 'E' letters.
|
||||||
|
|
||||||
|
We can solve the problem by considering each letter separately.
|
||||||
|
Using this approach, the new problem is to calculate
|
||||||
|
the minimum distance
|
||||||
|
between two cells with a \emph{fixed} letter $c$.
|
||||||
|
We focus on two algorithms for this:
|
||||||
|
|
||||||
|
\emph{Algorithm 1:} Go through all pairs of cells with letter $c$,
|
||||||
|
and calculate the minimum distance between such cells.
|
||||||
|
This will take $O(k^2)$ time where $k$ is the number of cells with letter $c$.
|
||||||
|
|
||||||
|
\emph{Algorithm 2:} Perform a breadth-first search that simultaneously
|
||||||
|
starts at each cell with letter $c$. The minimum distance between
|
||||||
|
two cells with letter $c$ will be calculated in $O(n)$ time.
|
||||||
|
|
||||||
|
One way to solve the problem is to choose either of the
|
||||||
|
algorithms and use it for all letters.
|
||||||
|
If we use Algorithm 1, the running time is $O(n^2)$,
|
||||||
|
because all cells may contain the same letter,
|
||||||
|
and in this case $k=n$.
|
||||||
|
Also if we use Algorithm 2, the running time is $O(n^2)$,
|
||||||
|
because all cells may have different letters,
|
||||||
|
and in this case $n$ searches are needed.
|
||||||
|
|
||||||
|
However, we can \emph{combine} the two algorithms and
|
||||||
|
use different algorithms for different letters
|
||||||
|
depending on how many times each letter appears in the grid.
|
||||||
|
Assume that a letter $c$ appears $k$ times.
|
||||||
|
If $k \le \sqrt n$, we use Algorithm 1, and if $k > \sqrt n$,
|
||||||
|
we use Algorithm 2.
|
||||||
|
It turns out that by doing this, the total running time
|
||||||
|
of the algorithm is only $O(n \sqrt n)$.
|
||||||
|
|
||||||
|
First, suppose that we use Algorithm 1 for a letter $c$.
|
||||||
|
Since $c$ appears at most $\sqrt n$ times in the grid,
|
||||||
|
we compare each cell with letter $c$ $O(\sqrt n)$ times
|
||||||
|
with other cells.
|
||||||
|
Thus, the time used for processing all such cells is $O(n \sqrt n)$.
|
||||||
|
Then, suppose that we use Algorithm 2 for a letter $c$.
|
||||||
|
There are at most $\sqrt n$ such letters,
|
||||||
|
so processing those letters also takes $O(n \sqrt n)$ time.
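
The following function sketches one possible implementation of the combined algorithm, assuming that the grid is given as a \texttt{vector<string>} with \texttt{h} rows and \texttt{w} columns and that the letters are \texttt{A}--\texttt{Z}. One way to realize Algorithm 2 is a multi-source breadth-first search that also stores, for each cell, which starting cell is nearest; the minimum distance for the letter is then found on a grid edge whose endpoints are nearest to different starting cells.
\begin{lstlisting}
int minSameLetterDist(vector<string>& g, int h, int w) {
    int best = INT_MAX;
    int dy[] = {0,0,1,-1}, dx[] = {1,-1,0,0};
    for (char c = 'A'; c <= 'Z'; c++) {
        vector<pair<int,int>> cells;
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                if (g[y][x] == c) cells.push_back({y,x});
        int k = cells.size();
        if (k < 2) continue;
        if ((long long)k*k <= (long long)h*w) {
            // Algorithm 1: all pairs of cells with letter c, O(k^2) time
            for (int i = 0; i < k; i++)
                for (int j = i+1; j < k; j++)
                    best = min(best, abs(cells[i].first-cells[j].first)
                                    +abs(cells[i].second-cells[j].second));
        } else {
            // Algorithm 2: multi-source BFS from all cells with letter c, O(n) time
            vector<vector<int>> dist(h, vector<int>(w, -1)), src(h, vector<int>(w, -1));
            queue<pair<int,int>> q;
            for (int i = 0; i < k; i++) {
                dist[cells[i].first][cells[i].second] = 0;
                src[cells[i].first][cells[i].second] = i;
                q.push(cells[i]);
            }
            while (!q.empty()) {
                auto [y,x] = q.front(); q.pop();
                for (int d = 0; d < 4; d++) {
                    int ny = y+dy[d], nx = x+dx[d];
                    if (ny < 0 || ny >= h || nx < 0 || nx >= w) continue;
                    if (dist[ny][nx] != -1) continue;
                    dist[ny][nx] = dist[y][x]+1;
                    src[ny][nx] = src[y][x];
                    q.push({ny,nx});
                }
            }
            // the closest pair of letter-c cells appears on a grid edge whose
            // endpoints are nearest to different starting cells
            for (int y = 0; y < h; y++)
                for (int x = 0; x < w; x++)
                    for (int d = 0; d < 4; d++) {
                        int ny = y+dy[d], nx = x+dx[d];
                        if (ny < 0 || ny >= h || nx < 0 || nx >= w) continue;
                        if (src[y][x] != src[ny][nx])
                            best = min(best, dist[y][x]+dist[ny][nx]+1);
                    }
        }
    }
    return best;
}
\end{lstlisting}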
|
||||||
|
|
||||||
|
\subsubsection{Batch processing}
|
||||||
|
|
||||||
|
Our next problem also deals with
|
||||||
|
a two-dimensional grid that contains $n$ cells.
|
||||||
|
Initially, each cell except one is white.
|
||||||
|
We perform $n-1$ operations, each of which first
|
||||||
|
calculates the minimum distance from a given white cell
|
||||||
|
to a black cell, and then paints the white cell black.
|
||||||
|
|
||||||
|
For example, consider the following operation:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\fill[color=black] (1,1) rectangle (2,2);
|
||||||
|
\fill[color=black] (3,1) rectangle (4,2);
|
||||||
|
\fill[color=black] (0,3) rectangle (1,4);
|
||||||
|
\node at (2.5,3.5) {*};
|
||||||
|
\draw (0,0) grid (4,4);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
First, we calculate the minimum distance
|
||||||
|
from the white cell marked with * to a black cell.
|
||||||
|
The minimum distance is 2, because we can move
|
||||||
|
two steps left to a black cell.
|
||||||
|
Then, we paint the white cell black:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\fill[color=black] (1,1) rectangle (2,2);
|
||||||
|
\fill[color=black] (3,1) rectangle (4,2);
|
||||||
|
\fill[color=black] (0,3) rectangle (1,4);
|
||||||
|
\fill[color=black] (2,3) rectangle (3,4);
|
||||||
|
\draw (0,0) grid (4,4);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
Consider the following two algorithms:
|
||||||
|
|
||||||
|
\emph{Algorithm 1:} Use breadth-first search
|
||||||
|
to calculate
|
||||||
|
for each white cell the distance to the nearest black cell.
|
||||||
|
This takes $O(n)$ time, and after the search,
|
||||||
|
we can find the minimum distance from any white cell
|
||||||
|
to a black cell in $O(1)$ time.
|
||||||
|
|
||||||
|
\emph{Algorithm 2:} Maintain a list of cells that have been
|
||||||
|
painted black, go through this list at each operation
|
||||||
|
and then add a new cell to the list.
|
||||||
|
An operation takes $O(k)$ time where $k$ is the length of the list.
|
||||||
|
|
||||||
|
We combine the above algorithms by
|
||||||
|
dividing the operations into
|
||||||
|
$O(\sqrt n)$ \emph{batches}, each of which consists
|
||||||
|
of $O(\sqrt n)$ operations.
|
||||||
|
At the beginning of each batch,
|
||||||
|
we perform Algorithm 1.
|
||||||
|
Then, we use Algorithm 2 to process the operations
|
||||||
|
in the batch.
|
||||||
|
We clear the list of Algorithm 2 between
|
||||||
|
the batches.
|
||||||
|
At each operation,
|
||||||
|
the minimum distance to a black cell
|
||||||
|
is either the distance calculated by Algorithm 1
|
||||||
|
or the distance calculated by Algorithm 2.
|
||||||
|
|
||||||
|
The resulting algorithm works in
|
||||||
|
$O(n \sqrt n)$ time.
|
||||||
|
First, Algorithm 1 is performed $O(\sqrt n)$ times,
|
||||||
|
and each search works in $O(n)$ time.
|
||||||
|
Second, when using Algorithm 2 in a batch,
|
||||||
|
the list contains $O(\sqrt n)$ cells
|
||||||
|
(because we clear the list between the batches)
|
||||||
|
and each operation takes $O(\sqrt n)$ time.
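
The following sketch outlines the batch idea, assuming an $h \times w$ grid, an initially black cell \texttt{first}, and a list \texttt{cells} where \texttt{cells[i]} is the white cell of operation $i$ (all names are assumptions of this sketch):
\begin{lstlisting}
vector<int> processOperations(int h, int w, pair<int,int> first,
                              vector<pair<int,int>>& cells) {
    vector<pair<int,int>> black = {first}, recent;
    vector<vector<int>> dist;
    vector<int> answers;
    int q = cells.size(), b = max(1, (int)sqrt((double)q));  // batch size ~ sqrt(q)
    int dy[] = {0,0,1,-1}, dx[] = {1,-1,0,0};
    for (int i = 0; i < q; i++) {
        if (i%b == 0) {
            // start of a batch: Algorithm 1, multi-source BFS from all black cells
            dist.assign(h, vector<int>(w, -1));
            queue<pair<int,int>> qu;
            for (auto [y,x] : black) {dist[y][x] = 0; qu.push({y,x});}
            while (!qu.empty()) {
                auto [y,x] = qu.front(); qu.pop();
                for (int d = 0; d < 4; d++) {
                    int ny = y+dy[d], nx = x+dx[d];
                    if (ny < 0 || ny >= h || nx < 0 || nx >= w) continue;
                    if (dist[ny][nx] != -1) continue;
                    dist[ny][nx] = dist[y][x]+1;
                    qu.push({ny,nx});
                }
            }
            recent.clear();
        }
        auto [y,x] = cells[i];
        int ans = dist[y][x];                 // distance to the older black cells
        for (auto [py,px] : recent)           // Algorithm 2: cells of this batch
            ans = min(ans, abs(y-py)+abs(x-px));
        answers.push_back(ans);
        black.push_back({y,x});               // paint the cell black
        recent.push_back({y,x});
    }
    return answers;
}
\end{lstlisting}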
|
||||||
|
|
||||||
|
\section{Integer partitions}
|
||||||
|
|
||||||
|
Some square root algorithms are based on
|
||||||
|
the following observation:
|
||||||
|
if a positive integer $n$ is represented as
|
||||||
|
a sum of positive integers,
|
||||||
|
such a sum always contains at most
|
||||||
|
$O(\sqrt n)$ \emph{distinct} numbers.
|
||||||
|
The reason for this is that to construct
|
||||||
|
a sum that contains a maximum number of distinct
|
||||||
|
numbers, we should choose \emph{small} numbers.
|
||||||
|
If we choose the numbers $1,2,\ldots,k$,
|
||||||
|
the resulting sum is
|
||||||
|
\[\frac{k(k+1)}{2}.\]
|
||||||
|
Since this sum may not exceed $n$, the maximum number of distinct numbers is $k = O(\sqrt n)$.
|
||||||
|
Next we will discuss two problems that can be solved
|
||||||
|
efficiently using this observation.
|
||||||
|
|
||||||
|
\subsubsection{Knapsack}
|
||||||
|
|
||||||
|
Suppose that we are given a list of integer weights
|
||||||
|
whose sum is $n$.
|
||||||
|
Our task is to find out all sums that can be formed using
|
||||||
|
a subset of the weights. For example, if the weights are
|
||||||
|
$\{1,3,3\}$, the possible sums are as follows:
|
||||||
|
|
||||||
|
\begin{itemize}[noitemsep]
|
||||||
|
\item $0$ (empty set)
|
||||||
|
\item $1$
|
||||||
|
\item $3$
|
||||||
|
\item $1+3=4$
|
||||||
|
\item $3+3=6$
|
||||||
|
\item $1+3+3=7$
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
Using the standard knapsack approach (see Chapter 7.4),
|
||||||
|
the problem can be solved as follows:
|
||||||
|
we define a function $\texttt{possible}(x,k)$ whose value is 1
|
||||||
|
if the sum $x$ can be formed using the first $k$ weights,
|
||||||
|
and 0 otherwise.
|
||||||
|
Since the sum of the weights is $n$,
|
||||||
|
there are at most $n$ weights and
|
||||||
|
all values of the function can be calculated
|
||||||
|
in $O(n^2)$ time using dynamic programming.
|
||||||
|
|
||||||
|
However, we can make the algorithm more efficient
|
||||||
|
by using the fact that there are at most $O(\sqrt n)$
|
||||||
|
\emph{distinct} weights.
|
||||||
|
Thus, we can process the weights in groups
|
||||||
|
that consist of equal weights.
|
||||||
|
We can process each group
|
||||||
|
in $O(n)$ time, which yields an $O(n \sqrt n)$ time algorithm.
|
||||||
|
|
||||||
|
The idea is to use an array that records the sums of weights
|
||||||
|
that can be formed using the groups processed so far.
|
||||||
|
The array contains $n$ elements: element $k$ is 1 if the sum
|
||||||
|
$k$ can be formed and 0 otherwise.
|
||||||
|
To process a group of weights, we scan the array
|
||||||
|
from left to right and record the new sums of weights that
|
||||||
|
can be formed using this group and the previous groups.
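
One possible way to implement the scan is the following sketch, where \texttt{weights} is the list of weights and $n$ is their sum (both assumptions of this sketch). For a group whose weight $w$ occurs $c$ times, each remainder class modulo $w$ is scanned from left to right, keeping track of how many of the $c$ copies are still available:
\begin{lstlisting}
vector<int> reach(n+1, 0);           // reach[x] = 1 if sum x can currently be formed
reach[0] = 1;
map<int,int> groups;                 // weight -> number of occurrences
for (int w : weights) groups[w]++;
for (auto [w, c] : groups) {         // O(sqrt(n)) groups of equal weights
    vector<int> newReach = reach;
    for (int r = 0; r < w; r++) {    // scan one remainder class modulo w
        int left = 0;                // copies of w still available on this chain
        for (int x = r; x <= n; x += w) {
            if (reach[x]) left = c;  // x was formable before this group
            else if (left > 0) {newReach[x] = 1; left--;}
        }
    }
    reach = newReach;
}
\end{lstlisting}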
|
||||||
|
|
||||||
|
\subsubsection{String construction}
|
||||||
|
|
||||||
|
Given a string \texttt{s} of length $n$
|
||||||
|
and a set of strings $D$ whose total length is $m$,
|
||||||
|
consider the problem of counting the number of ways
|
||||||
|
\texttt{s} can be formed as a concatenation of strings in $D$.
|
||||||
|
For example,
|
||||||
|
if $\texttt{s}=\texttt{ABAB}$ and
|
||||||
|
$D=\{\texttt{A},\texttt{B},\texttt{AB}\}$,
|
||||||
|
there are 4 ways:
|
||||||
|
|
||||||
|
\begin{itemize}[noitemsep]
|
||||||
|
\item $\texttt{A}+\texttt{B}+\texttt{A}+\texttt{B}$
|
||||||
|
\item $\texttt{AB}+\texttt{A}+\texttt{B}$
|
||||||
|
\item $\texttt{A}+\texttt{B}+\texttt{AB}$
|
||||||
|
\item $\texttt{AB}+\texttt{AB}$
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
We can solve the problem using dynamic programming:
|
||||||
|
Let $\texttt{count}(k)$ denote the number of ways to construct the prefix
|
||||||
|
$\texttt{s}[0 \ldots k]$ using the strings in $D$.
|
||||||
|
Now $\texttt{count}(n-1)$ gives the answer to the problem,
|
||||||
|
and we can solve the problem in $O(n^2)$ time
|
||||||
|
using a trie structure.
|
||||||
|
|
||||||
|
However, we can solve the problem more efficiently
|
||||||
|
by using string hashing and the fact that there
|
||||||
|
are at most $O(\sqrt m)$ distinct string lengths in $D$.
|
||||||
|
First, we construct a set $H$ that contains all
|
||||||
|
hash values of the strings in $D$.
|
||||||
|
Then, when calculating a value of $\texttt{count}(k)$,
|
||||||
|
we go through all values of $p$
|
||||||
|
such that there is a string of length $p$ in $D$,
|
||||||
|
calculate the hash value of $\texttt{s}[k-p+1 \ldots k]$
|
||||||
|
and check if it belongs to $H$.
|
||||||
|
Since there are at most $O(\sqrt m)$ distinct string lengths,
|
||||||
|
this results in an algorithm whose running time is $O(n \sqrt m)$.
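
The following sketch implements this idea with polynomial hashing and natural \texttt{unsigned long long} overflow; hash collisions are ignored for simplicity, and here $\texttt{ways}[k]$ counts the ways to form the prefix of length $k$:
\begin{lstlisting}
typedef unsigned long long ull;
const ull A = 911382323;                  // hash base, an arbitrary choice

long long countWays(string s, vector<string>& D) {
    int n = s.size();
    set<int> lengths;                     // O(sqrt(m)) distinct lengths
    set<ull> H;                           // hash values of the strings in D
    for (string& w : D) {
        lengths.insert(w.size());
        ull h = 0;
        for (char ch : w) h = h*A + ch;
        H.insert(h);
    }
    vector<ull> pref(n+1, 0), pw(n+1, 1); // prefix hashes and powers of A
    for (int i = 0; i < n; i++) {
        pref[i+1] = pref[i]*A + s[i];
        pw[i+1] = pw[i]*A;
    }
    vector<long long> ways(n+1, 0);
    ways[0] = 1;                          // the empty prefix can be formed in one way
    for (int k = 1; k <= n; k++) {
        for (int p : lengths) {
            if (p > k) break;
            ull h = pref[k]-pref[k-p]*pw[p];      // hash of s[k-p..k-1]
            if (H.count(h)) ways[k] += ways[k-p];
        }
    }
    return ways[n];
}
\end{lstlisting}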
|
||||||
|
|
||||||
|
\section{Mo's algorithm}
|
||||||
|
|
||||||
|
\index{Mo's algorithm}
|
||||||
|
|
||||||
|
\key{Mo's algorithm}\footnote{According to \cite{cod15}, this algorithm
|
||||||
|
is named after Mo Tao, a Chinese competitive programmer, but
|
||||||
|
the technique has appeared earlier in the literature \cite{ken06}.}
|
||||||
|
can be used in many problems
|
||||||
|
that require processing range queries in
|
||||||
|
a \emph{static} array, i.e., the array values
|
||||||
|
do not change between the queries.
|
||||||
|
In each query, we are given a range $[a,b]$,
|
||||||
|
and we should calculate a value based on the
|
||||||
|
array elements between positions $a$ and $b$.
|
||||||
|
Since the array is static,
|
||||||
|
the queries can be processed in any order,
|
||||||
|
and Mo's algorithm
|
||||||
|
processes the queries in a special order which guarantees
|
||||||
|
that the algorithm works efficiently.
|
||||||
|
|
||||||
|
Mo's algorithm maintains an \emph{active range}
|
||||||
|
of the array, and the answer to a query
|
||||||
|
concerning the active range is known at each moment.
|
||||||
|
The algorithm processes the queries one by one,
|
||||||
|
and always moves the endpoints of the
|
||||||
|
active range by inserting and removing elements.
|
||||||
|
The time complexity of the algorithm is
|
||||||
|
$O(n \sqrt n f(n))$ where the array contains
|
||||||
|
$n$ elements, there are $n$ queries
|
||||||
|
and each insertion and removal of an element
|
||||||
|
takes $O(f(n))$ time.
|
||||||
|
|
||||||
|
The trick in Mo's algorithm is the order
|
||||||
|
in which the queries are processed:
|
||||||
|
The array is divided into blocks of $k=O(\sqrt n)$
|
||||||
|
elements, and a query $[a_1,b_1]$
|
||||||
|
is processed before a query $[a_2,b_2]$
|
||||||
|
if either
|
||||||
|
\begin{itemize}
|
||||||
|
\item $\lfloor a_1/k \rfloor < \lfloor a_2/k \rfloor$ or
|
||||||
|
\item $\lfloor a_1/k \rfloor = \lfloor a_2/k \rfloor$ and $b_1 < b_2$.
|
||||||
|
\end{itemize}
|
||||||
|
|
||||||
|
Thus, all queries whose left endpoints are
|
||||||
|
in a certain block are processed one after another
|
||||||
|
sorted according to their right endpoints.
|
||||||
|
Using this order, the algorithm
|
||||||
|
only performs $O(n \sqrt n)$ operations,
|
||||||
|
because the left endpoint moves
|
||||||
|
$O(n)$ times $O(\sqrt n)$ steps,
|
||||||
|
and the right endpoint moves
|
||||||
|
$O(\sqrt n)$ times $O(n)$ steps. Thus, both
|
||||||
|
endpoints move a total of $O(n \sqrt n)$ steps during the algorithm.
|
||||||
|
|
||||||
|
\subsubsection*{Example}
|
||||||
|
|
||||||
|
As an example, consider a problem
|
||||||
|
where we are given a set of queries,
|
||||||
|
each of them corresponding to a range in an array,
|
||||||
|
and our task is to calculate for each query
|
||||||
|
the number of \emph{distinct} elements in the range.
|
||||||
|
|
||||||
|
In Mo's algorithm, the queries are always sorted
|
||||||
|
in the same way, but it depends on the problem
|
||||||
|
how the answer to the query is maintained.
|
||||||
|
In this problem, we can maintain an array
|
||||||
|
\texttt{count} where $\texttt{count}[x]$
|
||||||
|
indicates the number of times an element $x$
|
||||||
|
occurs in the active range.
|
||||||
|
|
||||||
|
When we move from one query to another query,
|
||||||
|
the active range changes.
|
||||||
|
For example, if the current range is
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\fill[color=lightgray] (1,0) rectangle (5,1);
|
||||||
|
\draw (0,0) grid (9,1);
|
||||||
|
\node at (0.5, 0.5) {4};
|
||||||
|
\node at (1.5, 0.5) {2};
|
||||||
|
\node at (2.5, 0.5) {5};
|
||||||
|
\node at (3.5, 0.5) {4};
|
||||||
|
\node at (4.5, 0.5) {2};
|
||||||
|
\node at (5.5, 0.5) {4};
|
||||||
|
\node at (6.5, 0.5) {3};
|
||||||
|
\node at (7.5, 0.5) {3};
|
||||||
|
\node at (8.5, 0.5) {4};
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
and the next range is
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.7]
|
||||||
|
\fill[color=lightgray] (2,0) rectangle (7,1);
|
||||||
|
\draw (0,0) grid (9,1);
|
||||||
|
\node at (0.5, 0.5) {4};
|
||||||
|
\node at (1.5, 0.5) {2};
|
||||||
|
\node at (2.5, 0.5) {5};
|
||||||
|
\node at (3.5, 0.5) {4};
|
||||||
|
\node at (4.5, 0.5) {2};
|
||||||
|
\node at (5.5, 0.5) {4};
|
||||||
|
\node at (6.5, 0.5) {3};
|
||||||
|
\node at (7.5, 0.5) {3};
|
||||||
|
\node at (8.5, 0.5) {4};
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
there will be three steps:
|
||||||
|
the left endpoint moves one step to the right,
|
||||||
|
and the right endpoint moves two steps to the right.
|
||||||
|
|
||||||
|
After each step, the array \texttt{count}
|
||||||
|
needs to be updated.
|
||||||
|
After adding an element $x$,
|
||||||
|
we increase the value of
|
||||||
|
$\texttt{count}[x]$ by 1,
|
||||||
|
and if $\texttt{count}[x]=1$ after this,
|
||||||
|
we also increase the answer to the query by 1.
|
||||||
|
Similarly, after removing an element $x$,
|
||||||
|
we decrease the value of
|
||||||
|
$\texttt{count}[x]$ by 1,
|
||||||
|
and if $\texttt{count}[x]=0$ after this,
|
||||||
|
we also decrease the answer to the query by 1.
|
||||||
|
|
||||||
|
In this problem, the time needed to perform
|
||||||
|
each step is $O(1)$, so the total time complexity
|
||||||
|
of the algorithm is $O(n \sqrt n)$.
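
A possible implementation of Mo's algorithm for this problem is sketched below, assuming $0$-indexed inclusive ranges and small nonnegative array values (so that the \texttt{cnt} array can be indexed directly by value):
\begin{lstlisting}
vector<int> moAlgorithm(vector<int>& a, vector<pair<int,int>>& queries) {
    int n = a.size(), m = queries.size();
    int k = max(1, (int)sqrt((double)n));           // block size ~ sqrt(n)
    vector<int> order(m), answers(m);
    iota(order.begin(), order.end(), 0);
    sort(order.begin(), order.end(), [&](int i, int j) {
        if (queries[i].first/k != queries[j].first/k)
            return queries[i].first/k < queries[j].first/k;
        return queries[i].second < queries[j].second;
    });
    vector<int> cnt(*max_element(a.begin(), a.end())+1, 0);
    int distinct = 0, L = 0, R = -1;                // the active range is empty at first
    auto add = [&](int i) { if (cnt[a[i]]++ == 0) distinct++; };
    auto del = [&](int i) { if (--cnt[a[i]] == 0) distinct--; };
    for (int idx : order) {
        int qa = queries[idx].first, qb = queries[idx].second;
        while (R < qb) add(++R);                    // first extend the range,
        while (L > qa) add(--L);
        while (R > qb) del(R--);                    // then shrink it
        while (L < qa) del(L++);
        answers[idx] = distinct;
    }
    return answers;
}
\end{lstlisting}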
|
|
||||||
|
\chapter{Geometry}
|
||||||
|
|
||||||
|
\index{geometry}
|
||||||
|
|
||||||
|
In geometric problems, it is often challenging
|
||||||
|
to find a way to approach the problem so that
|
||||||
|
the solution to the problem can be conveniently implemented
|
||||||
|
and the number of special cases is small.
|
||||||
|
|
||||||
|
As an example, consider a problem where
|
||||||
|
we are given the vertices of a quadrilateral
|
||||||
|
(a polygon that has four vertices),
|
||||||
|
and our task is to calculate its area.
|
||||||
|
For example, a possible input for the problem is as follows:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.45]
|
||||||
|
|
||||||
|
\draw[fill] (6,2) circle [radius=0.1];
|
||||||
|
\draw[fill] (5,6) circle [radius=0.1];
|
||||||
|
\draw[fill] (2,5) circle [radius=0.1];
|
||||||
|
\draw[fill] (1,1) circle [radius=0.1];
|
||||||
|
\draw[thick] (6,2) -- (5,6) -- (2,5) -- (1,1) -- (6,2);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
One way to approach the problem is to divide
|
||||||
|
the quadrilateral into two triangles by a straight
|
||||||
|
line between two opposite vertices:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.45]
|
||||||
|
|
||||||
|
\draw[fill] (6,2) circle [radius=0.1];
|
||||||
|
\draw[fill] (5,6) circle [radius=0.1];
|
||||||
|
\draw[fill] (2,5) circle [radius=0.1];
|
||||||
|
\draw[fill] (1,1) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw[thick] (6,2) -- (5,6) -- (2,5) -- (1,1) -- (6,2);
|
||||||
|
\draw[dashed,thick] (2,5) -- (6,2);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
After this, it suffices to sum the areas
|
||||||
|
of the triangles.
|
||||||
|
The area of a triangle can be calculated,
|
||||||
|
for example, using \key{Heron's formula}
|
||||||
|
%\footnote{Heron of Alexandria (c. 10--70) was a Greek mathematician.}
|
||||||
|
\[ \sqrt{s (s-a) (s-b) (s-c)},\]
|
||||||
|
where $a$, $b$ and $c$ are the lengths
|
||||||
|
of the triangle's sides and
|
||||||
|
$s=(a+b+c)/2$.
|
||||||
|
\index{Heron's formula}
|
||||||
|
|
||||||
|
This is a possible way to solve the problem,
|
||||||
|
but there is one pitfall:
|
||||||
|
how to divide the quadrilateral into triangles?
|
||||||
|
It turns out that sometimes we cannot just pick
|
||||||
|
two arbitrary opposite vertices.
|
||||||
|
For example, in the following situation,
|
||||||
|
the division line is \emph{outside} the quadrilateral:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.45]
|
||||||
|
|
||||||
|
\draw[fill] (6,2) circle [radius=0.1];
|
||||||
|
\draw[fill] (3,2) circle [radius=0.1];
|
||||||
|
\draw[fill] (2,5) circle [radius=0.1];
|
||||||
|
\draw[fill] (1,1) circle [radius=0.1];
|
||||||
|
\draw[thick] (6,2) -- (3,2) -- (2,5) -- (1,1) -- (6,2);
|
||||||
|
|
||||||
|
\draw[dashed,thick] (2,5) -- (6,2);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
However, another way to draw the line works:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.45]
|
||||||
|
|
||||||
|
\draw[fill] (6,2) circle [radius=0.1];
|
||||||
|
\draw[fill] (3,2) circle [radius=0.1];
|
||||||
|
\draw[fill] (2,5) circle [radius=0.1];
|
||||||
|
\draw[fill] (1,1) circle [radius=0.1];
|
||||||
|
\draw[thick] (6,2) -- (3,2) -- (2,5) -- (1,1) -- (6,2);
|
||||||
|
|
||||||
|
\draw[dashed,thick] (3,2) -- (1,1);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
It is clear to a human which of the lines is the correct
|
||||||
|
choice, but the situation is difficult for a computer.
|
||||||
|
|
||||||
|
However, it turns out that we can solve the problem using
|
||||||
|
another method that is more convenient for a programmer.
|
||||||
|
Namely, there is a general formula
|
||||||
|
\[x_1y_2-x_2y_1+x_2y_3-x_3y_2+x_3y_4-x_4y_3+x_4y_1-x_1y_4,\]
|
||||||
|
that calculates the area of a quadrilateral
|
||||||
|
whose vertices are
|
||||||
|
$(x_1,y_1)$,
|
||||||
|
$(x_2,y_2)$,
|
||||||
|
$(x_3,y_3)$ and
|
||||||
|
$(x_4,y_4)$.
|
||||||
|
This formula is easy to implement, there are no special
|
||||||
|
cases, and we can even generalize the formula
|
||||||
|
to \emph{all} polygons.
|
||||||
|
|
||||||
|
\section{Complex numbers}
|
||||||
|
|
||||||
|
\index{complex number}
|
||||||
|
\index{point}
|
||||||
|
\index{vector}
|
||||||
|
|
||||||
|
A \key{complex number} is a number of the form $x+y i$,
|
||||||
|
where $i = \sqrt{-1}$ is the \key{imaginary unit}.
|
||||||
|
A geometric interpretation of a complex number is
|
||||||
|
that it represents a two-dimensional point $(x,y)$
|
||||||
|
or a vector from the origin to a point $(x,y)$.
|
||||||
|
|
||||||
|
For example, $4+2i$ corresponds to the
|
||||||
|
following point and vector:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.45]
|
||||||
|
|
||||||
|
\draw[->,thick] (-5,0)--(5,0);
|
||||||
|
\draw[->,thick] (0,-5)--(0,5);
|
||||||
|
|
||||||
|
\draw[fill] (4,2) circle [radius=0.1];
|
||||||
|
\draw[->,thick] (0,0)--(4-0.1,2-0.1);
|
||||||
|
|
||||||
|
\node at (4,2.8) {$(4,2)$};
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
\index{complex@\texttt{complex}}
|
||||||
|
|
||||||
|
The C++ complex number class \texttt{complex} is
|
||||||
|
useful when solving geometric problems.
|
||||||
|
Using the class we can represent points and vectors
|
||||||
|
as complex numbers, and the class contains tools
|
||||||
|
that are useful in geometry.
|
||||||
|
|
||||||
|
In the following code, \texttt{C} is the type of
|
||||||
|
a coordinate and \texttt{P} is the type of a point or a vector.
|
||||||
|
In addition, the code defines macros \texttt{X} and \texttt{Y}
|
||||||
|
that can be used to refer to x and y coordinates.
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
typedef long long C;
|
||||||
|
typedef complex<C> P;
|
||||||
|
#define X real()
|
||||||
|
#define Y imag()
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
For example, the following code defines a point $p=(4,2)$
|
||||||
|
and prints its x and y coordinates:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
P p = {4,2};
|
||||||
|
cout << p.X << " " << p.Y << "\n"; // 4 2
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
The following code defines vectors $v=(3,1)$ and $u=(2,2)$,
|
||||||
|
and after that calculates the sum $s=v+u$.
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
P v = {3,1};
|
||||||
|
P u = {2,2};
|
||||||
|
P s = v+u;
|
||||||
|
cout << s.X << " " << s.Y << "\n"; // 5 3
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
In practice,
|
||||||
|
an appropriate coordinate type is usually
|
||||||
|
\texttt{long long} (integer) or \texttt{long double}
|
||||||
|
(real number).
|
||||||
|
It is a good idea to use integer coordinates whenever possible,
|
||||||
|
because calculations with integers are exact.
|
||||||
|
If real numbers are needed,
|
||||||
|
precision errors should be taken into account
|
||||||
|
when comparing numbers.
|
||||||
|
A safe way to check if real numbers $a$ and $b$ are equal
|
||||||
|
is to compare them using $|a-b|<\epsilon$,
|
||||||
|
where $\epsilon$ is a small number (for example, $\epsilon=10^{-9}$).
|
||||||
|
|
||||||
|
\subsubsection*{Functions}
|
||||||
|
|
||||||
|
In the following examples, the coordinate type is
|
||||||
|
\texttt{long double}.
|
||||||
|
|
||||||
|
The function $\texttt{abs}(v)$ calculates the length
|
||||||
|
$|v|$ of a vector $v=(x,y)$
|
||||||
|
using the formula $\sqrt{x^2+y^2}$.
|
||||||
|
The function can also be used for
|
||||||
|
calculating the distance between points
|
||||||
|
$(x_1,y_1)$ and $(x_2,y_2)$,
|
||||||
|
because that distance equals the length
|
||||||
|
of the vector $(x_2-x_1,y_2-y_1)$.
|
||||||
|
|
||||||
|
The following code calculates the distance
|
||||||
|
between points $(4,2)$ and $(3,-1)$:
|
||||||
|
\begin{lstlisting}
|
||||||
|
P a = {4,2};
|
||||||
|
P b = {3,-1};
|
||||||
|
cout << abs(b-a) << "\n"; // 3.16228
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
The function $\texttt{arg}(v)$ calculates the
|
||||||
|
angle of a vector $v=(x,y)$ with respect to the x axis.
|
||||||
|
The function gives the angle in radians,
|
||||||
|
where $r$ radians equals $180 r/\pi$ degrees.
|
||||||
|
The angle of a vector that points to the right is 0,
|
||||||
|
and angles decrease clockwise and increase
|
||||||
|
counterclockwise.
|
||||||
|
|
||||||
|
The function $\texttt{polar}(s,a)$ constructs a vector
|
||||||
|
whose length is $s$ and that points to an angle $a$.
|
||||||
|
A vector can be rotated by an angle $a$
|
||||||
|
by multiplying it by a vector with length 1 and angle $a$.
|
||||||
|
|
||||||
|
The following code calculates the angle of
|
||||||
|
the vector $(4,2)$, rotates it $1/2$ radians
|
||||||
|
counterclockwise, and then calculates the angle again:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
P v = {4,2};
|
||||||
|
cout << arg(v) << "\n"; // 0.463648
|
||||||
|
v *= polar(1.0,0.5);
|
||||||
|
cout << arg(v) << "\n"; // 0.963648
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
\section{Points and lines}
|
||||||
|
|
||||||
|
\index{cross product}
|
||||||
|
|
||||||
|
The \key{cross product} $a \times b$ of vectors
|
||||||
|
$a=(x_1,y_1)$ and $b=(x_2,y_2)$ is calculated
|
||||||
|
using the formula $x_1 y_2 - x_2 y_1$.
|
||||||
|
The cross product tells us whether $b$
|
||||||
|
turns left (positive value), does not turn (zero)
|
||||||
|
or turns right (negative value)
|
||||||
|
when it is placed directly after $a$.
|
||||||
|
|
||||||
|
The following picture illustrates the above cases:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.45]
|
||||||
|
|
||||||
|
\draw[->,thick] (0,0)--(4,2);
|
||||||
|
\draw[->,thick] (4,2)--(4+1,2+2);
|
||||||
|
|
||||||
|
\node at (2.5,0.5) {$a$};
|
||||||
|
\node at (5,2.5) {$b$};
|
||||||
|
|
||||||
|
\node at (3,-2) {$a \times b = 6$};
|
||||||
|
|
||||||
|
\draw[->,thick] (8+0,0)--(8+4,2);
|
||||||
|
\draw[->,thick] (8+4,2)--(8+4+2,2+1);
|
||||||
|
|
||||||
|
\node at (8+2.5,0.5) {$a$};
|
||||||
|
\node at (8+5,1.5) {$b$};
|
||||||
|
|
||||||
|
\node at (8+3,-2) {$a \times b = 0$};
|
||||||
|
|
||||||
|
\draw[->,thick] (16+0,0)--(16+4,2);
|
||||||
|
\draw[->,thick] (16+4,2)--(16+4+2,2-1);
|
||||||
|
|
||||||
|
\node at (16+2.5,0.5) {$a$};
|
||||||
|
\node at (16+5,2.5) {$b$};
|
||||||
|
|
||||||
|
\node at (16+3,-2) {$a \times b = -8$};
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
\noindent
|
||||||
|
For example, in the first case
|
||||||
|
$a=(4,2)$ and $b=(1,2)$.
|
||||||
|
The following code calculates the cross product
|
||||||
|
using the class \texttt{complex}:
|
||||||
|
|
||||||
|
\begin{lstlisting}
|
||||||
|
P a = {4,2};
|
||||||
|
P b = {1,2};
|
||||||
|
C p = (conj(a)*b).Y; // 6
|
||||||
|
\end{lstlisting}
|
||||||
|
|
||||||
|
The above code works, because
|
||||||
|
the function \texttt{conj} negates the y coordinate
|
||||||
|
of a vector,
|
||||||
|
and when the vectors $(x_1,-y_1)$ and $(x_2,y_2)$
|
||||||
|
are multiplied together, the y coordinate
|
||||||
|
of the result is $x_1 y_2 - x_2 y_1$.
|
||||||
|
|
||||||
|
\subsubsection{Point location}
|
||||||
|
|
||||||
|
Cross products can be used to test
|
||||||
|
whether a point is located on the left or right
|
||||||
|
side of a line.
|
||||||
|
Assume that the line goes through points
|
||||||
|
$s_1$ and $s_2$, we are looking from $s_1$
|
||||||
|
to $s_2$ and the point is $p$.
|
||||||
|
|
||||||
|
For example, in the following picture,
|
||||||
|
$p$ is on the left side of the line:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.45]
|
||||||
|
\draw[dashed,thick,->] (0,-3)--(12,6);
|
||||||
|
\draw[fill] (4,0) circle [radius=0.1];
|
||||||
|
\draw[fill] (8,3) circle [radius=0.1];
|
||||||
|
\draw[fill] (5,3) circle [radius=0.1];
|
||||||
|
\node at (4,-1) {$s_1$};
|
||||||
|
\node at (8,2) {$s_2$};
|
||||||
|
\node at (5,4) {$p$};
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
The cross product $(p-s_1) \times (p-s_2)$
|
||||||
|
tells us the location of the point $p$.
|
||||||
|
If the cross product is positive,
|
||||||
|
$p$ is located on the left side,
|
||||||
|
and if the cross product is negative,
|
||||||
|
$p$ is located on the right side.
|
||||||
|
Finally, if the cross product is zero,
|
||||||
|
points $s_1$, $s_2$ and $p$ are on the same line.
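
Using the types defined earlier in this chapter, the test can be written as a small helper function:
\begin{lstlisting}
// positive: p is on the left side of the line from s1 to s2,
// negative: p is on the right side, zero: the points are on the same line
C side(P s1, P s2, P p) {
    return (conj(p-s1)*(p-s2)).Y;
}
\end{lstlisting}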
|
||||||
|
|
||||||
|
\subsubsection{Line segment intersection}
|
||||||
|
|
||||||
|
\index{line segment intersection}
|
||||||
|
|
||||||
|
Next we consider the problem of testing
|
||||||
|
whether two line segments
|
||||||
|
$ab$ and $cd$ intersect. The possible cases are:
|
||||||
|
|
||||||
|
\textit{Case 1:}
|
||||||
|
The line segments are on the same line
|
||||||
|
and they overlap each other.
|
||||||
|
In this case, there is an infinite number of
|
||||||
|
intersection points.
|
||||||
|
For example, in the following picture,
|
||||||
|
all points between $c$ and $b$ are
|
||||||
|
intersection points:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\draw (1.5,1.5)--(6,3);
|
||||||
|
\draw (0,1)--(4.5,2.5);
|
||||||
|
\draw[fill] (0,1) circle [radius=0.05];
|
||||||
|
\node at (0,0.5) {$a$};
|
||||||
|
\draw[fill] (1.5,1.5) circle [radius=0.05];
|
||||||
|
\node at (6,2.5) {$d$};
|
||||||
|
\draw[fill] (4.5,2.5) circle [radius=0.05];
|
||||||
|
\node at (1.5,1) {$c$};
|
||||||
|
\draw[fill] (6,3) circle [radius=0.05];
|
||||||
|
\node at (4.5,2) {$b$};
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
In this case, we can use cross products to
|
||||||
|
check if all points are on the same line.
|
||||||
|
After this, we can sort the points and check
|
||||||
|
whether the line segments overlap each other.
|
||||||
|
|
||||||
|
\textit{Case 2:}
|
||||||
|
The line segments have a common vertex
|
||||||
|
that is the only intersection point.
|
||||||
|
For example, in the following picture the
|
||||||
|
intersection point is $b=c$:
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\draw (0,0)--(4,2);
|
||||||
|
\draw (4,2)--(6,1);
|
||||||
|
\draw[fill] (0,0) circle [radius=0.05];
|
||||||
|
\draw[fill] (4,2) circle [radius=0.05];
|
||||||
|
\draw[fill] (6,1) circle [radius=0.05];
|
||||||
|
|
||||||
|
\node at (0,0.5) {$a$};
|
||||||
|
\node at (4,2.5) {$b=c$};
|
||||||
|
\node at (6,1.5) {$d$};
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
This case is easy to check, because
|
||||||
|
there are only four possibilities
|
||||||
|
for the intersection point:
|
||||||
|
$a=c$, $a=d$, $b=c$ and $b=d$.
|
||||||
|
|
||||||
|
\textit{Case 3:}
|
||||||
|
There is exactly one intersection point
|
||||||
|
that is not a vertex of any line segment.
|
||||||
|
In the following picture, the point $p$
|
||||||
|
is the intersection point:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.9]
|
||||||
|
\draw (0,1)--(6,3);
|
||||||
|
\draw (2,4)--(4,0);
|
||||||
|
\draw[fill] (0,1) circle [radius=0.05];
|
||||||
|
\node at (0,0.5) {$c$};
|
||||||
|
\draw[fill] (6,3) circle [radius=0.05];
|
||||||
|
\node at (6,2.5) {$d$};
|
||||||
|
\draw[fill] (2,4) circle [radius=0.05];
|
||||||
|
\node at (1.5,3.5) {$a$};
|
||||||
|
\draw[fill] (4,0) circle [radius=0.05];
|
||||||
|
\node at (4,-0.4) {$b$};
|
||||||
|
\draw[fill] (3,2) circle [radius=0.05];
|
||||||
|
\node at (3,1.5) {$p$};
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
In this case, the line segments intersect
|
||||||
|
exactly when both points $c$ and $d$ are
|
||||||
|
on different sides of a line through $a$ and $b$,
|
||||||
|
and points $a$ and $b$ are on different
|
||||||
|
sides of a line through $c$ and $d$.
|
||||||
|
We can use cross products to check this.
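
For example, Case 3 can be tested as follows; all four cross products must be nonzero, because a zero value would mean that an endpoint lies exactly on the other line, and such situations have to be handled separately:
\begin{lstlisting}
C cross(P a, P b) { return (conj(a)*b).Y; }

// true if segments ab and cd cross at a single point that is not an endpoint
bool segmentsCross(P a, P b, P c, P d) {
    C c1 = cross(b-a, c-a), c2 = cross(b-a, d-a);  // sides of c and d w.r.t. line ab
    C c3 = cross(d-c, a-c), c4 = cross(d-c, b-c);  // sides of a and b w.r.t. line cd
    return ((c1 > 0 && c2 < 0) || (c1 < 0 && c2 > 0)) &&
           ((c3 > 0 && c4 < 0) || (c3 < 0 && c4 > 0));
}
\end{lstlisting}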
|
||||||
|
|
||||||
|
\subsubsection{Point distance from a line}
|
||||||
|
|
||||||
|
Another feature of cross products is that
|
||||||
|
the area of a triangle can be calculated
|
||||||
|
using the formula
|
||||||
|
\[\frac{| (a-c) \times (b-c) |}{2},\]
|
||||||
|
where $a$, $b$ and $c$ are the vertices of the triangle.
|
||||||
|
Using this fact, we can derive a formula
|
||||||
|
for calculating the shortest distance between a point and a line.
|
||||||
|
For example, in the following picture $d$ is the
|
||||||
|
shortest distance between the point $p$ and the line
|
||||||
|
that is defined by the points $s_1$ and $s_2$:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.75]
|
||||||
|
\draw (-2,-1)--(6,3);
|
||||||
|
\draw[dashed] (1,4)--(2.40,1.2);
|
||||||
|
\node at (0,-0.5) {$s_1$};
|
||||||
|
\node at (4,1.5) {$s_2$};
|
||||||
|
\node at (0.5,4) {$p$};
|
||||||
|
\node at (2,2.7) {$d$};
|
||||||
|
\draw[fill] (0,0) circle [radius=0.05];
|
||||||
|
\draw[fill] (4,2) circle [radius=0.05];
|
||||||
|
\draw[fill] (1,4) circle [radius=0.05];
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
The area of the triangle whose vertices are
|
||||||
|
$s_1$, $s_2$ and $p$ can be calculated in two ways:
|
||||||
|
it is both
|
||||||
|
$\frac{1}{2} |s_2-s_1| d$ and
|
||||||
|
$\frac{1}{2} |(s_1-p) \times (s_2-p)|$.
|
||||||
|
Thus, the shortest distance is
|
||||||
|
\[ d = \frac{|(s_1-p) \times (s_2-p)|}{|s_2-s_1|} .\]
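
With \texttt{long double} coordinates, the formula becomes a one-line function; the absolute value makes the result a nonnegative distance:
\begin{lstlisting}
// shortest distance from point p to the line through s1 and s2
long double lineDist(P s1, P s2, P p) {
    return abs((conj(s1-p)*(s2-p)).Y) / abs(s2-s1);
}
\end{lstlisting}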
|
||||||
|
|
||||||
|
\subsubsection{Point inside a polygon}
|
||||||
|
|
||||||
|
Let us now consider the problem of
|
||||||
|
testing whether a point is located inside or outside
|
||||||
|
a polygon.
|
||||||
|
For example, in the following picture point $a$
|
||||||
|
is inside the polygon and point $b$ is outside
|
||||||
|
the polygon.
|
||||||
|
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.75]
|
||||||
|
%\draw (0,0)--(2,-2)--(3,1)--(5,1)--(2,3)--(1,2)--(-1,2)--(1,4)--(-2,4)--(-2,1)--(-3,3)--(-4,0)--(0,0);
|
||||||
|
\draw (0,0)--(2,2)--(5,1)--(2,3)--(1,2)--(-1,2)--(1,4)--(-2,4)--(-2,1)--(-3,3)--(-4,0)--(0,0);
|
||||||
|
|
||||||
|
\draw[fill] (-3,1) circle [radius=0.05];
|
||||||
|
\node at (-3,0.5) {$a$};
|
||||||
|
\draw[fill] (1,3) circle [radius=0.05];
|
||||||
|
\node at (1,2.5) {$b$};
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
A convenient way to solve the problem is to
|
||||||
|
send a \emph{ray} from the point in an arbitrary direction
|
||||||
|
and calculate the number of times it touches
|
||||||
|
the boundary of the polygon.
|
||||||
|
If the number is odd,
|
||||||
|
the point is inside the polygon,
|
||||||
|
and if the number is even,
|
||||||
|
the point is outside the polygon.
|
||||||
|
|
||||||
|
\begin{samepage}
|
||||||
|
For example, we could send the following rays:
|
||||||
|
\begin{center}
|
||||||
|
\begin{tikzpicture}[scale=0.75]
|
||||||
|
\draw (0,0)--(2,2)--(5,1)--(2,3)--(1,2)--(-1,2)--(1,4)--(-2,4)--(-2,1)--(-3,3)--(-4,0)--(0,0);
|
||||||
|
|
||||||
|
\draw[fill] (-3,1) circle [radius=0.05];
|
||||||
|
\node at (-3,0.5) {$a$};
|
||||||
|
\draw[fill] (1,3) circle [radius=0.05];
|
||||||
|
\node at (1,2.5) {$b$};
|
||||||
|
|
||||||
|
\draw[dashed,->] (-3,1)--(-6,0);
|
||||||
|
\draw[dashed,->] (-3,1)--(0,5);
|
||||||
|
|
||||||
|
\draw[dashed,->] (1,3)--(3.5,0);
|
||||||
|
\draw[dashed,->] (1,3)--(3,4);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\end{center}
|
||||||
|
\end{samepage}
|
||||||
|
|
||||||
|
The rays from $a$ touch the boundary
of the polygon 1 and 3 times,
|
||||||
|
so $a$ is inside the polygon.
|
||||||
|
Correspondingly, the rays from $b$
|
||||||
|
touch the boundary of the polygon 0 and 2 times,
|
||||||
|
so $b$ is outside the polygon.
|
||||||
|
|
||||||
|
\section{Polygon area}
|
||||||
|
|
||||||
|
A general formula for calculating the area
|
||||||
|
of a polygon, sometimes called the \key{shoelace formula},
|
||||||
|
is as follows: \index{shoelace formula}
|
||||||
|
\[\frac{1}{2} |\sum_{i=1}^{n-1} (p_i \times p_{i+1})| =
|
||||||
|
\frac{1}{2} |\sum_{i=1}^{n-1} (x_i y_{i+1} - x_{i+1} y_i)|, \]
|
||||||
|
where the vertices are
|
||||||
|
$p_1=(x_1,y_1)$, $p_2=(x_2,y_2)$, $\ldots$, $p_n=(x_n,y_n)$
|
||||||
|
in such an order that
|
||||||
|
$p_i$ and $p_{i+1}$ are adjacent vertices on the boundary
|
||||||
|
of the polygon,
|
||||||
|
and the first and last vertex is the same, i.e., $p_1=p_n$.
|
||||||
|
|
||||||
|
For example, the area of the polygon
\begin{center}
\begin{tikzpicture}[scale=0.7]
\filldraw (4,1.4) circle (2pt);
\filldraw (7,3.4) circle (2pt);
\filldraw (5,5.4) circle (2pt);
\filldraw (2,4.4) circle (2pt);
\filldraw (4,3.4) circle (2pt);
\node (1) at (4,1) {(4,1)};
\node (2) at (7.2,3) {(7,3)};
\node (3) at (5,5.8) {(5,5)};
\node (4) at (2,4) {(2,4)};
\node (5) at (3.5,3) {(4,3)};
\path[draw] (4,1.4) -- (7,3.4) -- (5,5.4) -- (2,4.4) -- (4,3.4) -- (4,1.4);
\end{tikzpicture}
\end{center}
is
\[\frac{|(2\cdot5-5\cdot4)+(5\cdot3-7\cdot5)+(7\cdot1-4\cdot3)+(4\cdot3-4\cdot1)+(4\cdot4-2\cdot3)|}{2} = 17/2.\]

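The formula can be turned into code almost directly. The following sketch
works with integer coordinates and returns twice the area, so that the
result is always an integer; the example polygon is the one above:
\begin{lstlisting}
#include <bits/stdc++.h>
using namespace std;

// Sketch of the shoelace formula: returns twice the polygon area.
long long polygonArea2(const vector<pair<long long,long long>>& p) {
    long long sum = 0;
    int n = p.size();
    for (int i = 0; i < n; i++) {
        auto [x1, y1] = p[i];
        auto [x2, y2] = p[(i + 1) % n];
        sum += x1 * y2 - x2 * y1; // cross product p_i x p_{i+1}
    }
    return llabs(sum); // the actual area is this value divided by 2
}

int main() {
    vector<pair<long long,long long>> poly = {{4,1},{7,3},{5,5},{2,4},{4,3}};
    cout << polygonArea2(poly) / 2.0 << "\n"; // prints 8.5, i.e., 17/2
}
\end{lstlisting}
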
The idea of the formula is to go through trapezoids
where one side is a side of the polygon
and the opposite side lies on the horizontal line $y=0$.
For example:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\path[draw,fill=lightgray] (5,5.4) -- (7,3.4) -- (7,0) -- (5,0) -- (5,5.4);
\filldraw (4,1.4) circle (2pt);
\filldraw (7,3.4) circle (2pt);
\filldraw (5,5.4) circle (2pt);
\filldraw (2,4.4) circle (2pt);
\filldraw (4,3.4) circle (2pt);
\node (1) at (4,1) {(4,1)};
\node (2) at (7.2,3) {(7,3)};
\node (3) at (5,5.8) {(5,5)};
\node (4) at (2,4) {(2,4)};
\node (5) at (3.5,3) {(4,3)};
\path[draw] (4,1.4) -- (7,3.4) -- (5,5.4) -- (2,4.4) -- (4,3.4) -- (4,1.4);
\draw (0,0) -- (10,0);
\end{tikzpicture}
\end{center}
The area of such a trapezoid is
\[(x_{i+1}-x_{i}) \frac{y_i+y_{i+1}}{2},\]
where the vertices of the polygon are $p_i$ and $p_{i+1}$.
If $x_{i+1}>x_{i}$, the area is positive,
and if $x_{i+1}<x_{i}$, the area is negative.

The area of the polygon is the sum of the areas of
all such trapezoids, which yields the formula
\[|\sum_{i=1}^{n-1} (x_{i+1}-x_{i}) \frac{y_i+y_{i+1}}{2}| =
\frac{1}{2} |\sum_{i=1}^{n-1} (x_i y_{i+1} - x_{i+1} y_i)|.\]

Note that the absolute value of the sum is taken,
because the value of the sum may be positive or negative,
depending on whether we walk clockwise or counterclockwise
along the boundary of the polygon.

\subsubsection{Pick's theorem}

\index{Pick's theorem}

\key{Pick's theorem} provides another way to calculate
the area of a polygon provided that all vertices
of the polygon have integer coordinates.
According to Pick's theorem, the area of the polygon is
\[ a + b/2 -1,\]
where $a$ is the number of integer points inside the polygon
and $b$ is the number of integer points on the boundary of the polygon.

For example, the area of the polygon
\begin{center}
\begin{tikzpicture}[scale=0.7]
\filldraw (4,1.4) circle (2pt);
\filldraw (7,3.4) circle (2pt);
\filldraw (5,5.4) circle (2pt);
\filldraw (2,4.4) circle (2pt);
\filldraw (4,3.4) circle (2pt);
\node (1) at (4,1) {(4,1)};
\node (2) at (7.2,3) {(7,3)};
\node (3) at (5,5.8) {(5,5)};
\node (4) at (2,4) {(2,4)};
\node (5) at (3.5,3) {(4,3)};
\path[draw] (4,1.4) -- (7,3.4) -- (5,5.4) -- (2,4.4) -- (4,3.4) -- (4,1.4);

\filldraw (2,4.4) circle (2pt);
\filldraw (3,4.4) circle (2pt);
\filldraw (4,4.4) circle (2pt);
\filldraw (5,4.4) circle (2pt);
\filldraw (6,4.4) circle (2pt);

\filldraw (4,3.4) circle (2pt);
\filldraw (5,3.4) circle (2pt);
\filldraw (6,3.4) circle (2pt);
\filldraw (7,3.4) circle (2pt);

\filldraw (4,2.4) circle (2pt);
\filldraw (5,2.4) circle (2pt);
\end{tikzpicture}
\end{center}
is $6+7/2-1=17/2$.

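In code, Pick's theorem is typically combined with the shoelace formula:
the number of boundary points $b$ can be counted edge by edge using the fact
that a segment between integer points $(x_1,y_1)$ and $(x_2,y_2)$ covers
$\gcd(|x_1-x_2|,|y_1-y_2|)$ lattice points when one endpoint is excluded
(a fact not discussed above), and the number of interior points $a$ then
follows from the area. The following sketch uses the example polygon:
\begin{lstlisting}
#include <bits/stdc++.h>
using namespace std;

int main() {
    vector<pair<long long,long long>> p = {{4,1},{7,3},{5,5},{2,4},{4,3}};
    int n = p.size();
    long long area2 = 0, b = 0; // twice the area, boundary points
    for (int i = 0; i < n; i++) {
        auto [x1, y1] = p[i];
        auto [x2, y2] = p[(i + 1) % n];
        area2 += x1 * y2 - x2 * y1;               // shoelace term
        b += gcd(llabs(x2 - x1), llabs(y2 - y1)); // lattice points on this edge
    }
    area2 = llabs(area2);
    long long a = (area2 - b + 2) / 2; // Pick's theorem: area = a + b/2 - 1
    cout << a << " " << b << "\n";     // prints 6 7
}
\end{lstlisting}
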
\section{Distance functions}

\index{distance function}
\index{Euclidean distance}
\index{Manhattan distance}

A \key{distance function} defines the distance between
two points.
The usual distance function is the
\key{Euclidean distance} where the distance between
points $(x_1,y_1)$ and $(x_2,y_2)$ is
\[\sqrt{(x_2-x_1)^2+(y_2-y_1)^2}.\]
An alternative distance function is the
\key{Manhattan distance}
where the distance between points
$(x_1,y_1)$ and $(x_2,y_2)$ is
\[|x_1-x_2|+|y_1-y_2|.\]
\begin{samepage}
For example, consider the following picture:
\begin{center}
\begin{tikzpicture}

\draw[fill] (2,1) circle [radius=0.05];
\draw[fill] (5,2) circle [radius=0.05];

\node at (2,0.5) {$(2,1)$};
\node at (5,1.5) {$(5,2)$};

\draw[dashed] (2,1) -- (5,2);

\draw[fill] (5+2,1) circle [radius=0.05];
\draw[fill] (5+5,2) circle [radius=0.05];

\node at (5+2,0.5) {$(2,1)$};
\node at (5+5,1.5) {$(5,2)$};

\draw[dashed] (5+2,1) -- (5+2,2);
\draw[dashed] (5+2,2) -- (5+5,2);

\node at (3.5,-0.5) {Euclidean distance};
\node at (5+3.5,-0.5) {Manhattan distance};
\end{tikzpicture}
\end{center}
\end{samepage}
The Euclidean distance between the points is
\[\sqrt{(5-2)^2+(2-1)^2}=\sqrt{10}\]
and the Manhattan distance is
\[|5-2|+|2-1|=4.\]
The following picture shows regions that are within a distance of 1
from the center point, using the Euclidean and Manhattan distances:
\begin{center}
\begin{tikzpicture}

\draw[fill=gray!20] (0,0) circle [radius=1];
\draw[fill] (0,0) circle [radius=0.05];

\node at (0,-1.5) {Euclidean distance};

\draw[fill=gray!20] (5+0,1) -- (5-1,0) -- (5+0,-1) -- (5+1,0) -- (5+0,1);
\draw[fill] (5,0) circle [radius=0.05];
\node at (5,-1.5) {Manhattan distance};
\end{tikzpicture}
\end{center}

\subsubsection{Rotating coordinates}

Some problems are easier to solve if
Manhattan distances are used instead of Euclidean distances.
As an example, consider a problem where we are given
$n$ points in the two-dimensional plane
and our task is to calculate the maximum Manhattan
distance between any two points.

For example, consider the following set of points:
\begin{center}
\begin{tikzpicture}[scale=0.65]
\draw[color=gray] (-1,-1) grid (4,4);

\filldraw (0,2) circle (2.5pt);
\filldraw (3,3) circle (2.5pt);
\filldraw (1,0) circle (2.5pt);
\filldraw (3,1) circle (2.5pt);

\node at (0,1.5) {$A$};
\node at (3,2.5) {$C$};
\node at (1,-0.5) {$B$};
\node at (3,0.5) {$D$};
\end{tikzpicture}
\end{center}
The maximum Manhattan distance is 5
between points $B$ and $C$:
\begin{center}
\begin{tikzpicture}[scale=0.65]
\draw[color=gray] (-1,-1) grid (4,4);

\filldraw (0,2) circle (2.5pt);
\filldraw (3,3) circle (2.5pt);
\filldraw (1,0) circle (2.5pt);
\filldraw (3,1) circle (2.5pt);

\node at (0,1.5) {$A$};
\node at (3,2.5) {$C$};
\node at (1,-0.5) {$B$};
\node at (3,0.5) {$D$};

\path[draw=red,thick,line width=2pt] (1,0) -- (1,3) -- (3,3);
\end{tikzpicture}
\end{center}

A useful technique related to Manhattan distances
is to rotate all coordinates 45 degrees so that
a point $(x,y)$ becomes $(x+y,y-x)$.
For example, after rotating the above points,
the result is:

\begin{center}
\begin{tikzpicture}[scale=0.6]
\draw[color=gray] (0,-3) grid (7,3);

\filldraw (2,2) circle (2.5pt);
\filldraw (6,0) circle (2.5pt);
\filldraw (1,-1) circle (2.5pt);
\filldraw (4,-2) circle (2.5pt);

\node at (2,1.5) {$A$};
\node at (6,-0.5) {$C$};
\node at (1,-1.5) {$B$};
\node at (4,-2.5) {$D$};
\end{tikzpicture}
\end{center}
And the maximum distance is as follows:
\begin{center}
\begin{tikzpicture}[scale=0.6]
\draw[color=gray] (0,-3) grid (7,3);

\filldraw (2,2) circle (2.5pt);
\filldraw (6,0) circle (2.5pt);
\filldraw (1,-1) circle (2.5pt);
\filldraw (4,-2) circle (2.5pt);

\node at (2,1.5) {$A$};
\node at (6,-0.5) {$C$};
\node at (1,-1.5) {$B$};
\node at (4,-2.5) {$D$};

\path[draw=red,thick,line width=2pt] (1,-1) -- (4,2) -- (6,0);
\end{tikzpicture}
\end{center}

Consider two points $p_1=(x_1,y_1)$ and $p_2=(x_2,y_2)$ whose rotated
coordinates are $p'_1=(x'_1,y'_1)$ and $p'_2=(x'_2,y'_2)$.
Now there are two ways to express the Manhattan distance
between $p_1$ and $p_2$:
\[|x_1-x_2|+|y_1-y_2| = \max(|x'_1-x'_2|,|y'_1-y'_2|)\]

For example, if $p_1=(1,0)$ and $p_2=(3,3)$,
the rotated coordinates are $p'_1=(1,-1)$ and $p'_2=(6,0)$
and the Manhattan distance is
\[|1-3|+|0-3| = \max(|1-6|,|-1-0|) = 5.\]

The rotated coordinates provide a simple way
to operate with Manhattan distances, because we can
consider x and y coordinates separately.
To maximize the Manhattan distance between two points,
we should find two points whose
rotated coordinates maximize the value of
\[\max(|x'_1-x'_2|,|y'_1-y'_2|).\]
This is easy, because the maximum is always attained
either by the horizontal or by the vertical difference,
so it suffices to compute
$\max x' - \min x'$ and $\max y' - \min y'$
over all rotated points and take the larger of them.

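The following sketch applies this observation to the example points;
the point set is hard-coded here only for illustration:
\begin{lstlisting}
#include <bits/stdc++.h>
using namespace std;

// Sketch: maximum Manhattan distance between any two of n points,
// using the 45-degree rotation (x,y) -> (x+y, y-x).
int main() {
    vector<pair<long long,long long>> pts = {{0,2},{1,0},{3,3},{3,1}}; // A,B,C,D
    long long lo1 = LLONG_MAX, hi1 = LLONG_MIN; // range of x+y
    long long lo2 = LLONG_MAX, hi2 = LLONG_MIN; // range of y-x
    for (auto [x, y] : pts) {
        lo1 = min(lo1, x + y); hi1 = max(hi1, x + y);
        lo2 = min(lo2, y - x); hi2 = max(hi2, y - x);
    }
    cout << max(hi1 - lo1, hi2 - lo2) << "\n"; // prints 5 (points B and C)
}
\end{lstlisting}
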
\chapter{Sweep line algorithms}

\index{sweep line}

Many geometric problems can be solved using
\key{sweep line} algorithms.
The idea in such algorithms is to represent
an instance of the problem as a set of events that correspond
to points in the plane.
The events are processed in increasing order
according to their x or y coordinates.

As an example, consider the following problem:
There is a company that has $n$ employees,
and we know for each employee their arrival and
leaving times on a certain day.
Our task is to calculate the maximum number of
employees that were in the office at the same time.

The problem can be solved by modeling the situation
so that each employee is assigned two events that
correspond to their arrival and leaving times.
After sorting the events, we go through them
and keep track of the number of people in the office.
For example, the table
\begin{center}
\begin{tabular}{ccc}
person & arrival time & leaving time \\
\hline
John & 10 & 15 \\
Maria & 6 & 12 \\
Peter & 14 & 16 \\
Lisa & 5 & 13 \\
\end{tabular}
\end{center}
corresponds to the following events:
\begin{center}
\begin{tikzpicture}[scale=0.6]
\draw (0,0) rectangle (17,-6.5);
\path[draw,thick,-] (10,-1) -- (15,-1);
\path[draw,thick,-] (6,-2.5) -- (12,-2.5);
\path[draw,thick,-] (14,-4) -- (16,-4);
\path[draw,thick,-] (5,-5.5) -- (13,-5.5);

\draw[fill] (10,-1) circle [radius=0.05];
\draw[fill] (15,-1) circle [radius=0.05];
\draw[fill] (6,-2.5) circle [radius=0.05];
\draw[fill] (12,-2.5) circle [radius=0.05];
\draw[fill] (14,-4) circle [radius=0.05];
\draw[fill] (16,-4) circle [radius=0.05];
\draw[fill] (5,-5.5) circle [radius=0.05];
\draw[fill] (13,-5.5) circle [radius=0.05];

\node at (2,-1) {John};
\node at (2,-2.5) {Maria};
\node at (2,-4) {Peter};
\node at (2,-5.5) {Lisa};
\end{tikzpicture}
\end{center}
We go through the events from left to right
and maintain a counter.
Whenever a person arrives, we increase
the counter by one,
and when a person leaves,
we decrease it by one.
The answer to the problem is the maximum
value of the counter during the algorithm.

In the example, the events are processed as follows:
\begin{center}
\begin{tikzpicture}[scale=0.6]
\path[draw,thick,->] (0.5,0.5) -- (16.5,0.5);
\draw (0,0) rectangle (17,-6.5);
\path[draw,thick,-] (10,-1) -- (15,-1);
\path[draw,thick,-] (6,-2.5) -- (12,-2.5);
\path[draw,thick,-] (14,-4) -- (16,-4);
\path[draw,thick,-] (5,-5.5) -- (13,-5.5);

\draw[fill] (10,-1) circle [radius=0.05];
\draw[fill] (15,-1) circle [radius=0.05];
\draw[fill] (6,-2.5) circle [radius=0.05];
\draw[fill] (12,-2.5) circle [radius=0.05];
\draw[fill] (14,-4) circle [radius=0.05];
\draw[fill] (16,-4) circle [radius=0.05];
\draw[fill] (5,-5.5) circle [radius=0.05];
\draw[fill] (13,-5.5) circle [radius=0.05];

\node at (2,-1) {John};
\node at (2,-2.5) {Maria};
\node at (2,-4) {Peter};
\node at (2,-5.5) {Lisa};

\path[draw,dashed] (10,0)--(10,-6.5);
\path[draw,dashed] (15,0)--(15,-6.5);
\path[draw,dashed] (6,0)--(6,-6.5);
\path[draw,dashed] (12,0)--(12,-6.5);
\path[draw,dashed] (14,0)--(14,-6.5);
\path[draw,dashed] (16,0)--(16,-6.5);
\path[draw,dashed] (5,0)--(5,-6.5);
\path[draw,dashed] (13,0)--(13,-6.5);

\node at (10,-7) {$+$};
\node at (15,-7) {$-$};
\node at (6,-7) {$+$};
\node at (12,-7) {$-$};
\node at (14,-7) {$+$};
\node at (16,-7) {$-$};
\node at (5,-7) {$+$};
\node at (13,-7) {$-$};

\node at (10,-8) {$3$};
\node at (15,-8) {$1$};
\node at (6,-8) {$2$};
\node at (12,-8) {$2$};
\node at (14,-8) {$2$};
\node at (16,-8) {$0$};
\node at (5,-8) {$1$};
\node at (13,-8) {$1$};
\end{tikzpicture}
\end{center}
The symbols $+$ and $-$ indicate whether the
value of the counter increases or decreases,
and the value of the counter is shown below.
The maximum value of the counter is 3
between John's arrival time and Maria's leaving time.

The running time of the algorithm is $O(n \log n)$,
because sorting the events takes $O(n \log n)$ time
and the rest of the algorithm takes $O(n)$ time.

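The following sketch implements this algorithm for the example data.
The tie-breaking rule at equal times (departures are processed before
arrivals) is our own choice, since the text leaves it open:
\begin{lstlisting}
#include <bits/stdc++.h>
using namespace std;

// Sketch: one event per arrival (+1) and one per departure (-1),
// processed in increasing order of time.
int main() {
    vector<pair<int,int>> people = {{10,15},{6,12},{14,16},{5,13}};
    vector<pair<int,int>> events; // (time, +1 or -1)
    for (auto [arrive, leave] : people) {
        events.push_back({arrive, +1});
        events.push_back({leave, -1});
    }
    sort(events.begin(), events.end()); // at equal times, -1 comes first
    int count = 0, best = 0;
    for (auto [t, d] : events) {
        count += d;
        best = max(best, count);
    }
    cout << best << "\n"; // prints 3
}
\end{lstlisting}
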
\section{Intersection points}

\index{intersection point}

Given a set of $n$ line segments, each of them being either
horizontal or vertical, consider the problem of
counting the total number of intersection points.
For example, when the line segments are
\begin{center}
\begin{tikzpicture}[scale=0.5]
\path[draw,thick,-] (0,2) -- (5,2);
\path[draw,thick,-] (1,4) -- (6,4);
\path[draw,thick,-] (6,3) -- (10,3);
\path[draw,thick,-] (2,1) -- (2,6);
\path[draw,thick,-] (8,2) -- (8,5);
\end{tikzpicture}
\end{center}
there are three intersection points:
\begin{center}
\begin{tikzpicture}[scale=0.5]
\path[draw,thick,-] (0,2) -- (5,2);
\path[draw,thick,-] (1,4) -- (6,4);
\path[draw,thick,-] (6,3) -- (10,3);
\path[draw,thick,-] (2,1) -- (2,6);
\path[draw,thick,-] (8,2) -- (8,5);

\draw[fill] (2,2) circle [radius=0.15];
\draw[fill] (2,4) circle [radius=0.15];
\draw[fill] (8,3) circle [radius=0.15];

\end{tikzpicture}
\end{center}

It is easy to solve the problem in $O(n^2)$ time,
because we can go through all possible pairs of line segments
and check if they intersect.
However, we can solve the problem more efficiently
in $O(n \log n)$ time using a sweep line algorithm
and a range query data structure.

The idea is to process the endpoints of the line
segments from left to right and
focus on three types of events:
\begin{enumerate}[noitemsep]
\item[(1)] horizontal segment begins
\item[(2)] horizontal segment ends
\item[(3)] vertical segment
\end{enumerate}

The following events correspond to the example:
\begin{center}
\begin{tikzpicture}[scale=0.6]
\path[draw,dashed] (0,2) -- (5,2);
\path[draw,dashed] (1,4) -- (6,4);
\path[draw,dashed] (6,3) -- (10,3);
\path[draw,dashed] (2,1) -- (2,6);
\path[draw,dashed] (8,2) -- (8,5);

\node at (0,2) {$1$};
\node at (5,2) {$2$};
\node at (1,4) {$1$};
\node at (6,4) {$2$};
\node at (6,3) {$1$};
\node at (10,3) {$2$};

\node at (2,3.5) {$3$};
\node at (8,3.5) {$3$};
\end{tikzpicture}
\end{center}

We go through the events from left to right
and use a data structure that maintains a set of
y coordinates where there is an active horizontal segment.
At an event of type 1, we add the y coordinate of the segment
to the set, and at an event of type 2, we remove the
y coordinate from the set.

Intersection points are calculated at events of type 3.
When there is a vertical segment between points
$y_1$ and $y_2$, we count the number of active
horizontal segments whose y coordinate is between
$y_1$ and $y_2$, and add this number to the total
number of intersection points.

To store the y coordinates of horizontal segments,
we can use a binary indexed tree or a segment tree,
possibly with index compression.
With such structures, processing each event
takes $O(\log n)$ time, so the total running
time of the algorithm is $O(n \log n)$.

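The following sketch implements this sweep for the example segments.
The y coordinates are used directly as Fenwick tree indices, so a real
solution would compress them first, and the tie-breaking order at equal
x coordinates (additions, then vertical segments, then removals) is our
own choice that also counts intersections at segment endpoints:
\begin{lstlisting}
#include <bits/stdc++.h>
using namespace std;

const int N = 12;          // upper bound for y+1 in this small example
long long bitTree[N + 1];  // Fenwick tree over y coordinates
void add(int y, int d) { for (int i = y + 1; i <= N; i += i & -i) bitTree[i] += d; }
long long pref(int y) { long long s = 0; for (int i = y + 1; i > 0; i -= i & -i) s += bitTree[i]; return s; }
long long range(int y1, int y2) { return pref(y2) - (y1 ? pref(y1 - 1) : 0); }

int main() {
    // (y, x1, x2) for horizontal and (x, y1, y2) for vertical segments
    vector<array<int,3>> hor = {{2,0,5},{4,1,6},{3,6,10}};
    vector<array<int,3>> ver = {{2,1,6},{8,2,5}};
    // events: (x, type, a, b); type 0 = add y, 1 = vertical query, 2 = remove y
    vector<array<int,4>> ev;
    for (auto [y, x1, x2] : hor) { ev.push_back({x1, 0, y, 0}); ev.push_back({x2, 2, y, 0}); }
    for (auto [x, y1, y2] : ver) ev.push_back({x, 1, y1, y2});
    sort(ev.begin(), ev.end()); // at equal x: add, then query, then remove
    long long total = 0;
    for (auto [x, type, a, b] : ev) {
        if (type == 0) add(a, +1);
        else if (type == 2) add(a, -1);
        else total += range(a, b); // active horizontal segments with y in [a,b]
    }
    cout << total << "\n"; // prints 3 for the example above
}
\end{lstlisting}
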
\section{Closest pair problem}

\index{closest pair}

Given a set of $n$ points, our next problem is
to find two points whose Euclidean distance is minimum.
For example, if the points are
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0)--(12,0)--(12,4)--(0,4)--(0,0);

\draw (1,2) circle [radius=0.1];
\draw (3,1) circle [radius=0.1];
\draw (4,3) circle [radius=0.1];
\draw (5.5,1.5) circle [radius=0.1];
\draw (6,2.5) circle [radius=0.1];
\draw (7,1) circle [radius=0.1];
\draw (9,1.5) circle [radius=0.1];
\draw (10,2) circle [radius=0.1];
\draw (1.5,3.5) circle [radius=0.1];
\draw (1.5,1) circle [radius=0.1];
\draw (2.5,3) circle [radius=0.1];
\draw (4.5,1.5) circle [radius=0.1];
\draw (5.25,0.5) circle [radius=0.1];
\draw (6.5,2) circle [radius=0.1];
\end{tikzpicture}
\end{center}
\begin{samepage}
we should find the following points:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0)--(12,0)--(12,4)--(0,4)--(0,0);

\draw (1,2) circle [radius=0.1];
\draw (3,1) circle [radius=0.1];
\draw (4,3) circle [radius=0.1];
\draw (5.5,1.5) circle [radius=0.1];
\draw[fill] (6,2.5) circle [radius=0.1];
\draw (7,1) circle [radius=0.1];
\draw (9,1.5) circle [radius=0.1];
\draw (10,2) circle [radius=0.1];
\draw (1.5,3.5) circle [radius=0.1];
\draw (1.5,1) circle [radius=0.1];
\draw (2.5,3) circle [radius=0.1];
\draw (4.5,1.5) circle [radius=0.1];
\draw (5.25,0.5) circle [radius=0.1];
\draw[fill] (6.5,2) circle [radius=0.1];
\end{tikzpicture}
\end{center}
\end{samepage}

This is another example of a problem
that can be solved in $O(n \log n)$ time
using a sweep line algorithm\footnote{Besides this approach,
there is also an
$O(n \log n)$ time divide-and-conquer algorithm \cite{sha75}
that divides the points into two sets and recursively
solves the problem for both sets.}.
We go through the points from left to right
and maintain a value $d$: the minimum distance
between two points seen so far.
At each point, we find the nearest point to the left.
If the distance is less than $d$, it is the
new minimum distance and we update
the value of $d$.

If the current point is $(x,y)$
and there is a point to the left
within a distance of less than $d$,
the x coordinate of such a point must
be in the range $[x-d,x]$ and the y coordinate
must be in the range $[y-d,y+d]$.
Thus, it suffices to consider only the points
that are located in those ranges,
which makes the algorithm efficient.

For example, in the following picture, the
region marked with dashed lines contains
the points that can be within a distance of $d$
from the active point:

\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0)--(12,0)--(12,4)--(0,4)--(0,0);

\draw (1,2) circle [radius=0.1];
\draw (3,1) circle [radius=0.1];
\draw (4,3) circle [radius=0.1];
\draw (5.5,1.5) circle [radius=0.1];
\draw (6,2.5) circle [radius=0.1];
\draw (7,1) circle [radius=0.1];
\draw (9,1.5) circle [radius=0.1];
\draw (10,2) circle [radius=0.1];
\draw (1.5,3.5) circle [radius=0.1];
\draw (1.5,1) circle [radius=0.1];
\draw (2.5,3) circle [radius=0.1];
\draw (4.5,1.5) circle [radius=0.1];
\draw (5.25,0.5) circle [radius=0.1];
\draw[fill] (6.5,2) circle [radius=0.1];

\draw[dashed] (6.5,0.75)--(6.5,3.25);
\draw[dashed] (5.25,0.75)--(5.25,3.25);
\draw[dashed] (5.25,0.75)--(6.5,0.75);
\draw[dashed] (5.25,3.25)--(6.5,3.25);

\draw [decoration={brace}, decorate, line width=0.3mm] (5.25,3.5) -- (6.5,3.5);
\node at (5.875,4) {$d$};
\draw [decoration={brace}, decorate, line width=0.3mm] (6.75,3.25) -- (6.75,2);
\node at (7.25,2.625) {$d$};
\end{tikzpicture}
\end{center}

The efficiency of the algorithm is based on the fact
that the region always contains
only $O(1)$ points: all points processed so far
have pairwise distances of at least $d$,
so only a constant number of them can fit
into a region of size $d \times 2d$.
We can go through those points in $O(\log n)$ time
by maintaining a set of points whose x coordinate
is in the range $[x-d,x]$, in increasing order according
to their y coordinates.

The time complexity of the algorithm is $O(n \log n)$,
because we go through $n$ points and
find for each point the nearest point to the left
in $O(\log n)$ time.

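The following sketch implements the algorithm with squared distances.
The set is keyed by (y,x) pairs so that the candidate points whose
y coordinate is in $[y-d,y+d]$ can be located with \texttt{lower\_bound};
the point list is hard-coded only for illustration:
\begin{lstlisting}
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
typedef pair<ll,ll> P;

int main() {
    vector<P> p = {{1,2},{2,3},{3,1},{4,3},{5,1},{6,2}}; // (x, y)
    int n = p.size();
    sort(p.begin(), p.end());          // sweep from left to right
    set<P> box;                        // nearby points, keyed by (y, x)
    ll best = LLONG_MAX;               // smallest squared distance so far
    int j = 0;                         // leftmost point still in the x window
    for (int i = 0; i < n; i++) {
        ll d = (ll)ceil(sqrt((double)best)); // current distance bound
        while (j < i && p[i].first - p[j].first > d) {
            box.erase({p[j].second, p[j].first}); // too far to the left
            j++;
        }
        auto lo = box.lower_bound({p[i].second - d, LLONG_MIN});
        auto hi = box.upper_bound({p[i].second + d, LLONG_MAX});
        for (auto it = lo; it != hi; ++it) {
            ll dx = p[i].first - it->second, dy = p[i].second - it->first;
            best = min(best, dx * dx + dy * dy);
        }
        box.insert({p[i].second, p[i].first});
    }
    cout << best << "\n"; // squared minimum distance (2 for this input)
}
\end{lstlisting}
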
\section{Convex hull problem}

A \key{convex hull} is the smallest convex polygon
that contains all points of a given set.
Convexity means that a line segment between
any two vertices of the polygon is completely
inside the polygon.

\begin{samepage}
For example, for the points
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) circle [radius=0.1];
\draw (4,-1) circle [radius=0.1];
\draw (7,1) circle [radius=0.1];
\draw (6,3) circle [radius=0.1];
\draw (2,4) circle [radius=0.1];
\draw (0,2) circle [radius=0.1];

\draw (1,1) circle [radius=0.1];
\draw (2,2) circle [radius=0.1];
\draw (3,2) circle [radius=0.1];
\draw (4,0) circle [radius=0.1];
\draw (4,3) circle [radius=0.1];
\draw (5,2) circle [radius=0.1];
\draw (6,1) circle [radius=0.1];
\end{tikzpicture}
\end{center}
\end{samepage}
the convex hull is as follows:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0)--(4,-1)--(7,1)--(6,3)--(2,4)--(0,2)--(0,0);

\draw (0,0) circle [radius=0.1];
\draw (4,-1) circle [radius=0.1];
\draw (7,1) circle [radius=0.1];
\draw (6,3) circle [radius=0.1];
\draw (2,4) circle [radius=0.1];
\draw (0,2) circle [radius=0.1];

\draw (1,1) circle [radius=0.1];
\draw (2,2) circle [radius=0.1];
\draw (3,2) circle [radius=0.1];
\draw (4,0) circle [radius=0.1];
\draw (4,3) circle [radius=0.1];
\draw (5,2) circle [radius=0.1];
\draw (6,1) circle [radius=0.1];
\end{tikzpicture}
\end{center}

\index{Andrew's algorithm}

\key{Andrew's algorithm} \cite{and79} provides
an easy way to
construct the convex hull for a set of points
in $O(n \log n)$ time.
The algorithm first locates the leftmost
and rightmost points, and then
constructs the convex hull in two parts:
first the upper hull and then the lower hull.
Both parts are similar, so we can focus on
constructing the upper hull.

First, we sort the points primarily according to
x coordinates and secondarily according to y coordinates.
After this, we go through the points and
add each point to the hull.
After adding each point to the hull,
we make sure that the last line segment
in the hull does not turn left.
As long as it turns left, we repeatedly remove the
second last point from the hull.

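The following sketch is one common way to implement the algorithm
(a monotone chain that builds the lower hull and then the upper hull
with the same left-to-right scan over a reversed array, returning the
vertices in counter-clockwise order). The orientation convention is ours,
but the core step, removing the second last point while the chain turns
the wrong way, is exactly the one described above:
\begin{lstlisting}
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
typedef pair<ll,ll> P;

// cross product of (b-a) and (c-a): positive when a->b->c turns left
ll cross(P a, P b, P c) {
    return (b.first - a.first) * (c.second - a.second)
         - (b.second - a.second) * (c.first - a.first);
}

// Sketch of Andrew's monotone chain: lower hull, then upper hull,
// result in counter-clockwise order without collinear vertices.
vector<P> convexHull(vector<P> p) {
    sort(p.begin(), p.end());
    p.erase(unique(p.begin(), p.end()), p.end());
    if ((int)p.size() < 3) return p;
    vector<P> h;
    for (int pass = 0; pass < 2; pass++) {
        int base = h.size();
        for (P q : p) {
            // remove the second last point while the chain does not turn left
            while ((int)h.size() >= base + 2 &&
                   cross(h[h.size()-2], h[h.size()-1], q) <= 0)
                h.pop_back();
            h.push_back(q);
        }
        h.pop_back();                // drop the last point; the other pass covers it
        reverse(p.begin(), p.end()); // the second pass goes from right to left
    }
    return h;
}

int main() {
    vector<P> pts = {{0,0},{4,-1},{7,1},{6,3},{2,4},{0,2},
                     {1,1},{2,2},{3,2},{4,0},{4,3},{5,2},{6,1}};
    for (P v : convexHull(pts))      // prints the six hull vertices above
        cout << v.first << " " << v.second << "\n";
}
\end{lstlisting}
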
The following pictures show how
Andrew's algorithm works:
\\
\begin{tabular}{ccccccc}
|
||||||
|
\\
|
||||||
|
\begin{tikzpicture}[scale=0.3]
|
||||||
|
\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
|
||||||
|
\draw (0,0) circle [radius=0.1];
|
||||||
|
\draw (4,-1) circle [radius=0.1];
|
||||||
|
\draw (7,1) circle [radius=0.1];
|
||||||
|
\draw (6,3) circle [radius=0.1];
|
||||||
|
\draw (2,4) circle [radius=0.1];
|
||||||
|
\draw (0,2) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (1,1) circle [radius=0.1];
|
||||||
|
\draw (2,2) circle [radius=0.1];
|
||||||
|
\draw (3,2) circle [radius=0.1];
|
||||||
|
\draw (4,0) circle [radius=0.1];
|
||||||
|
\draw (4,3) circle [radius=0.1];
|
||||||
|
\draw (5,2) circle [radius=0.1];
|
||||||
|
\draw (6,1) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (0,0)--(0,2);
|
||||||
|
\end{tikzpicture}
|
||||||
|
& \hspace{0.1cm} &
|
||||||
|
\begin{tikzpicture}[scale=0.3]
|
||||||
|
\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
|
||||||
|
\draw (0,0) circle [radius=0.1];
|
||||||
|
\draw (4,-1) circle [radius=0.1];
|
||||||
|
\draw (7,1) circle [radius=0.1];
|
||||||
|
\draw (6,3) circle [radius=0.1];
|
||||||
|
\draw (2,4) circle [radius=0.1];
|
||||||
|
\draw (0,2) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (1,1) circle [radius=0.1];
|
||||||
|
\draw (2,2) circle [radius=0.1];
|
||||||
|
\draw (3,2) circle [radius=0.1];
|
||||||
|
\draw (4,0) circle [radius=0.1];
|
||||||
|
\draw (4,3) circle [radius=0.1];
|
||||||
|
\draw (5,2) circle [radius=0.1];
|
||||||
|
\draw (6,1) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (0,0)--(0,2)--(1,1);
|
||||||
|
\end{tikzpicture}
|
||||||
|
& \hspace{0.1cm} &
|
||||||
|
\begin{tikzpicture}[scale=0.3]
|
||||||
|
\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
|
||||||
|
\draw (0,0) circle [radius=0.1];
|
||||||
|
\draw (4,-1) circle [radius=0.1];
|
||||||
|
\draw (7,1) circle [radius=0.1];
|
||||||
|
\draw (6,3) circle [radius=0.1];
|
||||||
|
\draw (2,4) circle [radius=0.1];
|
||||||
|
\draw (0,2) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (1,1) circle [radius=0.1];
|
||||||
|
\draw (2,2) circle [radius=0.1];
|
||||||
|
\draw (3,2) circle [radius=0.1];
|
||||||
|
\draw (4,0) circle [radius=0.1];
|
||||||
|
\draw (4,3) circle [radius=0.1];
|
||||||
|
\draw (5,2) circle [radius=0.1];
|
||||||
|
\draw (6,1) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (0,0)--(0,2)--(1,1)--(2,2);
|
||||||
|
\end{tikzpicture}
|
||||||
|
& \hspace{0.1cm} &
|
||||||
|
\begin{tikzpicture}[scale=0.3]
|
||||||
|
\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
|
||||||
|
\draw (0,0) circle [radius=0.1];
|
||||||
|
\draw (4,-1) circle [radius=0.1];
|
||||||
|
\draw (7,1) circle [radius=0.1];
|
||||||
|
\draw (6,3) circle [radius=0.1];
|
||||||
|
\draw (2,4) circle [radius=0.1];
|
||||||
|
\draw (0,2) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (1,1) circle [radius=0.1];
|
||||||
|
\draw (2,2) circle [radius=0.1];
|
||||||
|
\draw (3,2) circle [radius=0.1];
|
||||||
|
\draw (4,0) circle [radius=0.1];
|
||||||
|
\draw (4,3) circle [radius=0.1];
|
||||||
|
\draw (5,2) circle [radius=0.1];
|
||||||
|
\draw (6,1) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (0,0)--(0,2)--(2,2);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\\
|
||||||
|
1 & & 2 & & 3 & & 4 \\
|
||||||
|
\end{tabular}
|
||||||
|
\\
|
||||||
|
\begin{tabular}{ccccccc}
|
||||||
|
\begin{tikzpicture}[scale=0.3]
|
||||||
|
\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
|
||||||
|
\draw (0,0) circle [radius=0.1];
|
||||||
|
\draw (4,-1) circle [radius=0.1];
|
||||||
|
\draw (7,1) circle [radius=0.1];
|
||||||
|
\draw (6,3) circle [radius=0.1];
|
||||||
|
\draw (2,4) circle [radius=0.1];
|
||||||
|
\draw (0,2) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (1,1) circle [radius=0.1];
|
||||||
|
\draw (2,2) circle [radius=0.1];
|
||||||
|
\draw (3,2) circle [radius=0.1];
|
||||||
|
\draw (4,0) circle [radius=0.1];
|
||||||
|
\draw (4,3) circle [radius=0.1];
|
||||||
|
\draw (5,2) circle [radius=0.1];
|
||||||
|
\draw (6,1) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (0,0)--(0,2)--(2,2)--(2,4);
|
||||||
|
\end{tikzpicture}
|
||||||
|
& \hspace{0.1cm} &
|
||||||
|
\begin{tikzpicture}[scale=0.3]
|
||||||
|
\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
|
||||||
|
\draw (0,0) circle [radius=0.1];
|
||||||
|
\draw (4,-1) circle [radius=0.1];
|
||||||
|
\draw (7,1) circle [radius=0.1];
|
||||||
|
\draw (6,3) circle [radius=0.1];
|
||||||
|
\draw (2,4) circle [radius=0.1];
|
||||||
|
\draw (0,2) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (1,1) circle [radius=0.1];
|
||||||
|
\draw (2,2) circle [radius=0.1];
|
||||||
|
\draw (3,2) circle [radius=0.1];
|
||||||
|
\draw (4,0) circle [radius=0.1];
|
||||||
|
\draw (4,3) circle [radius=0.1];
|
||||||
|
\draw (5,2) circle [radius=0.1];
|
||||||
|
\draw (6,1) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (0,0)--(0,2)--(2,4);
|
||||||
|
\end{tikzpicture}
|
||||||
|
& \hspace{0.1cm} &
|
||||||
|
\begin{tikzpicture}[scale=0.3]
|
||||||
|
\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
|
||||||
|
\draw (0,0) circle [radius=0.1];
|
||||||
|
\draw (4,-1) circle [radius=0.1];
|
||||||
|
\draw (7,1) circle [radius=0.1];
|
||||||
|
\draw (6,3) circle [radius=0.1];
|
||||||
|
\draw (2,4) circle [radius=0.1];
|
||||||
|
\draw (0,2) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (1,1) circle [radius=0.1];
|
||||||
|
\draw (2,2) circle [radius=0.1];
|
||||||
|
\draw (3,2) circle [radius=0.1];
|
||||||
|
\draw (4,0) circle [radius=0.1];
|
||||||
|
\draw (4,3) circle [radius=0.1];
|
||||||
|
\draw (5,2) circle [radius=0.1];
|
||||||
|
\draw (6,1) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (0,0)--(0,2)--(2,4)--(3,2);
|
||||||
|
\end{tikzpicture}
|
||||||
|
& \hspace{0.1cm} &
|
||||||
|
\begin{tikzpicture}[scale=0.3]
|
||||||
|
\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
|
||||||
|
\draw (0,0) circle [radius=0.1];
|
||||||
|
\draw (4,-1) circle [radius=0.1];
|
||||||
|
\draw (7,1) circle [radius=0.1];
|
||||||
|
\draw (6,3) circle [radius=0.1];
|
||||||
|
\draw (2,4) circle [radius=0.1];
|
||||||
|
\draw (0,2) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (1,1) circle [radius=0.1];
|
||||||
|
\draw (2,2) circle [radius=0.1];
|
||||||
|
\draw (3,2) circle [radius=0.1];
|
||||||
|
\draw (4,0) circle [radius=0.1];
|
||||||
|
\draw (4,3) circle [radius=0.1];
|
||||||
|
\draw (5,2) circle [radius=0.1];
|
||||||
|
\draw (6,1) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (0,0)--(0,2)--(2,4)--(3,2)--(4,-1);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\\
|
||||||
|
5 & & 6 & & 7 & & 8 \\
|
||||||
|
\end{tabular}
|
||||||
|
\\
|
||||||
|
\begin{tabular}{ccccccc}
|
||||||
|
\begin{tikzpicture}[scale=0.3]
|
||||||
|
\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
|
||||||
|
\draw (0,0) circle [radius=0.1];
|
||||||
|
\draw (4,-1) circle [radius=0.1];
|
||||||
|
\draw (7,1) circle [radius=0.1];
|
||||||
|
\draw (6,3) circle [radius=0.1];
|
||||||
|
\draw (2,4) circle [radius=0.1];
|
||||||
|
\draw (0,2) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (1,1) circle [radius=0.1];
|
||||||
|
\draw (2,2) circle [radius=0.1];
|
||||||
|
\draw (3,2) circle [radius=0.1];
|
||||||
|
\draw (4,0) circle [radius=0.1];
|
||||||
|
\draw (4,3) circle [radius=0.1];
|
||||||
|
\draw (5,2) circle [radius=0.1];
|
||||||
|
\draw (6,1) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (0,0)--(0,2)--(2,4)--(3,2)--(4,-1)--(4,0);
|
||||||
|
\end{tikzpicture}
|
||||||
|
& \hspace{0.1cm} &
|
||||||
|
\begin{tikzpicture}[scale=0.3]
|
||||||
|
\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
|
||||||
|
\draw (0,0) circle [radius=0.1];
|
||||||
|
\draw (4,-1) circle [radius=0.1];
|
||||||
|
\draw (7,1) circle [radius=0.1];
|
||||||
|
\draw (6,3) circle [radius=0.1];
|
||||||
|
\draw (2,4) circle [radius=0.1];
|
||||||
|
\draw (0,2) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (1,1) circle [radius=0.1];
|
||||||
|
\draw (2,2) circle [radius=0.1];
|
||||||
|
\draw (3,2) circle [radius=0.1];
|
||||||
|
\draw (4,0) circle [radius=0.1];
|
||||||
|
\draw (4,3) circle [radius=0.1];
|
||||||
|
\draw (5,2) circle [radius=0.1];
|
||||||
|
\draw (6,1) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (0,0)--(0,2)--(2,4)--(3,2)--(4,0);
|
||||||
|
\end{tikzpicture}
|
||||||
|
& \hspace{0.1cm} &
|
||||||
|
\begin{tikzpicture}[scale=0.3]
|
||||||
|
\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
|
||||||
|
\draw (0,0) circle [radius=0.1];
|
||||||
|
\draw (4,-1) circle [radius=0.1];
|
||||||
|
\draw (7,1) circle [radius=0.1];
|
||||||
|
\draw (6,3) circle [radius=0.1];
|
||||||
|
\draw (2,4) circle [radius=0.1];
|
||||||
|
\draw (0,2) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (1,1) circle [radius=0.1];
|
||||||
|
\draw (2,2) circle [radius=0.1];
|
||||||
|
\draw (3,2) circle [radius=0.1];
|
||||||
|
\draw (4,0) circle [radius=0.1];
|
||||||
|
\draw (4,3) circle [radius=0.1];
|
||||||
|
\draw (5,2) circle [radius=0.1];
|
||||||
|
\draw (6,1) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (0,0)--(0,2)--(2,4)--(3,2)--(4,0)--(4,3);
|
||||||
|
\end{tikzpicture}
|
||||||
|
& \hspace{0.1cm} &
|
||||||
|
\begin{tikzpicture}[scale=0.3]
|
||||||
|
\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
|
||||||
|
\draw (0,0) circle [radius=0.1];
|
||||||
|
\draw (4,-1) circle [radius=0.1];
|
||||||
|
\draw (7,1) circle [radius=0.1];
|
||||||
|
\draw (6,3) circle [radius=0.1];
|
||||||
|
\draw (2,4) circle [radius=0.1];
|
||||||
|
\draw (0,2) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (1,1) circle [radius=0.1];
|
||||||
|
\draw (2,2) circle [radius=0.1];
|
||||||
|
\draw (3,2) circle [radius=0.1];
|
||||||
|
\draw (4,0) circle [radius=0.1];
|
||||||
|
\draw (4,3) circle [radius=0.1];
|
||||||
|
\draw (5,2) circle [radius=0.1];
|
||||||
|
\draw (6,1) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (0,0)--(0,2)--(2,4)--(3,2)--(4,3);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\\
|
||||||
|
9 & & 10 & & 11 & & 12 \\
|
||||||
|
\end{tabular}
|
||||||
|
\\
|
||||||
|
\begin{tabular}{ccccccc}
|
||||||
|
\begin{tikzpicture}[scale=0.3]
|
||||||
|
\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
|
||||||
|
\draw (0,0) circle [radius=0.1];
|
||||||
|
\draw (4,-1) circle [radius=0.1];
|
||||||
|
\draw (7,1) circle [radius=0.1];
|
||||||
|
\draw (6,3) circle [radius=0.1];
|
||||||
|
\draw (2,4) circle [radius=0.1];
|
||||||
|
\draw (0,2) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (1,1) circle [radius=0.1];
|
||||||
|
\draw (2,2) circle [radius=0.1];
|
||||||
|
\draw (3,2) circle [radius=0.1];
|
||||||
|
\draw (4,0) circle [radius=0.1];
|
||||||
|
\draw (4,3) circle [radius=0.1];
|
||||||
|
\draw (5,2) circle [radius=0.1];
|
||||||
|
\draw (6,1) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (0,0)--(0,2)--(2,4)--(4,3);
|
||||||
|
\end{tikzpicture}
|
||||||
|
& \hspace{0.1cm} &
|
||||||
|
\begin{tikzpicture}[scale=0.3]
|
||||||
|
\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
|
||||||
|
\draw (0,0) circle [radius=0.1];
|
||||||
|
\draw (4,-1) circle [radius=0.1];
|
||||||
|
\draw (7,1) circle [radius=0.1];
|
||||||
|
\draw (6,3) circle [radius=0.1];
|
||||||
|
\draw (2,4) circle [radius=0.1];
|
||||||
|
\draw (0,2) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (1,1) circle [radius=0.1];
|
||||||
|
\draw (2,2) circle [radius=0.1];
|
||||||
|
\draw (3,2) circle [radius=0.1];
|
||||||
|
\draw (4,0) circle [radius=0.1];
|
||||||
|
\draw (4,3) circle [radius=0.1];
|
||||||
|
\draw (5,2) circle [radius=0.1];
|
||||||
|
\draw (6,1) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (0,0)--(0,2)--(2,4)--(4,3)--(5,2);
|
||||||
|
\end{tikzpicture}
|
||||||
|
& \hspace{0.1cm} &
|
||||||
|
\begin{tikzpicture}[scale=0.3]
|
||||||
|
\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
|
||||||
|
\draw (0,0) circle [radius=0.1];
|
||||||
|
\draw (4,-1) circle [radius=0.1];
|
||||||
|
\draw (7,1) circle [radius=0.1];
|
||||||
|
\draw (6,3) circle [radius=0.1];
|
||||||
|
\draw (2,4) circle [radius=0.1];
|
||||||
|
\draw (0,2) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (1,1) circle [radius=0.1];
|
||||||
|
\draw (2,2) circle [radius=0.1];
|
||||||
|
\draw (3,2) circle [radius=0.1];
|
||||||
|
\draw (4,0) circle [radius=0.1];
|
||||||
|
\draw (4,3) circle [radius=0.1];
|
||||||
|
\draw (5,2) circle [radius=0.1];
|
||||||
|
\draw (6,1) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (0,0)--(0,2)--(2,4)--(4,3)--(5,2)--(6,1);
|
||||||
|
\end{tikzpicture}
|
||||||
|
& \hspace{0.1cm} &
|
||||||
|
\begin{tikzpicture}[scale=0.3]
|
||||||
|
\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
|
||||||
|
\draw (0,0) circle [radius=0.1];
|
||||||
|
\draw (4,-1) circle [radius=0.1];
|
||||||
|
\draw (7,1) circle [radius=0.1];
|
||||||
|
\draw (6,3) circle [radius=0.1];
|
||||||
|
\draw (2,4) circle [radius=0.1];
|
||||||
|
\draw (0,2) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (1,1) circle [radius=0.1];
|
||||||
|
\draw (2,2) circle [radius=0.1];
|
||||||
|
\draw (3,2) circle [radius=0.1];
|
||||||
|
\draw (4,0) circle [radius=0.1];
|
||||||
|
\draw (4,3) circle [radius=0.1];
|
||||||
|
\draw (5,2) circle [radius=0.1];
|
||||||
|
\draw (6,1) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (0,0)--(0,2)--(2,4)--(4,3)--(5,2)--(6,1)--(6,3);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\\
|
||||||
|
13 & & 14 & & 15 & & 16 \\
|
||||||
|
\end{tabular}
|
||||||
|
\\
|
||||||
|
\begin{tabular}{ccccccc}
|
||||||
|
\begin{tikzpicture}[scale=0.3]
|
||||||
|
\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
|
||||||
|
\draw (0,0) circle [radius=0.1];
|
||||||
|
\draw (4,-1) circle [radius=0.1];
|
||||||
|
\draw (7,1) circle [radius=0.1];
|
||||||
|
\draw (6,3) circle [radius=0.1];
|
||||||
|
\draw (2,4) circle [radius=0.1];
|
||||||
|
\draw (0,2) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (1,1) circle [radius=0.1];
|
||||||
|
\draw (2,2) circle [radius=0.1];
|
||||||
|
\draw (3,2) circle [radius=0.1];
|
||||||
|
\draw (4,0) circle [radius=0.1];
|
||||||
|
\draw (4,3) circle [radius=0.1];
|
||||||
|
\draw (5,2) circle [radius=0.1];
|
||||||
|
\draw (6,1) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (0,0)--(0,2)--(2,4)--(4,3)--(5,2)--(6,3);
|
||||||
|
\end{tikzpicture}
|
||||||
|
& \hspace{0.1cm} &
|
||||||
|
\begin{tikzpicture}[scale=0.3]
|
||||||
|
\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
|
||||||
|
\draw (0,0) circle [radius=0.1];
|
||||||
|
\draw (4,-1) circle [radius=0.1];
|
||||||
|
\draw (7,1) circle [radius=0.1];
|
||||||
|
\draw (6,3) circle [radius=0.1];
|
||||||
|
\draw (2,4) circle [radius=0.1];
|
||||||
|
\draw (0,2) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (1,1) circle [radius=0.1];
|
||||||
|
\draw (2,2) circle [radius=0.1];
|
||||||
|
\draw (3,2) circle [radius=0.1];
|
||||||
|
\draw (4,0) circle [radius=0.1];
|
||||||
|
\draw (4,3) circle [radius=0.1];
|
||||||
|
\draw (5,2) circle [radius=0.1];
|
||||||
|
\draw (6,1) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (0,0)--(0,2)--(2,4)--(4,3)--(6,3);
|
||||||
|
\end{tikzpicture}
|
||||||
|
& \hspace{0.1cm} &
|
||||||
|
\begin{tikzpicture}[scale=0.3]
|
||||||
|
\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
|
||||||
|
\draw (0,0) circle [radius=0.1];
|
||||||
|
\draw (4,-1) circle [radius=0.1];
|
||||||
|
\draw (7,1) circle [radius=0.1];
|
||||||
|
\draw (6,3) circle [radius=0.1];
|
||||||
|
\draw (2,4) circle [radius=0.1];
|
||||||
|
\draw (0,2) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (1,1) circle [radius=0.1];
|
||||||
|
\draw (2,2) circle [radius=0.1];
|
||||||
|
\draw (3,2) circle [radius=0.1];
|
||||||
|
\draw (4,0) circle [radius=0.1];
|
||||||
|
\draw (4,3) circle [radius=0.1];
|
||||||
|
\draw (5,2) circle [radius=0.1];
|
||||||
|
\draw (6,1) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (0,0)--(0,2)--(2,4)--(6,3);
|
||||||
|
\end{tikzpicture}
|
||||||
|
& \hspace{0.1cm} &
|
||||||
|
\begin{tikzpicture}[scale=0.3]
|
||||||
|
\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
|
||||||
|
\draw (0,0) circle [radius=0.1];
|
||||||
|
\draw (4,-1) circle [radius=0.1];
|
||||||
|
\draw (7,1) circle [radius=0.1];
|
||||||
|
\draw (6,3) circle [radius=0.1];
|
||||||
|
\draw (2,4) circle [radius=0.1];
|
||||||
|
\draw (0,2) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (1,1) circle [radius=0.1];
|
||||||
|
\draw (2,2) circle [radius=0.1];
|
||||||
|
\draw (3,2) circle [radius=0.1];
|
||||||
|
\draw (4,0) circle [radius=0.1];
|
||||||
|
\draw (4,3) circle [radius=0.1];
|
||||||
|
\draw (5,2) circle [radius=0.1];
|
||||||
|
\draw (6,1) circle [radius=0.1];
|
||||||
|
|
||||||
|
\draw (0,0)--(0,2)--(2,4)--(6,3)--(7,1);
|
||||||
|
\end{tikzpicture}
|
||||||
|
\\
|
||||||
|
17 & & 18 & & 19 & & 20
|
||||||
|
\end{tabular}
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
|
|
||||||
|
\begin{thebibliography}{9}
|
||||||
|
|
||||||
|
\bibitem{aho83}
|
||||||
|
A. V. Aho, J. E. Hopcroft and J. Ullman.
|
||||||
|
\emph{Data Structures and Algorithms},
|
||||||
|
Addison-Wesley, 1983.
|
||||||
|
|
||||||
|
\bibitem{ahu91}
|
||||||
|
R. K. Ahuja and J. B. Orlin.
|
||||||
|
Distance directed augmenting path algorithms for maximum flow and parametric maximum flow problems.
|
||||||
|
\emph{Naval Research Logistics}, 38(3):413--430, 1991.
|
||||||
|
|
||||||
|
\bibitem{and79}
|
||||||
|
A. M. Andrew.
|
||||||
|
Another efficient algorithm for convex hulls in two dimensions.
|
||||||
|
\emph{Information Processing Letters}, 9(5):216--219, 1979.
|
||||||
|
|
||||||
|
\bibitem{asp79}
|
||||||
|
B. Aspvall, M. F. Plass and R. E. Tarjan.
|
||||||
|
A linear-time algorithm for testing the truth of certain quantified boolean formulas.
|
||||||
|
\emph{Information Processing Letters}, 8(3):121--123, 1979.
|
||||||
|
|
||||||
|
\bibitem{bel58}
|
||||||
|
R. Bellman.
|
||||||
|
On a routing problem.
|
||||||
|
\emph{Quarterly of Applied Mathematics}, 16(1):87--90, 1958.
|
||||||
|
|
||||||
|
\bibitem{bec07}
|
||||||
|
M. Beck, E. Pine, W. Tarrat and K. Y. Jensen.
|
||||||
|
New integer representations as the sum of three cubes.
|
||||||
|
\emph{Mathematics of Computation}, 76(259):1683--1690, 2007.
|
||||||
|
|
||||||
|
\bibitem{ben00}
|
||||||
|
M. A. Bender and M. Farach-Colton.
|
||||||
|
The LCA problem revisited. In
|
||||||
|
\emph{Latin American Symposium on Theoretical Informatics}, 88--94, 2000.
|
||||||
|
|
||||||
|
\bibitem{ben86}
|
||||||
|
J. Bentley.
|
||||||
|
\emph{Programming Pearls}.
|
||||||
|
Addison-Wesley, 1999 (2nd edition).
|
||||||
|
|
||||||
|
\bibitem{ben80}
|
||||||
|
J. Bentley and D. Wood.
|
||||||
|
An optimal worst case algorithm for reporting intersections of rectangles.
|
||||||
|
\emph{IEEE Transactions on Computers}, C-29(7):571--577, 1980.
|
||||||
|
|
||||||
|
\bibitem{bou01}
|
||||||
|
C. L. Bouton.
|
||||||
|
Nim, a game with a complete mathematical theory.
|
||||||
|
\emph{Annals of Mathematics}, 3(1/4):35--39, 1901.
|
||||||
|
|
||||||
|
% \bibitem{bur97}
|
||||||
|
% W. Burnside.
|
||||||
|
% \emph{Theory of Groups of Finite Order},
|
||||||
|
% Cambridge University Press, 1897.
|
||||||
|
|
||||||
|
\bibitem{coci}
|
||||||
|
Croatian Open Competition in Informatics, \url{http://hsin.hr/coci/}
|
||||||
|
|
||||||
|
\bibitem{cod15}
|
||||||
|
Codeforces: On ``Mo's algorithm'',
|
||||||
|
\url{http://codeforces.com/blog/entry/20032}
|
||||||
|
|
||||||
|
\bibitem{cor09}
|
||||||
|
T. H. Cormen, C. E. Leiserson, R. L. Rivest and C. Stein.
|
||||||
|
\emph{Introduction to Algorithms}, MIT Press, 2009 (3rd edition).
|
||||||
|
|
||||||
|
\bibitem{dij59}
|
||||||
|
E. W. Dijkstra.
|
||||||
|
A note on two problems in connexion with graphs.
|
||||||
|
\emph{Numerische Mathematik}, 1(1):269--271, 1959.
|
||||||
|
|
||||||
|
\bibitem{dik12}
|
||||||
|
K. Diks et al.
|
||||||
|
\emph{Looking for a Challenge? The Ultimate Problem Set from
|
||||||
|
the University of Warsaw Programming Competitions}, University of Warsaw, 2012.
|
||||||
|
|
||||||
|
% \bibitem{dil50}
|
||||||
|
% R. P. Dilworth.
|
||||||
|
% A decomposition theorem for partially ordered sets.
|
||||||
|
% \emph{Annals of Mathematics}, 51(1):161--166, 1950.
|
||||||
|
|
||||||
|
% \bibitem{dir52}
|
||||||
|
% G. A. Dirac.
|
||||||
|
% Some theorems on abstract graphs.
|
||||||
|
% \emph{Proceedings of the London Mathematical Society}, 3(1):69--81, 1952.
|
||||||
|
|
||||||
|
\bibitem{dim15}
|
||||||
|
M. Dima and R. Ceterchi.
|
||||||
|
Efficient range minimum queries using binary indexed trees.
|
||||||
|
\emph{Olympiad in Informatics}, 9(1):39--44, 2015.
|
||||||
|
|
||||||
|
\bibitem{edm65}
|
||||||
|
J. Edmonds.
|
||||||
|
Paths, trees, and flowers.
|
||||||
|
\emph{Canadian Journal of Mathematics}, 17(3):449--467, 1965.
|
||||||
|
|
||||||
|
\bibitem{edm72}
|
||||||
|
J. Edmonds and R. M. Karp.
|
||||||
|
Theoretical improvements in algorithmic efficiency for network flow problems.
|
||||||
|
\emph{Journal of the ACM}, 19(2):248--264, 1972.
|
||||||
|
|
||||||
|
\bibitem{eve75}
|
||||||
|
S. Even, A. Itai and A. Shamir.
|
||||||
|
On the complexity of time table and multi-commodity flow problems.
|
||||||
|
\emph{16th Annual Symposium on Foundations of Computer Science}, 184--193, 1975.
|
||||||
|
|
||||||
|
\bibitem{fan94}
|
||||||
|
D. Fanding.
|
||||||
|
A faster algorithm for shortest-path -- SPFA.
|
||||||
|
\emph{Journal of Southwest Jiaotong University}, 2, 1994.
|
||||||
|
|
||||||
|
\bibitem{fen94}
|
||||||
|
P. M. Fenwick.
|
||||||
|
A new data structure for cumulative frequency tables.
|
||||||
|
\emph{Software: Practice and Experience}, 24(3):327--336, 1994.
|
||||||
|
|
||||||
|
\bibitem{fis06}
|
||||||
|
J. Fischer and V. Heun.
|
||||||
|
Theoretical and practical improvements on the RMQ-problem, with applications to LCA and LCE.
|
||||||
|
In \emph{Annual Symposium on Combinatorial Pattern Matching}, 36--48, 2006.
|
||||||
|
|
||||||
|
\bibitem{flo62}
|
||||||
|
R. W. Floyd.
|
||||||
|
Algorithm 97: shortest path.
|
||||||
|
\emph{Communications of the ACM}, 5(6):345, 1962.
|
||||||
|
|
||||||
|
\bibitem{for56a}
|
||||||
|
L. R. Ford.
|
||||||
|
Network flow theory.
|
||||||
|
RAND Corporation, Santa Monica, California, 1956.
|
||||||
|
|
||||||
|
\bibitem{for56}
|
||||||
|
L. R. Ford and D. R. Fulkerson.
|
||||||
|
Maximal flow through a network.
|
||||||
|
\emph{Canadian Journal of Mathematics}, 8(3):399--404, 1956.
|
||||||
|
|
||||||
|
\bibitem{fre77}
|
||||||
|
R. Freivalds.
|
||||||
|
Probabilistic machines can use less running time.
|
||||||
|
In \emph{IFIP congress}, 839--842, 1977.
|
||||||
|
|
||||||
|
\bibitem{gal14}
|
||||||
|
F. Le Gall.
|
||||||
|
Powers of tensors and fast matrix multiplication.
|
||||||
|
In \emph{Proceedings of the 39th International Symposium on Symbolic and Algebraic Computation},
|
||||||
|
296--303, 2014.
|
||||||
|
|
||||||
|
\bibitem{gar79}
|
||||||
|
M. R. Garey and D. S. Johnson.
|
||||||
|
\emph{Computers and Intractability:
|
||||||
|
A Guide to the Theory of NP-Completeness},
|
||||||
|
W. H. Freeman and Company, 1979.
|
||||||
|
|
||||||
|
\bibitem{goo17}
|
||||||
|
Google Code Jam Statistics (2017),
|
||||||
|
\url{https://www.go-hero.net/jam/17}
|
||||||
|
|
||||||
|
\bibitem{gro14}
|
||||||
|
A. Grønlund and S. Pettie.
|
||||||
|
Threesomes, degenerates, and love triangles.
|
||||||
|
In \emph{Proceedings of the 55th Annual Symposium on Foundations of Computer Science},
|
||||||
|
621--630, 2014.
|
||||||
|
|
||||||
|
\bibitem{gru39}
|
||||||
|
P. M. Grundy.
|
||||||
|
Mathematics and games.
|
||||||
|
\emph{Eureka}, 2(5):6--8, 1939.
|
||||||
|
|
||||||
|
\bibitem{gus97}
|
||||||
|
D. Gusfield.
|
||||||
|
\emph{Algorithms on Strings, Trees and Sequences:
|
||||||
|
Computer Science and Computational Biology},
|
||||||
|
Cambridge University Press, 1997.
|
||||||
|
|
||||||
|
% \bibitem{hal35}
|
||||||
|
% P. Hall.
|
||||||
|
% On representatives of subsets.
|
||||||
|
% \emph{Journal London Mathematical Society} 10(1):26--30, 1935.
|
||||||
|
|
||||||
|
\bibitem{hal13}
|
||||||
|
S. Halim and F. Halim.
|
||||||
|
\emph{Competitive Programming 3: The New Lower Bound of Programming Contests}, 2013.
|
||||||
|
|
||||||
|
\bibitem{hel62}
|
||||||
|
M. Held and R. M. Karp.
|
||||||
|
A dynamic programming approach to sequencing problems.
|
||||||
|
\emph{Journal of the Society for Industrial and Applied Mathematics}, 10(1):196--210, 1962.
|
||||||
|
|
||||||
|
\bibitem{hie73}
|
||||||
|
C. Hierholzer and C. Wiener.
|
||||||
|
Über die Möglichkeit, einen Linienzug ohne Wiederholung und ohne Unterbrechung zu umfahren.
|
||||||
|
\emph{Mathematische Annalen}, 6(1), 30--32, 1873.
|
||||||
|
|
||||||
|
\bibitem{hoa61a}
|
||||||
|
C. A. R. Hoare.
|
||||||
|
Algorithm 64: Quicksort.
|
||||||
|
\emph{Communications of the ACM}, 4(7):321, 1961.
|
||||||
|
|
||||||
|
\bibitem{hoa61b}
|
||||||
|
C. A. R. Hoare.
|
||||||
|
Algorithm 65: Find.
|
||||||
|
\emph{Communications of the ACM}, 4(7):321--322, 1961.
|
||||||
|
|
||||||
|
\bibitem{hop71}
|
||||||
|
J. E. Hopcroft and J. D. Ullman.
|
||||||
|
A linear list merging algorithm.
|
||||||
|
Technical report, Cornell University, 1971.
|
||||||
|
|
||||||
|
\bibitem{hor74}
|
||||||
|
E. Horowitz and S. Sahni.
|
||||||
|
Computing partitions with applications to the knapsack problem.
|
||||||
|
\emph{Journal of the ACM}, 21(2):277--292, 1974.
|
||||||
|
|
||||||
|
\bibitem{huf52}
|
||||||
|
D. A. Huffman.
|
||||||
|
A method for the construction of minimum-redundancy codes.
|
||||||
|
\emph{Proceedings of the IRE}, 40(9):1098--1101, 1952.
|
||||||
|
|
||||||
|
\bibitem{iois}
|
||||||
|
The International Olympiad in Informatics Syllabus,
|
||||||
|
\url{https://people.ksp.sk/~misof/ioi-syllabus/}
|
||||||
|
|
||||||
|
\bibitem{kar87}
|
||||||
|
R. M. Karp and M. O. Rabin.
|
||||||
|
Efficient randomized pattern-matching algorithms.
|
||||||
|
\emph{IBM Journal of Research and Development}, 31(2):249--260, 1987.
|
||||||
|
|
||||||
|
\bibitem{kas61}
|
||||||
|
P. W. Kasteleyn.
|
||||||
|
The statistics of dimers on a lattice: I. The number of dimer arrangements on a quadratic lattice.
|
||||||
|
\emph{Physica}, 27(12):1209--1225, 1961.
|
||||||
|
|
||||||
|
\bibitem{ken06}
|
||||||
|
C. Kent, G. M. Landau and M. Ziv-Ukelson.
|
||||||
|
On the complexity of sparse exon assembly.
|
||||||
|
\emph{Journal of Computational Biology}, 13(5):1013--1027, 2006.
|
||||||
|
|
||||||
|
|
||||||
|
\bibitem{kle05}
|
||||||
|
J. Kleinberg and É. Tardos.
|
||||||
|
\emph{Algorithm Design}, Pearson, 2005.
|
||||||
|
|
||||||
|
\bibitem{knu982}
|
||||||
|
D. E. Knuth.
|
||||||
|
\emph{The Art of Computer Programming. Volume 2: Seminumerical Algorithms}, Addison-Wesley, 1998 (3rd edition).
|
||||||
|
|
||||||
|
\bibitem{knu983}
|
||||||
|
D. E. Knuth.
|
||||||
|
\emph{The Art of Computer Programming. Volume 3: Sorting and Searching}, Addison-Wesley, 1998 (2nd edition).
|
||||||
|
|
||||||
|
% \bibitem{kon31}
|
||||||
|
% D. Kőnig.
|
||||||
|
% Gráfok és mátrixok.
|
||||||
|
% \emph{Matematikai és Fizikai Lapok}, 38(1):116--119, 1931.
|
||||||
|
|
||||||
|
\bibitem{kru56}
|
||||||
|
J. B. Kruskal.
|
||||||
|
On the shortest spanning subtree of a graph and the traveling salesman problem.
|
||||||
|
\emph{Proceedings of the American Mathematical Society}, 7(1):48--50, 1956.
|
||||||
|
|
||||||
|
\bibitem{lev66}
|
||||||
|
V. I. Levenshtein.
|
||||||
|
Binary codes capable of correcting deletions, insertions, and reversals.
|
||||||
|
\emph{Soviet physics doklady}, 10(8):707--710, 1966.
|
||||||
|
|
||||||
|
\bibitem{mai84}
M. G. Main and R. J. Lorentz.
An $O(n \log n)$ algorithm for finding all repetitions in a string.
\emph{Journal of Algorithms}, 5(3):422--432, 1984.

% \bibitem{ore60}
% Ø. Ore.
% Note on Hamilton circuits.
% \emph{The American Mathematical Monthly}, 67(1):55, 1960.

\bibitem{pac13}
J. Pachocki and J. Radoszewski.
Where to use and how not to use polynomial string hashing.
\emph{Olympiads in Informatics}, 7(1):90--100, 2013.

\bibitem{par97}
I. Parberry.
An efficient algorithm for the Knight's tour problem.
\emph{Discrete Applied Mathematics}, 73(3):251--260, 1997.

% \bibitem{pic99}
% G. Pick.
% Geometrisches zur Zahlenlehre.
% \emph{Sitzungsberichte des deutschen naturwissenschaftlich-medicinischen Vereines
% für Böhmen "Lotos" in Prag. (Neue Folge)}, 19:311--319, 1899.

\bibitem{pea05}
D. Pearson.
A polynomial-time algorithm for the change-making problem.
\emph{Operations Research Letters}, 33(3):231--234, 2005.

\bibitem{pri57}
R. C. Prim.
Shortest connection networks and some generalizations.
\emph{Bell System Technical Journal}, 36(6):1389--1401, 1957.

% \bibitem{pru18}
% H. Prüfer.
% Neuer Beweis eines Satzes über Permutationen.
% \emph{Arch. Math. Phys}, 27:742--744, 1918.

\bibitem{q27}
27-Queens Puzzle: Massively Parallel Enumeration and Solution Counting.
\url{https://github.com/preusser/q27}

\bibitem{sha75}
M. I. Shamos and D. Hoey.
Closest-point problems.
In \emph{Proceedings of the 16th Annual Symposium on Foundations of Computer Science}, 151--162, 1975.

\bibitem{sha81}
M. Sharir.
A strong-connectivity algorithm and its applications in data flow analysis.
\emph{Computers \& Mathematics with Applications}, 7(1):67--72, 1981.

\bibitem{ski08}
S. S. Skiena.
\emph{The Algorithm Design Manual}, Springer, 2008 (2nd edition).

\bibitem{ski03}
S. S. Skiena and M. A. Revilla.
\emph{Programming Challenges: The Programming Contest Training Manual},
Springer, 2003.

\bibitem{main}
SZKOpuł, \url{https://szkopul.edu.pl/}

\bibitem{spr35}
R. Sprague.
Über mathematische Kampfspiele.
\emph{Tohoku Mathematical Journal}, 41:438--444, 1935.

\bibitem{sta06}
P. Stańczyk.
\emph{Algorytmika praktyczna w konkursach Informatycznych},
MSc thesis, University of Warsaw, 2006.

\bibitem{str69}
V. Strassen.
Gaussian elimination is not optimal.
\emph{Numerische Mathematik}, 13(4):354--356, 1969.

\bibitem{tar75}
R. E. Tarjan.
Efficiency of a good but not linear set union algorithm.
\emph{Journal of the ACM}, 22(2):215--225, 1975.

\bibitem{tar79}
R. E. Tarjan.
Applications of path compression on balanced trees.
\emph{Journal of the ACM}, 26(4):690--715, 1979.

\bibitem{tar84}
R. E. Tarjan and U. Vishkin.
Finding biconnected components and computing tree functions in logarithmic parallel time.
In \emph{Proceedings of the 25th Annual Symposium on Foundations of Computer Science}, 12--20, 1984.

\bibitem{tem61}
H. N. V. Temperley and M. E. Fisher.
Dimer problem in statistical mechanics -- an exact result.
\emph{Philosophical Magazine}, 6(68):1061--1063, 1961.

\bibitem{usaco}
USA Computing Olympiad, \url{http://www.usaco.org/}

\bibitem{war23}
H. C. von Warnsdorf.
\emph{Des Rösselsprunges einfachste und allgemeinste Lösung}.
Schmalkalden, 1823.

\bibitem{war62}
S. Warshall.
A theorem on Boolean matrices.
\emph{Journal of the ACM}, 9(1):11--12, 1962.

% \bibitem{zec72}
% E. Zeckendorf.
% Représentation des nombres naturels par une somme de nombres de Fibonacci ou de nombres de Lucas.
% \emph{Bull. Soc. Roy. Sci. Liege}, 41:179--182, 1972.

\end{thebibliography}
\chapter*{Preface}
\markboth{\MakeUppercase{Preface}}{}
\addcontentsline{toc}{chapter}{Preface}

The purpose of this book is to give you
a thorough introduction to competitive programming.
It is assumed that you already
know the basics of programming, but no previous
background in competitive programming is needed.

The book is especially intended for
students who want to learn algorithms and
possibly participate in
the International Olympiad in Informatics (IOI) or
in the International Collegiate Programming Contest (ICPC).
Of course, the book is also suitable for
anybody else interested in competitive programming.

It takes a long time to become a good competitive
programmer, but it is also an opportunity to learn a lot.
You can be sure that you will get
a good general understanding of algorithms
if you spend time reading the book,
solving problems and taking part in contests.

The book is under continuous development.
You can always send feedback on the book to
\texttt{ahslaaks@cs.helsinki.fi}.

\begin{flushright}
Helsinki, August 2019 \\
Antti Laaksonen
\end{flushright}