Compare commits

No commits in common. "f269ae391910742788ac0d6626df1e221281f191" and "7b0a21413d86d7e2498a02e10442b6d970068da7" have entirely different histories.

README.md (24 lines changed)
@@ -1,22 +1,4 @@
-# Competitive Programmer's Handbook
-
-Competitive Programmer's Handbook is a modern introduction to competitive programming.
-The book discusses programming tricks and algorithm design techniques relevant in competitive programming.
-
-## CSES Problem Set
-
-The CSES Problem Set contains a collection of competitive programming problems.
-You can practice the techniques presented in the book by solving the problems.
-
-https://cses.fi/problemset/
-
-## License
-
-The license of the book is Creative Commons BY-NC-SA.
-
-## Other books
-
-Guide to Competitive Programming is a printed book, published by Springer, based on Competitive Programmer's Handbook.
-There is also a Russian edition Олимпиадное программирование (Olympiad Programming) and a Korean edition 알고리즘 트레이닝: 프로그래밍 대회 입문 가이드 (Algorithm Training: An Introductory Guide to Programming Contests).
-
-https://cses.fi/book/
+# cphb
+
+SOI adjusted Competitive Programmer's Handbook
+(see https://github.com/pllk/cphb for the original)
book.tex (131 lines, removed entirely in 7b0a21413d)
\documentclass[twoside,12pt,a4paper,english]{book}

%\includeonly{chapter04,list}

\usepackage[english]{babel}
\usepackage[utf8]{inputenc}
\usepackage{listings}
\usepackage[table]{xcolor}
\usepackage{tikz}
\usepackage{multicol}
\usepackage{hyperref}
\usepackage{array}
\usepackage{microtype}

\usepackage{fouriernc}
\usepackage[T1]{fontenc}

\usepackage{graphicx}
\usepackage{framed}
\usepackage{amssymb}
\usepackage{amsmath}

\usepackage{pifont}
\usepackage{ifthen}
\usepackage{makeidx}
\usepackage{enumitem}

\usepackage{titlesec}

\usepackage{skak}
\usepackage[scaled=0.95]{inconsolata}

\usetikzlibrary{patterns,snakes}
\pagestyle{plain}

\definecolor{keywords}{HTML}{44548A}
\definecolor{strings}{HTML}{00999A}
\definecolor{comments}{HTML}{990000}

\lstset{language=C++,frame=single,basicstyle=\ttfamily \small,showstringspaces=false,columns=flexible}
\lstset{
  literate={ö}{{\"o}}1
           {ä}{{\"a}}1
           {ü}{{\"u}}1
}
\lstset{xleftmargin=20pt,xrightmargin=5pt}
\lstset{aboveskip=12pt,belowskip=8pt}

\lstset{
  commentstyle=\color{comments},
  keywordstyle=\color{keywords},
  stringstyle=\color{strings}
}

\date{Draft \today}

\usepackage[a4paper,vmargin=30mm,hmargin=33mm,footskip=15mm]{geometry}

\title{\Huge Competitive Programmer's Handbook}
\author{\Large Antti Laaksonen}

\makeindex
\usepackage[totoc]{idxlayout}

\titleformat{\subsubsection}
{\normalfont\large\bfseries\sffamily}{\thesubsection}{1em}{}

\begin{document}

%\selectlanguage{finnish}

%\setcounter{page}{1}
%\pagenumbering{roman}

\frontmatter
\maketitle
\setcounter{tocdepth}{1}
\tableofcontents

\include{preface}

\mainmatter
\pagenumbering{arabic}
\setcounter{page}{1}

\newcommand{\key}[1] {\textbf{#1}}

\part{Basic techniques}
\include{chapter01}
\include{chapter02}
\include{chapter03}
\include{chapter04}
\include{chapter05}
\include{chapter06}
\include{chapter07}
\include{chapter08}
\include{chapter09}
\include{chapter10}
\part{Graph algorithms}
\include{chapter11}
\include{chapter12}
\include{chapter13}
\include{chapter14}
\include{chapter15}
\include{chapter16}
\include{chapter17}
\include{chapter18}
\include{chapter19}
\include{chapter20}
\part{Advanced topics}
\include{chapter21}
\include{chapter22}
\include{chapter23}
\include{chapter24}
\include{chapter25}
\include{chapter26}
\include{chapter27}
\include{chapter28}
\include{chapter29}
\include{chapter30}

\cleardoublepage
\phantomsection
\addcontentsline{toc}{chapter}{Bibliography}
\include{list}

\cleardoublepage
\printindex

\end{document}
chapter01.tex (990 lines, removed entirely in 7b0a21413d)
\chapter{Introduction}

Competitive programming combines two topics:
(1) the design of algorithms and (2) the implementation of algorithms.

The \key{design of algorithms} consists of problem solving
and mathematical thinking.
Skills for analyzing problems and solving them
creatively are needed.
An algorithm for solving a problem
has to be both correct and efficient,
and the core of the problem is often
about inventing an efficient algorithm.

Theoretical knowledge of algorithms
is important to competitive programmers.
Typically, a solution to a problem is
a combination of well-known techniques and
new insights.
The techniques that appear in competitive programming
also form the basis for the scientific research
of algorithms.

The \key{implementation of algorithms} requires good
programming skills.
In competitive programming, the solutions
are graded by testing an implemented algorithm
using a set of test cases.
Thus, it is not enough that the idea of the
algorithm is correct, but the implementation also
has to be correct.

A good coding style in contests is
straightforward and concise.
Programs should be written quickly,
because there is not much time available.
Unlike in traditional software engineering,
the programs are short (usually at most a few
hundred lines of code), and they do not need to
be maintained after the contest.

\section{Programming languages}

\index{programming language}

At the moment, the most popular programming
languages used in contests are C++, Python and Java.
For example, in Google Code Jam 2017,
among the best 3,000 participants,
79 \% used C++,
16 \% used Python and
8 \% used Java \cite{goo17}.
Some participants also used several languages.

Many people think that C++ is the best choice
for a competitive programmer,
and C++ is nearly always available in
contest systems.
The benefits of using C++ are that
it is a very efficient language and
its standard library contains a
large collection
of data structures and algorithms.

On the other hand, it is good to
master several languages and understand
their strengths.
For example, if large integers are needed
in a problem,
Python can be a good choice, because it
contains built-in operations for
calculating with large integers.
Still, most problems in programming contests
are set so that
using a specific programming language
is not an unfair advantage.

All example programs in this book are written in C++,
and the standard library's
data structures and algorithms are often used.
The programs follow the C++11 standard,
which can be used in most contests nowadays.
If you cannot program in C++ yet,
now is a good time to start learning.

\subsubsection{C++ code template}

A typical C++ code template for competitive programming
looks like this:

\begin{lstlisting}
#include <bits/stdc++.h>

using namespace std;

int main() {
    // solution comes here
}
\end{lstlisting}

The \texttt{\#include} line at the beginning
of the code is a feature of the \texttt{g++} compiler
that allows us to include the entire standard library.
Thus, there is no need to separately include
libraries such as \texttt{iostream},
\texttt{vector} and \texttt{algorithm};
they are available automatically.

The \texttt{using} line declares
that the classes and functions
of the standard library can be used directly
in the code.
Without the \texttt{using} line we would have
to write, for example, \texttt{std::cout},
but now it suffices to write \texttt{cout}.

The code can be compiled using the following command:

\begin{lstlisting}
g++ -std=c++11 -O2 -Wall test.cpp -o test
\end{lstlisting}

This command produces a binary file \texttt{test}
from the source code \texttt{test.cpp}.
The compiler follows the C++11 standard
(\texttt{-std=c++11}),
optimizes the code (\texttt{-O2})
and shows warnings about possible errors (\texttt{-Wall}).

\section{Input and output}

\index{input and output}

In most contests, standard streams are used for
reading input and writing output.
In C++, the standard streams are
\texttt{cin} for input and \texttt{cout} for output.
In addition, the C functions
\texttt{scanf} and \texttt{printf} can be used.

The input for the program usually consists of
numbers and strings that are separated with
spaces and newlines.
They can be read from the \texttt{cin} stream
as follows:

\begin{lstlisting}
int a, b;
string x;
cin >> a >> b >> x;
\end{lstlisting}

This kind of code always works,
assuming that there is at least one space
or newline between each element in the input.
For example, the above code can read
both of the following inputs:
\begin{lstlisting}
123 456 monkey
\end{lstlisting}
\begin{lstlisting}
123 456
monkey
\end{lstlisting}
The \texttt{cout} stream is used for output
as follows:
\begin{lstlisting}
int a = 123, b = 456;
string x = "monkey";
cout << a << " " << b << " " << x << "\n";
\end{lstlisting}

Input and output is sometimes
a bottleneck in the program.
The following lines at the beginning of the code
make input and output more efficient:

\begin{lstlisting}
ios::sync_with_stdio(0);
cin.tie(0);
\end{lstlisting}

Note that the newline \texttt{"\textbackslash n"}
works faster than \texttt{endl},
because \texttt{endl} always causes
a flush operation.

The C functions \texttt{scanf}
and \texttt{printf} are an alternative
to the C++ standard streams.
They are usually a bit faster,
but they are also more difficult to use.
The following code reads two integers from the input:
\begin{lstlisting}
int a, b;
scanf("%d %d", &a, &b);
\end{lstlisting}
The following code prints two integers:
\begin{lstlisting}
int a = 123, b = 456;
printf("%d %d\n", a, b);
\end{lstlisting}

Sometimes the program should read a whole line
from the input, possibly containing spaces.
This can be accomplished by using the
\texttt{getline} function:

\begin{lstlisting}
string s;
getline(cin, s);
\end{lstlisting}

If the amount of data is unknown, the following
loop is useful:
\begin{lstlisting}
while (cin >> x) {
    // code
}
\end{lstlisting}
This loop reads elements from the input
one after another, until there is no
more data available in the input.

In some contest systems, files are used for
input and output.
An easy solution for this is to write
the code as usual using standard streams,
but add the following lines to the beginning of the code:
\begin{lstlisting}
freopen("input.txt", "r", stdin);
freopen("output.txt", "w", stdout);
\end{lstlisting}
After this, the program reads the input from the file
``input.txt'' and writes the output to the file
``output.txt''.

\section{Working with numbers}

\index{integer}

\subsubsection{Integers}

The most used integer type in competitive programming
is \texttt{int}, which is a 32-bit type with
a value range of $-2^{31} \ldots 2^{31}-1$
or about $-2 \cdot 10^9 \ldots 2 \cdot 10^9$.
If the type \texttt{int} is not enough,
the 64-bit type \texttt{long long} can be used.
It has a value range of $-2^{63} \ldots 2^{63}-1$
or about $-9 \cdot 10^{18} \ldots 9 \cdot 10^{18}$.

The following code defines a
\texttt{long long} variable:
\begin{lstlisting}
long long x = 123456789123456789LL;
\end{lstlisting}
The suffix \texttt{LL} means that the
type of the number is \texttt{long long}.

A common mistake when using the type \texttt{long long}
is that the type \texttt{int} is still used somewhere
in the code.
For example, the following code contains
a subtle error:

\begin{lstlisting}
int a = 123456789;
long long b = a*a;
cout << b << "\n"; // -1757895751
\end{lstlisting}

Even though the variable \texttt{b} is of type \texttt{long long},
both numbers in the expression \texttt{a*a}
are of type \texttt{int} and the result is
also of type \texttt{int}.
Because of this, the variable \texttt{b} will
contain a wrong result.
The problem can be solved by changing the type
of \texttt{a} to \texttt{long long} or
by changing the expression to \texttt{(long long)a*a}.

Usually contest problems are set so that the
type \texttt{long long} is enough.
Still, it is good to know that
the \texttt{g++} compiler also provides
a 128-bit type \texttt{\_\_int128\_t}
with a value range of
$-2^{127} \ldots 2^{127}-1$ or about $-10^{38} \ldots 10^{38}$.
However, this type is not available in all contest systems.
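As a small illustrative sketch (an addition to the original text), the 128-bit type can hold the product of two 64-bit values that would overflow \texttt{long long}; since \texttt{cout} cannot print \texttt{\_\_int128\_t} directly, the result here is reduced modulo $10^9+7$ and converted back before printing:

\begin{lstlisting}
long long a = 123456789123456789LL;
// the product has about 34 digits and fits in 128 bits
__int128_t b = (__int128_t)a*a;
// cout cannot print __int128_t, so reduce and convert first
cout << (long long)(b % 1000000007) << "\n";
\end{lstlisting}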

\subsubsection{Modular arithmetic}

\index{remainder}
\index{modular arithmetic}

We denote by $x \bmod m$ the remainder
when $x$ is divided by $m$.
For example, $17 \bmod 5 = 2$,
because $17 = 3 \cdot 5 + 2$.

Sometimes, the answer to a problem is a
very large number but it is enough to
output it ``modulo $m$'', i.e.,
the remainder when the answer is divided by $m$
(for example, ``modulo $10^9+7$'').
The idea is that even if the actual answer
is very large,
it suffices to use the types
\texttt{int} and \texttt{long long}.

An important property of the remainder is that
in addition, subtraction and multiplication,
the remainder can be taken before the operation:

\[
\begin{array}{rcr}
(a+b) \bmod m & = & (a \bmod m + b \bmod m) \bmod m \\
(a-b) \bmod m & = & (a \bmod m - b \bmod m) \bmod m \\
(a \cdot b) \bmod m & = & (a \bmod m \cdot b \bmod m) \bmod m
\end{array}
\]

Thus, we can take the remainder after every operation
and the numbers will never become too large.

For example, the following code calculates $n!$,
the factorial of $n$, modulo $m$:
\begin{lstlisting}
long long x = 1;
for (int i = 2; i <= n; i++) {
    x = (x*i)%m;
}
cout << x%m << "\n";
\end{lstlisting}

Usually we want the remainder to always
be between $0\ldots m-1$.
However, in C++ and other languages,
the remainder of a negative number
is either zero or negative.
An easy way to make sure there
are no negative remainders is to first calculate
the remainder as usual and then add $m$
if the result is negative:
\begin{lstlisting}
x = x%m;
if (x < 0) x += m;
\end{lstlisting}
However, this is only needed when there
are subtractions in the code and the
remainder may become negative.

\subsubsection{Floating point numbers}

\index{floating point number}

The usual floating point types in
competitive programming are
the 64-bit \texttt{double}
and, as an extension in the \texttt{g++} compiler,
the 80-bit \texttt{long double}.
In most cases, \texttt{double} is enough,
but \texttt{long double} is more accurate.

The required precision of the answer
is usually given in the problem statement.
An easy way to output the answer is to use
the \texttt{printf} function
and give the number of decimal places
in the formatting string.
For example, the following code prints
the value of $x$ with 9 decimal places:

\begin{lstlisting}
printf("%.9f\n", x);
\end{lstlisting}

A difficulty when using floating point numbers
is that some numbers cannot be represented
accurately as floating point numbers,
and there will be rounding errors.
For example, the result of the following code
is surprising:

\begin{lstlisting}
double x = 0.3*3+0.1;
printf("%.20f\n", x); // 0.99999999999999988898
\end{lstlisting}

Due to a rounding error,
the value of \texttt{x} is a bit smaller than 1,
while the correct value would be 1.

It is risky to compare floating point numbers
with the \texttt{==} operator,
because it is possible that the values should be
equal but they are not because of precision errors.
A better way to compare floating point numbers
is to assume that two numbers are equal
if the difference between them is less than $\varepsilon$,
where $\varepsilon$ is a small number.

In practice, the numbers can be compared
as follows ($\varepsilon=10^{-9}$):

\begin{lstlisting}
if (abs(a-b) < 1e-9) {
    // a and b are equal
}
\end{lstlisting}

Note that while floating point numbers are inaccurate,
integers up to a certain limit can still be
represented accurately.
For example, using \texttt{double},
it is possible to accurately represent all
integers whose absolute value is at most $2^{53}$.
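The following small sketch (an addition to the original text) illustrates this limit: $2^{53}$ is represented exactly, but $2^{53}+1$ is not, so adding 1 has no effect:

\begin{lstlisting}
double a = 9007199254740992.0; // 2^53, represented exactly
double b = a+1;                // 2^53+1 rounds back to 2^53
printf("%.0f %.0f\n", a, b);   // both values print the same
\end{lstlisting}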
||||
|
||||
\section{Shortening code}
|
||||
|
||||
Short code is ideal in competitive programming,
|
||||
because programs should be written
|
||||
as fast as possible.
|
||||
Because of this, competitive programmers often define
|
||||
shorter names for datatypes and other parts of code.
|
||||
|
||||
\subsubsection{Type names}
|
||||
\index{tuppdef@\texttt{typedef}}
|
||||
Using the command \texttt{typedef}
|
||||
it is possible to give a shorter name
|
||||
to a datatype.
|
||||
For example, the name \texttt{long long} is long,
|
||||
so we can define a shorter name \texttt{ll}:
|
||||
\begin{lstlisting}
|
||||
typedef long long ll;
|
||||
\end{lstlisting}
|
||||
After this, the code
|
||||
\begin{lstlisting}
|
||||
long long a = 123456789;
|
||||
long long b = 987654321;
|
||||
cout << a*b << "\n";
|
||||
\end{lstlisting}
|
||||
can be shortened as follows:
|
||||
\begin{lstlisting}
|
||||
ll a = 123456789;
|
||||
ll b = 987654321;
|
||||
cout << a*b << "\n";
|
||||
\end{lstlisting}
|
||||
|
||||
The command \texttt{typedef}
|
||||
can also be used with more complex types.
|
||||
For example, the following code gives
|
||||
the name \texttt{vi} for a vector of integers
|
||||
and the name \texttt{pi} for a pair
|
||||
that contains two integers.
|
||||
\begin{lstlisting}
|
||||
typedef vector<int> vi;
|
||||
typedef pair<int,int> pi;
|
||||
\end{lstlisting}
|
||||
|
||||
\subsubsection{Macros}
|
||||
\index{macro}
|
||||
Another way to shorten code is to define
|
||||
\key{macros}.
|
||||
A macro means that certain strings in
|
||||
the code will be changed before the compilation.
|
||||
In C++, macros are defined using the
|
||||
\texttt{\#define} keyword.
|
||||
|
||||
For example, we can define the following macros:
|
||||
\begin{lstlisting}
|
||||
#define F first
|
||||
#define S second
|
||||
#define PB push_back
|
||||
#define MP make_pair
|
||||
\end{lstlisting}
|
||||
After this, the code
|
||||
\begin{lstlisting}
|
||||
v.push_back(make_pair(y1,x1));
|
||||
v.push_back(make_pair(y2,x2));
|
||||
int d = v[i].first+v[i].second;
|
||||
\end{lstlisting}
|
||||
can be shortened as follows:
|
||||
\begin{lstlisting}
|
||||
v.PB(MP(y1,x1));
|
||||
v.PB(MP(y2,x2));
|
||||
int d = v[i].F+v[i].S;
|
||||
\end{lstlisting}
|
||||
|
||||
A macro can also have parameters
|
||||
which makes it possible to shorten loops and other
|
||||
structures.
|
||||
For example, we can define the following macro:
|
||||
\begin{lstlisting}
|
||||
#define REP(i,a,b) for (int i = a; i <= b; i++)
|
||||
\end{lstlisting}
|
||||
After this, the code
|
||||
\begin{lstlisting}
|
||||
for (int i = 1; i <= n; i++) {
|
||||
search(i);
|
||||
}
|
||||
\end{lstlisting}
|
||||
can be shortened as follows:
|
||||
\begin{lstlisting}
|
||||
REP(i,1,n) {
|
||||
search(i);
|
||||
}
|
||||
\end{lstlisting}
|
||||
|
||||
Sometimes macros cause bugs that may be difficult
|
||||
to detect. For example, consider the following macro
|
||||
that calculates the square of a number:
|
||||
\begin{lstlisting}
|
||||
#define SQ(a) a*a
|
||||
\end{lstlisting}
|
||||
This macro \emph{does not} always work as expected.
|
||||
For example, the code
|
||||
\begin{lstlisting}
|
||||
cout << SQ(3+3) << "\n";
|
||||
\end{lstlisting}
|
||||
corresponds to the code
|
||||
\begin{lstlisting}
|
||||
cout << 3+3*3+3 << "\n"; // 15
|
||||
\end{lstlisting}
|
||||
|
||||
A better version of the macro is as follows:
|
||||
\begin{lstlisting}
|
||||
#define SQ(a) (a)*(a)
|
||||
\end{lstlisting}
|
||||
Now the code
|
||||
\begin{lstlisting}
|
||||
cout << SQ(3+3) << "\n";
|
||||
\end{lstlisting}
|
||||
corresponds to the code
|
||||
\begin{lstlisting}
|
||||
cout << (3+3)*(3+3) << "\n"; // 36
|
||||
\end{lstlisting}
|
||||
|
||||
|
||||
\section{Mathematics}
|
||||
|
||||
Mathematics plays an important role in competitive
|
||||
programming, and it is not possible to become
|
||||
a successful competitive programmer without
|
||||
having good mathematical skills.
|
||||
This section discusses some important
|
||||
mathematical concepts and formulas that
|
||||
are needed later in the book.
|
||||
|
||||
\subsubsection{Sum formulas}
|
||||
|
||||
Each sum of the form
|
||||
\[\sum_{x=1}^n x^k = 1^k+2^k+3^k+\ldots+n^k,\]
|
||||
where $k$ is a positive integer,
|
||||
has a closed-form formula that is a
|
||||
polynomial of degree $k+1$.
|
||||
For example\footnote{\index{Faulhaber's formula}
|
||||
There is even a general formula for such sums, called \key{Faulhaber's formula},
|
||||
but it is too complex to be presented here.},
|
||||
\[\sum_{x=1}^n x = 1+2+3+\ldots+n = \frac{n(n+1)}{2}\]
|
||||
and
|
||||
\[\sum_{x=1}^n x^2 = 1^2+2^2+3^2+\ldots+n^2 = \frac{n(n+1)(2n+1)}{6}.\]
|
||||
|
||||
An \key{arithmetic progression} is a \index{arithmetic progression}
|
||||
sequence of numbers
|
||||
where the difference between any two consecutive
|
||||
numbers is constant.
|
||||
For example,
|
||||
\[3, 7, 11, 15\]
|
||||
is an arithmetic progression with constant 4.
|
||||
The sum of an arithmetic progression can be calculated
|
||||
using the formula
|
||||
\[\underbrace{a + \cdots + b}_{n \,\, \textrm{numbers}} = \frac{n(a+b)}{2}\]
|
||||
where $a$ is the first number,
|
||||
$b$ is the last number and
|
||||
$n$ is the amount of numbers.
|
||||
For example,
|
||||
\[3+7+11+15=\frac{4 \cdot (3+15)}{2} = 36.\]
|
||||
The formula is based on the fact
|
||||
that the sum consists of $n$ numbers and
|
||||
the value of each number is $(a+b)/2$ on average.
|
||||
|
||||
\index{geometric progression}
|
||||
A \key{geometric progression} is a sequence
|
||||
of numbers
|
||||
where the ratio between any two consecutive
|
||||
numbers is constant.
|
||||
For example,
|
||||
\[3,6,12,24\]
|
||||
is a geometric progression with constant 2.
|
||||
The sum of a geometric progression can be calculated
|
||||
using the formula
|
||||
\[a + ak + ak^2 + \cdots + b = \frac{bk-a}{k-1}\]
|
||||
where $a$ is the first number,
|
||||
$b$ is the last number and the
|
||||
ratio between consecutive numbers is $k$.
|
||||
For example,
|
||||
\[3+6+12+24=\frac{24 \cdot 2 - 3}{2-1} = 45.\]
|
||||
|
||||
This formula can be derived as follows. Let
|
||||
\[ S = a + ak + ak^2 + \cdots + b .\]
|
||||
By multiplying both sides by $k$, we get
|
||||
\[ kS = ak + ak^2 + ak^3 + \cdots + bk,\]
|
||||
and solving the equation
|
||||
\[ kS-S = bk-a\]
|
||||
yields the formula.
|
||||
|
||||
A special case of a sum of a geometric progression is the formula
|
||||
\[1+2+4+8+\ldots+2^{n-1}=2^n-1.\]
|
||||
|
||||
\index{harmonic sum}
|
||||
|
||||
A \key{harmonic sum} is a sum of the form
|
||||
\[ \sum_{x=1}^n \frac{1}{x} = 1+\frac{1}{2}+\frac{1}{3}+\ldots+\frac{1}{n}.\]
|
||||
|
||||
An upper bound for a harmonic sum is $\log_2(n)+1$.
|
||||
Namely, we can
|
||||
modify each term $1/k$ so that $k$ becomes
|
||||
the nearest power of two that does not exceed $k$.
|
||||
For example, when $n=6$, we can estimate
|
||||
the sum as follows:
|
||||
\[ 1+\frac{1}{2}+\frac{1}{3}+\frac{1}{4}+\frac{1}{5}+\frac{1}{6} \le
|
||||
1+\frac{1}{2}+\frac{1}{2}+\frac{1}{4}+\frac{1}{4}+\frac{1}{4}.\]
|
||||
This upper bound consists of $\log_2(n)+1$ parts
|
||||
($1$, $2 \cdot 1/2$, $4 \cdot 1/4$, etc.),
|
||||
and the value of each part is at most 1.
|
||||
|
||||
\subsubsection{Set theory}
|
||||
|
||||
\index{set theory}
|
||||
\index{set}
|
||||
\index{intersection}
|
||||
\index{union}
|
||||
\index{difference}
|
||||
\index{subset}
|
||||
\index{universal set}
|
||||
\index{complement}
|
||||
|
||||
A \key{set} is a collection of elements.
|
||||
For example, the set
|
||||
\[X=\{2,4,7\}\]
|
||||
contains elements 2, 4 and 7.
|
||||
The symbol $\emptyset$ denotes an empty set,
|
||||
and $|S|$ denotes the size of a set $S$,
|
||||
i.e., the number of elements in the set.
|
||||
For example, in the above set, $|X|=3$.
|
||||
|
||||
If a set $S$ contains an element $x$,
|
||||
we write $x \in S$,
|
||||
and otherwise we write $x \notin S$.
|
||||
For example, in the above set
|
||||
\[4 \in X \hspace{10px}\textrm{and}\hspace{10px} 5 \notin X.\]
|
||||
|
||||
\begin{samepage}
|
||||
New sets can be constructed using set operations:
|
||||
\begin{itemize}
|
||||
\item The \key{intersection} $A \cap B$ consists of elements
|
||||
that are in both $A$ and $B$.
|
||||
For example, if $A=\{1,2,5\}$ and $B=\{2,4\}$,
|
||||
then $A \cap B = \{2\}$.
|
||||
\item The \key{union} $A \cup B$ consists of elements
|
||||
that are in $A$ or $B$ or both.
|
||||
For example, if $A=\{3,7\}$ and $B=\{2,3,8\}$,
|
||||
then $A \cup B = \{2,3,7,8\}$.
|
||||
\item The \key{complement} $\bar A$ consists of elements
|
||||
that are not in $A$.
|
||||
The interpretation of a complement depends on
|
||||
the \key{universal set}, which contains all possible elements.
|
||||
For example, if $A=\{1,2,5,7\}$ and the universal set is
|
||||
$\{1,2,\ldots,10\}$, then $\bar A = \{3,4,6,8,9,10\}$.
|
||||
\item The \key{difference} $A \setminus B = A \cap \bar B$
|
||||
consists of elements that are in $A$ but not in $B$.
|
||||
Note that $B$ can contain elements that are not in $A$.
|
||||
For example, if $A=\{2,3,7,8\}$ and $B=\{3,5,8\}$,
|
||||
then $A \setminus B = \{2,7\}$.
|
||||
\end{itemize}
|
||||
\end{samepage}
|
||||
|
||||
If each element of $A$ also belongs to $S$,
|
||||
we say that $A$ is a \key{subset} of $S$,
|
||||
denoted by $A \subset S$.
|
||||
A set $S$ always has $2^{|S|}$ subsets,
|
||||
including the empty set.
|
||||
For example, the subsets of the set $\{2,4,7\}$ are
|
||||
\begin{center}
|
||||
$\emptyset$,
|
||||
$\{2\}$, $\{4\}$, $\{7\}$, $\{2,4\}$, $\{2,7\}$, $\{4,7\}$ and $\{2,4,7\}$.
|
||||
\end{center}
|
||||
|
||||
Some often used sets are
|
||||
$\mathbb{N}$ (natural numbers),
|
||||
$\mathbb{Z}$ (integers),
|
||||
$\mathbb{Q}$ (rational numbers) and
|
||||
$\mathbb{R}$ (real numbers).
|
||||
The set $\mathbb{N}$
|
||||
can be defined in two ways, depending
|
||||
on the situation:
|
||||
either $\mathbb{N}=\{0,1,2,\ldots\}$
|
||||
or $\mathbb{N}=\{1,2,3,...\}$.
|
||||
|
||||
We can also construct a set using a rule of the form
|
||||
\[\{f(n) : n \in S\},\]
|
||||
where $f(n)$ is some function.
|
||||
This set contains all elements of the form $f(n)$,
|
||||
where $n$ is an element in $S$.
|
||||
For example, the set
|
||||
\[X=\{2n : n \in \mathbb{Z}\}\]
|
||||
contains all even integers.
|
||||
|
||||
\subsubsection{Logic}
|
||||
|
||||
\index{logic}
|
||||
\index{negation}
|
||||
\index{conjuction}
|
||||
\index{disjunction}
|
||||
\index{implication}
|
||||
\index{equivalence}
|
||||
|
||||
The value of a logical expression is either
|
||||
\key{true} (1) or \key{false} (0).
|
||||
The most important logical operators are
|
||||
$\lnot$ (\key{negation}),
|
||||
$\land$ (\key{conjunction}),
|
||||
$\lor$ (\key{disjunction}),
|
||||
$\Rightarrow$ (\key{implication}) and
|
||||
$\Leftrightarrow$ (\key{equivalence}).
|
||||
The following table shows the meanings of these operators:
|
||||
|
||||
\begin{center}
|
||||
\begin{tabular}{rr|rrrrrrr}
|
||||
$A$ & $B$ & $\lnot A$ & $\lnot B$ & $A \land B$ & $A \lor B$ & $A \Rightarrow B$ & $A \Leftrightarrow B$ \\
|
||||
\hline
|
||||
0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 \\
|
||||
0 & 1 & 1 & 0 & 0 & 1 & 1 & 0 \\
|
||||
1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\
|
||||
1 & 1 & 0 & 0 & 1 & 1 & 1 & 1 \\
|
||||
\end{tabular}
|
||||
\end{center}
|
||||
|
||||
The expression $\lnot A$ has the opposite value of $A$.
|
||||
The expression $A \land B$ is true if both $A$ and $B$
|
||||
are true,
|
||||
and the expression $A \lor B$ is true if $A$ or $B$ or both
|
||||
are true.
|
||||
The expression $A \Rightarrow B$ is true
|
||||
if whenever $A$ is true, also $B$ is true.
|
||||
The expression $A \Leftrightarrow B$ is true
|
||||
if $A$ and $B$ are both true or both false.
|
||||
|
||||
\index{predicate}

A \key{predicate} is an expression that is true or false
depending on its parameters.
Predicates are usually denoted by capital letters.
For example, we can define a predicate $P(x)$
that is true exactly when $x$ is a prime number.
Using this definition, $P(7)$ is true but $P(8)$ is false.

\index{quantifier}

A \key{quantifier} connects a logical expression
to the elements of a set.
The most important quantifiers are
$\forall$ (\key{for all}) and $\exists$ (\key{there is}).
For example,
\[\forall x (\exists y (y < x))\]
means that for each element $x$ in the set,
there is an element $y$ in the set
such that $y$ is smaller than $x$.
This is true in the set of integers,
but false in the set of natural numbers.

Using the notation described above,
we can express many kinds of logical propositions.
For example,
\[\forall x ((x>1 \land \lnot P(x)) \Rightarrow (\exists a (\exists b (a > 1 \land b > 1 \land x = ab))))\]
means that if a number $x$ is larger than 1
and not a prime number,
then there are numbers $a$ and $b$
that are larger than $1$ and whose product is $x$.
This proposition is true in the set of integers.
\subsubsection{Functions}

The function $\lfloor x \rfloor$ rounds the number $x$
down to an integer, and the function
$\lceil x \rceil$ rounds the number $x$
up to an integer. For example,
\[ \lfloor 3/2 \rfloor = 1 \hspace{10px} \textrm{and} \hspace{10px} \lceil 3/2 \rceil = 2.\]

The functions $\min(x_1,x_2,\ldots,x_n)$
and $\max(x_1,x_2,\ldots,x_n)$
give the smallest and largest of the values
$x_1,x_2,\ldots,x_n$.
For example,
\[ \min(1,2,3)=1 \hspace{10px} \textrm{and} \hspace{10px} \max(1,2,3)=3.\]

\index{factorial}

The \key{factorial} $n!$ can be defined as
\[n! \,=\, \prod_{x=1}^n x = 1 \cdot 2 \cdot 3 \cdot \ldots \cdot n\]
or recursively as
\[
\begin{array}{lcl}
0! & = & 1 \\
n! & = & n \cdot (n-1)! \\
\end{array}
\]
\index{Fibonacci number}

The \key{Fibonacci numbers}
%\footnote{Fibonacci (c. 1175--1250) was an Italian mathematician.}
arise in many situations.
They can be defined recursively as follows:
\[
\begin{array}{lcl}
f(0) & = & 0 \\
f(1) & = & 1 \\
f(n) & = & f(n-1)+f(n-2) \\
\end{array}
\]
The first Fibonacci numbers are
\[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, \ldots\]
There is also a closed-form formula
for calculating Fibonacci numbers, which is sometimes called
\index{Binet's formula} \key{Binet's formula}:
\[f(n)=\frac{(1 + \sqrt{5})^n - (1-\sqrt{5})^n}{2^n \sqrt{5}}.\]
\subsubsection{Logarithms}

\index{logarithm}

The \key{logarithm} of a number $x$
is denoted $\log_k(x)$, where $k$ is the base
of the logarithm.
According to the definition,
$\log_k(x)=a$ exactly when $k^a=x$.

A useful property of logarithms is
that $\log_k(x)$ equals the number of times
we have to divide $x$ by $k$ before we reach
the number 1.
For example, $\log_2(32)=5$
because 5 divisions by 2 are needed:
\[32 \rightarrow 16 \rightarrow 8 \rightarrow 4 \rightarrow 2 \rightarrow 1 \]

Logarithms are often used in the analysis of
algorithms, because many efficient algorithms
halve something at each step.
Hence, we can estimate the efficiency of such algorithms
using logarithms.

The logarithm of a product is
\[\log_k(ab) = \log_k(a)+\log_k(b),\]
and consequently,
\[\log_k(x^n) = n \cdot \log_k(x).\]
In addition, the logarithm of a quotient is
\[\log_k\Big(\frac{a}{b}\Big) = \log_k(a)-\log_k(b).\]
Another useful formula is
\[\log_u(x) = \frac{\log_k(x)}{\log_k(u)},\]
and using this, it is possible to calculate
logarithms to any base if there is a way to
calculate logarithms to some fixed base.

\index{natural logarithm}

The \key{natural logarithm} $\ln(x)$ of a number $x$
is a logarithm whose base is $e \approx 2.71828$.
Another property of logarithms is that
the number of digits of an integer $x$ in base $b$ is
$\lfloor \log_b(x)+1 \rfloor$.
For example, the representation of
$123$ in base $2$ is 1111011 and
$\lfloor \log_2(123)+1 \rfloor = 7$.
\section{Contests and resources}

\subsubsection{IOI}

The International Olympiad in Informatics (IOI)
is an annual programming contest for
secondary school students.
Each country is allowed to send a team of
four students to the contest.
There are usually about 300 participants
from 80 countries.

The IOI consists of two five-hour long contests.
In both contests, the participants are asked to
solve three algorithm tasks of various difficulty.
The tasks are divided into subtasks,
each of which has an assigned score.
Although the contestants are divided into teams,
they compete as individuals.

The IOI syllabus \cite{iois} regulates the topics
that may appear in IOI tasks.
Almost all the topics in the IOI syllabus
are covered by this book.

Participants for the IOI are selected through
national contests.
Before the IOI, many regional contests are organized,
such as the Baltic Olympiad in Informatics (BOI),
the Central European Olympiad in Informatics (CEOI)
and the Asia-Pacific Informatics Olympiad (APIO).

Some countries organize online practice contests
for future IOI participants,
such as the Croatian Open Competition in Informatics \cite{coci}
and the USA Computing Olympiad \cite{usaco}.
In addition, a large collection of problems from Polish contests
is available online \cite{main}.
\subsubsection{ICPC}

The International Collegiate Programming Contest (ICPC)
is an annual programming contest for university students.
Each team in the contest consists of three students,
and unlike in the IOI, the students work together;
there is only one computer available for each team.

The ICPC consists of several stages, and finally the
best teams are invited to the World Finals.
While there are tens of thousands of participants
in the contest, there are only a small number\footnote{The exact number of final
slots varies from year to year; in 2017, there were 133 final slots.} of final slots available,
so even advancing to the finals
is a great achievement in some regions.

In each ICPC contest, the teams have five hours of time to
solve about ten algorithm problems.
A solution to a problem is accepted only if it solves
all test cases efficiently.
During the contest, competitors may view the results of other teams,
but for the last hour the scoreboard is frozen and it
is not possible to see the results of the last submissions.

The topics that may appear at the ICPC are not as precisely
specified as those at the IOI.
In any case, it is clear that more knowledge is needed
at the ICPC, especially more mathematical skills.
\subsubsection{Online contests}

There are also many online contests that are open for everybody.
At the moment, the most active contest site is Codeforces,
which organizes contests roughly every week.
In Codeforces, participants are divided into two divisions:
beginners compete in Div2 and more experienced programmers in Div1.
Other contest sites include AtCoder, CS Academy, HackerRank and Topcoder.

Some companies organize online contests with onsite finals.
Examples of such contests are Facebook Hacker Cup,
Google Code Jam and Yandex.Algorithm.
Of course, companies also use those contests for recruiting:
performing well in a contest is a good way to prove one's skills.
\subsubsection{Books}

There are already some books (besides this book) that
focus on competitive programming and algorithmic problem solving:

\begin{itemize}
\item S. S. Skiena and M. A. Revilla:
\emph{Programming Challenges: The Programming Contest Training Manual} \cite{ski03}
\item S. Halim and F. Halim:
\emph{Competitive Programming 3: The New Lower Bound of Programming Contests} \cite{hal13}
\item K. Diks et al.: \emph{Looking for a Challenge? The Ultimate Problem Set from
the University of Warsaw Programming Competitions} \cite{dik12}
\end{itemize}

The first two books are intended for beginners,
whereas the last book contains advanced material.

Of course, general algorithm books are also suitable for
competitive programmers.
Some popular books are:

\begin{itemize}
\item T. H. Cormen, C. E. Leiserson, R. L. Rivest and C. Stein:
\emph{Introduction to Algorithms} \cite{cor09}
\item J. Kleinberg and É. Tardos:
\emph{Algorithm Design} \cite{kle05}
\item S. S. Skiena:
\emph{The Algorithm Design Manual} \cite{ski08}
\end{itemize}
\chapter{Time complexity}

\index{time complexity}

The efficiency of algorithms is important in competitive programming.
Usually, it is easy to design an algorithm
that solves the problem slowly,
but the real challenge is to invent a
fast algorithm.
If the algorithm is too slow, it will get only
partial points or no points at all.

The \key{time complexity} of an algorithm
estimates how much time the algorithm will use
for some input.
The idea is to represent the efficiency
as a function whose parameter is the size of the input.
By calculating the time complexity,
we can find out whether the algorithm is fast enough
without implementing it.
\section{Calculation rules}

The time complexity of an algorithm
is denoted $O(\cdots)$
where the three dots represent some
function.
Usually, the variable $n$ denotes
the input size.
For example, if the input is an array of numbers,
$n$ will be the size of the array,
and if the input is a string,
$n$ will be the length of the string.
\subsubsection*{Loops}

A common reason why an algorithm is slow is
that it contains many loops that go through the input.
The more nested loops the algorithm contains,
the slower it is.
If there are $k$ nested loops,
the time complexity is $O(n^k)$.

For example, the time complexity of the following code is $O(n)$:
\begin{lstlisting}
for (int i = 1; i <= n; i++) {
    // code
}
\end{lstlisting}

And the time complexity of the following code is $O(n^2)$:
\begin{lstlisting}
for (int i = 1; i <= n; i++) {
    for (int j = 1; j <= n; j++) {
        // code
    }
}
\end{lstlisting}
\subsubsection*{Order of magnitude}

A time complexity does not tell us the exact number
of times the code inside a loop is executed,
but it only shows the order of magnitude.
In the following examples, the code inside the loop
is executed $3n$, $n+5$ and $\lceil n/2 \rceil$ times,
but the time complexity of each code is $O(n)$.

\begin{lstlisting}
for (int i = 1; i <= 3*n; i++) {
    // code
}
\end{lstlisting}

\begin{lstlisting}
for (int i = 1; i <= n+5; i++) {
    // code
}
\end{lstlisting}

\begin{lstlisting}
for (int i = 1; i <= n; i += 2) {
    // code
}
\end{lstlisting}

As another example,
the time complexity of the following code is $O(n^2)$:

\begin{lstlisting}
for (int i = 1; i <= n; i++) {
    for (int j = i+1; j <= n; j++) {
        // code
    }
}
\end{lstlisting}
\subsubsection*{Phases}

If the algorithm consists of consecutive phases,
the total time complexity is the largest
time complexity of a single phase.
The reason for this is that the slowest
phase is usually the bottleneck of the code.

For example, the following code consists
of three phases with time complexities
$O(n)$, $O(n^2)$ and $O(n)$.
Thus, the total time complexity is $O(n^2)$.

\begin{lstlisting}
for (int i = 1; i <= n; i++) {
    // code
}
for (int i = 1; i <= n; i++) {
    for (int j = 1; j <= n; j++) {
        // code
    }
}
for (int i = 1; i <= n; i++) {
    // code
}
\end{lstlisting}
\subsubsection*{Several variables}

Sometimes the time complexity depends on
several factors.
In this case, the time complexity formula
contains several variables.

For example, the time complexity of the
following code is $O(nm)$:

\begin{lstlisting}
for (int i = 1; i <= n; i++) {
    for (int j = 1; j <= m; j++) {
        // code
    }
}
\end{lstlisting}
\subsubsection*{Recursion}

The time complexity of a recursive function
depends on the number of times the function is called
and the time complexity of a single call.
The total time complexity is the product of
these values.

For example, consider the following function:
\begin{lstlisting}
void f(int n) {
    if (n == 1) return;
    f(n-1);
}
\end{lstlisting}
The call $\texttt{f}(n)$ causes $n$ function calls,
and the time complexity of each call is $O(1)$.
Thus, the total time complexity is $O(n)$.

As another example, consider the following function:
\begin{lstlisting}
void g(int n) {
    if (n == 1) return;
    g(n-1);
    g(n-1);
}
\end{lstlisting}
In this case each function call generates two other
calls, except for $n=1$.
Let us see what happens when $g$ is called
with parameter $n$.
The following table shows the function calls
produced by this single call:
\begin{center}
\begin{tabular}{rr}
function call & number of calls \\
\hline
$g(n)$ & 1 \\
$g(n-1)$ & 2 \\
$g(n-2)$ & 4 \\
$\cdots$ & $\cdots$ \\
$g(1)$ & $2^{n-1}$ \\
\end{tabular}
\end{center}
Based on this, the time complexity is
\[1+2+4+\cdots+2^{n-1} = 2^n-1 = O(2^n).\]
\section{Complexity classes}

\index{complexity classes}

The following list contains common time complexities
of algorithms:

\begin{description}
\item[$O(1)$]
\index{constant-time algorithm}
The running time of a \key{constant-time} algorithm
does not depend on the input size.
A typical constant-time algorithm is a direct
formula that calculates the answer.

\item[$O(\log n)$]
\index{logarithmic algorithm}
A \key{logarithmic} algorithm often halves
the input size at each step.
The running time of such an algorithm
is logarithmic, because
$\log_2 n$ equals the number of times
$n$ must be divided by 2 to get 1.

\item[$O(\sqrt n)$]
A \key{square root algorithm} is slower than
$O(\log n)$ but faster than $O(n)$.
A special property of square roots is that
$\sqrt n = n/\sqrt n$, so the square root $\sqrt n$ lies,
in some sense, in the middle of the input.

\item[$O(n)$]
\index{linear algorithm}
A \key{linear} algorithm goes through the input
a constant number of times.
This is often the best possible time complexity,
because it is usually necessary to access each
input element at least once before
reporting the answer.

\item[$O(n \log n)$]
This time complexity often indicates that the
algorithm sorts the input,
because the time complexity of efficient
sorting algorithms is $O(n \log n)$.
Another possibility is that the algorithm
uses a data structure where each operation
takes $O(\log n)$ time.

\item[$O(n^2)$]
\index{quadratic algorithm}
A \key{quadratic} algorithm often contains
two nested loops.
It is possible to go through all pairs of
the input elements in $O(n^2)$ time.

\item[$O(n^3)$]
\index{cubic algorithm}
A \key{cubic} algorithm often contains
three nested loops.
It is possible to go through all triplets of
the input elements in $O(n^3)$ time.

\item[$O(2^n)$]
This time complexity often indicates that
the algorithm iterates through all
subsets of the input elements.
For example, the subsets of $\{1,2,3\}$ are
$\emptyset$, $\{1\}$, $\{2\}$, $\{3\}$, $\{1,2\}$,
$\{1,3\}$, $\{2,3\}$ and $\{1,2,3\}$.

\item[$O(n!)$]
This time complexity often indicates that
the algorithm iterates through all
permutations of the input elements.
For example, the permutations of $\{1,2,3\}$ are
$(1,2,3)$, $(1,3,2)$, $(2,1,3)$, $(2,3,1)$,
$(3,1,2)$ and $(3,2,1)$.

\end{description}

\index{polynomial algorithm}
An algorithm is \key{polynomial}
if its time complexity is at most $O(n^k)$
where $k$ is a constant.
All the above time complexities except
$O(2^n)$ and $O(n!)$ are polynomial.
In practice, the constant $k$ is usually small,
and therefore a polynomial time complexity
roughly means that the algorithm is \emph{efficient}.

\index{NP-hard problem}

Most algorithms in this book are polynomial.
Still, there are many important problems for which
no polynomial algorithm is known, i.e.,
nobody knows how to solve them efficiently.
\key{NP-hard} problems are an important set
of problems, for which no polynomial algorithm
is known\footnote{A classic book on the topic is
M. R. Garey's and D. S. Johnson's
\emph{Computers and Intractability: A Guide to the Theory
of NP-Completeness} \cite{gar79}.}.
\section{Estimating efficiency}

By calculating the time complexity of an algorithm,
it is possible to check, before
implementing the algorithm, that it is
efficient enough for the problem.
The starting point for estimations is the fact that
a modern computer can perform some hundreds of
millions of operations in a second.

For example, assume that the time limit for
a problem is one second and the input size is $n=10^5$.
If the time complexity is $O(n^2)$,
the algorithm will perform about $(10^5)^2=10^{10}$ operations.
This should take at least some tens of seconds,
so the algorithm seems to be too slow for solving the problem.

On the other hand, given the input size,
we can try to \emph{guess}
the required time complexity of the algorithm
that solves the problem.
The following table contains some useful estimates
assuming a time limit of one second.

\begin{center}
\begin{tabular}{ll}
input size & required time complexity \\
\hline
$n \le 10$ & $O(n!)$ \\
$n \le 20$ & $O(2^n)$ \\
$n \le 500$ & $O(n^3)$ \\
$n \le 5000$ & $O(n^2)$ \\
$n \le 10^6$ & $O(n \log n)$ or $O(n)$ \\
$n$ is large & $O(1)$ or $O(\log n)$ \\
\end{tabular}
\end{center}

For example, if the input size is $n=10^5$,
it is probably expected that the time
complexity of the algorithm is $O(n)$ or $O(n \log n)$.
This information makes it easier to design the algorithm,
because it rules out approaches that would yield
an algorithm with a worse time complexity.

\index{constant factor}

Still, it is important to remember that a
time complexity is only an estimate of efficiency,
because it hides the \emph{constant factors}.
For example, an algorithm that runs in $O(n)$ time
may perform $n/2$ or $5n$ operations.
This has an important effect on the actual
running time of the algorithm.
\section{Maximum subarray sum}

\index{maximum subarray sum}

There are often several possible algorithms
for solving a problem such that their
time complexities are different.
This section discusses a classic problem that
has a straightforward $O(n^3)$ solution.
However, by designing a better algorithm, it
is possible to solve the problem in $O(n^2)$
time and even in $O(n)$ time.

Given an array of $n$ numbers,
our task is to calculate the
\key{maximum subarray sum}, i.e.,
the largest possible sum of
a sequence of consecutive values
in the array\footnote{J. Bentley's
book \emph{Programming Pearls} \cite{ben86} made the problem popular.}.
The problem is interesting when there may be
negative values in the array.
For example, in the array
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$-1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$-3$};
\node at (4.5,0.5) {$5$};
\node at (5.5,0.5) {$2$};
\node at (6.5,0.5) {$-5$};
\node at (7.5,0.5) {$2$};
\end{tikzpicture}
\end{center}
\begin{samepage}
the following subarray produces the maximum sum $10$:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (1,0) rectangle (6,1);
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$-1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$-3$};
\node at (4.5,0.5) {$5$};
\node at (5.5,0.5) {$2$};
\node at (6.5,0.5) {$-5$};
\node at (7.5,0.5) {$2$};
\end{tikzpicture}
\end{center}
\end{samepage}

We assume that an empty subarray is allowed,
so the maximum subarray sum is always at least $0$.
\subsubsection{Algorithm 1}

A straightforward way to solve the problem
is to go through all possible subarrays,
calculate the sum of values in each subarray and maintain
the maximum sum.
The following code implements this algorithm:

\begin{lstlisting}
int best = 0;
for (int a = 0; a < n; a++) {
    for (int b = a; b < n; b++) {
        int sum = 0;
        for (int k = a; k <= b; k++) {
            sum += array[k];
        }
        best = max(best,sum);
    }
}
cout << best << "\n";
\end{lstlisting}

The variables \texttt{a} and \texttt{b} fix the first and
last index of the subarray,
and the sum of values is calculated into the variable \texttt{sum}.
The variable \texttt{best} contains the maximum sum found during the search.

The time complexity of the algorithm is $O(n^3)$,
because it consists of three nested loops
that go through the input.
\subsubsection{Algorithm 2}

It is easy to make Algorithm 1 more efficient
by removing one loop from it.
This is possible by calculating the sum at the same
time when the right end of the subarray moves.
The result is the following code:

\begin{lstlisting}
int best = 0;
for (int a = 0; a < n; a++) {
    int sum = 0;
    for (int b = a; b < n; b++) {
        sum += array[b];
        best = max(best,sum);
    }
}
cout << best << "\n";
\end{lstlisting}
After this change, the time complexity is $O(n^2)$.
\subsubsection{Algorithm 3}

Surprisingly, it is possible to solve the problem
in $O(n)$ time\footnote{In \cite{ben86}, this linear-time algorithm
is attributed to J. B. Kadane, and the algorithm is sometimes
called \index{Kadane's algorithm} \key{Kadane's algorithm}.}, which means
that just one loop is enough.
The idea is to calculate, for each array position,
the maximum sum of a subarray that ends at that position.
After this, the answer for the problem is the
maximum of those sums.

Consider the subproblem of finding the maximum-sum subarray
that ends at position $k$.
There are two possibilities:
\begin{enumerate}
\item The subarray only contains the element at position $k$.
\item The subarray consists of a subarray that ends
at position $k-1$, followed by the element at position $k$.
\end{enumerate}

In the latter case, since we want to
find a subarray with maximum sum,
the subarray that ends at position $k-1$
should also have the maximum sum.
Thus, we can solve the problem efficiently
by calculating the maximum subarray sum
for each ending position from left to right.

The following code implements the algorithm:
\begin{lstlisting}
int best = 0, sum = 0;
for (int k = 0; k < n; k++) {
    sum = max(array[k],sum+array[k]);
    best = max(best,sum);
}
cout << best << "\n";
\end{lstlisting}

The algorithm only contains one loop
that goes through the input,
so the time complexity is $O(n)$.
This is also the best possible time complexity,
because any algorithm for the problem
has to examine all array elements at least once.
\subsubsection{Efficiency comparison}

It is interesting to study how efficient
algorithms are in practice.
The following table shows the running times
of the above algorithms for different
values of $n$ on a modern computer.

In each test, the input was generated randomly.
The time needed for reading the input was not
measured.

\begin{center}
\begin{tabular}{rrrr}
array size $n$ & Algorithm 1 & Algorithm 2 & Algorithm 3 \\
\hline
$10^2$ & $0.0$ s & $0.0$ s & $0.0$ s \\
$10^3$ & $0.1$ s & $0.0$ s & $0.0$ s \\
$10^4$ & > $10.0$ s & $0.1$ s & $0.0$ s \\
$10^5$ & > $10.0$ s & $5.3$ s & $0.0$ s \\
$10^6$ & > $10.0$ s & > $10.0$ s & $0.0$ s \\
$10^7$ & > $10.0$ s & > $10.0$ s & $0.0$ s \\
\end{tabular}
\end{center}

The comparison shows that all algorithms
are efficient when the input size is small,
but larger inputs bring out remarkable
differences in the running times of the algorithms.
Algorithm 1 becomes slow
when $n=10^4$, and Algorithm 2
becomes slow when $n=10^5$.
Only Algorithm 3 is able to process
even the largest inputs instantly.
\chapter{Sorting}

\index{sorting}

\key{Sorting}
is a fundamental algorithm design problem.
Many efficient algorithms
use sorting as a subroutine,
because it is often easier to process
data if the elements are in a sorted order.

For example, the problem ``does an array contain
two equal elements?'' is easy to solve using sorting.
If the array contains two equal elements,
they will be next to each other after sorting,
so it is easy to find them.
Also, the problem ``what is the most frequent element
in an array?'' can be solved similarly.

There are many algorithms for sorting, and they are
also good examples of how to apply
different algorithm design techniques.
The efficient general sorting algorithms
work in $O(n \log n)$ time,
and many algorithms that use sorting
as a subroutine also
have this time complexity.
\section{Sorting theory}

The basic problem in sorting is as follows:
\begin{framed}
\noindent
Given an array that contains $n$ elements,
your task is to sort the elements
in increasing order.
\end{framed}
\noindent
For example, the array
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$8$};
\node at (3.5,0.5) {$2$};
\node at (4.5,0.5) {$9$};
\node at (5.5,0.5) {$2$};
\node at (6.5,0.5) {$5$};
\node at (7.5,0.5) {$6$};
\end{tikzpicture}
\end{center}
will be as follows after sorting:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$2$};
\node at (3.5,0.5) {$3$};
\node at (4.5,0.5) {$5$};
\node at (5.5,0.5) {$6$};
\node at (6.5,0.5) {$8$};
\node at (7.5,0.5) {$9$};
\end{tikzpicture}
\end{center}
\subsubsection{$O(n^2)$ algorithms}

\index{bubble sort}

Simple algorithms for sorting an array
work in $O(n^2)$ time.
Such algorithms are short and usually
consist of two nested loops.
A famous $O(n^2)$ time sorting algorithm
is \key{bubble sort} where the elements
``bubble'' in the array according to their values.

Bubble sort consists of $n$ rounds.
On each round, the algorithm iterates through
the elements of the array.
Whenever two consecutive elements are found
that are not in correct order,
the algorithm swaps them.
The algorithm can be implemented as follows:
\begin{lstlisting}
for (int i = 0; i < n; i++) {
    for (int j = 0; j < n-1; j++) {
        if (array[j] > array[j+1]) {
            swap(array[j],array[j+1]);
        }
    }
}
\end{lstlisting}

After the first round of the algorithm,
the largest element will be in the correct position,
and in general, after $k$ rounds, the $k$ largest
elements will be in the correct positions.
Thus, after $n$ rounds, the whole array
will be sorted.
For example, in the array

\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$8$};
\node at (3.5,0.5) {$2$};
\node at (4.5,0.5) {$9$};
\node at (5.5,0.5) {$2$};
\node at (6.5,0.5) {$5$};
\node at (7.5,0.5) {$6$};
\end{tikzpicture}
\end{center}

\noindent
the first round of bubble sort swaps elements
as follows:

\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$2$};
\node at (3.5,0.5) {$8$};
\node at (4.5,0.5) {$9$};
\node at (5.5,0.5) {$2$};
\node at (6.5,0.5) {$5$};
\node at (7.5,0.5) {$6$};

\draw[thick,<->] (3.5,-0.25) .. controls (3.25,-1.00) and (2.75,-1.00) .. (2.5,-0.25);
\end{tikzpicture}
\end{center}

\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$2$};
\node at (3.5,0.5) {$8$};
\node at (4.5,0.5) {$2$};
\node at (5.5,0.5) {$9$};
\node at (6.5,0.5) {$5$};
\node at (7.5,0.5) {$6$};

\draw[thick,<->] (5.5,-0.25) .. controls (5.25,-1.00) and (4.75,-1.00) .. (4.5,-0.25);
\end{tikzpicture}
\end{center}

\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$2$};
\node at (3.5,0.5) {$8$};
\node at (4.5,0.5) {$2$};
\node at (5.5,0.5) {$5$};
\node at (6.5,0.5) {$9$};
\node at (7.5,0.5) {$6$};

\draw[thick,<->] (6.5,-0.25) .. controls (6.25,-1.00) and (5.75,-1.00) .. (5.5,-0.25);
\end{tikzpicture}
\end{center}

\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$2$};
\node at (3.5,0.5) {$8$};
\node at (4.5,0.5) {$2$};
\node at (5.5,0.5) {$5$};
\node at (6.5,0.5) {$6$};
\node at (7.5,0.5) {$9$};

\draw[thick,<->] (7.5,-0.25) .. controls (7.25,-1.00) and (6.75,-1.00) .. (6.5,-0.25);
\end{tikzpicture}
\end{center}
\subsubsection{Inversions}

\index{inversion}

Bubble sort is an example of a sorting
algorithm that always swaps \emph{consecutive}
elements in the array.
It turns out that the time complexity
of such an algorithm is \emph{always}
at least $O(n^2)$, because in the worst case,
$O(n^2)$ swaps are required for sorting the array.

A useful concept when analyzing sorting
algorithms is an \key{inversion}:
a pair of array elements
$(\texttt{array}[a],\texttt{array}[b])$ such that
$a<b$ and $\texttt{array}[a]>\texttt{array}[b]$,
i.e., the elements are in the wrong order.
For example, the array
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$2$};
\node at (3.5,0.5) {$6$};
\node at (4.5,0.5) {$3$};
\node at (5.5,0.5) {$5$};
\node at (6.5,0.5) {$9$};
\node at (7.5,0.5) {$8$};
\end{tikzpicture}
\end{center}
has three inversions: $(6,3)$, $(6,5)$ and $(9,8)$.
The number of inversions indicates
how much work is needed to sort the array.
An array is completely sorted when
there are no inversions.
On the other hand, if the array elements
are in the reverse order,
the number of inversions is the largest possible:
\[1+2+\cdots+(n-1)=\frac{n(n-1)}{2} = O(n^2)\]

Swapping a pair of consecutive elements that are
in the wrong order removes exactly one inversion
from the array.
Hence, if a sorting algorithm can only
swap consecutive elements, each swap removes
at most one inversion, and the time complexity
of the algorithm is at least $O(n^2)$.
\subsubsection{$O(n \log n)$ algorithms}

\index{merge sort}

It is possible to sort an array efficiently
in $O(n \log n)$ time using algorithms
that are not limited to swapping consecutive elements.
One such algorithm is \key{merge sort}\footnote{According to \cite{knu983},
merge sort was invented by J. von Neumann in 1945.},
which is based on recursion.

Merge sort sorts a subarray \texttt{array}$[a \ldots b]$ as follows:

\begin{enumerate}
\item If $a=b$, do not do anything, because the subarray is already sorted.
\item Calculate the position of the middle element: $k=\lfloor (a+b)/2 \rfloor$.
\item Recursively sort the subarray \texttt{array}$[a \ldots k]$.
\item Recursively sort the subarray \texttt{array}$[k+1 \ldots b]$.
\item \emph{Merge} the sorted subarrays \texttt{array}$[a \ldots k]$ and
\texttt{array}$[k+1 \ldots b]$
into a sorted subarray \texttt{array}$[a \ldots b]$.
\end{enumerate}
Merge sort is an efficient algorithm, because it
halves the size of the subarray at each step.
The recursion consists of $O(\log n)$ levels,
and processing each level takes $O(n)$ time.
Merging the subarrays \texttt{array}$[a \ldots k]$ and \texttt{array}$[k+1 \ldots b]$
is possible in linear time, because they are already sorted.

For example, consider sorting the following array:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$6$};
\node at (3.5,0.5) {$2$};
\node at (4.5,0.5) {$8$};
\node at (5.5,0.5) {$2$};
\node at (6.5,0.5) {$5$};
\node at (7.5,0.5) {$9$};
\end{tikzpicture}
\end{center}

The array will be divided into two subarrays
as follows:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (4,1);
\draw (5,0) grid (9,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$6$};
\node at (3.5,0.5) {$2$};

\node at (5.5,0.5) {$8$};
\node at (6.5,0.5) {$2$};
\node at (7.5,0.5) {$5$};
\node at (8.5,0.5) {$9$};
\end{tikzpicture}
\end{center}

Then, the subarrays will be sorted recursively
as follows:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (4,1);
\draw (5,0) grid (9,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$3$};
\node at (3.5,0.5) {$6$};

\node at (5.5,0.5) {$2$};
\node at (6.5,0.5) {$5$};
\node at (7.5,0.5) {$8$};
\node at (8.5,0.5) {$9$};
\end{tikzpicture}
\end{center}

Finally, the algorithm merges the sorted
subarrays and creates the final sorted array:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$2$};
\node at (3.5,0.5) {$3$};
\node at (4.5,0.5) {$5$};
\node at (5.5,0.5) {$6$};
\node at (6.5,0.5) {$8$};
\node at (7.5,0.5) {$9$};
\end{tikzpicture}
\end{center}
\subsubsection{Sorting lower bound}

Is it possible to sort an array faster
than in $O(n \log n)$ time?
It turns out that this is \emph{not} possible
when we restrict ourselves to sorting algorithms
that are based on comparing array elements.

The lower bound for the time complexity
can be proved by considering sorting
as a process where each comparison of two elements
gives more information about the contents of the array.
The process creates the following tree:

\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) rectangle (3,1);
\node at (1.5,0.5) {$x < y?$};

\draw[thick,->] (1.5,0) -- (-2.5,-1.5);
\draw[thick,->] (1.5,0) -- (5.5,-1.5);

\draw (-4,-2.5) rectangle (-1,-1.5);
\draw (4,-2.5) rectangle (7,-1.5);
\node at (-2.5,-2) {$x < y?$};
\node at (5.5,-2) {$x < y?$};

\draw[thick,->] (-2.5,-2.5) -- (-4.5,-4);
\draw[thick,->] (-2.5,-2.5) -- (-0.5,-4);
\draw[thick,->] (5.5,-2.5) -- (3.5,-4);
\draw[thick,->] (5.5,-2.5) -- (7.5,-4);

\draw (-6,-5) rectangle (-3,-4);
\draw (-2,-5) rectangle (1,-4);
\draw (2,-5) rectangle (5,-4);
\draw (6,-5) rectangle (9,-4);
\node at (-4.5,-4.5) {$x < y?$};
\node at (-0.5,-4.5) {$x < y?$};
\node at (3.5,-4.5) {$x < y?$};
\node at (7.5,-4.5) {$x < y?$};

\draw[thick,->] (-4.5,-5) -- (-5.5,-6);
\draw[thick,->] (-4.5,-5) -- (-3.5,-6);
\draw[thick,->] (-0.5,-5) -- (0.5,-6);
\draw[thick,->] (-0.5,-5) -- (-1.5,-6);
\draw[thick,->] (3.5,-5) -- (2.5,-6);
\draw[thick,->] (3.5,-5) -- (4.5,-6);
\draw[thick,->] (7.5,-5) -- (6.5,-6);
\draw[thick,->] (7.5,-5) -- (8.5,-6);
\end{tikzpicture}
\end{center}

Here ``$x<y?$'' means that some elements
$x$ and $y$ are compared.
If $x<y$, the process continues to the left,
and otherwise to the right.
The results of the process are the possible
ways to sort the array, a total of $n!$ ways.
For this reason, the height of the tree
must be at least
\[ \log_2(n!) = \log_2(1)+\log_2(2)+\cdots+\log_2(n).\]
We get a lower bound for this sum
by choosing the last $n/2$ elements and
changing the value of each element to $\log_2(n/2)$.
This yields an estimate
\[ \log_2(n!) \ge (n/2) \cdot \log_2(n/2),\]
so the height of the tree and the minimum
possible number of steps in a sorting
algorithm in the worst case
is at least $n \log n$.
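As a concrete check of the bound (our example, not from the original text): for $n=8$ there are $8!=40320$ permutations, so the tree must have height at least
\[ \log_2(40320) \approx 15.3, \]
i.e., at least 16 comparisons are needed in the worst case, while the estimate above gives the weaker but still valid lower bound $(8/2) \cdot \log_2(8/2)=8$.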
\subsubsection{Counting sort}

\index{counting sort}

The lower bound $n \log n$ does not apply to
algorithms that do not compare array elements
but use some other information.
An example of such an algorithm is
\key{counting sort} that sorts an array in
$O(n)$ time assuming that every element in the array
is an integer between $0 \ldots c$ and $c=O(n)$.

The algorithm creates a \emph{bookkeeping} array,
whose indices are elements of the original array.
The algorithm iterates through the original array
and calculates how many times each element
appears in the array.
\newpage

For example, the array
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$6$};
\node at (3.5,0.5) {$9$};
\node at (4.5,0.5) {$9$};
\node at (5.5,0.5) {$3$};
\node at (6.5,0.5) {$5$};
\node at (7.5,0.5) {$9$};
\end{tikzpicture}
\end{center}
corresponds to the following bookkeeping array:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (9,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$0$};
\node at (2.5,0.5) {$2$};
\node at (3.5,0.5) {$0$};
\node at (4.5,0.5) {$1$};
\node at (5.5,0.5) {$1$};
\node at (6.5,0.5) {$0$};
\node at (7.5,0.5) {$0$};
\node at (8.5,0.5) {$3$};

\footnotesize

\node at (0.5,1.5) {$1$};
\node at (1.5,1.5) {$2$};
\node at (2.5,1.5) {$3$};
\node at (3.5,1.5) {$4$};
\node at (4.5,1.5) {$5$};
\node at (5.5,1.5) {$6$};
\node at (6.5,1.5) {$7$};
\node at (7.5,1.5) {$8$};
\node at (8.5,1.5) {$9$};
\end{tikzpicture}
\end{center}

For example, the value at position 3
in the bookkeeping array is 2,
because the element 3 appears 2 times
in the original array.

Construction of the bookkeeping array
takes $O(n)$ time. After this, the sorted array
can be created in $O(n)$ time because
the number of occurrences of each element can be retrieved
from the bookkeeping array.
Thus, the total time complexity of counting
sort is $O(n)$.

Counting sort is a very efficient algorithm
but it can only be used when the constant $c$
is small enough, so that the array elements can
be used as indices in the bookkeeping array.
\section{Sorting in C++}

\index{sort@\texttt{sort}}

It is almost never a good idea to use
a home-made sorting algorithm
in a contest, because there are good
implementations available in programming languages.
For example, the C++ standard library contains
the function \texttt{sort} that can be easily used for
sorting arrays and other data structures.

There are many benefits in using a library function.
First, it saves time because there is no need to
implement the function.
Second, the library implementation is
certainly correct and efficient: it is unlikely
that a home-made sorting function would be better.

In this section we will see how to use the
C++ \texttt{sort} function.
The following code sorts
a vector in increasing order:
\begin{lstlisting}
vector<int> v = {4,2,5,3,5,8,3};
sort(v.begin(),v.end());
\end{lstlisting}
After the sorting, the contents of the
vector will be
$[2,3,3,4,5,5,8]$.
The default sorting order is increasing,
but a reverse order is possible as follows:
\begin{lstlisting}
sort(v.rbegin(),v.rend());
\end{lstlisting}
An ordinary array can be sorted as follows:
\begin{lstlisting}
int n = 7; // array size
int a[] = {4,2,5,3,5,8,3};
sort(a,a+n);
\end{lstlisting}
\newpage
The following code sorts the string \texttt{s}:
\begin{lstlisting}
string s = "monkey";
sort(s.begin(), s.end());
\end{lstlisting}
Sorting a string means that the characters
of the string are sorted.
For example, the string ``monkey'' becomes ``ekmnoy''.
\subsubsection{Comparison operators}

\index{comparison operator}

The function \texttt{sort} requires that
a \key{comparison operator} is defined for the data type
of the elements to be sorted.
When sorting, this operator will be used
whenever it is necessary to find out the order of two elements.

Most C++ data types have a built-in comparison operator,
and elements of those types can be sorted automatically.
For example, numbers are sorted according to their values
and strings are sorted in alphabetical order.

\index{pair@\texttt{pair}}

Pairs (\texttt{pair}) are sorted primarily according to their
first elements (\texttt{first}).
However, if the first elements of two pairs are equal,
they are sorted according to their second elements (\texttt{second}):
\begin{lstlisting}
vector<pair<int,int>> v;
v.push_back({1,5});
v.push_back({2,3});
v.push_back({1,2});
sort(v.begin(), v.end());
\end{lstlisting}
After this, the order of the pairs is
$(1,2)$, $(1,5)$ and $(2,3)$.

\index{tuple@\texttt{tuple}}

In a similar way, tuples (\texttt{tuple})
are sorted primarily by the first element,
secondarily by the second element, etc.\footnote{Note that in some older compilers,
the function \texttt{make\_tuple} has to be used to create a tuple instead of
braces (for example, \texttt{make\_tuple(2,1,4)} instead of \texttt{\{2,1,4\}}).}:
\begin{lstlisting}
vector<tuple<int,int,int>> v;
v.push_back({2,1,4});
v.push_back({1,5,3});
v.push_back({2,1,3});
sort(v.begin(), v.end());
\end{lstlisting}
After this, the order of the tuples is
$(1,5,3)$, $(2,1,3)$ and $(2,1,4)$.
\subsubsection{User-defined structs}

User-defined structs do not have a comparison
operator automatically.
The operator should be defined inside
the struct as a function
\texttt{operator<},
whose parameter is another element of the same type.
The operator should return \texttt{true}
if the element is smaller than the parameter,
and \texttt{false} otherwise.

For example, the following struct \texttt{P}
contains the x and y coordinates of a point.
The comparison operator is defined so that
the points are sorted primarily by the x coordinate
and secondarily by the y coordinate.

\begin{lstlisting}
struct P {
    int x, y;
    bool operator<(const P &p) const {
        if (x != p.x) return x < p.x;
        else return y < p.y;
    }
};
\end{lstlisting}
\subsubsection{Comparison functions}

\index{comparison function}

It is also possible to give an external
\key{comparison function} to the \texttt{sort} function
as a callback function.
For example, the following comparison function \texttt{comp}
sorts strings primarily by length and secondarily
by alphabetical order:

\begin{lstlisting}
bool comp(string a, string b) {
    if (a.size() != b.size()) return a.size() < b.size();
    return a < b;
}
\end{lstlisting}
Now a vector of strings can be sorted as follows:
\begin{lstlisting}
sort(v.begin(), v.end(), comp);
\end{lstlisting}
\section{Binary search}

\index{binary search}

A general method for searching for an element
in an array is to use a \texttt{for} loop
that iterates through the elements of the array.
For example, the following code searches for
an element $x$ in an array:

\begin{lstlisting}
for (int i = 0; i < n; i++) {
    if (array[i] == x) {
        // x found at index i
    }
}
\end{lstlisting}

The time complexity of this approach is $O(n)$,
because in the worst case, it is necessary to check
all elements of the array.
If the order of the elements is arbitrary,
this is also the best possible approach, because
there is no additional information available about where
in the array we should search for the element $x$.

However, if the array is \emph{sorted},
the situation is different.
In this case it is possible to perform the
search much faster, because the order of the
elements in the array guides the search.
The following \key{binary search} algorithm
efficiently searches for an element in a sorted array
in $O(\log n)$ time.

\subsubsection{Method 1}

The usual way to implement binary search
resembles looking up a word in a dictionary.
The search maintains an active region in the array,
which initially contains all array elements.
Then, a number of steps is performed,
each of which halves the size of the region.

At each step, the search checks the middle element
of the active region.
If the middle element is the target element,
the search terminates.
Otherwise, the search recursively continues
to the left or right half of the region,
depending on the value of the middle element.

The above idea can be implemented as follows:
\begin{lstlisting}
int a = 0, b = n-1;
while (a <= b) {
    int k = (a+b)/2;
    if (array[k] == x) {
        // x found at index k
    }
    if (array[k] > x) b = k-1;
    else a = k+1;
}
\end{lstlisting}

In this implementation, the active region is $a \ldots b$,
and initially the region is $0 \ldots n-1$.
The algorithm halves the size of the region at each step,
so the time complexity is $O(\log n)$.
\subsubsection{Method 2}

An alternative method to implement binary search
is based on an efficient way to iterate through
the elements of the array.
The idea is to make jumps and slow down
when we get closer to the target element.

The search goes through the array from left to
right, and the initial jump length is $n/2$.
At each step, the jump length will be halved:
first $n/4$, then $n/8$, $n/16$, etc., until
finally the length is 1.
After the jumps, either the target element has
been found or we know that it does not appear in the array.

The following code implements the above idea:
\begin{lstlisting}
int k = 0;
for (int b = n/2; b >= 1; b /= 2) {
    while (k+b < n && array[k+b] <= x) k += b;
}
if (array[k] == x) {
    // x found at index k
}
\end{lstlisting}

During the search, the variable $b$
contains the current jump length.
The time complexity of the algorithm is $O(\log n)$,
because the code in the \texttt{while} loop
is performed at most twice for each jump length.
\subsubsection{C++ functions}

The C++ standard library contains the following functions
that are based on binary search and work in logarithmic time:

\begin{itemize}
\item \texttt{lower\_bound} returns a pointer to the
first array element whose value is at least $x$.
\item \texttt{upper\_bound} returns a pointer to the
first array element whose value is larger than $x$.
\item \texttt{equal\_range} returns both above pointers.
\end{itemize}

The functions assume that the array is sorted.
If there is no such element, the pointer points to
the element after the last array element.
For example, the following code finds out whether
an array contains an element with value $x$:

\begin{lstlisting}
auto k = lower_bound(array,array+n,x)-array;
if (k < n && array[k] == x) {
    // x found at index k
}
\end{lstlisting}

Then, the following code counts the number of elements
whose value is $x$:

\begin{lstlisting}
auto a = lower_bound(array, array+n, x);
auto b = upper_bound(array, array+n, x);
cout << b-a << "\n";
\end{lstlisting}

Using \texttt{equal\_range}, the code becomes shorter:

\begin{lstlisting}
auto r = equal_range(array, array+n, x);
cout << r.second-r.first << "\n";
\end{lstlisting}
\subsubsection{Finding the smallest solution}

An important use for binary search is
to find the position where the value of a \emph{function} changes.
Suppose that we wish to find the smallest value $k$
that is a valid solution for a problem.
We are given a function $\texttt{ok}(x)$
that returns \texttt{true} if $x$ is a valid solution
and \texttt{false} otherwise.
In addition, we know that $\texttt{ok}(x)$ is \texttt{false}
when $x<k$ and \texttt{true} when $x \ge k$.
The situation looks as follows:

\begin{center}
\begin{tabular}{r|rrrrrrrr}
$x$ & 0 & 1 & $\cdots$ & $k-1$ & $k$ & $k+1$ & $\cdots$ \\
\hline
$\texttt{ok}(x)$ & \texttt{false} & \texttt{false}
& $\cdots$ & \texttt{false} & \texttt{true} & \texttt{true} & $\cdots$ \\
\end{tabular}
\end{center}

\noindent
Now, the value of $k$ can be found using binary search:

\begin{lstlisting}
int x = -1;
for (int b = z; b >= 1; b /= 2) {
    while (!ok(x+b)) x += b;
}
int k = x+1;
\end{lstlisting}

The search finds the largest value of $x$ for which
$\texttt{ok}(x)$ is \texttt{false}.
Thus, the next value $k=x+1$
is the smallest possible value for which
$\texttt{ok}(k)$ is \texttt{true}.
The initial jump length $z$ has to be
large enough, for example some value
for which we know beforehand that $\texttt{ok}(z)$ is \texttt{true}.

The algorithm calls the function \texttt{ok}
$O(\log z)$ times, so the total time complexity
depends on the function \texttt{ok}.
For example, if the function works in $O(n)$ time,
the total time complexity is $O(n \log z)$.
\subsubsection{Finding the maximum value}

Binary search can also be used to find
the maximum value for a function that is
first increasing and then decreasing.
Our task is to find a position $k$ such that

\begin{itemize}
\item
$f(x)<f(x+1)$ when $x<k$, and
\item
$f(x)>f(x+1)$ when $x \ge k$.
\end{itemize}

The idea is to use binary search
for finding the largest value of $x$
for which $f(x)<f(x+1)$.
This implies that $k=x+1$
because $f(x+1)>f(x+2)$.
The following code implements the search:

\begin{lstlisting}
int x = -1;
for (int b = z; b >= 1; b /= 2) {
    while (f(x+b) < f(x+b+1)) x += b;
}
int k = x+1;
\end{lstlisting}

Note that unlike in the ordinary binary search,
here it is not allowed that consecutive values
of the function are equal.
In this case it would not be possible to know
how to continue the search.
\chapter{Data structures}

\index{data structure}

A \key{data structure} is a way to store
data in the memory of a computer.
It is important to choose an appropriate
data structure for a problem,
because each data structure has its own
advantages and disadvantages.
The crucial question is: which operations
are efficient in the chosen data structure?

This chapter introduces the most important
data structures in the C++ standard library.
It is a good idea to use the standard library
whenever possible,
because it will save a lot of time.
Later in the book we will learn about more sophisticated
data structures that are not available
in the standard library.
\section{Dynamic arrays}

\index{dynamic array}
\index{vector}

A \key{dynamic array} is an array whose
size can be changed during the execution
of the program.
The most popular dynamic array in C++ is
the \texttt{vector} structure,
which can be used almost like an ordinary array.

The following code creates an empty vector and
adds three elements to it:

\begin{lstlisting}
vector<int> v;
v.push_back(3); // [3]
v.push_back(2); // [3,2]
v.push_back(5); // [3,2,5]
\end{lstlisting}

After this, the elements can be accessed like in an ordinary array:

\begin{lstlisting}
cout << v[0] << "\n"; // 3
cout << v[1] << "\n"; // 2
cout << v[2] << "\n"; // 5
\end{lstlisting}

The function \texttt{size} returns the number of elements in the vector.
The following code iterates through
the vector and prints all elements in it:

\begin{lstlisting}
for (int i = 0; i < v.size(); i++) {
    cout << v[i] << "\n";
}
\end{lstlisting}

\begin{samepage}
A shorter way to iterate through a vector is as follows:

\begin{lstlisting}
for (auto x : v) {
    cout << x << "\n";
}
\end{lstlisting}
\end{samepage}

The function \texttt{back} returns the last element
in the vector, and
the function \texttt{pop\_back} removes the last element:

\begin{lstlisting}
vector<int> v;
v.push_back(5);
v.push_back(2);
cout << v.back() << "\n"; // 2
v.pop_back();
cout << v.back() << "\n"; // 5
\end{lstlisting}

The following code creates a vector with five elements:

\begin{lstlisting}
vector<int> v = {2,4,2,5,1};
\end{lstlisting}

Another way to create a vector is to give the number
of elements and the initial value for each element:

\begin{lstlisting}
// size 10, initial value 0
vector<int> v(10);
\end{lstlisting}
\begin{lstlisting}
// size 10, initial value 5
vector<int> v(10, 5);
\end{lstlisting}

The internal implementation of a vector
uses an ordinary array.
If the size of the vector increases and
the array becomes too small,
a new array is allocated and all the
elements are moved to the new array.
However, this does not happen often and the
average time complexity of
\texttt{push\_back} is $O(1)$.
\index{string}

The \texttt{string} structure
is also a dynamic array that can be used almost like a vector.
In addition, there is special syntax for strings
that is not available in other data structures.
Strings can be combined using the \texttt{+} symbol.
The function $\texttt{substr}(k,x)$ returns the substring
that begins at position $k$ and has length $x$,
and the function $\texttt{find}(\texttt{t})$ finds the position
of the first occurrence of a substring \texttt{t}.

The following code presents some string operations:

\begin{lstlisting}
string a = "hatti";
string b = a+a;
cout << b << "\n"; // hattihatti
b[5] = 'v';
cout << b << "\n"; // hattivatti
string c = b.substr(3,4);
cout << c << "\n"; // tiva
\end{lstlisting}
|
||||
|
||||
\section{Set structures}
|
||||
|
||||
\index{set}
|
||||
|
||||
A \key{set} is a data structure that
|
||||
maintains a collection of elements.
|
||||
The basic operations of sets are element
|
||||
insertion, search and removal.
|
||||
|
||||
The C++ standard library contains two set
|
||||
implementations:
|
||||
The structure \texttt{set} is based on a balanced
|
||||
binary tree and its operations work in $O(\log n)$ time.
|
||||
The structure \texttt{unordered\_set} uses hashing,
|
||||
and its operations work in $O(1)$ time on average.
|
||||
|
||||
The choice of which set implementation to use
|
||||
is often a matter of taste.
|
||||
The benefit of the \texttt{set} structure
|
||||
is that it maintains the order of the elements
|
||||
and provides functions that are not available
|
||||
in \texttt{unordered\_set}.
|
||||
On the other hand, \texttt{unordered\_set}
|
||||
can be more efficient.
|
||||
|
||||
The following code creates a set
|
||||
that contains integers,
|
||||
and shows some of the operations.
|
||||
The function \texttt{insert} adds an element to the set,
|
||||
the function \texttt{count} returns the number of occurrences
|
||||
of an element in the set,
|
||||
and the function \texttt{erase} removes an element from the set.
|
||||
|
||||
\begin{lstlisting}
|
||||
set<int> s;
|
||||
s.insert(3);
|
||||
s.insert(2);
|
||||
s.insert(5);
|
||||
cout << s.count(3) << "\n"; // 1
|
||||
cout << s.count(4) << "\n"; // 0
|
||||
s.erase(3);
|
||||
s.insert(4);
|
||||
cout << s.count(3) << "\n"; // 0
|
||||
cout << s.count(4) << "\n"; // 1
|
||||
\end{lstlisting}

A set can be used mostly like a vector,
but it is not possible to access
the elements using the \texttt{[]} notation.
The following code creates a set,
prints the number of elements in it, and then
iterates through all the elements:
\begin{lstlisting}
set<int> s = {2,5,6,8};
cout << s.size() << "\n"; // 4
for (auto x : s) {
    cout << x << "\n";
}
\end{lstlisting}

An important property of sets is
that all their elements are \emph{distinct}.
Thus, the function \texttt{count} always returns
either 0 (the element is not in the set)
or 1 (the element is in the set),
and the function \texttt{insert} never adds
an element to the set if it is
already there.
The following code illustrates this:

\begin{lstlisting}
set<int> s;
s.insert(5);
s.insert(5);
s.insert(5);
cout << s.count(5) << "\n"; // 1
\end{lstlisting}

C++ also contains the structures
\texttt{multiset} and \texttt{unordered\_multiset}
that otherwise work like \texttt{set}
and \texttt{unordered\_set}
but they can contain multiple instances of an element.
For example, in the following code all three instances
of the number 5 are added to a multiset:

\begin{lstlisting}
multiset<int> s;
s.insert(5);
s.insert(5);
s.insert(5);
cout << s.count(5) << "\n"; // 3
\end{lstlisting}
The function \texttt{erase} removes
all instances of an element
from a multiset:
\begin{lstlisting}
s.erase(5);
cout << s.count(5) << "\n"; // 0
\end{lstlisting}
Often, only one instance should be removed,
which can be done as follows:
\begin{lstlisting}
s.erase(s.find(5));
cout << s.count(5) << "\n"; // 2
\end{lstlisting}
Note that \texttt{find} returns the \texttt{end} iterator
if the element does not exist in the multiset,
and erasing the \texttt{end} iterator leads to
undefined behavior.

\section{Map structures}

\index{map}

A \key{map} is a generalized array
that consists of key-value pairs.
While the keys in an ordinary array are always
the consecutive integers $0,1,\ldots,n-1$,
where $n$ is the size of the array,
the keys in a map can be of any data type and
they do not have to be consecutive values.

The C++ standard library contains two map
implementations that correspond to the set
implementations: the structure
\texttt{map} is based on a balanced
binary tree and accessing elements
takes $O(\log n)$ time,
while the structure
\texttt{unordered\_map} uses hashing
and accessing elements takes $O(1)$ time on average.

The following code creates a map
where the keys are strings and the values are integers:

\begin{lstlisting}
map<string,int> m;
m["monkey"] = 4;
m["banana"] = 3;
m["harpsichord"] = 9;
cout << m["banana"] << "\n"; // 3
\end{lstlisting}

If the value of a key is requested
but the map does not contain it,
the key is automatically added to the map with
a default value.
For example, in the following code,
the key ``aybabtu'' with value 0
is added to the map.

\begin{lstlisting}
map<string,int> m;
cout << m["aybabtu"] << "\n"; // 0
\end{lstlisting}
The function \texttt{count} checks
if a key exists in a map:
\begin{lstlisting}
if (m.count("aybabtu")) {
    // key exists
}
\end{lstlisting}
The following code prints all the keys and values
in a map:
\begin{lstlisting}
for (auto x : m) {
    cout << x.first << " " << x.second << "\n";
}
\end{lstlisting}

\section{Iterators and ranges}

\index{iterator}

Many functions in the C++ standard library
operate with iterators.
An \key{iterator} is a variable that points
to an element in a data structure.

The often used iterators \texttt{begin}
and \texttt{end} define a range that contains
all elements in a data structure.
The iterator \texttt{begin} points to
the first element in the data structure,
and the iterator \texttt{end} points to
the position \emph{after} the last element.
The situation looks as follows:

\begin{center}
\begin{tabular}{llllllllll}
\{ & 3, & 4, & 6, & 8, & 12, & 13, & 14, & 17 & \} \\
& $\uparrow$ & & & & & & & & $\uparrow$ \\
& \multicolumn{3}{l}{\texttt{s.begin()}} & & & & & & \texttt{s.end()} \\
\end{tabular}
\end{center}

Note the asymmetry in the iterators:
\texttt{s.begin()} points to an element in the data structure,
while \texttt{s.end()} points outside the data structure.
Thus, the range defined by the iterators is \emph{half-open}.

\subsubsection{Working with ranges}

Iterators are used in C++ standard library functions
that are given a range of elements in a data structure.
Usually, we want to process all elements in a
data structure, so the iterators
\texttt{begin} and \texttt{end} are given to the function.

For example, the following code sorts a vector
using the function \texttt{sort},
then reverses the order of the elements using the function
\texttt{reverse}, and finally shuffles the order of
the elements using the function \texttt{random\_shuffle}.

\index{sort@\texttt{sort}}
\index{reverse@\texttt{reverse}}
\index{random\_shuffle@\texttt{random\_shuffle}}

\begin{lstlisting}
sort(v.begin(), v.end());
reverse(v.begin(), v.end());
random_shuffle(v.begin(), v.end());
\end{lstlisting}

These functions can also be used with an ordinary array.
In this case, the functions are given pointers to the array
instead of iterators:

\newpage
\begin{lstlisting}
sort(a, a+n);
reverse(a, a+n);
random_shuffle(a, a+n);
\end{lstlisting}
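
Note that \texttt{random\_shuffle} was deprecated in C++14
and removed in C++17.
In modern code, the function \texttt{shuffle} together with
an explicit random number generator should be used instead;
a minimal sketch (the seed value here is arbitrary):

\begin{lstlisting}
vector<int> v = {1,2,3,4,5};
mt19937 rng(12345); // any seed; fixed here for reproducibility
shuffle(v.begin(), v.end(), rng);
\end{lstlisting}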

\subsubsection{Set iterators}

Iterators are often used to access
elements of a set.
The following code creates an iterator
\texttt{it} that points to the smallest element in a set:
\begin{lstlisting}
set<int>::iterator it = s.begin();
\end{lstlisting}
A shorter way to write the code is as follows:
\begin{lstlisting}
auto it = s.begin();
\end{lstlisting}
The element to which an iterator points
can be accessed using the \texttt{*} symbol.
For example, the following code prints
the first element in the set:

\begin{lstlisting}
auto it = s.begin();
cout << *it << "\n";
\end{lstlisting}

Iterators can be moved using the operators
\texttt{++} (forward) and \texttt{--} (backward),
meaning that the iterator moves to the next
or previous element in the set.

The following code prints all the elements
in increasing order:
\begin{lstlisting}
for (auto it = s.begin(); it != s.end(); it++) {
    cout << *it << "\n";
}
\end{lstlisting}
The following code prints the largest element in the set:
\begin{lstlisting}
auto it = s.end(); it--;
cout << *it << "\n";
\end{lstlisting}

The function $\texttt{find}(x)$ returns an iterator
that points to an element whose value is $x$.
However, if the set does not contain $x$,
the iterator will be \texttt{end}.

\begin{lstlisting}
auto it = s.find(x);
if (it == s.end()) {
    // x is not found
}
\end{lstlisting}

The function $\texttt{lower\_bound}(x)$ returns
an iterator to the smallest element in the set
whose value is \emph{at least} $x$, and
the function $\texttt{upper\_bound}(x)$
returns an iterator to the smallest element in the set
whose value is \emph{larger than} $x$.
In both functions, if such an element does not exist,
the return value is \texttt{end}.
These functions are not supported by the
\texttt{unordered\_set} structure, which
does not maintain the order of the elements.

\begin{samepage}
For example, the following code finds the element
nearest to $x$:

\begin{lstlisting}
auto it = s.lower_bound(x);
if (it == s.begin()) {
    cout << *it << "\n";
} else if (it == s.end()) {
    it--;
    cout << *it << "\n";
} else {
    int a = *it; it--;
    int b = *it;
    if (x-b < a-x) cout << b << "\n";
    else cout << a << "\n";
}
\end{lstlisting}

The code assumes that the set is not empty,
and goes through all possible cases
using an iterator \texttt{it}.
First, the iterator points to the smallest
element whose value is at least $x$.
If \texttt{it} equals \texttt{begin},
the corresponding element is nearest to $x$.
If \texttt{it} equals \texttt{end},
the largest element in the set is nearest to $x$.
If none of the previous cases hold,
the element nearest to $x$ is either the
element that corresponds to \texttt{it} or the previous element.
\end{samepage}

\section{Other structures}

\subsubsection{Bitset}

\index{bitset}

A \key{bitset} is an array
in which each value is either 0 or 1.
For example, the following code creates
a bitset that contains 10 elements:
\begin{lstlisting}
bitset<10> s;
s[1] = 1;
s[3] = 1;
s[4] = 1;
s[7] = 1;
cout << s[4] << "\n"; // 1
cout << s[5] << "\n"; // 0
\end{lstlisting}

The benefit of using bitsets is that
they require less memory than ordinary arrays,
because each element in a bitset only
uses one bit of memory.
For example,
if $n$ bits are stored in an \texttt{int} array,
$32n$ bits of memory will be used,
but a corresponding bitset only requires $n$ bits of memory.
In addition, the values of a bitset
can be efficiently manipulated using
bit operators, which makes it possible to
optimize algorithms that use bitsets.

The following code shows another way to create the above bitset:
\begin{lstlisting}
bitset<10> s(string("0010011010")); // from right to left
cout << s[4] << "\n"; // 1
cout << s[5] << "\n"; // 0
\end{lstlisting}

The function \texttt{count} returns the number
of ones in the bitset:

\begin{lstlisting}
bitset<10> s(string("0010011010"));
cout << s.count() << "\n"; // 4
\end{lstlisting}

The following code shows examples of using bit operations:
\begin{lstlisting}
bitset<10> a(string("0010110110"));
bitset<10> b(string("1011011000"));
cout << (a&b) << "\n"; // 0010010000
cout << (a|b) << "\n"; // 1011111110
cout << (a^b) << "\n"; // 1001101110
\end{lstlisting}

\subsubsection{Deque}

\index{deque}

A \key{deque} is a dynamic array
whose size can be efficiently
changed at both ends of the array.
Like a vector, a deque provides the functions
\texttt{push\_back} and \texttt{pop\_back}, but
it also includes the functions
\texttt{push\_front} and \texttt{pop\_front},
which are not available in a vector.

A deque can be used as follows:
\begin{lstlisting}
deque<int> d;
d.push_back(5); // [5]
d.push_back(2); // [5,2]
d.push_front(3); // [3,5,2]
d.pop_back(); // [3,5]
d.pop_front(); // [5]
\end{lstlisting}

The internal implementation of a deque
is more complex than that of a vector,
and for this reason, a deque is slower than a vector.
Still, both adding and removing
elements take $O(1)$ time on average at both ends.

\subsubsection{Stack}

\index{stack}

A \key{stack}
is a data structure that provides two
$O(1)$ time operations:
adding an element to the top,
and removing an element from the top.
It is only possible to access the top
element of a stack.

The following code shows how a stack can be used:
\begin{lstlisting}
stack<int> s;
s.push(3);
s.push(2);
s.push(5);
cout << s.top(); // 5
s.pop();
cout << s.top(); // 2
\end{lstlisting}

\subsubsection{Queue}

\index{queue}

A \key{queue} also
provides two $O(1)$ time operations:
adding an element to the end of the queue,
and removing the first element in the queue.
It is only possible to access the first
and last element of a queue.

The following code shows how a queue can be used:
\begin{lstlisting}
queue<int> q;
q.push(3);
q.push(2);
q.push(5);
cout << q.front(); // 3
q.pop();
cout << q.front(); // 2
\end{lstlisting}

\subsubsection{Priority queue}

\index{priority queue}
\index{heap}

A \key{priority queue}
maintains a set of elements.
The supported operations are insertion and,
depending on the type of the queue,
retrieval and removal of
either the minimum or maximum element.
Insertion and removal take $O(\log n)$ time,
and retrieval takes $O(1)$ time.

While an ordered set efficiently supports
all the operations of a priority queue,
the benefit of using a priority queue is
that it has smaller constant factors.
A priority queue is usually implemented using
a heap structure that is much simpler than a
balanced binary tree used in an ordered set.

\begin{samepage}
By default, the elements in a C++
priority queue are sorted in decreasing order,
and it is possible to find and remove the
largest element in the queue.
The following code illustrates this:

\begin{lstlisting}
priority_queue<int> q;
q.push(3);
q.push(5);
q.push(7);
q.push(2);
cout << q.top() << "\n"; // 7
q.pop();
cout << q.top() << "\n"; // 5
q.pop();
q.push(6);
cout << q.top() << "\n"; // 6
q.pop();
\end{lstlisting}
\end{samepage}

If we want to create a priority queue
that supports finding and removing
the smallest element,
we can do it as follows:

\begin{lstlisting}
priority_queue<int,vector<int>,greater<int>> q;
\end{lstlisting}
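
The smallest element can then be retrieved and removed
with \texttt{top} and \texttt{pop}; a short usage sketch:

\begin{lstlisting}
priority_queue<int,vector<int>,greater<int>> q;
q.push(3);
q.push(5);
q.push(7);
q.push(2);
cout << q.top() << "\n"; // 2
q.pop();
cout << q.top() << "\n"; // 3
\end{lstlisting}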

\subsubsection{Policy-based data structures}

The \texttt{g++} compiler also supports
some data structures that are not part
of the C++ standard library.
Such structures are called \emph{policy-based}
data structures.
To use these structures, the following lines
must be added to the code:
\begin{lstlisting}
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>
using namespace __gnu_pbds;
\end{lstlisting}
After this, we can define a data structure \texttt{indexed\_set} that
is like \texttt{set} but can be indexed like an array.
The definition for \texttt{int} values is as follows:
\begin{lstlisting}
typedef tree<int,null_type,less<int>,rb_tree_tag,
             tree_order_statistics_node_update> indexed_set;
\end{lstlisting}
Now we can create a set as follows:
\begin{lstlisting}
indexed_set s;
s.insert(2);
s.insert(3);
s.insert(7);
s.insert(9);
\end{lstlisting}
The speciality of this set is that we have access to
the indices that the elements would have in a sorted array.
The function $\texttt{find\_by\_order}$ returns
an iterator to the element at a given position:
\begin{lstlisting}
auto x = s.find_by_order(2);
cout << *x << "\n"; // 7
\end{lstlisting}
And the function $\texttt{order\_of\_key}$
returns the position of a given element:
\begin{lstlisting}
cout << s.order_of_key(7) << "\n"; // 2
\end{lstlisting}
If the element does not appear in the set,
we get the position that the element would have
in the set:
\begin{lstlisting}
cout << s.order_of_key(6) << "\n"; // 2
cout << s.order_of_key(8) << "\n"; // 3
\end{lstlisting}
Both functions work in logarithmic time.

\section{Comparison to sorting}

It is often possible to solve a problem
using either data structures or sorting.
Sometimes there are remarkable differences
in the actual efficiency of these approaches,
which may be hidden in their time complexities.

Let us consider a problem where
we are given two lists $A$ and $B$
that both contain $n$ elements.
Our task is to calculate the number of elements
that belong to both of the lists.
For example, for the lists
\[A = [5,2,8,9] \hspace{10px} \textrm{and} \hspace{10px} B = [3,2,9,5],\]
the answer is 3 because the numbers 2, 5
and 9 belong to both of the lists.

A straightforward solution to the problem is
to go through all pairs of elements in $O(n^2)$ time,
but next we will focus on
more efficient algorithms.

\subsubsection{Algorithm 1}

We construct a set of the elements that appear in $A$,
and after this, we iterate through the elements
of $B$ and check for each element whether it
also belongs to $A$.
This is efficient because the elements of $A$
are in a set.
Using the \texttt{set} structure,
the time complexity of the algorithm is $O(n \log n)$.
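
A minimal sketch of the algorithm, assuming the lists are
stored in vectors \texttt{A} and \texttt{B} and each list
contains distinct elements:

\begin{lstlisting}
vector<int> A = {5,2,8,9}, B = {3,2,9,5};
set<int> s(A.begin(), A.end()); // collect the elements of A
int common = 0;
for (int x : B) {
    if (s.count(x)) common++; // x appears in both lists
}
cout << common << "\n"; // 3
\end{lstlisting}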

\subsubsection{Algorithm 2}

It is not necessary to maintain an ordered set,
so instead of the \texttt{set} structure
we can also use the \texttt{unordered\_set} structure.
This is an easy way to make the algorithm
more efficient, because we only have to change
the underlying data structure.
The time complexity of the new algorithm is $O(n)$.

\subsubsection{Algorithm 3}

Instead of data structures, we can use sorting.
First, we sort both lists $A$ and $B$.
After this, we iterate through both the lists
at the same time and find the common elements.
The time complexity of sorting is $O(n \log n)$,
and the rest of the algorithm works in $O(n)$ time,
so the total time complexity is $O(n \log n)$.
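
A minimal sketch of the simultaneous scan, assuming the
lists are stored in vectors \texttt{A} and \texttt{B} and
each list contains distinct elements:

\begin{lstlisting}
vector<int> A = {5,2,8,9}, B = {3,2,9,5};
sort(A.begin(), A.end());
sort(B.begin(), B.end());
int common = 0, i = 0, j = 0;
while (i < (int)A.size() && j < (int)B.size()) {
    if (A[i] < B[j]) i++;       // advance in A
    else if (A[i] > B[j]) j++;  // advance in B
    else {common++; i++; j++;}  // common element found
}
cout << common << "\n"; // 3
\end{lstlisting}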

\subsubsection{Efficiency comparison}

The following table shows how efficient
the above algorithms are when $n$ varies and
the elements of the lists are random
integers between $1 \ldots 10^9$:

\begin{center}
\begin{tabular}{rrrr}
$n$ & Algorithm 1 & Algorithm 2 & Algorithm 3 \\
\hline
$10^6$ & $1.5$ s & $0.3$ s & $0.2$ s \\
$2 \cdot 10^6$ & $3.7$ s & $0.8$ s & $0.3$ s \\
$3 \cdot 10^6$ & $5.7$ s & $1.3$ s & $0.5$ s \\
$4 \cdot 10^6$ & $7.7$ s & $1.7$ s & $0.7$ s \\
$5 \cdot 10^6$ & $10.0$ s & $2.3$ s & $0.9$ s \\
\end{tabular}
\end{center}

Algorithms 1 and 2 are equal except that
they use different set structures.
In this problem, this choice has an important effect on
the running time, because Algorithm 2
is 4--5 times faster than Algorithm 1.

However, the most efficient algorithm is Algorithm 3,
which uses sorting.
It only uses half the time compared to Algorithm 2.
Interestingly, the time complexity of both
Algorithm 1 and Algorithm 3 is $O(n \log n)$,
but despite this, Algorithm 3 is ten times faster.
This can be explained by the fact that
sorting is a simple procedure and it is done
only once at the beginning of Algorithm 3,
after which the rest of the algorithm works in linear time.
On the other hand,
Algorithm 1 maintains a complex balanced binary tree
during the whole algorithm.
\chapter{Complete search}

\key{Complete search}
is a general method that can be used
to solve almost any algorithm problem.
The idea is to generate all possible
solutions to the problem using brute force,
and then select the best solution or count the
number of solutions, depending on the problem.

Complete search is a good technique
if there is enough time to go through all the solutions,
because the search is usually easy to implement
and it always gives the correct answer.
If complete search is too slow,
other techniques, such as greedy algorithms or
dynamic programming, may be needed.

\section{Generating subsets}

\index{subset}

We first consider the problem of generating
all subsets of a set of $n$ elements.
For example, the subsets of $\{0,1,2\}$ are
$\emptyset$, $\{0\}$, $\{1\}$, $\{2\}$, $\{0,1\}$,
$\{0,2\}$, $\{1,2\}$ and $\{0,1,2\}$.
There are two common methods to generate subsets:
we can either perform a recursive search
or exploit the bit representation of integers.

\subsubsection{Method 1}

An elegant way to go through all subsets
of a set is to use recursion.
The following function \texttt{search}
generates the subsets of the set
$\{0,1,\ldots,n-1\}$.
The function maintains a vector \texttt{subset}
that will contain the elements of each subset.
The search begins when the function is called
with parameter 0.

\begin{lstlisting}
void search(int k) {
    if (k == n) {
        // process subset
    } else {
        search(k+1);
        subset.push_back(k);
        search(k+1);
        subset.pop_back();
    }
}
\end{lstlisting}

When the function \texttt{search}
is called with parameter $k$,
it decides whether to include the
element $k$ in the subset or not,
and in both cases,
then calls itself with parameter $k+1$.
However, if $k=n$, the function notices that
all elements have been processed
and a subset has been generated.

The following tree illustrates the function calls when $n=3$.
We can always choose either the left branch
($k$ is not included in the subset) or the right branch
($k$ is included in the subset).

\begin{center}
\begin{tikzpicture}[scale=.45]
\begin{scope}
\small
\node at (0,0) {$\texttt{search}(0)$};

\node at (-8,-4) {$\texttt{search}(1)$};
\node at (8,-4) {$\texttt{search}(1)$};

\path[draw,thick,->] (0,0-0.5) -- (-8,-4+0.5);
\path[draw,thick,->] (0,0-0.5) -- (8,-4+0.5);

\node at (-12,-8) {$\texttt{search}(2)$};
\node at (-4,-8) {$\texttt{search}(2)$};
\node at (4,-8) {$\texttt{search}(2)$};
\node at (12,-8) {$\texttt{search}(2)$};

\path[draw,thick,->] (-8,-4-0.5) -- (-12,-8+0.5);
\path[draw,thick,->] (-8,-4-0.5) -- (-4,-8+0.5);
\path[draw,thick,->] (8,-4-0.5) -- (4,-8+0.5);
\path[draw,thick,->] (8,-4-0.5) -- (12,-8+0.5);

\node at (-14,-12) {$\texttt{search}(3)$};
\node at (-10,-12) {$\texttt{search}(3)$};
\node at (-6,-12) {$\texttt{search}(3)$};
\node at (-2,-12) {$\texttt{search}(3)$};
\node at (2,-12) {$\texttt{search}(3)$};
\node at (6,-12) {$\texttt{search}(3)$};
\node at (10,-12) {$\texttt{search}(3)$};
\node at (14,-12) {$\texttt{search}(3)$};

\node at (-14,-13.5) {$\emptyset$};
\node at (-10,-13.5) {$\{2\}$};
\node at (-6,-13.5) {$\{1\}$};
\node at (-2,-13.5) {$\{1,2\}$};
\node at (2,-13.5) {$\{0\}$};
\node at (6,-13.5) {$\{0,2\}$};
\node at (10,-13.5) {$\{0,1\}$};
\node at (14,-13.5) {$\{0,1,2\}$};

\path[draw,thick,->] (-12,-8-0.5) -- (-14,-12+0.5);
\path[draw,thick,->] (-12,-8-0.5) -- (-10,-12+0.5);
\path[draw,thick,->] (-4,-8-0.5) -- (-6,-12+0.5);
\path[draw,thick,->] (-4,-8-0.5) -- (-2,-12+0.5);
\path[draw,thick,->] (4,-8-0.5) -- (2,-12+0.5);
\path[draw,thick,->] (4,-8-0.5) -- (6,-12+0.5);
\path[draw,thick,->] (12,-8-0.5) -- (10,-12+0.5);
\path[draw,thick,->] (12,-8-0.5) -- (14,-12+0.5);
\end{scope}
\end{tikzpicture}
\end{center}

\subsubsection{Method 2}

Another way to generate subsets is based on
the bit representation of integers.
Each subset of a set of $n$ elements
can be represented as a sequence of $n$ bits,
which corresponds to an integer between $0 \ldots 2^n-1$.
The ones in the bit sequence indicate
which elements are included in the subset.

The usual convention is that
the last bit corresponds to element 0,
the second last bit corresponds to element 1,
and so on.
For example, the bit representation of 25
is 11001, which corresponds to the subset $\{0,3,4\}$.

The following code goes through the subsets
of a set of $n$ elements:

\begin{lstlisting}
for (int b = 0; b < (1<<n); b++) {
    // process subset
}
\end{lstlisting}

The following code shows how we can find
the elements of a subset that corresponds to a bit sequence.
When processing each subset,
the code builds a vector that contains the
elements in the subset.

\begin{lstlisting}
for (int b = 0; b < (1<<n); b++) {
    vector<int> subset;
    for (int i = 0; i < n; i++) {
        if (b&(1<<i)) subset.push_back(i);
    }
}
\end{lstlisting}
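
As a small self-contained check (here with $n=3$),
the technique visits each of the $2^n = 8$ subsets exactly once:

\begin{lstlisting}
int n = 3, cnt = 0;
for (int b = 0; b < (1<<n); b++) {
    vector<int> subset;
    for (int i = 0; i < n; i++) {
        if (b&(1<<i)) subset.push_back(i);
    }
    cnt++; // one subset per value of b
}
cout << cnt << "\n"; // 8
\end{lstlisting}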

\section{Generating permutations}

\index{permutation}

Next we consider the problem of generating
all permutations of a set of $n$ elements.
For example, the permutations of $\{0,1,2\}$ are
$(0,1,2)$, $(0,2,1)$, $(1,0,2)$, $(1,2,0)$,
$(2,0,1)$ and $(2,1,0)$.
Again, there are two approaches:
we can either use recursion or go through the
permutations iteratively.

\subsubsection{Method 1}

Like subsets, permutations can be generated
using recursion.
The following function \texttt{search} goes
through the permutations of the set $\{0,1,\ldots,n-1\}$.
The function builds a vector \texttt{permutation}
that contains the permutation,
and the search begins when the function is
called without parameters.

\begin{lstlisting}
void search() {
    if (permutation.size() == n) {
        // process permutation
    } else {
        for (int i = 0; i < n; i++) {
            if (chosen[i]) continue;
            chosen[i] = true;
            permutation.push_back(i);
            search();
            chosen[i] = false;
            permutation.pop_back();
        }
    }
}
\end{lstlisting}

Each function call adds a new element to
\texttt{permutation}.
The array \texttt{chosen} indicates which
elements are already included in the permutation.
If the size of \texttt{permutation} equals the size of the set,
a permutation has been generated.

\subsubsection{Method 2}

\index{next\_permutation@\texttt{next\_permutation}}

Another method for generating permutations
is to begin with the permutation
$\{0,1,\ldots,n-1\}$ and repeatedly
use a function that constructs the next permutation
in increasing order.
The C++ standard library contains the function
\texttt{next\_permutation} that can be used for this:

\begin{lstlisting}
vector<int> permutation;
for (int i = 0; i < n; i++) {
    permutation.push_back(i);
}
do {
    // process permutation
} while (next_permutation(permutation.begin(),permutation.end()));
\end{lstlisting}
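
As a small self-contained check (here with $n=3$),
starting from the sorted permutation the loop processes
all $3! = 6$ permutations:

\begin{lstlisting}
int n = 3, cnt = 0;
vector<int> permutation;
for (int i = 0; i < n; i++) {
    permutation.push_back(i);
}
do {
    cnt++; // one iteration per permutation
} while (next_permutation(permutation.begin(), permutation.end()));
cout << cnt << "\n"; // 6
\end{lstlisting}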

\section{Backtracking}

\index{backtracking}

A \key{backtracking} algorithm
begins with an empty solution
and extends the solution step by step.
The search recursively
goes through all the different ways in which
a solution can be constructed.

\index{queen problem}

As an example, consider the problem of
calculating the number
of ways $n$ queens can be placed on
an $n \times n$ chessboard so that
no two queens attack each other.
For example, when $n=4$,
there are two possible solutions:

\begin{center}
\begin{tikzpicture}[scale=.65]
\begin{scope}
\draw (0, 0) grid (4, 4);
\node at (1.5,3.5) {\symqueen};
\node at (3.5,2.5) {\symqueen};
\node at (0.5,1.5) {\symqueen};
\node at (2.5,0.5) {\symqueen};

\draw (6, 0) grid (10, 4);
\node at (6+2.5,3.5) {\symqueen};
\node at (6+0.5,2.5) {\symqueen};
\node at (6+3.5,1.5) {\symqueen};
\node at (6+1.5,0.5) {\symqueen};
\end{scope}
\end{tikzpicture}
\end{center}

The problem can be solved using backtracking
by placing queens on the board row by row.
More precisely, exactly one queen will
be placed on each row so that no queen attacks
any of the queens placed before.
A solution has been found when all
$n$ queens have been placed on the board.

For example, when $n=4$,
some partial solutions generated by
the backtracking algorithm are as follows:

\begin{center}
\begin{tikzpicture}[scale=.55]
\begin{scope}
\draw (0, 0) grid (4, 4);

\draw (-9, -6) grid (-5, -2);
\draw (-3, -6) grid (1, -2);
\draw (3, -6) grid (7, -2);
\draw (9, -6) grid (13, -2);

\node at (-9+0.5,-3+0.5) {\symqueen};
\node at (-3+1+0.5,-3+0.5) {\symqueen};
\node at (3+2+0.5,-3+0.5) {\symqueen};
\node at (9+3+0.5,-3+0.5) {\symqueen};

\draw (2,0) -- (-7,-2);
\draw (2,0) -- (-1,-2);
\draw (2,0) -- (5,-2);
\draw (2,0) -- (11,-2);

\draw (-11, -12) grid (-7, -8);
\draw (-6, -12) grid (-2, -8);
\draw (-1, -12) grid (3, -8);
\draw (4, -12) grid (8, -8);
\draw[white] (11, -12) grid (15, -8);
\node at (-11+1+0.5,-9+0.5) {\symqueen};
\node at (-6+1+0.5,-9+0.5) {\symqueen};
\node at (-1+1+0.5,-9+0.5) {\symqueen};
\node at (4+1+0.5,-9+0.5) {\symqueen};
\node at (-11+0+0.5,-10+0.5) {\symqueen};
\node at (-6+1+0.5,-10+0.5) {\symqueen};
\node at (-1+2+0.5,-10+0.5) {\symqueen};
\node at (4+3+0.5,-10+0.5) {\symqueen};

\draw (-1,-6) -- (-9,-8);
\draw (-1,-6) -- (-4,-8);
\draw (-1,-6) -- (1,-8);
\draw (-1,-6) -- (6,-8);

\node at (-9,-13) {illegal};
\node at (-4,-13) {illegal};
\node at (1,-13) {illegal};
\node at (6,-13) {valid};
\end{scope}
\end{tikzpicture}
\end{center}
|
||||
|
||||
At the bottom level, the three first configurations
|
||||
are illegal, because the queens attack each other.
|
||||
However, the fourth configuration is valid
|
||||
and it can be extended to a complete solution by
|
||||
placing two more queens to the board.
|
||||
There is only one way to place the two remaining queens.
|
||||
|
||||
\begin{samepage}
|
||||
The algorithm can be implemented as follows:
|
||||
\begin{lstlisting}
|
||||
void search(int y) {
|
||||
if (y == n) {
|
||||
count++;
|
||||
return;
|
||||
}
|
||||
for (int x = 0; x < n; x++) {
|
||||
if (column[x] || diag1[x+y] || diag2[x-y+n-1]) continue;
|
||||
column[x] = diag1[x+y] = diag2[x-y+n-1] = 1;
|
||||
search(y+1);
|
||||
column[x] = diag1[x+y] = diag2[x-y+n-1] = 0;
|
||||
}
|
||||
}
|
||||
\end{lstlisting}
|
||||
\end{samepage}
The search begins by calling \texttt{search(0)}.
The size of the board is $n \times n$,
and the code calculates the number of solutions
in \texttt{count}.

The code assumes that the rows and columns
of the board are numbered from 0 to $n-1$.
When the function \texttt{search} is
called with parameter $y$,
it places a queen on row $y$
and then calls itself with parameter $y+1$.
Then, if $y=n$, a solution has been found
and the variable \texttt{count} is increased by one.

The array \texttt{column} keeps track of columns
that contain a queen,
and the arrays \texttt{diag1} and \texttt{diag2}
keep track of diagonals.
It is not allowed to add another queen to a
column or diagonal that already contains a queen.
For example, the columns and diagonals of
the $4 \times 4$ board are numbered as follows:

\begin{center}
\begin{tikzpicture}[scale=.65]
\begin{scope}
\draw (0-6, 0) grid (4-6, 4);
\node at (-6+0.5,3.5) {$0$};
\node at (-6+1.5,3.5) {$1$};
\node at (-6+2.5,3.5) {$2$};
\node at (-6+3.5,3.5) {$3$};
\node at (-6+0.5,2.5) {$0$};
\node at (-6+1.5,2.5) {$1$};
\node at (-6+2.5,2.5) {$2$};
\node at (-6+3.5,2.5) {$3$};
\node at (-6+0.5,1.5) {$0$};
\node at (-6+1.5,1.5) {$1$};
\node at (-6+2.5,1.5) {$2$};
\node at (-6+3.5,1.5) {$3$};
\node at (-6+0.5,0.5) {$0$};
\node at (-6+1.5,0.5) {$1$};
\node at (-6+2.5,0.5) {$2$};
\node at (-6+3.5,0.5) {$3$};

\draw (0, 0) grid (4, 4);
\node at (0.5,3.5) {$0$};
\node at (1.5,3.5) {$1$};
\node at (2.5,3.5) {$2$};
\node at (3.5,3.5) {$3$};
\node at (0.5,2.5) {$1$};
\node at (1.5,2.5) {$2$};
\node at (2.5,2.5) {$3$};
\node at (3.5,2.5) {$4$};
\node at (0.5,1.5) {$2$};
\node at (1.5,1.5) {$3$};
\node at (2.5,1.5) {$4$};
\node at (3.5,1.5) {$5$};
\node at (0.5,0.5) {$3$};
\node at (1.5,0.5) {$4$};
\node at (2.5,0.5) {$5$};
\node at (3.5,0.5) {$6$};

\draw (6, 0) grid (10, 4);
\node at (6.5,3.5) {$3$};
\node at (7.5,3.5) {$4$};
\node at (8.5,3.5) {$5$};
\node at (9.5,3.5) {$6$};
\node at (6.5,2.5) {$2$};
\node at (7.5,2.5) {$3$};
\node at (8.5,2.5) {$4$};
\node at (9.5,2.5) {$5$};
\node at (6.5,1.5) {$1$};
\node at (7.5,1.5) {$2$};
\node at (8.5,1.5) {$3$};
\node at (9.5,1.5) {$4$};
\node at (6.5,0.5) {$0$};
\node at (7.5,0.5) {$1$};
\node at (8.5,0.5) {$2$};
\node at (9.5,0.5) {$3$};

\node at (-4,-1) {\texttt{column}};
\node at (2,-1) {\texttt{diag1}};
\node at (8,-1) {\texttt{diag2}};

\end{scope}
\end{tikzpicture}
\end{center}

Let $q(n)$ denote the number of ways
to place $n$ queens on an $n \times n$ chessboard.
The above backtracking
algorithm tells us that, for example, $q(8)=92$.
When $n$ increases, the search quickly becomes slow,
because the number of solutions increases
exponentially.
For example, calculating $q(16)=14772512$
using the above algorithm already takes about a minute
on a modern computer\footnote{There is no known way to efficiently
calculate larger values of $q(n)$. The current record is
$q(27)=234907967154122528$, calculated in 2016 \cite{q27}.}.

\section{Pruning the search}

We can often optimize backtracking
by pruning the search tree.
The idea is to add ``intelligence'' to the algorithm
so that it notices as soon as possible
if a partial solution cannot be extended
to a complete solution.
Such optimizations can have a tremendous
effect on the efficiency of the search.

Let us consider the problem
of calculating the number of paths
in an $n \times n$ grid from the upper-left corner
to the lower-right corner such that the
path visits each square exactly once.
For example, in a $7 \times 7$ grid,
there are 111712 such paths.
One of the paths is as follows:

\begin{center}
\begin{tikzpicture}[scale=.55]
\begin{scope}
\draw (0, 0) grid (7, 7);
\draw[thick,->] (0.5,6.5) -- (0.5,4.5) -- (2.5,4.5) --
                (2.5,3.5) -- (0.5,3.5) -- (0.5,0.5) --
                (3.5,0.5) -- (3.5,1.5) -- (1.5,1.5) --
                (1.5,2.5) -- (4.5,2.5) -- (4.5,0.5) --
                (5.5,0.5) -- (5.5,3.5) -- (3.5,3.5) --
                (3.5,5.5) -- (1.5,5.5) -- (1.5,6.5) --
                (4.5,6.5) -- (4.5,4.5) -- (5.5,4.5) --
                (5.5,6.5) -- (6.5,6.5) -- (6.5,0.5);
\end{scope}
\end{tikzpicture}
\end{center}

We focus on the $7 \times 7$ case,
because its level of difficulty is appropriate to our needs.
We begin with a straightforward backtracking algorithm,
and then optimize it step by step using observations
of how the search can be pruned.
After each optimization, we measure the running time
of the algorithm and the number of recursive calls,
so that we clearly see the effect of each
optimization on the efficiency of the search.

\subsubsection{Basic algorithm}

The first version of the algorithm does not contain
any optimizations. We simply use backtracking to generate
all possible paths from the upper-left corner to
the lower-right corner and count the number of such paths.

\begin{itemize}
\item
running time: 483 seconds
\item
number of recursive calls: 76 billion
\end{itemize}

\subsubsection{Optimization 1}

In any solution, we first move one step
down or right.
There are always two paths that
are symmetric
about the diagonal of the grid
after the first step.
For example, the following paths are symmetric:

\begin{center}
\begin{tabular}{ccc}
\begin{tikzpicture}[scale=.55]
\begin{scope}
\draw (0, 0) grid (7, 7);
\draw[thick,->] (0.5,6.5) -- (0.5,4.5) -- (2.5,4.5) --
                (2.5,3.5) -- (0.5,3.5) -- (0.5,0.5) --
                (3.5,0.5) -- (3.5,1.5) -- (1.5,1.5) --
                (1.5,2.5) -- (4.5,2.5) -- (4.5,0.5) --
                (5.5,0.5) -- (5.5,3.5) -- (3.5,3.5) --
                (3.5,5.5) -- (1.5,5.5) -- (1.5,6.5) --
                (4.5,6.5) -- (4.5,4.5) -- (5.5,4.5) --
                (5.5,6.5) -- (6.5,6.5) -- (6.5,0.5);
\end{scope}
\end{tikzpicture}
& \hspace{20px}
&
\begin{tikzpicture}[scale=.55]
\begin{scope}[yscale=1,xscale=-1,rotate=-90]
\draw (0, 0) grid (7, 7);
\draw[thick,->] (0.5,6.5) -- (0.5,4.5) -- (2.5,4.5) --
                (2.5,3.5) -- (0.5,3.5) -- (0.5,0.5) --
                (3.5,0.5) -- (3.5,1.5) -- (1.5,1.5) --
                (1.5,2.5) -- (4.5,2.5) -- (4.5,0.5) --
                (5.5,0.5) -- (5.5,3.5) -- (3.5,3.5) --
                (3.5,5.5) -- (1.5,5.5) -- (1.5,6.5) --
                (4.5,6.5) -- (4.5,4.5) -- (5.5,4.5) --
                (5.5,6.5) -- (6.5,6.5) -- (6.5,0.5);
\end{scope}
\end{tikzpicture}
\end{tabular}
\end{center}

Hence, we can decide that we always first
move one step down (or right),
and finally multiply the number of solutions by two.

\begin{itemize}
\item
running time: 244 seconds
\item
number of recursive calls: 38 billion
\end{itemize}

\subsubsection{Optimization 2}

If the path reaches the lower-right square
before it has visited all other squares of the grid,
it is clear that
it will not be possible to complete the solution.
An example of this is the following path:

\begin{center}
\begin{tikzpicture}[scale=.55]
\begin{scope}
\draw (0, 0) grid (7, 7);
\draw[thick,->] (0.5,6.5) -- (0.5,4.5) -- (2.5,4.5) --
                (2.5,3.5) -- (0.5,3.5) -- (0.5,0.5) --
                (3.5,0.5) -- (3.5,1.5) -- (1.5,1.5) --
                (1.5,2.5) -- (4.5,2.5) -- (4.5,0.5) --
                (6.5,0.5);
\end{scope}
\end{tikzpicture}
\end{center}
Using this observation, we can terminate the search
immediately if we reach the lower-right square too early.
\begin{itemize}
\item
running time: 119 seconds
\item
number of recursive calls: 20 billion
\end{itemize}

\subsubsection{Optimization 3}

If the path touches a wall
and can turn either left or right,
the grid splits into two parts
that contain unvisited squares.
For example, in the following situation,
the path can turn either left or right:

\begin{center}
\begin{tikzpicture}[scale=.55]
\begin{scope}
\draw (0, 0) grid (7, 7);
\draw[thick,->] (0.5,6.5) -- (0.5,4.5) -- (2.5,4.5) --
                (2.5,3.5) -- (0.5,3.5) -- (0.5,0.5) --
                (3.5,0.5) -- (3.5,1.5) -- (1.5,1.5) --
                (1.5,2.5) -- (4.5,2.5) -- (4.5,0.5) --
                (5.5,0.5) -- (5.5,6.5);
\end{scope}
\end{tikzpicture}
\end{center}
In this case, we cannot visit all squares anymore,
so we can terminate the search.
This optimization is very useful:

\begin{itemize}
\item
running time: 1.8 seconds
\item
number of recursive calls: 221 million
\end{itemize}

\subsubsection{Optimization 4}

The idea of Optimization 3
can be generalized:
if the path cannot continue forward
but can turn either left or right,
the grid splits into two parts
that both contain unvisited squares.
For example, consider the following path:

\begin{center}
\begin{tikzpicture}[scale=.55]
\begin{scope}
\draw (0, 0) grid (7, 7);
\draw[thick,->] (0.5,6.5) -- (0.5,4.5) -- (2.5,4.5) --
                (2.5,3.5) -- (0.5,3.5) -- (0.5,0.5) --
                (3.5,0.5) -- (3.5,1.5) -- (1.5,1.5) --
                (1.5,2.5) -- (4.5,2.5) -- (4.5,0.5) --
                (5.5,0.5) -- (5.5,4.5) -- (3.5,4.5);
\end{scope}
\end{tikzpicture}
\end{center}
It is clear that we cannot visit all squares anymore,
so we can terminate the search.
After this optimization, the search is
very efficient:

\begin{itemize}
\item
running time: 0.6 seconds
\item
number of recursive calls: 69 million
\end{itemize}

~\\
Now is a good moment to stop optimizing
the algorithm and see what we have achieved.
The running time of the original algorithm
was 483 seconds,
and now after the optimizations,
the running time is only 0.6 seconds.
Thus, the algorithm became nearly 1000 times
faster after the optimizations.

This is a usual phenomenon in backtracking,
because the search tree is usually large
and even simple observations can effectively
prune the search.
Especially useful are optimizations that
occur during the first steps of the algorithm,
i.e., at the top of the search tree.

\section{Meet in the middle}

\index{meet in the middle}

\key{Meet in the middle} is a technique
where the search space is divided into
two parts of about equal size.
A separate search is performed
for both of the parts,
and finally the results of the searches are combined.

The technique can be used
if there is an efficient way to combine the
results of the searches.
In such a situation, the two searches may require less
time than one large search.
Typically, we can turn a factor of $2^n$
into a factor of $2^{n/2}$ using the meet in the
middle technique.

As an example, consider a problem where
we are given a list of $n$ numbers and
a number $x$,
and we want to find out if it is possible
to choose some numbers from the list so that
their sum is $x$.
For example, given the list $[2,4,5,9]$ and $x=15$,
we can choose the numbers $[2,4,9]$ to get $2+4+9=15$.
However, if $x=10$ for the same list,
it is not possible to form the sum.

A simple algorithm for the problem is to
go through all subsets of the elements and
check if the sum of any of the subsets is $x$.
The running time of such an algorithm is $O(2^n)$,
because there are $2^n$ subsets.
However, using the meet in the middle technique,
we can achieve a more efficient $O(2^{n/2})$ time algorithm\footnote{This
idea was introduced in 1974 by E. Horowitz and S. Sahni \cite{hor74}.}.
Note that $O(2^n)$ and $O(2^{n/2})$ are different
complexities, because $2^{n/2}$ equals $\sqrt{2^n}$.

The idea is to divide the list into
two lists $A$ and $B$ such that both
lists contain about half of the numbers.
The first search generates all subsets
of $A$ and stores their sums in a list $S_A$.
Correspondingly, the second search creates
a list $S_B$ from $B$.
After this, it suffices to check if it is possible
to choose one element from $S_A$ and another
element from $S_B$ such that their sum is $x$.
This is possible exactly when there is a way to
form the sum $x$ using the numbers of the original list.

For example, suppose that the list is $[2,4,5,9]$ and $x=15$.
First, we divide the list into $A=[2,4]$ and $B=[5,9]$.
After this, we create the lists
$S_A=[0,2,4,6]$ and $S_B=[0,5,9,14]$.
In this case, the sum $x=15$ can be formed,
because $S_A$ contains the sum $6$,
$S_B$ contains the sum $9$, and $6+9=15$.
This corresponds to the solution $[2,4,9]$.

We can implement the algorithm so that
its time complexity is $O(2^{n/2})$.
First, we generate \emph{sorted} lists $S_A$ and $S_B$,
which can be done in $O(2^{n/2})$ time using a merge-like technique.
After this, since the lists are sorted,
we can check in $O(2^{n/2})$ time if
the sum $x$ can be created from $S_A$ and $S_B$.
\chapter{Greedy algorithms}

\index{greedy algorithm}

A \key{greedy algorithm}
constructs a solution to the problem
by always making a choice that looks
the best at the moment.
A greedy algorithm never takes back
its choices, but directly constructs
the final solution.
For this reason, greedy algorithms
are usually very efficient.

The difficulty in designing greedy algorithms
is to find a greedy strategy
that always produces an optimal solution
to the problem.
The locally optimal choices in a greedy
algorithm should also be globally optimal.
It is often difficult to argue that
a greedy algorithm works.

\section{Coin problem}

As a first example, we consider a problem
where we are given a set of coins
and our task is to form a sum of money $n$
using the coins.
The values of the coins are
$\texttt{coins}=\{c_1,c_2,\ldots,c_k\}$,
and each coin can be used as many times as we want.
What is the minimum number of coins needed?

For example, if the coins are the euro coins (in cents)
\[\{1,2,5,10,20,50,100,200\}\]
and $n=520$,
we need at least four coins.
The optimal solution is to select coins
$200+200+100+20$, whose sum is 520.

\subsubsection{Greedy algorithm}

A simple greedy algorithm for the problem
always selects the largest possible coin,
until the required sum of money has been constructed.
This algorithm works in the example case,
because we first select two 200 cent coins,
then one 100 cent coin and finally one 20 cent coin.
But does this algorithm always work?

It turns out that if the coins are the euro coins,
the greedy algorithm \emph{always} works, i.e.,
it always produces a solution with the smallest
possible number of coins.
The correctness of the algorithm can be
shown as follows:

First, each coin 1, 5, 10, 50 and 100 appears
at most once in an optimal solution,
because if a
solution contained two such coins,
we could replace them by one coin and
obtain a better solution.
For example, if a solution contained the
coins $5+5$, we could replace them by the coin $10$.

In the same way, coins 2 and 20 appear
at most twice in an optimal solution,
because we could replace
coins $2+2+2$ by coins $5+1$ and
coins $20+20+20$ by coins $50+10$.
Moreover, an optimal solution cannot contain
coins $2+2+1$ or $20+20+10$,
because we could replace them by coins $5$ and $50$.

Using these observations,
we can show for each coin $x$ that
it is not possible to optimally construct
a sum $x$ or any larger sum by only using coins
that are smaller than $x$.
For example, if $x=100$, the largest optimal
sum using the smaller coins is $50+20+20+5+2+2=99$.
Thus, the greedy algorithm that always selects
the largest coin produces the optimal solution.

This example shows that it can be difficult
to argue that a greedy algorithm works,
even if the algorithm itself is simple.

\subsubsection{General case}

In the general case, the coin set can contain any coins
and the greedy algorithm \emph{does not} necessarily produce
an optimal solution.

We can prove that a greedy algorithm does not work
by showing a counterexample
where the algorithm gives a wrong answer.
In this problem we can easily find a counterexample:
if the coins are $\{1,3,4\}$ and the target sum
is 6, the greedy algorithm produces the solution
$4+1+1$ while the optimal solution is $3+3$.

It is not known if the general coin problem
can be solved using any greedy algorithm\footnote{However, it is possible
to \emph{check} in polynomial time
if the greedy algorithm presented in this chapter works for
a given set of coins \cite{pea05}.}.
However, as we will see in Chapter 7,
in some cases,
the general problem can be efficiently
solved using a dynamic
programming algorithm that always gives the
correct answer.

\section{Scheduling}

Many scheduling problems can be solved
using greedy algorithms.
A classic problem is as follows:
given $n$ events with their starting and ending
times, find a schedule
that includes as many events as possible.
It is not possible to select an event partially.
For example, consider the following events:
\begin{center}
\begin{tabular}{lll}
event & starting time & ending time \\
\hline
$A$ & 1 & 3 \\
$B$ & 2 & 5 \\
$C$ & 3 & 9 \\
$D$ & 6 & 8 \\
\end{tabular}
\end{center}
In this case the maximum number of events is two.
For example, we can select events $B$ and $D$
as follows:
\begin{center}
\begin{tikzpicture}[scale=.4]
\begin{scope}
\draw (2, 0) rectangle (6, -1);
\draw[fill=lightgray] (4, -1.5) rectangle (10, -2.5);
\draw (6, -3) rectangle (18, -4);
\draw[fill=lightgray] (12, -4.5) rectangle (16, -5.5);
\node at (2.5,-0.5) {$A$};
\node at (4.5,-2) {$B$};
\node at (6.5,-3.5) {$C$};
\node at (12.5,-5) {$D$};
\end{scope}
\end{tikzpicture}
\end{center}

It is possible to invent several greedy algorithms
for the problem, but which of them works in every case?

\subsubsection*{Algorithm 1}

The first idea is to select events that are
as \emph{short} as possible.
In the example case this algorithm
selects the following events:
\begin{center}
\begin{tikzpicture}[scale=.4]
\begin{scope}
\draw[fill=lightgray] (2, 0) rectangle (6, -1);
\draw (4, -1.5) rectangle (10, -2.5);
\draw (6, -3) rectangle (18, -4);
\draw[fill=lightgray] (12, -4.5) rectangle (16, -5.5);
\node at (2.5,-0.5) {$A$};
\node at (4.5,-2) {$B$};
\node at (6.5,-3.5) {$C$};
\node at (12.5,-5) {$D$};
\end{scope}
\end{tikzpicture}
\end{center}

However, selecting short events is not always
a correct strategy. For example, the algorithm fails
in the following case:
\begin{center}
\begin{tikzpicture}[scale=.4]
\begin{scope}
\draw (1, 0) rectangle (7, -1);
\draw[fill=lightgray] (6, -1.5) rectangle (9, -2.5);
\draw (8, -3) rectangle (14, -4);
\end{scope}
\end{tikzpicture}
\end{center}
If we select the short event, we can only select one event.
However, it would be possible to select both long events.

\subsubsection*{Algorithm 2}

Another idea is to always select the next possible
event that \emph{begins} as \emph{early} as possible.
This algorithm selects the following events:
\begin{center}
\begin{tikzpicture}[scale=.4]
\begin{scope}
\draw[fill=lightgray] (2, 0) rectangle (6, -1);
\draw (4, -1.5) rectangle (10, -2.5);
\draw[fill=lightgray] (6, -3) rectangle (18, -4);
\draw (12, -4.5) rectangle (16, -5.5);
\node at (2.5,-0.5) {$A$};
\node at (4.5,-2) {$B$};
\node at (6.5,-3.5) {$C$};
\node at (12.5,-5) {$D$};
\end{scope}
\end{tikzpicture}
\end{center}

However, we can find a counterexample
also for this algorithm.
For example, in the following case,
the algorithm only selects one event:
\begin{center}
\begin{tikzpicture}[scale=.4]
\begin{scope}
\draw[fill=lightgray] (1, 0) rectangle (14, -1);
\draw (3, -1.5) rectangle (7, -2.5);
\draw (8, -3) rectangle (12, -4);
\end{scope}
\end{tikzpicture}
\end{center}
If we select the first event, it is not possible
to select any other events.
However, it would be possible to select the
other two events.

\subsubsection*{Algorithm 3}

The third idea is to always select the next
possible event that \emph{ends} as \emph{early} as possible.
This algorithm selects the following events:
\begin{center}
\begin{tikzpicture}[scale=.4]
\begin{scope}
\draw[fill=lightgray] (2, 0) rectangle (6, -1);
\draw (4, -1.5) rectangle (10, -2.5);
\draw (6, -3) rectangle (18, -4);
\draw[fill=lightgray] (12, -4.5) rectangle (16, -5.5);
\node at (2.5,-0.5) {$A$};
\node at (4.5,-2) {$B$};
\node at (6.5,-3.5) {$C$};
\node at (12.5,-5) {$D$};
\end{scope}
\end{tikzpicture}
\end{center}

It turns out that this algorithm
\emph{always} produces an optimal solution.
The reason for this is that it is always an optimal choice
to first select an event that ends
as early as possible.
After this, it is an optimal choice
to select the next event
using the same strategy, etc.,
until we cannot select any more events.

One way to argue that the algorithm works
is to consider
what happens if we first select an event
that ends later than the event that ends
as early as possible.
Now, we will have at most as many
choices for selecting the next event.
Hence, selecting an event that ends later
can never yield a better solution,
and the greedy algorithm is correct.

\section{Tasks and deadlines}

Let us now consider a problem where
we are given $n$ tasks with durations and deadlines
and our task is to choose an order to perform the tasks.
For each task, we earn $d-x$ points
where $d$ is the task's deadline
and $x$ is the moment when we finish the task.
What is the largest possible total score
we can obtain?

For example, suppose that the tasks are as follows:
\begin{center}
\begin{tabular}{lll}
task & duration & deadline \\
\hline
$A$ & 4 & 2 \\
$B$ & 3 & 5 \\
$C$ & 2 & 7 \\
$D$ & 4 & 5 \\
\end{tabular}
\end{center}
In this case, an optimal schedule for the tasks
is as follows:
\begin{center}
\begin{tikzpicture}[scale=.4]
\begin{scope}
\draw (0, 0) rectangle (4, -1);
\draw (4, 0) rectangle (10, -1);
\draw (10, 0) rectangle (18, -1);
\draw (18, 0) rectangle (26, -1);
\node at (0.5,-0.5) {$C$};
\node at (4.5,-0.5) {$B$};
\node at (10.5,-0.5) {$A$};
\node at (18.5,-0.5) {$D$};

\draw (0,1.5) -- (26,1.5);
\foreach \i in {0,2,...,26}
{
    \draw (\i,1.25) -- (\i,1.75);
}
\footnotesize
\node at (0,2.5) {0};
\node at (10,2.5) {5};
\node at (20,2.5) {10};

\end{scope}
\end{tikzpicture}
\end{center}
In this solution, $C$ yields 5 points,
$B$ yields 0 points, $A$ yields $-7$ points
and $D$ yields $-8$ points,
so the total score is $-10$.

Surprisingly, the optimal solution to the problem
does not depend on the deadlines at all:
a correct greedy strategy is to simply
perform the tasks \emph{sorted by their durations}
in increasing order.
The reason for this is that if we ever perform
two tasks one after another such that the first task
takes longer than the second task,
we can obtain a better solution by swapping the tasks.
For example, consider the following schedule:
\begin{center}
\begin{tikzpicture}[scale=.4]
\begin{scope}
\draw (0, 0) rectangle (8, -1);
\draw (8, 0) rectangle (12, -1);
\node at (0.5,-0.5) {$X$};
\node at (8.5,-0.5) {$Y$};

\draw [decoration={brace}, decorate, line width=0.3mm] (7.75,-1.5) -- (0.25,-1.5);
\draw [decoration={brace}, decorate, line width=0.3mm] (11.75,-1.5) -- (8.25,-1.5);

\footnotesize
\node at (4,-2.5) {$a$};
\node at (10,-2.5) {$b$};

\end{scope}
\end{tikzpicture}
\end{center}
Here $a>b$, so we should swap the tasks:
\begin{center}
\begin{tikzpicture}[scale=.4]
\begin{scope}
\draw (0, 0) rectangle (4, -1);
\draw (4, 0) rectangle (12, -1);
\node at (0.5,-0.5) {$Y$};
\node at (4.5,-0.5) {$X$};

\draw [decoration={brace}, decorate, line width=0.3mm] (3.75,-1.5) -- (0.25,-1.5);
\draw [decoration={brace}, decorate, line width=0.3mm] (11.75,-1.5) -- (4.25,-1.5);

\footnotesize
\node at (2,-2.5) {$b$};
\node at (8,-2.5) {$a$};

\end{scope}
\end{tikzpicture}
\end{center}
Now $X$ gives $b$ points less and $Y$ gives $a$ points more,
so the total score increases by $a-b > 0$.
In an optimal solution,
for any two consecutive tasks,
it must hold that the shorter task comes
before the longer task.
Thus, the tasks must be performed
sorted by their durations.

\section{Minimizing sums}

We next consider a problem where
we are given $n$ numbers $a_1,a_2,\ldots,a_n$
and our task is to find a value $x$
that minimizes the sum
\[|a_1-x|^c+|a_2-x|^c+\cdots+|a_n-x|^c.\]
We focus on the cases $c=1$ and $c=2$.

\subsubsection{Case $c=1$}

In this case, we should minimize the sum
\[|a_1-x|+|a_2-x|+\cdots+|a_n-x|.\]
For example, if the numbers are $[1,2,9,2,6]$,
the best solution is to select $x=2$,
which produces the sum
\[
|1-2|+|2-2|+|9-2|+|2-2|+|6-2|=12.
\]
In the general case, the best choice for $x$
is the \textit{median} of the numbers,
i.e., the middle number after sorting.
For example, the list $[1,2,9,2,6]$
becomes $[1,2,2,6,9]$ after sorting,
so the median is 2.

The median is an optimal choice,
because if $x$ is smaller than the median,
the sum becomes smaller by increasing $x$,
and if $x$ is larger than the median,
the sum becomes smaller by decreasing $x$.
Hence, the optimal solution is to choose
$x$ to be the median.
If $n$ is even and there are two medians,
both medians and all values between them
are optimal choices.

\subsubsection{Case $c=2$}

In this case, we should minimize the sum
\[(a_1-x)^2+(a_2-x)^2+\cdots+(a_n-x)^2.\]
For example, if the numbers are $[1,2,9,2,6]$,
the best solution is to select $x=4$,
which produces the sum
\[
(1-4)^2+(2-4)^2+(9-4)^2+(2-4)^2+(6-4)^2=46.
\]
In the general case, the best choice for $x$
is the \emph{average} of the numbers.
In the example the average is $(1+2+9+2+6)/5=4$.
This result can be derived by presenting
the sum as follows:
\[
nx^2 - 2x(a_1+a_2+\cdots+a_n) + (a_1^2+a_2^2+\cdots+a_n^2)
\]
The last part does not depend on $x$,
so we can ignore it.
The remaining parts form a function
$nx^2-2xs$, where $s=a_1+a_2+\cdots+a_n$.
This is a parabola opening upwards
with roots $x=0$ and $x=2s/n$,
and its minimum is at the average
of the roots, $x=s/n$, i.e.,
the average of the numbers $a_1,a_2,\ldots,a_n$.
|
||||
|
||||
\section{Data compression}

\index{data compression}
\index{binary code}
\index{codeword}

A \key{binary code} assigns to each character
of a string a \key{codeword} that consists of bits.
We can \emph{compress} the string using the binary code
by replacing each character by the
corresponding codeword.
For example, the following binary code
assigns codewords for characters
\texttt{A}--\texttt{D}:
\begin{center}
\begin{tabular}{rr}
character & codeword \\
\hline
\texttt{A} & 00 \\
\texttt{B} & 01 \\
\texttt{C} & 10 \\
\texttt{D} & 11 \\
\end{tabular}
\end{center}
This is a \key{constant-length} code
which means that the length of each
codeword is the same.
For example, we can compress the string
\texttt{AABACDACA} as follows:
\[00\,00\,01\,00\,10\,11\,00\,10\,00\]
Using this code, the length of the compressed
string is 18 bits.
However, we can compress the string better
if we use a \key{variable-length} code
where codewords may have different lengths.
Then we can give short codewords for
characters that appear often
and long codewords for characters
that appear rarely.
It turns out that an \key{optimal} code
for the above string is as follows:
\begin{center}
\begin{tabular}{rr}
character & codeword \\
\hline
\texttt{A} & 0 \\
\texttt{B} & 110 \\
\texttt{C} & 10 \\
\texttt{D} & 111 \\
\end{tabular}
\end{center}
An optimal code produces a compressed string
that is as short as possible.
In this case, the compressed string using
the optimal code is
\[0\,0\,110\,0\,10\,111\,0\,10\,0,\]
so only 15 bits are needed instead of 18 bits.
Thus, thanks to a better code it was possible to
save 3 bits in the compressed string.

We require that no codeword
is a prefix of another codeword.
For example, it is not allowed that a code
would contain both codewords 10
and 1011.
The reason for this is that we want
to be able to generate the original string
from the compressed string.
If a codeword could be a prefix of another codeword,
this would not always be possible.
For example, the following code is \emph{not} valid:
\begin{center}
\begin{tabular}{rr}
character & codeword \\
\hline
\texttt{A} & 10 \\
\texttt{B} & 11 \\
\texttt{C} & 1011 \\
\texttt{D} & 111 \\
\end{tabular}
\end{center}
Using this code, it would not be possible to know
if the compressed string 1011 corresponds to
the string \texttt{AB} or the string \texttt{C}.
\index{Huffman coding}

\subsubsection{Huffman coding}

\key{Huffman coding}\footnote{D. A. Huffman discovered this method
when solving a university course assignment
and published the algorithm in 1952 \cite{huf52}.} is a greedy algorithm
that constructs an optimal code for
compressing a given string.
The algorithm builds a binary tree
based on the frequencies of the characters
in the string,
and each character's codeword can be read
by following a path from the root to
the corresponding node.
A move to the left corresponds to bit 0,
and a move to the right corresponds to bit 1.

Initially, each character of the string is
represented by a node whose weight is the
number of times the character occurs in the string.
Then at each step two nodes with minimum weights
are combined by creating
a new node whose weight is the sum of the weights
of the original nodes.
The process continues until all nodes have been combined.
Next we will see how Huffman coding creates
the optimal code for the string
\texttt{AABACDACA}.
Initially, there are four nodes that correspond
to the characters of the string:

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,0) {$5$};
\node[draw, circle] (2) at (2,0) {$1$};
\node[draw, circle] (3) at (4,0) {$2$};
\node[draw, circle] (4) at (6,0) {$1$};

\node[color=blue] at (0,-0.75) {\texttt{A}};
\node[color=blue] at (2,-0.75) {\texttt{B}};
\node[color=blue] at (4,-0.75) {\texttt{C}};
\node[color=blue] at (6,-0.75) {\texttt{D}};
\end{tikzpicture}
\end{center}
The node that represents character \texttt{A}
has weight 5 because character \texttt{A}
appears 5 times in the string.
The other weights have been calculated
in the same way.

The first step is to combine the nodes that
correspond to characters \texttt{B} and \texttt{D},
both with weight 1.
The result is:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,0) {$5$};
\node[draw, circle] (3) at (2,0) {$2$};
\node[draw, circle] (2) at (4,0) {$1$};
\node[draw, circle] (4) at (6,0) {$1$};
\node[draw, circle] (5) at (5,1) {$2$};

\node[color=blue] at (0,-0.75) {\texttt{A}};
\node[color=blue] at (2,-0.75) {\texttt{C}};
\node[color=blue] at (4,-0.75) {\texttt{B}};
\node[color=blue] at (6,-0.75) {\texttt{D}};

\node at (4.3,0.7) {0};
\node at (5.7,0.7) {1};

\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (4) -- (5);
\end{tikzpicture}
\end{center}
After this, the nodes with weight 2 are combined:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,0) {$5$};
\node[draw, circle] (3) at (3,1) {$2$};
\node[draw, circle] (2) at (4,0) {$1$};
\node[draw, circle] (4) at (6,0) {$1$};
\node[draw, circle] (5) at (5,1) {$2$};
\node[draw, circle] (6) at (4,2) {$4$};

\node[color=blue] at (1,-0.75) {\texttt{A}};
\node[color=blue] at (3,1-0.75) {\texttt{C}};
\node[color=blue] at (4,-0.75) {\texttt{B}};
\node[color=blue] at (6,-0.75) {\texttt{D}};

\node at (4.3,0.7) {0};
\node at (5.7,0.7) {1};
\node at (3.3,1.7) {0};
\node at (4.7,1.7) {1};

\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (4) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (5) -- (6);
\end{tikzpicture}
\end{center}
Finally, the two remaining nodes are combined:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (2,2) {$5$};
\node[draw, circle] (3) at (3,1) {$2$};
\node[draw, circle] (2) at (4,0) {$1$};
\node[draw, circle] (4) at (6,0) {$1$};
\node[draw, circle] (5) at (5,1) {$2$};
\node[draw, circle] (6) at (4,2) {$4$};
\node[draw, circle] (7) at (3,3) {$9$};

\node[color=blue] at (2,2-0.75) {\texttt{A}};
\node[color=blue] at (3,1-0.75) {\texttt{C}};
\node[color=blue] at (4,-0.75) {\texttt{B}};
\node[color=blue] at (6,-0.75) {\texttt{D}};

\node at (4.3,0.7) {0};
\node at (5.7,0.7) {1};
\node at (3.3,1.7) {0};
\node at (4.7,1.7) {1};
\node at (2.3,2.7) {0};
\node at (3.7,2.7) {1};

\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (4) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (5) -- (6);
\path[draw,thick,-] (1) -- (7);
\path[draw,thick,-] (6) -- (7);
\end{tikzpicture}
\end{center}

Now all nodes are in the tree, so the code is ready.
The following codewords can be read from the tree:
\begin{center}
\begin{tabular}{rr}
character & codeword \\
\hline
\texttt{A} & 0 \\
\texttt{B} & 110 \\
\texttt{C} & 10 \\
\texttt{D} & 111 \\
\end{tabular}
\end{center}
\chapter{Amortized analysis}

\index{amortized analysis}

The time complexity of an algorithm
is often easy to analyze
just by examining the structure
of the algorithm:
what loops the algorithm contains
and how many times they are performed.
However, sometimes a straightforward analysis
does not give a true picture of the efficiency of the algorithm.

\key{Amortized analysis} can be used to analyze
algorithms that contain operations whose
time complexity varies.
The idea is to estimate the total time used by
all such operations during the
execution of the algorithm, instead of focusing
on individual operations.
\section{Two pointers method}

\index{two pointers method}

In the \key{two pointers method},
two pointers are used to
iterate through the array values.
Both pointers can move in one direction only,
which ensures that the algorithm works efficiently.
Next we discuss two problems that can be solved
using the two pointers method.
\subsubsection{Subarray sum}

As the first example,
consider a problem where we are
given an array of $n$ positive integers
and a target sum $x$,
and we want to find a subarray whose sum is $x$
or report that there is no such subarray.

For example, the array
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$2$};
\node at (3.5,0.5) {$5$};
\node at (4.5,0.5) {$1$};
\node at (5.5,0.5) {$1$};
\node at (6.5,0.5) {$2$};
\node at (7.5,0.5) {$3$};
\end{tikzpicture}
\end{center}
contains a subarray whose sum is 8:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (2,0) rectangle (5,1);
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$2$};
\node at (3.5,0.5) {$5$};
\node at (4.5,0.5) {$1$};
\node at (5.5,0.5) {$1$};
\node at (6.5,0.5) {$2$};
\node at (7.5,0.5) {$3$};
\end{tikzpicture}
\end{center}
This problem can be solved in
$O(n)$ time by using the two pointers method.
The idea is to maintain pointers that point to the
first and last value of a subarray.
On each turn, the left pointer moves one step
to the right, and the right pointer moves to the right
as long as the resulting subarray sum is at most $x$.
If the sum becomes exactly $x$,
a solution has been found.

As an example, consider the following array
and a target sum $x=8$:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$2$};
\node at (3.5,0.5) {$5$};
\node at (4.5,0.5) {$1$};
\node at (5.5,0.5) {$1$};
\node at (6.5,0.5) {$2$};
\node at (7.5,0.5) {$3$};
\end{tikzpicture}
\end{center}

The initial subarray contains the values
1, 3 and 2 whose sum is 6:

\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (0,0) rectangle (3,1);
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$2$};
\node at (3.5,0.5) {$5$};
\node at (4.5,0.5) {$1$};
\node at (5.5,0.5) {$1$};
\node at (6.5,0.5) {$2$};
\node at (7.5,0.5) {$3$};

\draw[thick,->] (0.5,-0.7) -- (0.5,-0.1);
\draw[thick,->] (2.5,-0.7) -- (2.5,-0.1);
\end{tikzpicture}
\end{center}

Then, the left pointer moves one step to the right.
The right pointer does not move, because otherwise
the subarray sum would exceed $x$.

\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (1,0) rectangle (3,1);
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$2$};
\node at (3.5,0.5) {$5$};
\node at (4.5,0.5) {$1$};
\node at (5.5,0.5) {$1$};
\node at (6.5,0.5) {$2$};
\node at (7.5,0.5) {$3$};

\draw[thick,->] (1.5,-0.7) -- (1.5,-0.1);
\draw[thick,->] (2.5,-0.7) -- (2.5,-0.1);
\end{tikzpicture}
\end{center}

Again, the left pointer moves one step to the right,
and this time the right pointer moves three
steps to the right.
The subarray sum is $2+5+1=8$, so a subarray
whose sum is $x$ has been found.

\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (2,0) rectangle (5,1);
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$2$};
\node at (3.5,0.5) {$5$};
\node at (4.5,0.5) {$1$};
\node at (5.5,0.5) {$1$};
\node at (6.5,0.5) {$2$};
\node at (7.5,0.5) {$3$};

\draw[thick,->] (2.5,-0.7) -- (2.5,-0.1);
\draw[thick,->] (4.5,-0.7) -- (4.5,-0.1);
\end{tikzpicture}
\end{center}
The running time of the algorithm depends on
the number of steps the right pointer moves.
While there is no useful upper bound on how many steps the
pointer can move on a \emph{single} turn,
we know that the pointer moves \emph{a total of}
$O(n)$ steps during the algorithm,
because it only moves to the right.

Since both the left and right pointer
move $O(n)$ steps during the algorithm,
the algorithm works in $O(n)$ time.
\subsubsection{2SUM problem}

\index{2SUM problem}

Another problem that can be solved using
the two pointers method is the following problem,
also known as the \key{2SUM problem}:
given an array of $n$ numbers and
a target sum $x$, find
two array values such that their sum is $x$,
or report that no such values exist.

To solve the problem, we first
sort the array values in increasing order.
After that, we iterate through the array using
two pointers.
The left pointer starts at the first value
and moves one step to the right on each turn.
The right pointer begins at the last value
and always moves to the left until the sum of the
left and right value is at most $x$.
If the sum is exactly $x$,
a solution has been found.

For example, consider the following array
and a target sum $x=12$:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$4$};
\node at (2.5,0.5) {$5$};
\node at (3.5,0.5) {$6$};
\node at (4.5,0.5) {$7$};
\node at (5.5,0.5) {$9$};
\node at (6.5,0.5) {$9$};
\node at (7.5,0.5) {$10$};
\end{tikzpicture}
\end{center}

The initial positions of the pointers
are as follows.
The sum of the values is $1+10=11$,
which is smaller than $x$.

\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (0,0) rectangle (1,1);
\fill[color=lightgray] (7,0) rectangle (8,1);
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$4$};
\node at (2.5,0.5) {$5$};
\node at (3.5,0.5) {$6$};
\node at (4.5,0.5) {$7$};
\node at (5.5,0.5) {$9$};
\node at (6.5,0.5) {$9$};
\node at (7.5,0.5) {$10$};

\draw[thick,->] (0.5,-0.7) -- (0.5,-0.1);
\draw[thick,->] (7.5,-0.7) -- (7.5,-0.1);
\end{tikzpicture}
\end{center}
Then the left pointer moves one step to the right.
The right pointer moves three steps to the left,
and the sum becomes $4+7=11$.

\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (1,0) rectangle (2,1);
\fill[color=lightgray] (4,0) rectangle (5,1);
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$4$};
\node at (2.5,0.5) {$5$};
\node at (3.5,0.5) {$6$};
\node at (4.5,0.5) {$7$};
\node at (5.5,0.5) {$9$};
\node at (6.5,0.5) {$9$};
\node at (7.5,0.5) {$10$};

\draw[thick,->] (1.5,-0.7) -- (1.5,-0.1);
\draw[thick,->] (4.5,-0.7) -- (4.5,-0.1);
\end{tikzpicture}
\end{center}

After this, the left pointer moves one step to the right again.
The right pointer does not move, and a solution
$5+7=12$ has been found.

\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (2,0) rectangle (3,1);
\fill[color=lightgray] (4,0) rectangle (5,1);
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$4$};
\node at (2.5,0.5) {$5$};
\node at (3.5,0.5) {$6$};
\node at (4.5,0.5) {$7$};
\node at (5.5,0.5) {$9$};
\node at (6.5,0.5) {$9$};
\node at (7.5,0.5) {$10$};

\draw[thick,->] (2.5,-0.7) -- (2.5,-0.1);
\draw[thick,->] (4.5,-0.7) -- (4.5,-0.1);
\end{tikzpicture}
\end{center}
The running time of the algorithm is
$O(n \log n)$, because it first sorts
the array in $O(n \log n)$ time,
and then both pointers move $O(n)$ steps.
Note that it is possible to solve the problem
in another way in $O(n \log n)$ time using binary search.
In such a solution, we iterate through the array
and for each array value, we try to find another
value that yields the sum $x$.
This can be done by performing $n$ binary searches,
each of which takes $O(\log n)$ time.

\index{3SUM problem}
A more difficult problem is
the \key{3SUM problem}, which asks to
find \emph{three} array values
whose sum is $x$.
Using the idea of the above algorithm,
this problem can be solved in $O(n^2)$ time\footnote{For a long time,
it was thought that solving
the 3SUM problem more efficiently than in $O(n^2)$ time
would not be possible.
However, in 2014, it turned out \cite{gro14}
that this is not the case.}.
Can you see how?
\section{Nearest smaller elements}

\index{nearest smaller elements}

Amortized analysis is often used to
estimate the number of operations
performed on a data structure.
The operations may be distributed unevenly so
that most operations occur during a
certain phase of the algorithm, but the total
number of the operations is limited.

As an example, consider the problem
of finding for each array element
the \key{nearest smaller element}, i.e.,
the first smaller element that precedes the element
in the array.
It is possible that no such element exists,
in which case the algorithm should report this.
Next we will see how the problem can be
efficiently solved using a stack structure.

We go through the array from left to right
and maintain a stack of array elements.
At each array position, we remove elements from the stack
until the top element is smaller than the
current element, or the stack is empty.
Then, we report that the top element is
the nearest smaller element of the current element,
or if the stack is empty, there is no such element.
Finally, we add the current element to the stack.
As an example, consider the following array:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$2$};
\node at (4.5,0.5) {$5$};
\node at (5.5,0.5) {$3$};
\node at (6.5,0.5) {$4$};
\node at (7.5,0.5) {$2$};
\end{tikzpicture}
\end{center}

First, the elements 1, 3 and 4 are added to the stack,
because each element is larger than the previous element.
Thus, the nearest smaller element of 4 is 3,
and the nearest smaller element of 3 is 1.
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (2,0) rectangle (3,1);
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$2$};
\node at (4.5,0.5) {$5$};
\node at (5.5,0.5) {$3$};
\node at (6.5,0.5) {$4$};
\node at (7.5,0.5) {$2$};

\draw (0.2,0.2-1.2) rectangle (0.8,0.8-1.2);
\draw (1.2,0.2-1.2) rectangle (1.8,0.8-1.2);
\draw (2.2,0.2-1.2) rectangle (2.8,0.8-1.2);

\node at (0.5,0.5-1.2) {$1$};
\node at (1.5,0.5-1.2) {$3$};
\node at (2.5,0.5-1.2) {$4$};

\draw[->,thick] (0.8,0.5-1.2) -- (1.2,0.5-1.2);
\draw[->,thick] (1.8,0.5-1.2) -- (2.2,0.5-1.2);
\end{tikzpicture}
\end{center}

The next element 2 is smaller than the two top
elements in the stack.
Thus, the elements 3 and 4 are removed from the stack,
and then the element 2 is added to the stack.
Its nearest smaller element is 1:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (3,0) rectangle (4,1);
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$2$};
\node at (4.5,0.5) {$5$};
\node at (5.5,0.5) {$3$};
\node at (6.5,0.5) {$4$};
\node at (7.5,0.5) {$2$};

\draw (0.2,0.2-1.2) rectangle (0.8,0.8-1.2);
\draw (3.2,0.2-1.2) rectangle (3.8,0.8-1.2);

\node at (0.5,0.5-1.2) {$1$};
\node at (3.5,0.5-1.2) {$2$};

\draw[->,thick] (0.8,0.5-1.2) -- (3.2,0.5-1.2);
\end{tikzpicture}
\end{center}

Then, the element 5 is larger than the element 2,
so it will be added to the stack, and
its nearest smaller element is 2:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (4,0) rectangle (5,1);
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$2$};
\node at (4.5,0.5) {$5$};
\node at (5.5,0.5) {$3$};
\node at (6.5,0.5) {$4$};
\node at (7.5,0.5) {$2$};

\draw (0.2,0.2-1.2) rectangle (0.8,0.8-1.2);
\draw (3.2,0.2-1.2) rectangle (3.8,0.8-1.2);
\draw (4.2,0.2-1.2) rectangle (4.8,0.8-1.2);

\node at (0.5,0.5-1.2) {$1$};
\node at (3.5,0.5-1.2) {$2$};
\node at (4.5,0.5-1.2) {$5$};

\draw[->,thick] (0.8,0.5-1.2) -- (3.2,0.5-1.2);
\draw[->,thick] (3.8,0.5-1.2) -- (4.2,0.5-1.2);
\end{tikzpicture}
\end{center}

After this, the element 5 is removed from the stack
and the elements 3 and 4 are added to the stack:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (6,0) rectangle (7,1);
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$2$};
\node at (4.5,0.5) {$5$};
\node at (5.5,0.5) {$3$};
\node at (6.5,0.5) {$4$};
\node at (7.5,0.5) {$2$};

\draw (0.2,0.2-1.2) rectangle (0.8,0.8-1.2);
\draw (3.2,0.2-1.2) rectangle (3.8,0.8-1.2);
\draw (5.2,0.2-1.2) rectangle (5.8,0.8-1.2);
\draw (6.2,0.2-1.2) rectangle (6.8,0.8-1.2);

\node at (0.5,0.5-1.2) {$1$};
\node at (3.5,0.5-1.2) {$2$};
\node at (5.5,0.5-1.2) {$3$};
\node at (6.5,0.5-1.2) {$4$};

\draw[->,thick] (0.8,0.5-1.2) -- (3.2,0.5-1.2);
\draw[->,thick] (3.8,0.5-1.2) -- (5.2,0.5-1.2);
\draw[->,thick] (5.8,0.5-1.2) -- (6.2,0.5-1.2);
\end{tikzpicture}
\end{center}

Finally, all elements except 1 are removed
from the stack and the last element 2
is added to the stack:

\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (7,0) rectangle (8,1);
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$2$};
\node at (4.5,0.5) {$5$};
\node at (5.5,0.5) {$3$};
\node at (6.5,0.5) {$4$};
\node at (7.5,0.5) {$2$};

\draw (0.2,0.2-1.2) rectangle (0.8,0.8-1.2);
\draw (7.2,0.2-1.2) rectangle (7.8,0.8-1.2);

\node at (0.5,0.5-1.2) {$1$};
\node at (7.5,0.5-1.2) {$2$};

\draw[->,thick] (0.8,0.5-1.2) -- (7.2,0.5-1.2);
\end{tikzpicture}
\end{center}
The efficiency of the algorithm depends on
the total number of stack operations.
If the current element is larger than
the top element in the stack, it is directly
added to the stack, which is efficient.
However, sometimes the stack can contain several
larger elements and it takes time to remove them.
Still, each element is added \emph{exactly once} to the stack
and removed \emph{at most once} from the stack.
Thus, each element causes $O(1)$ stack operations,
and the algorithm works in $O(n)$ time.
\section{Sliding window minimum}

\index{sliding window}
\index{sliding window minimum}

A \key{sliding window} is a constant-size subarray
that moves from left to right through the array.
At each window position,
we want to calculate some information
about the elements inside the window.
In this section, we focus on the problem
of maintaining the \key{sliding window minimum},
which means that
we should report the smallest value inside each window.

The sliding window minimum can be calculated
using an idea similar to the one we used to calculate
the nearest smaller elements.
We maintain a queue
where each element is larger than
the previous element,
and the first element
always corresponds to the minimum element inside the window.
After each window move,
we remove elements from the end of the queue
until the last queue element
is smaller than the new window element,
or the queue becomes empty.
We also remove the first queue element
if it is not inside the window anymore.
Finally, we add the new window element
to the end of the queue.
As an example, consider the following array:

\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$2$};
\node at (1.5,0.5) {$1$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$5$};
\node at (4.5,0.5) {$3$};
\node at (5.5,0.5) {$4$};
\node at (6.5,0.5) {$1$};
\node at (7.5,0.5) {$2$};
\end{tikzpicture}
\end{center}

Suppose that the size of the sliding window is 4.
At the first window position, the smallest value is 1:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (0,0) rectangle (4,1);
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$2$};
\node at (1.5,0.5) {$1$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$5$};
\node at (4.5,0.5) {$3$};
\node at (5.5,0.5) {$4$};
\node at (6.5,0.5) {$1$};
\node at (7.5,0.5) {$2$};

\draw (1.2,0.2-1.2) rectangle (1.8,0.8-1.2);
\draw (2.2,0.2-1.2) rectangle (2.8,0.8-1.2);
\draw (3.2,0.2-1.2) rectangle (3.8,0.8-1.2);

\node at (1.5,0.5-1.2) {$1$};
\node at (2.5,0.5-1.2) {$4$};
\node at (3.5,0.5-1.2) {$5$};

\draw[->,thick] (1.8,0.5-1.2) -- (2.2,0.5-1.2);
\draw[->,thick] (2.8,0.5-1.2) -- (3.2,0.5-1.2);
\end{tikzpicture}
\end{center}

Then the window moves one step right.
The new element 3 is smaller than the elements
4 and 5 in the queue, so the elements 4 and 5
are removed from the queue
and the element 3 is added to the queue.
The smallest value is still 1.
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (1,0) rectangle (5,1);
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$2$};
\node at (1.5,0.5) {$1$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$5$};
\node at (4.5,0.5) {$3$};
\node at (5.5,0.5) {$4$};
\node at (6.5,0.5) {$1$};
\node at (7.5,0.5) {$2$};

\draw (1.2,0.2-1.2) rectangle (1.8,0.8-1.2);
\draw (4.2,0.2-1.2) rectangle (4.8,0.8-1.2);

\node at (1.5,0.5-1.2) {$1$};
\node at (4.5,0.5-1.2) {$3$};

\draw[->,thick] (1.8,0.5-1.2) -- (4.2,0.5-1.2);
\end{tikzpicture}
\end{center}

After this, the window moves again,
and the smallest element 1
does not belong to the window anymore.
Thus, it is removed from the queue and the smallest
value is now 3. Also the new element 4
is added to the queue.
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (2,0) rectangle (6,1);
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$2$};
\node at (1.5,0.5) {$1$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$5$};
\node at (4.5,0.5) {$3$};
\node at (5.5,0.5) {$4$};
\node at (6.5,0.5) {$1$};
\node at (7.5,0.5) {$2$};

\draw (4.2,0.2-1.2) rectangle (4.8,0.8-1.2);
\draw (5.2,0.2-1.2) rectangle (5.8,0.8-1.2);

\node at (4.5,0.5-1.2) {$3$};
\node at (5.5,0.5-1.2) {$4$};

\draw[->,thick] (4.8,0.5-1.2) -- (5.2,0.5-1.2);
\end{tikzpicture}
\end{center}

The next new element 1 is smaller than all elements
in the queue.
Thus, all elements are removed from the queue
and it will only contain the element 1:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (3,0) rectangle (7,1);
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$2$};
\node at (1.5,0.5) {$1$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$5$};
\node at (4.5,0.5) {$3$};
\node at (5.5,0.5) {$4$};
\node at (6.5,0.5) {$1$};
\node at (7.5,0.5) {$2$};

\draw (6.2,0.2-1.2) rectangle (6.8,0.8-1.2);

\node at (6.5,0.5-1.2) {$1$};
\end{tikzpicture}
\end{center}

Finally, the window reaches its last position.
The element 2 is added to the queue,
but the smallest value inside the window
is still 1.
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (4,0) rectangle (8,1);
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$2$};
\node at (1.5,0.5) {$1$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$5$};
\node at (4.5,0.5) {$3$};
\node at (5.5,0.5) {$4$};
\node at (6.5,0.5) {$1$};
\node at (7.5,0.5) {$2$};

\draw (6.2,0.2-1.2) rectangle (6.8,0.8-1.2);
\draw (7.2,0.2-1.2) rectangle (7.8,0.8-1.2);

\node at (6.5,0.5-1.2) {$1$};
\node at (7.5,0.5-1.2) {$2$};

\draw[->,thick] (6.8,0.5-1.2) -- (7.2,0.5-1.2);
\end{tikzpicture}
\end{center}

Since each array element
is added to the queue exactly once and
removed from the queue at most once,
the algorithm works in $O(n)$ time.
|
||||
|
||||
|
||||
|
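The queue described above can be implemented with the \texttt{deque} structure. The following sketch (our code, not part of the original text) computes the minimum of every window of size 4 in the example array; the queue stores array indices whose values are increasing from front to back:
\begin{lstlisting}
int a[8] = {2,1,4,5,3,4,1,2};
int n = 8, w = 4;
deque<int> q;    // indices; their values increase from front to back
vector<int> res; // minimum of each window
for (int i = 0; i < n; i++) {
    // remove elements that are larger than the new element
    while (!q.empty() && a[q.back()] >= a[i]) q.pop_back();
    q.push_back(i);
    // remove the element that slid out of the window
    if (q.front() <= i-w) q.pop_front();
    if (i >= w-1) res.push_back(a[q.front()]);
}
// res contains 1 1 3 1 1
\end{lstlisting}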
\chapter{Bit manipulation}

All data in computer programs is internally stored as bits,
i.e., as numbers 0 and 1.
This chapter discusses the bit representation
of integers, and shows examples
of how to use bit operations.
It turns out that there are many uses for
bit manipulation in algorithm programming.

\section{Bit representation}

\index{bit representation}

In programming, an $n$-bit integer is internally
stored as a binary number that consists of $n$ bits.
For example, the C++ type \texttt{int} is
a 32-bit type, which means that every \texttt{int}
number consists of 32 bits.

Here is the bit representation of
the \texttt{int} number 43:
\[00000000000000000000000000101011\]
The bits in the representation are indexed from right to left.
To convert a bit representation $b_k \cdots b_2 b_1 b_0$ into a number,
we can use the formula
\[b_k 2^k + \ldots + b_2 2^2 + b_1 2^1 + b_0 2^0.\]
For example,
\[1 \cdot 2^5 + 1 \cdot 2^3 + 1 \cdot 2^1 + 1 \cdot 2^0 = 43.\]

The bit representation of a number is either
\key{signed} or \key{unsigned}.
Usually a signed representation is used,
which means that both negative and positive
numbers can be represented.
A signed variable of $n$ bits can contain any
integer between $-2^{n-1}$ and $2^{n-1}-1$.
For example, the \texttt{int} type in C++ is
a signed type, so an \texttt{int} variable can contain any
integer between $-2^{31}$ and $2^{31}-1$.

The first bit in a signed representation
is the sign of the number (0 for nonnegative numbers
and 1 for negative numbers), and
the remaining $n-1$ bits contain the magnitude of the number.
\key{Two's complement} is used, which means that the
opposite number of a number is calculated by first
inverting all the bits in the number,
and then increasing the number by one.

For example, the bit representation of
the \texttt{int} number $-43$ is
\[11111111111111111111111111010101.\]

In an unsigned representation, only nonnegative
numbers can be used, but the upper bound for the values is larger.
An unsigned variable of $n$ bits can contain any
integer between $0$ and $2^n-1$.
For example, in C++, an \texttt{unsigned int} variable
can contain any integer between $0$ and $2^{32}-1$.

There is a connection between the
representations:
a signed number $-x$ equals an unsigned number $2^n-x$.
For example, the following code shows that
the signed number $x=-43$ equals the unsigned
number $y=2^{32}-43$:
\begin{lstlisting}
int x = -43;
unsigned int y = x;
cout << x << "\n"; // -43
cout << y << "\n"; // 4294967253
\end{lstlisting}

If a number is larger than the upper bound
of the bit representation, the number will overflow.
In a signed representation,
the next number after $2^{n-1}-1$ is $-2^{n-1}$,
and in an unsigned representation,
the next number after $2^n-1$ is $0$.
For example, consider the following code:
\begin{lstlisting}
int x = 2147483647;
cout << x << "\n"; // 2147483647
x++;
cout << x << "\n"; // -2147483648
\end{lstlisting}

Initially, the value of $x$ is $2^{31}-1$.
This is the largest value that can be stored
in an \texttt{int} variable,
so the next number after $2^{31}-1$ is $-2^{31}$.

\section{Bit operations}

\newcommand\XOR{\mathbin{\char`\^}}

\subsubsection{And operation}

\index{and operation}

The \key{and} operation $x$ \& $y$ produces a number
that has one bits in positions where both
$x$ and $y$ have one bits.
For example, $22$ \& $26$ = 18, because

\begin{center}
\begin{tabular}{rrr}
& 10110 & (22)\\
\& & 11010 & (26) \\
\hline
= & 10010 & (18) \\
\end{tabular}
\end{center}

Using the and operation, we can check if a number
$x$ is even because
$x$ \& $1$ = 0 if $x$ is even, and
$x$ \& $1$ = 1 if $x$ is odd.
More generally, $x$ is divisible by $2^k$
exactly when $x$ \& $(2^k-1)$ = 0.
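These checks can be tried out directly; a small sketch (our example values):
\begin{lstlisting}
cout << (10&1) << "\n"; // 0: 10 is even
cout << (11&1) << "\n"; // 1: 11 is odd
cout << (40&7) << "\n"; // 0: 40 is divisible by 2^3 = 8
cout << (36&7) << "\n"; // 4: 36 is not divisible by 8
\end{lstlisting}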

\subsubsection{Or operation}

\index{or operation}

The \key{or} operation $x$ | $y$ produces a number
that has one bits in positions where at least one
of $x$ and $y$ have one bits.
For example, $22$ | $26$ = 30, because

\begin{center}
\begin{tabular}{rrr}
& 10110 & (22)\\
| & 11010 & (26) \\
\hline
= & 11110 & (30) \\
\end{tabular}
\end{center}

\subsubsection{Xor operation}

\index{xor operation}

The \key{xor} operation $x$ $\XOR$ $y$ produces a number
that has one bits in positions where exactly one
of $x$ and $y$ have one bits.
For example, $22$ $\XOR$ $26$ = 12, because

\begin{center}
\begin{tabular}{rrr}
& 10110 & (22)\\
$\XOR$ & 11010 & (26) \\
\hline
= & 01100 & (12) \\
\end{tabular}
\end{center}

\subsubsection{Not operation}

\index{not operation}

The \key{not} operation \textasciitilde$x$
produces a number where all the bits of $x$
have been inverted.
The formula \textasciitilde$x = -x-1$ holds;
for example, \textasciitilde$29 = -30$.

The result of the not operation at the bit level
depends on the length of the bit representation,
because the operation inverts all bits.
For example, if the numbers are 32-bit
\texttt{int} numbers, the result is as follows:

\begin{center}
\begin{tabular}{rrrr}
$x$ & = & 29 & 00000000000000000000000000011101 \\
\textasciitilde$x$ & = & $-30$ & 11111111111111111111111111100010 \\
\end{tabular}
\end{center}

\subsubsection{Bit shifts}

\index{bit shift}

The left bit shift $x << k$ appends $k$
zero bits to the number,
and the right bit shift $x >> k$
removes the $k$ last bits from the number.
For example, $14 << 2 = 56$,
because $14$ and $56$ correspond to 1110 and 111000.
Similarly, $49 >> 3 = 6$,
because $49$ and $6$ correspond to 110001 and 110.

Note that $x << k$
corresponds to multiplying $x$ by $2^k$,
and $x >> k$
corresponds to dividing $x$ by $2^k$
rounded down to an integer.
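The correspondence with multiplication and division can be checked directly on the examples above:
\begin{lstlisting}
cout << (14<<2) << "\n"; // 56 = 14*2^2
cout << (49>>3) << "\n"; // 6 = 49/2^3 rounded down
\end{lstlisting}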

\subsubsection{Applications}

A number of the form $1 << k$ has a one bit
in position $k$ and all other bits are zero,
so we can use such numbers to access single bits of numbers.
In particular, the $k$th bit of a number is one
exactly when $x$ \& $(1 << k)$ is not zero.
The following code prints the bit representation
of an \texttt{int} number $x$:

\begin{lstlisting}
for (int i = 31; i >= 0; i--) {
    if (x&(1<<i)) cout << "1";
    else cout << "0";
}
\end{lstlisting}

It is also possible to modify single bits
of numbers using similar ideas.
For example, the formula $x$ | $(1 << k)$
sets the $k$th bit of $x$ to one,
the formula
$x$ \& \textasciitilde $(1 << k)$
sets the $k$th bit of $x$ to zero,
and the formula
$x$ $\XOR$ $(1 << k)$
inverts the $k$th bit of $x$.

The formula $x$ \& $(x-1)$ sets the last
one bit of $x$ to zero,
and the formula $x$ \& $-x$ sets all the
one bits to zero, except for the last one bit.
The formula $x$ | $(x-1)$
inverts all the bits after the last one bit.
Also note that a positive number $x$ is
a power of two exactly when $x$ \& $(x-1) = 0$.
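As an illustration (our example, using $x=52$, whose bit representation is 110100):
\begin{lstlisting}
int x = 52; // 110100
cout << (x&(x-1)) << "\n"; // 48: 110000, last one bit set to zero
cout << (x&-x) << "\n";    // 4: 000100, only the last one bit remains
cout << (x|(x-1)) << "\n"; // 55: 110111, bits after last one bit inverted
\end{lstlisting}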

\subsubsection*{Additional functions}

The g++ compiler provides the following
functions for counting bits:

\begin{itemize}
\item
$\texttt{\_\_builtin\_clz}(x)$:
the number of zeros at the beginning of the number
\item
$\texttt{\_\_builtin\_ctz}(x)$:
the number of zeros at the end of the number
\item
$\texttt{\_\_builtin\_popcount}(x)$:
the number of ones in the number
\item
$\texttt{\_\_builtin\_parity}(x)$:
the parity (even or odd) of the number of ones
\end{itemize}
\begin{samepage}

The functions can be used as follows:
\begin{lstlisting}
int x = 5328; // 00000000000000000001010011010000
cout << __builtin_clz(x) << "\n"; // 19
cout << __builtin_ctz(x) << "\n"; // 4
cout << __builtin_popcount(x) << "\n"; // 5
cout << __builtin_parity(x) << "\n"; // 1
\end{lstlisting}
\end{samepage}

While the above functions only support \texttt{int} numbers,
there are also \texttt{long long} versions of
the functions, available with the suffix \texttt{ll}.

\section{Representing sets}

Every subset of a set
$\{0,1,2,\ldots,n-1\}$
can be represented as an $n$-bit integer
whose one bits indicate which
elements belong to the subset.
This is an efficient way to represent sets,
because every element requires only one bit of memory,
and set operations can be implemented as bit operations.

For example, since \texttt{int} is a 32-bit type,
an \texttt{int} number can represent any subset
of the set $\{0,1,2,\ldots,31\}$.
The bit representation of the set $\{1,3,4,8\}$ is
\[00000000000000000000000100011010,\]
which corresponds to the number $2^8+2^4+2^3+2^1=282$.

\subsubsection{Set implementation}

The following code declares an \texttt{int}
variable $x$ that can contain
a subset of $\{0,1,2,\ldots,31\}$.
After this, the code adds the elements 1, 3, 4 and 8
to the set and prints the size of the set.
\begin{lstlisting}
int x = 0;
x |= (1<<1);
x |= (1<<3);
x |= (1<<4);
x |= (1<<8);
cout << __builtin_popcount(x) << "\n"; // 4
\end{lstlisting}
Then, the following code prints all
elements that belong to the set:
\begin{lstlisting}
for (int i = 0; i < 32; i++) {
    if (x&(1<<i)) cout << i << " ";
}
// output: 1 3 4 8
\end{lstlisting}

\subsubsection{Set operations}

Set operations can be implemented as bit operations as follows:

\begin{center}
\begin{tabular}{lll}
& set syntax & bit syntax \\
\hline
intersection & $a \cap b$ & $a$ \& $b$ \\
union & $a \cup b$ & $a$ | $b$ \\
complement & $\bar a$ & \textasciitilde$a$ \\
difference & $a \setminus b$ & $a$ \& (\textasciitilde$b$) \\
\end{tabular}
\end{center}

For example, the following code first constructs
the sets $x=\{1,3,4,8\}$ and $y=\{3,6,8,9\}$,
and then constructs the set $z = x \cup y = \{1,3,4,6,8,9\}$:

\begin{lstlisting}
int x = (1<<1)|(1<<3)|(1<<4)|(1<<8);
int y = (1<<3)|(1<<6)|(1<<8)|(1<<9);
int z = x|y;
cout << __builtin_popcount(z) << "\n"; // 6
\end{lstlisting}

\subsubsection{Iterating through subsets}

The following code goes through
the subsets of $\{0,1,\ldots,n-1\}$:

\begin{lstlisting}
for (int b = 0; b < (1<<n); b++) {
    // process subset b
}
\end{lstlisting}
The following code goes through
the subsets with exactly $k$ elements:
\begin{lstlisting}
for (int b = 0; b < (1<<n); b++) {
    if (__builtin_popcount(b) == k) {
        // process subset b
    }
}
\end{lstlisting}
The following code goes through the subsets
of a set $x$:
\begin{lstlisting}
int b = 0;
do {
    // process subset b
} while (b=(b-x)&x);
\end{lstlisting}
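Note that the empty subset $b=0$ is processed first, and the loop ends when $b$ wraps back to 0. As a sketch (our example), for the set $x=\{1,3\}$ the loop visits the subsets in the order $\emptyset$, $\{1\}$, $\{3\}$, $\{1,3\}$:
\begin{lstlisting}
int x = (1<<1)|(1<<3); // the set {1,3}, i.e., the number 10
vector<int> subsets;
int b = 0;
do {
    subsets.push_back(b);
} while (b=(b-x)&x);
// subsets contains 0 2 8 10
\end{lstlisting}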

\section{Bit optimizations}

Many algorithms can be optimized using
bit operations.
Such optimizations do not change the
time complexity of the algorithm,
but they may have a large impact
on the actual running time of the code.
In this section we discuss examples
of such situations.

\subsubsection{Hamming distances}

\index{Hamming distance}
The \key{Hamming distance}
$\texttt{hamming}(a,b)$ between two
strings $a$ and $b$ of equal length is
the number of positions where the strings differ.
For example,
\[\texttt{hamming}(01101,11001)=2.\]

Consider the following problem: Given
a list of $n$ bit strings, each of length $k$,
calculate the minimum Hamming distance
between two strings in the list.
For example, the answer for $[00111,01101,11110]$
is 2, because
\begin{itemize}[noitemsep]
\item $\texttt{hamming}(00111,01101)=2$,
\item $\texttt{hamming}(00111,11110)=3$, and
\item $\texttt{hamming}(01101,11110)=3$.
\end{itemize}

A straightforward way to solve the problem is
to go through all pairs of strings and calculate
their Hamming distances,
which yields an $O(n^2 k)$ time algorithm.
The following function can be used to
calculate distances:
\begin{lstlisting}
int hamming(string a, string b) {
    int d = 0;
    // k is the common length of the strings
    for (int i = 0; i < k; i++) {
        if (a[i] != b[i]) d++;
    }
    return d;
}
\end{lstlisting}

However, if $k$ is small, we can optimize the code
by storing the bit strings as integers and
calculating the Hamming distances using bit operations.
In particular, if $k \le 32$, we can just store
the strings as \texttt{int} values and use the
following function to calculate distances:
\begin{lstlisting}
int hamming(int a, int b) {
    return __builtin_popcount(a^b);
}
\end{lstlisting}
In the above function, the xor operation constructs
a bit string that has one bits in positions
where $a$ and $b$ differ.
Then, the number of bits is calculated using
the \texttt{\_\_builtin\_popcount} function.
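Putting the pieces together, the whole search can be sketched as follows (our code; the list is the example above, and the answer is 2):
\begin{lstlisting}
vector<int> s = {0b00111, 0b01101, 0b11110};
int best = 32;
for (int i = 0; i < (int)s.size(); i++) {
    for (int j = i+1; j < (int)s.size(); j++) {
        best = min(best, __builtin_popcount(s[i]^s[j]));
    }
}
cout << best << "\n"; // 2
\end{lstlisting}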

To compare the implementations, we generated
a list of 10000 random bit strings of length 30.
Using the first approach, the search took
13.5 seconds, and after the bit optimization,
it only took 0.5 seconds.
Thus, the bit optimized code was almost
30 times faster than the original code.

\subsubsection{Counting subgrids}

As another example, consider the
following problem:
Given an $n \times n$ grid whose
each square is either black (1) or white (0),
calculate the number of subgrids
whose all corners are black.
For example, the grid
\begin{center}
\begin{tikzpicture}[scale=0.5]
\fill[black] (1,1) rectangle (2,2);
\fill[black] (1,4) rectangle (2,5);
\fill[black] (4,1) rectangle (5,2);
\fill[black] (4,4) rectangle (5,5);
\fill[black] (1,3) rectangle (2,4);
\fill[black] (2,3) rectangle (3,4);
\fill[black] (2,1) rectangle (3,2);
\fill[black] (0,2) rectangle (1,3);
\draw (0,0) grid (5,5);
\end{tikzpicture}
\end{center}
contains two such subgrids:
\begin{center}
\begin{tikzpicture}[scale=0.5]
\fill[black] (1,1) rectangle (2,2);
\fill[black] (1,4) rectangle (2,5);
\fill[black] (4,1) rectangle (5,2);
\fill[black] (4,4) rectangle (5,5);
\fill[black] (1,3) rectangle (2,4);
\fill[black] (2,3) rectangle (3,4);
\fill[black] (2,1) rectangle (3,2);
\fill[black] (0,2) rectangle (1,3);
\draw (0,0) grid (5,5);

\fill[black] (7+1,1) rectangle (7+2,2);
\fill[black] (7+1,4) rectangle (7+2,5);
\fill[black] (7+4,1) rectangle (7+5,2);
\fill[black] (7+4,4) rectangle (7+5,5);
\fill[black] (7+1,3) rectangle (7+2,4);
\fill[black] (7+2,3) rectangle (7+3,4);
\fill[black] (7+2,1) rectangle (7+3,2);
\fill[black] (7+0,2) rectangle (7+1,3);
\draw (7+0,0) grid (7+5,5);

\draw[color=red,line width=1mm] (1,1) rectangle (3,4);
\draw[color=red,line width=1mm] (7+1,1) rectangle (7+5,5);
\end{tikzpicture}
\end{center}

There is an $O(n^3)$ time algorithm for solving the problem:
go through all $O(n^2)$ pairs of rows and for each pair
$(a,b)$ calculate the number of columns that contain a black
square in both rows in $O(n)$ time.
The following code assumes that $\texttt{color}[y][x]$
denotes the color in row $y$ and column $x$:
\begin{lstlisting}
int count = 0;
for (int i = 0; i < n; i++) {
    if (color[a][i] == 1 && color[b][i] == 1) count++;
}
\end{lstlisting}
Then, those columns
account for $\texttt{count}(\texttt{count}-1)/2$ subgrids with black corners,
because we can choose any two of them to form a subgrid.

To optimize this algorithm, we divide the grid into blocks
of columns such that each block consists of $N$
consecutive columns. Then, each row is stored as
a list of $N$-bit numbers that describe the colors
of the squares. Now we can process $N$ columns at the same time
using bit operations. In the following code,
$\texttt{color}[y][k]$ represents
a block of $N$ colors as bits.
\begin{lstlisting}
int count = 0;
for (int i = 0; i <= n/N; i++) {
    count += __builtin_popcount(color[a][i]&color[b][i]);
}
\end{lstlisting}
The resulting algorithm works in $O(n^3/N)$ time.

We generated a random grid of size $2500 \times 2500$
and compared the original and bit optimized implementation.
While the original code took $29.6$ seconds,
the bit optimized version only took $3.1$ seconds
with $N=32$ (\texttt{int} numbers) and $1.7$ seconds
with $N=64$ (\texttt{long long} numbers).

\section{Dynamic programming}

Bit operations provide an efficient and convenient
way to implement dynamic programming algorithms
whose states contain subsets of elements,
because such states can be stored as integers.
Next we discuss examples of combining
bit operations and dynamic programming.

\subsubsection{Optimal selection}

As a first example, consider the following problem:
We are given the prices of $k$ products
over $n$ days, and we want to buy each product
exactly once.
However, we are allowed to buy at most one product
in a day.
What is the minimum total price?
For example, consider the following scenario ($k=3$ and $n=8$):
\begin{center}
\begin{tikzpicture}[scale=.65]
\draw (0, 0) grid (8,3);
\node at (-2.5,2.5) {product 0};
\node at (-2.5,1.5) {product 1};
\node at (-2.5,0.5) {product 2};

\foreach \x in {0,...,7}
{\node at (\x+0.5,3.5) {\x};}
\foreach \x/\v in {0/6,1/9,2/5,3/2,4/8,5/9,6/1,7/6}
{\node at (\x+0.5,2.5) {\v};}
\foreach \x/\v in {0/8,1/2,2/6,3/2,4/7,5/5,6/7,7/2}
{\node at (\x+0.5,1.5) {\v};}
\foreach \x/\v in {0/5,1/3,2/9,3/7,4/3,5/5,6/1,7/4}
{\node at (\x+0.5,0.5) {\v};}
\end{tikzpicture}
\end{center}
In this scenario, the minimum total price is $5$:
\begin{center}
\begin{tikzpicture}[scale=.65]
\fill [color=lightgray] (1, 1) rectangle (2, 2);
\fill [color=lightgray] (3, 2) rectangle (4, 3);
\fill [color=lightgray] (6, 0) rectangle (7, 1);
\draw (0, 0) grid (8,3);
\node at (-2.5,2.5) {product 0};
\node at (-2.5,1.5) {product 1};
\node at (-2.5,0.5) {product 2};

\foreach \x in {0,...,7}
{\node at (\x+0.5,3.5) {\x};}
\foreach \x/\v in {0/6,1/9,2/5,3/2,4/8,5/9,6/1,7/6}
{\node at (\x+0.5,2.5) {\v};}
\foreach \x/\v in {0/8,1/2,2/6,3/2,4/7,5/5,6/7,7/2}
{\node at (\x+0.5,1.5) {\v};}
\foreach \x/\v in {0/5,1/3,2/9,3/7,4/3,5/5,6/1,7/4}
{\node at (\x+0.5,0.5) {\v};}
\end{tikzpicture}
\end{center}

Let $\texttt{price}[x][d]$ denote the price of product $x$
on day $d$.
For example, in the above scenario $\texttt{price}[2][3] = 7$.
Then, let $\texttt{total}(S,d)$ denote the minimum total
price for buying a subset $S$ of products by day $d$.
Using this function, the solution to the problem is
$\texttt{total}(\{0 \ldots k-1\},n-1)$.

First, $\texttt{total}(\emptyset,d) = 0$,
because it does not cost anything to buy an empty set,
and $\texttt{total}(\{x\},0) = \texttt{price}[x][0]$,
because there is one way to buy one product on the first day.
Then, the following recurrence can be used:
\begin{equation*}
\begin{split}
\texttt{total}(S,d) = \min( & \texttt{total}(S,d-1), \\
& \min_{x \in S} (\texttt{total}(S \setminus x,d-1)+\texttt{price}[x][d]))
\end{split}
\end{equation*}
This means that we either do not buy any product on day $d$
or buy a product $x$ that belongs to $S$.
In the latter case, we remove $x$ from $S$ and add the
price of $x$ to the total price.

The next step is to calculate the values of the function
using dynamic programming.
To store the function values, we declare an array
\begin{lstlisting}
int total[1<<K][N];
\end{lstlisting}
where $K$ and $N$ are suitably large constants.
The first dimension of the array corresponds to a bit
representation of a subset.

First, the cases where $d=0$ can be processed as follows:
\begin{lstlisting}
for (int x = 0; x < k; x++) {
    total[1<<x][0] = price[x][0];
}
\end{lstlisting}
Then, the recurrence translates into the following code:
\begin{lstlisting}
for (int d = 1; d < n; d++) {
    for (int s = 0; s < (1<<k); s++) {
        total[s][d] = total[s][d-1];
        for (int x = 0; x < k; x++) {
            if (s&(1<<x)) {
                total[s][d] = min(total[s][d],
                    total[s^(1<<x)][d-1]+price[x][d]);
            }
        }
    }
}
\end{lstlisting}
The time complexity of the algorithm is $O(n 2^k k)$.
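As a sketch (ours, not from the original text), the recurrence can be run on the scenario above. Note that subsets with more than one product must be marked impossible on day 0, since only one product can be bought per day; here a large constant serves as infinity:
\begin{lstlisting}
int k = 3, n = 8;
int price[3][8] = {{6,9,5,2,8,9,1,6},
                   {8,2,6,2,7,5,7,2},
                   {5,3,9,7,3,5,1,4}};
int total[1<<3][8];
total[0][0] = 0;
// more than one product cannot be bought by day 0
for (int s = 1; s < (1<<k); s++) total[s][0] = 1e9;
for (int x = 0; x < k; x++) total[1<<x][0] = price[x][0];
for (int d = 1; d < n; d++) {
    for (int s = 0; s < (1<<k); s++) {
        total[s][d] = total[s][d-1];
        for (int x = 0; x < k; x++) {
            if (s&(1<<x)) {
                total[s][d] = min(total[s][d],
                    total[s^(1<<x)][d-1]+price[x][d]);
            }
        }
    }
}
cout << total[(1<<k)-1][n-1] << "\n"; // 5
\end{lstlisting}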

\subsubsection{From permutations to subsets}

Using dynamic programming, it is often possible
to change an iteration over permutations into
an iteration over subsets\footnote{This technique was introduced in 1962
by M. Held and R. M. Karp \cite{hel62}.}.
The benefit of this is that
$n!$, the number of permutations,
is much larger than $2^n$, the number of subsets.
For example, if $n=20$, then
$n! \approx 2.4 \cdot 10^{18}$ and $2^n \approx 10^6$.
Thus, for certain values of $n$,
we can efficiently go through the subsets but not through the permutations.

As an example, consider the following problem:
There is an elevator with maximum weight $x$,
and $n$ people with known weights
who want to get from the ground floor
to the top floor.
What is the minimum number of rides needed
if the people enter the elevator in an optimal order?

For example, suppose that $x=10$, $n=5$
and the weights are as follows:
\begin{center}
\begin{tabular}{ll}
person & weight \\
\hline
0 & 2 \\
1 & 3 \\
2 & 3 \\
3 & 5 \\
4 & 6 \\
\end{tabular}
\end{center}
In this case, the minimum number of rides is 2.
One optimal order is $\{0,2,3,1,4\}$,
which partitions the people into two rides:
first $\{0,2,3\}$ (total weight 10),
and then $\{1,4\}$ (total weight 9).

The problem can be easily solved in $O(n! n)$ time
by testing all possible permutations of $n$ people.
However, we can use dynamic programming to get
a more efficient $O(2^n n)$ time algorithm.
The idea is to calculate for each subset of people
two values: the minimum number of rides needed and
the minimum weight of people who ride in the last group.

Let $\texttt{weight}[p]$ denote the weight of
person $p$.
We define two functions:
$\texttt{rides}(S)$ is the minimum number of
rides for a subset $S$,
and $\texttt{last}(S)$ is the minimum weight
of the last ride.
For example, in the above scenario
\[ \texttt{rides}(\{1,3,4\})=2 \hspace{10px} \textrm{and}
\hspace{10px} \texttt{last}(\{1,3,4\})=5,\]
because the optimal rides are $\{1,4\}$ and $\{3\}$,
and the second ride has weight 5.
Of course, our final goal is to calculate the value
of $\texttt{rides}(\{0 \ldots n-1\})$.

We can calculate the values
of the functions recursively and then apply
dynamic programming.
The idea is to go through all people
who belong to $S$ and optimally
choose the last person $p$ who enters the elevator.
Each such choice yields a subproblem
for a smaller subset of people.
If $\texttt{last}(S \setminus p)+\texttt{weight}[p] \le x$,
we can add $p$ to the last ride.
Otherwise, we have to reserve a new ride
that initially only contains $p$.

To implement dynamic programming,
we declare an array
\begin{lstlisting}
pair<int,int> best[1<<N];
\end{lstlisting}
that contains for each subset $S$
a pair $(\texttt{rides}(S),\texttt{last}(S))$.
We set the value for an empty group as follows:
\begin{lstlisting}
best[0] = {1,0};
\end{lstlisting}
Then, we can fill the array as follows:

\begin{lstlisting}
for (int s = 1; s < (1<<n); s++) {
    // initial value: n+1 rides are needed
    best[s] = {n+1,0};
    for (int p = 0; p < n; p++) {
        if (s&(1<<p)) {
            auto option = best[s^(1<<p)];
            if (option.second+weight[p] <= x) {
                // add p to an existing ride
                option.second += weight[p];
            } else {
                // reserve a new ride for p
                option.first++;
                option.second = weight[p];
            }
            best[s] = min(best[s], option);
        }
    }
}
\end{lstlisting}
Note that the above loop guarantees that
for any two subsets $S_1$ and $S_2$
such that $S_1 \subset S_2$, we process $S_1$ before $S_2$.
Thus, the dynamic programming values are calculated in the
correct order.
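The algorithm can be tried out on the example scenario above ($x=10$ and weights 2, 3, 3, 5, 6); this self-contained sketch (ours) uses a \texttt{vector} instead of a fixed-size array and prints the answer 2:
\begin{lstlisting}
int x = 10, n = 5;
int weight[5] = {2,3,3,5,6};
vector<pair<int,int>> best(1<<n);
best[0] = {1,0};
for (int s = 1; s < (1<<n); s++) {
    best[s] = {n+1,0};
    for (int p = 0; p < n; p++) {
        if (s&(1<<p)) {
            auto option = best[s^(1<<p)];
            if (option.second+weight[p] <= x) {
                option.second += weight[p];
            } else {
                option.first++;
                option.second = weight[p];
            }
            best[s] = min(best[s], option);
        }
    }
}
cout << best[(1<<n)-1].first << "\n"; // 2
\end{lstlisting}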

\subsubsection{Counting subsets}

Our last problem in this chapter is as follows:
Let $X=\{0 \ldots n-1\}$, and each subset $S \subset X$
is assigned an integer $\texttt{value}[S]$.
Our task is to calculate for each $S$
\[\texttt{sum}(S) = \sum_{A \subset S} \texttt{value}[A],\]
i.e., the sum of values of subsets of $S$.

For example, suppose that $n=3$ and the values are as follows:
\begin{multicols}{2}
\begin{itemize}
\item $\texttt{value}[\emptyset] = 3$
\item $\texttt{value}[\{0\}] = 1$
\item $\texttt{value}[\{1\}] = 4$
\item $\texttt{value}[\{0,1\}] = 5$
\item $\texttt{value}[\{2\}] = 5$
\item $\texttt{value}[\{0,2\}] = 1$
\item $\texttt{value}[\{1,2\}] = 3$
\item $\texttt{value}[\{0,1,2\}] = 3$
\end{itemize}
\end{multicols}
In this case, for example,
\begin{equation*}
\begin{split}
\texttt{sum}(\{0,2\}) &= \texttt{value}[\emptyset]+\texttt{value}[\{0\}]+\texttt{value}[\{2\}]+\texttt{value}[\{0,2\}] \\
&= 3 + 1 + 5 + 1 = 10.
\end{split}
\end{equation*}

Because there are a total of $2^n$ subsets,
one possible solution is to go through all
pairs of subsets in $O(2^{2n})$ time.
However, using dynamic programming, we
can solve the problem in $O(2^n n)$ time.
The idea is to focus on sums where the
elements that may be removed from $S$ are restricted.

Let $\texttt{partial}(S,k)$ denote the sum of
values of subsets of $S$ with the restriction
that only elements $0 \ldots k$
may be removed from $S$.
For example,
\[\texttt{partial}(\{0,2\},1)=\texttt{value}[\{2\}]+\texttt{value}[\{0,2\}],\]
because we may only remove elements $0 \ldots 1$.
We can calculate values of \texttt{sum} using
values of \texttt{partial}, because
\[\texttt{sum}(S) = \texttt{partial}(S,n-1).\]
The base cases for the function are
\[\texttt{partial}(S,-1)=\texttt{value}[S],\]
because in this case no elements can be removed from $S$.
Then, in the general case we can use the following recurrence:
\begin{equation*}
\texttt{partial}(S,k) = \begin{cases}
\texttt{partial}(S,k-1) & k \notin S \\
\texttt{partial}(S,k-1) + \texttt{partial}(S \setminus \{k\},k-1) & k \in S
\end{cases}
\end{equation*}
Here we focus on the element $k$.
If $k \in S$, we have two options: we may either keep $k$ in $S$
or remove it from $S$.

There is a particularly clever way to implement the
calculation of sums. We can declare an array
\begin{lstlisting}
int sum[1<<N];
\end{lstlisting}
that will contain the sum of each subset.
The array is initialized as follows:
\begin{lstlisting}
for (int s = 0; s < (1<<n); s++) {
    sum[s] = value[s];
}
\end{lstlisting}
Then, we can fill the array as follows:
\begin{lstlisting}
for (int k = 0; k < n; k++) {
    for (int s = 0; s < (1<<n); s++) {
        if (s&(1<<k)) sum[s] += sum[s^(1<<k)];
    }
}
\end{lstlisting}
This code calculates the values of $\texttt{partial}(S,k)$
for $k=0 \ldots n-1$ into the array \texttt{sum}.
Since $\texttt{partial}(S,k)$ is always based on
$\texttt{partial}(S,k-1)$, we can reuse the array
\texttt{sum}, which yields a very efficient implementation.
\chapter{Basics of graphs}

Many programming problems can be solved by
modeling the problem as a graph problem
and using an appropriate graph algorithm.
A typical example of a graph is a network
of roads and cities in a country.
Sometimes, though, the graph is hidden
in the problem and it may be difficult to detect it.

This part of the book discusses graph algorithms,
especially focusing on topics that
are important in competitive programming.
In this chapter, we go through concepts
related to graphs,
and study different ways to represent graphs in algorithms.

\section{Graph terminology}

\index{graph}
\index{node}
\index{edge}

A \key{graph} consists of \key{nodes}
and \key{edges}. In this book,
the variable $n$ denotes the number of nodes
in a graph, and the variable $m$ denotes
the number of edges.
The nodes are numbered
using integers $1,2,\ldots,n$.

For example, the following graph consists of 5 nodes and 7 edges:

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (4,3) {$2$};
\node[draw, circle] (3) at (1,1) {$3$};
\node[draw, circle] (4) at (4,1) {$4$};
\node[draw, circle] (5) at (6,2) {$5$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (2) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (4) -- (5);
\end{tikzpicture}
\end{center}
\index{path}

A \key{path} leads from node $a$ to node $b$
through edges of the graph.
The \key{length} of a path is the number of
edges in it.
For example, the above graph contains
a path $1 \rightarrow 3 \rightarrow 4 \rightarrow 5$
of length 3
from node 1 to node 5:

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (4,3) {$2$};
\node[draw, circle] (3) at (1,1) {$3$};
\node[draw, circle] (4) at (4,1) {$4$};
\node[draw, circle] (5) at (6,2) {$5$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (2) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (4) -- (5);

\path[draw=red,thick,->,line width=2pt] (1) -- (3);
\path[draw=red,thick,->,line width=2pt] (3) -- (4);
\path[draw=red,thick,->,line width=2pt] (4) -- (5);
\end{tikzpicture}
\end{center}

\index{cycle}

A path is a \key{cycle} if its first and last
nodes are the same.
For example, the above graph contains
a cycle $1 \rightarrow 3 \rightarrow 4 \rightarrow 1$.
A path is \key{simple} if each node appears
at most once in the path.

%
% \begin{itemize}
% \item $1 \rightarrow 2 \rightarrow 5$ (length 2)
% \item $1 \rightarrow 4 \rightarrow 5$ (length 2)
% \item $1 \rightarrow 2 \rightarrow 4 \rightarrow 5$ (length 3)
% \item $1 \rightarrow 3 \rightarrow 4 \rightarrow 5$ (length 3)
% \item $1 \rightarrow 4 \rightarrow 2 \rightarrow 5$ (length 3)
% \item $1 \rightarrow 3 \rightarrow 4 \rightarrow 2 \rightarrow 5$ (length 4)
% \end{itemize}
\subsubsection{Connectivity}

\index{connected graph}

A graph is \key{connected} if there is a path
between any two nodes.
For example, the following graph is connected:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (4,3) {$2$};
\node[draw, circle] (3) at (1,1) {$3$};
\node[draw, circle] (4) at (4,1) {$4$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (2) -- (3);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (2) -- (4);
\end{tikzpicture}
\end{center}

The following graph is not connected,
because it is not possible to get
from node 4 to any other node:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (4,3) {$2$};
\node[draw, circle] (3) at (1,1) {$3$};
\node[draw, circle] (4) at (4,1) {$4$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (2) -- (3);
\end{tikzpicture}
\end{center}

\index{component}

The connected parts of a graph are
called its \key{components}.
For example, the following graph
contains three components:
$\{1,\,2,\,3\}$,
$\{4,\,5,\,6,\,7\}$ and
$\{8\}$.
\begin{center}
\begin{tikzpicture}[scale=0.8]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (4,3) {$2$};
\node[draw, circle] (3) at (1,1) {$3$};

\node[draw, circle] (6) at (6,1) {$6$};
\node[draw, circle] (7) at (9,1) {$7$};
\node[draw, circle] (4) at (6,3) {$4$};
\node[draw, circle] (5) at (9,3) {$5$};

\node[draw, circle] (8) at (11,2) {$8$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (2) -- (3);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (4) -- (5);
\path[draw,thick,-] (5) -- (7);
\path[draw,thick,-] (6) -- (7);
\path[draw,thick,-] (6) -- (4);
\end{tikzpicture}
\end{center}

\index{tree}

A \key{tree} is a connected graph
that consists of $n$ nodes and $n-1$ edges.
There is a unique path
between any two nodes of a tree.
For example, the following graph is a tree:

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (4,3) {$2$};
\node[draw, circle] (3) at (1,1) {$3$};
\node[draw, circle] (4) at (4,1) {$4$};
\node[draw, circle] (5) at (6,2) {$5$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
%\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (2) -- (4);
%\path[draw,thick,-] (4) -- (5);
\end{tikzpicture}
\end{center}
\subsubsection{Edge directions}

\index{directed graph}

A graph is \key{directed}
if the edges can be traversed
in one direction only.
For example, the following graph is directed:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (4,3) {$2$};
\node[draw, circle] (3) at (1,1) {$3$};
\node[draw, circle] (4) at (4,1) {$4$};
\node[draw, circle] (5) at (6,2) {$5$};
\path[draw,thick,->,>=latex] (1) -- (2);
\path[draw,thick,->,>=latex] (2) -- (4);
\path[draw,thick,->,>=latex] (2) -- (5);
\path[draw,thick,->,>=latex] (4) -- (5);
\path[draw,thick,->,>=latex] (4) -- (1);
\path[draw,thick,->,>=latex] (3) -- (1);
\end{tikzpicture}
\end{center}

The above graph contains
a path $3 \rightarrow 1 \rightarrow 2 \rightarrow 5$
from node $3$ to node $5$,
but there is no path from node $5$ to node $3$.
\subsubsection{Edge weights}

\index{weighted graph}

In a \key{weighted} graph, each edge is assigned
a \key{weight}.
The weights are often interpreted as edge lengths.
For example, the following graph is weighted:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (4,3) {$2$};
\node[draw, circle] (3) at (1,1) {$3$};
\node[draw, circle] (4) at (4,1) {$4$};
\node[draw, circle] (5) at (6,2) {$5$};
\path[draw,thick,-] (1) -- node[font=\small,label=above:5] {} (2);
\path[draw,thick,-] (1) -- node[font=\small,label=left:1] {} (3);
\path[draw,thick,-] (3) -- node[font=\small,label=below:7] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=above:7] {} (5);
\path[draw,thick,-] (4) -- node[font=\small,label=below:3] {} (5);
\end{tikzpicture}
\end{center}

The length of a path in a weighted graph
is the sum of the edge weights on the path.
For example, in the above graph,
the length of the path
$1 \rightarrow 2 \rightarrow 5$ is $12$,
and the length of the path
$1 \rightarrow 3 \rightarrow 4 \rightarrow 5$ is $11$.
The latter path is the \key{shortest} path from node $1$ to node $5$.
\subsubsection{Neighbors and degrees}

\index{neighbor}
\index{degree}

Two nodes are \key{neighbors} or \key{adjacent}
if there is an edge between them.
The \key{degree} of a node
is the number of its neighbors.
For example, in the following graph,
the neighbors of node 2 are 1, 4 and 5,
so its degree is 3.

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (4,3) {$2$};
\node[draw, circle] (3) at (1,1) {$3$};
\node[draw, circle] (4) at (4,1) {$4$};
\node[draw, circle] (5) at (6,2) {$5$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (2) -- (4);
\path[draw,thick,-] (2) -- (5);
%\path[draw,thick,-] (4) -- (5);
\end{tikzpicture}
\end{center}

The sum of degrees in a graph is always $2m$,
where $m$ is the number of edges,
because each edge
increases the degree of exactly two nodes by one.
For this reason, the sum of degrees is always even.
\index{regular graph}
\index{complete graph}

A graph is \key{regular} if the
degree of every node is a constant $d$.
A graph is \key{complete} if the
degree of every node is $n-1$, i.e.,
the graph contains all possible edges
between the nodes.

\index{indegree}
\index{outdegree}

In a directed graph, the \key{indegree}
of a node is the number of edges
that end at the node,
and the \key{outdegree} of a node
is the number of edges that start at the node.
For example, in the following graph,
the indegree of node 2 is 2,
and the outdegree of node 2 is 1.

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (4,3) {$2$};
\node[draw, circle] (3) at (1,1) {$3$};
\node[draw, circle] (4) at (4,1) {$4$};
\node[draw, circle] (5) at (6,2) {$5$};

\path[draw,thick,->,>=latex] (1) -- (2);
\path[draw,thick,->,>=latex] (1) -- (3);
\path[draw,thick,->,>=latex] (1) -- (4);
\path[draw,thick,->,>=latex] (3) -- (4);
\path[draw,thick,->,>=latex] (2) -- (4);
\path[draw,thick,<-,>=latex] (2) -- (5);
\end{tikzpicture}
\end{center}
\subsubsection{Colorings}

\index{coloring}
\index{bipartite graph}

In a \key{coloring} of a graph,
each node is assigned a color so that
no adjacent nodes have the same color.

A graph is \key{bipartite} if
it is possible to color it using two colors.
It turns out that a graph is bipartite
exactly when it does not contain a cycle
with an odd number of edges.
For example, the graph
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$2$};
\node[draw, circle] (2) at (4,3) {$3$};
\node[draw, circle] (3) at (1,1) {$5$};
\node[draw, circle] (4) at (4,1) {$6$};
\node[draw, circle] (5) at (-2,1) {$4$};
\node[draw, circle] (6) at (-2,3) {$1$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (2) -- (4);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (5) -- (6);
\end{tikzpicture}
\end{center}
is bipartite, because it can be colored as follows:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle, fill=blue!40] (1) at (1,3) {$2$};
\node[draw, circle, fill=red!40] (2) at (4,3) {$3$};
\node[draw, circle, fill=red!40] (3) at (1,1) {$5$};
\node[draw, circle, fill=blue!40] (4) at (4,1) {$6$};
\node[draw, circle, fill=red!40] (5) at (-2,1) {$4$};
\node[draw, circle, fill=blue!40] (6) at (-2,3) {$1$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (2) -- (4);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (5) -- (6);
\end{tikzpicture}
\end{center}
However, the graph
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$2$};
\node[draw, circle] (2) at (4,3) {$3$};
\node[draw, circle] (3) at (1,1) {$5$};
\node[draw, circle] (4) at (4,1) {$6$};
\node[draw, circle] (5) at (-2,1) {$4$};
\node[draw, circle] (6) at (-2,3) {$1$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (2) -- (4);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (5) -- (6);
\path[draw,thick,-] (1) -- (6);
\end{tikzpicture}
\end{center}
is not bipartite, because it is not possible to color
the following cycle of three nodes using two colors:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$2$};
\node[draw, circle] (2) at (4,3) {$3$};
\node[draw, circle] (3) at (1,1) {$5$};
\node[draw, circle] (4) at (4,1) {$6$};
\node[draw, circle] (5) at (-2,1) {$4$};
\node[draw, circle] (6) at (-2,3) {$1$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (2) -- (4);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (5) -- (6);
\path[draw,thick,-] (1) -- (6);

\path[draw=red,thick,-,line width=2pt] (1) -- (3);
\path[draw=red,thick,-,line width=2pt] (3) -- (6);
\path[draw=red,thick,-,line width=2pt] (6) -- (1);
\end{tikzpicture}
\end{center}
\subsubsection{Simplicity}

\index{simple graph}

A graph is \key{simple}
if no edge starts and ends at the same node,
and there are no multiple
edges between two nodes.
Often we assume that graphs are simple.
For example, the following graph is \emph{not} simple:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$2$};
\node[draw, circle] (2) at (4,3) {$3$};
\node[draw, circle] (3) at (1,1) {$5$};
\node[draw, circle] (4) at (4,1) {$6$};
\node[draw, circle] (5) at (-2,1) {$4$};
\node[draw, circle] (6) at (-2,3) {$1$};

\path[draw,thick,-] (1) edge [bend right=20] (2);
\path[draw,thick,-] (2) edge [bend right=20] (1);
%\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (2) -- (4);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (5) -- (6);

\tikzset{every loop/.style={in=135,out=190}}
\path[draw,thick,-] (5) edge [loop left] (5);
\end{tikzpicture}
\end{center}
\section{Graph representation}

There are several ways to represent graphs
in algorithms.
The choice of a data structure
depends on the size of the graph and
the way the algorithm processes it.
Next we will go through three common representations.

\subsubsection{Adjacency list representation}

\index{adjacency list}

In the adjacency list representation,
each node $x$ in the graph is assigned an \key{adjacency list}
that consists of nodes
to which there is an edge from $x$.
Adjacency lists are the most popular
way to represent graphs, and most algorithms can be
efficiently implemented using them.

A convenient way to store the adjacency lists is to declare
an array of vectors as follows:
\begin{lstlisting}
vector<int> adj[N];
\end{lstlisting}

The constant $N$ is chosen so that all
adjacency lists can be stored.
For example, the graph

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (3,1) {$4$};

\path[draw,thick,->,>=latex] (1) -- (2);
\path[draw,thick,->,>=latex] (2) -- (3);
\path[draw,thick,->,>=latex] (2) -- (4);
\path[draw,thick,->,>=latex] (3) -- (4);
\path[draw,thick,->,>=latex] (4) -- (1);
\end{tikzpicture}
\end{center}
can be stored as follows:
\begin{lstlisting}
adj[1].push_back(2);
adj[2].push_back(3);
adj[2].push_back(4);
adj[3].push_back(4);
adj[4].push_back(1);
\end{lstlisting}

If the graph is undirected, it can be stored in a similar way,
but each edge is added in both directions.

For a weighted graph, the structure can be extended
as follows:

\begin{lstlisting}
vector<pair<int,int>> adj[N];
\end{lstlisting}

In this case, the adjacency list of node $a$
contains the pair $(b,w)$
whenever there is an edge from node $a$ to node $b$
with weight $w$. For example, the graph

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (3,1) {$4$};

\path[draw,thick,->,>=latex] (1) -- node[font=\small,label=above:5] {} (2);
\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=above:7] {} (3);
\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=left:6] {} (4);
\path[draw,thick,->,>=latex] (3) -- node[font=\small,label=right:5] {} (4);
\path[draw,thick,->,>=latex] (4) -- node[font=\small,label=left:2] {} (1);
\end{tikzpicture}
\end{center}
can be stored as follows:
\begin{lstlisting}
adj[1].push_back({2,5});
adj[2].push_back({3,7});
adj[2].push_back({4,6});
adj[3].push_back({4,5});
adj[4].push_back({1,2});
\end{lstlisting}

The benefit of using adjacency lists is that
we can efficiently find the nodes to which
we can move from a given node through an edge.
For example, the following loop goes through all nodes
to which we can move from node $s$:

\begin{lstlisting}
for (auto u : adj[s]) {
  // process node u
}
\end{lstlisting}
\subsubsection{Adjacency matrix representation}

\index{adjacency matrix}

An \key{adjacency matrix} is a two-dimensional array
that indicates which edges the graph contains.
We can efficiently check from an adjacency matrix
if there is an edge between two nodes.
The matrix can be stored as an array
\begin{lstlisting}
int adj[N][N];
\end{lstlisting}
where each value $\texttt{adj}[a][b]$ indicates
whether the graph contains an edge from
node $a$ to node $b$.
If the edge is included in the graph,
then $\texttt{adj}[a][b]=1$,
and otherwise $\texttt{adj}[a][b]=0$.
For example, the graph
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (3,1) {$4$};

\path[draw,thick,->,>=latex] (1) -- (2);
\path[draw,thick,->,>=latex] (2) -- (3);
\path[draw,thick,->,>=latex] (2) -- (4);
\path[draw,thick,->,>=latex] (3) -- (4);
\path[draw,thick,->,>=latex] (4) -- (1);
\end{tikzpicture}
\end{center}
can be represented as follows:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (4,4);
\node at (0.5,0.5) {1};
\node at (1.5,0.5) {0};
\node at (2.5,0.5) {0};
\node at (3.5,0.5) {0};
\node at (0.5,1.5) {0};
\node at (1.5,1.5) {0};
\node at (2.5,1.5) {0};
\node at (3.5,1.5) {1};
\node at (0.5,2.5) {0};
\node at (1.5,2.5) {0};
\node at (2.5,2.5) {1};
\node at (3.5,2.5) {1};
\node at (0.5,3.5) {0};
\node at (1.5,3.5) {1};
\node at (2.5,3.5) {0};
\node at (3.5,3.5) {0};
\node at (-0.5,0.5) {4};
\node at (-0.5,1.5) {3};
\node at (-0.5,2.5) {2};
\node at (-0.5,3.5) {1};
\node at (0.5,4.5) {1};
\node at (1.5,4.5) {2};
\node at (2.5,4.5) {3};
\node at (3.5,4.5) {4};
\end{tikzpicture}
\end{center}

If the graph is weighted, the adjacency matrix
representation can be extended so that
the matrix contains the weight of the edge
if the edge exists.
Using this representation, the graph

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (3,1) {$4$};

\path[draw,thick,->,>=latex] (1) -- node[font=\small,label=above:5] {} (2);
\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=above:7] {} (3);
\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=left:6] {} (4);
\path[draw,thick,->,>=latex] (3) -- node[font=\small,label=right:5] {} (4);
\path[draw,thick,->,>=latex] (4) -- node[font=\small,label=left:2] {} (1);
\end{tikzpicture}
\end{center}
\begin{samepage}
corresponds to the following matrix:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (4,4);
\node at (0.5,0.5) {2};
\node at (1.5,0.5) {0};
\node at (2.5,0.5) {0};
\node at (3.5,0.5) {0};
\node at (0.5,1.5) {0};
\node at (1.5,1.5) {0};
\node at (2.5,1.5) {0};
\node at (3.5,1.5) {5};
\node at (0.5,2.5) {0};
\node at (1.5,2.5) {0};
\node at (2.5,2.5) {7};
\node at (3.5,2.5) {6};
\node at (0.5,3.5) {0};
\node at (1.5,3.5) {5};
\node at (2.5,3.5) {0};
\node at (3.5,3.5) {0};
\node at (-0.5,0.5) {4};
\node at (-0.5,1.5) {3};
\node at (-0.5,2.5) {2};
\node at (-0.5,3.5) {1};
\node at (0.5,4.5) {1};
\node at (1.5,4.5) {2};
\node at (2.5,4.5) {3};
\node at (3.5,4.5) {4};
\end{tikzpicture}
\end{center}
\end{samepage}

The drawback of the adjacency matrix representation
is that the matrix contains $n^2$ elements,
and usually most of them are zero.
For this reason, the representation cannot be used
if the graph is large.
\subsubsection{Edge list representation}

\index{edge list}

An \key{edge list} contains all edges of a graph
in some order.
This is a convenient way to represent a graph
if the algorithm processes all edges of the graph
and there is no need to find edges that start
at a given node.

The edge list can be stored in a vector
\begin{lstlisting}
vector<pair<int,int>> edges;
\end{lstlisting}
where each pair $(a,b)$ denotes that
there is an edge from node $a$ to node $b$.
Thus, the graph

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (3,1) {$4$};

\path[draw,thick,->,>=latex] (1) -- (2);
\path[draw,thick,->,>=latex] (2) -- (3);
\path[draw,thick,->,>=latex] (2) -- (4);
\path[draw,thick,->,>=latex] (3) -- (4);
\path[draw,thick,->,>=latex] (4) -- (1);
\end{tikzpicture}
\end{center}
can be represented as follows:
\begin{lstlisting}
edges.push_back({1,2});
edges.push_back({2,3});
edges.push_back({2,4});
edges.push_back({3,4});
edges.push_back({4,1});
\end{lstlisting}

\noindent
If the graph is weighted, the structure can
be extended as follows:
\begin{lstlisting}
vector<tuple<int,int,int>> edges;
\end{lstlisting}
Each element in this list is of the
form $(a,b,w)$, which means that there
is an edge from node $a$ to node $b$ with weight $w$.
For example, the graph

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (3,1) {$4$};

\path[draw,thick,->,>=latex] (1) -- node[font=\small,label=above:5] {} (2);
\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=above:7] {} (3);
\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=left:6] {} (4);
\path[draw,thick,->,>=latex] (3) -- node[font=\small,label=right:5] {} (4);
\path[draw,thick,->,>=latex] (4) -- node[font=\small,label=left:2] {} (1);
\end{tikzpicture}
\end{center}
\begin{samepage}
can be represented as follows\footnote{In some older compilers, the function
\texttt{make\_tuple} must be used instead of the braces (for example,
\texttt{make\_tuple(1,2,5)} instead of \texttt{\{1,2,5\}}).}:
\begin{lstlisting}
edges.push_back({1,2,5});
edges.push_back({2,3,7});
edges.push_back({2,4,6});
edges.push_back({3,4,5});
edges.push_back({4,1,2});
\end{lstlisting}
\end{samepage}
\chapter{Graph traversal}
|
||||
|
||||
This chapter discusses two fundamental
|
||||
graph algorithms:
|
||||
depth-first search and breadth-first search.
|
||||
Both algorithms are given a starting
|
||||
node in the graph,
|
||||
and they visit all nodes that can be reached
|
||||
from the starting node.
|
||||
The difference in the algorithms is the order
|
||||
in which they visit the nodes.
|
||||
|
||||
\section{Depth-first search}
|
||||
|
||||
\index{depth-first search}
|
||||
|
||||
\key{Depth-first search} (DFS)
|
||||
is a straightforward graph traversal technique.
|
||||
The algorithm begins at a starting node,
|
||||
and proceeds to all other nodes that are
|
||||
reachable from the starting node using
|
||||
the edges of the graph.
|
||||
|
||||
Depth-first search always follows a single
|
||||
path in the graph as long as it finds
|
||||
new nodes.
|
||||
After this, it returns to previous
|
||||
nodes and begins to explore other parts of the graph.
|
||||
The algorithm keeps track of visited nodes,
|
||||
so that it processes each node only once.
|
||||
|
||||
\subsubsection*{Example}
|
||||
|
||||
Let us consider how depth-first search processes
|
||||
the following graph:
|
||||
\begin{center}
|
||||
\begin{tikzpicture}
|
||||
\node[draw, circle] (1) at (1,5) {$1$};
|
||||
\node[draw, circle] (2) at (3,5) {$2$};
|
||||
\node[draw, circle] (3) at (5,4) {$3$};
|
||||
\node[draw, circle] (4) at (1,3) {$4$};
|
||||
\node[draw, circle] (5) at (3,3) {$5$};
|
||||
|
||||
\path[draw,thick,-] (1) -- (2);
|
||||
\path[draw,thick,-] (2) -- (3);
|
||||
\path[draw,thick,-] (1) -- (4);
|
||||
\path[draw,thick,-] (3) -- (5);
|
||||
\path[draw,thick,-] (2) -- (5);
|
||||
\end{tikzpicture}
|
||||
\end{center}
|
||||
We may begin the search at any node of the graph;
|
||||
now we will begin the search at node 1.
|
||||
|
||||
The search first proceeds to node 2:
|
||||
\begin{center}
|
||||
\begin{tikzpicture}
|
||||
\node[draw, circle,fill=lightgray] (1) at (1,5) {$1$};
|
||||
\node[draw, circle,fill=lightgray] (2) at (3,5) {$2$};
|
||||
\node[draw, circle] (3) at (5,4) {$3$};
|
||||
\node[draw, circle] (4) at (1,3) {$4$};
|
||||
\node[draw, circle] (5) at (3,3) {$5$};
|
||||
|
||||
\path[draw,thick,-] (1) -- (2);
|
||||
\path[draw,thick,-] (2) -- (3);
|
||||
\path[draw,thick,-] (1) -- (4);
|
||||
\path[draw,thick,-] (3) -- (5);
|
||||
\path[draw,thick,-] (2) -- (5);
|
||||
|
||||
\path[draw=red,thick,->,line width=2pt] (1) -- (2);
|
||||
\end{tikzpicture}
|
||||
\end{center}
|
||||
After this, nodes 3 and 5 will be visited:
\begin{center}
\begin{tikzpicture}
\node[draw, circle,fill=lightgray] (1) at (1,5) {$1$};
\node[draw, circle,fill=lightgray] (2) at (3,5) {$2$};
\node[draw, circle,fill=lightgray] (3) at (5,4) {$3$};
\node[draw, circle] (4) at (1,3) {$4$};
\node[draw, circle,fill=lightgray] (5) at (3,3) {$5$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (2) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (5);
\path[draw,thick,-] (2) -- (5);

\path[draw=red,thick,->,line width=2pt] (1) -- (2);
\path[draw=red,thick,->,line width=2pt] (2) -- (3);
\path[draw=red,thick,->,line width=2pt] (3) -- (5);
\end{tikzpicture}
\end{center}
The neighbors of node 5 are 2 and 3,
but the search has already visited both of them,
so it is time to return to the previous nodes.
Also the neighbors of nodes 3 and 2
have been visited, so we next move
from node 1 to node 4:
\begin{center}
\begin{tikzpicture}
\node[draw, circle,fill=lightgray] (1) at (1,5) {$1$};
\node[draw, circle,fill=lightgray] (2) at (3,5) {$2$};
\node[draw, circle,fill=lightgray] (3) at (5,4) {$3$};
\node[draw, circle,fill=lightgray] (4) at (1,3) {$4$};
\node[draw, circle,fill=lightgray] (5) at (3,3) {$5$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (2) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (5);
\path[draw,thick,-] (2) -- (5);

\path[draw=red,thick,->,line width=2pt] (1) -- (4);
\end{tikzpicture}
\end{center}
After this, the search terminates because it has visited
all nodes.
The time complexity of depth-first search is $O(n+m)$
where $n$ is the number of nodes and $m$ is the
number of edges,
because the algorithm processes each node and edge once.
\subsubsection*{Implementation}

Depth-first search can be conveniently
implemented using recursion.
The following function \texttt{dfs} begins
a depth-first search at a given node.
The function assumes that the graph is
stored as adjacency lists in an array
\begin{lstlisting}
vector<int> adj[N];
\end{lstlisting}
and also maintains an array
\begin{lstlisting}
bool visited[N];
\end{lstlisting}
that keeps track of the visited nodes.
Initially, each array value is \texttt{false},
and when the search arrives at node $s$,
the value of \texttt{visited}[$s$] becomes \texttt{true}.
The function can be implemented as follows:
\begin{lstlisting}
void dfs(int s) {
    if (visited[s]) return;
    visited[s] = true;
    // process node s
    for (auto u: adj[s]) {
        dfs(u);
    }
}
\end{lstlisting}
\section{Breadth-first search}

\index{breadth-first search}

\key{Breadth-first search} (BFS) visits the nodes
in increasing order of their distance
from the starting node.
Thus, we can calculate the distance
from the starting node to all other
nodes using breadth-first search.
However, breadth-first search is more difficult
to implement than depth-first search.

Breadth-first search goes through the nodes
one level after another.
First the search explores the nodes whose
distance from the starting node is 1,
then the nodes whose distance is 2, and so on.
This process continues until all nodes
have been visited.
\subsubsection*{Example}

Let us consider how breadth-first search processes
the following graph:

\begin{center}
\begin{tikzpicture}
\node[draw, circle] (1) at (1,5) {$1$};
\node[draw, circle] (2) at (3,5) {$2$};
\node[draw, circle] (3) at (5,5) {$3$};
\node[draw, circle] (4) at (1,3) {$4$};
\node[draw, circle] (5) at (3,3) {$5$};
\node[draw, circle] (6) at (5,3) {$6$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (2) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (5) -- (6);
\end{tikzpicture}
\end{center}
Suppose that the search begins at node 1.
First, we process all nodes that can be reached
from node 1 using a single edge:
\begin{center}
\begin{tikzpicture}
\node[draw, circle,fill=lightgray] (1) at (1,5) {$1$};
\node[draw, circle,fill=lightgray] (2) at (3,5) {$2$};
\node[draw, circle] (3) at (5,5) {$3$};
\node[draw, circle,fill=lightgray] (4) at (1,3) {$4$};
\node[draw, circle] (5) at (3,3) {$5$};
\node[draw, circle] (6) at (5,3) {$6$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (2) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (5) -- (6);

\path[draw=red,thick,->,line width=2pt] (1) -- (2);
\path[draw=red,thick,->,line width=2pt] (1) -- (4);
\end{tikzpicture}
\end{center}
After this, we proceed to nodes 3 and 5:
\begin{center}
\begin{tikzpicture}
\node[draw, circle,fill=lightgray] (1) at (1,5) {$1$};
\node[draw, circle,fill=lightgray] (2) at (3,5) {$2$};
\node[draw, circle,fill=lightgray] (3) at (5,5) {$3$};
\node[draw, circle,fill=lightgray] (4) at (1,3) {$4$};
\node[draw, circle,fill=lightgray] (5) at (3,3) {$5$};
\node[draw, circle] (6) at (5,3) {$6$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (2) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (5) -- (6);

\path[draw=red,thick,->,line width=2pt] (2) -- (3);
\path[draw=red,thick,->,line width=2pt] (2) -- (5);
\end{tikzpicture}
\end{center}
Finally, we visit node 6:
\begin{center}
\begin{tikzpicture}
\node[draw, circle,fill=lightgray] (1) at (1,5) {$1$};
\node[draw, circle,fill=lightgray] (2) at (3,5) {$2$};
\node[draw, circle,fill=lightgray] (3) at (5,5) {$3$};
\node[draw, circle,fill=lightgray] (4) at (1,3) {$4$};
\node[draw, circle,fill=lightgray] (5) at (3,3) {$5$};
\node[draw, circle,fill=lightgray] (6) at (5,3) {$6$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (2) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (5) -- (6);

\path[draw=red,thick,->,line width=2pt] (3) -- (6);
\path[draw=red,thick,->,line width=2pt] (5) -- (6);
\end{tikzpicture}
\end{center}
Now we have calculated the distances
from the starting node to all nodes of the graph.
The distances are as follows:

\begin{tabular}{ll}
\\
node & distance \\
\hline
1 & 0 \\
2 & 1 \\
3 & 2 \\
4 & 1 \\
5 & 2 \\
6 & 3 \\
\\
\end{tabular}

Like in depth-first search,
the time complexity of breadth-first search
is $O(n+m)$, where $n$ is the number of nodes
and $m$ is the number of edges.
\subsubsection*{Implementation}

Breadth-first search is more difficult
to implement than depth-first search,
because the algorithm visits nodes
in different parts of the graph.
A typical implementation is based on
a queue that contains nodes.
At each step, the next node in the queue
will be processed.

The following code assumes that the graph is stored
as adjacency lists and maintains the following
data structures:
\begin{lstlisting}
queue<int> q;
bool visited[N];
int distance[N];
\end{lstlisting}

The queue \texttt{q}
contains nodes to be processed
in increasing order of their distance.
New nodes are always added to the end
of the queue, and the node at the beginning
of the queue is the next node to be processed.
The array \texttt{visited} indicates
which nodes the search has already visited,
and the array \texttt{distance} will contain the
distances from the starting node to all nodes of the graph.

The search can be implemented as follows,
starting at node $x$:
\begin{lstlisting}
visited[x] = true;
distance[x] = 0;
q.push(x);
while (!q.empty()) {
    int s = q.front(); q.pop();
    // process node s
    for (auto u : adj[s]) {
        if (visited[u]) continue;
        visited[u] = true;
        distance[u] = distance[s]+1;
        q.push(u);
    }
}
\end{lstlisting}
\section{Applications}

Using the graph traversal algorithms,
we can check many properties of graphs.
Usually, both depth-first search and
breadth-first search may be used,
but in practice, depth-first search
is a better choice, because it is
easier to implement.
In the following applications we will
assume that the graph is undirected.
\subsubsection{Connectivity check}

\index{connected graph}

A graph is connected if there is a path
between any two nodes of the graph.
Thus, we can check if a graph is connected
by starting at an arbitrary node and
finding out if we can reach all other nodes.

For example, in the graph
\begin{center}
\begin{tikzpicture}
\node[draw, circle] (2) at (7,5) {$2$};
\node[draw, circle] (1) at (3,5) {$1$};
\node[draw, circle] (3) at (5,4) {$3$};
\node[draw, circle] (5) at (7,3) {$5$};
\node[draw, circle] (4) at (3,3) {$4$};

\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (2) -- (5);
\end{tikzpicture}
\end{center}
a depth-first search from node $1$ visits
the following nodes:
\begin{center}
\begin{tikzpicture}
\node[draw, circle] (2) at (7,5) {$2$};
\node[draw, circle,fill=lightgray] (1) at (3,5) {$1$};
\node[draw, circle,fill=lightgray] (3) at (5,4) {$3$};
\node[draw, circle] (5) at (7,3) {$5$};
\node[draw, circle,fill=lightgray] (4) at (3,3) {$4$};

\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (2) -- (5);

\path[draw=red,thick,->,line width=2pt] (1) -- (3);
\path[draw=red,thick,->,line width=2pt] (3) -- (4);
\end{tikzpicture}
\end{center}

Since the search did not visit all the nodes,
we can conclude that the graph is not connected.
In a similar way, we can also find all connected components
of a graph by iterating through the nodes and always
starting a new depth-first search if the current node
does not belong to any component yet.
\subsubsection{Finding cycles}

\index{cycle}

A graph contains a cycle if during a graph traversal,
we find a node whose neighbor (other than the
previous node in the current path) has already been
visited.
For example, the graph
\begin{center}
\begin{tikzpicture}
\node[draw, circle] (2) at (7,5) {$2$};
\node[draw, circle] (1) at (3,5) {$1$};
\node[draw, circle] (3) at (5,4) {$3$};
\node[draw, circle] (5) at (7,3) {$5$};
\node[draw, circle] (4) at (3,3) {$4$};

\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (2) -- (3);
\path[draw,thick,-] (3) -- (5);
\end{tikzpicture}
\end{center}
contains two cycles and we can find one
of them as follows:
\begin{center}
\begin{tikzpicture}
\node[draw, circle,fill=lightgray] (2) at (7,5) {$2$};
\node[draw, circle,fill=lightgray] (1) at (3,5) {$1$};
\node[draw, circle,fill=lightgray] (3) at (5,4) {$3$};
\node[draw, circle,fill=lightgray] (5) at (7,3) {$5$};
\node[draw, circle] (4) at (3,3) {$4$};

\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (2) -- (3);
\path[draw,thick,-] (3) -- (5);

\path[draw=red,thick,->,line width=2pt] (1) -- (3);
\path[draw=red,thick,->,line width=2pt] (3) -- (2);
\path[draw=red,thick,->,line width=2pt] (2) -- (5);
\end{tikzpicture}
\end{center}
After moving from node 2 to node 5 we notice that
the neighbor 3 of node 5 has already been visited.
Thus, the graph contains a cycle that goes through node 3,
for example, $3 \rightarrow 2 \rightarrow 5 \rightarrow 3$.
Another way to find out whether a graph contains a cycle
is to simply calculate the number of nodes and edges
in every component.
If a component contains $c$ nodes and no cycle,
it must contain exactly $c-1$ edges
(so it has to be a tree).
If there are $c$ or more edges, the component
surely contains a cycle.
\subsubsection{Bipartiteness check}

\index{bipartite graph}

A graph is bipartite if its nodes can be colored
using two colors so that there are no adjacent
nodes with the same color.
It is surprisingly easy to check if a graph
is bipartite using graph traversal algorithms.

The idea is to color the starting node blue,
all its neighbors red, all their neighbors blue, and so on.
If at some point of the search we notice that
two adjacent nodes have the same color,
this means that the graph is not bipartite.
Otherwise the graph is bipartite and one coloring
has been found.

For example, the graph
\begin{center}
\begin{tikzpicture}
\node[draw, circle] (2) at (5,5) {$2$};
\node[draw, circle] (1) at (3,5) {$1$};
\node[draw, circle] (3) at (7,4) {$3$};
\node[draw, circle] (5) at (5,3) {$5$};
\node[draw, circle] (4) at (3,3) {$4$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (5) -- (4);
\path[draw,thick,-] (4) -- (1);
\path[draw,thick,-] (2) -- (3);
\path[draw,thick,-] (5) -- (3);
\end{tikzpicture}
\end{center}
is not bipartite, because a search from node 1
proceeds as follows:
\begin{center}
\begin{tikzpicture}
\node[draw, circle,fill=red!40] (2) at (5,5) {$2$};
\node[draw, circle,fill=blue!40] (1) at (3,5) {$1$};
\node[draw, circle,fill=blue!40] (3) at (7,4) {$3$};
\node[draw, circle,fill=red!40] (5) at (5,3) {$5$};
\node[draw, circle] (4) at (3,3) {$4$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (5) -- (4);
\path[draw,thick,-] (4) -- (1);
\path[draw,thick,-] (2) -- (3);
\path[draw,thick,-] (5) -- (3);

\path[draw=red,thick,->,line width=2pt] (1) -- (2);
\path[draw=red,thick,->,line width=2pt] (2) -- (3);
\path[draw=red,thick,->,line width=2pt] (3) -- (5);
\path[draw=red,thick,->,line width=2pt] (5) -- (2);
\end{tikzpicture}
\end{center}
We notice that the color of both nodes 2 and 5
is red, while they are adjacent nodes in the graph.
Thus, the graph is not bipartite.
This algorithm always works, because when there
are only two colors available,
the color of the starting node in a component
determines the colors of all other nodes in the component.
It does not make any difference whether the
starting node is red or blue.

Note that in the general case,
it is difficult to find out if the nodes
in a graph can be colored using $k$ colors
so that no adjacent nodes have the same color.
Even when $k=3$, no efficient algorithm is known;
the problem is NP-hard.
\chapter{Shortest paths}

\index{shortest path}

Finding a shortest path between two nodes
of a graph
is an important problem that has many
practical applications.
For example, a natural problem related to a road network
is to calculate the shortest possible length of a route
between two cities, given the lengths of the roads.

In an unweighted graph, the length of a path equals
the number of its edges, and we can
simply use breadth-first search to find
a shortest path.
However, in this chapter we focus on
weighted graphs
where more sophisticated algorithms
are needed
for finding shortest paths.
\section{Bellman–Ford algorithm}

\index{Bellman–Ford algorithm}

The \key{Bellman–Ford algorithm}\footnote{The algorithm is named after
R. E. Bellman and L. R. Ford who published it independently
in 1958 and 1956, respectively \cite{bel58,for56a}.} finds
shortest paths from a starting node to all
nodes of the graph.
The algorithm can process all kinds of graphs,
provided that the graph does not contain a
cycle with negative length.
If the graph contains a negative cycle,
the algorithm can detect this.

The algorithm keeps track of distances
from the starting node to all nodes of the graph.
Initially, the distance to the starting node is 0
and the distance to all other nodes is infinite.
The algorithm reduces the distances by finding
edges that shorten the paths until it is not
possible to reduce any distance.
\subsubsection{Example}

Let us consider how the Bellman–Ford algorithm
works in the following graph:
\begin{center}
\begin{tikzpicture}
\node[draw, circle] (1) at (1,3) {1};
\node[draw, circle] (2) at (4,3) {2};
\node[draw, circle] (3) at (1,1) {3};
\node[draw, circle] (4) at (4,1) {4};
\node[draw, circle] (5) at (6,2) {5};
\node[color=red] at (1,3+0.55) {$0$};
\node[color=red] at (4,3+0.55) {$\infty$};
\node[color=red] at (1,1-0.55) {$\infty$};
\node[color=red] at (4,1-0.55) {$\infty$};
\node[color=red] at (6,2-0.55) {$\infty$};
\path[draw,thick,-] (1) -- node[font=\small,label=above:5] {} (2);
\path[draw,thick,-] (1) -- node[font=\small,label=left:3] {} (3);
\path[draw,thick,-] (3) -- node[font=\small,label=below:1] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=left:3] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
\path[draw,thick,-] (4) -- node[font=\small,label=below:2] {} (5);
\path[draw,thick,-] (1) -- node[font=\small,label=above:7] {} (4);
\end{tikzpicture}
\end{center}
Each node of the graph is assigned a distance.
Initially, the distance to the starting node is 0,
and the distance to all other nodes is infinite.

The algorithm searches for edges that reduce distances.
First, all edges from node 1 reduce distances:
\begin{center}
\begin{tikzpicture}
\node[draw, circle] (1) at (1,3) {1};
\node[draw, circle] (2) at (4,3) {2};
\node[draw, circle] (3) at (1,1) {3};
\node[draw, circle] (4) at (4,1) {4};
\node[draw, circle] (5) at (6,2) {5};
\node[color=red] at (1,3+0.55) {$0$};
\node[color=red] at (4,3+0.55) {$5$};
\node[color=red] at (1,1-0.55) {$3$};
\node[color=red] at (4,1-0.55) {$7$};
\node[color=red] at (6,2-0.55) {$\infty$};
\path[draw,thick,-] (1) -- node[font=\small,label=above:5] {} (2);
\path[draw,thick,-] (1) -- node[font=\small,label=left:3] {} (3);
\path[draw,thick,-] (3) -- node[font=\small,label=below:1] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=left:3] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
\path[draw,thick,-] (4) -- node[font=\small,label=below:2] {} (5);
\path[draw,thick,-] (1) -- node[font=\small,label=above:7] {} (4);

\path[draw=red,thick,->,line width=2pt] (1) -- (2);
\path[draw=red,thick,->,line width=2pt] (1) -- (3);
\path[draw=red,thick,->,line width=2pt] (1) -- (4);
\end{tikzpicture}
\end{center}
After this, edges
$2 \rightarrow 5$ and $3 \rightarrow 4$
reduce distances:
\begin{center}
\begin{tikzpicture}
\node[draw, circle] (1) at (1,3) {1};
\node[draw, circle] (2) at (4,3) {2};
\node[draw, circle] (3) at (1,1) {3};
\node[draw, circle] (4) at (4,1) {4};
\node[draw, circle] (5) at (6,2) {5};
\node[color=red] at (1,3+0.55) {$0$};
\node[color=red] at (4,3+0.55) {$5$};
\node[color=red] at (1,1-0.55) {$3$};
\node[color=red] at (4,1-0.55) {$4$};
\node[color=red] at (6,2-0.55) {$7$};
\path[draw,thick,-] (1) -- node[font=\small,label=above:5] {} (2);
\path[draw,thick,-] (1) -- node[font=\small,label=left:3] {} (3);
\path[draw,thick,-] (3) -- node[font=\small,label=below:1] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=left:3] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
\path[draw,thick,-] (4) -- node[font=\small,label=below:2] {} (5);
\path[draw,thick,-] (1) -- node[font=\small,label=above:7] {} (4);

\path[draw=red,thick,->,line width=2pt] (2) -- (5);
\path[draw=red,thick,->,line width=2pt] (3) -- (4);
\end{tikzpicture}
\end{center}
Finally, there is one more change:
\begin{center}
\begin{tikzpicture}
\node[draw, circle] (1) at (1,3) {1};
\node[draw, circle] (2) at (4,3) {2};
\node[draw, circle] (3) at (1,1) {3};
\node[draw, circle] (4) at (4,1) {4};
\node[draw, circle] (5) at (6,2) {5};
\node[color=red] at (1,3+0.55) {$0$};
\node[color=red] at (4,3+0.55) {$5$};
\node[color=red] at (1,1-0.55) {$3$};
\node[color=red] at (4,1-0.55) {$4$};
\node[color=red] at (6,2-0.55) {$6$};
\path[draw,thick,-] (1) -- node[font=\small,label=above:5] {} (2);
\path[draw,thick,-] (1) -- node[font=\small,label=left:3] {} (3);
\path[draw,thick,-] (3) -- node[font=\small,label=below:1] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=left:3] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
\path[draw,thick,-] (4) -- node[font=\small,label=below:2] {} (5);
\path[draw,thick,-] (1) -- node[font=\small,label=above:7] {} (4);

\path[draw=red,thick,->,line width=2pt] (4) -- (5);
\end{tikzpicture}
\end{center}

After this, no edge can reduce any distance.
This means that the distances are final,
and we have successfully
calculated the shortest distances
from the starting node to all nodes of the graph.

For example, the shortest distance 6
from node 1 to node 5 corresponds to
the following path:

\begin{center}
\begin{tikzpicture}
\node[draw, circle] (1) at (1,3) {1};
\node[draw, circle] (2) at (4,3) {2};
\node[draw, circle] (3) at (1,1) {3};
\node[draw, circle] (4) at (4,1) {4};
\node[draw, circle] (5) at (6,2) {5};
\node[color=red] at (1,3+0.55) {$0$};
\node[color=red] at (4,3+0.55) {$5$};
\node[color=red] at (1,1-0.55) {$3$};
\node[color=red] at (4,1-0.55) {$4$};
\node[color=red] at (6,2-0.55) {$6$};
\path[draw,thick,-] (1) -- node[font=\small,label=above:5] {} (2);
\path[draw,thick,-] (1) -- node[font=\small,label=left:3] {} (3);
\path[draw,thick,-] (3) -- node[font=\small,label=below:1] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=left:3] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
\path[draw,thick,-] (4) -- node[font=\small,label=below:2] {} (5);
\path[draw,thick,-] (1) -- node[font=\small,label=above:7] {} (4);

\path[draw=red,thick,->,line width=2pt] (1) -- (3);
\path[draw=red,thick,->,line width=2pt] (3) -- (4);
\path[draw=red,thick,->,line width=2pt] (4) -- (5);
\end{tikzpicture}
\end{center}

\subsubsection{Implementation}

The following implementation of the
Bellman–Ford algorithm determines the shortest distances
from a node $x$ to all nodes of the graph.
The code assumes that the graph is stored
as an edge list \texttt{edges}
that consists of tuples of the form $(a,b,w)$,
meaning that there is an edge from node $a$ to node $b$
with weight $w$.

The algorithm consists of $n-1$ rounds,
and on each round the algorithm goes through
all edges of the graph and tries to
reduce the distances.
The algorithm constructs an array \texttt{distance}
that will contain the distances from $x$
to all nodes of the graph.
The constant \texttt{INF} denotes an infinite distance.

\begin{lstlisting}
for (int i = 1; i <= n; i++) distance[i] = INF;
distance[x] = 0;
for (int i = 1; i <= n-1; i++) {
    for (auto e : edges) {
        int a, b, w;
        tie(a, b, w) = e;
        distance[b] = min(distance[b], distance[a]+w);
    }
}
\end{lstlisting}

The time complexity of the algorithm is $O(nm)$,
because the algorithm consists of $n-1$ rounds and
iterates through all $m$ edges during a round.
If there are no negative cycles in the graph,
all distances are final after $n-1$ rounds,
because each shortest path can contain at most $n-1$ edges.

In practice, the final distances can usually
be found faster than in $n-1$ rounds.
Thus, a possible way to make the algorithm more efficient
is to stop the algorithm if no distance
can be reduced during a round.
\subsubsection{Negative cycles}

\index{negative cycle}

The Bellman–Ford algorithm can also be used to
check if the graph contains a cycle with negative length.
For example, the graph

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,0) {$1$};
\node[draw, circle] (2) at (2,1) {$2$};
\node[draw, circle] (3) at (2,-1) {$3$};
\node[draw, circle] (4) at (4,0) {$4$};

\path[draw,thick,-] (1) -- node[font=\small,label=above:$3$] {} (2);
\path[draw,thick,-] (2) -- node[font=\small,label=above:$1$] {} (4);
\path[draw,thick,-] (1) -- node[font=\small,label=below:$5$] {} (3);
\path[draw,thick,-] (3) -- node[font=\small,label=below:$-7$] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=right:$2$] {} (3);
\end{tikzpicture}
\end{center}
\noindent
contains a negative cycle
$2 \rightarrow 3 \rightarrow 4 \rightarrow 2$
with length $-4$.

If the graph contains a negative cycle,
we can shorten any path that contains the cycle
infinitely many times by repeating the cycle
again and again.
Thus, the concept of a shortest path
is not meaningful in this situation.

A negative cycle can be detected
using the Bellman–Ford algorithm by
running the algorithm for $n$ rounds.
If the last round reduces any distance,
the graph contains a negative cycle.
Note that this algorithm can be used to
search for
a negative cycle in the whole graph
regardless of the starting node.
\subsubsection{SPFA algorithm}

\index{SPFA algorithm}

The \key{SPFA algorithm} (``Shortest Path Faster Algorithm'') \cite{fan94}
is a variant of the Bellman–Ford algorithm,
that is often more efficient than the original algorithm.
The SPFA algorithm does not go through all the edges on each round,
but instead, it chooses the edges to be examined
in a more intelligent way.

The algorithm maintains a queue of nodes that might
be used for reducing the distances.
First, the algorithm adds the starting node $x$
to the queue.
Then, the algorithm always processes the
first node in the queue, and when an edge
$a \rightarrow b$ reduces a distance,
node $b$ is added to the queue.
%
% The following implementation uses a
% \texttt{queue} \texttt{q}.
% In addition, an array \texttt{inqueue} indicates
% if a node is already in the queue,
% in which case the algorithm does not add
% the node to the queue again.
%
% \begin{lstlisting}
% for (int i = 1; i <= n; i++) distance[i] = INF;
% distance[x] = 0;
% q.push(x);
% while (!q.empty()) {
%     int a = q.front(); q.pop();
%     inqueue[a] = false;
%     for (auto b : v[a]) {
%         if (distance[a]+b.second < distance[b.first]) {
%             distance[b.first] = distance[a]+b.second;
%             if (!inqueue[b.first]) {q.push(b.first); inqueue[b.first] = true;}
%         }
%     }
% }
% \end{lstlisting}
|
||||
The efficiency of the SPFA algorithm depends
|
||||
on the structure of the graph:
|
||||
the algorithm is often efficient,
|
||||
but its worst case time complexity is still
|
||||
$O(nm)$ and it is possible to create inputs
|
||||
that make the algorithm as slow as the
|
||||
original Bellman–Ford algorithm.
|
||||
|
||||
\section{Dijkstra's algorithm}

\index{Dijkstra's algorithm}

\key{Dijkstra's algorithm}\footnote{E. W. Dijkstra published the algorithm in 1959 \cite{dij59};
however, his original paper does not mention how to implement the algorithm efficiently.}
finds shortest
paths from the starting node to all nodes of the graph,
like the Bellman–Ford algorithm.
The benefit of Dijkstra's algorithm is that
it is more efficient and can be used for
processing large graphs.
However, the algorithm requires that there
are no negative weight edges in the graph.

Like the Bellman–Ford algorithm,
Dijkstra's algorithm maintains distances
to the nodes and reduces them during the search.
Dijkstra's algorithm is efficient, because
it only processes
each edge in the graph once, using the fact
that there are no negative edges.
\subsubsection{Example}

Let us consider how Dijkstra's algorithm
works in the following graph when the
starting node is node 1:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {3};
\node[draw, circle] (2) at (4,3) {4};
\node[draw, circle] (3) at (1,1) {2};
\node[draw, circle] (4) at (4,1) {1};
\node[draw, circle] (5) at (6,2) {5};

\node[color=red] at (1,3+0.6) {$\infty$};
\node[color=red] at (4,3+0.6) {$\infty$};
\node[color=red] at (1,1-0.6) {$\infty$};
\node[color=red] at (4,1-0.6) {$0$};
\node[color=red] at (6,2-0.6) {$\infty$};

\path[draw,thick,-] (1) -- node[font=\small,label=above:6] {} (2);
\path[draw,thick,-] (1) -- node[font=\small,label=left:2] {} (3);
\path[draw,thick,-] (3) -- node[font=\small,label=below:5] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=left:9] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
\path[draw,thick,-] (4) -- node[font=\small,label=below:1] {} (5);
\end{tikzpicture}
\end{center}
Like in the Bellman–Ford algorithm,
|
||||
initially the distance to the starting node is 0
|
||||
and the distance to all other nodes is infinite.
|
||||
|
||||
At each step, Dijkstra's algorithm selects a node
|
||||
that has not been processed yet and whose distance
|
||||
is as small as possible.
|
||||
The first such node is node 1 with distance 0.
|
||||
|
||||
When a node is selected, the algorithm
|
||||
goes through all edges that start at the node
|
||||
and reduces the distances using them:
|
||||
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {3};
\node[draw, circle] (2) at (4,3) {4};
\node[draw, circle] (3) at (1,1) {2};
\node[draw, circle, fill=lightgray] (4) at (4,1) {1};
\node[draw, circle] (5) at (6,2) {5};

\node[color=red] at (1,3+0.6) {$\infty$};
\node[color=red] at (4,3+0.6) {$9$};
\node[color=red] at (1,1-0.6) {$5$};
\node[color=red] at (4,1-0.6) {$0$};
\node[color=red] at (6,2-0.6) {$1$};

\path[draw,thick,-] (1) -- node[font=\small,label=above:6] {} (2);
\path[draw,thick,-] (1) -- node[font=\small,label=left:2] {} (3);
\path[draw,thick,-] (3) -- node[font=\small,label=below:5] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=left:9] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
\path[draw,thick,-] (4) -- node[font=\small,label=below:1] {} (5);

\path[draw=red,thick,->,line width=2pt] (4) -- (2);
\path[draw=red,thick,->,line width=2pt] (4) -- (3);
\path[draw=red,thick,->,line width=2pt] (4) -- (5);
\end{tikzpicture}
\end{center}
In this case,
the edges from node 1 reduced the distances of
nodes 2, 4 and 5, whose distances are now 5, 9 and 1.

The next node to be processed is node 5 with distance 1.
This reduces the distance to node 4 from 9 to 3:
\begin{center}
\begin{tikzpicture}
\node[draw, circle] (1) at (1,3) {3};
\node[draw, circle] (2) at (4,3) {4};
\node[draw, circle] (3) at (1,1) {2};
\node[draw, circle, fill=lightgray] (4) at (4,1) {1};
\node[draw, circle, fill=lightgray] (5) at (6,2) {5};

\node[color=red] at (1,3+0.6) {$\infty$};
\node[color=red] at (4,3+0.6) {$3$};
\node[color=red] at (1,1-0.6) {$5$};
\node[color=red] at (4,1-0.6) {$0$};
\node[color=red] at (6,2-0.6) {$1$};

\path[draw,thick,-] (1) -- node[font=\small,label=above:6] {} (2);
\path[draw,thick,-] (1) -- node[font=\small,label=left:2] {} (3);
\path[draw,thick,-] (3) -- node[font=\small,label=below:5] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=left:9] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
\path[draw,thick,-] (4) -- node[font=\small,label=below:1] {} (5);

\path[draw=red,thick,->,line width=2pt] (5) -- (2);
\end{tikzpicture}
\end{center}
After this, the next node is node 4, which reduces
the distance to node 3 to 9:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {3};
\node[draw, circle, fill=lightgray] (2) at (4,3) {4};
\node[draw, circle] (3) at (1,1) {2};
\node[draw, circle, fill=lightgray] (4) at (4,1) {1};
\node[draw, circle, fill=lightgray] (5) at (6,2) {5};

\node[color=red] at (1,3+0.6) {$9$};
\node[color=red] at (4,3+0.6) {$3$};
\node[color=red] at (1,1-0.6) {$5$};
\node[color=red] at (4,1-0.6) {$0$};
\node[color=red] at (6,2-0.6) {$1$};

\path[draw,thick,-] (1) -- node[font=\small,label=above:6] {} (2);
\path[draw,thick,-] (1) -- node[font=\small,label=left:2] {} (3);
\path[draw,thick,-] (3) -- node[font=\small,label=below:5] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=left:9] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
\path[draw,thick,-] (4) -- node[font=\small,label=below:1] {} (5);

\path[draw=red,thick,->,line width=2pt] (2) -- (1);
\end{tikzpicture}
\end{center}

A remarkable property of Dijkstra's algorithm is that
whenever a node is selected, its distance is final.
For example, at this point of the algorithm,
the distances 0, 1 and 3 are the final distances
to nodes 1, 5 and 4.

After this, the algorithm processes the two
remaining nodes, and the final distances are as follows:

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle, fill=lightgray] (1) at (1,3) {3};
\node[draw, circle, fill=lightgray] (2) at (4,3) {4};
\node[draw, circle, fill=lightgray] (3) at (1,1) {2};
\node[draw, circle, fill=lightgray] (4) at (4,1) {1};
\node[draw, circle, fill=lightgray] (5) at (6,2) {5};

\node[color=red] at (1,3+0.6) {$7$};
\node[color=red] at (4,3+0.6) {$3$};
\node[color=red] at (1,1-0.6) {$5$};
\node[color=red] at (4,1-0.6) {$0$};
\node[color=red] at (6,2-0.6) {$1$};

\path[draw,thick,-] (1) -- node[font=\small,label=above:6] {} (2);
\path[draw,thick,-] (1) -- node[font=\small,label=left:2] {} (3);
\path[draw,thick,-] (3) -- node[font=\small,label=below:5] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=left:9] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
\path[draw,thick,-] (4) -- node[font=\small,label=below:1] {} (5);
\end{tikzpicture}
\end{center}

\subsubsection{Negative edges}

The efficiency of Dijkstra's algorithm is
based on the fact that the graph does not
contain negative edges.
If there is a negative edge,
the algorithm may give incorrect results.
As an example, consider the following graph:

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,0) {$1$};
\node[draw, circle] (2) at (2,1) {$2$};
\node[draw, circle] (3) at (2,-1) {$3$};
\node[draw, circle] (4) at (4,0) {$4$};

\path[draw,thick,-] (1) -- node[font=\small,label=above:2] {} (2);
\path[draw,thick,-] (2) -- node[font=\small,label=above:3] {} (4);
\path[draw,thick,-] (1) -- node[font=\small,label=below:6] {} (3);
\path[draw,thick,-] (3) -- node[font=\small,label=below:$-5$] {} (4);
\end{tikzpicture}
\end{center}
\noindent
The shortest path from node 1 to node 4 is
$1 \rightarrow 3 \rightarrow 4$
and its length is 1.
However, Dijkstra's algorithm
finds the path $1 \rightarrow 2 \rightarrow 4$
by following the minimum weight edges.
The algorithm does not take into account that
on the other path, the weight $-5$
compensates the previous large weight $6$.

\subsubsection{Implementation}

The following implementation of Dijkstra's algorithm
calculates the minimum distances from a node $x$
to other nodes of the graph.
The graph is stored as adjacency lists
so that \texttt{adj[$a$]} contains a pair $(b,w)$
whenever there is an edge from node $a$ to node $b$
with weight $w$.

An efficient implementation of Dijkstra's algorithm
requires that it is possible to efficiently find the
minimum distance node that has not been processed.
An appropriate data structure for this is a priority queue
that contains the nodes ordered by their distances.
Using a priority queue, the next node to be processed
can be retrieved in logarithmic time.

In the following code, the priority queue
\texttt{q} contains pairs of the form $(-d,x)$,
meaning that the current distance to node $x$ is $d$.
The array $\texttt{distance}$ contains the distance to
each node, and the array $\texttt{processed}$ indicates
whether a node has been processed.
Initially the distance is $0$ to $x$ and $\infty$ to all other nodes.

\begin{lstlisting}
for (int i = 1; i <= n; i++) distance[i] = INF;
distance[x] = 0;
q.push({0,x});
while (!q.empty()) {
    int a = q.top().second; q.pop();
    if (processed[a]) continue;
    processed[a] = true;
    for (auto u : adj[a]) {
        int b = u.first, w = u.second;
        if (distance[a]+w < distance[b]) {
            distance[b] = distance[a]+w;
            q.push({-distance[b],b});
        }
    }
}
\end{lstlisting}

Note that the priority queue contains \emph{negative}
distances to nodes.
The reason for this is that the
default version of the C++ priority queue finds maximum
elements, while we want to find minimum elements.
By using negative distances,
we can directly use the default priority queue\footnote{Of
course, we could also declare the priority queue as in Chapter 4.5
and use positive distances, but the implementation would be a bit longer.}.
Also note that there may be several instances of the same
node in the priority queue; however, only the instance with the
minimum distance will be processed.

The time complexity of the above implementation is
$O(n+m \log m)$, because the algorithm goes through
all nodes of the graph and adds for each edge
at most one distance to the priority queue.

\section{Floyd–Warshall algorithm}

\index{Floyd–Warshall algorithm}

The \key{Floyd–Warshall algorithm}\footnote{The algorithm
is named after R. W. Floyd and S. Warshall
who published it independently in 1962 \cite{flo62,war62}.}
provides an alternative way to approach the problem
of finding shortest paths.
Unlike the other algorithms of this chapter,
it finds all shortest paths between the nodes
in a single run.

The algorithm maintains a two-dimensional array
that contains distances between the nodes.
First, distances are calculated only using
direct edges between the nodes,
and after this, the algorithm reduces distances
by using intermediate nodes in paths.

\subsubsection{Example}

Let us consider how the Floyd–Warshall algorithm
works in the following graph:

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$3$};
\node[draw, circle] (2) at (4,3) {$4$};
\node[draw, circle] (3) at (1,1) {$2$};
\node[draw, circle] (4) at (4,1) {$1$};
\node[draw, circle] (5) at (6,2) {$5$};

\path[draw,thick,-] (1) -- node[font=\small,label=above:7] {} (2);
\path[draw,thick,-] (1) -- node[font=\small,label=left:2] {} (3);
\path[draw,thick,-] (3) -- node[font=\small,label=below:5] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=left:9] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
\path[draw,thick,-] (4) -- node[font=\small,label=below:1] {} (5);
\end{tikzpicture}
\end{center}

Initially, the distance from each node to itself is $0$,
and the distance between nodes $a$ and $b$ is $x$
if there is an edge between nodes $a$ and $b$ with weight $x$.
All other distances are infinite.

In this graph, the initial array is as follows:
\begin{center}
\begin{tabular}{r|rrrrr}
 & 1 & 2 & 3 & 4 & 5 \\
\hline
1 & 0 & 5 & $\infty$ & 9 & 1 \\
2 & 5 & 0 & 2 & $\infty$ & $\infty$ \\
3 & $\infty$ & 2 & 0 & 7 & $\infty$ \\
4 & 9 & $\infty$ & 7 & 0 & 2 \\
5 & 1 & $\infty$ & $\infty$ & 2 & 0 \\
\end{tabular}
\end{center}
\vspace{10pt}
The algorithm consists of consecutive rounds.
On each round, the algorithm selects a new node
that can act as an intermediate node in paths from now on,
and distances are reduced using this node.

On the first round, node 1 is the new intermediate node.
There is a new path between nodes 2 and 4
with length 14, because node 1 connects them.
There is also a new path
between nodes 2 and 5 with length 6.

\begin{center}
\begin{tabular}{r|rrrrr}
 & 1 & 2 & 3 & 4 & 5 \\
\hline
1 & 0 & 5 & $\infty$ & 9 & 1 \\
2 & 5 & 0 & 2 & \textbf{14} & \textbf{6} \\
3 & $\infty$ & 2 & 0 & 7 & $\infty$ \\
4 & 9 & \textbf{14} & 7 & 0 & 2 \\
5 & 1 & \textbf{6} & $\infty$ & 2 & 0 \\
\end{tabular}
\end{center}
\vspace{10pt}

On the second round, node 2 is the new intermediate node.
This creates new paths between nodes 1 and 3
and between nodes 3 and 5:

\begin{center}
\begin{tabular}{r|rrrrr}
 & 1 & 2 & 3 & 4 & 5 \\
\hline
1 & 0 & 5 & \textbf{7} & 9 & 1 \\
2 & 5 & 0 & 2 & 14 & 6 \\
3 & \textbf{7} & 2 & 0 & 7 & \textbf{8} \\
4 & 9 & 14 & 7 & 0 & 2 \\
5 & 1 & 6 & \textbf{8} & 2 & 0 \\
\end{tabular}
\end{center}
\vspace{10pt}

On the third round, node 3 is the new intermediate node.
There is a new path between nodes 2 and 4:

\begin{center}
\begin{tabular}{r|rrrrr}
 & 1 & 2 & 3 & 4 & 5 \\
\hline
1 & 0 & 5 & 7 & 9 & 1 \\
2 & 5 & 0 & 2 & \textbf{9} & 6 \\
3 & 7 & 2 & 0 & 7 & 8 \\
4 & 9 & \textbf{9} & 7 & 0 & 2 \\
5 & 1 & 6 & 8 & 2 & 0 \\
\end{tabular}
\end{center}
\vspace{10pt}

The algorithm continues like this,
until all nodes have been appointed intermediate nodes.
After the algorithm has finished, the array contains
the minimum distances between any two nodes:

\begin{center}
\begin{tabular}{r|rrrrr}
 & 1 & 2 & 3 & 4 & 5 \\
\hline
1 & 0 & 5 & 7 & 3 & 1 \\
2 & 5 & 0 & 2 & 8 & 6 \\
3 & 7 & 2 & 0 & 7 & 8 \\
4 & 3 & 8 & 7 & 0 & 2 \\
5 & 1 & 6 & 8 & 2 & 0 \\
\end{tabular}
\end{center}

For example, the array tells us that the
shortest distance between nodes 2 and 4 is 8.
This corresponds to the following path:

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$3$};
\node[draw, circle] (2) at (4,3) {$4$};
\node[draw, circle] (3) at (1,1) {$2$};
\node[draw, circle] (4) at (4,1) {$1$};
\node[draw, circle] (5) at (6,2) {$5$};

\path[draw,thick,-] (1) -- node[font=\small,label=above:7] {} (2);
\path[draw,thick,-] (1) -- node[font=\small,label=left:2] {} (3);
\path[draw,thick,-] (3) -- node[font=\small,label=below:5] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=left:9] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
\path[draw,thick,-] (4) -- node[font=\small,label=below:1] {} (5);

\path[draw=red,thick,->,line width=2pt] (3) -- (4);
\path[draw=red,thick,->,line width=2pt] (4) -- (5);
\path[draw=red,thick,->,line width=2pt] (5) -- (2);
\end{tikzpicture}
\end{center}

\subsubsection{Implementation}

The advantage of the
Floyd–Warshall algorithm is that it is
easy to implement.
The following code constructs a
distance matrix where $\texttt{distance}[a][b]$
is the shortest distance between nodes $a$ and $b$.
First, the algorithm initializes \texttt{distance}
using the adjacency matrix \texttt{adj} of the graph:

\begin{lstlisting}
for (int i = 1; i <= n; i++) {
    for (int j = 1; j <= n; j++) {
        if (i == j) distance[i][j] = 0;
        else if (adj[i][j]) distance[i][j] = adj[i][j];
        else distance[i][j] = INF;
    }
}
\end{lstlisting}
After this, the shortest distances can be found as follows:
\begin{lstlisting}
for (int k = 1; k <= n; k++) {
    for (int i = 1; i <= n; i++) {
        for (int j = 1; j <= n; j++) {
            distance[i][j] = min(distance[i][j],
                                 distance[i][k]+distance[k][j]);
        }
    }
}
\end{lstlisting}

The time complexity of the algorithm is $O(n^3)$,
because it contains three nested loops
that go through the nodes of the graph.

Since the implementation of the Floyd–Warshall
algorithm is simple, the algorithm can be
a good choice even if it is only needed to find a
single shortest path in the graph.
However, the algorithm can only be used when the graph
is so small that a cubic time complexity is fast enough.
\chapter{Tree algorithms}

\index{tree}

A \key{tree} is a connected, acyclic graph
that consists of $n$ nodes and $n-1$ edges.
Removing any edge from a tree divides it
into two components,
and adding any edge to a tree creates a cycle.
Moreover, there is always a unique path between any
two nodes of a tree.

For example, the following tree consists of 8 nodes and 7 edges:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (2,3) {$4$};
\node[draw, circle] (3) at (0,1) {$2$};
\node[draw, circle] (4) at (2,1) {$3$};
\node[draw, circle] (5) at (4,1) {$7$};
\node[draw, circle] (6) at (-2,3) {$5$};
\node[draw, circle] (7) at (-2,1) {$6$};
\node[draw, circle] (8) at (-4,1) {$8$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);
\path[draw,thick,-] (7) -- (8);
\end{tikzpicture}
\end{center}

\index{leaf}

The \key{leaves} of a tree are the nodes
with degree 1, i.e., with only one neighbor.
For example, the leaves of the above tree
are nodes 3, 5, 7 and 8.

\index{root}
\index{rooted tree}

In a \key{rooted} tree, one of the nodes
is appointed the \key{root} of the tree,
and all other nodes are
placed underneath the root.
For example, in the following tree,
node 1 is the root node.

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (4) at (2,1) {$4$};
\node[draw, circle] (2) at (-2,1) {$2$};
\node[draw, circle] (3) at (0,1) {$3$};
\node[draw, circle] (7) at (2,-1) {$7$};
\node[draw, circle] (5) at (-3,-1) {$5$};
\node[draw, circle] (6) at (-1,-1) {$6$};
\node[draw, circle] (8) at (-1,-3) {$8$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (2) -- (6);
\path[draw,thick,-] (4) -- (7);
\path[draw,thick,-] (6) -- (8);
\end{tikzpicture}
\end{center}
\index{child}
\index{parent}

In a rooted tree, the \key{children} of a node
are its lower neighbors, and the \key{parent} of a node
is its upper neighbor.
Each node has exactly one parent,
except for the root that does not have a parent.
For example, in the above tree,
the children of node 2 are nodes 5 and 6,
and its parent is node 1.

\index{subtree}

The structure of a rooted tree is \emph{recursive}:
each node of the tree acts as the root of a \key{subtree}
that contains the node itself and all nodes
that are in the subtrees of its children.
For example, in the above tree, the subtree of node 2
consists of nodes 2, 5, 6 and 8:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (2) at (-2,1) {$2$};
\node[draw, circle] (5) at (-3,-1) {$5$};
\node[draw, circle] (6) at (-1,-1) {$6$};
\node[draw, circle] (8) at (-1,-3) {$8$};
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (2) -- (6);
\path[draw,thick,-] (6) -- (8);
\end{tikzpicture}
\end{center}

\section{Tree traversal}

General graph traversal algorithms
can be used to traverse the nodes of a tree.
However, the traversal of a tree is easier to implement than
that of a general graph, because
there are no cycles in the tree and it is not
possible to reach a node from multiple directions.

The typical way to traverse a tree is to start
a depth-first search at an arbitrary node.
The following recursive function can be used:

\begin{lstlisting}
void dfs(int s, int e) {
    // process node s
    for (auto u : adj[s]) {
        if (u != e) dfs(u, s);
    }
}
\end{lstlisting}

The function is given two parameters: the current node $s$
and the previous node $e$.
The purpose of the parameter $e$ is to make sure
that the search only moves to nodes
that have not been visited yet.

The following function call starts the search
at node $x$:

\begin{lstlisting}
dfs(x, 0);
\end{lstlisting}

In the first call $e=0$, because there is no
previous node, and it is allowed
to proceed in any direction in the tree.

\subsubsection{Dynamic programming}

Dynamic programming can be used to calculate
some information during a tree traversal.
Using dynamic programming, we can, for example,
calculate in $O(n)$ time for each node of a rooted tree the
number of nodes in its subtree
or the length of the longest path from the node
to a leaf.

As an example, let us calculate for each node $s$
a value $\texttt{count}[s]$: the number of nodes in its subtree.
The subtree contains the node itself and
all nodes in the subtrees of its children,
so we can calculate the number of nodes
recursively using the following code:

\begin{lstlisting}
void dfs(int s, int e) {
    count[s] = 1;
    for (auto u : adj[s]) {
        if (u == e) continue;
        dfs(u, s);
        count[s] += count[u];
    }
}
\end{lstlisting}

\section{Diameter}

\index{diameter}

The \key{diameter} of a tree
is the maximum length of a path between two nodes.
For example, consider the following tree:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (2,3) {$4$};
\node[draw, circle] (3) at (0,1) {$2$};
\node[draw, circle] (4) at (2,1) {$3$};
\node[draw, circle] (5) at (4,1) {$7$};
\node[draw, circle] (6) at (-2,3) {$5$};
\node[draw, circle] (7) at (-2,1) {$6$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);
\end{tikzpicture}
\end{center}
The diameter of this tree is 4,
which corresponds to the following path:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (2,3) {$4$};
\node[draw, circle] (3) at (0,1) {$2$};
\node[draw, circle] (4) at (2,1) {$3$};
\node[draw, circle] (5) at (4,1) {$7$};
\node[draw, circle] (6) at (-2,3) {$5$};
\node[draw, circle] (7) at (-2,1) {$6$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);

\path[draw,thick,-,color=red,line width=2pt] (7) -- (3);
\path[draw,thick,-,color=red,line width=2pt] (3) -- (1);
\path[draw,thick,-,color=red,line width=2pt] (1) -- (2);
\path[draw,thick,-,color=red,line width=2pt] (2) -- (5);
\end{tikzpicture}
\end{center}
Note that there may be several maximum-length paths.
In the above path, we could replace node 6 with node 5
to obtain another path with length 4.

Next we will discuss two $O(n)$ time algorithms
for calculating the diameter of a tree.
The first algorithm is based on dynamic programming,
and the second algorithm uses two depth-first searches.

\subsubsection{Algorithm 1}

A general way to approach many tree problems
is to first root the tree arbitrarily.
After this, we can try to solve the problem
separately for each subtree.
Our first algorithm for calculating the diameter
is based on this idea.

An important observation is that every path
in a rooted tree has a \emph{highest point}:
the highest node that belongs to the path.
Thus, we can calculate for each node the length
of the longest path whose highest point is the node.
One of those paths corresponds to the diameter of the tree.

For example, in the following tree,
node 1 is the highest point on the path
that corresponds to the diameter:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (2,1) {$4$};
\node[draw, circle] (3) at (-2,1) {$2$};
\node[draw, circle] (4) at (0,1) {$3$};
\node[draw, circle] (5) at (2,-1) {$7$};
\node[draw, circle] (6) at (-3,-1) {$5$};
\node[draw, circle] (7) at (-1,-1) {$6$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);

\path[draw,thick,-,color=red,line width=2pt] (7) -- (3);
\path[draw,thick,-,color=red,line width=2pt] (3) -- (1);
\path[draw,thick,-,color=red,line width=2pt] (1) -- (2);
\path[draw,thick,-,color=red,line width=2pt] (2) -- (5);
\end{tikzpicture}
\end{center}

We calculate for each node $x$ two values:
\begin{itemize}
\item $\texttt{toLeaf}(x)$: the maximum length of a path from $x$ to any leaf
\item $\texttt{maxLength}(x)$: the maximum length of a path
whose highest point is $x$
\end{itemize}
For example, in the above tree,
$\texttt{toLeaf}(1)=2$, because there is a path
$1 \rightarrow 2 \rightarrow 6$,
and $\texttt{maxLength}(1)=4$,
because there is a path
$6 \rightarrow 2 \rightarrow 1 \rightarrow 4 \rightarrow 7$.
In this case, $\texttt{maxLength}(1)$ equals the diameter.

Dynamic programming can be used to calculate the above
values for all nodes in $O(n)$ time.
First, to calculate $\texttt{toLeaf}(x)$,
we go through the children of $x$,
choose a child $c$ with maximum $\texttt{toLeaf}(c)$
and add one to this value.
Then, to calculate $\texttt{maxLength}(x)$,
we choose two distinct children $a$ and $b$
such that the sum $\texttt{toLeaf}(a)+\texttt{toLeaf}(b)$
is maximum and add two to this sum.

\subsubsection{Algorithm 2}

Another efficient way to calculate the diameter
of a tree is based on two depth-first searches.
First, we choose an arbitrary node $a$ in the tree
and find the farthest node $b$ from $a$.
Then, we find the farthest node $c$ from $b$.
The diameter of the tree is the distance between $b$ and $c$.

In the following graph, $a$, $b$ and $c$ could be:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (2,3) {$4$};
\node[draw, circle] (3) at (0,1) {$2$};
\node[draw, circle] (4) at (2,1) {$3$};
\node[draw, circle] (5) at (4,1) {$7$};
\node[draw, circle] (6) at (-2,3) {$5$};
\node[draw, circle] (7) at (-2,1) {$6$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);
\node[color=red] at (2,1.6) {$a$};
\node[color=red] at (-2,1.6) {$b$};
\node[color=red] at (4,1.6) {$c$};

\path[draw,thick,-,color=red,line width=2pt] (7) -- (3);
\path[draw,thick,-,color=red,line width=2pt] (3) -- (1);
\path[draw,thick,-,color=red,line width=2pt] (1) -- (2);
\path[draw,thick,-,color=red,line width=2pt] (2) -- (5);
\end{tikzpicture}
\end{center}

This is an elegant method, but why does it work?
|
||||
|
||||
It helps to draw the tree differently so that
|
||||
the path that corresponds to the diameter
|
||||
is horizontal, and all other
|
||||
nodes hang from it:
|
||||
\begin{center}
|
||||
\begin{tikzpicture}[scale=0.9]
|
||||
\node[draw, circle] (1) at (2,1) {$1$};
|
||||
\node[draw, circle] (2) at (4,1) {$4$};
|
||||
\node[draw, circle] (3) at (0,1) {$2$};
|
||||
\node[draw, circle] (4) at (2,-1) {$3$};
|
||||
\node[draw, circle] (5) at (6,1) {$7$};
|
||||
\node[draw, circle] (6) at (0,-1) {$5$};
|
||||
\node[draw, circle] (7) at (-2,1) {$6$};
|
||||
\path[draw,thick,-] (1) -- (2);
|
||||
\path[draw,thick,-] (1) -- (3);
|
||||
\path[draw,thick,-] (1) -- (4);
|
||||
\path[draw,thick,-] (2) -- (5);
|
||||
\path[draw,thick,-] (3) -- (6);
|
||||
\path[draw,thick,-] (3) -- (7);
|
||||
\node[color=red] at (2,-1.6) {$a$};
|
||||
\node[color=red] at (-2,1.6) {$b$};
|
||||
\node[color=red] at (6,1.6) {$c$};
|
||||
\node[color=red] at (2,1.6) {$x$};
|
||||
|
||||
\path[draw,thick,-,color=red,line width=2pt] (7) -- (3);
|
||||
\path[draw,thick,-,color=red,line width=2pt] (3) -- (1);
|
||||
\path[draw,thick,-,color=red,line width=2pt] (1) -- (2);
|
||||
\path[draw,thick,-,color=red,line width=2pt] (2) -- (5);
|
||||
\end{tikzpicture}
|
||||
\end{center}
|
||||
|
||||
Node $x$ indicates the place where the path
|
||||
from node $a$ joins the path that corresponds
|
||||
to the diameter.
|
||||
The farthest node from $a$
|
||||
is node $b$, node $c$ or some other node
|
||||
that is at least as far from node $x$.
|
||||
Thus, this node is always a valid choice for
|
||||
an endpoint of a path that corresponds to the diameter.
|
||||
|
||||
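The two-search method can be sketched in code as follows. The adjacency-list representation and the helper name \texttt{farthest} are illustrative choices, not fixed by the book; the first search finds one endpoint of the diameter, and the second search measures the diameter itself.

```cpp
#include <utility>
#include <vector>
using namespace std;

// Returns {farthest node, distance} from start, using an iterative
// depth-first search over 1-indexed adjacency lists.
pair<int,int> farthest(const vector<vector<int>>& adj, int start) {
    int n = (int)adj.size() - 1;
    vector<int> dist(n + 1, -1);
    vector<int> stack = {start};
    dist[start] = 0;
    pair<int,int> best = {start, 0};
    while (!stack.empty()) {
        int u = stack.back(); stack.pop_back();
        if (dist[u] > best.second) best = {u, dist[u]};
        for (int v : adj[u]) if (dist[v] == -1) {
            dist[v] = dist[u] + 1;
            stack.push_back(v);
        }
    }
    return best;
}

// Diameter: find the farthest node b from an arbitrary node,
// then the farthest distance from b.
int diameter(const vector<vector<int>>& adj) {
    int b = farthest(adj, 1).first;
    return farthest(adj, b).second;
}
```

For the example tree above (labels $1 \ldots 7$), the diameter is 4.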
\section{All longest paths}

Our next problem is to calculate for every node in the tree
the maximum length of a path that begins at the node.
This can be seen as a generalization of the tree diameter problem,
because the largest of those lengths equals the diameter of the tree.
Also this problem can be solved in $O(n)$ time.

As an example, consider the following tree:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,0) {$1$};
\node[draw, circle] (2) at (-1.5,-1) {$4$};
\node[draw, circle] (3) at (2,0) {$2$};
\node[draw, circle] (4) at (-1.5,1) {$3$};
\node[draw, circle] (6) at (3.5,-1) {$6$};
\node[draw, circle] (7) at (3.5,1) {$5$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);
\end{tikzpicture}
\end{center}

Let $\texttt{maxLength}(x)$ denote the maximum length
of a path that begins at node $x$.
For example, in the above tree, $\texttt{maxLength}(4)=3$,
because there is a path $4 \rightarrow 1 \rightarrow 2 \rightarrow 6$.
Here is a complete table of the values:
\begin{center}
\begin{tabular}{l|llllll}
node $x$ & 1 & 2 & 3 & 4 & 5 & 6 \\
$\texttt{maxLength}(x)$ & 2 & 2 & 3 & 3 & 3 & 3 \\
\end{tabular}
\end{center}

As in the diameter problem, a good starting point
is to root the tree arbitrarily:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (2,1) {$4$};
\node[draw, circle] (3) at (-2,1) {$2$};
\node[draw, circle] (4) at (0,1) {$3$};
\node[draw, circle] (6) at (-3,-1) {$5$};
\node[draw, circle] (7) at (-1,-1) {$6$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);
\end{tikzpicture}
\end{center}

The first part of the problem is to calculate for every node $x$
the maximum length of a path that goes through a child of $x$.
For example, the longest path from node 1 goes through its child 2:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (2,1) {$4$};
\node[draw, circle] (3) at (-2,1) {$2$};
\node[draw, circle] (4) at (0,1) {$3$};
\node[draw, circle] (6) at (-3,-1) {$5$};
\node[draw, circle] (7) at (-1,-1) {$6$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);

\path[draw,thick,->,color=red,line width=2pt] (1) -- (3);
\path[draw,thick,->,color=red,line width=2pt] (3) -- (6);
\end{tikzpicture}
\end{center}
This part is easy to solve in $O(n)$ time, because we can use
dynamic programming as we have done previously.

Then, the second part of the problem is to calculate for every node $x$
the maximum length of a path through its parent $p$.
For example, the longest path from node 3 goes through its parent 1:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (2,1) {$4$};
\node[draw, circle] (3) at (-2,1) {$2$};
\node[draw, circle] (4) at (0,1) {$3$};
\node[draw, circle] (6) at (-3,-1) {$5$};
\node[draw, circle] (7) at (-1,-1) {$6$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);

\path[draw,thick,->,color=red,line width=2pt] (4) -- (1);
\path[draw,thick,->,color=red,line width=2pt] (1) -- (3);
\path[draw,thick,->,color=red,line width=2pt] (3) -- (6);
\end{tikzpicture}
\end{center}

At first glance, it seems that we should choose the longest path from $p$.
However, this \emph{does not} always work, because the longest path
from $p$ may go through $x$.
Here is an example of this situation:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (2,1) {$4$};
\node[draw, circle] (3) at (-2,1) {$2$};
\node[draw, circle] (4) at (0,1) {$3$};
\node[draw, circle] (6) at (-3,-1) {$5$};
\node[draw, circle] (7) at (-1,-1) {$6$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);

\path[draw,thick,->,color=red,line width=2pt] (3) -- (1);
\path[draw,thick,->,color=red,line width=2pt] (1) -- (2);
\end{tikzpicture}
\end{center}

Still, we can solve the second part in $O(n)$ time
by storing \emph{two} maximum lengths for each node $x$:
\begin{itemize}
\item $\texttt{maxLength}_1(x)$:
the maximum length of a path from $x$
\item $\texttt{maxLength}_2(x)$:
the maximum length of a path from $x$
in a direction other than that of the first path
\end{itemize}
For example, in the above tree,
$\texttt{maxLength}_1(1)=2$ using the path $1 \rightarrow 2 \rightarrow 5$,
and $\texttt{maxLength}_2(1)=1$ using the path $1 \rightarrow 3$.

Finally, if the path that corresponds to
$\texttt{maxLength}_1(p)$ goes through $x$,
we conclude that the maximum length is $\texttt{maxLength}_2(p)+1$,
and otherwise the maximum length is $\texttt{maxLength}_1(p)+1$.

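The two-phase computation can be sketched as follows. The struct and field names (\texttt{toLeaf1}, \texttt{toLeaf2}, \texttt{c1}) are illustrative, not from the book: the first traversal computes the two longest downward paths of each node, and the second traversal passes the best length through the parent down the tree.

```cpp
#include <algorithm>
#include <vector>
using namespace std;

// maxLength[x]: maximum length of a path beginning at node x.
// toLeaf1/toLeaf2: the two longest downward paths from x through
// different children; c1 remembers which child gives toLeaf1.
struct LongestPaths {
    vector<vector<int>> adj;
    vector<int> toLeaf1, toLeaf2, c1, maxLength;
    LongestPaths(int n) : adj(n+1), toLeaf1(n+1), toLeaf2(n+1),
                          c1(n+1, -1), maxLength(n+1) {}
    void addEdge(int a, int b) { adj[a].push_back(b); adj[b].push_back(a); }
    void dfs1(int u, int p) {            // first part: downward lengths
        for (int v : adj[u]) if (v != p) {
            dfs1(v, u);
            int len = toLeaf1[v] + 1;
            if (len > toLeaf1[u]) {
                toLeaf2[u] = toLeaf1[u]; toLeaf1[u] = len; c1[u] = v;
            } else if (len > toLeaf2[u]) toLeaf2[u] = len;
        }
    }
    void dfs2(int u, int p, int up) {    // up: best path through the parent
        maxLength[u] = max(toLeaf1[u], up);
        for (int v : adj[u]) if (v != p) {
            // if v lies on u's best downward path, use the second best
            int best = (c1[u] == v) ? toLeaf2[u] : toLeaf1[u];
            dfs2(v, u, max(up, best) + 1);
        }
    }
    void solve(int root = 1) { dfs1(root, 0); dfs2(root, 0, 0); }
};
```

On the example tree, this reproduces the table above: $\texttt{maxLength} = [2,2,3,3,3,3]$ for nodes $1 \ldots 6$.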
\section{Binary trees}

\index{binary tree}

\begin{samepage}
A \key{binary tree} is a rooted tree
where each node has a left and a right subtree.
It is possible that a subtree of a node is empty.
Thus, every node in a binary tree has zero, one or two children.

For example, the following tree is a binary tree:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,0) {$1$};
\node[draw, circle] (2) at (-1.5,-1.5) {$2$};
\node[draw, circle] (3) at (1.5,-1.5) {$3$};
\node[draw, circle] (4) at (-3,-3) {$4$};
\node[draw, circle] (5) at (0,-3) {$5$};
\node[draw, circle] (6) at (-1.5,-4.5) {$6$};
\node[draw, circle] (7) at (3,-3) {$7$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (2) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (5) -- (6);
\path[draw,thick,-] (3) -- (7);
\end{tikzpicture}
\end{center}
\end{samepage}

\index{pre-order}
\index{in-order}
\index{post-order}

The nodes of a binary tree have three natural orderings
that correspond to different ways to recursively traverse the tree:

\begin{itemize}
\item \key{pre-order}: first process the root,
then traverse the left subtree, then traverse the right subtree
\item \key{in-order}: first traverse the left subtree,
then process the root, then traverse the right subtree
\item \key{post-order}: first traverse the left subtree,
then traverse the right subtree, then process the root
\end{itemize}

For the above tree, the nodes in pre-order are $[1,2,4,5,6,3,7]$,
in in-order $[4,2,6,5,1,3,7]$
and in post-order $[4,6,5,2,7,3,1]$.

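The three orderings can be sketched with a simple pointer-based node type; the struct and function names below are illustrative, not fixed by the book:

```cpp
#include <vector>
using namespace std;

// Minimal binary tree node for illustration.
struct Node {
    int value;
    Node *left = nullptr, *right = nullptr;
    Node(int v) : value(v) {}
};

void preOrder(Node* t, vector<int>& out) {
    if (!t) return;
    out.push_back(t->value);          // root first
    preOrder(t->left, out);
    preOrder(t->right, out);
}
void inOrder(Node* t, vector<int>& out) {
    if (!t) return;
    inOrder(t->left, out);
    out.push_back(t->value);          // root in the middle
    inOrder(t->right, out);
}
void postOrder(Node* t, vector<int>& out) {
    if (!t) return;
    postOrder(t->left, out);
    postOrder(t->right, out);
    out.push_back(t->value);          // root last
}
```

Running the three functions on the example tree produces the orderings listed above.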
If we know the pre-order and in-order of a tree,
we can reconstruct the exact structure of the tree.
For example, the above tree is the only possible tree
with pre-order $[1,2,4,5,6,3,7]$ and in-order $[4,2,6,5,1,3,7]$.
In a similar way, the post-order and in-order
also determine the structure of a tree.

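As a sketch of the reconstruction idea, the following hypothetical helper recovers the post-order of the unique tree that has a given pre-order and in-order: the first pre-order value is always the root, and its position in the in-order splits the remaining values into the left and right subtrees.

```cpp
#include <map>
#include <vector>
using namespace std;

// Recursively rebuild the subtree covering in-order positions [lo, hi].
// pos maps each value to its in-order index; p advances through pre.
void build(const vector<int>& pre, int& p, const map<int,int>& pos,
           int lo, int hi, vector<int>& post) {
    if (lo > hi) return;
    int root = pre[p++];                      // pre-order: root comes first
    int mid = pos.at(root);                   // split in-order at the root
    build(pre, p, pos, lo, mid - 1, post);    // left subtree
    build(pre, p, pos, mid + 1, hi, post);    // right subtree
    post.push_back(root);                     // post-order: root comes last
}

vector<int> postFromPreIn(const vector<int>& pre, const vector<int>& in) {
    map<int,int> pos;
    for (int i = 0; i < (int)in.size(); i++) pos[in[i]] = i;
    vector<int> post;
    int p = 0;
    build(pre, p, pos, 0, (int)in.size() - 1, post);
    return post;
}
```

For the example tree, the function maps the pre-order and in-order above to the post-order $[4,6,5,2,7,3,1]$.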
However, the situation is different if we only know
the pre-order and post-order of a tree.
In this case, there may be more than one tree that matches the orderings.
For example, in both of the trees
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,0) {$1$};
\node[draw, circle] (2) at (-1.5,-1.5) {$2$};
\path[draw,thick,-] (1) -- (2);

\node[draw, circle] (1b) at (0+4,0) {$1$};
\node[draw, circle] (2b) at (1.5+4,-1.5) {$2$};
\path[draw,thick,-] (1b) -- (2b);
\end{tikzpicture}
\end{center}
the pre-order is $[1,2]$ and the post-order is $[2,1]$,
but the structures of the trees are different.

\chapter{Spanning trees}

\index{spanning tree}

A \key{spanning tree} of a graph consists of all nodes of the graph
and some of the edges of the graph
so that there is a path between any two nodes.
Like trees in general, spanning trees are connected and acyclic.
Usually there are several ways to construct a spanning tree.

For example, consider the following graph:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1.5,2) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (6.5,2) {$4$};
\node[draw, circle] (5) at (3,1) {$5$};
\node[draw, circle] (6) at (5,1) {$6$};
\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
\end{tikzpicture}
\end{center}
One spanning tree for the graph is as follows:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1.5,2) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (6.5,2) {$4$};
\node[draw, circle] (5) at (3,1) {$5$};
\node[draw, circle] (6) at (5,1) {$6$};
\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
\end{tikzpicture}
\end{center}

The weight of a spanning tree is the sum of its edge weights.
For example, the weight of the above spanning tree is
$3+5+9+3+2=22$.

\index{minimum spanning tree}

A \key{minimum spanning tree}
is a spanning tree whose weight is as small as possible.
The weight of a minimum spanning tree for the example graph
is 20, and such a tree can be constructed as follows:

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1.5,2) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (6.5,2) {$4$};
\node[draw, circle] (5) at (3,1) {$5$};
\node[draw, circle] (6) at (5,1) {$6$};

\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
\end{tikzpicture}
\end{center}

\index{maximum spanning tree}

In a similar way, a \key{maximum spanning tree}
is a spanning tree whose weight is as large as possible.
The weight of a maximum spanning tree for the example graph is 32:

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1.5,2) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (6.5,2) {$4$};
\node[draw, circle] (5) at (3,1) {$5$};
\node[draw, circle] (6) at (5,1) {$6$};
\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
\end{tikzpicture}
\end{center}

Note that a graph may have several minimum and maximum spanning trees,
so the trees are not unique.

It turns out that several greedy methods can be used
to construct minimum and maximum spanning trees.
In this chapter, we discuss two algorithms that process
the edges of the graph ordered by their weights.
We focus on finding minimum spanning trees,
but the same algorithms can find maximum spanning trees
by processing the edges in reverse order.

\section{Kruskal's algorithm}

\index{Kruskal's algorithm}

In \key{Kruskal's algorithm}\footnote{The algorithm was published in 1956
by J. B. Kruskal \cite{kru56}.}, the initial spanning tree
only contains the nodes of the graph and does not contain any edges.
Then the algorithm goes through the edges ordered by their weights,
and always adds an edge to the tree if it does not create a cycle.

The algorithm maintains the components of the tree.
Initially, each node of the graph belongs to a separate component.
Whenever an edge is added to the tree, two components are joined.
Finally, all nodes belong to the same component,
and a minimum spanning tree has been found.

\subsubsection{Example}

\begin{samepage}
Let us consider how Kruskal's algorithm processes the following graph:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1.5,2) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (6.5,2) {$4$};
\node[draw, circle] (5) at (3,1) {$5$};
\node[draw, circle] (6) at (5,1) {$6$};
\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
\end{tikzpicture}
\end{center}
\end{samepage}

\begin{samepage}
The first step of the algorithm is to sort the edges
in increasing order of their weights.
The result is the following list:

\begin{tabular}{ll}
\\
edge & weight \\
\hline
5--6 & 2 \\
1--2 & 3 \\
3--6 & 3 \\
1--5 & 5 \\
2--3 & 5 \\
2--5 & 6 \\
4--6 & 7 \\
3--4 & 9 \\
\\
\end{tabular}
\end{samepage}

After this, the algorithm goes through the list and adds each edge
to the tree if it joins two separate components.

Initially, each node is in its own component:

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1.5,2) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (6.5,2) {$4$};
\node[draw, circle] (5) at (3,1) {$5$};
\node[draw, circle] (6) at (5,1) {$6$};
%\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
%\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
%\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
%\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
%\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
%\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
%\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
%\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
\end{tikzpicture}
\end{center}
The first edge to be added to the tree is the edge 5--6,
which creates the component $\{5,6\}$
by joining the components $\{5\}$ and $\{6\}$:

\begin{center}
\begin{tikzpicture}
\node[draw, circle] (1) at (1.5,2) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (6.5,2) {$4$};
\node[draw, circle] (5) at (3,1) {$5$};
\node[draw, circle] (6) at (5,1) {$6$};

%\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
%\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
%\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
%\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
%\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
%\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
%\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
\end{tikzpicture}
\end{center}
After this, the edges 1--2, 3--6 and 1--5 are added in a similar way:

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1.5,2) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (6.5,2) {$4$};
\node[draw, circle] (5) at (3,1) {$5$};
\node[draw, circle] (6) at (5,1) {$6$};

\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
%\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
%\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
%\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
%\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
\end{tikzpicture}
\end{center}

After those steps, most components have been joined
and there are two components in the tree:
$\{1,2,3,5,6\}$ and $\{4\}$.

The next edge in the list is the edge 2--3,
but it will not be included in the tree,
because nodes 2 and 3 are already in the same component.
For the same reason, the edge 2--5 will not be included in the tree.

\begin{samepage}
Finally, the edge 4--6 will be included in the tree:

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1.5,2) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (6.5,2) {$4$};
\node[draw, circle] (5) at (3,1) {$5$};
\node[draw, circle] (6) at (5,1) {$6$};

\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
%\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
%\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
%\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
\end{tikzpicture}
\end{center}
\end{samepage}

After this, the algorithm will not add any new edges,
because the graph is connected
and there is a path between any two nodes.
The resulting graph is a minimum spanning tree
with weight $2+3+3+5+7=20$.

\subsubsection{Why does this work?}

It is a good question why Kruskal's algorithm works.
Why does the greedy strategy guarantee that we
will find a minimum spanning tree?

Let us see what happens if the minimum weight edge of
the graph is \emph{not} included in the spanning tree.
For example, suppose that a spanning tree
for the previous graph does not contain the
minimum weight edge 5--6.
We do not know the exact structure of such a spanning tree,
but in any case it has to contain some edges.
Assume that the tree is as follows:

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1.5,2) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (6.5,2) {$4$};
\node[draw, circle] (5) at (3,1) {$5$};
\node[draw, circle] (6) at (5,1) {$6$};

\path[draw,thick,-,dashed] (1) -- (2);
\path[draw,thick,-,dashed] (2) -- (5);
\path[draw,thick,-,dashed] (2) -- (3);
\path[draw,thick,-,dashed] (3) -- (4);
\path[draw,thick,-,dashed] (4) -- (6);
\end{tikzpicture}
\end{center}

However, it is not possible that the above tree
is a minimum spanning tree for the graph.
The reason for this is that we can remove an edge
from the tree and replace it with the minimum weight edge 5--6.
This produces a spanning tree whose weight is \emph{smaller}:

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1.5,2) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (6.5,2) {$4$};
\node[draw, circle] (5) at (3,1) {$5$};
\node[draw, circle] (6) at (5,1) {$6$};

\path[draw,thick,-,dashed] (1) -- (2);
\path[draw,thick,-,dashed] (2) -- (5);
\path[draw,thick,-,dashed] (3) -- (4);
\path[draw,thick,-,dashed] (4) -- (6);
\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
\end{tikzpicture}
\end{center}

For this reason, it is always optimal to include the minimum weight edge
in the tree to produce a minimum spanning tree.
Using a similar argument, we can show that it
is also optimal to add the next edge in weight order
to the tree, and so on.
Hence, Kruskal's algorithm works correctly and
always produces a minimum spanning tree.

\subsubsection{Implementation}

When implementing Kruskal's algorithm, it is convenient to use
the edge list representation of the graph.
The first phase of the algorithm sorts the
edges in the list in $O(m \log m)$ time.
After this, the second phase of the algorithm
builds the minimum spanning tree as follows:

\begin{lstlisting}
for (...) {
    if (!same(a,b)) unite(a,b);
}
\end{lstlisting}

The loop goes through the edges in the list
and always processes an edge $a$--$b$
where $a$ and $b$ are two nodes.
Two functions are needed:
the function \texttt{same} determines
if $a$ and $b$ are in the same component,
and the function \texttt{unite}
joins the components that contain $a$ and $b$.

The problem is how to efficiently implement
the functions \texttt{same} and \texttt{unite}.
One possibility is to implement the function
\texttt{same} as a graph traversal and check if
we can get from node $a$ to node $b$.
However, the time complexity of such a function
would be $O(n+m)$
and the resulting algorithm would be slow,
because the function \texttt{same} is called for each edge in the graph.

We will solve the problem using a union-find structure
that implements both functions in $O(\log n)$ time.
Thus, the time complexity of Kruskal's algorithm
is $O(m \log n)$ after sorting the edge list.

\section{Union-find structure}

\index{union-find structure}

A \key{union-find structure} maintains a collection of sets.
The sets are disjoint, so no element belongs to more than one set.
Two $O(\log n)$ time operations are supported:
the \texttt{unite} operation joins two sets,
and the \texttt{find} operation finds the representative
of the set that contains a given element\footnote{The structure presented here
was introduced in 1971 by J. E. Hopcroft and J. D. Ullman \cite{hop71}.
Later, in 1975, R. E. Tarjan studied a more sophisticated variant
of the structure \cite{tar75} that is discussed in many algorithm
textbooks nowadays.}.

\subsubsection{Structure}

In a union-find structure, one element in each set
is the representative of the set,
and there is a chain from any other element of the
set to the representative.
For example, assume that the sets are
$\{1,4,7\}$, $\{5\}$ and $\{2,3,6,8\}$:
\begin{center}
\begin{tikzpicture}
\node[draw, circle] (1) at (0,-1) {$1$};
\node[draw, circle] (2) at (7,0) {$2$};
\node[draw, circle] (3) at (7,-1.5) {$3$};
\node[draw, circle] (4) at (1,0) {$4$};
\node[draw, circle] (5) at (4,0) {$5$};
\node[draw, circle] (6) at (6,-2.5) {$6$};
\node[draw, circle] (7) at (2,-1) {$7$};
\node[draw, circle] (8) at (8,-2.5) {$8$};

\path[draw,thick,->] (1) -- (4);
\path[draw,thick,->] (7) -- (4);

\path[draw,thick,->] (3) -- (2);
\path[draw,thick,->] (6) -- (3);
\path[draw,thick,->] (8) -- (3);
\end{tikzpicture}
\end{center}
In this case the representatives of the sets are 4, 5 and 2.
We can find the representative of any element
by following the chain that begins at the element.
For example, the element 2 is the representative
of the element 6, because
we follow the chain $6 \rightarrow 3 \rightarrow 2$.
Two elements belong to the same set exactly when
their representatives are the same.

Two sets can be joined by connecting the
representative of one set to the
representative of the other set.
For example, the sets
$\{1,4,7\}$ and $\{2,3,6,8\}$
can be joined as follows:
\begin{center}
\begin{tikzpicture}
\node[draw, circle] (1) at (2,-1) {$1$};
\node[draw, circle] (2) at (7,0) {$2$};
\node[draw, circle] (3) at (7,-1.5) {$3$};
\node[draw, circle] (4) at (3,0) {$4$};
\node[draw, circle] (6) at (6,-2.5) {$6$};
\node[draw, circle] (7) at (4,-1) {$7$};
\node[draw, circle] (8) at (8,-2.5) {$8$};

\path[draw,thick,->] (1) -- (4);
\path[draw,thick,->] (7) -- (4);

\path[draw,thick,->] (3) -- (2);
\path[draw,thick,->] (6) -- (3);
\path[draw,thick,->] (8) -- (3);

\path[draw,thick,->] (4) -- (2);
\end{tikzpicture}
\end{center}

The resulting set contains the elements $\{1,2,3,4,6,7,8\}$.
From this point on, the element 2 is the representative
for the entire set and the old representative 4
points to the element 2.

The efficiency of the union-find structure depends on
how the sets are joined.
It turns out that we can follow a simple strategy:
always connect the representative of the
\emph{smaller} set to the representative of the \emph{larger} set
(or, if the sets are of equal size,
we can make an arbitrary choice).
Using this strategy, the length of any chain
will be $O(\log n)$, so we can
find the representative of any element
efficiently by following the corresponding chain.

\subsubsection{Implementation}
|
||||
|
||||
The union-find structure can be implemented
|
||||
using arrays.
|
||||
In the following implementation,
|
||||
the array \texttt{link} contains for each element
|
||||
the next element
|
||||
in the chain or the element itself if it is
|
||||
a representative,
|
||||
and the array \texttt{size} indicates for each representative
|
||||
the size of the corresponding set.
|
||||
|
||||
Initially, each element belongs to a separate set:
|
||||
\begin{lstlisting}
|
||||
for (int i = 1; i <= n; i++) link[i] = i;
|
||||
for (int i = 1; i <= n; i++) size[i] = 1;
|
||||
\end{lstlisting}
|
||||
|
||||
The function \texttt{find} returns
|
||||
the representative for an element $x$.
|
||||
The representative can be found by following
|
||||
the chain that begins at $x$.
|
||||
|
||||
\begin{lstlisting}
|
||||
int find(int x) {
|
||||
while (x != link[x]) x = link[x];
|
||||
return x;
|
||||
}
|
||||
\end{lstlisting}
|
||||
|
||||
The function \texttt{same} checks
|
||||
whether elements $a$ and $b$ belong to the same set.
|
||||
This can easily be done by using the
|
||||
function \texttt{find}:
|
||||
|
||||
\begin{lstlisting}
|
||||
bool same(int a, int b) {
|
||||
return find(a) == find(b);
|
||||
}
|
||||
\end{lstlisting}
|
||||
|
||||
\begin{samepage}
|
||||
The function \texttt{unite} joins the sets
|
||||
that contain elements $a$ and $b$
|
||||
(the elements have to be in different sets).
|
||||
The function first finds the representatives
|
||||
of the sets and then connects the smaller
|
||||
set to the larger set.
|
||||
|
||||
\begin{lstlisting}
|
||||
void unite(int a, int b) {
|
||||
a = find(a);
|
||||
b = find(b);
|
||||
if (size[a] < size[b]) swap(a,b);
|
||||
size[a] += size[b];
|
||||
link[b] = a;
|
||||
}
|
||||
\end{lstlisting}
|
||||
\end{samepage}
|
||||
|
||||
The time complexity of the function \texttt{find}
|
||||
is $O(\log n)$ assuming that the length of each
|
||||
chain is $O(\log n)$.
|
||||
In this case, the functions \texttt{same} and \texttt{unite}
|
||||
also work in $O(\log n)$ time.
|
||||
The function \texttt{unite} makes sure that the
|
||||
length of each chain is $O(\log n)$ by connecting
|
||||
the smaller set to the larger set.
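As a usage sketch (not part of the original text; the same logic is wrapped in vectors and lambdas here so that the fragment is self-contained):

\begin{lstlisting}
int n = 5;
vector<int> link(n+1), sz(n+1, 1);
for (int i = 1; i <= n; i++) link[i] = i;
auto find = [&](int x) {
    while (x != link[x]) x = link[x];
    return x;
};
auto same = [&](int a, int b) { return find(a) == find(b); };
auto unite = [&](int a, int b) {
    a = find(a); b = find(b);
    if (sz[a] < sz[b]) swap(a,b);
    sz[a] += sz[b];
    link[b] = a;
};
unite(1,2);   // sets: {1,2},{3},{4},{5}
unite(3,4);   // sets: {1,2},{3,4},{5}
unite(2,3);   // sets: {1,2,3,4},{5}
// now same(1,4) is true and same(1,5) is false
\end{lstlisting}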

\section{Prim's algorithm}

\index{Prim's algorithm}

\key{Prim's algorithm}\footnote{The algorithm is
named after R. C. Prim who published it in 1957 \cite{pri57}.
However, the same algorithm was already discovered in 1930
by V. Jarník.} is an alternative method
for finding a minimum spanning tree.
The algorithm first adds an arbitrary node
to the tree.
After this, the algorithm always chooses
a minimum-weight edge that
adds a new node to the tree.
Finally, all nodes have been added to the tree
and a minimum spanning tree has been found.

Prim's algorithm resembles Dijkstra's algorithm.
The difference is that Dijkstra's algorithm always
selects an edge whose distance from the starting
node is minimum, but Prim's algorithm simply selects
the minimum-weight edge that adds a new node to the tree.

\subsubsection{Example}

Let us consider how Prim's algorithm works
in the following graph:

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1.5,2) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (6.5,2) {$4$};
\node[draw, circle] (5) at (3,1) {$5$};
\node[draw, circle] (6) at (5,1) {$6$};
\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
\path[draw,thick,-] (3) -- node[font=\small,label=above:9] {} (4);
\path[draw,thick,-] (1) -- node[font=\small,label=below:5] {} (5);
\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (5);
\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
\end{tikzpicture}
\end{center}
Initially, there are no edges between the nodes:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1.5,2) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (6.5,2) {$4$};
\node[draw, circle] (5) at (3,1) {$5$};
\node[draw, circle] (6) at (5,1) {$6$};
\end{tikzpicture}
\end{center}
An arbitrary node can be the starting node,
so let us choose node 1.
First, we add node 2, which is connected by
an edge of weight 3:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1.5,2) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (6.5,2) {$4$};
\node[draw, circle] (5) at (3,1) {$5$};
\node[draw, circle] (6) at (5,1) {$6$};
\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
\end{tikzpicture}
\end{center}

After this, there are two edges with weight 5,
so we can add either node 3 or node 5 to the tree.
Let us add node 3 first:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1.5,2) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (6.5,2) {$4$};
\node[draw, circle] (5) at (3,1) {$5$};
\node[draw, circle] (6) at (5,1) {$6$};
\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
\end{tikzpicture}
\end{center}

\begin{samepage}
The process continues until all nodes have been included in the tree:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1.5,2) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (6.5,2) {$4$};
\node[draw, circle] (5) at (3,1) {$5$};
\node[draw, circle] (6) at (5,1) {$6$};
\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
\path[draw,thick,-] (2) -- node[font=\small,label=above:5] {} (3);
\path[draw,thick,-] (5) -- node[font=\small,label=below:2] {} (6);
\path[draw,thick,-] (6) -- node[font=\small,label=below:7] {} (4);
\path[draw,thick,-] (3) -- node[font=\small,label=left:3] {} (6);
\end{tikzpicture}
\end{center}
\end{samepage}

\subsubsection{Implementation}

Like Dijkstra's algorithm, Prim's algorithm can be
efficiently implemented using a priority queue.
The priority queue should contain all nodes
that can be connected to the current component using
a single edge, in increasing order of the weights
of the corresponding edges.

The time complexity of Prim's algorithm is
$O(n + m \log m)$, which equals the time complexity
of Dijkstra's algorithm.
In practice, Prim's and Kruskal's algorithms
are both efficient, and the choice of algorithm
is a matter of taste.
Still, most competitive programmers use Kruskal's algorithm.
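The priority-queue idea can be sketched as follows (an illustrative implementation, not from the original text): the queue stores $(\textrm{weight},\textrm{node})$ pairs for candidate edges, and a node joins the tree the first time it is popped. The code builds the example graph of this section and computes the total weight of its minimum spanning tree.

\begin{lstlisting}
int n = 6;
vector<vector<pair<int,int>>> adj(n+1); // {neighbor, weight}
auto addEdge = [&](int a, int b, int w) {
    adj[a].push_back({b,w});
    adj[b].push_back({a,w});
};
addEdge(1,2,3); addEdge(2,3,5); addEdge(3,4,9); addEdge(1,5,5);
addEdge(5,6,2); addEdge(6,4,7); addEdge(2,5,6); addEdge(3,6,3);

priority_queue<pair<int,int>, vector<pair<int,int>>,
               greater<pair<int,int>>> q;
vector<bool> added(n+1, false);
long long total = 0;
q.push({0,1}); // start from node 1 with cost 0
while (!q.empty()) {
    auto [w,x] = q.top(); q.pop();
    if (added[x]) continue; // node already in the tree
    added[x] = true;
    total += w;
    for (auto [y,wy] : adj[x]) {
        if (!added[y]) q.push({wy,y});
    }
}
// for the example graph, total becomes 20
\end{lstlisting}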
\chapter{Directed graphs}

In this chapter, we focus on two classes of directed graphs:
\begin{itemize}
\item \key{Acyclic graphs}:
There are no cycles in the graph,
so there is no path from any node to itself\footnote{Directed acyclic
graphs are sometimes called DAGs.}.
\item \key{Successor graphs}:
The outdegree of each node is 1,
so each node has a unique successor.
\end{itemize}
It turns out that in both cases,
we can design efficient algorithms that are based
on the special properties of the graphs.

\section{Topological sorting}

\index{topological sorting}
\index{cycle}

A \key{topological sort} is an ordering
of the nodes of a directed graph
such that if there is a path from node $a$ to node $b$,
then node $a$ appears before node $b$ in the ordering.
For example, for the graph
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,5) {$1$};
\node[draw, circle] (2) at (3,5) {$2$};
\node[draw, circle] (3) at (5,5) {$3$};
\node[draw, circle] (4) at (1,3) {$4$};
\node[draw, circle] (5) at (3,3) {$5$};
\node[draw, circle] (6) at (5,3) {$6$};

\path[draw,thick,->,>=latex] (1) -- (2);
\path[draw,thick,->,>=latex] (2) -- (3);
\path[draw,thick,->,>=latex] (4) -- (1);
\path[draw,thick,->,>=latex] (4) -- (5);
\path[draw,thick,->,>=latex] (5) -- (2);
\path[draw,thick,->,>=latex] (5) -- (3);
\path[draw,thick,->,>=latex] (3) -- (6);
\end{tikzpicture}
\end{center}
one topological sort is
$[4,1,5,2,3,6]$:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (-6,0) {$1$};
\node[draw, circle] (2) at (-3,0) {$2$};
\node[draw, circle] (3) at (-1.5,0) {$3$};
\node[draw, circle] (4) at (-7.5,0) {$4$};
\node[draw, circle] (5) at (-4.5,0) {$5$};
\node[draw, circle] (6) at (-0,0) {$6$};

\path[draw,thick,->,>=latex] (1) edge [bend right=30] (2);
\path[draw,thick,->,>=latex] (2) -- (3);
\path[draw,thick,->,>=latex] (4) -- (1);
\path[draw,thick,->,>=latex] (4) edge [bend left=30] (5);
\path[draw,thick,->,>=latex] (5) -- (2);
\path[draw,thick,->,>=latex] (5) edge [bend left=30] (3);
\path[draw,thick,->,>=latex] (3) -- (6);
\end{tikzpicture}
\end{center}

An acyclic graph always has a topological sort.
However, if the graph contains a cycle,
it is not possible to form a topological sort,
because no node of the cycle can appear
before the other nodes of the cycle in the ordering.
It turns out that depth-first search can be used
both to check whether a directed graph contains a cycle
and, if it does not, to construct a topological sort.

\subsubsection{Algorithm}

The idea is to go through the nodes of the graph
and always begin a depth-first search at the current node
if it has not been processed yet.
During the searches, the nodes have three possible states:

\begin{itemize}
\item state 0: the node has not been processed (white)
\item state 1: the node is under processing (light gray)
\item state 2: the node has been processed (dark gray)
\end{itemize}

Initially, the state of each node is 0.
When a search reaches a node for the first time,
its state becomes 1.
Finally, after all successors of the node have
been processed, its state becomes 2.

If the graph contains a cycle, we will find this out
during the search, because sooner or later
we will arrive at a node whose state is 1.
In this case, it is not possible to construct a topological sort.

If the graph does not contain a cycle, we can construct
a topological sort by
adding each node to a list when the state of the node becomes 2.
This list in reverse order is a topological sort.
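The three-state search described above can be sketched as follows (illustrative code, not from the original text), using the example graph of this section; after the final \texttt{reverse}, the vector \texttt{order} is a topological sort, and \texttt{cycle} tells whether the graph contains a cycle.

\begin{lstlisting}
int n = 6;
vector<vector<int>> adj(n+1);
adj[1] = {2}; adj[2] = {3}; adj[3] = {6};
adj[4] = {1,5}; adj[5] = {2,3};
vector<int> state(n+1, 0), order;
bool cycle = false;
function<void(int)> dfs = [&](int x) {
    state[x] = 1;                          // under processing
    for (int y : adj[x]) {
        if (state[y] == 1) cycle = true;   // back edge: a cycle
        else if (state[y] == 0) dfs(y);
    }
    state[x] = 2;                          // processed
    order.push_back(x);
};
for (int i = 1; i <= n; i++) {
    if (state[i] == 0) dfs(i);
}
reverse(order.begin(), order.end());
// for this graph, order is now [4,5,1,2,3,6] and cycle is false
\end{lstlisting}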

\subsubsection{Example 1}

In the example graph, the search first proceeds
from node 1 to node 6:

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle,fill=gray!20] (1) at (1,5) {$1$};
\node[draw, circle,fill=gray!20] (2) at (3,5) {$2$};
\node[draw, circle,fill=gray!20] (3) at (5,5) {$3$};
\node[draw, circle] (4) at (1,3) {$4$};
\node[draw, circle] (5) at (3,3) {$5$};
\node[draw, circle,fill=gray!80] (6) at (5,3) {$6$};

\path[draw,thick,->,>=latex] (4) -- (1);
\path[draw,thick,->,>=latex] (4) -- (5);
\path[draw,thick,->,>=latex] (5) -- (2);
\path[draw,thick,->,>=latex] (5) -- (3);

\path[draw=red,thick,->,line width=2pt] (1) -- (2);
\path[draw=red,thick,->,line width=2pt] (2) -- (3);
\path[draw=red,thick,->,line width=2pt] (3) -- (6);
\end{tikzpicture}
\end{center}

Now node 6 has been processed, so it is added to the list.
After this, nodes 3, 2 and 1 are also added to the list:

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle,fill=gray!80] (1) at (1,5) {$1$};
\node[draw, circle,fill=gray!80] (2) at (3,5) {$2$};
\node[draw, circle,fill=gray!80] (3) at (5,5) {$3$};
\node[draw, circle] (4) at (1,3) {$4$};
\node[draw, circle] (5) at (3,3) {$5$};
\node[draw, circle,fill=gray!80] (6) at (5,3) {$6$};

\path[draw,thick,->,>=latex] (1) -- (2);
\path[draw,thick,->,>=latex] (2) -- (3);
\path[draw,thick,->,>=latex] (4) -- (1);
\path[draw,thick,->,>=latex] (4) -- (5);
\path[draw,thick,->,>=latex] (5) -- (2);
\path[draw,thick,->,>=latex] (5) -- (3);
\path[draw,thick,->,>=latex] (3) -- (6);
\end{tikzpicture}
\end{center}

At this point, the list is $[6,3,2,1]$.
The next search begins at node 4:

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle,fill=gray!80] (1) at (1,5) {$1$};
\node[draw, circle,fill=gray!80] (2) at (3,5) {$2$};
\node[draw, circle,fill=gray!80] (3) at (5,5) {$3$};
\node[draw, circle,fill=gray!20] (4) at (1,3) {$4$};
\node[draw, circle,fill=gray!80] (5) at (3,3) {$5$};
\node[draw, circle,fill=gray!80] (6) at (5,3) {$6$};

\path[draw,thick,->,>=latex] (1) -- (2);
\path[draw,thick,->,>=latex] (2) -- (3);
\path[draw,thick,->,>=latex] (4) -- (1);
\path[draw,thick,->,>=latex] (5) -- (2);
\path[draw,thick,->,>=latex] (5) -- (3);
\path[draw,thick,->,>=latex] (3) -- (6);

\path[draw=red,thick,->,line width=2pt] (4) -- (5);
\end{tikzpicture}
\end{center}

Thus, the final list is $[6,3,2,1,5,4]$.
We have processed all nodes, so a topological sort has
been found.
The topological sort is the reverse list
$[4,5,1,2,3,6]$:

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (3,0) {$1$};
\node[draw, circle] (2) at (4.5,0) {$2$};
\node[draw, circle] (3) at (6,0) {$3$};
\node[draw, circle] (4) at (0,0) {$4$};
\node[draw, circle] (5) at (1.5,0) {$5$};
\node[draw, circle] (6) at (7.5,0) {$6$};

\path[draw,thick,->,>=latex] (1) -- (2);
\path[draw,thick,->,>=latex] (2) -- (3);
\path[draw,thick,->,>=latex] (4) edge [bend left=30] (1);
\path[draw,thick,->,>=latex] (4) -- (5);
\path[draw,thick,->,>=latex] (5) edge [bend right=30] (2);
\path[draw,thick,->,>=latex] (5) edge [bend right=40] (3);
\path[draw,thick,->,>=latex] (3) -- (6);
\end{tikzpicture}
\end{center}

Note that a topological sort is not unique,
and there can be several topological sorts for a graph.

\subsubsection{Example 2}

Let us now consider a graph for which we
cannot construct a topological sort,
because the graph contains a cycle:

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,5) {$1$};
\node[draw, circle] (2) at (3,5) {$2$};
\node[draw, circle] (3) at (5,5) {$3$};
\node[draw, circle] (4) at (1,3) {$4$};
\node[draw, circle] (5) at (3,3) {$5$};
\node[draw, circle] (6) at (5,3) {$6$};

\path[draw,thick,->,>=latex] (1) -- (2);
\path[draw,thick,->,>=latex] (2) -- (3);
\path[draw,thick,->,>=latex] (4) -- (1);
\path[draw,thick,->,>=latex] (4) -- (5);
\path[draw,thick,->,>=latex] (5) -- (2);
\path[draw,thick,->,>=latex] (3) -- (5);
\path[draw,thick,->,>=latex] (3) -- (6);
\end{tikzpicture}
\end{center}
The search proceeds as follows:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle,fill=gray!20] (1) at (1,5) {$1$};
\node[draw, circle,fill=gray!20] (2) at (3,5) {$2$};
\node[draw, circle,fill=gray!20] (3) at (5,5) {$3$};
\node[draw, circle] (4) at (1,3) {$4$};
\node[draw, circle,fill=gray!20] (5) at (3,3) {$5$};
\node[draw, circle] (6) at (5,3) {$6$};

\path[draw,thick,->,>=latex] (4) -- (1);
\path[draw,thick,->,>=latex] (4) -- (5);
\path[draw,thick,->,>=latex] (3) -- (6);

\path[draw=red,thick,->,line width=2pt] (1) -- (2);
\path[draw=red,thick,->,line width=2pt] (2) -- (3);
\path[draw=red,thick,->,line width=2pt] (3) -- (5);
\path[draw=red,thick,->,line width=2pt] (5) -- (2);
\end{tikzpicture}
\end{center}
The search reaches node 2, whose state is 1,
which means that the graph contains a cycle.
In this example, there is a cycle
$2 \rightarrow 3 \rightarrow 5 \rightarrow 2$.

\section{Dynamic programming}

If a directed graph is acyclic,
dynamic programming can be applied to it.
For example, we can efficiently solve the following
problems concerning paths from a starting node
to an ending node:

\begin{itemize}
\item how many different paths are there?
\item what is the shortest/longest path?
\item what is the minimum/maximum number of edges in a path?
\item which nodes certainly appear in any path?
\end{itemize}

\subsubsection{Counting the number of paths}

As an example, let us calculate the number of paths
from node 1 to node 6 in the following graph:

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,5) {$1$};
\node[draw, circle] (2) at (3,5) {$2$};
\node[draw, circle] (3) at (5,5) {$3$};
\node[draw, circle] (4) at (1,3) {$4$};
\node[draw, circle] (5) at (3,3) {$5$};
\node[draw, circle] (6) at (5,3) {$6$};

\path[draw,thick,->,>=latex] (1) -- (2);
\path[draw,thick,->,>=latex] (2) -- (3);
\path[draw,thick,->,>=latex] (1) -- (4);
\path[draw,thick,->,>=latex] (4) -- (5);
\path[draw,thick,->,>=latex] (5) -- (2);
\path[draw,thick,->,>=latex] (5) -- (3);
\path[draw,thick,->,>=latex] (3) -- (6);
\end{tikzpicture}
\end{center}
There are a total of three such paths:
\begin{itemize}
\item $1 \rightarrow 2 \rightarrow 3 \rightarrow 6$
\item $1 \rightarrow 4 \rightarrow 5 \rightarrow 2 \rightarrow 3 \rightarrow 6$
\item $1 \rightarrow 4 \rightarrow 5 \rightarrow 3 \rightarrow 6$
\end{itemize}

Let $\texttt{paths}(x)$ denote the number of paths from
node 1 to node $x$.
As a base case, $\texttt{paths}(1)=1$.
Then, to calculate other values of $\texttt{paths}(x)$,
we may use the recursion
\[\texttt{paths}(x) = \texttt{paths}(a_1)+\texttt{paths}(a_2)+\cdots+\texttt{paths}(a_k)\]
where $a_1,a_2,\ldots,a_k$ are the nodes from which there
is an edge to $x$.
Since the graph is acyclic, the values of $\texttt{paths}(x)$
can be calculated in the order of a topological sort.
A topological sort for the above graph is as follows:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,0) {$1$};
\node[draw, circle] (2) at (4.5,0) {$2$};
\node[draw, circle] (3) at (6,0) {$3$};
\node[draw, circle] (4) at (1.5,0) {$4$};
\node[draw, circle] (5) at (3,0) {$5$};
\node[draw, circle] (6) at (7.5,0) {$6$};

\path[draw,thick,->,>=latex] (1) edge [bend left=30] (2);
\path[draw,thick,->,>=latex] (2) -- (3);
\path[draw,thick,->,>=latex] (1) -- (4);
\path[draw,thick,->,>=latex] (4) -- (5);
\path[draw,thick,->,>=latex] (5) -- (2);
\path[draw,thick,->,>=latex] (5) edge [bend right=30] (3);
\path[draw,thick,->,>=latex] (3) -- (6);
\end{tikzpicture}
\end{center}
Hence, the numbers of paths are as follows:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,5) {$1$};
\node[draw, circle] (2) at (3,5) {$2$};
\node[draw, circle] (3) at (5,5) {$3$};
\node[draw, circle] (4) at (1,3) {$4$};
\node[draw, circle] (5) at (3,3) {$5$};
\node[draw, circle] (6) at (5,3) {$6$};

\path[draw,thick,->,>=latex] (1) -- (2);
\path[draw,thick,->,>=latex] (2) -- (3);
\path[draw,thick,->,>=latex] (1) -- (4);
\path[draw,thick,->,>=latex] (4) -- (5);
\path[draw,thick,->,>=latex] (5) -- (2);
\path[draw,thick,->,>=latex] (5) -- (3);
\path[draw,thick,->,>=latex] (3) -- (6);

\node[color=red] at (1,2.3) {$1$};
\node[color=red] at (3,2.3) {$1$};
\node[color=red] at (5,2.3) {$3$};
\node[color=red] at (1,5.7) {$1$};
\node[color=red] at (3,5.7) {$2$};
\node[color=red] at (5,5.7) {$3$};
\end{tikzpicture}
\end{center}

For example, to calculate the value of $\texttt{paths}(3)$,
we can use the formula $\texttt{paths}(2)+\texttt{paths}(5)$,
because there are edges from nodes 2 and 5
to node 3.
Since $\texttt{paths}(2)=2$ and $\texttt{paths}(5)=1$, we conclude that $\texttt{paths}(3)=3$.
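The computation can be sketched as follows (illustrative code, not from the original text): we process the nodes in topological order and push each node's path count to its successors. The graph and the topological sort are the ones shown above.

\begin{lstlisting}
int n = 6;
vector<vector<int>> adj(n+1);
adj[1] = {2,4}; adj[2] = {3}; adj[3] = {6};
adj[4] = {5}; adj[5] = {2,3};
vector<int> order = {1,4,5,2,3,6}; // a topological sort
vector<long long> paths(n+1, 0);
paths[1] = 1; // base case
for (int x : order) {
    for (int y : adj[x]) paths[y] += paths[x];
}
// paths[6] is now 3, the number of paths from node 1 to node 6
\end{lstlisting}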

\subsubsection{Extending Dijkstra's algorithm}

\index{Dijkstra's algorithm}

A by-product of Dijkstra's algorithm is a directed, acyclic
graph that indicates for each node of the original graph
the possible ways to reach the node using a shortest path
from the starting node.
Dynamic programming can be applied to that graph.
For example, in the graph
\begin{center}
\begin{tikzpicture}
\node[draw, circle] (1) at (0,0) {$1$};
\node[draw, circle] (2) at (2,0) {$2$};
\node[draw, circle] (3) at (0,-2) {$3$};
\node[draw, circle] (4) at (2,-2) {$4$};
\node[draw, circle] (5) at (4,-1) {$5$};

\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
\path[draw,thick,-] (1) -- node[font=\small,label=left:5] {} (3);
\path[draw,thick,-] (2) -- node[font=\small,label=right:4] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=above:8] {} (5);
\path[draw,thick,-] (3) -- node[font=\small,label=below:2] {} (4);
\path[draw,thick,-] (4) -- node[font=\small,label=below:1] {} (5);
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (3);
\end{tikzpicture}
\end{center}
the shortest paths from node 1 may use the following edges:
\begin{center}
\begin{tikzpicture}
\node[draw, circle] (1) at (0,0) {$1$};
\node[draw, circle] (2) at (2,0) {$2$};
\node[draw, circle] (3) at (0,-2) {$3$};
\node[draw, circle] (4) at (2,-2) {$4$};
\node[draw, circle] (5) at (4,-1) {$5$};

\path[draw,thick,->] (1) -- node[font=\small,label=above:3] {} (2);
\path[draw,thick,->] (1) -- node[font=\small,label=left:5] {} (3);
\path[draw,thick,->] (2) -- node[font=\small,label=right:4] {} (4);
\path[draw,thick,->] (3) -- node[font=\small,label=below:2] {} (4);
\path[draw,thick,->] (4) -- node[font=\small,label=below:1] {} (5);
\path[draw,thick,->] (2) -- node[font=\small,label=above:2] {} (3);
\end{tikzpicture}
\end{center}

Now we can, for example, calculate the number of
shortest paths from node 1 to node 5
using dynamic programming:
\begin{center}
\begin{tikzpicture}
\node[draw, circle] (1) at (0,0) {$1$};
\node[draw, circle] (2) at (2,0) {$2$};
\node[draw, circle] (3) at (0,-2) {$3$};
\node[draw, circle] (4) at (2,-2) {$4$};
\node[draw, circle] (5) at (4,-1) {$5$};

\path[draw,thick,->] (1) -- node[font=\small,label=above:3] {} (2);
\path[draw,thick,->] (1) -- node[font=\small,label=left:5] {} (3);
\path[draw,thick,->] (2) -- node[font=\small,label=right:4] {} (4);
\path[draw,thick,->] (3) -- node[font=\small,label=below:2] {} (4);
\path[draw,thick,->] (4) -- node[font=\small,label=below:1] {} (5);
\path[draw,thick,->] (2) -- node[font=\small,label=above:2] {} (3);

\node[color=red] at (0,0.7) {$1$};
\node[color=red] at (2,0.7) {$1$};
\node[color=red] at (0,-2.7) {$2$};
\node[color=red] at (2,-2.7) {$3$};
\node[color=red] at (4,-1.7) {$3$};
\end{tikzpicture}
\end{center}

\subsubsection{Representing problems as graphs}

Actually, any dynamic programming problem
can be represented as a directed, acyclic graph.
In such a graph, each node corresponds to a dynamic programming state
and the edges indicate how the states depend on each other.

As an example, consider the problem
of forming a sum of money $n$
using coins
$\{c_1,c_2,\ldots,c_k\}$.
In this problem, we can construct a graph where
each node corresponds to a sum of money,
and the edges show how the coins can be chosen.
For example, for coins $\{1,3,4\}$ and $n=6$,
the graph is as follows:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (0) at (0,0) {$0$};
\node[draw, circle] (1) at (2,0) {$1$};
\node[draw, circle] (2) at (4,0) {$2$};
\node[draw, circle] (3) at (6,0) {$3$};
\node[draw, circle] (4) at (8,0) {$4$};
\node[draw, circle] (5) at (10,0) {$5$};
\node[draw, circle] (6) at (12,0) {$6$};

\path[draw,thick,->] (0) -- (1);
\path[draw,thick,->] (1) -- (2);
\path[draw,thick,->] (2) -- (3);
\path[draw,thick,->] (3) -- (4);
\path[draw,thick,->] (4) -- (5);
\path[draw,thick,->] (5) -- (6);

\path[draw,thick,->] (0) edge [bend right=30] (3);
\path[draw,thick,->] (1) edge [bend right=30] (4);
\path[draw,thick,->] (2) edge [bend right=30] (5);
\path[draw,thick,->] (3) edge [bend right=30] (6);

\path[draw,thick,->] (0) edge [bend left=30] (4);
\path[draw,thick,->] (1) edge [bend left=30] (5);
\path[draw,thick,->] (2) edge [bend left=30] (6);
\end{tikzpicture}
\end{center}

Using this representation,
the shortest path from node 0 to node $n$
corresponds to a solution with the minimum number of coins,
and the total number of paths from node 0 to node $n$
equals the total number of solutions.
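Both quantities can be computed with one forward pass over the nodes $0,1,\ldots,n$, since all edges point toward larger sums. The following sketch (illustrative code, not from the original text) does this for the coins $\{1,3,4\}$ and $n=6$ above:

\begin{lstlisting}
int n = 6;
vector<int> coins = {1,3,4};
const int INF = 1e9;
vector<int> dist(n+1, INF);     // fewest edges (coins) from node 0
vector<long long> ways(n+1, 0); // number of paths (solutions)
dist[0] = 0; ways[0] = 1;
for (int x = 0; x <= n; x++) {
    for (int c : coins) {
        if (x + c <= n) {
            dist[x+c] = min(dist[x+c], dist[x] + 1);
            ways[x+c] += ways[x];
        }
    }
}
// dist[6] is 2 (e.g. 3+3) and ways[6] is 9
\end{lstlisting}

Note that the paths are ordered sequences of coins, so \texttt{ways} counts ordered solutions.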

\section{Successor paths}

\index{successor graph}
\index{functional graph}

For the rest of the chapter,
we will focus on \key{successor graphs}.
In those graphs,
the outdegree of each node is 1, i.e.,
exactly one edge starts at each node.
A successor graph consists of one or more
components, each of which contains
one cycle and some paths that lead to it.

Successor graphs are sometimes called
\key{functional graphs}.
The reason for this is that any successor graph
corresponds to a function that defines
the edges of the graph.
The parameter for the function is a node of the graph,
and the function gives the successor of that node.

For example, the function
\begin{center}
\begin{tabular}{r|rrrrrrrrr}
$x$ & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\
\hline
$\texttt{succ}(x)$ & 3 & 5 & 7 & 6 & 2 & 2 & 1 & 6 & 3 \\
\end{tabular}
\end{center}
defines the following graph:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,0) {$1$};
\node[draw, circle] (2) at (2,0) {$2$};
\node[draw, circle] (3) at (-2,0) {$3$};
\node[draw, circle] (4) at (1,-3) {$4$};
\node[draw, circle] (5) at (4,0) {$5$};
\node[draw, circle] (6) at (2,-1.5) {$6$};
\node[draw, circle] (7) at (-2,-1.5) {$7$};
\node[draw, circle] (8) at (3,-3) {$8$};
\node[draw, circle] (9) at (-4,0) {$9$};

\path[draw,thick,->] (1) -- (3);
\path[draw,thick,->] (2) edge [bend left=40] (5);
\path[draw,thick,->] (3) -- (7);
\path[draw,thick,->] (4) -- (6);
\path[draw,thick,->] (5) edge [bend left=40] (2);
\path[draw,thick,->] (6) -- (2);
\path[draw,thick,->] (7) -- (1);
\path[draw,thick,->] (8) -- (6);
\path[draw,thick,->] (9) -- (3);
\end{tikzpicture}
\end{center}

Since each node of a successor graph has a
unique successor, we can also define a function $\texttt{succ}(x,k)$
that gives the node that we will reach if
we begin at node $x$ and walk $k$ steps forward.
For example, in the above graph $\texttt{succ}(4,6)=2$,
because we will reach node 2 by walking 6 steps from node 4:

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,0) {$4$};
\node[draw, circle] (2) at (1.5,0) {$6$};
\node[draw, circle] (3) at (3,0) {$2$};
\node[draw, circle] (4) at (4.5,0) {$5$};
\node[draw, circle] (5) at (6,0) {$2$};
\node[draw, circle] (6) at (7.5,0) {$5$};
\node[draw, circle] (7) at (9,0) {$2$};

\path[draw,thick,->] (1) -- (2);
\path[draw,thick,->] (2) -- (3);
\path[draw,thick,->] (3) -- (4);
\path[draw,thick,->] (4) -- (5);
\path[draw,thick,->] (5) -- (6);
\path[draw,thick,->] (6) -- (7);
\end{tikzpicture}
\end{center}

A straightforward way to calculate a value of $\texttt{succ}(x,k)$
is to start at node $x$ and walk $k$ steps forward, which takes $O(k)$ time.
However, using preprocessing, any value of $\texttt{succ}(x,k)$
can be calculated in only $O(\log k)$ time.

The idea is to precalculate all values of $\texttt{succ}(x,k)$ where
$k$ is a power of two and at most $u$, where $u$ is
the maximum number of steps we will ever walk.
This can be done efficiently, because
we can use the following recursion:

\begin{equation*}
\texttt{succ}(x,k) = \begin{cases}
\texttt{succ}(x) & k = 1\\
\texttt{succ}(\texttt{succ}(x,k/2),k/2) & k > 1\\
\end{cases}
\end{equation*}

Precalculating the values takes $O(n \log u)$ time,
because $O(\log u)$ values are calculated for each node.
In the above graph, the first values are as follows:

\begin{center}
\begin{tabular}{r|rrrrrrrrr}
$x$ & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\
\hline
$\texttt{succ}(x,1)$ & 3 & 5 & 7 & 6 & 2 & 2 & 1 & 6 & 3 \\
$\texttt{succ}(x,2)$ & 7 & 2 & 1 & 2 & 5 & 5 & 3 & 2 & 7 \\
$\texttt{succ}(x,4)$ & 3 & 2 & 7 & 2 & 5 & 5 & 1 & 2 & 3 \\
$\texttt{succ}(x,8)$ & 7 & 2 & 1 & 2 & 5 & 5 & 3 & 2 & 7 \\
$\cdots$ \\
\end{tabular}
\end{center}

After this, any value of $\texttt{succ}(x,k)$ can be calculated
by representing the number of steps $k$ as a sum of powers of two.
For example, if we want to calculate the value of $\texttt{succ}(x,11)$,
we first form the representation $11=8+2+1$.
Using that,
\[\texttt{succ}(x,11)=\texttt{succ}(\texttt{succ}(\texttt{succ}(x,8),2),1).\]
For example, in the previous graph
\[\texttt{succ}(4,11)=\texttt{succ}(\texttt{succ}(\texttt{succ}(4,8),2),1)=5.\]

Such a representation always consists of
$O(\log k)$ parts, so calculating a value of $\texttt{succ}(x,k)$
takes $O(\log k)$ time.

\section{Cycle detection}

\index{cycle}
\index{cycle detection}

Consider a successor graph that only contains
a path that ends in a cycle.
We may ask the following questions:
if we begin our walk at the starting node,
what is the first node in the cycle
and how many nodes does the cycle contain?

For example, in the graph

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (5) at (0,0) {$5$};
\node[draw, circle] (4) at (-2,0) {$4$};
\node[draw, circle] (6) at (-1,1.5) {$6$};
\node[draw, circle] (3) at (-4,0) {$3$};
\node[draw, circle] (2) at (-6,0) {$2$};
\node[draw, circle] (1) at (-8,0) {$1$};

\path[draw,thick,->] (1) -- (2);
\path[draw,thick,->] (2) -- (3);
\path[draw,thick,->] (3) -- (4);
\path[draw,thick,->] (4) -- (5);
\path[draw,thick,->] (5) -- (6);
\path[draw,thick,->] (6) -- (4);
\end{tikzpicture}
\end{center}
we begin our walk at node 1,
the first node that belongs to the cycle is node 4, and the cycle consists
of three nodes (4, 5 and 6).

A simple way to detect the cycle is to walk in the
graph and keep track of
all nodes that have been visited. Once a node is visited
for the second time, we can conclude
that the node is the first node in the cycle.
This method works in $O(n)$ time and also uses
$O(n)$ memory.

However, there are better algorithms for cycle detection.
The time complexity of such algorithms is still $O(n)$,
but they only use $O(1)$ memory.
This is an important improvement if $n$ is large.
Next we will discuss Floyd's algorithm that
achieves these properties.

\subsubsection{Floyd's algorithm}

\index{Floyd's algorithm}

\key{Floyd's algorithm}\footnote{The idea of the algorithm is mentioned in \cite{knu982}
and attributed to R. W. Floyd; however, it is not known if Floyd actually
discovered the algorithm.} walks forward
in the graph using two pointers $a$ and $b$.
Both pointers begin at a node $x$ that
is the starting node of the graph.
Then, on each turn, the pointer $a$ walks
one step forward and the pointer $b$
walks two steps forward.
The process continues until
the pointers meet each other:
\begin{lstlisting}
a = succ(x);
b = succ(succ(x));
while (a != b) {
    a = succ(a);
    b = succ(succ(b));
}
\end{lstlisting}

At this point, the pointer $a$ has walked $k$ steps
and the pointer $b$ has walked $2k$ steps,
so the length of the cycle divides $k$.
Thus, the first node that belongs to the cycle
can be found by moving the pointer $a$ to node $x$
and advancing the pointers
step by step until they meet again.
\begin{lstlisting}
a = x;
while (a != b) {
    a = succ(a);
    b = succ(b);
}
first = a;
\end{lstlisting}

After this, the length of the cycle
can be calculated as follows:
\begin{lstlisting}
b = succ(a);
length = 1;
while (a != b) {
    b = succ(b);
    length++;
}
\end{lstlisting}
\chapter{Strong connectivity}

\index{strongly connected graph}

In a directed graph,
the edges can be traversed in one direction only,
so even if the graph is connected,
this does not guarantee that there is
a path from each node to every other node.
For this reason, it is meaningful to define a new concept
that requires more than connectivity.

A graph is \key{strongly connected}
if there is a path from any node to all
other nodes in the graph.
For example, in the following picture,
the left graph is strongly connected
while the right graph is not.

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,1) {$1$};
\node[draw, circle] (2) at (3,1) {$2$};
\node[draw, circle] (3) at (1,-1) {$3$};
\node[draw, circle] (4) at (3,-1) {$4$};

\path[draw,thick,->] (1) -- (2);
\path[draw,thick,->] (2) -- (4);
\path[draw,thick,->] (4) -- (3);
\path[draw,thick,->] (3) -- (1);

\node[draw, circle] (1b) at (6,1) {$1$};
\node[draw, circle] (2b) at (8,1) {$2$};
\node[draw, circle] (3b) at (6,-1) {$3$};
\node[draw, circle] (4b) at (8,-1) {$4$};

\path[draw,thick,->] (1b) -- (2b);
\path[draw,thick,->] (2b) -- (4b);
\path[draw,thick,->] (4b) -- (3b);
\path[draw,thick,->] (1b) -- (3b);
\end{tikzpicture}
\end{center}

The right graph is not strongly connected
because, for example, there is no path
from node 2 to node 1.

\index{strongly connected component}
\index{component graph}

The \key{strongly connected components}
of a graph divide the graph into strongly connected
parts that are as large as possible.
The strongly connected components form an
acyclic \key{component graph} that represents
the deep structure of the original graph.

For example, for the graph
\begin{center}
\begin{tikzpicture}[scale=0.9,label distance=-2mm]
\node[draw, circle] (1) at (-1,1) {$7$};
\node[draw, circle] (2) at (-3,2) {$3$};
\node[draw, circle] (4) at (-5,2) {$2$};
\node[draw, circle] (6) at (-7,2) {$1$};
\node[draw, circle] (3) at (-3,0) {$6$};
\node[draw, circle] (5) at (-5,0) {$5$};
\node[draw, circle] (7) at (-7,0) {$4$};

\path[draw,thick,->] (2) -- (1);
\path[draw,thick,->] (1) -- (3);
\path[draw,thick,->] (3) -- (2);
\path[draw,thick,->] (2) -- (4);
\path[draw,thick,->] (3) -- (5);
\path[draw,thick,->] (4) edge [bend left] (6);
\path[draw,thick,->] (6) edge [bend left] (4);
\path[draw,thick,->] (4) -- (5);
\path[draw,thick,->] (5) -- (7);
\path[draw,thick,->] (6) -- (7);
\end{tikzpicture}
\end{center}
the strongly connected components are as follows:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (-1,1) {$7$};
\node[draw, circle] (2) at (-3,2) {$3$};
\node[draw, circle] (4) at (-5,2) {$2$};
\node[draw, circle] (6) at (-7,2) {$1$};
\node[draw, circle] (3) at (-3,0) {$6$};
\node[draw, circle] (5) at (-5,0) {$5$};
\node[draw, circle] (7) at (-7,0) {$4$};

\path[draw,thick,->] (2) -- (1);
\path[draw,thick,->] (1) -- (3);
\path[draw,thick,->] (3) -- (2);
\path[draw,thick,->] (2) -- (4);
\path[draw,thick,->] (3) -- (5);
\path[draw,thick,->] (4) edge [bend left] (6);
\path[draw,thick,->] (6) edge [bend left] (4);
\path[draw,thick,->] (4) -- (5);
\path[draw,thick,->] (5) -- (7);
\path[draw,thick,->] (6) -- (7);

\draw [red,thick,dashed,line width=2pt] (-0.5,2.5) rectangle (-3.5,-0.5);
\draw [red,thick,dashed,line width=2pt] (-4.5,2.5) rectangle (-7.5,1.5);
\draw [red,thick,dashed,line width=2pt] (-4.5,0.5) rectangle (-5.5,-0.5);
\draw [red,thick,dashed,line width=2pt] (-6.5,0.5) rectangle (-7.5,-0.5);
\end{tikzpicture}
\end{center}
The corresponding component graph is as follows:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (-3,1) {$B$};
\node[draw, circle] (2) at (-6,2) {$A$};
\node[draw, circle] (3) at (-5,0) {$D$};
\node[draw, circle] (4) at (-7,0) {$C$};

\path[draw,thick,->] (1) -- (2);
\path[draw,thick,->] (1) -- (3);
\path[draw,thick,->] (2) -- (3);
\path[draw,thick,->] (2) -- (4);
\path[draw,thick,->] (3) -- (4);
\end{tikzpicture}
\end{center}
The components are $A=\{1,2\}$,
$B=\{3,6,7\}$, $C=\{4\}$ and $D=\{5\}$.

A component graph is an acyclic, directed graph,
so it is easier to process than the original graph.
Since the graph does not contain cycles,
we can always construct a topological sort and
use dynamic programming techniques like those
presented in Chapter 16.

\section{Kosaraju's algorithm}

\index{Kosaraju's algorithm}

\key{Kosaraju's algorithm}\footnote{According to \cite{aho83},
S. R. Kosaraju invented this algorithm in 1978
but did not publish it. In 1981, the same algorithm was rediscovered
and published by M. Sharir \cite{sha81}.} is an efficient
method for finding the strongly connected components
of a directed graph.
The algorithm performs two depth-first searches:
the first search constructs a list of nodes
according to the structure of the graph,
and the second search forms the strongly connected components.

\subsubsection{Search 1}

The first phase of Kosaraju's algorithm constructs
a list of nodes in the order in which a
depth-first search processes them.
The algorithm goes through the nodes,
and begins a depth-first search at each
unprocessed node.
Each node will be added to the list
after it has been processed.

In the example graph, the nodes are processed
in the following order:
\begin{center}
\begin{tikzpicture}[scale=0.9,label distance=-2mm]
\node[draw, circle] (1) at (-1,1) {$7$};
\node[draw, circle] (2) at (-3,2) {$3$};
\node[draw, circle] (4) at (-5,2) {$2$};
\node[draw, circle] (6) at (-7,2) {$1$};
\node[draw, circle] (3) at (-3,0) {$6$};
\node[draw, circle] (5) at (-5,0) {$5$};
\node[draw, circle] (7) at (-7,0) {$4$};

\node at (-7,2.75) {$1/8$};
\node at (-5,2.75) {$2/7$};
\node at (-3,2.75) {$9/14$};
\node at (-7,-0.75) {$4/5$};
\node at (-5,-0.75) {$3/6$};
\node at (-3,-0.75) {$11/12$};
\node at (-1,1.75) {$10/13$};

\path[draw,thick,->] (2) -- (1);
\path[draw,thick,->] (1) -- (3);
\path[draw,thick,->] (3) -- (2);
\path[draw,thick,->] (2) -- (4);
\path[draw,thick,->] (3) -- (5);
\path[draw,thick,->] (4) edge [bend left] (6);
\path[draw,thick,->] (6) edge [bend left] (4);
\path[draw,thick,->] (4) -- (5);
\path[draw,thick,->] (5) -- (7);
\path[draw,thick,->] (6) -- (7);
\end{tikzpicture}
\end{center}

The notation $x/y$ means that
processing the node started
at time $x$ and finished at time $y$.
Thus, the corresponding list is as follows:

\begin{tabular}{ll}
\\
node & processing time \\
\hline
4 & 5 \\
5 & 6 \\
2 & 7 \\
1 & 8 \\
6 & 12 \\
7 & 13 \\
3 & 14 \\
\\
\end{tabular}
%
% In the second phase of the algorithm,
% the nodes will be processed
% in reverse order: $[3,7,6,1,2,5,4]$.

\subsubsection{Search 2}

The second phase of the algorithm
forms the strongly connected components
of the graph.
First, the algorithm reverses every
edge in the graph.
This guarantees that during the second search,
we will always find strongly connected
components that do not have extra nodes.

After reversing the edges,
the example graph is as follows:
\begin{center}
\begin{tikzpicture}[scale=0.9,label distance=-2mm]
\node[draw, circle] (1) at (-1,1) {$7$};
\node[draw, circle] (2) at (-3,2) {$3$};
\node[draw, circle] (4) at (-5,2) {$2$};
\node[draw, circle] (6) at (-7,2) {$1$};
\node[draw, circle] (3) at (-3,0) {$6$};
\node[draw, circle] (5) at (-5,0) {$5$};
\node[draw, circle] (7) at (-7,0) {$4$};

\path[draw,thick,<-] (2) -- (1);
\path[draw,thick,<-] (1) -- (3);
\path[draw,thick,<-] (3) -- (2);
\path[draw,thick,<-] (2) -- (4);
\path[draw,thick,<-] (3) -- (5);
\path[draw,thick,<-] (4) edge [bend left] (6);
\path[draw,thick,<-] (6) edge [bend left] (4);
\path[draw,thick,<-] (4) -- (5);
\path[draw,thick,<-] (5) -- (7);
\path[draw,thick,<-] (6) -- (7);
\end{tikzpicture}
\end{center}

After this, the algorithm goes through
the list of nodes created by the first search,
in \emph{reverse} order.
If a node does not belong to a component,
the algorithm creates a new component
and starts a depth-first search
that adds all new nodes found during the search
to the new component.

In the example graph, the first component
begins at node 3:

\begin{center}
\begin{tikzpicture}[scale=0.9,label distance=-2mm]
\node[draw, circle] (1) at (-1,1) {$7$};
\node[draw, circle] (2) at (-3,2) {$3$};
\node[draw, circle] (4) at (-5,2) {$2$};
\node[draw, circle] (6) at (-7,2) {$1$};
\node[draw, circle] (3) at (-3,0) {$6$};
\node[draw, circle] (5) at (-5,0) {$5$};
\node[draw, circle] (7) at (-7,0) {$4$};

\path[draw,thick,<-] (2) -- (1);
\path[draw,thick,<-] (1) -- (3);
\path[draw,thick,<-] (3) -- (2);
\path[draw,thick,<-] (2) -- (4);
\path[draw,thick,<-] (3) -- (5);
\path[draw,thick,<-] (4) edge [bend left] (6);
\path[draw,thick,<-] (6) edge [bend left] (4);
\path[draw,thick,<-] (4) -- (5);
\path[draw,thick,<-] (5) -- (7);
\path[draw,thick,<-] (6) -- (7);

\draw [red,thick,dashed,line width=2pt] (-0.5,2.5) rectangle (-3.5,-0.5);
\end{tikzpicture}
\end{center}

Note that since all edges are reversed,
the component does not ``leak'' to other parts in the graph.

\begin{samepage}
The next nodes in the list are nodes 7 and 6,
but they already belong to a component,
so the next new component begins at node 1:

\begin{center}
\begin{tikzpicture}[scale=0.9,label distance=-2mm]
\node[draw, circle] (1) at (-1,1) {$7$};
\node[draw, circle] (2) at (-3,2) {$3$};
\node[draw, circle] (4) at (-5,2) {$2$};
\node[draw, circle] (6) at (-7,2) {$1$};
\node[draw, circle] (3) at (-3,0) {$6$};
\node[draw, circle] (5) at (-5,0) {$5$};
\node[draw, circle] (7) at (-7,0) {$4$};

\path[draw,thick,<-] (2) -- (1);
\path[draw,thick,<-] (1) -- (3);
\path[draw,thick,<-] (3) -- (2);
\path[draw,thick,<-] (2) -- (4);
\path[draw,thick,<-] (3) -- (5);
\path[draw,thick,<-] (4) edge [bend left] (6);
\path[draw,thick,<-] (6) edge [bend left] (4);
\path[draw,thick,<-] (4) -- (5);
\path[draw,thick,<-] (5) -- (7);
\path[draw,thick,<-] (6) -- (7);

\draw [red,thick,dashed,line width=2pt] (-0.5,2.5) rectangle (-3.5,-0.5);
\draw [red,thick,dashed,line width=2pt] (-4.5,2.5) rectangle (-7.5,1.5);
%\draw [red,thick,dashed,line width=2pt] (-4.5,0.5) rectangle (-5.5,-0.5);
%\draw [red,thick,dashed,line width=2pt] (-6.5,0.5) rectangle (-7.5,-0.5);
\end{tikzpicture}
\end{center}
\end{samepage}

\begin{samepage}
Finally, the algorithm processes nodes 5 and 4
that create the remaining strongly connected components:

\begin{center}
\begin{tikzpicture}[scale=0.9,label distance=-2mm]
\node[draw, circle] (1) at (-1,1) {$7$};
\node[draw, circle] (2) at (-3,2) {$3$};
\node[draw, circle] (4) at (-5,2) {$2$};
\node[draw, circle] (6) at (-7,2) {$1$};
\node[draw, circle] (3) at (-3,0) {$6$};
\node[draw, circle] (5) at (-5,0) {$5$};
\node[draw, circle] (7) at (-7,0) {$4$};

\path[draw,thick,<-] (2) -- (1);
\path[draw,thick,<-] (1) -- (3);
\path[draw,thick,<-] (3) -- (2);
\path[draw,thick,<-] (2) -- (4);
\path[draw,thick,<-] (3) -- (5);
\path[draw,thick,<-] (4) edge [bend left] (6);
\path[draw,thick,<-] (6) edge [bend left] (4);
\path[draw,thick,<-] (4) -- (5);
\path[draw,thick,<-] (5) -- (7);
\path[draw,thick,<-] (6) -- (7);

\draw [red,thick,dashed,line width=2pt] (-0.5,2.5) rectangle (-3.5,-0.5);
\draw [red,thick,dashed,line width=2pt] (-4.5,2.5) rectangle (-7.5,1.5);
\draw [red,thick,dashed,line width=2pt] (-4.5,0.5) rectangle (-5.5,-0.5);
\draw [red,thick,dashed,line width=2pt] (-6.5,0.5) rectangle (-7.5,-0.5);
\end{tikzpicture}
\end{center}
\end{samepage}

The time complexity of the algorithm is $O(n+m)$,
because the algorithm
performs two depth-first searches.

\section{2SAT problem}

\index{2SAT problem}

Strong connectivity is also linked with the
\key{2SAT problem}\footnote{The algorithm presented here was
introduced in \cite{asp79}.
There is also another well-known linear-time algorithm \cite{eve75}
that is based on backtracking.}.
In this problem, we are given a logical formula
\[
(a_1 \lor b_1) \land (a_2 \lor b_2) \land \cdots \land (a_m \lor b_m),
\]
where each $a_i$ and $b_i$ is either a logical variable
($x_1,x_2,\ldots,x_n$)
or a negation of a logical variable
($\lnot x_1, \lnot x_2, \ldots, \lnot x_n$).
The symbols ``$\land$'' and ``$\lor$'' denote
logical operators ``and'' and ``or''.
Our task is to assign each variable a value
so that the formula is true, or state
that this is not possible.

For example, the formula
\[
L_1 = (x_2 \lor \lnot x_1) \land
      (\lnot x_1 \lor \lnot x_2) \land
      (x_1 \lor x_3) \land
      (\lnot x_2 \lor \lnot x_3) \land
      (x_1 \lor x_4)
\]
is true when the variables are assigned as follows:

\[
\begin{cases}
x_1 = \textrm{false} \\
x_2 = \textrm{false} \\
x_3 = \textrm{true} \\
x_4 = \textrm{true} \\
\end{cases}
\]

However, the formula
\[
L_2 = (x_1 \lor x_2) \land
      (x_1 \lor \lnot x_2) \land
      (\lnot x_1 \lor x_3) \land
      (\lnot x_1 \lor \lnot x_3)
\]
is always false, regardless of how we
assign the values.
The reason for this is that we cannot
choose a value for $x_1$
without creating a contradiction.
If $x_1$ is false, both $x_2$ and $\lnot x_2$
should be true, which is impossible,
and if $x_1$ is true, both $x_3$ and $\lnot x_3$
should be true, which is also impossible.

The 2SAT problem can be represented as a graph
whose nodes correspond to
variables $x_i$ and negations $\lnot x_i$,
and edges determine the connections
between the variables.
Each pair $(a_i \lor b_i)$ generates two edges:
$\lnot a_i \to b_i$ and $\lnot b_i \to a_i$.
This means that if $a_i$ does not hold,
$b_i$ must hold, and vice versa.

The graph for the formula $L_1$ is:
\\
\begin{center}
\begin{tikzpicture}[scale=1.0,minimum size=2pt]
\node[draw, circle, inner sep=1.3pt] (1) at (1,2) {$\lnot x_3$};
\node[draw, circle] (2) at (3,2) {$x_2$};
\node[draw, circle, inner sep=1.3pt] (3) at (1,0) {$\lnot x_4$};
\node[draw, circle] (4) at (3,0) {$x_1$};
\node[draw, circle, inner sep=1.3pt] (5) at (5,2) {$\lnot x_1$};
\node[draw, circle] (6) at (7,2) {$x_4$};
\node[draw, circle, inner sep=1.3pt] (7) at (5,0) {$\lnot x_2$};
\node[draw, circle] (8) at (7,0) {$x_3$};

\path[draw,thick,->] (1) -- (4);
\path[draw,thick,->] (4) -- (2);
\path[draw,thick,->] (2) -- (1);
\path[draw,thick,->] (3) -- (4);
\path[draw,thick,->] (2) -- (5);
\path[draw,thick,->] (4) -- (7);
\path[draw,thick,->] (5) -- (6);
\path[draw,thick,->] (5) -- (8);
\path[draw,thick,->] (8) -- (7);
\path[draw,thick,->] (7) -- (5);
\end{tikzpicture}
\end{center}
And the graph for the formula $L_2$ is:
\\
\begin{center}
\begin{tikzpicture}[scale=1.0,minimum size=2pt]
\node[draw, circle] (1) at (1,2) {$x_3$};
\node[draw, circle] (2) at (3,2) {$x_2$};
\node[draw, circle, inner sep=1.3pt] (3) at (5,2) {$\lnot x_2$};
\node[draw, circle, inner sep=1.3pt] (4) at (7,2) {$\lnot x_3$};
\node[draw, circle, inner sep=1.3pt] (5) at (4,3.5) {$\lnot x_1$};
\node[draw, circle] (6) at (4,0.5) {$x_1$};

\path[draw,thick,->] (1) -- (5);
\path[draw,thick,->] (4) -- (5);
\path[draw,thick,->] (6) -- (1);
\path[draw,thick,->] (6) -- (4);
\path[draw,thick,->] (5) -- (2);
\path[draw,thick,->] (5) -- (3);
\path[draw,thick,->] (2) -- (6);
\path[draw,thick,->] (3) -- (6);
\end{tikzpicture}
\end{center}

The structure of the graph tells us whether
it is possible to assign the values
of the variables so
that the formula is true.
It turns out that this can be done
exactly when there are no nodes
$x_i$ and $\lnot x_i$ such that
both nodes belong to the
same strongly connected component.
If there are such nodes,
the graph contains
a path from $x_i$ to $\lnot x_i$
and also a path from $\lnot x_i$ to $x_i$,
so both $x_i$ and $\lnot x_i$ should be true,
which is not possible.

In the graph of the formula $L_1$
there are no nodes $x_i$ and $\lnot x_i$
such that both nodes
belong to the same strongly connected component,
so a solution exists.
In the graph of the formula $L_2$
all nodes belong to the same strongly connected component,
so a solution does not exist.

If a solution exists, the values for the variables
can be found by going through the nodes of the
component graph in a reverse topological sort order.
At each step, we process a component
that does not contain edges that lead to an
unprocessed component.
If the variables in the component
have not been assigned values,
their values will be determined
according to the values in the component,
and if they already have values,
they remain unchanged.
The process continues until each variable
has been assigned a value.

The component graph for the formula $L_1$ is as follows:
\begin{center}
\begin{tikzpicture}[scale=1.0]
\node[draw, circle] (1) at (0,0) {$A$};
\node[draw, circle] (2) at (2,0) {$B$};
\node[draw, circle] (3) at (4,0) {$C$};
\node[draw, circle] (4) at (6,0) {$D$};

\path[draw,thick,->] (1) -- (2);
\path[draw,thick,->] (2) -- (3);
\path[draw,thick,->] (3) -- (4);
\end{tikzpicture}
\end{center}

The components are
$A = \{\lnot x_4\}$,
$B = \{x_1, x_2, \lnot x_3\}$,
$C = \{\lnot x_1, \lnot x_2, x_3\}$ and
$D = \{x_4\}$.
When constructing the solution,
we first process the component $D$
where $x_4$ becomes true.
After this, we process the component $C$
where $x_1$ and $x_2$ become false
and $x_3$ becomes true.
All variables have been assigned values,
so the remaining components $A$ and $B$
do not change the variables.

Note that this method works, because the
graph has a special structure:
if there are paths from node $x_i$ to node $x_j$
and from node $x_j$ to node $\lnot x_j$,
then node $x_i$ never becomes true.
The reason for this is that there is also
a path from node $\lnot x_j$ to node $\lnot x_i$,
and both $x_i$ and $x_j$ become false.

\index{3SAT problem}

A more difficult problem is the \key{3SAT problem},
where each part of the formula is of the form
$(a_i \lor b_i \lor c_i)$.
This problem is NP-hard, so no efficient algorithm
for solving the problem is known.
\chapter{Paths and circuits}

This chapter focuses on two types of paths in graphs:
\begin{itemize}
\item An \key{Eulerian path} is a path that
goes through each edge exactly once.
\item A \key{Hamiltonian path} is a path
that visits each node exactly once.
\end{itemize}

While Eulerian and Hamiltonian paths look like
similar concepts at first glance,
the computational problems related to them
are very different.
It turns out that there is a simple rule that
determines whether a graph contains an Eulerian path,
and there is also an efficient algorithm to
find such a path if it exists.
In contrast, checking the existence of a Hamiltonian path is an NP-hard
problem, and no efficient algorithm is known for solving the problem.

\section{Eulerian paths}

\index{Eulerian path}

An \key{Eulerian path}\footnote{L. Euler studied such paths in 1736
when he solved the famous Königsberg bridge problem.
This was the birth of graph theory.} is a path
that goes exactly once through each edge of the graph.
For example, the graph
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,5) {$1$};
\node[draw, circle] (2) at (3,5) {$2$};
\node[draw, circle] (3) at (5,4) {$3$};
\node[draw, circle] (4) at (1,3) {$4$};
\node[draw, circle] (5) at (3,3) {$5$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (2) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (5);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (4) -- (5);
\end{tikzpicture}
\end{center}
has an Eulerian path from node 2 to node 5:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,5) {$1$};
\node[draw, circle] (2) at (3,5) {$2$};
\node[draw, circle] (3) at (5,4) {$3$};
\node[draw, circle] (4) at (1,3) {$4$};
\node[draw, circle] (5) at (3,3) {$5$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (2) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (5);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (4) -- (5);

\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]north:1.}] {} (1);
\path[draw=red,thick,->,line width=2pt] (1) -- node[font=\small,label={[red]left:2.}] {} (4);
\path[draw=red,thick,->,line width=2pt] (4) -- node[font=\small,label={[red]south:3.}] {} (5);
\path[draw=red,thick,->,line width=2pt] (5) -- node[font=\small,label={[red]left:4.}] {} (2);
\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]north:5.}] {} (3);
\path[draw=red,thick,->,line width=2pt] (3) -- node[font=\small,label={[red]south:6.}] {} (5);
\end{tikzpicture}
\end{center}
\index{Eulerian circuit}
An \key{Eulerian circuit}
is an Eulerian path that starts and ends
at the same node.
For example, the graph
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,5) {$1$};
\node[draw, circle] (2) at (3,5) {$2$};
\node[draw, circle] (3) at (5,4) {$3$};
\node[draw, circle] (4) at (1,3) {$4$};
\node[draw, circle] (5) at (3,3) {$5$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (2) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (5);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (2) -- (4);
\end{tikzpicture}
\end{center}
has an Eulerian circuit that starts and ends at node 1:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,5) {$1$};
\node[draw, circle] (2) at (3,5) {$2$};
\node[draw, circle] (3) at (5,4) {$3$};
\node[draw, circle] (4) at (1,3) {$4$};
\node[draw, circle] (5) at (3,3) {$5$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (2) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (5);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (2) -- (4);

\path[draw=red,thick,->,line width=2pt] (1) -- node[font=\small,label={[red]left:1.}] {} (4);
\path[draw=red,thick,->,line width=2pt] (4) -- node[font=\small,label={[red]south:2.}] {} (2);
\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]right:3.}] {} (5);
\path[draw=red,thick,->,line width=2pt] (5) -- node[font=\small,label={[red]south:4.}] {} (3);
\path[draw=red,thick,->,line width=2pt] (3) -- node[font=\small,label={[red]north:5.}] {} (2);
\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]north:6.}] {} (1);
\end{tikzpicture}
\end{center}

\subsubsection{Existence}

The existence of Eulerian paths and circuits
depends on the degrees of the nodes.
First, an undirected graph has an Eulerian path
exactly when all the edges
belong to the same connected component and
\begin{itemize}
\item the degree of each node is even \emph{or}
\item the degree of exactly two nodes is odd,
and the degree of all other nodes is even.
\end{itemize}

In the first case, each Eulerian path is also an Eulerian circuit.
In the second case, the odd-degree nodes are the starting
and ending nodes of an Eulerian path which is not an Eulerian circuit.

\begin{samepage}
For example, in the graph
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,5) {$1$};
\node[draw, circle] (2) at (3,5) {$2$};
\node[draw, circle] (3) at (5,4) {$3$};
\node[draw, circle] (4) at (1,3) {$4$};
\node[draw, circle] (5) at (3,3) {$5$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (2) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (5);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (4) -- (5);
\end{tikzpicture}
\end{center}
\end{samepage}
nodes 1, 3 and 4 have a degree of 2,
and nodes 2 and 5 have a degree of 3.
Exactly two nodes have an odd degree,
so there is an Eulerian path between nodes 2 and 5,
but the graph does not contain an Eulerian circuit.
|
||||
|
||||
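Once we know that all edges are in the same connected component
(which can be verified with a depth-first search), the degree
condition above is easy to check directly. The following sketch is
our own helper, not code from the book; it counts the odd-degree
nodes of an undirected adjacency list:

\begin{lstlisting}
// counts odd-degree nodes: a connected undirected graph has an
// Eulerian path exactly when the count is 0 (then the path is a
// circuit) or 2
bool eulerDegreesOk(vector<vector<int>>& adj) {
    int odd = 0;
    for (auto& nb : adj) {
        if (nb.size()%2 == 1) odd++;
    }
    return odd == 0 || odd == 2;
}
\end{lstlisting}

For the graph above, only nodes 2 and 5 have odd degree, so the
function returns \texttt{true}.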
In a directed graph,
we focus on indegrees and outdegrees
of the nodes.
A directed graph contains an Eulerian path
exactly when all the edges belong to the same
connected component and
\begin{itemize}
\item in each node, the indegree equals the outdegree, \emph{or}
\item in one node, the indegree is one larger than the outdegree,
in another node, the outdegree is one larger than the indegree,
and in all other nodes, the indegree equals the outdegree.
\end{itemize}

In the first case, each Eulerian path
is also an Eulerian circuit,
and in the second case, the graph contains an Eulerian path
that begins at the node whose outdegree is larger
and ends at the node whose indegree is larger.

For example, in the graph
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,5) {$1$};
\node[draw, circle] (2) at (3,5) {$2$};
\node[draw, circle] (3) at (5,4) {$3$};
\node[draw, circle] (4) at (1,3) {$4$};
\node[draw, circle] (5) at (3,3) {$5$};

\path[draw,thick,->,>=latex] (1) -- (2);
\path[draw,thick,->,>=latex] (2) -- (3);
\path[draw,thick,->,>=latex] (4) -- (1);
\path[draw,thick,->,>=latex] (3) -- (5);
\path[draw,thick,->,>=latex] (2) -- (5);
\path[draw,thick,->,>=latex] (5) -- (4);
\end{tikzpicture}
\end{center}
nodes 1, 3 and 4 have both indegree 1 and outdegree 1,
node 2 has indegree 1 and outdegree 2,
and node 5 has indegree 2 and outdegree 1.
Hence, the graph contains an Eulerian path
from node 2 to node 5:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,5) {$1$};
\node[draw, circle] (2) at (3,5) {$2$};
\node[draw, circle] (3) at (5,4) {$3$};
\node[draw, circle] (4) at (1,3) {$4$};
\node[draw, circle] (5) at (3,3) {$5$};

\path[draw,thick,->,>=latex] (1) -- (2);
\path[draw,thick,->,>=latex] (2) -- (3);
\path[draw,thick,->,>=latex] (4) -- (1);
\path[draw,thick,->,>=latex] (3) -- (5);
\path[draw,thick,->,>=latex] (2) -- (5);
\path[draw,thick,->,>=latex] (5) -- (4);

\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]north:1.}] {} (3);
\path[draw=red,thick,->,line width=2pt] (3) -- node[font=\small,label={[red]south:2.}] {} (5);
\path[draw=red,thick,->,line width=2pt] (5) -- node[font=\small,label={[red]south:3.}] {} (4);
\path[draw=red,thick,->,line width=2pt] (4) -- node[font=\small,label={[red]left:4.}] {} (1);
\path[draw=red,thick,->,line width=2pt] (1) -- node[font=\small,label={[red]north:5.}] {} (2);
\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]left:6.}] {} (5);
\end{tikzpicture}
\end{center}
\subsubsection{Hierholzer's algorithm}

\index{Hierholzer's algorithm}

\key{Hierholzer's algorithm}\footnote{The algorithm was published
in 1873 after Hierholzer's death \cite{hie73}.} is an efficient
method for constructing
an Eulerian circuit.
The algorithm consists of several rounds,
each of which adds new edges to the circuit.
Of course, we assume that the graph contains
an Eulerian circuit; otherwise Hierholzer's
algorithm cannot find it.

First, the algorithm constructs a circuit that contains
some (not necessarily all) of the edges of the graph.
After this, the algorithm extends the circuit
step by step by adding subcircuits to it.
The process continues until all edges have been added
to the circuit.

The algorithm extends the circuit by always finding
a node $x$ that belongs to the circuit but has
an outgoing edge that is not included in the circuit.
The algorithm constructs a new path from node $x$
that only contains edges that are not yet in the circuit.
Sooner or later,
the path will return to node $x$,
which creates a subcircuit.

If the graph only contains an Eulerian path,
we can still use Hierholzer's algorithm
to find it by adding an extra edge to the graph
and removing the edge after the circuit
has been constructed.
For example, in an undirected graph,
we add the extra edge between the two
odd-degree nodes.

Next we will see how Hierholzer's algorithm
constructs an Eulerian circuit for an undirected graph.
\subsubsection{Example}

\begin{samepage}
Let us consider the following graph:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (3,5) {$1$};
\node[draw, circle] (2) at (1,3) {$2$};
\node[draw, circle] (3) at (3,3) {$3$};
\node[draw, circle] (4) at (5,3) {$4$};
\node[draw, circle] (5) at (1,1) {$5$};
\node[draw, circle] (6) at (3,1) {$6$};
\node[draw, circle] (7) at (5,1) {$7$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (2) -- (3);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (2) -- (6);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (4) -- (7);
\path[draw,thick,-] (5) -- (6);
\path[draw,thick,-] (6) -- (7);
\end{tikzpicture}
\end{center}
\end{samepage}

\begin{samepage}
Suppose that the algorithm first creates a circuit
that begins at node 1.
A possible circuit is
$1 \rightarrow 2 \rightarrow 3 \rightarrow 1$:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (3,5) {$1$};
\node[draw, circle] (2) at (1,3) {$2$};
\node[draw, circle] (3) at (3,3) {$3$};
\node[draw, circle] (4) at (5,3) {$4$};
\node[draw, circle] (5) at (1,1) {$5$};
\node[draw, circle] (6) at (3,1) {$6$};
\node[draw, circle] (7) at (5,1) {$7$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (2) -- (3);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (2) -- (6);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (4) -- (7);
\path[draw,thick,-] (5) -- (6);
\path[draw,thick,-] (6) -- (7);

\path[draw=red,thick,->,line width=2pt] (1) -- node[font=\small,label={[red]north:1.}] {} (2);
\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]north:2.}] {} (3);
\path[draw=red,thick,->,line width=2pt] (3) -- node[font=\small,label={[red]east:3.}] {} (1);
\end{tikzpicture}
\end{center}
\end{samepage}
After this, the algorithm adds
the subcircuit
$2 \rightarrow 5 \rightarrow 6 \rightarrow 2$
to the circuit:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (3,5) {$1$};
\node[draw, circle] (2) at (1,3) {$2$};
\node[draw, circle] (3) at (3,3) {$3$};
\node[draw, circle] (4) at (5,3) {$4$};
\node[draw, circle] (5) at (1,1) {$5$};
\node[draw, circle] (6) at (3,1) {$6$};
\node[draw, circle] (7) at (5,1) {$7$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (2) -- (3);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (2) -- (6);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (4) -- (7);
\path[draw,thick,-] (5) -- (6);
\path[draw,thick,-] (6) -- (7);

\path[draw=red,thick,->,line width=2pt] (1) -- node[font=\small,label={[red]north:1.}] {} (2);
\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]west:2.}] {} (5);
\path[draw=red,thick,->,line width=2pt] (5) -- node[font=\small,label={[red]south:3.}] {} (6);
\path[draw=red,thick,->,line width=2pt] (6) -- node[font=\small,label={[red]north:4.}] {} (2);
\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]north:5.}] {} (3);
\path[draw=red,thick,->,line width=2pt] (3) -- node[font=\small,label={[red]east:6.}] {} (1);
\end{tikzpicture}
\end{center}
Finally, the algorithm adds the subcircuit
$6 \rightarrow 3 \rightarrow 4 \rightarrow 7 \rightarrow 6$
to the circuit:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (3,5) {$1$};
\node[draw, circle] (2) at (1,3) {$2$};
\node[draw, circle] (3) at (3,3) {$3$};
\node[draw, circle] (4) at (5,3) {$4$};
\node[draw, circle] (5) at (1,1) {$5$};
\node[draw, circle] (6) at (3,1) {$6$};
\node[draw, circle] (7) at (5,1) {$7$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (2) -- (3);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (2) -- (6);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (4) -- (7);
\path[draw,thick,-] (5) -- (6);
\path[draw,thick,-] (6) -- (7);

\path[draw=red,thick,->,line width=2pt] (1) -- node[font=\small,label={[red]north:1.}] {} (2);
\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]west:2.}] {} (5);
\path[draw=red,thick,->,line width=2pt] (5) -- node[font=\small,label={[red]south:3.}] {} (6);
\path[draw=red,thick,->,line width=2pt] (6) -- node[font=\small,label={[red]east:4.}] {} (3);
\path[draw=red,thick,->,line width=2pt] (3) -- node[font=\small,label={[red]north:5.}] {} (4);
\path[draw=red,thick,->,line width=2pt] (4) -- node[font=\small,label={[red]east:6.}] {} (7);
\path[draw=red,thick,->,line width=2pt] (7) -- node[font=\small,label={[red]south:7.}] {} (6);
\path[draw=red,thick,->,line width=2pt] (6) -- node[font=\small,label={[red]right:8.}] {} (2);
\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]north:9.}] {} (3);
\path[draw=red,thick,->,line width=2pt] (3) -- node[font=\small,label={[red]east:10.}] {} (1);
\end{tikzpicture}
\end{center}
Now all edges are included in the circuit,
so we have successfully constructed an Eulerian circuit.
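The rounds of Hierholzer's algorithm can be realized compactly with
a stack. The following sketch is our own implementation (the book
gives no code here, and the representation is ours): each undirected
edge has an id, and \texttt{adj}[$x$] contains pairs (neighbor, edge
id), so that an edge is traversed only once even though it appears
in two adjacency lists.

\begin{lstlisting}
// returns an Eulerian circuit starting at node start, assuming
// that the graph contains one; the circuit comes out in reverse
// order, which is also a valid circuit
vector<int> eulerCircuit(int start,
                         vector<vector<pair<int,int>>> adj,
                         vector<bool> used) {
    vector<int> circuit, stack = {start};
    while (!stack.empty()) {
        int x = stack.back();
        if (adj[x].empty()) {
            // all edges of x processed: x joins the circuit
            circuit.push_back(x);
            stack.pop_back();
        } else {
            auto [y, id] = adj[x].back();
            adj[x].pop_back();
            if (!used[id]) {
                used[id] = true;
                stack.push_back(y);
            }
        }
    }
    return circuit;
}
\end{lstlisting}

For the example graph above (10 edges), the returned list has 11
nodes and begins and ends at node 1.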
\section{Hamiltonian paths}

\index{Hamiltonian path}

A \key{Hamiltonian path}
%\footnote{W. R. Hamilton (1805--1865) was an Irish mathematician.}
is a path
that visits each node of the graph exactly once.
For example, the graph
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,5) {$1$};
\node[draw, circle] (2) at (3,5) {$2$};
\node[draw, circle] (3) at (5,4) {$3$};
\node[draw, circle] (4) at (1,3) {$4$};
\node[draw, circle] (5) at (3,3) {$5$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (2) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (5);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (4) -- (5);
\end{tikzpicture}
\end{center}
contains a Hamiltonian path from node 1 to node 3:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,5) {$1$};
\node[draw, circle] (2) at (3,5) {$2$};
\node[draw, circle] (3) at (5,4) {$3$};
\node[draw, circle] (4) at (1,3) {$4$};
\node[draw, circle] (5) at (3,3) {$5$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (2) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (5);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (4) -- (5);

\path[draw=red,thick,->,line width=2pt] (1) -- node[font=\small,label={[red]left:1.}] {} (4);
\path[draw=red,thick,->,line width=2pt] (4) -- node[font=\small,label={[red]south:2.}] {} (5);
\path[draw=red,thick,->,line width=2pt] (5) -- node[font=\small,label={[red]left:3.}] {} (2);
\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]north:4.}] {} (3);
\end{tikzpicture}
\end{center}

\index{Hamiltonian circuit}

If a Hamiltonian path begins and ends at the same node,
it is called a \key{Hamiltonian circuit}.
The graph above also has a Hamiltonian circuit
that begins and ends at node 1:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,5) {$1$};
\node[draw, circle] (2) at (3,5) {$2$};
\node[draw, circle] (3) at (5,4) {$3$};
\node[draw, circle] (4) at (1,3) {$4$};
\node[draw, circle] (5) at (3,3) {$5$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (2) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (5);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (4) -- (5);

\path[draw=red,thick,->,line width=2pt] (1) -- node[font=\small,label={[red]north:1.}] {} (2);
\path[draw=red,thick,->,line width=2pt] (2) -- node[font=\small,label={[red]north:2.}] {} (3);
\path[draw=red,thick,->,line width=2pt] (3) -- node[font=\small,label={[red]south:3.}] {} (5);
\path[draw=red,thick,->,line width=2pt] (5) -- node[font=\small,label={[red]south:4.}] {} (4);
\path[draw=red,thick,->,line width=2pt] (4) -- node[font=\small,label={[red]left:5.}] {} (1);
\end{tikzpicture}
\end{center}
\subsubsection{Existence}

No efficient method is known for testing if a graph
contains a Hamiltonian path, and the problem is NP-hard.
Still, in some special cases, we can be certain
that a graph contains a Hamiltonian path.

A simple observation is that if the graph is complete,
i.e., there is an edge between all pairs of nodes,
it also contains a Hamiltonian path.
Stronger results have also been achieved:

\begin{itemize}
\item
\index{Dirac's theorem}
\key{Dirac's theorem}: %\cite{dir52}
If the degree of each node is at least $n/2$,
the graph contains a Hamiltonian path.
\item
\index{Ore's theorem}
\key{Ore's theorem}: %\cite{ore60}
If the sum of degrees of each non-adjacent pair of nodes
is at least $n$,
the graph contains a Hamiltonian path.
\end{itemize}

A common property in these theorems and other results is
that they guarantee the existence of a Hamiltonian path
if the graph has \emph{a large number} of edges.
This makes sense, because the more edges the graph contains,
the more possibilities there are to construct a Hamiltonian path.
\subsubsection{Construction}

Since there is no efficient way to check if a Hamiltonian
path exists, it is clear that there is also no method
for efficiently constructing such a path, because otherwise
we could simply try to construct a path and see
whether one exists.

A simple way to search for a Hamiltonian path is
to use a backtracking algorithm that goes through all
possible ways to construct the path.
The time complexity of such an algorithm is at least $O(n!)$,
because there are $n!$ different ways to choose the order of $n$ nodes.

A more efficient solution is based on dynamic programming
(see Chapter 10.5).
The idea is to calculate values
of a function $\texttt{possible}(S,x)$,
where $S$ is a subset of nodes and $x$
is one of the nodes.
The function indicates whether there is a Hamiltonian path
that visits the nodes of $S$ and ends at node $x$.
It is possible to implement this solution in $O(2^n n^2)$ time.
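The dynamic programming idea can be sketched as follows. This is
our own code (the book gives no implementation here); nodes are
zero-based and the subset $S$ is encoded as a bitmask:

\begin{lstlisting}
// reach[S][x] is true if some path visits exactly the nodes of
// the subset S and ends at node x; total time O(2^n n^2)
bool hamiltonianPath(vector<vector<bool>>& adj) {
    int n = adj.size();
    vector<vector<bool>> reach(1<<n, vector<bool>(n, false));
    for (int x = 0; x < n; x++) reach[1<<x][x] = true;
    for (int S = 1; S < (1<<n); S++) {
        for (int x = 0; x < n; x++) {
            if (!reach[S][x]) continue;
            for (int y = 0; y < n; y++) {
                // extend the path by an edge to an unvisited node
                if (!(S&(1<<y)) && adj[x][y]) {
                    reach[S|(1<<y)][y] = true;
                }
            }
        }
    }
    for (int x = 0; x < n; x++) {
        if (reach[(1<<n)-1][x]) return true;
    }
    return false;
}
\end{lstlisting}

For the example graph of this section, the function returns
\texttt{true}, because the path
$1 \rightarrow 4 \rightarrow 5 \rightarrow 2 \rightarrow 3$ exists.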
\section{De Bruijn sequences}

\index{De Bruijn sequence}

A \key{De Bruijn sequence}
is a string that contains
every string of length $n$
exactly once as a substring, for a fixed
alphabet of $k$ characters.
The length of such a string is
$k^n+n-1$ characters.
For example, when $n=3$ and $k=2$,
an example of a De Bruijn sequence is
\[0001011100.\]
The substrings of this string are all
combinations of three bits:
000, 001, 010, 011, 100, 101, 110 and 111.

It turns out that each De Bruijn sequence
corresponds to an Eulerian path in a graph.
The idea is to construct a graph where
each node contains a string of $n-1$ characters
and each edge adds one character to the string.
The following graph corresponds to the above scenario:

\begin{center}
\begin{tikzpicture}[scale=0.8]
\node[draw, circle] (00) at (-3,0) {00};
\node[draw, circle] (11) at (3,0) {11};
\node[draw, circle] (01) at (0,2) {01};
\node[draw, circle] (10) at (0,-2) {10};

\path[draw,thick,->] (00) edge [bend left=20] node[font=\small,label=1] {} (01);
\path[draw,thick,->] (01) edge [bend left=20] node[font=\small,label=1] {} (11);
\path[draw,thick,->] (11) edge [bend left=20] node[font=\small,label=below:0] {} (10);
\path[draw,thick,->] (10) edge [bend left=20] node[font=\small,label=below:0] {} (00);

\path[draw,thick,->] (01) edge [bend left=30] node[font=\small,label=right:0] {} (10);
\path[draw,thick,->] (10) edge [bend left=30] node[font=\small,label=left:1] {} (01);

\path[draw,thick,-] (00) edge [loop left] node[font=\small,label=below:0] {} (00);
\path[draw,thick,-] (11) edge [loop right] node[font=\small,label=below:1] {} (11);
\end{tikzpicture}
\end{center}

An Eulerian path in this graph corresponds to a string
that contains all strings of length $n$.
The string contains the characters of the starting node
and all characters of the edges.
The starting node has $n-1$ characters
and there are $k^n$ characters in the edges,
so the length of the string is $k^n+n-1$.
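The construction can be turned into code by running Hierholzer's
algorithm on this graph directly. The following sketch is our own,
for the binary alphabet $k=2$ and $n \ge 2$: each node is an
$(n-1)$-bit integer, and following the edge with bit $b$ from node
$v$ leads to the node $(2v+b) \bmod 2^{n-1}$.

\begin{lstlisting}
string deBruijn(int n) {
    int m = 1<<(n-1);           // number of nodes
    vector<int> cnt(m, 0);      // outgoing edges used per node
    vector<int> stack = {0}, circuit;
    while (!stack.empty()) {
        int v = stack.back();
        if (cnt[v] == 2) {      // both outgoing edges used
            circuit.push_back(v);
            stack.pop_back();
        } else {
            int bit = cnt[v]++;
            stack.push_back(((v<<1)|bit)&(m-1));
        }
    }
    // circuit is the Eulerian circuit in reverse order;
    // the edge into a node contributes the node's last bit
    string s(n-1, '0');         // characters of the start node
    for (int i = (int)circuit.size()-2; i >= 0; i--) {
        s += char('0'+(circuit[i]&1));
    }
    return s;
}
\end{lstlisting}

For example, \texttt{deBruijn(3)} returns the string $0001011100$
used as an example above.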
\section{Knight's tours}

\index{knight's tour}

A \key{knight's tour} is a sequence of moves
of a knight on an $n \times n$ chessboard
following the rules of chess such that the knight
visits each square exactly once.
A knight's tour is called a \emph{closed} tour
if the knight eventually returns to the starting square;
otherwise it is called an \emph{open} tour.

For example, here is an open knight's tour on a $5 \times 5$ board:

\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (5,5);
\node at (0.5,4.5) {$1$};
\node at (1.5,4.5) {$4$};
\node at (2.5,4.5) {$11$};
\node at (3.5,4.5) {$16$};
\node at (4.5,4.5) {$25$};
\node at (0.5,3.5) {$12$};
\node at (1.5,3.5) {$17$};
\node at (2.5,3.5) {$2$};
\node at (3.5,3.5) {$5$};
\node at (4.5,3.5) {$10$};
\node at (0.5,2.5) {$3$};
\node at (1.5,2.5) {$20$};
\node at (2.5,2.5) {$7$};
\node at (3.5,2.5) {$24$};
\node at (4.5,2.5) {$15$};
\node at (0.5,1.5) {$18$};
\node at (1.5,1.5) {$13$};
\node at (2.5,1.5) {$22$};
\node at (3.5,1.5) {$9$};
\node at (4.5,1.5) {$6$};
\node at (0.5,0.5) {$21$};
\node at (1.5,0.5) {$8$};
\node at (2.5,0.5) {$19$};
\node at (3.5,0.5) {$14$};
\node at (4.5,0.5) {$23$};
\end{tikzpicture}
\end{center}

A knight's tour corresponds to a Hamiltonian path in a graph
whose nodes represent the squares of the board,
and two nodes are connected with an edge if a knight
can move between the squares according to the rules of chess.

A natural way to construct a knight's tour is to use backtracking.
The search can be made more efficient by using
\emph{heuristics} that attempt to guide the knight so that
a complete tour will be found quickly.

\subsubsection{Warnsdorf's rule}

\index{heuristic}
\index{Warnsdorf's rule}

\key{Warnsdorf's rule} is a simple and effective heuristic
for finding a knight's tour\footnote{This heuristic was proposed
in Warnsdorf's book \cite{war23} in 1823. There are
also polynomial algorithms for finding knight's tours
\cite{par97}, but they are more complicated.}.
Using the rule, it is possible to efficiently construct a tour
even on a large board.
The idea is to always move the knight so that it ends up
in a square where the number of possible moves is as
\emph{small} as possible.

For example, in the following situation, there are five
possible squares to which the knight can move (squares $a \ldots e$):
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (5,5);
\node at (0.5,4.5) {$1$};
\node at (2.5,3.5) {$2$};
\node at (4.5,4.5) {$a$};
\node at (0.5,2.5) {$b$};
\node at (4.5,2.5) {$e$};
\node at (1.5,1.5) {$c$};
\node at (3.5,1.5) {$d$};
\end{tikzpicture}
\end{center}
In this situation, Warnsdorf's rule moves the knight to square $a$,
because after this choice, there is only a single possible move.
The other choices would move the knight to squares where
there would be three moves available.
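The rule can be implemented as a greedy algorithm. The following
sketch is our own (board representation and names are ours); since
Warnsdorf's rule is only a heuristic, the code reports failure if
the knight gets stuck before completing the tour:

\begin{lstlisting}
int dx[] = {1,1,-1,-1,2,2,-2,-2};
int dy[] = {2,-2,2,-2,1,-1,1,-1};

// number of moves from (x,y) to unvisited squares
int moves(int x, int y, int n, vector<vector<int>>& board) {
    int c = 0;
    for (int k = 0; k < 8; k++) {
        int nx = x+dx[k], ny = y+dy[k];
        if (nx >= 0 && nx < n && ny >= 0 && ny < n &&
            board[nx][ny] == 0) c++;
    }
    return c;
}

// attempts a tour from (sx,sy); board[x][y] becomes the number
// of the visit, as in the figure above
bool knightTour(int n, int sx, int sy, vector<vector<int>>& board) {
    board.assign(n, vector<int>(n, 0));
    int x = sx, y = sy;
    board[x][y] = 1;
    for (int step = 2; step <= n*n; step++) {
        int best = -1, bestCount = 9;
        for (int k = 0; k < 8; k++) {
            int nx = x+dx[k], ny = y+dy[k];
            if (nx >= 0 && nx < n && ny >= 0 && ny < n &&
                board[nx][ny] == 0 &&
                moves(nx, ny, n, board) < bestCount) {
                bestCount = moves(nx, ny, n, board);
                best = k;
            }
        }
        if (best == -1) return false;  // the heuristic got stuck
        x += dx[best]; y += dy[best];
        board[x][y] = step;
    }
    return true;
}
\end{lstlisting}

Ties are broken here by direction order; more careful tie-breaking
rules make the heuristic succeed on larger boards.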
\chapter{Number theory}

\index{number theory}

\key{Number theory} is a branch of mathematics
that studies integers.
Number theory is a fascinating field,
because many questions involving integers
are very difficult to solve even if they
seem simple at first glance.

As an example, consider the following equation:
\[x^3 + y^3 + z^3 = 33\]
It is easy to find three real numbers $x$, $y$ and $z$
that satisfy the equation.
For example, we can choose
\[
\begin{array}{lcl}
x = 3, \\
y = \sqrt[3]{3}, \\
z = \sqrt[3]{3}.\\
\end{array}
\]
However, it was a long-standing open problem in number theory
whether there are three
\emph{integers} $x$, $y$ and $z$
that satisfy the equation \cite{bec07};
an integer solution was eventually found in 2019.

In this chapter, we will focus on basic concepts
and algorithms in number theory.
Throughout the chapter, we will assume that all numbers
are integers, if not otherwise stated.
\section{Primes and factors}

\index{divisibility}
\index{factor}
\index{divisor}

A number $a$ is called a \key{factor} or a \key{divisor} of a number $b$
if $a$ divides $b$.
If $a$ is a factor of $b$,
we write $a \mid b$, and otherwise we write $a \nmid b$.
For example, the factors of 24 are
1, 2, 3, 4, 6, 8, 12 and 24.

\index{prime}
\index{prime decomposition}

A number $n>1$ is a \key{prime}
if its only positive factors are 1 and $n$.
For example, 7, 19 and 41 are primes,
but 35 is not a prime, because $5 \cdot 7 = 35$.
For every number $n>1$, there is a unique
\key{prime factorization}
\[ n = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k},\]
where $p_1,p_2,\ldots,p_k$ are distinct primes and
$\alpha_1,\alpha_2,\ldots,\alpha_k$ are positive integers.
For example, the prime factorization of 84 is
\[84 = 2^2 \cdot 3^1 \cdot 7^1.\]

The \key{number of factors} of a number $n$ is
\[\tau(n)=\prod_{i=1}^k (\alpha_i+1),\]
because for each prime $p_i$, there are
$\alpha_i+1$ ways to choose how many times
it appears in a factor.
For example, the number of factors
of 84 is
$\tau(84)=3 \cdot 2 \cdot 2 = 12$.
The factors are
1, 2, 3, 4, 6, 7, 12, 14, 21, 28, 42 and 84.

The \key{sum of factors} of $n$ is
\[\sigma(n)=\prod_{i=1}^k (1+p_i+\ldots+p_i^{\alpha_i}) = \prod_{i=1}^k \frac{p_i^{\alpha_i+1}-1}{p_i-1},\]
where the latter formula is based on the geometric progression formula.
For example, the sum of factors of 84 is
\[\sigma(84)=\frac{2^3-1}{2-1} \cdot \frac{3^2-1}{3-1} \cdot \frac{7^2-1}{7-1} = 7 \cdot 4 \cdot 8 = 224.\]

The \key{product of factors} of $n$ is
\[\mu(n)=n^{\tau(n)/2},\]
because the factors can be grouped into $\tau(n)/2$ pairs,
each with product $n$.
For example, the factors of 84
produce the pairs
$1 \cdot 84$, $2 \cdot 42$, $3 \cdot 28$, etc.,
and the product of the factors is $\mu(84)=84^6=351298031616$.

\index{perfect number}

A number $n$ is called a \key{perfect number} if $n=\sigma(n)-n$,
i.e., $n$ equals the sum of its factors
between $1$ and $n-1$.
For example, 28 is a perfect number,
because $28=1+2+4+7+14$.
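The formulas for $\tau(n)$ and $\sigma(n)$ can be evaluated while
factorizing $n$ by trial division. The following sketch is our own
combination of the factorization loop used later in this chapter
and the formulas above:

\begin{lstlisting}
// computes tau(n) and sigma(n) from the prime factorization
pair<long long, long long> tauSigma(long long n) {
    long long tau = 1, sigma = 1;
    for (long long x = 2; x*x <= n; x++) {
        if (n%x == 0) {
            long long alpha = 0, power = 1, sum = 1;
            while (n%x == 0) {
                n /= x; alpha++; power *= x; sum += power;
            }
            tau *= alpha+1;   // alpha_i+1 choices for the exponent
            sigma *= sum;     // 1+p_i+...+p_i^alpha_i
        }
    }
    if (n > 1) { tau *= 2; sigma *= 1+n; }
    return {tau, sigma};
}
\end{lstlisting}

For example, \texttt{tauSigma(84)} returns $(12, 224)$, matching
the values computed above.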
\subsubsection{Number of primes}

It is easy to show that there are infinitely many primes.
If the number of primes were finite,
we could construct a set $P=\{p_1,p_2,\ldots,p_n\}$
that would contain all the primes.
For example, $p_1=2$, $p_2=3$, $p_3=5$, and so on.
However, using $P$, we could form the number
\[p_1 p_2 \cdots p_n+1,\]
which is larger than all elements of $P$ but not divisible
by any of them, so its prime factors would be primes
outside $P$.
This is a contradiction, and the number of primes
has to be infinite.
\subsubsection{Density of primes}

The density of primes describes how often primes occur
among the integers.
Let $\pi(n)$ denote the number of primes between
$1$ and $n$. For example, $\pi(10)=4$, because
there are 4 primes between $1$ and $10$: 2, 3, 5 and 7.

It is possible to show that
\[\pi(n) \approx \frac{n}{\ln n},\]
which means that primes are quite frequent.
For example, the number of primes between
$1$ and $10^6$ is $\pi(10^6)=78498$,
and $10^6 / \ln 10^6 \approx 72382$.
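The approximation can be checked on a small scale with trial
division; this check is our own addition, not part of the book:

\begin{lstlisting}
// counts the primes up to n by trial division
int primesUpTo(int n) {
    int count = 0;
    for (int i = 2; i <= n; i++) {
        bool ok = true;
        for (int x = 2; x*x <= i; x++) {
            if (i%x == 0) {ok = false; break;}
        }
        if (ok) count++;
    }
    return count;
}
\end{lstlisting}

For example, $\pi(10^4)=1229$, while the approximation gives
$10^4/\ln 10^4 \approx 1086$.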
\subsubsection{Conjectures}

There are many \emph{conjectures} involving primes.
They are widely believed to be true,
but nobody has been able to prove them.
For example, the following conjectures are famous:

\begin{itemize}
\index{Goldbach's conjecture}
\item \key{Goldbach's conjecture}:
Each even integer $n>2$ can be represented as a
sum $n=a+b$ so that both $a$ and $b$ are primes.
\index{twin prime}
\item \key{Twin prime conjecture}:
There are infinitely many pairs
of the form $\{p,p+2\}$,
where both $p$ and $p+2$ are primes.
\index{Legendre's conjecture}
\item \key{Legendre's conjecture}:
There is always a prime between numbers
$n^2$ and $(n+1)^2$, where $n$ is any positive integer.
\end{itemize}
\subsubsection{Basic algorithms}

If a number $n$ is not prime,
it can be represented as a product $a \cdot b$,
where $a \le \sqrt n$ or $b \le \sqrt n$,
so it certainly has a factor between $2$ and $\lfloor \sqrt n \rfloor$.
Using this observation, we can both test
if a number is prime and find the prime factorization
of a number in $O(\sqrt n)$ time.

The following function \texttt{prime} checks
if the given number $n$ is prime.
The function attempts to divide $n$ by
all numbers between $2$ and $\lfloor \sqrt n \rfloor$,
and if none of them divides $n$, then $n$ is prime.

\begin{lstlisting}
bool prime(int n) {
    if (n < 2) return false;
    for (int x = 2; x*x <= n; x++) {
        if (n%x == 0) return false;
    }
    return true;
}
\end{lstlisting}
\noindent
The following function \texttt{factors}
constructs a vector that contains the prime
factorization of $n$.
The function divides $n$ by its prime factors,
and adds them to the vector.
The process ends when the remaining number $n$
has no factors between $2$ and $\lfloor \sqrt n \rfloor$.
If $n>1$, it is prime and the last factor.

\begin{lstlisting}
vector<int> factors(int n) {
    vector<int> f;
    for (int x = 2; x*x <= n; x++) {
        while (n%x == 0) {
            f.push_back(x);
            n /= x;
        }
    }
    if (n > 1) f.push_back(n);
    return f;
}
\end{lstlisting}

Note that each prime factor appears in the vector
as many times as it divides the number.
For example, $24=2^3 \cdot 3$,
so the result of the function is $[2,2,2,3]$.
\subsubsection{Sieve of Eratosthenes}

\index{sieve of Eratosthenes}

The \key{sieve of Eratosthenes}
%\footnote{Eratosthenes (c. 276 BC -- c. 194 BC) was a Greek mathematician.}
is a preprocessing
algorithm that builds an array with which we
can efficiently check if a given number between $2 \ldots n$
is prime and, if it is not, find one prime factor of the number.

The algorithm builds an array $\texttt{sieve}$
whose positions $2,3,\ldots,n$ are used.
The value $\texttt{sieve}[k]=0$ means
that $k$ is prime,
and the value $\texttt{sieve}[k] \neq 0$
means that $k$ is not a prime and one
of its prime factors is $\texttt{sieve}[k]$.

The algorithm iterates through the numbers
$2 \ldots n$ one by one.
Whenever a new prime $x$ is found,
the algorithm records that the multiples
of $x$ ($2x,3x,4x,\ldots$) are not primes,
because the number $x$ divides them.

For example, if $n=20$, the array is as follows:

\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (19,1);

\node at (0.5,0.5) {$0$};
\node at (1.5,0.5) {$0$};
\node at (2.5,0.5) {$2$};
\node at (3.5,0.5) {$0$};
\node at (4.5,0.5) {$3$};
\node at (5.5,0.5) {$0$};
\node at (6.5,0.5) {$2$};
\node at (7.5,0.5) {$3$};
\node at (8.5,0.5) {$5$};
\node at (9.5,0.5) {$0$};
\node at (10.5,0.5) {$3$};
\node at (11.5,0.5) {$0$};
\node at (12.5,0.5) {$7$};
\node at (13.5,0.5) {$5$};
\node at (14.5,0.5) {$2$};
\node at (15.5,0.5) {$0$};
\node at (16.5,0.5) {$3$};
\node at (17.5,0.5) {$0$};
\node at (18.5,0.5) {$5$};

\footnotesize

\node at (0.5,1.5) {$2$};
\node at (1.5,1.5) {$3$};
\node at (2.5,1.5) {$4$};
\node at (3.5,1.5) {$5$};
\node at (4.5,1.5) {$6$};
\node at (5.5,1.5) {$7$};
\node at (6.5,1.5) {$8$};
\node at (7.5,1.5) {$9$};
\node at (8.5,1.5) {$10$};
\node at (9.5,1.5) {$11$};
\node at (10.5,1.5) {$12$};
\node at (11.5,1.5) {$13$};
\node at (12.5,1.5) {$14$};
\node at (13.5,1.5) {$15$};
\node at (14.5,1.5) {$16$};
\node at (15.5,1.5) {$17$};
\node at (16.5,1.5) {$18$};
\node at (17.5,1.5) {$19$};
\node at (18.5,1.5) {$20$};

\end{tikzpicture}
\end{center}

The following code implements the sieve of
Eratosthenes.
The code assumes that each element of
\texttt{sieve} is initially zero.

\begin{lstlisting}
for (int x = 2; x <= n; x++) {
    if (sieve[x]) continue;
    for (int u = 2*x; u <= n; u += x) {
        sieve[u] = x;
    }
}
\end{lstlisting}

The inner loop of the algorithm is executed
$n/x$ times for each value of $x$.
Thus, an upper bound for the running time
of the algorithm is the harmonic sum
\[\sum_{x=2}^n n/x = n/2 + n/3 + n/4 + \cdots + n/n = O(n \log n).\]

\index{harmonic sum}

In fact, the algorithm is more efficient,
because the inner loop will be executed only if
the number $x$ is prime.
It can be shown that the running time of the
algorithm is only $O(n \log \log n)$,
a complexity very near to $O(n)$.

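As a sketch (the function names \texttt{build\_sieve} and \texttt{sieve\_factors} are not from the text), the sieve array also yields the prime factorization of any number $k \le n$ in $O(\log k)$ time: we repeatedly divide $k$ by the stored prime factor until $k$ becomes prime, and each division at least halves $k$.

\begin{lstlisting}
// builds the sieve array described above for positions 0..n
vector<int> build_sieve(int n) {
    vector<int> sieve(n+1, 0);
    for (int x = 2; x <= n; x++) {
        if (sieve[x]) continue;
        for (int u = 2*x; u <= n; u += x) {
            sieve[u] = x;
        }
    }
    return sieve;
}

// reads off the prime factorization of k using the sieve array
vector<int> sieve_factors(const vector<int>& sieve, int k) {
    vector<int> f;
    while (sieve[k] != 0) {       // k is composite
        f.push_back(sieve[k]);
        k /= sieve[k];
    }
    if (k > 1) f.push_back(k);    // the remaining k is prime
    return f;
}
\end{lstlisting}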
\subsubsection{Euclid's algorithm}

\index{greatest common divisor}
\index{least common multiple}
\index{Euclid's algorithm}

The \key{greatest common divisor} of
numbers $a$ and $b$, $\gcd(a,b)$,
is the greatest number that divides both $a$ and $b$,
and the \key{least common multiple} of
$a$ and $b$, $\textrm{lcm}(a,b)$,
is the smallest number that is divisible by
both $a$ and $b$.
For example,
$\gcd(24,36)=12$ and
$\textrm{lcm}(24,36)=72$.

The greatest common divisor and the least common multiple
are connected as follows:
\[\textrm{lcm}(a,b)=\frac{ab}{\textrm{gcd}(a,b)}\]

\key{Euclid's algorithm}\footnote{Euclid was a Greek mathematician who
lived in about 300 BC. This is perhaps the first known algorithm in history.} provides an efficient way
to find the greatest common divisor of two numbers.
The algorithm is based on the following formula:
\begin{equation*}
\textrm{gcd}(a,b) = \begin{cases}
a & b = 0\\
\textrm{gcd}(b,a \bmod b) & b \neq 0\\
\end{cases}
\end{equation*}

For example,
\[\textrm{gcd}(24,36) = \textrm{gcd}(36,24)
= \textrm{gcd}(24,12) = \textrm{gcd}(12,0)=12.\]

The algorithm can be implemented as follows:
\begin{lstlisting}
int gcd(int a, int b) {
    if (b == 0) return a;
    return gcd(b, a%b);
}
\end{lstlisting}

It can be shown that Euclid's algorithm works
in $O(\log n)$ time, where $n=\min(a,b)$.
The worst case for the algorithm is
the case when $a$ and $b$ are consecutive Fibonacci numbers.
For example,
\[\textrm{gcd}(13,8)=\textrm{gcd}(8,5)
=\textrm{gcd}(5,3)=\textrm{gcd}(3,2)=\textrm{gcd}(2,1)=\textrm{gcd}(1,0)=1.\]

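The identity between $\textrm{lcm}$ and $\textrm{gcd}$ above can be turned directly into code. A sketch (not from the text; the \texttt{gcd} function is repeated so that the snippet is self-contained): dividing by the greatest common divisor \emph{before} multiplying keeps the intermediate value small and avoids overflow.

\begin{lstlisting}
long long gcd(long long a, long long b) {
    if (b == 0) return a;
    return gcd(b, a%b);
}

// lcm(a,b) = ab/gcd(a,b); divide first to avoid overflow
long long lcm(long long a, long long b) {
    return a/gcd(a, b)*b;
}
\end{lstlisting}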
\subsubsection{Euler's totient function}

\index{coprime}
\index{Euler's totient function}

Numbers $a$ and $b$ are \key{coprime}
if $\textrm{gcd}(a,b)=1$.
\key{Euler's totient function} $\varphi(n)$
%\footnote{Euler presented this function in 1763.}
gives the number of integers between $1$ and $n$
that are coprime to $n$.
For example, $\varphi(12)=4$,
because 1, 5, 7 and 11
are coprime to 12.

The value of $\varphi(n)$ can be calculated
from the prime factorization
$n = \prod_{i=1}^k p_i^{\alpha_i}$
using the formula
\[ \varphi(n) = \prod_{i=1}^k p_i^{\alpha_i-1}(p_i-1). \]
For example, $\varphi(12)=2^1 \cdot (2-1) \cdot 3^0 \cdot (3-1)=4$.
Note that $\varphi(n)=n-1$ if $n$ is prime.

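The formula can be evaluated while factorizing $n$ by trial division. A sketch (not from the text): the code uses the equivalent form $\varphi(n) = n \prod_i (1-1/p_i)$, subtracting $\texttt{result}/p$ for each distinct prime factor $p$.

\begin{lstlisting}
int phi(int n) {
    int result = n;
    for (int x = 2; x*x <= n; x++) {
        if (n%x == 0) {
            while (n%x == 0) n /= x;
            result -= result/x;   // multiply result by (1-1/x)
        }
    }
    if (n > 1) result -= result/n; // remaining prime factor
    return result;
}
\end{lstlisting}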
\section{Modular arithmetic}

\index{modular arithmetic}

In \key{modular arithmetic},
the set of numbers is limited so
that only numbers $0,1,2,\ldots,m-1$ are used,
where $m$ is a constant.
Each number $x$ is
represented by the number $x \bmod m$:
the remainder after dividing $x$ by $m$.
For example, if $m=17$, then $75$
is represented by $75 \bmod 17 = 7$.

Often we can take remainders before doing
calculations.
In particular, the following formulas hold:
\[
\begin{array}{rcl}
(x+y) \bmod m & = & (x \bmod m + y \bmod m) \bmod m \\
(x-y) \bmod m & = & (x \bmod m - y \bmod m) \bmod m \\
(x \cdot y) \bmod m & = & (x \bmod m \cdot y \bmod m) \bmod m \\
x^n \bmod m & = & (x \bmod m)^n \bmod m \\
\end{array}
\]

\subsubsection{Modular exponentiation}

There is often a need to efficiently calculate
the value of $x^n \bmod m$.
This can be done in $O(\log n)$ time
using the following recursion:
\begin{equation*}
x^n = \begin{cases}
1 & n = 0\\
x^{n/2} \cdot x^{n/2} & \text{$n$ is even}\\
x^{n-1} \cdot x & \text{$n$ is odd}
\end{cases}
\end{equation*}

It is important that in the case of an even $n$,
the value of $x^{n/2}$ is calculated only once.
This guarantees that the time complexity of the
algorithm is $O(\log n)$, because $n$ is always halved
when it is even.

The following function calculates the value of
$x^n \bmod m$:

\begin{lstlisting}
int modpow(int x, int n, int m) {
    if (n == 0) return 1%m;
    long long u = modpow(x,n/2,m);
    u = (u*u)%m;
    if (n%2 == 1) u = (u*x)%m;
    return u;
}
\end{lstlisting}

\subsubsection{Fermat's theorem and Euler's theorem}

\index{Fermat's theorem}
\index{Euler's theorem}

\key{Fermat's theorem}
%\footnote{Fermat discovered this theorem in 1640.}
states that
\[x^{m-1} \bmod m = 1\]
when $m$ is prime and $x$ and $m$ are coprime.
This also yields
\[x^k \bmod m = x^{k \bmod (m-1)} \bmod m.\]
More generally, \key{Euler's theorem}
%\footnote{Euler published this theorem in 1763.}
states that
\[x^{\varphi(m)} \bmod m = 1\]
when $x$ and $m$ are coprime.
Fermat's theorem follows from Euler's theorem,
because if $m$ is a prime, then $\varphi(m)=m-1$.

\subsubsection{Modular inverse}

\index{modular inverse}

The inverse of $x$ modulo $m$
is a number $x^{-1}$ such that
\[ x x^{-1} \bmod m = 1. \]
For example, if $x=6$ and $m=17$,
then $x^{-1}=3$, because $6\cdot3 \bmod 17=1$.

Using modular inverses, we can divide numbers
modulo $m$, because division by $x$
corresponds to multiplication by $x^{-1}$.
For example, to evaluate the value of $36/6 \bmod 17$,
we can use the formula $2 \cdot 3 \bmod 17$,
because $36 \bmod 17 = 2$ and $6^{-1} \bmod 17 = 3$.

However, a modular inverse does not always exist.
For example, if $x=2$ and $m=4$, the equation
\[ x x^{-1} \bmod m = 1 \]
cannot be solved, because all multiples of 2
are even and the remainder can never be 1 when $m=4$.
It turns out that the value of $x^{-1} \bmod m$
can be calculated exactly when $x$ and $m$ are coprime.

If a modular inverse exists, it can be
calculated using the formula
\[
x^{-1} = x^{\varphi(m)-1}.
\]
If $m$ is prime, the formula becomes
\[
x^{-1} = x^{m-2}.
\]
For example,
\[6^{-1} \bmod 17 =6^{17-2} \bmod 17 = 3.\]

This formula allows us to efficiently calculate
modular inverses using the modular exponentiation algorithm.
The formula can be derived using Euler's theorem.
First, the modular inverse should satisfy the following equation:
\[
x x^{-1} \bmod m = 1.
\]
On the other hand, according to Euler's theorem,
\[
x^{\varphi(m)} \bmod m = x x^{\varphi(m)-1} \bmod m = 1,
\]
so the numbers $x^{-1}$ and $x^{\varphi(m)-1}$ are equal.

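For a prime modulus, the formula $x^{-1} = x^{m-2}$ combines with modular exponentiation into a one-line function. A sketch (the name \texttt{modinv} is not from the text; \texttt{modpow} is repeated from the earlier section so that the snippet is self-contained):

\begin{lstlisting}
int modpow(int x, int n, int m) {
    if (n == 0) return 1%m;
    long long u = modpow(x,n/2,m);
    u = (u*u)%m;
    if (n%2 == 1) u = (u*x)%m;
    return u;
}

// inverse of x modulo a prime m, via x^{m-2} mod m
int modinv(int x, int m) {
    return modpow(x, m-2, m);
}
\end{lstlisting}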
\subsubsection{Computer arithmetic}

In programming, unsigned integers are represented modulo $2^k$,
where $k$ is the number of bits of the data type.
A usual consequence of this is that a number wraps around
if it becomes too large.

For example, in C++, numbers of type \texttt{unsigned int}
are represented modulo $2^{32}$.
The following code declares an \texttt{unsigned int}
variable whose value is $123456789$.
After this, the value will be multiplied by itself,
and the result is
$123456789^2 \bmod 2^{32} = 2537071545$.

\begin{lstlisting}
unsigned int x = 123456789;
cout << x*x << "\n"; // 2537071545
\end{lstlisting}

\section{Solving equations}

\subsubsection*{Diophantine equations}

\index{Diophantine equation}

A \key{Diophantine equation}
%\footnote{Diophantus of Alexandria was a Greek mathematician who lived in the 3rd century.}
is an equation of the form
\[ ax + by = c, \]
where $a$, $b$ and $c$ are constants
and the values of $x$ and $y$ should be found.
Each number in the equation has to be an integer.
For example, one solution for the equation
$5x+2y=11$ is $x=3$ and $y=-2$.

\index{extended Euclid's algorithm}

We can efficiently solve a Diophantine equation
by using Euclid's algorithm.
It turns out that we can extend Euclid's algorithm
so that it will find numbers $x$ and $y$
that satisfy the following equation:
\[
ax + by = \textrm{gcd}(a,b)
\]

A Diophantine equation can be solved if
$c$ is divisible by
$\textrm{gcd}(a,b)$,
and otherwise it cannot be solved.

As an example, let us find numbers $x$ and $y$
that satisfy the following equation:
\[
39x + 15y = 12
\]
The equation can be solved, because
$\textrm{gcd}(39,15)=3$ and $3 \mid 12$.
When Euclid's algorithm calculates the
greatest common divisor of 39 and 15,
it produces the following sequence of function calls:
\[
\textrm{gcd}(39,15) = \textrm{gcd}(15,9)
= \textrm{gcd}(9,6) = \textrm{gcd}(6,3)
= \textrm{gcd}(3,0) = 3 \]
This corresponds to the following equations:
\[
\begin{array}{lcl}
39 - 2 \cdot 15 & = & 9 \\
15 - 1 \cdot 9 & = & 6 \\
9 - 1 \cdot 6 & = & 3 \\
\end{array}
\]
Using these equations, we can derive
\[
39 \cdot 2 + 15 \cdot (-5) = 3
\]
and by multiplying this by 4, the result is
\[
39 \cdot 8 + 15 \cdot (-20) = 12,
\]
so a solution to the equation is
$x=8$ and $y=-20$.

A solution to a Diophantine equation is not unique,
because we can form an infinite number of solutions
if we know one solution.
If a pair $(x,y)$ is a solution, then also all pairs
\[(x+\frac{kb}{\textrm{gcd}(a,b)},y-\frac{ka}{\textrm{gcd}(a,b)})\]
are solutions, where $k$ is any integer.

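The extended algorithm itself is not shown in the text; a sketch of one common recursive formulation follows (the name \texttt{egcd} is not from the text). The back-substitution performed by hand above is exactly what the return statement does at each level of the recursion.

\begin{lstlisting}
// returns (g, x, y) such that ax + by = g = gcd(a,b)
tuple<long long,long long,long long> egcd(long long a, long long b) {
    if (b == 0) return {a, 1, 0};
    auto [g, x, y] = egcd(b, a%b);
    return {g, y, x - (a/b)*y};
}
\end{lstlisting}

For example, \texttt{egcd(39, 15)} produces $g=3$, $x=2$, $y=-5$, matching the derivation above.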
\subsubsection{Chinese remainder theorem}

\index{Chinese remainder theorem}

The \key{Chinese remainder theorem} solves
a system of equations of the form
\[
\begin{array}{lcl}
x & = & a_1 \bmod m_1 \\
x & = & a_2 \bmod m_2 \\
\cdots \\
x & = & a_n \bmod m_n \\
\end{array}
\]
where all pairs of $m_1,m_2,\ldots,m_n$ are coprime.

Let $x^{-1}_m$ be the inverse of $x$ modulo $m$, and
\[ X_k = \frac{m_1 m_2 \cdots m_n}{m_k}.\]
Using this notation, a solution to the equations is
\[x = a_1 X_1 {X_1}^{-1}_{m_1} + a_2 X_2 {X_2}^{-1}_{m_2} + \cdots + a_n X_n {X_n}^{-1}_{m_n}.\]
In this solution, for each $k=1,2,\ldots,n$,
\[a_k X_k {X_k}^{-1}_{m_k} \bmod m_k = a_k,\]
because
\[X_k {X_k}^{-1}_{m_k} \bmod m_k = 1.\]
Since all other terms in the sum are divisible by $m_k$,
they have no effect on the remainder,
and $x \bmod m_k = a_k$.

For example, a solution for
\[
\begin{array}{lcl}
x & = & 3 \bmod 5 \\
x & = & 4 \bmod 7 \\
x & = & 2 \bmod 3 \\
\end{array}
\]
is
\[ 3 \cdot 21 \cdot 1 + 4 \cdot 15 \cdot 1 + 2 \cdot 35 \cdot 2 = 263.\]

Once we have found a solution $x$,
we can create an infinite number of other solutions,
because all numbers of the form
\[x+k \cdot m_1 m_2 \cdots m_n,\]
where $k$ is any integer,
are solutions.

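A sketch of the construction in code (not from the text; it assumes the moduli are pairwise coprime and greater than 1, and for simplicity finds each inverse ${X_k}^{-1}_{m_k}$ by brute force, which is fine for small moduli):

\begin{lstlisting}
long long crt(vector<long long> a, vector<long long> m) {
    long long M = 1;
    for (long long mk : m) M *= mk;
    long long x = 0;
    for (int k = 0; k < (int)a.size(); k++) {
        long long X = M/m[k];
        long long inv = 1; // inverse of X modulo m[k], by brute force
        while (X%m[k]*inv%m[k] != 1) inv++;
        x = (x + a[k]%m[k]*X%M*inv)%M;
    }
    return x;
}
\end{lstlisting}

For the example above, the function returns $263 \bmod (5 \cdot 7 \cdot 3) = 53$, the smallest non-negative solution.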
\section{Other results}

\subsubsection{Lagrange's theorem}

\index{Lagrange's theorem}

\key{Lagrange's theorem}
%\footnote{J.-L. Lagrange (1736--1813) was an Italian mathematician.}
states that every positive integer
can be represented as a sum of four squares, i.e.,
$a^2+b^2+c^2+d^2$.
For example, the number 123 can be represented
as the sum $8^2+5^2+5^2+3^2$.

\subsubsection{Zeckendorf's theorem}

\index{Zeckendorf's theorem}
\index{Fibonacci number}

\key{Zeckendorf's theorem}
%\footnote{E. Zeckendorf published the theorem in 1972 \cite{zec72}; however, this was not a new result.}
states that every
positive integer has a unique representation
as a sum of Fibonacci numbers such that
no two of the numbers are equal or consecutive
Fibonacci numbers.
For example, the number 74 can be represented
as the sum $55+13+5+1$.

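The representation can be found greedily by always taking the largest Fibonacci number that still fits; a sketch (not from the text):

\begin{lstlisting}
vector<long long> zeckendorf(long long n) {
    vector<long long> fib = {1, 2}; // distinct Fibonacci numbers
    while (fib.back() + fib[fib.size()-2] <= n) {
        fib.push_back(fib.back() + fib[fib.size()-2]);
    }
    vector<long long> parts;
    for (int i = fib.size()-1; i >= 0; i--) {
        if (fib[i] <= n) { // greedily take the largest that fits
            parts.push_back(fib[i]);
            n -= fib[i];
        }
    }
    return parts;
}
\end{lstlisting}

The greedy choice automatically skips the Fibonacci number just below the one taken, so no two chosen numbers are consecutive.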
\subsubsection{Pythagorean triples}

\index{Pythagorean triple}
\index{Euclid's formula}

A \key{Pythagorean triple} is a triple $(a,b,c)$
that satisfies the Pythagorean theorem
$a^2+b^2=c^2$, which means that there is a right triangle
with side lengths $a$, $b$ and $c$.
For example, $(3,4,5)$ is a Pythagorean triple.

If $(a,b,c)$ is a Pythagorean triple,
then all triples of the form $(ka,kb,kc)$,
where $k>1$, are also Pythagorean triples.
A Pythagorean triple is \emph{primitive} if
$a$, $b$ and $c$ are coprime,
and all Pythagorean triples can be constructed
from primitive triples using a multiplier $k$.

\key{Euclid's formula} can be used to produce
all primitive Pythagorean triples.
Each such triple is of the form
\[(n^2-m^2,2nm,n^2+m^2),\]
where $0<m<n$, $n$ and $m$ are coprime
and at least one of $n$ and $m$ is even.
For example, when $m=1$ and $n=2$, the formula
produces the smallest Pythagorean triple
\[(2^2-1^2,2\cdot2\cdot1,2^2+1^2)=(3,4,5).\]

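A sketch enumerating primitive triples from the formula (not from the text). Since $n$ and $m$ are coprime, they cannot both be even, so the condition ``at least one of $n$ and $m$ is even'' is equivalent to $n-m$ being odd.

\begin{lstlisting}
vector<array<int,3>> primitive_triples(int limit) {
    vector<array<int,3>> res;
    for (int n = 2; n <= limit; n++) {
        for (int m = 1; m < n; m++) {
            if (gcd(n, m) == 1 && (n-m)%2 == 1) {
                res.push_back({n*n-m*m, 2*n*m, n*n+m*m});
            }
        }
    }
    return res;
}
\end{lstlisting}

With \texttt{limit}${}=3$, the valid pairs are $(n,m)=(2,1)$ and $(3,2)$, producing $(3,4,5)$ and $(5,12,13)$.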
\subsubsection{Wilson's theorem}

\index{Wilson's theorem}

\key{Wilson's theorem}
%\footnote{J. Wilson (1741--1793) was an English mathematician.}
states that a number $n$
is prime exactly when
\[(n-1)! \bmod n = n-1.\]
For example, the number 11 is prime, because
\[10! \bmod 11 = 10,\]
and the number 12 is not prime, because
\[11! \bmod 12 = 0 \neq 11.\]

Hence, Wilson's theorem can be used to find out
whether a number is prime. However, in practice, the theorem cannot be
applied to large values of $n$, because it is difficult
to calculate values of $(n-1)!$ when $n$ is large.

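A sketch of such a test (not from the text): computing $(n-1)! \bmod n$ one factor at a time keeps the intermediate values small, but the loop still takes $O(n)$ time, so this is only feasible for small $n$.

\begin{lstlisting}
bool wilson_prime(int n) {
    if (n < 2) return false;
    long long f = 1;
    for (int i = 2; i <= n-1; i++) {
        f = f*i%n;  // (n-1)! modulo n, factor by factor
    }
    return f == n-1;
}
\end{lstlisting}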
\chapter{Combinatorics}

\index{combinatorics}

\key{Combinatorics} studies methods for counting
combinations of objects.
Usually, the goal is to find a way to
count the combinations efficiently
without generating each combination separately.

As an example, consider the problem
of counting the number of ways to
represent an integer $n$ as a sum of positive integers.
For example, there are 8 representations
for $4$:
\begin{multicols}{2}
\begin{itemize}
\item $1+1+1+1$
\item $1+1+2$
\item $1+2+1$
\item $2+1+1$
\item $2+2$
\item $3+1$
\item $1+3$
\item $4$
\end{itemize}
\end{multicols}

A combinatorial problem can often be solved
using a recursive function.
In this problem, we can define a function $f(n)$
that gives the number of representations for $n$.
For example, $f(4)=8$ according to the above example.
The values of the function
can be recursively calculated as follows:
\begin{equation*}
f(n) = \begin{cases}
1 & n = 0\\
f(0)+f(1)+\cdots+f(n-1) & n > 0\\
\end{cases}
\end{equation*}
The base case is $f(0)=1$,
because the empty sum represents the number 0.
Then, if $n>0$, we consider all ways to
choose the first number of the sum.
If the first number is $k$,
there are $f(n-k)$ representations
for the remaining part of the sum.
Thus, we calculate the sum of all values
of the form $f(n-k)$ where $k<n$.

The first values for the function are:
\[
\begin{array}{lcl}
f(0) & = & 1 \\
f(1) & = & 1 \\
f(2) & = & 2 \\
f(3) & = & 4 \\
f(4) & = & 8 \\
\end{array}
\]

Sometimes, a recursive formula can be replaced
with a closed-form formula.
In this problem,
\[
f(n)=2^{n-1},
\]
which is based on the fact that there are $n-1$
possible positions for $+$-signs in the sum
and we can choose any subset of them.

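The recursion can be implemented directly; a sketch (not from the text, exponential time, useful only for checking small values against the closed form):

\begin{lstlisting}
long long f(int n) {
    if (n == 0) return 1;
    long long s = 0;
    for (int k = 1; k <= n; k++) {
        s += f(n-k); // the first number of the sum is k
    }
    return s;
}
\end{lstlisting}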
\section{Binomial coefficients}

\index{binomial coefficient}

The \key{binomial coefficient} ${n \choose k}$
equals the number of ways we can choose a subset
of $k$ elements from a set of $n$ elements.
For example, ${5 \choose 3}=10$,
because the set $\{1,2,3,4,5\}$
has 10 subsets of 3 elements:
\[ \{1,2,3\}, \{1,2,4\}, \{1,2,5\}, \{1,3,4\}, \{1,3,5\},
\{1,4,5\}, \{2,3,4\}, \{2,3,5\}, \{2,4,5\}, \{3,4,5\} \]

\subsubsection{Formula 1}

Binomial coefficients can be
recursively calculated as follows:

\[
{n \choose k} = {n-1 \choose k-1} + {n-1 \choose k}
\]

The idea is to fix an element $x$ in the set.
If $x$ is included in the subset,
we have to choose $k-1$
elements from $n-1$ elements,
and if $x$ is not included in the subset,
we have to choose $k$ elements from $n-1$ elements.

The base cases for the recursion are
\[
{n \choose 0} = {n \choose n} = 1,
\]
because there is always exactly
one way to construct an empty subset
and a subset that contains all the elements.

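The recursion translates into a table of precomputed values; a sketch (not from the text; the bound 65 is an arbitrary choice that keeps the values within \texttt{long long} for small $k$):

\begin{lstlisting}
long long C[66][66];

void build() {
    for (int n = 0; n <= 65; n++) {
        C[n][0] = C[n][n] = 1;
        for (int k = 1; k < n; k++) {
            C[n][k] = C[n-1][k-1] + C[n-1][k];
        }
    }
}
\end{lstlisting}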
\subsubsection{Formula 2}

Another way to calculate binomial coefficients is as follows:
\[
{n \choose k} = \frac{n!}{k!(n-k)!}.
\]

There are $n!$ permutations of $n$ elements.
We go through all permutations and always
include the first $k$ elements of the permutation
in the subset.
Since the order of the elements in the subset
and outside the subset does not matter,
the result is divided by $k!$ and $(n-k)!$.

\subsubsection{Properties}

For binomial coefficients,
\[
{n \choose k} = {n \choose n-k},
\]
because we actually divide a set of $n$ elements into
two subsets: the first contains $k$ elements
and the second contains $n-k$ elements.

The sum of binomial coefficients is
\[
{n \choose 0}+{n \choose 1}+{n \choose 2}+\ldots+{n \choose n}=2^n.
\]

The reason for the name ``binomial coefficient''
can be seen when the binomial $(a+b)$ is raised to
the $n$th power:

\[ (a+b)^n =
{n \choose 0} a^n b^0 +
{n \choose 1} a^{n-1} b^1 +
\ldots +
{n \choose n-1} a^1 b^{n-1} +
{n \choose n} a^0 b^n. \]

\index{Pascal's triangle}

Binomial coefficients also appear in
\key{Pascal's triangle},
where each value equals the sum of the two
values above it:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node at (0,0) {1};
\node at (-0.5,-0.5) {1};
\node at (0.5,-0.5) {1};
\node at (-1,-1) {1};
\node at (0,-1) {2};
\node at (1,-1) {1};
\node at (-1.5,-1.5) {1};
\node at (-0.5,-1.5) {3};
\node at (0.5,-1.5) {3};
\node at (1.5,-1.5) {1};
\node at (-2,-2) {1};
\node at (-1,-2) {4};
\node at (0,-2) {6};
\node at (1,-2) {4};
\node at (2,-2) {1};
\node at (-2,-2.5) {$\ldots$};
\node at (-1,-2.5) {$\ldots$};
\node at (0,-2.5) {$\ldots$};
\node at (1,-2.5) {$\ldots$};
\node at (2,-2.5) {$\ldots$};
\end{tikzpicture}
\end{center}

\subsubsection{Boxes and balls}

``Boxes and balls'' is a useful model,
where we count the ways to
place $k$ balls in $n$ boxes.
Let us consider three scenarios:

\textit{Scenario 1}: Each box can contain
at most one ball.
For example, when $n=5$ and $k=2$,
there are 10 solutions:

\begin{center}
\begin{tikzpicture}[scale=0.5]
\newcommand\lax[3]{
\path[draw,thick,-] (#1-0.5,#2+0.5) -- (#1-0.5,#2-0.5) --
(#1+0.5,#2-0.5) -- (#1+0.5,#2+0.5);
\ifthenelse{\equal{#3}{1}}{\draw[fill=black] (#1,#2-0.3) circle (0.15);}{}
\ifthenelse{\equal{#3}{2}}{\draw[fill=black] (#1-0.2,#2-0.3) circle (0.15);}{}
\ifthenelse{\equal{#3}{2}}{\draw[fill=black] (#1+0.2,#2-0.3) circle (0.15);}{}
}
\newcommand\laa[7]{
\lax{#1}{#2}{#3}
\lax{#1+1.2}{#2}{#4}
\lax{#1+2.4}{#2}{#5}
\lax{#1+3.6}{#2}{#6}
\lax{#1+4.8}{#2}{#7}
}

\laa{0}{0}{1}{1}{0}{0}{0}
\laa{0}{-2}{1}{0}{1}{0}{0}
\laa{0}{-4}{1}{0}{0}{1}{0}
\laa{0}{-6}{1}{0}{0}{0}{1}
\laa{8}{0}{0}{1}{1}{0}{0}
\laa{8}{-2}{0}{1}{0}{1}{0}
\laa{8}{-4}{0}{1}{0}{0}{1}
\laa{16}{0}{0}{0}{1}{1}{0}
\laa{16}{-2}{0}{0}{1}{0}{1}
\laa{16}{-4}{0}{0}{0}{1}{1}

\end{tikzpicture}
\end{center}

In this scenario, the answer is directly the
binomial coefficient ${n \choose k}$.

\textit{Scenario 2}: A box can contain multiple balls.
For example, when $n=5$ and $k=2$,
there are 15 solutions:

\begin{center}
\begin{tikzpicture}[scale=0.5]
\newcommand\lax[3]{
\path[draw,thick,-] (#1-0.5,#2+0.5) -- (#1-0.5,#2-0.5) --
(#1+0.5,#2-0.5) -- (#1+0.5,#2+0.5);
\ifthenelse{\equal{#3}{1}}{\draw[fill=black] (#1,#2-0.3) circle (0.15);}{}
\ifthenelse{\equal{#3}{2}}{\draw[fill=black] (#1-0.2,#2-0.3) circle (0.15);}{}
\ifthenelse{\equal{#3}{2}}{\draw[fill=black] (#1+0.2,#2-0.3) circle (0.15);}{}
}
\newcommand\laa[7]{
\lax{#1}{#2}{#3}
\lax{#1+1.2}{#2}{#4}
\lax{#1+2.4}{#2}{#5}
\lax{#1+3.6}{#2}{#6}
\lax{#1+4.8}{#2}{#7}
}

\laa{0}{0}{2}{0}{0}{0}{0}
\laa{0}{-2}{1}{1}{0}{0}{0}
\laa{0}{-4}{1}{0}{1}{0}{0}
\laa{0}{-6}{1}{0}{0}{1}{0}
\laa{0}{-8}{1}{0}{0}{0}{1}
\laa{8}{0}{0}{2}{0}{0}{0}
\laa{8}{-2}{0}{1}{1}{0}{0}
\laa{8}{-4}{0}{1}{0}{1}{0}
\laa{8}{-6}{0}{1}{0}{0}{1}
\laa{8}{-8}{0}{0}{2}{0}{0}
\laa{16}{0}{0}{0}{1}{1}{0}
\laa{16}{-2}{0}{0}{1}{0}{1}
\laa{16}{-4}{0}{0}{0}{2}{0}
\laa{16}{-6}{0}{0}{0}{1}{1}
\laa{16}{-8}{0}{0}{0}{0}{2}

\end{tikzpicture}
\end{center}

The process of placing the balls in the boxes
can be represented as a string
that consists of symbols
``o'' and ``$\rightarrow$''.
Initially, assume that we are standing at the leftmost box.
The symbol ``o'' means that we place a ball
in the current box, and the symbol
``$\rightarrow$'' means that we move to
the next box to the right.

Using this notation, each solution is a string
that contains $k$ times the symbol ``o'' and
$n-1$ times the symbol ``$\rightarrow$''.
For example, the upper-right solution
in the above picture corresponds to the string
``$\rightarrow$ $\rightarrow$ o $\rightarrow$ o $\rightarrow$''.
Thus, the number of solutions is
${k+n-1 \choose k}$.

\textit{Scenario 3}: Each box may contain at most one ball,
and in addition, no two adjacent boxes may both contain a ball.
For example, when $n=5$ and $k=2$,
there are 6 solutions:

\begin{center}
\begin{tikzpicture}[scale=0.5]
\newcommand\lax[3]{
\path[draw,thick,-] (#1-0.5,#2+0.5) -- (#1-0.5,#2-0.5) --
(#1+0.5,#2-0.5) -- (#1+0.5,#2+0.5);
\ifthenelse{\equal{#3}{1}}{\draw[fill=black] (#1,#2-0.3) circle (0.15);}{}
\ifthenelse{\equal{#3}{2}}{\draw[fill=black] (#1-0.2,#2-0.3) circle (0.15);}{}
\ifthenelse{\equal{#3}{2}}{\draw[fill=black] (#1+0.2,#2-0.3) circle (0.15);}{}
}
\newcommand\laa[7]{
\lax{#1}{#2}{#3}
\lax{#1+1.2}{#2}{#4}
\lax{#1+2.4}{#2}{#5}
\lax{#1+3.6}{#2}{#6}
\lax{#1+4.8}{#2}{#7}
}

\laa{0}{0}{1}{0}{1}{0}{0}
\laa{0}{-2}{1}{0}{0}{1}{0}
\laa{8}{0}{1}{0}{0}{0}{1}
\laa{8}{-2}{0}{1}{0}{1}{0}
\laa{16}{0}{0}{1}{0}{0}{1}
\laa{16}{-2}{0}{0}{1}{0}{1}
\end{tikzpicture}
\end{center}

In this scenario, we can assume that
$k$ balls are initially placed in boxes
and there is an empty box between each
two adjacent boxes.
The remaining task is to choose the
positions for the remaining empty boxes.
There are $n-2k+1$ such boxes and
$k+1$ positions for them.
Thus, using the formula of scenario 2,
the number of solutions is
${n-k+1 \choose n-2k+1}$.

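All three scenarios can be checked against their formulas by brute force. A sketch (not from the text): since the balls are identical, each solution is a sequence of box indices $b_1 \le b_2 \le \cdots \le b_k$; the scenario only changes the smallest allowed index for the next ball.

\begin{lstlisting}
// counts placements of k balls in boxes min_box..n;
// scenario 1: distinct boxes, 2: repeats allowed, 3: gaps required
long long count(int n, int k, int min_box, int scenario) {
    if (k == 0) return 1;
    long long s = 0;
    for (int b = min_box; b <= n; b++) {
        int next = (scenario == 1) ? b+1 : (scenario == 2) ? b : b+2;
        s += count(n, k-1, next, scenario);
    }
    return s;
}
\end{lstlisting}

For $n=5$ and $k=2$, the calls \texttt{count(5,2,1,1)}, \texttt{count(5,2,1,2)} and \texttt{count(5,2,1,3)} return 10, 15 and 6, matching the pictures above.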
\subsubsection{Multinomial coefficients}

\index{multinomial coefficient}

The \key{multinomial coefficient}
\[ {n \choose k_1,k_2,\ldots,k_m} = \frac{n!}{k_1! k_2! \cdots k_m!}, \]
equals the number of ways
we can divide $n$ elements into subsets
of sizes $k_1,k_2,\ldots,k_m$,
where $k_1+k_2+\cdots+k_m=n$.
Multinomial coefficients can be seen as a
generalization of binomial coefficients;
if $m=2$, the above formula
corresponds to the binomial coefficient formula.

\section{Catalan numbers}

\index{Catalan number}

The \key{Catalan number}
%\footnote{E. C. Catalan (1814--1894) was a Belgian mathematician.}
$C_n$ equals the
number of valid
parenthesis expressions that consist of
$n$ left parentheses and $n$ right parentheses.

For example, $C_3=5$, because
we can construct the following parenthesis
expressions using three
left parentheses and three right parentheses:

\begin{itemize}[noitemsep]
\item \texttt{()()()}
\item \texttt{(())()}
\item \texttt{()(())}
\item \texttt{((()))}
\item \texttt{(()())}
\end{itemize}

\subsubsection{Parenthesis expressions}

\index{parenthesis expression}

What exactly is a \emph{valid parenthesis expression}?
The following rules precisely define all
valid parenthesis expressions:

\begin{itemize}
\item An empty parenthesis expression is valid.
\item If an expression $A$ is valid,
then also the expression
\texttt{(}$A$\texttt{)} is valid.
\item If expressions $A$ and $B$ are valid,
then also the expression $AB$ is valid.
\end{itemize}

Another way to characterize valid
parenthesis expressions is that if
we choose any prefix of such an expression,
it has to contain at least as many left
parentheses as right parentheses.
In addition, the complete expression has to
contain an equal number of left and right
parentheses.

\subsubsection{Formula 1}

Catalan numbers can be calculated using the formula
\[ C_n = \sum_{i=0}^{n-1} C_{i} C_{n-i-1}.\]

The sum goes through the ways to divide the
expression into two parts
such that both parts are valid
expressions and the first part is as short as possible
but not empty.
For any $i$, the first part contains $i+1$ pairs
of parentheses and the number of expressions
is the product of the following values:

\begin{itemize}
\item $C_{i}$: the number of ways to construct an expression
using the parentheses of the first part,
not counting the outermost parentheses
\item $C_{n-i-1}$: the number of ways to construct an
expression using the parentheses of the second part
\end{itemize}

The base case is $C_0=1$,
because we can construct an empty parenthesis
expression using zero pairs of parentheses.

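The recursive formula translates directly into a precomputed table; a sketch (not from the text; the array name \texttt{cat} and the bound 30 are arbitrary choices, and $C_{29}$ still fits in a \texttt{long long}):

\begin{lstlisting}
long long cat[30];

void catalan_build() {
    cat[0] = 1;
    for (int n = 1; n < 30; n++) {
        cat[n] = 0;
        for (int i = 0; i < n; i++) {
            cat[n] += cat[i]*cat[n-i-1];
        }
    }
}
\end{lstlisting}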
\subsubsection{Formula 2}
|
||||
|
||||
Catalan numbers can also be calculated
|
||||
using binomial coefficients:
|
||||
\[ C_n = \frac{1}{n+1} {2n \choose n}\]
|
||||
The formula can be explained as follows:
|
||||
|
||||
There are a total of ${2n \choose n}$ ways
|
||||
to construct a (not necessarily valid)
|
||||
parenthesis expression that contains $n$ left
|
||||
parentheses and $n$ right parentheses.
|
||||
Let us calculate the number of such
|
||||
expressions that are \emph{not} valid.
|
||||
|
||||
If a parenthesis expression is not valid,
|
||||
it has to contain a prefix where the
|
||||
number of right parentheses exceeds the
|
||||
number of left parentheses.
|
||||
The idea is to reverse each parenthesis
|
||||
that belongs to such a prefix.
|
||||
For example, the expression
|
||||
\texttt{())()(} contains a prefix \texttt{())},
|
||||
and after reversing the prefix,
|
||||
the expression becomes \texttt{)((()(}.
|
||||
|
||||
The resulting expression consists of $n+1$
|
||||
left parentheses and $n-1$ right parentheses.
|
||||
The number of such expressions is ${2n \choose n+1}$,
|
||||
which equals the number of non-valid
|
||||
parenthesis expressions.
|
||||
Thus, the number of valid parenthesis
|
||||
expressions can be calculated using the formula
|
||||
\[{2n \choose n}-{2n \choose n+1} = {2n \choose n} - \frac{n}{n+1} {2n \choose n} = \frac{1}{n+1} {2n \choose n}.\]
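The closed form can be evaluated with exact integer arithmetic by building the binomial coefficient incrementally. A sketch (the function name is ours; the values fit in 64-bit integers for small $n$):

```cpp
#include <cassert>

// C_n = binom(2n, n) / (n + 1). After step i the partial product equals
// binom(n + i, i), so every intermediate division is exact.
long long catalan_binom(int n) {
    long long b = 1;
    for (int i = 1; i <= n; i++)
        b = b * (n + i) / i;   // now b = binom(n + i, i)
    return b / (n + 1);
}
```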

\subsubsection{Counting trees}

Catalan numbers are also related to trees:

\begin{itemize}
\item there are $C_n$ binary trees of $n$ nodes
\item there are $C_{n-1}$ rooted trees of $n$ nodes
\end{itemize}
\noindent
For example, for $C_3=5$, the binary trees are

\begin{center}
\begin{tikzpicture}[scale=0.7]
\path[draw,thick,-] (0,0) -- (-1,-1);
\path[draw,thick,-] (0,0) -- (1,-1);
\draw[fill=white] (0,0) circle (0.3);
\draw[fill=white] (-1,-1) circle (0.3);
\draw[fill=white] (1,-1) circle (0.3);

\path[draw,thick,-] (4,0) -- (4-0.75,-1) -- (4-1.5,-2);
\draw[fill=white] (4,0) circle (0.3);
\draw[fill=white] (4-0.75,-1) circle (0.3);
\draw[fill=white] (4-1.5,-2) circle (0.3);

\path[draw,thick,-] (6.5,0) -- (6.5-0.75,-1) -- (6.5-0,-2);
\draw[fill=white] (6.5,0) circle (0.3);
\draw[fill=white] (6.5-0.75,-1) circle (0.3);
\draw[fill=white] (6.5-0,-2) circle (0.3);

\path[draw,thick,-] (9,0) -- (9+0.75,-1) -- (9-0,-2);
\draw[fill=white] (9,0) circle (0.3);
\draw[fill=white] (9+0.75,-1) circle (0.3);
\draw[fill=white] (9-0,-2) circle (0.3);

\path[draw,thick,-] (11.5,0) -- (11.5+0.75,-1) -- (11.5+1.5,-2);
\draw[fill=white] (11.5,0) circle (0.3);
\draw[fill=white] (11.5+0.75,-1) circle (0.3);
\draw[fill=white] (11.5+1.5,-2) circle (0.3);
\end{tikzpicture}
\end{center}
and the rooted trees are
\begin{center}
\begin{tikzpicture}[scale=0.7]
\path[draw,thick,-] (0,0) -- (-1,-1);
\path[draw,thick,-] (0,0) -- (0,-1);
\path[draw,thick,-] (0,0) -- (1,-1);
\draw[fill=white] (0,0) circle (0.3);
\draw[fill=white] (-1,-1) circle (0.3);
\draw[fill=white] (0,-1) circle (0.3);
\draw[fill=white] (1,-1) circle (0.3);

\path[draw,thick,-] (3,0) -- (3,-1) -- (3,-2) -- (3,-3);
\draw[fill=white] (3,0) circle (0.3);
\draw[fill=white] (3,-1) circle (0.3);
\draw[fill=white] (3,-2) circle (0.3);
\draw[fill=white] (3,-3) circle (0.3);

\path[draw,thick,-] (6+0,0) -- (6-1,-1);
\path[draw,thick,-] (6+0,0) -- (6+1,-1) -- (6+1,-2);
\draw[fill=white] (6+0,0) circle (0.3);
\draw[fill=white] (6-1,-1) circle (0.3);
\draw[fill=white] (6+1,-1) circle (0.3);
\draw[fill=white] (6+1,-2) circle (0.3);

\path[draw,thick,-] (9+0,0) -- (9+1,-1);
\path[draw,thick,-] (9+0,0) -- (9-1,-1) -- (9-1,-2);
\draw[fill=white] (9+0,0) circle (0.3);
\draw[fill=white] (9+1,-1) circle (0.3);
\draw[fill=white] (9-1,-1) circle (0.3);
\draw[fill=white] (9-1,-2) circle (0.3);

\path[draw,thick,-] (12+0,0) -- (12+0,-1) -- (12-1,-2);
\path[draw,thick,-] (12+0,0) -- (12+0,-1) -- (12+1,-2);
\draw[fill=white] (12+0,0) circle (0.3);
\draw[fill=white] (12+0,-1) circle (0.3);
\draw[fill=white] (12-1,-2) circle (0.3);
\draw[fill=white] (12+1,-2) circle (0.3);
\end{tikzpicture}
\end{center}

\section{Inclusion-exclusion}

\index{inclusion-exclusion}

\key{Inclusion-exclusion} is a technique
that can be used for counting the size
of a union of sets when the sizes of
the intersections are known, and vice versa.
A simple example of the technique is the formula
\[ |A \cup B| = |A| + |B| - |A \cap B|,\]
where $A$ and $B$ are sets and $|X|$
denotes the size of $X$.
The formula can be illustrated as follows:

\begin{center}
\begin{tikzpicture}[scale=0.8]
\draw (0,0) circle (1.5);
\draw (1.5,0) circle (1.5);

\node at (-0.75,0) {\small $A$};
\node at (2.25,0) {\small $B$};
\node at (0.75,0) {\small $A \cap B$};
\end{tikzpicture}
\end{center}

Our goal is to calculate
the size of the union $A \cup B$
that corresponds to the area of the region
that belongs to at least one circle.
The picture shows that we can calculate
the area of $A \cup B$ by first summing the
areas of $A$ and $B$ and then subtracting
the area of $A \cap B$.

The same idea can be applied when the number
of sets is larger.
When there are three sets, the inclusion-exclusion formula is
\[ |A \cup B \cup C| = |A| + |B| + |C| - |A \cap B| - |A \cap C| - |B \cap C| + |A \cap B \cap C| \]
and the corresponding picture is

\begin{center}
\begin{tikzpicture}[scale=0.8]
\draw (0,0) circle (1.75);
\draw (2,0) circle (1.75);
\draw (1,1.5) circle (1.75);

\node at (-0.75,-0.25) {\small $A$};
\node at (2.75,-0.25) {\small $B$};
\node at (1,2.5) {\small $C$};
\node at (1,-0.5) {\small $A \cap B$};
\node at (0,1.25) {\small $A \cap C$};
\node at (2,1.25) {\small $B \cap C$};
\node at (1,0.5) {\scriptsize $A \cap B \cap C$};
\end{tikzpicture}
\end{center}

In the general case, the size of the
union $X_1 \cup X_2 \cup \cdots \cup X_n$
can be calculated by going through all possible
intersections that contain some of the sets $X_1,X_2,\ldots,X_n$.
If the intersection contains an odd number of sets,
its size is added to the answer,
and otherwise its size is subtracted from the answer.

Note that there are similar formulas
for calculating
the size of an intersection from the sizes of
unions. For example,
\[ |A \cap B| = |A| + |B| - |A \cup B|\]
and
\[ |A \cap B \cap C| = |A| + |B| + |C| - |A \cup B| - |A \cup C| - |B \cup C| + |A \cup B \cup C| .\]
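The general rule can be implemented by iterating over all nonempty subsets of the sets, adding or subtracting each intersection size depending on the parity of the subset. A minimal sketch, where each set is represented as a bitmask of its elements (this representation and the name \texttt{union\_size} are our own choices for illustration):

```cpp
#include <cassert>
#include <vector>
using namespace std;

// Size of the union of sets (each given as a bitmask of elements),
// computed from intersection sizes with inclusion-exclusion.
int union_size(const vector<unsigned>& sets) {
    int n = sets.size(), total = 0;
    for (int mask = 1; mask < (1 << n); mask++) {
        unsigned inter = ~0u;                  // intersection of chosen sets
        for (int i = 0; i < n; i++)
            if (mask & (1 << i)) inter &= sets[i];
        int sign = __builtin_popcount(mask) % 2 ? 1 : -1;
        total += sign * __builtin_popcount(inter);
    }
    return total;
}
```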

\subsubsection{Derangements}

\index{derangement}

As an example, let us count the number of \key{derangements}
of elements $\{1,2,\ldots,n\}$, i.e., permutations
where no element remains in its original place.
For example, when $n=3$, there are
two derangements: $(2,3,1)$ and $(3,1,2)$.

One approach for solving the problem is to use
inclusion-exclusion.
Let $X_k$ be the set of permutations
that contain the element $k$ at position $k$.
For example, when $n=3$, the sets are as follows:
\[
\begin{array}{lcl}
X_1 & = & \{(1,2,3),(1,3,2)\} \\
X_2 & = & \{(1,2,3),(3,2,1)\} \\
X_3 & = & \{(1,2,3),(2,1,3)\} \\
\end{array}
\]
Using these sets, the number of derangements equals
\[ n! - |X_1 \cup X_2 \cup \cdots \cup X_n|, \]
so it suffices to calculate the size of the union.
Using inclusion-exclusion, this reduces to
calculating sizes of intersections, which can be
done efficiently.
For example, when $n=3$, the size of
$X_1 \cup X_2 \cup X_3$ is
\[
\begin{array}{lcl}
 & & |X_1| + |X_2| + |X_3| - |X_1 \cap X_2| - |X_1 \cap X_3| - |X_2 \cap X_3| + |X_1 \cap X_2 \cap X_3| \\
 & = & 2+2+2-1-1-1+1 \\
 & = & 4, \\
\end{array}
\]
so the number of derangements is $3!-4=2$.

It turns out that the problem can also be solved
without using inclusion-exclusion.
Let $f(n)$ denote the number of derangements
for $\{1,2,\ldots,n\}$. We can use the following
recursive formula:

\begin{equation*}
f(n) = \begin{cases}
0 & n = 1\\
1 & n = 2\\
(n-1)(f(n-2) + f(n-1)) & n>2 \\
\end{cases}
\end{equation*}

The formula can be derived by considering
the possibilities for how the element 1 changes
in the derangement.
There are $n-1$ ways to choose an element $x$
that replaces the element 1.
In each such choice, there are two options:

\textit{Option 1:} We also replace the element $x$
with the element 1.
After this, the remaining task is to construct
a derangement of $n-2$ elements.

\textit{Option 2:} We replace the element $x$
with some other element than 1.
Now we have to construct a derangement
of $n-1$ elements, because we cannot replace
the element $x$ with the element $1$, and all other
elements must be changed.
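The recurrence is easy to check against a brute-force count over all permutations. A sketch (both function names are ours; the brute force is only feasible for small $n$):

```cpp
#include <cassert>
#include <vector>
#include <algorithm>
#include <numeric>
using namespace std;

// f(n) via the recurrence f(n) = (n-1)(f(n-2) + f(n-1)).
long long derangements(int n) {
    if (n == 1) return 0;
    if (n == 2) return 1;
    long long a = 0, b = 1;               // f(n-2) and f(n-1)
    for (int i = 3; i <= n; i++) {
        long long c = (long long)(i - 1) * (a + b);
        a = b; b = c;
    }
    return b;
}

// Brute force: count permutations with no fixed point.
long long derangements_brute(int n) {
    vector<int> p(n);
    iota(p.begin(), p.end(), 0);
    long long cnt = 0;
    do {
        bool ok = true;
        for (int i = 0; i < n; i++)
            if (p[i] == i) ok = false;
        if (ok) cnt++;
    } while (next_permutation(p.begin(), p.end()));
    return cnt;
}
```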

\section{Burnside's lemma}

\index{Burnside's lemma}

\key{Burnside's lemma}
%\footnote{Actually, Burnside did not discover this lemma; he only mentioned it in his book \cite{bur97}.}
can be used to count
the number of combinations so that
only one representative is counted
for each group of symmetric combinations.
Burnside's lemma states that the number of
combinations is
\[\sum_{k=1}^n \frac{c(k)}{n},\]
where there are $n$ ways to change the
position of a combination,
and there are $c(k)$ combinations that
remain unchanged when the $k$th way is applied.

As an example, let us calculate the number of
necklaces of $n$ pearls,
where each pearl has $m$ possible colors.
Two necklaces are symmetric if they are
similar after rotating them.
For example, the necklace
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw[fill=white] (0,0) circle (1);
\draw[fill=red] (0,1) circle (0.3);
\draw[fill=blue] (1,0) circle (0.3);
\draw[fill=red] (0,-1) circle (0.3);
\draw[fill=green] (-1,0) circle (0.3);
\end{tikzpicture}
\end{center}
has the following symmetric necklaces:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw[fill=white] (0,0) circle (1);
\draw[fill=red] (0,1) circle (0.3);
\draw[fill=blue] (1,0) circle (0.3);
\draw[fill=red] (0,-1) circle (0.3);
\draw[fill=green] (-1,0) circle (0.3);

\draw[fill=white] (4,0) circle (1);
\draw[fill=green] (4+0,1) circle (0.3);
\draw[fill=red] (4+1,0) circle (0.3);
\draw[fill=blue] (4+0,-1) circle (0.3);
\draw[fill=red] (4+-1,0) circle (0.3);

\draw[fill=white] (8,0) circle (1);
\draw[fill=red] (8+0,1) circle (0.3);
\draw[fill=green] (8+1,0) circle (0.3);
\draw[fill=red] (8+0,-1) circle (0.3);
\draw[fill=blue] (8+-1,0) circle (0.3);

\draw[fill=white] (12,0) circle (1);
\draw[fill=blue] (12+0,1) circle (0.3);
\draw[fill=red] (12+1,0) circle (0.3);
\draw[fill=green] (12+0,-1) circle (0.3);
\draw[fill=red] (12+-1,0) circle (0.3);
\end{tikzpicture}
\end{center}
There are $n$ ways to change the position
of a necklace,
because we can rotate it
$0,1,\ldots,n-1$ steps clockwise.
If the number of steps is 0,
all $m^n$ necklaces remain the same,
and if the number of steps is 1,
only the $m$ necklaces where each
pearl has the same color remain the same.

More generally, when the number of steps is $k$,
a total of
\[m^{\textrm{gcd}(k,n)}\]
necklaces remain the same,
where $\textrm{gcd}(k,n)$ is the greatest common
divisor of $k$ and $n$.
The reason for this is that blocks
of pearls of size $\textrm{gcd}(k,n)$
will replace each other.
Thus, according to Burnside's lemma,
the number of necklaces is
\[\sum_{i=0}^{n-1} \frac{m^{\textrm{gcd}(i,n)}}{n}. \]
For example, the number of necklaces of length 4
with 3 colors is
\[\frac{3^4+3+3^2+3}{4} = 24. \]
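The necklace formula can be implemented directly (the function name \texttt{necklaces} is ours; \texttt{std::gcd} requires C++17):

```cpp
#include <cassert>
#include <numeric>   // std::gcd

// Number of rotation-distinct necklaces of n pearls with m colors,
// by Burnside's lemma: average of m^gcd(k, n) over the n rotations.
long long necklaces(int n, long long m) {
    long long total = 0;
    for (int k = 0; k < n; k++) {
        int g = std::gcd(k, n);          // note gcd(0, n) = n
        long long fixed = 1;
        for (int i = 0; i < g; i++) fixed *= m;
        total += fixed;                  // necklaces unchanged by rotation k
    }
    return total / n;
}
```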

\section{Cayley's formula}

\index{Cayley's formula}

\key{Cayley's formula}
% \footnote{While the formula is named after A. Cayley,
% who studied it in 1889, it was discovered earlier by C. W. Borchardt in 1860.}
states that
there are $n^{n-2}$ labeled trees
that contain $n$ nodes.
The nodes are labeled $1,2,\ldots,n$,
and two trees are different
if either their structure or
labeling is different.

\begin{samepage}
For example, when $n=4$, the number of labeled
trees is $4^{4-2}=16$:

\begin{center}
\begin{tikzpicture}[scale=0.8]
\footnotesize

\newcommand\puua[6]{
\path[draw,thick,-] (#1,#2) -- (#1-1.25,#2-1.5);
\path[draw,thick,-] (#1,#2) -- (#1,#2-1.5);
\path[draw,thick,-] (#1,#2) -- (#1+1.25,#2-1.5);
\node[draw, circle, fill=white] at (#1,#2) {#3};
\node[draw, circle, fill=white] at (#1-1.25,#2-1.5) {#4};
\node[draw, circle, fill=white] at (#1,#2-1.5) {#5};
\node[draw, circle, fill=white] at (#1+1.25,#2-1.5) {#6};
}
\newcommand\puub[6]{
\path[draw,thick,-] (#1,#2) -- (#1+1,#2);
\path[draw,thick,-] (#1+1,#2) -- (#1+2,#2);
\path[draw,thick,-] (#1+2,#2) -- (#1+3,#2);
\node[draw, circle, fill=white] at (#1,#2) {#3};
\node[draw, circle, fill=white] at (#1+1,#2) {#4};
\node[draw, circle, fill=white] at (#1+2,#2) {#5};
\node[draw, circle, fill=white] at (#1+3,#2) {#6};
}

\puua{0}{0}{1}{2}{3}{4}
\puua{4}{0}{2}{1}{3}{4}
\puua{8}{0}{3}{1}{2}{4}
\puua{12}{0}{4}{1}{2}{3}

\puub{0}{-3}{1}{2}{3}{4}
\puub{4.5}{-3}{1}{2}{4}{3}
\puub{9}{-3}{1}{3}{2}{4}
\puub{0}{-4.5}{1}{3}{4}{2}
\puub{4.5}{-4.5}{1}{4}{2}{3}
\puub{9}{-4.5}{1}{4}{3}{2}
\puub{0}{-6}{2}{1}{3}{4}
\puub{4.5}{-6}{2}{1}{4}{3}
\puub{9}{-6}{2}{3}{1}{4}
\puub{0}{-7.5}{2}{4}{1}{3}
\puub{4.5}{-7.5}{3}{1}{2}{4}
\puub{9}{-7.5}{3}{2}{1}{4}
\end{tikzpicture}
\end{center}
\end{samepage}

Next we will see how Cayley's formula can
be derived using Prüfer codes.

\subsubsection{Prüfer code}

\index{Prüfer code}

A \key{Prüfer code}
%\footnote{In 1918, H. Prüfer proved Cayley's theorem using Prüfer codes \cite{pru18}.}
is a sequence of
$n-2$ numbers that describes a labeled tree.
The code is constructed by following a process
that removes $n-2$ leaves from the tree.
At each step, the leaf with the smallest label is removed,
and the label of its only neighbor is added to the code.

For example, let us calculate the Prüfer code
of the following tree:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (2,3) {$1$};
\node[draw, circle] (2) at (4,3) {$2$};
\node[draw, circle] (3) at (2,1) {$3$};
\node[draw, circle] (4) at (4,1) {$4$};
\node[draw, circle] (5) at (5.5,2) {$5$};

\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (2) -- (4);
\path[draw,thick,-] (2) -- (5);
\end{tikzpicture}
\end{center}

First we remove node 1 and add node 4 to the code:
\begin{center}
\begin{tikzpicture}[scale=0.9]
%\node[draw, circle] (1) at (2,3) {$1$};
\node[draw, circle] (2) at (4,3) {$2$};
\node[draw, circle] (3) at (2,1) {$3$};
\node[draw, circle] (4) at (4,1) {$4$};
\node[draw, circle] (5) at (5.5,2) {$5$};

%\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (2) -- (4);
\path[draw,thick,-] (2) -- (5);
\end{tikzpicture}
\end{center}

Then we remove node 3 and add node 4 to the code:
\begin{center}
\begin{tikzpicture}[scale=0.9]
%\node[draw, circle] (1) at (2,3) {$1$};
\node[draw, circle] (2) at (4,3) {$2$};
%\node[draw, circle] (3) at (2,1) {$3$};
\node[draw, circle] (4) at (4,1) {$4$};
\node[draw, circle] (5) at (5.5,2) {$5$};

%\path[draw,thick,-] (1) -- (4);
%\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (2) -- (4);
\path[draw,thick,-] (2) -- (5);
\end{tikzpicture}
\end{center}

Finally we remove node 4 and add node 2 to the code:
\begin{center}
\begin{tikzpicture}[scale=0.9]
%\node[draw, circle] (1) at (2,3) {$1$};
\node[draw, circle] (2) at (4,3) {$2$};
%\node[draw, circle] (3) at (2,1) {$3$};
%\node[draw, circle] (4) at (4,1) {$4$};
\node[draw, circle] (5) at (5.5,2) {$5$};

%\path[draw,thick,-] (1) -- (4);
%\path[draw,thick,-] (3) -- (4);
%\path[draw,thick,-] (2) -- (4);
\path[draw,thick,-] (2) -- (5);
\end{tikzpicture}
\end{center}

Thus, the Prüfer code of the tree is $[4,4,2]$.

We can construct a Prüfer code for any tree,
and more importantly,
the original tree can be reconstructed
from a Prüfer code.
Hence, the number of labeled trees
of $n$ nodes equals $n^{n-2}$,
the number of sequences of $n-2$ numbers,
each between $1$ and $n$.
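The removal process above can be sketched as follows; the function takes the tree with nodes $1,\ldots,n$ as an edge list (this interface and the name \texttt{prufer} are our own choices). A set of current leaves gives the smallest-labeled leaf at each step:

```cpp
#include <cassert>
#include <vector>
#include <set>
using namespace std;

// Prüfer code of a tree: repeatedly remove the leaf with the smallest
// label and record the label of its only remaining neighbor.
vector<int> prufer(int n, vector<pair<int,int>> edges) {
    vector<multiset<int>> adj(n + 1);
    for (auto [a, b] : edges) { adj[a].insert(b); adj[b].insert(a); }
    set<int> leaves;
    for (int i = 1; i <= n; i++)
        if (adj[i].size() == 1) leaves.insert(i);
    vector<int> code;
    for (int step = 0; step < n - 2; step++) {
        int leaf = *leaves.begin();
        leaves.erase(leaves.begin());
        int nb = *adj[leaf].begin();         // the leaf's only neighbor
        adj[nb].erase(adj[nb].find(leaf));
        code.push_back(nb);
        if (adj[nb].size() == 1) leaves.insert(nb);
    }
    return code;
}
```

For the example tree above, \texttt{prufer(5, \{\{1,4\},\{3,4\},\{2,4\},\{2,5\}\})} produces the code $[4,4,2]$.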
\chapter{Matrices}

\index{matrix}

A \key{matrix} is a mathematical concept
that corresponds to a two-dimensional array
in programming. For example,
\[
A =
\begin{bmatrix}
6 & 13 & 7 & 4 \\
7 & 0 & 8 & 2 \\
9 & 5 & 4 & 18 \\
\end{bmatrix}
\]
is a matrix of size $3 \times 4$, i.e.,
it has 3 rows and 4 columns.
The notation $[i,j]$ refers to
the element in row $i$ and column $j$
in a matrix.
For example, in the above matrix,
$A[2,3]=8$ and $A[3,1]=9$.

\index{vector}

A special case of a matrix is a \key{vector}
that is a one-dimensional matrix of size $n \times 1$.
For example,
\[
V =
\begin{bmatrix}
4 \\
7 \\
5 \\
\end{bmatrix}
\]
is a vector that contains three elements.

\index{transpose}

The \key{transpose} $A^T$ of a matrix $A$
is obtained when the rows and columns of $A$
are swapped, i.e., $A^T[i,j]=A[j,i]$:
\[
A^T =
\begin{bmatrix}
6 & 7 & 9 \\
13 & 0 & 5 \\
7 & 8 & 4 \\
4 & 2 & 18 \\
\end{bmatrix}
\]

\index{square matrix}

A matrix is a \key{square matrix} if it
has the same number of rows and columns.
For example, the following matrix is a
square matrix:

\[
S =
\begin{bmatrix}
3 & 12 & 4 \\
5 & 9 & 15 \\
0 & 2 & 4 \\
\end{bmatrix}
\]

\section{Operations}

The sum $A+B$ of matrices $A$ and $B$
is defined if the matrices are of the same size.
The result is a matrix where each element
is the sum of the corresponding elements
in $A$ and $B$.

For example,
\[
\begin{bmatrix}
6 & 1 & 4 \\
3 & 9 & 2 \\
\end{bmatrix}
+
\begin{bmatrix}
4 & 9 & 3 \\
8 & 1 & 3 \\
\end{bmatrix}
=
\begin{bmatrix}
6+4 & 1+9 & 4+3 \\
3+8 & 9+1 & 2+3 \\
\end{bmatrix}
=
\begin{bmatrix}
10 & 10 & 7 \\
11 & 10 & 5 \\
\end{bmatrix}.
\]

Multiplying a matrix $A$ by a value $x$ means
that each element of $A$ is multiplied by $x$.
For example,
\[
2 \cdot \begin{bmatrix}
6 & 1 & 4 \\
3 & 9 & 2 \\
\end{bmatrix}
=
\begin{bmatrix}
2 \cdot 6 & 2\cdot1 & 2\cdot4 \\
2\cdot3 & 2\cdot9 & 2\cdot2 \\
\end{bmatrix}
=
\begin{bmatrix}
12 & 2 & 8 \\
6 & 18 & 4 \\
\end{bmatrix}.
\]

\subsubsection{Matrix multiplication}

\index{matrix multiplication}

The product $AB$ of matrices $A$ and $B$
is defined if $A$ is of size $a \times n$
and $B$ is of size $n \times b$, i.e.,
the width of $A$ equals the height of $B$.
The result is a matrix of size $a \times b$
whose elements are calculated using the formula
\[
AB[i,j] = \sum_{k=1}^n A[i,k] \cdot B[k,j].
\]

The idea is that each element of $AB$
is a sum of products of elements of $A$ and $B$
according to the following picture:

\begin{center}
\begin{tikzpicture}[scale=0.5]
\draw (0,0) grid (4,3);
\draw (5,0) grid (10,3);
\draw (5,4) grid (10,8);

\node at (2,-1) {$A$};
\node at (7.5,-1) {$AB$};
\node at (11,6) {$B$};

\draw[thick,->,red,line width=2pt] (0,1.5) -- (4,1.5);
\draw[thick,->,red,line width=2pt] (6.5,8) -- (6.5,4);
\draw[thick,red,line width=2pt] (6.5,1.5) circle (0.4);
\end{tikzpicture}
\end{center}

For example,

\[
\begin{bmatrix}
1 & 4 \\
3 & 9 \\
8 & 6 \\
\end{bmatrix}
\cdot
\begin{bmatrix}
1 & 6 \\
2 & 9 \\
\end{bmatrix}
=
\begin{bmatrix}
1 \cdot 1 + 4 \cdot 2 & 1 \cdot 6 + 4 \cdot 9 \\
3 \cdot 1 + 9 \cdot 2 & 3 \cdot 6 + 9 \cdot 9 \\
8 \cdot 1 + 6 \cdot 2 & 8 \cdot 6 + 6 \cdot 9 \\
\end{bmatrix}
=
\begin{bmatrix}
9 & 42 \\
21 & 99 \\
20 & 102 \\
\end{bmatrix}.
\]

Matrix multiplication is associative,
so $A(BC)=(AB)C$ holds,
but it is not commutative,
so $AB = BA$ does not usually hold.
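The defining formula gives a straightforward $O(anb)$ implementation (the type alias \texttt{Mat} and the function name \texttt{mul} are our own choices):

```cpp
#include <cassert>
#include <vector>
using namespace std;
typedef vector<vector<long long>> Mat;

// Product of an a*n matrix A and an n*b matrix B.
Mat mul(const Mat& A, const Mat& B) {
    int a = A.size(), n = B.size(), b = B[0].size();
    Mat C(a, vector<long long>(b, 0));
    for (int i = 0; i < a; i++)
        for (int k = 0; k < n; k++)
            for (int j = 0; j < b; j++)
                C[i][j] += A[i][k] * B[k][j];
    return C;
}
```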

\index{identity matrix}

An \key{identity matrix} is a square matrix
where each element on the diagonal is 1
and all other elements are 0.
For example, the following matrix
is the $3 \times 3$ identity matrix:
\[
 I = \begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1 \\
\end{bmatrix}
\]

\begin{samepage}
Multiplying a matrix by an identity matrix
does not change it. For example,
\[
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1 \\
\end{bmatrix}
\cdot
\begin{bmatrix}
1 & 4 \\
3 & 9 \\
8 & 6 \\
\end{bmatrix}
=
\begin{bmatrix}
1 & 4 \\
3 & 9 \\
8 & 6 \\
\end{bmatrix} \hspace{10px} \textrm{and} \hspace{10px}
\begin{bmatrix}
1 & 4 \\
3 & 9 \\
8 & 6 \\
\end{bmatrix}
\cdot
\begin{bmatrix}
1 & 0 \\
0 & 1 \\
\end{bmatrix}
=
\begin{bmatrix}
1 & 4 \\
3 & 9 \\
8 & 6 \\
\end{bmatrix}.
\]
\end{samepage}

Using a straightforward algorithm,
we can calculate the product of
two $n \times n$ matrices
in $O(n^3)$ time.
There are also more efficient algorithms
for matrix multiplication\footnote{The first such
algorithm was Strassen's algorithm,
published in 1969 \cite{str69},
whose time complexity is $O(n^{2.80735})$;
the best current algorithm \cite{gal14}
works in $O(n^{2.37286})$ time.},
but they are mostly of theoretical interest
and such algorithms are not necessary
in competitive programming.

\subsubsection{Matrix power}

\index{matrix power}

The power $A^k$ of a matrix $A$ is defined
if $A$ is a square matrix.
The definition is based on matrix multiplication:
\[ A^k = \underbrace{A \cdot A \cdot A \cdots A}_{\textrm{$k$ times}} \]
For example,

\[
\begin{bmatrix}
2 & 5 \\
1 & 4 \\
\end{bmatrix}^3 =
\begin{bmatrix}
2 & 5 \\
1 & 4 \\
\end{bmatrix} \cdot
\begin{bmatrix}
2 & 5 \\
1 & 4 \\
\end{bmatrix} \cdot
\begin{bmatrix}
2 & 5 \\
1 & 4 \\
\end{bmatrix} =
\begin{bmatrix}
48 & 165 \\
33 & 114 \\
\end{bmatrix}.
\]
In addition, $A^0$ is an identity matrix. For example,
\[
\begin{bmatrix}
2 & 5 \\
1 & 4 \\
\end{bmatrix}^0 =
\begin{bmatrix}
1 & 0 \\
0 & 1 \\
\end{bmatrix}.
\]

The matrix $A^k$ can be efficiently calculated
in $O(n^3 \log k)$ time using the
algorithm in Chapter 21.2. For example,
\[
\begin{bmatrix}
2 & 5 \\
1 & 4 \\
\end{bmatrix}^8 =
\begin{bmatrix}
2 & 5 \\
1 & 4 \\
\end{bmatrix}^4 \cdot
\begin{bmatrix}
2 & 5 \\
1 & 4 \\
\end{bmatrix}^4.
\]
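The repeated-squaring idea used above can be sketched for matrices as follows (the names \texttt{Mat}, \texttt{mul} and \texttt{mpow} are our own choices):

```cpp
#include <cassert>
#include <vector>
using namespace std;
typedef vector<vector<long long>> Mat;

// Standard O(n^3) product of two n*n matrices.
Mat mul(const Mat& A, const Mat& B) {
    int n = A.size();
    Mat C(n, vector<long long>(n, 0));
    for (int i = 0; i < n; i++)
        for (int k = 0; k < n; k++)
            for (int j = 0; j < n; j++)
                C[i][j] += A[i][k] * B[k][j];
    return C;
}

// A^k by repeated squaring: O(n^3 log k) time.
Mat mpow(Mat A, long long k) {
    int n = A.size();
    Mat R(n, vector<long long>(n, 0));
    for (int i = 0; i < n; i++) R[i][i] = 1;   // R = identity = A^0
    while (k > 0) {
        if (k & 1) R = mul(R, A);
        A = mul(A, A);
        k >>= 1;
    }
    return R;
}
```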

\subsubsection{Determinant}

\index{determinant}

The \key{determinant} $\det(A)$ of a matrix $A$
is defined if $A$ is a square matrix.
If $A$ is of size $1 \times 1$,
then $\det(A)=A[1,1]$.
The determinant of a larger matrix is
calculated recursively using the formula \index{cofactor}
\[\det(A)=\sum_{j=1}^n A[1,j] C[1,j],\]
where $C[i,j]$ is the \key{cofactor} of $A$
at $[i,j]$.
The cofactor is calculated using the formula
\[C[i,j] = (-1)^{i+j} \det(M[i,j]),\]
where $M[i,j]$ is obtained by removing
row $i$ and column $j$ from $A$.
Due to the coefficient $(-1)^{i+j}$ in the cofactor,
the signs of the terms in the sum alternate
between positive and negative.
For example,
\[
\det(
\begin{bmatrix}
3 & 4 \\
1 & 6 \\
\end{bmatrix}
) = 3 \cdot 6 - 4 \cdot 1 = 14
\]
and
\[
\det(
\begin{bmatrix}
2 & 4 & 3 \\
5 & 1 & 6 \\
7 & 2 & 4 \\
\end{bmatrix}
) =
2 \cdot
\det(
\begin{bmatrix}
1 & 6 \\
2 & 4 \\
\end{bmatrix}
)
-4 \cdot
\det(
\begin{bmatrix}
5 & 6 \\
7 & 4 \\
\end{bmatrix}
)
+3 \cdot
\det(
\begin{bmatrix}
5 & 1 \\
7 & 2 \\
\end{bmatrix}
) = 81.
\]
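The recursive definition can be implemented directly by cofactor expansion along the first row. A sketch (the name \texttt{det} is ours; this runs in $O(n!)$ time, which is fine for the tiny example matrices but not for large ones):

```cpp
#include <cassert>
#include <vector>
using namespace std;
typedef vector<vector<long long>> Mat;

// det(A) by cofactor expansion along the first row.
long long det(const Mat& A) {
    int n = A.size();
    if (n == 1) return A[0][0];
    long long d = 0, sign = 1;
    for (int j = 0; j < n; j++) {
        Mat M(n - 1, vector<long long>(n - 1));  // remove row 0, column j
        for (int i = 1; i < n; i++)
            for (int k = 0, c = 0; k < n; k++)
                if (k != j) M[i - 1][c++] = A[i][k];
        d += sign * A[0][j] * det(M);
        sign = -sign;                             // (-1)^(i+j) alternation
    }
    return d;
}
```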

\index{inverse matrix}

The determinant of $A$ tells us
whether there is an \key{inverse matrix}
$A^{-1}$ such that $A \cdot A^{-1} = I$,
where $I$ is an identity matrix.
It turns out that $A^{-1}$ exists
exactly when $\det(A) \neq 0$,
and it can be calculated using the formula

\[A^{-1}[i,j] = \frac{C[j,i]}{\det(A)}.\]

For example,

\[
\underbrace{
 \begin{bmatrix}
2 & 4 & 3\\
5 & 1 & 6\\
7 & 2 & 4\\
\end{bmatrix}
}_{A}
\cdot
\underbrace{
 \frac{1}{81}
 \begin{bmatrix}
-8 & -10 & 21 \\
22 & -13 & 3 \\
3 & 24 & -18 \\
\end{bmatrix}
}_{A^{-1}}
=
\underbrace{
 \begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1 \\
\end{bmatrix}
}_{I}.
\]

\section{Linear recurrences}

\index{linear recurrence}

A \key{linear recurrence}
is a function $f(n)$
whose initial values are
$f(0),f(1),\ldots,f(k-1)$
and larger values
are calculated recursively using the formula
\[f(n) = c_1 f(n-1) + c_2 f(n-2) + \ldots + c_k f (n-k),\]
where $c_1,c_2,\ldots,c_k$ are constant coefficients.

Dynamic programming can be used to calculate
any value of $f(n)$ in $O(kn)$ time by calculating
all values of $f(0),f(1),\ldots,f(n)$ one after another.
However, if $k$ is small, it is possible to calculate
$f(n)$ much more efficiently in $O(k^3 \log n)$
time using matrix operations.

\subsubsection{Fibonacci numbers}

\index{Fibonacci number}

A simple example of a linear recurrence is the
following function that defines the Fibonacci numbers:
\[
\begin{array}{lcl}
f(0) & = & 0 \\
f(1) & = & 1 \\
f(n) & = & f(n-1)+f(n-2) \\
\end{array}
\]
In this case, $k=2$ and $c_1=c_2=1$.

\begin{samepage}
To efficiently calculate Fibonacci numbers,
we represent the
Fibonacci formula as a
square matrix $X$ of size $2 \times 2$,
for which the following holds:
\[ X \cdot
\begin{bmatrix}
f(i) \\
f(i+1) \\
\end{bmatrix}
=
\begin{bmatrix}
f(i+1) \\
f(i+2) \\
\end{bmatrix}
\]
Thus, values $f(i)$ and $f(i+1)$ are given as
``input'' for $X$,
and $X$ calculates values $f(i+1)$ and $f(i+2)$
from them.
It turns out that such a matrix is

\[ X =
\begin{bmatrix}
0 & 1 \\
1 & 1 \\
\end{bmatrix}.
\]
\end{samepage}
\noindent
For example,
\[
\begin{bmatrix}
0 & 1 \\
1 & 1 \\
\end{bmatrix}
\cdot
\begin{bmatrix}
f(5) \\
f(6) \\
\end{bmatrix}
=
\begin{bmatrix}
0 & 1 \\
1 & 1 \\
\end{bmatrix}
\cdot
\begin{bmatrix}
5 \\
8 \\
\end{bmatrix}
=
\begin{bmatrix}
8 \\
13 \\
\end{bmatrix}
=
\begin{bmatrix}
f(6) \\
f(7) \\
\end{bmatrix}.
\]
Thus, we can calculate $f(n)$ using the formula
\[
\begin{bmatrix}
f(n) \\
f(n+1) \\
\end{bmatrix}
=
X^n \cdot
\begin{bmatrix}
f(0) \\
f(1) \\
\end{bmatrix}
=
\begin{bmatrix}
0 & 1 \\
1 & 1 \\
\end{bmatrix}^n
\cdot
\begin{bmatrix}
0 \\
1 \\
\end{bmatrix}.
\]
The value of $X^n$ can be calculated in
$O(\log n)$ time,
so the value of $f(n)$ can also be calculated
in $O(\log n)$ time.
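Combining the matrix $X$ with repeated squaring gives a logarithmic-time Fibonacci routine. A sketch (the names \texttt{mul}, \texttt{mpow} and \texttt{fib} are our own choices):

```cpp
#include <cassert>
#include <vector>
using namespace std;
typedef vector<vector<long long>> Mat;

// Product of two square matrices.
Mat mul(const Mat& A, const Mat& B) {
    int n = A.size();
    Mat C(n, vector<long long>(n, 0));
    for (int i = 0; i < n; i++)
        for (int k = 0; k < n; k++)
            for (int j = 0; j < n; j++)
                C[i][j] += A[i][k] * B[k][j];
    return C;
}

// A^k by repeated squaring.
Mat mpow(Mat A, long long k) {
    int n = A.size();
    Mat R(n, vector<long long>(n, 0));
    for (int i = 0; i < n; i++) R[i][i] = 1;
    while (k > 0) {
        if (k & 1) R = mul(R, A);
        A = mul(A, A);
        k >>= 1;
    }
    return R;
}

// First component of X^n * (f(0), f(1))^T = (f(n), f(n+1))^T,
// with (f(0), f(1)) = (0, 1); this simplifies to the [0][1] entry.
long long fib(long long n) {
    Mat R = mpow({{0, 1}, {1, 1}}, n);
    return R[0][1];
}
```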

\subsubsection{General case}

Let us now consider the general case where
$f(n)$ is any linear recurrence.
Again, our goal is to construct a matrix $X$
for which

\[ X \cdot
\begin{bmatrix}
f(i) \\
f(i+1) \\
\vdots \\
f(i+k-1) \\
\end{bmatrix}
=
\begin{bmatrix}
f(i+1) \\
f(i+2) \\
\vdots \\
f(i+k) \\
\end{bmatrix}.
\]
Such a matrix is
\[
X =
\begin{bmatrix}
0 & 1 & 0 & 0 & \cdots & 0 \\
0 & 0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & 0 & \cdots & 1 \\
c_k & c_{k-1} & c_{k-2} & c_{k-3} & \cdots & c_1 \\
\end{bmatrix}.
\]
In the first $k-1$ rows, each element is 0
except that one element is 1.
These rows replace $f(i)$ with $f(i+1)$,
$f(i+1)$ with $f(i+2)$, and so on.
The last row contains the coefficients of the recurrence
to calculate the new value $f(i+k)$.

\begin{samepage}
Now, $f(n)$ can be calculated in
$O(k^3 \log n)$ time using the formula
\[
\begin{bmatrix}
f(n) \\
f(n+1) \\
\vdots \\
f(n+k-1) \\
\end{bmatrix}
=
X^n \cdot
\begin{bmatrix}
f(0) \\
f(1) \\
\vdots \\
f(k-1) \\
\end{bmatrix}.
\]
\end{samepage}
|
||||
|
||||
\section{Graphs and matrices}

\subsubsection{Counting paths}

The powers of an adjacency matrix of a graph
have an interesting property.
When $V$ is an adjacency matrix of an unweighted graph,
the matrix $V^n$ contains the numbers of paths of
$n$ edges between the nodes in the graph.

For example, for the graph
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (1,1) {$4$};
\node[draw, circle] (3) at (3,3) {$2$};
\node[draw, circle] (4) at (5,3) {$3$};
\node[draw, circle] (5) at (3,1) {$5$};
\node[draw, circle] (6) at (5,1) {$6$};

\path[draw,thick,->,>=latex] (1) -- (2);
\path[draw,thick,->,>=latex] (2) -- (3);
\path[draw,thick,->,>=latex] (3) -- (1);
\path[draw,thick,->,>=latex] (4) -- (3);
\path[draw,thick,->,>=latex] (3) -- (5);
\path[draw,thick,->,>=latex] (3) -- (6);
\path[draw,thick,->,>=latex] (6) -- (4);
\path[draw,thick,->,>=latex] (6) -- (5);
\end{tikzpicture}
\end{center}
the adjacency matrix is
\[
V= \begin{bmatrix}
0 & 0 & 0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 & 1 & 1 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 1 & 0 \\
\end{bmatrix}.
\]
Now, for example, the matrix
\[
V^4= \begin{bmatrix}
0 & 0 & 1 & 1 & 1 & 0 \\
2 & 0 & 0 & 0 & 2 & 2 \\
0 & 2 & 0 & 0 & 0 & 0 \\
0 & 2 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 1 & 1 & 0 \\
\end{bmatrix}
\]
contains the numbers of paths of 4 edges
between the nodes.
For example, $V^4[2,5]=2$,
because there are two paths of 4 edges
from node 2 to node 5:
$2 \rightarrow 1 \rightarrow 4 \rightarrow 2 \rightarrow 5$
and
$2 \rightarrow 6 \rightarrow 3 \rightarrow 2 \rightarrow 5$.

\subsubsection{Shortest paths}

Using a similar idea in a weighted graph,
we can calculate for each pair of nodes the minimum
length of a path
between them that contains exactly $n$ edges.
To calculate this, we have to define matrix multiplication
in a new way, so that we do not calculate the numbers
of paths but minimize the lengths of paths.

\begin{samepage}
As an example, consider the following graph:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (1,1) {$4$};
\node[draw, circle] (3) at (3,3) {$2$};
\node[draw, circle] (4) at (5,3) {$3$};
\node[draw, circle] (5) at (3,1) {$5$};
\node[draw, circle] (6) at (5,1) {$6$};

\path[draw,thick,->,>=latex] (1) -- node[font=\small,label=left:4] {} (2);
\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=left:1] {} (3);
\path[draw,thick,->,>=latex] (3) -- node[font=\small,label=north:2] {} (1);
\path[draw,thick,->,>=latex] (4) -- node[font=\small,label=north:4] {} (3);
\path[draw,thick,->,>=latex] (3) -- node[font=\small,label=left:1] {} (5);
\path[draw,thick,->,>=latex] (3) -- node[font=\small,label=left:2] {} (6);
\path[draw,thick,->,>=latex] (6) -- node[font=\small,label=right:3] {} (4);
\path[draw,thick,->,>=latex] (6) -- node[font=\small,label=below:2] {} (5);
\end{tikzpicture}
\end{center}
\end{samepage}

Let us construct an adjacency matrix where
$\infty$ means that an edge does not exist,
and other values correspond to edge weights.
The matrix is
\[
V= \begin{bmatrix}
\infty & \infty & \infty & 4 & \infty & \infty \\
2 & \infty & \infty & \infty & 1 & 2 \\
\infty & 4 & \infty & \infty & \infty & \infty \\
\infty & 1 & \infty & \infty & \infty & \infty \\
\infty & \infty & \infty & \infty & \infty & \infty \\
\infty & \infty & 3 & \infty & 2 & \infty \\
\end{bmatrix}.
\]

Instead of the formula
\[
AB[i,j] = \sum_{k=1}^n A[i,k] \cdot B[k,j]
\]
we now use the formula
\[
AB[i,j] = \min_{k=1}^n A[i,k] + B[k,j]
\]
for matrix multiplication, so we calculate
a minimum instead of a sum,
and a sum of elements instead of a product.
After this modification,
matrix powers correspond to
shortest paths in the graph.

For example, as
\[
V^4= \begin{bmatrix}
\infty & \infty & 10 & 11 & 9 & \infty \\
9 & \infty & \infty & \infty & 8 & 9 \\
\infty & 11 & \infty & \infty & \infty & \infty \\
\infty & 8 & \infty & \infty & \infty & \infty \\
\infty & \infty & \infty & \infty & \infty & \infty \\
\infty & \infty & 12 & 13 & 11 & \infty \\
\end{bmatrix},
\]
we can conclude that the minimum length of a path
of 4 edges
from node 2 to node 5 is 8.
Such a path is
$2 \rightarrow 1 \rightarrow 4 \rightarrow 2 \rightarrow 5$.

\subsubsection{Kirchhoff's theorem}

\index{Kirchhoff's theorem}
\index{spanning tree}

\key{Kirchhoff's theorem}
%\footnote{G. R. Kirchhoff (1824--1887) was a German physicist.}
provides a way
to calculate the number of spanning trees
of a graph as a determinant of a special matrix.
For example, the graph
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (1,1) {$3$};
\node[draw, circle] (4) at (3,1) {$4$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (1) -- (4);
\end{tikzpicture}
\end{center}
has three spanning trees:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1a) at (1,3) {$1$};
\node[draw, circle] (2a) at (3,3) {$2$};
\node[draw, circle] (3a) at (1,1) {$3$};
\node[draw, circle] (4a) at (3,1) {$4$};

\path[draw,thick,-] (1a) -- (2a);
%\path[draw,thick,-] (1a) -- (3a);
\path[draw,thick,-] (3a) -- (4a);
\path[draw,thick,-] (1a) -- (4a);

\node[draw, circle] (1b) at (1+4,3) {$1$};
\node[draw, circle] (2b) at (3+4,3) {$2$};
\node[draw, circle] (3b) at (1+4,1) {$3$};
\node[draw, circle] (4b) at (3+4,1) {$4$};

\path[draw,thick,-] (1b) -- (2b);
\path[draw,thick,-] (1b) -- (3b);
%\path[draw,thick,-] (3b) -- (4b);
\path[draw,thick,-] (1b) -- (4b);

\node[draw, circle] (1c) at (1+8,3) {$1$};
\node[draw, circle] (2c) at (3+8,3) {$2$};
\node[draw, circle] (3c) at (1+8,1) {$3$};
\node[draw, circle] (4c) at (3+8,1) {$4$};

\path[draw,thick,-] (1c) -- (2c);
\path[draw,thick,-] (1c) -- (3c);
\path[draw,thick,-] (3c) -- (4c);
%\path[draw,thick,-] (1c) -- (4c);
\end{tikzpicture}
\end{center}
\index{Laplacian matrix}
To calculate the number of spanning trees,
we construct a \key{Laplacian matrix} $L$,
where $L[i,i]$ is the degree of node $i$
and $L[i,j]=-1$ if there is an edge between
nodes $i$ and $j$, and otherwise $L[i,j]=0$.
The Laplacian matrix for the above graph is as follows:
\[
L= \begin{bmatrix}
3 & -1 & -1 & -1 \\
-1 & 1 & 0 & 0 \\
-1 & 0 & 2 & -1 \\
-1 & 0 & -1 & 2 \\
\end{bmatrix}
\]

It can be shown that
the number of spanning trees equals
the determinant of a matrix that is obtained
when we remove any row and any column from $L$.
For example, if we remove the first row
and column, the result is
\[ \det(
\begin{bmatrix}
1 & 0 & 0 \\
0 & 2 & -1 \\
0 & -1 & 2 \\
\end{bmatrix}
) =3.\]
The determinant is always the same,
regardless of which row and column we remove from $L$.

Note that Cayley's formula in Chapter 22.5 is
a special case of Kirchhoff's theorem,
because in a complete graph of $n$ nodes

\[ \det(
\begin{bmatrix}
n-1 & -1 & \cdots & -1 \\
-1 & n-1 & \cdots & -1 \\
\vdots & \vdots & \ddots & \vdots \\
-1 & -1 & \cdots & n-1 \\
\end{bmatrix}
) =n^{n-2}.\]

\chapter{Probability}

\index{probability}

A \key{probability} is a real number between $0$ and $1$
that indicates how probable an event is.
If an event is certain to happen,
its probability is 1,
and if an event is impossible,
its probability is 0.
The probability of an event is denoted $P(\cdots)$
where the three dots describe the event.

For example, when throwing a die,
the outcome is an integer between $1$ and $6$,
and the probability of each outcome is $1/6$.
For example, we can calculate the following probabilities:

\begin{itemize}[noitemsep]
\item $P(\textrm{``the outcome is 4''})=1/6$
\item $P(\textrm{``the outcome is not 6''})=5/6$
\item $P(\textrm{``the outcome is even''})=1/2$
\end{itemize}

\section{Calculation}

To calculate the probability of an event,
we can either use combinatorics
or simulate the process that generates the event.
As an example, let us calculate the probability
of drawing three cards with the same value
from a shuffled deck of cards
(for example, $\spadesuit 8$, $\clubsuit 8$ and $\diamondsuit 8$).

\subsubsection*{Method 1}

We can calculate the probability using the formula

\[\frac{\textrm{number of desired outcomes}}{\textrm{total number of outcomes}}.\]

In this problem, the desired outcomes are those
in which the value of each card is the same.
There are $13 {4 \choose 3}$ such outcomes,
because there are $13$ possibilities for the
value of the cards and ${4 \choose 3}$ ways to
choose $3$ suits from $4$ possible suits.

There are a total of ${52 \choose 3}$ outcomes,
because we choose 3 cards from 52 cards.
Thus, the probability of the event is

\[\frac{13 {4 \choose 3}}{{52 \choose 3}} = \frac{1}{425}.\]

\subsubsection*{Method 2}

Another way to calculate the probability is
to simulate the process that generates the event.
In this example, we draw three cards, so the process
consists of three steps.
We require that each step of the process is successful.

Drawing the first card certainly succeeds,
because there are no restrictions.
The second step succeeds with probability $3/51$,
because there are 51 cards left and 3 of them
have the same value as the first card.
In a similar way, the third step succeeds with probability $2/50$.

The probability that the entire process succeeds is

\[1 \cdot \frac{3}{51} \cdot \frac{2}{50} = \frac{1}{425}.\]

\section{Events}

An event in probability theory can be represented as a set
\[A \subset X,\]
where $X$ contains all possible outcomes
and $A$ is a subset of outcomes.
For example, when throwing a die, the outcomes are
\[X = \{1,2,3,4,5,6\}.\]
Now, for example, the event ``the outcome is even''
corresponds to the set
\[A = \{2,4,6\}.\]

Each outcome $x$ is assigned a probability $p(x)$.
Then, the probability $P(A)$ of an event
$A$ can be calculated as a sum
of probabilities of outcomes using the formula
\[P(A) = \sum_{x \in A} p(x).\]
For example, when throwing a die,
$p(x)=1/6$ for each outcome $x$,
so the probability of the event
``the outcome is even'' is
\[p(2)+p(4)+p(6)=1/2.\]

The total probability of the outcomes in $X$ must
be 1, i.e., $P(X)=1$.

Since the events in probability theory are sets,
we can manipulate them using standard set operations:

\begin{itemize}
\item The \key{complement} $\bar A$ means
``$A$ does not happen''.
For example, when throwing a die,
the complement of $A=\{2,4,6\}$ is
$\bar A = \{1,3,5\}$.
\item The \key{union} $A \cup B$ means
``$A$ or $B$ happens''.
For example, the union of
$A=\{2,5\}$
and $B=\{4,5,6\}$ is
$A \cup B = \{2,4,5,6\}$.
\item The \key{intersection} $A \cap B$ means
``$A$ and $B$ happen''.
For example, the intersection of
$A=\{2,5\}$ and $B=\{4,5,6\}$ is
$A \cap B = \{5\}$.
\end{itemize}

\subsubsection{Complement}

The probability of the complement
$\bar A$ is calculated using the formula
\[P(\bar A)=1-P(A).\]

Sometimes, we can solve a problem easily
using complements by solving the opposite problem.
For example, the probability of getting
at least one six when throwing a die ten times is
\[1-(5/6)^{10}.\]

Here $5/6$ is the probability that the outcome
of a single throw is not six, and
$(5/6)^{10}$ is the probability that none of
the ten throws is a six.
The complement of this is the answer to the problem.

\subsubsection{Union}

The probability of the union $A \cup B$
is calculated using the formula
\[P(A \cup B)=P(A)+P(B)-P(A \cap B).\]
For example, when throwing a die,
the union of the events
\[A=\textrm{``the outcome is even''}\]
and
\[B=\textrm{``the outcome is less than 4''}\]
is
\[A \cup B=\textrm{``the outcome is even or less than 4''},\]
and its probability is
\[P(A \cup B) = P(A)+P(B)-P(A \cap B)=1/2+1/2-1/6=5/6.\]

If the events $A$ and $B$ are \key{disjoint}, i.e.,
$A \cap B$ is empty,
the probability of the event $A \cup B$ is simply

\[P(A \cup B)=P(A)+P(B).\]

\subsubsection{Conditional probability}

\index{conditional probability}

The \key{conditional probability}
\[P(A | B) = \frac{P(A \cap B)}{P(B)}\]
is the probability of $A$
assuming that $B$ happens.
Hence, when calculating the
probability of $A$, we only consider the outcomes
that also belong to $B$.

Using the previous sets,
\[P(A | B)= 1/3,\]
because the outcomes of $B$ are
$\{1,2,3\}$, and one of them is even.
This is the probability of an even outcome
if we know that the outcome is between $1$ and $3$.

\subsubsection{Intersection}

\index{independence}

Using conditional probability,
the probability of the intersection
$A \cap B$ can be calculated using the formula
\[P(A \cap B)=P(A)P(B|A).\]
Events $A$ and $B$ are \key{independent} if
\[P(A|B)=P(A) \hspace{10px}\textrm{and}\hspace{10px} P(B|A)=P(B),\]
which means that the fact that $B$ happens does not
change the probability of $A$, and vice versa.
In this case, the probability of the intersection is
\[P(A \cap B)=P(A)P(B).\]
For example, when drawing a card from a deck, the events
\[A = \textrm{``the suit is clubs''}\]
and
\[B = \textrm{``the value is four''}\]
are independent. Hence the event
\[A \cap B = \textrm{``the card is the four of clubs''}\]
happens with probability
\[P(A \cap B)=P(A)P(B)=1/4 \cdot 1/13 = 1/52.\]

\section{Random variables}

\index{random variable}

A \key{random variable} is a value that is generated
by a random process.
For example, when throwing two dice,
a possible random variable is
\[X=\textrm{``the sum of the outcomes''}.\]
For example, if the outcomes are $[4,6]$
(meaning that we first throw a four and then a six),
then the value of $X$ is 10.

We denote by $P(X=x)$ the probability that
the value of a random variable $X$ is $x$.
For example, when throwing two dice,
$P(X=10)=3/36$,
because the total number of outcomes is 36
and there are three possible ways to obtain
the sum 10: $[4,6]$, $[5,5]$ and $[6,4]$.

\subsubsection{Expected value}

\index{expected value}

The \key{expected value} $E[X]$ indicates the
average value of a random variable $X$.
The expected value can be calculated as the sum
\[\sum_x P(X=x)x,\]
where $x$ goes through all possible values of $X$.

For example, when throwing a die,
the expected outcome is
\[1/6 \cdot 1 + 1/6 \cdot 2 + 1/6 \cdot 3 + 1/6 \cdot 4 + 1/6 \cdot 5 + 1/6 \cdot 6 = 7/2.\]

A useful property of expected values is \key{linearity}.
It means that the sum
$E[X_1+X_2+\cdots+X_n]$
always equals the sum
$E[X_1]+E[X_2]+\cdots+E[X_n]$.
This formula holds even if the random variables
depend on each other.

For example, when throwing two dice,
the expected sum is
\[E[X_1+X_2]=E[X_1]+E[X_2]=7/2+7/2=7.\]

Let us now consider a problem where
$n$ balls are randomly placed in $n$ boxes,
and our task is to calculate the expected
number of empty boxes.
Each ball has an equal probability of being
placed in any of the boxes.
For example, if $n=2$, the possibilities
are as follows:
\begin{center}
\begin{tikzpicture}
\draw (0,0) rectangle (1,1);
\draw (1.2,0) rectangle (2.2,1);
\draw (3,0) rectangle (4,1);
\draw (4.2,0) rectangle (5.2,1);
\draw (6,0) rectangle (7,1);
\draw (7.2,0) rectangle (8.2,1);
\draw (9,0) rectangle (10,1);
\draw (10.2,0) rectangle (11.2,1);

\draw[fill=blue] (0.5,0.2) circle (0.1);
\draw[fill=red] (1.7,0.2) circle (0.1);
\draw[fill=red] (3.5,0.2) circle (0.1);
\draw[fill=blue] (4.7,0.2) circle (0.1);
\draw[fill=blue] (6.25,0.2) circle (0.1);
\draw[fill=red] (6.75,0.2) circle (0.1);
\draw[fill=blue] (10.45,0.2) circle (0.1);
\draw[fill=red] (10.95,0.2) circle (0.1);
\end{tikzpicture}
\end{center}
In this case, the expected number of
empty boxes is
\[\frac{0+0+1+1}{4} = \frac{1}{2}.\]
In the general case, the probability that a
single box is empty is
\[\Big(\frac{n-1}{n}\Big)^n,\]
because no ball should be placed in it.
Hence, using linearity, the expected number of
empty boxes is
\[n \cdot \Big(\frac{n-1}{n}\Big)^n.\]

\subsubsection{Distributions}

\index{distribution}

The \key{distribution} of a random variable $X$
shows the probability of each value that
$X$ may have.
The distribution consists of values $P(X=x)$.
For example, when throwing two dice,
the distribution for their sum is:
\begin{center}
\small {
\begin{tabular}{r|rrrrrrrrrrr}
$x$ & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 \\
$P(X=x)$ & $1/36$ & $2/36$ & $3/36$ & $4/36$ & $5/36$ & $6/36$ & $5/36$ & $4/36$ & $3/36$ & $2/36$ & $1/36$ \\
\end{tabular}
}
\end{center}

\index{uniform distribution}
In a \key{uniform distribution},
the random variable $X$ has $n$ possible
values $a,a+1,\ldots,b$ and the probability of each value is $1/n$.
For example, when throwing a die,
$a=1$, $b=6$ and $P(X=x)=1/6$ for each value $x$.

The expected value of $X$ in a uniform distribution is
\[E[X] = \frac{a+b}{2}.\]

\index{binomial distribution}
In a \key{binomial distribution}, $n$ attempts
are made
and the probability that a single attempt succeeds
is $p$.
The random variable $X$ counts the number of
successful attempts,
and the probability of a value $x$ is
\[P(X=x)=p^x (1-p)^{n-x} {n \choose x},\]
where $p^x$ and $(1-p)^{n-x}$ correspond to
successful and unsuccessful attempts,
and ${n \choose x}$ is the number of ways
we can choose the order of the attempts.

For example, when throwing a die ten times,
the probability of throwing a six exactly
three times is $(1/6)^3 (5/6)^7 {10 \choose 3}$.

The expected value of $X$ in a binomial distribution is
\[E[X] = pn.\]

\index{geometric distribution}
In a \key{geometric distribution},
the probability that an attempt succeeds is $p$,
and we continue until the first success happens.
The random variable $X$ counts the number
of attempts needed, and the probability of
a value $x$ is
\[P(X=x)=(1-p)^{x-1} p,\]
where $(1-p)^{x-1}$ corresponds to the unsuccessful attempts
and $p$ corresponds to the first successful attempt.

For example, if we throw a die until we throw a six,
the probability that the number of throws
is exactly 4 is $(5/6)^3 \cdot 1/6$.

The expected value of $X$ in a geometric distribution is
\[E[X]=\frac{1}{p}.\]

\section{Markov chains}

\index{Markov chain}

A \key{Markov chain}
% \footnote{A. A. Markov (1856--1922)
% was a Russian mathematician.}
is a random process
that consists of states and transitions between them.
For each state, we know the probabilities
for moving to other states.
A Markov chain can be represented as a graph
whose nodes are states and edges are transitions.

As an example, consider a problem
where we are on floor 1 of an $n$ floor building.
At each step, we randomly walk either one floor
up or one floor down, except that we always
walk one floor up from floor 1 and one floor down
from floor $n$.
What is the probability of being on floor $m$
after $k$ steps?

In this problem, each floor of the building
corresponds to a state in a Markov chain.
For example, if $n=5$, the graph is as follows:

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,0) {$1$};
\node[draw, circle] (2) at (2,0) {$2$};
\node[draw, circle] (3) at (4,0) {$3$};
\node[draw, circle] (4) at (6,0) {$4$};
\node[draw, circle] (5) at (8,0) {$5$};

\path[draw,thick,->] (1) edge [bend left=40] node[font=\small,label=$1$] {} (2);
\path[draw,thick,->] (2) edge [bend left=40] node[font=\small,label=$1/2$] {} (3);
\path[draw,thick,->] (3) edge [bend left=40] node[font=\small,label=$1/2$] {} (4);
\path[draw,thick,->] (4) edge [bend left=40] node[font=\small,label=$1/2$] {} (5);

\path[draw,thick,->] (5) edge [bend left=40] node[font=\small,label=below:$1$] {} (4);
\path[draw,thick,->] (4) edge [bend left=40] node[font=\small,label=below:$1/2$] {} (3);
\path[draw,thick,->] (3) edge [bend left=40] node[font=\small,label=below:$1/2$] {} (2);
\path[draw,thick,->] (2) edge [bend left=40] node[font=\small,label=below:$1/2$] {} (1);
\end{tikzpicture}
\end{center}

The probability distribution
of a Markov chain is a vector
$[p_1,p_2,\ldots,p_n]$, where $p_k$ is the
probability that the current state is $k$.
The formula $p_1+p_2+\cdots+p_n=1$ always holds.

In the above scenario, the initial distribution is
$[1,0,0,0,0]$, because we always begin on floor 1.
The next distribution is $[0,1,0,0,0]$,
because we can only move from floor 1 to floor 2.
After this, we can either move one floor up
or one floor down, so the next distribution is
$[1/2,0,1/2,0,0]$, and so on.

An efficient way to simulate the walk in
a Markov chain is to use dynamic programming.
The idea is to maintain the probability distribution,
and at each step go through all the possibilities
of how we can move.
Using this method, we can simulate
a walk of $m$ steps in $O(n^2 m)$ time.

The transitions of a Markov chain can also be
represented as a matrix that updates the
probability distribution.
In the above scenario, the matrix is

\[
\begin{bmatrix}
0 & 1/2 & 0 & 0 & 0 \\
1 & 0 & 1/2 & 0 & 0 \\
0 & 1/2 & 0 & 1/2 & 0 \\
0 & 0 & 1/2 & 0 & 1 \\
0 & 0 & 0 & 1/2 & 0 \\
\end{bmatrix}.
\]

When we multiply a probability distribution by this matrix,
we get the new distribution after moving one step.
For example, we can move from the distribution
$[1,0,0,0,0]$ to the distribution
$[0,1,0,0,0]$ as follows:

\[
\begin{bmatrix}
0 & 1/2 & 0 & 0 & 0 \\
1 & 0 & 1/2 & 0 & 0 \\
0 & 1/2 & 0 & 1/2 & 0 \\
0 & 0 & 1/2 & 0 & 1 \\
0 & 0 & 0 & 1/2 & 0 \\
\end{bmatrix}
\begin{bmatrix}
1 \\
0 \\
0 \\
0 \\
0 \\
\end{bmatrix}
=
\begin{bmatrix}
0 \\
1 \\
0 \\
0 \\
0 \\
\end{bmatrix}.
\]

By calculating matrix powers efficiently,
we can calculate the distribution after $m$ steps
in $O(n^3 \log m)$ time.

\section{Randomized algorithms}

\index{randomized algorithm}

Sometimes we can use randomness for solving a problem,
even if the problem is not related to probabilities.
A \key{randomized algorithm} is an algorithm that
is based on randomness.

\index{Monte Carlo algorithm}

A \key{Monte Carlo algorithm} is a randomized algorithm
that may sometimes give a wrong answer.
For such an algorithm to be useful,
the probability of a wrong answer should be small.

\index{Las Vegas algorithm}

A \key{Las Vegas algorithm} is a randomized algorithm
that always gives the correct answer,
but its running time varies randomly.
The goal is to design an algorithm that is
efficient with high probability.

Next we will go through three example problems that
can be solved using randomness.

\subsubsection{Order statistics}

\index{order statistic}

The $k$th \key{order statistic} of an array
is the element at position $k$ after sorting
the array in increasing order.
It is easy to calculate any order statistic
in $O(n \log n)$ time by first sorting the array,
but is it really necessary to sort the entire array
just to find one element?

It turns out that we can find order statistics
using a randomized algorithm without sorting the array.
The algorithm, called \key{quickselect}\footnote{In 1961,
C. A. R. Hoare published two algorithms that
are efficient on average: \index{quicksort} \index{quickselect}
\key{quicksort} \cite{hoa61a} for sorting arrays and
\key{quickselect} \cite{hoa61b} for finding order statistics.}, is a Las Vegas algorithm:
its running time is usually $O(n)$
but $O(n^2)$ in the worst case.

The algorithm chooses a random element $x$
of the array, and moves elements smaller than $x$
to the left part of the array,
and all other elements to the right part of the array.
This takes $O(n)$ time when there are $n$ elements.
Assume that the left part contains $a$ elements
and the right part contains $b$ elements.
If $a=k$, element $x$ is the $k$th order statistic.
Otherwise, if $a>k$, we recursively find the $k$th order
statistic for the left part,
and if $a<k$, we recursively find the $r$th order
statistic for the right part where $r=k-a$.
The search continues in a similar way, until the element
has been found.

When each element $x$ is randomly chosen,
the size of the array about halves at each step,
so the time complexity for
finding the $k$th order statistic is about
\[n+n/2+n/4+n/8+\cdots < 2n = O(n).\]

The worst case of the algorithm still requires $O(n^2)$ time,
because it is possible that $x$ is always chosen
in such a way that it is one of the smallest or largest
elements in the array, so that $O(n)$ steps are needed.
However, the probability of this is so small
that this never happens in practice.

\subsubsection{Verifying matrix multiplication}

\index{matrix multiplication}

Our next problem is to \emph{verify}
whether $AB=C$ holds when $A$, $B$ and $C$
are matrices of size $n \times n$.
Of course, we can solve the problem
by calculating the product $AB$ again
(in $O(n^3)$ time using the basic algorithm),
but one could hope that verifying the
answer would be easier than calculating it from scratch.

It turns out that we can solve the problem
using a Monte Carlo algorithm\footnote{R. M. Freivalds published
this algorithm in 1977 \cite{fre77}, and it is sometimes
called \index{Freivalds' algorithm} \key{Freivalds' algorithm}.} whose
time complexity is only $O(n^2)$.
The idea is simple: we choose a random vector
$X$ of $n$ elements, and calculate the matrices
$ABX$ and $CX$. If $ABX=CX$, we report that $AB=C$,
and otherwise we report that $AB \neq C$.

The time complexity of the algorithm is
$O(n^2)$, because we can calculate the matrices
$ABX$ and $CX$ in $O(n^2)$ time.
We can calculate the matrix $ABX$ efficiently
by using the representation $A(BX)$, so only two
multiplications of $n \times n$ and $n \times 1$
size matrices are needed.
|
||||
|
||||
The drawback of the algorithm is
|
||||
that there is a small chance that the algorithm
|
||||
makes a mistake when it reports that $AB=C$.
|
||||
For example,
\[
\begin{bmatrix}
6 & 8 \\
1 & 3 \\
\end{bmatrix}
\neq
\begin{bmatrix}
8 & 7 \\
3 & 2 \\
\end{bmatrix},
\]
but
\[
\begin{bmatrix}
6 & 8 \\
1 & 3 \\
\end{bmatrix}
\begin{bmatrix}
3 \\
6 \\
\end{bmatrix}
=
\begin{bmatrix}
8 & 7 \\
3 & 2 \\
\end{bmatrix}
\begin{bmatrix}
3 \\
6 \\
\end{bmatrix}.
\]
However, in practice, the probability that the
algorithm makes a mistake is small,
and we can decrease the probability by
verifying the result using multiple random vectors $X$
before reporting that $AB=C$.
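A minimal sketch of the check described above (our own helper names; the book does not give code here). Random 0/1 vectors are one common choice for $X$, which bounds the error probability of a single trial but is an assumption on our part, not a detail from the text.

```cpp
#include <bits/stdc++.h>
using namespace std;

typedef vector<vector<long long>> Matrix;

// Multiplies an n x n matrix by a length-n vector in O(n^2) time.
vector<long long> mulVec(const Matrix& M, const vector<long long>& v) {
    int n = v.size();
    vector<long long> r(n, 0);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            r[i] += M[i][j]*v[j];
    return r;
}

// Freivalds' check: reports AB = C if A(BX) == CX for one random X.
// Uses the representation A(BX), so only two O(n^2) products are needed.
bool freivalds(const Matrix& A, const Matrix& B, const Matrix& C) {
    int n = A.size();
    vector<long long> x(n);
    for (auto& v : x) v = rand()%2;   // random 0/1 vector
    return mulVec(A, mulVec(B, x)) == mulVec(C, x);
}
```

Repeating the check with several independent vectors $X$ decreases the probability of a wrong ``$AB=C$'' answer exponentially.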
\subsubsection{Graph coloring}

\index{coloring}

Given a graph that contains $n$ nodes and $m$ edges,
our task is to find a way to color the nodes
of the graph using two colors so that
for at least $m/2$ edges, the endpoints
have different colors.
For example, in the graph
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (4,3) {$2$};
\node[draw, circle] (3) at (1,1) {$3$};
\node[draw, circle] (4) at (4,1) {$4$};
\node[draw, circle] (5) at (6,2) {$5$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (2) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (4) -- (5);
\end{tikzpicture}
\end{center}
a valid coloring is as follows:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle, fill=blue!40] (1) at (1,3) {$1$};
\node[draw, circle, fill=red!40] (2) at (4,3) {$2$};
\node[draw, circle, fill=red!40] (3) at (1,1) {$3$};
\node[draw, circle, fill=blue!40] (4) at (4,1) {$4$};
\node[draw, circle, fill=blue!40] (5) at (6,2) {$5$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (2) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (4) -- (5);
\end{tikzpicture}
\end{center}
The above graph contains 7 edges, and for 5 of them,
the endpoints have different colors,
so the coloring is valid.

The problem can be solved using a Las Vegas algorithm
that generates random colorings until a valid coloring
has been found.
In a random coloring, the color of each node is
independently chosen so that the probability of
both colors is $1/2$.

In a random coloring, the probability that the endpoints
of a single edge have different colors is $1/2$.
Hence, the expected number of edges whose endpoints
have different colors is $m/2$.
Since it is expected that a random coloring is valid,
we will quickly find a valid coloring in practice.
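A possible implementation of this Las Vegas algorithm (function name and edge-list representation are our own choices):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Las Vegas coloring: retries random 2-colorings until at least
// m/2 edges have differently colored endpoints.
// Nodes are numbered 1..n; the result maps node -> color (0 or 1).
vector<int> randomColoring(int n, const vector<pair<int,int>>& edges) {
    int m = edges.size();
    while (true) {
        vector<int> color(n+1);
        for (int i = 1; i <= n; i++) color[i] = rand()%2;
        int good = 0;
        for (auto [a,b] : edges)
            if (color[a] != color[b]) good++;   // edge with distinct colors
        if (2*good >= m) return color;          // at least m/2 good edges
    }
}
```

Since each attempt succeeds with constant probability, the expected number of retries is $O(1)$.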
\chapter{Game theory}

In this chapter, we will focus on two-player
games that do not contain random elements.
Our goal is to find a strategy that we can
follow to win the game
no matter what the opponent does,
if such a strategy exists.

It turns out that there is a general strategy
for such games,
and we can analyze the games using \key{nim theory}.
First, we will analyze simple games where
players remove sticks from heaps,
and after this, we will generalize the strategy
used in those games to other games.

\section{Game states}

Let us consider a game where there is initially
a heap of $n$ sticks.
Players $A$ and $B$ move alternately,
and player $A$ begins.
On each move, the player has to remove
1, 2 or 3 sticks from the heap,
and the player who removes the last stick wins the game.

For example, if $n=10$, the game may proceed as follows:
\begin{itemize}[noitemsep]
\item Player $A$ removes 2 sticks (8 sticks left).
\item Player $B$ removes 3 sticks (5 sticks left).
\item Player $A$ removes 1 stick (4 sticks left).
\item Player $B$ removes 2 sticks (2 sticks left).
\item Player $A$ removes 2 sticks and wins.
\end{itemize}
This game consists of states $0,1,2,\ldots,n$,
where the number of the state corresponds to
the number of sticks left.

\subsubsection{Winning and losing states}

\index{winning state}
\index{losing state}

A \key{winning state} is a state where
the player will win the game if they
play optimally,
and a \key{losing state} is a state
where the player will lose the game if the
opponent plays optimally.
It turns out that we can classify all states
of a game so that each state is either
a winning state or a losing state.

In the above game, state 0 is clearly a
losing state, because the player cannot make
any moves.
States 1, 2 and 3 are winning states,
because we can remove 1, 2 or 3 sticks
and win the game.
State 4, in turn, is a losing state,
because any move leads to a state that
is a winning state for the opponent.

More generally, if there is a move that leads
from the current state to a losing state,
the current state is a winning state,
and otherwise the current state is a losing state.
Using this observation, we can classify all states
of a game starting with losing states where
there are no possible moves.

The states $0 \ldots 15$ of the above game
can be classified as follows
($W$ denotes a winning state and $L$ denotes a losing state):
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (16,1);

\node at (0.5,0.5) {$L$};
\node at (1.5,0.5) {$W$};
\node at (2.5,0.5) {$W$};
\node at (3.5,0.5) {$W$};
\node at (4.5,0.5) {$L$};
\node at (5.5,0.5) {$W$};
\node at (6.5,0.5) {$W$};
\node at (7.5,0.5) {$W$};
\node at (8.5,0.5) {$L$};
\node at (9.5,0.5) {$W$};
\node at (10.5,0.5) {$W$};
\node at (11.5,0.5) {$W$};
\node at (12.5,0.5) {$L$};
\node at (13.5,0.5) {$W$};
\node at (14.5,0.5) {$W$};
\node at (15.5,0.5) {$W$};

\footnotesize
\node at (0.5,1.4) {$0$};
\node at (1.5,1.4) {$1$};
\node at (2.5,1.4) {$2$};
\node at (3.5,1.4) {$3$};
\node at (4.5,1.4) {$4$};
\node at (5.5,1.4) {$5$};
\node at (6.5,1.4) {$6$};
\node at (7.5,1.4) {$7$};
\node at (8.5,1.4) {$8$};
\node at (9.5,1.4) {$9$};
\node at (10.5,1.4) {$10$};
\node at (11.5,1.4) {$11$};
\node at (12.5,1.4) {$12$};
\node at (13.5,1.4) {$13$};
\node at (14.5,1.4) {$14$};
\node at (15.5,1.4) {$15$};
\end{tikzpicture}
\end{center}

It is easy to analyze this game:
a state $k$ is a losing state if $k$ is
divisible by 4, and otherwise it
is a winning state.
An optimal way to play the game is
to always choose a move after which
the number of sticks in the heap
is divisible by 4.
Finally, there are no sticks left and
the opponent has lost.

Of course, this strategy requires that
the number of sticks is \emph{not} divisible by 4
when it is our move.
If it is, there is nothing we can do,
and the opponent will win the game if
they play optimally.
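The classification rule above (a state is winning exactly when some move leads to a losing state) can be sketched as a simple dynamic program; the function name is ours:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Classifies states 0..n of the stick game where a move removes
// 1, 2 or 3 sticks: win[k] is true if state k is a winning state.
vector<bool> classify(int n) {
    vector<bool> win(n+1, false);   // state 0 has no moves: losing
    for (int k = 1; k <= n; k++)
        for (int x = 1; x <= 3 && x <= k; x++)
            if (!win[k-x]) win[k] = true;   // a move to a losing state wins
    return win;
}
```

Running this for $n=15$ reproduces the $L/W$ table above: exactly the states divisible by 4 are losing.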
\subsubsection{State graph}

Let us now consider another stick game,
where in each state $k$, it is allowed to remove
any number $x$ of sticks such that $x$
is smaller than $k$ and divides $k$.
For example, in state 8 we may remove
1, 2 or 4 sticks, but in state 7 the only
allowed move is to remove 1 stick.

The following picture shows the states
$1 \ldots 9$ of the game as a \key{state graph},
whose nodes are the states and whose edges are the moves between them:

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,0) {$1$};
\node[draw, circle] (2) at (2,0) {$2$};
\node[draw, circle] (3) at (3.5,-1) {$3$};
\node[draw, circle] (4) at (1.5,-2) {$4$};
\node[draw, circle] (5) at (3,-2.75) {$5$};
\node[draw, circle] (6) at (2.5,-4.5) {$6$};
\node[draw, circle] (7) at (0.5,-3.25) {$7$};
\node[draw, circle] (8) at (-1,-4) {$8$};
\node[draw, circle] (9) at (1,-5.5) {$9$};

\path[draw,thick,->,>=latex] (2) -- (1);
\path[draw,thick,->,>=latex] (3) edge [bend right=20] (2);
\path[draw,thick,->,>=latex] (4) edge [bend left=20] (2);
\path[draw,thick,->,>=latex] (4) edge [bend left=20] (3);
\path[draw,thick,->,>=latex] (5) edge [bend right=20] (4);
\path[draw,thick,->,>=latex] (6) edge [bend left=20] (5);
\path[draw,thick,->,>=latex] (6) edge [bend left=20] (4);
\path[draw,thick,->,>=latex] (6) edge [bend right=40] (3);
\path[draw,thick,->,>=latex] (7) edge [bend right=20] (6);
\path[draw,thick,->,>=latex] (8) edge [bend right=20] (7);
\path[draw,thick,->,>=latex] (8) edge [bend right=20] (6);
\path[draw,thick,->,>=latex] (8) edge [bend left=20] (4);
\path[draw,thick,->,>=latex] (9) edge [bend left=20] (8);
\path[draw,thick,->,>=latex] (9) edge [bend right=20] (6);
\end{tikzpicture}
\end{center}

The final state in this game is always state 1,
which is a losing state, because there are no
valid moves.
The classification of states $1 \ldots 9$
is as follows:

\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (1,0) grid (10,1);

\node at (1.5,0.5) {$L$};
\node at (2.5,0.5) {$W$};
\node at (3.5,0.5) {$L$};
\node at (4.5,0.5) {$W$};
\node at (5.5,0.5) {$L$};
\node at (6.5,0.5) {$W$};
\node at (7.5,0.5) {$L$};
\node at (8.5,0.5) {$W$};
\node at (9.5,0.5) {$L$};

\footnotesize
\node at (1.5,1.4) {$1$};
\node at (2.5,1.4) {$2$};
\node at (3.5,1.4) {$3$};
\node at (4.5,1.4) {$4$};
\node at (5.5,1.4) {$5$};
\node at (6.5,1.4) {$6$};
\node at (7.5,1.4) {$7$};
\node at (8.5,1.4) {$8$};
\node at (9.5,1.4) {$9$};
\end{tikzpicture}
\end{center}

Surprisingly, in this game,
all even-numbered states are winning states,
and all odd-numbered states are losing states.
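The same classification idea applies here; a sketch (our own function name) that computes the table above for the divisor game:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Classifies states 1..n of the divisor game: in state k, a move
// removes x sticks where x < k and x divides k.
vector<bool> classifyDivisorGame(int n) {
    vector<bool> win(n+1, false);   // state 1 has no moves: losing
    for (int k = 2; k <= n; k++)
        for (int x = 1; x < k; x++)
            if (k % x == 0 && !win[k-x]) win[k] = true;
    return win;
}
```

The output confirms the pattern stated above: even states are winning, odd states are losing (removing 1 stick from an even state always reaches an odd state).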
\section{Nim game}

\index{nim game}

The \key{nim game} is a simple game that
has an important role in game theory,
because many other games can be played using
the same strategy.
First, we focus on nim,
and then we generalize the strategy
to other games.

There are $n$ heaps in nim,
and each heap contains some number of sticks.
The players move alternately,
and on each turn, the player chooses
a heap that still contains sticks
and removes any number of sticks from it.
The winner is the player who removes the last stick.

The states in nim are of the form
$[x_1,x_2,\ldots,x_n]$,
where $x_k$ denotes the number of sticks in heap $k$.
For example, $[10,12,5]$ is a game where
there are three heaps with 10, 12 and 5 sticks.
The state $[0,0,\ldots,0]$ is a losing state,
because it is not possible to remove any sticks,
and this is always the final state.
\subsubsection{Analysis}
\index{nim sum}

It turns out that we can easily classify
any nim state by calculating
the \key{nim sum} $s = x_1 \oplus x_2 \oplus \cdots \oplus x_n$,
where $\oplus$ is the xor operation\footnote{The optimal strategy
for nim was published in 1901 by C. L. Bouton \cite{bou01}.}.
The states whose nim sum is 0 are losing states,
and all other states are winning states.
For example, the nim sum of
$[10,12,5]$ is $10 \oplus 12 \oplus 5 = 3$,
so the state is a winning state.

But how is the nim sum related to the nim game?
We can explain this by looking at how the nim
sum changes when the nim state changes.

\textit{Losing states:}
The final state $[0,0,\ldots,0]$ is a losing state,
and its nim sum is 0, as expected.
In other losing states, any move leads to
a winning state, because when a single value $x_k$ changes,
the nim sum also changes, so the nim sum
is different from 0 after the move.

\textit{Winning states:}
We can move to a losing state if
there is any heap $k$ for which $x_k \oplus s < x_k$.
In this case, we can remove sticks from
heap $k$ so that it will contain $x_k \oplus s$ sticks,
which will lead to a losing state.
There is always such a heap: one where $x_k$
has a one bit at the position of the leftmost
one bit of $s$.

As an example, consider the state $[10,12,5]$.
This state is a winning state,
because its nim sum is 3.
Thus, there has to be a move which
leads to a losing state.
Next we will find such a move.
The nim sum of the state is as follows:

\begin{center}
\begin{tabular}{r|r}
10 & \texttt{1010} \\
12 & \texttt{1100} \\
5 & \texttt{0101} \\
\hline
3 & \texttt{0011} \\
\end{tabular}
\end{center}

In this case, the heap with 10 sticks
is the only heap that has a one bit
at the position of the leftmost
one bit of the nim sum:

\begin{center}
\begin{tabular}{r|r}
10 & \texttt{10\underline{1}0} \\
12 & \texttt{1100} \\
5 & \texttt{0101} \\
\hline
3 & \texttt{00\underline{1}1} \\
\end{tabular}
\end{center}

The new size of the heap has to be
$10 \oplus 3 = 9$,
so we will remove just one stick.
After this, the state will be $[9,12,5]$,
which is a losing state:

\begin{center}
\begin{tabular}{r|r}
9 & \texttt{1001} \\
12 & \texttt{1100} \\
5 & \texttt{0101} \\
\hline
0 & \texttt{0000} \\
\end{tabular}
\end{center}

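The winning-state analysis above can be sketched as a small routine (our own function name and return convention) that either finds a move to a losing state or reports that the current state is already losing:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Given a nim state, returns {heap index, new heap size} for a move
// that leads to a losing state, or {-1,-1} if the state is losing.
pair<int,long long> nimMove(const vector<long long>& x) {
    long long s = 0;
    for (long long v : x) s ^= v;            // nim sum of the state
    if (s == 0) return {-1,-1};              // losing state: no good move
    for (int k = 0; k < (int)x.size(); k++)
        if ((x[k]^s) < x[k]) return {k, x[k]^s};   // shrink heap k
    return {-1,-1};                          // unreachable when s != 0
}
```

For the state $[10,12,5]$ this finds the move described above: heap 0 is reduced from 10 to $10 \oplus 3 = 9$ sticks.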
\subsubsection{Misère game}

\index{misère game}

In a \key{misère game}, the goal of the game
is the opposite,
so the player who removes the last stick
loses the game.
It turns out that the misère nim game can be
optimally played almost like the standard nim game.

The idea is to first play the misère game
like the standard game, but change the strategy
at the end of the game.
The new strategy will be introduced in a situation
where each heap would contain at most one stick
after the next move.

In the standard game, we should choose a move
after which there is an even number of heaps with one stick.
However, in the misère game, we choose a move so that
there is an odd number of heaps with one stick.

This strategy works because a state where the
strategy changes always appears in the game,
and this state is a winning state, because
it contains exactly one heap that has more than one stick,
so the nim sum is not 0.
\section{Sprague–Grundy theorem}

\index{Sprague–Grundy theorem}

The \key{Sprague–Grundy theorem}\footnote{The theorem was
independently discovered by R. Sprague \cite{spr35} and P. M. Grundy \cite{gru39}.} generalizes the
strategy used in nim to all games that fulfil
the following requirements:

\begin{itemize}[noitemsep]
\item There are two players who move alternately.
\item The game consists of states, and the possible moves
in a state do not depend on whose turn it is.
\item The game ends when a player cannot make a move.
\item The game surely ends sooner or later.
\item The players have complete information about
the states and allowed moves, and there is no randomness in the game.
\end{itemize}
The idea is to calculate for each game state
a Grundy number that corresponds to the number of
sticks in a nim heap.
When we know the Grundy numbers of all states,
we can play the game like the nim game.
\subsubsection{Grundy numbers}

\index{Grundy number}
\index{mex function}

The \key{Grundy number} of a game state is
\[\textrm{mex}(\{g_1,g_2,\ldots,g_n\}),\]
where $g_1,g_2,\ldots,g_n$ are the Grundy numbers of the
states to which we can move,
and the mex function gives the smallest
nonnegative number that is not in the set.
For example, $\textrm{mex}(\{0,1,3\})=2$.
If there are no possible moves in a state,
its Grundy number is 0, because
$\textrm{mex}(\emptyset)=0$.

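The definition translates directly into a memoized recursion over the state graph; a sketch (helper names and the adjacency-list representation are our own):

```cpp
#include <bits/stdc++.h>
using namespace std;

// mex: the smallest nonnegative integer not in the set.
int mex(const set<int>& s) {
    int m = 0;
    while (s.count(m)) m++;
    return m;
}

// Grundy number of state v in a DAG state graph, where moves[v]
// lists the states reachable from v; memoized recursion.
int grundy(int v, const vector<vector<int>>& moves, vector<int>& memo) {
    if (memo[v] != -1) return memo[v];
    set<int> s;
    for (int u : moves[v]) s.insert(grundy(u, moves, memo));
    return memo[v] = mex(s);     // mex of the reachable Grundy numbers
}
```

Applied to the six-node state graph below, this reproduces the Grundy numbers shown in the picture.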
For example, in the state graph
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,0) {\phantom{0}};
\node[draw, circle] (2) at (2,0) {\phantom{0}};
\node[draw, circle] (3) at (4,0) {\phantom{0}};
\node[draw, circle] (4) at (1,-2) {\phantom{0}};
\node[draw, circle] (5) at (3,-2) {\phantom{0}};
\node[draw, circle] (6) at (5,-2) {\phantom{0}};

\path[draw,thick,->,>=latex] (2) -- (1);
\path[draw,thick,->,>=latex] (3) -- (2);
\path[draw,thick,->,>=latex] (5) -- (4);
\path[draw,thick,->,>=latex] (6) -- (5);
\path[draw,thick,->,>=latex] (4) -- (1);
\path[draw,thick,->,>=latex] (4) -- (2);
\path[draw,thick,->,>=latex] (5) -- (2);
\path[draw,thick,->,>=latex] (6) -- (2);
\end{tikzpicture}
\end{center}
the Grundy numbers are as follows:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,0) {0};
\node[draw, circle] (2) at (2,0) {1};
\node[draw, circle] (3) at (4,0) {0};
\node[draw, circle] (4) at (1,-2) {2};
\node[draw, circle] (5) at (3,-2) {0};
\node[draw, circle] (6) at (5,-2) {2};

\path[draw,thick,->,>=latex] (2) -- (1);
\path[draw,thick,->,>=latex] (3) -- (2);
\path[draw,thick,->,>=latex] (5) -- (4);
\path[draw,thick,->,>=latex] (6) -- (5);
\path[draw,thick,->,>=latex] (4) -- (1);
\path[draw,thick,->,>=latex] (4) -- (2);
\path[draw,thick,->,>=latex] (5) -- (2);
\path[draw,thick,->,>=latex] (6) -- (2);
\end{tikzpicture}
\end{center}
The Grundy number of a losing state is 0,
and the Grundy number of a winning state is
a positive number.
The Grundy number of a state corresponds to
the number of sticks in a nim heap.
If the Grundy number is 0, we can only move to
states whose Grundy numbers are positive,
and if the Grundy number is $x>0$, we can move
to states whose Grundy numbers include all numbers
$0,1,\ldots,x-1$.

As an example, consider a game where
the players move a figure in a maze.
Each square in the maze is either floor or wall.
On each turn, the player has to move
the figure some number
of steps left or up.
The winner of the game is the player who
makes the last move.

The following picture shows a possible initial state
of the game, where @ denotes the figure and *
denotes a square where it can move.

\begin{center}
\begin{tikzpicture}[scale=.65]
\begin{scope}
\fill [color=black] (0, 1) rectangle (1, 2);
\fill [color=black] (0, 3) rectangle (1, 4);
\fill [color=black] (2, 2) rectangle (3, 3);
\fill [color=black] (2, 4) rectangle (3, 5);
\fill [color=black] (4, 3) rectangle (5, 4);

\draw (0, 0) grid (5, 5);

\node at (4.5,0.5) {@};
\node at (3.5,0.5) {*};
\node at (2.5,0.5) {*};
\node at (1.5,0.5) {*};
\node at (0.5,0.5) {*};
\node at (4.5,1.5) {*};
\node at (4.5,2.5) {*};

\end{scope}
\end{tikzpicture}
\end{center}

The states of the game are all floor squares
of the maze.
In the above maze, the Grundy numbers
are as follows:

\begin{center}
\begin{tikzpicture}[scale=.65]
\begin{scope}
\fill [color=black] (0, 1) rectangle (1, 2);
\fill [color=black] (0, 3) rectangle (1, 4);
\fill [color=black] (2, 2) rectangle (3, 3);
\fill [color=black] (2, 4) rectangle (3, 5);
\fill [color=black] (4, 3) rectangle (5, 4);

\draw (0, 0) grid (5, 5);

\node at (0.5,4.5) {0};
\node at (1.5,4.5) {1};
\node at (2.5,4.5) {};
\node at (3.5,4.5) {0};
\node at (4.5,4.5) {1};

\node at (0.5,3.5) {};
\node at (1.5,3.5) {0};
\node at (2.5,3.5) {1};
\node at (3.5,3.5) {2};
\node at (4.5,3.5) {};

\node at (0.5,2.5) {0};
\node at (1.5,2.5) {2};
\node at (2.5,2.5) {};
\node at (3.5,2.5) {1};
\node at (4.5,2.5) {0};

\node at (0.5,1.5) {};
\node at (1.5,1.5) {3};
\node at (2.5,1.5) {0};
\node at (3.5,1.5) {4};
\node at (4.5,1.5) {1};

\node at (0.5,0.5) {0};
\node at (1.5,0.5) {4};
\node at (2.5,0.5) {1};
\node at (3.5,0.5) {3};
\node at (4.5,0.5) {2};
\end{scope}
\end{tikzpicture}
\end{center}

Thus, each state of the maze game
corresponds to a heap in the nim game.
For example, the Grundy number for
the lower-right square is 2,
so it is a winning state.
We can reach a losing state and
win the game by moving
either four steps left or
two steps up.

Note that unlike in the original nim game,
it may be possible to move to a state whose
Grundy number is larger than the Grundy number
of the current state.
However, the opponent can always choose a move
that cancels such a move, so it is not possible
to escape from a losing state.

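A sketch of how the maze Grundy numbers above can be computed (our own function name and grid representation, with '.' for floor and '\#' for wall). Squares are processed top to bottom and left to right, so all squares reachable by moving left or up are already done:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Computes Grundy numbers for the maze game: from a floor square the
// figure moves any number of steps left or up (walls block movement).
// maze[y][x] is '.' for floor, '#' for wall; the result is -1 on walls.
vector<vector<int>> mazeGrundy(const vector<string>& maze) {
    int h = maze.size(), w = maze[0].size();
    vector<vector<int>> g(h, vector<int>(w, -1));
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++) {
            if (maze[y][x] == '#') continue;
            set<int> s;
            for (int i = x-1; i >= 0 && maze[y][i] != '#'; i--)
                s.insert(g[y][i]);      // moves left
            for (int i = y-1; i >= 0 && maze[i][x] != '#'; i--)
                s.insert(g[i][x]);      // moves up
            int m = 0;
            while (s.count(m)) m++;
            g[y][x] = m;                // mex of reachable states
        }
    return g;
}
```

On the maze from the picture this gives 2 for the lower-right square, matching the figure.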
\subsubsection{Subgames}

Next we will assume that our game consists
of subgames, and on each turn, the player
first chooses a subgame and then a move in the subgame.
The game ends when it is not possible to make any move
in any subgame.

In this case, the Grundy number of a game
is the nim sum of the Grundy numbers of the subgames.
The game can be played like a nim game by calculating
all Grundy numbers for subgames and then their nim sum.

As an example, consider a game that consists
of three mazes.
In this game, on each turn, the player chooses one
of the mazes and then moves the figure in the maze.
Assume that the initial state of the game is as follows:

\begin{center}
\begin{tabular}{ccc}
\begin{tikzpicture}[scale=.55]
\begin{scope}
\fill [color=black] (0, 1) rectangle (1, 2);
\fill [color=black] (0, 3) rectangle (1, 4);
\fill [color=black] (2, 2) rectangle (3, 3);
\fill [color=black] (2, 4) rectangle (3, 5);
\fill [color=black] (4, 3) rectangle (5, 4);

\draw (0, 0) grid (5, 5);

\node at (4.5,0.5) {@};

\end{scope}
\end{tikzpicture}
&
\begin{tikzpicture}[scale=.55]
\begin{scope}
\fill [color=black] (1, 1) rectangle (2, 3);
\fill [color=black] (2, 3) rectangle (3, 4);
\fill [color=black] (4, 4) rectangle (5, 5);

\draw (0, 0) grid (5, 5);

\node at (4.5,0.5) {@};

\end{scope}
\end{tikzpicture}
&
\begin{tikzpicture}[scale=.55]
\begin{scope}
\fill [color=black] (1, 1) rectangle (4, 4);

\draw (0, 0) grid (5, 5);

\node at (4.5,0.5) {@};
\end{scope}
\end{tikzpicture}
\end{tabular}
\end{center}

The Grundy numbers for the mazes are as follows:

\begin{center}
\begin{tabular}{ccc}
\begin{tikzpicture}[scale=.55]
\begin{scope}
\fill [color=black] (0, 1) rectangle (1, 2);
\fill [color=black] (0, 3) rectangle (1, 4);
\fill [color=black] (2, 2) rectangle (3, 3);
\fill [color=black] (2, 4) rectangle (3, 5);
\fill [color=black] (4, 3) rectangle (5, 4);

\draw (0, 0) grid (5, 5);

\node at (0.5,4.5) {0};
\node at (1.5,4.5) {1};
\node at (2.5,4.5) {};
\node at (3.5,4.5) {0};
\node at (4.5,4.5) {1};

\node at (0.5,3.5) {};
\node at (1.5,3.5) {0};
\node at (2.5,3.5) {1};
\node at (3.5,3.5) {2};
\node at (4.5,3.5) {};

\node at (0.5,2.5) {0};
\node at (1.5,2.5) {2};
\node at (2.5,2.5) {};
\node at (3.5,2.5) {1};
\node at (4.5,2.5) {0};

\node at (0.5,1.5) {};
\node at (1.5,1.5) {3};
\node at (2.5,1.5) {0};
\node at (3.5,1.5) {4};
\node at (4.5,1.5) {1};

\node at (0.5,0.5) {0};
\node at (1.5,0.5) {4};
\node at (2.5,0.5) {1};
\node at (3.5,0.5) {3};
\node at (4.5,0.5) {2};
\end{scope}
\end{tikzpicture}
&
\begin{tikzpicture}[scale=.55]
\begin{scope}
\fill [color=black] (1, 1) rectangle (2, 3);
\fill [color=black] (2, 3) rectangle (3, 4);
\fill [color=black] (4, 4) rectangle (5, 5);

\draw (0, 0) grid (5, 5);

\node at (0.5,4.5) {0};
\node at (1.5,4.5) {1};
\node at (2.5,4.5) {2};
\node at (3.5,4.5) {3};
\node at (4.5,4.5) {};

\node at (0.5,3.5) {1};
\node at (1.5,3.5) {0};
\node at (2.5,3.5) {};
\node at (3.5,3.5) {0};
\node at (4.5,3.5) {1};

\node at (0.5,2.5) {2};
\node at (1.5,2.5) {};
\node at (2.5,2.5) {0};
\node at (3.5,2.5) {1};
\node at (4.5,2.5) {2};

\node at (0.5,1.5) {3};
\node at (1.5,1.5) {};
\node at (2.5,1.5) {1};
\node at (3.5,1.5) {2};
\node at (4.5,1.5) {0};

\node at (0.5,0.5) {4};
\node at (1.5,0.5) {0};
\node at (2.5,0.5) {2};
\node at (3.5,0.5) {5};
\node at (4.5,0.5) {3};
\end{scope}
\end{tikzpicture}
&
\begin{tikzpicture}[scale=.55]
\begin{scope}
\fill [color=black] (1, 1) rectangle (4, 4);

\draw (0, 0) grid (5, 5);

\node at (0.5,4.5) {0};
\node at (1.5,4.5) {1};
\node at (2.5,4.5) {2};
\node at (3.5,4.5) {3};
\node at (4.5,4.5) {4};

\node at (0.5,3.5) {1};
\node at (1.5,3.5) {};
\node at (2.5,3.5) {};
\node at (3.5,3.5) {};
\node at (4.5,3.5) {0};

\node at (0.5,2.5) {2};
\node at (1.5,2.5) {};
\node at (2.5,2.5) {};
\node at (3.5,2.5) {};
\node at (4.5,2.5) {1};

\node at (0.5,1.5) {3};
\node at (1.5,1.5) {};
\node at (2.5,1.5) {};
\node at (3.5,1.5) {};
\node at (4.5,1.5) {2};

\node at (0.5,0.5) {4};
\node at (1.5,0.5) {0};
\node at (2.5,0.5) {1};
\node at (3.5,0.5) {2};
\node at (4.5,0.5) {3};
\end{scope}
\end{tikzpicture}
\end{tabular}
\end{center}

In the initial state, the nim sum of the Grundy numbers
is $2 \oplus 3 \oplus 3 = 2$, so
the first player can win the game.
One optimal move is to move two steps up
in the first maze, which produces the nim sum
$0 \oplus 3 \oplus 3 = 0$.

\subsubsection{Grundy's game}

Sometimes a move in a game divides the game
into subgames that are independent of each other.
In this case, the Grundy number of the game is

\[\textrm{mex}(\{g_1, g_2, \ldots, g_n \}),\]
where $n$ is the number of possible moves and
\[g_k = a_{k,1} \oplus a_{k,2} \oplus \ldots \oplus a_{k,m},\]
where move $k$ generates subgames with
Grundy numbers $a_{k,1},a_{k,2},\ldots,a_{k,m}$.

\index{Grundy's game}

An example of such a game is \key{Grundy's game}.
Initially, there is a single heap that contains $n$ sticks.
On each turn, the player chooses a heap and divides
it into two nonempty heaps such that the heaps
are of different size.
The player who makes the last move wins the game.

Let $f(n)$ be the Grundy number of a heap
that contains $n$ sticks.
The Grundy number can be calculated by going
through all ways to divide the heap into
two heaps.
For example, when $n=8$, the possibilities
are $1+7$, $2+6$ and $3+5$, so
\[f(8)=\textrm{mex}(\{f(1) \oplus f(7), f(2) \oplus f(6), f(3) \oplus f(5)\}).\]

In this game, the value of $f(n)$ is based on the values
of $f(1),\ldots,f(n-1)$.
The base cases are $f(1)=f(2)=0$,
because it is not possible to divide heaps
of 1 and 2 sticks.
The first Grundy numbers are:
\[
\begin{array}{lcl}
f(1) & = & 0 \\
f(2) & = & 0 \\
f(3) & = & 1 \\
f(4) & = & 0 \\
f(5) & = & 2 \\
f(6) & = & 1 \\
f(7) & = & 0 \\
f(8) & = & 2 \\
\end{array}
\]
The Grundy number for $n=8$ is 2,
so it is possible to win the game.
The winning move is to create heaps
$1+7$, because $f(1) \oplus f(7) = 0$.
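The recurrence for $f$ can be sketched as a short dynamic program (function name is our own):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Grundy numbers f(1..n) of Grundy's game: a heap of k sticks is
// split into two nonempty heaps of different sizes.
vector<int> grundyGame(int n) {
    vector<int> f(n+1, 0);               // f(1)=f(2)=0: no valid splits
    for (int k = 3; k <= n; k++) {
        set<int> s;
        for (int a = 1; 2*a < k; a++)    // split k = a + (k-a) with a < k-a
            s.insert(f[a]^f[k-a]);       // nim sum of the two subgames
        int m = 0;
        while (s.count(m)) m++;
        f[k] = m;                        // mex of all splits
    }
    return f;
}
```

Running this for $n=8$ reproduces the table above, including $f(8)=2$.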
\chapter{Square root algorithms}

\index{square root algorithm}

A \key{square root algorithm} is an algorithm
that has a square root in its time complexity.
A square root can be seen as a ``poor man's logarithm'':
the complexity $O(\sqrt n)$ is better than $O(n)$
but worse than $O(\log n)$.
In any case, many square root algorithms are fast and usable in practice.

As an example, consider the problem of
|
||||
creating a data structure that supports
|
||||
two operations on an array:
|
||||
modifying an element at a given position
|
||||
and calculating the sum of elements in the given range.
|
||||
We have previously solved the problem using
|
||||
binary indexed and segment trees,
|
||||
that support both operations in $O(\log n)$ time.
|
||||
However, now we will solve the problem
|
||||
in another way using a square root structure
|
||||
that allows us to modify elements in $O(1)$ time
|
||||
and calculate sums in $O(\sqrt n)$ time.
|
||||
|
||||
The idea is to divide the array into \emph{blocks}
|
||||
of size $\sqrt n$ so that each block contains
|
||||
the sum of elements inside the block.
|
||||
For example, an array of 16 elements will be
|
||||
divided into blocks of 4 elements as follows:
|
||||
|
||||
\begin{center}
|
||||
\begin{tikzpicture}[scale=0.7]
|
||||
\draw (0,0) grid (16,1);
|
||||
|
||||
\draw (0,1) rectangle (4,2);
|
||||
\draw (4,1) rectangle (8,2);
|
||||
\draw (8,1) rectangle (12,2);
|
||||
\draw (12,1) rectangle (16,2);
|
||||
|
||||
\node at (0.5, 0.5) {5};
|
||||
\node at (1.5, 0.5) {8};
|
||||
\node at (2.5, 0.5) {6};
|
||||
\node at (3.5, 0.5) {3};
|
||||
\node at (4.5, 0.5) {2};
|
||||
\node at (5.5, 0.5) {7};
|
||||
\node at (6.5, 0.5) {2};
|
||||
\node at (7.5, 0.5) {6};
|
||||
\node at (8.5, 0.5) {7};
|
||||
\node at (9.5, 0.5) {1};
|
||||
\node at (10.5, 0.5) {7};
|
||||
\node at (11.5, 0.5) {5};
|
||||
\node at (12.5, 0.5) {6};
|
||||
\node at (13.5, 0.5) {2};
|
||||
\node at (14.5, 0.5) {3};
|
||||
\node at (15.5, 0.5) {2};
|
||||
|
||||
\node at (2, 1.5) {21};
|
||||
\node at (6, 1.5) {17};
|
||||
\node at (10, 1.5) {20};
|
||||
\node at (14, 1.5) {13};
|
||||
|
||||
\end{tikzpicture}
|
||||
\end{center}
|
||||
|
||||
In this structure,
|
||||
it is easy to modify array elements,
|
||||
because it is only needed to update
|
||||
the sum of a single block
|
||||
after each modification,
|
||||
which can be done in $O(1)$ time.
|
||||
For example, the following picture shows
|
||||
how the value of an element and
|
||||
the sum of the corresponding block change:
|
||||
|
||||
\begin{center}
|
||||
\begin{tikzpicture}[scale=0.7]
|
||||
\fill[color=lightgray] (5,0) rectangle (6,1);
|
||||
\draw (0,0) grid (16,1);
|
||||
|
||||
\fill[color=lightgray] (4,1) rectangle (8,2);
|
||||
\draw (0,1) rectangle (4,2);
|
||||
\draw (4,1) rectangle (8,2);
|
||||
\draw (8,1) rectangle (12,2);
|
||||
\draw (12,1) rectangle (16,2);
|
||||
|
||||
\node at (0.5, 0.5) {5};
|
||||
\node at (1.5, 0.5) {8};
|
||||
\node at (2.5, 0.5) {6};
|
||||
\node at (3.5, 0.5) {3};
|
||||
\node at (4.5, 0.5) {2};
|
||||
\node at (5.5, 0.5) {5};
|
||||
\node at (6.5, 0.5) {2};
|
||||
\node at (7.5, 0.5) {6};
|
||||
\node at (8.5, 0.5) {7};
|
||||
\node at (9.5, 0.5) {1};
|
||||
\node at (10.5, 0.5) {7};
|
||||
\node at (11.5, 0.5) {5};
|
||||
\node at (12.5, 0.5) {6};
|
||||
\node at (13.5, 0.5) {2};
|
||||
\node at (14.5, 0.5) {3};
|
||||
\node at (15.5, 0.5) {2};
|
||||
|
||||
\node at (2, 1.5) {21};
|
||||
\node at (6, 1.5) {15};
|
||||
\node at (10, 1.5) {20};
|
||||
\node at (14, 1.5) {13};
|
||||
|
||||
\end{tikzpicture}
|
||||
\end{center}
|
||||
|
||||
Then, to calculate the sum of elements in a range,
|
||||
we divide the range into three parts such that
|
||||
the sum consists of values of single elements
|
||||
and sums of blocks between them:
|
||||
|
||||
\begin{center}
|
||||
\begin{tikzpicture}[scale=0.7]
|
||||
\fill[color=lightgray] (3,0) rectangle (4,1);
|
||||
\fill[color=lightgray] (12,0) rectangle (13,1);
|
||||
\fill[color=lightgray] (13,0) rectangle (14,1);
|
||||
\draw (0,0) grid (16,1);
|
||||
|
||||
\fill[color=lightgray] (4,1) rectangle (8,2);
|
||||
\fill[color=lightgray] (8,1) rectangle (12,2);
|
||||
\draw (0,1) rectangle (4,2);
|
||||
\draw (4,1) rectangle (8,2);
|
||||
\draw (8,1) rectangle (12,2);
|
||||
\draw (12,1) rectangle (16,2);
|
||||
|
||||
\node at (0.5, 0.5) {5};
|
||||
\node at (1.5, 0.5) {8};
|
||||
\node at (2.5, 0.5) {6};
|
||||
\node at (3.5, 0.5) {3};
|
||||
\node at (4.5, 0.5) {2};
|
||||
\node at (5.5, 0.5) {5};
|
||||
\node at (6.5, 0.5) {2};
|
||||
\node at (7.5, 0.5) {6};
|
||||
\node at (8.5, 0.5) {7};
|
||||
\node at (9.5, 0.5) {1};
|
||||
\node at (10.5, 0.5) {7};
|
||||
\node at (11.5, 0.5) {5};
|
||||
\node at (12.5, 0.5) {6};
|
||||
\node at (13.5, 0.5) {2};
|
||||
\node at (14.5, 0.5) {3};
|
||||
\node at (15.5, 0.5) {2};
|
||||
|
||||
\node at (2, 1.5) {21};
|
||||
\node at (6, 1.5) {15};
|
||||
\node at (10, 1.5) {20};
|
||||
\node at (14, 1.5) {13};
|
||||
|
||||
\draw [decoration={brace}, decorate, line width=0.5mm] (14,-0.25) -- (3,-0.25);
|
||||
|
||||
\end{tikzpicture}
|
||||
\end{center}
|
||||
|
||||
Since the number of single elements is $O(\sqrt n)$
|
||||
and the number of blocks is also $O(\sqrt n)$,
|
||||
the sum query takes $O(\sqrt n)$ time.
|
||||
The purpose of the block size $\sqrt n$ is
|
||||
that it \emph{balances} two things:
|
||||
the array is divided into $\sqrt n$ blocks,
|
||||
each of which contains $\sqrt n$ elements.
|
||||
|
||||
In practice, it is not necessary to use the
|
||||
exact value of $\sqrt n$ as a parameter,
|
||||
and instead we may use parameters $k$ and $n/k$ where $k$ is
|
||||
different from $\sqrt n$.
|
||||
The optimal parameter depends on the problem and input.
|
||||
For example, if an algorithm often goes
|
||||
through the blocks but rarely inspects
|
||||
single elements inside the blocks,
|
||||
it may be a good idea to divide the array into
|
||||
$k < \sqrt n$ blocks, each of which contains $n/k > \sqrt n$
|
||||
elements.
|
||||
|
||||
\section{Combining algorithms}

In this section we discuss two square root algorithms
that are based on combining two algorithms into one.
In both cases, we could use either of the algorithms
alone and solve the problem in $O(n^2)$ time.
However, by combining the algorithms, the running
time becomes only $O(n \sqrt n)$.

\subsubsection{Case processing}

Suppose that we are given a two-dimensional
grid that contains $n$ cells.
Each cell is assigned a letter,
and our task is to find two cells
with the same letter whose distance is minimum,
where the distance between cells
$(x_1,y_1)$ and $(x_2,y_2)$ is $|x_1-x_2|+|y_1-y_2|$.
For example, consider the following grid:

\begin{center}
\begin{tikzpicture}[scale=0.7]
\node at (0.5,0.5) {A};
\node at (0.5,1.5) {B};
\node at (0.5,2.5) {C};
\node at (0.5,3.5) {A};
\node at (1.5,0.5) {C};
\node at (1.5,1.5) {D};
\node at (1.5,2.5) {E};
\node at (1.5,3.5) {F};
\node at (2.5,0.5) {B};
\node at (2.5,1.5) {A};
\node at (2.5,2.5) {G};
\node at (2.5,3.5) {B};
\node at (3.5,0.5) {D};
\node at (3.5,1.5) {F};
\node at (3.5,2.5) {E};
\node at (3.5,3.5) {A};
\draw (0,0) grid (4,4);
\end{tikzpicture}
\end{center}
In this case, the minimum distance is 2 between the two 'E' letters.

We can solve the problem by considering each letter separately.
Using this approach, the new problem is to calculate
the minimum distance
between two cells with a \emph{fixed} letter $c$.
We focus on two algorithms for this:

\emph{Algorithm 1:} Go through all pairs of cells with letter $c$,
and calculate the minimum distance between such cells.
This will take $O(k^2)$ time where $k$ is the number of cells with letter $c$.

\emph{Algorithm 2:} Perform a breadth-first search that simultaneously
starts at each cell with letter $c$. The minimum distance between
two cells with letter $c$ will be calculated in $O(n)$ time.

One way to solve the problem is to choose either of the
algorithms and use it for all letters.
If we use Algorithm 1, the running time is $O(n^2)$,
because all cells may contain the same letter,
in which case $k=n$.
Also if we use Algorithm 2, the running time is $O(n^2)$,
because all cells may have different letters,
in which case $n$ searches are needed.

However, we can \emph{combine} the two algorithms and
use different algorithms for different letters
depending on how many times each letter appears in the grid.
Assume that a letter $c$ appears $k$ times.
If $k \le \sqrt n$, we use Algorithm 1, and if $k > \sqrt n$,
we use Algorithm 2.
It turns out that by doing this, the total running time
of the algorithm is only $O(n \sqrt n)$.

First, suppose that we use Algorithm 1 for a letter $c$.
Since $c$ appears at most $\sqrt n$ times in the grid,
each cell with letter $c$ is compared with $O(\sqrt n)$
other cells.
Thus, the time used for processing all such cells is $O(n \sqrt n)$.
Then, suppose that we use Algorithm 2 for a letter $c$.
There are at most $\sqrt n$ such letters,
so processing those letters also takes $O(n \sqrt n)$ time.
\subsubsection{Batch processing}

Our next problem also deals with
a two-dimensional grid that contains $n$ cells.
Initially, each cell except one is white.
We perform $n-1$ operations, each of which first
calculates the minimum distance from a given white cell
to a black cell, and then paints the white cell black.

For example, consider the following operation:

\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=black] (1,1) rectangle (2,2);
\fill[color=black] (3,1) rectangle (4,2);
\fill[color=black] (0,3) rectangle (1,4);
\node at (2.5,3.5) {*};
\draw (0,0) grid (4,4);
\end{tikzpicture}
\end{center}

First, we calculate the minimum distance
from the white cell marked with * to a black cell.
The minimum distance is 2, because we can move
two steps left to a black cell.
Then, we paint the white cell black:

\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=black] (1,1) rectangle (2,2);
\fill[color=black] (3,1) rectangle (4,2);
\fill[color=black] (0,3) rectangle (1,4);
\fill[color=black] (2,3) rectangle (3,4);
\draw (0,0) grid (4,4);
\end{tikzpicture}
\end{center}

Consider the following two algorithms:

\emph{Algorithm 1:} Use breadth-first search
to calculate
for each white cell the distance to the nearest black cell.
This takes $O(n)$ time, and after the search,
we can find the minimum distance from any white cell
to a black cell in $O(1)$ time.

\emph{Algorithm 2:} Maintain a list of cells that have been
painted black, go through this list at each operation
and then add a new cell to the list.
An operation takes $O(k)$ time where $k$ is the length of the list.

We combine the above algorithms by
dividing the operations into
$O(\sqrt n)$ \emph{batches}, each of which consists
of $O(\sqrt n)$ operations.
At the beginning of each batch,
we perform Algorithm 1.
Then, we use Algorithm 2 to process the operations
in the batch.
We clear the list of Algorithm 2 between
the batches.
At each operation,
the minimum distance to a black cell
is either the distance calculated by Algorithm 1
or the distance calculated by Algorithm 2.

The resulting algorithm works in
$O(n \sqrt n)$ time.
First, Algorithm 1 is performed $O(\sqrt n)$ times,
and each search works in $O(n)$ time.
Second, when using Algorithm 2 in a batch,
the list contains $O(\sqrt n)$ cells
(because we clear the list between the batches)
and each operation takes $O(\sqrt n)$ time.
\section{Integer partitions}

Some square root algorithms are based on
the following observation:
if a positive integer $n$ is represented as
a sum of positive integers,
such a sum always contains at most
$O(\sqrt n)$ \emph{distinct} numbers.
The reason for this is that to construct
a sum that contains a maximum number of distinct
numbers, we should choose \emph{small} numbers.
If we choose the numbers $1,2,\ldots,k$,
the resulting sum is
\[\frac{k(k+1)}{2}.\]
Thus, the maximum number of distinct numbers is $k = O(\sqrt n)$.
Next we will discuss two problems that can be solved
efficiently using this observation.
\subsubsection{Knapsack}

Suppose that we are given a list of integer weights
whose sum is $n$.
Our task is to find out all sums that can be formed using
a subset of the weights. For example, if the weights are
$\{1,3,3\}$, the possible sums are as follows:

\begin{itemize}[noitemsep]
\item $0$ (empty set)
\item $1$
\item $3$
\item $1+3=4$
\item $3+3=6$
\item $1+3+3=7$
\end{itemize}

Using the standard knapsack approach (see Chapter 7.4),
the problem can be solved as follows:
we define a function $\texttt{possible}(x,k)$ whose value is 1
if the sum $x$ can be formed using the first $k$ weights,
and 0 otherwise.
Since the sum of the weights is $n$,
there are at most $n$ weights and
all values of the function can be calculated
in $O(n^2)$ time using dynamic programming.

However, we can make the algorithm more efficient
by using the fact that there are at most $O(\sqrt n)$
\emph{distinct} weights.
Thus, we can process the weights in groups
that consist of equal weights.
We can process each group
in $O(n)$ time, which yields an $O(n \sqrt n)$ time algorithm.

The idea is to use an array that records the sums of weights
that can be formed using the groups processed so far.
The array contains $n$ elements: element $k$ is 1 if the sum
$k$ can be formed and 0 otherwise.
To process a group of weights, we scan the array
from left to right and record the new sums of weights that
can be formed using this group and the previous groups.
\subsubsection{String construction}

Given a string \texttt{s} of length $n$
and a set of strings $D$ whose total length is $m$,
consider the problem of counting the number of ways
\texttt{s} can be formed as a concatenation of strings in $D$.
For example,
if $\texttt{s}=\texttt{ABAB}$ and
$D=\{\texttt{A},\texttt{B},\texttt{AB}\}$,
there are 4 ways:

\begin{itemize}[noitemsep]
\item $\texttt{A}+\texttt{B}+\texttt{A}+\texttt{B}$
\item $\texttt{AB}+\texttt{A}+\texttt{B}$
\item $\texttt{A}+\texttt{B}+\texttt{AB}$
\item $\texttt{AB}+\texttt{AB}$
\end{itemize}

We can solve the problem using dynamic programming:
let $\texttt{count}(k)$ denote the number of ways to construct the prefix
$\texttt{s}[0 \ldots k]$ using the strings in $D$.
Now $\texttt{count}(n-1)$ gives the answer to the problem,
and we can solve the problem in $O(n^2)$ time
using a trie structure.

However, we can solve the problem more efficiently
by using string hashing and the fact that there
are at most $O(\sqrt m)$ distinct string lengths in $D$.
First, we construct a set $H$ that contains all
hash values of the strings in $D$.
Then, when calculating a value of $\texttt{count}(k)$,
we go through all values of $p$
such that there is a string of length $p$ in $D$,
calculate the hash value of $\texttt{s}[k-p+1 \ldots k]$
and check if it belongs to $H$.
Since there are at most $O(\sqrt m)$ distinct string lengths,
this results in an algorithm whose running time is $O(n \sqrt m)$.
\section{Mo's algorithm}

\index{Mo's algorithm}

\key{Mo's algorithm}\footnote{According to \cite{cod15}, this algorithm
is named after Mo Tao, a Chinese competitive programmer, but
the technique has appeared earlier in the literature \cite{ken06}.}
can be used in many problems
that require processing range queries in
a \emph{static} array, i.e., the array values
do not change between the queries.
In each query, we are given a range $[a,b]$,
and we should calculate a value based on the
array elements between positions $a$ and $b$.
Since the array is static,
the queries can be processed in any order,
and Mo's algorithm
processes the queries in a special order which guarantees
that the algorithm works efficiently.

Mo's algorithm maintains an \emph{active range}
of the array, and the answer to a query
concerning the active range is known at each moment.
The algorithm processes the queries one by one,
and always moves the endpoints of the
active range by inserting and removing elements.
The time complexity of the algorithm is
$O(n \sqrt n f(n))$ where the array contains
$n$ elements, there are $n$ queries
and each insertion and removal of an element
takes $O(f(n))$ time.

The trick in Mo's algorithm is the order
in which the queries are processed:
the array is divided into blocks of $k=O(\sqrt n)$
elements, and a query $[a_1,b_1]$
is processed before a query $[a_2,b_2]$
if either
\begin{itemize}
\item $\lfloor a_1/k \rfloor < \lfloor a_2/k \rfloor$ or
\item $\lfloor a_1/k \rfloor = \lfloor a_2/k \rfloor$ and $b_1 < b_2$.
\end{itemize}

Thus, all queries whose left endpoints are
in a certain block are processed one after another,
sorted according to their right endpoints.
Using this order, the algorithm
only performs $O(n \sqrt n)$ operations,
because the left endpoint moves
$O(n)$ times $O(\sqrt n)$ steps,
and the right endpoint moves
$O(\sqrt n)$ times $O(n)$ steps. Thus, both
endpoints move a total of $O(n \sqrt n)$ steps during the algorithm.

\subsubsection*{Example}

As an example, consider a problem
where we are given a set of queries,
each of them corresponding to a range in an array,
and our task is to calculate for each query
the number of \emph{distinct} elements in the range.

In Mo's algorithm, the queries are always sorted
in the same way, but how the answer to a query
is maintained depends on the problem.
In this problem, we can maintain an array
\texttt{count} where $\texttt{count}[x]$
indicates the number of times an element $x$
occurs in the active range.

When we move from one query to another query,
the active range changes.
For example, if the current range is
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (1,0) rectangle (5,1);
\draw (0,0) grid (9,1);
\node at (0.5, 0.5) {4};
\node at (1.5, 0.5) {2};
\node at (2.5, 0.5) {5};
\node at (3.5, 0.5) {4};
\node at (4.5, 0.5) {2};
\node at (5.5, 0.5) {4};
\node at (6.5, 0.5) {3};
\node at (7.5, 0.5) {3};
\node at (8.5, 0.5) {4};
\end{tikzpicture}
\end{center}
and the next range is
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (2,0) rectangle (7,1);
\draw (0,0) grid (9,1);
\node at (0.5, 0.5) {4};
\node at (1.5, 0.5) {2};
\node at (2.5, 0.5) {5};
\node at (3.5, 0.5) {4};
\node at (4.5, 0.5) {2};
\node at (5.5, 0.5) {4};
\node at (6.5, 0.5) {3};
\node at (7.5, 0.5) {3};
\node at (8.5, 0.5) {4};
\end{tikzpicture}
\end{center}
there will be three steps:
the left endpoint moves one step to the right,
and the right endpoint moves two steps to the right.

After each step, the array \texttt{count}
needs to be updated.
After adding an element $x$,
we increase the value of
$\texttt{count}[x]$ by 1,
and if $\texttt{count}[x]=1$ after this,
we also increase the answer to the query by 1.
Similarly, after removing an element $x$,
we decrease the value of
$\texttt{count}[x]$ by 1,
and if $\texttt{count}[x]=0$ after this,
we also decrease the answer to the query by 1.

In this problem, the time needed to perform
each step is $O(1)$, so the total time complexity
of the algorithm is $O(n \sqrt n)$.
chapter28.tex (1161 lines; diff suppressed because it is too large)

chapter29.tex (782 lines removed)
\chapter{Geometry}

\index{geometry}

In geometric problems, it is often challenging
to find a way to approach the problem so that
the solution can be conveniently implemented
and the number of special cases is small.

As an example, consider a problem where
we are given the vertices of a quadrilateral
(a polygon that has four vertices),
and our task is to calculate its area.
For example, a possible input for the problem is as follows:

\begin{center}
\begin{tikzpicture}[scale=0.45]

\draw[fill] (6,2) circle [radius=0.1];
\draw[fill] (5,6) circle [radius=0.1];
\draw[fill] (2,5) circle [radius=0.1];
\draw[fill] (1,1) circle [radius=0.1];
\draw[thick] (6,2) -- (5,6) -- (2,5) -- (1,1) -- (6,2);
\end{tikzpicture}
\end{center}
One way to approach the problem is to divide
the quadrilateral into two triangles by a straight
line between two opposite vertices:
\begin{center}
\begin{tikzpicture}[scale=0.45]

\draw[fill] (6,2) circle [radius=0.1];
\draw[fill] (5,6) circle [radius=0.1];
\draw[fill] (2,5) circle [radius=0.1];
\draw[fill] (1,1) circle [radius=0.1];

\draw[thick] (6,2) -- (5,6) -- (2,5) -- (1,1) -- (6,2);
\draw[dashed,thick] (2,5) -- (6,2);
\end{tikzpicture}
\end{center}
After this, it suffices to sum the areas
of the triangles.
The area of a triangle can be calculated,
for example, using \key{Heron's formula}
%\footnote{Heron of Alexandria (c. 10--70) was a Greek mathematician.}
\[ \sqrt{s (s-a) (s-b) (s-c)},\]
where $a$, $b$ and $c$ are the lengths
of the triangle's sides and
$s=(a+b+c)/2$.
\index{Heron's formula}

This is a possible way to solve the problem,
but there is one pitfall:
how do we divide the quadrilateral into triangles?
It turns out that sometimes we cannot just pick
two arbitrary opposite vertices.
For example, in the following situation,
the division line is \emph{outside} the quadrilateral:
\begin{center}
\begin{tikzpicture}[scale=0.45]

\draw[fill] (6,2) circle [radius=0.1];
\draw[fill] (3,2) circle [radius=0.1];
\draw[fill] (2,5) circle [radius=0.1];
\draw[fill] (1,1) circle [radius=0.1];
\draw[thick] (6,2) -- (3,2) -- (2,5) -- (1,1) -- (6,2);

\draw[dashed,thick] (2,5) -- (6,2);
\end{tikzpicture}
\end{center}
However, another way to draw the line works:
\begin{center}
\begin{tikzpicture}[scale=0.45]

\draw[fill] (6,2) circle [radius=0.1];
\draw[fill] (3,2) circle [radius=0.1];
\draw[fill] (2,5) circle [radius=0.1];
\draw[fill] (1,1) circle [radius=0.1];
\draw[thick] (6,2) -- (3,2) -- (2,5) -- (1,1) -- (6,2);

\draw[dashed,thick] (3,2) -- (1,1);
\end{tikzpicture}
\end{center}
It is clear to a human which of the lines is the correct
choice, but the situation is difficult for a computer.

However, it turns out that we can solve the problem using
another method that is more convenient for a programmer.
Namely, there is a general formula
\[x_1y_2-x_2y_1+x_2y_3-x_3y_2+x_3y_4-x_4y_3+x_4y_1-x_1y_4,\]
whose absolute value, divided by two, is the area of a quadrilateral
whose vertices are
$(x_1,y_1)$,
$(x_2,y_2)$,
$(x_3,y_3)$ and
$(x_4,y_4)$.
This formula is easy to implement, there are no special
cases, and we can even generalize the formula
to \emph{all} polygons.
\section{Complex numbers}

\index{complex number}
\index{point}
\index{vector}

A \key{complex number} is a number of the form $x+y i$,
where $i = \sqrt{-1}$ is the \key{imaginary unit}.
A geometric interpretation of a complex number is
that it represents a two-dimensional point $(x,y)$
or a vector from the origin to the point $(x,y)$.

For example, $4+2i$ corresponds to the
following point and vector:

\begin{center}
\begin{tikzpicture}[scale=0.45]

\draw[->,thick] (-5,0)--(5,0);
\draw[->,thick] (0,-5)--(0,5);

\draw[fill] (4,2) circle [radius=0.1];
\draw[->,thick] (0,0)--(4-0.1,2-0.1);

\node at (4,2.8) {$(4,2)$};
\end{tikzpicture}
\end{center}

\index{complex@\texttt{complex}}

The C++ complex number class \texttt{complex} is
useful when solving geometric problems.
Using the class we can represent points and vectors
as complex numbers, and the class contains tools
that are useful in geometry.

In the following code, \texttt{C} is the type of
a coordinate and \texttt{P} is the type of a point or a vector.
In addition, the code defines macros \texttt{X} and \texttt{Y}
that can be used to refer to x and y coordinates.

\begin{lstlisting}
typedef long long C;
typedef complex<C> P;
#define X real()
#define Y imag()
\end{lstlisting}

For example, the following code defines a point $p=(4,2)$
and prints its x and y coordinates:

\begin{lstlisting}
P p = {4,2};
cout << p.X << " " << p.Y << "\n"; // 4 2
\end{lstlisting}

The following code defines vectors $v=(3,1)$ and $u=(2,2)$,
and after that calculates the sum $s=v+u$.

\begin{lstlisting}
P v = {3,1};
P u = {2,2};
P s = v+u;
cout << s.X << " " << s.Y << "\n"; // 5 3
\end{lstlisting}

In practice,
an appropriate coordinate type is usually
\texttt{long long} (integer) or \texttt{long double}
(real number).
It is a good idea to use integers whenever possible,
because calculations with integers are exact.
If real numbers are needed,
precision errors should be taken into account
when comparing numbers.
A safe way to check if real numbers $a$ and $b$ are equal
is to compare them using $|a-b|<\epsilon$,
where $\epsilon$ is a small number (for example, $\epsilon=10^{-9}$).
\subsubsection*{Functions}

In the following examples, the coordinate type is
\texttt{long double}.

The function $\texttt{abs}(v)$ calculates the length
$|v|$ of a vector $v=(x,y)$
using the formula $\sqrt{x^2+y^2}$.
The function can also be used for
calculating the distance between points
$(x_1,y_1)$ and $(x_2,y_2)$,
because that distance equals the length
of the vector $(x_2-x_1,y_2-y_1)$.

The following code calculates the distance
between points $(4,2)$ and $(3,-1)$:
\begin{lstlisting}
P a = {4,2};
P b = {3,-1};
cout << abs(b-a) << "\n"; // 3.16228
\end{lstlisting}

The function $\texttt{arg}(v)$ calculates the
angle of a vector $v=(x,y)$ with respect to the x axis.
The function gives the angle in radians,
where $r$ radians equals $180 r/\pi$ degrees.
The angle of a vector that points to the right is 0,
and angles decrease clockwise and increase
counterclockwise.

The function $\texttt{polar}(s,a)$ constructs a vector
whose length is $s$ and that points at an angle $a$.
A vector can be rotated by an angle $a$
by multiplying it by a vector with length 1 and angle $a$.

The following code calculates the angle of
the vector $(4,2)$, rotates it $1/2$ radians
counterclockwise, and then calculates the angle again:

\begin{lstlisting}
P v = {4,2};
cout << arg(v) << "\n"; // 0.463648
v *= polar(1.0,0.5);
cout << arg(v) << "\n"; // 0.963648
\end{lstlisting}
\section{Points and lines}

\index{cross product}

The \key{cross product} $a \times b$ of vectors
$a=(x_1,y_1)$ and $b=(x_2,y_2)$ is calculated
using the formula $x_1 y_2 - x_2 y_1$.
The cross product tells us whether $b$
turns left (positive value), does not turn (zero)
or turns right (negative value)
when it is placed directly after $a$.

The following picture illustrates the above cases:
\begin{center}
\begin{tikzpicture}[scale=0.45]

\draw[->,thick] (0,0)--(4,2);
\draw[->,thick] (4,2)--(4+1,2+2);

\node at (2.5,0.5) {$a$};
\node at (5,2.5) {$b$};

\node at (3,-2) {$a \times b = 6$};

\draw[->,thick] (8+0,0)--(8+4,2);
\draw[->,thick] (8+4,2)--(8+4+2,2+1);

\node at (8+2.5,0.5) {$a$};
\node at (8+5,1.5) {$b$};

\node at (8+3,-2) {$a \times b = 0$};

\draw[->,thick] (16+0,0)--(16+4,2);
\draw[->,thick] (16+4,2)--(16+4+2,2-1);

\node at (16+2.5,0.5) {$a$};
\node at (16+5,2.5) {$b$};

\node at (16+3,-2) {$a \times b = -8$};
\end{tikzpicture}
\end{center}

\noindent
For example, in the first case
$a=(4,2)$ and $b=(1,2)$.
The following code calculates the cross product
using the class \texttt{complex}:

\begin{lstlisting}
P a = {4,2};
P b = {1,2};
C p = (conj(a)*b).Y; // 6
\end{lstlisting}

The above code works, because
the function \texttt{conj} negates the y coordinate
of a vector,
and when the vectors $(x_1,-y_1)$ and $(x_2,y_2)$
are multiplied together, the y coordinate
of the result is $x_1 y_2 - x_2 y_1$.
\subsubsection{Point location}
|
||||
|
||||
Cross products can be used to test
|
||||
whether a point is located on the left or right
|
||||
side of a line.
|
||||
Assume that the line goes through points
|
||||
$s_1$ and $s_2$, we are looking from $s_1$
|
||||
to $s_2$ and the point is $p$.
|
||||
|
||||
For example, in the following picture,
|
||||
$p$ is on the left side of the line:
|
||||
\begin{center}
|
||||
\begin{tikzpicture}[scale=0.45]
|
||||
\draw[dashed,thick,->] (0,-3)--(12,6);
|
||||
\draw[fill] (4,0) circle [radius=0.1];
|
||||
\draw[fill] (8,3) circle [radius=0.1];
|
||||
\draw[fill] (5,3) circle [radius=0.1];
|
||||
\node at (4,-1) {$s_1$};
|
||||
\node at (8,2) {$s_2$};
|
||||
\node at (5,4) {$p$};
|
||||
\end{tikzpicture}
|
||||
\end{center}
|
||||
|
||||
The cross product $(p-s_1) \times (p-s_2)$
|
||||
tells us the location of the point $p$.
|
||||
If the cross product is positive,
|
||||
$p$ is located on the left side,
|
||||
and if the cross product is negative,
|
||||
$p$ is located on the right side.
|
||||
Finally, if the cross product is zero,
|
||||
points $s_1$, $s_2$ and $p$ are on the same line.
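
This test can be sketched as follows with plain integer
coordinates (the book's snippets use the class \texttt{complex};
the function names here are illustrative only):

\begin{lstlisting}
typedef long long ll;
// cross product x_1*y_2 - x_2*y_1 of two vectors
ll cross(ll x1, ll y1, ll x2, ll y2) {
    return x1*y2-x2*y1;
}
// sign of (p-s1) x (p-s2): positive = p is on the left
// side, negative = right side, zero = on the line
ll side(ll sx1, ll sy1, ll sx2, ll sy2, ll px, ll py) {
    return cross(px-sx1, py-sy1, px-sx2, py-sy2);
}
\end{lstlisting}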

\subsubsection{Line segment intersection}

\index{line segment intersection}

Next we consider the problem of testing
whether two line segments
$ab$ and $cd$ intersect. The possible cases are:

\textit{Case 1:}
The line segments are on the same line
and they overlap each other.
In this case, there are infinitely many
intersection points.
For example, in the following picture,
all points between $c$ and $b$ are
intersection points:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\draw (1.5,1.5)--(6,3);
\draw (0,1)--(4.5,2.5);
\draw[fill] (0,1) circle [radius=0.05];
\node at (0,0.5) {$a$};
\draw[fill] (1.5,1.5) circle [radius=0.05];
\node at (6,2.5) {$d$};
\draw[fill] (4.5,2.5) circle [radius=0.05];
\node at (1.5,1) {$c$};
\draw[fill] (6,3) circle [radius=0.05];
\node at (4.5,2) {$b$};
\end{tikzpicture}
\end{center}

In this case, we can use cross products to
check if all points are on the same line.
After this, we can sort the points and check
whether the line segments overlap each other.

\textit{Case 2:}
The line segments have a common vertex
that is the only intersection point.
For example, in the following picture the
intersection point is $b=c$:

\begin{center}
\begin{tikzpicture}[scale=0.9]
\draw (0,0)--(4,2);
\draw (4,2)--(6,1);
\draw[fill] (0,0) circle [radius=0.05];
\draw[fill] (4,2) circle [radius=0.05];
\draw[fill] (6,1) circle [radius=0.05];

\node at (0,0.5) {$a$};
\node at (4,2.5) {$b=c$};
\node at (6,1.5) {$d$};
\end{tikzpicture}
\end{center}

This case is easy to check, because
there are only four possibilities
for the intersection point:
$a=c$, $a=d$, $b=c$ and $b=d$.

\textit{Case 3:}
There is exactly one intersection point
that is not a vertex of any line segment.
In the following picture, the point $p$
is the intersection point:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\draw (0,1)--(6,3);
\draw (2,4)--(4,0);
\draw[fill] (0,1) circle [radius=0.05];
\node at (0,0.5) {$c$};
\draw[fill] (6,3) circle [radius=0.05];
\node at (6,2.5) {$d$};
\draw[fill] (2,4) circle [radius=0.05];
\node at (1.5,3.5) {$a$};
\draw[fill] (4,0) circle [radius=0.05];
\node at (4,-0.4) {$b$};
\draw[fill] (3,2) circle [radius=0.05];
\node at (3,1.5) {$p$};
\end{tikzpicture}
\end{center}

In this case, the line segments intersect
exactly when the points $c$ and $d$ are
on opposite sides of the line through $a$ and $b$,
and the points $a$ and $b$ are on opposite
sides of the line through $c$ and $d$.
We can use cross products to check this.
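
The test for this case can be sketched as follows
(an illustrative implementation, assuming general position,
i.e., that no three of the points are collinear):

\begin{lstlisting}
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
struct pt { ll x, y; };
// cross product (a-o) x (b-o)
ll cross(pt o, pt a, pt b) {
    return (a.x-o.x)*(b.y-o.y)-(b.x-o.x)*(a.y-o.y);
}
// Case 3 test: c and d on opposite sides of line ab,
// and a and b on opposite sides of line cd
bool intersect(pt a, pt b, pt c, pt d) {
    ll d1 = cross(a,b,c), d2 = cross(a,b,d);
    ll d3 = cross(c,d,a), d4 = cross(c,d,b);
    return (d1>0) != (d2>0) && (d3>0) != (d4>0);
}
\end{lstlisting}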

\subsubsection{Point distance from a line}

Another feature of cross products is that
the area of a triangle can be calculated
using the formula
\[\frac{| (a-c) \times (b-c) |}{2},\]
where $a$, $b$ and $c$ are the vertices of the triangle.
Using this fact, we can derive a formula
for calculating the shortest distance between a point and a line.
For example, in the following picture $d$ is the
shortest distance between the point $p$ and the line
that is defined by the points $s_1$ and $s_2$:
\begin{center}
\begin{tikzpicture}[scale=0.75]
\draw (-2,-1)--(6,3);
\draw[dashed] (1,4)--(2.40,1.2);
\node at (0,-0.5) {$s_1$};
\node at (4,1.5) {$s_2$};
\node at (0.5,4) {$p$};
\node at (2,2.7) {$d$};
\draw[fill] (0,0) circle [radius=0.05];
\draw[fill] (4,2) circle [radius=0.05];
\draw[fill] (1,4) circle [radius=0.05];
\end{tikzpicture}
\end{center}

The area of the triangle whose vertices are
$s_1$, $s_2$ and $p$ can be calculated in two ways:
it is both
$\frac{1}{2} |s_2-s_1| d$ and
$\frac{1}{2} |(s_1-p) \times (s_2-p)|$.
Thus, the shortest distance is
\[ d = \frac{|(s_1-p) \times (s_2-p)|}{|s_2-s_1|} .\]
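
A possible implementation of this formula
(an illustrative sketch, not the book's code):

\begin{lstlisting}
#include <cmath>
// shortest distance from point p to the line through s1 and s2
double dist(double sx1, double sy1, double sx2, double sy2,
            double px, double py) {
    // cross product (s1-p) x (s2-p)
    double c = (sx1-px)*(sy2-py)-(sx2-px)*(sy1-py);
    return std::fabs(c)/std::hypot(sx2-sx1, sy2-sy1);
}
\end{lstlisting}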

\subsubsection{Point inside a polygon}

Let us now consider the problem of
testing whether a point is located inside or outside
a polygon.
For example, in the following picture point $a$
is inside the polygon and point $b$ is outside
the polygon.

\begin{center}
\begin{tikzpicture}[scale=0.75]
\draw (0,0)--(2,2)--(5,1)--(2,3)--(1,2)--(-1,2)--(1,4)--(-2,4)--(-2,1)--(-3,3)--(-4,0)--(0,0);

\draw[fill] (-3,1) circle [radius=0.05];
\node at (-3,0.5) {$a$};
\draw[fill] (1,3) circle [radius=0.05];
\node at (1,2.5) {$b$};
\end{tikzpicture}
\end{center}

A convenient way to solve the problem is to
send a \emph{ray} from the point in an arbitrary direction
and calculate the number of times it crosses
the boundary of the polygon.
If the number is odd,
the point is inside the polygon,
and if the number is even,
the point is outside the polygon.

\begin{samepage}
For example, we could send the following rays:
\begin{center}
\begin{tikzpicture}[scale=0.75]
\draw (0,0)--(2,2)--(5,1)--(2,3)--(1,2)--(-1,2)--(1,4)--(-2,4)--(-2,1)--(-3,3)--(-4,0)--(0,0);

\draw[fill] (-3,1) circle [radius=0.05];
\node at (-3,0.5) {$a$};
\draw[fill] (1,3) circle [radius=0.05];
\node at (1,2.5) {$b$};

\draw[dashed,->] (-3,1)--(-6,0);
\draw[dashed,->] (-3,1)--(0,5);

\draw[dashed,->] (1,3)--(3.5,0);
\draw[dashed,->] (1,3)--(3,4);
\end{tikzpicture}
\end{center}
\end{samepage}

The rays from $a$ cross the boundary of the polygon
1 and 3 times, so $a$ is inside the polygon.
Correspondingly, the rays from $b$ cross the boundary
0 and 2 times, so $b$ is outside the polygon.
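
A possible implementation of the ray casting idea is the
following sketch, which sends a horizontal ray to the right
(the behavior for points exactly on the boundary is left
undefined):

\begin{lstlisting}
#include <bits/stdc++.h>
using namespace std;
// returns true if (px,py) is strictly inside the polygon;
// the vertices are given once, without repeating the first one
bool inside(vector<pair<double,double>>& poly,
            double px, double py) {
    int n = poly.size();
    bool in = false;
    for (int i = 0, j = n-1; i < n; j = i++) {
        auto [xi, yi] = poly[i];
        auto [xj, yj] = poly[j];
        // toggle when the ray crosses the edge from vertex j to i
        if ((yi > py) != (yj > py) &&
            px < (xj-xi)*(py-yi)/(yj-yi)+xi) in = !in;
    }
    return in;
}
\end{lstlisting}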

\section{Polygon area}

A general formula for calculating the area
of a polygon, sometimes called the \key{shoelace formula},
is as follows: \index{shoelace formula}
\[\frac{1}{2} |\sum_{i=1}^{n-1} (p_i \times p_{i+1})| =
\frac{1}{2} |\sum_{i=1}^{n-1} (x_i y_{i+1} - x_{i+1} y_i)|, \]
where the vertices are
$p_1=(x_1,y_1)$, $p_2=(x_2,y_2)$, $\ldots$, $p_n=(x_n,y_n)$
in such an order that
$p_i$ and $p_{i+1}$ are adjacent vertices on the boundary
of the polygon,
and the first and last vertices are the same, i.e., $p_1=p_n$.

For example, the area of the polygon
\begin{center}
\begin{tikzpicture}[scale=0.7]
\filldraw (4,1.4) circle (2pt);
\filldraw (7,3.4) circle (2pt);
\filldraw (5,5.4) circle (2pt);
\filldraw (2,4.4) circle (2pt);
\filldraw (4,3.4) circle (2pt);
\node (1) at (4,1) {(4,1)};
\node (2) at (7.2,3) {(7,3)};
\node (3) at (5,5.8) {(5,5)};
\node (4) at (2,4) {(2,4)};
\node (5) at (3.5,3) {(4,3)};
\path[draw] (4,1.4) -- (7,3.4) -- (5,5.4) -- (2,4.4) -- (4,3.4) -- (4,1.4);
\end{tikzpicture}
\end{center}
is
\[\frac{|(2\cdot5-5\cdot4)+(5\cdot3-7\cdot5)+(7\cdot1-4\cdot3)+(4\cdot3-4\cdot1)+(4\cdot4-2\cdot3)|}{2} = 17/2.\]
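
The formula can be implemented, for example, as follows
(an illustrative sketch; note that this code closes the
polygon using indices modulo $n$, so the first vertex is
not repeated in the input):

\begin{lstlisting}
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
// returns twice the area of the polygon, which is always
// an integer when the coordinates are integers
ll area2(vector<pair<ll,ll>>& p) {
    ll s = 0;
    int n = p.size();
    for (int i = 0; i < n; i++) {
        int j = (i+1)%n;
        s += p[i].first*p[j].second-p[j].first*p[i].second;
    }
    return llabs(s);
}
\end{lstlisting}

Returning twice the area avoids fractional values; the true
area is \texttt{area2(p)/2.0}.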

The idea of the formula is to go through trapezoids,
one of whose sides is a side of the polygon
and whose opposite side lies on the horizontal line $y=0$.
For example:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\path[draw,fill=lightgray] (5,5.4) -- (7,3.4) -- (7,0) -- (5,0) -- (5,5.4);
\filldraw (4,1.4) circle (2pt);
\filldraw (7,3.4) circle (2pt);
\filldraw (5,5.4) circle (2pt);
\filldraw (2,4.4) circle (2pt);
\filldraw (4,3.4) circle (2pt);
\node (1) at (4,1) {(4,1)};
\node (2) at (7.2,3) {(7,3)};
\node (3) at (5,5.8) {(5,5)};
\node (4) at (2,4) {(2,4)};
\node (5) at (3.5,3) {(4,3)};
\path[draw] (4,1.4) -- (7,3.4) -- (5,5.4) -- (2,4.4) -- (4,3.4) -- (4,1.4);
\draw (0,0) -- (10,0);
\end{tikzpicture}
\end{center}
The area of such a trapezoid is
\[(x_{i+1}-x_{i}) \frac{y_i+y_{i+1}}{2},\]
where the vertices of the polygon are $p_i$ and $p_{i+1}$.
If $x_{i+1}>x_{i}$, the area is positive,
and if $x_{i+1}<x_{i}$, the area is negative.

The area of the polygon is the sum of areas of
all such trapezoids, which yields the formula
\[|\sum_{i=1}^{n-1} (x_{i+1}-x_{i}) \frac{y_i+y_{i+1}}{2}| =
\frac{1}{2} |\sum_{i=1}^{n-1} (x_i y_{i+1} - x_{i+1} y_i)|.\]

Note that the absolute value of the sum is taken,
because the value of the sum may be positive or negative,
depending on whether we walk clockwise or counterclockwise
along the boundary of the polygon.

\subsubsection{Pick's theorem}

\index{Pick's theorem}

\key{Pick's theorem} provides another way to calculate
the area of a polygon provided that all vertices
of the polygon have integer coordinates.
According to Pick's theorem, the area of the polygon is
\[ a + b/2 - 1,\]
where $a$ is the number of integer points inside the polygon
and $b$ is the number of integer points on the boundary of the polygon.

For example, the area of the polygon
\begin{center}
\begin{tikzpicture}[scale=0.7]
\filldraw (4,1.4) circle (2pt);
\filldraw (7,3.4) circle (2pt);
\filldraw (5,5.4) circle (2pt);
\filldraw (2,4.4) circle (2pt);
\filldraw (4,3.4) circle (2pt);
\node (1) at (4,1) {(4,1)};
\node (2) at (7.2,3) {(7,3)};
\node (3) at (5,5.8) {(5,5)};
\node (4) at (2,4) {(2,4)};
\node (5) at (3.5,3) {(4,3)};
\path[draw] (4,1.4) -- (7,3.4) -- (5,5.4) -- (2,4.4) -- (4,3.4) -- (4,1.4);

\filldraw (2,4.4) circle (2pt);
\filldraw (3,4.4) circle (2pt);
\filldraw (4,4.4) circle (2pt);
\filldraw (5,4.4) circle (2pt);
\filldraw (6,4.4) circle (2pt);

\filldraw (4,3.4) circle (2pt);
\filldraw (5,3.4) circle (2pt);
\filldraw (6,3.4) circle (2pt);
\filldraw (7,3.4) circle (2pt);

\filldraw (4,2.4) circle (2pt);
\filldraw (5,2.4) circle (2pt);
\end{tikzpicture}
\end{center}
is $6+7/2-1=17/2$.
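
Conversely, Pick's theorem can be used to count the lattice
points strictly inside an integer polygon: an edge from
$p_i$ to $p_{i+1}$ contains $\gcd(|\Delta x|,|\Delta y|)$
lattice points when its first endpoint is excluded, and then
$a = A - b/2 + 1$. A sketch of this idea (not the book's code):

\begin{lstlisting}
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
// lattice points strictly inside an integer polygon,
// from Pick's theorem: a = A - b/2 + 1
ll interior(vector<pair<ll,ll>>& p) {
    int n = p.size();
    ll a2 = 0, b = 0; // a2 = twice the area, b = boundary points
    for (int i = 0; i < n; i++) {
        int j = (i+1)%n;
        a2 += p[i].first*p[j].second-p[j].first*p[i].second;
        // lattice points on the edge, excluding its first endpoint
        b += __gcd(llabs(p[j].first-p[i].first),
                   llabs(p[j].second-p[i].second));
    }
    return (llabs(a2)-b)/2+1;
}
\end{lstlisting}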

\section{Distance functions}

\index{distance function}
\index{Euclidean distance}
\index{Manhattan distance}

A \key{distance function} defines the distance between
two points.
The usual distance function is the
\key{Euclidean distance} where the distance between
points $(x_1,y_1)$ and $(x_2,y_2)$ is
\[\sqrt{(x_2-x_1)^2+(y_2-y_1)^2}.\]
An alternative distance function is the
\key{Manhattan distance}
where the distance between points
$(x_1,y_1)$ and $(x_2,y_2)$ is
\[|x_1-x_2|+|y_1-y_2|.\]
\begin{samepage}
For example, consider the following picture:
\begin{center}
\begin{tikzpicture}

\draw[fill] (2,1) circle [radius=0.05];
\draw[fill] (5,2) circle [radius=0.05];

\node at (2,0.5) {$(2,1)$};
\node at (5,1.5) {$(5,2)$};

\draw[dashed] (2,1) -- (5,2);

\draw[fill] (5+2,1) circle [radius=0.05];
\draw[fill] (5+5,2) circle [radius=0.05];

\node at (5+2,0.5) {$(2,1)$};
\node at (5+5,1.5) {$(5,2)$};

\draw[dashed] (5+2,1) -- (5+2,2);
\draw[dashed] (5+2,2) -- (5+5,2);

\node at (3.5,-0.5) {Euclidean distance};
\node at (5+3.5,-0.5) {Manhattan distance};
\end{tikzpicture}
\end{center}
\end{samepage}
The Euclidean distance between the points is
\[\sqrt{(5-2)^2+(2-1)^2}=\sqrt{10}\]
and the Manhattan distance is
\[|5-2|+|2-1|=4.\]
The following picture shows regions that are within a distance of 1
from the center point, using the Euclidean and Manhattan distances:
\begin{center}
\begin{tikzpicture}

\draw[fill=gray!20] (0,0) circle [radius=1];
\draw[fill] (0,0) circle [radius=0.05];

\node at (0,-1.5) {Euclidean distance};

\draw[fill=gray!20] (5+0,1) -- (5-1,0) -- (5+0,-1) -- (5+1,0) -- (5+0,1);
\draw[fill] (5,0) circle [radius=0.05];
\node at (5,-1.5) {Manhattan distance};
\end{tikzpicture}
\end{center}

\subsubsection{Rotating coordinates}

Some problems are easier to solve if
Manhattan distances are used instead of Euclidean distances.
As an example, consider a problem where we are given
$n$ points in the two-dimensional plane
and our task is to calculate the maximum Manhattan
distance between any two points.

For example, consider the following set of points:
\begin{center}
\begin{tikzpicture}[scale=0.65]
\draw[color=gray] (-1,-1) grid (4,4);

\filldraw (0,2) circle (2.5pt);
\filldraw (3,3) circle (2.5pt);
\filldraw (1,0) circle (2.5pt);
\filldraw (3,1) circle (2.5pt);

\node at (0,1.5) {$A$};
\node at (3,2.5) {$C$};
\node at (1,-0.5) {$B$};
\node at (3,0.5) {$D$};
\end{tikzpicture}
\end{center}
The maximum Manhattan distance is 5,
attained between points $B$ and $C$:
\begin{center}
\begin{tikzpicture}[scale=0.65]
\draw[color=gray] (-1,-1) grid (4,4);

\filldraw (0,2) circle (2.5pt);
\filldraw (3,3) circle (2.5pt);
\filldraw (1,0) circle (2.5pt);
\filldraw (3,1) circle (2.5pt);

\node at (0,1.5) {$A$};
\node at (3,2.5) {$C$};
\node at (1,-0.5) {$B$};
\node at (3,0.5) {$D$};

\path[draw=red,thick,line width=2pt] (1,0) -- (1,3) -- (3,3);
\end{tikzpicture}
\end{center}

A useful technique related to Manhattan distances
is to rotate all coordinates by 45 degrees so that
a point $(x,y)$ becomes $(x+y,y-x)$.
For example, after rotating the above points,
the result is:

\begin{center}
\begin{tikzpicture}[scale=0.6]
\draw[color=gray] (0,-3) grid (7,3);

\filldraw (2,2) circle (2.5pt);
\filldraw (6,0) circle (2.5pt);
\filldraw (1,-1) circle (2.5pt);
\filldraw (4,-2) circle (2.5pt);

\node at (2,1.5) {$A$};
\node at (6,-0.5) {$C$};
\node at (1,-1.5) {$B$};
\node at (4,-2.5) {$D$};
\end{tikzpicture}
\end{center}
And the maximum distance is as follows:
\begin{center}
\begin{tikzpicture}[scale=0.6]
\draw[color=gray] (0,-3) grid (7,3);

\filldraw (2,2) circle (2.5pt);
\filldraw (6,0) circle (2.5pt);
\filldraw (1,-1) circle (2.5pt);
\filldraw (4,-2) circle (2.5pt);

\node at (2,1.5) {$A$};
\node at (6,-0.5) {$C$};
\node at (1,-1.5) {$B$};
\node at (4,-2.5) {$D$};

\path[draw=red,thick,line width=2pt] (1,-1) -- (4,2) -- (6,0);
\end{tikzpicture}
\end{center}

Consider two points $p_1=(x_1,y_1)$ and $p_2=(x_2,y_2)$ whose rotated
coordinates are $p'_1=(x'_1,y'_1)$ and $p'_2=(x'_2,y'_2)$.
Now there are two ways to express the Manhattan distance
between $p_1$ and $p_2$:
\[|x_1-x_2|+|y_1-y_2| = \max(|x'_1-x'_2|,|y'_1-y'_2|)\]

For example, if $p_1=(1,0)$ and $p_2=(3,3)$,
the rotated coordinates are $p'_1=(1,-1)$ and $p'_2=(6,0)$
and the Manhattan distance is
\[|1-3|+|0-3| = \max(|1-6|,|-1-0|) = 5.\]

The rotated coordinates provide a simple way
to operate with Manhattan distances, because we can
consider x and y coordinates separately.
To maximize the Manhattan distance between two points,
we should find two points whose
rotated coordinates maximize the value of
\[\max(|x'_1-x'_2|,|y'_1-y'_2|).\]
This is easy, because either the horizontal or the vertical
difference of the rotated coordinates has to be maximal.
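
Using this observation, the maximum Manhattan distance can be
computed, for example, as follows (an illustrative sketch):

\begin{lstlisting}
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
// maximum Manhattan distance between any two points,
// via the rotation (x,y) -> (x+y, y-x)
ll maxManhattan(vector<pair<ll,ll>>& pts) {
    ll amin = LLONG_MAX, amax = LLONG_MIN;
    ll bmin = LLONG_MAX, bmax = LLONG_MIN;
    for (auto [x,y] : pts) {
        amin = min(amin, x+y); amax = max(amax, x+y);
        bmin = min(bmin, y-x); bmax = max(bmax, y-x);
    }
    return max(amax-amin, bmax-bmin);
}
\end{lstlisting}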

\chapter{Sweep line algorithms}

\index{sweep line}

Many geometric problems can be solved using
\key{sweep line} algorithms.
The idea in such algorithms is to represent
an instance of the problem as a set of events that correspond
to points in the plane.
The events are processed in increasing order
according to their x or y coordinates.

As an example, consider the following problem:
There is a company that has $n$ employees,
and we know for each employee their arrival and
leaving times on a certain day.
Our task is to calculate the maximum number of
employees that were in the office at the same time.

The problem can be solved by modeling the situation
so that each employee is assigned two events that
correspond to their arrival and leaving times.
After sorting the events, we go through them
and keep track of the number of people in the office.
For example, the table
\begin{center}
\begin{tabular}{ccc}
person & arrival time & leaving time \\
\hline
John & 10 & 15 \\
Maria & 6 & 12 \\
Peter & 14 & 16 \\
Lisa & 5 & 13 \\
\end{tabular}
\end{center}
corresponds to the following events:
\begin{center}
\begin{tikzpicture}[scale=0.6]
\draw (0,0) rectangle (17,-6.5);
\path[draw,thick,-] (10,-1) -- (15,-1);
\path[draw,thick,-] (6,-2.5) -- (12,-2.5);
\path[draw,thick,-] (14,-4) -- (16,-4);
\path[draw,thick,-] (5,-5.5) -- (13,-5.5);

\draw[fill] (10,-1) circle [radius=0.05];
\draw[fill] (15,-1) circle [radius=0.05];
\draw[fill] (6,-2.5) circle [radius=0.05];
\draw[fill] (12,-2.5) circle [radius=0.05];
\draw[fill] (14,-4) circle [radius=0.05];
\draw[fill] (16,-4) circle [radius=0.05];
\draw[fill] (5,-5.5) circle [radius=0.05];
\draw[fill] (13,-5.5) circle [radius=0.05];

\node at (2,-1) {John};
\node at (2,-2.5) {Maria};
\node at (2,-4) {Peter};
\node at (2,-5.5) {Lisa};
\end{tikzpicture}
\end{center}
We go through the events from left to right
and maintain a counter.
Whenever a person arrives, we increase
the value of the counter by one,
and when a person leaves,
we decrease the value of the counter by one.
The answer to the problem is the maximum
value of the counter during the algorithm.

In the example, the events are processed as follows:
\begin{center}
\begin{tikzpicture}[scale=0.6]
\path[draw,thick,->] (0.5,0.5) -- (16.5,0.5);
\draw (0,0) rectangle (17,-6.5);
\path[draw,thick,-] (10,-1) -- (15,-1);
\path[draw,thick,-] (6,-2.5) -- (12,-2.5);
\path[draw,thick,-] (14,-4) -- (16,-4);
\path[draw,thick,-] (5,-5.5) -- (13,-5.5);

\draw[fill] (10,-1) circle [radius=0.05];
\draw[fill] (15,-1) circle [radius=0.05];
\draw[fill] (6,-2.5) circle [radius=0.05];
\draw[fill] (12,-2.5) circle [radius=0.05];
\draw[fill] (14,-4) circle [radius=0.05];
\draw[fill] (16,-4) circle [radius=0.05];
\draw[fill] (5,-5.5) circle [radius=0.05];
\draw[fill] (13,-5.5) circle [radius=0.05];

\node at (2,-1) {John};
\node at (2,-2.5) {Maria};
\node at (2,-4) {Peter};
\node at (2,-5.5) {Lisa};

\path[draw,dashed] (10,0)--(10,-6.5);
\path[draw,dashed] (15,0)--(15,-6.5);
\path[draw,dashed] (6,0)--(6,-6.5);
\path[draw,dashed] (12,0)--(12,-6.5);
\path[draw,dashed] (14,0)--(14,-6.5);
\path[draw,dashed] (16,0)--(16,-6.5);
\path[draw,dashed] (5,0)--(5,-6.5);
\path[draw,dashed] (13,0)--(13,-6.5);

\node at (10,-7) {$+$};
\node at (15,-7) {$-$};
\node at (6,-7) {$+$};
\node at (12,-7) {$-$};
\node at (14,-7) {$+$};
\node at (16,-7) {$-$};
\node at (5,-7) {$+$};
\node at (13,-7) {$-$};

\node at (10,-8) {$3$};
\node at (15,-8) {$1$};
\node at (6,-8) {$2$};
\node at (12,-8) {$2$};
\node at (14,-8) {$2$};
\node at (16,-8) {$0$};
\node at (5,-8) {$1$};
\node at (13,-8) {$1$};
\end{tikzpicture}
\end{center}
The symbols $+$ and $-$ indicate whether the
value of the counter increases or decreases,
and the value of the counter is shown below.
The maximum value of the counter is 3
between John's arrival and Maria's leaving.

The running time of the algorithm is $O(n \log n)$,
because sorting the events takes $O(n \log n)$ time
and the rest of the algorithm takes $O(n)$ time.
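
The algorithm can be sketched as follows (illustrative code;
at equal times, leaving events are processed before arrival
events):

\begin{lstlisting}
#include <bits/stdc++.h>
using namespace std;
// events: (time, +1) for an arrival, (time, -1) for a leaving
int maxCount(vector<pair<int,int>> events) {
    sort(events.begin(), events.end());
    int cnt = 0, best = 0;
    for (auto [t,d] : events) {
        cnt += d;
        best = max(best, cnt);
    }
    return best;
}
\end{lstlisting}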

\section{Intersection points}

\index{intersection point}

Given a set of $n$ line segments, each of them being either
horizontal or vertical, consider the problem of
counting the total number of intersection points.
For example, when the line segments are
\begin{center}
\begin{tikzpicture}[scale=0.5]
\path[draw,thick,-] (0,2) -- (5,2);
\path[draw,thick,-] (1,4) -- (6,4);
\path[draw,thick,-] (6,3) -- (10,3);
\path[draw,thick,-] (2,1) -- (2,6);
\path[draw,thick,-] (8,2) -- (8,5);
\end{tikzpicture}
\end{center}
there are three intersection points:
\begin{center}
\begin{tikzpicture}[scale=0.5]
\path[draw,thick,-] (0,2) -- (5,2);
\path[draw,thick,-] (1,4) -- (6,4);
\path[draw,thick,-] (6,3) -- (10,3);
\path[draw,thick,-] (2,1) -- (2,6);
\path[draw,thick,-] (8,2) -- (8,5);

\draw[fill] (2,2) circle [radius=0.15];
\draw[fill] (2,4) circle [radius=0.15];
\draw[fill] (8,3) circle [radius=0.15];
\end{tikzpicture}
\end{center}

It is easy to solve the problem in $O(n^2)$ time,
because we can go through all possible pairs of line segments
and check if they intersect.
However, we can solve the problem more efficiently
in $O(n \log n)$ time using a sweep line algorithm
and a range query data structure.

The idea is to process the endpoints of the line
segments from left to right and
focus on three types of events:
\begin{enumerate}[noitemsep]
\item[(1)] horizontal segment begins
\item[(2)] horizontal segment ends
\item[(3)] vertical segment
\end{enumerate}

The following events correspond to the example:
\begin{center}
\begin{tikzpicture}[scale=0.6]
\path[draw,dashed] (0,2) -- (5,2);
\path[draw,dashed] (1,4) -- (6,4);
\path[draw,dashed] (6,3) -- (10,3);
\path[draw,dashed] (2,1) -- (2,6);
\path[draw,dashed] (8,2) -- (8,5);

\node at (0,2) {$1$};
\node at (5,2) {$2$};
\node at (1,4) {$1$};
\node at (6,4) {$2$};
\node at (6,3) {$1$};
\node at (10,3) {$2$};

\node at (2,3.5) {$3$};
\node at (8,3.5) {$3$};
\end{tikzpicture}
\end{center}

We go through the events from left to right
and use a data structure that maintains a set of
y coordinates where there is an active horizontal segment.
At event 1, we add the y coordinate of the segment
to the set, and at event 2, we remove the
y coordinate from the set.

Intersection points are calculated at event 3.
When there is a vertical segment between points
$y_1$ and $y_2$, we count the number of active
horizontal segments whose y coordinate is between
$y_1$ and $y_2$, and add this number to the total
number of intersection points.

To store y coordinates of horizontal segments,
we can use a binary indexed tree or a segment tree,
possibly with index compression.
When such structures are used, processing each event
takes $O(\log n)$ time, so the total running
time of the algorithm is $O(n \log n)$.
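
The algorithm can be sketched as follows. This illustrative
implementation assumes that all y coordinates are integers in
$[0,m)$, so index compression is omitted:

\begin{lstlisting}
#include <bits/stdc++.h>
using namespace std;
// binary indexed tree over y coordinates
struct BIT {
    vector<int> t;
    BIT(int n) : t(n+1) {}
    void add(int i, int v) {
        for (i++; i < (int)t.size(); i += i&-i) t[i] += v;
    }
    int sum(int i) { // prefix sum of [0,i]
        int s = 0;
        for (i++; i > 0; i -= i&-i) s += t[i];
        return s;
    }
};
// hor: (x1,x2,y) with x1 < x2, ver: (x,y1,y2) with y1 < y2
long long countIntersections(vector<array<int,3>> hor,
                             vector<array<int,3>> ver, int m) {
    // event types: 0 = segment begins, 1 = vertical, 2 = segment ends
    vector<array<int,4>> ev;
    for (auto [x1,x2,y] : hor) {
        ev.push_back({x1,0,y,0});
        ev.push_back({x2,2,y,0});
    }
    for (auto [x,y1,y2] : ver) ev.push_back({x,1,y1,y2});
    sort(ev.begin(), ev.end());
    BIT bit(m);
    long long total = 0;
    for (auto [x,type,a,b] : ev) {
        if (type == 0) bit.add(a,1);
        else if (type == 2) bit.add(a,-1);
        else total += bit.sum(b)-(a ? bit.sum(a-1) : 0);
    }
    return total;
}
\end{lstlisting}

Sorting the events by (x, type) makes a horizontal segment
active for verticals at both of its endpoints.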

\section{Closest pair problem}

\index{closest pair}

Given a set of $n$ points, our next problem is
to find two points whose Euclidean distance is minimum.
For example, if the points are
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0)--(12,0)--(12,4)--(0,4)--(0,0);

\draw (1,2) circle [radius=0.1];
\draw (3,1) circle [radius=0.1];
\draw (4,3) circle [radius=0.1];
\draw (5.5,1.5) circle [radius=0.1];
\draw (6,2.5) circle [radius=0.1];
\draw (7,1) circle [radius=0.1];
\draw (9,1.5) circle [radius=0.1];
\draw (10,2) circle [radius=0.1];
\draw (1.5,3.5) circle [radius=0.1];
\draw (1.5,1) circle [radius=0.1];
\draw (2.5,3) circle [radius=0.1];
\draw (4.5,1.5) circle [radius=0.1];
\draw (5.25,0.5) circle [radius=0.1];
\draw (6.5,2) circle [radius=0.1];
\end{tikzpicture}
\end{center}
\begin{samepage}
we should find the following points:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0)--(12,0)--(12,4)--(0,4)--(0,0);

\draw (1,2) circle [radius=0.1];
\draw (3,1) circle [radius=0.1];
\draw (4,3) circle [radius=0.1];
\draw (5.5,1.5) circle [radius=0.1];
\draw[fill] (6,2.5) circle [radius=0.1];
\draw (7,1) circle [radius=0.1];
\draw (9,1.5) circle [radius=0.1];
\draw (10,2) circle [radius=0.1];
\draw (1.5,3.5) circle [radius=0.1];
\draw (1.5,1) circle [radius=0.1];
\draw (2.5,3) circle [radius=0.1];
\draw (4.5,1.5) circle [radius=0.1];
\draw (5.25,0.5) circle [radius=0.1];
\draw[fill] (6.5,2) circle [radius=0.1];
\end{tikzpicture}
\end{center}
\end{samepage}

This is another example of a problem
that can be solved in $O(n \log n)$ time
using a sweep line algorithm\footnote{Besides this approach,
there is also an
$O(n \log n)$ time divide-and-conquer algorithm \cite{sha75}
that divides the points into two sets and recursively
solves the problem for both sets.}.
We go through the points from left to right
and maintain a value $d$: the minimum distance
between two points seen so far.
At each point, we find the nearest point to the left.
If the distance is less than $d$, it is the
new minimum distance and we update
the value of $d$.

If the current point is $(x,y)$
and there is a point to the left
within a distance of less than $d$,
the x coordinate of such a point must
be in the range $[x-d,x]$ and its y coordinate
must be in the range $[y-d,y+d]$.
Thus, it suffices to only consider points
that are located in those ranges,
which makes the algorithm efficient.

For example, in the following picture, the
region marked with dashed lines contains
the points that can be within a distance of $d$
from the active point:

\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0)--(12,0)--(12,4)--(0,4)--(0,0);

\draw (1,2) circle [radius=0.1];
\draw (3,1) circle [radius=0.1];
\draw (4,3) circle [radius=0.1];
\draw (5.5,1.5) circle [radius=0.1];
\draw (6,2.5) circle [radius=0.1];
\draw (7,1) circle [radius=0.1];
\draw (9,1.5) circle [radius=0.1];
\draw (10,2) circle [radius=0.1];
\draw (1.5,3.5) circle [radius=0.1];
\draw (1.5,1) circle [radius=0.1];
\draw (2.5,3) circle [radius=0.1];
\draw (4.5,1.5) circle [radius=0.1];
\draw (5.25,0.5) circle [radius=0.1];
\draw[fill] (6.5,2) circle [radius=0.1];

\draw[dashed] (6.5,0.75)--(6.5,3.25);
\draw[dashed] (5.25,0.75)--(5.25,3.25);
\draw[dashed] (5.25,0.75)--(6.5,0.75);
\draw[dashed] (5.25,3.25)--(6.5,3.25);

\draw [decoration={brace}, decorate, line width=0.3mm] (5.25,3.5) -- (6.5,3.5);
\node at (5.875,4) {$d$};
\draw [decoration={brace}, decorate, line width=0.3mm] (6.75,3.25) -- (6.75,2);
\node at (7.25,2.625) {$d$};
\end{tikzpicture}
\end{center}

The efficiency of the algorithm is based on the fact
that the region always contains
only $O(1)$ points.
We can go through those points in $O(\log n)$ time
by maintaining a set of points whose x coordinate
is in the range $[x-d,x]$, in increasing order according
to their y coordinates.

The time complexity of the algorithm is $O(n \log n)$,
because we go through $n$ points and
find for each point the nearest point to the left
in $O(\log n)$ time.
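
The algorithm can be sketched as follows (illustrative code;
the points are assumed to be distinct):

\begin{lstlisting}
#include <bits/stdc++.h>
using namespace std;
// sweep line closest pair; requires at least two points
double closestPair(vector<pair<double,double>> p) {
    sort(p.begin(), p.end());
    set<pair<double,double>> box; // candidate points as (y,x)
    double d = 1e18;
    int j = 0;
    for (int i = 0; i < (int)p.size(); i++) {
        auto [x,y] = p[i];
        // drop points whose x coordinate is smaller than x-d
        while (j < i && p[j].first < x-d) {
            box.erase({p[j].second, p[j].first});
            j++;
        }
        // candidates have y coordinate in [y-d, y+d]
        auto it = box.lower_bound({y-d, -1e18});
        while (it != box.end() && it->first <= y+d) {
            d = min(d, hypot(x-it->second, y-it->first));
            it++;
        }
        box.insert({y,x});
    }
    return d;
}
\end{lstlisting}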
|
||||
|
||||
\section{Convex hull problem}
|
||||
|
||||
A \key{convex hull} is the smallest convex polygon
|
||||
that contains all points of a given set.
|
||||
Convexity means that a line segment between
|
||||
any two vertices of the polygon is completely
|
||||
inside the polygon.
|
||||
|
||||
\begin{samepage}
For example, for the points
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) circle [radius=0.1];
\draw (4,-1) circle [radius=0.1];
\draw (7,1) circle [radius=0.1];
\draw (6,3) circle [radius=0.1];
\draw (2,4) circle [radius=0.1];
\draw (0,2) circle [radius=0.1];

\draw (1,1) circle [radius=0.1];
\draw (2,2) circle [radius=0.1];
\draw (3,2) circle [radius=0.1];
\draw (4,0) circle [radius=0.1];
\draw (4,3) circle [radius=0.1];
\draw (5,2) circle [radius=0.1];
\draw (6,1) circle [radius=0.1];
\end{tikzpicture}
\end{center}
\end{samepage}
the convex hull is as follows:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0)--(4,-1)--(7,1)--(6,3)--(2,4)--(0,2)--(0,0);

\draw (0,0) circle [radius=0.1];
\draw (4,-1) circle [radius=0.1];
\draw (7,1) circle [radius=0.1];
\draw (6,3) circle [radius=0.1];
\draw (2,4) circle [radius=0.1];
\draw (0,2) circle [radius=0.1];

\draw (1,1) circle [radius=0.1];
\draw (2,2) circle [radius=0.1];
\draw (3,2) circle [radius=0.1];
\draw (4,0) circle [radius=0.1];
\draw (4,3) circle [radius=0.1];
\draw (5,2) circle [radius=0.1];
\draw (6,1) circle [radius=0.1];
\end{tikzpicture}
\end{center}

\index{Andrew's algorithm}

\key{Andrew's algorithm} \cite{and79} provides
an easy way to
construct the convex hull for a set of points
in $O(n \log n)$ time.
The algorithm first locates the leftmost
and rightmost points, and then
constructs the convex hull in two parts:
first the upper hull and then the lower hull.
Both parts are similar, so we can focus on
constructing the upper hull.

First, we sort the points primarily according to
x coordinates and secondarily according to y coordinates.
After this, we go through the points one by one and
add each point to the hull.
Each time after adding a point to the hull,
we make sure that the last line segment
of the hull does not turn left.
As long as it turns left, we remove the
second last point from the hull.
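The steps above can be sketched in code. The following is our own illustration (the helper names \texttt{cross} and \texttt{convexHull} are not from the book): it builds the upper hull from left to right, then the lower hull from right to left with the same rule, and a cross product decides whether the last segment turns left.

```cpp
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
typedef pair<ll,ll> P;

// cross product of (b-a) and (c-a): positive when a->b->c turns left
ll cross(P a, P b, P c) {
    return (b.first - a.first) * (c.second - a.second)
         - (b.second - a.second) * (c.first - a.first);
}

// Sketch of Andrew's algorithm: upper hull left to right, then lower
// hull right to left; returns the hull vertices in clockwise order.
vector<P> convexHull(vector<P> p) {
    sort(p.begin(), p.end());        // by x coordinate, then by y coordinate
    if (p.size() < 3) return p;
    vector<P> hull;
    for (P pt : p) {                 // upper hull
        while (hull.size() >= 2 &&
               cross(hull[hull.size()-2], hull.back(), pt) > 0)
            hull.pop_back();         // the last segment turned left: undo
        hull.push_back(pt);
    }
    size_t upper = hull.size();
    for (int i = (int)p.size() - 2; i >= 0; i--) {   // lower hull
        while (hull.size() > upper &&
               cross(hull[hull.size()-2], hull.back(), p[i]) > 0)
            hull.pop_back();
        hull.push_back(p[i]);
    }
    hull.pop_back();                 // the first point was added twice
    return hull;
}
```

On the example point set of this section, the function returns the six hull vertices shown in the picture above.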

The following pictures show how
Andrew's algorithm works:
\\
% Every step shows the same point set, so we draw it with a macro:
\newcommand{\hullstep}[1]{%
\begin{tikzpicture}[scale=0.3]
\draw (-1,-2)--(8,-2)--(8,5)--(-1,5)--(-1,-2);
\draw (0,0) circle [radius=0.1];
\draw (4,-1) circle [radius=0.1];
\draw (7,1) circle [radius=0.1];
\draw (6,3) circle [radius=0.1];
\draw (2,4) circle [radius=0.1];
\draw (0,2) circle [radius=0.1];
\draw (1,1) circle [radius=0.1];
\draw (2,2) circle [radius=0.1];
\draw (3,2) circle [radius=0.1];
\draw (4,0) circle [radius=0.1];
\draw (4,3) circle [radius=0.1];
\draw (5,2) circle [radius=0.1];
\draw (6,1) circle [radius=0.1];
\draw #1;
\end{tikzpicture}}
\begin{tabular}{ccccccc}
\\
\hullstep{(0,0)--(0,2)} & \hspace{0.1cm} &
\hullstep{(0,0)--(0,2)--(1,1)} & \hspace{0.1cm} &
\hullstep{(0,0)--(0,2)--(1,1)--(2,2)} & \hspace{0.1cm} &
\hullstep{(0,0)--(0,2)--(2,2)}
\\
1 & & 2 & & 3 & & 4 \\
\end{tabular}
\\
\begin{tabular}{ccccccc}
\hullstep{(0,0)--(0,2)--(2,2)--(2,4)} & \hspace{0.1cm} &
\hullstep{(0,0)--(0,2)--(2,4)} & \hspace{0.1cm} &
\hullstep{(0,0)--(0,2)--(2,4)--(3,2)} & \hspace{0.1cm} &
\hullstep{(0,0)--(0,2)--(2,4)--(3,2)--(4,-1)}
\\
5 & & 6 & & 7 & & 8 \\
\end{tabular}
\\
\begin{tabular}{ccccccc}
\hullstep{(0,0)--(0,2)--(2,4)--(3,2)--(4,-1)--(4,0)} & \hspace{0.1cm} &
\hullstep{(0,0)--(0,2)--(2,4)--(3,2)--(4,0)} & \hspace{0.1cm} &
\hullstep{(0,0)--(0,2)--(2,4)--(3,2)--(4,0)--(4,3)} & \hspace{0.1cm} &
\hullstep{(0,0)--(0,2)--(2,4)--(3,2)--(4,3)}
\\
9 & & 10 & & 11 & & 12 \\
\end{tabular}
\\
\begin{tabular}{ccccccc}
\hullstep{(0,0)--(0,2)--(2,4)--(4,3)} & \hspace{0.1cm} &
\hullstep{(0,0)--(0,2)--(2,4)--(4,3)--(5,2)} & \hspace{0.1cm} &
\hullstep{(0,0)--(0,2)--(2,4)--(4,3)--(5,2)--(6,1)} & \hspace{0.1cm} &
\hullstep{(0,0)--(0,2)--(2,4)--(4,3)--(5,2)--(6,1)--(6,3)}
\\
13 & & 14 & & 15 & & 16 \\
\end{tabular}
\\
\begin{tabular}{ccccccc}
\hullstep{(0,0)--(0,2)--(2,4)--(4,3)--(5,2)--(6,3)} & \hspace{0.1cm} &
\hullstep{(0,0)--(0,2)--(2,4)--(4,3)--(6,3)} & \hspace{0.1cm} &
\hullstep{(0,0)--(0,2)--(2,4)--(6,3)} & \hspace{0.1cm} &
\hullstep{(0,0)--(0,2)--(2,4)--(6,3)--(7,1)}
\\
17 & & 18 & & 19 & & 20
\end{tabular}

\begin{thebibliography}{9}

\bibitem{aho83}
A. V. Aho, J. E. Hopcroft and J. Ullman.
\emph{Data Structures and Algorithms},
Addison-Wesley, 1983.

\bibitem{ahu91}
R. K. Ahuja and J. B. Orlin.
Distance directed augmenting path algorithms for maximum flow and parametric maximum flow problems.
\emph{Naval Research Logistics}, 38(3):413--430, 1991.

\bibitem{and79}
A. M. Andrew.
Another efficient algorithm for convex hulls in two dimensions.
\emph{Information Processing Letters}, 9(5):216--219, 1979.

\bibitem{asp79}
B. Aspvall, M. F. Plass and R. E. Tarjan.
A linear-time algorithm for testing the truth of certain quantified boolean formulas.
\emph{Information Processing Letters}, 8(3):121--123, 1979.

\bibitem{bel58}
R. Bellman.
On a routing problem.
\emph{Quarterly of Applied Mathematics}, 16(1):87--90, 1958.

\bibitem{bec07}
M. Beck, E. Pine, W. Tarrat and K. Y. Jensen.
New integer representations as the sum of three cubes.
\emph{Mathematics of Computation}, 76(259):1683--1690, 2007.

\bibitem{ben00}
M. A. Bender and M. Farach-Colton.
The LCA problem revisited. In
\emph{Latin American Symposium on Theoretical Informatics}, 88--94, 2000.

\bibitem{ben86}
J. Bentley.
\emph{Programming Pearls}.
Addison-Wesley, 1999 (2nd edition).

\bibitem{ben80}
J. Bentley and D. Wood.
An optimal worst case algorithm for reporting intersections of rectangles.
\emph{IEEE Transactions on Computers}, C-29(7):571--577, 1980.

\bibitem{bou01}
C. L. Bouton.
Nim, a game with a complete mathematical theory.
\emph{Annals of Mathematics}, 3(1/4):35--39, 1901.

% \bibitem{bur97}
% W. Burnside.
% \emph{Theory of Groups of Finite Order},
% Cambridge University Press, 1897.

\bibitem{coci}
Croatian Open Competition in Informatics, \url{http://hsin.hr/coci/}

\bibitem{cod15}
Codeforces: On ``Mo's algorithm'',
\url{http://codeforces.com/blog/entry/20032}

\bibitem{cor09}
T. H. Cormen, C. E. Leiserson, R. L. Rivest and C. Stein.
\emph{Introduction to Algorithms}, MIT Press, 2009 (3rd edition).

\bibitem{dij59}
E. W. Dijkstra.
A note on two problems in connexion with graphs.
\emph{Numerische Mathematik}, 1(1):269--271, 1959.

\bibitem{dik12}
K. Diks et al.
\emph{Looking for a Challenge? The Ultimate Problem Set from
the University of Warsaw Programming Competitions}, University of Warsaw, 2012.

% \bibitem{dil50}
% R. P. Dilworth.
% A decomposition theorem for partially ordered sets.
% \emph{Annals of Mathematics}, 51(1):161--166, 1950.

% \bibitem{dir52}
% G. A. Dirac.
% Some theorems on abstract graphs.
% \emph{Proceedings of the London Mathematical Society}, 3(1):69--81, 1952.

\bibitem{dim15}
M. Dima and R. Ceterchi.
Efficient range minimum queries using binary indexed trees.
\emph{Olympiad in Informatics}, 9(1):39--44, 2015.

\bibitem{edm65}
J. Edmonds.
Paths, trees, and flowers.
\emph{Canadian Journal of Mathematics}, 17(3):449--467, 1965.

\bibitem{edm72}
J. Edmonds and R. M. Karp.
Theoretical improvements in algorithmic efficiency for network flow problems.
\emph{Journal of the ACM}, 19(2):248--264, 1972.

\bibitem{eve75}
S. Even, A. Itai and A. Shamir.
On the complexity of time table and multi-commodity flow problems.
\emph{16th Annual Symposium on Foundations of Computer Science}, 184--193, 1975.

\bibitem{fan94}
D. Fanding.
A faster algorithm for shortest-path -- SPFA.
\emph{Journal of Southwest Jiaotong University}, 2, 1994.

\bibitem{fen94}
P. M. Fenwick.
A new data structure for cumulative frequency tables.
\emph{Software: Practice and Experience}, 24(3):327--336, 1994.

\bibitem{fis06}
J. Fischer and V. Heun.
Theoretical and practical improvements on the RMQ-problem, with applications to LCA and LCE.
In \emph{Annual Symposium on Combinatorial Pattern Matching}, 36--48, 2006.

\bibitem{flo62}
R. W. Floyd.
Algorithm 97: shortest path.
\emph{Communications of the ACM}, 5(6):345, 1962.

\bibitem{for56a}
L. R. Ford.
Network flow theory.
RAND Corporation, Santa Monica, California, 1956.

\bibitem{for56}
L. R. Ford and D. R. Fulkerson.
Maximal flow through a network.
\emph{Canadian Journal of Mathematics}, 8(3):399--404, 1956.

\bibitem{fre77}
R. Freivalds.
Probabilistic machines can use less running time.
In \emph{IFIP congress}, 839--842, 1977.

\bibitem{gal14}
F. Le Gall.
Powers of tensors and fast matrix multiplication.
In \emph{Proceedings of the 39th International Symposium on Symbolic and Algebraic Computation},
296--303, 2014.

\bibitem{gar79}
M. R. Garey and D. S. Johnson.
\emph{Computers and Intractability:
A Guide to the Theory of NP-Completeness},
W. H. Freeman and Company, 1979.

\bibitem{goo17}
Google Code Jam Statistics (2017),
\url{https://www.go-hero.net/jam/17}

\bibitem{gro14}
A. Grønlund and S. Pettie.
Threesomes, degenerates, and love triangles.
In \emph{Proceedings of the 55th Annual Symposium on Foundations of Computer Science},
621--630, 2014.

\bibitem{gru39}
P. M. Grundy.
Mathematics and games.
\emph{Eureka}, 2(5):6--8, 1939.

\bibitem{gus97}
D. Gusfield.
\emph{Algorithms on Strings, Trees and Sequences:
Computer Science and Computational Biology},
Cambridge University Press, 1997.

% \bibitem{hal35}
% P. Hall.
% On representatives of subsets.
% \emph{Journal London Mathematical Society} 10(1):26--30, 1935.

\bibitem{hal13}
S. Halim and F. Halim.
\emph{Competitive Programming 3: The New Lower Bound of Programming Contests}, 2013.

\bibitem{hel62}
M. Held and R. M. Karp.
A dynamic programming approach to sequencing problems.
\emph{Journal of the Society for Industrial and Applied Mathematics}, 10(1):196--210, 1962.

\bibitem{hie73}
C. Hierholzer and C. Wiener.
Über die Möglichkeit, einen Linienzug ohne Wiederholung und ohne Unterbrechung zu umfahren.
\emph{Mathematische Annalen}, 6(1), 30--32, 1873.

\bibitem{hoa61a}
C. A. R. Hoare.
Algorithm 64: Quicksort.
\emph{Communications of the ACM}, 4(7):321, 1961.

\bibitem{hoa61b}
C. A. R. Hoare.
Algorithm 65: Find.
\emph{Communications of the ACM}, 4(7):321--322, 1961.

\bibitem{hop71}
J. E. Hopcroft and J. D. Ullman.
A linear list merging algorithm.
Technical report, Cornell University, 1971.

\bibitem{hor74}
E. Horowitz and S. Sahni.
Computing partitions with applications to the knapsack problem.
\emph{Journal of the ACM}, 21(2):277--292, 1974.

\bibitem{huf52}
D. A. Huffman.
A method for the construction of minimum-redundancy codes.
\emph{Proceedings of the IRE}, 40(9):1098--1101, 1952.

\bibitem{iois}
The International Olympiad in Informatics Syllabus,
\url{https://people.ksp.sk/~misof/ioi-syllabus/}

\bibitem{kar87}
R. M. Karp and M. O. Rabin.
Efficient randomized pattern-matching algorithms.
\emph{IBM Journal of Research and Development}, 31(2):249--260, 1987.

\bibitem{kas61}
P. W. Kasteleyn.
The statistics of dimers on a lattice: I. The number of dimer arrangements on a quadratic lattice.
\emph{Physica}, 27(12):1209--1225, 1961.

\bibitem{ken06}
C. Kent, G. M. Landau and M. Ziv-Ukelson.
On the complexity of sparse exon assembly.
\emph{Journal of Computational Biology}, 13(5):1013--1027, 2006.

\bibitem{kle05}
J. Kleinberg and É. Tardos.
\emph{Algorithm Design}, Pearson, 2005.

\bibitem{knu982}
D. E. Knuth.
\emph{The Art of Computer Programming. Volume 2: Seminumerical Algorithms}, Addison–Wesley, 1998 (3rd edition).

\bibitem{knu983}
D. E. Knuth.
\emph{The Art of Computer Programming. Volume 3: Sorting and Searching}, Addison–Wesley, 1998 (2nd edition).

% \bibitem{kon31}
% D. Kőnig.
% Gráfok és mátrixok.
% \emph{Matematikai és Fizikai Lapok}, 38(1):116--119, 1931.

\bibitem{kru56}
J. B. Kruskal.
On the shortest spanning subtree of a graph and the traveling salesman problem.
\emph{Proceedings of the American Mathematical Society}, 7(1):48--50, 1956.

\bibitem{lev66}
V. I. Levenshtein.
Binary codes capable of correcting deletions, insertions, and reversals.
\emph{Soviet physics doklady}, 10(8):707--710, 1966.

\bibitem{mai84}
M. G. Main and R. J. Lorentz.
An $O(n \log n)$ algorithm for finding all repetitions in a string.
\emph{Journal of Algorithms}, 5(3):422--432, 1984.

% \bibitem{ore60}
% Ø. Ore.
% Note on Hamilton circuits.
% \emph{The American Mathematical Monthly}, 67(1):55, 1960.

\bibitem{pac13}
J. Pachocki and J. Radoszewski.
Where to use and how not to use polynomial string hashing.
\emph{Olympiads in Informatics}, 7(1):90--100, 2013.

\bibitem{par97}
I. Parberry.
An efficient algorithm for the Knight's tour problem.
\emph{Discrete Applied Mathematics}, 73(3):251--260, 1997.

% \bibitem{pic99}
% G. Pick.
% Geometrisches zur Zahlenlehre.
% \emph{Sitzungsberichte des deutschen naturwissenschaftlich-medicinischen Vereines
% für Böhmen "Lotos" in Prag. (Neue Folge)}, 19:311--319, 1899.

\bibitem{pea05}
D. Pearson.
A polynomial-time algorithm for the change-making problem.
\emph{Operations Research Letters}, 33(3):231--234, 2005.

\bibitem{pri57}
R. C. Prim.
Shortest connection networks and some generalizations.
\emph{Bell System Technical Journal}, 36(6):1389--1401, 1957.

% \bibitem{pru18}
% H. Prüfer.
% Neuer Beweis eines Satzes über Permutationen.
% \emph{Arch. Math. Phys}, 27:742--744, 1918.

\bibitem{q27}
27-Queens Puzzle: Massively Parallel Enumeration and Solution Counting.
\url{https://github.com/preusser/q27}

\bibitem{sha75}
M. I. Shamos and D. Hoey.
Closest-point problems.
In \emph{Proceedings of the 16th Annual Symposium on Foundations of Computer Science}, 151--162, 1975.

\bibitem{sha81}
M. Sharir.
A strong-connectivity algorithm and its applications in data flow analysis.
\emph{Computers \& Mathematics with Applications}, 7(1):67--72, 1981.

\bibitem{ski08}
S. S. Skiena.
\emph{The Algorithm Design Manual}, Springer, 2008 (2nd edition).

\bibitem{ski03}
S. S. Skiena and M. A. Revilla.
\emph{Programming Challenges: The Programming Contest Training Manual},
Springer, 2003.

\bibitem{main}
SZKOpuł, \url{https://szkopul.edu.pl/}

\bibitem{spr35}
R. Sprague.
Über mathematische Kampfspiele.
\emph{Tohoku Mathematical Journal}, 41:438--444, 1935.

\bibitem{sta06}
P. Stańczyk.
\emph{Algorytmika praktyczna w konkursach Informatycznych},
MSc thesis, University of Warsaw, 2006.

\bibitem{str69}
V. Strassen.
Gaussian elimination is not optimal.
\emph{Numerische Mathematik}, 13(4):354--356, 1969.

\bibitem{tar75}
R. E. Tarjan.
Efficiency of a good but not linear set union algorithm.
\emph{Journal of the ACM}, 22(2):215--225, 1975.

\bibitem{tar79}
R. E. Tarjan.
Applications of path compression on balanced trees.
\emph{Journal of the ACM}, 26(4):690--715, 1979.

\bibitem{tar84}
R. E. Tarjan and U. Vishkin.
Finding biconnected components and computing tree functions in logarithmic parallel time.
In \emph{Proceedings of the 25th Annual Symposium on Foundations of Computer Science}, 12--20, 1984.

\bibitem{tem61}
H. N. V. Temperley and M. E. Fisher.
Dimer problem in statistical mechanics -- an exact result.
\emph{Philosophical Magazine}, 6(68):1061--1063, 1961.

\bibitem{usaco}
USA Computing Olympiad, \url{http://www.usaco.org/}

\bibitem{war23}
H. C. von Warnsdorf.
\emph{Des Rösselsprunges einfachste und allgemeinste Lösung}.
Schmalkalden, 1823.

\bibitem{war62}
S. Warshall.
A theorem on boolean matrices.
\emph{Journal of the ACM}, 9(1):11--12, 1962.

% \bibitem{zec72}
% E. Zeckendorf.
% Représentation des nombres naturels par une somme de nombres de Fibonacci ou de nombres de Lucas.
% \emph{Bull. Soc. Roy. Sci. Liege}, 41:179--182, 1972.

\end{thebibliography}
\chapter*{Preface}
\markboth{\MakeUppercase{Preface}}{}
\addcontentsline{toc}{chapter}{Preface}

The purpose of this book is to give you
a thorough introduction to competitive programming.
It is assumed that you already
know the basics of programming, but no previous
background in competitive programming is needed.

The book is especially intended for
students who want to learn algorithms and
possibly participate in
the International Olympiad in Informatics (IOI) or
in the International Collegiate Programming Contest (ICPC).
Of course, the book is also suitable for
anybody else interested in competitive programming.

It takes a long time to become a good competitive
programmer, but it is also an opportunity to learn a lot.
You can be sure that you will get
a good general understanding of algorithms
if you spend time reading the book,
solving problems and taking part in contests.

The book is under continuous development.
You can always send feedback on the book to
\texttt{ahslaaks@cs.helsinki.fi}.

\begin{flushright}
Helsinki, August 2019 \\
Antti Laaksonen
\end{flushright}