graph day version

This commit is contained in:
Johannes Kapfhammer 2021-02-08 23:10:42 +01:00
parent f269ae3919
commit 55a61a0050
10 changed files with 151 additions and 1356 deletions

BIN
book.pdf

Binary file not shown.

View File

@ -53,12 +53,12 @@
stringstyle=\color{strings}
}
\date{Draft \today}
\date{modified by Johannes Kapfhammer, February 2021}
\usepackage[a4paper,vmargin=30mm,hmargin=33mm,footskip=15mm]{geometry}
\title{\Huge Competitive Programmer's Handbook}
\author{\Large Antti Laaksonen}
\title{\Huge SOI Camp 2021 -- Graph Day}
\author{\Large Competitive Programmer's Handbook by Antti Laaksonen}
\makeindex
\usepackage[totoc]{idxlayout}
@ -86,39 +86,42 @@
\newcommand{\key}[1] {\textbf{#1}}
\part{Basic techniques}
\include{chapter01}
\include{chapter02}
\include{chapter03}
\include{chapter04}
\include{chapter05}
\include{chapter06}
\include{chapter07}
\include{chapter08}
\include{chapter09}
\include{chapter10}
\part{Graph algorithms}
%\part{Basic techniques}
%\include{chapter01}
%\include{chapter02}
%\include{chapter03}
%\include{chapter04}
%\include{chapter05}
%\include{chapter06}
%\include{chapter07}
%\include{chapter08}
%\include{chapter09}
%\include{chapter10}
\part{Main Topics}
\include{chapter11}
\include{chapter12}
\include{chapter13}
\include{chapter14}
\include{chapter15}
\include{chapter16}
%\include{chapter20}
\part{Advanced topics}
\include{chapter15}
\include{chapter17}
\include{chapter18}
\include{chapter19}
\include{chapter20}
\part{Advanced topics}
\include{chapter21}
\include{chapter22}
\include{chapter23}
\include{chapter24}
\include{chapter25}
\include{chapter26}
\include{chapter27}
\include{chapter28}
\include{chapter29}
\include{chapter30}
% \part{Advanced topics}
%\include{chapter21}
%\include{chapter22}
%\include{chapter23}
%\include{chapter24}
%\include{chapter25}
%\include{chapter26}
%\include{chapter27}
%\include{chapter28}
%\include{chapter29}
%\include{chapter30}
\cleardoublepage
\phantomsection

View File

@ -478,21 +478,19 @@ way to represent graphs, and most algorithms can be
efficiently implemented using them.
A convenient way to store the adjacency lists is to declare
an array of vectors as follows:
a vector of vectors as follows:
\begin{lstlisting}
vector<int> adj[N];
vector<vector<int>> g;
\end{lstlisting}
The constant $N$ is chosen so that all
adjacency lists can be stored.
For example, the graph
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (3,1) {$4$};
\node[draw, circle] (1) at (1,3) {$0$};
\node[draw, circle] (2) at (3,3) {$1$};
\node[draw, circle] (3) at (5,3) {$2$};
\node[draw, circle] (4) at (3,1) {$3$};
\path[draw,thick,->,>=latex] (1) -- (2);
\path[draw,thick,->,>=latex] (2) -- (3);
@ -503,11 +501,12 @@ For example, the graph
\end{center}
can be stored as follows:
\begin{lstlisting}
adj[1].push_back(2);
adj[2].push_back(3);
adj[2].push_back(4);
adj[3].push_back(4);
adj[4].push_back(1);
g.assign(4, {}); // g now consists of 4 empty vectors
g[0].push_back(1);
g[1].push_back(2);
g[1].push_back(3);
g[2].push_back(3);
g[3].push_back(0);
\end{lstlisting}
If the graph is undirected, it can be stored in a similar way,
@ -517,7 +516,7 @@ For a weighted graph, the structure can be extended
as follows:
\begin{lstlisting}
vector<pair<int,int>> adj[N];
vector<vector<pair<int,int>>> g;
\end{lstlisting}
In this case, the adjacency list of node $a$
@ -527,10 +526,10 @@ with weight $w$. For example, the graph
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (3,1) {$4$};
\node[draw, circle] (1) at (1,3) {$0$};
\node[draw, circle] (2) at (3,3) {$1$};
\node[draw, circle] (3) at (5,3) {$2$};
\node[draw, circle] (4) at (3,1) {$3$};
\path[draw,thick,->,>=latex] (1) -- node[font=\small,label=above:5] {} (2);
\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=above:7] {} (3);
@ -541,11 +540,12 @@ with weight $w$. For example, the graph
\end{center}
can be stored as follows:
\begin{lstlisting}
adj[1].push_back({2,5});
adj[2].push_back({3,7});
adj[2].push_back({4,6});
adj[3].push_back({4,5});
adj[4].push_back({1,2});
g.assign(4, {});
g[0].push_back({1,5});
g[1].push_back({2,7});
g[1].push_back({3,6});
g[2].push_back({3,5});
g[3].push_back({0,2});
\end{lstlisting}
The benefit of using adjacency lists is that
@ -555,7 +555,7 @@ For example, the following loop goes through all nodes
to which we can move from node $s$:
\begin{lstlisting}
for (auto u : adj[s]) {
for (auto u : g[s]) {
// process node u
}
\end{lstlisting}
@ -570,7 +570,8 @@ We can efficiently check from an adjacency matrix
if there is an edge between two nodes.
The matrix can be stored as a two-dimensional vector
\begin{lstlisting}
int adj[N][N];
vector<vector<int>> adj;
adj.assign(n, vector<int>(n, 0));
\end{lstlisting}
where each value $\texttt{adj}[a][b]$ indicates
whether the graph contains an edge from
@ -581,10 +582,10 @@ and otherwise $\texttt{adj}[a][b]=0$.
For example, the graph
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (3,1) {$4$};
\node[draw, circle] (1) at (1,3) {$0$};
\node[draw, circle] (2) at (3,3) {$1$};
\node[draw, circle] (3) at (5,3) {$2$};
\node[draw, circle] (4) at (3,1) {$3$};
\path[draw,thick,->,>=latex] (1) -- (2);
\path[draw,thick,->,>=latex] (2) -- (3);
@ -704,10 +705,10 @@ Thus, the graph
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (3,1) {$4$};
\node[draw, circle] (1) at (1,3) {$0$};
\node[draw, circle] (2) at (3,3) {$1$};
\node[draw, circle] (3) at (5,3) {$2$};
\node[draw, circle] (4) at (3,1) {$3$};
\path[draw,thick,->,>=latex] (1) -- (2);
\path[draw,thick,->,>=latex] (2) -- (3);
@ -718,11 +719,11 @@ Thus, the graph
\end{center}
can be represented as follows:
\begin{lstlisting}
edges.push_back({1,2});
edges.push_back({2,3});
edges.push_back({0,1});
edges.push_back({1,2});
edges.push_back({1,3});
edges.push_back({2,3});
edges.push_back({3,4});
edges.push_back({4,1});
edges.push_back({3,0});
\end{lstlisting}
\noindent
@ -738,10 +739,10 @@ For example, the graph
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (3,1) {$4$};
\node[draw, circle] (1) at (1,3) {$0$};
\node[draw, circle] (2) at (3,3) {$1$};
\node[draw, circle] (3) at (5,3) {$2$};
\node[draw, circle] (4) at (3,1) {$3$};
\path[draw,thick,->,>=latex] (1) -- node[font=\small,label=above:5] {} (2);
\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=above:7] {} (3);
@ -755,10 +756,10 @@ can be represented as follows\footnote{In some older compilers, the function
\texttt{make\_tuple} must be used instead of the braces (for example,
\texttt{make\_tuple(1,2,5)} instead of \texttt{\{1,2,5\}}).}:
\begin{lstlisting}
edges.push_back({1,2,5});
edges.push_back({2,3,7});
edges.push_back({2,4,6});
edges.push_back({3,4,5});
edges.push_back({4,1,2});
edges.push_back({0,1,5});
edges.push_back({1,2,7});
edges.push_back({1,3,6});
edges.push_back({2,3,5});
edges.push_back({3,0,2});
\end{lstlisting}
\end{samepage}
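To run most algorithms, an edge list is first converted into adjacency lists. A minimal sketch of that conversion for the weighted, 0-indexed representation above (the helper name edge_list_to_adj is our own, not from the book):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Convert an edge list of (a, b, w) tuples into adjacency lists,
// following the 0-indexed convention used above.
vector<vector<pair<int,int>>> edge_list_to_adj(
        int n, const vector<tuple<int,int,int>>& edges) {
    vector<vector<pair<int,int>>> g(n);
    for (auto [a, b, w] : edges)
        g[a].push_back({b, w});   // directed edge a -> b with weight w
    return g;
}
```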

View File

@ -129,11 +129,11 @@ a depth-first search at a given node.
The function assumes that the graph is
stored as adjacency lists in a vector of vectors
\begin{lstlisting}
vector<int> adj[N];
vector<vector<int>> g;
\end{lstlisting}
and also maintains a vector
\begin{lstlisting}
bool visited[N];
vector<bool> visited;
\end{lstlisting}
that keeps track of the visited nodes.
Initially, each array value is \texttt{false},
@ -145,7 +145,7 @@ void dfs(int s) {
if (visited[s]) return;
visited[s] = true;
// process node s
for (auto u: adj[s]) {
for (auto u : g[s]) {
dfs(u);
}
}
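Assembled into a self-contained snippet, the 0-indexed depth-first search reads as follows; the vector `order` is our addition, standing in for the "process node s" step so that the traversal can be checked:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Self-contained DFS in the handbook's global style.
// `order` records the order in which nodes are processed (our addition).
vector<vector<int>> g;
vector<bool> visited;
vector<int> order;

void dfs(int s) {
    if (visited[s]) return;
    visited[s] = true;
    order.push_back(s);        // process node s
    for (auto u : g[s]) dfs(u);
}
```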
@ -312,8 +312,8 @@ as adjacency lists and maintains the following
data structures:
\begin{lstlisting}
queue<int> q;
bool visited[N];
int distance[N];
vector<bool> visited(n);
vector<int> distance(n);
\end{lstlisting}
The queue \texttt{q}
@ -336,7 +336,7 @@ q.push(x);
while (!q.empty()) {
int s = q.front(); q.pop();
// process node s
for (auto u : adj[s]) {
for (auto u : g[s]) {
if (visited[u]) continue;
visited[u] = true;
distance[u] = distance[s]+1;
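The hunk above ends mid-loop; for completeness, here is a self-contained BFS in the same 0-indexed style, packaged as a function (the packaging is ours, not the book's exact code):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Complete BFS in the 0-indexed style used above (our packaging).
// Returns the edge-count distance from x to every node; nodes that are
// never reached keep distance 0, as in the text.
vector<int> bfs(const vector<vector<int>>& g, int x) {
    int n = g.size();
    vector<bool> visited(n, false);
    vector<int> distance(n, 0);
    queue<int> q;
    visited[x] = true;
    q.push(x);
    while (!q.empty()) {
        int s = q.front(); q.pop();
        // process node s
        for (auto u : g[s]) {
            if (visited[u]) continue;
            visited[u] = true;
            distance[u] = distance[s] + 1;
            q.push(u);
        }
    }
    return distance;
}
```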

View File

@ -20,313 +20,6 @@ where more sophisticated algorithms
are needed
for finding shortest paths.
\section{Bellman--Ford algorithm}
\index{Bellman--Ford algorithm}
The \key{Bellman--Ford algorithm}\footnote{The algorithm is named after
R. E. Bellman and L. R. Ford who published it independently
in 1958 and 1956, respectively \cite{bel58,for56a}.} finds
shortest paths from a starting node to all
nodes of the graph.
The algorithm can process all kinds of graphs,
provided that the graph does not contain a
cycle with negative length.
If the graph contains a negative cycle,
the algorithm can detect this.
The algorithm keeps track of distances
from the starting node to all nodes of the graph.
Initially, the distance to the starting node is 0
and the distance to all other nodes is infinite.
The algorithm reduces the distances by finding
edges that shorten the paths until it is not
possible to reduce any distance.
\subsubsection{Example}
Let us consider how the Bellman--Ford algorithm
works in the following graph:
\begin{center}
\begin{tikzpicture}
\node[draw, circle] (1) at (1,3) {1};
\node[draw, circle] (2) at (4,3) {2};
\node[draw, circle] (3) at (1,1) {3};
\node[draw, circle] (4) at (4,1) {4};
\node[draw, circle] (5) at (6,2) {6};
\node[color=red] at (1,3+0.55) {$0$};
\node[color=red] at (4,3+0.55) {$\infty$};
\node[color=red] at (1,1-0.55) {$\infty$};
\node[color=red] at (4,1-0.55) {$\infty$};
\node[color=red] at (6,2-0.55) {$\infty$};
\path[draw,thick,-] (1) -- node[font=\small,label=above:5] {} (2);
\path[draw,thick,-] (1) -- node[font=\small,label=left:3] {} (3);
\path[draw,thick,-] (3) -- node[font=\small,label=below:1] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=left:3] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
\path[draw,thick,-] (4) -- node[font=\small,label=below:2] {} (5);
\path[draw,thick,-] (1) -- node[font=\small,label=above:7] {} (4);
\end{tikzpicture}
\end{center}
Each node of the graph is assigned a distance.
Initially, the distance to the starting node is 0,
and the distance to all other nodes is infinite.
The algorithm searches for edges that reduce distances.
First, all edges from node 1 reduce distances:
\begin{center}
\begin{tikzpicture}
\node[draw, circle] (1) at (1,3) {1};
\node[draw, circle] (2) at (4,3) {2};
\node[draw, circle] (3) at (1,1) {3};
\node[draw, circle] (4) at (4,1) {4};
\node[draw, circle] (5) at (6,2) {5};
\node[color=red] at (1,3+0.55) {$0$};
\node[color=red] at (4,3+0.55) {$5$};
\node[color=red] at (1,1-0.55) {$3$};
\node[color=red] at (4,1-0.55) {$7$};
\node[color=red] at (6,2-0.55) {$\infty$};
\path[draw,thick,-] (1) -- node[font=\small,label=above:5] {} (2);
\path[draw,thick,-] (1) -- node[font=\small,label=left:3] {} (3);
\path[draw,thick,-] (3) -- node[font=\small,label=below:1] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=left:3] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
\path[draw,thick,-] (4) -- node[font=\small,label=below:2] {} (5);
\path[draw,thick,-] (1) -- node[font=\small,label=above:7] {} (4);
\path[draw=red,thick,->,line width=2pt] (1) -- (2);
\path[draw=red,thick,->,line width=2pt] (1) -- (3);
\path[draw=red,thick,->,line width=2pt] (1) -- (4);
\end{tikzpicture}
\end{center}
After this, edges
$2 \rightarrow 5$ and $3 \rightarrow 4$
reduce distances:
\begin{center}
\begin{tikzpicture}
\node[draw, circle] (1) at (1,3) {1};
\node[draw, circle] (2) at (4,3) {2};
\node[draw, circle] (3) at (1,1) {3};
\node[draw, circle] (4) at (4,1) {4};
\node[draw, circle] (5) at (6,2) {5};
\node[color=red] at (1,3+0.55) {$0$};
\node[color=red] at (4,3+0.55) {$5$};
\node[color=red] at (1,1-0.55) {$3$};
\node[color=red] at (4,1-0.55) {$4$};
\node[color=red] at (6,2-0.55) {$7$};
\path[draw,thick,-] (1) -- node[font=\small,label=above:5] {} (2);
\path[draw,thick,-] (1) -- node[font=\small,label=left:3] {} (3);
\path[draw,thick,-] (3) -- node[font=\small,label=below:1] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=left:3] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
\path[draw,thick,-] (4) -- node[font=\small,label=below:2] {} (5);
\path[draw,thick,-] (1) -- node[font=\small,label=above:7] {} (4);
\path[draw=red,thick,->,line width=2pt] (2) -- (5);
\path[draw=red,thick,->,line width=2pt] (3) -- (4);
\end{tikzpicture}
\end{center}
Finally, there is one more change:
\begin{center}
\begin{tikzpicture}
\node[draw, circle] (1) at (1,3) {1};
\node[draw, circle] (2) at (4,3) {2};
\node[draw, circle] (3) at (1,1) {3};
\node[draw, circle] (4) at (4,1) {4};
\node[draw, circle] (5) at (6,2) {5};
\node[color=red] at (1,3+0.55) {$0$};
\node[color=red] at (4,3+0.55) {$5$};
\node[color=red] at (1,1-0.55) {$3$};
\node[color=red] at (4,1-0.55) {$4$};
\node[color=red] at (6,2-0.55) {$6$};
\path[draw,thick,-] (1) -- node[font=\small,label=above:5] {} (2);
\path[draw,thick,-] (1) -- node[font=\small,label=left:3] {} (3);
\path[draw,thick,-] (3) -- node[font=\small,label=below:1] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=left:3] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
\path[draw,thick,-] (4) -- node[font=\small,label=below:2] {} (5);
\path[draw,thick,-] (1) -- node[font=\small,label=above:7] {} (4);
\path[draw=red,thick,->,line width=2pt] (4) -- (5);
\end{tikzpicture}
\end{center}
After this, no edge can reduce any distance.
This means that the distances are final,
and we have successfully
calculated the shortest distances
from the starting node to all nodes of the graph.
For example, the shortest distance 6
from node 1 to node 5 corresponds to
the following path:
\begin{center}
\begin{tikzpicture}
\node[draw, circle] (1) at (1,3) {1};
\node[draw, circle] (2) at (4,3) {2};
\node[draw, circle] (3) at (1,1) {3};
\node[draw, circle] (4) at (4,1) {4};
\node[draw, circle] (5) at (6,2) {5};
\node[color=red] at (1,3+0.55) {$0$};
\node[color=red] at (4,3+0.55) {$5$};
\node[color=red] at (1,1-0.55) {$3$};
\node[color=red] at (4,1-0.55) {$4$};
\node[color=red] at (6,2-0.55) {$6$};
\path[draw,thick,-] (1) -- node[font=\small,label=above:5] {} (2);
\path[draw,thick,-] (1) -- node[font=\small,label=left:3] {} (3);
\path[draw,thick,-] (3) -- node[font=\small,label=below:1] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=left:3] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
\path[draw,thick,-] (4) -- node[font=\small,label=below:2] {} (5);
\path[draw,thick,-] (1) -- node[font=\small,label=above:7] {} (4);
\path[draw=red,thick,->,line width=2pt] (1) -- (3);
\path[draw=red,thick,->,line width=2pt] (3) -- (4);
\path[draw=red,thick,->,line width=2pt] (4) -- (5);
\end{tikzpicture}
\end{center}
\subsubsection{Implementation}
The following implementation of the
Bellman--Ford algorithm determines the shortest distances
from a node $x$ to all nodes of the graph.
The code assumes that the graph is stored
as an edge list \texttt{edges}
that consists of tuples of the form $(a,b,w)$,
meaning that there is an edge from node $a$ to node $b$
with weight $w$.
The algorithm consists of $n-1$ rounds,
and on each round the algorithm goes through
all edges of the graph and tries to
reduce the distances.
The algorithm constructs an array \texttt{distance}
that will contain the distances from $x$
to all nodes of the graph.
The constant \texttt{INF} denotes an infinite distance.
\begin{lstlisting}
for (int i = 1; i <= n; i++) distance[i] = INF;
distance[x] = 0;
for (int i = 1; i <= n-1; i++) {
  for (auto e : edges) {
    int a, b, w;
    tie(a, b, w) = e;
    distance[b] = min(distance[b], distance[a]+w);
  }
}
\end{lstlisting}
The time complexity of the algorithm is $O(nm)$,
because the algorithm consists of $n-1$ rounds and
iterates through all $m$ edges during a round.
If there are no negative cycles in the graph,
all distances are final after $n-1$ rounds,
because each shortest path can contain at most $n-1$ edges.
In practice, the final distances can usually
be found faster than in $n-1$ rounds.
Thus, a possible way to make the algorithm more efficient
is to stop the algorithm if no distance
can be reduced during a round.
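That early-stopping idea can be sketched as follows (0-indexed; the guard that skips still-infinite distances is our addition, preventing overflow when a node has not been reached yet; the function name is ours):

```cpp
#include <bits/stdc++.h>
using namespace std;

const long long INF = 1e18;

// Bellman-Ford with the early-stopping optimization described above:
// quit as soon as a full round changes no distance.
vector<long long> bellman_ford(int n,
        const vector<tuple<int,int,int>>& edges, int x) {
    vector<long long> distance(n, INF);
    distance[x] = 0;
    for (int i = 0; i < n - 1; i++) {
        bool changed = false;
        for (auto [a, b, w] : edges) {
            if (distance[a] == INF) continue;  // not reached yet
            if (distance[a] + w < distance[b]) {
                distance[b] = distance[a] + w;
                changed = true;
            }
        }
        if (!changed) break;   // no edge can reduce any distance
    }
    return distance;
}
```

Run on the (0-indexed) example graph of this section, the routine reproduces the final distances shown in the figures.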
\subsubsection{Negative cycles}
\index{negative cycle}
The Bellman--Ford algorithm can also be used to
check if the graph contains a cycle with negative length.
For example, the graph
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,0) {$1$};
\node[draw, circle] (2) at (2,1) {$2$};
\node[draw, circle] (3) at (2,-1) {$3$};
\node[draw, circle] (4) at (4,0) {$4$};
\path[draw,thick,-] (1) -- node[font=\small,label=above:$3$] {} (2);
\path[draw,thick,-] (2) -- node[font=\small,label=above:$1$] {} (4);
\path[draw,thick,-] (1) -- node[font=\small,label=below:$5$] {} (3);
\path[draw,thick,-] (3) -- node[font=\small,label=below:$-7$] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=right:$2$] {} (3);
\end{tikzpicture}
\end{center}
\noindent
contains a negative cycle
$2 \rightarrow 3 \rightarrow 4 \rightarrow 2$
with length $-4$.
If the graph contains a negative cycle,
we can make any path that contains the cycle
arbitrarily short by repeating the cycle
again and again.
Thus, the concept of a shortest path
is not meaningful in this situation.
A negative cycle can be detected
using the Bellman--Ford algorithm by
running the algorithm for $n$ rounds.
If the last round reduces any distance,
the graph contains a negative cycle.
Note that this algorithm can be used to
search for
a negative cycle in the whole graph
regardless of the starting node.
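One common implementation of this check initializes every distance to 0, which behaves like adding a virtual source connected to all nodes; this is a standard variant, not necessarily the book's exact code:

```cpp
#include <bits/stdc++.h>
using namespace std;

// Negative-cycle detection with Bellman-Ford: start every distance at 0
// (equivalent to a virtual source with 0-weight edges to all nodes) and
// relax for n rounds. If the n-th round still improves some distance,
// the graph contains a negative cycle.
bool has_negative_cycle(int n, const vector<tuple<int,int,int>>& edges) {
    vector<long long> distance(n, 0);
    bool changed = false;
    for (int i = 0; i < n; i++) {
        changed = false;
        for (auto [a, b, w] : edges) {
            if (distance[a] + w < distance[b]) {
                distance[b] = distance[a] + w;
                changed = true;
            }
        }
    }
    return changed;   // still changing on round n => negative cycle
}
```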
\subsubsection{SPFA algorithm}
\index{SPFA algorithm}
The \key{SPFA algorithm} (``Shortest Path Faster Algorithm'') \cite{fan94}
is a variant of the Bellman--Ford algorithm
that is often more efficient than the original algorithm.
The SPFA algorithm does not go through all the edges on each round,
but instead, it chooses the edges to be examined
in a more intelligent way.
The algorithm maintains a queue of nodes that might
be used for reducing the distances.
First, the algorithm adds the starting node $x$
to the queue.
Then, the algorithm always processes the
first node in the queue, and when an edge
$a \rightarrow b$ reduces a distance,
node $b$ is added to the queue.
%
% The following implementation uses a
% \texttt{queue} \texttt{q}.
% In addition, an array \texttt{inqueue} indicates
% if a node is already in the queue,
% in which case the algorithm does not add
% the node to the queue again.
%
% \begin{lstlisting}
% for (int i = 1; i <= n; i++) distance[i] = INF;
% distance[x] = 0;
% q.push(x);
% while (!q.empty()) {
% int a = q.front(); q.pop();
% inqueue[a] = false;
% for (auto b : v[a]) {
% if (distance[a]+b.second < distance[b.first]) {
% distance[b.first] = distance[a]+b.second;
% if (!inqueue[b]) {q.push(b); inqueue[b] = true;}
% }
% }
% }
% \end{lstlisting}
The efficiency of the SPFA algorithm depends
on the structure of the graph:
the algorithm is often efficient,
but its worst case time complexity is still
$O(nm)$ and it is possible to create inputs
that make the algorithm as slow as the
original Bellman--Ford algorithm.
\section{Dijkstra's algorithm}
\index{Dijkstra's algorithm}
@ -334,15 +27,12 @@ original BellmanFord algorithm.
\key{Dijkstra's algorithm}\footnote{E. W. Dijkstra published the algorithm in 1959 \cite{dij59};
however, his original paper does not mention how to implement the algorithm efficiently.}
finds shortest
paths from the starting node to all nodes of the graph,
like the Bellman--Ford algorithm.
The benefit of Dijkstra's algorithm is that
it is more efficient and can be used for
paths from the starting node to all nodes of the graph.
Dijkstra's algorithm is very efficient and can be used for
processing large graphs.
However, the algorithm requires that there
are no negative weight edges in the graph.
Like the Bellman--Ford algorithm,
Dijkstra's algorithm maintains distances
to the nodes and reduces them during the search.
Dijkstra's algorithm is efficient, because
@ -543,9 +233,9 @@ The following implementation of Dijkstra's algorithm
calculates the minimum distances from a node $x$
to other nodes of the graph.
The graph is stored as adjacency lists
so that \texttt{adj[$a$]} contains a pair $(b,w)$
always when there is an edge from node $a$ to node $b$
with weight $w$.
so that \texttt{g[$v$]} contains a pair $(w,\text{cost})$
whenever there is an edge from node $v$ to node $w$
with weight $\text{cost}$.
An efficient implementation of Dijkstra's algorithm
requires that it is possible to efficiently find the
@ -556,40 +246,42 @@ Using a priority queue, the next node to be processed
can be retrieved in logarithmic time.
In the following code, the priority queue
\texttt{q} contains pairs of the form $(-d,x)$,
\texttt{pq} contains pairs of the form $(d,x)$,
meaning that the current distance to node $x$ is $d$.
The array $\texttt{distance}$ contains the distance to
each node, and the array $\texttt{processed}$ indicates
whether a node has been processed.
Initially the distance is $0$ to $x$ and $\infty$ to all other nodes.
each node. Initially, the distance is $-1$ for every node; we use $-1$ as
an invalid value to denote that the node has not been reached yet.
A node's distance becomes final the first time it is popped from the queue.
\begin{lstlisting}
for (int i = 1; i <= n; i++) distance[i] = INF;
distance[x] = 0;
q.push({0,x});
while (!q.empty()) {
int a = q.top().second; q.pop();
if (processed[a]) continue;
processed[a] = true;
for (auto u : adj[a]) {
int b = u.first, w = u.second;
if (distance[a]+w < distance[b]) {
distance[b] = distance[a]+w;
q.push({-distance[b],b});
}
}
vector<int> distance(n, -1);
priority_queue<pair<int, int>,
               vector<pair<int, int>>,
               greater<pair<int, int>>> pq;
pq.emplace(0, start);
while (!pq.empty()) {
  auto [d, v] = pq.top();
  pq.pop();
  if (distance[v] != -1) continue;
  distance[v] = d;
  for (auto [w, cost] : g[v])
    pq.emplace(d + cost, w);
}
\end{lstlisting}
Note that the priority queue contains \emph{negative}
distances to nodes.
The reason for this is that the
default version of the C++ priority queue finds maximum
elements, while we want to find minimum elements.
By using negative distances,
we can directly use the default priority queue\footnote{Of
course, we could also declare the priority queue as in Chapter 4.5
and use positive distances, but the implementation would be a bit longer.}.
Note that the type of the priority queue is not
\verb|priority_queue<pair<int, int>>| but instead
\verb|priority_queue<pair<int,int>,vector<pair<int,int>>,greater<pair<int,int>>>|.
This is because in C++, a priority queue by default puts the
\emph{largest} element on top, so we reverse the ordering by changing
the comparison operator from \verb|less| (the default) to
\verb|greater| (which does the opposite).
In case you forget, you can look up the syntax for the priority queue
in the C++ cheatsheet linked on the camp page.
Also note that there may be several instances of the same
node in the priority queue; however, only the instance with the
minimum distance will be processed.
@ -598,205 +290,3 @@ The time complexity of the above implementation is
$O(n+m \log m)$, because the algorithm goes through
all nodes of the graph and adds for each edge
at most one distance to the priority queue.
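Dijkstra's algorithm with lazy deletion can also be packaged as a self-contained function and sanity-checked on the weighted example graph from the graph-representation chapter (the packaging and names are ours):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Dijkstra with lazy deletion, packaged as a function (our packaging).
// g[v] holds pairs {w, cost}; -1 marks nodes that were never reached.
vector<long long> dijkstra(const vector<vector<pair<int,int>>>& g, int start) {
    int n = g.size();
    vector<long long> distance(n, -1);
    priority_queue<pair<long long,int>,
                   vector<pair<long long,int>>,
                   greater<pair<long long,int>>> pq;
    pq.emplace(0, start);
    while (!pq.empty()) {
        auto [d, v] = pq.top(); pq.pop();
        if (distance[v] != -1) continue;   // a smaller d already fixed v
        distance[v] = d;
        for (auto [w, cost] : g[v])
            pq.emplace(d + cost, w);
    }
    return distance;
}
```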
\section{Floyd--Warshall algorithm}
\index{Floyd--Warshall algorithm}
The \key{Floyd--Warshall algorithm}\footnote{The algorithm
is named after R. W. Floyd and S. Warshall
who published it independently in 1962 \cite{flo62,war62}.}
provides an alternative way to approach the problem
of finding shortest paths.
Unlike the other algorithms of this chapter,
it finds all shortest paths between the nodes
in a single run.
The algorithm maintains a two-dimensional array
that contains distances between the nodes.
First, distances are calculated only using
direct edges between the nodes,
and after this, the algorithm reduces distances
by using intermediate nodes in paths.
\subsubsection{Example}
Let us consider how the Floyd--Warshall algorithm
works in the following graph:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$3$};
\node[draw, circle] (2) at (4,3) {$4$};
\node[draw, circle] (3) at (1,1) {$2$};
\node[draw, circle] (4) at (4,1) {$1$};
\node[draw, circle] (5) at (6,2) {$5$};
\path[draw,thick,-] (1) -- node[font=\small,label=above:7] {} (2);
\path[draw,thick,-] (1) -- node[font=\small,label=left:2] {} (3);
\path[draw,thick,-] (3) -- node[font=\small,label=below:5] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=left:9] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
\path[draw,thick,-] (4) -- node[font=\small,label=below:1] {} (5);
\end{tikzpicture}
\end{center}
Initially, the distance from each node to itself is $0$,
and the distance between nodes $a$ and $b$ is $x$
if there is an edge between nodes $a$ and $b$ with weight $x$.
All other distances are infinite.
In this graph, the initial array is as follows:
\begin{center}
\begin{tabular}{r|rrrrr}
& 1 & 2 & 3 & 4 & 5 \\
\hline
1 & 0 & 5 & $\infty$ & 9 & 1 \\
2 & 5 & 0 & 2 & $\infty$ & $\infty$ \\
3 & $\infty$ & 2 & 0 & 7 & $\infty$ \\
4 & 9 & $\infty$ & 7 & 0 & 2 \\
5 & 1 & $\infty$ & $\infty$ & 2 & 0 \\
\end{tabular}
\end{center}
\vspace{10pt}
The algorithm consists of consecutive rounds.
On each round, the algorithm selects a new node
that can act as an intermediate node in paths from now on,
and distances are reduced using this node.
On the first round, node 1 is the new intermediate node.
There is a new path between nodes 2 and 4
with length 14, because node 1 connects them.
There is also a new path
between nodes 2 and 5 with length 6.
\begin{center}
\begin{tabular}{r|rrrrr}
& 1 & 2 & 3 & 4 & 5 \\
\hline
1 & 0 & 5 & $\infty$ & 9 & 1 \\
2 & 5 & 0 & 2 & \textbf{14} & \textbf{6} \\
3 & $\infty$ & 2 & 0 & 7 & $\infty$ \\
4 & 9 & \textbf{14} & 7 & 0 & 2 \\
5 & 1 & \textbf{6} & $\infty$ & 2 & 0 \\
\end{tabular}
\end{center}
\vspace{10pt}
On the second round, node 2 is the new intermediate node.
This creates new paths between nodes 1 and 3
and between nodes 3 and 5:
\begin{center}
\begin{tabular}{r|rrrrr}
& 1 & 2 & 3 & 4 & 5 \\
\hline
1 & 0 & 5 & \textbf{7} & 9 & 1 \\
2 & 5 & 0 & 2 & 14 & 6 \\
3 & \textbf{7} & 2 & 0 & 7 & \textbf{8} \\
4 & 9 & 14 & 7 & 0 & 2 \\
5 & 1 & 6 & \textbf{8} & 2 & 0 \\
\end{tabular}
\end{center}
\vspace{10pt}
On the third round, node 3 is the new intermediate node.
There is a new path between nodes 2 and 4:
\begin{center}
\begin{tabular}{r|rrrrr}
& 1 & 2 & 3 & 4 & 5 \\
\hline
1 & 0 & 5 & 7 & 9 & 1 \\
2 & 5 & 0 & 2 & \textbf{9} & 6 \\
3 & 7 & 2 & 0 & 7 & 8 \\
4 & 9 & \textbf{9} & 7 & 0 & 2 \\
5 & 1 & 6 & 8 & 2 & 0 \\
\end{tabular}
\end{center}
\vspace{10pt}
The algorithm continues like this,
until all nodes have been appointed intermediate nodes.
After the algorithm has finished, the array contains
the minimum distances between any two nodes:
\begin{center}
\begin{tabular}{r|rrrrr}
& 1 & 2 & 3 & 4 & 5 \\
\hline
1 & 0 & 5 & 7 & 3 & 1 \\
2 & 5 & 0 & 2 & 8 & 6 \\
3 & 7 & 2 & 0 & 7 & 8 \\
4 & 3 & 8 & 7 & 0 & 2 \\
5 & 1 & 6 & 8 & 2 & 0 \\
\end{tabular}
\end{center}
For example, the array tells us that the
shortest distance between nodes 2 and 4 is 8.
This corresponds to the following path:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$3$};
\node[draw, circle] (2) at (4,3) {$4$};
\node[draw, circle] (3) at (1,1) {$2$};
\node[draw, circle] (4) at (4,1) {$1$};
\node[draw, circle] (5) at (6,2) {$5$};
\path[draw,thick,-] (1) -- node[font=\small,label=above:7] {} (2);
\path[draw,thick,-] (1) -- node[font=\small,label=left:2] {} (3);
\path[draw,thick,-] (3) -- node[font=\small,label=below:5] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=left:9] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (5);
\path[draw,thick,-] (4) -- node[font=\small,label=below:1] {} (5);
\path[draw=red,thick,->,line width=2pt] (3) -- (4);
\path[draw=red,thick,->,line width=2pt] (4) -- (5);
\path[draw=red,thick,->,line width=2pt] (5) -- (2);
\end{tikzpicture}
\end{center}
\subsubsection{Implementation}
The advantage of the
Floyd--Warshall algorithm is that it is
easy to implement.
The following code constructs a
distance matrix where $\texttt{distance}[a][b]$
is the shortest distance between nodes $a$ and $b$.
First, the algorithm initializes \texttt{distance}
using the adjacency matrix \texttt{adj} of the graph:
\begin{lstlisting}
for (int i = 1; i <= n; i++) {
  for (int j = 1; j <= n; j++) {
    if (i == j) distance[i][j] = 0;
    else if (adj[i][j]) distance[i][j] = adj[i][j];
    else distance[i][j] = INF;
  }
}
\end{lstlisting}
After this, the shortest distances can be found as follows:
\begin{lstlisting}
for (int k = 1; k <= n; k++) {
  for (int i = 1; i <= n; i++) {
    for (int j = 1; j <= n; j++) {
      distance[i][j] = min(distance[i][j],
                           distance[i][k]+distance[k][j]);
    }
  }
}
\end{lstlisting}
The time complexity of the algorithm is $O(n^3)$,
because it contains three nested loops
that go through the nodes of the graph.
Since the implementation of the Floyd--Warshall
algorithm is simple, the algorithm can be
a good choice even if it is only needed to find a
single shortest path in the graph.
However, the algorithm can only be used when the graph
is so small that a cubic time complexity is fast enough.
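Putting the initialization and the triple loop together gives a self-contained 0-indexed sketch (our packaging; INF is chosen so that INF + INF still fits in a long long):

```cpp
#include <bits/stdc++.h>
using namespace std;

const long long INF = 1e18;  // INF + INF still fits in long long

// Floyd-Warshall, combining initialization and the main loops into one
// 0-indexed function (our packaging). adj[i][j] = 0 means "no edge",
// as in the adjacency-matrix convention of the text.
vector<vector<long long>> floyd_warshall(const vector<vector<int>>& adj) {
    int n = adj.size();
    vector<vector<long long>> distance(n, vector<long long>(n, INF));
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            if (i == j) distance[i][j] = 0;
            else if (adj[i][j]) distance[i][j] = adj[i][j];
        }
    for (int k = 0; k < n; k++)
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                distance[i][j] = min(distance[i][j],
                                     distance[i][k] + distance[k][j]);
    return distance;
}
```

On the 0-indexed version of this section's example graph, the result matches the final table (for instance, the distance between nodes 2 and 4 of the text, i.e. indices 1 and 3, is 8).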

View File

@ -113,17 +113,16 @@ a depth-first search at an arbitrary node.
The following recursive function can be used:
\begin{lstlisting}
void dfs(int s, int e) {
// process node s
for (auto u : adj[s]) {
if (u != e) dfs(u, s);
}
void dfs(int v, int p) {
  for (auto w : g[v])
    if (w != p)
      dfs(w, v);
}
\end{lstlisting}
The function is given two parameters: the current node $s$
and the previous node $e$.
The purpose of the parameter $e$ is to make sure
The function is given two parameters: the current node $v$
and the previous node $p$.
The purpose of the parameter $p$ is to make sure
that the search only moves to nodes
that have not been visited yet.
@ -131,37 +130,36 @@ The following function call starts the search
at node $x$:
\begin{lstlisting}
dfs(x, 0);
dfs(x, -1);
\end{lstlisting}
In the first call $e=0$, because there is no
In the first call $p=-1$, because there is no
previous node, and it is allowed
to proceed to any direction in the tree.
\subsubsection{Dynamic programming}
Dynamic programming can be used to calculate
some information during a tree traversal.
Using dynamic programming, we can, for example,
\subsubsection{Storing Information}
We can calculate
some information during a tree traversal and store it for later use.
We can, for example,
calculate in $O(n)$ time for each node of a rooted tree the
number of nodes in its subtree
or the length of the longest path from the node
to a leaf.
As an example, let us calculate for each node $s$
a value $\texttt{count}[s]$: the number of nodes in its subtree.
As an example, let us calculate for each node $v$
a value $\texttt{subtreesize}[v]$: the number of nodes in its subtree.
The subtree contains the node itself and
all nodes in the subtrees of its children,
so we can calculate the number of nodes
recursively using the following code:
\begin{lstlisting}
void dfs(int s, int e) {
count[s] = 1;
for (auto u : adj[s]) {
if (u == e) continue;
dfs(u, s);
count[s] += count[u];
void dfs(int v, int p) {
  subtreesize[v] = 1;
  for (auto w : g[v]) {
    if (w == p) continue;
    dfs(w, v);
    subtreesize[v] += subtreesize[w];
}
}
\end{lstlisting}
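A self-contained sketch of the subtree-size computation described above (globals in the handbook's style; call dfs(root, -1)):

```cpp
#include <bits/stdc++.h>
using namespace std;

// Subtree sizes via one DFS over a rooted tree, as described above.
// g stores the tree as undirected adjacency lists.
vector<vector<int>> g;
vector<int> subtreesize;

void dfs(int v, int p) {
    subtreesize[v] = 1;
    for (auto w : g[v]) {
        if (w == p) continue;
        dfs(w, v);
        subtreesize[v] += subtreesize[w];
    }
}
```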
@ -220,7 +218,7 @@ to obtain another path with length 4.
Next we will discuss two $O(n)$ time algorithms
for calculating the diameter of a tree.
The first algorithm is based on the previous idea of storing information,
and the second algorithm uses two depth-first searches.
\subsubsection{Algorithm 1}
because there is a path
$6 \rightarrow 2 \rightarrow 1 \rightarrow 4 \rightarrow 7$.
In this case, $\texttt{maxLength}(1)$ equals the diameter.
We can calculate the above
values for all nodes in $O(n)$ time.
First, to calculate $\texttt{toLeaf}(x)$,
we go through the children of $x$,
goes through its child 2:
\path[draw,thick,->,color=red,line width=2pt] (3) -- (6);
\end{tikzpicture}
\end{center}
This part is easy to solve in $O(n)$ time, because we can use a
similar technique to what we have done previously.
Then, the second part of the problem is to calculate
for every node $x$ the maximum length of a path

\chapter{Directed graphs}
In this chapter, we focus on two classes of directed graphs:
\begin{itemize}
\item \key{Acyclic graphs}:
There are no cycles in the graph,
so there is no path from any node to itself\footnote{Directed acyclic
graphs are sometimes called DAGs.}.
\item \key{Successor graphs}:
The outdegree of each node is 1,
so each node has a unique successor.
\end{itemize}
It turns out that in both cases,
we can design efficient algorithms that are based
on the special properties of the graphs.
\section{Topological sorting}
\index{topological sorting}
\index{cycle}
The search reaches node 2 whose state is 1,
which means that the graph contains a cycle.
In this example, there is a cycle
$2 \rightarrow 3 \rightarrow 5 \rightarrow 2$.
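The state-based search described above can be sketched as follows. This is a
minimal version in which all names (\texttt{g}, \texttt{state}, \texttt{order})
are illustrative assumptions; the topological order is obtained by reversing
\texttt{order} at the end.
\begin{lstlisting}
// DFS-based topological sort with three node states:
// 0 = unvisited, 1 = in progress, 2 = processed.
// A cycle exists exactly when the search reaches a node in state 1.
vector<vector<int>> g;   // adjacency lists
vector<int> state;
vector<int> order;       // nodes in reverse topological order
bool cycleFound = false;

void dfs(int v) {
    state[v] = 1;
    for (int w : g[v]) {
        if (state[w] == 1) cycleFound = true;  // back edge: cycle
        else if (state[w] == 0) dfs(w);
    }
    state[v] = 2;
    order.push_back(v);  // v is finished after all its successors
}
\end{lstlisting}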
\section{Dynamic programming}
If a directed graph is acyclic,
dynamic programming can be applied to it.
For example, we can efficiently solve the following
problems concerning paths from a starting node
to an ending node:
\begin{itemize}
\item how many different paths are there?
\item what is the shortest/longest path?
\item what is the minimum/maximum number of edges in a path?
\item which nodes certainly appear in any path?
\end{itemize}
\subsubsection{Counting the number of paths}
As an example, let us calculate the number of paths
from node 1 to node 6 in the following graph:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,5) {$1$};
\node[draw, circle] (2) at (3,5) {$2$};
\node[draw, circle] (3) at (5,5) {$3$};
\node[draw, circle] (4) at (1,3) {$4$};
\node[draw, circle] (5) at (3,3) {$5$};
\node[draw, circle] (6) at (5,3) {$6$};
\path[draw,thick,->,>=latex] (1) -- (2);
\path[draw,thick,->,>=latex] (2) -- (3);
\path[draw,thick,->,>=latex] (1) -- (4);
\path[draw,thick,->,>=latex] (4) -- (5);
\path[draw,thick,->,>=latex] (5) -- (2);
\path[draw,thick,->,>=latex] (5) -- (3);
\path[draw,thick,->,>=latex] (3) -- (6);
\end{tikzpicture}
\end{center}
There are a total of three such paths:
\begin{itemize}
\item $1 \rightarrow 2 \rightarrow 3 \rightarrow 6$
\item $1 \rightarrow 4 \rightarrow 5 \rightarrow 2 \rightarrow 3 \rightarrow 6$
\item $1 \rightarrow 4 \rightarrow 5 \rightarrow 3 \rightarrow 6$
\end{itemize}
Let $\texttt{paths}(x)$ denote the number of paths from
node 1 to node $x$.
As a base case, $\texttt{paths}(1)=1$.
Then, to calculate other values of $\texttt{paths}(x)$,
we may use the recursion
\[\texttt{paths}(x) = \texttt{paths}(a_1)+\texttt{paths}(a_2)+\cdots+\texttt{paths}(a_k)\]
where $a_1,a_2,\ldots,a_k$ are the nodes from which there
is an edge to $x$.
Since the graph is acyclic, the values of $\texttt{paths}(x)$
can be calculated in the order of a topological sort.
A topological sort for the above graph is as follows:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,0) {$1$};
\node[draw, circle] (2) at (4.5,0) {$2$};
\node[draw, circle] (3) at (6,0) {$3$};
\node[draw, circle] (4) at (1.5,0) {$4$};
\node[draw, circle] (5) at (3,0) {$5$};
\node[draw, circle] (6) at (7.5,0) {$6$};
\path[draw,thick,->,>=latex] (1) edge [bend left=30] (2);
\path[draw,thick,->,>=latex] (2) -- (3);
\path[draw,thick,->,>=latex] (1) -- (4);
\path[draw,thick,->,>=latex] (4) -- (5);
\path[draw,thick,->,>=latex] (5) -- (2);
\path[draw,thick,->,>=latex] (5) edge [bend right=30] (3);
\path[draw,thick,->,>=latex] (3) -- (6);
\end{tikzpicture}
\end{center}
Hence, the numbers of paths are as follows:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,5) {$1$};
\node[draw, circle] (2) at (3,5) {$2$};
\node[draw, circle] (3) at (5,5) {$3$};
\node[draw, circle] (4) at (1,3) {$4$};
\node[draw, circle] (5) at (3,3) {$5$};
\node[draw, circle] (6) at (5,3) {$6$};
\path[draw,thick,->,>=latex] (1) -- (2);
\path[draw,thick,->,>=latex] (2) -- (3);
\path[draw,thick,->,>=latex] (1) -- (4);
\path[draw,thick,->,>=latex] (4) -- (5);
\path[draw,thick,->,>=latex] (5) -- (2);
\path[draw,thick,->,>=latex] (5) -- (3);
\path[draw,thick,->,>=latex] (3) -- (6);
\node[color=red] at (1,2.3) {$1$};
\node[color=red] at (3,2.3) {$1$};
\node[color=red] at (5,2.3) {$3$};
\node[color=red] at (1,5.7) {$1$};
\node[color=red] at (3,5.7) {$2$};
\node[color=red] at (5,5.7) {$3$};
\end{tikzpicture}
\end{center}
For example, to calculate the value of $\texttt{paths}(3)$,
we can use the formula $\texttt{paths}(2)+\texttt{paths}(5)$,
because there are edges from nodes 2 and 5
to node 3.
Since $\texttt{paths}(2)=2$ and $\texttt{paths}(5)=1$, we conclude that $\texttt{paths}(3)=3$.
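The computation can be sketched in code by processing the nodes in a
topological order. The version below uses Kahn's algorithm to produce that
order; the adjacency-list name \texttt{g}, the 0-indexed nodes and the
function name are assumptions, not fixed by the text.
\begin{lstlisting}
// Count the paths from node 0 to every node of a DAG.
// g[v] lists the successors of v.
vector<long long> countPaths(const vector<vector<int>>& g) {
    int n = g.size();
    vector<int> indeg(n, 0);
    for (int v = 0; v < n; v++)
        for (int w : g[v]) indeg[w]++;
    vector<long long> paths(n, 0);
    paths[0] = 1;  // base case: one (empty) path to the start
    queue<int> q;  // Kahn's algorithm: nodes in topological order
    for (int v = 0; v < n; v++)
        if (indeg[v] == 0) q.push(v);
    while (!q.empty()) {
        int v = q.front(); q.pop();
        for (int w : g[v]) {
            paths[w] += paths[v];   // every path to v extends to w
            if (--indeg[w] == 0) q.push(w);
        }
    }
    return paths;
}
\end{lstlisting}
On the example graph above (with nodes renumbered to start from 0), this
returns 3 for the node corresponding to node 6.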
\subsubsection{Extending Dijkstra's algorithm}
\index{Dijkstra's algorithm}
A by-product of Dijkstra's algorithm is a directed, acyclic
graph that indicates for each node of the original graph
the possible ways to reach the node using a shortest path
from the starting node.
Dynamic programming can be applied to that graph.
For example, in the graph
\begin{center}
\begin{tikzpicture}
\node[draw, circle] (1) at (0,0) {$1$};
\node[draw, circle] (2) at (2,0) {$2$};
\node[draw, circle] (3) at (0,-2) {$3$};
\node[draw, circle] (4) at (2,-2) {$4$};
\node[draw, circle] (5) at (4,-1) {$5$};
\path[draw,thick,-] (1) -- node[font=\small,label=above:3] {} (2);
\path[draw,thick,-] (1) -- node[font=\small,label=left:5] {} (3);
\path[draw,thick,-] (2) -- node[font=\small,label=right:4] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=above:8] {} (5);
\path[draw,thick,-] (3) -- node[font=\small,label=below:2] {} (4);
\path[draw,thick,-] (4) -- node[font=\small,label=below:1] {} (5);
\path[draw,thick,-] (2) -- node[font=\small,label=above:2] {} (3);
\end{tikzpicture}
\end{center}
the shortest paths from node 1 may use the following edges:
\begin{center}
\begin{tikzpicture}
\node[draw, circle] (1) at (0,0) {$1$};
\node[draw, circle] (2) at (2,0) {$2$};
\node[draw, circle] (3) at (0,-2) {$3$};
\node[draw, circle] (4) at (2,-2) {$4$};
\node[draw, circle] (5) at (4,-1) {$5$};
\path[draw,thick,->] (1) -- node[font=\small,label=above:3] {} (2);
\path[draw,thick,->] (1) -- node[font=\small,label=left:5] {} (3);
\path[draw,thick,->] (2) -- node[font=\small,label=right:4] {} (4);
\path[draw,thick,->] (3) -- node[font=\small,label=below:2] {} (4);
\path[draw,thick,->] (4) -- node[font=\small,label=below:1] {} (5);
\path[draw,thick,->] (2) -- node[font=\small,label=above:2] {} (3);
\end{tikzpicture}
\end{center}
Now we can, for example, calculate the number of
shortest paths from node 1 to node 5
using dynamic programming:
\begin{center}
\begin{tikzpicture}
\node[draw, circle] (1) at (0,0) {$1$};
\node[draw, circle] (2) at (2,0) {$2$};
\node[draw, circle] (3) at (0,-2) {$3$};
\node[draw, circle] (4) at (2,-2) {$4$};
\node[draw, circle] (5) at (4,-1) {$5$};
\path[draw,thick,->] (1) -- node[font=\small,label=above:3] {} (2);
\path[draw,thick,->] (1) -- node[font=\small,label=left:5] {} (3);
\path[draw,thick,->] (2) -- node[font=\small,label=right:4] {} (4);
\path[draw,thick,->] (3) -- node[font=\small,label=below:2] {} (4);
\path[draw,thick,->] (4) -- node[font=\small,label=below:1] {} (5);
\path[draw,thick,->] (2) -- node[font=\small,label=above:2] {} (3);
\node[color=red] at (0,0.7) {$1$};
\node[color=red] at (2,0.7) {$1$};
\node[color=red] at (0,-2.7) {$2$};
\node[color=red] at (2,-2.7) {$3$};
\node[color=red] at (4,-1.7) {$3$};
\end{tikzpicture}
\end{center}
\subsubsection{Representing problems as graphs}
Actually, any dynamic programming problem
can be represented as a directed, acyclic graph.
In such a graph, each node corresponds to a dynamic programming state
and the edges indicate how the states depend on each other.
As an example, consider the problem
of forming a sum of money $n$
using coins
$\{c_1,c_2,\ldots,c_k\}$.
In this problem, we can construct a graph where
each node corresponds to a sum of money,
and the edges show how the coins can be chosen.
For example, for coins $\{1,3,4\}$ and $n=6$,
the graph is as follows:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (0) at (0,0) {$0$};
\node[draw, circle] (1) at (2,0) {$1$};
\node[draw, circle] (2) at (4,0) {$2$};
\node[draw, circle] (3) at (6,0) {$3$};
\node[draw, circle] (4) at (8,0) {$4$};
\node[draw, circle] (5) at (10,0) {$5$};
\node[draw, circle] (6) at (12,0) {$6$};
\path[draw,thick,->] (0) -- (1);
\path[draw,thick,->] (1) -- (2);
\path[draw,thick,->] (2) -- (3);
\path[draw,thick,->] (3) -- (4);
\path[draw,thick,->] (4) -- (5);
\path[draw,thick,->] (5) -- (6);
\path[draw,thick,->] (0) edge [bend right=30] (3);
\path[draw,thick,->] (1) edge [bend right=30] (4);
\path[draw,thick,->] (2) edge [bend right=30] (5);
\path[draw,thick,->] (3) edge [bend right=30] (6);
\path[draw,thick,->] (0) edge [bend left=30] (4);
\path[draw,thick,->] (1) edge [bend left=30] (5);
\path[draw,thick,->] (2) edge [bend left=30] (6);
\end{tikzpicture}
\end{center}
Using this representation,
the shortest path from node 0 to node $n$
corresponds to a solution with the minimum number of coins,
and the total number of paths from node 0 to node $n$
equals the total number of solutions.
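This view can be sketched directly in code: the shortest path from node 0
gives the minimum number of coins, and the number of paths gives the number of
(ordered) solutions. The function name and the arrays are illustrative
assumptions.
\begin{lstlisting}
// Sums 0..n are the nodes; each coin c adds an edge x -> x+c.
pair<int, long long> coinDp(const vector<int>& coins, int n) {
    const int INF = 1e9;
    vector<int> minCoins(n + 1, INF);   // shortest path from node 0
    vector<long long> ways(n + 1, 0);   // number of paths from node 0
    minCoins[0] = 0;
    ways[0] = 1;
    for (int x = 1; x <= n; x++)
        for (int c : coins)
            if (x - c >= 0 && minCoins[x - c] < INF) {
                minCoins[x] = min(minCoins[x], minCoins[x - c] + 1);
                ways[x] += ways[x - c];  // paths through node x-c
            }
    return {minCoins[n], ways[n]};
}
\end{lstlisting}
For coins $\{1,3,4\}$ and $n=6$, the minimum number of coins is 2
(for example $3+3$).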
\section{Successor paths}
\index{successor graph}
\index{functional graph}
For the rest of the chapter,
we will focus on \key{successor graphs}.
In those graphs,
the outdegree of each node is 1, i.e.,
exactly one edge starts at each node.
A successor graph consists of one or more
components, each of which contains
one cycle and some paths that lead to it.
Successor graphs are sometimes called
\key{functional graphs}.
The reason for this is that any successor graph
corresponds to a function that defines
the edges of the graph.
The parameter for the function is a node of the graph,
and the function gives the successor of that node.
For example, the function
\begin{center}
\begin{tabular}{r|rrrrrrrrr}
$x$ & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\
\hline
$\texttt{succ}(x)$ & 3 & 5 & 7 & 6 & 2 & 2 & 1 & 6 & 3 \\
\end{tabular}
\end{center}
defines the following graph:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,0) {$1$};
\node[draw, circle] (2) at (2,0) {$2$};
\node[draw, circle] (3) at (-2,0) {$3$};
\node[draw, circle] (4) at (1,-3) {$4$};
\node[draw, circle] (5) at (4,0) {$5$};
\node[draw, circle] (6) at (2,-1.5) {$6$};
\node[draw, circle] (7) at (-2,-1.5) {$7$};
\node[draw, circle] (8) at (3,-3) {$8$};
\node[draw, circle] (9) at (-4,0) {$9$};
\path[draw,thick,->] (1) -- (3);
\path[draw,thick,->] (2) edge [bend left=40] (5);
\path[draw,thick,->] (3) -- (7);
\path[draw,thick,->] (4) -- (6);
\path[draw,thick,->] (5) edge [bend left=40] (2);
\path[draw,thick,->] (6) -- (2);
\path[draw,thick,->] (7) -- (1);
\path[draw,thick,->] (8) -- (6);
\path[draw,thick,->] (9) -- (3);
\end{tikzpicture}
\end{center}
Since each node of a successor graph has a
unique successor, we can also define a function $\texttt{succ}(x,k)$
that gives the node that we will reach if
we begin at node $x$ and walk $k$ steps forward.
For example, in the above graph $\texttt{succ}(4,6)=2$,
because we will reach node 2 by walking 6 steps from node 4:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,0) {$4$};
\node[draw, circle] (2) at (1.5,0) {$6$};
\node[draw, circle] (3) at (3,0) {$2$};
\node[draw, circle] (4) at (4.5,0) {$5$};
\node[draw, circle] (5) at (6,0) {$2$};
\node[draw, circle] (6) at (7.5,0) {$5$};
\node[draw, circle] (7) at (9,0) {$2$};
\path[draw,thick,->] (1) -- (2);
\path[draw,thick,->] (2) -- (3);
\path[draw,thick,->] (3) -- (4);
\path[draw,thick,->] (4) -- (5);
\path[draw,thick,->] (5) -- (6);
\path[draw,thick,->] (6) -- (7);
\end{tikzpicture}
\end{center}
A straightforward way to calculate a value of $\texttt{succ}(x,k)$
is to start at node $x$ and walk $k$ steps forward, which takes $O(k)$ time.
However, using preprocessing, any value of $\texttt{succ}(x,k)$
can be calculated in only $O(\log k)$ time.
The idea is to precalculate all values of $\texttt{succ}(x,k)$ where
$k$ is a power of two and at most $u$, where $u$ is
the maximum number of steps we will ever walk.
This can be efficiently done, because
we can use the following recursion:
\begin{equation*}
\texttt{succ}(x,k) = \begin{cases}
\texttt{succ}(x) & k = 1\\
\texttt{succ}(\texttt{succ}(x,k/2),k/2) & k > 1\\
\end{cases}
\end{equation*}
Precalculating the values takes $O(n \log u)$ time,
because $O(\log u)$ values are calculated for each node.
In the above graph, the first values are as follows:
\begin{center}
\begin{tabular}{r|rrrrrrrrr}
$x$ & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\
\hline
$\texttt{succ}(x,1)$ & 3 & 5 & 7 & 6 & 2 & 2 & 1 & 6 & 3 \\
$\texttt{succ}(x,2)$ & 7 & 2 & 1 & 2 & 5 & 5 & 3 & 2 & 7 \\
$\texttt{succ}(x,4)$ & 3 & 2 & 7 & 2 & 5 & 5 & 1 & 2 & 3 \\
$\texttt{succ}(x,8)$ & 7 & 2 & 1 & 2 & 5 & 5 & 3 & 2 & 7 \\
$\cdots$ \\
\end{tabular}
\end{center}
After this, any value of $\texttt{succ}(x,k)$ can be calculated
by presenting the number of steps $k$ as a sum of powers of two.
For example, if we want to calculate the value of $\texttt{succ}(x,11)$,
we first form the representation $11=8+2+1$.
Using that,
\[\texttt{succ}(x,11)=\texttt{succ}(\texttt{succ}(\texttt{succ}(x,8),2),1).\]
For example, in the previous graph
\[\texttt{succ}(4,11)=\texttt{succ}(\texttt{succ}(\texttt{succ}(4,8),2),1)=5.\]
Such a representation always consists of
$O(\log k)$ parts, so calculating a value of $\texttt{succ}(x,k)$
takes $O(\log k)$ time.
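The precalculation and the query can be sketched as follows. The array names
and the constant \texttt{LOG} are assumptions; \texttt{LOG} must be chosen so
that $2^{\texttt{LOG}} > u$.
\begin{lstlisting}
const int LOG = 30;          // enough for u up to about 1e9
vector<array<int, LOG>> up;  // up[x][j] = succ(x, 2^j)

void precompute(const vector<int>& succ) {
    int n = succ.size();
    up.assign(n, {});
    for (int x = 0; x < n; x++) up[x][0] = succ[x];
    for (int j = 1; j < LOG; j++)
        for (int x = 0; x < n; x++)
            up[x][j] = up[up[x][j-1]][j-1];  // two jumps of 2^(j-1)
}

int query(int x, long long k) {
    for (int j = 0; j < LOG; j++)
        if (k & (1LL << j)) x = up[x][j];    // walk 2^j steps
    return x;
}
\end{lstlisting}
With the successor function of the example graph (0-indexed), a query for
6 steps from node 4 returns node 2, matching $\texttt{succ}(4,6)=2$.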
\section{Cycle detection}
\index{cycle}
\index{cycle detection}
Consider a successor graph that only contains
a path that ends in a cycle.
We may ask the following questions:
if we begin our walk at the starting node,
what is the first node in the cycle
and how many nodes does the cycle contain?
For example, in the graph
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (5) at (0,0) {$5$};
\node[draw, circle] (4) at (-2,0) {$4$};
\node[draw, circle] (6) at (-1,1.5) {$6$};
\node[draw, circle] (3) at (-4,0) {$3$};
\node[draw, circle] (2) at (-6,0) {$2$};
\node[draw, circle] (1) at (-8,0) {$1$};
\path[draw,thick,->] (1) -- (2);
\path[draw,thick,->] (2) -- (3);
\path[draw,thick,->] (3) -- (4);
\path[draw,thick,->] (4) -- (5);
\path[draw,thick,->] (5) -- (6);
\path[draw,thick,->] (6) -- (4);
\end{tikzpicture}
\end{center}
if we begin our walk at node 1,
the first node that belongs to the cycle is node 4, and the cycle consists
of three nodes (4, 5 and 6).
A simple way to detect the cycle is to walk in the
graph and keep track of
all nodes that have been visited. Once a node is visited
for the second time, we can conclude
that the node is the first node in the cycle.
This method works in $O(n)$ time and also uses
$O(n)$ memory.
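A sketch of this simple method (the names are illustrative, and the walk is
assumed to start at node \texttt{x0}):
\begin{lstlisting}
// Returns the first node of the cycle using O(n) extra memory.
// succ[x] is the successor of node x.
int findCycleStart(const vector<int>& succ, int x0) {
    vector<bool> visited(succ.size(), false);
    int x = x0;
    while (!visited[x]) {   // walk until a node repeats
        visited[x] = true;
        x = succ[x];
    }
    return x;               // first node visited twice
}
\end{lstlisting}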
However, there are better algorithms for cycle detection.
The time complexity of such algorithms is still $O(n)$,
but they only use $O(1)$ memory.
This is an important improvement if $n$ is large.
Next we will discuss Floyd's algorithm that
achieves these properties.
\subsubsection{Floyd's algorithm}
\index{Floyd's algorithm}
\key{Floyd's algorithm}\footnote{The idea of the algorithm is mentioned in \cite{knu982}
and attributed to R. W. Floyd; however, it is not known if Floyd actually
discovered the algorithm.} walks forward
in the graph using two pointers $a$ and $b$.
Both pointers begin at a node $x$ that
is the starting node of the graph.
Then, on each turn, the pointer $a$ walks
one step forward and the pointer $b$
walks two steps forward.
The process continues until
the pointers meet each other:
\begin{lstlisting}
a = succ(x);
b = succ(succ(x));
while (a != b) {
a = succ(a);
b = succ(succ(b));
}
\end{lstlisting}
At this point, the pointer $a$ has walked $k$ steps
and the pointer $b$ has walked $2k$ steps,
so the length of the cycle divides $k$.
Thus, the first node that belongs to the cycle
can be found by moving the pointer $a$ to node $x$
and advancing the pointers
step by step until they meet again.
\begin{lstlisting}
a = x;
while (a != b) {
a = succ(a);
b = succ(b);
}
first = a;
\end{lstlisting}
After this, the length of the cycle
can be calculated as follows:
\begin{lstlisting}
b = succ(a);
length = 1;
while (a != b) {
b = succ(b);
length++;
}
\end{lstlisting}
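The three phases above can be combined into one self-contained function. The
name \texttt{floydCycle} and the array-based successor function are
assumptions; the function returns the first node of the cycle together with
the cycle length.
\begin{lstlisting}
pair<int,int> floydCycle(const vector<int>& succ, int x) {
    int a = succ[x], b = succ[succ[x]];
    while (a != b) {        // speeds 1 and 2 until the pointers meet
        a = succ[a];
        b = succ[succ[b]];
    }
    a = x;                  // find the first node of the cycle
    while (a != b) {
        a = succ[a];
        b = succ[b];
    }
    int first = a;
    b = succ[a];            // measure the cycle length
    int length = 1;
    while (a != b) {
        b = succ[b];
        length++;
    }
    return {first, length};
}
\end{lstlisting}
On the example graph above (0-indexed), the function reports that the cycle
starts at the node corresponding to node 4 and has length 3.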

The strongly connected components of a graph can be found in $O(n+m)$ time,
because the algorithm for finding them performs two depth-first searches.
\section{2SAT problem}
\index{2SAT problem}
Strong connectivity is also linked with the
\key{2SAT problem}\footnote{The algorithm presented here was
introduced in \cite{asp79}.
There is also another well-known linear-time algorithm \cite{eve75}
that is based on backtracking.}.
In this problem, we are given a logical formula
\[
(a_1 \lor b_1) \land (a_2 \lor b_2) \land \cdots \land (a_m \lor b_m),
\]
where each $a_i$ and $b_i$ is either a logical variable
($x_1,x_2,\ldots,x_n$)
or a negation of a logical variable
($\lnot x_1, \lnot x_2, \ldots, \lnot x_n$).
The symbols ''$\land$'' and ''$\lor$'' denote
logical operators ''and'' and ''or''.
Our task is to assign each variable a value
so that the formula is true, or state
that this is not possible.
For example, the formula
\[
L_1 = (x_2 \lor \lnot x_1) \land
(\lnot x_1 \lor \lnot x_2) \land
(x_1 \lor x_3) \land
(\lnot x_2 \lor \lnot x_3) \land
(x_1 \lor x_4)
\]
is true when the variables are assigned as follows:
\[
\begin{cases}
x_1 = \textrm{false} \\
x_2 = \textrm{false} \\
x_3 = \textrm{true} \\
x_4 = \textrm{true} \\
\end{cases}
\]
However, the formula
\[
L_2 = (x_1 \lor x_2) \land
(x_1 \lor \lnot x_2) \land
(\lnot x_1 \lor x_3) \land
(\lnot x_1 \lor \lnot x_3)
\]
is always false, regardless of how we
assign the values.
The reason for this is that we cannot
choose a value for $x_1$
without creating a contradiction.
If $x_1$ is false, both $x_2$ and $\lnot x_2$
should be true which is impossible,
and if $x_1$ is true, both $x_3$ and $\lnot x_3$
should be true which is also impossible.
The 2SAT problem can be represented as a graph
whose nodes correspond to
variables $x_i$ and negations $\lnot x_i$,
and edges determine the connections
between the variables.
Each pair $(a_i \lor b_i)$ generates two edges:
$\lnot a_i \to b_i$ and $\lnot b_i \to a_i$.
This means that if $a_i$ does not hold,
$b_i$ must hold, and vice versa.
The graph for the formula $L_1$ is:
\\
\begin{center}
\begin{tikzpicture}[scale=1.0,minimum size=2pt]
\node[draw, circle, inner sep=1.3pt] (1) at (1,2) {$\lnot x_3$};
\node[draw, circle] (2) at (3,2) {$x_2$};
\node[draw, circle, inner sep=1.3pt] (3) at (1,0) {$\lnot x_4$};
\node[draw, circle] (4) at (3,0) {$x_1$};
\node[draw, circle, inner sep=1.3pt] (5) at (5,2) {$\lnot x_1$};
\node[draw, circle] (6) at (7,2) {$x_4$};
\node[draw, circle, inner sep=1.3pt] (7) at (5,0) {$\lnot x_2$};
\node[draw, circle] (8) at (7,0) {$x_3$};
\path[draw,thick,->] (1) -- (4);
\path[draw,thick,->] (4) -- (2);
\path[draw,thick,->] (2) -- (1);
\path[draw,thick,->] (3) -- (4);
\path[draw,thick,->] (2) -- (5);
\path[draw,thick,->] (4) -- (7);
\path[draw,thick,->] (5) -- (6);
\path[draw,thick,->] (5) -- (8);
\path[draw,thick,->] (8) -- (7);
\path[draw,thick,->] (7) -- (5);
\end{tikzpicture}
\end{center}
And the graph for the formula $L_2$ is:
\\
\begin{center}
\begin{tikzpicture}[scale=1.0,minimum size=2pt]
\node[draw, circle] (1) at (1,2) {$x_3$};
\node[draw, circle] (2) at (3,2) {$x_2$};
\node[draw, circle, inner sep=1.3pt] (3) at (5,2) {$\lnot x_2$};
\node[draw, circle, inner sep=1.3pt] (4) at (7,2) {$\lnot x_3$};
\node[draw, circle, inner sep=1.3pt] (5) at (4,3.5) {$\lnot x_1$};
\node[draw, circle] (6) at (4,0.5) {$x_1$};
\path[draw,thick,->] (1) -- (5);
\path[draw,thick,->] (4) -- (5);
\path[draw,thick,->] (6) -- (1);
\path[draw,thick,->] (6) -- (4);
\path[draw,thick,->] (5) -- (2);
\path[draw,thick,->] (5) -- (3);
\path[draw,thick,->] (2) -- (6);
\path[draw,thick,->] (3) -- (6);
\end{tikzpicture}
\end{center}
The structure of the graph tells us whether
it is possible to assign the values
of the variables so
that the formula is true.
It turns out that this can be done
exactly when there are no nodes
$x_i$ and $\lnot x_i$ such that
both nodes belong to the
same strongly connected component.
If there are such nodes,
the graph contains
a path from $x_i$ to $\lnot x_i$
and also a path from $\lnot x_i$ to $x_i$,
so both $x_i$ and $\lnot x_i$ should be true
which is not possible.
In the graph of the formula $L_1$
there are no nodes $x_i$ and $\lnot x_i$
such that both nodes
belong to the same strongly connected component,
so a solution exists.
In the graph of the formula $L_2$
all nodes belong to the same strongly connected component,
so a solution does not exist.
If a solution exists, the values for the variables
can be found by going through the nodes of the
component graph in a reverse topological sort order.
At each step, we process a component
that does not contain edges that lead to an
unprocessed component.
If the variables in the component
have not been assigned values,
their values will be determined
according to the values in the component,
and if they already have values,
they remain unchanged.
The process continues until each variable
has been assigned a value.
The component graph for the formula $L_1$ is as follows:
\begin{center}
\begin{tikzpicture}[scale=1.0]
\node[draw, circle] (1) at (0,0) {$A$};
\node[draw, circle] (2) at (2,0) {$B$};
\node[draw, circle] (3) at (4,0) {$C$};
\node[draw, circle] (4) at (6,0) {$D$};
\path[draw,thick,->] (1) -- (2);
\path[draw,thick,->] (2) -- (3);
\path[draw,thick,->] (3) -- (4);
\end{tikzpicture}
\end{center}
The components are
$A = \{\lnot x_4\}$,
$B = \{x_1, x_2, \lnot x_3\}$,
$C = \{\lnot x_1, \lnot x_2, x_3\}$ and
$D = \{x_4\}$.
When constructing the solution,
we first process the component $D$
where $x_4$ becomes true.
After this, we process the component $C$
where $x_1$ and $x_2$ become false
and $x_3$ becomes true.
All variables have been assigned values,
so the remaining components $A$ and $B$
do not change the variables.
Note that this method works, because the
graph has a special structure:
if there are paths from node $x_i$ to node $x_j$
and from node $x_j$ to node $\lnot x_j$,
then node $x_i$ never becomes true.
The reason for this is that there is also
a path from node $\lnot x_j$ to node $\lnot x_i$,
and both $x_i$ and $x_j$ become false.
\index{3SAT problem}
A more difficult problem is the \key{3SAT problem},
where each part of the formula is of the form
$(a_i \lor b_i \lor c_i)$.
This problem is NP-hard, so no efficient algorithm
for solving the problem is known.

The time complexity of a straightforward evaluation of $\texttt{ancestor}(x,k)$
is $O(k)$, which may be slow, because a tree of $n$
nodes may have a chain of $n$ nodes.
Fortunately, using a technique similar to that
used in Chapter 16.3 (of the full book), any value of $\texttt{ancestor}(x,k)$
can be efficiently calculated in $O(\log k)$ time
after preprocessing.
The idea is to precalculate all values $\texttt{ancestor}(x,k)$
where $k$ is a power of two.

\markboth{\MakeUppercase{Preface}}{}
\addcontentsline{toc}{chapter}{Preface}
This script is based on the Competitive Programmer's Handbook
by Antti Laaksonen.
It contains the topics relevant for the graph day of the
SOI Camp 2021.
Most of the code was modified slightly, and some minor
adjustments were made to the text.