\chapter{Basics of graphs}

Many programming problems can be solved by
interpreting the problem as a graph problem
and using a suitable graph algorithm.
A typical example of a graph is a network
of roads and cities in a country.
Sometimes, though, the graph is hidden
in the problem and it can be difficult to detect it.

This part of the book discusses techniques and algorithms
involving graphs
that are important in competitive programming.
We will first go through graph terminology
and different ways to store graphs in algorithms.

\section{Terminology}

\index{graph}
\index{node}
\index{edge}

A \key{graph} consists of \key{nodes}
and \key{edges} between them.
In this book,
the variable $n$ denotes the number of nodes
in a graph, and the variable $m$ denotes
the number of edges.
In addition, the nodes are numbered
using integers $1,2,\ldots,n$.

For example, the following graph contains 5 nodes and 7 edges:

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (4,3) {$2$};
\node[draw, circle] (3) at (1,1) {$3$};
\node[draw, circle] (4) at (4,1) {$4$};
\node[draw, circle] (5) at (6,2) {$5$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (2) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (4) -- (5);
\end{tikzpicture}
\end{center}

\index{path}

A \key{path} is a route from node $a$ to node $b$
that goes through the edges in the graph.
The \key{length} of a path is the number of
edges in the path.
For example, in the above graph, paths
from node 1 to node 5 are:

\begin{itemize}
\item $1 \rightarrow 2 \rightarrow 5$ (length 2)
\item $1 \rightarrow 4 \rightarrow 5$ (length 2)
\item $1 \rightarrow 2 \rightarrow 4 \rightarrow 5$ (length 3)
\item $1 \rightarrow 3 \rightarrow 4 \rightarrow 5$ (length 3)
\item $1 \rightarrow 4 \rightarrow 2 \rightarrow 5$ (length 3)
\item $1 \rightarrow 3 \rightarrow 4 \rightarrow 2 \rightarrow 5$ (length 4)
\end{itemize}

\subsubsection{Connectivity}

\index{connected graph}

A graph is \key{connected}, if there is path
between any two nodes.
For example, the following graph is connected:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (4,3) {$2$};
\node[draw, circle] (3) at (1,1) {$3$};
\node[draw, circle] (4) at (4,1) {$4$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (2) -- (3);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (2) -- (4);
\end{tikzpicture}
\end{center}

The following graph is not connected
because it is not possible to get to other
nodes from node 4.
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (4,3) {$2$};
\node[draw, circle] (3) at (1,1) {$3$};
\node[draw, circle] (4) at (4,1) {$4$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (2) -- (3);
\end{tikzpicture}
\end{center}

\index{compomnent}

The connected parts of a graph are
its \key{components}.
For example, the following graph
contains three components:
$\{1,\,2,\,3\}$,
$\{4,\,5,\,6,\,7\}$ and
$\{8\}$.
\begin{center}
\begin{tikzpicture}[scale=0.8]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (4,3) {$2$};
\node[draw, circle] (3) at (1,1) {$3$};

\node[draw, circle] (6) at (6,1) {$6$};
\node[draw, circle] (7) at (9,1) {$7$};
\node[draw, circle] (4) at (6,3) {$4$};
\node[draw, circle] (5) at (9,3) {$5$};

\node[draw, circle] (8) at (11,2) {$8$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (2) -- (3);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (4) -- (5);
\path[draw,thick,-] (5) -- (7);
\path[draw,thick,-] (6) -- (7);
\path[draw,thick,-] (6) -- (4);
\end{tikzpicture}
\end{center}

\index{tree}

A \key{tree} is a connected graph
that contains $n$ nodes and $n-1$ edges.
In a tree, there is a unique path
between any two nodes.
For example, the following graph is a tree:

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (4,3) {$2$};
\node[draw, circle] (3) at (1,1) {$3$};
\node[draw, circle] (4) at (4,1) {$4$};
\node[draw, circle] (5) at (6,2) {$5$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
%\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (2) -- (4);
%\path[draw,thick,-] (4) -- (5);
\end{tikzpicture}
\end{center}

\subsubsection{Edge directions}

\index{directed graph}

A graph is \key{directed}
if the edges can be travelled only
in one direction.
For example, the following graph is directed:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (4,3) {$2$};
\node[draw, circle] (3) at (1,1) {$3$};
\node[draw, circle] (4) at (4,1) {$4$};
\node[draw, circle] (5) at (6,2) {$5$};
\path[draw,thick,->,>=latex] (1) -- (2);
\path[draw,thick,->,>=latex] (2) -- (4);
\path[draw,thick,->,>=latex] (2) -- (5);
\path[draw,thick,->,>=latex] (4) -- (5);
\path[draw,thick,->,>=latex] (4) -- (1);
\path[draw,thick,->,>=latex] (3) -- (1);
\end{tikzpicture}
\end{center}

The above graph contains a path from
node $3$ to $5$ using edges
$3 \rightarrow 1 \rightarrow 2 \rightarrow 5$.
However, the graph doesn't contain
a path from node $5$ to $3$.

\index{cycle}
\index{acyclic graph}

A \key{cycle} is a path whose first and
last node is the same.
For example, the above graph contains
a cycle
$1 \rightarrow 2 \rightarrow 4 \rightarrow 1$.
If a graph doesn't contain any cycles,
it is called \key{acyclic}.

\subsubsection{Edge weights}

\index{weighted graph}

In a \key{weighted} graph, each edge is assigned
a \key{weight}.
Often, the weights are interpreted as edge lengths.
For example, the following graph is weighted:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (4,3) {$2$};
\node[draw, circle] (3) at (1,1) {$3$};
\node[draw, circle] (4) at (4,1) {$4$};
\node[draw, circle] (5) at (6,2) {$5$};
\path[draw,thick,-] (1) -- node[font=\small,label=above:5] {} (2);
\path[draw,thick,-] (1) -- node[font=\small,label=left:1] {} (3);
\path[draw,thick,-] (3) -- node[font=\small,label=below:7] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (4);
\path[draw,thick,-] (2) -- node[font=\small,label=above:7] {} (5);
\path[draw,thick,-] (4) -- node[font=\small,label=below:3] {} (5);
\end{tikzpicture}
\end{center}

Now the length of a path is the sum of
edge weights.
For example, in the above graph
the length of path
$1 \rightarrow 2 \rightarrow 5$
is $12$, and the length of path
$1 \rightarrow 3 \rightarrow 4 \rightarrow 5$ is $11$.
The latter is the shortest path from node $1$ to node $5$.

\subsubsection{Neighbors and degrees}

\index{neighbor}
\index{degree}

Two nodes are \key{neighbors} or \key{adjacent}
if there is a edge between them.
The \key{degree} of a node
is the number of its neighbors.
For example, in the following graph,
the neighbors of node 2 are 1, 4 and 5,
so its degree is 3.

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (4,3) {$2$};
\node[draw, circle] (3) at (1,1) {$3$};
\node[draw, circle] (4) at (4,1) {$4$};
\node[draw, circle] (5) at (6,2) {$5$};

\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (2) -- (4);
\path[draw,thick,-] (2) -- (5);
%\path[draw,thick,-] (4) -- (5);
\end{tikzpicture}
\end{center}

The sum of degrees in a graph is always $2m$
where $m$ is the number of edges.
The reason for this is that each edge
increases the degree of two nodes by one.
Thus, the sum of degrees is always even.

\index{regular graph}
\index{complete graph}

A graph is \key{regular} if the
degree of every node is a constant $d$.
A graph is \key{complete} if the
degree of every node is $n-1$, i.e.,
the graph contains all possible edges
between the nodes.

\index{indegree}
\index{outdegree}

In a directed graph, the \key{indegree}
and \key{outdegree} of a node is
the number of edges that end and begin
at the node, respectively.
For example, in the following graph,
node 2 has indegree 2 and outdegree 1.

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (4,3) {$2$};
\node[draw, circle] (3) at (1,1) {$3$};
\node[draw, circle] (4) at (4,1) {$4$};
\node[draw, circle] (5) at (6,2) {$5$};

\path[draw,thick,->,>=latex] (1) -- (2);
\path[draw,thick,->,>=latex] (1) -- (3);
\path[draw,thick,->,>=latex] (1) -- (4);
\path[draw,thick,->,>=latex] (3) -- (4);
\path[draw,thick,->,>=latex] (2) -- (4);
\path[draw,thick,<-,>=latex] (2) -- (5);
\end{tikzpicture}
\end{center}

\subsubsection{Colorings}

\index{coloring}
\index{bipartite graph}

In a \key{coloring} of a graph,
each node is assigned a color so that
no adjacent nodes have the same color.

A graph is \key{bipartite} if
it is possible to color it using two colors.
It turns out that a graph is bipartite
exactly when it doesn't contain a cycle
with odd number of edges.
For example, the graph
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$2$};
\node[draw, circle] (2) at (4,3) {$3$};
\node[draw, circle] (3) at (1,1) {$5$};
\node[draw, circle] (4) at (4,1) {$6$};
\node[draw, circle] (5) at (-2,1) {$4$};
\node[draw, circle] (6) at (-2,3) {$1$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (2) -- (4);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (5) -- (6);
\end{tikzpicture}
\end{center}
is bipartite because we can color it as follows:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle, fill=blue!40] (1) at (1,3) {$2$};
\node[draw, circle, fill=red!40] (2) at (4,3) {$3$};
\node[draw, circle, fill=red!40] (3) at (1,1) {$5$};
\node[draw, circle, fill=blue!40] (4) at (4,1) {$6$};
\node[draw, circle, fill=red!40] (5) at (-2,1) {$4$};
\node[draw, circle, fill=blue!40] (6) at (-2,3) {$1$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (2) -- (4);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (5) -- (6);
\end{tikzpicture}
\end{center}

\subsubsection{Simplicity}

\index{simple graph}

A graph is \key{simple}
if no edge begins and ends at the same node,
and there are no multiple
edges between two nodes.
Often we will assume that the graph is simple.
For example, the graph
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$2$};
\node[draw, circle] (2) at (4,3) {$3$};
\node[draw, circle] (3) at (1,1) {$5$};
\node[draw, circle] (4) at (4,1) {$6$};
\node[draw, circle] (5) at (-2,1) {$4$};
\node[draw, circle] (6) at (-2,3) {$1$};

\path[draw,thick,-] (1) edge [bend right=20] (2);
\path[draw,thick,-] (2) edge [bend right=20] (1);
%\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (2) -- (4);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (5) -- (6);

\tikzset{every loop/.style={in=135,out=190}}
\path[draw,thick,-] (5) edge [loop left] (5);
\end{tikzpicture}
\end{center}
is \emph{not} simple because there is an edge that begins
and ends at node 4, and there are two edges
between nodes 2 and 3.

\section{Graph representation}

There are several ways how to represent graphs in memory
in an algorithm.
The choice of a data structure
depends on the size of the graph and
how the algorithm manipulates it.
Next we will go through three representations.

\subsubsection{Adjacency list representation}

\index{adjacency list}

A usual way to represent a graph is
to create an \key{adjacency list} for each node.
An adjacency list contains contains all nodes
that can be reached from the node using a single edge.
The adjacency list representation is the most popular
way to store a graph, and most algorithms can be
efficiently implemented using it.

A good way to store the adjacency lists is to allocate an array
whose each element is a vector:
\begin{lstlisting}
vector<int> v[N];
\end{lstlisting}

The adjacency list for node $s$ is in position
$\texttt{v}[s]$ in the array.
The constant $N$ is so chosen that all
adjacency lists can be stored.
For example, the graph

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (3,1) {$4$};

\path[draw,thick,->,>=latex] (1) -- (2);
\path[draw,thick,->,>=latex] (2) -- (3);
\path[draw,thick,->,>=latex] (2) -- (4);
\path[draw,thick,->,>=latex] (3) -- (4);
\path[draw,thick,->,>=latex] (4) -- (1);
\end{tikzpicture}
\end{center}
can be stored as follows:
\begin{lstlisting}
v[1].push_back(2);
v[2].push_back(3);
v[2].push_back(4);
v[3].push_back(4);
v[4].push_back(1);
\end{lstlisting}

If the graph is undirected, it can be stored in a similar way,
but each edge each is store in both directions.

For an weighted graph, the structure can be extended
as follows:

\begin{lstlisting}
vector<pair<int,int>> v[N];
\end{lstlisting}

Now each adjacency list contains pairs whose first
element is the target node,
and the second element is the edge weight.
For example, the graph

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (3,1) {$4$};

\path[draw,thick,->,>=latex] (1) -- node[font=\small,label=above:5] {} (2);
\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=above:7] {} (3);
\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=left:6] {} (4);
\path[draw,thick,->,>=latex] (3) -- node[font=\small,label=right:5] {} (4);
\path[draw,thick,->,>=latex] (4) -- node[font=\small,label=left:2] {} (1);
\end{tikzpicture}
\end{center}
can be stored as follows:
\begin{lstlisting}
v[1].push_back({2,5});
v[2].push_back({3,7});
v[2].push_back({4,6});
v[3].push_back({4,5});
v[4].push_back({1,2});
\end{lstlisting}

The benefit in the adjacency list representation is that
we can efficiently find the nodes that can be
reached from a certain node.
For example, the following loop goes trough all nodes
that can be reached from node $s$:

\begin{lstlisting}
for (auto u : v[s]) {
    // process node u
}
\end{lstlisting}

\subsubsection{Adjacency matrix representation}

\index{adjacency matrix}

An \key{adjacency matrix} is a two-dimensional array
that indicates for each possible edge if it is
included in the graph.
Using an adjacency matrix, we can efficiently check
if there is an edge between two nodes.
On the other hand, the matrix takes a lot of memory
if the graph is large.
We can store the matrix as an array
\begin{lstlisting}
int v[N][N];
\end{lstlisting}
where the value $\texttt{v}[a][b]$ indicates
whether the graph contains an edge from
node $a$ to node $b$.
If the edge is included in the graph,
then $\texttt{v}[a][b]=1$,
and otherwise $\texttt{v}[a][b]=0$.
For example, the graph
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (3,1) {$4$};

\path[draw,thick,->,>=latex] (1) -- (2);
\path[draw,thick,->,>=latex] (2) -- (3);
\path[draw,thick,->,>=latex] (2) -- (4);
\path[draw,thick,->,>=latex] (3) -- (4);
\path[draw,thick,->,>=latex] (4) -- (1);
\end{tikzpicture}
\end{center}
can be represented as follows:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (4,4);
\node at (0.5,0.5) {1};
\node at (1.5,0.5) {0};
\node at (2.5,0.5) {0};
\node at (3.5,0.5) {0};
\node at (0.5,1.5) {0};
\node at (1.5,1.5) {0};
\node at (2.5,1.5) {0};
\node at (3.5,1.5) {1};
\node at (0.5,2.5) {0};
\node at (1.5,2.5) {0};
\node at (2.5,2.5) {1};
\node at (3.5,2.5) {1};
\node at (0.5,3.5) {0};
\node at (1.5,3.5) {1};
\node at (2.5,3.5) {0};
\node at (3.5,3.5) {0};
\node at (-0.5,0.5) {4};
\node at (-0.5,1.5) {3};
\node at (-0.5,2.5) {2};
\node at (-0.5,3.5) {1};
\node at (0.5,4.5) {1};
\node at (1.5,4.5) {2};
\node at (2.5,4.5) {3};
\node at (3.5,4.5) {4};
\end{tikzpicture}
\end{center}

If the graph is directed, the adjacency matrix
representation can be extended so that
the matrix contains the weight of the edge
if the edge exists.
Using this representation, the graph

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (3,1) {$4$};

\path[draw,thick,->,>=latex] (1) -- node[font=\small,label=above:5] {} (2);
\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=above:7] {} (3);
\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=left:6] {} (4);
\path[draw,thick,->,>=latex] (3) -- node[font=\small,label=right:5] {} (4);
\path[draw,thick,->,>=latex] (4) -- node[font=\small,label=left:2] {} (1);
\end{tikzpicture}
\end{center}
\begin{samepage}
corresponds to the following matrix:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (4,4);
\node at (0.5,0.5) {2};
\node at (1.5,0.5) {0};
\node at (2.5,0.5) {0};
\node at (3.5,0.5) {0};
\node at (0.5,1.5) {0};
\node at (1.5,1.5) {0};
\node at (2.5,1.5) {0};
\node at (3.5,1.5) {5};
\node at (0.5,2.5) {0};
\node at (1.5,2.5) {0};
\node at (2.5,2.5) {7};
\node at (3.5,2.5) {6};
\node at (0.5,3.5) {0};
\node at (1.5,3.5) {5};
\node at (2.5,3.5) {0};
\node at (3.5,3.5) {0};
\node at (-0.5,0.5) {4};
\node at (-0.5,1.5) {3};
\node at (-0.5,2.5) {2};
\node at (-0.5,3.5) {1};
\node at (0.5,4.5) {1};
\node at (1.5,4.5) {2};
\node at (2.5,4.5) {3};
\node at (3.5,4.5) {4};
\end{tikzpicture}
\end{center}
\end{samepage}

\subsubsection{Edge list representation}

\index{edge list}

An \key{edge list} contains all edges of a graph.
This is a convenient way to represent a graph,
if the algorithm will go trough all edges of the graph,
and it is not needed to find edges that begin
at a given node.

The edge list can be stored in a vector
\begin{lstlisting}
vector<pair<int,int>> v;
\end{lstlisting}
where each element contains the starting
and ending node of an edge.
Thus, the graph

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (3,1) {$4$};

\path[draw,thick,->,>=latex] (1) -- (2);
\path[draw,thick,->,>=latex] (2) -- (3);
\path[draw,thick,->,>=latex] (2) -- (4);
\path[draw,thick,->,>=latex] (3) -- (4);
\path[draw,thick,->,>=latex] (4) -- (1);
\end{tikzpicture}
\end{center}
can be represented as follows:
\begin{lstlisting}
v.push_back({1,2});
v.push_back({2,3});
v.push_back({2,4});
v.push_back({3,4});
v.push_back({4,1});
\end{lstlisting}

\noindent
If the graph is weighted, we can extend the
structure as follows:
\begin{lstlisting}
vector<pair<pair<int,int>,int>> v;
\end{lstlisting}
Now the list contains pairs whose first element
contains the starting and ending node of an edge,
and the second element corresponds to the edge weight.
For example, the graph

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (3,1) {$4$};

\path[draw,thick,->,>=latex] (1) -- node[font=\small,label=above:5] {} (2);
\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=above:7] {} (3);
\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=left:6] {} (4);
\path[draw,thick,->,>=latex] (3) -- node[font=\small,label=right:5] {} (4);
\path[draw,thick,->,>=latex] (4) -- node[font=\small,label=left:2] {} (1);
\end{tikzpicture}
\end{center}
\begin{samepage}
can be represented as follows:
\begin{lstlisting}
v.push_back({{1,2},5});
v.push_back({{2,3},7});
v.push_back({{2,4},6});
v.push_back({{3,4},5});
v.push_back({{4,1},2});
\end{lstlisting}
\end{samepage}