\chapter{Basics of graphs} Many programming problems can be solved by modelling the problem as a graph problem and using an appropriate graph algorithm. A typical example of a graph is a network of roads and cities in a country. Sometimes, though, the graph is hidden in the problem and it can be difficult to detect it. This part of the book discusses graph algorithms, especially focusing on topics that are important in competitive programming. In this chapter, we go through concepts related to graphs, and study different ways to represent graphs in algorithms. \section{Terminology} \index{graph} \index{node} \index{edge} A \key{graph} consists of \key{nodes} and \key{edges} between them. In this book, the variable $n$ denotes the number of nodes in a graph, and the variable $m$ denotes the number of edges. The nodes are numbered using integers $1,2,\ldots,n$. For example, the following graph consists of 5 nodes and 7 edges: \begin{center} \begin{tikzpicture}[scale=0.9] \node[draw, circle] (1) at (1,3) {$1$}; \node[draw, circle] (2) at (4,3) {$2$}; \node[draw, circle] (3) at (1,1) {$3$}; \node[draw, circle] (4) at (4,1) {$4$}; \node[draw, circle] (5) at (6,2) {$5$}; \path[draw,thick,-] (1) -- (2); \path[draw,thick,-] (1) -- (3); \path[draw,thick,-] (1) -- (4); \path[draw,thick,-] (3) -- (4); \path[draw,thick,-] (2) -- (4); \path[draw,thick,-] (2) -- (5); \path[draw,thick,-] (4) -- (5); \end{tikzpicture} \end{center} \index{path} A \key{path} leads from node $a$ to node $b$ through edges of the graph. The \key{length} of a path is the number of edges in it. 
For example, in the above graph, there are several paths from node 1 to node 5:
\begin{itemize}
\item $1 \rightarrow 2 \rightarrow 5$ (length 2)
\item $1 \rightarrow 4 \rightarrow 5$ (length 2)
\item $1 \rightarrow 2 \rightarrow 4 \rightarrow 5$ (length 3)
\item $1 \rightarrow 3 \rightarrow 4 \rightarrow 5$ (length 3)
\item $1 \rightarrow 4 \rightarrow 2 \rightarrow 5$ (length 3)
\item $1 \rightarrow 3 \rightarrow 4 \rightarrow 2 \rightarrow 5$ (length 4)
\end{itemize}

\subsubsection{Connectivity}

\index{connected graph}
A graph is \key{connected} if there is a path between any two nodes.
For example, the following graph is connected:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (4,3) {$2$};
\node[draw, circle] (3) at (1,1) {$3$};
\node[draw, circle] (4) at (4,1) {$4$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (2) -- (3);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (2) -- (4);
\end{tikzpicture}
\end{center}
The following graph is not connected, because it is not possible to get from node 4 to any other node:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (4,3) {$2$};
\node[draw, circle] (3) at (1,1) {$3$};
\node[draw, circle] (4) at (4,1) {$4$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (2) -- (3);
\end{tikzpicture}
\end{center}
\index{component}
The connected parts of a graph are called its \key{components}.
For example, the following graph contains three components: $\{1,\,2,\,3\}$, $\{4,\,5,\,6,\,7\}$ and $\{8\}$.
\begin{center}
\begin{tikzpicture}[scale=0.8]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (4,3) {$2$};
\node[draw, circle] (3) at (1,1) {$3$};
\node[draw, circle] (6) at (6,1) {$6$};
\node[draw, circle] (7) at (9,1) {$7$};
\node[draw, circle] (4) at (6,3) {$4$};
\node[draw, circle] (5) at (9,3) {$5$};
\node[draw, circle] (8) at (11,2) {$8$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (2) -- (3);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (4) -- (5);
\path[draw,thick,-] (5) -- (7);
\path[draw,thick,-] (6) -- (7);
\path[draw,thick,-] (6) -- (4);
\end{tikzpicture}
\end{center}
\index{tree}
A \key{tree} is a connected graph that consists of $n$ nodes and $n-1$ edges.
There is a unique path between any two nodes in a tree.
For example, the following graph is a tree:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (4,3) {$2$};
\node[draw, circle] (3) at (1,1) {$3$};
\node[draw, circle] (4) at (4,1) {$4$};
\node[draw, circle] (5) at (6,2) {$5$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
%\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (2) -- (4);
%\path[draw,thick,-] (4) -- (5);
\end{tikzpicture}
\end{center}

\subsubsection{Edge directions}

\index{directed graph}
A graph is \key{directed} if the edges can be traversed in one direction only.
For example, the following graph is directed: \begin{center} \begin{tikzpicture}[scale=0.9] \node[draw, circle] (1) at (1,3) {$1$}; \node[draw, circle] (2) at (4,3) {$2$}; \node[draw, circle] (3) at (1,1) {$3$}; \node[draw, circle] (4) at (4,1) {$4$}; \node[draw, circle] (5) at (6,2) {$5$}; \path[draw,thick,->,>=latex] (1) -- (2); \path[draw,thick,->,>=latex] (2) -- (4); \path[draw,thick,->,>=latex] (2) -- (5); \path[draw,thick,->,>=latex] (4) -- (5); \path[draw,thick,->,>=latex] (4) -- (1); \path[draw,thick,->,>=latex] (3) -- (1); \end{tikzpicture} \end{center} The above graph contains a path from node $3$ to node $5$ through the edges $3 \rightarrow 1 \rightarrow 2 \rightarrow 5$, but there is no path from node $5$ to node $3$. \index{cycle} \index{acyclic graph} A \key{cycle} is a path whose first and last node is the same. For example, the above graph contains a cycle $1 \rightarrow 2 \rightarrow 4 \rightarrow 1$. If a graph does not contain any cycles, it is called \key{acyclic}. \subsubsection{Edge weights} \index{weighted graph} In a \key{weighted} graph, each edge is assigned a \key{weight}. Often, the weights are interpreted as edge lengths. For example, the following graph is weighted: \begin{center} \begin{tikzpicture}[scale=0.9] \node[draw, circle] (1) at (1,3) {$1$}; \node[draw, circle] (2) at (4,3) {$2$}; \node[draw, circle] (3) at (1,1) {$3$}; \node[draw, circle] (4) at (4,1) {$4$}; \node[draw, circle] (5) at (6,2) {$5$}; \path[draw,thick,-] (1) -- node[font=\small,label=above:5] {} (2); \path[draw,thick,-] (1) -- node[font=\small,label=left:1] {} (3); \path[draw,thick,-] (3) -- node[font=\small,label=below:7] {} (4); \path[draw,thick,-] (2) -- node[font=\small,label=left:6] {} (4); \path[draw,thick,-] (2) -- node[font=\small,label=above:7] {} (5); \path[draw,thick,-] (4) -- node[font=\small,label=below:3] {} (5); \end{tikzpicture} \end{center} The length of a path in a weighted graph is the sum of edge weights on the path. 
For example, in the above graph, the length of the path $1 \rightarrow 2 \rightarrow 5$ is $12$ and the length of the path $1 \rightarrow 3 \rightarrow 4 \rightarrow 5$ is $11$.
The latter path is the \key{shortest} path from node $1$ to node $5$.

\subsubsection{Neighbors and degrees}

\index{neighbor}
\index{degree}
Two nodes are \key{neighbors} or \key{adjacent} if there is an edge between them.
The \key{degree} of a node is the number of its neighbors.
For example, in the following graph, the neighbors of node 2 are 1, 4 and 5, so its degree is 3.
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (4,3) {$2$};
\node[draw, circle] (3) at (1,1) {$3$};
\node[draw, circle] (4) at (4,1) {$4$};
\node[draw, circle] (5) at (6,2) {$5$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (2) -- (4);
\path[draw,thick,-] (2) -- (5);
%\path[draw,thick,-] (4) -- (5);
\end{tikzpicture}
\end{center}
The sum of degrees in a graph is always $2m$, where $m$ is the number of edges, because each edge increases the degree of two nodes by one.
For this reason, the sum of degrees is always even.
\index{regular graph}
\index{complete graph}
A graph is \key{regular} if the degree of every node is a constant $d$.
A graph is \key{complete} if the degree of every node is $n-1$, i.e., the graph contains all possible edges between the nodes.
\index{indegree}
\index{outdegree}
In a directed graph, the \key{indegree} of a node is the number of edges that end at the node, and the \key{outdegree} of a node is the number of edges that start at the node.
For example, in the following graph, the indegree of node 2 is 2 and the outdegree of the node is 1.
\begin{center} \begin{tikzpicture}[scale=0.9] \node[draw, circle] (1) at (1,3) {$1$}; \node[draw, circle] (2) at (4,3) {$2$}; \node[draw, circle] (3) at (1,1) {$3$}; \node[draw, circle] (4) at (4,1) {$4$}; \node[draw, circle] (5) at (6,2) {$5$}; \path[draw,thick,->,>=latex] (1) -- (2); \path[draw,thick,->,>=latex] (1) -- (3); \path[draw,thick,->,>=latex] (1) -- (4); \path[draw,thick,->,>=latex] (3) -- (4); \path[draw,thick,->,>=latex] (2) -- (4); \path[draw,thick,<-,>=latex] (2) -- (5); \end{tikzpicture} \end{center} \subsubsection{Colorings} \index{coloring} \index{bipartite graph} In a \key{coloring} of a graph, each node is assigned a color so that no adjacent nodes have the same color. A graph is \key{bipartite} if it is possible to color it using two colors. It turns out that a graph is bipartite exactly when it does not contain a cycle with an odd number of edges. For example, the graph \begin{center} \begin{tikzpicture}[scale=0.9] \node[draw, circle] (1) at (1,3) {$2$}; \node[draw, circle] (2) at (4,3) {$3$}; \node[draw, circle] (3) at (1,1) {$5$}; \node[draw, circle] (4) at (4,1) {$6$}; \node[draw, circle] (5) at (-2,1) {$4$}; \node[draw, circle] (6) at (-2,3) {$1$}; \path[draw,thick,-] (1) -- (2); \path[draw,thick,-] (1) -- (3); \path[draw,thick,-] (3) -- (4); \path[draw,thick,-] (2) -- (4); \path[draw,thick,-] (3) -- (6); \path[draw,thick,-] (5) -- (6); \end{tikzpicture} \end{center} is bipartite, because it can be colored as follows: \begin{center} \begin{tikzpicture}[scale=0.9] \node[draw, circle, fill=blue!40] (1) at (1,3) {$2$}; \node[draw, circle, fill=red!40] (2) at (4,3) {$3$}; \node[draw, circle, fill=red!40] (3) at (1,1) {$5$}; \node[draw, circle, fill=blue!40] (4) at (4,1) {$6$}; \node[draw, circle, fill=red!40] (5) at (-2,1) {$4$}; \node[draw, circle, fill=blue!40] (6) at (-2,3) {$1$}; \path[draw,thick,-] (1) -- (2); \path[draw,thick,-] (1) -- (3); \path[draw,thick,-] (3) -- (4); \path[draw,thick,-] (2) -- (4); \path[draw,thick,-] (3) -- (6); 
\path[draw,thick,-] (5) -- (6);
\end{tikzpicture}
\end{center}
However, the following graph is not bipartite:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$2$};
\node[draw, circle] (2) at (4,3) {$3$};
\node[draw, circle] (3) at (1,1) {$5$};
\node[draw, circle] (4) at (4,1) {$6$};
\node[draw, circle] (5) at (-2,1) {$4$};
\node[draw, circle] (6) at (-2,3) {$1$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (2) -- (4);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (5) -- (6);
\path[draw,thick,-] (1) -- (6);
\end{tikzpicture}
\end{center}

\subsubsection{Simplicity}

\index{simple graph}
A graph is \key{simple} if no edge starts and ends at the same node, and there are no multiple edges between two nodes.
Often we assume that graphs are simple.
For example, the graph
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$2$};
\node[draw, circle] (2) at (4,3) {$3$};
\node[draw, circle] (3) at (1,1) {$5$};
\node[draw, circle] (4) at (4,1) {$6$};
\node[draw, circle] (5) at (-2,1) {$4$};
\node[draw, circle] (6) at (-2,3) {$1$};
\path[draw,thick,-] (1) edge [bend right=20] (2);
\path[draw,thick,-] (2) edge [bend right=20] (1);
%\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (3) -- (4);
\path[draw,thick,-] (2) -- (4);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (5) -- (6);
\tikzset{every loop/.style={in=135,out=190}}
\path[draw,thick,-] (5) edge [loop left] (5);
\end{tikzpicture}
\end{center}
is \emph{not} simple, because there is an edge that starts and ends at node 4, and there are two edges between nodes 2 and 3.

\section{Graph representation}

There are several ways to represent graphs in algorithms.
The choice of a data structure depends on the size of the graph and the way the algorithm processes it.
Next we will go through three possible representations.
\subsubsection{Adjacency list representation}

\index{adjacency list}
In the adjacency list representation, each node $x$ in the graph is assigned an \key{adjacency list} that consists of nodes to which there is an edge from $x$.
Adjacency lists are the most popular way to represent a graph, and most algorithms can be efficiently implemented using them.
A convenient way to store the adjacency lists is to declare an array of vectors as follows:
\begin{lstlisting}
vector<int> v[N];
\end{lstlisting}
The constant $N$ is chosen so that there is space for all adjacency lists.
For example, the graph
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (3,1) {$4$};
\path[draw,thick,->,>=latex] (1) -- (2);
\path[draw,thick,->,>=latex] (2) -- (3);
\path[draw,thick,->,>=latex] (2) -- (4);
\path[draw,thick,->,>=latex] (3) -- (4);
\path[draw,thick,->,>=latex] (4) -- (1);
\end{tikzpicture}
\end{center}
can be stored as follows:
\begin{lstlisting}
v[1].push_back(2);
v[2].push_back(3);
v[2].push_back(4);
v[3].push_back(4);
v[4].push_back(1);
\end{lstlisting}
If the graph is undirected, it can be stored in a similar way, but each edge is stored in both directions.
For a weighted graph, the structure can be extended as follows:
\begin{lstlisting}
vector<pair<int,int>> v[N];
\end{lstlisting}
If there is an edge from node $a$ to node $b$ with weight $w$, the adjacency list of node $a$ contains the pair $(b,w)$.
For example, the graph
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (3,1) {$4$};
\path[draw,thick,->,>=latex] (1) -- node[font=\small,label=above:5] {} (2);
\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=above:7] {} (3);
\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=left:6] {} (4);
\path[draw,thick,->,>=latex] (3) -- node[font=\small,label=right:5] {} (4);
\path[draw,thick,->,>=latex] (4) -- node[font=\small,label=left:2] {} (1);
\end{tikzpicture}
\end{center}
can be stored as follows:
\begin{lstlisting}
v[1].push_back({2,5});
v[2].push_back({3,7});
v[2].push_back({4,6});
v[3].push_back({4,5});
v[4].push_back({1,2});
\end{lstlisting}
The benefit of using adjacency lists is that we can efficiently find the nodes to which we can move from a certain node through an edge.
For example, the following loop goes through all nodes to which we can move from node $s$:
\begin{lstlisting}
for (auto u : v[s]) {
    // process node u
}
\end{lstlisting}

\subsubsection{Adjacency matrix representation}

\index{adjacency matrix}
An \key{adjacency matrix} is a two-dimensional array that indicates which edges exist in the graph.
We can efficiently check from an adjacency matrix if there is an edge between two nodes.
The matrix can be stored as an array
\begin{lstlisting}
int v[N][N];
\end{lstlisting}
where each value $\texttt{v}[a][b]$ indicates whether the graph contains an edge from node $a$ to node $b$.
If the edge is included in the graph, then $\texttt{v}[a][b]=1$, and otherwise $\texttt{v}[a][b]=0$.
For example, the graph
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (3,1) {$4$};
\path[draw,thick,->,>=latex] (1) -- (2);
\path[draw,thick,->,>=latex] (2) -- (3);
\path[draw,thick,->,>=latex] (2) -- (4);
\path[draw,thick,->,>=latex] (3) -- (4);
\path[draw,thick,->,>=latex] (4) -- (1);
\end{tikzpicture}
\end{center}
can be represented as follows:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (4,4);
\node at (0.5,0.5) {1};
\node at (1.5,0.5) {0};
\node at (2.5,0.5) {0};
\node at (3.5,0.5) {0};
\node at (0.5,1.5) {0};
\node at (1.5,1.5) {0};
\node at (2.5,1.5) {0};
\node at (3.5,1.5) {1};
\node at (0.5,2.5) {0};
\node at (1.5,2.5) {0};
\node at (2.5,2.5) {1};
\node at (3.5,2.5) {1};
\node at (0.5,3.5) {0};
\node at (1.5,3.5) {1};
\node at (2.5,3.5) {0};
\node at (3.5,3.5) {0};
\node at (-0.5,0.5) {4};
\node at (-0.5,1.5) {3};
\node at (-0.5,2.5) {2};
\node at (-0.5,3.5) {1};
\node at (0.5,4.5) {1};
\node at (1.5,4.5) {2};
\node at (2.5,4.5) {3};
\node at (3.5,4.5) {4};
\end{tikzpicture}
\end{center}
If the graph is weighted, the adjacency matrix representation can be extended so that the matrix contains the weight of the edge if the edge exists.
Using this representation, the graph \begin{center} \begin{tikzpicture}[scale=0.9] \node[draw, circle] (1) at (1,3) {$1$}; \node[draw, circle] (2) at (3,3) {$2$}; \node[draw, circle] (3) at (5,3) {$3$}; \node[draw, circle] (4) at (3,1) {$4$}; \path[draw,thick,->,>=latex] (1) -- node[font=\small,label=above:5] {} (2); \path[draw,thick,->,>=latex] (2) -- node[font=\small,label=above:7] {} (3); \path[draw,thick,->,>=latex] (2) -- node[font=\small,label=left:6] {} (4); \path[draw,thick,->,>=latex] (3) -- node[font=\small,label=right:5] {} (4); \path[draw,thick,->,>=latex] (4) -- node[font=\small,label=left:2] {} (1); \end{tikzpicture} \end{center} \begin{samepage} corresponds to the following matrix: \begin{center} \begin{tikzpicture}[scale=0.7] \draw (0,0) grid (4,4); \node at (0.5,0.5) {2}; \node at (1.5,0.5) {0}; \node at (2.5,0.5) {0}; \node at (3.5,0.5) {0}; \node at (0.5,1.5) {0}; \node at (1.5,1.5) {0}; \node at (2.5,1.5) {0}; \node at (3.5,1.5) {5}; \node at (0.5,2.5) {0}; \node at (1.5,2.5) {0}; \node at (2.5,2.5) {7}; \node at (3.5,2.5) {6}; \node at (0.5,3.5) {0}; \node at (1.5,3.5) {5}; \node at (2.5,3.5) {0}; \node at (3.5,3.5) {0}; \node at (-0.5,0.5) {4}; \node at (-0.5,1.5) {3}; \node at (-0.5,2.5) {2}; \node at (-0.5,3.5) {1}; \node at (0.5,4.5) {1}; \node at (1.5,4.5) {2}; \node at (2.5,4.5) {3}; \node at (3.5,4.5) {4}; \end{tikzpicture} \end{center} \end{samepage} The drawback in the adjacency matrix representation is that there are $n^2$ elements in the matrix and usually most of them are zero. For this reason, the representation cannot be used if the graph is large. \subsubsection{Edge list representation} \index{edge list} An \key{edge list} contains all edges of a graph in some order. This is a convenient way to represent a graph if the algorithm processes all edges of the graph, and it is not needed to find edges that start at a given node. 
The edge list can be stored in a vector
\begin{lstlisting}
vector<pair<int,int>> v;
\end{lstlisting}
where each pair $(a,b)$ denotes that there is an edge from node $a$ to node $b$.
Thus, the graph
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (3,1) {$4$};
\path[draw,thick,->,>=latex] (1) -- (2);
\path[draw,thick,->,>=latex] (2) -- (3);
\path[draw,thick,->,>=latex] (2) -- (4);
\path[draw,thick,->,>=latex] (3) -- (4);
\path[draw,thick,->,>=latex] (4) -- (1);
\end{tikzpicture}
\end{center}
can be represented as follows:
\begin{lstlisting}
v.push_back({1,2});
v.push_back({2,3});
v.push_back({2,4});
v.push_back({3,4});
v.push_back({4,1});
\end{lstlisting}

\noindent
If the graph is weighted, the structure can be extended as follows:
\begin{lstlisting}
vector<tuple<int,int,int>> v;
\end{lstlisting}
Each element in this list is of the form $(a,b,w)$, which means that there is an edge from node $a$ to node $b$ with weight $w$.
For example, the graph
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,3) {$1$};
\node[draw, circle] (2) at (3,3) {$2$};
\node[draw, circle] (3) at (5,3) {$3$};
\node[draw, circle] (4) at (3,1) {$4$};
\path[draw,thick,->,>=latex] (1) -- node[font=\small,label=above:5] {} (2);
\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=above:7] {} (3);
\path[draw,thick,->,>=latex] (2) -- node[font=\small,label=left:6] {} (4);
\path[draw,thick,->,>=latex] (3) -- node[font=\small,label=right:5] {} (4);
\path[draw,thick,->,>=latex] (4) -- node[font=\small,label=left:2] {} (1);
\end{tikzpicture}
\end{center}
\begin{samepage}
can be represented as follows:
\begin{lstlisting}
v.push_back(make_tuple(1,2,5));
v.push_back(make_tuple(2,3,7));
v.push_back(make_tuple(2,4,6));
v.push_back(make_tuple(3,4,5));
v.push_back(make_tuple(4,1,2));
\end{lstlisting}
\end{samepage}
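
The three representations describe the same graph, so one can be derived from another.
As a minimal sketch, the following code builds adjacency lists and an adjacency matrix from the edge list of the unweighted directed example graph used in this section.
The function names \texttt{toAdjacencyList}, \texttt{toAdjacencyMatrix} and \texttt{exampleEdges} are hypothetical helpers introduced only for illustration, and a vector of vectors is assumed in place of the fixed-size arrays so that no constant $N$ has to be chosen.
\begin{lstlisting}
#include <utility>
#include <vector>
using namespace std;

// Hypothetical helper: builds adjacency lists for nodes 1..n
// from a directed edge list. Index 0 is left unused so that
// the indices match the node numbers used in the text.
vector<vector<int>> toAdjacencyList(
        int n, const vector<pair<int,int>>& edges) {
    vector<vector<int>> adj(n + 1);
    for (auto e : edges) {
        adj[e.first].push_back(e.second);
    }
    return adj;
}

// Hypothetical helper: builds an adjacency matrix where
// mat[a][b] is 1 if there is an edge from a to b, and 0 otherwise.
vector<vector<int>> toAdjacencyMatrix(
        int n, const vector<pair<int,int>>& edges) {
    vector<vector<int>> mat(n + 1, vector<int>(n + 1, 0));
    for (auto e : edges) {
        mat[e.first][e.second] = 1;
    }
    return mat;
}

// Edge list of the directed example graph with 4 nodes.
vector<pair<int,int>> exampleEdges() {
    return {{1,2},{2,3},{2,4},{3,4},{4,1}};
}
\end{lstlisting}
For instance, \texttt{toAdjacencyList(4, exampleEdges())[2]} contains the nodes $3$ and $4$, which matches the adjacency list of node $2$ given earlier.
Building the lists takes time proportional to $n+m$, while building the matrix takes time proportional to $n^2$, which reflects the memory drawback of the matrix representation.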