cphb/luku17.tex

\chapter{Strongly connectivity}

\index{strongly connected graph}

In a directed graph, the directions of
the edges restrict possible paths in the graph,
so even if the graph is connected,
this doesn't guarantee that there would be
a path between any two nodes.
Thus, it is meaningful to define a new concept
for directed graphs that requires more than connectivity.

A graph is \key{strongly connected}
if there is a path from any node to all
other nodes in the graph.
For example, in the following picture,
the left graph is strongly connected,
while the right graph is not.

\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (1,1) {$1$};
\node[draw, circle] (2) at (3,1) {$2$};
\node[draw, circle] (3) at (1,-1) {$3$};
\node[draw, circle] (4) at (3,-1) {$4$};

\path[draw,thick,->] (1) -- (2);
\path[draw,thick,->] (2) -- (4);
\path[draw,thick,->] (4) -- (3);
\path[draw,thick,->] (3) -- (1);

\node[draw, circle] (1b) at (6,1) {$1$};
\node[draw, circle] (2b) at (8,1) {$2$};
\node[draw, circle] (3b) at (6,-1) {$3$};
\node[draw, circle] (4b) at (8,-1) {$4$};

\path[draw,thick,->] (1b) -- (2b);
\path[draw,thick,->] (2b) -- (4b);
\path[draw,thick,->] (4b) -- (3b);
\path[draw,thick,->] (1b) -- (3b);
\end{tikzpicture}
\end{center}

The right graph is not strongly connected
because, for example, there is no path
from node 2 to node 1.

\index{strongly connected component}
\index{component graph}

The \key{strongly connected components}
of a graph divide the graph into strongly connected
subgraphs that are as large as possible.
The strongly connected components form an
acyclic \key{component graph} that represents
the deep structure of the original graph.

For example, for the graph
\begin{center}
\begin{tikzpicture}[scale=0.9,label distance=-2mm]
\node[draw, circle] (1) at (-1,1) {$7$};
\node[draw, circle] (2) at (-3,2) {$3$};
\node[draw, circle] (4) at (-5,2) {$2$};
\node[draw, circle] (6) at (-7,2) {$1$};
\node[draw, circle] (3) at (-3,0) {$6$};
\node[draw, circle] (5) at (-5,0) {$5$};
\node[draw, circle] (7) at (-7,0) {$4$};

\path[draw,thick,->] (2) -- (1);
\path[draw,thick,->] (1) -- (3);
\path[draw,thick,->] (3) -- (2);
\path[draw,thick,->] (2) -- (4);
\path[draw,thick,->] (3) -- (5);
\path[draw,thick,->] (4) edge [bend left] (6);
\path[draw,thick,->] (6) edge [bend left] (4);
\path[draw,thick,->] (4) -- (5);
\path[draw,thick,->] (5) -- (7);
\path[draw,thick,->] (6) -- (7);
\end{tikzpicture}
\end{center}
the strongly connected components are as follows:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (-1,1) {$7$};
\node[draw, circle] (2) at (-3,2) {$3$};
\node[draw, circle] (4) at (-5,2) {$2$};
\node[draw, circle] (6) at (-7,2) {$1$};
\node[draw, circle] (3) at (-3,0) {$6$};
\node[draw, circle] (5) at (-5,0) {$5$};
\node[draw, circle] (7) at (-7,0) {$4$};

\path[draw,thick,->] (2) -- (1);
\path[draw,thick,->] (1) -- (3);
\path[draw,thick,->] (3) -- (2);
\path[draw,thick,->] (2) -- (4);
\path[draw,thick,->] (3) -- (5);
\path[draw,thick,->] (4) edge [bend left] (6);
\path[draw,thick,->] (6) edge [bend left] (4);
\path[draw,thick,->] (4) -- (5);
\path[draw,thick,->] (5) -- (7);
\path[draw,thick,->] (6) -- (7);

\draw [red,thick,dashed,line width=2pt] (-0.5,2.5) rectangle (-3.5,-0.5);
\draw [red,thick,dashed,line width=2pt] (-4.5,2.5) rectangle (-7.5,1.5);
\draw [red,thick,dashed,line width=2pt] (-4.5,0.5) rectangle (-5.5,-0.5);
\draw [red,thick,dashed,line width=2pt] (-6.5,0.5) rectangle (-7.5,-0.5);
\end{tikzpicture}
\end{center}
The corresponding component graph is as follows:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (-3,1) {$B$};
\node[draw, circle] (2) at (-6,2) {$A$};
\node[draw, circle] (3) at (-5,0) {$D$};
\node[draw, circle] (4) at (-7,0) {$C$};

\path[draw,thick,->] (1) -- (2);
\path[draw,thick,->] (1) -- (3);
\path[draw,thick,->] (2) -- (3);
\path[draw,thick,->] (2) -- (4);
\path[draw,thick,->] (3) -- (4);
\end{tikzpicture}
\end{center}
The components are $A=\{1,2\}$,
$B=\{3,6,7\}$, $C=\{4\}$ and $D=\{5\}$.

A component graph is an acyclic, directed graph,
so it is easier to process than the original
graph because it doesn't contain cycles.
Thus, as in Chapter 16, it is possible to
construct a topological sort for a component
graph and also use dynamic programming algorithms.

\section{Kosaraju's algorithm}

\index{Kosaraju's algorithm}

\key{Kosaraju's algorithm} is an efficient
method for finding the strongly connected components
of a directed graph.
It performs two depth-first searches:
the first search constructs a list of nodes
according to the structure of the graph,
and the second search forms the strongly connected components.

\subsubsection{Search 1}

The first phase of the algorithm constructs
a list of nodes in the order in which a
depth-first search processes them.
The algorithm goes through the nodes,
and begins a depth-first search at each
unprocessed node.
Each node will be added to the list
after it has been processed.

In the example graph the nodes are processed
in the following order:
\begin{center}
\begin{tikzpicture}[scale=0.9,label distance=-2mm]
\node[draw, circle] (1) at (-1,1) {$7$};
\node[draw, circle] (2) at (-3,2) {$3$};
\node[draw, circle] (4) at (-5,2) {$2$};
\node[draw, circle] (6) at (-7,2) {$1$};
\node[draw, circle] (3) at (-3,0) {$6$};
\node[draw, circle] (5) at (-5,0) {$5$};
\node[draw, circle] (7) at (-7,0) {$4$};

\node at (-7,2.75) {$1/8$};
\node at (-5,2.75) {$2/7$};
\node at (-3,2.75) {$9/14$};
\node at (-7,-0.75) {$4/5$};
\node at (-5,-0.75) {$3/6$};
\node at (-3,-0.75) {$11/12$};
\node at (-1,1.75) {$10/13$};

\path[draw,thick,->] (2) -- (1);
\path[draw,thick,->] (1) -- (3);
\path[draw,thick,->] (3) -- (2);
\path[draw,thick,->] (2) -- (4);
\path[draw,thick,->] (3) -- (5);
\path[draw,thick,->] (4) edge [bend left] (6);
\path[draw,thick,->] (6) edge [bend left] (4);
\path[draw,thick,->] (4) -- (5);
\path[draw,thick,->] (5) -- (7);
\path[draw,thick,->] (6) -- (7);
\end{tikzpicture}
\end{center}

The notation $x/y$ means that
processing the node started at moment $x$
and ended at moment $y$.
When the nodes are sorted according to
ending times, the result is the following list:

\begin{tabular}{ll}
\\
node & ending time \\
\hline
4 & 5 \\
5 & 6 \\
2 & 7 \\
1 & 8 \\
6 & 12 \\
7 & 13 \\
3 & 14 \\
\\
\end{tabular}

In the second phase of the algorithm,
the nodes will be processed
in reverse order: $[3,7,6,1,2,5,4]$.

\subsubsection{Search 2}

The second phase of the algorithm
forms the strongly connected components
of the graph.
First, the algorithm reverses every
edge in the graph.
This ensures that during the second search,
we will always find a strongly connected
component without extra nodes.

The example graph becomes as follows
after reversing the edges:
\begin{center}
\begin{tikzpicture}[scale=0.9,label distance=-2mm]
\node[draw, circle] (1) at (-1,1) {$7$};
\node[draw, circle] (2) at (-3,2) {$3$};
\node[draw, circle] (4) at (-5,2) {$2$};
\node[draw, circle] (6) at (-7,2) {$1$};
\node[draw, circle] (3) at (-3,0) {$6$};
\node[draw, circle] (5) at (-5,0) {$5$};
\node[draw, circle] (7) at (-7,0) {$4$};

\path[draw,thick,<-] (2) -- (1);
\path[draw,thick,<-] (1) -- (3);
\path[draw,thick,<-] (3) -- (2);
\path[draw,thick,<-] (2) -- (4);
\path[draw,thick,<-] (3) -- (5);
\path[draw,thick,<-] (4) edge [bend left] (6);
\path[draw,thick,<-] (6) edge [bend left] (4);
\path[draw,thick,<-] (4) -- (5);
\path[draw,thick,<-] (5) -- (7);
\path[draw,thick,<-] (6) -- (7);
\end{tikzpicture}
\end{center}

After this, the algorithm goes through the nodes
in the order defined by the first search.
If a node doesn't belong to a component,
the algorithm creates a new component
and begins a depth-first search
where all new nodes found during the search
are added to the new component.

In the example graph, the first component
begins at node 3:

\begin{center}
\begin{tikzpicture}[scale=0.9,label distance=-2mm]
\node[draw, circle] (1) at (-1,1) {$7$};
\node[draw, circle] (2) at (-3,2) {$3$};
\node[draw, circle] (4) at (-5,2) {$2$};
\node[draw, circle] (6) at (-7,2) {$1$};
\node[draw, circle] (3) at (-3,0) {$6$};
\node[draw, circle] (5) at (-5,0) {$5$};
\node[draw, circle] (7) at (-7,0) {$4$};

\path[draw,thick,<-] (2) -- (1);
\path[draw,thick,<-] (1) -- (3);
\path[draw,thick,<-] (3) -- (2);
\path[draw,thick,<-] (2) -- (4);
\path[draw,thick,<-] (3) -- (5);
\path[draw,thick,<-] (4) edge [bend left] (6);
\path[draw,thick,<-] (6) edge [bend left] (4);
\path[draw,thick,<-] (4) -- (5);
\path[draw,thick,<-] (5) -- (7);
\path[draw,thick,<-] (6) -- (7);

\draw [red,thick,dashed,line width=2pt] (-0.5,2.5) rectangle (-3.5,-0.5);
%\draw [red,thick,dashed,line width=2pt] (-4.5,2.5) rectangle (-7.5,1.5);
%\draw [red,thick,dashed,line width=2pt] (-4.5,0.5) rectangle (-5.5,-0.5);
%\draw [red,thick,dashed,line width=2pt] (-6.5,0.5) rectangle (-7.5,-0.5);
\end{tikzpicture}
\end{center}

Note that since we reversed all edges in the graph,
the component doesn't ''leak'' to other parts in the graph.

\begin{samepage}
The next nodes in the list are nodes 7 and 6,
but they already belong to a component.
The next new component begins at node 1:

\begin{center}
\begin{tikzpicture}[scale=0.9,label distance=-2mm]
\node[draw, circle] (1) at (-1,1) {$7$};
\node[draw, circle] (2) at (-3,2) {$3$};
\node[draw, circle] (4) at (-5,2) {$2$};
\node[draw, circle] (6) at (-7,2) {$1$};
\node[draw, circle] (3) at (-3,0) {$6$};
\node[draw, circle] (5) at (-5,0) {$5$};
\node[draw, circle] (7) at (-7,0) {$4$};

\path[draw,thick,<-] (2) -- (1);
\path[draw,thick,<-] (1) -- (3);
\path[draw,thick,<-] (3) -- (2);
\path[draw,thick,<-] (2) -- (4);
\path[draw,thick,<-] (3) -- (5);
\path[draw,thick,<-] (4) edge [bend left] (6);
\path[draw,thick,<-] (6) edge [bend left] (4);
\path[draw,thick,<-] (4) -- (5);
\path[draw,thick,<-] (5) -- (7);
\path[draw,thick,<-] (6) -- (7);

\draw [red,thick,dashed,line width=2pt] (-0.5,2.5) rectangle (-3.5,-0.5);
\draw [red,thick,dashed,line width=2pt] (-4.5,2.5) rectangle (-7.5,1.5);
%\draw [red,thick,dashed,line width=2pt] (-4.5,0.5) rectangle (-5.5,-0.5);
%\draw [red,thick,dashed,line width=2pt] (-6.5,0.5) rectangle (-7.5,-0.5);
\end{tikzpicture}
\end{center}
\end{samepage}

Finally, the algorithm processes nodes 5 and 5
that create the remaining strongy connected components:

\begin{center}
\begin{tikzpicture}[scale=0.9,label distance=-2mm]
\node[draw, circle] (1) at (-1,1) {$7$};
\node[draw, circle] (2) at (-3,2) {$3$};
\node[draw, circle] (4) at (-5,2) {$2$};
\node[draw, circle] (6) at (-7,2) {$1$};
\node[draw, circle] (3) at (-3,0) {$6$};
\node[draw, circle] (5) at (-5,0) {$5$};
\node[draw, circle] (7) at (-7,0) {$4$};

\path[draw,thick,<-] (2) -- (1);
\path[draw,thick,<-] (1) -- (3);
\path[draw,thick,<-] (3) -- (2);
\path[draw,thick,<-] (2) -- (4);
\path[draw,thick,<-] (3) -- (5);
\path[draw,thick,<-] (4) edge [bend left] (6);
\path[draw,thick,<-] (6) edge [bend left] (4);
\path[draw,thick,<-] (4) -- (5);
\path[draw,thick,<-] (5) -- (7);
\path[draw,thick,<-] (6) -- (7);

\draw [red,thick,dashed,line width=2pt] (-0.5,2.5) rectangle (-3.5,-0.5);
\draw [red,thick,dashed,line width=2pt] (-4.5,2.5) rectangle (-7.5,1.5);
\draw [red,thick,dashed,line width=2pt] (-4.5,0.5) rectangle (-5.5,-0.5);
\draw [red,thick,dashed,line width=2pt] (-6.5,0.5) rectangle (-7.5,-0.5);
\end{tikzpicture}
\end{center}

The time complexity of the algorithm is $O(n+m)$
where $n$ is the number of nodes and $m$
is the number of edges.
The reason for this is that the algorithm
performs two depth-first searches and
each search takes $O(n+m)$ time.

\section{2SAT problem}

\index{2SAT problem}

Strongly connectivity is also linked with the
\key{2SAT problem}.
In this problem, we are given a logical formula
\[
(a_1 \lor b_1) \land (a_2 \lor b_2) \land \cdots \land (a_m \lor b_m),
\]
where each $a_i$ and $b_i$ is either a logical variable
($x_1,x_2,\ldots,x_n$)
or a negation of a logical variable
($\lnot x_1, \lnot x_2, \ldots, \lnot x_n$).
The symbols ''$\land$'' and ''$\lor$'' denote
logical operators ''and'' and ''or''.
Our task is to assign each variable a value
so that the formula is true or state
that it is not possible.

For example, the formula
\[
L_1 = (x_2 \lor \lnot x_1) \land
      (\lnot x_1 \lor \lnot x_2) \land
      (x_1 \lor x_3) \land
      (\lnot x_2 \lor \lnot x_3) \land
      (x_1 \lor x_4)
\]
is true when $x_1$ and $x_2$ are false
and $x_3$ and $x_4$ are true.
However, the formula
\[
L_2 = (x_1 \lor x_2) \land
      (x_1 \lor \lnot x_2) \land
      (\lnot x_1 \lor x_3) \land
      (\lnot x_1 \lor \lnot x_3)
\]
is always false.
The reason for this is that we can't
choose a value for variable $x_1$
without creating a contradiction.
If $x_1$ is false, both $x_2$ and $\lnot x_2$
should hold which is impossible,
and if $x_1$ is true, both $x_3$ and $\lnot x_3$
should hold which is impossible as well.

The 2SAT problem can be represented as a graph
where the nodes correspond to
variables $x_i$ and negations $\lnot x_i$,
and the edges determine the connections
between the variables.
Each pair $(a_i \lor b_i)$ generates two edges:
$\lnot a_i \to b_i$ and $\lnot b_i \to a_i$.
This means that if $a_i$ doesn't hold,
$b_i$ must hold, and vice versa.

The graph for formula $L_1$ is:
\\
\begin{center}
\begin{tikzpicture}[scale=1.0,minimum size=2pt]
\node[draw, circle, inner sep=1.3pt] (1) at (1,2) {$\lnot x_3$};
\node[draw, circle] (2) at (3,2) {$x_2$};
\node[draw, circle, inner sep=1.3pt] (3) at (1,0) {$\lnot x_4$};
\node[draw, circle] (4) at (3,0) {$x_1$};
\node[draw, circle, inner sep=1.3pt] (5) at (5,2) {$\lnot x_1$};
\node[draw, circle] (6) at (7,2) {$x_4$};
\node[draw, circle, inner sep=1.3pt] (7) at (5,0) {$\lnot x_2$};
\node[draw, circle] (8) at (7,0) {$x_3$};

\path[draw,thick,->] (1) -- (4);
\path[draw,thick,->] (4) -- (2);
\path[draw,thick,->] (2) -- (1);
\path[draw,thick,->] (3) -- (4);
\path[draw,thick,->] (2) -- (5);
\path[draw,thick,->] (4) -- (7);
\path[draw,thick,->] (5) -- (6);
\path[draw,thick,->] (5) -- (8);
\path[draw,thick,->] (8) -- (7);
\path[draw,thick,->] (7) -- (5);
\end{tikzpicture}
\end{center}
And the graph for formula $L_2$ is:
\\
\begin{center}
\begin{tikzpicture}[scale=1.0,minimum size=2pt]
\node[draw, circle] (1) at (1,2) {$x_3$};
\node[draw, circle] (2) at (3,2) {$x_2$};
\node[draw, circle, inner sep=1.3pt] (3) at (5,2) {$\lnot x_2$};
\node[draw, circle, inner sep=1.3pt] (4) at (7,2) {$\lnot x_3$};
\node[draw, circle, inner sep=1.3pt] (5) at (4,3.5) {$\lnot x_1$};
\node[draw, circle] (6) at (4,0.5) {$x_1$};

\path[draw,thick,->] (1) -- (5);
\path[draw,thick,->] (4) -- (5);
\path[draw,thick,->] (6) -- (1);
\path[draw,thick,->] (6) -- (4);
\path[draw,thick,->] (5) -- (2);
\path[draw,thick,->] (5) -- (3);
\path[draw,thick,->] (2) -- (6);
\path[draw,thick,->] (3) -- (6);
\end{tikzpicture}
\end{center}

The structure of the graph indicates whether
the corresponding 2SAT problem can be solved.
If there is a variable $x_i$ such that
both $x_i$ and $\lnot x_i$ belong to the
same strongly connected component,
then there are no solutions.
In this case, the graph contains
a path from $x_i$ to $\lnot x_i$,
and also a path from $\lnot x_i$ to $x_i$,
so both $x_i$ and $\lnot x_i$ should hold
which is not possible.
However, if the graph doesn't contain
such a variable, then there is always a solution.

In the graph of formula $L_1$
no nodes $x_i$ and $\lnot x_i$
belong to the same strongly connected component,
so there is a solution.
In the graph of formula $L_2$
all nodes belong to the same strongly connected component,
so there are no solutions.

If a solution exists, the values for the variables
can be found by processing the nodes of the
component graph in a reverse topological sort order.
At each step, we process and remove a component
that doesn't contain edges that lead to the
remaining components.
If the variables in the component don't have values,
their values will be determined
according to the component,
and if they already have values,
they are not changed.
The process continues until all variables
have been assigned a value.

The component graph for formula $L_1$ is as follows:
\begin{center}
\begin{tikzpicture}[scale=1.0]
\node[draw, circle] (1) at (0,0) {$A$};
\node[draw, circle] (2) at (2,0) {$B$};
\node[draw, circle] (3) at (4,0) {$C$};
\node[draw, circle] (4) at (6,0) {$D$};

\path[draw,thick,->] (1) -- (2);
\path[draw,thick,->] (2) -- (3);
\path[draw,thick,->] (3) -- (4);
\end{tikzpicture}
\end{center}

The components are
$A = \{\lnot x_4\}$,
$B = \{x_1, x_2, \lnot x_3\}$,
$C = \{\lnot x_1, \lnot x_2, x_3\}$ and
$D = \{x_4\}$.
When constructing the solution,
we first process component $D$
where $x_4$ becomes true.
After this, we process component $C$
where $x_1$ and $x_2$ become false
and $x_3$ becomes true.
All variables have been assigned a value,
so the remaining components $A$ and $B$
don't change the variables anymore.

Note that this method works because the
structure of the graph is special.
If there are paths from node $x_i$ to node $x_j$
and from node $x_j$ to node $\lnot x_j$,
then node $x_i$ never becomes true.
The reason for this is that there is also
a path from node $\lnot x_j$ to node $\lnot x_i$,
and both $x_i$ and $x_j$ become false.

\index{3SAT problem}

A more difficult problem is the \key{3SAT problem}
where each part of the formula is of the form
$(a_i \lor b_i \lor c_i)$.
No efficient algorithm for solving this problem
is known, but it is a NP-hard problem.