cphb/chapter18.tex

898 lines
26 KiB
TeX
Raw Normal View History

2016-12-28 23:54:51 +01:00
\chapter{Tree queries}
2017-01-09 19:32:38 +01:00
\index{tree query}
2016-12-28 23:54:51 +01:00
2017-01-09 19:32:38 +01:00
This chapter discusses techniques for
2017-02-18 13:29:38 +01:00
processing queries related
2017-02-06 22:17:38 +01:00
to subtrees and paths of a rooted tree.
2017-02-18 13:29:38 +01:00
For example, such queries are:
2016-12-28 23:54:51 +01:00
\begin{itemize}
2017-02-06 22:17:38 +01:00
\item what is the $k$th ancestor of a node?
\item what is the sum of values in the subtree of a node?
\item what is the sum of values in a path between two nodes?
\item what is the lowest common ancestor of two nodes?
2016-12-28 23:54:51 +01:00
\end{itemize}
2017-01-09 19:32:38 +01:00
\section{Finding ancestors}
2016-12-28 23:54:51 +01:00
2017-02-06 22:17:38 +01:00
\index{ancestor}
The $k$th \key{ancestor} of a node $x$ in a rooted tree
is the node that we will reach if we move $k$
levels up from $x$.
Let $f(x,k)$ denote the $k$th ancestor of $x$.
2017-01-09 19:32:38 +01:00
For example, in the following tree, $f(2,1)=1$ and $f(8,2)=4$.
2016-12-28 23:54:51 +01:00
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (2,1) {$2$};
\node[draw, circle] (3) at (-2,1) {$4$};
\node[draw, circle] (4) at (0,1) {$5$};
\node[draw, circle] (5) at (2,-1) {$6$};
\node[draw, circle] (6) at (-3,-1) {$3$};
\node[draw, circle] (7) at (-1,-1) {$7$};
\node[draw, circle] (8) at (-1,-3) {$8$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);
\path[draw,thick,-] (7) -- (8);
\path[draw=red,thick,->,line width=2pt] (8) edge [bend left] (3);
\path[draw=red,thick,->,line width=2pt] (2) edge [bend right] (1);
\end{tikzpicture}
\end{center}
2017-02-06 22:17:38 +01:00
An easy way to calculate the value of $f(x,k)$
is to perform a sequence of $k$ moves in the tree.
2017-01-09 19:32:38 +01:00
However, the time complexity of this method
2017-02-06 22:17:38 +01:00
is $O(n)$, because the tree may contain
a chain of $O(n)$ nodes.
2017-01-09 19:32:38 +01:00
2017-02-06 22:17:38 +01:00
Fortunately, it turns out that
using a technique similar to that
used in Chapter 16.3, any value of $f(x,k)$
can be efficiently calculated in $O(\log k)$ time
2017-01-09 19:32:38 +01:00
after preprocessing.
The idea is to precalculate all values $f(x,k)$
where $k$ is a power of two.
2017-02-18 13:29:38 +01:00
For example, the values for the above tree
2017-01-09 19:32:38 +01:00
are as follows:
2016-12-28 23:54:51 +01:00
\begin{center}
\begin{tabular}{r|rrrrrrrrr}
$x$ & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\
\hline
$f(x,1)$ & 0 & 1 & 4 & 1 & 1 & 2 & 4 & 7 \\
$f(x,2)$ & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 4 \\
$f(x,4)$ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
$\cdots$ \\
\end{tabular}
\end{center}
2017-01-09 19:32:38 +01:00
The value $0$ means that the $k$th ancestor
2017-02-06 22:17:38 +01:00
of a node does not exist.
2016-12-28 23:54:51 +01:00
2017-02-06 22:17:38 +01:00
The preprocessing takes $O(n \log n)$ time,
2017-01-09 19:32:38 +01:00
because each node can have at most $n$ ancestors.
2017-02-06 22:17:38 +01:00
After this, any value of $f(x,k)$ can be calculated
in $O(\log k)$ time by representing $k$
2017-01-09 19:32:38 +01:00
as a sum where each term is a power of two.
2016-12-28 23:54:51 +01:00
2017-01-09 19:32:38 +01:00
\section{Subtrees and paths}
2016-12-28 23:54:51 +01:00
2017-02-21 17:10:08 +01:00
\index{tree traversal array}
2016-12-28 23:54:51 +01:00
2017-02-21 17:10:08 +01:00
A \key{tree traversal array} contains the nodes of a rooted tree
2017-01-09 19:32:38 +01:00
in the order in which a depth-first search
from the root node visits them.
For example, in the tree
2016-12-28 23:54:51 +01:00
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (-3,1) {$2$};
\node[draw, circle] (3) at (-1,1) {$3$};
\node[draw, circle] (4) at (1,1) {$4$};
\node[draw, circle] (5) at (3,1) {$5$};
\node[draw, circle] (6) at (-3,-1) {$6$};
\node[draw, circle] (7) at (-0.5,-1) {$7$};
\node[draw, circle] (8) at (1,-1) {$8$};
\node[draw, circle] (9) at (2.5,-1) {$9$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (1) -- (5);
\path[draw,thick,-] (2) -- (6);
\path[draw,thick,-] (4) -- (7);
\path[draw,thick,-] (4) -- (8);
\path[draw,thick,-] (4) -- (9);
\end{tikzpicture}
\end{center}
2017-01-09 19:32:38 +01:00
a depth-first search proceeds as follows:
2016-12-28 23:54:51 +01:00
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (-3,1) {$2$};
\node[draw, circle] (3) at (-1,1) {$3$};
\node[draw, circle] (4) at (1,1) {$4$};
\node[draw, circle] (5) at (3,1) {$5$};
\node[draw, circle] (6) at (-3,-1) {$6$};
\node[draw, circle] (7) at (-0.5,-1) {$7$};
\node[draw, circle] (8) at (1,-1) {$8$};
\node[draw, circle] (9) at (2.5,-1) {$9$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (1) -- (5);
\path[draw,thick,-] (2) -- (6);
\path[draw,thick,-] (4) -- (7);
\path[draw,thick,-] (4) -- (8);
\path[draw,thick,-] (4) -- (9);
\path[draw=red,thick,->,line width=2pt] (1) edge [bend right=15] (2);
\path[draw=red,thick,->,line width=2pt] (2) edge [bend right=15] (6);
\path[draw=red,thick,->,line width=2pt] (6) edge [bend right=15] (2);
\path[draw=red,thick,->,line width=2pt] (2) edge [bend right=15] (1);
\path[draw=red,thick,->,line width=2pt] (1) edge [bend right=15] (3);
\path[draw=red,thick,->,line width=2pt] (3) edge [bend right=15] (1);
\path[draw=red,thick,->,line width=2pt] (1) edge [bend right=15] (4);
\path[draw=red,thick,->,line width=2pt] (4) edge [bend right=15] (7);
\path[draw=red,thick,->,line width=2pt] (7) edge [bend right=15] (4);
\path[draw=red,thick,->,line width=2pt] (4) edge [bend right=15] (8);
\path[draw=red,thick,->,line width=2pt] (8) edge [bend right=15] (4);
\path[draw=red,thick,->,line width=2pt] (4) edge [bend right=15] (9);
\path[draw=red,thick,->,line width=2pt] (9) edge [bend right=15] (4);
\path[draw=red,thick,->,line width=2pt] (4) edge [bend right=15] (1);
\path[draw=red,thick,->,line width=2pt] (1) edge [bend right=15] (5);
\path[draw=red,thick,->,line width=2pt] (5) edge [bend right=15] (1);
\end{tikzpicture}
\end{center}
2017-02-21 17:10:08 +01:00
Hence, the corresponding tree traversal array is as follows:
2016-12-28 23:54:51 +01:00
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (9,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$6$};
\node at (3.5,0.5) {$3$};
\node at (4.5,0.5) {$4$};
\node at (5.5,0.5) {$7$};
\node at (6.5,0.5) {$8$};
\node at (7.5,0.5) {$9$};
\node at (8.5,0.5) {$5$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\node at (8.5,1.4) {$9$};
\end{tikzpicture}
\end{center}
2017-01-09 19:32:38 +01:00
\subsubsection{Subtree queries}
2016-12-28 23:54:51 +01:00
2017-01-09 19:32:38 +01:00
Each subtree of a tree corresponds to a subarray
2017-02-21 17:10:08 +01:00
in the tree traversal array such that
the first element in the subarray is the root node.
2017-01-09 19:32:38 +01:00
For example, the following subarray contains the
nodes in the subtree of node $4$:
2016-12-28 23:54:51 +01:00
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (4,0) rectangle (8,1);
\draw (0,0) grid (9,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$6$};
\node at (3.5,0.5) {$3$};
\node at (4.5,0.5) {$4$};
\node at (5.5,0.5) {$7$};
\node at (6.5,0.5) {$8$};
\node at (7.5,0.5) {$9$};
\node at (8.5,0.5) {$5$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\node at (8.5,1.4) {$9$};
\end{tikzpicture}
\end{center}
2017-01-09 19:32:38 +01:00
Using this fact, we can efficiently process queries
2017-02-06 22:17:38 +01:00
that are related to subtrees of a tree.
2017-01-09 19:32:38 +01:00
As an example, consider a problem where each node
is assigned a value, and our task is to support
the following queries:
2016-12-28 23:54:51 +01:00
\begin{itemize}
2017-02-06 22:17:38 +01:00
\item update the value of a node
\item calculate the sum of values in the subtree of a node
2016-12-28 23:54:51 +01:00
\end{itemize}
2017-02-06 22:17:38 +01:00
Consider the following tree where the blue numbers
are the values of the nodes.
For example, the sum of the subtree of node $4$
2017-01-09 19:32:38 +01:00
is $3+4+3+1=11$.
2016-12-28 23:54:51 +01:00
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (-3,1) {$2$};
\node[draw, circle] (3) at (-1,1) {$3$};
\node[draw, circle] (4) at (1,1) {$4$};
\node[draw, circle] (5) at (3,1) {$5$};
\node[draw, circle] (6) at (-3,-1) {$6$};
\node[draw, circle] (7) at (-0.5,-1) {$7$};
\node[draw, circle] (8) at (1,-1) {$8$};
\node[draw, circle] (9) at (2.5,-1) {$9$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (1) -- (5);
\path[draw,thick,-] (2) -- (6);
\path[draw,thick,-] (4) -- (7);
\path[draw,thick,-] (4) -- (8);
\path[draw,thick,-] (4) -- (9);
\node[color=blue] at (0,3+0.65) {2};
\node[color=blue] at (-3-0.65,1) {3};
\node[color=blue] at (-1-0.65,1) {5};
\node[color=blue] at (1+0.65,1) {3};
\node[color=blue] at (3+0.65,1) {1};
\node[color=blue] at (-3,-1-0.65) {4};
\node[color=blue] at (-0.5,-1-0.65) {4};
\node[color=blue] at (1,-1-0.65) {3};
\node[color=blue] at (2.5,-1-0.65) {1};
\end{tikzpicture}
\end{center}
2017-02-21 17:10:08 +01:00
The idea is to construct a tree traversal array that contains
2017-02-06 22:17:38 +01:00
three values for each node: (1) the identifier of the node,
(2) the size of the subtree, and (3) the value of the node.
2017-01-09 19:32:38 +01:00
For example, the array for the above tree is as follows:
2016-12-28 23:54:51 +01:00
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,1) grid (9,-2);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$6$};
\node at (3.5,0.5) {$3$};
\node at (4.5,0.5) {$4$};
\node at (5.5,0.5) {$7$};
\node at (6.5,0.5) {$8$};
\node at (7.5,0.5) {$9$};
\node at (8.5,0.5) {$5$};
\node at (0.5,-0.5) {$9$};
\node at (1.5,-0.5) {$2$};
\node at (2.5,-0.5) {$1$};
\node at (3.5,-0.5) {$1$};
\node at (4.5,-0.5) {$4$};
\node at (5.5,-0.5) {$1$};
\node at (6.5,-0.5) {$1$};
\node at (7.5,-0.5) {$1$};
\node at (8.5,-0.5) {$1$};
\node at (0.5,-1.5) {$2$};
\node at (1.5,-1.5) {$3$};
\node at (2.5,-1.5) {$4$};
\node at (3.5,-1.5) {$5$};
\node at (4.5,-1.5) {$3$};
\node at (5.5,-1.5) {$4$};
\node at (6.5,-1.5) {$3$};
\node at (7.5,-1.5) {$1$};
\node at (8.5,-1.5) {$1$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\node at (8.5,1.4) {$9$};
\end{tikzpicture}
\end{center}
2017-02-06 22:17:38 +01:00
Using this array, we can calculate the sum of values
in any subtree by first finding out the size of the subtree
2017-01-09 19:32:38 +01:00
and then the values of the corresponding nodes.
For example, the values in the subtree of node $4$
can be found as follows:
2016-12-28 23:54:51 +01:00
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (4,1) rectangle (5,0);
\fill[color=lightgray] (4,0) rectangle (5,-1);
\fill[color=lightgray] (4,-1) rectangle (8,-2);
\draw (0,1) grid (9,-2);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$6$};
\node at (3.5,0.5) {$3$};
\node at (4.5,0.5) {$4$};
\node at (5.5,0.5) {$7$};
\node at (6.5,0.5) {$8$};
\node at (7.5,0.5) {$9$};
\node at (8.5,0.5) {$5$};
\node at (0.5,-0.5) {$9$};
\node at (1.5,-0.5) {$2$};
\node at (2.5,-0.5) {$1$};
\node at (3.5,-0.5) {$1$};
\node at (4.5,-0.5) {$4$};
\node at (5.5,-0.5) {$1$};
\node at (6.5,-0.5) {$1$};
\node at (7.5,-0.5) {$1$};
\node at (8.5,-0.5) {$1$};
\node at (0.5,-1.5) {$2$};
\node at (1.5,-1.5) {$3$};
\node at (2.5,-1.5) {$4$};
\node at (3.5,-1.5) {$5$};
\node at (4.5,-1.5) {$3$};
\node at (5.5,-1.5) {$4$};
\node at (6.5,-1.5) {$3$};
\node at (7.5,-1.5) {$1$};
\node at (8.5,-1.5) {$1$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\node at (8.5,1.4) {$9$};
\end{tikzpicture}
\end{center}
2017-02-18 13:29:38 +01:00
To answer the queries efficiently,
2017-02-06 22:17:38 +01:00
it suffices to store the values of the
2017-01-09 19:32:38 +01:00
nodes in a binary indexed tree or segment tree.
2017-02-06 22:17:38 +01:00
After this, we can both update a value
and calculate the sum of values in $O(\log n)$ time.
2016-12-28 23:54:51 +01:00
2017-01-09 19:32:38 +01:00
\subsubsection{Path queries}
2016-12-28 23:54:51 +01:00
2017-02-21 17:10:08 +01:00
Using a tree traversal array, we can also efficiently
2017-02-06 22:17:38 +01:00
calculate sums of values on
2017-02-08 21:45:17 +01:00
paths from the root node to any other
2017-01-09 19:32:38 +01:00
node in the tree.
Let us next consider a problem where our task
is to support the following queries:
2016-12-28 23:54:51 +01:00
\begin{itemize}
2017-02-06 22:17:38 +01:00
\item change the value of a node
2017-02-08 21:45:17 +01:00
\item calculate the sum of values on a path from
the root to a node
2016-12-28 23:54:51 +01:00
\end{itemize}
2017-02-08 21:45:17 +01:00
For example, in the following tree,
2017-02-18 13:29:38 +01:00
the sum of values from the root node to node 7 is
2017-02-08 21:45:17 +01:00
$4+5+5=14$:
2016-12-28 23:54:51 +01:00
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (-3,1) {$2$};
\node[draw, circle] (3) at (-1,1) {$3$};
\node[draw, circle] (4) at (1,1) {$4$};
\node[draw, circle] (5) at (3,1) {$5$};
\node[draw, circle] (6) at (-3,-1) {$6$};
\node[draw, circle] (7) at (-0.5,-1) {$7$};
\node[draw, circle] (8) at (1,-1) {$8$};
\node[draw, circle] (9) at (2.5,-1) {$9$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (1) -- (5);
\path[draw,thick,-] (2) -- (6);
\path[draw,thick,-] (4) -- (7);
\path[draw,thick,-] (4) -- (8);
\path[draw,thick,-] (4) -- (9);
\node[color=blue] at (0,3+0.65) {4};
\node[color=blue] at (-3-0.65,1) {5};
\node[color=blue] at (-1-0.65,1) {3};
\node[color=blue] at (1+0.65,1) {5};
\node[color=blue] at (3+0.65,1) {2};
\node[color=blue] at (-3,-1-0.65) {3};
\node[color=blue] at (-0.5,-1-0.65) {5};
\node[color=blue] at (1,-1-0.65) {3};
\node[color=blue] at (2.5,-1-0.65) {1};
\end{tikzpicture}
\end{center}
2017-02-08 21:45:17 +01:00
We can solve this problem in a similar way as before,
2017-02-18 13:29:38 +01:00
but now each value in the last row of the array is the sum of values
on a path from the root to the node.
2017-01-09 19:32:38 +01:00
For example, the following array corresponds to the above tree:
2016-12-28 23:54:51 +01:00
\begin{center}
\begin{tikzpicture}[scale=0.7]
2017-02-08 21:45:17 +01:00
\draw (0,1) grid (9,-2);
2016-12-28 23:54:51 +01:00
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$6$};
\node at (3.5,0.5) {$3$};
\node at (4.5,0.5) {$4$};
\node at (5.5,0.5) {$7$};
\node at (6.5,0.5) {$8$};
\node at (7.5,0.5) {$9$};
\node at (8.5,0.5) {$5$};
\node at (0.5,-0.5) {$9$};
\node at (1.5,-0.5) {$2$};
\node at (2.5,-0.5) {$1$};
\node at (3.5,-0.5) {$1$};
\node at (4.5,-0.5) {$4$};
\node at (5.5,-0.5) {$1$};
\node at (6.5,-0.5) {$1$};
\node at (7.5,-0.5) {$1$};
\node at (8.5,-0.5) {$1$};
\node at (0.5,-1.5) {$4$};
2017-02-08 21:45:17 +01:00
\node at (1.5,-1.5) {$9$};
\node at (2.5,-1.5) {$12$};
\node at (3.5,-1.5) {$7$};
\node at (4.5,-1.5) {$9$};
\node at (5.5,-1.5) {$14$};
\node at (6.5,-1.5) {$12$};
\node at (7.5,-1.5) {$10$};
\node at (8.5,-1.5) {$6$};
2016-12-28 23:54:51 +01:00
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\node at (8.5,1.4) {$9$};
\end{tikzpicture}
\end{center}
2017-02-08 21:45:17 +01:00
When the value of a node increases by $x$,
the sums of all nodes in its subtree increase by $x$.
For example, if the value of node 4 increases by 1,
2017-02-21 17:10:08 +01:00
the array changes as follows:
2016-12-28 23:54:51 +01:00
\begin{center}
\begin{tikzpicture}[scale=0.7]
2017-02-08 21:45:17 +01:00
\fill[color=lightgray] (4,-1) rectangle (8,-2);
\draw (0,1) grid (9,-2);
2016-12-28 23:54:51 +01:00
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$6$};
\node at (3.5,0.5) {$3$};
\node at (4.5,0.5) {$4$};
\node at (5.5,0.5) {$7$};
\node at (6.5,0.5) {$8$};
\node at (7.5,0.5) {$9$};
\node at (8.5,0.5) {$5$};
\node at (0.5,-0.5) {$9$};
\node at (1.5,-0.5) {$2$};
\node at (2.5,-0.5) {$1$};
\node at (3.5,-0.5) {$1$};
\node at (4.5,-0.5) {$4$};
\node at (5.5,-0.5) {$1$};
\node at (6.5,-0.5) {$1$};
\node at (7.5,-0.5) {$1$};
\node at (8.5,-0.5) {$1$};
\node at (0.5,-1.5) {$4$};
2017-02-08 21:45:17 +01:00
\node at (1.5,-1.5) {$9$};
\node at (2.5,-1.5) {$12$};
\node at (3.5,-1.5) {$7$};
\node at (4.5,-1.5) {$10$};
\node at (5.5,-1.5) {$15$};
\node at (6.5,-1.5) {$13$};
\node at (7.5,-1.5) {$11$};
\node at (8.5,-1.5) {$6$};
2016-12-28 23:54:51 +01:00
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\node at (8.5,1.4) {$9$};
\end{tikzpicture}
\end{center}
2017-02-08 21:45:17 +01:00
Thus, to support both the operations,
we should be able to increase all values
in a range and retrieve a single value.
This can be done in $O(\log n)$ time
2017-02-18 13:29:38 +01:00
using a binary indexed tree
or segment tree (see Chapter 9.4).
2017-01-09 19:32:38 +01:00
\section{Lowest common ancestor}
\index{lowest common ancestor}
The \key{lowest common ancestor}
2017-02-06 22:17:38 +01:00
of two nodes in the tree is the lowest node
2017-01-09 19:32:38 +01:00
whose subtree contains both the nodes.
A typical problem is to efficiently process
2017-02-06 22:17:38 +01:00
queries that ask to find the lowest
common ancestor of given two nodes.
2017-02-18 13:29:38 +01:00
For example, in the following tree,
the lowest common ancestor of nodes 5 and 8
is node 2:
2016-12-28 23:54:51 +01:00
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (2,1) {$4$};
\node[draw, circle] (3) at (-2,1) {$2$};
\node[draw, circle] (4) at (0,1) {$3$};
\node[draw, circle] (5) at (2,-1) {$7$};
\node[draw, circle] (6) at (-3,-1) {$5$};
\node[draw, circle] (7) at (-1,-1) {$6$};
\node[draw, circle] (8) at (-1,-3) {$8$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);
\path[draw,thick,-] (7) -- (8);
\end{tikzpicture}
\end{center}
2017-01-09 19:32:38 +01:00
Next we will discuss two efficient techniques for
finding the lowest common ancestor of two nodes.
2016-12-28 23:54:51 +01:00
2017-01-09 19:32:38 +01:00
\subsubsection{Method 1}
2016-12-28 23:54:51 +01:00
2017-02-06 22:17:38 +01:00
One way to solve the problem is to use the fact
2017-01-09 19:32:38 +01:00
that we can efficiently find the $k$th
ancestor of any node in the tree.
2017-02-06 22:17:38 +01:00
Thus, we can first make sure that
2017-01-09 19:32:38 +01:00
both nodes are at the same level in the tree,
and then find the smallest value of $k$
2017-02-06 22:17:38 +01:00
such that the $k$th ancestor of both nodes is the same.
2016-12-28 23:54:51 +01:00
2017-02-06 22:17:38 +01:00
As an example, let us find the lowest common
2017-02-18 13:29:38 +01:00
ancestor of nodes $5$ and $8$:
2016-12-28 23:54:51 +01:00
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (2,1) {$4$};
\node[draw, circle] (3) at (-2,1) {$2$};
\node[draw, circle] (4) at (0,1) {$3$};
\node[draw, circle] (5) at (2,-1) {$7$};
\node[draw, circle,fill=lightgray] (6) at (-3,-1) {$5$};
\node[draw, circle] (7) at (-1,-1) {$6$};
\node[draw, circle,fill=lightgray] (8) at (-1,-3) {$8$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);
\path[draw,thick,-] (7) -- (8);
\end{tikzpicture}
\end{center}
2017-01-09 19:32:38 +01:00
Node $5$ is at level $3$, while node $8$ is at level $4$.
Thus, we first move one step upwards from node $8$ to node $6$.
2017-02-06 22:17:38 +01:00
After this, it turns out that the parent of both nodes $5$
and $6$ is node $2$, so we have found the lowest common ancestor.
2016-12-28 23:54:51 +01:00
2017-02-18 13:29:38 +01:00
The following picture shows how we move in the tree:
2016-12-28 23:54:51 +01:00
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (2,1) {$4$};
\node[draw, circle] (3) at (-2,1) {$2$};
\node[draw, circle] (4) at (0,1) {$3$};
\node[draw, circle] (5) at (2,-1) {$7$};
\node[draw, circle,fill=lightgray] (6) at (-3,-1) {$5$};
\node[draw, circle] (7) at (-1,-1) {$6$};
\node[draw, circle,fill=lightgray] (8) at (-1,-3) {$8$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);
\path[draw,thick,-] (7) -- (8);
\path[draw=red,thick,->,line width=2pt] (6) edge [bend left] (3);
\path[draw=red,thick,->,line width=2pt] (8) edge [bend right] (7);
\path[draw=red,thick,->,line width=2pt] (7) edge [bend right] (3);
\end{tikzpicture}
\end{center}
2017-01-09 19:32:38 +01:00
Using this method, we can find the lowest common ancestor
of any two nodes in $O(\log n)$ time after an $O(n \log n)$ time
preprocessing, because both steps can be
2017-02-06 22:17:38 +01:00
performed in $O(\log n)$ time.
2016-12-28 23:54:51 +01:00
2017-01-09 19:32:38 +01:00
\subsubsection{Method 2}
2016-12-28 23:54:51 +01:00
2017-01-09 19:32:38 +01:00
Another way to solve the problem is based on
2017-02-21 17:10:08 +01:00
a tree traversal array \cite{ben00}.
2017-02-06 22:17:38 +01:00
Once again, the idea is to traverse the nodes
2017-01-09 19:32:38 +01:00
using a depth-first search:
2016-12-28 23:54:51 +01:00
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (2,1) {$4$};
\node[draw, circle] (3) at (-2,1) {$2$};
\node[draw, circle] (4) at (0,1) {$3$};
\node[draw, circle] (5) at (2,-1) {$7$};
\node[draw, circle] (6) at (-3,-1) {$5$};
\node[draw, circle] (7) at (-1,-1) {$6$};
\node[draw, circle] (8) at (-1,-3) {$8$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);
\path[draw,thick,-] (7) -- (8);
\path[draw=red,thick,->,line width=2pt] (1) edge [bend right=15] (3);
\path[draw=red,thick,->,line width=2pt] (3) edge [bend right=15] (6);
\path[draw=red,thick,->,line width=2pt] (6) edge [bend right=15] (3);
\path[draw=red,thick,->,line width=2pt] (3) edge [bend right=15] (7);
\path[draw=red,thick,->,line width=2pt] (7) edge [bend right=15] (8);
\path[draw=red,thick,->,line width=2pt] (8) edge [bend right=15] (7);
\path[draw=red,thick,->,line width=2pt] (7) edge [bend right=15] (3);
\path[draw=red,thick,->,line width=2pt] (3) edge [bend right=15] (1);
\path[draw=red,thick,->,line width=2pt] (1) edge [bend right=15] (4);
\path[draw=red,thick,->,line width=2pt] (4) edge [bend right=15] (1);
\path[draw=red,thick,->,line width=2pt] (1) edge [bend right=15] (2);
\path[draw=red,thick,->,line width=2pt] (2) edge [bend right=15] (5);
\path[draw=red,thick,->,line width=2pt] (5) edge [bend right=15] (2);
\path[draw=red,thick,->,line width=2pt] (2) edge [bend right=15] (1);
\end{tikzpicture}
\end{center}
2017-02-22 20:42:35 +01:00
However, we use a bit different variant of
the tree traversal array where
we add each node to the array \emph{always}
2017-01-09 19:32:38 +01:00
when the depth-first search visits the node,
and not only at the first visit.
2017-02-06 22:17:38 +01:00
Hence, a node that has $k$ children appears $k+1$ times
2017-02-21 17:10:08 +01:00
in the array, and there are a total of $2n-1$
2017-01-09 19:32:38 +01:00
nodes in the array.
2016-12-28 23:54:51 +01:00
2017-01-09 19:32:38 +01:00
We store two values in the array:
2017-02-06 22:17:38 +01:00
(1) the identifier of the node, and (2) the level of the
2017-01-09 19:32:38 +01:00
node in the tree.
The following array corresponds to the above tree:
2016-12-28 23:54:51 +01:00
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,1) grid (15,2);
%\node at (-1.1,1.5) {\texttt{node}};
\node at (0.5,1.5) {$1$};
\node at (1.5,1.5) {$2$};
\node at (2.5,1.5) {$5$};
\node at (3.5,1.5) {$2$};
\node at (4.5,1.5) {$6$};
\node at (5.5,1.5) {$8$};
\node at (6.5,1.5) {$6$};
\node at (7.5,1.5) {$2$};
\node at (8.5,1.5) {$1$};
\node at (9.5,1.5) {$3$};
\node at (10.5,1.5) {$1$};
\node at (11.5,1.5) {$4$};
\node at (12.5,1.5) {$7$};
\node at (13.5,1.5) {$4$};
\node at (14.5,1.5) {$1$};
\draw (0,0) grid (15,1);
%\node at (-1.1,0.5) {\texttt{depth}};
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$3$};
\node at (3.5,0.5) {$2$};
\node at (4.5,0.5) {$3$};
\node at (5.5,0.5) {$4$};
\node at (6.5,0.5) {$3$};
\node at (7.5,0.5) {$2$};
\node at (8.5,0.5) {$1$};
\node at (9.5,0.5) {$2$};
\node at (10.5,0.5) {$1$};
\node at (11.5,0.5) {$2$};
\node at (12.5,0.5) {$3$};
\node at (13.5,0.5) {$2$};
\node at (14.5,0.5) {$1$};
\footnotesize
\node at (0.5,2.5) {$1$};
\node at (1.5,2.5) {$2$};
\node at (2.5,2.5) {$3$};
\node at (3.5,2.5) {$4$};
\node at (4.5,2.5) {$5$};
\node at (5.5,2.5) {$6$};
\node at (6.5,2.5) {$7$};
\node at (7.5,2.5) {$8$};
\node at (8.5,2.5) {$9$};
\node at (9.5,2.5) {$10$};
\node at (10.5,2.5) {$11$};
\node at (11.5,2.5) {$12$};
\node at (12.5,2.5) {$13$};
\node at (13.5,2.5) {$14$};
\node at (14.5,2.5) {$15$};
\end{tikzpicture}
\end{center}
2017-01-09 19:32:38 +01:00
Using this array, we can find the lowest common ancestor
2017-02-06 22:17:38 +01:00
of nodes $a$ and $b$ by finding the node with lowest level
2017-01-09 19:32:38 +01:00
between nodes $a$ and $b$ in the array.
For example, the lowest common ancestor of nodes $5$ and $8$
can be found as follows:
2016-12-28 23:54:51 +01:00
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (2,1) rectangle (3,2);
\fill[color=lightgray] (5,1) rectangle (6,2);
\fill[color=lightgray] (2,0) rectangle (6,1);
\node at (3.5,-0.5) {$\uparrow$};
\draw (0,1) grid (15,2);
\node at (0.5,1.5) {$1$};
\node at (1.5,1.5) {$2$};
\node at (2.5,1.5) {$5$};
\node at (3.5,1.5) {$2$};
\node at (4.5,1.5) {$6$};
\node at (5.5,1.5) {$8$};
\node at (6.5,1.5) {$6$};
\node at (7.5,1.5) {$2$};
\node at (8.5,1.5) {$1$};
\node at (9.5,1.5) {$3$};
\node at (10.5,1.5) {$1$};
\node at (11.5,1.5) {$4$};
\node at (12.5,1.5) {$7$};
\node at (13.5,1.5) {$4$};
\node at (14.5,1.5) {$1$};
\draw (0,0) grid (15,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$3$};
\node at (3.5,0.5) {$2$};
\node at (4.5,0.5) {$3$};
\node at (5.5,0.5) {$4$};
\node at (6.5,0.5) {$3$};
\node at (7.5,0.5) {$2$};
\node at (8.5,0.5) {$1$};
\node at (9.5,0.5) {$2$};
\node at (10.5,0.5) {$1$};
\node at (11.5,0.5) {$2$};
\node at (12.5,0.5) {$3$};
\node at (13.5,0.5) {$2$};
\node at (14.5,0.5) {$1$};
\footnotesize
\node at (0.5,2.5) {$1$};
\node at (1.5,2.5) {$2$};
\node at (2.5,2.5) {$3$};
\node at (3.5,2.5) {$4$};
\node at (4.5,2.5) {$5$};
\node at (5.5,2.5) {$6$};
\node at (6.5,2.5) {$7$};
\node at (7.5,2.5) {$8$};
\node at (8.5,2.5) {$9$};
\node at (9.5,2.5) {$10$};
\node at (10.5,2.5) {$11$};
\node at (11.5,2.5) {$12$};
\node at (12.5,2.5) {$13$};
\node at (13.5,2.5) {$14$};
\node at (14.5,2.5) {$15$};
\end{tikzpicture}
\end{center}
2017-02-06 22:17:38 +01:00
Node 5 is at position 3, node 8 is at position 6,
2017-01-09 19:32:38 +01:00
and the node with lowest level between
2017-02-06 22:17:38 +01:00
positions $3 \ldots 6$ is node 2 at position 4
2017-01-09 19:32:38 +01:00
whose level is 2.
Thus, the lowest common ancestor of
nodes 5 and 8 is node 2.
2017-02-18 13:29:38 +01:00
Thus, to find the lowest common ancestor
of two nodes it suffices to process a range
minimum query.
Since the array is static,
we can process such queries in $O(1)$ time
after an $O(n \log n)$ time preprocessing.
2017-01-09 19:32:38 +01:00
\subsubsection{Distances of nodes}
2017-02-18 13:29:38 +01:00
Finally, let us consider the problem of
2017-02-06 22:17:38 +01:00
finding the distance between
two nodes in the tree, which equals
the length of the path between them.
2017-01-09 19:32:38 +01:00
It turns out that this problem reduces to
2017-02-06 22:17:38 +01:00
finding the lowest common ancestor of the nodes.
2017-01-09 19:32:38 +01:00
2017-02-18 13:29:38 +01:00
First, we root the tree arbitrarily.
2017-01-09 19:32:38 +01:00
After this, the distance between nodes $a$ and $b$
2017-02-18 13:29:38 +01:00
can be calculated using the formula
\[d(a)+d(b)-2 \cdot d(c),\]
2017-02-06 22:17:38 +01:00
where $c$ is the lowest common ancestor of $a$ and $b$
and $d(s)$ denotes the distance from the root node
2017-01-09 19:32:38 +01:00
to node $s$.
For example, in the tree
2016-12-28 23:54:51 +01:00
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (2,1) {$4$};
\node[draw, circle] (3) at (-2,1) {$2$};
\node[draw, circle] (4) at (0,1) {$3$};
\node[draw, circle] (5) at (2,-1) {$7$};
\node[draw, circle] (6) at (-3,-1) {$5$};
\node[draw, circle] (7) at (-1,-1) {$6$};
\node[draw, circle] (8) at (-1,-3) {$8$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);
\path[draw,thick,-] (7) -- (8);
\path[draw=red,thick,-,line width=2pt] (8) -- node[font=\small] {} (7);
\path[draw=red,thick,-,line width=2pt] (7) -- node[font=\small] {} (3);
\path[draw=red,thick,-,line width=2pt] (6) -- node[font=\small] {} (3);
\end{tikzpicture}
\end{center}
2017-01-09 19:32:38 +01:00
the lowest common ancestor of nodes 5 and 8 is node 2.
A path from node 5 to node 8
2017-02-18 13:29:38 +01:00
first ascends from node 5 to node 2
and then descends from node 2 to node 8.
2017-01-09 19:32:38 +01:00
The distances of the nodes from the root are
$d(5)=3$, $d(8)=4$ and $d(2)=2$,
so the distance between nodes 5 and 8 is
$3+4-2\cdot2=3$.
2016-12-28 23:54:51 +01:00