\chapter{Tree queries}
\index{tree query}
This chapter discusses techniques for
efficiently processing queries related
to subtrees and paths of a rooted tree.
For example, possible queries are:
\begin{itemize}
\item what is the $k$th ancestor of a node?
\item what is the sum of values in the subtree of a node?
\item what is the sum of values on a path between two nodes?
\item what is the lowest common ancestor of two nodes?
\end{itemize}
\section{Finding ancestors}
\index{ancestor}
The $k$th \key{ancestor} of a node $x$ in a rooted tree
is the node that we will reach if we move $k$
levels up from $x$.
Let $f(x,k)$ denote the $k$th ancestor of $x$.
For example, in the following tree, $f(2,1)=1$ and $f(8,2)=4$.
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (2,1) {$2$};
\node[draw, circle] (3) at (-2,1) {$4$};
\node[draw, circle] (4) at (0,1) {$5$};
\node[draw, circle] (5) at (2,-1) {$6$};
\node[draw, circle] (6) at (-3,-1) {$3$};
\node[draw, circle] (7) at (-1,-1) {$7$};
\node[draw, circle] (8) at (-1,-3) {$8$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);
\path[draw,thick,-] (7) -- (8);
\path[draw=red,thick,->,line width=2pt] (8) edge [bend left] (3);
\path[draw=red,thick,->,line width=2pt] (2) edge [bend right] (1);
\end{tikzpicture}
\end{center}
An easy way to calculate the value of $f(x,k)$
is to perform a sequence of $k$ moves in the tree.
However, the time complexity of this method
is $O(k)$, which may be slow, because a tree of $n$ nodes
may contain a chain of $n$ nodes.

Fortunately, using a technique similar to that
used in Chapter 16.3, any value of $f(x,k)$
can be efficiently calculated in $O(\log k)$ time
after preprocessing.
The idea is to precalculate all values $f(x,k)$
where $k$ is a power of two.
For example, the values for the tree above
are as follows:
\begin{center}
\begin{tabular}{r|rrrrrrrrr}
$x$ & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\
\hline
$f(x,1)$ & 0 & 1 & 4 & 1 & 1 & 2 & 4 & 7 \\
$f(x,2)$ & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 4 \\
$f(x,4)$ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
$\cdots$ \\
\end{tabular}
\end{center}
The value $0$ means that the $k$th ancestor
of a node does not exist.

The preprocessing takes $O(n \log n)$ time,
because $O(\log n)$ values are calculated for each node.
After this, any value of $f(x,k)$ can be calculated
in $O(\log k)$ time by representing $k$
as a sum where each term is a power of two.
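The following C++ code is a minimal sketch of this idea, not a definitive implementation.
It assumes that the nodes are numbered $1 \ldots n$ and that the tree is given
as a parent array where the parent of the root is $0$.
The names \texttt{AncestorTable}, \texttt{up} and \texttt{ancestor} are chosen here
for illustration only.

```cpp
#include <cassert>
#include <vector>
using namespace std;

// Sketch of the power-of-two ancestor table (illustrative names).
// Nodes are numbered 1..n, and 0 means "no such ancestor".
struct AncestorTable {
    int n, levels;
    vector<vector<int>> up; // up[j][x] = f(x, 2^j)

    // parent[x] is the parent of node x (index 0 unused), parent[root] = 0.
    AncestorTable(const vector<int>& parent) {
        n = (int)parent.size() - 1;
        levels = 1;
        while ((1 << levels) < n) levels++;
        up.assign(levels, vector<int>(n + 1, 0));
        for (int x = 1; x <= n; x++) up[0][x] = parent[x];
        for (int j = 1; j < levels; j++)
            for (int x = 1; x <= n; x++)
                up[j][x] = up[j - 1][up[j - 1][x]]; // up[..][0] stays 0
    }

    // Returns f(x,k) by making one jump for each one bit of k.
    int ancestor(int x, int k) const {
        assert(1 <= x && x <= n && k >= 0);
        // Higher bits of k mean k >= 2^levels >= n, so no ancestor exists.
        if (k >> levels) return 0;
        for (int j = 0; j < levels && x != 0; j++)
            if (k & (1 << j)) x = up[j][x];
        return x;
    }
};
```

For the example tree, the parent array is $[0,1,4,1,1,2,4,7]$ for nodes $1 \ldots 8$,
and a call such as \texttt{ancestor(8, 3)} combines the jumps $f(8,1)=7$ and $f(7,2)=1$.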
\section{Subtrees and paths}
\index{node array}
A \key{node array} contains the nodes of a rooted tree
in the order in which a depth-first search
from the root node visits them.
For example, in the tree
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (-3,1) {$2$};
\node[draw, circle] (3) at (-1,1) {$3$};
\node[draw, circle] (4) at (1,1) {$4$};
\node[draw, circle] (5) at (3,1) {$5$};
\node[draw, circle] (6) at (-3,-1) {$6$};
\node[draw, circle] (7) at (-0.5,-1) {$7$};
\node[draw, circle] (8) at (1,-1) {$8$};
\node[draw, circle] (9) at (2.5,-1) {$9$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (1) -- (5);
\path[draw,thick,-] (2) -- (6);
\path[draw,thick,-] (4) -- (7);
\path[draw,thick,-] (4) -- (8);
\path[draw,thick,-] (4) -- (9);
\end{tikzpicture}
\end{center}
a depth-first search proceeds as follows:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (-3,1) {$2$};
\node[draw, circle] (3) at (-1,1) {$3$};
\node[draw, circle] (4) at (1,1) {$4$};
\node[draw, circle] (5) at (3,1) {$5$};
\node[draw, circle] (6) at (-3,-1) {$6$};
\node[draw, circle] (7) at (-0.5,-1) {$7$};
\node[draw, circle] (8) at (1,-1) {$8$};
\node[draw, circle] (9) at (2.5,-1) {$9$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (1) -- (5);
\path[draw,thick,-] (2) -- (6);
\path[draw,thick,-] (4) -- (7);
\path[draw,thick,-] (4) -- (8);
\path[draw,thick,-] (4) -- (9);
\path[draw=red,thick,->,line width=2pt] (1) edge [bend right=15] (2);
\path[draw=red,thick,->,line width=2pt] (2) edge [bend right=15] (6);
\path[draw=red,thick,->,line width=2pt] (6) edge [bend right=15] (2);
\path[draw=red,thick,->,line width=2pt] (2) edge [bend right=15] (1);
\path[draw=red,thick,->,line width=2pt] (1) edge [bend right=15] (3);
\path[draw=red,thick,->,line width=2pt] (3) edge [bend right=15] (1);
\path[draw=red,thick,->,line width=2pt] (1) edge [bend right=15] (4);
\path[draw=red,thick,->,line width=2pt] (4) edge [bend right=15] (7);
\path[draw=red,thick,->,line width=2pt] (7) edge [bend right=15] (4);
\path[draw=red,thick,->,line width=2pt] (4) edge [bend right=15] (8);
\path[draw=red,thick,->,line width=2pt] (8) edge [bend right=15] (4);
\path[draw=red,thick,->,line width=2pt] (4) edge [bend right=15] (9);
\path[draw=red,thick,->,line width=2pt] (9) edge [bend right=15] (4);
\path[draw=red,thick,->,line width=2pt] (4) edge [bend right=15] (1);
\path[draw=red,thick,->,line width=2pt] (1) edge [bend right=15] (5);
\path[draw=red,thick,->,line width=2pt] (5) edge [bend right=15] (1);
\end{tikzpicture}
\end{center}
Hence, the corresponding node array is as follows:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (9,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$6$};
\node at (3.5,0.5) {$3$};
\node at (4.5,0.5) {$4$};
\node at (5.5,0.5) {$7$};
\node at (6.5,0.5) {$8$};
\node at (7.5,0.5) {$9$};
\node at (8.5,0.5) {$5$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\node at (8.5,1.4) {$9$};
\end{tikzpicture}
\end{center}
\subsubsection{Subtree queries}
Each subtree of a tree corresponds to a subarray
of the node array
whose first element is the root node of the subtree.
For example, the following subarray contains the
nodes in the subtree of node $4$:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (4,0) rectangle (8,1);
\draw (0,0) grid (9,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$6$};
\node at (3.5,0.5) {$3$};
\node at (4.5,0.5) {$4$};
\node at (5.5,0.5) {$7$};
\node at (6.5,0.5) {$8$};
\node at (7.5,0.5) {$9$};
\node at (8.5,0.5) {$5$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\node at (8.5,1.4) {$9$};
\end{tikzpicture}
\end{center}
Using this fact, we can efficiently process queries
that are related to subtrees of a tree.
As an example, consider a problem where each node
is assigned a value, and our task is to support
the following queries:
\begin{itemize}
\item update the value of a node
\item calculate the sum of values in the subtree of a node
\end{itemize}
Consider the following tree where the blue numbers
are the values of the nodes.
For example, the sum of the subtree of node $4$
is $3+4+3+1=11$.
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (-3,1) {$2$};
\node[draw, circle] (3) at (-1,1) {$3$};
\node[draw, circle] (4) at (1,1) {$4$};
\node[draw, circle] (5) at (3,1) {$5$};
\node[draw, circle] (6) at (-3,-1) {$6$};
\node[draw, circle] (7) at (-0.5,-1) {$7$};
\node[draw, circle] (8) at (1,-1) {$8$};
\node[draw, circle] (9) at (2.5,-1) {$9$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (1) -- (5);
\path[draw,thick,-] (2) -- (6);
\path[draw,thick,-] (4) -- (7);
\path[draw,thick,-] (4) -- (8);
\path[draw,thick,-] (4) -- (9);
\node[color=blue] at (0,3+0.65) {2};
\node[color=blue] at (-3-0.65,1) {3};
\node[color=blue] at (-1-0.65,1) {5};
\node[color=blue] at (1+0.65,1) {3};
\node[color=blue] at (3+0.65,1) {1};
\node[color=blue] at (-3,-1-0.65) {4};
\node[color=blue] at (-0.5,-1-0.65) {4};
\node[color=blue] at (1,-1-0.65) {3};
\node[color=blue] at (2.5,-1-0.65) {1};
\end{tikzpicture}
\end{center}
The idea is to construct a node array that contains
three values for each node: (1) the identifier of the node,
(2) the size of the subtree, and (3) the value of the node.
For example, the array for the above tree is as follows:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,1) grid (9,-2);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$6$};
\node at (3.5,0.5) {$3$};
\node at (4.5,0.5) {$4$};
\node at (5.5,0.5) {$7$};
\node at (6.5,0.5) {$8$};
\node at (7.5,0.5) {$9$};
\node at (8.5,0.5) {$5$};
\node at (0.5,-0.5) {$9$};
\node at (1.5,-0.5) {$2$};
\node at (2.5,-0.5) {$1$};
\node at (3.5,-0.5) {$1$};
\node at (4.5,-0.5) {$4$};
\node at (5.5,-0.5) {$1$};
\node at (6.5,-0.5) {$1$};
\node at (7.5,-0.5) {$1$};
\node at (8.5,-0.5) {$1$};
\node at (0.5,-1.5) {$2$};
\node at (1.5,-1.5) {$3$};
\node at (2.5,-1.5) {$4$};
\node at (3.5,-1.5) {$5$};
\node at (4.5,-1.5) {$3$};
\node at (5.5,-1.5) {$4$};
\node at (6.5,-1.5) {$3$};
\node at (7.5,-1.5) {$1$};
\node at (8.5,-1.5) {$1$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\node at (8.5,1.4) {$9$};
\end{tikzpicture}
\end{center}
Using this array, we can calculate the sum of values
in any subtree by first reading the size of the subtree
and then summing the values of the corresponding nodes:
if the root of the subtree is at position $k$ and the size
of the subtree is $c$, the subtree occupies
positions $k \ldots k+c-1$.
For example, the values in the subtree of node $4$
can be found as follows:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (4,1) rectangle (5,0);
\fill[color=lightgray] (4,0) rectangle (5,-1);
\fill[color=lightgray] (4,-1) rectangle (8,-2);
\draw (0,1) grid (9,-2);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$6$};
\node at (3.5,0.5) {$3$};
\node at (4.5,0.5) {$4$};
\node at (5.5,0.5) {$7$};
\node at (6.5,0.5) {$8$};
\node at (7.5,0.5) {$9$};
\node at (8.5,0.5) {$5$};
\node at (0.5,-0.5) {$9$};
\node at (1.5,-0.5) {$2$};
\node at (2.5,-0.5) {$1$};
\node at (3.5,-0.5) {$1$};
\node at (4.5,-0.5) {$4$};
\node at (5.5,-0.5) {$1$};
\node at (6.5,-0.5) {$1$};
\node at (7.5,-0.5) {$1$};
\node at (8.5,-0.5) {$1$};
\node at (0.5,-1.5) {$2$};
\node at (1.5,-1.5) {$3$};
\node at (2.5,-1.5) {$4$};
\node at (3.5,-1.5) {$5$};
\node at (4.5,-1.5) {$3$};
\node at (5.5,-1.5) {$4$};
\node at (6.5,-1.5) {$3$};
\node at (7.5,-1.5) {$1$};
\node at (8.5,-1.5) {$1$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\node at (8.5,1.4) {$9$};
\end{tikzpicture}
\end{center}
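As a check of the picture above, let $v_i$ denote the value stored at
position $i$ of the node array (this symbol is introduced here only for illustration).
Node $4$ is at position $5$ and the size of its subtree is $4$,
so the subtree occupies positions $5 \ldots 8$ and its sum is
\[\sum_{i=5}^{8} v_i = 3+4+3+1 = 11.\]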
To support the queries efficiently,
it suffices to store the values of the
nodes in a binary indexed tree or segment tree.
After this, we can both update a value
and calculate the sum of values in $O(\log n)$ time.
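The following C++ code sketches one possible way to combine the node array
with a binary indexed tree.
It is an illustration under assumptions, not a definitive implementation:
the nodes are numbered $1 \ldots n$, node $1$ is the root,
the tree is given as an edge list, and the names
\texttt{SubtreeSums}, \texttt{update} and \texttt{subtreeSum} are hypothetical.

```cpp
#include <cassert>
#include <utility>
#include <vector>
using namespace std;

// Sketch: subtree sums via a node array and a binary indexed tree.
// Names are illustrative; nodes are 1..n and node 1 is assumed to be the root.
struct SubtreeSums {
    int n, timer = 0;
    vector<vector<int>> adj;
    vector<int> pos, sz;         // position in the node array, subtree size
    vector<long long> val, bit;  // current node values, binary indexed tree

    // value[x] is the initial value of node x (index 0 unused).
    SubtreeSums(int n, const vector<pair<int,int>>& edges,
                const vector<long long>& value)
        : n(n), adj(n + 1), pos(n + 1), sz(n + 1), val(n + 1, 0), bit(n + 1, 0) {
        for (const auto& e : edges) {
            adj[e.first].push_back(e.second);
            adj[e.second].push_back(e.first);
        }
        dfs(1, 0);
        for (int x = 1; x <= n; x++) update(x, value[x]);
    }
    // Recursive depth-first search; a very deep tree may need an iterative version.
    void dfs(int x, int p) {
        pos[x] = ++timer;
        sz[x] = 1;
        for (int y : adj[x])
            if (y != p) { dfs(y, x); sz[x] += sz[y]; }
    }
    void add(int k, long long d) { for (; k <= n; k += k & -k) bit[k] += d; }
    long long prefix(int k) const {
        long long s = 0;
        for (; k >= 1; k -= k & -k) s += bit[k];
        return s;
    }
    // Sets the value of node x to v.
    void update(int x, long long v) {
        assert(1 <= x && x <= n);
        add(pos[x], v - val[x]);
        val[x] = v;
    }
    // Sum of positions pos[x] .. pos[x]+sz[x]-1 of the node array.
    long long subtreeSum(int x) const {
        assert(1 <= x && x <= n);
        return prefix(pos[x] + sz[x] - 1) - prefix(pos[x] - 1);
    }
};
```

Both operations touch $O(\log n)$ positions of the binary indexed tree.
A segment tree could be used in the same way.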
\subsubsection{Path queries}
Using a node array, we can also efficiently
calculate sums of values on
paths between the root node and any other
node in the tree.
Let us next consider a problem where our task
is to support the following queries:
\begin{itemize}
\item change the value of a node
\item calculate the sum of values on a path between
the root node and a node
\end{itemize}
For example, in the following tree, the sum of
values from the root to node 8 is $4+5+3=12$.
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (-3,1) {$2$};
\node[draw, circle] (3) at (-1,1) {$3$};
\node[draw, circle] (4) at (1,1) {$4$};
\node[draw, circle] (5) at (3,1) {$5$};
\node[draw, circle] (6) at (-3,-1) {$6$};
\node[draw, circle] (7) at (-0.5,-1) {$7$};
\node[draw, circle] (8) at (1,-1) {$8$};
\node[draw, circle] (9) at (2.5,-1) {$9$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (1) -- (5);
\path[draw,thick,-] (2) -- (6);
\path[draw,thick,-] (4) -- (7);
\path[draw,thick,-] (4) -- (8);
\path[draw,thick,-] (4) -- (9);
\node[color=blue] at (0,3+0.65) {4};
\node[color=blue] at (-3-0.65,1) {5};
\node[color=blue] at (-1-0.65,1) {3};
\node[color=blue] at (1+0.65,1) {5};
\node[color=blue] at (3+0.65,1) {2};
\node[color=blue] at (-3,-1-0.65) {3};
\node[color=blue] at (-0.5,-1-0.65) {5};
\node[color=blue] at (1,-1-0.65) {3};
\node[color=blue] at (2.5,-1-0.65) {1};
\end{tikzpicture}
\end{center}
To solve this problem, we can use a technique similar
to the one we used for subtree queries,
but the values of the nodes are stored
in a special way:
if the value of a node at position $k$
increases by $a$,
the array value at position $k$ increases by $a$
and the array value at position $k+c$ decreases by $a$,
where $c$ is the size of the subtree of the node.
\begin{samepage}
For example, the following array corresponds to the above tree:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,1) grid (10,-2);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$6$};
\node at (3.5,0.5) {$3$};
\node at (4.5,0.5) {$4$};
\node at (5.5,0.5) {$7$};
\node at (6.5,0.5) {$8$};
\node at (7.5,0.5) {$9$};
\node at (8.5,0.5) {$5$};
\node at (9.5,0.5) {--};
\node at (0.5,-0.5) {$9$};
\node at (1.5,-0.5) {$2$};
\node at (2.5,-0.5) {$1$};
\node at (3.5,-0.5) {$1$};
\node at (4.5,-0.5) {$4$};
\node at (5.5,-0.5) {$1$};
\node at (6.5,-0.5) {$1$};
\node at (7.5,-0.5) {$1$};
\node at (8.5,-0.5) {$1$};
\node at (9.5,-0.5) {--};
\node at (0.5,-1.5) {$4$};
\node at (1.5,-1.5) {$5$};
\node at (2.5,-1.5) {$3$};
\node at (3.5,-1.5) {$-5$};
\node at (4.5,-1.5) {$2$};
\node at (5.5,-1.5) {$5$};
\node at (6.5,-1.5) {$-2$};
\node at (7.5,-1.5) {$-2$};
\node at (8.5,-1.5) {$-4$};
\node at (9.5,-1.5) {$-4$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\node at (8.5,1.4) {$9$};
\node at (9.5,1.4) {$10$};
\end{tikzpicture}
\end{center}
\end{samepage}
For example, the value at position $4$ (node $3$) is $-5$:
the subtrees of nodes $2$ and $6$ both end
at position $3$, so their values $5$ and $3$ are subtracted
at position $4$, and the own value $3$ of node $3$ is added,
which gives $3-5-3=-5$.
Note that the array contains an extra position 10.
It only receives subtractions from nodes whose subtrees end
at the last position of the node array, such as the root node,
and it is never included in a query.

Using this array, the sum of values on a path
from the root to node $x$ equals the sum
of values in the array from the beginning to the position of node $x$.
For example, the sum from the root to node $8$
can be calculated as follows:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (6,1) rectangle (7,0);
\fill[color=lightgray] (0,-1) rectangle (7,-2);
\draw (0,1) grid (10,-2);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$6$};
\node at (3.5,0.5) {$3$};
\node at (4.5,0.5) {$4$};
\node at (5.5,0.5) {$7$};
\node at (6.5,0.5) {$8$};
\node at (7.5,0.5) {$9$};
\node at (8.5,0.5) {$5$};
\node at (9.5,0.5) {--};
\node at (0.5,-0.5) {$9$};
\node at (1.5,-0.5) {$2$};
\node at (2.5,-0.5) {$1$};
\node at (3.5,-0.5) {$1$};
\node at (4.5,-0.5) {$4$};
\node at (5.5,-0.5) {$1$};
\node at (6.5,-0.5) {$1$};
\node at (7.5,-0.5) {$1$};
\node at (8.5,-0.5) {$1$};
\node at (9.5,-0.5) {--};
\node at (0.5,-1.5) {$4$};
\node at (1.5,-1.5) {$5$};
\node at (2.5,-1.5) {$3$};
\node at (3.5,-1.5) {$-5$};
\node at (4.5,-1.5) {$2$};
\node at (5.5,-1.5) {$5$};
\node at (6.5,-1.5) {$-2$};
\node at (7.5,-1.5) {$-2$};
\node at (8.5,-1.5) {$-4$};
\node at (9.5,-1.5) {$-4$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\node at (8.5,1.4) {$9$};
\node at (9.5,1.4) {$10$};
\end{tikzpicture}
\end{center}
The sum is
\[4+5+3-5+2+5-2=12,\]
which equals the sum $4+5+3=12$.
This method works because the value of each node
is added to the sum when the depth-first search
visits the node for the first time, and the value
of the node is removed from the sum when the subtree of the
node has been processed.
Once again, we can store the values of the nodes
in a binary indexed tree or a segment tree,
so it is possible to both update a value and
calculate the sum of values in $O(\log n)$ time.
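The technique can be sketched in C++ as follows.
This is only an illustration under assumptions:
the nodes are numbered $1 \ldots n$, node $1$ is the root,
the tree is given as an edge list, and the names
\texttt{RootPathSums}, \texttt{update} and \texttt{pathSum} are hypothetical.
The binary indexed tree has $n+1$ positions so that the subtraction
at position $k+c$ always fits.

```cpp
#include <cassert>
#include <utility>
#include <vector>
using namespace std;

// Sketch: sums on root-to-node paths via a node array.
// Names are illustrative; nodes are 1..n and node 1 is assumed to be the root.
struct RootPathSums {
    int n, timer = 0;
    vector<vector<int>> adj;
    vector<int> pos, sz;         // position in the node array, subtree size
    vector<long long> val, bit;  // node values; bit covers positions 1..n+1

    // value[x] is the initial value of node x (index 0 unused).
    RootPathSums(int n, const vector<pair<int,int>>& edges,
                 const vector<long long>& value)
        : n(n), adj(n + 1), pos(n + 1), sz(n + 1), val(n + 1, 0), bit(n + 2, 0) {
        for (const auto& e : edges) {
            adj[e.first].push_back(e.second);
            adj[e.second].push_back(e.first);
        }
        dfs(1, 0);
        for (int x = 1; x <= n; x++) update(x, value[x]);
    }
    // Recursive depth-first search; a very deep tree may need an iterative version.
    void dfs(int x, int p) {
        pos[x] = ++timer;
        sz[x] = 1;
        for (int y : adj[x])
            if (y != p) { dfs(y, x); sz[x] += sz[y]; }
    }
    void add(int k, long long d) { for (; k <= n + 1; k += k & -k) bit[k] += d; }
    long long prefix(int k) const {
        long long s = 0;
        for (; k >= 1; k -= k & -k) s += bit[k];
        return s;
    }
    // Sets the value of node x to v: +a at position k, -a at position k+c.
    void update(int x, long long v) {
        assert(1 <= x && x <= n);
        long long a = v - val[x];
        val[x] = v;
        add(pos[x], a);
        add(pos[x] + sz[x], -a);
    }
    // Sum of values on the path from the root to node x.
    long long pathSum(int x) const {
        assert(1 <= x && x <= n);
        return prefix(pos[x]);
    }
};
```

A query reads a single prefix sum, and an update changes two positions,
so both operations work in $O(\log n)$ time.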
\section{Lowest common ancestor}
\index{lowest common ancestor}
The \key{lowest common ancestor}
of two nodes in a rooted tree is the lowest node
whose subtree contains both nodes.
A typical problem is to efficiently process
queries that ask for the lowest
common ancestor of two given nodes.
For example, in the tree
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (2,1) {$4$};
\node[draw, circle] (3) at (-2,1) {$2$};
\node[draw, circle] (4) at (0,1) {$3$};
\node[draw, circle] (5) at (2,-1) {$7$};
\node[draw, circle] (6) at (-3,-1) {$5$};
\node[draw, circle] (7) at (-1,-1) {$6$};
\node[draw, circle] (8) at (-1,-3) {$8$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);
\path[draw,thick,-] (7) -- (8);
\end{tikzpicture}
\end{center}
the lowest common ancestor of nodes 5 and 8 is node 2,
and the lowest common ancestor of nodes 3 and 4 is node 1.

Next we will discuss two efficient techniques for
finding the lowest common ancestor of two nodes.
\subsubsection{Method 1}
One way to solve the problem is to use the fact
that we can efficiently find the $k$th
ancestor of any node in the tree.
Thus, we can first make sure that
both nodes are at the same level in the tree,
and then find the smallest value of $k$
such that the $k$th ancestor of both nodes is the same.

As an example, let us find the lowest common
ancestor of nodes $5$ and $8$ in the following tree:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (2,1) {$4$};
\node[draw, circle] (3) at (-2,1) {$2$};
\node[draw, circle] (4) at (0,1) {$3$};
\node[draw, circle] (5) at (2,-1) {$7$};
\node[draw, circle,fill=lightgray] (6) at (-3,-1) {$5$};
\node[draw, circle] (7) at (-1,-1) {$6$};
\node[draw, circle,fill=lightgray] (8) at (-1,-3) {$8$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);
\path[draw,thick,-] (7) -- (8);
\end{tikzpicture}
\end{center}
Node $5$ is at level $3$, while node $8$ is at level $4$.
Thus, we first move one step upwards from node $8$ to node $6$.
After this, it turns out that the parent of both nodes $5$
and $6$ is node $2$, so we have found the lowest common ancestor.
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (2,1) {$4$};
\node[draw, circle] (3) at (-2,1) {$2$};
\node[draw, circle] (4) at (0,1) {$3$};
\node[draw, circle] (5) at (2,-1) {$7$};
\node[draw, circle,fill=lightgray] (6) at (-3,-1) {$5$};
\node[draw, circle] (7) at (-1,-1) {$6$};
\node[draw, circle,fill=lightgray] (8) at (-1,-3) {$8$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);
\path[draw,thick,-] (7) -- (8);
\path[draw=red,thick,->,line width=2pt] (6) edge [bend left] (3);
\path[draw=red,thick,->,line width=2pt] (8) edge [bend right] (7);
\path[draw=red,thick,->,line width=2pt] (7) edge [bend right] (3);
\end{tikzpicture}
\end{center}
Using this method, we can find the lowest common ancestor
of any two nodes in $O(\log n)$ time after $O(n \log n)$ time
preprocessing, because both steps can be
performed in $O(\log n)$ time.
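The two steps can be sketched in C++ as follows.
This is a sketch under assumptions, not a definitive implementation:
the nodes are numbered $1 \ldots n$, node $1$ is the root,
and the names \texttt{LcaLifting}, \texttt{lift} and \texttt{lca} are hypothetical.
Instead of searching for the smallest $k$ directly,
the second step tries the jumps $2^j$ in decreasing order of $j$
and makes a jump only if the nodes stay different,
which leaves both nodes just below their lowest common ancestor.

```cpp
#include <cassert>
#include <utility>
#include <vector>
using namespace std;

// Sketch: lowest common ancestor with power-of-two jumps (illustrative names).
// Nodes are 1..n and node 1 is assumed to be the root.
struct LcaLifting {
    int n, levels;
    vector<int> depth;        // level of each node, the root is at level 1
    vector<vector<int>> up;   // up[j][x] = f(x, 2^j), 0 if it does not exist

    LcaLifting(int n, const vector<pair<int,int>>& edges)
        : n(n), levels(1), depth(n + 1, 0) {
        while ((1 << levels) < n) levels++;
        up.assign(levels, vector<int>(n + 1, 0));
        vector<vector<int>> adj(n + 1);
        for (const auto& e : edges) {
            adj[e.first].push_back(e.second);
            adj[e.second].push_back(e.first);
        }
        dfs(1, 0, adj);
        for (int j = 1; j < levels; j++)
            for (int x = 1; x <= n; x++)
                up[j][x] = up[j - 1][up[j - 1][x]];
    }
    // Recursive depth-first search; a very deep tree may need an iterative version.
    void dfs(int x, int p, const vector<vector<int>>& adj) {
        up[0][x] = p;
        depth[x] = depth[p] + 1; // depth[0] = 0
        for (int y : adj[x])
            if (y != p) dfs(y, x, adj);
    }
    // Moves k levels up from x; k must be smaller than 2^levels.
    int lift(int x, int k) const {
        for (int j = 0; j < levels; j++)
            if ((k >> j) & 1) x = up[j][x];
        return x;
    }
    int lca(int a, int b) const {
        assert(1 <= a && a <= n && 1 <= b && b <= n);
        if (depth[a] < depth[b]) swap(a, b);
        a = lift(a, depth[a] - depth[b]); // step 1: same level
        if (a == b) return a;
        for (int j = levels - 1; j >= 0; j--) // step 2: jump while different
            if (up[j][a] != up[j][b]) { a = up[j][a]; b = up[j][b]; }
        return up[0][a];
    }
};
```

For nodes $5$ and $8$ of the example tree, the first step moves from node $8$ to node $6$,
no jump is made in the second step, and the result is the parent of node $6$, which is node $2$.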
\subsubsection{Method 2}
Another way to solve the problem is based on
a node array.
Once again, the idea is to traverse the nodes
using a depth-first search:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (2,1) {$4$};
\node[draw, circle] (3) at (-2,1) {$2$};
\node[draw, circle] (4) at (0,1) {$3$};
\node[draw, circle] (5) at (2,-1) {$7$};
\node[draw, circle] (6) at (-3,-1) {$5$};
\node[draw, circle] (7) at (-1,-1) {$6$};
\node[draw, circle] (8) at (-1,-3) {$8$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);
\path[draw,thick,-] (7) -- (8);
\path[draw=red,thick,->,line width=2pt] (1) edge [bend right=15] (3);
\path[draw=red,thick,->,line width=2pt] (3) edge [bend right=15] (6);
\path[draw=red,thick,->,line width=2pt] (6) edge [bend right=15] (3);
\path[draw=red,thick,->,line width=2pt] (3) edge [bend right=15] (7);
\path[draw=red,thick,->,line width=2pt] (7) edge [bend right=15] (8);
\path[draw=red,thick,->,line width=2pt] (8) edge [bend right=15] (7);
\path[draw=red,thick,->,line width=2pt] (7) edge [bend right=15] (3);
\path[draw=red,thick,->,line width=2pt] (3) edge [bend right=15] (1);
\path[draw=red,thick,->,line width=2pt] (1) edge [bend right=15] (4);
\path[draw=red,thick,->,line width=2pt] (4) edge [bend right=15] (1);
\path[draw=red,thick,->,line width=2pt] (1) edge [bend right=15] (2);
\path[draw=red,thick,->,line width=2pt] (2) edge [bend right=15] (5);
\path[draw=red,thick,->,line width=2pt] (5) edge [bend right=15] (2);
\path[draw=red,thick,->,line width=2pt] (2) edge [bend right=15] (1);
\end{tikzpicture}
\end{center}
However, we add a node to the node array \emph{every time}
the depth-first search visits the node,
and not only at the first visit.
Hence, a node that has $k$ children appears $k+1$ times
in the node array, and there are a total of $2n-1$
nodes in the array.

We store two values in the array:
(1) the identifier of the node, and (2) the level of the
node in the tree.
The following array corresponds to the above tree:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,1) grid (15,2);
%\node at (-1.1,1.5) {\texttt{node}};
\node at (0.5,1.5) {$1$};
\node at (1.5,1.5) {$2$};
\node at (2.5,1.5) {$5$};
\node at (3.5,1.5) {$2$};
\node at (4.5,1.5) {$6$};
\node at (5.5,1.5) {$8$};
\node at (6.5,1.5) {$6$};
\node at (7.5,1.5) {$2$};
\node at (8.5,1.5) {$1$};
\node at (9.5,1.5) {$3$};
\node at (10.5,1.5) {$1$};
\node at (11.5,1.5) {$4$};
\node at (12.5,1.5) {$7$};
\node at (13.5,1.5) {$4$};
\node at (14.5,1.5) {$1$};
\draw (0,0) grid (15,1);
%\node at (-1.1,0.5) {\texttt{depth}};
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$3$};
\node at (3.5,0.5) {$2$};
\node at (4.5,0.5) {$3$};
\node at (5.5,0.5) {$4$};
\node at (6.5,0.5) {$3$};
\node at (7.5,0.5) {$2$};
\node at (8.5,0.5) {$1$};
\node at (9.5,0.5) {$2$};
\node at (10.5,0.5) {$1$};
\node at (11.5,0.5) {$2$};
\node at (12.5,0.5) {$3$};
\node at (13.5,0.5) {$2$};
\node at (14.5,0.5) {$1$};
\footnotesize
\node at (0.5,2.5) {$1$};
\node at (1.5,2.5) {$2$};
\node at (2.5,2.5) {$3$};
\node at (3.5,2.5) {$4$};
\node at (4.5,2.5) {$5$};
\node at (5.5,2.5) {$6$};
\node at (6.5,2.5) {$7$};
\node at (7.5,2.5) {$8$};
\node at (8.5,2.5) {$9$};
\node at (9.5,2.5) {$10$};
\node at (10.5,2.5) {$11$};
\node at (11.5,2.5) {$12$};
\node at (12.5,2.5) {$13$};
\node at (13.5,2.5) {$14$};
\node at (14.5,2.5) {$15$};
\end{tikzpicture}
\end{center}
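The length of the array can be checked by counting the visits.
Let $c_x$ denote the number of children of node $x$
(this symbol is introduced here only for illustration).
Every node except the root is a child of exactly one node,
so the numbers $c_x$ sum to $n-1$, and
\[\sum_{x=1}^{n} (c_x+1) = (n-1)+n = 2n-1.\]
For the tree above, $n=8$ and the array indeed has $2 \cdot 8 - 1 = 15$ positions.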
Using this array, we can find the lowest common ancestor
of nodes $a$ and $b$ by finding the node with the minimum level
between nodes $a$ and $b$ in the array.
For example, the lowest common ancestor of nodes $5$ and $8$
can be found as follows:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (2,1) rectangle (3,2);
\fill[color=lightgray] (5,1) rectangle (6,2);
\fill[color=lightgray] (2,0) rectangle (6,1);
\node at (3.5,-0.5) {$\uparrow$};
\draw (0,1) grid (15,2);
\node at (0.5,1.5) {$1$};
\node at (1.5,1.5) {$2$};
\node at (2.5,1.5) {$5$};
\node at (3.5,1.5) {$2$};
\node at (4.5,1.5) {$6$};
\node at (5.5,1.5) {$8$};
\node at (6.5,1.5) {$6$};
\node at (7.5,1.5) {$2$};
\node at (8.5,1.5) {$1$};
\node at (9.5,1.5) {$3$};
\node at (10.5,1.5) {$1$};
\node at (11.5,1.5) {$4$};
\node at (12.5,1.5) {$7$};
\node at (13.5,1.5) {$4$};
\node at (14.5,1.5) {$1$};
\draw (0,0) grid (15,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$3$};
\node at (3.5,0.5) {$2$};
\node at (4.5,0.5) {$3$};
\node at (5.5,0.5) {$4$};
\node at (6.5,0.5) {$3$};
\node at (7.5,0.5) {$2$};
\node at (8.5,0.5) {$1$};
\node at (9.5,0.5) {$2$};
\node at (10.5,0.5) {$1$};
\node at (11.5,0.5) {$2$};
\node at (12.5,0.5) {$3$};
\node at (13.5,0.5) {$2$};
\node at (14.5,0.5) {$1$};
\footnotesize
\node at (0.5,2.5) {$1$};
\node at (1.5,2.5) {$2$};
\node at (2.5,2.5) {$3$};
\node at (3.5,2.5) {$4$};
\node at (4.5,2.5) {$5$};
\node at (5.5,2.5) {$6$};
\node at (6.5,2.5) {$7$};
\node at (7.5,2.5) {$8$};
\node at (8.5,2.5) {$9$};
\node at (9.5,2.5) {$10$};
\node at (10.5,2.5) {$11$};
\node at (11.5,2.5) {$12$};
\node at (12.5,2.5) {$13$};
\node at (13.5,2.5) {$14$};
\node at (14.5,2.5) {$15$};
\end{tikzpicture}
\end{center}
Node 5 is at position 3, node 8 is at position 6,
and the node with the minimum level between
positions $3 \ldots 6$ is node 2 at position 4
whose level is 2.
Thus, the lowest common ancestor of
nodes 5 and 8 is node 2.
Using a segment tree, we can find the lowest
common ancestor in $O(\log n)$ time.
Since the array is static, the time complexity
$O(1)$ is also possible, but this is rarely needed.
In both cases, the preprocessing takes at most $O(n \log n)$ time.
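The following C++ code is a sketch of this method under assumptions.
Note that it swaps in a sparse table for the segment tree,
which is one possible structure for minimum queries in a static array.
The nodes are numbered $1 \ldots n$, node $1$ is the root, positions are indexed from $0$,
and the names \texttt{LcaEuler}, \texttt{tour}, \texttt{first} and \texttt{lca} are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>
using namespace std;

// Sketch: lowest common ancestor via a node array that records every visit.
// Names are illustrative; nodes are 1..n and node 1 is assumed to be the root.
struct LcaEuler {
    int n;
    vector<int> first;                    // first position of each node (0-indexed)
    vector<pair<int,int>> tour;           // (level, node) for every visit
    vector<vector<pair<int,int>>> table;  // table[j][i] = min of tour[i .. i+2^j-1]

    LcaEuler(int n, const vector<pair<int,int>>& edges) : n(n), first(n + 1, -1) {
        vector<vector<int>> adj(n + 1);
        for (const auto& e : edges) {
            adj[e.first].push_back(e.second);
            adj[e.second].push_back(e.first);
        }
        dfs(1, 0, 1, adj);
        int m = (int)tour.size();
        table.push_back(tour);
        for (int j = 1; (1 << j) <= m; j++) {
            table.push_back(vector<pair<int,int>>(m - (1 << j) + 1));
            for (int i = 0; i + (1 << j) <= m; i++)
                table[j][i] = min(table[j - 1][i],
                                  table[j - 1][i + (1 << (j - 1))]);
        }
    }
    // Recursive depth-first search; a very deep tree may need an iterative version.
    void dfs(int x, int p, int level, const vector<vector<int>>& adj) {
        first[x] = (int)tour.size();
        tour.push_back({level, x});
        for (int y : adj[x])
            if (y != p) {
                dfs(y, x, level + 1, adj);
                tour.push_back({level, x}); // the search returns to x
            }
    }
    // Node with the minimum level between the first positions of a and b.
    int lca(int a, int b) const {
        assert(1 <= a && a <= n && 1 <= b && b <= n);
        int l = first[a], r = first[b];
        if (l > r) swap(l, r);
        int j = 0;
        while ((1 << (j + 1)) <= r - l + 1) j++;
        return min(table[j][l], table[j][r - (1 << j) + 1]).second;
    }
};
```

Building the table takes $O(n \log n)$ time.
A query combines two overlapping ranges of the table;
the loop that finds $j$ could be replaced by a precalculated logarithm table
to reach $O(1)$ time per query.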
\subsubsection{Distances of nodes}
Finally, let us consider the problem of
finding the distance between
two nodes in the tree, which equals
the length of the path between them.
It turns out that this problem reduces to
finding the lowest common ancestor of the nodes.
First, we choose an arbitrary node as the
root of the tree.
After this, the distance between nodes $a$ and $b$
is $d(a)+d(b)-2 \cdot d(c)$,
where $c$ is the lowest common ancestor of $a$ and $b$
and $d(s)$ denotes the distance from the root node
to node $s$.
For example, in the tree
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (2,1) {$4$};
\node[draw, circle] (3) at (-2,1) {$2$};
\node[draw, circle] (4) at (0,1) {$3$};
\node[draw, circle] (5) at (2,-1) {$7$};
\node[draw, circle] (6) at (-3,-1) {$5$};
\node[draw, circle] (7) at (-1,-1) {$6$};
\node[draw, circle] (8) at (-1,-3) {$8$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);
\path[draw,thick,-] (7) -- (8);
\path[draw=red,thick,-,line width=2pt] (8) -- node[font=\small] {} (7);
\path[draw=red,thick,-,line width=2pt] (7) -- node[font=\small] {} (3);
\path[draw=red,thick,-,line width=2pt] (6) -- node[font=\small] {} (3);
\end{tikzpicture}
\end{center}
the lowest common ancestor of nodes 5 and 8 is node 2.
A path from node 5 to node 8
goes first upwards from node 5 to node 2,
and then downwards from node 2 to node 8.
The distances of the nodes from the root are
$d(5)=2$, $d(8)=3$ and $d(2)=1$,
so the distance between nodes 5 and 8 is
$2+3-2\cdot1=3$.
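The formula can be tried out with the following C++ sketch.
It is only an illustration under assumptions:
the tree is given as a parent array where the parent of the root is $0$,
the names \texttt{depthOf} and \texttt{distanceBetween} are hypothetical,
and the lowest common ancestor is found naively by walking upwards,
so a query takes $O(n)$ time.
In practice, one of the faster methods above would be used for that step.
Note that the formula gives the same result whether $d(s)$ counts edges or levels,
because a constant offset cancels out.

```cpp
#include <cassert>
#include <vector>
using namespace std;

// Number of edges on the path from the root to x (parent[root] = 0).
int depthOf(int x, const vector<int>& parent) {
    int d = 0;
    while (parent[x] != 0) { x = parent[x]; d++; }
    return d;
}

// Distance between a and b as d(a) + d(b) - 2 * d(c), where c is the
// lowest common ancestor, found here by a naive walk towards the root.
int distanceBetween(int a, int b, const vector<int>& parent) {
    int n = (int)parent.size() - 1;
    assert(1 <= a && a <= n && 1 <= b && b <= n);
    int da = depthOf(a, parent), db = depthOf(b, parent);
    int x = a, dx = da, y = b, dy = db;
    while (dx > dy) { x = parent[x]; dx--; }  // bring x to the level of y
    while (dy > dx) { y = parent[y]; dy--; }  // bring y to the level of x
    while (x != y) { x = parent[x]; y = parent[y]; dx--; }
    return da + db - 2 * dx;                  // dx is now d(c)
}
```

For the tree above, the parent array is $[0,1,1,1,2,2,4,6]$ for nodes $1 \ldots 8$.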