\chapter{Tree queries}
\index{tree query}

This chapter discusses techniques for
processing queries related
to subtrees and paths of a rooted tree.
Examples of such queries are:
\begin{itemize}
\item what is the $k$th ancestor of a node?
\item what is the sum of values in the subtree of a node?
\item what is the sum of values in a path between two nodes?
\item what is the lowest common ancestor of two nodes?
\end{itemize}
\section{Finding ancestors}
\index{ancestor}
The $k$th \key{ancestor} of a node $x$ in a rooted tree
is the node that we will reach if we move $k$
levels up from $x$.
Let $f(x,k)$ denote the $k$th ancestor of $x$.
For example, in the following tree, $f(2,1)=1$ and $f(8,2)=4$.
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (2,1) {$2$};
\node[draw, circle] (3) at (-2,1) {$4$};
\node[draw, circle] (4) at (0,1) {$5$};
\node[draw, circle] (5) at (2,-1) {$6$};
\node[draw, circle] (6) at (-3,-1) {$3$};
\node[draw, circle] (7) at (-1,-1) {$7$};
\node[draw, circle] (8) at (-1,-3) {$8$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);
\path[draw,thick,-] (7) -- (8);
\path[draw=red,thick,->,line width=2pt] (8) edge [bend left] (3);
\path[draw=red,thick,->,line width=2pt] (2) edge [bend right] (1);
\end{tikzpicture}
\end{center}
An easy way to calculate the value of $f(x,k)$
is to perform a sequence of $k$ moves in the tree.
However, this method takes $O(k)$ time,
which is $O(n)$ in the worst case, because the tree may contain
a chain of $O(n)$ nodes.

Fortunately, using a technique similar to that
used in Chapter 16.3, any value of $f(x,k)$
can be calculated in $O(\log k)$ time
after preprocessing.
The idea is to precalculate all values $f(x,k)$
where $k$ is a power of two.
For example, the values for the above tree
are as follows:
\begin{center}
\begin{tabular}{r|rrrrrrrr}
$x$ & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\
\hline
$f(x,1)$ & 0 & 1 & 4 & 1 & 1 & 2 & 4 & 7 \\
$f(x,2)$ & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 4 \\
$f(x,4)$ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
$\cdots$ \\
\end{tabular}
\end{center}
The value $0$ means that the $k$th ancestor
of a node does not exist.
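
The rows of the table can be computed one after another,
because moving $2k$ levels up is the same as moving $k$ levels up twice.
As a sketch, using the convention $f(0,k)=0$
(our assumption, which agrees with the meaning of the value $0$ above),
\[f(x,2k)=f(f(x,k),k).\]
For example, $f(8,2)=f(f(8,1),1)=f(7,1)=4$
and $f(8,4)=f(f(8,2),2)=f(4,2)=0$.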

The preprocessing takes $O(n \log n)$ time,
because each node can have at most $n$ ancestors,
so it suffices to store $O(\log n)$ values for each node.
After this, any value of $f(x,k)$ can be calculated
in $O(\log k)$ time by representing $k$
as a sum where each term is a power of two.
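
The preprocessing and the query can be sketched in C++ as follows.
This is only an illustration under stated assumptions, not a definitive implementation:
the names \texttt{AncestorTable}, \texttt{anc} and \texttt{kth} are hypothetical,
the nodes are assumed to be numbered $1,2,\ldots,n$,
and \texttt{parent[x]} is assumed to contain the parent of node $x$,
where \texttt{parent[0]} and the parent of the root are $0$.

\begin{lstlisting}
#include <cassert>
#include <vector>
using namespace std;

// Illustrative sketch: anc[j][x] stores f(x,2^j),
// and 0 means that the ancestor does not exist.
struct AncestorTable {
    int n, levels = 1;
    vector<vector<int>> anc;

    // Assumption: parent[0] == 0 and the parent of the root is 0.
    AncestorTable(const vector<int>& parent) : n(parent.size() - 1) {
        // 2^levels >= n, so every possible k < n fits in 'levels' bits
        while ((1 << levels) < n) levels++;
        anc.assign(levels, vector<int>(n + 1, 0));
        anc[0] = parent;
        for (int j = 1; j < levels; j++)
            for (int x = 1; x <= n; x++)
                anc[j][x] = anc[j - 1][anc[j - 1][x]];
    }

    // Returns f(x,k), or 0 if node x has fewer than k ancestors.
    int kth(int x, int k) const {
        assert(1 <= x && x <= n && k >= 0);
        if (k >= (1 << levels)) return 0;  // a node has at most n-1 ancestors
        for (int j = 0; j < levels; j++)
            if (k & (1 << j)) x = anc[j][x];  // anc[j][0] is always 0
        return x;
    }
};
\end{lstlisting}

For the example tree, where the parents of nodes $1,2,\ldots,8$ are
$0,1,4,1,1,2,4,7$, the call \texttt{kth(8,2)} returns $4$
and \texttt{kth(8,4)} returns $0$.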
\section{Subtrees and paths}
\index{tree traversal array}

A \key{tree traversal array} contains the nodes of a rooted tree
in the order in which a depth-first search
from the root node visits them.
For example, in the tree
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (-3,1) {$2$};
\node[draw, circle] (3) at (-1,1) {$3$};
\node[draw, circle] (4) at (1,1) {$4$};
\node[draw, circle] (5) at (3,1) {$5$};
\node[draw, circle] (6) at (-3,-1) {$6$};
\node[draw, circle] (7) at (-0.5,-1) {$7$};
\node[draw, circle] (8) at (1,-1) {$8$};
\node[draw, circle] (9) at (2.5,-1) {$9$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (1) -- (5);
\path[draw,thick,-] (2) -- (6);
\path[draw,thick,-] (4) -- (7);
\path[draw,thick,-] (4) -- (8);
\path[draw,thick,-] (4) -- (9);
\end{tikzpicture}
\end{center}
a depth-first search proceeds as follows:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (-3,1) {$2$};
\node[draw, circle] (3) at (-1,1) {$3$};
\node[draw, circle] (4) at (1,1) {$4$};
\node[draw, circle] (5) at (3,1) {$5$};
\node[draw, circle] (6) at (-3,-1) {$6$};
\node[draw, circle] (7) at (-0.5,-1) {$7$};
\node[draw, circle] (8) at (1,-1) {$8$};
\node[draw, circle] (9) at (2.5,-1) {$9$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (1) -- (5);
\path[draw,thick,-] (2) -- (6);
\path[draw,thick,-] (4) -- (7);
\path[draw,thick,-] (4) -- (8);
\path[draw,thick,-] (4) -- (9);
\path[draw=red,thick,->,line width=2pt] (1) edge [bend right=15] (2);
\path[draw=red,thick,->,line width=2pt] (2) edge [bend right=15] (6);
\path[draw=red,thick,->,line width=2pt] (6) edge [bend right=15] (2);
\path[draw=red,thick,->,line width=2pt] (2) edge [bend right=15] (1);
\path[draw=red,thick,->,line width=2pt] (1) edge [bend right=15] (3);
\path[draw=red,thick,->,line width=2pt] (3) edge [bend right=15] (1);
\path[draw=red,thick,->,line width=2pt] (1) edge [bend right=15] (4);
\path[draw=red,thick,->,line width=2pt] (4) edge [bend right=15] (7);
\path[draw=red,thick,->,line width=2pt] (7) edge [bend right=15] (4);
\path[draw=red,thick,->,line width=2pt] (4) edge [bend right=15] (8);
\path[draw=red,thick,->,line width=2pt] (8) edge [bend right=15] (4);
\path[draw=red,thick,->,line width=2pt] (4) edge [bend right=15] (9);
\path[draw=red,thick,->,line width=2pt] (9) edge [bend right=15] (4);
\path[draw=red,thick,->,line width=2pt] (4) edge [bend right=15] (1);
\path[draw=red,thick,->,line width=2pt] (1) edge [bend right=15] (5);
\path[draw=red,thick,->,line width=2pt] (5) edge [bend right=15] (1);
\end{tikzpicture}
\end{center}
Hence, the corresponding tree traversal array is as follows:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (9,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$6$};
\node at (3.5,0.5) {$3$};
\node at (4.5,0.5) {$4$};
\node at (5.5,0.5) {$7$};
\node at (6.5,0.5) {$8$};
\node at (7.5,0.5) {$9$};
\node at (8.5,0.5) {$5$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\node at (8.5,1.4) {$9$};
\end{tikzpicture}
\end{center}
\subsubsection{Subtree queries}
Each subtree of a tree corresponds to a subarray
in the tree traversal array such that
the first element in the subarray is the root node of the subtree.
For example, the following subarray contains the
nodes in the subtree of node $4$:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (4,0) rectangle (8,1);
\draw (0,0) grid (9,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$6$};
\node at (3.5,0.5) {$3$};
\node at (4.5,0.5) {$4$};
\node at (5.5,0.5) {$7$};
\node at (6.5,0.5) {$8$};
\node at (7.5,0.5) {$9$};
\node at (8.5,0.5) {$5$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\node at (8.5,1.4) {$9$};
\end{tikzpicture}
\end{center}
Using this fact, we can efficiently process queries
that are related to subtrees of a tree.
As an example, consider a problem where each node
is assigned a value, and our task is to support
the following queries:
\begin{itemize}
\item update the value of a node
\item calculate the sum of values in the subtree of a node
\end{itemize}
Consider the following tree where the blue numbers
are the values of the nodes.
For example, the sum of values in the subtree of node $4$
is $3+4+3+1=11$.
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (-3,1) {$2$};
\node[draw, circle] (3) at (-1,1) {$3$};
\node[draw, circle] (4) at (1,1) {$4$};
\node[draw, circle] (5) at (3,1) {$5$};
\node[draw, circle] (6) at (-3,-1) {$6$};
\node[draw, circle] (7) at (-0.5,-1) {$7$};
\node[draw, circle] (8) at (1,-1) {$8$};
\node[draw, circle] (9) at (2.5,-1) {$9$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (1) -- (5);
\path[draw,thick,-] (2) -- (6);
\path[draw,thick,-] (4) -- (7);
\path[draw,thick,-] (4) -- (8);
\path[draw,thick,-] (4) -- (9);
\node[color=blue] at (0,3+0.65) {2};
\node[color=blue] at (-3-0.65,1) {3};
\node[color=blue] at (-1-0.65,1) {5};
\node[color=blue] at (1+0.65,1) {3};
\node[color=blue] at (3+0.65,1) {1};
\node[color=blue] at (-3,-1-0.65) {4};
\node[color=blue] at (-0.5,-1-0.65) {4};
\node[color=blue] at (1,-1-0.65) {3};
\node[color=blue] at (2.5,-1-0.65) {1};
\end{tikzpicture}
\end{center}
The idea is to construct a tree traversal array that contains
three values for each node: (1) the identifier of the node,
(2) the size of the subtree of the node, and (3) the value of the node.
For example, the array for the above tree is as follows:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,1) grid (9,-2);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$6$};
\node at (3.5,0.5) {$3$};
\node at (4.5,0.5) {$4$};
\node at (5.5,0.5) {$7$};
\node at (6.5,0.5) {$8$};
\node at (7.5,0.5) {$9$};
\node at (8.5,0.5) {$5$};
\node at (0.5,-0.5) {$9$};
\node at (1.5,-0.5) {$2$};
\node at (2.5,-0.5) {$1$};
\node at (3.5,-0.5) {$1$};
\node at (4.5,-0.5) {$4$};
\node at (5.5,-0.5) {$1$};
\node at (6.5,-0.5) {$1$};
\node at (7.5,-0.5) {$1$};
\node at (8.5,-0.5) {$1$};
\node at (0.5,-1.5) {$2$};
\node at (1.5,-1.5) {$3$};
\node at (2.5,-1.5) {$4$};
\node at (3.5,-1.5) {$5$};
\node at (4.5,-1.5) {$3$};
\node at (5.5,-1.5) {$4$};
\node at (6.5,-1.5) {$3$};
\node at (7.5,-1.5) {$1$};
\node at (8.5,-1.5) {$1$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\node at (8.5,1.4) {$9$};
\end{tikzpicture}
\end{center}
Using this array, we can calculate the sum of values
in any subtree by first looking up the size of the subtree
and then summing the values of the corresponding nodes,
which are at consecutive positions in the array.
For example, the values in the subtree of node $4$
can be found as follows:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (4,1) rectangle (5,0);
\fill[color=lightgray] (4,0) rectangle (5,-1);
\fill[color=lightgray] (4,-1) rectangle (8,-2);
\draw (0,1) grid (9,-2);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$6$};
\node at (3.5,0.5) {$3$};
\node at (4.5,0.5) {$4$};
\node at (5.5,0.5) {$7$};
\node at (6.5,0.5) {$8$};
\node at (7.5,0.5) {$9$};
\node at (8.5,0.5) {$5$};
\node at (0.5,-0.5) {$9$};
\node at (1.5,-0.5) {$2$};
\node at (2.5,-0.5) {$1$};
\node at (3.5,-0.5) {$1$};
\node at (4.5,-0.5) {$4$};
\node at (5.5,-0.5) {$1$};
\node at (6.5,-0.5) {$1$};
\node at (7.5,-0.5) {$1$};
\node at (8.5,-0.5) {$1$};
\node at (0.5,-1.5) {$2$};
\node at (1.5,-1.5) {$3$};
\node at (2.5,-1.5) {$4$};
\node at (3.5,-1.5) {$5$};
\node at (4.5,-1.5) {$3$};
\node at (5.5,-1.5) {$4$};
\node at (6.5,-1.5) {$3$};
\node at (7.5,-1.5) {$1$};
\node at (8.5,-1.5) {$1$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\node at (8.5,1.4) {$9$};
\end{tikzpicture}
\end{center}
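
As a worked version of this lookup, suppose that node $s$ is at position $p$
in the array and that the size of its subtree is $z$.
Then the subtree occupies positions $p,p+1,\ldots,p+z-1$,
and the required sum is
\[\sum_{i=p}^{p+z-1} v_i,\]
where $v_i$ denotes the value at position $i$
(the symbols $p$, $z$ and $v_i$ are our own notation).
For node $4$ we have $p=5$ and $z=4$, so the sum is
$v_5+v_6+v_7+v_8=3+4+3+1=11$.
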
To answer the queries efficiently,
it suffices to store the values of the
nodes, in the order of the array, in a binary indexed tree or segment tree.
After this, we can both update a value
and calculate the sum of values in $O(\log n)$ time.
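
One possible way to implement this idea is sketched below.
The sketch is not a definitive implementation and makes several assumptions:
the nodes are numbered $1,2,\ldots,n$, node $1$ is the root,
\texttt{adj} is an adjacency list of the tree,
and \texttt{value[s]} is the initial value of node $s$ (index $0$ is unused).
The name \texttt{SubtreeSums} and its members are illustrative,
and a binary indexed tree is used for the sums.

\begin{lstlisting}
#include <cassert>
#include <vector>
using namespace std;

struct SubtreeSums {
    int n, timer = 0;
    vector<int> pos, sz;         // position in the array, subtree size
    vector<long long> val, bit;  // node values, binary indexed tree

    SubtreeSums(const vector<vector<int>>& adj,
                const vector<long long>& value)
        : n(adj.size() - 1), pos(n + 1), sz(n + 1),
          val(n + 1, 0), bit(n + 1, 0) {
        assert(value.size() == adj.size());
        dfs(1, 0, adj);
        for (int s = 1; s <= n; s++) update(s, value[s]);
    }

    // Depth-first search that assigns positions 1..n and subtree sizes.
    void dfs(int s, int e, const vector<vector<int>>& adj) {
        pos[s] = ++timer;
        sz[s] = 1;
        for (int u : adj[s]) {
            if (u == e) continue;
            dfs(u, s, adj);
            sz[s] += sz[u];
        }
    }

    void add(int k, long long x) {
        for (; k <= n; k += k & -k) bit[k] += x;
    }
    long long prefix(int k) const {
        long long s = 0;
        for (; k >= 1; k -= k & -k) s += bit[k];
        return s;
    }

    // Sets the value of node s to x.
    void update(int s, long long x) {
        add(pos[s], x - val[s]);
        val[s] = x;
    }
    // Returns the sum of values in the subtree of node s.
    long long query(int s) const {
        return prefix(pos[s] + sz[s] - 1) - prefix(pos[s] - 1);
    }
};
\end{lstlisting}

With the example tree and values, \texttt{query(4)} returns $11$,
because node $4$ is at position $5$ and the size of its subtree is $4$.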
\subsubsection{Path queries}
Using a tree traversal array, we can also efficiently
calculate sums of values on
paths from the root node to any other
node in the tree.
Let us next consider a problem where our task
is to support the following queries:
\begin{itemize}
\item change the value of a node
\item calculate the sum of values on a path from
the root to a node
\end{itemize}
For example, in the following tree,
the sum of values from the root node to node 7 is
$4+5+5=14$:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (-3,1) {$2$};
\node[draw, circle] (3) at (-1,1) {$3$};
\node[draw, circle] (4) at (1,1) {$4$};
\node[draw, circle] (5) at (3,1) {$5$};
\node[draw, circle] (6) at (-3,-1) {$6$};
\node[draw, circle] (7) at (-0.5,-1) {$7$};
\node[draw, circle] (8) at (1,-1) {$8$};
\node[draw, circle] (9) at (2.5,-1) {$9$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (1) -- (5);
\path[draw,thick,-] (2) -- (6);
\path[draw,thick,-] (4) -- (7);
\path[draw,thick,-] (4) -- (8);
\path[draw,thick,-] (4) -- (9);
\node[color=blue] at (0,3+0.65) {4};
\node[color=blue] at (-3-0.65,1) {5};
\node[color=blue] at (-1-0.65,1) {3};
\node[color=blue] at (1+0.65,1) {5};
\node[color=blue] at (3+0.65,1) {2};
\node[color=blue] at (-3,-1-0.65) {3};
\node[color=blue] at (-0.5,-1-0.65) {5};
\node[color=blue] at (1,-1-0.65) {3};
\node[color=blue] at (2.5,-1-0.65) {1};
\end{tikzpicture}
\end{center}
We can solve this problem in a similar way as before,
but now each value in the last row of the array is the sum of values
on the path from the root to the corresponding node.
For example, the following array corresponds to the above tree:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,1) grid (9,-2);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$6$};
\node at (3.5,0.5) {$3$};
\node at (4.5,0.5) {$4$};
\node at (5.5,0.5) {$7$};
\node at (6.5,0.5) {$8$};
\node at (7.5,0.5) {$9$};
\node at (8.5,0.5) {$5$};
\node at (0.5,-0.5) {$9$};
\node at (1.5,-0.5) {$2$};
\node at (2.5,-0.5) {$1$};
\node at (3.5,-0.5) {$1$};
\node at (4.5,-0.5) {$4$};
\node at (5.5,-0.5) {$1$};
\node at (6.5,-0.5) {$1$};
\node at (7.5,-0.5) {$1$};
\node at (8.5,-0.5) {$1$};
\node at (0.5,-1.5) {$4$};
\node at (1.5,-1.5) {$9$};
\node at (2.5,-1.5) {$12$};
\node at (3.5,-1.5) {$7$};
\node at (4.5,-1.5) {$9$};
\node at (5.5,-1.5) {$14$};
\node at (6.5,-1.5) {$12$};
\node at (7.5,-1.5) {$10$};
\node at (8.5,-1.5) {$6$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\node at (8.5,1.4) {$9$};
\end{tikzpicture}
\end{center}
When the value of a node increases by $x$,
the sums of all nodes in its subtree,
including the node itself, increase by $x$.
For example, if the value of node 4 increases by 1,
the array changes as follows:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (4,-1) rectangle (8,-2);
\draw (0,1) grid (9,-2);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$6$};
\node at (3.5,0.5) {$3$};
\node at (4.5,0.5) {$4$};
\node at (5.5,0.5) {$7$};
\node at (6.5,0.5) {$8$};
\node at (7.5,0.5) {$9$};
\node at (8.5,0.5) {$5$};
\node at (0.5,-0.5) {$9$};
\node at (1.5,-0.5) {$2$};
\node at (2.5,-0.5) {$1$};
\node at (3.5,-0.5) {$1$};
\node at (4.5,-0.5) {$4$};
\node at (5.5,-0.5) {$1$};
\node at (6.5,-0.5) {$1$};
\node at (7.5,-0.5) {$1$};
\node at (8.5,-0.5) {$1$};
\node at (0.5,-1.5) {$4$};
\node at (1.5,-1.5) {$9$};
\node at (2.5,-1.5) {$12$};
\node at (3.5,-1.5) {$7$};
\node at (4.5,-1.5) {$10$};
\node at (5.5,-1.5) {$15$};
\node at (6.5,-1.5) {$13$};
\node at (7.5,-1.5) {$11$};
\node at (8.5,-1.5) {$6$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\node at (8.5,1.4) {$9$};
\end{tikzpicture}
\end{center}
Thus, to support both operations,
we should be able to increase all values
in a range and retrieve a single value.
This can be done in $O(\log n)$ time
using a binary indexed tree
or segment tree (see Chapter 9.4).
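
The following C++ sketch shows one way to combine these operations.
It is an illustration under assumptions rather than a definitive implementation:
the nodes are numbered $1,2,\ldots,n$, node $1$ is the root,
\texttt{adj} is an adjacency list and \texttt{value[s]} is the initial value of node $s$.
The name \texttt{RootPathSums} and its members are hypothetical.
The binary indexed tree stores differences of adjacent path sums,
so that a range update consists of two point updates
and a single value is a prefix sum.

\begin{lstlisting}
#include <cassert>
#include <vector>
using namespace std;

struct RootPathSums {
    int n, timer = 0;
    vector<int> pos, sz;         // position in the array, subtree size
    vector<long long> val, bit;  // node values, tree over differences

    RootPathSums(const vector<vector<int>>& adj,
                 const vector<long long>& value)
        : n(adj.size() - 1), pos(n + 1), sz(n + 1),
          val(n + 1, 0), bit(n + 1, 0) {
        assert(value.size() == adj.size());
        dfs(1, 0, adj);
        for (int s = 1; s <= n; s++) update(s, value[s]);
    }

    void dfs(int s, int e, const vector<vector<int>>& adj) {
        pos[s] = ++timer;
        sz[s] = 1;
        for (int u : adj[s]) {
            if (u == e) continue;
            dfs(u, s, adj);
            sz[s] += sz[u];
        }
    }

    void add(int k, long long x) {
        for (; k <= n; k += k & -k) bit[k] += x;
    }

    // Sets the value of node s to x. The path sums at positions
    // pos[s] .. pos[s]+sz[s]-1 all change by x-val[s].
    void update(int s, long long x) {
        long long d = x - val[s];
        add(pos[s], d);
        add(pos[s] + sz[s], -d);  // does nothing if the range ends at n
        val[s] = x;
    }
    // Returns the sum of values on the path from the root to node s.
    long long query(int s) const {
        long long sum = 0;
        for (int k = pos[s]; k >= 1; k -= k & -k) sum += bit[k];
        return sum;
    }
};
\end{lstlisting}

With the example values, \texttt{query(7)} returns $14$,
and after \texttt{update(4,6)}, which increases the value of node 4 by 1,
it returns $15$.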
\section{Lowest common ancestor}
\index{lowest common ancestor}
The \key{lowest common ancestor}
of two nodes in a rooted tree is the lowest node
whose subtree contains both nodes.
A typical problem is to efficiently process
queries that ask to find the lowest
common ancestor of two given nodes.
For example, in the following tree,
the lowest common ancestor of nodes 5 and 8
is node 2:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (2,1) {$4$};
\node[draw, circle] (3) at (-2,1) {$2$};
\node[draw, circle] (4) at (0,1) {$3$};
\node[draw, circle] (5) at (2,-1) {$7$};
\node[draw, circle, fill=lightgray] (6) at (-3,-1) {$5$};
\node[draw, circle] (7) at (-1,-1) {$6$};
\node[draw, circle, fill=lightgray] (8) at (-1,-3) {$8$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);
\path[draw,thick,-] (7) -- (8);
\path[draw=red,thick,->,line width=2pt] (6) edge [bend left] (3);
\path[draw=red,thick,->,line width=2pt] (8) edge [bend right=40] (3);
\end{tikzpicture}
\end{center}
Next we will discuss two efficient techniques for
finding the lowest common ancestor of two nodes.

\subsubsection{Method 1}

One way to solve the problem is to use the fact
that we can efficiently find the $k$th
ancestor of any node in the tree.
Using this, we can divide the problem of
finding the lowest common ancestor into two parts.

We use two pointers that initially point to the
two nodes whose
lowest common ancestor we should find.
First, we move one of the pointers upwards
so that both pointers point to nodes at the same level.
In the example case, we move the second pointer from node 8 to node 6,
after which both pointers point to nodes at the same level:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (2,1) {$4$};
\node[draw, circle] (3) at (-2,1) {$2$};
\node[draw, circle] (4) at (0,1) {$3$};
\node[draw, circle] (5) at (2,-1) {$7$};
\node[draw, circle,fill=lightgray] (6) at (-3,-1) {$5$};
\node[draw, circle,fill=lightgray] (7) at (-1,-1) {$6$};
\node[draw, circle] (8) at (-1,-3) {$8$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);
\path[draw,thick,-] (7) -- (8);
\path[draw=red,thick,->,line width=2pt] (8) edge [bend right] (7);
\end{tikzpicture}
\end{center}
After this, we determine the minimum number of steps
needed to move both pointers upwards so that
they will point to the same node.
The node to which the pointers then point
is the lowest common ancestor of the nodes.

In the example case, it suffices to move both pointers
one step upwards to node 2,
which is the lowest common ancestor:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (2,1) {$4$};
\node[draw, circle,fill=lightgray] (3) at (-2,1) {$2$};
\node[draw, circle] (4) at (0,1) {$3$};
\node[draw, circle] (5) at (2,-1) {$7$};
\node[draw, circle] (6) at (-3,-1) {$5$};
\node[draw, circle] (7) at (-1,-1) {$6$};
\node[draw, circle] (8) at (-1,-3) {$8$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);
\path[draw,thick,-] (7) -- (8);
\path[draw=red,thick,->,line width=2pt] (6) edge [bend left] (3);
\path[draw=red,thick,->,line width=2pt] (7) edge [bend right] (3);
\end{tikzpicture}
\end{center}
Since both parts of the algorithm can be performed in
$O(\log n)$ time using precomputed information,
we can find the lowest common ancestor of any two
nodes in $O(\log n)$ time using this technique.
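
The two parts can be sketched in C++ as follows.
The code is a sketch under assumptions and not necessarily the intended implementation:
the nodes are numbered $1,2,\ldots,n$, node $1$ is the root,
\texttt{adj} is an adjacency list,
and the names \texttt{LcaLifting}, \texttt{anc} and \texttt{depth} are hypothetical.
Here \texttt{anc[j][x]} is the $2^j$th ancestor of node $x$ (or $0$ if it does not exist)
and \texttt{depth[x]} is the level of node $x$, where the level of the root is $1$.

\begin{lstlisting}
#include <cassert>
#include <utility>
#include <vector>
using namespace std;

struct LcaLifting {
    int n, levels = 1;
    vector<int> depth;
    vector<vector<int>> anc;

    LcaLifting(const vector<vector<int>>& adj)
        : n(adj.size() - 1), depth(n + 1, 0) {
        while ((1 << levels) < n) levels++;
        anc.assign(levels, vector<int>(n + 1, 0));
        dfs(1, 0, adj);
        for (int j = 1; j < levels; j++)
            for (int x = 1; x <= n; x++)
                anc[j][x] = anc[j - 1][anc[j - 1][x]];
    }

    void dfs(int s, int e, const vector<vector<int>>& adj) {
        anc[0][s] = e;
        depth[s] = depth[e] + 1;  // depth[0] is 0, so the root has level 1
        for (int u : adj[s])
            if (u != e) dfs(u, s, adj);
    }

    int lca(int a, int b) const {
        assert(1 <= a && a <= n && 1 <= b && b <= n);
        if (depth[a] < depth[b]) swap(a, b);
        // Part 1: move the lower pointer up to the level of the other.
        int diff = depth[a] - depth[b];
        for (int j = 0; j < levels; j++)
            if (diff & (1 << j)) a = anc[j][a];
        if (a == b) return a;
        // Part 2: make the largest jumps that keep the pointers apart.
        for (int j = levels - 1; j >= 0; j--) {
            if (anc[j][a] != anc[j][b]) {
                a = anc[j][a];
                b = anc[j][b];
            }
        }
        return anc[0][a];  // one step below the lowest common ancestor
    }
};
\end{lstlisting}

In the second part, the pointers stop just below the lowest common ancestor,
which is why the function returns the parent of the final node.
For the example tree, \texttt{lca(5,8)} returns $2$.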
\subsubsection{Method 2}
Another way to solve the problem is based on
a tree traversal array \cite{ben00}.
Once again, the idea is to traverse the nodes
using a depth-first search:
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (2,1) {$4$};
\node[draw, circle] (3) at (-2,1) {$2$};
\node[draw, circle] (4) at (0,1) {$3$};
\node[draw, circle] (5) at (2,-1) {$7$};
\node[draw, circle] (6) at (-3,-1) {$5$};
\node[draw, circle] (7) at (-1,-1) {$6$};
\node[draw, circle] (8) at (-1,-3) {$8$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);
\path[draw,thick,-] (7) -- (8);
\path[draw=red,thick,->,line width=2pt] (1) edge [bend right=15] (3);
\path[draw=red,thick,->,line width=2pt] (3) edge [bend right=15] (6);
\path[draw=red,thick,->,line width=2pt] (6) edge [bend right=15] (3);
\path[draw=red,thick,->,line width=2pt] (3) edge [bend right=15] (7);
\path[draw=red,thick,->,line width=2pt] (7) edge [bend right=15] (8);
\path[draw=red,thick,->,line width=2pt] (8) edge [bend right=15] (7);
\path[draw=red,thick,->,line width=2pt] (7) edge [bend right=15] (3);
\path[draw=red,thick,->,line width=2pt] (3) edge [bend right=15] (1);
\path[draw=red,thick,->,line width=2pt] (1) edge [bend right=15] (4);
\path[draw=red,thick,->,line width=2pt] (4) edge [bend right=15] (1);
\path[draw=red,thick,->,line width=2pt] (1) edge [bend right=15] (2);
\path[draw=red,thick,->,line width=2pt] (2) edge [bend right=15] (5);
\path[draw=red,thick,->,line width=2pt] (5) edge [bend right=15] (2);
\path[draw=red,thick,->,line width=2pt] (2) edge [bend right=15] (1);
\end{tikzpicture}
\end{center}
However, we use a slightly different tree
traversal array than before:
we add each node to the array \emph{every time}
the depth-first search walks through the node,
and not only at the first visit\footnote{A similar technique is sometimes called the
\key{Euler tour technique} \cite{tar84}.}.
Hence, a node that has $k$ children appears $k+1$ times
in the array and there are a total of $2n-1$
nodes in the array.
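
To see why, let $k_i$ denote the number of children of node $i$
(a notation introduced only for this remark).
Every node except the root is a child of exactly one node,
so $k_1+k_2+\cdots+k_n=n-1$, and the length of the array is
\[\sum_{i=1}^{n}(k_i+1)=(n-1)+n=2n-1.\]
For the tree above, $n=8$ and the array has $15$ elements.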

We store two values for each position in the array:
(1) the identifier of the node, and (2) the level of the
node in the tree.
The following array corresponds to the above tree:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,1) grid (15,2);
%\node at (-1.1,1.5) {\texttt{node}};
\node at (0.5,1.5) {$1$};
\node at (1.5,1.5) {$2$};
\node at (2.5,1.5) {$5$};
\node at (3.5,1.5) {$2$};
\node at (4.5,1.5) {$6$};
\node at (5.5,1.5) {$8$};
\node at (6.5,1.5) {$6$};
\node at (7.5,1.5) {$2$};
\node at (8.5,1.5) {$1$};
\node at (9.5,1.5) {$3$};
\node at (10.5,1.5) {$1$};
\node at (11.5,1.5) {$4$};
\node at (12.5,1.5) {$7$};
\node at (13.5,1.5) {$4$};
\node at (14.5,1.5) {$1$};
\draw (0,0) grid (15,1);
%\node at (-1.1,0.5) {\texttt{depth}};
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$3$};
\node at (3.5,0.5) {$2$};
\node at (4.5,0.5) {$3$};
\node at (5.5,0.5) {$4$};
\node at (6.5,0.5) {$3$};
\node at (7.5,0.5) {$2$};
\node at (8.5,0.5) {$1$};
\node at (9.5,0.5) {$2$};
\node at (10.5,0.5) {$1$};
\node at (11.5,0.5) {$2$};
\node at (12.5,0.5) {$3$};
\node at (13.5,0.5) {$2$};
\node at (14.5,0.5) {$1$};
\footnotesize
\node at (0.5,2.5) {$1$};
\node at (1.5,2.5) {$2$};
\node at (2.5,2.5) {$3$};
\node at (3.5,2.5) {$4$};
\node at (4.5,2.5) {$5$};
\node at (5.5,2.5) {$6$};
\node at (6.5,2.5) {$7$};
\node at (7.5,2.5) {$8$};
\node at (8.5,2.5) {$9$};
\node at (9.5,2.5) {$10$};
\node at (10.5,2.5) {$11$};
\node at (11.5,2.5) {$12$};
\node at (12.5,2.5) {$13$};
\node at (13.5,2.5) {$14$};
\node at (14.5,2.5) {$15$};
\end{tikzpicture}
\end{center}
Using this array, we can find the lowest common ancestor
of nodes $a$ and $b$ by finding the node with the minimum level
between nodes $a$ and $b$ in the array.
For example, the lowest common ancestor of nodes $5$ and $8$
can be found as follows:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (2,1) rectangle (3,2);
\fill[color=lightgray] (5,1) rectangle (6,2);
\fill[color=lightgray] (2,0) rectangle (6,1);
\node at (3.5,-0.5) {$\uparrow$};
\draw (0,1) grid (15,2);
\node at (0.5,1.5) {$1$};
\node at (1.5,1.5) {$2$};
\node at (2.5,1.5) {$5$};
\node at (3.5,1.5) {$2$};
\node at (4.5,1.5) {$6$};
\node at (5.5,1.5) {$8$};
\node at (6.5,1.5) {$6$};
\node at (7.5,1.5) {$2$};
\node at (8.5,1.5) {$1$};
\node at (9.5,1.5) {$3$};
\node at (10.5,1.5) {$1$};
\node at (11.5,1.5) {$4$};
\node at (12.5,1.5) {$7$};
\node at (13.5,1.5) {$4$};
\node at (14.5,1.5) {$1$};
\draw (0,0) grid (15,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$2$};
\node at (2.5,0.5) {$3$};
\node at (3.5,0.5) {$2$};
\node at (4.5,0.5) {$3$};
\node at (5.5,0.5) {$4$};
\node at (6.5,0.5) {$3$};
\node at (7.5,0.5) {$2$};
\node at (8.5,0.5) {$1$};
\node at (9.5,0.5) {$2$};
\node at (10.5,0.5) {$1$};
\node at (11.5,0.5) {$2$};
\node at (12.5,0.5) {$3$};
\node at (13.5,0.5) {$2$};
\node at (14.5,0.5) {$1$};
\footnotesize
\node at (0.5,2.5) {$1$};
\node at (1.5,2.5) {$2$};
\node at (2.5,2.5) {$3$};
\node at (3.5,2.5) {$4$};
\node at (4.5,2.5) {$5$};
\node at (5.5,2.5) {$6$};
\node at (6.5,2.5) {$7$};
\node at (7.5,2.5) {$8$};
\node at (8.5,2.5) {$9$};
\node at (9.5,2.5) {$10$};
\node at (10.5,2.5) {$11$};
\node at (11.5,2.5) {$12$};
\node at (12.5,2.5) {$13$};
\node at (13.5,2.5) {$14$};
\node at (14.5,2.5) {$15$};
\end{tikzpicture}
\end{center}
Node 5 is at position 3, node 8 is at position 6,
and the node with the minimum level between
positions $3 \ldots 6$ is node 2 at position 4
whose level is 2.
Thus, the lowest common ancestor of
nodes 5 and 8 is node 2.

In general, to find the lowest common ancestor
of two nodes it suffices to process a range
minimum query.
Since the array is static,
we can process such queries in $O(1)$ time
after an $O(n \log n)$ time preprocessing.
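
A possible implementation is sketched below.
It is an illustration under assumptions, not a definitive implementation:
the nodes are numbered $1,2,\ldots,n$, node $1$ is the root,
\texttt{adj} is an adjacency list,
and the names \texttt{LcaEuler}, \texttt{first} and \texttt{best} are hypothetical.
The range minimum queries are answered with a table that stores,
for every range whose length is a power of two,
the position of the minimum level.
Note that the positions in the code start from $0$,
while the positions in the figures start from $1$.

\begin{lstlisting}
#include <cassert>
#include <utility>
#include <vector>
using namespace std;

struct LcaEuler {
    vector<int> node, level;   // the two rows of the array
    vector<int> first;         // first position of each node
    vector<vector<int>> best;  // best[j][i]: position of the minimum
                               // level in positions i .. i+2^j-1

    LcaEuler(const vector<vector<int>>& adj) : first(adj.size(), -1) {
        dfs(1, 0, 1, adj);
        int m = node.size();
        best.push_back(vector<int>(m));
        for (int i = 0; i < m; i++) best[0][i] = i;
        for (int j = 1; (1 << j) <= m; j++) {
            best.push_back(vector<int>(m - (1 << j) + 1));
            for (int i = 0; i + (1 << j) <= m; i++) {
                int l = best[j - 1][i];
                int r = best[j - 1][i + (1 << (j - 1))];
                best[j][i] = level[l] <= level[r] ? l : r;
            }
        }
    }

    void dfs(int s, int e, int d, const vector<vector<int>>& adj) {
        first[s] = node.size();
        node.push_back(s);
        level.push_back(d);
        for (int u : adj[s]) {
            if (u == e) continue;
            dfs(u, s, d + 1, adj);
            node.push_back(s);  // the search walks through s again
            level.push_back(d);
        }
    }

    int lca(int a, int b) const {
        int l = first[a], r = first[b];
        assert(l >= 0 && r >= 0);
        if (l > r) swap(l, r);
        // Largest power of two that fits in the range. This loop could
        // be replaced by a precomputed table to make the query O(1).
        int j = 0;
        while ((1 << (j + 1)) <= r - l + 1) j++;
        int x = best[j][l], y = best[j][r - (1 << j) + 1];
        return node[level[x] <= level[y] ? x : y];
    }
};
\end{lstlisting}

For the example tree, the sketch produces the array shown above,
and \texttt{lca(5,8)} returns $2$.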
\subsubsection{Distances of nodes}
Finally, let us consider the problem of
finding the distance between
two nodes in the tree, which equals
the length of the path between them.
It turns out that this problem reduces to
finding the lowest common ancestor of the nodes.

First, we root the tree arbitrarily.
After this, the distance between nodes $a$ and $b$
can be calculated using the formula
\[d(a)+d(b)-2 \cdot d(c),\]
where $c$ is the lowest common ancestor of $a$ and $b$
and $d(s)$ denotes the distance from the root node
to node $s$.
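
The formula can be justified by splitting the path at node $c$:
the part from $a$ up to $c$ consists of $d(a)-d(c)$ edges
and the part from $c$ down to $b$ consists of $d(b)-d(c)$ edges, so
\[(d(a)-d(c))+(d(b)-d(c))=d(a)+d(b)-2 \cdot d(c).\]
Since only differences of the values of $d$ appear here,
the result does not change if the levels of the nodes
(where the level of the root is 1) are used instead of the distances.
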
For example, in the tree
\begin{center}
\begin{tikzpicture}[scale=0.9]
\node[draw, circle] (1) at (0,3) {$1$};
\node[draw, circle] (2) at (2,1) {$4$};
\node[draw, circle] (3) at (-2,1) {$2$};
\node[draw, circle] (4) at (0,1) {$3$};
\node[draw, circle] (5) at (2,-1) {$7$};
\node[draw, circle] (6) at (-3,-1) {$5$};
\node[draw, circle] (7) at (-1,-1) {$6$};
\node[draw, circle] (8) at (-1,-3) {$8$};
\path[draw,thick,-] (1) -- (2);
\path[draw,thick,-] (1) -- (3);
\path[draw,thick,-] (1) -- (4);
\path[draw,thick,-] (2) -- (5);
\path[draw,thick,-] (3) -- (6);
\path[draw,thick,-] (3) -- (7);
\path[draw,thick,-] (7) -- (8);
\path[draw=red,thick,-,line width=2pt] (8) -- node[font=\small] {} (7);
\path[draw=red,thick,-,line width=2pt] (7) -- node[font=\small] {} (3);
\path[draw=red,thick,-,line width=2pt] (6) -- node[font=\small] {} (3);
\end{tikzpicture}
\end{center}
the lowest common ancestor of nodes 5 and 8 is node 2.
The path from node 5 to node 8
first ascends from node 5 to node 2
and then descends from node 2 to node 8.
The distances of the nodes from the root are
$d(5)=2$, $d(8)=3$ and $d(2)=1$,
so the distance between nodes 5 and 8 is
$2+3-2\cdot1=3$.