\chapter{Tree queries} \index{tree query} This chapter discusses techniques for processing queries on subtrees and paths of a rooted tree. For example, such queries are: \begin{itemize} \item what is the $k$th ancestor of a node? \item what is the sum of values in the subtree of a node? \item what is the sum of values on a path between two nodes? \item what is the lowest common ancestor of two nodes? \end{itemize} \section{Finding ancestors} \index{ancestor} The $k$th \key{ancestor} of a node $x$ in a rooted tree is the node that we will reach if we move $k$ levels up from $x$. Let $\texttt{ancestor}(x,k)$ denote the $k$th ancestor of a node $x$ (or $0$ if there is no such an ancestor). For example, in the following tree, $\texttt{ancestor}(2,1)=1$ and $\texttt{ancestor}(8,2)=4$. \begin{center} \begin{tikzpicture}[scale=0.9] \node[draw, circle] (1) at (0,3) {$1$}; \node[draw, circle] (2) at (2,1) {$2$}; \node[draw, circle] (3) at (-2,1) {$4$}; \node[draw, circle] (4) at (0,1) {$5$}; \node[draw, circle] (5) at (2,-1) {$6$}; \node[draw, circle] (6) at (-3,-1) {$3$}; \node[draw, circle] (7) at (-1,-1) {$7$}; \node[draw, circle] (8) at (-1,-3) {$8$}; \path[draw,thick,-] (1) -- (2); \path[draw,thick,-] (1) -- (3); \path[draw,thick,-] (1) -- (4); \path[draw,thick,-] (2) -- (5); \path[draw,thick,-] (3) -- (6); \path[draw,thick,-] (3) -- (7); \path[draw,thick,-] (7) -- (8); \path[draw=red,thick,->,line width=2pt] (8) edge [bend left] (3); \path[draw=red,thick,->,line width=2pt] (2) edge [bend right] (1); \end{tikzpicture} \end{center} An easy way to calculate any value of $\texttt{ancestor}(x,k)$ is to perform a sequence of $k$ moves in the tree. However, the time complexity of this method is $O(k)$, which may be slow, because a tree of $n$ nodes may have a chain of $n$ nodes. Fortunately, using a technique similar to that used in Chapter 16.3, any value of $\texttt{ancestor}(x,k)$ can be efficiently calculated in $O(\log k)$ time after preprocessing. The idea is to precalculate all values $\texttt{ancestor}(x,k)$ where $k \le n$ is a power of two. For example, the values for the above tree are as follows: \begin{center} \begin{tabular}{r|rrrrrrrrr} $x$ & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ \hline $\texttt{ancestor}(x,1)$ & 0 & 1 & 4 & 1 & 1 & 2 & 4 & 7 \\ $\texttt{ancestor}(x,2)$ & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 4 \\ $\texttt{ancestor}(x,4)$ & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ $\cdots$ \\ \end{tabular} \end{center} The preprocessing takes $O(n \log n)$ time, because $O(\log n)$ values are calculated for each node. After this, any value of $\texttt{ancestor}(x,k)$ can be calculated in $O(\log k)$ time by representing $k$ as a sum where each term is a power of two. \section{Subtrees and paths} \index{tree traversal array} A \key{tree traversal array} contains the nodes of a rooted tree in the order in which a depth-first search from the root node visits them. For example, in the tree \begin{center} \begin{tikzpicture}[scale=0.9] \node[draw, circle] (1) at (0,3) {$1$}; \node[draw, circle] (2) at (-3,1) {$2$}; \node[draw, circle] (3) at (-1,1) {$3$}; \node[draw, circle] (4) at (1,1) {$4$}; \node[draw, circle] (5) at (3,1) {$5$}; \node[draw, circle] (6) at (-3,-1) {$6$}; \node[draw, circle] (7) at (-0.5,-1) {$7$}; \node[draw, circle] (8) at (1,-1) {$8$}; \node[draw, circle] (9) at (2.5,-1) {$9$}; \path[draw,thick,-] (1) -- (2); \path[draw,thick,-] (1) -- (3); \path[draw,thick,-] (1) -- (4); \path[draw,thick,-] (1) -- (5); \path[draw,thick,-] (2) -- (6); \path[draw,thick,-] (4) -- (7); \path[draw,thick,-] (4) -- (8); \path[draw,thick,-] (4) -- (9); \end{tikzpicture} \end{center} a depth-first search proceeds as follows: \begin{center} \begin{tikzpicture}[scale=0.9] \node[draw, circle] (1) at (0,3) {$1$}; \node[draw, circle] (2) at (-3,1) {$2$}; \node[draw, circle] (3) at (-1,1) {$3$}; \node[draw, circle] (4) at (1,1) {$4$}; \node[draw, circle] (5) at (3,1) {$5$}; \node[draw, circle] (6) at (-3,-1) {$6$}; \node[draw, circle] (7) at (-0.5,-1) {$7$}; \node[draw, circle] (8) at (1,-1) {$8$}; \node[draw, circle] (9) at (2.5,-1) {$9$}; \path[draw,thick,-] (1) -- (2); \path[draw,thick,-] (1) -- (3); \path[draw,thick,-] (1) -- (4); \path[draw,thick,-] (1) -- (5); \path[draw,thick,-] (2) -- (6); \path[draw,thick,-] (4) -- (7); \path[draw,thick,-] (4) -- (8); \path[draw,thick,-] (4) -- (9); \path[draw=red,thick,->,line width=2pt] (1) edge [bend right=15] (2); \path[draw=red,thick,->,line width=2pt] (2) edge [bend right=15] (6); \path[draw=red,thick,->,line width=2pt] (6) edge [bend right=15] (2); \path[draw=red,thick,->,line width=2pt] (2) edge [bend right=15] (1); \path[draw=red,thick,->,line width=2pt] (1) edge [bend right=15] (3); \path[draw=red,thick,->,line width=2pt] (3) edge [bend right=15] (1); \path[draw=red,thick,->,line width=2pt] (1) edge [bend right=15] (4); \path[draw=red,thick,->,line width=2pt] (4) edge [bend right=15] (7); \path[draw=red,thick,->,line width=2pt] (7) edge [bend right=15] (4); \path[draw=red,thick,->,line width=2pt] (4) edge [bend right=15] (8); \path[draw=red,thick,->,line width=2pt] (8) edge [bend right=15] (4); \path[draw=red,thick,->,line width=2pt] (4) edge [bend right=15] (9); \path[draw=red,thick,->,line width=2pt] (9) edge [bend right=15] (4); \path[draw=red,thick,->,line width=2pt] (4) edge [bend right=15] (1); \path[draw=red,thick,->,line width=2pt] (1) edge [bend right=15] (5); \path[draw=red,thick,->,line width=2pt] (5) edge [bend right=15] (1); \end{tikzpicture} \end{center} Hence, the corresponding tree traversal array is as follows: \begin{center} \begin{tikzpicture}[scale=0.7] \draw (0,0) grid (9,1); \node at (0.5,0.5) {$1$}; \node at (1.5,0.5) {$2$}; \node at (2.5,0.5) {$6$}; \node at (3.5,0.5) {$3$}; \node at (4.5,0.5) {$4$}; \node at (5.5,0.5) {$7$}; \node at (6.5,0.5) {$8$}; \node at (7.5,0.5) {$9$}; \node at (8.5,0.5) {$5$}; % % \footnotesize % \node at (0.5,1.4) {$1$}; % \node at (1.5,1.4) {$2$}; % \node at (2.5,1.4) {$3$}; % \node at (3.5,1.4) {$4$}; % \node at (4.5,1.4) {$5$}; % \node at (5.5,1.4) {$6$}; % \node at (6.5,1.4) {$7$}; % \node at (7.5,1.4) {$8$}; % \node at (8.5,1.4) {$9$}; \end{tikzpicture} \end{center} \subsubsection{Subtree queries} Each subtree of a tree corresponds to a subarray of the tree traversal array such that the first element of the subarray is the root node. For example, the following subarray contains the nodes of the subtree of node $4$: \begin{center} \begin{tikzpicture}[scale=0.7] \fill[color=lightgray] (4,0) rectangle (8,1); \draw (0,0) grid (9,1); \node at (0.5,0.5) {$1$}; \node at (1.5,0.5) {$2$}; \node at (2.5,0.5) {$6$}; \node at (3.5,0.5) {$3$}; \node at (4.5,0.5) {$4$}; \node at (5.5,0.5) {$7$}; \node at (6.5,0.5) {$8$}; \node at (7.5,0.5) {$9$}; \node at (8.5,0.5) {$5$}; % % \footnotesize % \node at (0.5,1.4) {$1$}; % \node at (1.5,1.4) {$2$}; % \node at (2.5,1.4) {$3$}; % \node at (3.5,1.4) {$4$}; % \node at (4.5,1.4) {$5$}; % \node at (5.5,1.4) {$6$}; % \node at (6.5,1.4) {$7$}; % \node at (7.5,1.4) {$8$}; % \node at (8.5,1.4) {$9$}; \end{tikzpicture} \end{center} Using this fact, we can efficiently process queries that are related to subtrees of a tree. As an example, consider a problem where each node is assigned a value, and our task is to support the following queries: \begin{itemize} \item update the value of a node \item calculate the sum of values in the subtree of a node \end{itemize} Consider the following tree where the blue numbers are the values of the nodes. For example, the sum of the subtree of node $4$ is $3+4+3+1=11$. \begin{center} \begin{tikzpicture}[scale=0.9] \node[draw, circle] (1) at (0,3) {$1$}; \node[draw, circle] (2) at (-3,1) {$2$}; \node[draw, circle] (3) at (-1,1) {$3$}; \node[draw, circle] (4) at (1,1) {$4$}; \node[draw, circle] (5) at (3,1) {$5$}; \node[draw, circle] (6) at (-3,-1) {$6$}; \node[draw, circle] (7) at (-0.5,-1) {$7$}; \node[draw, circle] (8) at (1,-1) {$8$}; \node[draw, circle] (9) at (2.5,-1) {$9$}; \path[draw,thick,-] (1) -- (2); \path[draw,thick,-] (1) -- (3); \path[draw,thick,-] (1) -- (4); \path[draw,thick,-] (1) -- (5); \path[draw,thick,-] (2) -- (6); \path[draw,thick,-] (4) -- (7); \path[draw,thick,-] (4) -- (8); \path[draw,thick,-] (4) -- (9); \node[color=blue] at (0,3+0.65) {2}; \node[color=blue] at (-3-0.65,1) {3}; \node[color=blue] at (-1-0.65,1) {5}; \node[color=blue] at (1+0.65,1) {3}; \node[color=blue] at (3+0.65,1) {1}; \node[color=blue] at (-3,-1-0.65) {4}; \node[color=blue] at (-0.5,-1-0.65) {4}; \node[color=blue] at (1,-1-0.65) {3}; \node[color=blue] at (2.5,-1-0.65) {1}; \end{tikzpicture} \end{center} The idea is to construct a tree traversal array that contains three values for each node: the identifier of the node, the size of the subtree, and the value of the node. For example, the array for the above tree is as follows: \begin{center} \begin{tikzpicture}[scale=0.7] \draw (0,1) grid (9,-2); \node[left] at (-1,0.5) {node id}; \node[left] at (-1,-0.5) {subtree size}; \node[left] at (-1,-1.5) {node value}; \node at (0.5,0.5) {$1$}; \node at (1.5,0.5) {$2$}; \node at (2.5,0.5) {$6$}; \node at (3.5,0.5) {$3$}; \node at (4.5,0.5) {$4$}; \node at (5.5,0.5) {$7$}; \node at (6.5,0.5) {$8$}; \node at (7.5,0.5) {$9$}; \node at (8.5,0.5) {$5$}; \node at (0.5,-0.5) {$9$}; \node at (1.5,-0.5) {$2$}; \node at (2.5,-0.5) {$1$}; \node at (3.5,-0.5) {$1$}; \node at (4.5,-0.5) {$4$}; \node at (5.5,-0.5) {$1$}; \node at (6.5,-0.5) {$1$}; \node at (7.5,-0.5) {$1$}; \node at (8.5,-0.5) {$1$}; \node at (0.5,-1.5) {$2$}; \node at (1.5,-1.5) {$3$}; \node at (2.5,-1.5) {$4$}; \node at (3.5,-1.5) {$5$}; \node at (4.5,-1.5) {$3$}; \node at (5.5,-1.5) {$4$}; \node at (6.5,-1.5) {$3$}; \node at (7.5,-1.5) {$1$}; \node at (8.5,-1.5) {$1$}; % % \footnotesize % \node at (0.5,1.4) {$1$}; % \node at (1.5,1.4) {$2$}; % \node at (2.5,1.4) {$3$}; % \node at (3.5,1.4) {$4$}; % \node at (4.5,1.4) {$5$}; % \node at (5.5,1.4) {$6$}; % \node at (6.5,1.4) {$7$}; % \node at (7.5,1.4) {$8$}; % \node at (8.5,1.4) {$9$}; \end{tikzpicture} \end{center} Using this array, we can calculate the sum of values in any subtree by first finding out the size of the subtree and then the values of the corresponding nodes. For example, the values in the subtree of node $4$ can be found as follows: \begin{center} \begin{tikzpicture}[scale=0.7] \fill[color=lightgray] (4,1) rectangle (5,0); \fill[color=lightgray] (4,0) rectangle (5,-1); \fill[color=lightgray] (4,-1) rectangle (8,-2); \draw (0,1) grid (9,-2); \node[left] at (-1,0.5) {node id}; \node[left] at (-1,-0.5) {subtree size}; \node[left] at (-1,-1.5) {node value}; \node at (0.5,0.5) {$1$}; \node at (1.5,0.5) {$2$}; \node at (2.5,0.5) {$6$}; \node at (3.5,0.5) {$3$}; \node at (4.5,0.5) {$4$}; \node at (5.5,0.5) {$7$}; \node at (6.5,0.5) {$8$}; \node at (7.5,0.5) {$9$}; \node at (8.5,0.5) {$5$}; \node at (0.5,-0.5) {$9$}; \node at (1.5,-0.5) {$2$}; \node at (2.5,-0.5) {$1$}; \node at (3.5,-0.5) {$1$}; \node at (4.5,-0.5) {$4$}; \node at (5.5,-0.5) {$1$}; \node at (6.5,-0.5) {$1$}; \node at (7.5,-0.5) {$1$}; \node at (8.5,-0.5) {$1$}; \node at (0.5,-1.5) {$2$}; \node at (1.5,-1.5) {$3$}; \node at (2.5,-1.5) {$4$}; \node at (3.5,-1.5) {$5$}; \node at (4.5,-1.5) {$3$}; \node at (5.5,-1.5) {$4$}; \node at (6.5,-1.5) {$3$}; \node at (7.5,-1.5) {$1$}; \node at (8.5,-1.5) {$1$}; % % \footnotesize % \node at (0.5,1.4) {$1$}; % \node at (1.5,1.4) {$2$}; % \node at (2.5,1.4) {$3$}; % \node at (3.5,1.4) {$4$}; % \node at (4.5,1.4) {$5$}; % \node at (5.5,1.4) {$6$}; % \node at (6.5,1.4) {$7$}; % \node at (7.5,1.4) {$8$}; % \node at (8.5,1.4) {$9$}; \end{tikzpicture} \end{center} To answer the queries efficiently, it suffices to store the values of the nodes in a binary indexed or segment tree. After this, we can both update a value and calculate the sum of values in $O(\log n)$ time. \subsubsection{Path queries} Using a tree traversal array, we can also efficiently calculate sums of values on paths from the root node to any node of the tree. Consider a problem where our task is to support the following queries: \begin{itemize} \item change the value of a node \item calculate the sum of values on a path from the root to a node \end{itemize} For example, in the following tree, the sum of values from the root node to node 7 is $4+5+5=14$: \begin{center} \begin{tikzpicture}[scale=0.9] \node[draw, circle] (1) at (0,3) {$1$}; \node[draw, circle] (2) at (-3,1) {$2$}; \node[draw, circle] (3) at (-1,1) {$3$}; \node[draw, circle] (4) at (1,1) {$4$}; \node[draw, circle] (5) at (3,1) {$5$}; \node[draw, circle] (6) at (-3,-1) {$6$}; \node[draw, circle] (7) at (-0.5,-1) {$7$}; \node[draw, circle] (8) at (1,-1) {$8$}; \node[draw, circle] (9) at (2.5,-1) {$9$}; \path[draw,thick,-] (1) -- (2); \path[draw,thick,-] (1) -- (3); \path[draw,thick,-] (1) -- (4); \path[draw,thick,-] (1) -- (5); \path[draw,thick,-] (2) -- (6); \path[draw,thick,-] (4) -- (7); \path[draw,thick,-] (4) -- (8); \path[draw,thick,-] (4) -- (9); \node[color=blue] at (0,3+0.65) {4}; \node[color=blue] at (-3-0.65,1) {5}; \node[color=blue] at (-1-0.65,1) {3}; \node[color=blue] at (1+0.65,1) {5}; \node[color=blue] at (3+0.65,1) {2}; \node[color=blue] at (-3,-1-0.65) {3}; \node[color=blue] at (-0.5,-1-0.65) {5}; \node[color=blue] at (1,-1-0.65) {3}; \node[color=blue] at (2.5,-1-0.65) {1}; \end{tikzpicture} \end{center} We can solve this problem like before, but now each value in the last row of the array is the sum of values on a path from the root to the node. For example, the following array corresponds to the above tree: \begin{center} \begin{tikzpicture}[scale=0.7] \draw (0,1) grid (9,-2); \node[left] at (-1,0.5) {node id}; \node[left] at (-1,-0.5) {subtree size}; \node[left] at (-1,-1.5) {path sum}; \node at (0.5,0.5) {$1$}; \node at (1.5,0.5) {$2$}; \node at (2.5,0.5) {$6$}; \node at (3.5,0.5) {$3$}; \node at (4.5,0.5) {$4$}; \node at (5.5,0.5) {$7$}; \node at (6.5,0.5) {$8$}; \node at (7.5,0.5) {$9$}; \node at (8.5,0.5) {$5$}; \node at (0.5,-0.5) {$9$}; \node at (1.5,-0.5) {$2$}; \node at (2.5,-0.5) {$1$}; \node at (3.5,-0.5) {$1$}; \node at (4.5,-0.5) {$4$}; \node at (5.5,-0.5) {$1$}; \node at (6.5,-0.5) {$1$}; \node at (7.5,-0.5) {$1$}; \node at (8.5,-0.5) {$1$}; \node at (0.5,-1.5) {$4$}; \node at (1.5,-1.5) {$9$}; \node at (2.5,-1.5) {$12$}; \node at (3.5,-1.5) {$7$}; \node at (4.5,-1.5) {$9$}; \node at (5.5,-1.5) {$14$}; \node at (6.5,-1.5) {$12$}; \node at (7.5,-1.5) {$10$}; \node at (8.5,-1.5) {$6$}; \end{tikzpicture} \end{center} When the value of a node increases by $x$, the sums of all nodes in its subtree increase by $x$. For example, if the value of node 4 increases by 1, the array changes as follows: \begin{center} \begin{tikzpicture}[scale=0.7] \fill[color=lightgray] (4,-1) rectangle (8,-2); \draw (0,1) grid (9,-2); \node[left] at (-1,0.5) {node id}; \node[left] at (-1,-0.5) {subtree size}; \node[left] at (-1,-1.5) {path sum}; \node at (0.5,0.5) {$1$}; \node at (1.5,0.5) {$2$}; \node at (2.5,0.5) {$6$}; \node at (3.5,0.5) {$3$}; \node at (4.5,0.5) {$4$}; \node at (5.5,0.5) {$7$}; \node at (6.5,0.5) {$8$}; \node at (7.5,0.5) {$9$}; \node at (8.5,0.5) {$5$}; \node at (0.5,-0.5) {$9$}; \node at (1.5,-0.5) {$2$}; \node at (2.5,-0.5) {$1$}; \node at (3.5,-0.5) {$1$}; \node at (4.5,-0.5) {$4$}; \node at (5.5,-0.5) {$1$}; \node at (6.5,-0.5) {$1$}; \node at (7.5,-0.5) {$1$}; \node at (8.5,-0.5) {$1$}; \node at (0.5,-1.5) {$4$}; \node at (1.5,-1.5) {$9$}; \node at (2.5,-1.5) {$12$}; \node at (3.5,-1.5) {$7$}; \node at (4.5,-1.5) {$10$}; \node at (5.5,-1.5) {$15$}; \node at (6.5,-1.5) {$13$}; \node at (7.5,-1.5) {$11$}; \node at (8.5,-1.5) {$6$}; \end{tikzpicture} \end{center} Thus, to support both the operations, we should be able to increase all values in a range and retrieve a single value. This can be done in $O(\log n)$ time using a binary indexed or segment tree (see Chapter 9.4). \section{Lowest common ancestor} \index{lowest common ancestor} The \key{lowest common ancestor} of two nodes of a rooted tree is the lowest node whose subtree contains both the nodes. A typical problem is to efficiently process queries that ask to find the lowest common ancestor of two nodes. For example, in the following tree, the lowest common ancestor of nodes 5 and 8 is node 2: \begin{center} \begin{tikzpicture}[scale=0.9] \node[draw, circle] (1) at (0,3) {$1$}; \node[draw, circle] (2) at (2,1) {$4$}; \node[draw, circle] (3) at (-2,1) {$2$}; \node[draw, circle] (4) at (0,1) {$3$}; \node[draw, circle] (5) at (2,-1) {$7$}; \node[draw, circle, fill=lightgray] (6) at (-3,-1) {$5$}; \node[draw, circle] (7) at (-1,-1) {$6$}; \node[draw, circle, fill=lightgray] (8) at (-1,-3) {$8$}; \path[draw,thick,-] (1) -- (2); \path[draw,thick,-] (1) -- (3); \path[draw,thick,-] (1) -- (4); \path[draw,thick,-] (2) -- (5); \path[draw,thick,-] (3) -- (6); \path[draw,thick,-] (3) -- (7); \path[draw,thick,-] (7) -- (8); \path[draw=red,thick,->,line width=2pt] (6) edge [bend left] (3); \path[draw=red,thick,->,line width=2pt] (8) edge [bend right=40] (3); \end{tikzpicture} \end{center} Next we will discuss two efficient techniques for finding the lowest common ancestor of two nodes. \subsubsection{Method 1} One way to solve the problem is to use the fact that we can efficiently find the $k$th ancestor of any node in the tree. Using this, we can divide the problem of finding the lowest common ancestor into two parts. We use two pointers that initially point to the two nodes whose lowest common ancestor we should find. First, we move one of the pointers upwards so that both pointers point to nodes at the same level. In the example scenario, we move the second pointer one level up so that it points to node 6 which is at the same level with node 5: \begin{center} \begin{tikzpicture}[scale=0.9] \node[draw, circle] (1) at (0,3) {$1$}; \node[draw, circle] (2) at (2,1) {$4$}; \node[draw, circle] (3) at (-2,1) {$2$}; \node[draw, circle] (4) at (0,1) {$3$}; \node[draw, circle] (5) at (2,-1) {$7$}; \node[draw, circle,fill=lightgray] (6) at (-3,-1) {$5$}; \node[draw, circle,fill=lightgray] (7) at (-1,-1) {$6$}; \node[draw, circle] (8) at (-1,-3) {$8$}; \path[draw,thick,-] (1) -- (2); \path[draw,thick,-] (1) -- (3); \path[draw,thick,-] (1) -- (4); \path[draw,thick,-] (2) -- (5); \path[draw,thick,-] (3) -- (6); \path[draw,thick,-] (3) -- (7); \path[draw,thick,-] (7) -- (8); \path[draw=red,thick,->,line width=2pt] (8) edge [bend right] (7); \end{tikzpicture} \end{center} After this, we determine the minimum number of steps needed to move both pointers upwards so that they will point to the same node. The node to which the pointers point after this is the lowest common ancestor. In the example scenario, it suffices to move both pointers one step upwards to node 2, which is the lowest common ancestor: \begin{center} \begin{tikzpicture}[scale=0.9] \node[draw, circle] (1) at (0,3) {$1$}; \node[draw, circle] (2) at (2,1) {$4$}; \node[draw, circle,fill=lightgray] (3) at (-2,1) {$2$}; \node[draw, circle] (4) at (0,1) {$3$}; \node[draw, circle] (5) at (2,-1) {$7$}; \node[draw, circle] (6) at (-3,-1) {$5$}; \node[draw, circle] (7) at (-1,-1) {$6$}; \node[draw, circle] (8) at (-1,-3) {$8$}; \path[draw,thick,-] (1) -- (2); \path[draw,thick,-] (1) -- (3); \path[draw,thick,-] (1) -- (4); \path[draw,thick,-] (2) -- (5); \path[draw,thick,-] (3) -- (6); \path[draw,thick,-] (3) -- (7); \path[draw,thick,-] (7) -- (8); \path[draw=red,thick,->,line width=2pt] (6) edge [bend left] (3); \path[draw=red,thick,->,line width=2pt] (7) edge [bend right] (3); \end{tikzpicture} \end{center} Since both parts of the algorithm can be performed in $O(\log n)$ time using precomputed information, we can find the lowest common ancestor of any two nodes in $O(\log n)$ time. \subsubsection{Method 2} Another way to solve the problem is based on a tree traversal array\footnote{This lowest common ancestor algorithm was presented in \cite{ben00}. This technique is sometimes called the \index{Euler tour technique} \key{Euler tour technique} \cite{tar84}.}. Once again, the idea is to traverse the nodes using a depth-first search: \begin{center} \begin{tikzpicture}[scale=0.9] \node[draw, circle] (1) at (0,3) {$1$}; \node[draw, circle] (2) at (2,1) {$4$}; \node[draw, circle] (3) at (-2,1) {$2$}; \node[draw, circle] (4) at (0,1) {$3$}; \node[draw, circle] (5) at (2,-1) {$7$}; \node[draw, circle] (6) at (-3,-1) {$5$}; \node[draw, circle] (7) at (-1,-1) {$6$}; \node[draw, circle] (8) at (-1,-3) {$8$}; \path[draw,thick,-] (1) -- (2); \path[draw,thick,-] (1) -- (3); \path[draw,thick,-] (1) -- (4); \path[draw,thick,-] (2) -- (5); \path[draw,thick,-] (3) -- (6); \path[draw,thick,-] (3) -- (7); \path[draw,thick,-] (7) -- (8); \path[draw=red,thick,->,line width=2pt] (1) edge [bend right=15] (3); \path[draw=red,thick,->,line width=2pt] (3) edge [bend right=15] (6); \path[draw=red,thick,->,line width=2pt] (6) edge [bend right=15] (3); \path[draw=red,thick,->,line width=2pt] (3) edge [bend right=15] (7); \path[draw=red,thick,->,line width=2pt] (7) edge [bend right=15] (8); \path[draw=red,thick,->,line width=2pt] (8) edge [bend right=15] (7); \path[draw=red,thick,->,line width=2pt] (7) edge [bend right=15] (3); \path[draw=red,thick,->,line width=2pt] (3) edge [bend right=15] (1); \path[draw=red,thick,->,line width=2pt] (1) edge [bend right=15] (4); \path[draw=red,thick,->,line width=2pt] (4) edge [bend right=15] (1); \path[draw=red,thick,->,line width=2pt] (1) edge [bend right=15] (2); \path[draw=red,thick,->,line width=2pt] (2) edge [bend right=15] (5); \path[draw=red,thick,->,line width=2pt] (5) edge [bend right=15] (2); \path[draw=red,thick,->,line width=2pt] (2) edge [bend right=15] (1); \end{tikzpicture} \end{center} However, we use a different tree traversal array than before: we add each node to the array \emph{always} when the depth-first search walks through the node, and not only at the first visit. Hence, a node that has $k$ children appears $k+1$ times in the array and there are a total of $2n-1$ nodes in the array. We store two values in the array: the identifier of the node and the depth of the node in the tree. The following array corresponds to the above tree: \begin{center} \begin{tikzpicture}[scale=0.7] \node[left] at (-1,1.5) {node id}; \node[left] at (-1,0.5) {depth}; \draw (0,1) grid (15,2); \node at (0.5,1.5) {$1$}; \node at (1.5,1.5) {$2$}; \node at (2.5,1.5) {$5$}; \node at (3.5,1.5) {$2$}; \node at (4.5,1.5) {$6$}; \node at (5.5,1.5) {$8$}; \node at (6.5,1.5) {$6$}; \node at (7.5,1.5) {$2$}; \node at (8.5,1.5) {$1$}; \node at (9.5,1.5) {$3$}; \node at (10.5,1.5) {$1$}; \node at (11.5,1.5) {$4$}; \node at (12.5,1.5) {$7$}; \node at (13.5,1.5) {$4$}; \node at (14.5,1.5) {$1$}; \draw (0,0) grid (15,1); \node at (0.5,0.5) {$1$}; \node at (1.5,0.5) {$2$}; \node at (2.5,0.5) {$3$}; \node at (3.5,0.5) {$2$}; \node at (4.5,0.5) {$3$}; \node at (5.5,0.5) {$4$}; \node at (6.5,0.5) {$3$}; \node at (7.5,0.5) {$2$}; \node at (8.5,0.5) {$1$}; \node at (9.5,0.5) {$2$}; \node at (10.5,0.5) {$1$}; \node at (11.5,0.5) {$2$}; \node at (12.5,0.5) {$3$}; \node at (13.5,0.5) {$2$}; \node at (14.5,0.5) {$1$}; \footnotesize \node at (0.5,2.5) {$0$}; \node at (1.5,2.5) {$1$}; \node at (2.5,2.5) {$2$}; \node at (3.5,2.5) {$3$}; \node at (4.5,2.5) {$4$}; \node at (5.5,2.5) {$5$}; \node at (6.5,2.5) {$6$}; \node at (7.5,2.5) {$7$}; \node at (8.5,2.5) {$8$}; \node at (9.5,2.5) {$9$}; \node at (10.5,2.5) {$10$}; \node at (11.5,2.5) {$11$}; \node at (12.5,2.5) {$12$}; \node at (13.5,2.5) {$13$}; \node at (14.5,2.5) {$14$}; \end{tikzpicture} \end{center} Now we can find the lowest common ancestor of nodes $a$ and $b$ by finding the node with the \emph{minimum} depth between nodes $a$ and $b$ in the array. For example, the lowest common ancestor of nodes $5$ and $8$ can be found as follows: \begin{center} \begin{tikzpicture}[scale=0.7] \node[left] at (-1,1.5) {node id}; \node[left] at (-1,0.5) {depth}; \fill[color=lightgray] (2,1) rectangle (3,2); \fill[color=lightgray] (5,1) rectangle (6,2); \fill[color=lightgray] (2,0) rectangle (6,1); \node at (3.5,-0.5) {$\uparrow$}; \draw (0,1) grid (15,2); \node at (0.5,1.5) {$1$}; \node at (1.5,1.5) {$2$}; \node at (2.5,1.5) {$5$}; \node at (3.5,1.5) {$2$}; \node at (4.5,1.5) {$6$}; \node at (5.5,1.5) {$8$}; \node at (6.5,1.5) {$6$}; \node at (7.5,1.5) {$2$}; \node at (8.5,1.5) {$1$}; \node at (9.5,1.5) {$3$}; \node at (10.5,1.5) {$1$}; \node at (11.5,1.5) {$4$}; \node at (12.5,1.5) {$7$}; \node at (13.5,1.5) {$4$}; \node at (14.5,1.5) {$1$}; \draw (0,0) grid (15,1); \node at (0.5,0.5) {$1$}; \node at (1.5,0.5) {$2$}; \node at (2.5,0.5) {$3$}; \node at (3.5,0.5) {$2$}; \node at (4.5,0.5) {$3$}; \node at (5.5,0.5) {$4$}; \node at (6.5,0.5) {$3$}; \node at (7.5,0.5) {$2$}; \node at (8.5,0.5) {$1$}; \node at (9.5,0.5) {$2$}; \node at (10.5,0.5) {$1$}; \node at (11.5,0.5) {$2$}; \node at (12.5,0.5) {$3$}; \node at (13.5,0.5) {$2$}; \node at (14.5,0.5) {$1$}; \footnotesize \node at (0.5,2.5) {$0$}; \node at (1.5,2.5) {$1$}; \node at (2.5,2.5) {$2$}; \node at (3.5,2.5) {$3$}; \node at (4.5,2.5) {$4$}; \node at (5.5,2.5) {$5$}; \node at (6.5,2.5) {$6$}; \node at (7.5,2.5) {$7$}; \node at (8.5,2.5) {$8$}; \node at (9.5,2.5) {$9$}; \node at (10.5,2.5) {$10$}; \node at (11.5,2.5) {$11$}; \node at (12.5,2.5) {$12$}; \node at (13.5,2.5) {$13$}; \node at (14.5,2.5) {$14$}; \end{tikzpicture} \end{center} Node 5 is at position 2, node 8 is at position 5, and the node with minimum depth between positions $2 \ldots 5$ is node 2 at position 3 whose depth is 2. Thus, the lowest common ancestor of nodes 5 and 8 is node 2. Thus, to find the lowest common ancestor of two nodes it suffices to process a range minimum query. Since the array is static, we can process such queries in $O(1)$ time after an $O(n \log n)$ time preprocessing. \subsubsection{Distances of nodes} The distance between nodes $a$ and $b$ equals the length of the path from $a$ to $b$. It turns out that the problem of calculating the distance between nodes reduces to finding their lowest common ancestor. First, we root the tree arbitrarily. After this, the distance of nodes $a$ and $b$ can be calculated using the formula \[\texttt{depth}(a)+\texttt{depth}(b)-2 \cdot \texttt{depth}(c),\] where $c$ is the lowest common ancestor of $a$ and $b$ and $\texttt{depth}(s)$ denotes the depth of node $s$. For example, consider the distance of nodes 5 and 8: \begin{center} \begin{tikzpicture}[scale=0.9] \node[draw, circle] (1) at (0,3) {$1$}; \node[draw, circle] (2) at (2,1) {$4$}; \node[draw, circle] (3) at (-2,1) {$2$}; \node[draw, circle] (4) at (0,1) {$3$}; \node[draw, circle] (5) at (2,-1) {$7$}; \node[draw, circle] (6) at (-3,-1) {$5$}; \node[draw, circle] (7) at (-1,-1) {$6$}; \node[draw, circle] (8) at (-1,-3) {$8$}; \path[draw,thick,-] (1) -- (2); \path[draw,thick,-] (1) -- (3); \path[draw,thick,-] (1) -- (4); \path[draw,thick,-] (2) -- (5); \path[draw,thick,-] (3) -- (6); \path[draw,thick,-] (3) -- (7); \path[draw,thick,-] (7) -- (8); \path[draw=red,thick,-,line width=2pt] (8) -- node[font=\small] {} (7); \path[draw=red,thick,-,line width=2pt] (7) -- node[font=\small] {} (3); \path[draw=red,thick,-,line width=2pt] (6) -- node[font=\small] {} (3); \end{tikzpicture} \end{center} The lowest common ancestor of nodes 5 and 8 is node 2. The depths of the nodes are $\texttt{depth}(5)=3$, $\texttt{depth}(8)=4$ and $\texttt{depth}(2)=2$, so the distance between nodes 5 and 8 is $3+4-2\cdot2=3$. \section{Offline algorithms} So far, we have discussed \emph{online} algorithms for tree queries. Those algorithms are able to process queries one after another so that each query is answered before receiving the next query. However, in many problems, the online property is not necessary. In this section, we focus on \emph{offline} algorithms. Those algorithms are given a set of queries which can be answered in any order. It is often easier to design an offline algorithm compared to an online algorithm. \subsubsection{Merging data structures} One method to construct an offline algorithm is to perform a depth-first tree traversal and maintain data structures in nodes. At each node $s$, we create a data structure $\texttt{d}[s]$ that is based on the data structures of the children of $s$. Then, using this data structure, all queries related to $s$ are processed. As an example, consider the following problem: We are given a tree where each node has some value. Our task is to process queries of the form ''calculate the number of nodes with value $x$ in the subtree of node $s$''. For example, in the following tree, the subtree of node $4$ contains two nodes whose value is 3. \begin{center} \begin{tikzpicture}[scale=0.9] \node[draw, circle] (1) at (0,3) {$1$}; \node[draw, circle] (2) at (-3,1) {$2$}; \node[draw, circle] (3) at (-1,1) {$3$}; \node[draw, circle] (4) at (1,1) {$4$}; \node[draw, circle] (5) at (3,1) {$5$}; \node[draw, circle] (6) at (-3,-1) {$6$}; \node[draw, circle] (7) at (-0.5,-1) {$7$}; \node[draw, circle] (8) at (1,-1) {$8$}; \node[draw, circle] (9) at (2.5,-1) {$9$}; \path[draw,thick,-] (1) -- (2); \path[draw,thick,-] (1) -- (3); \path[draw,thick,-] (1) -- (4); \path[draw,thick,-] (1) -- (5); \path[draw,thick,-] (2) -- (6); \path[draw,thick,-] (4) -- (7); \path[draw,thick,-] (4) -- (8); \path[draw,thick,-] (4) -- (9); \node[color=blue] at (0,3+0.65) {2}; \node[color=blue] at (-3-0.65,1) {3}; \node[color=blue] at (-1-0.65,1) {5}; \node[color=blue] at (1+0.65,1) {3}; \node[color=blue] at (3+0.65,1) {1}; \node[color=blue] at (-3,-1-0.65) {4}; \node[color=blue] at (-0.5,-1-0.65) {4}; \node[color=blue] at (1,-1-0.65) {3}; \node[color=blue] at (2.5,-1-0.65) {1}; \end{tikzpicture} \end{center} In this problem, we can use map structures to answer the queries. For example, the maps for node 4 and its children are as follows: \begin{center} \begin{tikzpicture}[scale=0.7] \node[draw, rectangle] (a) at (4,5.5) { \footnotesize \begin{tabular}{rrr} 4 \\ \hline 1 \\ \end{tabular}}; \node[draw, rectangle] (b) at (8,5.5) { \footnotesize \begin{tabular}{rrr} 3 \\ \hline 1 \\ \end{tabular}}; \node[draw, rectangle] (c) at (12,5.5) { \footnotesize \begin{tabular}{rr} 1 \\ \hline 1 \\ \end{tabular}}; \node[draw, rectangle] (d) at (8,8.5) { \footnotesize \begin{tabular}{rrr} 1 & 3 & 4 \\ \hline 1 & 2 & 1 \\ \end{tabular}}; \path[draw,thick,-] (a) -- (d); \path[draw,thick,-] (b) -- (d); \path[draw,thick,-] (c) -- (d); \end{tikzpicture} \end{center} If we create such a data structure for each node, we can easily process all given queries, because we can handle all queries related to a node immediately after creating its data structure. For example, the above map structure for node 4 tells us that its subtree contains two nodes whose value is 3. However, it would be too slow to create all data structures from scratch. Instead, at each node $s$, we create an initial data structure $\texttt{d}[s]$ that only contains the value of $s$. After this, we go through the children of $s$ and \emph{merge} $\texttt{d}[s]$ and all data structures $\texttt{d}[u]$ where $u$ is a child of $s$. For example, in the above tree, the map for node $4$ is created by merging the following maps: \begin{center} \begin{tikzpicture}[scale=0.7] \node[draw, rectangle] (a) at (4,5.5) { \footnotesize \begin{tabular}{rrr} 4 \\ \hline 1 \\ \end{tabular}}; \node[draw, rectangle] (b) at (7,5.5) { \footnotesize \begin{tabular}{rrr} 3 \\ \hline 1 \\ \end{tabular}}; \node[draw, rectangle] (c) at (10,5.5) { \footnotesize \begin{tabular}{rr} 1 \\ \hline 1 \\ \end{tabular}}; \node[draw, rectangle] (d) at (1,5.5) { \footnotesize \begin{tabular}{rr} 3 \\ \hline 1 \\ \end{tabular}}; \end{tikzpicture} \end{center} Here the first map is the initial data structure for node 4, and the other three maps correspond to nodes 7, 8 and 9. The merging at node $s$ can be done as follows: We go through the children of $s$ and at each child $u$ merge $\texttt{d}[s]$ and $\texttt{d}[u]$. We always copy the contents from $\texttt{d}[u]$ to $\texttt{d}[s]$. However, before this, we \emph{swap} the contents of $\texttt{d}[s]$ and $\texttt{d}[u]$ if $\texttt{d}[s]$ is smaller than $\texttt{d}[u]$. By doing this, each value is copied only $O(\log n)$ times during the tree traversal, which ensures that the algorithm is efficient. To swap the contents of two data structures $a$ and $b$ efficiently, we can just use the following code: \begin{lstlisting} swap(a,b); \end{lstlisting} It is guaranteed that the above code works in constant time when $a$ and $b$ are C++ standard library data structures. \subsubsection{Lowest common ancestors} There is also an offline algorithm for processing a set of lowest common ancestor queries\footnote{This algorithm was published by R. E. Tarjan in 1979 \cite{tar79}.}. The algorithm is based on the union-find data structure (see Chapter 15.2), and the benefit of the algorithm is that it is easier to implement than the algorithms discussed earlier in this chapter. The algorithm is given as input a set of pairs of nodes, and it determines for each such pair the lowest common ancestor of the nodes. The algorithm performs a depth-first tree traversal and maintains disjoint sets of nodes. Initially, each node belongs to a separate set. For each set, we also store the highest node in the tree that belongs to the set. When the algorithm visits a node $x$, it goes through all nodes $y$ such that the lowest common ancestor of $x$ and $y$ has to be found. If $y$ has already been visited, the algorithm reports that the lowest common ancestor of $x$ and $y$ is the highest node in the set of $y$. Then, after processing node $x$, the algorithm joins the sets of $x$ and its parent. For example, suppose that we want to find the lowest common ancestors of node pairs $(5,8)$ and $(2,7)$ in the following tree: \begin{center} \begin{tikzpicture}[scale=0.85] \node[draw, circle] (1) at (0,3) {$1$}; \node[draw, circle] (2) at (2,1) {$4$}; \node[draw, circle] (3) at (-2,1) {$2$}; \node[draw, circle] (4) at (0,1) {$3$}; \node[draw, circle] (5) at (2,-1) {$7$}; \node[draw, circle] (6) at (-3,-1) {$5$}; \node[draw, circle] (7) at (-1,-1) {$6$}; \node[draw, circle] (8) at (-1,-3) {$8$}; \path[draw,thick,-] (1) -- (2); \path[draw,thick,-] (1) -- (3); \path[draw,thick,-] (1) -- (4); \path[draw,thick,-] (2) -- (5); \path[draw,thick,-] (3) -- (6); \path[draw,thick,-] (3) -- (7); \path[draw,thick,-] (7) -- (8); \end{tikzpicture} \end{center} In the following trees, gray nodes denote visited nodes and dashed groups of nodes belong to the same set. When the algorithm visits node 8, it notices that node 5 has been visited and the highest node in its set is 2. Thus, the lowest common ancestor of nodes 5 and 8 is 2: \begin{center} \begin{tikzpicture}[scale=0.85] \node[draw, circle, fill=lightgray] (1) at (0,3) {$1$}; \node[draw, circle] (2) at (2,1) {$4$}; \node[draw, circle, fill=lightgray] (3) at (-2,1) {$2$}; \node[draw, circle] (4) at (0,1) {$3$}; \node[draw, circle] (5) at (2,-1) {$7$}; \node[draw, circle, fill=lightgray] (6) at (-3,-1) {$5$}; \node[draw, circle, fill=lightgray] (7) at (-1,-1) {$6$}; \node[draw, circle, fill=gray] (8) at (-1,-3) {$8$}; \path[draw,thick,-] (1) -- (2); \path[draw,thick,-] (1) -- (3); \path[draw,thick,-] (1) -- (4); \path[draw,thick,-] (2) -- (5); \path[draw,thick,-] (3) -- (6); \path[draw,thick,-] (3) -- (7); \path[draw,thick,-] (7) -- (8); \draw [red,thick,dashed,line width=2pt,rotate around={-28:(-2,0)}] (-2.9,1.5) rectangle (-1.9,-2); \draw [red,thick,dashed,line width=2pt] (-1.5,-0.5) rectangle (-0.5,-1.5); \draw [red,thick,dashed,line width=2pt] (-1.5,-2.5) rectangle (-0.5,-3.5); \draw [red,thick,dashed,line width=2pt] (0.5,3.5) rectangle (-0.5,2.5); \draw [red,thick,dashed,line width=2pt] (0.5,1.5) rectangle (-0.5,0.5); \draw [red,thick,dashed,line width=2pt] (2.5,1.5) rectangle (1.5,0.5); \draw [red,thick,dashed,line width=2pt] (2.5,-0.5) rectangle (1.5,-1.5); \end{tikzpicture} \end{center} Later, when visiting node 7, the algorithm determines that the lowest common ancestor of nodes 2 and 7 is 1: \begin{center} \begin{tikzpicture}[scale=0.85] \node[draw, circle, fill=lightgray] (1) at (0,3) {$1$}; \node[draw, circle, fill=lightgray] (2) at (2,1) {$4$}; \node[draw, circle, fill=lightgray] (3) at (-2,1) {$2$}; \node[draw, circle, fill=lightgray] (4) at (0,1) {$3$}; \node[draw, circle, fill=gray] (5) at (2,-1) {$7$}; \node[draw, circle, fill=lightgray] (6) at (-3,-1) {$5$}; \node[draw, circle, fill=lightgray] (7) at (-1,-1) {$6$}; \node[draw, circle, fill=lightgray] (8) at (-1,-3) {$8$}; \path[draw,thick,-] (1) -- (2); \path[draw,thick,-] (1) -- (3); \path[draw,thick,-] (1) -- (4); \path[draw,thick,-] (2) -- (5); \path[draw,thick,-] (3) -- (6); \path[draw,thick,-] (3) -- (7); \path[draw,thick,-] (7) -- (8); \draw [red,thick,dashed,line width=2pt] (0.5,3.5) rectangle (-3.5,-3.5); \draw [red,thick,dashed,line width=2pt] (2.5,1.5) rectangle (1.5,0.5); \draw [red,thick,dashed,line width=2pt] (2.5,-0.5) rectangle (1.5,-1.5); \end{tikzpicture} \end{center}