\chapter{Range queries}

\index{range query}
\index{sum query}
\index{minimum query}
\index{maximum query}

A \key{range query} asks us to calculate some information
about the elements in a given range of an array.
Typical range queries are:
\begin{itemize}
\item \key{sum query}: calculate the sum of elements in a range
\item \key{minimum query}: find the smallest element in a range
\item \key{maximum query}: find the largest element in a range
\end{itemize}
For example, consider the range $[4,7]$ in the following array:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (3,0) rectangle (7,1);
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$8$};
\node at (3.5,0.5) {$4$};
\node at (4.5,0.5) {$6$};
\node at (5.5,0.5) {$1$};
\node at (6.5,0.5) {$3$};
\node at (7.5,0.5) {$4$};

\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\end{tikzpicture}
\end{center}
In this range, the sum of elements is $4+6+1+3=16$,
the minimum element is 1 and the maximum element is 6.

An easy way to process range queries is
to go through all the elements in the range.
For example, we can calculate the sum
in a range $[a,b]$ of an array \texttt{t} as follows:

\begin{lstlisting}
// sum of t[a..b], visiting each element of the range once
int sum(int a, int b) {
    int s = 0;
    for (int i = a; i <= b; i++) {
        s += t[i];
    }
    return s;
}
\end{lstlisting}

The above function works in $O(n)$ time,
where $n$ is the size of the array.
However, if the array is large and there are many queries,
such an approach is slow:
processing $q$ queries may take $O(nq)$ time in total.
In this chapter, we will learn how
range queries can be processed much more efficiently.

\section{Static array queries}

We first focus on a simple situation where
the array is \key{static}, i.e.,
the elements never change between the queries.
In this case, it suffices to preprocess the
array and construct
a data structure that can be used for
answering any possible range query efficiently.

\subsubsection{Sum query}

\index{prefix sum array}

Sum queries can be processed efficiently
by constructing a \key{sum array}
(also known as a prefix sum array)
that contains the sum of elements in the range $[1,k]$
for each $k=1,2,\ldots,n$.
Using the sum array, the sum of elements in
any range $[a,b]$ of the original array can
be calculated in $O(1)$ time.

For example, for the array
\begin{center}
\begin{tikzpicture}[scale=0.7]
%\fill[color=lightgray] (3,0) rectangle (7,1);
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$8$};
\node at (4.5,0.5) {$6$};
\node at (5.5,0.5) {$1$};
\node at (6.5,0.5) {$4$};
\node at (7.5,0.5) {$2$};

\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\end{tikzpicture}
\end{center}
the corresponding sum array is as follows:
\begin{center}
\begin{tikzpicture}[scale=0.7]
%\fill[color=lightgray] (3,0) rectangle (7,1);
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$4$};
\node at (2.5,0.5) {$8$};
\node at (3.5,0.5) {$16$};
\node at (4.5,0.5) {$22$};
\node at (5.5,0.5) {$23$};
\node at (6.5,0.5) {$27$};
\node at (7.5,0.5) {$29$};

\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\end{tikzpicture}
\end{center}
The following code constructs a sum array
\texttt{s} for an array \texttt{t} in $O(n)$ time:
\begin{lstlisting}
// assumes one-based indexing and s[0] = 0
for (int i = 1; i <= n; i++) {
    s[i] = s[i-1]+t[i];
}
\end{lstlisting}
After this, the following function processes
any sum query in $O(1)$ time:
\begin{lstlisting}
int sum(int a, int b) {
    return s[b]-s[a-1];
}
\end{lstlisting}

The function calculates the sum in the range $[a,b]$
by subtracting the sum in the range $[1,a-1]$
from the sum in the range $[1,b]$.
Thus, only two values of the sum array
are needed, and the query takes $O(1)$ time.
Note that because of the one-based indexing,
the function also works when $a=1$, provided that $\texttt{s}[0]=0$.

As an example, consider the range $[4,7]$:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (3,0) rectangle (7,1);
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$8$};
\node at (4.5,0.5) {$6$};
\node at (5.5,0.5) {$1$};
\node at (6.5,0.5) {$4$};
\node at (7.5,0.5) {$2$};

\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\end{tikzpicture}
\end{center}
The sum in the range is $8+6+1+4=19$.
This can be calculated using the precalculated
sums for the ranges $[1,3]$ and $[1,7]$:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (2,0) rectangle (3,1);
\fill[color=lightgray] (6,0) rectangle (7,1);
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$4$};
\node at (2.5,0.5) {$8$};
\node at (3.5,0.5) {$16$};
\node at (4.5,0.5) {$22$};
\node at (5.5,0.5) {$23$};
\node at (6.5,0.5) {$27$};
\node at (7.5,0.5) {$29$};

\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\end{tikzpicture}
\end{center}
Thus, the sum in the range $[4,7]$ is $27-8=19$.
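
The example can also be checked with a short self-contained program.
The following is only a minimal sketch: the helper names
\texttt{build\_sum\_array} and \texttt{range\_sum} are hypothetical,
and the input vector is assumed to store the array at indices $1 \ldots n$,
with an unused element at index 0.

```cpp
#include <cstddef>
#include <vector>

// Builds a one-based sum array: s[k] = t[1] + ... + t[k], with s[0] = 0.
// Assumption of this sketch: t[0] is unused padding.
std::vector<long long> build_sum_array(const std::vector<int>& t) {
    std::vector<long long> s(t.size(), 0);
    for (std::size_t i = 1; i < t.size(); i++) {
        s[i] = s[i-1] + t[i];
    }
    return s;
}

// Sum of the range [a,b], assuming 1 <= a <= b <= n.
long long range_sum(const std::vector<long long>& s, int a, int b) {
    return s[b] - s[a-1];
}
```

For the array above, \texttt{range\_sum(s, 4, 7)} evaluates
$\texttt{s}[7]-\texttt{s}[3]=27-8=19$.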

It is also possible to generalize this idea
to higher dimensions.
For example, we can construct a two-dimensional
sum array that can be used for calculating
the sum of any rectangular subarray in $O(1)$ time.
Each value in such an array is the sum of a subarray
that begins at the upper-left corner of the array.

\begin{samepage}
The following picture illustrates the idea:
\begin{center}
\begin{tikzpicture}[scale=0.54]
\draw[fill=lightgray] (3,2) rectangle (7,5);
\draw (0,0) grid (10,7);
%\draw[line width=2pt] (3,2) rectangle (7,5);
\node[anchor=center] at (6.5, 2.5) {$A$};
\node[anchor=center] at (2.5, 2.5) {$B$};
\node[anchor=center] at (6.5, 5.5) {$C$};
\node[anchor=center] at (2.5, 5.5) {$D$};
\end{tikzpicture}
\end{center}
\end{samepage}

The sum of the gray subarray can be calculated
using the formula
\[S(A) - S(B) - S(C) + S(D),\]
where $S(X)$ denotes the sum of a rectangular
subarray from the upper-left corner
to the position of $X$.
The term $S(D)$ is added back because that subarray
is subtracted twice, once as a part of $S(B)$ and once as a part of $S(C)$.
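
The two-dimensional case can be sketched in code as follows.
This is an illustration under stated assumptions rather than a fixed implementation:
the names \texttt{Grid}, \texttt{build\_sum\_grid} and \texttt{rect\_sum} are hypothetical,
rows are numbered from the top, and both the input and the sum array
use one-based indexing with an unused row 0 and column 0.

```cpp
#include <cstddef>
#include <vector>

using Grid = std::vector<std::vector<long long>>;

// p[y][x] = sum of the subarray with corners (1,1) and (y,x).
// Row 0 and column 0 stay zero, like s[0] in the one-dimensional case.
Grid build_sum_grid(const Grid& g) {
    std::size_t rows = g.size();
    std::size_t cols = rows == 0 ? 0 : g[0].size();
    Grid p(rows, std::vector<long long>(cols, 0));
    for (std::size_t y = 1; y < rows; y++) {
        for (std::size_t x = 1; x < cols; x++) {
            p[y][x] = g[y][x] + p[y-1][x] + p[y][x-1] - p[y-1][x-1];
        }
    }
    return p;
}

// Sum of the rectangle with upper-left corner (y1,x1) and
// lower-right corner (y2,x2): S(A) - S(B) - S(C) + S(D).
long long rect_sum(const Grid& p, int y1, int x1, int y2, int x2) {
    return p[y2][x2]        // S(A)
         - p[y2][x1-1]      // S(B): columns to the left
         - p[y1-1][x2]      // S(C): rows above
         + p[y1-1][x1-1];   // S(D): subtracted twice, so added back
}
```

Building the sum array takes time proportional to the size of the grid,
after which each rectangle query reads only four values.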

\subsubsection{Minimum query}

It is also possible to process minimum queries
in $O(1)$ time after $O(n \log n)$ time preprocessing, though it is
more difficult than processing sum queries.
Note that minimum and maximum queries can always
be implemented using the same techniques,
so it suffices to focus on minimum queries.

The idea is to precalculate the minimum element of each range
of size $2^k$ in the array, where $k$ is a nonnegative integer.
For example, in the array

\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$8$};
\node at (4.5,0.5) {$6$};
\node at (5.5,0.5) {$1$};
\node at (6.5,0.5) {$4$};
\node at (7.5,0.5) {$2$};

\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\end{tikzpicture}
\end{center}
the following minima will be calculated:

\begin{center}
\begin{tabular}{ccc}

\begin{tabular}{ccc}
range & size & min \\
\hline
$[1,1]$ & 1 & 1 \\
$[2,2]$ & 1 & 3 \\
$[3,3]$ & 1 & 4 \\
$[4,4]$ & 1 & 8 \\
$[5,5]$ & 1 & 6 \\
$[6,6]$ & 1 & 1 \\
$[7,7]$ & 1 & 4 \\
$[8,8]$ & 1 & 2 \\
\end{tabular}

&

\begin{tabular}{ccc}
range & size & min \\
\hline
$[1,2]$ & 2 & 1 \\
$[2,3]$ & 2 & 3 \\
$[3,4]$ & 2 & 4 \\
$[4,5]$ & 2 & 6 \\
$[5,6]$ & 2 & 1 \\
$[6,7]$ & 2 & 1 \\
$[7,8]$ & 2 & 2 \\
\\
\end{tabular}

&

\begin{tabular}{ccc}
range & size & min \\
\hline
$[1,4]$ & 4 & 1 \\
$[2,5]$ & 4 & 3 \\
$[3,6]$ & 4 & 1 \\
$[4,7]$ & 4 & 1 \\
$[5,8]$ & 4 & 1 \\
$[1,8]$ & 8 & 1 \\
\\
\\
\end{tabular}

\end{tabular}
\end{center}

There are $O(n \log n)$ ranges of size $2^k$,
because for each array position,
there are $O(\log n)$ ranges that begin at that position.
The minima in all ranges of size $2^k$ can be calculated
in $O(n \log n)$ time, because each range of size $2^k$
consists of two ranges of size $2^{k-1}$ and the minima
can be calculated recursively.
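
One way to write the recursion down is the following,
where the notation $m(a,k)$ is introduced here only for illustration
and denotes the minimum in the range of size $2^k$ that begins at position $a$
of an array \texttt{t}:
\[
m(a,0) = \texttt{t}[a], \qquad
m(a,k) = \min\bigl(m(a,k-1),\; m(a+2^{k-1},k-1)\bigr) \quad \textrm{for } k \ge 1.
\]
For example, in the table above,
$m(3,2) = \min(m(3,1),\, m(5,1)) = \min(4,1) = 1$,
which is the minimum in the range $[3,6]$.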

After this, the minimum in any range $[a,b]$
can be calculated in $O(1)$ time as a minimum of
two ranges of size $2^k$ where $k=\lfloor \log_2(b-a+1) \rfloor$.
The first range begins at index $a$,
and the second range ends at index $b$.
The parameter $k$ is chosen so that
the two ranges of size $2^k$
fully cover the range $[a,b]$.
The two ranges may overlap, but this does not matter,
because considering an element twice does not change the minimum.

As an example, consider the range $[2,7]$:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (1,0) rectangle (7,1);
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$8$};
\node at (4.5,0.5) {$6$};
\node at (5.5,0.5) {$1$};
\node at (6.5,0.5) {$4$};
\node at (7.5,0.5) {$2$};

\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\end{tikzpicture}
\end{center}
The length of the range is 6,
and $\lfloor \log_2(6) \rfloor = 2$.
Thus, the minimum can be calculated
from two ranges of length 4.
The ranges are $[2,5]$ and $[4,7]$:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (1,0) rectangle (5,1);
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$8$};
\node at (4.5,0.5) {$6$};
\node at (5.5,0.5) {$1$};
\node at (6.5,0.5) {$4$};
\node at (7.5,0.5) {$2$};

\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\end{tikzpicture}
\end{center}
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (3,0) rectangle (7,1);
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$8$};
\node at (4.5,0.5) {$6$};
\node at (5.5,0.5) {$1$};
\node at (6.5,0.5) {$4$};
\node at (7.5,0.5) {$2$};

\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\end{tikzpicture}
\end{center}
Since the minimum in the range $[2,5]$ is 3
and the minimum in the range $[4,7]$ is 1,
we know that the minimum in the range $[2,7]$ is 1.
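
The preprocessing and the query can be sketched together as follows.
This is a minimal illustration, not a definitive implementation:
the struct name \texttt{MinTable} and its members are hypothetical,
zero-based indexing is used for brevity (unlike the figures),
and a small table of logarithms is precalculated so that
a query does not need to compute $\lfloor \log_2(b-a+1) \rfloor$ in a loop.

```cpp
#include <algorithm>
#include <vector>

// Minimum queries on a static array using precalculated
// minima of all ranges whose size is a power of two.
struct MinTable {
    // m[k][i] = minimum of the 2^k elements starting at index i
    std::vector<std::vector<int>> m;
    // lg[len] = floor(log2(len)) for 1 <= len <= n
    std::vector<int> lg;

    explicit MinTable(const std::vector<int>& t) {
        int n = static_cast<int>(t.size());
        lg.assign(n + 1, 0);
        for (int i = 2; i <= n; i++) lg[i] = lg[i/2] + 1;
        m.push_back(t);  // ranges of size 1
        for (int k = 1; (1 << k) <= n; k++) {
            int half = 1 << (k-1);
            std::vector<int> row(n - (1 << k) + 1);
            for (int i = 0; i + (1 << k) <= n; i++) {
                // a range of size 2^k consists of two ranges of size 2^(k-1)
                row[i] = std::min(m[k-1][i], m[k-1][i+half]);
            }
            m.push_back(row);
        }
    }

    // Minimum in the range [a,b], assuming 0 <= a <= b < n.
    int query(int a, int b) const {
        int k = lg[b - a + 1];
        // first range begins at a, second range ends at b
        return std::min(m[k][a], m[k][b - (1 << k) + 1]);
    }
};
```

With the example array stored at indices $0 \ldots 7$,
the range $[2,7]$ of the figures corresponds to \texttt{query(1, 6)},
which combines the precalculated minima 3 and 1.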

\section{Binary indexed tree}

\index{binary indexed tree}
\index{Fenwick tree}

A \key{binary indexed tree} or \key{Fenwick tree}
can be seen as a dynamic version of a sum array.
The tree supports two $O(\log n)$ time operations:
calculating the sum of elements in a range,
and modifying the value of an element.

The benefit of using a binary indexed tree is
that the elements of the underlying array
can be efficiently updated between the queries.
This would not be possible with a sum array,
because after each update, we would have to rebuild the
whole sum array in $O(n)$ time.

\subsubsection{Structure}

Given an array of $n$ elements, indexed $1 \ldots n$,
the binary indexed tree for that array
is an array such that the value at position $k$
equals the sum of elements in the original array in a range
that ends at position $k$.
The length of the range is the largest power of two
that divides $k$.
For example, if $k=6$, the length of the range is $2$,
because $2$ divides $6$ but $4$ does not divide $6$.

\begin{samepage}
For example, consider the following array:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$8$};
\node at (4.5,0.5) {$6$};
\node at (5.5,0.5) {$1$};
\node at (6.5,0.5) {$4$};
\node at (7.5,0.5) {$2$};

\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\end{tikzpicture}
\end{center}
\end{samepage}
The corresponding binary indexed tree is as follows:
\begin{center}
\begin{tikzpicture}[scale=0.7]
%\fill[color=lightgray] (3,0) rectangle (7,1);
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$4$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$16$};
\node at (4.5,0.5) {$6$};
\node at (5.5,0.5) {$7$};
\node at (6.5,0.5) {$4$};
\node at (7.5,0.5) {$29$};

\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};

\draw[->,thick] (0.5,-0.9) -- (0.5,-0.1);
\draw[->,thick] (2.5,-0.9) -- (2.5,-0.1);
\draw[->,thick] (4.5,-0.9) -- (4.5,-0.1);
\draw[->,thick] (6.5,-0.9) -- (6.5,-0.1);
\draw[->,thick] (1.5,-1.9) -- (1.5,-0.1);
\draw[->,thick] (5.5,-1.9) -- (5.5,-0.1);
\draw[->,thick] (3.5,-2.9) -- (3.5,-0.1);
\draw[->,thick] (7.5,-3.9) -- (7.5,-0.1);

\draw (0,-1) -- (1,-1) -- (1,-1.5) -- (0,-1.5) -- (0,-1);
\draw (2,-1) -- (3,-1) -- (3,-1.5) -- (2,-1.5) -- (2,-1);
\draw (4,-1) -- (5,-1) -- (5,-1.5) -- (4,-1.5) -- (4,-1);
\draw (6,-1) -- (7,-1) -- (7,-1.5) -- (6,-1.5) -- (6,-1);
\draw (0,-2) -- (2,-2) -- (2,-2.5) -- (0,-2.5) -- (0,-2);
\draw (4,-2) -- (6,-2) -- (6,-2.5) -- (4,-2.5) -- (4,-2);
\draw (0,-3) -- (4,-3) -- (4,-3.5) -- (0,-3.5) -- (0,-3);
\draw (0,-4) -- (8,-4) -- (8,-4.5) -- (0,-4.5) -- (0,-4);
\end{tikzpicture}
\end{center}

For example, the value at position 6
in the binary indexed tree is 7,
because the sum of elements in the range $[5,6]$
in the original array is $6+1=7$.

\subsubsection{Sum query}

The basic operation in a binary indexed tree is
to calculate the sum of elements in a range $[1,k]$,
where $k$ is any position in the array.
The sum of such a range can be calculated as a
sum of one or more values stored in the tree.

For example, the range $[1,7]$ corresponds to
the following values:
\begin{center}
\begin{tikzpicture}[scale=0.7]
%\fill[color=lightgray] (3,0) rectangle (7,1);
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$4$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$16$};
\node at (4.5,0.5) {$6$};
\node at (5.5,0.5) {$7$};
\node at (6.5,0.5) {$4$};
\node at (7.5,0.5) {$29$};

\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};

\draw[->,thick] (0.5,-0.9) -- (0.5,-0.1);
\draw[->,thick] (2.5,-0.9) -- (2.5,-0.1);
\draw[->,thick] (4.5,-0.9) -- (4.5,-0.1);
\draw[->,thick] (6.5,-0.9) -- (6.5,-0.1);
\draw[->,thick] (1.5,-1.9) -- (1.5,-0.1);
\draw[->,thick] (5.5,-1.9) -- (5.5,-0.1);
\draw[->,thick] (3.5,-2.9) -- (3.5,-0.1);
\draw[->,thick] (7.5,-3.9) -- (7.5,-0.1);

\draw (0,-1) -- (1,-1) -- (1,-1.5) -- (0,-1.5) -- (0,-1);
\draw (2,-1) -- (3,-1) -- (3,-1.5) -- (2,-1.5) -- (2,-1);
\draw (4,-1) -- (5,-1) -- (5,-1.5) -- (4,-1.5) -- (4,-1);
\draw[fill=lightgray] (6,-1) -- (7,-1) -- (7,-1.5) -- (6,-1.5) -- (6,-1);
\draw (0,-2) -- (2,-2) -- (2,-2.5) -- (0,-2.5) -- (0,-2);
\draw[fill=lightgray] (4,-2) -- (6,-2) -- (6,-2.5) -- (4,-2.5) -- (4,-2);
\draw[fill=lightgray] (0,-3) -- (4,-3) -- (4,-3.5) -- (0,-3.5) -- (0,-3);
\draw (0,-4) -- (8,-4) -- (8,-4.5) -- (0,-4.5) -- (0,-4);
\end{tikzpicture}
\end{center}

Hence, the sum of elements in the range $[1,7]$ is $16+7+4=27$.
The structure of the binary indexed tree allows us to calculate
the sum of elements in any range $[1,k]$ using only $O(\log n)$
values from the tree.

Using the same technique that we previously used
with a sum array,
we can efficiently calculate the sum of any range
$[a,b]$ by subtracting the sum of the range $[1,a-1]$
from the sum of the range $[1,b]$.
Also here, only $O(\log n)$ values are needed,
because it suffices to calculate two sums of $[1,k]$ ranges.

\subsubsection{Array update}

When an element in the original array changes,
several sums in the binary indexed tree change.
For example, if the element at position 3 changes,
the sums of the following ranges change:
\begin{center}
\begin{tikzpicture}[scale=0.7]
%\fill[color=lightgray] (3,0) rectangle (7,1);
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$4$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$16$};
\node at (4.5,0.5) {$6$};
\node at (5.5,0.5) {$7$};
\node at (6.5,0.5) {$4$};
\node at (7.5,0.5) {$29$};

\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};

\draw[->,thick] (0.5,-0.9) -- (0.5,-0.1);
\draw[->,thick] (2.5,-0.9) -- (2.5,-0.1);
\draw[->,thick] (4.5,-0.9) -- (4.5,-0.1);
\draw[->,thick] (6.5,-0.9) -- (6.5,-0.1);
\draw[->,thick] (1.5,-1.9) -- (1.5,-0.1);
\draw[->,thick] (5.5,-1.9) -- (5.5,-0.1);
\draw[->,thick] (3.5,-2.9) -- (3.5,-0.1);
\draw[->,thick] (7.5,-3.9) -- (7.5,-0.1);

\draw (0,-1) -- (1,-1) -- (1,-1.5) -- (0,-1.5) -- (0,-1);
\draw[fill=lightgray] (2,-1) -- (3,-1) -- (3,-1.5) -- (2,-1.5) -- (2,-1);
\draw (4,-1) -- (5,-1) -- (5,-1.5) -- (4,-1.5) -- (4,-1);
\draw (6,-1) -- (7,-1) -- (7,-1.5) -- (6,-1.5) -- (6,-1);
\draw (0,-2) -- (2,-2) -- (2,-2.5) -- (0,-2.5) -- (0,-2);
\draw (4,-2) -- (6,-2) -- (6,-2.5) -- (4,-2.5) -- (4,-2);
\draw[fill=lightgray] (0,-3) -- (4,-3) -- (4,-3.5) -- (0,-3.5) -- (0,-3);
\draw[fill=lightgray] (0,-4) -- (8,-4) -- (8,-4.5) -- (0,-4.5) -- (0,-4);
\end{tikzpicture}
\end{center}

However, it turns out that
the number of values that need to be updated
in the binary indexed tree is only $O(\log n)$.

\subsubsection{Implementation}

The operations of a binary indexed tree can be implemented
in an elegant and efficient way using bit operations.
The key fact needed is that $k \& -k$
isolates the lowest one bit of a number $k$.
For example, $6 \& -6=2$ because the number $6$
corresponds to 110 in binary and the number $2$ corresponds to 10.

It turns out that when processing a sum query,
the position $k$ in the binary indexed tree should be
decreased by $k \& -k$ at every step,
and when updating the array,
the position $k$ should be increased by $k \& -k$ at every step.
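
As an illustration of these two rules, consider the tree of size $n=8$ in the figures above.
When calculating the sum of the range $[1,7]$, the position changes as follows:
\[
7 \;\rightarrow\; 6 \;\rightarrow\; 4 \;\rightarrow\; 0,
\]
because $7 \& -7 = 1$, $6 \& -6 = 2$ and $4 \& -4 = 4$.
The values at positions 7, 6 and 4 are added together, which gives $4+7+16=27$.
When the element at position 3 is updated, the position changes as follows:
\[
3 \;\rightarrow\; 4 \;\rightarrow\; 8,
\]
because $3 \& -3 = 1$ and $4 \& -4 = 4$.
The next position would be $8+8=16$, which is larger than $n$, so the process stops.
Positions 3, 4 and 8 are exactly those whose ranges contain position 3.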

Suppose that the binary indexed tree is stored in an array \texttt{b}.
The following function \texttt{sum} calculates
the sum of elements in the range $[1,k]$:
\begin{lstlisting}
int sum(int k) {
    int s = 0;
    while (k >= 1) {
        s += b[k];
        k -= k&-k;
    }
    return s;
}
\end{lstlisting}

The following function \texttt{add} increases the value
of the element at position $k$ by $x$
($x$ can be positive or negative):
\begin{lstlisting}
void add(int k, int x) {
    while (k <= n) {
        b[k] += x;
        k += k&-k;
    }
}
\end{lstlisting}

The time complexity of both functions is
$O(\log n)$, because the functions access $O(\log n)$
values in the binary indexed tree, and each transition
to the next position
takes $O(1)$ time using bit operations.
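
The two functions can be wrapped into a small self-contained structure.
The following is a sketch under stated assumptions:
the struct name \texttt{BinaryIndexedTree} and the method \texttt{range\_sum} are hypothetical,
and the tree is built simply by calling \texttt{add} once for each element,
which takes $O(n \log n)$ time in total.

```cpp
#include <vector>

// A binary indexed tree over positions 1..n, initially all zeros.
struct BinaryIndexedTree {
    int n;
    std::vector<long long> b;  // b[1..n] holds the tree; b[0] is unused

    explicit BinaryIndexedTree(int size) : n(size), b(size + 1, 0) {}

    // Increases the element at position k (1 <= k <= n) by x.
    void add(int k, long long x) {
        while (k <= n) {
            b[k] += x;
            k += k & -k;
        }
    }

    // Sum of the range [1,k]; returns 0 when k = 0.
    long long sum(int k) const {
        long long s = 0;
        while (k >= 1) {
            s += b[k];
            k -= k & -k;
        }
        return s;
    }

    // Sum of the range [lo,hi], assuming 1 <= lo <= hi <= n.
    long long range_sum(int lo, int hi) const {
        return sum(hi) - sum(lo - 1);
    }
};
```

After adding the elements of the example array one by one,
the contents of \texttt{b} match the tree shown in the figures,
and \texttt{range\_sum(4, 7)} is calculated as $27-8=19$.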

\section{Segment tree}

\index{segment tree}

A \key{segment tree} is a data structure
that supports two operations:
processing a range query for a range $[a,b]$
and updating the element at index $k$.
Using a segment tree, we can implement sum
queries, minimum queries and many other
queries so that both operations work in $O(\log n)$ time.

Compared to a binary indexed tree,
the advantage of a segment tree is that it is
a more general data structure.
While binary indexed trees only support
sum queries, segment trees also support other queries.
On the other hand, a segment tree requires more
memory and is a bit more difficult to implement.

\subsubsection{Structure}

A segment tree contains $2n-1$ nodes
so that the bottom $n$ nodes correspond
to the original array and the other nodes
contain information needed for range queries.
The values in a segment tree depend on
the supported query type.
We will first assume that the supported
query is the sum query.

For example, the array
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);

\node at (0.5,0.5) {$5$};
\node at (1.5,0.5) {$8$};
\node at (2.5,0.5) {$6$};
\node at (3.5,0.5) {$3$};
\node at (4.5,0.5) {$2$};
\node at (5.5,0.5) {$7$};
\node at (6.5,0.5) {$2$};
\node at (7.5,0.5) {$6$};

\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\end{tikzpicture}
\end{center}
corresponds to the following segment tree:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);

\node[anchor=center] at (0.5, 0.5) {5};
\node[anchor=center] at (1.5, 0.5) {8};
\node[anchor=center] at (2.5, 0.5) {6};
\node[anchor=center] at (3.5, 0.5) {3};
\node[anchor=center] at (4.5, 0.5) {2};
\node[anchor=center] at (5.5, 0.5) {7};
\node[anchor=center] at (6.5, 0.5) {2};
\node[anchor=center] at (7.5, 0.5) {6};

\node[draw, circle] (a) at (1,2.5) {13};
\path[draw,thick,-] (a) -- (0.5,1);
\path[draw,thick,-] (a) -- (1.5,1);
\node[draw, circle,minimum size=22pt] (b) at (3,2.5) {9};
\path[draw,thick,-] (b) -- (2.5,1);
\path[draw,thick,-] (b) -- (3.5,1);
\node[draw, circle,minimum size=22pt] (c) at (5,2.5) {9};
\path[draw,thick,-] (c) -- (4.5,1);
\path[draw,thick,-] (c) -- (5.5,1);
\node[draw, circle,minimum size=22pt] (d) at (7,2.5) {8};
\path[draw,thick,-] (d) -- (6.5,1);
\path[draw,thick,-] (d) -- (7.5,1);

\node[draw, circle] (i) at (2,4.5) {22};
\path[draw,thick,-] (i) -- (a);
\path[draw,thick,-] (i) -- (b);
\node[draw, circle] (j) at (6,4.5) {17};
\path[draw,thick,-] (j) -- (c);
\path[draw,thick,-] (j) -- (d);

\node[draw, circle] (m) at (4,6.5) {39};
\path[draw,thick,-] (m) -- (i);
\path[draw,thick,-] (m) -- (j);
\end{tikzpicture}
\end{center}
|
|
|
|
|
2017-01-03 21:51:20 +01:00
|
|
|
Each internal node in the segment tree contains
information about a range of size $2^k$
in the original array.
In the above tree, the value of each internal
node is the sum of the corresponding array elements,
and it can be calculated as the sum of
the values of its left and right child nodes.

It is convenient to build a segment tree
when the size of the array is a power of two
and the tree is a complete binary tree.
In the sequel, we will assume that the tree
is built like this.
If the size of the array is not a power of two,
we can always pad it with zero elements
until the size is a power of two.
|
|
|
|
|
|
|
|
\subsubsection{Range query}
|
|
|
|
|
|
|
|
In a segment tree, the answer for a range query
is calculated from nodes whose ranges are completely
inside the query range
and that are as high as possible in the tree.
Each node gives the answer for a subrange,
and the answer for the entire range can be
calculated by combining these values.
|
|
|
|
|
|
|
|
For example, consider the following range:
|
2016-12-28 23:54:51 +01:00
|
|
|
\begin{center}
|
|
|
|
\begin{tikzpicture}[scale=0.7]
|
|
|
|
\fill[color=gray!50] (2,0) rectangle (8,1);
|
|
|
|
\draw (0,0) grid (8,1);
|
|
|
|
|
|
|
|
\node[anchor=center] at (0.5, 0.5) {5};
|
|
|
|
\node[anchor=center] at (1.5, 0.5) {8};
|
|
|
|
\node[anchor=center] at (2.5, 0.5) {6};
|
|
|
|
\node[anchor=center] at (3.5, 0.5) {3};
|
|
|
|
\node[anchor=center] at (4.5, 0.5) {2};
|
|
|
|
\node[anchor=center] at (5.5, 0.5) {7};
|
|
|
|
\node[anchor=center] at (6.5, 0.5) {2};
|
|
|
|
\node[anchor=center] at (7.5, 0.5) {6};
|
|
|
|
|
|
|
|
\footnotesize
|
|
|
|
\node at (0.5,1.4) {$1$};
|
|
|
|
\node at (1.5,1.4) {$2$};
|
|
|
|
\node at (2.5,1.4) {$3$};
|
|
|
|
\node at (3.5,1.4) {$4$};
|
|
|
|
\node at (4.5,1.4) {$5$};
|
|
|
|
\node at (5.5,1.4) {$6$};
|
|
|
|
\node at (6.5,1.4) {$7$};
|
|
|
|
\node at (7.5,1.4) {$8$};
|
|
|
|
\end{tikzpicture}
|
|
|
|
\end{center}
|
2017-01-03 21:51:20 +01:00
|
|
|
The sum of elements in the range $[3,8]$ is
|
|
|
|
$6+3+2+7+2+6=26$.
|
|
|
|
The sum can be calculated from the segment tree
|
|
|
|
using the following subranges:
|
2016-12-28 23:54:51 +01:00
|
|
|
\begin{center}
|
|
|
|
\begin{tikzpicture}[scale=0.7]
|
|
|
|
\draw (0,0) grid (8,1);
|
|
|
|
|
|
|
|
\node[anchor=center] at (0.5, 0.5) {5};
|
|
|
|
\node[anchor=center] at (1.5, 0.5) {8};
|
|
|
|
\node[anchor=center] at (2.5, 0.5) {6};
|
|
|
|
\node[anchor=center] at (3.5, 0.5) {3};
|
|
|
|
\node[anchor=center] at (4.5, 0.5) {2};
|
|
|
|
\node[anchor=center] at (5.5, 0.5) {7};
|
|
|
|
\node[anchor=center] at (6.5, 0.5) {2};
|
|
|
|
\node[anchor=center] at (7.5, 0.5) {6};
|
|
|
|
|
|
|
|
\node[draw, circle] (a) at (1,2.5) {13};
|
|
|
|
\path[draw,thick,-] (a) -- (0.5,1);
|
|
|
|
\path[draw,thick,-] (a) -- (1.5,1);
|
|
|
|
\node[draw, circle,fill=gray!50,minimum size=22pt] (b) at (3,2.5) {9};
|
|
|
|
\path[draw,thick,-] (b) -- (2.5,1);
|
|
|
|
\path[draw,thick,-] (b) -- (3.5,1);
|
|
|
|
\node[draw, circle,minimum size=22pt] (c) at (5,2.5) {9};
|
|
|
|
\path[draw,thick,-] (c) -- (4.5,1);
|
|
|
|
\path[draw,thick,-] (c) -- (5.5,1);
|
|
|
|
\node[draw, circle,minimum size=22pt] (d) at (7,2.5) {8};
|
|
|
|
\path[draw,thick,-] (d) -- (6.5,1);
|
|
|
|
\path[draw,thick,-] (d) -- (7.5,1);
|
|
|
|
|
|
|
|
\node[draw, circle] (i) at (2,4.5) {22};
|
|
|
|
\path[draw,thick,-] (i) -- (a);
|
|
|
|
\path[draw,thick,-] (i) -- (b);
|
|
|
|
\node[draw, circle,fill=gray!50] (j) at (6,4.5) {17};
|
|
|
|
\path[draw,thick,-] (j) -- (c);
|
|
|
|
\path[draw,thick,-] (j) -- (d);
|
|
|
|
|
|
|
|
\node[draw, circle] (m) at (4,6.5) {39};
|
|
|
|
\path[draw,thick,-] (m) -- (i);
|
|
|
|
\path[draw,thick,-] (m) -- (j);
|
|
|
|
\end{tikzpicture}
|
|
|
|
\end{center}
|
2017-01-03 21:51:20 +01:00
|
|
|
Thus, the sum of the range is $9+17=26$.
|
2016-12-28 23:54:51 +01:00
|
|
|
|
2017-01-03 21:51:20 +01:00
|
|
|
When the answer for a range query is
calculated using nodes that are as high as possible,
at most two nodes on each level
of the segment tree are needed.
Because of this, the total number of nodes
examined is only $O(\log n)$.
|
2016-12-28 23:54:51 +01:00
|
|
|
|
2017-01-03 21:51:20 +01:00
|
|
|
\subsubsection{Array update}
|
2016-12-28 23:54:51 +01:00
|
|
|
|
2017-01-03 21:51:20 +01:00
|
|
|
When an element in the array changes,
|
|
|
|
we should update all nodes in the segment tree
|
|
|
|
whose value depends on the changed element.
|
|
|
|
This can be done by travelling from the bottom
|
|
|
|
to the top in the tree and updating the nodes along the path.
|
2016-12-28 23:54:51 +01:00
|
|
|
|
|
|
|
\begin{samepage}
|
2017-01-03 21:51:20 +01:00
|
|
|
The following picture shows which nodes in the segment tree
change if the element at position 6 (whose value is 7) changes.
|
2016-12-28 23:54:51 +01:00
|
|
|
|
|
|
|
\begin{center}
|
|
|
|
\begin{tikzpicture}[scale=0.7]
|
|
|
|
\fill[color=gray!50] (5,0) rectangle (6,1);
|
|
|
|
\draw (0,0) grid (8,1);
|
|
|
|
|
|
|
|
\node[anchor=center] at (0.5, 0.5) {5};
|
|
|
|
\node[anchor=center] at (1.5, 0.5) {8};
|
|
|
|
\node[anchor=center] at (2.5, 0.5) {6};
|
|
|
|
\node[anchor=center] at (3.5, 0.5) {3};
|
|
|
|
\node[anchor=center] at (4.5, 0.5) {2};
|
|
|
|
\node[anchor=center] at (5.5, 0.5) {7};
|
|
|
|
\node[anchor=center] at (6.5, 0.5) {2};
|
|
|
|
\node[anchor=center] at (7.5, 0.5) {6};
|
|
|
|
|
|
|
|
\node[draw, circle] (a) at (1,2.5) {13};
|
|
|
|
\path[draw,thick,-] (a) -- (0.5,1);
|
|
|
|
\path[draw,thick,-] (a) -- (1.5,1);
|
|
|
|
\node[draw, circle,minimum size=22pt] (b) at (3,2.5) {9};
|
|
|
|
\path[draw,thick,-] (b) -- (2.5,1);
|
|
|
|
\path[draw,thick,-] (b) -- (3.5,1);
|
|
|
|
\node[draw, circle,minimum size=22pt,fill=gray!50] (c) at (5,2.5) {9};
|
|
|
|
\path[draw,thick,-] (c) -- (4.5,1);
|
|
|
|
\path[draw,thick,-] (c) -- (5.5,1);
|
|
|
|
\node[draw, circle,minimum size=22pt] (d) at (7,2.5) {8};
|
|
|
|
\path[draw,thick,-] (d) -- (6.5,1);
|
|
|
|
\path[draw,thick,-] (d) -- (7.5,1);
|
|
|
|
|
|
|
|
\node[draw, circle] (i) at (2,4.5) {22};
|
|
|
|
\path[draw,thick,-] (i) -- (a);
|
|
|
|
\path[draw,thick,-] (i) -- (b);
|
|
|
|
\node[draw, circle,fill=gray!50] (j) at (6,4.5) {17};
|
|
|
|
\path[draw,thick,-] (j) -- (c);
|
|
|
|
\path[draw,thick,-] (j) -- (d);
|
|
|
|
|
|
|
|
\node[draw, circle,fill=gray!50] (m) at (4,6.5) {39};
|
|
|
|
\path[draw,thick,-] (m) -- (i);
|
|
|
|
\path[draw,thick,-] (m) -- (j);
|
|
|
|
\end{tikzpicture}
|
|
|
|
\end{center}
|
|
|
|
\end{samepage}
|
|
|
|
|
2017-01-03 21:51:20 +01:00
|
|
|
The path from the bottom of the segment tree to the top
|
|
|
|
always consists of $O(\log n)$ nodes,
|
|
|
|
so updating the array affects $O(\log n)$ nodes in the tree.
|
2016-12-28 23:54:51 +01:00
|
|
|
|
2017-01-03 21:51:20 +01:00
|
|
|
\subsubsection{Storing the tree}
|
2016-12-28 23:54:51 +01:00
|
|
|
|
2017-01-03 21:51:20 +01:00
|
|
|
A segment tree can be stored as an array
of $2N$ elements where $N$ is a power of two.
From now on, we will assume that the indices
of the original array are between $0$ and $N-1$.

The element at index 1 in the segment tree array
contains the top node of the tree,
the elements at indices 2 and 3 correspond to
the second level of the tree, and so on.
Finally, the elements beginning from index $N$
contain the bottom level of the tree, i.e.,
the actual content of the original array.
The element at index 0 is not used.
|
2016-12-28 23:54:51 +01:00
|
|
|
|
2017-01-03 21:51:20 +01:00
|
|
|
For example, the segment tree
|
2016-12-28 23:54:51 +01:00
|
|
|
\begin{center}
|
|
|
|
\begin{tikzpicture}[scale=0.7]
|
|
|
|
\draw (0,0) grid (8,1);
|
|
|
|
|
|
|
|
\node[anchor=center] at (0.5, 0.5) {5};
|
|
|
|
\node[anchor=center] at (1.5, 0.5) {8};
|
|
|
|
\node[anchor=center] at (2.5, 0.5) {6};
|
|
|
|
\node[anchor=center] at (3.5, 0.5) {3};
|
|
|
|
\node[anchor=center] at (4.5, 0.5) {2};
|
|
|
|
\node[anchor=center] at (5.5, 0.5) {7};
|
|
|
|
\node[anchor=center] at (6.5, 0.5) {2};
|
|
|
|
\node[anchor=center] at (7.5, 0.5) {6};
|
|
|
|
|
|
|
|
\node[draw, circle] (a) at (1,2.5) {13};
|
|
|
|
\path[draw,thick,-] (a) -- (0.5,1);
|
|
|
|
\path[draw,thick,-] (a) -- (1.5,1);
|
|
|
|
\node[draw, circle,minimum size=22pt] (b) at (3,2.5) {9};
|
|
|
|
\path[draw,thick,-] (b) -- (2.5,1);
|
|
|
|
\path[draw,thick,-] (b) -- (3.5,1);
|
|
|
|
\node[draw, circle,minimum size=22pt] (c) at (5,2.5) {9};
|
|
|
|
\path[draw,thick,-] (c) -- (4.5,1);
|
|
|
|
\path[draw,thick,-] (c) -- (5.5,1);
|
|
|
|
\node[draw, circle,minimum size=22pt] (d) at (7,2.5) {8};
|
|
|
|
\path[draw,thick,-] (d) -- (6.5,1);
|
|
|
|
\path[draw,thick,-] (d) -- (7.5,1);
|
|
|
|
|
|
|
|
\node[draw, circle] (i) at (2,4.5) {22};
|
|
|
|
\path[draw,thick,-] (i) -- (a);
|
|
|
|
\path[draw,thick,-] (i) -- (b);
|
|
|
|
\node[draw, circle] (j) at (6,4.5) {17};
|
|
|
|
\path[draw,thick,-] (j) -- (c);
|
|
|
|
\path[draw,thick,-] (j) -- (d);
|
|
|
|
|
|
|
|
\node[draw, circle] (m) at (4,6.5) {39};
|
|
|
|
\path[draw,thick,-] (m) -- (i);
|
|
|
|
\path[draw,thick,-] (m) -- (j);
|
|
|
|
\end{tikzpicture}
|
|
|
|
\end{center}
|
2017-01-03 21:51:20 +01:00
|
|
|
can be stored as follows ($N=8$):
|
2016-12-28 23:54:51 +01:00
|
|
|
\begin{center}
|
|
|
|
\begin{tikzpicture}[scale=0.7]
|
|
|
|
%\fill[color=lightgray] (3,0) rectangle (7,1);
|
|
|
|
\draw (0,0) grid (15,1);
|
|
|
|
|
|
|
|
\node at (0.5,0.5) {$39$};
|
|
|
|
\node at (1.5,0.5) {$22$};
|
|
|
|
\node at (2.5,0.5) {$17$};
|
|
|
|
\node at (3.5,0.5) {$13$};
|
|
|
|
\node at (4.5,0.5) {$9$};
|
|
|
|
\node at (5.5,0.5) {$9$};
|
|
|
|
\node at (6.5,0.5) {$8$};
|
|
|
|
\node at (7.5,0.5) {$5$};
|
|
|
|
\node at (8.5,0.5) {$8$};
|
|
|
|
\node at (9.5,0.5) {$6$};
|
|
|
|
\node at (10.5,0.5) {$3$};
|
|
|
|
\node at (11.5,0.5) {$2$};
|
|
|
|
\node at (12.5,0.5) {$7$};
|
|
|
|
\node at (13.5,0.5) {$2$};
|
|
|
|
\node at (14.5,0.5) {$6$};
|
|
|
|
|
|
|
|
\footnotesize
|
|
|
|
\node at (0.5,1.4) {$1$};
|
|
|
|
\node at (1.5,1.4) {$2$};
|
|
|
|
\node at (2.5,1.4) {$3$};
|
|
|
|
\node at (3.5,1.4) {$4$};
|
|
|
|
\node at (4.5,1.4) {$5$};
|
|
|
|
\node at (5.5,1.4) {$6$};
|
|
|
|
\node at (6.5,1.4) {$7$};
|
|
|
|
\node at (7.5,1.4) {$8$};
|
|
|
|
\node at (8.5,1.4) {$9$};
|
|
|
|
\node at (9.5,1.4) {$10$};
|
|
|
|
\node at (10.5,1.4) {$11$};
|
|
|
|
\node at (11.5,1.4) {$12$};
|
|
|
|
\node at (12.5,1.4) {$13$};
|
|
|
|
\node at (13.5,1.4) {$14$};
|
|
|
|
\node at (14.5,1.4) {$15$};
|
|
|
|
\end{tikzpicture}
|
|
|
|
\end{center}
|
2017-01-03 21:51:20 +01:00
|
|
|
Using this representation,
|
|
|
|
for a node at index $k$,
|
2016-12-28 23:54:51 +01:00
|
|
|
\begin{itemize}
|
2017-01-03 21:51:20 +01:00
|
|
|
\item the parent node is at index $\lfloor k/2 \rfloor$,
|
|
|
|
\item the left child node is at index $2k$, and
|
|
|
|
\item the right child node is at index $2k+1$.
|
2016-12-28 23:54:51 +01:00
|
|
|
\end{itemize}
|
2017-01-03 21:51:20 +01:00
|
|
|
Note that this implies that the index of a node
|
|
|
|
is even if it is a left child and odd if it is a right child.
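
As an illustration of this layout, the following sketch builds
the tree array for sum queries from a given array.
The function name \texttt{build} and the variable \texttt{tree}
do not appear elsewhere in the text; they are assumptions made for this example.
The input is padded with zeros up to the next power of two.

\begin{lstlisting}
#include <vector>
using namespace std;

// Builds a sum segment tree in the layout described above.
// Returns an array of 2N elements where N is the smallest
// power of two that is at least the size of the input.
// The element at index 0 is not used.
vector<long long> build(const vector<int>& values) {
    int N = 1;
    while (N < (int)values.size()) N *= 2;
    vector<long long> tree(2*N, 0);
    // bottom level: the original array, padded with zeros
    for (int i = 0; i < (int)values.size(); i++) {
        tree[N+i] = values[i];
    }
    // internal nodes: each value is the sum of the two children
    for (int k = N-1; k >= 1; k--) {
        tree[k] = tree[2*k]+tree[2*k+1];
    }
    return tree;
}
\end{lstlisting}

Since the children of a node always have larger indices than the node itself,
processing the indices in decreasing order guarantees that both children
are ready when a node is calculated. The construction takes $O(N)$ time.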
|
2016-12-28 23:54:51 +01:00
|
|
|
|
2017-01-03 21:51:20 +01:00
|
|
|
\subsubsection{Functions}
|
2016-12-28 23:54:51 +01:00
|
|
|
|
2017-01-03 21:51:20 +01:00
|
|
|
We assume that the segment tree is stored
|
|
|
|
in the array \texttt{p}.
|
|
|
|
The following function calculates the sum of range $[a,b]$:
|
2016-12-28 23:54:51 +01:00
|
|
|
|
|
|
|
\begin{lstlisting}
int sum(int a, int b) {
    // move to the bottom level of the tree
    a += N; b += N;
    int s = 0;
    while (a <= b) {
        // a right child at the left border is added separately
        if (a%2 == 1) s += p[a++];
        // a left child at the right border is added separately
        if (b%2 == 0) s += p[b--];
        // move one level upwards
        a /= 2; b /= 2;
    }
    return s;
}
\end{lstlisting}
|
|
|
|
|
2017-01-03 21:51:20 +01:00
|
|
|
The function begins from the bottom of the tree
and moves step by step upwards in the tree.
The function accumulates the range sum in
the variable \texttt{s} by combining the sums in the tree nodes.
The value of a node is added to the sum if
the range of its parent node is not completely inside the range $[a,b]$.
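
As an illustration, we can trace the function by hand for the call
\texttt{sum(2,7)} in the example tree where $N=8$.
With indices $0 \ldots 7$, this is the range that contains the values
$6,3,2,7,2,6$. The trace is a sketch that assumes
the array \texttt{p} has the contents shown in the previous picture.

\begin{itemize}
\item Initially $a=10$ and $b=15$. Since $a$ is even and $b$ is odd,
nothing is added, and the indices become $a=5$ and $b=7$.
\item Now $a$ is odd, so $\texttt{p}[5]=9$ is added and $a$ becomes 6.
Since $b$ is odd, nothing is added for $b$.
The indices become $a=3$ and $b=3$.
\item Again $a$ is odd, so $\texttt{p}[3]=17$ is added and $a$ becomes 4.
After halving, $a=2$ and $b=1$, so the loop ends.
\end{itemize}

The result is $9+17=26$, which equals $6+3+2+7+2+6$.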
|
2016-12-28 23:54:51 +01:00
|
|
|
|
2017-01-03 21:51:20 +01:00
|
|
|
The function \texttt{add} increases the value
|
|
|
|
of element $k$ by $x$:
|
2016-12-28 23:54:51 +01:00
|
|
|
|
|
|
|
\begin{lstlisting}
void add(int k, int x) {
    // update the bottom level
    k += N;
    p[k] += x;
    // recalculate the nodes on the path to the top node
    for (k /= 2; k >= 1; k /= 2) {
        p[k] = p[2*k]+p[2*k+1];
    }
}
\end{lstlisting}
|
2017-01-03 21:51:20 +01:00
|
|
|
First the function updates the bottom level
of the tree that corresponds to the original array.
After this, the function updates the values of the
internal nodes on the path upwards, until it reaches
the root node of the tree.

Both operations in the segment tree work
in $O(\log n)$ time because a segment tree
of $n$ elements consists of $O(\log n)$ levels,
and the operations move one level upwards at each step.
|
|
|
|
|
|
|
|
\subsubsection{Other queries}
|
|
|
|
|
|
|
|
Besides the sum query,
the segment tree can support any range query
where the answer for range $[a,b]$
can be efficiently calculated
from the answers for ranges $[a,c]$ and $[c+1,b]$, where
$c$ is some index between $a$ and $b$.
Such queries are, for example,
minimum and maximum, greatest common divisor,
and the bit operations and, or and xor.
|
2016-12-28 23:54:51 +01:00
|
|
|
|
|
|
|
\begin{samepage}
|
2017-01-03 21:51:20 +01:00
|
|
|
For example, the following segment tree
|
|
|
|
supports minimum queries:
|
2016-12-28 23:54:51 +01:00
|
|
|
|
|
|
|
\begin{center}
|
|
|
|
\begin{tikzpicture}[scale=0.7]
|
|
|
|
\draw (0,0) grid (8,1);
|
|
|
|
|
|
|
|
\node[anchor=center] at (0.5, 0.5) {5};
|
|
|
|
\node[anchor=center] at (1.5, 0.5) {8};
|
|
|
|
\node[anchor=center] at (2.5, 0.5) {6};
|
|
|
|
\node[anchor=center] at (3.5, 0.5) {3};
|
|
|
|
\node[anchor=center] at (4.5, 0.5) {1};
|
|
|
|
\node[anchor=center] at (5.5, 0.5) {7};
|
|
|
|
\node[anchor=center] at (6.5, 0.5) {2};
|
|
|
|
\node[anchor=center] at (7.5, 0.5) {6};
|
|
|
|
|
|
|
|
\node[draw, circle,minimum size=22pt] (a) at (1,2.5) {5};
|
|
|
|
\path[draw,thick,-] (a) -- (0.5,1);
|
|
|
|
\path[draw,thick,-] (a) -- (1.5,1);
|
|
|
|
\node[draw, circle,minimum size=22pt] (b) at (3,2.5) {3};
|
|
|
|
\path[draw,thick,-] (b) -- (2.5,1);
|
|
|
|
\path[draw,thick,-] (b) -- (3.5,1);
|
|
|
|
\node[draw, circle,minimum size=22pt] (c) at (5,2.5) {1};
|
|
|
|
\path[draw,thick,-] (c) -- (4.5,1);
|
|
|
|
\path[draw,thick,-] (c) -- (5.5,1);
|
|
|
|
\node[draw, circle,minimum size=22pt] (d) at (7,2.5) {2};
|
|
|
|
\path[draw,thick,-] (d) -- (6.5,1);
|
|
|
|
\path[draw,thick,-] (d) -- (7.5,1);
|
|
|
|
|
|
|
|
\node[draw, circle,minimum size=22pt] (i) at (2,4.5) {3};
|
|
|
|
\path[draw,thick,-] (i) -- (a);
|
|
|
|
\path[draw,thick,-] (i) -- (b);
|
|
|
|
\node[draw, circle,minimum size=22pt] (j) at (6,4.5) {1};
|
|
|
|
\path[draw,thick,-] (j) -- (c);
|
|
|
|
\path[draw,thick,-] (j) -- (d);
|
|
|
|
|
|
|
|
\node[draw, circle,minimum size=22pt] (m) at (4,6.5) {1};
|
|
|
|
\path[draw,thick,-] (m) -- (i);
|
|
|
|
\path[draw,thick,-] (m) -- (j);
|
|
|
|
\end{tikzpicture}
|
|
|
|
\end{center}
|
|
|
|
\end{samepage}
|
|
|
|
|
2017-01-03 21:51:20 +01:00
|
|
|
In this segment tree, every node in the tree
contains the smallest element in the corresponding
range of the original array.
The top node of the tree contains the smallest
element in the array.
The tree can be implemented as before,
but instead of sums, minima are calculated.
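
The following sketch shows one way to adapt the earlier functions
to minimum queries.
The names \texttt{q}, \texttt{update} and \texttt{minimum} and the fixed size
$N=8$ are assumptions made for this example.
Note that unused positions are filled with a large value instead of zero,
so that they do not affect the minima.

\begin{lstlisting}
#include <algorithm>
#include <climits>
#include <vector>
using namespace std;

const int N = 8;              // size of the array, a power of two
vector<int> q(2*N, INT_MAX);  // q[1] is the top node

// Sets the value of element k to x.
void update(int k, int x) {
    k += N;
    q[k] = x;
    for (k /= 2; k >= 1; k /= 2) {
        q[k] = min(q[2*k], q[2*k+1]);
    }
}

// Returns the minimum element in range [a,b].
int minimum(int a, int b) {
    a += N; b += N;
    int m = INT_MAX;
    while (a <= b) {
        if (a%2 == 1) m = min(m, q[a++]);
        if (b%2 == 0) m = min(m, q[b--]);
        a /= 2; b /= 2;
    }
    return m;
}
\end{lstlisting}

Here the update operation sets a new value instead of adding to the old one,
because for minima it is not possible to update the internal
nodes without recalculating them from their children.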
|
|
|
|
|
|
|
|
\subsubsection{Binary search in tree}
|
|
|
|
|
|
|
|
The structure of the segment tree makes it possible
to use binary search.
For example, if the tree supports the minimum query,
we can find the index of the smallest
element in $O(\log n)$ time.

For example, in the following tree the
smallest element is 1, and it can be found
by following a path downwards from the top node:
|
2016-12-28 23:54:51 +01:00
|
|
|
|
|
|
|
\begin{center}
|
|
|
|
\begin{tikzpicture}[scale=0.7]
|
|
|
|
\draw (8,0) grid (16,1);
|
|
|
|
|
|
|
|
\node[anchor=center] at (8.5, 0.5) {9};
|
|
|
|
\node[anchor=center] at (9.5, 0.5) {5};
|
|
|
|
\node[anchor=center] at (10.5, 0.5) {7};
|
|
|
|
\node[anchor=center] at (11.5, 0.5) {1};
|
|
|
|
\node[anchor=center] at (12.5, 0.5) {6};
|
|
|
|
\node[anchor=center] at (13.5, 0.5) {2};
|
|
|
|
\node[anchor=center] at (14.5, 0.5) {3};
|
|
|
|
\node[anchor=center] at (15.5, 0.5) {2};
|
|
|
|
|
|
|
|
%\node[anchor=center] at (1,2.5) {13};
|
|
|
|
|
|
|
|
\node[draw, circle,minimum size=22pt] (e) at (9,2.5) {5};
|
|
|
|
\path[draw,thick,-] (e) -- (8.5,1);
|
|
|
|
\path[draw,thick,-] (e) -- (9.5,1);
|
|
|
|
\node[draw, circle,minimum size=22pt] (f) at (11,2.5) {1};
|
|
|
|
\path[draw,thick,-] (f) -- (10.5,1);
|
|
|
|
\path[draw,thick,-] (f) -- (11.5,1);
|
|
|
|
\node[draw, circle,minimum size=22pt] (g) at (13,2.5) {2};
|
|
|
|
\path[draw,thick,-] (g) -- (12.5,1);
|
|
|
|
\path[draw,thick,-] (g) -- (13.5,1);
|
|
|
|
\node[draw, circle,minimum size=22pt] (h) at (15,2.5) {2};
|
|
|
|
\path[draw,thick,-] (h) -- (14.5,1);
|
|
|
|
\path[draw,thick,-] (h) -- (15.5,1);
|
|
|
|
|
|
|
|
\node[draw, circle,minimum size=22pt] (k) at (10,4.5) {1};
|
|
|
|
\path[draw,thick,-] (k) -- (e);
|
|
|
|
\path[draw,thick,-] (k) -- (f);
|
|
|
|
\node[draw, circle,minimum size=22pt] (l) at (14,4.5) {2};
|
|
|
|
\path[draw,thick,-] (l) -- (g);
|
|
|
|
\path[draw,thick,-] (l) -- (h);
|
|
|
|
|
|
|
|
\node[draw, circle,minimum size=22pt] (n) at (12,6.5) {1};
|
|
|
|
\path[draw,thick,-] (n) -- (k);
|
|
|
|
\path[draw,thick,-] (n) -- (l);
|
|
|
|
|
|
|
|
|
|
|
|
\path[draw=red,thick,->,line width=2pt] (n) -- (k);
|
|
|
|
\path[draw=red,thick,->,line width=2pt] (k) -- (f);
|
|
|
|
\path[draw=red,thick,->,line width=2pt] (f) -- (11.5,1);
|
|
|
|
\end{tikzpicture}
|
|
|
|
\end{center}
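
The walk downwards can be sketched as follows.
The function name \texttt{min\_index} is an assumption made for this example,
and the function expects a minimum segment tree stored
in an array of $2N$ elements as described earlier.
If the smallest value appears several times, this version
returns the leftmost position.

\begin{lstlisting}
#include <vector>
using namespace std;

// Returns the position (0...N-1) of the smallest element
// in a minimum segment tree q that has 2N elements.
int min_index(const vector<int>& q) {
    int N = q.size()/2;
    int k = 1;
    while (k < N) {
        // move to a child that contains the minimum,
        // preferring the left child
        if (q[2*k] == q[k]) k = 2*k;
        else k = 2*k+1;
    }
    return k-N;
}
\end{lstlisting}

At each step the function moves one level downwards,
so it visits $O(\log n)$ nodes.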
|
|
|
|
|
2017-01-03 22:11:02 +01:00
|
|
|
\section{Additional techniques}
|
|
|
|
|
|
|
|
\subsubsection{Index compression}
|
|
|
|
|
|
|
|
A limitation in data structures that
are built upon an array is that
the elements are indexed using consecutive integers
$1,2,3,$ etc.
Difficulties arise when the indices
needed are large.
For example, using the index $10^9$ would
require an array of $10^9$
elements, which usually requires too much memory.
|
|
|
|
|
|
|
|
\index{index compression}
|
|
|
|
|
|
|
|
However, we can often bypass this limitation
by using \key{index compression},
where the indices are redistributed so that
they are integers $1,2,3,$ etc.
This can be done if we know all the indices
needed during the algorithm beforehand.

The idea is to replace each original index $x$
with index $p(x)$, where $p$ is a function that
redistributes the indices.
We require that the order of the indices
does not change, so if $a<b$, then $p(a)<p(b)$.
Thanks to this, we can conveniently perform queries
despite the fact that the indices are compressed.
|
|
|
|
|
|
|
|
For example, if the original indices are
|
|
|
|
$555$, $10^9$ and $8$, the new indices are:
|
2016-12-28 23:54:51 +01:00
|
|
|
|
|
|
|
\[
|
|
|
|
\begin{array}{lcl}
|
|
|
|
p(8) & = & 1 \\
|
|
|
|
p(555) & = & 2 \\
|
|
|
|
p(10^9) & = & 3 \\
|
|
|
|
\end{array}
|
|
|
|
\]
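
One common way to calculate such a function is to sort the indices,
remove duplicates and use binary search.
The following sketch illustrates this idea;
the function names \texttt{distinct\_sorted} and \texttt{compress}
are assumptions made for this example.

\begin{lstlisting}
#include <algorithm>
#include <vector>
using namespace std;

// Returns the distinct indices in increasing order.
vector<long long> distinct_sorted(vector<long long> indices) {
    sort(indices.begin(), indices.end());
    indices.erase(unique(indices.begin(), indices.end()),
                  indices.end());
    return indices;
}

// Returns p(x), the compressed index (1,2,3,...) of an
// index x that appears in the sorted list.
int compress(const vector<long long>& sorted, long long x) {
    return lower_bound(sorted.begin(), sorted.end(), x)
           - sorted.begin() + 1;
}
\end{lstlisting}

Because the compressed index is the position in the sorted list,
the order of the indices is preserved.
Each call to \texttt{compress} takes $O(\log n)$ time.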
|
|
|
|
|
2017-01-03 22:11:02 +01:00
|
|
|
\subsubsection{Range update}
|
|
|
|
|
|
|
|
So far, we have implemented data structures
that support range queries and modifications
of single values.
Let us now consider the reverse situation,
where we should update ranges and
retrieve single values.
We focus on an operation that increases all
elements in range $[a,b]$ by $x$.

Surprisingly, we can use the data structures
presented in this chapter also in this situation.
This requires that we change the array so that
each element indicates the \emph{change}
with respect to the previous element.
For example, the array
|
2016-12-28 23:54:51 +01:00
|
|
|
|
|
|
|
\begin{center}
|
|
|
|
\begin{tikzpicture}[scale=0.7]
|
|
|
|
\draw (0,0) grid (8,1);
|
|
|
|
|
|
|
|
\node at (0.5,0.5) {$3$};
|
|
|
|
\node at (1.5,0.5) {$3$};
|
|
|
|
\node at (2.5,0.5) {$1$};
|
|
|
|
\node at (3.5,0.5) {$1$};
|
|
|
|
\node at (4.5,0.5) {$1$};
|
|
|
|
\node at (5.5,0.5) {$5$};
|
|
|
|
\node at (6.5,0.5) {$2$};
|
|
|
|
\node at (7.5,0.5) {$2$};
|
|
|
|
|
|
|
|
|
|
|
|
\footnotesize
|
|
|
|
\node at (0.5,1.4) {$1$};
|
|
|
|
\node at (1.5,1.4) {$2$};
|
|
|
|
\node at (2.5,1.4) {$3$};
|
|
|
|
\node at (3.5,1.4) {$4$};
|
|
|
|
\node at (4.5,1.4) {$5$};
|
|
|
|
\node at (5.5,1.4) {$6$};
|
|
|
|
\node at (6.5,1.4) {$7$};
|
|
|
|
\node at (7.5,1.4) {$8$};
|
|
|
|
\end{tikzpicture}
|
|
|
|
\end{center}
|
2017-01-03 22:11:02 +01:00
|
|
|
becomes as follows:
|
2016-12-28 23:54:51 +01:00
|
|
|
\begin{center}
|
|
|
|
\begin{tikzpicture}[scale=0.7]
|
|
|
|
\draw (0,0) grid (8,1);
|
|
|
|
|
|
|
|
\node at (0.5,0.5) {$3$};
|
|
|
|
\node at (1.5,0.5) {$0$};
|
|
|
|
\node at (2.5,0.5) {$-2$};
|
|
|
|
\node at (3.5,0.5) {$0$};
|
|
|
|
\node at (4.5,0.5) {$0$};
|
|
|
|
\node at (5.5,0.5) {$4$};
|
|
|
|
\node at (6.5,0.5) {$-3$};
|
|
|
|
\node at (7.5,0.5) {$0$};
|
|
|
|
|
|
|
|
|
|
|
|
\footnotesize
|
|
|
|
\node at (0.5,1.4) {$1$};
|
|
|
|
\node at (1.5,1.4) {$2$};
|
|
|
|
\node at (2.5,1.4) {$3$};
|
|
|
|
\node at (3.5,1.4) {$4$};
|
|
|
|
\node at (4.5,1.4) {$5$};
|
|
|
|
\node at (5.5,1.4) {$6$};
|
|
|
|
\node at (6.5,1.4) {$7$};
|
|
|
|
\node at (7.5,1.4) {$8$};
|
|
|
|
\end{tikzpicture}
|
|
|
|
\end{center}
|
|
|
|
|
2017-01-03 22:11:02 +01:00
|
|
|
The original array is the sum array of the new array.
Thus, the value at index $k$ in the original array
equals the sum of the first $k$ elements in the new array.
For example, the value 5 at index 6 in the original array
corresponds to the sum $3+0-2+0+0+4=5$.
|
|
|
|
|
|
|
|
The benefit of using the new array is
that we can update a range by changing just
two elements in the new array.
For example, if we want to
increase the elements in the range $2 \ldots 5$ by 5,
it suffices to increase the element at index 2 by 5
and decrease the element at index 6 by 5.
The result is as follows:
|
2016-12-28 23:54:51 +01:00
|
|
|
|
|
|
|
\begin{center}
|
|
|
|
\begin{tikzpicture}[scale=0.7]
|
|
|
|
\draw (0,0) grid (8,1);
|
|
|
|
|
|
|
|
\node at (0.5,0.5) {$3$};
|
|
|
|
\node at (1.5,0.5) {$5$};
|
|
|
|
\node at (2.5,0.5) {$-2$};
|
|
|
|
\node at (3.5,0.5) {$0$};
|
|
|
|
\node at (4.5,0.5) {$0$};
|
|
|
|
\node at (5.5,0.5) {$-1$};
|
|
|
|
\node at (6.5,0.5) {$-3$};
|
|
|
|
\node at (7.5,0.5) {$0$};
|
|
|
|
|
|
|
|
\footnotesize
|
|
|
|
\node at (0.5,1.4) {$1$};
|
|
|
|
\node at (1.5,1.4) {$2$};
|
|
|
|
\node at (2.5,1.4) {$3$};
|
|
|
|
\node at (3.5,1.4) {$4$};
|
|
|
|
\node at (4.5,1.4) {$5$};
|
|
|
|
\node at (5.5,1.4) {$6$};
|
|
|
|
\node at (6.5,1.4) {$7$};
|
|
|
|
\node at (7.5,1.4) {$8$};
|
|
|
|
\end{tikzpicture}
|
|
|
|
\end{center}
|
|
|
|
|
2017-01-03 22:11:02 +01:00
|
|
|
More generally, to increase the elements in the range
$a \ldots b$ by $x$,
we increase the element at index $a$ by $x$
and decrease the element at index $b+1$ by $x$
(if $b$ is the last index, the second change is not needed).
The required operations are calculating
the sum in a range and updating a value,
so we can use a binary indexed tree or a segment tree.
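
The following sketch combines this idea with a sum segment tree
that stores the changes between consecutive elements.
The names \texttt{range\_add} and \texttt{value}, the fixed size $N=8$
and the use of indices $0 \ldots N-1$ are assumptions made for this example.
The functions \texttt{add} and \texttt{sum} follow the segment tree
functions presented earlier in this chapter.

\begin{lstlisting}
#include <vector>
using namespace std;

const int N = 8;              // size of the array, a power of two
vector<long long> p(2*N, 0);  // sum segment tree of the changes

void add(int k, long long x) {
    k += N;
    p[k] += x;
    for (k /= 2; k >= 1; k /= 2) {
        p[k] = p[2*k]+p[2*k+1];
    }
}

long long sum(int a, int b) {
    a += N; b += N;
    long long s = 0;
    while (a <= b) {
        if (a%2 == 1) s += p[a++];
        if (b%2 == 0) s += p[b--];
        a /= 2; b /= 2;
    }
    return s;
}

// Increases all elements in range [a,b] by x.
void range_add(int a, int b, long long x) {
    add(a, x);
    // no second change is needed if b is the last index
    if (b+1 < N) add(b+1, -x);
}

// Returns the current value of element k.
long long value(int k) {
    return sum(0, k);
}
\end{lstlisting}

Both \texttt{range\_add} and \texttt{value} work in $O(\log n)$ time,
because they only call the segment tree operations a constant number of times.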
|
|
|
|
|
|
|
|
A more difficult problem is to support both
|
|
|
|
range queries and range updates.
|
|
|
|
In Chapter 28 we will see that this is possible
|
|
|
|
as well.