cphb/luku09.tex

1424 lines
40 KiB
TeX

\chapter{Range queries}
\index{range query}
\index{sum query}
\index{minimum query}
\index{maximum query}
In a \key{range query}, a range of an array
is given and we should calculate some value from the
elements in the range. Typical range queries are:
\begin{itemize}
\item \key{sum query}: calculate the sum of elements in range $[a,b]$
\item \key{minimum query}: find the smallest element in range $[a,b]$
\item \key{maximum query}: find the largest element in range $[a,b]$
\end{itemize}
For example, in range $[4,7]$ of the following array,
the sum is $4+6+1+3=14$, the minimum is 1 and the maximum is 6:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (3,0) rectangle (7,1);
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$8$};
\node at (3.5,0.5) {$4$};
\node at (4.5,0.5) {$6$};
\node at (5.5,0.5) {$1$};
\node at (6.5,0.5) {$3$};
\node at (7.5,0.5) {$4$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\end{tikzpicture}
\end{center}
An easy way to answer a range query is
to iterate through all the elements in the range.
For example, we can answer a sum query as follows:
\begin{lstlisting}
int sum(int a, int b) {
int s = 0;
for (int i = a; i <= b; i++) {
s += t[i];
}
return s;
}
\end{lstlisting}
The above function handles a sum query
in $O(n)$ time, which is slow if the array is large
and there are a lot of queries.
In this chapter we will learn how
range queries can be answered much more efficiently.
\section{Static array queries}
We will first focus on a simple case where
the array is \key{static}, i.e.,
the elements never change between the queries.
In this case, it suffices to process the
contents of the array beforehand and construct
a data structure that can be used for answering
any possible range query efficiently.
\subsubsection{Sum query}
\index{prefix sum array}
Sum queries can be answered efficiently
by constructing a \key{sum array}
that contains the sum of the range $[1,k]$
for each $k=1,2,\ldots,n$.
After this, the sum of any range $[a,b]$ of the
original array
can be calculated in $O(1)$ time using the
precalculated sum array.
For example, for the array
\begin{center}
\begin{tikzpicture}[scale=0.7]
%\fill[color=lightgray] (3,0) rectangle (7,1);
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$8$};
\node at (4.5,0.5) {$6$};
\node at (5.5,0.5) {$1$};
\node at (6.5,0.5) {$4$};
\node at (7.5,0.5) {$2$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\end{tikzpicture}
\end{center}
the corresponding sum array is as follows:
\begin{center}
\begin{tikzpicture}[scale=0.7]
%\fill[color=lightgray] (3,0) rectangle (7,1);
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$4$};
\node at (2.5,0.5) {$8$};
\node at (3.5,0.5) {$16$};
\node at (4.5,0.5) {$22$};
\node at (5.5,0.5) {$23$};
\node at (6.5,0.5) {$27$};
\node at (7.5,0.5) {$29$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\end{tikzpicture}
\end{center}
The following code constructs a prefix sum
array \texttt{s} from array \texttt{t} in $O(n)$ time:
\begin{lstlisting}
for (int i = 1; i <= n; i++) {
s[i] = s[i-1]+t[i];
}
\end{lstlisting}
After this, the following function answers
a sum query in $O(1)$ time:
\begin{lstlisting}
int sum(int a, int b) {
return s[b]-s[a-1];
}
\end{lstlisting}
The function calculates the sum of range $[a,b]$
by subtracting the sum of range $[1,a-1]$
from the sum of range $[1,b]$.
Thus, only two values from the sum array
are needed, and the query takes $O(1)$ time.
Note that thanks to the one-based indexing,
the function also works when $a=1$ if $\texttt{s}[0]=0$.
As an example, consider the range $[4,7]$:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (3,0) rectangle (7,1);
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$8$};
\node at (4.5,0.5) {$6$};
\node at (5.5,0.5) {$1$};
\node at (6.5,0.5) {$4$};
\node at (7.5,0.5) {$2$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\end{tikzpicture}
\end{center}
The sum of the range $[4,7]$ is $8+6+1+4=19$.
This can be calculated from the sum array
using the sums $[1,3]$ and $[1,7]$:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (2,0) rectangle (3,1);
\fill[color=lightgray] (6,0) rectangle (7,1);
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$4$};
\node at (2.5,0.5) {$8$};
\node at (3.5,0.5) {$16$};
\node at (4.5,0.5) {$22$};
\node at (5.5,0.5) {$23$};
\node at (6.5,0.5) {$27$};
\node at (7.5,0.5) {$29$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\end{tikzpicture}
\end{center}
Thus, the sum of the range $[4,7]$ is $27-8=19$.
We can also generalize the idea of a sum array
for a two-dimensional array.
In this case, it will be possible to calculate the sum of
any rectangular subarray in $O(1)$ time.
The sum array will contain sums
for all subarrays that begin from the upper-left corner.
\begin{samepage}
The following picture illustrates the idea:
\begin{center}
\begin{tikzpicture}[scale=0.55]
\draw[fill=lightgray] (3,2) rectangle (7,5);
\draw (0,0) grid (10,7);
%\draw[line width=2pt] (3,2) rectangle (7,5);
\node[anchor=center] at (6.5, 2.5) {$A$};
\node[anchor=center] at (2.5, 2.5) {$B$};
\node[anchor=center] at (6.5, 5.5) {$C$};
\node[anchor=center] at (2.5, 5.5) {$D$};
\end{tikzpicture}
\end{center}
\end{samepage}
The sum inside the gray subarray can be calculated
using the formula
\[S(A) - S(B) - S(C) + S(D)\]
where $S(X)$ denotes the sum in a rectangular
subarray from the upper-left corner
to the position of letter $X$.
\subsubsection{Minimum query}
It is also possible to answer a minimum query
in $O(1)$ time after preprocessing, though it is
more difficult than answer a sum query.
Note that minimum and maximum queries can always
be implemented using same techniques,
so it suffices to focus on the minimum query.
The idea is to find the minimum element for each range
of size $2^k$ in the array.
For example, in the array
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$8$};
\node at (4.5,0.5) {$6$};
\node at (5.5,0.5) {$1$};
\node at (6.5,0.5) {$4$};
\node at (7.5,0.5) {$2$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\end{tikzpicture}
\end{center}
the following minima will be calculated:
\begin{center}
\begin{tabular}{ccc}
\begin{tabular}{ccc}
range & size & min \\
\hline
$[1,1]$ & 1 & 1 \\
$[2,2]$ & 1 & 3 \\
$[3,3]$ & 1 & 4 \\
$[4,4]$ & 1 & 8 \\
$[5,5]$ & 1 & 6 \\
$[6,6]$ & 1 & 1 \\
$[7,7]$ & 1 & 4 \\
$[8,8]$ & 1 & 2 \\
\end{tabular}
&
\begin{tabular}{ccc}
range & size & min \\
\hline
$[1,2]$ & 2 & 1 \\
$[2,3]$ & 2 & 3 \\
$[3,4]$ & 2 & 4 \\
$[4,5]$ & 2 & 6 \\
$[5,6]$ & 2 & 1 \\
$[6,7]$ & 2 & 1 \\
$[7,8]$ & 2 & 2 \\
\\
\end{tabular}
&
\begin{tabular}{ccc}
range & size & min \\
\hline
$[1,4]$ & 4 & 1 \\
$[2,5]$ & 4 & 3 \\
$[3,6]$ & 4 & 1 \\
$[4,7]$ & 4 & 1 \\
$[5,8]$ & 4 & 1 \\
$[1,8]$ & 8 & 1 \\
\\
\\
\end{tabular}
\end{tabular}
\end{center}
The number of $2^k$ ranges in an array is $O(n \log n)$
because there are $O(\log n)$ ranges that begin
from each array index.
The minima for all $2^k$ ranges can be calculated
in $O(n \log n)$ time because each $2^k$ range
consists of two $2^{k-1}$ ranges, so the minima
can be calculated recursively.
After this, the minimum of any range $[a,b]$c
can be calculated in $O(1)$ time as a minimum of
two $2^k$ ranges where $k=\lfloor \log_2(b-a+1) \rfloor$.
The first range begins from index $a$,
and the second range ends to index $b$.
The parameter $k$ is so chosen that
two $2^k$ ranges cover the range $[a,b]$ entirely.
As an example, consider the range $[2,7]$:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (1,0) rectangle (7,1);
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$8$};
\node at (4.5,0.5) {$6$};
\node at (5.5,0.5) {$1$};
\node at (6.5,0.5) {$4$};
\node at (7.5,0.5) {$2$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\end{tikzpicture}
\end{center}
The length of the range $[2,7]$ is 6,
and $\lfloor \log_2(6) \rfloor = 2$.
Thus, the minimum can be calculated
from two ranges of length 4.
The ranges are $[2,5]$ and $[4,7]$:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (1,0) rectangle (5,1);
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$8$};
\node at (4.5,0.5) {$6$};
\node at (5.5,0.5) {$1$};
\node at (6.5,0.5) {$4$};
\node at (7.5,0.5) {$2$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\end{tikzpicture}
\end{center}
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (3,0) rectangle (7,1);
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$8$};
\node at (4.5,0.5) {$6$};
\node at (5.5,0.5) {$1$};
\node at (6.5,0.5) {$4$};
\node at (7.5,0.5) {$2$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\end{tikzpicture}
\end{center}
The minimum of the range $[2,5]$ is 3,
and the minimum of the range $[4,7]$ is 1.
Thus, the minimum of the range $[2,7]$ is 1.
\section{Binary indexed tree}
\index{binary indexed tree}
\index{Fenwick tree}
A \key{binary indexed tree} or a \key{Fenwick tree}
is a data structure that resembles a sum array.
The supported operations are answering
a sum query for range $[a,b]$,
and updating the element at index $k$.
The time complexity for both of the operations is $O(\log n)$.
Unlike a sum array, a binary indexed tree
can be efficiently updated between the sum queries.
This would not be possible using a sum array
because we should build the whole sum array again
in $O(n)$ time after each update.
\subsubsection{Structure}
A binary indexed tree can be represented as an array
where index $k$ contains the sum of a range in the
original array that ends to index $k$.
The length of the range is the largest power of two
that divides $k$.
For example, if $k=6$, the length of the range is $2$
because $2$ divides $6$ but $4$ doesn't divide $6$.
\begin{samepage}
For example, for the array
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$8$};
\node at (4.5,0.5) {$6$};
\node at (5.5,0.5) {$1$};
\node at (6.5,0.5) {$4$};
\node at (7.5,0.5) {$2$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\end{tikzpicture}
\end{center}
\end{samepage}
the corresponding binary indexed tree is as follows:
\begin{center}
\begin{tikzpicture}[scale=0.7]
%\fill[color=lightgray] (3,0) rectangle (7,1);
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$4$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$16$};
\node at (4.5,0.5) {$6$};
\node at (5.5,0.5) {$7$};
\node at (6.5,0.5) {$4$};
\node at (7.5,0.5) {$29$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\draw[->,thick] (0.5,-0.9) -- (0.5,-0.1);
\draw[->,thick] (2.5,-0.9) -- (2.5,-0.1);
\draw[->,thick] (4.5,-0.9) -- (4.5,-0.1);
\draw[->,thick] (6.5,-0.9) -- (6.5,-0.1);
\draw[->,thick] (1.5,-1.9) -- (1.5,-0.1);
\draw[->,thick] (5.5,-1.9) -- (5.5,-0.1);
\draw[->,thick] (3.5,-2.9) -- (3.5,-0.1);
\draw[->,thick] (7.5,-3.9) -- (7.5,-0.1);
\draw (0,-1) -- (1,-1) -- (1,-1.5) -- (0,-1.5) -- (0,-1);
\draw (2,-1) -- (3,-1) -- (3,-1.5) -- (2,-1.5) -- (2,-1);
\draw (4,-1) -- (5,-1) -- (5,-1.5) -- (4,-1.5) -- (4,-1);
\draw (6,-1) -- (7,-1) -- (7,-1.5) -- (6,-1.5) -- (6,-1);
\draw (0,-2) -- (2,-2) -- (2,-2.5) -- (0,-2.5) -- (0,-2);
\draw (4,-2) -- (6,-2) -- (6,-2.5) -- (4,-2.5) -- (4,-2);
\draw (0,-3) -- (4,-3) -- (4,-3.5) -- (0,-3.5) -- (0,-3);
\draw (0,-4) -- (8,-4) -- (8,-4.5) -- (0,-4.5) -- (0,-4);
\end{tikzpicture}
\end{center}
For example, the binary indexed tree
contains the value 7 at index 6
because the sum of the elements in the range $[5,6]$
of the original array is $6+1=7$.
\subsubsection{Sum query}
The basic operation in a binary indexed tree is
calculating the sum of a range $[1,k]$ where $k$
is any index in the array.
The sum of any range can be constructed by combining
sums of subranges in the tree.
For example, the range $[1,7]$ will be divided
into three subranges:
\begin{center}
\begin{tikzpicture}[scale=0.7]
%\fill[color=lightgray] (3,0) rectangle (7,1);
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$4$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$16$};
\node at (4.5,0.5) {$6$};
\node at (5.5,0.5) {$7$};
\node at (6.5,0.5) {$4$};
\node at (7.5,0.5) {$29$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\draw[->,thick] (0.5,-0.9) -- (0.5,-0.1);
\draw[->,thick] (2.5,-0.9) -- (2.5,-0.1);
\draw[->,thick] (4.5,-0.9) -- (4.5,-0.1);
\draw[->,thick] (6.5,-0.9) -- (6.5,-0.1);
\draw[->,thick] (1.5,-1.9) -- (1.5,-0.1);
\draw[->,thick] (5.5,-1.9) -- (5.5,-0.1);
\draw[->,thick] (3.5,-2.9) -- (3.5,-0.1);
\draw[->,thick] (7.5,-3.9) -- (7.5,-0.1);
\draw (0,-1) -- (1,-1) -- (1,-1.5) -- (0,-1.5) -- (0,-1);
\draw (2,-1) -- (3,-1) -- (3,-1.5) -- (2,-1.5) -- (2,-1);
\draw (4,-1) -- (5,-1) -- (5,-1.5) -- (4,-1.5) -- (4,-1);
\draw[fill=lightgray] (6,-1) -- (7,-1) -- (7,-1.5) -- (6,-1.5) -- (6,-1);
\draw (0,-2) -- (2,-2) -- (2,-2.5) -- (0,-2.5) -- (0,-2);
\draw[fill=lightgray] (4,-2) -- (6,-2) -- (6,-2.5) -- (4,-2.5) -- (4,-2);
\draw[fill=lightgray] (0,-3) -- (4,-3) -- (4,-3.5) -- (0,-3.5) -- (0,-3);
\draw (0,-4) -- (8,-4) -- (8,-4.5) -- (0,-4.5) -- (0,-4);
\end{tikzpicture}
\end{center}
Thus, the sum of the range $[1,7]$ is $16+7+4=27$.
Because of the structure of the binary indexed tree,
the length of each subrange inside a range is distinct,
so the sum of a range
always consists of sums of $O(\log n)$ subranges.
Using the same technique that we previously used
with a sum array,
we can efficiently calculate the sum of any range
$[a,b]$ by substracting the sum of the range $[1,a-1]$
from the sum of the range $[1,b]$.
The time complexity remains $O(\log n)$
because it suffices to calculate two sums of $[1,k]$ ranges.
\subsubsection{Array update}
When an element in the original array changes,
several sums in the binary indexed tree change.
For example, if the value at index 3 changes,
the sums of the following ranges change:
\begin{center}
\begin{tikzpicture}[scale=0.7]
%\fill[color=lightgray] (3,0) rectangle (7,1);
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$1$};
\node at (1.5,0.5) {$4$};
\node at (2.5,0.5) {$4$};
\node at (3.5,0.5) {$16$};
\node at (4.5,0.5) {$6$};
\node at (5.5,0.5) {$7$};
\node at (6.5,0.5) {$4$};
\node at (7.5,0.5) {$29$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\draw[->,thick] (0.5,-0.9) -- (0.5,-0.1);
\draw[->,thick] (2.5,-0.9) -- (2.5,-0.1);
\draw[->,thick] (4.5,-0.9) -- (4.5,-0.1);
\draw[->,thick] (6.5,-0.9) -- (6.5,-0.1);
\draw[->,thick] (1.5,-1.9) -- (1.5,-0.1);
\draw[->,thick] (5.5,-1.9) -- (5.5,-0.1);
\draw[->,thick] (3.5,-2.9) -- (3.5,-0.1);
\draw[->,thick] (7.5,-3.9) -- (7.5,-0.1);
\draw (0,-1) -- (1,-1) -- (1,-1.5) -- (0,-1.5) -- (0,-1);
\draw[fill=lightgray] (2,-1) -- (3,-1) -- (3,-1.5) -- (2,-1.5) -- (2,-1);
\draw (4,-1) -- (5,-1) -- (5,-1.5) -- (4,-1.5) -- (4,-1);
\draw (6,-1) -- (7,-1) -- (7,-1.5) -- (6,-1.5) -- (6,-1);
\draw (0,-2) -- (2,-2) -- (2,-2.5) -- (0,-2.5) -- (0,-2);
\draw (4,-2) -- (6,-2) -- (6,-2.5) -- (4,-2.5) -- (4,-2);
\draw[fill=lightgray] (0,-3) -- (4,-3) -- (4,-3.5) -- (0,-3.5) -- (0,-3);
\draw[fill=lightgray] (0,-4) -- (8,-4) -- (8,-4.5) -- (0,-4.5) -- (0,-4);
\end{tikzpicture}
\end{center}
Also in this case, the length of each range is distinct,
so $O(\log n)$ ranges will be updated in the binary indexed tree.
\subsubsection{Implementation}
The operations of a binary indexed tree can be implemented
in an elegant and efficient way using bit manipulation.
The bit operation needed is $k \& -k$ that
returns the last bit one from number $k$.
For example, $6 \& -6=2$ because the number $6$
corresponds to 110 and the number $2$ corresponds to 10.
It turns out that when calculating a range sum,
the index $k$ in the binary indexed tree should be
decreased by $k \& -k$ at every step.
Correspondingly, when updating the array,
the index $k$ should be increased by $k \& -k$ at every step.
The following functions assume that the binary indexed tree
is stored to array \texttt{b} and it consists of indices $1 \ldots n$.
The function \texttt{sum} calculates the sum of the range $[1,k]$:
\begin{lstlisting}
int sum(int k) {
int s = 0;
while (k >= 1) {
s += b[k];
k -= k&-k;
}
return s;
}
\end{lstlisting}
The function \texttt{add} increases the value of element $k$ by $x$:
\begin{lstlisting}
void add(int k, int x) {
while (k <= n) {
b[k] += x;
k += k&-k;
}
}
\end{lstlisting}
The time complexity of both above functions is
$O(\log n)$ because the functions change $O(\log n)$
values in the binary indexed tree and each move
to the next index
takes $O(1)$ time using the bit operation.
\section{Segmenttipuu}
\index{segmenttipuu@segmenttipuu}
\key{Segmenttipuu} on tietorakenne,
jonka operaatiot ovat taulukon välin $[a,b]$ välikysely
sekä kohdan $k$ arvon päivitys.
Segmenttipuun avulla voi toteuttaa summakyselyn,
minimikyselyn ja monia muitakin kyselyitä niin,
että kummankin operaation aikavaativuus on $O(\log n)$.
Segmenttipuun etuna binääri-indeksipuuhun verrattuna on,
että se on yleisempi tietorakenne.
Binääri-indeksipuulla voi toteuttaa vain summakyselyn,
mutta segmenttipuu sallii muitakin kyselyitä.
Toisaalta segmenttipuu vie enemmän muistia ja
on hieman vaikeampi toteuttaa kuin binääri-indeksipuu.
\subsubsection{Rakenne}
Segmenttipuussa on $2n-1$ solmua niin,
että alimmalla tasolla on $n$ solmua,
jotka kuvaavat taulukon sisällön,
ja ylemmillä tasoilla on välikyselyihin
tarvittavaa tietoa.
Segmenttipuun sisältö riippuu siitä,
mikä välikysely puun tulee toteuttaa.
Oletamme aluksi, että välikysely on tuttu summakysely.
Esimerkiksi taulukkoa
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$5$};
\node at (1.5,0.5) {$8$};
\node at (2.5,0.5) {$6$};
\node at (3.5,0.5) {$3$};
\node at (4.5,0.5) {$2$};
\node at (5.5,0.5) {$7$};
\node at (6.5,0.5) {$2$};
\node at (7.5,0.5) {$6$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\end{tikzpicture}
\end{center}
vastaa seuraava segmenttipuu:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node[anchor=center] at (0.5, 0.5) {5};
\node[anchor=center] at (1.5, 0.5) {8};
\node[anchor=center] at (2.5, 0.5) {6};
\node[anchor=center] at (3.5, 0.5) {3};
\node[anchor=center] at (4.5, 0.5) {2};
\node[anchor=center] at (5.5, 0.5) {7};
\node[anchor=center] at (6.5, 0.5) {2};
\node[anchor=center] at (7.5, 0.5) {6};
\node[draw, circle] (a) at (1,2.5) {13};
\path[draw,thick,-] (a) -- (0.5,1);
\path[draw,thick,-] (a) -- (1.5,1);
\node[draw, circle,minimum size=22pt] (b) at (3,2.5) {9};
\path[draw,thick,-] (b) -- (2.5,1);
\path[draw,thick,-] (b) -- (3.5,1);
\node[draw, circle,minimum size=22pt] (c) at (5,2.5) {9};
\path[draw,thick,-] (c) -- (4.5,1);
\path[draw,thick,-] (c) -- (5.5,1);
\node[draw, circle,minimum size=22pt] (d) at (7,2.5) {8};
\path[draw,thick,-] (d) -- (6.5,1);
\path[draw,thick,-] (d) -- (7.5,1);
\node[draw, circle] (i) at (2,4.5) {22};
\path[draw,thick,-] (i) -- (a);
\path[draw,thick,-] (i) -- (b);
\node[draw, circle] (j) at (6,4.5) {17};
\path[draw,thick,-] (j) -- (c);
\path[draw,thick,-] (j) -- (d);
\node[draw, circle] (m) at (4,6.5) {39};
\path[draw,thick,-] (m) -- (i);
\path[draw,thick,-] (m) -- (j);
\end{tikzpicture}
\end{center}
Jokaisessa segmenttipuun solmussa on tietoa
$2^k$-kokoisesta välistä taulukossa.
Tässä tapauksessa solmussa oleva arvo kertoo,
mikä on taulukon lukujen summa solmua vastaavalla
välillä.
Kunkin solmun arvo saadaan laskemalla yhteen
solmun alapuolella vasemmalla ja oikealla
olevien solmujen arvot.
Segmenttipuu on mukavinta rakentaa niin,
että taulukon koko on 2:n potenssi,
jolloin tuloksena on täydellinen binääripuu.
Jatkossa oletamme aina,
että taulukko täyttää tämän vaatimuksen.
Jos taulukon koko ei ole 2:n potenssi,
sen loppuun voi lisätä tyhjää niin,
että koosta tulee 2:n potenssi.
\subsubsection{Välikysely}
Segmenttipuussa vastaus välikyselyyn lasketaan
väliin kuuluvista solmuista,
jotka ovat mahdollisimman korkealla puussa.
Jokainen solmu antaa vastauksen väliin kuuluvalle osavälille,
ja vastaus kyselyyn selviää yhdistämällä
segmenttipuusta saadut osavälejä koskeva tiedot.
Tarkastellaan esimerkiksi seuraavaa taulukon väliä:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=gray!50] (2,0) rectangle (8,1);
\draw (0,0) grid (8,1);
\node[anchor=center] at (0.5, 0.5) {5};
\node[anchor=center] at (1.5, 0.5) {8};
\node[anchor=center] at (2.5, 0.5) {6};
\node[anchor=center] at (3.5, 0.5) {3};
\node[anchor=center] at (4.5, 0.5) {2};
\node[anchor=center] at (5.5, 0.5) {7};
\node[anchor=center] at (6.5, 0.5) {2};
\node[anchor=center] at (7.5, 0.5) {6};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\end{tikzpicture}
\end{center}
Lukujen summa välillä $[3,8]$ on $6+3+2+7+2+6=26$.
Segmenttipuusta summa saadaan laskettua seuraavien
osasummien avulla:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node[anchor=center] at (0.5, 0.5) {5};
\node[anchor=center] at (1.5, 0.5) {8};
\node[anchor=center] at (2.5, 0.5) {6};
\node[anchor=center] at (3.5, 0.5) {3};
\node[anchor=center] at (4.5, 0.5) {2};
\node[anchor=center] at (5.5, 0.5) {7};
\node[anchor=center] at (6.5, 0.5) {2};
\node[anchor=center] at (7.5, 0.5) {6};
\node[draw, circle] (a) at (1,2.5) {13};
\path[draw,thick,-] (a) -- (0.5,1);
\path[draw,thick,-] (a) -- (1.5,1);
\node[draw, circle,fill=gray!50,minimum size=22pt] (b) at (3,2.5) {9};
\path[draw,thick,-] (b) -- (2.5,1);
\path[draw,thick,-] (b) -- (3.5,1);
\node[draw, circle,minimum size=22pt] (c) at (5,2.5) {9};
\path[draw,thick,-] (c) -- (4.5,1);
\path[draw,thick,-] (c) -- (5.5,1);
\node[draw, circle,minimum size=22pt] (d) at (7,2.5) {8};
\path[draw,thick,-] (d) -- (6.5,1);
\path[draw,thick,-] (d) -- (7.5,1);
\node[draw, circle] (i) at (2,4.5) {22};
\path[draw,thick,-] (i) -- (a);
\path[draw,thick,-] (i) -- (b);
\node[draw, circle,fill=gray!50] (j) at (6,4.5) {17};
\path[draw,thick,-] (j) -- (c);
\path[draw,thick,-] (j) -- (d);
\node[draw, circle] (m) at (4,6.5) {39};
\path[draw,thick,-] (m) -- (i);
\path[draw,thick,-] (m) -- (j);
\end{tikzpicture}
\end{center}
Taulukon välin summaksi tulee osasummista $9+17=26$.
Kun vastaus välikyselyyn lasketaan mahdollisimman
korkealla segmenttipuussa olevista solmuista,
väliin kuuluu enintään kaksi solmua
jokaiselta segmenttipuun tasolta.
Tämän ansiosta välikyselyssä
tarvittavien solmujen yhteismäärä on vain $O(\log n)$.
\subsubsection{Taulukon päivitys}
Kun taulukossa oleva arvo muuttuu,
segmenttipuussa täytyy päivittää
kaikkia solmuja, joiden arvo
riippuu muutetusta taulukon kohdasta.
Tämä tapahtuu kulkemalla puuta ylöspäin huipulle
asti ja tekemällä muutokset.
\begin{samepage}
Seuraava kuva näyttää, mitkä solmut segmenttipuussa muuttuvat,
jos taulukon luku 7 muuttuu.
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=gray!50] (5,0) rectangle (6,1);
\draw (0,0) grid (8,1);
\node[anchor=center] at (0.5, 0.5) {5};
\node[anchor=center] at (1.5, 0.5) {8};
\node[anchor=center] at (2.5, 0.5) {6};
\node[anchor=center] at (3.5, 0.5) {3};
\node[anchor=center] at (4.5, 0.5) {2};
\node[anchor=center] at (5.5, 0.5) {7};
\node[anchor=center] at (6.5, 0.5) {2};
\node[anchor=center] at (7.5, 0.5) {6};
\node[draw, circle] (a) at (1,2.5) {13};
\path[draw,thick,-] (a) -- (0.5,1);
\path[draw,thick,-] (a) -- (1.5,1);
\node[draw, circle,minimum size=22pt] (b) at (3,2.5) {9};
\path[draw,thick,-] (b) -- (2.5,1);
\path[draw,thick,-] (b) -- (3.5,1);
\node[draw, circle,minimum size=22pt,fill=gray!50] (c) at (5,2.5) {9};
\path[draw,thick,-] (c) -- (4.5,1);
\path[draw,thick,-] (c) -- (5.5,1);
\node[draw, circle,minimum size=22pt] (d) at (7,2.5) {8};
\path[draw,thick,-] (d) -- (6.5,1);
\path[draw,thick,-] (d) -- (7.5,1);
\node[draw, circle] (i) at (2,4.5) {22};
\path[draw,thick,-] (i) -- (a);
\path[draw,thick,-] (i) -- (b);
\node[draw, circle,fill=gray!50] (j) at (6,4.5) {17};
\path[draw,thick,-] (j) -- (c);
\path[draw,thick,-] (j) -- (d);
\node[draw, circle,fill=gray!50] (m) at (4,6.5) {39};
\path[draw,thick,-] (m) -- (i);
\path[draw,thick,-] (m) -- (j);
\end{tikzpicture}
\end{center}
\end{samepage}
Polku segmenttipuun pohjalta huipulle muodostuu aina $O(\log n)$ solmusta,
joten taulukon arvon muuttuminen vaikuttaa $O(\log n)$ solmuun puussa.
\subsubsection{Puun tallennus}
Segmenttipuun voi tallentaa muistiin
$2N$ alkion taulukkona,
jossa $N$ on riittävän suuri 2:n potenssi.
Tällaisen segmenttipuun avulla voi ylläpitää
taulukkoa, jonka indeksialue on $[0,N-1]$.
Segmenttipuun taulukon
kohdassa 1 on puun ylimmän solmun arvo,
kohdat 2 ja 3 sisältävät seuraavan tason
solmujen arvot, jne.
Segmenttipuun alin taso eli varsinainen
taulukon sisältä tallennetaan
kohdasta $N$ alkaen.
Niinpä taulukon kohdassa $k$ oleva alkio
on segmenttipuun taulukossa kohdassa $k+N$.
Esimerkiksi segmenttipuun
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node[anchor=center] at (0.5, 0.5) {5};
\node[anchor=center] at (1.5, 0.5) {8};
\node[anchor=center] at (2.5, 0.5) {6};
\node[anchor=center] at (3.5, 0.5) {3};
\node[anchor=center] at (4.5, 0.5) {2};
\node[anchor=center] at (5.5, 0.5) {7};
\node[anchor=center] at (6.5, 0.5) {2};
\node[anchor=center] at (7.5, 0.5) {6};
\node[draw, circle] (a) at (1,2.5) {13};
\path[draw,thick,-] (a) -- (0.5,1);
\path[draw,thick,-] (a) -- (1.5,1);
\node[draw, circle,minimum size=22pt] (b) at (3,2.5) {9};
\path[draw,thick,-] (b) -- (2.5,1);
\path[draw,thick,-] (b) -- (3.5,1);
\node[draw, circle,minimum size=22pt] (c) at (5,2.5) {9};
\path[draw,thick,-] (c) -- (4.5,1);
\path[draw,thick,-] (c) -- (5.5,1);
\node[draw, circle,minimum size=22pt] (d) at (7,2.5) {8};
\path[draw,thick,-] (d) -- (6.5,1);
\path[draw,thick,-] (d) -- (7.5,1);
\node[draw, circle] (i) at (2,4.5) {22};
\path[draw,thick,-] (i) -- (a);
\path[draw,thick,-] (i) -- (b);
\node[draw, circle] (j) at (6,4.5) {17};
\path[draw,thick,-] (j) -- (c);
\path[draw,thick,-] (j) -- (d);
\node[draw, circle] (m) at (4,6.5) {39};
\path[draw,thick,-] (m) -- (i);
\path[draw,thick,-] (m) -- (j);
\end{tikzpicture}
\end{center}
voi tallentaa taulukkoon seuraavasti ($N=8$):
\begin{center}
\begin{tikzpicture}[scale=0.7]
%\fill[color=lightgray] (3,0) rectangle (7,1);
\draw (0,0) grid (15,1);
\node at (0.5,0.5) {$39$};
\node at (1.5,0.5) {$22$};
\node at (2.5,0.5) {$17$};
\node at (3.5,0.5) {$13$};
\node at (4.5,0.5) {$9$};
\node at (5.5,0.5) {$9$};
\node at (6.5,0.5) {$8$};
\node at (7.5,0.5) {$5$};
\node at (8.5,0.5) {$8$};
\node at (9.5,0.5) {$6$};
\node at (10.5,0.5) {$3$};
\node at (11.5,0.5) {$2$};
\node at (12.5,0.5) {$7$};
\node at (13.5,0.5) {$2$};
\node at (14.5,0.5) {$6$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\node at (8.5,1.4) {$9$};
\node at (9.5,1.4) {$10$};
\node at (10.5,1.4) {$11$};
\node at (11.5,1.4) {$12$};
\node at (12.5,1.4) {$13$};
\node at (13.5,1.4) {$14$};
\node at (14.5,1.4) {$15$};
\end{tikzpicture}
\end{center}
Tätä tallennustapaa käyttäen kohdassa $k$
olevalle solmulle pätee, että
\begin{itemize}
\item ylempi solmu on kohdassa $\lfloor k/2 \rfloor$,
\item vasen alempi solmu on kohdassa $2k$ ja
\item oikea alempi solmu on kohdassa $2k+1$.
\end{itemize}
Huomaa, että tämän seurauksena solmun kohta on parillinen,
jos se on vasemmalla ylemmästä solmusta katsoen,
ja pariton, jos se on oikealla.
\subsubsection{Toteutus}
Tarkastellaan seuraavaksi välikyselyn ja päivityksen
toteutusta segmenttipuuhun.
Seuraavat funktiot olettavat, että segmenttipuu
on tallennettu $2n-1$-kokoi\-seen taulukkoon $\texttt{p}$
edellä kuvatulla tavalla.
Funktio \texttt{summa} laskee summan
välillä $a \ldots b$:
\begin{lstlisting}
int summa(int a, int b) {
a += N; b += N;
int s = 0;
while (a <= b) {
if (a%2 == 1) s += p[a++];
if (b%2 == 0) s += p[b--];
a /= 2; b /= 2;
}
return s;
}
\end{lstlisting}
Funktio aloittaa summan laskeminen segmenttipuun
pohjalta ja liikkuu askel kerrallaan ylemmille tasoille.
Funktio laskee välin summan muuttujaan $s$
yhdistämällä puussa olevia osasummia.
Välin reunalla oleva osasumma lisätään summaan
aina silloin, kun se ei kuulu ylemmän tason osasummaan.
Funktio \texttt{lisaa} kasvattaa kohdan $k$ arvoa $x$:llä:
\begin{lstlisting}
void lisaa(int k, int x) {
k += N;
p[k] += x;
for (k /= 2; k >= 1; k /= 2) {
p[k] = p[2*k]+p[2*k+1];
}
}
\end{lstlisting}
Ensin funktio tekee muutoksen puun alimmalle
tasolle taulukkoon.
Tämän jälkeen se päivittää kaikki osasummat
puun huipulle asti.
Taulukon \texttt{p} indeksoinnin ansiosta
kohdasta $k$ alemmalla tasolla
ovat kohdat $2k$ ja $2k+1$.
Molemmat segmenttipuun operaatiot toimivat ajassa
$O(\log n)$, koska $n$ lukua sisältävässä
segmenttipuussa on $O(\log n)$ tasoa
ja operaatiot siirtyvät askel kerrallaan
segmenttipuun tasoja ylöspäin.
\subsubsection{Muut kyselyt}
Segmenttipuu mahdollistaa summan lisäksi minkä
tahansa välikyselyn,
jossa vierekkäisten välien $[a,b]$ ja $[b+1,c]$
tuloksista pystyy laskemaan tehokkaasti
välin $[a,c]$ tuloksen.
Tällaisia kyselyitä
ovat esimerkiksi minimi ja maksimi,
suurin yhteinen tekijä
sekä bittioperaatiot and, or ja xor.
\begin{samepage}
Esimerkiksi seuraavan segmenttipuun avulla voi laskea
taulukon välien minimejä:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node[anchor=center] at (0.5, 0.5) {5};
\node[anchor=center] at (1.5, 0.5) {8};
\node[anchor=center] at (2.5, 0.5) {6};
\node[anchor=center] at (3.5, 0.5) {3};
\node[anchor=center] at (4.5, 0.5) {1};
\node[anchor=center] at (5.5, 0.5) {7};
\node[anchor=center] at (6.5, 0.5) {2};
\node[anchor=center] at (7.5, 0.5) {6};
\node[draw, circle,minimum size=22pt] (a) at (1,2.5) {5};
\path[draw,thick,-] (a) -- (0.5,1);
\path[draw,thick,-] (a) -- (1.5,1);
\node[draw, circle,minimum size=22pt] (b) at (3,2.5) {3};
\path[draw,thick,-] (b) -- (2.5,1);
\path[draw,thick,-] (b) -- (3.5,1);
\node[draw, circle,minimum size=22pt] (c) at (5,2.5) {1};
\path[draw,thick,-] (c) -- (4.5,1);
\path[draw,thick,-] (c) -- (5.5,1);
\node[draw, circle,minimum size=22pt] (d) at (7,2.5) {2};
\path[draw,thick,-] (d) -- (6.5,1);
\path[draw,thick,-] (d) -- (7.5,1);
\node[draw, circle,minimum size=22pt] (i) at (2,4.5) {3};
\path[draw,thick,-] (i) -- (a);
\path[draw,thick,-] (i) -- (b);
\node[draw, circle,minimum size=22pt] (j) at (6,4.5) {1};
\path[draw,thick,-] (j) -- (c);
\path[draw,thick,-] (j) -- (d);
\node[draw, circle,minimum size=22pt] (m) at (4,6.5) {1};
\path[draw,thick,-] (m) -- (i);
\path[draw,thick,-] (m) -- (j);
\end{tikzpicture}
\end{center}
\end{samepage}
Tässä segmenttipuussa jokainen puun solmu kertoo,
mikä on pienin luku sen alapuolella olevassa
taulukon osassa.
Segmenttipuun ylin luku on pienin luku
koko taulukon alueella.
Puun toteutus on samanlainen kuin summan laskemisessa,
mutta joka kohdassa pitää laskea summan sijasta
lukujen minimi.
\subsubsection{Binäärihaku puussa}
Segmenttipuun sisältämää tietoa voi käyttää
binäärihaun kaltaisesti aloittamalla
haun puun huipulta.
Näin on mahdollista selvittää esimerkiksi
minimisegmenttipuusta $O(\log n)$-ajassa,
missä kohdassa on taulukon pienin luku.
Esimerkiksi seuraavassa puussa pienin
alkio on 1, jonka sijainti löytyy
kulkemalla puussa huipulta alaspäin:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (8,0) grid (16,1);
\node[anchor=center] at (8.5, 0.5) {9};
\node[anchor=center] at (9.5, 0.5) {5};
\node[anchor=center] at (10.5, 0.5) {7};
\node[anchor=center] at (11.5, 0.5) {1};
\node[anchor=center] at (12.5, 0.5) {6};
\node[anchor=center] at (13.5, 0.5) {2};
\node[anchor=center] at (14.5, 0.5) {3};
\node[anchor=center] at (15.5, 0.5) {2};
%\node[anchor=center] at (1,2.5) {13};
\node[draw, circle,minimum size=22pt] (e) at (9,2.5) {5};
\path[draw,thick,-] (e) -- (8.5,1);
\path[draw,thick,-] (e) -- (9.5,1);
\node[draw, circle,minimum size=22pt] (f) at (11,2.5) {1};
\path[draw,thick,-] (f) -- (10.5,1);
\path[draw,thick,-] (f) -- (11.5,1);
\node[draw, circle,minimum size=22pt] (g) at (13,2.5) {2};
\path[draw,thick,-] (g) -- (12.5,1);
\path[draw,thick,-] (g) -- (13.5,1);
\node[draw, circle,minimum size=22pt] (h) at (15,2.5) {2};
\path[draw,thick,-] (h) -- (14.5,1);
\path[draw,thick,-] (h) -- (15.5,1);
\node[draw, circle,minimum size=22pt] (k) at (10,4.5) {1};
\path[draw,thick,-] (k) -- (e);
\path[draw,thick,-] (k) -- (f);
\node[draw, circle,minimum size=22pt] (l) at (14,4.5) {2};
\path[draw,thick,-] (l) -- (g);
\path[draw,thick,-] (l) -- (h);
\node[draw, circle,minimum size=22pt] (n) at (12,6.5) {1};
\path[draw,thick,-] (n) -- (k);
\path[draw,thick,-] (n) -- (l);
\path[draw=red,thick,->,line width=2pt] (n) -- (k);
\path[draw=red,thick,->,line width=2pt] (k) -- (f);
\path[draw=red,thick,->,line width=2pt] (f) -- (11.5,1);
\end{tikzpicture}
\end{center}
\section{Lisätekniikoita}
\subsubsection{Indeksien pakkaus}
Taulukon päälle rakennettujen tietorakenteiden
rajoituksena on, että alkiot on indeksoitu
kokonaisluvuin $1,2,3,$ jne.
Tästä seuraa ongelmia,
jos tarvittavat indeksit ovat suuria.
Esimerkiksi indeksin $10^9$ käyttäminen
vaatisi, että taulukossa olisi $10^9$ alkiota,
mikä ei ole realistista.
\index{indeksien pakkaus@indeksien pakkaus}
Tätä rajoitusta on kuitenkin mahdollista
kiertää usein käyttämällä \key{indeksien pakkausta},
jolloin indeksit jaetaan
uudestaan niin, että ne ovat
kokonaisluvut $1,2,3,$ jne.
Tämä on mahdollista silloin, kun kaikki
algoritmin aikana tarvittavat indeksit
ovat tiedossa algoritmin alussa.
Ideana on korvata jokainen alkuperäinen
indeksi $x$ indeksillä $p(x)$,
missä $p$ jakaa indeksit uudestaan.
Vaatimuksena on, että indeksien järjestys
ei muutu, eli jos $a<b$, niin $p(a)<p(b)$,
minkä ansiosta kyselyitä voi tehdä
melko tavallisesti indeksien pakkauksesta huolimatta.
Esimerkiksi jos alkuperäiset indeksit ovat
$555$, $10^9$ ja $8$, ne muuttuvat näin:
\[
\begin{array}{lcl}
p(8) & = & 1 \\
p(555) & = & 2 \\
p(10^9) & = & 3 \\
\end{array}
\]
\subsubsection{Välin muuttaminen}
Tähän asti olemme toteuttaneet tietorakenteita,
joissa voi tehdä tehokkaasti välikyselyitä
ja muuttaa yksittäisiä taulukon arvoja.
Tarkastellaan lopuksi käänteistä tilannetta,
jossa pitääkin muuttaa välejä ja
kysellä yksittäisiä arvoja.
Keskitymme operaatioon,
joka kasvattaa kaikkia välin $[a,b]$ arvoja $x$:llä.
Yllättävää kyllä,
voimme käyttää tämän luvun tietorakenteita myös tässä tilanteessa.
Tämä vaatii, että muutamme taulukkoa niin,
että jokainen taulukon arvo kertoo \textit{muutoksen}
edelliseen arvoon nähden.
Esimerkiksi taulukosta
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$3$};
\node at (1.5,0.5) {$3$};
\node at (2.5,0.5) {$1$};
\node at (3.5,0.5) {$1$};
\node at (4.5,0.5) {$1$};
\node at (5.5,0.5) {$5$};
\node at (6.5,0.5) {$2$};
\node at (7.5,0.5) {$2$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\end{tikzpicture}
\end{center}
tulee seuraava:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$3$};
\node at (1.5,0.5) {$0$};
\node at (2.5,0.5) {$-2$};
\node at (3.5,0.5) {$0$};
\node at (4.5,0.5) {$0$};
\node at (5.5,0.5) {$4$};
\node at (6.5,0.5) {$-3$};
\node at (7.5,0.5) {$0$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\end{tikzpicture}
\end{center}
Minkä tahansa vanhan arvon saa uudesta taulukosta
laskemalla summan taulukon alusta kyseiseen kohtaan asti.
Esimerkiksi kohdan 6 vanha arvo 5 saadaan
summana $3-2+4=5$.
Uuden tallennustavan etuna on,
että välin muuttamiseen riittää muuttaa
kahta taulukon kohtaa.
Esimerkiksi jos välille $2 \ldots 5$
lisätään luku 5,
taulukon kohtaan 2 lisätään 5
ja taulukon kohdasta 6 poistetaan 5.
Tulos on tässä:
\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (8,1);
\node at (0.5,0.5) {$3$};
\node at (1.5,0.5) {$5$};
\node at (2.5,0.5) {$-2$};
\node at (3.5,0.5) {$0$};
\node at (4.5,0.5) {$0$};
\node at (5.5,0.5) {$-1$};
\node at (6.5,0.5) {$-3$};
\node at (7.5,0.5) {$0$};
\footnotesize
\node at (0.5,1.4) {$1$};
\node at (1.5,1.4) {$2$};
\node at (2.5,1.4) {$3$};
\node at (3.5,1.4) {$4$};
\node at (4.5,1.4) {$5$};
\node at (5.5,1.4) {$6$};
\node at (6.5,1.4) {$7$};
\node at (7.5,1.4) {$8$};
\end{tikzpicture}
\end{center}
Yleisemmin kun taulukon välille $a \ldots b$
lisätään $x$, taulukon kohtaan $a$
lisätään $x$ ja taulukon kohdasta $b+1$
vähennetään $x$.
Tarvittavat operaatiot
ovat summan laskeminen
taulukon alusta tiettyyn kohtaan
sekä yksittäisen alkion muuttaminen,
joten voimme käyttää tuttuja menetelmiä tässäkin tilanteessa.
Hankalampi tilanne on, jos samaan aikaan pitää pystyä
sekä kysymään tietoa väleiltä että muuttamaan välejä.
Myöhemmin luvussa 28 tulemme näkemään,
että tämäkin on mahdollista.