\chapter{Square root algorithms}

\index{square root algorithm}

A \key{square root algorithm} is an algorithm
that has a square root in its time complexity.
A square root can be seen as a ``poor man's logarithm'':
the complexity $O(\sqrt n)$ is better than $O(n)$
but worse than $O(\log n)$.
Still, many square root algorithms are fast in practice
and have small constant factors.

As an example, let's consider the problem of
handling sum queries in an array.
The required operations are:
\begin{itemize}
\item change the value at index $x$
\item calculate the sum in the range $[a,b]$
\end{itemize}

We have previously solved the problem using
a binary indexed tree and a segment tree,
which support both operations in $O(\log n)$ time.
However, now we will solve the problem
in another way using a square root structure,
so that we can calculate sums in $O(\sqrt n)$ time
and modify values in $O(1)$ time.

The idea is to divide the array into blocks
of size $\sqrt n$ so that each block contains
the sum of the elements inside the block.
The following example shows an array and the
corresponding blocks:

\begin{center}
\begin{tikzpicture}[scale=0.7]
\draw (0,0) grid (16,1);

\draw (0,1) rectangle (4,2);
\draw (4,1) rectangle (8,2);
\draw (8,1) rectangle (12,2);
\draw (12,1) rectangle (16,2);

\node at (0.5, 0.5) {5};
\node at (1.5, 0.5) {8};
\node at (2.5, 0.5) {6};
\node at (3.5, 0.5) {3};
\node at (4.5, 0.5) {2};
\node at (5.5, 0.5) {7};
\node at (6.5, 0.5) {2};
\node at (7.5, 0.5) {6};
\node at (8.5, 0.5) {7};
\node at (9.5, 0.5) {1};
\node at (10.5, 0.5) {7};
\node at (11.5, 0.5) {5};
\node at (12.5, 0.5) {6};
\node at (13.5, 0.5) {2};
\node at (14.5, 0.5) {3};
\node at (15.5, 0.5) {2};

\node at (2, 1.5) {22};
\node at (6, 1.5) {17};
\node at (10, 1.5) {20};
\node at (14, 1.5) {13};
\end{tikzpicture}
\end{center}

When a value in the array changes,
we have to calculate the sum of the corresponding
block again:

\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (5,0) rectangle (6,1);
\draw (0,0) grid (16,1);

\fill[color=lightgray] (4,1) rectangle (8,2);
\draw (0,1) rectangle (4,2);
\draw (4,1) rectangle (8,2);
\draw (8,1) rectangle (12,2);
\draw (12,1) rectangle (16,2);

\node at (0.5, 0.5) {5};
\node at (1.5, 0.5) {8};
\node at (2.5, 0.5) {6};
\node at (3.5, 0.5) {3};
\node at (4.5, 0.5) {2};
\node at (5.5, 0.5) {5};
\node at (6.5, 0.5) {2};
\node at (7.5, 0.5) {6};
\node at (8.5, 0.5) {7};
\node at (9.5, 0.5) {1};
\node at (10.5, 0.5) {7};
\node at (11.5, 0.5) {5};
\node at (12.5, 0.5) {6};
\node at (13.5, 0.5) {2};
\node at (14.5, 0.5) {3};
\node at (15.5, 0.5) {2};

\node at (2, 1.5) {22};
\node at (6, 1.5) {15};
\node at (10, 1.5) {20};
\node at (14, 1.5) {13};
\end{tikzpicture}
\end{center}

Any sum in the array can be calculated as a combination
of single values in the array and the sums of the
blocks between them:

\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (3,0) rectangle (4,1);
\fill[color=lightgray] (12,0) rectangle (13,1);
\fill[color=lightgray] (13,0) rectangle (14,1);
\draw (0,0) grid (16,1);

\fill[color=lightgray] (4,1) rectangle (8,2);
\fill[color=lightgray] (8,1) rectangle (12,2);
\draw (0,1) rectangle (4,2);
\draw (4,1) rectangle (8,2);
\draw (8,1) rectangle (12,2);
\draw (12,1) rectangle (16,2);

\node at (0.5, 0.5) {5};
\node at (1.5, 0.5) {8};
\node at (2.5, 0.5) {6};
\node at (3.5, 0.5) {3};
\node at (4.5, 0.5) {2};
\node at (5.5, 0.5) {5};
\node at (6.5, 0.5) {2};
\node at (7.5, 0.5) {6};
\node at (8.5, 0.5) {7};
\node at (9.5, 0.5) {1};
\node at (10.5, 0.5) {7};
\node at (11.5, 0.5) {5};
\node at (12.5, 0.5) {6};
\node at (13.5, 0.5) {2};
\node at (14.5, 0.5) {3};
\node at (15.5, 0.5) {2};

\node at (2, 1.5) {22};
\node at (6, 1.5) {15};
\node at (10, 1.5) {20};
\node at (14, 1.5) {13};

\draw [decoration={brace}, decorate, line width=0.5mm] (14,-0.25) -- (3,-0.25);
\end{tikzpicture}
\end{center}

We can change a value in $O(1)$ time,
because we only have to change the sum of a single block.
A sum in a range consists of three parts:
\begin{itemize}
\item first, there are $O(\sqrt n)$ single values
\item then, there are $O(\sqrt n)$ consecutive blocks
\item finally, there are $O(\sqrt n)$ single values
\end{itemize}

Calculating each of these parts takes $O(\sqrt n)$ time,
so the total complexity for calculating the sum
of values in any range is $O(\sqrt n)$.
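As a concrete sketch, the structure could be implemented as follows. The class and method names are ours, not fixed by the problem; we assume 0-based indexing and inclusive ranges $[a,b]$.

```cpp
#include <bits/stdc++.h>
using namespace std;

// Square root structure for sum queries: O(1) update, O(sqrt n) range sum.
struct SqrtSum {
    int k;                    // block size, about sqrt(n)
    vector<long long> a, b;   // array values and block sums
    SqrtSum(const vector<long long>& v) : a(v) {
        k = max(1, (int)sqrt((double)a.size()));
        b.assign((a.size() + k - 1) / k, 0);
        for (int i = 0; i < (int)a.size(); i++) b[i / k] += a[i];
    }
    // O(1): adjust one value and the sum of its block
    void update(int x, long long value) {
        b[x / k] += value - a[x];
        a[x] = value;
    }
    // O(sqrt n): single values at the borders, whole blocks in the middle
    long long query(int l, int r) {
        long long s = 0;
        while (l <= r && l % k != 0) s += a[l++];   // left partial block
        while (l + k - 1 <= r) { s += b[l / k]; l += k; }  // full blocks
        while (l <= r) s += a[l++];                 // right partial block
        return s;
    }
};
```

For the example array above, \texttt{query(3, 13)} combines the single values at the borders with the sums of the two middle blocks, exactly as in the figure.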

The reason why we use the parameter $\sqrt n$ is that
it balances two things:
for example, an array of $n$ elements is divided
into $\sqrt n$ blocks, each of which contains
$\sqrt n$ elements.
In practice, it is not necessary to use exactly
the parameter $\sqrt n$ in algorithms; instead, it may be better to
use parameters $k$ and $n/k$ where $k$ is
larger or smaller than $\sqrt n$.

The best parameter depends on the problem
and input.
For example, if an algorithm often goes through
the blocks but rarely iterates through elements inside
the blocks, it may be good to divide the array into
$k < \sqrt n$ blocks, each of which contains $n/k > \sqrt n$
elements.
\section{Batch processing}

\index{batch processing}

In \key{batch processing}, the operations of an
algorithm are divided into batches,
and each batch is processed separately.
Between the batches, some precalculation is done
to process the future operations more efficiently.

In a square root algorithm, $n$ operations are
divided into batches of size $O(\sqrt n)$,
and the number of both batches and operations in each
batch is $O(\sqrt n)$.
This balances the precalculation time between
the batches and the time needed for processing
the batches.

As an example, let's consider a problem
where a grid of size $k \times k$
initially consists of white squares.
Our task is to perform $n$ operations,
each of which is one of the following:
\begin{itemize}
\item
paint square $(y,x)$ black
\item
find the nearest black square to
square $(y,x)$, where the distance
between squares $(y_1,x_1)$ and $(y_2,x_2)$
is $|y_1-y_2|+|x_1-x_2|$
\end{itemize}

The solution is to divide the operations into
$O(\sqrt n)$ batches, each of which consists
of $O(\sqrt n)$ operations.
At the beginning of each batch,
we calculate for each square in the grid
the smallest distance to a black square.
This can be done in $O(k^2)$ time using breadth-first search.

When processing a batch, we maintain a list of squares
that have been painted black in the current batch.
Now, the distance from a square to the nearest black
square is either the precalculated distance or the distance
to a square that has been painted black in the current batch.
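The batch idea can be sketched as follows. All names are ours, and as a simplification this sketch starts a new batch after every $\sqrt n$ \emph{paint} operations instead of counting all operations.

```cpp
#include <bits/stdc++.h>
using namespace std;

// Batch processing sketch: nearest black square in a k x k grid.
struct NearestBlack {
    int k, batchSize;
    vector<vector<int>> dist;       // precalculated distances
    vector<pair<int,int>> black;    // all black squares
    vector<pair<int,int>> recent;   // painted in the current batch
    NearestBlack(int k, int n) : k(k), batchSize(max(1, (int)sqrt((double)n))) {
        dist.assign(k, vector<int>(k, INT_MAX));
    }
    // O(k^2) multi-source breadth-first search from all black squares
    void rebuild() {
        for (auto& row : dist) fill(row.begin(), row.end(), INT_MAX);
        queue<pair<int,int>> q;
        for (auto [y, x] : black) { dist[y][x] = 0; q.push({y, x}); }
        int dy[] = {1,-1,0,0}, dx[] = {0,0,1,-1};
        while (!q.empty()) {
            auto [y, x] = q.front(); q.pop();
            for (int d = 0; d < 4; d++) {
                int ny = y + dy[d], nx = x + dx[d];
                if (ny >= 0 && ny < k && nx >= 0 && nx < k &&
                    dist[ny][nx] == INT_MAX) {
                    dist[ny][nx] = dist[y][x] + 1;
                    q.push({ny, nx});
                }
            }
        }
        recent.clear();
    }
    void paint(int y, int x) {
        black.push_back({y, x});
        recent.push_back({y, x});
        if ((int)recent.size() == batchSize) rebuild();   // batch is full
    }
    // precalculated distance or distance to a recently painted square
    // (returns INT_MAX if no square has been painted yet)
    int nearest(int y, int x) {
        int best = dist[y][x];
        for (auto [ry, rx] : recent)
            best = min(best, abs(y - ry) + abs(x - rx));
        return best;
    }
};
```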

The algorithm works in
$O((k^2+n) \sqrt n)$ time.
First, between the batches,
there are $O(\sqrt n)$ searches that each take
$O(k^2)$ time.
Second, the total number of processed
squares is $O(n)$, and at each square,
we go through a list of $O(\sqrt n)$ squares
in a batch.

If the algorithm performed a breadth-first search
at each operation, the complexity would be
$O(k^2 n)$.
And if the algorithm went through all painted
squares at each operation,
the complexity would be $O(n^2)$.
The square root algorithm combines these complexities,
and turns the factor $n$ into $\sqrt n$.

\section{Case processing}

\index{case processing}

In \key{case processing}, an algorithm has
specialized subalgorithms for the different cases that
may appear during the algorithm.
Typically, one subalgorithm is efficient for
small parameters and another is efficient
for large parameters, and the turning point is
about $\sqrt n$.

As an example, let's consider a problem where
we are given a tree that contains $n$ nodes,
each with some color. Our task is to find two nodes
that have the same color and whose distance
is as large as possible.

The problem can be solved by going through all
colors one after another, and for each color,
finding two nodes of that color whose distance is
maximum.
For a fixed color, a subalgorithm is used
that depends on the number of nodes of that color.
Let's assume that the current color is $x$
and there are $c$ nodes whose color is $x$.
There are two cases:

\subsubsection*{Case 1: $c \le \sqrt n$}

If the number of nodes is small,
we go through all pairs of nodes whose
color is $x$ and select the pair that
has the maximum distance.
For each node, we have to calculate the distance
to $O(\sqrt n)$ other nodes (see 18.3),
so the total time needed for processing all
nodes in case 1 is $O(n \sqrt n)$.

\subsubsection*{Case 2: $c > \sqrt n$}

If the number of nodes is large,
we traverse the whole tree
and calculate the maximum distance between
two nodes with color $x$.
The time complexity of the tree traversal is $O(n)$,
and this will be done at most $O(\sqrt n)$ times,
so the total time needed for case 2 is
$O(n \sqrt n)$.\\\\
\noindent
The time complexity of the algorithm is $O(n \sqrt n)$,
because both case 1 and case 2 take $O(n \sqrt n)$ time.
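The two cases can be sketched as follows; all helper names are ours. To keep the sketch short, case 1 answers distance queries with a breadth-first search, so this code does not reach the $O(n \sqrt n)$ bound; for that, the pairwise distances should be answered in $O(1)$ or $O(\log n)$ time, e.g., with an LCA structure. Case 2 is the $O(n)$ tree traversal described above, implemented as a tree dynamic program.

```cpp
#include <bits/stdc++.h>
using namespace std;

int n;
vector<vector<int>> adj;   // adjacency lists of the tree (nodes 0..n-1)
vector<int> col;           // color of each node
int best;                  // best distance found so far

// distances from s to all nodes in O(n)
vector<int> bfs(int s) {
    vector<int> d(n, -1); d[s] = 0;
    queue<int> q; q.push(s);
    while (!q.empty()) {
        int v = q.front(); q.pop();
        for (int u : adj[v]) if (d[u] == -1) { d[u] = d[v] + 1; q.push(u); }
    }
    return d;
}

// O(n) tree DP for color x: returns the largest distance from v down to
// a node of color x in v's subtree (INT_MIN if there is none), and
// combines the two best downward distances to update 'best'
int dfs(int v, int p, int x) {
    int d1 = (col[v] == x ? 0 : INT_MIN), d2 = INT_MIN;
    for (int u : adj[v]) {
        if (u == p) continue;
        int d = dfs(u, v, x);
        if (d != INT_MIN) d++;
        if (d > d1) { d2 = d1; d1 = d; }
        else if (d > d2) d2 = d;
    }
    if (d1 != INT_MIN && d2 != INT_MIN) best = max(best, d1 + d2);
    return d1;
}

int solve() {
    best = 0;
    map<int, vector<int>> nodes;   // nodes of each color
    for (int v = 0; v < n; v++) nodes[col[v]].push_back(v);
    int limit = max(1, (int)sqrt((double)n));
    for (auto& [x, vs] : nodes) {
        if ((int)vs.size() <= limit) {
            // Case 1: few nodes of this color -> go through all pairs
            for (int v : vs) {
                vector<int> d = bfs(v);
                for (int u : vs) best = max(best, d[u]);
            }
        } else {
            // Case 2: many nodes of this color -> one O(n) traversal
            dfs(0, -1, x);
        }
    }
    return best;
}
```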
\section{Mo's algorithm}

\index{Mo's algorithm}

\key{Mo's algorithm} can be used in many problems
that require processing range queries in
a \emph{static} array.
Before processing the queries, the algorithm
sorts them in a special order which guarantees
that the algorithm runs efficiently.

At each moment in the algorithm, there is an active
subarray, and the algorithm maintains the answer
for a query on that subarray.
The algorithm processes the given queries one by one,
and always changes the active subarray
by inserting and removing elements
so that it corresponds to the current query.
The time complexity of the algorithm is
$O(n \sqrt n f(n))$ when there are $n$ queries
and each insertion and removal of an element
takes $O(f(n))$ time.

The essential trick in Mo's algorithm is that
the queries are processed in a special order,
which makes the algorithm efficient.
The array is divided into blocks of $k=O(\sqrt n)$
elements, and the queries are sorted primarily by
the index of the block that contains the first element
of the query, and secondarily by the index of the
last element of the query.
It turns out that using this order, the algorithm
only performs $O(n \sqrt n)$ operations,
because the left border of the subarray moves
$n$ times $O(\sqrt n)$ steps,
and the right border of the subarray moves
$\sqrt n$ times $O(n)$ steps. Thus, both
borders move a total of $O(n \sqrt n)$ steps.

\subsubsection*{Example}

As an example, let's consider a problem
where we are given a set of subarrays in an array,
and our task is to calculate for each subarray
the number of distinct elements in the subarray.

In Mo's algorithm, the queries are always sorted
in the same way, but the way the answer for a query
is maintained depends on the problem.
In this problem, we can maintain an array
\texttt{c} where $\texttt{c}[x]$
indicates how many times an element $x$
occurs in the active subarray.

When we move from one query to another,
the active subarray changes.
For example, if the current subarray is

\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (1,0) rectangle (5,1);
\draw (0,0) grid (9,1);
\node at (0.5, 0.5) {4};
\node at (1.5, 0.5) {2};
\node at (2.5, 0.5) {5};
\node at (3.5, 0.5) {4};
\node at (4.5, 0.5) {2};
\node at (5.5, 0.5) {4};
\node at (6.5, 0.5) {3};
\node at (7.5, 0.5) {3};
\node at (8.5, 0.5) {4};
\end{tikzpicture}
\end{center}
and the next subarray is
\begin{center}
\begin{tikzpicture}[scale=0.7]
\fill[color=lightgray] (2,0) rectangle (7,1);
\draw (0,0) grid (9,1);
\node at (0.5, 0.5) {4};
\node at (1.5, 0.5) {2};
\node at (2.5, 0.5) {5};
\node at (3.5, 0.5) {4};
\node at (4.5, 0.5) {2};
\node at (5.5, 0.5) {4};
\node at (6.5, 0.5) {3};
\node at (7.5, 0.5) {3};
\node at (8.5, 0.5) {4};
\end{tikzpicture}
\end{center}
there will be three steps:
the left border moves one step to the right,
and the right border moves two steps to the right.

After each step, we update the
array \texttt{c}.
If an element $x$ is added to the subarray,
the value $\texttt{c}[x]$ increases by one,
and if an element $x$ is removed from the subarray,
the value $\texttt{c}[x]$ decreases by one.
If after an insertion $\texttt{c}[x]=1$,
the answer for the query increases by one,
and if after a removal $\texttt{c}[x]=0$,
the answer for the query decreases by one.

In this problem, the time needed to perform
each step is $O(1)$, so the total time complexity
of the algorithm is $O(n \sqrt n)$.
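The complete idea can be sketched as follows. Identifier names are ours; queries are given as triples $(l,r,i)$ with inclusive 0-based ranges and $i$ the query's original position, and the array values are assumed to be small nonnegative integers so that \texttt{c} can be indexed directly.

```cpp
#include <bits/stdc++.h>
using namespace std;

// Mo's algorithm: number of distinct elements in each query range.
vector<int> mo(const vector<int>& a, vector<array<int,3>> q) {
    int n = a.size(), k = max(1, (int)sqrt((double)n));
    // sort primarily by the block of the left border, secondarily by
    // the right border
    sort(q.begin(), q.end(),
         [&](const array<int,3>& x, const array<int,3>& y) {
        if (x[0] / k != y[0] / k) return x[0] / k < y[0] / k;
        return x[1] < y[1];
    });
    vector<int> c(*max_element(a.begin(), a.end()) + 1, 0);
    vector<int> res(q.size());
    int l = 0, r = -1, distinct = 0;   // active subarray [l,r], empty
    auto add = [&](int i) { if (c[a[i]]++ == 0) distinct++; };
    auto rem = [&](int i) { if (--c[a[i]] == 0) distinct--; };
    for (auto& [ql, qr, qi] : q) {
        // move the borders one step at a time, updating c
        while (r < qr) add(++r);
        while (l > ql) add(--l);
        while (r > qr) rem(r--);
        while (l < ql) rem(l++);
        res[qi] = distinct;
    }
    return res;
}
```

Applied to the example array above, the query for the first highlighted subarray yields 3 distinct elements and the query for the second yields 4.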