Corrections

This commit is contained in:
Antti H S Laaksonen 2017-02-13 21:42:16 +02:00
parent faa9ca2518
commit 3dd874a4fa
4 changed files with 179 additions and 179 deletions


Competitive programming combines two topics:
The \key{design of algorithms} consists of problem solving
and mathematical thinking.
Skills for analyzing problems and solving them
creatively are needed.
An algorithm for solving a problem
has to be both correct and efficient,
and the core of the problem is often

In competitive programming, the solutions
are graded by testing an implemented algorithm
using a set of test cases.
Thus, it is not enough that the idea of the
algorithm is correct, but the implementation
also has to be correct.
Good coding style in contests is
straightforward and concise.
The programs should be written quickly,
because there is not much time available.
Unlike in traditional software engineering,
the programs are short (usually at most some
hundreds of lines), and they do not need to be
maintained after the contest.

At the moment, the most popular programming
languages used in contests are C++, Python and Java.
For example, in Google Code Jam 2016,
among the best 3,000 participants,
73 \% used C++,
15 \% used Python and
10 \% used Java. %\footnote{\url{https://www.go-hero.net/jam/16}}
Some participants also used several languages.
Many people think that C++ is the best choice
for a competitive programmer,
and C++ is nearly always available in
contest systems.
The benefits of C++ are that it is
a very efficient language and its standard
library contains a large collection
of data structures and algorithms.
On the other hand, it is good to
master several languages and know
their strengths.
For example, if large integers are needed
in the problem,
Python can be a good choice, because it
contains built-in operations for
calculating with large integers.
Still, most problems in programming contests
are set so that
using a specific programming language
is not an unfair advantage.

A typical C++ code template for competitive
programming looks like this:
\begin{lstlisting}
#include <bits/stdc++.h>
using namespace std;

int main() {
    // solution comes here
}
\end{lstlisting}
The \texttt{\#include} line at the beginning
of the code is a feature of the \texttt{g++} compiler
that allows us to include the entire standard library.
Thus, it is not needed to separately include
libraries such as \texttt{iostream},
\texttt{vector} and \texttt{algorithm},
but they are available automatically.

The \texttt{using} line declares that the classes
and functions
of the standard library can be used directly
in the code.
Without the \texttt{using} line we should write,
for example, \texttt{std::cout},
but now it suffices to write \texttt{cout}.
The code can be compiled using the following command:
\begin{lstlisting}
g++ -std=c++11 -O2 -Wall code.cpp -o code
\end{lstlisting}
This command produces a binary file \texttt{code}
from the source code \texttt{code.cpp}.
The compiler follows the C++11 standard
(\texttt{-std=c++11}),
optimizes the code (\texttt{-O2})
and shows warnings about possible errors (\texttt{-Wall}).

In most problems, the input can be read using
the standard stream \texttt{cin}.
For example, the following code reads
two integers and a string:
\begin{lstlisting}
int a, b;
string x;
cin >> a >> b >> x;
\end{lstlisting}
This kind of code always works,
assuming that there is at least one space
or newline between each element in the input.
For example, the above code can read
both the following inputs:
\begin{lstlisting}
123 456 monkey
\end{lstlisting}
\begin{lstlisting}
123 456
monkey
\end{lstlisting}

Sometimes the C functions \texttt{scanf} and
\texttt{printf} are used instead,
because they are a bit faster:
\begin{lstlisting}
int a, b;
scanf("%d %d", &a, &b);
printf("%d %d\n", a, b);
\end{lstlisting}
Sometimes the program should read a whole line
from the input, possibly with spaces.
This can be accomplished by using the
\texttt{getline} function:
\begin{lstlisting}
string s;
getline(cin, s);
\end{lstlisting}
If the amount of data is unknown, the following
loop is useful:
\begin{lstlisting}
while (cin >> x) {
    // code
}
\end{lstlisting}
The loop reads elements from the input
until there is no more data available.

In some contest systems, files are used for
input and output.
An easy solution is to write the code
as usual using standard streams,
but add the following lines to the beginning of the code:
freopen("input.txt", "r", stdin);
freopen("output.txt", "w", stdout);
\end{lstlisting}
After this, the program reads the input from the file
''input.txt'' and writes the output to the file
''output.txt''.
\section{Working with numbers}
\index{integer}

\index{modular arithmetic}
The \key{remainder} $x \bmod m$ is the remainder
when $x$ is divided by $m$.
For example, $17 \bmod 5 = 2$,
because $17 = 3 \cdot 5 + 2$.
Sometimes, the answer to a problem is a
very large number but it is enough to
output it ''modulo $m$'', i.e.,
the remainder when the answer is divided by $m$
(for example, ''modulo $10^9+7$'').
The idea is that even if the actual answer
may be very large,
it suffices to use the types
\texttt{int} and \texttt{long long}.
An important property of the remainder is that
in addition, subtraction and multiplication,
the remainder can be taken before the operation:
\[
\begin{array}{rcl}
(a+b) \bmod m & = & ((a \bmod m) + (b \bmod m)) \bmod m \\
(a-b) \bmod m & = & ((a \bmod m) - (b \bmod m)) \bmod m \\
(a \cdot b) \bmod m & = & ((a \bmod m) \cdot (b \bmod m)) \bmod m
\end{array}
\]
Thus, we can take the remainder after every operation
and the numbers will never become too large.
For example, the following code calculates $n!$,
the factorial of $n$, modulo $m$:
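\begin{lstlisting}
// n and m are assumed to be defined earlier
long long x = 1;
for (int i = 2; i <= n; i++) {
    x = (x*i)%m;
}
cout << x << "\n";
\end{lstlisting}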
Usually we want the remainder to always
be between $0\ldots m-1$.
However, in C++ and other languages,
the remainder of a negative number
is either zero or negative.
An easy way to make sure there
are no negative remainders is to first calculate
the remainder as usual and then add $m$
if the result is negative:
\begin{lstlisting}
x = x%m;
if (x < 0) x += m;
\end{lstlisting}

\index{floating point number}

The required precision of the answer is usually
given in the problem statement.
An easy way to output the answer is to use the
\texttt{printf} function and give the number of
decimal places in the formatting string.
For example, the following code prints the value
of $x$ with 9 decimal places:
\begin{lstlisting}
printf("%.9f\n", x);
\end{lstlisting}
A difficulty when using floating point numbers
is that some numbers cannot be represented
accurately as floating point numbers,
but there will be rounding errors.
For example, the result of the following code
is surprising:
\begin{lstlisting}
double x = 0.3*3+0.1;
printf("%.20f\n", x); // 0.99999999999999988898
\end{lstlisting}
Due to a rounding error,
the value of \texttt{x} is a bit smaller than 1,
while the correct value would be 1.
It is risky to compare floating point numbers
with the \texttt{==} operator,
because it is possible that the values should
be equal but they are not because of rounding.
A better way to compare floating point numbers
is to assume that two numbers are equal
if the difference between them is less than $\varepsilon$,
where $\varepsilon$ is a small number.
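In code, the comparison can look as follows
(a sketch with $\varepsilon = 10^{-9}$):
\begin{lstlisting}
if (abs(a-b) < 1e-9) {
    // a and b are considered equal
}
\end{lstlisting}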
Note that even though floating point numbers
are inaccurate, integers up to a certain limit
can still be represented accurately.
For example, using \texttt{double},
it is possible to accurately represent all
integers whose absolute value is at most $2^{53}$.
\section{Shortening code}
Short code is ideal in competitive programming,
because programs should be written
as fast as possible.
Because of this, competitive programmers often define
shorter names for datatypes and other parts of code.
A shorter name for a datatype can be defined
with the \texttt{typedef} keyword.
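For example, the name \texttt{ll} can be defined
for the \texttt{long long} type (a sketch following
this common convention):
\begin{lstlisting}
typedef long long ll;
\end{lstlisting}
After this, the code
\begin{lstlisting}
long long a = 123456789;
long long b = 987654321;
cout << a*b << "\n";
\end{lstlisting}
can be shortened as follows:
\begin{lstlisting}
ll a = 123456789;
ll b = 987654321;
cout << a*b << "\n";
\end{lstlisting}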
The command \texttt{typedef}
can also be used with more complex types.
For example, the following code gives
the name \texttt{vi} for a vector of integers
and the name \texttt{pi} for a pair
that contains two integers.
\begin{lstlisting}
typedef vector<int> vi;
typedef pair<int,int> pi;
\end{lstlisting}
Another way to shorten code is to define
\key{macros}.
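For example, a macro can be defined for loops
(a sketch; \texttt{REP} is the macro the excerpt
refers to, and \texttt{search} is a placeholder function):
\begin{lstlisting}
#define REP(i,a,b) for (int i = a; i <= b; i++)
\end{lstlisting}
After this, the loop
\begin{lstlisting}
for (int i = 1; i <= n; i++) {
    search(i);
}
\end{lstlisting}
can be written as follows:
\begin{lstlisting}
REP(i,1,n) {
    search(i);
}
\end{lstlisting}

\section{Mathematics}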
Mathematics plays an important role in competitive
programming, and it is not possible to become
a successful competitive programmer without
having good mathematical skills.
This section discusses some important
mathematical concepts and formulas that
are needed later in the book.
\subsubsection{Sum formulas}
Each sum of the form
\[\sum_{x=1}^n x^k = 1^k+2^k+3^k+\ldots+n^k,\]
where $k$ is a positive integer,
has a closed-form formula that is a
polynomial of degree $k+1$.
For example,
\[\sum_{x=1}^n x = 1+2+3+\ldots+n = \frac{n(n+1)}{2}\]
and
\[\sum_{x=1}^n x^2 = 1^2+2^2+3^2+\ldots+n^2 = \frac{n(n+1)(2n+1)}{6}.\]
An \key{arithmetic progression} is a \index{arithmetic progression}
sequence of numbers
where the difference between any two consecutive
numbers is constant.
For example,
\[3, 7, 11, 15\]
is an arithmetic progression with constant 4.
The sum of an arithmetic progression can be calculated
using the formula
\[\frac{n(a+b)}{2}\]
where $a$ is the first number,
$b$ is the last number and
$n$ is the amount of numbers.
The formula is based on the fact
that the sum consists of $n$ numbers and
the value of each number is $(a+b)/2$ on average.
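For instance, for the progression above, the formula gives
\[3+7+11+15 = \frac{4 \cdot (3+15)}{2} = 36.\]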
\index{geometric progression}
A \key{geometric progression} is a sequence
of numbers
where the ratio between any two consecutive
numbers is constant.
For example,
\[3,6,12,24\]
is a geometric progression with constant 2.
The sum of a geometric progression can be calculated
using the formula
\[\frac{bx-a}{x-1}\]
where $a$ is the first number,
$b$ is the last number and
$x$ is the ratio between consecutive numbers.
To derive the sum formula, let
\[ S = a+ax+ax^2+\cdots+b.\]
Multiplying both sides by $x$ gives
\[ xS = ax+ax^2+ax^3+\cdots+bx,\]
and solving the equation
\[ xS-S = bx-a\]
yields the formula.
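For instance, for the progression above, the formula gives
\[3+6+12+24 = \frac{24 \cdot 2 - 3}{2-1} = 45.\]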
A special case of a sum of a geometric progression is the formula
\[1+2+4+8+\ldots+2^{n-1}=2^n-1.\]
\index{harmonic sum}
@ -579,7 +582,7 @@ A special case of a geometric sum is the formula
A \key{harmonic sum} is a sum of the form
\[ \sum_{x=1}^n \frac{1}{x} = 1+\frac{1}{2}+\frac{1}{3}+\ldots+\frac{1}{n}.\]
An upper bound for a harmonic sum is $\log_2(n)+1$.
Namely, we can
modify each term $1/k$ so that $k$ becomes
the nearest power of two that does not exceed $k$.
For example, in the case $n=6$, we can bound
the sum as follows:
\[ \sum_{x=1}^6 \frac{1}{x} \le
1+\frac{1}{2}+\frac{1}{2}+\frac{1}{4}+\frac{1}{4}+\frac{1}{4}.\]
This upper bound consists of $\log_2(n)+1$ parts
($1$, $2 \cdot 1/2$, $4 \cdot 1/4$, etc.),
and the value of each part is at most 1.
\subsubsection{Set theory}

A \key{set} is a collection of elements.
For example, the set
\[X=\{2,4,7\}\]
contains elements 2, 4 and 7.
The symbol $\emptyset$ denotes an empty set,
and $|S|$ denotes the size of a set $S$,
i.e., the number of elements in the set.
For example, in the above set, $|X|=3$.
@ -654,15 +657,11 @@ $\{2\}$, $\{4\}$, $\{7\}$, $\{2,4\}$, $\{2,7\}$, $\{4,7\}$ and $\{2,4,7\}$.
\end{center}
Often used sets are
$\mathbb{N}$ (natural numbers),
$\mathbb{Z}$ (integers),
$\mathbb{Q}$ (rational numbers) and
$\mathbb{R}$ (real numbers).
The set $\mathbb{N}$
can be defined in two ways, depending
on the situation:
either $\mathbb{N}=\{0,1,2,\ldots\}$
or $\mathbb{N}=\{1,2,3,\ldots\}$.

\subsubsection{Logic}

The value of a logical expression is either
\key{true} (1) or \key{false} (0).
The most important logical operators are
$\lnot$ (\key{negation}),
$\land$ (\key{conjunction}),
$\lor$ (\key{disjunction}),
$\Rightarrow$ (\key{implication}) and
$\Leftrightarrow$ (\key{equivalence}).
The following table shows the meanings of these operators:

\begin{center}
\begin{tabular}{rr|rrrrrr}
$A$ & $B$ & $\lnot A$ & $\lnot B$ & $A \land B$ & $A \lor B$ & $A \Rightarrow B$ & $A \Leftrightarrow B$ \\
\hline
0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 \\
0 & 1 & 1 & 0 & 0 & 1 & 1 & 0 \\
1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\
1 & 1 & 0 & 0 & 1 & 1 & 1 & 1 \\
\end{tabular}
\end{center}
The expression $\lnot A$ has the opposite value of $A$.
The expression $A \land B$ is true if both $A$ and $B$
are true,
and the expression $A \lor B$ is true if $A$ or $B$ or both
are true.

\index{predicate}
A \key{predicate} is an expression that is
true or false depending on its parameters.
For example, the predicate $P(x)$ can denote that
$x$ is a prime number.
Using this definition, $P(7)$ is true but $P(8)$ is false.
\index{quantifier}
A \key{quantifier} connects a logical expression
to the elements of a set.
The most important quantifiers are
$\forall$ (\key{for all}) and $\exists$ (\key{there is}).
For example,
\[\forall x (\exists y (y < x))\]
means that for each element $x$,
there is an element $y$ that is smaller than $x$.
This is true in the set of integers,
but false in the set of natural numbers.
As another example,
\[\forall x ((x>1 \land \lnot P(x)) \Rightarrow (\exists a (\exists b (x = ab \land a > 1 \land b > 1))))\]
means that if a number $x$ is larger than 1
and not a prime number,
then there are numbers $a$ and $b$
that are larger than $1$ and whose product is $x$.
This proposition is true in the set of integers.

\index{logarithm}

The \key{logarithm} of a number $x$ is denoted
$\log_k(x)$, where $k$ is the base
of the logarithm.
According to the definition,
$\log_k(x)=a$ exactly when $k^a=x$.
A useful property of logarithms is
that $\log_k(x)$ equals the number of times
we have to divide $x$ by $k$ before we reach
the number 1.
For example, $\log_2(32)=5$,
because 5 divisions are needed:
\[32 \rightarrow 16 \rightarrow 8 \rightarrow 4 \rightarrow 2 \rightarrow 1 \]
Logarithms are often used in the analysis of
algorithms, because many efficient algorithms
halve something at each step.
Hence, we can estimate the efficiency of such algorithms
using logarithms.

\index{natural logarithm}
The \key{natural logarithm} $\ln(x)$ of a number $x$
is a logarithm whose base is $e \approx 2{,}71828$.
Another property of logarithms is that
the number of digits of an integer $x$ in base $b$ is
$\lfloor \log_b(x)+1 \rfloor$.
For example, the representation of
$123$ in base $2$ is 1111011 and
$\lfloor \log_2(123)+1 \rfloor = 7$.
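As a small sketch (our own helper, not from the book),
the digit count can also be computed by repeated
division, mirroring the interpretation of the
logarithm above:
\begin{lstlisting}
// number of digits of x in base b (assumes x >= 1, b >= 2)
int digits(long long x, int b) {
    int count = 0;
    while (x > 0) {
        count++;
        x /= b;
    }
    return count;
}
\end{lstlisting}
For example, \texttt{digits(123, 2)} returns 7.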


\subsubsection*{Order of magnitude}
A time complexity does not tell us the exact number
of times the code inside a loop is executed,
but it only shows the order of magnitude.
In the following examples, the code inside the loop
is executed $3n$, $n+5$ and $\lceil n/2 \rceil$ times,
but the time complexity of each code is $O(n)$.

\item[$O(\log n)$]
A \key{logarithmic} algorithm often halves
the input size at each step.
The running time is logarithmic, because
$\log_2 n$ equals the number of times
$n$ must be divided by 2 to get 1.
\item[$O(\sqrt n)$]
A \key{square root algorithm} is slower than
$O(\log n)$ but faster than $O(n)$.
A special property of square roots is that
$\sqrt n = n/\sqrt n$, so the square root $\sqrt n$ lies
in some sense in the middle of the input.
\item[$O(n)$]
A \key{linear} algorithm goes through the input
a constant number of times.
This is often the best possible time complexity,
because it is usually necessary to access each
input element at least once before
reporting the answer.
\item[$O(n \log n)$]
This time complexity often indicates that the
algorithm sorts the input,
because the time complexity of efficient
sorting algorithms is $O(n \log n)$.
Another possibility is that the algorithm
uses a data structure where each operation
takes $O(\log n)$ time.
A \key{quadratic} algorithm often contains
two nested loops.
It is possible to go through all pairs of
the input elements in $O(n^2)$ time.
\item[$O(n^3)$]
\index{cubic algorithm}
A \key{cubic} algorithm often contains
three nested loops.
It is possible to go through all triplets of
the input elements in $O(n^3)$ time.
\item[$O(2^n)$]
This time complexity often indicates that
the algorithm iterates through all
subsets of the input elements.

\item[$O(n!)$]
This time complexity often indicates that
the algorithm iterates through all
permutations of the input elements.
\end{description}

\index{NP-hard problem}
\key{NP-hard} problems are an important set
of problems for which no polynomial algorithm is known.
\section{Estimating efficiency}
By calculating the time complexity of an algorithm,
it is possible to check before
implementing the algorithm that it is
efficient enough for the problem.
The starting point for estimations is the fact that
a modern computer can perform some hundreds of
millions of operations per second.

For example, assume that the time limit for
a problem is one second and the input size is $n=10^5$.
If the time complexity is $O(n^2)$,
the algorithm will perform about $(10^5)^2=10^{10}$ operations.
This should take at least some tens of seconds,
so the algorithm seems to be too slow for solving the problem.
On the other hand, given the input size,
we can try to guess the required time complexity
of the algorithm that solves the problem.
For example, if $n=10^5$,
it is probably expected that the time
complexity of the algorithm is $O(n)$ or $O(n \log n)$.
This information makes it easier to design the algorithm,
because it rules out approaches that would yield
an algorithm with a worse time complexity.
\index{constant factor}

\section{Maximum subarray sum}

\index{maximum subarray sum}
As an example problem, consider the task of finding
the \key{maximum subarray sum} of an array of
$n$ numbers, i.e., the largest possible sum of
a sequence of consecutive values in the array.
For example, in the array
\[[-1,2,4,-3,5,2,-5,2]\]
the subarray
\[[2,4,-3,5,2]\]
produces the maximum sum $10$.
\subsubsection{Algorithm 1}
A straightforward algorithm for the problem
is to go through all possible ways to
select a subarray, calculate the sum of
numbers in each subarray and maintain
the maximum sum.
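One way to implement this idea (a sketch;
the array \texttt{x} and its size $n$ are
as described below):
\begin{lstlisting}
int p = 0;
for (int a = 1; a <= n; a++) {
    for (int b = a; b <= n; b++) {
        int s = 0;
        for (int k = a; k <= b; k++) {
            s += x[k];
        }
        p = max(p,s);
    }
}
cout << p << "\n";
\end{lstlisting}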
The code assumes that the numbers are stored in an array \texttt{x}
with indices $1 \ldots n$.
The variables $a$ and $b$ determine the first and last
number in the subarray,
and the sum of the numbers is accumulated
in the variable $s$.
The variable $p$ contains the maximum sum found during the search.
The time complexity of the algorithm is $O(n^3)$,
because it consists of three nested loops and
each loop contains $O(n)$ steps.
\subsubsection{Algorithm 2}
It is easy to make the first algorithm more efficient
by removing one loop from it.
This is possible by calculating the sum at the same
time when the right end of the subarray moves.
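A sketch of the improved algorithm:
\begin{lstlisting}
int p = 0;
for (int a = 1; a <= n; a++) {
    int s = 0;
    for (int b = a; b <= n; b++) {
        s += x[b];
        p = max(p,s);
    }
}
cout << p << "\n";
\end{lstlisting}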
After this change, the time complexity is $O(n^2)$.
\subsubsection{Algorithm 3}
Surprisingly, it is possible to solve the problem
in $O(n)$ time, which means that we can remove
one more loop.
The idea is to calculate for each array position
the maximum sum of a subarray that ends at that position.
After this, the answer for the problem is the
maximum of those sums.
Consider the subproblem of finding the maximum-sum subarray
that ends at position $k$.
There are two possibilities:
\begin{enumerate}
\item The subarray only contains
the element at position $k$.
\item The subarray consists of a subarray that ends
at position $k-1$, followed by the element at position $k$.
\end{enumerate}
Our goal is to find a subarray with maximum sum,
so in case 2 the subarray that ends at position $k-1$
should also have the maximum sum.
Thus, we can solve the problem efficiently
when we calculate the maximum subarray sum
for each ending position from left to right.
The following code implements the algorithm:
\begin{lstlisting}
int p = 0, s = 0;
for (int k = 1; k <= n; k++) {
    s = max(x[k],s+x[k]);
    p = max(p,s);
}
cout << p << "\n";
\end{lstlisting}
The algorithm contains only one loop
that goes through the input,
so the time complexity is $O(n)$.
This is also the best possible time complexity,
because any algorithm for the problem
has to examine all array elements at least once.
\subsubsection{Efficiency comparison}
@ -524,7 +524,7 @@ measured.
\begin{center}
\begin{tabular}{rrrr}
array size $n$ & algorithm 1 & algorithm 2 & algorithm 3 \\
\hline
$10^2$ & $0{,}0$ s & $0{,}0$ s & $0{,}0$ s \\
$10^3$ & $0{,}1$ s & $0{,}0$ s & $0{,}0$ s \\

The comparison shows that all algorithms
are efficient when the input size is small,
but larger inputs bring out considerable
differences in the running times of the algorithms.
The $O(n^3)$ time algorithm 1 becomes slow
when $n=10^4$, and the $O(n^2)$ time algorithm 2
becomes slow when $n=10^5$.
Only the $O(n)$ time algorithm 3 processes
even the largest inputs instantly.


\key{Sorting}
is a fundamental algorithm design problem.
Many efficient algorithms
use sorting as a subroutine,
because it is often easier to process
data if the elements are in a sorted order.
For example, the problem ''does the array contain
two equal elements?'' is easy to solve using sorting.
If the array contains two equal elements,
they will be next to each other after sorting,
so it is easy to find them.
Also the problem ''what is the most frequent element
in the array?'' can be solved similarly.
There are many algorithms for sorting, and they are
also good examples of how to apply
different algorithm design techniques.
The efficient general sorting algorithms
work in $O(n \log n)$ time,
and many algorithms that use sorting
as a subroutine also have this time complexity.

\index{bubble sort}
A classic $O(n^2)$ time sorting algorithm
is \key{bubble sort}, where the elements
''bubble'' in the array according to their values.
Bubble sort consists of $n-1$ rounds.
On each round, the algorithm iterates through
the elements of the array.
Whenever two consecutive elements are found
that are not in correct order,
the algorithm swaps them.
The algorithm can be implemented as follows
for an array
$\texttt{t}[1],\texttt{t}[2],\ldots,\texttt{t}[n]$:
\begin{lstlisting}
for (int i = 1; i <= n-1; i++) {
for (int j = 1; j <= n-i; j++) {
if (t[j] > t[j+1]) swap(t[j],t[j+1]);
    }
}
\end{lstlisting}

After the first round of the algorithm,
the largest element will be in the correct position,
and in general, after $k$ rounds, the $k$ largest
elements will be in the correct positions.
Thus, after $n-1$ rounds, the whole array
will be sorted.

\index{inversion}
Bubble sort is an example of a sorting
algorithm that always swaps consecutive
elements in the array.
It turns out that the time complexity
of such an algorithm is \emph{always}
at least $O(n^2)$, because in the worst case,
$O(n^2)$ swaps are required for sorting the array.
A useful concept when analyzing sorting
algorithms is an \key{inversion}:
a pair of array elements
$(\texttt{t}[a],\texttt{t}[b])$ such that
$a<b$ and $\texttt{t}[a]>\texttt{t}[b]$,
i.e., the elements are in the wrong order.
For example, in the array
\begin{center}
$[1,2,2,6,3,5,9,8]$
\end{center}
the inversions are $(6,3)$, $(6,5)$ and $(9,8)$.
The number of inversions tells us
how much work is needed to sort the array.
An array is completely sorted when
there are no inversions.

\index{mergesort}
\key{Mergesort} is an efficient sorting algorithm
that is based on recursion.
Mergesort sorts a subarray \texttt{t}$[a,b]$ as follows:
\begin{enumerate}
\item If $a=b$, do not do anything, because the subarray is already sorted.
\item Calculate the position of the middle element: $k=\lfloor (a+b)/2 \rfloor$.
\item Recursively sort the subarray \texttt{t}$[a,k]$.
\item Recursively sort the subarray \texttt{t}$[k+1,b]$.
\item \emph{Merge} the sorted subarrays \texttt{t}$[a,k]$ and \texttt{t}$[k+1,b]$
into a sorted subarray \texttt{t}$[a,b]$.
\end{enumerate}
Mergesort is an efficient algorithm, because it
halves the size of the subarray at each step.
The recursion consists of $O(\log n)$ levels,
and processing each level takes $O(n)$ time.
Merging the subarrays \texttt{t}$[a,k]$ and \texttt{t}$[k+1,b]$
is possible in linear time, because they are already sorted.

\subsubsection{Sorting lower bound}

\index{lower bound}
Is it possible to sort an array faster
than in $O(n \log n)$ time?
It turns out that this is \emph{not} possible
when we restrict ourselves to sorting algorithms
that are based on comparing array elements.
The lower bound for the time complexity
can be proved by considering sorting
as a process where each comparison of two elements
gives more information about the contents of the array.
The process corresponds to a binary tree, where
each comparison creates two branches and each leaf
is a possible final order of the array.
Since an array of $n$ elements can be in $n!$ orders,
the height of the tree, and thus the worst-case
number of comparisons, is at least $\log_2(n!)$,
which is of order $n \log n$.

\index{counting sort}
The lower bound $n \log n$ does not apply to
algorithms that do not compare array elements
but use some other information.
An example of such an algorithm is
\key{counting sort} that sorts an array in
$O(n)$ time, assuming that every element in the array
is an integer between $0 \ldots c$ and $c=O(n)$.
The algorithm creates a \emph{bookkeeping} array,
whose indices are elements of the original array,
and calculates how many times each element
appears in the original array.
For example, the array $[1,3,6,9,9,3,5,9]$
corresponds to the following bookkeeping array:
\begin{center}
\begin{tabular}{rrrrrrrrr}
1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\
\hline
1 & 0 & 2 & 0 & 1 & 1 & 0 & 0 & 3 \\
\end{tabular}
\end{center}
For example, the value at position 3
in the bookkeeping array is 2,
because the element 3 appears 2 times
in the original array (positions 2 and 6).
The construction of the bookkeeping array
takes $O(n)$ time. After this, the sorted array
can be created in $O(n)$ time, because the number of
occurrences of each element can be retrieved
from the bookkeeping array.

Counting sort is a very efficient algorithm,
but it can only be used when the constant $c$
is small enough, so that the array elements can
be used as indices in the bookkeeping array.
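As a sketch (our own code, assuming the elements of
\texttt{t}$[1 \ldots n]$ are integers in $1 \ldots c$):
\begin{lstlisting}
// counting sort sketch
vector<int> count(c+1, 0);
for (int i = 1; i <= n; i++) count[t[i]]++;
int j = 1;
for (int v = 1; v <= c; v++) {
    for (int k = 0; k < count[v]; k++) t[j++] = v;
}
\end{lstlisting}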
\index{sort@\texttt{sort}}
It is almost never a good idea to use
a self-made sorting algorithm
in a contest, because there are good
implementations available in programming languages.
For example, the C++ standard library contains
the function \texttt{sort} that can be easily
used for sorting arrays and other data structures.
There are many benefits in using a library function.
First, it saves time because there is no need to
implement the function.
In addition, the library implementation is
certainly correct and efficient: it is unlikely
that a self-made sorting function would be better.
In this section we will see how to use the
C++ \texttt{sort} function.

For example, the following code sorts
the elements of an integer vector in increasing order:
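\begin{lstlisting}
vector<int> v = {4,2,5,3,5,8,3}; // example values
sort(v.begin(),v.end());
\end{lstlisting}
After the sorting, the contents of the vector are
$[2,3,3,4,5,5,8]$.
The default sorting order is increasing,
but a reverse order is possible as follows: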
\begin{lstlisting}
sort(v.rbegin(),v.rend());
\end{lstlisting}
An ordinary array can be sorted as follows:
\begin{lstlisting}
int n = 7; // array size
int t[] = {4,2,5,3,5,8,3};
sort(t,t+n);
\end{lstlisting}
Sorting a string means that the characters
in the string are sorted.
For example, the string ''monkey'' becomes ''ekmnoy''.
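For example:
\begin{lstlisting}
string s = "monkey";
sort(s.begin(), s.end());
cout << s << "\n"; // ekmnoy
\end{lstlisting}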
\subsubsection{Comparison operators}
\index{comparison operator}
The function \texttt{sort} requires that
a \key{comparison operator} is defined for the data type
of the elements to be sorted.
During the sorting, this operator will be used
whenever it is needed to find out the order of two elements.
Most C++ data types have a built-in comparison operator,
and elements of those types can be sorted automatically.
For example, numbers are sorted according to their values
and strings are sorted in alphabetical order.
\index{pair@\texttt{pair}}
Pairs (\texttt{pair}) are sorted primarily by their first
elements (\texttt{first}).
However, if the first elements of two pairs are equal,
they are sorted by their second elements (\texttt{second}):
\begin{lstlisting}
vector<pair<int,int>> v;
v.push_back({1,5});
v.push_back({2,3});
v.push_back({1,2});
sort(v.begin(), v.end());
\end{lstlisting}
After this, the order of the pairs is
$(1,2)$, $(1,5)$ and $(2,3)$.
\index{tuple@\texttt{tuple}}
In a similar way, tuples (\texttt{tuple})
are sorted primarily by the first element,
secondarily by the second element, etc.:
\begin{lstlisting}
vector<tuple<int,int,int>> v;
v.push_back(make_tuple(2,1,4));
v.push_back(make_tuple(1,5,3));
v.push_back(make_tuple(2,1,3));
sort(v.begin(), v.end());
// order after sorting: (1,5,3), (2,1,3), (2,1,4)
\end{lstlisting}
User-defined structs do not have a comparison
operator automatically.
For example, the following struct \texttt{P}
contains the $x$ and $y$ coordinates of a point,
and its comparison operator sorts the points
primarily by $x$ and secondarily by $y$:
\begin{lstlisting}
struct P {
    int x, y;
    bool operator<(const P &p) {
        if (x != p.x) return x < p.x;
        else return y < p.y;
    }
};
\end{lstlisting}
\subsubsection{Comparison functions}
\index{comparison function}
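The \texttt{sort} function can also be given an
external comparison function as a callback.
For example (a sketch; the function \texttt{comp}
and its ordering are our own illustration),
the following function sorts strings primarily by
length and secondarily in alphabetical order:
\begin{lstlisting}
bool comp(string a, string b) {
    if (a.size() != b.size()) return a.size() < b.size();
    return a < b;
}
\end{lstlisting}
A vector of strings can now be sorted as follows:
\begin{lstlisting}
sort(v.begin(), v.end(), comp);
\end{lstlisting}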

\section{Binary search}

\index{binary search}

A general way to search for an element
in an array is to use a \texttt{for} loop
that iterates through the array.
For example, the following code searches for
an element $x$ in an array \texttt{t}:
\begin{lstlisting}
for (int i = 1; i <= n; i++) {
    if (t[i] == x) {
        // x was found at position i
    }
}
\end{lstlisting}
The time complexity of this approach is $O(n)$,
because in the worst case, it is needed to check
all elements in the array.
If the array may contain any elements,
this is also the best possible approach, because
there is no additional information available where
in the array we should search for the element $x$.
@ -791,7 +791,7 @@ However, if the array is \emph{sorted},
the situation is different.
In this case it is possible to perform the
search much faster, because the order of the
elements in the array guides the search.
The following \key{binary search} algorithm
efficiently searches for an element in a sorted array
in $O(\log n)$ time.

\subsubsection{Method 1}

The usual way to implement binary search
resembles looking for a word in a dictionary.
At each step, the search halves the active region in the array,
until the target element is found, or it turns out
that there is no such element.
First, the search checks the middle element of the array.
If the middle element is the target element,
the search terminates.
Otherwise, the search recursively continues
The following code implements the search:
\begin{lstlisting}
int a = 1, b = n;
while (a <= b) {
    int k = (a+b)/2;
    if (t[k] == x) {
        // x was found at position k
        break;
    }
    if (t[k] > x) b = k-1;
    else a = k+1;
}
\end{lstlisting}
The algorithm maintains a range $a \ldots b$
that corresponds to the active region of the array.
Initially, the range is $1 \ldots n$, the whole array.
The algorithm halves the size of the range at each step,
so the time complexity is $O(\log n)$.

\subsubsection{Method 2}

Another way to implement binary search is to go
through the array from left to right making jumps.
The initial jump length is $n/2$, and the length
is halved on each round until it is 1:
\begin{lstlisting}
int k = 1;
for (int b = n/2; b >= 1; b /= 2) {
    while (k+b <= n && t[k+b] <= x) k += b;
}
if (t[k] == x) {
    // x was found at position k
}
\end{lstlisting}
The variables $k$ and $b$ contain the position
in the array and the jump length.
If the array contains the element $x$,
the position of $x$ will be in the variable $k$
after the search.
The time complexity of the algorithm is $O(\log n)$,
because the code in the \texttt{while} loop
is performed at most twice for each jump length.
In practice, it is seldom needed to implement
binary search for searching elements in an array,
because we can use the standard library.
For example, the C++ functions \texttt{lower\_bound}
and \texttt{upper\_bound} implement binary search,
and the data structure \texttt{set} maintains a
sorted set of elements that can be
searched efficiently.

An important use for binary search is to find
a position where the value of a \emph{function} changes.
Suppose that we wish to find the smallest value $k$
that is a valid solution for a problem.
We are given a function $\texttt{ok}(x)$
that returns \texttt{true} if $x$ is a valid solution
and \texttt{false} otherwise, and we know that
$\texttt{ok}(x)$ is \texttt{false} when $x<k$
and \texttt{true} when $x \ge k$.
In other words, the values of the function look as follows:

\begin{center}
\begin{tabular}{r|rrrrrr}
$x$ & 0 & 1 & $\cdots$ & $k-1$ & $k$ & $k+1$ \\
\hline
$\texttt{ok}(x)$ & \texttt{false} & \texttt{false} & $\cdots$ & \texttt{false} & \texttt{true} & \texttt{true} \\
\end{tabular}
\end{center}

\noindent
Now, the value $k$ can be found using binary search:
\begin{lstlisting}
int x = -1;
for (int b = z; b >= 1; b /= 2) {
    while (!ok(x+b)) x += b;
}
int k = x+1;
\end{lstlisting}
The search finds the largest value of $x$ for which
$\texttt{ok}(x)$ is \texttt{false}, so the next value
$k=x+1$ is the smallest value for which
$\texttt{ok}(k)$ is \texttt{true}.
The initial jump length $z$ has to be large enough,
for example some value for which we know beforehand
that $\texttt{ok}(z)$ is \texttt{true}.
Note that unlike in the ordinary binary search,
here it is not allowed that consecutive values
of the function are equal.
In this case it would not be possible to know
how to continue the search.


\index{dynamic array}
A \key{dynamic array} is an array whose
size can be changed during the execution
of the program.
The most popular dynamic array in C++ is
the \texttt{vector} structure,
that can be used almost like an ordinary array.
The following code creates an empty vector and
adds three elements to it:
\begin{lstlisting}
vector<int> v;
v.push_back(3); // [3]
v.push_back(2); // [3,2]
v.push_back(5); // [3,2,5]
\end{lstlisting}
After this, the elements can be accessed like in an ordinary array:
\begin{lstlisting}
cout << v[0] << "\n"; // 3
\end{lstlisting}

The following code creates a vector with
10 elements, each with the initial value 5:
\begin{lstlisting}
vector<int> v(10, 5);
\end{lstlisting}
The internal implementation of the vector
uses an ordinary array.
If the size of the vector increases and
the array becomes too small,
a new array is allocated and all the
a new array is allocated and all the
elements are moved to the new array.

\index{string}
The \texttt{string} structure is also
a dynamic array that can be used almost
like a vector.
In addition, there is special syntax for strings
that is not available in other data structures.
Strings can be combined using the \texttt{+} symbol.
The function $\texttt{substr}(k,x)$ returns the substring
that begins at position $k$ and has length $x$,
and the function $\texttt{find}(\texttt{t})$ finds the position
of the first occurrence of a substring \texttt{t}.
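For example (a small sketch; the strings are
our own example values):
\begin{lstlisting}
string a = "hatti";
string b = a + "vatti";
cout << b << "\n"; // hattivatti
cout << b.substr(3,4) << "\n"; // tiva
cout << b.find("vat") << "\n"; // 5
\end{lstlisting}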

The following code goes through all
the elements of a set \texttt{s}:
\begin{lstlisting}
for (auto x : s) {
    cout << x << "\n";
}
\end{lstlisting}
An important property of sets is
that all the elements are \emph{distinct}.
Thus, the function \texttt{count} always returns
either 0 (the element is not in the set)
or 1 (the element is in the set),
and the function \texttt{insert} never adds
an element to the set if it is
already there.
The following code illustrates this:
\begin{lstlisting}
set<int> s;
s.insert(5);
s.insert(5);
s.insert(5);
cout << s.count(5) << "\n"; // 1
\end{lstlisting}
C++ also contains the structures
\texttt{multiset} and \texttt{unordered\_multiset}
that work otherwise like \texttt{set}
and \texttt{unordered\_set}
but they can contain multiple instances of an element.
For example, in the following code all three instances
of the number 5 are added to a multiset:
\begin{lstlisting}
multiset<int> s;
s.insert(5);
s.insert(5);
s.insert(5);
cout << s.count(5) << "\n"; // 3
\end{lstlisting}
The function \texttt{erase} removes
all instances of an element
from a multiset:
\begin{lstlisting}
s.erase(5);
cout << s.count(5) << "\n"; // 0
\end{lstlisting}
If only one instance should be removed,
\texttt{erase} can be given an iterator:
\begin{lstlisting}
s.insert(5);
s.insert(5);
s.insert(5);
s.erase(s.find(5));
cout << s.count(5) << "\n"; // 2
\end{lstlisting}
A \key{map} is a generalized array
that consists of key-value-pairs.
While the keys in an ordinary array are always
the consecutive integers $0,1,\ldots,n-1$,
where $n$ is the size of the array,
the keys in a map can be of any data type and
they do not have to be consecutive values.

C++ contains two map implementations that
correspond to the set implementations:
the structure
\texttt{map} is based on a balanced
binary tree and accessing elements
takes $O(\log n)$ time,
while the structure
\texttt{unordered\_map} uses a hash map
and accessing an element takes $O(1)$ time on average.
and accessing elements takes $O(1)$ time on average.
The following code creates a map
where the keys are strings and the values are integers:
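For example (the keys and values here are
our own example data):
\begin{lstlisting}
map<string,int> m;
m["monkey"] = 4;
m["banana"] = 3;
m["harpsichord"] = 9;
cout << m["banana"] << "\n"; // 3
\end{lstlisting}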
If the value of a key is requested
but the map does not contain it,
the key with the default value $0$
is added to the map:
\begin{lstlisting}
map<string,int> m;
cout << m["aybabtu"] << "\n"; // 0
\end{lstlisting}
The function \texttt{count} checks
if a key exists in a map:
\begin{lstlisting}
if (m.count("aybabtu")) {
cout << "key exists in the map";
}
\end{lstlisting}
The following code prints all keys and values
in a map:
\begin{lstlisting}
for (auto x : m) {
cout << x.first << " " << x.second << "\n";

\index{iterator}
Many functions in the C++ standard library
are given iterators as parameters.
For example, the following calls sort a vector,
reverse the order of its elements and
shuffle the elements randomly:
\begin{lstlisting}
sort(v.begin(), v.end());
reverse(v.begin(), v.end());
random_shuffle(v.begin(), v.end());
\end{lstlisting}
These functions can also be used with an ordinary array.
In this case, the functions are given pointers to the array
instead of iterators:
\begin{lstlisting}
sort(t, t+n);
reverse(t, t+n);
random_shuffle(t, t+n);
\end{lstlisting}
\subsubsection{Set iterators}
Iterators are often used to access
elements of a set.
The following code creates an iterator
\texttt{it} that points to the first element in the set:
\begin{lstlisting}
set<int>::iterator it = s.begin();
\end{lstlisting}
The function $\texttt{find}(x)$ returns an iterator
that points to an element whose value is $x$.
If the set does not contain $x$,
the return value is \texttt{end}:
\begin{lstlisting}
auto it = s.find(x);
if (it == s.end()) cout << "x is missing";
\end{lstlisting}
The function $\texttt{lower\_bound}(x)$ returns
an iterator to the smallest element
whose value is \emph{at least} $x$, and
the function $\texttt{upper\_bound}(x)$
returns an iterator to the smallest element
whose value is \emph{larger than} $x$.
If such elements do not exist,
the return value of the functions will be \texttt{end}.
These functions are not supported by the
\texttt{unordered\_set} structure,
which does not maintain the order of the elements.

\index{bitset}
A \key{bitset} is an array where each value
is either 0 or 1.
For example, the following code creates a bitset
that contains 10 elements:
\begin{lstlisting}
bitset<10> s;
s[1] = 1;
s[3] = 1;
s[5] = 1;
s[7] = 1;
cout << s[4] << "\n"; // 0
cout << s[5] << "\n"; // 1
\end{lstlisting}
The benefit in using bitsets is that
they require less memory than ordinary arrays,
because each element in a bitset only
uses one bit of memory.
For example,
if $n$ bits are stored in an \texttt{int} array,
$32n$ bits of memory will be used,
but a corresponding bitset only requires $n$ bits of memory.
In addition, the values of a bitset
can be efficiently manipulated using
bit operators, which makes it possible to
optimize algorithms using bitsets.
The following code shows another way to create a bitset:
\begin{lstlisting}
bitset<10> b(string("0010011010"));
cout << b[4] << "\n"; // 1
cout << b[5] << "\n"; // 0
\end{lstlisting}
Bit operations can be applied directly to bitsets:
\begin{lstlisting}
bitset<10> a(string("0010110110"));
bitset<10> b(string("1011011000"));
cout << (a&b) << "\n"; // 0010010000
cout << (a|b) << "\n"; // 1011111110
cout << (a^b) << "\n"; // 1001101110
\end{lstlisting}
A \texttt{deque} is a dynamic array
whose size can be changed at both ends of the array.
Like a vector, a deque contains the functions
\texttt{push\_back} and \texttt{pop\_back}, but
it also contains the functions
\texttt{push\_front} and \texttt{pop\_front}
that are not available in a vector.
The following code shows how a deque can be used:
\begin{lstlisting}
deque<int> d;
d.push_back(5); // [5]
d.push_back(2); // [5,2]
d.push_front(3); // [3,5,2]
d.pop_back(); // [3,5]
d.pop_front(); // [5]
\end{lstlisting}
The internal implementation of a deque
is more complex than that of a vector.
For this reason, a deque is slower than a vector.
Still, the time complexity of adding and removing
elements is $O(1)$ on average at both ends.

\index{queue}
A \key{queue} is a data structure that
provides two $O(1)$ time operations:
adding an element to the end of the queue,
and removing the first element in the queue.
It is only possible to access the first
and last element of a queue.
The following code shows how a queue can be used:
\begin{lstlisting}
queue<int> q;
q.push(3);
q.push(2);
q.push(5);
cout << q.front() << "\n"; // 3
q.pop();
cout << q.front() << "\n"; // 2
\end{lstlisting}

\index{priority queue}
A \key{priority queue}
maintains a set of elements.
The supported operations are insertion and,
depending on the type of the queue,
retrieval and removal of
either the minimum element or the maximum element.
either the minimum or maximum element.
The time complexity is $O(\log n)$
for insertion and removal and $O(1)$ for retrieval.

By default, the elements in the C++
priority queue are sorted in decreasing order,
and it is possible to find and remove the
largest element in the queue.
The following code illustrates this:
\begin{lstlisting}
priority_queue<int> q;
q.push(3);
q.push(5);
q.push(7);
q.push(2);
cout << q.top() << "\n"; // 7
q.pop();
cout << q.top() << "\n"; // 5
q.pop();
q.push(6);
cout << q.top() << "\n"; // 6
q.pop();
\end{lstlisting}
Using the following declaration,
we can create a priority queue
that allows us to find and remove
the minimum element:
\begin{lstlisting}
priority_queue<int,vector<int>,greater<int>> q;

As an example, consider the problem of finding
the numbers that belong to both of two lists
$A$ and $B$, each containing $n$ numbers.
A straightforward solution to the problem is
to go through all pairs of numbers in $O(n^2)$ time,
but next we will concentrate on
more efficient algorithms.
\subsubsection{Algorithm 1}
We construct a set of the numbers that appear in $A$,
and after this, we iterate through the numbers
in $B$ and check for each number if it
also belongs to $A$.
This is efficient because the numbers of $A$
are in a set.
Using the \texttt{set} structure,
the time complexity of the algorithm is $O(n \log n)$.
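As a sketch (our own code; the vectors
\texttt{a} and \texttt{b} contain the lists):
\begin{lstlisting}
// algorithm 1: store A in a set, then probe with B
set<int> s(a.begin(), a.end());
int common = 0;
for (int x : b) {
    if (s.count(x)) common++;
}
\end{lstlisting}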
\subsubsection{Algorithm 2}
It is not needed to maintain an ordered set,
so instead of the \texttt{set} structure
@ -693,7 +694,7 @@ more efficient, because we only have to change
the underlying data structure.
The time complexity of the new algorithm is $O(n)$.
\subsubsection{Algorithm 3}
Instead of data structures, we can use sorting.
First, we sort both lists $A$ and $B$.
After this, we go through both lists at the same
time and find the common elements in $O(n)$ time.
Sorting takes $O(n \log n)$ time,
so the total time complexity is $O(n \log n)$.
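A sketch of this idea (again using vectors
\texttt{a} and \texttt{b}; the code assumes that
the elements within each list are distinct):
\begin{lstlisting}
sort(a.begin(), a.end());
sort(b.begin(), b.end());
int common = 0;
int i = 0, j = 0;
int na = a.size(), nb = b.size();
while (i < na && j < nb) {
    if (a[i] < b[j]) i++;
    else if (a[i] > b[j]) j++;
    else {common++; i++; j++;}
}
\end{lstlisting}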
The following table shows how efficient
the above algorithms are when $n$ varies and
the elements of the lists are random
integers between $1 \ldots 10^9$:
\begin{center}
\begin{tabular}{rrrr}
$n$ & algorithm 1 & algorithm 2 & algorithm 3 \\
\hline
$10^6$ & $1{,}5$ s & $0{,}3$ s & $0{,}2$ s \\
$2 \cdot 10^6$ & $3{,}7$ s & $0{,}8$ s & $0{,}3$ s \\
$5 \cdot 10^6$ & $10{,}0$ s & $2{,}3$ s & $0{,}9$ s \\
\end{tabular}
\end{center}
Algorithms 1 and 2 are equal except that
they use different set structures.
In this problem, this choice has an important effect on
the running time, because algorithm 2
is 45 times faster than algorithm 1.
However, the most efficient algorithm is algorithm 3,
which uses sorting.
It only uses half of the time compared to algorithm 2.
Interestingly, the time complexity of both
algorithm 1 and algorithm 3 is $O(n \log n)$,
but despite this, algorithm 3 is ten times faster.
This can be explained by the fact that
sorting is a simple procedure and it is done
only once at the beginning of algorithm 3,
and the rest of the algorithm works in linear time.
On the other hand,
algorithm 1 maintains a complex balanced binary tree
during the whole algorithm.