\chapter{Data structures}

\index{data structure}

A \key{data structure} is a way to store
data in the memory of a computer.
It is important to choose an appropriate
data structure for a problem,
because each data structure has its own
advantages and disadvantages.
The crucial question is: which operations
are efficient in the chosen data structure?

This chapter introduces the most important
data structures in the C++ standard library.
It is a good idea to use the standard library
whenever possible,
because it will save a lot of time.
Later in the book we will learn about more sophisticated
data structures that are not available
in the standard library.

\section{Dynamic arrays}

\index{dynamic array}
\index{vector}

A \key{dynamic array} is an array whose
size can be changed during the execution
of the program.
The most popular dynamic array in C++ is
the \texttt{vector} structure,
which can be used almost like an ordinary array.

The following code creates an empty vector and
adds three elements to it:

\begin{lstlisting}
vector<int> v;
v.push_back(3); // [3]
v.push_back(2); // [3,2]
v.push_back(5); // [3,2,5]
\end{lstlisting}

After this, the elements can be accessed like in an ordinary array:

\begin{lstlisting}
cout << v[0] << "\n"; // 3
cout << v[1] << "\n"; // 2
cout << v[2] << "\n"; // 5
\end{lstlisting}

The function \texttt{size} returns the number of elements in the vector.
The following code iterates through
the vector and prints all elements in it:

\begin{lstlisting}
for (int i = 0; i < v.size(); i++) {
    cout << v[i] << "\n";
}
\end{lstlisting}

\begin{samepage}
A shorter way to iterate through a vector is as follows:

\begin{lstlisting}
for (auto x : v) {
    cout << x << "\n";
}
\end{lstlisting}
\end{samepage}

The function \texttt{back} returns the last element
in the vector, and
the function \texttt{pop\_back} removes the last element:

\begin{lstlisting}
vector<int> v;
v.push_back(5);
v.push_back(2);
cout << v.back() << "\n"; // 2
v.pop_back();
cout << v.back() << "\n"; // 5
\end{lstlisting}

The following code creates a vector with five elements:

\begin{lstlisting}
vector<int> v = {2,4,2,5,1};
\end{lstlisting}

Another way to create a vector is to give the number
of elements and the initial value for each element:

\begin{lstlisting}
// size 10, initial value 0
vector<int> v(10);
\end{lstlisting}
\begin{lstlisting}
// size 10, initial value 5
vector<int> v(10, 5);
\end{lstlisting}

The internal implementation of a vector
uses an ordinary array.
If the size of the vector increases and
the array becomes too small,
a new array is allocated and all the
elements are moved to the new array.
However, this does not happen often, and the
amortized time complexity of
\texttt{push\_back} is $O(1)$.
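
To see the effect of reallocation in practice, the following sketch
counts how many times the capacity of a vector changes while $n$
elements are appended.
The helper \texttt{count\_reallocations} is a hypothetical name used
only for illustration, and the exact growth strategy depends on the
standard library implementation, so the count may differ between compilers.

\begin{lstlisting}
#include <vector>
using namespace std;

// Hypothetical helper: appends n elements and counts how many
// times the capacity of the vector changes along the way.
int count_reallocations(int n) {
    vector<int> v;
    int changes = 0;
    auto cap = v.capacity();
    for (int i = 0; i < n; i++) {
        v.push_back(i);
        if (v.capacity() != cap) {
            // the internal array was replaced by a larger one
            cap = v.capacity();
            changes++;
        }
    }
    return changes;
}
\end{lstlisting}

If the final size is known in advance, calling \texttt{v.reserve(n)}
before the loop is one way to avoid the intermediate reallocations.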

\index{string}

The \texttt{string} structure
is also a dynamic array that can be used almost like a vector.
In addition, there is special syntax for strings
that is not available in other data structures.
Strings can be combined using the \texttt{+} symbol.
The function $\texttt{substr}(k,x)$ returns the substring
that begins at position $k$ and has length $x$,
and the function $\texttt{find}(\texttt{t})$ finds the position
of the first occurrence of a substring \texttt{t}.

The following code presents some string operations:

\begin{lstlisting}
string a = "hatti";
string b = a+a;
cout << b << "\n"; // hattihatti
b[5] = 'v';
cout << b << "\n"; // hattivatti
string c = b.substr(3,4);
cout << c << "\n"; // tiva
\end{lstlisting}
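
The function \texttt{find} is not used in the code above, so here is
a minimal sketch of how it might be called.
When the substring does not occur, \texttt{find} returns the special
value \texttt{string::npos}.
The wrapper \texttt{first\_occurrence} and its convention of returning
$-1$ are assumptions made for this example only.

\begin{lstlisting}
#include <string>
using namespace std;

// Hypothetical helper: returns the position of the first
// occurrence of t in s, or -1 if t does not occur in s.
int first_occurrence(const string& s, const string& t) {
    auto pos = s.find(t);
    if (pos == string::npos) return -1;
    return (int)pos;
}
\end{lstlisting}

For example, for the string \texttt{hattivatti}, the call
\texttt{first\_occurrence("hattivatti", "vat")} returns 5.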

\section{Set structures}

\index{set}

A \key{set} is a data structure that
maintains a collection of elements.
The basic operations of sets are element
insertion, search and removal.

The C++ standard library contains two set
implementations.
The structure \texttt{set} is based on a balanced
binary tree and its operations work in $O(\log n)$ time.
The structure \texttt{unordered\_set} uses hashing,
and its operations work in $O(1)$ time on average.

The choice of which set implementation to use
is often a matter of taste.
The benefit of the \texttt{set} structure
is that it keeps the elements in sorted order
and provides functions that are not available
in \texttt{unordered\_set}.
On the other hand, \texttt{unordered\_set}
can be more efficient.

The following code creates a set
that contains integers,
and shows some of the operations.
The function \texttt{insert} adds an element to the set,
the function \texttt{count} returns the number of occurrences
of an element in the set,
and the function \texttt{erase} removes an element from the set.

\begin{lstlisting}
set<int> s;
s.insert(3);
s.insert(2);
s.insert(5);
cout << s.count(3) << "\n"; // 1
cout << s.count(4) << "\n"; // 0
s.erase(3);
s.insert(4);
cout << s.count(3) << "\n"; // 0
cout << s.count(4) << "\n"; // 1
\end{lstlisting}

A set can be used mostly like a vector,
but it is not possible to access
the elements using the \texttt{[]} notation.
The following code creates a set,
prints the number of elements in it, and then
iterates through all the elements:
\begin{lstlisting}
set<int> s = {2,5,6,8};
cout << s.size() << "\n"; // 4
for (auto x : s) {
    cout << x << "\n";
}
\end{lstlisting}

An important property of sets is
that all their elements are \emph{distinct}.
Thus, the function \texttt{count} always returns
either 0 (the element is not in the set)
or 1 (the element is in the set),
and the function \texttt{insert} never adds
an element to the set if it is
already there.
The following code illustrates this:

\begin{lstlisting}
set<int> s;
s.insert(5);
s.insert(5);
s.insert(5);
cout << s.count(5) << "\n"; // 1
\end{lstlisting}
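
Because duplicates are ignored, a set gives a simple way to count
the distinct values in a list.
The following sketch shows one possible way to do this;
the function name \texttt{count\_distinct} is hypothetical.

\begin{lstlisting}
#include <set>
#include <vector>
using namespace std;

// Hypothetical helper: returns the number of distinct values in a.
int count_distinct(const vector<int>& a) {
    set<int> s;
    for (int x : a) s.insert(x); // duplicates are ignored
    return (int)s.size();
}
\end{lstlisting}

Each insertion takes $O(\log n)$ time, so the sketch works
in $O(n \log n)$ time for a list of $n$ values.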

C++ also contains the structures
\texttt{multiset} and \texttt{unordered\_multiset},
which work like \texttt{set}
and \texttt{unordered\_set}
except that they can contain multiple instances of an element.
For example, in the following code all three instances
of the number 5 are added to a multiset:

\begin{lstlisting}
multiset<int> s;
s.insert(5);
s.insert(5);
s.insert(5);
cout << s.count(5) << "\n"; // 3
\end{lstlisting}
The function \texttt{erase} removes
all instances of an element
from a multiset:
\begin{lstlisting}
s.erase(5);
cout << s.count(5) << "\n"; // 0
\end{lstlisting}
Often, only one instance should be removed.
If the multiset again contains three instances of the number 5,
this can be done as follows:
\begin{lstlisting}
s.erase(s.find(5));
cout << s.count(5) << "\n"; // 2
\end{lstlisting}
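
Note that \texttt{s.erase(s.find(5))} is only safe if the element
is actually in the multiset, because passing \texttt{end} to
\texttt{erase} is not allowed.
A defensive variant could look like the following sketch,
where \texttt{erase\_one} is a hypothetical helper name.

\begin{lstlisting}
#include <set>
using namespace std;

// Hypothetical helper: removes one instance of x from s, if any.
// Returns true if an element was removed.
bool erase_one(multiset<int>& s, int x) {
    auto it = s.find(x);
    if (it == s.end()) return false; // x is not in the multiset
    s.erase(it);                     // removes only this instance
    return true;
}
\end{lstlisting}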

\section{Map structures}

\index{map}

A \key{map} is a generalized array
that consists of key-value pairs.
While the keys in an ordinary array are always
the consecutive integers $0,1,\ldots,n-1$,
where $n$ is the size of the array,
the keys in a map can be of any data type and
they do not have to be consecutive values.

The C++ standard library contains two map
implementations that correspond to the set
implementations: the structure
\texttt{map} is based on a balanced
binary tree and accessing elements
takes $O(\log n)$ time,
while the structure
\texttt{unordered\_map} uses hashing
and accessing elements takes $O(1)$ time on average.

The following code creates a map
where the keys are strings and the values are integers:

\begin{lstlisting}
map<string,int> m;
m["monkey"] = 4;
m["banana"] = 3;
m["harpsichord"] = 9;
cout << m["banana"] << "\n"; // 3
\end{lstlisting}

If the value of a key is requested using the \texttt{[]} notation
but the map does not contain it,
the key is automatically added to the map with
a default value.
For example, in the following code,
the key ``aybabtu'' with value 0
is added to the map.

\begin{lstlisting}
map<string,int> m;
cout << m["aybabtu"] << "\n"; // 0
\end{lstlisting}
The function \texttt{count} checks
if a key exists in a map:
\begin{lstlisting}
if (m.count("aybabtu")) {
    // key exists
}
\end{lstlisting}
The following code prints all the keys and values
in a map:
\begin{lstlisting}
for (auto x : m) {
    cout << x.first << " " << x.second << "\n";
}
\end{lstlisting}
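
The automatic insertion of missing keys is convenient when counting
occurrences.
As a small sketch of this idea, the following function counts
how many times each word appears in a list;
the name \texttt{word\_counts} is chosen here only for illustration.

\begin{lstlisting}
#include <map>
#include <string>
#include <vector>
using namespace std;

// Hypothetical helper: counts how many times each word occurs.
map<string,int> word_counts(const vector<string>& words) {
    map<string,int> m;
    for (const string& w : words) {
        m[w]++; // a missing key is first added with value 0
    }
    return m;
}
\end{lstlisting}

To check for a word afterwards without adding it to the map,
the function \texttt{count} can be used instead of the \texttt{[]} notation.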

\section{Iterators and ranges}

\index{iterator}

Many functions in the C++ standard library
operate with iterators.
An \key{iterator} is a variable that points
to an element in a data structure.

The frequently used iterators \texttt{begin}
and \texttt{end} define a range that contains
all elements in a data structure.
The iterator \texttt{begin} points to
the first element in the data structure,
and the iterator \texttt{end} points to
the position \emph{after} the last element.
The situation looks as follows:

\begin{center}
\begin{tabular}{llllllllll}
\{ & 3, & 4, & 6, & 8, & 12, & 13, & 14, & 17 & \} \\
& $\uparrow$ & & & & & & & & $\uparrow$ \\
& \multicolumn{3}{l}{\texttt{s.begin()}} & & & & & & \texttt{s.end()} \\
\end{tabular}
\end{center}

Note the asymmetry in the iterators:
\texttt{s.begin()} points to an element in the data structure,
while \texttt{s.end()} points outside the data structure.
Thus, the range defined by the iterators is \emph{half-open}.

\subsubsection{Working with ranges}

Iterators are used in C++ standard library functions
that are given a range of elements in a data structure.
Usually, we want to process all elements in a
data structure, so the iterators
\texttt{begin} and \texttt{end} are given to the function.

For example, the following code sorts a vector
using the function \texttt{sort},
then reverses the order of the elements using the function
\texttt{reverse}, and finally shuffles the order of
the elements using the function \texttt{random\_shuffle}.

\index{sort@\texttt{sort}}
\index{reverse@\texttt{reverse}}
\index{random\_shuffle@\texttt{random\_shuffle}}

\begin{lstlisting}
sort(v.begin(), v.end());
reverse(v.begin(), v.end());
random_shuffle(v.begin(), v.end());
\end{lstlisting}

These functions can also be used with an ordinary array.
In this case, the functions are given pointers to the array
instead of iterators:

\newpage
\begin{lstlisting}
sort(t, t+n);
reverse(t, t+n);
random_shuffle(t, t+n);
\end{lstlisting}
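
Note that \texttt{random\_shuffle} was deprecated in C++14 and
removed in C++17, so the code above may not compile with newer
language versions.
The replacement is the function \texttt{shuffle}, which is given
a random number generator as a third parameter.
The following is a minimal sketch; the helper name
\texttt{shuffle\_with\_seed} and the use of a fixed seed are
assumptions made for this example.

\begin{lstlisting}
#include <algorithm>
#include <random>
#include <vector>
using namespace std;

// Hypothetical helper: shuffles v using a Mersenne Twister
// generator initialized with the given seed.
void shuffle_with_seed(vector<int>& v, unsigned seed) {
    mt19937 rng(seed);
    shuffle(v.begin(), v.end(), rng);
}
\end{lstlisting}

With an ordinary array, the same call can be written as
\texttt{shuffle(t, t+n, rng)}.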

\subsubsection{Set iterators}

Iterators are often used to access
elements of a set.
The following code creates an iterator
\texttt{it} that points to the smallest element in a set:
\begin{lstlisting}
set<int>::iterator it = s.begin();
\end{lstlisting}
A shorter way to write the code is as follows:
\begin{lstlisting}
auto it = s.begin();
\end{lstlisting}
The element to which an iterator points
can be accessed using the \texttt{*} symbol.
For example, the following code prints
the first element in the set:

\begin{lstlisting}
auto it = s.begin();
cout << *it << "\n";
\end{lstlisting}

Iterators can be moved using the operators
\texttt{++} (forward) and \texttt{--} (backward),
meaning that the iterator moves to the next
or previous element in the set.

The following code prints all the elements
in increasing order:
\begin{lstlisting}
for (auto it = s.begin(); it != s.end(); it++) {
    cout << *it << "\n";
}
\end{lstlisting}
The following code prints the largest element in the set:
\begin{lstlisting}
auto it = s.end(); it--;
cout << *it << "\n";
\end{lstlisting}

The function $\texttt{find}(x)$ returns an iterator
that points to an element whose value is $x$.
However, if the set does not contain $x$,
the iterator will be \texttt{end}.

\begin{lstlisting}
auto it = s.find(x);
if (it == s.end()) {
    // x is not found
}
\end{lstlisting}

The function $\texttt{lower\_bound}(x)$ returns
an iterator to the smallest element in the set
whose value is \emph{at least} $x$, and
the function $\texttt{upper\_bound}(x)$
returns an iterator to the smallest element in the set
whose value is \emph{larger than} $x$.
In both functions, if such an element does not exist,
the return value is \texttt{end}.
These functions are not supported by the
\texttt{unordered\_set} structure, which
does not maintain the order of the elements.

\begin{samepage}
For example, the following code finds the element
nearest to $x$:

\begin{lstlisting}
auto it = s.lower_bound(x);
if (it == s.begin()) {
    cout << *it << "\n";
} else if (it == s.end()) {
    it--;
    cout << *it << "\n";
} else {
    int a = *it; it--;
    int b = *it;
    if (x-b < a-x) cout << b << "\n";
    else cout << a << "\n";
}
\end{lstlisting}

The code assumes that the set is not empty,
and goes through all possible cases
using an iterator \texttt{it}.
First, the iterator points to the smallest
element whose value is at least $x$.
If \texttt{it} equals \texttt{begin},
the corresponding element is nearest to $x$.
If \texttt{it} equals \texttt{end},
the largest element in the set is nearest to $x$.
If none of the previous cases hold,
the element nearest to $x$ is either the
element that corresponds to \texttt{it} or the previous element.
\end{samepage}
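
The same case analysis can also be wrapped in a function that
returns the nearest element instead of printing it.
The following sketch is one possible way to write it;
the function name \texttt{nearest} is hypothetical, and,
like the code above, it assumes that the set is not empty.

\begin{lstlisting}
#include <iterator>
#include <set>
using namespace std;

// Hypothetical helper: returns the element of s nearest to x.
// Assumes that s is not empty. If two elements are equally
// near, the larger one is returned.
int nearest(const set<int>& s, int x) {
    auto it = s.lower_bound(x);
    if (it == s.begin()) return *it;
    if (it == s.end()) return *prev(it);
    int a = *it;        // smallest element that is at least x
    int b = *prev(it);  // largest element that is smaller than x
    return (x-b < a-x) ? b : a;
}
\end{lstlisting}

For example, for the set $\{3,4,6,8,12,13,14,17\}$,
the call \texttt{nearest(s, 9)} returns 8.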

\section{Other structures}

\subsubsection{Bitset}

\index{bitset}

A \key{bitset} is an array
where each element is either 0 or 1.
For example, the following code creates
a bitset that contains 10 elements:
\begin{lstlisting}
bitset<10> s;
s[1] = 1;
s[3] = 1;
s[4] = 1;
s[7] = 1;
cout << s[4] << "\n"; // 1
cout << s[5] << "\n"; // 0
\end{lstlisting}

The benefit of using bitsets is that
they require less memory than ordinary arrays,
because each element in a bitset only
uses one bit of memory.
For example,
if $n$ bits are stored in an \texttt{int} array,
$32n$ bits of memory will be used
(assuming 32-bit \texttt{int} values),
but a corresponding bitset only requires $n$ bits of memory.
In addition, the values of a bitset
can be efficiently manipulated using
bit operators, which makes it possible to
optimize algorithms using bitsets.

The following code shows another way to create the above bitset:
\begin{lstlisting}
bitset<10> s(string("0010011010")); // from right to left
cout << s[4] << "\n"; // 1
cout << s[5] << "\n"; // 0
\end{lstlisting}

The function \texttt{count} returns the number
of ones in the bitset:

\begin{lstlisting}
bitset<10> s(string("0010011010"));
cout << s.count() << "\n"; // 4
\end{lstlisting}

The following code shows examples of using bit operations:
\begin{lstlisting}
bitset<10> a(string("0010110110"));
bitset<10> b(string("1011011000"));
cout << (a&b) << "\n"; // 0010010000
cout << (a|b) << "\n"; // 1011111110
cout << (a^b) << "\n"; // 1001101110
\end{lstlisting}
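
Bit operations and the function \texttt{count} can be combined.
As a small sketch, the following function computes the number of
positions where two bitsets differ by counting the ones in
their xor.
The function name \texttt{hamming} and the fixed size 10 are
assumptions made for this example.

\begin{lstlisting}
#include <bitset>
#include <string>
using namespace std;

// Hypothetical helper: number of positions where a and b differ.
int hamming(const bitset<10>& a, const bitset<10>& b) {
    return (int)(a^b).count();
}
\end{lstlisting}

For the bitsets \texttt{a} and \texttt{b} above, the xor is
\texttt{1001101110}, so the function returns 6.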

\subsubsection{Deque}

\index{deque}

A \key{deque} is a dynamic array
whose size can be efficiently
changed at both ends of the array.
Like a vector, a deque provides the functions
\texttt{push\_back} and \texttt{pop\_back}, but
it also provides the functions
\texttt{push\_front} and \texttt{pop\_front},
which are not available in a vector.

A deque can be used as follows:
\begin{lstlisting}
deque<int> d;
d.push_back(5); // [5]
d.push_back(2); // [5,2]
d.push_front(3); // [3,5,2]
d.pop_back(); // [3,5]
d.pop_front(); // [5]
\end{lstlisting}

The internal implementation of a deque
is more complex than that of a vector,
and for this reason, a deque is slower than a vector.
Still, both adding and removing
elements take $O(1)$ time on average at both ends.

\subsubsection{Stack}

\index{stack}

A \key{stack}
is a data structure that provides two
$O(1)$ time operations:
adding an element to the top,
and removing an element from the top.
It is only possible to access the top
element of a stack.

The following code shows how a stack can be used:
\begin{lstlisting}
stack<int> s;
s.push(3);
s.push(2);
s.push(5);
cout << s.top(); // 5
s.pop();
cout << s.top(); // 2
\end{lstlisting}
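
A typical use of a stack is to match opening and closing brackets.
The following sketch checks whether the brackets in a string
are balanced; the function name \texttt{balanced} and the choice
of bracket characters are assumptions made for this example.

\begin{lstlisting}
#include <stack>
#include <string>
using namespace std;

// Hypothetical helper: checks whether the brackets in t
// are balanced. Other characters are ignored.
bool balanced(const string& t) {
    stack<char> s;
    for (char c : t) {
        if (c == '(' || c == '[') {
            s.push(c);
        } else if (c == ')' || c == ']') {
            if (s.empty()) return false;
            char open = s.top(); s.pop();
            if ((c == ')' && open != '(') ||
                (c == ']' && open != '[')) return false;
        }
    }
    return s.empty(); // no unmatched opening brackets remain
}
\end{lstlisting}

Each character causes at most one push and one pop,
so the sketch works in $O(n)$ time for a string of length $n$.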

\subsubsection{Queue}

\index{queue}

A \key{queue} also
provides two $O(1)$ time operations:
adding an element to the end of the queue,
and removing the first element in the queue.
It is only possible to access the first
and last elements of a queue.

The following code shows how a queue can be used:
\begin{lstlisting}
queue<int> q;
q.push(3);
q.push(2);
q.push(5);
cout << q.front(); // 3
q.pop();
cout << q.front(); // 2
\end{lstlisting}

\subsubsection{Priority queue}

\index{priority queue}
\index{heap}

A \key{priority queue}
maintains a collection of elements.
The supported operations are insertion and,
depending on the type of the queue,
retrieval and removal of
either the minimum or maximum element.
Insertion and removal take $O(\log n)$ time,
and retrieval takes $O(1)$ time.

While an ordered set efficiently supports
all the operations of a priority queue,
the benefit of using a priority queue is
that it has smaller constant factors.
A priority queue is usually implemented using
a heap structure that is much simpler than the
balanced binary tree used in an ordered set.

\begin{samepage}
By default, the elements in a C++
priority queue are sorted in decreasing order,
and it is possible to find and remove the
largest element in the queue.
The following code illustrates this:

\begin{lstlisting}
priority_queue<int> q;
q.push(3);
q.push(5);
q.push(7);
q.push(2);
cout << q.top() << "\n"; // 7
q.pop();
cout << q.top() << "\n"; // 5
q.pop();
q.push(6);
cout << q.top() << "\n"; // 6
q.pop();
\end{lstlisting}
\end{samepage}

If we want to create a priority queue
that supports finding and removing
the smallest element,
we can do it as follows:

\begin{lstlisting}
priority_queue<int,vector<int>,greater<int>> q;
\end{lstlisting}
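
As an illustration of such a queue, the following sketch extracts
the $k$ smallest elements of a list in increasing order.
The function name \texttt{k\_smallest} is hypothetical,
and the queue is filled directly from the list using the
iterator range constructor.

\begin{lstlisting}
#include <functional>
#include <queue>
#include <vector>
using namespace std;

// Hypothetical helper: returns the k smallest elements of a
// in increasing order (or all elements if a has fewer than k).
vector<int> k_smallest(const vector<int>& a, int k) {
    priority_queue<int,vector<int>,greater<int>> q(a.begin(), a.end());
    vector<int> result;
    while (k > 0 && !q.empty()) {
        result.push_back(q.top()); // current smallest element
        q.pop();
        k--;
    }
    return result;
}
\end{lstlisting}

For example, for the list $[3,5,7,2]$ and $k=2$,
the function returns $[2,3]$.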
|
|
|
|
|
|
2017-05-04 20:28:47 +02:00
|
|
|
|
\subsubsection{Policy-based data structures}
|
|
|
|
|
|
|
|
|
|
The \texttt{g++} compiler also supports
|
|
|
|
|
some data structures that are not part
|
|
|
|
|
of the C++ standard library.
|
|
|
|
|
Such structures are called \emph{policy-based}
|
|
|
|
|
data structures.
|
|
|
|
|
To use these structures, the following lines
|
|
|
|
|
must be added to the code:
\begin{lstlisting}
#include <ext/pb_ds/assoc_container.hpp>
using namespace __gnu_pbds;
\end{lstlisting}
After this, we can define a data structure \texttt{indexed\_set} that
is like \texttt{set} but can be indexed like an array.
The definition for \texttt{int} values is as follows:
\begin{lstlisting}
typedef tree<int,null_type,less<int>,rb_tree_tag,
             tree_order_statistics_node_update> indexed_set;
\end{lstlisting}
Now we can create a set as follows:
\begin{lstlisting}
indexed_set s;
s.insert(2);
s.insert(3);
s.insert(7);
s.insert(9);
\end{lstlisting}
The speciality of this set is that we have access to
the indices that the elements would have in a sorted array.
The function \texttt{find\_by\_order} returns
an iterator to the element at a given position
(positions are counted from zero):
\begin{lstlisting}
auto x = s.find_by_order(2);
cout << *x << "\n"; // 7
\end{lstlisting}
The function \texttt{order\_of\_key}
returns the position of a given element:
\begin{lstlisting}
cout << s.order_of_key(7) << "\n"; // 2
\end{lstlisting}
If the element does not appear in the set,
we get the position that the element would have
in the set:
\begin{lstlisting}
cout << s.order_of_key(6) << "\n"; // 2
cout << s.order_of_key(8) << "\n"; // 3
\end{lstlisting}
Both functions work in logarithmic time.
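The pieces above can be combined into a small sketch.
The helper names \texttt{kth\_smallest} and \texttt{count\_in\_range}
are assumptions made for this example and are not part of the library.
The extra header \texttt{tree\_policy.hpp} is included only to be explicit.

```cpp
#include <bits/stdc++.h>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>
using namespace std;
using namespace __gnu_pbds;

typedef tree<int,null_type,less<int>,rb_tree_tag,
             tree_order_statistics_node_update> indexed_set;

// Hypothetical helper: returns the k-th smallest element,
// where k = 0 refers to the smallest element.
// Assumes 0 <= k < s.size().
int kth_smallest(indexed_set& s, int k) {
    return *s.find_by_order(k);
}

// Hypothetical helper: counts the elements x with a <= x <= b.
// order_of_key(v) is the number of elements smaller than v.
// Assumes a <= b and b < INT_MAX, so that b+1 does not overflow.
int count_in_range(indexed_set& s, int a, int b) {
    return (int)s.order_of_key(b+1) - (int)s.order_of_key(a);
}
```

With the set $\{2,3,7,9\}$ from above, \texttt{kth\_smallest(s,2)} returns 7
and \texttt{count\_in\_range(s,3,8)} returns 2.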
\section{Comparison to sorting}

It is often possible to solve a problem
using either data structures or sorting.
Sometimes there are remarkable differences
in the actual efficiency of these approaches,
which may be hidden in their time complexities.

Let us consider a problem where
we are given two lists $A$ and $B$
that both contain $n$ elements.
Our task is to calculate the number of elements
that belong to both of the lists.
For example, for the lists
\[A = [5,2,8,9,4] \hspace{10px} \textrm{and} \hspace{10px} B = [3,2,9,5],\]
the answer is 3 because the numbers 2, 5
and 9 belong to both of the lists.

A straightforward solution to the problem is
to go through all pairs of elements in $O(n^2)$ time,
but next we will focus on
more efficient algorithms.
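For reference, the straightforward solution could be sketched as follows.
The function name \texttt{count\_common\_pairs} is an assumption made for this example,
and the sketch assumes that the elements within each list are distinct.

```cpp
#include <bits/stdc++.h>
using namespace std;

// Hypothetical helper: compares every element of a with every
// element of b, which takes O(n^2) time.
// Assumption: the elements within each list are distinct.
int count_common_pairs(const vector<int>& a, const vector<int>& b) {
    int common = 0;
    for (int x : a) {
        for (int y : b) {
            if (x == y) common++;
        }
    }
    return common;
}
```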
\subsubsection{Algorithm 1}

We construct a set of the elements that appear in $A$,
and after this, we iterate through the elements
of $B$ and check for each element whether it
also belongs to $A$.
This is efficient because the elements of $A$
are in a set.
Using the \texttt{set} structure,
the time complexity of the algorithm is $O(n \log n)$.
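A minimal sketch of Algorithm 1 might look as follows.
The function name \texttt{count\_common\_set} is an assumption made for this example,
and the sketch assumes that the elements within each list are distinct.

```cpp
#include <bits/stdc++.h>
using namespace std;

// Hypothetical helper implementing the idea of Algorithm 1.
// Assumption: the elements within each list are distinct.
int count_common_set(const vector<int>& a, const vector<int>& b) {
    set<int> s(a.begin(), a.end()); // build the set in O(n log n) time
    int common = 0;
    for (int y : b) {
        if (s.count(y)) common++;   // each lookup takes O(log n) time
    }
    return common;
}
```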
\subsubsection{Algorithm 2}

It is not necessary to maintain an ordered set,
so instead of the \texttt{set} structure
we can also use the \texttt{unordered\_set} structure.
This is an easy way to make the algorithm
more efficient, because we only have to change
the underlying data structure.
The expected time complexity of the new algorithm is $O(n)$,
because the operations of a hash set work in $O(1)$ time on average.
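A minimal sketch of Algorithm 2 might look as follows.
The function name \texttt{count\_common\_hash} is an assumption made for this example,
and the sketch assumes that the elements within each list are distinct.

```cpp
#include <bits/stdc++.h>
using namespace std;

// Hypothetical helper implementing the idea of Algorithm 2.
// Only the container type differs from the ordered-set version.
// Assumption: the elements within each list are distinct.
int count_common_hash(const vector<int>& a, const vector<int>& b) {
    unordered_set<int> s(a.begin(), a.end()); // hash set of the elements of a
    int common = 0;
    for (int y : b) {
        if (s.count(y)) common++; // O(1) time on average per lookup
    }
    return common;
}
```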
\subsubsection{Algorithm 3}

Instead of data structures, we can use sorting.
First, we sort both lists $A$ and $B$.
After this, we iterate through both lists
at the same time and find the common elements.
The time complexity of sorting is $O(n \log n)$,
and the rest of the algorithm works in $O(n)$ time,
so the total time complexity is $O(n \log n)$.
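A minimal sketch of Algorithm 3 might look as follows.
The function name \texttt{count\_common\_sorted} is an assumption made for this example,
and the sketch assumes that the elements within each list are distinct.

```cpp
#include <bits/stdc++.h>
using namespace std;

// Hypothetical helper implementing the idea of Algorithm 3.
// The lists are passed by value, so the caller's lists stay unchanged.
// Assumption: the elements within each list are distinct.
int count_common_sorted(vector<int> a, vector<int> b) {
    sort(a.begin(), a.end());
    sort(b.begin(), b.end());
    int common = 0;
    size_t i = 0, j = 0;
    // Walk through both sorted lists at the same time.
    while (i < a.size() && j < b.size()) {
        if (a[i] < b[j]) {
            i++;            // a[i] cannot appear later in b
        } else if (a[i] > b[j]) {
            j++;            // b[j] cannot appear later in a
        } else {
            common++;       // common element found
            i++;
            j++;
        }
    }
    return common;
}
```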
\subsubsection{Efficiency comparison}

The following table shows how efficient
the above algorithms are when $n$ varies and
the elements of the lists are random
integers between $1$ and $10^9$:

\begin{center}
\begin{tabular}{rrrr}
$n$ & Algorithm 1 & Algorithm 2 & Algorithm 3 \\
\hline
$10^6$ & $1.5$ s & $0.3$ s & $0.2$ s \\
$2 \cdot 10^6$ & $3.7$ s & $0.8$ s & $0.3$ s \\
$3 \cdot 10^6$ & $5.7$ s & $1.3$ s & $0.5$ s \\
$4 \cdot 10^6$ & $7.7$ s & $1.7$ s & $0.7$ s \\
$5 \cdot 10^6$ & $10.0$ s & $2.3$ s & $0.9$ s \\
\end{tabular}
\end{center}
Algorithms 1 and 2 are identical except that
they use different set structures.
In this problem, this choice has an important effect on
the running time, because Algorithm 2
is 4–5 times faster than Algorithm 1.

However, the most efficient algorithm is Algorithm 3,
which uses sorting.
It uses only about half the time of Algorithm 2.
Interestingly, the time complexity of both
Algorithm 1 and Algorithm 3 is $O(n \log n)$,
but despite this, Algorithm 3 is about ten times faster.
This can be explained by the fact that
sorting is a simple procedure and it is done
only once at the beginning of Algorithm 3,
and the rest of the algorithm works in linear time.
On the other hand,
Algorithm 1 maintains a complex balanced binary tree
during the whole algorithm.
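The text does not describe how the running times were measured.
One possible way to carry out such an experiment is sketched below.
The helper names \texttt{random\_list} and \texttt{seconds\_taken},
the fixed seed and the use of \texttt{steady\_clock} are all assumptions made for this example.

```cpp
#include <bits/stdc++.h>
using namespace std;

// Hypothetical helper: generates n random integers
// between 1 and 10^9 using the given seed.
vector<int> random_list(int n, unsigned seed) {
    mt19937 rng(seed);
    uniform_int_distribution<int> dist(1, 1000000000);
    vector<int> v(n);
    for (int& x : v) x = dist(rng);
    return v;
}

// Hypothetical helper: runs f once and returns
// the elapsed wall-clock time in seconds.
template<class F>
double seconds_taken(F f) {
    auto start = chrono::steady_clock::now();
    f();
    auto end = chrono::steady_clock::now();
    return chrono::duration<double>(end - start).count();
}
```

To time an algorithm, we can generate two lists with \texttt{random\_list}
and pass a lambda function that runs the algorithm to \texttt{seconds\_taken}.
The measured times depend on the computer and the compiler settings.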