Improve language
parent
1a43cf875e
commit
758ed890ce
177
chapter10.tex
|
@ -2,43 +2,41 @@
|
||||||
|
|
||||||
All data in computer programs is internally stored as bits,
|
All data in computer programs is internally stored as bits,
|
||||||
i.e., as numbers 0 and 1.
|
i.e., as numbers 0 and 1.
|
||||||
In this chapter, we will learn how integers
|
This chapter discusses the bit representation
|
||||||
are represented as bits, and how bit operations
|
of integers, and shows examples
|
||||||
can be used to manipulate them.
|
of how to use bit operations.
|
||||||
It turns out that there are many uses for
|
It turns out that there are many uses for
|
||||||
bit operations in algorithm programming.
|
bit manipulation in algorithm programming.
|
||||||
|
|
||||||
\section{Bit representation}
|
\section{Bit representation}
|
||||||
|
|
||||||
\index{bit representation}
|
\index{bit representation}
|
||||||
|
|
||||||
Every nonnegative integer can be represented as a sum
|
In programming, an $n$-bit integer is internally
|
||||||
\[c_k 2^k + \ldots + c_2 2^2 + c_1 2^1 + c_0 2^0,\]
|
stored as a binary number that consists of $n$ bits.
|
||||||
where each coefficient $c_i$ is either 0 or 1.
|
For example, the C++ type \texttt{int} is
|
||||||
The bit representation of such a number is
|
a 32-bit type, which means that every \texttt{int}
|
||||||
$c_k \cdots c_2 c_1 c_0$.
|
number consists of 32 bits.
|
||||||
For example, the number 43 corresponds to the sum
|
|
||||||
\[1 \cdot 2^5 + 0 \cdot 2^4 + 1 \cdot 2^3 + 0 \cdot 2^2 + 1 \cdot 2^1 + 1 \cdot 2^0,\]
|
|
||||||
so the bit representation of the number is 101011.
|
|
||||||
|
|
||||||
In programming, the length of the bit representation
|
Here is the bit representation of
|
||||||
depends on the data type of the number.
|
the \texttt{int} number 43:
|
||||||
For example, in C++ the type \texttt{int} is
|
|
||||||
usually a 32-bit type and an \texttt{int} number
|
|
||||||
consists of 32 bits.
|
|
||||||
Thus, the bit representation of 43
|
|
||||||
as an \texttt{int} number is as follows:
|
|
||||||
\[00000000000000000000000000101011\]
|
\[00000000000000000000000000101011\]
|
||||||
|
The bits in the representation are indexed from right to left.
|
||||||
|
To convert a bit representation $b_k \cdots b_2 b_1 b_0$ into a number,
|
||||||
|
we can use the formula
|
||||||
|
\[b_k 2^k + \ldots + b_2 2^2 + b_1 2^1 + b_0 2^0.\]
|
||||||
|
For example,
|
||||||
|
\[1 \cdot 2^5 + 1 \cdot 2^3 + 1 \cdot 2^1 + 1 \cdot 2^0 = 43.\]
|
||||||
|
|
||||||
The bit representation of a number is either
|
The bit representation of a number is either
|
||||||
\key{signed} or \key{unsigned}.
|
\key{signed} or \key{unsigned}.
|
||||||
Usually a signed representation is used,
|
Usually a signed representation is used,
|
||||||
which means that both negative and positive
|
which means that both negative and positive
|
||||||
numbers can be represented.
|
numbers can be represented.
|
||||||
A signed number of $n$ bits can contain any
|
A signed variable of $n$ bits can contain any
|
||||||
integer between $-2^{n-1}$ and $2^{n-1}-1$.
|
integer between $-2^{n-1}$ and $2^{n-1}-1$.
|
||||||
For example, the \texttt{int} type in C++ is
|
For example, the \texttt{int} type in C++ is
|
||||||
a signed type, and it can contain any
|
a signed type, so an \texttt{int} variable can contain any
|
||||||
integer between $-2^{31}$ and $2^{31}-1$.
|
integer between $-2^{31}$ and $2^{31}-1$.
|
||||||
|
|
||||||
The first bit in a signed representation
|
The first bit in a signed representation
|
||||||
|
@ -50,21 +48,20 @@ opposite number of a number is calculated by first
|
||||||
inverting all the bits in the number,
|
inverting all the bits in the number,
|
||||||
and then increasing the number by one.
|
and then increasing the number by one.
|
||||||
|
|
||||||
For example, the bit representation of $-43$
|
For example, the bit representation of
|
||||||
as an \texttt{int} number is as follows:
|
the \texttt{int} number $-43$ is
|
||||||
\[11111111111111111111111111010101\]
|
\[11111111111111111111111111010101.\]
|
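The negation rule above (invert all bits, then add one) can be checked with a minimal sketch. The helper names `negate_twos_complement` and `as_unsigned_bits` are hypothetical and do not come from the chapter, and the code assumes a 32-bit two's complement `int`.

```cpp
#include <cstdint>

// Hypothetical helper: negates x by inverting all bits and adding one.
// Assumes a two's complement representation and x != INT_MIN.
constexpr int negate_twos_complement(int x) {
    return ~x + 1;
}

// Hypothetical helper: reinterprets the value of x modulo 2^32,
// which exposes the bit pattern of a 32-bit signed number.
constexpr std::uint32_t as_unsigned_bits(int x) {
    return static_cast<std::uint32_t>(x);
}
```

Under these assumptions, `negate_twos_complement(43)` gives $-43$, and `as_unsigned_bits(-43)` gives `0xFFFFFFD5`, which is the bit pattern 11111111111111111111111111010101 shown above.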
||||||
|
|
||||||
In an unsigned representation, only nonnegative
|
In an unsigned representation, only nonnegative
|
||||||
numbers can be used, but the upper bound of the numbers is larger.
|
numbers can be used, but the upper bound for the values is larger.
|
||||||
An unsigned number of $n$ bits can contain any
|
An unsigned variable of $n$ bits can contain any
|
||||||
integer between $0$ and $2^n-1$.
|
integer between $0$ and $2^n-1$.
|
||||||
For example, the \texttt{unsigned int} type in C++
|
For example, in C++, an \texttt{unsigned int} variable
|
||||||
can contain any integer between $0$ and $2^{32}-1$.
|
can contain any integer between $0$ and $2^{32}-1$.
|
||||||
|
|
||||||
There is a connection between signed and unsigned
|
There is a connection between the
|
||||||
representations:
|
representations:
|
||||||
a number $-x$ in a signed representation
|
a signed number $-x$ equals an unsigned number $2^n-x$.
|
||||||
equals the number $2^n-x$ in an unsigned representation.
|
|
||||||
For example, the following code shows that
|
For example, the following code shows that
|
||||||
the signed number $x=-43$ equals the unsigned
|
the signed number $x=-43$ equals the unsigned
|
||||||
number $y=2^{32}-43$:
|
number $y=2^{32}-43$:
|
||||||
|
@ -90,7 +87,7 @@ cout << x << "\n"; // -2147483648
|
||||||
\end{lstlisting}
|
\end{lstlisting}
|
||||||
|
|
||||||
Initially, the value of $x$ is $2^{31}-1$.
|
Initially, the value of $x$ is $2^{31}-1$.
|
||||||
This is the largest number that can be stored
|
This is the largest value that can be stored
|
||||||
in an \texttt{int} variable,
|
in an \texttt{int} variable,
|
||||||
so the next number after $2^{31}-1$ is $-2^{31}$.
|
so the next number after $2^{31}-1$ is $-2^{31}$.
|
||||||
|
|
||||||
|
@ -172,7 +169,7 @@ for example, \textasciitilde$29 = -30$.
|
||||||
|
|
||||||
The result of the not operation at the bit level
|
The result of the not operation at the bit level
|
||||||
depends on the length of the bit representation,
|
depends on the length of the bit representation,
|
||||||
because the operation changes all bits.
|
because the operation inverts all bits.
|
||||||
For example, if the numbers are 32-bit
|
For example, if the numbers are 32-bit
|
||||||
\texttt{int} numbers, the result is as follows:
|
\texttt{int} numbers, the result is as follows:
|
||||||
|
|
||||||
|
@ -192,11 +189,9 @@ zero bits to the number,
|
||||||
and the right bit shift $x > > k$
|
and the right bit shift $x > > k$
|
||||||
removes the $k$ last bits from the number.
|
removes the $k$ last bits from the number.
|
||||||
For example, $14 < < 2 = 56$,
|
For example, $14 < < 2 = 56$,
|
||||||
because $14$ equals 1110
|
because $14$ and $56$ correspond to 1110 and 111000.
|
||||||
and $56$ equals 111000.
|
|
||||||
Similarly, $49 > > 3 = 6$,
|
Similarly, $49 > > 3 = 6$,
|
||||||
because $49$ equals 110001
|
because $49$ and $6$ correspond to 110001 and 110.
|
||||||
and $6$ equals 110.
|
|
||||||
|
|
||||||
Note that $x < < k$
|
Note that $x < < k$
|
||||||
corresponds to multiplying $x$ by $2^k$,
|
corresponds to multiplying $x$ by $2^k$,
|
||||||
|
@ -209,7 +204,7 @@ rounded down to an integer.
|
||||||
A number of the form $1 < < k$ has a one bit
|
A number of the form $1 < < k$ has a one bit
|
||||||
in position $k$ and all other bits are zero,
|
in position $k$ and all other bits are zero,
|
||||||
so we can use such numbers to access single bits of numbers.
|
so we can use such numbers to access single bits of numbers.
|
||||||
For example, the $k$th bit of a number is one
|
In particular, the $k$th bit of a number is one
|
||||||
exactly when $x$ \& $(1 < < k)$ is not zero.
|
exactly when $x$ \& $(1 < < k)$ is not zero.
|
||||||
The following code prints the bit representation
|
The following code prints the bit representation
|
||||||
of an \texttt{int} number $x$:
|
of an \texttt{int} number $x$:
|
||||||
|
@ -222,13 +217,13 @@ for (int i = 31; i >= 0; i--) {
|
||||||
\end{lstlisting}
|
\end{lstlisting}
|
||||||
|
|
||||||
It is also possible to modify single bits
|
It is also possible to modify single bits
|
||||||
of numbers using the above idea.
|
of numbers using similar ideas.
|
||||||
For example, the expression $x$ | $(1 < < k)$
|
For example, the formula $x$ | $(1 < < k)$
|
||||||
sets the $k$th bit of $x$ to one,
|
sets the $k$th bit of $x$ to one,
|
||||||
the expression
|
the formula
|
||||||
$x$ \& \textasciitilde $(1 < < k)$
|
$x$ \& \textasciitilde $(1 < < k)$
|
||||||
sets the $k$th bit of $x$ to zero,
|
sets the $k$th bit of $x$ to zero,
|
||||||
and the expression
|
and the formula
|
||||||
$x$ $\XOR$ $(1 < < k)$
|
$x$ $\XOR$ $(1 < < k)$
|
||||||
inverts the $k$th bit of $x$.
|
inverts the $k$th bit of $x$.
|
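The three formulas above, together with the test `x & (1 << k)`, might be wrapped in small helpers as in the following sketch. The function names are hypothetical, and the sketch uses an unsigned 32-bit type (an assumption, not the chapter's choice) so that shifting into bit 31 is well defined.

```cpp
#include <cstdint>

// Hypothetical helpers; k is assumed to be in the range 0..31.

// Returns x with bit k set to one.
constexpr std::uint32_t set_bit(std::uint32_t x, int k) {
    return x | (std::uint32_t{1} << k);
}

// Returns x with bit k set to zero.
constexpr std::uint32_t clear_bit(std::uint32_t x, int k) {
    return x & ~(std::uint32_t{1} << k);
}

// Returns x with bit k inverted.
constexpr std::uint32_t toggle_bit(std::uint32_t x, int k) {
    return x ^ (std::uint32_t{1} << k);
}

// Returns true exactly when bit k of x is one.
constexpr bool test_bit(std::uint32_t x, int k) {
    return (x & (std::uint32_t{1} << k)) != 0;
}
```

For instance, with $x = 10$ (1010), setting bit 0 gives 11 (1011), clearing bit 1 gives 8 (1000), and toggling bit 3 gives 2 (0010).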
||||||
|
|
||||||
|
@ -239,7 +234,7 @@ one bits to zero, except for the last one bit.
|
||||||
The formula $x$ | $(x-1)$
|
The formula $x$ | $(x-1)$
|
||||||
inverts all the bits after the last one bit.
|
inverts all the bits after the last one bit.
|
||||||
Also note that a positive number $x$ is
|
Also note that a positive number $x$ is
|
||||||
of the form $2^k$ if $x$ \& $(x-1) = 0$.
|
a power of two exactly when $x$ \& $(x-1) = 0$.
|
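The power-of-two test above can be sketched as a small function. The name `is_power_of_two` is hypothetical, and the explicit `x > 0` guard is added here because the formula is stated only for positive numbers.

```cpp
// Hypothetical helper: returns true exactly when x is of the form 2^k.
// A power of two has a single one bit, so x & (x-1) leaves nothing behind.
constexpr bool is_power_of_two(unsigned x) {
    return x > 0 && (x & (x - 1)) == 0;
}
```

For example, 64 (1000000) passes the test, while 48 (110000) does not, because $48 \,\&\, 47 = 32$.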
||||||
|
|
||||||
\subsubsection*{Additional functions}
|
\subsubsection*{Additional functions}
|
||||||
|
|
||||||
|
@ -272,86 +267,76 @@ cout << __builtin_parity(x) << "\n"; // 1
|
||||||
\end{lstlisting}
|
\end{lstlisting}
|
||||||
\end{samepage}
|
\end{samepage}
|
||||||
|
|
||||||
The above functions support \texttt{int} numbers,
|
While the above functions only support \texttt{int} numbers,
|
||||||
but there are also \texttt{long long} functions
|
there are also \texttt{long long} versions of
|
||||||
available with the suffix \texttt{ll}.
|
the functions available with the suffix \texttt{ll}.
|
||||||
|
|
||||||
\section{Representing sets}
|
\section{Representing sets}
|
||||||
|
|
||||||
Each subset of a set $\{0,1,2,\ldots,n-1\}$
|
Every subset of a set
|
||||||
corresponds to an $n$ bit number
|
$\{0,1,2,\ldots,n-1\}$
|
||||||
where the one bits indicate which elements
|
can be represented as an $n$ bit integer
|
||||||
are included in the subset.
|
whose one bits indicate which
|
||||||
For example, the set $\{1,3,4,8\}$
|
elements belong to the subset.
|
||||||
corresponds to the number $2^8+2^4+2^3+2^1=282$,
|
This is an efficient way to represent sets,
|
||||||
whose bit representation is 100011010.
|
because every element requires only one bit of memory,
|
||||||
|
and set operations can be implemented as bit operations.
|
||||||
|
|
||||||
The benefit in using the bit representation
|
For example, since \texttt{int} is a 32-bit type,
|
||||||
is that the information whether an element belongs
|
an \texttt{int} number can represent any subset
|
||||||
to the set requires only one bit of memory.
|
of the set $\{0,1,2,\ldots,31\}$.
|
||||||
In addition, set operations can be efficiently
|
The bit representation of the set $\{1,3,4,8\}$ is
|
||||||
implemented as bit operations.
|
\[00000000000000000000000100011010,\]
|
||||||
|
which corresponds to the number $2^8+2^4+2^3+2^1=282$.
|
||||||
|
|
||||||
\subsubsection{Set implementation}
|
\subsubsection{Set implementation}
|
||||||
|
|
||||||
In the following code, $x$
|
The following code declares an \texttt{int}
|
||||||
contains a subset of $\{0,1,2,\ldots,31\}$.
|
variable $x$ that can contain
|
||||||
The code adds the elements 1, 3, 4 and 8
|
a subset of $\{0,1,2,\ldots,31\}$.
|
||||||
to the set and then prints the elements.
|
After this, the code adds the elements 1, 3, 4 and 8
|
||||||
|
to the set and prints the size of the set.
|
||||||
\begin{lstlisting}
|
\begin{lstlisting}
|
||||||
// x is an empty set
|
|
||||||
int x = 0;
|
int x = 0;
|
||||||
// add elements 1, 3, 4 and 8 to the set
|
|
||||||
x |= (1<<1);
|
x |= (1<<1);
|
||||||
x |= (1<<3);
|
x |= (1<<3);
|
||||||
x |= (1<<4);
|
x |= (1<<4);
|
||||||
x |= (1<<8);
|
x |= (1<<8);
|
||||||
// print the elements in the set
|
cout << __builtin_popcount(x) << "\n"; // 4
|
||||||
|
\end{lstlisting}
|
||||||
|
Then, the following code prints all
|
||||||
|
elements that belong to the set:
|
||||||
|
\begin{lstlisting}
|
||||||
for (int i = 0; i < 32; i++) {
|
for (int i = 0; i < 32; i++) {
|
||||||
if (x&(1<<i)) cout << i << " ";
|
if (x&(1<<i)) cout << i << " ";
|
||||||
}
|
}
|
||||||
cout << "\n";
|
// output: 1 3 4 8
|
||||||
\end{lstlisting}
|
|
||||||
|
|
||||||
The output of the code is as follows:
|
|
||||||
|
|
||||||
\begin{lstlisting}
|
|
||||||
1 3 4 8
|
|
||||||
\end{lstlisting}
|
\end{lstlisting}
|
||||||
|
|
||||||
\subsubsection{Set operations}
|
\subsubsection{Set operations}
|
||||||
|
|
||||||
Set operations can be implemented as follows:
|
Set operations can be implemented as bit operations as follows:
|
||||||
\begin{itemize}
|
|
||||||
\item $a$ \& $b$ is the intersection $a \cap b$ of $a$ and $b$
|
|
||||||
\item $a$ | $b$ is the union $a \cup b$ of $a$ and $b$
|
|
||||||
\item \textasciitilde$a$ is the complement $\bar a$ of $a$
|
|
||||||
\item $a$ \& (\textasciitilde$b$) is the difference
|
|
||||||
$a \setminus b$ of $a$ and $b$
|
|
||||||
\end{itemize}
|
|
||||||
|
|
||||||
For example, the following code constructs the union
|
\begin{center}
|
||||||
of $\{1,3,4,8\}$ and $\{3,6,8,9\}$:
|
\begin{tabular}{lll}
|
||||||
|
& set syntax & bit syntax \\
|
||||||
|
\hline
|
||||||
|
intersection & $a \cap b$ & $a$ \& $b$ \\
|
||||||
|
union & $a \cup b$ & $a$ | $b$ \\
|
||||||
|
complement & $\bar a$ & \textasciitilde$a$ \\
|
||||||
|
difference & $a \setminus b$ & $a$ \& (\textasciitilde$b$) \\
|
||||||
|
\end{tabular}
|
||||||
|
\end{center}
|
||||||
|
|
||||||
|
For example, the following code first constructs
|
||||||
|
the sets $x=\{1,3,4,8\}$ and $y=\{3,6,8,9\}$,
|
||||||
|
and then calculates the set $z = x \cup y = \{1,3,4,6,8,9\}$:
|
||||||
|
|
||||||
\begin{lstlisting}
|
\begin{lstlisting}
|
||||||
// set {1,3,4,8}
|
|
||||||
int x = (1<<1)+(1<<3)+(1<<4)+(1<<8);
|
int x = (1<<1)+(1<<3)+(1<<4)+(1<<8);
|
||||||
// set {3,6,8,9}
|
|
||||||
int y = (1<<3)+(1<<6)+(1<<8)+(1<<9);
|
int y = (1<<3)+(1<<6)+(1<<8)+(1<<9);
|
||||||
// union of the sets
|
|
||||||
int z = x|y;
|
int z = x|y;
|
||||||
// print the elements in the union
|
cout << __builtin_popcount(z) << "\n"; // 6
|
||||||
for (int i = 0; i < 32; i++) {
|
|
||||||
if (z&(1<<i)) cout << i << " ";
|
|
||||||
}
|
|
||||||
cout << "\n";
|
|
||||||
\end{lstlisting}
|
|
||||||
|
|
||||||
The output of the code is as follows:
|
|
||||||
|
|
||||||
\begin{lstlisting}
|
|
||||||
1 3 4 6 8 9
|
|
||||||
\end{lstlisting}
|
\end{lstlisting}
|
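The other operations in the table can be tried on the same two sets. The following is a minimal sketch with hypothetical names (`intersect`, `difference`, `set_x`, `set_y`); it uses an unsigned 32-bit type, which is an assumption made here to keep the complement free of sign issues.

```cpp
#include <cstdint>

// The sets {1,3,4,8} and {3,6,8,9} as bit masks (hypothetical names).
constexpr std::uint32_t set_x = (1u << 1) | (1u << 3) | (1u << 4) | (1u << 8);
constexpr std::uint32_t set_y = (1u << 3) | (1u << 6) | (1u << 8) | (1u << 9);

// Intersection: elements that belong to both a and b.
constexpr std::uint32_t intersect(std::uint32_t a, std::uint32_t b) {
    return a & b;
}

// Difference: elements that belong to a but not to b.
constexpr std::uint32_t difference(std::uint32_t a, std::uint32_t b) {
    return a & ~b;
}
```

Here `intersect(set_x, set_y)` is the set $\{3,8\}$, i.e., the number $2^8+2^3=264$, and `difference(set_x, set_y)` is the set $\{1,4\}$, i.e., the number $2^4+2^1=18$.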
||||||
|
|
||||||
\subsubsection{Iterating through subsets}
|
\subsubsection{Iterating through subsets}
|
||||||
|
|