\chapter{Bit manipulation}

All data in computer programs is internally stored as bits,
i.e., as numbers 0 and 1.
In this chapter, we will learn how integers are represented as bits,
and how bit operations can be used to manipulate them.
It turns out that there are many uses for bit operations
in algorithm programming.

\section{Bit representation}

\index{bit representation}

Every nonnegative integer can be represented as a sum
\[c_k 2^k + \ldots + c_2 2^2 + c_1 2^1 + c_0 2^0,\]
where each coefficient $c_i$ is either 0 or 1.
The bit representation of such a number is $c_k \cdots c_2 c_1 c_0$.
For example, the number 43 corresponds to the sum
\[1 \cdot 2^5 + 0 \cdot 2^4 + 1 \cdot 2^3 + 0 \cdot 2^2 + 1 \cdot 2^1 + 1 \cdot 2^0,\]
so the bit representation of the number is 101011.

In programming, the length of the bit representation
depends on the data type of the number.
For example, in C++ the type \texttt{int} is usually a 32-bit type,
so an \texttt{int} number consists of 32 bits.
Thus, the bit representation of 43 as an \texttt{int} number is as follows:
\[00000000000000000000000000101011\]

The bit representation of a number is either
\key{signed} or \key{unsigned}.
Usually a signed representation is used,
which means that both negative and positive numbers can be represented.
A signed number of $n$ bits can contain any integer
between $-2^{n-1}$ and $2^{n-1}-1$.
For example, the \texttt{int} type in C++ is a signed type,
and it can contain any integer between $-2^{31}$ and $2^{31}-1$.

The first bit in a signed representation is the sign of the number
(0 for nonnegative numbers and 1 for negative numbers),
and the remaining $n-1$ bits determine the value of the number.
\key{Two's complement} is used, which means that the negation
of a number is calculated by first inverting all the bits in the number,
and then increasing the number by one.
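The two's complement rule can be checked directly in code.
The following is only a minimal sketch: the function name
\texttt{twosComplementNegate} is hypothetical, and the code
assumes 32-bit values. It works on an unsigned type so that
inverting the bits and adding one is well defined for every input.

\begin{lstlisting}
#include <cstdint>

// Hypothetical helper (not from the text): negates a 32-bit
// pattern by inverting all bits and then adding one.
constexpr std::uint32_t twosComplementNegate(std::uint32_t x) {
    return ~x + 1u;
}
\end{lstlisting}

Applying the function twice returns the original value,
because negation is its own inverse.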
For example, the bit representation of $-43$
as an \texttt{int} number is as follows:
\[11111111111111111111111111010101\]

In an unsigned representation, only nonnegative numbers can be used,
but the upper bound of the numbers is larger.
An unsigned number of $n$ bits can contain any integer
between $0$ and $2^n-1$.
For example, the \texttt{unsigned int} type in C++
can contain any integer between $0$ and $2^{32}-1$.

There is a connection between signed and unsigned representations:
a number $-x$ in a signed representation
equals the number $2^n-x$ in an unsigned representation.
For example, the following code shows that the signed number $x=-43$
equals the unsigned number $y=2^{32}-43$:
\begin{lstlisting}
int x = -43;
unsigned int y = x;
cout << x << "\n"; // -43
cout << y << "\n"; // 4294967253
\end{lstlisting}

If a number is larger than the upper bound of the bit representation,
the number will overflow.
In a signed representation, the next number after $2^{n-1}-1$ is $-2^{n-1}$,
and in an unsigned representation, the next number after $2^n-1$ is $0$.
For example, consider the following code:
\begin{lstlisting}
int x = 2147483647;
cout << x << "\n"; // 2147483647
x++;
cout << x << "\n"; // -2147483648
\end{lstlisting}

Initially, the value of $x$ is $2^{31}-1$.
This is the largest number that can be stored in an \texttt{int} variable,
so the next number after $2^{31}-1$ is $-2^{31}$.
This is the typical behavior in practice, although the C++ standard
formally leaves signed integer overflow undefined.

\section{Bit operations}

\newcommand\XOR{\mathbin{\char`\^}}

\subsubsection{And operation}

\index{and operation}

The \key{and} operation $x$ \& $y$ produces a number
that has one bits in positions where both $x$ and $y$ have one bits.
For example, $22$ \& $26$ = 18, because
\begin{center}
\begin{tabular}{rrr}
& 10110 & (22)\\
\& & 11010 & (26) \\
\hline
= & 10010 & (18) \\
\end{tabular}
\end{center}

Using the and operation, we can check if a number $x$ is even,
because $x$ \& $1$ = 0 if $x$ is even, and $x$ \& $1$ = 1 if $x$ is odd.
More generally, $x$ is divisible by $2^k$ exactly when $x$ \& $(2^k-1)$ = 0.

\subsubsection{Or operation}

\index{or operation}

The \key{or} operation $x$ | $y$ produces a number
that has one bits in positions where at least one of $x$ and $y$ has a one bit.
For example, $22$ | $26$ = 30, because
\begin{center}
\begin{tabular}{rrr}
& 10110 & (22)\\
| & 11010 & (26) \\
\hline
= & 11110 & (30) \\
\end{tabular}
\end{center}

\subsubsection{Xor operation}

\index{xor operation}

The \key{xor} operation $x$ $\XOR$ $y$ produces a number
that has one bits in positions where exactly one of $x$ and $y$ has a one bit.
For example, $22$ $\XOR$ $26$ = 12, because
\begin{center}
\begin{tabular}{rrr}
& 10110 & (22)\\
$\XOR$ & 11010 & (26) \\
\hline
= & 01100 & (12) \\
\end{tabular}
\end{center}

\subsubsection{Not operation}

\index{not operation}

The \key{not} operation \textasciitilde$x$ produces a number
where all the bits of $x$ have been inverted.
The formula \textasciitilde$x = -x-1$ holds;
for example, \textasciitilde$29 = -30$.

The result of the not operation at the bit level
depends on the length of the bit representation,
because the operation changes all bits.
For example, if the numbers are 32-bit \texttt{int} numbers,
the result is as follows:
\begin{center}
\begin{tabular}{rrrr}
$x$ & = & 29 & 00000000000000000000000000011101 \\
\textasciitilde$x$ & = & $-30$ & 11111111111111111111111111100010 \\
\end{tabular}
\end{center}

\subsubsection{Bit shifts}

\index{bit shift}

The left bit shift $x < < k$ appends $k$ zero bits to the number,
and the right bit shift $x > > k$ removes the $k$ last bits from the number.
For example, $14 < < 2 = 56$, because $14$ equals 1110 and $56$ equals 111000.
Similarly, $49 > > 3 = 6$, because $49$ equals 110001 and $6$ equals 110.

Note that $x < < k$ corresponds to multiplying $x$ by $2^k$,
and $x > > k$ corresponds to dividing $x$ by $2^k$
rounded down to an integer.
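The properties above can be collected into a few small functions.
This is only an illustrative sketch: the function names are
hypothetical, and the code assumes a 32-bit \texttt{int},
nonnegative arguments, and results that fit in an \texttt{int}.

\begin{lstlisting}
// Hypothetical helpers; all assume x >= 0 and 0 <= k <= 30.

// True if x is divisible by 2^k, i.e., the k lowest bits are zero.
constexpr bool divisibleByPowerOfTwo(int x, int k) {
    return (x & ((1 << k) - 1)) == 0;
}

// Multiplies x by 2^k with a left shift (result must fit in int).
constexpr int timesPowerOfTwo(int x, int k) {
    return x << k;
}

// Divides x by 2^k with a right shift, rounding down.
constexpr int divPowerOfTwo(int x, int k) {
    return x >> k;
}
\end{lstlisting}

For example, \texttt{divisibleByPowerOfTwo(24, 3)} is true,
because $24$ equals 11000 and ends in three zero bits,
while \texttt{divisibleByPowerOfTwo(20, 3)} is false,
because $20$ equals 10100.
Similarly, \texttt{timesPowerOfTwo(14, 2)} is 56
and \texttt{divPowerOfTwo(49, 3)} is 6.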
\subsubsection{Applications}

A number of the form $1 < < k$ has a one bit in position $k$
and all other bits are zero,
so we can use such numbers to access single bits of numbers.
For example, the $k$th bit of a number $x$ is one
exactly when $x$ \& $(1 < < k)$ is not zero.
The following code prints the bit representation
of an \texttt{int} number $x$:
\begin{lstlisting}
for (int i = 31; i >= 0; i--) { if (x&(1< 1 && (b&(1<