\chapter{Bit manipulation}

All data in computer programs is internally stored as bits,
i.e., as numbers 0 and 1.
This chapter discusses the bit representation of integers,
and shows examples of how to use bit operations.
It turns out that there are many uses for
bit manipulation in algorithm programming.

\section{Bit representation}

\index{bit representation}

In programming, an $n$-bit integer is internally
stored as a binary number that consists of $n$ bits.
For example, the C++ type \texttt{int} is
a 32-bit type, which means that every \texttt{int}
number consists of 32 bits.

Here is the bit representation of
the \texttt{int} number 43:
\[00000000000000000000000000101011\]

The bits in the representation are indexed from right to left,
starting from index 0.
To convert a bit representation $b_k \cdots b_2 b_1 b_0$ into a number,
we can use the formula
\[b_k 2^k + \cdots + b_2 2^2 + b_1 2^1 + b_0 2^0.\]
For example,
\[1 \cdot 2^5 + 1 \cdot 2^3 + 1 \cdot 2^1 + 1 \cdot 2^0 = 43.\]

The bit representation of a number is either
\key{signed} or \key{unsigned}.
Usually a signed representation is used,
which means that both negative and positive
numbers can be represented.
A signed variable of $n$ bits can contain any
integer between $-2^{n-1}$ and $2^{n-1}-1$.
For example, the \texttt{int} type in C++ is
a signed type, so an \texttt{int} variable can contain any
integer between $-2^{31}$ and $2^{31}-1$.

The first bit in a signed representation
is the sign of the number (0 for nonnegative numbers
and 1 for negative numbers), and
the remaining $n-1$ bits encode the rest of the value.
\key{Two's complement} is used, which means that the
opposite number of a number is calculated by first
inverting all the bits in the number,
and then increasing the number by one.

For example, the bit representation of
the \texttt{int} number $-43$ is
\[11111111111111111111111111010101.\]

In an unsigned representation, only nonnegative
numbers can be used, but the upper bound for the values is larger.
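
The ranges above and the two's complement rule can be checked
with a small sketch such as the following.
The names \texttt{negate\_bits}, \texttt{int\_min}, \texttt{int\_max}
and \texttt{uint\_max} are hypothetical and chosen only for this illustration,
and the bounds mentioned in the comments assume that \texttt{int} is a 32-bit type.

\begin{lstlisting}
#include <cstdint>
#include <limits>

// Two's complement negation of a 32-bit pattern:
// invert all the bits, then add one.
constexpr std::uint32_t negate_bits(std::uint32_t bits) {
    return ~bits + 1u;
}

// Bounds of the built-in types. Assuming a 32-bit int,
// these are -2^31, 2^31-1 and 2^32-1.
constexpr int int_min = std::numeric_limits<int>::min();
constexpr int int_max = std::numeric_limits<int>::max();
constexpr unsigned int uint_max =
    std::numeric_limits<unsigned int>::max();
\end{lstlisting}

Under these assumptions, \texttt{negate\_bits(43)} yields the bit pattern
of $-43$, which is $2^{32}-43$ when it is read as an unsigned number.
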
An unsigned variable of $n$ bits can contain any
integer between $0$ and $2^n-1$.
For example, in C++, an \texttt{unsigned int} variable
can contain any integer between $0$ and $2^{32}-1$.

There is a connection between the
representations:
a signed number $-x$ equals an unsigned number $2^n-x$.
For example, the following code shows that
the signed number $x=-43$ equals the unsigned
number $y=2^{32}-43$:
\begin{lstlisting}
int x = -43;
unsigned int y = x;
cout << x << "\n"; // -43
cout << y << "\n"; // 4294967253
\end{lstlisting}

If a number is larger than the upper bound
of the bit representation, the number will overflow.
In a signed representation,
the next number after $2^{n-1}-1$ is $-2^{n-1}$,
and in an unsigned representation,
the next number after $2^n-1$ is $0$.
For example, consider the following code:
\begin{lstlisting}
int x = 2147483647;
cout << x << "\n"; // 2147483647
x++;
cout << x << "\n"; // -2147483648
\end{lstlisting}

Initially, the value of $x$ is $2^{31}-1$.
This is the largest value that can be stored
in an \texttt{int} variable,
so the next number after $2^{31}-1$ is $-2^{31}$.
Strictly speaking, signed overflow is undefined behavior
in the C++ standard, so the output above shows what typically
happens in practice rather than a guaranteed result.
Unsigned arithmetic, in contrast, is always calculated modulo $2^n$.

\section{Bit operations}

\newcommand\XOR{\mathbin{\char`\^}}

\subsubsection{And operation}

\index{and operation}

The \key{and} operation $x$ \& $y$ produces a number
that has one bits in positions where both
$x$ and $y$ have one bits.
For example, $22$ \& $26$ = 18, because

\begin{center}
\begin{tabular}{rrr}
& 10110 & (22)\\
\& & 11010 & (26) \\
\hline
= & 10010 & (18) \\
\end{tabular}
\end{center}

Using the and operation, we can check if a number
$x$ is even because
$x$ \& $1$ = 0 if $x$ is even, and
$x$ \& $1$ = 1 if $x$ is odd.
More generally, $x$ is divisible by $2^k$
exactly when $x$ \& $(2^k-1)$ = 0.

\subsubsection{Or operation}

\index{or operation}

The \key{or} operation $x$ | $y$ produces a number
that has one bits in positions where at least one
of $x$ and $y$ has a one bit.
For example, $22$ | $26$ = 30, because

\begin{center}
\begin{tabular}{rrr}
& 10110 & (22)\\
| & 11010 & (26) \\
\hline
= & 11110 & (30) \\
\end{tabular}
\end{center}

\subsubsection{Xor operation}

\index{xor operation}

The \key{xor} operation $x$ $\XOR$ $y$ produces a number
that has one bits in positions where exactly one
of $x$ and $y$ has a one bit.
For example, $22$ $\XOR$ $26$ = 12, because

\begin{center}
\begin{tabular}{rrr}
& 10110 & (22)\\
$\XOR$ & 11010 & (26) \\
\hline
= & 01100 & (12) \\
\end{tabular}
\end{center}

\subsubsection{Not operation}

\index{not operation}

The \key{not} operation \textasciitilde$x$
produces a number where all the bits of $x$
have been inverted.
The formula \textasciitilde$x = -x-1$ holds,
for example, \textasciitilde$29 = -30$.

The result of the not operation at the bit level
depends on the length of the bit representation,
because the operation inverts all bits.
For example, if the numbers are 32-bit
\texttt{int} numbers, the result is as follows:

\begin{center}
\begin{tabular}{rrrr}
$x$ & = & 29 & 00000000000000000000000000011101 \\
\textasciitilde$x$ & = & $-30$ & 11111111111111111111111111100010 \\
\end{tabular}
\end{center}

\subsubsection{Bit shifts}

\index{bit shift}

The left bit shift $x < < k$ appends $k$
zero bits to the number,
and the right bit shift $x > > k$
removes the $k$ last bits from the number.
For example, $14 < < 2 = 56$,
because $14$ and $56$ correspond to 1110 and 111000.
Similarly, $49 > > 3 = 6$,
because $49$ and $6$ correspond to 110001 and 110.

Note that $x < < k$
corresponds to multiplying $x$ by $2^k$,
and $x > > k$
corresponds to dividing $x$ by $2^k$
rounded down to an integer.

\subsubsection{Applications}

A number of the form $1 < < k$ has a one bit
in position $k$ and all other bits are zero,
so we can use such numbers to access single bits of numbers.
In particular, the $k$th bit of a number $x$ is one
exactly when $x$ \& $(1 < < k)$ is not zero.
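
The bit test above and the earlier divisibility rule can be wrapped
into small helper functions, as in the following sketch.
The function names \texttt{has\_bit} and \texttt{divisible\_by\_pow2}
are hypothetical, and the sketch assumes that \texttt{unsigned int}
is a 32-bit type and that $0 \le k \le 31$.
Unsigned numbers are used so that shifting a one bit into the highest
position is well defined.

\begin{lstlisting}
// True when bit k of x is one,
// i.e., when x & (1 << k) is not zero.
constexpr bool has_bit(unsigned int x, int k) {
    return (x & (1u << k)) != 0;
}

// True when x is divisible by 2^k,
// i.e., when x & (2^k - 1) is zero.
constexpr bool divisible_by_pow2(unsigned int x, int k) {
    return (x & ((1u << k) - 1u)) == 0;
}
\end{lstlisting}

For instance, \texttt{has\_bit(43, 5)} is true and \texttt{has\_bit(43, 2)}
is false, because 43 corresponds to 101011.
Similarly, \texttt{divisible\_by\_pow2(56, 3)} is true,
because 56 corresponds to 111000.
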
The following code prints the bit representation
of an \texttt{int} number $x$:

\begin{lstlisting}
for (int i = 31; i >= 0; i--) {
    if (x&(1<<i)) cout << "1";
    else cout << "0";
}
\end{lstlisting}