Improve language

Antti H S Laaksonen 2017-05-09 23:32:59 +03:00
parent 5a298088b9
commit bf949a8f8c
5 changed files with 144 additions and 146 deletions

@ -9,7 +9,7 @@ because many questions involving integers
are very difficult to solve even if they are very difficult to solve even if they
seem simple at first glance. seem simple at first glance.
As an example, let us consider the following equation: As an example, consider the following equation:
\[x^3 + y^3 + z^3 = 33\] \[x^3 + y^3 + z^3 = 33\]
It is easy to find three real numbers $x$, $y$ and $z$ It is easy to find three real numbers $x$, $y$ and $z$
that satisfy the equation. that satisfy the equation.
@ -21,10 +21,10 @@ y = \sqrt[3]{3}, \\
z = \sqrt[3]{3}.\\ z = \sqrt[3]{3}.\\
\end{array} \end{array}
\] \]
However, nobody knows if there are any three However, it is an open problem in number theory
if there are any three
\emph{integers} $x$, $y$ and $z$ \emph{integers} $x$, $y$ and $z$
that would satisfy the equation, but this that would satisfy the equation \cite{bec07}.
is an open problem in number theory \cite{bec07}.
In this chapter, we will focus on basic concepts In this chapter, we will focus on basic concepts
and algorithms in number theory. and algorithms in number theory.
@ -51,7 +51,7 @@ A number $n>1$ is a \key{prime}
if its only positive factors are 1 and $n$. if its only positive factors are 1 and $n$.
For example, 7, 19 and 41 are primes, For example, 7, 19 and 41 are primes,
but 35 is not a prime, because $5 \cdot 7 = 35$. but 35 is not a prime, because $5 \cdot 7 = 35$.
For each number $n>1$, there is a unique For every number $n>1$, there is a unique
\key{prime factorization} \key{prime factorization}
\[ n = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k},\] \[ n = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k},\]
where $p_1,p_2,\ldots,p_k$ are distinct primes and where $p_1,p_2,\ldots,p_k$ are distinct primes and
@ -87,7 +87,7 @@ and the product of the factors is $\mu(84)=84^6=351298031616$.
\index{perfect number} \index{perfect number}
A number $n$ is \key{perfect} if $n=\sigma(n)-n$, A number $n$ is called a \key{perfect number} if $n=\sigma(n)-n$,
i.e., $n$ equals the sum of its factors i.e., $n$ equals the sum of its factors
between $1$ and $n-1$. between $1$ and $n-1$.
For example, 28 is a perfect number, For example, 28 is a perfect number,
@ -211,13 +211,13 @@ algorithm that builds an array using which we
can efficiently check if a given number between $2 \ldots n$ can efficiently check if a given number between $2 \ldots n$
is prime and, if it is not, find one prime factor of the number. is prime and, if it is not, find one prime factor of the number.
The algorithm builds an array $\texttt{a}$ The algorithm builds an array $\texttt{sieve}$
whose positions $2,3,\ldots,n$ are used. whose positions $2,3,\ldots,n$ are used.
The value $\texttt{a}[k]=0$ means The value $\texttt{sieve}[k]=0$ means
that $k$ is prime, that $k$ is prime,
and the value $\texttt{a}[k] \neq 0$ and the value $\texttt{sieve}[k] \neq 0$
means that $k$ is not a prime and one means that $k$ is not a prime and one
of its prime factors is $\texttt{a}[k]$. of its prime factors is $\texttt{sieve}[k]$.
The algorithm iterates through the numbers The algorithm iterates through the numbers
$2 \ldots n$ one by one. $2 \ldots n$ one by one.
@ -279,31 +279,30 @@ For example, if $n=20$, the array is as follows:
The following code implements the sieve of The following code implements the sieve of
Eratosthenes. Eratosthenes.
The code assumes that each element in The code assumes that each element of
\texttt{a} is initially zero. \texttt{sieve} is initially zero.
\begin{lstlisting} \begin{lstlisting}
for (int x = 2; x <= n; x++) { for (int x = 2; x <= n; x++) {
if (a[x]) continue; if (sieve[x]) continue;
for (int u = 2*x; u <= n; u += x) { for (int u = 2*x; u <= n; u += x) {
a[u] = x; sieve[u] = x;
} }
} }
\end{lstlisting} \end{lstlisting}
The inner loop of the algorithm will be executed The inner loop of the algorithm is executed
$n/x$ times for any $x$. $n/x$ times for each value of $x$.
Thus, an upper bound for the running time Thus, an upper bound for the running time
of the algorithm is the harmonic sum of the algorithm is the harmonic sum
\[\sum_{x=2}^n n/x = n/2 + n/3 + n/4 + \cdots + n/n = O(n \log n).\]
\index{harmonic sum} \index{harmonic sum}
\[\sum_{x=2}^n n/x = n/2 + n/3 + n/4 + \cdots + n/n = O(n \log n).\] In fact, the algorithm is more efficient,
In fact, the algorithm is even more efficient,
because the inner loop will be executed only if because the inner loop will be executed only if
the number $x$ is prime. the number $x$ is prime.
It can be shown that the time complexity of the It can be shown that the running time of the
algorithm is only $O(n \log \log n)$, algorithm is only $O(n \log \log n)$,
a complexity very near to $O(n)$. a complexity very near to $O(n)$.
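The sieve array can also be used to factorize numbers, as the text suggests. The following is a minimal sketch under stated assumptions: the function names `build_sieve` and `factorize` are hypothetical and do not appear in the source, and the array is stored in a `std::vector` instead of a global array.

```cpp
#include <vector>

// Builds the array described above for positions 0..n:
// sieve[k] == 0 means that k is prime, and otherwise
// sieve[k] is one prime factor of k.
std::vector<int> build_sieve(int n) {
    std::vector<int> sieve(n + 1, 0);
    for (int x = 2; x <= n; x++) {
        if (sieve[x]) continue;  // x is not prime, skip it
        for (int u = 2 * x; u <= n; u += x) {
            sieve[u] = x;        // x is a prime factor of u
        }
    }
    return sieve;
}

// Hypothetical helper: returns the prime factors of k (2 <= k <= n)
// by repeatedly dividing k by the factor stored in the sieve.
std::vector<int> factorize(int k, const std::vector<int>& sieve) {
    std::vector<int> factors;
    while (sieve[k] != 0) {
        factors.push_back(sieve[k]);
        k /= sieve[k];
    }
    factors.push_back(k);  // the remaining value is prime
    return factors;
}
```

Because later primes overwrite earlier entries, the factors are not necessarily produced in increasing order, but their product always equals the original number.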
@ -338,11 +337,21 @@ The algorithm is based on the following formula:
\textrm{gcd}(b,a \bmod b) & b \neq 0\\ \textrm{gcd}(b,a \bmod b) & b \neq 0\\
\end{cases} \end{cases}
\end{equation*} \end{equation*}
For example, For example,
\[\textrm{gcd}(24,36) = \textrm{gcd}(36,24) \[\textrm{gcd}(24,36) = \textrm{gcd}(36,24)
= \textrm{gcd}(24,12) = \textrm{gcd}(12,0)=12.\] = \textrm{gcd}(24,12) = \textrm{gcd}(12,0)=12.\]
The time complexity of Euclid's algorithm
is $O(\log n)$, where $n=\min(a,b)$. The algorithm can be implemented as follows:
\begin{lstlisting}
int gcd(int a, int b) {
if (b == 0) return a;
return gcd(b, a%b);
}
\end{lstlisting}
It can be shown that Euclid's algorithm works
in $O(\log n)$ time, where $n=\min(a,b)$.
The worst case for the algorithm is The worst case for the algorithm is
the case when $a$ and $b$ are consecutive Fibonacci numbers. the case when $a$ and $b$ are consecutive Fibonacci numbers.
For example, For example,
@ -376,8 +385,8 @@ Note that $\varphi(n)=n-1$ if $n$ is prime.
\index{modular arithmetic} \index{modular arithmetic}
In \key{modular arithmetic}, In \key{modular arithmetic},
the set of available numbers is limited so the set of numbers is limited so
that only numbers $0,1,2,\ldots,m-1$ may be used, that only numbers $0,1,2,\ldots,m-1$ are used,
where $m$ is a constant. where $m$ is a constant.
Each number $x$ is Each number $x$ is
represented by the number $x \bmod m$: represented by the number $x \bmod m$:
@ -385,9 +394,9 @@ the remainder after dividing $x$ by $m$.
For example, if $m=17$, then $75$ For example, if $m=17$, then $75$
is represented by $75 \bmod 17 = 7$. is represented by $75 \bmod 17 = 7$.
Often we can take the remainder before doing Often we can take remainders before doing
calculations. calculations.
In particular, the following formulas can be used: In particular, the following formulas hold:
\[ \[
\begin{array}{rcl} \begin{array}{rcl}
(x+y) \bmod m & = & (x \bmod m + y \bmod m) \bmod m \\ (x+y) \bmod m & = & (x \bmod m + y \bmod m) \bmod m \\
@ -484,12 +493,12 @@ If $m$ is prime, the formula becomes
\[ \[
x^{-1} = x^{m-2}. x^{-1} = x^{m-2}.
\] \]
For example, if $x=6$ and $m=17$, then For example,
\[x^{-1}=6^{17-2} \bmod 17 = 3.\] \[6^{-1} \bmod 17 =6^{17-2} \bmod 17 = 3.\]
Using this formula, we can calculate modular inverses
efficiently using the modular exponentiation algorithm.
The above formula can be derived using Euler's theorem. This formula allows us to efficiently calculate
modular inverses using the modular exponentiation algorithm.
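As an illustration of the formula $x^{-1} = x^{m-2}$ for a prime $m$, the following sketch computes the inverse with repeated squaring. The names `modpow` and `modinv_prime` are assumptions made for this example, and the code assumes that $m$ is prime, that $x$ is not divisible by $m$, and that $(m-1)^2$ fits in a `long long`.

```cpp
// Computes x^n mod m by repeated squaring in O(log n) time.
long long modpow(long long x, long long n, long long m) {
    long long result = 1 % m;
    x %= m;
    while (n > 0) {
        if (n & 1) result = result * x % m;  // use the current bit of n
        x = x * x % m;                       // square the base
        n >>= 1;
    }
    return result;
}

// Modular inverse of x modulo a prime m (assumption: m does not divide x).
long long modinv_prime(long long x, long long m) {
    return modpow(x, m - 2, m);
}
```

For example, `modinv_prime(6, 17)` returns 3, which agrees with the example above because $6 \cdot 3 \bmod 17 = 1$.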
The formula can be derived using Euler's theorem.
First, the modular inverse should satisfy the following equation: First, the modular inverse should satisfy the following equation:
\[ \[
x x^{-1} \bmod m = 1. x x^{-1} \bmod m = 1.
@ -522,6 +531,8 @@ cout << x*x << "\n"; // 2537071545
\section{Solving equations} \section{Solving equations}
\subsubsection*{Diophantine equations}
\index{Diophantine equation} \index{Diophantine equation}
A \key{Diophantine equation} A \key{Diophantine equation}
@ -529,12 +540,12 @@ A \key{Diophantine equation}
is an equation of the form is an equation of the form
\[ ax + by = c, \] \[ ax + by = c, \]
where $a$, $b$ and $c$ are constants where $a$, $b$ and $c$ are constants
and we should find the values of $x$ and $y$. and the values of $x$ and $y$ should be found.
Each number in the equation has to be an integer. Each number in the equation has to be an integer.
For example, one solution for the equation For example, one solution for the equation
$5x+2y=11$ is $x=3$ and $y=-2$. $5x+2y=11$ is $x=3$ and $y=-2$.
\index{Euclid's algorithm} \index{extended Euclid's algorithm}
We can efficiently solve a Diophantine equation We can efficiently solve a Diophantine equation
by using Euclid's algorithm. by using Euclid's algorithm.
@ -548,11 +559,7 @@ ax + by = \textrm{gcd}(a,b)
A Diophantine equation can be solved if A Diophantine equation can be solved if
$c$ is divisible by $c$ is divisible by
$\textrm{gcd}(a,b)$, $\textrm{gcd}(a,b)$,
and otherwise the equation cannot be solved. and otherwise it cannot be solved.
\index{extended Euclid's algorithm}
\subsubsection*{Extended Euclid's algorithm}
As an example, let us find numbers $x$ and $y$ As an example, let us find numbers $x$ and $y$
that satisfy the following equation: that satisfy the following equation:
@ -588,7 +595,7 @@ so a solution to the equation is
$x=8$ and $y=-20$. $x=8$ and $y=-20$.
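The procedure can be sketched in code as follows. This is one possible formulation of the extended Euclid's algorithm, not necessarily the one used elsewhere in the book; the names `extended_gcd` and `solve_diophantine` are hypothetical, and the code assumes that $a$ and $b$ are nonnegative and not both zero.

```cpp
// Returns g = gcd(a, b) and sets x and y so that a*x + b*y = g.
long long extended_gcd(long long a, long long b, long long& x, long long& y) {
    if (b == 0) {
        x = 1;
        y = 0;
        return a;
    }
    long long x1, y1;
    long long g = extended_gcd(b, a % b, x1, y1);
    // Here b*x1 + (a % b)*y1 = g and a % b = a - (a/b)*b.
    x = y1;
    y = x1 - (a / b) * y1;
    return g;
}

// Finds one solution to a*x + b*y = c, or returns false
// if c is not divisible by gcd(a, b).
bool solve_diophantine(long long a, long long b, long long c,
                       long long& x, long long& y) {
    long long x0, y0;
    long long g = extended_gcd(a, b, x0, y0);
    if (g == 0 || c % g != 0) return false;
    x = x0 * (c / g);
    y = y0 * (c / g);
    return true;
}
```

The solution found depends on the order of the recursive calls, so it may differ from a solution found by hand, but it always satisfies the equation.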
A solution to a Diophantine equation is not unique, A solution to a Diophantine equation is not unique,
but we can form an infinite number of solutions because we can form an infinite number of solutions
if we know one solution. if we know one solution.
If a pair $(x,y)$ is a solution, then also all pairs If a pair $(x,y)$ is a solution, then also all pairs
\[(x+\frac{kb}{\textrm{gcd}(a,b)},y-\frac{ka}{\textrm{gcd}(a,b)})\] \[(x+\frac{kb}{\textrm{gcd}(a,b)},y-\frac{ka}{\textrm{gcd}(a,b)})\]
@ -621,7 +628,7 @@ because
\[X_k {X_k}^{-1}_{m_k} \bmod m_k = 1.\] \[X_k {X_k}^{-1}_{m_k} \bmod m_k = 1.\]
Since all other terms in the sum are divisible by $m_k$, Since all other terms in the sum are divisible by $m_k$,
they have no effect on the remainder, they have no effect on the remainder,
and the remainder by $m_k$ for the whole sum is $a_k$. and $x \bmod m_k = a_k$.
For example, a solution for For example, a solution for
\[ \[

@ -8,7 +8,7 @@ Usually, the goal is to find a way to
count the combinations efficiently count the combinations efficiently
without generating each combination separately. without generating each combination separately.
As an example, let us consider the problem As an example, consider the problem
of counting the number of ways to of counting the number of ways to
represent an integer $n$ as a sum of positive integers. represent an integer $n$ as a sum of positive integers.
For example, there are 8 representations For example, there are 8 representations
@ -35,27 +35,28 @@ The values of the function
can be recursively calculated as follows: can be recursively calculated as follows:
\begin{equation*} \begin{equation*}
f(n) = \begin{cases} f(n) = \begin{cases}
1 & n = 1\\ 1 & n = 0\\
f(1)+f(2)+\ldots+f(n-1)+1 & n > 1\\ f(0)+f(1)+\cdots+f(n-1) & n > 0\\
\end{cases} \end{cases}
\end{equation*} \end{equation*}
The base case is $f(1)=1$, The base case is $f(0)=1$,
because there is only one way to represent the number 1. because the empty sum represents the number 0.
When $n>1$, we go through all ways to Then, if $n>0$, we consider all ways to
choose the last number in the sum. choose the first number of the sum.
For example, in when $n=4$, the sum can end If the first number is $k$,
with $+1$, $+2$ or $+3$. there are $f(n-k)$ representations
In addition, we also count the representation for the remaining part of the sum.
that only contains $n$. Thus, we calculate the sum of all values
of the form $f(n-k)$ where $k<n$.
The first values for the function are: The first values for the function are:
\[ \[
\begin{array}{lcl} \begin{array}{lcl}
f(0) & = & 1 \\
f(1) & = & 1 \\ f(1) & = & 1 \\
f(2) & = & 2 \\ f(2) & = & 2 \\
f(3) & = & 4 \\ f(3) & = & 4 \\
f(4) & = & 8 \\ f(4) & = & 8 \\
f(5) & = & 16 \\
\end{array} \end{array}
\] \]
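The recurrence with the base case $f(0)=1$ can be evaluated directly, as in the following sketch. The function name `count_sums` is an assumption for this example, and a running prefix sum is used so that each value takes constant time; the values overflow a `long long` for large $n$.

```cpp
#include <vector>

// Returns the values f(0), f(1), ..., f(n), where f(0) = 1 and
// f(i) = f(0) + f(1) + ... + f(i-1) for i > 0.
std::vector<long long> count_sums(int n) {
    std::vector<long long> f(n + 1, 0);
    f[0] = 1;
    long long prefix = 1;  // f(0) + ... + f(i-1)
    for (int i = 1; i <= n; i++) {
        f[i] = prefix;
        prefix += f[i];
    }
    return f;
}
```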
It turns out that the function also has a closed-form formula It turns out that the function also has a closed-form formula
@ -134,7 +135,8 @@ The sum of binomial coefficients is
\] \]
The reason for the name ''binomial coefficient'' The reason for the name ''binomial coefficient''
is that can be seen when the binomial $(a+b)$ is raised to
the $n$th power:
\[ (a+b)^n = \[ (a+b)^n =
{n \choose 0} a^n b^0 + {n \choose 0} a^n b^0 +
@ -314,13 +316,14 @@ there are 6 solutions:
In this scenario, we can assume that In this scenario, we can assume that
$k$ balls are initially placed in boxes $k$ balls are initially placed in boxes
and there is an empty box between each and there is an empty box between each
two such boxes. pair of two adjacent boxes.
The remaining task is to choose the The remaining task is to choose the
positions for positions for the remaining empty boxes.
$n-k-(k-1)=n-2k+1$ empty boxes. There are $n-2k+1$ such boxes and
There are $k+1$ positions, $k+1$ positions for them.
so the number of solutions is Thus, using the formula of scenario 2,
${n-2k+1+k+1-1 \choose n-2k+1} = {n-k+1 \choose n-2k+1}$. the number of solutions is
${n-k+1 \choose n-2k+1}$.
\subsubsection{Multinomial coefficients} \subsubsection{Multinomial coefficients}
@ -348,10 +351,10 @@ number of valid
parenthesis expressions that consist of parenthesis expressions that consist of
$n$ left parentheses and $n$ right parentheses. $n$ left parentheses and $n$ right parentheses.
For example, $C_3=5$, because using three For example, $C_3=5$, because
left parentheses and three right parentheses,
we can construct the following parenthesis we can construct the following parenthesis
expressions: expressions using three
left parentheses and three right parentheses:
\begin{itemize}[noitemsep] \begin{itemize}[noitemsep]
\item \texttt{()()()} \item \texttt{()()()}
@ -370,7 +373,7 @@ The following rules precisely define all
valid parenthesis expressions: valid parenthesis expressions:
\begin{itemize} \begin{itemize}
\item The empty expression is valid. \item An empty parenthesis expression is valid.
\item If an expression $A$ is valid, \item If an expression $A$ is valid,
then also the expression then also the expression
\texttt{(}$A$\texttt{)} is valid. \texttt{(}$A$\texttt{)} is valid.
@ -402,11 +405,11 @@ of parentheses and the number of expressions
is the product of the following values: is the product of the following values:
\begin{itemize} \begin{itemize}
\item $C_{i}$: number of ways to construct an expression \item $C_{i}$: the number of ways to construct an expression
using the parentheses in the first part, using the parentheses of the first part,
not counting the outermost parentheses not counting the outermost parentheses
\item $C_{n-i-1}$: number of ways to construct an \item $C_{n-i-1}$: the number of ways to construct an
expression using the parentheses in the second part expression using the parentheses of the second part
\end{itemize} \end{itemize}
In addition, the base case is $C_0=1$, In addition, the base case is $C_0=1$,
because we can construct an empty parenthesis because we can construct an empty parenthesis
@ -656,7 +659,7 @@ recursive formula:
\end{cases} \end{cases}
\end{equation*} \end{equation*}
The formula can be derived by going through The formula can be derived by considering
the possibilities how the element 1 changes the possibilities how the element 1 changes
in the derangement. in the derangement.
There are $n-1$ ways to choose an element $x$ There are $n-1$ ways to choose an element $x$
@ -695,8 +698,7 @@ remain unchanged when the $k$th way is applied.
As an example, let us calculate the number of As an example, let us calculate the number of
necklaces of $n$ pearls, necklaces of $n$ pearls,
where the color of each pearl is where each pearl has $m$ possible colors.
one of $1,2,\ldots,m$.
Two necklaces are symmetric if they are Two necklaces are symmetric if they are
similar after rotating them. similar after rotating them.
For example, the necklace For example, the necklace
@ -749,7 +751,7 @@ pearl has the same color remain the same.
More generally, when the number of steps is $k$, More generally, when the number of steps is $k$,
a total of a total of
\[m^{\textrm{gcd}(k,n)},\] \[m^{\textrm{gcd}(k,n)}\]
necklaces remain the same, necklaces remain the same,
where $\textrm{gcd}(k,n)$ is the greatest common where $\textrm{gcd}(k,n)$ is the greatest common
divisor of $k$ and $n$. divisor of $k$ and $n$.
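Averaging the values $m^{\textrm{gcd}(k,n)}$ over the $n$ rotations, as Burnside's lemma prescribes, gives the number of necklaces. The following is a small sketch of that calculation; the names `gcd_ll` and `count_necklaces` are hypothetical, and the code assumes that $m^n$ fits in a `long long`.

```cpp
// Greatest common divisor by Euclid's algorithm.
long long gcd_ll(long long a, long long b) {
    return b == 0 ? a : gcd_ll(b, a % b);
}

// Number of necklaces of n pearls with m possible colors,
// where necklaces that differ only by a rotation are counted once.
long long count_necklaces(int n, int m) {
    long long total = 0;
    for (int k = 0; k < n; k++) {
        // A rotation of k steps keeps m^gcd(k,n) necklaces unchanged
        // (k = 0 is the identity, where gcd(0,n) = n).
        long long g = gcd_ll(k, n);
        long long fixed = 1;
        for (long long i = 0; i < g; i++) fixed *= m;
        total += fixed;
    }
    return total / n;
}
```

For example, with $n=4$ and $m=3$ the sum is $81+3+9+3=96$, so there are $96/4=24$ necklaces.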

@ -132,8 +132,8 @@ whose elements are calculated using the formula
AB[i,j] = \sum_{k=1}^n A[i,k] \cdot B[k,j]. AB[i,j] = \sum_{k=1}^n A[i,k] \cdot B[k,j].
\] \]
The idea is that each element in $AB$ The idea is that each element of $AB$
is a sum of products of elements in $A$ and $B$ is a sum of products of elements of $A$ and $B$
according to the following picture: according to the following picture:
\begin{center} \begin{center}
@ -248,10 +248,10 @@ for matrix multiplication\footnote{The first such
algorithm was Strassen's algorithm, algorithm was Strassen's algorithm,
published in 1969 \cite{str69}, published in 1969 \cite{str69},
whose time complexity is $O(n^{2.80735})$; whose time complexity is $O(n^{2.80735})$;
the best current algorithm the best current algorithm \cite{gal14}
works in $O(n^{2.37286})$ time \cite{gal14}.}, works in $O(n^{2.37286})$ time.},
but they are mostly of theoretical interest but they are mostly of theoretical interest
and such special algorithms are not needed and such algorithms are not necessary
in competitive programming. in competitive programming.
@ -424,15 +424,15 @@ For example,
\index{linear recurrence} \index{linear recurrence}
A \key{linear recurrence} A \key{linear recurrence}
can be represented as a function $f(n)$ is a function $f(n)$
such that the initial values are whose initial values are
$f(0),f(1),\ldots,f(k-1)$ $f(0),f(1),\ldots,f(k-1)$
and the larger values and larger values
are calculated recursively using the formula are calculated recursively using the formula
\[f(n) = c_1 f(n-1) + c_2 f(n-2) + \ldots + c_k f (n-k),\] \[f(n) = c_1 f(n-1) + c_2 f(n-2) + \ldots + c_k f (n-k),\]
where $c_1,c_2,\ldots,c_k$ are constant coefficients. where $c_1,c_2,\ldots,c_k$ are constant coefficients.
We can use dynamic programming to calculate Dynamic programming can be used to calculate
any value of $f(n)$ in $O(kn)$ time by calculating any value of $f(n)$ in $O(kn)$ time by calculating
all values of $f(0),f(1),\ldots,f(n)$ one after another. all values of $f(0),f(1),\ldots,f(n)$ one after another.
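The $O(kn)$ dynamic programming approach can be sketched as follows. The function name `linear_recurrence` and its parameter layout are assumptions for this example: `init` holds $f(0),\ldots,f(k-1)$ and `c` holds $c_1,\ldots,c_k$, so both vectors are assumed to have the same length $k$.

```cpp
#include <vector>

// Calculates f(n) for the recurrence
// f(n) = c_1 f(n-1) + c_2 f(n-2) + ... + c_k f(n-k)
// by computing all values f(0), f(1), ..., f(n) in order.
long long linear_recurrence(const std::vector<long long>& init,
                            const std::vector<long long>& c, int n) {
    int k = init.size();
    if (n < k) return init[n];
    std::vector<long long> f(init);
    f.resize(n + 1, 0);
    for (int i = k; i <= n; i++) {
        for (int j = 0; j < k; j++) {
            f[i] += c[j] * f[i - 1 - j];  // c[j] is the coefficient c_{j+1}
        }
    }
    return f[n];
}
```

With `init = {0, 1}` and `c = {1, 1}` the function produces Fibonacci numbers.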
However, if $k$ is small, it is possible to calculate However, if $k$ is small, it is possible to calculate
@ -455,7 +455,8 @@ f(n) & = & f(n-1)+f(n-2) \\
In this case, $k=2$ and $c_1=c_2=1$. In this case, $k=2$ and $c_1=c_2=1$.
\begin{samepage} \begin{samepage}
The idea is to represent the To efficiently calculate Fibonacci numbers,
we represent the
Fibonacci formula as a Fibonacci formula as a
square matrix $X$ of size $2 \times 2$, square matrix $X$ of size $2 \times 2$,
for which the following holds: for which the following holds:
@ -670,8 +671,9 @@ $2 \rightarrow 6 \rightarrow 3 \rightarrow 2 \rightarrow 5$.
\subsubsection{Shortest paths} \subsubsection{Shortest paths}
Using a similar idea in a weighted graph, Using a similar idea in a weighted graph,
we can calculate for each pair of nodes the shortest we can calculate for each pair of nodes the minimum
path between them that contains exactly $n$ edges. length of a path
between them that contains exactly $n$ edges.
To calculate this, we have to define matrix multiplication To calculate this, we have to define matrix multiplication
in a new way, so that we do not calculate the numbers in a new way, so that we do not calculate the numbers
of paths but minimize the lengths of paths. of paths but minimize the lengths of paths.
@ -740,9 +742,10 @@ V^4= \begin{bmatrix}
\infty & \infty & 12 & 13 & 11 & \infty \\ \infty & \infty & 12 & 13 & 11 & \infty \\
\end{bmatrix}, \end{bmatrix},
\] \]
we can conclude that the shortest path of 4 edges we can conclude that the minimum length of a path
from node 2 to node 5 has length 8. of 4 edges
This path is from node 2 to node 5 is 8.
Such a path is
$2 \rightarrow 1 \rightarrow 4 \rightarrow 2 \rightarrow 5$. $2 \rightarrow 1 \rightarrow 4 \rightarrow 2 \rightarrow 5$.
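The modified matrix multiplication, where sums are replaced by minima and products by sums, might be implemented as in the sketch below. The names `Matrix`, `INF` and `min_plus` are assumptions for this example, and `INF` marks a missing edge or path.

```cpp
#include <algorithm>
#include <vector>

typedef std::vector<std::vector<long long> > Matrix;
const long long INF = 1000000000000000000LL;  // stands for "no path"

// Min-plus product: C[i][j] = min over k of A[i][k] + B[k][j].
// If A and B contain minimum path lengths for p and q edges,
// the result contains minimum path lengths for p+q edges.
Matrix min_plus(const Matrix& A, const Matrix& B) {
    int n = A.size();
    Matrix C(n, std::vector<long long>(n, INF));
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            for (int k = 0; k < n; k++) {
                if (A[i][k] == INF || B[k][j] == INF) continue;
                C[i][j] = std::min(C[i][j], A[i][k] + B[k][j]);
            }
        }
    }
    return C;
}
```

Combined with repeated squaring, this allows computing the matrix for paths of exactly $n$ edges with $O(\log n)$ such products.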
\subsubsection{Kirchhoff's theorem} \subsubsection{Kirchhoff's theorem}
@ -819,7 +822,8 @@ L= \begin{bmatrix}
\end{bmatrix} \end{bmatrix}
\] \]
The number of spanning trees equals It can be shown that
the number of spanning trees equals
the determinant of a matrix that is obtained the determinant of a matrix that is obtained
when we remove any row and any column from $L$. when we remove any row and any column from $L$.
For example, if we remove the first row For example, if we remove the first row
@ -835,8 +839,8 @@ and column, the result is
The determinant is always the same, The determinant is always the same,
regardless of which row and column we remove from $L$. regardless of which row and column we remove from $L$.
Note that a special case of Kirchhoff's theorem Note that Cayley's formula in Chapter 22.5 is
is Cayley's formula in Chapter 22.5, a special case of Kirchhoff's theorem,
because in a complete graph of $n$ nodes because in a complete graph of $n$ nodes
\[ \det( \[ \det(

@ -13,8 +13,7 @@ where the three dots describe the event.
For example, when throwing a dice, For example, when throwing a dice,
the outcome is an integer between $1$ and $6$, the outcome is an integer between $1$ and $6$,
and it is assumed that the probability of and the probability of each outcome is $1/6$.
each outcome is $1/6$.
For example, we can calculate the following probabilities: For example, we can calculate the following probabilities:
\begin{itemize}[noitemsep] \begin{itemize}[noitemsep]
@ -56,9 +55,9 @@ Thus, the probability of the event is
Another way to calculate the probability is Another way to calculate the probability is
to simulate the process that generates the event. to simulate the process that generates the event.
In this case, we draw three cards, so the process In this example, we draw three cards, so the process
consists of three steps. consists of three steps.
We require that each step in the process is successful. We require that each step of the process is successful.
Drawing the first card certainly succeeds, Drawing the first card certainly succeeds,
because there are no restrictions. because there are no restrictions.
@ -73,7 +72,7 @@ The probability that the entire process succeeds is
\section{Events} \section{Events}
An event in probability can be represented as a set An event in probability theory can be represented as a set
\[A \subset X,\] \[A \subset X,\]
where $X$ contains all possible outcomes where $X$ contains all possible outcomes
and $A$ is a subset of outcomes. and $A$ is a subset of outcomes.
@ -85,7 +84,7 @@ corresponds to the set
Each outcome $x$ is assigned a probability $p(x)$. Each outcome $x$ is assigned a probability $p(x)$.
Furthermore, the probability $P(A)$ of an event Furthermore, the probability $P(A)$ of an event
that corresponds to a set $A$ can be calculated as a sum $A$ can be calculated as a sum
of probabilities of outcomes using the formula of probabilities of outcomes using the formula
\[P(A) = \sum_{x \in A} p(x).\] \[P(A) = \sum_{x \in A} p(x).\]
For example, when throwing a dice, For example, when throwing a dice,
@ -97,7 +96,7 @@ so the probability of the event
The total probability of the outcomes in $X$ must The total probability of the outcomes in $X$ must
be 1, i.e., $P(X)=1$. be 1, i.e., $P(X)=1$.
Since the events in probability are sets, Since the events in probability theory are sets,
we can manipulate them using standard set operations: we can manipulate them using standard set operations:
\begin{itemize} \begin{itemize}
@ -166,7 +165,7 @@ The \key{conditional probability}
\[P(A | B) = \frac{P(A \cap B)}{P(B)}\] \[P(A | B) = \frac{P(A \cap B)}{P(B)}\]
is the probability of $A$ is the probability of $A$
assuming that $B$ happens. assuming that $B$ happens.
In this situation, when calculating the Hence, when calculating the
probability of $A$, we only consider the outcomes probability of $A$, we only consider the outcomes
that also belong to $B$. that also belong to $B$.
@ -313,17 +312,17 @@ values $a,a+1,\ldots,b$ and the probability of each value is $1/n$.
For example, when throwing a dice, For example, when throwing a dice,
$a=1$, $b=6$ and $P(X=x)=1/6$ for each value $x$. $a=1$, $b=6$ and $P(X=x)=1/6$ for each value $x$.
The expected value for $X$ in a uniform distribution is The expected value of $X$ in a uniform distribution is
\[E[X] = \frac{a+b}{2}.\] \[E[X] = \frac{a+b}{2}.\]
\index{binomial distribution} \index{binomial distribution}
~\\
In a \key{binomial distribution}, $n$ attempts In a \key{binomial distribution}, $n$ attempts
are made are made
and the probability that a single attempt succeeds and the probability that a single attempt succeeds
is $p$. is $p$.
The random variable $X$ counts the number of The random variable $X$ counts the number of
successful attempts, successful attempts,
and the probability for a value $x$ is and the probability of a value $x$ is
\[P(X=x)=p^x (1-p)^{n-x} {n \choose x},\] \[P(X=x)=p^x (1-p)^{n-x} {n \choose x},\]
where $p^x$ and $(1-p)^{n-x}$ correspond to where $p^x$ and $(1-p)^{n-x}$ correspond to
successful and unsuccessful attempts, successful and unsuccessful attempts,
@ -334,25 +333,25 @@ For example, when throwing a dice ten times,
the probability of throwing a six exactly the probability of throwing a six exactly
three times is $(1/6)^3 (5/6)^7 {10 \choose 3}$. three times is $(1/6)^3 (5/6)^7 {10 \choose 3}$.
The expected value for $X$ in a binomial distribution is The expected value of $X$ in a binomial distribution is
\[E[X] = pn.\] \[E[X] = pn.\]
\index{geometric distribution} \index{geometric distribution}
~\\
In a \key{geometric distribution}, In a \key{geometric distribution},
the probability that an attempt succeeds is $p$, the probability that an attempt succeeds is $p$,
and we continue until the first success happens. and we continue until the first success happens.
The random variable $X$ counts the number The random variable $X$ counts the number
of attempts needed, and the probability for of attempts needed, and the probability of
a value $x$ is a value $x$ is
\[P(X=x)=(1-p)^{x-1} p,\] \[P(X=x)=(1-p)^{x-1} p,\]
where $(1-p)^{x-1}$ corresponds to unsuccessful attempts where $(1-p)^{x-1}$ corresponds to the unsuccessful attempts
and $p$ corresponds to the first successful attempt. and $p$ corresponds to the first successful attempt.
For example, if we throw a dice until we throw a six, For example, if we throw a dice until we throw a six,
the probability that the number of throws the probability that the number of throws
is exactly 4 is $(5/6)^3 1/6$. is exactly 4 is $(5/6)^3 1/6$.
The expected value for $X$ in a geometric distribution is The expected value of $X$ in a geometric distribution is
\[E[X]=\frac{1}{p}.\] \[E[X]=\frac{1}{p}.\]
\section{Markov chains} \section{Markov chains}
@ -369,7 +368,7 @@ for moving to other states.
A Markov chain can be represented as a graph A Markov chain can be represented as a graph
whose nodes are states and edges are transitions. whose nodes are states and edges are transitions.
As an example, let us consider a problem As an example, consider a problem
where we are in floor 1 in an $n$ floor building. where we are in floor 1 in an $n$ floor building.
At each step, we randomly walk either one floor At each step, we randomly walk either one floor
up or one floor down, except that we always up or one floor down, except that we always
@ -420,11 +419,11 @@ $[1/2,0,1/2,0,0]$, and so on.
An efficient way to simulate the walk in An efficient way to simulate the walk in
a Markov chain is to use dynamic programming. a Markov chain is to use dynamic programming.
The idea is to maintain the probability distribution The idea is to maintain the probability distribution,
and at each step go through all possibilities and at each step go through all possibilities
how we can move. how we can move.
Using this method, we can simulate $m$ steps Using this method, we can simulate
in $O(n^2 m)$ time. a walk of $m$ steps in $O(n^2 m)$ time.
The transitions of a Markov chain can also be The transitions of a Markov chain can also be
represented as a matrix that updates the represented as a matrix that updates the
@ -511,7 +510,7 @@ The $kth$ \key{order statistic} of an array
is the element at position $k$ after sorting is the element at position $k$ after sorting
the array in increasing order. the array in increasing order.
It is easy to calculate any order statistic It is easy to calculate any order statistic
in $O(n \log n)$ time by sorting the array, in $O(n \log n)$ time by first sorting the array,
but is it really needed to sort the entire array but is it really needed to sort the entire array
just to find one element? just to find one element?
@ -526,16 +525,16 @@ its running time is usually $O(n)$
but $O(n^2)$ in the worst case. but $O(n^2)$ in the worst case.
The algorithm chooses a random element $x$ The algorithm chooses a random element $x$
in the array, and moves elements smaller than $x$ of the array, and moves elements smaller than $x$
to the left part of the array, to the left part of the array,
and all other elements to the right part of the array. and all other elements to the right part of the array.
This takes $O(n)$ time when there are $n$ elements. This takes $O(n)$ time when there are $n$ elements.
Assume that the left part contains $a$ elements Assume that the left part contains $a$ elements
and the right part contains $b$ elements. and the right part contains $b$ elements.
If $a=k-1$, element $x$ is the $k$th order statistic. If $a=k$, element $x$ is the $k$th order statistic.
Otherwise, if $a>k-1$, we recursively find the $k$th order Otherwise, if $a>k$, we recursively find the $k$th order
statistic for the left part, statistic for the left part,
and if $a<k-1$, we recursively find the $r$th order and if $a<k$, we recursively find the $r$th order
statistic for the right part where $r=k-a$. statistic for the right part where $r=k-a$.
The search continues in a similar way, until the element The search continues in a similar way, until the element
has been found. has been found.
@ -544,9 +543,9 @@ When each element $x$ is randomly chosen,
the size of the array about halves at each step, the size of the array about halves at each step,
so the time complexity for so the time complexity for
finding the $k$th order statistic is about finding the $k$th order statistic is about
\[n+n/2+n/4+n/8+\cdots=O(n).\] \[n+n/2+n/4+n/8+\cdots \le 2n = O(n).\]
The worst case for the algorithm is still $O(n^2)$, The worst case of the algorithm still requires $O(n^2)$ time,
because it is possible that $x$ is always chosen because it is possible that $x$ is always chosen
in such a way that it is one of the smallest or largest in such a way that it is one of the smallest or largest
elements in the array and $O(n)$ steps are needed. elements in the array and $O(n)$ steps are needed.
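A possible iterative implementation is sketched below. It deviates from the description above in one respect: it uses a three-way partition (smaller, equal, larger) so that repeated elements are handled cleanly. The function name `kth_order_statistic` is hypothetical, $k$ is 1-indexed, and the code assumes $1 \le k \le n$.

```cpp
#include <random>
#include <utility>
#include <vector>

// Returns the kth smallest element of v (k = 1 is the minimum).
// The vector is taken by value, so the caller's data is not modified.
int kth_order_statistic(std::vector<int> v, int k) {
    std::mt19937 rng(12345);  // fixed seed, chosen only for reproducibility
    int lo = 0, hi = (int)v.size() - 1;
    int target = k - 1;       // position of the answer in sorted order
    while (true) {
        std::uniform_int_distribution<int> pick(lo, hi);
        int pivot = v[pick(rng)];
        // Three-way partition of v[lo..hi] around the pivot value.
        int lt = lo, i = lo, gt = hi;
        while (i <= gt) {
            if (v[i] < pivot) std::swap(v[lt++], v[i++]);
            else if (v[i] > pivot) std::swap(v[i], v[gt--]);
            else i++;
        }
        // Now v[lo..lt-1] < pivot, v[lt..gt] == pivot, v[gt+1..hi] > pivot.
        if (target < lt) hi = lt - 1;
        else if (target > gt) lo = gt + 1;
        else return pivot;
    }
}
```

In practice, the standard library function `std::nth_element` offers the same functionality.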

@ -8,12 +8,12 @@ no matter what the opponent does,
if such a strategy exists. if such a strategy exists.
It turns out that there is a general strategy It turns out that there is a general strategy
for all such games, for such games,
and we can analyze the games using the \key{nim theory}. and we can analyze the games using the \key{nim theory}.
First, we will analyze simple games where First, we will analyze simple games where
players remove sticks from heaps, players remove sticks from heaps,
and after this, we will generalize the strategy and after this, we will generalize the strategy
used in those games to all other games. used in those games to other games.
\section{Game states} \section{Game states}
@ -252,7 +252,7 @@ where $\oplus$ is the xor operation\footnote{The optimal strategy
for nim was published in 1901 by C. L. Bouton \cite{bou01}.}. for nim was published in 1901 by C. L. Bouton \cite{bou01}.}.
The states whose nim sum is 0 are losing states, The states whose nim sum is 0 are losing states,
and all other states are winning states. and all other states are winning states.
For example, the nim sum for For example, the nim sum of
$[10,12,5]$ is $10 \oplus 12 \oplus 5 = 3$, $[10,12,5]$ is $10 \oplus 12 \oplus 5 = 3$,
so the state is a winning state. so the state is a winning state.
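As a minimal sketch, the nim sum and the winning/losing classification can be computed as follows. The function names \texttt{nimSum} and \texttt{isWinningState} are hypothetical, and heap sizes are assumed to fit in an \texttt{int}.

```cpp
#include <vector>

// Xor of all heap sizes (the nim sum of the state).
int nimSum(const std::vector<int>& heaps) {
    int s = 0;
    for (int x : heaps) s ^= x;
    return s;
}

// A state is winning exactly when its nim sum is not 0.
bool isWinningState(const std::vector<int>& heaps) {
    return nimSum(heaps) != 0;
}
```

For the state $[10,12,5]$, the function \texttt{nimSum} returns 3, so \texttt{isWinningState} returns true.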
@ -260,8 +260,6 @@ But how is the nim sum related to the nim game?
We can explain this by looking at how the nim We can explain this by looking at how the nim
sum changes when the nim state changes. sum changes when the nim state changes.
~\\
\noindent
\textit{Losing states:} \textit{Losing states:}
The final state $[0,0,\ldots,0]$ is a losing state, The final state $[0,0,\ldots,0]$ is a losing state,
and its nim sum is 0, as expected. and its nim sum is 0, as expected.
@ -270,8 +268,6 @@ a winning state, because when a single value $x_k$ changes,
the nim sum also changes, so the nim sum the nim sum also changes, so the nim sum
is different from 0 after the move. is different from 0 after the move.
~\\
\noindent
\textit{Winning states:} \textit{Winning states:}
We can move to a losing state if We can move to a losing state if
there is any heap $k$ for which $x_k \oplus s < x_k$. there is any heap $k$ for which $x_k \oplus s < x_k$.
@ -280,10 +276,8 @@ heap $k$ so that it will contain $x_k \oplus s$ sticks,
which will lead to a losing state. which will lead to a losing state.
There is always such a heap, where $x_k$ There is always such a heap, where $x_k$
has a one bit at the position of the leftmost has a one bit at the position of the leftmost
one bit in $s$. one bit of $s$.
~\\
\noindent
As an example, consider the state $[10,12,5]$. As an example, consider the state $[10,12,5]$.
This state is a winning state, This state is a winning state,
because its nim sum is 3. because its nim sum is 3.
@ -291,7 +285,6 @@ Thus, there has to be a move which
leads to a losing state. leads to a losing state.
Next we will find out such a move. Next we will find out such a move.
\begin{samepage}
The nim sum of the state is as follows: The nim sum of the state is as follows:
\begin{center} \begin{center}
@ -303,12 +296,11 @@ The nim sum of the state is as follows:
3 & \texttt{0011} \\ 3 & \texttt{0011} \\
\end{tabular} \end{tabular}
\end{center} \end{center}
\end{samepage}
In this case, the heap with 10 sticks In this case, the heap with 10 sticks
is the only heap that has a one bit is the only heap that has a one bit
at the position of the leftmost at the position of the leftmost
one bit in the nim sum: one bit of the nim sum:
\begin{center} \begin{center}
\begin{tabular}{r|r} \begin{tabular}{r|r}
@ -344,11 +336,11 @@ In a \key{misère game}, the goal of the game
is opposite, is opposite,
so the player who removes the last stick so the player who removes the last stick
loses the game. loses the game.
It turns out that a misère nim game can be It turns out that the misère nim game can be
optimally played almost like the standard nim game. optimally played almost like the standard nim game.
The idea is to first play the misère game The idea is to first play the misère game
like a standard game, but change the strategy like the standard game, but change the strategy
at the end of the game. at the end of the game.
The new strategy will be introduced in a situation The new strategy will be introduced in a situation
where each heap would contain at most one stick where each heap would contain at most one stick
@ -386,7 +378,7 @@ the states and allowed moves, and there is no randomness in the game.
The idea is to calculate for each game state The idea is to calculate for each game state
a Grundy number that corresponds to the number of a Grundy number that corresponds to the number of
sticks in a nim heap. sticks in a nim heap.
When we know the Grundy numbers for all states, When we know the Grundy numbers of all states,
we can play the game like the nim game. we can play the game like the nim game.
\subsubsection{Grundy numbers} \subsubsection{Grundy numbers}
@ -394,10 +386,10 @@ we can play the game like the nim game.
\index{Grundy number} \index{Grundy number}
\index{mex function} \index{mex function}
The \key{Grundy number} for a game state is The \key{Grundy number} of a game state is
\[\textrm{mex}(\{g_1,g_2,\ldots,g_n\}),\] \[\textrm{mex}(\{g_1,g_2,\ldots,g_n\}),\]
where $g_1,g_2,\ldots,g_n$ are Grundy numbers for where $g_1,g_2,\ldots,g_n$ are the Grundy numbers of the
states to which we can move from the state, states to which we can move,
and the mex function gives the smallest and the mex function gives the smallest
nonnegative number that is not in the set. nonnegative number that is not in the set.
For example, $\textrm{mex}(\{0,1,3\})=2$. For example, $\textrm{mex}(\{0,1,3\})=2$.
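The mex function can be sketched as follows, assuming that the Grundy numbers of the reachable states have been collected into a \texttt{std::set}. The function name \texttt{mex} follows the text, but the signature is an assumption.

```cpp
#include <set>

// Smallest nonnegative integer that is not in the set.
int mex(const std::set<int>& values) {
    int m = 0;
    while (values.count(m)) m++;
    return m;
}
```

A state with no possible moves has an empty set of reachable Grundy numbers, so its Grundy number is $\textrm{mex}(\emptyset)=0$.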
@ -457,9 +449,7 @@ and if the Grundy number is $x>0$, we can move
to states whose Grundy numbers include all numbers to states whose Grundy numbers include all numbers
$0,1,\ldots,x-1$. $0,1,\ldots,x-1$.
~\\ As an example, consider a game where
\noindent
As an example, let us consider a game where
the players move a figure in a maze. the players move a figure in a maze.
Each square in the maze is either floor or wall. Each square in the maze is either floor or wall.
On each turn, the player has to move On each turn, the player has to move
@ -468,7 +458,6 @@ of steps left or up.
The winner of the game is the player who The winner of the game is the player who
makes the last move. makes the last move.
\begin{samepage}
The following picture shows a possible initial state The following picture shows a possible initial state
of the game, where @ denotes the figure and * of the game, where @ denotes the figure and *
denotes a square where it can move. denotes a square where it can move.
@ -495,11 +484,10 @@ denotes a square where it can move.
\end{scope} \end{scope}
\end{tikzpicture} \end{tikzpicture}
\end{center} \end{center}
\end{samepage}
The states of the game are all floor squares The states of the game are all floor squares
in the maze. of the maze.
In this situation, the Grundy numbers In the above maze, the Grundy numbers
are as follows: are as follows:
\begin{center} \begin{center}
@ -577,11 +565,9 @@ is the nim sum of the Grundy numbers of the subgames.
The game can be played like a nim game by calculating The game can be played like a nim game by calculating
all Grundy numbers for subgames and then their nim sum. all Grundy numbers for subgames and then their nim sum.
~\\
\noindent
As an example, consider a game that consists As an example, consider a game that consists
of three mazes. of three mazes.
In this game, on each turn the player chooses one In this game, on each turn, the player chooses one
of the mazes and then moves the figure in the maze. of the mazes and then moves the figure in the maze.
Assume that the initial state of the game is as follows: Assume that the initial state of the game is as follows:
@ -762,7 +748,7 @@ $0 \oplus 3 \oplus 3 = 0$.
\subsubsection{Grundy's game} \subsubsection{Grundy's game}
Sometimes a move in the game divides the game Sometimes a move in a game divides the game
into subgames that are independent of each other. into subgames that are independent of each other.
In this case, the Grundy number of the game is In this case, the Grundy number of the game is