Improve language

2017-05-09 23:32:59 +03:00 · 2017-05-09 23:32:59 +03:00 · bf949a8f8c
parent 5a298088b9
commit bf949a8f8c
5 changed files with 144 additions and 146 deletions
--- a/chapter21.tex
+++ b/chapter21.tex
@ -9,7 +9,7 @@ because many questions involving integers
 are very difficult to solve even if they
 seem simple at first glance.
-As an example, let us consider the following equation:
+As an example, consider the following equation:
 \[x^3 + y^3 + z^3 = 33\]
 It is easy to find three real numbers $x$, $y$ and $z$
 that satisfy the equation.
@ -21,10 +21,10 @@ y = \sqrt[3]{3}, \\
 z = \sqrt[3]{3}.\\
 \end{array}
 \]
-However, nobody knows if there are any three
+However, it is an open problem in number theory
+if there are any three
 \emph{integers} $x$, $y$ and $z$
-that would satisfy the equation, but this
-is an open problem in number theory \cite{bec07}.
+that would satisfy the equation \cite{bec07}.
 In this chapter, we will focus on basic concepts
 and algorithms in number theory.
@ -51,7 +51,7 @@ A number $n>1$ is a \key{prime}
 if its only positive factors are 1 and $n$.
 For example, 7, 19 and 41 are primes,
 but 35 is not a prime, because $5 \cdot 7 = 35$.
-For each number $n>1$, there is a unique
+For every number $n>1$, there is a unique
 \key{prime factorization}
 \[ n = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k},\]
 where $p_1,p_2,\ldots,p_k$ are distinct primes and
@ -87,7 +87,7 @@ and the product of the factors is $\mu(84)=84^6=351298031616$.
 \index{perfect number}
-A number $n$ is \key{perfect} if $n=\sigma(n)-n$,
+A number $n$ is called a \key{perfect number} if $n=\sigma(n)-n$,
 i.e., $n$ equals the sum of its factors
 between $1$ and $n-1$.
 For example, 28 is a perfect number,
@ -211,13 +211,13 @@ algorithm that builds an array using which we
 can efficiently check if a given number between $2 \ldots n$
 is prime and, if it is not, find one prime factor of the number.
-The algorithm builds an array $\texttt{a}$
+The algorithm builds an array $\texttt{sieve}$
 whose positions $2,3,\ldots,n$ are used.
-The value $\texttt{a}[k]=0$ means
+The value $\texttt{sieve}[k]=0$ means
 that $k$ is prime,
-and the value $\texttt{a}[k] \neq 0$
+and the value $\texttt{sieve}[k] \neq 0$
 means that $k$ is not a prime and one
-of its prime factors is $\texttt{a}[k]$.
+of its prime factors is $\texttt{sieve}[k]$.
 The algorithm iterates through the numbers
 $2 \ldots n$ one by one.
@ -279,31 +279,30 @@ For example, if $n=20$, the array is as follows:
 The following code implements the sieve of
 Eratosthenes.
-The code assumes that each element in
+The code assumes that each element of
-\texttt{a} is initially zero.
+\texttt{sieve} is initially zero.
 \begin{lstlisting}
 for (int x = 2; x <= n; x++) {
-    if (a[x]) continue;
+    if (sieve[x]) continue;
    for (int u = 2*x; u <= n; u += x) {
-        a[u] = x;
+        sieve[u] = x;
    }
 }
 \end{lstlisting}
-The inner loop of the algorithm will be executed
+The inner loop of the algorithm is executed
-$n/x$ times for any $x$.
+$n/x$ times for each value of $x$.
 Thus, an upper bound for the running time
 of the algorithm is the harmonic sum
+\[\sum_{x=2}^n n/x = n/2 + n/3 + n/4 + \cdots + n/n = O(n \log n).\]
 \index{harmonic sum}
-\[\sum_{x=2}^n n/x = n/2 + n/3 + n/4 + \cdots + n/n = O(n \log n).\]
-In fact, the algorithm is even more efficient,
+In fact, the algorithm is more efficient,
 because the inner loop will be executed only if
 the number $x$ is prime.
-It can be shown that the time complexity of the
+It can be shown that the running time of the
 algorithm is only $O(n \log \log n)$,
 a complexity very near to $O(n)$. 
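The sieve described above can be sketched as a self-contained C++ function. This is a minimal illustration, not the book's listing: the names `build_sieve` and `factorize` are assumptions introduced here, and `factorize` is a hypothetical helper showing how the stored prime factors could be used.

```cpp
#include <cassert>
#include <vector>

// Builds the array described above: sieve[k] == 0 means that k is prime,
// otherwise sieve[k] is one prime factor of k. Positions 0 and 1 are unused.
std::vector<int> build_sieve(int n) {
    assert(n >= 2);
    std::vector<int> sieve(n + 1, 0);
    for (int x = 2; x <= n; x++) {
        if (sieve[x]) continue;              // x is not prime, skip it
        for (int u = 2 * x; u <= n; u += x) {
            sieve[u] = x;                    // x is a prime factor of u
        }
    }
    return sieve;
}

// Hypothetical helper: lists the prime factors of k (2 <= k <= n)
// by repeatedly dividing k by the prime factor stored in the sieve.
std::vector<int> factorize(const std::vector<int>& sieve, int k) {
    assert(k >= 2 && k < static_cast<int>(sieve.size()));
    std::vector<int> factors;
    while (sieve[k] != 0) {
        factors.push_back(sieve[k]);
        k /= sieve[k];
    }
    factors.push_back(k);                    // the remaining k is prime
    return factors;
}
```

For example, `factorize(build_sieve(100), 84)` yields the factors 7, 3, 2 and 2, whose product is 84.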
@ -338,11 +337,21 @@ The algorithm is based on the following formula:
               \textrm{gcd}(b,a \bmod b) & b \neq 0\\
           \end{cases}
 \end{equation*}
 For example,
 \[\textrm{gcd}(24,36) = \textrm{gcd}(36,24)
 = \textrm{gcd}(24,12) = \textrm{gcd}(12,0)=12.\]
-The time complexity of Euclid's algorithm
-is $O(\log n)$, where $n=\min(a,b)$.
+
+The algorithm can be implemented as follows:
+\begin{lstlisting}
+int gcd(int a, int b) {
+    if (b == 0) return a;
+    return gcd(b, a%b);
+}
+\end{lstlisting}
+It can be shown that Euclid's algorithm works
+in $O(\log n)$ time, where $n=\min(a,b)$.
 The worst case for the algorithm is
 the case when $a$ and $b$ are consecutive Fibonacci numbers.
 For example,
@ -376,8 +385,8 @@ Note that $\varphi(n)=n-1$ if $n$ is prime.
 \index{modular arithmetic}
 In \key{modular arithmetic},
-the set of available numbers is limited so
+the set of numbers is limited so
-that only numbers $0,1,2,\ldots,m-1$ may be used,
+that only numbers $0,1,2,\ldots,m-1$ are used,
 where $m$ is a constant.
 Each number $x$ is
 represented by the number $x \bmod m$:
@ -385,9 +394,9 @@ the remainder after dividing $x$ by $m$.
 For example, if $m=17$, then $75$
 is represented by $75 \bmod 17 = 7$.
-Often we can take the remainder before doing
+Often we can take remainders before doing
 calculations.
-In particular, the following formulas can be used:
+In particular, the following formulas hold:
 \[
 \begin{array}{rcl}
 (x+y) \bmod m & = & (x \bmod m + y \bmod m) \bmod m \\
@ -484,12 +493,12 @@ If $m$ is prime, the formula becomes
 \[
 x^{-1} = x^{m-2}.
 \]
-For example, if $x=6$ and $m=17$, then
-\[x^{-1}=6^{17-2} \bmod 17 = 3.\]
-Using this formula, we can calculate modular inverses
-efficiently using the modular exponentation algorithm.
-The above formula can be derived using Euler's theorem.
+For example,
+\[6^{-1} \bmod 17 =6^{17-2} \bmod 17 = 3.\]
+This formula allows us to efficiently calculate
+modular inverses using the modular exponentiation algorithm.
+The formula can be derived using Euler's theorem.
 First, the modular inverse should satisfy the following equation:
 \[
 x x^{-1} \bmod m = 1.
@ -522,6 +531,8 @@ cout << x*x << "\n"; // 2537071545
 \section{Solving equations}
 \subsubsection*{Diophantine equations}
 \index{Diophantine equation}
 A \key{Diophantine equation}
@ -529,12 +540,12 @@ A \key{Diophantine equation}
 is an equation of the form
 \[ ax + by = c, \]
 where $a$, $b$ and $c$ are constants
-and we should find the values of $x$ and $y$.
+and the values of $x$ and $y$ should be found.
 Each number in the equation has to be an integer.
 For example, one solution for the equation
 $5x+2y=11$ is $x=3$ and $y=-2$.
-\index{Euclid's algorithm}
+\index{extended Euclid's algorithm}
 We can efficiently solve a Diophantine equation
 by using Euclid's algorithm.
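As a hedged illustration of how Euclid's algorithm can be extended to solve $ax+by=c$, consider the following C++ sketch. The function names `extended_gcd` and `solve_diophantine` are assumptions made for this example, and the sketch only handles positive $a$ and $b$.

```cpp
#include <cassert>
#include <tuple>

// Returns (g, x, y) such that a*x + b*y = g = gcd(a, b), for a, b >= 0.
std::tuple<long long, long long, long long> extended_gcd(long long a, long long b) {
    if (b == 0) return std::make_tuple(a, 1LL, 0LL);
    long long g, x1, y1;
    std::tie(g, x1, y1) = extended_gcd(b, a % b);
    // Here b*x1 + (a % b)*y1 = g, and a % b = a - (a / b) * b.
    return std::make_tuple(g, y1, x1 - (a / b) * y1);
}

// Finds one solution of a*x + b*y = c.
// Returns false if c is not divisible by gcd(a, b), i.e., no solution exists.
bool solve_diophantine(long long a, long long b, long long c,
                       long long& x, long long& y) {
    assert(a > 0 && b > 0);
    long long g, x0, y0;
    std::tie(g, x0, y0) = extended_gcd(a, b);
    if (c % g != 0) return false;
    x = x0 * (c / g);    // scale the solution of a*x + b*y = g
    y = y0 * (c / g);
    return true;
}
```

For the equation $5x+2y=11$, this sketch produces some pair $(x,y)$ with $5x+2y=11$; it need not be the pair $x=3$, $y=-2$ mentioned above, because the solution is not unique.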
@ -548,11 +559,7 @@ ax + by = \textrm{gcd}(a,b)
 A Diophantine equation can be solved if
 $c$ is divisible by
 $\textrm{gcd}(a,b)$,
-and otherwise the equation cannot be solved.
+and otherwise it cannot be solved.
-\index{extended Euclid's algorithm}
-\subsubsection*{Extended Euclid's algorithm}
 As an example, let us find numbers $x$ and $y$
 that satisfy the following equation:
@ -588,7 +595,7 @@ so a solution to the equation is
 $x=8$ and $y=-20$.
 A solution to a Diophantine equation is not unique,
-but we can form an infinite number of solutions
+because we can form an infinite number of solutions
 if we know one solution.
 If a pair $(x,y)$ is a solution, then also all pairs
 \[(x+\frac{kb}{\textrm{gcd}(a,b)},y-\frac{ka}{\textrm{gcd}(a,b)})\]
@ -621,7 +628,7 @@ because
 \[X_k {X_k}^{-1}_{m_k} \bmod m_k = 1.\]
 Since all other terms in the sum are divisible by $m_k$,
 they have no effect on the remainder,
-and the remainder by $m_k$ for the whole sum is $a_k$.
+and $x \bmod m_k = a_k$.
 For example, a solution for
 \[
--- a/chapter22.tex
+++ b/chapter22.tex
@ -8,7 +8,7 @@ Usually, the goal is to find a way to
 count the combinations efficiently
 without generating each combination separately.
-As an example, let us consider the problem
+As an example, consider the problem
 of counting the number of ways to
 represent an integer $n$ as a sum of positive integers.
 For example, there are 8 representations
@ -35,27 +35,28 @@ The values of the function
 can be recursively calculated as follows:
 \begin{equation*}
    f(n) = \begin{cases}
-               1               & n = 1\\
-               f(1)+f(2)+\ldots+f(n-1)+1 & n > 1\\
+               1               & n = 0\\
+               f(0)+f(1)+\cdots+f(n-1) & n > 0\\
            \end{cases}
 \end{equation*}
-The base case is $f(1)=1$,
-because there is only one way to represent the number 1.
-When $n>1$, we go through all ways to
-choose the last number in the sum.
-For example, in when $n=4$, the sum can end
-with $+1$, $+2$ or $+3$.
-In addition, we also count the representation
-that only contains $n$.
+The base case is $f(0)=1$,
+because the empty sum represents the number 0.
+Then, if $n>0$, we consider all ways to
+choose the first number of the sum.
+If the first number is $k$,
+there are $f(n-k)$ representations
+for the remaining part of the sum.
+Thus, we calculate the sum of all values
+of the form $f(n-k)$ where $1 \le k \le n$.
 The first values for the function are:
 \[
 \begin{array}{lcl}
+f(0) & = & 1 \\
 f(1) & = & 1 \\
 f(2) & = & 2 \\
 f(3) & = & 4 \\
 f(4) & = & 8 \\
-f(5) & = & 16 \\
 \end{array}
 \]
 It turns out that the function also has a closed-form formula
@ -134,7 +135,8 @@ The sum of binomial coefficients is
 \]
 The reason for the name ''binomial coefficient''
-is that
+can be seen when the binomial $(a+b)$ is raised to
+the $n$th power:
 \[ (a+b)^n =
 {n \choose 0} a^n b^0 + 
@ -314,13 +316,14 @@ there are 6 solutions:
 In this scenario, we can assume that
 $k$ balls are initially placed in boxes
 and there is an empty box between each
-two such boxes.
+pair of two adjacent boxes.
 The remaining task is to choose the
-positions for
-$n-k-(k-1)=n-2k+1$ empty boxes.
-There are $k+1$ positions,
-so the number of solutions is
-${n-2k+1+k+1-1 \choose n-2k+1} = {n-k+1 \choose n-2k+1}$.
+positions for the remaining empty boxes.
+There are $n-2k+1$ such boxes and
+$k+1$ positions for them.
+Thus, using the formula of scenario 2,
+the number of solutions is
+${n-k+1 \choose n-2k+1}$.
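The formula above can be checked against a brute-force count for small inputs. The following C++ sketch is only an illustration: the function names are assumptions, and the brute force represents each placement as a bitmask in which no two adjacent bits may be set.

```cpp
#include <cassert>

// Binomial coefficient; returns 0 when k is outside the range 0..n.
long long binomial(int n, int k) {
    if (k < 0 || k > n) return 0;
    long long result = 1;
    for (int i = 1; i <= k; i++) {
        result = result * (n - k + i) / i;   // stays an integer at every step
    }
    return result;
}

// Number of ways to place k balls in n boxes so that
// no two adjacent boxes contain a ball, using the formula above.
long long count_by_formula(int n, int k) {
    return binomial(n - k + 1, n - 2 * k + 1);
}

// The same count obtained by trying all subsets of boxes.
long long count_by_brute_force(int n, int k) {
    assert(n >= 1 && n <= 20);
    long long count = 0;
    for (int mask = 0; mask < (1 << n); mask++) {
        int balls = 0;
        for (int i = 0; i < n; i++) {
            if (mask & (1 << i)) balls++;
        }
        if (balls != k) continue;
        if (mask & (mask << 1)) continue;    // two adjacent boxes contain a ball
        count++;
    }
    return count;
}
```

For example, with $n=4$ and $k=2$ both functions return 3.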
 \subsubsection{Multinomial coefficients}
@ -348,10 +351,10 @@ number of valid
 parenthesis expressions that consist of
 $n$ left parentheses and $n$ right parentheses.
-For example, $C_3=5$, because using three
-left parentheses and three right parentheses,
+For example, $C_3=5$, because
 we can construct the following parenthesis
-expressions:
+expressions using three
+left parentheses and three right parentheses:
 \begin{itemize}[noitemsep]
 \item \texttt{()()()}
@ -370,7 +373,7 @@ The following rules precisely define all
 valid parenthesis expressions:
 \begin{itemize}
-\item The empty expression is valid.
+\item An empty parenthesis expression is valid.
 \item If an expression $A$ is valid,
 then also the expression
 \texttt{(}$A$\texttt{)} is valid.
@ -402,11 +405,11 @@ of parentheses and the number of expressions
 is the product of the following values:
 \begin{itemize}
-\item $C_{i}$: number of ways to construct an expression
+\item $C_{i}$: the number of ways to construct an expression
-using the parentheses in the first part,
+using the parentheses of the first part,
 not counting the outermost parentheses
-\item $C_{n-i-1}$: number of ways to construct an
+\item $C_{n-i-1}$: the number of ways to construct an
-expression using the parentheses in the second part
+expression using the parentheses of the second part
 \end{itemize}
 In addition, the base case is $C_0=1$,
 because we can construct an empty parenthesis
@ -656,7 +659,7 @@ recursive formula:
           \end{cases}
 \end{equation*}
-The formula can be derived by going through
+The formula can be derived by considering
 the possibilities how the element 1 changes
 in the derangement.
 There are $n-1$ ways to choose an element $x$
@ -695,8 +698,7 @@ remain unchanged when the $k$th way is applied.
 As an example, let us calculate the number of
 necklaces of $n$ pearls,
-where the color of each pearl is
-one of $1,2,\ldots,m$.
+where each pearl has $m$ possible colors.
 Two necklaces are symmetric if they are
 similar after rotating them.
 For example, the necklace
@ -749,7 +751,7 @@ pearl has the same color remain the same.
 More generally, when the number of steps is $k$,
 a total of
-\[m^{\textrm{gcd}(k,n)},\]
+\[m^{\textrm{gcd}(k,n)}\]
 necklaces remain the same,
 where $\textrm{gcd}(k,n)$ is the greatest common
 divisor of $k$ and $n$.
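Combining this count with Burnside's lemma (averaging the number of unchanged necklaces over all $n$ rotations) gives a direct way to count necklaces. The C++ sketch below is a minimal illustration under that assumption; the names `gcd_ll` and `count_necklaces` are introduced here, and overflow is not handled for large $n$ and $m$.

```cpp
#include <cassert>

long long gcd_ll(long long a, long long b) {
    return b == 0 ? a : gcd_ll(b, a % b);
}

// Counts necklaces of n pearls with m colors, where two necklaces are
// considered the same if one can be rotated into the other.
long long count_necklaces(int n, int m) {
    assert(n >= 1 && m >= 1);
    long long total = 0;
    for (int k = 0; k < n; k++) {
        // A rotation by k steps leaves m^gcd(k,n) necklaces unchanged.
        long long g = gcd_ll(k, n);
        long long unchanged = 1;
        for (long long i = 0; i < g; i++) unchanged *= m;
        total += unchanged;
    }
    return total / n;   // average over the n rotations
}
```

For example, `count_necklaces(4, 3)` evaluates $(3^4+3^1+3^2+3^1)/4 = 24$.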
--- a/chapter23.tex
+++ b/chapter23.tex
@ -132,8 +132,8 @@ whose elements are calculated using the formula
 AB[i,j] = \sum_{k=1}^n A[i,k] \cdot B[k,j].
 \]
-The idea is that each element in $AB$
+The idea is that each element of $AB$
-is a sum of products of elements in $A$ and $B$
+is a sum of products of elements of $A$ and $B$
 according to the following picture:
 \begin{center}
@ -248,10 +248,10 @@ for matrix multiplication\footnote{The first such
 algorithm was Strassen's algorithm,
 published in 1969 \cite{str69},
 whose time complexity is $O(n^{2.80735})$;
-the best current algorithm
+the best current algorithm \cite{gal14}
-works in $O(n^{2.37286})$ time \cite{gal14}.},
+works in $O(n^{2.37286})$ time.},
 but they are mostly of theoretical interest
-and such special algorithms are not needed
+and such algorithms are not necessary
 in competitive programming.
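For comparison, the straightforward method that follows directly from the definition of $AB$ can be sketched in C++ as follows. This is an illustrative sketch: the type alias `Matrix` and the function name `multiply` are assumptions, and matrices are stored as 0-indexed vectors of rows.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<long long>>;

// Multiplies an a x n matrix A by an n x b matrix B using the definition
// AB[i][j] = sum over k of A[i][k] * B[k][j].
Matrix multiply(const Matrix& A, const Matrix& B) {
    assert(!A.empty() && !B.empty() && A[0].size() == B.size());
    std::size_t a = A.size(), n = B.size(), b = B[0].size();
    Matrix AB(a, std::vector<long long>(b, 0));
    for (std::size_t i = 0; i < a; i++) {
        for (std::size_t j = 0; j < b; j++) {
            for (std::size_t k = 0; k < n; k++) {
                AB[i][j] += A[i][k] * B[k][j];
            }
        }
    }
    return AB;
}
```

For two $n \times n$ matrices, the three nested loops take $O(n^3)$ time.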
@ -424,15 +424,15 @@ For example,
 \index{linear recurrence}
 A \key{linear recurrence}
-can be represented as a function $f(n)$
+is a function $f(n)$
-such that the initial values are
+whose initial values are
 $f(0),f(1),\ldots,f(k-1)$
-and the larger values
+and larger values
 are calculated recursively using the formula
 \[f(n) = c_1 f(n-1) + c_2 f(n-2) + \ldots + c_k f (n-k),\]
 where $c_1,c_2,\ldots,c_k$ are constant coefficients.
-We can use dynamic programming to calculate
+Dynamic programming can be used to calculate
 any value of $f(n)$ in $O(kn)$ time by calculating
 all values of $f(0),f(1),\ldots,f(n)$ one after another.
 However, if $k$ is small, it is possible to calculate
@ -455,7 +455,8 @@ f(n) & = & f(n-1)+f(n-2) \\
 In this case, $k=2$ and $c_1=c_2=1$.
 \begin{samepage}
-The idea is to represent the
+To efficiently calculate Fibonacci numbers,
+we represent the
 Fibonacci formula as a
 square matrix $X$ of size $2 \times 2$,
 for which the following holds:
@ -670,8 +671,9 @@ $2 \rightarrow 6 \rightarrow 3 \rightarrow 2 \rightarrow 5$.
 \subsubsection{Shortest paths}
 Using a similar idea in a weighted graph,
-we can calculate for each pair of nodes the shortest
-path between them that contains exactly $n$ edges.
+we can calculate for each pair of nodes the minimum
+length of a path
+between them that contains exactly $n$ edges.
 To calculate this, we have to define matrix multiplication
 in a new way, so that we do not calculate the numbers
 of paths but minimize the lengths of paths.
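One way to realize such a modified multiplication is to replace sums by minima and products by sums, i.e., $AB[i,j] = \min_k (A[i,k]+B[k,j])$. The C++ sketch below assumes this definition; the names `kInf` and `min_plus` are introduced here, and `kInf` stands for a missing edge.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<long long>>;

const long long kInf = 1000000000000000000LL;   // represents "no path"

// Modified product of two n x n matrices:
// C[i][j] = min over k of (A[i][k] + B[k][j]).
Matrix min_plus(const Matrix& A, const Matrix& B) {
    std::size_t n = A.size();
    assert(n > 0 && B.size() == n && A[0].size() == n && B[0].size() == n);
    Matrix C(n, std::vector<long long>(n, kInf));
    for (std::size_t i = 0; i < n; i++) {
        for (std::size_t j = 0; j < n; j++) {
            for (std::size_t k = 0; k < n; k++) {
                // Skip combinations that use a missing edge or path.
                if (A[i][k] == kInf || B[k][j] == kInf) continue;
                C[i][j] = std::min(C[i][j], A[i][k] + B[k][j]);
            }
        }
    }
    return C;
}
```

If `V` holds the edge weights of the graph, then `min_plus(V, V)` contains the minimum lengths of paths with exactly two edges.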
@ -740,9 +742,10 @@ V^4= \begin{bmatrix}
  \infty & \infty & 12 & 13 & 11 & \infty \\
 \end{bmatrix},
 \]
-we can conclude that the shortest path of 4 edges
-from node 2 to node 5 has length 8.
-This path is
+we can conclude that the minimum length of a path
+of 4 edges
+from node 2 to node 5 is 8.
+Such a path is
 $2 \rightarrow 1 \rightarrow 4 \rightarrow 2 \rightarrow 5$.
 \subsubsection{Kirchhoff's theorem}
@ -819,7 +822,8 @@ L= \begin{bmatrix}
 \end{bmatrix}
 \]
-The number of spanning trees equals
+It can be shown that
+the number of spanning trees equals
 the determinant of a matrix that is obtained
 when we remove any row and any column from $L$.
 For example, if we remove the first row
@ -835,8 +839,8 @@ and column, the result is
 The determinant is always the same,
 regardless of which row and column we remove from $L$.
-Note that a special case of Kirchhoff's theorem
+Note that Cayley's formula in Chapter 22.5 is
-is Cayley's formula in Chapter 22.5,
+a special case of Kirchhoff's theorem,
 because in a complete graph of $n$ nodes
 \[ \det(
--- a/chapter24.tex
+++ b/chapter24.tex
@ -13,8 +13,7 @@ where the three dots describe the event.
 For example, when throwing a dice,
 the outcome is an integer between $1$ and $6$,
-and it is assumed that the probability of
-each outcome is $1/6$.
+and the probability of each outcome is $1/6$.
 For example, we can calculate the following probabilities:
 \begin{itemize}[noitemsep]
@ -56,9 +55,9 @@ Thus, the probability of the event is
 Another way to calculate the probability is
 to simulate the process that generates the event.
-In this case, we draw three cards, so the process
+In this example, we draw three cards, so the process
 consists of three steps.
-We require that each step in the process is successful.
+We require that each step of the process is successful.
 Drawing the first card certainly succeeds,
 because there are no restrictions.
@ -73,7 +72,7 @@ The probability that the entire process succeeds is
 \section{Events}
-An event in probability can be represented as a set
+An event in probability theory can be represented as a set
 \[A \subset X,\]
 where $X$ contains all possible outcomes
 and $A$ is a subset of outcomes.
@ -85,7 +84,7 @@ corresponds to the set
 Each outcome $x$ is assigned a probability $p(x)$.
 Furthermore, the probability $P(A)$ of an event
-that corresponds to a set $A$ can be calculated as a sum
+$A$ can be calculated as a sum
 of probabilities of outcomes using the formula
 \[P(A) = \sum_{x \in A} p(x).\]
 For example, when throwing a dice,
@ -97,7 +96,7 @@ so the probability of the event
 The total probability of the outcomes in $X$ must
 be 1, i.e., $P(X)=1$.
-Since the events in probability are sets,
+Since the events in probability theory are sets,
 we can manipulate them using standard set operations:
 \begin{itemize}
@ -166,7 +165,7 @@ The \key{conditional probability}
 \[P(A | B) = \frac{P(A \cap B)}{P(B)}\]
 is the probability of $A$
 assuming that $B$ happens.
-In this situation, when calculating the
+Hence, when calculating the
 probability of $A$, we only consider the outcomes
 that also belong to $B$.
@ -313,17 +312,17 @@ values $a,a+1,\ldots,b$ and the probability of each value is $1/n$.
 For example, when throwing a dice,
 $a=1$, $b=6$ and $P(X=x)=1/6$ for each value $x$.
-The expected value for $X$ in a uniform distribution is
+The expected value of $X$ in a uniform distribution is
 \[E[X] = \frac{a+b}{2}.\]
 \index{binomial distribution}
 ~\\
 In a \key{binomial distribution}, $n$ attempts
 are made
 and the probability that a single attempt succeeds
 is $p$.
 The random variable $X$ counts the number of
 successful attempts,
-and the probability for a value $x$ is
+and the probability of a value $x$ is
 \[P(X=x)=p^x (1-p)^{n-x} {n \choose x},\]
 where $p^x$ and $(1-p)^{n-x}$ correspond to
 successful and unsuccessful attempts,
@ -334,25 +333,25 @@ For example, when throwing a dice ten times,
 the probability of throwing a six exactly
 three times is $(1/6)^3 (5/6)^7 {10 \choose 3}$.
-The expected value for $X$ in a binomial distribution is
+The expected value of $X$ in a binomial distribution is
 \[E[X] = pn.\]
 \index{geometric distribution}
 ~\\
 In a \key{geometric distribution},
 the probability that an attempt succeeds is $p$,
 and we continue until the first success happens.
 The random variable $X$ counts the number
-of attempts needed, and the probability for
+of attempts needed, and the probability of
 a value $x$ is
 \[P(X=x)=(1-p)^{x-1} p,\]
-where $(1-p)^{x-1}$ corresponds to unsuccessful attemps
+where $(1-p)^{x-1}$ corresponds to the unsuccessful attempts
 and $p$ corresponds to the first successful attempt.
 For example, if we throw a dice until we throw a six,
 the probability that the number of throws
 is exactly 4 is $(5/6)^3 1/6$.
-The expected value for $X$ in a geometric distribution is
+The expected value of $X$ in a geometric distribution is
 \[E[X]=\frac{1}{p}.\]
 \section{Markov chains}
@ -369,7 +368,7 @@ for moving to other states.
 A Markov chain can be represented as a graph
 whose nodes are states and edges are transitions.
-As an example, let us consider a problem
+As an example, consider a problem
 where we are in floor 1 in an $n$ floor building.
 At each step, we randomly walk either one floor
 up or one floor down, except that we always
@ -420,11 +419,11 @@ $[1/2,0,1/2,0,0]$, and so on.
 An efficient way to simulate the walk in
 a Markov chain is to use dynamic programming.
-The idea is to maintain the probability distribution
+The idea is to maintain the probability distribution,
 and at each step go through all possibilities
 how we can move.
-Using this method, we can simulate $m$ steps
+Using this method, we can simulate
-in $O(n^2 m)$ time.
+a walk of $m$ steps in $O(n^2 m)$ time.
 The transitions of a Markov chain can also be
 represented as a matrix that updates the
@ -511,7 +510,7 @@ The $kth$ \key{order statistic} of an array
 is the element at position $k$ after sorting
 the array in increasing order.
 It is easy to calculate any order statistic
-in $O(n \log n)$ time by sorting the array,
+in $O(n \log n)$ time by first sorting the array,
 but is it really needed to sort the entire array
 just to find one element?
@ -526,16 +525,16 @@ its running time is usually $O(n)$
 but $O(n^2)$ in the worst case.
 The algorithm chooses a random element $x$
-in the array, and moves elements smaller than $x$
+of the array, and moves elements smaller than $x$
 to the left part of the array,
 and all other elements to the right part of the array.
 This takes $O(n)$ time when there are $n$ elements.
 Assume that the left part contains $a$ elements
 and the right part contains $b$ elements.
-If $a=k-1$, element $x$ is the $k$th order statistic.
+If $a=k$, element $x$ is the $k$th order statistic.
-Otherwise, if $a>k-1$, we recursively find the $k$th order
+Otherwise, if $a>k$, we recursively find the $k$th order
 statistic for the left part,
-and if $a<k-1$, we recursively find the $r$th order
+and if $a<k$, we recursively find the $r$th order
 statistic for the right part where $r=k-a$.
 The search continues in a similar way, until the element
 has been found.
@ -544,9 +543,9 @@ When each element $x$ is randomly chosen,
 the size of the array about halves at each step,
 so the time complexity for
 finding the $k$th order statistic is about
-\[n+n/2+n/4+n/8+\cdots=O(n).\]
+\[n+n/2+n/4+n/8+\cdots \le 2n = O(n).\]
-The worst case for the algorithm is still $O(n^2)$,
+The worst case of the algorithm still requires $O(n^2)$ time,
 because it is possible that $x$ is always chosen
 in such a way that it is one of the smallest or largest
 elements in the array and $O(n)$ steps are needed.
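The randomized algorithm described above can be sketched in C++ as follows. This is an illustrative variant rather than the text's exact method: it treats $k$ as a 0-indexed position, and it additionally separates the elements equal to $x$ (an assumption not in the text) so that repeated values cannot cause an endless search. The name `order_statistic` and the fixed random seed are likewise choices made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Returns the element that would be at 0-indexed position k
// if the array v were sorted in increasing order.
int order_statistic(std::vector<int> v, std::size_t k) {
    assert(k < v.size());
    std::mt19937 rng(12345);   // fixed seed, chosen arbitrarily for reproducibility
    while (true) {
        std::uniform_int_distribution<std::size_t> pick(0, v.size() - 1);
        int x = v[pick(rng)];                  // random element x
        std::vector<int> smaller, larger;
        std::size_t equal = 0;
        for (int y : v) {
            if (y < x) smaller.push_back(y);
            else if (y > x) larger.push_back(y);
            else equal++;
        }
        std::size_t a = smaller.size();
        if (k < a) {
            v = std::move(smaller);            // continue in the left part
        } else if (k < a + equal) {
            return x;                          // x is the requested element
        } else {
            k -= a + equal;                    // continue in the right part
            v = std::move(larger);
        }
    }
}
```

For example, for the array $[5,1,4,2,3]$ and $k=2$, the function returns 3.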
--- a/chapter25.tex
+++ b/chapter25.tex
@ -8,12 +8,12 @@ no matter what the opponent does,
 if such a strategy exists.
 It turns out that there is a general strategy
-for all such games,
+for such games,
 and we can analyze the games using the \key{nim theory}.
 First, we will analyze simple games where
 players remove sticks from heaps,
 and after this, we will generalize the strategy
-used in those games to all other games.
+used in those games to other games.
 \section{Game states}
@ -252,7 +252,7 @@ where $\oplus$ is the xor operation\footnote{The optimal strategy
 for nim was published in 1901 by C. L. Bouton \cite{bou01}.}.
 The states whose nim sum is 0 are losing states,
 and all other states are winning states.
-For example, the nim sum for
+For example, the nim sum of
 $[10,12,5]$ is $10 \oplus 12 \oplus 5 = 3$,
 so the state is a winning state.
@ -260,8 +260,6 @@ But how is the nim sum related to the nim game?
 We can explain this by looking at how the nim
 sum changes when the nim state changes.
-~\\
-\noindent
 \textit{Losing states:}
 The final state $[0,0,\ldots,0]$ is a losing state,
 and its nim sum is 0, as expected.
@ -270,8 +268,6 @@ a winning state, because when a single value $x_k$ changes,
 the nim sum also changes, so the nim sum
 is different from 0 after the move.
-~\\
-\noindent
 \textit{Winning states:}
 We can move to a losing state if
 there is any heap $k$ for which $x_k \oplus s < x_k$.
@ -280,10 +276,8 @@ heap $k$ so that it will contain $x_k \oplus s$ sticks,
 which will lead to a losing state.
 There is always such a heap, where $x_k$
 has a one bit at the position of the leftmost
-one bit in $s$.
+one bit of $s$.
-~\\
-\noindent
 As an example, consider the state $[10,12,5]$.
 This state is a winning state,
 because its nim sum is 3.
@ -291,7 +285,6 @@ Thus, there has to be a move which
 leads to a losing state.
 Next we will find out such a move.
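The search for such a move can also be sketched in C++. The function names `nim_sum` and `find_winning_move` are assumptions made for this illustration; the move is found using the condition $x_k \oplus s < x_k$ stated above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Nim sum (xor) of all heap sizes.
int nim_sum(const std::vector<int>& heaps) {
    int s = 0;
    for (int x : heaps) {
        assert(x >= 0);
        s ^= x;
    }
    return s;
}

// If the state is a winning state, stores a move to a losing state
// (heap index and new heap size) and returns true; otherwise returns false.
bool find_winning_move(const std::vector<int>& heaps, int& heap, int& new_size) {
    int s = nim_sum(heaps);
    if (s == 0) return false;                   // losing state: no such move
    for (std::size_t k = 0; k < heaps.size(); k++) {
        if ((heaps[k] ^ s) < heaps[k]) {        // heap k can be reduced
            heap = static_cast<int>(k);
            new_size = heaps[k] ^ s;
            return true;
        }
    }
    return false;   // not reached when all heap sizes are nonnegative
}
```

For the state $[10,12,5]$, the sketch reduces the heap of 10 sticks to 9 sticks, and the nim sum of $[9,12,5]$ is 0.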
-\begin{samepage}
 The nim sum of the state is as follows:
 \begin{center}
@ -303,12 +296,11 @@ The nim sum of the state is as follows:
 3 & \texttt{0011} \\
 \end{tabular}
 \end{center}
-\end{samepage}
 In this case, the heap with 10 sticks
 is the only heap that has a one bit
 at the position of the leftmost
-one bit in the nim sum:
+one bit of the nim sum:
 \begin{center}
 \begin{tabular}{r|r}
@ -344,11 +336,11 @@ In a \key{misère game}, the goal of the game
 is opposite,
 so the player who removes the last stick
 loses the game.
-It turns out that a misère nim game can be
+It turns out that the misère nim game can be
 optimally played almost like the standard nim game.
 The idea is to first play the misère game
-like a standard game, but change the strategy
+like the standard game, but change the strategy
 at the end of the game.
 The new strategy will be introduced in a situation
 where each heap would contain at most one stick
@ -386,7 +378,7 @@ the states and allowed moves, and there is no randomness in the game.
 The idea is to calculate for each game state
 a Grundy number that corresponds to the number of
 sticks in a nim heap.
-When we know the Grundy numbers for all states,
+When we know the Grundy numbers of all states,
 we can play the game like the nim game.
 \subsubsection{Grundy numbers}
@ -394,10 +386,10 @@ we can play the game like the nim game.
 \index{Grundy number}
 \index{mex function}
-The \key{Grundy number} for a game state is
+The \key{Grundy number} of a game state is
 \[\textrm{mex}(\{g_1,g_2,\ldots,g_n\}),\]
-where $g_1,g_2,\ldots,g_n$ are Grundy numbers for
+where $g_1,g_2,\ldots,g_n$ are the Grundy numbers of the
-states to which we can move from the state,
+states to which we can move,
 and the mex function gives the smallest
 nonnegative number that is not in the set.
 For example, $\textrm{mex}(\{0,1,3\})=2$.
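The mex function and the resulting Grundy numbers can be sketched in C++ as follows. The game used here is a hypothetical example that does not appear in the text: a single heap from which a player removes 1, 2 or 3 sticks on each turn. The function names are also assumptions.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Smallest nonnegative number that is not in the given collection.
int mex(const std::vector<int>& values) {
    std::set<int> present(values.begin(), values.end());
    int m = 0;
    while (present.count(m)) m++;
    return m;
}

// Grundy numbers g[0..n] for the hypothetical game in which a move
// removes 1, 2 or 3 sticks from a heap of s sticks.
std::vector<int> grundy_numbers(int n) {
    assert(n >= 0);
    std::vector<int> g(n + 1, 0);               // g[0] = mex of the empty set = 0
    for (int s = 1; s <= n; s++) {
        std::vector<int> reachable;             // Grundy numbers of next states
        for (int take = 1; take <= 3 && take <= s; take++) {
            reachable.push_back(g[s - take]);
        }
        g[s] = mex(reachable);
    }
    return g;
}
```

In this hypothetical game the Grundy numbers repeat with period 4, so the states with 0, 4, 8, ... sticks are losing states.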
@ -457,9 +449,7 @@ and if the Grundy number is $x>0$, we can move
 to states whose Grundy numbers include all numbers
 $0,1,\ldots,x-1$.
-~\\
-\noindent
-As an example, let us consider a game where
+As an example, consider a game where
 the players move a figure in a maze.
 Each square in the maze is either floor or wall.
 On each turn, the player has to move
@ -468,7 +458,6 @@ of steps left or up.
 The winner of the game is the player who
 makes the last move.
-\begin{samepage}
 The following picture shows a possible initial state
 of the game, where @ denotes the figure and *
 denotes a square where it can move.
@ -495,11 +484,10 @@ denotes a square where it can move.
  \end{scope}
 \end{tikzpicture}
 \end{center}
-\end{samepage}
 The states of the game are all floor squares
-in the maze.
+of the maze.
-In this situation, the Grundy numbers
+In the above maze, the Grundy numbers
 are as follows:
 \begin{center}
@ -577,11 +565,9 @@ is the nim sum of the Grundy numbers of the subgames.
 The game can be played like a nim game by calculating
 all Grundy numbers for subgames and then their nim sum.
-~\\
-\noindent
 As an example, consider a game that consists
 of three mazes.
-In this game, on each turn the player chooses one
+In this game, on each turn, the player chooses one
 of the mazes and then moves the figure in the maze.
 Assume that the initial state of the game is as follows:
@ -762,7 +748,7 @@ $0 \oplus 3 \oplus 3 = 0$.
 \subsubsection{Grundy's game}
-Sometimes a move in the game divides the game
+Sometimes a move in a game divides the game
 into subgames that are independent of each other.
 In this case, the Grundy number of the game is