Corrections
parent 1e8a4e5504
commit e42cfb413e

luku06.tex | 190

@@ -3,7 +3,7 @@
 \index{greedy algorithm}

 A \key{greedy algorithm}
-constructs a solution for a problem
+constructs a solution to the problem
 by always making a choice that looks
 the best at the moment.
 A greedy algorithm never takes back
@@ -15,15 +15,15 @@ are usually very efficient.
 The difficulty in designing a greedy algorithm
 is to invent a greedy strategy
 that always produces an optimal solution
-for the problem.
+to the problem.
 The locally optimal choices in a greedy
 algorithm should also be globally optimal.
-It's often difficult to argue why
+It is often difficult to argue that
 a greedy algorithm works.

 \section{Coin problem}

-As the first example, we consider a problem
+As a first example, we consider a problem
 where we are given a set of coin values
 and our task is to form a sum of money
 using the coins.
@@ -41,7 +41,7 @@ $200+200+100+20$ whose sum is 520.

 \subsubsection{Greedy algorithm}

-A natural greedy algorithm for the problem
+A simple greedy algorithm to the problem
 is to always select the largest possible coin,
 until we have constructed the required sum of money.
 This algorithm works in the example case,
@@ -54,10 +54,10 @@ the greedy algorithm \emph{always} works, i.e.,
 it always produces a solution with the fewest
 possible number of coins.
 The correctness of the algorithm can be
-argued as follows:
+shown as follows:

 Each coin 1, 5, 10, 50 and 100 appears
-at most once in the optimal solution.
+at most once in an optimal solution.
 The reason for this is that if the
 solution would contain two such coins,
 we could replace them by one coin and
@@ -65,14 +65,14 @@ obtain a better solution.
 For example, if the solution would contain
 coins $5+5$, we could replace them by coin $10$.

-In the same way, both coins 2 and 20 can appear
-at most twice in the optimal solution
-because, we could replace
+In the same way, coins 2 and 20 appear
+at most twice in an optimal solution,
+because we could replace
 coins $2+2+2$ by coins $5+1$ and
 coins $20+20+20$ by coins $50+10$.
-Moreover, the optimal solution can't contain
+Moreover, an optimal solution cannot contain
 coins $2+2+1$ or $20+20+10$
-because we would replace them by coins $5$ and $50$.
+because we could replace them by coins $5$ and $50$.

 Using these observations,
 we can show for each coin $x$ that
@@ -80,32 +80,33 @@ it is not possible to optimally construct
 sum $x$ or any larger sum by only using coins
 that are smaller than $x$.
 For example, if $x=100$, the largest optimal
-sum using the smaller coins is $5+20+20+5+2+2=99$.
+sum using the smaller coins is $50+20+20+5+2+2=99$.
 Thus, the greedy algorithm that always selects
 the largest coin produces the optimal solution.

 This example shows that it can be difficult
-to argue why a greedy algorithm works,
+to argue that a greedy algorithm works,
 even if the algorithm itself is simple.

 \subsubsection{General case}

 In the general case, the coin set can contain any coins
-and the greedy algorithm \emph{not} necessarily produces
+and the greedy algorithm \emph{does not} necessarily produce
 an optimal solution.

-We can prove that a greedy algorithm doesn't work
+We can prove that a greedy algorithm does not work
 by showing a counterexample
 where the algorithm gives a wrong answer.
-In this problem it's easy to find a counterexample:
-if the coins are $\{1,3,4\}$ and the sum of money
+In this problem we can easily find a counterexample:
+if the coins are $\{1,3,4\}$ and the target sum
 is 6, the greedy algorithm produces the solution
-$4+1+1$, while the optimal solution is $3+3$.
+$4+1+1$ while the optimal solution is $3+3$.

-We don't know if the general coin problem
+We do not know if the general coin problem
 can be solved using any greedy algorithm.
-However, we will revisit the problem in the next chapter
-because the general problem can be solved using a dynamic
+However, as we will see in Chapter 7,
+the general problem can be efficiently
+solved using a dynamic
 programming algorithm that always gives the
 correct answer.

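As a quick reference (not part of this commit), here is a minimal C++ sketch of the greedy strategy discussed above. The coin sets and target sums are hard-coded from the examples in the text, and the second call reproduces the $\{1,3,4\}$ counterexample:

    #include <bits/stdc++.h>
    using namespace std;

    // Greedy strategy: repeatedly take the largest coin that still fits.
    // Returns the number of coins used, which is not always optimal.
    int greedyCoins(vector<int> coins, int n) {
        sort(coins.rbegin(), coins.rend());          // largest coin first
        int count = 0;
        for (int c : coins) {
            while (n >= c) { n -= c; count++; }
        }
        return count;
    }

    int main() {
        // Coins 1,2,5,10,20,50,100,200: greedy forms 520 as 200+200+100+20.
        cout << greedyCoins({1,2,5,10,20,50,100,200}, 520) << "\n"; // 4
        // Counterexample from the text: greedy forms 6 as 4+1+1 (3 coins),
        // although the optimal solution 3+3 uses only 2 coins.
        cout << greedyCoins({1,3,4}, 6) << "\n"; // 3
    }
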
@@ -115,9 +116,9 @@ Many scheduling problems can be solved
 using a greedy strategy.
 A classic problem is as follows:
 Given $n$ events with their starting and ending
-times, our task is to plan a schedule
-so that we can join as many events as possible.
-It's not possible to join an event partially.
+times, we should plan a schedule
+that includes as many events as possible.
+It is not possible to select an event partially.
 For example, consider the following events:
 \begin{center}
 \begin{tabular}{lll}
@@ -130,7 +131,7 @@ $D$ & 6 & 8 \\
 \end{tabular}
 \end{center}
 In this case the maximum number of events is two.
-For example, we can join events $B$ and $D$
+For example, we can select events $B$ and $D$
 as follows:
 \begin{center}
 \begin{tikzpicture}[scale=.4]
@@ -171,8 +172,8 @@ selects the following events:
 \end{tikzpicture}
 \end{center}

-However, choosing short events is not always
-a correct strategy but the algorithm fails,
+However, selecting short events is not always
+a correct strategy, but the algorithm fails,
 for example, in the following case:
 \begin{center}
 \begin{tikzpicture}[scale=.4]
@@ -206,10 +207,10 @@ This algorithm selects the following events:
 \end{tikzpicture}
 \end{center}

-However, we can find a counterexample for this
-algorithm, too.
+However, we can find a counterexample
+also for this algorithm.
 For example, in the following case,
-the algorithm selects only one event:
+the algorithm only selects one event:
 \begin{center}
 \begin{tikzpicture}[scale=.4]
 \begin{scope}
@@ -221,7 +222,7 @@ the algorithm selects only one event:
 \end{center}
 If we select the first event, it is not possible
 to select any other events.
-However, it would be possible to join the
+However, it would be possible to select the
 other two events.

 \subsubsection*{Algorithm 3}
@@ -246,35 +247,37 @@ This algorithm selects the following events:

 It turns out that this algorithm
 \emph{always} produces an optimal solution.
-The algorithm works because
-regarding the final solution, it is
-optimal to select an event that
-ends as soon as possible.
-Then it is optimal to select
-the next event using the same strategy, etc.
+First, it is always an optimal choice
+to first select an event that ends
+as early as possible.
+After this, it is an optimal choice
+to select the next event
+using the same strategy, etc.,
+until we cannot select any more events.

-One way to justify the choice is to think
-what happens if we first select some event
+One way to argue that the algorithm works
+is to consider
+what happens if we first select an event
 that ends later than the event that ends
-as soon as possible.
-This can never be a better choice
-because after an event that ends later,
-we will have at most an equal number of
-possibilities to select for the next events,
-compared to the strategy that we select the
-event that ends as soon as possible.
+as early as possible.
+Now, we will have at most an equal number of
+choices how we can select the next event.
+Hence, selecting an event that ends later
+can never yield a better solution,
+and the greedy algorithm is correct.

 \section{Tasks and deadlines}

-We are given $n$ tasks with duration and deadline.
-Our task is to choose an order to perform the tasks.
+Let us now consider a problem where
+we are given $n$ tasks with durations and deadlines,
+and our task is to choose an order to perform the tasks.
 For each task, we get $d-x$ points
-where $d$ is the deadline of the task
+where $d$ is the task's deadline
 and $x$ is the moment when we finished the task.
 What is the largest possible total score
 we can obtain?

-For example, if the tasks are
+For example, suppose that the tasks are as follows:
 \begin{center}
 \begin{tabular}{lll}
 task & duration & deadline \\
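As a reference (not part of this commit), a minimal C++ sketch of Algorithm 3 from the event scheduling section above: sort the events by ending time and always take the next event that starts after the previously selected one ends. Only event $D$'s times (6 and 8) are visible in these hunks, so the other start and end times below are illustrative, and the convention that an event may start exactly when the previous one ends is a choice made for this sketch:

    #include <bits/stdc++.h>
    using namespace std;

    // Algorithm 3: sort the events by ending time and always take the next
    // event that starts after the previously selected event has ended.
    int maxEvents(vector<pair<int,int>> events) {    // {start, end}
        sort(events.begin(), events.end(),
             [](auto &a, auto &b) { return a.second < b.second; });
        int count = 0, lastEnd = INT_MIN;
        for (auto &[s, e] : events) {
            if (s >= lastEnd) {      // compatible with the previous choice
                count++;
                lastEnd = e;
            }
        }
        return count;
    }

    int main() {
        // Four events; only D = (6,8) appears in the hunks above,
        // the other times are illustrative.
        cout << maxEvents({{1,3},{2,5},{3,9},{6,8}}) << "\n"; // 2
    }
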
@@ -285,8 +288,8 @@ $C$ & 2 & 7 \\
 $D$ & 4 & 5 \\
 \end{tabular}
 \end{center}
-then the optimal solution is to perform
-the tasks as follows:
+In this case, an optimal schedule for the tasks
+is as follows:
 \begin{center}
 \begin{tikzpicture}[scale=.4]
 \begin{scope}
@@ -317,16 +320,16 @@ $B$ yields 0 points, $A$ yields $-7$ points
 and $D$ yields $-8$ points,
 so the total score is $-10$.

-Surprisingly, the optimal solution for the problem
-doesn't depend on the dedalines at all,
+Surprisingly, the optimal solution to the problem
+does not depend on the deadlines at all,
 but a correct greedy strategy is to simply
 perform the tasks \emph{sorted by their durations}
 in increasing order.
 The reason for this is that if we ever perform
-two successive tasks such that the first task
+two tasks one after another such that the first task
 takes longer than the second task,
 we can obtain a better solution if we swap the tasks.
-For example, if the successive tasks are
+For example, consider the following schedule:
 \begin{center}
 \begin{tikzpicture}[scale=.4]
 \begin{scope}
@@ -345,7 +348,7 @@ For example, if the successive tasks are
 \end{scope}
 \end{tikzpicture}
 \end{center}
-and $a>b$, the swapped order of the tasks
+Here $a>b$, so we should swap the tasks:
 \begin{center}
 \begin{tikzpicture}[scale=.4]
 \begin{scope}
@@ -364,10 +367,10 @@ and $a>b$, the swapped order of the tasks
 \end{scope}
 \end{tikzpicture}
 \end{center}
-gives $b$ points less to $X$ and $a$ points more to $Y$,
+Now $X$ gives $b$ points less and $Y$ gives $a$ points more,
 so the total score increases by $a-b > 0$.
 In an optimal solution,
-for each two successive tasks,
+for any two consecutive tasks,
 it must hold that the shorter task comes
 before the longer task.
 Thus, the tasks must be performed
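A minimal C++ sketch (not part of this commit) of the resulting greedy order: sort the tasks by duration and accumulate $d-x$ for each task. The rows for $C$ (duration 2, deadline 7) and $D$ (4, 5) appear in the hunk headers above; the values used for $A$ and $B$ are inferred from the quoted scores ($-7$ and $0$) and are an assumption:

    #include <bits/stdc++.h>
    using namespace std;

    // Greedy order: perform the tasks sorted by duration, shortest first.
    // The score of a task is its deadline minus its finishing time.
    long long totalScore(vector<pair<int,int>> tasks) {  // {duration, deadline}
        sort(tasks.begin(), tasks.end());                // by duration first
        long long time = 0, score = 0;
        for (auto &[dur, deadline] : tasks) {
            time += dur;                 // the task finishes at this moment
            score += deadline - time;    // may be negative
        }
        return score;
    }

    int main() {
        // C(2,7) and D(4,5) are from the hunks above; A(4,2) and B(3,5)
        // are inferred from the quoted scores and are an assumption.
        cout << totalScore({{4,2},{3,5},{2,7},{4,5}}) << "\n"; // -10
    }
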
@@ -378,9 +381,8 @@ sorted by their durations.
 We will next consider a problem where
 we are given $n$ numbers $a_1,a_2,\ldots,a_n$
 and our task is to find a value $x$
-such that the sum
-\[|a_1-x|^c+|a_2-x|^c+\cdots+|a_n-x|^c\]
-becomes as small as possible.
+that minimizes the sum
+\[|a_1-x|^c+|a_2-x|^c+\cdots+|a_n-x|^c.\]
 We will focus on the cases $c=1$ and $c=2$.

 \subsubsection{Case $c=1$}
@@ -400,13 +402,12 @@ For example, the list $[1,2,9,2,6]$
 becomes $[1,2,2,6,9]$ after sorting,
 so the median is 2.

-The median is the optimal choice,
+The median is an optimal choice,
 because if $x$ is smaller than the median,
 the sum becomes smaller by increasing $x$,
 and if $x$ is larger than the median,
-the sum becomes smaller by decreasing $x$
-Thus, we should move $x$ as near the median
-as possible, so the optimal solution that $x$
+the sum becomes smaller by decreasing $x$.
+Hence, the optimal solution is that $x$
 is the median.
 If $n$ is even and there are two medians,
 both medians and all values between them
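For the case $c=1$, a minimal C++ sketch (not part of this commit) that picks a median of the example list $[1,2,9,2,6]$ used above and evaluates the sum $|a_1-x|+\cdots+|a_n-x|$:

    #include <bits/stdc++.h>
    using namespace std;

    int main() {
        // Example list from the text; after sorting, the median is 2.
        vector<long long> a = {1, 2, 9, 2, 6};
        sort(a.begin(), a.end());
        long long x = a[a.size() / 2];        // a median (upper one if n is even)

        long long cost = 0;
        for (long long v : a) cost += abs(v - x);   // the sum for c = 1
        cout << "x = " << x << ", cost = " << cost << "\n"; // x = 2, cost = 12
    }
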
@@ -428,9 +429,9 @@ In the example the average is $(1+2+9+2+6)/5=4$.
 This result can be derived by presenting
 the sum as follows:
 \[
-nx^2 - 2x(a_1+a_2+\cdots+a_n) + (a_1^2+a_2^2+\cdots+a_n^2).
+nx^2 - 2x(a_1+a_2+\cdots+a_n) + (a_1^2+a_2^2+\cdots+a_n^2)
 \]
-The last part doesn't depend on $x$,
+The last part does not depend on $x$,
 so we can ignore it.
 The remaining parts form a function
 $nx^2-2xs$ where $s=a_1+a_2+\cdots+a_n$.
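As a side note (not part of this commit), the step that follows from here can be completed directly: $nx^2-2xs$ is an upward-opening parabola in $x$, so its minimum is at the zero of its derivative,
\[
\frac{d}{dx}\left(nx^2-2xs\right)=2nx-2s=0
\quad\Longrightarrow\quad
x=\frac{s}{n}=\frac{a_1+a_2+\cdots+a_n}{n},
\]
which is the average of the numbers, as the surrounding text states.
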
@@ -446,16 +447,13 @@ the average of the numbers $a_1,a_2,\ldots,a_n$.
 \index{binary code}
 \index{codeword}

-We are given a string, and our task is to
-\emph{compress} it so that it requires less space.
-We will do this using a \key{binary code}
-that determines for each character
-a \key{codeword} that consists of bits.
-After this, we can compress the string
+A \key{binary code} assigns for each character
+of a given string a \key{codeword} that consists of bits.
+We can \emph{compress} the string using the binary code
 by replacing each character by the
 corresponding codeword.
 For example, the following binary code
-determines codewords for characters
+assigns codewords for characters
 \texttt{A}–\texttt{D}:
 \begin{center}
 \begin{tabular}{rr}
@@ -470,19 +468,20 @@ character & codeword \\
 This is a \key{constant-length} code
 which means that the length of each
 codeword is the same.
-For example, the compressed form of the string
-\texttt{AABACDACA} is
-\[000001001011001000,\]
-so 18 bits are needed.
+For example, we can compress the string
+\texttt{AABACDACA} as follows:
+\[000001001011001000\]
+Using this code, the length of the compressed
+string is 18 bits.
 However, we can compress the string better
-by using a \key{variable-length} code
+if we use a \key{variable-length} code
 where codewords may have different lengths.
 Then we can give short codewords for
-characters that appear often,
+characters that appear often
 and long codewords for characters
 that appear rarely.
-It turns out that the \key{optimal} code
-for the aforementioned string is as follows:
+It turns out that an \key{optimal} code
+for the above string is as follows:
 \begin{center}
 \begin{tabular}{rr}
 character & codeword \\
@@ -493,21 +492,21 @@ character & codeword \\
 \texttt{D} & 111 \\
 \end{tabular}
 \end{center}
-The optimal code produces a compressed string
+An optimal code produces a compressed string
 that is as short as possible.
-In this case, the compressed form using
+In this case, the compressed string using
 the optimal code is
 \[001100101110100,\]
-so only 15 bits are needed.
+so only 15 bits are needed instead of 18 bits.
 Thus, thanks to a better code it was possible to
 save 3 bits in the compressed string.

-Note that it is required that no codeword
+We require that no codeword
 is a prefix of another codeword.
 For example, it is not allowed that a code
 would contain both codewords 10
 and 1011.
-The reason for this is that we also want
+The reason for this is that we want
 to be able to generate the original string
 from the compressed string.
 If a codeword could be a prefix of another codeword,
@@ -515,7 +514,7 @@ this would not always be possible.
 For example, the following code is \emph{not} valid:
 \begin{center}
 \begin{tabular}{rr}
-merkki & koodisana \\
+character & codeword \\
 \hline
 \texttt{A} & 10 \\
 \texttt{B} & 11 \\
@@ -524,7 +523,7 @@ merkki & koodisana \\
 \end{tabular}
 \end{center}
 Using this code, it would not be possible to know
-if the compressed string 1011 means
+if the compressed string 1011 corresponds to
 the string \texttt{AB} or the string \texttt{C}.

 \index{Huffman coding}
@@ -533,11 +532,11 @@ the string \texttt{AB} or the string \texttt{C}.

 \key{Huffman coding} is a greedy algorithm
 that constructs an optimal code for
-compressing a string.
+compressing a given string.
 The algorithm builds a binary tree
 based on the frequencies of the characters
 in the string,
-and a codeword for each characters can be read
+and each character's codeword can be read
 by following a path from the root to
 the corresponding node.
 A move to the left corresponds to bit 0,
@@ -547,11 +546,10 @@ Initially, each character of the string is
 represented by a node whose weight is the
 number of times the character appears in the string.
 Then at each step two nodes with minimum weights
-are selected and they are combined by creating
+are combined by creating
 a new node whose weight is the sum of the weights
 of the original nodes.
-The process continues until all nodes have been
-combined and the code is ready.
+The process continues until all nodes have been combined.

 Next we will see how Huffman coding creates
 the optimal code for the string
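As a reference (not part of this commit), a minimal C++ sketch of the combining step described above, using a min-heap of node weights. It computes only the length of the compressed string (each merge adds one bit to every character inside the merged nodes); producing the actual codewords would require building the tree explicitly. For the example string AABACDACA it prints 15, matching the 15-bit result mentioned earlier:

    #include <bits/stdc++.h>
    using namespace std;

    int main() {
        string s = "AABACDACA";                 // example string from the text

        map<char, long long> freq;
        for (char c : s) freq[c]++;             // A:5, B:1, C:2, D:1

        // Min-heap of node weights (character frequencies at the start).
        priority_queue<long long, vector<long long>, greater<long long>> q;
        for (auto &p : freq) q.push(p.second);

        long long bits = 0;
        while (q.size() > 1) {
            long long a = q.top(); q.pop();     // two smallest weights
            long long b = q.top(); q.pop();
            bits += a + b;                      // weight of the combined node
            q.push(a + b);
        }
        cout << bits << "\n";                   // 15, as stated in the text
    }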
|