2016-12-28 23:54:51 +01:00
|
|
|
|
\chapter{Greedy algorithms}
|
|
|
|
|
|
2017-01-01 22:43:44 +01:00
|
|
|
|
\index{greedy algorithm}
|
|
|
|
|
|
|
|
|
|
A \key{greedy algorithm}
constructs a solution to the problem
by always making the choice that looks
best at the moment.
A greedy algorithm never takes back
its choices, but directly constructs
the final solution.
For this reason, greedy algorithms
are usually very efficient.
|
|
|
|
|
|
2017-02-13 22:16:30 +01:00
|
|
|
|
The difficulty in designing greedy algorithms
is to find a greedy strategy
that always produces an optimal solution
to the problem.
The locally optimal choices in a greedy
algorithm must also be globally optimal.
It is often difficult to argue that
a greedy algorithm works.
|
|
|
|
|
|
|
|
|
|
\section{Coin problem}
|
|
|
|
|
|
2017-01-31 21:20:39 +01:00
|
|
|
|
As a first example, we consider a problem
where we are given a set of coin values
and our task is to form a sum of money
using the coins.
The values of the coins are
$\{c_1,c_2,\ldots,c_k\}$,
and each coin can be used as many times as we want.
What is the minimum number of coins needed?
|
|
|
|
|
|
2017-02-13 22:16:30 +01:00
|
|
|
|
For example, if the coins are the euro coins (in cents)
|
2017-01-01 22:43:44 +01:00
|
|
|
|
\[\{1,2,5,10,20,50,100,200\}\]
|
|
|
|
|
and the sum of money is 520,
|
|
|
|
|
we need at least four coins.
|
|
|
|
|
The optimal solution is to select coins
|
|
|
|
|
$200+200+100+20$ whose sum is 520.
|
|
|
|
|
|
|
|
|
|
\subsubsection{Greedy algorithm}
|
|
|
|
|
|
2017-01-31 21:20:39 +01:00
|
|
|
|
A simple greedy algorithm for the problem
is to always select the largest coin
that does not exceed the remaining sum,
until we have constructed the required sum of money.
This algorithm works in the example case,
because we first select two 200 cent coins,
then one 100 cent coin and finally one 20 cent coin.
But does this algorithm always work?
|
|
|
|
|
|
|
|
|
|
It turns out that, for the set of euro coins,
the greedy algorithm \emph{always} works, i.e.,
it always produces a solution with the fewest
possible coins.
The correctness of the algorithm can be
shown as follows:
|
2017-01-01 22:43:44 +01:00
|
|
|
|
|
|
|
|
|
Each of the coins 1, 5, 10, 50 and 100 appears
at most once in an optimal solution.
The reason for this is that if the
solution contained two such coins,
we could replace them by one coin and
obtain a better solution.
For example, if the solution contained
coins $5+5$, we could replace them by coin $10$.
|
|
|
|
|
|
2017-01-31 21:20:39 +01:00
|
|
|
|
In the same way, coins 2 and 20 appear
at most twice in an optimal solution,
because we could replace
coins $2+2+2$ by coins $5+1$ and
coins $20+20+20$ by coins $50+10$.
Moreover, an optimal solution cannot contain
coins $2+2+1$ or $20+20+10$,
because we could replace them by coins $5$ and $50$, respectively.
|
2017-01-01 22:43:44 +01:00
|
|
|
|
|
|
|
|
|
Using these observations,
we can show for each coin $x$ that
it is not possible to optimally construct
a sum $x$ or any larger sum by only using coins
that are smaller than $x$.
For example, if $x=100$, the largest sum that can be
formed optimally using only the smaller coins is $50+20+20+5+2+2=99$.
Thus, the greedy algorithm that always selects
the largest coin produces an optimal solution.
|
|
|
|
|
|
|
|
|
|
This example shows that it can be difficult
|
2017-01-31 21:20:39 +01:00
|
|
|
|
to argue that a greedy algorithm works,
|
2017-01-01 22:43:44 +01:00
|
|
|
|
even if the algorithm itself is simple.
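
The greedy strategy described above can be sketched in a few lines of Python.
This is only an illustrative sketch: the function name \texttt{greedy\_coins}
and the constant \texttt{EURO} are assumptions made for this example, and the
result is guaranteed to be optimal only for coin sets, such as the euro coins,
for which the greedy strategy has been shown to work.

```python
def greedy_coins(coins, target):
    """Repeatedly take the largest coin that still fits into the remaining sum."""
    chosen = []
    remaining = target
    for c in sorted(coins, reverse=True):
        # Use coin c as many times as it fits before moving to a smaller coin.
        count, remaining = divmod(remaining, c)
        chosen.extend([c] * count)
    if remaining != 0:
        raise ValueError("the target cannot be formed with these coins")
    return chosen


EURO = [1, 2, 5, 10, 20, 50, 100, 200]
print(greedy_coins(EURO, 520))  # [200, 200, 100, 20]
```

Processing the coins in decreasing order and using each as often as it fits
is equivalent to selecting the largest possible coin one at a time.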
|
|
|
|
|
|
|
|
|
|
\subsubsection{General case}
|
|
|
|
|
|
|
|
|
|
In the general case, the coin set can contain any coins
|
2017-01-31 21:20:39 +01:00
|
|
|
|
and the greedy algorithm \emph{does not} necessarily produce
|
2017-01-01 22:43:44 +01:00
|
|
|
|
an optimal solution.
|
|
|
|
|
|
2017-01-31 21:20:39 +01:00
|
|
|
|
We can prove that a greedy algorithm does not work
|
2017-01-01 22:43:44 +01:00
|
|
|
|
by showing a counterexample
|
|
|
|
|
where the algorithm gives a wrong answer.
|
2017-01-31 21:20:39 +01:00
|
|
|
|
In this problem we can easily find a counterexample:
|
|
|
|
|
if the coins are $\{1,3,4\}$ and the target sum
|
2017-01-01 22:43:44 +01:00
|
|
|
|
is 6, the greedy algorithm produces the solution
|
2017-01-31 21:20:39 +01:00
|
|
|
|
$4+1+1$ while the optimal solution is $3+3$.
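
The counterexample can be checked mechanically. The sketch below compares the
greedy coin count with the true minimum, found by an exhaustive breadth-first
search over reachable sums. Both function names are assumptions made for this
example, and the search is meant only for verifying small cases, not as an
efficient algorithm.

```python
def greedy_count(coins, target):
    """Number of coins used when always taking the largest coin that fits."""
    count = 0
    for c in sorted(coins, reverse=True):
        count += target // c
        target %= c
    return count if target == 0 else None


def fewest_coins(coins, target):
    """Minimum number of coins, found by exhaustive search (for checking only)."""
    reachable = {0}  # sums that can be formed with at most `steps` coins
    steps = 0
    while target not in reachable:
        new_sums = {s + c for s in reachable for c in coins if s + c <= target}
        if new_sums <= reachable:
            return None  # nothing new can be formed, so the target is unreachable
        reachable |= new_sums
        steps += 1
    return steps


print(greedy_count([1, 3, 4], 6))  # 3  (4+1+1)
print(fewest_coins([1, 3, 4], 6))  # 2  (3+3)
```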
|
2017-01-01 22:43:44 +01:00
|
|
|
|
|
2017-01-31 21:20:39 +01:00
|
|
|
|
It is not known whether the general coin problem
can be solved using any greedy algorithm.
However, as we will see in Chapter 7,
the general problem can be solved efficiently
using a dynamic
programming algorithm that always gives the
correct answer.
|
|
|
|
|
|
|
|
|
|
\section{Scheduling}
|
|
|
|
|
|
|
|
|
|
Many scheduling problems can be solved
using greedy algorithms.
A classic problem is as follows:
Given $n$ events with their starting and ending
times, our goal is to plan a schedule
that includes as many events as possible.
The selected events must not overlap, and
it is not possible to select an event partially.
For example, consider the following events:
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\begin{center}
|
|
|
|
|
\begin{tabular}{lll}
|
2017-01-01 22:43:44 +01:00
|
|
|
|
event & starting time & ending time \\
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\hline
|
|
|
|
|
$A$ & 1 & 3 \\
|
|
|
|
|
$B$ & 2 & 5 \\
|
|
|
|
|
$C$ & 3 & 9 \\
|
|
|
|
|
$D$ & 6 & 8 \\
|
|
|
|
|
\end{tabular}
|
|
|
|
|
\end{center}
|
2017-01-01 22:43:44 +01:00
|
|
|
|
In this case the maximum number of events is two.
|
2017-01-31 21:20:39 +01:00
|
|
|
|
For example, we can select events $B$ and $D$
|
2017-01-01 22:43:44 +01:00
|
|
|
|
as follows:
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\begin{center}
|
|
|
|
|
\begin{tikzpicture}[scale=.4]
|
|
|
|
|
\begin{scope}
|
|
|
|
|
\draw (2, 0) rectangle (6, -1);
|
|
|
|
|
\draw[fill=lightgray] (4, -1.5) rectangle (10, -2.5);
|
|
|
|
|
\draw (6, -3) rectangle (18, -4);
|
|
|
|
|
\draw[fill=lightgray] (12, -4.5) rectangle (16, -5.5);
|
|
|
|
|
\node at (2.5,-0.5) {$A$};
|
|
|
|
|
\node at (4.5,-2) {$B$};
|
|
|
|
|
\node at (6.5,-3.5) {$C$};
|
|
|
|
|
\node at (12.5,-5) {$D$};
|
|
|
|
|
\end{scope}
|
|
|
|
|
\end{tikzpicture}
|
|
|
|
|
\end{center}
|
|
|
|
|
|
2017-01-01 22:43:44 +01:00
|
|
|
|
It is possible to invent several greedy algorithms
|
|
|
|
|
for the problem, but which of them works in every case?
|
2016-12-28 23:54:51 +01:00
|
|
|
|
|
2017-01-01 22:43:44 +01:00
|
|
|
|
\subsubsection*{Algorithm 1}
|
2016-12-28 23:54:51 +01:00
|
|
|
|
|
2017-01-01 22:43:44 +01:00
|
|
|
|
The first idea is to select as \emph{short}
|
|
|
|
|
events as possible.
|
|
|
|
|
In the example case this algorithm
|
|
|
|
|
selects the following events:
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\begin{center}
|
|
|
|
|
\begin{tikzpicture}[scale=.4]
|
|
|
|
|
\begin{scope}
|
|
|
|
|
\draw[fill=lightgray] (2, 0) rectangle (6, -1);
|
|
|
|
|
\draw (4, -1.5) rectangle (10, -2.5);
|
|
|
|
|
\draw (6, -3) rectangle (18, -4);
|
|
|
|
|
\draw[fill=lightgray] (12, -4.5) rectangle (16, -5.5);
|
|
|
|
|
\node at (2.5,-0.5) {$A$};
|
|
|
|
|
\node at (4.5,-2) {$B$};
|
|
|
|
|
\node at (6.5,-3.5) {$C$};
|
|
|
|
|
\node at (12.5,-5) {$D$};
|
|
|
|
|
\end{scope}
|
|
|
|
|
\end{tikzpicture}
|
|
|
|
|
\end{center}
|
|
|
|
|
|
2017-02-13 22:16:30 +01:00
|
|
|
|
However, selecting short events is not always
a correct strategy.
For example, the algorithm fails in the following case:
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\begin{center}
|
|
|
|
|
\begin{tikzpicture}[scale=.4]
|
|
|
|
|
\begin{scope}
|
|
|
|
|
\draw (1, 0) rectangle (7, -1);
|
|
|
|
|
\draw[fill=lightgray] (6, -1.5) rectangle (9, -2.5);
|
|
|
|
|
\draw (8, -3) rectangle (14, -4);
|
|
|
|
|
\end{scope}
|
|
|
|
|
\end{tikzpicture}
|
|
|
|
|
\end{center}
|
2017-01-01 22:43:44 +01:00
|
|
|
|
If we select the short event, we can only select one event.
However, it would be possible to select both long events.
|
2016-12-28 23:54:51 +01:00
|
|
|
|
|
2017-01-01 22:43:44 +01:00
|
|
|
|
\subsubsection*{Algorithm 2}
|
2016-12-28 23:54:51 +01:00
|
|
|
|
|
2017-01-01 22:43:44 +01:00
|
|
|
|
Another idea is to always select the next possible
|
|
|
|
|
event that \emph{begins} as \emph{early} as possible.
|
|
|
|
|
This algorithm selects the following events:
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\begin{center}
|
|
|
|
|
\begin{tikzpicture}[scale=.4]
|
|
|
|
|
\begin{scope}
|
|
|
|
|
\draw[fill=lightgray] (2, 0) rectangle (6, -1);
|
|
|
|
|
\draw (4, -1.5) rectangle (10, -2.5);
|
|
|
|
|
\draw[fill=lightgray] (6, -3) rectangle (18, -4);
|
|
|
|
|
\draw (12, -4.5) rectangle (16, -5.5);
|
|
|
|
|
\node at (2.5,-0.5) {$A$};
|
|
|
|
|
\node at (4.5,-2) {$B$};
|
|
|
|
|
\node at (6.5,-3.5) {$C$};
|
|
|
|
|
\node at (12.5,-5) {$D$};
|
|
|
|
|
\end{scope}
|
|
|
|
|
\end{tikzpicture}
|
|
|
|
|
\end{center}
|
|
|
|
|
|
2017-01-31 21:20:39 +01:00
|
|
|
|
However, we can also find a counterexample
for this algorithm.
For example, in the following case,
the algorithm only selects one event:
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\begin{center}
|
|
|
|
|
\begin{tikzpicture}[scale=.4]
|
|
|
|
|
\begin{scope}
|
|
|
|
|
\draw[fill=lightgray] (1, 0) rectangle (14, -1);
|
|
|
|
|
\draw (3, -1.5) rectangle (7, -2.5);
|
|
|
|
|
\draw (8, -3) rectangle (12, -4);
|
|
|
|
|
\end{scope}
|
|
|
|
|
\end{tikzpicture}
|
|
|
|
|
\end{center}
|
2017-01-01 22:43:44 +01:00
|
|
|
|
If we select the first event, it is not possible
|
|
|
|
|
to select any other events.
|
2017-01-31 21:20:39 +01:00
|
|
|
|
However, it would be possible to select the
|
2017-01-01 22:43:44 +01:00
|
|
|
|
other two events.
|
|
|
|
|
|
|
|
|
|
\subsubsection*{Algorithm 3}
|
|
|
|
|
|
|
|
|
|
The third idea is to always select the next
|
|
|
|
|
possible event that \emph{ends} as \emph{early} as possible.
|
|
|
|
|
This algorithm selects the following events:
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\begin{center}
|
|
|
|
|
\begin{tikzpicture}[scale=.4]
|
|
|
|
|
\begin{scope}
|
|
|
|
|
\draw[fill=lightgray] (2, 0) rectangle (6, -1);
|
|
|
|
|
\draw (4, -1.5) rectangle (10, -2.5);
|
|
|
|
|
\draw (6, -3) rectangle (18, -4);
|
|
|
|
|
\draw[fill=lightgray] (12, -4.5) rectangle (16, -5.5);
|
|
|
|
|
\node at (2.5,-0.5) {$A$};
|
|
|
|
|
\node at (4.5,-2) {$B$};
|
|
|
|
|
\node at (6.5,-3.5) {$C$};
|
|
|
|
|
\node at (12.5,-5) {$D$};
|
|
|
|
|
\end{scope}
|
|
|
|
|
\end{tikzpicture}
|
|
|
|
|
\end{center}
|
|
|
|
|
|
2017-01-01 22:43:44 +01:00
|
|
|
|
It turns out that this algorithm
|
|
|
|
|
\emph{always} produces an optimal solution.
|
2017-01-31 21:20:39 +01:00
|
|
|
|
First, it is always an optimal choice
|
|
|
|
|
to first select an event that ends
|
|
|
|
|
as early as possible.
|
|
|
|
|
After this, it is an optimal choice
|
|
|
|
|
to select the next event
|
|
|
|
|
using the same strategy, etc.,
|
|
|
|
|
until we cannot select any more events.
|
|
|
|
|
|
|
|
|
|
One way to argue that the algorithm works
is to consider
what happens if we first select an event
that ends later than the event that ends
as early as possible.
Now, we will have at most an equal number of
choices for the next event.
Hence, selecting an event that ends later
can never yield a better solution,
and the greedy algorithm is correct.
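
Algorithm 3 can be sketched as follows. The function name
\texttt{max\_events} and the tuple layout are assumptions made for this
example. The sketch also assumes that an event may begin at the exact moment
the previous one ends; if that is not allowed, the comparison should be
strict.

```python
def max_events(events):
    """events: list of (name, start, end). Returns the names of selected events."""
    selected = []
    last_end = float("-inf")
    # Consider the events in increasing order of ending time.
    for name, start, end in sorted(events, key=lambda e: e[2]):
        if start >= last_end:  # the event does not overlap the previous choice
            selected.append(name)
            last_end = end
    return selected


events = [("A", 1, 3), ("B", 2, 5), ("C", 3, 9), ("D", 6, 8)]
print(max_events(events))  # ['A', 'D']
```

After sorting, a single pass suffices, so the running time is dominated by
the sort.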
|
2016-12-28 23:54:51 +01:00
|
|
|
|
|
2017-01-01 23:34:14 +01:00
|
|
|
|
\section{Tasks and deadlines}
|
|
|
|
|
|
2017-01-31 21:20:39 +01:00
|
|
|
|
Let us now consider a problem where
we are given $n$ tasks with durations and deadlines
and our task is to choose an order in which to perform the tasks.
For each task, we earn $d-x$ points
where $d$ is the task's deadline
and $x$ is the moment when we finish the task.
What is the largest possible total score
we can obtain?
|
|
|
|
|
|
2017-01-31 21:20:39 +01:00
|
|
|
|
For example, suppose that the tasks are as follows:
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\begin{center}
|
|
|
|
|
\begin{tabular}{lll}
|
2017-01-01 23:34:14 +01:00
|
|
|
|
task & duration & deadline \\
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\hline
|
|
|
|
|
$A$ & 4 & 2 \\
|
|
|
|
|
$B$ & 3 & 5 \\
|
|
|
|
|
$C$ & 2 & 7 \\
|
|
|
|
|
$D$ & 4 & 5 \\
|
|
|
|
|
\end{tabular}
|
|
|
|
|
\end{center}
|
2017-01-31 21:20:39 +01:00
|
|
|
|
In this case, an optimal schedule for the tasks
|
|
|
|
|
is as follows:
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\begin{center}
|
|
|
|
|
\begin{tikzpicture}[scale=.4]
|
|
|
|
|
\begin{scope}
|
|
|
|
|
\draw (0, 0) rectangle (4, -1);
|
|
|
|
|
\draw (4, 0) rectangle (10, -1);
|
|
|
|
|
\draw (10, 0) rectangle (18, -1);
|
|
|
|
|
\draw (18, 0) rectangle (26, -1);
|
|
|
|
|
\node at (0.5,-0.5) {$C$};
|
|
|
|
|
\node at (4.5,-0.5) {$B$};
|
|
|
|
|
\node at (10.5,-0.5) {$A$};
|
|
|
|
|
\node at (18.5,-0.5) {$D$};
|
|
|
|
|
|
|
|
|
|
\draw (0,1.5) -- (26,1.5);
|
|
|
|
|
\foreach \i in {0,2,...,26}
|
|
|
|
|
{
|
|
|
|
|
\draw (\i,1.25) -- (\i,1.75);
|
|
|
|
|
}
|
|
|
|
|
\footnotesize
|
|
|
|
|
\node at (0,2.5) {0};
|
|
|
|
|
\node at (10,2.5) {5};
|
|
|
|
|
\node at (20,2.5) {10};
|
|
|
|
|
|
|
|
|
|
\end{scope}
|
|
|
|
|
\end{tikzpicture}
|
|
|
|
|
\end{center}
|
2017-01-01 23:34:14 +01:00
|
|
|
|
In this solution, $C$ yields 5 points,
|
|
|
|
|
$B$ yields 0 points, $A$ yields $-7$ points
|
|
|
|
|
and $D$ yields $-8$ points,
|
|
|
|
|
so the total score is $-10$.
|
|
|
|
|
|
2017-01-31 21:20:39 +01:00
|
|
|
|
Surprisingly, the optimal solution to the problem
does not depend on the deadlines at all:
a correct greedy strategy is to simply
perform the tasks \emph{sorted by their durations}
in increasing order.
The reason for this is that if we ever perform
two tasks one after another such that the first task
takes longer than the second task,
we can obtain a better solution if we swap the tasks.
|
2017-01-31 21:20:39 +01:00
|
|
|
|
For example, consider the following schedule:
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\begin{center}
|
|
|
|
|
\begin{tikzpicture}[scale=.4]
|
|
|
|
|
\begin{scope}
|
|
|
|
|
\draw (0, 0) rectangle (8, -1);
|
|
|
|
|
\draw (8, 0) rectangle (12, -1);
|
|
|
|
|
\node at (0.5,-0.5) {$X$};
|
|
|
|
|
\node at (8.5,-0.5) {$Y$};
|
|
|
|
|
|
|
|
|
|
\draw [decoration={brace}, decorate, line width=0.3mm] (7.75,-1.5) -- (0.25,-1.5);
|
|
|
|
|
\draw [decoration={brace}, decorate, line width=0.3mm] (11.75,-1.5) -- (8.25,-1.5);
|
|
|
|
|
|
|
|
|
|
\footnotesize
|
|
|
|
|
\node at (4,-2.5) {$a$};
|
|
|
|
|
\node at (10,-2.5) {$b$};
|
|
|
|
|
|
|
|
|
|
\end{scope}
|
|
|
|
|
\end{tikzpicture}
|
|
|
|
|
\end{center}
|
2017-01-31 21:20:39 +01:00
|
|
|
|
Here $a>b$, so we should swap the tasks:
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\begin{center}
|
|
|
|
|
\begin{tikzpicture}[scale=.4]
|
|
|
|
|
\begin{scope}
|
|
|
|
|
\draw (0, 0) rectangle (4, -1);
|
|
|
|
|
\draw (4, 0) rectangle (12, -1);
|
|
|
|
|
\node at (0.5,-0.5) {$Y$};
|
|
|
|
|
\node at (4.5,-0.5) {$X$};
|
|
|
|
|
|
|
|
|
|
\draw [decoration={brace}, decorate, line width=0.3mm] (3.75,-1.5) -- (0.25,-1.5);
|
|
|
|
|
\draw [decoration={brace}, decorate, line width=0.3mm] (11.75,-1.5) -- (4.25,-1.5);
|
|
|
|
|
|
|
|
|
|
\footnotesize
|
|
|
|
|
\node at (2,-2.5) {$b$};
|
|
|
|
|
\node at (8,-2.5) {$a$};
|
|
|
|
|
|
|
|
|
|
\end{scope}
|
|
|
|
|
\end{tikzpicture}
|
|
|
|
|
\end{center}
|
2017-02-13 22:16:30 +01:00
|
|
|
|
Now $X$ gives $b$ points fewer and $Y$ gives $a$ points more,
|
2017-01-01 23:34:14 +01:00
|
|
|
|
so the total score increases by $a-b > 0$.
|
|
|
|
|
In an optimal solution,
|
2017-01-31 21:20:39 +01:00
|
|
|
|
for any two consecutive tasks,
|
2017-01-01 23:34:14 +01:00
|
|
|
|
it must hold that the shorter task comes
|
|
|
|
|
before the longer task.
|
|
|
|
|
Thus, the tasks must be performed
|
|
|
|
|
sorted by their durations.
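
The strategy can be sketched as follows, using the example tasks from the
table above. The function name \texttt{best\_schedule} and the tuple layout
are assumptions made for this example.

```python
def best_schedule(tasks):
    """tasks: list of (name, duration, deadline). Returns (order, total score)."""
    order = sorted(tasks, key=lambda t: t[1])  # shortest duration first
    time = 0
    score = 0
    for name, duration, deadline in order:
        time += duration          # moment when this task is finished
        score += deadline - time  # d - x points for this task
    return [name for name, _, _ in order], score


tasks = [("A", 4, 2), ("B", 3, 5), ("C", 2, 7), ("D", 4, 5)]
print(best_schedule(tasks))  # (['C', 'B', 'A', 'D'], -10)
```

Note that the deadlines only affect the score, not the order.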
|
|
|
|
|
|
|
|
|
|
\section{Minimizing sums}
|
|
|
|
|
|
|
|
|
|
We will next consider a problem where
|
|
|
|
|
we are given $n$ numbers $a_1,a_2,\ldots,a_n$
|
|
|
|
|
and our task is to find a value $x$
|
2017-01-31 21:20:39 +01:00
|
|
|
|
that minimizes the sum
|
|
|
|
|
\[|a_1-x|^c+|a_2-x|^c+\cdots+|a_n-x|^c.\]
|
2017-01-01 23:34:14 +01:00
|
|
|
|
We will focus on the cases $c=1$ and $c=2$.
|
2016-12-28 23:54:51 +01:00
|
|
|
|
|
2017-01-01 23:34:14 +01:00
|
|
|
|
\subsubsection{Case $c=1$}
|
2016-12-28 23:54:51 +01:00
|
|
|
|
|
2017-01-01 23:34:14 +01:00
|
|
|
|
In this case, we should minimize the sum
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\[|a_1-x|+|a_2-x|+\cdots+|a_n-x|.\]
|
2017-01-01 23:34:14 +01:00
|
|
|
|
For example, if the numbers are $[1,2,9,2,6]$,
|
|
|
|
|
the best solution is to select $x=2$
|
|
|
|
|
which produces the sum
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\[
|
|
|
|
|
|1-2|+|2-2|+|9-2|+|2-2|+|6-2|=12.
|
|
|
|
|
\]
|
2017-01-01 23:34:14 +01:00
|
|
|
|
In the general case, the best choice for $x$
|
|
|
|
|
is the \textit{median} of the numbers,
|
|
|
|
|
i.e., the middle number after sorting.
|
|
|
|
|
For example, the list $[1,2,9,2,6]$
|
|
|
|
|
becomes $[1,2,2,6,9]$ after sorting,
|
|
|
|
|
so the median is 2.
|
|
|
|
|
|
2017-01-31 21:20:39 +01:00
|
|
|
|
The median is an optimal choice,
because if $x$ is smaller than the median,
the sum becomes smaller by increasing $x$,
and if $x$ is larger than the median,
the sum becomes smaller by decreasing $x$.
Hence, the optimal solution is that $x$
is the median.
If $n$ is even and there are two medians,
both medians and all values between them
are optimal solutions.
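
As a small sketch of the case $c=1$, the following code picks a median and
evaluates the sum. The function names are assumptions made for this example;
when $n$ is even, the code returns the upper of the two medians, which is one
of the optimal choices.

```python
def abs_cost(a, x):
    """The sum |a_1 - x| + |a_2 - x| + ... + |a_n - x|."""
    return sum(abs(v - x) for v in a)


def best_x_abs(a):
    """A median of the numbers (the upper median when n is even)."""
    s = sorted(a)
    return s[len(s) // 2]


a = [1, 2, 9, 2, 6]
x = best_x_abs(a)
print(x, abs_cost(a, x))  # 2 12
```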
|
|
|
|
|
|
|
|
|
|
\subsubsection{Case $c=2$}
|
|
|
|
|
|
|
|
|
|
In this case, we should minimize the sum
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\[(a_1-x)^2+(a_2-x)^2+\cdots+(a_n-x)^2.\]
|
2017-01-01 23:34:14 +01:00
|
|
|
|
For example, if the numbers are $[1,2,9,2,6]$,
|
|
|
|
|
the best solution is to select $x=4$
|
|
|
|
|
which produces the sum
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\[
|
|
|
|
|
(1-4)^2+(2-4)^2+(9-4)^2+(2-4)^2+(6-4)^2=46.
|
|
|
|
|
\]
|
2017-01-01 23:34:14 +01:00
|
|
|
|
In the general case, the best choice for $x$
|
|
|
|
|
is the \emph{average} of the numbers.
|
|
|
|
|
In the example the average is $(1+2+9+2+6)/5=4$.
|
|
|
|
|
This result can be derived by presenting
|
|
|
|
|
the sum as follows:
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\[
|
2017-01-31 21:20:39 +01:00
|
|
|
|
nx^2 - 2x(a_1+a_2+\cdots+a_n) + (a_1^2+a_2^2+\cdots+a_n^2)
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\]
|
2017-01-31 21:20:39 +01:00
|
|
|
|
The last part does not depend on $x$,
so we can ignore it.
The remaining parts form a function
$nx^2-2xs$ where $s=a_1+a_2+\cdots+a_n$.
This is a parabola opening upwards
with roots $x=0$ and $x=2s/n$,
and its minimum is attained at the average
of the roots, $x=s/n$, i.e., at
the average of the numbers $a_1,a_2,\ldots,a_n$.
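
The case $c=2$ can be checked in the same way. In this sketch the function
names are assumptions made for this example, and exact fractions are used so
that a non-integer average is not distorted by rounding.

```python
from fractions import Fraction


def sq_cost(a, x):
    """The sum (a_1 - x)^2 + (a_2 - x)^2 + ... + (a_n - x)^2."""
    return sum((v - x) ** 2 for v in a)


def best_x_sq(a):
    """The average s/n of the numbers, as an exact fraction."""
    return Fraction(sum(a), len(a))


a = [1, 2, 9, 2, 6]
x = best_x_sq(a)
print(x, sq_cost(a, x))  # 4 46
```

Moving $x$ slightly in either direction, for example to $39/10$ or $41/10$,
gives a larger sum, as expected for the bottom of a parabola.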
|
|
|
|
|
|
|
|
|
|
\section{Data compression}
|
|
|
|
|
|
|
|
|
|
\index{data compression}
|
|
|
|
|
\index{binary code}
|
|
|
|
|
\index{codeword}
|
|
|
|
|
|
2017-01-31 21:20:39 +01:00
|
|
|
|
A \key{binary code} assigns to each character
of a given string a \key{codeword} that consists of bits.
We can \emph{compress} the string using the binary code
by replacing each character by the
corresponding codeword.
For example, the following binary code
assigns codewords to characters
\texttt{A}--\texttt{D}:
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\begin{center}
|
|
|
|
|
\begin{tabular}{rr}
|
2017-01-01 23:34:14 +01:00
|
|
|
|
character & codeword \\
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\hline
|
|
|
|
|
\texttt{A} & 00 \\
|
|
|
|
|
\texttt{B} & 01 \\
|
|
|
|
|
\texttt{C} & 10 \\
|
|
|
|
|
\texttt{D} & 11 \\
|
|
|
|
|
\end{tabular}
|
|
|
|
|
\end{center}
|
2017-01-01 23:34:14 +01:00
|
|
|
|
This is a \key{constant-length} code,
which means that the length of each
codeword is the same.
For example, we can compress the string
\texttt{AABACDACA} as follows:
|
2017-02-13 22:16:30 +01:00
|
|
|
|
\[00\,00\,01\,00\,10\,11\,00\,10\,00\]
|
2017-01-31 21:20:39 +01:00
|
|
|
|
Using this code, the length of the compressed
|
|
|
|
|
string is 18 bits.
|
2017-01-01 23:34:14 +01:00
|
|
|
|
However, we can compress the string better
if we use a \key{variable-length} code
where codewords may have different lengths.
Then we can give short codewords to
characters that appear often
and long codewords to characters
that appear rarely.
It turns out that an \key{optimal} code
for the above string is as follows:
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\begin{center}
|
|
|
|
|
\begin{tabular}{rr}
|
2017-01-01 23:34:14 +01:00
|
|
|
|
character & codeword \\
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\hline
|
|
|
|
|
\texttt{A} & 0 \\
|
|
|
|
|
\texttt{B} & 110 \\
|
|
|
|
|
\texttt{C} & 10 \\
|
|
|
|
|
\texttt{D} & 111 \\
|
|
|
|
|
\end{tabular}
|
|
|
|
|
\end{center}
|
2017-01-31 21:20:39 +01:00
|
|
|
|
An optimal code produces a compressed string
|
2017-01-01 23:34:14 +01:00
|
|
|
|
that is as short as possible.
|
2017-01-31 21:20:39 +01:00
|
|
|
|
In this case, the compressed string using
|
2017-01-01 23:34:14 +01:00
|
|
|
|
the optimal code is
|
2017-02-13 22:16:30 +01:00
|
|
|
|
\[0\,0\,110\,0\,10\,111\,0\,10\,0,\]
|
2017-01-31 21:20:39 +01:00
|
|
|
|
so only 15 bits are needed instead of 18 bits.
|
2017-01-01 23:34:14 +01:00
|
|
|
|
Thus, thanks to a better code it was possible to
|
|
|
|
|
save 3 bits in the compressed string.
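
The two compressed lengths can be reproduced with a short sketch. The names
\texttt{encode}, \texttt{FIXED} and \texttt{VARIABLE} are assumptions made for
this example; the dictionaries contain the two codes shown in the tables above.

```python
def encode(text, code):
    """Replace each character by its codeword and concatenate the results."""
    return "".join(code[ch] for ch in text)


FIXED = {"A": "00", "B": "01", "C": "10", "D": "11"}
VARIABLE = {"A": "0", "B": "110", "C": "10", "D": "111"}

text = "AABACDACA"
print(len(encode(text, FIXED)))     # 18
print(len(encode(text, VARIABLE)))  # 15
```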
|
|
|
|
|
|
2017-01-31 21:20:39 +01:00
|
|
|
|
We require that no codeword
is a prefix of another codeword.
For example, a code may not
contain both codewords 10
and 1011.
The reason for this is that we want
to be able to generate the original string
from the compressed string.
If a codeword could be a prefix of another codeword,
this would not always be possible.
For example, the following code is \emph{not} valid:
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\begin{center}
|
|
|
|
|
\begin{tabular}{rr}
|
2017-01-31 21:20:39 +01:00
|
|
|
|
character & codeword \\
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\hline
|
|
|
|
|
\texttt{A} & 10 \\
|
|
|
|
|
\texttt{B} & 11 \\
|
|
|
|
|
\texttt{C} & 1011 \\
|
|
|
|
|
\texttt{D} & 111 \\
|
|
|
|
|
\end{tabular}
|
|
|
|
|
\end{center}
|
2017-01-01 23:34:14 +01:00
|
|
|
|
Using this code, it would not be possible to know
|
2017-01-31 21:20:39 +01:00
|
|
|
|
if the compressed string 1011 corresponds to
|
2017-01-01 23:34:14 +01:00
|
|
|
|
the string \texttt{AB} or the string \texttt{C}.
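
The prefix requirement can be tested mechanically, and it is also what makes
a simple left-to-right decoder possible. The following sketch uses function
names that are assumptions made for this example. The check relies on the
fact that, after sorting the codewords, a codeword that is a prefix of some
other codeword is also a prefix of its immediate successor.

```python
def is_prefix_free(code):
    """True if no codeword is a prefix of another codeword."""
    words = sorted(code.values())
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))


def decode(bits, code):
    """Decode a bit string; assumes that the code is prefix-free."""
    inverse = {word: ch for ch, word in code.items()}
    out = []
    current = ""
    for bit in bits:
        current += bit
        if current in inverse:  # a complete codeword has been read
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("the bit string ends in the middle of a codeword")
    return "".join(out)


valid = {"A": "0", "B": "110", "C": "10", "D": "111"}
invalid = {"A": "10", "B": "11", "C": "1011", "D": "111"}
print(is_prefix_free(valid), is_prefix_free(invalid))  # True False
print(decode("001100101110100", valid))                # AABACDACA
```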
|
|
|
|
|
|
|
|
|
|
\index{Huffman coding}
|
|
|
|
|
|
|
|
|
|
\subsubsection{Huffman coding}
|
|
|
|
|
|
2017-02-21 00:17:36 +01:00
|
|
|
|
\key{Huffman coding} \cite{huf52} is a greedy algorithm
|
2017-01-01 23:34:14 +01:00
|
|
|
|
that constructs an optimal code for
|
2017-01-31 21:20:39 +01:00
|
|
|
|
compressing a given string.
|
2017-01-01 23:34:14 +01:00
|
|
|
|
The algorithm builds a binary tree
|
|
|
|
|
based on the frequencies of the characters
|
|
|
|
|
in the string,
|
2017-01-31 21:20:39 +01:00
|
|
|
|
and each character's codeword can be read
|
2017-01-01 23:34:14 +01:00
|
|
|
|
by following a path from the root to
|
|
|
|
|
the corresponding node.
|
2017-02-13 22:16:30 +01:00
|
|
|
|
A move to the left corresponds to bit 0,
|
2017-01-01 23:34:14 +01:00
|
|
|
|
and a move to the right corresponds to bit 1.
|
|
|
|
|
|
|
|
|
|
Initially, each character of the string is
|
|
|
|
|
represented by a node whose weight is the
|
2017-02-13 22:16:30 +01:00
|
|
|
|
number of times the character occurs in the string.
|
2017-01-01 23:34:14 +01:00
|
|
|
|
Then at each step two nodes with minimum weights
|
2017-01-31 21:20:39 +01:00
|
|
|
|
are combined by creating
|
2017-01-01 23:34:14 +01:00
|
|
|
|
a new node whose weight is the sum of the weights
|
|
|
|
|
of the original nodes.
|
2017-01-31 21:20:39 +01:00
|
|
|
|
The process continues until all nodes have been combined.
|
2017-01-01 23:34:14 +01:00
|
|
|
|
|
|
|
|
|
Next we will see how Huffman coding creates
|
|
|
|
|
the optimal code for the string
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\texttt{AABACDACA}.
|
2017-01-01 23:34:14 +01:00
|
|
|
|
Initially, there are four nodes that correspond
|
|
|
|
|
to the characters in the string:
|
2016-12-28 23:54:51 +01:00
|
|
|
|
|
|
|
|
|
\begin{center}
|
|
|
|
|
\begin{tikzpicture}[scale=0.9]
|
|
|
|
|
\node[draw, circle] (1) at (0,0) {$5$};
|
|
|
|
|
\node[draw, circle] (2) at (2,0) {$1$};
|
|
|
|
|
\node[draw, circle] (3) at (4,0) {$2$};
|
|
|
|
|
\node[draw, circle] (4) at (6,0) {$1$};
|
|
|
|
|
|
|
|
|
|
\node[color=blue] at (0,-0.75) {\texttt{A}};
|
|
|
|
|
\node[color=blue] at (2,-0.75) {\texttt{B}};
|
|
|
|
|
\node[color=blue] at (4,-0.75) {\texttt{C}};
|
|
|
|
|
\node[color=blue] at (6,-0.75) {\texttt{D}};
|
|
|
|
|
|
|
|
|
|
%\path[draw,thick,-] (4) -- (5);
|
|
|
|
|
\end{tikzpicture}
|
|
|
|
|
\end{center}
|
2017-01-01 23:34:14 +01:00
|
|
|
|
The node that represents character \texttt{A}
|
|
|
|
|
has weight 5 because character \texttt{A}
|
|
|
|
|
appears 5 times in the string.
|
|
|
|
|
The other weights have been calculated
|
|
|
|
|
in the same way.
|
|
|
|
|
|
|
|
|
|
The first step is to combine the nodes that
|
|
|
|
|
correspond to characters \texttt{B} and \texttt{D},
|
|
|
|
|
both with weight 1.
|
|
|
|
|
The result is:
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\begin{center}
|
|
|
|
|
\begin{tikzpicture}[scale=0.9]
|
|
|
|
|
\node[draw, circle] (1) at (0,0) {$5$};
|
|
|
|
|
\node[draw, circle] (3) at (2,0) {$2$};
|
|
|
|
|
\node[draw, circle] (2) at (4,0) {$1$};
|
|
|
|
|
\node[draw, circle] (4) at (6,0) {$1$};
|
|
|
|
|
\node[draw, circle] (5) at (5,1) {$2$};
|
|
|
|
|
|
|
|
|
|
\node[color=blue] at (0,-0.75) {\texttt{A}};
|
|
|
|
|
\node[color=blue] at (2,-0.75) {\texttt{C}};
|
|
|
|
|
\node[color=blue] at (4,-0.75) {\texttt{B}};
|
|
|
|
|
\node[color=blue] at (6,-0.75) {\texttt{D}};
|
|
|
|
|
|
|
|
|
|
\node at (4.3,0.7) {0};
|
|
|
|
|
\node at (5.7,0.7) {1};
|
|
|
|
|
|
|
|
|
|
\path[draw,thick,-] (2) -- (5);
|
|
|
|
|
\path[draw,thick,-] (4) -- (5);
|
|
|
|
|
\end{tikzpicture}
|
|
|
|
|
\end{center}
|
2017-01-01 23:34:14 +01:00
|
|
|
|
After this, the nodes with weight 2 are combined:
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\begin{center}
|
|
|
|
|
\begin{tikzpicture}[scale=0.9]
|
|
|
|
|
\node[draw, circle] (1) at (1,0) {$5$};
|
|
|
|
|
\node[draw, circle] (3) at (3,1) {$2$};
|
|
|
|
|
\node[draw, circle] (2) at (4,0) {$1$};
|
|
|
|
|
\node[draw, circle] (4) at (6,0) {$1$};
|
|
|
|
|
\node[draw, circle] (5) at (5,1) {$2$};
|
|
|
|
|
\node[draw, circle] (6) at (4,2) {$4$};
|
|
|
|
|
|
|
|
|
|
\node[color=blue] at (1,-0.75) {\texttt{A}};
|
|
|
|
|
\node[color=blue] at (3,1-0.75) {\texttt{C}};
|
|
|
|
|
\node[color=blue] at (4,-0.75) {\texttt{B}};
|
|
|
|
|
\node[color=blue] at (6,-0.75) {\texttt{D}};
|
|
|
|
|
|
|
|
|
|
\node at (4.3,0.7) {0};
|
|
|
|
|
\node at (5.7,0.7) {1};
|
|
|
|
|
\node at (3.3,1.7) {0};
|
|
|
|
|
\node at (4.7,1.7) {1};
|
|
|
|
|
|
|
|
|
|
\path[draw,thick,-] (2) -- (5);
|
|
|
|
|
\path[draw,thick,-] (4) -- (5);
|
|
|
|
|
\path[draw,thick,-] (3) -- (6);
|
|
|
|
|
\path[draw,thick,-] (5) -- (6);
|
|
|
|
|
\end{tikzpicture}
|
|
|
|
|
\end{center}
|
2017-01-01 23:34:14 +01:00
|
|
|
|
Finally, the two remaining nodes are combined:
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\begin{center}
|
|
|
|
|
\begin{tikzpicture}[scale=0.9]
|
|
|
|
|
\node[draw, circle] (1) at (2,2) {$5$};
|
|
|
|
|
\node[draw, circle] (3) at (3,1) {$2$};
|
|
|
|
|
\node[draw, circle] (2) at (4,0) {$1$};
|
|
|
|
|
\node[draw, circle] (4) at (6,0) {$1$};
|
|
|
|
|
\node[draw, circle] (5) at (5,1) {$2$};
|
|
|
|
|
\node[draw, circle] (6) at (4,2) {$4$};
|
|
|
|
|
\node[draw, circle] (7) at (3,3) {$9$};
|
|
|
|
|
|
|
|
|
|
\node[color=blue] at (2,2-0.75) {\texttt{A}};
|
|
|
|
|
\node[color=blue] at (3,1-0.75) {\texttt{C}};
|
|
|
|
|
\node[color=blue] at (4,-0.75) {\texttt{B}};
|
|
|
|
|
\node[color=blue] at (6,-0.75) {\texttt{D}};
|
|
|
|
|
|
|
|
|
|
\node at (4.3,0.7) {0};
|
|
|
|
|
\node at (5.7,0.7) {1};
|
|
|
|
|
\node at (3.3,1.7) {0};
|
|
|
|
|
\node at (4.7,1.7) {1};
|
|
|
|
|
\node at (2.3,2.7) {0};
|
|
|
|
|
\node at (3.7,2.7) {1};
|
|
|
|
|
|
|
|
|
|
\path[draw,thick,-] (2) -- (5);
|
|
|
|
|
\path[draw,thick,-] (4) -- (5);
|
|
|
|
|
\path[draw,thick,-] (3) -- (6);
|
|
|
|
|
\path[draw,thick,-] (5) -- (6);
|
|
|
|
|
\path[draw,thick,-] (1) -- (7);
|
|
|
|
|
\path[draw,thick,-] (6) -- (7);
|
|
|
|
|
\end{tikzpicture}
|
|
|
|
|
\end{center}
|
|
|
|
|
|
2017-01-01 23:34:14 +01:00
|
|
|
|
Now all nodes are in the tree, so the code is ready.
|
|
|
|
|
The following codewords can be read from the tree:
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\begin{center}
|
|
|
|
|
\begin{tabular}{rr}
|
2017-01-01 23:34:14 +01:00
|
|
|
|
character & codeword \\
|
2016-12-28 23:54:51 +01:00
|
|
|
|
\hline
|
|
|
|
|
\texttt{A} & 0 \\
|
|
|
|
|
\texttt{B} & 110 \\
|
|
|
|
|
\texttt{C} & 10 \\
|
|
|
|
|
\texttt{D} & 111 \\
|
|
|
|
|
\end{tabular}
|
|
|
|
|
\end{center}
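
The construction can be sketched using a priority queue. This is a minimal
sketch rather than a definitive implementation: the function name
\texttt{huffman\_code} is an assumption made for this example, and ties
between equal weights are broken by insertion order. For this reason the
individual codewords may differ from the table above (for example, the bits 0
and 1 may be swapped), but the codeword lengths and the total compressed
length are the same.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_code(text):
    """Return a dictionary that maps each character of text to its codeword."""
    freq = Counter(text)
    if not freq:
        return {}
    tie = count()  # tie-breaker, so that the heap never compares the nodes
    heap = [(weight, next(tie), ch) for ch, weight in freq.items()]
    heapq.heapify(heap)
    if len(heap) == 1:  # a single distinct character still needs one bit
        return {heap[0][2]: "0"}
    while len(heap) > 1:
        # Combine the two nodes with minimum weights into a new node.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tie), (left, right)))
    code = {}

    def walk(node, prefix):
        if isinstance(node, tuple):      # internal node: (left, right)
            walk(node[0], prefix + "0")  # a move to the left is bit 0
            walk(node[1], prefix + "1")  # a move to the right is bit 1
        else:                            # leaf: a character
            code[node] = prefix

    walk(heap[0][2], "")
    return code


text = "AABACDACA"
code = huffman_code(text)
print(sum(len(code[ch]) for ch in text))  # 15
```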
|
|
|
|
|
|
|
|
|
|
% \subsubsection{Why does the algorithm work?}
%
% Huffman coding is a greedy algorithm, because it
% always combines the two nodes whose weights are
% the smallest.
% Why is it certain that this method always produces
% an optimal code?
%
% Let $c(x)$ denote the number of times character $x$
% occurs in the string and $s(x)$
% the length of the codeword that corresponds to character $x$.
% Using this notation, the length of the
% bit representation of the string is
% \[\sum_x c(x) \cdot s(x),\]
% where the sum goes through all characters of the string.
% For example, in the previous example
% the length of the bit representation is
% \[5 \cdot 1 + 1 \cdot 3 + 2 \cdot 2 + 1 \cdot 3 = 15.\]
% A useful observation is that $s(x)$ equals
% the \emph{depth} of the node that corresponds to character $x$ in the tree,
% i.e., the distance from the top of the tree to the node.
%
% Let us first argue why an optimal code always corresponds to
% a binary tree where each node has
% either two branches downwards or no branches at all.
% Assume, for the sake of contradiction, that some node has
% only one branch downwards.
% For example, in the following tree this is the case at node $a$:
% \begin{center}
% \begin{tikzpicture}[scale=0.9]
% \node[draw, circle, minimum size=20pt] (3) at (3,1) {\phantom{$a$}};
% \node[draw, circle, minimum size=20pt] (2) at (4,0) {$b$};
% \node[draw, circle, minimum size=20pt] (5) at (5,1) {$a$};
% \node[draw, circle, minimum size=20pt] (6) at (4,2) {\phantom{$a$}};
%
% \path[draw,thick,-] (2) -- (5);
% \path[draw,thick,-] (3) -- (6);
% \path[draw,thick,-] (5) -- (6);
% \end{tikzpicture}
% \end{center}
% However, such a node $a$ is always unnecessary, because it
% only adds one bit to the paths that go
% through the node, and it cannot be used to distinguish two
% codewords from each other. Thus, the node can be removed
% from the tree, which yields a better code,
% so the tree of an optimal code cannot contain
% a node that has only one branch.
%
% Let us then argue why it is optimal at each step
% to combine the two nodes whose weights are the smallest.
|
|
|
|
|
% Tehdään vastaoletus, että solmun $a$ paino on pienin,
|
|
|
|
|
% mutta sitä ei saisi yhdistää aluksi toiseen solmuun,
|
|
|
|
|
% vaan sen sijasta tulisi yhdistää solmu $b$
|
|
|
|
|
% ja jokin toinen solmu:
|
|
|
|
|
% \begin{center}
|
|
|
|
|
% \begin{tikzpicture}[scale=0.9]
|
|
|
|
|
% \node[draw, circle, minimum size=20pt] (1) at (0,0) {\phantom{$a$}};
|
|
|
|
|
% \node[draw, circle, minimum size=20pt] (2) at (-2,-1) {\phantom{$a$}};
|
|
|
|
|
% \node[draw, circle, minimum size=20pt] (3) at (2,-1) {$a$};
|
|
|
|
|
% \node[draw, circle, minimum size=20pt] (4) at (-3,-2) {\phantom{$a$}};
|
|
|
|
|
% \node[draw, circle, minimum size=20pt] (5) at (-1,-2) {\phantom{$a$}};
|
|
|
|
|
% \node[draw, circle, minimum size=20pt] (8) at (-2,-3) {$b$};
|
|
|
|
|
% \node[draw, circle, minimum size=20pt] (9) at (0,-3) {\phantom{$a$}};
|
|
|
|
|
%
|
|
|
|
|
% \path[draw,thick,-] (1) -- (2);
|
|
|
|
|
% \path[draw,thick,-] (1) -- (3);
|
|
|
|
|
% \path[draw,thick,-] (2) -- (4);
|
|
|
|
|
% \path[draw,thick,-] (2) -- (5);
|
|
|
|
|
% \path[draw,thick,-] (5) -- (8);
|
|
|
|
|
% \path[draw,thick,-] (5) -- (9);
|
|
|
|
|
% \end{tikzpicture}
|
|
|
|
|
% \end{center}
|
|
|
|
|
% Solmuille $a$ ja $b$ pätee
|
|
|
|
|
% $c(a) \le c(b)$ ja $s(a) \le s(b)$.
|
|
|
|
|
% Solmut aiheuttavat bittiesityksen pituuteen lisäyksen
|
|
|
|
|
% \[c(a) \cdot s(a) + c(b) \cdot s(b).\]
|
|
|
|
|
% Tarkastellaan sitten toista tilannetta,
|
|
|
|
|
% joka on muuten samanlainen kuin ennen,
|
|
|
|
|
% mutta solmut $a$ ja $b$ on vaihdettu keskenään:
|
|
|
|
|
% \begin{center}
|
|
|
|
|
% \begin{tikzpicture}[scale=0.9]
|
|
|
|
|
% \node[draw, circle, minimum size=20pt] (1) at (0,0) {\phantom{$a$}};
|
|
|
|
|
% \node[draw, circle, minimum size=20pt] (2) at (-2,-1) {\phantom{$a$}};
|
|
|
|
|
% \node[draw, circle, minimum size=20pt] (3) at (2,-1) {$b$};
|
|
|
|
|
% \node[draw, circle, minimum size=20pt] (4) at (-3,-2) {\phantom{$a$}};
|
|
|
|
|
% \node[draw, circle, minimum size=20pt] (5) at (-1,-2) {\phantom{$a$}};
|
|
|
|
|
% \node[draw, circle, minimum size=20pt] (8) at (-2,-3) {$a$};
|
|
|
|
|
% \node[draw, circle, minimum size=20pt] (9) at (0,-3) {\phantom{$a$}};
|
|
|
|
|
%
|
|
|
|
|
% \path[draw,thick,-] (1) -- (2);
|
|
|
|
|
% \path[draw,thick,-] (1) -- (3);
|
|
|
|
|
% \path[draw,thick,-] (2) -- (4);
|
|
|
|
|
% \path[draw,thick,-] (2) -- (5);
|
|
|
|
|
% \path[draw,thick,-] (5) -- (8);
|
|
|
|
|
% \path[draw,thick,-] (5) -- (9);
|
|
|
|
|
% \end{tikzpicture}
|
|
|
|
|
% \end{center}
|
|
|
|
|
% Osoittautuu, että tätä puuta vastaava koodi on
|
|
|
|
|
% \emph{yhtä hyvä tai parempi} kuin alkuperäinen koodi, joten vastaoletus
|
|
|
|
|
% on väärin ja Huffmanin koodaus
|
|
|
|
|
% toimiikin oikein, jos se yhdistää aluksi solmun $a$
|
|
|
|
|
% jonkin solmun kanssa.
|
|
|
|
|
% Tämän perustelee seuraava epäyhtälöketju:
|
|
|
|
|
% \[\begin{array}{rcl}
|
|
|
|
|
% c(b) & \ge & c(a) \\
|
|
|
|
|
% c(b)\cdot(s(b)-s(a)) & \ge & c(a)\cdot (s(b)-s(a)) \\
|
|
|
|
|
% c(b)\cdot s(b)-c(b)\cdot s(a) & \ge & c(a)\cdot s(b)-c(a)\cdot s(a) \\
|
|
|
|
|
% c(a)\cdot s(a)+c(b)\cdot s(b) & \ge & c(a)\cdot s(b)+c(b)\cdot s(a) \\
|
|
|
|
|
% \end{array}\]
|