The Greedy Strategy

For many games, such as chess, you should think ahead: a strategy that focuses on immediate advantage can easily lead to defeat.

The World Chess Championship seems to be going on recently. Back in my senior year of high school, when Ding Liren won the title, I also played around on chess.com for a bit, pitting two AIs (chess.com's and the cloud database query) against each other; the latter was stronger.

12.13: My mouth must be jinxed. I casually mentioned the red-eyed GOAT, and he lost; I mentioned Ding Liren, and he lost too.

But for many other games, you can do quite well by simply making whichever move seems best at the moment, without worrying too much about future consequences.

The greedy algorithmic strategy: given a problem, build up a solution piece by piece, always choosing the next piece that offers the most obvious and immediate benefit.

An Activity-Selection Problem

Assume we have one hall and $n$ activities $S = \{a_1, \dots, a_n\}$:

  • Each activity has a start time $s_i$ and a finish time $f_i$.
  • Two activities cannot happen simultaneously in the hall.

So what is the maximum number of activities we can schedule?

Divide and Conquer:

  • Define $S_i$ to be the set of activities that start after $a_i$ finishes;
  • Define $F_i$ to be the set of activities that finish before $a_i$ starts.
  • $\operatorname{OPT}(S) = \max\limits_{1\le i\le n} \{\operatorname{OPT}(F_i) + 1 + \operatorname{OPT}(S_i)\}$.

In any solution, some activity is the first to finish, so $\operatorname{OPT}(S) = \max\limits_{1\le i \le n}\{1 + \operatorname{OPT}(S_i)\}$. This causes $n \times (n-1) \times \dots$ subproblems, which is not efficient.

Observation: to make $\operatorname{OPT}(S)$ as large as possible, the activity that finishes first should finish as early as possible.

```
ActivitySelection(S):
    Sort S into increasing order of finish time
    SOL := {a1}
    a' := a1
    for i := 2 to n:
        if ai.start >= a'.finish:
            SOL := SOL + {ai}
            a' := ai
    return SOL
```
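A runnable Python sketch of this pseudocode (the function name and the `(start, finish)` tuple representation are my own choices; intervals are treated as half-open, so an activity may start exactly when the previous one finishes):

```python
def activity_selection(activities):
    """Greedy activity selection: sort by finish time, take each compatible activity."""
    ordered = sorted(activities, key=lambda a: a[1])  # increasing finish time
    solution = [ordered[0]]
    last_finish = ordered[0][1]
    for start, finish in ordered[1:]:
        if start >= last_finish:  # compatible with the last chosen activity
            solution.append((start, finish))
            last_finish = finish
    return solution

# Three of these four activities can be scheduled.
print(activity_selection([(1, 4), (3, 5), (4, 7), (8, 9)]))
# → [(1, 4), (4, 7), (8, 9)]
```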

How to formally prove the correctness of this algorithm?

  • The first selected activity is in some optimal solution.
  • Each subsequent selection remains consistent with that optimal solution.

Lemma 1

Let $a'$ be the earliest finishing activity in $S$; then $a'$ is in some optimal solution to the problem.

Proof.

Let $\operatorname{OPT}(S)$ be an optimal solution to the problem, and let $a$ be the earliest finishing activity in $\operatorname{OPT}(S)$.

Assume $a' \notin \operatorname{OPT}(S)$; otherwise we are done.

Then $\operatorname{SOL}(S) = \operatorname{OPT}(S) + a' - a$ is also a feasible solution: $a'$ finishes no later than $a$ does, and every other activity in $\operatorname{OPT}(S)$ starts after $a$ finishes, so $a'$ conflicts with none of them. Moreover, $\operatorname{SOL}(S)$ has the same size as $\operatorname{OPT}(S)$.

So $\operatorname{SOL}(S)$ is also an optimal solution.

Lemma 2

Let $a'$ be the earliest finishing activity in $S$, and let $S'$ be the set of activities that start after $a'$ finishes. Then $\operatorname{OPT}(S') \cup \{a'\}$ is an optimal solution to the problem.

Proof.

Let $\operatorname{OPT}(S)$ be an optimal solution to the original problem with $a' \in \operatorname{OPT}(S)$; such a solution exists by Lemma 1.

Thus $\operatorname{OPT}(S) = \operatorname{SOL}(S') \cup \{a'\}$, where $\operatorname{SOL}(S') = \operatorname{OPT}(S) \setminus \{a'\}$ is a feasible solution for $S'$.

If $\operatorname{OPT}(S') \cup \{a'\}$ is not an optimal solution to the original problem, then it must be the case that $|\operatorname{SOL}(S')| > |\operatorname{OPT}(S')|$.

But this contradicts the fact that $\operatorname{OPT}(S')$ is an optimal solution for $S'$.

The correctness of the algorithm then follows by induction.

Elements of the Greedy Strategy

If an (optimization) problem has the following two properties, then the greedy strategy usually works for it:

  • Optimal substructure
  • Greedy-choice property

Optimal Substructure

A problem exhibits optimal substructure if an optimal solution to the problem contains within it optimal solution(s) to subproblem(s):

  • A size-$n$ problem $P(n)$ has optimal solution $\operatorname{OPT}_{P(n)}$.
  • Solving $P(n)$ requires solving a size-$n'$ subproblem $P(n')$, where $n' < n$.
  • The optimal solution of $P(n')$ is $\operatorname{OPT}_{P(n')}$.
  • $\operatorname{OPT}_{P(n)}$ contains a solution of $P(n')$: $\operatorname{SOL}_{P(n')}$.
  • Optimal substructure property: $\operatorname{SOL}_{P(n')} = \operatorname{OPT}_{P(n')}$.

That is, an optimal solution to the problem contains optimal solutions to its subproblems.

Counterexample: the longest simple path problem. A longest path from $u$ to $w$ passing through $v$ need not contain a longest path from $u$ to $v$.

Greedy-Choice Property

At each step when building a solution, make the choice that looks best for the current problem, without considering results from subproblems. That is, make a local greedy choice at each step.

If the problem only admits optimal substructure:

  • Find the $i$ that maximizes the utility $a_i + \operatorname{OPT}_{P(n_i)}$.
  • We have to compute $\operatorname{OPT}_{P(n_i)}$ for all $i$ first.

With the greedy-choice property, we have a way to pick the correct $i$ without knowing any $\operatorname{OPT}_{P(n_i)}$.

Examples

Fractional Knapsack Problem

A thief robbing a warehouse finds $n$ items $A = \{a_1, \dots, a_n\}$.

Item $a_i$ is worth $v_i$ dollars and weighs $w_i$ pounds.

The thief can carry at most $W$ pounds in his knapsack.

The thief can carry fractions of items.

What should the thief take to maximize his profit?

A greedy strategy: keep taking the most cost-efficient item (i.e., the one maximizing $\frac{v_i}{w_i}$) until the knapsack is full.

The greedy solution is optimal.

The proof is a simple exchange argument: if some optimal solution differs from the greedy one, we can remove a fraction of a less cost-efficient item from it and replace that weight with a more cost-efficient item, without decreasing the total value; repeating this turns the optimal solution into the greedy solution.
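A short Python sketch of this greedy rule (the function name and the example instance are illustrative):

```python
def fractional_knapsack(items, capacity):
    """items: list of (value, weight). Take highest value/weight ratio first."""
    total = 0.0
    for value, weight in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        if capacity <= 0:
            break
        take = min(weight, capacity)      # whole item, or the fraction that fits
        total += value * (take / weight)
        capacity -= take
    return total

# Ratios are 6, 5, 4: take all of the first two items and 2/3 of the third.
print(fractional_knapsack([(60, 10), (100, 20), (120, 30)], 50))  # → 240.0
```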

0-1 Knapsack Problem

The thief can only take an item or not.

The greedy solution is NOT optimal.

A simple counterexample:

  • Two items:
  • Item One has value $2$ and weighs $1$ pound.
  • Item Two has value $W$ and weighs $W$ pounds.

If we use the greedy strategy (picking by value-to-weight ratio), we will only take Item One, for a total value of $2$, while the optimal value is $W$.

The greedy solution can be arbitrarily bad.

The exchange argument above does not work here, because we cannot replace items fractionally.
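The counterexample can be checked directly; here is a small Python sketch comparing the ratio-greedy solution against brute force (the helper names are mine):

```python
from itertools import combinations

def greedy_01(items, capacity):
    """0-1 greedy by value/weight ratio: take an item whole if it still fits."""
    total = 0
    for value, weight in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        if weight <= capacity:
            total += value
            capacity -= weight
    return total

def optimal_01(items, capacity):
    """Brute force over all subsets; fine for tiny instances."""
    best = 0
    for r in range(len(items) + 1):
        for subset in combinations(items, r):
            if sum(w for _, w in subset) <= capacity:
                best = max(best, sum(v for v, _ in subset))
    return best

W = 100
items = [(2, 1), (W, W)]     # Item One: value 2, weight 1; Item Two: value W, weight W
print(greedy_01(items, W))   # → 2 (greedy grabs Item One, then Item Two no longer fits)
print(optimal_01(items, W))  # → 100
```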

A Data Compression Problem

This should be related to the "prefix code" topic in Discrete Mathematics, which can be consulted for reference (the link given there points to the "spanning tree" section below it, since there are not enough subheadings inside, so you need to scroll up).

Assume we have a data file containing 100k characters. Further assume the file uses only 6 distinct characters. How should we store this file to save space?

The simplest way is to use a fixed 3-bit codeword for each character, since $\lceil \log_2 6 \rceil = 3$.

Instead of using fixed-length codeword for each char, we should let frequent chars use shorter codewords. That is, use a variable-length code.

Use a binary tree to visualize a prefix-free code.

  • Each leaf denotes a char.
  • Each internal node: left branch is 0, right branch is 1.
  • Path from root to leaf is the codeword of that char.

An optimal code must be represented by a full binary tree, i.e., a tree in which every node has zero or two children: if a node had only one child, we could merge it with its child and obtain strictly shorter codewords.

Consider a file using a size-$n$ alphabet $C = \{c_1, \dots, c_n\}$. For each character $c_i$, let $f_i$ be its frequency.

Let $T$ be a full binary tree representing a prefix-free code. For each character $c_i$, let $d_T(i)$ be the depth of $c_i$ in $T$.

So the length of the encoded message is $\displaystyle \sum_{i=1}^n f_i \cdot d_T(i)$.

Alternatively, recursively (bottom-up) define each internal node's frequency to be the sum of its two children's frequencies.

So the length of the encoded message is $\displaystyle \sum_{u \in T \setminus \{\text{root}\}} f_u$.

Huffman codes: merge the two least frequent characters and recurse.

```
Huffman(C):
    Build a priority queue Q based on frequency
    for i := 1 to n-1:
        Allocate new node z
        x := z.left := Q.ExtractMin()
        y := z.right := Q.ExtractMin()
        z.freq := x.freq + y.freq
        Q.Insert(z)
    return Q.ExtractMin()
```
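A runnable Python sketch of Huffman(C), using `heapq` as the priority queue (the frequencies below are just a test alphabet; ties may break differently than in another implementation, but the codeword lengths are what matter):

```python
import heapq

def huffman(freqs):
    """freqs: dict char -> frequency. Returns dict char -> binary codeword."""
    # A unique counter breaks ties so tuple comparison never reaches the trees.
    heap = [(f, i, c) for i, (c, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # two least frequent subtrees
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, counter, (left, right)))
        counter += 1
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):         # internal node: left branch 0, right 1
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                               # leaf: record the codeword
            codes[node] = prefix or "0"
    walk(heap[0][2], "")
    return codes

codes = huffman({"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5})
print(codes["a"])  # the most frequent char gets a length-1 codeword
```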


Proof. Correctness of Huffman Codes

  • Lemma 1 [greedy choice]: Let $x, y$ be the two least frequent characters. Then in some optimal code tree, $x, y$ are siblings and have the largest depth.
  • Lemma 2 [optimal substructure]: Let $x$ and $y$ be the two least frequent characters in $C$. Let $C_z = C - \{x, y\} + \{z\}$ with $f_z = f_x + f_y$. Let $T_z$ be an optimal code tree for $C_z$. Let $T$ be the code tree obtained from $T_z$ by replacing leaf node $z$ with an internal node having $x$ and $y$ as children. Then $T$ is an optimal code tree for $C$.

Lemma 1 Proof sketch:

  • Let $T$ be an optimal code tree with depth $d$.
  • Let $a$ and $b$ be siblings at depth $d$. (Recall $T$ is a full binary tree.)
  • Assume $\{a, b\} \neq \{x, y\}$. (Otherwise we are done.)
  • Let $T'$ be the code tree obtained by swapping $a$ and $x$.
  • $\operatorname{cost}(T') = \operatorname{cost}(T) + (d - d_T(x)) \cdot f_x - (d - d_T(x)) \cdot f_a = \operatorname{cost}(T) + (d - d_T(x)) \cdot (f_x - f_a) \le \operatorname{cost}(T)$, since $f_x \le f_a$.
  • Swapping $b$ and $y$, obtaining $T''$, likewise does not increase the cost.
  • So $T''$ must also be an optimal code tree.

Lemma 2 Proof sketch:

  • Let $T'$ be an optimal code tree for $C$ with $x, y$ being sibling leaves; such a tree exists by Lemma 1. Merging $x, y$ in $T'$ into a single leaf $z$ gives a code tree $T'_z$ for $C_z$.
  • $\operatorname{cost}(T') = f_x + f_y + \sum_{u \in T' \setminus \{\text{root}\},\, u \notin \{x, y\}} f_u = f_x + f_y + \operatorname{cost}(T'_z) \ge f_x + f_y + \operatorname{cost}(T_z) = \operatorname{cost}(T)$
  • So $T$ must be an optimal code tree for $C$.

Set Cover

Suppose we need to build schools for $n$ towns. Each school must be located in a town, and no child should travel more than 30 km to reach a school. What is the minimum number of schools we need to build?

The Set Cover Problem:

  • Input: a universe $U$ of $n$ elements and $\mathcal{S} = \{S_1, \dots, S_m\}$, where each $S_i \subseteq U$.
  • Output: $\mathscr{C} \subseteq \mathcal{S}$ such that $\bigcup_{S_i \in \mathscr{C}} S_i = U$.
    • That is, a subset of $\mathcal{S}$ that covers $U$.
  • Goal: minimize $|\mathscr{C}|$.

Simple greedy strategy:

  • Keep picking the town that covers the most remaining uncovered towns, until we are done.
    • That is, pick the set that covers the most uncovered elements, until all elements are covered.
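A direct Python sketch of this rule (the instance is made up; the sketch assumes the given sets do cover the universe):

```python
def greedy_set_cover(universe, sets):
    """Repeatedly pick the set covering the most still-uncovered elements."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(sets, key=lambda s: len(s & uncovered))
        chosen.append(best)
        uncovered -= best
    return chosen

U = range(1, 7)
S = [{1, 2, 3, 4}, {3, 4, 5}, {5, 6}, {1, 6}]
print(len(greedy_set_cover(U, S)))  # → 2
```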

However, this greedy strategy is not always optimal.

Still, the greedy solution to Set Cover is close to optimal.

Theorem

Suppose the optimal solution uses $k$ sets; then the greedy solution will use at most $k \ln n$ sets.

Proof.

Let $n_t$ be the number of uncovered elements after $t$ iterations, so $n_0 = n$.

These $n_t$ elements can be covered by some $k$ sets, since the optimal solution's $k$ sets cover all of $U$.

So, by the pigeonhole principle, one of these sets covers at least $\frac{n_t}{k}$ of the uncovered elements. Since we pick the set that covers the most uncovered elements, we cover at least $\frac{n_t}{k}$ elements in the next iteration.

Thus $n_{t+1} \le n_t - \frac{n_t}{k} = n_t\left(1 - \frac{1}{k}\right)$.

So

$$n_t \le n_0\left(1 - \frac{1}{k}\right)^t < n_0 \cdot e^{-t/k},$$

using the inequality $1 - x < e^{-x}$ for $x > 0$.

With $t = k \ln n$, we have $n_t < 1$; since $n_t$ is an integer, all elements must be covered by then.
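A quick numeric sanity check of this bound: iterating $n_{t+1} = n_t(1 - 1/k)$ drops below 1 within $k \ln n$ steps ($n$ and $k$ below are arbitrary test values):

```python
import math

n, k = 1000, 10
n_t, t = float(n), 0
while n_t >= 1:
    n_t *= 1 - 1 / k   # the worst case allowed by the analysis
    t += 1
print(t, "<=", math.ceil(k * math.log(n)))  # t stays within the k*ln(n) bound
```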

So the greedy strategy gives a $\ln n$-approximation algorithm, and it has an efficient (polynomial-time) implementation.

We most likely cannot do better if we insist on efficient algorithms.

[Dinur & Steurer, STOC 2014]: there is no polynomial-time $(1 - o(1)) \cdot \ln n$ approximation algorithm for Set Cover unless P = NP.

So greed can give optimal (MST, Huffman codes), arbitrarily bad (0-1 knapsack), or near-optimal (Set Cover) solutions.