9.27

From The Algorithm Design Manual Solution Wiki
Jump to navigation Jump to search

Greedy-Approximation for Set Cover
Given a universe U of n elements and a family S = {S1, … , Sm} of subsets whose union is U, repeatedly pick the set that covers the largest number of still-uncovered elements until every element is covered.

Why it works. Although NP-hard to solve optimally, this greedy rule guarantees an H(n) ≈ ln n + 1 approximation factor (H is the n-th harmonic number), which is the best possible in polynomial time unless P = NP.

Pseudocode (unit costs):

GREEDY-SET-COVER(U, S) C ← ∅ // chosen sets U_left ← U // uncovered elements while U_left ≠ ∅ pick T ∈ S that maximizes |T ∩ U_left| C ← C ∪ {T} U_left ← U_left \ T return C

Python implementation (each set is a Python set):
def greedy_set_cover(universe, subsets): uncovered = set(universe) chosen = [] while uncovered: best = max(subsets, key=lambda s: len(s & uncovered)) chosen.append(best) uncovered -= best return chosen

Complexities: Each iteration scans all m sets and performs set intersections costing O(n). In the worst case we iterate |C| ≤ n times, giving O(mn²) naïvely, or O(mn) if element counts are cached and updated incrementally.

Thus, the presented greedy algorithm is easy to code (see above), runs in polynomial time, and returns a solution no worse than a factor H(n) from the optimum.


Back to Chapter 9