10.23

From The Algorithm Design Manual Solution Wiki
Jump to navigation Jump to search

Idea
The problem is the same as finding the cheapest way to parenthesize a sequence (or to triangulate a polygon).
Insert two fictitious “breaks”, one at the beginning (position 0) and one at the end (position L). Let   pos[0] = 0 < pos[1] < … < pos[m] < pos[m+1] = L
(m is the number of real cuts).
Define dp[i][j] = minimum cost of completely breaking the substring that starts just after pos[i] and ends at pos[j] (i < j).
The length of that substring is len = pos[j] – pos[i].

Recurrence
If no real cut lies between i and j (j = i+1) the cost is 0. Otherwise, choose the first cut inside this interval at some k, i < k < j. That cut costs len units, plus the optimal costs of the two sub-intervals it creates:   dp[i][j] = mini<k<j ( dp[i][k] + dp[k][j] + len )

Algorithm (O(n³))  (n = m+2 is the size of pos)

for i in 0..n-2: # base cases dp[i][i+1] = 0 for width in 2..n-1: # number of endpoints between i and j for i in 0..n-1-width: j = i + width dp[i][j] = ∞ len = pos[j] - pos[i] for k in i+1..j-1: # first cut to perform cost = dp[i][k] + dp[k][j] + len if cost < dp[i][j]: dp[i][j] = cost answer = dp[0][n-1]

Correctness sketch
The outermost cut inside any interval (i,j) must be at some k. Removing that cut isolates two smaller sub-problems (i,k) and (k,j), whose optimal solutions are, by definition, dp[i][k] and dp[k][j]. Adding the unavoidable cost pos[j]–pos[i] for this outermost cut yields the recurrence; choosing the minimum over all feasible k gives the optimal value. Induction on the interval width proves optimality.

Complexities
There are O(n²) intervals (i,j). For each, the inner loop scans at most O(n) choices of k. Hence total time O(n³) and space O(n²).

Result
Running the algorithm on the example (cuts at 3, 8, 10 of a 20-character string) returns the optimal cost 38, matching the right-to-left strategy.


Back to Chapter 10