Dynamic Programming (Draft)

1. Introduction

A discrete optimization problem is a problem of maximizing an objective function on a finite set of potential solutions. Equivalently, minimizing a cost function on a finite solution set is also called a discrete-optimization problem.

A classic discrete optimization problem is the shortest-path problem, which is the problem of finding the shortest path between two nodes in a graph.

Consider, for example, a road-trip scenario from New York City to San Francisco. For simplicity’s sake, we assume that there are only three other cities: Chicago, Kansas City, and Dallas. Let us assume further that we are required to pass through at least one of the three cities. The graph structure of this shortest-path problem might look like this:

The NYC-SF shortest path problem, simplified

How do we find the shortest path? Since we are required to go through at least one of the three cities, the shortest path must be one of the following:

  1. The path to Chicago, followed by the shortest path from Chicago to San Francisco.
  2. The path to Kansas City, followed by the shortest path from Kansas City to San Francisco.
  3. The path to Dallas, followed by the shortest path from Dallas to San Francisco.

Observe that the above solution reduces the shortest-path problem to a smaller variant—smaller, because the three cities are closer to San Francisco than New York City. This is a manifestation of the principle of optimality:

Principle of Optimality. An optimal policy has the property that whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision. (Bellman, 1957)

Indeed, if a shortest path from New York City to San Francisco passes through Chicago, the subpath from Chicago to San Francisco must be a shortest path in between the two cities as well.

The principle of optimality suggests that a discrete optimization problem can be tackled in a piecemeal fashion, solving the smaller variants of the problem first and then extending the solutions to obtain a solution to the whole problem. This is the crux of dynamic programming.

In this post, we study the main technical ideas behind dynamic programming, as well as their applications to a variety of discrete optimization problems. We start with a survey of the greedy-algorithms approach, a restricted form of dynamic programming that always opt for a local optimum at each step. [TODO: and then what?]

2. Greedy Algorithms

A greedy algorithm is a method of tackling a discrete optimization problem by making the locally optimal choice at each stage. It is not entirely obvious why such a method should yield a global optimum, but this is the case for a surprising number of optimization problems.

In this section, we first solve a simple optimization problem whose solution can be found via the greedy algorithm approach. We then abstract the problem-solving technique exhibited in the example problem to tackle a number of optimization problems. Afterwards, we touch upon the theory of matroids, which provides a general description of optimization problems that admit greedy algorithm solutions.

2.1. Activity Selection

2.1.1. Problem statement. A classic example of a discrete optimization problem that admits a greedy solution is the activity-selection problem, which entails finding a maximal collection of non-overlapping activities. In this problem, we aim to maximize the total number of activities, and the set of solutions to search through consists of schedule candidates with activities that do not overlap with one another.

More formally, we let be a set of activities. Each activity has a start time and an end time such that . For notational convenience, we rearrange to have the ordering

We declare two activities and to be non-overlapping if . Our goal is to find the maximal subset of consisting of non-overlapping activities.

2.1.2. Greedy algorithm. The naïve solution examines all possible combinations of the events, requiring time to construct a solution.

The greedy algorithm approach suggests that we first pick the activity that ends as early as possible: namely, . We then consider the activities with starting time later than or equal to , and choose the one that ends as early as possible. Since the collection of activities shrinks at each stage, the recursion terminates after a finite number of steps.

In the below Python implementation, we pad the input lists with a dummy event that starts and ends at 0, so that activity_selector(a, 0, n-1, m, M) includes to the output list, as desired.

def activity_selector(a, start_ind, end_ind, m, M):
    if start_ind == 0:
        a, m, M = [0] + a, [0] + m, [0] + M
    i = start_ind + 1
    while i <= end_ind and m[i] < M[start_ind]:
        i = i + 1
    if i <= end_ind:
        return [a[i]] + activity_selector(a, i, end_ind, m, M)
    else:
        return []

Since every activity is examined exactly once, the time complexity of the greedy solution is . If we drop the assumption that a is sorted in the order of end times, then the overall time complexity becomes .

Given that activity_selector builds an optimal solution in the order of end times, we can do away with recursion and construct an optimal solution iteratively.

def iterative_activity_selector(a, m, M):
    i = 0
    solution = [a[0]]
    for k in range(2, len(a)):
        if m[k] >= M[i]:
            solution.append(a[k])
            i = k
    return solution

2.1.3. Optimality. Is the greedy solution an optimal solution? To see this, we let denote the collection of all activities such that . We shall show that, for each , the activity in with the earliest end time is included in at least one maximal subset of non-overlapping activities (henceforth maxset) in . From this, it follows from mathematical induction that making the greedy choice at each step yields an optimal solution.

We now fix and pick a maxset in , which we denote by . Let be the event in with the earliest end time. If , then is a maxset that contains . If , then

is a maxset in that contains . Since , we see that activities in remain non-overlapping. Now, , and so is a maxset in .

2.2. The Greedy Algorithm Paradigm

In the activity-selection example (Section 2.1), the principle of optimality manifests as optimal substructure: an optimal solution with respect to contains optimal solutions with respect to for all .

If a discrete optimization problem exhibits optimal substructure, then a greedy algorithm produces an optimal solution so long as a sequence of locally optimal choices produces a global solution. Local optimality is understood to mean a choice that is best at each given stage, without the information about potential future choices or solutions to other subproblems. This condition is known as the greedy-choice property.

In what follows, we sketch greedy algorithm solutions to a number of discrete optimization problems.

2.2.1. Paying with coins. Suppose we are given the standard coin denominations of U.S. currency: one cent, 5 cents, 10 cents, 25 cents, 50 cents, and 100 cents. How do we compute the minimum number of coins required to assemble a fixed, arbitrary amount of money?

The greedy algorithm approach would be to use as many 100 cents coins as possible, and then use as many 50 cents coins as possible, and so on. Since the smallest denomination is 1 cent, this algorithm always terminates with the correct sum.

To assemble cents, we need to make at least comparisons and at most comparisons, and so the time complexity of the greedy algorithm solution is .

For notational convenience, let us set , , , , , and .

The change-making problem exhibits optimal substructure. Specifically, if is an optimal solution for assembling

cents, then is an optimal solution for assembling

whenever for all . Indeed, if the latter fails to be optimal, the same adjustment needed to turn it into an optimal solution to the former to obtain a better solution, rendeirng the former non-optimal.

Finally, any non-greey choice necessarily increases the total number of coins, as it is always possible to sum smaller denominations to obtain a larger denomination. It follows that the greedy-choice property is satisfied.

2.2.2. Generalized change-making problem. We now fix a finite, increasing sequence of positive integers that satisfy the following properties:

  1. .
  2. For each , there exists a sequence of nonnegative integers such that

Given an arbitrary positive integer , the generalized change-making problem asks for a sequence of nonnegative integers such that the sum

is minimal among all sequences that satisfy the identity

As in the standard change-making problem, the greedy-algorithm solution would be to always choose the largest such that

where is the amount that remains to be assembled. Since , this algorithm always terminates with the correct sum.

The generalized change-making problem exhibits optimal substrcture for the same reason the standard change-making problem exhibits optimal structure. As for the greedy-choice property, the assumptions on allow us to carry over the proof of the greedy-choice property for the standard change-making problem with minor changes.

2.2.3. Fractional knapsack problem. We now consider a variation of the change-making problem, called the knapsack problem. In this problem, we assume the existence of a knapsack, or a backpack, that can hold a fixed, limited amount of weight. Given a collection of items with fixed values attached to each item, the knapsack problem asks the highest possible total value of the items that can fit in one knapsack, as well as the subcollection of items that achieves this maximal total value.

Here, we consider the fractional knapsack problem, where we are allowed to put fractions of items in the knapsack. Formally, the fractional knapsack problem is a discrete optimization problem that aims to maximize the sum

subject to the constraint

where and for each . We understand to be the value of item , to be the weight of item , to be the fraction of item that we choose to take, and to be the maximum weight the knapsack can handle.

The greedy-algorithm approach to solving the fractional knapsack problem would be to order the items in decreasing order of value per unit weight, i.e.,

We then choose the maximal such that

and choose, inductively, the maximal such that

until either the items run out or

Since both and are finite, the algorithm terminates. The algorithm makes at most comparisons, and so its time complexity is . If we take into account the time it takes to order the list of items, then the time complexity becomes .

To show that the greedy strategy is optimal, we first show that the fractional knapsack problem exhibits optimal substructure. Specifically, if is an optimal solution to the fractional knapsack problem with respect to a knapsack with maximal load of , then we show that is an optimal solution to the fractional knapsack problem with respect to a knapsack with maximal load of

whenever for all and the difference is positive. If fails to be optimal, then the same adjustment needed to obtain an optimal solution can be made to to obtain a better solution to the knapsack problem with maximal load of . Therefore, fails to be optimal, and we conclude that the fractional knapsack problem exhibits optimal substructure.

We observe also that any non-greedy choice decreases the total value of items in the knapsack, as the greedy choice always chooses the item with the best value per unit weight ratio. It follows that the fractional knapsack problem satisfies the greedy-choice property.

2.2.4. Shortest paths on a weighted graph. Moving away from change-making problems and their variants, we return to the problem of finding the shortest path between two nodes on a graph, which we examined briefly in Section 1.

Let be a weighted, directed graph with nonnegative weights. The shortest-path problem concerns finding a path between two nodes with minimal total weight.

To simplify our exposition, we think of edge weights as distances. The greedy algorithm paradigm suggests that, at each step, we move to an unexamined node closest to the starting node. The distances the each adjacent node of can be computed by adding the edge length and, if applicable, comparing with previously computed distances to the node.

This greedy algorithm solution, known as Dijkstra’s algorithm, can be implemented as follows:

def dijkstra(G, starting_node):
    distance[source] = 0
    search_set = {s}
    visited = {}
    while search_queue:
        # Find node closest to the source and pop it out of the search set.
        v, search_set = _closest_node(search_set, distance)
        visited.add(v)
        for u in G.edges(v):
            if u in search_set:
                distance[u] = min(distance[u],
                    distance[v] + G.edge[v][u]['weight'])
            elif u not in visited:
                distance[u] = distance[v] + G.edge[v][u]['weight']
                search_set.add(u)