These authors spend substantial time on a classic computer science method called "dynamic programming" (invented by Richard Bellman). It is widely used in bioinformatics. For example, sequence alignment algorithms such as NeedlemanWunsch and SmithWaterman are dynamic programming methods. The key is to have a problem that can be broken into stages or steps; the problem is attacked from one end or the other and the result for each stage is computed in turn, based on current knowledge, and the result is stored.
A problem they use to illustrate the method is the "change" problem, which I first ran across in SICP. (Another is Towers of Hanoi).
As much as I enjoyed the example, I think it is an unfortunate choice for the biologically oriented student of bioinformatics, because it is really more complex than necessary to illustrate the dynamic programming idea; the complexities are not relevant to the sequence problem, and may well distract the naive student.
I would talk first about a simpler problem. Imagine a triangle made up of elements with integer values, having one element on the first line, two on the second, three on the third and so on. Given a current position, we must move down and either left or right to the next element.
Our problem statement is to evaluate alternative paths from top to bottom through this triangle (e.g. to discover the path with the maximum total value), as described in this problem from Project Euler.
As you can appreciate, triangles can become so large that it is impossible to enumerate and evaluate all paths. The number of paths is 2**(n1) for a triangle with n levels, and this value gets large rapidly with increasing n. To solve the problem, we work from the bottom to the top.
However, what I really want to talk about in this post is the change problem, which is a version of something more generally known as the knapsack problem.
We have a container (the knapsack) with a known capacity, and objects of various size which we wish to store in it. We may desire to know different things, for example, how many different combinations of objects can we find that completely fill the knapsack? This is our problem statement for the change problem: given American coins (1, 5, 10, and 25 cents), how many ways are there to make change for a dollar?
Rather than go back and look up the answer in SICP (it's been a long time, and I was never much good with Scheme), I decided it would be more fun to try to solve it without looking at any other resources. The key for me was to visualize the problem as a tree, in which the toplevel node contains a set of 100 pennies. What can we do? Well, we could change 5 pennies to a nickel, or 10 pennies to a dime, and so on. Each one of these new forms of change for a dollar becomes a subnode below the first one. From each subnode, in turn, we attempt further transformations, which will become subnodes in turn.
For example, starting with only 10 pennies (P=penny,N=nickel,D=dime):
10P 0N 0D => 5P 1N 0D => 0P 2N 0D => 0P 0N 1D 
Two things to notice: we can do some kind of depth first traversal, or breadfirst. (My graph algorithm skills are weak, but that much seems clear at least). And the depthfirst would presumably use recursion (for which this problem is famous). The other thing to notice is that we can get duplicatesthere are two paths to the 1 dime state.
My solution (Python code) follows below. Simple things: I use a dict to hold each single kind of change, and a utility function that checks whether it is possible for us to change, say, pennies to a nickel for a current dict. This function returns either a new dict containing the result or None. The loop which solves the problem starts with a list containing a single element, a dict like this:
{ 1:100,5:0,10:0,25:0 }
The code does the following:
Notably, this solution adaptable to other coin types and other total amounts. However, it seems not to be very efficient, since it slows dramatically with amounts beyond a few dollars. I'll have to work on that.
# how many ways to change a dollar? 

No comments:
Post a Comment