*et al.*, Biological Sequence Analysis. To save effort, I'll just use my slide from class:

We're going to model this system in Python, starting with two sets of dice with the requisite probabilities, and extending on into HMM analysis. Here is typical output for 100 rolls of the dice. We show the state above, and the output below:

Following, in segments, is the code for the Model class.
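As a rough guide, here is a minimal sketch of a class matching the description that follows. It is not the original listing. The names `Model`, `roll`, `transition` and `sequence` are hypothetical, and the emission and switching probabilities are assumed to be the textbook's casino values (a fair die, and a loaded die that shows a 6 half the time).

```python
import random

class Model:
    # Emission probabilities for faces 1..6, hard-coded.
    # Assumed values: the textbook's fair and loaded dice.
    EMIT = {
        'F': [1 / 6] * 6,
        'L': [0.1] * 5 + [0.5],
    }

    def __init__(self, p_switch):
        # p_switch maps each state to the probability of leaving it,
        # e.g. {'F': 0.05, 'L': 0.1} (assumed values).
        self.p_switch = p_switch
        self.state = None  # set at the beginning of a run

    def roll(self):
        # Compare one uniform draw against the cumulative probability.
        r = random.random()
        cumulative = 0.0
        for face, p in enumerate(self.EMIT[self.state], start=1):
            cumulative += p
            if r < cumulative:
                return face
        return 6  # guard against floating-point round-off

    def transition(self):
        # A second uniform draw decides whether to switch dice.
        if random.random() < self.p_switch[self.state]:
            self.state = 'L' if self.state == 'F' else 'F'

    def sequence(self, n=50, start='F'):
        # Generate n (state, roll) pairs, returned as two parallel lists.
        self.state = start
        states, rolls = [], []
        for _ in range(n):
            states.append(self.state)
            rolls.append(self.roll())
            self.transition()
        return states, rolls

random.seed(0)
m = Model({'F': 0.05, 'L': 0.1})
states, rolls = m.sequence(100)
print(''.join(states))                  # states on the top line
print(''.join(str(r) for r in rolls))   # rolls underneath
```

Keeping the switching probabilities in a dictionary keyed by state is only one possible design; two separate constructor arguments would do equally well.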

The initialization function is kept simple. It requires a value for the transition probability to switch states in either direction. The emission probabilities are hard-coded. The state will be set at the beginning of a run.

Rolling the dice is straightforward. We test the result of `random.random()` against the cumulative probability and emit the corresponding outcome. `random.random()` is used a second time to decide whether to switch on a particular transition. The object can generate sequences of as many states as needed, with a default of 50.

It is useful to check that the model performs as expected. The code for this is in another module, which we import here. But first, to motivate us, here is typical output for this model:

The different outcomes 1 through 6 are emitted with the correct probabilities. The last two lines list the number of times out of 10000 that the model was in the fair or loaded state. That ratio is set by the ratio of the transition probabilities, which are also determined.
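To see why the ratio depends only on the transition probabilities: in the long run the flow from fair to loaded must balance the flow back, so (fraction fair) × P(F→L) = (fraction loaded) × P(L→F). A small sketch of that calculation follows. The function name is hypothetical, and the values 0.05 and 0.1 are assumed here (they are the textbook's, not necessarily the ones used in the run above).

```python
def stationary_fractions(p_fair_to_loaded, p_loaded_to_fair):
    # Long-run fraction of time in each state of a two-state chain.
    total = p_fair_to_loaded + p_loaded_to_fair
    fair = p_loaded_to_fair / total
    loaded = p_fair_to_loaded / total
    return fair, loaded

# Assumed switching probabilities: F->L = 0.05, L->F = 0.1
fair, loaded = stationary_fractions(0.05, 0.1)
print(round(fair * 10000), round(loaded * 10000))  # prints: 6667 3333
```

With these values the model should spend about twice as long with the fair die as with the loaded one, whatever the emission probabilities are.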

Here is the diceStats module:
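What follows is a minimal sketch of the kind of check such a module might perform, not the original listing. The function names `tally` and `report` and their signatures are hypothetical. They assume the model hands back two parallel lists, one of states (`'F'` or `'L'`) and one of rolls.

```python
from collections import Counter

def tally(states, rolls):
    """Count visits to each state and the faces emitted in each state."""
    state_counts = Counter(states)
    emissions = {s: Counter() for s in state_counts}
    for s, r in zip(states, rolls):
        emissions[s][r] += 1
    return state_counts, emissions

def report(states, rolls):
    """Print observed emission frequencies per state, then state counts."""
    state_counts, emissions = tally(states, rolls)
    for s in sorted(emissions):
        n = state_counts[s]
        for face in range(1, 7):
            print(s, face, '%.3f' % (emissions[s][face] / n))
    for s in sorted(state_counts):
        print(s, state_counts[s])

# Tiny hand-made input, only to show the shape of the result.
states = list('FFFLLF')
rolls = [1, 2, 6, 6, 6, 3]
state_counts, emissions = tally(states, rolls)
```

For a real check, the two lists would come from a long run of the model (say 10000 rolls), and the printed frequencies would be compared with the hard-coded emission probabilities.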
