As long as I have RPy working, I did a first version of a clustering script. First we do our imports and grab the definitions of R objects:
Next we generate a numpy array with random values in the set {0.5 .. 1.0}. Then we pick a few rows and mark a few of the same columns in each by changing each column's value to a random one between 0.0 .. 0.2. These show up as the light rectangles in the plot. Then we just:
The only hard part is going back and forth between a Python array and R matrix. There's probably a better way to do it, but what I used was
flatten
from here before feeding to R, and this going the other way:We plot a heat map using code from here. As you can see in the graphic at the top of the post,
hclust
does a reasonable job of finding the structure we created. Rather than list the code, I put a zipped folder with the files on Dropbox here.