Q(Mj|Mi) = Q(Mi|Mj)
But we may want to implement a method Q that leads to more efficient exploration of the tree space, say by restricting changes in the phylogenetic tree to nearest neighbor interchanges (NNI) most of the time. And suppose that in implementing Q, we know that:
Q(Mj|Mi) ≠ Q(Mi|Mj)
If we can calculate the ratio of forward to backward proposal probabilities, Q(Mj|Mi) / Q(Mi|Mj), then we can divide the acceptance ratio by this "bias" (my term) and still get the correct distribution in our samples.
Now the Wikipedia article on Metropolis-Hastings makes complete sense. Having the right form of Q is critical for the success of the method, and the acceptance rate is monitored to achieve that. Another issue is sampling without getting trapped on peaks of density; that is where simulated annealing comes in.
It would be nice to see an example of how you would calculate this ratio for Q.
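Here is a toy sketch of what that calculation might look like. It is not a tree sampler: the "models" are just five states arranged in a ring, and the proposal steps up with probability 0.7 and down with probability 0.3, so Q is asymmetric by construction. All of the names (q, propose, bias, sample) and the numbers are made up for illustration.

```python
import random

def q(to_state, from_state, n_states, p_up=0.7):
    """Proposal probability Q(to | from) on a ring of n_states states (toy setup)."""
    if to_state == (from_state + 1) % n_states:
        return p_up          # step "up" the ring
    if to_state == (from_state - 1) % n_states:
        return 1.0 - p_up    # step "down" the ring
    return 0.0               # any other move is never proposed

def propose(state, n_states, rng, p_up=0.7):
    """Draw a new state from Q(. | state)."""
    step = 1 if rng.random() < p_up else -1
    return (state + step) % n_states

def bias(new, old, n_states):
    """Forward over backward proposal probability: Q(new|old) / Q(old|new)."""
    return q(new, old, n_states) / q(old, new, n_states)

def sample(target, n_steps, correct=True, seed=0):
    """Run the chain and return the fraction of steps spent in each state.

    target holds unnormalized weights for each state.
    If correct is False, the bias is ignored (plain Metropolis rule).
    """
    rng = random.Random(seed)
    n = len(target)
    state = 0
    counts = [0] * n
    for _ in range(n_steps):
        new = propose(state, n, rng)
        ratio = target[new] / target[state]
        if correct:
            ratio /= bias(new, state, n)   # divide out the proposal bias
        if rng.random() < min(1.0, ratio):
            state = new
        counts[state] += 1
    return [c / n_steps for c in counts]

if __name__ == "__main__":
    weights = [1, 2, 3, 2, 1]
    print("corrected:  ", sample(weights, 200_000, correct=True))
    print("uncorrected:", sample(weights, 200_000, correct=False))
    print("target:     ", [w / sum(weights) for w in weights])
```

With the correction switched on, the visit frequencies should settle near the normalized target weights; switching it off shows what the asymmetric proposal does when the bias is ignored. For a real tree proposal the hard part is working out Q(Mj|Mi) and Q(Mi|Mj) for each move, which this toy sidesteps by making them constants.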