GMRQ vs. Lag-time

Question

Github user jeiros asked why you can’t use GMRQ to find the best lag-time when building an MSM. He wanted to know why the score appears to decrease with increasing lag-time:

gmrq decreases

The full question is here.

Answer

Eigenvalues behave predictably with lag-time. Assume you have a transition matrix $\mathbf{T}$ parameterized at lag time $\tau$. Operating on a vector $\mathbf{x}$ twice should give you the same result as using a transition matrix parameterized at lag-time $2\tau$. This is a consequence of Markovianity.

So

And the eigenvalues

This is why we define the timescales $t_i$ such that

If $\mathbf{T}^2=\mathbf{T}(2\tau)$, then the timescales should be invariant to lag-time.

The GMRQ score is the sum of the first $m$ eigenvalues

As $\tau$ gets larger, the exponent gets more negative, and the GMRQ score decreases.

In practice, the timescales do vary with the lag-time parameter, but not in any way we can account for. You can’t optimize the lag-time parameter because it is not a parameter. The object that we are scoring is $\mathbf{T}(\tau)$. If you change $\tau$, you’re scoring a different entity.

For full details, you can read the GMRQ paper doi:10.1063/1.4916292. Also check out the continuous time version of MSMs which uses lag-time as a tunable parameter and its associated paper doi:10.1063/1.4926516.