gitbib | All tags: cross-validation msm msm-theory perspective qm review tica variational

Statistical models of protein conformational dynamics

2016-mcgibbon-thesis

Robert McGibbon

2016-03-01 (print)

Description

Chapter 1 is a bespoke introduction to MD and MSMs.

Chapter 2 is adapted from 2013-mcgibbon-kdml (ref. 37).

Chapter 3 is adapted from 2014-mcgibbon-hmm (ref. 92).

Chapter 4 is adapted from 2015-ratematrix (ref. 120).

Chapter 5 is adapted from 2014-mcgibbon-bic (ref. 162).

Chapter 6 is adapted from 2015-mcgibbon-gmrq (ref. 214).

Chapter 7 is adapted from 2016-sparsetica.

Chapter 8 is adapted from 2015-mdtraj.

Num Entry Why
37 2013-mcgibbon-kdml
92 2014-mcgibbon-hmm
120 2015-ratematrix
162 2014-mcgibbon-bic
214 2015-mcgibbon-gmrq

Automated construction of order parameters for analyzing simulations of protein folding and water dynamics

2015-schwantes-thesis

Christian Schwantes

2015-05-01 (print)

Description

Section 1.2 is adapted from 2015-schwantes-ktica (ref. 27) and 2014-mcgibbon-bic (ref. 28).

Chapter 2 is adapted from 2014-mcgibbon-bic (ref. 28).

Chapter 3 is adapted from 2013-schwantes-tica (ref. 73).

Chapter 4 is adapted from 2016-schwantes-nug2 (ref. 122).

Chapter 5 is adapted from 2015-schwantes-ktica (ref. 27).

Chapter 6 is supposed to have been submitted for publication.

Num Entry Why
27 2015-schwantes-ktica
28 2014-mcgibbon-bic
73 2013-schwantes-tica
122 2016-schwantes-nug2

Inferring protein structure and dynamics from simulation and experiment

2013-beauchamp-thesis

Kyle Beauchamp

2013-09-01 (print)


Markov State Models and tICA Reveal a Nonnative Folding Nucleus in Simulations of NuG2

2016-schwantes-nug2

Christian R. Schwantes; Diwakar Shukla; Vijay S. Pande

2016-04-01 (print)

Biophysical Journal (Biophys. J.). 110, 8, 1716-1719. doi:10.1016/j.bpj.2016.03.026

Description

Using tICA+MSM, they find a register-shifted intermediate in the 2011-larsen-folding NuG2 trajectories that earlier analyses had missed.

Efficient maximum likelihood parameterization of continuous-time Markov processes

2015-ratematrix

Robert T. McGibbon; Vijay S. Pande

2015-07-21 (print)

The Journal of Chemical Physics (J. Chem. Phys.). 143, 3, 034109. doi:10.1063/1.4926516

msm-theory

Variational cross-validation of slow dynamical modes in molecular kinetics

2015-mcgibbon-gmrq

Robert T. McGibbon; Vijay S. Pande

2015-03-28 (print)

The Journal of Chemical Physics (J. Chem. Phys.). 142, 12, 124105. doi:10.1063/1.4916292

msm-theory variational

Modeling Molecular Kinetics with tICA and the Kernel Trick

2015-schwantes-ktica

Christian R. Schwantes; Vijay S. Pande

2015-02-10 (print)

Journal of Chemical Theory and Computation (J. Chem. Theory Comput.). 11, 2, 600-608. doi:10.1021/ct5007357

Description

They introduce kernel tICA as an extension to tICA. This is useful to get non-linear solutions to the tICA equation. They claim you can estimate eigenprocesses without building an MSM.

They briefly introduce the transfer operator. They introduce the variational principle of conformation dynamics per 2011-prinz (ref. 25). They introduce tICA as maximizing the autocorrelation. They say that solutions to tICA are the same as solutions to the variational problem per 2013-noe-tica (ref. 28). Linearity makes them crude solutions.
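For reference, "tICA as maximizing the autocorrelation" is usually written as the generalized eigenvalue problem below. The symbols are my own notation and not necessarily the paper's: x_t is the mean-free feature vector, Sigma its covariance matrix, C(tau) the (symmetrized) time-lagged covariance matrix, and v a linear projection.

```latex
\Sigma = \mathbb{E}\left[ x_t x_t^{\top} \right], \qquad
C(\tau) = \mathbb{E}\left[ x_t x_{t+\tau}^{\top} \right]

\max_{v} \; \frac{v^{\top} C(\tau)\, v}{v^{\top} \Sigma\, v}
\quad \Longrightarrow \quad
C(\tau)\, v_i = \lambda_i \, \Sigma\, v_i
```

The projections v_i are linear in the input features, which is the restriction kernel tICA is meant to lift.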

They explain that a natural approach to introduce non-linearity is to expand the original representation into a higher-dimensional space and do tICA there. They say this is impractical because the expanded space probably has to be huge. Using the "kernel trick", you can perform the analysis in the big representation without explicitly representing it. They reproduce an example of the kernel trick from 1998-scholkopf-kernel-pca (ref. 39).
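A minimal sketch of the kernel trick, using the textbook degree-2 polynomial kernel on 2-D points. This is a standard illustration and may not be the exact example reproduced in the paper; the helper names are hypothetical.

```python
import math


def dot(a, b):
    """Plain Euclidean inner product."""
    return sum(ai * bi for ai, bi in zip(a, b))


def explicit_features(x):
    """Explicit degree-2 feature map for a 2-D point (hypothetical helper)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def poly2_kernel(x, y):
    """k(x, y) = (x . y)^2, computed without forming the feature map."""
    return dot(x, y) ** 2


x = (1.0, 2.0)
y = (3.0, -1.0)

# Inner product taken in the expanded 3-D feature space ...
expanded = dot(explicit_features(x), explicit_features(y))
# ... equals the kernel evaluated directly on the original 2-D points.
kernelized = poly2_kernel(x, y)
```

Both numbers agree, so any algorithm written purely in terms of inner products can swap in the kernel and never build the expanded representation.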

They rewrite the tICA problem purely in terms of inner products so you can apply the kernel trick. They introduce normalization. They choose a Gaussian kernel. They simulate a four-well potential, the Müller potential, alanine dipeptide, and the Fip35 WW domain. They need to do maximum-likelihood cross-validation over the parameters (kernel width and regularization strength).

This uses so much RAM! Huge matrices to solve (that scale with the amount of data!!)
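A rough sketch of why the memory blows up: a kernel method needs a Gram matrix with one row and one column per frame. The Gaussian kernel parameterization (sigma as a width) and the dense float64 storage are my assumptions, and the paper's actual matrices may be larger than this lower bound.

```python
import math


def gaussian_kernel(x, y, sigma):
    """Gaussian kernel exp(-|x - y|^2 / (2 sigma^2)); sigma is the kernel width."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def gram_matrix(points, sigma):
    """Dense N x N matrix of kernel values between all pairs of frames."""
    return [[gaussian_kernel(p, q, sigma) for q in points] for p in points]


def dense_gram_bytes(n_frames, bytes_per_entry=8):
    """Memory for one dense N x N matrix of float64 entries."""
    return n_frames * n_frames * bytes_per_entry


# Tiny toy example: three 2-D "frames".
K = gram_matrix([(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)], sigma=1.0)

# Memory for a hypothetical 100,000-frame dataset, in GiB.
gib_for_100k = dense_gram_bytes(100_000) / 2 ** 30
```

The quadratic growth in the number of frames is the point: doubling the data quadruples the matrix.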

Num Entry Why
21 2014-msm-perspective Data needs analysis
25 2011-prinz Details of transfer operator approach.
33 2001-schutte-variational Details of transfer operator approach.
34 2013-noe-variational "It was shown that a variational principle can be derived for the eigenvalues of the transfer operator." The autocorrelation of any function is no greater than the autocorrelation of the first dynamical eigenfunction of the transfer operator. This is used to argue that you don't have to estimate the operator itself, just its eigenfunctions.
35 2014-nuske-variational "Successfully constructed estimates of the top eigenfunctions in the span of a prespecified library of basis functions." Contrast with this work, which "does not require a predefined basis set"
22 2013-schwantes-tica Citing tICA
28 2013-noe-tica Solutions to tICA provide estimates of the slowest eigenfunctions of the transfer operator.
36 doi:10.1103/PhysRevLett.72.3634 Citing tICA
37 doi:10.1162/neco.2006.18.10.2495 Citing tICA
39 1998-scholkopf-kernel-pca Used to introduce the kernel trick.

msm-theory

Perspective: Markov models for long-timescale biomolecular dynamics

2014-msm-perspective

C. R. Schwantes; R. T. McGibbon; V. S. Pande

2014-09-07 (print)

The Journal of Chemical Physics (J. Chem. Phys.). 141, 9, 090901. doi:10.1063/1.4895044

Description

Very good perspective on the importance of analysis (particularly MSM analysis) for understanding large, modern MD datasets. Money quote: "we believe that quantitative analysis has increasingly become a limiting factor in the application of MD"

msm-theory perspective

Statistical Model Selection for Markov Models of Biomolecular Dynamics

2014-mcgibbon-bic

Robert T. McGibbon; Christian R. Schwantes; Vijay S. Pande

2014-06-19 (print)

The Journal of Physical Chemistry B (J. Phys. Chem. B). 118, 24, 6475-6481. doi:10.1021/jp411822r

Description

This predates the GMRQ cross-validation of 2015-mcgibbon-gmrq. They explicitly compute the volume of Voronoi cells (in the space of a small number of tICs) to get a likelihood, then use AIC/BIC to choose the number of states. Finding volumes is tough and you still can't compare across protocols (so you can basically only scan the number of states or the clustering method), but this was the first paper to seriously suggest using a smaller number of states to avoid overfitting.
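A minimal sketch of AIC/BIC model selection over the number of states. The formulas are the standard definitions; the log-likelihoods, the sample count, and the parameter count n(n-1) for an unconstrained row-stochastic matrix are made-up or assumed here, not taken from the paper.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(log_likelihood, n_params, n_samples):
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def msm_free_params(n_states):
    """Assumed parameter count: each row of the transition matrix sums to 1."""
    return n_states * (n_states - 1)


# Hypothetical log-likelihoods for two candidate state counts.
log_likelihoods = {10: -1200.0, 50: -1100.0}
n_samples = 10_000

bic_scores = {
    n: bic(ll, msm_free_params(n), n_samples) for n, ll in log_likelihoods.items()
}
best_by_bic = min(bic_scores, key=bic_scores.get)
```

With these toy numbers the 50-state model fits slightly better but pays for 2450 parameters, so the criterion prefers the 10-state model.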

msm cross-validation

Variational Approach to Molecular Kinetics

2014-nuske-variational

Feliks Nüske; Bettina G. Keller; Guillermo Pérez-Hernández; Antonia S. J. S. Mey; Frank Noé

2014-04-08 (print)

Journal of Chemical Theory and Computation (J. Chem. Theory Comput.). 10, 4, 1739-1752. doi:10.1021/ct4009156

Description

This paper is largely redundant with 2013-noe-variational (ref. 65). They cite it as such: "Following the recently introduced variational principle for metastable stochastic processes,(65) we propose a variational approach to molecular kinetics."

They perform their variational approach on 2- and 10-alanine in addition to 1D potentials.

This comes after tICA and cites 2013-schwantes-tica (ref. 57) and 2013-noe-tica (ref. 58) in the intro, but does nothing further with them. In particular, they don't note that tICA is just another choice of basis set.

They cite their error paper 2010-msm-error (ref. 55).

Num Entry Why
65 2013-noe-variational
57 2013-schwantes-tica
58 2013-noe-tica
55 2010-msm-error

msm-theory variational

Learning Kinetic Distance Metrics for Markov State Models of Protein Conformational Dynamics

2013-mcgibbon-kdml

Robert T. McGibbon; Vijay S. Pande

2013-07-09 (print)

Journal of Chemical Theory and Computation (J. Chem. Theory Comput.). 9, 7, 2900-2906. doi:10.1021/ct400132h

Description

Learn scaling of coordinates to better approximate kinetics? Redundant with tICA.

Identification of slow molecular order parameters for Markov model construction

2013-noe-tica

Guillermo Pérez-Hernández; Fabian Paul; Toni Giorgino; Gianni De Fabritiis; Frank Noé

2013-07-07 (print)

The Journal of Chemical Physics (J. Chem. Phys.). 139, 1, 015102. doi:10.1063/1.4811489

Description

The Noé group introduces tICA concomitantly with 2013-schwantes-tica. They use the variational approach from 2013-noe-variational to derive the tICA equation. They cite a 2001 book about independent component analysis.

msm-theory tica

Improvements in Markov State Model Construction Reveal Many Non-Native Interactions in the Folding of NTL9

2013-schwantes-tica

Christian R. Schwantes; Vijay S. Pande

2013-04-09 (print)

Journal of Chemical Theory and Computation (J. Chem. Theory Comput.). 9, 4, 2000-2009. doi:10.1021/ct300878a

Description

The Pande group introduces tICA concomitantly with 2013-noe-tica. This paper uses PCA as inspiration and cites signal processing literature.

msm-theory tica

A Variational Approach to Modeling Slow Processes in Stochastic Dynamical Systems

2013-noe-variational

Frank Noé; Feliks Nüske

2013-01-01 (print)

Multiscale Modeling & Simulation (Multiscale Model. Simul.). 11, 2, 635-655. doi:10.1137/110858616

Description

I think the point of this versus 2014-nuske-variational is to be "protein agnostic". They allude to proteins, but say this is more general. Their example is a double-well potential.

They introduce the propagator formalism and stipulate that dynamics can be separated into "fast" and "slow" components. In contrast to a quantum mechanics Hamiltonian, we don't know the propagator here. You have to infer it from data.

They claim the error bound derived in 2010-msm-error (ref. 34) is not constructive, whereas this method *is* constructive.

Math section heavily cites 2010-msm-error (ref. 34).

They adapt the Rayleigh variational principle from quantum mechanics, and cite 1989-szabo-ostlund-qm (ref. 43). They show that the autocorrelation of the true first dynamical eigenfunction is its eigenvalue, and that any estimate of that eigenfunction necessarily has an autocorrelation no larger than this eigenvalue. This sets the variational bound. In terms of names that don't seem to be used now that we're in the future: the Ritz method is for when you have no overlap integrals (e.g. MSMs) and the Roothaan-Hall method is for when you do (tICA).
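A toy check of the variational bound on a symmetric three-state chain, where the eigenvectors are known in closed form. The chain and the trial function are my own illustration, not an example from the paper; because the matrix is symmetric, the stationary distribution is uniform and drops out of the inner products.

```python
def dot(a, b):
    """Plain inner product (uniform stationary weights cancel in the ratio)."""
    return sum(ai * bi for ai, bi in zip(a, b))


def matvec(T, f):
    """Apply the transition matrix to a function on the states."""
    return [dot(row, f) for row in T]


def autocorrelation(T, f):
    """Rayleigh quotient <f, T f> / <f, f> for a mean-free function f."""
    return dot(f, matvec(T, f)) / dot(f, f)


a = 0.1  # hopping probability between neighbouring states
T = [
    [1 - a, a, 0.0],
    [a, 1 - 2 * a, a],
    [0.0, a, 1 - a],
]

# Exact slowest non-stationary eigenvector, with eigenvalue 1 - a.
slowest = [1.0, 0.0, -1.0]
# Trial function: orthogonal to the constant, but not an eigenvector.
trial = [2.0, -1.0, -1.0]

exact = autocorrelation(T, slowest)
estimate = autocorrelation(T, trial)
```

The trial function mixes the slow eigenvector with a faster one, so its autocorrelation falls below the true eigenvalue, which is what lets you rank trial functions by this score.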

They put it to the test on a double-well potential. They use indicator basis functions to make an MSM; Hermite basis functions so they still have no overlap integrals, but smooth functions; and Gaussian basis functions (with overlap integrals). This must have come before tICA because there is no mention made of it, even though it would fit in nicely.
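A minimal sketch of the indicator-basis case: with one indicator function per state, the correlation matrix is just a transition count matrix, and normalizing its rows gives an MSM. The trajectory and function names are hypothetical, and this simple estimator does not enforce reversibility.

```python
def count_matrix(dtraj, n_states, lag=1):
    """Count transitions i -> j separated by `lag` steps in a discrete trajectory."""
    counts = [[0] * n_states for _ in range(n_states)]
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i][j] += 1
    return counts


def row_normalize(counts):
    """Turn counts into transition probabilities (rows sum to 1)."""
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical two-state discrete trajectory (state index per frame).
dtraj = [0, 0, 1, 1, 0, 1, 1, 1, 0, 0]

C = count_matrix(dtraj, n_states=2, lag=1)
T = row_normalize(C)
```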

Num Entry Why
34 2010-msm-error
43 1989-szabo-ostlund-qm

msm-theory variational

Markov models of molecular kinetics: Generation and validation

2011-prinz

Jan-Hendrik Prinz; Hao Wu; Marco Sarich; Bettina Keller; Martin Senne; Martin Held; John D. Chodera; Christof Schütte; Frank Noé

2011-05-07 (print)

The Journal of Chemical Physics (J. Chem. Phys.). 134, 17, 174105. doi:10.1063/1.3565032

Description

Fantastic in-depth intro to MSMs. Figure 1 in this paper is necessary for understanding eigenvectors. This defines and relates the propagator and transfer operator. This shows how we compute timescales from eigenvalues. This discusses state decomposition error and shows that many states are needed in transition regions.
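A small sketch of the timescale calculation, using the standard relation t_i = -tau / ln(lambda_i) between a transition matrix eigenvalue lambda_i at lag time tau and its implied timescale. The lag time and eigenvalues below are made up for illustration.

```python
import math


def implied_timescale(eigenvalue, lag_time):
    """Implied timescale t = -tau / ln(lambda), in the same units as lag_time."""
    if not 0.0 < eigenvalue < 1.0:
        raise ValueError("expected a non-stationary eigenvalue in (0, 1)")
    return -lag_time / math.log(eigenvalue)


lag_time = 10.0  # hypothetical lag time, e.g. in ns
eigenvalues = [0.9, 0.5]  # hypothetical non-stationary eigenvalues

timescales = [implied_timescale(ev, lag_time) for ev in eigenvalues]
```

Eigenvalues closer to 1 map to longer timescales, which is why the top few eigenvalues carry the slow dynamics.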

quote: it is clear that a “sufficiently fine” partitioning will be able to resolve “sufficient” detail 2010-msm-error.

Cites 2004-nina-msm for use of the term "MSM".

msm-theory review

On the Approximation Quality of Markov State Models

2010-msm-error

Marco Sarich; Frank Noé; Christof Schütte

2010-01-01 (print)

Multiscale Modeling & Simulation (Multiscale Model. Simul.). 8, 4, 1154-1177. doi:10.1137/090764049

msm-theory

What Is the Relation Between Slow Feature Analysis and Independent Component Analysis?

doi:10.1162/neco.2006.18.10.2495

Tobias Blaschke; Pietro Berkes; Laurenz Wiskott

2006-10-01 (print)

Neural Computation (Neural Comput.). 18, 10, 2495-2508. doi:10.1162/neco.2006.18.10.2495

Transfer Operator Approach to Conformational Dynamics in Biomolecular Systems

2001-schutte-variational

Ch. Schütte; W. Huisinga; P. Deuflhard

2001-01-01 (print)

Ergodic Theory, Analysis, and Efficient Simulation of Dynamical Systems. 191-223. doi:10.1007/978-3-642-56589-2_9

Description

Full treatment of the transfer operator / propagator. They build an MSM for a small RNA chain.

msm-theory

Nonlinear Component Analysis as a Kernel Eigenvalue Problem

1998-scholkopf-kernel-pca

Bernhard Schölkopf; Alexander Smola; Klaus-Robert Müller

1998-07-01 (print)

Neural Computation (Neural Comput.). 10, 5, 1299-1319. doi:10.1162/089976698300017467

Separation of a mixture of independent signals using time delayed correlations

doi:10.1103/PhysRevLett.72.3634

L. Molgedey; H. G. Schuster

1994-06-06 (online)

Physical Review Letters (Phys. Rev. Lett.). 72, 23, 3634-3637. doi:10.1103/PhysRevLett.72.3634

Modern Quantum Chemistry: Introduction to Advanced Electronic Structure Theory

1989-szabo-ostlund-qm

Attila Szabo; Neil S. Ostlund

1989-01-01 (print)

Description

Cited by 2013-noe-variational for Rayleigh variational method.

qm

Understanding Protein Dynamics with L1-Regularized Reversible Hidden Markov Models

2014-mcgibbon-hmm

Robert McGibbon; Bharath Ramsundar; Mohammad Sultan; Gert Kiss; Vijay Pande

32, 2, 1197-1205.

Description

Uses hidden Markov models instead of discrete-state MSMs.