gitbib | All tags: cheminformatics deep-learning machine-learning misc

Computational design of trimeric influenza-neutralizing proteins targeting the hemagglutinin receptor binding site

2017-baker-antibody-design

Eva-Maria Strauch; Steffen M Bernard; David La; Alan J Bohn; Peter S Lee; Caitlin E Anderson; Travis Nieusma; Carly A Holstein; Natalie K Garcia; Kathryn A Hooper; Rashmi Ravichandran; Jorgen W Nelson; William Sheffler; Jesse D Bloom; Kelly K Lee; Andrew B Ward; Paul Yager; Deborah H Fuller; Ian A Wilson; David Baker

2017-06-12 (online)

Nature Biotechnology (Nat. Biotechnol.). 35, 7, 667-671. doi:10.1038/nbt.3907

Description

Uses computational protein design to create trimeric influenza-neutralizing proteins that target the hemagglutinin receptor binding site of influenza A.

Learning Important Features Through Propagating Activation Differences

2017-deep-lift

Avanti Shrikumar; Peyton Greenside; Anshul Kundaje

2017-04-10 (online)

arxiv:1704.02685

Description

Decomposes a network's output predictions into contributions from individual input features by comparing each neuron's activation to its activation on a reference input (DeepLIFT).
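To make the "decompose" idea concrete, here is a minimal numpy sketch of the Rescale rule for a toy one-hidden-layer ReLU net, checking the summation-to-delta property (contributions add up to the change in output relative to the reference input). The function and toy network are illustrative, not the authors' code.

    import numpy as np

    def deeplift_rescale_contributions(x, x_ref, W1, b1, W2):
        """DeepLIFT-style Rescale rule for y = W2 @ relu(W1 @ x + b1) (scalar output).

        Returns per-input contributions C such that C.sum() == y(x) - y(x_ref),
        i.e. the "summation-to-delta" property.
        """
        relu = lambda z: np.maximum(z, 0.0)
        z, z_ref = W1 @ x + b1, W1 @ x_ref + b1
        dz = z - z_ref
        # Multiplier of each hidden unit: delta(activation) / delta(pre-activation)
        with np.errstate(divide="ignore", invalid="ignore"):
            m = np.where(np.abs(dz) > 1e-12, (relu(z) - relu(z_ref)) / dz, 0.0)
        # Chain the multipliers back to the inputs
        return (x - x_ref) * (W1.T @ (m * W2))

    # Tiny check of summation-to-delta on a random toy network
    rng = np.random.default_rng(0)
    W1, b1, W2 = rng.normal(size=(4, 3)), rng.normal(size=4), rng.normal(size=4)
    x, x_ref = rng.normal(size=3), np.zeros(3)
    y = lambda v: W2 @ np.maximum(W1 @ v + b1, 0.0)
    assert np.isclose(deeplift_rescale_contributions(x, x_ref, W1, b1, W2).sum(),
                      y(x) - y(x_ref))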

machine-learning deep-learning

Hybrid computing using a neural network with dynamic external memory

2016-neural-computers

Alex Graves; Greg Wayne; Malcolm Reynolds; Tim Harley; Ivo Danihelka; Agnieszka Grabska-Barwińska; Sergio Gómez Colmenarejo; Edward Grefenstette; Tiago Ramalho; John Agapiou; Adrià Puigdomènech Badia; Karl Moritz Hermann; Yori Zwols; Georg Ostrovski; Adam Cain; Helen King; Christopher Summerfield; Phil Blunsom; Koray Kavukcuoglu; Demis Hassabis

2016-10-12 (online)

Nature (Nature). 538, 7626, 471-476. doi:10.1038/nature20101

Description

Augment deep networks with an external memory (RAM) matrix.

Bart says: "TL;DR: This work follows a line of research that teaches deep-nets to learn algorithmic tasks (addition, sorting, multiplication, key-value look-up). This paper goes a bit further and teaches their network to do shortest-path finding in graphs and demonstrates on maps of the London underground. Cool demo with nice results, but the hype-machine has blown it out of proportion (check out the FT article for a breathless take claiming thinking computers are one step closer...)"
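For a feel of the mechanism, a small numpy sketch of one piece of it, content-based addressing of the external memory matrix: compare a read key against every memory row by cosine similarity, sharpen with a key strength, normalize into read weights, and read out a weighted sum. This is only a fragment (the DNC also has write heads, usage-based allocation and temporal links), and the function name and sizes below are made up.

    import numpy as np

    def content_based_read(memory, key, beta):
        """Content-based addressing over an external memory matrix.

        memory : (N, W) memory matrix
        key    : (W,) read key emitted by the controller
        beta   : scalar key strength (larger -> sharper addressing)
        """
        eps = 1e-8
        sim = memory @ key / (np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + eps)
        w = np.exp(beta * sim)
        w /= w.sum()          # read weighting over memory rows
        return w @ memory     # read vector

    mem = np.random.default_rng(1).normal(size=(16, 8))
    r = content_based_read(mem, mem[3], beta=10.0)  # key close to row 3 -> read ~row 3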

machine-learning deep-learning

Automatic chemical design using a data-driven continuous representation of molecules

2016-aspuru-mol-feat

Rafael Gómez-Bombarelli; David Duvenaud; José Miguel Hernández-Lobato; Jorge Aguilera-Iparraguirre; Timothy D. Hirzel; Ryan P. Adams; Alán Aspuru-Guzik

2016-10-07 (online)

arxiv:1610.02415

Description

The authors train an autoencoder to provide a vector representation for small molecules. Small molecules are graphs of varying size, so they're hard to feed into neural nets (which expect fixed-length input vectors). By fusing together an encoder and decoder (and making the "middle" representation sufficiently small), they learn a vector representation.

The authors lean heavily on arxiv:1511.06349 (ref. 25) to autoencode SMILES strings.

They use a variational autoencoder (noisy) to avoid "dead zones" in latent space.

They optimize OLED properties as an example.

Num  Entry             Why
25   arxiv:1511.06349  Leaned on heavily to autoencode SMILES strings
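A compact PyTorch sketch of the general recipe described above (one-hot SMILES in, small noisy latent vector, per-character probabilities out). It is not the authors' architecture, which uses convolutional and recurrent layers; the class name, layer widths, alphabet size and length limit below are illustrative assumptions.

    import torch
    import torch.nn as nn

    class SmilesVAE(nn.Module):
        """Toy variational autoencoder over one-hot-encoded SMILES strings."""

        def __init__(self, n_chars=35, max_len=120, latent_dim=56):
            super().__init__()
            self.max_len, self.n_chars = max_len, n_chars
            self.encoder = nn.Sequential(nn.Flatten(),
                                         nn.Linear(n_chars * max_len, 256), nn.ReLU())
            self.to_mu = nn.Linear(256, latent_dim)
            self.to_logvar = nn.Linear(256, latent_dim)
            self.decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                         nn.Linear(256, n_chars * max_len))

        def forward(self, x):  # x: (batch, max_len, n_chars) one-hot
            h = self.encoder(x)
            mu, logvar = self.to_mu(h), self.to_logvar(h)
            z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # noisy latent point
            logits = self.decoder(z).view(-1, self.max_len, self.n_chars)
            return logits, mu, logvar

    def vae_loss(logits, x, mu, logvar):
        # Reconstruction term: per-character cross-entropy against the input string
        recon = nn.functional.cross_entropy(logits.transpose(1, 2), x.argmax(-1),
                                            reduction="sum")
        # KL term: keeps the latent space smooth (no "dead zones")
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
        return recon + kl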

machine-learning cheminformatics misc

Modelling proteins’ hidden conformations to predict antibiotic resistance

2016-msm-cryptic-binding

Kathryn M. Hart; Chris M. W. Ho; Supratik Dutta; Michael L. Gross; Gregory R. Bowman

2016-10-06 (online)

Nature Communications (Nat. Commun.). 7, 12965. doi:10.1038/ncomms12965

Description

Labmate summarizes:

They generated ensembles using MD, then docked to those ensembles, then re-weighted the docking scores based on the MSM. This gave a huge improvement in the power of docking to predict affinity/potency. It turned an inverse relationship (when docking using xtal structures) into a highly correlated trend.

They confirmed their hypothesis about the protein's flexibility using a mass spectrometry method.

They identified a loop movement important for the enzyme's antibiotic-inactivating activity that was different from the one previously proposed/suspected.

They proposed mutants that would stabilize their proposed loop, and tested them experimentally.

The power of using the MSM to re-weight other analyses is also very encouraging to see yet again. Also note that they did all this with what looks like a pretty low amount of aggregate sampling (a few microseconds per mutant).
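A toy sketch of the reweighting idea (not the paper's exact protocol or scoring function): dock against a representative structure of each MSM state, then weight the per-state docking scores by the MSM equilibrium populations, so that well-docking but rarely populated conformations contribute little. Names and numbers below are invented for illustration.

    import numpy as np

    def msm_weighted_docking_score(docking_scores, msm_populations):
        """Population-weighted average of per-state docking scores.

        docking_scores  : (n_states,) docking score for each MSM state
        msm_populations : (n_states,) equilibrium populations from the MSM
        """
        p = np.asarray(msm_populations, dtype=float)
        p /= p.sum()                                # normalize populations
        return float(np.dot(p, docking_scores))     # ensemble-weighted score

    # Hypothetical three-state example: the best-docking state is rarely populated
    score = msm_weighted_docking_score([-7.2, -9.5, -6.8], [0.70, 0.05, 0.25])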

misc

Structural analysis of high-dimensional basins of attraction

2016-mbar-volumes

Stefano Martiniani; K. Julian Schrenk; Jacob D. Stevenson; David J. Wales; Daan Frenkel

2016-09-15 (online)

Physical Review E (Phys. Rev. E). 94, 3, 031301. doi:10.1103/PhysRevE.94.031301

Description

Uses the multistate Bennett acceptance ratio (MBAR) estimator to compute the volumes of high-dimensional basins of attraction.
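For reference, a minimal numpy/scipy sketch of the standard MBAR self-consistent equations (Shirts & Chodera 2008) that the estimator builds on; this is the generic estimator, not the authors' basin-volume machinery.

    import numpy as np
    from scipy.special import logsumexp

    def mbar_free_energies(u_kn, N_k, n_iter=10000, tol=1e-10):
        """Self-consistent MBAR iteration.

        u_kn : (K, N) reduced potential of every pooled sample n evaluated in state k
        N_k  : (K,) number of samples drawn from each state
        Returns dimensionless free energies f_k, with f_0 fixed to zero.
        """
        K, N = u_kn.shape
        f = np.zeros(K)
        log_N_k = np.log(np.asarray(N_k, dtype=float))
        for _ in range(n_iter):
            # log of the mixture denominator for every sample n
            log_denom = logsumexp(f[:, None] + log_N_k[:, None] - u_kn, axis=0)
            f_new = -logsumexp(-u_kn - log_denom[None, :], axis=1)
            f_new -= f_new[0]            # fix the arbitrary additive constant
            if np.max(np.abs(f_new - f)) < tol:
                return f_new
            f = f_new
        return f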

misc

Neural Coarse-Graining: Extracting slowly-varying latent degrees of freedom with neural networks

2016-guttenberg-deep-slow

Nicholas Guttenberg; Martin Biehl; Ryota Kanai

2016-09-01 (online)

arxiv:1609.00116

Description

Uses deep networks to extract slowly varying latent degrees of freedom (slow modes) from dynamical signals.

machine-learning deep-learning

Generating Sentences from a Continuous Space

arxiv:1511.06349

Samuel R. Bowman; Luke Vilnis; Oriol Vinyals; Andrew M. Dai; Rafal Jozefowicz; Samy Bengio

2015-11-19 (online)

arxiv:1511.06349

Description

Advances in autoencoding text, used by 2016-aspuru-mol-feat.

misc

Identifying and attacking the saddle point problem in high-dimensional non-convex optimization

2014-ganguli-saddle-points

Yann Dauphin; Razvan Pascanu; Caglar Gulcehre; Kyunghyun Cho; Surya Ganguli; Yoshua Bengio

2014-06-10 (online)

arxiv:1406.2572

Description

Labmate summarizes:

This one is a really cool paper. One of those "we've all been doing it wrong" papers that could have a big impact. Their main conclusions are:

1. When optimizing functions in high dimensional spaces, saddle points are a much bigger problem than local minima. There are far more of them, and the few local minima that do exist mostly have values only slightly worse than the global minimum.

2. Standard optimization methods deal really badly with saddle points (and hence work really badly in high dimensional spaces). First order methods like gradient descent start taking tiny steps, so they take a really long time to escape. Quasi-Newton methods are even worse. They just converge to the saddle point and never escape.

3. They describe a new approach that doesn't have these problems and goes right through saddle points without slowing down.

They do all this in the context of neural networks, but it likely applies just as well to other high dimensional optimization problems. Proteins, for example. When you use an algorithm like L-BFGS for energy minimization, it's probably converging to a saddle point, not a local minimum. It could be really interesting to try their method. Could we fold a protein to the native state just by a straightforward energy minimization?

Force field optimization is another case where this approach could be really useful.

They also show that at a saddle point, there's a strong monotonic relationship between the error and the fraction of negative eigenvalues of the Hessian. Potentially that could be used as a way to measure how far you are from the global minimum. For example, when optimizing force field parameters, it would tell you whether your parameters are close to optimal, or whether there's still a lot of room to improve them further.
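A small numpy sketch of the two quantities above: the "saddle-free" step that rescales the gradient by the absolute eigenvalues of the Hessian (the paper approximates this in a Krylov subspace; the exact eigendecomposition here is only a toy), and the fraction of negative Hessian eigenvalues mentioned in the last paragraph.

    import numpy as np

    def saddle_free_newton_step(grad, hessian, damping=1e-3):
        # Rescale the gradient by |eigenvalues| of the Hessian so that
        # negative-curvature directions are descended instead of approached.
        eigvals, eigvecs = np.linalg.eigh(hessian)
        abs_h = eigvecs @ np.diag(np.abs(eigvals) + damping) @ eigvecs.T
        return -np.linalg.solve(abs_h, grad)

    def negative_eigenvalue_fraction(hessian):
        # The "error vs. fraction of negative eigenvalues" diagnostic from the paper.
        return float(np.mean(np.linalg.eigvalsh(hessian) < 0))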

misc machine-learning deep-learning