Megan Engel (Senior Schmidt Science Fellow, University of Calgary)

Michael Brenner (Harvard)

Ryan Krueger (Harvard)

OpenDNA – Using Machine Learning Techniques to Parameterize DNA Dynamics Models from Experimental Data

PIs Megan Engel (Senior Schmidt Science Fellow, University of Calgary), Michael Brenner (Harvard), and Ryan Krueger (Harvard)

Computational models play a key role in efforts to understand biological processes, from embryogenesis to disease, and are central to the design of synthetic nanotechnology built from life’s basic building blocks: proteins and nucleic acids. Such models simulate DNA and protein physics using biomolecular force fields, which are characterized by parameters that dictate molecular properties such as hydrogen bonding strength. In DNA nanotechnology, computational modeling has already led to the successful creation of some impressive artificial nanomachines, but two major shortcomings are preventing more dramatic progress. First, the process by which biomolecular model parameters are chosen to best match experimental data is opaque and irreproducible, leading to a proliferation of redundant DNA and protein models that cannot be easily tweaked or iterated upon. Second, feed-forward, trial-and-error modeling is the norm, and inverse design, in which users input a desired molecular function and retrieve the nucleic acid sequence that would produce it, remains out of reach. jaxDNA, an initiative of Megan Engel (Senior Schmidt Science Fellow, University of Calgary), Michael Brenner (Harvard), and Ryan Krueger (Harvard), is poised to usher in a new paradigm in molecular modeling free from these weaknesses by harnessing machine learning training pipelines.

jaxDNA will provide a transparent procedure for fitting the parameters of an existing DNA model, oxDNA, by combining gradient-based optimization algorithms (the bedrock of machine learning) with molecular dynamics simulations. By wrapping this scheme in an accessible web interface and creating a dynamic online repository of experimental data that can be used for model fitting, jaxDNA will allow, for the first time, the establishment of community-accepted benchmarks in biomolecular modeling. This will drive the type of collaborative iteration and model development that propelled large language models and image classifiers to success in machine learning. The jaxDNA framework will also feature an accessible, practical tool for inverse DNA sequence design that experimentalists outside the computational modeling community can use, enabling maximum uptake. By demonstrating the impact of machine-learning-inspired biomolecular model development on the oxDNA model, jaxDNA opens the door to a transformation in the way all biomolecular modeling is done.
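To make the idea of combining gradient-based optimization with simulation concrete, here is a minimal JAX sketch. It is a toy illustration only and is not the oxDNA force field or the jaxDNA code: a single harmonic bond stands in for a force-field term, the simulation is a simple overdamped relaxation, and the "experimental" target is a made-up number. All names (`bond_energy`, `simulate`, `update`, `k`, `r0`) are hypothetical. The sketch differentiates a simulated observable with respect to a model parameter and adjusts that parameter by gradient descent until the simulation matches the target.

```python
import jax
import jax.numpy as jnp


def bond_energy(r, params):
    # Harmonic bond energy: a toy stand-in for one force-field term.
    return 0.5 * params["k"] * (r - params["r0"]) ** 2


def simulate(params, r_init=1.0, dt=0.05, n_steps=50):
    # Overdamped relaxation of the bond length r.
    # The force is obtained from the energy by automatic differentiation.
    def step(r, _):
        force = -jax.grad(bond_energy)(r, params)
        return r + dt * force, None

    r0 = jnp.asarray(r_init, dtype=jnp.float32)
    r_final, _ = jax.lax.scan(step, r0, None, length=n_steps)
    return r_final  # simulated observable: the relaxed bond length


def loss(params, target):
    # Squared mismatch between the simulated and the target observable.
    return (simulate(params) - target) ** 2


@jax.jit
def update(params, target, lr=0.2):
    # Gradient of the loss with respect to every parameter,
    # computed by differentiating through the whole simulation.
    grads = jax.grad(loss)(params, target)
    # For this sketch only r0 is fitted; k is held fixed.
    return {"k": params["k"], "r0": params["r0"] - lr * grads["r0"]}


# Hypothetical starting parameters and a made-up target bond length.
params = {"k": jnp.float32(2.0), "r0": jnp.float32(1.0)}
target = jnp.float32(1.5)

for _ in range(100):
    params = update(params, target)

fitted_distance = float(simulate(params))
fitted_r0 = float(params["r0"])
```

After the loop, `fitted_distance` should agree closely with the target. A realistic pipeline would replace the toy bond with a full force field, use stochastic dynamics with many particles, and compare ensemble averages against measured data, all of which raise difficulties (noise, long trajectories, memory cost of backpropagation) that this sketch does not address.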