Thoughts and Theory

Exploit local structure of three-dimensional molecular complexes to predict binding affinities


Nathan C. Frey

This post was co-authored by Bharath Ramsundar from DeepChem.

Atomic convolutional neural networks (ACNNs) learn chemical features from the three-dimensional structure of protein-ligand complexes. In this post, we show how to use the open-source implementation of ACNNs in DeepChem and the PDBbind dataset to predict protein-ligand binding affinities.

An interactive tutorial accompanies this post and is available to run through Google Colab.

A complex problem

A key challenge in drug discovery is finding small molecules that preferentially bind to a target protein. We can use molecular docking or free energy perturbation calculations to predict the binding affinity of candidate molecules, but…



Thoughts and Theory

Towards an “ImageNet moment” in Molecular Machine Learning

This post was co-authored by Bharath Ramsundar from DeepChem.

Benchmark datasets are an important driver of progress in machine learning. Unlike computer vision and natural language processing, the diversity and complexity of datasets in chemical and life sciences make these fields largely resistant to attempts to curate benchmarks that are widely accepted in the community. In this post, we show how to add datasets to the MoleculeNet benchmark for molecular machine learning and make them programmatically accessible with the DeepChem API.


A simple guide to reproducible research without becoming a software engineer


When you do an experiment, whether that’s in a lab or on a computer, you generate data that needs to be analyzed. If your analysis involves new methods, algorithms, or simulations, you probably wrote some code along the way. Scientific code is designed to be quick to write, easy for the writer to use, and never looked at again after the project is complete (maybe designed is a strong word).

For many scientists, packaging their code involves a lot of work and no reward. I want to share a few obvious benefits and some that are hopefully non-obvious. After that…


Why materials science is a mystery to most people and why it’s important


When I (used to) get a haircut or take an Uber, the #1 question I got was the same one everyone else gets: “What do you do?” If I’m feeling really adventurous, I might say “I’m a graduate student studying materials science.” I’m never feeling really adventurous. It is extremely unlikely that the person I’m talking to will have heard of materials science as a field of study, even though the name is pretty descriptive and self-explanatory. It’s much easier to say “physics” or “chemistry.” Even if…


How to get started doing research in materials informatics (data science + materials science)

Understand structure-property-performance-processing relationships in materials with data science. From Wikimedia Commons.

In this post, I share resources and recommendations for getting involved in materials informatics research. As it becomes increasingly expensive and time-consuming to discover and engineer new materials to address some of our most pressing global challenges (human health, food and water security, climate, etc.), we need materials scientists with scientific domain expertise and training in data science. Whether you want to use data science in your own research or simply have a better understanding of the state of play in the field, this post will help you on your way.

I’ve shared a list of resources on GitHub…


Searching for materials with multiple quantum properties

Looking for “spin” in quantum materials. From Unsplash.

What is a MOM? Patient, supportive, loving…no, not quite. In the cold world of materials physics, a MOM is a multi-order material. That means a material with two or more quantum “orders,” where electrons in the material are organized in some way because of quantum mechanics. We call them “orders” because common examples include things like magnetism, an effect caused by electrons in a material “ordering” and all pointing in the same direction. So your refrigerator magnets have quantum order!

MOM: Multi-order material

In this Science Advances paper, we simulated materials on a…


How semi-supervised learning is used to accelerate materials synthesis


This post was co-authored with Vishnu Harshith from IIT Madras.

Many real-world problems involve datasets where only some of the data is labeled and the rest is unlabeled. In this post, we discuss our implementation of semi-supervised learning for predicting the synthesizability of theoretical materials.


When we think about the materials that will enable next-generation technologies, it’s probably not the case that there is one ultimate material waiting to be found that will solve all our problems. The problems we need to solve (producing and storing clean energy, mitigating climate change, desalinating water, etc.) …


Machine learning and physics-based simulations to design materials with defects

In this paper, published in ACS Nano, I show how machine learning and computer simulations can be used to design materials with defects for new types of computing and information storage. You might think that “defects” are always a bad thing; after all, if a product is “defective,” it doesn’t inspire a lot of confidence. But in materials, defects can be purposefully created and used to engineer better-performing technologies.

Cooking with defects

If you’ve owned a set of kitchen knives for more than a few months and they aren’t…


Something Deeply Hidden by Sean Carroll is really good and you should buy it. If you want to read a more concise and thoughtful review that was published when the book actually came out, I recommend Sabine Hossenfelder’s. If you are a physics student or physics enthusiast with some time to kill, read on…


Marc Andreessen’s essay “IT’S TIME TO BUILD” is being lauded as a “call to arms” and “rallying cry” for addressing the pandemic. I think the thesis is exactly right, but like most writing, it handily exculpates the writer and their close friends from any wrongdoing. I recommend you read the entire essay, but with this definition in mind: every time Marc says “we”, substitute “political leaders, CEOs, entrepreneurs, and investors.” Whether that is what Marc intended or not, I think it makes the essay clearer, more honest, and more actionable than if you think “we” refers to “the American people.”

Nathan C. Frey, PhD

Postdoc @MIT | Co-founder @AtomicAI | Previously NDSEG Fellow @UPenn and affiliate scientist @Berkeley Lab
