Teaching Machines to Understand Matter

To teach machines to understand matter, we need three things: a world model for molecular behaviour, a new kind of contact-sensory measurement, and a way to learn from matter as it behaves.

Shamit Shrivastava · CEO & Co-Founder, Apoha

I. The Missing World Model

We already know what it feels like to understand matter.

You taste a sauce and know it needs more salt. You feel a dough and know it needs more water. You smell milk and know it has turned. Long before we can explain the chemistry, each of us develops an intuitive model of how complex mixtures behave.

That model is built through contact.

Biology converged on five sensory channels to produce the raw material for intelligence: touch, taste, sight, sound and smell. These provide the data that living systems need to create a model of the world around them.

Two are wave senses: vision and hearing, which decode photons and pressure waves as they propagate through space. Three are contact senses: touch, taste, and smell, which emerge from matter colliding with matter at an interface.

Vision and hearing allow us to understand the world from a distance. Touch, taste, and smell are how we understand matter itself.

The deep learning revolution was built on wave-sensory data. Cameras, microphones, text, the internet, and decades of digitisation created vast datasets for machines to learn from. Machines learned to see, hear, read, and speak.

But the contact senses followed a different path. There is no internet-scale dataset for touch, taste, or smell. We have no equivalent corpus to understand what happens when matter meets matter.

This is why machines remain strangely poor at understanding molecular behaviour. They can recognise an image of a glass of wine, describe its chemistry, and retrieve everything written about it. But they cannot taste it and recommend something else that you may like.

And yet machines are now moving into the physical world. Autonomous labs run experiments. Robots manipulate objects. AI systems design molecules, predict protein structures, and propose new drugs. The ambition is no longer merely to process information; it is for AI to plan actions in the real world, execute them, anticipate outcomes, and learn from surprises. This is the new frontier of AI that everyone is excited about: so-called world models.

Oliver Hsu, Investment Partner at Andreessen Horowitz, recently mapped “frontier systems for the physical world” — but the systems he describes still operate primarily through vision, language, or motion. Yann LeCun has argued that language models remain fundamentally limited without sensory grounding. Both are reaching for the contact senses without naming them.

And with that ambition comes a deeper question: what kind of world model does a machine need for the world it is acting within?

Most world-model discussions are about planning the motion of objects in the real world. For robotics, drones, and self-driving cars, the key principles that world models learn and plan around are equations of motion: forces, trajectories, and how objects move through space.

But motion is only one aspect of physical reality, especially beyond applications like robotics. Matter does more than move. Molecules fold, bind, dissolve, aggregate, crystallise, denature, gel, separate, and degrade. A therapeutic antibody in a vial, a lipid nanoparticle in blood, a battery electrolyte at an electrode, milk in your coffee — their behaviour is governed by more than just equations of motion.

If we want machines to understand molecules in the real world, rather than isolated objects in a simulation, they need a world model for molecular behaviour. They need a way to represent the various forms that a molecule can take under different conditions, and to predict how it transforms when it interacts with its environment. This representation is known as the equation of state.

The equation of state is one of the oldest and most ambitious ideas in physical science: a description of the conditions matter can occupy, and the transitions between those conditions. Gibbs gave it a mathematical form in 1873. Maxwell famously sculpted one in clay to make its geometry tangible.

For simple substances, equations of state have been among the great successes of physics. They can describe systems such as steam in a power plant, where pressure, temperature, and volume are related well enough to design turbines and predict performance. But for complex molecules in real environments — proteins, formulations, biological fluids, electrolytes, mixtures — they remain one of science’s deepest unsolved challenges.
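As a concrete reference point, two textbook equations of state for simple gases are shown below. They are standard results, included only to illustrate the kind of relation the term refers to. The symbols are not defined in the original text: $P$ is pressure, $V$ volume, $n$ amount of substance, $T$ temperature, $R$ the gas constant, and $a$ and $b$ are substance-specific constants.

```latex
% Ideal gas law
PV = nRT

% van der Waals equation
\left(P + \frac{a n^{2}}{V^{2}}\right)\left(V - n b\right) = nRT
```

The van der Waals form already captures a liquid-vapour transition with two fitted constants per substance, which gives a sense of how compact a successful equation of state can be for a simple system.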

The difficulty is not merely computational. Behaviour emerges from vast numbers of interacting components across scales of space and time. Predicting how a molecule will behave under real-world conditions remains extraordinarily hard.

When AI learned sight and sound, a great deal of digitised audio and visual data was already available. The key unlock came from new models and enough compute. For molecular behaviour, however, what we lack is the right kind of data.

This is not something that clever engineering, automation, and more measurements can solve. In fact, the computing power is already there. What we need is a scientific breakthrough. We need a new data layer for AI.

• • •

II. The Missing Measurement

The industry already measures a great deal. Modern laboratories measure viscosity, aggregation, thermal stability, particle formation, charge variants, degradation, binding, solubility, and more. Increasingly, these measurements sit inside automated loops, with AI proposing molecules, robots testing them, and models updating their predictions.

The limitation is deeper than effort or throughput. Most laboratory measurements reduce behaviour to isolated properties, measuring one thing at a time, under controlled conditions, often near equilibrium, often as sparse points or endpoints. They are powerful, but they are not how contact senses understand matter.

Contact senses perturb matter and read the response as it unfolds in time. They measure physical chemistry as a dynamic signal, far from equilibrium, where behaviour is often most revealing.

Under those conditions, matter does not express itself through a single channel. Mechanical, chemical, electrical, and thermal effects are coupled. A contact event is inherently multimodal. The readout has to be dynamic and multimodal too.
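A toy sketch can show why endpoints hide behaviour. Two hypothetical samples relax to the same final value after a perturbation, but at very different rates. The first-order relaxation model, the time constants, and the function names are all assumptions chosen for illustration; they are not Apoha's measurement or model.

```python
import math

def relaxation(t, final_value, tau):
    """First-order response to a step perturbation applied at t = 0."""
    return final_value * (1.0 - math.exp(-t / tau))

def trace(final_value, tau, t_end=100.0, n=201):
    """Sample the response on an evenly spaced time grid from 0 to t_end."""
    times = [t_end * i / (n - 1) for i in range(n)]
    return times, [relaxation(t, final_value, tau) for t in times]

# Two hypothetical samples: same plateau, different dynamics.
times, fast = trace(final_value=1.0, tau=1.0)
_, slow = trace(final_value=1.0, tau=10.0)

# Endpoint measurement: read only the last point of each trace.
endpoint_gap = abs(fast[-1] - slow[-1])

# Time-resolved measurement: compare the two traces point by point.
max_gap = max(abs(f - s) for f, s in zip(fast, slow))

print(f"difference at the endpoint:      {endpoint_gap:.6f}")
print(f"largest difference along the way: {max_gap:.3f}")
```

In this toy case the two endpoints are nearly indistinguishable, while the traces differ by more than half of the plateau value early on. An endpoint readout would call the samples the same; a time-resolved readout would not.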

Protein science shows what can happen when the right kind of data exists.

For decades, protein folding was treated as a first-principles problem: given a sequence, compute its three-dimensional structure from the underlying physics. That could work in theory, but the search space is too vast and the approximations too costly.

DeepMind showed another way.

AlphaFold did not solve protein folding by deriving every interaction from quantum mechanics. It learned from the Protein Data Bank: decades of experimentally determined structures, standardised and curated at scale.

That lesson is now reshaping protein engineering. Models propose sequences, predict structures, screen functions, and tighten the loop between design and experiment. The field is moving from discovery toward design.

But the success of structure prediction also reveals the next bottleneck.

A protein is more than its sequence and structure. It is a physical object that must survive concentration, temperature, shear, interfaces, excipients, salts, buffers, containers, freeze-thaw cycles, and time. It must behave not as an isolated molecule in a model, but as part of a formulation.

For example, when we say a material or molecule has “failed”, we typically mean that it did not get through its lifecycle unscathed, or that it lost its designed functionality along the way. For a protein formulation, failure can mean that it suddenly aggregates and blocks the syringe trying to inject it into the blood, that it precipitates when exposed to a new buffer or salt environment, or that it gels when it accidentally comes into contact with air. All these transformations (gelation, aggregation, and precipitation under stress or changed environmental conditions) are failure modes that physics knows as phase transitions.

Molecular dynamics, density functional theory, coarse-grained simulations, and other first-principles approaches have produced extraordinary insight into such behaviours of proteins. But for complex, multicomponent liquids and biological formulations, they still run into the same wall: the real system is too coupled, too high-dimensional, and too sensitive to conditions to be fully captured from first principles alone.

Dynamic observations of how molecules and formulations behave and how they transform when they interact with their environment are the missing layer between molecular design and real-world performance.

• • •

So how do we create behavioural state data?

We return to the contact senses.

When you touch silk, taste wine, or smell rain, two material interfaces meet. That encounter produces emissions — mechanical, chemical, electrical, thermal — and the nervous system reads them as information.

For more than a decade, this has been the scientific problem at the centre of my work: how interfaces carry information. In lipid interfaces, I studied nonlinear waves whose propagation depends on thermodynamic state. Later work on shock-like and detonation-like waves extended the same idea: interfaces do not merely separate materials. They transmit and process information about state.

The point is not to reproduce biology literally. Nature is a guide to principles, not an instruction manual. When humans took inspiration from birds to build flying machines, they did not build planes with flapping wings. They extracted the principle of flight and engineered it in another form.

The same is true here.

We do not need to grow neurons, build artificial tongues, or reproduce the full machinery of smell. We need to extract the principle of contact sensing: create a controlled interface, bring matter into contact with matter, capture the emissions produced by that encounter, and learn the behavioural state encoded in them.

The natural place to begin is the liquid state: a dynamic interface where two or more molecules interact, mix, organise and reveal their behaviour.

Life began in liquid, and most of biology still happens there. Blood, cytoplasm, mucus, milk, tears, vaccines, antibody formulations, electrolytes, beverages, fuels, detergents, and inks are just some of the liquids in which molecules meet, move, react, assemble, and fail.

Liquid states reveal the most about molecular behaviour: viscosity, aggregation, gelation, phase separation, crystallisation, solubility, degradation, stability, and flow.

All of these behaviours are essential to understanding how drugs work, how food tastes, and how new materials perform. This ability to understand matter is what we call Liquid State Intelligence.

• • •

III. Closing the Loop

Liquid State Intelligence was built to create the missing data layer for molecular behaviour.

Apoha’s platform uses carefully prepared liquid interfaces tuned near thermodynamic transitions. A tiny droplet of the molecule or formulation to be understood is introduced. The liquids meet. Their interaction produces time-domain emissions: dynamic signatures of how the sample behaves under coupled chemical, mechanical, electrical, thermal, and interfacial stress.

The platform captures those emissions and turns them into behavioural state data: high-dimensional representations of how a molecule behaves, not merely what it looks like.

Feynman once imagined a tiny insect at the corner of a swimming pool. People are diving, splashing, moving through the water. The insect cannot see the pool. It only feels the waves arriving at the edge. Could it reconstruct what is happening in the pool from those waves alone?

That is the thought experiment Liquid State Intelligence is built around.

When two liquids meet, the sample perturbs the interface and emissions radiate outward. Those emissions carry information about what happened in the encounter: how the molecule interacted, how stress propagated, and whether the system tolerated the perturbation or crossed toward failure.

The same principle appears across physics. The Large Hadron Collider learns about particles from the emissions produced in collision. LIGO learns about black holes from waves emitted by their merger. Liquid State Intelligence applies the same logic to complex molecules in liquid: collide, listen, decode.

• • •

The output of Liquid State Intelligence is a machine-readable map of how a molecule behaves: for example, when it freezes, gels, or aggregates.

This ability to directly perceive behaviour and recognise materials through state changes is a game-changer.

Most material design begins with composition or structure and then tries to predict behaviour. Liquid State Intelligence inverts the approach. We start with the critical information. We begin with how a material actually behaves.

Each sample becomes a point in a space organised by behaviour. In that space, distance has meaning. Materials that are close together behave similarly, even if their compositions, sequences, or structures are very different. Materials that are far apart behave differently, even if they appear similar by conventional descriptors.

That lets us ask a different kind of question: not “what is compositionally similar?” but “what behaves similarly?”

Find me a protein that behaves like chicken protein after hydration, heating, shear, extrusion, cooling, and storage. Find me a sugar replacement that behaves like sugar not only in sweetness, but in onset, retention, spike, stickiness, mouthfeel, and aftertaste. Find me a different material that preserves the functional behaviour I care about under the stresses it will actually experience.
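The “what behaves similarly?” question can be sketched as a nearest-neighbour lookup. Everything concrete here is assumed for illustration: the sample names, the three-number behaviour vectors, and the choice of Euclidean distance are hypothetical placeholders, not Apoha's representation or data.

```python
import math

# Hypothetical behaviour vectors (made-up numbers, for illustration only).
# Each axis stands for some measured response, e.g. to heating, shear, storage.
library = {
    "candidate_A": [0.90, 0.20, 0.40],
    "candidate_B": [0.10, 0.80, 0.70],
    "candidate_C": [0.85, 0.25, 0.35],
}

def distance(u, v):
    """Euclidean distance between two behaviour vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def behaves_like(target, library, k=1):
    """Return the names of the k library entries closest to target, nearest first."""
    ranked = sorted(library, key=lambda name: distance(target, library[name]))
    return ranked[:k]

# Hypothetical reference behaviour we want to match.
reference = [0.88, 0.22, 0.38]
matches = behaves_like(reference, library, k=2)
print(matches)
```

The lookup never consults composition, sequence, or structure. Two entries rank as neighbours only because their measured responses are close, which is the inversion the text describes.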

We have used this logic to solve hard problems across diverse materials from soil to serums and fats to formulations, including helping identify a plant protein that could replace a chicken-like protein in a product already on supermarket shelves. But the frontier we are most focused on is antibodies.

In antibodies, the question is not chicken-like behaviour. It is drug-like behaviour: whether a molecule can avoid the failure modes that would stop it from becoming a drug, or from being administered to a patient, despite having the right biology.

Antibodies are the sharpest proving ground. They are programmable, heavily benchmarked, and surrounded by sophisticated computational workflows. Industry already understands the cost of behavioural failure. Billions of dollars are being spent trying to predict developability. That gives us a demanding arena, and a clear measure of whether behavioural state data adds something existing pipelines cannot.

Working with Boehringer Ingelheim, and later in a study presented at PEGS, we showed that unstable antibody candidates undergo a visible gelation: a state change in the neck between merging liquids, detectable from just eight micrograms of material in dilute solution.

Using only eight micrograms, Liquid State Intelligence generates machine-readable behavioural maps capturing multiple liabilities simultaneously. These maps identify failure-prone molecules with greater than 90% precision in an explainable manner. Apoha's technology can dramatically reduce costs and timelines in drug development while opening up new avenues for R&D teams across food, drink, biotech and beyond.
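For readers less familiar with the metric, precision has a standard definition, given here on the assumption that the term is used in its usual classification sense. $TP$ counts molecules flagged as failure-prone that do fail, and $FP$ counts molecules flagged as failure-prone that do not.

```latex
\text{precision} = \frac{TP}{TP + FP}
```

On that reading, precision above 90% means that more than nine in ten flagged molecules are genuine liabilities. It does not by itself say how many failure-prone molecules go unflagged, which is what recall measures.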

For the first time, a discovery team can see behavioural liabilities from micrograms of material, months or years before those liabilities would surface in development, when the cost of a wrong decision is still low and the freedom to act on it is still high.

• • •

There is a precedent for what happens when you push the envelope on understanding materials. Carnot studied steam engines and laid the groundwork for the second law of thermodynamics. Gibbs studied chemical mixtures and gave the equation of state its mathematical form. The foundational laws of thermodynamics emerged from the effort to understand how matter actually behaves. The reward was never just better materials. It was a deeper understanding of nature itself.

In building machines that learn thermodynamics from behavioural state data and from direct contact with matter, we are opening the same door. A machine that truly understands how materials behave does not stop at materials. It helps us answer fundamental questions: How do self-organising systems emerge? How did life arise from chemistry? What principles govern the boundary between order and chaos? Understanding and engineering new medicines, foods and materials is one upshot of Liquid State Intelligence. Understanding the physical world itself, in all its behaviours, is another.

That is what this company is for.

Join us.

Shamit Shrivastava is CEO and Co-Founder of Apoha. A mechanical engineer and biophysicist, he holds a PhD in Bioengineering from Boston University and conducted postdoctoral research at the University of Oxford's Institute for Biomedical Engineering. His work spans 48 publications and over 1,600 citations across thermodynamics, nonlinear acoustics, neuromorphic computing, and biophysics. His foundational experimental demonstration of solitary acoustic waves in lipid interfaces — and the theoretical framework treating nerve impulses as thermodynamic shock waves governed by the equation of state — was developed within a lineage of research spanning Manfred Eigen (Nobel Prize, 1967), Konrad Kaufmann, and Matthias Schneider. Featured in Douglas Fox, "The Brain, Reimagined," Scientific American (April 2018); monograph published in Progress in Biophysics and Molecular Biology (2021).