Four ways Google Research scientists have been using Empirical Research Assistance

April 29, 2026
The Google Research Science team
Since introducing Empirical Research Assistance in the fall, Google Research scientists have been using it to address real-world applications in epidemiology, cosmology, atmospheric monitoring, and neuroscience, providing a hint of AI’s transformational capabilities to accelerate scientific discoveries.
AI’s capabilities to advance scientific discovery are growing every week, with outcomes that promise not just to enable breakthrough discoveries but to transform how science is done. In September, we released a preprint introducing Empirical Research Assistance (ERA) to help scientists generate expert-level empirical software. That included novel solutions to six diverse and challenging benchmark problems in fields ranging from cell biology to neuroscience.
Since then, Google scientists and our academic collaborators have been developing and using ERA to test its capabilities and explore potential applications. These efforts go beyond proof-of-concept tests to real-world scenarios in epidemiology, geospatial analysis, and more, revealing how AI can democratize access to the power of computational modeling, find solutions to unsolved problems, unlock deeper insights from existing data collections, and go beyond black-box modeling to discover interpretable, mechanistically accurate solutions.
It’s been inspiring to see the excitement of Google Research scientists, visiting faculty researchers and academic collaborators as they experiment with ERA. We are thrilled to see these capabilities expand as ERA nears more widespread availability to support AI-assisted scientific discovery for global benefit.
In the preprint, the authors used ERA to predict U.S. hospitalizations for COVID-19, showing that it was able to retrospectively match or outperform existing tools from the Centers for Disease Control and Prevention (CDC) and leading research institutions. As a follow-on effort, the team has now expanded to generate forecasts not just for COVID-19, but also for influenza and respiratory syncytial virus (RSV), and has been submitting prospective forecasts in real time every week.
When the CDC’s long-running flu forecast challenge opened in November for the 2025-26 season, Google began submitting weekly forecasts for every U.S. state at all time horizons, up to four weeks into the future. Late last year Google also joined the CDC’s year-round live forecasts for state-level COVID-19 hospitalizations, as well as the CDC’s recently launched hub for forecasting RSV. Public leaderboards for flu and COVID-19 run by Nicholas Reich, a biostatistics professor at the University of Massachusetts Amherst and a consultant on this project, show that Google has performed at or near the top of both leaderboards for as long as it has been submitting forecasts to each project (see figure). Although there is no public leaderboard for RSV, internal analyses show similarly strong performance.
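To give a feel for how probabilistic forecasts like these can be ranked, here is a minimal sketch of the pinball (quantile) loss, a standard way to score a forecast submitted as a set of quantiles. The text does not say which metric the leaderboards use, so treating it as a quantile-based score is an assumption, and the forecast values and function names below are purely illustrative.

```python
def pinball_loss(y_true, y_pred, q):
    """Pinball (quantile) loss for a single predicted quantile at level q."""
    diff = y_true - y_pred
    # Under-prediction is weighted by q, over-prediction by (1 - q).
    return q * diff if diff >= 0 else (q - 1.0) * diff


def mean_pinball(y_true, quantile_forecast):
    """Average pinball loss over a dict of {quantile level: predicted value}."""
    losses = [pinball_loss(y_true, pred, q) for q, pred in quantile_forecast.items()]
    return sum(losses) / len(losses)


# Hypothetical one-week-ahead forecast of hospital admissions for one state.
forecast = {0.1: 80.0, 0.5: 100.0, 0.9: 130.0}
observed = 110.0

score = mean_pinball(observed, forecast)
print(f"mean pinball loss: {score:.3f}")  # lower is better
```

Lower scores reward forecasts that are both sharp and well calibrated: a quantile that misses on its unlikely side is penalized more heavily than one that misses on its likely side.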
An AI-powered tool that can meet or exceed the forecasting accuracy of leading public health agency tools promises substantial public health benefits. It could extend forecasting to newer conditions and to more locations, democratizing access to computational modeling in epidemiology across a wider range of infections and geographies.
Cosmic strings are theoretical defects in the fabric of spacetime, believed to have formed in the early universe and predicted to emit gravitational radiation. Calculating the spectrum of this emitted energy is an unsolved problem, largely because the governing equations contain singularities — mathematical points where values approach infinity and traditional models break down. Last fall, a paper used OpenAI’s GPT-5 to find a partial solution for the gravitational energy radiating from cosmic strings, but only for the simplest case of a square loop where the angle α = π/2, or 90 degrees. A unified exact solution — a single, complete mathematical formula that solved the integral perfectly — remained an open problem.
To address this, we combined ERA with Gemini Deep Think. By systematically exploring mathematical techniques capable of navigating these singularities, we successfully derived six general solutions and a concise formula for the asymptotic limit, which we shared in March. This illustrates the powerful potential of pairing ERA with advanced LLMs to unlock precise, novel solutions at the frontier of cosmology.
Regular observations of carbon dioxide (CO2) began at Hawaii’s Mauna Loa Observatory in the late 1950s, yielding the iconic Keeling Curve that documents rising global CO2 concentrations in Earth’s atmosphere. Mapping human greenhouse gas emissions and understanding how plants, trees, soils and oceans absorb those emissions requires us to track how CO2 varies across regions and over time. Current space-based CO2 sensors, like NASA’s Orbiting Carbon Observatory-2 (OCO-2), were designed to make high-precision observations, but they map only a tiny fraction of the Earth’s surface and return to each location just once every 16 days. Geostationary satellites, such as the GOES East satellite designed to support weather forecasting, orbit the Earth at a much higher altitude and can scan an entire hemisphere every 10 minutes. However, none of the existing geostationary satellites were designed to map CO2.
Google researchers used ERA to develop a single-pixel, physics-guided neural network to distill a column-averaged CO2 signal from the existing GOES East observations. To do so, the model combines data from 16 GOES East wavelength bands with lower-troposphere meteorology, solar angles, and day of the year. After training on the sparse observations from OCO-2 and OCO-3, the model was able to derive estimates of column-averaged CO2 everywhere GOES East observes, every 10 minutes.
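The inputs described above can be sketched as a per-pixel feature vector fed to a small network. This is an illustration only, not the model from the study: the three meteorology values, the cyclic encodings, the network size, and every function name are assumptions, the weights are random rather than trained on OCO-2 and OCO-3 data, and the physics-guided parts of the real model are not represented.

```python
import math
import random

N_BANDS = 16  # number of GOES East wavelength bands, as stated in the text


def build_features(bands, met, solar_zenith_deg, solar_azimuth_deg, day_of_year):
    """Assemble one pixel's input vector (feature choices are illustrative)."""
    if len(bands) != N_BANDS:
        raise ValueError(f"expected {N_BANDS} band values, got {len(bands)}")
    zen = math.radians(solar_zenith_deg)
    az = math.radians(solar_azimuth_deg)
    phase = 2.0 * math.pi * day_of_year / 365.25
    # Cyclic encodings keep late December close to early January,
    # and an azimuth of 359 degrees close to 0 degrees.
    return (list(bands) + list(met)
            + [math.cos(zen), math.sin(az), math.cos(az)]
            + [math.sin(phase), math.cos(phase)])


def init_mlp(n_in, n_hidden, seed=0):
    """Random weights for a one-hidden-layer network (untrained placeholder)."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.gauss(0.0, 0.1) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def predict(params, x):
    """Forward pass: tanh hidden layer, then a linear output for one pixel."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(params["w1"], params["b1"])]
    return sum(w * h for w, h in zip(params["w2"], hidden)) + params["b2"]


# Hypothetical standardized inputs for a single pixel at a single scan time.
features = build_features([0.5] * N_BANDS, [0.2, -0.1, 0.6], 30.0, 120.0, 200)
params = init_mlp(len(features), n_hidden=8)
estimate = predict(params, features)  # meaningless until the network is trained
```

Because the network sees one pixel at a time, the same function can in principle be applied to every pixel of every scan once trained, which is what makes dense estimates in space and time possible.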
Research shared at the International Workshop on Greenhouse Gas Measurements from Space shows that the AI-developed model leverages the high spatial and temporal density of the GOES East observations to track column-averaged CO2 with unprecedented spatial and temporal resolution. Comparisons against independent data from additional years of OCO-2 observations, and from the ground-based Total Carbon Column Observing Network (TCCON), confirm the model’s ability to capture real CO2 variability.
These results show how an AI algorithm can extract additional value from existing observational instruments, especially for resource-intensive satellite research missions. This project addresses one of several questions related to climate and greenhouse gases that Google researchers are exploring using ERA.
Although we can now map tens of thousands of neurons in living brains, untangling the functional circuits is the next step. Google researchers used ERA to tackle this challenge in both real and simulated zebrafish, a popular model organism for studying how a vertebrate detects stimuli, processes information and responds. In natural settings, light passing through ripples on the water’s surface creates patterns of light and dark stripes on the seafloor or riverbed. Zebrafish have evolved to instinctively respond to changes in those stripes in order to stay in shallow water and avoid getting swept away.
In a new study, we looked at the zebrafish neural circuit corresponding to this environmental stimulus. We provided ERA with the wiring diagram of simZFish, a simplified zebrafish body and brain simulator. Guided by this information — revealing what cellular connections exist, but omitting the mathematical rules that govern them — ERA was able to propose circuits that connect stimulus to neural activity to motor response. Testing these AI-hypothesized circuits against new visual stimuli showed that they were not just statistical shortcuts, but accurate neural mechanisms that generalize to other, similar situations.
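To illustrate what it means to know which connections exist without knowing the rules that govern them, here is a toy sketch. It is not simZFish and not a circuit proposed by ERA: the three-cell chain, the tanh rate dynamics, and all names and weights are hypothetical. The wiring diagram is a fixed binary mask, and a candidate circuit is any choice of weights on the connections the mask allows.

```python
import math

# Hypothetical 3-cell chain: sensory (0) -> interneuron (1) -> motor (2).
# MASK[i][j] == 1 means a connection from cell j to cell i exists.
MASK = [
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
]

# Candidate weights. The entry at [0][2] is nonzero on purpose:
# the mask removes it because no such connection is in the wiring diagram.
WEIGHTS = [
    [0.0, 0.0, 5.0],
    [1.5, 0.0, 0.0],
    [0.0, 2.0, 0.0],
]


def step(rates, weights, stimulus):
    """One synchronous update of firing rates under the wiring constraint."""
    n = len(rates)
    new_rates = []
    for i in range(n):
        drive = stimulus[i] + sum(MASK[i][j] * weights[i][j] * rates[j] for j in range(n))
        new_rates.append(math.tanh(drive))
    return new_rates


def simulate(weights, stimulus, n_steps=5):
    """Run the circuit from rest and return the final firing rates."""
    rates = [0.0] * len(stimulus)
    for _ in range(n_steps):
        rates = step(rates, weights, stimulus)
    return rates


# A stimulus on the sensory cell propagates along the chain to the motor cell.
motor_response = simulate(WEIGHTS, [1.0, 0.0, 0.0])[2]
print(f"motor response: {motor_response:.3f}")
```

In a sketch like this, testing a candidate circuit on new stimuli amounts to checking that the same weights keep producing the right motor response for inputs that were not used to choose them.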
This builds on results from the preprint, which showed that AI-developed models could outperform baseline methods at predicting the activity of over 70,000 neurons captured in the Zebrafish Activity Prediction Benchmark (ZAPBench), a dataset of neural activity from experiments that mimic typical environmental stimuli.
While ZAPBench proved ERA's ability to find state-of-the-art predictive solutions, the simulated environment reveals how it can go beyond black-box modeling. Equipped with structural information, ERA discovered interpretable, mechanistically accurate solutions, providing a powerful blueprint for addressing scientific grand challenges in living brains.
These four projects are among a growing list of results that show how LLM-backed systems can advance science and accelerate the pace of discovery. The examples span a range of fields and problem types, from theoretical math to forecasting to the analysis of data from observational instruments and simulation output. They also showcase the potential for AI-enabled science to solve open problems, democratize access to computational modeling, and maximize the utility of existing observational data. We’re excited about the progress being unlocked by ERA and other Google tools designed to accelerate scientific discovery, including co-scientist and PAT.
We’d like to thank our collaborators on developing ERA, and all the scientists who are among the early adopters. The epidemiological forecasting work is led by Zahra Shamsi, Sarah Martinson, Nicholas Reich, Martyna Plomecka, and Brian Williams. The cosmological paper is authored by Michael Brenner, Vincent Cohen-Addad, and David Woodruff. The research on carbon dioxide monitoring is led by Aarón Sonabend-W, Sean Campbell, Renee Johnston, Vishal Batchu, Carl Elkin, Christopher Van Arsdale, John Platt, and Anna Michalak. The paper on neural circuits was authored by Jan-Matthis Lückmann, Viren Jain, and Michał Januszewski. We also acknowledge leadership support from John Platt, Michael Brenner, Lizzie Dorfman, Vip Gupta, Alison Lentz, Erica Brand, Katherine Chou, Ronit Levavi Morad, Yossi Matias, and James Manyika.

