Award-Winning Researcher Trains Robots to Make Educated Guesses

Yen-Ling Kuo always wanted to understand how things worked. When she was growing up in Taiwan, reading the story of Michael Faraday in elementary school piqued her curiosity about the natural world. During that time, she was introduced to Logo, a computer program with a turtle cursor to help children learn basic coding through hands-on experimentation.
It was Kuo’s introduction to programming logic.
Yen-Ling Kuo
Employer: University of Virginia in Charlottesville
Title: Assistant professor of computer science
Member grade: Member
Alma maters: National Taiwan University; MIT
In high school she learned what computers were capable of. She realized she could write programs that completed tasks independently.
“Once I discovered how powerful computers could be,” she says, “I knew I wanted to focus on using them to solve real-world problems.”
Kuo, an IEEE member, never lost her interest in the “how” behind processes and tools. Her curiosity, combined with a stint working at a Silicon Valley company, led her to focus on innovations that live at the intersection of cognitive and computer sciences.
Kuo, now an assistant professor of computer science at the University of Virginia in Charlottesville, last year received the IEEE Robotics and Automation Society’s inaugural Outstanding Women in Robotics and Automation Early Career Contribution Award. The award is part of the IEEE-RAS Women in Engineering’s Outstanding Women in Robotics and Automation (WiRA) Paper Awards, which promote excellence and recognize the impact that female researchers have on robotics and automation fields at different stages in their academic careers.
Kuo’s winning paper, “Diff-DAgger: Uncertainty Estimation with Diffusion Policy for Robotic Manipulation,” demonstrates a novel method to help robots better identify and estimate uncertainty when faced with scenarios on which they’ve not been trained. The method reduces the amount of human supervision, improves a robot’s rate of successful task completion, and opens up a path to introduce more complex models with bigger data demands into interactive robot learning.
She says her research will help people working in the robotics and automation fields more efficiently collect the data needed for effective model training.
Silicon Valley’s impact
Kuo earned bachelor’s and master’s degrees in computer science at the National Taiwan University, in Taipei, in 2009 and 2012. As she was nearing completion of her master’s degree, she did what many computer science graduates do: She pursued a summer internship at a tech company.
She spent the summer of 2011 at Google’s campus in Kirkland, Wash., working on the company’s comparison ads project.
When her internship ended, she joined the MIT Media Lab as a visiting student, working on the Open Mind Common Sense project with Henry Lieberman.
As she was considering pursuing a Ph.D., a call from Google changed her plans. The company offered her a full-time role as a software engineer.
“I viewed the job offer as a positive development,” she says. “I believe it can never hurt your future research career to get some real-world experience under your belt.”
She was hired in 2012 and helped build techniques that incorporate computer vision and natural language processing to improve the customer shopping search experience. She led the company’s Shop the Look initiative, a predecessor to Google’s current AI-powered shopping experience. The project connected social media content with search results, something the company had struggled to do in the past.
Kuo and her team were tasked with building a connection between the natural language people use to describe an item and an image that matches the searcher’s intent. It was at a time when the neural network—using deep learning models to power Google products—was gaining momentum at the company. Integrating neural network tools into her work was a requirement—which raised questions for Kuo.
“I was applying the neural network tools,” she says. “But I didn’t have 100 percent certainty about how they actually worked.”
She considered how she could become more knowledgeable about deep learning models. It was a full-circle moment. She decided that after nearly four years at Google, it was time to earn a Ph.D. in computer science. She returned to MIT in 2016.
The question that changed everything
Boris Katz, one of Kuo’s Ph.D. advisors, is a principal research scientist and the head of the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL)’s InfoLab. He also led the creation of the START Natural Language System, the world’s first Web-based question-answering system.
When the two met, Katz asked Kuo why she wanted to pursue a doctorate degree. She explained her interest in understanding how neural networks work and in using that knowledge to connect the physical world with human language.
He suggested she attend a summer course at MIT’s Center for Brains, Minds, and Machines, a research initiative that ran from 2013 through 2025. CBMM’s objective was to bring together computer scientists, cognitive scientists, and neuroscientists to understand how human intelligence works. The goal was to use the resulting insights to establish an engineering practice to build artificial intelligence systems.
For Kuo, it was a chance to better understand human intelligence and identify ways it could be replicated in machines.
“It was an opportunity for me to interact with other scientists and gain insight into how people learn, understand, and figure things out in the world,” she says. “I saw it as a very useful and inspiring way to incorporate those ideas into my own research work.”
During her Ph.D. studies, she was a research assistant at CSAIL. The experience helped shape her doctoral research, which focused on building AI systems that apply past learning to new situations. She developed machine learning models to support the efforts, including language understanding and social interactions.
She completed her Ph.D. in computer science in 2022 with a minor in cognitive science.
After graduation, she continued her work and collaboration at CSAIL, particularly on projects that involved the “theory of mind” concept.
Theory of mind spurs innovation
Theory of mind isn’t new, having originated with primatologists studying chimpanzees in the late 1970s. The term refers to the ability to recognize that others have their own thoughts, beliefs, and perspectives. It’s a skill that allows humans to infer someone’s mental state and predict their behavior without verbal communication.
“It’s like when college roommates are moving into their dorm. They may not talk too much, but they work together naturally to coordinate their activities and accomplish goals,” Kuo says. “They can infer and mentally interpret each other’s behaviors and signals to make decisions and complete tasks without words.”
She brought her theory of mind research to the University of Virginia when she joined as an assistant professor in 2023.
Kuo conducts her research in UVA Engineering’s multidisciplinary cyberphysical Link Lab. Her broad focus is on developing computational models that help robots interpret both direct data and silent signals, from language and movements to a person’s gaze. If successful, it could give robots the same sort of physical and theory of mind reasoning capabilities that power physical and social interactions among humans.
“There are no computational frameworks yet available that will translate this kind of understanding into a robot efficiently,” she says.
She adds that the process to get there begins with improving how robots learn to perform tasks.
The evolution of robot learning
Historically, one way robots learned was to mimic humans. A researcher would manually guide a robot through a task, like cutting an apple, and it would repeat the movements. The robot was successful until the environment changed, such as when its hand was in a different position or the apple was at a different angle. The robot was then faced with a situation for which it hadn’t been trained. Without any data available to help it correct course, the robot would start making small errors that eventually led to a full system crash.
This diagram describes how the robotic gripper’s visual perception and tactile sensing prevent a potato chip from breaking. Xuhui Kang, Yen-Ling Kuo, et al.
To solve the problem, researchers developed the dataset aggregation (DAgger) method. As a robot performed a task, a researcher was on standby to provide real-time corrections during unexpected scenarios. The correction data was continuously added to the robot’s model, teaching it how to recover from mistakes.
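The DAgger loop described above can be sketched in a few lines. This is a toy illustration, not code from Kuo's work: the one-dimensional state, the `expert_action` rule, and the least-squares learner `fit_gain` are all hypothetical stand-ins chosen to keep the example small. The key idea it shows is that the learner drives the rollout while the expert labels every state the learner actually visits, and those labels are aggregated into one growing dataset.

```python
import random


def expert_action(x):
    # Hypothetical expert (stands in for the human): steer the state toward zero.
    return -0.5 * x


def fit_gain(dataset):
    # Least-squares fit of a linear policy a = w * x over (state, action) pairs.
    num = sum(x * a for x, a in dataset)
    den = sum(x * x for x, _ in dataset)
    return num / den if den else 0.0


def dagger(iterations=5, horizon=20, seed=0):
    rng = random.Random(seed)
    dataset = []  # aggregated corrections from every rollout
    w = 0.0       # untrained learner policy
    for _ in range(iterations):
        x = rng.uniform(-1.0, 1.0)
        for _ in range(horizon):
            # The learner chooses the motion, but the expert labels the state,
            # so the dataset covers states the learner actually reaches.
            dataset.append((x, expert_action(x)))
            x = x + w * x + rng.gauss(0.0, 0.05)
        # Retrain on everything collected so far.
        w = fit_gain(dataset)
    return w, len(dataset)


if __name__ == "__main__":
    learned_gain, samples = dagger()
    print(learned_gain, samples)
```

In this sketch the expert is consulted at every step, which is the monitoring cost that the later gated variants try to reduce.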
To reduce the human monitoring effort, robot-gated DAgger was created to enable bots to query humans when the machines became uncertain.
The most popular way to make the query decision is to train multiple models and compare their outputs when determining a course of action. If the models all agree, the robot proceeds. If they don’t agree, the robot is likely to get stuck, so it asks for help.
Although the multiple model approach was widely adopted, it has limitations. Practically speaking, as models become more complex, it is hard or impossible to train multiple copies. A more fundamental issue is that disagreement among models doesn’t always imply uncertainty; it could just mean there are different ways to accomplish a task.
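A minimal sketch of the multiple-model gate, and of the limitation just described, might look like the following. The function name `should_ask_human`, the variance-based disagreement measure, and the threshold value are assumptions made for illustration; real systems compare richer action outputs than a single number.

```python
from statistics import pvariance


def should_ask_human(ensemble_actions, threshold=0.05):
    # Hypothetical gate: query the human when the models' proposed
    # actions spread out more than the threshold allows.
    return pvariance(ensemble_actions) > threshold


# Four models propose nearly the same action: the robot proceeds.
agree = [0.50, 0.52, 0.49, 0.51]

# Four models split between two equally valid strategies, for example
# passing an obstacle on the left (-1.0) or on the right (+1.0).
two_valid_modes = [-1.0, 1.0, -1.0, 1.0]

if __name__ == "__main__":
    print(should_ask_human(agree))            # False
    print(should_ask_human(two_valid_modes))  # True
```

The second case triggers a request for help even though every model is proposing a reasonable action, which is the false alarm that disagreement-based gating can produce.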
The Diff-DAgger solution
That is the gap Kuo’s research team closed with its Diff-DAgger method. The approach builds on diffusion policy, a technique that helps robots account for different ways a task can be performed.
The new method repurposes diffusion loss, the signal a robot uses to improve its model during training, as a real-time confidence check. During task execution, the robot computes the signal and compares it against values from its training data using a statistical test. The signal spikes when the robot faces an unfamiliar situation and is uncertain how to proceed. The signal stays silent when the robot’s current action is close to what it learned before.
The spike represents the robot’s ability to self-diagnose and predict an imminent failure. Human intervention is triggered only when the signal spikes. No spike means the robot can be left to complete its decision-making process on its own.
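The confidence check can be pictured roughly as follows. The article does not say which statistical test the team uses, so this sketch assumes a simple empirical-quantile threshold over training losses; the function names, the 0.99 quantile, and the loss values are all hypothetical, and the diffusion loss itself is treated as a number supplied by the policy rather than computed here.

```python
def quantile(values, q):
    # Empirical quantile by sorting (an assumed, simple choice of test).
    ordered = sorted(values)
    idx = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[idx]


def needs_intervention(runtime_loss, training_losses, q=0.99):
    # Flag a "spike": the loss seen during execution is larger than
    # almost everything observed on the training data.
    return runtime_loss > quantile(training_losses, q)


# Made-up training losses ranging from 0.01 to 1.00.
training_losses = [0.01 * i for i in range(1, 101)]

if __name__ == "__main__":
    print(needs_intervention(0.5, training_losses))  # familiar situation
    print(needs_intervention(2.0, training_losses))  # unfamiliar situation
```

Because the check reuses a signal from a single trained policy, it needs no ensemble of model copies, which fits the motivation given earlier for moving away from the multiple-model approach.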
Kuo’s team achieved significant results: Failure prediction rates were improved by 39 percent. Task completion rates were increased by 20 percent, and tasks were completed nearly eight times faster.
Her research at UVA gained attention from the National Science Foundation, which honored her last year with a Career Award, the foundation’s flagship grant for early-career researchers. The five-year US $665,000 grant supports her research that builds computational models for human-robot interactions through theory of mind reasoning.
She also received the Toyota Research Institute’s Young Faculty Researcher Award to teach cars to reason about interactions on the road and with the driver.
As service robots and self-driving vehicles become more available, such work is likely to make interactions between humans and robots more intuitive and useful.
Kuo ultimately wants to build more robust robots that are able to integrate into a social space with humans by engaging with us through grounded interactions, she says.
The impact of IEEE
Like many IEEE members, Kuo was introduced to the organization as a student. In 2018 she submitted her first paper, “Deep Sequential Models for Sampling-Based Planning,” to the IEEE/Robotics Society of Japan International Conference on Intelligent Robots and Systems while pursuing her Ph.D. at MIT. Her IEEE involvement grew alongside her professional career.
“It was a natural segue to transition from student to a full IEEE member,” she says. Today she is an active volunteer with the IEEE Robotics and Automation Society, a reviewer for submitted papers, and a presenter and panelist at conferences.
She says one of the best parts of attending conferences is having the opportunity to engage with students. She also enjoys participating as a panelist at luncheons, she says, because it gives her one-on-one time with student attendees. She can share her knowledge and offer insights as they prepare to embark on their career.
Her goal in the coming years, she says, is to broaden her involvement with IEEE initiatives and branch out to other technical committees. Sharing knowledge and learning from others is essential to anyone’s career growth, she says, and “IEEE offers a great opportunity for both.”

Facts Only

Yen-Ling Kuo is an assistant professor of computer science at the University of Virginia in Charlottesville.
She earned bachelor’s and master’s degrees in computer science from National Taiwan University in 2009 and 2012.
Kuo completed her Ph.D. in computer science at MIT in 2022, with a minor in cognitive science.
She worked as a software engineer at Google from 2012 to 2016, contributing to projects like Shop the Look.
Kuo received the IEEE Robotics and Automation Society’s inaugural Outstanding Women in Robotics and Automation Early Career Contribution Award in 2023.
Her winning paper, "Diff-DAgger: Uncertainty Estimation with Diffusion Policy for Robotic Manipulation," improves robots' ability to handle uncertainty in untrained scenarios.
The method reduces human supervision and increases task completion rates by 20%.
Kuo’s research focuses on theory of mind reasoning, enabling robots to interpret human signals and intentions.
She received a $665,000 National Science Foundation Career Award in 2023.
Kuo also received the Toyota Research Institute’s Young Faculty Researcher Award.
She is an active IEEE member, serving as a reviewer, presenter, and panelist at conferences.
Kuo’s work aims to integrate robots into social spaces by improving their ability to engage in grounded interactions with humans.

Executive Summary

Yen-Ling Kuo, an assistant professor of computer science at the University of Virginia, has made significant contributions to robotics and automation, particularly in improving how robots handle uncertainty and interact with humans. Her research focuses on developing computational models that enable robots to interpret both explicit data and subtle human signals, such as language, movements, and gaze, to enhance human-robot interactions. Kuo's work builds on her background in computer science and cognitive science, including her Ph.D. from MIT, where she explored AI systems that apply past learning to new situations. Her paper "Diff-DAgger: Uncertainty Estimation with Diffusion Policy for Robotic Manipulation" introduces a method to help robots better identify uncertainty in untrained scenarios, reducing human supervision and improving task completion rates. This innovation earned her the IEEE Robotics and Automation Society’s inaugural Outstanding Women in Robotics and Automation Early Career Contribution Award. Kuo's research is supported by grants from the National Science Foundation and the Toyota Research Institute, aiming to make interactions between humans and robots more intuitive and effective.
Kuo's career path includes early exposure to programming through Logo, a summer internship and later a full-time engineering role at Google, and a return to academia to deepen her understanding of neural networks and human intelligence. Her work at MIT’s Center for Brains, Minds, and Machines and CSAIL shaped her focus on integrating cognitive science with robotics. At the University of Virginia, she continues to explore theory of mind reasoning in robots, seeking to create systems that can infer human intentions and adapt to social contexts. Her involvement with IEEE, including reviewing papers and participating in conferences, reflects her commitment to advancing the field and mentoring the next generation of researchers.

Full Take

Yen-Ling Kuo’s research represents a compelling intersection of cognitive science and robotics, addressing a critical gap in how machines interpret and respond to human behavior. Her work on uncertainty estimation in robotic manipulation, particularly the Diff-DAgger method, offers a practical solution to a long-standing challenge: reducing the need for constant human oversight while improving task performance. The 39% improvement in failure prediction and 20% increase in task completion rates are notable, but the broader implications lie in how this approach could scale to more complex, real-world environments. The use of diffusion policy to repurpose training signals as real-time confidence checks is innovative, though questions remain about its robustness across diverse robotic platforms and edge cases where human behavior is highly unpredictable.
Kuo’s focus on theory of mind reasoning is particularly intriguing, as it seeks to bridge the gap between mechanical task execution and social interaction. The idea that robots could infer human intentions—much like humans do in collaborative settings—is ambitious but fraught with challenges. For instance, how do we ensure that robots accurately interpret subtle cues like gaze or body language without introducing biases or misinterpretations? The lack of existing computational frameworks for this kind of reasoning underscores both the novelty and the difficulty of her work. If successful, it could transform human-robot collaboration, but it also raises ethical questions about autonomy, consent, and the potential for manipulation in human-machine interactions.
The broader narrative around Kuo’s career—from her early exposure to programming to her transition from industry to academia—highlights the value of interdisciplinary thinking. Her time at Google provided practical insights into the limitations of neural networks, which later informed her academic research. This trajectory underscores the importance of real-world experience in shaping theoretical advancements. However, the article’s framing of her achievements as a linear progression from curiosity to innovation could benefit from acknowledging the iterative, often messy nature of scientific discovery. What setbacks or failed experiments shaped her current approach? How does her work address critiques of AI’s tendency to replicate human biases?
Bridge questions:
1. How might Kuo’s theory of mind models account for cultural differences in nonverbal communication, given that human social cues vary widely across societies?
2. What are the potential risks of robots misinterpreting human intentions, and how could these be mitigated in safety-critical applications like healthcare or autonomous vehicles?
3. If robots achieve theory of mind reasoning, how should we define their ethical boundaries in decision-making, particularly in scenarios where human intentions conflict?
