Technology 2026-03-17 3 min read

A robot on wheels that reasons about where you left your glasses

TUM's search robot combines language model knowledge with real-time 3D mapping to find misplaced objects 30% faster than random search.

It looks like a broomstick on wheels with a camera perched on top. It is not elegant. But the robot coming out of Prof. Angela Schoellig's Learning Systems and Robotics Lab at the Technical University of Munich (TUM) can do something no previous system has managed quite this well: reason about where a lost object might be, and then go look there first.

Tell it you have lost your glasses in the kitchen, and it does not simply scan every surface systematically. It builds a three-dimensional map of the room, identifies objects in the scene, and then uses a language model to figure out which surfaces are plausible resting places for eyeglasses. A table or windowsill? Likely. A stovetop or sink? Probably not. The result, according to the team's measurements, is a search roughly 30% more efficient than random exploration.

Depth pixels and common sense

The system works by combining two different forms of intelligence. The camera captures standard two-dimensional images, but each pixel also carries depth information, allowing the robot to construct a centimeter-accurate 3D spatial map that updates continuously as it moves. A laptop connected to the robot runs image recognition to identify objects in the scene.
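The paper does not publish its mapping code, but the geometry behind a depth camera is standard: each pixel's depth value can be back-projected into a 3D point using the pinhole camera model. A minimal sketch, with hypothetical intrinsics for a 640x480 depth sensor:

```python
import numpy as np

def deproject(u, v, depth, fx, fy, cx, cy):
    """Back-project a pixel (u, v) with metric depth into a 3D point
    in the camera frame, using the standard pinhole model."""
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])

# Hypothetical intrinsics (focal lengths fx, fy and principal point cx, cy)
# for a 640x480 depth camera; real values come from sensor calibration.
point = deproject(u=400, v=240, depth=1.5, fx=525.0, fy=525.0, cx=320.0, cy=240.0)
print(point)  # x, y, z in meters, in the camera's coordinate frame
```

Accumulating these points as the robot moves, and transforming them by the robot's estimated pose, is what yields the continuously updated 3D map the article describes.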

The second layer is a large language model that provides what amounts to common-sense reasoning about object-location relationships. The language model has absorbed enough text from the internet to know that people set glasses on tables, not burners. The researchers convert this relational knowledge into probability scores on the 3D map - continuously recalculated estimates of how likely the target object is to be at each location.
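The article does not give the scoring details, but the idea of turning language-model plausibility judgments into search probabilities can be sketched simply. The surface names and scores below are illustrative, not from the paper:

```python
# Hypothetical plausibility scores an LLM might assign to detected
# surfaces as resting places for "glasses" (0 = implausible, 1 = very likely).
llm_scores = {
    "kitchen table": 0.9,
    "windowsill": 0.7,
    "countertop": 0.6,
    "stovetop": 0.05,
    "sink": 0.05,
}

# Normalize the raw scores into a probability distribution over locations.
total = sum(llm_scores.values())
probs = {loc: s / total for loc, s in llm_scores.items()}

# Search the most promising locations first.
for loc, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{loc}: {p:.2f}")
```

Ranking by these probabilities is what lets the robot head for the table before it ever bothers with the stovetop.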

"We have taught the robot to understand its surroundings," Schoellig said. The system uses AI in two distinct ways: image recognition to identify what is in the room, and language modeling to reason about where a missing object would logically be.

Spotting what has changed

The robot has another trick that makes it particularly useful for the real-world problem of finding recently misplaced items. It remembers previous images of a space and compares them with current views. If a new object suddenly appears on the kitchen counter - something that was not there during the last scan - the system flags that area as a high-probability search location, assigning it a 95% likelihood.

This change-detection capability turns the robot from a generic searcher into something closer to a domestic assistant that tracks the evolving state of a room. It does not just know what is there; it knows what is new.

Open drawers, closed questions

The current system has an obvious limitation: it can only find objects that are visible. Glasses tucked inside a drawer or behind a closed cabinet door are invisible to a camera, no matter how sophisticated the reasoning layer. Schoellig's team is already working on this next phase, which requires the robot to physically interact with its environment - opening cupboards, grasping handles, determining whether a door swings upward or sideways.

That step involves substantially harder engineering challenges. The robot needs robotic arms and hands, force feedback, and manipulation planning. Moving from passive observation to active exploration is a qualitative leap in complexity.

The search capabilities also remain limited to controlled indoor environments. A kitchen with known furniture is very different from a cluttered garage or an unfamiliar room. How well the system generalizes across spaces, lighting conditions, and object types is still an open question.

But the underlying approach - combining spatial mapping with language-based reasoning about where objects belong - represents a practical step toward robots that can navigate human spaces meaningfully rather than mechanically. Schoellig described the capability as "important for all robots that move in spaces that are constantly changing." Humanoid factory workers, care robots in private homes, and search-and-rescue systems all need this kind of contextual understanding of their surroundings.

The work was published March 3, 2026, in IEEE Robotics and Automation Letters. It is one of the first demonstrations of a robot that integrates visual scene understanding with language model reasoning for a concrete, practical task. The broomstick on wheels may not look like much, but it knows where to look.

Source: Bogenberger, B. et al. "Where did I leave my glasses? Open-Vocabulary Semantic Exploration in Real-World Semi-Static Environments." IEEE Robotics and Automation Letters, March 3, 2026. Research conducted at the TUM Learning Systems and Robotics Lab, Munich Institute of Robotics and Machine Intelligence (MIRMI), Technical University of Munich.