Medicine 2026-02-19 4 min read

Lab-Grown Brain Tissue Learned to Balance a Pole - Then Forgot Everything After a Rest

UC Santa Cruz researchers used reinforcement learning to train mouse brain organoids on the classic cart-pole control problem, achieving a 46% win rate versus 4.5% for randomly coached controls

Brain organoids are tiny clusters of neural tissue - smaller than a peppercorn - grown in laboratory dishes from stem cells. They form spontaneously into structures that mimic early brain development, generating networks of several million neurons that produce electrical signals, form connections, and in some ways behave like developing cortical tissue. For years, researchers have used them primarily as model systems for studying disease. A UC Santa Cruz team has now used them to ask a different kind of question: can this tissue actually learn?

The answer, based on a paper published in Cell Reports, appears to be a qualified yes - with a sharp caveat about forgetting. The team, led by Ash Robbins, Mircea Teodorescu, and David Haussler, trained mouse-derived brain organoids to improve at the cart-pole problem, a classic engineering benchmark in which a control system must keep a vertical pole balanced on a moving cart. Using electrical stimulation to send information about the pole's angle to the organoid, and interpreting the organoid's returning signals as force commands, the researchers coached the tissue to significantly improve its performance over time.

What the cart-pole problem tests

The cart-pole problem is a standard test of adaptive control. A pole is attached by a hinge to a cart that can move left or right along a track. The control system - in this case, the organoid - receives information about which way the pole is tipping and decides how to move the cart to keep it upright. The task requires continuous adjustment and feedback processing. It is the kind of problem every human infant solves in a cruder form when learning to stand, and it has been used as a benchmark in robotics and artificial intelligence for decades because it is simple enough to analyze but demanding enough to require genuine adaptive processing.

The researchers translated the physical dynamics into electrical signals: stronger signals for larger tilt angles, with the direction of the signal encoding which way the pole was falling. The organoid's responses - patterns of neural firing transmitted back through the interface - were interpreted as force commands applied to the virtual cart.
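The mapping described above can be sketched in a few lines of Python. Everything here is illustrative: the maximum-angle constant, the two-channel firing-rate readout, and the `gain` parameter are assumptions made for the sketch, not values reported in the study.

```python
def encode_pole_state(angle_rad, max_angle_rad=0.21):
    """Map the pole's tilt to a stimulation command (illustrative).

    Returns (side, amplitude): side says which way the pole is falling,
    and amplitude grows with the tilt angle, capped at 1.0 (the
    maximum stimulation strength in this sketch).
    """
    side = "left" if angle_rad < 0 else "right"
    amplitude = min(abs(angle_rad) / max_angle_rad, 1.0)
    return side, amplitude


def decode_response(firing_rate_left, firing_rate_right, gain=10.0):
    """Interpret two population firing rates as a signed force on the
    virtual cart (illustrative): more right-side firing pushes right."""
    return gain * (firing_rate_right - firing_rate_left)
```

In a closed loop, each simulation step would call `encode_pole_state` to drive stimulation and `decode_response` to turn the recorded activity back into a cart force.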

How the training worked

Training followed a reinforcement learning protocol. Each attempt to balance the pole was called an episode. After each episode ended - when the pole fell - the researchers assessed whether the organoid had improved its average balance time over recent episodes. If performance was improving, no training signal was sent. If it was not improving, a targeted electrical signal was applied to specific neurons selected by a reinforcement learning algorithm to nudge the network toward better behavior.

"You could think of it like an artificial coach that says, 'you're doing it wrong, tweak it a little bit in this way,'" said Robbins. "We're learning how to best give it these coaching signals."

The key result: organoids trained with this adaptive coaching approach won the game - keeping the pole upright beyond a minimum success threshold - at a rate of 46%. Organoids coached at random, without the reinforcement learning algorithm selecting which neurons to target, achieved a win rate of only 4.5%. The difference is statistically meaningful and demonstrates that targeted, feedback-driven training shaped the neural network toward better performance.

From an engineering view

"What makes this powerful is that we can measure, stimulate, and adapt in the same system," said ECE Professor Teodorescu. "This is not just recording neural activity. It is a closed-loop bioelectrical interface where the tissue's response directly shapes the next input. That is what allows us to study learning as a physical process, which has been very difficult to study directly in intact brains."

The platform was made possible in part by an electrophysiology system developed by industry partner Maxwell Biosciences, which allows simultaneous stimulation and recording from thousands of electrode sites beneath the organoid tissue. That density of access is what enables the closed-loop feedback architecture.

The forgetting problem

The most significant limitation of the current results is also one of the most scientifically interesting findings: the organoids appear to forget what they learn. After 15 minutes of active pole-balancing training, the organoid rested for 45 minutes. When training resumed, performance had dropped back to baseline. The learning did not persist across the rest period.

"It is likely that more sophisticated organoids, perhaps grown to include multiple brain regions involved in animal learning, will be needed to recapitulate the kind of long-term adaptive performance improvement we see in animals," said Distinguished Professor David Haussler.

In biological brains, long-term memory involves structural changes at synapses - the connections between neurons - and interactions between different brain regions including the hippocampus. Current organoids lack that architecture. They are, as one outside researcher noted, "incredibly minimal neural circuits" with no dopamine system, no sensory experience, and no multi-region connectivity. That they show any learning at all is the surprising finding; that it is short-lived is the expected consequence of their simplicity.

What this work is - and is not - for

The researchers were careful to define the scope of the project. The goal is to understand how neurons encode and process information, and how neurological diseases disrupt that capacity. Conditions including Alzheimer's disease, Parkinson's disease, autism, schizophrenia, and ADHD all alter the brain's ability to learn and adapt. Having a tractable model system - one where variables can be precisely controlled and observed - could generate new insights into those disruptions.

"Our goal is to advance brain research and the treatment of neurological diseases, not to replace robotic controllers and other kinds of computers with lab-grown animal brain tissues," Haussler emphasized. "The latter might be considered cool, but would bring up serious ethical issues, especially if human brain organoids were used."

To enable wider adoption of this research approach, Robbins developed an open-source software tool called BrainDance, designed to let any lab with the biological capacity to culture organoids run neural stimulation learning experiments without building the software infrastructure from scratch.

Source: University of California, Santa Cruz. Study published in Cell Reports. Lead researcher: Ash Robbins; co-senior authors: Mircea Teodorescu and David Haussler, UC Santa Cruz. Media contact: Emily Cerf, ecerf@ucsc.edu.