Medicine · 2026-03-10 · 4 min read

Scientists reconstructed movies from mouse brain activity, pixel by pixel

A neural decoding model built on single-cell recordings in the visual cortex produced high-quality video reconstructions, with accuracy improving as more neurons were included.

A mouse watches a video. Somewhere in its visual cortex, thousands of neurons fire in patterns shaped by what the animal sees. Now, for the first time, researchers have reversed that process: reading those neural firing patterns and reconstructing the video the mouse was watching, pixel by pixel, with no information other than brain activity.

The study, led by Dr. Joel Bauer at the Sainsbury Wellcome Centre at University College London and published in eLife, demonstrates that single-cell recordings from the mouse visual cortex contain enough information to produce recognizable, high-quality video reconstructions. The work represents a significant advance in neural decoding, the attempt to read out what the brain is representing from its electrical activity.

Why mice, and why single cells

Most previous attempts at visual reconstruction from brain activity have used human fMRI data. Functional magnetic resonance imaging measures blood flow changes across brain regions, providing a coarse, indirect signal that averages the activity of millions of neurons. The resolution is limited by the size of each imaging voxel, typically a few cubic millimeters.

Mouse single-cell recordings offer a fundamentally different level of detail. Using calcium imaging, which detects fluorescent signals when individual neurons fire, the researchers could track the activity of hundreds of specific cells in the visual cortex simultaneously. This cell-level precision provides a much richer dataset for decoding, though it comes with its own constraints: you cannot do this in humans.

Building the decoder

The team used a dynamic neural encoding model originally developed for the 2023 Sensorium Competition, which predicts how individual neurons will respond to a given visual stimulus while also accounting for the mouse's own behavior, including body movements and pupil diameter. The UCL team adapted this model for the inverse problem: given the neural activity, reconstruct the stimulus.
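To make the idea concrete, here is a minimal sketch of what such an encoding model might look like, written as a small convolutional network in PyTorch. The class name, layer sizes, and behavioural inputs are illustrative placeholders, not the architecture used in the study or in the Sensorium competition.

```python
import torch
import torch.nn as nn

class ToyEncodingModel(nn.Module):
    """Illustrative encoder: maps a video frame plus behavioural covariates
    (e.g. running speed, pupil diameter) to predicted responses for a
    population of recorded neurons."""

    def __init__(self, frame_shape=(36, 64), n_behaviour=2, n_neurons=500):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv2d(16, 16, kernel_size=5, padding=2), nn.ReLU(),
            nn.Flatten(),
        )
        n_feat = 16 * frame_shape[0] * frame_shape[1]
        # Behaviour enters as extra inputs alongside the visual features.
        self.readout = nn.Linear(n_feat + n_behaviour, n_neurons)

    def forward(self, frame, behaviour):
        # frame: (batch, 1, H, W); behaviour: (batch, n_behaviour)
        x = self.features(frame)
        x = torch.cat([x, behaviour], dim=1)
        return nn.functional.softplus(self.readout(x))  # non-negative responses
```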

The approach works by starting with a blank screen and iteratively updating each pixel based on the difference between predicted neural activity for the current image and the actual recorded activity. Through successive refinements, the algorithm converges on a video frame that would produce neural responses matching what was actually observed. The process runs independently for each time point, producing a complete reconstructed video.
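One common way to realise this kind of pixel-by-pixel refinement is gradient descent on the frame through the encoding model. The sketch below follows that pattern using the illustrative encoder above; it is an assumption about the general approach, not the paper's exact update rule.

```python
import torch

def reconstruct_frame(model, recorded_activity, behaviour,
                      frame_shape=(36, 64), n_steps=500, lr=0.05):
    """Optimise one frame so the encoder's predicted responses match the
    recorded responses for that time point."""
    # Start from a blank (mid-grey) screen; the pixels are the free parameters.
    frame = torch.full((1, 1, *frame_shape), 0.5, requires_grad=True)
    optimiser = torch.optim.Adam([frame], lr=lr)

    for _ in range(n_steps):
        optimiser.zero_grad()
        predicted = model(frame, behaviour)
        loss = torch.mean((predicted - recorded_activity) ** 2)
        loss.backward()                   # mismatch drives the per-pixel update
        optimiser.step()
        with torch.no_grad():
            frame.clamp_(0.0, 1.0)        # keep pixel values in a valid range

    return frame.detach().squeeze()
```

Running this once per time point and stacking the resulting frames yields the reconstructed video; the behavioural covariates come from the same recording session as the neural activity.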

The model was trained on data from mice watching one set of videos, then tested on its ability to reconstruct a separate 10-second video clip the model had never seen during training. This held-out validation is critical: without it, the system might simply be memorizing training data rather than genuinely decoding neural representations.

What came out, and what was lost

The reconstructed videos captured the overall structure and motion of the original clips. Objects, edges, and movement patterns were recognizable. The team quantified accuracy using pixel-level correlation between original and reconstructed frames, finding strong correspondence with minimal timing drift between the two videos.
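A simple version of such a frame-by-frame score, assuming Pearson correlation on flattened pixel values (the article does not specify the exact metric implementation), might look like this:

```python
import numpy as np

def framewise_correlation(original, reconstructed):
    """Pearson correlation between matching frames, averaged over the clip.
    Both inputs: arrays of shape (n_frames, height, width)."""
    scores = []
    for orig, recon in zip(original, reconstructed):
        o = orig.ravel().astype(float)
        r = recon.ravel().astype(float)
        scores.append(np.corrcoef(o, r)[0, 1])
    return float(np.mean(scores))
```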

Reconstruction quality improved as more neurons were included in the analysis, demonstrating that the method benefits from richer neural data rather than simply extracting information from a few key cells. This is an important finding: it suggests that visual information in the mouse cortex is distributed across populations of neurons rather than concentrated in a small number of highly informative cells.
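One way to probe this kind of scaling, sketched under the assumption of a generic decoder function and the correlation score above (both placeholders, not the study's code), is to reconstruct from random neuron subsets of increasing size:

```python
import numpy as np

def quality_vs_population_size(activity, decode_fn, original,
                               sizes=(50, 100, 200, 400), seed=0):
    """Score reconstructions made from random neuron subsets of increasing size.
    activity: (n_timepoints, n_neurons) array; decode_fn is a placeholder that
    maps an activity array to a reconstructed video."""
    rng = np.random.default_rng(seed)
    results = {}
    for k in sizes:
        subset = rng.choice(activity.shape[1], size=k, replace=False)
        reconstruction = decode_fn(activity[:, subset])
        results[k] = framewise_correlation(original, reconstruction)
    return results
```

If the distributed-coding interpretation holds, the scores should rise steadily with subset size rather than saturating after a handful of cells.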

Resolution remains a limitation. The reconstructed videos captured coarse spatial structure better than fine detail. Small features and sharp edges were smoothed out, a reflection of both the neural encoding model's assumptions and the finite number of recorded neurons.

The gap between representation and reality

The researchers are particularly interested in what the reconstructions get wrong. The brain does not maintain a perfect pixel-for-pixel copy of the visual world. Neural representations are shaped by attention, expectations, and the specific features that each brain region is tuned to emphasize. By comparing reconstructed videos to original stimuli, researchers can begin to map where and how the brain's internal model diverges from external reality.

These divergences are not errors in the colloquial sense. They reflect the brain's active processing of visual information, selectively amplifying some features and suppressing others. Understanding these distortions could illuminate fundamental questions about visual perception: why we see certain optical illusions, how we fill in the missing information at our blind spots, and how context shapes what we think we see.

What this cannot do yet

The technique currently works only in mice under controlled laboratory conditions, with surgically implanted imaging windows and precisely calibrated visual stimuli. Translating any version of this approach to humans would require non-invasive neural recording methods with far better spatial resolution than current fMRI or EEG technology provides.

The reconstructions, while impressive, are of simple video clips shown to restrained animals. Whether the same approach could decode neural representations of natural, freely viewed visual scenes, which involve eye movements, depth perception, and three-dimensional spatial processing, is an open question.

The ethical implications of reading visual experience from brain activity, even in animals, deserve careful consideration as the technology advances. For now, the primary applications are in basic neuroscience: understanding how visual information is encoded, transformed, and represented across neural populations.

Source: Bauer, J. et al. Published in eLife (2026). Sainsbury Wellcome Centre, University College London.