PRESS-NEWS.org - Press Release Distribution

Extracting audio from visual information

Algorithm recovers speech from the vibrations of a potato-chip bag filmed through soundproof glass

2014-08-04
(Press-News.org) Researchers at MIT, Microsoft, and Adobe have developed an algorithm that can reconstruct an audio signal by analyzing minute vibrations of objects depicted in video. In one set of experiments, they were able to recover intelligible speech from the vibrations of a potato-chip bag photographed from 15 feet away through soundproof glass.

In other experiments, they extracted useful audio signals from videos of aluminum foil, the surface of a glass of water, and even the leaves of a potted plant. The researchers will present their findings in a paper at this year's Siggraph, the premier computer graphics conference.

"When sound hits an object, it causes the object to vibrate," says Abe Davis, a graduate student in electrical engineering and computer science at MIT and first author on the new paper. "The motion of this vibration creates a very subtle visual signal that's usually invisible to the naked eye. People didn't realize that this information was there."

Joining Davis on the Siggraph paper are Frédo Durand and Bill Freeman, both MIT professors of computer science and engineering; Neal Wadhwa, a graduate student in Freeman's group; Michael Rubinstein of Microsoft Research, who did his PhD with Freeman; and Gautham Mysore of Adobe Research.

Reconstructing audio from video requires that the frequency of the video samples (the number of frames of video captured per second) be higher than the frequency of the audio signal; strictly speaking, the frame rate must be more than twice the highest audio frequency to be recovered. In some of their experiments, the researchers used a high-speed camera that captured 2,000 to 6,000 frames per second. That's much faster than the 60 frames per second possible with some smartphones, but well below the frame rates of the best commercial high-speed cameras, which can top 100,000 frames per second.
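
The frame-rate requirement can be made concrete with the sampling theorem: uniform sampling represents frequencies only up to half the sampling rate, and higher tones fold back ("alias") to lower ones. The following is a minimal Python sketch of that arithmetic; the function names and the 440 Hz test tone are illustrative assumptions, not taken from the researchers' work.

```python
def max_recoverable_hz(frames_per_second):
    """Nyquist limit: the highest frequency that uniform sampling can represent."""
    return frames_per_second / 2.0


def aliased_hz(tone_hz, frames_per_second):
    """Apparent frequency of a pure tone after sampling at the given frame rate."""
    folded = tone_hz % frames_per_second
    # Frequencies above the Nyquist limit fold back below it.
    return min(folded, frames_per_second - folded)


# Hypothetical 440 Hz tone seen at the frame rates mentioned in the article.
for fps in (60, 2000, 6000):
    print(fps, max_recoverable_hz(fps), aliased_hz(440.0, fps))
```

Under these assumptions, a 440 Hz tone is captured directly at 2,000 or 6,000 frames per second, but at 60 frames per second it aliases to 20 Hz.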

Commodity hardware

In other experiments, however, they used an ordinary digital camera. Because of a quirk in the design of most cameras' sensors, the researchers were able to infer information about high-frequency vibrations even from video recorded at a standard 60 frames per second. While this audio reconstruction wasn't as faithful as it was with the high-speed camera, it may still be good enough to identify the gender of a speaker in a room; the number of speakers; and even, given accurate enough information about the acoustic properties of speakers' voices, their identities.

The researchers' technique has obvious applications in law enforcement and forensics, but Davis is more enthusiastic about the possibility of what he describes as a "new kind of imaging."

"We're recovering sounds from objects," he says. "That gives us a lot of information about the sound that's going on around the object, but it also gives us a lot of information about the object itself, because different objects are going to respond to sound in different ways." In ongoing work, the researchers have begun trying to determine material and structural properties of objects from their visible response to short bursts of sound.

In the experiments reported in the Siggraph paper, the researchers also measured the mechanical properties of the objects they were filming and determined that the motions they were measuring were about a tenth of a micrometer. That corresponds to five thousandths of a pixel in a close-up image, but from the change of a single pixel's color value over time, it's possible to infer motions smaller than a pixel.

Suppose, for instance, that an image has a clear boundary between two regions: Everything on one side of the boundary is blue; everything on the other is red. But at the boundary itself, the camera's sensor receives both red and blue light, so it averages them out to produce purple. If, over successive frames of video, the blue region encroaches into the red region — even less than the width of a pixel — the purple will grow slightly bluer. That color shift contains information about the degree of encroachment.
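
The boundary example amounts to a linear mixing model that can be inverted. The sketch below is illustrative only: it assumes an ideal, noise-free sensor that blends the two colors in proportion to the area each covers, and all names and color values are hypothetical. It also works through the figures quoted earlier: if 0.1 micrometer of motion spans 0.005 pixel, one pixel covers about 20 micrometers of the object.

```python
RED = (255.0, 0.0, 0.0)
BLUE = (0.0, 0.0, 255.0)


def boundary_pixel_color(blue_fraction):
    """Color of a pixel whose area is partly blue and otherwise red (ideal linear mix)."""
    return tuple(blue_fraction * b + (1.0 - blue_fraction) * r
                 for r, b in zip(RED, BLUE))


def estimate_blue_fraction(observed):
    """Invert the mix by projecting the observed color onto the red-to-blue axis."""
    axis = [b - r for r, b in zip(RED, BLUE)]
    offset = [o - r for o, r in zip(observed, RED)]
    return sum(a * d for a, d in zip(axis, offset)) / sum(a * a for a in axis)


def micrometers_per_pixel(motion_um, motion_px):
    """Object distance covered by one pixel, from a motion given in both units."""
    return motion_um / motion_px


# Assumed scenario: the blue region advances by 0.005 pixel between two frames.
before = boundary_pixel_color(0.500)
after = boundary_pixel_color(0.505)
shift_px = estimate_blue_fraction(after) - estimate_blue_fraction(before)
shift_um = shift_px * micrometers_per_pixel(0.1, 0.005)
print(f"shift: {shift_px:.4f} pixel, about {shift_um:.2f} micrometers")
```

A real sensor adds noise and quantizes color values, so an estimate from one pixel would be unreliable on its own.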

Putting it together

Some boundaries in an image are fuzzier than a single pixel in width, however. So the researchers borrowed a technique from earlier work on algorithms that amplify minuscule variations in video, making visible previously undetectable motions: the breathing of an infant in the neonatal ward of a hospital, or the pulse in a subject's wrist.

That technique passes successive frames of video through a battery of image filters, which are used to measure fluctuations, such as the changing color values at boundaries, at several different orientations — say, horizontal, vertical, and diagonal — and several different scales.
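
As a rough illustration of filtering at several orientations and scales, the sketch below applies finite differences in three directions to an image and to a half-resolution copy. This is a deliberately simple stand-in: the text does not specify the filters actually used, and every name here is hypothetical.

```python
# Pixel offsets standing in for oriented filters (an assumption for illustration).
OFFSETS = {"horizontal": (0, 1), "vertical": (1, 0), "diagonal": (1, 1)}


def downsample(image):
    """Halve the resolution by averaging each 2x2 block (image is a list of rows)."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [[(image[2 * r][2 * c] + image[2 * r][2 * c + 1]
              + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(cols)]
            for r in range(rows)]


def oriented_differences(image, offset):
    """Difference between each pixel and its neighbor in the given direction."""
    dr, dc = offset
    return [[image[r + dr][c + dc] - image[r][c]
             for c in range(len(image[0]) - dc)]
            for r in range(len(image) - dr)]


def filter_bank(image, num_scales=2):
    """Responses keyed by (scale, orientation); scale 0 is full resolution."""
    responses = {}
    current = image
    for scale in range(num_scales):
        for name, offset in OFFSETS.items():
            responses[(scale, name)] = oriented_differences(current, offset)
        current = downsample(current)
    return responses


# A 4x4 test image with a vertical boundary between dark (0) and bright (1).
image = [[0, 0, 1, 1] for _ in range(4)]
responses = filter_bank(image)
print(responses[(0, "horizontal")][0], responses[(1, "horizontal")])
```

Running the same bank on successive frames and comparing the responses over time would expose the fluctuations the paragraph describes.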

The researchers developed an algorithm that combines the output of the filters to infer the motions of an object as a whole when it's struck by sound waves. Different edges of the object may be moving in different directions, so the algorithm first aligns all the measurements so that they won't cancel each other out. And it gives greater weight to measurements made at very distinct edges — clear boundaries between different color values.
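
The align-then-weight idea can be sketched as follows. This is a simplified illustration, not the researchers' algorithm: it assumes alignment amounts to flipping the sign of any signal that opposes a reference signal, and that each signal comes with a given positive edge-strength number used as its weight. Both choices and all names are assumptions.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def combine_local_motions(signals, edge_strengths):
    """Sign-align each local motion signal to the strongest-edge signal,
    then take an average weighted by edge strength (weights must be positive)."""
    strongest = max(range(len(signals)), key=lambda i: edge_strengths[i])
    reference = signals[strongest]
    total_weight = sum(edge_strengths)
    combined = [0.0] * len(reference)
    for signal, weight in zip(signals, edge_strengths):
        # Flip signals that move opposite to the reference so they do not cancel.
        sign = -1.0 if dot(signal, reference) < 0 else 1.0
        for t, value in enumerate(signal):
            combined[t] += sign * weight * value / total_weight
    return combined


# Hypothetical measurements of one vibration at three edges over four frames.
tone = [1.0, -1.0, 1.0, -1.0]
signals = [tone, [-x for x in tone], [0.5 * x for x in tone]]
edge_strengths = [3.0, 2.0, 1.0]

naive = [sum(column) / len(signals) for column in zip(*signals)]
combined = combine_local_motions(signals, edge_strengths)
print(naive, combined)
```

In this toy case the plain average nearly cancels the opposing signals, while the aligned, weighted average keeps most of the vibration's amplitude.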

The researchers also produced a variation on the algorithm for analyzing conventional video. The sensor of a digital camera consists of an array of photodetectors — millions of them, even in commodity devices. As it turns out, it's less expensive to design the sensor hardware so that it reads off the measurements of one row of photodetectors at a time. Ordinarily, that's not a problem, but with fast-moving objects, it can lead to odd visual artifacts. An object — say, the rotor of a helicopter — may actually move detectably between the reading of one row and the reading of the next.

For Davis and his colleagues, this bug is a feature. Slight distortions of the edges of objects in conventional video, though invisible to the naked eye, contain information about the objects' high-frequency vibration. And that information is enough to yield a murky but potentially useful audio signal.
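
The row-by-row readout means each row is, in effect, a separate sample in time. The sketch below computes those sample times under assumed values: the 1080-row sensor, the `readout_fraction` parameter (the share of each frame period spent reading rows), and the function names are illustrative, not figures from the research.

```python
def row_sample_times(frame_rate_hz, rows_per_frame, readout_fraction=1.0, num_frames=1):
    """Time in seconds at which each sensor row is read, frame after frame."""
    frame_period = 1.0 / frame_rate_hz
    line_delay = readout_fraction * frame_period / rows_per_frame
    return [frame * frame_period + row * line_delay
            for frame in range(num_frames)
            for row in range(rows_per_frame)]


def effective_row_rate_hz(frame_rate_hz, rows_per_frame, readout_fraction=1.0):
    """Rate at which rows are read while the sensor is reading out a frame."""
    return frame_rate_hz * rows_per_frame / readout_fraction


# Assumed example: 60 frames per second and 1080 rows read across the full frame period.
print(effective_row_rate_hz(60, 1080))
print(row_sample_times(60, 4, num_frames=2))
```

When `readout_fraction` is below 1, the rows are read in a burst followed by a gap before the next frame, so the samples are not evenly spaced in time.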

Written by Larry Hardesty, MIT News Office

Related Links

'Researchers amplify variations in video, making the invisible visible' http://newsoffice.mit.edu/2012/amplifying-invisible-video-0622

'Seeing the human pulse' http://newsoffice.mit.edu/2013/seeing-the-human-pulse-0620

