PRESS-NEWS.org - Press Release Distribution

Audio-guided self-supervised learning for disentangled visual speech representations
2025-01-07

Learning visual speech representations from talking face videos is an important problem for several speech-related tasks, such as lip reading, talking face generation, and audio-visual speech separation. The key difficulty lies in handling the speech-irrelevant factors present in the videos, such as lighting, resolution, viewpoint, and head motion.

To address this problem, a research team led by Shuang YANG published their new research on 15 December 2024 in Frontiers of Computer Science, co-published by Higher Education Press and Springer Nature.

The team proposes to disentangle speech-relevant and speech-irrelevant facial movements from videos in a self-supervised manner. The proposed method learns discriminative, disentangled speech representations from videos and can benefit the lip reading task through a straightforward technique such as knowledge distillation. Both qualitative and quantitative results on the popular visual speech datasets LRW and LRS2-BBC demonstrate the effectiveness of the method.
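As a rough illustration of what knowledge distillation means in general, the sketch below computes the standard temperature-softened distillation loss between a teacher's and a student's class scores. It is not the team's implementation: the release does not say which model acts as teacher or student, and the function names, the temperature value, and the toy logits are all assumptions made for this example.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last axis, softened by a temperature."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, averaged over the batch.

    The T^2 factor is the usual scaling that keeps gradient magnitudes
    comparable across temperatures. The temperature here is an assumed value.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    eps = 1e-12  # avoids log(0)
    kl = np.sum(p_teacher * (np.log(p_teacher + eps) - np.log(p_student + eps)), axis=-1)
    return float(kl.mean() * temperature ** 2)

# Toy word-classification scores for a batch of 2 clips and 3 classes (made up).
teacher = np.array([[4.0, 1.0, 0.5], [0.2, 3.0, 0.1]])
student = np.array([[2.0, 1.5, 0.5], [0.5, 1.0, 0.9]])
loss_value = distillation_loss(student, teacher)
```

The loss is zero when the student reproduces the teacher's softened distribution exactly and grows as the two diverge, which is what lets one model transfer what it has learned to another.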

In this work, the researchers observe the speech process and find that speech-relevant and speech-irrelevant facial movements differ in their frequency of occurrence. Specifically, speech-relevant facial movements always occur at a higher frequency than speech-irrelevant ones. Moreover, the researchers find that speech-relevant facial movements are consistently synchronized with the accompanying audio speech signal.

Based on these observations, the researchers introduce a novel two-branch network that decomposes the visual changes between two frames of the same video into speech-relevant and speech-irrelevant components. For the speech-relevant branch, they introduce the high-frequency audio signal to guide the learning of speech-relevant cues. For the speech-irrelevant branch, they introduce an information bottleneck that restricts its capacity, preventing it from acquiring high-frequency, fine-grained speech-relevant information.
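The two-branch idea can be sketched in a few lines. This is a minimal, untrained forward pass under stated assumptions, not the published architecture: the feature sizes, the random linear layers, the cosine-based audio alignment term, and every name below are hypothetical and chosen only to show how a wide audio-guided branch and a narrow bottleneck branch could split the change between two frames.

```python
import numpy as np

rng = np.random.default_rng(0)

FEAT_DIM = 64        # hypothetical size of a per-frame visual feature
SPEECH_DIM = 32      # hypothetical width of the speech-relevant code
BOTTLENECK_DIM = 4   # deliberately narrow speech-irrelevant code

# Random weights stand in for trained layers (assumption for illustration).
W_speech = rng.normal(scale=0.1, size=(FEAT_DIM, SPEECH_DIM))
W_other = rng.normal(scale=0.1, size=(FEAT_DIM, BOTTLENECK_DIM))
W_dec = rng.normal(scale=0.1, size=(SPEECH_DIM + BOTTLENECK_DIM, FEAT_DIM))

def decompose(frame_a, frame_b):
    """Encode the change between two frames into two separate codes."""
    change = frame_b - frame_a
    speech_code = np.tanh(change @ W_speech)  # wide branch, guided by audio
    other_code = np.tanh(change @ W_other)    # narrow branch, the bottleneck
    return speech_code, other_code

def reconstruct(frame_a, speech_code, other_code):
    """Predict the second frame from the first frame plus both codes."""
    codes = np.concatenate([speech_code, other_code], axis=-1)
    return frame_a + codes @ W_dec

def audio_alignment_loss(speech_code, audio_feat):
    """One minus mean cosine similarity between visual and audio codes."""
    num = np.sum(speech_code * audio_feat, axis=-1)
    den = np.linalg.norm(speech_code, axis=-1) * np.linalg.norm(audio_feat, axis=-1) + 1e-8
    return float(np.mean(1.0 - num / den))

# Toy batch: 8 frame pairs and matching audio features (all random).
frame_a = rng.normal(size=(8, FEAT_DIM))
frame_b = rng.normal(size=(8, FEAT_DIM))
audio_feat = rng.normal(size=(8, SPEECH_DIM))

speech_code, other_code = decompose(frame_a, frame_b)
predicted_b = reconstruct(frame_a, speech_code, other_code)
recon_loss = float(np.mean((predicted_b - frame_b) ** 2))
align_loss = audio_alignment_loss(speech_code, audio_feat)
```

In this sketch the bottleneck is simply a much smaller code width, so the speech-irrelevant branch has too little capacity to carry fine-grained detail, while the alignment term pulls the wide branch toward whatever co-varies with the audio.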

Future work can focus on exploring more explicit auxiliary tasks and constraints beyond the reconstruction task to capture speech cues from videos. It would also be worthwhile to combine multiple types of knowledge representations to enhance the learned speech representations.

DOI: 10.1007/s11704-024-3787-8

END


