
A language learning system that pays attention -- more efficiently than ever before

MIT researchers' new hardware and software system streamlines state-of-the-art sentence analysis

2021-02-10
(Press-News.org) Human language can be inefficient. Some words are vital. Others, expendable.

Reread the first sentence of this story. Just two words, "language" and "inefficient," convey almost the entire meaning of the sentence. The importance of key words underlies a popular new tool for natural language processing (NLP) by computers: the attention mechanism. When coded into a broader NLP algorithm, the attention mechanism homes in on key words rather than treating every word with equal importance. That yields better results in NLP tasks like detecting positive or negative sentiment or predicting which words should come next in a sentence.

The attention mechanism's accuracy often comes at the expense of speed and computing power, however. It runs slowly on general-purpose processors such as those found in consumer-grade computers. So, MIT researchers have designed a combined software-hardware system, dubbed SpAtten, specialized to run the attention mechanism. SpAtten enables more streamlined NLP with less computing power.

"Our system is similar to how the human brain processes language," says Hanrui Wang. "We read very fast and just focus on key words. That's the idea with SpAtten."

The research will be presented this month at the IEEE International Symposium on High-Performance Computer Architecture. Wang is the paper's lead author and a PhD student in the Department of Electrical Engineering and Computer Science. Co-authors include Zhekai Zhang and their advisor, Assistant Professor Song Han.

Since its introduction in 2015, the attention mechanism has been a boon for NLP. It's built into state-of-the-art NLP models like Google's BERT and OpenAI's GPT-3. The attention mechanism's key innovation is selectivity -- it can infer which words or phrases in a sentence are most important, based on comparisons with word patterns the algorithm has previously encountered in a training phase. Despite the attention mechanism's rapid adoption into NLP models, it's not without cost.
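The selectivity described above can be illustrated with a toy version of scaled dot-product attention, the formulation commonly used in transformer models. This is a minimal sketch, not SpAtten's code: the two-dimensional vectors are hand-picked, hypothetical stand-ins for values a real model would learn in training, and the function names are illustrative only.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # subtract peak for stability
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Single-query scaled dot-product attention over a list of tokens."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)  # how much attention each token receives
    dim = len(values[0])
    output = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, output

# Hypothetical, hand-picked vectors for the story's opening sentence.
tokens = ["human", "language", "can", "be", "inefficient"]
keys = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.1], [0.1, 0.1], [1.5, 1.0]]
query = [1.0, 1.0]

weights, output = attention(query, keys, keys)
top_two = sorted(range(len(tokens)), key=lambda i: weights[i], reverse=True)[:2]
print([tokens[i] for i in top_two])
```

With these made-up vectors, most of the attention weight lands on "language" and "inefficient," mirroring the key-word intuition from the opening of the story.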

NLP models require a hefty load of computing power, thanks in part to the high memory demands of the attention mechanism. "This part is actually the bottleneck for NLP models," says Wang. One challenge he points to is the lack of specialized hardware to run NLP models with the attention mechanism. General-purpose processors, like CPUs and GPUs, have trouble with the attention mechanism's complicated sequence of data movement and arithmetic. And the problem will get worse as NLP models grow more complex, especially for long sentences. "We need algorithmic optimizations and dedicated hardware to process the ever-increasing computational demand," says Wang.

The researchers developed a system called SpAtten to run the attention mechanism more efficiently. Their design encompasses both specialized software and hardware. One key software advance is SpAtten's use of "cascade pruning," or eliminating unnecessary data from the calculations. Once the attention mechanism helps pick a sentence's key words (called tokens), SpAtten prunes away unimportant tokens and eliminates the corresponding computations and data movements. The attention mechanism also includes multiple computation branches (called heads). Similar to tokens, the unimportant heads are identified and pruned away. Once dispatched, the extraneous tokens and heads don't factor into the algorithm's downstream calculations, reducing both computational load and memory access.
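The token-pruning idea can be sketched in a few lines. This is a hedged illustration rather than the researchers' implementation: it assumes a token's importance is the attention it accumulates across layers, it replaces a real attention layer with a hypothetical `toy_attention` driven by made-up salience scores, and the `keep_ratio` value is arbitrary. Head pruning could follow the same rank-and-drop pattern but is not shown.

```python
import math

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def toy_attention(salience, active):
    """Hypothetical stand-in for an attention layer: every surviving token
    spreads its attention over the surviving tokens by softmax(salience)."""
    row = softmax([salience[i] for i in active])
    return [row[:] for _ in active]

def cascade_token_pruning(salience, num_layers, keep_ratio):
    """Accumulate attention per token, then keep only the top-ranked tokens.
    A token pruned at one layer never reappears in later layers (the cascade)."""
    active = list(range(len(salience)))
    importance = [0.0] * len(salience)
    for _ in range(num_layers):
        probs = toy_attention(salience, active)
        for row in probs:
            for col, p in enumerate(row):
                importance[active[col]] += p  # attention received by this token
        k = max(1, math.ceil(len(active) * keep_ratio))
        ranked = sorted(active, key=lambda i: importance[i], reverse=True)
        active = sorted(ranked[:k])  # survivors, kept in sentence order
    return active, importance

tokens = ["human", "language", "can", "be", "inefficient"]
salience = [0.5, 2.0, 0.1, 0.0, 1.8]  # made-up scores for illustration
kept, importance = cascade_token_pruning(salience, num_layers=2, keep_ratio=0.5)
kept_tokens = [tokens[i] for i in kept]
print(kept_tokens)
```

Because later layers only ever see the surviving tokens, each pruning step shrinks both the arithmetic and the data that has to be fetched for every layer that follows.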

To further trim memory use, the researchers also developed a technique called "progressive quantization." The method allows the algorithm to wield data in smaller bitwidth chunks and fetch as few as possible from memory. Lower data precision, corresponding to smaller bitwidth, is used for simple sentences, and higher precision is used for complicated ones. Intuitively it's like fetching the phrase "cmptr progm" as the low-precision version of "computer program."
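One way to picture progressive quantization is as a two-pass computation: try a low-precision pass first, and fall back to higher precision only when the result looks uncertain. The sketch below rests on loudly stated assumptions: the 4-bit and 8-bit widths, the 0.5 confidence threshold, and the choice to quantize attention scores directly (rather than the data fetched from memory) are all hypothetical simplifications, not details taken from the article.

```python
import math

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def quantize(values, bits, max_abs):
    """Uniform symmetric quantization: snap values to signed `bits`-bit levels."""
    levels = 2 ** (bits - 1) - 1
    step = max_abs / levels
    return [round(v / step) * step for v in values]

def progressive_attention_probs(scores, low_bits=4, high_bits=8, confidence=0.5):
    """Use low precision when one token clearly dominates; otherwise redo the
    computation at higher precision (assumed fallback rule)."""
    max_abs = max(abs(s) for s in scores) or 1.0  # avoid a zero step size
    probs = softmax(quantize(scores, low_bits, max_abs))
    if max(probs) >= confidence:
        return probs, low_bits  # the coarse result is already decisive
    return softmax(quantize(scores, high_bits, max_abs)), high_bits

# A "simple" case with one dominant token, and a "complicated" flat case.
_, bits_peaked = progressive_attention_probs([6.0, 0.5, 0.2, 0.1])
_, bits_flat = progressive_attention_probs([1.0, 0.9, 1.1, 1.0])
print(bits_peaked, bits_flat)
```

In this toy version the peaked example stays at the low bitwidth, while the flat example, where small rounding errors could change which token wins, triggers the higher-precision pass.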

Alongside these software advances, the researchers also developed a hardware architecture specialized to run SpAtten and the attention mechanism while minimizing memory access. Their architecture design employs a high degree of "parallelism," meaning multiple operations are processed simultaneously on multiple processing elements, which is useful because the attention mechanism analyzes every word of a sentence at once. The design enables SpAtten to rank the importance of tokens and heads (for potential pruning) in a small number of computer clock cycles. Overall, the software and hardware components of SpAtten combine to eliminate unnecessary or inefficient data manipulation, focusing only on the tasks needed to complete the user's goal.

The philosophy behind the system is captured in its name. SpAtten is a portmanteau of "sparse attention," and the researchers note in the paper that SpAtten is "homophonic with 'spartan,' meaning simple and frugal." Wang says, "That's just like our technique here: making the sentence more concise." That concision was borne out in testing.

The researchers coded a simulation of SpAtten's hardware design -- they haven't fabricated a physical chip yet -- and tested it against competing general-purpose processors. SpAtten ran more than 100 times faster than the next best competitor (a TITAN Xp GPU). Further, SpAtten was more than 1,000 times more energy efficient than competitors, indicating that SpAtten could help trim NLP's substantial electricity demands.

The researchers also integrated SpAtten into their previous work, to help validate their philosophy that hardware and software are best designed in tandem. They built a specialized NLP model architecture for SpAtten, using their Hardware-Aware Transformer (HAT) framework, and achieved a roughly twofold speedup over a more general model.

The researchers think SpAtten could be useful to companies that employ NLP models for the majority of their artificial intelligence workloads. "Our vision for the future is that new algorithms and hardware that remove the redundancy in languages will reduce cost and save on the power budget for data center NLP workloads," says Wang.

On the opposite end of the spectrum, SpAtten could bring NLP to smaller, personal devices. "We can improve the battery life for mobile phone or IoT devices," says Wang, referring to internet-connected "things" -- televisions, smart speakers, and the like. "That's especially important because in the future, numerous IoT devices will interact with humans by voice and natural language, so NLP will be the first application we want to employ."

Han says SpAtten's focus on efficiency and redundancy removal is the way forward in NLP research. "Human brains are sparsely activated [by key words]. NLP models that are sparsely activated will be promising in the future," he says. "Not all words are equal -- pay attention only to the important ones."

INFORMATION:

Written by Dan Ackerman, MIT News Office

Additional background

Paper: "SpAtten: Efficient Sparse Attention Architecture with Cascade Token and Head Pruning" https://arxiv.org/pdf/2012.09852.pdf

SpAtten Project Page: https://spatten.mit.edu


