(Press-News.org) NEW YORK – The era of artificial-intelligence chatbots that seem to understand and use language the way we humans do has begun. Under the hood, these chatbots use large language models, a particular kind of neural network. But a new study shows that large language models remain vulnerable to mistaking nonsense for natural language. To a team of researchers at Columbia University, it’s a flaw that might point toward ways to improve chatbot performance and help reveal how humans process language.
In a paper published online today in Nature Machine Intelligence, the scientists describe how they challenged nine different language models with hundreds of pairs of sentences. For each pair, people who participated in the study picked which of the two sentences they thought was more natural, meaning that it was more likely to be read or heard in everyday life. The researchers then tested the models to see if they would rate each sentence pair the same way the humans had.
In head-to-head tests, more sophisticated AIs based on what researchers refer to as transformer neural networks tended to perform better than simpler recurrent neural network models and statistical models that just tally the frequency of word pairs found on the internet or in online databases. But all the models made mistakes, sometimes choosing sentences that sound like nonsense to a human ear.
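To make the simplest class of model concrete, here is a minimal sketch of a word-pair (bigram) model choosing the more natural of two sentences. It is an illustration only, not the code or models used in the study: the three-sentence corpus, the function names and the add-one smoothing are all assumptions, whereas the statistical models described above drew their counts from large text collections.

```python
import math
from collections import Counter

def train_bigram_counts(corpus):
    """Tally how often each word follows another in a list of sentences."""
    contexts = Counter()   # how often each word appears as the first word of a pair
    bigrams = Counter()    # how often each (previous word, word) pair appears
    vocab = set()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        vocab.update(tokens)
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams, len(vocab)

def log_prob(sentence, contexts, bigrams, vocab_size):
    """Log-probability of a sentence under the bigram counts (add-one smoothing)."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        # Add-one smoothing (an assumption) keeps unseen pairs from scoring zero.
        total += math.log((bigrams[(prev, word)] + 1) / (contexts[prev] + vocab_size))
    return total

def more_natural(a, b, model):
    """Return whichever sentence the model rates as more probable."""
    return a if log_prob(a, *model) >= log_prob(b, *model) else b

# Hypothetical toy corpus standing in for text gathered from online sources.
toy_corpus = [
    "the dog chased the ball",
    "the cat chased the mouse",
    "a dog saw the cat",
]
model = train_bigram_counts(toy_corpus)

sentence_a = "the dog chased the cat"
sentence_b = "cat the chased dog the"
print(more_natural(sentence_a, sentence_b, model))
```

A model like this sees only adjacent word pairs, which is one reason such counting approaches can be expected to miss longer-range structure that people notice immediately.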
“That some of the large language models perform as well as they do suggests that they capture something important that the simpler models are missing,” said Nikolaus Kriegeskorte, PhD, a principal investigator at Columbia’s Zuckerman Institute and a coauthor on the paper. “That even the best models we studied still can be fooled by nonsense sentences shows that their computations are missing something about the way humans process language.”
Consider the following sentence pair that both human participants and the AI models assessed in the study:
That is the narrative we have been sold.
This is the week you have been dying.
People given these sentences in the study judged the first sentence as more likely to be encountered than the second. But according to BERT, one of the better models, the second sentence is more natural. GPT-2, perhaps the most widely known model, correctly identified the first sentence as more natural, matching the human judgments.
“Every model exhibited blind spots, labeling some sentences as meaningful that human participants thought were gibberish,” said senior author Christopher Baldassano, PhD, an assistant professor of psychology at Columbia. “That should give us pause about the extent to which we want AI systems making important decisions, at least for now.”
The good but imperfect performance of many models is one of the study results that most intrigues Dr. Kriegeskorte. “Understanding why that gap exists and why some models outperform others can drive progress with language models,” he said.
Another key question for the research team is whether the computations in AI chatbots can inspire new scientific questions and hypotheses that could guide neuroscientists toward a better understanding of human brains. Might the ways these chatbots work point to something about the circuitry of our brains?
Further analysis of the strengths and flaws of various chatbots and their underlying algorithms could help answer that question.
“Ultimately, we are interested in understanding how people think,” said Tal Golan, PhD, the paper’s corresponding author, who this year moved from a postdoctoral position at Columbia’s Zuckerman Institute to set up his own lab at Ben-Gurion University of the Negev in Israel. “These AI tools are increasingly powerful but they process language differently from the way we do. Comparing their language understanding to ours gives us a new approach to thinking about how we think.”
###
The paper, “Testing the limits of natural language models for predicting human language judgements,” was published online in Nature Machine Intelligence on September 14, 2023. Its full list of authors includes Tal Golan, Matthew Siegelman, Nikolaus Kriegeskorte and Christopher Baldassano.