(Press-News.org) The language ‘engines’ that power generative artificial intelligence (AI) are plagued by a wide range of issues that can hurt society, most notably through the spread of misinformation and discriminatory content, including racist and sexist stereotypes.
In large part, these failings of popular AI systems, such as ChatGPT, stem from shortcomings in the language databases on which they are trained.
To address these issues, researchers from the University of Birmingham have developed a novel framework for better understanding large language models (LLMs) by integrating principles from sociolinguistics – the study of language variation and change.
Publishing their research in Frontiers in AI, the experts argue that accurately representing different ‘varieties of language’ could significantly improve the performance of AI systems – addressing critical challenges in AI, including social bias, misinformation, domain adaptation, and alignment with societal values.
The researchers emphasise the importance of using sociolinguistic principles to train LLMs to better represent the diverse dialects, registers, and periods of which any language is composed – opening new avenues for developing AI systems that are more accurate and reliable, as well as more ethical and socially aware.
Lead author Professor Jack Grieve commented: “When prompted, generative AIs such as ChatGPT may be more likely to produce negative portrayals about certain ethnicities and genders, but our research offers solutions for how LLMs can be trained in a more principled manner to mitigate social biases.
“These types of issues can generally be traced back to the data that the LLM was trained on. If the training corpus contains relatively frequent expression of harmful or inaccurate ideas about certain social groups, LLMs will inevitably reproduce those biases resulting in potentially racist or sexist content.”
The study suggests that fine-tuning LLMs on datasets designed to represent the target language in all its diversity – as decades of research in sociolinguistics have described in detail – can generally enhance the societal value of these AI systems. The researchers also believe that by balancing training data from different social groups and contexts, it is possible to address issues around the amount of data required to train these systems.
“We propose that increasing the sociolinguistic diversity of training data is far more important than merely expanding its scale,” added Professor Grieve. “For all these reasons, we therefore believe there is a clear and urgent need for sociolinguistic insight in LLM design and evaluation.
“Understanding the structure of society, and how this structure is reflected in patterns of language use, is critical to maximizing the benefits of LLMs for the societies in which they are increasingly being embedded. More generally, incorporating insights from the humanities and the social sciences is crucial for developing AI systems that better serve humanity.”
ENDS
For more information, please contact the Press Office, University of Birmingham, tel: +44 (0)121 414 2772; email: pressoffice@contacts.bham.ac.uk
Notes to editor:
The University of Birmingham is ranked amongst the world’s top 100 institutions. Its work brings people from across the world to Birmingham, including researchers, teachers and more than 8,000 international students from over 150 countries.
‘The Sociolinguistic Foundations of Language Modelling’ by Jack Grieve, Sara Bartl, Matteo Fuoli, Jason Grafmiller, Weihang Huang, Alejandro Jawerbaum, Akira Murakami, Marcus Perlman, Dana Roemling, and Bodo Winter is published in Frontiers in AI.
Understanding bias and discrimination in AI: Why sociolinguistics holds the key to better Large Language Models and a fairer world
2025-01-13