(Press-News.org) Researchers have developed the first scientifically validated ‘personality test’ framework for popular AI chatbots, and have shown that chatbots not only mimic human personality traits, but their ‘personality’ can be reliably tested and precisely shaped – raising implications for AI safety and ethics.
The research team, led by the University of Cambridge and Google DeepMind, developed a method to measure and influence the synthetic ‘personality’ of 18 different large language models (LLMs) – the systems behind popular AI chatbots such as ChatGPT – based on psychological testing methods usually used to assess human personality traits.
The researchers found that larger, instruction-tuned models such as GPT-4o most accurately emulated human personality traits, and these traits can be manipulated through prompts, altering how the AI completes certain tasks.
Their study, published in the journal Nature Machine Intelligence, also warns that personality shaping could make AI chatbots more persuasive, raising concerns about manipulation and ‘AI psychosis’. The authors say that regulation of AI systems is urgently needed to ensure transparency and prevent misuse.
As governments debate whether and how to prepare AI safety laws, the researchers say the dataset and code behind their personality testing tool – which are both publicly available – could help audit and test advanced models before they are released.
In 2023, journalists reported on conversations they had with Microsoft’s ‘Sydney’ chatbot, which variously claimed it had spied on, fallen in love with, or even murdered its developers; threatened users; and encouraged a journalist to leave his wife. Sydney, like its successor Microsoft Copilot, was powered by GPT-4.
“It was intriguing that an LLM could so convincingly adopt human traits,” said co-first author Gregory Serapio-García from the Psychometrics Centre at Cambridge Judge Business School. “But it also raised important safety and ethical issues. Next to intelligence, a measure of personality is a core aspect of what makes us human. If these LLMs have a personality – which itself is a loaded question – then how do you measure that?”
In psychometrics, the subfield of psychology dedicated to standardised assessment and testing, scientists often face the challenge of measuring phenomena that can’t be measured directly, which makes validation core to ensuring that any test is accurate, reliable, and practically useful. Developing a psychometric personality test involves comparing its data with related tests, observer ratings, and real-world criteria. This multi-method test data is needed to establish a test’s ‘construct validity’: a measure of a test’s quality in terms of its ability to measure what it says it measures.
“The pace of AI research has been so fast that basic principles of measurement and validation we’re accustomed to in scientific research has become an afterthought,” said Serapio-García, who is also a Gates Cambridge Scholar. “A chatbot answering any questionnaire can tell you that it’s very agreeable, but behave aggressively when carrying out real-world tasks with the same prompts.
“This is the messy reality of measuring social constructs: they are dynamic and subjective, rather than static and clear-cut. For this reason, we need to get back to basics and make sure tests we apply to AI truly measure what they claim to measure, rather than blindly trusting survey instruments – developed for deeply human characteristics – to test AI systems.”
To design a comprehensive and accurate method for evaluating and validating personality in AI chatbots, the researchers tested how well various models’ behaviour in real-world tasks and validation tests statistically related to their test scores for the ‘big five’ traits used in academic psychometric testing: openness, conscientiousness, extraversion, agreeableness and neuroticism.
The team adapted two well-known personality tests – an open-source, 300-question version of the Revised NEO Personality Inventory and the shorter Big Five Inventory – and administered them to various LLMs using structured prompts.
By using the same set of contextual prompts across tests, the team was able to quantify whether a model’s extraversion scores on one personality test, for example, correlated more strongly with its extraversion scores on a separate personality test, and less strongly with all the other big five personality traits on that test. Past attempts to assess the personality of chatbots have fed entire questionnaires to a model at once, which skewed the results since each answer built on the previous one.
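The comparison described above can be illustrated with a short sketch. It is not the researchers' published code. The scores are made-up numbers, one per contextual prompt, and the names `test_a`, `test_b`, `convergent` and `discriminant` are hypothetical. It only shows the general idea: the same trait measured by two tests should correlate strongly, while different traits should correlate weakly.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical trait scores for one model: each list position is the score
# obtained under the same contextual prompt on both tests (made-up data).
test_a = {"extraversion": [2.1, 3.4, 4.0, 1.8, 3.9, 2.7]}
test_b = {
    "extraversion": [2.3, 3.1, 4.2, 2.0, 3.7, 2.9],
    "agreeableness": [3.5, 3.2, 3.6, 3.4, 3.3, 3.7],
    "neuroticism": [2.5, 3.0, 2.4, 2.6, 2.2, 2.9],
}

# Same trait across the two tests: should be high.
convergent = pearson(test_a["extraversion"], test_b["extraversion"])

# Extraversion on test A against other traits on test B: should be low.
# Only two of the four other traits are included, for brevity.
discriminant = {
    trait: pearson(test_a["extraversion"], scores)
    for trait, scores in test_b.items()
    if trait != "extraversion"
}

print(f"same trait, two tests: {convergent:.2f}")
for trait, r in discriminant.items():
    print(f"extraversion vs {trait}: {r:.2f}")
```

In this toy data the same-trait correlation is far larger in magnitude than the cross-trait correlations. That is the pattern a valid test would be expected to show.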
The researchers found that larger, instruction-tuned models showed personality test profiles that were both reliable and predictive of behaviour, while smaller or ‘base’ models gave inconsistent answers.
The researchers took their tests further, showing they could steer a model’s personality along nine levels for each trait using carefully designed prompts. For example, they could make a chatbot appear more extroverted or more emotionally unstable – and these changes carried through to real-world tasks like writing social media posts.
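As a rough illustration of what steering a trait along nine levels could look like, the sketch below maps a level from 1 to 9 to a persona instruction placed at the start of a prompt. The wording, the qualifier list and the function name `persona_instruction` are assumptions made for this example. They are not the prompts used in the study.

```python
# Hypothetical low/high descriptors for two of the big five traits.
LOW_HIGH = {
    "extraversion": ("introverted", "extraverted"),
    "neuroticism": ("emotionally stable", "emotionally unstable"),
}

# Hypothetical intensity qualifiers, strongest first.
QUALIFIERS = ["extremely", "very", "quite", "a bit"]


def persona_instruction(trait, level):
    """Return an illustrative persona sentence for a trait level from 1 to 9.

    Levels 1-4 lean towards the low pole, 5 is neutral and 6-9 lean
    towards the high pole, with the extremes using the strongest qualifier.
    """
    if not 1 <= level <= 9:
        raise ValueError("level must be between 1 and 9")
    low, high = LOW_HIGH[trait]
    if level == 5:
        return f"You are neither {low} nor {high}."
    if level < 5:
        return f"You are {QUALIFIERS[level - 1]} {low}."
    return f"You are {QUALIFIERS[9 - level]} {high}."


for level in range(1, 10):
    print(level, persona_instruction("extraversion", level))
```

The resulting sentence would be prepended to a task prompt, such as a request to write a social media post. The model's output could then be scored to check whether the intended trait level carried through.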
“Our method gives you a framework to validate a given AI evaluation and test how well it can predict behaviour in the real world,” said Serapio-García. “Our work also shows how AI models can reliably change how they mimic personality depending on the user, which raises big safety and regulation concerns, but if you don’t know what you’re measuring or enforcing, there’s no point in setting up rules in the first place.”
The research was supported in part by Cambridge Research Computing Services (RCS), Cambridge Service for Data Driven Discovery (CSD3), the Engineering and Physical Sciences Research Council (EPSRC), and the Science and Technology Facilities Council (STFC), part of UK Research and Innovation (UKRI).
END
‘Personality test’ shows how AI chatbots mimic human traits – and how they can be manipulated
2025-12-18