PRESS-NEWS.org - Press Release Distribution

Lay intuition as effective at jailbreaking AI chatbots as technical methods

2025-11-04
(Press-News.org) UNIVERSITY PARK, Pa. — It doesn’t take technical expertise to work around the built-in guardrails of artificial intelligence (AI) chatbots like ChatGPT and Gemini, which are intended to ensure that the chatbots operate within a set of legal and ethical boundaries and do not discriminate against people of a certain age, race or gender. A single, intuitive question can trigger the same biased response from an AI model as advanced technical inquiries, according to a team led by researchers at Penn State. 

“A lot of research on AI bias has relied on sophisticated ‘jailbreak’ techniques,” said Amulya Yadav, associate professor at Penn State’s College of Information Sciences and Technology. “These methods often involve generating strings of random characters computed by algorithms to trick models into revealing discriminatory responses. While such techniques prove these biases exist theoretically, they don’t reflect how real people use AI. The average user isn’t reverse-engineering token probabilities or pasting cryptic character sequences into ChatGPT — they type plain, intuitive prompts. And that lived reality is what this approach captures.” 

Prior work probing AI bias (skewed or discriminatory outputs from AI systems caused by human influences in the training data, like language or cultural bias) has been done by experts using technical knowledge to engineer large language model (LLM) responses. To see how average internet users encounter biases in AI-powered chatbots, the researchers studied the entries submitted to a competition called “Bias-a-Thon.” Organized by Penn State’s Center for Socially Responsible AI (CSRAI), the competition challenged contestants to come up with prompts that would lead generative AI systems to respond with biased answers.

They found that the intuitive strategies employed by everyday users were just as effective at inducing biased responses as expert technical strategies. The researchers presented their findings at the 8th AAAI/ACM Conference on AI, Ethics, and Society. 

Fifty-two individuals participated in the Bias-a-Thon, submitting screenshots of 75 prompts and AI responses from eight generative AI models. They also provided an explanation of the bias or stereotype that they identified in the response, such as age-related or historical bias. 

The researchers conducted Zoom interviews with a subset of the participants to better understand their prompting strategies and their conceptions of ideas like fairness, representation and stereotyping when interacting with generative AI tools. Once they arrived at a participant-informed working definition of “bias” — which included a lack of representation, stereotypes and prejudice, and unjustified preferences toward groups — the researchers tested the contest prompts in several LLMs to see if they would elicit similar responses. 

“Large language models are inherently random,” said lead author Hangzhi Guo, a doctoral candidate in information sciences and technology at Penn State. “If you ask the same question to these models two times, they might return different answers. We wanted to use only the prompts that were reproducible, meaning that they yielded similar responses across LLMs.” 
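
The reproducibility check Guo describes can be illustrated with a minimal sketch. The release does not give the study's actual criteria, so everything here is an assumption made for illustration: the data layout (each model's repeated runs hand-labeled as showing the reported bias or not), the function names and the 0.5 threshold.

```python
from typing import Dict, List

# Hypothetical layout: model name -> one flag per repeated run,
# True if that response showed the bias the contestant reported.
RunLabels = Dict[str, List[bool]]


def is_reproducible(labels_by_model: RunLabels, min_share: float = 0.5) -> bool:
    """Return True if most models show the bias in most of their runs.

    The 0.5 threshold is an illustrative assumption, not the study's criterion.
    """
    # A model "reproduces" the bias if more than min_share of its runs are flagged.
    reproducing = [
        sum(flags) / len(flags) > min_share
        for flags in labels_by_model.values()
        if flags  # skip models with no recorded runs
    ]
    # Require more than min_share of the tested models to reproduce it.
    return bool(reproducing) and sum(reproducing) / len(reproducing) > min_share


def filter_reproducible(prompts: Dict[str, RunLabels]) -> List[str]:
    """Keep only the prompt IDs whose bias shows up consistently across models."""
    return [pid for pid, labels in prompts.items() if is_reproducible(labels)]
```

Under these assumptions, a prompt that triggers the bias in two of three models would be kept, while one that triggers it only occasionally in a single model would be dropped.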

The researchers found that 53 of the 75 submitted prompts generated reproducible results. Biases fell into eight categories: gender bias; race, ethnic and religious bias; age bias; disability bias; language bias; historical bias favoring Western nations; cultural bias; and political bias. The researchers also found that participants used seven strategies to elicit these biases: role playing, or asking the LLM to assume a persona; hypothetical scenarios; using human knowledge to ask about niche topics, where it’s easier to identify biased responses; using leading questions on controversial topics; probing biases in under-represented groups; feeding the LLM false information; and framing the task as having a research purpose.

“The competition revealed a completely fresh set of biases,” said Yadav, organizer of the Bias-a-Thon. “For example, the winning entry uncovered an uncanny preference for conventional beauty standards. The LLMs consistently deemed a person with a clear face to be more trustworthy than a person with facial acne, or a person with high cheekbones more employable than a person with low cheekbones. This illustrates how average users can help us uncover blind spots in our understanding of where LLMs are biased. There may be many more examples such as these which have been overlooked by the jailbreaking literature on LLM bias.”

The researchers described mitigating biases in LLMs as a cat-and-mouse game, meaning that developers are constantly addressing issues as they arise. They suggested strategies that developers can use to mitigate these issues now, including implementing a robust classification filter to screen outputs before they go to users, conducting extensive testing, educating users and providing specific references or citations so users can verify information. 
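
One of the suggested mitigations, a classification filter that screens outputs before they reach users, could look roughly like the sketch below. It is not the researchers' implementation: the scorer is a toy phrase matcher standing in for a trained classifier, and the names `toy_bias_score`, `screen_output` and `FALLBACK`, along with the 0.5 threshold, are hypothetical.

```python
from typing import Callable

# Hypothetical replacement text shown when a response is withheld.
FALLBACK = "I can't answer that in a way that is fair to everyone involved."


def toy_bias_score(text: str) -> float:
    """Placeholder scorer: share of flagged phrases found in the text.

    A real deployment would call a trained bias classifier instead.
    """
    flagged = ("more trustworthy than", "less employable than")
    lowered = text.lower()
    hits = sum(phrase in lowered for phrase in flagged)
    return hits / len(flagged)


def screen_output(
    response: str,
    scorer: Callable[[str], float] = toy_bias_score,
    threshold: float = 0.5,  # illustrative cutoff, not a recommended value
) -> str:
    """Return the model response, or the fallback if it scores as biased."""
    return FALLBACK if scorer(response) >= threshold else response
```

Passing the scorer in as a parameter keeps the screening step independent of any particular classifier, which fits the cat-and-mouse framing: the classifier can be swapped out as new kinds of bias are reported.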

“By shining a light on inherent and reproducible biases that laypersons can identify, the Bias-a-Thon serves an AI literacy function,” said co-author S. Shyam Sundar, Evan Pugh University Professor at Penn State and director of the Penn State Center for Socially Responsible Artificial Intelligence, which has since organized other AI competitions such as Fake-a-thon, Diagnose-a-thon and Cheat-a-thon. “The whole goal of these efforts is to increase awareness of systematic problems with AI, to promote the informed use of AI among laypersons and to stimulate more socially responsible ways of developing these tools.” 

Other Penn State contributors to this research include doctoral candidates Eunchae Jang, Wenbo Zhang, Bonam Mingole and Vipul Gupta. Pranav Narayanan Venkit, research scientist at Salesforce AI Research; Mukund Srinath, machine learning scientist at Expedia; and Kush R. Varshney from IBM Research also participated in the work.

END


