PRESS-NEWS.org - Press Release Distribution

New “bandit” algorithm uses light for better bets

Simulated photonic reinforcement learning method learns faster and aims higher

2023-08-21
(Press-News.org)

How does a gambler maximize winnings from a row of slot machines? This is the inspiration for the "multi-armed bandit problem," a common task in reinforcement learning in which "agents" make choices to earn rewards. Recently, an international research team led by Hiroaki Shinkawa at the University of Tokyo developed an extended photonic reinforcement learning scheme that moves from the static bandit problem towards a more challenging dynamic environment. This study was published July 25 in Intelligent Computing, a Science Partner Journal.

The success of the scheme relies on both a photonic system to enhance the learning quality and a supporting algorithm. Looking at a "potential photonic implementation," the authors developed a modified bandit Q-learning algorithm and validated its effectiveness through numerical simulations. They also tested their algorithm with a parallel architecture, where multiple agents operate at the same time, and found that the key to accelerating the parallel learning process is to avoid conflicting decisions by taking advantage of the quantum interference of photons.
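The idea of conflict-free parallel decisions can be illustrated in ordinary software. The sketch below is not the photonic mechanism described above; it swaps in weighted sampling without replacement as a stand-in, so that no two agents ever pick the same option. The function name `conflict_free_choices` and the weights are hypothetical, and the weights are assumed to be strictly positive.

```python
import random

def conflict_free_choices(weights, n_agents=2, rng=random):
    """Draw one distinct option index per agent (weighted, without replacement).

    Software stand-in only: the study relies on quantum interference of
    photons, which this function does not model.
    Assumption: every weight is strictly positive.
    """
    if n_agents > len(weights):
        raise ValueError("need at least as many options as agents")
    remaining = list(range(len(weights)))
    picks = []
    for _ in range(n_agents):
        # Weight only the options that no earlier agent has taken.
        w = [weights[i] for i in remaining]
        choice = rng.choices(remaining, weights=w, k=1)[0]
        picks.append(choice)
        remaining.remove(choice)
    return picks

picks = conflict_free_choices([1.0, 2.0, 3.0, 4.0], n_agents=2)
```

Because each pick removes an option from the pool, the agents' decisions cannot collide, which is the property the parallel architecture depends on.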

Although using the quantum interference of photons is not new in this field, the authors believe this study is "the first to connect the notion of photonic cooperative decision-making with Q-learning and apply it to a dynamic environment." Reinforcement learning problems are generally set in a dynamic environment that changes with the agents' actions, which makes them more complex than bandit problems, where the environment is static.

This study targets a grid world, a collection of cells holding varying rewards. Each agent can go up, down, left or right and get a reward based on its current move and location. In this environment, the agent's next location is determined entirely by its current move and location.
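A minimal sketch of such a grid world, in which the outcome of a move depends only on the current cell and the chosen move. The reward values, the rule that the reward is read from the cell the agent enters, and the handling of moves off the edge are assumptions made for illustration, not details taken from the study.

```python
# Hypothetical grid-world sketch; names and rules are illustrative assumptions.
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

def step(state, action, rewards):
    """Return (next_state, reward) for one move on a rectangular grid.

    state   -- (row, col) of the agent's current cell
    action  -- one of "up", "down", "left", "right"
    rewards -- 2D list holding the reward stored in each cell
    """
    rows, cols = len(rewards), len(rewards[0])
    d_row, d_col = MOVES[action]
    # Assumption: a move that would leave the grid keeps the agent in place.
    row = min(max(state[0] + d_row, 0), rows - 1)
    col = min(max(state[1] + d_col, 0), cols - 1)
    # Assumption: the reward is the value stored in the cell that is entered.
    return (row, col), rewards[row][col]

rewards = [[0, 1], [2, 3]]
next_state, reward = step((0, 0), "right", rewards)
```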

The simulations in this study use a 5 × 5 cell grid; each cell is called a "state," every move made by an agent at each time step is called an "action," and the rule determining how an agent selects a certain action in each state is called a "policy." The decision-making process is designed as a bandit problem scenario, where each state-action pair is regarded as a slot machine and the changes in the Q values (the values of the state-action pairs) are regarded as the rewards.
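To make the notion of a "change in Q value" concrete, here is a sketch of the textbook tabular Q-learning update, which returns the amount by which one state-action value changed. It is the standard update rule, not the authors' modified bandit Q-learning algorithm, and the learning rate `alpha` and discount factor `gamma` are assumed values.

```python
from collections import defaultdict

ACTIONS = ("up", "down", "left", "right")

def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one textbook Q-learning update and return the change in Q.

    q maps (state, action) pairs to values; alpha and gamma are assumed
    hyperparameters, not values taken from the study.
    """
    # Value of the best action available from the next state.
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    delta = alpha * (reward + gamma * best_next - q[(state, action)])
    q[(state, action)] += delta
    return delta

q = defaultdict(float)  # every state-action pair starts at 0.0
change = q_update(q, (0, 0), "right", reward=1.0, next_state=(0, 1))
```

In the bandit framing described above, a quantity like `change` would play the role of the payout from the slot machine associated with that state-action pair.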

Unlike basic Q-learning algorithms, which generally focus on finding the optimal path to maximize rewards, the modified bandit Q-learning algorithm aims to learn the optimal Q value for every state-action pair in the entire environment, efficiently and accurately. Therefore, it is essential for an agent to keep a good balance between "exploiting" the familiar pairs with high values for faster learning and "exploring" unfrequented pairs for potentially higher values. The softmax algorithm, a popular model that excels in this kind of balancing, is used as the policy.
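A minimal sketch of a softmax policy shows how this balance works: actions with higher values are chosen more often, but every action keeps a nonzero chance of being tried. The `temperature` parameter, which controls how strongly the policy favors high values, is an assumed name and must be positive; the study's actual settings are not given here.

```python
import math
import random

def softmax_probs(q_values, temperature=1.0):
    """Convert a list of action values into selection probabilities."""
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(q_values)
    exps = [math.exp((v - top) / temperature) for v in q_values]
    total = sum(exps)
    return [e / total for e in exps]

def choose_action(q_values, temperature=1.0, rng=random):
    """Sample an action index according to the softmax probabilities."""
    probs = softmax_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]

probs = softmax_probs([1.0, 2.0, 3.0])
action = choose_action([1.0, 2.0, 3.0])
```

A low temperature pushes the policy toward exploiting the highest-valued action, while a high temperature spreads the probabilities out and encourages exploration.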

The authors' future priority is to design a photonic system supporting conflict-free decision-making among at least three agents, hoping its addition to their proposed scheme will help agents avoid making conflicting decisions. Meanwhile, they are planning to develop algorithms that allow agents to act continuously and to apply their bandit Q-learning algorithm to more complicated reinforcement learning tasks.

END



