
New “bandit” algorithm uses light for better bets

Simulated photonic reinforcement learning method learns faster and aims higher

2023-08-21
(Press-News.org)

How does a gambler maximize winnings from a row of slot machines? This is the inspiration for the "multi-armed bandit problem," a common task in reinforcement learning in which "agents" make choices to earn rewards. Recently, an international research team led by Hiroaki Shinkawa at the University of Tokyo developed an extended photonic reinforcement learning scheme that moves from the static bandit problem towards a more challenging dynamic environment. This study was published July 25 in Intelligent Computing, a Science Partner Journal.

The success of the scheme relies on both a photonic system to enhance the learning quality and a supporting algorithm. Looking at a "potential photonic implementation," the authors developed a modified bandit Q-learning algorithm and validated its effectiveness through numerical simulations. They also tested their algorithm with a parallel architecture, where multiple agents operate at the same time, and found that the key to accelerating the parallel learning process is to avoid conflicting decisions by taking advantage of the quantum interference of photons.
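To make the idea of conflict-free decisions concrete, here is a minimal classical sketch in Python. It does not model photons or quantum interference. It only contrasts two agents that choose independently with two agents that draw from a joint distribution excluding identical choices. The function names, the uniform probabilities and the four-option setting are assumptions for illustration, not the authors' scheme.

```python
import random

def independent_choices(n_options, rng):
    """Two agents pick independently, so they may pick the same option."""
    return rng.randrange(n_options), rng.randrange(n_options)

def conflict_free_choices(n_options, rng):
    """Two agents pick jointly; rng.sample draws two distinct options."""
    first, second = rng.sample(range(n_options), 2)
    return first, second

def conflict_rate(chooser, n_options=4, trials=10_000, seed=0):
    """Fraction of trials in which both agents chose the same option."""
    rng = random.Random(seed)
    conflicts = 0
    for _ in range(trials):
        a, b = chooser(n_options, rng)
        conflicts += (a == b)
    return conflicts / trials
```

With four options, independent choices are expected to collide about a quarter of the time, while the joint draw never collides by construction.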

Although using the quantum interference of photons is not new in this field, the authors believe this study is "the first to connect the notion of photonic cooperative decision-making with Q-learning and apply it to a dynamic environment." Reinforcement learning problems are generally set in a dynamic environment that changes with the agents' actions and are thus more complex than the static environment in a bandit problem. 

This study targets a grid world, a collection of cells holding varying rewards. Each agent can go up, down, left or right and get a reward based on its current move and location. In this environment, the agent's next move is determined entirely by its current move and location.

The simulations in this study use a 5 × 5 cell grid. Each cell is called a "state," every move made by an agent at each time step is called an "action," and the rule determining how an agent selects a certain action in each state is called a "policy." The decision-making process is designed as a bandit problem scenario, where each state-action pair is regarded as a slot machine and the changes in Q values (the values of the state-action pairs) are regarded as the rewards.
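The vocabulary above can be sketched in a few lines of Python. This is a rough illustration using the standard tabular Q-learning update, not the authors' modified algorithm. The reward layout, the rule that moves off the grid leave the agent in place, the learning rate `alpha`, the discount factor `gamma`, and the use of the absolute change in Q as the bandit-style reward are all assumptions.

```python
GRID_SIZE = 5  # 5 x 5 grid of states
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

def step(state, action, rewards):
    """Move one cell; moves off the grid leave the agent in place (assumption)."""
    row, col = state
    d_row, d_col = ACTIONS[action]
    next_state = (min(max(row + d_row, 0), GRID_SIZE - 1),
                  min(max(col + d_col, 0), GRID_SIZE - 1))
    return next_state, rewards[next_state]

def q_update(q, state, action, rewards, alpha=0.5, gamma=0.9):
    """Apply one standard Q-learning update; return (next_state, delta_q)."""
    next_state, reward = step(state, action, rewards)
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    old = q[(state, action)]
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    # The size of the change in Q plays the role of the slot-machine payout.
    return next_state, abs(q[(state, action)] - old)

# Hypothetical reward layout: 1.0 for entering the bottom-right cell, else 0.
rewards = {(r, c): 0.0 for r in range(GRID_SIZE) for c in range(GRID_SIZE)}
rewards[(GRID_SIZE - 1, GRID_SIZE - 1)] = 1.0

# One Q value per state-action pair: 25 states x 4 actions = 100 "slot machines".
q = {(s, a): 0.0 for s in rewards for a in ACTIONS}
```

Calling `q_update(q, (4, 3), "right", rewards)` moves the agent into the rewarded cell and raises that pair's Q value. Pairs whose Q values are still changing pay out larger bandit rewards and so attract further visits.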

Unlike basic Q-learning algorithms, which generally focus on finding the optimal path to maximize rewards, the modified bandit Q-learning algorithm aims to learn the optimal Q value for every state-action pair in the entire environment, efficiently and accurately. Therefore, it is essential for an agent to keep a good balance between "exploiting" the familiar pairs with high values for faster learning and "exploring" unfrequented pairs for potentially higher values. The softmax algorithm, a popular model that excels in this kind of balancing, is used as the policy.
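As an illustration of how a softmax policy trades off exploitation and exploration, the Python sketch below converts a list of values into selection probabilities and samples an action from them. The `temperature` parameter and the function names are assumptions made for this example. The release does not say how the authors parameterize their policy.

```python
import math
import random

def softmax_probabilities(values, temperature=1.0):
    """Turn values into probabilities; subtracting the max avoids overflow."""
    top = max(values)
    weights = [math.exp((v - top) / temperature) for v in values]
    total = sum(weights)
    return [w / total for w in weights]

def choose_action(values, temperature=1.0, rng=random):
    """Sample an action index with probability given by the softmax."""
    probs = softmax_probabilities(values, temperature)
    return rng.choices(range(len(values)), weights=probs, k=1)[0]
```

Higher-valued options are chosen more often, but every option keeps a nonzero chance of being tried. A low temperature pushes the policy towards exploitation and a high temperature towards exploration.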

The authors' future priority is to design a photonic system supporting conflict-free decision-making among at least three agents, hoping its addition to their proposed scheme will help agents avoid making conflicting decisions. Meanwhile, they are planning to develop algorithms that allow agents to act continuously and to apply their bandit Q-learning algorithm to more complicated reinforcement learning tasks.

END



