(Press-News.org)
Artificial intelligence is growing into a pivotal tool in chemical research, offering novel methods to tackle complex challenges that traditional approaches struggle with. One subtype of artificial intelligence that has seen increasing use in chemistry is machine learning, which uses algorithms and statistical models to make decisions based on data and perform tasks that it has not been explicitly programmed for.
However, to make reliable predictions, machine learning demands large amounts of data, which aren’t always available in chemical research. Small chemical datasets simply do not provide enough information for these algorithms to train on, which limits their effectiveness.
In a new study, scientists in Berend Smit’s team at EPFL have found a solution in large language models such as GPT-3. Those models are pre-trained on massive amounts of text and are known for their broad capabilities in understanding and generating human-like text. GPT-3 forms the basis of the better-known ChatGPT.
The study, now published in Nature Machine Intelligence, unveils a novel approach that significantly simplifies chemical analysis using artificial intelligence. Contrary to initial skepticism, the method doesn't directly ask GPT-3 chemical questions. “GPT-3 has not seen most of the chemical literature, so if we ask ChatGPT a chemical question, the answers are typically limited to what one can find on Wikipedia,” says Kevin Jablonka, the study’s lead researcher. “Instead, we fine-tune GPT-3 with a small data set converted into questions and answers, creating a new model capable of providing accurate chemical insights.”
This process involves feeding GPT-3 a curated list of Q&As. “For example, for high-entropy alloys, it is important to know whether an alloy occurs in a single phase or has multiple phases,” says Smit. “The curated list of Q&As are of the type: Q = ‘Is the <name of the high entropy alloy> single phase?’ A = ‘Yes/No’.”
He continues: “In the literature, we have found many alloys of which the answer is known, and we used this data to fine-tune GPT-3. What we get back is a refined AI model that is trained to only answer this question with a yes or no.”
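The conversion Smit describes, from a small labelled dataset to a list of questions and answers, can be sketched in a few lines of Python. This is a minimal illustration and not the authors' actual pipeline: the alloy names are placeholders, the labels are invented for the example, and the "prompt"/"completion" field names and the JSON Lines layout are assumptions about how such pairs might be stored before fine-tuning.

```python
import json

# Hypothetical placeholder records: (alloy name, is_single_phase).
# These names and labels are illustrative only, not data from the study.
RECORDS = [
    ("ALLOY_A", True),
    ("ALLOY_B", False),
    ("ALLOY_C", True),
]

# Question wording follows the template quoted in the text.
QUESTION_TEMPLATE = "Is the {name} single phase?"


def to_qa_pair(name: str, is_single_phase: bool) -> dict:
    """Turn one labelled record into a question/answer pair."""
    return {
        # "prompt" and "completion" are assumed field names.
        "prompt": QUESTION_TEMPLATE.format(name=name),
        "completion": "Yes" if is_single_phase else "No",
    }


def to_jsonl(records) -> str:
    """Serialize all records as JSON Lines, one Q&A pair per line."""
    return "\n".join(json.dumps(to_qa_pair(n, y)) for n, y in records)


if __name__ == "__main__":
    print(to_jsonl(RECORDS))
```

Each output line holds one question and its yes/no answer, so a file built this way mirrors the "curated list of Q&As" mentioned above. How the list is then passed to a fine-tuning service depends on the provider and is left out here.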
In tests, the model, trained with relatively few Q&As, correctly answered over 95% of very diverse chemical problems, often surpassing the accuracy of state-of-the-art machine-learning models. “The point is that this is as easy as doing a literature search, which works for many chemical problems,” says Smit.
One of the most striking aspects of this study is its simplicity and speed. Traditional machine learning models require months to develop and demand extensive knowledge. In contrast, the approach developed by Jablonka takes five minutes and requires zero knowledge.
The implications of the study are profound. It introduces a method as easy as conducting a literature search, applicable to various chemical problems. The ability to formulate questions like “Is the yield of a [chemical] made with this [recipe] high?” and receive accurate answers can revolutionize how chemical research is planned and carried out.
In the paper, the authors state: “Next to a literature search, querying a foundational model [e.g., GPT-3,4] might become a routine way to bootstrap a project by leveraging the collective knowledge encoded in these foundational models.” Or, as Smit succinctly puts it, “This is going to change the way we do chemistry.”
Other contributors
EPFL Laboratory of Artificial Chemical Intelligence
Helmholtz Institute for Polymers in Energy Applications (Helmholtz Center Berlin and FSU Jena)
Reference
Kevin Maik Jablonka, Philippe Schwaller, Andres Ortega-Guerrero, Berend Smit. Is GPT all you need for low-data discovery in chemistry? Nature Machine Intelligence 2023. DOI:10.1038/s42256-023-00788-1
END