MIT researchers introduce generative AI for databases

This new tool offers an easier way for people to analyze complex tabular data.

2024-07-09
(Press-News.org) CAMBRIDGE, MA — A new tool makes it easier for database users to perform complicated statistical analyses of tabular data without the need to know what is going on behind the scenes.

GenSQL, a generative AI system for databases, could help users make predictions, detect anomalies, guess missing values, fix errors, or generate synthetic data with just a few keystrokes.

For instance, if the system were used to analyze medical data from a patient who has always had high blood pressure, it could catch a blood pressure reading that is low for that particular patient but would otherwise be in the normal range.
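The idea of a reading being unusual "for that particular patient" can be illustrated with a much simpler stand-in than GenSQL's probabilistic models: a z-score computed against the patient's own history. The function name, the threshold, and the readings below are hypothetical and are not taken from the GenSQL paper.

```python
from statistics import mean, stdev


def is_unusual_for_patient(history, reading, threshold=3.0):
    """Flag a reading that lies far from this patient's own baseline.

    This is a plain z-score check, used here only as an illustration;
    it is not the method GenSQL uses.
    """
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        raise ValueError("history has no variation; z-score is undefined")
    z = (reading - baseline) / spread
    return abs(z) > threshold, z


# Hypothetical systolic readings (mmHg) for a patient who usually runs high.
history = [150, 155, 148, 152, 158, 151, 149, 154]

# 118 mmHg looks unremarkable in general, but is far below this patient's norm.
flagged, z = is_unusual_for_patient(history, 118)
print(flagged, round(z, 2))
```

A population-wide rule would likely accept 118 mmHg without comment, while the per-patient comparison flags it. A probabilistic model of the kind GenSQL queries could condition on many more variables than a single history column.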

GenSQL automatically integrates a tabular dataset and a generative probabilistic AI model, which can account for uncertainty and adjust its decision-making based on new data.

Moreover, GenSQL can be used to produce and analyze synthetic data that mimic the real data in a database. This could be especially useful in situations where sensitive data cannot be shared, such as patient health records, or when real data are sparse.

This new tool is built on top of SQL, a programming language for database creation and manipulation that was introduced in the late 1970s and is used by millions of developers worldwide.

“Historically, SQL taught the business world what a computer could do. They didn’t have to write custom programs, they just had to ask questions of a database in high-level language. We think that, when we move from just querying data to asking questions of models and data, we are going to need an analogous language that teaches people the coherent questions you can ask a computer that has a probabilistic model of the data,” says Vikash Mansinghka ’05, MEng ’09, PhD ’09, senior author of a paper introducing GenSQL and a principal research scientist and leader of the Probabilistic Computing Project in the MIT Department of Brain and Cognitive Sciences.

When the researchers compared GenSQL to popular, AI-based approaches for data analysis, they found that it was not only faster but also produced more accurate results. Importantly, the probabilistic models used by GenSQL are explainable, so users can read and edit them.

“Looking at the data and trying to find some meaningful patterns by just using some simple statistical rules might miss important interactions. You really want to capture the correlations and the dependencies of the variables, which can be quite complicated, in a model. With GenSQL, we want to enable a large set of users to query their data and their model without having to know all the details,” adds lead author Mathieu Huot, a research scientist in the Department of Brain and Cognitive Sciences and member of the Probabilistic Computing Project.

They are joined on the paper by Matin Ghavami and Alexander Lew, MIT graduate students; Cameron Freer, a research scientist; Ulrich Schaechtle and Zane Shelby of Digital Garage; Martin Rinard, an MIT professor in the Department of Electrical Engineering and Computer Science and member of the Computer Science and Artificial Intelligence Laboratory (CSAIL); and Feras Saad ’15, MEng ’16, PhD ’22, an assistant professor at Carnegie Mellon University. The research was recently presented at the ACM Conference on Programming Language Design and Implementation.

Combining models and databases

SQL, which stands for structured query language, is a programming language for storing and manipulating information in a database. In SQL, people can ask questions about data using keywords, such as by summing, filtering, or grouping database records.
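As a small illustration of the summing, filtering, and grouping mentioned above, the sketch below runs one ordinary SQL query through Python's built-in sqlite3 module. The table, column names, and values are invented for the example.

```python
import sqlite3

# In-memory database with a hypothetical table of developer salaries (in $1,000s).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE developers (name TEXT, city TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO developers VALUES (?, ?, ?)",
    [
        ("Ana", "Seattle", 120),
        ("Ben", "Seattle", 100),
        ("Cy", "Boston", 90),
        ("Di", "Boston", 110),
        ("Eve", "Austin", 60),
    ],
)

# Filter (WHERE), group (GROUP BY), and sum (SUM) in a single query.
rows = conn.execute(
    """
    SELECT city, SUM(salary) AS total_salary
    FROM developers
    WHERE salary >= 90
    GROUP BY city
    ORDER BY city
    """
).fetchall()
conn.close()

print(rows)
```

A query like this summarizes the records that exist; it says nothing about an individual who is not in the table, which is the gap the following paragraphs describe.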

However, querying a model can provide deeper insights, since models can capture what data imply for an individual. For instance, a female developer who wonders if she is underpaid is likely more interested in what salary data mean for her individually than in trends from database records.

The researchers noticed that SQL didn’t provide an effective way to incorporate probabilistic AI models, but at the same time, approaches that use probabilistic models to make inferences didn’t support complex database queries.

They built GenSQL to fill this gap, enabling someone to query both a dataset and a probabilistic model using a straightforward yet powerful formal programming language.

A GenSQL user uploads their data and probabilistic model, which the system automatically integrates. Then, they can run queries on data that also get input from the probabilistic model running behind the scenes. This not only enables more complex queries but can also provide more accurate answers.

For instance, a query in GenSQL might be something like, “How likely is it that a developer from Seattle knows the programming language Rust?” Just looking at a correlation between columns in a database might miss subtle dependencies. Incorporating a probabilistic model can capture more complex interactions.   
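The question in that example is a conditional probability. The sketch below computes one from a tiny, hand-written joint distribution that stands in for a probabilistic model. It does not use GenSQL's query syntax, and the probabilities are made up for illustration.

```python
from fractions import Fraction

# Hypothetical joint distribution over (city, knows_rust).
# In practice a learned probabilistic model would supply these values.
joint = {
    ("Seattle", True): Fraction(3, 20),
    ("Seattle", False): Fraction(5, 20),
    ("Boston", True): Fraction(2, 20),
    ("Boston", False): Fraction(10, 20),
}


def conditional(joint, city, knows_rust):
    """Return P(knows_rust | city) = P(city, knows_rust) / P(city)."""
    evidence = sum(p for (c, _), p in joint.items() if c == city)
    if evidence == 0:
        raise ValueError(f"no probability mass for city {city!r}")
    return joint.get((city, knows_rust), Fraction(0)) / evidence


p = conditional(joint, "Seattle", True)
print(p)  # 3/8
```

With only two variables this is simple arithmetic. The value of a richer model is that the same kind of question can be conditioned on many interacting columns at once.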

Plus, the probabilistic models GenSQL utilizes are auditable, so people can see which data the model uses for decision-making. In addition, these models provide measures of calibrated uncertainty along with each answer.

For instance, with this calibrated uncertainty, if one queries the model for predicted outcomes of different cancer treatments for a patient from a minority group that is underrepresented in the dataset, GenSQL would tell the user that it is uncertain, and how uncertain it is, rather than overconfidently advocating for the wrong treatment.
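One way to see why less data should mean a less confident answer is a textbook Beta-Binomial calculation, sketched below. This is an assumption chosen for illustration, not the model class GenSQL uses, and the counts are hypothetical.

```python
from math import sqrt


def beta_posterior_summary(successes, trials):
    """Posterior mean and standard deviation of a success rate.

    Assumes a uniform Beta(1, 1) prior, so the posterior is
    Beta(successes + 1, trials - successes + 1).
    """
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    a = successes + 1
    b = trials - successes + 1
    mean = a / (a + b)
    variance = (a * b) / ((a + b) ** 2 * (a + b + 1))
    return mean, sqrt(variance)


# Same observed response rate (75%), very different amounts of data.
small_mean, small_sd = beta_posterior_summary(3, 4)       # underrepresented group
large_mean, large_sd = beta_posterior_summary(300, 400)   # well-represented group

print(f"small group: {small_mean:.3f} +/- {small_sd:.3f}")
print(f"large group: {large_mean:.3f} +/- {large_sd:.3f}")
```

Both groups show the same raw rate, but the posterior for the small group is several times wider. Reporting that spread alongside the estimate is the kind of behavior the paragraph above describes.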

Faster and more accurate results

To evaluate GenSQL, the researchers compared their system to popular baseline methods that use neural networks. GenSQL was between 1.7 and 6.8 times faster than these approaches, executing most queries in a few milliseconds while providing more accurate results.

They also applied GenSQL in two case studies: one in which the system identified mislabeled clinical trial data and the other in which it generated accurate synthetic data that captured complex relationships in genomics.

Next, the researchers want to apply GenSQL more broadly to conduct large-scale modeling of human populations. With GenSQL, they can generate synthetic data to draw inferences about things like health and salary while controlling what information is used in the analysis.

They also want to make GenSQL easier to use and more powerful by adding new optimizations and automation to the system. In the long run, the researchers want to enable users to make natural language queries in GenSQL. Their goal is to eventually develop a ChatGPT-like AI expert one could talk to about any database, which grounds its answers using GenSQL queries.   

###

This research is funded, in part, by the Defense Advanced Research Projects Agency (DARPA), Google, and the Siegel Family Foundation.
