(Press-News.org) You’d be hard pressed to find an industry today that doesn’t use data in some capacity. Whether it's health care workers using data to report the rate of flu infections in a certain state, manufacturers using data to better understand average production times, or even a small coffee shop owner flipping through sales data to learn about the previous month’s bestselling latte, data can reveal patterns and offer insights into our everyday behavior.
All of this data plays a critical role in artificial intelligence (AI) decision-making, which in turn creates a serious need for people to understand the value of data in the first place. By understanding how individual data sources contribute to technology-based decision-making processes, we can create a more effective experience for all AI users.
For instance, studies have shown that prevalent facial recognition software performs less reliably in identifying women and people of color compared with white men, reflecting imbalances in facial data representing diverse populations. Measuring the value of data enables us to eliminate inputs that might contribute to biased models. Furthermore, understanding the value of data allows us to assign appropriate pricing to data sources, thereby facilitating data sharing. This is particularly important to industries where certain data is difficult to obtain or for small businesses grappling with limited data access.
Assistant Professor Ruoxi Jia in the Bradley Department of Electrical and Computer Engineering at Virginia Tech has received a National Science Foundation (NSF) Faculty Early Career Development (CAREER) award to investigate fundamental theories and computational tools needed to measure the value of data.
The five-year, $500,000 grant will allow Jia and her team to develop scalable and reliable data valuation techniques that support strategic data acquisition and improve machine-learning-based data analytics.
“Right now, there is a lot of excitement about machine learning and AI, especially after the emergence of ChatGPT,” said Jia. “But what’s under the hood is a lot of data. That’s what enables this kind of machine, and that’s what we’re aiming to improve.”
ChatGPT, an AI chatbot launched in November 2022, allows users to ask for help with things such as writing essays, drafting business plans, generating code, and even composing music. As of Dec. 4, ChatGPT already had over 1 million users.
OpenAI built its auto-generative system on a model called GPT-3, which is trained on billions of tokens. These tokens, used for natural language processing, are similar to words in a paragraph. For comparison’s sake, the novel “Harry Potter and the Order of the Phoenix” has about 250,000 words and 185,000 tokens. Essentially, ChatGPT has been trained on billions of data points, making this kind of intelligent machine possible.
Jia noted the importance of data quality and how it can impact machine learning results.
“If you have bad data feeding into machine learning, you will get bad results,” said Jia. “We call that 'garbage in, garbage out.' We want to get an understanding, especially a quantitative understanding, of which data is more valuable and which is less valuable for the purpose of data selection.”
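One simple way to make "which data is more valuable" quantitative is leave-one-out valuation: train a model with and without a given data point and compare accuracy on held-out data. The sketch below is only an illustration of that baseline idea, not the method Jia's team is developing. The toy data set, the 1-nearest-neighbor model, and all function names are assumptions made for the example.

```python
from typing import List, Tuple

Point = Tuple[float, int]  # (feature, label); a hypothetical toy format


def predict_1nn(train: List[Point], x: float) -> int:
    """Return the label of the training point closest to x."""
    nearest = min(train, key=lambda p: abs(p[0] - x))
    return nearest[1]


def accuracy(train: List[Point], valid: List[Point]) -> float:
    """Fraction of validation points a 1-NN model built on `train` gets right."""
    correct = sum(predict_1nn(train, x) == y for x, y in valid)
    return correct / len(valid)


def leave_one_out_values(train: List[Point], valid: List[Point]) -> List[float]:
    """Value of point i = accuracy with all data minus accuracy without point i."""
    full = accuracy(train, valid)
    values = []
    for i in range(len(train)):
        rest = train[:i] + train[i + 1:]  # training set with point i removed
        values.append(full - accuracy(rest, valid))
    return values


# Toy data: low features belong to class 0, high features to class 1.
# The last training point is deliberately mislabeled ("garbage in").
train_set = [(0.0, 0), (1.0, 0), (9.0, 1), (10.0, 1), (1.5, 1)]
valid_set = [(0.5, 0), (1.4, 0), (2.0, 0), (8.5, 1), (9.5, 1)]

values = leave_one_out_values(train_set, valid_set)
for point, value in zip(train_set, values):
    print(point, round(value, 2))
```

In this toy setup, the mislabeled point receives a negative value because removing it raises validation accuracy, which marks it as a candidate to drop during data selection. Leave-one-out requires retraining once per data point, so it is shown here only because it is easy to follow.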
ChatGPT’s developers have also noticed the importance of higher-quality data, as they recently announced the release of GPT-4. The latest technology is “multimodal,” meaning images as well as text prompts can spur it to generate content.
A large amount of data is required to develop this type of machine intelligence, but not all data is open source or public. Some data sets are owned by private entities, and privacy concerns are involved. Jia hopes that in the future, monetary incentives can be introduced to help acquire these types of data sets and improve the machine learning algorithms that are needed in all industries.
The University of California-Berkeley grad has had conversations with Google Research and Sony AI Research, among others, who are interested in the research benefits. Jia hopes these companies will adopt the technology developed and serve as advocates for data sharing. Sharing data and adopting improved machine learning algorithms will greatly benefit not only industries but individual consumers as well. For instance, if you’ve ever had a bad experience with a customer service chatbot, you’ve experienced low-quality data and poor machine learning algorithm design.
Jia hopes to use her background and area expertise to improve these web-based interactions for all. As a school-aged child, Jia always enjoyed math and science, but her decision to enter the electrical and computer engineering field stemmed from her desire to help people.
“Both of my parents are doctors. It was amazing to grow up seeing them help patients with some kind of medical formula,” said Jia. “That’s why I chose to study math and science. You can have a concrete impact. I’m using a different kind of formula to help, but I like that pursuing this career has made me feel like I can make a difference in someone’s life.”
The CAREER award is the National Science Foundation’s most prestigious award for early-career faculty with the potential to serve as academic role models in research and education and to lead advances in their organization’s mission. Throughout this project, Jia has demonstrated her desire to serve as an academic role model for graduate, undergraduate, and even K-12 students.
She is a core faculty member in the Sanghani Center for Artificial Intelligence and Data Analytics, formerly known as the Discovery Analytics Center. The center has more than 20 faculty members and 120 graduate students, two of whom are working directly with Jia to conduct the planned research.
Jia plans to implement an education plan that equips students with the skills to harness data to improve decision-making that affects society. This educational plan will start with new machine learning courses for undergraduate students in the first two years of the project and focus on K-12 engagement in years three through five.
“There was a famous statistician named John Tukey,” Jia said. “He had a saying that the best thing about being a statistician is that you get to play in everyone's backyard. Machine learning is very much the same. It touches many areas of my colleagues’ work so it is easy for me to build connections and collaborate with other people. I really feel that my research is a privilege. It's a privilege to work in this area that many people care about.”
For chatbots and beyond: Improving lives with data starts with improving machine learning
2023-04-10