Q&A: How to train AI when you don't have enough data

Artificial intelligence excels at sorting through information and detecting patterns or trends. But these machine learning algorithms need to be trained with large amounts of data first.

As researchers explore potential applications for AI, they have found scenarios where AI could be really useful — such as analyzing X-ray image data to look for evidence of rare conditions or detecting a rare fish species caught on a commercial fishing boat — but there's not enough data to accurately train the algorithms. 

Jenq-Neng Hwang, University of Washington professor of electrical and computer engineering, specializes in these issues. For example, Hwang and his team developed a method that teaches AI to monitor how many distinct poses a baby can achieve throughout the day. There are limited training datasets of babies, which meant the researchers had to create a unique pipeline to make their algorithm accurate and useful. The team recently published this work in the IEEE/CVF Winter Conference on Applications of Computer Vision 2024.

UW News spoke with Hwang about the project details and other similarly challenging areas the team is addressing.

Why is it important to develop an algorithm to track baby poses?

Jenq-Neng Hwang: We started a collaboration with the UW School of Medicine and the Korean Electronics and Telecommunications Research Institute's AI Lab. The goal of the project was to try to help families with a history of autism know whether their babies were also likely to have autism. Babies before 9 months don't really have language skills yet, so it's difficult to see if they’re autistic or not. Researchers developed one test, called the Alberta Infant Motor Scale, which categorizes various poses babies can do: If a baby can do this, they get two points; and if they can do that, they get three points; and so on. Then you add up all the points and if the baby is above some threshold, they likely don't have autism.

But to do this test, you need a doctor to observe all the different poses. It becomes a very tedious process because sometimes after three or four hours, we still haven't seen a baby do a specific pose. Maybe the baby could do it, but at that moment they didn't want to. One solution could be to use AI. Parents often have a baby monitor at home. The baby monitor could use AI to continuously and consistently track the various poses a baby does in a day.

Why is AI a good fit for this task?

JNH: My background is studying traditional image processing and computer vision. We were trying to teach computers to be able to figure out human poses from photos or videos, but the trouble is that there are so many variations. For example, even when the same person wears different outfits, it is challenging for traditional image processing to correctly identify that person's elbow in each photo.

But AI makes it so much easier. These models can learn. For example, you could train a machine learning model with a variety of motion captured sequences showing all different kinds of people. These sequences could be annotated with the corresponding 3D poses. Then this model could learn to output a 3D model of a person's pose on a sequence it has never seen before.

But in this case, there aren't a lot of motion captured sequences of babies that also have 3D pose annotations that you could use to train your machine learning model. What did you do instead?

JNH: For privacy reasons, we don't have a lot of 3D pose annotations of baby videos to train the machine learning model. It's also difficult to create a dataset where a baby is performing all the possible poses that we would need. Our datasets are too small, meaning that a model trained with them would not estimate reliable poses.

But we do have a lot of annotated 3D motion sequences of people in general. So, we developed this pipeline.

First we used the large amount of 3D motion sequences of regular people to train a generic 3D pose generative AI model, which is similar to the model used in ChatGPT and other GPT-4 types of large language models.

We then fine-tuned our generic model with our very limited dataset of annotated baby motion sequences. The generic model can then adapt to the small dataset and produce high-quality results.
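The two-stage idea described above (pretrain on plentiful generic data, then adapt to a scarce target set) can be illustrated with a deliberately tiny sketch. This is not the team's pose model: a one-variable linear regression stands in for the generative 3D pose model, and the datasets, the helper functions `train` and `mse`, and every numeric value are assumptions invented for illustration.

```python
import random

def train(data, w, b, lr, steps):
    """Gradient descent on mean squared error for the model y = w*x + b."""
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def mse(data, w, b):
    """Mean squared error of the model on a dataset."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def target(x):
    # Hypothetical "baby" relationship: related to the generic one, but shifted.
    return 2.2 * x + 0.8

rng = random.Random(0)

# Large generic dataset (stand-in for motion sequences of people in general).
generic = [(x, 2.0 * x + 1.0) for x in (rng.uniform(-1, 1) for _ in range(1000))]

# Tiny target dataset (stand-in for the limited annotated baby sequences).
small = [(x, target(x)) for x in (-0.5, 0.0, 0.5)]
held_out = [(x, target(x)) for x in (-0.9, -0.3, 0.3, 0.9)]

# Stage 1: pretrain on the generic data.
w0, b0 = train(generic, 0.0, 0.0, lr=0.1, steps=500)

# Stage 2: fine-tune briefly on the small target set, starting from stage 1.
w_ft, b_ft = train(small, w0, b0, lr=0.1, steps=20)

# Baseline: the same short training budget on the small set, from scratch.
w_sc, b_sc = train(small, 0.0, 0.0, lr=0.1, steps=20)

err_pretrained_only = mse(held_out, w0, b0)
err_finetuned = mse(held_out, w_ft, b_ft)
err_scratch = mse(held_out, w_sc, b_sc)
print(err_finetuned, err_pretrained_only, err_scratch)
```

In this toy setting the fine-tuned model has the lowest held-out error, the pretrained-only model comes second, and the model trained from scratch on the small set comes last. That ordering depends on the generic and target tasks being closely related, which is the assumption the pipeline relies on.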

Are there other tasks like this: good for AI, but there's not a lot of data to train an algorithm?

JNH: There are many types of scenarios where we don't have enough information to train the model. One example is a rare disease that is diagnosed by X-rays. The disease is so rare that we don't have enough X-ray images from patients with the disease to train a model. But we do have a lot of X-rays from healthy patients. So, we can use generative AI again to generate the corresponding synthetic X-ray image without disease, which can then be compared with the diseased image to identify disease regions for further diagnosis.
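The comparison step in the X-ray example can be sketched as a pixel-wise difference between the patient image and its synthetic healthy counterpart. The generative model itself is out of scope here: the sketch assumes the synthetic healthy image is already available, and the function name `disease_mask`, the tiny images and the threshold are all hypothetical.

```python
def disease_mask(patient_image, synthetic_healthy, threshold):
    """Flag pixels whose intensity differs from the healthy counterpart by more than threshold."""
    return [
        [abs(p - h) > threshold for p, h in zip(p_row, h_row)]
        for p_row, h_row in zip(patient_image, synthetic_healthy)
    ]

# Hypothetical 3x4 grayscale images with intensities in [0, 1].
synthetic_healthy = [
    [0.2, 0.2, 0.3, 0.3],
    [0.2, 0.3, 0.3, 0.2],
    [0.2, 0.2, 0.2, 0.2],
]
patient_image = [
    [0.2, 0.25, 0.3, 0.3],
    [0.2, 0.8, 0.9, 0.2],
    [0.2, 0.2, 0.2, 0.2],
]

mask = disease_mask(patient_image, synthetic_healthy, threshold=0.3)

# Collect (row, column) coordinates of the flagged pixels.
flagged = [(r, c) for r, row in enumerate(mask) for c, hit in enumerate(row) if hit]
print(flagged)  # [(1, 1), (1, 2)]
```

The small difference at row 0, column 1 stays below the threshold and is ignored, while the two bright pixels in the middle row are flagged as a candidate region for further diagnosis.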

Autonomous driving is another example. There are so many real events you cannot create. For example, say you are in the middle of driving and a few leaves blow in front of the car. If you use autonomous driving, the car might think something is wrong and slam on the brakes, because the car has never seen this scenario before. This could result in an accident.

We call these "long-tail" events, which means that they are unlikely to happen. But in daily life we always see random things like this. Until we figure out how to train autonomous driving systems to handle these types of events, autonomous driving cannot be useful. Our team is working on this problem by combining data from a regular camera with radar information. The camera and radar persistently check each other’s decisions, which can help a machine learning algorithm make sense of what's happening.
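One very simple way to picture two sensors checking each other is an agreement rule: act only when both independently report a likely obstacle. The sketch below is an assumption made for illustration, not the team's fusion method. The function `should_brake`, the confidence values and the threshold are hypothetical, and it assumes that blowing leaves look like an obstacle to the camera but produce only a weak radar return.

```python
def should_brake(camera_conf, radar_conf, threshold=0.5):
    """Brake only when both sensors independently report a likely obstacle."""
    return camera_conf >= threshold and radar_conf >= threshold

# Hypothetical confidences in [0, 1] that a solid obstacle is ahead.
leaves = should_brake(camera_conf=0.9, radar_conf=0.1)      # camera alarmed, radar is not
pedestrian = should_brake(camera_conf=0.9, radar_conf=0.8)  # both sensors agree

print(leaves, pedestrian)  # False True
```

A real system would weigh the sensors far more carefully than a single threshold, but the sketch shows why a second, independent signal can keep one sensor's surprise from triggering a hard stop.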

Additional co-authors on the baby poses paper are Zhuoran Zhou, a UW research assistant in the electrical and computer engineering department; Zhongyu Jiang and Cheng-Yen Yang, UW doctoral students in the electrical and computer engineering department; Wenhao Chai, a UW master's student studying electrical and computer engineering; and Lei Li, a doctoral fellow at the University of Copenhagen. This research was funded by the Electronics and Telecommunications Research Institute of Korea, the National Oceanic and Atmospheric Administration and Cisco Research.

For more information, contact Hwang at


