250 Fossils Is Enough to Train an AI to Identify Shark Species - and That Changes Paleontology
At the Florida Museum of Natural History, more than one million vertebrate fossil specimens wait in storage. Bags of fossil-rich sediment sit unsieved, their contents unidentified. This backlog is not unusual - it is a structural feature of vertebrate paleontology, a field where the number of specimens vastly exceeds the human capacity to classify them. Computer vision AI has the potential to help, but there has been a practical obstacle: how many fossils does it take to train a reliable model?
A study published in Paleobiology provides an answer. Working with six shark species and thousands of fossil teeth, a team led by Bruce MacFadden, UF Distinguished Professor Emeritus and retired curator of vertebrate paleontology, found that AI classification accuracy rose above 90 percent and plateaued at approximately 250 specimens per species - substantially fewer than researchers had assumed would be necessary.
Why Sharks, and Why Teeth
The team needed a vertebrate group with an abundant enough fossil record to actually test AI training thresholds. Sharks fit for a specific reason: their skeletons are made of cartilage, which almost never fossilizes, but their teeth are durable and persist long after every other trace of the animal disappears. Shark teeth are found in many fossil-bearing sedimentary layers in quantities comparable to pollen and spores, which palynologists have been classifying with AI since the 1980s.
The team selected six species from the Neogene period - spanning 23 to 2.6 million years ago - including both extinct species like Megalodon (the largest shark ever documented) and living species like the great white shark (Carcharodon carcharias). Photographing thousands of curated teeth from the Florida Museum collection was not quite sufficient; the team needed additional tiger shark and ancestral-great-white specimens, which three amateur fossil collectors loaned from their private collections.
Testing How Little Data Is Enough
The computer vision work was led primarily by Cristobal Barberis of Adaptive Computing and Arthur Porto, the Florida Museum's curator of artificial intelligence. The approach was methodical: they trained a series of models on labeled images, starting at 50 specimens per species and adding 50 at a time up to 500, then tested each model on 25 unlabeled images per species.
The results were more encouraging than the team expected. Classification accuracy exceeded 90 percent and plateaued at roughly 250 specimens. Adding more training images above that threshold produced marginal gains not worth the effort of acquiring and photographing additional specimens.
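The incremental design described above can be sketched in a few lines of Python. This is an illustration only, not the study's code: the function names and the `min_gain` cutoff used to decide what counts as a plateau are assumptions, and no accuracy figures from the paper are reproduced here. The species count, the 50-to-500 schedule, the 25 test images per species, and the 90 percent target come from the article.

```python
from typing import Dict, List, Optional

N_SPECIES = 6           # six shark species, per the article
TEST_PER_SPECIES = 25   # held-out test images per species, per the article


def training_sizes(start: int = 50, stop: int = 500, step: int = 50) -> List[int]:
    """Per-species training-set sizes: 50, 100, ..., 500."""
    return list(range(start, stop + 1, step))


def plateau_size(curve: Dict[int, float],
                 target: float = 0.90,
                 min_gain: float = 0.01) -> Optional[int]:
    """Return the smallest training size whose accuracy meets `target`
    and is not beaten by more than `min_gain` at any larger size.

    `curve` maps per-species training size -> test accuracy (0 to 1).
    `min_gain` is a hypothetical cutoff, not a value from the study.
    Returns None if no size qualifies.
    """
    sizes = sorted(curve)
    for i, n in enumerate(sizes):
        if curve[n] < target:
            continue
        later = [curve[m] for m in sizes[i + 1:]]
        if all(acc - curve[n] <= min_gain for acc in later):
            return n
    return None


if __name__ == "__main__":
    # Total images needed at each step of the schedule.
    for n in training_sizes():
        print(f"{n} per species -> {n * N_SPECIES} training images, "
              f"{TEST_PER_SPECIES * N_SPECIES} test images")
```

Given measured accuracies for each training size, `plateau_size` reports the point past which extra specimens stop paying off under the chosen cutoff. A stricter or looser `min_gain` would move that point, which is why the threshold is best read as approximate.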
More striking was the performance at the low end. Models trained on only 50 specimens per species still produced accuracy rates of at least 93 percent. "The thing that pleasantly surprised me is that even when you have fairly low sample sizes, you can still get pretty reasonable performance," said Porto.
What This Means for Vertebrate Fossil Collections
Vertebrate paleontology has traditionally been disadvantaged relative to other paleontological subdisciplines when it comes to AI-assisted identification. A vertebrate skeleton can contain more than 200 bones, specimens are rarely complete, fragments from different individuals rarely overlap morphologically, and matching a broken fragment to a species requires trained expertise that takes years to develop. Those constraints have kept the backlog growing.
The finding that 250 well-photographed specimens per species can yield a reliable classifier changes the calculus. For species with at least that many accessible specimens in institutional collections, AI-assisted identification is now technically feasible. "We were getting accuracies of greater than 90%," MacFadden said, describing results that the team characterized as a practical threshold for deployment.
Shark teeth were a deliberately favorable test case - plentiful, morphologically distinct, and relatively easy to photograph consistently. For species with sparser fossil records or specimens that are harder to photograph in a standardized way, the approach may be more challenging. The study establishes a lower bound on training data requirements, but that bound may shift depending on how morphologically similar the species being classified are and how consistent specimen preservation is.
Beyond the Museum
MacFadden and colleagues have also done substantial work bringing fossils into K-12 classrooms through the SharkAI project. The envisioned endpoint is a curriculum in which students use AI classifiers to identify shark teeth from online biorepositories by tooth shape, then relate that shape to the prey the animal would have hunted - connecting fossil morphology to ecology through a tool students can operate themselves.
Additional authors include Maria Vallejo-Pareja, Stephanie Killingsworth, Samantha Zbinden, Victor Perez, Kenneth Marks, and Devi Hall.