Whale ID
Humpback whale re-identification from underwater photographs.
The technology outlasted the animal it made permanent.
Moorea, French Polynesia. Humpback whales arrive on a predictable seasonal migration between July and November; the same family units return to the same waters across years. They have been photographed by the same researchers and the same NGO partners through enough seasons that a longitudinal record exists in principle, embedded in tens of thousands of full-body photographs sitting in folders across multiple institutions. The bottleneck was never sighting effort. It was the matching that comes afterwards. A trained eye takes roughly twenty minutes per photograph to confirm whether the animal in the frame has been seen before. At population scale, with multi-decade datasets, that bottleneck made true longitudinal analysis uneconomic.
A humpback whale lives across decades. Each photograph, in theory, is a longitudinal data point on an individual that will outlive most of its observers. The promise of population-scale, generational behavioral science depends on whether re-identification is a constant-time operation. Traditionally it is not.
The approach is an embedding model trained on the Moorea field corpus, plus similarity search over the population. Each new photograph is reduced to a vector and compared against everything previously catalogued; matches surface in milliseconds. The model isn't the contribution; it is well-understood machinery. The contribution is conservation infrastructure: a re-identification step that fades into the background of a longitudinal study.
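To make the matching step concrete, here is a minimal sketch of similarity search over a catalogue of embeddings. It is an illustration under stated assumptions, not the project's implementation: the source does not name a similarity metric, so cosine similarity is assumed, and the whale identifiers, the three-dimensional vectors, and the function names (`normalize`, `cosine_similarity`, `best_matches`) are all hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def best_matches(query, catalogue, top_k=3):
    """Rank catalogued individuals by similarity to the query embedding.

    catalogue maps an individual's id to its stored embedding.
    This is a brute-force linear scan over every entry.
    """
    scored = [
        (cosine_similarity(query, embedding), whale_id)
        for whale_id, embedding in catalogue.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [(whale_id, score) for score, whale_id in scored[:top_k]]


# Hypothetical toy catalogue; real embeddings would have far more dimensions.
catalogue = {
    "whale_A": [0.9, 0.1, 0.0],
    "whale_B": [0.0, 1.0, 0.2],
    "whale_C": [0.1, 0.0, 1.0],
}

# Hypothetical embedding of a newly taken photograph.
new_photo = [0.8, 0.2, 0.1]

matches = best_matches(new_photo, catalogue, top_k=2)
for whale_id, score in matches:
    print(f"{whale_id}: {score:.3f}")
```

The linear scan keeps the sketch short. A catalogue built from tens of thousands of photographs might instead use an approximate nearest-neighbour index, and a real pipeline would likely apply a similarity threshold so that a photograph of a previously unseen animal is not forced onto an existing identity.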
Field-collected corpus only — no synthetic augmentation, no scraped imagery, no crowdsourcing. Photographs were taken from the boat over multiple migration seasons under variable light, sea state, and behavioral context. The model has to be robust to those conditions because they are the conditions of the field. A clean corpus would be a different problem.
Dr. Delacy — cetacean re-identification collaborator and co-author on the forthcoming paper. Moorea-based NGO partner for field corpus.