NTSB Scrubs Public Files After AI Was Used to ‘Resurrect’ Voices of Deceased Pilots

A Digital Resurrection in the Public Record
The National Transportation Safety Board (NTSB) recently found itself at the intersection of federal privacy laws and the rapid evolution of generative AI. The agency was forced to temporarily suspend public access to its accident docket system after discovering that the voices of pilots killed in a UPS plane crash had been digitally reconstructed and circulated online.
The incident centers on a technical loophole in how the NTSB shares investigation data. By law, the agency is prohibited from releasing actual audio recordings from cockpit voice recorders (CVRs) to the public. To provide transparency without violating that prohibition, the NTSB often includes spectrograms in its dockets: images that plot how strongly each sound frequency is present over time, allowing investigators to analyze audio patterns without playing the actual recording.
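To make the idea of a spectrogram concrete, here is a minimal Python sketch of how one is computed. It is not the NTSB's tooling, which the article does not describe. Everything in it is an assumption chosen for illustration: the 8 kHz sample rate, the frame and hop sizes, the synthetic 1 kHz tone standing in for recorded audio, and all function names. The sketch slices the signal into short overlapping frames and keeps only the strength of each frequency in each frame.

```python
import cmath
import math

# Illustrative assumptions, not values taken from any NTSB docket.
SAMPLE_RATE = 8000   # samples per second
FRAME_SIZE = 256     # samples per analysis frame
HOP_SIZE = 128       # step between consecutive frames


def hann(n, size):
    """Hann window value at index n for a frame of the given size."""
    return 0.5 - 0.5 * math.cos(2 * math.pi * n / (size - 1))


def magnitude_spectrum(frame):
    """Magnitudes of the DFT bins from 0 Hz up to the Nyquist frequency."""
    size = len(frame)
    windowed = [s * hann(i, size) for i, s in enumerate(frame)]
    mags = []
    for k in range(size // 2 + 1):
        total = sum(
            windowed[n] * cmath.exp(-2j * math.pi * k * n / size)
            for n in range(size)
        )
        mags.append(abs(total))  # keep the magnitude, drop the phase
    return mags


def spectrogram(samples):
    """One magnitude spectrum (one image column) per overlapping frame."""
    return [
        magnitude_spectrum(samples[start:start + FRAME_SIZE])
        for start in range(0, len(samples) - FRAME_SIZE + 1, HOP_SIZE)
    ]


def synthetic_tone(freq_hz, seconds):
    """A pure sine tone standing in for recorded audio."""
    count = int(SAMPLE_RATE * seconds)
    return [math.sin(2 * math.pi * freq_hz * n / SAMPLE_RATE) for n in range(count)]


def peak_frequency(column):
    """Frequency in Hz of the strongest bin in one spectrogram column."""
    peak_bin = max(range(len(column)), key=lambda k: column[k])
    return peak_bin * SAMPLE_RATE / FRAME_SIZE


tone = synthetic_tone(1000.0, 0.1)
spec = spectrogram(tone)
print(len(spec), "columns,", len(spec[0]), "frequency bins each")
print("strongest frequency in first column:", peak_frequency(spec[0]), "Hz")
```

Each column of the result corresponds to one vertical strip of the image, and each value to the brightness of one pixel. The original waveform is not stored, but the pitch and timing of the tone can still be read directly from the numbers.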
For decades, these images were essentially useless to the general public, serving as mathematical snapshots of sound that required specialized expertise to interpret. However, the rise of sophisticated AI audio synthesis has made it possible to turn these static images back into an approximation of the original sound.
The Spectrogram Loophole
The catalyst for the current controversy was a realization shared on social media by Scott Manley, a YouTuber known for his deep dives into physics and astronomy. Manley pointed out that the megabytes of data encoded within a high-resolution spectrogram, when paired with a publicly available written transcript, could be used to reverse-engineer the original audio.
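A back-of-the-envelope calculation shows why the amount of data in an image matters here. Every number below is a hypothetical chosen for illustration, not a measurement of any NTSB file: a 4000 by 1000 pixel grayscale image at 8 bits per pixel, compared with 30 seconds of 16 kHz, 16-bit mono audio. The function names are likewise made up for this sketch.

```python
def image_bytes(width_px, height_px, bits_per_pixel):
    """Raw (uncompressed) size of an image in bytes."""
    return width_px * height_px * bits_per_pixel // 8


def audio_bytes(seconds, sample_rate_hz, bits_per_sample):
    """Raw (uncompressed) size of a mono audio clip in bytes."""
    return seconds * sample_rate_hz * bits_per_sample // 8


# Hypothetical figures, assumed purely for illustration.
spectrogram_size = image_bytes(4000, 1000, 8)   # 4,000,000 bytes
audio_size = audio_bytes(30, 16000, 16)         # 960,000 bytes

print(f"spectrogram image: {spectrogram_size / 1e6:.2f} MB")
print(f"audio clip:        {audio_size / 1e6:.2f} MB")
print(f"ratio:             {spectrogram_size / audio_size:.2f}x")
```

Under these assumptions the image holds roughly four times as many raw bytes as the audio it depicts. Raw pixel count is only an upper bound on usable information, since image compression and the missing phase data both reduce it. Still, it shows how a sufficiently detailed picture can carry data on the same scale as the recording itself.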
It didn’t take long for users to test this theory. Using a combination of the spectrogram from UPS Flight 2976—which crashed in Louisville, Kentucky—and the official transcript of the final moments, internet users employed AI tools, including Codex, to create approximations of the pilots’ voices. The result was a synthetic audio file that mimicked the tone, cadence, and urgency of the deceased crew, effectively bypassing the federal restrictions meant to protect the privacy of the victims and their families.
The Agency’s Response
The NTSB reacted by pulling the plug on its public docket system to assess the extent of the vulnerability. While the agency restored general access on Friday, the move highlights a growing tension between the “open data” ethos of government safety agencies and the capabilities of modern machine learning.
For now, 42 investigations remain closed to the public pending a full security review, among them the file for Flight 2976. The agency now has to determine whether spectrograms, previously considered a safe, anonymized way to share data, must be redacted or altered to prevent AI-driven reconstruction.
The New Ethics of Forensic Data
This breach is not just a technical failure but an ethical one. The CVR is intended to be a tool for safety improvement, not for public consumption. The trauma associated with hearing the final moments of a fatal accident is precisely why the NTSB maintains strict silos around audio data. When AI bridges that gap, it creates a scenario where the dead are essentially “re-voiced” without consent.
As generative AI becomes more capable of interpreting non-traditional data sources, the NTSB’s struggle suggests that traditional methods of data masking are no longer sufficient. What was once a secure image is now a blueprint for a voice clone, signaling a shift in how forensic evidence must be handled in an era of pervasive AI.