Is this audio AI-generated?

Drop an mp3, wav, m4a, ogg, or flac. We check for AudioSeal / SynthID watermarks first, then compute a 1024-bin STFT spectrogram in your browser and surface three forensic signals: spectral flatness uniformity, harmonic stability, and centroid drift. Nothing is uploaded.
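For readers who want to see what two of these signals measure, here is a minimal sketch in plain Python. It assumes you already have one STFT frame as a list of per-bin magnitudes or powers. The function names, the epsilon, and the choice of standard deviation as the "drift" statistic are illustrative assumptions, not the tool's actual code. Harmonic stability is left out because its definition is not given here.

```python
import math

def spectral_flatness(power):
    """Geometric mean over arithmetic mean of one power-spectrum frame.
    Close to 1.0 for noise-like frames, close to 0.0 for tonal ones."""
    eps = 1e-12  # assumption: small floor so silent bins do not hit log(0)
    n = len(power)
    log_mean = sum(math.log(p + eps) for p in power) / n
    arith_mean = sum(power) / n + eps
    return math.exp(log_mean) / arith_mean

def spectral_centroid(magnitude, sample_rate, n_fft):
    """Magnitude-weighted mean frequency in Hz.
    Bin k is centred at k * sample_rate / n_fft."""
    total = sum(magnitude)
    if total == 0:
        return 0.0  # silent frame: no meaningful centroid
    weighted = sum(k * sample_rate / n_fft * m for k, m in enumerate(magnitude))
    return weighted / total

def centroid_drift(centroids):
    """Hypothetical drift measure: standard deviation of per-frame centroids."""
    mean = sum(centroids) / len(centroids)
    return math.sqrt(sum((c - mean) ** 2 for c in centroids) / len(centroids))
```

Computing flatness per frame and then looking at how much it varies across frames is one plausible reading of "flatness uniformity"; other definitions are possible.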

Honest about the limits: the best public open detectors on Speech DF Arena (Dec 2025) sit at 13.84–18.02% average EER in cross-dataset evaluation, and ASVspoof 5 winners that score 3.30% EER on their training distribution jump to 10–18% on older datasets. We separate “synthetic,” “suspicious,” and “out of domain” instead of forcing a single verdict.
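If EER is unfamiliar: it is the error rate at the threshold where a detector misses synthetic clips as often as it falsely flags real ones. The sketch below approximates it with a simple threshold sweep. It assumes higher scores mean "more likely synthetic", and the function name and sample scores are made up for illustration. Benchmark toolkits typically interpolate between thresholds, so their numbers can differ slightly from this.

```python
def equal_error_rate(synthetic_scores, real_scores):
    """Approximate EER by sweeping every observed score as a threshold.
    A clip is flagged as synthetic when its score >= threshold."""
    thresholds = sorted(set(synthetic_scores) | set(real_scores))
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # Synthetic clips that slip under the threshold.
        miss_rate = sum(s < t for s in synthetic_scores) / len(synthetic_scores)
        # Real clips that are wrongly flagged.
        false_alarm_rate = sum(s >= t for s in real_scores) / len(real_scores)
        gap = abs(miss_rate - false_alarm_rate)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (miss_rate + false_alarm_rate) / 2
    return eer

# Made-up scores: one synthetic clip scores low, one real clip scores high.
print(equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.1, 0.2, 0.3, 0.6]))  # 0.25
```

An EER of 0.25 means one in four clips of each kind is misclassified at the balanced threshold.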

Two lanes: watermark, then forensic

An AudioSeal or SynthID hit is a high-confidence positive, but coverage is limited to audio that carries one of those watermarks. Spectral forensics is broader but noisier. We surface them as separate evidence, and a watermark match outranks any soft score in the UI.
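One way to picture the ordering rule is a sort where the lane is compared before the score. This is a hypothetical sketch: the `Evidence` class, its field names, and the example scores are assumptions for illustration, not the tool's data model.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Evidence:
    lane: str               # "watermark" or "forensic"
    detail: str             # human-readable description of the finding
    score: Optional[float]  # soft score in [0, 1]; None for a watermark hit

def order_evidence(items: List[Evidence]) -> List[Evidence]:
    """Watermark hits first, then forensic signals by descending score.
    The lane is the primary sort key, so no soft score can outrank a hit."""
    return sorted(items, key=lambda e: (e.lane != "watermark", -(e.score or 0.0)))

findings = [
    Evidence("forensic", "spectral flatness uniformity", 0.99),
    Evidence("watermark", "AudioSeal match", None),
    Evidence("forensic", "centroid drift", 0.40),
]
for e in order_evidence(findings):
    print(e.lane, "|", e.detail)
```

Even a 0.99 forensic score stays below the watermark match, because the two lanes are never blended into one number.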

Codec- and channel-aware

The hard 2026 lesson: noise isn't the real enemy. Neural codecs, pitch shift, time stretch, room reverb, and telephony pipelines (Opus, AMR-WB, EVS, SILK) hurt more. ADD-C shows baseline detectors degrade by ~5.30 EER points under realistic communication conditions; Wav2Vec2 drops to 0.558 accuracy under a 0.1 s echo on clean WaveFake. We classify the channel first, then weight the verdict accordingly.

Speech ≠ music

Suno and Udio end-to-end songs need a different detector than ElevenLabs or Sesame voices. We auto-classify the clip as speech, singing, or music. If your file is music-heavy, we'll tell you the speech score is out of domain rather than report a falsely confident result.
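The gating idea can be sketched as a check that runs before any score is interpreted. Everything specific here is an assumption for illustration: the function name, the two thresholds, the "no strong signal" label for low scores, and the choice to treat singing as out of domain for the speech detector.

```python
def speech_verdict(content_type: str, speech_score: float,
                   synthetic_at: float = 0.8, suspicious_at: float = 0.5) -> str:
    """Map a speech-detector score to a label, but only for speech clips.
    content_type is one of "speech", "singing", "music" (hypothetical values)."""
    if content_type != "speech":
        # Assumption: the speech detector is not trusted on singing or music,
        # so the score is not interpreted at all.
        return "out of domain"
    if speech_score >= synthetic_at:   # hypothetical threshold
        return "synthetic"
    if speech_score >= suspicious_at:  # hypothetical threshold
        return "suspicious"
    return "no strong signal"          # hypothetical label for low scores

print(speech_verdict("music", 0.95))   # out of domain
print(speech_verdict("speech", 0.95))  # synthetic
print(speech_verdict("speech", 0.60))  # suspicious
```

Checking the content type first means a high score on a song never surfaces as "synthetic" from a detector that was built for speech.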