TASTE-S Demo Page

Audio comparison demos of reconstruction quality and similarity in shortform and longform settings.

Speech Reconstruction (shortform)

Comparison of the original audio with samples reconstructed by different codecs and tokenization methods at comparable bitrates (bps).

Samples Original TASTE-S (ours) Text-only TASTE TaDiCodec Encodec (1500 bps) DM-Codec (1000 bps) Mimi (1000 bps) SpeechTokenizer (500 bps) BigCodec (1040 bps) WavTokenizer (480 bps)
Sample 1
Sample 2
Sample 3

Longform Reconstruction

Comparison of longform speech reconstruction between TASTE-S and TASTE.

Samples Original TASTE-S (ours; w/ built-in ASR) TASTE-S (ours; w/ external ASR) TASTE (w/ external ASR)
Sample 1
Original: duration 173.6 s
TASTE-S (ours; w/ built-in ASR): duration 173.9 s
TASTE-S (ours; w/ external ASR): duration 175.3 s
TASTE (w/ external ASR): duration 183.1 s