Real-valued Swin VAE · ×64 compression

High-fidelity music,
compressed.

SAGE is a Swin-based variational autoencoder that reconstructs music at a ×64 compression rate.

SCROLL TO DISCOVER

A/B comparison

Hear the difference

Same clip, every model. Switch instantly.

Benchmarks

Reconstruction & perceptual metrics

Reconstruction fidelity and perceptual quality against state-of-the-art audio VAEs. Final numbers land with the last training run.

Semantic probing metrics

How well the frozen latent space supports downstream semantic tasks. Final numbers land with the last training run.

Subjective evaluation

Take the listening test

A short MUSHRA test.

Start the test