Baseline Systems & Latency Guidelines
Table of Contents
- Overview of the Baseline Systems
- Track 1 Baseline Model
- Track 2 Baseline Model
- Latency Calculation Guidelines
Overview of the Baseline Systems
The End-to-End Speech Processing Toolkit (ESPnet) repository includes implementations of several neural audio codecs, including SoundStream [1] and EnCodec [2], from which we derive our baseline systems for the LRAC challenge. These codecs follow a convolutional encoder–decoder architecture with a quantizer in the middle. Both SoundStream and EnCodec employ a Residual Vector Quantizer (RVQ) to compress the encoder embeddings. Our baseline systems adopt this design and are trained with a generative adversarial network (GAN) approach.
Track 1 Baseline Model
Encoder
The Track 1 baseline model uses an encoder operating directly on the raw audio waveform. It begins with an input convolution layer with kernel size 7 and 8 output channels. This is followed by four convolutional blocks. Each block consists of:
- A stack of three residual convolution sub-blocks, and
- A strided convolution layer for temporal downsampling.
The strides of the blocks are 3, 4, 4, and 5, resulting in an overall stride of 240 samples (10 ms). Each residual sub-block contains two convolutions with ELU nonlinearities, wrapped by skip connections. The embedding dimension increases progressively to 16, 32, 64, and finally 160 at each strided convolution. The encoder does not include a dedicated output layer in order to reduce compute.
- Complexity: 378 MFLOPS
- Buffering latency: 10 ms
- Algorithmic latency: 10 ms
We ignore the compute cost of ELU nonlinearities, counting only multiply–accumulate operations. One motivation for this is that ELU can be replaced with cheaper nonlinearities such as Leaky ReLU or PReLU without significantly reducing modeling capacity.
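To make the structure concrete, below is a minimal PyTorch sketch of an encoder with this topology. The input kernel size, channel progression, strides, and the residual-plus-strided-convolution block layout follow the description above; the residual sub-block kernel sizes, the strided-convolution kernel sizes (assumed to be twice the stride), the padding scheme, and all module names are assumptions, so treat this as an illustration rather than the actual baseline implementation.

```python
import torch
import torch.nn as nn


class ResidualUnit(nn.Module):
    """Residual sub-block: two convolutions with ELU nonlinearities, wrapped by a
    skip connection. Kernel sizes (3 and 1) and the padding scheme are assumptions."""
    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        self.block = nn.Sequential(
            nn.ELU(),
            nn.Conv1d(channels, channels, kernel_size, padding=kernel_size - 1),
            nn.ELU(),
            nn.Conv1d(channels, channels, 1),
        )

    def forward(self, x):
        # Trim the extra samples produced by the padding so the skip connection lines up.
        return x + self.block(x)[..., : x.shape[-1]]


class Track1EncoderSketch(nn.Module):
    """Input convolution (kernel 7, 8 channels), then four blocks of three residual
    sub-blocks plus a strided convolution. Strides 3, 4, 4, 5 give an overall stride
    of 240 samples (10 ms at 24 kHz); channels grow to 16, 32, 64, and 160.
    There is no dedicated output layer. Strided-conv kernel = 2 * stride is an assumption."""
    def __init__(self):
        super().__init__()
        strides = [3, 4, 4, 5]
        channels = [8, 16, 32, 64, 160]
        layers = [nn.Conv1d(1, channels[0], kernel_size=7, padding=6)]
        for i, s in enumerate(strides):
            layers += [ResidualUnit(channels[i]) for _ in range(3)]
            layers += [nn.ELU(),
                       nn.Conv1d(channels[i], channels[i + 1],
                                 kernel_size=2 * s, stride=s, padding=s)]
        self.net = nn.Sequential(*layers)

    def forward(self, x):        # x: (batch, 1, samples)
        return self.net(x)       # (batch, 160, ~samples // 240)


if __name__ == "__main__":
    enc = Track1EncoderSketch()
    z = enc(torch.randn(1, 1, 24000))   # 1 s of 24 kHz audio
    print(z.shape)                      # about 100 frames: one 160-dim embedding per 10 ms
```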
Quantization
We use an RVQ with 6 layers, each containing 1024 codewords, contributing 10 bits per frame. With an encoder frame rate of 100 Hz, each layer adds 1 kbps, leading to 6 kbps total.
Each RVQ layer has projection layers that map the 160-dimensional input down to 12 dimensions and then project the selected codeword back to 160 dimensions. Residuals are computed in the original 160-dimensional space.
- Complexity (6 kbps): 19.4 MFLOPS
- Training: We train using a 50/50 mix of 1 kbps and 6 kbps.
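The quantizer configuration above can be illustrated with the following sketch of a projected residual vector quantizer in PyTorch. The 6 layers, 1024 codewords, 160-dimensional embeddings, 12-dimensional code space, and residuals computed in the 160-dimensional space follow the text; codebook learning (EMA updates, commitment-loss wiring, straight-through gradients) is omitted, and all names are illustrative.

```python
import torch
import torch.nn as nn


class RVQLayerSketch(nn.Module):
    """One RVQ stage: project 160 -> 12 dims, pick the nearest of 1024 codewords,
    project the codeword back to 160 dims. Each index costs log2(1024) = 10 bits/frame."""
    def __init__(self, dim: int = 160, code_dim: int = 12, codebook_size: int = 1024):
        super().__init__()
        self.down = nn.Linear(dim, code_dim, bias=False)
        self.up = nn.Linear(code_dim, dim, bias=False)
        self.codebook = nn.Parameter(torch.randn(codebook_size, code_dim))

    def forward(self, residual):                               # (batch, frames, 160)
        z = self.down(residual)                                # (batch, frames, 12)
        dist = (z.unsqueeze(-2) - self.codebook).pow(2).sum(-1)
        idx = dist.argmin(dim=-1)                              # (batch, frames) indices
        return self.up(self.codebook[idx]), idx                # back to (batch, frames, 160)


class RVQSketch(nn.Module):
    """Six layers -> 60 bits/frame; at 100 frames/s this is 6 kbps. Using only the
    first layer gives 1 kbps, which enables the 1/6 kbps training mix."""
    def __init__(self, num_layers: int = 6):
        super().__init__()
        self.layers = nn.ModuleList(RVQLayerSketch() for _ in range(num_layers))

    def forward(self, x, num_active_layers: int = 6):
        residual, quantized, codes = x, torch.zeros_like(x), []
        for layer in self.layers[:num_active_layers]:
            q, idx = layer(residual)
            residual = residual - q          # residual stays in the 160-dim space
            quantized = quantized + q
            codes.append(idx)
        return quantized, codes


if __name__ == "__main__":
    rvq = RVQSketch()
    emb = torch.randn(1, 100, 160)              # 1 s of 100 Hz encoder embeddings
    q6, _ = rvq(emb, num_active_layers=6)       # 6 layers * 10 bits * 100 Hz = 6 kbps
    q1, _ = rvq(emb, num_active_layers=1)       # 1 kbps operating point
```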
Decoder
The decoder is a convolutional network with four blocks and a final output layer. Each block begins with a transposed convolution for upsampling, followed by three residual convolution sub-blocks.
- Strides of transposed convolutions: 5, 4, 3, 4 → overall stride of 240 samples
- Output: 24 kHz audio
- Kernel size of transposed convolutions: equal to stride (no implicit overlap–add)
- Output layer: kernel size 21 with Tanh nonlinearity
- Complexity: 296.8 MFLOPS
- Algorithmic latency: 240 samples (10 ms)
Overall system latency: 30 ms (10 ms encoder algorithmic latency + 10 ms decoder algorithmic latency + 10 ms buffering).
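A corresponding decoder sketch is shown below. The transposed-convolution strides, the kernel-size-equals-stride choice, and the kernel-21 Tanh output layer follow the description; the channel progression, residual sub-block internals, and padding handling are assumptions.

```python
import torch
import torch.nn as nn


class ResidualUnit(nn.Module):
    """Same assumed residual sub-block as in the encoder sketch above."""
    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        self.block = nn.Sequential(
            nn.ELU(),
            nn.Conv1d(channels, channels, kernel_size, padding=kernel_size - 1),
            nn.ELU(),
            nn.Conv1d(channels, channels, 1),
        )

    def forward(self, x):
        return x + self.block(x)[..., : x.shape[-1]]


class Track1DecoderSketch(nn.Module):
    """Four blocks, each a transposed convolution (kernel size == stride, so no
    implicit overlap-add) followed by three residual sub-blocks, then a final
    kernel-21 convolution with Tanh. Strides 5, 4, 3, 4 give an overall upsampling
    factor of 240 (10 ms of 24 kHz audio per input embedding). The channel
    progression below is an assumption."""
    def __init__(self):
        super().__init__()
        strides = [5, 4, 3, 4]
        channels = [160, 64, 32, 16, 8]
        layers = []
        for i, s in enumerate(strides):
            layers += [nn.ELU(),
                       nn.ConvTranspose1d(channels[i], channels[i + 1],
                                          kernel_size=s, stride=s)]
            layers += [ResidualUnit(channels[i + 1]) for _ in range(3)]
        layers += [nn.Conv1d(channels[-1], 1, kernel_size=21, padding=20), nn.Tanh()]
        self.net = nn.Sequential(*layers)

    def forward(self, z):                    # z: (batch, 160, frames)
        x = self.net(z)
        return x[..., : z.shape[-1] * 240]   # 240 output samples per embedding


if __name__ == "__main__":
    dec = Track1DecoderSketch()
    audio = dec(torch.randn(1, 160, 100))    # 100 embeddings -> 1 s of 24 kHz audio
    print(audio.shape)                       # torch.Size([1, 1, 24000])
```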
Training Setup
We train using four loss terms:
- Codebook commitment loss (weight: 10)
- Multiscale STFT loss in the mel domain (weight: 5)
- Adversarial loss with multi-scale feature discriminators in the complex STFT domain (weight: 1)
- Feature matching loss on discriminator representations (weight: 2)
The multiscale STFT loss uses window lengths of 64, 128, 256, 512, 1024, and 2048 with 10, 20, 40, 80, 160, and 320 mel bins, respectively.
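One possible form of such a multiscale mel-domain STFT loss is sketched below using torchaudio. The window lengths and mel-bin counts are those listed above; the hop sizes (a quarter of the window), the log-magnitude L1 distance, and the averaging over scales are assumptions, since the exact recipe is not specified here.

```python
import torch
import torchaudio

# Window lengths and mel-bin counts from the text; the other settings are assumptions.
SCALES = [(64, 10), (128, 20), (256, 40), (512, 80), (1024, 160), (2048, 320)]


def multiscale_mel_stft_loss(pred: torch.Tensor, target: torch.Tensor,
                             sample_rate: int = 24000) -> torch.Tensor:
    """L1 distance between log-mel spectrograms, averaged over six STFT scales."""
    loss = 0.0
    for win, n_mels in SCALES:
        mel = torchaudio.transforms.MelSpectrogram(
            sample_rate=sample_rate, n_fft=win, win_length=win,
            hop_length=win // 4, n_mels=n_mels,
        ).to(pred.device)
        eps = 1e-5
        loss = loss + (torch.log(mel(pred) + eps)
                       - torch.log(mel(target) + eps)).abs().mean()
    return loss / len(SCALES)


if __name__ == "__main__":
    x = torch.randn(1, 24000)   # decoded audio (placeholder)
    y = torch.randn(1, 24000)   # reference audio (placeholder)
    print(multiscale_mel_stft_loss(x, y))
```

In practice the mel transforms would be created once and reused rather than rebuilt on every call; they are constructed inline here only to keep the sketch short.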
- Training length: 1150 epochs
- Epoch size: 10,000 utterances sampled from the training set
- Data: clean training data only, no augmentations
- Validation set: 1,000 held-out utterances (non-overlapping speakers with training set)
For details on datasets and preprocessing, see Speech Datasets.
The final Track 1 baseline model was selected based on the lowest multiscale STFT loss on the validation set.
Note 1: The objective metric results of the Track 1 baseline system on the open test set are now available on the leaderboard website.
Note 2: The Track 1 baseline model weights will be released soon.
Track 2 Baseline Model
The Track 2 baseline model shares the same architectural principles and loss functions as Track 1, but is trained as a joint codec and enhancement network. Specifically, the input consists of noisy and reverberant audio, while the target is the clean reference signal.
Encoder
- Input convolution → 8 channels
- Followed by five convolutional blocks: each contains residual sub-blocks + strided convolution
- Strides: 2, 2, 3, 4, 5 → overall stride of 240 samples
- Channel size is increased after each strided convolution to 16, 32, 64, 128, and finally 320.
- Complexity: 1946 MFLOPS (a rough estimation sketch for such figures follows this list)
- Buffering latency: 10 ms
- Algorithmic latency: 20 ms
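As a rough illustration of how such complexity figures relate to the layer configuration, the helper below counts multiply–accumulate operations per second for the strided convolutions of an encoder like this one. It is a sketch only: the residual sub-blocks and input convolution are ignored and the kernel sizes are assumed, so it will not reproduce the exact MFLOPS numbers quoted above.

```python
def conv1d_macs_per_second(in_ch, out_ch, kernel_size, output_rate_hz):
    """MACs/s for one Conv1d layer: each output frame needs
    in_ch * out_ch * kernel_size multiply-accumulates (bias ignored)."""
    return in_ch * out_ch * kernel_size * output_rate_hz


def strided_stack_macs_per_second(strides, channels, kernel_sizes, sample_rate_hz):
    """Sum MACs/s over a stack of strided convolutions mapping channels[i] to
    channels[i + 1] with stride strides[i]. Residual sub-blocks and the input
    convolution are ignored, so this is only a lower bound on the true complexity."""
    total, rate = 0, sample_rate_hz
    for s, k, c_in, c_out in zip(strides, kernel_sizes, channels, channels[1:]):
        rate = rate / s                                   # frame rate after this layer
        total += conv1d_macs_per_second(c_in, c_out, k, rate)
    return total


if __name__ == "__main__":
    # Track 2 encoder skeleton from the text: strides 2, 2, 3, 4, 5 and channels
    # 8 -> 16 -> 32 -> 64 -> 128 -> 320. Kernel size = 2 * stride is an assumption.
    strides = [2, 2, 3, 4, 5]
    channels = [8, 16, 32, 64, 128, 320]
    kernels = [2 * s for s in strides]
    macs = strided_stack_macs_per_second(strides, channels, kernels, sample_rate_hz=24000)
    print(f"{macs / 1e6:.1f} MMAC/s from the strided convolutions alone")
```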
Quantization
RVQ configuration is the same as Track 1:
- 6 layers, each with 1024 codewords → 10 bits/frame
- Frame rate: 100 Hz → each layer adds 1 kbps
- Projections: Each layer has input and output projections from 320 dimensions to 24 and back to 320 after codeword selection.
- Residuals are computed in the original 320-dimensional space.
- Complexity (6 kbps): ~48 MFLOPS
- Training: mixture of 1 kbps and 6 kbps.
Decoder
The Track 2 decoder is similar to Track 1 but has five convolutional blocks instead of four.
- Strides of transposed convolutions: 5, 4, 3, 2, 2
- Kernel sizes: equal to strides
- Output layer: kernel size 21 with Tanh nonlinearity
- Complexity: 594 MFLOPS
- Algorithmic latency: 480 samples (20 ms)
Overall system latency: 50 ms (20 ms encoder, 20 ms decoder, 10 ms buffering).
Training Setup
We apply noise and reverberation augmentations during Track 2 training. When reverberation is applied to the input, the target signal contains the direct path and early reflections. Noise augmentation is applied 80% of the time, with the SNR sampled uniformly in [−5, 30] dB. Reverberation augmentation is applied 50% of the time. Training of the Track 2 model is still ongoing; more details on the training process will be provided once training is complete.
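The augmentation policy described above amounts to the following per-utterance sampling logic (a sketch; the mixing, reverberation, and target-construction utilities referenced in the comments are hypothetical placeholders, not part of the baseline description).

```python
import random

NOISE_PROB = 0.8              # noise augmentation applied 80% of the time
REVERB_PROB = 0.5             # reverberation augmentation applied 50% of the time
SNR_RANGE_DB = (-5.0, 30.0)   # SNR sampled uniformly in [-5, 30] dB


def sample_augmentation():
    """Draw the augmentation configuration for one training utterance."""
    cfg = {
        "apply_noise": random.random() < NOISE_PROB,
        "apply_reverb": random.random() < REVERB_PROB,
    }
    if cfg["apply_noise"]:
        cfg["snr_db"] = random.uniform(*SNR_RANGE_DB)
    return cfg

# For each utterance: if apply_reverb, convolve the input with a room impulse
# response and build the clean target from the direct path plus early reflections;
# if apply_noise, mix noise into the input at the sampled SNR.
```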
Latency Calculation Guidelines
We provide definitions of algorithmic and buffering latency as used in the LRAC challenge and give examples of both as they typically occur in audio codecs. Note that the exact amount of latency may depend on many implementation factors, and it is difficult to cover all possible scenarios. Please use the explanations below as a guideline and, in case of any doubt, do not hesitate to reach out to the challenge organizers.
The total end-to-end latency of a speech communication system is the sum of two components: buffering latency and algorithmic latency. In this challenge, we are ignoring any latency due to processing.
- Buffering latency is the delay introduced by intentionally or unavoidably holding data before processing, typically due to frame-wise or blockwise processing. It reflects the time spent collecting enough input before a processing stage can start.
- Algorithmic latency is the structural, unavoidable delay dictated by the algorithm’s design, even if processing were instantaneous. It is usually determined by lookahead requirements in the model architecture and excludes any extra buffering delay.
Buffering latency examples
- STFT front-end: In frame-based processing, the buffering latency is determined by the hop size between frames.
- Fully convolutional encoder with strided convolutions: The buffering latency (in samples) is equal to the product of all strides across layers.
- STFT + U-Net encoder
- Suppose the STFT has a hop size of 10 ms.
- The U-Net has 2 downsampling layers (factor = 2 each) and 2 upsampling layers (factor = 2 each) with skip connections.
- The U-Net processes inputs in blocks of 4 frames, so it adds 3 additional frames of buffering = 30 ms.
- Total buffering latency: 10 ms (STFT) + 30 ms (U-Net) = 40 ms.
- Cascaded systems: If an enhancement network with a buffering latency of 20 ms is followed by a codec with a buffering latency of 10 ms, and both take and output raw audio, the overall buffering latency is the maximum (not the sum) of the two buffering latencies, i.e., 20 ms in this case, because both stages can operate in parallel once enough data is available for each. However, if the buffering latencies of the two systems are not multiples of each other, the buffering latency of the overall system is variable and can be almost as high as the sum of the two buffering latencies. In such a case, the reported buffering latency should be the maximum buffering latency that could be observed. (A short calculation sketch illustrating these rules follows below.)
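The buffering-latency rules above can be summarized with a few illustrative Python helpers (these are not challenge-provided tools):

```python
def conv_encoder_buffering_ms(strides, sample_rate_hz):
    """Fully convolutional encoder: buffering = product of strides, converted to time."""
    samples = 1
    for s in strides:
        samples *= s
    return 1000.0 * samples / sample_rate_hz


def cascade_buffering_ms(stage_latencies_ms):
    """Cascade of raw-audio-in / raw-audio-out stages: the stages run in parallel once
    each has enough input, so (when the latencies are multiples of one another) the
    overall buffering latency is the maximum stage latency, not the sum."""
    return max(stage_latencies_ms)


if __name__ == "__main__":
    print(conv_encoder_buffering_ms([3, 4, 4, 5], 24000))  # Track 1 encoder: 10.0 ms
    print(cascade_buffering_ms([20.0, 10.0]))              # cascaded example: 20.0 ms
    print(10.0 + (4 - 1) * 10.0)                           # STFT + U-Net example: 40.0 ms
```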
Algorithmic latency examples
- Center-aligned convolution: A kernel size of 5 with dilation = 1 adds 2 samples of delay (half the kernel, excluding the current sample).
- Causal convolution: A right-aligned convolution with (kernel size − 1) × dilation samples of left-sided padding has no algorithmic latency.
- Inverse STFT processing: Inverse STFT processing generates an algorithmic delay equal to the difference between the window size and the hop size. An ISTFT with a window size of 30 ms and a hop size of 10 ms generates 20 ms of algorithmic latency due to the overlap-add process.
- Streaming multi-head self-attention: If each input embedding can attend to P past frames and L future frames, the algorithmic latency is determined by the future lookahead L. For instance, if L = 3 frames and each frame represents 10 ms, the latency is 30 ms.
- Cascaded systems: If an enhancement network is followed by a codec, and both take and output raw audio, the overall algorithmic latency is the sum of the algorithmic latencies of the two systems. (A calculation sketch covering these algorithmic-latency rules follows the examples below.)
- Decoder with transposed convolutions – no overlap case
- Assume that the overall stride of the encoder is 240 samples.
- Assume that the decoder only performs upsampling with transposed convolutions and that the kernel size of each transposed convolution layer is equal to its stride. Furthermore, the overall stride of the decoder (i.e., the product of the strides of all transposed convolution layers) is equal to the stride of the encoder, which is 240 samples.
- In this case, the decoder generates 240 samples for every input embedding, without overlap.
- There is no algorithmic latency due to transposed convolution layers. The algorithmic latency of the decoder comes from the algorithmic latency of other layers, if any.
- Decoder with transposed convolutions – overlapping case
- Assume that the overall stride of the encoder is 240 samples.
- Assume that the decoder only performs upsampling with transposed convolutions.
- Assume that there are 5 transposed convolution layers with strides 5, 4, 3, 2, 2, giving an overall stride of 240 samples.
- Assume that the kernel size of each transposed convolution layer is double the stride of that layer, causing the decoder to generate more than 240 samples per input embedding, with overlap. Denote the number of samples generated by L, where L > 240; the exact value of L depends on the other hyperparameters of the transposed convolution layers. In this case, the transposed convolutions apply overlap-add implicitly.
- If the alignment of the input and output audio and the padding of the transposed convolution layers during training are adjusted such that the first 240 samples of the L output samples correspond to the current input embedding, then these can be flushed immediately as output, and the transposed convolutions introduce no additional algorithmic latency. In this case, the decoder generates the 240 samples for the current frame and makes a prediction for the following frames. In offline processing of an utterance, such a system should discard the final L − 240 excess samples at the end of the output audio. An example of this case is depicted in Figure 1 with L = 360.
Figure 1: Decoder using transposed convolutions with overlapping output frames, where the excess samples represent a prediction for future frames. In this case, the upsampling operations do not add algorithmic latency to the system.
- However, if the alignment of the input and output audio, as well as the padding of the transposed convolution layers during training, is adjusted such that the last 240 samples of the L output samples correspond to the current input embedding, then we must wait for the next frames to finalize the overlap-add procedure for the current frame. This introduces additional algorithmic latency of L − 240 samples. An example of this case is depicted in Figure 2 with L = 360. Note that we subtract the encoder's overall hop size of 240 samples here, similar to the inverse STFT example above, because this portion is already accounted for in the buffering latency of the encoder.
Figure 2: Decoder using transposed convolutions with overlapping output frames, where the excess samples represent a prediction for past frames. In this case, the upsampling operations add algorithmic latency to the system.
- Inverse STFT as the last layer of a decoder network: When an inverse STFT is used as the last layer of a decoder network, as in [3], it can be considered a transposed convolution layer with fixed, data-independent weights, and the same alignment considerations as in the overlapping transposed-convolution example above apply. Also see [4] for methods to reduce the algorithmic latency of an encoder-decoder DNN architecture with an STFT front-end and ISTFT upsampling.
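The algorithmic-latency rules from these examples can likewise be written down as small illustrative helpers (again a sketch, not challenge-provided tooling):

```python
def centered_conv_latency_samples(kernel_size, dilation=1):
    """Center-aligned convolution: lookahead of half the kernel, excluding the
    current sample. A kernel of 5 with dilation 1 gives 2 samples."""
    return ((kernel_size - 1) // 2) * dilation


def causal_conv_latency_samples(kernel_size, dilation=1):
    """Right-aligned convolution with (kernel_size - 1) * dilation left padding:
    no lookahead, hence zero algorithmic latency."""
    return 0


def istft_latency_ms(window_ms, hop_ms):
    """Overlap-add in the inverse STFT: window minus hop (30/10 -> 20 ms)."""
    return window_ms - hop_ms


def attention_lookahead_latency_ms(future_frames, frame_ms):
    """Streaming self-attention: only the future context matters (3 * 10 -> 30 ms)."""
    return future_frames * frame_ms


def overlapping_transposed_conv_extra_latency(samples_per_frame_L, encoder_hop):
    """Overlapping transposed-convolution decoder whose output is aligned so that
    the *last* `encoder_hop` samples belong to the current frame: the extra
    algorithmic latency is L minus the encoder hop (e.g. 360 - 240 = 120 samples).
    With the first-`encoder_hop`-samples alignment, the extra latency is zero."""
    return samples_per_frame_L - encoder_hop


if __name__ == "__main__":
    print(centered_conv_latency_samples(5))                     # 2 samples
    print(causal_conv_latency_samples(5))                       # 0 samples
    print(istft_latency_ms(30, 10))                             # 20 ms
    print(attention_lookahead_latency_ms(3, 10))                # 30 ms
    print(20 + 10)                                              # cascaded systems add up
    print(overlapping_transposed_conv_extra_latency(360, 240))  # 120 samples
```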
References
- [1] N. Zeghidour, A. Luebs, A. Omran, J. Skoglund and M. Tagliasacchi, “SoundStream: An End-to-End Neural Audio Codec,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 30, pp. 495-507, 2022, doi: 10.1109/TASLP.2021.3129994.
- [2] A. Défossez, J. Copet, G. Synnaeve and Y. Adi, “High Fidelity Neural Audio Compression,” Transactions on Machine Learning Research, 2023.
- [3] H. Siuzdak, “Vocos: Closing the Gap Between Time-Domain and Fourier-Based Neural Vocoders for High-Quality Audio Synthesis,” arXiv preprint arXiv:2306.00814, 2023.
- [4] Z.-Q. Wang, G. Wichern, S. Watanabe, and J. Le Roux, “STFT-Domain Neural Speech Enhancement with Very Low Algorithmic Latency,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 397-410, 2022.