Frequently Asked Questions
Last Revised: August 27, 2025
Table of Contents
- Dataset Access & Filtering (Curation)
- Team Related (Team Size, Team Overlap)
- Challenge Phases & Evaluation of Submissions
- Submission Limits for Tracks and Papers
- Solution Types
- Evaluation of Computational Complexity & Latency
Dataset Access & Filtering (Curation)
Q: How do I access the training datasets for the LRAC Challenge?
A: The list of training speech, noise, and reverb datasets is provided on the challenge website: https://lrac.short.gy/datasets.
The scripts for downloading the datasets, filtering them (selecting a subset of files based on curation), and generating the train-validation splits used to create the official challenge baseline models are available in the data generation repository: https://lrac.short.gy/data-gen-repo. For more information, please refer to the README.md file.
Q: Could you share details of the dataset curation?
A: Details of the dataset filtering and balancing are provided in the Datasets section of the challenge website: https://lrac.short.gy/datasets.
Q: I am looking to work with the full (non-filtered) datasets – where can I find these?
A: If you plan to work with the full (non-filtered) set of files, these will also be downloaded when you run the prepare_espnet_data.sh script located in the data generation repository: https://lrac.short.gy/data-gen-repo.
However, please note that the full (non-filtered) data will be saved “as downloaded” from the original sources, that is, with no postprocessing (such as resampling) applied. The prepare_espnet_data.sh script applies postprocessing only to the subset of files used to train the official challenge baseline.
Team Related (Team Size, Team Overlap)
Q: Is there a limit to the number of members per team?
A: No, currently there is no formal restriction on the size of each team (within reason).
Q: Can I participate on more than one team?
A: Possibly. Requests for team member overlap across teams will be considered by the organizers on a case-by-case basis.
Challenge Phases & Evaluation of Submissions
Q: What are the challenge phases and how will they be evaluated?
A: The challenge has two distinct phases, as outlined on the leaderboard website and in the ENTRY PERIOD section of the Official Rules (https://lrac.short.gy/rules-entry-period): (i) the development phase and (ii) the test phase. The timing of each phase is also outlined there.
- The development phase is based on an open test set and objective metrics, and is intended as a guide for participants during solution development. Note that (a) the existing objective metrics may have limitations, so it may not be advisable to rely on them exclusively during development, and (b) there will inevitably be differences between the open test set used in the development phase and the blind test set used in the test phase. For the latter, the organizers will provide examples on the leaderboard website to participants with track enrollments.
- The test phase is based on a blind (withheld) test set. The evaluation will be based solely on crowdsourced listening tests. No objective metrics will be reported for the test phase.
Further details are given in the Evaluation section of the challenge website: http://lrac.short.gy/evaluation and in the Official Rules: http://lrac.short.gy/rules.
Q: When will the leaderboard results be publicly available?
A: The leaderboard results (for both development and test phases of the challenge) will become publicly available on October 15, after the public announcement of the test phase results. Until then, the leaderboard will only be visible to participants who are enrolled in the corresponding tracks on the leaderboard portal.
Q: Are all submission results automatically published on the leaderboard?
A: No, they are not. During the development phase, participants can choose which open test set submission result, if any, to display on the leaderboard. They can also remove their entry from the leaderboard if they wish. However, the latest valid blind test set submission made during the test phase will be considered as the final submission, and its crowdsourced listening test results will be published on the final leaderboard.
Q: In Track 1 of the challenge, if the input speech contains slight reverberation or noise, should our model preserve this reverberation or noise at the decoder output? Or, is some mild processing (denoising/dereverberation) acceptable?
A: The primary objective in Track 1 is to preserve the input audio as faithfully as possible at the output, including under real-world conditions with mild noise and/or reverberation. That said, the primary focus of the evaluation is on avoiding any noticeable degradation of speech quality or intelligibility, meaning that artefacts or reduced clarity introduced by the codec should be minimized. While the ideal is complete transparency (i.e., the output matches the input, including any original noise or reverberation), some solutions may incidentally perform mild denoising or dereverberation. In the crowdsourced evaluation, such effects will neither be specifically rewarded nor penalized, as long as the output does not diminish speech quality. On the other hand, introducing additional noise or reverberation that is not present in the original input will likely be perceived as degradation by listeners. Specifically, the evaluations will use D-MOS (Degradation Mean Opinion Score) to compare your model’s output against the original audio, which itself includes real-world mild noise and reverberation.
Submission Limits for Tracks and Papers
Q: How many evaluation submissions are allowed?
A: For the development phase of the challenge, the up-to-date stimuli submission limits are shown to participants registered on the leaderboard portal. The current limit for the development phase is 3 submissions per track per team per day; this limit may change. Each development phase submission will undergo automatic evaluation using objective metrics only. Participants will be able to select which evaluation results to publish on the leaderboard.
For the test phase, the limit shown in the leaderboard portal is 10 submissions per day. The submission quota can be used to make revisions; note that only the latest revision will actually undergo evaluation using crowdsourced listening tests. However, we may consider allowing up to two test phase submissions per track per team; we plan to make this determination by the end of the registration period and will advise further details then.
Q: If a team participates in both Track 1 and Track 2, how many challenge papers can be submitted?
A: Teams may submit either a single paper covering both tracks or separate papers for each track. This applies to both the mandatory, non-peer-reviewed system description papers and the optional, peer-reviewed workshop papers.
The detailed system description papers are required with the final (test phase) submission (minimum 2 pages plus references) and will not be included in the IEEE ICASSPW proceedings. The optional workshop papers (4 pages plus references) will undergo a peer-review process, and accepted papers will be included in the IEEE ICASSPW proceedings (details will be advised shortly).
Solution Types
Q: Are hybrid solutions combining neural networks with traditional coding or post-processing technologies allowed?
A: Yes. Standalone neural solutions, hybrid solutions (e.g., combining neural networks with traditional coding or post-processing), and other solutions permitted in the Official Rules (https://lrac.short.gy/rules) are welcomed and encouraged. Please note that the computational complexity, latency, and bitrate constraints apply to the overall system, regardless of its components. Teams are responsible for independently ensuring their solutions meet the constraints outlined in the Official Rules.
Q: Could the submitted codec have separate bitrate-specific encoders and codebooks in the case of a codebook-based quantizer?
A: In typical real-time communication systems, the bitrate must be adjustable in the cloud depending on the network conditions on the receive side. So, during a single call, the bitrate might need to drop from 6 kbps to 1 kbps and might return to 6 kbps at any time.
Since we require a mixture of modes (1 and 6 kbps) within the same inference run, that is, within the same call/utterance, you should not assume that separate systems for 1 and 6 kbps are acceptable.
If there are separate encoders and/or codebooks for 1 and 6 kbps, then to switch from 6 to 1 kbps in the cloud or on the receive side, both streams might need to be sent on the transmit side, which increases the bandwidth to 7 kbps. Unless participants come up with a solution that avoids this bandwidth increase, merely having separate codebooks per bitrate will not comply with the rules, even if a single decoder works with both codebooks.
Note that one might have two encoders running in parallel: one generating embeddings for the codewords up to 1 kbps, and the other generating embeddings for the codewords between 1 and 6 kbps (effectively a 5 kbps bitrate). We consider these de facto blocks of a single system, and such a system would be allowed.
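To make the single-system point concrete, below is a minimal, hypothetical sketch of such an embedded layout (written here in PyTorch purely for illustration; the challenge does not mandate any framework). A base branch produces the codes for the 1 kbps core, an enhancement branch produces the additional codes up to 6 kbps, and the enhancement stream is simply dropped per frame to switch bitrate mid-utterance. All module names and dimensions are made up, and quantization is omitted.

```python
# Hypothetical sketch only: an embedded (base + enhancement) layout treated as one system.
# Dimensions, names, and the absence of quantization are illustrative, not prescriptive.
import torch
import torch.nn as nn

class EmbeddedCodecSketch(nn.Module):
    def __init__(self, feat_dim=128, base_dim=16, enh_dim=80):
        super().__init__()
        # Both branches run on the transmit side and count as blocks of a single system.
        self.base_head = nn.Linear(feat_dim, base_dim)  # codes for the 1 kbps core
        self.enh_head = nn.Linear(feat_dim, enh_dim)    # additional codes for 1 -> 6 kbps
        self.decoder = nn.Linear(base_dim + enh_dim, feat_dim)

    def forward(self, feats, high_bitrate_mask):
        # feats: (batch, frames, feat_dim); high_bitrate_mask: (batch, frames) bool,
        # True where the 6 kbps mode is active for that frame.
        base = self.base_head(feats)
        enh = self.enh_head(feats) * high_bitrate_mask.unsqueeze(-1).float()  # dropped at 1 kbps
        return self.decoder(torch.cat([base, enh], dim=-1))

# The bitrate can change mid-utterance without swapping models:
model = EmbeddedCodecSketch()
feats = torch.randn(1, 10, 128)
mask = torch.tensor([[1, 1, 1, 0, 0, 0, 1, 1, 0, 1]], dtype=torch.bool)  # per-frame 6/1 kbps
out = model(feats, mask)
```

The only point of the sketch is that both branches form one transmit-side system, and that dropping the enhancement stream requires no second encoded stream on the wire.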
Q: Could the decoder have bitrate-specific modules that are activated only for a particular bitrate?
A: Activating different parts/modules of the same decoder is allowed, as long as this does not prevent the system from changing the bitrate in the middle of an utterance/call. Bitrate-specific fully connected layers are easier in this respect; with bitrate-specific layers that carry history/memory/state, the network might have issues when switching from one bitrate to another. The complexity of the decoder should be reported as the maximum complexity that may occur in any inference run.
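As a companion illustration, here is a hypothetical decoder block in which stateless, bitrate-specific fully connected projections are selected per frame while the only stateful module (a GRU) is shared across modes; all names and sizes are invented for this sketch.

```python
# Hypothetical sketch: per-frame selection of bitrate-specific, stateless projections.
# The shared GRU keeps the only state, so switching bitrate mid-utterance is unproblematic.
import torch
import torch.nn as nn

class BitrateAwareDecoderBlock(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.proj_1kbps = nn.Linear(dim, dim)           # used in 1 kbps frames
        self.proj_6kbps = nn.Linear(dim, dim)           # used in 6 kbps frames
        self.core = nn.GRU(dim, dim, batch_first=True)  # shared state across modes

    def forward(self, x, high_bitrate_mask):
        # x: (batch, frames, dim); high_bitrate_mask: (batch, frames) bool
        m = high_bitrate_mask.unsqueeze(-1).float()
        x = m * self.proj_6kbps(x) + (1.0 - m) * self.proj_1kbps(x)
        y, _ = self.core(x)
        return y
```

For reporting, count the more expensive branch per frame (here both projections are the same size), consistent with reporting the maximum complexity that may occur in any inference run.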
Q: Could you provide guidelines regarding latency measurements, for both buffering and algorithmic latency?
A: We have added a Latency Calculation Guidelines section to the Baseline page. Please refer to that section and let us know if you need additional clarification.
Evaluation of Computational Complexity & Latency
Q: How should computational complexity and latency be reported? Will a tool be provided?
A: Teams are responsible for independently assessing and reporting the computational complexity, latency, and bitrate of their solutions in the required system description paper. All reporting is on an honor basis.
The organizers may (to be determined) provide a latency assessment tool to assist teams during their solution development, but any such tool should be considered a guide only and not relied upon as the sole method for compliance. A tool for assessing the computational complexity of a solution will not be provided by the organizers.
The organizers will publish a detailed system report for the baseline systems in the coming weeks.
Q: How should computational complexity be calculated?
A: Computational complexity should be measured as the total number of multiply–accumulate (MAC) operations, including those arising from matrix–matrix multiplications, matrix–vector multiplications, and summations. A fused MAC operation is assumed to be natively supported in both software and hardware. The resulting complexity should first be computed in MMAC/s, and subsequently scaled by a factor of 2 to provide an approximate measure in MFLOPS.
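As a worked example of this convention, with purely hypothetical layer sizes, consider a single dense layer with 256 inputs and 256 outputs evaluated 50 times per second:

```python
# Hypothetical worked example of the MAC -> MMAC/s -> MFLOPS convention described above.
in_dim, out_dim = 256, 256
frames_per_second = 50

macs_per_frame = in_dim * out_dim               # one MAC per weight: 65,536
macs_per_second = macs_per_frame * frames_per_second
mmacs_per_second = macs_per_second / 1e6        # ~3.28 MMAC/s
approx_mflops = 2 * mmacs_per_second            # ~6.55 MFLOPS (1 MAC ~ 2 FLOPs)

print(f"{mmacs_per_second:.2f} MMAC/s ~= {approx_mflops:.2f} MFLOPS")
```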
Q: Does the LRAC challenge have an official standard tool or designated framework for calculating model complexity (a specific version of ptflops, calflops, or other tools)?
A: No, the LRAC challenge does not mandate a specific tool for computing model complexity. Many existing tools (such as ptflops or calflops) are tied to a particular framework (e.g., PyTorch), and we do not want to restrict participants to any single framework.
We recommend calculating complexity analytically using layer design formulas (based on multiply-accumulate operations, or MACs), and using FLOP-counting tools only as a sanity check. Keep in mind that these tools may sometimes overestimate complexity. For example, a causal attention mechanism implemented via masked probabilities may be reported as more expensive than its true algorithmic complexity.
For the purposes of this challenge, the computational cost of nonlinear activations can be ignored. This is for two reasons:
- Nonlinearities can often be approximated by cheaper functions or lookup tables without significant performance loss, so we do not want to encourage premature optimization in this area.
- The actual runtime cost of nonlinear activations is highly implementation- and hardware-dependent, whereas multiply-add operations dominate the computational complexity of most neural networks.
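To illustrate the analytic approach recommended above, the snippet below gives per-frame MAC formulas for a few common layer types. They follow the standard layer definitions, ignore biases and nonlinear activations (consistent with the guidance above), and are intended only as a sanity check against FLOP-counting tools.

```python
# Hypothetical helper formulas for counting MACs analytically, per output frame.

def linear_macs(in_features: int, out_features: int) -> int:
    # Dense layer: one MAC per weight.
    return in_features * out_features

def conv1d_macs(in_ch: int, out_ch: int, kernel: int, out_steps: int, groups: int = 1) -> int:
    # 1-D convolution evaluated over `out_steps` output time steps.
    return (in_ch // groups) * out_ch * kernel * out_steps

def gru_macs(in_features: int, hidden: int) -> int:
    # GRU cell: 3 gates, each with an input and a recurrent matrix multiply.
    return 3 * (in_features * hidden + hidden * hidden)

# Example: a causal 1-D conv with 64 -> 64 channels, kernel 3, one output step per frame.
print(conv1d_macs(64, 64, 3, 1))  # 12288 MACs per frame
```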
If you have further questions, please reach out to us: lrac-challenge@cisco.com