Omni-AVSR: Towards Unified Multimodal Speech Recognition with Large Language Models Paper • 2511.07253 • Published about 1 month ago • 2 • 2
Mitigating Attention Sinks and Massive Activations in Audio-Visual Speech Recognition with LLMs Paper • 2510.22603 • Published Oct 26 • 2 • 1
MoME: Mixture of Matryoshka Experts for Audio-Visual Speech Recognition Paper • 2510.04136 • Published Oct 5 • 3 • 2
Scaling and Enhancing LLM-based AVSR: A Sparse Mixture of Projectors Approach Paper • 2505.14336 • Published May 20 • 3 • 2
Adaptive Audio-Visual Speech Recognition via Matryoshka-Based Multimodal LLMs Paper • 2503.06362 • Published Mar 9 • 3 • 2