Abstract: Audio-visual speech recognition (AVSR) aims to enhance the robustness of automatic speech recognition (ASR) systems by incorporating visual information from lip movements, especially in ...