Visual Speech Recognition Using Lip Movement for Deaf People Using Deep Learning

  • Unique Paper ID: 155001
  • ISSN: 2349-6002
  • Volume: 8
  • Issue: 12
  • PageNo: 998-1002
  • Abstract:
  • There has been growing interest in building automatic lip-reading (ALR) systems. As in other computer vision applications, methods based on Deep Learning (DL) have gained popularity and enabled significant improvements in performance. The audio-visual speech recognition approach aims to improve noise robustness in mobile settings by extracting lip movement from side-face images. Although most earlier bimodal speech recognition algorithms used frontal face (lip) images, such approaches are inconvenient for users because they must speak while holding a device with a camera in front of their face. Our proposed solution, which uses a small camera mounted in a phone to capture lip movement, is more natural, simple, and convenient. This approach also effectively prevents a reduction in the signal-to-noise ratio (SNR) of the input speech. Optical-flow analysis extracts visual features, which are then combined with audio features for DCNN-based recognition. In this paper, we employ a DCNN for audio-visual speech recognition; specifically, we leverage deep learning over audio and visual features for noise-robust speech recognition. In the experimental analysis, we achieved around 90% accuracy on real-time test data, which is higher than that of a traditional deep learning algorithm.
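The abstract describes the pipeline at a high level: dense optical flow is computed over lip-region frames to obtain visual features, which are then fused with audio features in a DCNN classifier. The sketch below is a minimal illustration of that idea, assuming OpenCV for the optical flow and TensorFlow/Keras for a two-stream fusion network; all function names, input shapes, and layer sizes are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: optical-flow lip features fused with audio features
# in a small two-stream DCNN. Library choices (OpenCV, TensorFlow/Keras) and
# all shapes/sizes are assumptions, not the paper's actual implementation.
import numpy as np
import cv2
import tensorflow as tf

def lip_optical_flow(frames):
    """Dense optical flow between consecutive grayscale lip-ROI frames.

    frames: uint8 array of shape (T, H, W).
    Returns an array of shape (T - 1, H, W, 2) holding (dx, dy) flow fields.
    """
    flows = []
    for prev, nxt in zip(frames[:-1], frames[1:]):
        # Farneback dense optical flow (positional args: pyr_scale, levels,
        # winsize, iterations, poly_n, poly_sigma, flags)
        flow = cv2.calcOpticalFlowFarneback(prev, nxt, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        flows.append(flow)
    return np.stack(flows)

def build_av_dcnn(visual_shape=(32, 64, 64, 2), audio_shape=(100, 40), n_classes=10):
    """Two-stream DCNN: a 3D-conv stream over optical-flow volumes and a
    1D-conv stream over audio features (e.g. MFCCs), fused before the classifier."""
    visual_in = tf.keras.Input(shape=visual_shape)   # (time, H, W, flow channels)
    v = tf.keras.layers.Conv3D(16, 3, activation="relu")(visual_in)
    v = tf.keras.layers.MaxPooling3D(2)(v)
    v = tf.keras.layers.Conv3D(32, 3, activation="relu")(v)
    v = tf.keras.layers.GlobalAveragePooling3D()(v)

    audio_in = tf.keras.Input(shape=audio_shape)     # (frames, coefficients)
    a = tf.keras.layers.Conv1D(32, 5, activation="relu")(audio_in)
    a = tf.keras.layers.MaxPooling1D(2)(a)
    a = tf.keras.layers.Conv1D(64, 5, activation="relu")(a)
    a = tf.keras.layers.GlobalAveragePooling1D()(a)

    # Feature-level fusion of the visual and audio streams
    fused = tf.keras.layers.Concatenate()([v, a])
    fused = tf.keras.layers.Dense(128, activation="relu")(fused)
    out = tf.keras.layers.Dense(n_classes, activation="softmax")(fused)
    return tf.keras.Model([visual_in, audio_in], out)

model = build_av_dcnn()
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

The fusion point (concatenating pooled visual and audio embeddings before a shared dense layer) is one simple design choice; the paper's DCNN may fuse the modalities at a different depth.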
