Visual Speech Recognition Using Lip Movement for Deaf People Based on Deep Learning
Author(s):
G. J. Navale, Riya Satija, Mahima Oswal, Bhavesh Dalal, Niket Patil
Keywords:
Deep Learning, Image Processing, Classification, Feature Extraction, Feature Selection
Abstract
There has been growing interest in automatic lip-reading (ALR) systems. As in other computer vision applications, methods based on Deep Learning (DL) have gained popularity and enabled significant improvements in performance. The audio-visual speech recognition approach aims to improve noise robustness in mobile settings by extracting lip movement from side-face images. While most earlier bimodal speech recognition algorithms used frontal face (lip) images, such methods are inconvenient in practice because they require the user to speak while holding a camera-equipped device in front of their face. Our proposed solution, which captures lip movement with a small camera mounted on a phone, is more natural, simple, and convenient. This approach also effectively avoids a reduction in the signal-to-noise ratio (SNR) of the input speech. Optical-flow analysis extracts the visual features, which are then combined with the audio features in a DCNN-based recognizer. In this paper we employ a deep convolutional neural network (DCNN) for audio-visual speech recognition; specifically, we apply deep learning to audio and visual features for noise-robust speech recognition. In the experimental analysis we achieved around 90% accuracy on real-time test data, higher than that of a traditional deep learning baseline.
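To make the described pipeline concrete, the sketch below illustrates one plausible reading of it: Farneback dense optical flow over consecutive mouth-region frames as the visual feature extractor, fused with per-frame audio features (e.g., MFCCs) in a small two-branch convolutional network. The frame sizes, feature dimensions, and network layout are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch (assumptions noted above): optical-flow lip-motion features
# fused with audio features in a two-branch convolutional network.
import numpy as np
import cv2
import tensorflow as tf

def lip_motion_features(frames):
    """Dense optical flow (Farneback) between consecutive grayscale
    mouth-region frames; returns per-pair mean horizontal/vertical motion."""
    feats = []
    for prev, nxt in zip(frames, frames[1:]):
        flow = cv2.calcOpticalFlowFarneback(
            prev, nxt, None,
            pyr_scale=0.5, levels=3, winsize=15,
            iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
        feats.append([flow[..., 0].mean(), flow[..., 1].mean()])
    return np.asarray(feats, dtype=np.float32)

def build_av_model(n_frames=24, n_audio=13, n_classes=10):
    """Hypothetical audio-visual DCNN: one 1-D conv branch per modality,
    concatenated before a softmax classifier head."""
    visual_in = tf.keras.Input(shape=(n_frames, 2))       # optical-flow features
    audio_in = tf.keras.Input(shape=(n_frames, n_audio))  # e.g. MFCCs per frame
    v = tf.keras.layers.Conv1D(32, 3, activation="relu")(visual_in)
    v = tf.keras.layers.GlobalAveragePooling1D()(v)
    a = tf.keras.layers.Conv1D(32, 3, activation="relu")(audio_in)
    a = tf.keras.layers.GlobalAveragePooling1D()(a)
    x = tf.keras.layers.concatenate([v, a])
    out = tf.keras.layers.Dense(n_classes, activation="softmax")(x)
    model = tf.keras.Model([visual_in, audio_in], out)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    return model

if __name__ == "__main__":
    # Synthetic stand-ins for 25 captured mouth-region frames (64x64 grayscale).
    frames = [np.random.randint(0, 255, (64, 64), np.uint8) for _ in range(25)]
    print(lip_motion_features(frames).shape)  # (24, 2): one feature pair per frame pair
    build_av_model().summary()
```

In a real system the mouth region would first be localized (e.g., with a face/landmark detector) and the audio features time-aligned to the video frames; both steps are omitted here for brevity.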
Article Details
Unique Paper ID: 155001
Publication Volume & Issue: Volume 8, Issue 12
Page(s): 998 - 1002