Copyright © 2025. The authors retain the copyright of this article. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
@article{175924,
  author   = {Bhumireddi Harish and T. Anusha and Gandreti Sneha and Botta Vasavi Anusha and Kolipaka Keerthana and Gollamala Saiteja},
  title    = {Vision to Voice Object Detection with Real-Time Audio Assistance},
  journal  = {International Journal of Innovative Research in Technology},
  year     = {2025},
  month    = {April},
  volume   = {11},
  number   = {11},
  pages    = {5039--5046},
  issn     = {2349-6002},
  url      = {https://ijirt.org/article?manuscript=175924},
  abstract = {Vision to Voice uses the YOLOv8 algorithm for object detection to provide real-time auditory assistance to blind users by describing their surroundings in spoken form. The system enhances accessibility and inclusion through vocalized environmental cues, smart navigation, and real-time image processing, giving visually impaired people audio guidance and speech feedback in their daily lives. By improving contextual awareness through deep learning and image localization, it helps users navigate complex environments with confidence. This paper discusses the architecture and features of YOLOv8 and its improvements over previous versions. YOLOv8 combines a next-generation backbone for effective feature extraction, a refined neck for better object localization, and anchor-free detection for better performance and flexibility. State-of-the-art augmentations such as mosaic augmentation, together with adaptive training strategies, greatly improve robustness and generalization across various datasets. YOLOv8's PyTorch implementation increases portability and allows the code to be customized for deployment on other platforms such as edge devices. Experimental results demonstrate the model's efficacy in real-world applications such as assistive technologies, autonomous navigation, video surveillance, industrial automation, and healthcare.},
  keywords = {Vision to Voice, YOLOv8, Real-time Voice Assistance, Blinded Assistance, AI Navigation, Accessibility, Image Localization, Data Augmentation, PyTorch, Edge Devices, Auditory Feedback, Smart Navigation, Personalized Audio Guidance}
}