OCR using YOLOV3 and tesseract for text extraction

  • Unique Paper ID: 168620
  • PageNo: 1371-1375
  • Abstract:
  • This paper proposes a model for robust Optical Character Recognition (OCR) that couples the YOLOv3 object detection model with the Tesseract OCR engine. YOLOv3 detects and localizes text regions in images, as it has been found to process complex scenes quickly and with high accuracy. Each detected region is then passed to Tesseract for text extraction. Combining the two methods increases text extraction accuracy in challenging conditions, including varying font sizes, orientations, and backgrounds. The proposed system is evaluated on a heterogeneous dataset and shows improvements in both text recognition accuracy and processing efficiency over traditional OCR methods. The results are promising, indicating that the combination of YOLOv3 and Tesseract offers an effective, precise, and fast solution for extracting text from images.
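
The two-stage flow described in the abstract (detect text regions, then recognize each region) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `extract_text`, `detect`, and `recognize` are hypothetical, the detector and recognizer are passed in as plain callables, the image is a nested list of pixel rows, and the thresholds (0.5 score, 0.4 overlap) are illustrative values that do not come from the paper.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]      # (x1, y1, x2, y2) in pixel coordinates
Detection = Tuple[Box, float]        # (box, confidence score)
Image = Sequence[Sequence[int]]      # grayscale image as rows of pixel values


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(dets: Sequence[Detection],
                        score_thr: float = 0.5,
                        iou_thr: float = 0.4) -> List[Box]:
    """Keep the highest-scoring box among heavily overlapping detections."""
    kept: List[Box] = []
    for box, score in sorted(dets, key=lambda d: d[1], reverse=True):
        if score < score_thr:
            break  # remaining detections score even lower
        if all(iou(box, k) <= iou_thr for k in kept):
            kept.append(box)
    return kept


def crop(image: Image, box: Box) -> List[List[int]]:
    """Cut the region covered by `box` out of the image."""
    x1, y1, x2, y2 = box
    return [list(row[x1:x2]) for row in image[y1:y2]]


def extract_text(image: Image,
                 detect: Callable[[Image], Sequence[Detection]],
                 recognize: Callable[[List[List[int]]], str]
                 ) -> List[Tuple[Box, str]]:
    """Stage 1: localize text regions. Stage 2: run OCR on each region."""
    boxes = non_max_suppression(detect(image))
    # Rough reading order: top-to-bottom, then left-to-right.
    boxes.sort(key=lambda b: (b[1], b[0]))
    return [(box, recognize(crop(image, box)).strip()) for box in boxes]
```

In a real setup, `detect` would presumably wrap a YOLOv3 network (for example, one loaded with OpenCV's `cv2.dnn.readNetFromDarknet`) and `recognize` would wrap a Tesseract call such as `pytesseract.image_to_string`. Keeping both as injected callables lets the cropping and filtering logic be checked without model weights.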

Copyright & License

Copyright © 2026. The authors retain the copyright of this article. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

BibTeX

@article{168620,
        author = {Vijiyalakshmi J and Sailesh R and Subiksha S and Yamini R and Naveenkumar K},
        title = {OCR using YOLOV3 and tesseract for text extraction},
        journal = {International Journal of Innovative Research in Technology},
        year = {2024},
        volume = {11},
        number = {5},
        pages = {1371--1375},
        issn = {2349-6002},
        url = {https://ijirt.org/article?manuscript=168620},
        abstract = {This paper proposes a model for robust Optical Character Recognition (OCR) that couples the YOLOv3 object detection model with the Tesseract OCR engine. YOLOv3 detects and localizes text regions in images, as it has been found to process complex scenes quickly and with high accuracy. Each detected region is then passed to Tesseract for text extraction. Combining the two methods increases text extraction accuracy in challenging conditions, including varying font sizes, orientations, and backgrounds. The proposed system is evaluated on a heterogeneous dataset and shows improvements in both text recognition accuracy and processing efficiency over traditional OCR methods. The results are promising, indicating that the combination of YOLOv3 and Tesseract offers an effective, precise, and fast solution for extracting text from images.},
        month = {October},
        }

Cite This Article

Vijiyalakshmi, J., Sailesh, R., Subiksha, S., Yamini, R., & Naveenkumar, K. (2024). OCR using YOLOV3 and tesseract for text extraction. International Journal of Innovative Research in Technology (IJIRT), 11(5), 1371–1375.
