Enhancing Machine Learning Methods for Robust Real-Time Text Classification of Bilingual Documents

  • Unique Paper ID: 183108
  • Volume: 12
  • Issue: 3
  • PageNo: 42-48
  • Abstract: The rapid growth of digital data has led to the widespread creation and storage of digital images containing text. Extracting and using this textual information can benefit many domains. Text detection in natural images is affected primarily by noise, blur, distortion, font variation, alignment, and orientation. Government forms, mark cards, medical records, business receipts, and other increasingly common bilingual documents substantially affect the precision of text detection and recognition. This paper emphasizes these challenges and proposes a solution that uses image enhancement techniques to improve an image's appearance and quality. Many digital documents are bilingual, so extracting information from them is challenging: a computer must read and interpret text written in multiple scripts, such as English and Kannada, within the same document. The proposed approach extracts text from documents using optical character recognition (OCR), classifies the text using natural language processing (NLP), applies super-resolution methods to synthesize a high-resolution (HR) image from several low-resolution (LR) images, and uses the EasyOCR machine learning models to detect and recognize text.
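
As an illustration of the bilingual aspect mentioned in the abstract, the following is a minimal sketch of how text that has already been recognized might be labeled as English, Kannada, or bilingual by counting characters per script. It is not the paper's method. The function names and the `mixed_threshold` parameter are hypothetical; the only external fact relied on is that the Unicode Kannada block spans U+0C80 to U+0CFF.

```python
from collections import Counter

# Unicode Kannada block (U+0C80..U+0CFF).
KANNADA_START, KANNADA_END = 0x0C80, 0x0CFF


def script_of(ch):
    """Return 'kannada', 'latin', or None for characters of neither script."""
    if KANNADA_START <= ord(ch) <= KANNADA_END:
        return "kannada"
    if ch.isascii() and ch.isalpha():
        return "latin"
    return None  # digits, punctuation, whitespace, other scripts


def classify_script(text, mixed_threshold=0.2):
    """Label text by its dominant script.

    mixed_threshold is an assumed cutoff: if the less frequent script
    accounts for at least this share of the counted characters, the
    text is labeled 'bilingual'.
    """
    counts = Counter(s for s in map(script_of, text) if s)
    total = sum(counts.values())
    if total == 0:
        return "unknown"
    minority = min(counts.get("kannada", 0), counts.get("latin", 0)) / total
    if minority >= mixed_threshold:
        return "bilingual"
    return counts.most_common(1)[0][0]
```

For example, `classify_script("Name ಹೆಸರು")` counts four Latin letters and five Kannada code points, so the minority share exceeds the assumed threshold and the text is labeled bilingual. In a real pipeline, a label like this could be used to decide which recognition or classification model to run on each region.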

Copyright & License

Copyright © 2025 Authors retain the copyright of this article. This article is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

BibTeX

@article{183108,
        author = {Santhosh SG and Sridhara Acharya P and Sampath Kumar S and Rechal},
        title = {Enhancing Machine Learning Methods for Robust Real-Time Text Classification of Bilingual Documents},
        journal = {International Journal of Innovative Research in Technology},
        year = {2025},
        volume = {12},
        number = {3},
        pages = {42--48},
        issn = {2349-6002},
        url = {https://ijirt.org/article?manuscript=183108},
        abstract = {The rapid growth of digital data has led to the widespread creation and storage of digital images containing text. Extracting and using this textual information can benefit many domains. Text detection in natural images is affected primarily by noise, blur, distortion, font variation, alignment, and orientation. Government forms, mark cards, medical records, business receipts, and other increasingly common bilingual documents substantially affect the precision of text detection and recognition. This paper emphasizes these challenges and proposes a solution that uses image enhancement techniques to improve an image's appearance and quality. Many digital documents are bilingual, so extracting information from them is challenging: a computer must read and interpret text written in multiple scripts, such as English and Kannada, within the same document. The proposed approach extracts text from documents using optical character recognition (OCR), classifies the text using natural language processing (NLP), applies super-resolution methods to synthesize a high-resolution (HR) image from several low-resolution (LR) images, and uses the EasyOCR machine learning models to detect and recognize text.},
        keywords = {Image Enhancement, Natural Language Processing (NLP), Optical Character Recognition (OCR), Real-Time Text Recognition},
        month = {July},
        }
