Hrutuja Tiple, Dr. Manisha Pise, Deepika Uike, Shivani Kurwane, Khushi Chintala
Text Summarization, Transformer, Language Processing, Abstractive Text Summarization.
The growing volume of digital text makes efficient comprehension and analysis difficult, particularly for lengthy documents such as PDF files. To address this, this work presents a Python script that uses the PyMuPDF library for PDF text extraction and the Hugging Face Transformers library for text summarization with the T5 model. The script runs from the command line and takes the path to a PDF file as input. It extracts the document text with PyMuPDF and preprocesses it by removing extraneous spaces and newlines and, optionally, the "References" section. The preprocessed text is then passed to a pre-trained T5 model, loaded through the Transformers library, which condenses it into a concise summary. The generation settings are tuned so that summaries stay short and readable while limiting information loss. The script handles exceptions raised during PDF processing or model use gracefully. It outputs both a snippet of the original text and the generated summary, helping users quickly grasp the document's essence.
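The pipeline described in the abstract can be sketched roughly as follows. This is an illustrative sketch, not the authors' script. The function names, the `t5-small` checkpoint, the 512-token input limit, the beam-search and length settings, and the 500-character snippet size are all assumptions chosen for the example. It assumes the `pymupdf`, `transformers`, `torch`, and `sentencepiece` packages are installed. The heavy libraries are imported inside the functions that need them so that the preprocessing step can be used on its own.

```python
import re
import sys


def extract_text(pdf_path):
    """Concatenate the text of every page using PyMuPDF."""
    import fitz  # PyMuPDF

    with fitz.open(pdf_path) as doc:
        return "\n".join(page.get_text() for page in doc)


def preprocess(text, drop_references=True):
    """Optionally cut the text at a 'References' heading, then collapse whitespace."""
    if drop_references:
        # Assumption: the section starts with the word "References" alone on a line.
        pattern = re.compile(r"^[ \t]*references[ \t]*$", re.IGNORECASE | re.MULTILINE)
        matches = list(pattern.finditer(text))
        if matches:
            # Cut at the last such heading, which is normally the bibliography.
            text = text[: matches[-1].start()]
    return re.sub(r"\s+", " ", text).strip()


def summarize(text, model_name="t5-small", max_length=150, min_length=40):
    """Summarize text with a pre-trained T5 checkpoint (settings are assumptions)."""
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    # T5 expects a task prefix; input beyond 512 tokens is truncated here.
    inputs = tokenizer(
        "summarize: " + text, return_tensors="pt", max_length=512, truncation=True
    )
    output_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=max_length,
        min_length=min_length,
        num_beams=4,
        early_stopping=True,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


def main(pdf_path):
    """Run the full pipeline and report errors instead of crashing."""
    try:
        text = preprocess(extract_text(pdf_path))
        summary = summarize(text)
    except Exception as exc:  # covers PDF-reading and model-loading failures
        print(f"Error: {exc}")
        return 1
    print("Original text (snippet):")
    print(text[:500])
    print("\nSummary:")
    print(summary)
    return 0


if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

Saved under a hypothetical name such as `summarize_pdf.py`, the sketch would be run as `python summarize_pdf.py paper.pdf`. Because the input is truncated to the model's token limit, a long PDF would need to be split into chunks and summarized piece by piece for the summary to cover the whole document.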
Article Details
Unique Paper ID: 164748

Publication Volume & Issue: Volume 10, Issue 12

Page(s): 2076 - 2080