Scholar: A Retrieval-Augmented Generation (RAG) based Multi-Document Question Answering System

  • Unique Paper ID: 195474
  • PageNo: 522-528
  • Abstract:
  • This paper presents Scholar, a Retrieval-Augmented Generation (RAG) based Multi-Document Question Answering System designed to address the challenge of information overload in academic research. Students and researchers accumulate large volumes of heterogeneous documents such as PDFs, Word documents, presentations, spreadsheets, web pages, and images, but lack efficient tools to extract specific knowledge from them without relying on unreliable general-purpose language models. Scholar enables users to build a personal knowledge base by uploading documents in multiple formats and asking natural language questions. The system processes uploads through format-specific parsers, splits content into semantically coherent overlapping chunks, and encodes them as dense vector embeddings using the locally hosted all-MiniLM-L6-v2 sentence transformer model. These embeddings are stored in a ChromaDB vector database for persistent retrieval. At query time, the system performs cosine-similarity search to retrieve the most relevant chunks, assembles a grounded context block, and submits it to the Groq-hosted LLaMA-3.3-70B language model with strict anti-hallucination instructions. A distinguishing feature of Scholar is its multi-modal ingestion capability: in addition to text-based documents, the system accepts standalone images (JPG, PNG) and automatically extracts and describes figures embedded in PDFs using the Groq llama-4-scout vision model. This enables users to ask questions about charts, diagrams, and photographs. The system is built entirely on free, open-source tools and deploys locally on a standard Windows laptop with no GPU requirement, making it accessible to students at any institution.
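The retrieval core described in the abstract (overlapping chunking, dense embedding, and cosine-similarity search) can be sketched in plain Python. This is a minimal illustration only, not the paper's implementation: a toy bag-of-words embedding stands in for the all-MiniLM-L6-v2 sentence transformer, a Python list stands in for ChromaDB, and all function names here (`chunk`, `embed`, `retrieve`) are hypothetical. In the real system, the top-k chunks would then be assembled into a grounded context block for the Groq-hosted LLaMA-3.3-70B model.

```python
import math
import re

def chunk(text, size=40, overlap=10):
    """Split text into overlapping word windows (sizes in words)."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]

def embed(text):
    """Toy bag-of-words term-frequency vector, standing in for a
    sentence-transformer embedding such as all-MiniLM-L6-v2."""
    vec = {}
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        vec[word] = vec.get(word, 0) + 1
    return vec

def cosine(a, b):
    """Cosine similarity between two sparse vectors (dicts)."""
    dot = sum(a[k] * b.get(k, 0) for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(chunks, query, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(embed(c), q), reverse=True)
    return ranked[:k]
```

The overlap between adjacent chunks keeps sentences that straddle a chunk boundary intact in at least one window, which is why overlapping (rather than disjoint) chunking is the common choice in RAG pipelines.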

Copyright & License

Copyright © 2026. The authors retain the copyright of this article. This is an open-access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

BibTeX

@article{195474,
        author = {Ummidi Sri Vyshnavi and Talagana Tarun and Uggina Nookeswara Satyanarayana and Sabbavarapu Sai Madhu},
        title = {Scholar: A Retrieval-Augmented Generation (RAG) based Multi-Document Question Answering System},
        journal = {International Journal of Innovative Research in Technology},
        year = {2026},
        volume = {12},
        number = {11},
        pages = {522-528},
        issn = {2349-6002},
        url = {https://ijirt.org/article?manuscript=195474},
        abstract = {This paper presents Scholar, a Retrieval-Augmented Generation (RAG) based Multi-Document Question Answering System designed to address the challenge of information overload in academic research. Students and researchers accumulate large volumes of heterogeneous documents such as PDFs, Word documents, presentations, spreadsheets, web pages, and images, but lack efficient tools to extract specific knowledge from them without relying on unreliable general-purpose language models.
Scholar enables users to build a personal knowledge base by uploading documents in multiple formats and asking natural language questions. The system processes uploads through format-specific parsers, splits content into semantically coherent overlapping chunks, and encodes them as dense vector embeddings using the locally hosted all-MiniLM-L6-v2 sentence transformer model. These embeddings are stored in a ChromaDB vector database for persistent retrieval. At query time, the system performs cosine-similarity search to retrieve the most relevant chunks, assembles a grounded context block, and submits it to the Groq-hosted LLaMA-3.3-70B language model with strict anti-hallucination instructions.
A distinguishing feature of Scholar is its multi-modal ingestion capability: in addition to text-based documents, the system accepts standalone images (JPG, PNG) and automatically extracts and describes figures embedded in PDFs using the Groq llama-4-scout vision model. This enables users to ask questions about charts, diagrams, and photographs. The system is built entirely on free, open-source tools and deploys locally on a standard Windows laptop with no GPU requirement, making it accessible to students at any institution.},
        keywords = {Retrieval-Augmented Generation, Large Language Models, Vector Database, ChromaDB, Sentence Transformers, Groq API, FastAPI, Anti-Hallucination, Multi-Modal, Semantic Search, Multi-Document QA, LLaMA},
        month = {April},
        }

Cite This Article

Vyshnavi, U. S., Tarun, T., Satyanarayana, U. N., & Madhu, S. S. (2026). Scholar: A Retrieval-Augmented Generation (RAG) based Multi-Document Question Answering System. International Journal of Innovative Research in Technology (IJIRT), 12(11), 522–528.
