Open Domain Question Answering System Using Wikipedia
Author(s):
Ria Singh, Preetam B. A., Prince Zalavadia, Dr. Sharvari Govilkar
Keywords:
Question Answering System, Open Domain, Wikipedia, Deep Learning, Natural Language Processing, Document Retrieval, Document Reader, Answer Ranker.
Abstract
The open-domain question answering task has recently been addressed using unstructured data such as websites and online encyclopedias. Here, open-domain questions are answered by drawing on Wikipedia as the knowledge source, accessed through its API. Because many types of questions can be asked, it is critical to analyze each user question in terms of the nature of the answer being sought. The analyzed result of a question has three components: Answer Format, Answer Theme and Question Target (question analysis). The next step finds the documents or passages most relevant to the question using either word embedding distances or Deep Learning (document retrieval). Finally, the answer is extracted from the passage (machine comprehension). The PageRank technique can be used when computing the "document score" to assess the relevance of a document to a query. A BERT or ALBERT architecture trained on the SQuAD 2.0 dataset can be used as the document reader, which is expected to substantially improve machine comprehension performance. The Answer Ranker (SoftMax function) extracts a 1-2 line answer for the query. Questions are accepted as speech using the SpeechRecognition library, and the generated answers are returned as audio using the gTTS API.
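As an illustration of the Answer Ranker step, the following is a minimal sketch of how a SoftMax function could turn raw candidate scores into probabilities and select the best answer. It is not the authors' implementation: the function names `softmax` and `rank_answers`, the candidate strings, and the scores are hypothetical, and the scores stand in for whatever the document reader would actually produce.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    if not scores:
        return []
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rank_answers(candidates: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Return (answer, probability) pairs sorted by probability, highest first."""
    probs = softmax([score for _, score in candidates])
    ranked = [(text, p) for (text, _), p in zip(candidates, probs)]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical candidate answers with made-up reader scores (assumption:
# the reader emits one raw score per candidate span).
candidates = [("Paris", 7.2), ("Lyon", 3.1), ("France", 4.5)]
ranked = rank_answers(candidates)
best_answer, best_prob = ranked[0]
print(best_answer, round(best_prob, 3))
```

Normalizing with SoftMax keeps the candidate order unchanged but makes the scores comparable as probabilities, which is useful if a confidence threshold is applied before an answer is returned.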
Article Details
Unique Paper ID: 151185
Publication Volume & Issue: Volume 7, Issue 12
Page(s): 165 - 173