Document summarization using K-Means clustering and N-Gram Algorithm
Author(s):
Ankita Matta, Indu Kashyap
Keywords:
user search goals, feedback sessions, pseudo-documents, clustering, and document summarization
Abstract
Information on the Internet is growing exponentially. To find exactly the information they need on the web, users have come to depend on search engines, yet it has also become more difficult to provide each user with the required information. When different users submit the same ambiguous query to a search engine, they may have different search goals. It is therefore necessary to find and analyze user search goals in order to improve the user experience. Representing the results as clusters reveals the different user search goals behind a query, which has proved advantageous for improving search engine relevance and user experience. Query classification, search result reorganization, and session boundary detection are existing approaches that attempt to find user search goals, but they have many limitations, as reflected in evaluation measures such as Classified Average Precision (CAP), Average Precision (AP), and Mean Average Precision (MAP). A new approach has been implemented that overcomes the limitations of these existing approaches and further analyzes and discovers relevant user search goals. The approach first takes the user search query as input and generates a pseudo-document for each result of the query. These pseudo-documents are then clustered using the K-means clustering algorithm, and each cluster can be considered one user search goal. Finally, a summary of the input documents is generated from the output clusters and provided to the user to satisfy the query.
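The pipeline described in the abstract (pseudo-documents, K-means clustering, one cluster per search goal, then a summary) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The sample pseudo-documents for the ambiguous query "java" are invented, the word-level n-gram count vectors are an assumption about how the N-gram step in the title might be used, the farthest-point initialisation is chosen only to keep the run deterministic, and picking the pseudo-document nearest each centroid is a simple stand-in for the summarization step.

```python
from collections import Counter
import math

def ngrams(text, n=2):
    """Return word-level n-grams of sizes 1..n for a lowercased text."""
    words = text.lower().split()
    grams = []
    for size in range(1, n + 1):
        grams += [" ".join(words[i:i + size]) for i in range(len(words) - size + 1)]
    return grams

def vectorize(docs, n=2):
    """Map each document to an L2-normalised n-gram count vector."""
    counts = [Counter(ngrams(d, n)) for d in docs]
    vocab = sorted(set().union(*counts))
    vectors = []
    for c in counts:
        v = [float(c[g]) for g in vocab]
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        vectors.append([x / norm for x in v])
    return vectors

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, iters=20):
    """Lloyd's k-means with deterministic farthest-point initialisation (assumed)."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids)))
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: each vector joins its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, l in zip(vectors, labels) if l == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def summarize(docs, k=2, n=2):
    """Cluster pseudo-documents; return labels and one representative per cluster."""
    vectors = vectorize(docs, n)
    labels, centroids = kmeans(vectors, k)
    summary = []
    for j in range(k):
        members = [i for i, l in enumerate(labels) if l == j]
        if members:
            best = min(members, key=lambda i: sq_dist(vectors[i], centroids[j]))
            summary.append(docs[best])
    return labels, summary

# Hypothetical pseudo-documents for the ambiguous query "java".
docs = [
    "java programming language tutorial",
    "learn java programming language online",
    "java island travel guide indonesia",
    "travel guide java island beaches",
]
labels, summary = summarize(docs, k=2)
```

In this toy run the two programming-related pseudo-documents fall into one cluster and the two travel-related ones into the other, so each cluster corresponds to one candidate search goal. In practice the number of clusters k would have to be chosen or tuned, which the abstract does not specify.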
Article Details
Unique Paper ID: 144637
Publication Volume & Issue: Volume 4, Issue 1
Page(s): 268 - 274