Document summarization using K- Means clustering and N GRAM Algorithm

  • Unique Paper ID: 144637
  • PageNo: 268-274
  • Abstract:
  • The volume of information on the Internet grows rapidly every day. To find exactly the information they need on the web, users have come to depend on search engines. At the same time, it has become more difficult to provide each user with the required information. When different users submit the same ambiguous query to a search engine, they may have different search goals. It is therefore necessary to infer and analyze user search goals in order to improve the user experience. Representing the search results as clusters reveals the different user search goals behind a query, which has proved advantageous for improving search engine relevance and user experience. Query classification, search result reorganization, and session boundary detection are existing approaches that attempt to find user search goals. However, these approaches have many limitations when evaluated with measures such as Classified Average Precision (CAP), Average Precision (AP), and Mean Average Precision (MAP). A new approach has been implemented that overcomes the limitations of the existing approaches and further analyzes and discovers relevant user search goals. The approach first takes the user's search query as input and generates a pseudo-document for each result of that query. These pseudo-documents are then clustered using the K-means clustering algorithm, and each cluster can be considered one user search goal. Finally, the output clusters are used to generate a summary of the input documents, which is provided to the user to satisfy the query.
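
The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the pseudo-documents below are hypothetical strings for the ambiguous query "java", the term-frequency vectors, the farthest-point initialisation, and the choice of k = 2 are all assumptions made for the example, and the function names `vectorize`, `sq_dist`, and `kmeans` are invented here.

```python
import math
from collections import Counter


def vectorize(docs):
    """Map each pseudo-document to a unit-length term-frequency vector."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({w for words in tokenized for w in words})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for words in tokenized:
        vec = [0.0] * len(vocab)
        for w, count in Counter(words).items():
            vec[index[w]] = float(count)
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, max_iters=50):
    """Lloyd's k-means; returns one cluster label per input vector."""
    # Deterministic farthest-point initialisation (an assumption of this sketch).
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(farthest))
    labels = None
    for _ in range(max_iters):
        # Assignment step: attach every vector to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(v, centroids[c])) for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical pseudo-documents for the ambiguous query "java".
pseudo_docs = [
    "java virtual machine programming language",
    "java programming language tutorial",
    "java island indonesia travel",
    "java island indonesia volcano travel",
]
labels = kmeans(vectorize(pseudo_docs), k=2)
```

Under these assumptions each resulting label stands for one candidate search goal: the two programming-related pseudo-documents share a label, and the two travel-related ones share the other. In practice k is not known in advance and would have to be chosen or tuned.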

Copyright & License

Copyright © 2026 Authors retain the copyright of this article. This article is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

BibTeX

@article{144637,
        author = {Ankita Matta and Indu Kashyap},
        title = {Document summarization using K- Means clustering and N GRAM Algorithm},
        journal = {International Journal of Innovative Research in Technology},
        year = {},
        volume = {4},
        number = {1},
        pages = {268-274},
        issn = {2349-6002},
        url = {https://ijirt.org/article?manuscript=144637},
        abstract = {The volume of information on the Internet grows rapidly every day. To find exactly the information they need on the web, users have come to depend on search engines. At the same time, it has become more difficult to provide each user with the required information. When different users submit the same ambiguous query to a search engine, they may have different search goals. It is therefore necessary to infer and analyze user search goals in order to improve the user experience. Representing the search results as clusters reveals the different user search goals behind a query, which has proved advantageous for improving search engine relevance and user experience. Query classification, search result reorganization, and session boundary detection are existing approaches that attempt to find user search goals. However, these approaches have many limitations when evaluated with measures such as Classified Average Precision (CAP), Average Precision (AP), and Mean Average Precision (MAP). A new approach has been implemented that overcomes the limitations of the existing approaches and further analyzes and discovers relevant user search goals. The approach first takes the user's search query as input and generates a pseudo-document for each result of that query. These pseudo-documents are then clustered using the K-means clustering algorithm, and each cluster can be considered one user search goal. Finally, the output clusters are used to generate a summary of the input documents, which is provided to the user to satisfy the query.},
        keywords = {user search goals, feedback sessions, pseudo-documents, clustering, document summarization},
        month = {},
        }

Cite This Article

Matta, A., & Kashyap, I. Document summarization using K- Means clustering and N GRAM Algorithm. International Journal of Innovative Research in Technology (IJIRT), 4(1), 268–274.
