A Novel Approach for Document Categorization Based on Latent Semantic Indexing
Author(s):
Mamta Rani, Gagan Dhawan
Keywords:
Document Categorization, Tokenizing, Preprocessing, Term Finding, VSM (Vector Space Model), Clustering, LSA or LSI, SOM
Abstract
The rapid expansion of the web and the growing number of users have pushed new organizations to place their processed data online. At the same time, the steady growth in Internet usage makes that information harder to manage. The increasing importance of the World Wide Web, together with the need to organize data efficiently and search it for knowledge, has motivated the development of more intelligent and efficient real-time web clustering algorithms [8]. Latent Semantic Indexing (LSI), also known as Latent Semantic Analysis (LSA), is an effective text representation technique because it preserves semantic relationships between words. We therefore use singular value decomposition (SVD) to extract textual features based on LSI. In our experiments, we compare several well-known classification methods: Naïve Bayes, k-Nearest Neighbours, Neural Network, Random Forest, Support Vector Machine, and classification tree. We propose a novel approach to document categorization based on LSI that works on a hierarchy: a topic contains folders, folders contain categories, and a document is then created within that structure.
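The SVD-based feature extraction mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy corpus, the whitespace tokenizer, the raw term counts, and the choice of k = 2 latent dimensions are all assumptions made for illustration, and only NumPy is used.

```python
import numpy as np

# Hypothetical toy corpus (not taken from the paper).
docs = [
    "web clustering algorithms group web pages",
    "clustering algorithms organize web documents",
    "neural network classifier for text",
    "support vector machine text classifier",
]

def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace.
    return text.lower().split()

# Build the vocabulary and a term-to-row lookup.
vocab = sorted({term for doc in docs for term in tokenize(doc)})
row_of = {term: i for i, term in enumerate(vocab)}

# Term-document matrix of raw counts (rows: terms, columns: documents).
A = np.zeros((len(vocab), len(docs)))
for j, doc in enumerate(docs):
    for term in tokenize(doc):
        A[row_of[term], j] += 1.0

# Thin SVD: A = U @ diag(s) @ Vt, singular values in descending order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep the k largest singular values (k = 2 is an assumption).
k = 2
# Each row is one document expressed in the k-dimensional latent space.
doc_vectors = (np.diag(s[:k]) @ Vt[:k, :]).T

def cosine(a, b):
    # Cosine similarity between two latent-space vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

sim_same_topic = cosine(doc_vectors[0], doc_vectors[1])
sim_cross_topic = cosine(doc_vectors[0], doc_vectors[2])
print(sim_same_topic, sim_cross_topic)
```

In this sketch the two clustering documents end up close together in the latent space, while a clustering document and a classifier document do not. The reduced vectors in `doc_vectors` are the kind of features that could then be passed to any of the classifiers listed in the abstract.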
Article Details
Unique Paper ID: 146585
Publication Volume & Issue: Volume 5, Issue 1
Page(s): 175 - 179