Comparison of Text Classification Models for Telugu News Articles
Author(s):
Naga Sudha, Madhaveelatha
Keywords:
Text classification, Logistic regression, Naive Bayes classifier, Support vector machine, Gradient descent trees.
Abstract
Text classification has become important as the volume of information on the internet grows rapidly. Much of this information is unstructured and needs to be digitized. Once documents are in digital form, they must be organized by automatically assigning them to predefined labels based on their content. The efficiency of document classification depends mainly on the methods used in each phase. In this paper we propose a classification model that supports both generality and efficiency. Generality is achieved by following the logical sequence of the process of classifying unstructured text documents step by step, and efficiency through the various methods proposed. The paper also discusses some of the major issues involved in automatic text classification, such as dealing with unstructured text, handling a large number of attributes, natural language processing based techniques, dealing with missing metadata, and choosing a suitable machine learning technique for training a text classifier. The experimental results on news articles were validated using the statistical measures of accuracy and F-score. The results show that the methods significantly improve performance.
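The two evaluation measures named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the F-score was averaged across classes, so macro-averaging is an assumption here, and the category labels and predictions are hypothetical rather than taken from the paper's data.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of documents whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro-averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical news categories and predictions, for illustration only
y_true = ["sports", "politics", "sports", "business", "politics", "business"]
y_pred = ["sports", "politics", "business", "business", "politics", "sports"]

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

Reporting both measures is useful when news categories are imbalanced, since accuracy alone can look high even if a small category is classified poorly.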
Article Details
Unique Paper ID: 146440
Publication Volume & Issue: Volume 4, Issue 12
Page(s): 760 - 763