Hinglish, Text Classification, Naive Bayes, Tokenization, Diacritics Removal, Word Stemming.
Text classification is comparatively easy in English but remains difficult in many other languages. Little work has been done for Indian languages such as Hindi, Marathi, and Bengali. With the rapid growth in the number of internet users, many people are now comfortable writing in Hinglish, a combination of Hindi and English. This paper addresses language identification and sentiment analysis, and proposes a new classification approach for Hinglish text.
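As an illustration of how the techniques named in the keywords might fit together, the following is a minimal sketch and not the method proposed in this paper. It assumes romanized (Latin-script) Hinglish input, and the function names, the `NaiveBayes` class, and the four training sentences are hypothetical. It strips diacritics, tokenizes, and trains a multinomial Naive Bayes classifier with add-one smoothing; stemming is omitted.

```python
import math
import re
import unicodedata
from collections import Counter, defaultdict


def strip_diacritics(text):
    # Decompose characters, then drop combining marks (e.g. "ī" -> "i").
    # Assumption: input is romanized Hinglish, not Devanagari script.
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def tokenize(text):
    # Strip diacritics, lowercase, and keep runs of ASCII letters as tokens.
    return re.findall(r"[a-z]+", strip_diacritics(text).lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)
        self.class_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        # Tokens never seen in training are ignored.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)  # log likelihood
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data, for illustration only.
train_texts = [
    "yeh movie bahut acchi hai",
    "kya mast film thi",
    "bilkul bakwas movie",
    "bahut boring film thi",
]
train_labels = ["pos", "pos", "neg", "neg"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("movie acchi thi"))
print(model.predict("bakwas film"))
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied. A real system would need a much larger labeled corpus and a way to handle the spelling variation common in romanized Hindi.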