Sentiment Analysis of IMDB Movie Reviews: Natural Language Processing for Opinion Mining

  • Unique Paper ID: 195128
  • Volume: 12
  • Issue: 10
  • PageNo: 8188-8193
  • Abstract: The rapid growth of online platforms has significantly increased the volume of user-generated textual data. Movie review websites such as IMDb allow users to share opinions about movies, actors, and overall cinematic experiences. These reviews contain valuable information that can be used to understand audience sentiment and preferences. However, analyzing such large volumes of textual data manually is extremely difficult and time-consuming. Therefore, automated sentiment analysis techniques have become essential for extracting meaningful insights from user reviews. This research proposes a machine learning-based sentiment analysis system for classifying IMDb movie reviews into positive and negative categories. The system uses Natural Language Processing (NLP) techniques to process and analyse textual data effectively. Initially, the textual reviews undergo preprocessing steps such as text cleaning, tokenization, stop-word removal, and normalization. After preprocessing, the textual data is transformed into numerical feature representations using the Term Frequency–Inverse Document Frequency (TF-IDF) technique. To perform sentiment classification, multiple machine learning algorithms are applied, including Logistic Regression, Naive Bayes, and Support Vector Machine (SVM). In addition, an ensemble learning method is implemented to combine predictions from these individual models to improve overall performance. The experiments are conducted using the IMDb dataset containing 50,000 labelled movie reviews. Experimental results demonstrate that the proposed approach effectively identifies sentiment polarity and achieves high classification accuracy. The study highlights the effectiveness of combining TF-IDF feature extraction with machine learning algorithms for sentiment analysis tasks.
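
The pipeline described in the abstract (preprocessing, TF-IDF weighting, and combining several classifiers) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stop-word list, the TF-IDF variant (tf = count / document length, idf = ln(N / df)), the use of hard majority voting as the ensemble rule, and all function names are assumptions made for illustration. The three classifiers themselves are not implemented; their outputs are stand-in labels.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; the abstract does not name one.
STOP_WORDS = {"a", "an", "the", "is", "was", "it", "this", "and", "of", "to"}


def preprocess(review):
    """Clean, lowercase (normalize), tokenize, and drop stop words."""
    text = re.sub(r"<[^>]+>", " ", review)  # strip HTML tags such as <br />
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / len(doc), idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def majority_vote(predictions):
    """Hard-voting ensemble: the most common label among the model outputs."""
    return Counter(predictions).most_common(1)[0][0]


reviews = [
    "This movie was great!<br />Great acting.",
    "This movie was terrible and boring.",
]
tokens = [preprocess(r) for r in reviews]
vectors = tfidf(tokens)

# Stand-in labels from three hypothetical classifiers for a single review.
label = majority_vote(["positive", "negative", "positive"])
```

A term that appears in every document, such as "movie" here, receives zero weight under this variant, which is the property that lets TF-IDF down-weight uninformative words. In practice a library such as scikit-learn, which provides `TfidfVectorizer` and `VotingClassifier`, would typically replace these hand-written helpers.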

Copyright & License

Copyright © 2026. The authors retain the copyright of this article. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

BibTeX

@article{195128,
        author = {D. Kanakasatya and M. Prema Kumari and L. Hemanth Venkatesh and S. Bhaskar Sai Manikanta and S. Ramya},
        title = {Sentiment Analysis of IMDB Movie Reviews: Natural Language Processing for Opinion Mining},
        journal = {International Journal of Innovative Research in Technology},
        year = {2026},
        volume = {12},
        number = {10},
        pages = {8188-8193},
        issn = {2349-6002},
        url = {https://ijirt.org/article?manuscript=195128},
        abstract = {The rapid growth of online platforms has significantly increased the volume of user-generated textual data. Movie review websites such as IMDb allow users to share opinions about movies, actors, and overall cinematic experiences. These reviews contain valuable information that can be used to understand audience sentiment and preferences. However, analyzing such large volumes of textual data manually is extremely difficult and time-consuming. Therefore, automated sentiment analysis techniques have become essential for extracting meaningful insights from user reviews.
This research proposes a machine learning-based sentiment analysis system for classifying IMDb movie reviews into positive and negative categories. The system uses Natural Language Processing (NLP) techniques to process and analyse textual data effectively. Initially, the textual reviews undergo preprocessing steps such as text cleaning, tokenization, stop-word removal, and normalization. After preprocessing, the textual data is transformed into numerical feature representations using the Term Frequency–Inverse Document Frequency (TF-IDF) technique.
To perform sentiment classification, multiple machine learning algorithms are applied, including Logistic Regression, Naive Bayes, and Support Vector Machine (SVM). In addition, an ensemble learning method is implemented to combine predictions from these individual models to improve overall performance. The experiments are conducted using the IMDb dataset containing 50,000 labelled movie reviews. Experimental results demonstrate that the proposed approach effectively identifies sentiment polarity and achieves high classification accuracy. The study highlights the effectiveness of combining TF-IDF feature extraction with machine learning algorithms for sentiment analysis tasks.},
        keywords = {Sentiment Analysis, Natural Language Processing, Opinion Mining, IMDb Dataset, TF-IDF, Logistic Regression, Naive Bayes, Support Vector Machine, Ensemble Learning, Text Mining},
        month = {March},
        }

Cite This Article

Kanakasatya, D., Kumari, M. P., Venkatesh, L. H., Manikanta, S. B. S., & Ramya, S. (2026). Sentiment Analysis of IMDB Movie Reviews: Natural Language Processing for Opinion Mining. International Journal of Innovative Research in Technology (IJIRT). https://doi.org/10.64643/IJIRTV12I10-195128-459
