Effect of Word Embedding Techniques on Clustering of Netflix Movies and TV Shows dataset

  • Unique Paper ID: 157682
  • Volume: 9
  • Issue: 7
  • PageNo: 716-725
  • Abstract: Netflix is one of the leading over-the-top (OTT) platforms, known for offering users a wide variety of high-quality streaming movies and TV shows. Part of its worldwide popularity comes from its use of machine learning, deep learning, and artificial intelligence to give consumers more relevant and intuitive recommendations. This paper presents an unsupervised clustering analysis of the Netflix Movies and TV Shows dataset. The aim of the project is to form clusters using K-means clustering, agglomerative clustering, and affinity propagation clustering. The workflow covers data preprocessing, text cleaning, exploratory data analysis, vectorization, implementation of the clustering models, and hyperparameter tuning. The dataset is analyzed with Word2Vec word embeddings, CountVectorizer, and TfidfVectorizer; of these, Word2Vec performs much better than the other two methods. The silhouette score, the elbow method, and dendrograms are used as the selection criteria for the optimal number of clusters. The exploratory data analysis examines what type of content is available in different countries and shows that Netflix has increasingly focused on TV shows rather than movies in recent years. Similar content is clustered by matching text-based features.
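The selection step described in the abstract (vectorize the text, cluster it for several candidate values of k, keep the k with the highest silhouette score) can be sketched as follows. This is a minimal pure-Python illustration and not the authors' code: the six toy descriptions, the function names `bag_of_words`, `kmeans`, and `silhouette`, and the farthest-first initialization are all assumptions made for the example. In practice, scikit-learn's `CountVectorizer`, `KMeans`, and `silhouette_score` would normally take their place.

```python
import math
from collections import Counter


def bag_of_words(docs):
    """Count-based vectors over a shared vocabulary (the idea behind CountVectorizer)."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({w for toks in tokenized for w in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([float(counts[w]) for w in vocab])
    return vectors


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iters=20):
    """Plain K-means with a deterministic farthest-first start (an assumption)."""
    centers = [list(points[0])]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(dist(p, c) for c in centers))
        centers.append(list(farthest))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; assumes at least two clusters are present."""
    scores = []
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)  # usual convention for a single-member cluster
            continue
        a = sum(own) / len(own)  # mean distance inside the point's own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(dist(p, q) for q, lab in zip(points, labels) if lab == other)
            / labels.count(other)
            for other in set(labels) if other != labels[i]
        )
        scores.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return sum(scores) / len(scores)


# Toy descriptions invented for illustration; they are not from the dataset.
docs = [
    "crime detective murder investigation",
    "detective crime thriller murder",
    "murder investigation detective thriller",
    "romantic comedy love wedding",
    "love wedding romantic drama",
    "comedy love wedding drama",
]

vectors = bag_of_words(docs)
scores = {k: silhouette(vectors, kmeans(vectors, k)) for k in (2, 3, 4)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

The same loop applies unchanged to TF-IDF or Word2Vec vectors: only the function that produces `vectors` changes, which is what makes the silhouette score a convenient way to compare embedding techniques on equal terms.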

  • ISSN: 2349-6002