Quantification Of Variation Of Results Of The K-Means Clustering Algorithm Run Ten Times On The First Twenty Five Primes – A Criterion For Applicability Of The K-Means Clustering

Q: How many days will it take for my paper to be published?

The review time for papers is not fixed. However, if the paper is accepted and the author completes the processing charges formalities, the paper will be published within a few working days.

Q: I would like to receive a hard copy of the journal materials. Are there any additional charges?

You can log in to the author portal and pay 500 INR to receive the hard copy materials.

Chandini Giri; R Satya Ravindra Babu

Quantification Of Variation Of Results Of The K-Means Clustering Algorithm Run Ten Times On The First Twenty Five Primes – A Criterion For Applicability Of The K-Means Clustering

Authors: Chandini Giri, R Satya Ravindra Babu

Unique Paper ID: 150490
Volume: 7
Issue: 7
PageNo: 19-30

Keywords: K-Means Clustering Clustering Uncertainty

Abstract:
Clustering is one of the most common exploratory data analysis technique used to get an intuition about the structure of the data. It can be defined as the task of identifying subgroups in the data such that data points in the same subgroup (cluster) are very similar while data points in different clusters are very different. In other words, we try to find homogeneous subgroups within the data such that data points in each cluster are as similar as possible according to a similarity measure such as euclidean-based distance or correlation-based distance. The decision of which similarity measure to use is application-specific. K-means is one of the simplest unsupervised learning algorithms that solve the well-known clustering problem. K - Means clustering algorithm is a scheme for clustering continuous and numeric data. As K-Means algorithm consists of scheme of random initialization of centroids, every time it is run, it gives different or slightly different results because it may reach some local optima. Quantification of such aforementioned variation is of some importance as this sheds light on the nature of the Discrete K-Means Objective function with regards its maxima and minima. The K-Means Clustering algorithm aims at minimizing the aforementioned Objective function. In this research investigation, the author has attempted to quantify the variation of results of the K-Means Clustering Algorithm, run 10 times on the first 25 Prime numbers. Also, a notion of Percentage Uncertainty of clustering assignment for each data set point is computed for each run of the K- Means Clustering Algorithm. Also, a criterion is proposed for the applicability of K-Means Clustering Algorithm for the given data set.

Download article

email to a friend

Cite This Article

ISSN: 2349-6002
Volume: 7
Issue: 7
PageNo: 19-30

Quantification Of Variation Of Results Of The K-Means Clustering Algorithm Run Ten Times On The First Twenty Five Primes – A Criterion For Applicability Of The K-Means Clustering

Available:https://ijirt.org/article?manuscript=150490

Impact Factor
8.01 (Year 2024)

UGC Approved
Journal no 47859

Join Our IPN

IJIRT Partner Network

Submit your research paper and those of your network (friends, colleagues, or peers) through your IPN account, and receive 800 INR for each paper that gets published.

Join Now