Abstract
K-means Clustering is the most popular unsupervised mining technique. The paper explores the standard k-means clustering algorithm and its limitations. Fixing the number of clusters in advance limits the effectiveness of the algorithm. Moreover calculating the distance between each information item and cluster centers, in every single step causes the time complexity of the algorithm to increase. This paper exhibits an improved K-Means algorithm which pre-calculates the optimal number of clusters, dynamically using Dunn’s index and achieves high efficiency in terms of execution time, executes the algorithm in a parallel manner using the capabilities of Microsoft’s Task Parallel Libraries. Exploratory and comparative results demonstrate that the improved method can effectively enhance the pace of clustering and accuracy when applied on two dimensional raw data consisting of different numbers of records.