Abstract
Cluster analysis is a prominent unsupervised learning technique, widely used to group data items by their similarity. Both offline and online cluster analysis are active areas of research, and high-dimensional big data analysis continues to add a new dimension to data mining. Several families of variable selection methods exist for clustering, including density-based, model-based, and criterion-based methods. Because clustering high-dimensional data over all available dimensions yields less accurate results and long processing times, dimensionality reduction techniques have been introduced. The key questions are: which dimensions should be considered, what measures should be used, and how should cluster quality be evaluated based on those dimensions and measures? The proposed approach addresses these questions by introducing an Ensemble feature subset selection measure together with an Extended leader-follower algorithm, and supports the proposal with experimental evaluations.