Abstract
The recent growth in data volume poses a major challenge for data extraction. High-dimensional data contains a high degree of irrelevant and redundant information. Feature selection is the process of eliminating such irrelevant and redundant data with respect to the task to be performed. Several feature selection techniques are used to improve the efficiency and performance of various machine learning algorithms, and several methods have been proposed to extract features from such high-dimensional data. This paper proposes a clustering-based extended Fast Feature Selection method to extract features from high-dimensional data. The proposed algorithm uses semi-supervised learning, which is useful for partitioning the data into appropriate clusters. It also selects the most frequent feature subset from the input.