Abstract
Data mining is the task of discovering useful and interesting patterns in large amounts of data stored in databases, data warehouses and other information repositories. It integrates techniques from various disciplines, such as data visualization, database technology, information retrieval, high-performance computing, machine learning and pattern recognition. The classification of multidimensional data is one of the major challenges in data mining and data warehousing. In a classification problem, each object is defined by its attribute values in a multidimensional space. Some existing systems assume that the data analyst can identify the set of candidate data cubes for exploratory analysis based on domain knowledge. Unfortunately, there are conditions under which this assumption does not hold, including high-dimensional databases, for which it is difficult or impossible to pre-calculate the dimensions and cubes. The proposed system is formulated to automatically discover the dimensions and cubes that hold informative and interesting data. In high-dimensional datasets, the data analysis procedures need to be integrated with each other. Information-theoretic measures such as entropy are used to filter out irrelevant data from the dataset in order to formulate a more compact, manageable and useful schema.
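As an illustration of entropy-based filtering, the following minimal Python sketch computes the Shannon entropy of each dimension (column) and keeps only the dimensions whose entropy reaches a threshold. The abstract does not state the exact selection rule, so the function names `shannon_entropy` and `select_dimensions`, the 0.5-bit threshold, the keep-if-high-entropy rule and the toy records are all assumptions made for illustration, not the system's actual method.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    # H = sum over distinct values of p * log2(1 / p), with p = count / total
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def select_dimensions(rows, min_entropy=0.5):
    """Return the column names whose entropy is at least `min_entropy` bits.

    `rows` is a list of dicts that share the same keys. The threshold and the
    rule "drop near-constant columns" are illustrative assumptions.
    """
    if not rows:
        return []
    kept = []
    for name in rows[0]:
        column = [row[name] for row in rows]
        if shannon_entropy(column) >= min_entropy:
            kept.append(name)
    return kept


# Hypothetical toy records: `currency` is constant, so its entropy is 0 bits.
rows = [
    {"region": "north", "currency": "USD", "product": "A"},
    {"region": "south", "currency": "USD", "product": "B"},
    {"region": "north", "currency": "USD", "product": "C"},
    {"region": "south", "currency": "USD", "product": "D"},
]

print(select_dimensions(rows))  # ['region', 'product']
```

In this toy case the constant `currency` column cannot distinguish any record from another and is dropped, leaving a smaller schema. A real system might instead use a conditional or relative entropy measure against a target attribute; the abstract leaves that choice open.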