Abstract
Statistics is defined as the science of collecting, analyzing, and presenting data. Data mining is a new discipline lying at the interface of statistics, database technology, pattern recognition, machine learning, and other areas. Knowledge Discovery in Databases (KDD) has a spin that comes from database methodology and from computing with large data sets, while statistics has an emphasis that comes from mathematical statistics and from computing and practical statistical analysis with small data sets. Statistical techniques are driven by the data and are used to discover patterns and build predictive models. From the user's perspective, solving a "data mining" problem involves a conscious choice: whether to attack it with statistical methods or with other data mining techniques. However, since statistics provides the intellectual glue underlying the effort, it is important for statisticians to become involved. In this view, KDD is statistics and data mining is statistical analysis; the two fields are not much different. The main statistical issue in data mining (DM) and KDD is to examine whether traditional statistical approaches and methods differ substantially from the new trend of KDD and DM.