Data Cleaning System to Handle Noisy Data
Data cleaning techniques are used to identify duplicate records, handle missing data, and eliminate duplicates. This paper
presents a data cleaning system that proceeds through six steps: selection of attributes, formation of tokens, clustering, similarity
computation, elimination, and finally merging. The system architecture contains three components that communicate and cooperate
with each other: a user interface, a data cleaning component, and a reports component. It is implemented using SQL Server 2010
and Microsoft Visual C# 2010.
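
The six steps can be illustrated with a minimal sketch. It is written in Python for brevity and is not the system's actual C#/SQL Server implementation. The abstract does not specify the algorithms used, so the blocking key (the smallest sorted token), the Jaccard similarity measure, the 0.6 threshold, and all function names here are assumptions chosen for illustration only.

```python
from collections import defaultdict

def select_attributes(record, attributes):
    """Step 1: keep only the attributes used for matching (missing -> "")."""
    return {a: (record.get(a) or "") for a in attributes}

def form_tokens(selected):
    """Step 2: lowercase each value, split on whitespace, sort the tokens."""
    tokens = []
    for value in selected.values():
        tokens.extend(str(value).lower().split())
    return sorted(tokens)

def cluster(records, attributes):
    """Step 3: group records by an assumed blocking key (smallest token)."""
    clusters = defaultdict(list)
    for idx, rec in enumerate(records):
        tokens = form_tokens(select_attributes(rec, attributes))
        key = tokens[0] if tokens else ""
        clusters[key].append((idx, set(tokens)))
    return clusters

def similarity(a, b):
    """Step 4: Jaccard similarity between two token sets (an assumption)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def clean(records, attributes, threshold=0.6):
    """Steps 5 and 6: drop near-duplicates within each cluster and merge
    their non-missing values into the record that is kept."""
    survivors = []
    for members in cluster(records, attributes).values():
        kept = []  # (token set, merged record) pairs retained so far
        for idx, tokens in members:
            for kept_tokens, merged in kept:
                if similarity(tokens, kept_tokens) >= threshold:
                    # Duplicate found: fill gaps in the kept record.
                    for field, value in records[idx].items():
                        if merged.get(field) in (None, ""):
                            merged[field] = value
                    break
            else:
                kept.append((tokens, dict(records[idx])))
        survivors.extend(merged for _, merged in kept)
    return survivors

records = [
    {"name": "John Smith", "city": "Boston", "phone": None},
    {"name": "john smith", "city": "boston", "phone": "555-0100"},
    {"name": "Mary Jones", "city": "Austin", "phone": "555-0199"},
]
cleaned = clean(records, ["name", "city"])
```

In this example the first two records fall into the same cluster and match exactly after tokenization, so the second is eliminated and its phone number fills the missing value in the first. A single-token blocking key is a deliberate simplification: a typo in that token would place true duplicates in different clusters.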
Data Cleaning System to Handle Noisy Data. (2015). International Journal of Engineering and Computer Science, 4(01). http://ijecs.in/index.php/ijecs/article/view/432