Data Cleaning System to Handle Noisy Data

A.F Elgamal

Abstract

Data cleaning techniques are used to identify duplicate records and missing data and to eliminate duplicates. This paper
presents a data cleaning system that proceeds through six steps: selection of attributes, formation of tokens, a clustering algorithm,
similarity computation, an elimination function, and finally a merge step. The system architecture contains three components: a user
interface, a data cleaning component, and a reports component, which communicate and cooperate with one another. It is implemented
using SQL Server 2010 and Microsoft Visual C# 2010.
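
The six steps named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation, which uses SQL Server and C#. The abstract does not say which clustering algorithm or similarity measure is used, so blocking on the first character of the first sorted token and `difflib.SequenceMatcher` are stand-in assumptions, and every function name, the sample records, and the 0.85 threshold are hypothetical.

```python
from difflib import SequenceMatcher


def select_attributes(record, attributes):
    # Step 1: keep only the attributes chosen for comparison.
    return {name: record.get(name) or "" for name in attributes}


def form_tokens(values):
    # Step 2: lowercase, strip simple punctuation, split, and sort the tokens.
    tokens = []
    for value in values.values():
        text = str(value).lower().replace(".", " ").replace(",", " ")
        tokens.extend(text.split())
    return sorted(tokens)


def cluster_key(tokens):
    # Step 3 (assumed stand-in): block records by the first character of
    # the first sorted token, so only records in one block are compared.
    return tokens[0][0] if tokens else ""


def similarity(tokens_a, tokens_b):
    # Step 4 (assumed stand-in): ratio in [0, 1] between the token strings.
    return SequenceMatcher(None, " ".join(tokens_a), " ".join(tokens_b)).ratio()


def merge(kept, duplicate):
    # Step 6: fill fields missing in the kept record from its duplicate.
    merged = dict(kept)
    for field, value in duplicate.items():
        if not merged.get(field):
            merged[field] = value
    return merged


def clean(records, attributes, threshold=0.85):
    clusters = {}
    for record in records:
        tokens = form_tokens(select_attributes(record, attributes))
        clusters.setdefault(cluster_key(tokens), []).append((tokens, record))

    cleaned = []
    for members in clusters.values():
        survivors = []
        for tokens, record in members:
            for i, (kept_tokens, kept) in enumerate(survivors):
                # Step 5: a record matching a survivor is eliminated,
                # after its non-empty fields are merged into that survivor.
                if similarity(tokens, kept_tokens) >= threshold:
                    survivors[i] = (kept_tokens, merge(kept, record))
                    break
            else:
                survivors.append((tokens, dict(record)))
        cleaned.extend(record for _, record in survivors)
    return cleaned


if __name__ == "__main__":
    sample = [
        {"name": "John Smith", "city": "Cairo", "phone": ""},
        {"name": "john smith", "city": "Cairo", "phone": "555-0100"},
        {"name": "Mary Jones", "city": "Giza", "phone": "555-0199"},
    ]
    for row in clean(sample, ["name", "city"]):
        print(row)
```

In this sketch the two "John Smith" rows fall into the same block and produce identical token lists, so the second is eliminated and its phone number fills the empty field of the first. A production system would need a more careful blocking scheme, since records whose first tokens differ are never compared here.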
