Duplicate Detection Algorithm In Hierarchical Data Using Efficient And Effective Network Pruning Algorithm: Survey

Ashwini G Rathod, Vinod S Wadne

Abstract

Duplicate detection consists of detecting multiple representations of the same object, for every object represented in a data source. It is relevant in data cleaning and data integration applications and has been studied extensively for relational data describing a single type of object in a single table. The main aim of this project is to detect duplicates in structured data. The proposed system focuses on a specific type of error, namely fuzzy duplicates, or duplicates for short. Detecting duplicate entities that describe the same real-world object is an important data cleansing task that improves data quality. Numerous solutions to this problem exist for data stored in a flat relation.

Duplicate detection, an important subtask of data cleaning, involves identifying multiple representations of the same real-world object. Numerous approaches exist for both relational and XML data. Their goal is either to improve the quality of the detected duplicates (effectiveness) or to save computation time (efficiency).
