Abstract
Peer-to-peer (P2P) databases are mainly used on the Internet for distributing and sharing documents, applications, and other data. Answering large-scale ad hoc queries, such as aggregation queries, on these databases poses many new challenges. Computing exact answers can be very time-consuming and is difficult to implement because P2P databases are distributed and dynamic. This paper presents an approach for approximately answering ad hoc queries in such databases. In a distributed environment, the data is generally spread across many peers, and the data within each peer is often highly correlated. The proposed approach takes advantage of this correlation to process queries in such an environment.