Abstract
Clustering is an unsupervised learning technique that groups a set of objects into clusters such that objects within the same cluster are similar to one another and dissimilar to objects in other clusters. All clustering methods have to assume some cluster relationship among the data objects they are applied to. The similarity between a pair of objects can be defined either explicitly or implicitly. In this paper, we introduce a novel multiviewpoint-based similarity measure and two related clustering methods. The major difference between a traditional dissimilarity/similarity measure and ours is that the former uses only a single viewpoint, the origin, while the latter uses many different viewpoints, which are objects assumed not to be in the same cluster as the two objects being measured. Using multiple viewpoints, a more informative assessment of similarity can be achieved. Theoretical analysis and an empirical study are conducted to support this claim. Two criterion functions for document clustering are proposed based on this new measure. We compare them with the well-known k-means clustering algorithm, which uses a Euclidean distance measure, on various document collections to verify the advantages of our proposal.
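
The contrast between a single viewpoint and multiple viewpoints can be illustrated with a minimal Python sketch. The abstract does not state the formula, so the sketch rests on an assumption: similarity seen from a viewpoint h is taken to be the dot product of the difference vectors (a - h) and (b - h), averaged over all viewpoints. The function names `single_viewpoint_similarity` and `multiviewpoint_similarity` are hypothetical and chosen only for illustration.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def single_viewpoint_similarity(a: Vector, b: Vector) -> float:
    """Similarity measured from the origin only (equals cosine
    similarity when a and b are unit-length vectors)."""
    return dot(a, b)


def multiviewpoint_similarity(a: Vector, b: Vector,
                              viewpoints: Sequence[Vector]) -> float:
    """ASSUMED form: average, over every viewpoint h, of the dot product
    of (a - h) and (b - h). The viewpoints are meant to be objects
    outside the cluster that contains a and b."""
    if not viewpoints:
        raise ValueError("at least one viewpoint is required")
    total = 0.0
    for h in viewpoints:
        a_from_h = [x - y for x, y in zip(a, h)]
        b_from_h = [x - y for x, y in zip(b, h)]
        total += dot(a_from_h, b_from_h)
    return total / len(viewpoints)


if __name__ == "__main__":
    a, b = [1.0, 0.0], [0.0, 1.0]
    # With the origin as the only viewpoint, both measures agree.
    print(single_viewpoint_similarity(a, b))
    print(multiviewpoint_similarity(a, b, [[0.0, 0.0]]))
    # A different viewpoint changes how similar a and b appear.
    print(multiviewpoint_similarity(a, b, [[-1.0, -1.0]]))
```

Under this assumption, placing the only viewpoint at the origin reduces the multiviewpoint measure to the single-viewpoint one, which is one way to read the traditional measure as a special case.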