Abstract
Many text mining applications contain side information along with the text documents. Web documents, for example, often carry metadata describing attributes such as the origin of the document or other information related to that origin. In other cases, data such as location, ownership, or temporal information may prove informative for mining purposes. Such side information can carry a large amount of information, which may be used for clustering.
However, it can be difficult to estimate the importance of this side information, especially when some of it is noisy. In that case, incorporating the side information into the mining process is risky, because it can actually worsen the quality of the results. We therefore need a principled way to perform the mining process, so that the advantages of using this side information are maximized. We perform mining and clustering using the side information together with an iterative clustering procedure to form clusters. Within these clusters, we then search for the desired keyword using user behavior, localization, and personalization.