Abstract
The exponential growth of the Internet has led to a great deal of interest in developing useful and efficient tools and software to assist users in searching the web. Text is cheap, but information about which class a text belongs to is expensive. Automatic categorization of text can provide this information at low cost, but the classifiers themselves must be built with expensive human effort or trained from texts that have themselves been manually classified. Text classification is the process of assigning documents to predefined categories based on their content. Document retrieval, categorization, and filtering can all be formulated as classification problems. Traditional information retrieval methods use keywords occurring in a document to determine its class. In this paper, we propose an association analysis approach to text classification in which frequent word sets (itemsets) are generated as features using the Frequent-Pattern (FP) Growth algorithm. A Naive Bayes classifier, a supervised classifier, is then applied to the derived features for the final categorization.
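The two-stage pipeline outlined above can be illustrated with a minimal Python sketch. It is not the implementation used in this paper. All function and class names are hypothetical, and the six training documents are invented toy data. For brevity, the frequent word sets are found by brute-force enumeration rather than by building an FP-tree; FP-Growth would return the same sets for the same support threshold, only more efficiently. The classifier is assumed here to be a Bernoulli Naive Bayes with Laplace smoothing over binary "document contains this word set" features.

```python
from collections import Counter
from itertools import combinations
import math


def frequent_word_sets(docs, min_support, max_size=2):
    """Return word sets that occur in at least `min_support` documents.

    Brute-force stand-in for FP-Growth (assumption made for brevity):
    it yields the same frequent sets but enumerates candidates explicitly.
    """
    counts = Counter()
    for words in docs:
        unique = sorted(set(words))
        for size in range(1, max_size + 1):
            for combo in combinations(unique, size):
                counts[frozenset(combo)] += 1
    return [s for s, c in counts.items() if c >= min_support]


def featurize(words, word_sets):
    """Binary vector: 1 if the document contains every word of the set."""
    present = set(words)
    return [1 if s <= present else 0 for s in word_sets]


class BernoulliNaiveBayes:
    """Naive Bayes over binary features with Laplace (add-one) smoothing."""

    def fit(self, X, y):
        self.classes = sorted(set(y))
        n_features = len(X[0])
        self.log_prior = {}
        self.prob = {}
        for c in self.classes:
            rows = [x for x, label in zip(X, y) if label == c]
            self.log_prior[c] = math.log(len(rows) / len(X))
            # P(feature j = 1 | class c), smoothed so it is never 0 or 1.
            self.prob[c] = [
                (sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                for j in range(n_features)
            ]
        return self

    def predict(self, x):
        def score(c):
            s = self.log_prior[c]
            for xj, p in zip(x, self.prob[c]):
                s += math.log(p if xj else 1 - p)
            return s

        return max(self.classes, key=score)


# Invented toy corpus (illustration only).
train = [
    ("stock market shares fall".split(), "finance"),
    ("market shares rise bank".split(), "finance"),
    ("bank stock market report".split(), "finance"),
    ("team wins football match".split(), "sport"),
    ("football team loses match".split(), "sport"),
    ("match report football team".split(), "sport"),
]
docs = [d for d, _ in train]
labels = [label for _, label in train]

word_sets = frequent_word_sets(docs, min_support=2)
X = [featurize(d, word_sets) for d in docs]
model = BernoulliNaiveBayes().fit(X, labels)

print(model.predict(featurize("football team match".split(), word_sets)))
```

In this sketch the support threshold prunes words that appear in only one document, so the classifier sees only recurring words and word pairs. The values `min_support=2` and `max_size=2` are arbitrary choices for the toy data, not settings taken from the paper.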