Abstract
Data mining is a versatile subfield of computer science. It is the computational process of discovering patterns in large data sets. This paper gives an overview of the different pre-processing techniques used to mine text data. Text mining applications include Information Retrieval, Information Extraction, Categorization, and Natural Language Processing. Pre-processing for text mining starts with tokenization, followed by stop-word removal, and finally stemming. This paper evaluates the Porter and Krovetz stemming algorithms, highlighting their applications and drawbacks.
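The three pre-processing stages named above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the stop-word list, the suffix list, and the function names (`tokenize`, `remove_stop_words`, `naive_stem`, `preprocess`) are hypothetical, and the suffix stripper is a crude stand-in that implements neither the Porter nor the Krovetz algorithm.

```python
import re

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "of", "the", "to"}

# Suffixes tried in order. A crude stand-in for a real stemmer,
# NOT the rule set of Porter's or Krovetz's algorithm.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Stage 1: lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Stage 2: drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def naive_stem(token):
    """Stage 3: strip the first matching suffix, keeping at least 3 characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Run tokenization, stop-word removal, and stemming in that order."""
    return [naive_stem(t) for t in remove_stop_words(tokenize(text))]


if __name__ == "__main__":
    print(preprocess("The miners are detecting patterns in the documents"))
```

The order of the stages matters in this sketch: stop words are matched against whole tokens, so they are removed before stemming alters the token forms. A production pipeline would replace `naive_stem` with a full stemmer implementation.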