Abstract
Data mining automates the search for valuable information in large databases, and its wide range of applications makes it increasingly important as data volumes grow rapidly. Many problems in natural language processing, data mining, information retrieval, and bioinformatics can be formalized as string transformation. The proposed system implements string transformation in the data mining field using efficient algorithms. As the name implies, string transformation applies a set of operators to transform a given string into its most likely output strings. The operators are insertion, deletion, transposition, and substitution. Transformation rules and predefined rule indexes are used to avoid unnecessary searches and delays. Users can view how the possible outcomes are formed from the given string. From these candidates, the proposed system finds the most appropriate matches for the given string and returns them within seconds. Another important feature is query reformulation, which the system provides together with a useful description of the given query. Query reformulation is also a transformation technique, and it addresses the term mismatch problem. Similar query pairs can be mined from training data. The proposed system tries to transform a given query into the original query, which yields a better match between the query and the document, and it also gives a brief description of the result, as a search engine does. The challenge is compounded by the fact that new information from the field is added to the database daily. To handle this, the proposed system uses a dictionary method to add details to the database, and the information is retrieved using a text mining approach. Text mining is a new area of computer science and a sibling of data mining, with strong connections to data mining and knowledge management.
The proposed system is efficient and requires less time to retrieve data.
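
To make the four operators named above concrete, the following is a minimal sketch of how candidate strings could be generated from an input string and filtered against a dictionary. It is an illustration under stated assumptions, not the proposed system's implementation: the function names `single_edit_candidates` and `dictionary_matches` are hypothetical, the alphabet is assumed to be lowercase ASCII, only a single operator application is considered, and the transformation rules, rule indexes, and ranking mentioned in the abstract are not modeled.

```python
import string


def single_edit_candidates(word, alphabet=string.ascii_lowercase):
    """Return every string reachable from `word` by applying one operator.

    Hypothetical helper for illustration; assumes a lowercase ASCII alphabet.
    """
    # Every way of cutting the word into a left part and a right part.
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]

    # Deletion: drop the first character of the right part.
    deletions = {left + right[1:] for left, right in splits if right}
    # Transposition: swap the first two characters of the right part.
    transpositions = {
        left + right[1] + right[0] + right[2:]
        for left, right in splits
        if len(right) > 1
    }
    # Substitution: replace the first character of the right part.
    substitutions = {
        left + ch + right[1:] for left, right in splits if right for ch in alphabet
    }
    # Insertion: add one character at the cut point.
    insertions = {left + ch + right for left, right in splits for ch in alphabet}

    candidates = deletions | transpositions | substitutions | insertions
    # Some operators can reproduce the input unchanged; exclude it.
    candidates.discard(word)
    return candidates


def dictionary_matches(word, dictionary):
    """Return dictionary entries equal to `word` or one operator away from it."""
    if word in dictionary:
        return [word]
    return sorted(single_edit_candidates(word) & set(dictionary))


if __name__ == "__main__":
    # Assumed toy dictionary, for demonstration only.
    dictionary = {"mining", "minting", "dining", "string"}
    for query in ("minig", "strnig", "fining"):
        print(query, "->", dictionary_matches(query, dictionary))
```

Intersecting the generated candidates with the dictionary discards every string that is not a known entry. A practical system would additionally need a way to rank the surviving candidates and to limit the search, which is the role the abstract assigns to the transformation rules and rule indexes.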