Abstract
Data mining is the domain concerned with extracting useful information from large stores of data. It has wide scope in the biological sciences, where it can address critical problems in sequence pattern mining by working on large data sets, and it helps in the classification of biological sequences. Amino acids are the building blocks of proteins and play an important role in health: some are beneficial, while others are harmful when they occur in association with other amino acids. The proposed system focuses on generating frequent amino acid sets and the association rules derived from them. We consider five diseases in our project. If the association rule corresponding to a particular disease appears in the list of association rules obtained as output, we conclude that the disease corresponding to that rule is present. The measuring parameters are support count and confidence.
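To make the two measuring parameters concrete, the following is a minimal Python sketch, not the system described in this paper. It assumes that each protein sequence is treated as one transaction (the set of one-letter amino acid codes it contains), that the support count of an amino acid set is the number of sequences containing it, and that the confidence of a rule X -> Y is support(X ∪ Y) / support(X). The toy sequences, the thresholds, the function names, and the `disease_X` rule are all hypothetical placeholders chosen for illustration.

```python
from itertools import combinations

# Hypothetical toy data: each sequence is one transaction, i.e. the set of
# amino acids (one-letter codes) that occur in it.
sequences = ["ACDA", "ACD", "ACE", "CDE"]
transactions = [frozenset(s) for s in sequences]

def support_count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search: keep itemsets whose support count
    is at least `min_support`, then join them into larger candidates."""
    items = sorted(set().union(*transactions))
    frequent = {}
    current = [frozenset([i]) for i in items]
    while current:
        counts = {c: support_count(c, transactions) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join frequent k-itemsets that differ in one item into (k+1)-candidates.
        current = list({a | b for a, b in combinations(level, 2)
                        if len(a | b) == len(a) + 1})
    return frequent

def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) for every rule that meets
    `min_confidence`, with confidence = support(X ∪ Y) / support(X)."""
    rules = []
    for itemset, count in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                antecedent = frozenset(antecedent)
                confidence = count / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules

frequent = frequent_itemsets(transactions, min_support=2)
rules = association_rules(frequent, min_confidence=0.7)

# Placeholder mapping from a made-up disease label to a made-up rule;
# neither the label nor the rule comes from the paper.
known_rules = {"disease_X": (frozenset("AD"), frozenset("C"))}
found = {(a, c) for a, c, _ in rules}
flagged = [name for name, rule in known_rules.items() if rule in found]
```

Under these assumptions, a disease is flagged only when its rule survives both thresholds, so the choice of minimum support count and minimum confidence directly controls which diseases are reported.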