Abstract
Finding out what other people think has always been an essential part of how we gather information. With the growing availability and popularity of opinion-rich resources such as online review sites and personal blogs, new opportunities and challenges arise as people can, and do, actively use information technologies to seek out and understand the opinions of others. As a result, research on sentiment analysis has grown considerably in recent years. Sentiment analysis covers methods that automatically process text and extract the opinions of its authors. In this work, biomedical opinions are extracted from Twitter, and the resulting data contain many features that can be used to classify those opinions. However, such datasets also contain many irrelevant or weakly correlated features, which reduce the predictive accuracy of classification. Without a feature selection algorithm, it is difficult for existing classification techniques to identify patterns in the features accurately. The purpose of feature selection is not only to identify a feature subset from the original set of features but also to reduce the computational overhead of data mining. In the proposed approach, the Shuffled Frog Leaping Algorithm (SFLA) optimizes the feature selection process and yields a feature subset that increases the predictive accuracy of the classifier. SFLA acts as the feature selector and generates the feature subset, and Naïve Bayes, SVM, and k-NN classifiers are used to evaluate the subset produced. Experimental results show that the Naïve Bayes classifier achieves better accuracy when the features selected by the shuffled frog leaping algorithm are used.
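As a rough illustration of the wrapper-style selection described above, the following is a minimal sketch of a binary variant of SFLA, not the implementation used in this work. Each frog is a 0/1 mask over the features, and the leap is a simple discrete stand-in for the continuous SFLA update. The function name `sfla_select`, its parameters, and the toy fitness function are all assumptions made for illustration; in the setting of this work the fitness would presumably be the accuracy of a classifier such as Naïve Bayes trained on the selected features.

```python
import random

def sfla_select(fitness, n_features, n_frogs=20, n_memeplexes=4,
                local_steps=5, generations=20, seed=0):
    """Binary SFLA sketch: each frog is a 0/1 mask over the features."""
    rng = random.Random(seed)

    def random_frog():
        return [rng.randint(0, 1) for _ in range(n_features)]

    def leap(worst, target):
        # Move the worst frog toward the target: each differing bit is
        # copied from the target with probability 0.5 (an assumed rule).
        return [t if w != t and rng.random() < 0.5 else w
                for w, t in zip(worst, target)]

    frogs = [random_frog() for _ in range(n_frogs)]
    for _ in range(generations):
        # Rank frogs best first, then deal them into memeplexes.
        frogs.sort(key=fitness, reverse=True)
        global_best = frogs[0]
        memeplexes = [frogs[i::n_memeplexes] for i in range(n_memeplexes)]
        for plex in memeplexes:
            for _ in range(local_steps):
                plex.sort(key=fitness, reverse=True)
                best, worst = plex[0], plex[-1]
                # Try the memeplex best, then the global best, then a
                # random frog if neither leap improves on the worst frog.
                candidate = leap(worst, best)
                if fitness(candidate) <= fitness(worst):
                    candidate = leap(worst, global_best)
                if fitness(candidate) <= fitness(worst):
                    candidate = random_frog()
                plex[-1] = candidate
        # Shuffle step: merge the memeplexes back into one population.
        frogs = [frog for plex in memeplexes for frog in plex]
    return max(frogs, key=fitness)

# Toy fitness (hypothetical): reward three "relevant" features and
# penalize subset size, standing in for classifier accuracy.
RELEVANT = {0, 3, 5}

def toy_fitness(mask):
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in RELEVANT)
    return hits - 0.1 * sum(mask)

mask = sfla_select(toy_fitness, n_features=10)
print(mask, toy_fitness(mask))
```

Replacing `toy_fitness` with a function that trains and scores a classifier on the masked feature columns would turn this sketch into a wrapper evaluation of the kind the abstract describes, at the cost of one classifier evaluation per fitness call.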