Abstract
The rapid digitization of data in the healthcare sector has led to the collection of vast amounts of data in Electronic Health Records (EHR). Data is among the most valuable assets of the modern age, and its proper use in healthcare can lead to the timely detection of serious diseases, which in turn enables higher-quality patient care at lower cost. Breast cancer is a leading cause of death in women, so detecting it precisely at an early stage is important. Data mining algorithms can help achieve such precision: machine learning models that predict the disease can play a vital role in early diagnosis, because they can be used to distinguish healthy people from people with the disease. This project applies selected machine learning algorithms to breast cancer using the WEKA tool and compares them in terms of accuracy, in order to select the best classifier for early diagnosis of the disease. Three models were implemented on the Breast Cancer dataset: Naïve Bayes, Logistic Regression, and Random Forest. Random Forest performed best, with an accuracy of 98% and a sensitivity of 99%, followed by Logistic Regression with an accuracy of 96% and a sensitivity of 98%, and finally Naïve Bayes with an accuracy of 91% and a sensitivity of 94%.
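The two metrics used to compare the classifiers can be illustrated with a short sketch. It assumes the usual binary definitions, with the positive class taken to be "disease present": accuracy is the fraction of all cases classified correctly, and sensitivity is the fraction of actual positive cases that are detected. The names `ConfusionCounts`, `accuracy`, `sensitivity`, and `rank_by_accuracy` are hypothetical, and the counts in the usage example are made-up illustrative values, not the results reported in this paper. The code is plain Python and does not use WEKA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (positive class assumed to be 'disease present')."""
    tp: int  # diseased cases predicted as diseased
    fn: int  # diseased cases predicted as healthy
    fp: int  # healthy cases predicted as diseased
    tn: int  # healthy cases predicted as healthy


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all cases that were classified correctly."""
    total = c.tp + c.fn + c.fp + c.tn
    if total == 0:
        raise ValueError("no cases to evaluate")
    return (c.tp + c.tn) / total


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of actual positive cases that were detected (true positive rate)."""
    positives = c.tp + c.fn
    if positives == 0:
        raise ValueError("no positive cases to evaluate")
    return c.tp / positives


def rank_by_accuracy(results: dict[str, ConfusionCounts]) -> list[str]:
    """Return classifier names ordered from highest to lowest accuracy."""
    return sorted(results, key=lambda name: accuracy(results[name]), reverse=True)


if __name__ == "__main__":
    # Illustrative counts only; these are NOT the paper's confusion matrices.
    example = {
        "classifier_a": ConfusionCounts(tp=90, fn=10, fp=5, tn=95),
        "classifier_b": ConfusionCounts(tp=80, fn=20, fp=15, tn=85),
    }
    for name in rank_by_accuracy(example):
        counts = example[name]
        print(f"{name}: accuracy={accuracy(counts):.3f}, "
              f"sensitivity={sensitivity(counts):.3f}")
```

Reporting sensitivity alongside accuracy matters in a diagnostic setting because a missed positive case (a false negative) is usually more costly than a false alarm, and accuracy alone does not separate the two kinds of error.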