Machine Learning techniques for an improved breast cancer detection

Elena-Anca PARASCHIV1,2, Elena OVREIU2
1 National Institute for Research and Development in Informatics – ICI Bucharest
2 Politehnica University of Bucharest

Abstract: Breast cancer is one of the most common types of cancer diagnosed in women and the second leading cause of cancer mortality after lung cancer. Diagnosis and prediction of cancer development are now carried out using advanced methods such as Machine Learning. This article presents research results on Machine Learning applied to the classification of medical data. The aim was to classify the Breast Cancer Wisconsin (Diagnostic) database using a set of different algorithms, selected to highlight the performance of Machine Learning techniques in terms of accuracy and precision. The algorithms implemented were Support Vector Machines (SVM), k-Nearest Neighbor (kNN), Multilayer Perceptron (MLP), Decision Tree, Gaussian Naïve Bayes and Random Forest. The most representative features were identified from a selected set of diagnostic images obtained by fine needle aspirate (FNA). The best accuracy, 97.90%, was obtained with the Random Forest algorithm, which leaves room for further refinement of the classification.

Keywords: breast cancer, machine learning, accuracy, SVM, kNN, MLP, Decision Tree, Random Forest, Gaussian Naïve Bayes.
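
The abstract compares classifiers in terms of accuracy and precision. As a minimal sketch of how these two metrics can be computed for a binary diagnosis task, the Python snippet below uses only the standard library. The function name `accuracy_and_precision`, the label codes "M" (malignant) and "B" (benign), and the sample labels are illustrative assumptions, not the authors' implementation or data.

```python
def accuracy_and_precision(y_true, y_pred, positive="M"):
    """Return (accuracy, precision) for a binary classification task.

    `positive` is the label treated as the positive class; "M" for
    malignant is an assumption made for this illustration.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    # True positives: malignant cases predicted as malignant.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: benign cases predicted as malignant.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Accuracy: share of all samples whose prediction matches the truth.
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Precision: share of positive predictions that are truly positive.
    # Defined here as 0.0 when nothing was predicted positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return accuracy, precision


# Made-up labels for illustration only (not taken from the paper).
y_true = ["M", "B", "B", "M", "B", "M", "B", "B"]
y_pred = ["M", "B", "M", "M", "B", "B", "B", "B"]

acc, prec = accuracy_and_precision(y_true, y_pred)
print(f"accuracy={acc:.3f} precision={prec:.3f}")
```

Reporting both values is useful because accuracy alone can look high when one class dominates, while precision shows how many of the malignant predictions were in fact malignant.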


Elena-Anca PARASCHIV, Elena OVREIU, Machine Learning techniques for an improved breast cancer detection, Romanian Journal of Information Technology and Automatic Control, ISSN 1220-1758, vol. 30(2), pp. 67-80, 2020.