Leveraging Machine Learning Algorithms for Early Detection of Breast Cancer: A Comparative Study Using Diagnostic Features

Aulia Abikhair, Huang Guanghui

Abstract


Breast cancer remains one of the leading causes of cancer-related mortality among women worldwide, making early and accurate diagnosis essential for improving survival rates and treatment outcomes. To address the limitations of conventional diagnostic methods, Machine Learning (ML) techniques have been increasingly adopted to enhance classification accuracy and reduce diagnostic variability. This study presents a comparative evaluation of four widely used ML algorithms, namely Random Forest, Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Logistic Regression, applied to a structured breast cancer diagnostic dataset. The dataset comprises morphological and texture-based features extracted from digitized tumor samples, enabling binary classification of benign and malignant cases. The models were trained using an 80:20 train–test split and validated through k-fold cross-validation. Performance was evaluated using accuracy, precision, recall, F1-score, and confusion matrix analysis to give a comprehensive assessment of classification behavior. Experimental results indicate strong predictive performance across all models, with overall accuracy ranging from 0.95 to 0.96. Among the evaluated approaches, Random Forest demonstrated the most balanced performance, achieving the highest recall for malignant tumors and the lowest false-negative rate, which is critical in clinical diagnostics. Feature importance analysis further revealed that tumor area, concave points, radius, and perimeter were the most influential predictors in classification decisions, consistent with established clinical indicators of malignancy. These findings indicate that classical and interpretable machine learning algorithms, especially ensemble-based methods, remain highly effective for structured breast cancer classification tasks. The study contributes to the advancement of reliable and transparent ML-based decision-support systems, supporting improved early detection and diagnostic accuracy in breast cancer care.
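The evaluation protocol described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn and uses its bundled `load_breast_cancer` dataset as a stand-in for the study's dataset, which the abstract does not name. The fold count (5), the random seed, the feature scaling, and all hyperparameters are likewise assumptions, and the helper name `compare_models` is hypothetical.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, recall_score
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def compare_models(seed=42, n_splits=5):
    """Hypothetical helper: train and score four classifiers on one split."""
    # Assumption: scikit-learn's bundled dataset stands in for the study's data.
    data = load_breast_cancer()
    X, y = data.data, data.target  # in this dataset: 0 = malignant, 1 = benign

    # 80:20 train-test split, stratified to preserve the class balance.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )

    # Distance- and margin-based models are scaled; the forest is not.
    models = {
        "Random Forest": RandomForestClassifier(n_estimators=200, random_state=seed),
        "SVM": make_pipeline(StandardScaler(), SVC()),
        "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
        "Logistic Regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
    }

    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    results = {}
    for name, model in models.items():
        # k-fold cross-validation on the training portion only.
        cv_acc = cross_val_score(
            model, X_train, y_train, cv=cv, scoring="accuracy"
        ).mean()
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        results[name] = {
            "cv_accuracy": cv_acc,
            "test_accuracy": accuracy_score(y_test, y_pred),
            # Recall on the malignant class (label 0) tracks false negatives.
            "malignant_recall": recall_score(y_test, y_pred, pos_label=0),
            "confusion_matrix": confusion_matrix(y_test, y_pred, labels=[0, 1]),
        }

    # Rank features by the fitted forest's impurity-based importances.
    forest = models["Random Forest"]
    ranked = sorted(
        zip(data.feature_names, forest.feature_importances_),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return results, ranked


if __name__ == "__main__":
    results, ranked = compare_models()
    for name, scores in results.items():
        print(name, round(scores["test_accuracy"], 3), round(scores["malignant_recall"], 3))
    print("Top features:", [feature for feature, _ in ranked[:5]])
```

Reporting malignant-class recall alongside accuracy mirrors the abstract's emphasis on false negatives. Scores from this sketch depend on the split and settings chosen here and should not be expected to reproduce the reported figures.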


Keywords


Breast Cancer Detection; Machine Learning; Random Forest; Tumor Classification; Feature Importance


IJIIS: International Journal of Informatics and Information Systems

ISSN: 2579-7069 (Online)
Organized by: Department of Information System, Universitas Amikom Purwokerto, Indonesia; Faculty of Computing and Information Science, Ain Shams University, Cairo, Egypt
Website: www.ijiis.org
Email: husniteja@uinjkt.ac.id (publication issues)
  taqwa@amikompurwokerto.ac.id (managing editor)
  contact@ijiis.org (technical & paper handling issues)

 This work is licensed under a Creative Commons Attribution-ShareAlike 4.0