Comparison of CART and Naive Bayesian Algorithm Performance to Diagnose Diabetes Mellitus

Irfan Santiko, Pungkas Subarkah


According to Indonesia's 2008 health profile, diabetes mellitus was the sixth leading cause of death across all ages in Indonesia, accounting for 5.7% of deaths, behind stroke, TB, hypertension, injury, and perinatal conditions. WHO (2003) likewise reported that diabetes mellitus affected 194 million people, or 5.1 percent of the world's adult population, a figure expected to rise to 333 million by 2025. In Indonesia in particular, the number of people with diabetes mellitus is increasing: it reached 8.4 million in 2000 and is estimated to reach 21.3 million by 2030. Because the disease can cause complications, this motivates researchers and practitioners to focus on detecting, diagnosing, and preventing diabetes mellitus. The research proceeded through problem identification, data collection, pre-processing, classification, validation and evaluation, and conclusion. The algorithms used were CART and Naïve Bayes, applied to the Pima Indians diabetes dataset from the UCI repository, which consists of clinical data for patients who tested positive or negative for diabetes mellitus. Validation and evaluation used 10-fold cross-validation and a confusion matrix to assess precision, recall, and F-measure. The CART algorithm achieved an accuracy of 76.9337%, with precision 0.764, recall 0.769, and F-measure 0.765. On the same dataset, the Naïve Bayes algorithm achieved an accuracy of 73.7569%, with precision 0.732, recall 0.738, and F-measure 0.734. From these results it can be concluded that the CART algorithm is the suggested choice for diagnosing diabetes mellitus.
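The evaluation measures named in the abstract can be illustrated with a minimal sketch that derives accuracy, precision, recall, and F-measure from a confusion matrix. The class-weighted averaging below is an assumption about how the reported figures were produced, although it would be consistent with the reported recall (0.769) matching the reported accuracy (76.93%), since support-weighted recall always equals accuracy. The function names and the counts in `EXAMPLE_CM` are hypothetical and are not the study's data.

```python
from typing import Dict, List


def per_class_metrics(cm: List[List[int]], k: int) -> Dict[str, float]:
    """Metrics for class k, where cm[i][j] counts true class i predicted as j."""
    tp = cm[k][k]
    predicted_k = sum(row[k] for row in cm)  # column sum: everything predicted as k
    actual_k = sum(cm[k])                    # row sum: everything truly in class k
    precision = tp / predicted_k if predicted_k else 0.0
    recall = tp / actual_k if actual_k else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "support": float(actual_k),
    }


def weighted_report(cm: List[List[int]]) -> Dict[str, float]:
    """Accuracy plus per-class metrics averaged with class-size weights."""
    total = sum(sum(row) for row in cm)
    per_class = [per_class_metrics(cm, k) for k in range(len(cm))]
    report = {
        name: sum(m[name] * m["support"] for m in per_class) / total
        for name in ("precision", "recall", "f_measure")
    }
    report["accuracy"] = sum(cm[k][k] for k in range(len(cm))) / total
    return report


# Hypothetical counts for illustration only (rows: actual negative, actual positive).
EXAMPLE_CM = [[80, 20], [30, 70]]

if __name__ == "__main__":
    for name, value in weighted_report(EXAMPLE_CM).items():
        print(f"{name}: {value:.3f}")
```

Under 10-fold cross-validation, one common approach is to sum the per-fold confusion matrices into a single matrix and compute the measures from that total.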



Performance; Diagnosis; Algorithm; Diabetes mellitus




Diabetes Care. 2004. Global prevalence of diabetes: estimates for the year 2000 and projections for 2030.

Gorunescu, F. 2011. Data Mining: Concepts, Models and Techniques. Berlin Heidelberg: Springer-Verlag.

Han, J., & Kamber, M. 2006. Data Mining: Concepts and Techniques. Berlin Heidelberg: Springer-Verlag.

Jayalakshmi, T., Santhakumaran, A., "Impact of Preprocessing for Diagnosis of Diabetes Mellitus Using Artificial Neural Networks," Machine Learning and Computing (ICMLC), 2010 Second International Conference on, pp. 109-112, 9-11 Feb. 2010.

Kemenkes RI. 2014. Situasi dan Analisis Diabetes. Jakarta: Kemenkes RI.

Kusrini, & Luthfi, E. T. 2009. Algoritma Data Mining. Yogyakarta: Andi Offset.

Larose, D. T. 2005. Discovering Knowledge in Data: An Introduction to Data Mining. New Jersey: Wiley-Interscience.

Patil, B. M., Joshi, R. C., Toshniwal, D. 2010. Association rule for classification of type 2 diabetic patients. Machine Learning and Computing (ICMLC), pp. 330-334.

Pima Indians Diabetes Dataset, UCI Machine Learning Repository. Accessed 29 August 2016.

RISKESDAS, Indonesian Ministry of Health's Health Research and Development Agency, 2013.

Timofeev, Roman. 2004. Classification and Regression Trees (CART): Theory and Applications. Humboldt University, Berlin.

WEKA, Machine Learning Group at University of Waikato. Accessed 29 August 2016.

International Diabetes Federation. Retrieved 3 July 2015.

World Health Organization. Retrieved 18 June 2015.

Han J, Kamber M, Pei J. Data Mining: Concepts and Techniques, 3rd ed. USA: Morgan Kaufmann; 2012.



International Journal of Informatics and Information Systems (IJIIS)
ISSN: 2579-7069 (online)
Organized by Information System Department - Universitas Amikom Purwokerto - Indonesia, Laboratoire Signaux Et Systèmes (L2s) - Université Paris 13 - France, and Bright Publisher
Published by Bright Publisher

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0