DATA MINING MODEL CLASSIFICATION USING ALGORITHM K-NEAREST NEIGHBOR WITH NORMALIZATION FOR DIABETES PREDICTION

Authors

  • Muhammad Sholeh, Institut Sains & Teknologi AKPRIND
  • Dina Andayati, Program Studi Teknik Mesin, Fakultas Teknologi Industri, Institut Sains & Teknologi AKPRIND Yogyakarta
  • Rr. Yuliana Rachmawati, Program Studi Informatika, Fakultas Teknologi Informasi dan Bisnis, Institut Sains & Teknologi AKPRIND Yogyakarta

https://doi.org/10.36342/teika.v12i02.2911

Keywords:

Model, data mining, normalization, kNN

Abstract

A model built through the data mining process can be used to make predictions from data. The model is built from a dataset containing processed data. One application of such models is predicting a disease such as diabetes. In this study, a data mining model was developed using the k-NN algorithm together with data normalization. The normalization methods used are Z-Score and Min-Max. The research methodology consists of determining the dataset, selecting the data mining model, dividing the dataset into training data and testing data, and evaluating the performance of the resulting model. The modeling was carried out in Python. The data mining process uses a classification model based on the k-NN algorithm. The dataset is a public diabetes dataset consisting of 768 records and 8 attributes. The results show that normalization can yield better accuracy. The model built without normalization gives k=5 with an accuracy of 70%, normalization with the Z-Score method gives k=21 with an accuracy of 72%, and normalization with Min-Max gives k=3 with an accuracy of 74%. The recommended model is the k-NN model with k=3, using Min-Max normalization.
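
The workflow the abstract describes (normalize, classify with k-NN, compare accuracy) can be illustrated with a minimal sketch in plain Python. This is not the authors' code and does not use the diabetes dataset: the eight training rows and four test rows below are hypothetical values, chosen so that one large-range feature dominates the raw Euclidean distance. All function names are illustrative, and fitting the scaling statistics on the training rows only is an assumption, since the abstract does not say how the split and the normalization were ordered.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def min_max_scale(train, rows):
    """Min-Max: x' = (x - min) / (max - min), with min/max taken from the training rows."""
    cols = list(zip(*train))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - lo) / s for v, lo, s in zip(row, lows, spans)] for row in rows]


def z_score_scale(train, rows):
    """Z-Score: x' = (x - mean) / std, with mean/std taken from the training rows."""
    cols = list(zip(*train))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(row, mus, sigmas)] for row in rows]


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training rows closest to `query` (Euclidean distance)."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]


def accuracy(train_x, train_y, test_x, test_y, k):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(knn_predict(train_x, train_y, q, k) == y for q, y in zip(test_x, test_y))
    return hits / len(test_y)


# Hypothetical toy data (NOT the diabetes dataset): feature 0 has a large range and
# carries no class signal, feature 1 has a small range and separates the classes.
TRAIN_X = [[100, 0.1], [500, 0.2], [900, 0.1], [150, 0.9], [550, 0.8], [950, 0.9]]
TRAIN_Y = [0, 0, 0, 1, 1, 1]
TEST_X = [[120, 0.85], [520, 0.15], [930, 0.8], [130, 0.2]]
TEST_Y = [1, 0, 1, 0]


def compare(k):
    """Accuracy of k-NN on the toy data with no scaling, Min-Max, and Z-Score."""
    scalers = {
        "none": lambda train, rows: rows,
        "min_max": min_max_scale,
        "z_score": z_score_scale,
    }
    return {
        name: accuracy(scale(TRAIN_X, TRAIN_X), TRAIN_Y, scale(TRAIN_X, TEST_X), TEST_Y, k)
        for name, scale in scalers.items()
    }


for name, acc in compare(k=3).items():
    print(f"{name}: {acc:.2f}")
```

On this constructed data the unscaled model misclassifies two of the four test rows, while both scaled variants classify all four correctly. That outcome follows from how the toy data was built and says nothing about the accuracy figures reported in the abstract; a search over k on the real dataset would simply call `compare` (or an equivalent) for each candidate value.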

Published

2022-10-31

How to Cite

Sholeh, M., Andayati, D., & Rachmawati, R. Y. (2022). DATA MINING MODEL CLASSIFICATION USING ALGORITHM K-NEAREST NEIGHBOR WITH NORMALIZATION FOR DIABETES PREDICTION. TeIKa, 12(02), 77-87. https://doi.org/10.36342/teika.v12i02.2911