COMPARISON OF NAÏVE BAYES AND RANDOM FOREST PERFORMANCE IN PREDICTING STUDENT ACADEMIC PERFORMANCE
Abstract
Predicting student academic performance is an important application of machine learning in education. This study compares the performance of the Naïve Bayes and Random Forest algorithms in predicting student academic performance using a data mining approach. The dataset is the Student Performance Dataset from the UCI Machine Learning Repository, which contains academic, social, behavioral, and demographic data on students. The method follows the CRISP-DM framework through its data understanding, data preparation, modeling, and evaluation stages. Preprocessing consists of encoding categorical variables, removing the G1 and G2 variables to avoid data leakage, and transforming the target variable into a binary class label. The dataset was split into 80% training data and 20% testing data. The models were evaluated using accuracy, precision, recall, F1-score, and the confusion matrix. Random Forest achieved an accuracy of 72.15%, outperforming Naïve Bayes at 70.88%. Feature importance analysis indicates that absences, failures, age, and goout are the most influential factors in student academic performance. The study concludes that Random Forest classifies student academic performance more accurately and also supports interpretation of the factors that influence its predictions.
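The preprocessing, splitting, and evaluation steps described in the abstract can be sketched in plain Python as follows. This is a minimal illustration, not the authors' code. The function names are hypothetical, and the pass mark of 10 on the final grade G3 is an assumption, since the abstract does not state the cutoff used for the binary target. The classifiers themselves are left out; any Naïve Bayes or Random Forest implementation could be trained on the resulting split.

```python
import random

# Assumption: a final grade G3 of 10 or more counts as "pass" (label 1).
# The abstract does not state the cutoff actually used.
PASS_MARK = 10


def preprocess(rows):
    """Drop G1/G2, label-encode string columns, and binarize G3."""
    features, labels = [], []
    codebooks = {}  # column name -> {category: integer code}
    for row in rows:
        labels.append(1 if int(row["G3"]) >= PASS_MARK else 0)
        encoded = {}
        for name, value in row.items():
            if name in ("G1", "G2", "G3"):
                continue  # G1/G2 would leak the target; G3 is the target
            if isinstance(value, str):
                codes = codebooks.setdefault(name, {})
                value = codes.setdefault(value, len(codes))
            encoded[name] = value
        features.append(encoded)
    return features, labels


def split_80_20(features, labels, test_fraction=0.2, seed=42):
    """Shuffle indices reproducibly and hold out a test fraction."""
    indices = list(range(len(labels)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    def pick(idx):
        return [features[i] for i in idx], [labels[i] for i in idx]

    return pick(train_idx), pick(test_idx)


def binary_metrics(y_true, y_pred):
    """Confusion matrix plus accuracy, precision, recall, and F1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "confusion_matrix": [[tn, fp], [fn, tp]],  # rows: actual 0, 1
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

To use it, pass the dataset rows as dictionaries to `preprocess`, split the result with `split_80_20`, and compare each model's test predictions against the test labels with `binary_metrics`.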

This work is licensed under a Creative Commons Attribution 4.0 International License.





















