A PERFORMANCE COMPARISON OF NAÏVE BAYES AND RANDOM FOREST IN PREDICTING STUDENT ACADEMIC PERFORMANCE

  • Marcelino Alberki Kabuhung Universitas Papua
  • Exsel Zeth Lopulalan Universitas Papua
  • Michael Sabandar Universitas Papua
  • Arif Handika Universitas Papua

Abstract

Predicting student academic performance is an important application of machine learning in education. This study compares the Naïve Bayes and Random Forest algorithms for predicting student academic performance using a data mining approach. The dataset is the Student Performance Dataset from the UCI Machine Learning Repository, which contains academic, social, behavioral, and demographic data on students. The method follows the CRISP-DM framework through its data understanding, data preparation, modeling, and evaluation stages. Preprocessing includes encoding the categorical variables, removing the G1 and G2 variables to avoid data leakage, and transforming the target variable into a binary class. The dataset was split into 80% training data and 20% testing data. The models were evaluated using accuracy, precision, recall, F1-score, and the confusion matrix. Random Forest achieved an accuracy of 72.15%, outperforming Naïve Bayes at 70.88%. Feature importance analysis indicates that absences, failures, age, and goout are the most influential factors in student academic performance. The study concludes that Random Forest classifies student academic performance better and also offers an interpretation of the factors that influence its predictions.
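
The workflow summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code. The data are synthetic and only borrow column names from the UCI dataset. The pass mark of 10 on G3, the Gaussian variant of Naïve Bayes, the stratified split, the random seeds, and the number of trees are all assumptions that the abstract does not specify, so the printed scores will not match the reported 72.15% and 70.88%.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_recall_fscore_support)
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

PASS_MARK = 10  # assumption: the abstract does not state the binarization threshold


def make_synthetic_students(n=400, seed=0):
    """Hypothetical stand-in for the UCI data: same column names, random values."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "age": rng.integers(15, 23, n),
        "absences": rng.integers(0, 30, n),
        "failures": rng.integers(0, 4, n),
        "goout": rng.integers(1, 6, n),
        "sex": rng.choice(["F", "M"], n),
        "school": rng.choice(["GP", "MS"], n),
    })
    # Synthetic final grade: more failures and absences lower it.
    g3 = 14 - 2.5 * df["failures"] - 0.1 * df["absences"] + rng.normal(0, 3, n)
    df["G3"] = g3.round().clip(0, 20).astype(int)
    # Earlier period grades track G3 closely, which is why they leak the target.
    df["G1"] = (df["G3"] + rng.integers(-2, 3, n)).clip(0, 20)
    df["G2"] = (df["G3"] + rng.integers(-1, 2, n)).clip(0, 20)
    return df


def prepare(df):
    """Binarize the target, drop G1/G2 (and G3) from the features, encode categoricals."""
    y = (df["G3"] >= PASS_MARK).astype(int)
    X = df.drop(columns=["G1", "G2", "G3"])
    X = pd.get_dummies(X, drop_first=True)  # one-hot encoding of text columns
    return X, y


def evaluate(model, X_train, X_test, y_train, y_test):
    """Fit the model and compute the metrics listed in the abstract."""
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_test, pred, average="binary", zero_division=0)
    return {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "confusion_matrix": confusion_matrix(y_test, pred, labels=[0, 1]),
    }


df = make_synthetic_students()
X, y = prepare(df)

# 80% training data, 20% testing data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

models = {
    "Naive Bayes": GaussianNB(),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=42),
}
results = {name: evaluate(model, X_train, X_test, y_train, y_test)
           for name, model in models.items()}

# Impurity-based feature importances from the fitted Random Forest.
importances = pd.Series(models["Random Forest"].feature_importances_,
                        index=X.columns).sort_values(ascending=False)

for name, scores in results.items():
    print(name, {k: round(float(v), 4) for k, v in scores.items()
                 if k != "confusion_matrix"})
    print(scores["confusion_matrix"])
print(importances)
```

To try this on the real data, `make_synthetic_students()` would be replaced by loading the UCI file into a DataFrame with the same `G1`, `G2`, and `G3` columns; the remaining steps stay unchanged.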

Published
2026-06-29
How to Cite
KABUHUNG, Marcelino Alberki et al. PERBANDINGAN KINERJA NAÏVE BAYES DAN RANDOM FOREST DALAM PREDIKSI PERFORMA AKADEMIK SISWA. Journal of Information System, Informatics and Computing, [S.l.], v. 10, n. 1, p. 234-243, June 2026. ISSN 2597-3673. Available at: <https://journal.stmikjayakarta.ac.id/index.php/jisicom/article/view/2433>. Date accessed: 30 June 2026. doi: https://doi.org/10.52362/jisicom.v10i1.2433.