Patterns to Identify Dropout University Students with Educational Data Mining
DOI:
https://doi.org/10.24320/redie.2021.23.e29.3918
Keywords:
dropping out, dropouts characteristics, decision making
Supporting Agencies:
UPAEP-Universidad, Universidad Tecnológica de la Mixteca
Abstract
This paper applies educational data mining algorithms to present an analysis of the most relevant characteristics of potential dropout students. The study used a dataset of 10,635 instances, acquired between 2014 and 2019 from 53 bachelor’s degree programs at a private university in the state of Puebla (Mexico). The results show that the model obtained from the decision trees performs better than other algorithms and allows for easy interpretation through decision rules. Furthermore, the model performs better than other related models in the literature that have been applied to the same problem. The methods used to select characteristics yielded the most important attributes to identify potential dropouts, such as the period, last semester completed, credits completed, attendance, courses failed, and program. These attributes and decision rules can be used to create mechanisms that help prevent dropout.
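As an illustration of how attributes such as courses failed or attendance might be ranked by relevance, the following is a minimal sketch that scores each attribute by information gain, the entropy-based criterion commonly used when growing decision trees. The abstract does not state which selection methods the study used, so information gain is an assumption here. The attribute names, the discretized values ("high"/"low"), and the six records are synthetic and invented for this example only.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target="dropout"):
    """Reduction in target entropy after splitting rows on one attribute."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Synthetic records, invented for illustration only (not the study's data).
students = [
    {"courses_failed": "high", "attendance": "low", "program": "A", "dropout": "yes"},
    {"courses_failed": "high", "attendance": "low", "program": "B", "dropout": "yes"},
    {"courses_failed": "high", "attendance": "high", "program": "A", "dropout": "yes"},
    {"courses_failed": "low", "attendance": "low", "program": "B", "dropout": "no"},
    {"courses_failed": "low", "attendance": "high", "program": "A", "dropout": "no"},
    {"courses_failed": "low", "attendance": "high", "program": "B", "dropout": "no"},
]

attributes = ["courses_failed", "attendance", "program"]
gains = {a: information_gain(students, a) for a in attributes}

# Attributes sorted from most to least informative about the dropout label.
ranking = sorted(attributes, key=gains.get, reverse=True)
print(ranking)
```

In this toy data, courses_failed separates the two classes perfectly, so it ranks first. A split on it corresponds to a decision rule of the form "if courses failed is high, then flag as potential dropout", which is the kind of readable rule the abstract attributes to decision trees.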
Published
2021-12-20
License
Copyright (c) 2021 Revista Electrónica de Investigación Educativa
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.