Predictive Model to Identify College Students with High Dropout Rates
DOI:
https://doi.org/10.24320/redie.2023.25.e13.5398
Keywords:
dropping out, college students, forecasting, regression analysis
Abstract
Decreasing student attrition rates is one of the main objectives of most higher education institutions. However, to achieve this goal, universities need to accurately identify and focus their efforts on students most likely to quit their studies before they graduate. This has given rise to a need to implement forecasting models to predict which students will eventually drop out. In this paper, we present an early warning system to automatically identify first-semester students at high risk of dropping out. The system is based on a machine learning model trained from historical data on first-semester students. The results show that the system can predict “at-risk” students with a sensitivity of 61.97%, which allows early intervention for those students, thereby reducing the student attrition rate.
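The sensitivity figure reported above is the share of students who actually dropped out that the system flagged as "at-risk", that is, true positives divided by the sum of true positives and false negatives. The short Python sketch below illustrates that calculation only. The function name `sensitivity` and the label lists are hypothetical and are not taken from the paper's data or code.

```python
def sensitivity(y_true, y_pred, positive=1):
    """Fraction of actual positives (dropouts) that were predicted positive."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    false_neg = sum(1 for t, p in pairs if t == positive and p != positive)
    if true_pos + false_neg == 0:
        raise ValueError("y_true contains no positive cases")
    return true_pos / (true_pos + false_neg)


# Hypothetical labels for illustration only: 1 = dropped out, 0 = stayed.
actual = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]

rate = sensitivity(actual, predicted)
print(f"sensitivity = {rate:.2%}")
```

In this toy example three of the five students who dropped out were flagged, so the sensitivity is 60%. The one student wrongly flagged among those who stayed does not affect sensitivity. That error would show up in specificity or precision instead.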
Published
2023-05-03
License
Copyright (c) 2023 Jhoan Keider Hoyos Osorio, Genaro Daza Santacoloma
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.