Predictive Model to Identify College Students with High Dropout Rates

Authors

DOI:

https://doi.org/10.24320/redie.2023.25.e13.5398

Keywords:

dropping out, college students, forecasting, regression analysis

Abstract

Reducing student attrition is one of the main objectives of most higher education institutions. To achieve it, universities need to accurately identify the students most likely to leave before graduating and focus their efforts on them. This has created a need for forecasting models that predict which students will eventually drop out. In this paper, we present an early warning system that automatically identifies first-semester students at high risk of dropping out. The system is based on a machine learning model trained on historical data from first-semester students. The results show that the system identifies “at-risk” students with a sensitivity of 61.97%, which makes early intervention with those students possible and can thereby help reduce the attrition rate.
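
As a reading aid for the sensitivity figure reported above, the following is a minimal sketch of how sensitivity (the share of actual dropouts that a model flags as at-risk) is computed. The function name, the label encoding (1 = dropped out, 0 = stayed enrolled), and the sample labels are illustrative assumptions; they are not the authors' code or data.

```python
def sensitivity(y_true, y_pred):
    """Fraction of actual dropouts (label 1) that were flagged as at-risk."""
    # True positives: students who dropped out and were flagged.
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    # False negatives: students who dropped out but were not flagged.
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if true_pos + false_neg == 0:
        raise ValueError("no dropout cases in y_true")
    return true_pos / (true_pos + false_neg)


# Hypothetical labels, for illustration only (1 = dropped out, 0 = stayed).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

print(sensitivity(y_true, y_pred))  # 3 of the 4 dropouts are flagged -> 0.75
```

On this reading, a sensitivity of 61.97% means that roughly 62 of every 100 students who eventually dropped out were flagged in advance. The metric says nothing about how many retained students were flagged incorrectly.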

Published

2023-05-03