A hybrid of random forest and SVM for predicting student performance

Document Type: Original Article


1 Meybod University, Yazd, Iran

2 ICT Research Institute, Tehran, Iran



Identifying at-risk students is a crucial task for universities seeking to decrease the dropout rate. Doing so promotes optimal use of resources and facilitates decision making, so improving models that identify the significant factors would benefit students, the university, and society. The present work proposes educational data mining techniques and identifies methods that can help improve the algorithms' performance. In this research, a random forest with 0.94 accuracy identified the total grade point average of previous semesters, the admission quota, and the number of conditional semesters (failed courses) as the most important factors affecting students' failure or success. Comparing the performance of the SVM, LR, LDA, CART, KNN, and NB algorithms, using the most important features as input, showed that the SVM classifier performs best among them and that reducing the number of variables improves the algorithms' performance. Tuning the parameters in Python also helped achieve a desirable result in predicting at-risk students.
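
The workflow summarized in the abstract (rank features with a random forest, compare classifiers on the top-ranked features, then tune the SVM) might be sketched in Python roughly as follows. This is only an illustrative sketch assuming scikit-learn: the synthetic data set, the choice of three top features, the 5-fold cross-validation, and the parameter grid are all assumptions made for the example and are not the authors' data or settings.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for student records (hypothetical, not the paper's data).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=3, n_redundant=0, random_state=0
)

# Step 1: rank features by random forest importance and keep the top few.
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
top_k = 3  # assumed here; the abstract names three important factors
top_idx = np.argsort(forest.feature_importances_)[::-1][:top_k]
X_top = X[:, top_idx]
# Note: for an unbiased estimate, a real study would select features
# inside each training fold rather than on the full data set.

# Step 2: compare the classifiers named in the abstract on the reduced inputs.
models = {
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "LR": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "LDA": LinearDiscriminantAnalysis(),
    "CART": DecisionTreeClassifier(random_state=0),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "NB": GaussianNB(),
}
scores = {
    name: cross_val_score(model, X_top, y, cv=5).mean()
    for name, model in models.items()
}

# Step 3: tune the SVM hyperparameters with a small, assumed grid.
param_grid = {"svc__C": [0.1, 1, 10], "svc__gamma": ["scale", 0.1]}
grid = GridSearchCV(make_pipeline(StandardScaler(), SVC()), param_grid, cv=5)
grid.fit(X_top, y)

for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: mean CV accuracy = {score:.3f}")
print("Best SVM parameters:", grid.best_params_)
```

On real student records, the columns of `X` would be the candidate predictors and `y` the pass/fail (or at-risk) label; the ranking in `top_idx` then indicates which predictors the forest considers most informative.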