Hybrid k-means–kNN approach for predicting student performance based on library usage behaviours: A case study in Tanzanian higher education

Research Article

DOI: 10.1080/20421338.2025.2588228
Author(s): Hussein Bakiri (The Institute of Finance Management (IFM), Tanzania), Rose Tinabo (The Institute of Finance Management (IFM), Tanzania)

Abstract

The complex relationship between student participation and resource usage is frequently missed by traditional predictors of academic performance, such as past academic records and demographic data. These predictors offer limited insight into real-time learning behaviours and often act as lagging indicators compared to behavioural and contextual activities. In contrast, library usage behaviours such as study frequency, resource utilization, and reading habits provide dynamic, process-oriented evaluations of student engagement. This study examines the potential of library usage as a non-traditional predictor of student performance by analyzing how frequently students use the library and the extent to which they engage in reading, studying, information-seeking, researching, and resource utilization activities. The study uses library usage and likely academic performance data from 1062 students at five higher learning institutions, collected via online Google Forms. A hybrid method combining k-means clustering and kNN classification algorithms was employed; k-means was used for the clustering step to perform the initial formulation of ‘k’. The clustering identified three distinct clusters: students with predominantly disagreeing responses (students whose performances were not caused by library usage), students with agreeing responses (students whose performances were caused by library usage), and a neutral-to-slightly-agreeing group (students whose performances showed no relationship with library usage). With a reported confidence interval of 98.61% and a p-value of 2.2e-16, the results indicate that library usage behaviours can be used to predict student performance with statistical significance. These findings suggest that higher education institutions should reinforce policies that encourage and monitor active library involvement.
The original contribution of this study is the introduction of a machine learning model that uses library usage behaviours as dynamic predictors of student performance, providing a novel perspective over conventional academic and demographic performance indicators.
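
To make the hybrid idea concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that the cluster labels produced by k-means serve as class targets for the kNN classifier, which is one plausible reading of the abstract. The data are synthetic Likert-style responses (1 to 5) on five library activities, generated only for illustration; they are not the study's data. All function and variable names (`kmeans`, `knn_predict`, `make_group`, `agreement`) are hypothetical.

```python
import random
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random initial centroids
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids

def knn_predict(train_x, train_y, query, n_neighbors=5):
    """Majority vote among the n_neighbors nearest training points."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: sq_dist(train_x[i], query))[:n_neighbors]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

# Synthetic responses (assumption): three response profiles, five activities each.
rng = random.Random(42)

def make_group(low, high, size=60):
    return [tuple(rng.randint(low, high) for _ in range(5)) for _ in range(size)]

points = make_group(1, 2) + make_group(3, 4) + make_group(4, 5)

# Step 1: k-means groups the respondents into three clusters.
labels, centroids = kmeans(points, k=3)

# Step 2: kNN learns the cluster labels on 80% of the respondents
# and is checked against the k-means labels of the remaining 20%.
order = list(range(len(points)))
rng.shuffle(order)
cut = int(0.8 * len(order))
train_x = [points[i] for i in order[:cut]]
train_y = [labels[i] for i in order[:cut]]
test_x = [points[i] for i in order[cut:]]
test_y = [labels[i] for i in order[cut:]]

hits = sum(knn_predict(train_x, train_y, q) == y for q, y in zip(test_x, test_y))
agreement = hits / len(test_x)
print(f"held-out agreement between kNN and k-means labels: {agreement:.2f}")
```

The held-out agreement here only measures how well kNN reproduces the cluster assignments on synthetic data; it says nothing about the figures reported in the abstract.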

Journal: African Journal of Science, Technology, Innovation and Development