Statistical analysis of examination results

Problem: The Institute of Mathematics and its Applications requires a quarterly summary of the results of examinations taken at centres across the UK. The specific analysis requested can vary, but it consistently reports on variations and trends such as the number of candidates attempting each question on the paper and the range of marks allocated to each question. Of particular interest was whether any variable could be identified as an indicator of either the specific questions answered within the exam or the overall score achieved.

Approach: Our work was undertaken in two distinct phases: data cleansing and statistical analysis. Data cleansing ensured that the analysis was performed on accurate and up-to-date information: all duplicate, erroneous and inconsistent data were identified and either modified or deleted. This reduced the volume of incorrect data entering the statistical analysis, which could otherwise have led to incorrect conclusions. A range of statistical techniques was then applied to all available data to identify key trends and patterns.
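
As an illustration only, the cleansing phase might look something like the Python sketch below. The field names (candidate, question, mark), the maximum mark of 20 and the sample records are all assumptions made for the example and are not taken from the project data. The sketch sets questionable records aside for review rather than deciding whether to modify or delete them.

```python
def cleanse(records, max_mark=20):
    """Split records into (clean, rejected).

    A record is rejected if it repeats an earlier (candidate, question)
    pair or if its mark lies outside 0..max_mark. Both rules, and the
    value of max_mark, are assumptions made for this illustration.
    """
    seen = set()
    clean, rejected = [], []
    for rec in records:
        key = (rec["candidate"], rec["question"])
        valid = 0 <= rec["mark"] <= max_mark
        if key in seen or not valid:
            rejected.append(rec)
        else:
            seen.add(key)
            clean.append(rec)
    return clean, rejected


# Made-up records, purely for illustration.
records = [
    {"candidate": "C001", "question": 1, "mark": 12},
    {"candidate": "C001", "question": 1, "mark": 12},  # duplicate entry
    {"candidate": "C002", "question": 1, "mark": 35},  # mark out of range
    {"candidate": "C002", "question": 2, "mark": 9},
]
clean, rejected = cleanse(records)
print(len(clean), "clean records;", len(rejected), "set aside for review")
```

Keeping the rejected records, rather than silently dropping them, leaves a trail that can be checked with whoever supplied the data before anything is changed.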

Methodology: The dataset comprised two populations, one substantially smaller than the other. The difference in population sizes was linked to variations in data collection methods, which meant it was not possible to assume that both groups belonged to the same population. Summary statistics, including the minimum, maximum, median and interquartile range, were calculated for each of the key characteristics of interest and presented both in a table and as a box-and-whisker plot. This relatively simple yet robust approach highlighted variations in key characteristics across the two populations. In-depth analysis of the observed variations revealed that several values were skewing the overall results. This finding was explained to the project team.
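
A minimal sketch of these summary statistics, using only the Python standard library, is given below. The marks are invented for the example, and the 1.5 x IQR rule used to flag unusual values is the common box-plot whisker convention, assumed here rather than confirmed as the rule used in the project.

```python
import statistics


def summarise(marks):
    """Return the minimum, quartiles, maximum and IQR for a list of marks."""
    q1, median, q3 = statistics.quantiles(marks, n=4, method="inclusive")
    return {
        "min": min(marks),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(marks),
        "iqr": q3 - q1,
    }


def flag_outliers(marks):
    """Return values lying more than 1.5 * IQR beyond the quartiles."""
    s = summarise(marks)
    low = s["q1"] - 1.5 * s["iqr"]
    high = s["q3"] + 1.5 * s["iqr"]
    return [m for m in marks if m < low or m > high]


# Made-up marks for two hypothetical groups of different sizes.
large_group = [42, 55, 58, 61, 63, 64, 66, 70, 71, 75, 98]
small_group = [48, 52, 57, 60, 65]

for name, marks in [("large", large_group), ("small", small_group)]:
    print(name, summarise(marks), "flagged:", flag_outliers(marks))
```

Summarising each group separately, rather than pooling them, matches the point above that the two groups could not be assumed to come from the same population. The median and interquartile range are also less sensitive than the mean to a handful of extreme values.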

To understand fully the variation in the key characteristics of interest, the calculated variables were both compared against known baselines and presented as part of a time series. Correlations between the key characteristics and the important values of interest were investigated. The results highlighted one variable that was strongly correlated across both populations. This was further confirmed through the use of Principal Component Analysis.
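
The correlation step can be sketched as follows. The variable names and values are hypothetical and do not come from the examination data. The sketch computes a Pearson correlation for each group separately. It then uses a standard result for the two-variable case: when two variables are standardised, the first principal component carries a share (1 + |r|) / 2 of the total variance, so a strong correlation shows up as a dominant first component. The project's own Principal Component Analysis presumably covered more variables than this.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def first_component_share(r):
    """Variance share of the first principal component for two
    standardised variables whose correlation is r."""
    return (1 + abs(r)) / 2


# Hypothetical data: questions attempted and total score per candidate.
groups = {
    "large": ([3, 4, 4, 5, 6, 6], [35, 48, 50, 61, 70, 74]),
    "small": ([3, 5, 6], [40, 58, 72]),
}

for name, (attempted, score) in groups.items():
    r = pearson(attempted, score)
    print(f"{name}: r = {r:.3f}, "
          f"first component share = {first_component_share(r):.3f}")
```

Computing the coefficient within each group avoids a spurious correlation that could arise simply from pooling two groups with different baselines.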

Outcome: Quarterly statistical analysis of the available data ensures that all involved in the management of the examinations can question and investigate their key areas of concern on a regular basis. Detailed information can be provided in a timely manner to support a range of evidence-based decision-making processes, including potential alterations to the wording of questions, the number of questions to be included within an exam paper, and the mathematical topics to be included within questions. Alterations to the exam paper can be incorporated as required. The management team also now have a clear understanding of the key variables that are strong predictors of a candidate's overall score.

Benefits: Our work with the Institute of Mathematics and its Applications enabled those involved to maintain a holistic overview of the examination results throughout the year. By considering the data on a regular basis, different areas of interest can be investigated as they arise, and time-based analysis can be used to highlight potentially anomalous points within a trend. The statistical analysis used varies depending on the questions posed by the management team. This has necessitated a close working relationship with staff working on this project at numerous locations throughout the UK. Many of those involved in the project do not have detailed mathematical training. As such, it is essential that the methodology, the results and their impact are presented simply, succinctly and unambiguously.

Contact: Sophie Carr