Opinion

Question level analysis is a waste of your time… so stop

When used correctly, data can help inform important school decisions, but there is always a danger one reads too much into the numbers

18 Nov 2025, 5:00

Our education system is awash with data. When used correctly, this can help inform important school decisions, but there is always a danger one reads too much into the numbers, spotting trends and patterns that are not really there.

The use of assessment results falls into this category. Given their role in accountability, school stakeholders obviously take a keen interest in the results (particularly when it comes to key stage 2 SATs and GCSEs).

But some might take their analyses too far, making inferences and basing decisions on somewhat shaky grounds.

Last year, half of teachers in a Teacher Tapp poll reported they were entering data for Question Level Analysis (QLA) – which analyses performance on individual questions, instead of overall scores.

Similarly, assessment organisations are regularly asked to provide their clients with “sub-domain” scores, so they can better understand pupils’ relative strengths and weaknesses.

For instance, perhaps little Jonny is great at fractions but terrible at geometry, meaning he should spend more time working on his understanding of shapes and angles.

Or perhaps the whole class is awesome at algebra but doesn’t really get statistics, potentially guiding teaching plans.

Fraught with difficulties

The trouble is that QLA is fraught with difficulties. Individual questions differ in several ways, making it difficult to know what exactly to take from a single correct or incorrect response.

Such information is also very noisy, given one is looking at single data points in isolation.

Sub-domain scores in many ways attempt to bridge this gap, providing schools with more granular information than overall test scores, but with greater reliability than looking at individual question responses.

Sounds great, right? But is such additional information really that useful to schools?

In a recent project funded by the Nuffield Foundation, we investigated this issue with respect to the key stage 2 mathematics test.

The central aim of our project was to investigate the reliability of key stage 2 SATs sub-domain scores, and to provide useful information back to schools (e.g. to inform their teaching and curriculum development).

First, the good news

We believe that producing sub-domain scores that are reliable enough for school-level reporting is indeed possible. We have managed to produce reasonably reliable school-level SATs scores for the eight areas of the key stage 2 mathematics curriculum.

But this must be done by pooling data across years and requires the use of fairly sophisticated statistical techniques (you can’t just add up the number of geometry questions pupils get right and expect to produce a reliable geometry score – which is what the Department for Education currently do).
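To make that concrete, here is a minimal illustrative sketch – synthetic data and made-up parameters, not our actual model – of why a simple raw sum over a handful of geometry questions is so noisy: the same school, with no real change in ability, can return noticeably different scores from one sitting to the next.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative assumptions: 30 pupils per school and 5 geometry items per paper
n_pupils, n_items = 30, 5

def raw_geometry_score(school_effect):
    # Pupils' abilities vary around the school's true (fixed) level
    ability = school_effect + rng.normal(0, 1, n_pupils)
    difficulty = rng.normal(0, 1, n_items)
    # Chance of a correct answer via a simple logistic model
    p_correct = 1 / (1 + np.exp(-(ability[:, None] - difficulty)))
    answers = rng.random((n_pupils, n_items)) < p_correct
    # "QLA-style" score: just add up correct geometry answers and average
    return answers.sum(axis=1).mean()

# Two sittings of an identical school: the raw-sum scores still differ
print(raw_geometry_score(0.3), raw_geometry_score(0.3))

Pooling data across years and fitting a proper measurement model is what shrinks this noise; a raw sum on its own does not.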

Now for the bad news

These scores turn out to be pretty useless, in terms of the additional information they provide.

They, in essence, give schools very little extra insight beyond what can be inferred from overall mathematics scores. This is reflected in just how similarly schools perform across the eight national curriculum domains – the vast majority of the correlations sit above 0.99.
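For anyone curious about what that check involves, the sketch below (again using synthetic, deliberately similar domain scores and hypothetical column names, rather than our actual data) shows how school-level correlations between domains can be computed.

import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Hypothetical example: one row per school, one column per maths domain.
# The synthetic scores are built to move together, mirroring the pattern we found.
overall = rng.normal(100, 15, 500)  # 500 schools' overall maths scores
scores = pd.DataFrame({f"domain_{i}": overall + rng.normal(0, 2, 500)
                       for i in range(1, 9)})

# Pairwise correlations between the eight domain scores
print(scores.corr().round(3))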

One may of course question whether this is something specific to the key stage 2 mathematics test. We have, however, also experimented with the reading data, where essentially the same result was found.

Our initial plan – once we had produced our scores – was to deliver school-level results back to schools. But – based on our findings – we no longer believe this is the right thing to do.

Schools already have more than enough data to be getting on with; all this extra information would do is give them some distracting noise.

While this may at first seem a bit of a depressing result (at least for us), the findings do have real value for schools.

We all know the workload pressures staff are under. Our results show that any school currently undertaking QLA or any kind of sub-domain analysis of the key stage 2 tests should stop. This practice is at best a waste of time and – at worst – counterproductive.

We believe the same is likely true for many other assessments schools use, including those from commercial providers and QLA of GCSEs.

In life, sometimes less is more. This is also true in terms of reporting results from assessments back to schools.

This research was conducted by John Jerrim, Dave Thomson and Natasha Plaister
