All awarding systems must discriminate, but our grades discriminate in all the wrong ways, writes Dennis Sherwood

Grades, grades, grades. Why are we so obsessed with grades? Simple. Because the difference between an A and a B means a student can become a doctor, or can’t. Because a 3 rather than a 4 in GCSE English relegates a student to the “Forgotten Third”.

Grades have a peculiar duality. They appear to achieve two contradictory outcomes simultaneously: ‘homogenisation’ and ‘discrimination’. ‘Homogenisation’ because all students awarded the same grade are regarded as indistinguishable in quality. ‘Discrimination’ because grade Bs are deemed profoundly different from grade As – doctor material, or not.

But are all grade As the same? How different is the student with the highest grade A from the one with the lowest? More importantly, are all As different from – and inherently better than – all Bs? What, in truth, is the difference between the student awarded the lowest grade A, and the student awarded the top grade B? Is that a smaller difference than between the top grade A and the lowest grade A?

Are those cliff-edge grade boundaries making false – and unfair – distinctions? Every teacher agonising over which side of a grade boundary a given student will be placed this summer will be all too familiar with this dilemma.

In truth, even the “gold standard exam system” doesn’t get it right. By Ofqual’s own admission, “it is possible for two examiners to give different but appropriate marks to the same answer”. So a script given 64 marks by one examiner (or team) might equally legitimately have been given 66 by another. And if the cliff-edge grade boundary is 65, then the grade on that candidate’s certificate depends on the lottery of who marked their script.

That explains Dame Glenys Stacey’s statement to the Education Select Committee that exam grades “are reliable to one grade either way”. By any reckoning, that must mean that grades, as currently awarded, are fatally flawed. But grades have been with us for a long time, and inertia makes it hard to imagine an alternative.

Yet, there is a simple one. Ditch the grade. A student’s certificate could just as easily present assessment outcomes in the form of a mark, plus a measure of the ‘fuzziness’ associated with marking – a statistically valid way of representing those “different but appropriate marks”. ‘Fuzziness’ is real, and according to Ofqual’s own research, some subjects (such as English and History) are fuzzier than others (such as Maths and Physics).

So, for example, a certificate might show not grade B but 64 ± 5. 64 is the script’s mark, and ± 5 is the measure of the subject’s fuzziness.

Instantly, we are rid of cliff-edge grade boundaries. Anyone seeking to distinguish between a student assessed as 64 ± 5 and another assessed as 66 ± 5 will realise that these two students are in essence indistinguishable on the basis of this exam alone.
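
To make the comparison concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that two students count as indistinguishable when their fuzzy ranges overlap; that criterion and the function names are hypothetical, not part of any Ofqual method.

```python
def mark_range(mark, fuzziness):
    """Return the (low, high) range implied by mark ± fuzziness."""
    return mark - fuzziness, mark + fuzziness


def indistinguishable(mark_a, mark_b, fuzziness):
    """Illustrative assumption: two assessments cannot be told apart
    when their fuzzy ranges overlap."""
    low_a, high_a = mark_range(mark_a, fuzziness)
    low_b, high_b = mark_range(mark_b, fuzziness)
    return low_a <= high_b and low_b <= high_a


# 64 ± 5 spans 59 to 69 and 66 ± 5 spans 61 to 71, so the ranges overlap.
print(indistinguishable(64, 66, 5))
```

Under this assumption, marks of 50 and 66 with the same fuzziness of 5 would not overlap, so those two students could still be told apart.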

We need to change the rules for appeals too. As things stand, the student awarded 64 and re-marked at 66 on appeal (if that were allowed!) would see their grade rise consequentially from B to A. But 64 ± 5 explicitly recognises that marking is ‘fuzzy’, and that it is possible, nay likely, that a re-mark might be anywhere in the range from 59 to 69. And since 66 is within this range, the re-mark confirms the original assessment: only if the re-mark were greater than 69 or less than 59 would the assessment be changed.
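
The appeal rule described above can be sketched in a few lines of Python. The function name is hypothetical, and the sketch assumes that a re-mark landing exactly on 59 or 69 counts as confirming the original assessment.

```python
def remark_changes_assessment(original_mark, remark, fuzziness):
    """Return True only if the re-mark falls outside the range
    original_mark ± fuzziness (endpoints treated as inside the range)."""
    low = original_mark - fuzziness
    high = original_mark + fuzziness
    return remark < low or remark > high


# A script assessed as 64 ± 5 and re-marked at 66: inside 59 to 69.
print(remark_changes_assessment(64, 66, 5))
# The same script re-marked at 70: outside the range.
print(remark_changes_assessment(64, 70, 5))
```

On this sketch, only a re-mark below 59 or above 69 would alter a certificate showing 64 ± 5.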

Accordingly, if the ‘fuzziness’ measure is determined on a statistically sound basis, the likelihood that an appeal would result in a change in the assessment will be very low. So this idea delivers assessments that are not only fairer but much more reliable too.

Showing assessments in the form of 64 ± 5 is not perfect. No awarding system is. Issues with curriculum and the weaknesses of exams themselves would still need addressing.

But the benefits of fairness and reliability are highly significant. And shifting the responsibility for discriminating between who should and shouldn’t become a doctor onto those who will train those doctors rather than those who teach teenagers must surely be better and fairer.

That alone seems reason enough to consign grades to the graveyard.