This year’s exams will go ahead but demands for reform are unlikely to go away, so what does the research tell us about end-point assessment? asks Cat Scutt

I’m always envious of anyone who is absolutely certain of the best thing to do, based on “the evidence”. But I’m also slightly bemused by it. “What the evidence says” is rarely, if ever, simple. Even when there appears to be a clear answer, there are caveats, boundary conditions, trade-offs – and context is key.

A good example of this is the debate around next year’s round of summer exams – which we’ve heard this week will go ahead, albeit to a slightly adapted timetable. Many argued that proceeding with exams was critical, no matter what. Others argued we should ditch them – perhaps for ever.

Given school closures, variation in access to remote learning, and uncertainty about what will be possible or safe next year, there are certainly challenges in relying on exams next summer. But is there a viable alternative?

Kaili Rimfeld and colleagues argue that their research shows teacher assessment is as reliable and stable as standardised exam scores. But on further reading, using teacher assessment to predict pupils’ GCSE and A-level grades had only 90 per cent of the accuracy of using past test scores. They suggest the difference lies in factors that may affect exam performance in addition to academic ability or preparation, such as anxiety and the pupil’s beliefs about their abilities.

Writing well before the current crisis, Daisy Christodoulou reminds us of compelling evidence that teacher assessment can be (unconsciously) biased against particular groups of students. Rob Coe notes that this may include FSM pupils and those with EAL, with SEN, or with challenging behaviour. There is also evidence that stereotypes around ethnicity can influence teacher assessment. This may be difficult to swallow, but if we want a serious discussion about teacher assessment, we need to be able to reflect on it – and on the reasons it may happen.

A less convincing argument against teacher assessment is that we can’t trust teachers – they’ll submit “implausibly high” predictions. As Sam Freedman argued on Twitter, this is not teachers exaggerating pupils’ capabilities. Their predictions can’t take into account, for example, a pupil missing out a question accidentally or a particularly challenging question. Teachers cannot know who might be affected in this way, so their predictions may overall appear inflated – but it would be absurd to suggest teachers should be downgrading their high expectations of pupils. We must trust teachers’ integrity.

Of course, the validity of any test depends not (just) on the test itself but, as Dylan Wiliam argues, on the inferences we draw from it. Considering purpose is therefore important. Exams are used to judge pupils, to decide college or university places and employment. But they are also used to measure schools’ performance – and, in some cases, even for judging the performance of individual teachers (though as EPI point out, there are all sorts of issues with this!). And as Stuart Lock and others state in a letter to the Daily Telegraph, they provide a motivator for pupils and a celebration of their achievement.

There are other issues, too – moving to teacher assessment in place of high-stakes exams could fundamentally change the role and focus of teachers, damaging the relationship between schools, pupils and parents.

But perhaps it’s not one or the other. Dylan Wiliam argued back in 1998 that teacher assessment could play a key role in summative assessment – alongside exams. A recent joint letter signed by the Chartered College suggests that centre-based assessment should be adopted for summer 2021 in addition to exams, given the current uncertainty. Efforts must of course be made to mitigate the limitations of teacher assessment, and exams must be sufficiently flexible to recognise the varied interruptions to education that have occurred.

This summer’s “mutant algorithm” fiasco illustrates a critical point about research. Something can appear to work when aggregated. But we’re not just interested in what works overall – we need to consider the individual, too. And this is where professional judgment comes in. Teacher expertise is crucial – which is why knowledge and understanding of assessment is tested as part of both the teacher and school leader routes to Chartered Status.