Pupils taking international aptitude tests have scored much worse since organisers switched from paper exams to a computer-based system – and the new method makes comparisons highly unreliable, according to new research.
The latest Programme for International Student Assessment (PISA) used computers for the first time in 2015, making it much harder to compare results with previous years, even with adjustments made to counteract the shift.
UCL’s Professor John Jerrim made the revelations in his recent paper for the Centre for Education Economics, an independent think-tank.
Between 2000 and 2012, PISA used a paper assessment, but in 2015, pupils in 57 out of the 72 countries involved took the test on a computer. In 2018, 70 countries will use the computer-based assessment.
Looking at sample data for three countries from a trial carried out by the OECD, which runs the tests, Jerrim found that pupils taking the test on computers underperformed their peers by up to 26 points in Germany, up to 18 points in Ireland and up to 15 points in Sweden.
He also found that adjusting scores to account for the weaker performance on computers was not a suitable solution.
He concluded that the adjustments made to PISA results in 2015 “do not overcome all the potential challenges of switching to computer-based tests”, meaning that “policymakers should take great care when comparing the results across and within countries obtained through different modes”.
This has serious implications for a test which “has increasingly come to dominate education-policy discussions worldwide”, and for other assessments making a similar switch.
A spokesperson for Ofqual declined to comment on what effect the research may have on the future of testing in England, pointing out that few general qualifications are assessed using computers at present.
Different results should be expected when changing from paper to computer-based testing, according to Tim Oates of Cambridge Assessment, but this is only a problem if you claim to be “maintaining exactly the same standard”.
“Different kids tend to pick up marks differently when you change the mode of testing. Different skills are mediating the ability to demonstrate what they know,” he said.
“But the whole point of PISA is to gather cross-sectional data every few years on a sample of 15-year-olds, and then make a claim that the country is improving or deteriorating in certain ways.
Comparing years that used a different mode – that becomes an issue.”
Oates has wider concerns that the problems will be replicated in other tests, such as the Trends in International Mathematics and Science Study (TIMSS).
“TIMSS is going in the same direction,” he said. “They are adopting on-screen assessment very rapidly, mainly because that’s what PISA did. I am worried about it.”
Andrew Harland, chief executive of the Exams Officers Association, said the news cast doubt on the reliability of the assessment.
“Unless and until this sort of anomaly or questionable outcome can be resolved, it would seem unwise at the very least to change from written examinations to computer based examinations,” he told Schools Week.
“At least at the moment everyone knows and understands the outcomes of the traditional examination system. If the switch is made to computer based assessments before there is understanding of the impact of those systems on the outcomes for the candidates, those decisions will be inevitably compromised and therefore probably inaccurate.”
The OECD’s Yuri Belfali insisted that progressing to computer-based delivery of the PISA tests had been “appropriate” and “inevitable”.