Ofqual: ‘Fundamental mistake’ to believe algorithm grades would ‘ever be acceptable to public’

The “fundamental mistake” made by Ofqual throughout this year’s exams debacle was to believe that its calculated grades “would ever be acceptable to the public”, the regulator’s chair has said.

Roger Taylor appeared in front of the education select committee this morning to answer for the errors made in deciding GCSE and A-level grades this summer, namely the “mutant algorithm” as described by prime minister Boris Johnson.

He lifted the lid on the internal battle between Ofqual and the education secretary Gavin Williamson – stating that policies such as the last-minute decision to use mock grades were getting “out of control”.

He also revealed that the regulator’s “first choice” at the beginning of lockdown was to hold socially distanced exams this summer – but Williamson opted to cancel them and use standardised grades without consultation.

In a statement provided to the committee by Taylor, published during the hearing, he said: “The blame lies with us collectively – all of us who failed to design a mechanism for awarding grades that was acceptable to the public and met the secretary of state’s policy intent of ensuring grades were awarded in a way consistent with the previous year.”

Delivering calculated grades was ‘impossible task’

But he said it was an “impossible task”, adding that a “better” algorithm would not have made the outcomes more acceptable, and rejecting the idea that a “more effective communications effort would have overcome this”.

He added that with “hindsight it appears unlikely that we could ever have delivered this policy successfully” and during the hearing stated that “the fundamental mistake was to believe that this would ever be acceptable to the public”.

Taylor told the committee of MPs that Ofqual’s “initial advice” to Williamson was that the “best way to handle this [lockdown] was to try and hold exams in a socially distanced manner, that our second option was to delay exams but the third option if neither of these were acceptable would be to have to try and look at some form of calculated grade”.

But it was Williamson who then “subsequently took the decision and announced without further consultation with Ofqual that exams were to be cancelled and a system of calculated grades were to be implemented”.

Ofqual could have rejected the government’s demand for such a system – but said it didn’t because it was “decided that this was in the best interests of students, so that they could progress to their next stage of education, training or work”.

DfE was ‘fully informed’ of the risks

Taylor also said the DfE was “fully informed about the work we were doing and the approach we intended to take to qualifications, the risks and impact on results as they emerged. However, we are ultimately responsible for the decisions that fall to us as the regulator.”

This explicitly contradicts claims by Williamson that he was unaware of problems with the results until after they were delivered to pupils.

On August 11 the Department for Education announced eleventh-hour plans to use mock grades instead of standardised grades, and did so before telling Ofqual, according to Taylor.

The chair said the regulator’s advice to the education secretary at this point was that “we could not be confident that this could be delivered within the statutory duties of Ofqual to ensure that valid and trustworthy grades were being issued”.

But the “secretary of state as he is entitled to do nonetheless announced that that was the policy of the government”.

Taylor claimed Ofqual was “very concerned that this idea of a valid mock exam had no real credible meaning but we consulted very rapidly and developed an approach that we felt would be consistent with awarding valid qualifications”.

Ofqual then agreed that approach “with the DfE and with our understanding with the secretary of state’s office” and published it on August 15.

Taylor said Ofqual was then contacted by Williamson later that evening and was “informed that this was in fact not to his mind in line with government policy”.

The Ofqual board was pulled together late that evening and they “realised we were in a situation that was rapidly getting out of control, that there were policies being recommended and strongly advocated by the secretary of state that we felt would not be consistent with our legal duties and that there was a growing risk around delivering any form of mock appeals result in a way that would be acceptable as a reasonable way to award grades”.

‘We knew algorithm gave advantage to private schools’

One of the biggest controversies of A-level results day was that private schools saw the biggest boost in top grades following the standardisation process.

Taylor said Ofqual is aware that the algorithm used “did give an advantage to private schools” considering the “inability to standardise small cohorts” but claimed it is “also true” that overall the process of standardisation “reduced the advantage enjoyed by private schools”.

“That is why we felt it was fairer to use the standardisation process as the mechanism to ensure the greatest possible fairness in the circumstances,” he continued, adding that Ofqual does “acknowledge that the level of fairness achieved was not felt to be acceptable but it did improve the level of fairness”.

One comment

  1. Huy Duong

    Ofqual’s fundamental mistake was to force the automated grading far beyond what statistics could support and still not allow meaningful human dialogue into the process. It could have done much better if it had been more honest and open. For example:
    1) It could have admitted to the public early on that statistical modelling was going to have only limited reliability. Instead, for months it hid that behind “we are still refining the details of the model”, as if “refining” could magically solve fundamental statistical problems.
    2) Once it knew from simulating 2019 actual results that grades for some cohorts may be wrong by up to 45%, it should have come clean and discussed an alternative route for those cohorts that would involve more human dialogue, but it charged on.
    3) Once it knew how unreliable the calculated grades were going to be, it should have scaled down the attempt to control grade inflation. E.g., instead of keeping that down to around 2%, it could have settled for, say, 7% as a compromise that would reduce the level of injustice (incorrect downgrading). That would in turn have reduced the number of appeals, so it could have widened the appeal criteria to something meaningful.
    Instead, it chose to feed the public spin and half-truths, while it charged on dogmatically.