Reception baseline assessment: 8 key findings from the DfE’s pilot report

The Department for Education has published a report on its pilot of the new reception baseline assessment, which will be rolled out to all schools this September.

The National Foundation for Educational Research (NFER) was tasked with carrying out the process, which consisted of a trial with members of the September 2018 intake, and then a pilot starting in September 2019 in schools for all three main intake periods.

Here are the eight most interesting things we learned from the report.


1. ‘Early reading’ and ‘shape’ tasks removed following pilot

The 2018 trial of the assessment involved “considerably more” tasks than were needed. Tasks are split into two groups: literacy, communication and language (LCL) and maths.

Following the 2018 trial, about half of the LCL items and about a third of mathematics items were removed from the final selection.

And then, following last year’s pilot, one LCL task and one maths task were also removed.

The LCL task that was removed related to early reading. According to the report, very few pupils completed the task and “even fewer pupils answered correctly”.

The maths task that was removed related to shape. It was dropped “to balance the assessment between the two components and to better reflect the early years foundation stage, from which shape is being removed”.

“These changes will have the effect of reducing the time required to complete the assessment without compromising the quality of the assessment,” said the government.


2. Over a quarter of schools didn’t complete all assessments

Overall, 9,657 schools signed up to take part in the pilot. Of those, 8,994 schools (93 per cent) uploaded pupil data to the assessment system and just 7,046 schools (73 per cent) completed assessments for all uploaded pupils by October 25 last year.

The DfE said 415 schools officially withdrew, while some just didn’t log in to the system at all.
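The percentages above can be checked with a few lines of arithmetic. This is only a rough sketch using the figures quoted in the report; the variable names and the `pct` helper are our own, and rounding to the nearest whole per cent is an assumption about how the figures were presented.

```python
# Figures quoted from the pilot report
signed_up = 9657   # schools that signed up to the pilot
uploaded = 8994    # schools that uploaded pupil data
completed = 7046   # schools that completed assessments for all uploaded pupils


def pct(part, whole):
    """Share of `whole` represented by `part`, rounded to the nearest whole per cent."""
    return round(100 * part / whole)


uploaded_pct = pct(uploaded, signed_up)      # 93
completed_pct = pct(completed, signed_up)    # 73
not_completed_pct = 100 - completed_pct      # 27, i.e. "over a quarter"

print(uploaded_pct, completed_pct, not_completed_pct)
```

Both percentages are taken as a share of the schools that signed up, which is why 73 per cent completing leaves just over a quarter that did not.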


3. Two tasks favour boys, but girls do better overall

Data from the first half term of the pilot revealed both the LCL and maths components had one task each which exhibited “differential item functioning in favour of boys”.

However, the report said that this “should be interpreted in the context that girls performed better on average than boys on both components overall”.

In fact, girls “significantly outperformed boys on the overall assessment as well as on the individual components”.

But the report added: “There was no evidence to suggest these differences could have been due to any construct irrelevant bias and are therefore not considered a threat to the validity of the assessment.”


4. There is no ‘ceiling effect’

Analysis of the pilot tests found there was a “good spread” of pupils across the score range.

Less than 0.8 per cent of pupils scored no marks, and less than 0.4 per cent achieved full marks.

According to the DfE, this is evidence that there is “not a ceiling effect on the assessment and that it can discriminate well between pupils across the ability range”.


5. The tests demonstrated ‘high degree’ of reliability

The internal reliability of the tests was measured using a statistic called “Cronbach’s Alpha”.

According to the DfE, an Alpha coefficient of 0.7 or above is “generally considered sufficient for an assessment to be considered suitable to use for drawing inferences about groups”.

After the changes made to the assessment in response to the 2018 trial, the Cronbach’s Alpha for the whole assessment is predicted to be 0.91, demonstrating a “high degree of internal consistency reliability”.
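For readers unfamiliar with the statistic, here is a minimal sketch of the standard textbook formula for Cronbach’s Alpha, which compares the variance of individual item scores with the variance of pupils’ total scores. The function name and the toy data are invented for illustration; this is not the NFER’s code or data, and the report does not describe its calculation in this detail.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's Alpha for a list of rows, one row of item scores per pupil.

    Uses the standard formula:
    alpha = k / (k - 1) * (1 - sum(item variances) / variance of total scores)
    """
    k = len(scores[0])  # number of items
    if k < 2:
        raise ValueError("at least two items are needed")
    # Variance of each item across pupils
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    # Variance of each pupil's total score
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical data: five pupils, four items, 1 = correct, 0 = incorrect
toy_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

alpha = cronbach_alpha(toy_scores)
print(round(alpha, 2))  # 0.8 for this toy data
```

On this made-up data the result is 0.8, which would clear the 0.7 threshold quoted by the DfE. The closer the value is to 1, the more consistently the items rank pupils in the same order.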


6. SEND and cultural reviewers are happy

As part of its analysis, the government consulted a SEND reviewer and cultural reviewer to check the content would work for all pupils.

The SEND reviewer said the assessment “shows an excellent regard for the barriers that SEND children may face”.

According to the DfE, it was also felt that the guidance documentation “sets high expectations for pupils with SEND, contrary to the general tendency to assume that pupils with SEND will perform poorly”.

The cultural reviewer said all the materials were “acceptable from a cultural point of view” and “unproblematic across a wide spectrum of religious and ethnic communities”.

The inclusive nature of the images was also remarked upon since they “include variations of skin tone without exaggerating physical differences”.

The report recognised that children with English as an additional language “may have additional difficulties” with the assessment, but it was “not felt that any of the assessment content needed either to be removed or simplified”.

“The removal of items would mean that it would not be possible to ensure coverage of all content domains for all children. In addition, at all stages of the process, question wording was reviewed and simplified as far as possible.”


7. DfE plans to improve guidance

Feedback from the pilot shows that 92 per cent of respondents found training materials useful and 88 per cent felt an appropriate amount of information was provided in the administration guide.

However, the DfE has set out a list of improvements it will make to further “improve” the guidance.

This will include additional guidance specifying how resources should be set up and managed within or between items. Guidance will also be updated to exemplify ways of using the training materials to best prepare for the assessment, and to include “clear messaging about the six-week window for carrying out the assessment being from the time that an individual child starts school”.


8. Raw scores won’t be shared with schools

The DfE explored “possible unintended consequences” of the assessment, warning of the potential for “streaming or labelling” of children based on their scores, and the “potential for retrospective judgement of early years provision”.

To mitigate these risks, the DfE will not share raw scores with schools, teachers or parents.

Instead, the data will be stored in the national pupil database and will only be used to form the progress measure at the end of key stage 2.

“This will help to prevent scores being used as a grouping mechanism, and it will mean that there is a reduced risk of any early years settings being assessed based on a school’s RBA total scores,” the report said.

“This will also help to reinforce the important message that no preparation is necessary ahead of the assessment, and that neither schools nor parents need to do any practice beforehand.”

Your thoughts


  1. More information to be stored on the national pupil database. The ICO found parents are mainly unaware of the database’s existence or that it can be shared with other government departments, schools and LAs, and (under strict guidance) to third parties including those providing ‘products connected with promoting the education or wellbeing of children in England.’
    Why should the government keep such detailed information of every state-educated pupil and student in England going back to 2002 when it began?

  2. Results from the baseline assessment will be used to calculate ‘progress’ made by age 11. This will discriminate against schools with a mobile population and where the intake is inclusive, just as it does in secondary schools.
    In any case, progress isn’t linear. There are rises, dips and plateaus, setbacks and surges forward. And when the Times describes grammatical terms which pupils are expected to know at the end of junior school as ‘arcane’ and has to explain what one of them means, it’s perhaps time to think again.

  3. Julia dolman

    If the baseline assessment does not take emotional skills, independence skills and attitudes to learning on entry into account, then the whole thing surely becomes another box-ticking exercise.