The key stage 2 SATs results are out, but how useful are the results? asks Anne Watson, Emeritus Professor of Mathematics Education at the University of Oxford

The aim of the new key stage 2 curriculum was to raise the standard of mathematics and make sure pupils were ready for secondary mathematics, and the test had to adhere to a test framework that related closely to the curriculum aims and content.

It is terrific news that teachers and children have worked hard to achieve some success in these tests, especially as the tests cover the whole of the year 6 curriculum, of which pupils have had only two terms' teaching by the time they sit the papers.

30% of children might now be labelled as ‘failures’ in these tests

The aim to raise standards has resulted in a new way of measuring performance under which no comparative judgements can be made. This means we do not know from the data alone whether the government has done a good job or a bad job, nor whether the test designers and score-scalers have done a good job or a bad job. All we know is that 30% of children might now be labelled as “failures” in these tests.

We know that to “pass” you had to achieve 60 of the available 110 raw marks, but we do not know how those marks were achieved: whether this means adequate performance across all three tests, or exceptional performance on two of them. The “pass mark” may have some meaning in terms of mathematical knowledge and achievement, or it may be a notional figure arrived at by some algorithm. Without looking at how marks were gained across the three mathematics papers, we do not know what it means or whether it indicates preparedness for the secondary curriculum. It would be helpful if teachers could have specific feedback about areas that need further attention before next school year, so that they can plan their teaching.

But is it the right test? It is not enough for a test merely to be harder; it must also be appropriate, mathematically coherent, and good preparation for secondary school.

It is not just the teachers who need to work smarter, but also the test-writers, as there are some flaws in the tests.

An unnecessarily high level of “test behaviour” is needed

1. These tests were not graduated, so some pupils will have found it impossible to demonstrate all their knowledge. The mark scheme does not distinguish sufficiently between pupils who can do all of a question correctly, pupils who have some understanding but make an error in the calculation, and pupils who have no understanding. In some of the reasoning questions there were only two marks available for three or four reasoning steps. The range of raw scores could have been increased to give more credit to pupils who can do parts of questions successfully. I suspect that there were many children who know more than these results imply.

2. The tests are geared strongly towards rewarding pupils who can use formal written methods in arithmetic. However, most of the questions in paper 1 could be done very quickly by mental methods, which would indicate strong number knowledge, strong conceptual understanding, appropriate flexibility in problem-solving, and high levels of numeracy. The provision of squared paper to show “workings”, and the emphasis on formal methods in the test preparation literature, may have led some pupils to embark on written methods where mental methods and number knowledge would be quicker and more appropriate. An unnecessarily high level of “test behaviour” is needed to make these choices: the question 326 ÷ 1, for example, should not have had squared paper provided.

I suspect there are children who know more than these results imply

3. On the plus side, the tests establish an appropriate standard of progress towards proportional reasoning, numerical fluency and understanding of place value, which are important foundations for secondary mathematics.

4. Papers 2 and 3 contain many interesting questions that require a flexible understanding of content. However, it does not seem fair that pupils who have some understanding of, for example, angles and area can only show that understanding if they can correctly decipher a complex situation. If they fail in their deciphering, nobody knows whether or not they have the basic knowledge which would be a foundation for secondary mathematics. Only calculation is tested in a way that shows graduated knowledge and competence. Again, I suspect there are children who know more than these results imply.

And are the tests the best they can be to develop mathematical competence, and do they make mathematical sense?

1. Given that the tests influence what is taught at KS2, the implied progression ignores international research in the development of algebraic understanding, competence and efficiency in calculation, the development of geometrical reasoning, and the processes of problem solving. The framework for test development did take account of these, but I am unconvinced that these tests adhere fully to the published testing framework.

2. The tests use the convention of separating groups of digits in numbers over 999 with a comma. The International Organization for Standardization requires that digits be grouped in threes separated by spaces, with the comma used only as a decimal sign. No justification has been given for the internationally unacceptable use of the comma in these tests.
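For readers unfamiliar with the convention at issue, here is a minimal sketch (my own illustration, not part of the article) contrasting the two groupings in Python. The function name `iso_group` is mine:

```python
def iso_group(n: int) -> str:
    """Format an integer with ISO-style grouping: digits in threes
    separated by spaces, so the comma stays free for the decimal sign."""
    # Python's format spec supports '_' as a group separator;
    # swapping it for a space approximates the ISO convention.
    return f"{n:_}".replace("_", " ")

print(f"{1234567:,}")      # test-paper convention: 1,234,567
print(iso_group(1234567))  # ISO-style grouping:    1 234 567
```

In countries that use the comma as the decimal sign, “1,234” is read as a number just over one, which is why mixing the two conventions in a test paper is problematic.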

3. The curriculum, the sample papers, the supportive material for teacher assessment, and all leading mathematical dictionaries and authorities describe a “formula” as a representation of a relationship between quantities, or a process for generating particular values. The test, however, included an algebraic question that tested the abstract use of an expression and an equation. These are not in the curriculum for KS2, where symbolism is confined to relationships that have some mathematical meaning, such as area of a rectangle = length × width. Because the test sets algebra questions in this more abstract way, teachers may be tempted to teach algebra in the traditional, meaningless way that in the past has led to confusion and dislike.

*Anne Watson is Emeritus Professor of Mathematics Education at the University of Oxford*

Very well said. Secondary schools should take a lot of care with their interpretation of students’ KS2 maths results in 2016 – and might want to think twice before using these alone as the basis for setting.

Separately, the comma issue beggars belief.

To answer some of the questions you rightly pose, some diagnostic analysis of pupils' responses to individual test questions, and to categories of questions, would be valuable for schools.

I think this may be more difficult now that schools do not get scripts back: they cannot look at several responses to one question side by side, but have to scroll through individual papers online.

On point 2 at the end: the comma as a decimal sign was not credited in the mark scheme.

The papers have already been marked online and each individual mark recorded on computer. It would therefore be easy for the DfE to make a spreadsheet available containing a row for each child followed by the 0s and 1s for each question. The school could then do a quick question-level analysis to learn lessons for the teaching in their school. With simple Excel skills, we could even evaluate this in terms of gender, SEN and PP.

Why don’t they make this data available? My staff don’t have time to scroll through 60 x 3 papers x 40 questions, typing them all in!

This would be done automatically by the DfE if they were minded to help schools improve, rather than just devise a test to pass/fail schools and their pupils.
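The question-level analysis this commenter describes could be sketched in a few lines. The column names, sample data and helper function below are illustrative assumptions, not the DfE's actual file format:

```python
import csv
import io

# Hypothetical export: one row per child, one 0/1 column per question,
# plus pupil-characteristic columns (gender, SEN, PP).
sample = """pupil,gender,SEN,PP,Q1,Q2,Q3
A,F,N,Y,1,0,1
B,M,Y,N,1,1,0
C,F,N,N,0,0,1
"""

rows = list(csv.DictReader(io.StringIO(sample)))
questions = ["Q1", "Q2", "Q3"]

# Facility (proportion of pupils answering correctly) per question.
facility = {q: sum(int(r[q]) for r in rows) / len(rows) for q in questions}

def facility_by(group_col):
    """The same breakdown split by any pupil characteristic."""
    groups = {}
    for r in rows:
        groups.setdefault(r[group_col], []).append(r)
    return {g: {q: sum(int(r[q]) for r in rs) / len(rs) for q in questions}
            for g, rs in groups.items()}

print(facility)
print(facility_by("gender"))
```

Even this toy version shows why schools want the raw spreadsheet rather than scrolling through scanned papers: once the 0s and 1s are in one table, every breakdown is a one-line aggregation.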