Who’s gaming who?

Sometimes I do wonder whether schools are gaming the performance measures or whether the performance measures are gaming the schools.

There’s already a raft of nuances in the performance table measures that marginally move the data one way or the other.

Firstly, the KS2 fine-level cap at 5.8 means that students who have a higher starting point than that are effectively lumped in with everyone else at 5.8. The reason for this is that the use of the level 6 KS2 tests varies between schools, but it has the side effect of giving the students with the highest prior attainment a slightly less challenging attainment 8 estimate.

Secondly, the attainment 8 figures this year will be hugely affected by the meddling with points scores for legacy GCSE qualifications. In short, a school whose students all attained A grades last year and this year would receive the same attainment 8 score in both years, whereas a school with C-grade students in both years would receive a much lower attainment 8 score this year, even though the actual achievements are the same. What is the likelihood that these statistics will be put forward for the grammar school argument?
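To see the mechanics, here is a minimal sketch. The points mappings below are how I understand the published 2016 and 2017 values for legacy GCSE grades – treat them as assumptions and check the DfE guidance before relying on them.

```python
# Hedged sketch: attainment 8 points per legacy GCSE grade, as I
# understand the 2016 and 2017 mappings (check the DfE guidance).
points_2016 = {'A*': 8, 'A': 7, 'B': 6, 'C': 5, 'D': 4, 'E': 3, 'F': 2, 'G': 1}
points_2017 = {'A*': 8.5, 'A': 7, 'B': 5.5, 'C': 4, 'D': 3, 'E': 2, 'F': 1.5, 'G': 1}

for grade in ('A', 'C'):
    print(grade, points_2016[grade], '->', points_2017[grade])
# A: 7 -> 7  (an all-A school's attainment 8 holds steady)
# C: 5 -> 4  (an all-C school loses a fifth of its points per slot)
```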

Finally, I've read this week that, in order to solve the problem of one outlier being able to affect the progress 8 score of a school, a cap of -2.5 or +2.5 could be introduced for each pupil. This would reduce the impact of outliers on the school score.

However, such an arbitrary cap would penalise schools with intakes of lower prior attainment and conversely favour those with higher prior attainment.

[Image: cap.PNG – the effect of a uniform P8 cap at different prior attainment starting points]

So, as can be seen, introducing an across-the-board cap benefits schools with higher prior attainers: for a student achieving nothing, the cap reduces the negative impact on those schools to a greater degree.

Furthermore, what sort of message does a cap send out? Basically, that for a student with a KS2 fine level of 5.0, the first 33 attainment points they achieve count for very little. Schools could be discouraged from persevering with the student projected to achieve 10 A8 points, because unless that student gets to 33 points it makes no difference to the school score. In the uncapped system, by contrast, improving that student's grades does carry an incentive. A worked example follows.
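Here's a minimal sketch of the capped calculation, assuming an A8 estimate of around 58 for a KS2 fine level of 5.0 (which is what the 33-point figure above implies):

```python
def p8_per_pupil(a8_points, a8_estimate, cap=None):
    """Per-pupil progress 8: (attainment minus estimate) over 10 slots,
    optionally clamped to +/- cap as proposed."""
    score = (a8_points - a8_estimate) / 10
    if cap is not None:
        score = max(-cap, min(cap, score))
    return score

# Assumed A8 estimate of ~58 for a KS2 fine level of 5.0:
print(p8_per_pupil(0, 58, cap=2.5))   # -2.5
print(p8_per_pupil(33, 58, cap=2.5))  # -2.5: the first 33 points change nothing
print(p8_per_pupil(34, 58, cap=2.5))  # -2.4: only now does improvement register
```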

Other solutions could be…

…to report on a typical P8 score for schools, which perhaps looks at the middle 90% of P8 scores, although again that could carry perverse incentives.

…to introduce a cap that slides with starting points, so that the cap for a lower prior attainer could be -1.0, whereas for a higher attainer it might be -3.0. This would work on some sort of statistical link between the A8 estimate and where the cap sits.

…to leave it alone.

Whatever the powers that be decide, I hope they consider the unintended consequences of their well-meaning actions.

Parlez-vous Progress

Alongside the DfE school performance tables, which were published last week, comes a statistical first release covering a wealth of interesting national, regional and local data.

https://www.gov.uk/government/statistics/revised-gcse-and-equivalent-results-in-england-2015-to-2016

The tables contained in this release tell us a lot of useful information, but in the hubbub that focuses on school achievements at this time of year, interesting messages can sometimes be missed.

One that has fascinated me for a long time is the strong performance of schools in London compared to their counterparts across the country. This is not a new phenomenon and has been reported on several times in recent years, in pieces such as this, this and this.

However, this year we have a new progress measure on the block (progress 8), and it is interesting to see how London fares here compared to other areas.

In order to streamline this analysis, I have categorised the DfE regions like so:

[Image: lnr – DfE regions grouped into London, the North, and the rest of England]

In brief, London hugely outperforms these areas on the Progress 8 (P8) measure. London achieves a P8 score of +0.16, whilst the North lags way behind with a score of -0.11. The rest of the country scores -0.03.

[Image: chart1 – overall Progress 8 scores for London, the North and the rest of England]

Progress 8 can be broken down into "elements" that contribute to the overall score. The area where London performs strongest is the Ebacc element, which contains academic subjects in the curriculum areas of science, humanities, languages and computer science.

[Image: chart2 – Progress 8 element scores by region, showing London's lead in the Ebacc element]

So this really appears to be a London / North divide, and again we should dig a bit deeper… as I mention above, the Ebacc element comprises sciences, humanities and languages.

When we investigate these three components, it is languages that comes out with the greatest disparity:

[Image: chart3 – Ebacc component progress by region, with languages showing the greatest disparity]

London massively outperforms the other areas in terms of progress made in languages; in fact, when we break this down to school level, we can see that 75% of schools in London achieve positive value added in languages, compared to just 45% in other areas.

This was the end of my original blog; however, I was inundated with people hypothesising that the patterns shown above were due to students entering GCSEs in their home languages, and that, with London having a more diverse population, this had the greatest impact on language value added scores in London.

It is a sensible hypothesis, but not all that easy to investigate with the data we are given.

However, what we can say with certainty is that, across the country, students with lower prior attainment at Key Stage 2 (KS2) achieve higher grades in languages on average than they do in other Ebacc areas:

[Image: langva – average attainment by KS2 prior attainment across the Ebacc components]

As we can see, students with the lowest prior attainment at KS2 attained much higher grades on average in languages than students with similar levels of prior attainment did in science or humanities (average attainment in English and maths is also a lot lower). Of course, it is worth noting that the KS2 fine levels in 2016 are derived from an average of English and maths, and therefore take no account of ability in languages, or in science or humanities for that matter.

It is a fact that in 2016, students with lower levels of prior attainment as measured by KS2 outcomes in English and maths achieved, on average, much higher grades in languages than in other subjects. 

OK, so these students are achieving great things in languages… it might be fair to ask whether the languages they are achieving good grades in are ones that the schools have painstakingly taught them, or whether they are taking examinations in languages that they already speak at home or in the community.

There's no way to tell from the national data that is publicly available. However, what we can tell is the proportion of students in a school who have English as an Additional Language (EAL). We can then look at the value-added scores for those schools in languages:

[Image: lang-va – languages value added by the proportion of EAL students in a school]

So as might be expected, progress in languages is much greater in schools where greater proportions of students potentially speak multiple languages.
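For anyone wanting to reproduce this sort of grouping from the published school-level data, here is a rough sketch. The file and column names are purely illustrative – they are not the DfE's actual field names.

```python
import pandas as pd

# Illustrative names only, not the DfE's actual field names.
schools = pd.read_csv('school_level_data.csv')  # one row per school

# Band schools by their proportion of EAL students.
schools['eal_band'] = pd.cut(schools['pct_eal'],
                             bins=[0, 10, 25, 50, 100],
                             labels=['0-10%', '10-25%', '25-50%', '50%+'])

# Mean languages value added per EAL band.
print(schools.groupby('eal_band')['languages_va'].mean())
```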

How does this translate to our regions?

[Image: prop – proportion of schools with 50% or more EAL students, by region]

London has proportionally more schools than the rest of England that have 50% or greater of their students with EAL.

Is there regional variation in school outcomes between schools with large proportions of EAL?

[Image: varegion – languages value added in high-EAL schools, by region]

Some variation exists between the North and London.

In summary: schools with greater proportions of students with EAL make more progress in languages; these schools are concentrated in London; therefore language value added scores are higher in London than in other regions. Languages form an important component of both Progress 8 and the Ebacc measure. This should be taken into account when considering the relative performance of schools in languages, and possibly, depending on the subject mix taught in schools, in general.

Interactive RAISE – 5 reports

Hello,

A brief blog to give my thoughts on 5 RAISE online interactive reports that are worth looking at straight away. I'm not saying the other reports aren't worth it… but I think these are good ones to look at first as a starting point.

How do I get to them?

Log on to RAISEOnline: https://www.raiseonline.org/

Click on Reports, then Key Stage 4.

I’ve highlighted the reports below:

[Image: raise – the KS4 report list in RAISEonline, with the five reports highlighted]

OK… some important things to note, as highlighted by the stars

KS4.P8 – when you open this report you might think, umm that’s nice, but when you go to Options – Progress Related it takes the report to another level. A much more relevant and useful level… do it.

[Image: raise1 – the KS4.P8 report with the Options – Progress Related view selected]

KS4.Thresh – the eagle-eyed amongst you will have noticed I've only highlighted 4 reports above… but I said 5 reports. Well, I think it is useful to cut the KS4.Thresh report both ways. As standard, the report opens comparing you to the specified national comparator. This is fine… and correct… though I'm seeing a lot of comments saying this report is incorrect.

Actually, it is correct: you need to read the column titled 'National comparator type' and then understand that the national data shown relates to that group. As identified in the statement of intent, it is better to compare school figures to the national figure for the comparator group, as this avoids the issue of gaps appearing to narrow or widen based on a school's overall performance.

Does that make sense? Anyway, if you wish to see it the other way, i.e. comparing like with like, choose Options – Same. This is useful, but it is not how schools should judge themselves on narrowing the gap; use the report as specified for that.

Finally the two subject level reports I’ve highlighted above are useful, as they are.

Progress Reyt*

*Reyt is a Yorkshire person's way of saying "right" or "really"; sometimes they might say "reet", or in some areas "rare". Anyway, I'm using "reyt" as it rhymes with "eight". I barely feel the play on words is worthy of the explanation but… I aim for clarity.

So what I am trying to say is Progress Right, or Getting Progress Eight Right.

On 26th September this year, schools got their first glimpse of their provisional progress 8 score for the 2016 cohort. There was considerable consternation in some schools because the figure was considerably lower than what they had calculated in their MIS or analysis system on results day. This was entirely to be expected; as I've said before, have a handle on your P8 scores using previous years' estimates, but don't publish them, and certainly don't shout them from the rooftops.

Most schools saw their provisional progress 8 score 'drop' by between 0.1 and 0.2, depending on the proportion of students with lower levels of prior attainment in the cohort, due to the increase in entries to Ebacc subjects (see the edudatalab post here).

So although the overall headline figure, as calculated in year and on results day, was 'wrong', lots of your thinking, if you were using progress 8, would have been right.

For example, the table below shows how GCSE subjects in my school fared before and after switching the estimates from the best national attainment 8 estimate of 2014 and 2015 to the 2016 estimates.

I just need to stress that we don't rank subjects in this way; I am just using it as a device to show that, irrespective of which P8 estimates we use, the subjects are in a similar order. Therefore, in year, when we are working with these estimates and scores, we are supporting and asking relevant questions of relevant areas at internal assessment points.

[Image: sbjs – subject-level progress in my school under the 2014/15 estimates versus the 2016 estimates]

N.B. P1 just means progress 1, which is what we call progress in a single subject.
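For the curious, here is a minimal sketch of how we think about P1 and P8 in year. It treats each pupil as ten slots (English and maths each filling two) and glosses over the DfE's full estimate methodology, so it is a simplification rather than the official calculation.

```python
def p1(points_achieved, slot_estimate):
    """'P1', our shorthand for progress in a single subject slot:
    points achieved minus the estimate for that slot."""
    return points_achieved - slot_estimate

def p8(slot_points, slot_estimates):
    """Per-pupil P8 as the mean difference across the ten slots
    (English and maths counted twice). A simplification of the
    official methodology, for in-year use only."""
    assert len(slot_points) == len(slot_estimates) == 10
    return sum(p1(p, e) for p, e in zip(slot_points, slot_estimates)) / 10
```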

Equally, when we talk about individual students and look to support or challenge individuals, review options and such like, you can see from the chart below that, irrespective of which set of estimates we were using or should have been using, we would have had a good idea of which students were making the least and most progress.

[Image: linesrank – student-level progress under the different sets of estimates]

Again we do not rank students like this. It is for illustrative purposes.

Students of course fit into groups, so again, if we were looking at groups or gaps, we can be fairly confident we were looking at the right sort of things.

In conclusion, I believe that using progress 8 methodology on your current cohorts is OK – certainly better than using a methodology based around thresholds or, in my opinion, than doing nothing at all.

So after all, it’s not about getting the Progress 8 headline, the flashing and dancing school score right, it’s about using the methodology in the right way, thinking about what it does tell you in the right way, and supporting your students and teachers in whatever way you feel is the right way.

Reyt?

Performance Dark Arts

I have been meaning to write this blog for a while but I keep putting it off because the subject matter is a little bit sensitive, probing into the dark areas of data mismanagement for perceived performance gains.

So let's highlight some of the ways that schools are playing the system, how these games can be spotted, and how we can gain greater insight into what is happening. Let's start with everyone's favourite qualification right now, the European Computer Driving Licence:

ECDL:

The ECDL has been around a fairly long time; certainly I can remember having the opportunity to do some similar exams whilst working for a local authority way back in 2003. What is happening now, however, is that it is being seen as an easy way for schools to gain a huge boost to their progress 8 score in the Open element. The ECDL story is what triggered me to write this piece today, when I saw the 346% increase in the number of ECDL qualifications achieved over the past year.

[Image: ee – growth in ECDL certificates awarded]

Source

So, 117,200 certificates in the past year – that almost puts it on a par with French or GCSE PE for the number of entries. The rise of ECDL has been well documented, most notably when the PiXL club touted it as an easy success route at one of their national conferences.

Much else has been written about the rise of ECDL, notably by Schools Week, here and again here. Also weighing in heavily on the debate are Edudatalab, who make it clear in this piece that students achieve disproportionately higher grades in the ECDL than they do in other qualifications…

“In the European Computer Driving Licence (ECDL) qualification, which has drawn criticism recently, the difference is staggering. On average, pupils taking the ECDL achieve 52 points – equivalent to a grade A – whereas they average 38 points – below grade C – in their GCSEs.” (Edudatalab, May 2016)

So not only can higher grades be achieved, but schools can put students through the qualification in a much shorter amount of time: 3 or 4 days of intensive teaching and testing can achieve results way in excess of what students can achieve in a two-year GCSE course. In my experience, an employer might value an A grade in mathematics more highly than a Distinction in ECDL, but the performance tables award them equal points. And this is the crux really: the fault does not lie in the existence of the ECDL qualification, but at the feet of whoever decided that the course was the equivalent of top grades in mainstream GCSEs. However, it is not mathematics that is suffering in the progress 8 world – all students have to take mathematics. The 'loser' qualifications are the GCSEs that are also available for the open bucket; they cannot compete with the power of ECDL. The dilemma schools face is this: spend two years completing an approved GCSE in a non-Ebacc subject, or spend 3 days blitzing ECDL and achieving higher tariff results to boot.

Of course, this hoo-hah has not gone unnoticed. If the DfE and Ofqual seem oblivious to this phenomenon, then at least Ofsted appear to be on the ball. In their Summer Update to Inspectors, they make reference on pages 4 and 5 to Examination Entry and Curriculum, where they state:

identify any subjects with a substantially higher percentage of entry than the national figure, taking into account any specialism of the school, and the total of all qualifications in a subject area, such as information and communication technology (ICT), or in related areas, such as ICT and computer science

Without mentioning any particular qualification, this could well be alluding to ECDL entries.

Schools need to tread carefully in this area and think carefully about future entry patterns. I'm up for rewarding schools' efforts, but not at the expense of others (remember Progress 8 is a zero-sum game, so improvements – however they are made – in one school affect the rest of the school population). If ECDL really is the answer, every student might as well do it, and then we can be judged on the remaining Progress 7.

The ECDL effect can be spotted in performance figures where schools have great Open element scores but much lower scores in the English, maths and Ebacc elements.

So three ways the ECDL issue can be resolved:

  • Reduce points, and then schools that still feel it is genuinely beneficial to their students can still offer it.
  • Tighter Ofsted scrutiny over entry patterns (this might already be happening)
  • Publish Progress 8 without ECDL, or simply the best 7 subjects – Progress 7. This would highlight any schools that are relying on one qualification to boost scores (see the sketch below).
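On that last point, here is a hypothetical sketch of what a 'Progress 7' open bucket might look like – drop each pupil's single best open-bucket qualification before filling the three slots. This is my illustration of the idea, not any official methodology:

```python
def open_bucket_without_best(open_quals):
    """Hypothetical 'Progress 7' open bucket: discard the single
    highest-scoring open qualification (e.g. an ECDL Distinction),
    then fill the three open slots with the best of the rest."""
    remaining = sorted(open_quals, reverse=True)[1:]
    return sum(remaining[:3])

# A pupil with an ECDL worth 7 points and GCSEs worth 5, 4 and 4:
print(open_bucket_without_best([7, 5, 4, 4]))  # 13, rather than 16
```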

OK next up on my list…

MFL qualifications for native speakers:

This is not going to be a popular stance (with some), but some schools simply enter their students for GCSE qualifications in their home language.

How can this be spotted?

When we look at school results and see things like: 5 A grades in Polish, 8 A* grades in Urdu, 3 top grades in Mandarin.

OK, fine – these results, like ECDL, are falsely inflating a slot in progress 8. This is not as widespread, but it is possibly even less worthy, simply because the school has not taught these children anything, whereas there is at least an element of teaching in ECDL. A GCSE in their own language is probably not that useful to them.

The counter argument to this is that these children often have lower ability in English and therefore this is simply a counterbalance to that, but it isn’t really close to being the same thing. Lower prior attainment in English will be reflected in KS2 starting points.

If Ofsted were looking, they could say: OK – 3 A grades in Polish, great, well done, you must have some great teaching in that area; can I observe part of a lesson please? Where would they go? These results are superfluous to the school system.

See also: Graded Music Qualifications, taken privately outside of school – claimed by the school – the same thing really.

Finally for today, because I have a bunch more:

The Missing Masses:

When Edudatalab published the excellent Floors, Tables and Coasters in 2015, I expected a much bigger uproar nationally about this graph:

[Image: missing – Edudatalab graph of pupils leaving school rolls during Y11]

This clearly shows an exodus of pupils in the autumn and spring terms of Y11. And what happens on the third Thursday in January? The school census, where pupils on roll at that point count in school performance figures whether they like it or not. This isn't just an odd phenomenon; this is deliberate massaging of the figures to remove 'troublesome' or, dare I suggest, 'underperforming' students before they are cemented into the school performance figures.

This is a major concern to me in education. How can these students simply be allowed to be AWOL? Do we not, as a nation, have a duty to educate these students to the best of our ability to the end of compulsory education?

Deeper than this, even, is the MAT effect: students switching schools within MATs simply to go onto another roll in January (briefly, perhaps) so that they do not adversely affect the results of a school that is in danger of inspection in the next cycle. They might never set foot in the school. I've mentioned this before here; it's impossible to prove from where I sit.

However, surely the NPD can be interrogated nationally to show numbers on roll at individual schools in the January of Y10 and then the January of Y11 the following year, and this could be used as a basis for further investigations.

I don't like this one at all; in fact I like it least of the three mentioned today. The darkest of the dark arts: a large secondary MAT with an outstanding special school included… have a think about where those students are 'going' for a few days in January.

Combined Science Grades in Progress 8

GCSE Combined Science qualifications are reported on a seventeen point scale from 1-1 to 9-9, where 9-9 is the highest grade. Results not attaining the minimum standard for the award will be reported as U (unclassified). So from 2018, students will be achieving grades like:

6-6
6-5
5-5
5-4
etc.

For progress 8 purposes and the Ebacc bucket, both grades can count, one grade can count, or neither grade can count:

So Student A:

Double Science: 6-6
Geography: 6
History: 5
French: 4

Both science grades and Geography count, and the Ebacc bucket scores 18 points.

Student B:

Double Science: 6-5
Geography: 6
History: 6
French: 4

This is the strange one. Geography and History count, and one of the science grades counts, but because the unused science grade is a 5, the 6 is worth 5.5 – i.e. the science grades are totalled and divided by two. In this scenario the Ebacc bucket contains 17.5 points.

Student C:

Double Science: 6-5
Geography: 6
History: 6
French: 6

Double Science does not count; the Ebacc bucket consists of Geography, History and French – 18 points.

Student D:

Double Science: 5-5
Geography: 6
History: 6
French: 6

Double Science does not count – 18 points.

There is no official source for this yet; as per usual, it is based on a query somebody raised via e-mail with the DfE helpdesk.

No idea at the time of writing as to what happens to the “unused” science grades and what points they score in the Open bucket.
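To make the rule concrete, here is a minimal sketch of the Ebacc bucket calculation as described above – provisional, given the lack of an official source. It tries the three possibilities (both science grades, one averaged science slot, or no science) and keeps the best:

```python
def ebacc_bucket(science_pair, other_grades):
    """Best-case Ebacc bucket (three slots) under the combined science
    rule described above. Provisional: based on a DfE helpdesk reply,
    not an official published source."""
    a, b = science_pair
    others = sorted(other_grades, reverse=True)
    options = [
        a + b + sum(others[:1]),        # both science grades + best other
        (a + b) / 2 + sum(others[:2]),  # one averaged science slot + best two others
        sum(others[:3]),                # no science + best three others
    ]
    return max(options)

print(ebacc_bucket((6, 6), [6, 5, 4]))  # Student A: 18
print(ebacc_bucket((6, 5), [6, 6, 4]))  # Student B: 17.5
print(ebacc_bucket((6, 5), [6, 6, 6]))  # Student C: 18
print(ebacc_bucket((5, 5), [6, 6, 6]))  # Student D: 18
```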

Scaled Scores and Setting Targets

Hello,

A brief blog because it’s late, but many people have been asking about this.

A lot of schools wish to be able to set end of KS3 and KS4 targets for their new Y7s based upon scaled scores.

The problem is… the desire to transpose scaled scores on top of existing target-setting processes that involve levels, sub-levels or fine levels. A lot of these attempts seem to say something like: well, 100 is a 4a, so that makes a grade B on our old target-setting system, which on the new spec is a grade 6, so we'll set a grade 6. And this moves incrementally up and down the spectrum.

I don't recommend that approach, but who am I to say it is wrong? Generally, I do not advocate saying things like "a 4a is 100", because it has been made quite clear that the old levels system and the new scaled scores do not correlate in that way.

However, I guess it is obvious to say that the children achieving the highest scaled scores would have achieved the highest NC levels, and vice versa.

Anyway – enough of what I do not recommend and a word about what I think.

Firstly you have to understand your scaled score distribution and how it compares to national.

  1. What does my scaled score distribution tell me about the nature of the cohort's attainment distribution compared to national?
  2. Does the distribution show something I’d expect? Is it typical?

You can download your scaled scores from ncatools (which I imagine many of you have done), then map your scores against national. I suggest starting with Reading and maths, and I provide a spreadsheet you can use to do this here: LINK

Then answer the questions above.

So perhaps your distribution in maths looks like this:

[Image: scaled-scores-maths – this school's maths scaled score distribution against national]

So you would say: OK, this school has a range of scaled scores that is loaded towards the lower range, with a higher concentration there and in the middle, but fewer of the higher scores.

If you agree that this is not unusual for your school, then you can move forward with confidence.

Next, I would think about what KS4 outcomes you would expect for such a cohort; if it appears similar to what you are used to, then you can fairly confidently benchmark certain areas.

Grade 7 – you can benchmark your existing proportion of A grades achieved to Grade 7, and you can benchmark the national distribution of A grades to grade 7.

So to run through an example, we know that roughly 22% of GCSE entries result in an A grade or above. We also know that Ofqual have said that roughly the same proportions will achieve grade 7s or above as achieved A grades or above.

So we can take a small leap of faith and say that the top 20-25% of scaled scores will be the ones achieving the grade 7s and above. Then you can look at the scaled score distribution and see that this lies somewhere around a scaled score of 107 / 108.

Great. Then you can repeat the process with grade 4s and grade Cs, as this is the other grade point that Ofqual have said has been 'pegged' (roughly) between the two specifications.

A*-C nationally is around 70% of grades, so it is reasonable to assume that the proportion of grades 9-4 will be the same; mapped onto the scaled score distribution, that gives you something around a scaled score of 99.

The final thing you know from Ofqual is that grades will be distributed equally, i.e. in theory it is the same 'distance' from grade 4 to 5 as it is from grade 6 to 7. Hmmm, we'll wait and see, but that's what they say. This means you can statistically fill in the gaps – sort of, anyway.
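Putting those anchor points together, here is a rough sketch of the percentile mapping, assuming ~22% at grade 7+ and ~70% at grade 4+ as quoted above, with equal spacing in between. On a national-shaped distribution it should land near the 107/108 and 99 cutoffs mentioned earlier:

```python
import numpy as np

def grade_cutoffs(scores, p_grade7=0.22, p_grade4=0.70):
    """Indicative scaled-score cutoffs: the top ~22% of scores map to
    grade 7+, the top ~70% to grade 4+, with grades 5 and 6 spaced
    evenly between (Ofqual's 'equal distance' assumption)."""
    scores = np.asarray(scores)
    cut7 = np.quantile(scores, 1 - p_grade7)  # grade 7 boundary
    cut4 = np.quantile(scores, 1 - p_grade4)  # grade 4 boundary
    step = (cut7 - cut4) / 3                  # fill in grades 5 and 6
    return {7: cut7, 6: cut4 + 2 * step, 5: cut4 + step, 4: cut4}
```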

Then you’ve got to think about what YOU know about YOUR school and the students in it, and what is an appropriate level of challenge.

i.e. if you are a high attaining school and usually achieve 40% A*-As then you are dealing with a different situation to a school that regularly achieves 5% A*-A grades.

You need to marry these factors with your thoughts about your scaled score distribution. The national data will give you a clue.

Basically you are estimating a whole bunch of things and as such you are producing estimates. Then you need to make a prediction, and then you can end up with a target. But of course everyone wants aspirational targets, so you’ll need to add a bit.

I do feel that the whole affair this year is much more of a personal process and not simply a case of plugging data into a system and trusting the numbers it spits out.

Estimation based upon historical evidence is impossible; we have new starting points and new end points.

Then again you will probably want to do something…. so….

This is what we’ve done:

Scaled Score / Target
109+ / Grade 8
106-109 / Grade 7
102-106 / Grade 6
97-102 / Grade 5
92-97 / Grade 4
<92 / Grade 3
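If you want to apply the table programmatically, a minimal lookup might look like this. Note the published ranges overlap at their endpoints, so I've assumed boundary scores round up to the higher grade:

```python
import bisect

# Lower bounds for grades 4 to 8, taken from the table above.
bounds = [92, 97, 102, 106, 109]
grades = [3, 4, 5, 6, 7, 8]

def target_grade(scaled_score):
    """Map a KS2 scaled score onto an indicative KS4 target grade.
    Assumes boundary scores (e.g. 106) take the higher grade."""
    return grades[bisect.bisect_right(bounds, scaled_score)]

print(target_grade(100))  # Grade 5
print(target_grade(109))  # Grade 8
```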

I should stress that these targets are not currently shared with students, and possibly won't be until year 9 – that's a whole other blog about life after levels. Instead, students are placed in discrete 'starting profiles' for monitoring and support purposes.

Going to stop here, could ramble all night.