Tuesday, October 13, 2009


The following article appeared in The Australian newspaper yesterday.

It is certain to generate some heated debate and will, no doubt, lead to some improvements in the NAPLAN tests.

National literacy and numeracy tests 'not reliable'
Justine Ferrari, Education writer | October 12, 2009

"The results from the national literacy and numeracy tests are unreliable and cannot accurately compare a school's performance from year to year or track the progress students make as they go through school.
Melbourne University associate professor Margaret Wu argues that the test results are too imprecise to be used as proposed by the federal government - to compare schools and identify those struggling to improve student performance.
In a report analysing the results from the National Assessment Program - Literacy and Numeracy, Professor Wu says the change in student scores from last year to this year was greater in one-third of the tests than could be statistically expected.
NAPLAN, which started last year, tests all students in Years 3, 5, 7 and 9 on reading, writing, spelling, punctuation and grammar, and numeracy.
Professor Wu says the change in student scores this year compared with last year was too high for reading in Years 3 and 5, numeracy in Years 5 and 9, punctuation and grammar in Years 3 and 7, and too low in Year 3 numeracy.
"With seven out of 20 subject areas showing aberrant results, it is difficult to have confidence in the overall NAPLAN 2009 results," she says.
The NAPLAN results are a key part of the government's commitment to report transparently on school performance, and will be published on a website in January, allowing parents to compare the results of their child and school with other schools.
Federal Education Minister Julia Gillard plans to compare school NAPLAN results in groups of "like schools" with similar student demographics, to identify those struggling to teach students and share the experience of those schools and teachers performing better than comparable schools.
"I personally think what the government proposes is going beyond the accuracy and the validity of the NAPLAN results," Professor Wu told The Australian. "It basically means linking NAPLAN results based on student performance with teacher performance. That link is conjecture.
"These are numbers. We can't allege any school or teacher is not working based on these numbers. We don't have the confidence these numbers show that these teachers haven't worked hard enough because of student performance."
Professor Wu's analysis says the change in average scores in tests around the world between groups of students at the same level is equivalent to two months of learning. But in this year's NAPLAN, the scores changed by a much larger amount, in some tests by twice as much.
In Year 3 grammar and punctuation, the average scores increased by almost four months of learning on average, with scores ranging from two months in some states to almost six in others.
"While this effect on size is not at all likely for any jurisdiction, the fact that all eight jurisdictions had such a large growth suggests there are systematic errors in the data analysis, most likely in the equating process, that account for these aberrant results," she says.
Professor Wu attributes the flaw to problems in matching the tests from one year to the next so that students could be marked on the same scale. Because tests change every year, they must be matched to ensure the questions are of a comparable difficulty, in a statistical process known as equating.
Professor Wu said the other problem was that 40 questions, which comprise a NAPLAN test in each area, were too few to assess a student's knowledge of the curriculum and to track their progress through school.
But Australian Council for Educational Research chief executive Geoff Masters said an expert advisory group of psychometricians and statisticians had approved the equating process used this year.
Professor Masters said the results between last year and this year showed small improvements in most areas of the tests but were statistically significant in only two of the 20 areas.
"There's good reason to believe some of the very small changes that occurred between 2008 and 2009 were based on the attention paid to NAPLAN in its second year of operation," he said.
Professor Masters said some states, particularly Queensland and Western Australia, made deliberate efforts to improve their results.
The equating process used this year involved having students in New Zealand do questions from the 2009 and 2008 NAPLAN tests, and their results were used to adjust the scores of Australian students.
The other way of equating tests so that students in every year of school can be marked on the same scale is to include common questions in the different tests, so that the Year 3 test will include some Year 5 questions and the Year 5 test will include some Year 3 questions.
Professor Masters said no equating process was perfect, and if a group of hundreds of New Zealand students sat the test as opposed to hundreds of thousands of Australian students, a level of imprecision was inevitable.
The Australian Curriculum, Assessment and Reporting Authority chief executive, Peter Hill, said equating errors were "extraordinarily difficult to control" and everyone had agreed they wanted to improve the system for next year's tests.
But Dr Hill said the errors did not affect the analysis of results to see which schools were doing better than others, if the comparisons were not made across the years.
Professor Wu advocates a more frequent testing system over the year, such as a teacher administering online tests of 40 questions once a month.
Instead of having one reading test for Year 3 students, Dr Hill said one way to improve the tests would be to have three or four that could test the curriculum more broadly.
Professor Masters said another option was to have tests of varying difficulty, with teachers assigning tests to their students according to the skills.
A spokesman for Ms Gillard said the government was confident of the NAPLAN tests and their results, which have cost the commonwealth and states about $14 million to develop in the first two years.
The spokesman said an extensive equating process was carried out this year, and the average scores for all grades were highly correlated, indicating the 2008 and 2009 tests were of equivalent standard and the scales were reliable."


NAPLAN Practice Tests and Answers are available from Kilbaha Multimedia Publishing:

(1) Detailed answers to the 2008 official sample NAPLAN questions.
(2) Detailed answers to the May 2008 NAPLAN tests.
(3) Detailed answers to the May 2009 NAPLAN tests.
(4) 2009 NAPLAN Trial Tests with detailed answers and responses.
(5) 2010 NAPLAN Trial Tests with detailed answers and responses.
Download order form here.