by Brandt Redd, OfThat.com
In a recent Freakonomics post, Roger Pielke Jr. writes about the perils of “False Positive Science.” We constantly fight the fallacy of equating correlation with causation, but false positive science involves a subtler error. In the search for statistically significant results, researchers often try many different analytical alternatives. Their papers rarely list all of the failed models; only the one that achieves statistical significance is reported. As Joseph Simmons and colleagues write, “It is unacceptably easy to publish ‘statistically significant’ evidence consistent with any hypothesis.” This mistake is also harder for the reader to detect than the correlation/causation fallacy.
Image credit: Randall Munroe, xkcd.com
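To get a feel for how quickly trying many analyses inflates the odds of a spurious result, here is a minimal Python sketch. It rests on two assumptions that are mine, not Pielke's or Simmons's: each analysis is independent of the others, and the significance threshold is the conventional 0.05. The function name `chance_of_false_positive` is hypothetical.

```python
def chance_of_false_positive(num_analyses: int, alpha: float = 0.05) -> float:
    """Probability that at least one of `num_analyses` independent tests
    looks 'significant' at level `alpha` when no real effect exists.

    Assumption: the tests are independent. Real analytical alternatives
    usually overlap, so treat this as a rough illustration only.
    """
    # Each test avoids a false positive with probability (1 - alpha);
    # all of them avoid it with probability (1 - alpha) ** num_analyses.
    return 1.0 - (1.0 - alpha) ** num_analyses


if __name__ == "__main__":
    for k in (1, 5, 20):
        rate = chance_of_false_positive(k)
        print(f"{k:>2} analyses: {rate:.0%} chance of at least one false positive")
```

Under these assumptions, a researcher who tries twenty alternatives has roughly a 64 percent chance of finding at least one "significant" result by luck alone.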
When it comes to research into educational achievement, another issue comes into play. Since humans are natural learners, just about everything works. In his book, Visible Learning, John Hattie gives this idea rigorous treatment. Over 15 years, Hattie and his staff studied over 800 meta-analyses representing hundreds of thousands of studies into what affects student learning. For every study, they converted the results into a common effect size scale.
Roughly speaking, the effect sizes used in Visible Learning are the amount of improvement a student would make in a year, scaled to one standard deviation on a standardized test. By mapping all effects onto a common effect size scale, you can compare the relative value of different techniques and theories.
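As a rough illustration of what an effect size is, the sketch below computes one common variant, a standardized mean difference using a pooled standard deviation (often called Cohen's d). This is an assumption on my part: the meta-analyses Hattie draws on do not all use this exact formula, and the test scores here are made up purely for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def effect_size(treatment: list[float], control: list[float]) -> float:
    """Standardized mean difference between two groups (pooled-SD variant).

    Assumption: this is one common way to compute an effect size, not
    necessarily the calculation behind any specific number in the book.
    """
    n1, n2 = len(treatment), len(control)
    s1, s2 = stdev(treatment), stdev(control)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(treatment) - mean(control)) / pooled_sd


if __name__ == "__main__":
    # Hypothetical test scores, invented for this example.
    control_scores = [60, 65, 70, 75, 80]
    treatment_scores = [64, 69, 74, 79, 84]
    print(f"effect size = {effect_size(treatment_scores, control_scores):.2f}")
```

With these invented scores, a 4-point gain against a spread of about 8 points works out to an effect size of roughly 0.5.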
Among Hattie’s observations is the following:
Almost everything works. Ninety percent of all effect sizes in education are positive. Of the ten percent that are negative, about half are “expected” (e.g., effects of disruptive students); thus about 95 percent of all things we do have a positive influence on achievement. When teachers claim that they are having a positive effect on achievement or when a policy improves achievement this is almost a trivial claim: virtually everything works. One only needs a pulse and we can improve achievement. (Hattie, Visible Learning, p. 15)
On Hattie’s scale, a child simply living for a year with no schooling achieves an effect size of 0.15. “Maturation alone can account for much of the enhancement of learning.” Being present in a classroom with a teacher results in effect sizes between 0.15 and 0.40. So, for an innovation to be interesting, it must result in an effect size substantially higher than 0.40. (Hattie, p. 16).
From the book, here are some selected influences with their rank and effect sizes.
| Rank | Domain | Influence | Effect size |
|------|--------|-----------|-------------|
| 3 | Teaching | Providing formative evaluation | 0.90 |
| 7 | Teaching | Comprehensive interventions for learning disabled | 0.77 |
| 56 | Teacher | Quality of teaching | 0.44 |
| 62 | Teaching | Matching style of learning | 0.41 |
| 81 | Student | Drugs (e.g. for ADHD) | 0.33 |
| 133 | School | Open vs. traditional | 0.01 |
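One way to read the selected influences is to sort them against the two reference points from Hattie quoted earlier: 0.15 for maturation alone and 0.40 for the upper end of ordinary classroom teaching. The sketch below does that. The band labels and the names `classify`, `MATURATION`, and `TYPICAL_TEACHING` are my own shorthand, not Hattie's terminology.

```python
MATURATION = 0.15        # a year of living with no schooling (Hattie, p. 16)
TYPICAL_TEACHING = 0.40  # upper end of simply being in a classroom with a teacher


def classify(effect: float) -> str:
    """Place an effect size in a band relative to the two reference points.

    The band labels are illustrative shorthand, not terms from the book.
    """
    if effect < 0:
        return "negative"
    if effect <= MATURATION:
        return "no better than maturation"
    if effect <= TYPICAL_TEACHING:
        return "ordinary classroom range"
    return "above ordinary classroom range"


# Effect sizes copied from the selected influences listed above.
influences = {
    "Providing formative evaluation": 0.90,
    "Comprehensive interventions for learning disabled": 0.77,
    "Quality of teaching": 0.44,
    "Matching style of learning": 0.41,
    "Drugs (e.g. for ADHD)": 0.33,
    "Open vs. traditional": 0.01,
}

if __name__ == "__main__":
    for name, effect in influences.items():
        print(f"{effect:.2f}  {classify(effect):<32}  {name}")
```

Note that the sketch treats anything over 0.40 as clearing the bar, while Hattie asks for an effect "substantially higher" than 0.40. An influence at 0.41 or 0.44 lands in the top band here but is hardly distinguishable from ordinary teaching.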
There’s a ton of stuff to chew on here, far more than I can do justice to in a blog post. Hattie devotes between half a page and five pages to each of the 138 effects, and there is nuance that the numbers don’t capture. I’ll just make a few observations:
- The top five influences all involve adapting the experience according to individual student needs.
- Charter schools, something I favor, have an unimpressive effect size of 0.20. But charters were intended to enable experimentation, so we should expect them to average about the same as conventional public schools but with a much larger standard deviation. Recent studies seem to confirm that expectation, and studies are starting to identify which factors distinguish the high-performing charters from other schools.
- Smaller schools help somewhat while the impact of smaller classes is minimal. That’s probably because most small-class initiatives dilute their impact with a consequential reduction in teacher experience.
- Feedback loops, among my favorite topics, appear at #10 with an effect size of 0.73.
- Home and socioeconomic status have a huge impact. But other factors are bigger so it should be possible to overcome the achievement gap in the school.
- Phonics Instruction has an effect size of 0.60 while Whole Language has one tenth that effect. There’s much to be said for Whole Language and I tend to agree with its constructivist roots but not at the expense of phonics.
Of course, the observation that nearly everything works doesn’t eliminate the other perils of false positive science and the correlation/causation fallacy. All three of these make it possible to latch on to one’s favorite intervention while claiming to be evidence driven. To defend against this, we must seek a 2-5 times improvement in learning performance and replicable results. It also helps to be careful, honest and humble.
Note: This post is licensed under a Creative Commons Attribution 3.0 United States License.