Here’s a guest post from Eric Fingerhut, published with love for the Washington Post, Neil Greenberg and statistical analysis.
“Stats show that Redskins’ Robert Griffin III will never be as good as his rookie season,” screamed the headline on the Washington Post’s Fancy Stats blog Monday afternoon. The third sentence of the blog post was less emphatic but still pretty definitive: “We likely have seen the best RGIII will ever be.”
There are a number of reasons I could think of why RGIII may end up never flashing the form we saw in 2012: lingering effects of his knee injury, a failure to adjust to being a pocket passer, defenses simply figuring him out. But stats that show that RGIII will never be the same, even two, three or five years from now? I’d be interested in seeing those. Yet the author of the post, Neil Greenberg, shows no such thing.
Using a statistic called “adjusted yards per attempt” — basically dividing a QB’s passing yards by his attempts while also taking into account touchdowns and interceptions — he first shows that, other than Griffin, just three rookie quarterbacks since 1970 have achieved an AYPA of 20 percent above league average. Only one of those quarterbacks had a season as good as that again, which would give Griffin a 33-percent chance of returning to form, slightly better than never. (That one QB who did get back to that level? Dan Marino, who did it five more times.) Of course, as anyone who knows anything about statistics should know, drawing inferences from a sample size of three is pretty unreliable.
So Greenberg then links to a list of all QBs who ever had a season 20 percent above the league’s AYPA average. Conveniently, there are exactly 100 on the list, 47 of whom had at least one more such season during their career (including such illustrious names as Chris Chandler, Elvis Grbac, Erik Kramer and Wade Wilson). Meanwhile, somewhat confusingly, the post also contains a graph stating that 49 percent of QBs who hit the 20 percent over AYPA average never repeat that achievement.
In other words, a post with a headline stating that RGIII will “never” be as good as 2012, and whose text claims that we’ve “likely” seen the best of RGIII, actually shows that RGIII has about a 50-percent chance of being as good as he was in his rookie season. Sure, 50 percent isn’t a guarantee, but it’s a very long way from never. And if someone tells me that something is “likely,” I usually think there’s a much better chance than 50 percent of it happening.
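For anyone who wants to check the arithmetic, here is a rough sketch using only the counts quoted above (1 of 3 rookies, 47 of 100 QBs overall). The Wilson score interval is my own choice for showing how shaky a sample of three is; it is not something the Fancy Stats post uses, and these lists are not random samples, so treat the ranges as a ballpark at best.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Counts quoted in the post: 1 of 3 rookies, 47 of 100 QBs overall.
for label, repeats, total in [("rookies", 1, 3), ("all QBs", 47, 100)]:
    low, high = wilson_interval(repeats, total)
    print(f"{label}: {repeats / total:.0%} repeated "
          f"(rough 95% range {low:.0%} to {high:.0%})")
```

With three rookies the plausible range runs from single digits to nearly 80 percent, which is another way of saying it tells you almost nothing. With 100 QBs the range tightens to roughly the high 30s through the mid 50s, still nowhere near "never."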
This failure of the stats in Fancy Stats posts to support the blog’s claims has unfortunately been all too frequent in the four months since the Post launched a blog devoted to sports statistics. In an era when many are drawn to sportswriters who use statistical analysis in their writing as an antidote to an older generation of sportswriters who frequently express strong opinions without providing many facts to back them up, Fancy Stats is offering the worst of both worlds: strong opinions backed up by stats that are frequently cherry-picked, lacking context or misused.
Take another post about the Redskins from late last month, “Redskins DeAngelo Hall was among the best corners of 2013.” Let’s not even deal with Hall specifically, but with another claim made in the post — that the Redskins’ secondary was better than the Seahawks’ last year. As the post says:
“Seattle’s Legion of Boom, on the other hand, may be more sizzle than steak. Not to say they aren’t effective, but the chart above shows that there are other secondaries that could contribute more to winning. And DHall is the best player on one of them.”
Anyone who has ever watched a football game could tell you that the Seattle secondary is better than the Washington secondary. So how could a chart show otherwise? Because Fancy Stats is using stats called Win Probability Added and Expected Points Added, which assign a probability for winning or point value to every “positive” play a cornerback or safety makes for his team. The problem is that those stats don’t include any negative plays — in other words, if Hall returns an interception 80 yards for a touchdown, that counts in this stat, but if his man catches an 80-yard touchdown, that isn’t counted. It’s only really telling half the story. How can a serious stat analyst use this one-sided stat to make such an overblown claim?
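To see why counting only the good plays can flip a ranking, here is a toy sketch. The two defenders and every point value in it are made up purely for illustration; none of it comes from the Fancy Stats chart or from real play-by-play data.

```python
# Hypothetical expected-points swings for two made-up defenders.
plays = {
    "Corner A": [+6.0, -7.0, +1.0],  # pick-six, long TD allowed, pass breakup
    "Corner B": [+2.0, +2.0, -0.5],  # steady, few big plays either way
}

def positive_only(values):
    """Tally that ignores every play that hurt the defense."""
    return sum(v for v in values if v > 0)

def net(values):
    """Tally that counts good and bad plays alike."""
    return sum(values)

for name, values in plays.items():
    print(f"{name}: positive-only {positive_only(values):+.1f}, "
          f"net {net(values):+.1f}")
```

In this invented example, Corner A looks far better on the positive-only tally (7.0 to 4.0) but worse once the touchdown he gave up is counted (0.0 to 3.5).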
And then there are the times when Fancy Stats just makes up stats in order to prove questionable conclusions. Take the post last week, “Don’t count the New York Knicks out of the playoffs just yet,” which argues that the Knicks are likely to go back to the playoffs in part because of a “soft schedule.” There are three different places the statistical analysis in this post is flawed.
First, the whole idea that strength of schedule matters in the NBA is odd, considering that every team the Knicks compete with in their conference plays 78 of the same 82 games. When teams’ schedules are 95-percent identical, is strength of schedule really a consequential factor?
But the way the strength of schedule is calculated in the post is even more troubling. The blog utilizes Las Vegas odds of each NBA team winning the championship. For example, Cleveland has a 40-percent chance of winning the championship because their odds are 2.5-1, while the odds of the Wizards winning the championship, at 33-1, give them about a 3-percent chance. Setting aside the fact that Vegas odds are set not based on the actual strength of teams, but on what will bring in the most money for the sports book, using this system results in a totally distorted stat.
If we are going to use this stat, though, Cleveland, at a 40-percent chance at the championship, is considered more than 13 times “stronger” than the Wizards. That may be true about the relative chances of the two teams to win a title, but in a typical regular season game? Cleveland isn’t 13 times tougher to beat, or 13 times better, during the regular season than the Wizards. In fact, if Cleveland wins 65 games next year, they might win about five times more games than the worst team in the league. But under the Fancy Stats system, they’re considered a 13-times tougher opponent than a team expected to make the playoffs (i.e., the Wizards). In other words, the only reason that the Knicks seem to have such a “soft schedule” is that having one fewer game against the Cavs has an effect on the Fancy Stats schedule strength formula way out of proportion to its actual difficulty.
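Here is the comparison above as a quick sketch. The conversion `1 / odds` is an assumption on my part: it is simply the reading that reproduces the 40-percent and 3-percent figures quoted above (treating 2.5-1 as conventional fractional odds would give 1 / (2.5 + 1) instead, so the exact numbers depend on that reading). The 13-win total for the league's worst team is a hypothetical stand-in for "about five times."

```python
def implied_chance(odds):
    # Assumption: 1 / odds reproduces the percentages quoted in the post.
    return 1 / odds

cavs_title = implied_chance(2.5)    # 0.40
wizards_title = implied_chance(33)  # about 0.03
title_ratio = cavs_title / wizards_title

cavs_wins = 65        # the "if Cleveland wins 65 games" scenario
worst_team_wins = 13  # hypothetical total for the league's worst team
win_ratio = cavs_wins / worst_team_wins

print(f"Title-odds ratio, Cavs vs. Wizards: {title_ratio:.1f}x")
print(f"Win ratio, Cavs vs. worst team:     {win_ratio:.1f}x")
```

The gap in title odds between the Cavs and a likely playoff team comes out more than twice as large as the gap in wins between the Cavs and the worst team in the league, which is the distortion described above.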
I could go on, but this is kind of on the long side already. And I’m sure the Washington Post thought that by starting a stats-based blog they would appear modern and aware of trends in sportswriting and sports fandom. But by running a blog that so frequently provides statistical analysis that misleads readers, they often look like a news organization that doesn’t understand math. Either that, or they just want readers to click on sensational, attention-grabbing headlines. If that’s the case, then unfortunately, Fancy Stats is doing its job.