As you may have been able to tell from my few comments here, I am a bit of a stat geek. I like to look at numbers and use them to evaluate team and individual performance. I am also a big fan of Win Probability Added (WPA) and Expected Points Added (EPA) style analysis. However, I see these kinds of stats thrown around a lot, and used in ways that they shouldn't be. So here's a quick summary of WPA and EPA's strengths and weaknesses.
The strengths:
The strengths of both WPA and EPA analysis are fairly obvious. By looking at mountains of game logs, Brian Burke (I think this is a one-man project) has figured out both how many points a team can be expected to score from a given down, distance and field position, and how likely it is that the team will win accounting for all those factors, plus game time and score. That allows us to put huge amounts of individual stats into real context. For example, a guy who throws for a lot of yards but has a low rate of success on third down, like Kyle Orton did last year, will rate more poorly by EPA than by more traditional measures like total yards or even NY/A. This is doubly true for someone who racks up garbage time stats, since WPA puts almost no value on garbage time touchdowns.
The weaknesses:
It's not clear at all that WPA is predictive going forward. It is also not a particularly good method of evaluating an individual's performance. That's because teams don't approach the game noticeably differently if the leverage index is really high (leverage just means WPA might change a lot on that play). For an example of this, let's look at last night's wild game between the Falcons and the Eagles. There were five touchdowns scored in the second half. Here is each drive at a glance:
Philly touchdown drive, 70 yards, starting WPA was .18 increased to .31 for a net WPA of .13. Scoring play was a Vick screen to Maclin for 28 yards.
Key play: interception by Asante Samuel increases WPA by .16 to .49 (Philly has a 49% chance of winning the game).
Philly touchdown drive of 22 yards, WPA increases from .49 to .62 for a net WPA of .13. Scoring play was a McCoy 8 yard run.
Philly touchdown drive, 20 yards, increased WPA from .77 to .84 for a net WPA of .07. Scoring play was a McCoy 2 yard run.
Atlanta touchdown drive of 80 yards, decreased Philly WPA from .84 to .69 for a net WPA of .15. Scoring play was a Ryan pass to Mughelli.
Atlanta touchdown drive of 80 yards, decreased Philly WPA from .68 to .24 for a net WPA of .44. Scoring play was a Turner run.
The issue here is twofold. First, there is a tendency to conflate WPA with "clutch." Although WPA does indicate the plays that turned the game, what does that say about the first Vick drive? Vick has just turned the ball over on two consecutive possessions, and had another huge fumble early in the first half. Down 11, they NEED a score to stay in the game. The Georgia Dome is rocking. And Philly goes out and executes a perfect series of McCoy runs and quick Vick passes to get back in the game. That's perfect execution in a high-pressure, must-have situation. And yet this 70-yard drive has a WPA that is exactly the same as the next 22-yard drive.
Does anybody seriously think the Eagles offense performed better on the second drive than the first? The same argument could be made for the second Falcons drive: was that three times more impressive than the first one? But if you just add up the WPAs and say "this is how well so and so performed," that's exactly what you are saying.
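To make the arithmetic concrete, here is a minimal Python sketch that recomputes the net WPA for each of the five touchdown drives from the start and end win probabilities listed above. The win probabilities are Philadelphia's, copied from the drive list; the `drives` layout and the `net_wpa` helper are just my own illustration, not anything Burke publishes.

```python
# Each drive: (team with the ball, drive length in yards,
#              Philly win probability at start, Philly win probability at end).
# Numbers are copied from the drive-by-drive list in this post.
drives = [
    ("PHI", 70, 0.18, 0.31),
    ("PHI", 22, 0.49, 0.62),
    ("PHI", 20, 0.77, 0.84),
    ("ATL", 80, 0.84, 0.69),
    ("ATL", 80, 0.68, 0.24),
]


def net_wpa(team, wp_start, wp_end):
    """Net WPA credited to the team with the ball.

    The win probabilities are from Philadelphia's point of view, so an
    Atlanta drive gets credit for however much Philly's chances dropped.
    """
    change = wp_end - wp_start
    return round(change if team == "PHI" else -change, 2)


results = [(team, yards, net_wpa(team, start, end))
           for team, yards, start, end in drives]

for team, yards, wpa in results:
    print(f"{team} {yards}-yard TD drive: net WPA {wpa:+.2f}")

# How much more credit does the second Falcons drive get than the first?
falcons_ratio = results[4][2] / results[3][2]
print(f"Second ATL drive vs. first: {falcons_ratio:.1f}x the WPA")
```

The 70-yard and 22-yard Eagles drives come out identical, and the two 80-yard Falcons drives differ by roughly a factor of three, which is the whole problem with treating a raw WPA sum as a performance grade.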
The second issue is that the leverage of one play is not independent of the plays before it. If the first Eagles touchdown drive in the second half had ended in a punt instead, the subsequent Samuel interception and short drive would have had much lower leverage, and much less WPA at stake.
So WPA is an interesting stat and can tell us a lot about the turning point plays in close games. However, as a measure of individual player performance, it is seriously problematic.
Since this is a Broncos blog, let's apply that to the Broncos' game vs. the Bengals. If you look at total WPA, basically the offense laid a giant turd and the defense was Orange Crush Mark 2, with Joe Mays starring as Randy Gradishar reincarnated. That only tells half the story, though. Thanks to excellent play by both units, the Broncos had a 92% chance of winning when they kicked to the Bengals up 17-3 in the second half.
It was the defense that allowed the Bengals a fairly easy touchdown drive. Thanks to the relatively low leverage of the situation, that score only decreased the Broncos' win probability to 76%. However, it heightened the leverage of every subsequent situation, making Orton's fumble extremely high leverage. Basically, the offense's good plays were relatively low leverage, while some defensive lapses made the offense's mistakes (Orton's fumble) higher leverage.
Again, that's not to say Orton's fumble or the offensive failings late in the fourth quarter weren't problematic, because they were. But using WPA in isolation massively understates the role of the defense in allowing the Bengals to even be competitive in the second half of the game.
Use WPA. It's a fun way to watch and understand important plays in football. But please use it in context.
Update: I can't believe I forgot the most useful application of WPA, which is to evaluate coaching decisions such as punt, kick or go for it on 4th down, or whether or not to onside kick. Burke's 4th Down Decision Study should be required reading for any NFL fan.
Update x2: Since I've kind of devolved into a discussion about leverage-adjusted win probability in the comments, I'm going to discuss it for a second here. As I mention there, both Tom Tango (one of the foremost baseball WPA guys) and Brian Burke (the only football WPA guy) agree that the best evaluation of a player would be to adjust WPA for leverage (WPA/LI). The question that often comes up is "doesn't that defeat the purpose of WPA?" The answer is, not really. Although WPA/LI will reflect expected points (EPA) in many circumstances, in other cases an outcome that makes it more likely that you will score will also make you more likely to lose. So a five-yard pass on first and ten where the receiver stays in bounds will have a positive EPA, but if you are down 14 late in the 4th quarter, it will also have a negative WPA because of the time it burns. WPA/LI therefore removes a lot of the issues tunesmith mentions in the comments, where a garbage time drive will artificially inflate numbers, while also recognizing that a player can't change the leverage of the situation he's in. All he can do is make the best possible play in terms of helping the team win at that point in the game.
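For anyone who wants to see what the leverage adjustment does, here is a rough Python sketch. Everything in it is an assumption for illustration: the two plays, their WPA values and their leverage index (LI) values are made up, not pulled from any real game log, and `leverage_adjusted_wpa` is a hypothetical helper name. I am also assuming LI is scaled so that an average situation is 1.0, which is how I understand Tango's version to work.

```python
def leverage_adjusted_wpa(wpa, leverage_index):
    """Return WPA/LI: the win probability added, scaled by how much was at stake.

    Assumes leverage_index is positive and that 1.0 means an average situation.
    """
    if leverage_index <= 0:
        raise ValueError("leverage index must be positive")
    return wpa / leverage_index


# HYPOTHETICAL plays with made-up numbers: (description, WPA, leverage index).
plays = [
    ("third-down conversion, tie game late in the 4th", 0.08, 2.0),
    ("same conversion, up 21 in garbage time", 0.01, 0.25),
]

adjusted = [(desc, wpa, leverage_adjusted_wpa(wpa, li)) for desc, wpa, li in plays]

for desc, wpa, wpa_li in adjusted:
    print(f"{desc}: raw WPA {wpa:+.2f}, WPA/LI {wpa_li:+.2f}")
```

With these invented inputs the raw WPA differs by a factor of eight, but the leverage-adjusted number is the same for both plays. That is the point of the adjustment: the player gets graded on the play he made, not on the situation he happened to be standing in.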