Wednesday, 24 September 2014

Goalies and Confidence Intervals: A Study in Uncertainty

How many saves does it take to see a goaltender's true save percentage? If you want to be pedantic, you could say we can never know a goalie's true talent--all we have over time is a steadily accruing sample that gets us closer and closer to some "true" number. One game is definitely too few; one thousand probably gets us where we want to go. But what about the in between? There's a long period of time where we sort of know what's going on, but it's hard to get objectively specific about a goalie's potential to have a career .920 save percentage versus a .925.

Imagine sitting on one side of a dark partition. On the other side is someone with a ten-sided die, some sides of which have been colored red. The person rolls the die and tells you whether they rolled white or red. After 10 rolls, how close are you to knowing how many sides are red? After 50, maybe you can say it's probably between 2 and 5. 100, perhaps a 2 or a 3. You can imagine getting more and more sure that you know how many sides are red. That's the role a confidence interval plays: taking your sample (however big or small) and using it to tell you the values the whole population probably lies between--for a given probability. This is the process of finding out how good a goalie is: looking at roll after roll of a many-sided die to narrow down the possibilities.
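If you'd like to play the die game yourself, here is a minimal sketch in Python. It assumes a die with 3 red sides and uses a Wilson interval at roughly 95% confidence (z = 1.96). The function names and the seed are mine, purely for illustration.

```python
import math
import random

def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    centre = (p_hat + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)) / denom
    return centre - half, centre + half

def roll_and_estimate(red_sides, rolls, rng):
    """Roll a ten-sided die `rolls` times; return the interval in units of red sides."""
    reds = sum(rng.randrange(10) < red_sides for _ in range(rolls))
    low, high = wilson_interval(reds, rolls)
    return 10 * low, 10 * high

rng = random.Random(2014)  # fixed seed so the run is repeatable
for rolls in (10, 50, 100, 1000):
    low, high = roll_and_estimate(red_sides=3, rolls=rolls, rng=rng)
    print(f"{rolls:>5} rolls: between {low:.1f} and {high:.1f} red sides")
```

The exact numbers depend on the rolls you happen to get, but the range should tighten as the rolls pile up.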

Here is a Wilson binomial confidence interval for every active NHL goalie who played a game last season, sorted by their current team or organization. It's at 95% confidence, so only 1 in 20 times would we expect it to be wrong. 95% is the threshold most commonly used for statistical significance, and the point at which--crudely speaking--we basically 'know' something. Don't be intimidated by the giant table; it's just for your future reference. I'll break it down below.

SA = Shots Against
SV% = Career Save Percentage
Low CI = the low value/floor of their 95% confidence interval
High CI = the high value/ceiling of their 95% confidence interval
CI Width = the difference between high and low--a way to eyeball how much we don't know
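For the curious, the columns above can be computed along these lines. This is a sketch of the standard Wilson score formula, not necessarily the exact code behind the table, and the 1,500 saves on 1,650 shots are a made-up goalie rather than anyone on the list.

```python
import math

def wilson_interval(saves, shots_against, z=1.96):
    """Wilson score interval for a save percentage (z=1.96 is roughly 95%)."""
    p_hat = saves / shots_against
    denom = 1 + z**2 / shots_against
    centre = (p_hat + z**2 / (2 * shots_against)) / denom
    half = z * math.sqrt(
        p_hat * (1 - p_hat) / shots_against + z**2 / (4 * shots_against**2)
    ) / denom
    return centre - half, centre + half

def table_row(saves, shots_against):
    """Build one row with the same columns as the legend above."""
    low, high = wilson_interval(saves, shots_against)
    return {
        "SA": shots_against,
        "SV%": round(saves / shots_against, 3),
        "Low CI": round(low, 3),
        "High CI": round(high, 3),
        "CI Width": round(high - low, 3),
    }

# Hypothetical goalie: 1,500 saves on 1,650 shots (not real data).
print(table_row(1500, 1650))
```

Unlike the textbook "plus or minus" interval, the Wilson interval is not centered exactly on the observed save percentage. It is pulled slightly toward .500, which matters most for small samples.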

The main takeaway for me is just how little we can say definitively about goalies who haven't faced a lot of shots. I picked a few goalies from that list (long-time starters, recent starters, career backups, about-to-be starters or backups), and sorted by career saves.

A starting NHL goalie plays about 55 games and faces an average of 30 shots a game, so they see roughly 1650 shots per season. Thomas Greiss, the new Penguins backup, has seen about a season's worth of shots in his career thus far, and yet we can only pinpoint his career save percentage to between .901 and .927. Part of the problem is that most goalies are bunched together above a .900 save percentage, yet there's still a huge difference between .910 and .920 in terms of goals allowed. In general, we should be a bit more cautious about assuming a goalie who played a stellar half season will keep it up in the future. I really like Eddie Lack, but at the same time we can't say a lot about him.
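To put the gap between .910 and .920 in concrete terms, here is the arithmetic as a quick sketch, using the rough 55-game, 30-shot season from above (the variable and function names are mine). Over one starter's season it works out to about 16.5 goals.

```python
GAMES_PER_SEASON = 55  # rough workload for a starter, per the estimate above
SHOTS_PER_GAME = 30    # rough average shots faced per game

shots_per_season = GAMES_PER_SEASON * SHOTS_PER_GAME  # 1650

def goals_allowed(save_pct, shots):
    """Expected goals allowed at a given save percentage over `shots` shots."""
    return (1 - save_pct) * shots

at_910 = goals_allowed(0.910, shots_per_season)
at_920 = goals_allowed(0.920, shots_per_season)
print(f"Shots per season: {shots_per_season}")
print(f".910 goalie allows about {at_910:.1f} goals")
print(f".920 goalie allows about {at_920:.1f} goals")
print(f"Difference: about {at_910 - at_920:.1f} goals")
```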

You might be willing to accept more risk of being wrong in exchange for a tighter confidence interval. Are you willing to bet on a range that only holds 4 times out of 5? On one that's no better than a coin flip?

I took three of my favorite goalies with very different career shots faced and tried out different risk levels on them (the rightmost column). You can see how that affects the confidence interval. Even if we used a confidence interval for Eddie Lack that only had a 50% likelihood of including his career save percentage, we could only posit that it falls between .906 and .918. That's still a pretty big variation.
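Trading risk for a tighter range just means changing the z-value in the interval. Here is a sketch of that, assuming the same Wilson formula and a made-up goalie with 1,000 saves on 1,100 shots (not any of the three goalies above). `NormalDist` comes from Python's standard `statistics` module.

```python
import math
from statistics import NormalDist

def wilson_interval(saves, shots, confidence=0.95):
    """Wilson score interval at a chosen two-sided confidence level."""
    # Convert the confidence level into a z-value (0.95 gives about 1.96).
    z = NormalDist().inv_cdf((1 + confidence) / 2)
    p_hat = saves / shots
    denom = 1 + z**2 / shots
    centre = (p_hat + z**2 / (2 * shots)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / shots + z**2 / (4 * shots**2)) / denom
    return centre - half, centre + half

# Hypothetical goalie: 1,000 saves on 1,100 shots (not real data).
for confidence in (0.95, 0.80, 0.50):
    low, high = wilson_interval(1000, 1100, confidence)
    print(f"{confidence:.0%} confidence: {low:.3f} to {high:.3f} (width {high - low:.3f})")
```

Lower confidence always buys a narrower range, but the width shrinks more slowly than you might hope.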

Confidence intervals aren't too helpful for decisions on minute differences between goalies; they're more suitable for reinforcing what we already know about the unreliability of small sample sizes. Over the long haul, we can look at who the objectively best goalies in the league are--those whose low-end confidence interval values are the highest. Here are those goalies:

And here are the worst goalies--those with the lowest ceiling:

(Fancy meeting you here, Ondrej...)

Confidence intervals are helpful in forcing us to check our expectations about unproven goalies. In the long run, they can do more to help us determine relative value than save percentage alone. As is often said, goalies are voodoo. We would do well to remember that.

Megan blogs intermittently about whichever hockey stats catch her fancy. She can be reached on Twitter at @butyoucarlotta, or via email at shinnystats at gmail.

1 comment:

  1. Before I ask my question I just want to say awesome job; everything was easy to understand and your analysis adds a lot to the data, so thanks! I just have a problem with Cam Talbot having a basement of 0.918 after only 560 shots against. Seems to me like there's a greater than 5% chance that he'll be below that.