Imagine sitting on one side of a dark partition. On the other side is someone with a ten-sided die, some sides of which have been colored red. The person rolls the die and tells you whether they rolled white or red. After 10 rolls, how close are you to knowing how many sides are red? After 50, maybe you can say it's probably between 2 and 5. After 100, perhaps it's 2 or 3. You can imagine getting more and more sure that you know how many sides are red. That's the role a confidence interval plays: taking your sample (however big or small) and using it to estimate the range the true value for the whole population probably lies in--at a given level of confidence. This is the process of finding out how good a goalie is: looking at roll after roll of a many-sided die to narrow down the possibilities.
Here is a Wilson binomial confidence interval for every active NHL goalie who played a game last season, sorted by their current team or organization. It's at 95% confidence, so we would expect the interval to miss a goalie's true save percentage only about 1 time in 20. 95% is the threshold most commonly used for statistical significance, and the point at which--crudely speaking--we basically 'know' something. Don't be intimidated by the giant table; it's just for your future reference. I'll break it down below.
SA = Shots Against
SV% = Career Save Percentage
Low CI = the low value/floor of their 95% confidence interval
High CI = the high value/ceiling of their 95% confidence interval
CI Width = the difference between high and low--a way to eyeball how much we don't know
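For anyone who wants to reproduce columns like these, here is a minimal Python sketch of a Wilson score interval. The function name `wilson_interval` and the two example goalies are made up for illustration (they are not rows from the table), and this is one standard way to write the formula, not necessarily the exact calculation behind the numbers in this post.

```python
import math
from statistics import NormalDist


def wilson_interval(saves, shots, confidence=0.95):
    """Return (low, high) Wilson score bounds for a save percentage."""
    if shots <= 0:
        raise ValueError("shots must be positive")
    # Two-sided critical value of the normal distribution (about 1.96 at 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = saves / shots  # observed save percentage
    denom = 1 + z**2 / shots
    center = (p_hat + z**2 / (2 * shots)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / shots + z**2 / (4 * shots**2)) / denom
    )
    return center - half_width, center + half_width


# Hypothetical goalies with the same .920 save percentage but very
# different workloads (illustrative numbers only).
small_low, small_high = wilson_interval(saves=460, shots=500)
large_low, large_high = wilson_interval(saves=9200, shots=10000)

for label, low, high in [
    ("500 shots", small_low, small_high),
    ("10,000 shots", large_low, large_high),
]:
    print(f"{label}: Low CI {low:.3f}, High CI {high:.3f}, CI Width {high - low:.3f}")
```

Both made-up goalies have the same SV%, but the one with 500 shots against gets a much wider interval, which is the effect the CI Width column is meant to make visible.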
The main takeaway for me is just how little we can say definitively about goalies who haven't faced a lot of shots. I picked a few goalies from that list (long-time starters, recent starters, career backups, about-to-be starters or backups), and sorted by career saves.
You might reasonably decide that you're willing to accept more risk of being wrong in exchange for a tighter confidence interval. Are you willing to bet that a goalie will pan out in that range 4 times out of 5? To flip a coin?
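The trade-off between confidence level and interval width can be sketched in a few lines of Python. Everything named here is an assumption for illustration: the `wilson_interval` helper, the hypothetical goalie with 1,840 saves on 2,000 shots, and the choice of 95%, 80% ("4 of 5 times") and 50% ("flip a coin") as the levels to compare.

```python
import math
from statistics import NormalDist


def wilson_interval(saves, shots, confidence):
    """Return (low, high) Wilson score bounds at the given confidence level."""
    # Two-sided critical value; it shrinks as the confidence level drops.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = saves / shots
    denom = 1 + z**2 / shots
    center = (p_hat + z**2 / (2 * shots)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / shots + z**2 / (4 * shots**2)) / denom
    )
    return center - half_width, center + half_width


# Hypothetical goalie: 1,840 saves on 2,000 shots (.920 SV%).
saves, shots = 1840, 2000

widths = {}
for confidence in (0.95, 0.80, 0.50):
    low, high = wilson_interval(saves, shots, confidence)
    widths[confidence] = high - low
    print(f"{confidence:.0%} confidence: {low:.3f} to {high:.3f} (width {high - low:.3f})")
```

The interval narrows at each step down in confidence, but only because you are accepting a bigger chance that the goalie's true save percentage sits outside it.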
Confidence intervals aren't too helpful for decisions on minute differences between goalies; they're more suitable for reinforcing what we already know about the unreliability of small sample sizes. Over the long haul, we can look at who the objectively best goalies in the league are--those whose low-end confidence interval values are the highest. Here are those goalies:
And here are the worst goalies--those with the lowest ceiling:
Confidence intervals are helpful in forcing us to check our expectations about unproven goalies. In the long run, they can do more to help us determine relative value than save percentage alone. As is often said, goalies are voodoo. We would do well to remember that.
Megan blogs intermittently about whichever hockey stats catch her fancy at shinnystats.wordpress.com. She can be reached on Twitter at @butyoucarlotta, or via email at shinnystats at gmail.