

Adjusted Plus-Minus: An Idea Whose Time Has Come

By Steve Ilardi, Ph.D.
October 28th, 2007
Steve Ilardi is a professor of clinical psychology at the University of Kansas, and a former statistical consultant to the KU men’s basketball team under Roy Williams. With the support of assistant coaches Jerod Haase and Ben Miller, Ilardi developed and implemented an adjusted plus-minus model of player evaluation at KU, one similar to the models independently developed by Dan Rosenbaum and Jeff Sagarin. In his ‘day job’, Ilardi is a clinical researcher who has worked to develop a novel, lifestyle-based treatment for depressive illness.
Who’s better: LeBron or Kobe? Duncan or Garnett? Nash or Wade? And did Dirk really deserve to win the league’s MVP award last year? While such questions serve as the source of endless debate among NBA fans, few seem to think they’re capable of being answered in any definitive sense. I believe they are. In fact, in this article I describe a methodology that delivers such answers with impressive mathematical precision.

Most fans of the game, of course, rely on just a few basic statistics in their evaluation of player performance: points, rebounds, assists, steals, blocks, fouls, free throws, shooting percentage, and so on. The common assumption is that the best players are the ones who put up the gaudiest numbers. But this approach to player evaluation quickly runs into problems. Should all numbers count the same? Or should some stats be weighted more heavily than others? Who is more valuable – a player like Nash, who puts up eye-popping assist numbers, or one like Garnett, who’s a beast on the boards? And what about defensive play, which accounts for 50% of each player’s time on the court? Can defensive contributions really be captured in full by the paltry set of available stats like steals and blocks? Not likely.

The limitations of the box-score-based approach to evaluating players are obvious, and dissatisfaction with this tack has led many to focus instead on the game’s bottom line: one might simply seek to identify which players are winners. Certainly, some players do things that never show up in a box score but nevertheless contribute to team success: locking down the opposing team’s best scorer, setting superb screens, hustling for loose balls, altering shots, making nifty outlet passes that spark easy fast-break scores. Coaches love players who do the "little things," because coaches care a great deal about winning, and very little about stats per se.
So perhaps the best approach (the most valid approach) to player evaluation is one that identifies the players who help their teams win. This is undoubtedly the logic that underlies the widespread intuition that Tim Duncan was a better player last year than Kevin Garnett, even though KG had better overall numbers in points, rebounds, assists, steals, and FT% (he also played more minutes per game). Duncan was a winner; Garnett wasn’t. But, on the other hand, didn’t Duncan also have better teammates? How much of the Spurs’ success was directly attributable to Duncan, as opposed to the contributions of players like Ginobili, Bowen, and Parker? And how much of the T-Wolves’ demise was due to the play of Garnett, as opposed to that of his less-heralded teammates? Is there any reliable way of finding out?

Fortunately, there does exist a straightforward mathematical answer to such questions – one worked out independently by at least three different people in the early 2000s: Dr. Dan Rosenbaum (a statistical consultant to the Cleveland Cavaliers), mathematics whiz Jeff Sagarin (a consultant to the Dallas Mavericks), and myself (formerly a stat consultant to the Kansas Jayhawks). The basic model – which yields a player statistic known as adjusted plus-minus – has already been described in admirable detail by Rosenbaum (see Measuring How NBA Players Help Their Teams Win). My task in this article, therefore, is merely to provide a gist-level sense of what the model does, and a means of understanding the accompanying rank-ordered listings of each player in the league who logged at least 400 minutes during the 2006-2007 season.

Adjusted Plus-Minus: The Basics

By now, many basketball fans are familiar with the basic plus-minus concept, as it’s been showing up for years in game commentary at both the NBA and college levels. You might see it alluded to in a game graphic that looks something like this:
In essence, the plus-minus stat simply keeps track of the net changes in score when a given player is either on or off the court. Logically, of course, the players who make the greatest overall contributions to team success should be the ones with the largest positive plus-minus impact. Unfortunately, however, the plus-minus stat doesn’t always fare particularly well in the messy real world of NBA basketball. For one thing, some players spend most of their time on the court in the company of very good teammates, while others frequently play in tandem with much weaker players. The plus-minus stat doesn’t account for these inequities at all. Likewise, some guys always find themselves matched against the opponent’s best players, while others more often face the opposing team’s second unit. That’s another big problem as far as the plus-minus stat is concerned.

What’s needed, of course, is some way of adjusting the plus-minus stat to account for all such potential confounds. This is exactly what the adjusted plus-minus stat does: it reflects the impact of each player on his team’s bottom line (scoring margin), after controlling statistically for the strength of every teammate and every opponent during each minute he’s on the court. Again, the gory mathematical details of the adjusted plus-minus model have been described elsewhere (and they are beyond the scope of this article) – but it’s worth noting that the model relies on the same basic mathematical/statistical approach currently in widespread use by medical researchers and other scientists all over the world. For example, when an epidemiologist needs to estimate the relative risks posed by smoking, asbestos, and radon – calculating the odds of contracting cancer on the basis of exposure to each respective hazard – he’ll invariably use the same basic type of statistical model.
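The "controlling statistically" step can be sketched as an ordinary regression. Everything below is invented for illustration (a hypothetical four-player league, made-up stint margins); it is not the actual model used by Rosenbaum, Sagarin, or myself, but it shows the core idea: encode who is on the floor, regress the scoring margin on those indicators, and read each coefficient as that player's adjusted impact.

```python
# Toy sketch of the regression behind adjusted plus-minus (illustrative
# only). Each row is a "stint": a stretch of play with the same players
# on the floor. A player's column holds +1 (on court for the home team),
# -1 (on court for the away team), or 0 (on the bench); the target is
# the home team's scoring margin per minute during that stint.
import numpy as np

players = ["A", "B", "C", "D"]   # hypothetical four-player league

X = np.array([
    [ 1,  1, -1, -1],
    [ 1, -1,  1, -1],
    [-1,  1,  1, -1],
    [ 1, -1, -1,  1],
], dtype=float)

y = np.array([0.30, 0.10, -0.20, 0.15])   # observed margins (invented)

# Least squares: each coefficient estimates a player's impact on margin
# after controlling for everyone else on the floor. This design is
# rank-deficient (ratings are identified only up to a constant), so
# lstsq returns the minimum-norm solution; real implementations use far
# more data and handle this with regularization or a reference player.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
for name, b in zip(players, beta):
    print(f"{name}: {b:+.3f} per minute")
```

Real analyses of course involve hundreds of player columns and tens of thousands of stints, but the structure of the computation is the same.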
In other words, the adjusted plus-minus analysis is based upon a robust statistical approach that already provides a solid data-analytic foundation for many branches of science and medicine^{1}. There are other things about the model that need to be explained as well, but as a prelude to considering additional nuances, I present for your consideration the Top 20 players from the 2006-2007 season^{2} (including the playoffs), among players who logged at least 20 minutes per game:
Table 1: Top 20 Players for 2006-2007 Season
One of the first things you may notice: the list comprises players widely regarded by experts and casual fans alike as among the very best in the game. Every single player in the Top 10, for example, is a bona fide All-Star. Thus, the adjusted plus-minus model exhibits superb face validity; it yields results that in many respects mirror our own intuitions about how players should stack up against one another. And remember, it does so by means of a complex mathematical analysis that completely ignores box-score stats. (These adjusted plus-minus numbers are derived without a single direct input from stats like points scored, rebounds, assists, FG%, blocks, steals, free throws, fouls, etc.)

This is not to say that the Top 20 list contains no surprises. It does. But, then again, for this sort of sophisticated analysis to be truly worthwhile, it should yield at least some surprises. If there weren’t any at all, that would be tantamount to saying the model tells us nothing we didn’t already know. Certainly, having Kevin Garnett emerge as the league’s premier player (in a virtual tie at the top with LeBron) is a bit of a surprise. While KG is widely regarded as one of the league’s elite players, he finished only 9th in last year’s MVP voting. Is it possible that he was truly the league’s best player, and that the T-Wolves’ relatively poor record was due to the collective ineptitude of KG’s teammates (despite his own stellar play)? The adjusted plus-minus model answers the question with a resounding ‘Yes’, as all four of Garnett’s fellow starters were subpar from an adjusted plus-minus standpoint: Ricky Davis (-0.93), Mark Blount (-4.13), Trenton Hassell (-4.24), and Mike James (-3.15). But perhaps the model’s biggest surprise is the inclusion of Celtics rookie point guard Rajon Rondo among the Top 20 (he clocked in at #16).
Although Rondo is already a superb defensive player and a skilled distributor on offense, it’s probably best to interpret his gaudy plus-minus rating with a bit of caution. For one thing, his rating is based on a fairly limited number of total minutes (he averaged less than 23 mpg last year). Since a player’s adjusted plus-minus estimate is just that – an estimate – the stat can be a bit ‘noisy’ and error-prone^{3} unless it’s based on a large number of minutes (since each additional minute represents additional data for the model to use in refining its estimates). Fortunately, however, the model can even tell us precisely how noisy each player’s rating is. In the case of Rondo, the model specifies a roughly 96% probability that he actually was an above-average player last year^{4} (i.e., that his true adjusted plus-minus value was greater than 0), but about a 40% probability that he was not good enough to be in the Top 20. Thus, I’m convinced that Rondo’s appearance on the Top 20 list was a bit of a fluke, attributable to the fact that his estimated plus-minus value was based on a suboptimal number of minutes.

It’s important to point out that, in order to help improve the accuracy of the 2006-2007 adjusted plus-minus estimates, I added data to the model from the preceding season (2005-2006), but weighted those data much less heavily^{5} than data from last season. This had the net effect of yielding much better (less noisy) player estimates – estimates that still primarily reflect player performance from the 2006-2007 season. It’s also worth noting that all playoff games from 2006-2007 were included in the model; in fact, they were weighted twice as heavily as regular season games, thereby reflecting the heightened importance of the NBA’s "second season."
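The weighting scheme just described amounts to a weighted least-squares fit. The relative weights (3:1 for current versus prior season, doubled again for playoff games) come from the discussion above; the tiny design matrix and margins below are invented purely for illustration, and this is a sketch rather than the model's actual implementation.

```python
# Toy sketch of weighted least squares with season/playoff weighting.
import numpy as np

# columns = two hypothetical players; rows = stints (+1 home, -1 away)
X = np.array([
    [ 1, -1],
    [ 1,  1],
    [-1,  1],
    [ 1, -1],
    [-1,  1],
], dtype=float)
y = np.array([0.2, 0.1, -0.1, 0.3, 0.0])   # margins per minute (invented)

# observation weights: prior season = 1, current regular season = 3,
# current playoffs = 6 (the 3x season weight doubled for playoff games)
w = np.array([1.0, 1.0, 3.0, 3.0, 6.0])

# weighted least squares via square-root-of-weight row scaling
sw = np.sqrt(w)
beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
print("weighted player estimates:", beta)
```

Down-weighting the older observations lets them stabilize the estimates without letting stale performance dominate the current-season signal.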
(In the case of Rondo, alas, neither of these supplemental sources of data was available to help stabilize his plus-minus estimate: he was not even in the league in 2005-2006, and the Celtics missed last year’s playoffs, so Rondo’s plus-minus value was derived exclusively from the 2006-2007 regular season.)

A comprehensive listing of all NBA players with at least 400 minutes played in the 2006-2007 regular season, rank-ordered on the basis of the adjusted plus-minus statistic, is provided in Table 2 (below) and Table 3 (at the end of the article). (Please note that separate listings are provided for players who averaged above and below 20 minutes per game, respectively.)
Table 2:
Future Directions

Player Interactions. One of the most frequently encountered questions about the adjusted plus-minus model concerns the fact that some players seem to be much more effective when they’re on the court with a specific teammate, and much less effective with others. For example, two seasons ago it was widely believed that Damon Jones was very effective when on the court in tandem with Shaq (whose inside presence commanded double-teams that often left Jones free to nail open 3-pointers), but not very effective in most other situations. In statistical terms, this is the equivalent of claiming that the main plus-minus effect of Jones was modified by a significant Jones-by-Shaq interaction effect. Luckily, the adjusted plus-minus model is perfectly capable of detecting and accurately estimating such player-by-player interactions whenever they exist. It’s simply a matter of adding the appropriate interaction terms to the statistical analysis and evaluating them. I am now in the process of conducting such analyses with last season’s data, but it’s been a slow, tedious, time-consuming process, so I may not have any final results to report for a couple of months. I’ll be sure to add an update (with accompanying table), however, just as soon as those analyses are completed.

Continuous In-Season Updates. Aaron Barzilai, to whom I’m deeply indebted for providing the dataset upon which these analyses are based (www.basketballvalue.com), has suggested to me that we consider providing continuously updated adjusted plus-minus numbers for the coming season. If we can work out the logistical kinks, this will be a coming attraction . . . so please stay tuned.

Per-Minute Versus Per-Possession Estimates. Given the intuitive ease of understanding plus-minus estimates on a per-minute basis, I decided to conduct the present analyses using a per-minute metric.
In contrast, Dan Rosenbaum and Dave Lewin, in previous analyses published on this site, have used the per-possession metric, which can be advantageous when it comes to modeling the impact of particular team lineups that are on the court together for an unequal number of possessions (e.g., 4 possessions on offense but only 3 possessions on defense). The more I have reflected on the relative strengths and weaknesses of each approach, the more I have come to suspect that Rosenbaum’s per-possession approach is the more accurate one (i.e., with slightly lower standard errors of estimate). Therefore, I plan to repeat the present analyses on a per-possession basis, which will provide a nice empirical test of the degree to which the metric of analysis (per-minute versus per-possession) matters at all.

Prediction. The ultimate validation of any scientific model is derived from its ability to make useful predictions of future events. I have claimed herein that the adjusted plus-minus model provides a valid estimate of each player’s ultimate effectiveness. But, of course, there are myriad other "comprehensive statistics" out there, about which similar claims have been made (John Hollinger’s PER rating and David Berri’s Wages of Wins metric come quickly to mind). How can I actually prove that the adjusted plus-minus rating is superior? The best way, of course, is simply to pit it in a head-to-head contest with other rating systems in predicting team outcomes in future seasons. Presumably, one could come up with an aggregate (weighted-average) rating for each team for any given season, based upon each player’s expected minutes played and his most recent set of ratings (perhaps tweaked a bit to reflect any anticipated improvement or decline as a function of player age, cumulative games played in a career [i.e., mileage], recent injuries, etc.).
I have not yet made the requisite calculations for the upcoming season, but I’m confident that such an analysis will show the adjusted plus-minus statistic to yield better predictions than those afforded by rival measures. Of course, a sufficient sample size for testing this will probably require at least a few upcoming seasons’ worth of data, but it’s an analysis just crying out to be done!
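The minutes-weighted team aggregate described above can be sketched in a few lines. The roster numbers here are invented, and a simple minutes-share average of player ratings is one plausible reading of the proposed aggregation, not my actual formula (which would also fold in age, mileage, and injury adjustments):

```python
# Sketch of an aggregate (weighted-average) team rating built from
# individual adjusted plus-minus values and expected minutes per game.

def team_rating(players):
    """players: list of (adjusted_plus_minus, expected_minutes_per_game).

    Returns the minutes-weighted average rating across the roster.
    """
    total_minutes = sum(m for _, m in players)
    return sum(apm * m for apm, m in players) / total_minutes

# hypothetical roster: ratings and expected minutes are invented;
# minutes sum to 240 (5 players x 48 minutes per game)
roster = [
    ( 8.0, 36.0),   # star
    ( 4.0, 34.0),
    ( 1.0, 32.0),
    (-1.0, 30.0),
    (-2.0, 28.0),
    (-3.0, 80.0),   # bench minutes pooled
]

print(f"aggregate team rating: {team_rating(roster):+.2f}")
```

Comparing these aggregates against actual season outcomes, for rosters built from each rival rating system, would be one concrete way to run the head-to-head prediction contest.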
Table 3:
1 = This statistical model is commonly referred to as the general linear model (GLM), which encompasses an array of specific techniques like multiple regression, analysis of variance, discriminant function analysis, logistic regression, etc.

2 = All analyses are based upon a comprehensive game-by-game dataset provided by Aaron Barzilai, founder of www.basketballvalue.com.

3 = Each player’s plus-minus value is derived as a parameter estimate in a regression model, and thus each such estimate has an associated standard error (standard errors for each player are provided in the leftmost column of Tables 2 and 3). This standard error essentially tells us how noisy any given player’s adjusted plus-minus value is. If you were to randomly and repeatedly sample an equivalent number of minutes on the court for a given player, his estimated plus-minus values would fall in a roughly bell-shaped curve around the true value (sometimes it would be estimated on the high side, sometimes low). The standard error gives us key information about how far (on average) the player’s estimated value will be from his true value; in technical terms, it represents the standard deviation in the distribution of parameter estimates with repeated sampling.

4 = Rondo’s estimated plus-minus was 5.59, with a standard error of 3.12. He was thus 1.79 standard deviations (5.59/3.12) above a value of 0. Referring to a standard normal distribution table (or z-statistic), we find that a value would fall 1.79 or more standard deviations above the mean only about 3.6% of the time.

5 = Specifically, data from the 2006-2007 season were weighted three times as heavily as data from the 2005-2006 season.
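The arithmetic in footnote 4 can be reproduced with the standard normal CDF. Only the estimate (5.59) and standard error (3.12) come from the footnote; the rest is a stock probability computation, not the model's actual code:

```python
# Reproducing the Rondo calculation from footnote 4.
import math

def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

estimate = 5.59   # Rondo's adjusted plus-minus estimate (footnote 4)
std_err = 3.12    # its standard error (footnote 4)

z = estimate / std_err              # ~1.79 standard errors above zero
p_above_average = normal_cdf(z)     # P(true value > 0), roughly 96%

print(f"z = {z:.2f}, P(above average) = {p_above_average:.1%}")
```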



Copyright © 2007 by 82games.com, All Rights Reserved