
PA100 Primer

PA 100 is an NBA player evaluation metric that I built during the 2011/12 season. The metric parcels value among players using a priori assumptions about how individual statistics translate into team points. Unlike other evaluation metrics (PER, Win Shares, Wins Produced), PA100 uses play-by-play and matchup data (thanks to basketballvalue.com: http://basketballvalue.com/downloads.php) instead of box-score data. This allows it to capture the value of players at a higher resolution and, I think, get more accurate results. I'm not going to get into the nitty-gritty of the model, but I will outline the basic logic and explain the component parts. The key to the model is how it parcels each offensive possession into two stages: 1) shot creation, and 2) shot execution.


Shot creation:

Shots from certain locations and situations are more efficient than others. On average, shots at the rim and from three earn more points than mid-range jumpers, and assisted shots are more effective than unassisted shots. Ideally, teams would only take the highest-efficiency shots; however, gaining access to those shots can be very difficult. Shot creation represents the fact that opposing defenses aren't going to sit around and let you take layups all game. There is a reason most teams take 50% of their shots from mid-range: often that is the only shot you can get. I include shot creation in the model to credit players who help their team get better shots and punish players who do not.

Valuing shot creation, and in particular devaluing its absence, is where this metric most distinguishes itself from alternative evaluation metrics. Wins Produced and the related Win Score metric give all the benefits of creating a better look to the player who drops in the uncontested layup or open three, while placing the entire cost of failed shot creation on the player who ultimately takes the contested mid-ranger. PER simply gives players credit for any shot with a FG% over 30.4%. That certainly gives points to creators, but fails to discriminate between "creators" and "chuckers." Both of these metrics give credit for assists, but neither appreciates the range in assist value and, more importantly, neither controls for the value that an assist passes along to its recipient. Appreciating shot creation (and the lack thereof) improves our understanding of which players make the offense work and improves our ability to parcel team performance among the individual players.

To calculate shot creation, I start with the basic assumption that a team can take an unassisted jumper any time it wants. I call this the "settle." Taking a "settle" shot does not confer any value because it is, on average, the lowest-quality shot a player can take. Players generate increasing value for assisted jumpers, unassisted threes, assisted threes, unassisted shots at the rim, and assisted shots at the rim, in that order. (Note 1: I factor situational expected foul rates into these values. Note 2: I also estimate the number of missed assisted shots, or "potential assists," and give players credit for those.) The value of each shot type is the league mean for that shot type minus the league mean value of "settling" (an unassisted jumper). Player values aren't set by actual points scored, but instead are based on how they change the situation and thus improve on "expected points scored." I give all of this value to the player who either gets the look on his own or makes the pass into the situation. I do not credit pass recipients for shot creation. This is an oversimplification, but I think an acceptable one.
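A minimal sketch of the "value over settling" idea described above. The expected-points numbers are made-up placeholders chosen only to respect the ordering in the text; they are not the values PA100 uses, and the names `LEAGUE_EXPECTED_POINTS`, `creation_value`, and `shots_created_value` are hypothetical.

```python
# Hypothetical league-average expected points per shot type.
# Illustrative placeholders only, NOT the values used in PA100.
LEAGUE_EXPECTED_POINTS = {
    "unassisted_jumper": 0.76,  # the "settle" baseline
    "assisted_jumper": 0.84,
    "unassisted_three": 0.96,
    "assisted_three": 1.08,
    "unassisted_rim": 1.16,
    "assisted_rim": 1.32,
}

SETTLE = "unassisted_jumper"


def creation_value(shot_type):
    """Expected points a look adds over settling for an unassisted jumper."""
    return LEAGUE_EXPECTED_POINTS[shot_type] - LEAGUE_EXPECTED_POINTS[SETTLE]


def shots_created_value(created):
    """Total creation value for one player.

    `created` maps shot type -> number of looks the player created, either
    by getting the look himself or by making the pass that led to it.
    """
    return sum(count * creation_value(shot) for shot, count in created.items())
```

Note that the value depends only on the type of look, not on whether the shot went in; makes and misses are handled under shot execution.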

Here is a list of the top Shots Created scores in 2011/12:

[Image: top Shots Created (SC 100) scores, 2011/12]

Not surprisingly, point guards dominate this rating. Offensive creation is the explicit role of point guards, so it would be a concern if they didn't shine here. The "Shots Created" (SC 100) score is the sum of all of the different ways a player can add expected value to an offensive possession, minus turnovers above/below the NBA mean (turnovers are valued as an average possession). Players are also credited with the value of an average possession when they collect an offensive rebound. The value indicates the expected points added by that player over 100 possessions. Players like Rondo and Rose who are good at creating opportunities at the rim consistently dominate this score.
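The pieces of SC 100 described above can be sketched roughly as follows. This is one plausible reading, not the actual model: the constants `AVG_POSSESSION_VALUE` and `LEAGUE_TOV_PER_100` are hypothetical placeholders, and the function name `sc100` is mine.

```python
# Hypothetical constants, for illustration only.
AVG_POSSESSION_VALUE = 1.05  # points an average possession is worth
LEAGUE_TOV_PER_100 = 2.8     # league-mean turnovers per player per 100 possessions


def sc100(creation_points, off_rebounds, turnovers, possessions):
    """Shots Created per 100 possessions, under the assumptions above.

    creation_points: total expected points added through created looks
    off_rebounds:    offensive rebounds collected
    turnovers:       turnovers committed
    possessions:     possessions the player was on the floor for
    """
    scale = 100.0 / possessions
    creation = creation_points * scale
    # Each offensive rebound is credited as one average possession.
    rebounds = off_rebounds * scale * AVG_POSSESSION_VALUE
    # Turnovers are charged relative to the league mean, so a low-turnover
    # player receives a small bonus rather than a penalty.
    turnover_cost = (turnovers * scale - LEAGUE_TOV_PER_100) * AVG_POSSESSION_VALUE
    return creation + rebounds - turnover_cost
```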

Shot execution:

While there is a league-average value of shots at the rim, shots from range, and assisted jumpers, the player taking the shot also makes a huge difference. What I am looking for in "Shot execution" is a player's ability to transcend the expected value of different types of shots. An unassisted mid-range shot is much more valuable when it comes from Dirk Nowitzki than when it comes from Darko Milicic. This is true for every type of shot. Players who shoot above the expected value on different looks improve their teams' point production by either enhancing the value of good shot creation or by dampening the sting of possessions that fail to produce quality shots.

Calculating shot execution is easy. Players have an expected value at each location on the floor based on the percentage of their shots that were assisted and the league-wide expected value of assisted and unassisted shots at that location. For example, a player who takes four shots at the rim every 100 possessions is expected to add about 5 points above the baseline expectation of "settling." I subtract this expected value from the player's actual contribution to find how much value his execution adds at the rim. I do this for each player in each situation and then sum across them to find their "Shot Execution" (SE 100) scores.
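A rough sketch of that calculation, assuming expected value at a location is a simple assisted/unassisted weighted average. The `LEAGUE_EV` numbers are invented placeholders, `actual_points` is assumed to already include any free-throw points from fouls on those shots, and the function names are hypothetical.

```python
# Hypothetical league expected points per shot: (assisted, unassisted).
# Placeholder values for illustration only.
LEAGUE_EV = {
    "rim": (1.32, 1.16),
    "mid": (0.84, 0.76),
    "three": (1.08, 0.96),
}


def expected_points_per_shot(location, assisted_share):
    """Blend the league assisted/unassisted values by the player's own mix."""
    ev_assisted, ev_unassisted = LEAGUE_EV[location]
    return assisted_share * ev_assisted + (1.0 - assisted_share) * ev_unassisted


def se100(shooting, possessions):
    """Shot Execution per 100 possessions.

    `shooting` maps location -> (attempts, assisted_share, actual_points).
    """
    total = 0.0
    for location, (attempts, assisted_share, actual_points) in shooting.items():
        expected = attempts * expected_points_per_shot(location, assisted_share)
        total += actual_points - expected  # value added beyond expectation
    return total * 100.0 / possessions
```

A player who scores exactly what the league would expect from his mix of looks ends up at zero, regardless of how good those looks were.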

Here is a list of the top Shot Execution scores in 2011/12:

[Image: top Shot Execution (SE 100) scores, 2011/12]

Snipers like Dirk and Durant and guys like Howard who draw a lot of fouls consistently dominate this metric.

Putting it all together.

To find PA100.off, I sum a player's shot creation and shot execution scores, then subtract the mean score for players at his position.
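In code, that step might look like the sketch below. It uses a simple unweighted mean per position, which is an assumption on my part (a possession-weighted mean would be just as plausible), and the record layout is hypothetical.

```python
from statistics import mean


def pa100_off(players):
    """Position-adjusted offensive score for each player.

    `players` is a list of dicts with keys 'name', 'pos', 'sc100', 'se100'.
    Assumes an unweighted mean per position.
    """
    raw = {p["name"]: p["sc100"] + p["se100"] for p in players}

    # Collect raw scores by position to compute each position's mean.
    by_position = {}
    for p in players:
        by_position.setdefault(p["pos"], []).append(raw[p["name"]])
    position_mean = {pos: mean(scores) for pos, scores in by_position.items()}

    return {p["name"]: raw[p["name"]] - position_mean[p["pos"]] for p in players}
```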

Here is a list of the best PA100 scores in 2012:

[Image: top PA100.off scores, 2011/12]

The best offensive players are pretty much exactly who everyone thinks they are. LeBron and Howard are the best offensive players at their respective positions every year, Kobe passed the shooting guard crown to Wade in '09, and Paul and Nash traded off years as the top point guard.

PA100.def

Once you understand what goes into the offensive half of the metric, understanding the defensive end is simple.

PA100.def takes the production of all the players who line up across from ego throughout the season. When a player creates a shot, grabs an offensive rebound, shoots efficiently, turns the ball over, attempts a free throw, or does anything else captured in the PA100.off metric, the value of that action is credited/debited to the defensive player at his position. This means that when Steve Nash drives and dishes to Jared Dudley for a three against the Celtics' starting rotation, Rajon Rondo is debited with an assist-to-3 against, and Paul Pierce is debited with the difference between a made three and the expected value of an assisted three-point attempt. The only exceptions to the reliance on "counter-part production" in the PA100.def data are steals and blocks. The value of a steal is credited to the actual ball-thief, rather than to the counter-part of the player committing the turnover. The value of the resulting missed shot (whether from three, mid-range, or the rim) is credited to the shot-blocker, while the shot created (at the rim or from three) is still debited from the counter-part player.
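The counter-part bookkeeping above can be sketched as a simple ledger. Everything here is illustrative: the event format, the `credited_to` field for steals and blocks, and the point values are hypothetical, and each event's `value` is taken from the offense's point of view (negative for a turnover or a blocked shot).

```python
def assign_defense(events, defenders_by_pos):
    """Charge each offensive event to a defender.

    events:           list of dicts with 'off_pos' and 'value' (points of
                      value to the offense), plus an optional 'credited_to'
                      naming the actual stealer or shot-blocker.
    defenders_by_pos: position -> defender lined up at that position.
    Returns defender -> net defensive value (positive is good defense).
    """
    ledger = {name: 0.0 for name in defenders_by_pos.values()}
    for event in events:
        # Steals and blocks go to the player who made the play;
        # everything else goes to the positional counter-part.
        responsible = event.get("credited_to") or defenders_by_pos[event["off_pos"]]
        ledger[responsible] = ledger.get(responsible, 0.0) - event["value"]
    return ledger


defenders = {"PG": "Rondo", "SF": "Pierce"}
events = [
    {"off_pos": "PG", "value": 0.32},   # point guard creates an assisted three
    {"off_pos": "SF", "value": 1.92},   # wing makes it: 3 minus the expected value
    {"off_pos": "PG", "value": -1.05, "credited_to": "Pierce"},  # Pierce steals the ball
]
ledger = assign_defense(events, defenders)
```

In this toy possession sequence the created three is charged to the point guard's counter-part, the made shot to the wing's counter-part, and the steal to the player who actually took the ball.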

Here are the top 10 PA100.def scores in 2011/12:

[Image: top 10 PA100.def scores, 2011/12]

Cross-matching and help defense are obvious problems for this approach to evaluating defense. You might be eyeing George Hill, and especially Lou Williams, as likely symptoms of this problem. Simply put, the defensive metric is noisier than the offensive one, so it should always be taken with a grain of salt. I have plans that I think will greatly improve the parceling of defensive value, but it is a big project. For now, I do have a simple diagnostic tool to help identify cross-matching and help-defense impacts.

The "HLP" measure calculates how opponents at the other four positions on the court perform (using the same method outlined above) when ego is on the court vs. off, and then aggregates that production differential. The HLP value is not actually factored into the PA100.def scores, because it is pretty noisy and carries some bias that isn't in the other measures. However, I think that in conjunction with rotation information it is a useful tool. Ultimately I want to use this same general concept, but look at how each individual teammate is impacted by a player's presence on the court, but that is a much larger endeavor.
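As a rough sketch of the HLP idea, assuming the inputs are already per-100-possession opponent production by position with ego on and off the court. The sign convention (positive means opponents produce less with ego on the floor) and the function name `hlp` are my assumptions.

```python
POSITIONS = ("PG", "SG", "SF", "PF", "C")


def hlp(opp_on, opp_off, ego_pos):
    """Aggregate on/off differential at the four positions ego does not guard.

    opp_on / opp_off: position -> opponent production per 100 possessions
                      with ego on / off the court.
    Positive result: opponents at the other positions do worse when ego plays.
    """
    return sum(opp_off[pos] - opp_on[pos] for pos in POSITIONS if pos != ego_pos)
```

Because ego's own counter-part is excluded, the number only reflects what happens at the other four positions, which is why rotation context matters so much when reading it.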

Here are the top HLP scores in 2011/12:

[Image: top HLP scores, 2011/12]

As you can see, it is a weird mix of expected and unexpected players. Context is very important to consider when looking at this measure. "Who did this player log minutes with, and who were they replaced by?" is an essential question.

PA100.dif:

PA100.dif is simply PA100.off - PA100.def.

Here are the top 15 PA100.dif scores in 2011/12:

[Image: top 15 PA100.dif scores, 2011/12]

So why should you take these results seriously? What utility does this metric add?

PA 100 is descriptive:

Taking individual players' PA100 scores, multiplying them by possessions played (in hundreds, since the scores are per 100 possessions), and then summing the PA of all players on each team yields expected team point differentials. Comparing these values to actual team point differentials, we can see that PA100 does a very good job of explaining team success.

[Image: expected team point differential from PA100 vs. actual team point differential]

Individual PA100 scores explain 97.4% of the variance in team performance (r = .987). This means that you may be able to debate the parceling of points within a team, but as a whole, PA100 explains team performance almost perfectly. In addition to describing team performance, the component values in PA 100 make a nice roster construction tool. They make it easy to identify promising passer and finisher pairings, they make skillset dearths and redundancies apparent, and they help decide who to give the ball to in different shooting situations for optimal results.
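The team-level aggregation, and the relationship between the correlation and the share of variance explained, can be sketched as follows. The function name and roster format are hypothetical; only the r value comes from the post.

```python
def expected_team_differential(roster):
    """Expected team point differential from individual scores.

    `roster` is an iterable of (pa100_dif, possessions_played) pairs.
    Scores are per 100 possessions, hence the division by 100.
    """
    return sum(pa100 * possessions / 100.0 for pa100, possessions in roster)


# Variance explained is the square of the correlation coefficient.
r = 0.987
r_squared = r ** 2
```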

PA 100 is predictive:

PA 100 scores remain relatively stable across time. A player's performance in one season is a pretty solid predictor of his performance in the next season.

[Image: player PA100 scores in one season vs. the following season]

The between-year correlation is .802 for players who log at least 3,000 possessions (roughly 1,500 minutes). The more possessions a player plays in each year, the better the prediction. The prediction is also better if players remain on their current team between years, although this effect is weak. Defense is noisier than offense, and within offense, shot creation is more stable than shooting efficiency.
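A sketch of how such a between-year correlation could be computed. It assumes the 3,000-possession cutoff is applied to both seasons, which the post does not state explicitly, and the data layout and function names are hypothetical.

```python
from math import sqrt

MIN_POSSESSIONS = 3000  # threshold from the text; applying it to both seasons is an assumption


def paired_scores(season_a, season_b, min_poss=MIN_POSSESSIONS):
    """Align players who qualify in both seasons.

    season_*: player -> (pa100, possessions). Returns two aligned lists.
    """
    xs, ys = [], []
    for player, (score_a, poss_a) in season_a.items():
        if player not in season_b:
            continue
        score_b, poss_b = season_b[player]
        if poss_a >= min_poss and poss_b >= min_poss:
            xs.append(score_a)
            ys.append(score_b)
    return xs, ys


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)
```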

We can further improve our ability to anticipate future production by accounting for age. Younger players tend to improve, while older players tend to fall off. The graphic below shows the rate of change in PA100 offensive and defensive scores with age. The solid line at 0 represents the point where players reach their peak performance. The values above the 0 line indicate the expected improvement in points generated per 100 possessions in the next season. The values below the 0 line indicate the expected decline in points generated per 100 possessions in the next season:

[Image: expected next-season change in PA100 offensive and defensive scores, by age]

Players tend to peak offensively shortly after their 24th birthday. They then begin an increasingly rapid decline in offensive ability throughout the rest of their careers. Defensively, players tend not to peak until closer to their 26th birthday. They then slowly lose their defensive ability until about 32 or 33, when they finally see the same accelerated decline in defense that they experienced in their offense 5-7 years earlier.
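To make the age adjustment concrete, here is a deliberately crude sketch that treats the expected yearly change as a straight line crossing zero at the peak age. The peak ages are approximations of the ones given above; the linear shape and the `RATE` value are my assumptions, not the fitted curves from the graphic. In particular, a straight line does not capture the slow-then-accelerating defensive decline.

```python
OFF_PEAK_AGE = 24.0  # approximate offensive peak from the text
DEF_PEAK_AGE = 26.0  # approximate defensive peak from the text
RATE = 0.15          # hypothetical points per 100 possessions per year of age


def expected_change(age, peak_age, rate=RATE):
    """Expected next-season change: positive before the peak, negative after.

    A linear stand-in for the real aging curve; the decline grows with
    distance from the peak age.
    """
    return rate * (peak_age - age)


def project_next_season(score, age, peak_age, rate=RATE):
    """This season's score plus the expected age-related change."""
    return score + expected_change(age, peak_age, rate)
```

For example, `project_next_season(3.0, 22.0, OFF_PEAK_AGE)` nudges a 22-year-old's offensive score upward, while the same call with an age of 30 nudges it downward.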

(see this post for examples of this projection tool in action)

Problems?

In addition to the issues I discussed under the defensive metric, I think that some of the extreme values are too large. My hunch is that I am not accounting for the diminishing returns in some forms of production. In particular, I suspect there are diminishing returns in shot creation (there is only one ball) and offensive rebounding (something identified in previous studies). For example, I believe that LeBron James is generating +14 points per 100 possessions; however, I suspect that in the process he takes away opportunities for his teammates to generate their own value, and thus his true impact on the lineup is exaggerated (and his teammates' underestimated). Once I get around to investigating diminishing returns, I will include them in the model.

I don't like position adjustments. I think they paper over problems. However, I do adjust for position because the position of your counter-part ends up impacting defensive ratings. For example, point guards are more productive offensively than wings. I believe this to be reality, but without a position adjustment that fact makes the raw values underappreciate a good defender like Rajon Rondo in comparison to some wings who in reality are less impressive. I use the raw numbers for my own analysis, but the position-adjusted numbers are more intuitive for public consumption.