Around March of last year, we (being Will and Ian) said to ourselves, "What if we built a draft model? That would be kind of fun." We didn’t do anything about it for about fourteen months, and then we did. This is the story of that draft model.
Step 1: Building WISP
WISP was built with data from the 2006-2008 and 2010-2015 draft classes. We started in 2006 because it was the first draft after the passage of the one-and-done rule requiring players be at least one year removed from their high school graduation to be draft-eligible. We used per-40 stats to attenuate the effect of differential minutes allocation and lost the 2009 draft class because there are no minutes played for the 2009 NCAA season on sports-reference. All college data courtesy of sports-reference and all NBA data courtesy of basketball-reference.
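For anyone unfamiliar with per-40 stats, here is a minimal sketch of the conversion. The function name and the season lines are made up for illustration and are not taken from our R script.

```python
def per_40(stat_total: float, minutes_played: float) -> float:
    """Scale a season total to a per-40-minute rate."""
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive")
    return stat_total * 40.0 / minutes_played


# Hypothetical season lines (invented numbers, not real players).
# A starter and a reserve with very different minutes can post the same rate.
starter_reb_per_40 = per_40(stat_total=180, minutes_played=1200)
reserve_reb_per_40 = per_40(stat_total=90, minutes_played=600)

print(starter_reb_per_40, reserve_reb_per_40)
```

Both players come out at the same rebound rate even though one played twice as many minutes, which is the whole point of the adjustment.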
WISP predicts NBA Box Plus-Minus (BPM), the most easily accessible rate-based stat on basketball-reference. BPM is essentially a formula that uses easily available box-score statistics to emulate the more complicated and difficult catch-all statistic RAPM.
An important detail is that WISP was built using career BPM as the output variable. The reason for this is simple and kind of dumb: it was very easy to extract career BPM but rather hard to extract BPM from a certain time frame (e.g. ages 24-28, a player’s first 5 years in the league, etc.). That means players from earlier drafts will have a more reliable measurement of BPM as the result of more NBA minutes played. For example, WISP is informed equally with information from both LaMarcus Aldridge (23,000 NBA minutes) and Karl-Anthony Towns (5,700 NBA minutes). This is a serious limitation, and all inferences from WISP should be taken with a Sim Bhullar-sized grain of salt.
There are three sub-models in WISP: Guard, Wing and Big. We made this decision using the logic that the statistics that precede NBA success are not the same for every kind of basketball player. Luckily, sports-reference categorizes college players as "Guards", "Forwards" and "Centers." However, their criteria for centers are illogically stringent, as they classify both the 6’4 John Jenkins and the 7’1 JaVale McGee as forwards. In light of this, we instead decided to deem any basketball-reference forward shorter than 6’10 a "Wing" and any basketball-reference forward 6’10 or taller a "Big." This results in a somewhat goofy taxonomy for a few players, but seemed like a reasonable rule of thumb for most players.
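The height rule above can be sketched roughly as follows. The function name is hypothetical, and sending listed centers to the Big model is an assumption on our part, since the rule as described only covers forwards.

```python
def wisp_position(listed_position: str, height_inches: int) -> str:
    """Map a listed college position and a height to a WISP sub-model."""
    position = listed_position.strip().lower()
    if position == "guard":
        return "Guard"
    if position == "forward":
        # 6'10 is 82 inches: shorter forwards are Wings, the rest are Bigs.
        return "Wing" if height_inches < 82 else "Big"
    if position == "center":
        # Assumption: the rule in the text does not mention listed centers.
        return "Big"
    raise ValueError(f"unknown position: {listed_position!r}")


# The two forwards named in the text: 6'4 is 76 inches, 7'1 is 85 inches.
print(wisp_position("Forward", 76))
print(wisp_position("Forward", 85))
```

Under this rule the 6’4 forward lands in the Wing model and the 7’1 forward lands in the Big model, which is the split we were after.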
The models were built using all-subsets regression, maximizing adjusted r-squared. R-squared essentially tells you the strength of the association between a combination of the predictor variables and the outcome variable, while the adjustment penalizes you for adding in too many predictors. Adding too many predictors would mean that your model is unlikely to be applicable to data that you did not use to build it. Using adjusted, rather than raw, r-squared hopefully allows for the right balance of prediction and generalizability.
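To make the penalty concrete, here is a small sketch using the standard adjusted r-squared formula, 1 - (1 - R²)(n - 1)/(n - p - 1), plus a helper that enumerates the candidate predictor sets an all-subsets search would score. The sample size of 100 and the predictor counts are invented for illustration; they are not WISP's actual numbers, and this is not the code from our R script.

```python
from itertools import combinations


def adjusted_r_squared(r_squared: float, n_obs: int, n_predictors: int) -> float:
    """Standard adjustment: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_obs - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / dof


def all_subsets(predictors):
    """Yield every non-empty combination of predictors, smallest first."""
    for size in range(1, len(predictors) + 1):
        yield from combinations(predictors, size)


# Hypothetical: the same raw r-squared of .40 with 5 vs. 20 predictors.
lean = adjusted_r_squared(0.40, n_obs=100, n_predictors=5)     # about 0.37
bloated = adjusted_r_squared(0.40, n_obs=100, n_predictors=20)  # about 0.25

# Three candidate predictors give 2^3 - 1 = 7 models to compare.
candidates = list(all_subsets(["AGE", "HEIGHT", "STL"]))
print(round(lean, 3), round(bloated, 3), len(candidates))
```

The same raw fit looks a lot less impressive once you have to pay for twenty predictors, and the number of candidate models doubles with every predictor you add, which is why we kept the search reasonably small.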
Although it was possible to build models with a higher adjusted r-squared by busting out four-way interactions all over the place, we tried not to overfit the models too badly (note: they are likely already overfitted). Keeping in mind that all variables save age and height are per 40 minutes played, the regression equations for each model are:
Guard: AGE*HEIGHT + FTM*FTPERCENT + 3PM*STL + TRB*AST + TRB*STL + TRB*PF + AST*BLK
This model had an r-squared of .33 and an adjusted r-squared of .25.
Wing: AGE*HEIGHT + 2PM*2PERCENT*3PA + TRB*AST + AST*BLK + STL*PF + BLK*TOV + FTPERCENT*TOV
This model had an r-squared of .42 and an adjusted r-squared of .25.
Big: AGE + 3PA + 2PERCENT*STL + TRB*BLK + AST*STL + STL*BLK + STL*PF + TOV*PF
This model had an r-squared of .40 and an adjusted r-squared of .28.
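A note on reading these equations: if you read them as R-style model formulas (an assumption, but that is the usual convention), A*B is shorthand for both main effects plus their interaction, i.e. A + B + A:B. The sketch below expands a formula string that way. The function name is hypothetical, and it only handles the + and * operators used above.

```python
from itertools import combinations


def expand_formula(formula: str) -> list[str]:
    """Expand R-style 'A*B' terms into main effects and interactions."""
    terms: list[str] = []
    for chunk in formula.split("+"):
        factors = [f.strip() for f in chunk.split("*") if f.strip()]
        # A*B*C contributes every non-empty combination of its factors.
        for size in range(1, len(factors) + 1):
            for combo in combinations(factors, size):
                term = ":".join(combo)
                if term not in terms:  # a variable can appear in several products
                    terms.append(term)
    # Main effects first, then interactions; the sort is stable within each degree.
    return sorted(terms, key=lambda t: t.count(":"))


big_model = "AGE + 3PA + 2PERCENT*STL + TRB*BLK + AST*STL + STL*BLK + STL*PF + TOV*PF"
big_terms = expand_formula(big_model)
print(len(big_terms))
print(big_terms)
```

Read this way, the eight-chunk equation for the third model is really fitting 15 coefficients (9 main effects and 6 interactions) before the intercept, which helps explain the gap between raw and adjusted r-squared.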
Step 2: Examining in-sample predictions
Here we will examine BPM predictions for the players that were used to construct WISP. We should not be surprised when WISP predicts very good players to be very good and very bad players to be very bad, given that their NBA outcomes were used to fit the model.
Here are the 20 top-rated prospects in the guard model:
It is somewhat comforting to see EWP heroes of yore Delon Wright and Jordan Adams appearing at the very top of WISP, even if they never turned out to be NBA superstars. Evan Turner and Moe Harkless’s classifications seem somewhat dubious, but who are we to question sports-reference?
Now, here are the in-sample guards’ WISP BPM (x-axis) compared to their actual NBA BPM (y-axis):
Sorry, Isaiah and Russ :( Curse you, Evan Turner!
Here are the 20 top-rated prospects in the wing model:
We’re highly amused that Jae Crowder is reported as a better prospect than Durant, but that’s what happens when you have no RSCI (high school scouting ranking) or DraftExpress ranking variable. Aaron Gordon, Paul Millsap and Kevon Looney all appear as wings based on the height cut-off but acquit themselves well. Shouts out to Super Cool Beas.
Now, here are the in-sample wings’ WISP BPM (x-axis) compared to their actual NBA BPM (y-axis):
Here are the 20 top-rated prospects in the big model:
Step 3: Testing WISP on the 2016 NBA Draft
The 2016 draft is a chance to test whether WISP can predict outcomes for players other than the ones used to build the model. The results are decidedly mixed!
Step 4: Looking at the 2017 NBA Draft
And now, 1200 words later, we get to the whole reason this thing exists in the first place: to predict the success of current college players. We will report the WISP BPM for every college player in the top Draft Express Mock Draft (top 60 prospects). Based on WISP’s somewhat idiosyncratic analysis of the 2016 draft class, we would again recommend looking at these rankings as a fun thought exercise rather than anything at all informative.
Lonzo Ball, not shockingly, is predicted as the best player in the class. His WISP BPM is roughly the same as that of Kyrie Irving, the #1 WISP guard of the last decade. His combination of steals, rebounds and assists rockets him to the top of the rankings (and that doesn’t even factor in his insane 73% on two-point shots).
After Ball are big men Jordan Bell and Zach Collins. It is easy to tell why these two players are ranked so highly: Bell is second among bigs in both blocks and steals; Collins is first and fourth, respectively. Both are also among the top three bigs in two-point field goal percentage, making them WISP darlings.
Rounding out the top five are Markelle Fultz and OG Anunoby, followed by John Meyer’s PF progeny, TJ Leaf. Justin Patton and Jayson Tatum are the final two players projected for a positive BPM by WISP.
Players projected to go in the lottery by DX but that receive unfavorable WISP scores are De’Aaron Fox, Lauri Markkanen and Josh Jackson. Fox ranks last among guards in 3PM and posts middling scores in free throw percentage and rebounds. Where Fox is elite is drawing free throws, ranking first among guards in FTA. Markkanen’s biggest college skill of perimeter shooting isn’t strongly associated with successful NBA shooting, and isn’t included in the WISP model for bigs. As such, he was a likely candidate to deviate from scouts' rankings. His anemic rebound (worst big), block (2nd worst big) and steal (worst big) rates doomed him in the eyes of WISP.
Josh Jackson is the player with the greatest negative differential between DX rank and WISP rank. The most predictive interaction term in the wing sub-model is free throw percentage by turnovers, and Jackson finishes 2nd to last in both categories. This, combined with his extremely low number of 3PA, outweighs his quite good steal, rebound and two-point attempt rates. If we were to name a player whose 2017 WISP score we trust least, it would certainly be Jackson (Sindarius Thornwell 4 lyfe!).
Now, a note on Canis Hoopus draft darling Jonathan Isaac. Since he is 6’11, his WISP BPM is calculated using the big model (where he ends up as the 9th overall big). While his steal and block rates rank highly among bigs, his two-point field goal percentage and high volume of 3s (a negative indicator of big man success) hold him back. However, if you run him as a wing, he ranks 3rd in the class, in between Bell and Collins. Interestingly, if you run OG Anunoby as a big rather than a wing, he’d remain 5th on the WISP board.
Overall, there appears to be a fair amount of agreement between WISP and scouts regarding the 2017 class, with a correlation of r = .54 between WISP BPM and Draft Express ranking, as compared to r = .22 between WISP and draft slot in 2016. Below, you will see a graph of BPM projected by WISP vs. DX rankings:
And finally, a plot of WISP rankings vs DX rankings, with players above the blue line favored by Draft Express and players below the blue line favored by WISP.
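For reference, the correlations quoted above are ordinary Pearson correlations, which can be computed along these lines. The five prospects below are invented numbers, not the real 2017 board, and the function is a sketch rather than the code from our R script.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical mini-board: mock draft rank (1 = best) and projected BPM.
dx_rank = [1, 2, 3, 4, 5]
wisp_bpm = [2.0, 1.0, 1.5, -0.5, -1.0]

r = pearson_r(dx_rank, wisp_bpm)
print(round(r, 2))
```

One thing to keep in mind: when rank 1 is the best prospect, agreement between a ranking and a projected BPM shows up as a negative raw correlation, so the sign you report depends on how the ranking is coded.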
To improve our model, the next thing we are attempting to add is RSCI/DX rankings for the in-sample players to model scouting opinions (which IIRC improved Layne’s EWP). We also hope to add an NCAA strength of schedule component to discriminate between Zach Collins's and Joel Embiid's nearly identical per 40 statistics. We’d also like to amend our outcome variable. Ideally we’d have BPM for the first 4-5 years of a player’s career, to estimate production on a rookie contract.
If you’d like to know more information about a specific player or class or if you have any general questions, let us know in the comments.
If you'd like to play around with the data yourself or examine the R script that created the model, you can find it here.
Thanks for reading and good luck in the Fake Mock Draft.