## WISP: Will and Ian’s Shoddy Predictions

Around March of last year, we (being Will and Ian) said to ourselves, "What if we built a draft model? That would be kind of fun." We didn’t do anything about it for about fourteen months, and then we did. This is the story of that draft model.

Step 1: Building WISP

WISP was built with data from the 2006-2008 and 2010-2015 draft classes. We started in 2006 because it was the first draft after the passage of the one-and-done rule requiring players be at least one year removed from their high school graduation to be draft-eligible. We used per-40 stats to attenuate the effect of differential minutes allocation and lost the 2009 draft class because there are no minutes played for the 2009 NCAA season on sports-reference. All college data courtesy of sports-reference and all NBA data courtesy of basketball-reference.
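As a quick illustration of the per-40 scaling, here is a minimal sketch. The function name and the season line are made up for the example and are not taken from our R script.

```python
def per_40(total, minutes_played):
    """Scale a season total (e.g. rebounds) to a per-40-minute rate."""
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive")
    return total * 40.0 / minutes_played


# Hypothetical season line, not a real player: 210 rebounds in 1,050 minutes.
print(per_40(210, 1050))  # 8.0
```

A season with no recorded minutes cannot be scaled this way, which is why the 2009 class drops out.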

WISP predicts NBA Box Plus-Minus, the most easily accessible rate-based stat on basketball-reference. BPM is essentially a formula that uses easily available box-score statistics to emulate the more complicated and difficult catch-all statistic RAPM.

An important detail is that WISP was built using career BPM as the output variable. The reason for this is simple and kind of dumb: it was very easy to extract career BPM but rather hard to extract BPM from a certain time frame (e.g. ages 24-28, a player’s first 5 years in the league, etc.). That means players from earlier drafts will have a more reliable measurement of BPM as the result of more NBA minutes played. For example, WISP gives equal weight to LaMarcus Aldridge (23,000 NBA minutes) and Karl-Anthony Towns (5,700 NBA minutes). This is a serious limitation, and all inferences from WISP should be taken with a Sim Bhullar-sized grain of salt.

There are three sub-models in WISP: Guard, Wing and Big. We made this decision using the logic that the statistics that precede NBA success are not the same for every kind of basketball player. Luckily, sports-reference categorizes college players as "Guards", "Forwards" and "Centers." However, their criteria for centers are illogically stringent, as they classify both the 6’4 John Jenkins and the 7’1 JaVale McGee as forwards. In light of this, we instead decided to deem any player determined to be a forward by basketball-reference and less than 6’10 a "Wing" and all basketball-reference forwards 6’10 or taller "Bigs." This results in a somewhat goofy taxonomy for a few players, but seemed like a reasonable rule of thumb for most players.
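The rule of thumb above can be sketched as a small function. This is an illustration, not the code we ran: the function name is hypothetical, and sending listed centers straight to the Big model is an assumption the text does not spell out.

```python
def wisp_group(listed_position, height_inches):
    """Map a listed college position and height to a WISP sub-model."""
    position = listed_position.strip().lower()
    if position == "guard":
        return "Guard"
    if position == "center":
        # Assumption: listed centers are treated as Bigs.
        return "Big"
    if position == "forward":
        # 6'10" is 82 inches; forwards at or above the cut-off are Bigs.
        return "Big" if height_inches >= 82 else "Wing"
    raise ValueError(f"unrecognized position: {listed_position!r}")


print(wisp_group("Forward", 76))  # a 6'4 forward like John Jenkins -> Wing
print(wisp_group("Forward", 85))  # a 7'1 forward like JaVale McGee -> Big
```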

The models were built using all-subsets regression, maximizing adjusted r-squared. r-squared essentially tells you the strength of the association between the combination of predictor variables and the outcome variable, and the adjustment penalizes you for adding too many predictors. Adding too many predictors would mean that your model is unlikely to be applicable to data that you did not use to build it. Using adjusted rather than raw r-squared hopefully allows for the right balance of prediction and generalizability.
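For readers who want to see the mechanics, here is a minimal pure-Python sketch of all-subsets selection by adjusted r-squared, where adjusted r-squared = 1 - (1 - r-squared)(n - 1)/(n - p - 1) for n players and p predictors. It is not our R script: the helper names are hypothetical, the data are synthetic, and it searches main effects only, with no interaction terms.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def r_squared(rows, y, cols):
    """Fit least squares with an intercept on the chosen columns; return r-squared."""
    X = [[1.0] + [row[c] for c in cols] for row in rows]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n, p):
    """Penalize r-squared for using p predictors on n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def best_subset(rows, y, names):
    """Try every non-empty predictor subset; keep the best adjusted r-squared."""
    best = None
    for size in range(1, len(names) + 1):
        for cols in combinations(range(len(names)), size):
            adj = adjusted_r_squared(r_squared(rows, y, cols), len(y), size)
            if best is None or adj > best[0]:
                best = (adj, [names[c] for c in cols])
    return best


# Synthetic, centered toy data (not real players): the outcome tracks the
# first column plus noise, and the second column is unrelated to it.
names = ["AST", "BLK"]
rows = [[1, 1], [1, 1], [1, -1], [1, -1], [-1, 1], [-1, 1], [-1, -1], [-1, -1]]
y = [4, 2, 4, 2, -2, -4, -2, -4]
print(best_subset(rows, y, names))
```

On this toy data the search keeps AST alone: adding BLK leaves raw r-squared at 0.9 but drops adjusted r-squared from about 0.88 to 0.86, which is exactly the penalty described above.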

Although it was possible to build models with a higher adjusted r-squared by busting out four-way interactions all over the place, we tried not to overfit the models too badly (note: they are likely already overfitted). Keeping in mind that all variables save age and height are per 40 minutes played, the regression equations for each model are:

Guards:

AGE*HEIGHT + FTM*FTPERCENT + 3PM*STL + TRB*AST + TRB*STL + TRB*PF + AST*BLK

This model had an r-squared of .33 and an adjusted r-squared of .25.

Wings:

AGE*HEIGHT + 2PM*2PERCENT*3PA + TRB*AST + AST*BLK + STL*PF + BLK*TOV + FTPERCENT*TOV

This model had an r-squared of .42 and an adjusted r-squared of .25.

Bigs:

AGE + 3PA + 2PERCENT*STL + TRB*BLK + AST*STL + STL*BLK + STL*PF + TOV*PF

This model had an r-squared of .40 and an adjusted r-squared of .28.
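A note on reading these equations. Assuming the `*` follows the usual R formula convention (an assumption, since the equations above do not define it), `A*B` stands for both main effects plus their interaction, and `A*B*C` also includes every two-way and the three-way interaction. The hypothetical helper below expands an equation into its individual terms, writing interactions as `A:B`.

```python
from itertools import combinations


def expand_formula(formula):
    """Expand R-style `A*B` crossings into main effects and interactions."""
    terms = []
    for chunk in formula.split("+"):
        factors = [f.strip() for f in chunk.split("*")]
        # A*B*C contributes every non-empty combination of its factors.
        for size in range(1, len(factors) + 1):
            for combo in combinations(factors, size):
                term = ":".join(combo)
                if term not in terms:
                    terms.append(term)
    return terms


guards = "AGE*HEIGHT + FTM*FTPERCENT + 3PM*STL + TRB*AST + TRB*STL + TRB*PF + AST*BLK"
guard_terms = expand_formula(guards)
print(len(guard_terms))  # 17
print(guard_terms)
```

Under that reading, the guard equation has 17 terms before the intercept: 10 main effects and 7 two-way interactions.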

Step 2: Examining in-sample predictions

Here we will examine BPM predictions for the players that were used to construct WISP. We should not be surprised when WISP predicts very good players to be very good and very bad players to be very bad, given that they were used to compute the outcome.

Here are the 20 top-rated prospects in the guard model:

It is somewhat comforting to see EWP heroes of yore Delon Wright and Jordan Adams appearing at the very top of WISP, even if they never turned out to be NBA superstars. Evan Turner and Moe Harkless’s classifications seem somewhat dubious, but who are we to question sports-reference?

Now, here are the in-sample guards’ WISP BPM (x-axis) compared to their actual NBA BPM (y-axis):

Sorry, Isaiah and Russ :( Curse you, Evan Turner!

Here are the 20 top-rated prospects in the wing model:

We’re highly amused that Jae Crowder is reported as a better prospect than Durant, but that’s what happens when you have no RSCI (high school scouting ranking) or DraftExpress ranking variable. Aaron Gordon, Paul Millsap and Kevon Looney all appear as wings based on the height cut-off but acquit themselves well. Shouts out to Super Cool Beas.

Now, here are the in-sample wings’ WISP BPM (x-axis) compared to their actual NBA BPM (y-axis):

Much like our professional counterparts, WISP overestimated Anthony Bennett and Michael Beasley and underestimated Draymond Green and Chandler Parsons.

Here are the 20 top-rated prospects in the big model:

Big men are where we see the most consensus between scouts and WISP at the top of the list. There are no players taken outside of the lottery until McBob (swoon) at #11 and he, Whiteside and Muscala are the only bigs in the WISP top 20 to be taken after the first round.

Now, here are the in-sample bigs’ WISP BPM (x-axis) compared to their actual NBA BPM (y-axis):

Nerlens and Jason Thompson are the two highest-ranking bigs to underperform while Steve Novak, Mason Plumlee and Kevin Love (hmmmmm) outperform their WISP expectations.

For good measure, here are the top 30 overall prospects and a graph of all in-sample players' WISP and real BPM.

WISP tends to underestimate performance on the high end of the NBA spectrum: there were 44 players with a WISP BPM above zero, compared to 83 players with actual BPM greater than zero. Conversely, WISP overestimates the terrible players. There are 35 players with a WISP BPM less than -5.0 and 65 players with an actual BPM less than -5.0. So, when you look at the very best players, it probably makes more sense to focus on the rank-order rather than their absolute WISP BPM value.
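This bunching toward the middle is expected from any regression. For a least-squares fit with an intercept, the in-sample fitted values have a standard deviation equal to the square root of r-squared times the standard deviation of the actual values. The sketch below applies that to the three r-squared values reported in Step 1. Treating each sub-model separately is a simplification, since the counts above pool all players, and the function name is hypothetical.

```python
import math

# In-sample r-squared values reported for each sub-model in Step 1.
reported_r_squared = {"Guard": 0.33, "Wing": 0.42, "Big": 0.40}


def prediction_spread_ratio(r_squared):
    """SD of in-sample fitted values as a fraction of the SD of actual values."""
    return math.sqrt(r_squared)


for group, r2 in reported_r_squared.items():
    print(group, round(prediction_spread_ratio(r2), 2))
# Guard 0.57
# Wing 0.65
# Big 0.63
```

In other words, WISP BPM values should be spread only about 60% as widely as real BPM values, so fewer players land above zero or below -5.0.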

Looking at how WISP (which of course has the benefit of hindsight) fares against NBA GMs, following WISP rankings (blue line) would give you a small advantage over the real draft order (red line) for the first ten picks or so, but no additional advantage over the course of the first round.

For players with over 1000 NBA minutes played, WISP BPM correlates with actual BPM at r = .55. This is a stronger association than most I measure in my professional life as a social psychology researcher, but I am guessing somewhat marginal for a model of this nature. If you have questions about individual drafts, feel free to ask and we can provide more information in the comments.
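The correlation above is a plain Pearson r computed after dropping low-minute players. Here is a minimal sketch of that calculation. The field names and the four rows are made up for illustration and are not real WISP outputs.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_above_minutes(players, min_minutes=1000):
    """Correlate predicted and actual BPM for players over the minutes cut-off."""
    kept = [p for p in players if p["nba_minutes"] > min_minutes]
    predicted = [p["wisp_bpm"] for p in kept]
    actual = [p["actual_bpm"] for p in kept]
    return pearson_r(predicted, actual)


# Made-up rows for illustration only.
players = [
    {"wisp_bpm": 1.0, "actual_bpm": 1.0, "nba_minutes": 12000},
    {"wisp_bpm": -1.0, "actual_bpm": -2.0, "nba_minutes": 4000},
    {"wisp_bpm": 0.0, "actual_bpm": 1.0, "nba_minutes": 2500},
    {"wisp_bpm": 3.0, "actual_bpm": -6.0, "nba_minutes": 300},  # filtered out
]
print(correlation_above_minutes(players))
```

The minutes filter matters because BPM from a few hundred minutes is noisy, and one wild low-minute season can swamp the relationship.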

Step 3: Examining out-of-sample predictions

The 2016 draft is a chance to test whether WISP can predict outcomes for players who were not used to build the model. The results are decidedly mixed!

WISP pegs Ben Simmons as the best prospect of the last 10 years with a WISP BPM of 6.5, a full 2.3 points above Anthony Davis, the highest in-sample player. He absolutely murders in rebounds, steals, assists and two-point percentage, an extremely juicy combo for the big man sub-model. Interestingly, Simmons drops to #10 in the 2016 rankings if you run him using the wing model, but vaults back up to second if you run him as a guard (which is how he might play offense as a 76er).

WISP is head-over-heels for Chinanu Onuaku, placing him with a predicted BPM equidistant between Joakim Noah and Joel Embiid. It also has 56th overall pick Daniel Hamilton as the #3 ranked player, whose predicted BPM is comparable to Marcus Smart and Eric Gordon. Danny Hamilton was also ranked suspiciously high by other statistical models, coming in between #10 and #15 in the 2016 class in long-standing and respected models created by Nick Restifo, Jesse Fischer and Steve Shea. Less controversial choices Deyonta Davis, Jakob Poeltl, DeAndre Bembry, Denzel Valentine and Brandon Ingram fill out the list of players with a WISP BPM greater than zero.

Here is the order of the 2016 first round if WISP were in charge:

On the other hand, WISP is way down on Malcolm Brogdon, Jaylen Brown and Buddy Hield, all of whom played well enough over the course of the 2016-2017 season. Overall, WISP BPM was correlated with actual BPM at a paltry r = .09. However, I assume history will vindicate its prediction of Ben Simmons as the greatest player of his generation.

Step 4: Looking at the 2017 NBA Draft

And now, 1200 words later, we get to the whole reason this thing exists in the first place: to predict the success of current college players. We will report the WISP BPM for every college player in the top Draft Express Mock Draft (top 60 prospects). Based on WISP’s somewhat idiosyncratic analysis of the 2016 draft class, I would again recommend looking at these rankings as a fun thought exercise rather than anything at all informative.

Lonzo Ball, not shockingly, is predicted as the best player in the class. His WISP BPM is roughly the same as Kyrie Irving, the #1 WISP guard of the last decade. His combination of steals, rebounds and assists rockets him to the top of the rankings (and that doesn’t factor in his insane 73% on two-point shots).

After Ball are big men Jordan Bell and Zach Collins. It is easy to tell why these two players are ranked so highly: Bell is second among bigs in both blocks and steals; Collins is first and fourth, respectively. Both are also among the top three bigs in two-point field goal percentage, making them WISP darlings.

Rounding out the top five are Markelle Fultz and OG Anunoby, followed by John Meyer’s PF progeny, TJ Leaf. Justin Patton and Jayson Tatum are the final two players projected for a positive BPM by WISP.

Players projected to go in the lottery by DX but that receive unfavorable WISP scores are De’Aaron Fox, Lauri Markkanen and Josh Jackson. Fox ranks last among guards in 3PM and posts middling scores in free throw percentage and rebounds. Where Fox is elite is drawing free throws, ranking first among guards in FTA. Markkanen’s biggest college skill of perimeter shooting isn’t strongly associated with successful NBA shooting, and isn’t included in the WISP model for bigs. As such, he was a likely candidate to deviate from scouts' rankings. His anemic rebound (worst big), block (2nd worst big) and steal (worst big) rates doomed him in the eyes of WISP.

Josh Jackson is the player with the greatest negative differential between DX rank and WISP rank. The most predictive interaction term in the wing sub-model is free throw percentage by turnovers, and Jackson finishes 2nd to last in both categories. This, combined with his extremely low number of 3PA, overcomes his quite good steal, rebound and two-point attempt rates. If I were to name a player whose 2017 WISP score I trust least, it would certainly be Jackson (Sindarius Thornwell 4 lyfe!).
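The rank differential can be computed as in the sketch below. The sign convention is an assumption on my part: DX rank minus WISP rank, so a large negative number means scouts like a player far more than WISP does. The player names and numbers are placeholders, not actual 2017 values.

```python
def rank_differentials(players):
    """Return (name, dx_rank - wisp_rank) pairs, most negative first."""
    # Rank 1 goes to the highest predicted BPM.
    by_wisp = sorted(players, key=lambda p: p["wisp_bpm"], reverse=True)
    wisp_rank = {p["name"]: i + 1 for i, p in enumerate(by_wisp)}
    diffs = [(p["name"], p["dx_rank"] - wisp_rank[p["name"]]) for p in players]
    return sorted(diffs, key=lambda d: d[1])


# Placeholder players and values for illustration only.
players = [
    {"name": "Player A", "dx_rank": 1, "wisp_bpm": -2.0},
    {"name": "Player B", "dx_rank": 2, "wisp_bpm": 1.5},
    {"name": "Player C", "dx_rank": 3, "wisp_bpm": 0.5},
]
print(rank_differentials(players))
# [('Player A', -2), ('Player B', 1), ('Player C', 1)]
```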

Now, a note on Canis Hoopus draft darling Jonathan Isaac. Since he is 6’11, his WISP BPM is calculated using the big model (where he ends up as the 9th overall big). While his steal and block rates rank highly among bigs, his two-point field goal percentage and high volume of 3s (a negative indicator of big man success) hold him back. However, if you run him as a wing, he ranks 3rd in the class, in between Bell and Collins. Interestingly, if you run OG Anunoby as a big rather than a wing, he’d remain 5th on the WISP board.

Overall, there appears to be a fair amount of agreement between WISP and scouts regarding the 2017 class, with a correlation of r = .54 between WISP BPM and Draft Express ranking, as compared to r = .22 between WISP and draft slot in 2016. Below, you will see a graph of BPM projected by WISP vs. DX rankings:

And finally, a plot of WISP rankings vs. DX rankings, with players above the blue line favored by Draft Express and players below the blue line favored by WISP.

Next Steps:

To improve our model, the next thing we are attempting to add is RSCI/DX rankings for the in-sample players to model scouting opinions (which IIRC improved Layne’s EWP). We hope to add an NCAA strength of schedule component, to discriminate between Zach Collins's and Joel Embiid's nearly identical per 40 statistics. We’d also like to amend our outcome variable. Ideally we’d have the BPM from the first 4-5 years of a player’s career, to estimate production on a rookie contract.
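One possible way to build that amended outcome, assuming season-by-season BPM and minutes were available, is a minutes-weighted average over a player's first few seasons. This is only a sketch of the idea: the function, the field names and the numbers are hypothetical, and weighting a rate stat by minutes is itself an approximation.

```python
def early_career_bpm(seasons, max_seasons=5):
    """Minutes-weighted BPM over a player's first `max_seasons` NBA seasons."""
    early = sorted(seasons, key=lambda s: s["season"])[:max_seasons]
    total_minutes = sum(s["minutes"] for s in early)
    if total_minutes == 0:
        return None  # no NBA minutes to average over
    return sum(s["bpm"] * s["minutes"] for s in early) / total_minutes


# Made-up seasons for illustration only.
seasons = [
    {"season": 2013, "minutes": 1000, "bpm": 2.0},
    {"season": 2011, "minutes": 1000, "bpm": -2.0},
    {"season": 2012, "minutes": 2000, "bpm": 1.0},
]
print(early_career_bpm(seasons, max_seasons=2))  # 0.0
print(early_career_bpm(seasons, max_seasons=3))  # 0.5
```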

If you’d like to know more information about a specific player or class or if you have any general questions, let us know in the comments.

If you'd like to play around with the data yourself or examine the R script that created the model, you can find it here.

Thanks for reading and good luck in the Fake Mock Draft.

*************************************************************************************************************************************************

TL;DR: Ben Simmons is GOAT, Lonzo is v good, T.J. and OG in the lottery