Every year, members of the APBRmetrics bulletin board come together to share their rankings for the upcoming NBA draft. As the online home of the NBA analytics community, featuring NBA team executives, leading basketball journalists and, of course, a gifted and passionate fan base, the board produces rankings that differ from a typical mock draft. With basketball analytics now popular and ubiquitous around the league, the APBRmetrics community wanted to share its rankings with the passionate fans and followers of DraftExpress.
Before you read this, please look at any previous year's mock draft. Note how many lottery picks quickly left the league and how many later picks became stars. Every year after the NBA draft, analysts will tell you that the teams that won the draft are those that drafted players who slid past their projected slot in mock drafts, while the teams that lost are those that drafted players above their projected slot. Despite the increase in information available on NBA prospects, mock drafts have become less prescient. In today's landscape, where analytics play such a large role in determining where and when a player is drafted, traditional mock drafts are left in a state of turmoil.
Despite the amazing statistical tools available to evaluate the impact of NBA players, ranking college prospects is a much more complicated task. Not only must team style, strength of schedule and age be accounted for, but a more important question must be asked: how will a player's game change when competing against bigger and faster opponents? Other important factors simply cannot be measured with stats: how much room does the player have to grow (physically and mentally)? What about his work ethic? Does he have a good head on his shoulders? Can he guard a position in the NBA? Additionally, teams need to decide whether they want a player who excels in traditional basketball metrics or one who rates well analytically. Zach LaVine was selected to the Rising Stars Challenge, won the 2015 Slam Dunk Contest and was named to the NBA All-Rookie Second Team. Most consider that a successful rookie season and a good draft selection by Minnesota. Analytics, on the other hand, rate LaVine's year as one of the worst in the entire NBA and a bad draft choice.
Keep in mind when looking at these models that they are not mock drafts. Use these rankings as a supplement to all of the other information you know about a prospect. Further, due to the limited information available for international prospects, these models rank only NCAA players eligible for the NBA draft who are likely to be selected.
Preview of the Five Different NBA Draft Models and Their Top-14 Prospects. Full Rankings Displayed at Bottom
Note: Non-collegiate prospects, such as Emmanuel Mudiay, Mario Hezonja, Kristaps Porzingis, and others, have been excluded from this study.
My name is Layne Vashro (@VJL_bball) and I am presenting my simple Estimated Wins Peak (EWP) model. In the past, I have put together a number of different projection models and tools to help evaluate incoming talent. These include several NCAA/international models, a player-season comparison finder, a tool that shows how each statistic has historically translated to the NBA for players under different coaches, and a tool that allows you to follow each prospect's progression or regression throughout the season. You can find these over at Nylon Calculus under the Our Stats tab.
The goal of the EWP model is to project how good each prospect will be at the peak of his NBA career. In order to do that, I must quantify peak NBA performance in some acceptable way. I do this by calculating the number of wins a player is responsible for in each season of his career using a blend of Win Shares (box-score metric) and RAPM (+/- metric). I then use a two-year rolling average and select the highest value as that player's wins peak. Here is a link to the list of previously drafted players included in the sample. If this list largely agrees with the order in which you would select these players in a redraft, you can at least be comfortable with my model's validity.
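The peak calculation described above (best two-year rolling average of per-season win values) can be sketched in a few lines. The function name and the sample career arc below are illustrative, not part of the actual model code:

```python
def wins_peak(season_wins):
    """Peak = highest two-year rolling average of per-season win values.

    `season_wins` is a chronological list of blended wins (the Win
    Shares / RAPM blend) for one player's career.
    """
    if len(season_wins) < 2:
        # Not enough seasons for a two-year window; fall back to the max.
        return max(season_wins, default=0.0)
    return max((a + b) / 2 for a, b in zip(season_wins, season_wins[1:]))

# A career arc of 2, 6, 11, 9, 4 wins peaks at (11 + 9) / 2 = 10
```

The rolling average smooths out single-season spikes so one fluke year does not define a player's peak.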
The model is constructed with collegiate box-score statistics pulled from DraftExpress.com and basketball-reference.com, play-by-play statistics pulled from hoop-math.com, anthropometric information (measurements) from DraftExpress, and a selection of team statistics pulled from sports-reference.com. Then, a linear regression is used to identify what each bit of pre-NBA information says about a player's future peak production in the NBA based on historical results. This knowledge is then applied to current prospects to project their NBA future.
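The fitting step is a standard linear regression: learn coefficients from historical players' pre-NBA features and their eventual peak wins, then apply them to a current prospect. This is a minimal sketch with made-up numbers; the feature columns (age, per-40 production) stand in for the much larger feature set the model actually uses:

```python
import numpy as np

# Historical players: [intercept, age, per-40 production] -> peak wins.
X = np.array([
    [1.0, 19.2, 24.5],
    [1.0, 22.1, 18.0],
    [1.0, 20.5, 21.3],
])
y = np.array([12.0, 4.5, 8.0])                # historical peak wins

coef, *_ = np.linalg.lstsq(X, y, rcond=None)  # least-squares fit

# Apply the learned coefficients to a current prospect's feature row.
prospect = np.array([1.0, 19.8, 23.0])
projected_peak = prospect @ coef
```

With real data the regression would of course be fit on hundreds of historical players rather than three rows.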
My name is Steve Shea (@SteveShea33) and I am an associate professor of mathematics at Saint Anselm College and a co-author of the book, Basketball Analytics.
College Prospect Rating (CPR) uses a college player's box score statistics and his class (freshman, etc.) to approximate his NBA potential. It differs significantly from other objective draft models in at least the following ways: CPR does not use regressions. Thus, CPR does not have to make a choice of a dependent variable. This is nice, but not the primary motivation for not using regressions. A typical regression uses information of what has worked in the past to predict what will work in the future. Implicit in the prediction is the assumption that the context of the past will be similar enough to the context of the future. This may not be true in the NBA. The NBA is changing in very measurable ways (such as the percentage of a team's offense that comes from 3-point shots). CPR hopes to project the players that will succeed in 2016 and beyond, not pick the players that will thrive in the 90s.
College players are inconsistent. This is most problematic in the freshman season, which is the last college season for some of the top prospects. Some freshmen improve dramatically over the course of the season. Others simply don't show up on occasion. These inconsistencies blur season average numbers. CPR gets around this by focusing only on each individual's top 10 performances in each statistic.
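The "top 10 performances" idea can be sketched directly: for each statistic, rate the player on the average of his ten best games rather than his season average. The function and sample game log below are illustrative:

```python
def top_n_average(game_values, n=10):
    """Average of a player's n best single-game values for one statistic.

    CPR rates players on their best performances rather than season
    averages; this is a simplified sketch of that idea.
    """
    best = sorted(game_values, reverse=True)[:n]
    return sum(best) / len(best)

# A streaky freshman: the ten big games dominate the rating even though
# the no-show games drag down the season average.
points = [31, 2, 28, 5, 25, 30, 4, 27, 26, 3, 29, 24, 6, 23, 22]
```

For the game log above, `top_n_average(points)` averages the ten best scoring games (22 through 31), ignoring the 2-, 3-, 4-, 5- and 6-point duds entirely.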
There are no weights on statistics or adjustments for positional scarcity. CPR weights all statistics the same. It simply looks for excellence. Anthony Davis was excellent. Kevin Durant was excellent. The two were superior prospects in very different ways. CPR leaves it to the team to decide what positions it needs or perceives to be scarce at the time, and what type of player it wants. In spite of its nonstandard construction (or maybe because of it), CPR has been effective at projecting both high picks that busted and late picks that surprised, as can be seen here.
My name is Nick Restifo. In my basketball life, I write for Nylon Calculus and am a special assistant for the D2 powerhouse that is the University of New Haven Chargers. If you like, you can follow me on Twitter at @itsastat.
My overall predictions are based on an ensemble of four base models predicting a two year career peak blend of RAPM and Win Shares. The ensemble takes input from a regression based model and a bagged neural network trained on two different subsets of data; all prospects with statistics listed on DraftExpress since 2001-2002, and just those prospects that were actually drafted since 2001-2002 (a total of four base models).
I use RSCI high school rank, NBA Combine measurements and tests, pace and per minute adjusted box score statistics, minutes per game, age on February 1st of a player's draft year, strength of schedule, and percentage of points from three (to account for some spacing benefits). I average an entire player's pre-NBA career, each year weighted by minutes played. For the vast amount of missing data for the players who did not participate in the combine, I impute regression based estimates of body dimensions (hand length, body fat, etc) based on listed height and weight. For the vertical and agility tests, I impute missing values via decision trees trained on a player's age and body dimensions.
Compared to other models, mine includes high school ranking as a variable, so it will favor highly heralded high school players significantly more than other models do. This ranking has served as an important predictor in the past, and it helps put into context the ratings of five-star recruits such as Cliff Alexander versus unranked high school prospects like Frank Kaminsky.
My name is Jesse Fischer. I work as a Senior Software Engineer at Amazon. My academic background includes a degree in Computer Engineering with a minor in Mathematics from the University of Washington. I blog on www.tothemean.com and can be found on twitter at @jessefischer33.
My "Longevity" draft model optimizes for long-term value, defined as a player's maximum five-year "Value over Replacement Player" (VORP), which is based on the stat BPM. VORP accounts for playing time, allowing injuries, durability and coaching preferences to be factored in, which is important when measuring longevity. To account for players who are still active (and, most importantly, players who haven't hit their five-year peak yet), I have a separate "Predicted VORP" model (based on age, VORP trajectory, playing-time trajectory, maximum single-season VORP, etc.) that predicts a player's maximum five-year value from his career thus far.
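One plausible reading of the "max five year VORP" target is the best total over any five consecutive seasons of a career. A minimal sketch under that assumption, with illustrative VORP values:

```python
def max_five_year_vorp(vorp_by_season):
    """Best sum over any five consecutive seasons of a career.

    Sketch of the 'long-term value' target; a short career simply
    sums whatever seasons exist.
    """
    if len(vorp_by_season) < 5:
        return sum(vorp_by_season)
    return max(sum(vorp_by_season[i:i + 5])
               for i in range(len(vorp_by_season) - 4))

# A career of [0.5, 1.2, 2.0, 3.1, 2.8, 2.5, 1.0] peaks on the
# window covering seasons 2-6.
```

Because VORP already scales with minutes played, a durable starter naturally outscores an equally efficient player who misses seasons, which is exactly the longevity effect the model wants to capture.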
The actual "Longevity" model is built on public data: college stats (multiple years), team stats, NBA Combine data, etc. It also includes actual/expected draft position to factor in real-life scouting rather than relying blindly on the numbers. The data were transformed to account for pace, competition, playing time and teammate quality, among other things I describe on my blog. The model uses not just the players who were drafted but anyone who had even a remote chance of playing in the NBA (assigning a replacement-player value to those who didn't make it). To automate this pre-filtering, there is an additional "Predicted NBA Player" model that projects the probability of an NCAA player reaching the NBA.
Using this filtered dataset, the "Longevity" model runs as a blend of many different models. The individual models use different machine learning algorithms, all optimized and tuned in different ways. Further, my overall model is not limited to linear relationships like most draft models are. Many details and much discussion are left out of this summary; they can be read about in future posts on my blog: www.tothemean.com.
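Blending many tuned models reduces to combining their per-prospect predictions. The sketch below uses an equal-weight average with made-up model names and outputs; a real blend would tune the weights (and the base learners) rather than average naively:

```python
# Hypothetical outputs of three tuned base models for two prospects.
predictions = {
    "gbm":    {"Prospect A": 4.2, "Prospect B": 1.1},
    "forest": {"Prospect A": 3.8, "Prospect B": 1.9},
    "linear": {"Prospect A": 4.6, "Prospect B": 0.9},
}

def blend(preds):
    """Equal-weight average of each model's prediction per prospect."""
    players = next(iter(preds.values())).keys()
    return {p: sum(model[p] for model in preds.values()) / len(preds)
            for p in players}
```

Averaging diverse models tends to cancel their individual errors, which is the core argument for a blend over any single algorithm.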
My name is Masseffectlenk (@masseffectlenk), and I am a graduate student in bioengineering.
My model is based on a regression using basic box score stats on a pace-adjusted, per-40-minute basis. Rates are used specifically (three-point rate, free throw rate, assist rate, usage rate) as well as height, weight and age. The regressions are informed by nearly a thousand NBA players who have played at least 100 NBA minutes and were drafted after 2005. The stats are mapped to a blend of average offensive and defensive Win Shares/RPM values. Similarity scores are then created from the overall stats and used to determine the weightings for the regression. The most recent model incorporates at-rim shots per 40 minutes and dunk rate for each player over the past three seasons, adding an athletic component; this applies only to NCAA players from this past season. The spreadsheet of data can be found here.
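Using similarity scores as regression weights amounts to a weighted least-squares fit: historical players who resemble the prospect count for more. A minimal sketch with invented features, targets and similarity weights (weighted least squares is equivalent to ordinary least squares on square-root-weighted rows):

```python
import numpy as np

# Historical players: [intercept, 3PT rate, FT rate] -> blended WS/RPM.
X = np.array([[1.0, 0.35, 0.40],
              [1.0, 0.10, 0.55],
              [1.0, 0.25, 0.30],
              [1.0, 0.40, 0.45]])
y = np.array([2.1, 0.4, 1.0, 2.8])    # blended target values
w = np.array([0.9, 0.1, 0.4, 0.8])    # similarity to the prospect

# Weighted least squares via sqrt-weighted rows fed to ordinary lstsq.
sw = np.sqrt(w)
coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
```

The fitted coefficients are then applied to the prospect's own feature row, so each prospect effectively gets a regression tailored to players like him.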
The athletic regression is used specifically to dock players who post deceptively athletic box score stats but lack athleticism otherwise (e.g. Jordan Adams) or elevate those who are more athletic than their box score stats indicate (e.g. Jordan Clarkson, Norman Powell this year).
My name is Daniel Myers (@DSMok1) and I am a structural engineer by trade, born and raised in Oklahoma, but now living in Maine. I have always been a math nerd (and Excel whiz), and started dabbling in advanced sports statistics around 2007. I started posting on the APBRmetrics forum in 2009, and currently am the acting administrator. My focus is to be open with my work and very aware of the limitations and weaknesses of our statistics.
Box Plus/Minus (BPM) is my contribution to this project, but it is not a projection system at all. Rather, it is perhaps the best public metric for measuring actual production at the college level. The ranking published here is simply a ranking by BPM, which evaluates player production per possession. The full derivation and methodology is available at http://www.basketball-reference.com/about/bpm.html.
BPM was developed by regressing advanced box score stats onto long term Regularized Adjusted Plus/Minus (RAPM). This was done using NBA data (no RAPM is available for the NCAA), but the values of each statistic should be valid at the NCAA level as well. BPM is adjusted for context and strength of schedule. Full BPM data for the NCAA are available through Sports Reference / College Basketball's Player Season Finder.
Treat BPM not as a projection of NBA ability but as context for the other models: has the player produced in college? Why would or wouldn't we expect that production to translate to the NBA? If a player produces well in the NCAA as a freshman, that's a great indicator.
If you have enjoyed this article and want to see more about NBA draft models or read about analytics, we encourage you to visit us at the APBRmetrics forum.
Composite Ranking Comparison to DraftExpress Rankings
Table also available in Google Spreadsheet format for sorting purposes and further analysis.
*BPM is not included in the composite ranking
COMP: Simple composite ranking including all 5 models.
LV: Layne Vashro
SS: Steve Shea
NR: Nick Restifo
JF: Jesse Fischer
BPM: Daniel Myers
DX-100: DraftExpress Top-100 Ranking
DIFF: Difference between the composite ranking and the DraftExpress Top-100 ranking (+ = DX is higher on the prospect, - = the model composite is higher).