Sunday, August 9, 2009

Correlation on draft day?

I was building a data set for an upcoming football draft. What I am going to attempt is a draft heuristic that tells you which position you can gain the most from by drafting it at any particular moment in the draft. So if QBs are a hot commodity this year, it tells you what performance remains on the draft board and whether you will get better performance from drafting a QB or a running back in your draft slot. Analysts will tell you to take the best player on the draft board because you can always trade. Of course, that advice ignores how hard it is to trade and what happens if you end up with 5 RBs, which, depending on your gift of gab, may not be the ideal foundation for making trade propositions. In essence, this heuristic instead shows you the relative outperformance available within each position. So if you can get a QB who is +5 over the median/mean QB but an RB who is +7 over the median/mean RB, you should take the RB. The analysis will also be helpful in making trades to complete your team.
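To make the idea concrete, here is a rough sketch of the heuristic in Python. It is not the actual model, just one way to read the description above: for each position, compare the best player still on the board to the median of what remains at that position, then pick the position with the biggest gap. The function name, the board layout, and all of the projected point values are made up for illustration.

```python
from statistics import median


def best_position_to_draft(board):
    """Return the position with the largest edge over its own median.

    `board` is a hypothetical structure: position -> list of projected
    points for the players still available at that position.
    """
    edges = {}
    for position, projections in board.items():
        if not projections:
            continue  # nobody left at this position
        # Edge = best remaining player minus the median remaining player.
        edges[position] = max(projections) - median(projections)
    best = max(edges, key=edges.get)
    return best, edges


# Invented projections, chosen to mirror the +5 QB vs. +7 RB example.
board = {
    "QB": [25.0, 21.0, 20.0, 19.0, 18.0],
    "RB": [27.0, 22.0, 20.0, 18.0, 15.0],
}
best, edges = best_position_to_draft(board)
print(best, edges)
```

Using the median rather than the mean keeps one superstar from dragging the baseline up, but either would fit the description; that choice is an assumption on my part.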

However, when investigating the data set I came upon an unsettling set of numbers. Here is the first set in graph form.


So what I was seeing was tiers of players. You can see distinctly the upper echelon of players, then another, then another, setting up a power law. A regression puts the R-squared at 72%. So then I wondered about the format in which a draft is set up. The normal format in which I have participated is a serpentine one that goes from 1 to 12 and then 12 to 1. So I ran those numbers below.
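For anyone who wants to try the same comparison, here is a minimal sketch of the serpentine pick order in Python. It is an illustration only and does not use the data set from this post: the power-law curve in `pick_value` and its constants (`scale`, `exponent`) are placeholders I invented, and the function names are hypothetical.

```python
def serpentine_picks(slot, teams=12, rounds=2):
    """Overall pick numbers for a draft slot in a serpentine draft.

    Odd rounds run 1..teams, even rounds run teams..1.
    """
    picks = []
    for rnd in range(1, rounds + 1):
        if rnd % 2 == 1:
            position_in_round = slot
        else:
            position_in_round = teams - slot + 1
        picks.append((rnd - 1) * teams + position_in_round)
    return picks


def pick_value(pick, scale=100.0, exponent=0.5):
    """Placeholder power-law value of a pick (assumed, not fitted)."""
    return scale * pick ** (-exponent)


def two_round_value(slot, teams=12):
    """Total placeholder value of a slot's first two picks."""
    return sum(pick_value(p) for p in serpentine_picks(slot, teams, 2))


print(serpentine_picks(1))   # [1, 24]
print(serpentine_picks(12))  # [12, 13]
for slot in range(1, 13):
    print(slot, round(two_round_value(slot), 1))
```

With a curve this steep at the top, the early slots come out ahead over two rounds; how large the gap is depends entirely on the curve you fit to your own numbers.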

The result definitely shows a two-tiered system of haves and have-nots, with the first 5 draft positions able to parlay the superstars' outperformance against the mitigating lower second-round draft pick. This leads me to believe that auction drafts, as many claim, have fairer outcomes than the serpentine method. Something to consider as draft season approaches. Good luck out there.
