New grading system

I started tracking and grading college football picks in 2009 while watching College Gameday on ESPN one Saturday morning.  Over the years, I have expanded to tracking 100+ analysts, and I now track picks against the spread, totals and money line picks as well.  I often get feedback about how easy it is to pick straight up winners on college football games.  I disagree, because the top analysts typically pick in the 70-75% range, and the top computer systems are in the same neighborhood according to The Prediction Tracker (https://www.thepredictiontracker.com/ncaaresults.php).  However, that isn’t the reason for this post.

For many years I have had an idea in my mind about how to grade straight up win/loss picks in a more meaningful manner.  There are too many analysts picking too many different games to truly compare apples to apples.  I wanted a way to weight every game played, so that a pick is graded according to the value of that particular matchup.  I tossed around a few different ideas over the years (most of which failed completely), but none ever satisfied me enough to actually implement.

Recently, I had a better idea.  Rather than come up with my own weighting system, I realized there are a lot of really smart people who already did the heavy lifting on the ratings side of things.  For a great overview of different rating systems you can check this College Football rankings page on Sports Betting Dime.

The traditional win percentage model, which I have used in the past, is described below.  Favorites, underdogs, point spreads and ratings have no bearing on the value of a pick: every win counts the same, and so does every loss.

| | Win Value | Loss Value | Tie Value |
| --- | --- | --- | --- |
| Favorite | 1 | 0 | 0.5 |
| Underdog | 1 | 0 | 0.5 |
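
For anyone who wants to play along at home, here is a minimal Python sketch of the traditional model.  The function name and the sample record are my own placeholders for illustration, not real analyst data.

```python
# Traditional model: every pick is worth the same no matter who is playing.
TRADITIONAL_VALUES = {"win": 1.0, "loss": 0.0, "tie": 0.5}


def win_percentage(outcomes):
    """Grade a list of 'win' / 'loss' / 'tie' outcomes the traditional way."""
    if not outcomes:
        return 0.0  # no picks, nothing to grade
    return sum(TRADITIONAL_VALUES[o] for o in outcomes) / len(outcomes)


# Hypothetical record (made up for illustration): three wins and one loss.
print(win_percentage(["win", "win", "loss", "win"]))  # 0.75
```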

On a whim, I took the picks from the 2019 season and I applied Jeff Sagarin’s 2019 Predictor ratings to assign a value for each team and grade matchups accordingly.  This model can be best described as:

| | Win Value | Loss Value | Tie Value |
| --- | --- | --- | --- |
| Favorite (rating1) | +rating2 | -rating1 | 0 |
| Underdog (rating2) | +rating1 | -rating2 | 0 |

2019 Sagarin Predictor Rating Example:

| | Win Value | Loss Value | Tie Value |
| --- | --- | --- | --- |
| Ohio State (105.93) | 67.49 | -105.93 | 0 |
| Northwestern (67.49) | 105.93 | -67.49 | 0 |

Thus if you pick the favorite (Ohio State) and win, you gain Northwestern’s Sagarin point value (+67.49).  If you pick Ohio State and lose, you forfeit your own Sagarin point value (-105.93).  Thus, wins and losses are weighted to a team’s value.  Wins are rewarded.  Losses are punished.
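
The same rule can be written as a few lines of Python.  This is just a sketch of the weighting described above; the function name `weighted_grade` is a placeholder of mine, and the two ratings are the 2019 Sagarin Predictor values from the example.

```python
def weighted_grade(picked_rating, opponent_rating, outcome):
    """Grade one straight-up pick using team ratings as weights.

    A win earns the opponent's rating, a loss forfeits the picked
    team's own rating, and a tie is worth nothing either way.
    """
    if outcome == "win":
        return opponent_rating
    if outcome == "loss":
        return -picked_rating
    if outcome == "tie":
        return 0.0
    raise ValueError(f"unknown outcome: {outcome!r}")


# 2019 Sagarin Predictor ratings from the example above.
OHIO_STATE = 105.93
NORTHWESTERN = 67.49

print(weighted_grade(OHIO_STATE, NORTHWESTERN, "win"))   # 67.49
print(weighted_grade(OHIO_STATE, NORTHWESTERN, "loss"))  # -105.93
print(weighted_grade(NORTHWESTERN, OHIO_STATE, "win"))   # 105.93
```

Notice that correctly picking the underdog is worth more than correctly picking the favorite, and a wrong pick on the favorite costs more than a wrong pick on the underdog.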

When applying these win/loss values across the 2019 season, the visualization is striking.  There is very little, if any, correlation between an analyst's raw win percentage and that analyst's weighted score using the Sagarin Predictor weights.

I took it a step further, and applied the same method to weight each team and individual matchup by using Sagarin Predictor ratings, Bill Connelly’s SP+ ratings, ESPN’s FPI ratings, and Kenneth Massey’s computer ratings.  What I found is that the weighted scores are fairly consistent whether using Sagarin Predictor, SP+, FPI or Massey ratings.  However, the graph still goes crazy when comparing against raw win percentage.

Since they each use different grading scales, I have scaled and standardized the grades across all five columns for comparison in the graph.  Below is the full table of raw grades which are not scaled.
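
One common way to put grades from different scales on the same footing is a z-score: subtract the column's mean and divide by its standard deviation.  The sketch below assumes that approach, which may differ in detail from the scaling used for the graph, and the sample numbers are made up purely for illustration.

```python
from statistics import mean, pstdev


def standardize(grades):
    """Convert one column of grades to z-scores (mean 0, std dev 1)."""
    mu = mean(grades)
    sigma = pstdev(grades)  # population standard deviation
    if sigma == 0:
        # Every analyst has the same grade, so there is no spread to scale.
        return [0.0 for _ in grades]
    return [(g - mu) / sigma for g in grades]


# Hypothetical season totals for three analysts under one rating system.
column = [1200.0, 950.0, 400.0]
print(standardize(column))
```

Running each column (win percentage plus each rating system) through the same function lets them share one axis on a chart.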

My plan starting in fall 2021 is to post grades with the raw win percentages, but also with some weighted grade value.  I can use an average grade for each game synthesized from some combination of SP+, FPI, Sagarin Predictor and Massey ratings.  I will be using the weighted grade as a better measure of “value” on all analysts I track.  These four ratings are updated each week during the season to include the most recent games, so the team ratings and value assigned will be moving and self-corrective as the season moves along.  I expect the visual representation to be much wilder early in the season and flatten out as more games are played.
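
As a rough sketch of what a synthesized grade could look like, the Python below grades one pick under each rating system and averages the results.  The function names, the dictionary layout and the sample ratings are all placeholders of mine, not real ratings.  A plain average also assumes the systems are on comparable scales, so in practice the ratings would likely need to be standardized first.

```python
from statistics import mean


def weighted_grade(picked_rating, opponent_rating, outcome):
    """Win earns the opponent's rating, loss forfeits your own, tie is 0."""
    if outcome == "win":
        return opponent_rating
    if outcome == "loss":
        return -picked_rating
    if outcome == "tie":
        return 0.0
    raise ValueError(f"unknown outcome: {outcome!r}")


def synthesized_grade(picked_ratings, opponent_ratings, outcome):
    """Average the weighted grade over every system that rates both teams."""
    systems = picked_ratings.keys() & opponent_ratings.keys()
    if not systems:
        raise ValueError("no rating system covers both teams")
    grades = [
        weighted_grade(picked_ratings[s], opponent_ratings[s], outcome)
        for s in systems
    ]
    return mean(grades)


# Placeholder ratings for two hypothetical teams (NOT real numbers).
team_a = {"system_1": 100.0, "system_2": 20.0}
team_b = {"system_1": 60.0, "system_2": 10.0}

print(synthesized_grade(team_a, team_b, "win"))   # 35.0
print(synthesized_grade(team_a, team_b, "loss"))  # -60.0
```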

The four rating systems I experimented with are described below.

-SP+ is Bill Connelly’s efficiency metric, which he describes as “a tempo- and opponent-adjusted measure of college football efficiency”.  He says “It is not a résumé tool…It can be used to make predictions, similar to the analytics systems Vegas uses”. It measures five metrics: efficiency, explosiveness, field position, finishing drives, and turnovers. It is currently the sexy metric out there amongst analysts and gamblers.

-According to espn.com, “FPI is a predictive rating system designed to measure team strength and project performance going forward…If Vegas ever published the power rankings it uses to set its lines, they would likely look quite a lot like FPI”.  I like the perspective of oddsmakers and this is as close as I can probably get.

-Sagarin Predictor (Pure Points) rating is based on win/loss and margin of victory.  His traditional BCS rating ignores margin of victory which is why I chose his Predictor (Pure Points) rating.  It covers both FBS and FCS teams, which interests me because I include FCS teams.

-Kenneth Massey describes the purpose of his system as “to order teams based on achievement”, so it is the opposite of FPI, which is predictive. Massey rates everyone from the Power Five down to junior colleges, making it the most inclusive of all the rating systems I can find.

I could use any number of power ratings found out on the internet.  I also think it would be interesting to compare this synthesized rating average against a human poll (the Coaches Poll, CFP committee, or AP poll) to see how different they might be.

What rating systems do you trust the most?  What rating systems are trash?  Why should I use your rating system over others? I’d like to hear feedback in the comments, or hit me up on Twitter if you’d like.