## Friday, April 13, 2012

### Weighted wins 2

I've been interested in using weighted wins as a statistic for a while. The idea is that by considering the strength of an opponent, an algorithm can handicap the results: a win against a good opponent counts for more than a win against a middling one. Of course, the algorithm has to judge how good an opponent is by how well that opponent did against its own opponents, so the calculation must be based on the whole set of results. That is why I looked at a complete debate tournament and a complete N.F.L. season. The process can continue indefinitely, although usually, after a couple of passes, things settle down and the handicapping doesn't change much with further iterations. This is a kind of Markov chain technique.
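The iteration described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not necessarily the exact calculation used for these posts: the function name `weighted_wins`, the rescaling to an average strength of 1, and the four-team results are all made up for the example.

```python
def weighted_wins(results, tol=1e-9, max_iter=1000):
    """results: list of (winner, loser) pairs.

    Returns a dict mapping each team to a strength score whose average is 1.
    Assumes at least one team keeps a nonzero score on every pass.
    """
    teams = sorted({team for game in results for team in game})
    strength = {team: 1.0 for team in teams}  # start with everyone equal

    for _ in range(max_iter):
        # A win is worth the current strength of the team that was beaten.
        raw = {team: 0.0 for team in teams}
        for winner, loser in results:
            raw[winner] += strength[loser]

        # Rescale so the average strength stays at 1 from pass to pass.
        total = sum(raw.values())
        updated = {team: raw[team] * len(teams) / total for team in teams}

        change = max(abs(updated[t] - strength[t]) for t in teams)
        strength = updated
        if change < tol:  # the handicapping has settled down
            break
    return strength


# Hypothetical round robin among four made-up teams.
results = [("A", "B"), ("A", "C"), ("B", "C"),
           ("B", "D"), ("C", "D"), ("D", "A")]
strength = weighted_wins(results)
for team in sorted(strength, key=strength.get, reverse=True):
    print(team, round(strength[team], 3))
```

In this toy schedule A and B both have two wins, but A finishes ahead because its wins came against stronger opponents. Likewise D, whose only win is over A, finishes ahead of C, whose only win is over D.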

This depends on the assumption that a team has a single, invariant strength. Clearly, this is suspect for debate (differing strengths on the affirmative and negative sides) and the other N.F.L. (the defense and offense are, literally, two different teams). Is it possible to use the same idea but adapt it to recognize the split strengths?

I input total offensive yards from each 2011 N.F.L. game. A lot of yards against a weak defense is good; a lot of yards against a strong defense is better. By using only this information -- offensive yards for each team vs. each opponent -- a few matrix operations yielded a weighted offensive yards statistic for each team's offense and a defensive strength score for each team's defense:

For example, against the "average" defense, the Saints would have earned 460 yards. The Steelers defense would have cut that to 82%, to 377 yards. Or so say these statistics. They never actually played.
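One way to get numbers of this shape is to assume that yards in a game are roughly an offense's "yards against an average defense" multiplied by the opposing defense's factor (1.0 meaning average), and then to alternate between re-estimating the two. The sketch below does that in plain Python. The matrix operations actually used for this post are not spelled out, so the multiplicative model, the function name `split_strengths`, and the three-team yardage are all assumptions made for illustration.

```python
def split_strengths(games, iterations=100):
    """games: list of (offense_team, defense_team, yards) tuples.

    Returns (off, dfn): yards each offense would gain against an average
    defense, and the fraction of those yards each defense allows.
    Assumes every team appears at least once on each side of the ball.
    """
    teams = sorted({t for o, d, _ in games for t in (o, d)})
    off = {t: 0.0 for t in teams}
    dfn = {t: 1.0 for t in teams}  # start by treating every defense as average

    for _ in range(iterations):
        # Offense: average yards after dividing out each opposing defense.
        for t in teams:
            adjusted = [y / dfn[d] for o, d, y in games if o == t]
            off[t] = sum(adjusted) / len(adjusted)

        # Defense: average share of the offense's usual yards it allowed.
        for t in teams:
            shares = [y / off[o] for o, d, y in games if d == t]
            dfn[t] = sum(shares) / len(shares)

        # Pin the average defense at 1.0, leaving off * dfn unchanged.
        mean_d = sum(dfn.values()) / len(teams)
        for t in teams:
            dfn[t] /= mean_d
            off[t] *= mean_d
    return off, dfn


# Made-up yardage for three hypothetical teams (not 2011 N.F.L. data).
games = [("A", "B", 400), ("A", "C", 480),
         ("B", "A", 280), ("B", "C", 420),
         ("C", "A", 240), ("C", "B", 300)]
off, dfn = split_strengths(games)

# Predicted yards for any pairing, whether or not the teams actually met.
predicted = off["A"] * dfn["B"]
print(round(predicted))
```

The Saints and Steelers figure above is the same kind of product: 460 yards times a defensive factor of 0.82 gives about 377 yards.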

How well did these statistics work? Well, as predictions, terribly. But compared to the rankings of reputable sports statisticians, fairly well. This is a comparison of my rank versus Football Outsiders' rank (before the playoffs began) [offense on left, defense on right]:

On offense, the difference was on average 3 ranks. On defense, the difference was on average 5.3 ranks. So, the rankings I generated from nothing other than actual yards in regular season games compared moderately closely to the ranks based on a complex calculation they call the DVOA (Defense-adjusted Value Over Average).
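For anyone repeating the comparison, an "average difference in ranks" can be computed as the mean absolute gap between two rankings. That reading of the figures above is an assumption, and the function name and the four-team rankings below are invented for the example.

```python
def mean_rank_difference(ranks_a, ranks_b):
    """Average absolute gap in rank over the teams present in both rankings."""
    teams = ranks_a.keys() & ranks_b.keys()
    return sum(abs(ranks_a[t] - ranks_b[t]) for t in teams) / len(teams)


# Hypothetical rankings for four made-up teams.
mine = {"A": 1, "B": 2, "C": 3, "D": 4}
theirs = {"A": 2, "B": 1, "C": 3, "D": 4}
gap = mean_rank_difference(mine, theirs)
print(gap)
```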

The same idea would work for debate tournaments: affirmative strength and negative strength could be treated as two separate variables for each team.