Tuesday, January 4, 2011

The Problem with RPI (and the S-Factor)

Charlie Creme, resident women's bracketologist for ESPN, has published his first bracket projection of the season. His number one seeds are Connecticut, Duke, Tennessee and Baylor, which I would agree are the obvious top four teams in the nation right now. The College Women's Hoops S-Factor, however, does not agree with Creme: it puts six teams in front of Baylor, the #1 team in the nation according to the most recent AP poll. The disagreement is entirely a strength-of-schedule issue, a flaw that appears more in RPI numbers than in real life.

Let's compare Texas A&M with Baylor. Each team has only one loss so far, a close one to an elite team (Duke, Connecticut) on that team's home floor. They've played similar numbers of top-50 teams (Texas A&M: 5, Baylor: 4) and have similar winning percentages in those games.

But humans and computers diverge in two key ways. Humans love Baylor because the Lady Bears have a huge marquee win to their credit (over Tennessee), while Texas A&M's biggest victory has come at the expense of Michigan, a team with much less cachet than the legendary Lady Vols. Computers, however, don't recognize "cachet" unless you program it into them. Computers prefer Texas A&M because the Aggies have played only two games against sub-100-ranked teams, while Baylor has played nine. Baylor's schedule has included great teams and terrible teams, while Texas A&M's schedule has mostly been full of pretty good teams.

Suppose Team A plays 10 mediocre teams and Team B plays 5 great teams and 5 terrible teams. The two will have the same strength of schedule. If both teams win every game, computers would rank them equally. But Team B beat 5 great teams! By a human's analysis, Team B should be ranked much higher than Team A.
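To make the Team A vs. Team B arithmetic concrete, here is a minimal sketch in Python. It assumes strength of schedule is just the plain average of opponent ratings, which is a simplification and not the actual RPI or S-Factor formula. The 0-100 ratings (90 for "great", 50 for "mediocre", 10 for "terrible") and the quality-win cutoff of 80 are made-up numbers for illustration only.

```python
# Hypothetical opponent ratings on a 0-100 scale (illustrative, not real data).
GREAT, MEDIOCRE, TERRIBLE = 90, 50, 10

# Team A: 10 mediocre opponents. Team B: 5 great and 5 terrible opponents.
team_a_opponents = [MEDIOCRE] * 10
team_b_opponents = [GREAT] * 5 + [TERRIBLE] * 5


def average_sos(opponents):
    """Strength of schedule as the plain average of opponent ratings (an assumption)."""
    return sum(opponents) / len(opponents)


def quality_wins(opponents, threshold=80):
    """Count opponents rated at or above the threshold, assuming every game was a win."""
    return sum(1 for rating in opponents if rating >= threshold)


sos_a = average_sos(team_a_opponents)
sos_b = average_sos(team_b_opponents)

print("Team A:", sos_a, "SOS,", quality_wins(team_a_opponents), "quality wins")
print("Team B:", sos_b, "SOS,", quality_wins(team_b_opponents), "quality wins")
```

Under these assumptions both schedules average out to the same number, so an average-based formula cannot tell the two unbeaten teams apart. The quality-win count is the only thing that separates them, and that is the part a human voter notices.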

Now in: (same as yesterday)
Now out: (same as yesterday)

Conferences with multiple bids:
Big East: 10
ACC: 7
Big 12: 7
Big Ten: 6
PAC-10: 5
SEC: 3
Atlantic 10: 2
