Penn State at Indiana 2021

Early RPI Explainer

RPI too early is just that… way too early

by Carl James •@jovian34  • March 16th, 2021

I wasn’t planning on writing about the College Baseball Ratings Percentage Index (RPI) before St. Patrick’s Day, but I’ve seen a lot of chatter on the various interwebs. This chatter, while well-intentioned, betrays a misunderstanding of what the metric can and cannot do.

I find RPI to be a very good metric for comparing baseball teams in a typical season, especially deep into a season. Even though it was developed for basketball decades ago, I think it is even better suited to baseball than it was to basketball, which has since abandoned RPI and embraced the NCAA Evaluation Tool (NET).

RPI is ridiculously simple and relies on a simple set of input data:

    1. Games played between Division I opponents
    2. Location of game (home, away, neutral)
    3. Result of game (win, loss, or tie)

That is it. RPI knows nothing else. It doesn’t care about conference, reputation, margin of victory, team ERA, or team OPS. It sees nothing other than the three points above.

In basketball the NCAA developed NET because RPI was giving what they thought to be an inaccurate picture. Most basketball teams play each other only one time. It was determined that factoring offensive versus defensive efficiency in those games allowed for a better ranking system than simply going off who won the game.

In baseball, most games are part of a three-game series, and in total baseball teams play a lot more games. So win-loss results are more telling. Suppose two teams, A and B, play a three-game series.

There are four possible outcomes:

  1. A sweeps
  2. A wins 2-1
  3. B wins 2-1
  4. B sweeps

Technically there are more possibilities due to ties, but those require an odd set of circumstances to happen that are beyond the scope of this discussion.

Because these are three different games, the distinctions among these results are captured well with simply wins and losses.

So we have the inputs. What about the outputs? This is the RPI formula:

    • 50%: the combined winning percentage of all opponents
    • 25%: the combined winning percentage of all opponents of those opponents
    • 25%: a team’s own winning percentage adjusted for location
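Written as a single expression, the three weights above look like the following. The symbols OWP (combined opponents' winning percentage), OOWP (combined winning percentage of opponents' opponents), and WP_adj (the team's own location-adjusted winning percentage) are shorthand chosen here for illustration and are not official notation.

```latex
\mathrm{RPI} = 0.50 \cdot \mathrm{OWP} + 0.25 \cdot \mathrm{OOWP} + 0.25 \cdot \mathrm{WP}_{\mathrm{adj}}
```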

So take Indiana right now. They have played 8 games:

    Opponent     Opponent Record   Opponents' Opponents Record   Result   Location   Adjusted Result
    Rutgers      4-4               30-34                         Loss     Neutral    1 Loss
    Minnesota    2-6               42-24                         Win      Away       1.3 Wins
    Rutgers      4-4               30-34                         Win      Neutral    1 Win
    Minnesota    2-6               42-24                         Win      Away       1.3 Wins
    Penn State   2-6               48-16                         Win      Home       0.7 Wins
    Penn State   2-6               48-16                         Win      Home       0.7 Wins
    Penn State   2-6               48-16                         Win      Home       0.7 Wins
    Penn State   2-6               48-16                         Win      Home       0.7 Wins
    Totals       20-42             336-180                       7-1                 6.4-1

50% of (20-42 or 0.3226) = 0.1613

25% of (336-180 or 0.6512) = 0.1628

25% of (6.4-1 or 0.8649) = 0.2162

RPI is the sum of all three, 0.5403, which is then compared to the values of the other teams in Division I. When I performed these calculations, 36 Division I teams had a higher RPI value than Indiana, giving IU a rank of 37th out of the 272 teams that have played games.
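For anyone who wants to check the arithmetic, here is a minimal Python sketch that reproduces the Indiana number from the totals above. The function and variable names are made up for illustration. The win weights (0.7 home, 1.3 away, 1.0 neutral) come straight from the table; the loss weights are an assumption that mirrors them, and they do not affect this example because Indiana's only loss was at a neutral site.

```python
# Location weights for wins, as shown in the table above.
WIN_WEIGHT = {"home": 0.7, "away": 1.3, "neutral": 1.0}
# ASSUMPTION: losses are weighted as the mirror image of wins.
LOSS_WEIGHT = {"home": 1.3, "away": 0.7, "neutral": 1.0}


def win_pct(wins, losses):
    """Winning percentage from a (possibly fractional) record."""
    games = wins + losses
    return wins / games if games else 0.0


def adjusted_record(games):
    """Turn (result, location) pairs into location-adjusted wins and losses."""
    wins = sum(WIN_WEIGHT[loc] for result, loc in games if result == "W")
    losses = sum(LOSS_WEIGHT[loc] for result, loc in games if result == "L")
    return wins, losses


def rpi(opp_record, opp_opp_record, adj_record):
    """50% opponents, 25% opponents' opponents, 25% own adjusted record."""
    return (0.50 * win_pct(*opp_record)
            + 0.25 * win_pct(*opp_opp_record)
            + 0.25 * win_pct(*adj_record))


# Indiana's eight games, in the order listed in the table.
indiana_games = [
    ("L", "neutral"), ("W", "away"), ("W", "neutral"), ("W", "away"),
    ("W", "home"), ("W", "home"), ("W", "home"), ("W", "home"),
]

adj = adjusted_record(indiana_games)     # about 6.4 wins, 1.0 losses
# Opponent totals are taken as printed in the Totals row of the table.
indiana_rpi = rpi((20, 42), (336, 180), adj)
print(round(indiana_rpi, 4))             # 0.5403
```

Turning that value into a rank would mean running the same calculation for every Division I team and sorting, which needs the full national schedule and is not attempted here.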

While I give the ranking in this exercise to show how it works, it is not meaningful at this time of the season. One quarter of Indiana’s RPI value is Penn State’s total record, half of which is a sweep at the hands of IU. Every number feeds back into itself, and until there is a lot more diversity in opponents and a much higher game count, RPI fails to show any true meaning and ought to be completely ignored.

RPI is a good metric in May. It is a bad metric in April. It is a laughable joke in March. If we were to take it seriously, we would be talking about whether Indiana would be destined for the Muncie or Terre Haute NCAA Regionals in June. According to the RPI right now that would be happening. We know that won’t be the case in May. 

What is the impact of playing only conference opponents?

An argument can be made that RPI is utterly irrelevant to the Big Ten (B1G) in 2021 because of the lack of non-conference foes. Essentially this is a conference bubble. Conversely, an argument can easily be made that the talent of the B1G is a known entity and that, as a top-quartile RPI conference, it is safe to say that the RPI results are relatively in line with the NCAA as a whole.

The thing to consider is the first two components of the formula, which make up 75% of the result and together form a team’s “Strength of Schedule” (SOS). When all of the teams in the B1G play each other roughly (but not exactly) the same number of times, all of the SOS values for B1G teams are going to end up really close to 0.5000 as the year progresses. The only significant distinction is in the 25% that comes from the team’s own results. Assuming a typical RPI distribution, a team will need roughly 33 wins in 44 games to have an at-large RPI. The home/away adjuster will also be a wash for the B1G by the end of May, as every team has been scheduled for the same number of home and away games.
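To see why the SOS pieces bunch up near 0.5000 in a bubble, here is a rough Python sketch of a made-up closed league. Everything in it is hypothetical: 12 teams, a perfectly balanced 44-game schedule, and invented win totals. The real B1G schedule is not perfectly balanced. It also counts each opponent's full record, the same way the table earlier in this post does, and for the last step it simply assumes both SOS components equal 0.5 and ignores the home/away adjustment.

```python
def opponents_win_pct(wins, team_index, games_per_team):
    """Combined winning percentage of every other team in a balanced league."""
    others = [w for i, w in enumerate(wins) if i != team_index]
    return sum(others) / (len(others) * games_per_team)


def closed_league_rpi(own_win_pct, sos=0.5):
    """ASSUMPTION: both SOS components (75% of RPI) sit at `sos`."""
    return 0.75 * sos + 0.25 * own_win_pct


GAMES = 44
# Hypothetical final win totals for a 12-team closed league.
league_wins = [33, 30, 28, 26, 24, 22, 22, 20, 18, 16, 14, 11]
# Sanity check: in a closed league every game has one winner and one loser.
assert sum(league_wins) == len(league_wins) * GAMES // 2

owp = [opponents_win_pct(league_wins, i, GAMES) for i in range(len(league_wins))]
print(round(min(owp), 4), round(max(owp), 4))   # 0.4773 0.5227

# A 33-11 team under the SOS = 0.5 simplification.
print(closed_league_rpi(33 / 44))               # 0.5625
```

Even the best and worst teams in this toy league have opponents' winning percentages within about 0.023 of 0.5000, so nearly all of the separation between teams comes from the 25% tied to their own record.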

We do not know how the NCAA Selection Committee will weigh RPI for the B1G. It is completely unknown at this point. RPI may be irrelevant. It may be very relevant. In the end, all a team can do is win the games they have in front of them to make a case. The regular season champ will get the automatic bid, and that is the only sure way into June baseball.