Sunday, December 21, 2008

Top Talent-Producing Universities (Revisited)

Thanks to some great discussion over at Baseball Think Factory, I felt it was important to take at least one more look at my post on Top Talent-Producing Universities. As I said in that post, I was only playing around with the numbers, so I didn't do much beyond a couple of simple calculations. But it's obvious that a simple average doesn't do the question justice: it penalizes schools like USC that churn out many major leaguers, most of whom drag down the average, since they can't all be Mark McGwire or Randy Johnson.

One suggestion, then, is to rank the schools solely by Total Win Shares:

| School | # Major Leaguers | Total WS | Notable Alumni |
|---|---|---|---|
| USC | 100 | 4563 | Tom Seaver (388 WS), Mark McGwire (343 WS), Randy Johnson (315 WS), Fred Lynn (280 WS) |
| Arizona St. University | 88 | 4244 | Barry Bonds (714 WS), Reggie Jackson (444 WS), Sal Bando (283 WS) |
| University of Michigan | 72 | 3026 | Charlie Gehringer (383 WS), Barry Larkin (346 WS), George Sisler (292 WS) |
| Saint Mary's College of California | 62 | 2816 | Von Hayes (177 WS), Tom Candiotti (158 WS) |
| UCLA | 63 | 2721 | Jackie Robinson (257 WS), Todd Zeile (221 WS), Troy Glaus (158 WS) |
| University of Texas at Austin | 95 | 2633 | Roger Clemens (440 WS), Burt Hooton (164 WS), Greg Swindell (136 WS) |
| Notre Dame | 67 | 2567 | Carl Yastrzemski (488 WS), Cap Anson (381 WS), Cy Williams (235 WS) |
| California | 50 | 1904 | Jeff Kent (335 WS), Jackie Jensen (187 WS) |
| Mississippi State University | 41 | 1896 | Rafael Palmeiro (395 WS), Will Clark (331 WS) |
| University of Illinois at Urbana-Champaign | 69 | 1876 | Lou Boudreau (277 WS), Tom Haller (179 WS) |

I'm not convinced that this is the best way to look at it, though. It allows quantity to outrank quality, at least in theory. Somehow, we need to account for the schools that have produced the higher-level talent without ignoring the contributions of the lower-level talent they produced.

There were suggestions about taking the top 10% of each school's players and weighting their contributions differently, or taking each alumnus's total Win Shares and weighting them by ranking, and so on. While those would all probably reshape the output into something closer to what we're looking for, I think they may be needlessly complex (plus, they'd make me work a little harder than I really want to right now).

The method I decided to use combines the Average Win Shares shown in the last post with the number of quality players each school has produced. The idea is that schools that produce more high-quality players will be pushed up in the rankings. The formula is this: for schools that have produced 10 or more major leaguers, multiply the average career Win Shares per player by the number of alumni who accrued 100 or more Win Shares in their careers.
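As a rough illustration of that formula, here is a minimal Python sketch. The function names and the list-of-career-totals input are my own assumptions for the example, not the actual spreadsheet or database behind these rankings. The second helper recomputes a score from the summary figures in the tables in this post.

```python
def weighted_average(career_ws, min_players=10, quality_cutoff=100):
    """Average career Win Shares times the number of 100+ WS alumni.

    career_ws: one career Win Shares total per major leaguer from a school
    (a hypothetical input layout). Returns None when the school has fewer
    than min_players major leaguers and so is left out of the ranking.
    """
    n_players = len(career_ws)
    if n_players < min_players:
        return None
    average = sum(career_ws) / n_players
    n_quality = sum(1 for ws in career_ws if ws >= quality_cutoff)
    return average * n_quality


def weighted_average_from_totals(total_ws, n_players, n_quality, min_players=10):
    """Same calculation, starting from a school's summary figures."""
    if n_players < min_players:
        return None
    return total_ws / n_players * n_quality


# USC's figures from this post: 4563 total WS, 100 players, 17 with 100+ WS.
usc = weighted_average_from_totals(4563, 100, 17)
print(round(usc, 1))  # 775.7
```

Because the count of 100+ WS alumni multiplies the average directly, a school with no such alumni scores zero no matter how deep its roster is.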

This method produces a list of schools that seems a little more plausible.

| School | # Major Leaguers | Average WS | # Alumni (100+ Career WS) | Weighted Average | Notable Alumni |
|---|---|---|---|---|---|
| USC | 100 | 45.6 | 17 | 775.7 | Tom Seaver (388 WS), Mark McGwire (343 WS), Randy Johnson (315 WS), Fred Lynn (280 WS) |
| Arizona St. University | 88 | 48.2 | 12 | 578.7 | Barry Bonds (714 WS), Reggie Jackson (444 WS), Sal Bando (283 WS) |
| Saint Mary's College of California | 62 | 45.4 | 11 | 500.0 | Von Hayes (177 WS), Tom Candiotti (158 WS) |
| UCLA | 63 | 43.2 | 8 | 345.5 | Jackie Robinson (257 WS), Todd Zeile (221 WS), Troy Glaus (158 WS) |
| University of Michigan | 72 | 42.0 | 8 | 336.2 | Charlie Gehringer (383 WS), Barry Larkin (346 WS), George Sisler (292 WS) |
| California | 50 | 38.1 | 8 | 304.6 | Jeff Kent (335 WS), Jackie Jensen (187 WS) |
| Notre Dame | 67 | 38.3 | 7 | 268.2 | Carl Yastrzemski (488 WS), Cap Anson (381 WS), Cy Williams (235 WS) |
| San Diego St. | 34 | 51.7 | 5 | 258.7 | Tony Gwynn (398 WS), Graig Nettles (321 WS), Mark Grace (294 WS) |
| University of Minnesota | 31 | 51.3 | 5 | 256.3 | Dave Winfield (415 WS), Paul Molitor (414 WS), Terry Steinbach (173 WS) |
| University of Tennessee | 38 | 42.6 | 6 | 255.6 | Todd Helton (258 WS), Phil Garner (195 WS), Rick Honeycutt (130 WS) |

A few notes:
  • The second method, which tries to account for both quantity and quality, gives a list pretty similar to the first. Maybe this means that we came up with a better, more logical way to rank the universities. However, it might also mean that we fudged the method enough to give us whatever we were looking for originally. I'd like to think it's the former, but we have to recognize that the latter is possible.
  • Schools like Columbia and Cal Poly, which placed highly in the previous post, end up ranking 21st and 23rd, respectively, under this method. That is still higher than I would've expected from my less-than-famous alma mater. (Georgia Tech finishes 12th.)
  • In this ranking, I did my best to account for players who played at more than one school (like Barry Zito, who transferred from UCSB to USC). Players are credited to the school they played at last. However, there are about 60 players in the database who are listed as having played at multiple schools in the same years, so there was no way for me to know which school they played for last. I left them as is; they didn't affect the top of the rankings.
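The "credit the last school" rule from the final note could be sketched roughly as follows. The (player, school, last year) tuple layout and the placeholder names are assumptions made for this example; the real database may be organized differently. Players whose latest year matches at more than one school are only flagged here, since there is no way to break the tie.

```python
from collections import defaultdict


def credit_schools(stints):
    """Assign each player to the school where he played most recently.

    stints: iterable of (player, school, last_year) tuples, a hypothetical
    layout. Returns (credited, ambiguous): a dict mapping player to school,
    and a list of players whose latest year is shared by several schools.
    """
    by_player = defaultdict(list)
    for player, school, last_year in stints:
        by_player[player].append((last_year, school))

    credited, ambiguous = {}, []
    for player, rows in by_player.items():
        latest = max(year for year, _ in rows)
        finalists = sorted({school for year, school in rows if year == latest})
        if len(finalists) == 1:
            credited[player] = finalists[0]
        else:
            # Same final year at several schools: the last one is unknown.
            ambiguous.append(player)
    return credited, ambiguous


# Made-up records for illustration only.
stints = [
    ("Player A", "School X", 1997),
    ("Player A", "School Y", 1999),
    ("Player B", "School X", 2000),
    ("Player B", "School Z", 2000),
]
credited, ambiguous = credit_schools(stints)
print(credited)   # {'Player A': 'School Y'}
print(ambiguous)  # ['Player B']
```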
