Maximum Aardvark


Stat Aggregator

I'm about 75 pages into the 2003 Baseball Prospectus (Cleveland's player profiles), and have gotten exponentially more excited about the upcoming baseball season over the past few days. I woke up this morning, however, with a sense of dread that I wouldn't be able to follow everything as well as I have in the past. I can think of no good reason for this dread, but it's still there, dammit. All the news sites I've come across are bloated and difficult to navigate quickly (especially MLB.com, and ESPN since its MSN redesign). I think I've been spoiled by my recent discovery of the joys of news aggregators. This last thought, naturally, got me thinking: why aren't baseball boxscores syndicated anywhere?

At first, I figured that the boxscore could be entered, in its entirety, into the element of an RSS feed, with (perhaps) a separate channel for each team and an all-encompassing channel to get all major league baseball boxscores. This is a passable solution, and would work with any existing news aggregator, which would be nice. But why stop there? Why not design an application that handles baseball boxscores specifically? From a stathead's perspective, it's a dream, because while complete, exhaustive statistics for Major League Baseball are readily available, many of the minor leagues' sites leave quite a bit to be desired.
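For what it's worth, the plain-RSS version of the idea might look roughly like the sketch below. It assumes RSS 2.0, and it assumes the boxscore text goes into each item's `description` element, which is my guess at the most natural home for it. The function name, the placeholder URL, and the sample score are all made up for illustration.

```python
import xml.etree.ElementTree as ET


def build_team_feed(team, boxscores):
    """Return an RSS 2.0 document (as a string) with one item per boxscore.

    `boxscores` is a list of (title, text) pairs. Both that input shape and
    the choice of <description> for the boxscore text are assumptions.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = f"{team} boxscores"
    # Placeholder URL, not a real feed location.
    ET.SubElement(channel, "link").text = "http://example.com/boxscores"
    ET.SubElement(channel, "description").text = f"Boxscores for {team}"

    for title, text in boxscores:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        # The whole boxscore is carried as plain text in one element.
        ET.SubElement(item, "description").text = text

    return ET.tostring(rss, encoding="unicode")


# Made-up game, purely for illustration.
feed = build_team_feed("Home", [("Home 5, Visitors 3", "Full boxscore text here")])
```

An all-encompassing channel would just be the same call with every team's boxscores passed in. The catch is that an aggregator sees the boxscore as an opaque blob of text, so it can display it but not do anything statistical with it.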

How would such a statistics aggregator receive its data? I can think of no better way than to follow the RSS example and use XML. I spent my shower this morning trying to conceive what a DTD to describe baseball statistics would look like; I spent the first fifteen minutes at work searching to see if someone had already done it. Turns out, Clint Wrede worked on it back in 2000, and actually produced a working DTD. From my (admittedly limited) knowledge of XML, however, a finished document seems to have information scattered throughout it that exists strictly for display purposes. I don't know enough yet to say whether this is a shortcoming of the DTD or its use.
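To make the display-versus-data distinction concrete, here is a rough sketch of what a data-only boxscore might look like and how an application could read it. The element and attribute names (`boxscore`, `team`, `batter`, `ab`, `h`, `rbi`) are invented for this example and are not taken from Clint Wrede's DTD; the players and numbers are made up too.

```python
import xml.etree.ElementTree as ET

# Hypothetical data-only markup: nothing here says how to lay out the page.
SAMPLE = """
<boxscore date="2003-04-01">
  <team name="Home" runs="5">
    <batter name="Player A" ab="4" h="2" rbi="1"/>
    <batter name="Player B" ab="3" h="0" rbi="0"/>
  </team>
  <team name="Visitors" runs="3">
    <batter name="Player C" ab="4" h="1" rbi="2"/>
  </team>
</boxscore>
"""


def parse_boxscore(xml_text):
    """Turn the hypothetical boxscore markup into plain Python data."""
    root = ET.fromstring(xml_text)
    teams = {}
    for team in root.findall("team"):
        teams[team.get("name")] = {
            "runs": int(team.get("runs")),
            "batters": [
                {
                    "name": b.get("name"),
                    "ab": int(b.get("ab")),
                    "h": int(b.get("h")),
                    "rbi": int(b.get("rbi")),
                }
                for b in team.findall("batter")
            ],
        }
    return {"date": root.get("date"), "teams": teams}


game = parse_boxscore(SAMPLE)
```

With the numbers stored as numbers, presentation becomes the application's problem rather than the document's, which is the separation I'd want from any boxscore format.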

In any case, what I really want is an application that will download, store, and display baseball statistics for games as they happen. It should be generic enough that it can work with any provided source (so that, for example, my dad could publish his softball team's boxscores and they would display alongside the latest major league scores). Of course, for any of this to work, there will also need to be real-time providers of this information. Perhaps someday I'll be ambitious and knowledgeable enough to make (at least portions of) this dream a reality; until then, I invoke the LazyWeb to do it for me.
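In case it helps whoever picks this up, the download/store/display loop could be sketched like this. Everything here is an assumption: each source is just a function returning (game id, summary) pairs standing in for a real download, storage is an in-memory SQLite table, and the source names and scores are invented.

```python
import sqlite3


def make_store():
    """Create an in-memory store keyed by (source, game_id)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE games ("
        "source TEXT, game_id TEXT, summary TEXT, "
        "PRIMARY KEY (source, game_id))"
    )
    return db


def poll(db, sources):
    """Fetch from every source and store the results.

    Re-polling overwrites a game's earlier row, so a score that changes
    while the game is in progress replaces the old one.
    """
    for name, fetch in sources.items():
        for game_id, summary in fetch():
            db.execute(
                "INSERT OR REPLACE INTO games VALUES (?, ?, ?)",
                (name, game_id, summary),
            )
    db.commit()


def display(db):
    """Return one line per stored game, all sources side by side."""
    rows = db.execute(
        "SELECT source, summary FROM games ORDER BY source, game_id"
    )
    return [f"[{source}] {summary}" for source, summary in rows]


# Two made-up sources: a big-league feed and a softball team's feed.
sources = {
    "majors": lambda: [("g1", "Home 5, Visitors 3")],
    "dads-softball": lambda: [("g1", "Sluggers 12, Rivals 9")],
}
db = make_store()
poll(db, sources)
lines = display(db)
```

Because the application only knows about sources as interchangeable functions, adding a softball league is no different from adding the majors. The hard part, real-time providers, is still missing.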