About the project
As part of SABR's mission to preserve and disseminate materials relating to the history of baseball, researchers have undertaken a community project to document the statistical history of professional baseball. The core focus of the project is to compile statistics for each league-season, using the best information available. Each season's statistics are critically examined before publication, and known errors and omissions from Guides and other sources are corrected.
Data coverage and sources
The project uses the league-season (one full season of one league) as the basic unit of statistical compilation. Leagues are scheduled for compilation in reverse chronological order. The current focus of new input and evaluation is the 1989 season.
Once a season's statistics are in electronic form, we review them for errors, both in transcription and in balance. We check whether each team's statistics equal the sum of its players' statistics, and whether totals such as runs and hits balance between batters and pitchers. This process helps ensure data quality and often catches errors in published totals. It is also labor-intensive, which means it takes a while for a season to achieve quality certification in the database. We appreciate your patience as we work to bring you a quality resource.
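To illustrate the kind of balance check described above, here is a minimal sketch in Python. It is not the project's actual tooling. The record layout (dictionaries with a "team" key and stat keys such as "R" and "H") and the function names are assumptions made for this example only. The league-wide check also assumes that every game in the season was played between teams of the same league.

```python
from collections import defaultdict


def check_team_totals(player_rows, team_totals, stats=("R", "H")):
    """Compare summed player lines against published team totals.

    player_rows: list of dicts such as {"team": "A", "R": 3, "H": 5}
    team_totals: dict mapping team -> {"R": ..., "H": ...}
    (Both layouts are hypothetical, chosen for this example.)
    Returns a list of (team, stat, summed, published) for each mismatch.
    """
    sums = defaultdict(lambda: defaultdict(int))
    for row in player_rows:
        for stat in stats:
            sums[row["team"]][stat] += row[stat]

    mismatches = []
    for team, published in team_totals.items():
        for stat in stats:
            if sums[team][stat] != published[stat]:
                mismatches.append((team, stat, sums[team][stat], published[stat]))
    return mismatches


def check_league_balance(batting_totals, pitching_totals, stats=("R", "H")):
    """Compare league-wide batting totals against pitching totals.

    Runs scored by batters should equal runs allowed by pitchers, and
    likewise for hits, when all games are played within the league.
    Returns a list of (stat, batting_sum, pitching_sum) for each imbalance.
    """
    imbalances = []
    for stat in stats:
        batting = sum(team[stat] for team in batting_totals.values())
        pitching = sum(team[stat] for team in pitching_totals.values())
        if batting != pitching:
            imbalances.append((stat, batting, pitching))
    return imbalances
```

A mismatch reported by either function does not say which figure is wrong. It only flags a line that needs to be checked against the sources, whether the cause is a transcription slip or an error in the published totals.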
We use the best information available in compiling statistics for a league. Official league statistics and tabulations published in major guides are used for most leagues. We also build on research done by SABR members and others in correcting and extending those publications.
Information from other sources
For leagues whose full statistics have not yet been compiled and vetted in electronic format, we offer selected statistics for players based on a database compiled by Ed Washuta and donated to SABR in 2007. Due to the sheer size of the task, we regret that changes to the Washuta data, including corrections of statistical errors and the addition of statistics for unlisted players, will only be made in the case of extreme and egregious errors.
Is the database available in (MySQL, CSV, etc.) format?
The statistical history of minor league baseball is very poorly documented. We view most of the statistics we currently display as being provisional, and we anticipate this will be the case for some time. We believe it is unwise to release downloadable datasets which are immature and have not been cross-checked for quality. It is our plan to offer downloads of a full year's worth of statistics (for all leagues) once all leagues in that year have completed the proofing process.
We can run specific queries against the dataset for research projects. This service is available to all SABR members free of charge. Please contact Sean Lahman for information on custom querying of the database.
How can I help?
The development of the database is powered entirely by volunteer effort. With a goal of providing statistics for over 4,000 league-seasons, we need volunteers to compile, cross-check, and verify statistics and to fill in gaps. Much of the work involved is data entry and validation. If you're a SABR member, you can inquire about volunteering by contacting Sean Lahman.