Editor’s Note: Spring 2024 Baseball Research Journal
This article was written by Cecilia Tan
This article was published in Spring 2024 Baseball Research Journal
We live in the best of times and the worst of times for information. The best of times: I now have vast libraries and thousands of newspapers accessible from home. From my sickbed last year, I not only read up on COVID-19 treatments, I also downloaded archived newspaper accounts of the 2004 Red Sox trophy tour for an essay I wrote in Sox Bid Curse Farewell.
But, the worst of times: misinformation is proliferating endlessly. Some is political disinformation: leaders and candidates spout falsehoods with impunity these days. But some is from another source: generative AI. ChatGPT launched the AI hype in 2022, but news outlets have been using so-called “AI” to generate articles, sports coverage in particular, for more than 10 years. Does your local newspaper website have coverage for every high school and college football game in your area? How can they, with fewer writers than ever before? They’ve been using a tool like Wordsmith (from a company called Automated Insights; AI, get it?). Wordsmith produced 300 million pieces of content in 2013, and by 2014 (ten years ago!) that amount had already jumped to a billion.1 The Associated Press bought into Automated Insights in 2014.2 And AI-generated “news” content keeps growing exponentially.3
If you’re not horrified by this, you should be. What use is access to the world’s newspapers from my home if all I find are robot-written articles, all sourced from the same data feed? A researcher like Herm Krabbenhoft can look at 12 different newspaper accounts of a baseball game in Detroit and determine that Hank Greenberg had an RBI that day, even though the official American League records erred and did not record it. But if a stringer noting game stats today makes a typo, it’ll be reproduced in every story as if it’s the truth. It will become the truth. These are not the droids you’re looking for.
It gets worse. One problem with Large Language Models is that once they ingest too much content generated by AI, their output degrades and their reliability plummets.4 Since the bots themselves can no longer read the Internet, companies like OpenAI have been—unscrupulously and perhaps illegally—training their models on the text of hundreds of thousands of published books.5 (And fanfiction!6)
Which brings us to the downfall of Sports Illustrated. Once the pinnacle of sports journalism in both quality and pay, Sports Illustrated is now functionally dead, with a mass layoff of its writers and staff in January 2024.7 The venerable brand was caught filling its site with AI-written junk, junk so bad that readers caught on immediately that no sane human would have written it.8 Caught with their hand in the cookie jar, the owners decided to smash the jar entirely. Not only does this not make sense as journalism, it doesn’t even make sense as capitalism. Authentic Brands Group (ABG), which licensed the publishing rights to what is now the Arena Group, bought Sports Illustrated only in 2019, for $110 million. Now, they’ve made it worthless.
Turning the Internet into a money machine has meant that any inherent human value in words has been entirely discarded in favor of their dollar value. But information, facts, and knowledge have a value to humans other than a dollar amount! Every word you see here in the BRJ and on the SABR website has been lovingly and painstakingly researched, written, edited, fact-checked, and proofed by an actual human being who felt that the work was worth doing.
If you see the value in that, there are two ways to aid in the effort. One is to donate, of course: SABR is a non-profit! The other is to join the effort and donate your energy or time. Become a researcher, write for the BioProject or the Games Project, or elbow your way up to the table where we editors, proofreaders, and fact-checkers are pushing our pencils. The words we publish add value to our lives, and I feel better and better about that with each passing day.
— Cecilia M. Tan
SABR Publications Director
Notes
1. Lance Ulanoff, “Need to Write 5 Million Stories a Week? Robot Reporters to the Rescue,” Mashable, July 1, 2014, https://mashable.com/archive/robot-reporters-add-data-to-the-five-ws.
2. Jason Abbruzzese, “The Associated Press Turns to Computer Automation,” Mashable, June 30, 2014, https://mashable.com/archive/the-associated-press-turns-to-computer-automation-for-corporate-earnings-stories.
3. Andreas Graefe, “Guide to Automated Journalism,” Columbia Journalism Review, January 7, 2016, https://www.cjr.org/tow_center_reports/guide_to_automated_journalism.php.
4. Ben Lutkevich, “Model Collapse Explained: How Synthetic Training Data Breaks AI,” TechTarget, July 7, 2023, https://www.techtarget.com/whatis/feature/Model-collapse-explained-How-synthetic-training-data-breaks-AI.
5. Blake Brittain, “Pulitzer-winning authors join OpenAI, Microsoft copyright lawsuit,” Reuters, December 20, 2023, https://www.reuters.com/legal/pulitzer-winning-authors-join-openai-microsoft-copyright-lawsuit-2023-12-20/.
6. Rose Eveleth, “The Fanfic Sex Trope That Caught a Plundering AI Redhanded,” Wired, May 15, 2023, https://www.wired.com/story/fanfiction-omegaverse-sex-trope-artificial-intelligence-knotting/.
7. Nicole Kraft, “Mass Layoff Appears to be the End of Sports Illustrated,” Forbes, January 21, 2024, https://www.forbes.com/sites/nicolekraft/2024/01/21/mass-layoff-appears-to-be-the-end-of-sports-illustrated/?sh=50c2176075e5.
8. Maggie Harrison Dupre, “Sports Illustrated Published Articles by Fake, AI-Generated Writers,” Futurism, November 27, 2023, https://futurism.com/sports-illustrated-ai-generated-writers.