MASTER NOTES: 'It was a dark and stormy night...'

About 10 years ago, a BaseballHQ subscriber hired me to write a weekly story about the action in his home league—“The Mugwumps jumped three spots into second place on great weeks by Jermaine Dye and Bronson Arroyo, while poor performances from Jeff Francis and Carlos Zambrano dropped NoNameNeds out of the money…” That sort of thing.

The idea was that the guy would distribute the story every week to his leaguemates, making the league a little more fun and interesting, and possibly helping keep the also-rans a little more involved.

I backed out of the deal after the first few weeks. It started off being kind of fun, but it got to be about as boring a task as I’ve ever had in 40 years as a professional writer. And I once had to write a story about a kid’s pet turtle.

Fast-forward to this season. I have a team in a mixed league, run through CBS Sportsline, with some other fantasy writers and some regular citizens. Every week, the site posts a recap about the league. It's not really comprehensive, just a few anecdotes about what went on for a few of the teams. Recently, the summary included an item about my team. It said:

“Coaches Patrick Davitt and Todd Zola had more than enough production in the home runs category this week. Baseball HQ Radio produced the week's 3rd best performance in the HRs category despite leaving Evan Gattis and his 3 HRs on the bench. No other player riding the pine did better. Baseball HQ Radio moved up from 13th to 10th in the HRs standings.”

My first thought was, “What poor soul has to do this every week, sifting through thousands of leagues looking for interesting tidbits?” About four one-hundredths of a second later, I realized no soul, poor or otherwise, was doing anything.

The story was written under the byline “Fantasy Journalist,” which has a kind of double meaning here: The subject is fantasy. So is the journalist. The story was written by a robot.

Now before you get visions of C-3PO sitting at a keyboard tapping out baseball yarns for fantasy owners, the writing is not being done by a physical robot. It’s being done by software. I’m pretty sure the story templates are written by real people, with software macros “filling in the blanks” by extracting pre-defined statistical nuggets from every team in the thousands of leagues the site administers.

For example, in the story I mentioned above, some person probably noticed that Evan Gattis had three HR despite being benched in a large percentage of leagues, and thought that might make a story item. So he or she wrote a template that says, “Coaches (BLANK) and (BLANK) had more than enough production in the home runs category this week. (TEAMNAME) produced the week's (HR WEEK RANK) best performance in the HRs category despite leaving Evan Gattis and his 3 HRs on the bench. No other player riding the pine did better. (TEAMNAME) moved up from (WEEK2 HR RANK) to (WEEK3 HR RANK) in the HRs standings.”

All that’s left is to instruct the computer to scour the database for teams with Gattis on the bench, fill in the blanks, and, voilà! An entertaining snippet, customized for each league.
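For the curious, here is roughly what that fill-in-the-blanks step might look like in Python. This is only a sketch of the general idea, not the site's actual software, which I have never seen. The field names (`bench_hr`, `week_hr_rank` and so on), the tiny two-team "database," and the second team's coaches are all invented for illustration.

```python
from string import Template

# Hypothetical template modeled on the recap quoted above.
# The placeholder names are invented for this sketch.
GATTIS_ITEM = Template(
    "Coaches $coach1 and $coach2 had more than enough production in the "
    "home runs category this week. $team produced the week's $week_rank best "
    "performance in the HRs category despite leaving $player and his "
    "$player_hr HRs on the bench. No other player riding the pine did better. "
    "$team moved up from $old_rank to $new_rank in the HRs standings."
)


def ordinal(n):
    """Turn 3 into '3rd', 13 into '13th', and so on."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def write_items(teams, player="Evan Gattis"):
    """Return one filled-in story for each team that benched the player."""
    stories = []
    for team in teams:
        bench_hr = team["bench_hr"].get(player)
        if bench_hr is None:
            continue  # player was not on this team's bench; no story
        stories.append(GATTIS_ITEM.substitute(
            coach1=team["coaches"][0],
            coach2=team["coaches"][1],
            team=team["name"],
            week_rank=ordinal(team["week_hr_rank"]),
            player=player,
            player_hr=bench_hr,
            old_rank=ordinal(team["old_hr_rank"]),
            new_rank=ordinal(team["new_hr_rank"]),
        ))
    return stories


# Made-up stand-in for the league database (assumed structure).
league = [
    {"name": "Baseball HQ Radio", "coaches": ["Patrick Davitt", "Todd Zola"],
     "bench_hr": {"Evan Gattis": 3}, "week_hr_rank": 3,
     "old_hr_rank": 13, "new_hr_rank": 10},
    {"name": "NoNameNeds", "coaches": ["Coach A", "Coach B"],
     "bench_hr": {}, "week_hr_rank": 9,
     "old_hr_rank": 4, "new_hr_rank": 5},
]

stories = write_items(league)
for story in stories:
    print(story)
```

Notice that nothing in the sketch checks whether third-best for the week really counts as “more than enough.” The template says so whether or not it's true.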

Now I know what you might be thinking: Just because the writing was hackneyed and formulaic, with phrases like “riding the pine,” doesn’t mean it was written by a robot. There’s lots of hackneyed and formulaic writing in real human sportswriting, especially on the web.

But there are other clues pointing to the presence of the non-human. Often, the teams cited in the story don’t quite match the point the story is making. The story about our team said we had “more than enough HR production” in the week, but we clearly did not. We were only third for the week in HR and remained in the second division in the HR category for the season. And in the same “news package” that announced our big HR week, a story about another team said in part that the team had zero everything for stats that week. A database scrape that didn’t quite work—and that even the dimmest freelancer would have noticed and fixed.

Still not convinced? The site also has a link to “modify your gender in these stories.” You may choose from a pulldown menu called “Story Gender Pronouns,” offering you the choice of “Male,” “Female” and “Do not designate.” No word on how this has gone over in North Carolina, but you’d think that a real human writer would recognize that the names “Patrick” and “Todd” are pretty obviously male.


I checked my hypothesis, and after further review, the decision on the field is confirmed, and even amplified. There are a lot more stories out there being written by software than you might think. Certainly more than I thought.

The automated writing business has two big players, Automated Insights (AI) and Narrative Science (NS). AI has worked with the Associated Press to produce financial reports from companies’ financial data, allowing the AP to go from writing about 1,200 such reports per year to 12,000, mostly with no human intervention.

AI is owned by the company that also owns STATS LLC, a huge supplier of sports data. And with the success they were having by converting spreadsheets full of financial data into stories, it wasn’t a great leap of the imagination to think about converting spreadsheets full of sports data into stories. And that’s just what they’re doing.

AI has a very successful product that converts Yahoo! fantasy football data into “game stories” about all the head-to-head matchups. The stories run from 500 to 1,000 words, and the software produces them at a rate of about 500 per second, slightly faster than we write stories here. The company is also working with the Associated Press on coverage of several NCAA sports, including Division I baseball.

In a Poynter article, AI CEO Robbie Allen said, “Much like what we did for the AP around earning reports, I think most if not all of sporting events coverage, at least in terms of writing previews of events and recaps, should be automated to some degree.”

Narrative Science, for its part, creates data-based narratives for The Big Ten Network, and powers a feature in an iPhone app called Gamechanger. Youth baseball coaches and parents use the app to score the game, and at the end, the app instantly creates a print-ready article about the game.

The company worked with experienced journalists to help the software find an “angle” for the story, beyond just the score—margin of victory, any comeback, performance by individual stars and any significant or unusual stats.
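I have no idea how Narrative Science actually does this, but a crude rule-based version of “find the angle” could be sketched in a few lines of Python. Everything here is an assumption made up for illustration: the field names, the order of the checks, and especially the thresholds (a three-run deficit for a comeback, eight runs for a blowout, four hits for a star turn).

```python
def pick_angle(game):
    """Return a short label for the most newsworthy angle in a game summary.

    The rules and thresholds are invented for this sketch; a real system
    would presumably weigh many more signals.
    """
    margin = game["winner_runs"] - game["loser_runs"]
    if game["largest_deficit_overcome"] >= 3:
        return "comeback"
    if margin >= 8:
        return "blowout"
    if game["top_player_hits"] >= 4:
        return "star performance"
    if margin == 1:
        return "nail-biter"
    return "plain recap"


# Made-up box score summary for a youth game.
game = {
    "winner_runs": 7,
    "loser_runs": 6,
    "largest_deficit_overcome": 4,
    "top_player_hits": 2,
}
angle = pick_angle(game)
print(angle)
```

The order of the checks is itself an editorial decision: here a comeback outranks a one-run finish, which is the kind of judgment the experienced journalists were presumably hired to supply.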


You might say that these robo-writer software applications will never replace the analytical articles here, much less the great sportswriters like Joe Posnanski. And you’re probably right.

For now.

But most game story sportswriting is pretty formulaic, and a guy doesn’t have to be Joe Posnanski to do it. Almost every story follows a highly predictable model: “(STAR PLAYER) did (NOTEWORTHY STAT) to lead the (WINNING TEAM) to a (SCORE) victory over the (LOSING TEAM) in (LEAGUE) action on (DAY OF GAME).”

Heck, you don’t really even have to be at the game. It’s long enough ago now for me to safely admit this: When I was in journalism school, I was once assigned to cover a local junior hockey game. The trouble is, the game was on a Saturday afternoon, and I had been, ahem, out late on Friday night. Really late. And as game time approached, the cat was still making too much noise clomping across the carpet, and might have used my mouth for a litter box.

So the thought of schlepping halfway across town and sitting through a junior hockey game did not brighten my spirits. Instead, I listened to the game on the radio, noting all the goals and other noteworthy action. I also listened to the brief post-game wrap-up, which included a press conference to get a couple of (very predictable) quotes from the home coach. I wrote the story in about 15 minutes, turned it in on Monday, and got an A+.

(I also got a 100% mark for another 15-minute special about the local real-estate business. Curiously, real estate stories are another field where robo-writing is taking hold.)

Remember, it wasn’t so long ago that we all would have scoffed at the idea of asking a disembodied voice in our phone to choose a restaurant or summon a cab. So maybe we should pay attention to Kris Hammond, the cofounder and Chief Technical Officer of Narrative Science, who has said that software programs could be writing 90% of news articles within 15 years. I’m not sure that’s right, but if I were graduating from high school right now, I wouldn’t be applying to journalism school.

