ROTISSERIE: Self-discovery and the BaseballHQ Pitcher Matchup Tool

There’s an old joke that goes:
Q: “What’s the most dangerous part of a car?”
A: “The nut holding the steering wheel!”

The danger in steering in-season pitching decisions is that the fantasy owner (the nut) can choose to ignore what the data recommends, putting his fantasy teams at risk. This writer attempted to track those decisions and their outcomes during the 2019 season, to learn from mistakes and perhaps use the BaseballHQ Pitcher Matchup Tool more wisely.


Basic Data

The 2019 sample consists of 33 pitchers across five leagues (two AL-only 5x5 with daily transactions, one NL-only weekly 5x5, the HQ-WONK Writers ONly Keeper daily mixed 5x5, and a mixed weekly head-to-head). Due to some Mac/PC Excel version compatibility issues for a cloud-based spreadsheet, it became difficult to keep up with file maintenance, and the logging ended about seven weeks into the season. This left a dataset of 238 pitcher start/sit decisions.

That dataset includes: 

  • The overall and category ratings for those pitchers’ starts from the BaseballHQ Pitcher Matchup Tool (not populated in the off-season),
  • The standard recommendation included in the Daily Matchups report, based on league format and player pool depth (+.22 for the only leagues, +.34 for HQ-WONK, and +.30 for the H2H league),
  • The actual start/sit decision,
  • Some personal notes about why that decision was made (justifying deviation from the recommendation), and
  • The resulting PQS (Pure Quality Start) scoring detail for that particular start.
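The threshold comparison behind the standard recommendation can be sketched in a few lines. This is a minimal illustration, not the Matchup Tool's actual logic: the format labels, the function name, and the choice to treat a rating exactly at the threshold as a "start" are all assumptions; only the three threshold values come from the list above.

```python
# Start/sit thresholds quoted in the article, keyed by hypothetical format labels.
THRESHOLDS = {
    "only": 0.22,     # AL-only and NL-only leagues
    "hq_wonk": 0.34,  # HQ-WONK daily mixed 5x5
    "h2h": 0.30,      # mixed weekly head-to-head
}


def recommend(overall_rating, league_format):
    """Return 'start', 'sit', or 'none' for one pitcher start.

    Assumption: a rating at or above the threshold counts as 'start'.
    A missing rating (pitcher not listed in the tool that day) yields 'none'.
    """
    if overall_rating is None:
        return "none"
    threshold = THRESHOLDS[league_format]
    return "start" if overall_rating >= threshold else "sit"


print(recommend(0.41, "hq_wonk"))  # start
print(recommend(0.25, "h2h"))      # sit
print(recommend(0.25, "only"))     # start
print(recommend(None, "only"))     # none
```

The same +.25 rating is a "start" in an only league and a "sit" in the head-to-head league, which is why the log needs to record the league alongside each decision.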

Two pitchers were owned in three leagues: Yonny Chirinos (RHP, TAM) and Felix Pena (RHP, LAA). Three pitchers were owned in two leagues: Jeremy Hellickson (RHP, FA), Wade Miley (LHP, CIN), and Caleb Smith (LHP, MIA).

With more teams utilizing the “opener” strategy in 2019, the Daily Matchup tool evolved to consider the bulk pitcher rather than a one-inning reliever as the starter. Note that most bulk relievers fail to accumulate more than six innings, and many also fail to meet the five-strikeout criterion in the PQS scoring rubric, so they rarely score higher than a PQS-3. In this sample, there were eight bulk relief appearances, one of which earned a PQS-5 and another a PQS-4 (both thrown by Felix Pena).


Analysis of Outcomes

There are actually multiple layers of outcomes. The first is the personal decision process to accept or deviate from the tool’s recommendation. A look back at the data showed an interesting distribution:

     Recommendation   Decision: start   Decision: sit
     start                        127               0
     none                           7               0
     sit                           50              54

The recommendation here is based on comparing the overall rating for the start with the appropriate median (50th percentile) value for the particular league format. Subjective comments in the Daily Matchups articles were not recorded, but could have pushed a decision away from what the median value alone suggested. As noted in the table, there were seven matchups for which my starting pitcher was not listed in the tool on that day. This is not surprising, particularly in the early weeks of the season, with weather cancellations and rotation shuffling.
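For readers keeping a similar log, the recommendation-by-decision tally is easy to reproduce. The sketch below rebuilds the pairs from the published counts (the original spreadsheet rows are not available here) and computes how often each kind of recommendation was overridden; the helper name `override_rate` is hypothetical.

```python
from collections import Counter

# Rebuild (recommendation, decision) pairs from the published counts;
# the original spreadsheet rows are not reproduced here.
log = (
    [("start", "start")] * 127
    + [("none", "start")] * 7
    + [("sit", "start")] * 50
    + [("sit", "sit")] * 54
)

crosstab = Counter(log)


def override_rate(table, recommendation):
    """Share of a given recommendation where the decision differed from it."""
    total = sum(n for (rec, _), n in table.items() if rec == recommendation)
    overridden = sum(
        n for (rec, dec), n in table.items()
        if rec == recommendation and dec != rec
    )
    return overridden / total if total else 0.0


print(sum(crosstab.values()))                     # 238
print(round(override_rate(crosstab, "sit"), 3))   # 0.481
print(override_rate(crosstab, "start"))           # 0.0
```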

It is interesting, in retrospect, that all of the “start” recommendations were accepted. However, what really catches attention here is the personal decision to deviate from almost half (50 of 104) of the “sit” recommendations. Notes captured sporadically for these starts in a “Rationale” column on the spreadsheet included:

  • "facing struggling offense"
  • "better than opposing pitcher"
  • "no better alternatives on (my) staff"
  • "liked the matchup"
  • "taking a chance"

Let’s face it, we all want to manage our teams. We believe in our analyses and intuitions, and sometimes we want to be speculators rather than spectators. 

Now, here’s where the punishment occurs. How good were those personal decisions, based on their PQS scores?

Recommend/Decision  sample  PQS Avg  DOM%  DIS% 
   Start/Start        127     2.5     32%   30% 
   None/Start           7     2.7     57%   43% 
   Sit/Start           50     2.2     20%   40% 
   Sit/Sit             54     2.2     24%   35% 
   All                238     2.4     29%   35%

The clear message here is that following the Daily Matchup team’s recommendations yields a better result. My own decisions to start pitchers with a “sit” recommendation matched the benched pitchers on PQS average (2.2), but yielded even worse DOM%/DIS% results (20%/40% versus 24%/35%) than the pitchers who were left on the bench!
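As a quick arithmetic check on the table, the group rows can be pooled back into the "All" line, and the percentages converted into approximate counts of dominant and disaster starts. This is only a sketch: the published percentages are rounded, so the backed-out counts are estimates rather than the logged values.

```python
# Published summary rows: (label, sample, PQS avg, DOM share, DIS share)
ROWS = [
    ("Start/Start", 127, 2.5, 0.32, 0.30),
    ("None/Start", 7, 2.7, 0.57, 0.43),
    ("Sit/Start", 50, 2.2, 0.20, 0.40),
    ("Sit/Sit", 54, 2.2, 0.24, 0.35),
]

total = sum(sample for _, sample, _, _, _ in ROWS)

# Sample-weighted PQS average across the four groups.
pooled_avg = sum(sample * avg for _, sample, avg, _, _ in ROWS) / total

# Approximate counts, since the published shares are rounded percentages.
approx = {
    label: (round(sample * dom), round(sample * dis))
    for label, sample, _, dom, dis in ROWS
}

print(total)                 # 238
print(round(pooled_avg, 1))  # 2.4
print(approx["Sit/Start"])   # (10, 20)
print(approx["Sit/Sit"])     # (13, 19)
```

On these estimates, the 50 overridden starts produced roughly 10 dominant and 20 disaster outings, against roughly 13 and 19 for the 54 starts that stayed on the bench.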

Also, as noted earlier, multiple ownership of certain pitchers led to multiple decisions for certain pitcher starts. Normalizing the data to only count unique starts (one per pitcher per day) or unique decisions (one of each decision per pitcher per day) has minimal impact on the results data shown in the table above.
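The two normalizations can be sketched as follows. The rows and pitcher names are invented placeholders, not entries from the 2019 log, and the DOM%/DIS% definitions used (PQS-4 or PQS-5 counts as dominant, PQS-0 or PQS-1 as a disaster) are stated here as an assumption about how those columns were computed.

```python
from statistics import mean

# Hypothetical log rows: (pitcher, date, league, decision, pqs)
log = [
    ("Pitcher A", "2019-04-02", "AL-1", "start", 4),
    ("Pitcher A", "2019-04-02", "AL-2", "start", 4),
    ("Pitcher A", "2019-04-02", "WONK", "sit", 4),
    ("Pitcher B", "2019-04-03", "NL-1", "sit", 1),
    ("Pitcher B", "2019-04-09", "NL-1", "start", 3),
]


def dedupe(rows, key):
    """Keep the first row seen for each key, preserving log order."""
    seen = {}
    for row in rows:
        seen.setdefault(key(row), row)
    return list(seen.values())


def summarize(rows):
    """PQS average plus DOM% (PQS 4-5) and DIS% (PQS 0-1); definitions assumed."""
    scores = [row[4] for row in rows]
    n = len(scores)
    return {
        "sample": n,
        "pqs_avg": round(mean(scores), 1),
        "dom_pct": round(100 * sum(s >= 4 for s in scores) / n),
        "dis_pct": round(100 * sum(s <= 1 for s in scores) / n),
    }


# One row per pitcher per day.
unique_starts = dedupe(log, key=lambda r: (r[0], r[1]))
# One row per distinct decision per pitcher per day.
unique_decisions = dedupe(log, key=lambda r: (r[0], r[1], r[3]))

print(summarize(log))
print(summarize(unique_starts))
print(len(unique_decisions))
```

Running the summary on the raw log and on each deduplicated view side by side is one way to confirm that multi-league ownership is not skewing the results.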


Was it worth it?

Even with a truncated usage sample, there were some obvious learnings:

1) First and foremost, the Daily Matchup Report is a highly effective tool, and the daily recommendations proved their value as shown above. The data shows that following the reports’ “start or sit” recommendations provided more positive and fewer negative results than this writer’s attempts to start certain pitchers when the model said sit. Based on the disparity here, results may have actually been better if the 54 benched starts had been used in place of the 50 that were started against the recommendation.

2) More introspectively... Three of the four roto leagues have penalties for failing to reach the minimum innings, which may have been a small contributing factor to the personal need to override the matchup tool’s recommendations. However, with questionable pitcher readiness immediately following spring training, and greater difficulty projecting actual pitcher starting assignments in the early weeks of the season, it would seem more prudent to soft-pedal innings accumulation at the outset, and add/start more pitchers after rotation schedules and current season performance have both normalized.

3) Be careful about rostering pitchers in multiple leagues, and have strategies for how to deploy them. Three of those multi-league pitchers (Chirinos, Pena, Smith) seemed poised for a 2019 breakout at reasonable prices, particularly with the potential for Chirinos and Pena to swing into the bulk reliever role. That trio rewarded their owner with a combined PQS average of 2.8.

Miley was rostered in two of the daily transaction leagues, with hopes of only using him in favorable matchups. He had eight starts in each of the two leagues (16 start/sit decisions) during the sample, starting for nine of them (PQS average 1.3) and sitting for seven of them (PQS average 1.4). Hellickson was a double reserve dart throw, making only two starts (PQS-3 and PQS-2) while riding the pine for nine others before being released in both leagues. (Note that Miley was only recommended in two of his eight starts, and Hellickson was never recommended.)

4) There’s an article in the BaseballHQ Strategy Library (head over there sometime and poke around, if you haven’t done so lately) about the benefits of keeping a rotisserie journal. Applying that concept and philosophy in this example, even the limited journal entries (start/sit decisions), combined with the data guiding each decision and the resulting outcomes, provide a platform for in-season tracking (if the user can stick with it!), as well as a trove of data for draft preparation.

An off-season alternative would be to download the PQS logs, and link them to downloaded transaction logs from the applicable fantasy league hosting service(s). Then, by using an electronic note-taking system, with comments and rationales indexed by pitcher and date, those personal notes could be linked to the rest of the data. That too would provide good intelligence for draft preparation.
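A rough sketch of that linking step, keyed by pitcher and date, might look like the following. Every field name, note, and row here is a hypothetical stand-in; real PQS and transaction downloads would come from files whose column layouts are not described in this article.

```python
# Hypothetical extracts; real downloads would be read from exported files.
pqs_log = [
    {"pitcher": "Pitcher A", "date": "2019-04-02", "pqs": 4},
    {"pitcher": "Pitcher B", "date": "2019-04-03", "pqs": 1},
]
transactions = [
    {"pitcher": "Pitcher A", "date": "2019-04-02", "league": "AL-1", "decision": "start"},
    {"pitcher": "Pitcher B", "date": "2019-04-03", "league": "NL-1", "decision": "sit"},
    {"pitcher": "Pitcher C", "date": "2019-04-04", "league": "NL-1", "decision": "start"},
]
# Personal notes indexed by (pitcher, date).
notes = {("Pitcher B", "2019-04-03"): "liked the matchup"}


def link(transactions, pqs_log, notes):
    """Attach the PQS result and any note to each start/sit decision.

    Decisions with no matching PQS row keep pqs=None rather than being dropped.
    """
    pqs_by_key = {(r["pitcher"], r["date"]): r["pqs"] for r in pqs_log}
    linked = []
    for t in transactions:
        key = (t["pitcher"], t["date"])
        linked.append({**t, "pqs": pqs_by_key.get(key), "note": notes.get(key, "")})
    return linked


linked = link(transactions, pqs_log, notes)
for row in linked:
    print(row)
```

Keeping unmatched decisions in the output (with an empty PQS) makes gaps in either download visible instead of silently shrinking the sample.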

Regardless of the specific approach, the focus on analytical decision-making, backed with results measurement of those decisions, is how most effective businesses operate, and our fantasy rosters deserve the same.

  For more information about the terms used in this article, see our Glossary Primer.