RESEARCH: Do Pitchers Who Are Behind in the Count Get Hit Harder?

When a pitcher is behind in the count, it’s a less than ideal situation. The batter is closer to walking than striking out. Also, the hitter can wait on their pitch to hit. So do hitters make better contact when the pitcher "must throw strikes"?

First, the key is to strip out as many other variables as possible (notably walks and strikeouts) and just get the batted ball effects. For that reason, I analyzed BABIP (batting average on balls in play) and SLGcon (slugging percentage on contact).

Second, to get the Behind, Even, and Ahead percentages, I classified every pitch thrown during the season by the count in which it was thrown. I included first pitches (0-0) in the Even rates.

I should acknowledge a couple of issues before presenting the findings. One is that the count affects a pitcher’s two most important talent factors, his walk and strikeout rates, which the results ignore. The other is that the analysis covers all pitches but only a few batted ball measures; I would not be surprised if certain data subsets are more predictive and impactful than the overall set. This is not the end of the conversation but the start of a longer one.

Using Statcast and PITCHf/x data going back to 2008, I calculated the following values:

  • Ahead%: Counts with ‘Strikes – Balls > 0’
  • Even%: Counts with ‘Strikes – Balls = 0’
  • Behind%: Counts with ‘Strikes – Balls < 0’
  • BABIP: (Hits – Home Runs)/(Balls in Play – Home Runs)
  • SLGcon: (Doubles + 2 x Triples + 3 x Home Runs)/(Balls in Play)
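The definitions above can be sketched in a few lines of Python. This is only an illustration, not the code behind the analysis: the function names and the (balls, strikes) input format are my assumptions, and the SLGcon weights are copied from the formula as written above.

```python
from collections import Counter


def count_state(balls, strikes):
    """Classify a count from the pitcher's side using Strikes - Balls."""
    diff = strikes - balls
    if diff > 0:
        return "ahead"
    if diff < 0:
        return "behind"
    return "even"  # includes the 0-0 first pitch


def count_rates(pitches):
    """Share of pitches thrown in each state.

    `pitches` is an iterable of (balls, strikes) tuples, one per pitch,
    giving the count before the pitch was thrown (assumed input format).
    """
    tally = Counter(count_state(b, s) for b, s in pitches)
    total = sum(tally.values())
    if total == 0:
        raise ValueError("no pitches supplied")
    return {state: tally[state] / total for state in ("ahead", "even", "behind")}


def babip(hits, home_runs, balls_in_play):
    """(Hits - Home Runs) / (Balls in Play - Home Runs)."""
    return (hits - home_runs) / (balls_in_play - home_runs)


def slgcon(doubles, triples, home_runs, balls_in_play):
    """Weights follow the formula as written above; singles carry no weight in it."""
    return (doubles + 2 * triples + 3 * home_runs) / balls_in_play
```

For example, count_rates([(0, 0), (0, 1), (1, 1), (2, 1)]) treats the four pitches as even, ahead, even, and behind.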

The first step (which could effectively have been the last) was to see whether being ahead or behind in the count is predictive. It is. Here are the year-to-year Ahead% and Behind% values with a minimum of 900 pitches (about nine starts) between the two seasons.

An obvious correlation exists from season-to-season. Being ahead or behind is just as much of a predictable trait as strikeouts or groundballs.
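For readers who want to run a similar year-to-year check, here is a rough sketch. The data layout and both function names are hypothetical, and I am assuming the 900-pitch minimum applies to each of the two seasons, which is only one reading of the threshold described above.

```python
import math


def year_pairs(seasons, min_pitches=900):
    """Pair each pitcher season with the same pitcher's following season.

    `seasons` maps (pitcher, year) -> (pitches, rate), a hypothetical layout.
    Assumption: both seasons must reach `min_pitches` to be kept.
    """
    pairs = []
    for (pitcher, year), (pitches, rate) in sorted(seasons.items()):
        following = seasons.get((pitcher, year + 1))
        if following is None:
            continue
        next_pitches, next_rate = following
        if pitches >= min_pitches and next_pitches >= min_pitches:
            pairs.append((rate, next_rate))
    return pairs


def pearson_r(pairs):
    """Pearson correlation between year-one and year-two rates."""
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two paired seasons")
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("rates must vary in both seasons")
    return cov / math.sqrt(var_x * var_y)
```

A value of pearson_r near 1 would indicate that a pitcher's Ahead% or Behind% carries over strongly from one season to the next.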

Then, I found the stabilization point, where half of a pitcher's projection would be his own and the other half would be the league average. The 900-pitch threshold used above wasn’t chosen out of thin air; it is the stabilization point. A starter’s talent for being ahead or behind in the count is known about one-third of the way through a season.
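One common way to apply a stabilization point is to regress the observed rate toward the league average, weighting the pitcher's own data by pitches / (pitches + stabilization point). The sketch below assumes that approach; the function name and the example rates are made up for illustration, and this is not necessarily how the projection in this analysis was built.

```python
def regressed_rate(observed_rate, pitches, league_rate, stabilization=900):
    """Blend a pitcher's observed rate with the league average.

    At `pitches == stabilization`, the weight is exactly one half, matching
    the "half his own, half league average" description. The 900 default
    comes from the article; the blending formula itself is an assumption.
    """
    if pitches < 0:
        raise ValueError("pitches must be non-negative")
    weight = pitches / (pitches + stabilization)
    return weight * observed_rate + (1 - weight) * league_rate


# Hypothetical pitcher: 40% Ahead after 900 pitches, league average 30%.
estimate = regressed_rate(0.40, 900, 0.30)  # roughly 0.35
```

With only 300 pitches, the same pitcher's estimate would sit three-quarters of the way toward the league rate.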

Now knowing that being ahead or behind is a measurable skill, I wanted to see if pitching behind in the count yields better/harder contact than pitching ahead in the count. Among all pitchers with at least 100 pitches (7,536 total samples), here is how they fared in both BABIP and SLGcon when behind vs. even and when ahead vs. even:

With BABIP, it’s about a four- to six-point swing from being ahead to behind (i.e., in the Average row above, 0.0028 minus -0.0027 = 0.0055). The amount is measurable, but not game-changing.

The SLGcon is a little more telling, with over a 100-point swing (i.e., in the Average row again, 0.0642 minus -0.0513 = 0.1155). This measurement is the equivalent of a 100-point change in OPS allowed. It makes obvious sense—pitchers give up harder contact when behind in the count than when ahead—but now there are some numbers behind it.

Next, I tried to put a little more context into what this difference in SLGcon might mean to a pitcher’s fantasy value. From the research, I found [Ahead% – Behind%] to be the best indicator of poor results (specifically on ERA, xFIP, FIP, HR/fb, and HR/9). Details can be found at this link, but since these splits are otherwise not publicly available, I'm transitioning to the ratio [Ahead% / Behind%] for the rest of this analysis.

The two values are analogous to K%-BB% and K/BB, which both compare a pitcher's strikeout and walk rates. The advantage of Ahead% / Behind% is that I found a nice proxy for it that fantasy owners can use going forward.

Using Ahead% / Behind%, just under half of pitchers have a ratio greater than 1.0 (i.e., they are ahead in the count more often than behind) and the rest are behind (minimum 100 pitches). If the sample is filtered to include only those who have thrown at least 900 pitches (again, the stabilization point for Ahead% and Behind%), then over two-thirds of the pitchers have a ratio over 1.0. Simply put, pitchers who throw more strikes are allowed to stay in the league while those who don't are demoted (i.e., survivorship bias). Here is the percentage of pitchers in different bands for reference (minimum 100 pitches).

These numbers aren’t currently available exactly in this form on the web, but Baseball-Reference.com shows how often a pitcher is ahead or behind when a ball is put into play. This stat is a bit different than the one used in this analysis, as I believe that each pitch should contribute to being ahead or behind. For high strikeout pitchers, if they are ahead, they can utilize their swing-and-miss non-fastballs and will rarely have a ball in play. That said, while the values on Baseball-Reference.com are far from perfect, they will at least give a fantasy owner a way to apply the findings to the current player pool. For example, this link to Marcus Stroman shows that he was behind in 223 at-bats and ahead in 204, for an Ahead/Behind ratio of 91%. His per pitch Ahead/Behind ratio using all pitches was 87%—not a perfect proxy, but it’s workable.
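The Stroman proxy calculation above is simple division, sketched here with a hypothetical helper name. The inputs are the Baseball-Reference.com at-bat totals quoted in the text.

```python
def ahead_behind_ratio(ahead, behind):
    """Ratio of times ahead to times behind (at-bats or pitches)."""
    if behind <= 0:
        raise ValueError("behind must be positive")
    return ahead / behind


# Stroman's ball-in-play split quoted above: ahead in 204, behind in 223.
proxy = ahead_behind_ratio(204, 223)
print(f"{proxy:.0%}")  # prints 91%
```

The same function works for the per-pitch version; only the inputs change from at-bat totals to pitch totals.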

In all, I found three Ahead / Behind ranges useful because they are easy to remember and fall at breaks in the data: >1.5, 1.0 to 1.5, and <1.0. I calculated averages of four stats (ERA–FIP, ERA–xFIP, HR/9, and HR/FB) for the three groupings. Here are the results from all the years of data (minimum 900 pitches).
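The grouping step can be sketched as follows. The function names are hypothetical, and the handling of the edges is my assumption: a ratio of exactly 1.0 or exactly 1.5 lands in the middle band here, which the text does not specify.

```python
from collections import defaultdict


def ratio_band(ratio):
    """Assign an Ahead / Behind ratio to one of the three bands.

    Assumption: exactly 1.0 and exactly 1.5 fall in the middle band.
    """
    if ratio > 1.5:
        return ">1.5"
    if ratio >= 1.0:
        return "1.0 to 1.5"
    return "<1.0"


def average_by_band(rows):
    """Mean of a stat (e.g. ERA minus xFIP) within each band.

    `rows` is an iterable of (ratio, stat_value) pairs, one per pitcher season.
    """
    groups = defaultdict(list)
    for ratio, value in rows:
        groups[ratio_band(ratio)].append(value)
    return {band: sum(values) / len(values) for band, values in groups.items()}
```

Calling average_by_band once per stat (ERA–FIP, ERA–xFIP, HR/9, HR/FB) would give the kind of three-row comparison described above.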

These results initially made no sense to me. The lack of a difference between Ahead / Behind sub-groups doesn’t jibe with the earlier results on SLGcon. I went back to see which numbers were off and found that neither set was. The mixed-up results come back to a simple answer: the ever-changing baseball.

When dealing with batted ball data, keep in mind that the run environment has been changing almost every single year. Rather than going all the way back to 2008, here are the combined values from just the past three seasons, when the baseball was juiced to varying degrees.

Those results make much more sense. And, when each season is examined individually, there is an obvious difference in the values. Here are the results from the past three seasons:

2019

2018

2017

The preceding values are a little more of what I expected. Here are some observations:

  1. All the differences seem to be related to home runs allowed and not doubles and triples. FIP (whose inputs are strikeouts, walks, and home runs) and ERA track almost together.
  2. The >1.5 group contains the pitchers to target, with their ERA being a half run lower than their xFIP because they allow fewer home runs. A half-run difference in ERA is huge. Here are the pitchers who had a 1.5 Ahead/Behind ratio or higher last year.

A lot of good pitchers in there. The list contains both starters and relievers. To help factor out some of the other differences between these two pitcher groups, like relievers only seeing a lineup once, I'll continue only with starters. Being ahead in the count can help explain how most of these pitchers have had career ERAs lower than their xFIPs (besides Tanaka).

While some of these pitchers can be at the extreme ends of the batted ball spectrum, which explains some of the ERA suppression, being ahead or behind might explain how they’ve thrived over the years.

The truth is that most of this home run proneness is already accounted for in projections, because projections incorporate home runs per batted ball event (HR/BIP). If a projection starts with HR/BIP, its output will be closer to the end-of-season results. If it instead starts with flyball data and applies a league-average HR/FB% rate, it could under- or over-project some pitchers’ ERAs.

Rarely does a single factor change how everyone should evaluate pitchers. The key for me is that I'm now more skeptical of using in-season xFIP as a small-sample proxy for ERA. xFIP provides a neat and quick way to evaluate a pitcher’s talent without the noise of runs allowed and home runs, but without incorporating a pitcher's Ahead / Behind ratio, it cannot explain as much as a half-run difference between actual ERA and the ERA expected from skills. This got me thinking about how best to evaluate pitchers with a small data sample. There's no single perfect answer. ERA estimators are a useful high-level guide, and strikeouts and walks (K%-BB% or K/BB) are telling but do not include a batted ball component. Maybe now the Ahead / Behind ratio can help round out the story.
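To see why xFIP cannot pick up this effect, it helps to compare the widely published forms of the two estimators: FIP uses actual home runs, while xFIP swaps in fly balls times the league HR/FB rate. The sketch below uses those standard public formulas. The constant varies by season, so the 3.10 default is only a placeholder assumption, and the sample inputs are invented.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """FIP in its commonly published form.

    The constant is set each season so league FIP matches league ERA;
    3.10 here is a placeholder, not a value from the article.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def xfip(fly_balls, league_hr_per_fb, bb, hbp, k, ip, constant=3.10):
    """xFIP: FIP with home runs replaced by fly balls * league HR/FB."""
    expected_hr = fly_balls * league_hr_per_fb
    return fip(expected_hr, bb, hbp, k, ip, constant)


# Invented pitcher who allows 15 HR on 200 fly balls in a 10% HR/FB league.
actual = fip(15, 50, 5, 200, 200)
expected = xfip(200, 0.10, 50, 5, 200, 200)
gap = expected - actual  # positive: xFIP charges him for HR he did not allow
```

Any skill that suppresses home runs per fly ball, such as pitching ahead in the count, shows up in this gap and is invisible to xFIP by construction.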

I need to draw the line at this point, as the preceding was quite a bit to digest (I’m still digesting it myself). The key, for now, is to know that being ahead or behind in the count seems to affect how hard a pitcher gets hit. More needs to be examined on this topic, since I have only scratched the surface. Hopefully the next time I write about it, the subject will be clearer.



  For more information about the terms used in this article, see our Glossary Primer.