Demystifying Clutch: The Real Factors That Determine The Best Performers

This article is a collaborative effort by Elijah Emery and Tieran Alexander.


Clutch is a Myth

Clutch hitting is a divisive topic. There have been countless articles on the subject, most of which end with the same conclusions. Clutch is luck. Clutch is not predictive. Clutch isn’t real. Teams don’t believe in clutch, so why should we? The typical anti-clutch arguments are based on wRC+ and how the metric generally isn’t stable year-to-year in leverage splits.


When people do consider clutch real, it’s former players who believe in it. Loosely backed by arousal theory, their argument usually boils down to mental toughness, or the “it” factor. Analysts often dismiss these arguments as silly: ‘Every player has to perform in the bright lights, so why are some better if it’s mental?’ ‘Why can’t Reggie Jackson hit like he does in October all the time?’ ‘If it’s mental, then couldn’t they force it all the time?’ The popular arguments surrounding clutch have historically been more subjective than objective.


What Actually Determines Clutch

We believe that both perspectives are looking at statistical clutch incorrectly. Clutch is not a result of mental acumen, but rather a result of how the offense is structured. The documented struggles with proving clutch’s sustainability are a result of flawed evaluation. Using wRC+ changes based on situation isn’t necessarily the most accurate available method of quantifying performance in high leverage, because of contextual differences between normal plate appearances and those taken in high leverage. Using OBP changes doesn’t capture clutch either, because outcomes differ in value beyond whether or not the batter reached base. Using win probability added is more logically sound, but it is dependent on the leverage index.


The problem with WPA and the “clutch” stat you see on FanGraphs is that it’s inherently biased more towards the inning than the situation. The first inning, with the bases loaded and no outs, is considered a medium leverage situation. The bottom of the ninth with no outs and nobody on, down by two, has the exact same leverage index. Those two situations are completely different but valued the same by the leverage-index-based clutch score.


Additionally, with most high leverage situations coming in the late innings, they usually come against a team’s best pitchers. The league OPS has never been higher in high leverage than in low leverage. The league as a whole has “-25 clutch” because everyone hits worse in situations labeled as high leverage, a product of the circumstances in which those plate appearances are taken.


José Abreu is the perfect example of the flaws of the leverage index. In his career, Abreu has a .898 OPS with runners on compared to a .827 OPS with the bases empty. So surely he would be considered extremely clutch? That is not the case according to the clutch score, as Abreu is at -3.14 in his career. Why? Abreu has just a .733 OPS in high leverage situations, because of how his offensive profile matches up against higher velocity relievers. This is not some small-sample fluke: Abreu has 2551 career plate appearances with runners on base. He is genuinely better with runners on, but leverage splits don’t recognize the value of that.


The correct way of evaluating clutch hitting is to look at runs scored. That is the entire meaning behind being “clutch”: driving in runs when you have the opportunity to do so, and when it matters most. Thinking about it logically, why would we not look at that? Walking with a runner on second is not clutch; it’s passing the baton to the next player. You don’t increase the odds of scoring exactly one run at all by walking. You increase the odds of scoring multiple, but that’s not the same thing.


Three-Run Homers Win Games

The primary objective of offense in baseball is to score runs for your team. More runs are always better, as they positively influence win probability. This exemplifies the importance of understanding that different opportunities contain inherently varied sub-metas, compared to the oversimplification that is situationally-independent statistics. 

The value of any given event in a baseball game is context dependent. For example, a Home Run’s value is determined by the quantity of runners on. Assuming there are no runners on, a homer can only increase a team’s run total by one–no more, no less. With bases empty, the value of a walk or single is much closer to that of a home run than it is with runners on.
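As a minimal sketch of that point, the immediate runs produced by a home run can be written as the batter plus however many runners are on base. The function name here is purely illustrative.

```python
def runs_from_home_run(runners_on: int) -> int:
    """Runs that score on a home run: the batter plus every runner on base."""
    if not 0 <= runners_on <= 3:
        raise ValueError("runners_on must be between 0 and 3")
    return 1 + runners_on


# A solo shot is worth one run; a grand slam is worth four.
for runners in range(4):
    print(f"{runners} on: home run scores {runs_from_home_run(runners)}")
```

A bases-empty walk or single scores nothing immediately, so the gap between those events and a homer is one run with nobody on and grows with every added baserunner.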


The majority of runs scored come with runners on, so despite popular belief, slugging percentage is more influential on scoring runs than OBP. The ability to get on base plays a prominent role in setting the table for high-value slugging opportunities, but slugging’s allure is that it produces a greater quantity of runs with runners on. Since the turn of the 21st century, slugging percentage has had a significantly higher correlation to team runs scored than OBP.

The Worst Luck Ever

This has not always been the case. From 1999-2002, when Tom Tango was designing wOBA, OBP had a higher correlation to runs scored than slugging percentage. A four year base is usually stable enough to build a results based stat, but this four-year window is a historical outlier.


From 1998-2001, OBP was more desirable than slugging. In 1997, slugging was more valuable. In 2002, slugging was more valuable. In the deadball era, OBP and slugging had the exact same correlation to runs scored. From 1961-1997, slugging was more valuable. From 2002-2022, slugging led to more runs. The only other time we see OBP matter more is during the live ball era. Slugging is more valuable than OBP 99% of the time. 


During the four-year window that formed the basis of wOBA, wOBA has a significantly higher correlation to runs scored than OPS. Even the adjusted OPS with OBP worth 1.8x more than slugging has a higher correlation to runs than OPS. Since 2002, OPS has mattered more. Before 1999, OPS mattered more. Slugging percentage is and almost always has been the most important part of scoring runs. We aren't the first people to notice this. Baseball Prospectus found this back in 2018 as well.
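One plausible reading of the “adjusted OPS” mentioned above is a simple weighted sum, 1.8 × OBP + SLG. The sketch below assumes that form, since the exact formula isn't spelled out here, and the team line is hypothetical.

```python
def weighted_ops(obp: float, slg: float, obp_weight: float = 1.8) -> float:
    """OBP-weighted OPS, assuming the form obp_weight * OBP + SLG."""
    return obp_weight * obp + slg


# Hypothetical team line, for illustration only.
team_obp, team_slg = 0.320, 0.420

# With a weight of 1.0 this reduces to ordinary OPS.
print(f"OPS:              {weighted_ops(team_obp, team_slg, obp_weight=1.0):.3f}")
print(f"OBP-weighted OPS: {weighted_ops(team_obp, team_slg):.3f}")
```

Either version can then be correlated against team runs scored to compare how well each tracks offense in a given era.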


Slugging is only getting more important in recent years too. If we isolate this sample to just the juiced ball era (2016-2022), slugging percentage has a higher correlation to team runs scored than even wOBA itself.

Even isolated power has a .705 r², which is nearly as high as OBP at .745. This is a sample size of over one million plate appearances, contained within 210 teams. The results are no small-sample fluke; it's a real trend that should influence how we perceive player value. Slugging is the most important thing for a hitter–not getting on base. Getting runners on is important, but actually scoring them is more important. (wOBAcon is the best metric for evaluating quality of contact, not SLGcon, because it better understands the value difference between a single, double, etc.)
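For readers who want to run this kind of comparison themselves, here is a minimal sketch of computing r² between a team-level stat and team runs scored. The six team-seasons below are made-up placeholders, not the 210-team sample described above, and the helper name is our own.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical (OBP, SLG, runs) team-seasons, for illustration only.
teams = [
    (0.305, 0.380, 640),
    (0.312, 0.395, 680),
    (0.318, 0.410, 720),
    (0.325, 0.405, 715),
    (0.330, 0.440, 790),
    (0.340, 0.430, 780),
]
obp = [t[0] for t in teams]
slg = [t[1] for t in teams]
runs = [t[2] for t in teams]

print(f"OBP vs runs r^2: {r_squared(obp, runs):.3f}")
print(f"SLG vs runs r^2: {r_squared(slg, runs):.3f}")
```

Swapping in real team-season data for any span of years would reproduce the era-by-era comparisons discussed in this section.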


Tailoring Approach to the Situation

Hitting is situational. Slugging percentage means more than anything with runners on. However, the number of RBI chances a team gets has a much higher correlation to team runs scored than the number of solo home runs a team hits. Converting your opportunities to score runs is important. The Braves do that better than anyone. However, the best offense in baseball is the Dodgers, not the Braves, because they have the second most RBI opportunities and the third highest conversion rate (Runs/RBI Opportunities with runners on). The ability to do both is what makes an elite offense.
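The interaction between opportunity volume and conversion rate can be sketched as follows. Both teams are hypothetical, and how an “RBI opportunity” is counted is left as an assumption; the only thing taken from the text is the ratio Runs / RBI Opportunities.

```python
def conversion_rate(runs: int, rbi_opportunities: int) -> float:
    """Runs scored with runners on divided by RBI opportunities."""
    if rbi_opportunities <= 0:
        raise ValueError("rbi_opportunities must be positive")
    return runs / rbi_opportunities


# Hypothetical teams: A converts better, B creates more chances.
team_a = {"runs": 450, "rbi_opportunities": 2500}
team_b = {"runs": 480, "rbi_opportunities": 3000}

for name, team in (("A", team_a), ("B", team_b)):
    rate = conversion_rate(team["runs"], team["rbi_opportunities"])
    print(f"Team {name}: {team['runs']} runs, conversion rate {rate:.3f}")
```

In this toy case Team B converts at a lower rate yet still scores more, which is the sense in which volume and conversion both matter.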


The Cardinals are the perfect example of this. They maximize their baserunners with the bases empty by being extremely patient and working walks. They have the third highest team OBP with bases empty and the highest team slugging with runners on. The Cardinals adjust their approach based on what is more valuable at the time, creating one of the best offenses in baseball this year. 


Teams can go even farther with this, though. The Cardinals’ patient approach can be to their detriment when facing a pitcher like JT Brubaker, who can steal two called strikes with the sinker, then bury a slider for a chase. Failing to adjust to high zone rates was the primary cause of the Cardinals’ 48-inning scoreless streak earlier this year. When the pitcher is throwing more strikes, being aggressive is how you beat them. When the pitcher can’t find the strike zone, taking pitches is the best approach. A hitter’s approach should be flexible.


A defense also needs to adapt to the situation. Outfield DRS yields a higher correlation to SLGcon, but infield DRS correlates more closely to BABIP. Strategically using shifts to maximize OBP/SLG based on the situation should be commonplace. Shallower outfielders take away more singles, while deeper fielders take away more extra base hits. By adjusting the positioning of the outfielders based on game context, teams can more effectively prevent their opponents from scoring.


Fewer Fastballs = Better Hitting

The proposed existence of clutch is tied to an undeniable fact: position players experience a fundamentally altered hitting environment with runners on. This refers not only to the comparative lack of shifts with baserunners, but to the entire makeup of a batter’s experience at the plate, which changes.


With runners on, pitchers favor what they think is their best pitch more than with the bases empty. Throwing strikes is a secondary priority to getting whiffs and weak contact. That is the correct approach. This means that pitchers typically throw a lot more sliders and a lot fewer four-seam fastballs.

It’s worth noting that these changes in pitch usage are misleading. Very few pitchers throw all five pitches listed above, but the figures are based on the league average. For the average pitcher, the real change is likely nearly double what is shown. This uses all pitches (over 650,000 with runners on) from 2020-2022.


You would expect hitters to do worse with the pitcher leaning on their “best pitch” more, but the opposite is true. With runners on base, the league average wOBAcon actually increases from .356 to .368. Walk rate is up 1.1% with runners on, and strikeout rate is down 2.1%. Hitters are significantly better in all three fields with runners on base. But why? How does that make sense when pitchers are being more careful and throwing their best pitch more? Are fastballs getting punished more? What causes such a sharp contrast?


The difference in damage on contact with runners on has nothing to do with the fastball. The four-seam is actually the only pitch that doesn’t get less effective with runners on base. The decrease in strikeouts is more closely tied to a decrease in secondary CSW than the fastball. All pitches are worse with runners on, but the fastball sees the least change in effectiveness. 


To put it simply, throwing a pitch more makes it perform worse. When pitchers are throwing more breaking balls, hitters are more willing to sit on those breaking balls. When hitters sit on a pitch, they whiff less in the strike zone, chase less, and swing more in the zone. When breaking balls are more predictable, hitters sit on them more and can do more damage as a result. 

Effective Guessing

Breaking balls aren’t just more predictable because pitchers are throwing them slightly more (even if still less than the fastball). Secondaries are more predictable because of when pitchers are throwing them. With runners on, fastball usage decreases on the first pitch from 38.4% to 31.1% and slider usage goes up from 16.5% to 20.4%. Because hitters guess slider more often, the average wOBAcon on a first-pitch fastball drops from .405 to .383 with runners on. The average wOBAcon of a first-pitch slider with runners on? Exactly .383. Given that slider CSFW (Called Strike% + Swinging Strike% + Foul%) is lower on the first pitch than the fastball’s, the heater is actually the more effective initial offering with runners on.
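CSFW as defined above is simple to compute from raw pitch outcomes. The sketch below assumes all three rates share total pitches as the denominator, and the pitch counts are invented for illustration.

```python
def csfw_rate(called_strikes: int, swinging_strikes: int,
              fouls: int, total_pitches: int) -> float:
    """Called Strike% + Swinging Strike% + Foul%, all per pitch thrown."""
    if total_pitches <= 0:
        raise ValueError("total_pitches must be positive")
    return (called_strikes + swinging_strikes + fouls) / total_pitches


# Hypothetical first-pitch outcome counts per 1,000 pitches.
fastball = csfw_rate(called_strikes=200, swinging_strikes=80,
                     fouls=220, total_pitches=1000)
slider = csfw_rate(called_strikes=150, swinging_strikes=150,
                   fouls=130, total_pitches=1000)

print(f"Fastball CSFW: {fastball:.1%}")
print(f"Slider CSFW:   {slider:.1%}")
```

When two pitches allow the same damage on contact, the one with the higher CSFW is the one more likely to put the pitcher ahead in the count.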


The first pitch isn’t the only situation where sliders become too predictable with runners on. Fastball usage is down 4.6% and slider usage goes up 3.1% when the batter is ahead in the count. The result? Fastballs have a lower wOBAcon with runners on than sliders do when there are more balls than strikes. This is statistically abnormal because, on average, heaters yield much greater damage on contact than sliders. Fastball wOBAcon is .028 worse than the slider’s in an identical situation but with the bases empty. Given that fastballs also generate strikes at a higher rate, fastballs are more often than not the better pitch when down in the count.


We can observe these effects in other pitches, but it’s most prominent with the slider. In 90% of counts, if the usage is higher with runners on than off, the performance is worse by wOBAcon. If it is lower, the performance is better. Pitch usage means so much more than the public has ever credited it for. This is a league wide sample of over 650,000 pitches. This is not some small sample data fluke, but a real trend. 
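A rough sketch of how that count-by-count check could be run is below. It measures how often the change in usage and the change in wOBAcon point the same way when going from bases empty to runners on. The rows are hypothetical, and the field names are our own rather than any public dataset's schema.

```python
from typing import NamedTuple


class CountSplit(NamedTuple):
    count: str            # ball-strike count, e.g. "1-0"
    usage_empty: float    # pitch usage share, bases empty
    usage_on: float       # pitch usage share, runners on
    wobacon_empty: float  # wOBAcon allowed, bases empty
    wobacon_on: float     # wOBAcon allowed, runners on


def usage_damage_agreement(splits):
    """Share of counts where usage and wOBAcon move in the same direction."""
    # Ignore counts where either number did not change at all.
    comparable = [s for s in splits
                  if s.usage_on != s.usage_empty
                  and s.wobacon_on != s.wobacon_empty]
    if not comparable:
        raise ValueError("no counts with a change in both usage and wOBAcon")
    agree = sum((s.usage_on > s.usage_empty) == (s.wobacon_on > s.wobacon_empty)
                for s in comparable)
    return agree / len(comparable)


# Hypothetical slider splits, for illustration only.
slider_splits = [
    CountSplit("0-0", 0.16, 0.20, 0.360, 0.380),  # more usage, more damage
    CountSplit("1-0", 0.18, 0.21, 0.370, 0.390),  # more usage, more damage
    CountSplit("0-2", 0.30, 0.28, 0.300, 0.290),  # less usage, less damage
    CountSplit("3-1", 0.10, 0.12, 0.400, 0.390),  # more usage, less damage
]

print(f"Agreement: {usage_damage_agreement(slider_splits):.0%}")
```

Three of the four made-up counts agree here; the claim in the text is that the real league-wide figure is about 90%.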

Called Strikes Don’t Work

Strikeouts contribute heavily to suppressing batter performance. Generating non-foul strikes (CSW) is extremely valuable because of its influence on strikeouts, and called strikes contribute more to CSW than swinging strikes. This premise is valuable for understanding why called strikes have a place in baseball: they generate strikeouts.


At the same time, called strikes have much less value when there are baserunners. When bases are empty, the league called strike rate is 17.4%. With runners on, that number dips to 15%. This is because hitters are much more aggressive with baserunners. Conversely, swinging-strike rate increases from 11.9% to 12.5%. Called strikes lose both raw and comparative value in high-risk environments. This is not to say that pitches which generate called strikes are bad, but solely that specific offerings with value derived exclusively from called strikes see significant contextual regression.
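The loss in “raw and comparative value” can be worked through with the four league rates quoted above. This sketch assumes both rates are per pitch thrown, so that they simply add up to CSW; the helper names are ours.

```python
# League rates quoted in the text, in percent of pitches.
called_empty, called_on = 17.4, 15.0
swinging_empty, swinging_on = 11.9, 12.5


def csw(called: float, swinging: float) -> float:
    """Called strike% plus swinging strike% (assumes a shared denominator)."""
    return called + swinging


def called_share(called: float, swinging: float) -> float:
    """Fraction of CSW that comes from called strikes."""
    return called / (called + swinging)


for label, called, swinging in (
    ("Bases empty", called_empty, swinging_empty),
    ("Runners on", called_on, swinging_on),
):
    print(f"{label}: CSW {csw(called, swinging):.1f}%, "
          f"called share {called_share(called, swinging):.1%}")
```

Under that assumption, league CSW falls from about 29.3% to 27.5%, and the called-strike share of it drops from roughly 59% to 55%.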


Understanding the regression of called strike oriented offerings requires one to understand what makes a pitch especially called strike-friendly, which is mostly based on the movement profile. Pitches with large movement separation off of other offerings, like many curveballs, tend to get the most called strikes because they don’t look like strikes at first and convince the batter not to swing. These same pitches are much easier to identify and hit when aggressive, hence the league’s substantial increase in damage against those pitches and corresponding decrease in usage.


There is perhaps no better example of dwindling effectiveness in called strike oriented offerings than the Brewers. Milwaukee ranks highly in OBP-avoidance and K-BB%; additionally, they rank 7th in DRS. However, the Brewers are a bottom third team in strand rate (total run yield divided by total RBI opportunities, accounting for quantity of baserunners). Milwaukee has a league-leading 19.2% CS rate with bases empty and maintains their number 1 ranking with runners on, but the CS rate drops by 2.8 percentage points, to 16.4%. 


With bases empty, Milwaukee’s OBP yield is .295–10th best in baseball–and their SLG yield is .384, tied for 16th. Allowing higher SLG with bases empty isn’t a big deal; something that is, though, is what happens to their home run yield. Milwaukee ranks 5th worst in HR-yield with runners on. This is the result of throwing more pitches susceptible to destructive contact. While their sprays may appear to be fine, that’s because setting thresholds doesn’t represent the nature of dangerous contact and backspin, specifically in a much more aggressive environment such as with runners on. On hard-hit flyballs, Milwaukee yields the greatest totals in both average exit velocity and distance–higher than even the Rockies. With runners on, the Brewers’ weak induced chase rates (29th in breaker chase rate resulting from CS-centric movement profile) lead to unfavorable counts and undesirable outcomes.

Pitchers are Transformers

Justin Verlander was the best pitcher in the AL by ERA from 2018-2019. His FIP was 0.48 runs worse over that time span. Why? Justin Verlander was allowing a lot of home runs, but three-fourths of them were solo home runs. Some would say this is just clusterluck, and he was due for regression to the mean.
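Part of that gap comes from how FIP is built: it charges every home run the same amount regardless of who was on base. The sketch below uses the standard FIP formula with a placeholder constant of 3.10 (the real constant varies by season) and two hypothetical pitchers with identical stat lines. Neither is Verlander's actual line.

```python
def fip(hr: int, bb: int, hbp: int, k: int, ip: float,
        constant: float = 3.10) -> float:
    """Fielding Independent Pitching: (13*HR + 3*(BB+HBP) - 2*K) / IP + constant."""
    if ip <= 0:
        raise ValueError("ip must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def runs_on_homers(homers_by_runs: dict) -> int:
    """Total runs allowed on home runs, given {runs per homer: count}."""
    return sum(runs * count for runs, count in homers_by_runs.items())


# Hypothetical pitchers with the same 32 HR allowed but different base states.
pitcher_a = {1: 24, 2: 6, 3: 2}   # three-fourths solo shots
pitcher_b = {1: 16, 2: 10, 3: 6}  # far more multi-run homers

# Identical made-up peripheral line for both pitchers.
line = dict(bb=45, hbp=5, k=280, ip=215.0)

for name, homers in (("A", pitcher_a), ("B", pitcher_b)):
    total_hr = sum(homers.values())
    print(f"Pitcher {name}: FIP {fip(hr=total_hr, **line):.2f}, "
          f"runs on homers {runs_on_homers(homers)}")
```

Both pitchers get the same FIP, yet the one who gives up mostly solo homers allows noticeably fewer runs on them, so his ERA can sit below his FIP without any luck being involved.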


However, Justin Verlander was allowing so many more solo home runs than multi-run home runs because he fundamentally altered his approach when a runner reached base. With no runners on, Verlander would throw his fastball inside, where it gets the most whiffs. He would throw his breaking balls in the zone more, and use them for called strikes. When runners got on, he flipped the script. The fastball was pounded to the gloveside, and the breaker was thrown out of the zone more often, intending to generate chases.


The in-zone contact is a lot weaker, and less often pulled, off of the gloveside fastball, leading to fewer home runs. Additionally, higher swing rates on breaking balls mean more balls in play off of them and the breaker produces weaker contact. Verlander generated 6% more groundballs and allowed 4% fewer line drives. The entire batted ball profile transforms with runners on to fit the situation. The drawback is slightly more walks and fewer strikeouts, so it’s only a worthwhile approach with runners on when he can’t risk the extra-base hit. 


Justin Verlander is just one of several pitchers who fundamentally alter their approach with runners on. Cal Quantrill doubles his walk rate with runners on by selling out for an 8% boost in groundball rate. Merrill Kelly zones a lot fewer pitches and takes advantage of hitters’ increased tendency to swing with runners on to get weak contact that way at the expense of walks and strikeouts. 


Many other pitchers alter their game plan to prioritize weak contact with runners on base and limit baserunners without runners on. Because OBP matters so much more than slugging with the bases empty, it makes sense to sell out for K-BB% at the expense of contact quality. Because OBP matters only half as much as slugging with runners on, limiting extra-base hits has to be priority number one. Shifting their approach based on the situation allows them to consistently outperform peripherals and escape trouble. At least, this is the case for starting pitchers. Relievers always need to limit contact quality, because one run will often lose the game.

Clutch is real, just not in the way that most people envisioned it. We don’t know if there is a mental factor in clutch hitting or not. What we do know is that certain profiles inherently perform better in clutch situations, and for verifiable reasons. In reality, these bouts between conflicting opinions may have been a short-sighted over-complication of what were simply different perspectives. There is legitimacy to both typical arguments, but each side’s failure to recognize the validity of the other has long prevented a conclusion to something that has always just been common sense.