Thoughts on the Astros’ Sign Stealing Scandal: How Knowing What’s Coming Alters Batting Performance

A few people have asked me whether I have any thoughts about how knowing what pitch is coming (e.g., in the Astros cheating scandal) changes batting. Not only do I have thoughts – I have data! For those not in the know, the Houston Astros were recently severely punished after they were caught stealing signs. The sign stealing scheme involved a player in the dugout watching a video (from a center field camera) showing the catcher putting down the signs and then banging on a trashcan with a bat to send an auditory signal to the batter about what pitch was coming (typically no bangs=fastball, two bangs=off-speed pitch). You can see it in action here:

Whether the Astros were doing this is no longer up for debate, but what is still unclear is how much, if at all, they benefited from knowing what pitch was coming. While some have provided evidence of no real significant improvements in hitting performance for the 2017 season in question, others have shown clear benefits. Intuitively, one would think that knowing what pitch was coming would be a clear advantage (with some opposing players claiming that it is an even bigger deal than using performance-enhancing drugs), so what is going on here?

Well, we have conducted some studies using the baseball batting virtual environment in my lab that are relevant to this issue. In our research, batters “knew” what pitch was coming in two fundamentally different ways. Although the first set of studies is very different from sign stealing, I want to look at the data from both of these because, as you will see, they produce different effects and the contrast helps to understand what is going on.

Knowing What Pitch is Coming Based on Situational Probabilities

The first way that we have manipulated knowledge about the upcoming pitch in our studies is making the pitches thrown by the simulated pitcher more or less predictable based on their tendencies and/or the game situation. In research parlance we call this information situational probabilities, which refers to the fact that the probability of different events (e.g., a pitcher throwing a fastball) varies as a function of the game situation (e.g., the number of balls and strikes, aka “the count”). It is proposed that through knowledge of the relationship between event probability and game situation in their particular sport, a skilled athlete can reduce event uncertainty by preparing a movement for the highest probability event.
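To make the idea of situational probabilities concrete, here is a minimal sketch of how pitch-type probabilities could be estimated for each count from a log of past pitches, and how the highest probability pitch could then be looked up. The function names and the tiny pitch log are made up for illustration; they are not the data or the software used in our studies.

```python
from collections import Counter, defaultdict


def pitch_probabilities(pitch_log):
    """Estimate P(pitch type | count) from (count, pitch_type) records."""
    tallies = defaultdict(Counter)
    for count, pitch_type in pitch_log:
        tallies[count][pitch_type] += 1
    return {
        count: {pitch: n / sum(tally.values()) for pitch, n in tally.items()}
        for count, tally in tallies.items()
    }


def most_likely_pitch(probabilities, count):
    """Return the highest-probability pitch type for a given count."""
    by_pitch = probabilities[count]
    return max(by_pitch, key=by_pitch.get)


# Hypothetical pitch log: (balls-strikes count, pitch type).
pitch_log = [
    ("3-0", "fastball"), ("3-0", "fastball"),
    ("3-0", "fastball"), ("3-0", "curveball"),
    ("0-2", "curveball"), ("0-2", "curveball"), ("0-2", "fastball"),
]

probs = pitch_probabilities(pitch_log)
print(probs["3-0"])                     # fastball probability on a 3-0 count
print(most_likely_pitch(probs, "0-2"))  # pitch to prepare for on 0-2
```

A batter "preparing a movement for the highest probability event" is, in effect, doing the lookup in `most_likely_pitch` in their head.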

The first example of this type of pitch knowledge I want to look at is a study I published in 2015 which examined the effect of giving batters information about a pitcher’s tendencies in the form of heat maps like the one shown below. These displays gave information about the location and types of pitches thrown most commonly for each count.

Participants in the study (college players) were split into 3 groups: a Full group which received all the information about the pitcher’s tendencies at the start of the study (after a baseline pre-test) and after each at-bat, a Build-Up group which received cumulative, accrued pitcher statistics after every 4 at-bats, and a Control group that did not see any pitch charts. Another way to think of this is that the Full group received all the information about the pitcher right from the get-go while the Build-Up group had to learn it by facing the pitcher. What was found? As shown in the figure below, the first effect that I found was that giving the information to the batters in the Full group after the pre-test had a large, immediate effect on performance (an increase of about 100 points in batting average!). So knowing what is coming through knowledge of a pitcher’s tendencies definitely helps!

But what happened after 40 at-bats facing this simulated pitcher? The Build-Up group caught up to the Full group and showed a similar batting average benefit as compared to the control group.

But now here is where it gets really interesting! To add a twist to the study, I next had batters face a simulated pitcher with completely different tendencies than the one they had been batting against, what we call a transfer condition (Trans in the figures below) in motor learning research. They were not shown any pitch charts for this new pitcher. Here is what happened shown both in terms of batting average and the relative performance for pitches that were high probability (i.e. matched the old pitcher’s tendencies) vs low probability:

Prior to running the study, I hypothesized that learning the first pitcher’s tendencies would hurt performance when switching to a new pitcher for both the Full and Build-Up groups because they would initially make many incorrect pitch predictions when transferring to the new pitcher. The data for the Full group strongly supported this hypothesis: their batting average dropped by about 200 points when switching to the transfer condition and was significantly lower than that of the control group. And, as shown in the bottom figure, they performed particularly poorly for pitches that were high probability for the previous pitcher but not the new one! But surprisingly, there was no evidence of negative transfer for batters in the Build-Up group. They had a significantly higher batting average than the control group in the transfer phase and were not fooled by the formerly high probability pitches like the batters in the Full group were. My post-hoc explanation for this effect was that providing situational probabilities in an accrued manner (like in the Build-Up group) could, through a process of guided discovery, not only give batters information that could be used against the current opponent but also help develop their ability to generate their own event profiles when explicit situational information is not available (i.e., it improved their ability to do pitch recognition and learn to “read” a pitcher on their own). More on this in a bit, but the main message from this first study is: knowing what pitch is coming leads to improved batting performance but the effect also depends on how the information is received by/delivered to the batter.

The second study I want to look at is one that Rouwen Canal Bruland and I published in 2018 in which we looked both at batting performance and swing kinematics. In this study, knowledge of what pitch was coming was varied by altering the probabilities of different pitch types and informing the batter about these probabilities before each at-bat. In the first experiment of the study, there were two pitches (a fastball and curveball) with 3 different values of fastball probability (.8, .65 and .5). We also varied how long the ball was visible (expressed as the point in time during the ball flight at which the ball was occluded in the simulation). There were 3 different occlusion times (50, 100 and 150 ms after pitch release). We hypothesized that (i) batters would hit better when they had a better idea of what was coming (higher fastball probability) and (ii) this effect would be larger when they had less information from the ball flight to recognize the pitch (i.e., shorter occlusion time). As shown in the figure below, the results for batting average were highly consistent with these predictions:

For the 100 ms occlusion time, batting average was roughly 40 points higher for a fastball probability of .8 as compared to a probability of .65 and roughly 80 points higher than for a fastball probability of .5. So consistent with the first study I discussed in this post, better knowledge about what’s coming helps performance! Note that the difference between the fastball probability conditions was larger for the short (50 ms) occlusion time than it was for the long (150 ms) occlusion time. One implication of this finding is: knowing what pitch is coming should have a larger effect for pitchers with higher velocity (the equivalent of a shorter viewing time).
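To illustrate why higher velocity is the equivalent of a shorter viewing time, here is a rough back-of-the-envelope sketch of how long the ball is in the air at different pitch speeds. It assumes a release point about 55 ft from the plate and a constant ball speed with no drag; both are simplifying assumptions of mine and not values from the study, and the function name is hypothetical.

```python
MPH_TO_FPS = 5280 / 3600  # feet per second in one mile per hour


def flight_time_ms(speed_mph, distance_ft=55.0):
    """Approximate ball flight time in ms, assuming constant speed (no drag).

    distance_ft is an assumed release-to-plate distance, not a measured one.
    """
    return distance_ft / (speed_mph * MPH_TO_FPS) * 1000


for speed in (80, 90, 100):
    print(f"{speed} mph: ~{flight_time_ms(speed):.0f} ms of ball flight")
```

Every extra mph shaves time off the flight, so a batter facing a harder thrower has less visual information to work with before committing to a swing, which is the situation where advance knowledge of the pitch should matter most.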

In the next part of the analysis for this study we asked: what did batters do differently when they “knew” what was coming? For this, we looked at the changes in the relative timing of four different stages of the swing: (i) Windup onset time, defined as the instant the batter’s lead foot breaks contact with the ground, (ii) Pre-swing onset time, defined as the instant the batter’s lead foot re-establishes contact with the ground, (iii) Swing onset time, defined as the instant at which downward motion of the bat begins and (iv) Minimum bat height, defined as the instant in time when the bat reached its minimum height above the ground. Interestingly, we found that the effects of pitch probability on these kinematic variables depended strongly on the occlusion time.

As shown in the figure below, when the batter had a longer view of the ball’s flight (i.e., the 150 ms condition), increasing the probability of the fastball appeared to result in them starting the initial, windup stage of the swing slightly earlier and then not using pitch probability to alter the relative timing of any of the later stages. They presumably used visual trajectory information to alter the later stages of their swing (particularly, the swing onset time). This is an effective approach because it allows the batter to be ready for faster pitches (by getting things moving earlier when the probability of a fastball is high) while not leaving themselves unable to react to a lower probability curveball.

For the intermediate 100 ms occlusion time condition, batters exhibited behavior consistent with the strategy of “sitting on a fastball”. Specifically, the early stages of the swing appeared to be unaffected by the pitch probability while the later stages (in particular, the onset of the bat movement) occurred earlier as fastball probability increased. This presumably occurred because the occlusion time was too short to allow batters to alter the later stages of their swing on the basis of visual trajectory information.

Overall, across experiments in the study, batters used a variety of different adjustments to their swing including: (i) altering movement onset time without changing the relative timing of the different stages of the swing, (ii) keeping the initial stage of the movement the same, but instead making adjustments to the later stages, and (iii) varying the swing velocity.

Summary so far: From these and other studies we have done, we definitely see strong evidence that knowing what pitch is coming (based on the use of situational probabilities or tendencies) improves overall batting performance. But exactly how this knowledge is used depends on a lot of factors including how it was gained and the availability of other information sources. How the information is received also seems to be related to the amount of negative transfer that will occur when switching to a new pitcher.

However, being able to predict what pitch is coming based on probabilities is not the same as knowing what is coming because the catcher’s signs have been stolen, so let’s look at some work we have done that more closely resembles that situation.

Knowing What Pitch is Coming Based on a Signal from Sign Stealing

Before the Astros’ banging scheme came along, the most common (perfectly legal) way to steal signs occurred when a runner was on second base. The runner would watch the catcher put down the sign then signal the batter what pitch was coming by changing their posture in some way (e.g., putting their hand up to their heart for a fastball).

Back when we were doing research looking at expertise differences in attentional focus for baseball batters, I actually created a simulation of this situation. I created a virtual runner on 2nd base that would signal the pitch type, with batters instructed beforehand what the signals meant. To add a twist that turned out to be very important, the runner signaled the incorrect pitch 10% of the time. At the time, the question we were really interested in was: how well would batters be able to pick up these subtle signals? We hypothesized that because expert batters have more available attentional resources (and tend to be focused more externally) they should pick these signals up more accurately than novices, and that was indeed what we found. But we did also collect data (which was never published) looking at how the presence of these signals changed performance. For this I want to look at a few different measures of batting performance. When looking at overall batting average, performance was better (by about 25 points) when the signs from the runner were available. This difference was only marginally significant statistically, so there seemed to be a benefit but maybe not as large as we would expect.

To understand this a bit more, I next looked at metrics of swing discipline: OSwing% (the percentage of time the batter swings at pitches outside the strike zone) and ZSwing% (the percentage of swings for pitches inside the strike zone). Here is where the real benefits of the stolen signs appeared. OSwing% was significantly lower and ZSwing% significantly higher when the signs were stolen:
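For readers who want to compute these two plate discipline metrics from their own pitch-by-pitch data, here is a minimal sketch. I am assuming the usual convention that OSwing% is swings at out-of-zone pitches divided by all out-of-zone pitches, and ZSwing% is swings at in-zone pitches divided by all in-zone pitches. The function name and the sample data are made up for illustration and are not the study data.

```python
def swing_rates(pitches):
    """Return (OSwing%, ZSwing%) from (in_zone, swung) boolean pairs."""
    out_total = sum(1 for in_zone, _ in pitches if not in_zone)
    zone_total = sum(1 for in_zone, _ in pitches if in_zone)
    out_swings = sum(1 for in_zone, swung in pitches if not in_zone and swung)
    zone_swings = sum(1 for in_zone, swung in pitches if in_zone and swung)
    # Guard against a sample with no pitches in one of the two regions.
    o_swing = 100 * out_swings / out_total if out_total else float("nan")
    z_swing = 100 * zone_swings / zone_total if zone_total else float("nan")
    return o_swing, z_swing


# Hypothetical sample: 4 pitches in the zone, 4 pitches out of the zone.
pitches = [
    (True, True), (True, True), (True, False), (True, True),
    (False, True), (False, False), (False, False), (False, False),
]

o_swing, z_swing = swing_rates(pitches)
print(f"OSwing% = {o_swing:.1f}, ZSwing% = {z_swing:.1f}")
```

Better swing discipline shows up as a lower first number and a higher second number.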

But here’s where it gets interesting! I next looked at the quality of contact by calculating the %Barrels (defined here as any swing outcome with an exit velocity >85 mph and a launch angle between 26 and 30 deg). For those that don’t know, swings with this combination of velocity and angle have a high probability of resulting in a hit. Here is what I found.
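The %Barrels definition used here can be sketched in a few lines of code. I am assuming the launch angle bounds are inclusive and that the percentage is taken over all recorded batted balls; the function names and sample values are hypothetical and are not the study data.

```python
def is_barrel(exit_velocity_mph, launch_angle_deg):
    """Barrel as defined in this post: >85 mph and 26-30 deg launch angle."""
    # Inclusive angle bounds are an assumption on my part.
    return exit_velocity_mph > 85 and 26 <= launch_angle_deg <= 30


def barrel_rate(batted_balls):
    """Percentage of (exit velocity, launch angle) records that are barrels."""
    if not batted_balls:
        return float("nan")
    barrels = sum(is_barrel(ev, la) for ev, la in batted_balls)
    return 100 * barrels / len(batted_balls)


# Hypothetical batted balls: (exit velocity in mph, launch angle in deg).
batted_balls = [(95, 28), (88, 27), (80, 28), (100, 10)]
print(f"%Barrels = {barrel_rate(batted_balls):.1f}")
```

Note that a ball can be hit very hard and still not count (the 100 mph ball at 10 deg above), which is why this metric captures quality of contact and not just bat speed.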

Batters in my study actually made quality contact more frequently when they did not receive the stolen signs! One last bit of data before I try to make sense of this all. What happened when the runner on second base gave the incorrect sign?

Not surprisingly, there was a huge decline in batting average when the sign was incorrect. Note that, relative to the control group (dashed line), the benefit of getting a correct sign is much smaller in magnitude than the harm produced by getting an incorrect one!
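The asymmetry between the small benefit of a correct sign and the large cost of an incorrect one can be turned into a simple expected value calculation. The sketch below weights the two outcomes by how often the sign is right and solves for the accuracy at which stolen signs stop helping. The batting averages in it are made-up placeholders chosen only to illustrate the arithmetic (they are not the results of my study), and the function names are hypothetical.

```python
def expected_average(accuracy, avg_correct, avg_incorrect):
    """Expected batting average when signs are right `accuracy` of the time."""
    return accuracy * avg_correct + (1 - accuracy) * avg_incorrect


def break_even_accuracy(avg_no_sign, avg_correct, avg_incorrect):
    """Sign accuracy at which the expected average equals the no-sign average."""
    return (avg_no_sign - avg_incorrect) / (avg_correct - avg_incorrect)


# Placeholder values for illustration only (NOT measured data):
AVG_NO_SIGN = 0.300    # no stolen sign available
AVG_CORRECT = 0.330    # sign received and correct (small benefit)
AVG_INCORRECT = 0.150  # sign received but wrong (large penalty)

with_signs = expected_average(0.90, AVG_CORRECT, AVG_INCORRECT)
threshold = break_even_accuracy(AVG_NO_SIGN, AVG_CORRECT, AVG_INCORRECT)
print(f"Expected average at 90% sign accuracy: {with_signs:.3f}")
print(f"Accuracy needed just to break even: {threshold:.1%}")
```

With these placeholder numbers the signs have to be right more than about 83% of the time just to break even, so when the penalty for a wrong sign is much bigger than the benefit of a right one, even a small error rate eats up most of the advantage.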

Summary: So, looking at the data from this unpublished study, we see that knowing what pitch is coming from a runner on 2nd base does improve batting performance. This seems primarily to result from better plate discipline – when the signs were stolen batters swung at more pitches in and fewer pitches out of the strike zone. However, the benefits seem to be offset by two negative effects. First, in the rare cases where the runner gives the incorrect sign (10% of the time in this study), there is a huge penalty on performance. This is not surprising and kind of similar to the “transfer to a new pitcher” condition in the first study I talked about in this post. The second negative effect is much more surprising. When batters were given stolen signs they made worse contact when they did hit the ball! I am not sure why this occurred but I have two related, working hypotheses:
(i) having to look at the runner on second base disrupts the batter’s attentional/gaze shifting behavior and makes them less able to detect advance cues and pick up the release of the pitch and
(ii) being given a stolen sign stops the batter from doing pitch recognition based on advance cues and the early part of the ball’s trajectory. Why would not doing pitch recognition matter when you know the pitch anyways?! Because the same visual information used to recognize the pitch type also specifies the time to contact and crossing location of the pitch. In my study, these varied from pitch to pitch (even within the same pitch type) so just knowing the pitch was a fastball is not enough, for example. You still need to pick up information about where it will be when it gets to the plate in order to hit it hard! So short circuiting this process with stolen signs may actually be harmful.
I hope to look at these hypotheses more in future studies.

*Amazingly, one analysis of the Astros’ 2017 season indicates the sign stealing scheme used essentially recreated the design of the study I just described and replicated the performance results. From the article:
“When they did so, a non-fastball was on the way 93 percent of the time and they were wrong seven percent of the time…” (comparable to my 10% error rate)
“But when the players in the tunnel thought they had cracked the code but it turned out they hadn’t, it harmed the batters at the plate more than the knowing the incoming pitching helped them..” (see the figure above)

Final Conclusions

  • Knowing what pitch is coming does improve batting performance but the specific effects it has (i.e., changes in the batting kinematics) and how much it helps seem to depend on a lot of factors including the time available to view the pitch (which will depend on the pitcher’s velocity) and how the information is gained.
  • Knowledge gained from “natural means” (e.g., learning situational probabilities, using advance cues and picking up information early in the ball flight) which I modeled here seems to be more beneficial to overall batting performance as compared to getting an “artificial” signal based on stolen signs.
  • Receiving stolen signs seems primarily to improve performance by improving plate discipline. Surprisingly, it does not seem to improve (and may even hurt) the ability to actually hit the ball solidly. This latter effect may be due to the batter not being as focused on picking up information about time to contact and crossing location when they receive a stolen sign.
  • Incorrect knowledge (in the form of wrong tendencies or an incorrect stolen sign) has a HUGE negative effect on batting performance that may offset any positive benefits.