Introduction

The MLB enacted several rule changes in the 2022-23 offseason to increase engagement and appeal to a younger audience. Some rule changes, such as introducing the pitch clock and limiting the number of pickoff attempts, were instituted to shorten games. Still, the rule change that most significantly altered overall game strategy was the restriction on infield shifting. The new rule requires fielding teams to position all four infielders on the infield dirt, with two on each side of second base. The primary purpose of eliminating the shift, defined as having at least three infielders on one side of second base, was to increase offense and provide a more action-filled product, as the MLB batting average reached a 55-year low of .243 in 2022. Before the restriction, the analytics teams and coaching staffs of MLB organizations were able to deduce hitters’ tendencies and optimize defensive alignment to minimize the chance that a ball in play, particularly a ground ball, resulted in a base hit.

The 2023 season has provided early indications that the MLB has been successful in increasing offense. According to Baseball Reference, MLB batting average has risen slightly to .248, and runs per game have jumped to 4.58 (as of 8/12/23) compared to an 8-year low of 4.28 in 2022. While each rule change has played a role in increasing the MLB’s product value, we believe eliminating the shift has had the most substantial effect. Teams have historically shifted more often against left-handed hitters than against right-handed hitters because infield positioning is better optimized against lefties. Shifting against righties still requires the first baseman to play close to the first base line to cover the base on a potential ground ball putout, which leaves a much larger gap on the right side of the infield. According to Baseball Savant, teams shifted on 52.6% of plate appearances against left-handed hitters in 2021, far more than the 16.2% shift rate against righties in the same year. For this reason, our analysis focuses on identifying left-handed hitters, and hitting tendencies more generally, that suggest greater offensive success in 2023 now that the defensive restrictions are in place.

Data Description

Our play-by-play data for the 2022 season is sourced from Baseball Savant, where each row is a ball put into play by a left-handed hitter. In gathering the data, we found that Baseball Savant would not allow tens of thousands of pitches to be pulled into a single CSV file. To work around this, we pulled the data one month at a time and set a threshold of at least 50 balls put into play per month. One major limitation of our data is therefore that some players’ full seasons are not included, for example when an injury kept a player from reaching the threshold of 50 in a given calendar month. To account for players who reached the monthly threshold but did not play a full season, we filtered for players with at least 200 balls put into play. Any additional filtering of the data is described within each subsection.
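The month-by-month pull and the 200-ball-in-play filter described above can be sketched as follows. This is a minimal illustration rather than our actual pipeline: it assumes each monthly pull has already been read into a list of dictionaries, and the function name combine_and_filter and the toy rows are hypothetical.

```python
from collections import Counter

def combine_and_filter(monthly_rows, min_bip=200):
    """Concatenate monthly pulls and keep batters with at least min_bip rows.

    monthly_rows is a list of lists; each inner list holds one month of
    balls in play, one dict per ball in play (hypothetical layout).
    """
    all_rows = [row for month in monthly_rows for row in month]
    # Count balls in play per batter across the whole season.
    bip_counts = Counter(row["player_name"] for row in all_rows)
    return [row for row in all_rows if bip_counts[row["player_name"]] >= min_bip]

# Toy data: two "months", with a threshold of 3 standing in for 200.
april = [{"player_name": "A"}, {"player_name": "A"}, {"player_name": "B"}]
may = [{"player_name": "A"}, {"player_name": "B"}]
kept = combine_and_filter([april, may], min_bip=3)
kept_names = sorted({row["player_name"] for row in kept})
print(kept_names)
```

Filtering on the season total after concatenation, rather than month by month, keeps every available row for a batter who clears the season threshold.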

Variables of focus are:

Predicting Hitter Success Without Defensive Shift Alignments

In this section, we analyzed the 20 left-handed hitters who pulled the ball the most in the 2022 MLB season. Our goal was to see how often they were being shifted and how their at-bats differed when they were shifted versus when they were not shifted. With information on both the frequency of facing a shifted defense and the overall difference in BABIP among these batters, we believe we can make predictions on how these batters will fare in the 2023 season when they are exclusively facing unshifted defenses.

To start, we looked at all left-handed batters with at least 200 plate appearances during the 2022 MLB season. Then, we looked at the batters with the highest proportion of pulled balls among this group. The 20 left-handed batters who pull the ball the most are those with a pulled ball proportion of roughly 0.65 or higher. Below is a table that shows these 20 batters sorted by how often they faced a shift last season.

player_name         prop_shifted  prop_pulled
Santana, Carlos     0.972         0.765
Raleigh, Cal        0.952         0.683
Calhoun, Kole       0.937         0.663
Odor, Rougned       0.935         0.657
Hicks, Aaron        0.931         0.663
Ramírez, José       0.931         0.681
Bellinger, Cody     0.917         0.662
Kepler, Max         0.904         0.647
Muncy, Max          0.895         0.654
Santander, Anthony  0.855         0.674
Rutschman, Adley    0.821         0.658
Ruiz, Keibert       0.820         0.709
Tellez, Rowdy       0.785         0.658
Heim, Jonah         0.783         0.705
Melendez, MJ        0.749         0.654
Naylor, Josh        0.749         0.672
Marte, Ketel        0.687         0.662
Grisham, Trent      0.625         0.664
Rengifo, Luis       0.604         0.681
Varsho, Daulton     0.423         0.718

Key:

prop_shifted - Proportion of the batter’s at-bats that came against a shifted defense

prop_pulled - Proportion of balls put into play that were pulled
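As a rough illustration of how the two columns above can be derived from ball-in-play rows, the sketch below groups rows by batter and averages two flags. The boolean fields shifted and pulled are hypothetical names for this example; they are assumed to have been derived beforehand from the raw fielding alignment and spray direction.

```python
from collections import defaultdict

def shift_and_pull_rates(rows):
    """Compute prop_shifted and prop_pulled per batter from ball-in-play rows."""
    totals = defaultdict(lambda: {"n": 0, "shifted": 0, "pulled": 0})
    for row in rows:
        t = totals[row["player_name"]]
        t["n"] += 1
        t["shifted"] += 1 if row["shifted"] else 0
        t["pulled"] += 1 if row["pulled"] else 0
    return {
        name: {
            "prop_shifted": t["shifted"] / t["n"],
            "prop_pulled": t["pulled"] / t["n"],
            "num_obs": t["n"],
        }
        for name, t in totals.items()
    }

# Toy rows for one hypothetical batter: 3 of 4 shifted, 2 of 4 pulled.
toy_rows = [
    {"player_name": "X", "shifted": True, "pulled": True},
    {"player_name": "X", "shifted": True, "pulled": False},
    {"player_name": "X", "shifted": True, "pulled": True},
    {"player_name": "X", "shifted": False, "pulled": False},
]
rates = shift_and_pull_rates(toy_rows)
# Sort batters by how often they were shifted, most-shifted first.
ranked = sorted(rates.items(), key=lambda kv: kv[1]["prop_shifted"], reverse=True)
```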

Generally, we can see from the table above that players who pull the ball more often are more likely to face a shifted defense. However, some batters with almost identical pulled ball percentages, such as Cal Raleigh and Luis Rengifo, are shifted at vastly different rates (95% of at-bats versus 60% of at-bats). One possible explanation for the difference is performance against a shifted defense: a team is much more likely to shift if a batter has historically been worse against a shifted defense than if the batter is able to overcome the shift. To get more context, we looked at the same 20 players and found their BABIP against a shifted defense and the proportion of those hits that were pulled.

player_name         prop_hit  prop_hit_pulled
Varsho, Daulton     0.358     0.729
Ramírez, José       0.332     0.690
Rutschman, Adley    0.330     0.569
Raleigh, Cal        0.308     0.820
Calhoun, Kole       0.297     0.549
Naylor, Josh        0.291     0.655
Muncy, Max          0.290     0.791
Ruiz, Keibert       0.290     0.710
Kepler, Max         0.288     0.595
Bellinger, Cody     0.287     0.726
Hicks, Aaron        0.282     0.642
Marte, Ketel        0.280     0.685
Grisham, Trent      0.279     0.690
Rengifo, Luis       0.274     0.628
Tellez, Rowdy       0.274     0.659
Odor, Rougned       0.270     0.556
Heim, Jonah         0.265     0.622
Santander, Anthony  0.250     0.694
Melendez, MJ        0.247     0.645
Santana, Carlos     0.225     0.764

Key:

prop_hit - Proportion of balls put into play against a shifted defense that were hits

prop_hit_pulled - Proportion of hit balls against a shifted defense that were pulled

While there are several exceptions, we can see that, generally speaking, the players who are shifted against the most have a lower BABIP when they are shifted. This is most notable with Carlos Santana and Daulton Varsho, who sit at opposite extremes. Santana has the highest proportion of at-bats against a shifted defense and the lowest BABIP against the shift. Varsho, on the other hand, has the lowest proportion of at-bats against a shifted defense among the 20 left-handed batters who pull the ball the most, while having the highest BABIP against a shifted defense. While extreme cases like these are easier to analyze, other factors certainly influence why defenses shift some batters more than others. This data shows each batter’s BABIP against the shift, but it does not take into account how good a hitter he is on average. Comparing the BABIP of an elite hitter with that of an average or below-average player will likely not tell the complete story of how effective it is to shift the batter. To account for this, we also looked at the BABIP of these players against an unshifted defense and took the difference of the two numbers. By doing this, we believe we measure the effect of a shifted defense on the player’s batting performance.

player_name         prop_hit_ns  prop_hit_pulled_ns  num_obs_ns  prop_hit  prop_hit_pulled  num_obs_shifted  prop_hit_diff
Santana, Carlos     0.571        1.000               7           0.225     0.764            244              0.346
Melendez, MJ        0.440        0.622               84          0.247     0.645            251              0.193
Ramírez, José       0.429        0.833               28          0.332     0.690            380              0.097
Rutschman, Adley    0.419        0.833               43          0.330     0.569            197              0.089
Bellinger, Cody     0.400        0.750               30          0.287     0.726            331              0.113
Raleigh, Cal        0.400        0.500               10          0.308     0.820            198              0.092
Santander, Anthony  0.347        0.824               49          0.250     0.694            288              0.097
Naylor, Josh        0.340        0.794               100         0.291     0.655            299              0.049
Odor, Rougned       0.333        0.857               21          0.270     0.556            300              0.063
Ruiz, Keibert       0.319        0.667               47          0.290     0.710            214              0.029
Calhoun, Kole       0.312        1.000               16          0.297     0.549            239              0.015
Rengifo, Luis       0.311        0.781               103         0.274     0.628            157              0.037
Marte, Ketel        0.307        0.778               88          0.280     0.685            193              0.027
Heim, Jonah         0.298        0.929               47          0.265     0.622            170              0.033
Tellez, Rowdy       0.295        0.615               88          0.274     0.659            321              0.021
Varsho, Daulton     0.289        0.831               225         0.358     0.729            165              -0.069
Grisham, Trent      0.248        0.677               125         0.279     0.690            208              -0.031
Hicks, Aaron        0.214        0.667               14          0.282     0.642            188              -0.068
Muncy, Max          0.171        0.500               35          0.290     0.791            297              -0.119
Kepler, Max         0.129        1.000               31          0.288     0.595            292              -0.159

Key:

prop_hit_ns - Proportion of balls put into play against an unshifted defense that were hits

prop_hit_pulled_ns - Proportion of hit balls against an unshifted defense that were pulled

num_obs_ns - The number of at-bats a player faced an unshifted defense

prop_hit - Proportion of balls put into play against a shifted defense that were hits

prop_hit_pulled - Proportion of hit balls against a shifted defense that were pulled

num_obs_shifted - The number of at-bats a player faced a shifted defense

prop_hit_diff - prop_hit_ns minus prop_hit (positive values mean the batter hit better against an unshifted defense)
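The split described by this key can be sketched for a single batter as below. This is a minimal illustration under stated assumptions: the fields shifted and is_hit are hypothetical names, and the hit rate follows the definition used in the key (hits divided by balls put into play).

```python
def babip_split(rows):
    """Return shifted and unshifted hit rates on balls in play, plus their difference."""
    hits = {True: 0, False: 0}
    balls = {True: 0, False: 0}
    for row in rows:
        key = bool(row["shifted"])
        balls[key] += 1
        hits[key] += 1 if row["is_hit"] else 0

    def rate(key):
        # None when the batter never faced that alignment.
        return hits[key] / balls[key] if balls[key] else None

    prop_hit, prop_hit_ns = rate(True), rate(False)
    diff = None if None in (prop_hit, prop_hit_ns) else prop_hit_ns - prop_hit
    return {
        "prop_hit_ns": prop_hit_ns,
        "num_obs_ns": balls[False],
        "prop_hit": prop_hit,
        "num_obs_shifted": balls[True],
        "prop_hit_diff": diff,
    }

# Toy batter: 2 hits in 10 shifted balls in play, 2 hits in 5 unshifted.
toy = [{"shifted": True, "is_hit": i < 2} for i in range(10)]
toy += [{"shifted": False, "is_hit": i < 2} for i in range(5)]
split = babip_split(toy)
```

Returning the observation counts alongside the rates makes it easy to spot rows where the difference rests on only a handful of unshifted balls in play.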

In the data shown above, the players are sorted by their BABIP against an unshifted defense. In the right-most column, we can see the difference between a player’s BABIP against an unshifted defense and the player’s BABIP against a shifted defense. Players with higher values in the prop_hit_diff column hit noticeably better against an unshifted defense than against a shifted defense, making them prime candidates for an improved batting season in 2023. While these proportions can generally tell us how a batter fares in shifted versus unshifted situations, it is important to note that the number of observations is low for some categories. For example, Carlos Santana has a remarkably high 0.571 BABIP against an unshifted defense, but he only faced an unshifted defense 7 times in the 2022 MLB season. If we were to continue this project, one element we would likely want to explore is previous years’ batting statistics, which would give us more observations and reduce the variability in our data.
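One way to make the small-sample caveat concrete is to attach an interval to each proportion. The sketch below uses the Wilson score interval, which is a technique we did not apply in the analysis above and is shown only as an illustration. The hit counts are assumptions back-calculated from the rounded table values (4 of 7 unshifted and 55 of 244 shifted for Santana), not figures taken directly from the data.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n == 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Assumed counts inferred from the rounded proportions in the table.
ns_low, ns_high = wilson_interval(4, 7)        # unshifted: 7 observations
sh_low, sh_high = wilson_interval(55, 244)     # shifted: 244 observations
print(f"unshifted: {ns_low:.3f} to {ns_high:.3f}")
print(f"shifted:   {sh_low:.3f} to {sh_high:.3f}")
```

Under these assumptions the unshifted interval is several times wider than the shifted one, which is the reason a 7-observation BABIP should be read with caution.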

The above graph shows us the relationship between how often a batter pulls the ball and how often they were shifted last season. The color of the data points also shows how often they pulled the ball when they got a hit against a shifted defense. We generally expect this graph to show a positive relationship because the more often a batter pulls the ball, the more beneficial it is to align the defense in a shift. The batters who appear towards the top right of the graph (meaning they pull the ball and face a shifted defense often) are the most likely candidates to improve in 2023 because the defense can no longer be shifted. Those who appear in a lighter blue are also more likely to improve because it shows that their hits do not frequently come from adjusting to the shift.

This graph shows the relationship between how often a batter faces a shifted defense and their BABIP against a shifted defense. Batters appearing towards the bottom right of this graph are likely candidates for an improved batting season in 2023 because they are constantly being shifted against and have a low BABIP in those situations. Furthermore, the players appearing in the lighter shade of blue bat noticeably better against an unshifted defense than a shifted one, which also makes them breakout candidates. On the other hand, players who don’t often face a shifted defense, or players who have a similar BABIP regardless of the defense they face, will likely not be impacted as much by the new rules.

Based on the analysis shown above, we chose Carlos Santana, MJ Melendez, and Cal Raleigh as the 3 players we thought would be affected the most by the new shift rule. Santana and Raleigh are the two batters who were shifted against the most during the 2022 season, and Santana and Melendez had the greatest increase in BABIP when facing an unshifted defense versus a shifted defense. Looking at these players’ stats so far in 2023 (as of 9/1/23 from baseball-reference.com), each of these 3 batters has a higher batting average than in 2022. Carlos Santana’s batting average is up from .202 in 2022 to .231, MJ Melendez is up from .217 to .231, and Cal Raleigh is up from .284 to .306. These increases support our assumption that players who are shifted against often and pull the ball frequently will have improved batting stats in 2023. However, it is important to note that several other factors go into these statistics, including age, injuries, and team situation.

Another interesting observation is that Santana and Melendez both have lower on-base percentages this year, with Santana dropping from .316 to .312 and Melendez dropping from .313 to .300. If we were to continue this project, an area of interest would be looking into the change in approach of each of these batters and analyzing why their OBP has dropped despite an increase in BA.

Correlating Shift Alignments with Different Types of Balls In Play (BIP)

After analyzing the hitters that were shifted the most in the 2022 season, we now want to find out who hit the ball in the air most often against a shifted defense. Once again, we only consider hitters with at least 200 Balls In Play (BIP) in our analysis. This gives us a large enough dataset to draw some conclusions. To find out which hitters hit the ball in the air more often, we considered the fly-ball, line-drive, and ground-ball rates of left-handed hitters.

To start off our analysis, we used the same list of hitters that pulled the ball most while shifted (see chart 1). We used launch angles to help us determine whether a ball in play is a fly-ball or a ground-ball. Typically, the launch angle of a ground-ball is less than 10 degrees, a line-drive is between 10 and 25 degrees, and a fly-ball is between 25 and 50 degrees. To clean up our data, we removed any observations that did not record the launch angle or fielding alignment of a player’s ball in play, which omitted about 100 of approximately 35,000 observations. We then summarized the observations of all players with more than 200 BIP into the proportions of their fly-balls, line-drives, and ground-balls that were pulled against a shifted defense. The following chart gives us the 20 hitters with the highest fly-ball proportions against a shifted defense, along with their line-drive and ground-ball proportions.
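The launch-angle cutoffs above can be sketched as a small classifier. The text does not say which side of each boundary is inclusive or how angles above 50 degrees were handled, so both choices below are assumptions: 10 and 25 are assigned to the higher category, and anything above 50 is labeled "popup". The function names are hypothetical.

```python
from collections import Counter

def classify_launch_angle(angle):
    """Map a launch angle in degrees to a batted-ball type, or None if missing."""
    if angle is None:
        return None  # rows without a recorded launch angle are dropped later
    if angle < 10:
        return "groundball"
    if angle < 25:
        return "line_drive"
    if angle <= 50:
        return "flyball"
    return "popup"  # assumption: the source does not define this category

def batted_ball_proportions(angles):
    """Share of each batted-ball type among rows with a recorded launch angle."""
    labels = [classify_launch_angle(a) for a in angles]
    labels = [label for label in labels if label is not None]
    if not labels:
        return {}
    counts = Counter(labels)
    return {label: counts[label] / len(labels) for label in counts}

# Toy launch angles for one hypothetical batter; None marks a missing reading.
toy_angles = [-5, 8, 15, 30, 45, None, 12, 3]
props = batted_ball_proportions(toy_angles)
```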

player_name         prop_flyball  prop_line_drive  prop_groundball  prop_shifted  prop_pulled  num_obs
Raleigh, Cal        0.490         0.207            0.303            0.952         0.683        208
Santander, Anthony  0.481         0.199            0.320            0.855         0.674        337
Muncy, Max          0.471         0.211            0.317            0.894         0.653        331
Escobar, Eduardo    0.465         0.234            0.301            0.793         0.605        256
Odor, Rougned       0.458         0.213            0.329            0.937         0.655        319
Bellinger, Cody     0.444         0.197            0.358            0.917         0.661        360
Schwarber, Kyle     0.439         0.216            0.345            0.900         0.653        412
Rizzo, Anthony      0.439         0.213            0.348            0.816         0.647        385
Tucker, Kyle        0.435         0.212            0.353            0.901         0.645        485
Ramírez, José       0.429         0.235            0.336            0.931         0.681        408
Heim, Jonah         0.429         0.198            0.373            0.783         0.705        217
Cronenworth, Jake   0.424         0.207            0.369            0.219         0.531        493
Ortega, Rafael      0.422         0.225            0.353            0.398         0.610        249
Santana, Carlos     0.422         0.191            0.386            0.972         0.765        251
Narváez, Omar       0.413         0.223            0.364            0.529         0.524        206
Yastrzemski, Mike   0.411         0.239            0.351            0.796         0.644        348
Suwinski, Jack      0.406         0.151            0.443            0.660         0.642        212
Pederson, Joc       0.404         0.200            0.396            0.782         0.589        280
Winker, Jesse       0.398         0.212            0.390            0.732         0.605        354
Mullins, Cedric     0.394         0.191            0.415            0.504         0.604        482

We can see Cal Raleigh tops the chart with the highest fly-ball proportion at 49%, meaning almost half of his pulled BIP are fly-balls. He also faces a shifted defense on 95% of his BIP. This could mean that Raleigh tries to get the ball in the air more often in order to get a hit past the shifted infielders. Another interesting statistic is Jake Cronenworth’s shifted defense rate, which is only 22% of his BIP. Yet he has a relatively high rate of pulling the ball in the air at around 42%. We could infer from this observation that Cronenworth pulls the ball in the air more often without a shifted defense because the right fielder has a better chance of recording the out, or that Cronenworth can hit to all fields, so a shifted defense does not give the fielders an advantage in getting him out.

Now that we have a basic understanding of the players’ fly-ball and line-drive stats against a shifted defense, we wanted to dive deeper into whether there is a correlation between the shift alignment and how often players try to hit the ball in the air. We first created a correlation matrix that includes each of the proportions from chart 4. Then, we converted the matrix into a correlation plot to show the correlation values between the shift rate and the fly-ball, line-drive, and ground-ball proportions.
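A correlation matrix of this kind can be sketched with Pearson correlations computed column by column. The example below is illustrative only: it uses six rows copied from chart 4 rather than the full set of players, so its values will not match the correlation plot, and the helper names pearson and correlation_matrix are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

def correlation_matrix(columns):
    """Nested dict of pairwise Pearson correlations between named columns."""
    return {a: {b: pearson(columns[a], columns[b]) for b in columns} for a in columns}

# Subset of chart 4: Raleigh, Santander, Cronenworth, Santana, Narváez, Mullins.
columns = {
    "prop_flyball":    [0.490, 0.481, 0.424, 0.422, 0.413, 0.394],
    "prop_line_drive": [0.207, 0.199, 0.207, 0.191, 0.223, 0.191],
    "prop_groundball": [0.303, 0.320, 0.369, 0.386, 0.364, 0.415],
    "prop_shifted":    [0.952, 0.855, 0.219, 0.972, 0.529, 0.504],
    "prop_pulled":     [0.683, 0.674, 0.531, 0.765, 0.524, 0.604],
}
corr = correlation_matrix(columns)
for name, row in corr.items():
    print(name, " ".join(f"{value:+.2f}" for value in row.values()))
```

Because the fly-ball, line-drive, and ground-ball proportions of each batter sum to 1, some negative correlation among those three columns is expected regardless of shifting, which is worth keeping in mind when reading the plot.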