Managing with Markov
Baseball fan—and analyst—Carl Morris shows a statistical path to more runs scored.
It has attracted less attention, perhaps, than searches for Big Foot or a cure for the common cold, but the quest for the optimal baseball statistic continues to confound even the brightest minds. The Nielsen ratings that determine the exchange of billions of advertising dollars may be based on a small fraction of American homes, and presidential vote-counts have been unmasked as hopelessly murky, yet to approximate that Derek Jeter had 500 times at bat instead of 502 is heresy of the highest order because numbers in baseball, from batting averages to earned-run averages to brain-bending formulae longer even than this sentence, are counted and scrutinized more precisely than any survey or census. That pursuit continues all the way to the Harvard statistics department.
Statistician Carl Morris
Portrait and collage by Jim Harrison
Carl N. Morris, professor of statistics and of health care policy, might be better known for his work in hospital-quality evaluation, but he has spent countless hours assessing the squeeze play and sacrifice bunt with pages of numbers that would leave most fans downright vertiginous. It's hard to imagine Morris getting more worked up over universal healthcare than he does when his beloved Red Sox squelch a rally with a misguided attempt to steal second base.
His research and equations render simple batting averages and runs batted in (not to mention the "homers-hit-by-a-shortstop-under-a-full-moon" stat epidemic of modern baseball) hopelessly simplistic and antiquated. Employing the concept of Markov processes, Morris looks beyond conventional categories like doubles and walks to place those events in the context of how they affect the events around them and, therefore, a team's chances of scoring. "A lot of the physical world is Markovian, and baseball is, too," says Morris. "Looking at the game that way can give you a much better idea of the value of a player or strategy than the conventional methods can."
Markov theory concerns events that are conditional upon those that precede them and affect those that come after. (The stock market and weather patterns are Markov processes; coin flips and poker hands are not.) Baseball might not seem the most fertile ground for truly sophisticated statistical analysis: anyone who has heard an 0-for-10 ballplayer say he's "due for a hit" can be forgiven some skepticism, while most baseball general managers would think a Poisson distribution concerns the breakup of the 1997 Florida Marlins. But Markov theory applies perfectly to baseball, because a hitter who rips a double hasn't necessarily helped his team score; his play succeeds only if a batter before him has reached base or if a hitter after him drives him home.
Morris's method can measure the true contribution of every result a batter can produce, from triples and home runs to sacrifices and double plays, by examining the true effects of each event. Batters always hit in one of 24 situations: with zero, one, or two men out, and with one of eight configurations of men on base (none, first, first and third, and so on). A hopelessly complicated and time-consuming process determines the average number of runs that teams score in each scenario; for instance, a team with one out and a man on second scores on average .720 runs in that inning. By examining what statisticians call the "change in state" between one batter and the next, the actual contribution of each hitter can be precisely determined. Morris calls these Net Expected Run Values (NERV) and offers the following hypothetical examples, using aggregate statistical data from last year's American League season.
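For readers who want to see where the number 24 comes from, here is a minimal sketch in Python that enumerates the situations: three possible out counts times eight base configurations. The names `OUTS`, `BASE_CONFIGS`, `STATES`, and `label` are illustrative choices, not part of Morris's own work.

```python
from itertools import product

# Three possible out counts when a batter comes to the plate.
OUTS = (0, 1, 2)

# Each base (first, second, third) is either empty (False) or occupied (True),
# which gives 2 * 2 * 2 = 8 configurations of men on base.
BASE_CONFIGS = list(product((False, True), repeat=3))


def label(bases):
    """Return a readable name for a base configuration, e.g. 'first and third'."""
    names = [name for name, occupied in zip(("first", "second", "third"), bases) if occupied]
    return " and ".join(names) if names else "none"


# Every batting situation is a pair: (number of outs, who is on base).
STATES = [(outs, bases) for outs in OUTS for bases in BASE_CONFIGS]

print(len(STATES))                   # 24
print(label((True, False, True)))    # first and third
```

An expected-runs matrix like the one Morris uses would attach one average run value to each of these 24 pairs.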
Chart by Steve Anderson (Data: American League 2001)
* Nomar Garciaparra of the Red Sox comes up with a man on first and none out. If he drives the man home with a double, he has created more than the one run his RBI would reflect. Since the matrix (see the chart on page 85) shows that the state he batted in usually results in .907 runs and the state he left the next batter (with a man on second rather than first) results in 1.138, he gets credit for the run that scored plus the expected .231 he added.
* The Yankees' Jason Giambi bats with men on second and third and none out. He pops up. When Giambi came up, his team expected to score 1.957 runs, but he left with an expected potential of 1.353, so he gets credit for negative .604 runs.
* Alex Rodriguez of the Texas Rangers strides to the plate with a man on first and two out, a situation that has the Rangers expected to score .239 runs. If he hits a home run, his NERV is the two runs that scored minus .239, because the next batter no longer has a man on first base and the added scoring potential that allows. If Rodriguez strikes out, his NERV is minus .239, since he has ended the inning and there's no hope to score any more runs.
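The arithmetic in these examples can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Morris's own code: the table holds only the run values quoted in this article, the function name `nerv` and the state labels are invented for the sketch, and Giambi's pop-up is assumed to leave both runners where they were with one out.

```python
# Expected runs for the rest of the inning, keyed by (outs, men on base).
# Only values quoted in the article (American League, 2001) are included.
EXPECTED_RUNS = {
    (0, "first"): 0.907,
    (0, "second"): 1.138,
    (0, "second and third"): 1.957,
    (1, "second and third"): 1.353,  # assumed to be the state after Giambi's pop-up
    (2, "first"): 0.239,
}


def nerv(before, after, runs_scored=0):
    """Runs scored on the play plus the change in expected runs.

    Pass after=None when the play ends the inning, since no further
    runs can be expected.
    """
    value_after = 0.0 if after is None else EXPECTED_RUNS[after]
    return runs_scored + value_after - EXPECTED_RUNS[before]


garciaparra_double = nerv((0, "first"), (0, "second"), runs_scored=1)
giambi_popup = nerv((0, "second and third"), (1, "second and third"))
rodriguez_strikeout = nerv((2, "first"), None)

print(round(garciaparra_double, 3))   # 1.231
print(round(giambi_popup, 3))         # -0.604
print(round(rodriguez_strikeout, 3))  # -0.239
```

The Rodriguez home run is left out because the article does not quote a run value for the state it leaves behind (two out, bases empty).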
Such analyses can't reliably determine the relative skills of Babe Ruth and Ted Williams; for one thing, baseball has only recently begun to keep situation-specific data. But Morris's approach neatly unveils the benefits and costs of certain in-game strategies. For example, is the sacrifice bunt (say, making the first out intentionally to move a runner from first to second) ever a good idea? Almost never. The matrix shows why: the state before has an expected value of .907 runs, the state after just .720, meaning that the only time you'd want to sacrifice is if you need a single run late in the game. Yet some traditionalist big-league managers also try this ploy in early innings. "You really could manage a team better by looking at that matrix," Morris asserts.
NERV clarifies a concept that major-league teams are only now beginning to grasp: the preciousness of outs. Outs are baseball's clock; you get only 27 of them, and squandering them can be deceptively catastrophic. Wonder why there are so few stolen bases in modern baseball? You need a 71 percent success rate (a number easily determined through NERV) to break even. (When informed a few years ago that his Oakland Athletics were last in the league in stolen bases, team general manager Billy Beane instinctively replied, "Good.")
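A break-even rate of this kind falls out of a one-line calculation: a steal attempt pays for itself when the expected runs after the attempt, weighted by the chance of success, equal the expected runs of staying put. The Python sketch below shows the idea for one situation only (man on first, none out). The function name is invented for illustration, and the run value for the failure state (bases empty, one out) is a hypothetical placeholder, since the article does not quote it; the other two values are the ones cited above.

```python
def break_even_rate(value_before, value_success, value_failure):
    """Success probability p at which attempting the play neither gains nor loses runs.

    Solves p * value_success + (1 - p) * value_failure = value_before for p.
    """
    return (value_before - value_failure) / (value_success - value_failure)


MAN_ON_FIRST_NO_OUT = 0.907    # quoted in the article
MAN_ON_SECOND_NO_OUT = 1.138   # quoted in the article
BASES_EMPTY_ONE_OUT = 0.30     # HYPOTHETICAL placeholder, not from the article

rate = break_even_rate(MAN_ON_FIRST_NO_OUT, MAN_ON_SECOND_NO_OUT, BASES_EMPTY_ONE_OUT)
print(f"{rate:.0%}")
```

With this placeholder the result lands a little above 70 percent. A league-wide figure such as the 71 percent cited above would presumably average over many base-and-out situations rather than this single one.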
Major-league baseball was stuck in the statistical dark ages when Morris was growing up in San Diego in the 1950s. He loved the Red Sox for their high Fenway Park-aided batting averages and longed to be their manager, but chose a safer career route by earning an aeronautical engineering degree at Cal Tech. "I couldn't make a paper glider that would fly, though," he says, "so I went into mathematics." After 10 years at the prestigious Rand Corporation, Morris accepted a statistics and mathematics professorship at the University of Texas. During the 1970s, as a diversion, he dabbled in sports analysis, principally tennis.
"I don't go out of my way to advertise my baseball work, for fear people will have nothing to do with me as an academic," jokes Morris, who came to Harvard in 1990. Analyzing baseball has become simply his hobbyan enthralling hobby that is potentially valuable to major-league teams. With today's tremendous computing power (not to mention the financial cost of signing a multimillion-dollar free agent who fails) many clubs pay more than $100 an hour for part-time consultants to bang on numbers in new and inventive ways. The Toronto Blue Jays, in fact, just hired a full-time statistics analystKeith E. Law '94, a one-time sociology and economics concentratorto add that extra dimension to their organization.
Would Morris consider bringing his approach to a major-league organization? He's quick to reveal that he developed the Markovian approach, but didn't invent it. Another amateur baseball analyst, George Lindsey (who by day did operations research for Canada's Department of National Defence), broke that ground in the early 1960s. But Morris would relish a chance to consult for, say, his Red Sox. "The major leagues are not using enough of this stuff. That's the power of my field," Morris says. "A professional statistician can, out of love for the game, improve the way baseball is played."
Until then, Morris will continue to refine his analysis of baseball's critical moves: when to send a runner around third, when to play the infield in, when the numbers call for a squeeze play. He can still remember how the 1964 San Francisco Giants all but ruined their pennant hopes by laying down foolish sacrifice bunts. "It was killing me," he groans. If Morris can help other teams see the light, his hundreds of hours of work will be the rare sacrifice worthwhile.
Alan Schwarz is senior writer for Baseball America magazine and a frequent contributor to the New York Times. His last story for this magazine, "The Saga of a Great Headline," ran in the November-December 2000 issue.