Saturday, July 12, 2014

The Matthew Effect

The Matthew effect is a term introduced by sociologist Robert Merton in 1968. It takes its name from Matthew 25:29, the parable of the talents:

For unto everyone that hath shall be given, and he shall have abundance; but from him that hath not shall be taken even that which he hath.

In other words, the rich get richer and the poor get poorer. Merton suggested that the positive behaviors of high-status people are more likely to be recognized and rewarded than those of low-status individuals, while high-status people's mistakes are more likely to be overlooked. This creates a positive feedback loop in which increased confidence improves their performance and their reputation grows over time. The opposite happens to low-status individuals: their mistakes are more apparent, leading to negative feedback, stress and disruption of performance.

Merton also coined the term self-fulfilling prophecy, in which predictions result in behaviors that cause the predicted outcome to occur.  The Matthew effect is a type of self-fulfilling prophecy in which the observer's positive (or negative) expectations cause more (or less) successful behavior in the target over time. This has broad implications for people's self-esteem and the inequality of their social and economic outcomes.

Two business school professors, Jerry Kim and Brayden King, looked for evidence of the Matthew effect in major league baseball. They predicted that a pitcher's status would influence calls by the home plate umpire. Pitcher status was defined as the number of times he had previously been chosen for the All-Star team. It was predicted that, as a pitcher's number of All-Star appearances increased, more of his balls would be called strikes (over-recognition) and fewer of his strikes would be called balls (under-recognition). The study was made possible by the Pitch f/x system, in place in all major league ballparks, in which cameras objectively measure whether each pitch is in the strike zone.

The database consisted of all the pitches taken (not swung at) by the batter during every game of the 2008 and 2009 seasons. Each of these pitches must be called either a ball or a strike by the umpire, and each call was evaluated for correctness. These data were related to over two dozen pitcher, batter, catcher, umpire and situational characteristics. Some of these variables are of real importance to baseball fans, but all of them were statistically controlled in order to evaluate the status hypothesis.

Baseball fans may be interested in the big picture—the distribution of correct and incorrect calls among the almost 800,000 calls the researchers measured.


                   Called Ball    Called Strike
Actual Ball          87.10%          12.90%
Actual Strike        18.80%          81.20%

The umpires were correct about 85% of the time. (There were more actual balls than actual strikes.) Umpire bias favored the batter, since a larger share of actual strikes were called balls (18.8%) than actual balls were called strikes (12.9%). The count (the number of balls and strikes to that point) had a large effect. For example, the likelihood that the umpire mistakenly called a strike was 62% lower when the count was 0-2 and 49% higher when the count was 3-0. Apparently, umpires don't like their call to end an at-bat. Umpire calls also tended to favor the home team. Errors of both over- and under-recognition increased with the situational importance of the at-bat.
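For readers who want to check how the 85% figure relates to the table, here is a minimal sketch. It treats overall accuracy as a weighted average of the two correct-call rates and works backward to the share of pitches that were actual balls. The function names are made up for illustration, and because "about 85%" is a rounded figure, the implied share is only a rough estimate, not a number reported by the researchers.

```python
# Correct-call rates taken from the table above.
P_CORRECT_GIVEN_BALL = 0.871    # actual ball called a ball
P_CORRECT_GIVEN_STRIKE = 0.812  # actual strike called a strike


def overall_accuracy(share_balls):
    """Overall accuracy as a weighted average of the two correct-call rates."""
    return (share_balls * P_CORRECT_GIVEN_BALL
            + (1 - share_balls) * P_CORRECT_GIVEN_STRIKE)


def implied_share_balls(accuracy):
    """Invert the weighted average to get the share of actual balls."""
    return ((accuracy - P_CORRECT_GIVEN_STRIKE)
            / (P_CORRECT_GIVEN_BALL - P_CORRECT_GIVEN_STRIKE))


# Try a few values near the rounded 85% to see how sensitive the estimate is.
for acc in (0.845, 0.85, 0.855):
    share = implied_share_balls(acc)
    print(f"accuracy {acc:.3f} -> implied share of actual balls {share:.2f}")
```

Any accuracy between the two rates implies some mix of balls and strikes, and every value near 85% implies that actual balls were the majority, which is consistent with the parenthetical remark above.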

The hypothesis was strongly confirmed. Look first at over-recognition: Holding all other variables constant, the more trips a pitcher had made to the All-Star game, the more likely a ball was to be called a strike. The probability of a mistaken strike call increased from 12.8% among pitchers who had no All-Star appearances to 14.9% among pitchers with five or more appearances. Each additional trip to the All-Star game increased the likelihood of over-recognition by 4.9%.

The situation was reversed for under-recognition, also confirming the hypothesis. A strike thrown by a pitcher with no All-Star appearances was mistakenly called a ball 18.9% of the time, but only 17.2% of the time if the pitcher had five or more appearances. Each trip to the All-Star game decreased the likelihood of under-recognition by 2.7%.
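The two comparisons between pitchers with no All-Star appearances and pitchers with five or more can be put on a common footing by expressing each as a relative change. The sketch below uses only the four percentages quoted above; the helper name is hypothetical, and this simple calculation is not the statistical model the researchers used to obtain their per-appearance estimates.

```python
def relative_change(before, after):
    """Change from `before` to `after`, as a fraction of `before`."""
    return (after - before) / before


# Over-recognition: actual balls mistakenly called strikes.
over = relative_change(0.128, 0.149)

# Under-recognition: actual strikes mistakenly called balls.
under = relative_change(0.189, 0.172)

print(f"over-recognition:  {over:+.1%}")
print(f"under-recognition: {under:+.1%}")
```

By this measure, the change in mistaken strike calls is larger than the change in mistaken ball calls, although both errors move in the pitcher's favor.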

In further analyses, the authors were able to show that, with this large data set, pitcher status also had statistically significant effects on the outcome of the at-bat (the total bases reached by the batter) and the game (whether the pitcher's team won). In an analysis that made some admittedly questionable assumptions, they calculated that umpire errors alone were worth approximately $575,000 in salary to a high-status pitcher over the course of his career.

Of course, Matthew effects can occur any time one person evaluates another: a teacher grading a student, a boss rating a worker, a reviewer reading a manuscript, etc. As a demonstration of how quickly performance expectations can form, consider a study by Ned Jones and others. Participants watched a videotape of a college student answering 30 difficult questions, with feedback after each item indicating whether he had answered it correctly. In both conditions he got 15 of the 30 right. In the ascending condition, the student gradually improved. He got three of the first ten right, five of the second ten, and seven of the last ten. In the descending condition, the pattern was reversed. (The difficulty of the questions was held constant by asking exactly the same questions in the opposite order.) First impressions mattered a great deal. The student was rated as more intelligent in the descending than in the ascending condition. The authors had hoped the ascending student would get some credit for improvement, but it didn't happen.

In this experiment, as in baseball, the expectations were based on the target's actual past performance. However, expectations can be based on gender, race, class or other social categories. In other words, stereotypes based on group membership can create self-fulfilling prophecies leading to discrimination.

There is no reason to think that umpires and ballplayers are consciously aware of the systematic nature of these errors. A New York Times article about the Kim and King study included the usual quotes from baseball people expressing their surprise at or disbelief in the results. Most teachers, bosses and reviewers probably think they're being objective, too.

In major league baseball, the technology is already in place to have balls and strikes called automatically using the Pitch f/x system. Why would anyone (except maybe Clayton Kershaw) not think that's a good idea?


