John von Neumann was a man with many interests, from formulating the mathematics of quantum mechanics to developing the modern computer. Luckily for the birth of modern game theory, he also had a passion for poker, an interest that culminated in the monumental work Theory of Games and Economic Behavior, written in collaboration with economist Oskar Morgenstern. Their focus was on cooperative games — that is, games in which players can form coalitions or, in the words of Fields medalist John Milnor, “sit around a smoke-filled room and negotiate with each other.” Game theory was a radical departure from the standard view of economics, the so-called Robinson Crusoe economy, in which consumers’ well-being is not affected by their social interactions: Like Crusoe, alone on a deserted island where he could interact only with nature, the players in this idealized economy interact only with prices. But as von Neumann observed, “Real life consists of bluffing, of little tactics of deception, of asking yourself what is the other man going to think I mean to do, and that is what games are about in my theory.”
Von Neumann’s pioneering work had applications to warfare (he was one source of inspiration for the iconic wheelchair-bound scientist in Stanley Kubrick’s Dr. Strangelove) but found limited application in the real world and in economic theory. The greatest revolution in game theory came from John Nash’s analysis of noncooperative games, in which the emphasis is on individual behavior. In 1994, Nash was awarded a Nobel Prize in economics for the far-reaching applications of his groundbreaking work.
Game theory has emerged as a very powerful and versatile technique, capable of modeling situations as diverse as the theory of corporate takeover bids and the negotiations of Britain and the European Union over the application of Article 50. One of the most spectacular applications was in the distribution of large portions of the electromagnetic spectrum to commercial users in an auction organized by the U.S. government in 1994. Experts in game theory were able to maximize both the government revenue — amounting in the end to a staggering $10 billion — and the efficient allocation of resources to the frequency buyers. (A similar auction carried out by the New Zealand government in 1990 without the help of game theory experts ended up being a total fiasco.)
In this article we will outline the very basics of noncooperative game theory with a view toward financial applications, starting from the ubiquitous prisoner’s dilemma and concluding with more-realistic games that describe important aspects of modern financial markets and model real investors’ behavior.
Crime and Punishment
Alice and Bob have committed a serious crime: They stole $10 from their mom’s wallet, which she keeps in her nightstand. But being amateur thieves, they were not too careful in orchestrating their misdemeanor, and their mom, Carol, caught them trespassing in her bedroom. She immediately realized something was out of order and checked her wallet, discovering with great consternation that money was missing. As a precautionary measure, she decided to confine Alice and Bob to separate rooms.
Carol has enough evidence to convict both of her children for the lesser crime of trespassing, for which they would be grounded for one day. But she lacks evidence for the principal crime, the theft. Being an ingenious woman, Carol offers Alice and Bob a deal. If they both confess to the theft, they will benefit from a reduced sentence and each will be grounded for five days. If neither confesses, they will be sentenced only for the lesser crime and grounded for only one day. Finally, if Alice confesses while her accomplice remains silent, she will not face any charge and Bob will be grounded for 20 days. The same scenario applies if Bob confesses and Alice remains silent. The siblings must reach their decision independently, without communicating with each other.
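We can summarize Alice and Bob’s available strategies with the following payoff matrix, in which each entry lists Alice’s payoff first and Bob’s second:

| | Bob remains silent | Bob confesses |
|---|---|---|
| Alice remains silent | (−1, −1) | (−20, 0) |
| Alice confesses | (0, −20) | (−5, −5) |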
In the matrix the negative numbers represent the number of days of punishment. For example, the values in payoff (−5,−5) are the days Alice and Bob would remain grounded if both confessed.
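From the siblings’ point of view, remaining silent and confessing can be seen as forms of cooperation and defection. To highlight this interpretation, we relabel the payoff matrix as follows (Alice’s payoff first, Bob’s second):

| | Bob cooperates | Bob defects |
|---|---|---|
| Alice cooperates | (−1, −1) | (−20, 0) |
| Alice defects | (0, −20) | (−5, −5) |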
Given the assumption of rationality — and the adage that there is no honor among thieves — what strategies will Alice and Bob choose? To analyze their strategic behaviors, let us introduce the idea of the Nash equilibrium. We are in a Nash equilibrium if Alice’s choice is optimal for Alice given Bob’s choice and at the same time Bob’s choice is optimal for Bob given Alice’s choice. In other words, neither player has an incentive to deviate unilaterally by playing a different strategy, given the strategy chosen by his or her opponent.
We can find the Nash equilibrium by first considering Alice’s point of view. If her brother defects, she will be grounded for either 20 days (if she cooperates) or five days (if she defects). If her brother cooperates, she will be grounded for either one day (if she cooperates) or not at all (if she defects). In both cases defecting leaves her better off, so Alice’s choice is easy to guess: She will defect by confessing to her mom! Because the game is completely symmetrical, Bob will also choose to defect. The scenario in which both players defect is the sought-after Nash equilibrium for the prisoner’s dilemma.
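The same check can be done mechanically. The following is a minimal sketch, not a general-purpose solver: it enumerates the four outcomes of the siblings’ game and keeps those from which neither player gains by deviating alone. The names `PAYOFFS`, `ACTIONS` and `pure_nash_equilibria` are illustrative choices; the numbers are the days of grounding described in the story.

```python
from itertools import product

# "cooperate" = remain silent, "defect" = confess.
ACTIONS = ("cooperate", "defect")

# (Alice's payoff, Bob's payoff) in days grounded, written as negative numbers.
PAYOFFS = {
    ("cooperate", "cooperate"): (-1, -1),
    ("cooperate", "defect"): (-20, 0),
    ("defect", "cooperate"): (0, -20),
    ("defect", "defect"): (-5, -5),
}


def pure_nash_equilibria(payoffs, actions):
    """Return every (Alice, Bob) action pair with no profitable unilateral deviation."""
    equilibria = []
    for a, b in product(actions, repeat=2):
        # Alice cannot do better by switching while Bob's action stays fixed.
        alice_ok = all(payoffs[(a, b)][0] >= payoffs[(alt, b)][0] for alt in actions)
        # Bob cannot do better by switching while Alice's action stays fixed.
        bob_ok = all(payoffs[(a, b)][1] >= payoffs[(a, alt)][1] for alt in actions)
        if alice_ok and bob_ok:
            equilibria.append((a, b))
    return equilibria


print(pure_nash_equilibria(PAYOFFS, ACTIONS))  # [('defect', 'defect')]
```

Mutual defection is the only pair that survives the check, in line with the argument above.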
Although neither player has an incentive to unilaterally change strategy, the Nash equilibrium does not represent the best possible outcome: Alice and Bob would have been better off by cooperating and remaining silent. But this scenario is not the equilibrium we just found. There is no “right” solution to this little game, hence the dilemma.
Not a Zero-Sum Game
The prisoner’s dilemma is the most paradigmatic example of a non-zero-sum game. In this kind of game, complementary and conflicting interests can be present simultaneously. In zero-sum games — like tic-tac-toe, chess or “global thermonuclear war” (played by the computer in the movie WarGames) — players are purely antagonistic. In these games “wealth” is transferred from loser to winner. In the financial world the futures, options and currency markets are all zero-sum games. By contrast, the stock market is a non-zero-sum game because performance is inextricably linked to external factors, such as the overall economic outlook. All investors could profit in a bull market, for example.
Life is riddled with examples of the prisoner’s dilemma, from countries negotiating on actions to limit global climate change to birds trying to remove ticks from each other’s feathers (cooperation/defection corresponding to a bird agreeing/refusing to pull off its companion’s ticks). There is a simple and very practical financial application: competition in oligopolistic markets, where optimal quantity and price always depend on choices made by a small number of companies. Let us consider the case of rivals Coca-Cola Co. and PepsiCo. It would be in the interest of both cola makers to cooperate and keep the prices of their carbonated beverages artificially high. But if, out of the blue, Coca-Cola decided to reduce its price — that is, defect — PepsiCo would be forced to follow to protect its market share. We can represent the available options with the following payoff matrix, in which the entries represent the increase in the companies’ profits per year (in arbitrary units). It is easy to see that we are again in a prisoner’s dilemma, as both companies have an incentive to defect.
In the prisoner’s dilemma we found a single Nash equilibrium. Games with multiple equilibria are common, though. As an illustration, let us consider the game known as the “battle of the sexes.”
Alice and Bob, no longer grounded, have decided to meet at a movie theater, but neither can recall whether they planned to watch the action film or the romantic comedy playing there. Alice is very passionate about action movies, while Bob has a clear partiality for romantic comedies. Despite their personal preferences, they would rather watch a movie together than sit alone among strangers. Unfortunately, Alice’s cell phone battery is dead and they cannot communicate with each other. What should they do?
We can represent the situation with the following payoff matrix:
We can think of the entries in the matrix as measures of Alice’s and Bob’s levels of happiness. For example, if they end up watching the action movie, the payoff can be interpreted as Alice being four times happier than her brother. The siblings’ unhappiness if they end up watching different movies is quantified by a zero payoff. It is simple to verify that this game has two Nash equilibria: both siblings attending the action movie and both attending the romantic comedy.
All the equilibria we have encountered so far are known as pure Nash equilibria. In the battle of the sexes, we also have a new type of equilibrium: a mixed Nash equilibrium. In this kind of equilibrium, a player does not always choose the same strategy but rather chooses among the possible strategies with a certain probability. Clearly, all pure equilibria are particular cases of mixed equilibria in which a strategy is played with probability 1. For the battle of the sexes, the mixed Nash equilibrium corresponds to Alice going to the action movie with an 80 percent probability (4/5) and Bob going to the action movie with a 20 percent probability (1/5); that is, each sibling picks his or her favorite genre four times out of five. These are the probabilities that leave the other sibling indifferent between the two movies.
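The mixed equilibrium follows from two indifference conditions, which can be sketched in a few lines. The payoffs used here are an assumption consistent with the description in the text: (4, 1) if both siblings watch the action movie, (1, 4) if both watch the romantic comedy and (0, 0) if they end up apart. The function name `mixed_equilibrium` and its parameters are illustrative.

```python
from fractions import Fraction


def mixed_equilibrium(alice_action, alice_romance, bob_action, bob_romance):
    """Mixed equilibrium of a 2x2 coordination game that pays zero on a mismatch.

    Each argument is that sibling's payoff when both attend the named movie.
    Returns (P(Alice picks action), P(Bob picks action)).
    """
    # Alice's probability p of choosing action must make Bob indifferent:
    #   p * bob_action == (1 - p) * bob_romance
    p_alice_action = Fraction(bob_romance, bob_action + bob_romance)
    # Bob's probability q of choosing action must make Alice indifferent:
    #   q * alice_action == (1 - q) * alice_romance
    q_bob_action = Fraction(alice_romance, alice_action + alice_romance)
    return p_alice_action, q_bob_action


# Assumed payoffs: (4, 1) at the action movie, (1, 4) at the romantic comedy.
p, q = mixed_equilibrium(alice_action=4, alice_romance=1, bob_action=1, bob_romance=4)
print(p, q)  # 4/5 1/5
```

Under these assumed payoffs each sibling’s expected payoff in the mixed equilibrium is 4/5, lower than in either pure equilibrium, because the siblings often end up in different theaters.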
Nash proved that every finite game (that is, a game with a finite number of players, each choosing among a finite number of pure strategies) has at least one Nash equilibrium, possibly in mixed strategies. Games like chess and poker therefore have a mixed Nash equilibrium. Given the games’ complexity, however, an explicit constructive solution is not known, so knowing this is unlikely to give you an edge in your next game.
Nice Guys Finish First
Although cooperation is often a fact of life, we saw that rational players in the prisoner’s dilemma end up playing selfishly even though mutual cooperation is in their best interest. How, then, can cooperation emerge without being forced by an external authority?
One way to achieve this is to play multiple rounds of the prisoner’s dilemma. This was the insight of American political scientist Robert Axelrod, who in the early 1980s organized prisoner’s dilemma tournaments for which various experts in game theory submitted their computer code. The winner was the simplest program, a four-line program called TIT-FOR-TAT. It works as follows: In the first round the program always cooperates, and in the successive rounds it simply copies the opponent’s choice from the previous move.
TIT-FOR-TAT clearly values cooperation. If the opponent decides to cooperate in its first move, TIT-FOR-TAT will appreciate the kind gesture and cheerfully cooperate in the second move. But it will also penalize unprofitable encounters (hence the name): If the opponent decides to defect in its first move, TIT-FOR-TAT will punish the selfish behavior by defecting in the second move. Axelrod introduced the concepts of niceness and forgiveness to characterize different behaviors; a program is nice if it never defects first. Similarly, a program is forgiving if it tends to resume cooperation after its opponent has done so. TIT-FOR-TAT is an example of a nice and forgiving strategy. Remarkably, such programs achieved the highest score in Axelrod’s tournaments — cooperation at last!
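To make the rule concrete, here is a minimal sketch of a repeated prisoner’s dilemma. It is not Axelrod’s tournament code: the payoffs simply reuse the days of grounding from the siblings’ story (an assumption; the tournaments used their own point scale), and the names `tit_for_tat`, `always_defect` and `play` are illustrative.

```python
# "C" = cooperate, "D" = defect; payoffs are (first player, second player).
PAYOFFS = {
    ("C", "C"): (-1, -1),
    ("C", "D"): (-20, 0),
    ("D", "C"): (0, -20),
    ("D", "D"): (-5, -5),
}


def tit_for_tat(own_history, opponent_history):
    """Cooperate in the first round, then copy the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]


def always_defect(own_history, opponent_history):
    """A purely selfish benchmark strategy."""
    return "D"


def play(strategy_a, strategy_b, rounds=10):
    """Play repeated rounds and return the two cumulative scores."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        payoff_a, payoff_b = PAYOFFS[(move_a, move_b)]
        score_a += payoff_a
        score_b += payoff_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


print(play(tit_for_tat, tit_for_tat))      # (-10, -10)
print(play(tit_for_tat, always_defect))    # (-65, -45)
print(play(always_defect, always_defect))  # (-50, -50)
```

In this sketch TIT-FOR-TAT is exploited only once by a defector and then retaliates, while two TIT-FOR-TAT players settle into mutual cooperation, by far the best joint outcome of the three pairings.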
The games we have considered so far had a small number of perfectly rational players. But most humans, including traders, are only moderately good at deductive logic while at the same time very efficient at recognizing emerging patterns and learning from experience. In the final section we will introduce a more complicated game that can model the behavior of many interacting traders participating in a real-world market. In this model players are not perfectly rational but will be able to learn and improve their strategies through evolution.
A Minority Game
It’s Friday night and you want to chill out at your favorite downtown bar. But after a long week of work, you know that if the bar is overcrowded, all the fun would be lost; you would rather stay home. The bar can easily accommodate 60 people. If you expect there will be fewer than 60, you decide to go. Other people — say, 100 other people — are also interested in going. Everyone else shares your preferences, and the only public information is past attendance. What is your optimal strategy?
This dilemma, known as the El Farol problem after a bar in Santa Fe, New Mexico, was proposed by economist W. Brian Arthur in 1994. Note that in this situation there is no common, global best strategy. Indeed, if such a strategy existed, everybody would use it. As a consequence, if the strategy predicted a crowded bar next Friday, nobody would go, and the prediction would defeat itself. Basing your decision on tossing a coin wouldn’t help either: On average, only 50 people would show up, and the bar would be underutilized.
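The coin-toss claim can be checked with exact binomial arithmetic. The sketch below assumes 100 potential patrons who each go with probability 1/2, and it treats the bar as overcrowded when more than 60 people show up (one reading of the threshold in the story). The function name `coin_toss_attendance` is illustrative.

```python
from math import comb


def coin_toss_attendance(n_people=100, capacity=60):
    """Exact attendance statistics when each person goes with probability 1/2.

    Returns (expected attendance, probability that attendance exceeds capacity).
    """
    total_outcomes = 2 ** n_people
    expected = sum(k * comb(n_people, k) for k in range(n_people + 1)) / total_outcomes
    p_overcrowded = (
        sum(comb(n_people, k) for k in range(capacity + 1, n_people + 1)) / total_outcomes
    )
    return expected, p_overcrowded


expected, p_overcrowded = coin_toss_attendance()
print(f"expected attendance: {expected:.1f}")
print(f"chance of more than 60 patrons: {p_overcrowded:.3f}")
```

The bar is rarely overcrowded under this rule, but on a typical Friday a sizable share of its capacity goes unused.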
The key to solving the El Farol problem turns out to be an efficient analysis of previous attendance, using methods akin to those in standard technical analysis. A more rigorous analysis can be done considering a famous variant of this problem, the so-called minority game. In this game an odd number of players compete against one another by choosing between two possible outcomes. Rather than deciding whether to visit a favorite bar, the option here is to buy or sell a stock. The winners of the game are those who end up selecting the side chosen by the minority. Because the number of winners is smaller than the number of losers, this is an interesting example of a negative-sum game.
Let us consider the simplest possible case, involving three players: Alice, who has grown up to become a respected trader, and two of her colleagues, who have some time to kill. At each stage of the game, they buy or sell a fictitious stock. If Alice plays a contrarian strategy — for example, if she sells when the other two traders buy — she will make money because she will sell at a price that has been set higher by the demand of her fellows. The goal of a good strategy is to correctly forecast the minority side.
In their simplest incarnation momentum strategies essentially follow the leading market sentiment. But in an optimal strategy, it is not wise to always follow the crowd. Even when we correctly anticipate the overall sentiment, we will want to enter or exit the market before other traders do. We will transact at a better price by being ahead of the crowd. It is, then, beneficial to be contrarian, and the minority game precisely models this scenario because winners by definition are in the minority group.
The mathematical tools necessary for the analysis of the minority game are those provided by statistical physics and in particular by the theory of phase transitions, a subject we briefly touched upon in a previous article.
Now let us consider the rules of the game in more detail. We have N players, with N an odd integer. Each player has finite resources and can process only a finite amount of past information. This information is public and amounts to knowing the history of the most recent M winning outcomes. The parameter M, therefore, represents the memory capacity of each trader. Note that the assumption of a limited memory is especially relevant in an era in which computational power is never enough to tame the exponential growth of Big Data.
If we represent sell/buy decisions as −1/1, and memory M = 5, a possible history could look like −1,−1,1,1,1, in which case the first two winning decisions are sells (sellers were a minority) and the remaining three are buys (buyers were a minority).
A strategy is a function that, given a history, makes a forecast for the next market outcome. The number of possible strategies is potentially enormous: With memory M there are 2^M possible histories and hence 2^(2^M) possible strategies. For a trading period of only two weeks, the number of strategies is far larger than the total number of particles in the visible universe. Therefore, despite its apparent simplicity, the model has a rich dynamic. From the strategy pool each player is given a certain number S of strategies to trade. Note that the case S = 1 would not be interesting because each trader would be forced to always play the same strategy and there would be no room for learning. Each strategy is awarded a virtual point for any correct prediction of the market outcome. At each round Alice trades her best strategy, that is, the one with the highest current virtual score. If her best strategy happens to correctly predict the next outcome, she will gain $1.
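The size of the strategy pool is easy to verify. The sketch below assumes that a two-week trading period means a memory of M = 10 daily outcomes, and it uses the commonly quoted order-of-magnitude estimate of 10^80 particles in the visible universe; both figures are assumptions for illustration. A strategy assigns one of two forecasts to each of the 2^M possible histories, which gives 2^(2^M) strategies in total.

```python
def count_strategies(memory):
    """Number of distinct strategies for a given memory length M."""
    n_histories = 2 ** memory  # every sequence of M buy/sell outcomes
    return 2 ** n_histories    # one binary forecast per history


PARTICLES_IN_UNIVERSE = 10 ** 80  # rough, commonly quoted estimate (assumption)

n = count_strategies(10)  # assumed: 10 trading days in two weeks
print(len(str(n)))                # 309 decimal digits
print(n > PARTICLES_IN_UNIVERSE)  # True
```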
At each round of the game, we can sum all traders’ decisions. This sum represents the excess demand for the underlying stock; its volatility models the fluctuations of the underlying market. A high-volatility regime will correspond to a noisy and unpredictable market, whereas a low-volatility regime will correspond to a predictable market. Not all strategies are independent, and the number of uncorrelated strategies turns out to be 2^M. Note that the smaller the traders’ memory capacity M, the smaller the number of uncorrelated strategies they can create. To explore the minority game’s remarkably rich dynamics, let us now introduce the parameter θ = 2^M/N, which represents the ratio of the number of uncorrelated strategies to the number of traders. As we vary θ, we have a phase transition from a predictable phase of the market to an unpredictable one.
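The rules above can be sketched as a short simulation. This is a minimal illustration under stated assumptions, not a reference implementation: strategies are drawn uniformly at random, ties between equally scored strategies are broken at random, the initial history is random, and volatility is measured as the variance of excess demand divided by N, so that pure coin tossing would give a value of about 1. The names `simulate_minority_game` and `volatility_per_player` and all default parameters are illustrative.

```python
import random


def simulate_minority_game(n_players=101, memory=3, n_strategies=2, rounds=1000, seed=0):
    """Play a basic minority game; return the excess demand recorded in each round."""
    if n_players % 2 == 0:
        raise ValueError("n_players must be odd so that a minority always exists")
    rng = random.Random(seed)
    n_histories = 2 ** memory
    # strategies[p][s][h] is the forecast (-1 = sell, +1 = buy) made by strategy s
    # of player p after history h; every lookup table is drawn at random.
    strategies = [
        [[rng.choice((-1, 1)) for _ in range(n_histories)] for _ in range(n_strategies)]
        for _ in range(n_players)
    ]
    scores = [[0] * n_strategies for _ in range(n_players)]
    history = rng.randrange(n_histories)  # last M winning sides packed into M bits
    excess_demand = []
    for _ in range(rounds):
        total = 0
        for p in range(n_players):
            # Trade the strategy with the highest virtual score (random tie-break).
            best = max(range(n_strategies), key=lambda s: (scores[p][s], rng.random()))
            total += strategies[p][best][history]
        winner = -1 if total > 0 else 1  # the minority side wins
        for p in range(n_players):
            for s in range(n_strategies):
                if strategies[p][s][history] == winner:
                    scores[p][s] += 1  # one virtual point per correct forecast
        # Shift the newest winning side into the history and drop the oldest.
        history = ((history << 1) | (1 if winner == 1 else 0)) % n_histories
        excess_demand.append(total)
    return excess_demand


def volatility_per_player(excess_demand, n_players):
    """Variance of the excess demand divided by the number of players."""
    mean = sum(excess_demand) / len(excess_demand)
    variance = sum((x - mean) ** 2 for x in excess_demand) / len(excess_demand)
    return variance / n_players


if __name__ == "__main__":
    n = 101
    for m in (2, 5, 9):
        demand = simulate_minority_game(n_players=n, memory=m)
        theta = 2 ** m / n
        print(f"M={m}  theta={theta:.2f}  volatility/N={volatility_per_player(demand, n):.2f}")
```

Sweeping M at fixed N varies θ = 2^M/N, so the printed values can be compared with the coin-toss benchmark of 1; results from a single short run are noisy and should be averaged over several seeds before drawing conclusions.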
When θ is small enough — that is, the number of independent strategies is small compared with the number of players — the traders end up crowding into the same strategies. The agents become a crowd as they process the available information in the same way and end up using the same best strategy. This phase is, therefore, dominated by herding; because the purpose of the game is to be in the minority, all the traders lose. Without a sufficiently diversified pool of strategies, traders behave like the proverbial sheep running off a cliff. In this phase players do worse than they would have if they’d just tossed a coin. In this “worse than random” regime, there is no helpful information to be extracted from history.
Let us now consider the other phase, for a sufficiently large θ. In this regime traders have a good memory and the number of independent strategies is sufficiently large relative to the number of players. We are in a “better than random” phase: Strategies perform better than random coin tosses, and future market behavior is predictable with some degree of statistical confidence. This phase is also known as the inefficient phase of the market: Arbitrage opportunities exist, and the efficient market hypothesis is clearly violated. Quite remarkable is the fact that as the traders’ memory capacity increases, the market’s predictability decreases, approaching asymptotically the random toss game. This shows that being omniscient is not beneficial!
At the interface of the worse-than-random and better-than-random phases, we have a very interesting regime in which the market has minimum volatility and therefore is in its most predictable state; traders have learned to share the limited available resources. This phase corresponds to a cooperative regime in which agents can efficiently process historical information and attempt to predict future market movements — a somewhat surprising revelation considering the intrinsically selfish nature of traders.
The minority game is a promising attempt at modeling the trial-and-error inductive thinking of real traders. It also shows the deep connection between certain applications of game theory and statistical physics. Von Neumann himself was very keen to apply ideas rooted in statistical physics to the study of large numbers of interacting agents in an economy. After all, game theory has never been an ivory tower theory, and from its very early beginnings it has benefited from the interaction with a vast range of fields, from economics to political science to biology. Game theory has evolved into a mature discipline central to our understanding of human behavior and strategic interactions. And perhaps more pragmatically, the next time you find yourself in an awkward social dilemma or wonder whether snitching on your friend is a good idea, you know where to look!
Robert Axelrod and William D. Hamilton. “The Evolution of Cooperation.” Science 211 (1981).
Robert Axelrod. The Evolution of Cooperation (Basic Books, 1984).
Damien Challet, Matteo Marsili and Yi-Cheng Zhang. Minority Games (Oxford University Press, 2005).
John Nash. “Non-Cooperative Games.” Annals of Mathematics 54, no. 2 (1951).
John von Neumann and Oskar Morgenstern. Theory of Games and Economic Behavior (Princeton University Press, 1944).
Thought Leadership articles are prepared by and are the property of WorldQuant, LLC, and are circulated for informational and educational purposes only.