Newcomb’s Paradox: A Solution Using Robots

Newcomb’s Paradox is a situation in decision theory where the principle of dominance conflicts with the principle of expected utility. This is how it works:

The player can choose to take both box A and box B, or just take box B. Box A contains $1,000. Box B contains nothing or $1,000,000. If the Predictor believes that the player will take both boxes, then the Predictor puts $0 in box B. If the Predictor believes that the player will take just B, then the Predictor puts $1,000,000 in box B. Then the player chooses. The player doesn’t know whether the Predictor has put the $1,000,000 in box B or not, but knows that the Predictor is 99% reliable in predicting what the player will do.

Dominance reasoning says the player should take both boxes. Here’s why:

If the Predictor predicted that the player will choose just one box, then if the player picks just box B the player gets $1,000,000, but if the player picks both boxes the player gets $1,001,000. $1,001,000 > $1,000,000, so in this case the player should pick both boxes.

If the Predictor predicted that the player will choose both boxes, then if the player picks just box B the player gets $0, but if the player picks both boxes, the player gets $1,000. $1,000 > $0, so in this case the player should pick both boxes.

So, no matter what the Predictor did, the player is better off choosing both boxes. Therefore, says dominance reasoning, the player should pick both boxes.
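The dominance argument can be checked mechanically. The following is a minimal sketch, not part of the original model; the names `payoff` and `twoBoxingDominates` are made up for illustration.

```cpp
// Payoff in dollars for one round of the game.
// picksBoth: true if the player takes boxes A and B, false if just box B.
// millionInBoxB: true if the Predictor put $1,000,000 in box B.
constexpr long payoff(bool picksBoth, bool millionInBoxB)
{
    return (millionInBoxB ? 1000000L : 0L) + (picksBoth ? 1000L : 0L);
}

// Taking both boxes dominates if it pays strictly more in every
// possible state of box B (full or empty).
constexpr bool twoBoxingDominates()
{
    return payoff(true, true) > payoff(false, true)
        && payoff(true, false) > payoff(false, false);
}
```

Here `payoff(true, true)` is 1,001,000 and `payoff(false, true)` is 1,000,000, so `twoBoxingDominates()` evaluates to true: once the contents of box B are held fixed, taking both boxes is always worth exactly $1,000 more.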

Expected utility reasoning, however, says the player should take just box B:

If the player picks both boxes, expected utility is 0.99*$1,000 + 0.01*$1,001,000 = $11,000. If the player picks just box B, expected utility is 0.99*$1,000,000 + 0.01*$0 = $990,000. Expected utility is (much) higher if the player picks just box B.
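The same arithmetic can be written out as a small sketch. The names `kReliability`, `expectedBothBoxes` and `expectedJustBoxB` are illustrative, not from the original code; the only input is the 99% reliability stated in the problem.

```cpp
// Predictor reliability, taken from the problem statement.
constexpr double kReliability = 0.99;

// Expected dollars if the player takes both boxes: the Predictor probably
// foresaw this (box B empty, player gets $1,000), and is wrong 1% of the
// time (box B full, player gets $1,001,000).
constexpr double expectedBothBoxes()
{
    return kReliability * 1000.0 + (1.0 - kReliability) * 1001000.0;
}

// Expected dollars if the player takes just box B: the Predictor probably
// foresaw this (box B full), and is wrong 1% of the time (box B empty).
constexpr double expectedJustBoxB()
{
    return kReliability * 1000000.0 + (1.0 - kReliability) * 0.0;
}
```

Up to floating-point rounding, `expectedBothBoxes()` comes to 11,000 and `expectedJustBoxB()` to 990,000.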

The problem is called a ‘paradox’ because two decision-making processes that both sound intuitively logical give conflicting answers to the question of what choice the player should make.

This description of Newcomb’s Paradox is actually ambiguous in certain respects. First, how does the Predictor predict? If you don’t have any idea, it could be difficult to figure out what’s going on here. The second (and related) ambiguity is how the player can choose. Can they choose randomly, for example? (If they choose in a completely random way, it is difficult to understand how the Predictor predicts correctly most of the time.)

Instead of addressing the ambiguous problem above, I decided to create a model of the situation that clarifies the exact mechanics. This model, then, might not address certain issues others have dealt with in the original problem, but it adheres to the general parameters above. Any solutions derived from the model apply to at least a subset of the formulations of the problem.

It is difficult to create a model with humans, because humans are too complex. That is, it is very difficult to predict human behaviour on an individualized basis.

Instead, I created a model involving robot agents, both player and Predictor.

This is how the model works (code at bottom of post):

time = 1

Player is either Defiant Dominance (DD) or Defiant Expected Utilitarian (DE). What this means is that

if player is DD, then % chance player picks both boxes = 99%.

if player is DE, then % chance player picks just box B = 99%.

time = 2

The Predictor checks the player’s state:

if player is DD, then Predictor puts no money in box B

if player is DE, then Predictor puts $1,000,000 in box B

time = 3

Then the player plays, based on its state as either DD or DE, as described above.

It follows that the Predictor will get it right about 99% of the time in a large trial, and that the DE (the player that consistently picks the expected utility choice) will end up much wealthier in a large trial.
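Both claims can be derived from the model without running any trials. The sketch below is an analytic cross-check under the model’s stated numbers; the names (`PlayerType`, `expectedYield`, `predictorAccuracy`, and so on) are illustrative and are not taken from the code at the bottom of the post.

```cpp
enum class PlayerType { DD, DE };

// Chance that a player acts in line with its type (99% in the model).
constexpr double kChanceFollowsType = 0.99;

// Probability that the player picks both boxes at time = 3.
constexpr double chancePicksBoth(PlayerType t)
{
    return t == PlayerType::DD ? kChanceFollowsType : 1.0 - kChanceFollowsType;
}

// Contents of box B, fixed by the Predictor at time = 2 from the type alone.
constexpr double boxBContents(PlayerType t)
{
    return t == PlayerType::DE ? 1000000.0 : 0.0;
}

// Expected yield per trial: whatever is in box B, plus the $1,000 in
// box A whenever the player picks both boxes.
constexpr double expectedYield(PlayerType t)
{
    return boxBContents(t) + chancePicksBoth(t) * 1000.0;
}

// The Predictor is right exactly when the player acts in line with its type.
constexpr double predictorAccuracy(PlayerType)
{
    return kChanceFollowsType;
}
```

This gives an expected yield of about $990 per trial for DD and about $1,000,010 for DE, in line with the large-trial averages reported in this post, and a Predictor accuracy of 99% for either type.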

Here are some empirical results:

trials = 100, average DD yield = $1,000, average DE yield = $1,000,000

trials = 10,000, $990.40, $1,000,010.20

trials = 100,000, $990.21, $1,000,010.09

Yet, to show the tension here, you can also imagine that the player is able to magically switch to dominance reasoning before selecting a box. This is how much the players lost by not playing dominance (same set of trials):

trials = 100, total DD lost = $0, total DE lost = $100,000

trials = 10,000, $96,000, $9,898,000

trials = 100,000, $979,000, $98,991,000
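These totals can be compared against what the model predicts on average. As a rough sketch (the function name `expectedTotalLost` is made up here), each time a player picks just box B it forgoes the $1,000 in box A:

```cpp
// Expected total dollars forgone by not taking box A, over a number of trials.
// chancePicksJustB: 0.01 for a DD player, 0.99 for a DE player in the model.
constexpr double expectedTotalLost(double chancePicksJustB, long trials)
{
    return chancePicksJustB * 1000.0 * static_cast<double>(trials);
}
```

For 10,000 trials this gives roughly $100,000 for DD and $9,900,000 for DE, close to the $96,000 and $9,898,000 observed above.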

What this shows is that dominance reasoning holds at the time the player chooses. Yet, the empirical results for yield for the kind of player (DD) that tends to choose dominance reasoning are abysmal (as shown in the average yield results earlier). This is the tension in this formulation of Newcomb’s Paradox.

What is clear, from looking at the code and considering the above, is that the problem isn’t with dominance reasoning at time = 3 (i.e., after the Predictor makes his prediction). A dominance choice always yields a better result than an expected utility choice, in a given environment.

The problem, rather, is with a player being a DD kind of player to begin with. If there is a DD player, the environment in which a player chooses becomes significantly impoverished. For example, here are the results for total rewards at stake (same trials):

trials = 100, total with DD = $100,000, total with DE = $100,100,000

trials = 10,000, $10,000,000 ($10M), $10,010,000,000 (> $10B)

trials = 100,000, $100,000,000 ($100M), $100,100,000,000 (> $100B)
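Unlike the yields, the totals at stake involve no randomness, since the Predictor fills box B from the player’s type alone. A minimal sketch (the name `totalAtStake` is illustrative):

```cpp
// Total dollars placed in the boxes over a number of trials.
// Box A always holds $1,000; box B holds $1,000,000 only for a DE player.
constexpr long long totalAtStake(bool millionInBoxB, long long trials)
{
    return trials * (1000LL + (millionInBoxB ? 1000000LL : 0LL));
}
```

For 100 trials, `totalAtStake(false, 100)` is 100,000 and `totalAtStake(true, 100)` is 100,100,000, matching the first row above.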

DD is born into an environment of scarcity, while DE is born into an environment of abundance. DE can ‘afford’ to consistently make suboptimal choices and still do better than DD because DE is given so much in terms of its environment.

Understanding this, we can change how a robot becomes a DE or a DD. (Certainly, humans can make choices before time = 2, i.e., before the Predictor makes his prediction, that might be relevant to their later choice at time = 3.) Instead of simply being assigned to DD or DE, at time = 1 the robot can make a choice using the reasoning as follows:

if expected benefits of being a DE type > expected benefits of being a DD type, then type = DE, otherwise type = DD
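One way to read this rule in code is sketched below. It assumes “expected benefits” means the expected yield per trial under the model’s 99% figures; the names `expectedBenefit` and `chooseType` are hypothetical and do not appear in the code at the bottom of the post.

```cpp
enum class PlayerType { DD, DE };

// Expected dollars per trial for a given type (assumed measure of "benefit"):
// the contents of box B, plus $1,000 times the chance of picking both boxes.
constexpr double expectedBenefit(PlayerType t)
{
    return t == PlayerType::DE
        ? 1000000.0 + 0.01 * 1000.0   // box B full, picks both 1% of the time
        : 0.0 + 0.99 * 1000.0;        // box B empty, picks both 99% of the time
}

// The choice made at time = 1, before the Predictor acts.
constexpr PlayerType chooseType()
{
    return expectedBenefit(PlayerType::DE) > expectedBenefit(PlayerType::DD)
        ? PlayerType::DE
        : PlayerType::DD;
}
```

Under these assumptions `chooseType()` returns `PlayerType::DE`.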

This does not speak directly to the rationality of dominance reasoning at the moment of choice at time = 3. That is, if a DE robot defied the odds and picked both boxes on every trial, it would do better, by $1,000 per trial, than a DE robot that picked only box B on every trial. (Ideally, of course, the player could choose to be a DE, then magically switch at the time of choice. This, however, contravenes the stipulation of the thought experiment, namely that the Predictor accurately predicts.)

By introducing a choice at time = 1, we now have space in which to say that dominance reasoning is right for the choice at time = 3, but something that agrees with expected utility reasoning is right for the choice at time = 1. So, we have taken a step towards resolving the paradox. We still, however, have a conflict at time = 3 between dominance theory and expected utility theory.

If we assume for the moment that dominance reasoning is the rational choice for the choice at time = 3, then we have to find a problem with expected utility theory at time = 3. A solution can be seen by noting that there is a difference between what a choice tells you (what we can call ‘observational’ probability) and what a choice will do (what we can call ‘causal’ probability).

We can then put the second piece of the puzzle into place by noting that observational probability is irrelevant for the player at the moment of a choice. Expected utility theory at time = 3 is not saying to the player “if you choose just box B then that causes a 99% chance of box B containing $1,000,000” but rather in this model “if you chose (past tense) to be a DE then that caused a 100% chance of box B containing $1,000,000 and also caused you to be highly likely to choose just box B.” I.e., expected utility theory at time = 3 is descriptive, not prescriptive.

That is, if you are the player, you must look at how your choice changes the probabilities compared to the other options. At t = 1, a choice to become a DE gives you a 100% chance of winning the $1,000,000, while a choice to become a DD gives you a 0% chance of winning the $1,000,000. At t = 3, the situation is quite different. Picking just box B does not cause the chances to be changed at all, as they were set at t = 2. To modify the chances, you must make a different choice at t = 1.

Observational probability, however, still holds at t = 3 in a different way. That is, someone looking at the situation can say “if the player chooses just box B, then that tells us that there is a very high chance there will be $1,000,000 in that box, but if they choose both boxes, then that tells us that there is a very low chance that there will be $1,000,000 in that box.”
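The difference between the two kinds of probability can be made concrete. The sketch below is an illustration only: it assumes an observer who starts with some prior belief `priorDE` that the player is a DE (the model itself does not specify one) and updates it with Bayes’ rule. All names are hypothetical.

```cpp
// Chance that a player acts in line with its type (99% in the model).
constexpr double kChanceFollowsType = 0.99;

// Observational probability: what seeing the pick tells an observer about
// box B. priorDE is the observer's prior that the player is DE (an assumption).
constexpr double observedChanceOfMillion(bool pickedJustB, double priorDE)
{
    return pickedJustB
        ? priorDE * kChanceFollowsType
            / (priorDE * kChanceFollowsType + (1.0 - priorDE) * (1.0 - kChanceFollowsType))
        : priorDE * (1.0 - kChanceFollowsType)
            / (priorDE * (1.0 - kChanceFollowsType) + (1.0 - priorDE) * kChanceFollowsType);
}

// Causal probability: box B was filled at time = 2 from the player's type
// alone, so the pick at time = 3 is ignored here on purpose.
constexpr double causalChanceOfMillion(bool isDE, bool /*pickedJustB*/)
{
    return isDE ? 1.0 : 0.0;
}
```

With an even prior of 0.5, `observedChanceOfMillion(true, 0.5)` is about 0.99 and `observedChanceOfMillion(false, 0.5)` is about 0.01, while `causalChanceOfMillion` returns the same value whichever box the player picks.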


So, what is the Paradox in Newcomb’s Paradox? At first, it seems like one method, dominance, contravenes another, expected utility. On closer inspection, however, we can see that both dominance and expected utility are correct, each in its own place.

First, there are two different decisions that can be made, in our model, at different times (time = 1 and time = 3).

player acts rationally at time = 1, choosing to become a DE

player thereby causes Predictor to create environmental abundance at time = 2

but player also thereby causes player to act irrationally at time = 3, choosing just box B

The benefits of acting rationally at time = 1 outweigh the benefits of acting rationally at time = 3, so “choosing just box B” is rational in so far as that phrase is understood as meaning to choose at time = 1 to be a DE, which in turn leads with high probability to choosing just box B.

Second, there are two different kinds of probability reasoning that are applicable: causal reasoning for agents (the player in this case), on the one hand, and expected utility for observers, on the other. Causal reasoning says at time = 1 to choose to be a DE, and at time = 3 to choose both boxes.

At neither time does causal reasoning conflict with dominance reasoning. Expected utility reasoning is applicable for observers of choices, while causal reasoning is applicable for agents making the choices.

Therefore, Newcomb’s Paradox is solved for the limited situation as described in this model.

Applying this to humans: For a human, there is probably no ‘choice’ to be a DD or DE at time = 1. Rather, there is a state of affairs at time = 1, which leads to their choice at time = 3. This state of affairs also causes the Predictor’s prediction at time = 2. The question of “free will” obscures the basic mechanics of the situation.

Since a human’s choice at t = 3 is statistically speaking preordained at time = 2 (per the stipulations of the thought experiment) when the Predictor makes his choice, all the human can do is make choices earlier than t = 2 to ensure that they in fact do pick just box B. How a human does this is not clear, because human reasoning is complex. This is a practical psychological question, however, and not a paradox.


One lesson I took from solving Newcomb’s Paradox is that building a working model can help to ferret out ambiguities in thought experiments. Moving to a model using robots, instead of trying to think through the process first-person, helped significantly in this respect, as it forced me to decide how the players decide, and how the Predictor predicts.

Creating a model in this case also created a simpler version of the problem, which could be solved first. Then, that solution could be applied to a more general context.

It only took a few hours to get the solution. The first part came first: that there is a potential choice at time = 1 that agrees with expected utility reasoning. The second part followed: that what matters for the player is how their choice causally changes the situation. After thinking I had solved it, I checked the Stanford Encyclopedia of Philosophy, and indeed, something along these lines is the consensus solution to Newcomb’s Paradox. (Some people debate whether causality should be invoked because in certain kinds of logic it is more parsimonious to not have to include it, and there are debates about the exact kind of causal probability reasoning that should be used.) The answer given here could be expanded upon in terms of developing an understanding of the agent and observer distinction, and in terms of just what kind of causal probability theory should be used.


UnicodeString NewcombProblem()
{
    int trialsCount = 100000;

    enum PlayerType
    {
        DefiantDominance,           // i.e., likes to use dominance reasoning
        DefiantExpectedUtilitarian  // i.e., likes to use expected utility reasoning
    };

    // setup which type of player for this run
    // time = 1;
    PlayerType playerType;
    //playerType = DefiantExpectedUtilitarian;
    playerType = DefiantDominance;

    double totalPlayerAmount = 0.0;
    double totalPlayerAmountLost = 0.0;
    double totalAmountAtStake = 0.0;
    int timesPredictorCorrect = 0;

    for (int trialIdx = 0; trialIdx < trialsCount; trialIdx++)
    {
        // Predictor makes his decision
        // time = 2;
        bool millionInBoxB = false;
        if (playerType == DefiantExpectedUtilitarian)
            millionInBoxB = true;

        // player makes their decision
        // time = 3;
        double chancePicksBoth = playerType == DefiantDominance ? 99 : 1;

        // now results ...
        // time = 4;
        // THOccurs is a helper not shown in this listing; it is assumed to
        // return true with the given percentage chance
        bool picksBoth = THOccurs(chancePicksBoth);

        // now tabulate return, if !millionInBoxB and !picksBoth, gets $0
        if (millionInBoxB)
            totalPlayerAmount += 1000000.0;
        if (picksBoth)
            totalPlayerAmount += 1000.0; // box A always has $1,000

        // environmental richness: everything the Predictor put on the table
        totalAmountAtStake += 1000.0;
        if (millionInBoxB)
            totalAmountAtStake += 1000000.0;

        // amount forgone by not using dominance reasoning at the moment of choice
        if (!picksBoth)
            totalPlayerAmountLost += 1000.0;

        // Predictor is correct when the box contents match the player's pick
        if (picksBoth && !millionInBoxB)
            timesPredictorCorrect++;
        if (!picksBoth && millionInBoxB)
            timesPredictorCorrect++;
    }

    double averageAmount = totalPlayerAmount/(double)trialsCount;
    double percentagePredictorCorrect = (double)timesPredictorCorrect/(double)trialsCount*100.0;

    UnicodeString s = "Trials: ";
    s += trialsCount;
    s += ", ";
    s += playerType == DefiantDominance ? "DefiantDominance" : "DefiantExpectedUtilitarian";
    s += " - Average amount: ";
    s += averageAmount;
    s += ", Total amount lost because didn't use dominance reasoning at moment of choice: ";
    s += totalPlayerAmountLost;
    s += ", Total amount at stake (environmental richness): ";
    s += totalAmountAtStake;
    s += ", Percentage Predictor correct: ";
    s += percentagePredictorCorrect;
    s += "%";
    return s;
}
