Chapter 4

Thought Experiments Supporting the Self-Sampling Assumption

This chapter and the next argue that we should accept SSA. In the process, we also elaborate on the principle’s intended meaning and begin to develop a theory of how SSA can be used in concrete scientific contexts to guide us through the thorny issues of anthropic biases.

The case for accepting SSA has two separable parts. One part focuses on its applications. We will continue the argument begun in the last chapter, that a new methodological rule is needed in order to explain how observational consequences can be derived from contemporary cosmological and other scientific theories. I will try to show how SSA can do this for us. This part will be considered in the next chapter, where we’ll also look at how SSA underwrites useful inferences in thermodynamics, evolutionary biology, and traffic analysis.

The present chapter deals with the other part of the case for SSA. It consists of a series of thought experiments designed to demonstrate that it is rational to reason in accordance with SSA in a rather wide range of circumstances. While the application-part can be likened to field observations, the thought experiments we shall conduct in this chapter are more like laboratory research. We here have full control over all relevant variables and can stipulate away inessential complications in order to hopefully get a more accurate measurement of our intuitions and epistemic convictions regarding SSA itself.

The Dungeon gedanken

Our first thought experiment is Dungeon:

The world consists of a dungeon that has one hundred cells. In each cell there is one prisoner. Ninety of the cells are painted blue on the outside and the other ten are painted red. Each prisoner is asked to guess whether he is in a blue or a red cell. (Everybody knows all this.) You find yourself in one of the cells. What color should you think it is?—Answer: Blue, with 90% probability.

Since 90% of all observers are in blue cells, and you don’t have any other relevant information, it seems you should set your credence of being in a blue cell to 90%. Most people I’ve talked to agree that this is the correct answer. Since the example does not depend on the exact numbers involved, we have the more general principle that in cases like this, your credence of having property P should be equal to the fraction of observers who have P, in accordance with SSA.1 Some of our subsequent investigations in this chapter will consider arguments for extending this class in various ways.

1 This does not rule out that there could be other principles of assigning probabilities that would also provide plausible guidance in Dungeon, provided their advice coincides with that of SSA. For example, a relatively innocuous version of the Principle of Indifference, formulated as “Assign the same credence to any two hypotheses if you don’t have any reason to prefer one to the other”, would also do the trick in Dungeon. But subsequent thought experiments impose additional constraints. For reasons that will become clear, it doesn’t seem that any straightforward principle of indifference would suffice to express the needed methodological rule.

While many accept without further argument that SSA is applicable to the Dungeon gedanken, let’s consider how one might seek to defend this view if challenged to do so.

One argument we may advance is the following. Suppose everyone accepts SSA and everyone has to bet on whether they are in a blue or a red cell. Then 90% of all prisoners will win their bets; only 10% will lose. Suppose, on the other hand, that SSA is rejected and the prisoners think that one is no more likely to be in a blue cell than in a red cell; so they bet by flipping a coin. Then, on average, 50% of the prisoners will win and 50% will lose. It seems better that SSA be accepted.
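The win rates quoted in this argument can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes that a prisoner who accepts SSA simply bets "blue", that a prisoner who rejects it bets "blue" with probability 1/2, and the function name is hypothetical.

```python
from fractions import Fraction

BLUE_CELLS = 90
RED_CELLS = 10
TOTAL = BLUE_CELLS + RED_CELLS

def expected_win_fraction(p_bet_blue):
    """Expected fraction of prisoners who win when each one
    independently bets 'blue' with probability p_bet_blue."""
    p = Fraction(p_bet_blue)
    # Prisoners in blue cells win by betting blue; those in red cells win by betting red.
    expected_wins = BLUE_CELLS * p + RED_CELLS * (1 - p)
    return expected_wins / TOTAL

ssa_bettors = expected_win_fraction(1)                 # everyone bets blue
coin_flippers = expected_win_fraction(Fraction(1, 2))  # everyone flips a coin

print(ssa_bettors, coin_flippers)
```

Under these assumptions the first figure is the 90% of the text and the second is the 50% average for coin-flippers.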

This argument is incomplete as it stands. That one betting-pattern A leads more people to win their bets than does another pattern B does not necessarily make it rational for anybody to prefer A to B. In Dungeon, consider the pattern A which specifies that “If you are Harry Smith, bet you are in a red cell; if you are Geraldine Truman, bet that you are in a blue cell; . . .”— such that for each person in the experiment, A gives the advice that will lead him or her to be right. Adopting rule A will lead to more people winning their bets (100%) than any other rule. In particular, it outperforms SSA which has a mere 90% success rate.

Intuitively it is clear that rules like A are cheating. This is best seen by putting A in the context of its rival permutations A′, A″, A‴ etc., which map the captives’ names to recommendations about betting red or blue in different ways than does A. Most of these permutations do rather badly. On average, they give no better advice than flipping a coin, which we saw was inferior to accepting SSA. Only if the people in the cells could pick the right A-permutation would they benefit. In Dungeon, they don’t have any information enabling them to do this. If they picked A and consequently benefited, it would be pure luck.
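The claim that name-based rules do no better on average than coin-flipping can be illustrated by simulation. This is a rough sketch under a stated assumption: the rival rules are modelled as arbitrary name-to-color assignments drawn uniformly at random, which is one way of reading "permutations" here. The function name, the number of sampled rules, and the seed are all hypothetical choices.

```python
import random

def average_success_of_random_rules(n_blue=90, n_red=10, n_rules=2000, seed=0):
    """Average fraction of correct bets over randomly drawn name-based rules."""
    rng = random.Random(seed)
    # cells[i] is the true color of the cell holding prisoner i.
    cells = ["blue"] * n_blue + ["red"] * n_red
    total_correct = 0
    for _ in range(n_rules):
        # One rival rule: an arbitrary recommendation for each named prisoner.
        advice = [rng.choice(("blue", "red")) for _ in cells]
        total_correct += sum(a == c for a, c in zip(advice, cells))
    return total_correct / (n_rules * len(cells))

avg = average_success_of_random_rules()
print(round(avg, 3))
```

The average lands close to one half, well below the 90% achieved by betting in accordance with SSA, even though one particular rule in the family gets everybody right.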

What allows the people in Dungeon to do better than chance is that they have a relevant piece of empirical information regarding the distribution of observers over the two types of cells. They have been informed that 90% of them are in blue cells and it would be irrational of them not to take this information into account. We can imagine a series of thought experiments where an increasingly large fraction of observers are in blue cells—91%, 92%, . . . , 99%. The situation gradually degenerates into the 100%-case where they are told, “You are all in blue cells”, from which each can deductively infer that she is in a blue cell. As the situation approaches this limiting case, it is plausible to require that the strength of participants’ beliefs about being in a blue cell should gradually approach probability 1. SSA has this property.

One may notice that although 90% of the detainees would win their bets if they adopted SSA, there are even simpler methods that produce the same result, for instance: “Set your probability of being in a blue cell equal to 1 if most people are in blue cells; and to 0 otherwise.” Using this epistemic rule will also result in 90% of the people winning their bets. Such a rule, however, would not be attractive. First, when the participants step out of their cells, some of them will find that they were in red cells. Yet if their prior probability of that were zero, they could never learn that by Bayesian belief updating. A second and more generic problem is that when we consider rational betting quotients, rules like this are revealed to be inferior. A person whose probability for finding herself in a blue cell was 1 would be willing to bet on that hypothesis at any odds.2 The people following this simplified rule would thus risk losing arbitrarily great sums of money for an arbitrarily small and uncertain gain, an uninviting strategy. Moreover, collectively they would be guaranteed to lose an arbitrarily large sum.

2 Setting aside, as is customary in contexts like this, any risk aversion or aversion against gambling, or computational limitations that the person might have.

Suppose we agree that all the participants should assign the same probability to being in a blue cell (which is quite plausible since their evidence does not differ in any relevant way). It is then easy to show that out of all possible probabilities they could assign to finding themselves in blue cells, a probability of 90% is the only one which would make it impossible to bet against them in such a way that they were collectively guaranteed to lose money. And in general, if we vary the numbers of the example, their degree of belief would in each case have to be what SSA prescribes in order to save them from being a collective sucker.
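The collective betting argument can be made concrete as follows. The sketch rests on assumptions that are not spelled out in the text: every prisoner shares the credence q of being in a blue cell, treats q as a fair price for a ticket paying one unit if her cell is blue, is willing to take either side of that bet, and is risk-neutral. The function names and the grid of candidate credences are hypothetical.

```python
from fractions import Fraction

N_BLUE, N_RED = 90, 10

def collective_payoffs(credence_blue):
    """Prisoners' total net payoff under two bookie strategies.

    A ticket pays 1 unit if the holder's cell is blue; q is its price.
    """
    q = Fraction(credence_blue)
    n = N_BLUE + N_RED
    # Bookie sells every prisoner a ticket: the group pays n*q and collects N_BLUE.
    prisoners_buy = N_BLUE - n * q
    # Bookie buys a ticket from every prisoner: the group collects n*q and pays N_BLUE.
    prisoners_sell = n * q - N_BLUE
    return prisoners_buy, prisoners_sell

def is_collectively_exploitable(credence_blue):
    """True if some bookie strategy guarantees the group a net loss."""
    return min(collective_payoffs(credence_blue)) < 0

candidates = [Fraction(k, 100) for k in range(101)]
safe = [q for q in candidates if not is_collectively_exploitable(q)]
print(safe)
```

On this grid the only credence that survives is 9/10: any higher value lets the bookie profit by selling tickets, and any lower value lets the bookie profit by buying them.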

On an individual level, if we imagine the experiment repeated many times, the only way a given participant could avoid having a negative expected outcome when betting repeatedly against a shrewd outsider would be by setting her odds in accordance with SSA.

All these considerations support what seems to be most persons’ initial intuition about Dungeon: that it is a situation where one should reason in accordance with SSA. Any plausible principle of the epistemology of information that has an indexical component would have to agree with SSA’s verdicts in this particular case.

Another thing to notice about Dungeon is that we didn’t specify how the prisoners arrived in their cells. The prisoners’ ontogenesis is irrelevant so long as they don’t know anything about it that gives them clues about the color of their abodes. They may have been allocated to their respective cells by some objectively random process such as drawing tickets from a lottery urn, after which they were blindfolded and led to their designated locations. Or they may have been allowed to choose cells for themselves, and a fortune wheel subsequently spun to determine which cells should be painted blue and which red. But the gedanken doesn’t depend on there being a well-defined randomization mechanism. One may just as well imagine that prisoners have been in their cells since the time of their birth or indeed since the beginning of the universe. If there is a possible world where the laws of nature dictate which individuals are to appear in which cells, without any appeal to initial conditions, then the inmates would still be rational to follow SSA, provided only that they did not have knowledge of the laws or were incapable of deducing what the laws implied about their own situation. Objective chance, therefore, is not an essential part of the thought experiment. It runs on low-octane subjective uncertainty.

Two thought experiments by John Leslie

We shall now look at an argument for extending the range of cases where SSA can be applied. We shall see that the synchronous nature of Dungeon is inessential: you can in some contexts legitimately reason as if you were a random sample from a reference class that includes observers who exist at different times. Also, we will find that one and the same reference class can contain observers who differ in many respects, including their genes and gender. To this effect, consider an example due to John Leslie, which we shall refer to as Emeralds:

Imagine an experiment planned as follows. At some point in time, three humans would each be given an emerald. Several centuries afterwards, when a completely different set of humans was alive, five thousand humans would each be given an emerald. Imagine next that you have yourself been given an emerald in the experiment. You have no knowledge, however, of whether your century is the earlier century in which just three people were to be in this situation, or in the later century in which five thousand were to be in it. . . .

Suppose you in fact betted that you lived [in the earlier century]. If every emerald-getter in the experiment betted in this way, there would be five thousand losers and only three winners. The sensible bet, therefore, is that yours is instead the later century of the two. (Leslie 1996), p. 20

The arguments that were made for SSA in Dungeon can be recycled in Emeralds. Leslie makes the point about more people being right if everyone bets that they are in the later of the two centuries. As we saw in the previous section, this point needs to be supplemented by additional arguments before it yields support for SSA. (Leslie gives the emeralds example as a response to one objection against the Doomsday argument. He never formulates SSA, but parts of his arguments in defense of the Doomsday argument and parts of his account of anthropic reasoning in cosmology are relevant to evaluating SSA.)
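For Emeralds, the SSA credence is just the share of emerald-getters who live in each century. A minimal sketch, with a hypothetical helper name and dictionary labels chosen for illustration:

```python
from fractions import Fraction

def ssa_credence(group_sizes, group):
    """Credence of belonging to `group` when reasoning as if one were a
    random sample from all observers counted in `group_sizes`."""
    total = sum(group_sizes.values())
    return Fraction(group_sizes[group], total)

emerald_getters = {"earlier century": 3, "later century": 5000}
p_later = ssa_credence(emerald_getters, "later century")
p_earlier = ssa_credence(emerald_getters, "earlier century")

print(p_later, float(p_later))
```

The two credences sum to one, and the credence of living in the later century is 5000/5003, which matches the proportion of winners if everyone bets on the later century.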

As Leslie notes, we can learn a second lesson if we consider a variant of the emeralds example (Two Batches):

A firm plan was formed to rear humans in two batches: the first batch to be of three humans of one sex, the second of five thousand of the other sex. The plan called for rearing the first batch in one century. Many centuries later, the five thousand humans of the other sex would be reared. Imagine that you learn you’re one of the humans in question. You don’t know which centuries the plan specified, but you are aware of being female. You very reasonably conclude that the large batch was to be female, almost certainly. If adopted by every human in the experiment, the policy of betting that the large batch was of the same sex as oneself would yield only three failures and five thousand successes. . . . [Y]ou mustn’t say: ‘My genes are female, so I have to observe myself to be female, no matter whether the female batch was to be small or large. Hence I can have no special reason for believing it was to be large.’ (Ibid. pp. 222–3)

If we accept this, we can conclude that members of both genders can be in the same reference class. In a similar vein, one can argue for the irrelevance of short or tall, black or white, rich or poor, famous or obscure, fierce or meek, etc. If analogous arguments with two batches of people with any of these property pairs are accepted, then we have quite a broad reference class already. We shall return in a moment to consider what limits there might be to the inclusiveness of the reference class, but first we want to look at another dimension in which one may seek to extend the applicability of SSA.
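The inference in Two Batches can be written out as a Bayesian update. The sketch assumes, beyond what Leslie's passage states, that before learning your sex you give equal prior credence to the two hypotheses about which sex the large batch has; the likelihoods then come from SSA. The variable names are illustrative.

```python
from fractions import Fraction

SMALL, LARGE = 3, 5000
TOTAL = SMALL + LARGE

# Assumed prior: no reason to favor either hypothesis before learning your sex.
prior = {
    "large batch female": Fraction(1, 2),
    "large batch male": Fraction(1, 2),
}
# SSA likelihood of finding yourself female under each hypothesis.
likelihood_female = {
    "large batch female": Fraction(LARGE, TOTAL),
    "large batch male": Fraction(SMALL, TOTAL),
}

evidence = sum(prior[h] * likelihood_female[h] for h in prior)
posterior = {h: prior[h] * likelihood_female[h] / evidence for h in prior}

print(posterior["large batch female"])
```

With these inputs the posterior that the large batch is female equals 5000/5003, in line with the "almost certainly" of the quoted passage.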

The Incubator gedanken

All the examples so far have been of situations where all the competing hypotheses entail the same number of observers in existence. A key new element is introduced in cases where the total number of observers is different depending on which hypothesis is true. Here is a simple case where this happens.

Incubator, version I

Stage (a): In an otherwise empty world, a machine called “the incubator”3 kicks into action. It starts by tossing a fair coin. If the coin falls tails then it creates one room and a man with a black beard inside it. If the coin falls heads then it creates two rooms, one with a black-bearded man and one with a white-bearded man. As the rooms are completely dark, nobody knows his beard color. Everybody who’s been created is informed about all of the above. You find yourself in one of the rooms. Question: What should be your credence that the coin fell tails?

3 We suppose the incubator to be a mindless automaton that doesn’t count as an observer.

Stage (b): A little later, the lights are switched on, and you discover that you have a black beard. Question: What should your credence in Tails be now?

Consider the following three models of how you should reason:

Model 1 (Naïve)

Neither at stage (a) nor at stage (b) do you have any relevant information as to how the coin (which you know to be fair) landed. Therefore, in both instances, your credence of Tails should be 1/2.

Answer: At stage (a) your credence of Tails should be 1/2 and at stage (b) it should be 1/2.

Model 2 (SSA)

If you had had a white beard, you could have inferred that there were two rooms, which entails Heads. Knowing that you have a black beard does not allow you to rule out either possibility but it is still relevant information. This can be seen by the following argument. The prior probability of Heads is one half, since the coin was fair. If the coin fell heads, then one out of two observers has a black beard; hence by SSA, the conditional probability of having a black beard given Heads is one half. If the coin fell tails, then the only observer in existence has a black beard; hence, also by SSA, the conditional probability of a black beard given Tails is one. That is, we have

P(Heads) = P(¬Heads) = 1/2

P(Black | Heads) = 1/2

P(Black | ¬Heads) = 1

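By Bayes’ theorem, the posterior probability of Heads, after conditionalizing on Black, is

P(Heads | Black) = P(Black | Heads) P(Heads) / [P(Black | Heads) P(Heads) + P(Black | ¬Heads) P(¬Heads)] = (1/2 · 1/2) / (1/2 · 1/2 + 1 · 1/2) = 1/3.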

Answer: At stage (a) your credence of Tails should be 1/2 and at stage (b) it should be 2/3.
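The Model 2 figures follow mechanically from the prior and the two conditional probabilities displayed above. The following sketch, with a hypothetical function name, carries out the update in exact fractions:

```python
from fractions import Fraction

def posterior_heads(prior_heads, p_black_given_heads, p_black_given_tails):
    """Bayes' theorem for P(Heads | Black) with two exhaustive hypotheses."""
    prior_heads = Fraction(prior_heads)
    numerator = p_black_given_heads * prior_heads
    denominator = numerator + p_black_given_tails * (1 - prior_heads)
    return numerator / denominator

# Model 2: fair-coin prior, SSA likelihoods P(Black | Heads) = 1/2, P(Black | Tails) = 1.
p_heads = posterior_heads(Fraction(1, 2), Fraction(1, 2), Fraction(1))
p_tails = 1 - p_heads

print(p_heads, p_tails)
```

The posterior for Heads is 1/3, so the stage (b) credence in Tails is 2/3.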

Model 3 (SSA & SIA)

It is twice as likely that you should exist if two observers exist than if only one observer exists. This follows if we make the Self-Indication Assumption (SIA), to be explained shortly. The prior probability of Heads should therefore be 2/3, and of Tails, 1/3. As in Model 2, the conditional probability of a black beard given Heads is 1/2 and the conditional probability of a black beard given Tails is 1.

P(Heads) = 2/3

P(¬Heads) = 1/3

P(Black | Heads) = 1/2

P(Black | ¬Heads) = 1

By Bayes’ theorem, we get

P (Heads | Black) = 1/2.

Answer: At stage (a) your credence of Tails should be 1/3 and at stage (b) it should be 1/2.
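Model 3's numbers can be reproduced by weighting each hypothesis's objective chance by the number of observers it implies and then renormalizing. That weighting is one reading of "twice as likely that you should exist if two observers exist", adopted here as an assumption; the function and variable names are hypothetical.

```python
from fractions import Fraction

def sia_prior(chances, observer_counts):
    """Reweight objective chances by observer count, then renormalize."""
    weights = {h: chances[h] * observer_counts[h] for h in chances}
    total = sum(weights.values())
    return {h: w / total for h, w in weights.items()}

chances = {"Heads": Fraction(1, 2), "Tails": Fraction(1, 2)}
observers = {"Heads": 2, "Tails": 1}
prior = sia_prior(chances, observers)

# SSA likelihoods of finding a black beard, as in Model 2.
p_black = {"Heads": Fraction(1, 2), "Tails": Fraction(1)}
evidence = sum(prior[h] * p_black[h] for h in prior)
posterior = {h: prior[h] * p_black[h] / evidence for h in prior}

print(prior["Tails"], posterior["Tails"])
```

The stage (a) credence in Tails comes out as 1/3 and the stage (b) credence as 1/2, so on this model the SIA shift in the prior and the SSA update on beard color cancel exactly.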

The last model uses something that we have dubbed the Self-Indication Assumption, according to which you should conclude from the fact that you came into existence that probably quite a few observers did:

(SIA) Given the fact that you exist, you should (other things equal) favor hypotheses according to which many observers exist over hypotheses on which few observers exist.

SIA may seem prima facie implausible, and we shall argue in chapter 7 that it is no less implausible ultimo facie. Yet some of the more profound criticisms of specific anthropic inferences rely implicitly on SIA. In particular, adopting SIA annihilates the Doomsday argument. It is therefore good to put it on the table so that we can consider what reasons there are for accepting or rejecting it. To give SIA the best chance it can get, we will postpone this evaluation until we have discussed the Doomsday argument and have seen why a range of more straightforward objections against the Doomsday argument fail. The fact that SIA could seem to be the only coherent way (but later we’ll show that it only seems that way!) of resisting the Doomsday argument is possibly the strongest argument that can be made in its favor.

For the time being, we put SIA to one side (i.e. we assume that it is false) and focus on comparing Model 1 and Model 2. The difference between these models is that Model 2 uses SSA and Model 1 doesn’t. By determining which of these models is correct, we get a test of whether SSA should be applied in epistemic situations where hypotheses implying different numbers of observers are entertained. If we find that Model 2 (or, for that matter, Model 3) is correct, we have extended the applicability of SSA beyond what was established in the previous sections, where the number of observers did not vary between the hypotheses under consideration.

In Model 1 we are told to consider the objective chance of 50% of the coin falling heads. Since you know about this chance, you should according to Model 1 set your subjective credence equal to it.

The step from knowing about the objective chance to setting your credence equal to it follows from the Principal Principle4. This is not the place to delve into the details of the debates surrounding this principle and the connection between chance and credence (see Skyrms 1980; Kyburg, Jr. 1981; Bigelow, Collins, & Pargetter 1993; Hall 1994; Halpin 1994; Thau 1994; Strevens 1995; Hoefer 1997, 2007; Black 1998; Sturgeon 1998; Vranas 1998; Bostrom 1999). Suffice it to point out that the Principal Principle does not say that you should always set your credence equal to the corresponding objective chance if you know it. Instead, it says that you should do this unless you have other relevant information that should be taken into account. There is some controversy about how to specify which types of such additional information will modify reasonable credence when the objective chance is known, and which types of additional information will leave the identity intact. But there is general agreement that the proviso is needed. For example, no matter how objectively chancy a process is, and no matter how well you know the chance, if you have actually seen what the outcome was, your credence in that observed outcome should of course be one (or extremely close to one) and your credence in any other outcome the process could have had should be (very close to) zero. This is so quite independently of what the objective chance was. None of this is controversial.

4 David Lewis (Lewis 1986, 1994). A similar principle had earlier been formulated by Hugh Mellor (Mellor 1971).

Now, the point is that in Incubator you do have such extra relevant information that you need to take into account, and Model 1 fails to do that. The extra information is that you have a black beard. This information is relevant because it bears probabilistically on whether the coin fell heads or tails. We can see this as follows. Suppose you are in a room but you don’t know what color your beard is. You are just about to look in the mirror. If the information that you have a black beard were not probabilistically relevant to how the coin fell, there would be no need for you to change your credence about the outcome after looking in the mirror. But this is an incoherent position. For there are two things you may find when looking in the mirror: that you have a black beard or that you have a white beard. Before the light comes on and you see the mirror, you know that if you find that you have a white beard then you will have conclusively refuted the hypothesis that the coin fell tails. So the mirror might give you information that would increase your credence of Heads (to 1). But that entails that making the other possible finding (that you have a black beard) must decrease your credence in Heads. In other words, your conditional credence of Heads given black beard must be less than your unconditional credence of Heads.

If your conditional probability of Heads given a black beard were not lower than the probability you assign to Heads, while also your conditional probability of Heads given a white beard equals one, then you would be incoherent. This is easily shown by a standard Dutch book argument, or more simply as follows:

Write h for the hypothesis that the coin fell heads, and e for the evidence that you have a black beard. We can assume that P (e|h) < 1.

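Then we have

P(h | e) / P(¬h | e) = [P(e | h) P(h)] / [P(e | ¬h) P(¬h)] = P(e | h) · P(h) / P(¬h) < P(h) / P(¬h),

using P(e | ¬h) = 1, since if the coin fell tails the only observer in existence has a black beard.

So the quotient between the probabilities of h and ¬h is smaller after e is known than before. In other words, learning e decreases the probability of h and increases the probability of ¬h.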
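The inequality can also be checked numerically. The sketch below assumes P(e | ¬h) = 1, as in Incubator where the only observer on Tails has a black beard, and sweeps a small, arbitrarily chosen grid of priors and likelihoods strictly between 0 and 1. The function name is hypothetical.

```python
from fractions import Fraction

def posterior_h(prior_h, p_e_given_h, p_e_given_not_h=Fraction(1)):
    """P(h | e) by Bayes' theorem for the two hypotheses h and not-h."""
    numerator = p_e_given_h * prior_h
    return numerator / (numerator + p_e_given_not_h * (1 - prior_h))

# Priors and likelihoods P(e | h) taken from 1/10, 2/10, ..., 9/10.
grid = [Fraction(k, 10) for k in range(1, 10)]
always_lower = all(
    posterior_h(prior, likelihood) < prior
    for prior in grid
    for likelihood in grid
)

print(always_lower)
```

For every pair on the grid, learning e lowers the probability of h, as the derivation requires whenever P(e | h) < 1.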

So the observation that you have a black beard gives you relevant information that you need to take into account and it should lower your credence of Heads to below your unconditional credence of Heads, which (provided we reject SIA) is 50%. Model 1, which fails to do this, is therefore wrong.

Model 2 does take the information about your beard color into account and sets your posterior credence of Heads to 1/3, lower than it would have been had you not seen your beard. This is a consequence of SSA. The exact figure depends on the assumption that your conditional probability of a black beard equals that of a white beard, given Heads. If you knew that the coin landed heads but you hadn’t yet looked in the mirror, you would know that there was one man with a white beard and one with black. Provided these men were sufficiently similar in other respects (so that from your present position of ignorance about your beard color you didn’t have any evidence as to which one of them you are), these conditional credences should both be 50% according to SSA.

If we agree that Model 2 is the correct one for Incubator, then we have seen how SSA can be applied to problems where the total number of observers in existence is not known. In chapter 10, we will reexamine Incubator and argue for adoption of a fourth model, which conflicts with Model 2 in subtle but important ways. The motivation for doing this, however, will become clear only after detailed investigations into the consequences of accepting Model 2. So for the time being, we will adopt Model 2 as our working assumption in order to explore the implications of the way of thinking it embodies.

If we combine this with the lessons of the previous thought experiments, we now have a very wide class of problems where SSA can be applied. In particular, we can apply it to reference classes that contain observers who live at different times and who are different in many substantial ways including genes and gender, and to reference classes that may be of different sizes depending on which hypothesis under consideration is true.

One may wonder if there are any limits at all to how much we can include in the reference class. There are. We shall now see why.

The reference class problem

The reference class in the SSA is the class of entities such that one should reason as if one were randomly selected from it. We have seen examples of things that must be included in the reference class. In order to complete the specification of the reference class, we also have to determine what things must be excluded.

In many cases, where the total number of observers is the same on any of the hypotheses assigned non-zero probability, the problem of the reference class appears irrelevant. For instance, take Dungeon and suppose that in ten of the blue cells there is a polar bear instead of a human observer. Now, whether the polar bears count as observers who are members of the reference class makes no difference. Whether they do or not, you know you are not one of them. Thus you know that you are not in one of the ten cells they occupy. You therefore recalculate the probability of being in a blue cell to be 80/90, since 80 out of the 90 observers whom you—for all you know— might be, are in blue cells. Here you have simply eliminated the ten polar-bear cells from the calculation. But this does not rely on the assumption that polar bears aren’t included in the reference class. The calculation would come out the same if the bears were replaced with human observers who were very much like yourself, provided you knew you were not one of them. Maybe you are told that ten people who have a birthmark on their right calves are in blue cells. After verifying that you yourself don’t have such a birthmark, you adjust your probability of being in a blue cell to 80/90. This is in agreement with SSA. According to SSA (given that the people with the birthmarks are in the reference class), P(Blue cell | Setup) = 90/100. But also by SSA, P(Blue cell | Setup & Ten of the people in blue cells have birth marks of a type you don’t have) = 80/90.
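The recalculation in the polar-bear and birthmark variants amounts to dropping the occupants you know you are not and recomputing the fraction. A small sketch, with a hypothetical function name and parameters:

```python
from fractions import Fraction

def credence_blue(n_blue, n_red, excluded_blue=0, excluded_red=0):
    """Credence of being in a blue cell after removing occupants
    (polar bears, people with birthmarks) you know you are not."""
    blue = n_blue - excluded_blue
    red = n_red - excluded_red
    return Fraction(blue, blue + red)

before = credence_blue(90, 10)                    # original Dungeon
after = credence_blue(90, 10, excluded_blue=10)   # ten blue cells ruled out

print(before, after)
```

The second value is 80/90, printed in lowest terms as 8/9, and it is the same whether the ten excluded occupants are bears or humans with birthmarks.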

Where the definition of the reference class becomes an issue is when the total number of observers is unknown and is correlated with the hypotheses under consideration. Consider the following schema for producing Incubator-type experiments: There are two rooms. Whichever way the coin falls, a person with a black beard is created in Room 1. If and only if it falls heads, then one other thing x is created in Room 2. You find yourself in one of the rooms and you are informed that it is Room 1. We can now ask, for various choices of x, what your credence should be that the coin fell heads.

The original version of Incubator was one where x is a man with a white beard.


As we saw above, on Model 2 (“SSA and not SIA”), your credence of Heads is 1/3. But now consider a second case (version II) where we let x be a rock.


In version II, when you find that you are the man in Room 1, it is evident that your credence of Heads should be 1/2. The conditional probability of you observing what you are observing (i.e. your being the man in Room 1) is unity on both Heads and Tails, because with this setup you couldn’t possibly have found yourself observing being in Room 2. (We assume, of course, that the rock does not have a soul or a mind.) Notice that the arguments used to argue for SSA in the previous examples cannot be used in version II. A rock cannot bet and cannot be wrong, so the fraction of observers who are right or would win their bets is not improved here by including rocks in the reference class. Moreover, it seems impossible to conceive of a situation where you are ignorant as to whether you are the man in Room 1 or the rock in Room 2.

If this is right then the probability you should assign to Heads depends on what you know would be in Room 2 if the coin fell heads, even though you know that you are in Room 1. The reference class problem can be relevant in cases like this, where the size of the population depends on which hypothesis is true. What you should believe depends on whether the object x that would be in Room 2 would be in the reference class or not. It makes a difference to your rational credence whether x is a rock or an observer like yourself.
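The dependence on x can be sketched as a single function whose only input is whether x belongs to the reference class. The sketch assumes Model 2 (SSA without SIA), so the prior for Heads stays at the coin's chance of 1/2; the function name is hypothetical.

```python
from fractions import Fraction

def credence_heads_in_room1(x_in_reference_class, prior_heads=Fraction(1, 2)):
    """Credence in Heads after learning you are the man in Room 1."""
    # On Tails only the Room 1 man exists, so being in Room 1 is certain.
    p_room1_given_tails = Fraction(1)
    # On Heads, you are one of two reference-class members only if x counts as one.
    p_room1_given_heads = Fraction(1, 2) if x_in_reference_class else Fraction(1)
    numerator = p_room1_given_heads * prior_heads
    return numerator / (numerator + p_room1_given_tails * (1 - prior_heads))

version_i = credence_heads_in_room1(True)    # x is a white-bearded man
version_ii = credence_heads_in_room1(False)  # x is a rock

print(version_i, version_ii)
```

With an observer in Room 2 the credence in Heads is 1/3; with a rock it is 1/2. The difference comes entirely from the likelihood term on Heads.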

Rocks, consequently, are not in the reference class. In a similar vein we can rule out armchairs, planets, books, plants, bacteria, and other such non-observer entities. It gets trickier when we consider possible borderline cases such as a gifted chimpanzee, a Neanderthal, or a mentally disabled human. It is not clear whether the earlier arguments for including things in the reference class could be used to argue that these entities should be admitted. Can a severely mentally disabled person bet? Could you have found yourself as such a person? (Although anybody could of course in one sense become severely mentally disabled, it could be argued that the being that results would not in any real sense still be “you,” if the damage is sufficiently severe.)

That these questions arise seems to suggest that something beyond a plain version of the principle of indifference is involved. The principle of indifference is primarily about what your credence should be when you are ignorant of certain facts (Castell 1998; Strevens 1998). SSA purports to determine conditional probabilities of the form P(“I’m an observer with such and such properties” | “The world is such and such”), and it applies even when you were never ignorant of who you are and what properties you have.5

5 An additional problem with the principle of indifference is that it balances precariously between vacuity and inconsistency. Starting from the generic formulation suggested earlier, “Assign equal credence to any two hypotheses if you don’t have any reason to prefer one to the other”, one can make it go either way depending on how strong an interpretation one gives of “reason”. If reasons can include any subjective inclination, the principle loses most if not all of its content. But if having a reason requires one to have objectively significant statistical data, then the principle can be shown to be inconsistent.

Intellectual insufficiency might not be the only source of vagueness or indeterminacy of the reference class. Here is a list of possible borderlines:

  • Intellectual limitations (e.g. chimpanzees; persons with brain damage; Neanderthals; persons who can’t understand SSA and the probabilistic reasoning involved in using it in the application in question)
  • Insufficient information (e.g. persons who don’t know about the experimental setup)
  • Lack of some occurrent thoughts (e.g. persons who, as it happens, don’t think of applying SSA to a given situation although they have the capacity to do so)
  • Exotic mentality (e.g. angels; superintelligent computers; posthumans)

No claim is made that all of these dimensions are such that one can exit the reference class by going to a sufficiently extreme position along them. For instance, maybe an intellect cannot be disqualified for being too smart. The purpose of the list is merely to illustrate that the exact way of delimiting the reference class has not been settled by the preceding discussion and that in order to do so one would have to address at least these four points.

We will return to the reference class problem in the next chapter, where we’ll see that an attempted solution by John Leslie fails, and yet again in chapters 10 and 11, where we will finally resolve it.

For many purposes, however, the details of the definition of the reference class may not matter much. In thought experiments, we can usually avoid the problem by stipulating that no borderline cases occur. And real-world applications will often approximate this ideal closely enough that the results one derives are robust under variations of the reference class within the zone of vagueness we have left open.