Chapter 7

Invalid Objections Against the Doomsday Argument1

1 This chapter is partly based on a paper previously published in Mind (Bostrom 1999); those bits are reproduced here with permission.

It would probably not be an exaggeration to say that I have encountered over a hundred objections against DA in the literature and in personal communication, many of them mutually inconsistent. Even merging those objections that use the same basic idea would leave us with dozens of distinct and often incompatible explanations of what is wrong with DA. The authors of these refutations frequently seem extremely confident that they have discovered the true reason why DA fails, at least until a doomsayer gets an opportunity to reply. It is as if DA is so counterintuitive (or threatening?) that people reckon that every criticism must be valid.

Rather than aiming for completeness, we shall select a limited number of objections for critical examination. We want to choose those that seem currently alive, or have made their entrée recently, or that have a Phoenix-like tendency to keep reemerging from their own ashes. While the objections studied in this chapter are unsuccessful, they do have the net effect of forcing us to become clearer about what DA does and doesn’t imply.2

Doesn’t the Doomsday argument fail to “target the truth”?

Kevin Korb and Jonathan Oliver propose a minimalist constraint that any good inductive method should satisfy (Korb and Oliver 1998):

Targeting Truth (TT) Principle: No good inductive method should—in this world—provide no more guidance to the truth than does flipping a coin. (p. 404)

DA, they claim, violates this principle. In support of their claim they ask us to consider

a population of size 1000 (i.e., a population that died out after a total of 1000 individuals) and retrospectively apply the Argument to the population when it was of size 1, 2, 3 and so on. Assuming that the Argument supports the conclusion that the total population is bounded by two times the sample value . . . then 499 inferences using the Doomsday Argument form are wrong and 501 inferences are right, which we submit is a lousy track record for an inductive inference schema. Hence, in a perfectly reasonable metainduction we should conclude that there is something very wrong with this form of inference. (p. 405)

But in this purported counterexample to DA, the TT principle is not violated—501 right and 499 wrong guesses is strictly better than what one would expect from a random procedure such as flipping a coin. The reason why the track record is only marginally better than chance is simply that the above example assumes that the doomsayers bet on the most stringent hypothesis that they would be willing to bet on at even odds, i.e. that the total population is bounded by two times the sample value. This means, of course, that their expected gain is minimal. It is not remarkable, then, that in this case, a person who applies the Doomsday reasoning is only slightly better off than one who doesn’t. If the bet were on the proposition not that the total population is bounded by two times the sample value but instead that it is bounded by, say, three times the sample value, the doomsayer’s advantage would be more drastic. And the doomsayer can be even more confident that the total value will not exceed thirty times the sample value.
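The arithmetic behind this reply can be checked with a short sketch. It replays Korb and Oliver's thought experiment for a population of 1000 and counts how often the guess "the total is at most k times my birth rank" turns out to be true. The function name `track_record` and the use of Python are illustrative assumptions and do not come from their paper.

```python
def track_record(total: int, factor: int) -> tuple[int, int]:
    """Count right and wrong guesses when the individual at each birth rank r
    guesses that the final population is at most factor * r."""
    right = sum(1 for rank in range(1, total + 1) if total <= factor * rank)
    return right, total - right


# Compare the even-odds bound (factor 2) with the looser bounds from the text.
for factor in (2, 3, 30):
    right, wrong = track_record(1000, factor)
    print(f"bound = {factor} x rank: {right} right, {wrong} wrong")
```

With a factor of 2 the sketch reproduces the 501 to 499 split. With factors of 3 and 30 the share of correct guesses rises to about two thirds and about 97 percent respectively.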

Additionally, Korb and Oliver’s example assumes that the doomsayer doesn’t take any additional information into account when making her prediction. But as we saw in the previous chapter, there is no basis for that assumption. All relevant information can and should be incorporated. (One of the failings of Gott’s version of DA was that it failed to do so, but that’s just a reason not to accept that version.) If the doomsayer has information about other things than her birth rank, she can do even better.

Therefore, Korb and Oliver have not shown that DA violates the TT principle, nor that the Doomsday reasoning at best improves the chances of being right only slightly.3

3 In a response to criticism, Korb and Oliver make two comments. “(A) The minimal advantage over random guessing in the example can be driven to an arbitrarily small level simply by increasing the population in the example.” (Korb and Oliver 1999), p. 501. This misses the point, which was that the doomsayer’s gain was small because she was assumed to bet at the worst odds at which she would be willing to bet. By definition, this entails that she would not expect to benefit significantly from the scheme, but it is, of course, perfectly consistent with her doing much better than someone who doesn’t accept the “DA” in this example.

I quote the second comment in its entirety:

(B) Dutch book arguments are quite rightly founded on what happens to an incoherent agent who accepts any number of “fair” bets. The point in those arguments is not, as some have confusedly thought, that making such a series of bets is being assumed always to be rational; rather, it is that the subsequent guaranteed losses appear to be attributable only to the initial incoherence. In the case of the Doomsday Argument (DA), it matters not if Doomsayers can protect their interests by refraining from some bets that their principles advise them are correct, and only accepting bets that appear to give them a whopping advantage: the point is that their principles are advising them wrongly. (p. 501)

To the extent that I can understand this objection, it fails. Dutch book arguments are supposed to show that the victim is bound to lose money. In Korb and Oliver’s example, the “victim” is expected to gain money.

The "baby-paradox"

As first noted by the French mathematician Jean-Paul Delahaye in an unpublished manuscript (Delahaye 1996), the basic Doomsday argument form can seem to be applicable not only to the survival of the human race but also to your own life span. A second objection by Korb and Oliver picks up on this idea:

[I]f you number the minutes of your life starting from the first minute you were aware of the applicability of the Argument to your life span to the last such minute and if you then attempt to estimate the number of the last minute using your current sample of, say, one minute, then according to the Doomsday Argument, you should expect to die before you finish reading this article. (p. 405)

The claim is untrue. The Doomsday argument form, applied to your own life span, does not imply that you should expect to die before you have finished reading their article. DA says that in some cases you can reason as if you were a sample drawn randomly from a certain reference class. Taking into account the information conveyed by this random sample, you are to update your beliefs in accordance with Bayes’ theorem. This may cause a shift in your probability assignments in favor of hypotheses that imply that your position in the human race will have been fairly typical—say among the middle 98% rather than in the first or the last percentile of all humans that will ever have been born. DA just says you should make this Bayesian shift in your probabilities; it does not by itself determine the absolute probabilities that you end up with. As we emphasized in the last chapter, what probability assignment you end up with depends on your prior, i.e. the probability assignment you started out with before taking DA into account.

In the case of the survival of the human race, your prior may be based on your estimates of the risk that we will be extinguished through nuclear war, germ warfare, self-replicating nanomachines, a meteor impact, etc. In the case of your own life expectancy, you will want to consider factors such as the average human life span, your state of health, and any hazards in your environment that may cause your demise before you finish the article. Based on such considerations, the probability that you will die within the next half-hour ought presumably to strike you as extremely small. If so, then even a considerable probability shift due to a DA-like inference should not make you expect to die before reaching the last line. Hence, contra Korb and Oliver, the doomsayer would not draw the absurd conclusion that she is likely to perish within half an hour, even should she think the Doomsday argument form applicable to her individual life span.

While this is enough to refute the objection, a more fundamental question here is whether (and if so, how) the Doomsday argument form is applicable to individual life spans at all. I think we concede too much if we grant even a modest probability shift in this case. There are two reasons for this.

First, Korb and Oliver’s application of the Doomsday argument form to individual life spans presupposes a specific solution to the problem of the reference class. This is the problem, remember, of determining the class of entities from which one should consider oneself a random sample. As we are dealing with temporal parts of observers here, we have to invoke SSSA, the version of SSA adapted to observer-moments rather than observers, which we alluded to in the section on traffic analysis and which we will discuss more fully in chapter 10. Korb and Oliver’s objection presupposes a particular choice of reference class: the one consisting of those and only those observer-moments that are aware of DA. This may not be the most plausible choice. Certainly, Korb and Oliver do not seek to justify it in any way.

The second reason for the doomsayer not to grant a probability shift in the present case is that the no-outsider requirement is not satisfied. The no-outsider requirement states that in applying SSA there must be no outsiders, that is, beings who are ignored in the reasoning but who really belong in the reference class. Applying SSA in the presence of such outsiders will in many cases yield erroneous results.4

4 John Leslie argues against the no-outsider requirement (e.g. (Leslie 1996), pp. 229-30). I believe that he is mistaken for the reasons given below. (I suspect that Leslie’s thoughts on the no-outsider requirement derive from his views on the problem of the reference class, which we criticized in the previous chapter.)

Consider first the original application of DA (to the survival of the human species). Suppose you were certain that there is extraterrestrial intelligent life. You know that there are a million “small” civilizations that will have contained 200 billion persons each and a million “large” civilizations that will have contained 200 trillion persons each. You know that the human species is one of these populations but you don't know whether it is small or large.

Step 1. Estimate the empirical prior P(Small), i.e. how likely it seems that nanotech warfare etc. will put an end to our species before it gets large. At this stage you don't take into account any form of the Doomsday argument or anthropic reasoning.

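Step 2. Now take account of the fact that most people find themselves in large civilizations. Let H be the proposition "I am a human", and define the new probability function P*(.) = P(. | H), obtained by conditionalizing on H. By Bayes' theorem:

$$P^{*}(\mathrm{Small}) = P(\mathrm{Small} \mid H) = \frac{P(H \mid \mathrm{Small})\,P(\mathrm{Small})}{P(H)}$$

A similar expression holds for ¬Small. By SSA, we have:

$$P(H \mid \mathrm{Small}) = \frac{200 \text{ billion}}{(200 \text{ billion} + 200 \text{ trillion}) \times 10^{6}}$$

$$P(H \mid \neg\mathrm{Small}) = \frac{200 \text{ trillion}}{(200 \text{ billion} + 200 \text{ trillion}) \times 10^{6}}$$

(If we calculate P*(Small), we find that it is very small for any realistic prior. In other words, at this stage in the calculation, it looks as though the human species is very likely long-lasting.)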

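Step 3. Finally, we take account of DA. Let E be the proposition that you find yourself “early”, i.e. that you are among the first 200 billion persons in your species. Conditionalizing on this evidence, we get the posterior probability function P**(.) = P*(. | E). So

$$P^{**}(\mathrm{Small}) = P^{*}(\mathrm{Small} \mid E) = \frac{P^{*}(E \mid \mathrm{Small})\,P^{*}(\mathrm{Small})}{P^{*}(E)}$$

Note that P*(E | Small) = 1 and P*(E | ¬Small) = 1/1000. By substituting back into the above expressions, it is then easy to verify that

$$\frac{P^{**}(\mathrm{Small})}{P^{**}(\neg\mathrm{Small})} = \frac{P(\mathrm{Small})}{P(\neg\mathrm{Small})}$$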

Thus we get back the empirical probabilities that we started from. DA (in Step 3) only served to cancel the effect that we took into account in Step 2, namely that you were more likely to turn out to be in the human species given that the human species is one of the large rather than one of the small civilizations. This shows that if we assume we know that there are both “large” and “small” extraterrestrial civilizations, and that we know their proportion (though the precise numbers in the above example don’t matter), then the right probabilities are the ones given by the naïve empirical prior.5 So in this instance, if we had ignored the extraterrestrials (thus violating the no-outsider requirement) and simply applied SSA with the human population as the reference class, we would have got an incorrect result.
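The cancellation in Steps 1 to 3 can be verified numerically. The following is a minimal sketch using exact fractions. The SSA likelihoods in Step 2 are written out on the assumption that the reference class contains every person in all two million civilizations, and the prior of 1/10 is a purely hypothetical value chosen for illustration.

```python
from fractions import Fraction

SMALL_SIZE = 200 * 10**9    # persons in a "small" civilization
LARGE_SIZE = 200 * 10**12   # persons in a "large" civilization
N_EACH = 10**6              # number of civilizations of each kind


def three_step(prior_small: Fraction) -> tuple[Fraction, Fraction]:
    """Return (P*(Small), P**(Small)) for a given empirical prior P(Small)."""
    total = N_EACH * (SMALL_SIZE + LARGE_SIZE)
    # Step 2: SSA likelihoods of being human, then conditionalize on H.
    h_small = Fraction(SMALL_SIZE, total)
    h_large = Fraction(LARGE_SIZE, total)
    p_h = h_small * prior_small + h_large * (1 - prior_small)
    p_star = h_small * prior_small / p_h
    # Step 3: likelihoods of being among the first 200 billion persons,
    # then conditionalize on E.
    e_small = Fraction(1)
    e_large = Fraction(SMALL_SIZE, LARGE_SIZE)  # equals 1/1000
    p_e = e_small * p_star + e_large * (1 - p_star)
    p_star_star = e_small * p_star / p_e
    return p_star, p_star_star


prior = Fraction(1, 10)  # hypothetical empirical prior P(Small)
p_star, p_star_star = three_step(prior)
print(p_star, p_star_star)
```

For this prior the intermediate value P*(Small) drops to 1/9001, and the final value P**(Small) returns to exactly 1/10, matching the claim that Step 3 undoes Step 2.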

5 This was first pointed out by Dieks in (Dieks 1992), and more explicitly in (Dieks 1999), and was later demonstrated by Kopf et al. (Kopf, Krtous et al. 1994). It appears to have been independently discovered by Bartha and Hitchcock (Bartha and Hitchcock 1999).

It is worth emphasizing, however, that suspecting that there are extraterrestrial civilizations does not damage DA if we don’t have any information about what fraction of these alien species are “small”. What DA would do in this case (if the argument were sound in other respects) is give us reason to think that the fraction of small intelligent species is greater than was previously held on ordinary empirical grounds.

Returning to the case where you are supposed to apply DA to your own life span, we can now see that the no-outsider requirement is not satisfied. True, if you consider the epoch of your life during which you know about DA, and you partition this epoch into time-segments (observer-moments), then you might say that if you were to live for a long time then the present observer-moment would be extraordinarily early in this class of observer-moments. You may thus be tempted to infer that you are likely to die soon (ignoring the difficulties pointed out earlier). But even if DA were applicable in that way, this would be the wrong conclusion. For in this case you have good reason for thinking there are many “outsiders”. The outsiders are the observer-moments of other humans. What’s more, you have detailed information about what fraction of these other humans are “short-lasting”. Just as knowledge about the proportion of actually existing extraterrestrial civilizations that are small would annul the original DA, so in the present case does knowledge about the existence of other short-lived and long-lived humans and about their approximate proportions cancel the probability shift favoring impending death. The fact that the present observer-moment belongs to you would indicate that you are an individual who will have contained many observer-moments rather than few, i.e. that you will be long-lived. It can then be shown (just as above) that this would counterbalance the fact that your present observer-moment would have been extraordinarily early among all your observer-moments were you to be long-lived.

To sum up, the “baby paradox”-objection fails to take prior probabilities into account. These would be extremely low for the hypothesis that you will die within the next thirty minutes. Therefore, contrary to what Korb and Oliver claim, even if the doomsayer thought DA applied to this case, she would not make the prediction that you will die within 30 minutes. However, the doomsayer should not apply DA to this case, for two reasons. First, it presupposes an arguably implausible solution to the reference class problem. Second, even if we accepted that only beings who know about DA should be in the reference class, and that it is legitimate to run the argument on time-segments of observers, the conclusion still does not follow, because the no-outsider requirement is violated.

Isn’t a sample size of one too small?

Korb and Oliver have a third objection. It starts off with the claim that, in a Bayesian framework, a sample size of one is too small to make a substantial difference to one’s rational beliefs.

The main point . . . is quite simple: a sample size of one is “catastrophically” small. That is, whatever the sample evidence in this case may be, the prior distribution over population sizes is going to dominate the computation. The only way around this problem is to impose extreme artificial constraints on the hypothesis space. (p. 406)

They follow this assertion by conceding that in a case where the hypothesis space contains only two hypotheses, a substantial shift can occur:

If we consider the two urn case described by Bostrom, we can readily see that he is right about the probabilities. (p. 406)

The probability in the example to which they refer shifted from 50% to 99.999%, which is surely “substantial”, and similar results would obtain for a broad range of prior distributions. But Korb and Oliver seem to think that such a substantial shift can only occur if we “impose extreme artificial constraints on the hypothesis space” by considering only two rival hypotheses rather than many more.

It is easy to see that this is false. Let {h1, h2, . . . , hN} be a hypothesis space and let P be any probability function that assigns a non-zero prior probability to all these hypotheses. Let hi be the least likely of these hypotheses. Let e be the outcome of a single random sampling. Inspecting Bayes’ formula then shows that the posterior probability of hi, P(hi|e), can be made arbitrarily big (≈1) by an appropriate choice of e:

$$P(h_i \mid e) = \frac{P(e \mid h_i)\,P(h_i)}{\sum_{1 \le j \le N} P(e \mid h_j)\,P(h_j)}$$

Indeed, we get P(hi|e) = 1 if we choose e such that P(e|hj) = 0 for j ≠ i. This would, for example, correspond to the case where you discover that you have a birth rank of 200 billion and immediately give probability zero to all hypotheses according to which there would be fewer than 200 billion persons.
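A small numerical sketch may make the point concrete. It uses a hypothesis space of one hundred rival hypotheses rather than two, gives the last one a prior of one in a million, and updates on a single observed birth rank. The likelihood rule (1/total for ranks up to the total, zero otherwise), the function name `posterior`, and all the numbers are assumptions chosen for illustration.

```python
from fractions import Fraction


def posterior(prior: dict[int, Fraction], rank: int) -> dict[int, Fraction]:
    """Bayes update on a single observation: "my birth rank is `rank`".

    `prior` maps each hypothesized total population to its prior probability.
    Assumed likelihood: P(rank | total) = 1/total if rank <= total, else 0.
    """
    weights = {
        total: (Fraction(1, total) if rank <= total else Fraction(0)) * p
        for total, p in prior.items()
    }
    norm = sum(weights.values())
    return {total: w / norm for total, w in weights.items()}


# Hypothetical hypothesis space: totals 1..100, where the largest total is
# by far the least likely hypothesis a priori.
n = 100
tiny = Fraction(1, 10**6)
prior = {total: (1 - tiny) / (n - 1) for total in range(1, n)}
prior[n] = tiny

post = posterior(prior, rank=n)  # one sample: a birth rank of 100
print(post[n])
```

Here a single observation raises the least likely hypothesis from a prior of one in a million to a posterior of exactly 1, even though the hypothesis space contains one hundred hypotheses.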

Couldn’t a Cro-Magnon man have used the Doomsday argument?

Indeed he could (provided Cro-Magnon minds could grasp the relevant concepts), and his predictions about the future prospects of his species would have failed. Yet it would be unfair to see this as an objection against DA. That a probabilistic method misleads observers in some exceptional circumstances does not mean that it should be abandoned. Looking at the overall performance of the DA-reasoning, we find that it does not do so badly. Ninety percent of all humans will be right if everybody guesses that they are not among the first tenth of all humans who will ever have lived (Gott’s version). Allowing users to take into account additional empirical information can improve their guesses further (as in Leslie’s version). Whether the resulting method is optimal for arriving at the truth is not something that we can settle trivially by pointing out that some people might be misled.

We can make the effect go away simply by considering a larger hypothesis space

By increasing the number of hypotheses about the ultimate size of the human species that we choose to consider, we can, according to this objection, make the probability shift that DA induces arbitrarily small. Again, we can rely on Korb and Oliver for giving the idea a voice6:

6 A similar objection had been made earlier by Dennis Dieks (Dieks 1992) and independently by John Eastmond (personal communication).

In any case, if an expected population size for homo sapiens … seems uncomfortably small, we can push the size up, and so the date of our collective extermination back, to an arbitrary degree simply by considering larger hypothesis spaces. (p. 408)

The argument is that if we use a uniform prior over the chosen hypothesis space {h1, h2, . . . , hn}, where hi is the hypothesis that there will have existed a total of i humans, then the expected number of humans that will have lived will depend on n: the greater the value we give to n, the greater the expected future population. Korb and Oliver compute the expected size of the human population for some different values of n and find that the result does indeed vary.

Notice first of all that nowhere in this is there a reference to DA. If this argument were right it would work equally against any way of making predictions about how long the human species will survive. For example, if during the Cuban missile crisis you feared, based on obvious empirical factors, that humankind might soon go extinct, you really needn’t have worried. You could just have considered a larger hypothesis space, thereby attaining an arbitrarily high degree of confidence that doom was not impending. If only saving the world were that easy!

What, then, is the right prior to use for DA? All we can say about this from a general philosophical point of view is that it is the same as the prior for people who don’t believe in DA. The doomsayer does not face a special problem. The only legitimate way of providing the prior is through an empirical assessment of the potential threats to human survival. You need to base it on your best guesstimates about known hazards and dangers as yet unimagined.7

On a charitable reading, Korb and Oliver could perhaps be interpreted as saying not that DA fails because the prior is arbitrary, but rather that the uniform prior (with some big but finite cut-off) is as reasonable as any other prior, and that with such a prior, DA will not show that doom is likely to strike very soon. If this is all they mean then they are not saying something that the doomsayer could not agree with. The doomsayer is not committed to the view that doom is likely to strike soon8, only to the view that the risk that doom will strike soon is greater than was thought before we understood the probabilistic implications of our having relatively low birth ranks. DA (if sound) shows that we have systematically underestimated the risk of doom soon, but it doesn’t directly imply anything about the absolute magnitude of the probability of that hypothesis. Even with a uniform prior probability, there will still be a shift in our credence in favor of earlier doom.
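The claim that a uniform prior still leaves a shift toward earlier doom can be illustrated with a minimal sketch. It compares the expected total population under a uniform prior over the totals not yet ruled out by one's birth rank with the expectation after a DA-style update. The likelihood of 1/i for the observed rank under a total of i, the function name `expected_totals`, and the numbers 1000 and 10 are assumptions chosen for illustration, not figures from Korb and Oliver.

```python
def expected_totals(n: int, rank: int) -> tuple[float, float]:
    """Expected total population before and after a DA-style update.

    Hypotheses: the total is i, for i in rank..n (smaller totals are already
    ruled out by the observed birth rank), with a uniform prior over them.
    Assumed likelihood of the observed rank under a total of i: 1/i.
    """
    totals = range(rank, n + 1)
    prior_mean = sum(totals) / len(totals)
    weights = [1 / i for i in totals]            # uniform prior times 1/i
    posterior_mean = len(totals) / sum(weights)  # sum(i * 1/i) / sum(1/i)
    return prior_mean, posterior_mean


before, after = expected_totals(n=1000, rank=10)
print(f"expected total before: {before:.1f}, after: {after:.1f}")
```

Raising n pushes both expectations up, but for any fixed n the updated expectation stays well below the uniform one, which is the shift the doomsayer is talking about.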

7 For my views on what the most likely human extinction scenarios are and some suggestions for what could be done to reduce the risk, see (Bostrom 2002).

8 To get the conclusion that doom is likely to happen soon (say within 200 years) you need to make additional assumptions about future population figures and the future risk profile for humankind.

But don’t Korb and Oliver’s calculations at least show that this probability shift in favor of earlier doom is in reality quite small, so that DA isn’t such a big deal after all? Not so.

As already mentioned, their calculations rest on the assumption of a uniform prior. Not only is this assumption gratuitous (no attempt is made to justify it), but it is also, I believe, highly implausible even as an approximation of the real empirical prior. To me it seems fairly obvious (quite apart from DA) that the probability that there will exist between 100 billion and 500 billion humans is much greater than the probability that there will exist between 10^20 and (10^20 + 500 billion) humans.9

9 Even granting the uniform prior, it turns out that the probability shift is actually quite big. Korb and Oliver assume a uniform distribution over the hypothesis space {h1, h2, . . . , h2,048} (where again hi is the hypothesis that there will have been a total of i billion humans) and they assume that you are the 60 billionth human. Then the expected size of the human population, before considering DA, is (2048 - 60) * 0.5 * 10^9 = 994 billion. And Korb and Oliver’s calculations show that, after applying DA, the expected population is 562 billion. The expected human population has been reduced by over 43% in their own example.

Aren't we necessarily alive now?

We are necessarily alive at the time we consider our position in human history, so the Doomsday Argument excludes from the selection pool everyone who is not alive now. (Greenberg 1999), p. 22

This objection, put forward by Mark Greenberg, profits from an ambiguity. Yes, it is necessary that if you are at time t considering your position in human history then you are alive at time t. But no, it is not necessary that if you think “I am alive at time t” then you are alive at time t. You can be wrong about when you are alive, and hence you can also be ignorant about it.

The possibility of a state where one is ignorant about what time it is can be used as the runway for an argument showing that one’s reference class can include observers existing at different times (cf. the Emeralds gedanken). Indeed, if the observers living at different times are in states that are subjectively indistinguishable from your own current state, so that you cannot tell which of these observers you are, then a strong case can be made that you are rationally required to include them all in your reference class. Leaving some out would mean assigning zero credence to a possibility (viz., your later discovering that you are one of the excluded observers) that you really have no ground for rejecting with such absolute conviction.

Sliding reference of “soon” and “late”?

Even if someone who merely happens to live at a particular time could legitimately be treated as random with respect to birth rank, the Doomsday Argument would still fail, since, regardless of when that someone’s position in human history is observed, he will always be in the same position relative to Doom Soon and Doom Delayed. (Greenberg 1999), p. 22

This difficulty is easily avoided by substituting specific hypotheses for “Doom Soon” and “Doom Delayed”: e.g. “The total is 200 billion” and “The total is 200 trillion”. (There are many more hypotheses we need to consider, but as argued above, we can simplify by focusing on two.) It is true that some DA-protagonists speak in terms of doom as coming “soon” or “late”. This can cause confusion because under a non-rigid (incorrect) construal, which hypotheses are expressed by the phrases “Doom Soon” and “Doom Late” depends on who utters them. When there is doubt, speak in terms of specific numbers.

How could I have been a 16th century human?

SSA does not imply that you could have been a 16th century human. We make no assumption as to whether there is a counterfactual situation or a possible world in which you are Leonardo da Vinci, or, for that matter, one of your contemporaries.

Even assuming that you take these past and present people to be in your reference class, what you are thereby committing yourself to is simply certain conditional credences. There is no obvious reason why this should compel you to hold as true (or even meaningful) counterfactuals about alternative identities that you could supposedly have had. The arguments for SSA didn’t appeal to controversial metaphysics of personhood. We should therefore feel free to read it straightforwardly as a prescription for how to assign values to various conditional subjective probabilities—probabilities that must be given values somehow if the scientific and philosophical problems we have been discussing are to be modeled in a Bayesian framework.

Doesn’t your theory presuppose that what happens in causally disconnected regions affects what happens here?

The theory of observation selection effects implies that your beliefs about distant parts of the universe—including ones that lie outside your past light cone—can in some cases influence what credence you should assign to hypotheses about events in your near surroundings. We can see this easily by considering, for example, that whether the no-outsider requirement is satisfied can depend on what is known about non-human observers elsewhere, including regions that are causally disconnected from ours. This, however, does not require that (absurdly) those remote galaxies and their inhabitants exert some sort of physical influence on you.10 Such a physical effect would violate special relativity theory (and in any case it would be hard to see how it could help account for the systematic probabilistic dependencies that we are discussing).

10 This objection is advanced in (Olum 2002).

To see why this “dependence on remote regions” is not a problem, it suffices to note that the probabilities our theory delivers are not physical chances but subjective credences. Those distant observers have zilch effect on the physical chances of events that take place on Earth. Rather, what holds is that under certain special circumstances, your beliefs about the distant observers could come to rationally affect your beliefs about a nearby coin toss, say.

We will see further (hypothetical) examples of this kind of epistemic dependency in later thought experiments. In the real world, the most interesting dependencies of this kind are likely to emerge in scientific contexts, for instance when measuring cosmological theories against observation or when seeking to estimate the likelihood of intelligent life evolving on Earth-like planets.

The fact that our beliefs about the near are rationally correlated with our beliefs about the remote is itself utterly unremarkable. If it weren’t so, you could never learn anything about distant places by studying your surroundings.

But we know so much more about ourselves than our birth ranks!

Here is one thought that frequently stands in the way of understanding how observation selection effects work:

“We know a lot more about ourselves than our birth ranks. Doesn’t this mean that even though it may be correct to view oneself as a random sample from some suitable reference class if all one knows is one’s birth rank, yet in the actual case, where we know so much more, it is not permissible to regard oneself as in any way random?”

This question insinuates that there is an incompatibility between being known and being random. That we know a lot about x, however, does not entail that x cannot be treated as a random sample.

A ball randomly selected from an urn with an unknown number of consecutively numbered balls remains random after you have looked at it and seen that it is ball number 7. If the sample ceased to be random when you looked at it, you wouldn’t be able to make any interesting inferences about the number of balls remaining in the urn by studying the ball you’ve just picked out. Further, getting even more information about the ball, say by assaying its molecular structure under an atomic force microscope, would not in any way diminish its randomness. What you get is simply information about the random sample. Likewise, you can and do know much more about yourself than when you were born. This additional information should not obfuscate whatever you can learn from considering your birth rank alone.

Of course, as we have already emphasized, SSA does not assert that you are random in the objective sense of there being a physical randomization mechanism responsible for bringing you into the world. We don’t postulate a time-travelling stochastic stork! SSA is simply a specification of certain types of conditional probabilities. The randomness heuristic is useful because it reminds us how to take into account both the information about your birth rank and any extra information that you might have. Unless this extra information has a direct bearing on the hypothesis in question, it won’t make any difference to what credence you should assign to the hypothesis. The pertinent conditional probabilities will in that case be the same: P(“A fraction f of all observers in my reference class have property P” | “I have property P”) = P(“A fraction f of all observers in my reference class have property P” | “I have properties P, Q1, Q2, and . . . Qi”).

Let us illustrate this with a concrete example. Suppose that Americans and Swedes are in the same reference class. SSA then specifies a higher prior probability of you being an American than of you being a Swede (given the difference in population size). SSA does not entail, absurdly, that you should think that you are probably an American even when knowing that you are reading Svenska Dagbladet on the Stockholm subway on your way to work at Ericsson with a Swedish passport in your pocket; for this evidence provides strong direct evidence for the hypothesis that you are a Swede. All the same, if you were uncertain about the relative population of the two countries, then finding that you are a Swede would indeed be some evidence in favor of the hypothesis that Sweden is the larger country; and this evidence would not be weakened by learning a lot of other information about yourself, such as what your passport says, where you work, the sequence of your genome, your family tree seven generations back, or your complete physical constitution down to the atomic level. These additional pieces of information would be irrelevant.
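The last point, that extra information about yourself leaves the inference untouched, can be sketched numerically. The example below assumes just two rival hypotheses about the populations (Sweden has 300 units and America 10, or the reverse), an even prior between them, and an extra fact about you whose probability is the same under both hypotheses. All of these numbers and the function name `p_sweden_larger` are hypothetical.

```python
from fractions import Fraction


def p_sweden_larger(
    prior: Fraction,
    big: int,
    small: int,
    irrelevant: Fraction = Fraction(1),
) -> Fraction:
    """Posterior that Sweden is the larger country after learning
    "I am a Swede" together with some further fact about myself.

    Assumed SSA likelihood of being a Swede: Sweden's share of the combined
    population. `irrelevant` is the probability of the further fact given
    that I am a Swede, taken to be the same under both hypotheses.
    """
    like_if_larger = Fraction(big, big + small) * irrelevant
    like_if_smaller = Fraction(small, big + small) * irrelevant
    evidence = like_if_larger * prior + like_if_smaller * (1 - prior)
    return like_if_larger * prior / evidence


even = Fraction(1, 2)
print(p_sweden_larger(even, big=300, small=10))
print(p_sweden_larger(even, big=300, small=10, irrelevant=Fraction(1, 1000)))
```

Both calls give the same posterior, because a factor that is identical under the two hypotheses cancels in Bayes' theorem. That is the sense in which the extra details about passport, workplace, or genome are irrelevant to the question of which country is larger.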

The Self-Indication Assumption — Is there safety in numbers?

We now turn to an idea that can be spotted in the background of several attacks on DA, namely the Self-Indication Assumption (SIA). We encountered it briefly in chapter 4. Framed as an objection against DA, the idea is that the probability shift in favor of Doom Soon that DA leads us to make is offset by another probability shift—which is overlooked by doomsayers—in favor of Doom Late. When both these probability shifts are taken into account, the net effect is that we end up with the naïve probability estimates that we made before we learnt about either DA or SIA. According to this objection, the more observers that will ever have existed, the more “slots” there are that you could have been “born into”. Your existence is more probable if there are many observers than if there are few. Since you do in fact exist, the Bayesian rule has to be applied and the posterior probability of hypotheses that imply that many observers exist must be increased accordingly. The nifty thing is that the effects of SIA and DA cancel each other precisely. We can see this by means of a simple calculation11:

11 Something like SIA was first used as an objection against DA, albeit not very transparently, by Dennis Dieks (Dieks 1992); see also his more recent paper (Dieks 1999). That SIA and DA exactly cancel each other was first shown by Kopf et al. (Kopf, Krtous et al. 1994). The objection seems to have been independently discovered by Paul Bartha and Chris Hitchcock (Bartha and Hitchcock 1999), and in variously cloaked forms by several other people (personal communications). Ken Olum has a clear treatment in (Olum 2002). John Leslie argues against SIA in (Leslie 1996), pp. 224-8.

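Let $P(h_i)$ be the naive prior for the hypothesis that in total $i$ observers will have existed, and assume that $P(h_i) = 0$ for $i$ greater than some finite $N$ (this restriction allows us to set aside the problem of infinities). Then we can formalize SIA as saying that the adjusted prior $P'$ is

$$P'(h_i) = a \, i \, P(h_i),$$

where $a$ is a normalization constant. Let $r(x)$ be the rank of $x$, and let “I” denote a random sample from a uniform probability distribution over the set of all observers. By SSA, we have

$$P(r(I) = k \mid h_i) = \frac{1}{i} \quad \text{for } 1 \le k \le i.$$

Consider two hypotheses $h_n$ and $h_m$. We can assume that $r(I) \le \min(n, m)$.

(If not, then the example simplifies to the trivial case where one of the hypotheses is conclusively refuted regardless of whether SIA is accepted.) Using Bayes’ formula, we expand the quotient between the conditional probabilities of these two hypotheses:

$$\frac{P'(h_n \mid r(I) = k)}{P'(h_m \mid r(I) = k)} = \frac{P(r(I) = k \mid h_n) \, P'(h_n)}{P(r(I) = k \mid h_m) \, P'(h_m)} = \frac{\frac{1}{n} \, a \, n \, P(h_n)}{\frac{1}{m} \, a \, m \, P(h_m)} = \frac{P(h_n)}{P(h_m)}.$$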

We see that after we have applied both SIA and DA, we are back to the probabilities that we started with.
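The cancellation can also be checked numerically. The following is a minimal sketch, not part of the original argument: the three hypotheses, their priors, and the birth rank 7 are arbitrary choices, and the function names are ours. The code weights each hypothesis by its number of observers (SIA), then conditions on the birth rank using the SSA likelihood 1/i (the DA step), and compares the result with the naive prior.

```python
from fractions import Fraction

def normalize(weights):
    """Rescale non-negative weights so that they sum to one."""
    total = sum(weights.values())
    return {i: w / total for i, w in weights.items()}

def apply_sia(prior):
    """SIA: weight each hypothesis h_i by the number of observers i it implies."""
    return normalize({i: i * p for i, p in prior.items()})

def apply_rank_update(dist, k):
    """DA step: condition on birth rank k, with SSA likelihood 1/i if k <= i, else 0.

    Assumes k does not exceed the largest population considered.
    """
    return normalize({i: (p / i if k <= i else 0 * p) for i, p in dist.items()})

# Hypothetical naive prior over the total number of observers (keys are i).
prior = {10: Fraction(1, 2), 100: Fraction(3, 10), 1000: Fraction(1, 5)}
rank = 7  # assumed birth rank, no greater than the smallest population

da_only = apply_rank_update(prior, rank)
sia_then_da = apply_rank_update(apply_sia(prior), rank)

print(da_only[10])             # DA alone shifts credence toward the small population
print(sia_then_da == prior)    # SIA followed by DA restores the naive prior
```

Exact fractions are used so that the equality with the prior is not blurred by floating-point rounding.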

But why accept SIA? The fact that SIA has the virtue of leading to a complete cancellation of DA (and some related inferences that we shall consider in chapter 9) may well be the most positive thing that can be said on its behalf. As an objection against DA, this argument would be unabashedly question-begging. It could still carry some weight if DA were sufficiently unacceptable and if there were no other coherent way of avoiding its conclusion. However, that is not the case. We shall describe another way of resisting DA in chapter 10.

SIA thus makes a charming appearance when arriving arm-in-arm with DA. The bad side emerges when SIA is on its own. In cases where we don’t know our birth ranks, DA cannot be applied. There is then no subsequent probability shift to cancel out the original boost that SIA gives to many-observer hypotheses. The result is a raw bias towards populous worlds that is very hard to justify.

In order for SIA always to be able to cancel DA, you would have to subscribe to the principle that, other things equal, a hypothesis that implies that there are 2N observers should be assigned twice the credence of a hypothesis that implies that there are only N observers. In the case of the Incubator gedanken, this means that before learning about the color of your beard, you should think it likely that the coin fell heads (so that two observers rather than just one were created). If we modify the gedanken so that Heads would lead to the creation of a million observers, you would have to be virtually certain that the coin fell heads (P ≈ 99.9999%) without knowing anything directly about the outcome and before learning about your beard-color. Even if you knew that the prior probability of Heads was just one-in-a-thousand (imagine a huge fortune wheel instead of a coin), SIA still tells you to be extremely sure that the outcome was Heads. This seems wrong. Think yourself into the situation. What you know and observe at stage (a) in Incubator is perfectly harmonious with the Tails hypothesis: there is nothing that strains your belief in supposing that the coin fell tails and one observer was created and you are that observer. Especially if the prior probability of Tails was a thousand times greater than that of Heads, it would be weird to insist that it would be irrational of you not to be cocksure that the coin fell heads (on the alleged ground that there would be lots of other observers if that were true).
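The credences that SIA demands in these variants of Incubator are easy to compute. The sketch below assumes that Tails creates exactly one observer and reads the fortune-wheel case as a prior of exactly 1/1000 for Heads; the function name is ours, introduced only for illustration.

```python
from fractions import Fraction

def sia_posterior_heads(prior_heads, observers_if_heads, observers_if_tails=1):
    """Credence in Heads after SIA weights each outcome by its number of observers."""
    prior_heads = Fraction(prior_heads)
    weight_heads = prior_heads * observers_if_heads
    weight_tails = (1 - prior_heads) * observers_if_tails
    return weight_heads / (weight_heads + weight_tails)

# Original Incubator: fair coin, Heads creates two observers.
two_observers = sia_posterior_heads(Fraction(1, 2), 2)

# Modified Incubator: fair coin, Heads creates a million observers.
million_observers = sia_posterior_heads(Fraction(1, 2), 10**6)

# Fortune wheel: Heads has prior 1/1000 (our reading) and creates a million observers.
fortune_wheel = sia_posterior_heads(Fraction(1, 1000), 10**6)

for label, p in [("two observers", two_observers),
                 ("million observers", million_observers),
                 ("fortune wheel", fortune_wheel)]:
    print(f"{label}: {p} (about {float(p):.6f})")
```

Even with the prior stacked a thousand to one against Heads, the observer-count weighting leaves the SIA credence in Heads above 99.9%.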

It is not only in fictional toy examples that we would get counterintuitive results if we accepted SIA. For, as a matter of fact, we may well be radically ignorant of our birth ranks, namely if there are intelligent extraterrestrial species. Consider the following scenario:

The Presumptuous Philosopher

It is the year 2100 and physicists have narrowed down the search for a theory of everything to only two remaining plausible candidate theories, T1 and T2 (using considerations from super-duper symmetry). According to T1 the world is very, very big but finite and there are a total of a trillion trillion observers in the cosmos. According to T2, the world is very, very, very big but finite and there are a trillion trillion trillion observers. The super-duper symmetry considerations are indifferent between these two theories. Physicists are preparing a simple experiment that will falsify one of the theories. Enter the presumptuous philosopher: “Hey guys, it is completely unnecessary for you to do the experiment, because I can already show to you that T2 is about a trillion times more likely to be true than T1! (whereupon the philosopher runs the Incubator thought experiment and explains Model 3).”

One suspects that the Nobel Prize committee would be rather reluctant to award the presumptuous philosopher The Big One for this contribution. It is hard to see what the relevant difference is between this case and Incubator. If there is no relevant difference, and we are not prepared to accept the argument of the presumptuous philosopher, then we are not justified in using SIA in Incubator either.
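The philosopher's factor of a trillion follows directly from SIA's weighting by observer count. As a rough sketch, we take "indifferent" to mean equal priors of 1/2 for T1 and T2 (an assumption made here for the calculation) and use the short-scale trillion, 10^12.

```python
from fractions import Fraction

TRILLION = 10**12

observers_t1 = TRILLION**2  # a trillion trillion observers under T1
observers_t2 = TRILLION**3  # a trillion trillion trillion observers under T2

# Assumed equal priors, representing the indifference of the symmetry considerations.
prior_t1 = Fraction(1, 2)
prior_t2 = Fraction(1, 2)

# SIA multiplies each prior by the number of observers the theory implies.
odds_t2_over_t1 = (prior_t2 * observers_t2) / (prior_t1 * observers_t1)
posterior_t2 = odds_t2_over_t1 / (1 + odds_t2_over_t1)

print(odds_t2_over_t1)      # odds in favor of T2 under SIA
print(float(posterior_t2))  # credence in T2 before any experiment is run
```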

When discussing the second objection by Korb and Oliver, we remarked that the fact that we don’t know our absolute birth ranks if there are extraterrestrial civilizations is not a threat to DA. So why cannot DA be applied in The Presumptuous Philosopher to cancel the SIA-induced probability shift in favor of T2? The answer is that in the absence of knowledge about our absolute birth ranks, DA works by giving us information about what fraction of all species are short-lasting. (That we should be at an “early” stage in our species is more likely, according to the DA-reasoning, if a large fraction of all observers find themselves at such an early stage, i.e., if long-lasting species are rare.) This information about what fraction of all species are short-lasting (a larger fraction than we had thought) in turn tells us something about our own fate (that it is more likely that we are a short-lasting species). But it does not tell us anything about how many species, and thus about how many observers, there are in total. To get DA to argue in favor of a small number of observers (rather than for a small number of human observers), you would need to know your absolute birth rank. Since you don’t know that in The Presumptuous Philosopher (and, presumably, not in our actual situation either), DA cannot be applied there to cancel the SIA-induced probability shift.

Back in chapter 2, we sketched an explanation of why, owing to observation selection effects, it would be a mistake to view the fine-tuning of our universe as a general ground for favoring hypotheses that imply the existence of a greater number of observer-containing universes. If two competing general hypotheses each implies that there is at least one observer-containing universe, but one of the hypotheses implies the existence of a greater number of observer-containing universes than the other, then fine-tuning is not typically a reason to favor the former. The reasoning in chapter 2 can be adapted to argue that your own existence is not in general a ground for thinking that hypotheses are more likely to be true just by virtue of implying that there is a greater total number of observers. The datum of your existence tends to disconfirm hypotheses on which it would be unlikely that any observers (in your reference class) should exist; but that’s as far as it goes. The reason for this is that the sample at hand—you—should not be thought of as randomly selected from the class of all possible observers but only from a class of observers who will actually have existed. It is, so to speak, not a coincidence that the sample you are considering is one that actually exists. Rather, that’s a logical consequence of the fact that only actual observers actually view themselves as samples from anything at all.12

12 Of course, just as if our universe were found to have “special” properties this could provide justification for using the fact of its existence as part of an argument for there being a great many observer-containing universes, so likewise if you have certain special properties then that could support the hypothesis that there are vast numbers of observers. But it is then the special properties that you are discovered to have, not the mere fact of your existence, that grounds the inference.

Harking back to the heavenly-messenger analogy used in chapter 2, we could have considered the following different version, in which reasoning in accordance with SIA would have been justified:

Case 5. The messenger first selected a random observer from the set of all possible observers. He then traveled to the realm of physical existence and checked whether this possible observer actually existed somewhere, and brought back news to you about the result.

Yet this variation would make the analogy less close to the real case. For while the angel could have learnt from the messenger that the randomly selected possible observer didn’t actually exist, you could not have learnt that you didn’t exist.

Finally, consider the limiting case where we are comparing two hypotheses, one saying that the universe is finite (and contains finitely many observers), the other saying that the universe is infinite (and contains infinitely many observers). SIA would have you assign probability one to the latter hypothesis, assuming both hypotheses had a nonzero prior probability. But surely, whether the universe is finite or infinite is an open scientific question, not something that you can determine with certainty simply by leaning back in your armchair and registering the fact that you exist!
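One way to make the limiting case explicit is as follows. The symbols are introduced here only for illustration: $p > 0$ is the naive prior of the hypothesis with the larger population of $M$ observers, and the rival hypothesis has prior $1 - p$ and a fixed finite number $N$ of observers. Weighting each prior by its observer count, as SIA requires, gives

$$P'(h_M) = \frac{p \, M}{p \, M + (1 - p) \, N} \longrightarrow 1 \quad \text{as } M \to \infty.$$

However small $p$ is, the SIA-adjusted credence in the larger hypothesis approaches one as its population grows without bound.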

For these reasons, we should reject SIA.