Chapter 10

Observation Selection Theory

A Methodology for Anthropic Reasoning

This chapter brings all the lessons from the foregoing chapters together and presents a theory of observation selection effects. It provides a method for taming anthropic biases and a general framework for connecting theory and observation.

Building blocks, theory constraints and desiderata

Let’s start by reviewing some of the materials and tools that we have on our workbench:

Chapter 2 established several preliminary conclusions concerning the use of anthropic arguments in cosmology. We shall want to revisit these when we have formulated the observation selection theory and see if it replicates the earlier findings or if some revisions are required.

Chapter 3 homed in on what seemed to lie at the core of anthropic reasoning and expressed it in a tentative principle, SSA, which described a way of taking into account indexical information about which observer one has turned out to be.

Chapter 4 developed several thought experiments in support of SSA. The Incubator gedanken is especially important because it provided the link to the Doomsday argument and the various paradoxical results we examined in chapter 9.

Chapter 5 showed how something like SSA is needed to make sense of certain types of scientific theorizing such as in linking Big-World cosmological models to empirical data.

Chapter 6 analyzed the Doomsday argument. We found shortcomings in the versions that have been presented in the literature, we argued that John Leslie’s proposal for solving the reference class problem is unworkable, and we showed that DA has alternative interpretations and is inconclusive. However, it has not been refuted by any of the easy objections that we examined in chapter 7. In particular, we rejected the claim that SIA is the way to neutralize the counterintuitive effects that SSA can have in certain applications.

Chapter 8 proved a kind of “coherence” for SSA-based reasoning: it was shown not to lead to alleged paradoxical “observer-relative” chances or implausible betting-frenzy between rational agents (in the wide range of cases considered).

In chapter 9, the Adam & Eve, UN++, and Quantum Joe thought experiments demonstrated counterintuitive consequences of SSA, although we also saw that these consequences do not include the prima facie one that SSA gives us reason to believe in paranormal causation. The genuine implications of SSA are not impossible to accept; John Leslie, for one, is quite willing to bite the bullets. Yet many of us, endowed with less hardy epistemic teeth and stomachs, may find a meal of such ammunition a rather unpalatable experience and would prefer an alternative theory that does not have these implications, supposing one can be found that is satisfactory on other accounts.

Let’s list some of the criteria that an observation selection theory should satisfy:

  • The observation selection effects described by the Carter-Leslie versions of WAP and SAP must be heeded; these should come out as special case injunctions of a more general principle.
  • The theory must be able to handle the problem of freak observers in Big-World cosmological models.
  • More generally, observation selection effects in cosmology, including ones of a probabilistic nature, must be taken into account. The theory should connect in constructive ways with current research in physical cosmology that is addressing these issues.
  • The theory should also make it possible to model observation selection effects in other sciences, including the applications in evolutionary biology, thermodynamics, traffic analysis, and quantum physics that we reviewed in chapter 5.
  • The arguments set forth in the thought experiments in chapter 4 must be respected to the extent that they are sound.
  • The theory should not explicitly or implicitly rely on SIA or on any supposition that amounts to the same thing. (Or if it does, a very strong defense against the objections raised against SIA in chapter 7, including The Presumptuous Philosopher gedanken, would have to be provided.) Also, the theory must obviously not employ any of the defective ideas and misunderstandings we exposed when scrutinizing the various objections against DA in chapter 6.
  • Not strictly a criterion, but certainly a desideratum, is that the counterintuitive implications of SSA discussed in chapter 9 be avoided.
  • Something needs to be said about the reference class problem: Where is the boundary of how the reference class can be defined? What are the considerations that determine this boundary?
  • In most general terms, the theory should provide a sound methodology for linking up theory with observational data, including ones that have indexical components.

When these specific criteria and desiderata are combined with the usual generic theoretical goals—simplicity, coherence, non-arbitrariness, exactness, intuitive plausibility, etc.—we have enough constraints that we will be happy if we can find even one theory that fits the bill.

The outline of a solution

In order to reach the observation selection theory we are searching for, we shall have to traverse the following sequence of ideas.

Step one: We recognize that there is additional indexical information, apart from the information you might have about which observer you are, that needs to be taken into account. In particular, you may also have relevant information about which temporal part of a given observer you currently are. We must strengthen SSA in a way that lets us model the evidential import of such information.

Step two: We zoom in on the Incubator gedanken as the simplest situation in which SSA leads to the kind of reasoning that, as we saw in the previous chapter, gives counterintuitive results when it is applied to Adam & Eve etc. We need to think carefully about what is going on in this example and study what happens when we apply the strengthened version of SSA to it.

Step three: We note that the answer given by applying SSA to Incubator in accordance with Model 2 (described in chapter 4) can be avoided if we relativize the reference class in a certain way.

Step four: We realize that the arguments given for Model 2 are defeated by the strengthened version of SSA. Since this version takes more indexical information into account, it trumps SSA in cases of disagreement. This gives us the authority to reject the claim that Model 2 has to be used in all cases where the number of observers is a variable. Instead, a new model using relativized reference classes is formulated which is more generally valid and which enables us to resolve the paradoxes of chapter 9.

Step five: We abstract from the particulars and find a general probabilistic formula that specifies the relation between evidence, hypotheses, and reference classes.

Step six: We show how this formula embodies a methodology that meets the criteria and desiderata listed in the previous section.

SSSA: Taking account of indexical information of observer-moments

Just as one can be ignorant about which observer one is, and one can get new information by finding out, and this information can be relevant evidence for various non-indexical hypotheses—so likewise can one be ignorant about which temporal part of an observer one currently is, and such indexical information can bear on non-indexical hypotheses. Observation selection effects can be implicated in both cases. Not surprisingly, there are extensive similarities in how we should model reasoning using these two types of indexical information.

We shall use the term “observer-moment” to refer to a brief time-segment of an observer. We can now consider the obvious analogue to SSA that applies to observer-moments instead of observers. Call this the Strong Self-Sampling Assumption:

(SSSA) One should reason as if one’s present observer-moment were a random sample from the set of all observer-moments in its reference class.

Consider the simple case of Mr. Amnesiac (depicted in figure 3):

Mr. Amnesiac

Mr. Amnesiac, the only observer ever to exist, is created in Room 1, where he stays for two hours. He is then transported into Room 2, where he spends one hour, whereupon he is terminated. His severe amnesia renders him incapable of retaining memories for any significant period of time. The details about the experimental situation he is in, however, are explained on posters in both rooms; so he is always aware of the relevant non-indexical features of his world.


It is plausible to require that Mr. Amnesiac’s credence at each point that he is currently in Room 1 be twice as large as his credence that he is in Room 2. In other words, all observer-moments in this gedanken should set P(“This observer-moment is in Room 1” | Information about the setup) = 2/3. Arguments to back up this claim can be obtained easily by adapting the reasoning we used to support the view that in the Dungeon gedanken (chapter 4), one’s credence of being in a blue cell should equal 90% (the fraction of cells that are blue). This is in agreement with SSSA. By varying the proportions of Mr. Amnesiac’s lifespan that he spends in various rooms, we can generalize the finding to a larger set of cases.

In the same manner, we can handle the case where instead of one observer being moved between the rooms, we have two different observers who exist, one in each room, for two hours and one hour, respectively (figure 4). We assume that the lights are out so that the observers cannot see what color beard they have, and that they have amnesia so that they can’t remember how long they have been in a room.


By SSSA, both observers should at each point in time set:

P(I am currently in Room 1 | Information about the setup) = 2/3

P(I am currently in Room 2 | Information about the setup) = 1/3.

This result can be backed up by betting arguments similar to those used to justify our analysis of Dungeon in chapter 4. We may suppose, for example, that every five minutes the observers are called upon to bet on which room they are in, and we can then calculate the fair odds at which their combined expected gain is zero.
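The betting argument can be made concrete with a small calculation. What follows is only a sketch under stated assumptions, not a calculation taken from chapter 4: we assume a ticket that pays 1 unit if the bettor is in Room 1 and nothing otherwise, offered to each observer every five minutes, and the names `combined_gain` and `fair_price` are ours.

```python
from fractions import Fraction

# Assumed setup: one observer spends 120 minutes in Room 1, another spends
# 60 minutes in Room 2, and each is asked to bet once every 5 minutes.
BET_INTERVAL_MIN = 5
MINUTES = {"Room 1": 120, "Room 2": 60}


def bets_placed(room):
    """Number of betting occasions that occur in the given room."""
    return MINUTES[room] // BET_INTERVAL_MIN


def combined_gain(price):
    """Total gain over all bets on a ticket paying 1 if the bettor is in Room 1."""
    won = bets_placed("Room 1") * (1 - price)   # tickets bought in Room 1 pay out
    lost = bets_placed("Room 2") * (0 - price)  # tickets bought in Room 2 do not
    return won + lost


def fair_price():
    """Ticket price at which the observers' combined gain is zero."""
    n1, n2 = bets_placed("Room 1"), bets_placed("Room 2")
    return Fraction(n1, n1 + n2)


print(fair_price())                 # 2/3
print(combined_gain(fair_price()))  # 0
```

On these assumptions the break-even price coincides with the credence of 2/3 that SSSA recommends for being in Room 1.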

Before we proceed, we should note that the definition did not specify the exact duration of an observer-moment. Doesn’t this omission generate a serious degree of vagueness in the formulation of SSSA? Not so. So long as we are consistent and partition observers into time-segments of equal duration, it doesn’t matter how long a unit of subjective time is (provided it is sufficiently fine-grained for the problem at hand). For example, in Mr. Amnesiac it does not matter whether an observer-moment lasts for five seconds or five minutes or one hour. In either case, there are twice as many observer-moments (of the same reference class) being spent in Room 1, and that is enough for SSSA to recommend a credence of 2/3 of being in Room 1 for all observer-moments.
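The independence from the chosen unit can be checked with a few lines of arithmetic. This is a minimal sketch, assuming that observer-moments are simply counted as equal-length segments of the time spent in each room; the function name `credence_room1` is introduced here for illustration.

```python
from fractions import Fraction


def credence_room1(minutes_room1, minutes_room2, moment_minutes):
    """SSSA credence of being in Room 1 when every observer-moment lasts
    `moment_minutes` and all observer-moments share one reference class."""
    n1 = Fraction(minutes_room1) / Fraction(moment_minutes)
    n2 = Fraction(minutes_room2) / Fraction(moment_minutes)
    return n1 / (n1 + n2)


# Mr. Amnesiac: 120 minutes in Room 1, 60 minutes in Room 2.
# Observer-moments of five seconds, five minutes, and one hour.
for moment_minutes in (Fraction(5, 60), 5, 60):
    print(credence_room1(120, 60, moment_minutes))  # 2/3 each time
```

The duration of an observer-moment cancels out of the ratio, which is why only the consistency of the partition matters.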

For the purposes of SSSA, it may be appropriate to partition observers into segments of equal subjective time. If one observer has twice the amount of experience in a given time interval as another observer, it seems quite plausible to associate twice as many observer-moments to the former observer during the interval. Thus, for instance, if two similar observers could be similarly implemented on two distinct pieces of silicon hardware (Drexler 1985; Moravec 1989), and we run one of the computers at a faster clock rate, then on this line of reasoning that would result in more observer-moments being produced per second in the faster computer.1

1 One science-fiction method of uploading a human mind to a computer is as follows: (1) Through continued progress in computational neuroscience, create a catalogue of the functional properties of the various types of neurons and other computational elements in the human brain. (2) Use e.g. advanced nanotechnology to disassemble a particular human brain and create a three-dimensional map of its neuronal network at a sufficient level of detail (presumably at least on the neuronal level but if necessary down to the molecular level). (3) Use a powerful computer to run an emulation of this neuronal network. This means that the computations that took place in the original biological brain are now performed by the computer. (4) Connect the emulated intellect to suitable input/output organs if you want it to be able to interact with the external world. Assuming computationalism is true, this will result in the uploaded mind continuing to exist (with the same memories, desires, etc.) on its new computational substrate. (The intuitive philosophical plausibility of the scenario may be increased if you imagine a more gradual transformation, with one neuron at a time being replaced by a silicon microprocessor that performs the same computation. At no point would there be a discontinuity in behavior, and the subject would not be able to tell a difference; and at the end of the transformation we have a silicon implementation of the mind. For a more detailed analysis, see e.g. Merkle 1994.)

Subjective time, thus, is not about how long an observer thinks an interval is—one can easily be mistaken about that—but it is, rather, a measure of the actual amount of cognition and experience that have taken place. However, nothing in the following discussion hinges on this idea.2

2 If subjective time is a better measure of the duration of observer-moments than chronological time, this might suggest that an even more fundamental entity for self-sampling to be applied to would be (some types of) thoughts, or occurrent ideas. SSSA can lead to longer-lived observers getting a higher sampling density by virtue of their containing more observer-moments. One can ponder whether one should not also assign a higher sampling density to certain types of observer-moments, for example those that have a greater degree of clarity, intensity, or focus. Should we say that if there were (counterfactually!) equally many deep and perspicacious anthropic thinkers as there are superficial and muddled ones then one should, other things equal, expect to find one’s current observer-moment to be one of the more lucid observer-moments? And should one think that one were more likely to find oneself as an observer who spends an above-average amount of time thinking about observation selection effects? This would follow if only observer-moments spent pondering problems of observation selection effects are included in one’s current reference class, or if such observer-moments are assigned a very high sampling density. And if one does in fact find oneself as such an observer, who is rather frequently engaged in anthropic reasoning, could one take that as private evidence in favor of the just-mentioned approach?

SSSA is a strengthening of SSA in the sense that it takes more indexical information into account: not only information about which observer you are but also information about which temporal part of that observer you currently are. SSSA is not necessarily a strengthening of SSA in the sense that it has all the same implications that SSA has and then some. Au contraire, we shall argue that the extra informational component that SSSA includes in its jurisdiction introduces new degrees of freedom for rational belief compared to SSA, basically because this added information can be legitimately evaluated in divergent ways. Consequently, there is a potential for rational disagreements (on the basis of this larger set of indexical information now underlying our judgments) that didn’t exist before (in relation to the more limited set of information that SSA deals with). This means that some limitations on rational belief that would obtain if SSA were all we had are no longer applicable once we realize that SSA left out important considerations. So in one sense, SSSA is sometimes weaker than SSA, namely, because in some cases it imposes fewer restrictions on rational credence assignments.

Reassessing Incubator

Next we zero in on a key lesson that emerges from the preceding investigations: that the critical point, the fountainhead, of all the paradoxical results seems to be the contexts where the hypotheses under consideration have different implications about the total number of observers in existence. Such is the way with DA, the various Adam & Eve experiments, Quantum Joe, and UN++. By contrast, things seem to be humming along perfectly nicely so long as the total number of observers is held constant.

Here is another clue: Recall that we remarked in chapter 4 that the cases in which the definition of the reference class is relevant for our probability assignments seem to be precisely those in which the total number of observers depends on which hypothesis is true. This suggests that the solution we are trying to find has something to do with how the reference class is defined.

So that we may focus our beam of attention as sharply as possible on the critical point, let us contemplate the simplest case where the number of observers is a variable and that we can use to model the reasoning in DA and the problematic thought experiments: Incubator. Now that we have SSSA, it is useful to add some details to the original version:

Incubator, version III

The incubator tosses a fair coin in an otherwise empty world. If the coin falls heads, the incubator creates one room with a black-bearded observer and one room with one white-bearded observer; if it falls tails, the incubator creates only a room with a black-bearded observer. Observers first spend one hour in darkness (being ignorant about their beard color), and then one hour with the lights on (so they can see their beard in a mirror). Everyone knows the setup. After two hours, the experiment ends and everybody is killed.

The situation is depicted in figure 5. For simplicity, we can assume that there is one observer-moment per hour and observer.


We discussed three models for how to reason about Incubator in chapter 4. We rejected Model 1 and Model 3 and were thus left with Model 2, the model embodying the kind of reasoning that got us into trouble in chapter 9. Is it perhaps possible that there is a fourth model, a better way of reasoning that can be accessed by means of the more powerful analytical resources provided by SSSA? Let’s consider again what the observer-moments in Incubator should believe.

To start with, suppose that all observer-moments are in the same reference class. Then it follows directly from SSSA that3

3 From now on, we suppress information about the experimental setup, which is assumed to be shared by all observer-moments and is thus implicitly conditionalized on in all credence assignments.

P(“This is an observer-moment that knows it has a black beard” | Tails) = 1/2

P(“This is an observer-moment that knows it has a black beard” | Heads) = 1/4

Together with the fact that the coin toss is known to have been fair, this implies that after the light comes on, the observer-moment that knows it has a black beard should assign a credence of 1/3 to Heads. This is the conclusion that, when transposed to DA and the Adam & Eve thought experiments, leads to the problematic probability shift in favor of hypotheses that imply fewer additional observers.4
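Spelled out, the Bayesian step behind the 1/3 figure runs as follows. This is a worked sketch that uses only the two conditional probabilities above and the fair-coin prior; the abbreviation B for “This is an observer-moment that knows it has a black beard” is introduced here for brevity.

```latex
\begin{align*}
P(\text{Heads} \mid B)
  &= \frac{P(B \mid \text{Heads})\,P(\text{Heads})}
          {P(B \mid \text{Heads})\,P(\text{Heads}) + P(B \mid \text{Tails})\,P(\text{Tails})} \\
  &= \frac{\tfrac14 \cdot \tfrac12}{\tfrac14 \cdot \tfrac12 + \tfrac12 \cdot \tfrac12}
   = \frac{1/8}{3/8} = \frac13 .
\end{align*}
```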

4 It would be an error to regard these probability shifts as representing some sort of “inverse SIA”. SIA would have you assign a higher a priori (i.e. conditional only on the fact that you exist) probability to worlds that contain greater numbers of observers. But the DA-like probability shift in favor of hypotheses entailing fewer observers does not represent a general a priori bias in favor of worlds with fewer observers. What it does, rather, is reduce the probability of those hypotheses on which there would be many additional observers beyond yourself compared to hypotheses on which it also was guaranteed that an observer like you would exist although not accompanied by as many other observers. Thus it is because there would still be “early” observers whether or not the human species lasts for long that finding yourself one of these early observers gives you reason, according to DA, to think that there will not be hugely many observers after you. This probability shift is a posteriori and applies only to those observers who know that they are in the special position of being early (or who have some other such property that is privileged in the sense that the number of people likely to have it is independent of which of the hypotheses in question happens to be true).

This suggests that if we are unwilling to accept these consequences, we should not place all observer-moments in the same reference class. Suppose that we instead put the early observer-moments in one reference class and the late observer-moments in separate reference classes. We’ll see how this move might be justified in the next section, but we can already note that making the choice of reference class context-dependent in this way is not entirely arbitrary. The early observer-moments, which are in very similar states, are in the same reference class. The observer-moment that has discovered that it has a black beard is in an importantly different state (no longer wondering about its beard color) and is thus placed in a different reference class. The observer-moment that has discovered it has a white beard is again different from all the other observer-moments (it is, for instance, in a state of no uncertainty as to its beard color and can deduce logically that the coin fell heads), and so it also has its own reference class. The differences between the observer-moments are significant at least in the respect that they concern what information the observer-moments have that is relevant to the problem at hand, viz. to guess how the coin fell.

If we use this reference class partitioning, then SSSA no longer entails that the observer-moment who has discovered that it has a black beard should favor the Tails hypothesis. Instead, that observer-moment will now assign equal credence to either outcome of the coin toss. This is because on either Tails or Heads, all observer-moments in its reference class (which is now the singleton consisting only of that observer-moment itself) observe what it is observing; so SSSA gives:

P(“This is an observer-moment that knows it has a black beard” | Tails) = 1

P(“This is an observer-moment that knows it has a black beard” | Heads) = 1

The problematic probability shift is thus avoided.

It remains the case that the early observer-moments, who are ignorant about their beard-color, assign an even credence to Heads and Tails; so we have not imported the illicit SIA criticized in chapter 7.

As for the observer-moment that discovers that it has a white beard, SSSA gives the following conditional probabilities:

P(“This is an observer-moment that knows it has a white beard” | Tails) = 0

P(“This is an observer-moment that knows it has a white beard” | Heads) = 1

So that observer-moment is advised to assign zero credence to the Tails hypothesis (which would have made its existence impossible).
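The contrast between the universal reference class and the partition just described can be sketched numerically. The code below is an illustration under stated assumptions, not the general formula developed later in the chapter: observer-moments are labeled by their epistemic state (`dark`, `knows_black`, `knows_white`, labels of our own choosing), there is one observer-moment per hour and observer, and a world in which the reference class is empty is given likelihood zero, matching the value assigned to Tails for the white-bearded observer-moment.

```python
from fractions import Fraction

# Observer-moments in each possible world of Incubator, version III.
WORLDS = {
    "Heads": ["dark", "dark", "knows_black", "knows_white"],
    "Tails": ["dark", "knows_black"],
}
PRIOR = {"Heads": Fraction(1, 2), "Tails": Fraction(1, 2)}  # fair coin
UNIVERSAL = {"dark", "knows_black", "knows_white"}


def likelihood(evidence, world, reference_class):
    """Fraction of the world's observer-moments in the reference class
    that are in the state described by `evidence`."""
    in_class = [m for m in WORLDS[world] if m in reference_class]
    if not in_class:
        return Fraction(0)  # assumption: empty reference class, likelihood 0
    return Fraction(in_class.count(evidence), len(in_class))


def posterior_heads(evidence, reference_class):
    """Credence in Heads after conditionalizing on `evidence`."""
    joint = {w: PRIOR[w] * likelihood(evidence, w, reference_class) for w in WORLDS}
    return joint["Heads"] / sum(joint.values())


print(posterior_heads("knows_black", UNIVERSAL))        # 1/3
print(posterior_heads("dark", {"dark"}))                # 1/2
print(posterior_heads("knows_black", {"knows_black"}))  # 1/2
print(posterior_heads("knows_white", {"knows_white"}))  # 1
```

With the universal reference class the black-bearded observer-moment is pushed toward Tails; with the partition it stays at even credence, while the early observer-moments remain at 1/2 in either case.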

How the reference class may be observer-moment relative

Can it be permissible for different observer-moments to use different reference classes? We can turn this question around by asking: Why should different observer-moments not use different reference classes? What argument is there to show that such a way of assigning credence would necessarily be irrational?

In chapter 4, we gave an argument for accepting Model 2, the model asserting that the observer who knows he has a black beard should assign a greater than even credence to Tails. The argument had the following form: First consider what you should believe if you don’t know your beard color; second, in this state of ignorance, assign conditional probabilities to you having a given beard color given Heads or given Tails; third, upon learning your beard color, use Bayesian kinematics to update the credence function obtained through the first two steps. The upshot of this process is that after finding that you have a black beard, your credence of Tails should be 2/3.

Let’s try to recapture this chain of reasoning in our present framework using observer-moments. The early observer-moments don’t know whether they have a black or a white beard, but they can consider the conditional probabilities of that given a particular outcome of the coin toss. They know that on Heads, one out of two of the observer-moments in their epistemic situation has a black beard; and on Tails, one out of one has a black beard:

P(“This observer-moment has a black beard” | Tails & Early) = 1

P(“This observer-moment has a black beard” | Heads & Early) = 1/2

(“Early” stands for “This observer-moment exists during the first hour”.) One can easily see that this credence assignment is independent of whether the universal reference class is used or the partition of reference classes described above. Moreover, since the observer-moments know that the coin toss is fair, they also assign an even credence to Heads and Tails.5 This gives (via Bayes’ theorem):

5 Note that in this case there is no DA-like probability-shift from finding that you are an “early” observer-moment, because the proportion of observer-moments that are early is the same on the Heads and the Tails hypotheses. Even if the universal reference class were used, the DA-shift would come only from discovering that you have a black beard.

P(Tails | “This observer-moment has a black beard” & Early) = 2/3 (C1)

P(Heads | “This observer-moment has a black beard” & Early) = 1/3 (C2)
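For completeness, here is the arithmetic behind (C1), written as a worked sketch from the two conditional probabilities above and the even credence in Heads and Tails; B here abbreviates “This observer-moment has a black beard” (our shorthand), and (C2) follows as the complement.

```latex
\begin{align*}
P(\text{Tails} \mid B \,\&\, \text{Early})
  &= \frac{P(B \mid \text{Tails} \,\&\, \text{Early})\,P(\text{Tails} \mid \text{Early})}
          {P(B \mid \text{Tails} \,\&\, \text{Early})\,P(\text{Tails} \mid \text{Early})
           + P(B \mid \text{Heads} \,\&\, \text{Early})\,P(\text{Heads} \mid \text{Early})} \\
  &= \frac{1 \cdot \tfrac12}{1 \cdot \tfrac12 + \tfrac12 \cdot \tfrac12}
   = \frac{1/2}{3/4} = \frac23 .
\end{align*}
```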

When the lights come on, one observer discovers he has a black beard. The old argument that is now being questioned would now have him update his credence by applying Bayesian conditionalization to the conditional credence assignments (C1 & C2) that he made when he was ignorant about his beard color. And this is where the argument fails. For the later observer-moment’s evidence is not equivalent to the earlier observer-moment’s evidence conjoined with the proposition that it has a black beard. The later observer-moment has also lost knowledge of the indexical proposition “Early”, and moreover, the indexical proposition expressed by “This observer-moment has a black beard” is a different one when the thought is entertained by the later observer-moment, since “this” then refers to a different observer-moment.

Therefore, we see that the argument that would force the acceptance of Model 2 relies on the implicit premiss that the only relevant epistemological difference between the observer before and after he discovers his beard color is that he gains the information that is taken into account by the Bayesian conditionalization referred to in step three. If there are other relevant informational changes between the “early” and the “late” states of the observer, then there is no general reason to think that his credence assignments in the latter state should be obtained by simply conditionalizing on the finding that he is an observer with a black beard. In chapter 4, where we had by stipulation limited our consideration to only such indexical information as concerned which observer one is, this hidden premiss was satisfied; for the latter state of the observer then differed from the early one in precisely one regard, namely, by having acquired the indexical information that he is the observer with the black beard, which is the information that was conditionalized on in step three. Now, however, this tacit assumption is no longer supported. For we now have also to consider changes in other kinds of indexical information that might have occurred between the early and the late stages. This includes the change in the indexical information about which temporal part of the observer (i.e. which observer-moment) one currently constitutes. Before the observer finds that he has a black beard, he knows the piece of indexical information that “this current observer-moment is one that is ignorant about its beard color”. After finding out that he has a black beard, he has lost that piece of indexical information (the indexical fact no longer obtains about him); and the information he has gained includes the indexical fact that “this current observer-moment is one that knows that it has a black beard”. These differences in information (which the argument for Model 2 fails to take into account) could potentially be relevant to what credence the observer should assign to the Tails and Heads hypotheses after he has found out that he has a black beard.

Consider now the claim that the reference class is observer-moment relative, more specifically, that the early and the late observer-moments should use different reference classes, as described above. Then, since the reference class is what determines the conditional probabilities that are used in the calculation of the posterior probabilities of Heads or Tails, we have to acknowledge that the difference in indexical information just referred to is directly relevant and must therefore be taken into account. The indexical information that the early observer-moments use to derive the conditional probabilities C1 and C2 (namely, the indexical information that they are early observer-moments, which is what determines what their reference class is, which in turn determines these conditional probabilities) is lost and replaced by different indexical information when we turn to the later observer-moments. The later observer-moments, having different indexical information, belong, ex hypothesi, in different reference classes mandating a different set of conditional probabilities. If a late observer-moment’s reference class does not include early observer-moments, then its conditional probability (given either Heads or Tails) of being an early observer-moment is zero. Conditionalizing on being a late observer-moment would therefore have no influence on the credence that the late observer-moment assigns to the possible outcomes of the coin toss. (The late observer-moment that has discovered it has a white beard has of course got another piece of relevant information, which implies Heads, so that’s what it should believe, with probability unity.)

The argument I’ve just given does not show that the difference in indexical information about which observer-moment one currently is requires that different reference classes be used. All it does is to show that this is now an open possibility, and that the argument to the contrary that was earlier used to support Model 2 can no longer be applied once the purview is expanded to SSSA, which takes into account a more complete set of indexical information. What this means is that the arguments relying on Model 2 can now be seen to be inconclusive; they don’t prove what they set out to prove. We are therefore free to reject DA and the assertion that Adam and Eve, Quantum Joe and UN++ should believe the counterintuitive propositions which, if the sole basis of evaluation were the indexical information taken into account by SSA, they might have been rationally required to accept.

Indeed, the fact that the choice of a universal reference class leads to the implausible conclusions of chapter 9 is a reason for rejecting the universal reference class as the exclusively rational alternative. It suggests that, instead, choosing the reference class in a more context-dependent manner is a preferable method. I am not claiming that this reason is conclusive. One could choose to accept the consequences discussed in Adam & Eve, Quantum Joe and UN++. If one is willing to do that then nothing that has been said here stops one from using a universal reference class. But if one is unwilling to embrace those results, then the way in which one can coherently avoid doing so is by insisting that one’s choice of reference class is to some degree dependent on context (specifically, on indexical information concerning which observer-moment one currently is).

The task now awaiting us is to explain how an observation selection theory can be developed that meets all the criteria and desiderata listed above and that can operate with a relativized reference class. The framework we shall propose is neutral in regard to the reference class definition. It can therefore be used either with a universal reference class or with a relativized reference class. The theory specifies how credence assignments are to be made given a choice of reference class. This is a virtue because in the absence of solid grounds for claiming that only one particular reference class definition can be rationally permissible, it would be wrong to rule out other definitions by fiat. This is not to espouse a policy of complete laissez-faire as regards the choice of reference class. We shall see that there are interesting limits on the range of permissible choices.

Formalizing the theory: the Observation Equation

A centerpiece of our observation selection theory is the probabilistic connection between theory and observation that enables one to derive observational consequences from theories about the distribution of observer-moments in the world. Here we shall first propose an equation that gives a specification of this fundamental methodological link. Then we shall illustrate how it works by applying it to Incubator.

Let α be an observer-moment whose subjective probability function is Pα. Let Oα be the class of all possible observer-moments that belong to the same reference class as α (according to α’s reference class definition ℜα).6 Let wα be the possible world in which α is located. Let e be some evidence and h some hypothesis, and let Oe and Oh be the classes of possible observer-moments “about whom” e and h are true, respectively. (If h ascribes a property to observer-moments, e.g. h := “This is an observer-moment that has a black beard”, then we say that h is true about those and only those possible observer-moments that have the property in question; if h is non-indexical, not referring to any particular observer-moment, then h is true about all and only those possible observer-moments that live in possible worlds where h holds true. And similarly for e.) Finally, let O(w) be the class of observer-moments in the possible world w. We then have:

$$P_\alpha(h \mid e) = \frac{1}{\gamma} \sum_{\sigma \in O_h \cap O_e} \frac{P_\alpha(w_\sigma)}{\lvert O_\alpha \cap O(w_\sigma) \rvert} \qquad \text{(OE)}$$

where γ is a normalization constant:

$$\gamma = \sum_{\sigma \in O_e} \frac{P_\alpha(w_\sigma)}{\lvert O_\alpha \cap O(w_\sigma) \rvert}$$

6 Earlier we included only actually existing observer-moments in the reference class. It is expedient for present purposes, however, to have a concise notation for this broader class which includes possible observer-moments, so from now on we use the term “reference class” for this more inclusive notion. This is merely a terminological convenience and does not by itself reflect a substantive deviation from our previous approach.


Let us apply OE to Incubator to calculate what credence an observer should assign to Heads upon finding that he has a black beard. In order to do that we must first specify what reference class definition is used by the corresponding possible observer-moments (i.e. those that are in a state of knowing that they have black beards). Let’s call these possible observer-moments β2 and β4 (see figure 6). We need two such possible observer-moments in our model of the problem since there are two relevant possible worlds, one (w1) where Heads is true and one (w2) where Tails is true, and there is one possible observer-moment knowing it has a black beard in each of these possible worlds. For the sake of illustration, let’s assume that the reference class definition ℜβ2,4 used by β2 and β4 is the one discussed above that places these two possible observer-moments in a separate reference class from the possible observer-moments that don’t know their beard color and places the possible observer-moment that knows it has a white beard in a third reference class on its own. (In the interest of brevity, we shall from now on frequently refer to possible observer-moments simply as “observer-moments”, when context makes it clear what is meant.)


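We can assume that the observer-moments share the prior P(w1) = P(w2) = 1/2. Let h be the hypothesis that the coin fell heads, and e the total information available to an observer-moment that knows it has a black beard. As shown in the diagram (where the α-observer-moments are those belonging to the possible white-bearded observer) we have:

$$O_e = O_{\beta_2} = O_{\beta_4} = \{\beta_2, \beta_4\}, \qquad O_h \cap O_e = \{\beta_2\}, \qquad O_{\beta_2} \cap O(w_1) = \{\beta_2\}, \qquad O_{\beta_4} \cap O(w_2) = \{\beta_4\}$$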


From this it follows that Pβ2,4(h|e) = 1/2 (and γ = 1). The observer, upon finding he has a black beard, should consequently profess perfect ignorance about the outcome of the coin toss.
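The calculation can be checked mechanically. What follows is a minimal Python sketch, not part of the original text, that evaluates OE on a finite model: each observer-moment about whom e is true contributes the prior of its world divided by the number of reference-class members in that world, and the result is normalized. The function name `observation_equation`, the string labels, and the placement of the observer-moments other than β2 and β4 (here `b1`, `b3`, `a1`, `a2`) are illustrative assumptions, since only β2 and β4 are identified in the text.

```python
from fractions import Fraction

def observation_equation(worlds, prior, ref_class, o_e, o_h):
    """Evaluate OE on a finite model.

    worlds    : dict mapping world name -> set of observer-moment labels, O(w)
    prior     : dict mapping world name -> prior probability P(w)
    ref_class : set of labels forming the reference class O_alpha
    o_e, o_h  : sets of labels about whom e and h are true
    Returns the pair (posterior of h given e, normalization constant gamma).
    """
    world_of = {m: w for w, members in worlds.items() for m in members}

    def term(moment):
        w = world_of[moment]
        n = len(ref_class & worlds[w])  # |O_alpha ∩ O(w_sigma)|
        if n == 0:
            raise ValueError(f"reference class has no member in {w}")
        return prior[w] / n

    gamma = sum(term(m) for m in o_e)
    numerator = sum(term(m) for m in o_h & o_e)
    return numerator / gamma, gamma

# Assumed layout of the Incubator model: b1, b3 are early blackbeard moments,
# b2, b4 the late ones that know they have black beards, a1, a2 the whitebeard's.
half = Fraction(1, 2)
worlds = {"w1": {"b1", "b2"}, "w2": {"b3", "b4", "a1", "a2"}}
prior = {"w1": half, "w2": half}
o_e = {"b2", "b4"}   # moments that know they have a black beard
o_h = worlds["w1"]   # Heads is true about every moment in w1

post_rel, gamma_rel = observation_equation(worlds, prior, {"b2", "b4"}, o_e, o_h)
everyone = worlds["w1"] | worlds["w2"]
post_univ, gamma_univ = observation_equation(worlds, prior, everyone, o_e, o_h)

print(post_rel, gamma_rel)    # 1/2 1
print(post_univ, gamma_univ)  # 2/3 3/8
```

With the relativized reference class the sketch returns 1/2 with γ = 1, as in the text. With all six assumed observer-moments placed in a single class, the same function returns 2/3, which illustrates that the equation itself takes the reference class as an input rather than fixing it.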

A quantum generalization of OE

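If one adopts a many-worlds interpretation of quantum physics of the type that postulates a primitive connection between the quantum measure of an observer-moment and the probability of finding oneself currently as that observer-moment, then one needs to augment OE by assigning a weight μ(σ) to each observer-moment that is being summed over, representing that observer-moment’s quantum measure. This gives us

$$P_\alpha(h \mid e) = \frac{1}{\gamma} \sum_{\sigma \in O_h \cap O_e} \frac{P_\alpha(w_\sigma)\,\mu(\sigma)}{\sum_{\tau \in O_\alpha \cap O(w_\sigma)} \mu(\tau)}$$

where γ is a normalization constant:

$$\gamma = \sum_{\sigma \in O_e} \frac{P_\alpha(w_\sigma)\,\mu(\sigma)}{\sum_{\tau \in O_\alpha \cap O(w_\sigma)} \mu(\tau)}$$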


(No assertion is made here about the virtues of the many-worlds version; we just point out how it can be modeled within the current framework.) This formula can also be used in a non-quantum context if one wishes to assign different kinds of observer-moments different weights, for example a larger weight to observer-moments that are clearer or more intense or contain more information.
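A weighted version can be sketched in the same way. The Python fragment below is an illustration under one natural reading of the weighting, namely that each observer-moment contributes the prior of its world multiplied by its share of the total weight of the reference-class members in that world. The function name `weighted_oe`, the labels, and the particular weights are assumptions made for the example.

```python
from fractions import Fraction

def weighted_oe(worlds, prior, ref_class, o_e, o_h, mu):
    """Weighted observation equation on a finite model.

    mu maps each observer-moment label to a non-negative weight.
    Returns the pair (posterior of h given e, normalization constant gamma).
    """
    world_of = {m: w for w, members in worlds.items() for m in members}

    def term(moment):
        w = world_of[moment]
        total = sum(mu[t] for t in ref_class & worlds[w])
        if total == 0:
            raise ValueError(f"reference class has zero weight in {w}")
        return prior[w] * Fraction(mu[moment]) / total

    gamma = sum(term(m) for m in o_e)
    numerator = sum(term(m) for m in o_h & o_e)
    return numerator / gamma, gamma

# Hypothetical two-world model with every moment in one reference class.
half = Fraction(1, 2)
worlds = {"w1": {"b1", "b2"}, "w2": {"b3", "b4", "a1", "a2"}}
prior = {"w1": half, "w2": half}
everyone = worlds["w1"] | worlds["w2"]
o_e = {"b2", "b4"}
o_h = worlds["w1"]

uniform = {m: 1 for m in everyone}
skewed = dict(uniform, b4=3)  # give one moment three times the weight

post_uniform, _ = weighted_oe(worlds, prior, everyone, o_e, o_h, uniform)
post_skewed, _ = weighted_oe(worlds, prior, everyone, o_e, o_h, skewed)
print(post_uniform, post_skewed)  # 2/3 1/2
```

With uniform weights the sketch reduces to the unweighted equation; increasing the weight of a single observer-moment shifts credence toward the world that contains it.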

One might have a similar expression with an integral instead of a sum if one is dealing with a continuum of observer-moments, but we shall not explore that suggestion here.7

7 For some relevant ideas on handling infinite cases that arise in inflationary cosmological models, see (Vilenkin 1995).

Non-triviality of the reference class: why ℜ0 must be rejected

We thus see how making use of the more fine-grained indexical information represented by observer-moments (rather than observers as wholes) makes it possible to move to a relativized definition of the reference class, and how this enables us to avoid the counterintuitive consequences that flow from applying SSA with a universal reference class in DA, Adam-and-Eve, Quantum Joe, and UN++.

It was noted that the Incubator observer-moments that were on this approach placed in different reference classes were different in ways that are not small or arbitrary but importantly relevant to the problem at hand. Is it possible to say something more definite about the criteria for membership in an observer-moment’s reference class? This section establishes one important constraint on how the reference class can rationally be defined.

What we shall call ℜ0, the minimal reference class definition, is the beguilingly simple idea that the reference class for a given observer-moment consists of those and only those observer-moments from which it is subjectively indistinguishable:

$$O_\alpha = \{\sigma : \sigma \text{ is subjectively indistinguishable from } \alpha\} \qquad (\Re_0)$$


Two observer-moments are subjectively indistinguishable iff they can’t tell which of them they are. (Being able to say “I am this observer-moment, not that one” does not count as being able to tell which observer-moment you are.) For example, if one observer-moment has a pain in his toe and another has a pain in his finger then they are not subjectively indistinguishable; for they can identify themselves as “this is the observer-moment with the pain in his toe” and “this is the observer-moment with the pain in his finger”, respectively. By contrast, if two brains are in precisely the same state, then (assuming epistemic states supervene on brain states) the two corresponding observer-moments are subjectively indistinguishable. The same holds if the brains are in slightly different states but the differences are imperceptible to the subjects.

There are some cases where using the extreme minimalism of ℜ0 doesn’t prevent one from constructing acceptable models. For instance, if the possible states that the observer in Incubator may end up in upon discovering that he has a black beard (i.e. β2 or β4) are subjectively indistinguishable, then ℜ0 replicates the reference class partition that we used above and will thus yield the same credence assignment.

One can model Incubator using ℜ0 even if we assume that there are two subjectively distinguishable states that the blackbearded observer might be in after learning about his beard color. In order to do that, one has to expand our representation of the problem by considering a more fine-grained partition of the possibilities involved. To be concrete, let us suppose that the blackbearded observer might or might not experience a pain in his little toe during the stage where he knows he has a black beard. If he knew that this pain would occur only if the coin fell Tails (say) then the problem would be trivial; so let’s suppose that he doesn’t know of any correlation between having the pain and the outcome of the coin toss. We then have four possible worlds to consider (figure 7):


The possible worlds w1 to w4 represent the following possibilities:

w1: Heads and the late blackbeard has no little-toe pain.
w2: Heads and the late blackbeard has a little-toe pain.
w3: Tails and the late blackbeard has a little-toe pain.
w4: Tails and the late blackbeard has no little-toe pain.

We can assume that the observer-moments share the prior P(wi) = 1/4 (for i = 1,2,3,4). Let h be the hypothesis that the coin fell heads, and e the information available to an observer-moment that knows it has a black beard and pain in the little toe. By ℜ0, the reference class for such an observer-moment is

$$O_{\beta_4} = O_{\beta_6} = \{\beta_4, \beta_6\}$$


OE then implies that Pβ4,6(h|e) = 1/2 (with γ = 1/2). That is, we get the same result here with the minimal reference class definition that we got on the revised approach of the previous section.
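The four-world model lends itself to a short computational check. In the Python sketch below, offered as an illustration rather than as part of the original argument, each possible observer-moment carries a label for its total evidence, and ℜ0 is implemented by grouping observer-moments with identical evidence labels. The `Moment` tuple, the evidence strings, and the inclusion of early observer-moments in every world are assumptions about how the lost figure 7 is laid out.

```python
from fractions import Fraction
from collections import namedtuple

# Hypothetical encoding: a name, the world it lives in, and its total evidence.
Moment = namedtuple("Moment", ["name", "world", "evidence"])

def minimal_class_posterior(moments, prior, evidence, h_worlds):
    """OE under the minimal definition: the reference class of a moment with
    total evidence `evidence` is exactly the set of moments sharing it."""
    o_e = [m for m in moments if m.evidence == evidence]

    def term(m):
        same_world = sum(1 for s in o_e if s.world == m.world)
        return prior[m.world] / same_world

    gamma = sum(term(m) for m in o_e)
    numerator = sum(term(m) for m in o_e if m.world in h_worlds)
    return numerator / gamma, gamma

moments = []
layout = [("w1", "H", False), ("w2", "H", True), ("w3", "T", True), ("w4", "T", False)]
for world, coin, pain in layout:
    # Early moments do not yet know their beard color.
    moments.append(Moment(f"{world}-early-black", world, "early"))
    late = "black+pain" if pain else "black"
    moments.append(Moment(f"{world}-late-black", world, late))
    if coin == "T":  # the whitebeard exists only if the coin fell Tails
        moments.append(Moment(f"{world}-early-white", world, "early"))
        moments.append(Moment(f"{world}-late-white", world, "white"))

prior = {w: Fraction(1, 4) for w in ("w1", "w2", "w3", "w4")}
posterior, gamma = minimal_class_posterior(moments, prior, "black+pain", {"w1", "w2"})
print(posterior, gamma)  # 1/2 1/2
```

Only the two observer-moments with a black beard and a toe pain enter the sum, one in a Heads world and one in a Tails world, so the posterior for Heads is 1/2 and the normalization constant is 1/2.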

So ℜ0 can be made to work in Incubator even if the participants are never in subjectively indistinguishable states. ℜ0 is neat, clear-cut, non-arbitrary, and it expunges the counterintuitive implications stemming from using the universal reference class definition, ℜU. Yet the temptation to accept ℜ0 has to be resisted.

Recall the “freak-observer problem” plaguing Big World theories that we discussed in chapters 3 and 5. This is one application where ℜ0 falls short.

Suppose T1 and T2 are two Big World theories. According to T1, the vast majority of observers observe values of physical constants in agreement with what we observe and only a small minority of freak observers are deluded and observe the physical constants having different values. According to T2, it is the other way around: the normal observers observe physical constants having other values than what we observe, and a minority of freak observers make observations that agree with ours. We want to say that our observations favor T1 over T2. Yet this is not possible on ℜ0. For according to ℜ0, the reference class to which we belong consists of all and only those observer-moments who make the same observations as we do, since other observer-moments are subjectively distinguishable from ours. If T1 and T2 both imply that the universe is big enough for it to be certain (or very probable) that it contains at least some observer making the observations that we are actually making, then on ℜ0 our evidence would not favor T1 over T2. Here is the proof:

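Consider an observer-moment α, who, in light of evidence e, considers what probability to assign to the mutually exclusive hypotheses hj (1 ≤ j ≤ n). By ℜ0 we have Oα = Oe.8 OE then gives

$$P_\alpha(h_j \mid e) = \frac{1}{\gamma} \sum_{\sigma \in O_{h_j} \cap O_e} \frac{P_\alpha(w_\sigma)}{\lvert O_e \cap O(w_\sigma) \rvert}$$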

8 According to ℜ0, σ ∈ Oα iff α has the same total evidence as σ. (For an observer-moment σ that has different total evidence from α would thereby be subjectively distinguishable from α; and an observer-moment that is subjectively indistinguishable from α must per definition share all of α’s evidence and can have no evidence that α does not have, and it would thus have the same total evidence as α.) What we need to show, thus, is that σ has the same total evidence as α iff σ ∈ Oe. Note first that Oe, the class of all possible observer-moments about whom e is true, is one in which α is a member (for since α knows e, e is true about α; this is so because any non-indexical part p of e is true of those and only those observer-moments that are in possible worlds where p holds true, and any indexical part p’ of e of the form “this observer-moment has property P” is true about those and only those possible observer-moments who have property P). Moreover, Oe is the narrowest class that α knows it is a member of, because if α knew it was a member of some proper subset Oe* of Oe, then e wouldn’t be the total evidence of α since α would then know e*, which is stronger than e. We can now show that σ has the same total evidence as α ⇔ σ ∈ Oe:

(⇒) Suppose first that σ has the same total evidence as α. Then α is subjectively indistinguishable from σ. Therefore, if σ ∉ Oe, then α wouldn’t know it was in Oe, since α cannot distinguish itself from σ. Contradiction. Hence σ ∈ Oe.

(⇐) Take a σ such that σ ∈ Oe. Suppose σ doesn’t have the same total evidence as α. Then α can subjectively distinguish itself from σ. Hence there is a narrower class than Oe (namely Oe − {σ}) that α knows it is a member of. Contradiction. Hence σ has the same total evidence as α.

This completes the proof that Oα = Oe.


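Let M(hj) be the class of worlds wi where hj is true and for which O(wi) ∩ Oe is non-empty. We can thus write:

$$P_\alpha(h_j \mid e) = \frac{1}{\gamma} \sum_{w_i \in M(h_j)} \; \sum_{\sigma \in O(w_i) \cap O_{h_j} \cap O_e} \frac{P_\alpha(w_i)}{\lvert O_e \cap O(w_i) \rvert}$$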


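Since hj is true in wi if wi ∈ M(hj), we have O(wi) ⊆ Ohj, giving:

$$P_\alpha(h_j \mid e) = \frac{1}{\gamma} \sum_{w_i \in M(h_j)} P_\alpha(w_i) \sum_{\sigma \in O(w_i) \cap O_e} \frac{1}{\lvert O_e \cap O(w_i) \rvert} = \frac{1}{\gamma} \sum_{w_i \in M(h_j)} P_\alpha(w_i)$$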


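For each hj that implies the existence of at least one observer-moment compatible with e,9 O(wi) ∩ Oe is non-empty for each wi in which hj is true. For such an hj we therefore have

$$P_\alpha(h_j \mid e) = \frac{1}{\gamma} \sum_{w_i \in M(h_j)} P_\alpha(w_i) = \frac{1}{\gamma} P_\alpha(h_j)$$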

9 We say that an observer-moment α is incompatible with e iff α ∉ Oe.


Forming the ratio between two such hypotheses, hj and hk, we thus find that this is unchanged under conditionalization on e,

$$\frac{P_\alpha(h_j \mid e)}{P_\alpha(h_k \mid e)} = \frac{P_\alpha(h_j)}{P_\alpha(h_k)}$$


This means that e does not selectively favor any of the hypotheses hj that imply that some observer-moment is compatible with e.

Since this consequence is unacceptable, we must reject ℜ0. Any workable reference class definition must permit reference classes to contain observer-moments that are subjectively distinguishable. The reference class definition is in this sense non-trivial.
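The effect established by the proof can be seen in a toy version of the freak-observer case. The Python sketch below is illustrative only: it assumes one possible world per theory, so that the observation equation reduces to weighting each theory’s prior by the fraction of reference-class members in its world who have evidence e. The observer counts (99 normal observers and 1 freak per world), the unequal priors, and the function name are assumptions chosen to make the contrast visible.

```python
from fractions import Fraction

def posterior_from_counts(prior, n_e, n_ref):
    """Posterior over theories, one possible world per theory.

    n_e[t]   : number of observer-moments in t's world whose evidence is e
    n_ref[t] : number of observer-moments in t's world that lie in the
               reference class of an observer-moment with evidence e
    """
    weights = {t: prior[t] * Fraction(n_e[t], n_ref[t]) for t in prior}
    gamma = sum(weights.values())
    return {t: w / gamma for t, w in weights.items()}, gamma

prior = {"T1": Fraction(1, 3), "T2": Fraction(2, 3)}
n_e = {"T1": 99, "T2": 1}  # observers who see what we see

# Minimal definition: the reference class contains only those with evidence e.
post_min, _ = posterior_from_counts(prior, n_e, n_ref=n_e)

# Broader definition: normal and freak observers share a reference class.
post_broad, _ = posterior_from_counts(prior, n_e, n_ref={"T1": 100, "T2": 100})

print(post_min["T1"], post_broad["T1"])  # 1/3 99/101
```

Under the minimal definition the posterior simply reproduces the prior, however lopsided the observer counts are. Once the observers making other observations are admitted to the reference class, the same evidence strongly favors T1.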

Observer-moments that are incompatible with e can thus play a role in determining the credence of observer-moments whose total evidence is e. This point is important. To emphasize it, we will give another example (see figure 8):

Blackbeards & Whitebeards

Two theories, T1 and T2, each say that there are three rooms, and the two theories are assigned equal prior probabilities. On T1, two of the rooms contain observers with black beards and the third room contains an observer with a white beard. On T2, one room contains a black-bearded observer and the other two contain white-bearded observers. All observers know what color their own beard is (but they cannot see into the other rooms). You find yourself in one of the rooms as a blackbeard. What credence should you give to T1?

We can see, by analogy to the Big World cosmology case, that the answer should be that observing that you are a blackbeard gives you reason to favor T1 over T2. But if we use ℜ0, we do not get that result.


In the observer-moment graph of this gedanken (figure 9), α1, β1, and β2 are the blackbeard observer-moments, and e is the information possessed by such an observer-moment (“this observer-moment is a blackbeard”). h is the hypothesis that T1 is true. Given OE and ℜ0, the observer-moments are partitioned into two reference classes: the blackbeards and the whitebeards (assuming that they are not subjectively distinguishable in any other way than via their beard color). Thus, for example, α1 belongs to the reference class Oα1 = {α1, β1, β2}.


This gives Pα1(h|e) = 1/2 (with γ = 1). Hence, according to ℜ0, the blackbeards’ credence in T1 should be the same as their credence in T2, which is wrong.

A broader definition of the reference class will give the correct result. Suppose all observer-moments in Blackbeards and Whitebeards are included in the same reference class (figure 10):


This gives Pα1(h|e) = 2/3. That is, observer-moments that find that they have black beards obtain some reason to think that T1 is true.
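Both results for Blackbeards & Whitebeards can be reproduced with a few lines of Python. The sketch is a hedged illustration: the room layout is taken from the description of the gedanken, while the function name `credence` and the device of passing the reference class definition as a predicate on beard colors are assumptions made for the example.

```python
from fractions import Fraction

# Rooms under each theory: "B" is a blackbeard, "W" a whitebeard.
rooms = {"T1": ["B", "B", "W"], "T2": ["B", "W", "W"]}
prior = {"T1": Fraction(1, 2), "T2": Fraction(1, 2)}

def credence(theory, evidence, same_class):
    """Credence in `theory` for an observer whose beard color is `evidence`.

    same_class(x, y) says whether observers with beard colors x and y
    belong to the same reference class.
    """
    def weight(t):
        n_e = rooms[t].count(evidence)
        n_ref = sum(1 for r in rooms[t] if same_class(evidence, r))
        return prior[t] * Fraction(n_e, n_ref)

    gamma = sum(weight(t) for t in rooms)
    return weight(theory) / gamma

minimal = credence("T1", "B", lambda x, y: x == y)  # blackbeards only
broad = credence("T1", "B", lambda x, y: True)      # everyone together
print(minimal, broad)  # 1/2 2/3
```

When only blackbeards count as members of a blackbeard’s reference class, the number of blackbeards in each world cancels out and the credence stays at 1/2. When whitebeards are included, the world with more blackbeards receives more weight and the credence in T1 rises to 2/3.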

This establishes boundaries for how the reference class can be defined. The reference class to which an observer-moment a belongs consists of those and only those observer-moments that are relevantly similar to a. We have just demonstrated that observer-moments can be relevantly similar even if they are subjectively distinguishable. And we saw earlier that if we reject the paradoxical recommendations in Adam & Eve, Quantum Joe, and UN++ that follow from using the universal reference class definition ℜU then we also must maintain that not all observer-moments are relevantly similar. We thus have ways of testing a proposed reference class definition. On the one hand, we may not want it to be so permissive as to give counterintuitive results in Adam & Eve, et al. (Scylla). On the other hand, it must not be so stringent as to make cosmological theorizing impossible because of the freak-observer problem (Charybdis). A maximally attractive reference class definition would seem to be one that steers clear of both these extremes.

A subjective factor in the choice of reference class?

A reference class definition is a partition of possible observer-moments; each equivalence class in the partition is the reference class for all the observer-moments included in it. If ℜ is any permissible reference class definition then we have in general ℜ0 ⊆ ℜ ⊆ ℜU, where “⊆” denotes the relation “less (or equally) coarse-grained than”. We have argued above that there are cases showing that (“⊂” meaning “strictly less coarse-grained than”):

ℜ0 ⊂ ℜ (“ℜ0-bound”)

And if we reject the counterintuitive advice to Adam & Eve, et al. then there also are cases committing us to:

ℜ⊂ℜU (“ℜU-bound”)
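Since a reference class definition is a partition, the coarse-graining relation can be tested directly: one partition is less (or equally) coarse-grained than another when each of its cells lies inside some cell of the other. The Python sketch below illustrates this on a six-member set of observer-moments. The labels, the three-cell partition standing in for ℜ0, and the intermediate partition `r_mid` are hypothetical examples and are not proposed as permissible definitions.

```python
def refines(fine, coarse):
    """True if every cell of `fine` lies inside some cell of `coarse`,
    i.e. `fine` is less (or equally) coarse-grained than `coarse`."""
    return all(any(cell <= big for big in coarse) for cell in fine)

def strictly_refines(fine, coarse):
    """True if `fine` is strictly less coarse-grained than `coarse`."""
    return refines(fine, coarse) and not refines(coarse, fine)

moments = {"b1", "b2", "b3", "b4", "a1", "a2"}

# Stand-in for the minimal definition: early moments (beard color unknown),
# moments that know they are blackbeards, and the moment that knows it is
# a whitebeard.
r_0 = [frozenset({"b1", "b3", "a1"}), frozenset({"b2", "b4"}), frozenset({"a2"})]
# A hypothetical intermediate definition that merges all late moments.
r_mid = [frozenset({"b1", "b3", "a1"}), frozenset({"b2", "b4", "a2"})]
# The universal definition: one cell containing everything.
r_u = [frozenset(moments)]

print(strictly_refines(r_0, r_mid), strictly_refines(r_mid, r_u))  # True True
print(refines(r_u, r_0))  # False
```

A candidate definition that satisfies both bounds sits strictly between the two extremes in this ordering; the check says nothing about whether the candidate is otherwise defensible.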

One may also want to impose a condition of “non-arbitrariness” to the effect that completely arbitrary or irrelevant differences between two observer-moments are not a ground for placing them in separate reference classes. Of course, we haven’t defined what counts as a “non-arbitrary” or “relevant” difference—indeed, it might be one of those notions that do not permit of an exact definition; but it may still be useful to have a label for this generic theoretical desideratum that significant distinctions be based on relevant differences.

Within these constraints, there is room for diverging reference class definitions. In the next chapter, we shall establish a further constraint as well as identify several considerations that are pertinent in electing a reference class. One cannot rule out that there are new arguments waiting to be discovered that will impose additional limitations on legitimate reference class definitions, conceivably even narrowing the field to one uniquely correct choice.

One idea that might be worth exploring is that in anthropic reasoning one should reason in such a way that one is following a rule that one thinks will maximize the expected fraction of all observer-moments applying it that will be right.10

10 A refinement would be to recognize that being right or wrong is not a binary matter, so one may instead say we should try to minimize the expected value of an error-term that takes into account how much an observer-moment’s degree of belief in a proposition Ψ deviates from the truth value of Ψ.

If many of the different observer-moments in the Big-World cosmology case (including both many of those observing CMB = 2.7 K and those observing CMB = 3.1 K) are said to be applying the same rule (and likewise the black-beard and the whitebeard observer-moments in Blackbeards & Whitebeards), and yet the later observer-moments in e.g. the Adam & Eve gedanken are taken to be applying a different rule than the early observer-moments (since the later ones are, after all, in no uncertainty at all about the outcome of the carnal embrace and may thus not be applying any non-trivial rule of anthropic reasoning to the problem at hand), then this meta-principle may be able to give the results we want. In order to move forward with this idea, however, we would need to have good criteria for determining which observer-moments should be said to be applying the same rule, and rule-following is notoriously a tricky concept to explicate. Another problem is that when calculating the expected fraction of rule-applying observer-moments that will be right, one needs credences as input in order to perform the calculation, and what these credences should be is itself dependent on which rule is adopted—so that maybe the best one could hope for from this approach would be to eliminate those rules that by their own standards are inferior to some other rule.

My suspicion is that at the end of the day there will remain a subjective factor in the choice of reference class. Yet, I think there is a subjective element too in the choice of an ordinary Bayesian prior credence function over the set of non-centered possible worlds. I don’t believe that every such possible non-indexical function is rationally defensible; but I think that after everything has been said and done, there is a class of non-indexical credence functions that would all be defensible in the sense that intelligent, rational, and reasonable thinkers could have any of these credence functions even in an idealized state of reflective equilibrium. (This could perhaps be said to be something of a “received view” among Bayesian epistemologists.) What we are suggesting here is that a similar subjective element exists in credence assignments to indexical propositions, and that this is reflected in the fact that there are many permissible choices of reference class. And isn’t this just what one should have expected? Why think there is no room for rational disagreement regarding the indexical part of belief-formation while there is very considerable room for disagreement between rational thinkers in regard to the non-indexical part of belief-formation? Our theory puts the two domains, the indexical and the non-indexical, on an equal footing. In both, there are constraints on what can be reasonably believed, but these constraints may not single out a uniquely correct credence function.

We are free to seek arguments for additional constraints (we shall find some in the next chapter) and it is an open question how far one can shrink the class of defensible reference class definitions. New constraints can simply be added to the theory since OE itself (along with its quantum sibling) is neutral with respect to choice of reference class.

It may be worth noting that the question of whether there is a subjective factor in the choice of reference class is logically independent from the question of whether the reference class is relative to observer-moments. For conceivably, it could be the case that for any observer-moment there is a unique objectively correct choice of reference class, but it is a different one for different observer-moments. Then we would have relativity together with complete objectivity. Moreover, one could alternatively have the view that if everybody is perfectly rational then every observer-moment must use the same reference class, while admitting that there is no objective ground determining exactly what this common reference class should be. Then we would have a degree of subjectivity together with complete absence of relativity. An example of this latter kind of view would be to think that there is a compelling argument for adopting a universal reference class definition (ℜU) while admitting that there is no compelling reason for picking any particular delineation of what counts as an observer-moment.

We shall continue our discussion of the reference class problem at the end of the next chapter.