Let's start with the good stuff, namely, free money.
The Envelope Switching Riddle
This is a classic, and one of my all-time favorites. To make the math a little easier, I'll start with a variant (not the actual riddle) that goes like this:
1) I hand you an envelope full of money. Yay!
2) I am holding two more envelopes. That's three total. Easy math, right?
3) I invite you to open the first envelope, and you find that it has $20 inside.
4) I tell you that the other two envelopes contain one half that amount, and twice that amount, respectively.
5) For simplicity, let's assume that you believe me, and that you aren't concerned with risk aversion or expected utility (the subjective value of a particular amount in your circumstances).
6) I offer you the chance to switch by selecting another envelope at random. Should you do it?
Now the math gets a little harder, but still nothing a third-grader couldn't handle. The other two envelopes must contain $10 and $40. If you select one at random, the probability of it containing $10 is 50% (in math terms, we'd say that p1=0.5). The probability of it containing $40 is identical: p2=0.5.
In layperson's terms, you have equal chances of losing $10 and gaining $20.
Mathematically, the expected payoff is a sum of probabilities:
mean payoff = p1 * (-$10) + p2 * (+$20) = -$5 + $10 = +$5
You will gain, on average, five dollars (imagine two average attempts, at +$20 and -$10 for a net gain of +$10, giving an average of +$5 per attempt; make sense?). And I think you'd also agree that-- if the conditions are the same-- the amount in the first envelope doesn't matter. Starting with $x gives a possible gain of +$x and a possible loss of -$x/2, for an average payoff of +$x/4. It's always a good bet to switch.
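If you'd rather check that arithmetic empirically, here's a quick simulation (a sketch; the function name and trial count are my own):

```python
import random

def switch_payoff(first_amount, trials=100_000):
    """Simulate switching from `first_amount` to one of the two
    remaining envelopes (half or double), chosen at random."""
    total = 0
    for _ in range(trials):
        other = random.choice([first_amount / 2, first_amount * 2])
        total += other - first_amount
    return total / trials

# Starting with $20, the average gain converges to about +$5 (i.e., +$x/4).
print(switch_payoff(20))
```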
The actual riddle is a bit different. There are only two envelopes, and you know that one contains twice as much as the other. Let us assume that you select one of these two at random and discover that it contains $20.
You are then offered the chance to switch. Should you do it?
The other envelope might contain $10, and it might contain $40, so if you applied the earlier calculations you could decide, again, that the average payoff is +$5 and it's a good idea. But you might feel a bit uneasy about that, and for good reason.
Continuing to apply the previous logic, it would always be a good idea to switch, regardless of the amount. Which means there is no need to check the money inside the envelope. Which means-- strangely-- that after switching, it's a good idea to switch again. And again. And again, ad nauseam.
It ought to be readily apparent that you cannot increase your payoff by constantly switching two mysterious envelopes back and forth. And thus the riddle is normally phrased: What is the fallacy?
There are actually several correct answers to that, because you have to make a best guess as to what the envelope-switcher is thinking. I'm going to talk about the more epistemologically interesting one, which is related to...
Bayes' Theorem

We don't need to examine the symbolic form of Bayes' Theorem for this discussion, as a simple example will demonstrate the type of error that occurs when it isn't used:
Suppose you get a blood test that is 90% accurate in checking for a disease called "Burrow Syndrome". It's positive. What are the chances that you actually have Burrow Syndrome?
The naive answer is 90%. And I use the term "naive" quite literally, not as a synonym for "stupid". The problem with that answer is that it doesn't apply Bayes' Theorem by taking into account the distribution of Burrow Syndrome across the population.
What if, for example, the disease afflicts 1% of the population? Then out of 1000 people tested, only 10 will have the disease and 990 will not. Given that the test is 90% accurate, the results will be spread out thusly:
False negative: 1 person
Correct positive: 9 people
Correct negative: 891 people
False positive: 99 people
As you can see, 90% of the people (891 + 9 = 900) got an accurate result. But, of those with a positive test, only 9 out of 108 (~8.3%) actually have Burrow Syndrome!
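The same numbers fall straight out of Bayes' Theorem; a short calculation (variable names mine) looks like this:

```python
# Bayes' Theorem on the Burrow Syndrome numbers above:
# P(disease | positive) = P(positive | disease) * P(disease) / P(positive)
accuracy = 0.90          # the test is right 90% of the time
prevalence = 0.01        # 1% of the population has the disease

p_pos_given_disease = accuracy
p_pos_given_healthy = 1 - accuracy   # false-positive rate
p_positive = (p_pos_given_disease * prevalence
              + p_pos_given_healthy * (1 - prevalence))

p_disease_given_pos = p_pos_given_disease * prevalence / p_positive
print(round(p_disease_given_pos, 3))  # ~0.083, i.e. 9 out of 108
```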
In a nutshell, Bayes tells us to incorporate prior probability. It also warns us to be wary when information about prior probability is not available.
The Principle of Indifference can be used in conjunction with Bayes' Theorem to explain learning and belief processes.
If you have a three-year-old handy, or can borrow one from a neighbor, try this experiment: Ask the child in question whether or not the sun will rise tomorrow. You can expect a shoulder shrug, a 'maybe', or lots of giggling and running around.
Now wait five years, or borrow someone's eight-year-old. Tell the child that the sun is not going to rise tomorrow, because "the sun doesn't rise every day". If the child agrees, I promise that she's just playing along with your game. She knows the sun will rise tomorrow, because it rises every day.
What's happening here is a process of Bayesian Learning. When the three-year-old first understands the concepts of days and sunrises, he applies the Principle of Indifference and concludes that the sun is equally likely to rise or not:
Rise weight = 1
Dark weight = 1
ergo, p(rise) = p(dark) = 1/2
Then, when the sun comes up the next day, he increments the probability of that scenario as follows:
Rise weight = prior Rise weight + 1 = 2
Dark weight = 1
ergo, p(rise) = 2/3 and p(dark) = 1/3
After a week, the sunrise is much more probable, but still far from certain:
Rise weight = 8 (after seven days)
Dark weight = 1
p(rise) = 8/9
p(dark) = 1/9
At this point, you still have a chance to convince him that the sun only rises sometimes. But by the time he's eight, p(rise) = ~1799/1800, which is pretty much a certainty.
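The whole progression is one line of arithmetic; a small sketch (the function name is mine) makes it concrete:

```python
def p_rise(days_observed):
    """Laplace-style update: start with Rise = Dark = 1, then add one
    to the Rise weight for every sunrise observed."""
    rise_weight = 1 + days_observed
    dark_weight = 1
    return rise_weight / (rise_weight + dark_weight)

print(p_rise(0))     # 0.5 -- pure indifference
print(p_rise(7))     # 8/9 -- after a week
print(p_rise(1798))  # 1799/1800 -- pretty much a certainty
```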
Smart kids, huh? But don't forget how far off the toddler was to start with. His probability estimation of sunrise was an expression of ignorance (as part of a learning strategy), not an expression of class probability.
Back to the Money
In the pre-riddle example, you had $20 to start with and two more envelopes (calculated at $40 & $10). Those switch-ready envelopes constituted a class:
In class probability we know everything about the entire class of events or phenomena, but we know nothing particular about the individuals making up the class. For example, if we roll a fair die we know the entire class of possible outcomes, but we don't know anything about the particular outcome of the next roll—save that it will be an element of the entire class.(http://mises.org/daily/2615)
Class probability allows us to compute something like an average payoff. The alternative does not:
Case probability is applicable when we know some of the factors that will affect a particular event, but we are ignorant of other factors that will also influence the outcome.
In case probability, the event in question is not an element of a larger class...
It is purely metaphorical when people use the language of the calculus of probability in reference to events that fall under case probability. For example, someone can say "I believe there is a 70 percent probability that Hillary Clinton will be the next president."
Yet upon reflection, this statement is simply meaningless. The election in question is a unique event, not a member of a larger class where such frequencies could be established.
In the riddle-as-presented, we are dealing with a unique event. Having drawn $20, there is only one remaining envelope. It may contain $10 and it may contain $40, just as (to the toddler) the sun may or may not rise tomorrow, but this does not provide a calculable frequency of outcomes.
One might protest that there are equal chances of drawing the higher- or lower-valued envelope, and thus they constitute a class. Well, they do. But the elements of that class are "higher" and "lower"-- it only means that an unexamined envelope has a 0.5 probability of being the lower value.
This is just like the test for Burrow Syndrome: An unexamined test has a 0.9 probability of accuracy, but once you peek at the result (i.e., positive or negative) you can no longer make an accurate probability statement without knowing the priors. While 50% of randomly-selected envelopes will have the lower value, what percentage will have the lower value and be equal to $20? We simply don't know.
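To see how much the missing prior matters, here's a purely hypothetical sketch: suppose (for illustration only; these numbers are invented) the envelope-stuffer draws the smaller amount from a distribution we somehow know. Having seen $20, the pair is either ($10, $20) or ($20, $40), and the prior decides which is more likely:

```python
# Made-up prior over the SMALLER amount in the pair.
prior = {10: 0.7, 20: 0.3}

# P(pair | you drew $20): in either pair you draw the $20 envelope half
# the time, so those 0.5 factors cancel and the prior decides.
p_pair_10_20 = prior[10] / (prior[10] + prior[20])   # you hold the LARGER
p_pair_20_40 = prior[20] / (prior[10] + prior[20])   # you hold the SMALLER

expected_gain = p_pair_10_20 * (10 - 20) + p_pair_20_40 * (40 - 20)
print(round(expected_gain, 2))  # -1.0: with THIS prior, switching loses money
```

With a different prior, switching could just as easily be profitable-- the point is that without one, no expected-payoff calculation is possible at all.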
Partial information is deceptive.
Let's try one more evil scenario: I hand you an envelope that may or may not have money in it. I've done this before with other people, and there is definitely money sometimes (and sometimes not). Also, there is a number written on the back, and if the envelope does contain money, that number is definitely the amount.
Want to buy the envelope? I'll sell it to you for $100. No? Not enough information? Okay, flip it over and read the number.
It says "$1000". Holy schneikies! It might be empty, but if it does have money, it's got $1000 inside! You have more information now-- sort of. You've eliminated an infinite number of other possibilities ($1, $73, $523,234.42, etc.). You could apply the principle of indifference and say that p($1000) = 0.5, in which case the "average" amount inside is $500, which makes your "average" net gain $400 (purchasing for $100).
Alas, you're not buying it.
So all this was on my mind recently after reading an article on Cracked called 5 UFO Sightings That Even Non-Crazy People Find Creepy.
UFO sightings often enter into speculation along with questions like whether or not we're alone in the universe, whether anyone out there is trying to communicate with us, and what to do when you're captured by an inquisitive alien.
But they shouldn't.
As "creepy" as some of those UFO sightings are, they don't tell us any more about alien visitors than a large number on the back of an envelope tells us about its contents. Once you understand Bayesian Learning, it's easy to see where the fallacy comes from.
Now, even UFOlogists/believers admit that some 90% of sightings can be explained by mundane phenomena such as weather balloons, ball lightning, hoaxes, and the looming planet Venus.
Conversely, skeptics agree that not all sightings are fully explained, and I think most would agree that a list of known scenarios doesn't cover everything.
Let's favor the skeptics and say that "known scenarios" cover 99% of all UFO sightings. Hell, let's make it 99.9%. Even with that, a Bayesian Learner could decide that the remaining scenarios either result from alien visitors (.05%) or not (.05%). From there, belief is just a matter of volume.
Given a single sighting, the naive learner calculates a 1/2000 probability of alien visitors (1/2 of .1%).
After 100 sightings, the probability rises to about 4.9%
After 1000 sightings, it's 39.4%
After 10,000 sightings, the perceived likelihood of alien visitors is 99.3%
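These figures can be reproduced by asking for the chance that at least one of n sightings is a genuine alien visit-- a sketch, assuming that's the model behind the numbers:

```python
# Treat each sighting as an independent 1-in-2000 shot at being a
# genuine alien visit (1/2 of the unexplained 0.1%), and ask for the
# probability that AT LEAST ONE of n sightings was genuine.
p_alien_per_sighting = 1 / 2000

def p_any_alien(n_sightings):
    return 1 - (1 - p_alien_per_sighting) ** n_sightings

for n in (1, 100, 1000, 10_000):
    print(n, round(p_any_alien(n), 3))
```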
Seen that way, it's easy to understand why some people are so convinced. But the calculations are just as faulty as the toddler's estimation that the sun has only a 50% chance of rising tomorrow: Ignorance has been mistaken for class probability.
So, what is the likelihood that we share the galaxy with other intelligent life? Or that we have alien visitors, or at least observers, in some capacity?
Frank Drake (no relation) of the SETI institute devised an equation to calculate the average number of detectable, intelligent civilizations in a galaxy at any given time. It looks something like this:
N = R * fp * ne * fl * fi * fc * L
Those variables represent things like the rate of star formation, the fraction of stars with habitable planets, the fraction of those which develop life, etc. The logic of the equation is straightforward enough, although the actual values are pretty speculative.
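Plugging in numbers is trivial; here's a sketch with purely illustrative placeholder values (every number below is a guess for demonstration, not a measurement):

```python
# Drake Equation with ILLUSTRATIVE values only.
R  = 1.0     # new stars formed per year in the galaxy
fp = 0.5     # fraction of stars with planets
ne = 2.0     # habitable planets per star with planets
fl = 0.3     # fraction of those that develop life
fi = 0.1     # fraction of life-bearing planets that develop intelligence
fc = 0.1     # fraction of those that become detectable
L  = 10_000  # years a civilization remains detectable

N = R * fp * ne * fl * fi * fc * L
print(round(N, 2))  # 30.0 detectable civilizations, with these guesses
```

Change any single guess by an order of magnitude and N follows suit-- which is exactly why the output is only as good as the speculation going in.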
But speculate people will. After a number of scientists put in their best guesses for the numbers, they came up with an answer that suggested multiple, intelligent, communicating civilizations in the galaxy. Furthermore, even without assuming any radical breakthroughs in science, it ought to be a trivial matter for an advanced technological civilization to physically explore and even colonize the entire galaxy. The distances involved are... well, astronomical, but a conservative estimate of 10 million years to cross the length of the Milky Way is to be considered in light of the fact that the galaxy is several billion years old.
And thus was born Fermi's Paradox (historically prior to the Drake Equation, but based on the same reasoning): Where are the aliens?
The folks at SETI speculate that the aliens have left us alone and/or have no interest in interstellar travel, but some are communicating and we're right on the technological threshold of detecting them.
Folks who believe in alien UFOs speculate that they are here, and we've even spotted them.
Either or both may be right or wrong. I have no way of knowing. But I can tell you something about the relationship between the "evidence" and the conclusions...
The problem with the UFO proponents' cosmological argument is that it makes an unsupported-- and unconsidered-- connection to the UFO sightings. Even assuming that intelligent aliens exist in the galaxy, what is the probability that they have physically traveled to our solar system?
Consider that technologically-advanced aliens ought to be able to conduct unlimited observations without coming any closer than the moon. Even if they wanted physical samples, those could be collected by undetectable microscopic robots. So what's the probability that they actually fly ships into our atmosphere?
And if they did that, surely they would be undetectable to our radar, to say nothing of the naked eye. What's the probability that they're incompetent?
And finally, if they are so incompetent, what's the probability that they've still managed to avoid leaving any definitive physical evidence?
My point is not that alien visitors do not exist. Personally, I think it's possible; it seems to be the likely behavior of an advanced and not-so-evil civilization. My point is that even if such do exist, even if they're flying their ships through our atmosphere and abducting random humans, it doesn't necessarily mean that a single one of the UFO sightings or abduction stories results from actual alien visitors.
The problem with the Drake Equation is not so much that the multiplying factors are so speculative. Even if they're smack dab right on, they only constitute a class probability when used to consider the average number of ambitious aliens in an average galaxy at an average time. The question of whether or not we have advanced neighbors in our own galaxy, at this particular time, is one of case probability. It is a unique situation with a unique answer.
If I may draw one more painful metaphor, it's like speculating on whether or not your best friend's birthday is in July. You might do a study and find that the "average" birthday is July 2nd. Does that tell you anything? Or: the "average" person has a 1/12 chance of a July birthday. Any better? At best, the Drake Equation is a guide for actions that might prove fruitful (i.e., the SETI project), but more likely, it will only tell us after-the-fact that we got lucky or not.
And finally, if those speculative numbers are correct, it's possible that intelligent, expansive life can only develop once in a galaxy, and that all others will be pre-empted by the colonization process; in which case there is an observer bias that invalidates the equation (locally). If we are here, then we might well be the first intelligent species to arise in this part of the universe.
And speaking of birthdays,