Archive for the ‘– By the Mathematician’ Category

Q: What’s the chance of getting a run of K or more successes (heads) in a row in N Bernoulli trials (coin flips)? Why use approximations when the exact answer is known?

Saturday, July 24th, 2010

The original question was: Recently I’ve come across a task to calculate the probability that a run of at least K successes occurs in a series of N (K≤N) Bernoulli trials (weighted coin flips), i.e. “what’s the probability that in 50 coin tosses one has a streak of 20 heads?”. This turned out to be a very difficult question and the best answer I found was a couple of approximations.

So my question to a mathematician is: “Why is this task so difficult compared e.g. to finding the probability of getting 20 heads in 50 coin tosses solved easily using binomial formula? How is this task in fact solved – is there an exact analytic solution? What are the main (preferably simplest) approximations and why are they used instead of an analytic solution?”

Physicist: What follows was not obvious. It was the result of several false starts. It’s impressive in the same sense that a dude with seriously bruised legs, who can find anything in his house with the lights off, is also impressive. If you’re looking for the discussion of the gross philosophy, and not the detailed math, skip down to where you find the word “Jabberwocky” in bold. If you want specific answers for fixed, small numbers of coins, or you want sample computer code for calculating the answer, go to the bottom of the article.

The short answer: the probability, S, of getting K or more heads in a row in N independent attempts (where p is the probability of heads and q=1-p is the probability of tails) is:

S(N,K) = p^K\sum_{T=0}^\infty {N-(T+1)K\choose T}(-qp^K)^T-\sum_{T=1}^\infty {N-TK\choose T}(-qp^K)^T

Note that here {A\choose B} is the choose function (also called the binomial coefficients) and we are applying a non-standard convention that {A\choose B}= 0 for A < B which makes the seemingly infinite sums always have only a finite number of terms. In fact, for N and K fixed, the answer is a polynomial with respect to the variable p.

Originally this was a pure math question that didn’t seem interesting to a larger audience, but we both worked for long enough on it that it seems a shame to let it go to waste. Plus, it gives me a chance to show how sloppy (physics type) mathematics is better than exact (math type) mathematics.

Define {Xi}j as the list of results of the first j trials. e.g., if j=4, then {Xi}j might be “{Xi}4=HTHH” or “{Xi}4=TTTH” or something like that, where H=heads and T=tails. In the second case, X1=T, X2=T, X3=T, and X4=H.

Define “Ej” as the event that there is a run of K successes (heads) in the first j trials. The question boils down to finding P(EN).

Define “Aj” as the event that the last K terms of {Xi}j are T followed by K-1 H’s ({Xi}j = X1X2X3X4…THHH…HHH). That is to say, if the next coin (Xj+1) is heads, then you’ve got a run of K.

Finally, define p = P(H) and q = P(T) = 1-p. Keep in mind that a “bar” over an event means “not”. So “\overline{H}=T” reads “not heads equals tails”

The probability of an event is the sum of the probabilities of the different (disjoint) ways that event can happen. So:

\begin{array}{ll}&P(E_{j+1})\\i)&=P(E_{j+1}\cap E_j\cap A_j)+P(E_{j+1}\cap E_j\cap \overline{A_j})+P(E_{j+1}\cap \overline{E_j}\cap A_j)+P(E_{j+1}\cap \overline{E_j}\cap \overline{A_j})\\ii)&=\left[P(E_{j+1}\cap E_j\cap A_j)+P(E_{j+1}\cap E_j\cap \overline{A_j})\right]+P(E_{j+1}\cap \overline{E_j}\cap A_j)+P(E_{j+1}\cap \overline{E_j}\cap \overline{A_j})\\iii)&=\left[P(E_{j+1}\cap E_j)\right]+P(E_{j+1}\cap \overline{E_j}\cap A_j)+P(E_{j+1}\cap \overline{E_j}\cap \overline{A_j})\\iv)&=P(E_j)+P(E_{j+1}\cap \overline{E_j}\cap A_j)+P(E_{j+1}\cap \overline{E_j}\cap \overline{A_j})\\v)&=P(E_j)+P(E_{j+1}\cap \overline{E_j}\cap A_j)+0\\vi)&=P(E_j)+P(E_{j+1}|\overline{E_j}\cap A_j)P(\overline{E_j}\cap A_j)\\vii)&=P(E_j)+pP(\overline{E_j}\cap A_j)\\viii)&=P(E_j)+pP(\overline{E_j}| A_j)P(A_j)\\ix)&=P(E_j)+pP(\overline{E_j}| A_j)qp^{K-1}\\x)&=P(E_j)+qp^KP(\overline{E_{j-k}})\\xi)&=P(E_j)+qp^K\left[1-P(E_{j-k})\right]\\xii)&=P(E_j)+qp^K-qp^KP(E_{j-k})\end{array}

iv) comes from the fact that E_j \subset E_{j+1}. If you have a run of K heads in the first j trials, of course you’ll have a run in the first j+1 trials. v) The zero comes from the fact that if the first j terms don’t have a run of K heads and the last K-1 terms are not all heads, then it doesn’t matter what the j+1 coin is, you can’t have a run of K heads (you can’t have the event Ej+1 and not Ej and not Aj). vii) is because if there is no run of K heads in the first j trials, but the last K-1 of those j trials are all heads, then the chance that there will be a run of K in the first j+1 trials is just the chance that the next trial comes up heads, which is p. ix) the chance of the last K trials being a tails followed by K-1 heads is qpK-1. x) If the last K (of j) trials are a tails followed by K-1 heads, then whether a run of K heads does or doesn’t happen is determined in the first j-K trials.
The other steps are all probability identities (specifically P(C)=P(C\cap D)+P(C\cap \overline{D}), \, P(\overline{C})=1-P(C), and Bayes’ theorem: P(C\cap D)=P(C|D)P(D)).

Rewriting this with some N’s instead of j’s, we’ve got a kick-ass recursion: P(E_N)=P(E_{N-1})+qp^K-qp^KP(E_{N-K-1})

And just to clean up the notation, define S(N,K) as the probability of getting a string of K heads in N trials (up until now this was P(EN)).

S(N,K)=S(N-1,K)+qpK-qpKS(N-K-1,K). We can quickly figure out two special cases: S(K,K) = pK, and S(l,K) = 0 when lK, and that there’s no way of getting K heads without flipping at least K coins. Now check it:

\begin{array}{ll}&S(N,K)\\i)&=S(N-1,K)+qp^K-qp^KS(N-K-1,K)\\ii)&=\left[S(N-2,K)+qp^K-qp^KS(N-K-2,K)\right]+qp^K-qp^KS(N-K-1,K)\\iii)&=S(N-2,K)+2qp^K-qp^K\left[ S(N-K-1,K)+S(N-K-2,K)\right]\\iv)&=S(N-3,K)+3qp^K-qp^K\left[ S(N-K-1,K)+S(N-K-2,K)+S(N-K-3,K)\right]\\v)&=S(N-(N-K),K)+(N-K)qp^K-qp^K\left[ S(N-K-1,K)+\cdots+S(N-K-(N-K),K)\right]\\vi)&=S(K,K)+(N-K)qp^K-qp^K\left[ S(N-K-1,K)+\cdots+S(0,K)\right]\\vii)&=p^K+(N-K)qp^K-qp^K\sum_{r=0}^{N-K-1} S(r,K)\\viii)&=p^k+(N-K)qp^K-qp^K\sum_{r=K}^{N-K-1} S(r,K)\end{array}

ii) Plug the equation for S(N,K) in for S(N-1,K). iii-vi) is the same thing. vii) write the pattern as a sum. viii) the terms up to K-1 are all zero, so drop them.

Holy crap! A newer, even better recursion! It seems best to plug it back into itself!

\begin{array}{ll}i)&S(N,K)=p^K+(N-K)qp^K-qp^K\sum_{r=K}^{N-K-1} S(r,K)\\ii)&=p^K+(N-K)qp^K-qp^K\sum_{r=K}^{N-K-1} \left[p^k+(r-K)qp^K-qp^K\sum_{\ell=K}^{r-K-1} S(\ell,K)\right]\\iii)&=p^K+(N-K)qp^K-qp^K\sum_{r=K}^{N-K-1}p^k-qp^K\sum_{r=K}^{N-K-1}(r-K)qp^K+\left(qp^K\right)^2\sum_{r=K}^{N-K-1}\sum_{\ell=K}^{r-K-1} S(\ell,K)\\iv)&=p^K+(N-K)qp^K-qp^K\sum_{r=1}^{N-2K}p^k-\left(qp^K\right)^2\sum_{r=0}^{N-2K-1}r+\left(qp^K\right)^2\sum_{r=2K+1}^{N-K-1}\sum_{\ell=K}^{r-K-1} S(\ell,K)\\v)&=p^K+(N-K)qp^K-p^k(N-2K)qp^K-\left(qp^K\right)^2\frac{(N-2K)(N-2K-1)}{2}+\left(qp^K\right)^2\sum_{r=2K+1}^{N-K-1}\sum_{\ell=K}^{r-K-1} S(\ell,K)\\vi)&=p^K+{N-K\choose 1}qp^K-p^K{N-2K\choose 1}qp^K-{N-2K\choose 2}\left(qp^K\right)^2+\left(qp^K\right)^2\sum_{r=2K+1}^{N-K-1}\sum_{\ell=K}^{r-K-1} S(\ell,K)\end{array}

You can keep plugging the “newer recursion” back in again and again (it’s a recursion after all). Using the fact that \sum_{i=1}^N={N+1\choose 2} and \sum_{i=D}^M {\ell \choose D}={M+1 \choose D+1} you can carry out the process a couple more times, and you’ll find that:

\begin{array}{l}S(N,K)=p^K\left[1-{N-2K\choose 1}qp^K+{N-3K\choose 2}\left(qp^K\right)^2\cdots\right]+\left[{N-K\choose 1}qp^K-{N-2K\choose 2}\left(qp^K\right)^2+{N-3K\choose 3}\left(qp^K\right)^3\cdots\right]\\=p^K\sum_{T=0}^\infty {N-(T+1)K\choose T}(-qp^K)^T-\sum_{T=1}^\infty {N-TK\choose T}(-qp^K)^T\end{array}

There’s your answer.

In the approximate (useful) case:

Assume that N is pretty big compared to K. A string of heads (that can be zero heads long) starts with a tails, and there should be about Nq of those. The probability of a particular string of heads being at least K long is pk so you can expect that there should be around E=Nqpk strings of heads at least K long. When E≥1, that means that it’s pretty likely that there’s at least one run of K heads. When E<1, E=Nqpk is approximately equal to the chance of a run of at least K showing up.

Jabberwocky: And that’s why exact solutions are stupid.

Mathematician: Want an exact answer without all the hard work and really nasty formulas? Computers were invented for a reason, people.

We want to compute S(N,K), the probability of getting K or more heads in a row out of N independent coin flips (when there is a probability p of each head occurring and a probability of 1-p of each tail occurring). Let’s consider different ways that we could get K heads in a row. One way to do it would be to have our first K coin flips all be heads, and this occurs with a probability p^{K}. If this does not happen, then at least one tail must occur within the first K coin flips. Let’s suppose that j  is the position of the first tail, and by assumption it satisfies 1 \le j \le K. Then, the probability of having K or more heads in a row in the entire set of coins (given that the first tail occurred at j \le K) is simply the probability of having K or more heads in a row in the set of coins following the jth coin (since there can’t be a streak of K or more heads starting before the jth coin due to j being smaller or equal to K). But this probability of having a streak of K or more after the jth coin is just S(N-j,K). Now, since the probability that our first tail occurs at position j is the chance that we get j-1 heads followed by one tail, so it is p^{j-1} (1-p) . That means that the chance that the first tail occurs on coin j AND there is a streak of K or more heads is given by p^{j-1} (1-p) S(N-j,K). Hence, the probability that the first K coins are all heads, OR coin one is the first tails and the remainder have K or more heads in a row, OR coin two is the first tails and the remainder have K or more heads in a row, OR coin three is the first tails and…, is given by:

S(N,K) = p^{K} + \sum_{j=1,K} p^{j-1} (1-p) S(N-j,K)

Note that what this allows us to do is to compute S(N,K) by knowing the values of S(N-j,K) for  1 \le j \le K. Hence, this is a recursive formula for S(N,K) which relates harder solutions (with larger N values) to easier solutions (with smaller N values). These easier solutions can then be computed using even easier solutions, until we get to S(A,B) for values of A and B so small that we already know the answer (i.e. S(A,B) is very easy to calculate by hand). These are known as our base cases. In particular, we observe that if we have zero coins then there is a zero probability of getting any positive number of heads is zero, so S(0,K) = 0, and the chance of getting more heads than we have coins is zero, so S(N,K) = 0 for K>N.

All of this can be implemented in a few lines of (python) computer code as follows:

An important aspect of this code is that every time a value of S(N,K) is computed, it is saved so that if we need to compute S(N,K) again later it won’t take much work. This is essential for efficiency since each S(N,K) is computed using S(N-j,K) for each j with 1 \le j \le K and therefore if we don’t save our prior results there will be a great many redundant calculations.

For your convenience, here is a table of solutions for S(N,K) for 1 \le N \le 10 and 1 \le K \le 10 (click to see it enlarged):

Q: Will there always be things that will not or cannot be known?

Wednesday, June 30th, 2010

Mathematician: Unfortunately, limits to knowledge seem to be built into the nature of the universe, and even into logic itself.

Relativity: Einstein’s theory of special relativity implies that no information can travel faster than the speed of light. That means that information from sufficiently recent, sufficiently far away events will not have had the time to propagate to us yet, making detailed knowledge of such events impossible. In physics speak, we say that these events are outside of our “past light cone“, “space-like separated” from us, or just “elsewhere”. As long as new events of this type keep happening, there will always be things about which we do not and cannot know.

Quantum Mechanics: The Heisenberg uncertainty principle states that the uncertainty \Delta x we have in a particle’s position and the uncertainty \Delta p we have in the particle’s momentum cannot both be very small at the same time. In particular, the product of these uncertainties is greater than a constant (\Delta x \Delta p > \frac{\hbar}{2}). This implies a fundamental limit to the knowledge that is possible because we can know x accurately or p accurately, but not both.

What’s more, the vast majority of physicists agree that quantum mechanics demonstrates the universe is random at a fundamental level. This means that some events, like the time at which an atom will decay, can be predicted only probabilistically. We can say how likely an atom is to decay in a given time interval, but we will never be able to say precisely when the decay will occur, placing another limitation on what knowledge is possible. (Physicist’s note: After the decay you still can’t say when exactly it happened because according to quantum mechanics the exact time doesn’t actually exist!)

Mathematics: Gödel’s  first incompleteness theorem states (essentially) that any mathematical system  that is able to express elementary arithmetic (and doesn’t contain any contradictions) must contain true arithmetical statements that cannot be proven within that system. Essentially this implies that there will always be true mathematical statements that we cannot prove.


Add to all of these theoretical considerations the enormous (and possibly infinite) number of things that could be known about our physical universe, and the (most definitely) infinite number of true mathematical statements that could be known, and it is clear that there will always be knowledge that is beyond our reach.

Video: How do we know that 1+1=2? A journey into the foundations of math.

Sunday, June 6th, 2010

AskAMathematician.com presents a lecture on the foundations of math and whether we really can know that one plus one equals two. How was math invented? Where does mathematics come from? Are the axioms of math provable? Is math true? Can it be proven on purely logical grounds? Can it be demonstrated empirically? Can it only be justified from a pragmatic perspective? These are some of the questions discussed in the three videos below.

Part 1 of 3:

Part 2 of 3:

Part 3 of 3:

Q: Can math and science make you better at gambling?

Sunday, May 23rd, 2010

Physicist: Yes.

Don’t gamble (mathematically speaking).

Mathematician: Gambling (at casinos, in lotteries, and in most other instances) is expected value negative, even when you play optimally. That means that the average amount of money you will make per play is negative (i.e. you will lose money, on average). It also implies (via the Central Limit Theorem) that when you play many times, there will be a greater than 50% chance that you lose money overall.

There are a few exceptions to this rule. Card counting in games like blackjack can be expected value positive if you are very good at it and increase your bet sizes at the right times, but casinos are savvy and make this difficult (e.g. by shuffling many decks of cards together). Plus, if you get caught, you will get banned, or worse. Betting against friends can also be expected value positive if you can be sure you’re really a better gambler than they are. Poker at casinos (where you play against other gamblers) can also be expected value positive if you’re very good, but since the casino charges you to play, even if you’re better than the other players you may not (on average) come out positive.

A good rule of thumb is this: If you don’t enjoy gambling, then simply don’t do it since (on average) it will just waste your money. If you do enjoy it, then think of it as a recreational activity, not as a way to make money. Before starting, decide how much money you are willing to lose, and if you use that money up, quit immediately. If you assume that you are going to lose then you won’t be disappointed.

All of this being said, if you do decide to gamble, you’d better study up on your probability! It’s important to know what the best course of action is (probabilistically speaking) at each decision point in the game.

Q: What do complex numbers really mean or represent?

Monday, May 3rd, 2010

Physicist: Nothing really.

Complex numbers are very useful for streamlining a lot of different types of math, generalizing ideas, and “closing” the real numbers.  In quantum theory you’ll find that on the most fundamental level the universe seems to prefer complex numbers to real numbers.

But you can’t use them to count or measure stuff for crap, so most people can lives long happy lives without being particularly bothered by complex numbers.  If you can call that living.

Mathematician: There are many ways to view complex numbers, but one of the most intuitive is to think of them as representing points in the plane. Doing so will allow us to interpret basic arithmetic operations like addition and multiplication as performing geometric manipulations.

How does this work? Well, every complex number z can be written as z = x + i y  where x and y are real numbers, and i is the square root of -1. We can then think of z as a point on the (x,y) plane with x being the position along the horizontal (i.e. real) axis, and y the position along the vertical (i.e. imaginary) axis. To understand the operation of adding and multiplying complex numbers though, we need to think about them slightly differently.

The complex number 2+3i represented as a point in the complex plane. Have you ever seen a more boring image in your life?

If we choose, we can treat complex numbers as vectors rather than points. That means that now z = x+i y  represents not the point (x,y) but rather, the directed line segment which extends from (0,0) in the complex plane to (x,y) in the complex plane (unless otherwise stated we will assume here that our vectors are emanating from the origin (0,0)). You can think of a vector as simply representing a magnitude (or length) and a direction. The line segment extending from (1,2) to (-4,5) is the same as the one extending from (-4,5) to (1,2), but the vectors represented would be different because they would be pointing in opposite directions. Vectors are often drawn as arrows.

The complex number 2+3i represented as a vector in the complex plane. Compared to our last image, this is a hoot.

Once we have the vector notion of a complex number, we can think about adding complex numbers as adding vectors. For example, if we have

z_{1} = x_{1} + i y_{1} z_{2} = x_{2} + iy_{2}

then

z_{1} + z_{2} = (x_{1}+x_{2}) + i (y_{1}+y_{2}).

Hence, z_{1} + z_{2} is the vector in the complex plane that extends from (0,0) to (x_{1}+x_{2}, y_{1}+y_{2}). Note that (x_{1}+x_{2}, y_{1}+y_{2}) is the point we get to if we piggy back vectors z1 and z2 and then follow them both. By this I mean that we translate (i.e. move without rotating or stretching) z2 so that its beginning is at the end of z1, and then we follow the path consisting of the two vectors until we get to the (new) end of z2. However, since z_{1} + z_{2} = z_{2} + z_{1}, we can also piggy back z1 at the end of z2 to get the same result. A slightly different view is achieved if we think of our first number z_{1} as being a vector, and our second number z_{2} as being a point. In this case, z_{1} + z_{2} simply corresponds to moving the point z_{2} by the distance and direction represented by the vector z_{1}.

Here are two complex numbers represented as vectors, 2+3i and 1+3i.

When we sum the complex numbers 2+3i and 1+3i we get the complex number 3+6i.

Alright, so addition of complex numbers can be thought of as adding vectors in the complex plane (or moving a point by the distance and direction stored in a vector), but what the heck does multiplication do? Well, to understand multiplication we need yet another geometric way of thinking about complex numbers.

First though, we make a quick (and relevant) aside. For any complex number z, we have by definition that the absolute value |z| of z satisfies

|z| = \sqrt{z \overline z} = \sqrt{(x+iy)(x-iy)} = \sqrt{x^{2} + y^{2}} = \sqrt{(x-0)^{2} + (y-0)^{2}}

which is precisely the formula for the distance between the point (0,0) and the point (x,y). Hence, |z| measures the length of the vector that z represents. Now, according to Euler’s formula, we have that for any real number \theta

e^{i \theta} = cos(\theta) + i sin(\theta).

hence e^{i \theta} is a complex number since it is the sum of a real and imaginary number. In particular, we have

|e^{i \theta}|^{2} = e^{i \theta} \overline{ e^{i \theta} } = e^{i \theta} e^{-i \theta} = e^{i \theta - i \theta} = e^{0} = 1

That means that any complex number representable in this form must have a distance of 1 from the origin. In other words, such numbers represent points on the unit circle, or equivalently, vectors of length 1 extending from the origin. As it turns out, all vectors in the complex plane with length 1 can be represented in this form. But what angle does each of these vectors point at? Well, we compute the angle of the line segment (0,0) to (x,y) with respect to the horizontal axis using arctan(y/x). In our case though, since x=cos(\theta) and y = sin(\theta) this gives:

arctan(y/x) = arctan(sin(\theta)/cos(\theta)) = arctan(tan(\theta)) = \theta.

Hence, the complex number e^{i \theta} represents a vector of length 1 pointed at angle \theta. Similarly, for any non-negative number r, we have that the complex number

r e^{i \theta}

represents a vector of length r pointed at angle \theta . This provides a new way of thinking about a complex number: as a vector specified by its length and angle. What happens when we multiply two such numbers z_{1} = r_{1} e^{i \theta_{1}} and z_{2} = r_{2} e^{i \theta_{2}} together? Well, we have

z_{1} z_{2} = r_{1} e^{i \theta_{1}} r_{2} e^{i \theta_{2}}

= r_{1} r_{2} e^{i (\theta_{1} + \theta_{2})} .

Hence, z_{1} z_{2} is a new complex number representing a vector of length r_{1} r_{2} pointed at angle \theta_{1} + \theta_{2}. Therefore, we can think of the action of z_{1} multiplying z_{2} as causing the vector z_{2} to stretch by a factor of r_{1} and rotate by an angle \theta_{1}.

Hence, to recap, we can view complex numbers geometrically as representing points or vectors in the complex plane. If we do this, then adding complex numbers corresponds to adding together vectors, or equivalently, moving the point that the second complex number represents along the vector that the first complex number represents. On the other hand, we can think of multiplication of complex numbers as corresponding to scaling and rotating the second complex number in the multiplication by the length and angle inherent in the first complex number. Finally, we note that taking the absolute value of a complex number corresponds to measuring the length of the corresponding vector. Therefore, one way to view complex numbers is as a means for converting geometric operations (translation, rotation, scaling) into algebraic operations (adding, multiplication) and back again. As you might imagine, this can be extremely useful!

Q: What would happen if an unstoppable force met with an unmovable, impenetrable object?

Thursday, April 22nd, 2010

Mathematician: Sometimes, when we don’t use language carefully enough, we can get ourselves into philosophical trouble. For example, consider the following statement:

If a barber shaves all those men (and only those men) who do not shave themselves, does he shave himself?

If the barber shaves himself, then he is shaving a man who shaves himself, which is something that (by definition) he does not do. On the other hand, if the barber does not shave himself, then there is a man who doesn’t shave himself that the barber doesn’t shave, which again contradicts our definition of the barber.

So what is the answer? Well, the question has no answer, because the definition we use for our barber contains within it a logical contradiction. What’s more, it is impossible for such a barber to actually exist in the real world, since the razor burn associated with simultaneously shaving yourself and not shaving yourself is too much for any single person to withstand.

Now, let’s return to the original question:

What would happen if an unstoppable force met with an unmovable, impenetrable object?

Well, let’s suppose that we define an “unstoppable” force to be one that can move absolutely any matter. Furthermore, let’s define an “unmovable” object to be one that cannot be moved by any force. In that case, this question is unanswerable, because like the barber paradox above, it relies on contradictory information. By definition our force can move anything, but then, also by definition, there is an object that the force cannot move. This is a bit like saying “suppose X is true, and not X is true. Then is X true?”. Here  X is the idea that the force can move anything, and not X is the idea that there is at least one object that cannot be moved by the force (which in this case is our unmovable object). Hence, this question has no answer because it relies on assumptions which contradict each other.

Video: The Scientific Investigation of Aliens – Evidence Examined

Wednesday, April 14th, 2010
This talk given at Nerd Nite (NYC) by the Mathematician discusses some of the evidence that has been proposed for the idea that technologically advanced aliens have already arrived on earth, including UFO photographs, crop circles, and abduction stories.

The PowerPoint presentation of this lecture, including references:

Download the Scientific Investigation of Aliens PowerPoint Presentation

Part 1 of the video of this lecture:

Part 2 of the video of this lecture:

References:

Polls

Roper Poll Survey 1999 :

http://www.ufoevidence.org/documents/doc850.htm

Public Opinion Polls on UFOs :

http://www.ufoevidence.org/topics/PublicOpinionPolls.htm

CNN/TIME Poll 1997 :

http://www.cnn.com/US/9706/15/ufo.poll/

Gallup Poll 2005:

http://www.gallup.com/poll/19558/Paranormal-Beliefs-Come-SuperNaturally-Some.aspx

Hallucination Poll Data :

“Prevalence of hallucinations and their pathological associations in the general population”, 2000 by Maurice M. Ohayon

Crop Circles

The International Crop Circle Database :

http://ccdb.cropcircleresearch.com/index.cgi

Why Crop Circles Can’t Be Hoaxed :

http://theconversation.org/booklet2.html

Abductions

Hill Abduction:

http://en.wikipedia.org/wiki/Betty_and_Barney_Hill_abduction

Kenneth Arnold UFO Sighting:

http://en.wikipedia.org/wiki/Kenneth_Arnold_unidentified_flying_object_sighting

Susan Blackmore on Aliens :

http://www.susanblackmore.co.uk/journalism/ns94.html

John Edward Mack :

http://en.wikipedia.org/wiki/John_Edward_Mack

Astronomy

Number of Planets in the Galaxy :

http://en.wikipedia.org/wiki/Extraterrestrial_life

Number of Galaxies in the Universe :

http://imagine.gsfc.nasa.gov/docs/ask_astro/answers/021127a.html

Number of planets in our galaxy :

http://www.madsci.org/posts/archives/2000-01/948204035.As.r.html

Meteorites :

http://en.wikipedia.org/wiki/Meteorite#Fall_phenomena

Other Info

Photo Tampering Throughout History :

http://www.cs.dartmouth.edu/farid/research/digitaltampering/

Alien Abduction Insurance:

http://en.wikipedia.org/wiki/Alien_abduction_insurance

Alien Abduction Insurance Articles :

http://www.alienabductioninsurance.com/articles.html

Aliens and Children :

http://www.aliensandchildren.org/

Developments in Crop Circles Over Time :

http://www.cropcircles.net/

Top 10 Controversial pieces of evidence for extraterrestrial life :

http://www.newscientist.com/article/dn9943-top-10-controversial-pieces-of-evidence-for-extraterrestrial-life.html

The Drake Equation :

http://en.wikipedia.org/wiki/Drake_equation

The Flake Equation :

http://xkcd.com/718/

UFOs :

http://en.wikipedia.org/wiki/UFO

Causes of Hallucinations :

http://www.wrongdiagnosis.com/h/hallucination/causes.htm#causeslist

Crop Circle :

http://en.wikipedia.org/wiki/Crop_circle

Sleep Paralysis :

http://en.wikipedia.org/wiki/Sleep_paralysis

Water Bears :

http://en.wikipedia.org/wiki/Water_bear

Videos

Best of UFOs (part 1) :

http://www.youtube.com/watch?v=6XXOvUwr2YU

Documentaries About Aleins :

http://www.hyper.net/ufo/video-documentaries.html

Crop Circle Being Made (Schofields Quest – Doug Bower 1994) :

http://www.youtube.com/watch?v=XkGbnUXfh4U

UFO sighting (part 2/3) :

http://www.youtube.com/watch?v=-iVYCYjso-U&feature=player_embedded

National Geographic Crop Circles :

http://video.nationalgeographic.com/video/player/science/weird-science-sci/uk_cropcircles.html

Why people believe strange things:

http://www.ted.com/talks/michael_shermer_on_believing_strange_things.html

Photos

Best UFO Photos:

http://www.ufocasebook.com/bestufopictures.html

Strange Flying Machines Slideshow :

http://www.slideshare.net/webtel125/strange-flying-machines-presentation-793157

UFO Photo Competitions :

http://fx.worth1000.com/contests/8300/sightings-7

Circle Makers :

http://www.circlemakers.org/totc2007.html

UFO Drawings :

http://www.ufo-blog.com/ufo-blog/2010/02/sketches-from-fifth-tranche-of-mod-ufo.html

Three spooky UFO vidoes :

http://www.youtube.com/watch?v=7WYRyuL4Z5I

UFO Haiti :

http://www.youtube.com/watch?v=up5jmbSjWkw

List of Military Aircraft in the United States :

http://en.wikipedia.org/wiki/List_of_military_aircraft_of_the_United_States

Balloon Boy Balloon :

http://s2.hubimg.com/u/1922877_f520.jpg

Sun Dogs:

http://en.wikipedia.org/wiki/Sun_dogs

Lenticular Clouds :

http://en.wikipedia.org/wiki/Lenticular_cloud

Photoshopped Images :

http://www.funnydb.com/Pictures/Celebrities/mona_diesel-3320.html

Books

The Demon Haunted World by Carl Sagan :

http://www.amazon.com/Demon-Haunted-World-Science-Candle-Dark/dp/0345409469


Q: How do I count the number of ways of picking/choosing/taking k items from a list/group/set of n items when order does/doesn’t matter?

Monday, April 12th, 2010

Mathematician: Suppose that we have a list containing three items, {A,B,C}, and we want to know how many different ways there are of choosing two items from this list. If we care about the order that items are selected from the list, then the possibilities are

{A,B}, {A,C}, {B, A}, {B, C}, {C,A}, {C,B}

(where {A,B} here means that we’ve selected item A and then item B from the list {A,B,C} ) so the answer is that there are six possible ways of choosing two items out of three. If on the other hand we don’t care about what order the items were selected in (so {A,B} is considered the same as {B,A}) then all the possible unique arrangements are

{A,B}, {A,C}, {B, C}

so the answer is that there are three possible ways of choosing two items out of three when order doesn’t matter.

When there are small numbers of items, it isn’t difficult to just write down all the possible combinations (when order doesn’t matter) or permutations (when order does matter). But what do we do when there are larger numbers of items? For example, it turns out there are 15,504 different ways to choose 5 items out of 20 (when the order of items selected doesn’t matter), far too many to write down. In cases like these we want to use a formula that depends only on n (the number of items in our set) and k (the number of items we will be selecting from it) that can quickly give us the answer we need. Let’s see if we can figure out what this formula should be.

For the time being we are going to assume that the order we select items in does matter (so the selection of A followed by the selection of B is not the same as the selection of B followed by the selection of A). Now notice that since we have n items in total, there are n choices for the first item that we pick. Once we’ve chosen the first item, there are n-1 items left in our list, so for our second selection we have n-1 possible choices. The number of total possible ways of choosing the first two items is therefore n * (n-1) because for each of the n first items we could choose we have n-1 second items possible. Now, for the 3rd item selected there are n-2 items left in the list (since we’ve already used up 2 items out of a total of n), which means that in choosing three items there are a total of n (n-1) (n-2) possible permutations. This pattern continues for each of the k items we are choosing, so that we find that the total number of ways of choosing k items is

(n)(n-1)(n-2)…(n-k+2)(n-k+1)

which can be written using the factorial function as

\frac{n!}{(n-k)!}

Remember however that in this analysis the order of the items selected made a difference. If what we are interested in is the number of ways of choosing k items from a list of n such that order does NOT matter (i.e. in each selection all that matters is which items are in that selection, not which order those items were chosen) then we have to adjust our formula somewhat by making the following considerations.

If we are going to be choosing k items, then how many different orderings of those k items exist? Well, there are k possible choices for which item goes first. After we have chosen which one goes first, there are k-1 that can go second. This leads to a total number of k*(k-1) arrangements for the first two items. Then, there are k-2 items to choose from for the 3rd item, and so on, leading to k*(k-1)*(k-2)*(k-3)*…*3*2*1 total arrangements. Note though that this is just the same as the definition of k factorial, so we just write k! to represent the expression. Now, we observe that the number of ways to choose k items from n such that order matters is k! times bigger than the number of ways to choose k items from n where order doesn’t matter. The reason is because for each of the ways we can make a selection of k items when order doesn’t matter, there are k! different orderings in which we could have chosen each of our k selected items. Hence, since there are  \frac{n!}{(n-k)!} possible choices when order matters, but this is k! times greater than the case when order doesn’t matter, we have that the total number of different possible selections when order doesn’t matter is just given by

\frac{n!}{k! (n-k)!}

This happens to be the definition of what is called the “choose function”, sometimes known as the binomial coefficient or pascal’s triangle, which mathematicians write as {n\choose k}.

Now, to put our work into action. Let’s suppose that we have ten salad ingredients (peppers, avocado, pears, walnuts, beans, peas, corn, croutons, soy beans and olives) and we want to know how many distinct salads can be made using just two ingredients. Well, if our salad is mixed up, then it doesn’t matter what order we put the ingredients in, so this is equivalent to the problem of asking how many ways there are to select two items from a list of ten when order doesn’t matter. This is given by

10 \choose{2} = \frac{10!}{2! (10-2)!} = \frac{(10)(9)(8!)}{(2)(8!)} = \frac{10*9}{2}=5*9 = 45

so there are 45 two ingredient salads you can make from ten ingredients. How many distinct salads can you make that have anywhere from 2 to 10 of our ingredients? The answer is just the number of two ingredient salads plus the number of three ingredients salads plus the number of four ingredient salads, etc. up to the number of ten ingredient salads. In mathematical notation, this is

10 \choose{2} + 10 \choose{3} + {10\choose 4} + …  + 10 \choose{9} + 10 \choose{10}

= 45 + 120 + 210 + 252 + 210 + 120 + 45 + 10 + 1  = 1013

which is simply the number of salads with two or more ingredients that could be made from our ten ingredients.

Q: How hard would it be to make a list of pruducts of primes that could beat public key encryption?

Saturday, March 27th, 2010

The complete question was: I’m assuming almost anyone with sufficient computing power could generate big prime numbers (if these are not already published somewhere). Would making a table of all of the products of these prime numbers be so difficult? Published public keys could then be compared to this list and the prime factors would be known immediately with a simple database look-up. Wouldn’t this make public key cryptography useless?

Mathematician: Let’s say we are working with all prime numbers that are about d digits long or shorter. The largest such number is x=10d – 1. The number of prime numbers less than this (applying the prime number theorem) is about \frac{x}{\ln{(x)}} which is approximately equal to 0.43\frac{10^d}{d}. Making a table of the product of all of these primes would mean storing about \frac{x}{\ln{(x)}} squared numbers, which is about

.18\frac{10^{2d}}{d^2}

numbers. The average size of one of these numbers is about

.5 x 10d

Hence, on average one of these products will be larger than

.25 x 102d

which takes about

6.64 d

bits to store. The total number of bits needed will be about the number of bits per number times the number of numbers. The number of numbers though is about .18\frac{10^{2d}}{d^2} giving a total number of bits of about

(6.64 d )(.18 \frac{10^{2d}}{d^2}) = 1.2 \frac{10^{2d}}{d}

converting this to gigabytes, we have about

1.12 \frac{10^{(2d-9)}}{d} gigabytes.

If we set d=100 (corresponding to storing all products of prime numbers with no more than 100 digits each) that requires about

1.12 x 10189

which is a ridiculously huge number… far, far bigger than the entire size of the internet. For comparison, the Earth only has about 1050 atoms.

If we set d=20 (corresponding to storing all products of prime numbers with no more than 20 digits each) that requires about

5.6 x 1029
which is still much bigger than the size of the internet!

Basically, there are far too many large prime numbers to store them all, even if we restrict ourselves only to numbers of 20 digits in length.

Physicist: This isn’t an answer, just a clarification of what the question was about (for those of you who aren’t hip to the Comp Sci scene), followed by something I think is worth knowing.

Public key encryption is based on a “trap door algorithm”. Basically, you use a mathematical process that is easy to do in one direction and nie-impossible to do in the other direction, unless you know the “secret”. RSA encryption uses multiplication as it’s mathematical process. (There’s a bit more to it, and if anyone wants to know more let us know.)

Multiplication is easy: 170,363 x 83,701,993 = ?

Factoring is hard (secret = knowing the factors): ? x ? = 14,259,722,633,459

Cryptographers like prime numbers because by using only two big primes you make sure that your factors are large and hard to find. If you have b factors, then, for a given product N, the average size of each factor is about N^{\frac{1}{b}} (smaller and easier to find for larger b).

If you’re ever bored and want to find a big prime you can use Fermat’s Little Theorem: if x^{P-1} mod \, P = 1 (for 1

So for example: P = 7, and x= 5 (pick x at random). 56 mod 7 = (52)3 mod 7 = (25)3 mod 7 = (21+4)3 mod 7 = (4)3 mod 7 = 64 mod 7 = 63+1 mod 7 = 1. 7 is almost definitely prime!

To be more sure you can check a couple different x’s. False positives are very rare using this test.

Another example: P = 12, x=3. 311 mod 12 = 38+3 mod 12 = ((32)2)233 mod 12 = (92)227 mod 12 = (81)2(24+3) mod 12 = (72+9)23 mod 12 = (9)23 mod 12 = 81 x 3 mod 12 = (72+9)3 mod 12 = 9 x 3 mod 12 = 27 mod 12 = 3 ≠ 1. 12 is not prime!

Fermat’s Little Theorem is a nearly 100% accurate test, but if you’re some kind of mathematician or something you would want to use the algorithm found in “Primes is in P” (by Agrawal, Kayal, Saxena). This doesn’t have much to do with the original question, but I think it’s worth knowing.

Q: What is infinity? (A brief introduction to infinite sets, infinite limits, and infinite numbers)

Thursday, March 11th, 2010

Mathematician: To mathematicians, infinity is not a single entity, but rather a label given to a variety of related mathematical objects. What unites these “infinities” is that they are all, in some sense, larger than anything that can be obtained or enumerated in the real physical world. Below, I will discuss a few of the infinities that crop up most frequently. One common feature they share is that our intuition about how things should behave often break down in these cases and the math requires some subtlety. Hang onto your hat.

1. Infinite Sets

Whereas sets like {1,2,3} and {dog, cat, apple, bat} clearly are finite in size (with size 3 and 4 respectively), it is natural to say that the set of all integers and the set of real numbers have infinite size. What is particularly interesting though is that while both infinite sets, the integers have an infinite size that’s smaller (in a precise sense) than that of the real numbers.

To observe this, we begin by noting that we can relabel the elements of {dog, cat, apple, bat} to be {1,2,3,4} by assigning dog=1, cat=2, apple=3, bat=4, which does not alter the size of our set (since the size of a set is independent of the names given to its items). However, {1,2,3} is obviously a subset of {1,2,3,4}, so {1,2,3} cannot be larger in size than {1,2,3,4}, and therefore {1,2,3} cannot be larger than {dog, cat, apple, bat} since this set was constructed from {1,2,3,4} just by renaming elements. More generally, if one (finite) set is larger than another, then we can always relabel the larger set so that the smaller one becomes a subset of it. Let’s now assume that this property continues to hold for infinite sets (or, if you like, we can use this very natural property as part of the foundation for the definition of the sizes of infinite sets).

Now, we apply this reasoning about relabeling to the real numbers and integers. First, we observe that the integers are a subset of the real numbers, and hence cannot have size larger than the real numbers. On the other hand though (and this is subtle requiring proof, which can be found here and here) it is impossible to relabel the integers in such a way that the real numbers become a subset of them. Hence, the real numbers are indeed larger than the integers in some important sense.

It may seem obvious that the size of the set of real numbers is in some sense a larger infinity than that of the size of the integers. What may come as a greater surprise however, is that the set of integers {…, -2, -1, 0, 1, 2, …}, the set of positive integers {1, 2, 3, 4, …}, and the set of all rational numbers {p/q where p and q are integers and q > 0} are all infinite but have exactly the same infinite size. The reason is simply because relabelings of these sets exist that make them all into the same set. For example, note that the assignment 1=0, 2=1, 3=-1, 4=2, 5=-2, 6=3, 7=-3, etc. will turn the positive integers into the set of all integers.

As it turns out, there is an infinite “chain” of infinities that measure the size of sets, including \aleph_{0}, the size of the integers, \aleph_{1}, the size of the real numbers, \aleph_{2}, the size of the set of all functions from the real numbers to binary values, and so on. In fact, for every set of size \aleph_{n} one can form a set of size \aleph_{n+1} by taking the set of all subsets of the original set. A disturbing questions with an even more disturbing answer can then be posed: “does there exist a set whose size is greater than \aleph_{0} but less than \aleph_{1}?” Bizarrely, this question turns out to be independent from the standard axioms of mathematics. That leaves us with just three options:

(a) Accept the fact that this mathematical question in unanswerable or “outside of math”.

(b) Reject the existence of sets that are larger than the integers and smaller than the real numbers which would be confirming what is known as the Continuum Hypothesis and amounts to adding a new axiom to math.

(c) Accept the existence of sets with this in between size, which implies adding the existence of such sets as a new mathematical axiom.

2. Limits

Infinities often arise when using “limits”, mathematical constructions which provide a rigorous backbone for calculus. When we have a function f(x) and write

\lim_{x \rightarrow \infty} f(x)

what we mean is the value (if one exists) that the function f(x) approaches as x gets larger and larger. So, for example, we have

\lim_{x \rightarrow \infty} \frac{1}{x} = 0

since by making x large enough, we can make \frac{1}{x} as close to 0 as we like. We would also write that

\lim_{x \rightarrow \infty} x^{2} = \infty

since as x gets larger and larger, x^{2} grows bigger without end (for any real number r there exists an x large enough so that x^{2} exceeds r) . We note though that there is a “rate” at which x^{2} approaches infinity. To see this, we can consider taking limits of the ratio of x^{2} to other functions which also grow without bound as x grows. If these ratios “go to infinity”, then x^{2} goes to infinity faster than these other functions. For example, we have:

\lim_{x \rightarrow \infty} \frac{x^{2}}{x} = \infty

and

\lim_{x \rightarrow \infty} \frac{x^{2}}{\log(x)} = \infty

whereas

\lim_{x \rightarrow \infty} \frac{x^{2}}{x^{2}-1} = 1

so x^{2} goes to infinity faster than x and \log(x) but at the same rate as x^{2}-1.

3. Algebra

Yet another way to think about infinity, is to introduce it as a special “number” with certain properties. For example, we can define \infty so that it satisfies (for all real numbers x):

\infty > x

\infty + x = \infty

\infty \times x = \infty when x > 0

\frac{x}{\infty} = 0

Similar rules could be used to define -\infty.  Some tricky cases arise though for which no sensible definition of \infty seems possible. For example

\infty \times 0 = ?

\frac{\infty}{\infty} = ?

\infty - \infty = ?

Defining each of these latter statements is problematic. However, as long as we never need to multiply infinity by zero, or divide infinities, or do any other “undefinable” operations in whatever context we happen to be working, we can introduce \infty as if it were just a special new type of number.

4. Topology

Topology is the study of topological spaces (which are sort of like a generalized notion of surfaces) and the properties they have that are independent of angles and distances. If two surfaces can be made the same through stretching or pulling without requiring any cutting or gluing (more technically, if they can be mapped onto each other by a continuous function with a continuous inverse) then they are considered identical from a topological perspective. Hence, a disc and rectangular surface are topologically equivalent, as are a rubber band shaped surface and a disc with one hole punched in it (imagine stretching the hole until you get a band like object). You may begin to see why it has been said that a topologist is a person who can’t tell a teacup from a doughnut.

When topologists work with the real number line (i.e. the set of real numbers together with the usual notion of distance which induces topological structure), they sometimes introduce a “point at infinity”. This point, denoted \infty can be thought of as the point that you would always be heading towards if you started at 0 and traveled in either direction at any speed for as long as you liked. Strangely, when this infinite point is added to the real number line, it makes it topologically equivalent to a circle (think about the two ends of the number line both joining up to this single infinite point, which closes a loop of sorts). This same procedure can also be carried out for the plane (which is the two dimensional surface consisting of points (x,y) where x and y are any real numbers). By adding a point at infinity we compactify the plane, turning it into something topologically equivalent to a sphere (imagine, if you can, the edges of the infinite plane being folded up until they all join together at a single infinity point).

How to get a circle from a line in four easy steps.

In some sense these “points at infinity” that are introduced are not special in any way (they behave just like all other points from a topological perspective). However, if measures of distance are thrown back into the mix, it seems fair to say that these points at infinity are infinitely far away from all the others.

Conclusion

Ultimately, the question “what is infinity?” is not one that really can be answered, as it assumes that infinite things have a unique identity. It makes more sense to ask “how do infinite things arise in mathematics”, and the answer is that they arise in many, very important ways.