Archive for the ‘Math’ Category

Q: If you were on the inside of the Sun falling in, the matter closer to the surface doesn’t affect your acceleration, but the matter closer to the core does. Why is that?

Thursday, September 2nd, 2010

The original question was: Plait talks about the “physics of solid bodies” and why, specifically, if you were on the inside of the Sun falling in, the matter “behind” you- closer to the surface- doesn’t affect your acceleration at all, and all that matters is the matter “in front” of you- closer to the core. Why is that?

Physicist: The short, uninteresting answer is that the gravity from any layer above you cancels itself out. If you take any sample layer above you, and you happen to be closer to one side, then you’ll find that the side you’re closer to has more pull on you, but there’s less of it. Conversely, the far side has less pull, but there’s more of it. For a sphere (but not a ring) these forces cancel exactly. So as you fall in you can ignore all the layers above you.

Pick a layer. Anything inside will experience exactly the same amount of pull in every direction, and so, no pull at all.
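If you'd rather see the cancellation than take it on faith, here's a minimal numerical sketch. It chops one thin, uniform shell into rings and adds up their pull on a point sitting off-center. The function name `shell_pull`, the units (G = M = R = 1), and the midpoint-rule integration are all choices made for this sketch.

```python
import math

def shell_pull(d, R=1.0, M=1.0, G=1.0, steps=20000):
    """Net pull on a unit mass a distance d from the center of a thin uniform
    spherical shell (radius R, mass M), measured along the line from the
    center through the mass. Negative means "toward the center"."""
    total = 0.0
    h = math.pi / steps
    for i in range(steps):
        theta = (i + 0.5) * h                      # midpoint rule in the polar angle
        ring_mass = 0.5 * M * math.sin(theta) * h  # mass of the ring at this angle
        along = R * math.cos(theta) - d            # ring's offset from you along the axis
        dist2 = R * R + d * d - 2.0 * R * d * math.cos(theta)
        total += G * ring_mass * along / dist2 ** 1.5
    return total

print(shell_pull(0.5))  # inside the shell: essentially zero
print(shell_pull(2.0))  # outside the shell: about -G*M/d**2 = -0.25
```

The near side pulls harder per chunk, the far side has more chunks, and the sum comes out to nothing as long as you're inside.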

Answer gravy: One of the greatest tools in the physicist’s tool kit is “Gaussian surfaces”. They let you shortcut really difficult math problems using pictures and a little reasoning. Even better, you come across smarter than perhaps you deserve, which is a big plus.

A Gaussian surface is nothing more than an invisible bubble that you draw in space. The “inverse square law” of gravity can actually be rewritten as “the total amount of gravity pointing into the bubble is proportional to the amount of matter inside the bubble”. The arrangement of matter (both inside and outside the bubble) certainly changes how gravity points into (or out of) the bubble, but the total amount of gravity pointing through depends only on the amount of matter inside.

(upper left) when the mass is symmetrical and in the middle, then the gravity is exactly the same everywhere on the surface. (upper right) if the matter is off to the side, then gravity will be stronger, or weaker, or point in different directions at different points on the surface, but the total stays the same. (bottom) matter outside of the Gaussian surface can affect how gravity pokes through, but it can't affect the total.

Now say that your bubble is an exact fit around a sphere of matter. Everything is perfectly symmetric, so there’s no reason for gravity to be any stronger or weaker anywhere and, given the amount of mass inside the sphere, you can figure out how strong the gravity is. Now say you add more matter, but uniformly, on top of your original sphere.

In both situations the total amount of matter inside the bubble is the same, and everything is nice and symmetrical, so the gravity along the surface of the bubble is the same.

The matter inside the sphere has remained the same, so the pull at the surface of that sphere remains the same. As a result, so long as the matter above is at least fairly symmetrical (which is the case for any planet or star you can think of), you can ignore the layers above the surface of the bubble.

Specifically, as you fall farther and farther into the Sun (or Earth, or whatever else is round) you can figure out how much gravity you’re feeling by using a Gaussian surface, for which you only need the matter below you. The layers above will exert no pull on you, and you will exert no net pull on them (for every action/force there is an equal and opposite reaction/force).
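As a rough sketch of what "you only need the matter below you" buys you, here's the calculation for a ball of uniform density. That uniformity is a loud assumption: real stars and planets are much denser toward the core, so only the method carries over. The function name `gravity_at` and the Earth-like numbers in the example are picked for illustration.

```python
G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2

def gravity_at(r, R, M):
    """Gravitational acceleration (m/s^2) a distance r from the center of a
    ball of radius R and mass M, ASSUMING uniform density."""
    if r <= 0:
        return 0.0
    # Only the mass inside the bubble of radius r counts; layers above cancel.
    enclosed = M if r >= R else M * (r / R) ** 3
    return G * enclosed / r ** 2

# Earth-like numbers (assumed for illustration): mass in kg, radius in m.
M_EARTH, R_EARTH = 5.972e24, 6.371e6
for fraction in (1.0, 0.5, 0.1, 0.0):
    print(fraction, gravity_at(fraction * R_EARTH, R_EARTH, M_EARTH))
```

With uniform density the pull drops off in direct proportion to your distance from the center, and hits zero in the middle.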

This part has nothing to do with the question: You can use Gaussian surfaces to prove some surprising things. Specifically: Dyson spheres work, and black holes have no more gravity than the stars they came from.

From the last argument (above) you know that the layers above you have no net gravitational effect on you. But what if you fall a little way into a planet, and suddenly find that the inside of it is completely hollow? Once you’re inside, all the layers are layers above you. So there’s no gravity at all inside of a large hollow sphere (at least, none caused by the sphere). If you built a really huge sphere around a star you’d have a “Dyson sphere”. The sphere doesn’t pull the star, and the star doesn’t pull the sphere. It’s stable no matter where the star is inside the ball. So long as no one shoves anything, everything will just float neutrally right where it is.

(left) the set up for a Dyson Sphere. The perfectly spherical shell has no gravitational effect on anything inside the sphere, including the star. (center) Miles Dyson, inventor of the Dyson sphere, and future inventor of Skynet. (right) an artist's interpretation of a Dyson sphere.

Now, put a Gaussian surface around a star. There’s a certain amount of matter in the star, and that tells you how much gravity is pointing through the surface. If the star shrinks, who cares? Same amount of mass = same amount of gravity.

The gravity through the outer Gaussian surface stays the same, since both contain the same amount of matter. The gravity through the inner Gaussian surface increases dramatically after the star collapses, because it contains all of the star's mass, instead of just a small part of it.

But if you draw a small Gaussian surface around the core of the star you’ll find that the gravity along the surface is small, because there is (relatively) little mass inside of it. If for some reason you found yourself in the center of the Sun, you’d be floating in zero G’s. Point of fact; you’d also be on fire.

Now when the star collapses, all of the matter is drawn into a tiny region. Both spheres (see diagram on the right) now contain all of the star’s matter, and thus the same total amount of gravity pokes through them. The only difference is that the inner sphere is smaller, so the gravity has to be more intense to get the same total as the outer sphere.
Black holes do have very intense gravity, but only in the region where the star used to be.

Q: Will we ever overcome the Heisenberg uncertainty principle?

Sunday, August 1st, 2010

Physicist: Nopers!

The Heisenberg uncertainty principle, while normally presented in physics circles, is actually a mathematical absolute. So overcoming the uncertainty principle is exactly as hard as overcoming that whole “1+1=2” thing.

The uncertainty principle (the “position/momentum uncertainty principle”) is generally presented like this: you have some particle floating along and you’d like to know its position and its momentum (velocity, whatever) so you bounce a photon off of it (“Bounce a photon off of it” is just another way of saying “try to see it”).  A general rule of thumb for light (waves in general really) is high frequency waves propagate in straight lines, and low frequency waves spread out.  That’s why sunlight (high frequency) seems to go in a perfectly straight line, but radio waves can spread out around corners.  For example, you can still pick up a radio station even when you can’t see it directly.

So, if you want to see where something is with precision you’ll need to use a high frequency photon.  After all, how can you trust the results from a wandering, low frequency photon?  But, if you use a precise, high-frequency, and thus, high-energy photon, you’ll end up smacking the hell out of the particle you’re trying to measure.  So you’ll know where it is pretty exactly, but it’ll go flying off with some unknown amount of momentum.  Any method you can come up with to measure the momentum will require you to use low-frequency, low-energy, gentle photons.  But then you won’t be able to figure out the particle’s position very well.

Low frequency photons (like radio waves) don't tell you much about where a particle is, but they don't knock it around much either (so you can measure its momentum better). High frequency photons (like sunlight) are terrible at measuring momentum, but can tell you position well.

So far this seems more like an engineering problem than a problem with the universe.  Maybe we could arrange things so that the high frequency photon hit softer or something?  There was a lot of back and forth for a long time (still is in some circles) about overcoming the uncertainty principle, but in the end it can never be violated.

Rather than being something that’s merely very challenging like, “you can’t break the sound barrier”, “what goes up must come down”, and “you can’t be the world’s best kick-boxer and be the world’s most handsome physicist”, the uncertainty principle is a mathematical absolute.  So, unless the basic assumptions of physics are completely wrong (and they’ve held up to some serious scrutiny), the uncertainty principle is in the company of things like “you can’t go faster than light”, “energy and mass are conserved”, and “modern mathematicians don’t have beards” (has anyone else noticed this?).  What follows is answer gravy.

Answer gravy: This gravy has some lumps.  If you know what a “Fourier transform” is, and are at least a little comfortable with them, then this could be interesting to you.

The squared magnitude of a quantum wave function gives the probability of finding the system in a particular state. For example, the “position wave function” can tell you the probability of finding a particle at any position. To get the probability from the wave function, all you have to do is take the magnitude of the wave function and square it.

If you’ve got the quantum wave function f(x) for the position of a particle, then you can find the momentum wave function, g(p), by taking the Fourier transform of f. That is, g=\hat{f}. Now, you can define the uncertainty as the standard deviation of the probability function, which is a really good way to go about it.

A probability function (blue), with its uncertainty or standard deviation (red). Like you'd expect, the particle is most likely to be near zero, but it's not certain to be near zero.

The uncertainty principle now just boils down to the statement that the product of the uncertainties of the square of a function, f, and the square of its Fourier transform, \hat{f}, is always at least some constant. In what follows you’ll find some useful stuff such as Plancherel’s theorem and Cauchy-Schwarz.

\begin{array}{ll}\sigma_x\sigma_p=\sigma_{|f|^2}\sigma_{|\hat{f}|^2}\\=\sqrt{Var(|f|^2)}\sqrt{Var(|\hat{f}|^2)}\\=\left(\int x^2|f|^2\,dx\right)^{\frac{1}{2}}\left(\int\xi^2|\hat{f}|^2\,d\xi\right)^{\frac{1}{2}}\\=\left(\int |xf|^2\,dx\right)^{\frac{1}{2}}\left(\int|\xi\hat{f}|^2\,d\xi\right)^{\frac{1}{2}}\\=\frac{1}{2\pi}\left(\int |xf|^2\,dx\right)^{\frac{1}{2}}\,\left(\int|\widehat{f^\prime}|^2\,d\xi\right)^{\frac{1}{2}}&(\widehat{f^\prime}=2\pi i\xi\hat{f})\\=\frac{1}{2\pi}\left(\int |xf|^2\,dx\right)^{\frac{1}{2}}\,\left(\int|f^\prime|^2\,dx\right)^{\frac{1}{2}}&(\textrm{Plancherel})\\ \ge\frac{1}{2\pi}\int|xf f^\prime|\,dx&(\textrm{Cauchy-Schwarz})\\=\frac{1}{2 \pi} \int |x| \, |f| \, |f^\prime| \, dx \\\ge \frac{1}{2 \pi} \left| \int x |f| f^\prime \, dx \right| \\= \frac{1}{2 \pi} \left| \int (x) (\frac{1}{2}|f|^2)^\prime \, dx \right|&(\frac{1}{2}|f|^2)^\prime=|f| f^\prime\\= \frac{1}{4 \pi} \left| \int |f|^2 \, dx \right|&(\textrm{integration by parts})\\=\frac{1}{4 \pi}&(\textrm{the total probability is always 1})\end{array}

So, there’s the Heisenberg uncertainty principle: \sigma_{|f|^2} \sigma_{|\hat{f}|^2} \ge \frac{1}{4 \pi}.  A physicist would recognize this as \Delta x \Delta p \ge \frac{\hbar}{2}.  The difference comes about because the Fourier transform that takes you from the position wave function to the momentum wave function involves an h, and \hbar = \frac{h}{2\pi}.  (For the physicists out there who were wondering what happened to their precious h’s)
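For anyone who'd like to poke at the bound numerically, here's a minimal sketch. It leans on the same trick as the derivation: since \widehat{f^\prime}=2\pi i\xi\hat{f}, Plancherel lets you get the spread of |\hat{f}|^2 from f^\prime without ever computing a Fourier transform. The function name `uncertainty_product`, the grid, and the two test functions are assumptions of the sketch, and it only handles real wave functions centered on zero.

```python
import math

def uncertainty_product(f, lo=-20.0, hi=20.0, n=40001):
    """sigma_x * sigma_xi for a real wave function f centered on zero
    (assumed), using int xi^2 |f_hat|^2 dxi = (1/(2 pi)^2) int |f'|^2 dx."""
    dx = (hi - lo) / (n - 1)
    xs = [lo + i * dx for i in range(n)]
    fs = [f(x) for x in xs]
    norm = sum(v * v for v in fs) * dx            # total probability before normalizing
    var_x = sum(x * x * v * v for x, v in zip(xs, fs)) * dx / norm
    # Central differences stand in for the derivative f'.
    dfs = [(fs[i + 1] - fs[i - 1]) / (2 * dx) for i in range(1, n - 1)]
    var_xi = sum(d * d for d in dfs) * dx / norm / (2 * math.pi) ** 2
    return math.sqrt(var_x * var_xi)

bound = 1 / (4 * math.pi)
print(bound)
print(uncertainty_product(lambda x: math.exp(-math.pi * x * x)))  # a Gaussian sits right on the bound
print(uncertainty_product(lambda x: math.exp(-abs(x))))           # a pointier shape lands above it
```

Gaussians are the special case where every inequality in the chain becomes an equality, which is why they land exactly on \frac{1}{4\pi}.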

Q: What’s the chance of getting a run of K or more successes (heads) in a row in N Bernoulli trials (coin flips)? Why use approximations when the exact answer is known?

Saturday, July 24th, 2010

The original question was: Recently I’ve come across a task to calculate the probability that a run of at least K successes occurs in a series of N (K≤N) Bernoulli trials (weighted coin flips), i.e. “what’s the probability that in 50 coin tosses one has a streak of 20 heads?”. This turned out to be a very difficult question and the best answer I found was a couple of approximations.

So my question to a mathematician is: “Why is this task so difficult compared e.g. to finding the probability of getting 20 heads in 50 coin tosses solved easily using binomial formula? How is this task in fact solved – is there an exact analytic solution? What are the main (preferably simplest) approximations and why are they used instead of an analytic solution?”

Physicist: What follows was not obvious. It was the result of several false starts. It’s impressive in the same sense that a dude with seriously bruised legs, who can find anything in his house with the lights off, is also impressive. If you’re looking for the discussion of the gross philosophy, and not the detailed math, skip down to where you find the word “Jabberwocky” in bold. If you want specific answers for fixed, small numbers of coins, or you want sample computer code for calculating the answer, go to the bottom of the article.

The short answer: the probability, S, of getting K or more heads in a row in N independent attempts (where p is the probability of heads and q=1-p is the probability of tails) is:

S(N,K) = p^K\sum_{T=0}^\infty {N-(T+1)K\choose T}(-qp^K)^T-\sum_{T=1}^\infty {N-TK\choose T}(-qp^K)^T

Note that here {A\choose B} is the choose function (also called the binomial coefficients) and we are applying a non-standard convention that {A\choose B}= 0 for A < B which makes the seemingly infinite sums always have only a finite number of terms. In fact, for N and K fixed, the answer is a polynomial with respect to the variable p.
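Here's a minimal sketch of that formula in Python, checked against brute force for small N. The names `choose`, `run_probability`, and `brute_force` are made up for the sketch, it assumes K is at least 1, and the sums are simply cut off at T = N because every later term is zero under the convention above.

```python
from itertools import product
from math import comb

def choose(a, b):
    """Binomial coefficient with the convention used above: zero whenever a < b."""
    return comb(a, b) if a >= b else 0

def run_probability(N, K, p):
    """Closed-form S(N,K): chance of K or more heads in a row in N flips."""
    q = 1.0 - p
    x = -q * p ** K
    first = sum(choose(N - (T + 1) * K, T) * x ** T for T in range(0, N + 1))
    second = sum(choose(N - T * K, T) * x ** T for T in range(1, N + 1))
    return p ** K * first - second

def brute_force(N, K, p):
    """Same probability by listing all 2**N sequences (small N only)."""
    total = 0.0
    for flips in product("HT", repeat=N):
        if "H" * K in "".join(flips):
            heads = flips.count("H")
            total += p ** heads * (1.0 - p) ** (N - heads)
    return total

print(run_probability(5, 2, 0.5), brute_force(5, 2, 0.5))  # both 19/32 = 0.59375
print(run_probability(50, 20, 0.5))  # the "streak of 20 heads in 50 tosses" from the question
```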

Originally this was a pure math question that didn’t seem interesting to a larger audience, but we both worked for long enough on it that it seems a shame to let it go to waste. Plus, it gives me a chance to show how sloppy (physics type) mathematics is better than exact (math type) mathematics.

Define \{X_i\}_j as the list of results of the first j trials. E.g., if j=4, then \{X_i\}_4 might be “\{X_i\}_4=HTHH” or “\{X_i\}_4=TTTH” or something like that, where H=heads and T=tails. In the second case, X_1=T, X_2=T, X_3=T, and X_4=H.

Define “E_j” as the event that there is a run of K successes (heads) in the first j trials. The question boils down to finding P(E_N).

Define “A_j” as the event that the last K terms of \{X_i\}_j are T followed by K-1 H’s (\{X_i\}_j = X_1X_2X_3X_4…THHH…HHH). That is to say, if the next coin (X_{j+1}) is heads, then you’ve got a run of K.

Finally, define p = P(H) and q = P(T) = 1-p. Keep in mind that a “bar” over an event means “not”. So “\overline{H}=T” reads “not heads equals tails”

The probability of an event is the sum of the probabilities of the different (disjoint) ways that event can happen. So:

\begin{array}{ll}&P(E_{j+1})\\i)&=P(E_{j+1}\cap E_j\cap A_j)+P(E_{j+1}\cap E_j\cap \overline{A_j})+P(E_{j+1}\cap \overline{E_j}\cap A_j)+P(E_{j+1}\cap \overline{E_j}\cap \overline{A_j})\\ii)&=\left[P(E_{j+1}\cap E_j\cap A_j)+P(E_{j+1}\cap E_j\cap \overline{A_j})\right]+P(E_{j+1}\cap \overline{E_j}\cap A_j)+P(E_{j+1}\cap \overline{E_j}\cap \overline{A_j})\\iii)&=\left[P(E_{j+1}\cap E_j)\right]+P(E_{j+1}\cap \overline{E_j}\cap A_j)+P(E_{j+1}\cap \overline{E_j}\cap \overline{A_j})\\iv)&=P(E_j)+P(E_{j+1}\cap \overline{E_j}\cap A_j)+P(E_{j+1}\cap \overline{E_j}\cap \overline{A_j})\\v)&=P(E_j)+P(E_{j+1}\cap \overline{E_j}\cap A_j)+0\\vi)&=P(E_j)+P(E_{j+1}|\overline{E_j}\cap A_j)P(\overline{E_j}\cap A_j)\\vii)&=P(E_j)+pP(\overline{E_j}\cap A_j)\\viii)&=P(E_j)+pP(\overline{E_j}| A_j)P(A_j)\\ix)&=P(E_j)+pP(\overline{E_j}| A_j)qp^{K-1}\\x)&=P(E_j)+qp^KP(\overline{E_{j-K}})\\xi)&=P(E_j)+qp^K\left[1-P(E_{j-K})\right]\\xii)&=P(E_j)+qp^K-qp^KP(E_{j-K})\end{array}

iv) comes from the fact that E_j \subset E_{j+1}. If you have a run of K heads in the first j trials, of course you’ll have a run in the first j+1 trials. v) The zero comes from the fact that if the first j terms don’t have a run of K heads and the last K-1 terms are not all heads, then it doesn’t matter what the (j+1)th coin is, you can’t have a run of K heads (you can’t have the event E_{j+1} and not E_j and not A_j). vii) is because if there is no run of K heads in the first j trials, but the last K-1 of those j trials are all heads, then the chance that there will be a run of K in the first j+1 trials is just the chance that the next trial comes up heads, which is p. ix) the chance of the last K trials being a tails followed by K-1 heads is qp^{K-1}. x) If the last K (of j) trials are a tails followed by K-1 heads, then whether a run of K heads does or doesn’t happen is determined in the first j-K trials.
The other steps are all probability identities (specifically P(C)=P(C\cap D)+P(C\cap \overline{D}), \, P(\overline{C})=1-P(C), and the product rule for conditional probability: P(C\cap D)=P(C|D)P(D)).

Rewriting this with some N’s instead of j’s, we’ve got a kick-ass recursion: P(E_N)=P(E_{N-1})+qp^K-qp^KP(E_{N-K-1})

And just to clean up the notation, define S(N,K) as the probability of getting a string of K heads in N trials (up until now this was P(E_N)).

S(N,K)=S(N-1,K)+qp^K-qp^KS(N-K-1,K). We can quickly figure out two special cases: S(K,K) = p^K, since all K flips have to come up heads, and S(\ell,K) = 0 when \ell<K, since there’s no way of getting K heads without flipping at least K coins. Now check it:

\begin{array}{ll}&S(N,K)\\i)&=S(N-1,K)+qp^K-qp^KS(N-K-1,K)\\ii)&=\left[S(N-2,K)+qp^K-qp^KS(N-K-2,K)\right]+qp^K-qp^KS(N-K-1,K)\\iii)&=S(N-2,K)+2qp^K-qp^K\left[ S(N-K-1,K)+S(N-K-2,K)\right]\\iv)&=S(N-3,K)+3qp^K-qp^K\left[ S(N-K-1,K)+S(N-K-2,K)+S(N-K-3,K)\right]\\v)&=S(N-(N-K),K)+(N-K)qp^K-qp^K\left[ S(N-K-1,K)+\cdots+S(N-K-(N-K),K)\right]\\vi)&=S(K,K)+(N-K)qp^K-qp^K\left[ S(N-K-1,K)+\cdots+S(0,K)\right]\\vii)&=p^K+(N-K)qp^K-qp^K\sum_{r=0}^{N-K-1} S(r,K)\\viii)&=p^K+(N-K)qp^K-qp^K\sum_{r=K}^{N-K-1} S(r,K)\end{array}

ii) Plug the equation for S(N,K) in for S(N-1,K). iii-vi) is the same thing. vii) write the pattern as a sum. viii) the terms up to K-1 are all zero, so drop them.
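A quick sanity check, as a sketch: the step-by-step recursion S(N,K)=S(N-1,K)+qp^K-qp^KS(N-K-1,K) and the summed form from step viii) should spit out the same numbers. The function names `run_prob_stepwise` and `run_prob_summed` are invented here, and K is assumed to be at least 1.

```python
def run_prob_stepwise(N, K, p):
    """S(N,K) from S(N,K) = S(N-1,K) + q p^K - q p^K S(N-K-1,K), built up
    from the special cases S(l,K) = 0 for l < K and S(K,K) = p^K."""
    q = 1.0 - p
    S = [0.0] * (max(N, K) + 1)
    S[K] = p ** K
    for n in range(K + 1, N + 1):
        S[n] = S[n - 1] + q * p ** K - q * p ** K * S[n - K - 1]
    return S[N]

def run_prob_summed(N, K, p):
    """S(N,K) from the summed form:
    p^K + (N-K) q p^K - q p^K * (sum of S(r,K) for r = K .. N-K-1)."""
    if N < K:
        return 0.0
    q = 1.0 - p
    tail = sum(run_prob_stepwise(r, K, p) for r in range(K, N - K))
    return p ** K + (N - K) * q * p ** K - q * p ** K * tail

for N in (5, 10, 30):
    print(N, run_prob_stepwise(N, 3, 0.5), run_prob_summed(N, 3, 0.5))
```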

Holy crap! A newer, even better recursion! It seems best to plug it back into itself!

\begin{array}{ll}i)&S(N,K)=p^K+(N-K)qp^K-qp^K\sum_{r=K}^{N-K-1} S(r,K)\\ii)&=p^K+(N-K)qp^K-qp^K\sum_{r=K}^{N-K-1} \left[p^K+(r-K)qp^K-qp^K\sum_{\ell=K}^{r-K-1} S(\ell,K)\right]\\iii)&=p^K+(N-K)qp^K-qp^K\sum_{r=K}^{N-K-1}p^K-qp^K\sum_{r=K}^{N-K-1}(r-K)qp^K+\left(qp^K\right)^2\sum_{r=K}^{N-K-1}\sum_{\ell=K}^{r-K-1} S(\ell,K)\\iv)&=p^K+(N-K)qp^K-qp^K\sum_{r=1}^{N-2K}p^K-\left(qp^K\right)^2\sum_{r=0}^{N-2K-1}r+\left(qp^K\right)^2\sum_{r=2K+1}^{N-K-1}\sum_{\ell=K}^{r-K-1} S(\ell,K)\\v)&=p^K+(N-K)qp^K-p^K(N-2K)qp^K-\left(qp^K\right)^2\frac{(N-2K)(N-2K-1)}{2}+\left(qp^K\right)^2\sum_{r=2K+1}^{N-K-1}\sum_{\ell=K}^{r-K-1} S(\ell,K)\\vi)&=p^K+{N-K\choose 1}qp^K-p^K{N-2K\choose 1}qp^K-{N-2K\choose 2}\left(qp^K\right)^2+\left(qp^K\right)^2\sum_{r=2K+1}^{N-K-1}\sum_{\ell=K}^{r-K-1} S(\ell,K)\end{array}

You can keep plugging the “newer recursion” back in again and again (it’s a recursion after all). Using the fact that \sum_{i=1}^N i={N+1\choose 2} and \sum_{\ell=D}^M {\ell \choose D}={M+1 \choose D+1} you can carry out the process a couple more times, and you’ll find that:

\begin{array}{l}S(N,K)=p^K\left[1-{N-2K\choose 1}qp^K+{N-3K\choose 2}\left(qp^K\right)^2\cdots\right]+\left[{N-K\choose 1}qp^K-{N-2K\choose 2}\left(qp^K\right)^2+{N-3K\choose 3}\left(qp^K\right)^3\cdots\right]\\=p^K\sum_{T=0}^\infty {N-(T+1)K\choose T}(-qp^K)^T-\sum_{T=1}^\infty {N-TK\choose T}(-qp^K)^T\end{array}

There’s your answer.

In the approximate (useful) case:

Assume that N is pretty big compared to K. A string of heads (that can be zero heads long) starts with a tails, and there should be about Nq of those. The probability of a particular string of heads being at least K long is p^K so you can expect that there should be around E=Nqp^K strings of heads at least K long. When E≥1, that means that it’s pretty likely that there’s at least one run of K heads. When E<1, E=Nqp^K is approximately equal to the chance of a run of at least K showing up.
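To get a feel for how good E=Nqp^K is, here's a sketch that compares it against an exact value. Heads up: the exact value is computed by a plain streak-tracking pass (keep track of how long the current run of heads is, one flip at a time), which is a swapped-in method and not one of the formulas derived above. The names `exact_run_prob` and `expected_runs`, and the numbers tried, are assumptions of the sketch.

```python
def exact_run_prob(N, K, p):
    """Chance of a run of K heads in N flips, by tracking the current streak."""
    q = 1.0 - p
    streak = [1.0] + [0.0] * (K - 1)   # streak[i] = P(no run yet, current streak is i)
    done = 0.0                          # P(a run of K has already shown up)
    for _ in range(N):
        new = [0.0] * K
        new[0] = q * sum(streak)        # a tail resets the streak
        for i in range(K - 1):
            new[i + 1] = p * streak[i]  # a head extends it
        done += p * streak[K - 1]       # a head on a streak of K-1 finishes the run
        streak = new
    return done

def expected_runs(N, K, p):
    """The approximation E = N q p^K."""
    return N * (1.0 - p) * p ** K

for N, K in ((1000, 15), (50, 20)):
    print(N, K, expected_runs(N, K, 0.5), exact_run_prob(N, K, 0.5))
```

With N=1000 and K=15 the two agree to within a few percent. With N=50 and K=20 (the numbers from the question) N isn't big compared to K, and the approximation overshoots noticeably.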

Jabberwocky: And that’s why exact solutions are stupid.

Mathematician: Want an exact answer without all the hard work and really nasty formulas? Computers were invented for a reason, people.

We want to compute S(N,K), the probability of getting K or more heads in a row out of N independent coin flips (when there is a probability p of each head occurring and a probability of 1-p of each tail occurring). Let’s consider different ways that we could get K heads in a row. One way to do it would be to have our first K coin flips all be heads, and this occurs with a probability p^{K}. If this does not happen, then at least one tail must occur within the first K coin flips. Let’s suppose that j is the position of the first tail, and by assumption it satisfies 1 \le j \le K. Then, the probability of having K or more heads in a row in the entire set of coins (given that the first tail occurred at j \le K) is simply the probability of having K or more heads in a row in the set of coins following the jth coin (since there can’t be a streak of K or more heads starting before the jth coin due to j being smaller or equal to K). But this probability of having a streak of K or more after the jth coin is just S(N-j,K).

Now, the probability that our first tail occurs at position j is the chance that we get j-1 heads followed by one tail, which is p^{j-1} (1-p). That means that the chance that the first tail occurs on coin j AND there is a streak of K or more heads is given by p^{j-1} (1-p) S(N-j,K). Hence, the probability that the first K coins are all heads, OR coin one is the first tails and the remainder have K or more heads in a row, OR coin two is the first tails and the remainder have K or more heads in a row, OR coin three is the first tails and…, is given by:

S(N,K) = p^{K} + \sum_{j=1}^{K} p^{j-1} (1-p) S(N-j,K)

Note that what this allows us to do is to compute S(N,K) by knowing the values of S(N-j,K) for 1 \le j \le K. Hence, this is a recursive formula for S(N,K) which relates harder solutions (with larger N values) to easier solutions (with smaller N values). These easier solutions can then be computed using even easier solutions, until we get to S(A,B) for values of A and B so small that we already know the answer (i.e. S(A,B) is very easy to calculate by hand). These are known as our base cases. In particular, we observe that if we have zero coins then the probability of getting any positive number of heads is zero, so S(0,K) = 0, and the chance of getting more heads than we have coins is zero, so S(N,K) = 0 for K>N.

All of this can be implemented in a few lines of (python) computer code as follows:
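A minimal sketch of that recursion in Python, with every computed value saved in a dictionary. The names `make_run_probability`, `S`, and `saved` are made up for this sketch and are not taken from the original listing; K is assumed to be at least 1.

```python
def make_run_probability(p):
    """Build S(N, K) for a coin that comes up heads with probability p."""
    saved = {}  # previously computed values of S, keyed by (N, K)

    def S(N, K):
        if K > N:                      # base case: more heads wanted than coins flipped
            return 0.0
        if (N, K) not in saved:
            total = p ** K             # the first K flips are all heads
            for j in range(1, K + 1):  # or the first tail lands on flip j
                total += p ** (j - 1) * (1.0 - p) * S(N - j, K)
            saved[(N, K)] = total
        return saved[(N, K)]

    return S

S = make_run_probability(0.5)
print(S(5, 2))    # 19/32 = 0.59375
print(S(50, 20))  # the "streak of 20 heads in 50 tosses" from the question
```

Since each call digs down through S(N-1,K), S(N-2,K), and so on before anything is saved, a very large N can bump into Python's recursion limit. Filling in the values from small N upward with a loop would sidestep that.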

An important aspect of this code is that every time a value of S(N,K) is computed, it is saved so that if we need to compute S(N,K) again later it won’t take much work. This is essential for efficiency since each S(N,K) is computed using S(N-j,K) for each j with 1 \le j \le K and therefore if we don’t save our prior results there will be a great many redundant calculations.

For your convenience, here is a table of solutions for S(N,K) for 1 \le N \le 10 and 1 \le K \le 10:

Q: Will there always be things that will not or cannot be known?

Wednesday, June 30th, 2010

Mathematician: Unfortunately, limits to knowledge seem to be built into the nature of the universe, and even into logic itself.

Relativity: Einstein’s theory of special relativity implies that no information can travel faster than the speed of light. That means that information from sufficiently recent, sufficiently far away events will not have had the time to propagate to us yet, making detailed knowledge of such events impossible. In physics speak, we say that these events are outside of our “past light cone”, “space-like separated” from us, or just “elsewhere”. As long as new events of this type keep happening, there will always be things about which we do not and cannot know.

Quantum Mechanics: The Heisenberg uncertainty principle states that the uncertainty \Delta x we have in a particle’s position and the uncertainty \Delta p we have in the particle’s momentum cannot both be very small at the same time. In particular, the product of these uncertainties is at least a constant (\Delta x \Delta p \ge \frac{\hbar}{2}). This implies a fundamental limit to the knowledge that is possible because we can know x accurately or p accurately, but not both.

What’s more, the vast majority of physicists agree that quantum mechanics demonstrates the universe is random at a fundamental level. This means that some events, like the time at which an atom will decay, can be predicted only probabilistically. We can say how likely an atom is to decay in a given time interval, but we will never be able to say precisely when the decay will occur, placing another limitation on what knowledge is possible. (Physicist’s note: After the decay you still can’t say when exactly it happened because according to quantum mechanics the exact time doesn’t actually exist!)

Mathematics: Gödel’s  first incompleteness theorem states (essentially) that any mathematical system  that is able to express elementary arithmetic (and doesn’t contain any contradictions) must contain true arithmetical statements that cannot be proven within that system. Essentially this implies that there will always be true mathematical statements that we cannot prove.


Add to all of these theoretical considerations the enormous (and possibly infinite) number of things that could be known about our physical universe, and the (most definitely) infinite number of true mathematical statements that could be known, and it is clear that there will always be knowledge that is beyond our reach.

Q: If you could see through the Earth, how big would Australia look from the other side?

Sunday, June 27th, 2010

The original question was: Relative to the size my feet appear when I’m standing up and looking at the ground, how large would Australia appear if I could see all the way through the Earth and observe its shape?  Also, if we considered my location to be a new “north pole”, how large would the “northern” hemisphere I observe seem relative to the “southern” hemisphere? In other words, due to the direct inverse relation between apparent size and distance, how much smaller does one half of a sphere appear from a point directly centered on the surface of the other half?

Physicist: This is an example of “party trick mathematics”, the kind of math that you can do in your head, but that looks really complicated. There’s a seriously old theorem from the days when togas meant math (not frat parties) called the “inscribed angle theorem”. It says that if something has an angle on a circle of 2ϕ when seen from the center of the circle, then when seen from a point on the edge it will have an angle of ϕ. What’s really surprising is that it doesn’t matter where you are on the circle. It always works.

The Inscribed Angle Theorem: Surprising, but true.
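If "it doesn't matter where you are on the circle" sounds too good to be true, here's a small sketch that checks it with plain coordinates. The function name `angle_seen_from` and the choice of a 34° arc are assumptions of the sketch, and the viewer is assumed to stand somewhere on the circle outside that arc.

```python
import math

def angle_seen_from(viewer_deg, a_deg, b_deg):
    """Angle (in degrees) between points a and b on the unit circle, as seen
    from a third point on the circle. All inputs are positions in degrees."""
    def point(deg):
        t = math.radians(deg)
        return math.cos(t), math.sin(t)

    vx, vy = point(viewer_deg)
    ax, ay = point(a_deg)
    bx, by = point(b_deg)
    ux, uy = ax - vx, ay - vy   # line of sight to a
    wx, wy = bx - vx, by - vy   # line of sight to b
    cross = ux * wy - uy * wx
    dot = ux * wx + uy * wy
    return math.degrees(math.atan2(abs(cross), dot))

# An arc spanning 34 degrees as seen from the center, viewed from all over the circle.
for viewer in (60, 120, 180, 250, 300):
    print(viewer, angle_seen_from(viewer, -17, 17))  # 17 degrees every time (up to rounding)
```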

I estimate that Australia spans about 34°. Which means that, if you could see it through the Earth, it would take up about 17° of your vision. Also, it wouldn’t matter where you are on the planet, it would always be 17°. Unless you’re in Australia. The size of wherever you are is always 180°. Unless you’re on the beach or something (90°?).

Lucky for us (people), we all scale about the same.  There are some (literal) rules of thumb that you can use to estimate angles.  From standing, your feet are about 10°.  With your arm outstretched, the width of your thumb is about 1.5°, and your fist is about 7°.

So if you could see Australia through the ground, it would span about two and a half fists-at-arms-length, or a little less than two of your-own-feet-while-standing.  If you could see the other hemisphere (pick one), then it would appear to be exactly 90° across.

What follows is answer gravy:

Finally, for those of you who want to find exact arcangles on the Earth’s surface: If you have two locations at latitudes \gamma and \phi, and the difference in longitudes is \theta, then the true arcangle between them is:

\cos^{-1}{\left(\cos{(\gamma)}\cos{(\phi)}\cos{(\theta)}+\sin{(\gamma)}\sin{(\phi)}\right)}

Also, if you multiply this number (in radians) by 6365, then you’ve got the distance between those points in km (as the crow flies).
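The arcangle formula is short enough to drop straight into code. A minimal sketch, where the function names `arc_angle` and `distance_km` are made up, inputs are taken in degrees for convenience, and 6365 is the radius figure quoted above:

```python
import math

EARTH_RADIUS_KM = 6365.0  # the figure quoted above

def arc_angle(lat1_deg, lat2_deg, dlon_deg):
    """Arcangle (in radians) between two points with the given latitudes and
    the given difference in longitude, all supplied in degrees."""
    g, f, t = (math.radians(v) for v in (lat1_deg, lat2_deg, dlon_deg))
    c = math.cos(g) * math.cos(f) * math.cos(t) + math.sin(g) * math.sin(f)
    return math.acos(max(-1.0, min(1.0, c)))  # clamp so rounding can't break acos

def distance_km(lat1_deg, lat2_deg, dlon_deg):
    """As-the-crow-flies distance: radius times the arcangle in radians."""
    return EARTH_RADIUS_KM * arc_angle(lat1_deg, lat2_deg, dlon_deg)

print(math.degrees(arc_angle(0, 0, 90)))  # a quarter of the way around the equator
print(distance_km(90, -90, 0))            # pole to pole
```

Halving the arcangle of something's width gives how big it looks from the far side of the planet, per the inscribed angle theorem.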

Q: How is it that Bell’s Theorem proves that there are no “hidden variables” in quantum mechanics? How do we know that God really does play dice with the universe?

Tuesday, June 22nd, 2010

Physicist: Bell’s theorem, and its philosophical fallout, is one of the most profound discoveries since relativity.

Bell’s theorem states (among other things) that the universe is fundamentally unpredictable, and that quantum mechanical things (for example: everything) are not actually in one state. If a box could contain either a blue marble or a red marble, then when you open it you’ll see either one or the other. In “reality” it was one color or the other before you open the box. In QM, it can be both before you open the box (it’s actually still both afterwards, but moving on…).

Einstein (and most other physicists of the time) believed that if you knew everything about a system of particles (no matter how big) that you could theoretically predict what that system will be doing in the future, perfectly. Homeboy thought that the only reason that the movement of air molecules seems to be random, is that we can’t perfectly measure the exact position and velocity of every single one. So he thought that every particle truly is in some particular state, but that we merely don’t know for sure what that state is (the marble in the box has only one color, but we don’t know what it is).

The idea that randomness and unpredictability are caused by unknown (or unknowable) things is called “hidden variable theory” (The ‘Stein believed in this).  For example; 2, 2, 3, 6, 0, 6, 7, 9, 7, 7, 4, 9, 9, … is not random, but seems random.  It would be really hard to predict the next term (7) if you don’t know the hidden variable.  (BTW, the “hidden variable” is: this is the decimal expansion of \sqrt{5})
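Once you know the hidden variable, the "random" sequence is a two-liner. A small sketch using Python's standard `decimal` module (the 30-digit precision is an arbitrary choice):

```python
from decimal import Decimal, getcontext

getcontext().prec = 30  # work with 30 significant digits
digits = [int(ch) for ch in str(Decimal(5).sqrt()) if ch.isdigit()]

print(digits[:13])  # [2, 2, 3, 6, 0, 6, 7, 9, 7, 7, 4, 9, 9]
print(digits[13])   # 7, the "really hard to predict" next term
```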

Bell’s theorem essentially boils down to a proof that the result of an experiment doesn’t exist until the measurement is made (so it can’t be predicted).  Hidden variable theory presupposes that the particles involved are in definite states, which means that the result of a measurement already exists before the measurement is made.  For example: before you open a gift what you’ll see is already set in stone.  The gift is a set thing before you open the box.  This is not the case for most quantum mechanical systems.

Here’s one of the experiments that demonstrates Bell’s theorem, and two ways to look at it.

An entangled pair of photons is created and fired in opposite directions. En route the polarizers are randomly oriented, then the detectors measure whether or not the photons pass through. This is done hundreds of thousands of times to measure the relationship between 1) the difference in angles between the polarizers and 2) the probability of measuring the same result.

The experiment: Step 1: Create a pair of entangled photons and fire them in opposite directions.  Entangled particles always yield the same result when they are subjected to the same measurement, and are likely to yield the same result for similar measurements.

Step 2: Randomly orient the polarizers, after the entangled pair is created, but before either is detected (this is hard to time, and is really fast). This is done so that the photons “don’t know what to expect” and “can’t compare notes”. Information about polarizer A would have to travel faster than the speed of light to get to photon B before photon B hits its own polarizer. So, without faster than light effects (which don’t exist for many, really good reasons) the photons are each acting independently. The orientation is random so that the photons can’t “plan ahead”.

Step 3: Measure the polarization.  If the detector “clicks” then the photon made it through the polarizer, and therefore has the same polarization as the polarizer.  If the detector doesn’t click, then the photon had the opposite polarization and was stopped.

The probability of the measurements being the same (for an entangled pair) is P = \cos^2{(\theta)}, where \theta is the difference in angles between the polarizers.  It is tricky to see why, but this probability is impossible if you assume that the result of a measurement exists before the measurement is made.  Here’s why.

The possible polarizations for polarizer A (red) and polarizer B (blue).

Algebraic approach: Restricting the possible angles of the polarizers to 0° and 45° for A, and 22.5° and 67.5° for B, run the experiment. Here’s what’s about to happen:

1) If you could predict the outcome of each version of the experiment, then you could find a definite value of L (see below).

2) For strictly (unarguable) mathematical reasons L = ±2.

3) Experimentally we find that the average value of L is 2√2.

4) But this is a contradiction, so we cannot actually make useful predictions.

Now it’s happening:

If polarizer A is at 0° and the detector clicks then you’d say “A0 = 1”, and if the detector doesn’t click then “A0 = -1”.  Similarly, you can define B67.5, A45, and B22.5.  Just for the hell of it, take a look at: L = A0B22.5 + A45B22.5 + A45B67.5 - A0B67.5 = (A0 + A45)B22.5 + (A45 - A0)B67.5

L = (A0 + A45)B22.5 + (A45 - A0)B67.5 = ±2, since either (A0 + A45) = ±2 and (A45 - A0) = 0, or (A0 + A45) = 0 and (A45 - A0) = ±2.  So L = A0B22.5 + A45B22.5 + A45B67.5 - A0B67.5 = ±2 ≤ 2.

So if you could fill out each of these values (A0, A45, B22.5, B67.5), then L = ±2 ≤ 2.
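If you'd rather not trust the algebra, the claim that L = ±2 can be checked by brute force. This is a small sketch that runs through all 16 ways of assigning +1 or -1 to the four values; the variable names are just stand-ins for A0, A45, B22.5 and B67.5.

```python
from itertools import product

# Every possible "pre-existing" set of answers: each value is +1 (click) or -1 (no click).
possible_L = set()
for a0, a45, b22_5, b67_5 in product((1, -1), repeat=4):
    L = a0 * b22_5 + a45 * b22_5 + a45 * b67_5 - a0 * b67_5
    possible_L.add(L)

print(sorted(possible_L))  # [-2, 2]
```

No assignment of definite values ever gives anything other than 2 or -2.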

However, you can’t make all of these measurements simultaneously, so you can’t actually get A0B22.5 + A45B22.5 + A45B67.5 - A0B67.5 for each run of the experiment.  The best you can do is find one of these four terms each time you run the experiment.  For example, if the polarizer A was randomly set to 45° and the detector clicked, and polarizer B was randomly set to 22.5° and the detector didn’t click, then you just found out that A45B22.5 = (1)(-1) = -1 for that run.

You can however find the expectation value by running the experiment over and over and keeping track of the results and polarizer orientation.

E[A0B22.5] + E[A45B22.5] + E[A45B67.5] - E[A0B67.5] = E[A0B22.5 + A45B22.5 + A45B67.5 - A0B67.5] ≤ E[|A0B22.5 + A45B22.5 + A45B67.5 - A0B67.5|] = E[2] = 2.

So E[A0B22.5] + E[A45B22.5] + E[A45B67.5] - E[A0B67.5] ≤ 2.  This is one version of “Bell’s Inequality”, and it holds if each term (A0, A45, B22.5, B67.5) has a value.

Using the fact that the chance of getting the same result is P = \cos^2{(\theta)}, and that each term is 1 when the results are the same ((1)(1) or (-1)(-1)), and -1 when the results are different ((1)(-1) or (-1)(1)), you can calculate each term.  For example:

E[A_0B_{22.5}]=P(same)-P(different)=\cos^2{(22.5)}-(1-\cos^2{(22.5)})=\frac{1}{\sqrt{2}}

You’ll find that:

E[A_0B_{22.5}]+E[A_{45}B_{22.5}]+E[A_{45}B_{67.5}]-E[A_0B_{67.5}]=\frac{1}{\sqrt{2}}+\frac{1}{\sqrt{2}}+\frac{1}{\sqrt{2}}-\frac{-1}{\sqrt{2}}=2\sqrt{2}
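The four expectation values can be worked out numerically from P = \cos^2{(\theta)}. This is a rough sketch; the function name `correlation` is invented here, and it just encodes E = P(same) - P(different).

```python
from math import cos, isclose, radians, sqrt

def correlation(angle_a: float, angle_b: float) -> float:
    """E[AB] for polarizers at the given angles (degrees), using P(same) = cos^2(difference)."""
    p_same = cos(radians(angle_a - angle_b)) ** 2
    return p_same - (1 - p_same)

total = (correlation(0, 22.5) + correlation(45, 22.5)
         + correlation(45, 67.5) - correlation(0, 67.5))

print(total, 2 * sqrt(2))
print(isclose(total, 2 * sqrt(2)))
```

The total lands on 2\sqrt{2} (about 2.83), comfortably above the limit of 2.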

Holy crap!  2\sqrt{2}>2!  But that’s a violation of Bell’s inequality!  But the existence of each measurement (whether or not you actually do that measurement) is all you need for Bell’s inequality!  So if the inequality is false, then the results of those measurements don’t exist if the measurements aren’t made!

God plays dice with the universe.

Maybe, if you're clever and have ready access to a time machine, you could go back and do all the measurements you didn't make the first time. Then all the results would have to exist! They'd just have to!

Me and my time machine vs. quantum mechanics: If the results exist, but you just didn’t happen to do all the measurements, why not get a time machine?  Then you could do one measurement, go back, do a different measurement, go back, do a different measurement, …  Then every possible result would be known.

However, once again that correlation probability (P = \cos^2{(\theta)}) screws things up.

So, for example, if the photon goes through at 50°, and then you go back in time, change the polarizer to 51°, and repeat the experiment, then there’s a 99.97% chance (\cos^2{(1^\circ)} = 0.9997) that the photon will go through again.

One result from probability says that P(x=z)\ge P(x=y)+P(y=z)-1.  Do this twice and you get P(w=z)\ge P(w=x)+P(x=z)-1\ge P(w=x)+P(x=y)+P(y=z)-2.  So if you measure in the 0° direction to find A0, then go back and change the angle by 1° and repeat this until you’re measuring at 90°, then:

P(A_0=A_{90})\ge P(A_0=A_1)+P(A_1=A_2)+\cdots+P(A_{89}=A_{90})-89 =90\cos^2{(1^\circ)}-89=0.9726

So, if you go back and forth in time to measure whether or not the photon goes through at 1° increments, then there’s at least a 97% chance that by the time you get to 90° you’ll be getting the same result you did at 0°.  However, in reality P(A_0=A_{90})=\cos^2{(90^\circ)}=0.

But this is a contradiction.  So the results of each measurement (A0, A1, A2, …, A90) can’t all exist.
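The numbers in the time machine argument are quick to reproduce. A minimal sketch, assuming nothing beyond the \cos^2 rule and the chained inequality above:

```python
from math import cos, radians

steps = 90                               # 0° to 90° in 1° increments
p_adjacent = cos(radians(1)) ** 2        # chance two neighboring measurements agree
lower_bound = steps * p_adjacent - (steps - 1)
p_quantum = cos(radians(90)) ** 2        # what the cos^2 rule says directly for 0° vs 90°

print(round(p_adjacent, 4))    # 0.9997
print(round(lower_bound, 4))   # 0.9726
print(p_quantum < 1e-12)       # True (zero, up to floating-point rounding)
```

The chained bound says "at least 97%", and the direct rule says "never". They can't both hold.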

If I had to guess, every time you go back in time the experiment is completely reset, and the experiment becomes completely random again.  The reason (such as it is) is below this unsettling picture.

Wait. Wait... Why?

But why?!: It turns out that the reason that the results of a quantum event can’t be predicted, is that every possible result of that event plays out.  So if you ask “will I see the photon go through the polarizer?” the answer is “yes, some versions of you will see the photon go through” and an equally valid answer is “no, some versions of you will not”.

If different versions of you will see every possible result, then the result can’t be predicted, and doesn’t really exist one way or the other until after the measurement is done.  At that time the different versions of you will disagree on the result.  But don’t worry too much.  You’ll never meet your parallel-universe twins.

Q: Can math and science make you better at gambling?

Sunday, May 23rd, 2010

Physicist: Yes.

Don’t gamble (mathematically speaking).

Mathematician: Gambling (at casinos, in lotteries, and in most other instances) is expected value negative, even when you play optimally. That means that the average amount of money you will make per play is negative (i.e. you will lose money, on average). It also implies (via the Central Limit Theorem) that when you play many times, there will be a greater than 50% chance that you lose money overall.
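To put rough numbers on this, here is a sketch using an example that is not from the discussion above: an even-money bet on red at an American roulette wheel, which wins with probability 18/38. The function name `prob_behind` is made up for the illustration, and it computes the exact binomial probability rather than leaning on the Central Limit Theorem.

```python
from math import comb

P_WIN = 18 / 38  # assumption: even-money bet on red, American roulette (18 red out of 38 pockets)

def prob_behind(n_bets: int, p_win: float = P_WIN) -> float:
    """Exact chance of having lost money overall after n_bets equal even-money bets."""
    # You are behind exactly when you have won fewer than half of the bets.
    return sum(
        comb(n_bets, wins) * p_win**wins * (1 - p_win) ** (n_bets - wins)
        for wins in range(n_bets + 1)
        if 2 * wins < n_bets
    )

expected_per_bet = P_WIN * 1 + (1 - P_WIN) * (-1)  # average winnings per $1 bet
print(expected_per_bet)
for n in (1, 100, 1000):
    print(n, prob_behind(n))
```

The expected value per bet is negative, and the chance of being behind climbs toward certainty the longer you play.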

There are a few exceptions to this rule. Card counting in games like blackjack can be expected value positive if you are very good at it and increase your bet sizes at the right times, but casinos are savvy and make this difficult (e.g. by shuffling many decks of cards together). Plus, if you get caught, you will get banned, or worse. Betting against friends can also be expected value positive if you can be sure you’re really a better gambler than they are. Poker at casinos (where you play against other gamblers) can also be expected value positive if you’re very good, but since the casino charges you to play, even if you’re better than the other players you may not (on average) come out positive.

A good rule of thumb is this: If you don’t enjoy gambling, then simply don’t do it since (on average) it will just waste your money. If you do enjoy it, then think of it as a recreational activity, not as a way to make money. Before starting, decide how much money you are willing to lose, and if you use that money up, quit immediately. If you assume that you are going to lose then you won’t be disappointed.

All of this being said, if you do decide to gamble, you’d better study up on your probability! It’s important to know what the best course of action is (probabilistically speaking) at each decision point in the game.

Q: What do complex numbers really mean or represent?

Monday, May 3rd, 2010

Physicist: Nothing really.

Complex numbers are very useful for streamlining a lot of different types of math, generalizing ideas, and “closing” the real numbers.  In quantum theory you’ll find that on the most fundamental level the universe seems to prefer complex numbers to real numbers.

But you can’t use them to count or measure stuff for crap, so most people can live long, happy lives without being particularly bothered by complex numbers.  If you can call that living.

Mathematician: There are many ways to view complex numbers, but one of the most intuitive is to think of them as representing points in the plane. Doing so will allow us to interpret basic arithmetic operations like addition and multiplication as performing geometric manipulations.

How does this work? Well, every complex number z can be written as z = x + i y  where x and y are real numbers, and i is the square root of -1. We can then think of z as a point on the (x,y) plane with x being the position along the horizontal (i.e. real) axis, and y the position along the vertical (i.e. imaginary) axis. To understand the operation of adding and multiplying complex numbers though, we need to think about them slightly differently.

The complex number 2+3i represented as a point in the complex plane. Have you ever seen a more boring image in your life?

If we choose, we can treat complex numbers as vectors rather than points. That means that now z = x+i y  represents not the point (x,y) but rather, the directed line segment which extends from (0,0) in the complex plane to (x,y) in the complex plane (unless otherwise stated we will assume here that our vectors are emanating from the origin (0,0)). You can think of a vector as simply representing a magnitude (or length) and a direction. The line segment extending from (1,2) to (-4,5) is the same as the one extending from (-4,5) to (1,2), but the vectors represented would be different because they would be pointing in opposite directions. Vectors are often drawn as arrows.

The complex number 2+3i represented as a vector in the complex plane. Compared to our last image, this is a hoot.

Once we have the vector notion of a complex number, we can think about adding complex numbers as adding vectors. For example, if we have

z_{1} = x_{1} + i y_{1}, \quad z_{2} = x_{2} + i y_{2}

then

z_{1} + z_{2} = (x_{1}+x_{2}) + i (y_{1}+y_{2}).

Hence, z_{1} + z_{2} is the vector in the complex plane that extends from (0,0) to (x_{1}+x_{2}, y_{1}+y_{2}). Note that (x_{1}+x_{2}, y_{1}+y_{2}) is the point we get to if we piggy back vectors z1 and z2 and then follow them both. By this I mean that we translate (i.e. move without rotating or stretching) z2 so that its beginning is at the end of z1, and then we follow the path consisting of the two vectors until we get to the (new) end of z2. However, since z_{1} + z_{2} = z_{2} + z_{1}, we can also piggy back z1 at the end of z2 to get the same result. A slightly different view is achieved if we think of our first number z_{1} as being a vector, and our second number z_{2} as being a point. In this case, z_{1} + z_{2} simply corresponds to moving the point z_{2} by the distance and direction represented by the vector z_{1}.

Here are two complex numbers represented as vectors, 2+3i and 1+3i.

When we sum the complex numbers 2+3i and 1+3i we get the complex number 3+6i.
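Python has complex numbers built in (it writes i as `j`), so the vector addition above can be tried directly. A small sketch using the same two numbers:

```python
z1 = complex(2, 3)   # 2+3i
z2 = complex(1, 3)   # 1+3i

total = z1 + z2
print(total)               # (3+6j)
print(z1 + z2 == z2 + z1)  # True: piggy-backing in either order ends at the same point

# The same thing done by hand, adding the horizontal and vertical parts separately.
by_parts = complex(z1.real + z2.real, z1.imag + z2.imag)
print(by_parts == total)   # True
```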

Alright, so addition of complex numbers can be thought of as adding vectors in the complex plane (or moving a point by the distance and direction stored in a vector), but what the heck does multiplication do? Well, to understand multiplication we need yet another geometric way of thinking about complex numbers.

First though, we make a quick (and relevant) aside. For any complex number z, we have by definition that the absolute value |z| of z satisfies

|z| = \sqrt{z \overline z} = \sqrt{(x+iy)(x-iy)} = \sqrt{x^{2} + y^{2}} = \sqrt{(x-0)^{2} + (y-0)^{2}}

which is precisely the formula for the distance between the point (0,0) and the point (x,y). Hence, |z| measures the length of the vector that z represents. Now, according to Euler’s formula, we have that for any real number \theta

e^{i \theta} = cos(\theta) + i sin(\theta).

hence e^{i \theta} is a complex number since it is the sum of a real and imaginary number. In particular, we have

|e^{i \theta}|^{2} = e^{i \theta} \overline{ e^{i \theta} } = e^{i \theta} e^{-i \theta} = e^{i \theta - i \theta} = e^{0} = 1

That means that any complex number representable in this form must have a distance of 1 from the origin. In other words, such numbers represent points on the unit circle, or equivalently, vectors of length 1 extending from the origin. As it turns out, all vectors in the complex plane with length 1 can be represented in this form. But what angle does each of these vectors point at? Well, when x is positive, we compute the angle of the line segment (0,0) to (x,y) with respect to the horizontal axis using arctan(y/x) (for points in the left half of the plane the same idea works, but the result is off by 180°). In our case though, since x=cos(\theta) and y = sin(\theta) this gives:

arctan(y/x) = arctan(sin(\theta)/cos(\theta)) = arctan(tan(\theta)) = \theta.

Hence, the complex number e^{i \theta} represents a vector of length 1 pointed at angle \theta. Similarly, for any non-negative number r, we have that the complex number

r e^{i \theta}

represents a vector of length r pointed at angle \theta . This provides a new way of thinking about a complex number: as a vector specified by its length and angle. What happens when we multiply two such numbers z_{1} = r_{1} e^{i \theta_{1}} and z_{2} = r_{2} e^{i \theta_{2}} together? Well, we have

z_{1} z_{2} = r_{1} e^{i \theta_{1}} r_{2} e^{i \theta_{2}}

= r_{1} r_{2} e^{i (\theta_{1} + \theta_{2})} .

Hence, z_{1} z_{2} is a new complex number representing a vector of length r_{1} r_{2} pointed at angle \theta_{1} + \theta_{2}. Therefore, we can think of the action of z_{1} multiplying z_{2} as causing the vector z_{2} to stretch by a factor of r_{1} and rotate by an angle \theta_{1}.
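The stretch-and-rotate picture can be checked numerically with Python's `cmath` module. This is a sketch with arbitrarily chosen lengths and angles (they are not from the text); `cmath.rect` builds r e^{i \theta} and `cmath.polar` recovers the length and angle.

```python
import cmath
from math import isclose, pi

# Hypothetical example values, picked only for illustration.
r1, theta1 = 2.0, pi / 6
r2, theta2 = 1.5, pi / 4

z1 = cmath.rect(r1, theta1)   # r1 * e^{i theta1}
z2 = cmath.rect(r2, theta2)   # r2 * e^{i theta2}

product_length, product_angle = cmath.polar(z1 * z2)
unit_length = abs(cmath.exp(1j * theta1))   # |e^{i theta}| should be 1

print(isclose(product_length, r1 * r2))          # lengths multiply
print(isclose(product_angle, theta1 + theta2))   # angles add
print(isclose(unit_length, 1.0))
```

One caveat: `cmath.polar` reports angles between -\pi and \pi, so if the two angles add up to more than half a turn the reported angle wraps around. The angles here were chosen to avoid that.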

Hence, to recap, we can view complex numbers geometrically as representing points or vectors in the complex plane. If we do this, then adding complex numbers corresponds to adding together vectors, or equivalently, moving the point that the second complex number represents along the vector that the first complex number represents. On the other hand, we can think of multiplication of complex numbers as corresponding to scaling and rotating the second complex number in the multiplication by the length and angle inherent in the first complex number. Finally, we note that taking the absolute value of a complex number corresponds to measuring the length of the corresponding vector. Therefore, one way to view complex numbers is as a means for converting geometric operations (translation, rotation, scaling) into algebraic operations (adding, multiplication) and back again. As you might imagine, this can be extremely useful!

Q: What would happen if an unstoppable force met with an unmovable, impenetrable object?

Thursday, April 22nd, 2010

Mathematician: Sometimes, when we don’t use language carefully enough, we can get ourselves into philosophical trouble. For example, consider the following question:

If a barber shaves all those men (and only those men) who do not shave themselves, does he shave himself?

If the barber shaves himself, then he is shaving a man who shaves himself, which is something that (by definition) he does not do. On the other hand, if the barber does not shave himself, then there is a man who doesn’t shave himself that the barber doesn’t shave, which again contradicts our definition of the barber.

So what is the answer? Well, the question has no answer, because the definition we use for our barber contains within it a logical contradiction. What’s more, it is impossible for such a barber to actually exist in the real world, since the razor burn associated with simultaneously shaving yourself and not shaving yourself is too much for any single person to withstand.
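The contradiction can even be checked mechanically. In this toy sketch, the only thing we need to decide is whether the barber shaves himself, and the barber's definition (applied to the barber) demands that the answer equal its own opposite.

```python
# The barber shaves a man exactly when that man does not shave himself.
# Applied to the barber himself: shaves_himself must equal (not shaves_himself).
consistent_answers = [
    shaves_himself
    for shaves_himself in (True, False)
    if shaves_himself == (not shaves_himself)
]

print(consistent_answers)  # []
```

Neither "yes" nor "no" survives, which is just another way of saying the definition contradicts itself.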

Now, let’s return to the original question:

What would happen if an unstoppable force met with an unmovable, impenetrable object?

Well, let’s suppose that we define an “unstoppable” force to be one that can move absolutely any matter. Furthermore, let’s define an “unmovable” object to be one that cannot be moved by any force. In that case, this question is unanswerable, because like the barber paradox above, it relies on contradictory information. By definition our force can move anything, but then, also by definition, there is an object that the force cannot move. This is a bit like saying “suppose X is true, and not X is true. Then is X true?”. Here  X is the idea that the force can move anything, and not X is the idea that there is at least one object that cannot be moved by the force (which in this case is our unmovable object). Hence, this question has no answer because it relies on assumptions which contradict each other.

Q: How do I count the number of ways of picking/choosing/taking k items from a list/group/set of n items when order does/doesn’t matter?

Monday, April 12th, 2010

Mathematician: Suppose that we have a list containing three items, {A,B,C}, and we want to know how many different ways there are of choosing two items from this list. If we care about the order that items are selected from the list, then the possibilities are

{A,B}, {A,C}, {B, A}, {B, C}, {C,A}, {C,B}

(where {A,B} here means that we’ve selected item A and then item B from the list {A,B,C} ) so the answer is that there are six possible ways of choosing two items out of three. If on the other hand we don’t care about what order the items were selected in (so {A,B} is considered the same as {B,A}) then all the possible unique arrangements are

{A,B}, {A,C}, {B, C}

so the answer is that there are three possible ways of choosing two items out of three when order doesn’t matter.
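For small lists a computer can do the writing-down for us. A quick sketch with Python's `itertools`, where `permutations` treats order as mattering and `combinations` does not:

```python
from itertools import combinations, permutations

items = ["A", "B", "C"]

ordered = list(permutations(items, 2))     # order matters
unordered = list(combinations(items, 2))   # order doesn't matter

print(ordered)
print(unordered)
print(len(ordered), len(unordered))  # 6 3
```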

When there are small numbers of items, it isn’t difficult to just write down all the possible combinations (when order doesn’t matter) or permutations (when order does matter). But what do we do when there are larger numbers of items? For example, it turns out there are 15,504 different ways to choose 5 items out of 20 (when the order of items selected doesn’t matter), far too many to write down. In cases like these we want to use a formula that depends only on n (the number of items in our set) and k (the number of items we will be selecting from it) that can quickly give us the answer we need. Let’s see if we can figure out what this formula should be.

For the time being we are going to assume that the order we select items in does matter (so the selection of A followed by the selection of B is not the same as the selection of B followed by the selection of A). Now notice that since we have n items in total, there are n choices for the first item that we pick. Once we’ve chosen the first item, there are n-1 items left in our list, so for our second selection we have n-1 possible choices. The number of total possible ways of choosing the first two items is therefore n * (n-1) because for each of the n first items we could choose we have n-1 second items possible. Now, for the 3rd item selected there are n-2 items left in the list (since we’ve already used up 2 items out of a total of n), which means that in choosing three items there are a total of n (n-1) (n-2) possible permutations. This pattern continues for each of the k items we are choosing, so that we find that the total number of ways of choosing k items is

(n)(n-1)(n-2)…(n-k+2)(n-k+1)

which can be written using the factorial function as

\frac{n!}{(n-k)!}
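As a sanity check, here is a sketch that multiplies out n(n-1)…(n-k+1) directly and compares it with the factorial form. The function name `ordered_selections` is invented for this example; `math.perm` is Python's built-in version (available in Python 3.8 and later).

```python
from math import factorial, perm

def ordered_selections(n: int, k: int) -> int:
    """Number of ways to pick k items from n when order matters: n(n-1)...(n-k+1)."""
    count = 1
    for already_chosen in range(k):
        count *= n - already_chosen   # choices left for the next pick
    return count

print(ordered_selections(3, 2))                                   # 6, matching the {A,B,C} example
print(ordered_selections(20, 5) == factorial(20) // factorial(15))
print(ordered_selections(20, 5) == perm(20, 5))
```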

Remember however that in this analysis the order of the items selected made a difference. If what we are interested in is the number of ways of choosing k items from a list of n such that order does NOT matter (i.e. in each selection all that matters is which items are in that selection, not which order those items were chosen) then we have to adjust our formula somewhat by making the following considerations.

If we are going to be choosing k items, then how many different orderings of those k items exist? Well, there are k possible choices for which item goes first. After we have chosen which one goes first, there are k-1 that can go second. This leads to a total number of k*(k-1) arrangements for the first two items. Then, there are k-2 items to choose from for the 3rd item, and so on, leading to k*(k-1)*(k-2)*(k-3)*…*3*2*1 total arrangements. Note though that this is just the same as the definition of k factorial, so we just write k! to represent the expression. Now, we observe that the number of ways to choose k items from n such that order matters is k! times bigger than the number of ways to choose k items from n where order doesn’t matter. The reason is that for each of the ways we can make a selection of k items when order doesn’t matter, there are k! different orderings in which we could have chosen each of our k selected items. Hence, since there are  \frac{n!}{(n-k)!} possible choices when order matters, but this is k! times greater than the case when order doesn’t matter, we have that the total number of different possible selections when order doesn’t matter is just given by

\frac{n!}{k! (n-k)!}

This happens to be the definition of what is called the “choose function”, also known as the binomial coefficient (these are the numbers that appear in Pascal’s triangle), which mathematicians write as {n\choose k}.
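The formula is short enough to type in and test. A minimal sketch, where `choose` is our own helper name and `math.comb` is Python's built-in equivalent (Python 3.8 and later):

```python
from math import comb, factorial

def choose(n: int, k: int) -> int:
    """n! / (k! (n-k)!): ways to pick k items from n when order doesn't matter."""
    return factorial(n) // (factorial(k) * factorial(n - k))

print(choose(3, 2))                   # 3, the {A,B,C} example
print(choose(20, 5))                  # 15504
print(choose(20, 5) == comb(20, 5))   # True
```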

Now, to put our work into action. Let’s suppose that we have ten salad ingredients (peppers, avocado, pears, walnuts, beans, peas, corn, croutons, soy beans and olives) and we want to know how many distinct salads can be made using just two ingredients. Well, if our salad is mixed up, then it doesn’t matter what order we put the ingredients in, so this is equivalent to the problem of asking how many ways there are to select two items from a list of ten when order doesn’t matter. This is given by

{10 \choose 2} = \frac{10!}{2! (10-2)!} = \frac{(10)(9)(8!)}{(2)(8!)} = \frac{10*9}{2}=5*9 = 45

so there are 45 two ingredient salads you can make from ten ingredients. How many distinct salads can you make that have anywhere from 2 to 10 of our ingredients? The answer is just the number of two ingredient salads plus the number of three ingredients salads plus the number of four ingredient salads, etc. up to the number of ten ingredient salads. In mathematical notation, this is

{10 \choose 2} + {10 \choose 3} + {10\choose 4} + \cdots + {10 \choose 9} + {10 \choose 10}

= 45 + 120 + 210 + 252 + 210 + 120 + 45 + 10 + 1  = 1013

which is simply the number of salads with two or more ingredients that could be made from our ten ingredients.
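The salad count can be double-checked in a couple of lines. This sketch also uses a shortcut that isn't spelled out above: each of the ten ingredients is either in or out, giving 2^{10} possible subsets, from which we remove the one empty salad and the ten single-ingredient salads.

```python
from math import comb

counts = [comb(10, k) for k in range(2, 11)]   # salads with 2, 3, ..., 10 ingredients
total = sum(counts)

print(counts)
print(total)                      # 1013
print(total == 2**10 - 1 - 10)    # True: all subsets minus the empty and one-ingredient ones
```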