Q: Why are numerical methods necessary? If we can’t get exact solutions, then how do we know when our approximate solutions are any good?

Physicist: When a problem can be solved exactly and in less time than forever, then it is “analytically solvable”.  For example, “Jack has 2 apples and Jill has 3 apples, how many apples do they have together?” is analytically solvable.  It’s 5.  Exactly 5.

Precisely solving problems is what we often imagine that mathematicians are doing, but unfortunately you can’t swing a cat in the sciences without hitting a problem that can’t be solved analytically.  In reality “doing math” generally involves finding an answer rather than the answer.  While you may not be able to find the exact answer, you can often find answers with “arbitrary precision”.  In other words, you can find an approximate answer and the more computer time / scratch paper you’re willing to spend, the closer that approximation will be to the correct answer.

A lot of math problems can’t be directly solved.  For example: most.

A trick that lets you get closer and closer to an exact answer is a “numerical method”.  Numerical methods do something rather bizarre: they find solutions close to the answer without ever knowing what that answer is.  As such, an important part of every numerical method is a proof that it works.  So there’s the answer: we need numerical methods because a lot of problems are not analytically solvable, and we know the methods work because each one comes packaged with a proof that it does.

It’s remarkable how fast you can stumble from solvable to unsolvable problems.  For example, there is an analytic solution for the motion of two objects interacting gravitationally, but no general solution for three or more objects.  This is why we can prove that two objects orbit in ellipses and must use approximations and/or lots of computational power to predict the motion of three or more objects.  This inability is the infamous “three body problem”.  It shows up in atoms as well; we can analytically describe the shape of electron orbitals and energy levels in individual hydrogen atoms (1 proton + 1 electron = 2 bodies), but for every other element we need lots of computer time to get things right.

Even for purely mathematical problems the line between analytically solvable and only numerically solvable is razor thin.  Questions with analytic solutions include finding the roots of 2nd degree polynomials, such as $0=x^2+2x-3$, which can be done using the quadratic equation:

$x=\frac{-(2)\pm\sqrt{(2)^2-4(1)(-3)}}{2(1)}=\frac{-2\pm4}{2}=-3\,or\,1$

The quadratic equation is a “solution by radicals”, meaning you can find the solution using only the coefficients in front of each term (in this case: 1, 2, -3).  There’s a solution by radicals for 3rd degree polynomials and another for 4th degree polynomials (they’re both nightmares, so don’t).  However, there can never be a solution by radicals for 5th or higher degree polynomials.  If you wanted to find the solutions of $2x^5-3x^4+\pi x^3+x^2-x+\sqrt{3}=0$ (and who doesn’t?) there is literally no way to find an expression for the exact answers.
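Since no formula exists, a root of that quintic has to be hunted down numerically.  Here’s a minimal sketch using bisection (a numerical method not otherwise covered here): if a smooth function is negative at one end of an interval and positive at the other, a zero must be hiding somewhere in between, and repeatedly halving the interval corners it.

```python
import math

# The quintic from above: no solution by radicals exists, but bisection
# can still approximate a real root to whatever precision we like.
def f(x):
    return 2*x**5 - 3*x**4 + math.pi*x**3 + x**2 - x + math.sqrt(3)

def bisect(f, lo, hi, tol=1e-12):
    """Repeatedly halve an interval on which f changes sign."""
    assert f(lo) * f(hi) < 0, "need a sign change to start"
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid   # the sign change (and so a root) is in [lo, mid]
        else:
            lo = mid   # the sign change is in [mid, hi]
    return (lo + hi) / 2

# f(-1) < 0 < f(0), so a root hides somewhere in (-1, 0)
root = bisect(f, -1.0, 0.0)
```

Forty or so halvings pins the root down to about a dozen decimal places, despite there being no expression for it.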

Numerical methods have really come into their own with the advent of computers, but the idea is a lot older.  The decimal expansion of $\pi$ (3.14159…) never ends and never repeats, which is a big clue that you’ll never find its value exactly.  At the same time, it has some nice properties that make it feasible to calculate $\pi$ to arbitrarily great precision.  In other words: numerical methods.  Back in the third century BC, Archimedes realized that you could approximate $\pi$ by taking a circle with circumference $\pi$, then inscribing a polygon inside it and circumscribing another polygon around it.  Since the circle’s perimeter is always longer than the inscribed polygon’s and always shorter than the circumscribed polygon’s, you can find bounds for the value of $\pi$.

Hexagons inscribed (blue) and circumscribed (red) on a circle with circumference π.  The perimeters of such polygons, in this case $p_6=3$ and $P_6=2\sqrt{3}\approx3.46$, must always fall on either side of π≈3.14.

By increasing the number of sides, the polygons hug the circle tighter and produce a closer approximation, from both above and below, of $\pi$.  There are fancy mathematical ways to prove that this method approaches $\pi$, but it’s a lot easier to just look at the picture, consider for a minute, and nod sagely.

Archimedes’ trick wasn’t just noticing that $\pi$ must be between the lengths of the two polygons.  That’s easy.  His true cleverness was in coming up with a mathematical method that takes the perimeters of a given pair of k-sided inscribed and circumscribed polygons with perimeters $p_k$ and $P_k$ and produces the perimeters for polygons with twice the number of sides, $p_{2k}$ and $P_{2k}$.  Here’s the method:

$P_{2k}={\frac {2p_{k}P_{k}}{p_{k}+P_{k}}}\quad \quad p_{2k}={\sqrt {p_{k}P_{2k}}}$

By starting with hexagons, where $p_6=3$ and $P_6=2\sqrt{3}$, and doubling the number of sides 4 times Archie found that for inscribed and circumscribed enneacontahexagons $p_{96}=\frac{223}{71}\approx3.14085$ and $P_{96}=\frac{22}{7}\approx3.14286$.  In other words, he managed to nail down $\pi$ to about two decimal places: $3.14085<\pi<3.14286$.
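Archimedes’ doubling recipe is easy to replay on a computer.  The sketch below is a direct transcription of the two formulas above, starting from hexagons (the function name and structure are just illustrative choices):

```python
import math

def archimedes(doublings=4):
    """Start with hexagons (p6 = 3, P6 = 2*sqrt(3)) and repeatedly
    double the number of sides using Archimedes' recurrence."""
    p, P = 3.0, 2 * math.sqrt(3)  # inscribed and circumscribed perimeters
    n = 6
    for _ in range(doublings):
        P = 2 * p * P / (p + P)   # circumscribed 2k-gon (harmonic mean)
        p = math.sqrt(p * P)      # inscribed 2k-gon (geometric mean)
        n *= 2
    return n, p, P

n, p, P = archimedes(4)  # 6 -> 12 -> 24 -> 48 -> 96 sides
# p and P must bracket pi: p < pi < P
```

Four doublings lands on the 96-gons, and the resulting p and P straddle $\pi$ just as Archimedes’ bounds do.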

Some puzzlement has been evinced by Mr. Medes’ decision to stop where he did, with just two decimal places of $\pi$.  But not among mathematicians.  The mathematician’s ancient refrain has always been: “Now that I have demonstrated my amazing technique to the world, someone else can do it.”

To be fair to Archie, this method “converges slowly”.  It turns out that, in general, $p_n=n\sin\left(\frac{\pi}{n}\right)\approx\pi-\frac{\pi^3}{6}\frac{1}{n^2}$ and $P_n=n\tan\left(\frac{\pi}{n}\right)\approx\pi+\frac{\pi^3}{3}\frac{1}{n^2}$.  Every time you double n the errors, $\frac{\pi^3}{3}\frac{1}{n^2}$ and $\frac{\pi^3}{6}\frac{1}{n^2}$, get four times as small (because $2^2=4$), which translates to very roughly one new decimal place every two iterations.  $\pi$ never ends, but still: you want to feel like you’re making at least a little progress.
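You can check that “four times as small per doubling” claim directly, using the exact formula $p_n=n\sin\left(\frac{\pi}{n}\right)$ from above:

```python
import math

# Error in the inscribed perimeter p_n = n*sin(pi/n) as an estimate of pi.
def err(n):
    return math.pi - n * math.sin(math.pi / n)

# Doubling n should shrink the error by roughly a factor of 4,
# since the leading error term is (pi^3/6) / n^2.
ratio = err(96) / err(192)  # very close to 4
```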

Some numerical methods involve a degree of randomness and yet still manage to produce useful results.  Speaking of $\pi$, here’s how you can calculate it “accidentally”.  Generate n pairs of random numbers, (x,y), between 0 and 1.  Count up how many times $x^2+y^2\le1$ and call that number k.  If you do this many times, you’ll find that $\frac{4k}{n}\approx\pi$.

If you randomly pick a point in the square, the probability that it will be in the grey region is π/4.

As you generate more and more pairs and tally up how many times $x^2+y^2\le1$ the law of large numbers says that $\frac{k}{n}\to\frac{\pi}{4}$, since that’s the probability of randomly falling in the grey region in the picture above.  This numerical method is even slower than Archimedes’ not-particularly-speedy trick.  According to the central limit theorem, after n trials your estimate $\frac{4k}{n}$ is likely to be within about $\frac{1.64}{\sqrt{n}}$ of $\pi$.  That makes this a very slowly converging method; it takes around ten million trials before you can nail down “3.141”.  This is not worth trying.
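For what it’s worth, here’s a sketch of that dart-throwing scheme anyway; the function name and the fixed random seed are just choices made so the result is repeatable.

```python
import random

def pi_by_darts(n, seed=0):
    """Throw n random points at the unit square and count how many
    land inside the quarter circle x^2 + y^2 <= 1."""
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    k = sum(1 for _ in range(n)
            if rng.random()**2 + rng.random()**2 <= 1)
    return 4 * k / n

estimate = pi_by_darts(200_000)  # typically lands within ~0.01 of pi
```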

Long story short, most applicable math problems cannot be done directly.  Instead we’re forced to use clever approximations and numerical methods to get really close to the right answer (assuming that “really close” is good enough).  There’s no grand proof or philosophy that proves that all these methods work but, in general, if we’re not sure that a given method works, we don’t use it.

Answer Gravy: There are a huge number of numerical methods and entire sub-sciences dedicated to deciding which to use and when.  Just for a more detailed taste of a common (fast) numerical method and the proof that it works, here’s an example of Newton’s Method, named for little-known mathematician Wilhelm Von Method.

Newton’s method finds (approximates) the zeros of a function, f(x).  That is, it finds a value, $\lambda$, such that $f(\lambda)=0$.  The whole idea is that, assuming the function is smooth, when you follow the slope at a given point down you’ll find a new point closer to a zero/solution.  All polynomials are “smooth”, so this is a good way to get around that whole “you can’t find the roots of 5th or higher degree polynomials” thing.

The “big idea” behind Newton’s Method: pick a point (xn), follow the slope, find yourself closer (xn+1), repeat.

The big advantage of Newton’s method is that, unlike the two $\pi$ examples above, it converges preternaturally fast.

The derivative is the slope, so $f^\prime(x_n)$ is the slope at the point $(x_n,f(x_n))$.  Considering the picture above, that same slope is given by the rise, $f(x_n)$, over the run, $x_n-x_{n+1}$.  In other words $f^\prime(x_n)=\frac{f(x_n)}{x_n-x_{n+1}}$ which can be solved for $x_{n+1}$:

$x_{n+1}=x_n-\frac{f(x_n)}{f^\prime(x_n)}$

So given a point near a solution, $x_n$, you can find another point that’s closer to the true solution, $x_{n+1}$.  Notice that if $f(x_n)\approx0$, then $x_{n+1}\approx x_n$.  That’s good: when you’ve got the right answer, you don’t want your approximation to change.

To start, you guess (literally… guess) a solution, call it $x_0$.  With this tiny equation in hand you can quickly find $x_1$.  With $x_1$ you can find $x_2$ and so on.  Although it can take a few iterations for it to settle down, each new $x_n$ is closer than the last to the actual solution.  To end, you decide you’re done.
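In code, the whole method is just that one line repeated.  Here’s a minimal, generic sketch (the function names are arbitrary), pointed at a problem with a known answer as a sanity check:

```python
def newton(f, fprime, x0, steps=20):
    """Follow the tangent line down to its x-intercept, repeatedly."""
    x = x0
    for _ in range(steps):
        x = x - f(x) / fprime(x)
    return x

# Example: sqrt(2) is the positive zero of f(x) = x^2 - 2.
root2 = newton(lambda x: x*x - 2, lambda x: 2*x, x0=1.0)
```

Starting from the guess $x_0=1$, a handful of iterations is already enough to nail $\sqrt{2}$ to machine precision.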

Say you need to solve $x=\cos(x)$ for x.  Never mind why.  There is no analytical solution (this comes up a lot when you mix polynomials, like x, with trig functions or logs or just about anything else).  The correct answer starts with $\lambda=0.739085133215160641655312\ldots$

y=x and y=cos(x). They clearly intersect, but there’s no way to analytically solve for exactly where.

First you write it in such a way that you can apply Newton’s method: $f(x)=\cos(x)-x=0$.  The derivative is $f^\prime(x)=-\sin(x)-1$ and therefore:

$x_{n+1}=x_n-\frac{f\left(x_n\right)}{f^\prime\left(x_n\right)}=x_n+\frac{\cos\left(x_n\right)-x_n}{\sin\left(x_n\right)+1}$

First make a guess.  I do hereby guess $x_0=3$.  Plug that in and you find that:

$x_1=x_0+\frac{\cos\left(x_0\right)-x_0}{\sin\left(x_0\right)+1}=3+\frac{\cos\left(3\right)-3}{\sin\left(3\right)+1}=-0.496558178297331398840279$

Plug back in what you get out several times and:

$\begin{array}{ll} x_0&=3\\ x_1&=-0.496558178297331398840279\ldots\\ x_2&=2.131003844480994964494021\ldots\\ x_3&=0.689662720778373223587585\ldots\\ x_4&=0.739652997531333829185767\ldots\\ x_5&=0.739085204375836184250693\ldots\\ x_6&=0.739085133215161759778809\ldots\\ x_7&=0.739085133215160641655312\ldots\\ x_8&=0.739085133215160641655312\ldots\\ \end{array}$

In this particular case, $x_0$ through $x_3$ jump around a bit.  Sometimes Newton’s method does this forever (try $x_0=5$) in which case: try something else or make a new guess.  It’s not until $x_4$ that Newton’s method starts to really zero in on the solution.  Notice that (starting at $x_4$) every iteration establishes about twice as many decimal digits as the previous step:

$\begin{array}{ll} \vdots\\ x_4&=0.739\ldots\\ x_5&=0.739085\ldots\\ x_6&=0.73908513321516\ldots\\ x_7&=0.739085133215160641655312\ldots\\ \vdots \end{array}$
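The whole table above can be reproduced in a few lines, following the iteration formula for $f(x)=\cos(x)-x$ exactly as derived:

```python
import math

# f(x) = cos(x) - x, f'(x) = -sin(x) - 1, starting from the guess x0 = 3
x = 3.0
history = [x]
for _ in range(8):
    x = x + (math.cos(x) - x) / (math.sin(x) + 1)
    history.append(x)
# history[1] is about -0.4966, and by history[8] the iteration has
# settled onto 0.739085133215160... to machine precision
```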

We know that Newton’s method works because we can prove that it converges to the solution.  In fact, we can show that it converges quadratically (which is stupid fast).  Something “converges quadratically” when the distance to the true solution is squared with every iteration.  For example, if you’re off by 0.01, then in the next step you’ll be off by around $(0.01)^2=0.0001$.  In other words, the number of digits you can be confident in doubles every time.

Here’s why it works:

A smooth function (which, practically speaking, is any function you might want to write down) can be described by a Taylor series.  In this case we’ll find the Taylor series about the point $x_n$ and use the facts that $0=f(\lambda)$ and $x_{n+1}=x_n-\frac{f\left(x_n\right)}{f^\prime\left(x_n\right)}$.

$\begin{array}{rcl} 0&=&f(\lambda) \\[2mm] 0&=&f(x_n+(\lambda-x_n)) \\[2mm] 0&=&f(x_n)+f^\prime(x_n)\left(\lambda-x_n\right)+\frac{1}{2}f^{\prime\prime}(x_n)\left(\lambda-x_n\right)^2+\ldots \\[2mm] 0&=&\frac{f\left(x_n\right)}{f^\prime\left(x_n\right)}+\left(\lambda-x_n\right)+\frac{f^{\prime\prime}(x_n)}{2f^\prime\left(x_n\right)}\left(\lambda-x_n\right)^2+\ldots \\[2mm] 0&=&\lambda-\left(x_n-\frac{f\left(x_n\right)}{f^\prime\left(x_n\right)}\right)+\frac{f^{\prime\prime}(x_n)}{2f^\prime\left(x_n\right)}\left(\lambda-x_n\right)^2+\ldots \\[2mm] 0&=&\lambda-x_{n+1}+\frac{f^{\prime\prime}(x_n)}{2f^\prime\left(x_n\right)}\left(\lambda-x_n\right)^2+\ldots \\[2mm] \lambda-x_{n+1}&=&-\frac{f^{\prime\prime}(x_n)}{2f^\prime\left(x_n\right)}\left(\lambda-x_n\right)^2+\ldots \end{array}$

The “…” becomes small much faster than $\left(\lambda-x_n\right)^2$ as $x_n$ and $\lambda$ get closer.  At the same time, $\frac{f^{\prime\prime}(x_n)}{2f^\prime\left(x_n\right)}$ becomes effectively equal to $\frac{f^{\prime\prime}(\lambda)}{2f^\prime\left(\lambda\right)}$.  Therefore $\lambda-x_{n+1}\propto\left(\lambda-x_n\right)^2$ and that’s what quadratic convergence is.  Note that this only works when you’re zeroing in; far away from the correct answer Newton’s method can really bounce around.

Posted in -- By the Physicist, Computer Science, Equations, Math

Burning Man 2017

Long ago, Ask a Mathematician / Ask a Physicist was two guys sitting around in the desert talking to strangers about whatever came to mind.  It’s been a while, but we’re heading back to Burning Man for more of the same!

If you happen to find yourself in the desert, have a question, and/or want to waste time with a Mathematician and a Physicist, you can find us here

There!  12-4 on Thursday.

from 12:00 to 4:00 on Thursday the 31st.  According to the official schedule we’re a gathering or party in a red tent between center camp and the Man.  That same schedule goes on to say:

“Ask a Mathematician / Ask a Physicist is two people sitting in the desert talking to other people in the desert. Ever wonder about the universe?  Entanglement?  The nature of logic?  Got a nagging question that’s been bothering you for years or just want to hang out and listen to other people’s questions?  We can help!”

Posted in -- By the Physicist

Q: How can something be “proven” in science or math?

The original question was: … it confuses me that abstract concepts, such as Banach-Tarski, and other concepts in pure mathematics and theoretical physics, can be considered to have been “proven”.  Is it not the case that one can only prove something by testing hypotheses in the real/physical world?  And even then isn’t it a bit of a stretch to say that anything can really be proven beyond doubt?

Physicist: Hypothesis testing is the workhorse of scientific inquiry, used to determine whether or not a given effect is real.  The result of a hypothesis test isn’t a proof or disproof, it’s an estimate of how likely it is that you would see a given result accidentally.  The more unlikely it is that something would occur accidentally, the more likely it is to be a real effect.  For example, we haven’t proven that the Higgs boson exists, it’s just that there’s only about a one in half a trillion chance that the data from CERN would be produced accidentally.  That’s not a proof.  Even so, if an effect works as predicted very consistently, then you may as well believe that it’s real.

Things are “proven” to be true with certainty in very much the same way that we can know with certainty that someone has won a chess game.  There’s nothing etched into the fabric of the universe that determines how chess pieces move on a board (other than, you know, physically) or who won a given game, and yet everyone who knows the rules will be able to agree on the victor.  Math, despite its vaunted status as the purest science and the means by which the reach of our simple minds can exceed their squishy grasp, is basically like the rules of chess or any other game.

Once the rules are established, you can prove things based on those rules and some logic (technically, logic is just more rules).  For example, based on a reasonably short list of straightforward mathematical rules you can first define what a prime number is and then prove that there are an infinite number of them.

The rules in mathematics are called “axioms” and the results based on those rules are “theorems”.  For example, “you can’t split a point in half” is an axiom while “there are an infinite number of primes” is a theorem.  When you first learn about numbers and arithmetic, you’re learning Peano’s axioms and lots of definitions and conclusions based on them.  Like the rules of chess, axioms just establish what things you can and can’t do in math and people are free to argue about which they do or don’t want to include.  Math doesn’t necessarily have anything to do with reality; it just happens to include some of the most effective tools for understanding it ever conceived.

The fact that we can create new mathematics that doesn’t have anything to do with reality may seem like a weakness, but it’s turned out to be fantastically useful.  For example, by generalizing the laws of geometry away from triangles, three dimensions of space, and even the very notion of distance, mathematicians paved the way for Einstein’s general relativity (which describes the nature of gravity in terms of warped spacetime).  He basically just had to plug his new ideas about spacetime into math that had already been created.

Banach-Tarski is a century old result from set theory which says that you can (among other things) break a sphere into five or more sets, rotate and move those sets, and recombine them into two spheres identical to the first.  These sets are less like block puzzle pieces and more like droplets in a fog, almost all of which are smaller than any given size.  Notice that this is completely impossible physically.  Lucky for Banach and Tarski, math isn’t dictated by the uptight strictures of reality.

XKCD: Funny because it’s true.

Banach-Tarski is based on the usual axioms of set theory, Zermelo–Fraenkel (ZF), but requires the addition of a hotly contested axiom, the “Axiom of Choice” (ZFC).  “Hotly contested” in the math community is a bit of a misnomer; mathematicians mostly just write long papers and stare angrily at each other’s shoes when they’re forced to shake hands.  The axiom of choice is to mathematics as en passant is to chess; it comes up when it comes up, but you don’t need it in general (if you’ve ever made it through a game of chess and have no idea what en passant is: exactly).

In abstract systems, the rules that are included are determined by preference, not physical reality.  In order to be useful to more than one person, most of the rules are generally agreed upon, but some are not.  (Left) En passant in chess and (Right) the axiom of choice in set theory.

The axiom of choice states that it is always possible to select (or even choose) a single item from each of an infinite collection of sets.  This is easy if there are a finite number of sets (“just go ahead and do it”) or if there’s a nice rule you can come up with (“always pick the lowest number”).  But sometimes you find yourself with an infinite set of infinite sets, none of which have a highest, lowest, or middlest point.  If you’re wondering how you go about picking a single unique item out of each of these sets, the Axiom of Choice says “you just can, so be cool”.  It is a completely made up statement that changes the rules of the game.  It’s not a matter of true or false, it’s a matter of consistency and agreeing with other mathematicians.

Physics, despite being the queen of the sciences and the means by which we mortals may strive to understand the underlying nature of reality, isn’t any better than math.  In physics you can “prove” that things are true or false, but only based on established rules: the “physical laws”.  For example, Newton’s universal law of gravitation says that the force of attraction between two objects with masses M and m spaced a distance r apart is $F=\frac{GMm}{r^2}$.  More than merely a statement of fact, mathematical expressions like this allow us to describe/predict precisely how things physically behave.  Based on this law (and a couple of others) we can prove that orbits are elliptical, and those proofs are accurate.  Notice that’s “accurate”, but not necessarily “true”.
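For instance, plugging numbers into that law takes one line.  The figures below for the Earth and Moon are rough, commonly quoted values (assumptions here, not from the text):

```python
# Newton's law of gravitation: F = G*M*m / r^2
G = 6.674e-11        # gravitational constant, N m^2 / kg^2
M_EARTH = 5.97e24    # mass of the Earth, kg
M_MOON = 7.35e22     # mass of the Moon, kg
R = 3.84e8           # mean Earth-Moon distance, m

F = G * M_EARTH * M_MOON / R**2   # attractive force in newtons, ~2e20 N
```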

If those rules turn out to be false, then the proofs based upon them aren’t proofs.  This is why physicists are so careful about establishing and verifying every detail of their theories.  They spend (seemingly wasted) decades doing tests of things that they’re already almost 100% sure are right, because a flaw in any of the fundamental laws would ripple out into every “proved” thing that’s based on it.

Every now and again some base rule or assumption in math or physics is overturned.  In math this is entirely due to logic, but physics is a bit more tricky.  We can’t divine the rules of the universe with logic alone.  If you were just a mind in a void, the nature of this universe would be a real shock.  No matter how smart you are, you need experiment and observation to learn new things about the world.

It’s easy (well… fairly easy) to write down some physical laws that seem to describe what we know about the universe that turn out to be wrong.  Without buckets of fantastically precise data and the math to understand it, there’s no way to know whether what you know is really only what you think.  Newton’s laws are tremendously useful, but ultimately misinformed.  They perfectly described the universe according to the data we had at the time; when more accurate (and more difficult to attain) data gave rise to “truer” physical theories we came to realize that Newtonian physics is merely a very good approximation.

Before Einstein we had safely assumed that time and space were completely independent. It took some seriously recondite phenomena (e.g., the invariance of the speed of light and a tiny error in Mercury’s orbit) to indicate that time and space are not so much related as they are different aspects of the same thing.  Almost more sacred, before Bell we had assumed that everything exists in a single definite state, whether we know what that state is or not.  This totally reasonable assumption is “realism”.  Again, the difference between the universe we had assumed we lived in and the world we evidently do live in (probably) was a set of incredibly esoteric, nigh unnoticeable effects (e.g., the randomness of things like radioactive decay and the “impossible” statistics of entangled particles).  It took a lot of clever experiments (dutifully checked, expounded upon, and multiply verified) and math to come to the conclusion that: nope, an assumption so fundamental that we call it “realism” or “the reality assumption” is actually false.  Quantum physicists who have evolved beyond the need to be understood will call the property of definitely being in a single state “counterfactual definiteness”.  Not that it’s worth mentioning, but if you can read this, you exist.  Good on ya.

In mathematics you can prove things, but you’re ultimately just moving pieces around on a board.  There’s a lot to learn and discover in the realms of logic, but math, like every abstract human endeavor, is all in our heads.

In physics you can prove things using physical laws.  However, those physical laws are only true insofar as they always work perfectly (as far as we can measure and verify) in every scenario which, arguably, is the best you can hope for.

Posted in -- By the Physicist, Conventions, Math, Philosophical

Q: If time is relative, then how can we talk about how old the universe is?

Physicist: One of the most profound insights ever made by peoplekind is that time is relative.  This isn’t some abstract idea, mistake, or mathematical artifact.  If you have two identically functioning clocks, you can start them together, move them to different locations or along different paths, then when you physically bring them back together to compare, they will literally have registered different amounts of time.

You may be inclined to say, “sure, it’s weird… but which clock is right?”.  The existentially terrifying answer is: there is no such thing as a “correct clock”.  Every clock measures its own time and there is no such thing as a universal time.  And yet, cosmologists are always talking about the age of the universe (a mere 13.80±0.02 billion years young).  When talking about the age of the universe we’re talking about the age of everything in it.  But how can we possibly talk about the age of the universe if everything in it has its own personal time?

The short answer is: almost everything is about the same age.  The biggest time discrepancy is between things deep inside of galaxies and things well outside of galaxies, amounting to a couple parts per million (or one or two seconds per week).  Matter that has been in the middle of large galaxies since early in the universe’s history should be no more than on the order of 50,000 years younger than matter that has managed to remain in intergalactic space.  Considering that our best estimates for the age of the universe are only accurate to within 20 million years or so (0.1% relative error), a few dozen millennia here and there doesn’t make any difference.

There are two ways to get clocks to disagree: the twin paradox and gravitational time dilation.  The twin paradox is a bizarre consequence of the difference between ordinary geometry and spacetime geometry.  In ordinary geometry, the shortest distance between two points is a straight line.  In spacetime geometry, the longest time between two points is a straight line.  A “straight line” in spacetime includes sitting still, so if you start with two clocks in the same place and take one on a trip that eventually brings it back to its stationary partner, then the traveling clock will have fallen behind its sedentary twin.

The Twin Paradox: The straighter the path taken between two locations, the more time is experienced.  Gravitational Time Dilation: Things farther from mass experience more time.

Assuming that the traveling clock travels at a more-or-less constant speed, you can figure out how much less time it experiences pretty easily.  If the traveling clock experiences $\tau$ amount of time and the stationary clock experiences $t$ amount of time, then $\tau=t\sqrt{1-\left(\frac{v}{c}\right)^2}$ (which you’ll notice is always less than t) where v is the speed of the traveling clock and c is the speed of light.  The ratio between these two times is called “gamma”, $\gamma = \frac{t}{\tau} = \frac{1}{\sqrt{1-\left(\frac{v}{c}\right)^2}}$, which is a useful piece of math to be aware of.  If the traveling clock changes speed, v(t), then you’ll need calculus, $\tau=\int_0^t \sqrt{1-\left(\frac{v(t^\prime)}{c}\right)^2}\,dt^\prime=\int_0^t \frac{dt^\prime}{\gamma(t^\prime)}$, but there are worse things.
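Gamma itself is easy to compute.  A tiny sketch (constant speed only; the names are arbitrary choices):

```python
import math

C = 299_792_458.0  # speed of light, m/s

def gamma(v):
    """Time dilation factor for a clock moving at speed v (m/s)."""
    return 1 / math.sqrt(1 - (v / C)**2)

# A clock cruising at half the speed of light runs slow by a factor
# of about 1.155: for every hour it ticks off, a stationary clock
# ticks off about an hour and nine minutes.
g_half_c = gamma(0.5 * C)
```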

Gravitational time dilation is caused by the warping of spacetime caused by the presence of energy and matter (mostly matter) which is a shockingly difficult thing to figure out.  When Einstein initially wrote down the equations that describe the relationship between mass/energy and spacetime he didn’t really expect them to be solved (other than the trivial empty-space solution); it took Schwarzschild to figure out the shape of spacetime around spherical objects (which is useful, considering how much round stuff there is to be found in space).  He did such a good job that the event horizon of a black hole, the boundary beyond which nothing escapes, is known in fancy science circles as the “Schwarzschild radius”.

Fortunately, for reasonable situations (not-black-hole situations), you can calculate the time dilation between different altitudes by figuring out how fast a thing would be falling if it made the trip from the top height to the bottom height.  Once you’ve got that speed, v, you plug it into $\gamma$ and wham!, you’ve calculated the time dilation due to gravity.  If you want to figure out the total dilation between, say, the surface of the Earth and a point “infinitely far away” (far enough away that Earth can be ignored), then you use the speed something would be falling if it fell from deep space: the escape velocity.

By and large, the effect from the twin paradox is smaller than the effect from gravity, because if something is traveling faster than the local escape velocity, then it escapes.  So the velocity you plug into gamma for the twin paradox (the physical velocity) is lower than the velocity you’d plug in for gravitational dilation (the escape velocity).  If you have some stars swirling about in a galaxy, then you can be pretty sure that they’re moving well below the escape velocity.

The escape velocity from the surface of the Earth is 11km/s, which yields a gamma of $\gamma = 1.0000000007$.  Being really close to 1 means that the passage of time far from Earth vs. the surface of the Earth is practically the same; an extra 2 seconds per century if you’re hanging out in deep space.  The escape velocity from the core of a large galaxy (such as ours) is on the order of a thousand km/s.  That’s a gamma around $\gamma = 1.00001$, which works out to several seconds per week.
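You can check that “2 seconds per century” figure by plugging Earth’s escape velocity (about 11.2 km/s, a rough figure assumed here) into gamma:

```python
import math

C = 299_792_458.0                           # speed of light, m/s
SECONDS_PER_CENTURY = 100 * 365.25 * 86400  # about 3.16e9 seconds

def gamma(v):
    return 1 / math.sqrt(1 - (v / C)**2)

# Gravitational dilation at Earth's surface, approximated by plugging
# the escape velocity into gamma: the fractional difference in clock
# rates is gamma - 1, so over a century the clocks drift apart by
extra_per_century = (gamma(11_200) - 1) * SECONDS_PER_CENTURY
# which comes out to roughly 2 seconds
```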

Point is, it doesn’t make too much difference where you are in the universe: time is time.

Now admittedly, there are examples of things either trapped in black holes or screaming across the universe at near the speed of light, but the good news for us (on both counts) is that such stuff is rare.  The only things that move anywhere close to the speed of light are light itself (no surprise) and the occasional individual particle of matter.  Light literally experiences zero time, so the “oldest” photons are still newborns; they have a very different notion of how old the universe is.  No one is entirely sure how much of the matter in the universe is presently tied up in black holes, but it’s generally assumed to be a small fraction of the total.

Long story short: when someone says that the universe is 13.8 billion years old, they’re talking about the age of the matter in the universe, almost all of which is effectively the same age.

Posted in -- By the Physicist, Astronomy, Physics, Relativity

Q: How can carbon dating work on things that were never alive?

Physicist: It doesn’t.

Carbon dating is the most famous form of “radiometric dating”.  By measuring the trace amounts of radioactive carbon-14 (so named because it has 6 protons and 8 neutrons) in a dead something and comparing it to the amount of regular carbon-12 (6 protons and 6 neutrons) you can figure out how long it’s been since that sample was alive.  Carbon-14 is continuously generated in the upper atmosphere when stray neutrons bombard atmospheric nitrogen (which is what most of the atmosphere is).

The reason carbon dating works is that the fresh carbon-14 gets mixed in with the rest of the carbon in the atmosphere and, since it’s chemically identical to regular carbon, gets worked into whatever is presently absorbing atmospheric carbon.  In particular: plants, things that eat plants, things that eat things that eat plants, and breatharians. When things die they stop getting new carbon and the carbon-14 they have is free to radioactively decay without getting replaced.  Carbon-14 has a half-life of about 5,700 years, so if you find a body with half the carbon-14 of a living body, then that somebody would have been pretty impressed by bronze.
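The arithmetic for turning a measured carbon-14 fraction into an age is a one-liner.  A sketch (the half-life is the rough figure quoted above; the function name is just a choice):

```python
import math

HALF_LIFE_C14 = 5700.0  # years, roughly

def carbon_age(fraction_remaining):
    """Years since death, given the fraction of the original
    carbon-14 still present: N/N0 = 2**(-t/T) solved for t."""
    return HALF_LIFE_C14 * math.log2(1 / fraction_remaining)

# A body with half its carbon-14 left died one half-life ago:
age = carbon_age(0.5)   # 5700 years
```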

Of course none of that helps when it comes to pottery and tools (except wooden tools).  Not being made of carbon, we can’t carbon date them.  Fortunately, the stuff ancient civilizations leave lying around tends to be found in clumps called “middens”.  A less sophisticated word for midden is “pile of garbage and often poo”.

Archaeology.

Generally speaking, archaeologists make the assumption that if the grains in and around a clay pot are, say, 8,000 years old, then the pot itself is roughly the same age.  Which makes sense.  If you had an ancient amphora sitting around, would you use it for fresh strawberry preserves?  And before you answer: please do it.  A life spent potentially confusing future archaeologists is a life well spent.

There are many different kinds of radiometric dating that are used to date things that are non-organic (which is part of how we determine the age of the Earth).  They each rely on a couple of different (thoroughly verified) principles.  First, that radioactive isotopes have a fixed half-life (totally independent of their environment).  And second, that an atom is a different element, with different chemical properties, before and after its radioactive decay.

As water freezes and each molecule falls into place, atoms that don’t fit in the forming ice crystal are excluded.  Impurities, such as dissolved air, are either forced out or concentrated in the last region to freeze.  The same is true for any kind of crystal.

Crystals are regular lattices of atoms.  And they’re picky.  If an atom doesn’t interact chemically in the right way, then it won’t be incorporated into a forming crystal.  For example, zircon (a crystal) is perfectly happy to incorporate uranium, but excludes lead.  It so happens that uranium decays into lead with a half-life of 4.5 billion years.  So if you grind up a zircon and measure the tiny amounts of lead vs. uranium, you’re measuring how long it’s been since that zircon formed.  At that time there would have been zero lead in it.
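Same game, different isotope.  A sketch of the uranium-lead arithmetic (simplified: it assumes the zircon really did start out lead-free and ignores the intermediate steps in the decay chain):

```python
import math

U238_HALF_LIFE = 4.5e9  # years

def zircon_age(pb_per_u):
    """Years since crystallization, given the number of (radiogenic)
    lead atoms found per remaining uranium-238 atom.  Every lead atom
    here was once a uranium atom, so Pb/U = 2^(t/T) - 1."""
    return U238_HALF_LIFE * math.log(1.0 + pb_per_u, 2)

print(zircon_age(1.0))  # equal parts lead and uranium: one half-life, 4.5 billion years
```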

Since carbon-14 has a half-life on the order of thousands of years, it’s useful for figuring out the age of organic materials that have been independent of the atmosphere for thousands of years.  Since uranium-238 (the isotope comprising more than 99% of natural uranium) has a half-life of billions of years, it’s useful for figuring out the age of (among other things) zircons that crystallized billions of years ago.  Want to date a woolly mammoth?: carbon dating.  Want to date a planet?: uranium dating.

Radiometric dating generally involves tallying up trace amounts of material, so it’s not the sort of thing you do out in the field; you need a clean lab.  So it was, after years of attempting to measure the age of the Earth (or, more specifically, the time since it was last molten) in a regular lab, that Clair Patterson bravely announced “Dudes and dudettes of science… anybody else notice all the lead in the air?”  Turns out that burning leaded gasoline, among its other little-known deleterious effects, throws lead into the air.  That’s not great: once everything on Earth is peppered with lead, it’s difficult for scientists to do their science.  And, not for nothing, it’s also caused a thousandfold increase in lead contamination in the bodies (or bones at least) of everything that breathes and/or eats.  If you’ve ever wondered why gasoline should be “unleaded”: that’s why.

This is the beauty of fundamental research: you never know what you’ll find when you start poking around.

Posted in -- By the Physicist, Particle Physics, Physics

Teleportation! In space!

Physicist: This isn’t a question anybody asked, just an interesting goings-on.

A few weeks ago QUESS (QUantum Experiments at Space Scale) began teleporting quantum information to and from the Micius satellite and between ground stations 1200 km apart.  This is exciting, because it demonstrates the feasibility of easy (cheap), high-fidelity, long-distance quantum entanglement, which is the key to all quantum communication.  Micius is the first shaky pillar of a global-scale quantum infrastructure.

Time-lapse of a laser in Xinglong talking to Micius, the world’s first “quantum satellite”.

Entanglement is basically a combination of correlation and superposition.  The difference between a bit and a qubit is that a bit is either 1 or 0 while a qubit is simultaneously 1 and 0.  There are a lot of different forms that a qubit can take (just like there are many forms a bit can take): in this case the polarization of light is used.  There are two possible polarization states, which is perfect for encoding two possibilities, 0 and 1 (and incidentally perfect for making 3D movies; one movie for each polarization and each eye).

The polarization of light can point in any direction (perpendicular to the direction of travel), so we can use it to describe not just 0 or 1 but a combination of both.

A photon can be in a superposition of both horizontal and vertical polarization.  When measured, it is always found to be in one state or the other (0 or 1), but there are a lot of clever things we can do with qubits before doing that measurement.

While it is impossible to be sure of the result, the probability that you see a “0” or “1” is described by the state (in the picture above, “0” is more likely, but not guaranteed).  The spooky thing about entangled particles is that, as long as you measure them the same way, their random results will be correlated.  For the simplest kind of entangled state, $|\Phi^+\rangle$, the results are the same.  If two photons are in the shared state $|\Phi^+\rangle$ and you find that one of them is vertically polarized, then the other will also be vertically polarized.  Random, but the same as each other.  Unfortunately, practically any interaction with either particle breaks the entanglement and leaves you with just a pair of regular, unrelated particles.  This discussion on entanglement goes into a bit more detail.
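To put some (made-up) numbers on that: the state’s amplitudes fix the odds, and each individual measurement still comes up a definite 0 or 1.  A quick numpy sketch, with “0” more likely just like in the picture:

```python
import numpy as np

# Born rule for one qubit |psi> = a|0> + b|1>: P(0) = |a|^2, P(1) = |b|^2.
# The 0.8/0.2 split is an arbitrary example.
psi = np.array([np.sqrt(0.8), np.sqrt(0.2)])
probs = np.abs(psi) ** 2
print(probs[0], probs[1])  # 0.8 and 0.2 (up to rounding)

# Simulated measurements: every single result is a definite 0 or 1,
# but the statistics follow the state.
rng = np.random.default_rng(42)
samples = rng.choice([0, 1], size=100_000, p=probs)
print(samples.mean())      # close to 0.2, the probability of seeing "1"
```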

How do you get entangled particles thousands of km apart?  Carefully.

What QUESS is doing, and what “quantum communication” is all about, is getting two entangled particles far apart without accidentally breaking the entanglement or losing the particles (which is difficult when they’re being fired at you from space).

Once two widely separated parties share a pair of entangled particles, they can start doing some rather remarkable things.  One of those is the ability to send qubits from one particle to its entangled twin: “quantum teleportation”.  Quantum teleportation requires both an entangled pair of particles and a “classical communication channel” (which includes, but is not limited to, shouting loudly).  With those in hand, we can easily “teleport” a qubit, from one location to another.

Top: One state you’d like to teleport, A, and two particles sharing an entangled state, B and C. Middle: Some relative properties of A and B are measured and the results are sent to whomever has the other entangled particle. Based on that information, the other entangled particle is manipulated. Bottom: The result is that the entanglement is destroyed, but C assumes A’s original state.

Qubits (quantum states in general) are as delicate as delicate can be.  Absolutely any interaction capable of allowing anything to determine their state “collapses” that state; a qubit goes from being both 0 and 1 to being either 0 or 1, and all of the advantages that may have gone with that superposition go out the window.  So teleportation needs to be able to measure the to-be-sent qubit, A, without actually determining anything about it, which is tricky.  The way to get around that is to do a measurement that compares A and B, without measuring either of them directly.  If you learn that two coins have either the same or opposite side up, then you’ve learned something about the two of them together, but nothing specific about either of them individually.

The same idea applies in quantum teleportation.  The central idea behind entanglement is that if B and C are entangled, then they react to measurements in the same way.  So by comparing A and B and learning how they’re different, you’re also learning how A and C are different.  Knowing that, you can figure out what needs to be done to C to make it have the same state as A.  And all without actually learning what that state is.  Even if C is, hypothetically, on the far side of China, you can just tell whoever has it what the results of the test were.  For coins/regular bits you only need to send one bit of information; the result of the comparison is either “same” or “different”.  For qubits you need to send two bits, because of how terrifyingly complex quantum mechanics is.  Here’s a bit more detail on how quantum teleportation works.
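For the curious, the whole protocol fits in a small state-vector simulation.  This is just the textbook three-qubit circuit written out in numpy (the 0.6/0.8 state to be teleported is an arbitrary example), and bears no resemblance to the actual QUESS hardware:

```python
import numpy as np

# Single-qubit gates and the two-qubit CNOT.
I = np.eye(2)
X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
CNOT = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]])

def kron(*ops):
    out = np.array([[1.0]])
    for op in ops:
        out = np.kron(out, op)
    return out

# Qubit A holds the state to teleport; B and C share the entangled |Phi+>.
psi = np.array([0.6, 0.8])                     # any normalized qubit
phi_plus = np.array([1, 0, 0, 1]) / np.sqrt(2)
state = np.kron(psi, phi_plus)                 # qubit order: A, B, C

# Compare A and B (a Bell measurement): CNOT from A to B, then H on A.
state = kron(CNOT, I) @ state
state = kron(H, I, I) @ state

# Measure qubits A and B; the result is two classical bits.
amps = state.reshape(4, 2)                     # rows: AB outcomes, cols: C
p_ab = (np.abs(amps) ** 2).sum(axis=1)
rng = np.random.default_rng()
ab = rng.choice(4, p=p_ab / p_ab.sum())
c_state = amps[ab] / np.linalg.norm(amps[ab])

# "Shout loudly": send the two bits to whoever holds C, who applies
# the matching Pauli correction.
a, b = ab >> 1, ab & 1
c_state = np.linalg.matrix_power(Z, a) @ np.linalg.matrix_power(X, b) @ c_state

# C now holds the original state; the entanglement is spent.
print(np.allclose(c_state, psi))  # True
```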

Not a lot of physicists are too surprised that ground-to-space quantum teleportation works (nobody builds and launches spacecraft on a hunch).  There’s never been any indication that distance is a factor for quantum entanglement, so this isn’t a matter of overcoming physical laws, just getting around (a lot of) engineering difficulties.  Teleportation is easy to do with equipment on opposite sides of a room.  The difference here is that the “opposite side of the room” is moving at about 8 km/s and is in freaking space.

Quantum states are delicate, so we need to be able to catch, manipulate, and accurately measure the states of individual photons with minimal interference.  Assuming that you’re not having someone else read this out loud, you are presently noticing that photons carry information through air pretty well.  Pretty well.  Over a large enough distance, even clean air is effectively opaque.  The present through-air record for this same procedure is 143 km, between a couple of Canary Islands.  That’s 143 km through the densest region of our atmosphere (sea level).  There’s about as much air between you and space as there is between you and anything 7 km away along the ground (it gets thin fast as you go up).  So teleportation straight up should be easier than teleportation between ground stations.

Generally speaking, the big problem with conveying intact quantum information is all the stuff in the way, so space is kind of an obvious solution.  The problem with space is the distances involved; the farther something is away, the smaller the target.  Establishing entanglement between two locations comes down to creating a pair of entangled particles in one location and then sending one of the pair to the other location.  QUESS manages to catch about 1 in every 6 million photon pairs and it doesn’t work during the day because the sunlight scatters off of the air (likely to be a non-issue between two quantum satellites).  In all, the QUESS team claims to be able to establish 1 entangled pair per second.  All things considered, that’s bragging-rights-impressive.  This and this are what constitute “bragging” (the QUESS team’s official papers on the subject).

Even with a noisy channel, with lots of photons lost and the states of many of the others perturbed by their journey, a reliable quantum channel is still possible.  We can distill quantum entanglement, turning many weakly entangled pairs into fewer strongly entangled pairs.  You can think of this like repeating a digital message to get it across a noisy channel; it takes more time to send the signal, but the result is a message clearer than any of the individual attempts.  Once an entanglement has been established between two parties, a quantum state can be teleported between the two, including a state entangled with something else.  In this way, two entangled pairs between A-B and B-C can be turned into one entangled pair between A-C.  With “quantum repeaters” in place, we can establish quantum channels over huge distances by piecing together many short, possibly noisy, channels.  Point is: despite quantum states being perfectly delicate, we don’t need perfect delicacy to work with them.
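The classical analogy is worth a sketch of its own: a repetition code over a noisy channel.  It takes many noisy transmissions to recover one clean bit, the same way distillation spends many weakly entangled pairs on one strong pair (the 10% flip rate and 51 repeats here are made-up numbers):

```python
import random
from collections import Counter

def noisy_send(bit, flip_prob=0.1):
    """One trip through the channel: the bit arrives flipped 10% of the time."""
    return bit ^ (random.random() < flip_prob)

def send_with_repetition(bit, repeats=51):
    """Send the same bit many times and take a majority vote on arrival."""
    votes = Counter(noisy_send(bit) for _ in range(repeats))
    return votes.most_common(1)[0][0]

random.seed(0)
message = [1, 0, 1, 1, 0, 0, 1, 0]
received = [send_with_repetition(b) for b in message]
print(received == message)  # slower than sending each bit once, but far more reliable
```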

In the golden age of the telegraph, we could send information (bits) anywhere, we just couldn’t do too much with it when it got there.  We’re entering a similar (but likely to be much shorter) age of quantum information.

Quantum information technology is still in its infancy.  Where we are now is analogous to the age of telegraphs and Morse code.  We can send qubits, a few at a time, over long distances, but we don’t have computers at the ends capable of doing much with those qubits.  Despite that massive shortcoming, there are some killer apps that are likely to drive this technology forward.  In particular: quantum cryptography.

Skirting the details, quantum encryption boils down to:

1) distribute lots of pairs of entangled particles

2) measure each pair the same way

3) write down the results

No quantum computation involved!  The defining characteristic of maximally entangled pairs is that measurements on the pair are perfectly correlated and fundamentally random.  Anyone/thing that intercepts an entangled particle breaks (or at least weakens) the entanglement, so eavesdropping can be detected.  For you cypherpunks out there, quantum cryptography is a method of creating a shared random secret that is perfectly robust to man-in-the-middle attacks (or at least detects such attempts).  You and someone else create a random number that only the two of you can possibly know, which allows you to then encrypt any message and send it (by email for example) with security guaranteed by physical laws.
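Here’s a purely classical caricature of those three steps plus the encryption.  It simply assumes the defining property stated above: each entangled pair, measured the same way, hands both parties the same fundamentally random bit (eavesdropper detection, the actual hard part, is not modeled):

```python
import random

def make_shared_key(n_pairs):
    """Steps 1-3: each measured entangled pair yields one shared random bit."""
    return [random.choice([0, 1]) for _ in range(n_pairs)]

alice_key = make_shared_key(8)
bob_key = list(alice_key)  # entanglement guarantees Bob's results match Alice's

# One-time-pad: XOR the message with the shared key to encrypt,
# XOR again with the same key to decrypt.
message   = [1, 0, 1, 1, 0, 0, 1, 0]
encrypted = [m ^ k for m, k in zip(message, alice_key)]
decrypted = [e ^ k for e, k in zip(encrypted, bob_key)]
print(decrypted == message)  # True: the email-safe ciphertext decodes perfectly
```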

Quantum cryptography: sharing and securing secrets with fundamental physics!

Shockingly, this is of interest to lots of space-capable governments, so Micius is unlikely to be the last quantum communication satellite.

Just as a quick aside, since it’s often not stated clearly: quantum teleportation does not involve actual teleportation in any sense.  Nothing actually makes the journey from one quantum system to the other.  There are a pair of theorems in the field of quantum information theory that say that if you and someone else share an entangled pair of particles (sometimes called an “ebit”), then at the cost of that entanglement and the application of some cute tricks you can:

1) send 2 bits to convey one qubit

2) send 1 qubit to convey two bits

The first procedure is called “teleportation” and the second is called “super-dense coding”.  One of those is a terrible, misleading name and the other is “super-dense coding”.

The laser picture is from here, the telegraph picture is from here, and the spy picture is by Tomer Hanuka.