Q: Before you open the box, isn’t Schrödinger’s cat alive or dead, not alive and dead?

The original question was: If I don’t open the lid of the bunker, the cat inside is either dead OR alive.  Why does Schrödinger say that the cat is both alive AND dead if I do not perform the act of observation?  Isn’t he confusing AND for OR?

Physicist: He’s very intentionally saying “and” and not saying “or”.

When something is in more than one place, or state, or position, we say it’s in a “superposition of states”.  The classic example of this is the “double slit experiment”, where we see evidence of a single photon interfering with itself through both slits.

The patterns on the screen cannot be described by particles of light traveling in a straight line from the source.  They can be described very easily in terms of waves (which go through both slits), or by saying “particles of light certainly do behave oddly, yessir” (hidden variables).

Schrödinger’s Equation describes particles (and by extension the world) in terms of “quantum wave functions”, and not in terms of “billiard balls”.  His simple model described the results of a variety of experiments very accurately, but required particles to behave like waves (like the interference pattern in the double slit) and be in multiple states.  In those experiments, when we actually make a measurement (“where does the photon hit the photographic plate?”) the results are best and most simply described by that wave.  But while a wave describes how the particles behave and where they’ll be, when we actually measure the particle we always find it to be in one state.

“Schrödinger’s Cat” was a thought experiment that he (Erwin S.) came up with to underscore how weird his own explanation was.  The thought experiment is, in a nutshell (or cat box): there’s a cat in a measurement-proof box with a vial of poison, a radioactive atom (another known example of quantum weirdness), and a bizarre caticidal Geiger counter.  If the counter detects that the radioactive atom has decayed, then it’ll break the vial and kill the cat.  To figure out the probability of the cat being alive or dead you use Schrödinger’s wave functions to describe the radioactive atom.  Unfortunately, these describe the atom, and hence the cat, as being in a superposition of states between the times when the box is set up and when it’s opened (in between subsequent measurements).  Atoms can be in a combination of decayed and not decayed, just like the photons in the double slit can go through both slits, and that means that the cat must also be in a superposition of states.  This isn’t an experiment that has been done or could reasonably be attempted.  At least, not with a cat.

Schrödinger’s Cat wasn’t intended to be an educational tool, so much as a joke with the punchline “so… it works, but that’s way too insane to be right”.  At the time it was widely assumed that in the near future an experiment would come along that would overturn this clearly wonky interpretation of the world and set physics back on track.

But as each new experiment (with stuff smaller than cats, but still pretty big) verified and reinforced the wave interpretation and found more and more examples of quantum superposition, Schrödinger’s Cat stopped being something to be dismissed as laughable, and turned instead into something to be understood and taken seriously (and sometimes dropped nonchalantly into hipster conversations).  Rather than ending with “but the cat obviously must be alive OR dead, so this interpretation is messed up somewhere” it more commonly ends with “but experiments support the crazy notion that the cat is both alive AND dead, so… something to think about”.

If it bothers you that the Cat doesn’t observe itself (why is opening the box so important?), then consider Schrödinger’s Graduate Student: unable to bring himself to open one more box full of bad news, Schrödinger leaves his graduate student to do the work for him and to report the results.  Up until the moment that the graduate student opens the door to Schrödinger’s Office, Schrödinger would best describe the student as being in a superposition of states.  This story was originally an addendum to Schrödinger’s ludicrous cat thing, but is now also told with a little more sobriety.


Posted in -- By the Physicist, Philosophical, Physics, Quantum Theory | 3 Comments

Q: How many times do you need to roll dice before you know they’re loaded?

Physicist: A nearly equivalent question might be “how can you prove freaking anything?”.

In empirical science (science involving tests and whatnot) things are never “proven”.  Instead of asking “is this true?” or “can I prove this?” a scientist will often ask the substantially more awkward question “what is the chance that this could happen accidentally?”.  Where you draw the line between a positive result (“that’s not an accident”) and a negative result (“that could totally happen by chance”) is completely arbitrary.  There are standards for certainty, but they’re arbitrary (although generally reasonable) standards.  The most common way to talk about a test’s certainty is “sigma” (pedantically known as a “standard deviation”), as in “this test shows the result to 3 sigmas”.  You have to do the same test over and over to be able to talk about “sigmas” and “certainties” and whatnot.  The ability to use statistics is a big part of why repeatable experiments are important.


When you repeat experiments you find that their aggregate tends to look like a “bell curve” pretty quickly.  This is due to the “central limit theorem” which (while tricky to prove) totally works.

“1 sigma” refers to about 68% certainty, or that there’s about a 32% chance of the given result (or something more unlikely) happening by chance.  2 sigma certainty is ~95% certainty (meaning ~5% chance of the result being accidental) and 3 sigmas, the most standard standard, means ~99.7% certainty (~0.3% probability of the result being random chance).  When you’re using, say, a 2 sigma standard it means that there’s a 1 in 20 chance that the results you’re seeing are a false positive.  That doesn’t sound terrible, but if you’re doing a lot of experiments it becomes a serious issue.
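Those sigma-to-percentage conversions come straight from the normal distribution; here's a quick sketch using the standard identity that the probability of landing within k sigmas of the mean is erf(k/√2):

```python
import math

def within_sigma(k):
    # Probability that a normally distributed result lands within
    # k standard deviations of the mean
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, within_sigma(k))   # ~0.68, ~0.95, ~0.997
```

Run it and the 68% / 95% / 99.7% figures above fall right out.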

The more data you have, the more precise the experiment will be.  Random noise can look like a signal, but eventually it’ll be revealed to be random.  In medicine (for example) your data points are typically “noisy” or want to be paid or want to be given useful treatments or don’t want to be guinea pigs or whatever, so it’s often difficult to get better than a couple sigma certainty.  In physics we have more data than we know what to do with.  Experiments at CERN have shown that the Higgs boson exists (or more precisely, a particle has been found with the properties previously predicted for the Higgs) with 7 sigma certainty (~99.999999999%).  That’s excessive.  A medical study involving every human on Earth cannot have results that clean.

So, here’s an actual answer.  Ignoring the details about dice and replacing them with a “you win / I win” game makes this question much easier (and also speaks to the fairness of the game at the same time).  If you play a single game with another person, then no matter who wins there’s no way to tell whether it was fair.  If you play N games, then (for a fair game) a sigma corresponds to \frac{\sqrt{N}}{2} excess wins or losses away from the average.  For example, if you play 100 games, then

1 sigma: ~68% chance of winning between 45 and 55 games (that’s 50±5)

2 sigma: ~95% chance of winning between 40 and 60 games (that’s 50±10)

If you play 100 games with someone, and they win 70 of them, then you can feel fairly certain (4 sigmas) that something untoward is going down, because there’s only a 0.0078% chance of being that far from the mean (half that if you’re only concerned with losing).  The more games you play (the more data you gather), the smaller the relative drift away from the mean.  After 10,000 games, 1 sigma is only 50 games, so there’s a 95% chance (2 sigmas) of winning between 4,900 and 5,100 games (which is a pretty small window, relatively speaking).
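That 70-wins-out-of-100 number can be checked exactly by counting coin-flip outcomes (a quick sketch; `math.comb` does the counting):

```python
from math import comb, sqrt

N, wins = 100, 70
sigma = sqrt(N) / 2                # one sigma is 5 games here
z = (wins - N / 2) / sigma         # 70 wins is 4 sigmas from the mean

# Exact two-tailed probability of being at least that far from 50/50
tail = sum(comb(N, k) for k in range(wins, N + 1)) / 2**N
p = 2 * tail
print(z, p)   # 4 sigmas, and p is about 0.008%
```

The one-tailed version (only worrying about your opponent's wins) is just `tail`, the "half that" mentioned above.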

Keep in mind, before you start cracking kneecaps, that 1 in 20 people will see a 2 sigma result (that is, 1 in every N folk will see something with a probability of about 1 in N).  Sure it’s unlikely, but that’s probably why you’d notice it.  So when doing a test make sure you establish when the test starts and stops ahead of time.


Q: Since it involves limits, is calculus always an approximation?

Physicist: Nope!  Calculus is exact.  For those of you unfamiliar with calculus, what follows is day 1.

Between any two points on a curve we can draw a straight line (which has a definite slope).

Finding the slope of a curve at a particular point requires limits, which always feel a little incomplete.  When taking the limit of a function you’re not talking about a single point (which can’t have a slope); you’re not even talking about the function at that point, you’re talking about the function near that point as you get closer and closer.  At every step there’s always a little farther to go, but “in the limit” there isn’t.  Here comes an example.

The slope of x^2 at x=1.

Say you want to find the slope of f(x) = x^2 at x=1.  “Slope” is (defined as) rise over run, so the slope between the points \left(1, f(1)\right) and \left(1+h, f(1+h)\right) is \frac{f(1+h)-f(1)}{(1+h-1)} and it just so happens that:

\begin{array}{ll}\frac{f(1+h)-f(1)}{(1+h-1)} \\= \frac{(1+h)^2-1^2}{h} \\= \frac{(1+2h+h^2)-1}{h} \\= \frac{2h+h^2}{h} \\= \frac{2h}{h}+\frac{h^2}{h} \\= 2+h\end{array}

Finding the limit as h\to0 is the easiest thing in the world: it’s 2.  Exactly 2.  Despite the fact that h=0 couldn’t be plugged in directly, there’s no problem at all.  For every h≠0 you can draw a line between \left(1, f(1)\right) and \left(1+h, f(1+h)\right) and find the slope (it’s 2+h).  We can then let those points get closer together and see what happens to the slope (h\to0).  Turns out we get a single, exact, consistent answer.  Math folk say “the limit exists” and the function is “differentiable”.  Most of the functions you can think of (most of the functions worth thinking of) are differentiable, and when they’re not it’s usually pretty obvious why.
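Here's the same limit done numerically (a quick sketch; the exact slope between x=1 and x=1+h is 2+h, so shrinking h walks the answer toward 2):

```python
def slope(h):
    # Rise over run between (1, f(1)) and (1+h, f(1+h)) for f(x) = x^2
    f = lambda x: x**2
    return (f(1 + h) - f(1)) / h

for h in (0.1, 0.01, 0.001):
    print(slope(h))   # 2.1, 2.01, 2.001 (plus a little floating-point dust)
```

Every finite h gives 2+h; only the limit gives exactly 2.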

Same sort of thing happens for integrals (the other important tool in calculus).  The situation there is a little more subtle, but the result is just as clean.  Integrals can be used to find the area “under” a function by adding up a larger and larger number of thinner and thinner rectangles.  So, say you want to find the area under f(x)=x between x=0 and x=3.  This is day… 30 or so?

As a first try, we’ll use 6 rectangles.

A rough attempt to find the area of the triangle using a mess of rectangles.

Each rectangle is 0.5 wide, and 0.5, 1, 1.5, etc. tall.  Their combined area is 0.5\cdot0.5+0.5\cdot1+0.5\cdot1.5+\cdots+0.5\cdot2.5+0.5\cdot3 or, in mathspeak, \sum_{j=1}^6 0.5\cdot\left(\frac{j}{2}\right).  If you add this up you get 5.25, which is more than 4.5 (the correct answer) because of those “sawteeth”.  By using more rectangles these teeth can be made smaller, and the inaccuracy they create can be brought to naught.  Here’s how!

If there are N rectangles they’ll each be \frac{3}{N} wide and will be \frac{3}{N}, \frac{2\cdot3}{N}, \frac{3\cdot3}{N}, \cdots, \frac{N\cdot3}{N} tall (just so you can double-check, in the picture N=6).  In mathspeak, the total area of these rectangles is

\begin{array}{ll}    \sum_{j=1}^N \frac{3}{N} \left(j\frac{3}{N}\right) \\    = \frac{3}{N}\frac{3}{N}\sum_{j=1}^N j \\    = \frac{9}{N^2}\sum_{j=1}^N j \\    = \frac{9}{N^2}\left(\frac{N^2}{2}+\frac{N}{2}\right) \\    = \frac{9}{2}+\frac{9}{2N} \\    \end{array}

The fact that \sum_{j=1}^N j=\frac{N^2}{2}+\frac{N}{2} is just one of those math things.  For every finite value of N there’s an error of \frac{9}{2N}, but this can be made “arbitrarily small”.  No matter how small you want the error to be, you can pick a value of N that makes it even smaller.  Now, letting the number of rectangles “go to infinity”, \frac{9}{2N}\to0 and the correct answer is recovered: 9/2.

In a calculus class a little notation is used to clean this up:

\frac{9}{2} = \int_0^3 x\,dx = \lim_{N\to\infty}\sum_{j=1}^N\left(\frac{3j}{N}\right)\frac{3}{N}

Every finite value of N gives an approximation, but that’s the whole point of using limits; taking the limit allows us to answer the question “what’s left when the error drops to zero and the approximation becomes perfect?”.  It may seem difficult to “go to infinity” but keep in mind that math is ultimately just a bunch of (extremely useful) symbols on a page, so what’s stopping you?
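The rectangle sum above is easy to check in code, and the leftover error really is 9/(2N):

```python
def riemann(N):
    # Total area of N rectangles under f(x) = x from x=0 to x=3
    return sum((3 / N) * (3 * j / N) for j in range(1, N + 1))

print(riemann(6))       # 5.25, matching the six-rectangle picture
print(riemann(1000))    # 4.5045: the 9/(2N) error is down to 0.0045
```

A computer can only crank N up, never "to infinity"; the limit is what finishes the job.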

Mathematicians, being consummate pessimists, have thought up an amazing variety of worst-case scenarios to create “non-integrable” functions, where it doesn’t really make sense to create those approximating rectangles.  Then, being contrite, they figured out some slick ways to (often) get around those problems.  Mathematicians will never run out of stuff to do.

Fortunately for everybody else (especially physicists), the universe doesn’t seem to use those terrible, terrible worst-case functions in any of its fundamental laws.  Mathematically speaking, all of existence is a surprisingly nice place to live.


Q: How does Earth’s magnetic field protect us?

Physicist: High energy charged particles rain in on the Earth from all directions, most of them produced by the Sun.  If it weren’t for the Earth’s magnetic field we would be subject to bursts of radiation on the ground that would be, at the very least, unhealthy.  The more serious, long term impact would be the erosion of the atmosphere.  Charged particles carry far more kinetic energy than massless particles (light), so when they strike air molecules they can kick them hard enough to eject them into space.  This may have already happened on Mars, which shows evidence of having once had a magnetic field and a complex atmosphere, and now has neither (Mars’ atmosphere is ~1% as dense as ours).

Rule #1 for magnetic fields is the “right hand rule”: point your fingers in the direction a charged particle is moving, curl your fingers in the direction of the magnetic field, and your thumb will point in the direction the particle will turn.  The component of the velocity that points along the field is ignored (you don’t have to curl your fingers in the direction they’re already pointing), and the force is proportional to the speed of the particle and the strength of the magnetic field.

For notational reasons either lost to history or not worth looking up, the current (the direction the charge is moving) is I and the magnetic field is B. More reasonably, the Force the particle feels is F.  In this case, the particle is moving to the right, but the magnetic field is going to make it curve upwards.

This works for positively charged particles (e.g., protons).  If you’re wondering about negatively charged particles (electrons), then just reverse the direction you got.  Or use your left hand.  If the magnetic field stays the same, then eventually the ion will be pulled in a complete circle.
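That "pulled in a complete circle" claim can be stepped through numerically.  Here's a sketch in made-up units (unit charge, mass, and field strength, with the field along the z-axis), applying the magnetic force over and over:

```python
import math

# Made-up units: unit charge, mass, and field strength, B along +z
q, m, B = 1.0, 1.0, 1.0
x, y = 0.0, 0.0
vx, vy = 1.0, 0.0                    # start off moving in the x direction

period = 2 * math.pi * m / (q * B)   # time for one complete circle
steps = 100_000
dt = period / steps

for _ in range(steps):
    # F = qv x B: with B along +z the force is (q*B*vy, -q*B*vx),
    # always perpendicular to the velocity
    vx, vy = vx + (q * B * vy / m) * dt, vy - (q * B * vx / m) * dt
    x, y = x + vx * dt, y + vy * dt

print(x, y)   # back near (0, 0): a closed circle, not a runaway spiral
```

Notice that the speed barely changes over the whole loop; that's the magnetic field's signature move, turning particles without doing work on them.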

As it happens, the Earth has a magnetic field and the Sun fires charged particles at us (as well as in every other direction) in the form of “solar wind”, so the right hand rule can explain most of what we see.  The Earth’s magnetic field points from north to south through the Earth’s core, then curves around and points from south to north along Earth’s surface and out into space (which is why compass needles point north).  So the positive particles flying at us from the Sun are pushed east and the negative particles are pushed west (right hand rule).

Since the Earth’s field is stronger closer to the Earth, the closer a particle is, the faster it will turn.  So an incoming particle’s path bends near the Earth, and straightens out far away.  That’s a surprisingly good way to get a particle’s trajectory to turn just enough to take it back into the weaker regions of the field, where the trajectory straightens out and takes it back into space.  The Earth’s field is stronger or weaker in different areas, and the incoming charged particles have a wide range of energies, so a small fraction do make it to the atmosphere where they collide with air.  Only astronauts need to worry about getting hit directly by particles in the solar wind; the rest of us get shrapnel from those high energy interactions in the upper atmosphere.

If a charge moves in the direction of a magnetic field, not across it, then it’s not pushed around at all.  Around the magnetic north and south poles the magnetic field points directly into the ground, so in those areas particles from space are free to rain in.  In fact, they have trouble not coming straight down.  The result is described by most modern scientists as “pretty”.

Charged particles from space following the magnetic field lines into the upper atmosphere where they bombard the local matter.  Green indicates oxygen in the local matter.

The Earth’s magnetic field does more than just deflect ions or direct them to the poles.  When a charge accelerates it radiates light, and turning a corner is just acceleration in a new direction.  This “braking radiation” slows the charge that creates it (that’s a big part of why the aurora is inspiring as opposed to sterilizing).  If an ion slows down enough it won’t escape back into space and it won’t hit the Earth.  Instead it gets stuck moving in big loops, following the right hand rule all the way, thousands of miles above us (with the exception of our Antarctic readers).  This phenomenon is a “magnetic bottle”, which traps the moving charged particles inside of it.  The doughnut-shaped bottles around Earth are the Van Allen radiation belts.  Ions build up there over time (they do fall out eventually) and still move very fast, making it a dangerous place for delicate electronics and doubly delicate astronauts.

Magnetic bottles, by the way, are the only known way to contain anti-matter.  If you just keep anti-matter in a mason jar, you run the risk that it will touch the mason jar’s regular matter and annihilate.  But ions contained in a magnetic bottle never touch anything.  If that ion happens to be anti-matter: no problem.  It turns out that the Van Allen radiation belts are lousy with anti-matter, most of it produced in those high-energy collisions in the upper atmosphere (it’s basically a particle accelerator up there).  That anti-matter isn’t dangerous or anything.  When an individual, ultra-fast particle of radiation hits you it doesn’t make much of a difference if it’s made of anti-matter or not.

And there isn’t much of it; about 160 nanograms, which (combined with 160 nanograms of ordinary matter) yields about the same amount of energy as 7kg of TNT.  You wouldn’t want to run into it all in one place, but still: not a big worry.
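That "7 kg of TNT" is just E = mc² plus the standard 4.184 MJ-per-kilogram TNT conversion; a quick sanity check:

```python
c = 2.998e8                 # speed of light, m/s
m = 2 * 160e-12             # 160 ng of anti-matter + 160 ng of matter, in kg
energy = m * c**2           # joules released by total annihilation
tnt_kg = energy / 4.184e6   # 1 kg of TNT releases about 4.184 MJ
print(tnt_kg)               # just under 7 kg of TNT
```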

A Van Allen radiation belt simulated in the lab (why is there a map on it?).

In a totally unrelated opinion, this picture beautifully sums up the scientific process: build a thing, see what it does, tell folk about it.  Maybe give it some style (time permitting).




Q: If a long hot streak is less likely than a short hot streak, then doesn’t that mean that the chance of success drops the more successes there are?

One of the original questions was:  I understand the “gambler’s fallacy”, where it is mistaken to assume that if something happens more frequently during a period, then it will happen less frequently in the future.  Example: if I flip a coin 9 times and each time I get HEADS, then to assume that it is more “probable” that the 10th flip will be tails is an incorrect assumption.

I also understand that before I begin flipping that coin in the first place, the odds of getting 10 consecutive HEADS is a very big number and not a mere 50/50.

My question is:  Is it more likely?, more probable?, more expectant?, or is there a higher chance of a coin turning up TAILS after 9 HEADS?

Physicist: Questions of this ilk come up a lot.  Probability and combinatorics, as a field of study, are just mistake factories.  In large part that’s because a single word can turn one calculation into a completely different one, changing not just the result but how you get there.  In this case the problem word is “given”.

Probabilities can change completely when the context, the “conditionals”, change.  For example, the probability that someone is eating a sandwich is normally pretty low, but the probability that a person is eating a sandwich given that there’s half a sandwich in front of them is pretty high.

To understand the coin example, it helps to re-phrase in terms of conditional probabilities.  The probability of flipping ten heads in a row, P(10H), is P(10H) = \left(\frac{1}{2}\right)^{10}\approx 0.1\%.  Not too likely.

The probability of flipping tails given that the 9 previous flips were heads is a conditional probability: P(T | 9H) = P(T) = 1/2.

In the first situation, we’re trying to figure out the probability that a coin will fall a particular way 10 times.  In the second situation, we’re trying to figure out the probability that a coin will fall a particular way only once.  Random things like coins and dice are “memoryless”, which means that previous results have no appreciable impact on future results.  Mathematically, when A and B are independent events, we say P(A|B) = P(A).  For example, “the probability that it’s Tuesday given that today is rainy” is equal to “the probability that it’s Tuesday”, because weather and days of the week are independent.  Similarly, each coin flip is independent, so P(T | 9H) = P(T).

The probability of the “given” may be large or small, but that isn’t important for determining what happens next.  So, after the 9th coin in a row comes up heads everyone will be waiting with bated breath (9 in a row is unusual after all) for number ten, and will be disappointed exactly half the time (number 10 isn’t affected by the previous 9).
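If the math feels unconvincing, simulation is cheap (a sketch; the seed is arbitrary and just makes the run repeatable):

```python
import random

random.seed(0)
ninth_heads = tails_after = 0
for _ in range(500_000):
    flips = [random.random() < 0.5 for _ in range(10)]
    if all(flips[:9]):           # the rare runs that open with 9 heads...
        ninth_heads += 1
        tails_after += not flips[9]

print(tails_after / ninth_heads)   # hovers around 0.5
```

Among the roughly one-in-512 runs that open with nine heads, the tenth flip still comes up tails about half the time.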

You might expect this memorylessness to break down when it comes to human-controlled events.  Nobody is “good at playing craps” or “good at roulette”, but from time to time someone can be good at sport.  But even in sports, where human beings are controlling things, we find that there still aren’t genuine hot or cold streaks (sans injuries).  That’s not to say that a person can’t tally several goalings in a row, but these runs are no more or less common than you’d expect if you modeled the rate of scoring as random.

For example, say Tony Hawk has already gotten three home runs by dribbling a puck into the end zone thrice.  The probability that he’ll get another point isn’t substantially different from the probability that he’d get that first point.  Checkmate.

Notice the ass-covering use of “not substantially different”.  When you’re gathering statistics on the weight of rocks or the speed of light you can be inhumanly accurate, but when you’re gathering statistics on people you can be at best humanly accurate.  There’s enough noise in sports (even bowling) that the best we can say with certainty is that hot and cold streaks are not statistically significant enough to be easily detectable, which they really need to be if you plan to bet on them.


Q: Where do the rules for “significant figures” come from?

Physicist: When you’re doing math with numbers that aren’t known exactly, it’s necessary to keep track of both the number itself and the amount of error that number carries.  Sometimes this is made very explicit.  You may, for example, see something like “3.2 ± 0.08”.  This means: “the value is around 3.2, but don’t be surprised if it’s as high as 3.28 or as low as 3.12… but farther out than that would be mildly surprising.”

120 ± 0.5 cm, due to hair-based error.

However!  Dealing with two numbers is immoral and inescapably tedious.  So, humanity’s mightiest accountants came up with a shorthand: stop writing the number when it becomes pointless.  It’s a decent system.  Significant digits are why driving directions don’t say things like “drive 8.13395942652 miles, then make a slight right”.  Rather than writing a number with its error, just stop writing the number at the digit where noise and errors and lack of damn-giving are big enough to change the next digit.  The value of the number and the error, in one number.  Short.

The important thing to remember about sig figs is that they are an imprecise but typically “good enough” way to deal with errors in basic arithmetic.  They’re not an exact science, and are more at home in the “rules of punctuation” schema than they are in the toolbox of a rigorous scientist.  When a number suddenly stops without warning, the assumption that the reader is supposed to make is “somebody rounded this off”.  When a number is rounded off, the error can be as much as half of the last digit.  For example, 40.950 and 41.04998 both end up being rounded to the same number, and both are reasonable possible values of “41.0” or “41 ± 0.05”.

For example, using significant figures, 2.0 + 0.001 = 2.0.  What the equation is really saying is that the error on that 2.0 is around ±0.05 (the “true” number will probably round out to 2.0).  That error alone is bigger than the entire second number, never mind what its error is (it’s around ±0.0005).  So the sum 2.0 + 0.001 = 2.0, because both sides of the equation are equal to 2 + “an error of around 0.05 or less, give or take”.
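To make the bookkeeping concrete, here's a hypothetical little helper (`round_sig` isn't a standard function, just an illustration) that keeps a given number of significant digits:

```python
from math import floor, log10

def round_sig(x, digits):
    # Hypothetical helper: keep only the first `digits` significant digits
    if x == 0:
        return 0.0
    return round(x, digits - 1 - floor(log10(abs(x))))

print(round_sig(2.0 + 0.001, 2))   # 2.0 -- the 0.001 drowns in the error
print(round_sig(41.04998, 3))      # 41.0, as in the rounding example above
```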

2.0 + 0.001 = 2.0.  The significant digits are conveying a notion of “error”, and the second number is being “drowned out” by the error in the first.

“Rounding off” does a terrible violence to math.  Now the error, rather than being a respectable standard deviation that was painstakingly and precisely derived from multiple trials and tabulations, is instead an order-of-magnitude stab in the dark.

The rules for propagating error (answer gravy below) show that if your error is only known to within an order of magnitude (a power of ten describing its size), then when you’re done adding or multiplying two numbers together, the result will have an error of the same magnitude, in the sense that you’ll retain the same number of significant digits.

For example,

\begin{array}{ll}    1234\times 0.32 \\    = (1234 \pm 0.5)(0.32\pm 0.005) \\    = \left([1.234\pm 0.0005] 10^3\right)\left([3.2\pm 0.05] 10^{-1}\right) & \leftarrow\textrm{Scientific Notation} \\    = \left(1.234\pm 0.0005\right)\left(3.2\pm 0.05\right)10^2 \\    = \left(1.234\times3.2 \pm 1.234\times0.05 \pm 3.2\times0.0005 \pm 0.05\times0.0005\right)10^2 \\    = \left(3.9488 \pm 0.0617 \pm 0.0016 \pm 0.000025\right)10^2 \\    \approx \left(3.9488 \pm 0.0617\right)10^2 \\    \approx \left(3.9488 \pm 0.05\right)10^2 \\    = 3.9 \times 10^2    \end{array}

This last expression could be anywhere from 3.8988 × 10^2 to 3.9988 × 10^2.  The only digits that aren’t being completely swamped by the error are “3.9”.  So the final “correct” answer is “3.9 × 10^2”.  Not coincidentally, this has two significant digits, just like “0.32”, which had the fewest significant digits at the start of the calculation.  The bulk of the error in the end came from “±1.234×0.05”, the size of which was dictated by that “0.05”, which was the error from “0.32”.

Notice that in the second to last step it was callously declared that “0.0617 ≈ 0.05”.  Normally this would be a travesty, but significant figures are the mathematical equivalent of “you know, give or take or whatever”.  Rounding off means that we’re ignoring the true error and replacing it with the nearest half-power of ten (0.5, 0.05, 0.005, …).  That is, there’s a lot of error in how big the error is.  When you’re already introducing errors by replacing numbers like 68, 337, and 145 with “100” (the nearest power of ten), “0.0617 ≈ 0.05” doesn’t seem so bad.  The initial error was on the order of 1 part in 10, and the final error was likewise on the order of 1 part in 10.  Give or take.  This is the secret beauty of sig figs and scientific notation; they quietly retain the “part in ten to the ___” error.

That said, sig figs are kind of a train wreck.  They are not a good way to accurately keep track of errors.  What they do is save people a little effort, manage errors and fudges in a could-be-worse kind of way, and instill a deep sense of fatalism.  Significant figures underscore at every turn the limits either of human expertise or concern.

By far the most common use of sig figs is in grading.  When a student returns an exam with something like “I have calculated the mass of the Earth to be 5.97366729297353452283 × 10^24 kg”, the grader knows immediately that the student doesn’t grok significant figures (the correct answer is “the Earth’s mass is 6 × 10^24 kg, why all the worry?”).  With that in mind, the grader is now a step closer to making up a grade.  The student, for their part, could have saved some paper.

Answer Gravy: You can think of a number with an error as being a “random variable“.  Like rolling dice (a decidedly random event that generates a definitively random variable), things like measuring, estimating, or rounding create random numbers within a certain range.  The better the measurement (or whatever it is that generates the number), the smaller this range.  There are any number of reasons for results to be inexact, but we can sweep all of them under the same carpet by labeling them all “error”, keeping track only of their total size using (usually) standard deviation or variance.  When you see the expression “3 ± 0.1”, this represents a random variable with an average of 3 and a standard deviation of 0.1 (unless someone screwed up or is just making up numbers, which happens a lot).

When adding two random variables, (A±a) + (B±b), the means are easy, A+B, but the errors are a little more complex: (A±a) + (B±b) = (A+B) ± ?.  The standard deviation is the square root of the variance, so a^2 is the variance of the first random variable.  It turns out that the variance of a sum is just the sum of the variances, which is handy.  So, the variance of the sum is a^2 + b^2 and (A\pm a) + (B\pm b) = (A+B) \pm \sqrt{a^2+b^2}.

When adding numbers using significant digits, you’re declaring that a = 0.5\cdot10^{-D_1} and b = 0.5\cdot10^{-D_2}, where D_1 and D_2 count the decimal places (digits after the decimal point) each number has.  Notice that if these are different, then the bigger error takes over.  For example, \sqrt{\left(0.5\cdot10^{-1}\right)^2 + \left(0.5\cdot10^{-2}\right)^2} = 0.5\cdot 10^{-1}\sqrt{1 + 10^{-2}} \approx 0.5\cdot 10^{-1}.  When the digits are the same, the error is multiplied by √2 (same math as last equation).  But again, sig figs aren’t a filbert brush, they’re a rolling brush.  √2?  That’s just another way of writing “1”.
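The "variances add" rule is easy to check by simulation (a sketch with arbitrary numbers; the seed just makes the run repeatable):

```python
import math
import random

random.seed(1)
a, b = 0.05, 0.005          # the two standard deviations
trials = 200_000
sums = [random.gauss(3.2, a) + random.gauss(0.32, b) for _ in range(trials)]

mean = sum(sums) / trials
std = math.sqrt(sum((s - mean)**2 for s in sums) / trials)
print(std, math.sqrt(a**2 + b**2))   # both come out around 0.0502
```

Note how thoroughly the bigger error dominates: sqrt(0.05² + 0.005²) is barely different from 0.05 alone.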

The cornerstone of “sig fig” philosophy; not all over the place, but not super concerned with details.

Multiplying numbers is one notch trickier, and it demonstrates why sig figs can be considered more clever than being lazy normally warrants.  When a number is written in scientific notation, the information about the size of the error is exactly where it is most useful.  The example above of “1234 x 0.32″ gives some idea of how the 10’s and errors move around.  What that example blurred over was how the errors (the standard deviations) should have been handled.

First, the standard deviation of a product is a little messed up: (A\pm a)(B\pm b) = AB \pm\sqrt{A^2b^2 + B^2a^2 + a^2b^2}.  Even so!  When using sig figs the larger error is by far the more important, and the product once again has the same number of sig figs.  In the example, 1234 × 0.32 = (1.234 ± 0.0005)(3.2 ± 0.05) × 10^2.  So, a = 0.0005 and b = 0.05.  Therefore, the standard deviation of the product must be:

\begin{array}{ll}    \sqrt{A^2b^2 + B^2a^2 + a^2b^2} \\[2mm]    = Ab\sqrt{1 + \frac{B^2a^2}{A^2b^2} + \frac{a^2}{A^2}} \\[2mm]    = (1.234) (0.05) \sqrt{1.00067} \\[2mm]    \approx(1.234)(0.05)\\[2mm]    \approx 0.05    \end{array}
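Plugging the example's numbers into that product formula confirms that the bigger relative error runs the show (symbols as in the derivation above):

```python
import math

A, a = 1.234, 0.0005   # 1234 written in scientific notation, with its rounding error
B, b = 3.2, 0.05       # 0.32, ditto (the powers of ten are set aside)

err = math.sqrt(A**2 * b**2 + B**2 * a**2 + a**2 * b**2)
print(err)             # about 0.0617, essentially just A*b
```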

Notice that when you multiply numbers, the error increases substantially each time (by a factor of about 1.234 this time).  According to Benford’s law, the average first digit of a number is 3.440*.  As a result, if you’re pulling numbers “out of a hat”, then on average every two multiplies should knock off a significant digit, because 3.440^2 ≈ 1 × 10^1.

Personally, I like to prepare everything algebraically, keep track of sig figs and scientific notation from beginning to end, then drop the last 2 significant digits from the final result.  Partly to be extra safe, but mostly to do it wrong.

*they’re a little annoying, right?
