For the first time ever, you can buy a book!

Physicist: Over the past year I’ve been putting together a collection of some (fifty-four) of my favorite and most elucidating articles from the past decade, revised, updated, and in book form.  You can get your very own copy here!

I wrote a book! It’s good. You should buy it.  The cover is a false-color x-ray of a chameleon, which is hilarious.

In an effort to plug, I’ll be a guest on Story Collider, which will be recording at the Tipsy Crow in San Diego on Thursday at 7:00.  It’s free and should be fun, so if you’d like to show up, you can learn more about the whole thing and register here.

And for those of you attending the Joint Mathematics Meeting (Comic Con for nerds) in San Diego this year, I’ll be at Springer’s booth on Friday.


This is Springer’s first foray into “popular science”.  It’s divided into four chapters: “big things”, “small things”, “in between things”, and “not things” (math).

I aimed it at my younger self, who was unimpressed by the vagueness of pop sci and frustrated by the technicalness of actual sci.  The articles in “Do Colors Exist?” cover the important ideas intuitively and without dumbing down, but also assume that you don’t know a bunch of fancy terminology.  Even if physics isn’t your thing, this is exactly the sort of gift you could give a nerd/science friend without embarrassment.  It provides satisfying answers for the man-on-the-street, while including details for the “advanced” reader.


The blurb from the back of the book (which I didn’t write) reads:

Why do polished stones look wet? How does the Twin Paradox work? How can we be sure that pi never repeats? How does a quantum computer break encryption? Discover the answers to these, and other profound physics questions!

This fascinating book presents a collection of articles based on conversations and correspondences between the author and complete strangers about physics and math. The author, a researcher in mathematical physics, responds to dozens of questions posed by inquiring minds from all over the world, ranging from the everyday to the profound.

Rather than unnecessarily complex explanations mired in mysterious terminology and symbols, the reader is presented with the reasoning, experiments, and mathematics in a casual, conversational, and often comical style. Neither over-simplified nor over-technical, the lucid and entertaining writing will guide the reader from each innocent question to a better understanding of the weird and beautiful universe around us.

Advance praise for Do Colors Exist?: “Every high school science teacher should have a copy of this book. The individual articles offer enrichment to those students who wish to go beyond a typical ‘dry curriculum’. The articles are very fun. I probably laughed out loud every 2-3 minutes. This is not easy to do. In fact, my children are interested in the book because they heard me laughing so much.” -Ken Ono, Emory University


Keeping this website ad-free and cost-free is important, so this will be the last time you’ll have to hear about this.

Posted in -- By the Physicist | 10 Comments

Q: Where is all the anti-matter?

Physicist: Anti-matter is exactly the same as ordinary matter but opposite, in very much the same way that a left hand is exactly the same as a right hand… but opposite.  Every anti-particle has exactly the same mass as its regular-particle counterpart, but with a bunch of its other characteristics flipped.  For example, protons have an electric charge of +1 and a baryon number of +1.  Anti-protons have an electric charge of -1 and a baryon number of -1.  The positive/negativeness of these numbers is irrelevant.  A lot like left and right hands, the only thing that’s important about positive charges is that they’re the opposite of negative charges.

Hydrogen is stable because its constituent particles have opposite charges and opposites attract.  Anti-hydrogen is stable for exactly the same reason.

Anti-matter acts, in (nearly) every way we can determine, exactly like matter.  Light (which doesn’t have an anti-particle) interacts with one in exactly the same way as the other, so there’s no way to just look at something and know which is which.  The one and only exception we’ve found so far is beta decay.  In beta decay a neutron fires a new electron out of its “south pole”, whereas an anti-neutron fires an anti-electron out of its “north pole”.  This is exactly the difference between left and right hands.  Not a big deal.

Left: A photograph of an actual flower made of regular matter. Right: An artistic representation of a flower made of anti-matter.

So when we look out into the universe and see stars and galaxies, there’s no way to tell which matter camp, regular or anti, they fall into.  Anti-stars would move and orbit the same way and produce light in exactly the same way as ordinary stars.  Like the sign of electrical charge or the handedness of hands, matter and anti-matter are indistinguishable until you compare them.

But you have to be careful when you do, because when a particle comes into contact with its corresponding anti-particle, the two cancel out and dump all of their matter into energy (usually lots of light).  If you were to casually grab hold of 1 kg of anti-matter, it (along with 1 kg of you) would release about the same amount of energy as the largest nuclear detonation in history.

The Tsar Bomba from 100 miles away.  This is what 2 kg worth of energy can do (when released all at once).

To figure out exactly how much energy is tied up in matter (either kind), just use the little known relation between energy and matter: E=mc^2.  When you do, be sure to use standard units (kilograms for mass, meters and seconds for the speed of light, and Joules for energy) so that you don’t have to sweat the unit conversions.  For 2 kg of matter, E = (2 kg)(3×10^8 m/s)^2 = 1.8×10^17 J.
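As a quick check of that arithmetic, here is a minimal Python sketch.  The 50 megaton yield for the Tsar Bomba and the 4.184×10^15 J per megaton of TNT conversion are commonly quoted figures brought in for the comparison (they aren't from the text above), and the names `rest_energy` and `TSAR_BOMBA_JOULES` are just made up for this example.

```python
# Speed of light in meters per second (exact by definition of the meter).
C = 299_792_458.0

def rest_energy(mass_kg):
    """Energy in joules locked up in `mass_kg` kilograms of matter (E = m c^2)."""
    return mass_kg * C ** 2

# Assumption: ~50 megatons of TNT for the Tsar Bomba, 4.184e15 J per megaton.
TSAR_BOMBA_JOULES = 50 * 4.184e15

energy = rest_energy(2.0)  # 1 kg of anti-matter plus 1 kg of you
print(f"2 kg of matter: {energy:.2e} J")
print(f"Tsar Bomba (assumed yield): {TSAR_BOMBA_JOULES:.2e} J")
print(f"ratio: {energy / TSAR_BOMBA_JOULES:.2f}")
```

Using the exact speed of light instead of 3×10^8 m/s changes the answer by a fraction of a percent, so the result still rounds to about 1.8×10^17 J.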

When anti-matter and matter collide it’s hard to miss.  We can’t tell whether a particular chunk of stuff is matter or anti-matter just by looking at it, but because we don’t regularly see stupendous space kablooies as nebulae collide with anti-nebulae, we can be sure that (at least in the observable universe) everything we see is regular matter.  Or damn near everything.  Our universe is a seriously unfriendly place for anti-matter.

So why would we even suspect that anti-matter exists?  First, when you re-write Schrödinger’s equation (an excellent way to describe particles and whatnot) to make sense in the context of relativity (the fundamental nature of spacetime) you find that the equation that falls out has two solutions; a sort of left and right form for most kinds of particles (matter and anti-matter).  Second, and more importantly, we can actually make anti-matter.

Very high energy situations, like those in particle accelerators, randomly generate new particles.  But these new particles are always produced in balanced pairs; for every new proton (for example) there’s a new anti-proton.  The nice thing about protons is that they have a charge and can be pushed around with magnets.  Conveniently, anti-protons have the opposite charge and are pushed in the opposite direction by magnets.  So, with tremendous cleverness and care, the shrapnel of high speed particle collisions can be collected and sorted.  We can collect around a hundred million anti-particles at a time using particle accelerators (to create them) and particle decelerators (to stop and store them).

Anti-matter, it’s worth mentioning, is (presently) an absurd thing to build a weapon with.  Considering that it takes the energy of a small town to run a decent particle accelerator, and that a mere hundred million anti-protons have all the destructive power of a single drop of rain, it’s just easier to throw a brick or something.

The highest energy particle interactions we can witness happen in the upper atmosphere; to see them we just have to be patient.  The “Oh My God Particle” arrived from deep space with around ninety million times the energy of the particle beams in CERN, but we only see such ultra-high energy particles every few months and from dozens of miles away.  We bothered to build CERN so we could see (comparatively feeble) particle collisions at our leisure and from really close up.

Those upper atmosphere collisions produce both matter and anti-matter, some tiny fraction of which ends up caught in the Van Allen radiation belts by the Earth’s magnetic field.  In all, there are a few nanograms of anti-matter up there.  Presumably, every planet and star with a sufficient and stable magnetic field has a tiny, tiny amount of anti-matter in orbit just like we do.  So if you’re looking for all natural anti-matter, that’s the place to look.

But if anti-matter and matter are always created in equal amounts, and there’s no real difference between them (other than being different from each other), then why is all of the matter in the universe regular matter?

No one knows.  It’s a total mystery.  Isn’t that exciting?  Baryon asymmetry is a wide open question and, not for lack of trying, we’ve got nothing.

The rose photo is from here.

Update: A commenter kindly pointed out that a little anti-matter is also produced during solar flares (which are definitively high-energy) and streams away from the Sun in solar wind.

Posted in -- By the Physicist, Particle Physics, Physics | 12 Comments

Q: Is it possible to write a big number using a small number? Is there a limit to how much information can be compressed?

Physicist: Although there are tricks that work in very specific circumstances, in general when you “encode” any string of digits using fewer digits, you lose some information.  That means that when you want to reverse the operation and “decode” what you’ve got, you won’t recover what you started with.  What we normally call “compressed information” might more accurately be called “better bookkeeping”.

When you encode something, you’re translating all of the symbols in one set into another set. The second set needs to be bigger (left) so that you can reverse the encoding. If there are fewer symbols in the new set (right), then there are at least a few symbols (orange dots) in the new set that represent several symbols in the old set.  So, when you try to go back to the old set, it’s impossible to tell which symbol is the right one.

As a general rule of thumb, count up the number of possible symbols (which can be numbers, letters, words, anything really) and make sure that the new set has more.  For example, a single letter is one of 26 symbols (A, B, …, Z), a single digit is one of 10 (0, 1, …, 9), and two digits is one of 10^2=100 (00, 01, …, 99).  That means that no matter how hard you try, you can’t encode a letter with a single digit, but you can easily do it with 2 (because 10 < 26 < 100).  The simplest encoding in this case is a=01, …, z=26, and the decoding is equally straightforward.  This particular scheme isn’t terribly efficient (because 00 and 27-99 remain unused), but it is “lossless” because you’ll always recover your original letter.  No information is lost.

Similarly, the set of every possible twenty-seven letter word has 26^27 = 1.6×10^38 different permutations in it (from aaaaaaaaaaaaaaaaaaaaaaaaaaa to zzzzzzzzzzzzzzzzzzzzzzzzzzz).  So, if you wanted to encode “Honorificabilitudinitatibus” as a number, you’d need at least 39 numerical digits (because 39 is the first whole number bigger than log_10(26^27)=38.204).
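The counting argument is easy to try out.  The sketch below treats a word as a number written in base 26, which is one simple lossless scheme among many and not anything special; the function names (`digits_needed`, `encode`, `decode`) are invented for this example.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def digits_needed(num_symbols, length):
    """Decimal digits needed to give every string of `length` symbols its own number."""
    # There are num_symbols**length strings, numbered 0 up to num_symbols**length - 1.
    return len(str(num_symbols ** length - 1))

def encode(word):
    """Read a word as a base-26 number (a=0, b=1, ..., z=25)."""
    number = 0
    for letter in word.lower():
        number = number * 26 + ALPHABET.index(letter)
    return number

def decode(number, length):
    """Undo encode(); the length is needed to restore any leading 'a's."""
    letters = []
    for _ in range(length):
        number, remainder = divmod(number, 26)
        letters.append(ALPHABET[remainder])
    return "".join(reversed(letters))

word = "Honorificabilitudinitatibus"
code = encode(word)
print(digits_needed(26, 1))          # digits for a single letter
print(digits_needed(26, len(word)))  # digits for any 27 letter string
print(code, decode(code, len(word)))
```

Because every 27 letter string gets its own number, decoding always recovers the original word (in lower case), and no scheme can get away with fewer digits while still covering all of those strings.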

π gives us a cute example of the impossibility of compressing information.  Like effectively all numbers, you can find any number of any length in π (probably).  Basically, the digits in π are random enough that if you look for long enough, then you’ll find any number you’re looking for in a “million monkeys on typewriters” sort of way.

So, if every number shows up somewhere in π, it seems reasonable to think that you could save space by giving the address of the number (what digit it starts at) and its length.  For example, since π=3.141592653589793238… if your phone number happens to be “415-926-5358“, then you could express it as “10 digits starting at the 2nd” or maybe just “10,2”.  But you and your phone number would be extremely lucky.  While there are some numbers like this, that can be represented using really short addresses, on average the address is just as long as the number you’re looking for.  This website allows you to type in any number to see where it shows up in the first hundred million digits in π.  You’ll find that, for example, “1234567” doesn’t appear until the 9470344th digit of π.  This seven digit number has a seven digit address, which is absolutely typical.

The exact same thing holds true for any encoding scheme.  On average, encoded data takes up just as much room as the original data.

However!  It is possible to be clever and some data is inefficiently packaged.  You could use 39 digits to encode every word, so that you could handle any string of 27 or fewer letters.  But the overwhelming majority of those strings are just noise, so why bother having a way to encode them?  Instead you could do something like enumerating all of the approximately 200,000 words in the English dictionary (1=”aardvark”, …., 200000=”zyzzyva”), allowing you to encode any word with only six digits.  It’s not that data is being compressed, it’s that we’re doing a better job keeping track of what is and isn’t a word.

What we’ve done here is a special case of actual data “compression”.  To encode an entire book as succinctly as possible (without losing information), you’d want to give shorter codes to words you’re likely to see (1=”the”), give longer codes to words you’re unlikely to see (174503=”absquatulate“), and no code for words you’ll never see (“nnnfrfj”).
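One standard way to hand out short codes to common words and long codes to rare ones is Huffman coding.  That name isn’t used above, so take the following as a minimal sketch of the idea rather than the method any particular compressor uses; the sample sentence and the function name `huffman_code` are made up for illustration.

```python
import heapq
from collections import Counter

def huffman_code(frequencies):
    """Return {symbol: bitstring}, with more frequent symbols getting shorter codes."""
    # Each heap entry is (weight, tiebreaker, {symbol: code so far}).
    # The unique tiebreaker means the dictionaries are never compared.
    heap = [(weight, i, {symbol: ""})
            for i, (symbol, weight) in enumerate(frequencies.items())]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {symbol: "0" for symbol in heap[0][2]}
    tiebreaker = len(heap)
    while len(heap) > 1:
        # Merge the two rarest groups, prefixing their codes with 0 and 1.
        weight1, _, codes1 = heapq.heappop(heap)
        weight2, _, codes2 = heapq.heappop(heap)
        merged = {symbol: "0" + code for symbol, code in codes1.items()}
        merged.update({symbol: "1" + code for symbol, code in codes2.items()})
        heapq.heappush(heap, (weight1 + weight2, tiebreaker, merged))
        tiebreaker += 1
    return heap[0][2]

words = "the cat and the dog and the bird saw the absquatulate".split()
codes = huffman_code(Counter(words))
encoded = "".join(codes[w] for w in words)

for word, code in sorted(codes.items(), key=lambda item: len(item[1])):
    print(f"{word:>13}: {code}")
print(len(encoded), "bits for", len(words), "words")
```

No code is the beginning of any other code, so the string of bits can be read back unambiguously without separators.  Words that never appear (“nnnfrfj”) simply get no code at all.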

Every individual thing needs to have its own code, otherwise you can’t decode and information is lost.  So this technique of giving the most common things the shortest code is the best you can do as far as compression is concerned.  This is literally how information is defined.  Following this line of thinking, Claude Shannon derived “Shannon Entropy” which describes the density of information in a string of symbols.  The Shannon entropy gives us an absolute minimum to how much space is required for a block of data, regardless of how clever you are about encoding it.

Answer Gravy: There is a limit to cleverness, and in this case it’s Shannon’s “source coding theorem“.  In the densest possible data, each symbol shows up about as often as every other and there are no discernible patterns.  For example, “0000000001000000” can be compressed a lot, while “0100100101101110” can’t.  Shannon showed that the entropy of a string of symbols, sometimes described as the “average surprise per symbol”, tells you how compactly that string can be written.

Incidentally, it’s also a great way to define information, which is exactly what Shannon did in this remarkable (and even fairly accessible) paper.

If the nth symbol in your set shows up with probability P_n, then the entropy in bits (the average number of bits per symbol) is: H=-\sum_n P_n\log_2\left(P_n\right).  The entropy tells you both the average information per character and the highest density that can be achieved.

For example, in that first string of mostly zeros, there are 15 zeros and 1 one.  So, P_0=\frac{15}{16}, P_1=\frac{1}{16}, and H=-\frac{15}{16}\log_2\left(\frac{15}{16}\right)-\frac{1}{16}\log_2\left(\frac{1}{16}\right)\approx0.337.  That means that each digit only uses 0.337 bits on average.  So a sequence like this (or one that goes on a lot longer) could be made about a third as long.

In the second, more balanced string, P_0=P_1=\frac{8}{16}=\frac{1}{2} and H=-\frac{1}{2}\log_2\left(\frac{1}{2}\right)-\frac{1}{2}\log_2\left(\frac{1}{2}\right)=\frac{1}{2}+\frac{1}{2}=1.  In other words, each digit uses about 1 bit of information on average; this sequence is already about as dense as it can get.
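Both of those calculations can be checked with a few lines of Python.  This is a minimal sketch that estimates each probability from how often the symbol appears in the string itself, exactly as in the two examples; the function name `shannon_entropy` is just a label for this example.

```python
import math
from collections import Counter

def shannon_entropy(symbols, base=2):
    """Average information per symbol, H = -sum(P_n * log(P_n)), in the given base."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((count / total) * math.log(count / total) / math.log(base)
                for count in counts.values())

mostly_zeros = "0000000001000000"
balanced = "0100100101101110"

print(f"{shannon_entropy(mostly_zeros):.3f} bits per digit")
print(f"{shannon_entropy(balanced):.3f} bits per digit")
```

The first string comes out to about 0.337 bits per digit and the second to 1 bit per digit, matching the values worked out by hand.  Keep in mind that this only looks at how often each symbol appears, not at patterns between symbols.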

Here the log was done in base 2, but it doesn’t have to be; if you did the log in base 26, you’d know the average number of letters needed per symbol.  In base 2 the entropy is expressed in “bits”, in base e (the natural log) the entropy is expressed in “nats”, and in base π the entropy is in “slices”.  Bits are useful because they describe information in how many “yes/no” questions you need, nats are more natural (hence the name) for things like thermodynamic entropy, and slices are useful exclusively for this one joke about π (its utility is debatable).

Posted in -- By the Physicist, Combinatorics, Computer Science, Entropy/Information, Math | 8 Comments

Q: Is reactionless propulsion possible?

Physicist: In a word: no.

A reactionless drive is basically a closed box with the ability to just start moving, on its own, without touching or exuding anything.  The classic sci-fi tropes of silent flying cars or hovering UFOs are examples of reactionless drives.

The problem at the heart of all reactionless drives is that they come into conflict with Newton’s famous law “for every action there is an equal and opposite reaction” (hence the name).  To walk in one direction, you push against the ground in the opposite direction.  To paddle your canoe forward, you push water backward.  The stuff you push backward so you can move forward is called the “reaction mass”.

In order to move stuff forward, you need to move other stuff backward.

This is a universal law, so unfortunately it applies in space.  If you want to move in space (where there’s nothing else around) you need to bring your reaction mass with you.  This is why we use rockets instead of propellers or paddles in space; a rocket is a mass-throwing machine.

But mass is at a premium in space.  It presently costs in the neighborhood of $2000/kg to send stuff to low Earth orbit (a huge improvement over just a few years ago).  So, the lighter your rocket, the better.  Typically, a huge fraction of a rocket’s mass is fuel/reaction mass, so the best way to make spaceflight cheaper and more feasible is to cut down on the amount of reaction mass.  The only way to do that at present is to use that mass more efficiently.  If you can throw mass twice as fast, you’ll push your rocket twice as hard.  Traditionally, that’s done by burning fuel hotter and under higher pressure so it comes shooting out faster.

In modern rockets the exhaust is moving on the order of 2-3 km per second.  However, your reaction mass doesn’t need to be fuel, it can be anything.  Ion drives fire ionized gas out of their business end at up to 50 km per second, meaning they can afford to carry far less reaction mass.  Space craft with ion drives are doubly advantaged: not only are they throwing their reaction mass much faster, but since they carry less of it, they can be smaller and easier to push.
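To put rough numbers on that advantage, the sketch below uses the Tsiolkovsky rocket equation, which isn’t spelled out above: the change in speed equals the exhaust speed times the natural log of (starting mass / final mass).  The 5 km/s maneuver is a hypothetical figure picked only for illustration, and the 3 and 50 km/s exhaust speeds are the ones quoted above.

```python
import math

def propellant_fraction(delta_v, exhaust_speed):
    """Fraction of a craft's starting mass that must be reaction mass
    to change its speed by delta_v (both speeds in the same units)."""
    # Tsiolkovsky rocket equation: delta_v = exhaust_speed * ln(m_start / m_end)
    return 1.0 - math.exp(-delta_v / exhaust_speed)

DELTA_V = 5.0  # km/s, a hypothetical maneuver chosen only for illustration

for name, exhaust_speed in [("chemical rocket", 3.0), ("ion drive", 50.0)]:
    fraction = propellant_fraction(DELTA_V, exhaust_speed)
    print(f"{name}: {fraction:.0%} of the starting mass is reaction mass")
```

Under these assumptions the chemical rocket has to be roughly four fifths reaction mass, while the ion drive needs only about a tenth.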

The drawback is that ion drives dole out that reaction mass a tiny bit at a time.  The most powerful ion drives produce about 0.9 ounces of force.  A typical budgie (a small, excitable bird) weighs about 1.2 ounces and, since they can lift themselves, budgies can generate more force than any ion drive presently in production.

Compared to rockets, ion drives pack a greater punch for a given amount of reaction mass.  However, they deliver that punch over a very long time and with less force than a budgie.

Given the limitations and inefficiencies, wouldn’t it be nice to have a new kind of drive that didn’t involve reaction mass at all?  You’d never have to worry about running out of reaction mass; all you’d need is a power supply, and you could fly around for as long as you want.

That’s not to say that propellantless propulsion isn’t possible.  There are ways to move without carrying reaction mass with you.  You can use light as your exhaust (a “photon drive”), but you’ll notice that a flashlight or laser pointer doesn’t have much of a kick.  And you can slingshot around planets, but then the planet is your reaction mass.

The problem with reactionless drives, fundamentally, is that Newton’s third law has no (known) exceptions.  It is one of the most bedrock, absolute rules in any science and a keystone in our understanding of the universe.  On those rare occasions when someone thought they had found an exception, it always turned out to be an issue with failing to take something into account.  For example, when a neutron decays into a proton and electron, the new pair of particles don’t fly apart in exactly opposite directions.  Instead, the pair have a net momentum that the original neutron did not.

When a stationary neutron (gray) decays into a proton (red) and electron (blue), the new pair flies apart, but always favors one direction.  Newton’s laws imply that there must be a third particle moving in the other direction to balance the other two.

The implication (according to Newton’s law) is that there must be another particle to balance things out.  And that’s exactly the case.  Although the “extra particle that’s really hard to detect” theory was first proposed in 1930, it wasn’t until 1956 that neutrinos were finally detected and verified to exist.  The imbalanced momentum, an apparent violation of Newton’s laws, came down to a missing particle.  Today neutrinos are one of the four ways we can learn about space, along with light, gravitational waves, and go-there-yourself.

There are plenty of ideas floating around about how to create reactionless drives, such as the Woodward effect or the Alcubierre warp drive.  But in no case do these ideas use established science.  The Woodward effect depends on Mach’s principle (that somehow inertia is caused by all the other matter in the universe), and reads like a pamphlet a stranger on the street might give you, while the Alcubierre drive needs lots of negative energy, which flat-out isn’t a thing.

Science is all about learning things we don’t know and trying to prove our own theories wrong.  While scientific discovery is certainly awe inspiring, it is also the exact opposite of wishful thinking.  That said, good science means keeping an open mind much longer than any reasonable person would be willing to.  In the ultimate battle between theoretical and experimental physics, the experimentalists always win.  If someone ever manages to create a self-moving, reactionless drive, then all the theories about why that’s impossible go out the window.  But as of now, those theories (standard physics) are holding firm.  We can expect that for the rest of forever, all space craft will have a tail of exhaust behind them.

Posted in -- By the Physicist, Physics, Relativity | 15 Comments

Q: How can I set up a random gift exchange that’s different from year to year?

The original question was: I’ve got a large family and we do a yearly gift exchange one person to one person. And I’d like to make a algorithm or something to do random selection without repeating for some time. And to be able to take old data and put it in to avoid repeats. I’m pretty good at math I’m 29 and my trade is being a machinist so I’ve got some understanding of how things kinda work.

Physicist: A good method should be trivially easy to keep track of from year to year, work quickly and simply, never assign anyone to themselves, make sure that everyone gives to everyone else exactly once, and be unobtrusive enough that it doesn’t bother anybody.  Luckily, there are a few options.  Here’s one of the simplest.

Give each of the N people involved a number from 1 to N.  The only things you’ll need to keep track of from year to year are everyone’s number and an index number, k.  Regardless of who the best gift-giver is, there are no best numbers and no way to game the system, and since no one wants to keep track of a list from one year to the next, you should choose something simple like alphabetical numbering (Aaron Aardvark would be #1 and Zylah von Zyzzyx would be #N).

Draw N dots in a circle then, starting with dot #1, draw an arrow to dot #(1+k), and repeat.  There’s nothing special about any particular dot; since they’re arranged in a circle #1 has just as much a claim to be first as #3 or #N.  When you finally get back to dot #1, you’ll have drawn a star.  Each different value of k, from 1 to N-1, will produce a different star and a different gift-giving pattern.  For example, if N=8 and k=3, then you get a star that describes the pattern {1→4→7→2→5→8→3→6→1} (that is “1 gives to 4 who gives to 7 …”).

When N is prime, or more generally when N and k have no common factors, you’ll hit every dot with a single star.  Otherwise, you have to draw a few different stars (specifically, “the greatest common divisor of k and N” different stars).  For example, if N=8 and k=2, then you need two stars: {1→3→5→7→1} and {2→4→6→8→2}.
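For anyone who would rather let a computer draw the stars, here is a minimal Python sketch of the same rule: person #i gives to person #(i+k), wrapping around the circle.  The function names `star_exchange` and `stars` are invented for this example.

```python
from math import gcd

def star_exchange(n, k):
    """Map each giver (numbered 1..n) to a receiver k steps around the circle."""
    if not 1 <= k <= n - 1:
        raise ValueError("k must be between 1 and n-1 so nobody gives to themselves")
    return {giver: (giver - 1 + k) % n + 1 for giver in range(1, n + 1)}

def stars(assignment):
    """Split a giver -> receiver map into its separate stars (cycles)."""
    seen = set()
    result = []
    for start in sorted(assignment):
        if start in seen:
            continue
        star = [start]
        seen.add(start)
        person = assignment[start]
        while person != start:
            star.append(person)
            seen.add(person)
            person = assignment[person]
        result.append(star)
    return result

for k in (3, 2):
    found = stars(star_exchange(8, k))
    print(f"N=8, k={k}: {found}  (gcd = {gcd(8, k)})")
```

With N=8 this reproduces the single star for k=3 and the two separate stars for k=2.  From year to year, the only thing to change is k.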

Given N points, you can create a star by counting k points and drawing a connection.  This works great if N is a prime (bottom), since you’ll always hit every point, but when N isn’t prime you’ll often need to create several stars (top).

That’s why drawing is a good way of doing this math: it’s easy to see when your star is weird-shaped (your k changed halfway through) and really easy to see when you’ve missed some of the dots.

The “star method” gives you N-1 years of “cyclic permutations” (“cyclic” because when you get to the end you just go back to the beginning).  However, for large values of N that’s only a drop in the permutation sea.  Were you so determined, you could cyclically permute N things in (N-1)! ways.

However!  With a family of 6 you’d need 5! = 1×2×3×4×5 = 120 years to get through every permutation.  More than time enough for any family to gain or lose members, or for some helpful soul to start questioning the point.  Moreover!  Each of those permutations is similar enough to others that you’ll start to feel as though you’re doing a lot of repeating.  For example: {1→2→3→4→5→6→1}, {1→2→3→4→6→5→1}, {1→2→3→5→6→4→1}, …

For those obsessive, immortal gift-givers who want to hit every permutation with the least year-to-year change, just to fight off boredom for another thousand years, there’s Heap’s algorithm.  For the rest of us, drawing stars is more than good enough.

Posted in -- By the Physicist, Combinatorics, Experiments, Math | 2 Comments

Q: How does “1+2+3+4+5+… = -1/12” make any sense?

Physicist: When wandering across the vast plains of the internet, you may have come across this bizarre fact, that 1+2+3+4+\ldots=-\frac{1}{12}, and immediately wondered: Why isn’t it infinity?  How can it be a fraction?  Wait… it’s negative?

An unfortunate conclusion may be to say “math is a painful, incomprehensible mystery and I’m not smart enough to get it”.  But rest assured, if you think that 1+2+3+4+\ldots=\infty, then you’re right.  Don’t let anyone tell you different.  The -\frac{1}{12} thing falls out of an obscure, if-it-applies-to-you-then-you-already-know-about-it, branch of mathematics called number theory.

Number theorists get very excited about the “Riemann Zeta Function”, ζ(s), which is equal to \zeta(s)=\sum_{n=1}^\infty\left(\frac{1}{n}\right)^s=1+\frac{1}{2^s}+\frac{1}{3^s}+\frac{1}{4^s}+\ldots whenever this summation is equal to a number.  If you plug s=-1 into ζ(s), then it seems like you should get \zeta(-1)=1+2+3+4+\ldots, but in this case the summation isn’t a number (it’s infinity) so it’s not equal to ζ(s).  The entire -1/12 thing comes down to the fact that \zeta(-1)=-\frac{1}{12}, however (and this is the crux of the issue), when you plug s=-1 into ζ(s), you aren’t using that sum.  ζ(s) is a function in its own right, which happens to be equal to 1+\frac{1}{2^s}+\frac{1}{3^s}+\frac{1}{4^s}+\ldots for s>1, but continues to exist and make sense long after the summation stops working.

The bigger s is, the smaller each term, \frac{1}{n^s}, and ζ(s) will be.  As a general rule, if s>1, then ζ(s) is an actual number (not infinity).  When s=1, \zeta(1)=1+\frac{1}{2}+\frac{1}{3}+\frac{1}{4}+\ldots=\infty.  It is absolutely reasonable to expect that for s<1, ζ(s) will continue to be infinite.  After all, ζ(1)=∞ and each term in the sum only gets bigger for lower values of s.  But that’s not quite how the Riemann Zeta Function is defined.  ζ(s) is defined as \zeta(s)=\sum_{n=1}^\infty\left(\frac{1}{n}\right)^s when s>1 and as the “analytic continuation” of that sum otherwise.

 You ever analytically continue a staircase, but there isn't actually another step, so you're like "whoa, my math!"

You know what this bridge would do if it kept going.  “Analytic continuation” is essentially the same idea; take a function that stops (perhaps unnecessarily) and continue it in exactly the way you’d expect.

The analytic continuation of a function is unique, so nailing down ζ(s) for s>1 is all you need to continue it out into the complex plane.

Complex numbers take the form “A+Bi” (where i^2=-1).  The only thing about complex numbers you’ll need to know here is that complex numbers are pairs of real numbers (regular numbers), A and B.  Being a pair of numbers means that complex numbers form the “complex plane“, which is broader than the “real number line“.  A is called the “real part”, often written A=Re[A+Bi], and B is the “imaginary part”, B=Im[A+Bi].

That blow up at s=1 seems insurmountable on the real number line, but in the complex plane you can just walk around it to see what’s on the other side.

Left: ζ(s) for values of s>1 on the real number line. Right: The same red function surrounded by its “analytic continuation” into the rest of the complex plane.  Notice that, except for s=1, ζ(s) is completely smooth and well behaved.

\zeta(s)=1+\frac{1}{2^s}+\frac{1}{3^s}+\frac{1}{4^s}+\ldots defines a nice, smooth function for Re[s]>1.  When you extend ζ(s) into the complex plane this summation definition rapidly stops making sense, because the sum “diverges” when s≤1.  But there are two ways to diverge: a sum can either blow up to infinity or just never get around to being a number.  For example, 1-1+1-1+1-1+… doesn’t blow up to infinity, but it also never settles down to a single number (it bounces between one and zero).  ζ(s) blows up at s=1, but remains finite everywhere else.  If you were to walk out into the complex plane you’d find that right up until the line where Re[s]=1, ζ(s) is perfectly well-behaved.  Looking only at the values of ζ(s) you’d see no reason not to keep going, it’s just that the \zeta(s)=1+\frac{1}{2^s}+\frac{1}{3^s}+\frac{1}{4^s}+\ldots formulation suddenly stops working for Re[s]≤1.

But that’s no problem for a mathematician (see the answer gravy below).  You can follow ζ(s) from the large real numbers (where the summation definition makes sense), around the blow up at s=1, to s=-1 where you find a completely mundane value.  It’s -\frac{1}{12}.  No big deal.

So the 1+2+3+\ldots=-\frac{1}{12} thing is entirely about math enthusiasts being so (justifiably) excited about ζ(s) that they misapply it, and has nothing to do with what 1+2+3+… actually equals.

This shows up outside of number theory.  In order to model particle interactions correctly, it’s important to take into account every possible way for that interaction to unfold.  This means taking an infinite sum, which often goes well (produces a finite result), but sometimes doesn’t.  It turns out that physical laws really like functions that make sense in the complex plane.  So when “1+2+3+…” started showing up in calculations of certain particle interactions, physicists turned to the Riemann Zeta function and found that using -1/12 actually turned out to be the right thing to do (in physics “the right thing to do” means “the thing that generates results that precisely agree with experiment”).

A less technical shortcut (or at least a shortcut with the technicalities swept under the rug) for why the summation is -1/12 instead of something else can be found here.  For exactly why ζ(-1)=-1/12, see below.

Answer Gravy: Figuring out that ζ(-1)=-1/12 takes a bit of work.  You have to find an analytic continuation that covers s=-1, and then actually evaluate it.  Somewhat surprisingly, this is something you can do by hand.

Often, analytically continuing a function comes down to re-writing it in such a way that its “poles” (the locations where it blows up) don’t screw things up more than they absolutely have to.  For example, the function f(z)=\sum_{n=0}^\infty z^n only makes sense for -1<z<1, because it blows up at z=1 and doesn’t converge at z=-1.

f(z) can be explicitly written without a summation, unlike ζ(s), which gives us some insight into why it stops making sense for |z|≥1.  It just so happens that for |z|<1, f(z)=\frac{1}{1-z}.  This clearly blows up at z=1, but is otherwise perfectly well behaved; the issues at z=-1 and beyond just vanish.  f(z) and \frac{1}{1-z} are the same in every way inside of -1<z<1.  The only difference is that \frac{1}{1-z} doesn’t abruptly stop, but instead continues to make sense over a bigger domain.  \frac{1}{1-z} is the analytic continuation of f(z)=\sum_{n=0}^\infty z^n to the region outside of -1<z<1.
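
If you'd like to watch this happen, here is a minimal Python sketch.  The function names and the number of terms are made up for illustration.  It compares partial sums of \sum_{n=0}^\infty z^n with \frac{1}{1-z}: inside -1<z<1 the two agree, while outside the partial sums run away and \frac{1}{1-z} carries on calmly.

```python
def geometric_partial_sum(z, terms):
    """Sum of z**n for n = 0 .. terms-1 (the series definition of f)."""
    return sum(z ** n for n in range(terms))


def closed_form(z):
    """The continuation 1/(1-z), which makes sense for every z except z = 1."""
    return 1 / (1 - z)


if __name__ == "__main__":
    for z in (0.5, -0.9, -2.0):
        # Two consecutive partial sums show whether the series has settled down.
        print(z,
              geometric_partial_sum(z, 500),
              geometric_partial_sum(z, 501),
              closed_form(z))
```

For z=-2 the two partial sums are enormous and have opposite signs, which is what "doesn't converge" looks like in practice, while the closed form quietly reports 1/3.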

Finding an analytic continuation for ζ(s) is a lot trickier, because there’s no cute way to write it without using an infinite summation (or product), but the basic idea is the same.  We’re going to do this in two steps: first turning ζ(s) into an alternating sum that converges for s>0 (except s=1), then turning that into a form that converges everywhere (except s=1).

For seemingly no reason, multiply ζ(s) by \left(1-2^{1-s}\right):

\begin{array}{ll}    &\left(1-2^{1-s}\right)\zeta(s)\\[2mm]  =&\left(1-2^{1-s}\right)\sum_{n=1}^\infty\frac{1}{n^s}\\[2mm]  =&\sum_{n=1}^\infty\frac{1}{n^s}-2^{1-s}\sum_{n=1}^\infty\frac{1}{n^s}\\[2mm]  =&\sum_{n=1}^\infty\frac{1}{n^s}-2\sum_{n=1}^\infty\frac{1}{(2n)^s}\\[2mm]  =&\left(\frac{1}{1^s}+\frac{1}{2^s}+\frac{1}{3^s}+\frac{1}{4^s}+\ldots\right)-2\left(\frac{1}{2^s}+\frac{1}{4^s}+\frac{1}{6^s}+\frac{1}{8^s}+\ldots\right)\\[2mm]  =&\frac{1}{1^s}-\frac{1}{2^s}+\frac{1}{3^s}-\frac{1}{4^s}+\ldots\\[2mm]  =&\sum_{n=1}^\infty\frac{(-1)^{n-1}}{n^s}\\[2mm]  =&\sum_{n=0}^\infty\frac{(-1)^{n}}{(n+1)^s}  \end{array}

So we’ve got a new version of the Zeta function, \zeta(s)=\frac{1}{1-2^{1-s}}\sum_{n=0}^\infty\frac{(-1)^{n}}{(n+1)^s}, that is an analytic continuation because this new sum converges in the same region the original form did (s>1), plus a little more (0<s≤1).  Notice that while the summation no longer blows up at s=1, \frac{1}{1-2^{1-s}} does.  Analytic continuation won’t get rid of poles, but it can express them differently.
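
As a sanity check on the alternating form, here is a rough Python sketch that just adds up a lot of terms.  The function name and the cutoff of 100,000 terms are arbitrary choices for illustration.  At s=2 it should land very near the known value ζ(2)=π²/6, and at s=1/2 (where the original sum is useless) it still produces a number, although the convergence there is slow.

```python
import math


def zeta_via_alternating_sum(s, terms=100_000):
    """Approximate zeta(s) for real s > 0, s != 1, by truncating the alternating sum."""
    total = 0.0
    for n in range(terms):
        total += (-1) ** n / (n + 1) ** s
    # Divide out the (1 - 2^(1-s)) factor introduced in the derivation above.
    return total / (1 - 2 ** (1 - s))


if __name__ == "__main__":
    print(zeta_via_alternating_sum(2), math.pi ** 2 / 6)
    print(zeta_via_alternating_sum(0.5))
```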

There’s a clever old trick for shoehorning a summation into converging: Euler summation.  Euler (who realizes everything) realized that \sum_{n=k}^\infty {n\choose k}\frac{y^{k+1}}{(1+y)^{n+1}}=1 for any y with |1+y|>1 (so that the sum converges).  This is not obvious.  Being equal to one means that you can pop this into the middle of anything.  If that thing happens to be another sum, it can be used to make that sum “more convergent” for some values of y.  Take any sum, \sum_{k=0}^\infty A_k, insert Euler’s sum, and swap the order of summation:

\begin{array}{rcl}  \sum_{k=0}^\infty A_k&=&\sum_{k=0}^\infty \left(\sum_{n=k}^\infty {n\choose k}\frac{y^{k+1}}{(1+y)^{n+1}}\right)A_k\\[2mm]  &=&\sum_{k=0}^\infty \sum_{n=k}^\infty {n\choose k}\frac{y^{k+1}}{(1+y)^{n+1}}A_k\\[2mm]  &=&\sum_{n=0}^\infty \sum_{k=0}^n {n\choose k}\frac{y^{k+1}}{(1+y)^{n+1}}A_k\\[2mm]  &=&\sum_{n=0}^\infty \frac{1}{(1+y)^{n+1}}\sum_{k=0}^n {n\choose k}y^{k+1}A_k\\[2mm]  \end{array}

If the original sum converges, then this will converge to the same thing, but it may also converge even when the original sum doesn’t.  That’s exactly what you’re looking for when you want to create an analytic continuation; it agrees with the original function, but continues to work over a wider domain.
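
Here is a small Python sketch of that last line of the derivation, truncated after a fixed number of outer terms.  The names `euler_sum`, `alternating_harmonic` and `grandi`, and the choice of 60 outer terms, are mine and purely illustrative.  It tries two inputs: 1-\frac{1}{2}+\frac{1}{3}-\ldots, which already converges (to ln 2), and the 1-1+1-1+… sum from earlier, which doesn't.

```python
import math


def euler_sum(a, y=1.0, outer_terms=60):
    """Truncated Euler-transformed version of sum_{k>=0} a(k)."""
    total = 0.0
    for n in range(outer_terms):
        inner = sum(math.comb(n, k) * y ** (k + 1) * a(k) for k in range(n + 1))
        total += inner / (1 + y) ** (n + 1)
    return total


def alternating_harmonic(k):
    """Terms of 1 - 1/2 + 1/3 - ..., which converges to ln 2."""
    return (-1) ** k / (k + 1)


def grandi(k):
    """Terms of 1 - 1 + 1 - 1 + ..., which never settles down."""
    return (-1) ** k


if __name__ == "__main__":
    print(euler_sum(alternating_harmonic), math.log(2))
    print(euler_sum(grandi))
```

With y=1 the transformed version of 1-1+1-1+… comes out to 1/2, because every inner sum beyond n=0 cancels.  That is the value this particular procedure assigns, not a claim that the original sum converges.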

This looks like a total mess, but it’s stunningly useful.  If we use Euler summation with y=1, we create a summation that analytically continues the Zeta function to the entire complex plane: a “globally convergent form”.  Rather than a definition that only works sometimes (but is easy to understand), we get a definition that works everywhere (but looks like a horror show).

\begin{array}{rcl}  \zeta(s)&=&\frac{1}{1-2^{1-s}}\sum_{k=0}^\infty\frac{(-1)^{k}}{(k+1)^s}\\[2mm]  &=&\frac{1}{1-2^{1-s}}\sum_{n=0}^\infty\frac{1}{(1+y)^{n+1}}\sum_{k=0}^n{n\choose k}y^{k+1}\frac{(-1)^{k}}{(k+1)^s}\\[2mm]  &=&\frac{1}{1-2^{1-s}}\sum_{n=0}^\infty\frac{1}{(1+1)^{n+1}}\sum_{k=0}^n{n\choose k}1^{k+1}\frac{(-1)^{k}}{(k+1)^s}\\[2mm]  &=&\frac{1}{1-2^{1-s}}\sum_{n=0}^\infty\frac{1}{2^{n+1}}\sum_{k=0}^n{n\choose k}\frac{(-1)^{k}}{(k+1)^s}\\[2mm]  \end{array}
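
The horror show is surprisingly easy to type in.  Below is a minimal floating-point sketch of the final line above, cut off after 80 outer terms.  The cutoff and the name `zeta_global` are assumptions made for illustration, and this is not meant as a serious numerical method.

```python
import math


def zeta_global(s, outer_terms=80):
    """Truncated globally convergent form of zeta(s), for s != 1."""
    total = 0.0
    for n in range(outer_terms):
        inner = sum(math.comb(n, k) * (-1) ** k / (k + 1) ** s
                    for k in range(n + 1))
        total += inner / 2 ** (n + 1)
    return total / (1 - 2 ** (1 - s))


if __name__ == "__main__":
    print(zeta_global(2), math.pi ** 2 / 6)  # region where the original sum works
    print(zeta_global(0))                    # outside that region
    print(zeta_global(-1))                   # the value this article is about
```

The same few lines cover s=2, where the original sum is fine, and s=-1, where it isn't.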

This is one of those great examples of the field of mathematics being too big for every discovery to be noticed.  This formulation of ζ(s) was discovered in the 1930s, forgotten for 60 years, and then found in an old book.

For most values of s, this globally convergent form isn’t particularly useful for us “calculate it by hand” folk, because it still has an infinite sum (and adding an infinite number of terms takes a while).  Very fortunately, there’s another cute trick we can use here.  When n>d, \sum _{k=0}^n{n \choose k}(-1)^{k}k^d=0.  This means that for negative integer values of s, that infinite sum suddenly becomes finite because all but a handful of terms are zero.
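
That vanishing trick is easy to spot-check with exact integer arithmetic.  This is a quick sketch (the function name is invented here), and a brute-force check over small n and d is of course not a proof.

```python
from math import comb


def alternating_binomial_power_sum(n, d):
    """Exact value of sum_{k=0}^{n} C(n,k) * (-1)^k * k^d."""
    # Python evaluates 0**0 as 1, which is the convention the d = 0 case needs.
    return sum(comb(n, k) * (-1) ** k * k ** d for k in range(n + 1))


if __name__ == "__main__":
    for d in range(5):
        # n runs past d; the claim is that every entry with n > d is zero.
        print(d, [alternating_binomial_power_sum(n, d) for n in range(d + 4)])
```

Since (k+1)^d is just a sum of powers of k no bigger than d, the same cancellation wipes out every term with n>d in the globally convergent form.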

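So finally, we plug s=-1 into ζ(s).  Here (k+1)^{-s}=k+1 is a polynomial of degree d=1 in k, so every term with n>1 vanishes and only n=0 and n=1 survive: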

\begin{array}{rcl}  \zeta(-1)&=&\frac{1}{1-2^{2}}\sum_{n=0}^\infty\frac{1}{2^{n+1}}\sum_{k=0}^n{n\choose k}(-1)^{k}(k+1)\\[2mm]  &=&-\frac{1}{3}\sum_{n=0}^1\frac{1}{2^{n+1}}\sum_{k=0}^n{n\choose k}(-1)^{k}(k+1)\\[2mm]  &=&-\frac{1}{3}\cdot\frac{1}{2^{0+1}}{0\choose 0}(-1)^{0}(0+1)-\frac{1}{3}\cdot\frac{1}{2^{1+1}}{1\choose 0}(-1)^{0}(0+1)-\frac{1}{3}\cdot\frac{1}{2^{1+1}}{1\choose 1}(-1)^{1}(1+1)\\[2mm]  &=&-\frac{1}{3}\cdot\frac{1}{2}\cdot1\cdot1\cdot1-\frac{1}{3}\cdot\frac{1}{4}\cdot1\cdot1\cdot1-\frac{1}{3}\cdot\frac{1}{4}\cdot1\cdot(-1)\cdot2\\[2mm]  &=&-\frac{1}{6}-\frac{1}{12}+\frac{1}{6}\\[2mm]  &=&-\frac{1}{12}  \end{array}
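
The same by-hand calculation can be handed to a computer using exact fractions.  This sketch (with a made-up function name) evaluates the globally convergent form at s=-d for a non-negative integer d, stopping the outer sum at n=d because the later terms are all zero.

```python
from fractions import Fraction
from math import comb


def zeta_negative_integer(d):
    """Exact zeta(-d) for an integer d >= 0, from the globally convergent form."""
    outer = Fraction(0)
    for n in range(d + 1):  # terms with n > d vanish, so the sum is finite
        inner = sum(comb(n, k) * (-1) ** k * (k + 1) ** d for k in range(n + 1))
        outer += Fraction(inner, 2 ** (n + 1))
    # Prefactor 1 / (1 - 2^(1-s)) with s = -d.
    return outer / (1 - 2 ** (1 + d))


if __name__ == "__main__":
    for d in range(4):
        print(f"zeta(-{d}) =", zeta_negative_integer(d))
```

For d=1 this reproduces the -\frac{1}{12} found above, with no rounding involved.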

Keen-eyed readers will note that this looks nothing like 1+2+3+… and indeed, it’s not.

(Update: 11/24/17)

A commenter pointed out that it’s a pain to find a proof for why Euler’s sum works.  Basically, this comes down to showing that 1=y^{n+1}\sum_{k=n}^\infty{k\choose n}\frac{1}{(1+y)^{k+1}}.  There are a couple ways to do that, but summation by parts is a good place to start:

\sum_{k=n}^Mf_kg_k=f_n\sum_{k=n}^Mg_k+\sum_{k=n}^{M-1}\left(f_{k+1}-f_k\right)\sum_{j=k+1}^M g_j

You can prove this by counting how often each term, f_rg_u, shows up on each side.  Knowing about geometric series and that {a\choose b-1}+{a\choose b}={a+1\choose b} is all we need to unravel this sum.
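
If counting terms isn't convincing, the summation by parts formula can at least be spot-checked.  The sketch below evaluates both sides for whatever sequences you hand it.  The example sequences f_k=k^2 and g_k=3k+1 are arbitrary picks of mine, not anything special.

```python
def parts_lhs(f, g, n, M):
    """Left side: sum_{k=n}^{M} f(k) * g(k)."""
    return sum(f(k) * g(k) for k in range(n, M + 1))


def parts_rhs(f, g, n, M):
    """Right side of the summation by parts formula."""
    def tail(start):
        # sum_{j=start}^{M} g(j)
        return sum(g(j) for j in range(start, M + 1))

    first = f(n) * tail(n)
    rest = sum((f(k + 1) - f(k)) * tail(k + 1) for k in range(n, M))
    return first + rest


if __name__ == "__main__":
    def f(k):
        return k ** 2

    def g(k):
        return 3 * k + 1

    print(parts_lhs(f, g, 2, 10), parts_rhs(f, g, 2, 10))
```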

\begin{array}{rl}  &\sum_{k=n}^\infty{k\choose n}\frac{1}{(1+y)^{k+1}}\\[2mm]  =&{n\choose n}\sum_{k=n}^\infty\frac{1}{(1+y)^{k+1}}+\sum_{k=n}^\infty\left({k+1\choose n}-{k\choose n}\right)\sum_{j=k+1}^\infty \frac{1}{(1+y)^{j+1}}\\[2mm]  =&\sum_{k=n}^\infty\frac{1}{(1+y)^{k+1}}+\sum_{k=n}^\infty{k\choose n-1}\sum_{j=k+1}^\infty \frac{1}{(1+y)^{j+1}}\\[2mm]  =&\frac{\frac{1}{(1+y)^{n+1}}}{1-\frac{1}{(1+y)}}+\sum_{k=n}^\infty{k\choose n-1}\frac{\frac{1}{(1+y)^{k+2}}}{1-\frac{1}{(1+y)}}\\[2mm]  =&\frac{1}{y(1+y)^n}+\sum_{k=n}^\infty{k\choose n-1}\frac{1}{y(1+y)^{k+1}}\\[2mm]  =&\sum_{k=n-1}^\infty{k\choose n-1}\frac{1}{y(1+y)^{k+1}}\\[2mm]  =&\cdots\\[2mm]  =&\sum_{k=0}^\infty{k\choose 0}\frac{1}{y^n(1+y)^{k+1}}\\[2mm]  =&\frac{1}{y^n}\sum_{k=0}^\infty\frac{1}{(1+y)^{k+1}}\\[2mm]  =&\frac{1}{y^n}\frac{\frac{1}{(1+y)}}{1-\frac{1}{(1+y)}}\\[2mm]  =&\frac{1}{y^{n+1}}  \end{array}

That “…” step means “repeat n times”.  It’s worth mentioning that this only works for |1+y|>1 (otherwise the infinite sums diverge).  Euler summations change how a summation is written, and can accelerate convergence (which is really useful), but if the original sum converges, then the new sum will converge to the same thing.
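
For anyone who would rather see numbers than a proof, here is a small sketch that adds up the first few hundred terms of Euler's sum, \sum_{n=k}^\infty {n\choose k}\frac{y^{k+1}}{(1+y)^{n+1}}, for a few values of k and y with |1+y|>1.  The function name and the 400-term cutoff are arbitrary choices for illustration.

```python
from math import comb


def euler_identity_partial_sum(k, y, terms=400):
    """Partial sum of sum_{n>=k} C(n,k) * y^(k+1) / (1+y)^(n+1)."""
    return sum(comb(n, k) * y ** (k + 1) / (1 + y) ** (n + 1)
               for n in range(k, k + terms))


if __name__ == "__main__":
    for k in (0, 3, 5):
        for y in (0.5, 1, 3):
            # Each of these should come out very close to 1.
            print(k, y, euler_identity_partial_sum(k, y))
```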
