Q: Why does E=MC^2?

Physicist: It’s a little surprising that this question didn’t come up earlier.  Unfortunately, there’s no intuitive way to understand why “the energy of the rest mass of an object is equal to the rest mass times the speed of light squared” (E=MC^2).  A complete derivation/proof includes a fair chunk of math (in the second half of this post), a decent understanding of relativity, and (most important) experimental verification.

So first, here’s an old physics trick (used by old physicists) to guess answers without doing any thinking or work, and perhaps while drinking.  Take everything that could have anything to do with the question (any speeds, densities, sizes, etc.) and put them together so that the units line up correctly.  There’s an excellent example in this old post about poo.

Here’s the idea: energy is distance times force (E=DF), force is mass times acceleration (so E=DMA), and acceleration is velocity over time, which is the same as distance over time squared (so E = DMD/T^2 = MD^2/T^2).

So energy has units of mass times velocity squared.  If there were some kind of universal relationship between mass and energy, then it should depend on universal constants.  Quick!  Name a universal speed!  E=MC^2

Totally done.

This is a long long way from being a solid argument.  You could just as easily have E=5MC^2 or E=πMC^2 or something like that, and still have the units correct.  It may even be possible to mix together a bunch of other universal constants until you get velocity squared, or there may just be a new, previously unknown physical constant involved.  This trick is just something used to get a thumbnail sketch of what “looks” like a correct answer, and in this particular case it’s exactly right.
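If you'd like to see the unit-juggling done mechanically, here is a minimal sketch. It tracks each quantity as a tuple of exponents of (mass, length, time); the helper names `mul` and `div` are made up for this illustration and aren't from any library. All it confirms is that "distance times force" and "mass times speed squared" carry the same units, which, as noted above, says nothing about the numerical factor out front.

```python
# Dimensions are tracked as exponents of (mass, length, time).
# These names are illustrative assumptions, not part of the original post.
MASS = (1, 0, 0)
LENGTH = (0, 1, 0)
TIME = (0, 0, 1)

def mul(a, b):
    """Multiply two quantities: add their dimension exponents."""
    return tuple(x + y for x, y in zip(a, b))

def div(a, b):
    """Divide two quantities: subtract their dimension exponents."""
    return tuple(x - y for x, y in zip(a, b))

velocity = div(LENGTH, TIME)         # D/T
acceleration = div(velocity, TIME)   # D/T^2
force = mul(MASS, acceleration)      # M*D/T^2
energy = mul(LENGTH, force)          # D*F = M*D^2/T^2

mass_times_speed_squared = mul(MASS, mul(velocity, velocity))

print(energy)                               # (1, 2, -2)
print(energy == mass_times_speed_squared)   # True
```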

For a more formal derivation, you’d have to stir the answer gravy:


Answer gravy: This derivation is going to seem pretty indirect, and it is.  But that’s basically because E = mc^2 is an accidental result from a more all-encompassing theory.  So bear with me…

The length of regular vectors (which could be distance, momentum, whatever) remains unchanged by rotations.  If you take a stick and just turn it, then of course it stays the same length.  The same holds true for speed: 60 mph is 60 mph no matter what direction you’re moving in.

Although the vector, v, changes when you rotate your point of view, its length, ||v||, always stays the same.

If you have a vector (x,y,z), then its length is denoted by “||(x,y,z)||”.  According to the Pythagorean theorem, ||(x,y,z)||^2 = x^2+y^2+z^2.
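A quick numerical sanity check of "rotations don't change length" might look like the sketch below. The vector (3, 4, 12) and the 37 degree angle are arbitrary choices for illustration, and `rotate_z` (a rotation about the z axis) is just one convenient kind of rotation.

```python
import math

def length(v):
    """Pythagorean length of a 3D vector."""
    return math.sqrt(sum(component * component for component in v))

def rotate_z(v, angle):
    """Rotate the vector v about the z axis by `angle` radians."""
    x, y, z = v
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return (cos_a * x - sin_a * y, sin_a * x + cos_a * y, z)

v = (3.0, 4.0, 12.0)                        # arbitrary example vector
rotated = rotate_z(v, math.radians(37.0))   # arbitrary example angle

# The components change, but the two lengths agree up to rounding.
print(rotated)
print(length(v), length(rotated))
```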

When relativity came along, time suddenly became an important fourth component: (x,y,z,t).  And the true “conserved distance” was revealed to be: ||(x,y,z,t)||^2 = x^2+y^2+z^2-c^2t^2.  Notice that when you ignore time (t=0), then this reduces to the usual definition.

This fancy new “spacetime interval” conserves the length of things under ordinary rotations (which just move around the x, y, z part), but also conserves length under “rotations involving time”.  In ordinary physics you can rotate how you see things by turning your head.  In relativity you rotate how you see things in spacetime, by running past them (changing your speed with respect to what you’re looking at).  “Spacetime rotations” (changing your own speed) are often called “Lorentz boosts”, by people who don’t feel like being clearly understood.
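Here is a rough sketch of a "spacetime rotation" in action. It assumes the standard textbook form of a Lorentz boost along the x direction (that formula isn't written out in this post), and the event coordinates and the speed 0.6c are arbitrary. The point is only that the individual coordinates change while x^2+y^2+z^2-c^2t^2 doesn't.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def interval_squared(x, y, z, t):
    """Spacetime interval: x^2 + y^2 + z^2 - c^2 t^2."""
    return x * x + y * y + z * z - (C * t) ** 2

def boost_x(x, y, z, t, v):
    """Standard Lorentz boost along x with speed v (assumed textbook form)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return (gamma * (x - v * t), y, z, gamma * (t - v * x / C ** 2))

event = (4.0e8, 1.0e8, 2.0e8, 1.0)   # arbitrary (x, y, z, t) in metres and seconds
boosted = boost_x(*event, v=0.6 * C)

# Different coordinates, same interval (up to floating-point rounding).
print(boosted)
print(interval_squared(*event), interval_squared(*boosted))
```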

You can prove that the spacetime interval is invariant based only on the speed of light being the same to everyone.  It’s a bit of a tangent, so I won’t include it here. (Update 8/10/13: but it is included here!)

Some difficulties show up in relativity when you try to talk about velocity.  If your position vector is (x,y,z), then your velocity vector is \vec{v} = (\frac{\partial x}{\partial t},\frac{\partial y}{\partial t},\frac{\partial z}{\partial t}) (velocity is the time derivative of position).

But since relativity messes with distance and time, it’s important to come up with a better definition of time.  The answer Einstein came up with was \tau, which is time as measured by clocks on the moving object in question. \tau is often called “proper time”.  So the better definition of velocity is (\frac{\partial x}{\partial \tau},\frac{\partial y}{\partial \tau},\frac{\partial z}{\partial \tau}, \frac{\partial t}{\partial \tau}).  This way you can talk about how fast an object is moving through time, as well as how fast it’s moving through space.

By the way, as measured by everyone else on the planet, you’re currently moving through their time (t) at almost exactly 1 second per second (\frac{\partial t}{\partial \tau}=1).

One of the most important, simple things that you can know about relativity is the gamma function (more commonly called the “Lorentz factor”): \gamma = \frac{1}{\sqrt{1-\left(\frac{||v||}{c}\right)^2}}.  You can derive the gamma function by thinking about light clocks, or a couple other things, but I don’t want to side track on that just now.

Among other things, \gamma = \frac{\partial t}{\partial \tau}.  That is, \gamma is the ratio of how fast “outside time” passes from the point of view of the object’s “on-board time”.  So now, using the chain rule from calculus:
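To get a feel for how \gamma behaves, here is a small sketch that just evaluates the formula above at a few speeds. The function name `lorentz_gamma` and the list of speeds are choices made for this example.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def lorentz_gamma(speed):
    """gamma = 1 / sqrt(1 - (v/c)^2), valid for speeds below c."""
    return 1.0 / math.sqrt(1.0 - (speed / C) ** 2)

# gamma stays very close to 1 until the speed is a decent fraction of c.
for fraction in (0.0, 0.1, 0.6, 0.9, 0.99):
    gamma = lorentz_gamma(fraction * C)
    print(f"v = {fraction:.2f} c  ->  gamma = {gamma:.4f}")
```

At v = 0.6c, for example, the formula gives \gamma = 1/\sqrt{1-0.36} = 1.25, so 1.25 seconds of "outside time" go by for each second of "on-board time".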

(\frac{\partial x}{\partial \tau},\frac{\partial y}{\partial \tau},\frac{\partial z}{\partial \tau}, \frac{\partial t}{\partial \tau}) = (\frac{\partial t}{\partial \tau}\frac{\partial x}{\partial t},\frac{\partial t}{\partial \tau}\frac{\partial y}{\partial t},\frac{\partial t}{\partial \tau}\frac{\partial z}{\partial t}, \frac{\partial t}{\partial \tau}) = (\gamma\frac{\partial x}{\partial t},\gamma\frac{\partial y}{\partial t},\gamma\frac{\partial z}{\partial t}, \gamma).

For succinctness (and tradition) I’ll bundle the first three terms together:

(\gamma\frac{\partial x}{\partial t},\gamma\frac{\partial y}{\partial t},\gamma\frac{\partial z}{\partial t}, \gamma) = (\gamma \vec{v}, \gamma)

Now check this out!  Remember that the spacetime interval for a spacetime vector with spatial component \vec{a}, and temporal component b, is ||(\vec{a},b)||^2 = a^2-c^2b^2.

||(\gamma\vec{v}, \gamma)||^2 = \gamma^2v^2-c^2\gamma^2 = \gamma^2(v^2-c^2) = \frac{v^2-c^2}{1-\frac{v^2}{c^2}} = c^2\frac{v^2-c^2}{c^2-v^2} = -c^2

(This used a slight breach of notation: “\vec{v}” is a velocity vector and “v” is the length of the velocity, or “speed”)

The amazing thing about “spacetime speed” is that, no matter what v is, ||(\gamma \vec{v}, \gamma)||^2 = -c^2.
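If you don't trust the algebra, a numerical spot check is easy enough. This sketch builds (\gamma \vec{v}, \gamma) for a few arbitrarily chosen velocities and divides its interval by c^2; the helper names are invented for the example.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def four_velocity(vx, vy, vz):
    """Return (gamma*vx, gamma*vy, gamma*vz, gamma) for an ordinary velocity."""
    speed_squared = vx * vx + vy * vy + vz * vz
    gamma = 1.0 / math.sqrt(1.0 - speed_squared / C ** 2)
    return (gamma * vx, gamma * vy, gamma * vz, gamma)

def interval_squared(ax, ay, az, b):
    """||(a, b)||^2 = a^2 - c^2 b^2 for spatial part a and temporal part b."""
    return ax * ax + ay * ay + az * az - (C * b) ** 2

# Arbitrary test velocities, given as fractions of c along each axis.
for fx, fy, fz in ((0.0, 0.0, 0.0), (0.3, 0.0, 0.0), (0.5, 0.5, 0.5)):
    u = four_velocity(fx * C, fy * C, fz * C)
    # Each ratio should come out very close to -1, whatever the velocity.
    print((fx, fy, fz), interval_squared(*u) / C ** 2)
```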

(Quick aside; it may concern you that a squared quantity can be negative.  Don’t worry about it.)

Now, Einstein (having a passing familiarity with physics) knew that momentum (\vec{P}) is conserved, and that the magnitude of momentum is conserved by rotation (in other words, the direction of the momentum is messed up by rotation, but the amount of momentum is conserved; see the top picture of this post).  He also knew that to get from velocity to momentum, you just multiply by mass (momentum is \vec{P}=m\vec{v}).  Easy nuf.

m^2||(\gamma \vec{v}, \gamma)||^2 = -m^2c^2 \Rightarrow ||(\gamma m\vec{v}, \gamma m)||^2 = -m^2c^2

So if ordinary momentum is given by the first term (the “spatial term”): \vec{P}= \gamma m \vec{v}, then what’s that other stuff (\gamma m)?  Look at the conserved quantity:

\begin{array}{ll} ||(\gamma m\vec{v}, \gamma m)||^2 = -m^2c^2 \\ \Rightarrow \left(\gamma mv\right)^2 - \left(\gamma mc\right)^2 = -m^2c^2 \\\Rightarrow P^2 - \left(\gamma mc\right)^2 = -m^2c^2 \end{array}

What’s interesting here is that m^2c^2 never changes, and P^2 only changes if you start moving. For example, if you were to run as fast as a bullet (in the direction of the bullet), you wouldn’t worry about it hurting you, because from your perspective it has no momentum.

So whatever that last term is (\gamma mc) it’s also conserved (as long as you don’t change your own speed).  Which is interesting.  So the ‘Stein studied its behavior very closely.  If you take its Taylor expansion, which turns functions into polynomials, you get this:

\begin{array}{ll} m c \gamma\\= \frac{mc}{\sqrt{1-\left(\frac{v}{c}\right)^2}}\\= mc\left[1+\frac{1}{2}\frac{v^2}{c^2}+\frac{3}{8}\frac{v^4}{c^4}+\frac{5}{16}\frac{v^6}{c^6} + \cdots \right]\\=mc+\frac{1}{2}m\frac{v^2}{c}+\frac{3}{8}m\frac{v^4}{c^3}+\frac{5}{16}m\frac{v^6}{c^5} + \cdots \end{array}

The second term there should look kinda familiar (if you’ve taken intro physics); it’s the classical kinetic energy of an object (\frac{1}{2}mv^2) divided by c.  Could this whole thing (thought Einstein) be the energy of the object in question, divided by c?  Energy is definitely conserved.  And, since c is a constant, energy divided by c is also conserved.

So, multiplying by c: E = \gamma m c^2 = mc^2+\frac{1}{2}mv^2+\frac{3}{8}m\frac{v^4}{c^2}+\frac{5}{16}m\frac{v^6}{c^4} + \cdots.
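To see how good the polynomial is, here is a sketch comparing the exact E = \gamma m c^2 with the four terms written out above. The mass of 1 kg and the test speeds are arbitrary, and the function names are made up for the example.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def exact_energy(mass, speed):
    """E = gamma * m * c^2."""
    return mass * C ** 2 / math.sqrt(1.0 - (speed / C) ** 2)

def series_energy(mass, speed):
    """First four terms: mc^2 + (1/2)mv^2 + (3/8)mv^4/c^2 + (5/16)mv^6/c^4."""
    beta_squared = (speed / C) ** 2
    return mass * C ** 2 * (
        1.0
        + beta_squared / 2
        + 3 * beta_squared ** 2 / 8
        + 5 * beta_squared ** 3 / 16
    )

mass = 1.0  # kg, arbitrary
for fraction in (0.01, 0.1, 0.5):
    speed = fraction * C
    exact = exact_energy(mass, speed)
    approx = series_energy(mass, speed)
    # The relative error grows as the speed approaches c.
    print(f"v = {fraction} c: relative error = {(exact - approx) / exact:.3e}")
```

The truncated series falls behind the exact value as v gets close to c, which is why the full expression with \gamma is the one to use at high speeds.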

You can also plug this into P^2 - \left(\gamma mc\right)^2 = -m^2c^2 and, Alaca-math!  You get Einstein’s (sorta) famous energy/momentum relation: P^2c^2 + m^2c^4 = E^2.
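The plugging-in goes like this: multiply P^2 - (\gamma mc)^2 = -m^2c^2 through by c^2, swap in E = \gamma mc^2, and you get P^2c^2 - E^2 = -m^2c^4. A minimal numerical check, with an arbitrary mass (2 kg) and speed (0.8c) and an invented helper name, could be:

```python
import math

C = 299_792_458.0  # speed of light in m/s

def energy_and_momentum(mass, speed):
    """Relativistic energy E = gamma*m*c^2 and momentum P = gamma*m*v."""
    gamma = 1.0 / math.sqrt(1.0 - (speed / C) ** 2)
    return gamma * mass * C ** 2, gamma * mass * speed

mass = 2.0         # kg, arbitrary
speed = 0.8 * C    # arbitrary
E, P = energy_and_momentum(mass, speed)

left_side = (P * C) ** 2 + (mass * C ** 2) ** 2   # P^2 c^2 + m^2 c^4
right_side = E ** 2

# The ratio should be 1 up to floating-point rounding.
print(left_side / right_side)
```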

Notice that the energy and momentum here are not the classical (Newtonian) energy and momentum: E=\frac{1}{2}mv^2 and \vec{P} = m\vec{v}.  Instead they are the relativistic energy and momentum: E=\gamma mc^2 and \vec{P} = \gamma m\vec{v}.  This only has noticeable effects at extremely high speeds, and at lower speeds they look like: E \approx mc^2 + \frac{1}{2}mv^2 and \vec{P} \approx m\vec{v}, which is what you’d hope for.  New theories should always include the old theories as a special case (or disprove them).

Now, holy crap.  If you allow the speed of the object to be zero (v=0), you find that everything other than the first term in that long equation for E vanishes, and you’re left with (drumroll): E=mc^2!  So even objects that aren’t moving are still holding energy.  A lot of energy.  One kilogram of matter holds the energy equivalent of 21.4 Megatons of TNT, or about 1500 Hiroshima bombs.
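The arithmetic behind that comparison can be sketched as follows. Two numbers here are assumptions that don't come from this post: the conventional conversion of 4.184e15 joules per megaton of TNT, and a Hiroshima yield of roughly 15 kilotons. The exact digits shift a little depending on which values you pick, so treat the output as a ballpark.

```python
C = 299_792_458.0                   # speed of light in m/s
JOULES_PER_MEGATON_TNT = 4.184e15   # assumed conventional conversion factor
HIROSHIMA_YIELD_MEGATONS = 0.015    # assumed yield, roughly 15 kilotons

def rest_energy_joules(mass_kg):
    """E = m c^2 for an object at rest."""
    return mass_kg * C ** 2

megatons = rest_energy_joules(1.0) / JOULES_PER_MEGATON_TNT
bombs = megatons / HIROSHIMA_YIELD_MEGATONS

print(f"1 kg at rest: {rest_energy_joules(1.0):.3e} J")
print(f"about {megatons:.1f} megatons of TNT")
print(f"about {bombs:.0f} Hiroshima-sized bombs")
```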

The first question that should come to mind when you’ve got a new theory that’s, honestly, pretty insane is “why didn’t anyone notice this before?”  Why is it that the only part of the energy that anyone ever noticed was \frac{1}{2}mv^2?  Well, the higher terms are a little hard to see.  Up until Einstein the fastest things around were bullets moving at about the speed of sound.  If you were to use the “\frac{1}{2}mv^2” equation for kinetic energy at that speed, you would be right to within about one part in 1,000,000,000,000 (the next term is smaller than \frac{1}{2}mv^2 by a factor of \frac{3}{4}\frac{v^2}{c^2}).  All of the higher terms are divided by some power of c (a big number), so until the speed gets ridiculously high they just don’t matter.

But what about the mc^2?  Well, to be detected, energy has to do something.  If somebody flings a battery at you, it really doesn’t matter if the battery is charged up or not.

Side note: This derivation isn’t a “proof” per se, just a really solid argument: “there’s this thing that’s conserved according to relativity, and it looks exactly like energy”.  However, you can’t, using math alone, prove anything about the outside universe.  The “proof” came when E=mc^2 was tested experimentally (with particle accelerators ‘n stuff).

But Einstein’s techniques and equations have been verified as many times as they’ve been tested.  One of the most profound conclusions is that, literally, “energy is the time component of momentum”.  Or “E/c” is at least.  So conservation of energy, momentum, and matter are all actually the same conservation law!


54 Responses to Q: Why does E=MC^2?

  1. J-Rad says:

    In your response above you stated as follows:

    The exact expression is floating around somewhere in the middle of the post. The energy is E = \frac{mc^2}{\sqrt{1-\left(\frac{v}{c}\right)^2}} = mc^2 + \frac{1}{2}mv^2 + \frac{3}{8}m\frac{v^4}{c^2}+\cdots, with more and more terms that are less and less important.
    So, the approximation E=mc^2+\frac{1}{2}mv^2 is pretty good, but isn’t exact.

    I’m an engineer and noticed that this looks extremely similar to Bernoulli’s Principle for fluid dynamics (http://en.wikipedia.org/wiki/Bernoulli's_principle).

    Assuming I consider mc^2 to be a constant potential energy within the object, or the gz term in Bernoulli; the (mv^2)/2 being the kinetic energy imparted by motion. Would I be correct to consider the remainder of the equation to be similar to the energy due to pressure? If so, what sort of pressure would this be referring to? Does it relate to the pressure from mass increasing infinitely as it approaches the speed of light?

    The more I think about this, the more confused I’m becoming.

  2. Pierre Beaudet says:

    There must exist a relationship between this formula and the kinetic energy of this mass in movement, described as 1/2 mv.v. So if a mass is moving at the speed of light, why is its kinetic energy only half of the rest mass energy calculated by the Einstein formula?

  3. mohammad riaz khan says:

    Please try to avoid confusion,& then ur thinking may come true & fruitful.u may contact me if u r interested to avoid confusion.As u mentioned u r an engineer so i m (B.Sc. electrical engg.70 uet lhr.easier contact is 03335337077 Islamabad.

  4. PacRim Jim says:

    You, sir, are a witty writer.
