# Q: Are beautiful, elegant or simple equations more likely to be true?

Mathematician:

It is not uncommon to hear physicists or mathematicians talk about the beauty, simplicity or elegance of equations or theorems, and even claim that they are sometimes led to a correct formula (or away from an incorrect one) by considering what is simple or elegant. Consider, for example, the words of the Nobel prize-winning physicist Murray Gell-Mann:

“Three or four of us in 1957 put forward a partially complete theory of the weak [nuclear] force, in disagreement with the results of seven experiments. It was beautiful and so we dared to publish it, believing that all those experiments must be wrong. In fact, they were all wrong.”

and Albert Einstein’s remark:

“I have deep faith that the principle of the universe will be beautiful and simple.”

Could there be something to these remarkable claims? Is beauty in physics evidence for some kind of “intelligent” universe, or are there more mundane explanations? Is elegance in mathematics evidence for an underlying structure to reality? Or can this be explained away by psychological or practical considerations?

To begin answering these questions, an important thing to notice about the aesthetics of equations is that what appears to be simple or elegant may sometimes only be so because of the way that symbols are defined. For example, consider the remarkable and rather minimalist “heat equation”

$\Delta f = f '$

which, when solved for the function f with a given condition on its boundary, will describe how heat would actually flow over time on a specified surface. Is it not astounding that we can describe such a powerful physical law with just 5 symbols?

A deeper look at this equation shows us that the apparent simplicity here is in part an illusion. First of all, $\Delta$, which is known in this context as the Laplace operator, can be thought of as simply a shorthand notation. If we replace $\Delta f$ and $f'$ with their respective definitions, we are left with the markedly less simple looking equation:

$\Large \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} = \frac{\partial f}{\partial t}$

Derivative operations (which are taken a total of seven times in the above equation) are not themselves trivial operations, and are typically defined via a limiting procedure. If we are crazy enough to replace all the derivative operations with their definitions, we are left with an equation which is just plain long and ugly, even after performing some simplification:

$\lim_{h \to 0} \frac{1}{h^2} ( 3 f(x,y,z,t) + f(x+2h,y,z,t) - 2 f(x+h,y,z,t) + f(x,y+2h,z,t) - 2 f(x,y+h,z,t) + f(x,y,z+2h,t) - 2 f(x,y,z+h,t) ) = \lim_{h \to 0} \frac{1}{h} ( f(x,y,z,t+h) - f(x,y,z,t) )$
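To see that the long, ugly version really does say the same thing as the short one, here is a minimal numerical sketch. It plugs a small but finite h into both sides of the expanded equation instead of taking the limit. The test function `heat_solution` (a decaying product of sines) is my own choice for illustration, picked because it is easy to check by hand that it satisfies the heat equation; the helper names and the sample point are likewise just assumptions of this sketch.

```python
import math

def heat_solution(x, y, z, t):
    # Illustrative solution: its Laplacian is -3*f and its time derivative is -3*f.
    return math.exp(-3.0 * t) * math.sin(x) * math.sin(y) * math.sin(z)

def laplacian_forward(f, x, y, z, t, h):
    # Left side of the expanded equation, with a finite h instead of a limit.
    return (3.0 * f(x, y, z, t)
            + f(x + 2 * h, y, z, t) - 2.0 * f(x + h, y, z, t)
            + f(x, y + 2 * h, z, t) - 2.0 * f(x, y + h, z, t)
            + f(x, y, z + 2 * h, t) - 2.0 * f(x, y, z + h, t)) / h ** 2

def time_derivative_forward(f, x, y, z, t, h):
    # Right side of the expanded equation, with a finite h instead of a limit.
    return (f(x, y, z, t + h) - f(x, y, z, t)) / h

def gap(h, point=(0.7, 1.1, 0.4, 0.25)):
    # How far apart the two sides are at an arbitrary sample point.
    x, y, z, t = point
    lhs = laplacian_forward(heat_solution, x, y, z, t, h)
    rhs = time_derivative_forward(heat_solution, x, y, z, t, h)
    return abs(lhs - rhs)

for h in (1e-1, 1e-2, 1e-3):
    print(f"h = {h:g}: the two sides differ by {gap(h):.2e}")
```

The difference between the two sides should shrink roughly in proportion to h, which is what you would expect if they agree in the limit.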

The point to realize here is that mathematicians and physicists make very careful choices when selecting their notation to vastly compress very complicated ideas. You can make anything look simple by giving it a unique symbol, and you can make anything look complicated by recursively expanding the definitions it relies on. Even references to the number 1 look extremely complex if you replace them using a formula like:

$1 = \sum_{k=0}^{\infty} (\frac{\pi}{2})^{2k+3-2} \frac{(2-3)^k }{(2k+3-2)!}$

Doing so would not change what’s true, but it sure would confuse a lot of people and make formulas much harder to work with.
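If you don't believe that the monstrosity above is really just the number 1, here is a quick sketch that adds up the first few terms. The function name `disguised_one` and the default number of terms are mine. The trick is that $2k+3-2$ is just $2k+1$ and $(2-3)^k$ is just $(-1)^k$, so the sum is the Taylor series for $\sin(\frac{\pi}{2})$.

```python
import math

def disguised_one(terms=20):
    # Partial sum of the series, written exactly as it appears in the formula.
    total = 0.0
    for k in range(terms):
        n = 2 * k + 3 - 2              # a roundabout way of writing 2k + 1
        sign = (2 - 3) ** k            # a roundabout way of writing (-1)^k
        total += (math.pi / 2) ** n * sign / math.factorial(n)
    return total

for terms in (1, 2, 4, 8, 16):
    print(f"{terms:2d} terms: {disguised_one(terms)!r}")
```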

To make their notations as useful as possible, physicists and mathematicians typically define symbols in such a way as to make important formulas easy to write down and manipulate. The reason that the Laplace operator $\Delta$ gets its own symbol is because it’s so damn important. So important theorems and known physical laws may have some tendency to seem simpler than they are when the equations are written out, because the choice of symbols was made, in part, to make them easy to write out.

All of this being said, notation is not the end of the story on whether elegance or simplicity relates to truth in math and physics. Another important point to consider is that in many cases a single physical law can cause a multitude of different effects which may not, at first, seem to be related. To give some classic examples, before Newton’s era it was not at all obvious that the force that causes us to fall to the ground when we jump is the same force that keeps planets in orbit in our solar system. Likewise, before the 1800s it was not known that electric fields, magnetic fields and light are manifestations of a single phenomenon now known as electromagnetism. Similarly, before the era of Einstein it was not understood that conservation of energy and conservation of momentum could be thought of as effectively being part of a single conservation law.

There are some cases in physics where simpler and more elegant theories have won out over more complex theories because they correctly identify seemingly unrelated phenomena as having a single cause. Theories which treat inherently connected ideas as being wholly different are destined to be replaced since their lack of unification creates redundancy and therefore unnecessary complexity in the theory. This is one important reason why ugly, complicated theories can often be outdone by what seem to be more beautiful ones. We find it more beautiful to have one explanation for two results than to have two distinct explanations, and if the results really are just caused by one phenomenon, the single explanation will typically be easier to express and work with mathematically than the two separate ones.

Another, related reason why we might expect simplicity to win out over complexity is a rule of thumb known as Occam’s Razor. This idea, in its typical modern form, states that given multiple possible explanations for a phenomenon that are otherwise equally plausible, we should prefer the one that is the simplest.

A potential example of Occam’s Razor in practice relates to the Ptolemaic explanation of the motion of the planets, which was the accepted theory in some places for many hundreds of years. The basic idea of this theory was that planetary motion consists of “epicycles” around the fixed planet earth. This means that planets were thought to make circular orbits around earth, but that during these circular orbits the planets orbited in smaller circles along the orbits, and along those smaller circular orbits they orbited in still smaller circles, etc. This model was intrinsically very complex because by adjusting the epicycles so that there were a sufficient number of circular orbits within circular orbits at appropriate speeds, one could have described pretty much any shape of orbit, real or imagined. In other words, the model had a large number of free variables which gave it enormous flexibility and therefore complexity. Copernicus eventually laid waste to the Ptolemaic model by replacing it with a far simpler model with far fewer free parameters, which he accomplished merely by shifting the center of the circular planetary orbits to be the sun rather than the earth. However, the basic form of his new theory still did not agree perfectly with observation, and so required some ad hoc refinements that introduced extra complexity. This further complexity was eventually removed by Kepler, who refined the model yet again by allowing for elliptical rather than circular orbits, which is now known to be an excellent explanation for the orbits that are observed.

The key difference in these explanations for orbits is that the theory of epicycles is complex enough to explain almost any conceivable orbit you could ever think of, whereas Kepler’s idea of elliptical orbits with the sun at one focus of the ellipses was just complex enough to explain what was actually observed, without being complex enough to explain substantively different orbits had we observed them instead. In other words, Kepler’s theory is as complicated as it needs to be to explain reality.
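To get a feel for just how flexible circles upon circles are, here is a rough sketch that uses them to trace out a square, a shape no planet has ever followed. It relies on the fact that a sum of circles turning at whole-number multiples of a base speed is a complex Fourier series. This is not Ptolemy's actual construction: the square "orbit", the function names, and the sample counts are all assumptions made for illustration.

```python
import cmath
import math

def square_path(t):
    """Point (as a complex number) a fraction t of the way around a square."""
    corners = [1 - 1j, 1 + 1j, -1 + 1j, -1 - 1j]
    s = (t % 1.0) * 4.0
    side = int(s) % 4
    a, b = corners[side], corners[(side + 1) % 4]
    return a + (b - a) * (s - int(s))

def epicycle_coefficients(samples, max_freq):
    """Size and starting angle of the circle that turns k times per period."""
    n = len(samples)
    return {
        k: sum(z * cmath.exp(-2j * math.pi * k * i / n)
               for i, z in enumerate(samples)) / n
        for k in range(-max_freq, max_freq + 1)
    }

def epicycle_position(coeffs, t):
    """Add up all the rotating circles at time t (one period is 0 <= t < 1)."""
    return sum(c * cmath.exp(2j * math.pi * k * t) for k, c in coeffs.items())

def max_error(max_freq, n_samples=400):
    """Worst distance between the square and its epicycle imitation."""
    times = [i / n_samples for i in range(n_samples)]
    samples = [square_path(t) for t in times]
    coeffs = epicycle_coefficients(samples, max_freq)
    return max(abs(epicycle_position(coeffs, t) - z)
               for t, z in zip(times, samples))

for max_freq in (1, 5, 10, 40):
    print(f"{2 * max_freq + 1:3d} circles: worst miss = {max_error(max_freq):.4f}")
```

Each extra circle is another pair of free parameters (a size and a starting angle), and the worst miss shrinks as more circles are allowed. That is the sense in which the model can be tuned to fit almost anything.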

There does seem to be something to this Occam’s Razor business. It certainly seems to be a bad idea to add extra assumptions to a theory if those new assumptions don’t improve explanatory power, and it also seems like a bad idea to use a theory that’s so complex (or has so many variables that can be tweaked) that it can explain pretty much any experimental result you might get. There is even some neat mathematical theory which shows that taking an Occam’s Razor-like approach is a good idea in certain contexts (see, for instance, the work on Solomonoff Induction and the Ockham Efficiency Theorem). But we’re a long way off from being able to formally prove that we should use Occam’s Razor as a general rule of thumb. In fact, we don’t even know what the right notion of “simpler” is to use in real world problems when we claim that “simpler explanations are more likely to be true.”

There are a few more points about the relationship between beauty and truth in physics and math that I feel are worth mentioning. To begin with, as physicist Murray Gell-Mann (quoted above) mentions in his TED talk on beauty and truth in physics, symmetry plays a key role in simplicity. For example, since all the known laws of physics treat the three dimensions of space equally, we can often greatly simplify equations by writing expressions such as

$\nabla f$

(the gradient of f, which is the vector of the derivatives of f with respect to each of its variables) rather than having to write an equivalent but much more cumbersome set of equations where we treat each dimension of space separately, as in:

$\frac{\partial f}{\partial x}$

$\frac{\partial f}{\partial y}$

$\frac{\partial f}{\partial z}$

The point here is that symmetry makes it easier to simplify equations. Of course, this argument goes beyond just the symmetry of the three dimensions of space, and applies also to symmetry in time, rotation, etc.

Another idea that should be mentioned is that typically mathematical expressions have a number of equivalent forms. For example, we could define the exponential function $e^x$ using any of the following nearly interchangeable definitions:

$f(x) = \lim_{h \to \infty} (1 + \frac{x}{h})^h$

$f(x) = f'(x), f(0) = 1$

$f(x) = \sum_{k=0}^{\infty} \frac{x^k}{k!}$

$f(\ln(x)) = x$

$f(x+y) = f(x) f(y), f(1) = e$

$f(x) = \cosh(x) + \sinh(x)$

None of these definitions for $e^x$ is intrinsically best. Mathematicians have the choice to use whichever definition is more useful for any given purpose, and often it is precisely the simpler or more elegant seeming definitions that are used most commonly because they are easier for us to understand and manipulate.
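As a sanity check that these definitions really do describe the same function, here is a small sketch that evaluates several of them numerically at one point. The function names, the test point 1.5, and the cutoffs (a large but finite h, 30 series terms, a fixed number of Euler steps for the differential equation) are my own assumptions, so the limit and differential equation versions only agree with the others approximately.

```python
import math

def exp_limit(x, h=1_000_000):
    # (1 + x/h)^h with a large but finite h standing in for the limit.
    return (1.0 + x / h) ** h

def exp_series(x, terms=30):
    # Partial sum of x^k / k!.
    return sum(x ** k / math.factorial(k) for k in range(terms))

def exp_ode(x, steps=100_000):
    # Euler's method for f' = f with f(0) = 1, stepped from 0 to x.
    value, dx = 1.0, x / steps
    for _ in range(steps):
        value += value * dx
    return value

def exp_hyperbolic(x):
    return math.cosh(x) + math.sinh(x)

x = 1.5
for name, func in [("limit", exp_limit), ("series", exp_series),
                   ("differential equation", exp_ode),
                   ("cosh + sinh", exp_hyperbolic)]:
    print(f"{name:>22}: {func(x):.10f}")

# The remaining two definitions are properties rather than recipes,
# so here they are checked against the series version.
print("f(ln(3)) =", exp_series(math.log(3.0)))
print("f(0.5 + 1.0) - f(0.5) * f(1.0) =",
      exp_series(1.5) - exp_series(0.5) * exp_series(1.0))
```

It is worth noticing that each Euler step multiplies the running value by $(1 + \frac{x}{n})$, so the differential equation version ends up computing the same thing as the limit definition, which is a small example of how these definitions are tied together.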

As a final point, it is worth noting that quite a bit of the more theoretical mathematical work is driven more by the aesthetic and psychological appeal of the theorems than by the importance of those theorems in solving practical problems that arise in the real world. One prime example of this phenomenon is the field of number theory, which, while popular and very elegant, found almost no practical applications before it was (unexpectedly) linked to the field of cryptography and secure online banking.

Not only do mathematicians like equations that seem elegant, but it is easier to publish results that strike the reviewers as elegant rather than clumsy and awkward. Keeping these ideas in mind, it is no surprise to find that some of the most researched areas of math even today have great beauty but few real world applications.

In conclusion, the relationship between elegance and truth in physics and math is a complicated one, which relates to practical considerations such as choices for notation and definitions, psychological phenomena such as the personal preferences and aesthetic sensibilities of the practitioners, and deeper physical or mathematical ideas such as symmetry, the unification of seemingly unrelated results, and Occam’s razor.


### 12 Responses to Q: Are beautiful, elegant or simple equations more likely to be true?

1. Sometimes the most elegant way to communicate a physical law is without the mathematical language. For example, in Einstein’s paper, The Foundations of the Generalized Theory of Relativity, you would see that he develops his famous theory of space-time from assumptions which are given in plain English (or, more accurately, German). And if you read John Baez’ exposition on Einstein’s field equations, you may see that such equations, representing the terminal point of the general theory of relativity, may also be written in the layman’s vernacular.

2. Andrew Hall says:

Thanks, that actually cleared things up quite a bit for me.

3. rookworst says:

“Keeping these ideas in mind, it is no surprise to find that some of the most researched areas of math even today have great beauty but few real world applications.”

I’m interested in areas of math that you personally would describe as having great beauty, but little applicability to the real world?

4. micha says:

You write: “ This model was intrinsically very complex because by adjusting the epicycles so that there were a sufficient number of circular orbits within circular orbits at appropriate speeds, one could have described pretty much any shape of orbit, real or imagined. ” (emph. mine)

No need for the “pretty much”. Epicycles, being circles seen side on, would change sinusoidally with time. A sum of epicycles is a Fourier series, which (as I’m sure you know) can be made for any function, and thus any path across our sky.

5. John says:

If you want to know more about Occam’s razor, this is an interesting article (with great links to other articles): Less Wrong

6. The Mathematician says:

Topology and number theory are pretty good examples of fields with elegant theories but few real world applications relative to the amount of research that has been done in them.

7. The Mathematician says:

Thanks for bringing up the connection with Fourier series! What makes me hesitate to say just “any shape of orbit” is that the Fourier series actually relies on some assumptions. For instance, a Fourier series would fail to converge for a non-periodic orbit.

8. Amir says:

I believe this is not a very fair comparison, but today, as a 25 year old MSc in Mechanical Engineering, this blog makes me feel an awe similar to Beakman’s World when I was 10. Thank you so much.

9. valtron says:

Here are some true equations that aren’t simple, and some might say not even beautiful or elegant (but that’s subjective): pretty much everything Ramanujan wrote.

10. socratus says:

Physics: By Beauty it is beautiful.
#
After reading book ‘Albert Einstein’ by Leopold Infeld.
http://en.wikipedia.org/wiki/Leopold_Infeld
========================.
Page 4.
‘ Many believe that relative theory tells us that ours
is a kind of Alice-in-Wonderland universe; . . . .‘
‘ How, then, did the prejudice about the mysterious relative
Alice-in-Wonderland universe arise?’
#
1.
In the 19th century aether /ether was the term used to
describe a medium for the propagation of quantum of light
(electromagnetic waves ).
On one hand it must be very thin, because the planets
move through it without resistance.
On the other hand it must be very hard, because quantum
of light is a transverse wave. And a transverse wave can
move only in a hard space. Many theories were created
to explain this paradox but without success.
2.
In 1887 the Michelson-Morley experiment
showed that the speed of quantum of light is constant in all
directions regardless of the motion of the source.
This experiment was interpreted as ’ether doesn’t exist’.
3.
In 1905, Albert Einstein resolved this paradox by revising the
Galilean model of space and time to account for the constancy
of the speed of light. Einstein formulated his ideas in his
special theory of relativity, which advanced humankind’s
understanding of space and time.
/ The special theory of relativity
http://en.wikipedia.org/wiki/Light
4.
In 1908 Hermann Minkowski explained Einstein’s
idea using time as a fourth dimension and said:
‘ Henceforth, space by itself, and time by itself,
are doomed to fade away into mere shadows,
and only a kind of union of the two will preserve
an independent reality.’
=======================.
#
So, ‘ How, then, did the prejudice about the mysterious
relative Alice-in-Wonderland universe arise?’
My opinion.
On the page 5 Infeld wrote:
‘ Science is a rational structure; the greatest pleasure
in studying is that of understanding. Without it
knowledge means little.’

Very well. But if the ‘ Science is a rational structure’
then where is the Minkowski (-4D ) in nature, where
is the ‘only a kind of union of the two’ ?
Nobody knows where it is.
So, what is about a rational structure?
So, what is about a real structure, real nature?
I don’t mean to criticize.
I only cannot understand why the trick of exchanging the
concept of ether for the concept of space-time was accepted
without doubt, with glory and pride.
=====.
P.S.
Maybe the reason of (-4D) long live is it
mathematical beauty ?
Page 45.
‘Minkowski mathematical genius put Einstein’s ideas
into a new geometrical form that fully revealed their
beauty and simplicity.’

But is it correct to say, that these two parameters real enough
to explain and understand the real nature?
About 2500 years ago, according to Plato, Socrates said:
‘ I do not go so far as to insist upon the precise details;
only upon the fact that it is by Beauty that beautiful
things are beautiful.’

This is exactly that physicists are doing.
And as a result, going in such beautiful mathematical
way we have many paradoxes in physics.

Without the precise physical details, like: volume (V ),
temperature (T ) and density ( P) the Minkowski
beautiful and simple (-4D) is a pure mathematical game,
it is an abstraction.
=======.
All the best.