It is not uncommon to hear physicists or mathematicians talk about the beauty, simplicity or elegance of equations or theorems, and even claim that they are sometimes led to a correct formula (or away from an incorrect one) by considering what is simple or elegant. Consider, for example, the words of the Nobel prize-winning physicist Murray Gell-Mann:
“Three or four of us in 1957 put forward a partially complete theory of the weak [nuclear] force, in disagreement with the results of seven experiments. It was beautiful and so we dared to publish it, believing that all those experiments must be wrong. In fact, they were all wrong.”
and Albert Einstein’s remark:
“I have deep faith that the principle of the universe will be beautiful and simple.”
Could there be something to these remarkable claims? Is beauty in physics evidence for some kind of “intelligent” universe, or are there more mundane explanations? Is elegance in mathematics evidence for an underlying structure to reality? Or can this be explained away by psychological or practical considerations?
To begin answering these questions, an important thing to notice about the aesthetics of equations is that what appears to be simple or elegant may sometimes only be so because of the way that symbols are defined. For example, consider the remarkable and rather minimalist “heat equation”
$$\Delta f = f_t$$
which, when solved for the function $f$ with a given condition on its boundary, describes how heat actually flows over time in a specified region. Is it not astounding that we can describe such a powerful physical law with just 5 symbols?
A deeper look at this equation shows us that the apparent simplicity here is in part an illusion. First of all, $\Delta$, which is known in this context as the Laplace operator, can be thought of as simply a shorthand notation. If we replace $\Delta$ and $f_t$ with their respective definitions, we are left with the markedly less simple-looking equation:
$$\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2} = \frac{\partial f}{\partial t}$$
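Derivative operations (which are taken a total of seven times in the above equation) are not themselves trivial operations, and are typically defined via a limiting procedure. If we are crazy enough to replace all the derivative operations with their definitions, we are left with an equation which is just plain long and ugly, even after performing some simplification. One way to write it is:
$$\lim_{h \to 0} \frac{f(x+h,y,z,t) + f(x-h,y,z,t) + f(x,y+h,z,t) + f(x,y-h,z,t) + f(x,y,z+h,t) + f(x,y,z-h,t) - 6f(x,y,z,t)}{h^2} = \lim_{h \to 0} \frac{f(x,y,z,t+h) - f(x,y,z,t)}{h}$$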
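The idea that each derivative hides a limit can be probed numerically. The following is only a rough sketch under stated assumptions: it uses a one-dimensional version of the heat equation, and the test function f(x, t) = e^(-t) sin(x) and all function names are hypothetical choices made for illustration. It replaces each derivative with its difference quotient and checks that the two sides of the equation approach each other as h shrinks.

```python
import math

def f(x, t):
    # Hypothetical test function; it solves the 1-D heat equation f_t = f_xx exactly.
    return math.exp(-t) * math.sin(x)

def time_derivative(func, x, t, h):
    # Difference quotient from the limit definition of the first derivative in t.
    return (func(x, t + h) - func(x, t)) / h

def second_space_derivative(func, x, t, h):
    # Symmetric second difference, which tends to the second derivative in x as h -> 0.
    return (func(x + h, t) - 2.0 * func(x, t) + func(x - h, t)) / (h * h)

def heat_residual(func, x, t, h):
    # Left side minus right side of the 1-D heat equation, before taking the limit.
    return second_space_derivative(func, x, t, h) - time_derivative(func, x, t, h)

if __name__ == "__main__":
    for h in (1e-1, 1e-2, 1e-3):
        print(h, heat_residual(f, 1.0, 0.5, h))
```

The residual is not exactly zero for any fixed h; it only shrinks as h does, which is precisely the limiting procedure that the compact notation hides.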
The point to realize here is that mathematicians and physicists make very careful choices when selecting their notation to vastly compress very complicated ideas. You can make anything look simple by giving it a unique symbol, and you can make anything look complicated by recursively expanding the definitions it relies on. Even references to the number 1 look extremely complex if you replace them using a formula like:
$$1 = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-x^2/2}\,dx$$
Doing so would not change what’s true, but it sure would confuse a lot of people and make formulas much harder to work with.
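That elaborate expressions of this kind really are just the number 1 is easy to spot-check numerically. Here is a minimal sketch; the three identities and the function names are illustrative choices rather than anything prescribed above, and the integral is only approximated with a simple midpoint rule.

```python
import math

def geometric_series(terms=60):
    """Partial sum of 1/2 + 1/4 + 1/8 + ..., a series that converges to 1."""
    return sum(2.0 ** -n for n in range(1, terms + 1))

def gaussian_integral(steps=20000, cutoff=10.0):
    """Midpoint-rule estimate of (1/sqrt(2*pi)) * integral of exp(-x^2/2) over [-cutoff, cutoff]."""
    width = 2.0 * cutoff / steps
    total = 0.0
    for i in range(steps):
        x = -cutoff + (i + 0.5) * width
        total += math.exp(-0.5 * x * x)
    return total * width / math.sqrt(2.0 * math.pi)

def pythagorean_identity(angle=0.7):
    """sin^2 + cos^2 at an arbitrary angle, which equals 1 for every angle."""
    return math.sin(angle) ** 2 + math.cos(angle) ** 2

if __name__ == "__main__":
    for name, value in [
        ("geometric series", geometric_series()),
        ("Gaussian integral", gaussian_integral()),
        ("sin^2 + cos^2", pythagorean_identity()),
    ]:
        print(name, value)
```

All three agree with 1 up to numerical error, yet none of them would be a pleasant substitute for writing "1" in a formula.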
To make their notations as useful as possible, physicists and mathematicians typically define symbols in such a way as to make important formulas easy to write down and manipulate. The reason that the Laplace operator gets its own symbol is that it’s so damn important. So important theorems and known physical laws may have some tendency to seem simpler than they are when the equations are written out, because the choice of symbols was made, in part, to make them easy to write out.
All of this being said, notation is not the end of the story on whether elegance or simplicity relates to truth in math and physics. Another important point to consider is that in many cases a single physical law can cause a multitude of different effects which may not, at first, seem to be related. To give some classic examples, before Newton’s era it was not at all obvious that the force that causes us to fall to the ground when we jump is the same force that keeps planets in orbit in our solar system. Likewise, before the 1800s it was not known that electric fields, magnetic fields and light are manifestations of a single phenomenon now known as electromagnetism. Similarly, before the era of Einstein it was not understood that conservation of energy and conservation of momentum could be thought of as effectively being part of a single conservation law.
There are some cases in physics where simpler and more elegant theories have won out over more complex theories because they correctly identify seemingly unrelated phenomena as having a single cause. Theories which treat inherently connected ideas as being wholly different are destined to be replaced, since their lack of unification creates redundancy and therefore unnecessary complexity in the theory. This is one important reason why ugly, complicated theories can often be outdone by what seem to be more beautiful ones. We find it more beautiful to have one explanation for two results than to have two distinct explanations, and if the results really are just caused by one phenomenon, the single explanation will typically be easier to express and work with mathematically than the two separate explanations taken together.
Another, related reason why we might expect simplicity to win out over complexity is a rule of thumb known as Occam’s Razor. This idea, in its typical modern form, states that given multiple possible explanations for a phenomenon that are otherwise equally plausible, we should prefer the one that is the simplest.
A potential example of Occam’s Razor in practice relates to the Ptolemaic explanation of the motion of the planets, which was the accepted theory in some places for many hundreds of years. The basic idea of this theory was that planetary motion consists of “epicycles” around a fixed Earth. This means that planets were thought to make circular orbits around the Earth, but that during these circular orbits the planets also moved in smaller circles along the orbits, and along those smaller circles they moved in still smaller circles, etc. This model was intrinsically very complex because, by adjusting the epicycles so that there were a sufficient number of circular orbits within circular orbits at appropriate speeds, one could have described pretty much any shape of orbit, real or imagined. In other words, the model had a large number of free parameters, which gave it enormous flexibility and therefore complexity. Copernicus eventually laid waste to the Ptolemaic model by replacing it with a far simpler model with far fewer free parameters, which he accomplished merely by shifting the center of the circular planetary orbits to be the Sun rather than the Earth. However, the basic form of his new theory still did not agree perfectly with observation, and so required some ad hoc refinements that introduced extra complexity. This further complexity was eventually removed by Kepler, who refined the model yet again by allowing for elliptical rather than circular orbits, which is now known to be an excellent explanation for the orbits that are observed.
The key difference in these explanations for orbits is that the theory of epicycles is complex enough to explain almost any conceivable orbit you could ever think of, whereas Kepler’s idea of elliptical orbits with the Sun at one focus of each ellipse was just complex enough to explain what was actually observed, without being complex enough to have explained substantively different orbits had we observed them. In other words, Kepler’s theory is only as complicated as it needs to be to explain reality.
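The flexibility of circles stacked on circles can be illustrated with a small numerical experiment. This is only a sketch under a modern reading that is an assumption here, not a claim about Ptolemy's actual construction: each "epicycle" is treated as one term of a discrete Fourier series, and the target is a deliberately un-orbit-like square path. The path and all function names are hypothetical.

```python
import cmath
import math

def square_path(n_samples):
    """Sample a closed square path, traversed at constant speed, as complex numbers."""
    corners = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
    per_side = n_samples // 4  # assumes n_samples is a multiple of 4
    points = []
    for k in range(4):
        start, end = corners[k], corners[(k + 1) % 4]
        for i in range(per_side):
            points.append(start + (end - start) * i / per_side)
    return points

def epicycle_coefficients(points):
    """Discrete Fourier transform of the path.

    Coefficient c_k describes one circle of radius |c_k| turning k times per period.
    """
    n = len(points)
    coeffs = {}
    for k in range(-(n // 2), n - n // 2):
        total = sum(p * cmath.exp(-2j * math.pi * k * m / n) for m, p in enumerate(points))
        coeffs[k] = total / n
    return coeffs

def reconstruct(coeffs, n_circles, n_samples):
    """Rebuild the path using only the n_circles largest circles."""
    kept = sorted(coeffs.items(), key=lambda kv: abs(kv[1]), reverse=True)[:n_circles]
    return [
        sum(c * cmath.exp(2j * math.pi * k * m / n_samples) for k, c in kept)
        for m in range(n_samples)
    ]

def max_error(points, approx):
    """Largest distance between a sampled point and its approximation."""
    return max(abs(p - q) for p, q in zip(points, approx))

if __name__ == "__main__":
    path = square_path(200)
    coeffs = epicycle_coefficients(path)
    for n_circles in (1, 5, 25, 200):
        err = max_error(path, reconstruct(coeffs, n_circles, len(path)))
        print(n_circles, "circles -> max error", err)
```

With enough circles, even a square "orbit" is matched at every sampled point, which is the sense in which such a model can accommodate almost anything. An ellipse, by contrast, is pinned down by only a handful of numbers.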
There does seem to be something to this Occam’s Razor business. It certainly seems to be a bad idea to add extra assumptions to a theory if those new assumptions don’t improve explanatory power, and it also seems like a bad idea to use a theory that’s so complex (or has so many variables that can be tweaked) that it can explain pretty much any experimental result you might get. There is even some neat mathematical theory which shows that taking an Occam’s Razor-like approach is a good idea in certain contexts (see, for instance, the work on Solomonoff Induction and the Ockham Efficiency Theorem). But we’re a long way off from being able to formally prove that we should use Occam’s Razor as a general rule of thumb. In fact, we don’t even know what the right notion of “simpler” is to use in real-world problems when we claim that “simpler explanations are more likely to be true.”
There are a few more points about the relationship between beauty and truth in physics and math that I feel are worth mentioning. To begin with, as physicist Murray Gell-Mann (quoted above) mentions in his TED talk on beauty and truth in physics, symmetry plays a key role in simplicity. For example, since all the known laws of physics treat the three dimensions of space equally, we can often greatly simplify equations by writing expressions such as
$$\nabla f$$
(which is the gradient of $f$, the vector of the partial derivatives of $f$ with respect to each of its variables) rather than having to write an equivalent but much more cumbersome set of expressions where we treat each dimension of space separately, as in:
$$\frac{\partial f}{\partial x}, \quad \frac{\partial f}{\partial y}, \quad \frac{\partial f}{\partial z}$$
The point here is that symmetry makes it easier to simplify equations. Of course, this argument goes beyond just the symmetry of the three dimensions of space, and applies also to symmetry in time, rotation, etc.
Another idea that should be mentioned is that mathematical expressions typically have a number of equivalent forms. For example, we could define the exponential function using any of the following nearly interchangeable definitions:
$$e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}$$
$$e^x = \lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n$$
$$\frac{d}{dx} e^x = e^x, \quad e^0 = 1$$
None of these definitions for $e^x$ is intrinsically best. Mathematicians have the choice to use whichever definition is more useful for any given purpose, and often it is precisely the simpler or more elegant seeming definitions that are used most commonly because they are easier for us to understand and manipulate.
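To make the interchangeability concrete, here is a small sketch comparing two standard textbook definitions of the exponential function, the power series and the compound-interest limit, against Python's built-in `math.exp`. The choice of these two definitions, the function names, and the truncation parameters are assumptions made for illustration.

```python
import math

def exp_series(x, terms=30):
    """Truncated power series: the sum of x^n / n! for n = 0 .. terms-1."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= x / (n + 1)  # turns x^n / n! into x^(n+1) / (n+1)!
    return total

def exp_limit(x, n=10**7):
    """Finite-n stand-in for the limit of (1 + x/n)^n as n grows."""
    return (1.0 + x / n) ** n

if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0):
        print(x, exp_series(x), exp_limit(x), math.exp(x))
```

Both routes land on the same values, but they are not equally convenient: the series converges to machine precision within a few dozen terms, while the limit form needs an enormous n for even modest accuracy. That is one practical reason a particular definition gets picked for a particular job.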
As a final point, it is worth noting that quite a bit of the more theoretical mathematical work is driven more by the aesthetic and psychological appeal of the theorems than by the importance of those theorems in solving practical problems that arise in the real world. One prime example of this phenomenon is the field of number theory, which, while popular and very elegant, found almost no practical applications before it was (unexpectedly) linked to the field of cryptography and secure online banking.
Not only do mathematicians like equations that seem elegant, but it is easier to publish results that strike the reviewers as elegant rather than clumsy and awkward. Keeping these ideas in mind, it is no surprise to find that some of the most researched areas of math even today have great beauty but few real world applications.
In conclusion, the relationship between elegance and truth in physics and math is a complicated one, which relates to practical considerations such as choices for notation and definitions, psychological phenomena such as the personal preferences and aesthetic sensibilities of the practitioners, and deeper physical or mathematical ideas such as symmetry, the unification of seemingly unrelated results, and Occam’s Razor.