Q: When you write a fraction with a prime denominator in decimal form it repeats every p-1 digits. Why?

The original question was: How come the length of the repetend for some fractions (e.g. having a prime number p as a denominator) is equal to p-1?


Physicist: The question is about the fact that if you type a fraction into a calculator, the decimal that comes out repeats.  But it repeats in a very particular way.  For example,

\frac{1}{7} = 0.\underbrace{142857}_{repetend}142857142857\ldots

7 is a prime number and (you can check this) all fractions with a denominator of 7 repeat every 7-1=6 digits (even if they do so trivially with “000000”).  The trick to understanding why this happens in general is to look really hard at how division works.  That is to say: just do long division and see what happens.

When we say that \frac{1}{7} = 0.142857\ldots, what we mean is \frac{1}{7} = 0 + \frac{1}{10} + \frac{4}{10^2} + \frac{2}{10^3}+ \frac{8}{10^4}+ \frac{5}{10^5}+ \frac{7}{10^6}\ldots.  With that in mind, here’s why \frac{1}{7} = 0.142857\ldots.

\begin{array}{ll}  \frac{1}{7} \\[2mm]  = \frac{1}{10}\frac{10}{7} \\[2mm]  = \frac{1}{10} + \frac{1}{10}\frac{3}{7} \\[2mm]  = \frac{1}{10} + \frac{1}{10^2}\frac{30}{7} \\[2mm]  = \frac{1}{10} + \frac{4}{10^2} + \frac{1}{10^2}\frac{2}{7} \\[2mm]  = \frac{1}{10} + \frac{4}{10^2} + \frac{1}{10^3}\frac{20}{7} \\[2mm]  = \frac{1}{10} + \frac{4}{10^2} + \frac{2}{10^3} + \frac{1}{10^3}\frac{6}{7} \\[2mm]  \end{array}

and so on forever.  You’ll notice that the same thing is done to the numerator over and over: multiply by 10, divide by 7, the quotient is the digit in the decimal and the remainder gets carried to the next step, multiply by 10, ….  The remainder that gets carried from one step to the next is just \left[10^k\right]_7.
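If you’d rather let a computer grind through that long division, here’s a minimal sketch (the helper name long_division_digits is made up for this post, not anything standard): it spits out the decimal digits and the carried remainders side by side, and the remainders are exactly the [10^k]_7 values described above.

```python
def long_division_digits(numerator, denominator, n_digits):
    """Grade-school long division: multiply the remainder by 10, take the
    quotient as the next decimal digit, and carry the new remainder."""
    digits, remainders = [], []
    r = numerator % denominator
    for _ in range(n_digits):
        remainders.append(r)             # this is [numerator * 10^k]_denominator
        r *= 10
        digits.append(r // denominator)  # the next digit of the decimal expansion
        r %= denominator                 # the remainder carried to the next step
    return digits, remainders

digits, remainders = long_division_digits(1, 7, 12)
print(digits)      # [1, 4, 2, 8, 5, 7, 1, 4, 2, 8, 5, 7]
print(remainders)  # [1, 3, 2, 6, 4, 5, 1, 3, 2, 6, 4, 5]
```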

Quick aside: If you’re not familiar with modular arithmetic, there’s an old post here that has lots of examples (and a shallower learning curve).  The bracket notation I’m using here isn’t standard, just better.  “[4]_3” should be read “4 mod 3”.  And because the remainder of 4 divided by 3 and the remainder of 1 divided by 3 are both 1, we can say “[4]_3=[1]_3”.

\begin{array}{l|l}\frac{1}{7}&[1]_7\\[2mm]=\frac{1}{10}\frac{10}{7}&[10]_7\\[2mm]=\frac{1}{10}+\frac{1}{10}\frac{3}{7}&[10]_7=[3]_7\\[2mm]=\frac{1}{10}+\frac{1}{10^2}\frac{30}{7}&[10^2]_7=[30]_7\\[2mm]=\frac{1}{10}+\frac{4}{10^2}+\frac{1}{10^2}\frac{2}{7}&[10^2]_7=[2]_7\\[2mm]=\frac{1}{10}+\frac{4}{10^2}+\frac{1}{10^3}\frac{20}{7}&[10^3]_7=[20]_7\\[2mm]=\frac{1}{10}+\frac{4}{10^2}+\frac{2}{10^3}+\frac{1}{10^3}\frac{6}{7}&[10^3]_7=[6]_7\\[2mm]  \end{array}

These aren’t the numbers that end up in the decimal expansion, they’re the remainder left over when you stop calculating the decimal expansion at any point.  What’s important about these numbers is that they each determine the next number in the decimal expansion, and they repeat every 6.

\begin{array}{ll}  [1]_7=1\\[2mm]  [10]_7=3\\[2mm]  [10^2]_7=2\\[2mm]  [10^3]_7=6\\[2mm]  [10^4]_7=4\\[2mm]  [10^5]_7=5\\[2mm]  [10^6]_7=1\end{array}

After this it repeats because, for example, [10^9]_7 = [10^3\cdot10^6]_7 = [10^3\cdot1]_7 = [10^3]_7.  If you want to change the numerator to, say, 4, then very little changes:

\begin{array}{l|l}\frac{4}{7}&[4]_7\\[2mm]=\frac{5}{10}+\frac{1}{10}\frac{5}{7}&[4\cdot10]_7=[5]_7\\[2mm]=\frac{5}{10}+\frac{7}{10^2}+\frac{1}{10^2}\frac{1}{7}&[4\cdot10^2]_7=[1]_7\\[2mm]=\frac{5}{10}+\frac{7}{10^2}+\frac{1}{10^3}+\frac{1}{10^3}\frac{3}{7}&[4\cdot10^3]_7=[3]_7\\[2mm]=\frac{5}{10}+\frac{7}{10^2}+\frac{1}{10^3}+\frac{4}{10^4}+\frac{1}{10^4}\frac{2}{7}&[4\cdot10^4]_7=[2]_7\\[2mm]=\frac{5}{10}+\frac{7}{10^2}+\frac{1}{10^3}+\frac{4}{10^4}+\frac{2}{10^5}+\frac{1}{10^5}\frac{6}{7}&[4\cdot10^5]_7=[6]_7\\[2mm]=\frac{5}{10}+\frac{7}{10^2}+\frac{1}{10^3}+\frac{4}{10^4}+\frac{2}{10^5}+\frac{8}{10^6}+\frac{1}{10^6}\frac{4}{7}&[4\cdot10^6]_7=[4]_7\\[2mm]\end{array}

So the important bit to look at is the remainder after each step.  More generally, the question of why a decimal expansion repeats can now be seen as the question of why [10^k]_P repeats every P-1, when P is prime.  For example, for \frac{2}{3} we’d be looking at [2\cdot10^k]_3 and for \frac{30}{11} we’d be looking at [30\cdot10^k]_{11}.  The “10” comes from the fact that we use a base 10 number system, but that’s not written in stone either (much love to my base 20 Mayan brothers and sisters.  Biix a beele’ex, y’all?).

It turns out that when the number in the denominator, M, is coprime to 10 (has no factors of 2 or 5), then the numbers generated by successive powers of ten (mod M) are always also coprime to M.  In the examples above M=7 and the powers of 10 generated {1,2,3,4,5,6} (in a scrambled order).  The number of numbers less than M that are coprime to M (have no factors in common with M) is denoted by ϕ(M), the “Euler phi of M”. For example, ϕ(9)=6, since {1,2,4,5,7,8} are all coprime to 9.  For a prime number, P, every number less than that number is coprime to it, so ϕ(P)=P-1.

When you find the decimal expansion of a fraction, you’re calculating successive powers of ten and taking the mod.  As long as 10 is coprime to the denominator, this generates numbers that are also coprime to the denominator.  If the denominator is prime, there are P-1 of these.  More generally, if the denominator is M, there are ϕ(M) of them.  For example, \frac{5}{21}=0.\underbrace{238095238095}238095238095\ldots, which repeats every 12 because ϕ(21)=12.  It also repeats every 6, but that doesn’t change the “every 12” thing.
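If you’d like to check the “repeats every ϕ(M)” claim numerically, here’s a rough brute-force sketch (both helpers, phi and repetend_length, are invented for illustration): it computes ϕ(M) and the actual length of the repeating block (the smallest r with [10^r]_M=[1]_M), and confirms that the latter divides the former.

```python
from math import gcd

def phi(m):
    # Euler's phi: count the numbers from 1 to m that are coprime to m
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)

def repetend_length(m):
    # smallest r with 10^r = 1 (mod m); assumes gcd(10, m) == 1
    r, power = 1, 10 % m
    while power != 1:
        power = (power * 10) % m
        r += 1
    return r

for m in (7, 13, 21, 41):
    print(m, phi(m), repetend_length(m), phi(m) % repetend_length(m) == 0)
# 7: phi=6, length 6;  13: phi=12, length 6;  21: phi=12, length 6;  41: phi=40, length 5
# In every case the repetend length divides phi(m), as promised.
```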

Why the powers of ten must hit either every one of the ϕ(M) coprime numbers or some whole fraction of them (\frac{\phi(M)}{2}, or \frac{\phi(M)}{3}, or …), thus forcing the decimal to repeat every ϕ(M) digits, is covered in the answer gravy below.


Answer Gravy: Here’s where the number theory steps in.  The best way to describe, in extreme generalization, what’s going on is to use “groups”.  A group is a set of things and an operation, with four properties: closure, inverses, identity, and associativity.

In this case the set of numbers we’re looking at are the numbers coprime to M, mod M.  If M=7, then our group is {1,2,3,4,5,6} with multiplication as the operator.  This group is denoted “\mathbb{Z}_7^\times“.

The numbers coprime to M are “closed” under multiplication, because if you multiply two numbers with no factors in common with M, you’ll get a new number with no factors in common with M.  For example, [3\cdot4]_7=[12]_7=[5]_7.  No 7’s in sight (other than the mod, which is 7).

The numbers coprime to M have inverses.  This means that if a\in\mathbb{Z}_M^\times, then there is some b\in\mathbb{Z}_M^\times such that [a\cdot b]_M=[1]_M.  This is a consequence of Bézout’s lemma (proof in the link), which says that if a and M are coprime, then there are integers x and y such that xa+yM=1, with x coprime to M and y coprime to a.  Writing that using modular math, if a and M are coprime, then there exists an x such that [xa]_M=[1]_M.  For example, [1\cdot1]_7=[1]_7, [2\cdot4]_7=[1]_7, [3\cdot5]_7=[1]_7, and [6\cdot6]_7=[1]_7.  Here we’d write [3^{-1}]_7=[5]_7, which means “the inverse of 3 is 5”.

The numbers coprime to M have an identity element.  The identity element is the thing that doesn’t change any of the other elements.  In this case the identity is 1, because 1\cdot x=x in general.  1 is coprime to everything (it has no factors), so 1 is always in \mathbb{Z}_M^\times regardless of what M is.

Finally, the numbers coprime to M are associative, which means that (ab)c=a(bc).  This is because multiplication is associative.  No biggy.

 

So \mathbb{Z}_M^\times, the set of numbers (mod M) coprime to M, form a group under multiplication.  Exciting stuff.

But what we’re really interested in are “cyclic subgroups”.  “Cyclic groups” are generated by the same number raised to higher and higher powers.  For example in mod 7, {3^1,3^2,3^3,3^4,3^5,3^6}={3,2,6,4,5,1} is a cyclic group.  In fact, this is \mathbb{Z}_7^\times.  On the other hand, {2^1,2^2,2^3}={2,4,1} is a cyclic subgroup of \mathbb{Z}_7^\times.  A subgroup has all of the properties of a group itself (closure, inverses, identity, and associativity), but it’s a subset of a larger group.

In general, {a^1,a^2,…,a^r} is always a group, and often a proper subgroup of the full group.  The “r” there is called the “order of the group”, and it is the smallest number such that [a^r]_M=[1]_M.
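Here’s a tiny sketch of that (cyclic_subgroup is a hypothetical helper, not anything standard): it just keeps multiplying by a (mod M) until the powers wrap back around, which is exactly the set {a^1,a^2,…,a^r}.

```python
def cyclic_subgroup(a, m):
    # collect a, a^2, a^3, ... (mod m) until the powers start repeating
    elements, power = [], a % m
    while power not in elements:
        elements.append(power)
        power = (power * a) % m
    return elements

print(cyclic_subgroup(3, 7))   # [3, 2, 6, 4, 5, 1]  (all of Z_7^x, so r = 6)
print(cyclic_subgroup(2, 7))   # [2, 4, 1]           (a proper subgroup, r = 3)
print(cyclic_subgroup(10, 7))  # [3, 2, 6, 4, 5, 1]  (the powers of ten from earlier)
```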

Cyclic groups are closed because [a^x\cdot a^y]_M=[a^{x+y}]_M.

Cyclic groups contain the identity.  There are only a finite number of elements in the full group, \mathbb{Z}_M^\times, so eventually different powers of a will be the same.  Therefore,

\begin{array}{ll}    [a^x]_M=[a^y]_M \\[2mm]    \Rightarrow[a^x]_M=[a^xa^{y-x}]_M \\[2mm]    \Rightarrow[(a^x)^{-1}a^x]_M=[(a^x)^{-1}a^xa^{y-x}]_M \\[2mm]    \Rightarrow[1]_M=[a^{y-x}]_M    \end{array}

That is to say, if you get the same value for different powers, then the difference between those powers is the identity.  For example, [3^2]_7=[2]_7=[3^8]_7 and it’s no coincidence that [3^{8-2}]_7=[3^6]_7=[1]_7.

Cyclic groups contain inverses.  There is an r such that [a^r]_M=[1]_M.  It follows that [ba^x]_M=[1]_M\Rightarrow[ba^x]_M=[a^r]_M\Rightarrow[b]_M=[a^{r-x}]_M.  So, [\left(a^x\right)^{-1}]_M=[a^{r-x}]_M.

And cyclic subgroups have associativity.  Yet again: no biggy, that’s just how multiplication works.

 

It turns out that the number of elements in a subgroup always divides the number of elements in the group as a whole.  For example, \mathbb{Z}_7^\times={1,2,3,4,5,6} is a group with 6 elements, and the cyclic subgroup generated by 2, {1,2,4}, has 3 elements.  But check it: 3 divides 6.  This is Lagrange’s Theorem.  It comes about because cosets (which you get by multiplying every element in a subgroup by the same number) are always the same size and are always distinct.  For example (again in mod 7),

\begin{array}{rl}    1\cdot\{1,2,4\} & = \{1,2,4\} \\    2\cdot\{1,2,4\} & = \{2,4,1\} \\    3\cdot\{1,2,4\} & = \{3,6,5\} \\    4\cdot\{1,2,4\} & = \{4,1,2\} \\    5\cdot\{1,2,4\} & = \{5,3,6\} \\    6\cdot\{1,2,4\} & = \{6,5,3\} \\    \end{array}

The cosets here are {1,2,4} and {3,5,6}.  They’re the same size, they’re distinct, and together they hit every element in \mathbb{Z}_7^\times.  The cosets of any given subgroup are always the same size as the subgroup, always distinct (no shared elements), and always hit every element of the larger group.  This means that if the subgroup has S elements, there are C cosets, and the group as a whole has G elements, then SC=G.  Therefore, in general, the number of elements in a subgroup divides the number of elements in the whole group.
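To make the coset business concrete, here’s a short sketch (cosets is an invented helper): it multiplies the subgroup {1,2,4} by every element of \mathbb{Z}_7^\times and collects the distinct sets that come out, which are exactly the two same-sized cosets above.

```python
from math import gcd

def cosets(subgroup, m):
    # multiply the subgroup by every unit mod m and keep the distinct results
    units = [g for g in range(1, m) if gcd(g, m) == 1]
    distinct = {frozenset((g * s) % m for s in subgroup) for g in units}
    return [sorted(c) for c in distinct]

print(cosets({1, 2, 4}, 7))  # [[1, 2, 4], [3, 5, 6]] (possibly listed in the other order)
```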

 

To sum up:

In order to calculate a decimal expansion (in base 10) you need to raise 10 to higher and higher powers and divide by the denominator, M.  The quotient is the next digit in the decimal and the remainder is what’s carried on to the next step.  The remainder is what the “mod” operation yields.  This leads us to consider the group \mathbb{Z}_M^\times, which is the multiplication-mod-M group of numbers coprime to M (the not-coprime case will be considered in a damn minute).  \mathbb{Z}_M^\times has exactly ϕ(M) elements.  The powers of 10 form a “cyclic subgroup”.  The number of numbers in this cyclic subgroup must divide ϕ(M), by Lagrange’s theorem.

If P is prime, then ϕ(P)=P-1, and therefore if the denominator is prime the length of the cycle of digits in the decimal expansion (which is dictated by the cyclic subgroup generated by 10) must divide P-1.  That is, the decimal repeats every P-1, but it might also repeat every \frac{P-1}{2} or \frac{P-1}{3} or whatever.  You can also calculate ϕ(M) for M not prime, and the same idea holds.


Deep Gravy:

Finally, if the denominator is not coprime to 10 (e.g., 3/5, 1/2, 1/14, 71/15, etc.), then things get a little screwed up.  If the denominator is nothing but factors of 10, then the decimal is always finite.  For example, \frac{1}{8}=0.125\underbrace{0}_{repetend}000000.

\begin{array}{l|l}    \frac{1}{8}&[1]_8\\[2mm]    =\frac{1}{10}+\frac{1}{10}\frac{2}{8}&[10]_8=[2]_8\\[2mm]    =\frac{1}{10}+\frac{2}{10^2}+\frac{1}{10^2}\frac{4}{8}&[10^2]_8=[4]_8\\[2mm]    =\frac{1}{10}+\frac{2}{10^2}+\frac{5}{10^3}&[10^3]_8=[0]_8\\[2mm]    \end{array}

In general, if the denominator has powers of 2 or 5, then the resulting decimal will be a little messy for the first few digits (equal to the higher of the two powers, for example 8=2^3) and after that will follow the rules for the part of the denominator coprime to 10.  For example, 28=2^2\cdot7.  So, we can expect that after two digits the decimal expansion will settle into a nice six-digit repetend (because ϕ(7)=6).

Fortunately, the system works: \frac{1}{28}=0.03\underbrace{571428}571428\ldots

This can be understood by looking at the powers of ten for each of the factors of the denominator independently.  If A and B are coprime, then \mathbb{Z}_{AB}^\times \cong \mathbb{Z}_{A}^\times\otimes \mathbb{Z}_{B}^\times.  This is an isomorphism that works because of the Chinese Remainder Theorem.  So, a question about the powers of 10 mod 28 can be explored in terms of the powers of 10 mod 4 and mod 7.

\begin{array}{l|l}    [10]_{28}=[10]_{28} & \left([10]_{4},[10]_{7}\right) = \left([2]_{4},[3]_{7}\right) \\[2mm]    [10^2]_{28}=[16]_{28} & \left([10^2]_{4},[10^2]_{7}\right) = \left([0]_{4},[2]_{7}\right) \\[2mm]    [10^3]_{28}=[20]_{28} & \left([10^3]_{4},[10^3]_{7}\right) = \left([0]_{4},[6]_{7}\right) \\[2mm]    \end{array}
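Carrying that table a few more steps (just a quick sketch, nothing fancier than a loop) shows the mod-4 component getting stuck at 0 after two steps while the mod-7 component keeps cycling with period 6, which is exactly why 1/28 has two “messy” digits followed by a six-digit repetend.

```python
for k in range(1, 9):
    p = 10 ** k
    print(k, p % 28, (p % 4, p % 7))
# k: 1  ->  10  (2, 3)
# k: 2  ->  16  (0, 2)   <- from here on the mod-4 part stays 0
# k: 3  ->  20  (0, 6)
# k: 4  ->   4  (0, 4)
# k: 5  ->  12  (0, 5)
# k: 6  ->   8  (0, 1)
# k: 7  ->  24  (0, 3)
# k: 8  ->  16  (0, 2)   <- same as k=2, so the cycle length is 6
```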

Once the powers of 10 are a multiple of all of the 2’s and 5’s in the denominator, those factors basically disappear and only the coprime component is important.

Numbers are a whole thing.  If you can believe it, this was supposed to be a short post.


Pluto!

Physicist: In 2006 a probe called New Horizons was launched to get a better look at Pluto and its moons Charon, Nix, and Hydra.  Since then, Pluto stopped being a planet and gained a couple more moons: Kerberos and Styx.

This is exciting stuff.  The reason we have big fancy pictures of the eight planets in our solar system is because we’ve sent cameras to them.  Hubble (and future space telescopes) are great, but sometimes you’ve just gotta be there.


Neptune as seen by Hubble (left) and Voyager 2 (right).

Tomorrow morning (July 14, 2015) New Horizons will pass Pluto at mach 48 (48 times the speed of sound, which is a misleading and entirely inappropriate way of measuring speed in space).  It will furiously take pictures and measurements for a couple hours and then continue into interstellar space, where its docket will be pretty open for the next few million years.

Already New Horizons has sent us the clearest images of Pluto ever.


Pluto as seen by Hubble (left) and as seen by New Horizons, two days and a few million miles ago (right).

Unlike the last post in this vein, there’s nothing for you to personally do.  But still: now we get to learn stuff about the planet-turned-dwarf-planet that’s been a bit of an asterisk for 85 years.  Good times!

Update (July 15, 2015): Huzzah!

If you're a nerd born before now, you've been waiting for this picture.

It only took 85 years!


Q: If atoms are 99.99% space, what “kind” of space is it? Is it empty vacuum?

Physicist: This is a bit of a misnomer.

When we picture an atom we usually picture the “Bohr model”: a nucleus made of a bunch of particles packed together (protons and neutrons) with other particles zipping around it (electrons).  In this picture, if you make a guess about the size of electrons and calculate how far they are from the nucleus, then you get that weird result about atoms being mostly empty.  But that guess is surprisingly hard to make.  The “classical electron radius” is an upper-limit guess based on the electron being nothing more than its own electric field, but it’s ultimately just a gross estimate.


The picture gives you an idea of more or less where things can be found in an atom, but does a terrible job conveying what those things are actually like.

However, electrons aren’t really particles (which is why it’s impossible to actually specify their size); they’re waves.  Instead of being in a particular place, they’re kinda “smeared out”.  If you ring a bell, you can say that there is a vibration in that bell but you can’t say where exactly that vibration is: it’s a wave that’s spread out all over the bell.  Parts of the bell will be vibrating more, and parts may not be vibrating at all.  Electrons are more like the “ringing” of the bell and less like a fly buzzing around the bell.

Just to be clear, this is a metaphor: atoms are not tiny bells.  The math that describes the “quantum wave function” of electrons in atoms and the math that describes vibrations in a bell have some things in common.


Where exactly is the ringing happening?

So, the space in atoms isn’t empty.  A more accurate thing to say is that the overwhelming majority of the matter in an atom is concentrated in the nucleus, which is tiny compared to the region where the electrons are found.  However, even in the nucleus the same “problem” crops up; protons and neutrons are just “the ringing of bells” and aren’t simply particles either.  The question “where exactly is this electron/proton/whatever?” isn’t merely difficult to answer, the question genuinely doesn’t have an answer.  In quantum physics things tend to be spread out across a lot of states (in this case those different states include different positions).

The atom picture is from here.


Q: Is geocentrism really so wrong? Is the Sun being at the “center” (i.e. the Earth orbiting the Sun) just an arbitrary reference frame decision, and no more true than the Earth being at the center?

Physicist: When you walk around in this big crazy world, there aren’t any immediate reasons to suspect that the ground under your feet is doing anything more than sitting perfectly still (ring of fire notwithstanding).  Given that, when you look up at the sky and see things pinwheeling about; why not assume that they’re moving and that you’re sitting still?  On its face, geocentrism makes sense.

But there are a lot of physical phenomena that poke holes in it pretty quick.  For example, Foucault pendulums (more commonly known as “big pendulums“) swing as though the Earth were turning under them and in a way that exactly corresponds to the way everything in the sky turns overhead (not a coincidence).

The classic way that heliocentrism (the idea that the Sun is at the center of the solar system) is demonstrated to be better than geocentrism (the idea that the Earth is at the center of the solar system) is by looking at the motion of the other planets.  This was essentially what Copernicus did: point out that with the Earth at the center the motions of the other planets are crazy, but with the Sun at the center the motions of the planets (including Earth) are simple ellipses.  His original argument was essentially just an application of Occam’s razor: simpler is better, so the Sun must be at the center.

Occam’s razor is a great red flag for detecting ad-hoc theories, but it’s not science.  With that in mind, it’s impressive how much Copernicus got exactly right.  Fortunately, about a century and a half after Copernicus, Newton came along and squared that circle.  Newtonian physics says a lot more than “gee whiz, but ellipses are pretty”; it actually describes exactly why all of the orbits behave the way they do with a remarkably simple set of laws for gravity and movement in general.  Newtonian physics goes even farther, describing not just the motion of the planets, but also why we don’t directly notice the motion of our own.

If we still assumed that the Earth was sitting still in the universe, physicists would have spent the last couple centuries desperately trying to explain what’s hauling the Sun (and the rest of the planets) around in such huge circles. We’d need a bunch of extra, mysterious forces to explain away why the center of mass in the solar system doesn’t sit still (or move at a uniform speed), but is instead whipping by overhead daily.

What follows is a bunch of Newtonian stuff.

Position and velocity are both entirely subjective, but acceleration is objective.  What that physically means is that there is absolutely no way, whatsoever, to determine where you are or how fast you’re moving by doing tests of any kind.  Sure, you can look around and see other things passing by, but even then you’re only measuring your relative velocity (your velocity relative to whatever you’re looking at).  So, hypothetically, if you’re on a big ball of stuff flying through space, you’d never be able to tell.  Acceleration on the other hand is easy to measure.


The laws of physics are exactly the same regardless of where you are or how you’re moving.  Therefore there is no experiment that can tell you your “true” position or velocity.  Acceleration, however, does change things.  That’s why you can juggle in exactly the same way in a place (upper left), a different place (upper right), or even at a different speed (lower left), but you can’t juggle, or you have to juggle differently, when accelerating (lower right).

At first blush it would seem as though there’s no way, from here on Earth, to tell the difference between the Earth moving or sitting still.  If the Earth is sitting still, we wouldn’t be able to tell.  If the Earth is moving, we also wouldn’t be able to tell.  But we’re doing more than just moving; we’re moving in circles and as it happens traveling in a circle requires acceleration.  The push you feel when you speed up or slow down comes from the exact same source as the push you feel when you turn a corner or spin around: acceleration.


The Moon orbits the Earth, and the Earth sorta orbits the Moon.  By wobbling.

If the Earth were “nailed to space” and never accelerated, then we’d only have one each of the lunar and solar tides.  If the Earth never moved, then the Moon’s gravity would pull the oceans toward it and that’s it.  But the Earth does move.  The Moon is heavy so, even though the Earth doesn’t move nearly as much, the Earth does execute little circles to balance the Moon’s big circles.  The Moon’s big circles generate enough centrifugal force to balance the Earth’s pull on it (that’s what an orbit is), and at the same time the Earth’s little circles balance the Moon’s pull on us.


If the Earth were nailed to the sky, then the Moon’s gravity would cause only one tide a day as the seas are pulled toward it. We experience two because the Earth and Moon orbit each other around the same point (red dot).  The swinging of the backside of the Earth means that the water on the far side is “flung outward”.

The same basic thing happens between the Earth and the Sun.  Things closer to the Sun orbit faster and things farther away orbit slower.  But the Earth has to travel as one big block.  The side facing the Sun is about 4,000 miles closer, and traveling slower than it would if it were orbiting at that slightly lower level.  As a result, the Sun’s gravity “wins” a little in the “noon region” of the Earth and we get a high tide (pulled toward the Sun).  The side facing away is moving a little bit faster than something at that distance from the Sun should, so it’s flung outward a little more than it should be and we get another high tide at midnight.  These are called the “solar tides” and they’re harder to notice because they’re about half as strong as the lunar (regular) tides.  That said, the solar tides are important and they exist because the Earth is traveling in a circle around the Sun.

Long story short: If the Earth were stationary (geocentrism) then we’d have to come up with lots of bizarre excuses to explain why Newton’s laws work perfectly here on the ground, but not at all in space, and we’d only have one solar and lunar tide a day.  If the Earth is moving (specifically: around the Sun), then Newton’s simple laws can be applied universally without buckets of caveats and asterisks*, and we get two lunar and solar tides a day.

*or even †’s.


Q: Is there such a thing as half a derivative?

The original question was: Another one of those questions of the type “does this make sense”.  You have first derivatives and second derivatives.  f'(x), f''(x) or sometimes dy/dx and d^2y/dx^2. Is there any sensible definition of something like a “half” derivative, or more generally an nth derivative for a non-integer n?


Physicist: There is!  For readers not already familiar with first year calculus, this post will be a lot of non-sense.

Strictly speaking, the derivative only makes sense in integer increments.  But that’s never stopped mathematicians from generalizing.  Heck, non-integer exponentiation doesn’t make much sense (I mean, 2^{3.5} is “2 times itself three and a half times”.  What is that?), but with a little effort we can move past that.

The derivative of a function is the slope at every point along that function, and it tells you how fast that function is changing.  The “2nd derivative” is the derivative of the derivative, and it tells you how fast the slope is changing.


f(x) is a parabola. f'(x) describes the fact that as you move to the right the parabola’s slope increases. Notice that a negative slope means “down hill”. f''(x) describes the slope of f'(x), which is constant.

When you want to generalize something like this, you basically need to “connect the dots” between those cases where the math actually makes sense.  For something like exponentiation by not-integers there’s a “correct” answer.  For not-integer derivatives there really isn’t.  One way is to use Fourier Transforms.  Another is to use Laplace Transforms.  Neither of these is ideal.  Just to be clear: non-integral derivatives are nothing more than a matter of choosing “what works” from a fairly short list of options that aren’t terrible.

It turns out (as used in both of those examples) that integrals are a great way of “connecting dots”.  When you integrate a function the result is more continuous and more smooth.  In order to get something out that’s discontinuous at a given point, the function you put in needs to be infinitely nasty at that point (technically, it has to be so nasty it’s not even a function).  So, integrals are a quick way of “connecting the dots”.

To get the idea, take a look at N!.  That excited looking N is “N factorial” and it’s defined as N!=1\cdot2\cdot3\cdots(N-1)\cdot N.  For example, 3!=1\cdot2\cdot3=6.  Clearly, it doesn’t make a lot of sense to write “3.5!” or, even worse, “π!”.  And yet there’s a cute way to smoothly connect the dots between 3! and 4!.


Γ(N+1) is a fairly natural way of generalizing N! to non-natural numbers.  The dotted lines correspond to 1!=1, 2!=2, and 3!=6.

The Gamma function, Γ(N) (not to be confused with the gamma factor), is defined as: \Gamma(N+1) = \int_0^\infty t^{N} e^{-t}\,dt.  Before you ask, I don’t know why Euler decided to use “N+1” instead of “N”.  Sometimes decent-enough folk have good reasons for doing confusing things.  If you do a quick integration by parts, a pattern emerges:

\begin{array}{ll}\Gamma(N+1)\\[2mm]= \int_0^\infty t^{N} e^{-t}\,dt \\[2mm]=\left[-t^Ne^{-t}\right]_0^\infty + N\int_0^\infty t^{N-1} e^{-t}\,dt \\[2mm]=N\int_0^\infty t^{N-1} e^{-t}\,dt \\[2mm]=N\Gamma(N)\end{array}

So, Γ(N+1) has the same defining property that N! has: \Gamma(N+1) = N\cdot \Gamma(N) and N! = N\cdot (N-1)!.  Even better, \Gamma(1) = \int_0^\infty e^{-t}\,dt=-e^{-t}\big|_0^\infty = 0-(-1)=1, which is the other defining property of N!, 0!=1.  We now have a bizarre new way of writing N!.  For all natural numbers N, N! = Γ(N+1).  Unlike N!, which only makes sense for natural numbers, Γ(N+1) works for any positive real number since you can plug in whatever positive N you like into \int_0^\infty t^{N} e^{-t}\,dt.
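If you want to watch the dots being connected, Python’s standard library already knows about Γ (math.gamma), so a quick sketch shows Γ(N+1) matching N! at the whole numbers and happily filling in values like “3.5!” in between.

```python
import math

for n in range(1, 6):
    # Gamma(n+1) agrees with n! at every whole number
    print(n, math.factorial(n), math.gamma(n + 1))

print(math.gamma(3.5 + 1))  # "3.5!" ~ 11.63, smoothly between 3! = 6 and 4! = 24
```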

Even better, this formulation is “analytic” which means it not only works for any positive real number, but (using analytic continuation) works for any complex number as well (with the exception of those poles at each negative integer where it jumps to infinity).


|Γ(N)|, where N can now take values in the complex plane.

Long story short, with that integral formulation you can connect the dots between the integer values of N (where N! makes sense) to figure out the values between (where N! doesn’t make sense).

So, here comes a pretty decent way to talk about fractional derivatives: fractional integrals.

If “f'(x)=f^{(1)}(x)” is the derivative of f, “f^{(N)}(x)” is the Nth derivative of f, and “f^{(-1)}(x)” is the anti-derivative, then by the fundamental theorem of calculus f^{(-1)}(x) = \int_0^x f(t)\,dt.  It turns out that f^{(-N)}(x)=\frac{1}{(N-1)!}\int_0^x (x-t)^{N-1}f(t)\,dt.  x-t runs over strictly positive values, so there’s no issue with non-integer powers, and it just so happens that we already have a cute way of dealing with non-integer factorials, so we may as well deal with that factorial cutely: f^{(-N)}(x)=\frac{1}{\Gamma(N)}\int_0^x (x-t)^{N-1}f(t)\,dt.
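As a sanity check (a sketch using sympy, which isn’t part of the post), you can compare that repeated-integration formula against just integrating twice the ordinary way; for f(t)=t^2 both routes give x^4/12.

```python
import sympy as sp

x, t = sp.symbols('x t', positive=True)
f = t**2

# integrate twice the ordinary way...
once = sp.integrate(f, (t, 0, x))
twice = sp.integrate(once.subs(x, t), (t, 0, x))

# ...and compare with the N = 2 case of the formula above
formula = sp.integrate((x - t)**(2 - 1) * f, (t, 0, x)) / sp.gamma(2)

print(sp.simplify(twice - formula))  # 0, so both routes give x**4/12
```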

Holy crap!  We now have a way to describe fractional integrals that works pretty generally.  Finally, and this is very round-about, but it turns out that a really good way to do half a derivative is to do half an integral and then do a full derivative of the result:

f^{\left(\frac{1}{2}\right)}(x)=\frac{d}{dx}f^{\left(-\frac{1}{2}\right)}(x)=\frac{d}{dx}\left[\frac{1}{\Gamma\left(\frac{1}{2}\right)}\int_0^x (x-t)^{-\frac{1}{2}}f(t)\,dt\right]=\frac{d}{dx}\left[\frac{1}{\sqrt{\pi}}\int_0^x \frac{1}{\sqrt{x-t}}f(t)\,dt\right]

That “root pi” is just another math thing.  If you want to do, say, a third of a derivative, then you can first find f^{(-2/3)}(x) and then differentiate that.  This isn’t the “correct” way to do fractional derivatives, just something that works while satisfying a short wishlist of properties and re-creating regular derivatives without making a big deal about it.
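And here’s a rough numerical check of the half-derivative recipe (half_integral and half_derivative are made-up names, and the substitution u=\sqrt{x-t} is only there to tame the endpoint singularity): for f(x)=x the half-derivative should come out to 2\sqrt{x/\pi}\approx1.128 at x=1, and it does.

```python
import math

def half_integral(f, x, n=20000):
    # f^(-1/2)(x) = (1/sqrt(pi)) * integral_0^x f(t)/sqrt(x-t) dt
    # substituting u = sqrt(x-t) turns this into integral_0^sqrt(x) 2*f(x-u^2) du,
    # which has no singularity, so a plain midpoint sum is good enough here
    s = math.sqrt(x)
    du = s / n
    total = sum(2 * f(x - ((i + 0.5) * du) ** 2) for i in range(n)) * du
    return total / math.sqrt(math.pi)

def half_derivative(f, x, h=1e-4):
    # half an integral followed by a full (numerical) derivative
    return (half_integral(f, x + h) - half_integral(f, x - h)) / (2 * h)

f = lambda t: t
print(half_derivative(f, 1.0))       # ~1.1284
print(2 * math.sqrt(1.0 / math.pi))  # 1.1283..., the known answer for f(x) = x
```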


Answer Gravy: You can show that f^{(-N)}(x)=\frac{1}{(N-1)!}\int_0^x (x-t)^{N-1}f(t)\,dt (or even better, f^{(-N)}(x)=\frac{1}{\Gamma(N)}\int_0^x (x-t)^{N-1}f(t)\,dt) through induction.  The base case is f^{(-1)}(x)=\frac{1}{(1-1)!}\int_0^x (x-t)^{1-1}f(t)\,dt=\int_0^x f(t)\,dt.  This is true by the fundamental theorem of calculus, which says that the anti-derivative (the “-1” derivative) is just the integral.  So… check.

To show the equation in general, you demonstrate the (N+1)th case using the Nth case.

\begin{array}{ll}  f^{(-N-1)}(x)\\[2mm]    =\int_0^x f^{(-N)}(t)\,dt \\[2mm]  = \int_0^x \frac{1}{\Gamma(N)}\int_0^t (t-u)^{N-1}f(u) \,du\,dt \\[2mm]    = \frac{1}{\Gamma(N)}\int_0^x \int_0^t (t-u)^{N-1}f(u) \,du\,dt \\[2mm]  = \frac{1}{\Gamma(N)}\int_0^x \int_u^x (t-u)^{N-1}f(u) \,dt\,du \\[2mm]    = \frac{1}{\Gamma(N)}\int_0^x f(u)\int_u^x (t-u)^{N-1} \,dt\,du \\[2mm]    = \frac{1}{\Gamma(N)}\int_0^x f(u)\left[\frac{1}{N}(t-u)^{N}\right]_u^x\,du \\[2mm]    = \frac{1}{\Gamma(N)}\int_0^x f(u)\frac{1}{N}(x-u)^{N}\,du \\[2mm]    = \frac{1}{\Gamma(N+1)}\int_0^x f(u) (x-u)^{N}\,du \\[2mm]  \end{array}

Huzzah!  Using the formula for f^{(-N)}(x) we get the formula for f^{(-N-1)}(x).

There’s a subtlety that goes by really quick between the fourth and fifth lines.  When you switch the order of integration (dudt to dtdu) it messes up the limits.  Far and away the best way to deal with this is to draw a picture.  At first, for a given value of t, we integrate u from zero to t, and then integrating t from zero to x.  When switching the order we need to make sure we’re looking at the same region.  So for a given value of u, we integrate t from u to x and then integrate u from zero to x.


Integrating over the same region in two different orders.

So that’s what happened there.


Q: Why is our Moon drifting away while Mars’ moons are falling?

The original question was: I know the Moon is getting further away because tides/friction/conservation of angular momentum.  This video claims Phobos is getting closer to Mars because of tidal forces, what gives?  Obviously no oceans to drag around but what else?


Physicist: A moon causes the material of the planet under it to distend toward it (and away, which is why there are two tides).  This is especially obvious on Earth where the water is free to move a lot more than the ground.  However that bump takes a little while to relax and, because planets turn and moons orbit, that bump is never exactly under the moon.


The tidal bulge created by a moon doesn’t stay directly under that moon, either because the planet is turning, because the moon is orbiting, or both.  This is really, really not to scale.

Because the Earth spins in the same direction that the Moon orbits, our bump leads the Moon a little.  If the Moon orbited in the opposite direction, or orbited so fast that it outpaced the Earth’s spin, then the bump would trail the Moon.

The bump itself has mass and therefore a little extra gravity.  If it leads the moon, then the moon speeds up because it’s getting a tiny, tiny extra pull in the direction it’s orbiting.  Speeding up causes things in orbit to assume higher orbits, which we often and not-quite-accurately describe as “drifting away”.  Our Moon gets a couple cm farther away every year.

On the other hand, if the tidal bump trails behind a moon, then that moon is slowed down and drops lower as a result.  Phobos’ orbital period is about 8 hours (it’s already very low), and Mars’ day (a “sol“) is about as long as ours, so the bump Phobos creates necessarily trails behind it.  As a result Phobos is slowly dropping and will eventually impact Mars.  Mars is going to have a really bad sol in about 50 million years.

But raising moons consumes a lot of energy and that energy has to come from somewhere.  The same tiny pull that the Earth applies to the Moon to speed up its orbit is applied to the Earth to slow down our day.  When the Moon formed around 4.5 billion years ago, it was about 15 times closer to the Earth (give or take) and a day was only about 6 hours long.  Back then a full moon would have provided about 200 times as much light and solar eclipses would have blacked out swaths of the Earth’s surface nearly the size of Australia.

Our Moon has more than 7 million times the mass of Phobos, so Phobos doesn’t have nearly as pronounced an impact on the spin of Mars.


The Earth and Moon as they are now and the Earth and Moon as they were when the Moon formed.

We live in a remarkably unlikely time, when the size of the Moon in the sky perfectly matches the size of the Sun.  In fact, since the Moon’s orbit around Earth is a little elliptical, the Moon is sometimes a little smaller and sometimes a little bigger.  We live on the only planet that gets to see both annular and total solar eclipses.  But see a total eclipse while you can; in a few million years we’ll be stuck with only annular eclipses.  Sucks to be you, unforeseeable future generations!
