So, you call that new number “j” (not to be confused with “j” from engineering, which is actually just “i” and presumably stands for “jamaginary number”). On the face of it, there’s nothing wrong with that; if we can make up i and work with it (to great effect), then making up j shouldn’t be terribly different. In the same way that we can write complex numbers as A+Bi, we should be able to write these new numbers as A+Bi+Cj; “trinions” as it were. However, it turns out that introducing a “j” requires us to also introduce a “k” (that also does the same thing as i and j).

Here’s why. You start by saying “i^{2} = j^{2} = -1” and then asking “ij = ?”. You begin to get a sinking feeling when you square it: (ij)^{2} = i^{2}j^{2} = (-1)(-1) = 1. This implies that ij = 1 or -1. But ij = 1 means that j = -i and ij = -1 means that j = i. There are more rigorous (confusing/complicated) ways to do this, but they ultimately boil down to “dude, we need another number”. That number is k (for “kamaginary” maybe).

So we’ve got i^{2} = j^{2} = k^{2} = -1 and ij = k. Fine. But there’s a big problem: quaternions can’t be commutative (mathematicians would call this big problem an “interesting property”, because they’re so chipper). “Commutative” means that order doesn’t matter, but for quaternions it must. Here comes a contradiction:

Firstly: (ij)^{2} = k^{2} = -1. This is basically a definition. It’s “True”.

Secondly (with commutativity): (ij)^{2} = (ij)(ij) = ijij = i^{2}j^{2} = (-1)(-1) = 1. Savvy readers will note that 1 ≠ -1. This can be fixed by declaring that ij = -ji.

Thirdly (declaring that ij = -ji): (ij)^{2} = (ij)(ij) = i(ji)j = i(-ij)j = -i^{2}j^{2} = -(-1)(-1) = -1. Fixed!

So far, this whole thing has been about why quaternions have the weird properties they do: there needs to be an i, j, *and* k, and you have to give up commutativity. Complex numbers are written “A+Bi” where i^{2} = -1. Quaternions are written “A+Bi+Cj+Dk” where i^{2} = j^{2} = k^{2} = -1, ij = k, jk = i, ki = j, and reversing any of these last three flips the sign.

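One of the most profoundly cool things about quaternions is that they have their own form of Euler’s equation. When $B^2+C^2+D^2 = 1$, $e^{(Bi+Cj+Dk)\theta} = \cos(\theta) + (Bi+Cj+Dk)\sin(\theta)$. This can be derived the same way the regular Euler equation is derived, but using the fact that $(Bi+Cj+Dk)^2 = -(B^2+C^2+D^2) = -1$.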
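As a quick numerical sanity check of the quaternion version of Euler’s equation, here is a minimal Python sketch. It assumes a quaternion is stored as a plain tuple (A, B, C, D) meaning A+Bi+Cj+Dk; the helper names `qmul` and `qexp` are made up for this illustration, and the exponential is just a truncated power series rather than anything clever.

```python
import math

def qmul(p, q):
    """Hamilton product of two quaternions stored as (A, B, C, D)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i part (uses jk = i, kj = -i)
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j part (uses ki = j, ik = -j)
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k part (uses ij = k, ji = -k)
    )

def qexp(q, terms=40):
    """exp(q) as the truncated power series 1 + q + q^2/2! + ..."""
    total = (1.0, 0.0, 0.0, 0.0)
    term = (1.0, 0.0, 0.0, 0.0)
    for n in range(1, terms):
        term = tuple(x / n for x in qmul(term, q))
        total = tuple(s + t for s, t in zip(total, term))
    return total

# A unit "imaginary" direction: B^2 + C^2 + D^2 = 1/9 + 4/9 + 4/9 = 1
u = (0.0, 1 / 3, 2 / 3, 2 / 3)
theta = 1.2

left = qexp(tuple(theta * x for x in u))
right = (math.cos(theta),) + tuple(math.sin(theta) * x for x in u[1:])
print(left)
print(right)
```

The two printed tuples should agree to within floating point error, which is the quaternion Euler equation at work for this one choice of direction and angle.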

At this point it’s entirely natural (for a mathematical masochist) to ask “alright, but what if there were *yet another* square root of -1?”. Well it turns out that the next jump is harder and requires *seven* things that square to -1. Concerned at the prospect of running out of letters, clever mathematicians usually label these e_{1}, e_{2}, e_{3}, e_{4}, e_{5}, e_{6}, e_{7}, where (e_{1})^{2} = (e_{2})^{2} = … = (e_{7})^{2} = -1. An octonion number is written “A + Be_{1} + Ce_{2} + De_{3} + Ee_{4} + Fe_{5} + Ge_{6} + He_{7}”, where each of these (capital) letters is a real number. When you make the jump to octonions you not only lose commutativity, you also lose associativity, which makes everything terrible. With octonions you can’t say that (ab)c = a(bc), which is a big loss.

Some terribly insightful old soul might now be driven to inquire “alright, but what if there were *still more* square roots of -1?”. Sure. Enter the Cayley-Dickson construction, which creates a “ladder” of as many of these number systems as your heart may ever desire, doubling in complexity every time.

Here’s the idea: you start with a number system, then you take pairs of those numbers and slap a couple of rules on them. Complex numbers are just a pair of real numbers with some algebra glued on. For example, $(A+Bi)+(C+Di) = (A+C)+(B+D)i$ and $(A+Bi)(C+Di) = (AC-BD)+(AD+BC)i$. You may as well write this $\{A,B\}+\{C,D\} = \{A+C,B+D\}$ and $\{A,B\}\{C,D\} = \{AC-BD,AD+BC\}$. In addition to addition and multiplication, complex numbers also have an operation called “complex conjugation” (denoted with a bar or asterisk) which flips the sign of the imaginary part of a complex number. For example, $\overline{A+Bi} = A-Bi$ or equivalently $\overline{\{A,B\}} = \{A,-B\}$. The same operation exists for quaternions. For example, $\overline{A+Bi+Cj+Dk} = A-Bi-Cj-Dk$.

The Cayley-Dickson construction defines numbers “higher up the ladder” as pairs of numbers from “lower down the ladder”. So a complex number, Z, is a pair of real numbers, A and B, which we can write Z=A+Bi={A,B}. A quaternion number, Z, is a pair of complex numbers, A+Bi and C+Di, which we can write Z=A+Bi+Cj+Dk=A+Bi+Cj+Dij=(A+Bi)+(C+Di)j={A+Bi,C+Di}. You’ll never guess how you can write an octonion.

Addition is handled like this $\{a,b\}+\{c,d\} = \{a+c,b+d\}$, multiplication is handled like this $\{a,b\}\{c,d\} = \{ac-\bar{d}b,\ da+b\bar{c}\}$, and conjugation is handled like this $\overline{\{a,b\}} = \{\bar{a},-b\}$. For the jump from real to complex numbers those bars (conjugates) don’t do anything, but they’re important for each of the higher number systems. With this weird looking formalism in hand you can go from real numbers to complex numbers to quaternions to octonions to sedenions and so on and on and on (if you *really* want to).
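The ladder is easy to play with in code. The sketch below is one way to do it in Python, assuming nested tuples for the pairs and the multiplication rule {a,b}{c,d} = {ac - conj(d)b, da + b conj(c)} with conjugation {a,b} → {conj(a), -b}. That is one common convention among several equivalent ones, and the function names are invented here for illustration.

```python
def neg(x):
    """Negate a real number or a nested pair."""
    if isinstance(x, tuple):
        return (neg(x[0]), neg(x[1]))
    return -x

def conj(x):
    """Conjugate: does nothing to reals, sends {a, b} to {conj(a), -b}."""
    if isinstance(x, tuple):
        a, b = x
        return (conj(a), neg(b))
    return x

def add(x, y):
    """Add two numbers from the same rung of the ladder."""
    if isinstance(x, tuple):
        return (add(x[0], y[0]), add(x[1], y[1]))
    return x + y

def mul(x, y):
    """Multiply: {a,b}{c,d} = {ac - conj(d)b, da + b conj(c)}."""
    if isinstance(x, tuple):
        a, b = x
        c, d = y
        return (add(mul(a, c), neg(mul(conj(d), b))),
                add(mul(d, a), mul(b, conj(c))))
    return x * y

# Quaternions: pairs of complex numbers, which are pairs of reals.
ZERO = ((0, 0), (0, 0))
ONE = ((1, 0), (0, 0))
I = ((0, 1), (0, 0))
J = ((0, 0), (1, 0))
K = ((0, 0), (0, 1))

print(mul(I, J) == K)         # is ij = k?
print(mul(J, I) == neg(K))    # is ji = -k?
print(mul(I, I) == neg(ONE))  # is i^2 = -1?

# Octonions: pairs of quaternions. Compare (xy)z against x(yz).
x, y, z = (I, ZERO), (J, ZERO), (ZERO, ONE)
print(mul(mul(x, y), z) == mul(x, mul(y, z)))
```

Because `mul` calls itself on the halves of each pair, the same four functions work on every rung: feed them pairs of pairs of pairs of pairs and you are multiplying sedenions. The last line is the octonion associativity check, and with this convention the two sides come out with opposite signs.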

It turns out that these higher number systems are useful. Complex numbers are ridiculously useful. Quaternions have a lot of interesting and *fairly* intuitive uses, like modeling rotations in 3 dimensions (which coincidentally is where we live) in part because they don’t have “special angles” that mess them up (e.g., the north pole is difficult to work with because it doesn’t have a definable longitude, but quaternions don’t have “north pole type problems”). While octonions are useful, they’re not useful in any easy to describe ways (when was the last time you *really* needed 8 dimensions for a problem?). Turns out they’re useful in string theory and presumably the higher number systems are useful as well. The harder mathematicians try to make mathematics that’s “pure” and free of the burden of being useful, the better they end up making our physics and computers.

The movement of Earth, as well as the Earth’s gravity, changes how much time we experience compared to other objects in the universe. If we were to occasionally compare our clocks to clocks in tight orbits around black holes or neutron stars we’d find they run slower than ours, and if we compare with clocks floating deep in the middle of nowhere we’d find that those clocks run a little faster than ours.

However, there’s no “true” time to experience; you can never experience time wrong. Time is relative which means that we can compare how time is passing for any two things, but there’s no ultimate “clock of the universe” to compare with. Your watch, no matter where you are or how you’re moving, will always read 1 second per second. That is, you’ll never see *yourself* in fast-forward or slow-motion. In that sense we can’t help but experience time correctly. Each of us may as well declare that our clock is the One True Clock, and everyone else’s is wrong.

Simply moving fast in a straight line isn’t enough to make your clock objectively run slowly compared to other clocks. If two folk run past each other they *both* see the other experiencing less time and, weirdly enough, this isn’t a paradox and they’re both “right”. There are two effects that do objectively (in a way that everyone in the universe can agree) cause clocks to run slow: the very poorly named “twin paradox” and gravity.

In spacetime the “length” (spacetime interval) of a trip is measured by a clock that makes that trip. It turns out (this is not obvious, but it can be understood) that the shortest trips are the ones that are the most circuitous. If you watch the ball drop on New Years and stay put for a year until the next ball drop, then you’ve made a pretty straight trip (in spacetime) between those two events. This path is straight, so it’s long, and your clock will read more. Instead, if you spend that year zipping around the solar system as fast as you can before coming back for the next New Years, then your path was decidedly not straight (in spacetime). This all-over-the-place path is short and your clock will read less. “The longest spacetime distance between two points is a straight line” may sound utterly insane, but it works. Long story short: if your trip involves a loop, then your clock is falling behind.

As it happens, the Earth spins on its axis and orbits the Sun and, along with the rest of the solar system and all the stars that we can see, orbits the galaxy as well. Each of these is a loop, not a straight line, and each time the Earth makes one of these circuits it falls a little behind any clock that didn’t. This is a *little* hypothetical: in order to get a clock to sit in the same place while the Earth does an orbit to meet up with it every year, it would need a big rocket (it’s not orbiting the Sun, so it should be falling into it).

As fortune would have it, you can just use the gamma function, $\gamma = \frac{1}{\sqrt{1-\frac{v^2}{c^2}}}$, to find the time dilation caused by running in a loop (for more complex paths, like those with different speeds, you still use the gamma function but you need calculus too). The velocity of the spinning Earth at the equator is about 0.5 km/s, we orbit the Sun at about 30 km/s, and the whole kit and caboodle orbits the galaxy at about 200 km/s. The difference in time experienced between people living in Longyearbyen (near the pole) and people living in Ecuador (near the equator) is about one part in a trillion, which gives those proud Norwegians an extra second every 25 thousand years. Don’t spend that second all in one place, Norwegians.

The time dilation from the biggest of these speeds, our movement around the galaxy, amounts to one part in 4.5 million. That amounts to an extra second every couple of months, or roughly an extra fifty solar years for every galactic year (a galactic year being something like 230 million solar years).
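If you’d like to check those numbers yourself, here is a small Python sketch. It assumes the round speeds quoted above and a constant speed around each loop; the function name is made up for illustration.

```python
import math

C_KM_S = 299_792.458                  # speed of light in km/s
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def gamma_minus_one(speed_km_s):
    """gamma - 1 for a clock moving at the given speed.

    gamma = 1/sqrt(1 - v^2/c^2). For everyday speeds gamma is absurdly
    close to 1, so gamma - 1 is computed in a rearranged form that avoids
    subtracting two nearly equal numbers.
    """
    beta_sq = (speed_km_s / C_KM_S) ** 2
    root = math.sqrt(1.0 - beta_sq)
    return beta_sq / (root * (1.0 + root))

speeds = [
    ("Earth's spin at the equator", 0.5),
    ("Earth's orbit around the Sun", 30.0),
    ("Sun's orbit around the galaxy", 200.0),
]

for label, speed in speeds:
    loss = gamma_minus_one(speed)
    years_per_second = 1.0 / loss / SECONDS_PER_YEAR
    print(f"{label}: one part in {1.0 / loss:.3g}, "
          f"a second lost every {years_per_second:.3g} years")
```

The spin comes out on the order of a part in a trillion and the galactic orbit at about one part in 4.5 million, matching the figures above.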

The second effect to consider is the curvature of spacetime caused by (or which is) gravity. Things that are lower experience less time than things that are higher. This can be explained (and even verified) by measuring how the frequency of light changes when it travels vertically in a gravity field. The details are terrible, but for most practical purposes (“most practical purposes” = “not black holes”) you can find the time dilation between two altitudes by figuring out how fast something would be moving if it fell from the higher to the lower and plugging that v into $\gamma = \frac{1}{\sqrt{1-\frac{v^2}{c^2}}}$.

It’s reasonable to say that if you’re infinitely far away from something then you’re outside of its gravitational influence and your clock should be running “right”. If you fell from “infinitely far away” to the surface of something big, you’d be moving at the excellently named “escape velocity” of that big something. If you try to leave a planet moving slower than the escape velocity, then eventually you’ll fall back. Excellent name.

To escape the Earth from the ground you need 11 km/s. More difficult is escaping the Sun (from Earth’s orbit) which requires 42 km/s. To leave our galaxy (from here) you need somewhere between 500 and 600 km/s. This time dilation from the Milky Way’s gravity has the biggest effect of those mentioned here.
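The same gamma recipe can be applied to these escape velocities. The Python sketch below assumes the rule of thumb described above (plug the escape velocity into gamma) and uses 550 km/s as a stand-in for the middle of the quoted galactic range; the names are invented for illustration.

```python
import math

C_KM_S = 299_792.458              # speed of light in km/s
SECONDS_PER_WEEK = 7 * 24 * 3600

def gamma_minus_one(speed_km_s):
    """gamma - 1, written to stay accurate when gamma is very close to 1."""
    beta_sq = (speed_km_s / C_KM_S) ** 2
    root = math.sqrt(1.0 - beta_sq)
    return beta_sq / (root * (1.0 + root))

def seconds_lost_per_week(escape_speed_km_s):
    """Seconds per week by which a far-away clock gets ahead of a clock sitting down in the well."""
    return gamma_minus_one(escape_speed_km_s) * SECONDS_PER_WEEK

escape_speeds = [
    ("Earth, from the ground", 11.0),
    ("Sun, from Earth's orbit", 42.0),
    ("Milky Way, from here (assumed midpoint)", 550.0),
]

for label, speed in escape_speeds:
    print(f"{label}: {seconds_lost_per_week(speed):.3g} seconds per week")
```

The galaxy’s contribution lands at roughly a second per week, hundreds of times larger than the Sun’s and thousands of times larger than the Earth’s.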

The spinning of the Earth and the orbiting of the Sun do affect the amount of time you experience, but not by a lot. Even though we’re far closer to the Earth than to the bulk of the galaxy, it’s our orbit around, and position in, our galaxy that affect our experience of time the most.

By virtue of being a member of the Milky Way, we experience about 1 second per week less than someone hanging out deep in the intergalactic void. Most of that comes from the effects of our galaxy’s gravity directly; not from the motion of our planet.

I got quite the challenge from my father in law. The problem is well defined, but I’m having difficulties finding a meaningful answer. The reason why he asked me is because I’m an engineering student and he is in the windmill industry.

Before they attach the actual mill on the concrete foundation, it has to be absolutely leveled. If not, a tall mill would be quite offset even with a very small angle. To tackle this, they use two angle gauges and measure in two directions. The angle gauges are connected and you know the angle between them, their mutual angle. I’m supposed to find a way to convert these 3 inputs (angle 1, angle 2 and mutual angle) to find 2 outputs (the steepest angle and in which direction this is relative to the gauges).

**Physicist**: This is a gorgeous question that leads through some pretty math and ends with an elegant answer. If you’ve taken a class or two that used lots of vectors, then this is a cute exploration of what you can do with surprisingly little. If you’ve never taken a class or two that used lots of vectors, then please do: it’s fun stuff. You get to draw pictures and everything.

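So you’ve got a flat slab that isn’t quite level. Two angle gauges (with plumb lines or bubbles or whatever) are placed on the slab in two directions. Define $\vec{v}_1$ and $\vec{v}_2$ as the directions of the two gauges on the ground and $\vec{z}$ as up. These may as well be unit vectors, so: they are.

Define the angle between $\vec{v}_1$ and $\vec{v}_2$ as $\theta$ and the angle between each of the levels and $\vec{z}$ as $\phi_1$ and $\phi_2$.

Measuring the angles between these vectors means that we know sine and cosine of these angles, and knowing that means that we know the dot product and magnitude of the cross product, since $\vec{a}\cdot\vec{b} = |\vec{a}||\vec{b}|\cos(\theta_{ab})$ and $|\vec{a}\times\vec{b}| = |\vec{a}||\vec{b}|\sin(\theta_{ab})$, where $\theta_{ab}$ is the angle between $\vec{a}$ and $\vec{b}$.

Finally, since the windmill will be built perpendicular to the slab, it will be built perpendicular to both $\vec{v}_1$ and $\vec{v}_2$. When a physicist (hell, even a mathematician) hears “I need a vector perpendicular to two other vectors” they convulsively respond “cross product those mothers”. If $\vec{w}$ is the windmill’s “up”, then $\vec{w} = \frac{\vec{v}_1\times\vec{v}_2}{|\vec{v}_1\times\vec{v}_2|} = \frac{\vec{v}_1\times\vec{v}_2}{\sin(\theta)}$. If you were standing on the slab where the tails of $\vec{v}_1$ and $\vec{v}_2$ meet, then $\vec{v}_1$ would be on the right and $\vec{v}_2$ would be on the left (that’s the right hand rule).

If we project $\vec{z}$ onto the $\vec{v}_1$, $\vec{v}_2$ plane, the result will be pointing in the direction opposite the direction of the windmill’s lean. Define this projection as $\vec{p}$. The direction of $\vec{p}$ is the direction that the windmill needs to be “leaned” so that it will stand straight.

The questions (way back at the top of this page) now boil down to:

1) What is the angle between $\vec{z}$ and $\vec{w}$?

2) What is the angle between $\vec{p}$, and $\vec{v}_1$ and $\vec{v}_2$?

For #1, it turns out that the cross product is easier to work with. Define $\alpha$ as the angle between $\vec{z}$ and $\vec{w}$. Using the vector triple product, $\vec{z}\times(\vec{v}_1\times\vec{v}_2) = \vec{v}_1(\vec{z}\cdot\vec{v}_2) - \vec{v}_2(\vec{z}\cdot\vec{v}_1)$, so

$|\vec{z}\times\vec{w}|^2 = \frac{|\vec{v}_1\cos(\phi_2) - \vec{v}_2\cos(\phi_1)|^2}{\sin^2(\theta)} = \frac{\cos^2(\phi_1)+\cos^2(\phi_2)-2\cos(\phi_1)\cos(\phi_2)\cos(\theta)}{\sin^2(\theta)}$

We also know that:

$|\vec{z}\times\vec{w}| = |\vec{z}||\vec{w}|\sin(\alpha) = \sin(\alpha)$

And therefore:

$\sin(\alpha) = \frac{\sqrt{\cos^2(\phi_1)+\cos^2(\phi_2)-2\cos(\phi_1)\cos(\phi_2)\cos(\theta)}}{\sin(\theta)}$

In the event that $\theta = 90^\circ$ (and honestly, why wouldn’t you want your gauges perpendicular?), then this simplifies a lot:

$\sin(\alpha) = \sqrt{\cos^2(\phi_1)+\cos^2(\phi_2)}$

For #2 we find the projection, $\vec{p}$, and dot it with $\vec{v}_1$ and $\vec{v}_2$. The projection onto the slab is $\vec{p} = \vec{z} - (\vec{z}\cdot\vec{w})\vec{w}$. That is; it’s the up direction minus whatever component points in the direction of the windmill.

$\vec{p}\cdot\vec{v}_1 = \vec{z}\cdot\vec{v}_1 - (\vec{z}\cdot\vec{w})(\vec{w}\cdot\vec{v}_1) = \vec{z}\cdot\vec{v}_1 = \cos(\phi_1)$

In that last step you know that $\vec{w}\cdot\vec{v}_1 = 0$ since the tower, $\vec{w}$, and any direction on the slab it’s on, like $\vec{v}_1$, are perpendicular. The length of the projection is $|\vec{p}| = \sqrt{1-(\vec{z}\cdot\vec{w})^2} = \sqrt{1-\cos^2(\alpha)} = \sin(\alpha)$.

Defining $\beta_1$ as the angle between the projection and $\vec{v}_1$,

$\cos(\beta_1) = \frac{\vec{p}\cdot\vec{v}_1}{|\vec{p}|} = \frac{\cos(\phi_1)}{\sin(\alpha)} = \frac{\cos(\phi_1)\sin(\theta)}{\sqrt{\cos^2(\phi_1)+\cos^2(\phi_2)-2\cos(\phi_1)\cos(\phi_2)\cos(\theta)}}$

Again, in the event that $\theta = 90^\circ$, this simplifies:

$\cos(\beta_1) = \frac{\cos(\phi_1)}{\sqrt{\cos^2(\phi_1)+\cos^2(\phi_2)}}$

Similarly, $\cos(\beta_2) = \frac{\cos(\phi_2)\sin(\theta)}{\sqrt{\cos^2(\phi_1)+\cos^2(\phi_2)-2\cos(\phi_1)\cos(\phi_2)\cos(\theta)}}$.

So, if you’ve got the inclinometer readings, $\phi_1$ and $\phi_2$, and the mutual angle, $\theta$, then you can find the lean of the tower, $\alpha$, and the direction you should push it so that it doesn’t lean, $\beta_1$ and $\beta_2$ from $\vec{v}_1$ and $\vec{v}_2$ respectively. This is a beautiful example of math leading to a cute, relatively simple solution that you probably couldn’t guess.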
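For anyone who would rather punch numbers into a computer than a calculator, here is a minimal Python sketch of the result. It assumes each gauge reading is given as the angle between that gauge’s direction and straight up (so a perfectly level gauge reads 90 degrees), that the mutual angle is strictly between 0 and 180 degrees, and that the slab is nowhere near vertical. The function name and the “level” tolerance are made up for illustration.

```python
import math

def slab_lean(gauge1_deg, gauge2_deg, mutual_deg):
    """Return (lean, direction1, direction2), all in degrees.

    lean       : angle between the tower and true vertical
    direction1 : angle between the "push it this way" direction and gauge 1
    direction2 : the same, measured from gauge 2
    """
    c1 = math.cos(math.radians(gauge1_deg))
    c2 = math.cos(math.radians(gauge2_deg))
    mutual = math.radians(mutual_deg)

    # sin(lean) = sqrt(c1^2 + c2^2 - 2 c1 c2 cos(mutual)) / sin(mutual)
    radicand = c1 * c1 + c2 * c2 - 2.0 * c1 * c2 * math.cos(mutual)
    sin_lean = math.sqrt(max(0.0, radicand)) / math.sin(mutual)

    if sin_lean < 1e-12:
        # The slab is level as far as floating point can tell,
        # so there is no meaningful direction to report.
        return 0.0, None, None

    def clamp(x):
        return max(-1.0, min(1.0, x))

    lean = math.degrees(math.asin(clamp(sin_lean)))
    direction1 = math.degrees(math.acos(clamp(c1 / sin_lean)))
    direction2 = math.degrees(math.acos(clamp(c2 / sin_lean)))
    return lean, direction1, direction2

# Gauge 1 tilts up by one degree, gauge 2 is level, gauges at right angles.
print(slab_lean(89.0, 90.0, 90.0))
```

In that example the lean comes out at about one degree, pointed along the first gauge and at a right angle to the second, which is what you would expect for a slab tilted purely along gauge 1.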


**Physicist**: There are absolutely different degrees of entanglement!

The kind you usually hear about is “maximally entangled states”, but basically everything is a little entangled. Not because of the big bang, but because every-day interactions generate and break a little entanglement all the time. Entanglement has a lot in common with correlation: if you know something about one thing, you’ll know something about the things it’s correlated with.

Correlations crop up all the time when things interact. For example, if you leave your car in a parking lot and come back to find a dent with a little red paint in it, then you know that somewhere nearby is a red car with another dent. The random things about your dent (the height above the ground, the severity, etc.) will be similar to those properties of the corresponding dent on the other car. You and a damnable ne’er-do-well have correlated cars because looking at the dent on one tells you something about the dent on the other; not because they have a spooky cosmic connection, but because they physically ran into each other. Entanglement is a little more subtle (what with all the quantum mechanics), but not a hell of a lot more subtle. Nothing fancy.

Just to be over-precise, when we say that things are entangled what we really mean is that some of their *properties* are entangled. For example, the polarization of two photons might be entangled while their positions are not, or vice versa.

The homogeneity of the universe (the “more-or-less-the-same-everywhere-ness” of the universe) is often cited as evidence that all the matter in the very early universe briefly had a chance to mix around, but that doesn’t have too much of an impact on entanglement. There’s something called “monogamy of entanglement” that says that maximally entangled qubits only appear in pairs, and maximally entangled states are the ones that really do interesting things. This can be generalized a bit to say “the more entangled two things are, the less they’re entangled with anything else”. Unfortunately, in order for such a pair to persist until today it would need to be left almost entirely unharassed by everything else for billions of years. However, if the universe is anything, it’s old and messy. The entanglement we (people) create on purpose requires careful isolation and control of the stuff in question.

Even *worse*, if you have access to only one entangled particle, there’s no way to tell that it’s entangled. All of the fancy effects you hear about entanglement *always* require both, or at least most, of the entangled particles.

So you (every bit of you) can be entangled with other stuff in the universe (you kinda have to be). Entanglement is generated and broken by interactions, so you’re more entangled with stuff that’s nearby (in an astronomical sense). But most importantly, it doesn’t matter; random atomic-scale correlations are a lot like random atomic-scale noise.

Even less exciting, if you (personally) are the thing that’s entangled, your experience is entirely ordinary; the thing you’re entangled with will always be in a single state (from your point of view). All of the fancy experiments we do with entangled particles always involve particles being entangled with each other, because when they become entangled with the person doing the experiment it looks like “wave function collapse” (suddenly it appears to be in only one state) and that’s boring. Similarly, if you and a distant alien are entangled it does not mean you have a spooky connection (groovy, spiritual, or otherwise), it means that they will already be in a single state (from your mutual points of view) before you ever meet each other.

Which is exactly the sort of thing you’d never notice.

The Enigma machine used a “rolling substitution cypher” which means that it was essentially a (much more) complicated version of “A=1, B=2, C=3, …”. The problem with substitution cyphers is that if parts of several messages are the same then you can compare their similarities to break the code. Enigma was broken in part because of German formality (most messages started with the same formal greeting). Even worse, since some letters are more common than others (e.g., “e” is far more common than “g”) you can make progress by just counting up how often letters show up in the code (or even get an idea of what language the code is written in without breaking it!). Substitution cyphers are so easy to break that some folk do it for fun. *Rolling* substitution cyphers can use a set of several encoding schemes and cycle through which code is used or make the scheme dependent on the previous letter, but this merely makes the code breaking more difficult. Ultimately, all substitution cyphers suffer from the same difficulty: similar messages produce similar looking codes.
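To see the letter-counting weakness in action, here is a small Python sketch using the simplest possible (non-rolling) substitution cypher. The key and the sample message are made up for illustration; this is nothing like the real Enigma wiring.

```python
from collections import Counter
import string

# A made-up key: 'a' is replaced by 'q', 'b' by 'w', 'c' by 'e', and so on.
KEY = "qwertyuiopasdfghjklzxcvbnm"

def substitution_encrypt(plaintext, key=KEY):
    """Swap each letter for the letter in the same position of the key."""
    table = str.maketrans(string.ascii_lowercase, key)
    return plaintext.lower().translate(table)

def letter_counts(text):
    """Count how often each letter appears, ignoring spaces and punctuation."""
    return Counter(ch for ch in text if ch.isalpha())

message = "meet me at the seven trees near the fence"
code = substitution_encrypt(message)

print(code)
print(letter_counts(message).most_common(3))
print(letter_counts(code).most_common(3))
```

The counts are identical before and after; only the labels change. Whatever letter is most common in the coded message is a very good guess for “e”, and that is the crack that lets the rest of the code be pried open.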

Modern cryptography doesn’t have that problem. If any part of a message is different at all, then the entire resulting code is completely different from beginning to end. That is; if you encrypted a message, you’d get cyphertext (the encoded message) and if you were to encrypt the exact same message but misspelled a single word, then the cyphertext would be *completely* different.

If your messages were “Hello A”, “Hello B”, and “Hello C”, then a substitution cypher might produce “Tjvvw L”, “Tjvvw C”, and “Tjvvw S” while RSA (the most common modern encryption) might produce “idkrn7shd”, “62hmcpgue”, and “nchhd8pdq”. In the first case you can tell that the messages are nearly the same, but in the second you got nothing.
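Here is a toy Python sketch of that second behavior. It is “textbook” RSA with no padding and laughably small primes (two Mersenne primes picked purely for convenience), so it is an illustration of the idea and nothing you should ever protect a secret with. All names and key values are assumptions made up for this example.

```python
# Toy key material: real RSA keys use far larger, randomly chosen primes.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)

def encrypt(message):
    """Turn a short string into a number m, then return m^E mod N."""
    m = int.from_bytes(message.encode("utf-8"), "big")
    if m >= N:
        raise ValueError("message too long for this toy key")
    return pow(m, E, N)

def decrypt(cyphertext):
    """Undo encrypt(): compute c^D mod N and turn the number back into text."""
    m = pow(cyphertext, D, N)
    return m.to_bytes((m.bit_length() + 7) // 8, "big").decode("utf-8")

for message in ["Hello A", "Hello B", "Hello C"]:
    code = encrypt(message)
    print(message, "->", format(code, "x"), "->", decrypt(code))
```

The three messages differ only in their last letter, but the printed cyphertexts should share no obvious pattern, and each one still decrypts back to exactly what went in.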

Enigma was very clever but is shockingly primitive compared to modern crypto techniques. If anyone in WW2 had been using modern (1970’s or later) encryption, then there is no way that anyone would have been able to break those codes (and Turing would have to settle for being famous for everything *else* he did).

There’s a post here that talks about the main ideas behind RSA encryption. The really fancy stuff is some of the only math that isn’t publicly known. Scientists have a whole thing about openness and the free exchange of information that governments and corporate entities (for whatever reason) don’t.

The quickest way to see why is to imagine a slightly different way of drawing straws. Instead of drawing straws, draw cards where all but one are black (for example). *Everyone* takes a card and afterward everyone turns their card over; the one red card is the “short straw”. In this case it should make sense that no person is more or less likely to get the red card for the same reason that it’s no more or less likely for any particular card to be any particular place in a deck. The fact that when drawing straws we pull one at a time and generally stop halfway through (whenever the short straw appears) makes it feel like the situation is different, but it’s not.

Say you’ve got N peeps (people). The first person to draw a straw is the least likely to draw the short one (1/N) and the last to draw is the most likely (1/2). However! While the later people are more likely to draw the short straw, they’re also less likely to pull *any* straw since it’s more likely that the short straw has already been drawn. In movies they almost always draw every straw because of drama, but in practice, you draw until the short one shows up and then you stop.

The early people are least likely to draw the short straw while the later people are least likely to draw at all. If you write down the math you find that the effects balance out exactly. So here’s the math written down:

You’ve got N peeps named One, Two, Three, etc. (probably siblings).

The first person has N straws to choose from and their probability of getting the short one is $\frac{1}{N}$. Easy enough. The second person has N-1 straws to choose from, so you might expect that their chance of drawing the short straw is $\frac{1}{N-1}$. But that’s not the probability that counts; that’s only the probability of drawing the short straw *given* that it hasn’t been drawn already. What counts is the probability that it hasn’t been drawn already *and* that the second person then draws it. That probability is $\frac{N-1}{N}\cdot\frac{1}{N-1} = \frac{1}{N}$, where $\frac{N-1}{N}$ is the probability that the first person did not already draw the short straw.

By the time it’s the Jth person’s turn there are N-J+1 straws remaining. The probability that the short straw is among them (the probability that it hasn’t been drawn already) is $\frac{N-J+1}{N}$. And if it hasn’t been drawn, then the probability of drawing it is $\frac{1}{N-J+1}$. So, all in all, the probability of the Jth person drawing the short straw is $\frac{N-J+1}{N}\cdot\frac{1}{N-J+1} = \frac{1}{N}$.
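If you don’t trust the algebra, you can have a computer grind through it turn by turn with exact fractions. This Python sketch assumes N people, N straws, and that drawing stops as soon as the short straw shows up; the function name is made up for illustration.

```python
from fractions import Fraction

def short_straw_probability(n_people, j):
    """Exact probability that the j-th person to draw gets the short straw."""
    # Probability that none of the first j-1 people drew it.
    p_not_drawn_yet = Fraction(1)
    for turn in range(1, j):
        straws_left = n_people - turn + 1
        p_not_drawn_yet *= Fraction(straws_left - 1, straws_left)

    # Probability that person j picks it out of what remains.
    straws_left = n_people - j + 1
    return p_not_drawn_yet * Fraction(1, straws_left)

N = 7
for j in range(1, N + 1):
    print(f"person {j}: {short_straw_probability(N, j)}")
```

Every line of output is the same fraction, 1/N, no matter where in line you stand.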

Finally, the last person to draw is the person who cut the straws. This person’s choice is random because everyone else’s choices were random: knowing which straw is which doesn’t change that.
