Q: Quaternions and Octonions: what?

Physicist: The most straightforward way to stumble across quaternions is to sit around thinking about complex numbers, where we have “i” which is the square root of -1 and stands for “imaginary number”.  If you have i, then you have two square roots of -1: i and -i (all square roots are positive/negative pairs).  To get to quaternions you just need to ask “alright, but what if there were another square root of -1?”.

So, you call that new number “j” (not to be confused with “j” from engineering, which is actually just “i” and presumably stands for “jamaginary number”).  On the face of it, there’s nothing wrong with that; if we can make up i and work with it (to great effect), then making up j shouldn’t be terribly different.  In the same way that we can write complex numbers as A+Bi, we should be able to write these new numbers as A+Bi+Cj; “trinions” as it were.  However, it turns out that introducing a “j” requires us to also introduce a “k” (that also does the same thing as i and j).

Here’s why.  You start by saying “i^2 = j^2 = -1” and then asking “ij = ?”.  You begin to get a sinking feeling when you square it: (ij)^2 = i^2j^2 = (-1)(-1) = 1.  This implies that ij = 1 or -1.  But ij = 1 means that j = -i and ij = -1 means that j = i.  There are more rigorous (confusing/complicated) ways to do this, but they ultimately boil down to “dude, we need another number”.  That number is k (for “kamaginary” maybe).

So we’ve got i^2 = j^2 = k^2 = -1 and ij = k.  Fine.  But there’s a big problem: quaternions can’t be commutative (mathematicians would call this big problem an “interesting property”, because they’re so chipper).  “Commutative” means that order doesn’t matter, but for quaternions it must.  Here comes a contradiction:

Firstly: (ij)^2 = k^2 = -1.  This is basically a definition.  It’s “True”.

Secondly (with commutativity): (ij)^2 = (ij)(ij) = ijij = i^2j^2 = (-1)(-1) = 1.  Savvy readers will note that 1 ≠ -1.  This can be fixed by declaring that ij = -ji.

Thirdly (declaring that ij = -ji): (ij)^2 = (ij)(ij) = i(ji)j = i(-ij)j = -i^2j^2 = -(-1)(-1) = -1.  Fixed!

So far, this whole thing has been about why quaternions have the weird properties they do: there needs to be an i, j, and k, and you have to give up commutativity.  Complex numbers are written “A+Bi” where i^2 = -1.  Quaternions are written “A+Bi+Cj+Dk” where i^2 = j^2 = k^2 = -1, ij = k, jk = i, ki = j, and reversing any of these last three flips the sign.
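To see these rules in action, here is a minimal Python sketch.  It assumes a quaternion A+Bi+Cj+Dk is stored as the plain tuple (A, B, C, D); the names qmul, neg, ONE, I, J, and K are made up for this illustration and don’t come from any library.

```python
def qmul(p, q):
    """Multiply two quaternions stored as (A, B, C, D) = A + Bi + Cj + Dk.

    Every term comes from expanding the product and applying
    i^2 = j^2 = k^2 = -1, ij = k, jk = i, ki = j (reversed order flips the sign).
    """
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1*a2 - b1*b2 - c1*c2 - d1*d2,  # real part: the three squares contribute -1 each
        a1*b2 + b1*a2 + c1*d2 - d1*c2,  # i part: jk = i, kj = -i
        a1*c2 - b1*d2 + c1*a2 + d1*b2,  # j part: ki = j, ik = -j
        a1*d2 + b1*c2 - c1*b2 + d1*a2,  # k part: ij = k, ji = -k
    )


def neg(q):
    """Flip the sign of every component."""
    return tuple(-x for x in q)


# The four basis elements (illustrative names).
ONE = (1, 0, 0, 0)
I = (0, 1, 0, 0)
J = (0, 0, 1, 0)
K = (0, 0, 0, 1)

print(qmul(I, J))  # (0, 0, 0, 1), which is k
print(qmul(J, I))  # (0, 0, 0, -1), which is -k
```

Swapping the order of the two inputs is all it takes to flip the sign, which is the non-commutativity described above.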

One of the most profoundly cool things about quaternions is that they have their own form of Euler’s equation.  When B^2+C^2+D^2=1, e^{\theta(Bi+Cj+Dk)} = \cos(\theta)+(Bi+Cj+Dk)\sin(\theta).  This can be derived the same way the regular Euler equation is derived, but using the fact that (Bi+Cj+Dk)^2=-(B^2+C^2+D^2).
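For anyone who wants to see that derivation, here is a sketch.  The shorthand u for the imaginary part is introduced only for this illustration.  The cross terms in u^2 vanish because ij = -ji, ik = -ki, and jk = -kj, and once u^2 = -1 the power series splits into even and odd powers exactly as it does for the ordinary i.

```latex
\begin{aligned}
u &= Bi + Cj + Dk, \qquad B^2 + C^2 + D^2 = 1 \\
u^2 &= B^2 i^2 + C^2 j^2 + D^2 k^2 + BC\,(ij + ji) + BD\,(ik + ki) + CD\,(jk + kj) \\
    &= -(B^2 + C^2 + D^2) = -1 \\
e^{\theta u} &= \sum_{n=0}^{\infty} \frac{\theta^n u^n}{n!}
  = \left(1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \cdots\right)
  + u\left(\theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \cdots\right) \\
  &= \cos(\theta) + u\,\sin(\theta)
\end{aligned}
```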

At this point it’s entirely natural (for a mathematical masochist) to ask “alright, but what if there were yet another square root of -1?”.  Well it turns out that the next jump is harder and requires seven things that square to -1.  Concerned at the prospect of running out of letters, clever mathematicians usually label these e_1, e_2, e_3, e_4, e_5, e_6, e_7, where (e_1)^2 = (e_2)^2 = … = (e_7)^2 = -1.  An octonion number is written “A + Be_1 + Ce_2 + De_3 + Ee_4 + Fe_5 + Ge_6 + He_7”, where each of these (capital) letters is a real number.  When you make the jump to octonions you lose not only commutativity but also associativity, which makes everything terrible.  With octonions you can’t say that (ab)c = a(bc), which is a big loss.

Some terribly insightful old soul might now be driven to inquire “alright, but what if there were still more square roots of -1?”.  Sure.  Enter the Cayley-Dickson construction to create a “ladder” of as many of these number systems as your heart may ever desire, doubling in complexity every time.

Here’s the idea: you start with a number system, then you take pairs of those numbers and slap a couple of rules on them.  Complex numbers are just a pair of real numbers with some algebra glued on.  For example, (A+Bi) + (C+Di) = (A+C) + (B+D)i and (A+Bi)\times(C+Di) = AC+ADi+BCi+BDi^2 = (AC-BD) + (AD+BC)i.  You may as well write this \{A,B\}+\{C,D\} = \{A+C,B+D\} and \{A,B\}\times\{C,D\} = \{AC-BD,AD+BC\}.  In addition to addition and multiplication, complex numbers also have an operation called “complex conjugation” (denoted with a bar or asterisk) which flips the sign of the imaginary part of a complex number.  For example, \overline{3+2i} = 3-2i or equivalently \overline{\{3,2\}} = \{3,-2\}.  The same operation exists for quaternions.  For example, \overline{2+i-3j+7k} = 2-i+3j-7k.
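As a quick sanity check with actual numbers (the values 1, 2, 3, 4 are picked arbitrarily for illustration), multiplying the usual way and multiplying as pairs give the same answer:

```latex
\begin{aligned}
(1+2i)\times(3+4i) &= 1\cdot 3 + 1\cdot 4i + 2i\cdot 3 + 2i\cdot 4i \\
 &= 3 + 4i + 6i + 8i^2 = (3-8) + (4+6)i = -5 + 10i \\
\{1,2\}\times\{3,4\} &= \{1\cdot 3 - 2\cdot 4,\; 1\cdot 4 + 2\cdot 3\} = \{-5,\,10\} \\
\overline{\{1,2\}} &= \{1,-2\}
\end{aligned}
```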

The Cayley-Dickson construction defines numbers “higher up the ladder” as pairs of numbers from “lower down the ladder”.  So a complex number, Z, is a pair of real numbers, A and B, which we can write Z=A+Bi={A,B}.  A quaternion number, Z, is a pair of complex numbers, A+Bi and C+Di, which we can write Z=A+Bi+Cj+Dk=A+Bi+Cj+Dij=(A+Bi)+(C+Di)j={A+Bi,C+Di}.  You’ll never guess how you can write an octonion.

Addition is handled like this \{A,B\}+\{C,D\} = \{A+C,B+D\}, multiplication is handled like this \{A,B\}\times\{C,D\} = \{AC-\overline{D}B,DA+B\overline{C}\}, and conjugation is handled like this \overline{\{A,B\}} = \{\overline{A},-B\}.  For the jump from real to complex numbers those bars (conjugates) don’t do anything, but they’re important for each of the higher number systems.  With this weird looking formalism in hand you can go from real numbers to complex numbers to quaternions to octonions to sedenions and so on and on and on (if you really want to).
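These three rules are short enough to try out directly.  What follows is a rough Python sketch, not a polished library: it assumes a number is either a plain real or a nested 2-tuple of numbers from one rung down, and the helper names (neg, conj, add, mul, nest, unit) are invented for this example.  It climbs from the reals up to the octonions and checks what gets lost along the way.

```python
def neg(x):
    """Negate a real number or a nested pair."""
    if not isinstance(x, tuple):
        return -x
    return (neg(x[0]), neg(x[1]))


def conj(x):
    """Conjugation: reals are unchanged; {A, B} becomes {conj(A), -B}."""
    if not isinstance(x, tuple):
        return x
    return (conj(x[0]), neg(x[1]))


def add(x, y):
    """{A, B} + {C, D} = {A + C, B + D}."""
    if not isinstance(x, tuple):
        return x + y
    return (add(x[0], y[0]), add(x[1], y[1]))


def mul(x, y):
    """{A, B} x {C, D} = {AC - conj(D)B, DA + B conj(C)}."""
    if not isinstance(x, tuple):
        return x * y
    a, b = x
    c, d = y
    return (add(mul(a, c), neg(mul(conj(d), b))),
            add(mul(d, a), mul(b, conj(c))))


def nest(coeffs):
    """Turn a flat list of 2**n real coefficients into nested pairs."""
    if len(coeffs) == 1:
        return coeffs[0]
    half = len(coeffs) // 2
    return (nest(coeffs[:half]), nest(coeffs[half:]))


def unit(dim, index):
    """Basis element number `index` of the dim-dimensional system."""
    return nest([1 if m == index else 0 for m in range(dim)])


# Complex numbers: {1, 2} x {3, 4} is (1+2i)(3+4i) = -5+10i.
print(mul((1, 2), (3, 4)))

# Quaternions (pairs of pairs): order matters.
i, j, k = unit(4, 1), unit(4, 2), unit(4, 3)
print(mul(i, j) == k, mul(j, i) == neg(k))

# Octonions (pairs of pairs of pairs): even the grouping matters.
e1, e2, e4 = unit(8, 1), unit(8, 2), unit(8, 4)
print(mul(mul(e1, e2), e4) == mul(e1, mul(e2, e4)))
```

The last comparison comes out False: for these three units the two groupings differ by a sign, which is the loss of associativity in miniature.  Calling unit(16, …) would give sedenions with no extra code.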

It turns out that these higher number systems are useful.  Complex numbers are ridiculously useful.  Quaternions have a lot of interesting and fairly intuitive uses, like modeling rotations in 3 dimensions (which coincidentally is where we live) in part because they don’t have “special angles” that mess them up (e.g., the north pole is difficult to work with because it doesn’t have a definable longitude, but quaternions don’t have “north pole type problems”).  While octonions are useful, they’re not useful in any easy to describe ways (when was the last time you really needed 8 dimensions for a problem?).  Turns out they’re useful in string theory and presumably the higher number systems are useful as well.  The harder mathematicians try to make mathematics that’s “pure” and free of the burden of being useful, the better they end up making our physics and computers.
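For the curious, here is a small Python sketch of the rotation trick.  The post doesn’t spell out the recipe, so this uses the standard construction as an assumption: build a unit quaternion q from the rotation axis and half the angle, treat the vector v as a quaternion with zero real part, and compute q v q^{-1}.  The function names qmul and rotate are made up for this illustration.

```python
import math


def qmul(p, q):
    """Multiply two quaternions stored as (A, B, C, D) = A + Bi + Cj + Dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1*a2 - b1*b2 - c1*c2 - d1*d2,
        a1*b2 + b1*a2 + c1*d2 - d1*c2,
        a1*c2 - b1*d2 + c1*a2 + d1*b2,
        a1*d2 + b1*c2 - c1*b2 + d1*a2,
    )


def rotate(v, axis, angle):
    """Rotate the 3-vector v about a unit-length axis by `angle` radians.

    Assumes `axis` already has length 1; no normalization is done here.
    """
    half = angle / 2.0
    s = math.sin(half)
    q = (math.cos(half), axis[0] * s, axis[1] * s, axis[2] * s)
    # For a unit quaternion the inverse is just the conjugate.
    q_inv = (q[0], -q[1], -q[2], -q[3])
    # Embed v as a quaternion with zero real part, then sandwich it.
    _, x, y, z = qmul(qmul(q, (0.0, v[0], v[1], v[2])), q_inv)
    return (x, y, z)


# A quarter turn about the z-axis should carry the x-axis onto the
# y-axis, up to floating-point rounding.
print(rotate((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), math.pi / 2))
```

Any unit axis and any angle go through the same formula, with no special cases for particular directions.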


10 Responses to Q: Quaternions and Octonions: what?

  1. Hilbert spaces can only cope with members of division rings. Only three suitable division rings exist. Real numbers, complex numbers and quaternions. Thus Hilbert spaces cannot handle octonions. Also biquaternions do not fit!
    See: “Division algebras and quantum theory” by John Baez. http://arxiv.org/abs/1101.5690
    Quaternions can act as rotators of other quaternions.
    For example in general γ = αβ/α ≠ β
    (The imaginary part of β, which is perpendicular to the imaginary part of α will be rotated in a direction that is perpendicular to both imaginary parts)
    Due to the four dimensions of quaternions, quaternionic number systems exist in 16 versions that only differ in their discrete symmetry set.
    Special quaternions can shift anisotropy of other quaternions to another dimension.
    The special quaternions play the role of α in the above formula and have the same size of their real part as the size of their imaginary part is.
    Quarks are anisotropic and their color charge identifies the corresponding dimension. Gluons can switch color charge of quarks. This can be represented by the mentioned special quaternions.

  2. Elaine Puricelli says:

    Hello,
    Interested to know if any change in a previous answer may be developing.
    My original question was regarding equating the photons with mass bearing particles.
    I viewed a visual aid posted by the physicist, I think, wherein it displayed a brick, yes a plain red brick, and rays of sunlight. Talk about your apples and oranges!
    So…I’m told that not even on the surface of the Sun (apparently not hot enough for
    this question) does a photon bear mass. But what about a black hole’s environment?
    The black hole itself is so “massive” that light cannot escape or perhaps it simply
    gets roughed up and bends a bit. Or,…does the photon become a mass bearing particle in this circumstance? Perhaps only in this circumstance?

  3. Other hardly known aspects about quaternions concern quaternionic differential calculus.

    A quaternion q is a combination of a real number q₀ and a 3D vector Q.
    This vector forms the imaginary part of the quaternion.
    q = q₀+ Q; q* = q₀– Q; (p q)* = q* p*
    p q = q₀ p₀ – (Q,P) + q₀ P + p₀ Q±Q×P
    The ± sign reflects the choice between right handed and left handed quaternions.

    The quaternionic nabla is defined as:
    ∇={∂/∂τ, ∂/∂x, ∂/∂y, ∂/∂z}

    ∇ψ = ∇₀ ψ₀ – (▽,Ψ) + ∇₀ Ψ + ▽ ψ₀ ± ▽ × Ψ
    In quaternionic physics, the most important equation is what I call the coupling equation. It is a special habit of quaternionic functions that can be normalized and can be differentiated.

    Free elementary particles and their free composites obey this coupling equation:
    Φ = ∇ψ = m φ
    ||ψ||² =∫|ψ|² dV= 1
    ||φ||² =∫|φ|²dV= 1
    ||Φ||² =∫|Φ|² dV= ∫|∇ψ|² dV= m²

    Φ = ∇ψ represents a differential continuity equation

    Φ, ψ and φ are differentiable quaternionic functions (DQF’s).
    m is the coupling factor.
    m also represents the total energy of the ψ field.

    Like φ it is a quaternionic field, which means that it is a combination of a scalar function and a 3D vector function. ψ represents a density distribution.

    DQF’s exist in 16 different discrete symmetry sets (symmetry flavors).
    If you want to comprehend electrons or other elementary particles, then you must try to comprehend DQF’s.
    For elementary particles ψ and φ in the coupling equation only differ in their discrete symmetries.

    In quaternionic format the Dirac equation for the free electron runs ∇ψ = m ψ*
    The Dirac equation for the free positron runs ∇*ψ* = m ψ

    Φ in Φ = ∇ψ does not change by a (gauge) transformation ψ→ ψ+∇*φ, where ∇∇*φ = 0
    In general ∇∇*ψ = ρ, where ρ is the combination of a real location density distribution and an imaginary displacement density distribution. This is the quaternionic in-homogeneous wave equation.
    ∇ψ = m₁ φ and ∇*φ = m₂ χ ⇒ ∇*∇ψ = m₁ m₂χ=ρ
    χ is a normalized density distribution.

    Be careful. In the above equations the real part of quaternions that represent space and progression does not represent our common notion of time. A small progression step represents a proper time step. A small coordinate time step is a mixture of a pure progression step and a pure space step. It is the 4D vector sum of these steps.

    Maxwell equations use coordinate time rather than proper time. Further in Maxwell equations the terms of the differential equation have obtained special names.

    Einstein used the model enforced by the Maxwell equations. It delivers a different wave equation. This results in a spacetime model with a Minkowski signature.

    Quaternionic differential calculus enforces an Euclidean space progression model.

    Quaternions fit directly with Hilbert spaces. Spacetime data must first be dismantled into real numbers before they can be used as eigenvalues in Hilbert spaces.

  4. Happyapy says:

    I’m not an expert, by any means, but the algebra of real valued octonions is a normed division algebra, and hence, a division ring. Therefore, shouldn’t octonions be valid to use with Hilbert spaces?

  5. Lucas says:

    A better title might have been “Quaternions and Octonions: Why?”

    Also, why did the Physicist get stuck with this one?

  6. Travis says:

    I understand the quaternions, and have no objection to their definition or their noncommutativity, but it seems to me that’s a choice, rather than forced.

    It’s clear that we need to introduce a new $k$, but you immediately jump from $ij=k$ to $k^2=-1$. If one were to assume commutativity, then $k^2 = (ij)^2 = i^2 j^2 = (-1)(-1)$. It seems to me that everything else could follow and you’d have a commutative group with 4 square roots of -1 (counting negatives) and 4 square roots of 1.

    I can’t find a contradiction, does this violate associativity?

  7. Antonio Carlos Mota says:

    The quaternions are the substratum of the special theory of relativity, since they make it possible to connect space and time into a 4-dimensional spacetime continuum. The curvatures of space given by quaternions in hyperbolic manifolds then yield rotations (opposed spins) in the 4-dimensional spacetime continuum. The property of noncommutativity is fundamental to joining space and time. The octonions appear as the matter deforming the space, but making it symmetric provides the connection with time. Then the spacetime curvatures appear in manifolds of higher dimensions, where in 8 dimensions the symmetry of spacetime is complete. The group of octonions is 16; for the strings, the quaternions are 8.

  8. antonio carlos motta says:

    The quaternions are the substratum of STR, containing in essence the spacetime, which is rotational in 3 dimensions plus the time dimension, and which calculates the metric of spacetime, or hyperbolic structures in 4 dimensions. Then spacetime is curved in Minkowskian structures. The noncommutative property joins space and time into the spacetime continuum; it is the PT symmetry that leads to relative motion. I think that PT symmetry breaking, which generates the constancy of the speed of light as measured by rotational invariance and Lorentz invariance, shows that there is a Planck scale for spacetime.


  9. NXTangl says:

    One more thing about the quaternion is that it’s actually the basis of the dot and cross products. The guy who invented quaternions was actually trying to make a 3d algebra, but he couldn’t figure out how to divide two vectors in 3-space, but realized that he could make a system where the quotient of two 3-vectors was the sum of a scalar (the real component) and a 3-vector (the imaginary component/s).

    And where do dot and cross products come from? Well, if you interpret two 3-vectors, call them a and b, as quaternions, then you get:

    ab = a × b - a · b

    which is a really neat result!

    So basically, without quaternions, we probably wouldn’t have cross products, which are very necessary to describe electromagnetism.

  10. Ángel Méndez Rivera says:

    I have an objection.

    “Here’s why. You start by saying “i2 = j2 = -1″ and then asking “ij = ?”. You begin to get a sinking feeling when you square it: (ij)2 = i2j2 = (-1)(-1) = 1.”

    This identity is only true if you assume that multiplication is commutative, and commutativity is already false regardless of whether you introduce k or not. In fact, all the identity ij = k does is change the name of the product, it doesn’t necessarily show that k is an imaginary unit, so to speak. And the hypercomplex numbers exist too! I think a better way to explain the quaternions is by presenting them as paravectors: the sum of a scalar and a vector. This is also much more appropriate than the real vs imaginary terminology.

    In fact, the quaternions are perfect for modeling four-vectors in tensor calculus and special relativity. The Minkowski metric gives us a four vector inner product that combines arithmetic multiplication with the dot product, and this dot product can be expressed in terms of the negative squares obtained when using the imaginary units. We now have two vector products: the Minkowski inner product and the paratensor outer product. For any two quaternion paravectors s + u, where s is a scalar and u is a vector, and t + v, where t is a scalar and v is a vector, the Minkowski inner product is st – u dot v, where – u dot v = uv if uv is expressed as the quaternion products, and u & v would be the imaginary parts of both quaternion paravectors, respectively. Instead of calling it the imaginary part, we can call it the vector part, and the real part can be called the scalar part. The outer paratensor product produces a paratensor, namely, the sum of a scalar, a vector, and a tensor, this tensor either being a matrix, or a higher order tensor in the more general case, depending on the number of multiplied vectors. This product would also be almost equivalent to the quaternion product, except that ij is a matrix rather than a vector. These can all be grouped together into a 4×4 matrix, which we would call a four-tensor in special relativity. This paratensor would be equal to st + (sv + tu) + u(x)v, where u(x)v stands for the outer product or more generally the tensor product of u with v. These two products are related, because the paratensor outer product contracted with the Minkowski metric yields the Minkowski inner product! This formalism is also more sensical and satisfactory for a more general reason: we can now define an exterior algebra via a commutator operator defined for the tensor product. In special relativity, the four-bivectors get formed via the exterior product of two four-vectors, just as any normal bivector gets formed as the exterior product of two normal vectors. In both cases, the exterior product can be defined as the commutator of the tensor product of two vectors. Namely, u^v = u(x)v – v(x)u, or equivalently, v(x)u = Transpose[u(x)v]. This also applies in the case applied to four-vectors. Because of this, we can obtain the angular momentum four-tensor, and we can also obtain the electromagnetic field strength four-tensor, both of which can be expressed in terms of an exterior algebra operation. This relation is really good, because this allows us to define the Grassmann numbers and the Grassmann variables, which can reduce to the dual numbers in the case of 1-dimension. As for having a vector field defined over a field of complex numbers as opposed to real numbers, even though complex numbers are already paravectors, all that this requires is simply the Cartesian product to define.

    Finally, the original quaternion product can be expressed in terms of the Hodge dual and the exterior product. More generally, the Hamiltonian quaternion algebra can be defined using the Hodge dual operator on the Grassmann algebra, and both algebras can be combined to form Clifford algebras, which are what ultimately get used to model the theory of general relativity in its completion whenever not dealing with quantum gravity.
