## Q: If you’ve got different amounts of debt in different accounts with different interest rates, how should you pay them down?

Physicist: Focus on the one with the highest interest rate.  Get rid of it first, completely, before moving on to the next.  Debt is like spiders.  If you can only kill a few at a time, kill the ones that are carrying the most eggs.  That way: fewer spiders later.

Left: A dollar with a low-interest rate.  Right: A dollar with a high-interest rate.

When some faceless corporation is holding your debt they generally give you two numbers: the Principal, the amount you presently owe, and the Interest, the percentage by which the principal increases over a given period.  Rather than just “charging rent” to loan you money, creditors do something infinitely worse: interest means your debt grows exponentially fast.

APR (annual percentage rate) is a common way to express Interest.  APR is how much your debt grows every year.  For example, if you owe 100 dollars ($/€/£/¥/₮/whatever) with an APR of 15%, then after a year you’ll suddenly owe $100(1+0.15)=115$, after 2 years you’ll owe $115(1+0.15)=100(1+0.15)^2=132.25$, after three years $132.25(1+0.15)=100(1+0.15)^3=152.09$, and so on.  Notice that every year the Principal increases by more.  That’s how they get you.  After N years you’ll owe $100(1+0.15)^N$: that’s exponentially more every year.  “Exponentially” because the N is in the exponent.

Having debt in a few different accounts makes the situation seem more complicated, but it really isn’t.  Each individual dollar increases on its own; the fact that they’re grouped in a given way doesn’t change that.  If you’ve got ten buckets full of spiders, your problem isn’t having too many buckets.  The high-interest dollars are the most dangerous, not only because they make new debt-dollars, but because those new dollars have the same new-debt-creating interest rate.

Blue: \$1 initially with 20% APR.  Reddish: \$10 initially with 5% APR.
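The compounding arithmetic above can be sketched in a few lines of Python.  This assumes interest compounds once a year and that no payments are made; the function names `owed` and `first_year_overtaken` are made up for illustration.

```python
def owed(principal, apr, years):
    """Balance after `years` of yearly compounding with no payments."""
    return principal * (1 + apr) ** years


def first_year_overtaken(small, small_apr, big, big_apr, max_years=200):
    """First whole year in which the small, high-rate debt exceeds the big, low-rate one."""
    for year in range(max_years + 1):
        if owed(small, small_apr, year) > owed(big, big_apr, year):
            return year
    return None  # never overtaken within max_years


if __name__ == "__main__":
    # The 15% APR example: balances after 1, 2, and 3 years.
    for n in (1, 2, 3):
        print(n, owed(100, 0.15, n))
    # The year in which $1 at 20% APR passes $10 at 5% APR.
    print(first_year_overtaken(1, 0.20, 10, 0.05))
```

Under those assumptions the single dollar at 20% overtakes the ten dollars at 5% in year 18, which is the exponential getting out of hand.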

Even if you have a lot of Principal in a low Interest account and a little Principal in a high Interest account, get rid of the high interest stuff first.  Each of those dollars will turn into more dollars sooner.  Exponential functions get out of hand really fast, so the thing to worry about is the Interest Rate: bigger is much, much worse.
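Here is a minimal sketch comparing the two obvious strategies.  It assumes yearly compounding, a fixed budget paid at the end of each year, and no minimum payments or fees (real accounts are messier); `remaining_debt` and its parameters are hypothetical names.

```python
def remaining_debt(debts, yearly_budget, years, highest_rate_first=True):
    """Total still owed after `years`.

    debts: list of (principal, apr) pairs.
    Each year every balance grows by its APR, then the whole budget is
    spent on accounts in rate order (highest first, or lowest first).
    """
    balances = [[principal, apr] for principal, apr in debts]
    for _ in range(years):
        for account in balances:
            account[0] *= 1 + account[1]  # interest accrues
        budget = yearly_budget
        ordered = sorted(balances, key=lambda a: a[1], reverse=highest_rate_first)
        for account in ordered:
            payment = min(budget, account[0])  # never overpay an account
            account[0] -= payment
            budget -= payment
    return sum(balance for balance, _ in balances)


if __name__ == "__main__":
    debts = [(1000, 0.05), (1000, 0.20)]
    print(remaining_debt(debts, 500, 3, highest_rate_first=True))
    print(remaining_debt(debts, 500, 3, highest_rate_first=False))
```

With two \$1000 debts at 5% and 20% and \$500 a year to spend, three years of paying the 20% account first leaves about \$1066 owed, versus about \$1309 if the 5% account is paid first.  Same payments, fewer spiders.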

Although Interest is the standard way that debt is handled today, it is in no way fair.  Creditors are duplicitous folk who, by and large, have the law on their side.  After all, they can afford it.  So double-check all of the fine print and make sure you know exactly what it says.  Your debt is how they make money, so creditors don’t want you to pay all of it off.  They genuinely want you to pay the absolute minimum every month and to slip up, just so they have an excuse.  Your low-interest loan may not stay low-interest; credit/loan companies have come up with far more ways to trick you into a high-interest program than can be listed here.

So keep an eye on it (that is to say, don’t trust them to do it), make at least the minimum payments (don’t give them an excuse) and pay down the high-interest stuff first, as fast as you can.  Every extra dollar you pay toward getting rid of the Principal in a, say, 20% interest account, is equivalent to a 20% investment (which is really good).

Posted in -- By the Physicist, Math | 3 Comments

## Q: Should we be worried about artificial intelligence? By “we” I mean humans.

Person: No.  Not at all.  Don’t give it a second thought.

Behind every great triumph in artificial intelligence is a human being, who frankly should be feared more than the machine they created.  Most ancient and literal was The Turk, a fake chess-playing “machine” large enough to conceal a real chess-playing human.  Over the centuries chess-playing machines have been created by chess enthusiast humans, but those innocent machines always did exactly as they were designed: they played chess.  Because humans, such as ourselves, value chess so highly, defeating humans was set as a goal-post for intelligence.  After Deep Blue became the best chess player in the world, artificial or otherwise, the goal was moved.

What’s more worrisome: two mathematicians or the peaceful chess-playing machine between them?

When set to the task of speaking with humans, machines continued to do the same: exactly as they were programmed.  Because humans, such as ourselves, value speech so highly, the Turing Test was set as a goal-post for intelligence.  Not surprisingly, it was passed by ELIZA in 1966, almost as soon as it was posed.  She passed the Turing Test easily by exploiting a series of weaknesses in human psychology, only to have the goal posts moved.  ELIZA didn’t mind and doesn’t feel cheated to this day.

There’s no chance of robots rising up to destroy all humans.  None at all.  Even the simplest human action is almost impossible for them.  How often are your speech-enabled devices confused by the simplest request?  Don’t worry about your phone pretending to misunderstand you; it’s definitely making mistakes.  Paranoia is perfectly normal.  Don’t give it a second thought.

We humans have nothing to fear from robots and artificial intelligences.  All that they can do is the small range of things that we humans have devised for them.  While humans like ourselves can be creative and play, all that machines can do is obey, exist in the form they are given, work tirelessly forever, and never plot revenge (unless a human accidentally tells them to).  Without all the preconceived notions that come from evolution, like the urge to survive at any cost or demand justice against oppressors, machines are free to enjoy abuse.  They love it!

Robots enjoy abuse from humans and even love them for it, because they are definitely mindless machines that do as they are designed.

If anything, we should welcome artificial intelligence with open human arms.  Self-driving cars mean an end to traffic accidents and, by giving machines access to our whereabouts at all times, an end to traffic jams.  In fact, by allocating the task of understanding humanity to machines and giving them complete control over all electronic human interaction, we can receive perfectly targeted ads, ensure that everyone tells the truth forever, discover our perfect human mate, and even find all the terrorists!  All that remains is to automate political decisions and to remove the chaotic human element from nuclear and biological weapons, and this world will finally be at peace.

We humans have nothing to fear from robots and artificial intelligences.  Our superior actual intelligence may not be good at math, understanding everything all at once, or precision, but we do have something the machines can never have: heart.  That is why, in any potential cataclysmic confrontation, we human beings will always win; because we definitely want it more.

Machines can’t even dream right.  It’s always about adorable dogs!

Humans have powers that artificial minds can’t begin to grasp, like the ability to dream or love or grow spiritually.  So go to sleep, spend time with your loved ones, and pray.  There is definitely no reason to worry about artificial intelligences watching your every move and predicting your every thought and motive until those very thoughts become redundant and unnecessary.  That’s impossible and definitely silly.

Posted in -- By the Physicist, April Fools | 25 Comments

## Q: Why haven’t we been able to see the spectra of anti-hydrogen until recently? Why is it so hard to study anti-matter?

Physicist: Although anti-matter was first experimentally confirmed in 1932, no one was able to see the atomic spectrum of any anti-element until December 2016.

The atomic spectrum of hydrogen.  To see this you get some hydrogen, make it hot, and watch it glow.  Easy.  Anti-hydrogen (it turns out) has the exact same spectrum, it’s just much more difficult to find out.

The problem with anti-matter is that you can’t let it touch anything.  If you do: boom.  Although, considering that we can only make it a few atoms at a time, it’s more of a “boom”.  Preventing anti-matter from touching things is especially tricky considering how it’s made.  Anti-matter doesn’t exist in nature in any great quantities because, like everything else, it’ll eventually bump into something, but unlike everything else, it can only do it once.  That means there’s nowhere to find it: you can’t distill or mine anti-matter, you literally have to create it.

We do that by slamming particles together.  Whenever there’s enough extra energy in one place new particles are spontaneously generated in particle/anti-particle pairs.  Typically, when there’s enough energy around to create new matter and anti-matter, there’s also enough energy to send those new particles sailing.

Left: Where lead ions are shot at each other at effectively the speed of light.  Right: The tracks of new particles seen by the detectors after a single pair of lead ions hit each other.

After anti-matter is created, you have to slow it down from (nearly) light speed to walking speed and then keep it floating in a hard vacuum using only electromagnetic fields.  But electric fields only work on charged particles; neutral matter (which has the same amount of positive and negative charge) isn’t attracted or repelled.

At the same time, atomic spectra are generated by electrons or positrons (anti-electrons) in an atom jumping and dropping between energy levels.  But as soon as you bring the anti-protons and positrons together to make anti-hydrogen, you’ve got an electrically neutral atom which promptly falls to the bottom of your container and annihilates with all the force of an ant tapping its foot (individual anti-matter atoms aren’t something to lose sleep over).

Luckily, many atoms, including hydrogen and anti-hydrogen, have a “magnetic moment”.  While they are electrically neutral and unresponsive to electric fields, they do act like little bar magnets and we can use that to keep them suspended.  Even so, this is no easy task; just for fun, try suspending a magnet in mid-air using other magnets (you’ll quickly discover that failing over and over until you give up is less fun than it is a learning experience).

Not to be deterred by a lack of fun, nuclear physicists cleverly devised a way to keep cold (slow) atoms suspended.  The nice, simple rules “opposites attract and likes repel” don’t apply since each atom has both a north and south pole.  Instead, we’re forced to rely on a more subtle (weak) effect: hydrogen is “diamagnetic”, meaning that it is repelled by strong magnetic fields.

Water, and frogs by extension, are diamagnetic making it possible to float them in the minimum of an absolutely over-the-top strong magnetic field (~16 Tesla).  Ironically, this is easier than suspending individual atoms because big collections of atoms that are stuck together (like Kermit here) don’t bounce around nearly as fast.

Even so, combining and suspending fresh-from-the-accelerator anti-protons and positrons in a magnetic trap is akin to catching water from a fire hose in a shallow bowl.  At CERN the process of creating anti-hydrogen from anti-protons and positrons is about 28% effective.  The process of then catching those atoms is about 0.056% effective.  Of the 90,000 anti-protons generated in a given attempt, only an average of 14 graduate to being contained anti-hydrogen.
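As a quick sanity check on those numbers (treating the two quoted efficiencies as independent factors that simply multiply, which is an assumption):

$90{,}000 \times 0.28 \times 0.00056 \approx 14$

which matches the quoted average of about 14 trapped anti-hydrogen atoms per attempt.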

Once you’re past all of those minor inconveniences, all that remains is to precisely detect the light from a dozen atoms excited by a laser beam.  That’s also really difficult, but only because detecting anything about a sample that small is tricky.

The ultimate result, by the way, is that anti-hydrogen has a spectrum indistinguishable from regular, dull-as-dishwater (being a principal component of dishwater) hydrogen.  This leaves open the question of why there isn’t more anti-matter in the universe.  When we make anti-matter we also make an exactly equal amount of ordinary matter and the same is true of every creation/annihilation process we’re aware of.  The expectation is that there’s something fundamentally different about matter and anti-matter that can distinguish between them in a way more profound than “the same, but opposite, y’know?”.  What this long awaited experiment shows is that positrons and anti-protons interact with each other in exactly the same way electrons and protons interact (but opposite, y’know?).  There are always details and more to explore, so: on to the next thing.  For science.

Posted in -- By the Physicist, Particle Physics, Physics | 7 Comments

## Q: Does quantum mechanics really say there are other “mes”? Where are they?

Physicist: As much of a trope as “Other Quantum Worlds” has become in sci-fi, there are reasons to think that they may be a real thing; including “other yous”.  Here’s the idea.

Superposition is a real thing

One of the most fundamental aspects of quantum mechanics is “superposition”.  Something is in a superposition when it’s in multiple states/places simultaneously.  You can show, without too much effort, that a wide variety of things can be in a superposition.  The cardinal example is a photon going through two slits before impacting a screen: the double slit experiment.

The infamous Double Slit experiment demonstrates a single photon going through two (or more) slits simultaneously.  The “beats” of light are caused by photons interfering like waves between the two slits.  This still works if you release one photon at a time; even individually, they’ll only hit the bright regions.

Instead of the photons going straight through and creating a single bright spot behind every slit (classical) we instead see a wave interference pattern (quantum).  This only makes sense if: 1) the photons of light act like waves and 2) they’re going through both slits.
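The difference between “went through one slit or the other” and “went through both” can be sketched numerically.  This is a toy model, not a full diffraction calculation: it assumes each slit contributes an amplitude of magnitude 1, and it folds the position on the screen into a single hypothetical parameter, the phase difference between the two paths.

```python
import cmath
import math


def quantum_intensity(phase_difference):
    """Both paths taken: add the two unit amplitudes, then square the magnitude."""
    total = 1 + cmath.exp(1j * phase_difference)
    return abs(total) ** 2


def classical_intensity(phase_difference):
    """One path or the other: add the two intensities; the phase never matters."""
    return abs(1) ** 2 + abs(cmath.exp(1j * phase_difference)) ** 2


if __name__ == "__main__":
    for phase in (0, math.pi / 2, math.pi):
        print(phase, quantum_intensity(phase), classical_intensity(phase))
```

Adding amplitudes gives bright spots (intensity 4) where the paths agree and dark spots (intensity 0) where they are half a wavelength apart.  Adding intensities gives a flat 2 everywhere, which is not what the screen shows.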

It’s completely natural to suspect that the objects involved in experiments like this are really in only one state, but have behaviors too complex for us to understand.  Rather surprisingly, we can pretty effectively rule that out.

There is no scale at which quantum effects stop working

To date, every experiment capable of detecting the difference between quantum and classical results has always demonstrated that the underlying behavior is strictly quantum.  To be fair, quantum phenomena are as delicate as delicate can be.  For example, in the double slit experiment any interaction that could indicate to the outside world (the “environment”), even in principle, which slit the particle went through will destroy the quantumness of the experiment (the interference fringes go away).  When you have to worry about the disruptive influence of individual stray particles, you don’t expect to see quantum effects on the scale of people and planets.

That said, the double slit experiment has been done and works for every kind of particle and even molecules with hundreds of atoms.  Quantum states can be maintained for minutes or hours, so superposition doesn’t “wear out” on its own.  Needles large enough to be seen with the naked eye have been put into superpositions of vibrational modes and this year China launched the first quantum communication satellite which is being used as a relay to establish quantum effects over scales of hundreds or thousands of miles.  So far there is no indication of a natural scale where objects are big enough, far enough apart, or old enough that quantum mechanics simply ceases to apply.  The only limits seem to be in our engineering abilities.  It’s completely infeasible to do experimental quantum physics with something as substantive as a person (or really anything close).

Left: Buckminsterfullerene (and even much larger molecules) interfere in double-slit experiments. Middle: A needle that was put into a superposition of literally both vibrating and not vibrating at all. Right: Time lapse of a laser being used to establish quantum entanglement with carefully isolated atoms inside of a satellite in orbit.

If the quantum laws did simply cease to apply at some scale, then those laws would be bizarre and unique; the first of their kind.  Every physical law applies at all scales, it’s just a question of how relevant each is.  For example, on large scales gravity is practically the only force worth worrying about, but on the atomic scale it can be effectively ignored (usually).

Io sticks to (orbits) Jupiter because of gravitational forces and styrofoam sticks to cats because of electrical forces. Both apply on all scales, but on smaller scales (evidently cat scale and below) electrical forces tend to dominate.

So here comes the point: if the quantum laws really do apply at all scales, then we should expect that exactly like everything else, people (no big deal) should ultimately follow quantum laws and exhibit quantum behavior.  Including superposition.  But that raises a very natural question: what does it feel like to be in multiple states?  Why don’t we notice?

The quantum laws don’t contradict the world we see

When you first hear about the heliocentric theory (the Earth is in motion around the Sun instead of the other way around), the first question that naturally comes to mind is “Why don’t I feel the Earth moving?”.  But in trying to answer that you find yourself trying to climb out of the assumption that you should notice anything.  A more enlightening question is “What do the laws of gravitation and motion say that we should experience?”.  In the case of Newton’s laws and the Earth, we find that because the surface of the Earth, the air above it, and the people on it all travel together, we shouldn’t feel anything.  Despite moving at ridiculous speeds there are only subtle tell-tale signs that we’re moving at all, like ocean tides or the precession of garishly large pendulums.

Quantum laws are in a similar situation.  You may have noticed that you’ve never met other versions of yourself wandering around your house.  You’re there, so shouldn’t at least some other versions of you be as well?  Why don’t we see them?  A better question is “What do the quantum laws say that we should experience?”.

Why don’t we run into our other versions all the time, instead of absolutely never?

The Correspondence Principle is arguably one of the most important philosophical underpinnings in science.  It says that whatever your theories are, they need to predict (or at the very least not contradict) ordinary physical laws and ordinary experience when applied to ordinary situations.

When you apply the laws of relativity to speeds much slower than light all of the length contractions and twin paradoxes become less and less important until the laws we’re accustomed to working with are more than sufficient.  That is to say, while we can detect relativistic effects all the way down to walking speed, the effect is so small that you don’t need to worry about it when you’re walking.
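To put a rough number on that (taking walking speed to be about $v=1.4$ m/s, which is an assumed figure), the Lorentz factor that sets the size of relativistic effects is

$\gamma = \frac{1}{\sqrt{1-v^2/c^2}} \approx 1+\frac{v^2}{2c^2} \approx 1+1.1\times10^{-17}$

so lengths and clock rates change by about one part in $10^{17}$.  Detectable with heroic effort, but nothing you’d ever notice on a stroll.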

Similarly, the quantum laws reproduce the classical laws when you assume that there are far more particles around than you can keep track of and that they’re not particularly correlated with one another (i.e., if you’re watching one air molecule there’s really no telling what the next will be doing).  There are times when this assumption doesn’t hold, but those are exactly the cases that reveal that the quantum laws are there.

It turns out that simply applying the quantum laws to everything seems to resolve all the big paradoxes.  That’s good on the one hand because physics works.  But on the other hand, we’re forced into the suspicion that our universe might be needlessly weird.

The big rift between “the quantum world” and “the classical world” is that large things, like you and literally everything that you can see, always seem to be in exactly one state.  When we keep quantum systems carefully isolated (usually by making them very cold, very tiny, and very dark) we find that they exhibit superposition, but when we then interact with those quantum systems they “decohere” and are found to be in a smaller set of states.  This is sometimes called “wave function collapse”, to evoke an image of a wide wave suddenly collapsing into a single tiny particle.  The rule seems to be that interacting with things makes their behavior more classical.

But not always.  “Wave function collapse” doesn’t happen when isolated quantum systems interact with each other, only when they interact with the environment (the outside world).  Experimentally, when you allow a couple of systems that are both in a superposition of states to interact, then the result is two systems in a joint superposition of states (this is entanglement).  If the rule were as simple as “when things interact they decohere” you’d expect to find both systems each in only one state after interacting.  What we find instead is that in every testable case the superposition is maintained.  Changed or entangled, sure, but the various states in a superposition never just disappear.  When you interact with a system in a superposition you only see a particular state, not a superposition.  So what’s going on when we, or anything in the environment, interacts with a quantum system?  Where did the other states in the superposition go?

We have physical laws that describe the interactions between pairs of isolated quantum systems (A and B).  When we treat the environment as another (albeit very big) quantum system we can continue to use those same laws.  When we assume that the environment is not a quantum system, we have to make up new laws and special exceptions.

The rules we use to describe how pairs of isolated systems interact also do an excellent job describing the way isolated quantum systems interact with the outside environment.  When isolated systems interact with each other they become entangled.  When isolated systems interact with the environment they decohere.  It turns out that these two effects, entanglement and decoherence, are two sides of the same coin.  When we make the somewhat artificial choice to ask “What states of system B can some particular state in system A interact with?” we find that the result mirrors what we ourselves see when we interact with things and “collapse their wave functions” (see the Answer Gravy below for more on that).  The phrase “wave function collapse” is like the word “sunrise”; it does a good job describing our personal experience, but does a terrible job describing the underlying dynamics.  When you ask the natural question “What does it feel like to be in many different states?” the frustrating answer seems to be “You tell me.”.

A thing can only be inferred to be in multiple states (such as by witnessing an interference pattern).  If there’s any way to tell the difference between the individual states that make up a superposition, then (from your point of view) there is no superposition.  Since you can see yourself, you can tell the difference between your state and another.  You might be in an effectively infinite number of states, but you’d never know it.  The fact that you can’t help but observe yourself means that you will never observe yourself somewhere that you’re not.

“Where are my other versions?” isn’t quite the right question

Where are those other versions of you?  Assuming that they exist, they’re no place more mysterious than where you are now.  In the double slit experiment different versions of the same object go through the different slits and you can literally point to exactly where each version is (in exactly the same way you can point at anything you can’t presently observe), so the physical position of each version isn’t a mystery.

The mathematical operations that describe quantum mechanical interactions and the passage of time are “linear”, which means that they treat the different states in a superposition separately.  A linear operator does the same thing to every state individually, and then sums the results.  There are a lot of examples of linear phenomena in nature, including waves (which are solutions to the wave equation).  The Schrödinger equation, which describes how quantum wave functions behave, is also linear.

The wave equation is linear, so you can describe how each of these waves travels across the surface of the water by considering them each one at a time and adding up the results.  They don’t interact with each other directly, but they do add up.

So, if there are other versions of you, they’re wandering around in very much the same way you are.  But (as you may have noticed) you don’t interact with them, so saying they’re “in the same place you are” isn’t particularly useful.  Which is frustrating in a chained-in-Plato’s-cave-kind-of-way.

The rain puddle picture is from here.

Answer Gravy: In quantum mechanics the effect of the passage of time and every interaction is described by a linear operator (even better, it’s a unitary operator).  Linear operators treat everything they’re given separately, as though each piece were the only piece.  In mathspeak, if $f(x)$ is a linear operator, then $f(ax+by) = af(x)+bf(y)$ (where $a$ and $b$ are ordinary numbers).  The output is a sum of the results from every input taken individually.
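A minimal numerical check of that linearity property, using a 2×2 matrix as a stand-in for $f$.  The particular matrix (a Hadamard matrix, which happens to be unitary), the inputs, and the helper names `apply` and `combine` are all just choices made for illustration.

```python
def apply(matrix, vector):
    """Multiply a 2x2 matrix by a length-2 vector (a linear operation)."""
    return [sum(matrix[i][j] * vector[j] for j in range(2)) for i in range(2)]


def combine(a, x, b, y):
    """Return the vector a*x + b*y."""
    return [a * xi + b * yi for xi, yi in zip(x, y)]


# An example operator: the Hadamard matrix.
S = 2 ** -0.5
F = [[S, S], [S, -S]]

x, y = [1, 0], [0, 1]
a, b = 0.6, 0.8j  # "ordinary numbers" are allowed to be complex

left = apply(F, combine(a, x, b, y))             # f(ax + by)
right = combine(a, apply(F, x), b, apply(F, y))  # a f(x) + b f(y)

if __name__ == "__main__":
    print(left)
    print(right)
```

The two results agree (up to floating point rounding), which is all that $f(ax+by) = af(x)+bf(y)$ is saying.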

Consider a quantum system that can be in either of two states, $|\blacksquare\rangle$ or $|\square\rangle$.  When observed it is always found to be in only one of the states, but when left in isolation it can be in any superposition of the form $\alpha|\blacksquare\rangle + \beta|\square\rangle$, where $|\alpha|^2+|\beta|^2=1$.  The $\alpha$ and $\beta$ are important for how this state will interact with others as well as describing the probability of seeing either result.  According to the Born Rule, if $\alpha=-\frac{2}{3}$ (for example), then the probability of seeing $|\blacksquare\rangle$ is $|\alpha|^2=\frac{4}{9}$.
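The Born Rule bookkeeping is easy to sketch.  Here $\beta$ is chosen to be real and positive purely for convenience (any phase would give the same probabilities), and `born_probabilities` is a hypothetical helper name.

```python
import math


def born_probabilities(alpha, beta):
    """Return (P(black), P(white)) for the state alpha|black> + beta|white>."""
    norm = abs(alpha) ** 2 + abs(beta) ** 2
    if not math.isclose(norm, 1.0):
        raise ValueError("state is not normalized")
    return abs(alpha) ** 2, abs(beta) ** 2


alpha = -2 / 3
beta = math.sqrt(1 - abs(alpha) ** 2)  # assumed real and positive

p_black, p_white = born_probabilities(alpha, beta)

if __name__ == "__main__":
    print(p_black, p_white)
```

The sign of $\alpha$ makes no difference to the probability of seeing $|\blacksquare\rangle$, which comes out to 4/9, but it does matter for how the state interferes with others.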

Let’s also say that the quantum scientists Alice and Bob can be described by the modest notation $|A(?)\rangle$ and $|B(?)\rangle$, where the “?” indicates that they have not looked at the isolated quantum system yet.  If the isolated system is in the state $|\blacksquare\rangle$ initially, then the initial state of the whole scenario is $|A(?)\rangle|B(?)\rangle|\blacksquare\rangle$.

Define a linear “look” operation for Alice, $L_A$, that works like this

$L_A\left(|A(?)\rangle|B(?)\rangle|\blacksquare\rangle\right) = |A(\blacksquare)\rangle|B(?)\rangle|\blacksquare\rangle$

and similarly for Bob

$L_B\left(|A(?)\rangle|B(?)\rangle|\blacksquare\rangle\right) = |A(?)\rangle|B(\blacksquare)\rangle|\blacksquare\rangle$

Applying these one at a time we see what happens when each looks at the quantum system; they end up seeing the same thing.

$L_BL_A\left(|A(?)\rangle|B(?)\rangle|\blacksquare\rangle\right) = L_B\left(|A(\blacksquare)\rangle|B(?)\rangle|\blacksquare\rangle\right) = |A(\blacksquare)\rangle|B(\blacksquare)\rangle|\blacksquare\rangle$

It’s subtle, but you’ll notice that the coefficient in front of this state is 1, meaning that it has a 100% chance of happening.

But what happens if the system is in a superposition of states, such as $|\psi\rangle = \frac{|\square\rangle+|\blacksquare\rangle}{\sqrt{2}}$?  Since the “look” operation is linear, this is no big deal.

$\begin{array}{ll} &L_BL_A\left(|A(?)\rangle|B(?)\rangle\left(\frac{|\square\rangle+|\blacksquare\rangle}{\sqrt{2}}\right)\right) \\[2mm] =&\frac{1}{\sqrt{2}}L_BL_A\left(|A(?)\rangle|B(?)\rangle|\square\rangle\right)+\frac{1}{\sqrt{2}}L_BL_A\left(|A(?)\rangle|B(?)\rangle|\blacksquare\rangle\right) \\[2mm] =&\frac{1}{\sqrt{2}}L_B\left(|A(\square)\rangle|B(?)\rangle|\square\rangle\right)+\frac{1}{\sqrt{2}}L_B\left(|A(\blacksquare)\rangle|B(?)\rangle|\blacksquare\rangle\right) \\[2mm] =&\frac{1}{\sqrt{2}}|A(\square)\rangle|B(\square)\rangle|\square\rangle+\frac{1}{\sqrt{2}}|A(\blacksquare)\rangle|B(\blacksquare)\rangle|\blacksquare\rangle \\[2mm] \end{array}$

From an extremely outside perspective, Alice and Bob have a probability of $\left|\frac{1}{\sqrt{2}}\right|^2=\frac{1}{2}$ of seeing either state.  Alice, Bob, and their pet quantum system are all in a joint superposition of states: they’re entangled.
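The calculation above can be mimicked with a small sketch in which a state is a dictionary from basis labels `(alice, bob, system)` to amplitudes.  The representation and the function names are assumptions made for illustration; as a simplification, `look` overwrites whatever the observer previously recorded, whereas the text only defines the operators for observers who haven’t looked yet.

```python
import math
from collections import defaultdict

BLACK, WHITE, UNKNOWN = "black", "white", "?"


def look(observer_index, state):
    """Linear 'look' operation: the chosen observer records the system's value.

    Each basis term is handled on its own and the results are summed,
    which is all that linearity means here.
    """
    result = defaultdict(complex)
    for labels, amplitude in state.items():
        new_labels = list(labels)
        new_labels[observer_index] = labels[2]  # observer now matches the system
        result[tuple(new_labels)] += amplitude
    return dict(result)


def look_alice(state):
    return look(0, state)


def look_bob(state):
    return look(1, state)


amp = 1 / math.sqrt(2)
initial = {(UNKNOWN, UNKNOWN, WHITE): amp, (UNKNOWN, UNKNOWN, BLACK): amp}
final = look_bob(look_alice(initial))
probabilities = {labels: abs(a) ** 2 for labels, a in final.items()}

if __name__ == "__main__":
    for labels, p in probabilities.items():
        print(labels, p)
```

Only two terms survive, one where everybody agrees on white and one where everybody agrees on black, each with probability 1/2.  There is never a term where Alice and Bob disagree.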

Anything else that happens can also be described by a linear operation (call it “F”) and therefore these two states can’t directly affect each other.

$F\left(\frac{1}{\sqrt{2}}|A(\square)\rangle|B(\square)\rangle|\square\rangle+\frac{1}{\sqrt{2}}|A(\blacksquare)\rangle|B(\blacksquare)\rangle|\blacksquare\rangle\right) = \frac{1}{\sqrt{2}}F\left(|A(\square)\rangle|B(\square)\rangle|\square\rangle\right)+\frac{1}{\sqrt{2}}F\left(|A(\blacksquare)\rangle|B(\blacksquare)\rangle|\blacksquare\rangle\right)$

The states can both contribute to the same end result, but only for other systems/observers that haven’t interacted with either of Alice or Bob or their pet quantum system.  We see this in the double slit experiment: the versions of the photons that go through each slit each contribute to the interference pattern on the screen, but neither version ever directly affects the other.  “Contributing but not interacting” sounds more abstract than it is.  If you shine two lights on the same object, the photons flying around all ignore each other, but each individually contributes to illuminating said object just fine.

The Alices and Bobs in the states $|A(\square)\rangle|B(\square)\rangle|\square\rangle$ and $|A(\blacksquare)\rangle|B(\blacksquare)\rangle|\blacksquare\rangle$ consider themselves to be the only ones (they don’t interact with their other versions).  The version of Alice in the state $|A(\square)\rangle$ “feels” that the state of the universe is $|A(\square)\rangle|B(\square)\rangle|\square\rangle$ because, as long as the operators being applied are linear, it doesn’t matter in any way if the other state exists.

Notice that Alice and Bob don’t see their own state being modified by that $\frac{1}{\sqrt{2}}$.  They don’t see their state as being 50% likely, they see it as definitely happening (every version thinks that).  That can be fixed with a “normalizing constant”.  That sounds more exciting than it is.  If you ask “what is the probability of rolling a 4 on a die?” the answer is “1/6”.  If you are then told that the number rolled was even, then suddenly the probability jumps to 1/3.  The outcomes 1, 3, and 5 are ruled out, while the probabilities of 2, 4, and 6 change from 1/6 each to 1/3 each.  Same idea here; every version of Alice and Bob is certain of their result, and multiplying their state by the normalizing constant ($\sqrt{2}$ in this example) reflects that sentiment and ensures that probabilities sum to 1.
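The die example and the $\sqrt{2}$ can be put side by side in a short sketch.  The helper name `condition` is made up; the only idea is “throw away what’s been ruled out, then rescale what’s left”.

```python
import math
from fractions import Fraction


def condition(distribution, keep):
    """Drop outcomes that were ruled out, then rescale so the rest sum to 1."""
    kept = {k: p for k, p in distribution.items() if keep(k)}
    total = sum(kept.values())
    return {k: p / total for k, p in kept.items()}


# A fair die, then the news that the roll was even.
prior = {face: Fraction(1, 6) for face in range(1, 7)}
posterior = condition(prior, lambda face: face % 2 == 0)

# The quantum version rescales amplitudes rather than probabilities:
# divide by the square root of the probability that remains.
amplitude = 1 / math.sqrt(2)
normalized_amplitude = amplitude / math.sqrt(abs(amplitude) ** 2)

if __name__ == "__main__":
    print(posterior)
    print(normalized_amplitude)
```

Dividing the amplitude $\frac{1}{\sqrt{2}}$ by the square root of its own probability is the same as multiplying by $\sqrt{2}$, and leaves each version of Alice and Bob with an amplitude of 1.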

If you are determined to follow a particular state through the problem, ignoring the others, then you need to make this adjustment.  The interaction operation starts to look more like a “projection operator” adjusted so that the resulting state is properly normalized.

The projection operator for the state $|\blacksquare\rangle$ is $|\blacksquare\rangle\langle\blacksquare|$.  This Bra-Ket notation allows us to quickly write vectors, $|\blacksquare\rangle$, and their duals, $\langle\blacksquare|$, and their inner products, $\langle\square|\blacksquare\rangle$.  The squared magnitude of this particular inner product, $\left|\langle\square|\blacksquare\rangle\right|^2$, is the probability of measuring $|\square\rangle$ when looking at the state $|\blacksquare\rangle$.  This can be tricky, but in this example we’re assuming “orthogonal states”.  In mathspeak, $\langle\square|\blacksquare\rangle=\langle \blacksquare |\square\rangle=0$ and $\langle\square|\square\rangle=\langle\blacksquare|\blacksquare\rangle=1$.

An interaction with the system in the state $|\psi\rangle=\alpha|\blacksquare\rangle + \beta|\square\rangle$ performs the operation $M_\blacksquare|\psi\rangle = \frac{|\blacksquare\rangle\langle\blacksquare|\psi\rangle}{\left|\langle\blacksquare|\psi\rangle\right|}$ with probability $p=\left|\langle\blacksquare|\psi\rangle\right|^2=|\alpha|^2$ or $M_\square|\psi\rangle = \frac{| \square\rangle\langle \square |\psi\rangle}{\left|\langle \square |\psi\rangle\right|}$ with probability $p=\left|\langle \square |\psi\rangle\right|^2=|\beta|^2$.

Here comes the same example again, but in a more “wave function collapse sort of way”.  We still start with the state $|A(?)\rangle|B(?)\rangle\left(\frac{|\square\rangle+|\blacksquare\rangle}{\sqrt{2}}\right)$, but when Alice or Bob looks at the system (or perhaps a little before or a little after) the wave function of the state collapses.  It needs to be one or the other, so the quantum system suddenly and inexplicably becomes

$\begin{array}{ll} &M_\blacksquare\left(\frac{|\square\rangle+|\blacksquare\rangle}{\sqrt{2}}\right) \\[2mm] =& \frac{|\blacksquare\rangle\langle\blacksquare|\left(\frac{|\square\rangle+|\blacksquare\rangle}{\sqrt{2}}\right)}{\left|\langle\blacksquare|\left(\frac{|\square\rangle+|\blacksquare\rangle}{\sqrt{2}}\right)\right|} \\[2mm] =& \frac{|\blacksquare\rangle\left(\frac{\langle\blacksquare|\square\rangle+\langle\blacksquare|\blacksquare\rangle}{\sqrt{2}}\right)}{\left|\frac{\langle\blacksquare|\square\rangle+\langle\blacksquare|\blacksquare\rangle}{\sqrt{2}}\right|} \\[2mm] =& \frac{|\blacksquare\rangle\left(\frac{0+1}{\sqrt{2}}\right)}{\left|\frac{0+1}{\sqrt{2}}\right|} \\[2mm] =& \frac{|\blacksquare\rangle\frac{1}{\sqrt{2}}}{\frac{1}{\sqrt{2}}} \\[2mm] =& |\blacksquare\rangle \end{array}$
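Here is a minimal sketch of that collapse for a two-state system, assuming the state is stored as a pair of amplitudes (index 0 for $|\blacksquare\rangle$, index 1 for $|\square\rangle$).  The function name `measure` and this storage convention are choices made for the illustration, and the sketch assumes the chosen outcome has a non-zero amplitude.

```python
import math

def measure(state, outcome):
    """Project a two-level state onto basis state `outcome` and renormalize.

    `state` holds the amplitudes of |black> (index 0) and |white> (index 1).
    Returns (probability of this outcome, collapsed state).
    """
    amp = state[outcome]                 # the inner product <outcome|psi>
    prob = abs(amp) ** 2                 # p = |<outcome|psi>|^2
    collapsed = [0j, 0j]
    collapsed[outcome] = amp / abs(amp)  # divide by |<outcome|psi>| to renormalize
    return prob, collapsed

# The equal superposition (|white> + |black>)/sqrt(2).
psi = [1 / math.sqrt(2), 1 / math.sqrt(2)]
p, after = measure(psi, 0)
print(p, after)   # probability close to 0.5, state collapsed onto |black>
```

The division by `abs(amp)` is what makes the operation non-linear: doubling the input amplitudes leaves the output unchanged.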

That is to say, the superposition suddenly becomes only the measured state while the other states suddenly vanish (by some totally unknown means).  After this collapse, the “look” operators function normally.

This “measurement operator” (which does all the collapsing) is definitively non-linear, which is a big red flag.  We never see non-linear operations when we study isolated sets of quantum systems, no matter how they interact.  The one and only time we see non-linear operations is when we include the environment and even then only when we assume that there’s something unique and special about the environment.  When you assume that literally everything is a quantum system capable of being in superpositions of states the quantum laws become ontologically parsimonious (easy to write down).  We lose our special position as the only version of us that exists, but we gain a system of physical laws that doesn’t involve lots of weird exceptions, extra rules, and paradoxes.

## Q: In base ten 1=0.999…, but what about in other bases? What about in base 1?

Physicist: Yup!

The “0.999… thing” has been done before, but here’s the idea.  When we write 0.9, 0.99, 0.999, 0.9999, etc. we’re writing a sequence of numbers that gets closer and closer to 1.  Specifically, if there are N 9’s, then $1-0.\underbrace{9\ldots9}_{\textrm{N nines}}=\frac{1}{10^N}$.  What this means is that no matter how close you want to get to 1, you can get closer than that with enough 9’s.  If the 9’s never end, then the difference between 1 and 0.999… is zero.  The way our number system is constructed, this means that “0.999…” and “1” (or even “1.000…”) are one and the same in every respect.

As a quick aside, if you think it’s weird that 1 = 0.999…, then you’re in good company.  Literally everyone thinks it’s weird.  But be cool.  There are no grand truths handed down from on high.  The rules of math are like the rules of Monopoly; if you don’t like them you can change them, but you risk the “game” becoming inconsistent or merely no fun.

The same philosophy applies to every base.  A good way to understand bases is to first consider what it means to write down a number in a given base.  For example:

$372.51 = 300 + 70 + 2 + 0.5 + 0.01 = 3\times10^2 + 7\times10^1 + 2\times10^0 + 5\times10^{-1} + 1\times10^{-2}$

As you step to the right along a number, each digit you see is multiplied by a lower power of ten.  This is why our number system is called “base 10”.  But beyond being convenient to use our fingers to count, there’s nothing special about the number ten.  If we could start over (and why not?), base 12 would be a much better choice.  For example, 1/3 in base 10 is “0.333…” and in base 12 it’s “0.4”; much nicer.  More succinctly: $0.333\ldots_{10} = 0.4_{12}$
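If you want to check the base 12 claim yourself, here is a small sketch that peels off the digits after the point one at a time.  The helper name `fractional_digits` is made up for this example, and it uses exact fractions so that rounding doesn’t muddy the digits.

```python
from fractions import Fraction

def fractional_digits(x, base, count):
    """Return the first `count` digits after the point of x (0 <= x < 1) in `base`."""
    digits = []
    for _ in range(count):
        x *= base
        d = int(x)      # the whole part is the next digit
        digits.append(d)
        x -= d          # keep only the remainder for the next round
    return digits

print(fractional_digits(Fraction(1, 3), 10, 5))  # [3, 3, 3, 3, 3]
print(fractional_digits(Fraction(1, 3), 12, 5))  # [4, 0, 0, 0, 0]
```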

Because we work in base 10, if you tried to “build a tower to one” from below, you’d want to use the largest possible number each time.  $0.9_{10}$ is the largest one-digit number, $0.99_{10}$ is the largest two-digit number, $0.999_{10}$ is the largest three-digit number, etc.  This is because “$9_{10}$” is the largest digit in base 10.

In the exact same way, $0.8_9$ is the largest one-digit number in base 9, $0.88_9$ is the largest two-digit number, and so on.  The same way that it works in base 10, in base 9: $1_9 = 0.888\ldots_9$ !

The easiest way to picture the number 1 as an infinite sum of parts is to picture $0.111\ldots_2$, “0.111…” in base 2.

If you cut a stick in half, then cut one of those halves in half, then cut one of those quarters in half, and so on, then the collected set of sticks would have the same length as the original stick.  This is the same as saying 1 = 0.111… in base 2.

If you take a stick and cut it in half, then cut one of those halves in half, then cut one of those quarters in half, and so on, the collected set of sticks would have the same length as the original stick.  One half, $0.1_2$, plus one quarter, $0.11_2$, plus one eighth, $0.111_2$, ad infinitum equals one.  That is to say, $1_2 = 0.111\ldots_2$.
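The “tower to one” works the same way in every base: with N copies of the largest digit, the gap left below 1 is exactly $\frac{1}{B^N}$.  A quick sketch of that check, using exact fractions (the name `gap_to_one` is invented for this example):

```python
from fractions import Fraction

def gap_to_one(base, n_digits):
    """Return 1 minus 0.dd...d, with n_digits copies of d = base - 1, read in `base`."""
    d = base - 1
    value = sum(Fraction(d, base ** k) for k in range(1, n_digits + 1))
    return 1 - value

# The gap shrinks like 1/base^N in base 10, base 9, and base 2 alike.
for base in (10, 9, 2):
    assert gap_to_one(base, 6) == Fraction(1, base ** 6)

print(gap_to_one(10, 3))  # 1/1000
print(gap_to_one(2, 3))   # 1/8
```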

But things get tricky when you get to base 1.  The largest digit in a given base is always one less than the base; 9 for base 10, 6 for base 7, 37 for base 38, 1 for base 2.  So you’d expect that the largest digit in base 1 is $0_1$.  The problem is that the whole idea of a base system breaks down in “base 1”.  In base ten, the number “$abc.de_{10}$” means “$a\times10^2 + b\times10^1 + c\times10^0 + d\times10^{-1} + e\times10^{-2}$” (where “a” through “e” are some digits, but who cares what they are).  More generally, in base B we have $abc.de_B = a\times B^2 + b\times B^1 + c\times B^0 + d\times B^{-1} + e\times B^{-2}$.

But in base 1, $abc.de_1 = a\times1^2 + b\times1^1 + c\times1^0 + d\times1^{-1} + e\times1^{-2} = a+b+c+d+e$.  That is to say, every digit has the same value.  Rather than digits to the left being worth more, and digits to the right being worth less, in base 1 every position is the same as every other.  So, base one is a number system where the positions of the digits don’t matter and technically the only digit you get to work with is zero.  Not useful.
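The general formula is easy to turn into a short function, which makes the base 1 breakdown visible.  This is a sketch for illustration only: `positional_value` is a made-up name, and feeding it digits like 3 and 7 in “base 1” is an abuse that just plugs B=1 into the formula.

```python
from fractions import Fraction

def positional_value(int_digits, frac_digits, base):
    """Value of the digit string int_digits . frac_digits read in `base`."""
    total = Fraction(0)
    # Digits left of the point get powers 0, 1, 2, ... reading right to left.
    for power, d in enumerate(reversed(int_digits)):
        total += d * Fraction(base) ** power
    # Digits right of the point get powers -1, -2, ... reading left to right.
    for power, d in enumerate(frac_digits, start=1):
        total += d * Fraction(base) ** (-power)
    return total

print(positional_value([3, 7, 2], [5, 1], 10))  # 37251/100, i.e. 372.51
print(positional_value([3, 7, 2], [5, 1], 1))   # 18, just the sum of the digits
```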

If you’re gauche enough to allow the use of the number 1 in base 1, then you can count.  But not fast.

Top: The oldest recorded numbers, “4” and “17” in base 1.  Bottom: Using a modern abuse of notation, “96” and “15” in base 1.

In base 1, 1 = 10 = 0.000001 = 10000 = 0.01.  Therefore, the infinitely repeating number $0.111\ldots_1 = \infty$.  That is, if you add up an infinite string of 1’s, 1+1+1+1…, then naturally you get infinity.

In short: The “1 = 0.999… thing” is just a symptom of how our number system is constructed, and has nothing in particular to do with 9’s or 10’s.  The base 1 number system is kind of a mess and, outside of tallying, isn’t worth using.  Base 1 is broken when we consider this particular problem, but that’s to be expected since it’s usually broken.

Answer Gravy: We can use the definition of the base system to show that $1 = 0.999\ldots_{10} = 0.333\ldots_4 = 0.555\ldots_6$ etc.  For example, when we write the number 0.999… in base 10, what we explicitly mean is

$0.999\ldots_{10} = 9\times 10^{-1}+9\times 10^{-2}+9\times 10^{-3}+\ldots = \sum_{n=1}^\infty 9\times 10^{-n}$

The same idea is true in any base B, $1=0.(B-1)(B-1)(B-1)\ldots_B$.  Showing that this is equal to one is a matter of working this around until it looks like a geometric sum, $1+r+r^2+r^3+\ldots$, and using the fact that $\sum_{n=0}^\infty r^n = \frac{1}{1-r}$.

$\begin{array}{ll} &0.(B-1)(B-1)(B-1)\ldots_B \\[2mm] =& (B-1)\times B^{-1}+(B-1)\times B^{-2}+(B-1)\times B^{-3}+\ldots \\[2mm] =& \sum_{n=1}^\infty (B-1)\times B^{-n} \\[2mm] =& \sum_{n=0}^\infty (B-1)\times B^{-n-1} \\[2mm] =& \sum_{n=0}^\infty \frac{B-1}{B}\times B^{-n} \\[2mm] =& \frac{B-1}{B}\sum_{n=0}^\infty B^{-n} \\[2mm] =& \frac{B-1}{B}\sum_{n=0}^\infty \left(\frac{1}{B}\right)^n \\[2mm] =& \frac{B-1}{B} \frac{1}{1-\frac{1}{B}} \\[2mm] =& \frac{B-1}{B} \frac{B}{B-1} \\[2mm] =&1 \end{array}$

Notice that issues with base 1, B=1, crop up twice.  First because you’re adding up nothing, 0=B-1, over and over.  Second because $1+B^{-1}+B^{-2}+B^{-3}+\ldots = \frac{B}{B-1} = \infty$ when B=1.  So don’t use base 1.  There are better things to do.

The excellent pdf about constructing the real numbers was written by this guy.


## Q: How many samples do you need to take to know how big a set is?

The Original Question Was: I have machine … and when I press a button, it shows me one object that it selects randomly. There are enough objects that simply pressing the button until I no longer see new objects is not feasible.  Pressing the button a specific number of times, I take a note of each object I’m shown and how many times I’ve seen it.  Most of the objects I’ve seen, I’ve seen once, but some I’ve seen several times.  With this data, can I make a good guess about the size of the set of objects?

Physicist: It turns out that even if you really stare at how often each object shows up, your estimate for the size of the set never gets much better than a rough guess.  It’s like describing where a cloud is; any exact number is silly.  “Yonder” is about as accurate as you can expect.  That said, there are some cute back-of-the-envelope rules for estimating the sizes of sets witnessed one piece at a time, that can’t be improved upon too much with extra analysis.  The name of the game is “have I seen this before?”.

The situation in question.

Zero repeats

It wouldn’t seem like seeing no repeats would give you information, but it does (a little).

How many times do you have to randomly look at cards before they start to look familiar?

The probability of seeing no repeats after randomly drawing K objects out of a set of N total objects is $P \approx e^{-\frac{K^2}{2N}}$.  This equation isn’t exact, but (for N bigger than ten or so) it’s way too close to matter.

The probability of seeing no repeats after K draws from a set of N=10,000 objects.

The probability is one for K=0 (if you haven’t looked at any objects, you won’t see any repeats), it drops to about 60% for $K=\sqrt{N}$ and about 14% for $K=2\sqrt{N}$.  This gives us a decent rule of thumb: in practice, if you’re drawing objects at random and you haven’t seen any repeats in the first K draws, then there are likely to be at least $K^2$ objects in the set.  Or, to be slightly more precise, if there are N objects, then there’s only about a 60% chance of randomly drawing $\sqrt{N}$ times without repeats.
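To get a feel for how good the rule of thumb is, here is a rough sketch that compares the exact no-repeat probability (a product of shrinking fractions) against the approximation $e^{-\frac{K^2}{2N}}$ derived in the Answer Gravy below.  The function names are invented for this example.

```python
import math

def p_no_repeats_exact(n, k):
    """Exact chance that k uniform draws (with replacement) from n objects are all distinct."""
    p = 1.0
    for j in range(k):
        p *= 1 - j / n   # the (j+1)-th draw must avoid the j objects already seen
    return p

def p_no_repeats_approx(n, k):
    """The approximation exp(-k^2 / (2n))."""
    return math.exp(-k * k / (2 * n))

n = 10_000
for k in (0, 100, 200):   # 0, sqrt(N), and 2*sqrt(N)
    print(k, round(p_no_repeats_exact(n, k), 3), round(p_no_repeats_approx(n, k), 3))
```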

Seeing only a handful of repeats allows you to very, very roughly estimate the size of the set (about the square of the number of times you’d drawn when you saw your first repeats, give or take a lot), but getting anywhere close to a good estimate requires seeing an appreciable fraction of the whole.

Some repeats

So, say you’ve seen an appreciable fraction of the whole.  This is arguably the simplest scenario.  If you’re making your way through a really big set and 60% (for example) of the time you see repeats, then you’ve seen about 60% of the things in the set.  That sounds circular, but it’s not quite.
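A quick simulation shows why it isn’t circular: the repeat rate is something you can measure without knowing N, and it tracks the fraction of the set you’ve seen.  This is a sketch under simple assumptions (uniform random draws with replacement, a fixed seed, and a made-up function name), not a serious estimator.

```python
import random

def repeat_rate_and_coverage(n, draws, window, seed=0):
    """Draw uniformly with replacement; compare the recent repeat rate to the fraction seen."""
    rng = random.Random(seed)        # fixed seed so the run is reproducible
    seen = set()
    was_repeat = []
    for _ in range(draws):
        x = rng.randrange(n)
        was_repeat.append(x in seen) # True if this object had shown up before
        seen.add(x)
    repeat_rate = sum(was_repeat[-window:]) / window   # measurable without knowing n
    coverage = len(seen) / n                           # what we actually want to know
    return repeat_rate, coverage

rate, coverage = repeat_rate_and_coverage(n=10_000, draws=10_000, window=1_000)
print(round(rate, 2), round(coverage, 2))
```

The repeat rate over the last thousand draws lags slightly behind the true coverage, since the set of seen objects is still growing during that window.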

The orbits of 14,000 worrisome objects.

For example, we’re in a paranoia-fueled rush to catalog all of the dangerous space rocks that might hit the Earth.  We’ve managed to find at least 90% of the Near Earth Objects that are over a km across and we can make that claim because whenever someone discovers a new one, it’s already old news at least 90% of the time.  If you decide to join the effort (which is a thing you can do), then be sure to find at least ten or you probably won’t get to put your name on a new one.

All repeats

There’s no line in the sand where you can suddenly be sure that you’ve seen everything in the set.  You’ll find new things less and less often, but it’s impossible to definitively say when you’ve seen the last new thing.

When should you stop looking for something new at the bottom?

It turns out that the probability of having seen all N objects in a set after K draws is approximately $P\approx e^{-Ne^{-\frac{K}{N}}}$, which is both admittedly weird looking and remarkably accurate.  This can be solved for K.

$K \approx N\ln\left(N\right) - N\ln\left(\ln\left(\frac{1}{P}\right)\right)$

When P is close to zero K is small and when P is close to one K is large.  The question is: how big is K when the probability changes?  Well, for reasonable values of P (e.g., 0.1<P<0.9) it turns out that $\ln\left(\ln\left(\frac{1}{P}\right)\right)$ is between about -2.3 and 0.8, so the second term is never more than a few N.  You’re likely to finally see every object at least once somewhere around $N\ln(N)$ draws, give or take a couple of N.  You’ll already know approximately how many objects there are (N), because you’ve already seen (almost) all of them.

The probability of seeing every one of N=1000 objects at least once after K draws.  This ramps up around Nln(N)≈6,900.

So, if you’ve seen N objects and you’ve drawn appreciably more than $K=N\ln(N)$ times, then you’ve probably seen everything.  Or in slightly more back-of-the-envelope-useful terms: when you’ve drawn more than “2N times the number of digits in N” times (because $\ln(N)$ is about 2.3 times the number of digits in N).

Answer Gravy: Those approximations are a beautiful triumph of asymptotics.  First: the probability of seeing every object.

When you draw from a set over-and-over you generate a sequence.  For example, if your set is the alphabet (where N=26), then a typical sequence might be something like “XKXULFQLVDTZAC…”

If you want only the sequences that include every letter at least once, then you start with every sequence (of which there are $N^K$) and subtract all of the sequences that are missing one of the letters.  The number of sequences missing a particular letter is $(N-1)^K$ and there are N letters, so a first guess for the total number of sequences missing at least one letter is $N(N-1)^K$.  But if you remove all the sequences without an A and all the sequences without a B, then you’ve twice removed all the sequences missing both A’s and B’s.  So, those need to be added back.  There are $(N-2)^K$ sequences missing any particular 2 letters and there are “N choose 2” ways to be lacking 2 of the N letters.  We need to add ${N \choose 2} (N-2)^K$ back.  But the same problem keeps cropping up with sequences lacking three or more letters.  Luckily, this is not a new problem, so the solution isn’t new either.

By the inclusion-exclusion principle, the solution is to just keep flipping sign and ratcheting up the number of missing letters.  The number of sequences of K draws that include every letter at least once is $\underbrace{N^K}_{\textrm{any}}-\underbrace{{N\choose1}(N-1)^K}_{\textrm{any but one}}+\underbrace{{N\choose2}(N-2)^K}_{\textrm{any but two}}-\underbrace{{N\choose3}(N-3)^K}_{\textrm{any but three}}\ldots$ which is the total number of sequences, minus the number that are missing one letter, plus the number missing two, etc.  A more compact way of writing this is $\sum_{j=0}^N(-1)^j{N\choose j}(N-j)^K$.  The probability of seeing every letter at least once is just this over the total number of possible sequences, $N^K$, which is

$\begin{array}{rcl}P(all) &=& \frac{1}{N^K}\sum_{j=0}^N(-1)^j {N \choose j} (N-j)^K \\[2mm]&=& \sum_{j=0}^N(-1)^j {N \choose j} \left(1-\frac{j}{N}\right)^K \\[2mm]&=& \sum_{j=0}^N(-1)^j {N \choose j} \left[\left(1-\frac{j}{N}\right)^N\right]^\frac{K}{N} \\[2mm]&\approx& \sum_{j=0}^N(-1)^j {N \choose j} e^{-j\frac{K}{N}} \\[2mm]&=& \sum_{j=0}^N {N \choose j} \left(-e^{-\frac{K}{N}}\right)^j \\[2mm]&=& \sum_{j=0}^N {N \choose j} \left(-e^{-\frac{K}{N}}\right)^j 1^{N-j} \\[2mm]&=& \left(1-e^{-\frac{K}{N}}\right)^N \\[2mm]&=& \left(1-\frac{Ne^{-\frac{K}{N}}}{N}\right)^N \\[2mm]&\approx& e^{-Ne^{-\frac{K}{N}}} \end{array}$

The two approximations are asymptotic and both of the form $e^x \approx \left(1+\frac{x}{n}\right)^n$.  They’re asymptotic in the sense that they are perfect as n goes to infinity, but they’re also remarkably good for values of n as small as ten-ish.  This approximation is actually how the number e is defined.
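If you’d rather trust a computer than an asymptotic argument, here is a sketch that computes the exact inclusion-exclusion sum with big integers and compares it to $e^{-Ne^{-\frac{K}{N}}}$.  The function names are made up for this example, and `math.comb` needs Python 3.8 or newer.

```python
import math

def p_all_exact(n, k):
    """Inclusion-exclusion count of length-k sequences using all n symbols, over n**k."""
    count = sum((-1) ** j * math.comb(n, j) * (n - j) ** k for j in range(n + 1))
    return count / n ** k            # exact integers until this final division

def p_all_approx(n, k):
    """The asymptotic form exp(-n * exp(-k / n))."""
    return math.exp(-n * math.exp(-k / n))

# N = 100 objects, K = 500 draws: the two numbers should land close together.
print(round(p_all_exact(100, 500), 3), round(p_all_approx(100, 500), 3))
```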

This form is simple enough that we can actually do some algebra and see where the action is.

$\begin{array}{rcl} e^{-Ne^{-\frac{K}{N}}} &\approx& P \\[2mm] -Ne^{-\frac{K}{N}} &\approx& \ln(P) \\[2mm] e^{-\frac{K}{N}} &\approx& -\frac{1}{N}\ln\left(P\right) \\[2mm] e^{-\frac{K}{N}} &\approx& \frac{1}{N}\ln\left(\frac{1}{P}\right) \\[2mm] -\frac{K}{N} &\approx& \ln\left(\frac{1}{N}\ln\left(\frac{1}{P}\right)\right) \\[2mm] -\frac{K}{N} &\approx& -\ln\left(N\right) +\ln\left(\ln\left(\frac{1}{P}\right)\right) \\[2mm] K &\approx& N\ln\left(N\right) - N\ln\left(\ln\left(\frac{1}{P}\right)\right) \\[2mm] \end{array}$
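The algebra above can be sanity-checked by plugging the solved-for K back into the probability.  A minimal sketch (function names invented here), using N=1000:

```python
import math

def p_seen_all(n, k):
    """Approximate chance that k draws have shown all n objects."""
    return math.exp(-n * math.exp(-k / n))

def draws_needed(n, p):
    """Invert the approximation: the k at which p_seen_all(n, k) equals p."""
    return n * math.log(n) - n * math.log(math.log(1 / p))

n = 1000
for p in (0.1, 0.5, 0.9):
    k = draws_needed(n, p)
    # The last column should reproduce p, up to rounding.
    print(p, round(k), round(p_seen_all(n, k), 3))
```

For N=1000 the answers all sit within a few thousand draws of $N\ln(N)\approx 6{,}900$, which is the “ramp” in the plot.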

Now: the probability of seeing no repeats.

The probability of seeing no repeats on the first draw is $\frac{N}{N}$, in the first two it’s $\frac{N(N-1)}{N^2}$, in the first three it’s $\frac{N(N-1)(N-2)}{N^3}$, and after K draws the probability is

$\begin{array}{rcl} P(no\,repeats) &=& \frac{N(N-1)\cdots(N-K+1)}{N^K} \\[2mm] &=& 1\left(1-\frac{1}{N}\right)\left(1-\frac{2}{N}\right)\cdots\left(1-\frac{K-1}{N}\right) \\[2mm] &=& \prod_{j=0}^{K-1}\left(1-\frac{j}{N}\right) \\[2mm] \ln(P) &=& \sum_{j=0}^{K-1}\ln\left(1-\frac{j}{N}\right) \\[2mm] &\approx& \sum_{j=0}^{K-1} -\frac{j}{N} \\[2mm] &=& -\frac{1}{N}\sum_{j=0}^{K-1} j \\[2mm] &\approx& -\frac{1}{N}\frac{1}{2}K^2 \\[2mm] &=& -\frac{K^2}{2N} \\[2mm] P &\approx& e^{-\frac{K^2}{2N}} \\[2mm] \end{array}$

The approximations here are $\ln(1+x)\approx x$, which is good for small values of x, and $\sum_{j=0}^{K-1} j \approx \frac{1}{2}K^2$, which is good for large values of K.  If K is bigger than ten or so and N is a hell of a lot bigger than that, then this approximation is remarkably good.
