Mathematics and Computation

August 13, 2008

Intuitionistic mathematics for physics

Filed under: Constructive math, Tutorial — Andrej Bauer @ 09:35

At MSFP 2008 in Iceland I chatted with Dan Piponi about physics and intuitionistic mathematics, and he encouraged me to write down some of the ideas. I have little, if anything, original to say, so this seems like an excellent opportunity for a blog post. So let me explain why I think intuitionistic mathematics is good for physics.

Intuitionistic mathematics, whose main proponent was L.E.J. Brouwer, is largely misunderstood by mathematicians. Consequently, physicists have strange ideas about it, too. For example, David Deutsch somehow managed to write in his otherwise excellent popular science book “The Fabric of Reality” that intuitionists deny existence of infinitely many natural numbers (those would be the ultrafinitists, if there are any). He also produced rather silly arguments against intuitionistic mathematics, which I explained to myself by believing that he never had a chance to learn that intuitionistic mathematics supports his point of view.

While Brouwer’s and other preintuitionists’ reasons for intuitionistic mathematics were philosophical in nature, there is today a vibrant community of mathematicians, logicians, computer scientists, and even the odd physicist, who work with intuitionistic mathematics not because of their philosophical conviction but because it is simply the right kind of math for what they are doing.

Intuitionistic understanding of truth

A common obstacle in understanding intuitionistic logic is the opinion that the difference between classical and intuitionistic logic arises because classicists and intuitionists just happen to disagree about what is true. A typical example of this is the principle known as Proof by Contradiction:

For every proposition `phi`, if `phi` is not false then `phi` is true.

With a formula we write this as

`forall phi in Prop, not not phi => phi`.

Classical mathematicians accept it as true. Intuitionists do not accept it, but neither do they claim it is false. In fact, they claim that the principle has no counterexamples, that is

`not exists phi in Prop, not (not not phi => phi)`.

This becomes very confusing for classical mathematicians who think that the two displayed formulae are equivalent, because they believe in Proof by Contradiction. It is like believing that the Earth is flat while trying to make sense of Kepler’s Laws of planetary motion.

The difference between intuitionistic and classical logic is in the criteria for truth, i.e., what evidence must be provided before a statement is accepted as true. Speaking vaguely, intuitionistic logic demands positive evidence, while classical logic is happy with lack of negative evidence. The intuitionist view is closer to the criterion of truth in science, where we normally confirm a statement with an experiment (positive evidence), but this analogy should not be taken too far.

What counts as “evidence” is open to interpretation. Before I describe the three most common ones below, let me just explain the difference between `phi` (”`phi` is true”) and `not not phi` (”`phi` is not false”). Intuitionistically:

  • `phi` holds if there is positive evidence supporting it,
  • `not phi` holds if it is contradictory to assume `phi`, that is to say, evidence of `phi` would entail a contradiction.
  • `not not phi` holds if it is contradictory to assume that it is contradictory to assume `phi`.

That is a bit complicated. In essence, it says that `not not phi` is accepted when there is no evidence against it. In other words, `not not phi` means something like “`phi` cannot be falsified” or “`phi` is potentially true”. For example, if someone says

“There is a particle which does not interact with anything in the universe.”

that would be a statement which is not accepted as true, for how would you ever present positive evidence? But it is accepted as potentially true, for how would you ever falsify it?

A statement which is logically equivalent to one of the form `not not phi` is called doubly negated. For the purposes of this post I shall call a statement `phi` potentially true if its double negation `not not phi` is true. It seems nontrivial to come up with useful statement in physics which are only potentially true (but see the discussion about infinitesimals below). Perhaps Karl Popper would have something to say about that.

Let me now describe three most common interpretations of “evidence” in intuitionistic logic.

Computational interpretation

This is the interpretation of intuitionistic logic commonly presented in computer science. We view all sets as represented by suitable data structures—a reasonable point of view for a computer scientist. Then a statement is taken to be true if there exists a program (computational evidence) witnessing its truth. To demonstrate the idea, consider the statement

`forall x in A, exists y in B, phi(x, y)`.

This is taken to be true if there exists a program which accepts `x` and outputs `y` together with computational evidence that `phi(x,y)` holds. Another example: the statement

`forall x in A, phi(x) or psi(x)`

is true if there exists a program which takes `x` as input and outputs either `0` and evidence of `phi(x)`, or `1` and evidence of `psi(x)`. In other words, the program is a decision procedure which tells us which of the two disjuncts holds, and why. Under this interpretation the Law of Excluded Middle fails because there are unsolvable decision problems, such as the Halting problem.

The computationally minded readers might entertain themselves by figuring out a computational explanation of potentially true statements (take double negation to mean “continuation”). I have not done it myself.

Topological interpretation

We may replace the phrases “data structure” and “program” in the computational interpretation by “topological space” and “continuous function”, respectively. Thus a statement is true if it is witnessed by a continuous function which transforms input (hypotheses) to output (conclusions).

The basis for this explanation may be found in physics if we think about what it means for a function to be continuous in terms of communication or information processing. Suppose an observer wants to communicate a real-valued quantity `x` to another observer. They can do it in many ways: by making sounds, by sending electromagnetic signals, by sending particles from one place to another, by manufacturing and sending a stick of length `x` by mail, etc. However, as long as they use up only a finite amount of resources (time, space, energy) they will be able to communicate only a finite amount of information about `x`. Similarly, in any physical process (computer, brain, abacus) which transforms an input value `x` to an output value `f(x)` the rate of information flow is finite. Consequently, in finite time the process will obtain only a finite amount of information about `x`, on the basis of which it will output a finite amount of information about `f(x)`. This is just the definition of continuity of `f` phrased in terms of information flow rather than `epsilon` and `delta`. Notice that we are not assuming that `f` is computable because we do not want to make the rather sweeping assumption that all physical processes are computable.

The conclusion is that “all functions are continuous”, including those that witness truth of statements.

You might be thinking that an analog-to-digital converter is a counterexample to the above argument. It is a device which takes as input an electric signal and outputs either 0 or 1, depending on whether the voltage of the signal is below or above a given threshold. Indeed, this would be a discontinuous function, if only such converters worked exactly. But they do not, they always have a tolerance level, and the manufacturer makes no guarantees about it working correctly very close to the threshold value.

A useful exercise is to think about the difference between “all functions are continuous”, “potentially all functions are continuous”, and “all functions are potentially continuous”. Which one does the above argument about finite rate of information processing support?

Local truth

This explanation of intuitionistic logic is a bit more subtle, but also much more powerful and versatile. It is known by categorical logicians as the Kripke-Joyal or sheaf semantics, while most logicians are familiar at least with the older Kripke semantics.

Imagine a planet and a meteorologist at each point of the surface, measuring the local temperature `T`. We assume that `T` varies continuously with position. A statement such as `T > 273` is true at some points of the planet and false at others. We say that it is locally true at `x` if there exists a small neighborhood around `x` where it is true. In other words, a statement is locally, or stably true at a given point if it remains true when we perturb the point a little.

On this planet a statement is globally true if it is locally true everywhere, and it is globally false if its negation is locally true everywhere. There are also many intermediate levels of truth. The truth value (a measure of truth) of a statement is the set of those points at which the statement is locally true. Such a set is always open.

The explanation so far is a bit wrong. For a statement to be locally true at `x`, not only must it be true in a neighborhood of `x`, but it must also be true everywhere in the neighborhood “for the same reason”. For example, the statement

`T > 273` or `T <= 273`

is true at `x` if there exists a neighborhood `U` of `x` such that `T > 273` everywhere on `U`, or `T <= 273` everywhere on `U`. The reason, namely which of the two possibilities holds, must be the same everywhere on `U`.

The truth value of `T = 273` is the interior of the set of those points at which `T` equals 273, while the truth value of `T != 273` is the exterior of the set of those points at which `T` equals 273. Thus the truth value of the disjunction

`T = 273 or T != 273`

need not be the entire planet—it will miss isolated points at which `T` is 273. The Law of Excluded Middle is not valid.

By changing the underlying space and topology, we can express various notions of truth. We can, for example, incorporate passage of time, or a universe branching into possible worlds. In the most general case the underlying space need not even be a space, but a category with a so-called Grothendieck topology which determines what “locally” means.

Apart from being a wonderful mathematical tool, it should be possible to use sheaf semantics to clarify concepts in physics. I would expect the notions of “truth stable under small perturbation” and “truth local to an observer” to appeal to physicists. Fancy kinds of sheaf semantics have been proposed to explain features of quantum mechanics, see for example this paper by Bas Spitters and his coworkers.

Smooth infinitesimal analysis

Philosophical explanations and entertaining stories about intuitionistic mathematics are one thing, but getting actual benefits out of it are another. For physicists this means that they will want to calculate things with it. The good news is that they are already doing it, they just don’t know it!

There is something odd about how physicists are taught mathematics—at least in my department. Physics majors learn the differential and integral calculus in the style of Cauchy and Weierstrass, with `epsilon`–`delta` definitions of continuity and differentiability. They are told by math professors that it is a sin to differentiate a non-differentiable function. They might even be told that the original differential and integral calculus, as invented by Leibniz and Newton, was flawed because it used the unclear concept of infinitesimals, which were supposed to be infinitely small yet positive quantities.

Then these same students go to a physics class in which a physics professor never performs `epsilon`-`delta` calculations, freely differentiates everything in sight, and tops it off by using the outlawed infinitesimals to calculate lots of cool things. What are the students supposed to think? Clearly, the “correct” mathematics is useless to them. It’s a waste of time. Why aren’t they taught mathematics that gives a foundation to what the physics professors are actually doing? Is there such math?

Yes there is. It’s the mathematics of infinitesimal calculus, brought forward to the 20th century by Anders Kock and Bill Lawvere under the name Synthetic Differential Geometry (SDG), or Smooth Infinitesimal Analysis. (I am too young to know exactly who invented what, but I’ve heard people say that Eduardo Dubuc also played a part. I would be happy to correct bibliographical omissions on my part.) By the way, I am not talking about Robinson’s non-standard analysis, which uses classical logic.

This is not the place to properly introduce synthetic differential geometry. I will limit myself to a few basic ideas and results. For a first reading I highly recommend John Bell’s booklet A Primer of Infinitesimal Analysis. If you refuse to read physical books, you may try his shorter An Invitation to Smooth Infinitesimal Analysis online. For further reading Anders Kock’s Synthetic differential geometry is an obvious choice (available online!), and there is also Moerdijk and Reyes’s Models of smooth infinitesimals analysis, which shows in detail how to construct models of SDG using sheaves of germs of smooth functions.

To get a feeling for what is going on, and why intuitionistic logic is needed, let us review the usual proof that infinitesimals do not exist. This requires a bit of logical nitpicking, so bare with me. Both intuitionistic and classical mathematics agree that there is no real number `x` which is neither negative, nor zero, nor positive:

`not exists x in R, not (x < 0) and not (x = 0) and not (x > 0)`.

(There is some disagreement as to whether every number is either negative, zero, or positive, but that is beside the point right now.) A nilpotent infinitesimal of second degree, or just infinitesimal for short, is a real number `dx` whose square is zero. Any such `dx` is neither negative nor positive, because both `dx > 0` and `dx < 0` imply `dx^2 > 0`, which contradicts `dx^2 = 0`. If `dx` were also non-zero, we would have a number which is neither negative, zero, nor positive. Thus we proved that an infinitesimal cannot be non-zero:

`dx^2 = 0 => not not (dx = 0)`.

A classical mathematician will now conclude that `dx = 0` by applying Proof by Contradiction. Intuitionistically we have only shown that infinitesimals are potentially equal to zero.

But are there any infinitesimals which are actually different from zero? It can be shown from the main axiom of SDG (see below) that non-zero infinitesimals potentially exist. It is a confusing world: on one hand all infinitesimals are potentially zero, but on the other non-zero ones potentially exist. Like all good things in life, intuitionistic mathematics is an acquired taste (and addictive).

Can a physicist make sense of all this? We may think of infinitesimals as quantities so small that they cannot be experimentally distinguished from zero (they are potentially zero), but neither can they be shown to all equal zero (potentially there are some non-zero ones). By the way, we are not talking about lengths below Planck length, as there are clearly reals numbers smaller than `1.6 * 10^(-35)` whose square is positive.

The actual axiom which gets the infinitesimal calculus going does not explicitly state anything about non-zero infinitesimals. Instead, it expresses the principle of micro-affinity (sometimes called micro-linearity) that physicists use in their calculations.

Principle of micro-affinity: An infinitesimal change in the independent variable `x` causes an affine (linear) change in the dependent variable `y = f(x)`.

More precisely, if `f : R -> R` is any function, `x in R` and `dx` is an infinitesimal, then there exists a unique number `f’(x)`, called the derivative of `f` at `x`, such that `f(x + dx) = f(x) + f’(x) dx`. This principle has many consequences, such as potential existence of non-zero infinitesimals described above. For actual calculations the most important consequence is

Law of cancellation: If `a` and `b` are real numbers such that `a * dx` = `b * dx` for all infinitesimals `dx` then `a = b`.

What this says is that we may cancel infinitesimals when they are arbitrary. This is important because infinitesimals do not have inverses (they are potentially zero). Nevertheless, we may cancel them in an equation, as long as they are arbitrary.

Let me show how this works in practice by calculating the derivative of `f(x) = x^2`. For arbitrary infinitesimal `dx` we have

`f’(x) * dx = f(x + dx) - f(x) = (x + dx)^2 - x^2 = x^2 + 2 x * dx + dx^2 - x^2 = 2 x * dx`

where we used the fact that `dx^2 = 0`. Because `dx` is arbitrary, we may cancel it on both sides and get `f’(x) = 2 x`. I emphasize that this is a mathematically precise and logically correct calculation. It is in fact very close to the usual treatment which goes like this:

`f’(x) = (f(x+dx) - f(x))/dx = (x^2 + 2 x * dx - dx^2 - x^2)/dx = 2 x + dx = 2 x`

There are two incorrect steps here: we divided by an infinitesimal `dx` without knowing that it is different from zero (it isn’t!), and we pretended that `2 x + dx` is equal to `2 x` because “`dx` is very small”. By the same reasoning we should have concluded that `f(x+dx) - f(x) = f(x) - f(x) = 0`, but we did not. Why?

The principle of micro-affinity allows us to easily derive the usual rules for computing derivatives, the potential existence of non-zero infinitesimals, prove the fundamental theorem of calculus in two lines, derive the wave equation like physicists do it, etc. And it is all correct, exact math. No approximations, no guilty feeling about throwing away “negligible terms” here but not there, and other hocus-pocus that physicists have to resort to because nobody told them about this stuff.

Just for fun, let me compute more derivatives. The general strategy in computing `f’(x)` is to consider an arbitrary infinitesimal `dx` and express `f’(x) * dx = f(x + dx) - f(x)` as a quantity multiplied by `dx`. Then we cancel `dx` on both sides and get `f’(x)`. Throughout we use the fact that `dx^2 = 0`. Here we go:

  • The derivative of `x^n` is `n * x^(n-1)`:

    `(x+dx)^n - x^n = x^n + n x^(n-1) * dx - x^n = n x^(n-1) * dx`

  • Leibniz’s formula for derivatives of products `(f(x)*g(x))’ = f’(x) * g(x) + f(x) * g’(x)`:

    `f(x+dx) * g(x+dx) - f(x) * g(x) = `
    `(f(x) + f’(x) * dx) (g(x) + g’(x) * dx) - f(x) * g(x) =`
    `(f’(x) g(x) + f(x) * g’(x)) * dx`.

  • Chain rule `f(g(x))’ = f’(g(x)) * g’(x)`

    `f(g(x+dx)) - f(g(x)) =`
    `f(g(x) + g’(x) * dx) - f(g(x)) =`
    `f(g(x)) + f’(g(x)) * g’(x) * dx - f(g(x)) =`
    `f’(g(x)) * g’(x) * dx`

    where we used the fact that `g’(x) * dx` is infinitesimal because its square is zero.

There you have it, in a paragraph we derived precisely and in sufficient detail what usually takes a whole lecture of `epsilon`–`delta` manipulations.

If we stick to classical logic, the Principle of micro-affinity is false. To see this, consider a function with a jump, such as

`j(x) = 0`    if `x < 0`
`j(x) = 1`    if `x >= 0`

At `x = 0` the principle of micro-affinity fails. This is a counterexample only in classical mathematics because intuitionistically we cannot prove that there is a function with a jump. Concretely, the above definition of `j(x)` is not intuitionistically valid because it presupposes `forall x in R, x < 0 or x >= 0`.

Space-time anomalies

But wait! Intuitionistically we can construct non-differentiable continuous functions, such as the absolute value `f(x) = |x|`, for which the principle of micro-affinity fails, too. Well, I am not telling the whole story. The smooth real line `R` of infinitesimal analysis is not the usual real line `RR`, as constructed by Richard Dedekind. It does not support computation of absolute values.

This seems pretty bad. If we cannot have a function such as the absolute value, then it is not clear how to model phenomena that involve sudden change of direction, such as reflection of light and collision of particles. Can rays not bounce of mirrors, intuitionistically? Yes they can, it is just that the intuitionistic treatment of sudden changes is more profound than the classical one.

Consider a particle which moves freely up to time `t_0`, then bounces off a wall, and moves freely after that. Its position `p` is described as a function of time `t` in two parts,

`p(t) = p_1(t)`    if `t <= t_0`
`p(t) = p_2(t)`    if `t >= t_0`

where `p_1` and `p_2` are smooth functions, and `p_1(t_0) = p_2(t_0)`. Because `p` is defined separately for `t <= t_0` and for `t >= t_0`, its domain of definition is the union of two half-lines

`D = { t in R  |  t <= t_0 } uu { t in R  |  t >= t_0 }`.

Classical mathematics proves that `D = R`, which amounts to forgetting that `t_0` is a special moment. In the smooth world, `D` is only a subset of `R`, but is not equal to `R` because it carries more information than `R`. As strange as this may seem, it is useful because it encodes moments in time or places in space where special things happen, such as sudden change of movement or sudden change of density. Smooth space-time, say `R^4`, allows only smooth motion and smooth distribution of mass. If we place non-smooth mass in it, the space will change to a subset of `R^4` which carries additional information about the anomalies contained in it.

This post has become very long so I will stop here.

10 Comments »

  1. First of all, great post!

    I had encountered intuitionistic maths before and thought “more mathematical logicians playing games with definitions”. As I did more computer science I could see why one would like to use constructive maths but the mathematician in me thought proof by contradiction was too large a sacrifice to make.

    Recently, I read about a non-constructive proof that there are irrational and b such that ab is rational. This pushed me further away from the non-constructive viewpoint.

    I think the treatment of infinitesimals you describe to has sent me over the line to the intuitionist camp. Furthermore, it provides a logically sound basis to automatic differentiation using dual numbers - exactly the nilpotent infinitesimals of second degree you talk about.

    That said, I’m still uncomfortable with the interpretation of the absolute value function. Could you please expand on this some more? Does it have any relationship with your earlier post in which you talk about all computable functions being continuous?

    Comment by mark — August 14, 2008 @ 09:34

  2. Answer to mark’s question about absolute value: suppose we had an operation which assigned to every number `x` a number `|x|` such that `x < 0 => |x| = -x` and `x > 0 => |x| = x`, where `a <= b` is defined as `not (b < a)`. Above we proved that for infinitesimal `dx` we have `0 <= dx <= 0`, therefore both `|dx| = dx` and `|dx| = -dx` from which it follows that `dx = 0`. So, if absolute value exists then all infinitesimals are zero. But this contradicts the principle of micro-affinity, because if all infinitesimals are zero, then for all infinitesimal `dx`

    `0 * dx = 0 = 1 * dx`

    and now by the law of cancellation we get `0 = 1`.

    But this is nothing to despair about, because the absolute value exists as a function `{x in R  |  x <= 0 or 0 <= x} -> R`, defined by

    `|x| = x`     if `x >= 0`
    `|x| = -x`     if `x <= 0`

    It’s just that you need to know the sign of `x` in order to compute its absolute value.

    Comment by Andrej Bauer — August 14, 2008 @ 10:39

  3. Having read the invitation by Bell that you link to I can see why things appear strange to me. In particular, the notion of the “indecomposability” of the intuitionist’s reals is what appears to be behind the lack of discontinuities. Furthermore, you cannot define the absolute value function in the usual way because that presupposes being able to split the reals at 0.

    My ideas of continuity are tied up with the point set topology I studied many years ago. I think I will have to rethink several basic definitions, such as open and closed sets, in light of intuitionist logic to better understand what’s going on here.

    Comment by mark — August 19, 2008 @ 02:27

  4. [...] Great Intro to Intuitionistic Logic Jump to Comments At Mathematics and Computation, there’s a really good accessible introduction to intuitionistic logic called Intuitionistic Logic for Physics. [...]

    Pingback by Great Intro to Intuitionistic Logic « XOR’s Hammer — August 19, 2008 @ 17:20

  5. [...] a great post at Mathematics and Computation on Intuitionist Mathematics and Physics. (HT: Philosophy, Science, and Method) I always found Intuitionism a bit vague in some key places [...]

    Pingback by Intuitionist Mathematics and Physics : Mormon Metaphysics — September 6, 2008 @ 02:42

  6. Is it fair to say “non-zero infinitesimals potentially exist” (which I would parse as ¬¬∃dx∈R [¬(dx = 0) ∧ dx^2 = 0])? I would actually take this to be provably false (as an intuitionistic field, non-zeros should be the same as invertibles, and thus be closed under squaring). It seems the more accurate statement would be “not all infinitesimals are zero” (which I would parse as ¬∀dx∈R [dx^2 = 0 => dx = 0]). But perhaps I should just use a different translation from ordinary language to formal intuitionistic logic…

    Comment by Sridhar Ramesh — September 19, 2008 @ 07:15

  7. Answer to Sridhar: it is not true that in an intuitionistic field non-zeros coincide with invertibles. In intuitionistic algebra we use apartness relation, which is a constructivized version of non-equality, rather than non-equality itself. Thus the axiom about invertible elements in the field is “if `x` is apart from zero, then it is invertible” (and not “if `x` is not zero, then it is invertible).

    Concretely, in an ordered field apartness is defined as: `x` and `y` are apart iff `x < y` or `x > y`. So you can only invert an element if you know that it is negative or positive. It does not generally follow that every non-zero element is negative or positive (I actually mention this issue in the post above). I hope that answers your question.

    In the case of smooth reals we can show that for every infinitesimal `x` we have “`x` is not apart from zero”. So you cannot have an invertible infinitesimal (although there are variants of the axioms of smooth analysis which give you that).

    Comment by Andrej Bauer — September 19, 2008 @ 09:29

  8. Hm, I see. But it’s still not clear to me how to derive the potential existence of non-zero infinitesimals from the principle of microaffinity. Could you expand on that?

    For example, taking the formalization of this statement as I stated before, `not not exists dx in R, (not (dx = 0) and dx^2 = 0)`, this should be intuitionistically equivalent to `not forall dx in R, not (not (dx = 0) and dx^2 = 0)`, which in turn should be equivalent to `not forall dx in R, (dx^2 = 0 => not not (dx = 0))`, but this is the negation of what you prove above. Where am I going wrong?

    Comment by Sridhar Ramesh — September 19, 2008 @ 10:15

  9. Also, my impression that the relevant notion of “intuitionistic field” was one in which “¬(x = 0) iff x is invertible” came from, for example, the last field axiom in section R_1 of p. 102 of Bell’s “A Primer of Infinitesimal Analysis”. Of course, I understand now that there are also other reasonable conceptions of intuitionistic fields, but this would seem to be the one at use in at least Bell’s formulation of smooth infinitesimal analysis.

    Comment by Sridhar Ramesh — September 20, 2008 @ 03:13

  10. Hm, somehow, my math symbols have turned garbled in some (but not all) of my above comments, even though they were fine before. Odd…

    Comment by Sridhar Ramesh — September 27, 2008 @ 04:07

RSS feed for comments on this post. TrackBack URI

Leave a comment

Powered by WordPress

Listed on BlogShares