Proof of negation and proof by contradiction

I am discovering that mathematicians cannot tell the difference between “proof by contradiction” and “proof of negation”. This is so for good reasons, but conflation of different kinds of proofs is bad mental hygiene which leads to bad teaching practice and confusion. For reference, here is a short explanation of the difference between proof of negation and proof by contradiction.

By the way, this post is something I have been meaning to write for a while. It was finally prompted by Timothy Gowers’s blog post “When is proof by contradiction necessary?” in which everything seems to be called “proof by contradiction”.

As far as I can tell, “proof by contradiction” among ordinary mathematicians means any proof which starts with “Suppose …” and ends with a contradiction. But two kinds of proofs are like that:

Proof of negation is an inference rule which explains how to prove a negation:

To prove $\lnot \phi$, assume $\phi$ and derive absurdity.

The rule for proving negation is the same classically and intuitionistically. I mention this because I have met ordinary mathematicians who think intuitionistic proofs are never allowed to reach an absurdity.

Proof by contradiction, or reductio ad absurdum, is a different kind of animal. As a reasoning principle it says:

To prove $\phi$, assume $\lnot \phi$ and derive absurdity.

As a proposition the principle is written $\lnot \lnot \phi \Rightarrow \phi$, which can be proved from the law of excluded middle (and is in fact equivalent to it). In intuitionistic logic this is not a generally valid principle.

Admittedly, the two reasoning principles look very similar. A classical mathematician will quickly remark that we can get either of the two principles from the other by plugging in $\lnot \phi$ and cancelling the double negation in $\lnot \lnot \phi$ to get back to $\phi$. Yes indeed, but the cancellation of double negation is precisely the reasoning principle we are trying to get. These really are different.

I blame the general confusion on the fact that an informal proof of negation looks almost the same as an informal proof by contradiction. In order to prove $\lnot \phi$ a mathematician will typically write:

“Suppose $\phi$. Then … bla … bla … bla, which is a contradiction. QED.”

In order to prove $\phi$ by contradiction a mathematician will typically write:

“Suppose $\lnot \phi$. Then … bla … bla … bla, which is a contradiction. QED.”

The difference will be further obscured because the text will typically state $\lnot \phi$ in an equivalent form with negation pushed inwards. That is, if $\phi$ is something like $\exists x, \forall y, f(y) < x$ and the proof goes by contradiction then the opening statement will be “Suppose for every $x$ there were a $y$ such that $f(y) \geq x$.” With such “optimizations” we really cannot tell what is going on by looking just at the proof. We have to take into account the surrounding context (such as the original statement being proved).

A second good reason for the confusion is the fact that both proof principles feel the same when we try to use them. In both cases we assume something believed to be false and then we hunt down a contradiction. The difference in placement of negations is not easily appreciated by classical mathematicians because their brains automagically cancel out double negations, just like good students automatically cancel out double negation signs.

Keeping all this in mind, let us look at Timothy Gower’s blog examples.

Irrationality of $\sqrt{2}$

The first example is irrationality of $\sqrt{2}$. Because “$\sqrt{2}$ is irrational” is by definition the same as “$\sqrt{2}$ is not rational” we are clearly talking about a proof of negation. There is a theorem about normal forms of proofs in intuitionistic logic which tells us that every proof of a negation can be rearranged so that it ends with the inference rule cited above. In this sense the method of proof “assume $\sqrt{2}$ is rational, …, contradiction” is unavoidable.

I want to make two further remarks. The first one is that the usual proof of irrationality of $\sqrt{2}$ is intuitionistically valid. Let me spell it out:

Theorem: $\sqrt{2}$ is not rational.

Proof. Suppose $\sqrt{2}$ were equal to a fraction $a/b$ with $a$ and $b$ relatively prime. Then we would get $a^2 = 2 b^2$, hence $a^2$ is even and so is $a$. Write $a = 2 c$ and plug it back in to get $2 c^2 = b^2$, from which we conclude that $b$ is even as well. This is a contradiction since $a$ and $b$ were assumed to be relatively prime. QED.

No proof by contradiction here!

My second remark is that this particular example is perhaps not good for discussing proofs of negation because it reduces to inequality of natural numbers, which is a decidable property. That is, as far as intuitionistic logic is concerned, equality and inequality of natural numbers are both equally “positive” relations. This is reflected in various variants of the proof given by Gowers on his blog, some of which are “positive” in nature.

The situation with reals is different. There we could define the so-called apartness relation $x \# y$ to mean $x < y \lor y < x$. The negation of apartness is equality, but the negation of equality is not apartness, at least not intuitionistically (classically of course this whole discussion is a triviality). A proof of inequality $x \neq y$ of real numbers $x$ and $y$ may thus proceed in two ways:

  1. The direct way: assume $x = y$ and derive absurdity
  2. Via apartness: prove $x \# y$ and conclude that $x \neq y$

Note that the proof of $x \# y \Rightarrow x \neq y$ still involves the usual proof of negation in which we assume $x \# y \land x = y$ and derive absurdity.

A continuous map on $[0,1]$ is bounded

The second example is the statement that a continuous map $f : [0,1] \to \mathbb{R}$ is bounded. The direct proof uses the Heine-Borel property of the closed interval to find a finite cover of $[0,1]$ such that $f$ is bounded on each element of the cover. There is also a proof by contradiction which goes as follows:

Suppose $f$ were unbounded. Then we could find a sequence $(x_n)_n$ in $[0,1]$ such that the sequence $(f(x_n))_n$ is increasing and unbounded (this uses Countable Choice, by the way). By Bolzano-Weierstras there is a convergent subsequence $(y_n)_n$ of $(x_n)_n$. Because $f$ is continuous the sequence $(f(y_n))_n$ is convergent, which is impossible because it is a subsequence of the increasing and unbounded sequence $(f(x_n))_n$. QED.

Can we turn this proof into one that does not use contradiction (but still uses Bolzano-Weierstrass)? Constructive mathematicians are well versed in doing such things. Essentially we have to look at the supremum of $f$, like Timothy Gowers does, but without actually referring to it. The following proof is constructive and direct.

Theorem: If every sequence in a separable space $X$ has a convergent subsequence, then every continuous real map on $X$ is bounded.

Proof. Let $(x_n)_n$ be a dense sequence in $X$ and $f : X \to \mathbb{R}$ continuous. For every $n$ there is $k$ such that $f(x_k) \geq \max(f(x_1), …, f(x_n)) – 1$. By Countable Choice there is a sequence $(k_n)_n$ such that $f(x_{k_n}) \geq \max(f(x_1), …, f(x_n)) – 1$ for every $n$. Let $(z_n)_n$ be a convergent subsequence of $(x_{k_n})_n$ and let $z$ be its limit. Because $f$ is continuous there is $d > 0$ such that $f(z_n) \leq f(z) + d$ for all $n$. Consider any $x \in X$. Because $f$ is continuous and $(x_n)_n$ is dense there is $x_i$ such that $f(x) \leq f(x_i) + 1$. Observe that there is $j$ such that $f(x_{k_i})  - 1 \leq f(z_j)$. Now we get $$f(x) \leq f(x_i) + 1 \leq \max(f(x_1), …, f(x_i)) + 1 \leq f(x_{k_i}) + 2 \leq f(z_j) + 3 < f(z) + d + 3.$$ We have shown that $f(z) + d + 3$ is an upper bound for $f$. QED.

I am pretty sure with a bit more work we could show that $f$ attains its supremum, and in fact this must have been proved by someone constructively.

The moral of the story is: proofs by contradiction can often be avoided, proofs of negation generally cannot, and if you think they are the same thing, you will be confused.

17 thoughts on “Proof of negation and proof by contradiction

  1. Good post, Andrej. I was thinking of responding to Gowers myself, but I could not have done it as well as you’ve just done. Thanks!

    A related point, which is implicit in your discussion, is that constructively there are more distinctions to be made than classically, so that one can often “constructivize” a classical theorem in many different ways. I think it was Bishop who said, heuristically, that one can either use the same premises and derive a constructively weaker conclusion, or strengthen the premises and derive the same conclusion. I think it’s not so simple as all that, but it’s a helpful guide. An example that arises in basic algorithms is what it means for a finite graph to be non-planar: it could be that it is not planar or it could be that one can embed K_5 or K_{3,3} in it. Classically it’s the same thing, but constructively they have content. Richer examples abound.

  2. Can you say more about the planarity examples? I would naively assume that with good definitions the theorem must be the same. For example, I would take graphs to be decidable (decidable equality of vertices and edges, decidable neighborhood relation) and embeddings into the plane “nice” (uniformly continuous should do). Can’t we recover the original theorem “planarity is equivalent to embedding `K_5` or `K_{3,3}`” under these conditions?

  3. The classical theorem, as I remember, is “non-planarity is equivalent to embedding `K_5` or `K_{3,3}`” — an equivalence between a negative and a positive condition.

    The most natural ways to think of this classically are surely as `(not P) => E`, `P <=> not E`, or the conjunction of the two. But each of these implies that at least one of `P`, `E` is (equivalent to) a negation, and hence equivalent to its own double negation. But I don’t think we should be able to get either of those here.

    If the property `E`, “embeddability of `K_5` or `K_{3,3}`” is equivalent to its double negation, then I think excluded middle follows, by constructing (given a statement `p`) a graph which has a `K_5` if `p` holds and a `K_{3,3}` if `p` fails. (More concretely, one can give a counterexample graph in presheaves on the poset “V”.) And I suspect something similar should be do-able for planarity, though I don’t immediately see it…

  4. I can believe that allowing “fuzzy” graphs gets you into trouble. But I would say that the correct constructive definition of finite structures from discrete mathematics should require decidability all around (decidable equality, decidable relations, decidable subsets, etc). So, essentially discrete math can be coded into arithmetic.

  5. It’s quite possible that you’re making a distinction where none exists: what classical mathematicians call “proof by contradiction” is exactly what constructivists call “proof of a negation”. The only difference is that constructivists encode this distinction in the statement of the thesis by choosing either the positive or the negative form (and considering the latter to be logically weaker than the former) whereas classical mathematicians consider positive and negative forms to be entirely interchangeable.

    It should be noted that one can “refutivize” a proof of a negative statement (turn it into a constructive refutation) just as easily as one can constructivize a classical proof by contradiction.

  6. @guest: I am not inventing anything here, this is all standard logic. Please consult a textbook on logic because what you are saying is wrong. Even classical mathematicians do not identify a proposition with its double negation, they just take them to be equivalent. And even in classical logic this equivalence is proved from the Law of Excluded Middle.

  7. Please consult a textbook on logic because what you are saying is wrong.

    It is very possibly ‘wrong’ from a proof-theoretical point of view, but most classical mathematicians do not reason like this; they take a denotational and model-theoretic perspective where a proposition stands for either truth or falsity, so identifying equivalent propositions is entirely justified. And even proof-theoretically, any proposition can be replaced by an equivalent statement plus a proof that the latter implies the former; and if the implication is obvious enough it will be elided in an informal proof.

    The moral: minimizing proof by contradiction is a good policy, and this extends to searching for positive refutations of ‘negative’ statements. But sometimes avoiding proof by contradiction is impossible or there’s no compelling case for a direct proof, and here constructive mathematicians must either use negation signs, or (implicitly) punt to classical mathematicians and translate classical math to the negative fragment of constructive logic.

  8. You are confusing propositions with their meaning and provability with validity. There might be some mathematicians who think that their perspective is “model-theoretic”, but they still write proofs in their papers, don’t they? You cannot avoid proofs in mathematics, no matter what your philosophical point is. Identification of equivalent propositions is justified proof-theoretically as well as semantically. It is easy to prove a meta-theorem which says that you can always subtitute a propostion for an equivalent one, no need to go to semantics.

    My point is that such identifications, while useful most of the time, are the source of confusion to many mathematicians who cannot tell the difference between a proof of negation and a proof by contradiction. You seem to be one of them, unfortunately.

  9. I am sympathetic to saying that ordinary language “proof by contradiction” might as well refer to “proof of negation”, and that it’s just that classical mathematics is fraught with following this up immediately with double negation elimination (in which case, it is what was called “proof by contradiction” in this post).

    I am even sympathetic to saying that, in classical mathematics, propositions are even often considered to be syntactically identified with their double negations; sure, you can construct some formal syntax in which the two are syntactically distinct (and, indeed, this is the way we logicians normally analyze them), but just as well, you could construct a formal syntax in which they aren’t, and the everyday non-logician mathematician isn’t particularly concerned with formal syntax anyway. (Granted, if one is accustomed to thinking of a statement as even syntactically identical to its double negation, it becomes that much more difficult to abandon this idea when moving away from classical mathematics into the intuitionistic or what have you, but so it is when moving between different logical systems in many ways anyway…)

  10. (That is, I am all for calling the distinctively classical reasoning principle “double negation elimination” instead of “proof by contradiction”, using the latter term instead for the “proof of negation” principle common to both classical and intuitionistic logic by which negation is introduced. Of course, in actual practice, people use the term “proof by contradiction” in a multitude of differing ways, hence the confusion you remark upon…)

  11. “It is very possibly ‘wrong’ from a proof-theoretical point of view, but most classical mathematicians do not reason like this; they take a denotational and model-theoretic perspective where a proposition stands for either truth or falsity, so identifying equivalent propositions is entirely justified.”

    The perspective here presumes the law of the excluded middle. This only holds on the view that all propositions of mathematics are either true or false and nothing else. But what about propositions like “infinite series are divergent”? You might wish to believe it false, since we do know of convergent infinite series. But, the proposition, as intended by the writer or speaker, may or may not say “*all* infinite series are divergent.” It might say “most infinite series are divergent” or it might say “usually infinite series are divergent.” Or consider a proposition like “prime numbers are odd.” If the writer means “all prime numbers are odd”, then the proposition is false. However, if the writer means “almost all prime numbers are odd”, then the proposition is true.

  12. @Doug: even though I agree with you that the quoted passage is confused, I am afraid your reply is equally confused, or even more. It is clear that the author of the passage meant “propositional function” when he said “proposition”. Your examples are about making the distinction between a sentence (a proposition without free parameters) and a proposition (with free parameters), or about reading sense into vague statements. But the passage is about neither of those.

  13. @Andrej: Is a proof by negation valid for the intuitionist because it can easily be turned into a proof by contraposition? (For example, if, in trying to prove $\lnot\phi$, I prove $\phi\Rightarrow b$, where $\lnot b$ is known to be true, then I’ve established the contrapositive $\lnot b\Rightarrow\lnot\phi$, which establishes $\lnot\phi$.) In fact, is the formalization of such a proof actually a proof of the contrapositive, at least as the intuitionist requires the proof to be formalized? Playing the same game with proofs by contradiction (as opposed to proofs by negation) requires excluded middle, of course (as you say).

  14. @Zach: you have to be careful there. Intuitionistically $a \Rightarrow b$ implies $\lnot b \Rightarrow \lnot a$, but in general the other direction does not hold (and in fact, if it does then we get classical logic). However, if I remember correctly, as soon as either $a$ or $b$ is $\lnot\lnot$-stable (equivalent to its double negation) then we do get $(a \Rightarrow b) \iff (\lnot b \Rightarrow \lnot a)$. In the case of $\lnot \phi$, which is just an abbreviation for $\phi \Rightarrow \bot$, we do have $\lnot\lnot$-stability of $\bot$, so indeed we could prove the contrapositive if we wished. But the contrapositive is $\lnot \bot \Rightarrow \lnot \phi$, which is just $\lnot \phi$ again, so we end up where we started.


  15. I am pretty sure with a bit more work we could show that f attains its supremum, and in fact this must have been proved by someone constructively.

    Unfortunately not. Recall that it is not possible to prove the intermediate value theorem constructively (see page 5 of Bishop’s book, the introduction section, send me a mail if you don’t have the book and want me to just scan that part for you). However, a method to prove your statement can be adapted to prove the intermediate value theorem:

    Suppose that $f$ is a function $[0,1] \to \mathbb{R}$, and $f(0) < 0 < f(1)$. Let $F(x) = \int_0^x -f(y) dy$. Use your procedure to find the $x$ in $[0,1]$ where $F$ attains its maximum. By the fundamental theorem of calculus $F$ is differentiable and $F' = -f$, and since $x$ is where $F$ has a maximum, $F'(x) = 0$, so $-f(x) = 0$, so $f(x) = 0$, and the intermediate value has been found. [End proof]

    Therefore no such method can exist (a proof of negation).

    Bishop proves the necessary calculus theorems in his book. Interestingly, Bishop shows how to compute the least upper bound of a continuous function on an interval, but doesn’t mention that it can’t be shown to take that value.

  16. @Robert: I took the liberty to edit your comment to insert LaTeX and also to fix the proof a bit (I think HTML ate a part of it). Please let me know if I did injustice to your proof. (Also, I think it goes without saying that $f$ is assumed to be continuous in Bishop sense, i.e., uniformly continuous on every closed interval, otherwise how do you define the integral?)

    You have shown, I think, that if every Bishop-continuous map on $[0,1]$ attains its maximum then the Intermediate Value Theorem holds.

    But my claim (specialized to $X = [0,1]$) was: if every sequence in $[0,1]$ has a convergent subsequence, then every continuous map on $[0,1]$ attains its maximum.

    I am very sorry, but how does what you have shown invalidate what I claimed? If we put both things together we only get: “If every sequence in $[0,1]$ has a convergent subsequence then the Intermediate Value Theorem holds”. This is not in contradiction with possible failure of the Intermediate Value Theorem in Bishop-style mathematics. In fact, it is easy to show Bishop-style that the claim actually holds. For suppose every sequence in $[0,1]$ has a convergent subsequence and $f : [0,1] \to \mathbb{R}$ is continuous and $f(0) < 0 < f(1)$. We can find a sequence $(x_n)_n$ such that $|f(x_n)| \leq 2^{-n}$. Take a convergent subsequence of $(x_n)$, and its limit is where $f$ has a zero.

    It is an exercise to show that if every sequence of $[0,1]$ has a convergent subsequence then every continuous map on $[0,1]$ attains its maximum.

  17. You are quite right, of course. I got confused because I didn’t realise you had assumed something that was Bishop-false. This must be why Bishop’s definition of compact isn’t any of the usual ones, it’s totally bounded and complete, and applies only to metric spaces.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>