Free variables are not “implicitly universally quantified”!

Mathematicians are often confused about the meaning of variables. I hear them say “a free variable is implicitly universally quantified”, by which they mean that it is ok to equate a formula $\phi$ with a free variable $x$ with its universal closure $\forall x \,.\, \phi$. I am addressing this post to those who share this opinion.

I will give several reasons, which are all essentially the same, why “there is no difference between $\phi$ and $\forall x \,.\, \phi$” is a really bad opinion to have.

Reason 1: you wouldn’t equate a function with its definite integral

You would not claim that a real-valued function $f : \mathbb{R} \to \mathbb{R}$ is “the same thing” as its definite integral $\int_{\mathbb{R}} f(x) \, d x$, would you? One is a real function, the other is a real number. Likewise, $\phi$ is a truth function and $\forall x \,.\, \phi(x)$ is a truth value.

Reason 2: functions are not their own values

To be quite precise, the expression $\phi$ by itself is not a function, just like the expression $x + \sin x$ is not a function. To make it into a function we must first abstract the variable $x$, which is usually written as $x \mapsto x + \sin x$, or $\lambda x \,.\, x + \sin x$, or fun x -> x +. sin x. In logic we indicate the fact that $\phi$ is a function by putting it in a context, so we write something like $x : \mathbb{R} \vdash \phi$.

Why is all this nit-picking necessary? Try answering these questions with “yes” and “no” consistently:

  1. Is $x + \sin x$ a function in variable $x$?
  2. Is $x + \sin x$ a function in variables $x$ and $y$?
  3. Is $y – y + x + \sin x$ a function in variables $x$ and $y$?
  4. Is $x + \sin x = y – y + x + \sin x$?

A similar sort of mistake happens in algebra where people think that polynomials are functions. They are not. They are elements of a certain freely generated ring.

Reason 3: They are not logically equivalent

It is absurd to claim that $\phi$ and $\forall x \in \mathbb{R} \,.\, \phi$ are logically equivalent statements. Suppose $\forall x \in \mathbb{R} \,.\, x > 2$ were equivalent to $x > 2$. Then I could replace one by the other in any formula I wish. So I choose the formula $\exists x \in \mathbb{R} \,.\, x > 2$. It must be equivalent to $\exists x \in \mathbb{R} \,.\, \forall x \in \mathbb{R} \,.\, x > 2$, but since $\forall x \in \mathbb{R} \,.\, x > 2$ is false, we get $\exists x \in \mathbb{R} \,.\, \bot$, which is false. We proved that there is no number larger than 2.

Reason 4: They are not inter-derivable

If you can tell the difference between an implication and logical entailment, perhaps you might try to counter reason 3 by pointing out that $\phi$ and $\forall x \,.\, \phi$ are either both derivable, or both not derivable. That is to say, we can prove one if, and only if, we can prove the other. But again, this is not the case. We can prove $\forall x \in \emptyset \,.\, \bot$ but we cannot prove $\bot$.

Reason 5: Bound variables can be renamed but free variables cannot

The formula $x > 2$ is obviously not the same thing as the formula $y > 2$. But the formula $\forall x \in \mathbb{R} . x > 2$ is actually the same as $\forall y \in \mathbb{R} . y > 2$. If you find this confusing it is because you were never taught properly how to handle free and bound variables.

Reason 6: You cannot prove $\forall x \,.\, \phi$ without allowing $x$ to become free

Perhaps we can just forbid free variables altogether and stipulate that all variables must always be quantified. But how are you then going to prove $\forall x \in \mathbb{R} \,.\, \phi$? The usual way

“Consider any $x \in \mathbb{R}$. Then bla bla bla, therefore $\phi$.”

is now forbidden because the first sentence introduces $x$ as a free variable.

We can abolish variables altogether if we wish, by resorting to combinators, but it makes no sense to keep variables and make them all bound all the time.

Epilogue: so in what sense are they the same?

There is a theorem in model theory:

Let $\phi$ be a formula in context $x_1, \ldots, x_n$ and $M$ a structure in which we can interpret $\phi$. The following are equivalent:

  1. the universal closure $\forall x_1, \ldots, x_n \,.\, \phi$ is valid in $M$,
  2. for every valuation $\nu : \lbrace x_1, \ldots, x_n \rbrace \to M$, $\phi[\nu]$ is valid in $M$.

This is sometimes abbreviated (quite inaccurately) as “a formula and its universal closure are semantically equivalent”. This theorem is causing a lot of harm because mathematicians interpret it as “free variables are implicitly universally bound”. But the theorem itself clearly distinguishes a formula from its universal closure. It has a limited range of applications in model theory. It is not a general reasoning principle that would allow you to dispose of thinking about free variables.

You are in good company. Philosophers have thought about free variables for millennia, although they phrase the problem in the language of universals and particulars. They wonder whether “dog” is the same thing as the set of all dogs, or perhaps there is an ideal dog which is “pure dogness”, but then do we need two ideal dogs to make ideal pups, etc. The answer is simple: a free variable is a projection from a cartesian product.

29 thoughts on “Free variables are not “implicitly universally quantified”!

  1. I think two things lead people astray on this, and are things that working with type theory sets you straight on.

    The primary thing is that most mathematicians work in situations where all variables range over a single, non-empty domain of discourse, and bounded quantification is secondary. This means you don’t have to worry about #4, say. Two is that, because every variable ranges over the same domain, they’re just not put in the context (assuming you even work with a notion of context at all). This leads you to wonder what the status of free variables are, when free variables are ‘obviously’ bound by the context (and then their semantics become pretty clear, at least in the categorical treatment).

    This also gets you the weird (to a type theorist): $(\forall x. P(x)) \Rightarrow \exists x . P(x)$. “What if you have an empty model?” “We’ve ruled those out, just because.” The type theorist says instead: $y:A \vdash (\forall x:A . P(x)) \Rightarrow \exists x:A . P(x)$.

    (Oh, and to cover myself, those $A$’s and $P$’s are metavariables in a schema, not free variables; I think.)

  2. @Dan: I fixed your math and took the liberty to delete the second comment, as it is not obsolete. You can just write LaTeX code, with dollar signs at all. The blog uses MathJax.

  3. That “a function is an expression in a context” from point 2 makes me wonder why lambda calculus has lambdas at all rather than reinterpret function applications as suspended substitutions.

    The representations of LC with explicit substitutions that I remember still have lambdas though, I guess for higher-order cases it’s more convenient to have proper function types, especially when polymorphism is involved.

  4. My guess is that people feel the need to bind variables in the way that you describe because they have an instinct for logical syntax without the training, and $\forall$ (and $\int$) are the only binding operators that they know. If the variables are not bound then they are loose, just as the threads in a piece of cloth that is not bound with a hem are loose and will become frayed.

    A careful analysis of syntax uses an expression tree, which is like a piece of cloth in that it has threads. But when weaving cloth it is more useful to keep the warp on the loom so that it can be extended. Likewise, we keep hold of the free variables in an expression tree rather than binding them so that we can extend the expression by substituting for (and into) them.

    So the answers to your first two questions are no, but
    $$ x:X \quad \vdash\quad x+\sin x $$
    is a function of (expression in) $x$, whilst
    $$ x:X,\ y:Y \quad \vdash\quad x+\sin x $$
    is an expression in $x$ and $y$.

    These may be seen as examples of morphisms $X\to R$ and $X\times Y\to R$ in the category of contexts and substitutions.

    I’m afraid that Dan is wrong in saying that Type Theory gets this straight, because for them $X\to R$ is another type and not a hom or a sequent. This is like confusing the first expression above with
    $$ \lambda x.X+\sin x. $$
    This is the same confusion as the one between $\phi(x)$ and $\forall x.\phi(x)$ about which Andrej has written, but it is still present in what he has written.

    So, passing from arithmetic to logical examples, there is a three-way confusion amongst
    $$ \Gamma,\ x:X \quad\vdash\quad \phi x, $$
    $$ \Gamma \quad\vdash\quad \lambda x.\phi x $$
    and
    $$ \Gamma \quad\vdash\quad \forall x.\phi x, $$
    by each of which I mean that the formula on the right is asserted to be well formed rather than true.

    Does this matter? What does $\lambda$-abstraction actually do?

    If you only have one (“arithmetical”) type, such as $R$, so $R\to R$ can only mean a hom and is not a type in itself, then $\lambda x.f(x)$ is either ill-formed or just means $x:X\vdash f(x)$. This is like working in a category with products but not exponentials, or a logic with conjuction and no implication.

    It is only when we introduce these higher types that there is a distinction.

    For the terms, if the only thing you can do with $\lambda x.f(x)$ is apply it then this is the same as substituting into $x:X\vdash f(x)$. Only when this function is reified (made into a thing in itself) as the value of some higher-type function is there any difference.

    Returning to the examples from “school calculus”, when does the distinction between an expression and its $\lambda$-abstraction become significant? I was going to say, only when we study the Calculus of Variations, but it does come in earlier, with the calculation of the volume of a figure of revolution, given the function that provides its profile.

    Recently, in the work of someone who definitely does understand these syntactic distinctions, I saw the commutative law written as
    $$ \forall x, y. x \star y = y \star x. $$
    Now, I consider that this is wrong and that it should be written instead as
    $$ x,\ y :X \quad\vdash\quad x\star y = y\star x. $$
    However, as I have explained, the reasons for my opinion are quite subtle.

  5. I think there are a couple of reasons for this.

    1) Some presentations of logic actually have that, in certain circumstances. I can point to the discussion in chapter 8 of GEB as a specific example:

    “Rule of Generalization: Suppose x is a theorem in which u, a variable, occurs free. Then, ?u: x is a theorem. (Restrinction: No generalization is allowed in a fantasy on any variable which appeared free in the fantasy’s premise.”

    2) I’d guess that if you asked them about particular examples, they’d agree that particular examples weren’t universally quantified, or that the variables are implicitly bound. The issue here is that mathematical english doesn’t have a linear translation into explicit logical syntax, so the scope of bound variables isn’t necesarilt obvious.

    A couple of comments on some of your specific points.

    1) “x+sin x” can certainly be the notation for a function, even if it is somewhat sloppy and ambiguous. And I am sure it causes endless confusion for undergraduates. But that doesn’t change the fact that in the appropriate context (i.e, when talking about functions on the complex plain), refering to the function “x + sin x” is mostly unambiguous

    2) I wouldn’t say that “Consider any x??. Then bla bla bla, therefore ?.” introduces a free variable, it just moves it to the context. Certainly, it occurs in formulas as free, but it isn’t globally free.

  6. How exactly does type theory get this wrong? It has (conceivably; depending on what system we’re talking about) all three of these things:

    $\Gamma, x:S \vdash \phi \, x : T$
    $\Gamma \vdash (\lambda x : S. \phi \, x) : S \rightarrow T$
    $\Gamma \vdash \forall x : S. \phi \, x type$

    Depending on what $\phi$ is. And they’re all different (though related by inference rules). The first two correspond to something like:

    $\Gamma \times S \to T$
    $\Gamma \to T^S$

    And the last is a rather different animal. It’s written differently in the category theory, but it’s the same distinction.

    If you’re referring to the syntax people use in something like Coq or Agda, a lot of this is elided because it can be reconstructed by a machine, and would be a burden to write down explicitly. People do tend to be less precise about the distinction when working in this format, but the vast majority of papers trying to be at all precise use the more formal proof trees with contextual judgments and whatnot, and I’d expect people studying type theory to be familiar with them.

  7. Dan, I’m not saying that Type Theory makes a major error, just that the arrow syntax elides a distinction that is in fact more or less the same one that is the subject of Andrej’s original posting,

    If you want to say that type theorists write $A\to B$ where categorists have $B^A$ and $x:A\vdash f(x):B$ where categorist write $f:A\to B$ then I suppose that settles the question.

  8. Dear Paul,

    Speaking as a type theorist who occasionally uses a little category theory, I do something pretty close to what you suggest. When I use categories, I tend to give the types of morphisms with the single-arrow $A \to B$, and use the double-arrow $A \Rightarrow B$ to write the exponential. This is mainly because the superscript notation $B^A$ gets unwieldy when $A$ is a syntactically large expression.

    I do agree that axioms are typically better given in the form:
    $$x, y: A \vdash P(x,y)$$
    where $P(x,y)$ is an atomic proposition. There are good proof-theoretic reasons for this, in that cut-elimination fails if you give the axiom as:
    $$\cdot \vdash \forall x,y:A.\; P(x,y)$$
    since this introduces a forall without using the forall-introduction rule. I don’t quite know how to render this fact categorically, since it seems to lead to modelling contexts with multicategories, and I’ve never had the stamina for dealing with them.

  9. I am a math/CS nub, so please forgive me if what I am about to say is really stupid, BUT… Is it not easier to simply declare free variables evil, and force every variable to be bound to either a definition (an existential proof) or an universal quantifier? Would this not make the language simpler to understand?

  10. @Eduardo: it is possible to do away with all variables (look up “combinatory alegbra”), but humans are not good at using that sort of thing. As I said in the post, once you decide to have bound variables, you are more or less forced to have free variables as well. This is so because inside a universal quantifier $\forall x . \phi$ the variable $x$ is free. Or to put it in terms of computer science: a local variable is not visible outside of its scope, but inside its scope it acts as a free variable (one that is not “obsucred” and is “available”). You seem to think that free variables somehow complicate matters. They do not, bad math education and ignorance create confusion, and that complicates matters. Free variables are a completely natural concept. They reflect how humans speak and think. When you say “suppose $x$ is a real number”, and then you proceed to argue about it, you have used a free variable. Mathematicians do that all the time.

  11. I’m having the same questions as Eduardo and Tom Prince.

    All of the presentations I have seen of first order logic include the derivation rule of Universal generalization. As someone who strongly agrees with this post, this rule has always seemed blatantly false to me.

    I thought that the right way to fix it is to ban free variables from formal derivations. Of course, in math as it is normally written, one writes “Suppose $x$ is a real number …” and continues from there. My proposal would be that this “suppose” means that every single sentence which follows begins with an implicit $\forall_{x \in \mathbb{R}} \cdots$ and that, when giving formal derivations, we should actually write out the $\forall_{x \in \mathbb{R}} \cdots$ on every line. This doesn’t mean that $x^2 \geq 0$ isn’t a grammatical phrase, just that it is not the sort of thing which can appear by itself as a line in a proof.

    But I haven’t seen any references which do this, and you seem to be saying it is a bad idea for some reason. If you could flesh out why this is a bad idea, and help me understand the alternative, I’d appreciate it.

  12. Regarding the Universal Generalisation rule as it appears on the Wikipedia page to which you link, I am inclined to say that the reason why you consider that it is blatantly false is that it is false. However, I cannot say this because too many traditional authors in logic use the $\vdash$ notation in this way.

    Just as one deduces $\vdash Q\Rightarrow P$ from $Q\vdash P$, so one should obtain $\vdash \forall x\in Q.P(x)$ from $x:Q\vdash P(x)$ and not from $\vdash P(x)$.

    I am not sure how generally you propose to prepend $\forall x$ to predicates with a free variable $x$, cf. Andrej’s original comments. It’s technically correct in some cases. However, it is idiomatially wrong: this is not how we write mathematics. Specialists sometimes need to do things in idiosyncratic ways, but logicians have been notorious for doing this, to the extent that they end up talking about angels dancing on pinheads instead of the real (mathematical) world.

    All of the difficulties with free variables, including the notorious Empty Domain fallacy ($\forall x.P(x)\vdash\exists x.P(x)$), are solved by just keeping track of them, in a context (explicit listing), along with the propositional hypotheses.

    This does not mean to say that we have to retain all of this accountancy while writing ordinary mathematical arguments. Programming languages have developed good ways of indicating which variables are free (in scope) in any particular part of a program. Similarly, there is the box notation for doing it in logical proofs. However, there are also longstanding conventions for doing this in mathematical arguments written in natural language.

  13. In the last paragraph/section, isn’t the “theorem” quoted actually the definition of $M \models \phi(x_1, \ldots,x_n$? Or, more precisely, one clause in the mutually recursive definition of what $M \models \phi$ means for any $\phi$.

  14. @Ian: Well, yes, that just makes the theorem easier to prove… Also, with non-empty domains (an assumption often made in logic textbooks), one can then conclude from this and the completeness theorem that a formula is provable if, and only if, its universal closure is.

  15. Paul, I really like your loom analogy! I wish there was a way to integrate a loom *into* the weave so the the analogy would be complete (and I could explain $l$-calculus to my mother).

  16. I’m trying to understand Reason 2 in the original post: I would answer all of the questions 1. 2. 3. & 4. with yes (possibly adding something like the context $x,y\in \mathbb{R}$). Why is that inconsistent?

  17. @Michael: it is precisely the missing context, or to be more precise, the missing abstraction, which is the problem. The expressions $x + \sin x$ denotes a number. This number depends on another number $x$, but so what? To get a function out of this you have to write something like $x \mapsto x + \sin x$ or $(x,y) \mapsto x + \sin x$, at which point it becomes clear what this function maps from (in the first case from $\mathbb{R}$ and in the second $\mathbb{R^2}$). Mahtematicians are doing themselves a great disservice by saying things like “$n + 3 k$ is an odd number” at one point, and then “$n + 3 k$ is a monotone function” at another. They should say “$n + 3 k$ is an odd number$” but $n \mapsto n + 3 k$” is a monotone function”.

  18. Implicitly universal quantification is a good convention to have, in certain contexts, for example when formulating axioms or theorems. Trying to be consistent and precise at all times just makes life hard. The right balance between nit-picking and liberty is delicate and not always clear. But I believe it is all right and can cause no misunderstanding to say that $n+3k$ is a monotone function of $n$, without saying $n\mapsto$ and without saying “for all $k$” (Hey, Andrey, I caught you! How is $k$ qualified?). Or even without saying “of $k$” if it is clear that $k$ is the quantity whose variation is under consideration. We can use the plain O-notation $O(n)$ if the context makes it clear whether we mean “as $n\to\infty$” or “as $n\to 0″ or something else, although the expression is, strictly speaking, meaningless without this qualification.

    Another example came to my mind when I read Reason 2: (“A similar sort of mistake happens in algebra where people think that polynomials are functions.”) Is the polynomial $x^2+2x+3 \in \mathbb{Z}[x]$? Is it in $\mathbb{Z}[x,y]$? (Of course, both are true.) (Certain programming languages even have provisions for automatically “promoting” constants to constant functions if the context requires it.)

    2) You are using the expression “a function in variable $x$”. Isn’t this meaningless? Can you give an example of a function in variable $x$?

  19. Essentially, I disagree with Günter. There is absolutely no need to quantify variables in axioms. They can be left free. Why would we want to quantify them universally? To artificially increase their logical complexity in the name of general confusion pervading treatment of free variables?

    The “big O” notation is an abomination that is just as bad as the broken notation in analysis. We should in fact write $O(\lambda n . n^2)$ instead of $O(n^2)$. Complexity theorists write nonsense such as $2^{O(n)}$ out of laziness which then they claim to be convenience.

    Likewise, I do think we should write $\lambda n . 2n+ k$ when we think of a monotone map. And no, you did not “catch” me because there is no need to quantify $k$ in this expression. That is the whole point of what I am saying, namely, that variables are not always quantified, and for good reasons.

  20. What is wrong with $2^{O(n)}$? It is used not just by complexity theorists but all sorts of ordinary mathematicians and computer scientists who are describing the asymptotic growth of something. If $F$ is a function whose growth we want to study, should we really apply your suggested replacement instead of writing $F(n)\le 3n^2+7n+1=O(n^2)$?

  21. There should really be two separate threads, about using (or abusing) expressions as a notation for functions, and about “implicit” universal quantification. Here is a post on quantification.

    Andrey, why is there no need to quantify variables in axioms? Isn’t the associative law $a*(b*c)=(a*b)*c$ supposed to hold for all elements $a,b,c$ of the group? I noted that Paul Taylor in his post of
    December 26 also objected to the use of quantifiers in such a case, but as an ordinary mathematician who never uses turnstyles in his work, I don’t understand why. For me, the reason why we don’t need to (explicitly) quantify $a,b,c$ when we write the associative law is precisely that they are implicitly universally quantified.

    Here is an extract form some hypothetical mathematical article. A series of derivations is concluded by the remark: “For the transition from the second to the third line, we have used the Cauchy-Schwarz inequality, $\langle x,y\rangle^2\le\|x\|^2\cdot\|y\|^2$.” Suppose that $x$ and $y$ do not appear otherwise in this passage.
    What might the author mean? Tick one of the following boxes.

    – a) $\Box$ $\forall n\in\mathbb{N}\colon\forall x,y\in \mathbb{R}^n\colon\langle x,y\rangle^2\le\|x\|^2\cdot\|y\|^2$

    – b) $\Box$ $\exists n\in\mathbb{N}\colon\forall x,y\in \mathbb{R}^n\colon\langle x,y\rangle^2\le\|x\|^2\cdot\|y\|^2$

    – c) $\Box$ $\exists n\in\mathbb{N}\colon\exists x,y\in \mathbb{R}^n\colon\langle x,y\rangle^2\le\|x\|^2\cdot\|y\|^2$

    – d) $\Box$ $\forall n\in\mathbb{N}\colon\exists x,y\in \mathbb{R}^n\colon\langle x,y\rangle^2\le\|x\|^2\cdot\|y\|^2$

    – e) $\Box$ It cannot be determined what this means without specifying what the values of $x$ and $y$ are.

    – f) $\Box$ It is the Cauchy-Schwarz-Buniakovski inequality.

    – g) $\Box$ This mathematician is doing
    a great disservice to mathematics by this triviality, because everybody knows the Cauchy-Schwartz inequality.

    Have you ticked?

    (This example is weak because the correct answer is (g). Imagine a lesser-known inequality or identity.)

    I think every variable that is used meaningfully in mathematical discourse is bound somehow, either by defining its value, or by explicitly quantifying it, or implicitly. The default binding is the universal quantification. Every claim in a mathematical paper can be converted into a fully quantified statement without free variables. The phrase “Consider any $x??$… blabla” in Reason 6 binds $x$, and as the word “any” suggests, it is a universal quantification here. Everything that is said about $x$ in “blabla” is supposed to hold for all $x$. There is no contradiction to the fact that $x$ is used as a free variable in “blabla”. (Actually I am now puzzled completely why this example should be at all a “reason” against implicit quantification: The variable $x$ is explicitly quantified here.)

    Or maybe we are talking about different things. In “$\exists x\colon x>2$”, $x$ occurs as a free variable in “$x>2$”, but it would of course be nonsense to prepend “$\forall x$” to it because “free variables are implicitly universally quantified”. I am talking about a useful convention that any free variables that remain free at the outermost level after all bindings have been resolved, are implicitly considered as universally quantified. For example, a lemma in a book that I am reading starts with: “Let $n\ge-2, \dots$”. The variable $n$ is not explicitly quantified. What is the intended meaning? Should it mean “$\forall n\ge -2\dots$”?

  22. I never said or pretended not to understand what it means to say that an expression is a function of some variable. I suggest you stop taking my statements in bad faith. What I am claiming is that precise notation is better and should be used. At the expense of two letters and one punctuation you can make the distinction between an expression and its abstraction perfectly clear. Surely you cannot argue in favor of sloppy notation?!

  23. a) It was E. W. Dijkstra who pretended that he did not understand “log(n)” as an example of a function. (There is a link in my post, but I screwed it up by forgetting the closing tag, sorry.)

    b) There are more criteria to be considered when choosing notation;
    precision and unambiguousness is just one aspect. Notation should also be concise, mnemonic, in line with tradition, not compete with other notations etc. Thus, I certainly don’t agree that “precise notation is better and should be used” ABSOLUTELY and AT ALL TIMES.

    There is a balance between convenience and precision. Phrases like “We omit the subscript $S$ from $F_S(t)$ when it is clear from the context” are abundant in the mathematical literature. Do you want to abolish them because this is “sloppy”? A book that I am reading proposes to render $f(x)(y)$ as $f(x,y)$ for the mere sake of saving parentheses.
    What is wrong with using a notation that is, strictly speaking, imprecise and ambiguous but is understood by everybody and causes no danger of confusion? There is a wide range of occasions for expressing mathematical ideas, ranging from dreams, the notes that we scribble, the blackboards that we write on when discussing with colleagues, over lectures, papers for publication to machine-checkable proofs. Different standards of rigor are appropriate for each category, but mathematical texts are written primarily for human readers and not for machines.

    c) Andrej, you still did not show how you propose to render my example with the $O$-notation. Let’s see how it compares in terms of conciseness and awkward circumlocutions. I take it from your comments that you have never used the “abominable O-notation” to your advantage (and I hope I am not taking your comment in bad faith). Don Knuth has written elaborately on the O-notation, starting with his article “big Omicron and big Omega and big Theta“.
    (And if you want a real challenge for your proposed replacement of the O-notation, take the passage at the bottom of p.22)
    Convenience comes at a price, and O-notation has its pitfalls, about which we can read, for example, in Volume 1 of the Art of Computer Programming. Maybe you don’t regard Knuth as an authority because he uses expressions like “functions $f(n)$”, but he has certainly spent a lot of conscious thought on this notation and on notation in general, and I think one thing that he cannot be accused of is “laziness”.

    d) As far as I see, $\lambda$-notation is nonexistent outside logic, foundations of mathematics and computing, computability and certain parts of programming. (I wonder how many of our mathematical colleagues are actually aware of what it means.)
    The reason is, I believe, that mathematicians have seen little need for it, and when they did, have come up with ad-hoc notations, like $a_{i*}$ for the $i$-th row of a matrix, or $\langle z,\cdot \rangle$ for partial evaluation of a bivariate function, that served them well enough. (Isn’t there a similar notation, a standard one, on quantum mechanics?) I have a great respect for Dijkstra, but I don’t like the self-opinionated attitude that is expressed at the end of EWD1151, trying to impose his personal standards of syntax on everyone. I am sensing a similar attitude in expressions like “doing themselves a great disservice”. I might be convinced by an actual example from the literature where the confusion between the “function $f(n)$” and some particularly “value $f(n)$” causes some harm. We are tolerating the same confusion in the notation for polynomials. (But yes, there is the convention to use $X$ for indeterminates and $x$ when substituting values.)

    e) Andrey, I would still like to hear your opinion about my thesis that, by convention, we don’t need to explicitly quantify all variables in a theorem BECAUSE they are implicitly universally quantified, in line with the peculiar characteristic of mathematics that its statements are universally true.
    (And why do we want to have all variables quantified? Because if variables remain free, the statement is not a proposition that could be true or false.)

  24. Confusion between $phi$ and $\forall x.\phi$ is only possible when the latter is assumed to be true. Clearly, if $\forall x.\phi$ is false, then is usage is very different from the usage of $\phi$.

    I guess the “implicit universal quantification” stems from the very ancient idea that all formulas that happen during a proof must be true.

  25. You make an assertion that ?x.? cannot be demonstrated without using free variables, and seem to suggest that the only way to do without is to use combinators. There’s a book by Quine called “Mathematical Logic”, where he manages quite well without introducing any free variables, using ?? to refer to ‘the universal closure of ?’ (meaning the result of prepending it with universal quantifiers for all free variables in alphabetic order), and just states a sentence after its number to state it’s a theorem (or theorem schemata). All theorems of his system have no free variables, and he never introduces free variables into his proofs: basically all you need is to introduce a few helpful theorem schemata for substitution into and out of a quantified expression. If your point is that it’s inconvenient to write proofs without quantifiers, that’s a judgement relative to the way you were taught. If you prefer writing proofs in English, there are a limited number of logical forms which can easily be represented, but if you are more fond of working in a semi-formal way, it’s just as easy to develop good algebraic intuitions for how to manipulate quantified formulas into the desired form, as it is to develop the same intuitions for open formulas.

    Either way, I think “cannot” should be changed in that case to “I find it unpleasant to work without free variables.”.

  26. It has been quite a long time (more than a year!) since I made an extremely uniformed comment. In my defense, back then I did not know about the distinction between judgments and propositions, between an assumption in a proof and an argument for a function, or even between a metalanguage and an object language. I did briefly stumble upon the concept of a metalanguage, but I quickly dimissed as “Who would like to reason outside the object language?”

    It is only after trying to implement a programming language that I actually understood what all of this is about: context management. When a variable is free, it can be used – for performing actual computation. You bind a variable when you need to abstract over it. And you need to abstract over a variable if you do not want it in the environment. The environment is a very real thing – the contents of memory.

    So, even if it is a little bit too late, thanks for the explanation!

Leave a Reply

Your email address will not be published. Required fields are marked *

67 − 60 =

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>