Mathematics and Computation

Space-filling curves, constructively

2024-01-30T00:00:00+01:00

In 1890 Giuseppe Peano discovered a square-filling curve, and a year later David Hilbert published his variation. In those days people did not waste readers' attention with dribble – Peano explained it all on 3 pages, and Hilbert on just 2 pages, with a picture!

But are these constructive square-filling curves?

There's no doubt that the curves themselves are defined constructively, for instance as limits of uniformly continuous maps. A while ago I even made a video showing the limiting process for Hilbert curve:

Your browser does not support the video tag.

Is Hilbert's curve constructively surjective? Almost:

Theorem 1: For any point in the square, its distance to Hilbert curve is zero.

Proof. Recall that the Hilbert curve $\gamma : [0,1] \to [0,1]^2$ is the limit of a sequence $\gamma_n : [0,1] \to [0,1]^n$ of uniformly continuous maps, with respect to the supremum norm on the space of continuous maps $\mathcal{C}([0,1], [0,1]^2)$. The finite stages $\gamma_n$ get progressively closer to every point in the square. Thus, for any $\epsilon > 0$ and $p \in [0,1]^2$ there is $n \in \mathbb{N}$ and $t \in [0,1]$ such that $d(p, \gamma_n(t)) < \epsilon/2$ and $d(\gamma_n(t), \gamma(t)) < \epsilon/2$, together ensuring that $\gamma$ is closer than $\epsilon$ from $p$. $\Box$

Classically, Theorem 1 suffices to conclude that Hilbert curve is surjective. Constructively, we have to modify it a bit. Recall that $\gamma$ is self-similar, as it is made of four copies of itself, each scaled by a factor $1/2$, translated and rotated to cover precisely one quarter of the unit square. Therein lies the problem: the four abutting quarter-sized squares cannot be shown to cover the unit square. We should make them slightly larger so that they overlap.

Given a scaling factor $\alpha$, let us define the generalized Hilbert curve $\gamma^\alpha : [0,1] \to [0,1]^2$ constructed just like the usual Hilbert curve, but with scaling factor $\alpha$. Instead of writing down formulas in LaTeX, it is more fun to program the curve in Mathematica and draw some pictures (see IntuitionisticHilbert.nb).

At $\alpha = 0.5$ we recover the usual Hilbert curve:

At $\alpha = 0.4$ the curve is not square-filling:

At $\alpha = 0.6$ we obtain a square-filling curve that overlaps itself already at finite stages: This is the one we want.

Just for fun, here's a video the 8-th level curve as $\alpha$ ranges from $0.4$ to $0.8$.

(Your browser does not support the video tag.)

As $\alpha$ increases the image gets denser in the center of the square and sparser close to the boundary, but this is an artifact of showing a finite stage. The actual curve $\gamma^\alpha$ is equally dense everywhere as soon as $\alpha > 0.5$. Back to serious business:

Theorem 2: Assuming dependent choice, the generalized Hilbert cube $\gamma^\alpha$ is surjective for $1/2 < \alpha < 1$.

Proof. Define the transformations $T_0^\alpha, T_1^\alpha, T_2^\alpha, T_3^\alpha : [0,1]^2 \to [0,1]^2$:

$T_0^\alpha(x,y) = (\alpha \cdot y, \alpha \cdot x)$
$T_1^\alpha(x,y) = (\alpha \cdot x, 1 - \alpha \cdot (y - 1))$
$T_2^\alpha(x,y) = (1 - \alpha \cdot (x - 1) , 1 - \alpha \cdot (y - 1))$
$T_3^\alpha(x,y) = (1 - \alpha \cdot y , \alpha \cdot (1 - x))$

Each of these map the unit square onto a smaller square with side $\alpha$:

$T_0$ scales and reflects the unit square onto $[0,\alpha] \times [0,\alpha]$
$T_1$ scales the unit square onto $[0,\alpha] \times [1 - \alpha, 1]$,
$T_2$ scales the unit square onto $[1 - \alpha, 1] \times [1 - \alpha, 1]$,
$T_3$ scales and reflects onto $[1 - \alpha, 1] \times [0, \alpha]$.

Because $\alpha > 1/2$ these four squares overlap (rather than just touch) therefore they cover $[0,1]^2$, constructively. Given any $p \in [0,1]^2$ we may use Dependent choice to find a sequence of $T_{i_1}, T_{i_2}, T_{i_3}, \ldots$ such that $p = T_{i_1} (T_{i_2} (T_{i_3} (\cdots)))$, hence $p = \gamma^\alpha(t)$ where $t \in [0,1]$ is the number $0.i_1 i_2 i_3 \ldots$ written in base 4. $\Box$

Can we also do it without Dependent choice? We certainly cannot get rid of all choice.

Theorem 3: In the topos of sheaves on the unit square $\mathrm{Sh}([0,1]^2)$ there is no square-filling curve.

Proof. Let $I$ be the unit interval in the topos, i.e., it is the sheaf of continuous maps valued in $[0,1]$. Consider the internal statement that there is a surjection from $I$ onto $I^2$:

$$\exists \gamma : I \to I^2 .\, \forall p \in I^2 .\, \exists t \in I .\, \gamma(t) = p \tag{1}$$

Working through sheaf semantics (thanks to Andrew Swan for doing it with me over a cup of coffee – although I claim ownership of all errors), its validity at an open set $U \subseteq [0,1]^2$ amounts to the following condition: there is an open cover $(U_i)_i$ of $U$ with continuous maps $\gamma_i : U_i \times [0,1] \to [0,1]^2$ such that, for every $i$, every open $V \subseteq U_i$ and continuous $p : V \to [0,1]^2$, there is an open cover $(V_j)_j$ of $V$ and continuous maps $t_j : V_j \to [0,1]$, such that $\gamma_i(v, t_j(v)) = p(v)$ for all $j$ and $v \in V_j$.

Instantiate $p : V \to [0,1]^2$ in the stated condition with the inclusion $p(v) = v$ to obtain, for every $j$, that $\gamma_i(v, t_j(v)) = v$ holds for all $v \in V_j$. Therefore, the map $\gamma_i{\restriction}_{V_j} : V_j \times [0,1] \to V_j$ has a continuous section, namely the map $v \mapsto (v, t_j(v))$. But there can be no such map, as it would violate invariance of domain, unless $V_j = \emptyset$. Consequently, the only way for (1) to hold at $U$ is for $U$ to be empty. $\Box$

To summarize the argument: in a topos of sheaves “$\gamma$ is surjective” is a very strong condition, namely that $\gamma$ has local sections – and these may not exist for topological or geometric reasons. At the same time we still have Theorem 1, so also in a topos of sheaves the usual Hilbert curve leaves no empty space in the unit square.

On indefinite truth values

2023-08-13T00:00:00+02:00

In a discussion following a MathOverflow answer by Joel Hamkins, Timothy Chow and I got into a chat about what it means for a statement to “not have a definite truth value”. I need a break from writing the paper on countable reals (coming soon in a journal near you), so I thought it would be worth writing up my view of the matter in a blog post.

How are we to understand the statement “the Riemann hypothesis (RH) does not have a definite truth value”?

Let me first address two possible explanations that in my view have no merit.

First, one might suggest that “RH does not have a definite truth value” is the same as “RH is neither true nor false”. This is nonsense, because “RH is neither true nor false” is the statement $\neg \mathrm{RH} \land \neg\neg\mathrm{RH}$, which is just false by the law of non-contradiction. No discussion here, I hope. Anyone claiming “RH is neither true nor false” must therefore mean that they found a paradox.

Second, it is confusing and even harmful to drag into this discussion syntactically invalid, ill-formed, or otherwise corrupted statements. To say something like “$(x + ( - \leq 7$ has no definite truth value” is meaningless. The notion of truth value does not apply to arbitrary syntactic garbage. And even if one thinks this is a good idea, it does not apply to RH, which is a well-formed formula that can be assigned meaning.

Having disposed of ill-fated attempts, let us ask what the precise mathematical meaning of the statement might be. It is important to note that we are discussing semantics. The truth value of a sentence $P$ is an element $I(P) \in B$ of some Boolean algebra $(B, 0, 1, {\land}, {\lor}, {\lnot})$, assigned by an interpretation function $I$. (I am assuming classical logic, but nothing really changes if we switch to intuitionistic logic, just replace Boolean algebras with Heyting algebras.) Taking this into account, I can think of three ways of explaining “RH does not have a definite truth value”:

The truth value $I(\mathrm{RH})$ is neither $0$ nor $1$. (Do not confuse this meta-statement with the object-statement $\neg \mathrm{RH} \land \neg\neg\mathrm{RH}$.) Of course, for this to happen one has to use a Boolean algebra that contains something other than $0$ and $1$.
The truth value of $I(\mathrm{RH})$ varies, depending on the model and the interpretation function. An example of this phenomenon is the continuum hypothesis, which is true in some set-theoretic models and false in others.
The interpretation function $I$ fails to assign a truth value to $\mathrm{RH}$.

Assuming we have set up sound and complete semantics, the first and the second reading above both amount to undecidability of RH. Indeed, if the truth value of RH is not $1$ across all models then RH is not provable, and if it is not fixed at $0$ then it is not refutable, hence it is undecidable. Conversely, if RH is undecidable then its truth value in the Lindenbaum-Tarski algebra is neither $0$ nor $1$. We may quotient the algebra so that the value becomes true or false, as we wish.

The third option says that one has got a lousy interpretation function and should return to the drawing board.

In some discussions “RH does not have a definite truth value” seems to take on an anthropocentric component. The truth value is indefinite because knowledge of it is lacking, or because there is a cognitive barrier to comprehending the statement, etc. I find these just as unappealing as the Brouwerian counterexamples arguing in favor of intuitionistic logic.

The only realm in which I reasonably comprehend “$P$ does not have a definite truth value” is pre-mathematical, or even philosophical. It may be the case that $P$ refers to pre-mathematical concepts lacking precise formal description, or whose existing formal descriptions are considered problematic. This situation is similar to the third one above, but cannot be just dismissed as technical deficiency. An illustrative example is Solomon Feferman's Does mathematics need new axioms? and the discussion found therein on the meaningfulness and the truth value of the continuum hypothesis. (However, I am not aware of anyone seriously arguing that the mathematical meaning of Riemann hypothesis is contentious.)

So, what do I mean by “RH does not have a definite truth value”? Nothing, I would never say that and I do not understand what it is supposed to mean. RH clearly has a definite truth value, in each model, and with some luck we are going to find out which one. (To preempt a counter-argument: the notion of “standard model” is a mystical concept, while those stuck in an “intended model” suffer from lack of imagination.)

Variations on Weihrauch degrees (CiE 2023)

2023-07-28T00:00:00+02:00

I gave a talk “Variations on Weihrauch degrees” at Computability in Europe 2023, which took place in Tbilisi, Georgia. The talk was a remote one, unfortunately. I spoke about generalizations of Weihrauch degrees, a largely unexplored territory that seems to offer many opportunities to explore new directions of research. I am unlikely to pursue them myself, but will gladly talk with anyone who is interested in doing so.

Slides: CiE-2023-slides.pdf.

Continuity principles and the KLST theorem

2023-07-19T00:00:00+02:00

On the occasion of Dieter Spreen's 75th birthday there will be a Festschrift in the Journal of Logic and Analysis. I have submitted a paper “Spreen spaces and the synthetic Kreisel-Lacombe-Shoenfield-Tseitin theorem”, available as a preprint arXiv:2307.07830, that develops a constructive account of Dieter's generalization of a famous theorem about continuity of computable functions. In this post I explain how the paper fits into the more general topic of continuity principles.

A continuity principle is a statement claiming that all functions from a given class are continuous. A silly example is the statement

Every map $f : X \to Y$ from a discrete space $X$ is continuous.

The dual

Every map $f : X \to Y$ to an indiscrete space $Y$ is continuous.

is equally silly, but these two demonstrate what we mean.

In order to find more interesting continuity principles, we have to look outside classical mathematics. A famous continuity principle was championed by Brouwer:

Brouwer's continuity principle: Every $f : \mathbb{N}^\mathbb{N}\to \mathbb{N}$ is continuous.

Here continuity is taken with respect to the discrete metric on $\mathbb{N}$ and the complete metric on $\mathbb{N}^\mathbb{N}$ defined by

$$\textstyle d(\alpha, \beta) = \lim_n 2^{-\min \lbrace k \in \mathbb{N} \,\mid\, k = n \lor \alpha_k \neq \beta_k\rbrace}.$$

The formula says that the distance between $\alpha$ and $\beta$ is $2^{-k}$ if $k \in \mathbb{N}$ is the least number such that $\alpha_k \neq \beta_k$. (The limit is there so that the definition works constructively as well.) Brouwer's continuity principle is valid in the Kleene-Vesley topos.

In the effective topos we have the following continuity principle:

KLST continuity principle: Every map $f : X \to Y$ from a complete separable metric space $X$ to a metric space $Y$ is continuous.

The letters K, L, S, and T are the initials of Georg Kreisel, Daniel Lacombe, Joseph R. Shoenfield, and Grigori Tseitin, who proved various variants of this theorem in the context of computability theory (the above version is closest to Tseitin's).

A third topos with good continuity principles is Johnstone's topological topos, see Section 5.4 of Davorin Lešnik's PhD dissertaton for details.

There is a systematic way of organizing such continuity principles with synthetic topology. Recall that in synthetic topology we start by axiomatizing an object $\Sigma \subseteq \Omega$ of “open truth values”, called a dominance, and define the intrinsic topology of $X$ to be the exponential $\Sigma^X$. This idea is based on an observation from traditional topology: the topology a space $X$ is in bijective correspondence with continuous maps $\mathcal{C}(X, \mathbb{S})$, where $\mathbb{S}$ is the Sierpinski space.

Say that a map $f : X \to Y$ is intrinsically continuous when the invese image map $f^\star$ maps intrinsically open sets to intrinsically open sets.

Intrinsic continuity principle: Every map $f : X \to Y$ is intrinsically continuous.

Proof. The inverse image $f^\star(U)$ of $U \in \Sigma^Y$ is $U \circ f \in \Sigma^X$. □

Given how trivial the proof is, we cannot expect to squeeze much from the intrinsic continuity principle. In classical mathematics the principle is trivial because there $\Sigma = \Omega$, so all intrinsic topologies are discrete.

But suppose we knew that the intrinsic topologies of $X$ and $Y$ were metrized, i.e., they coincided with metric topologies induces by some metrics $d_X : X \times X \to \mathbb{R}$ and $d_Y : Y \times Y \to \mathbb{R}$. Then the intrinsic continuity principle would imply that every map $f : X \to Y$ is continuous with respect to the metrics. But can this happen? In “Metric spaces in synthetic topology” by Davorin Lešnik and myself we showed that in the Kleene-Vesley topos the intrinsic topology of a complete separable metric space is indeed metrized. Consequently, we may factor Brouwer's continuity principles into two facts:

Easy general fact: the intrinsic continuity principle.
Hard specific fact: in the Kleene-Vesley topos the intrinsic topology of a complete separable metric space is metrized.

Can we similarly factor the KLST continuity principle? I give an affirmative answer in the submitted paper, by translating Dieter Spreen's “On Effective Topological Spaces” from computability theory and numbered sets to synthetic topology. What comes out is a new topological separation property:

Definition: A Spreen space is a topological space $(X, \mathcal{T})$ with the following separation property: if $x \in X$ is separated from an overt $T \subseteq X$ by an intrinsically open subset, then it is already separated from it by a $\mathcal{T}$-open subset.

Precisely, a Spreen space $(X, \mathcal{T})$ satisfies: if $x \in S \in \Sigma^X$ and $S$ is disjoint from an overt $T \subseteq X$, then there is an open $U \in \mathcal{T}$ such that $x \in U$ and $U \cap T = \emptyset$. The synthetic KLST states:

Synthetic KLST continuity principle: Every map from an overt Spreen space to a pointwise regular space is pointwise continuous.

The proof is short enough to be reproduced here. (I am skipping over some details, the important one being that we require open sets to be intrinsically open.)

Proof. Consider a map $f : X \to Y$ from an overt Spreen space $(X, \mathcal{T}_X)$ to a regular space $(Y, \mathcal{T}_Y)$. Given any $x \in X$ and $V \in \mathcal{T}_Y$ such that $f(x) \in V$, we seek $U \in \mathcal{T}_X$ such that $x \in U \subseteq f^\star(V)$. Because $Y$ is regular, there exist disjoint $W_1, W_2 \in \mathcal{T}_Y$ such that $x \in W_1 \subseteq V$ and $V \cup W_2 = Y$. The inverse image $f^\star(W_1)$ contains $x$ and is intrinsically open. It is also disjoint from $f^\star(W_2)$, which is overt because it is an intrinsically open subset of an overt space. As $X$ is a Spreen space, there exists $U \in \mathcal{T}_X$ such that $x \in U$ and $U \cap f{*}(W_2) = \emptyset$, from which $U \subseteq V$ follows. □

Are there any non-trivial Spreen spaces? In classical mathematics every Spreen space is discrete, so we have to look elsewhere. I show that they are plentiful in synthetic computability:

Theorem (synthetic computability): Countably based sober spaces are Spreen spaces.

Please consult the paper for the proof.

There is an emergent pattern here: take a theorem that holds under very special circumstances, for instance in a specific topos or in the presence of anti-classical axioms, and reformulate it so that it becomes generally true, has a simple proof, but in order to exhibit some interesting instances of the theorem, we have to work hard. What are some other examples of such theorems? I know of one, namely Lawvere's fixed point theorem. It took some effort to produce non-trivial examples of it, once again in synthetic computability, see On fixed-point theorems in synthetic computability.

Isomorphism invariance and isomorphism reflection in type theory (TYPES 2023)

2023-06-15T00:00:00+02:00

At TYPES 2023 I had the honor of giving an invited talk “On Isomorphism Invariance and Isomorphism Reflection in Type Theory” in which I discussed isomorphism reflection, which states that isomorphic types are judgementally equal. This strange principle is consistent, and it validates some fairly strange type-theoretic statements.

Here are the slides with speaker notes and the video recording of the talk.

Formalizing invisible mathematics

2023-02-13T00:00:00+01:00

I am at the Machine assisted proofs workshop at the UCLA Institute for Pure and Applied Mathematics, where I am about to give a talk on “Formalizing invisible mathematics”.

Here are the slides with speaker notes and the video recording of the talk.

Abstract:

It has often been said that all of mathematics can in principle be formalized in a suitably chosen foundation, such as first-order logic with set theory, higher-order logic, or type theory. When one attempts to actually do so on a large scale, the true meaning of the qualifier “in principle” is revealed: mathematical practice consists not only of text written on paper, however detailed they might be, but also of unspoken conventions and techniques that enable efficient communication and understanding of mathematical texts. While students may be able to learn these through observation and imitation, the same cannot be expected of computers, yet.

In this talk we will first review some of the informal mathematical practices and relate them to corresponding techniques in proof assistants, such as implicit arguments, type classes, and tactics. We shall then ask more generally whether these need be just a bag of tricks, or can they be organized into a proper mathematical theory.

Exploring strange new worlds of mathematics

2023-02-10T00:00:00+01:00

On February 10, 2023, I gave my Levi L. Conant Lectur Series talk “Exploring strange new worlds of mathematics”, at the math department of Worcester Polytechnic Institute. Here are the slides with speaker notes and the video recording of the talk.

Katja Berčič made a super cool logo for my talk:

Thank you, Katja! If you are a Trekkie you should figure it out.

Abstract:

In the 19th century Carl Friedrich Gauss, Nikolai Lobachevsky, and János Bolyai discovered geometries that violated the parallel postulate. Initially these were considered inferior to Euclid's geometry, which was generally recognized as the true geometry of physical space. Subsequently, the work of Bernhard Riemann, Albert Einstein, and others, liberated geometry from the shackles of dogma, and allowed it to flourish beyond anything that the inventors of non-euclidean geometry could imagine.

A century later history repeated itself, this time with entire worlds of mathematics at stake. The ideal of one true mathematics was challenged by the schism between intuitionstic and classical mathematics, as personified in the story of rivalry between L.E.J. Brouwer and David Hilbert. Not long afterwards, Kurt Gödel's work in logic implied the inevitability of a multitude of worlds of mathematics. These could hardly be dismissed as logical sophistry, as they provided answers to fundamental questions about set theory and foundations of mathematics. The second half of the 20th century brought gave us many more worlds of mathematics: Cohen's set-theoretic forcing, Alexander Grothnedieck's sheaves, F. William Lawvere's and Myles Tierney's elementary toposes, Martin Hyland's effective topos, and a plethora of others.

We shall explore but a small corner of the vast multiverse of mathematics, observing in each the quintessential mathematical object, the field of real numbers. There is a universe in which the reals contain Leibniz's infinitesimals, in another they are all computable, there is one in which they are cannot be separated into two disjoint subsets, and one in which all subsets are measurable. There is even a universe in which the reals are countable. The spectrum of possibilities is bewildering, but also inspiring. It leads to the idea of synthetic mathematics: just like geometers and physicists choose a geometry that is best for the situation at hand, mathematicians can choose to work in a mathematical universe made to order, or synthesized, that best captures the essence and nature of the topic of interest.

Happy birthday, Dana!

2022-10-11T00:00:00+02:00

Today Dana Scott is celebrating the 90th birthday today. Happy birthday, Dana! I am forever grateful for your kindness and the knowledge that I received from you. I hope to pass at least a part of it onto my students.

On the occasion Steve Awodey assembled selected works by Dana Scott at CMU-HoTT/scott repository. It is an amazing collection of papers that had deep impact on logic, set theory, computation, and programming languages. I hope in the future we can extend it and possibly present it in better format.

As a special treat, I recount here the story the invention of the famous $D_\infty$ model of the untyped $\lambda$-calculus. I heard it first when I was Dana's student. In 2008 I asked Dana to recount it in the form of a short interview.

These days domain theory is a mature branch of mathematics. It has had profound influence on the theory and practice of programming languages. When did you start working on it and why?

Dana Scott: I was in Amsterdam in 1968/69 with my family. I met Strachey at IFIP WG2.2 in summer of 1969. I arranged leave from Princeton to work with him in the fall of 1969 in Oxford. I was trying to convince Strachey to use a type theory based on domains.

One of your famous results is the construction of a domain $D_\infty$ which is isomorphic to its own continuous function space $D_\infty \to D_\infty$. How did you invent it?

D. S.: $D_\infty$ did not come until later. I remember it was a quiet Saturday in November 1969 at home. I had proved that if domains $D$ and $E$ have a countable basis of finite elements, then so does the continuous function space $D \to E$. In understanding how often the basis for $D \to E$ was more complicated than the bases for $D$ and $E$, I then thought, “Oh, no, there must exist a bad $D$ with a basis so 'dense' that the basis for $D \to D$ is just as complicated – in fact, isomorphic.” But I never proved the existence of models exactly that way because I soon saw that the iteration of $X \mapsto (X \to X)$ constructed a suitable basis in the limit. That was the actual $D_\infty$ construction.

Why do you say “oh no”? It was an important discovery!

D. S.: Since, I had claimed for years that the type-free $\lambda$-calculus has no “mathematical” models (as distinguished from term models), I said to myself, “Oh, no, now I will have to eat my own words!”

The existence of term models is guaranteed by the Church-Rosser theorem from 1936 which implies that the untyped lambda calculus is consistent?

D. S.: Yes.

The domain $D_\infty$ is an involved construction which gives a model for the calculus with both $\beta$- and $\eta$-rules. Is it easier to give a model which satisfies the $\beta$-rule only?

D. S.: Since the powerset of natural numbers $P\omega$ (with suitable topology) is universal for countably-based $T_0$-spaces, and since a continuous lattice is a retract of every superspace, it follows that $P\omega \to P\omega$ is a retract of $P\omega$. This gives a non-$\eta$ model without any infinity-limit constructions. But continuous lattices had not yet been invented in 1969 – that I knew of.

Where can the interested readers read more about this topic?

D.S.: I would recommend these two:

Scott, D. A type-theoretical alternative to ISWIM, CUCH, OWHY. Theoretical Computer Science, vol. 121 (1993), pp. 411-440.
Scott, D. Some Reflections on Strachey and his Work. A Special Issue Dedicated to Christopher Strachey, edited by O. Danvy and C. Talcott. Higer-Order and Symbolic Computation, vol. 13 (2000), pp. 103-114.

Thank you very much!

Dana Scott: You are welcome.

One syntax to rule them all

2022-05-20T00:00:00+02:00

I am at the Syntax and Semantics of Type Theory workshop in Stockholm, a kickoff meeting for WG6 of the EuroProofNet COST network, where I am giving a talk “One syntax to rule them all” based on joint work with Danel Ahman.

Abstract: The raw syntax of a type theory, or more generally of a formal system with binding constructs, involves not only free and bound variables, but also meta-variables, which feature in inference rules. Each notion of variable has an associated notion of substitution. A syntactic translation from one type theory to another brings in one more level of substitutions, this time mapping type-theoretic constructors to terms. Working with three levels of substitution, each depending on the previous one, is cumbersome and repetitive. One gets the feeling that there should be a better way to deal with syntax.

In this talk I will present a relative monad capturing higher-rank syntax which takes care of all notions of substitution and binding-preserving syntactic transformations in one fell swoop. The categorical structure of the monad corresponds precisely to the desirable syntactic properties of binding and substitution. Special cases of syntax, such as ordinary first-order variables, or second-order syntax with variables and meta-variables, are obtained easily by precomposition of the relative monad with a suitable inclusion of restricted variable contexts into the general ones. The meta-theoretic properties of syntax transfer along the inclusion.

The relative monad is sufficiently expressive to give a notion of intrinsic syntax for simply typed theories. It remains to be seen how one could refine the monad to account for intrinsic syntax of dependent type theories.

Talk notes: Here are the hand-written talk notes, which cover more than I could say during the talk.

Formalization: I have the beginning of a formalization of the higher-rank syntax, but it hits a problem, see below. Can someone suggest a solution? (You can download Syntax.agda.)

{-
   An attempt at formalization of (raw) higher-rank syntax.

   We define a notion of syntax which allows for higher-rank binders,
   variables and substitutions. Ordinary notions of variables are
   special cases:

   * order 1: ordinary variables and substitutions, for example those of
     λ-calculus
   * order 2: meta-variables and their instantiations
   * order 3: symbols (term formers) in dependent type theory, such as
     Π, Σ, W, and syntactic transformations between theories

   The syntax is parameterized by a type Class of syntactic classes. For
   example, in dependent type theory there might be two syntactic
   classes, ty and tm, corresponding to type and term expressions.
-}

module Syntax (Class : Set) where

  {- Shapes can also be called “syntactic variable contexts”, as they assign to
     each variable its syntactic arity, but no typing information.

     An arity is a binding shape with a syntactic class. The shape specifies
     how many arguments the variable takes and how it binds the argument's variables.
     The class specifies the syntactic class of the variable, and therefore of the
     expression formed by it.

     We model shapes as binary trees so that it is easy to concatenate
     two of them. A more traditional approach models shapes as lists, in
     which case one has to append lists.
  -}

  infixl 6 _⊕_

  data Shape : Set where
    𝟘 : Shape -- the empty shape
    [_,_] : ∀ (γ : Shape) (cl : Class) → Shape -- the shape with precisely one variable
    _⊕_ : ∀ (γ : Shape) (δ : Shape) → Shape -- disjoint sum of shapes

  infix 5 [_,_]∈_

  {- The de Bruijn indices are binary numbers because shapes are binary
     trees. [ δ , cl ]∈ γ is the set of variable indices in γ whose arity
     is (δ, cl). -}

  data [_,_]∈_ : Shape → Class → Shape → Set where
    var-here : ∀ {θ} {cl} → [ θ , cl ]∈  [ θ , cl ]
    var-left :  ∀ {θ} {cl} {γ} {δ} → [ θ , cl ]∈ γ → [ θ , cl ]∈ γ ⊕ δ
    var-right : ∀ {θ} {cl} {γ} {δ} → [ θ , cl ]∈ δ → [ θ , cl ]∈ γ ⊕ δ

  {- Examples:

  postulate ty : Class -- type class
  postulate tm : Class -- term class

  ordinary-variable-arity : Class → Shape
  ordinary-variable-arity c = [ 𝟘 , c ]

  binary-type-metavariable-arity : Shape
  binary-type-metavariable-arity = [ [ 𝟘 , tm ] ⊕ [ 𝟘 , tm ] , ty ]

  Π-arity : Shape
  Π-arity = [ [ 𝟘 , ty ] ⊕ [ [ 𝟘 , tm ] , ty ] , ty ]

  -}

  {- Because everything is a variable, even symbols, there is a single
     expression constructor _`_ which forms and expression by applying
     the variable x to arguments ts. -}

  -- Expressions

  infix 9 _`_

  data Expr : Shape → Class → Set where
    _`_ : ∀ {γ} {δ} {cl} (x : [ δ , cl ]∈ γ) →
            (ts : ∀ {θ} {B} (y : [ θ , B ]∈ δ) → Expr (γ ⊕ θ) B) → Expr γ cl

  -- Renamings

  infix 5 _→ʳ_

  _→ʳ_ : Shape → Shape → Set
  γ →ʳ δ = ∀ {θ} {cl} (x : [ θ , cl ]∈ γ) → [ θ , cl ]∈ δ

  -- identity renaming

  𝟙ʳ : ∀ {γ} → γ →ʳ γ
  𝟙ʳ x = x

  -- composition of renamings

  infixl 7 _∘ʳ_

  _∘ʳ_ : ∀ {γ} {δ} {η} → (δ →ʳ η) → (γ →ʳ δ) → (γ →ʳ η)
  (r ∘ʳ s) x =  r (s x)

  -- renaming extension

  ⇑ʳ : ∀ {γ} {δ} {Θ} → (γ →ʳ δ) → (γ ⊕ Θ →ʳ δ ⊕ Θ)
  ⇑ʳ r (var-left x) =  var-left (r x)
  ⇑ʳ r (var-right y) = var-right y

  -- the action of a renaming on an expression

  infixr 6 [_]ʳ_

  [_]ʳ_ : ∀ {γ} {δ} {cl} (r : γ →ʳ δ) → Expr γ cl → Expr δ cl
  [ r ]ʳ (x ` ts) = r x ` λ { y → [ ⇑ʳ r ]ʳ ts y }

  -- substitution
  infix 5 _→ˢ_

  _→ˢ_ : Shape → Shape → Set
  γ →ˢ δ = ∀ {Θ} {cl} (x : [ Θ , cl ]∈ γ) → Expr (δ ⊕ Θ) cl

  -- side-remark: notice that the ts in the definition of Expr is just a substituition

  -- We now hit a problem when trying to define the identity substitution in a naive
  -- fashion. Agda rejects the definition, as it is not structurally recursive.
  -- {-# TERMINATING #-}
  𝟙ˢ : ∀ {γ} → γ →ˢ γ
  𝟙ˢ x = var-left x ` λ y →  [ ⇑ʳ var-right ]ʳ 𝟙ˢ y

  {- What is the best way to deal with the non-termination problem? I have tried:

     1. sized types: got mixed results, perhaps I don't know how to use them
     2. well-founded recursion: it gets messy and unpleasant to use
     3. reorganizing the above definitions, but non-structural recursion always sneeks in

     A solution which makes the identity substitition compute is highly preferred.

     The problem persists with other operations on substitutions, such as composition
     and the action of a substitution.
  -}

Two new doctors!

2022-01-12T00:00:00+01:00

Within a month two of my students defended their theses: Dr. Anja Petković Komel just before Christmas, and Dr. Philipp Haselwarter just yesterday. I am very proud of them. Congratulations!

Philipp's thesis An Effective Metatheory for Type Theory has three parts:

A formulation and a study of the notion of finitary type theories and standard type theories. These are closely related to the general type theories that were developed with Peter Lumsdaine, but are tailored for implementation.
A formulation and the study of context-free finitary type theories, which are type theories without explicit contexts. Instead, the variables are annotated with their types. Philipp shows that one can pass between the two versions of type theory.
A novel effectful meta-language Andromeda meta-language (AML) for proof assistants which uses algebraic effects and handlers to allow flexible interaction between a generic proof assistant and the user.

Anja's thesis Meta-analysis of type theories with an application to the design of formal proofs also has three parts:

A formulation and a study of transformations of finitary type theories with an associated category of finitary type theories.
A user-extensible equality checking algorithm for standard type theories which specializes to several existing equality checking algorithms for specific type theories.
A general elaboration theorem in which the transformation of type theories are used to prove that every finitary type theory (not necessarily fully annotated) can be elaborated to a standard type theory (fully annotated one).

In addition, Philipp has done a great amount of work on implementing context-free type theories and the effective meta-language in Andromeda 2, and Anja implemented the generic equality checking algorithm. In the final push to get the theses out the implementation suffered a little bit and is lagging behind. I hope we can bring it up to speed and make it usable. Anja has ideas on how to implement transformations of type theories in a proof assistant.

Of course, I am very happy with the particular results, but I am even happier with the fact that Philipp and Anja made an important step in the development of type theory as a branch of mathematics and computer science: they did not study a particular type theory or a narrow family of them, as has hitherto been the norm, but dependent type theories in general. Their theses contain interesting non-trivial meta-theorems that apply to large classes of type theories, and can no doubt be generalized even further. There is lots of low-hanging fruit out there.