<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Mathematics and Computation</title>
	<atom:link href="http://math.andrej.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://math.andrej.com</link>
	<description>Mathematics for computers</description>
	<lastBuildDate>Tue, 25 Dec 2012 04:22:07 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5-alpha-21589</generator>
		<item>
		<title>Free variables are not &#8220;implicitly universally quantified&#8221;!</title>
		<link>http://math.andrej.com/2012/12/25/free-variables-are-not-implicitly-universally-quantified/</link>
		<comments>http://math.andrej.com/2012/12/25/free-variables-are-not-implicitly-universally-quantified/#comments</comments>
		<pubDate>Tue, 25 Dec 2012 01:27:53 +0000</pubDate>
		<dc:creator>Andrej Bauer</dc:creator>
				<category><![CDATA[Logic]]></category>
		<category><![CDATA[Tutorial]]></category>

		<guid isPermaLink="false">http://math.andrej.com/?p=1387</guid>
		<description><![CDATA[<p>Mathematicians are often confused about the meaning of variables. I hear them say &#8220;a free variable is implicitly universally quantified&#8221;, by which they mean that it is ok to equate a formula $\phi$ with a free variable $x$ with its universal closure $\forall x \,.\, \phi$. I am addressing this post to those who share this [...]]]></description>
				<content:encoded><![CDATA[<p>Mathematicians are often confused about the meaning of variables. I hear them say &#8220;a free variable is implicitly universally quantified&#8221;, by which they mean that it is ok to equate a formula $\phi$ with a free variable $x$ with its universal closure $\forall x \,.\, \phi$. I am addressing this post to those who share this opinion.</p>
<p><span id="more-1387"></span></p>
<p>I will give several reasons, which are all essentially the same, why &#8220;there is no difference between $\phi$ and $\forall x \,.\, \phi$&#8221; is a really bad opinion to have.</p>
<h3>Reason 1: you wouldn&#8217;t equate a function with its definite integral</h3>
<p>You would not claim that a real-valued function $f : \mathbb{R} \to \mathbb{R}$ is &#8220;the same thing&#8221; as its definite integral $\int_{\mathbb{R}} f(x) \, d x$, would you? One is a real function, the other is a real number. Likewise, $\phi$ is a truth <em>function</em> and $\forall x \,.\, \phi(x)$ is a truth <em>value</em>.</p>
<h3>Reason 2: functions are not their own values</h3>
<p>To be quite precise, the expression $\phi$ by itself is not a function, just like the expression $x + \sin x$ is not a function. To make it into a function we must first <em>abstract</em> the variable $x$, which is usually written as $x \mapsto x + \sin x$, or $\lambda x \,.\, x + \sin x$, or <code>fun x -&gt; x +. sin x</code>. In logic we indicate the fact that $\phi$ is a function by putting it in a <em>context</em>, so we write something like $x : \mathbb{R} \vdash \phi$.</p>
<p>Why is all this nit-picking necessary? Try answering these questions with &#8220;yes&#8221; and &#8220;no&#8221; consistently:</p>
<ol>
<li>Is $x + \sin x$ a function in variable $x$?</li>
<li>Is $x + \sin x$ a function in variables $x$ and $y$?</li>
<li>Is $y - y + x + \sin x$ a function in variables $x$ and $y$?</li>
<li>Is $x + \sin x = y - y + x + \sin x$?</li>
</ol>
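<p>One way to see the trouble with these questions is to write the abstractions out explicitly. Here is a small OCaml sketch (the names <code>f</code>, <code>g</code> and <code>h</code> are made up for illustration) in which the same expression is abstracted in two different contexts:</p>
<pre class="brush: plain; title: ; notranslate">
(* The expression x + sin x abstracted in the context x. *)
let f : float -&gt; float = fun x -&gt; x +. sin x

(* The same expression abstracted in the context x, y. *)
let g : float * float -&gt; float = fun (x, _y) -&gt; x +. sin x

(* The expression y - y + x + sin x abstracted in the context x, y. *)
let h : float * float -&gt; float = fun (x, y) -&gt; y -. y +. x +. sin x

let () =
  (* g and h have the same type and agree on this input. *)
  assert (g (1.0, 2.0) = h (1.0, 2.0));
  (* f and g have different types, so comparing them directly would not
     even type-check; we can only compare their values at chosen inputs. *)
  assert (f 1.0 = g (1.0, 2.0))
</pre>
<p>Once the context is part of the data, the second and third question are about <code>g</code> and <code>h</code>, the first one is about <code>f</code>, and the fourth one has to say in which context the equation is meant.</p>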
<p>A similar sort of mistake happens in algebra where people think that polynomials are functions. They are not. They are elements of a certain freely generated ring.</p>
<h3>Reason 3: They are not logically equivalent</h3>
<p>It is absurd to claim that $\phi$ and $\forall x \in \mathbb{R} \,.\, \phi$ are logically equivalent statements. Suppose $\forall x \in \mathbb{R} \,.\, x > 2$ were equivalent to $x > 2$. Then I could replace one by the other in any formula I wish. So I choose the formula $\exists x \in \mathbb{R} \,.\, x > 2$. It must be equivalent to $\exists x \in \mathbb{R} \,.\, \forall x \in \mathbb{R} \,.\, x > 2$, but since $\forall x \in \mathbb{R} \,.\, x > 2$ is false, we get $\exists x \in \mathbb{R} \,.\, \bot$, which is false. We proved that there is no number larger than 2.</p>
<h3>Reason 4: They are not inter-derivable</h3>
<p>If you can tell the difference between an implication and logical entailment, perhaps you might try to counter reason 3 by pointing out that $\phi$ and $\forall x \,.\, \phi$ are either both derivable, or both not derivable. That is to say, we can prove one if, and only if, we can prove the other. But again, this is not the case. We can prove $\forall x \in \emptyset \,.\, \bot$ but we cannot prove $\bot$.</p>
<h3>Reason 5: Bound variables can be renamed but free variables cannot</h3>
<p>The formula $x > 2$ is obviously not the same thing as the formula $y > 2$. But the formula $\forall x \in \mathbb{R} \,.\, x > 2$ is actually the same as $\forall y \in \mathbb{R} \,.\, y > 2$. If you find this confusing it is because you were never taught properly how to handle <a href="http://en.wikipedia.org/wiki/Free_variables_and_bound_variables">free and bound variables</a>.</p>
<h3>Reason 6: You cannot prove $\forall x \,.\, \phi$ without allowing $x$ to become free</h3>
<p>Perhaps we can just forbid free variables altogether and <em>stipulate</em> that all variables must always be quantified. But how are you then going to prove $\forall x \in \mathbb{R} \,.\, \phi$? The usual way</p>
<blockquote><p>
&#8220;Consider any $x \in \mathbb{R}$. Then bla bla bla, therefore $\phi$.&#8221;
</p></blockquote>
<p>is now forbidden because the first sentence introduces $x$ as a free variable.</p>
<p>We can abolish variables altogether if we wish, by resorting to combinators, but it makes no sense to keep variables and make them all bound all the time.</p>
<h3>Epilogue: so in what sense are they the same?</h3>
<p>There is a theorem in model theory:</p>
<blockquote><p>
Let $\phi$ be a formula in context $x_1, \ldots, x_n$ and $M$ a structure in which we can interpret $\phi$. The following are equivalent:</p>
<ol>
<li>the universal closure $\forall x_1, \ldots, x_n \,.\, \phi$ is valid in $M$,</li>
<li>for every valuation $\nu : \lbrace x_1, \ldots, x_n \rbrace \to M$, $\phi[\nu]$ is valid in $M$.</li>
</ol>
</blockquote>
<p>This is sometimes abbreviated (quite inaccurately) as &#8220;a formula and its universal closure are semantically equivalent&#8221;. This theorem is causing a lot of harm because mathematicians interpret it as &#8220;free variables are implicitly universally bound&#8221;. But the theorem itself clearly distinguishes a formula from its universal closure. It has a limited range of applications in model theory. It is not a general reasoning principle that would allow you to dispose of thinking about free variables.</p>
<p>You are in good company. Philosophers have thought about free variables for millennia, although they phrase the problem in the language of <a href="http://en.wikipedia.org/wiki/Universal_(metaphysics)">universals</a> and <a href="http://en.wikipedia.org/wiki/Particular">particulars</a>. They wonder whether &#8220;dog&#8221; is the same thing as the set of all dogs, or perhaps there is an ideal dog which is &#8220;pure dogness&#8221;, but then do we need two ideal dogs to make ideal pups, etc. The answer is simple: a free variable is a projection from a cartesian product.</p>
]]></content:encoded>
			<wfw:commentRss>http://math.andrej.com/2012/12/25/free-variables-are-not-implicitly-universally-quantified/feed/</wfw:commentRss>
		<slash:comments>19</slash:comments>
		</item>
		<item>
		<title>How to implement dependent type theory III</title>
		<link>http://math.andrej.com/2012/11/29/how-to-implement-dependent-type-theory-iii/</link>
		<comments>http://math.andrej.com/2012/11/29/how-to-implement-dependent-type-theory-iii/#comments</comments>
		<pubDate>Thu, 29 Nov 2012 03:55:28 +0000</pubDate>
		<dc:creator>Andrej Bauer</dc:creator>
				<category><![CDATA[Homotopy type theory]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Tutorial]]></category>

		<guid isPermaLink="false">http://math.andrej.com/?p=1337</guid>
		<description><![CDATA[<p>I spent a week trying to implement higher-order pattern unification. I looked at a couple of PhD dissertations, talked to lots of smart people, and failed because the substitutions were just getting in the way all the time. So today we are going to bite the bullet and implement de Bruijn indices and explicit substitutions.</p>
<p>The code is [...]]]></description>
				<content:encoded><![CDATA[<p>I spent a week trying to implement higher-order pattern unification. I looked at a couple of PhD dissertations, talked to lots of smart people, and failed because the substitutions were just getting in the way all the time. So today we are going to bite the bullet and implement <a href="http://en.wikipedia.org/wiki/De_Bruijn_index">de Bruijn indices</a> and <a href="http://en.wikipedia.org/wiki/Explicit_substitution">explicit substitutions</a>.</p>
<p>The code is available on Github in the repository <a href="https://github.com/andrejbauer/tt/tree/blog-part-III">andrejbauer/tt</a> (the <code>blog-part-III</code> branch).</p>
<p><span id="more-1337"></span></p>
<p>People say that de Bruijn indices and explicit substitutions are difficult to implement. I agree: I spent far too long debugging my code. But because every bug crashed and burnt my program immediately, I at least knew I was not done. In contrast, &#8220;manual&#8221; substitutions hide their bugs really well, and so are even more difficult to get right. I am convinced that my implementation from part II is still buggy.</p>
<h3>Blitz introduction to de Bruijn indices and explicit substitution</h3>
<p>If you do not know about <a href="http://en.wikipedia.org/wiki/De_Bruijn_index">de Bruijn indices</a> and <a href="http://en.wikipedia.org/wiki/Explicit_substitution">explicit substitutions</a> you should first read the relevant Wikipedia pages, and perhaps the <a href="http://www.hpl.hp.com/techreports/Compaq-DEC/SRC-RR-54.pdf">original paper on explicit substitutions</a>, written by a truly impressive group of authors. Here is an inadequate short explanation for those who cannot be bothered to click on links.</p>
<p>We keep looking up variables in a context by their names, which seems a bit inefficient. We might have the bright idea of referring to <em>positions</em> in the context directly. We can indeed do this, and because a context is like a stack there are two choices:</p>
<ul>
<li><em>de Bruijn levels</em> are positions as counted from the bottom of the stack,</li>
<li><em>de Bruijn indices</em> are positions as counted from the top of the stack.</li>
</ul>
<p>We will use the indices. Thus, when the context grows all the old indices have to be <em>shifted</em> by one, which sounds more horrible than it is, as levels bring their own problems (which?). For instance, the $\lambda$-term $\lambda x \,.\, \lambda y \,.\, x$ is written with de Bruijn indices as $\lambda \, (\lambda \, 1)$, whereas $\lambda x \,.\, \lambda y \,.\, y$ is written as $\lambda \, (\lambda \, 0)$. (Just go read the Wikipedia article on <a href="http://en.wikipedia.org/wiki/De_Bruijn_index">de Bruijn indices</a> if you have not seen this before.)</p>
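<p>To make the translation concrete, here is a minimal self-contained sketch of how named $\lambda$-terms could be converted to de Bruijn indices. It is not the code from the repository: the types <code>named</code> and <code>db</code> and the functions <code>index</code> and <code>to_db</code> are hypothetical names chosen for this illustration, and the context is simply a list of names with the most recently bound one at the head.</p>
<pre class="brush: plain; title: ; notranslate">
(* Lambda terms with names. *)
type named =
  | Var of string
  | Lam of string * named
  | App of named * named

(* Lambda terms with de Bruijn indices. *)
type db =
  | DVar of int
  | DLam of db
  | DApp of db * db

(* The position of [x] in the context, counted from the top of the stack. *)
let rec index x = function
  | [] -&gt; failwith ("unbound variable " ^ x)
  | y :: ys -&gt; if x = y then 0 else 1 + index x ys

(* Replace names by indices; going under a binder pushes its name. *)
let rec to_db ctx = function
  | Var x -&gt; DVar (index x ctx)
  | Lam (x, e) -&gt; DLam (to_db (x :: ctx) e)
  | App (e1, e2) -&gt; DApp (to_db ctx e1, to_db ctx e2)

let () =
  (* The two examples from the text. *)
  assert (to_db [] (Lam ("x", Lam ("y", Var "x"))) = DLam (DLam (DVar 1)));
  assert (to_db [] (Lam ("x", Lam ("y", Var "y"))) = DLam (DLam (DVar 0)))
</pre>
<p>Pushing a name onto the list is exactly what shifts all the older indices by one.</p>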
<p>The shifting and pushing of new things onto the context is expressed with explicit substitutions:</p>
<pre class="brush: plain; title: ; notranslate">
type substitution =
  | Shift of int
  | Dot of expr * substitution
</pre>
<p>Read <code>Shift k</code> as &#8220;add $k$ to all indices&#8221; and <code>Dot(e,s)</code> as &#8220;push $e$ and use $s$&#8221;. In mathematical notation we write $\uparrow^n$ instead of <code>Shift n</code> and $e \cdot \sigma$ instead of <code>Dot(e,sigma)</code>. An explicit substitution $\sigma$ acts on an expression $e$ to give a new expression $[\sigma] e$. For example:</p>
<ul>
<li>$[\uparrow^k] (\mathtt{Var}\, m) = \mathtt{Var} (k + m)$</li>
<li>$[e \cdot \sigma] (\mathtt{Var}\, 0) = e$</li>
<li>$[e \cdot \sigma] (\mathtt{Var}\, (k+1)) = [\sigma](\mathtt{Var}\, k).$</li>
</ul>
<p>Below we will read off the other equations from the source code. Substitutions are performed on demand, which means that $[\sigma] e$ is an expression that needs to be accounted for in the syntax.</p>
<h3>Splitting the syntax</h3>
<p>The user is going to type in syntax with names, which we have to convert to an internal syntax that uses the indices. We should also keep the original names around for pretty-printing purposes. Therefore we need a datatype <a href="https://github.com/andrejbauer/tt/blob/blog-part-III/input.ml"><code>Input.expr</code></a> for parsing,</p>
<pre class="brush: plain; title: ; notranslate">
(** Abstract syntax of expressions as given by the user. *)
type expr = expr' * Common.position
and expr' =
  | Var of Common.variable
  | Universe of int
  | Pi of abstraction
  | Lambda of abstraction
  | App of expr * expr

(** An abstraction [(x,t,e)] indicates that [x] of type [t] is bound in [e]. *)
and abstraction = Common.variable * expr * expr
</pre>
<p>and a datatype <a href="https://github.com/andrejbauer/tt/blob/blog-part-III/syntax.ml"><code>Syntax.expr</code></a> for the internal syntax:</p>
<pre class="brush: plain; title: ; notranslate">
(** Abstract syntax of expressions, where de Bruijn indices are used to represent
    variables. *)
type expr = expr' * Common.position
and expr' =
  | Var of int                   (* de Bruijn index *)
  | Subst of substitution * expr (* explicit substitution *)
  | Universe of universe
  | Pi of abstraction
  | Lambda of abstraction
  | App of expr * expr

(** An abstraction [(x,t,e)] indicates that [x] of type [t] is bound in [e]. We also keep around
    the original name [x] of the bound variable for pretty-printing purposes. *)
and abstraction = Common.variable * expr * expr

(** Explicit substitutions. *)
and substitution =
  | Shift of int
  | Dot of expr * substitution
</pre>
<p>Conversion from one to the other is done by <a href="https://github.com/andrejbauer/tt/blob/blog-part-III/desugar.ml"><code>Desugar.desugar</code></a>. Notice that we do not throw away variable names, but rather keep them around in the internal syntax so that we can print them out later. Strangely enough, <a href="https://github.com/andrejbauer/tt/blob/blog-part-III/beautify.ml"><code>beautify.ml</code></a> gets shorter with de Bruijn indices.</p>
<h3>Explicit substitutions</h3>
<p>The <a href="https://github.com/andrejbauer/tt/blob/blog-part-III/syntax.ml"><code>Syntax</code></a> module contains a couple of functions for handling explicit substitutions. First we have <code>Syntax.compose</code> which tells us how substitutions are composed:</p>
<pre class="brush: plain; title: ; notranslate">
let rec compose s t =
  match s, t with
    | s, Shift 0 -&gt; s
    | Dot (e, s), Shift m -&gt; compose s (Shift (m - 1))
    | Shift m, Shift n -&gt; Shift (m + n)
    | s, Dot (e, t) -&gt; Dot (mk_subst s e, compose s t)
</pre>
<p>In mathematical notation:</p>
<ul>
<li>$\sigma \circ \uparrow^0 = \sigma$</li>
<li>$(e \cdot \sigma) \circ \uparrow^{m} = \sigma \circ \uparrow^{m-1}$</li>
<li>$\uparrow^{m} \circ \uparrow^{n} = \uparrow^{m + n}$</li>
<li>$\sigma \circ (e \cdot \tau) = [\sigma] e \cdot (\sigma \circ \tau)$</li>
</ul>
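<p>As a small worked instance of these rules (an illustrative example, with $a$ standing for an arbitrary expression), take $\sigma = \uparrow^1$ and $\tau = a \cdot \uparrow^0$. The last rule followed by the first one gives<br />
$$\uparrow^1 \circ (a \cdot \uparrow^0) = [\uparrow^1] a \cdot (\uparrow^1 \circ \uparrow^0) = [\uparrow^1] a \cdot \uparrow^1.$$</p>
<p>Applying the result to $\mathtt{Var}\,0$ yields $[\uparrow^1] a$, which is what we get if we first apply $a \cdot \uparrow^0$ to $\mathtt{Var}\,0$, obtaining $a$, and then shift by one.</p>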
<p>Of course, composition $\circ$ is the operation characterized by the equation $[\sigma \circ \tau] e = [\sigma]([\tau] e)$. Next we have <code>Syntax.subst</code> which explains how substitutions are performed:</p>
<pre class="brush: plain; title: ; notranslate">
(** [subst s e] applies explicit substitution [s] in expression [e]. It does so
    lazily, i.e., it does just enough to expose the outermost constructor of [e]. *)
let subst =
  let rec subst s ((e', loc) as e) =
    match s, e' with
      | Shift m, Var k -&gt; Var (k + m), loc
      | Dot (e, s), Var 0 -&gt; subst idsubst e
      | Dot (e, s), Var k -&gt; subst s (Var (k - 1), loc)
      | s, Subst (t, e) -&gt; subst s (subst t e)
      | _, Universe _ -&gt; e
      | s, Pi a -&gt; Pi (subst_abstraction s a), loc
      | s, Lambda a -&gt; Lambda (subst_abstraction s a), loc
      | s, App (e1, e2) -&gt; App (mk_subst s e1, mk_subst s e2), loc
  and subst_abstraction s (x, e1, e2) =
    let e1 = mk_subst s e1 in
    let e2 = mk_subst (Dot (mk_var 0, compose (Shift 1) s)) e2 in
      (x, e1, e2)
  in
    subst
</pre>
<p>The code is not very readable, but in mathematical notation the interesting bits say:</p>
<ul>
<li>$[\uparrow^m](\mathtt{Var}\,k) = \mathtt{Var}\,(k + m)$</li>
<li>$[e \cdot \sigma] (\mathtt{Var}\,0) = e$</li>
<li>$[e \cdot \sigma] (\mathtt{Var}\,k) = [\sigma](\mathtt{Var}(k-1))$</li>
<li>$[\sigma](\lambda\, e) = \lambda \, ([\mathtt{Var}\,0 \cdot (\uparrow^1 \circ \sigma)] e)$</li>
<li>$[\sigma](e_1\,e_2) = ([\sigma]e_1)([\sigma]e_2)$</li>
</ul>
<p>There is also <code>Syntax.occurs</code> which checks whether a given index appears freely in an expression. This is not entirely trivial because explicit substitutions and abstractions change the indices, so the function has to keep track of what is what.</p>
<p>You may wonder what happened to $\beta$-reduction. If you look at <code>Norm.norm</code> you will discover it buried in the code for normalization of applications:<br />
$$(\lambda \, e_1)\, e_2 = [e_2 \cdot \uparrow^0] e_1.$$</p>
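<p>If you want to play with these equations in isolation, the following is a stripped-down sketch. It is an assumption-laden simplification rather than the code from <code>syntax.ml</code>: it works on untyped $\lambda$-terms, drops positions and variable names, and performs substitutions eagerly instead of lazily. The function <code>beta</code> is a hypothetical helper that implements the equation above.</p>
<pre class="brush: plain; title: ; notranslate">
type expr =
  | Var of int
  | Lam of expr
  | App of expr * expr

type substitution =
  | Shift of int
  | Dot of expr * substitution

(* [subst s e] applies [s] to [e] all the way down (eagerly). *)
let rec subst s e =
  match s, e with
  | Shift m, Var k -&gt; Var (k + m)
  | Dot (e', _), Var 0 -&gt; e'
  | Dot (_, s'), Var k -&gt; subst s' (Var (k - 1))
  | _, Lam body -&gt; Lam (subst (Dot (Var 0, compose (Shift 1) s)) body)
  | _, App (e1, e2) -&gt; App (subst s e1, subst s e2)

(* [compose s t] first performs [t] and then [s]. *)
and compose s t =
  match s, t with
  | s, Shift 0 -&gt; s
  | Dot (_, s'), Shift m -&gt; compose s' (Shift (m - 1))
  | Shift m, Shift n -&gt; Shift (m + n)
  | s, Dot (e, t') -&gt; Dot (subst s e, compose s t')

(* Beta-reduce the redex whose function has the given body. *)
let beta body arg = subst (Dot (arg, Shift 0)) body

let () =
  let a = Var 5 in
  (* (lambda 0) a reduces to a. *)
  assert (beta (Var 0) a = a);
  (* (lambda (lambda 1)) a reduces to lambda a, with a shifted under the binder. *)
  assert (beta (Lam (Var 1)) a = Lam (Var 6));
  (* One instance of the equation characterizing composition. *)
  let s = Dot (Var 7, Shift 2) and t = Shift 1 in
  assert (subst (compose s t) (Var 0) = subst s (subst t (Var 0)))
</pre>
<p>The eager version is easier to test but does more work than necessary, which is precisely what the lazy <code>Subst</code> constructor in the real syntax avoids.</p>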
<h3>Normalization</h3>
<p>In the last part we demonstrated normalization by evaluation. We always normalized everything all the way, which is overkill. For example, during equality checking the <a href="http://encyclopedia2.thefreedictionary.com/Weak+Head+Normal+Form">weak head normal form</a> suffices to get the comparison started, and then we normalize on demand. So I replaced normalization by evaluation with direct normalization, as done in <a href="https://github.com/andrejbauer/tt/blob/blog-part-III/norm.ml"><code>norm.ml</code></a>. We still need normal forms when the user asks for them. Luckily, a single function can perform both kinds of normalization.</p>
<h3>Optimization</h3>
<p>The source contains no optimizations at all because its purpose is to be as clear as possible. The whole program is still pretty small: we are at 824 lines, while the core is just 247 lines. The speed is comparable to the previous version, but with a bit of effort we should be able to speed it up considerably. Here are some opportunities:</p>
<ul>
<li>we normalize a definition every time we look it up in the context,</li>
<li>explicit substitutions tend to cancel out, and it is a good idea to look for common special cases, like composition with the identity substitution,</li>
<li>there is a lot of shifting happening when we look things up in the context, perhaps some of those could be avoided.</li>
</ul>
<p>If anyone wants to work on these, I would be delighted to accept a pull request.</p>
<p>I really have to do some serious math and stop playing around, so do not expect the next part anytime soon.</p>
]]></content:encoded>
			<wfw:commentRss>http://math.andrej.com/2012/11/29/how-to-implement-dependent-type-theory-iii/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>How to implement dependent type theory II</title>
		<link>http://math.andrej.com/2012/11/11/how-to-implement-dependent-type-theory-ii/</link>
		<comments>http://math.andrej.com/2012/11/11/how-to-implement-dependent-type-theory-ii/#comments</comments>
		<pubDate>Sun, 11 Nov 2012 15:48:41 +0000</pubDate>
		<dc:creator>Andrej Bauer</dc:creator>
				<category><![CDATA[Homotopy type theory]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Tutorial]]></category>

		<guid isPermaLink="false">http://math.andrej.com/?p=1320</guid>
		<description><![CDATA[<p>I am on a roll. In the second post on how to implement dependent type theory we are going to:</p>

Spiff up the syntax by allowing more flexible syntax for bindings in functions and products.
Keep track of source code locations so that we can report where the error has occurred.
Perform normalization by evaluation.

<p></p>
<p>The relevant Github repository is [...]]]></description>
				<content:encoded><![CDATA[<p>I am on a roll. In the second post on how to implement dependent type theory we are going to:</p>
<ol>
<li>Spiff up the syntax by allowing more flexible syntax for bindings in functions and products.</li>
<li>Keep track of source code locations so that we can report <em>where</em> the error has occurred.</li>
<li>Perform <a href="http://en.wikipedia.org/wiki/Normalisation_by_evaluation">normalization by evaluation</a>.</li>
</ol>
<p><span id="more-1320"></span></p>
<p>The relevant Github repository is <a href="https://github.com/andrejbauer/tt/tree/blog-part-II">andrejbauer/tt/</a> (branch blog-part-II). By the way, there are probably bugs in the implementation, as I am not spending a huge amount of time on testing (mental note: put &#8220;implement testing&#8221; on the to do list). If you discover one, please tell me, or preferably make a pull request with a fix. This also applies to old branches.</p>
<h3>Source code positions</h3>
<p>Source code positions seem like an annoyance because they pollute our nice datatypes. Nevertheless, an even bigger annoyance is an error message without an indication of its position.</p>
<p>The OCaml lexer keeps track of positions, and menhir has support for them, so we just need to incorporate them into our program. Every expression should be tagged with the source code position it came from. Sometimes we generate expressions with no associated position, so we define:</p>
<pre class="brush: plain; title: ; notranslate">
type position =
  | Position of Lexing.position * Lexing.position
  | Nowhere
</pre>
<p>The type <code>Lexing.position</code> is the one from the OCaml lexer. Each expression is associated with two such positions, its beginning and end. To tag expressions with positions we define two types: <code>expr</code> is an expression with a position and <code>expr'</code> without (I stole the idea from <a href="http://matija.pretnar.info/">Matija Pretnar</a>&#8217;s <a href="/eff/">eff</a> code):</p>
<pre class="brush: plain; title: ; notranslate">
(** Abstract syntax of expressions. *)
type expr = expr' * position
and expr' =
  | Var of variable
  | Universe of int
  | Pi of abstraction
  | Lambda of abstraction
  | App of expr * expr

(** An abstraction [(x,t,e)] indicates that [x] of type [t] is bound in [e]. *)
and abstraction = variable * expr * expr
</pre>
<p>Note that <code>expr'</code> refers back to <code>expr</code> so that subexpressions come equipped with their positions. We generally follow the rule that an apostrophe is attached to a type or a function which is position-less. Except that apostrophes are not valid in the names of grammatical rules in the parser, so in <a href="https://github.com/andrejbauer/tt/blob/blog-part-II/parser.mly">parser.mly</a> we write <code>plain_expr</code> instead of <code>expr'</code>.</p>
<p>We also extend the pretty printer and error reporting with positions, feel free to consult the source code.</p>
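<p>For a rough idea of what error reporting with positions can look like, here is a self-contained sketch. The function <code>string_of_position</code> is a hypothetical helper written for this illustration, not necessarily what the repository does; the record fields <code>pos_lnum</code>, <code>pos_bol</code> and <code>pos_cnum</code> are the ones provided by the standard <code>Lexing</code> module.</p>
<pre class="brush: plain; title: ; notranslate">
type position =
  | Position of Lexing.position * Lexing.position
  | Nowhere

(* Describe a position by its line and the character range on that line.
   For simplicity we assume that the expression does not span several lines. *)
let string_of_position = function
  | Nowhere -&gt; "unknown position"
  | Position (p1, p2) -&gt;
    let line = p1.Lexing.pos_lnum in
    let c1 = p1.Lexing.pos_cnum - p1.Lexing.pos_bol in
    let c2 = p2.Lexing.pos_cnum - p1.Lexing.pos_bol in
    Printf.sprintf "line %d, characters %d-%d" line c1 c2

let () =
  (* A made-up position: line 3 starts at offset 40 of the file. *)
  let p1 =
    { Lexing.pos_fname = "church.tt"; pos_lnum = 3; pos_bol = 40; pos_cnum = 44 } in
  let p2 = { p1 with Lexing.pos_cnum = 49 } in
  assert (string_of_position (Position (p1, p2)) = "line 3, characters 4-9");
  assert (string_of_position Nowhere = "unknown position")
</pre>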
<h3>Better syntax for bindings</h3>
<p>This is a fairly trivial change. It is annoying to have to write things like</p>
<pre class="brush: plain; title: ; notranslate">
fun x : A =&gt; fun y : A =&gt; fun z : B =&gt; fun w : B =&gt; ...
</pre>
<p>We improve the parser so that it accepts syntax like</p>
<pre class="brush: plain; title: ; notranslate">
fun (x y : A) (z w : B) =&gt; ...
</pre>
<p>Let us read out the relevant portion of <a href="https://github.com/andrejbauer/tt/blob/blog-part-II/parser.mly">parser.mly</a>, namely the rules <code>abstraction</code>, <code>bind1</code> and <code>binds</code>:</p>
<pre class="brush: plain; title: ; notranslate">
abstraction:
  | b = bind1
    { [b] }
  | bs = binds
    { bs }

bind1: mark_position(plain_bind1) { $1 }
plain_bind1:
  | xs = nonempty_list(NAME) COLON t = expr
    { (List.map (fun x -&gt; String x) xs, t) }

binds:
  | LPAREN b = bind1 RPAREN
    { [b] }
  | LPAREN b = bind1 RPAREN lst = binds
    { b :: lst }
</pre>
<p>A <code>bind1</code> is something of the form <code>x y ... z : t</code>. A <code>binds</code> is a non-empty list of parenthesized <code>bind1</code>&#8217;s. An abstraction is either a <code>bind1</code> or a <code>binds</code>. Thus we can write <code>fun x y z : t =&gt; ...</code> and <code>fun (x y z : t) =&gt; ...</code> and <code>fun (x y : t) (z : t) =&gt; ...</code> but not <code>fun x y : t (z : t) =&gt; ...</code>.</p>
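<p>The parser produces a list of bindings, each of which is a list of names together with a type, and somebody has to turn that into nested abstractions. Here is one way this could be done, as a sketch on a toy datatype: the type <code>expr</code> below is a simplified stand-in without positions, and <code>make_lambda</code> is a hypothetical name, so consult the source for what actually happens.</p>
<pre class="brush: plain; title: ; notranslate">
type expr =
  | Var of string
  | Lambda of string * expr * expr   (* bound variable, its type, body *)

(* Turn bindings such as (x y : a) (z : b) into nested abstractions,
   one abstraction per bound variable. *)
let make_lambda (binds : (string list * expr) list) (body : expr) : expr =
  List.fold_right
    (fun (xs, t) e -&gt; List.fold_right (fun x e -&gt; Lambda (x, t, e)) xs e)
    binds body

let () =
  let a = Var "A" and b = Var "B" in
  (* fun (x y : A) (z : B) =&gt; x *)
  assert (make_lambda [(["x"; "y"], a); (["z"], b)] (Var "x")
          = Lambda ("x", a, Lambda ("y", a, Lambda ("z", b, Var "x"))))
</pre>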
<h3>Normalization by evaluation</h3>
<p>In the first version we performed normalization by substitution, just like theory books say we should. But this is horribly inefficient. We could improve efficiency by keeping a current substitution (a &#8220;runtime&#8221; environment) which maps variables to the expressions. When we encounter a variable we look up its value in the current substitution. This way at least we do not keep traversing expressions during substitutions.</p>
<p>An even cooler way to normalize is known as normalization by evaluation. We first &#8220;evaluate&#8221; expressions to actual OCaml values in such a way that definitionally equal expressions evaluate to (observationally) equivalent values, and then we reconstruct the expression from the value (the fancy speak is that we <a href="http://dictionary.reference.com/browse/reify">reify</a> the value). Apart from giving us a normal form there are all sorts of other benefits (Dan Grayson keeps asking me which, perhaps the more knowledgeable readers can point them out).</p>
<p>We need a datatype <code>value</code> into which we evaluate expressions. We need to evaluate expressions with free variables, which means that we are going to get stuck on applications of the form <code>x v1 v2 .. vn</code> where <code>x</code> is a free variable (these are called head-normal). We collect those in a separate datatype <code>neutral</code>:</p>
<pre class="brush: plain; title: ; notranslate">
type value =
  | Neutral of neutral
  | Universe of int
  | Pi of abstraction
  | Lambda of abstraction

and abstraction = variable * value * (value -&gt; value)

and neutral =
  | Var of variable
  | App of neutral * value
</pre>
<p>Abstractions will be evaluated to OCaml functions (so OCaml will take care of substitutions). Thus an abstraction like <code>fun x : t => e</code> should be evaluated to a pair <code>(u, v)</code> where <code>u</code> is the value of <code>t</code> and <code>v</code> is the function $x \mapsto e$. But if you look at the definition of <code>abstraction</code> above you see that we also keep around the variable name. This we do for pretty-printing purposes. When we reify an evaluated abstraction back to its expression form, we use the variable name as a hint.</p>
<p>You should take the time to read <a href="https://github.com/andrejbauer/tt/blob/blog-part-II/value.ml"><code>value.ml</code></a> which contains evaluation and reification, and comparison of values. Also note that <code>Infer.normalize</code> really is just the composition of evaluation and reification.</p>
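<p>If you would like to see the idea in isolation first, here is a minimal sketch of normalization by evaluation for the untyped $\lambda$-calculus. It is not the code from <code>value.ml</code>: there are no types, universes or positions, and the names <code>Closure</code>, <code>fresh</code> and <code>normalize</code> are assumptions made for this illustration. The fresh name generator is naive and assumes that the user never writes names such as <code>x_1</code>.</p>
<pre class="brush: plain; title: ; notranslate">
type expr =
  | Var of string
  | Lam of string * expr
  | App of expr * expr

type value =
  | Neutral of neutral
  | Closure of string * (value -&gt; value)   (* name hint and OCaml function *)

and neutral =
  | NVar of string
  | NApp of neutral * value

(* Evaluate an expression; variables not in [env] are free and stay neutral. *)
let rec eval env = function
  | Var x -&gt; (try List.assoc x env with Not_found -&gt; Neutral (NVar x))
  | Lam (x, e) -&gt; Closure (x, fun v -&gt; eval ((x, v) :: env) e)
  | App (e1, e2) -&gt;
    (match eval env e1 with
     | Closure (_, f) -&gt; f (eval env e2)
     | Neutral n -&gt; Neutral (NApp (n, eval env e2)))

(* A naive supply of fresh names based on the hint [x]. *)
let counter = ref 0
let fresh x = incr counter; x ^ "_" ^ string_of_int !counter

(* Reconstruct an expression from a value. *)
let rec reify = function
  | Neutral n -&gt; reify_neutral n
  | Closure (x, f) -&gt;
    let y = fresh x in
    Lam (y, reify (f (Neutral (NVar y))))

and reify_neutral = function
  | NVar x -&gt; Var x
  | NApp (n, v) -&gt; App (reify_neutral n, reify v)

let normalize e = reify (eval [] e)

let () =
  (* (fun x =&gt; x) z normalizes to z. *)
  assert (normalize (App (Lam ("x", Var "x"), Var "z")) = Var "z");
  (* (fun f =&gt; fun x =&gt; f (f x)) s normalizes to fun x_1 =&gt; s (s x_1). *)
  let two = Lam ("f", Lam ("x", App (Var "f", App (Var "f", Var "x")))) in
  assert (normalize (App (two, Var "s"))
          = Lam ("x_1", App (Var "s", App (Var "s", Var "x_1"))))
</pre>
<p>Notice that no substitution function appears anywhere: applying a <code>Closure</code> is an ordinary OCaml function call.</p>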
<p>We are now at 759 lines of code. We added 90 lines for normalization by evaluation and 51 for keeping track of source code positions.</p>
<h3>Trying something out</h3>
<p>Ok, let us try something fun. How about <a href="http://en.wikipedia.org/wiki/Church_encoding">Church numerals</a>?</p>
<pre class="brush: plain; title: ; notranslate">
Parameter N : Type 0.
Parameter z : N.
Parameter s : N -&gt; N.

Definition numeral := forall A : Type 0, (A -&gt; A) -&gt; (A -&gt; A).

Definition zero := fun (A : Type 0) (f : A -&gt; A) (x : A) =&gt; x.
Definition one := fun (A : Type 0) (f : A -&gt; A) =&gt; f.
Definition two := fun (A : Type 0) (f : A -&gt; A) (x : A) =&gt; f (f x).
Definition three := fun (A : Type 0) (f : A -&gt; A) (x : A) =&gt; f (f (f x)).

Definition plus :=
  fun (m n : numeral) (A : Type 0) (f : A -&gt; A) (x : A) =&gt; m A f (n A f x).

Definition times :=
  fun (m n : numeral) (A : Type 0) (f : A -&gt; A) (x : A) =&gt; m A (n A f) x.

Definition power :=
  fun (m n : numeral) (A : Type 0) =&gt; m (A -&gt; A) (n A).
  
Definition four := plus two two.
Definition five := plus two three.
</pre>
<p>If you put the above code in <code>church.tt</code> you can load it into <code>tt</code> by</p>
<pre class="brush: plain; title: ; notranslate">
./tt.native -l church.tt
</pre>
<p>Dare we compute $2^{16}$? Sure:</p>
<pre class="brush: plain; title: ; notranslate">
# Eval power two (power two four) N s z.
    = s (s (s (s (s (s (s (s (s (s (s (s (s (s (s (s (s (s (s (s (s (s (s (s (s (s (s (s
      (s (s (s (s (s (s (s (s (s (s (... (...)))))))))))))))))))))))))))))))))))))))
    : N
</pre>
<p>It takes a moment, but that is mostly because of pretty-printing (if you evaluate the same expression in non-interactive mode by placing it in a file, you will notice no delay). How about $2^{20}$?</p>
<pre class="brush: plain; title: ; notranslate">
# Eval power two (times four five).
Fatal error: exception Stack_overflow
</pre>
<p>Oh well. We will have to do something smarter (I am open to suggestions), or increase the stack size. Next time we are going to add some more types, and then I would like to focus on how to implement an interactive mode.</p>
]]></content:encoded>
			<wfw:commentRss>http://math.andrej.com/2012/11/11/how-to-implement-dependent-type-theory-ii/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
		<item>
		<title>How to implement dependent type theory I</title>
		<link>http://math.andrej.com/2012/11/08/how-to-implement-dependent-type-theory-i/</link>
		<comments>http://math.andrej.com/2012/11/08/how-to-implement-dependent-type-theory-i/#comments</comments>
		<pubDate>Thu, 08 Nov 2012 05:23:50 +0000</pubDate>
		<dc:creator>Andrej Bauer</dc:creator>
				<category><![CDATA[Homotopy type theory]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Software]]></category>
		<category><![CDATA[Tutorial]]></category>

		<guid isPermaLink="false">http://math.andrej.com/?p=1284</guid>
		<description><![CDATA[<p>I am spending a semester at the Institute for Advanced Study where we have a special year on Univalent foundations. We are doing all sorts of things, among others experimenting with type theories. We have got some real experts here who know type theory and Coq inside out, and much more, and they&#8217;re doing crazy things [...]]]></description>
				<content:encoded><![CDATA[<p>I am spending a semester at the <a href="http://www.ias.edu/">Institute for Advanced Study</a> where we have a special year on <a href="http://www.math.ias.edu/sp/univalent">Univalent foundations</a>. We are doing all sorts of things, among others experimenting with type theories. We <a href="http://en.wikipedia.org/wiki/Per_Martin-Löf">have</a> <a href="http://www.cse.chalmers.se/~coquand/">got</a> <a href="http://pauillac.inria.fr/~herbelin/index-eng.html">some</a> <a href="http://mattam.org">real</a> <a href="http://www.lix.polytechnique.fr/~barras/">experts</a> <a href="http://www.lix.polytechnique.fr/~assia/rech-eng.html">here</a> who know type theory and Coq inside out, and much more, and they&#8217;re doing crazy things to Coq (I will report on them when they are done). In the meantime I have been thinking about how one might implement dependent type theories with undecidable type checking. This is a tricky subject and I am certainly not the first one to think about it. Anyhow, if I want to experiment with type theories, I need a small prototype first. Today I will present a very minimal one, and build on it in future posts.</p>
<p>Make a guess, how many lines of code does it take to implement a dependent type theory with universes, dependent products, a parser, lexer, pretty-printer, and a toplevel which uses line-editing when available?</p>
<p><span id="more-1284"></span>If you ever looked at my <a href="http://andrej.com/plzoo/">Programming languages zoo</a> you know it does not take that many lines of code to implement a toy language. On the other hand, dependent type theory is different from a typical compiler because we cannot meaningfully separate the type checking, compilation, and execution phases.</p>
<p><a href="http://www.cs.cmu.edu/~drl/">Dan Licata</a> pointed me to <a href="http://www.andres-loeh.de/LambdaPi/">A Tutorial Implementation of a Dependently Typed Lambda Calculus</a> by <a href="http://www.andres-loeh.de">Andres Löh</a>, <a href="http://strictlypositive.org">Conor McBride</a>, and <a href="http://www.staff.science.uu.nl/~swier004/">Wouter Swierstra</a>, which is similar to this one. It was a great inspiration to me, and you should have a look at it too, because they do things slightly differently: they use de Bruijn indices, they simplify things by assuming (paradoxically!) that $\mathtt{Type} \in \mathtt{Type}$, and they implement the calculus in <a href="http://www.haskell.org/">Haskell</a>, while we are going to do it in <a href="http://www.ocaml.org/">OCaml</a>.</p>
<h3>A minimal type theory</h3>
<p>I am going to assume you are already familiar with Martin-Löf <a href="http://en.wikipedia.org/wiki/Intuitionistic_type_theory">dependent type theory</a>. We are going to implement:</p>
<ul>
<li>a hierarchy of <a href="http://en.wikipedia.org/wiki/Intuitionistic_type_theory#Universes">universes</a> $\mathtt{Type}_0$, $\mathtt{Type}_1$, $\mathtt{Type}_2$, &#8230;</li>
<li><a href="http://en.wikipedia.org/wiki/Intuitionistic_type_theory#.CE.A0-types">dependent products</a> $\prod_{x : A} B$</li>
<li>functions $\lambda x : A . e$, and</li>
<li>application $e_1 \; e_2$.</li>
</ul>
<p>We are not going to write down the exact inference rules, although that would be a good idea in a serious experiment. Instead, we are going to read them off later by looking at the source code.</p>
<h3>Syntax</h3>
<p>We can directly translate the above to <a href="http://en.wikipedia.org/wiki/Abstract_syntax">abstract syntax</a> of expressions (in the code below think of the type <code>variable</code> as string, we will explain it later):</p>
<pre class="brush: plain; title: ; notranslate">
(** Abstract syntax of expressions. *)
type expr =
  | Var of variable
  | Universe of int
  | Pi of abstraction
  | Lambda of abstraction
  | App of expr * expr

(** An abstraction [(x,t,e)] indicates that [x] of type [t] is bound in [e]. *)
and abstraction = variable * expr * expr
</pre>
<p>We choose a concrete syntax that is similar to that of <a href="http://coq.inria.fr/">Coq</a>:</p>
<ul>
<li>universes are written <code>Type 0</code>, <code>Type 1</code>, <code>Type 2</code>, &#8230;</li>
<li>the dependent product is written <code>forall x : A, B</code>,</li>
<li>a function is written <code>fun x : A =&gt; B</code>,</li>
<li>application is juxtaposition <code>e1 e2</code>.</li>
</ul>
<p>If <code>x</code> does not appear freely in <code>B</code>, then we write <code>A -&gt; B</code> instead of <code>forall x : A, B</code>.</p>
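<p>To make the correspondence concrete, here is a small sketch of how a few pieces of concrete syntax could be written as values of <code>expr</code>. It is only an illustration: <code>variable</code> is simplified to <code>string</code>, the names <code>identity</code>, <code>identity_type</code> and <code>example</code> are made up for this snippet, and I am assuming that application associates to the left, as in Coq.</p>
<pre class="brush: plain; title: ; notranslate">
(* Self-contained copy of the syntax; [variable] is simplified to [string]. *)
type variable = string

type expr =
  | Var of variable
  | Universe of int
  | Pi of abstraction
  | Lambda of abstraction
  | App of expr * expr

and abstraction = variable * expr * expr

(* Concrete syntax: fun A : Type 0 =&gt; fun x : A =&gt; x *)
let identity =
  Lambda (&quot;A&quot;, Universe 0, Lambda (&quot;x&quot;, Var &quot;A&quot;, Var &quot;x&quot;))

(* Concrete syntax: forall A : Type 0, A -&gt; A.
   The arrow is a Pi whose bound variable does not occur in the codomain;
   the name &quot;_&quot; is just a placeholder chosen for this example. *)
let identity_type =
  Pi (&quot;A&quot;, Universe 0, Pi (&quot;_&quot;, Var &quot;A&quot;, Var &quot;A&quot;))

(* Concrete syntax: identity N z, read as (identity N) z under the
   assumption that application associates to the left. *)
let example = App (App (identity, Var &quot;N&quot;), Var &quot;z&quot;)
</pre>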
<p>In this tutorial we are not going to learn how to write a lexer and a parser, but see a comment about it below.</p>
<h3>Substitution</h3>
<p>One way or another we have to deal with substitution. We could try to avoid it by compiling into OCaml functions, or we could use de Bruijn indices. The expert opinion is that <a href="http://en.wikipedia.org/wiki/De_Bruijn_index">de Bruijn indices</a> are the way to go, but I want to keep things as simple as possible for now, so let us just implement <a href="http://en.wikipedia.org/wiki/Lambda_calculus#Substitution">substitution</a>.</p>
<p>Substitution must avoid variable capture. This means that we have to be able to generate new variable names. We can do it by simply generating an infinite sequence of them $x_1, x_2, x_3, \ldots$ But what if the user already used $x_3$? Then we should not reuse it. To solve the problem we define a datatype of variable names like this:</p>
<pre class="brush: plain; title: ; notranslate">
type variable =
 | String of string
 | Gensym of string * int
 | Dummy
</pre>
<p>When the user types <code>x3</code> we represent this as <code>String "x3"</code>, whereas we generate variables of the form <code>Gensym("x",3)</code>. We make sure that the integer is unique, so the variable is fresh. The string is a hint to the pretty-printer, which should try to print the generated variable as the string, if possible. For example, suppose we have a $\lambda$-abstraction</p>
<pre class="brush: plain; title: ; notranslate">
Lambda (String &quot;x&quot;, ..., ...)
</pre>
<p>and because of substitutions we refreshed the variable to something like</p>
<pre class="brush: plain; title: ; notranslate">
Lambda (Gensym(&quot;x&quot;, 4124), ..., ...)
</pre>
<p>It would be silly to print this as <code>fun x4124 : ... =&gt; ...</code> if we could first rename the bound variable back to <code>x</code>, or to <code>x1</code> if <code>x</code> is taken already. This is exactly what the pretty printer will do.</p>
<p>The <code>Dummy</code> variable is one that is never used. It only appears for the purposes of pretty printing.</p>
<p>Here is the substitution code:</p>
<pre class="brush: plain; title: ; notranslate">
(** [refresh x] generates a fresh variable name whose preferred form is [x]. *)
let refresh =
  let k = ref 0 in
    function
      | String x | Gensym (x, _) -&gt; (incr k ; Gensym (x, !k))
      | Dummy -&gt; (incr k ; Gensym (&quot;_&quot;, !k))

(** [subst [(x1,e1); ...; (xn,en)] e] performs the given substitution of
    expressions [e1], ..., [en] for variables [x1], ..., [xn] in expression [e]. *)
let rec subst s = function
  | Var x -&gt; (try List.assoc x s with Not_found -&gt; Var x)
  | Universe k -&gt; Universe k
  | Pi a -&gt; Pi (subst_abstraction s a)
  | Lambda a -&gt; Lambda (subst_abstraction s a)
  | App (e1, e2) -&gt; App (subst s e1, subst s e2)

and subst_abstraction s (x, t, e) =
  let x' = refresh x in
    (x', subst s t, subst ((x, Var x') :: s) e)
</pre>
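<p>To see why the bound variable gets refreshed, here is a self-contained sketch which repeats the definitions above and substitutes <code>x</code> for <code>y</code> in <code>fun x : A =&gt; y</code>. The names <code>e</code>, <code>e'</code> and <code>capture_avoided</code> are made up for the illustration.</p>
<pre class="brush: plain; title: ; notranslate">
(* Definitions repeated from above so that the snippet runs on its own. *)
type variable =
  | String of string
  | Gensym of string * int
  | Dummy

type expr =
  | Var of variable
  | Universe of int
  | Pi of abstraction
  | Lambda of abstraction
  | App of expr * expr

and abstraction = variable * expr * expr

let refresh =
  let k = ref 0 in
    function
      | String x | Gensym (x, _) -&gt; (incr k ; Gensym (x, !k))
      | Dummy -&gt; (incr k ; Gensym (&quot;_&quot;, !k))

let rec subst s = function
  | Var x -&gt; (try List.assoc x s with Not_found -&gt; Var x)
  | Universe k -&gt; Universe k
  | Pi a -&gt; Pi (subst_abstraction s a)
  | Lambda a -&gt; Lambda (subst_abstraction s a)
  | App (e1, e2) -&gt; App (subst s e1, subst s e2)

and subst_abstraction s (x, t, e) =
  let x' = refresh x in
    (x', subst s t, subst ((x, Var x') :: s) e)

(* The expression fun x : A =&gt; y, in which y is free. *)
let e = Lambda (String &quot;x&quot;, Var (String &quot;A&quot;), Var (String &quot;y&quot;))

(* Substitute the free variable x for y. *)
let e' = subst [(String &quot;y&quot;, Var (String &quot;x&quot;))] e

(* The binder has been renamed to a generated variable, so the substituted
   x in the body still refers to the free variable String &quot;x&quot;. *)
let capture_avoided =
  match e' with
    | Lambda (Gensym (&quot;x&quot;, _), _, Var (String &quot;x&quot;)) -&gt; true
    | _ -&gt; false
</pre>
<p>A naive substitution would have produced <code>fun x : A =&gt; x</code>, in which the substituted <code>x</code> is captured by the binder and the function silently turns into the identity.</p>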
<h3>Type inference</h3>
<p>Our calculus is such that an expression has at most one type, and when it does the type can be inferred from the expression. Therefore, we are going to implement type inference. During inference we need to carry around a context which maps variables to their types. And since we will allow global definitions on the toplevel, the context should also store (optional) definitions. So we define contexts to be <a href="http://en.wikipedia.org/wiki/Association_list">association lists</a>.</p>
<pre class="brush: plain; title: ; notranslate">
type context = (Syntax.variable * (Syntax.expr * Syntax.expr option)) list
</pre>
<p>We need functions that look up types and values of variables, and one for extending a context with a new variable:</p>
<pre class="brush: plain; title: ; notranslate">
(** [lookup_ty x ctx] returns the type of [x] in context [ctx]. *)
let lookup_ty x ctx = fst (List.assoc x ctx)

(** [lookup_value x ctx] returns the value of [x] in context [ctx], or [None]
    if [x] has no assigned value. *)
let lookup_value x ctx = snd (List.assoc x ctx)

(** [extend x t ctx] returns [ctx] extended with variable [x] of type [t],
    whereas [extend x t ~value:e ctx] returns [ctx] extended with variable [x]
    of type [t] and assigned value [e]. *)
let extend x t ?value ctx = (x, (t, value)) :: ctx
</pre>
<p>Notice that extending with a variable which already appears in the context shadows the old variable, as it should.</p>
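<p>Here is a small sketch of shadowing at work. To keep it self-contained I represent both variables and expressions by strings, which is a simplification made only for this example; the three functions are the ones from above.</p>
<pre class="brush: plain; title: ; notranslate">
(* Simplification for this example only: variables and expressions are strings. *)
type variable = string
type expr = string
type context = (variable * (expr * expr option)) list

let lookup_ty x ctx = fst (List.assoc x ctx)
let lookup_value x ctx = snd (List.assoc x ctx)
let extend x t ?value ctx = (x, (t, value)) :: ctx

let ctx : context = extend &quot;N&quot; &quot;Type 0&quot; []
let ctx = extend &quot;z&quot; &quot;N&quot; ctx

(* A definition: x has type N and value z. *)
let ctx = extend &quot;x&quot; &quot;N&quot; ~value:&quot;z&quot; ctx

(* Shadow x with a parameter of another type and without a value. *)
let ctx' = extend &quot;x&quot; &quot;N -&gt; N&quot; ctx

(* List.assoc returns the first binding it finds, so the newest x wins. *)
let ty_new = lookup_ty &quot;x&quot; ctx'        (* &quot;N -&gt; N&quot; *)
let value_new = lookup_value &quot;x&quot; ctx'  (* None *)
let value_old = lookup_value &quot;x&quot; ctx   (* Some &quot;z&quot; *)
</pre>
<p>The old binding is still in the list, it just cannot be reached any more through <code>ctx'</code>.</p>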
<p>We said we would read off the typing rules from the source code:</p>
<pre class="brush: plain; title: ; notranslate">
(** [infer_type ctx e] infers the type of expression [e] in context [ctx].  *)
let rec infer_type ctx = function
  | Var x -&gt;
    (try lookup_ty x ctx
     with Not_found -&gt; Error.typing &quot;unknown identifier %t&quot; (Print.variable x))
  | Universe k -&gt; Universe (k + 1)
  | Pi (x, t1, t2) -&gt;
    let k1 = infer_universe ctx t1 in
    let k2 = infer_universe (extend x t1 ctx) t2 in
      Universe (max k1 k2)
  | Lambda (x, t, e) -&gt;
    let _ = infer_universe ctx t in
    let te = infer_type (extend x t ctx) e in
      Pi (x, t, te)
  | App (e1, e2) -&gt;
    let (x, s, t) = infer_pi ctx e1 in
    let te = infer_type ctx e2 in
      check_equal ctx s te ;
      subst [(x, e2)] t
</pre>
<p>Ok, here we go:</p>
<ol>
<li>The type of a variable is looked up in the context.</li>
<li>The type of $\mathtt{Type}_k$ is $\mathtt{Type}_{k+1}$.</li>
<li>The type of $\prod_{x : T_1} T_2$ is $\mathtt{Type}_{\max(k, m)}$ where $T_1$ has type $\mathtt{Type}_k$ and $T_2$ has type $\mathtt{Type}_m$ in the context extended with $x : T_1$.</li>
<li>The type of $\lambda x : T \; . \; e$ is $\prod_{x : T} T&#8217;$ where $T&#8217;$ is the type of $e$ in the context extended with $x : T$.</li>
<li>The type of $e_1 \; e_2$ is $T[x/e_2]$ where $e_1$ has type $\prod_{x : S} T$ and $e_2$ has type $S$.</li>
</ol>
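<p>To see the rules in action, take the polymorphic identity $\lambda A : \mathtt{Type}_0 \; . \; \lambda x : A \; . \; x$. Rule 4 applied twice, with rule 1 for the body $x$, gives it the type $\prod_{A : \mathtt{Type}_0} \prod_{x : A} A$. That type is in turn typed by rule 3: the domain $\mathtt{Type}_0$ has type $\mathtt{Type}_1$ by rule 2, and the codomain $\prod_{x : A} A$ has type $\mathtt{Type}_{\max(0, 0)} = \mathtt{Type}_0$ in the context extended with $A : \mathtt{Type}_0$, so the whole product has type $\mathtt{Type}_{\max(1, 0)} = \mathtt{Type}_1$.</p>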
<p>The typing rules refer to auxiliary functions <code>infer_universe</code>, <code>infer_pi</code>, and <code>check_equal</code>, which we have not defined yet. The function <code>infer_universe</code> infers the type of an expression, makes sure that the type is of the form $\mathtt{Type}_k$, and returns $k$. A common mistake is to think that you can implement it like this:</p>
<pre class="brush: plain; title: ; notranslate">
(** Why is this infer_universe wrong? *)
and bad_infer_universe ctx t =
    match infer_type ctx t with
      | Universe k -&gt; k
      | App _ | Var _ | Pi _ | Lambda _ -&gt; Error.typing &quot;type expected&quot;
</pre>
<p>This will not do. For example, what if <code>infer_type ctx t</code> returns the type $(\lambda x : \mathtt{Type}_{4} \; . \; x) \; \mathtt{Type}_3$? Then <code>bad_infer_universe</code> will complain, because the expression it got is an application and not syntactically a universe, even though it is equal to $\mathtt{Type}_3$. We need to insert a normalization procedure which converts the type to a form from which we can read off its shape:</p>
<pre class="brush: plain; title: ; notranslate">
(** [infer_universe ctx t] infers the universe level of type [t] in context [ctx]. *)
and infer_universe ctx t =
  let u = infer_type ctx t in
    match normalize ctx u with
      | Universe k -&gt; k
      | App _ | Var _ | Pi _ | Lambda _ -&gt; Error.typing &quot;type expected&quot;
</pre>
<p>We shall implement normalization in a moment, but first we write down the other two auxiliary functions:</p>
<pre class="brush: plain; title: ; notranslate">
(** [infer_pi ctx e] infers the type of [e] in context [ctx], verifies that it is
    of the form [Pi (x, t1, t2)] and returns the triple [(x, t1, t2)]. *)
and infer_pi ctx e =
  let t = infer_type ctx e in
    match normalize ctx t with
      | Pi a -&gt; a
      | Var _ | App _ | Universe _ | Lambda _ -&gt; Error.typing &quot;function expected&quot;

(** [check_equal ctx e1 e2] checks that expressions [e1] and [e2] are equal. *)
and check_equal ctx e1 e2 =
  if not (equal ctx e1 e2)
  then Error.typing &quot;expressions %t and %t are not equal&quot; (Print.expr e1) (Print.expr e2)
</pre>
<h3>Normalization and equality</h3>
<p>We need a function <code>normalize</code> which takes an expression and &#8220;computes&#8221; it, so that we can tell when something is a universe, and when something is a function. There are several strategies on how we might do this, and any will do as long as we have the following property: if $e_1$ and $e_2$ are equal (type theorists say that they are <em>judgmentally equal</em>, or sometimes that they are <em>definitionally equal</em>) then after normalization they should become syntactically equal, up to renaming of bound variables.</p>
<p>Our judgmental equality essentially has just two simple rules, <a href="http://en.wikipedia.org/wiki/Beta_reduction#Reduction">$\beta$-reduction</a> and unfolding of definitions. So this is what the normalization procedure does:</p>
<pre class="brush: plain; title: ; notranslate">
(** [normalize ctx e] normalizes the given expression [e] in context [ctx]. It removes
    all redexes and it unfolds all definitions. It performs normalization under binders.  *)
let rec normalize ctx = function
  | Var x -&gt;
    (match
        (try lookup_value x ctx
         with Not_found -&gt; Error.runtime &quot;unknown identifier %t&quot; (Print.variable x))
     with
       | None -&gt; Var x
       | Some e -&gt; normalize ctx e)
  | App (e1, e2) -&gt;
    let e2 = normalize ctx e2 in
      (match normalize ctx e1 with
        | Lambda (x, _, e1') -&gt; normalize ctx (subst [(x,e2)] e1')
        | e1 -&gt; App (e1, e2))
  | Universe k -&gt; Universe k
  | Pi a -&gt; Pi (normalize_abstraction ctx a)
  | Lambda a -&gt; Lambda (normalize_abstraction ctx a)

and normalize_abstraction ctx (x, t, e) =
  let t = normalize ctx t in
    (x, t, normalize (extend x t ctx) e)
</pre>
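<p>Normalization can be tried out on its own. The sketch below repeats the definitions from above, with the one change that an unknown identifier raises <code>Failure</code> instead of calling <code>Error.runtime</code>. The helpers <code>v</code>, <code>arrow</code> and <code>count_s</code> are made up for the example. It normalizes the troublesome type from the previous section and the expression <code>three (three s) z</code>.</p>
<pre class="brush: plain; title: ; notranslate">
(* Definitions repeated from above so that the snippet runs on its own. *)
type variable = String of string | Gensym of string * int | Dummy

type expr =
  | Var of variable
  | Universe of int
  | Pi of abstraction
  | Lambda of abstraction
  | App of expr * expr
and abstraction = variable * expr * expr

type context = (variable * (expr * expr option)) list

let refresh =
  let k = ref 0 in
    function
      | String x | Gensym (x, _) -&gt; (incr k ; Gensym (x, !k))
      | Dummy -&gt; (incr k ; Gensym (&quot;_&quot;, !k))

let rec subst s = function
  | Var x -&gt; (try List.assoc x s with Not_found -&gt; Var x)
  | Universe k -&gt; Universe k
  | Pi a -&gt; Pi (subst_abstraction s a)
  | Lambda a -&gt; Lambda (subst_abstraction s a)
  | App (e1, e2) -&gt; App (subst s e1, subst s e2)
and subst_abstraction s (x, t, e) =
  let x' = refresh x in
    (x', subst s t, subst ((x, Var x') :: s) e)

let lookup_value x ctx = snd (List.assoc x ctx)
let extend x t ?value ctx = (x, (t, value)) :: ctx

(* As above, except that an unknown identifier raises [Failure]. *)
let rec normalize ctx = function
  | Var x -&gt;
    (match
        (try lookup_value x ctx
         with Not_found -&gt; failwith &quot;unknown identifier&quot;)
     with
       | None -&gt; Var x
       | Some e -&gt; normalize ctx e)
  | App (e1, e2) -&gt;
    let e2 = normalize ctx e2 in
      (match normalize ctx e1 with
        | Lambda (x, _, e1') -&gt; normalize ctx (subst [(x,e2)] e1')
        | e1 -&gt; App (e1, e2))
  | Universe k -&gt; Universe k
  | Pi a -&gt; Pi (normalize_abstraction ctx a)
  | Lambda a -&gt; Lambda (normalize_abstraction ctx a)
and normalize_abstraction ctx (x, t, e) =
  let t = normalize ctx t in
    (x, t, normalize (extend x t ctx) e)

(* Hypothetical helpers for building the examples. *)
let v x = Var (String x)
let arrow a b = Pi (Dummy, a, b)
let n = v &quot;N&quot;

(* three = fun f : N -&gt; N =&gt; fun x : N =&gt; f (f (f x)) *)
let three =
  Lambda (String &quot;f&quot;, arrow n n,
    Lambda (String &quot;x&quot;, n, App (v &quot;f&quot;, App (v &quot;f&quot;, App (v &quot;f&quot;, v &quot;x&quot;)))))

(* Parameters N, z, s and the definition of three. *)
let ctx : context = extend (String &quot;N&quot;) (Universe 0) []
let ctx = extend (String &quot;z&quot;) n ctx
let ctx = extend (String &quot;s&quot;) (arrow n n) ctx
let ctx = extend (String &quot;three&quot;) (arrow (arrow n n) (arrow n n)) ~value:three ctx

(* (fun x : Type 4 =&gt; x) (Type 3) normalizes to Type 3. *)
let u = normalize ctx (App (Lambda (String &quot;x&quot;, Universe 4, v &quot;x&quot;), Universe 3))

(* three (three s) z normalizes to s applied nine times to z. *)
let nine = normalize ctx (App (App (v &quot;three&quot;, App (v &quot;three&quot;, v &quot;s&quot;)), v &quot;z&quot;))

(* [count_s e] counts the applications of s in a numeral s (s (... z)). *)
let rec count_s = function
  | Var (String &quot;z&quot;) -&gt; 0
  | App (Var (String &quot;s&quot;), e) -&gt; 1 + count_s e
  | _ -&gt; failwith &quot;not a numeral&quot;
</pre>
<p>Here <code>u</code> is <code>Universe 3</code> and <code>count_s nine</code> is <code>9</code>. Notice that <code>three</code> is unfolded every time it is encountered, because definitions are unfolded eagerly.</p>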
<p>How about testing for equality of expressions, which was needed in the rule for application? We normalize, then compare for syntactic equality. We make sure that in comparison of abstractions both bound variables are the same:</p>
<pre class="brush: plain; title: ; notranslate">
(** [equal ctx e1 e2] determines whether normalized [e1] and [e2] are equal up to renaming
    of bound variables. *)
let equal ctx e1 e2 =
  let rec equal e1 e2 =
    match e1, e2 with
      | Var x1, Var x2 -&gt; x1 = x2
      | App (e11, e12), App (e21, e22) -&gt; equal e11 e21 &amp;&amp; equal e12 e22
      | Universe k1, Universe k2 -&gt; k1 = k2
      | Pi a1, Pi a2 -&gt; equal_abstraction a1 a2
      | Lambda a1, Lambda a2 -&gt; equal_abstraction a1 a2
      | (Var _ | App _ | Universe _ | Pi _ | Lambda _), _ -&gt; false
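  and equal_abstraction (x, t1, e1) (y, t2, e2) =
    (* Rename both bound variables to a common fresh one, so that a free
       occurrence of [x] in [e2] is not confused with the bound [x]. *)
    let z = Var (refresh x) in
      equal t1 t2 &amp;&amp; (equal (subst [(x, z)] e1) (subst [(y, z)] e2))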
  in
    equal (normalize ctx e1) (normalize ctx e2)
</pre>
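<p>The comparison step can also be tried in isolation. The following sketch assumes that its inputs are already normalized, so it leaves out the call to <code>normalize</code>. It compares two abstractions by renaming both bound variables to a common fresh variable, which is one way of making sure that they are the same. The name <code>alpha_equal</code> and the test expressions are made up for the example.</p>
<pre class="brush: plain; title: ; notranslate">
(* Definitions repeated from above so that the snippet runs on its own. *)
type variable = String of string | Gensym of string * int | Dummy

type expr =
  | Var of variable
  | Universe of int
  | Pi of abstraction
  | Lambda of abstraction
  | App of expr * expr
and abstraction = variable * expr * expr

let refresh =
  let k = ref 0 in
    function
      | String x | Gensym (x, _) -&gt; (incr k ; Gensym (x, !k))
      | Dummy -&gt; (incr k ; Gensym (&quot;_&quot;, !k))

let rec subst s = function
  | Var x -&gt; (try List.assoc x s with Not_found -&gt; Var x)
  | Universe k -&gt; Universe k
  | Pi a -&gt; Pi (subst_abstraction s a)
  | Lambda a -&gt; Lambda (subst_abstraction s a)
  | App (e1, e2) -&gt; App (subst s e1, subst s e2)
and subst_abstraction s (x, t, e) =
  let x' = refresh x in
    (x', subst s t, subst ((x, Var x') :: s) e)

(* [alpha_equal e1 e2] compares two expressions, assumed to be already
   normalized, up to renaming of bound variables. *)
let rec alpha_equal e1 e2 =
  match e1, e2 with
    | Var x1, Var x2 -&gt; x1 = x2
    | App (e11, e12), App (e21, e22) -&gt; alpha_equal e11 e21 &amp;&amp; alpha_equal e12 e22
    | Universe k1, Universe k2 -&gt; k1 = k2
    | Pi a1, Pi a2 -&gt; equal_abstraction a1 a2
    | Lambda a1, Lambda a2 -&gt; equal_abstraction a1 a2
    | (Var _ | App _ | Universe _ | Pi _ | Lambda _), _ -&gt; false
and equal_abstraction (x, t1, e1) (y, t2, e2) =
  (* Both bodies are compared under the same fresh bound variable. *)
  let z = Var (refresh x) in
    alpha_equal t1 t2 &amp;&amp; alpha_equal (subst [(x, z)] e1) (subst [(y, z)] e2)

let n = Var (String &quot;N&quot;)

(* fun x : N =&gt; x and fun y : N =&gt; y differ only in the bound name. *)
let same =
  alpha_equal
    (Lambda (String &quot;x&quot;, n, Var (String &quot;x&quot;)))
    (Lambda (String &quot;y&quot;, n, Var (String &quot;y&quot;)))

(* fun x : N =&gt; x and fun y : N =&gt; x are different: in the second
   expression x is a free variable. *)
let different =
  alpha_equal
    (Lambda (String &quot;x&quot;, n, Var (String &quot;x&quot;)))
    (Lambda (String &quot;y&quot;, n, Var (String &quot;x&quot;)))
</pre>
<p>Here <code>same</code> is <code>true</code> and <code>different</code> is <code>false</code>.</p>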
<p>And that is it! We have the core of the system written down. The rest is just <a href="http://www.jargon.net/jargonfile/c/chrome.html">chrome</a>: lexer, parser, toplevel, error reporting, pretty printer. Those are the things nobody ever explains because they are boring as soon as you have managed to implement them once. Nevertheless, let us have a brief look. The code is accessible at the Github project <a href="https://github.com/andrejbauer/tt/tree/blog-part-I"><code>andrejbauer/tt</code></a> (the branch <code>blog-part-I</code>).</p>
<h3>The infrastructure</h3>
<h4>Parser: <a href="https://github.com/andrejbauer/tt/blob/blog-part-I/parser.mly"><code>parser.mly</code></a> and <a href="https://github.com/andrejbauer/tt/blob/blog-part-I/lexer.mll"><code>lexer.mll</code></a></h4>
<p>You never ever want to write a parser with your bare hands. Instead, you should use a <a href="http://en.wikipedia.org/wiki/Compiler-compiler">parser generator</a>. There are many; I used <a href="http://gallium.inria.fr/~fpottier/menhir/">menhir</a>. Parser generators can be a bit scary, but a good way to get started is to take someone else&#8217;s parser and fiddle with it.</p>
<h4>Pretty printer: <a href="https://github.com/andrejbauer/tt/blob/blog-part-I/print.ml"><code>print.ml</code></a> and <a href="https://github.com/andrejbauer/tt/blob/blog-part-I/beautify.ml"><code>beautify.ml</code></a></h4>
<p>Pretty printing is the opposite of parsing. In a usual programming language we do not have to print expressions with bound variables (because they get converted to non-printable closures), but here we do. It is worthwhile renaming the bound variables before printing them out, which is what <code>Beautify.beautify</code> does.</p>
<h4>Error reporting: <a href="https://github.com/andrejbauer/tt/blob/blog-part-I/error.ml"><code>error.ml</code></a></h4>
<p>Not surprisingly, errors are reported with exceptions. The only thing to note here is that it should be possible to do pretty printing in error messages, otherwise you will be tempted to produce uninformative error messages. Our implementation of course does that.</p>
<h4>Toplevel: <a href="https://github.com/andrejbauer/tt/blob/blog-part-I/tt.ml"><code>tt.ml</code></a></h4>
<p>The toplevel does nothing surprising. After parsing the command-line arguments and loading files, it enters an interactive toplevel loop. One neat trick that the toplevel does is that it looks for <a href="http://utopia.knoware.nl/~hlub/rlwrap/#rlwrap">rlwrap</a> or <a href="http://pauillac.inria.fr/~ddr/ledit/">ledit</a> and wraps itself with it. This gives us line-editing capabilities for free.</p>
<p>The toplevel commands are:</p>
<ul>
<li><code>Help.</code> print a description of toplevel commands.</li>
<li><code>Context.</code> print current context.</li>
<li><code>Parameter <i>x</i> : <i>t</i>.</code> assume that variable <code><i>x</i></code> has type <code><i>t</i></code>.</li>
<li><code>Definition <i>x</i> := <i>e</i>.</code> define <code><i>x</i></code> to be <code><i>e</i></code>.</li>
<li><code>Check <i>e</i>.</code> infer the type of <code><i>e</i></code>.</li>
<li><code>Eval <i>e</i>.</code> normalize <code><i>e</i></code>.</li>
</ul>
<p>Here is a sample session:</p>
<pre class="brush: plain; title: ; notranslate">
tt blog-part-I
[Type Ctrl-D to exit or &quot;Help.&quot; for help.]
# Parameter N : Type 0.
N is assumed
# Parameter z : N. Parameter s : N -&gt; N.
z is assumed
s is assumed
# Definition three := fun f : N -&gt; N =&gt; fun x : N =&gt; f (f (f x)).
three is defined
# Context.
three = fun f : N -&gt; N =&gt; fun x : N =&gt; f (f (f x))
    : (N -&gt; N) -&gt; N -&gt; N
s : N -&gt; N
z : N
N : Type 0
# Check (three (three s)).
three (three s)
    : N -&gt; N
# Eval (three (three s)) z.
    = s (s (s (s (s (s (s (s (s z))))))))
    : N
</pre>
<h3>Where to go from here?</h3>
<p>The whole program is 618 lines of code, and only 312 if we discount empty lines and comments. The core is just 92 lines, the rest is infrastructure. Not too bad. There are many ways in which we can improve <code>tt</code>, such as:</p>
<ol>
<li>Improve efficiency by implementing de Bruijn indices or some other mechanism that avoids substitutions.</li>
<li>Improve the normalization procedure so that it unfolds definitions on demand, rather than eagerly.</li>
<li>Improve the parser so that it accepts more flexible syntax.</li>
<li>Improve type inference so that not all bound variables have to be explicitly typed.</li>
<li>Add basic datatypes <code>unit</code>, <code>bool</code> and <code>nat</code>.</li>
<li>Add simple products, coproducts and dependent sums.</li>
<li>Implement a cumulative hierarchy so that $\mathtt{Type}_k$ is a subtype of $\mathtt{Type}_{k+1}$.</li>
<li>Add inductive datatypes.</li>
<li>Add a stronger judgmental equality, for example $\eta$-reduction.</li>
<li>Implement tactics and an interactive proof mode.</li>
<li>Rule the world.</li>
</ol>
<p>I may do some of these in subsequent blog posts, if there is interest. Or if you do it, make a pull request on GitHub and write a guest blog post!</p>
]]></content:encoded>
			<wfw:commentRss>http://math.andrej.com/2012/11/08/how-to-implement-dependent-type-theory-i/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Am I a constructive mathematician?</title>
		<link>http://math.andrej.com/2012/10/03/am-i-a-constructive-mathematician/</link>
		<comments>http://math.andrej.com/2012/10/03/am-i-a-constructive-mathematician/#comments</comments>
		<pubDate>Wed, 03 Oct 2012 04:54:07 +0000</pubDate>
		<dc:creator>Andrej Bauer</dc:creator>
				<category><![CDATA[Constructive math]]></category>
		<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://math.andrej.com/?p=1247</guid>
		<description><![CDATA[<p>It seems to me that people think I am a constructive mathematician, or worse a constructivist (a word which carries a certain amount of philosophical stigma). Let me be perfectly clear: it is not decidable whether I am a constructive mathematician.

But seriously, if anything, you may call me a mathematical relativist: there are many worlds of [...]]]></description>
				<content:encoded><![CDATA[<p>It seems to me that people think I am a constructive mathematician, or worse a constructivist (a word which carries a certain amount of philosophical stigma). Let me be perfectly clear: it is not decidable whether I am a constructive mathematician.<br />
<span id="more-1247"></span><br />
But seriously, if anything, you may call me a <strong>mathematical relativist: there are many worlds of mathematics, and the view of the worlds is relative to which one I am in.</strong> Any attempt to bring mathematics within the scope of a single foundation necessarily limits mathematics in unacceptable ways. A mathematician who sticks to just one mathematical world (probably because of his education) is a bit like a geometer who only knows Euclidean geometry. This holds equally well for classical mathematicians, who are not willing to give up their precious law of excluded middle,  and for Bishop-style mathematicians,  who pursue the noble cause of not opposing anyone.</p>
<p>What could be more appealing to a mathematician than the idea that there is not one, but many, infinitely many worlds of mathematics? Would he not want to visit them all, understand how they are related, and see what happens to his favorite subject as he moves between them?</p>
<p>Let us consider an example. The real numbers are a mathematical object of fundamental importance, and have many aspects:</p>
<ol>
<li>The reals as a set are uncountable and in bijection with the powerset of natural numbers.</li>
<li>The reals as an algebraic structure form a linearly ordered field.</li>
<li>The reals as a space are locally compact, Hausdorff, and connected.</li>
<li>The reals are a measurable space on which measure theory rests.</li>
<li>The reals of non-standard analysis contain infinitesimals.</li>
<li>The reals as understood by Leibniz contain <em>nilpotent</em> infinitesimals.</li>
<li>The reals as Brouwerian continuum cannot be decomposed into two disjoint inhabited subsets.</li>
<li>The reals are overt.</li>
</ol>
<p>We can have some of these properties but not all at once. History has chosen for us a combination that is taught today as a dogma. Any attempt to deviate from it is met with opposition. Thus you probably consider 1, 2, 3, and 4 as true, 5 as something exotic you heard of, 6 as Leibniz&#8217;s biggest mistake, 7 as intuitionistic hallucination (because obviously the reals can be decomposed into the non-negative and negative numbers), and 8 as something you never heard of (but you should have because it is the concept dual to compactness and you have been using it all your life).</p>
<p>Once we break free from Cantor&#8217;s paradise that Hilbert threw us in we discover unsuspected possibilities:</p>
<ol>
<li>It may happen that the reals are in 1-1 correspondence with a subset of the natural numbers, while at the same time they form an uncountable set.</li>
<li>It may happen that the reals form a proper class.</li>
<li>It may happen that every real number has a Turing machine computing its digits.</li>
<li>It may happen that the reals are not linearly ordered.</li>
<li>It may happen that the reals are locally non-compact, in the sense that every interval contains a sequence without an accumulation point.</li>
<li>It may happen that every subset of the reals is measurable.</li>
<li>It may happen that the reals can be covered by a sequence of intervals whose cumulative length does not exceed $1$.</li>
<li>It may happen that the reals contain nilpotent infinitesimals, which validate the 17th century calculations that physicists still use because, luckily, they did not subscribe entirely to the $\epsilon\delta$-dogma of analysis.</li>
<li>It may happen that every real function is continuous, and consequently the reals are not decomposable into two disjoint inhabited subsets.</li>
<li>It may happen that the reals are <em>not</em> overt, whatever that means.</li>
</ol>
<p>It should be admitted that some of the possibilities are rather bizarre. For example, I do not know what good it is to have the reals as a subset of the naturals, but I am sure somebody could think of something. But why should a measure theorist ignore a world of mathematics in which every subset is measurable, or a computer scientist one in which all reals are computable, or a topologist one in which all functions are continuous, or an analyst one in which all functions are smooth?</p>
<p>I am <em>not</em> proposing that mathematics should be compartmentalized so that each branch sits in its world of mathematics, incompatible with others. That would be a grave mistake indeed. In fact, the unification of mathematics under the umbrella of classical set theory has been immensely successful precisely because it allowed mathematicians to discover deep and unsuspected connections between different branches of mathematics. We have learnt to look for connections between branches of mathematics, and now we must also learn to look for connections that span worlds of mathematics.</p>
<p><strong>We cannot ignore the many worlds of mathematics</strong>. Therefore, mathematics must become applicable in a wide variety of worlds. Mathematicians have to be educated so that they develop multiple mathematical intuitions that help them <em>feel</em> how the worlds of mathematics behave.</p>
<p>I have so far not given you any technical definition of a mathematical world. Such a definition may be useful for showing meta-theorems, but I think it can never be exhaustive. A world of mathematics may be a forcing extension of set theory, or a topos, or a pretopos, or a model of type theory, or any other structure within which it is possible to interpret the basic language of mathematics.</p>
<p>At the moment I am visiting the <a href="http://www.ias.edu">Institute for Advanced Study</a> as a member of the <a href="http://video.ias.edu/sites/video.ias.edu/files/webfm/2007/Edward%20T.%20Cone/2010/Univalent%20Foundations/Voevodsky.hi.mp4">Univalent Foundations</a> group. We are building a new foundation of mathematics whose language is type theory rather than set theory, and whose primary objects are homotopy types and not just bare sets. Do I think this is an exciting new development? Certainly! Will the Univalent foundations disrupt the monopoly of Set-theoretic foundations? I certainly hope so! Will it become the new monopoly? It must not!</p>
]]></content:encoded>
			<wfw:commentRss>http://math.andrej.com/2012/10/03/am-i-a-constructive-mathematician/feed/</wfw:commentRss>
		<slash:comments>42</slash:comments>
<enclosure url="http://video.ias.edu/sites/video.ias.edu/files/webfm/2007/Edward%20T.%20Cone/2010/Univalent%20Foundations/Voevodsky.hi.mp4" length="462310232" type="video/mp4" />
		</item>
	</channel>
</rss>
