Before jumping straight into implementing datatypes via descriptions, it is convenient to be able to model some of them in Agda (as done in The Gentle Art of Levitation by Chapman et al. and subsequent papers). Everything in the surface language, such as datatype declarations and pattern matching, is elaborated to terms of the core type theory. It is desirable for the ingredients of the core type theory to be as similar to their higher-level counterparts as possible; otherwise, work on the canonical terms (such as embedding back upwards, generic programming on core terms, and compilation of core terms) becomes complex.
My desired features for a bare-bones version 1 of Spire include:
- implicit arguments
- elaboration of datatype declarations to descriptions
- elaboration of pattern matching syntax to eliminators
- later superseded by elaboration to the “by” gadget from The view from the left by McBride and McKinna
- sugar for performing generic programming over descriptions
- formal metatheory for the canonical type theory
- universe levels
Rather than implementing all of this at once, I am adding features gradually. However, it is still a good idea to keep future features in mind when implementing something, and the Agda model helps with that. The process goes like this:
- Write down the high-level notation that Agda supports natively (or in a comment, for features that Agda does not support)
- Manually perform the elaboration procedures on paper
- Typecheck and study the resulting core type theory terms
Elaboration of a high level term can involve many steps that are individually easy to follow, but produce a complex final term, and it is worth considering alternative core type theory constructs to produce simpler final terms. These sorts of before-and-after pictures, and most concepts in this post, can be found in Pierre Dagand’s thesis.
Note
All of the code from this post can be found in Spire. Additionally, each code snippet contains a link to the specific file in the top right corner.
Pattern Matching
When first implementing the Description technology, it will be
convenient to have a sufficiently complex example to typecheck. The
following standard sequence of types and functions suits this goal.
(Code listing not reproduced in this extraction; see the linked file in the Spire repository.)
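The original listing is not reproduced here, but based on the surrounding prose it is presumably along these lines (a sketch; the exact spelling of names in the Spire repository may differ):

```agda
data ℕ : Set where
  zero : ℕ
  suc  : ℕ → ℕ

data Vec (A : Set) : ℕ → Set where
  nil  : Vec A zero
  cons : (n : ℕ) → A → Vec A n → Vec A (suc n)

add : ℕ → ℕ → ℕ
add zero    n = n
add (suc m) n = suc (add m n)

mult : ℕ → ℕ → ℕ
mult zero    n = zero
mult (suc m) n = add n (mult m n)

append : (A : Set) (m n : ℕ) → Vec A m → Vec A n → Vec A (add m n)
append A zero    n nil            ys = ys
append A (suc m) n (cons .m x xs) ys = cons (add m n) x (append A m n xs ys)

concat : (A : Set) (m n : ℕ) → Vec (Vec A m) n → Vec A (mult n m)
concat A m zero    nil              = nil
concat A m (suc n) (cons .n xs xss) = append A m (mult n m) xs (concat A m n xss)
```

Note that every match is on a variable index (m or n), never on an index built from constructors, which is exactly the restriction discussed below.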
This sequence of functions has the nice property that each function
builds upon the previous ones (either in its type, value, or both),
ending in the definition of concat
. Furthermore, concat has
a moderately complicated dependent type, but only eliminates type
families applied to a sequence of variables. Eliminating type families
applied to expressions built from constructors requires more clever motive synthesis
(via Eliminating Dependent Pattern Matching
by Goguen et al.) that I would like to ignore for this first pass.
Eliminators
Translating these basic functions into eliminators is straightforward. Because we only eliminate type families applied to a sequence of variables, the branch functions supplied to the eliminator look like pattern matching, and the whole definition is rather compact.
(Code listing not reproduced in this extraction; see the linked file in the Spire repository.)
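Since the listing itself is missing from this extraction, here is a sketch of what it plausibly contains, with the standard eliminators written out by hand (my reconstruction, not necessarily the repository's exact code):

```agda
elimℕ : (P : ℕ → Set) → P zero → ((n : ℕ) → P n → P (suc n)) → (n : ℕ) → P n
elimℕ P pz ps zero    = pz
elimℕ P pz ps (suc n) = ps n (elimℕ P pz ps n)

elimVec : (A : Set) (P : (n : ℕ) → Vec A n → Set)
  → P zero nil
  → ((n : ℕ) (x : A) (xs : Vec A n) → P n xs → P (suc n) (cons n x xs))
  → (n : ℕ) (xs : Vec A n) → P n xs
elimVec A P pn pc .zero     nil           = pn
elimVec A P pn pc .(suc n) (cons n x xs) = pc n x xs (elimVec A P pn pc n xs)

add : ℕ → ℕ → ℕ
add = elimℕ (λ _ → ℕ → ℕ) (λ n → n) (λ m ih n → suc (ih n))

mult : ℕ → ℕ → ℕ
mult = elimℕ (λ _ → ℕ → ℕ) (λ n → zero) (λ m ih n → add n (ih n))

append : (A : Set) (m n : ℕ) → Vec A m → Vec A n → Vec A (add m n)
append A m n xs ys =
  elimVec A (λ m _ → Vec A (add m n)) ys
    (λ m x _ ih → cons (add m n) x ih) m xs

concat : (A : Set) (m n : ℕ) → Vec (Vec A m) n → Vec A (mult n m)
concat A m n xss =
  elimVec (Vec A m) (λ n _ → Vec A (mult n m)) nil
    (λ n xs _ ih → append A m (mult n m) xs ih) n xss
```

Because the motives only mention variable indices, each branch reads just like the corresponding pattern-matching clause.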
Computational Descriptions
Now we will consider Descriptions as they appear in
Dagand’s thesis,
which are the core type theory analogue to surface language datatype
definitions.
(Code listing not reproduced in this extraction; see the linked file in the Spire repository.)
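A sketch of the missing listing, following Dagand's presentation of indexed descriptions (constructor names are my guesses):

```agda
-- assuming ⊤ and Σ from the Agda standard library
data Desc (I : Set) : Set₁ where
  `⊤ : Desc I                            -- trivial description (ends a constructor)
  `X : (i : I) → Desc I                  -- recursive position at index i
  `Σ : (A : Set) (D : A → Desc I) → Desc I  -- constructor argument
  `Π : (A : Set) (D : A → Desc I) → Desc I  -- infinitary argument

⟦_⟧ : {I : Set} → Desc I → (I → Set) → Set
⟦ `⊤ ⟧     X = ⊤
⟦ `X i ⟧   X = X i
⟦ `Σ A D ⟧ X = Σ A (λ a → ⟦ D a ⟧ X)
⟦ `Π A D ⟧ X = (a : A) → ⟦ D a ⟧ X

-- note: μ takes a FUNCTION from the index to a description
data μ {I : Set} (D : I → Desc I) (i : I) : Set where
  con : ⟦ D i ⟧ (μ D) → μ D i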
A well known isomorphism exists between sums and dependent products
whose domain is some finite collection. To encode a type such as ℕ
,
we can use a Σ
whose domain is an index into an enumeration of the
constructor names zero
and suc
.
(Code listing not reproduced in this extraction; see the linked file in the Spire repository.)
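The missing snippet presumably looks something like this sketch, using a plain Agda datatype as the two-element enumeration of constructor names:

```agda
data NatT : Set where
  `zero `suc : NatT

NatD : ⊤ → Desc ⊤
NatD tt = `Σ NatT (λ { `zero → `⊤ ; `suc → `X tt })

ℕ : Set
ℕ = μ NatD tt

zero : ℕ
zero = con (`zero , tt)

suc : ℕ → ℕ
suc n = con (`suc , n)
```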
If you’ve been reading carefully, you will have noticed that μ
did not take a
Desc
ription as an argument, but a function from the index to a
description. Certain type families can be defined computationally (as
functions from their index), as in
Inductive Families Need Not Store Their Indices
by Brady et al. Eliminating functions defined in this style leads to
particularly nice reduction behaviour, buying you free equations
thanks to definitional equality. ℕ
was not indexed, but below is an
example of defining Vec
as a computational description.
(Code listing not reproduced in this extraction; see the linked file in the Spire repository.)
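A sketch of the missing Vec listing: the description is literally a function from the (encoded) ℕ index, computed by matching on the encoded constructors:

```agda
VecD : (A : Set) → ℕ → Desc ℕ
VecD A (con (`zero , tt)) = `⊤                   -- index zero: only nil
VecD A (con (`suc  , n))  = `Σ A (λ _ → `X n)    -- index suc n: head plus tail

Vec : (A : Set) → ℕ → Set
Vec A n = μ (VecD A) n

nil : (A : Set) → Vec A zero
nil A = con tt

cons : (A : Set) (n : ℕ) → A → Vec A n → Vec A (suc n)
cons A n x xs = con (x , xs)
```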
Rather than using a standard eliminator on datatypes defined using
descriptions, the special ind
elimination rule is used. An
eliminator has separate “branches” for each constructor of a datatype,
along with proofs of the motive being satisfied at recursive positions
in the constructor. Instead, ind
has a single branch (called pcon
below) that bundles up all branches of a typical eliminator, along
with an All
argument for all recursive motive proofs.
(Code listing not reproduced in this extraction; see the linked file in the Spire repository.)
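The missing listing presumably defines All and ind roughly as follows (a sketch of the standard construction; ind is implemented mutually with a helper that collects the inductive hypotheses):

```agda
All : {I : Set} (D : Desc I) (X : I → Set)
      (P : (i : I) → X i → Set) → ⟦ D ⟧ X → Set
All `⊤       X P tt       = ⊤
All (`X i)   X P x        = P i x
All (`Σ A D) X P (a , xs) = All (D a) X P xs
All (`Π A D) X P f        = (a : A) → All (D a) X P (f a)

ind : {I : Set} (D : I → Desc I) (P : (i : I) → μ D i → Set)
      (pcon : (i : I) (xs : ⟦ D i ⟧ (μ D)) → All (D i) (μ D) P xs → P i (con xs))
      (i : I) (x : μ D i) → P i x
hyps : {I : Set} (D : I → Desc I) (P : (i : I) → μ D i → Set)
       (pcon : (i : I) (xs : ⟦ D i ⟧ (μ D)) → All (D i) (μ D) P xs → P i (con xs))
       (E : Desc I) (xs : ⟦ E ⟧ (μ D)) → All E (μ D) P xs

ind D P pcon i (con xs) = pcon i xs (hyps D P pcon (D i) xs)

hyps D P pcon `⊤       tt       = tt
hyps D P pcon (`X i)   x        = ind D P pcon i x
hyps D P pcon (`Σ A E) (a , xs) = hyps D P pcon (E a) xs
hyps D P pcon (`Π A E) f        = λ a → hyps D P pcon (E a) (f a)
```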
Using this eliminator we can define our running example of function
definitions. Here we use ind
rather than pattern matching. The
anonymous function argument represents sugared “{}” syntax from
Dagand’s thesis, Example 3.19
. Additionally, the arguments bound in each
constructor pattern match clause are desugared into projections on the
right hand side. We will see what the final desugared terms look like
later in this post.
(Code listing not reproduced in this extraction; see the linked file in the Spire repository.)
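The full listing is missing here; a sketch of two of the four definitions, where Agda pattern-matching lambdas stand in for the sugared “{}” syntax (mult and concat follow the same shape):

```agda
add : ℕ → ℕ → ℕ
add m n = ind NatD (λ _ _ → ℕ)
  (λ { tt (`zero , tt) tt → n            -- zero branch: no hypotheses
     ; tt (`suc  , m') ih → suc ih })    -- suc branch: ih proves the motive at m'
  tt m

append : (A : Set) (m n : ℕ) → Vec A m → Vec A n → Vec A (add m n)
append A m n xs ys = ind (VecD A) (λ i _ → Vec A (add i n))
  (λ { (con (`zero , tt)) tt        tt → ys
     ; (con (`suc  , m')) (x , xs') ih → cons A (add m' n) x ih })
  m xs
```

Matching on the index in each branch makes the description, and hence the argument tuple, unfold definitionally.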
Definitions using computational descriptions (like the ones above) are
nice because you can pattern match on the index (e.g. ℕ
) and the
description for the type family (e.g. Vec
) definitionally unfolds.
However, things get a bit clunkier once we wish to support named
constructor arguments. Notice that we defined ℕ
with an enumeration
for the constructor arguments, but we did not do the same for Vec
.
Example 7.46
in Dagand shows how to elaborate Vec
into a description
that has named constructor arguments. This involves first wrapping the
description in an elim
constructor to identify Vec
as a type defined
by computation over its index. Then, the zero
and suc
branches
return zero
and suc
constructor tags respectively. In this case,
the constructors index into singleton enumerations, i.e. elim
into
[elim]
, zero
into [zero]
, and suc
into [suc]
. If we were
defining a type where multiple constructors shared an index, then
for that index branch the enumeration would not be a singleton, but it
would still only be a sub-enumeration of the total enumeration of
constructors that we have in mind for the type.
In contrast,
the ℕ
description tag constructors both belong to a more natural
enumeration, i.e. zero
and suc
index into [zero, suc]
. Hence,
although functions like append
and concat
defined above over
computational description Vec
look nice, once you add these
singleton tags and desugar everything, you get lots of eliminations
over singleton enumerations that are IMO no longer as elegant.
Additionally, type families defined by computation over the index are
only a subclass of all possible type families. The remaining types
(and actually, all type families) can be alternatively defined by
constraining the index with a propositional equality proof. See Dagand
Example 7.45
for how to define Vec
this way. This type of definition
keeps the more natural enumeration of constructor tags. I will call
types defined this way “propositional descriptions”.
Propositional Descriptions
Although computational descriptions give you an additional way to
define types, in practice once you add named constructors and perform
elaboration of patterns to eliminators, I don’t feel like they buy you
enough for the additional complexity. I am content with supporting
Agda-style propositionally defined datatypes exclusively. Given this
decision, we can change the grammar of descriptions to more
closely resemble the surface language Agda-style datatype
declarations. I saw something like this alternative Desc
definition
in the code accompanying a
blog post
by Guillaume Allais.
(Code listing not reproduced in this extraction; see the linked file in the Spire repository.)
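A sketch of the missing definition, in the style of Allais's code (constructor names follow the prose; details are my reconstruction):

```agda
-- assuming ⊤, Σ, ×, and ≡ from the Agda standard library
data Desc (I : Set) : Set₁ where
  End    : (i : I) → Desc I                        -- end a constructor at index i
  Rec    : (i : I) (D : Desc I) → Desc I           -- recursive position, then more
  Arg    : (A : Set) (D : A → Desc I) → Desc I     -- constructor argument, then more
  RecFun : (A : Set) (B : A → I) (D : Desc I) → Desc I  -- infinitary recursive position

⟦_⟧ : {I : Set} → Desc I → (I → Set) → I → Set
⟦ End j ⟧        X i = j ≡ i                        -- the internalized equality constraint
⟦ Rec j D ⟧      X i = X j × ⟦ D ⟧ X i
⟦ Arg A D ⟧      X i = Σ A (λ a → ⟦ D a ⟧ X i)
⟦ RecFun A B D ⟧ X i = ((a : A) → X (B a)) × ⟦ D ⟧ X i

-- μ now takes a description, not a function from the index
data μ {I : Set} (D : Desc I) (i : I) : Set where
  con : ⟦ D ⟧ (μ D) i → μ D i
```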
This description grammar forces descriptions to look like what we
are used to seeing in datatype declarations. For example,
Rec/Arg/RecFun
, corresponding to the previous X/Σ/Π
constructors,
take an extra
description argument at the end. Then End
, formerly ⊤
, ends the
“constructor” with an index value. The interpretation function uses
this index value to ask for a propositional equality proof, making
sure that the index of the constructor you produce matches the index
of the type you specified. This can be achieved in the previous Desc
grammar by ending a description with Σ (x ≡ y) λ _ → ⊤
, but here
that pattern is internalized. One pleasant consequence can be seen by
looking at the μ
datatype. It no longer requires a function from the
index to a description, and now merely requires a description. Because
we no longer support computationally described datatypes (instead
describing them all propositionally), our descriptions can be
first-order rather than higher-order. The more first-order your
descriptions are, the more
fully generic programming
you can do over them.
The ℕ
datatype is declared pretty much the same as before. However,
Vec
is now given with its constructor names, and the index of a
particular constructor is given at the end of the sequence of
constructor arguments. Compare this to the Agda data declaration at
the top of the post and notice the similar structure.
(Code listing not reproduced in this extraction; see the linked file in the Spire repository.)
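In place of the missing listing, a sketch of the declarations (the tag types `NatT`/`VecT` are my stand-ins for the sugared constructor-name enumerations):

```agda
data NatT : Set where
  `zero `suc : NatT

NatD : Desc ⊤
NatD = Arg NatT (λ { `zero → End tt ; `suc → Rec tt (End tt) })

ℕ : Set
ℕ = μ NatD tt

zero : ℕ
zero = con (`zero , refl)

suc : ℕ → ℕ
suc n = con (`suc , n , refl)

data VecT : Set where
  `nil `cons : VecT

VecD : (A : Set) → Desc ℕ
VecD A = Arg VecT (λ
  { `nil  → End zero
  ; `cons → Arg ℕ (λ n → Arg A (λ _ → Rec n (End (suc n)))) })

Vec : (A : Set) → ℕ → Set
Vec A n = μ (VecD A) n

nil : (A : Set) → Vec A zero
nil A = con (`nil , refl)

cons : (A : Set) (n : ℕ) → A → Vec A n → Vec A (suc n)
cons A n x xs = con (`cons , n , x , xs , refl)
```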
Our function definitions for add
and mult
are pretty much
unchanged, but append
and concat
have one significant difference.
Both Vec
constructors do a dependent pattern match on a
propositional equality proof. However, this is once again a rather
simple dependent match that can just be elaborated to uses of
substitution. Specifically, this elaboration is the solution
step of
Lemma 16
in
Eliminating Dependent Pattern Matching.
(Code listing not reproduced in this extraction; see the linked file in the Spire repository.)
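A sketch of the missing code: All and ind transported to the propositional Desc, then add and append, whose Vec branches match on the refl proof (the sugared form of the subst elaboration discussed above):

```agda
All : {I : Set} (D : Desc I) (X : I → Set)
      (P : (i : I) → X i → Set) (i : I) → ⟦ D ⟧ X i → Set
All (End j)        X P i q        = ⊤
All (Rec j D)      X P i (x , xs) = P j x × All D X P i xs
All (Arg A D)      X P i (a , xs) = All (D a) X P i xs
All (RecFun A B D) X P i (f , xs) = ((a : A) → P (B a) (f a)) × All D X P i xs

ind : {I : Set} (D : Desc I) (P : (i : I) → μ D i → Set)
      (pcon : (i : I) (xs : ⟦ D ⟧ (μ D) i) → All D (μ D) P i xs → P i (con xs))
      (i : I) (x : μ D i) → P i x
hyps : {I : Set} (D : Desc I) (P : (i : I) → μ D i → Set)
       (pcon : (i : I) (xs : ⟦ D ⟧ (μ D) i) → All D (μ D) P i xs → P i (con xs))
       (E : Desc I) (i : I) (xs : ⟦ E ⟧ (μ D) i) → All E (μ D) P i xs

ind D P pcon i (con xs) = pcon i xs (hyps D P pcon D i xs)

hyps D P pcon (End j)        i q        = tt
hyps D P pcon (Rec j E)      i (x , xs) = ind D P pcon j x , hyps D P pcon E i xs
hyps D P pcon (Arg A E)      i (a , xs) = hyps D P pcon (E a) i xs
hyps D P pcon (RecFun A B E) i (f , xs) =
  (λ a → ind D P pcon (B a) (f a)) , hyps D P pcon E i xs

add : ℕ → ℕ → ℕ
add m n = ind NatD (λ _ _ → ℕ)
  (λ { tt (`zero , refl)      tt        → n
     ; tt (`suc , m' , refl) (ih , tt) → suc ih })
  tt m

append : (A : Set) (m n : ℕ) → Vec A m → Vec A n → Vec A (add m n)
append A m n xs ys = ind (VecD A) (λ i _ → Vec A (add i n))
  (λ { .zero     (`nil , refl)                 tt        → ys
     ; .(suc m') (`cons , m' , x , xs' , refl) (ih , tt) → cons A (add m' n) x ih })
  m xs
```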
Desugared Propositional Descriptions
I will now show the desugared final forms of the propositional description code given so far. It is very important to study these terms carefully, as they are the terms of our canonical type theory and will appear as types everywhere throughout our language (because types are fully evaluated terms).
The first bit of sugar we will get rid of has to do with the pattern
matching we have been performing on finite enumerations of tags
(representing constructor names). The enumeration Enum
will be a
list of strings. Tag
is an index into Enum
(like Fin
into ℕ
).
Cases
is a finite product of values in a type family indexed by each
Tag
(like Vec
with type family values indexed by ℕ
). Finally,
case
eliminates a tag by returning the value in Cases
at that
position/name. I’ve renamed these constructs; their original
names in Dagand are EnumU
, EnumT
, π
, and switch
and can be
found in Definition 2.49
and Definition 2.52
.
(Code listing not reproduced in this extraction; see the linked file in the Spire repository.)
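A sketch of what the missing listing presumably defines:

```agda
-- assuming List, String, ⊤, and × from the Agda standard library
Enum : Set
Enum = List String

-- an index into an Enum, like Fin into ℕ
data Tag : Enum → Set where
  here  : {l : String} {E : Enum} → Tag (l ∷ E)
  there : {l : String} {E : Enum} → Tag E → Tag (l ∷ E)

-- a finite product of values, one per tag
Cases : (E : Enum) (P : Tag E → Set) → Set
Cases []      P = ⊤
Cases (l ∷ E) P = P here × Cases E (λ t → P (there t))

-- eliminate a tag by projecting out the value at its position
case : (E : Enum) (P : Tag E → Set) (cs : Cases E P) (t : Tag E) → P t
case []      P cs ()
case (l ∷ E) P (c , cs) here      = c
case (l ∷ E) P (c , cs) (there t) = case E (λ t → P (there t)) cs t
```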
In the sugared version of descriptions for datatypes we match on a tag
and return a description for it. In the desugared version, we instead
eliminate a tag with the special case
elimination rule.
(Code listing not reproduced in this extraction; see the linked file in the Spire repository.)
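In place of the missing listing, a sketch of the desugared descriptions, where the tag match becomes an application of case (the string spellings are my guesses):

```agda
NatE : Enum
NatE = "zero" ∷ "suc" ∷ []

NatD : Desc ⊤
NatD = Arg (Tag NatE)
  (case NatE (λ _ → Desc ⊤)
    ( End tt                    -- "zero"
    , Rec tt (End tt)           -- "suc"
    , tt ))

ℕ : Set
ℕ = μ NatD tt

zero : ℕ
zero = con (here , refl)

suc : ℕ → ℕ
suc n = con (there here , n , refl)

VecE : Enum
VecE = "nil" ∷ "cons" ∷ []

VecD : (A : Set) → Desc ℕ
VecD A = Arg (Tag VecE)
  (case VecE (λ _ → Desc ℕ)
    ( End zero                                             -- "nil"
    , Arg ℕ (λ n → Arg A (λ _ → Rec n (End (suc n))))      -- "cons"
    , tt ))
```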
Now brace yourself for the rather wordy desugared version of our
series of function definitions. The general pattern for these is that
we first use ind
on the datatype being eliminated (like before),
then we use case
to give a branch for each constructor. Because
case
eliminates a tag out of the domain of a dependent pair,
we must use the
convoy pattern
to have the codomain properly unfold. As mentioned before, “matching” on
our propositional equality proof is done by applying subst
. Finally,
each argument to a constructor is referenced by its projection out of
the tuple of arguments you actually get out of the constructor.
(Code listing not reproduced in this extraction; see the linked file in the Spire repository.)
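The full 75-line listing is not reproduced here; to convey the flavour, a sketch of just add, desugared to ind plus case with the convoy pattern (the helper NatCs is my own name for the tag-indexed constructor bodies; the Vec definitions additionally apply subst to their equality proofs):

```agda
NatCs : Tag NatE → Desc ⊤
NatCs = case NatE (λ _ → Desc ⊤) (End tt , Rec tt (End tt) , tt)

NatD : Desc ⊤
NatD = Arg (Tag NatE) NatCs

add : ℕ → ℕ → ℕ
add m n = ind NatD (λ _ _ → ℕ)
  (λ { tt (t , xs) ihs →
       -- convoy: abstract the motive over the payload and hypotheses,
       -- so that their types unfold once the tag is fixed by case
       case NatE
         (λ t → (xs : ⟦ NatCs t ⟧ (μ NatD) tt)
              → All (NatCs t) (μ NatD) (λ _ _ → ℕ) tt xs
              → ℕ)
         ( (λ _ _   → n)                  -- "zero" branch
         , (λ _ ihs → suc (proj₁ ihs))    -- "suc" branch: project the hypothesis
         , tt )
         t xs ihs })
  tt m
```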
Alternatively, rather than defining these functions with ind
,
case
, and subst
directly, we can define the eliminators using
ind
and reuse our former definitions that use eliminators.
(Code listing not reproduced in this extraction; see the linked file in the Spire repository.)
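The listing itself is missing; a sketch of the idea for ℕ alone, where elimℕ is derived once from ind and case (the refl matches stand in for the subst applications in the fully desugared term), after which the eliminator-style definitions go through unchanged:

```agda
elimℕ : (P : ℕ → Set) → P zero → ((m : ℕ) → P m → P (suc m)) → (m : ℕ) → P m
elimℕ P pz ps = ind NatD (λ _ n → P n)
  (λ { tt (t , xs) ihs →
       case NatE
         (λ t → (xs : ⟦ NatCs t ⟧ (μ NatD) tt)
              → All (NatCs t) (μ NatD) (λ _ n → P n) tt xs
              → P (con (t , xs)))
         ( (λ { refl       tt         → pz })        -- "zero"
         , (λ { (m , refl) (ihm , tt) → ps m ihm })  -- "suc"
         , tt )
         t xs ihs })
  tt

-- the earlier eliminator-based definitions can then be reused as-is, e.g.:
add : ℕ → ℕ → ℕ
add = elimℕ (λ _ → ℕ → ℕ) (λ n → n) (λ m ih n → suc (ih n))
```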
An Avalanche of Canonical Terms
We have already seen how large the desugared code gets in the previous
section. Unfortunately, if you evaluate this code to canonical form it
gets much bigger! For example, the
canonical term
for concat
defined
using ind
is 2,195 lines long! This is a huge term, considering
the
surface language source
for defining the types and values of add
, mult
, append
, and
concat
is 14 lines long (the line count above only accounts for
the value of concat by itself).
Some negative consequences of large canonical terms include:
- Computational overhead of type checking expressions against these canonical types.
- Computational overhead of embedding/pretty-printing these canonical terms.
- Memory overhead in type checking and embedding/pretty-printing.
- Reduced readability of canonical terms for developers while debugging.
Some of this explosion in size comes from using eliminators instead of
dependent pattern matching, where the motive must be supplied
explicitly. Some more comes from the fact that a μ
representing a
type is applied to its description, which may be large, so anywhere
the type appears the whole description is duplicated.
I haven’t thought much about solutions to this large term problem yet, but I can imagine a few. I don’t want to perform implicit argument search and unification in the canonical type theory because it would complicate the theory too much. However, the current canonical grammar is already broken up into values and spines. This allows for some bidirectional argument synthesis for values, but not for elimination rules. Breaking up the grammar further would allow for synthesis of arguments to elimination rules too, and the canonical type checker would remain relatively simple. Here is a file that adds some implicit arguments to the definitions presented thus far that I believe could be synthesized. I didn’t try that hard, so there is more room for making things implicit, but this at least takes the line count down to 1,411.
An interesting thing to notice is that the way elimination proceeds
when writing definitions with ind
, case
, and subst
is rather
uniform. All definitions of datatypes can already be characterized as
codes of a universe called tagDesc
in Dagand Definition 4.23
.
Dagand programs over this universe to perform generic programming. One form
of that would be a specialized indcase
definition that automatically
applies ind
, then
case
, then maybe subst
too. Ideally, I would like a generic
elim
function that computes the exact type signature expected from
standard eliminators from each description. This basically involves
additionally doing some currying/uncurrying to pack/unpack values out of the tuple
produced by the interpretation function for descriptions. As we have
seen, programming definitions using the interface exposed by
eliminators leads to
pretty short code.
The motive is still there and descriptions are still duplicated in
every occurrence of μ
, but we need to win this war by winning many
battles. However, even if you programmed this generic elim
function
within the language, the canonical term it produces would be just as
big. It may be necessary to add elim
as a canonical term primitive
instead, and we may not even need the more general ind
or case
if
our language never produces code in the more general form where ind
or case
would be necessary.
Another technique might be to avoid expanding definitions where possible. For example, if a unique hash represents the description for a type, then testing equality of terms by comparing hash values would work. However, I would need to be careful to evaluate that hash to a concrete term during type checking, a form of lazy evaluation. Memoization of previously type-checked terms, and of equality comparisons, would also help a lot because there are many duplicated terms.
Sums
Using a tag that indexes into an enumeration as the domain of a
dependent pair type is isomorphic to using a Sum
type. This
alternative approach is taken by Bob Atkey in
Foveran. Well, the isomorphism
is almost there. A tag for an enumeration lets you have named sums,
which we want because we care about the names of our constructors. There are a
variety of ways to accomplish this with sums, like making a new
sum-like type, or always making the first type of a sum be a
labelled type
(see
The view from the left
) and ending the chain of sums in ⊥
on the right. This would still
allow you to perform generic programming by having a list of tuples of
strings plus descriptions act as the universe of codes
(just like tagDesc
), which gets interpreted as a description,
which gets interpreted as a sequence of sums of the form that I just
described. There are a number of pros and cons between the tag and sum
approach that I should study more closely, but I think a lot of the
duplication issues would come up in either case.
Later
That wraps up this week’s blog post. I think a weekly schedule is good for this kind of development blog, as it gives me enough time to come up with material worth blogging about and enough time for interested readers to keep up.