This post is about adapting descriptions, used to encode dependently typed datatypes (including type families) originally in The Gentle Art of Levitation, to inductive-recursive types.
Goals
There have been many encodings of [indexed] inductive-recursive types, but as I developed Spire I came to desire an encoding that satisfied all of the following criteria:
- Is universe-polymorphic. Although it’s not hard to add universe levels, it is crucial for an actual language implementation. Nevertheless, most papers (justifiably) abstain from cluttering their presentation with universe levels.
- Passes Agda’s termination & positivity checkers. Many alternative encodings of IR types, especially the ones that are directly inspired by algebraic semantics [Ghani & Hancock] suffer from being too general to pass Agda’s termination or positivity checkers.
- Has description constructors that directly correspond to
constructor declaration arguments.
Implementing a high-level language that translates to a canonical
kernel language is very difficult if the kernel language is too
different from the high-level language. For this reason I have chosen
to make the description (
Desc
) datatype enforce structure that is as similar to a higher-level constructor telescope declaration as possible. From the descriptions literature, this means extending the propositional encoding [McBride] rather than the (albeit more general) computational encoding [Dagand]. From the IR literature, this means staying away from the more semantic encodings described by Ghani & Hancock, as well as subtle differences from the original encoding by Dybjer and Setzer encoding, as reviewed by Malatesta et. al.. - Has an induction principle. The induction principle for encoded datatypes is typically given in the description literature, but this is not always the case in the IR literature. I also wanted an induction principle in the style of the description literature.
Non-Indexed Inductive-Recursive Descriptions
I’m going to start with descriptions for non-indexed IR types, and move to indexed IR types in the subsequent section. Additionally, I will mainly focus on the definition of descriptions, and encoding types with descriptions. The harder part is defining the fixpoint for encoded types and the corresponding induction principle, which I briefly gloss over at the end of the post. Additionally, all the code presented in this post is linked to in the conclusion.
A common use of IR is to
model a dependently typed language. Below is the definition of an
example language. The datatype definition (Lang
) is mutually defined
with an interpretation function whose domain is Set
. Because the
codomain is Set
(rather than some small type like ℕ
), this makes
it a large inductive-recursive definition.
1 2 3 4 5 6 7 8 9 10 11 12 |
|
I will encode Lang
using the following description datatype
(Desc
). Desc
is parameterized by a datatype O
, which is the type
of the codomain of the IR interpretation function.
It is easiest to think of the constructors of Desc
as the pieces
used to form the telescope of a constructor type declaration, such as
Two
and Pair
in Lang
.
1 2 3 4 5 |
|
End
ends a constructor declaration, and specifies what the
mutually-defined IR interpretation function returns for the
constructor. Rec
encodes a request for a recursive first-order
argument, and the remainder of the telescope may depend on the
interpretation function applied to this recursive argument.
Ref
encodes a request for a recursive function argument. To remain
strictly positive, Ref
asks for the type of the domain of the
function argument, and the remainder of the telescope may depend on a
function from a value of the domain of the function to the
interpretation function applied to the result of the recursive function
that Ref
encodes.
Arg
records an ordinary argument by requesting the type of the
argument, and the remainder of the telescope may depend on a value of
that type.
Below is the encoding of the the Lang
datatype, and its
interpretation function, as a description.
1 2 3 4 5 6 7 8 9 10 11 12 |
|
The Two
constructor takes no arguments, and its interpretation
function returns the Bool
type, and the Zero
and One
cases are
similar.
The Pair
constructor first takes a
recursive argument, so the rest of the telescope can depend on its
interpretation. Then, it takes a recursive function argument whose
domain is the interpretation of the first argument. Once again, the
remainder of the telescope may depend on the the function from the
requested domain to the interpretation function result. Finally, the
constructor declaration ends by saying that the interpretation of
Pair
as a whole is the sigma type (Σ
) that we expect. The Fun
and Tree
constructors are encoded in the same way that Pair
is.
Also, I sprinkle Lift
and lift
in the right places to make the
universe levels work out.
Another standard example of a small inductive-recursive definition
is an interpreter of an arithmetic language, as presented by
Malatesta et. al..
The idea is to express mathematical sums and products, where the
domain of the sum or product specifies the bound on the iteration. See
the paper for an in depth explanation, but below is the high-level
code for the Arith
type.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 |
|
Note the that it is a small IR type because the codomain of the
interpretation function is a small type (ℕ
). Below is the Desc encoding,
which is very similar to what Malatesta et. al. give.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
|
Indexed Inductive-Recursive Descriptions
Both the Lang
and Arith
types had interpretations whose codomain
was some constant value (Set
and ℕ
respectively). More generally,
an inductive-recursive type may be indexed, and the codomain of the
interpretation function may also be a type family. For example, you can
imagine a type like Type : ℕ → Set
where the index type is I = ℕ
,
and an interpretation function eval : (n : ℕ) → Type n → Fin n
where
the interpretation type is O = Fin
.
Below is the description type Desc
modified to be able to encode
indexed inductive-recursive definitions. It now takes an
index type ((I : Set ℓ)
) as an additional parameter to the
(now dependent) interpretation type ((O : I → Set ℓ)
).
1 2 3 4 5 |
|
The new Desc
is a mixture of the old one (which is a slightly modified
Dybjer-Setzer IR type) , and the indexed description by
McBride.
Recall that descriptions encode constructor argument telescopes. At
the end of a telescope (End
), we must now specify what index the type is
at, as well as the (dependent) value that the interpretation function
returns for the constructor being defined. A recursive (Rec
)
argument requires the index at for the requested recursive type, and
the remainder of the telescope of arguments may use the (dependent)
result of the interpretation function. A recursive function argument
(Ref
) requires the type of the domain of the recursive argument, an
index value (I
) assuming the requested argument (A
), and the rest
of the constructor arguments telescope may depend on a function from
the requested argument to the (dependent) interpretation function
call. Finally, requesting an ordinary argument (Arg
) is just as
before.
So far we have only looked at descriptions, but they are relatively
useless and easy to get wrong if you haven’t defined the corresponding
fixpoint type (which interprets a description of a type as an actual
type). First let’s look at El
, which interprets a description as a
functor between indexed families of sets.
1 2 3 4 5 |
|
El
gets an additional argument (Y
), representing the interpretation
function of the described datatype. To define El
, we must use Y
wherever
we need to get the description from the codomain (B
) of a recursive
first-order (Rec
) or higher-order (Ref
) argument.
More interestingly, we can now define the fixpoint datatype μ
.
1 2 3 4 5 6 7 8 9 10 11 12 |
|
I tried to define this a couple of years ago but stuck for some
reason. I tried once again a couple of days ago and arrived at this
definition. In the IR literature El
is
sometimes referred to as
⟦_⟧₀
, and a more general version of foldsO
is referred to as
⟦_⟧₁
. However, to get foldO
to pass Agda’s termination check we must inline a
mutually defined specialized version of ⟦_⟧₁
, namely foldsO
. This
is basically the same trick used to get the elimination principle
ind
to terminate by inlining hyps
. I remember trying to reuse this
termination technique a couple
of years ago and failing, but anyhow there it is.
Finally, for the sake of completeness below is the adapted definition
of Hyps
(a collection of inductive hypotheses), ind
(the primitive
induction principle for described types), and hyps
(a specialized
mapping function to collect inductive hypotheses).
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 |
|
Conclusion
Well there it is, indexed inductive-recursive descriptions that satisfy all of my goals stated at the beginning of the post! The beginning of the post was background on non-indexed IR encoding. However, the useful bit of the code to reuse is in the previous section, which includes the universe-polymorphic, indexed, inductive-recursive, constructor-esque descriptions, as well as their primitive introduction and elimination rules.
The code used throughout the post is linked below:
If you want some homework, try inventing your own indexed inductive-recursive type, and encode it both in Agda and as a description. You can also borrow a type to encode from Dybjer & Setzer’s paper on the topic.