model.lean
Model theory formalized in Lean
Link to file: type theory
Table of Contents
1 TODO Arity \(n\) functions and their API
2 Languages
2.1 Definition and implementation
A language is given by specifying "functions'', "relations" and "constants" along with the arity of each function and each relation. Note that languages are purely syntactic – they have no sematics (i.e. meaning assigned to them).
The words "functions", "relations", and "constant" are written in quotes here to emphasize that these are just the names we give to these pieces of data. We could equivalently have called them F, R, and C. They do not yet behave like actual functions, relations and constants because we have not added any semantic information to them. For now, they are merely symbols in our language.
In lean, we use structure to bundle data (think of it like classes in Python). So, we bundle functions and relations in a structure and call it a language.
Implementation details –
- Constants are included in the mathematical definition of languages but not in
out immediate implementation of
lang
(see subsection that follows for an explanation). - The definition says that every function and relation is supposed to have an arity. In our implementation, we invert this and instead stipulate that every arity (i.e. every natural number) has a type assiciated with it. What type is that? The type corresponding to \((n : ℕ)\) is the type of all \(n\)-arity functions in our language.
- Thus, \(F : ℕ → Type\) is the function that returns a type for each natural number. Similarly for \(R\).
- We also note that since each
lang
contains anF
and anR
that themselves return types, the type oflang
itself isType 1
, which one universe higher thanType
. In other words,Type
is too small to containlang
, so languages must live in a higher universe. This is a small detail that does not affect the rest of treatment of model theory in any way.
2.2 Constants
We define the constants in a language to simply be the \(0\)-arity functions
i.e. L.F 0
. (Think of constants as simply a convenient way to refer to
\(0\)-arity functions). They do not add any new information to the lang
already defined above.
- This uses the fact that natural numbers in Lean begin at \(0\), not \(1\).
- This is the reason why we did not need to include constants in out implementation of languages.
2.3 DLO language
The language that is of primary interest to us is the language of dense linear ordering.
A dense linear ordering (without endpoints) (DLO) is a language containing a single binary relation symbol \((\le)\) satisfying some axioms. The exact axioms are not important at the moment since axioms are semantic statements, while we are interested only in syntactic statements right now.
Thus, a DLO's lang
is a language with no constant, no functions and only one
binary relation. (Refer to the type theory primer for the meaning of empty
,
unit
etc.).
2.4 lang
is an inhabited type
We can now declare that lang
is an inhabited (read: nonempty) type since we
know that DLO_lang
is a lang
. This declaration is made using Lean's
instance
keyword. This declaration automatically makes the nonemptiness of
lang
available as a fact to Lean's typeclass resolver if needed.
3 Structures
Given a language \(L\), an \(L\)-structure is a bundle of the following 4 pieces of data.
- A set called the universe or domain. We implement this as a type rather than a set because Lean is based on type theory.
- For every symbolic function of arity \(n\) in \(L\), we must have a
corresponding real function of arity \(n\) acting on the universe of
\(M\). To implement this, we map every
(f : L.F n)
toFunc univ v
. Since we need a name for this map, we name itF
. Thanks to namespaces in lean, the full name for this map isstruc.F
orM.F
where \(M\) is astruc
, and hence there is no chance of mixing this up withlang.F
orL.F
. We will denote the interpretation of the symbolic function \(f\) under the structure \(M\) asf^M
. - For every symbolic relation of arity \(n\) in \(L\), we must have a
corresponding real relation of arity \(n\) on the universe of \(M\). At this
point, it is useful to recall that the proper definition of a relation is as
a subset if the domain set. What do I mean? While we like to think of a
binary relation for example as a relation "between" elements of our set, it
is in fact better to view it as a subset of the Cartesian product of the
domain with itself. Pairs belonging to this subset are related while pairs
not belonging are not. Similarly, \(3\)-ary relations are subsets of
univ³
, and so on. Thus we map each element(r : L.R n)
toset (vector univ n)
, i.e. we map to a set of length \(n\) vectors ofuniv
. We will denote the interpretation of the symbolic relation \(r\) under the structure \(M\) asr̂M
. - For every symbolic constant in \(L\), we must have a corresponding constant
in the universe/domain. Since
L.C
is implemented outside of \(L\), we similarly implementM.C
outside of \(M\). It is implemented as a restriction ofM.F
to \(0\)-arity functions.
3.1 struc
is inhabited
We can now declare that struc L
is an inhabited (read: nonempty) type for
every language \(L\). In other words, every language has at least one structure
defined on it.
The default value for struc L
has –
- a universe/domain with only one element
- corresponding to every symbolic function \(f\) in \(L\),
f^M
is the function that maps everything to the singleton element. - corresponding to every symbolic relation \(r\) in \(L\),
r̂M
is the empty relation.
4 Embedding between structures
4.1 Embedding
Given two structures on the same language, an embedding between them is a map that is injective on the universe/domain and preserves the interpretation of all the elements of the language.
For ease of writing, we first introduce some notation. Given a function of arity
\(n\) (f : Func α n)
, and an \(n\)-vector \(v\), we will write \(f\otimes v\)
to mean that the function \(f\) is to be applied to the \(n\) elements of \(v\)
to yield a value in α
.
To start, we let \(L\) denote the language, and \(M\) and \(N\) denote the
structures on \(L\). Then the map η : M.univ → N.univ
is an embedding –
- If \(\eta\) it is injective
- For every \(n\)-vector \(v\) and every \(n\)-arity symbolic function \(f\) in
\(L\), we have
η(f^M ⊗ v) = f^N ⊗ η(v)
, where \(\eta(v)\) denotes the vector obtained by mapping every coordinate of \(v\) under \(\eta\). - For every \(n\)-vector \(v\) and every \(n\)-arity symbolic relation \(r\) in
\(L\), we have
v ∈ r̂M ↔ η(v) ∈ r̂N
, where \(\eta(v)\) again denotes the vector obtained by mapping every coordinate of \(v\) under \(\eta\).
4.2 Embeddings are inhabited
We argue that every structure has an embedding to itself via the indentity map.
For this embedding, the \(\eta\) map is the identity map on the universe of \(M\). This map is clearly injective and it trivially preserves the interpretation of every function, relation, and constant.
Thus, embedding M M
is an inhabited type (read: nonempty), for every structure
\(M\).
4.3 Isomorphism
An isomorphism is a bijective embedding between two \(L\)-structures. To implement this, we define an isomorphism to be a structure in Lean that extends embeddings. This means –
- there is a natural map
to_embedding : isomorphism M N → embedding M N
. - an isomorphism is the data contained in an embedding bundled with a proof of the bijectivity of the \(\eta\) map.
4.4 Isomorphisms are inhabited
We argue that every structure has an isomorphism to itself via the identity map.
Thus, isomorphism M M
is an inhabited type (read: nonempty), for every
structure \(M\).
4.5 Cardinality of structure
The cardinality of a structure is defined to be the cardinality of its domain. In Lean, the cardinals are their own separate type which is why we need to "make" the cardinal from the universe of \(M\).
We can then claim that if \(\eta : M \rightarrow N\) is an embedding of \(M\) in \(N\), then the cardinality of \(M\) must be at most the cardinality of \(N\). In other words, only smaller structures can be embedded into larger structures.
The proof hinges on the injectivity of \(\eta\) and follows from the fact
cardinal.mk_le_of_injective
(which is already proved in mathlib).
5 Terms
Throughout this section, let \(L\) be a language and let \(M\) be an \(L\)-structure. Terms can be thought of as symbolic words in a language. We get variables as words/terms for free (think of these as placeholder words). A function applied to the correct number of arguments is also a word.
5.1 Definition of terms
\(L\)-Terms are defined inductively as –
- constants of the language are all terms.
- there is a countably infinite supply of indexed "variables", each of which are also distinct terms.
- an \(n\)-ary symbolic function of \(L\) applied to exactly \(n\) other terms also yields a term.
This definition of terms has 2 distinct parts –
- There are atomic terms like constants and variables,
- There are higher-order terms (i.e. terms made of other terms) like ones formed by function application.
5.2 TODO Implementation of terms
The trickiest part of the definition of terms is checking that in higher-order
terms, an \(n\)-ary symbolic function is being applied to exactly \(n\)-many
terms. This check needs a proof of a fact like (length = n)
. However, we
cannot incorporate this proof into the definition directly without running into
lean's issues of nested inductive types.
Our solution is to incorporate the notion of atomic/higher-order into our implementation. To each term we associate a level. We start by stipulating that atomic terms have level \(0\). A symbolic function applied to all its arguments also has level \(0\), while a partially applied function has positive level.
We summarize the term to level translation below –
Term | Level |
---|---|
constant | 0 |
variable | 0 |
function | level = arity of the function |
partially-applied function | level = arity \(-\) # arguments already applied |
5.3 TODO Depth of a term
5.4 TODO Terms are inhabited
5.5 TODO Variables in a term
5.6 TODO Number of variables in a term
5.7 Term interpretation
Having defined terms as words in our language, we can then define an interpretation for each word.
Recall that an \(L\)-structure can be seen as a way of interpreting symbolic functions and relations as real functions and relations. This interpretation carries over to terms as well once we define how variables and function application are to be interpreted. A term of level \(n\) is interpreted as an \(n\)-ary function on the universe of \(M\).
We denote the interpretation of term \(t\) under variable assignment \(va\) as
t^^va
.
5.7.1 Variable interpretation using variable assignment
For interpreting variables, we use a variable assignment map. This is a map from variables (i.e. ℕ) to the domain of \(M\).
5.7.2 Full implementation
For a given variable assignment map, term interpretation is defined inductively on each variant of \(L\)-terms as follows:
- A symbolic function \(f\) in \(L\) is interpreted as
f^M
. - In order to be compatible with the above interpretation, a constant in
\(L\) is interpreted via the map
M.C
. - A variable is interpreted via the variable assignment map.
- Lastly, the application of a term \(t\) to a term of \(t_0\) is interpreted as the interpretation of \(t\) partially applied to the interpretation of \(t_0\) to yield a function of arity one less than the arity equal to the arity of the term we started with.
6 TODO Term substitution
7 Formulas and Sentences
7.1 Definition of a formula
An \(L\)-formula on \(L\) is any one of the following "words" on the alphabet \(L \cup \text{Var} \cup \{\top, \bot, =\}\), where Var denotes the set of all variables –
- \(\top\) and \(\bot\) are formulas
- \(t_1 = t_2\) is a formula, where \(t_1\) and \(t_2\) are \(L\)-terms.
- \(r t_1 \ldots t_n\) is a formula, where \(r\) is a symbolic \(n\)-ary relation in \(L\), and the \(t_i\)'s are \(L\)-terms.
- If \(\phi, \psi\) are \(L\)-formulas, then so are \((\neg \phi)\), \((\phi\wedge\psi)\), and \((\phi\vee\psi)\). Note: we have added parenthesis here merely for ease of writing. They are not really a part of the formula.
- If \(x\) is a variable, and \(\phi\) is an \(L\)-formula, then \(\exists x \phi\) and \(\forall x \phi\) are \(L\)-formulas.
7.2 Implementation of a formula
We implement this definition of an \(L\)-formula inductively by enumerating all the cases. Since we do not have a separate type for variables in our implementation, we simply denote variable \(x_i\) by the natural number \(i\).
7.3 Notation for logical operators
We then introduce some notation for logical operators. We mark all symbols with a "prime" in order to distinguish them from Lean's built-in symbols for equality, and, or, not etc.
Symbol | Meaning |
---|---|
=' |
formula.eq |
¬' |
formula.neg |
∧' |
formula.and |
∨' |
formula.or |
∃' |
formula.exi |
∀' |
formula.all |
⊤' |
formula.tt |
⊥' |
formula.ff |
→' |
¬'φ₁ ∨' φ₂ |
↔' |
(φ₁ →' φ₂) ∧' (φ₂ →' φ₁) |
7.4 TODO Helper functions
vars_in list
vars_in_formula
7.5 Free and bound variables
A variable occurs freely in a formula if it present outside of a
quantification, meaning it occurs in the formula somewhere where it is not
immediately preceded by ∃'
or ∀'
.
We need to be careful with our implementation of this definition because it contains two distinct ideas. A variable occurs freely in a formula if –
- It occurs in a formula.
- Some occurrence in the formula is free.
For example, in the formula \(\phi = \neg'(\forall' v_2 \top) \wedge' (v_1 =' v_4)\wedge' (\exists v_3 (v_2 =' v_3))\),
- the variables \(v_1\) and \(v_4\) occur freely in the formula
- the variable \(v_3\) does not occur freely in the formula.
- the variable \(v_5\) does not occur freely in the formula (because it does not occur in the formula at all).
- Surprisingly, the variable \(v_2\) occurs freely in the formula.
7.6 Definition of a sentence
A formula in which no variable occurs freely is a sentence. We create a subtype of \(L\)-formulas that we call \(L\)-sentences.
Every sentence is consequently a formula. The explicit map for this can be written as
example : setence L → formula L := λ s, s.val
We wish to always be able to recover the formula underlying a sentence without
having to specify the above function explicitly. To do this, we declare a
has_coe
instance from sentence L
to formula L
. Thereafter, we can always
reference the underlying formula of a sentence s
as ↑s
.
8 Satisfiability and models
8.1 TODO Expanding a language and structure
8.2 Structure models/satisfies a formula
We now define what it means for a formula to be true in an \(L\)-structure \(M\), or consequently, what it means for a structure \(M\) to model/satisfy a formula.
Let va : ℕ → M.univ
be a given variable assignment. We denote \(M\)
satisfying the formula \(\phi\) by \(M\models\phi\) and define it inductively as
–
- \(M\) always satisfies \(\top'\).
- \(M\) never satisfies \(\bot'\).
- \(M\) satisfies \(t_1 = t_2\) if
t₁^^va
andt₂^^va
match. - \(M\) satisfies \(r\,t_1\ldots t_n\) if the
(t₁^^va, ..., tₙ^^va)
is in the setr̂M
. - \(M\) satisfies \(\neg \phi\) if it does not satisfy \(\phi\).
- \(M\) satisfies \(\phi_1\wedge\phi_2\) if it satisfies both \(phi_1\) and \(\phi_2\).
- \(M\) satisfies \(\phi_1\vee\phi_2\) if it satisfies either \(phi_1\), or \(\phi_2\), or both.
- \(M\) satisfies \(\exists v, \phi\) for some variable \(v\) if there exists some \(x\) in the domain of \(M\) such that \(M\) satisfies the formula obtained by replacing all occurrences of \(v\) in \(phi\) by \(x\).
- \(M\) satisfies \(\forall v, \phi\) for some variable \(v\) if for every \(x\) in the domain of \(M\), \(M\) satisfies the formula obtained by replacing all occurrences of \(v\) in \(phi\) by \(x\).
8.2.1 Implementation detail for quantifiers
Instead of updating the formula by replacing all occurrences of \(v\) in \(\phi\) by \(x\), we instead update the variable assignment \(va\) at the point \(v\) to \(x\).