Python's Type theory
Python -> Tuple[TypeTheory, typing]
Home
Table of Contents
1. Context
1.1. Opening remarks
Python added support for type hints in the typing
module in Python version
3.5. Use this for:
- adding type hints (think of these as a very powerful form of adding comments to your code).
- static typechecking (by an external tool like
mypy
). - typechecking within an IDE (many editors and IDEs do this on the fly).
- a sanity check (to ensure that the code you are writing makes sense even without running it).
Two key points to remember:
- type hints are an optional feature of python. So, it is not necessary that we add type hints to every single line of code.
- type hints are ignored during runtime and do not affect performance of the code.
Using type hints is optional. But I have found it to be an extremely useful feature. I frequently find myself wishing for more features in the typing module – not less.
These are just my notes on Python's type thoery. Do not use it as a fool-proof reference. For proper documentation, one can refer to typing module docs or PEP 483]].
1.2. Types vs Types
There are two meanings for types in python now. One is the type of an object
that is returned by calling the in-built type()
function. The other is the
type inferred by an external typechecker like mypy
, and which can be displayed
using the reveal_type
function and then running the script through mypy
.
We will always refer to the second interpretation when we refer to the type of an object in these notes.
1.3. So, what is type theory?
Think of it as the math/rules that govern these types. A basic fact to understand is that every python object has a type, whether we specify it explicitly or not.
2. Types that are built-in in python
Python comes with some buil-in types.
a: int = 1 b: float = 1.0 c: str = 'one' d: bool = True e: None = None
In addition, for every class that we define, we can refer to the type of its instances using the name of the class itself.
class MyClass: pass m: MyClass = MyClass()
2.1. Subtypes and NewType
So far, we have talked about types without specifically defining what a type
is. We will continue to not define them. However, we will say that one way to
think about a type is simply as a set of values (sometimes these values are
called terms, or even instances). So, bool
can be thought of as the set
{True, False}
, and int
can be thought of as {1, 2, 3, ...}
, and so on.
This analogy is also handy when trying to understand Subtypes. A type A
is a
subtype of a type B
if the set of values corresponding to A
is a subset of
the set of values corresponding to B
. This is often written (in type-theory
texts, but not in python itself) as A <: B
.
For example, int <: float
. In other words, every int
is also a float
. It
may come as a surprise to some python users that in python, the following is
also true: bool <: int
. This is because bool
inherits from int
and
False
, True
are just aliases of 0
and 1
respectively.
The typing
library lets us define subtypes using the NewType
function as
follows:
# define a new type PositiveInt = NewType('PositiveInt', int) # This is the same as declaring PositiveInt <: int # define a constructor function for the new type def make_positive_int(a: int) -> PositiveInt: if a > 0: return PositiveInt(a) raise ValueError, 'arg was not positive.'
2.2. Why use NewType
?
NewType
offers a way to restrict the type of argument supplied
to a function. Looking at the example above, we can see that
PositiveInt
is a subtype of int
. This means that a function
that accepts an int
argument will also accept an argument of
type PositiveInt
without any hiccups. However, a function that
accepts PositiveInt
as an argument will not accept any
run-of-the-mill int
as its input (here, by "would not accept" we
mean that such an argument will not pass through the typechecker
without an error).
Our constructor function make_positive_int
ensures that the only
way to create a PositiveInt
is to pass it through this
function.
This subtype and its constructor can then be used to create safer functions that only accept the restricted type. For example, we can have:
# define a new type from math import log def safe_logarithm(p: PositiveInt) -> float: return math.log(p) safe_log(-5) # Fails in typechecking as well as in runtime safe_log(5) # Fails in typechecking only. Computes correctly at runtime. safe_log(make_positive_int(-5)) # Passes typecking. Raises ValueError and meaningful error message at runtime. safe_log(make_positive_int(5)) # Passes typechecking. Computes correctly at runtime.
2.3. Type coercion
Type coercion refers to the process by which the typechecker changes the type of some of the variables to match function call signatures.
a: int = 2 b: float = 3.0 c = a + b
In the above code example, if we ask the typechecker for the type of
c
, then it will correctly guess that c
is a float
. How did it
reach that conclusion? The answer is that the +
operator is
overloaded (details on how to write overloaded functions are given
in a later section). Since one of the arguments is a float
, python
coerces a
from an int
into a float
and then computes the sum
of two float
's (which is also obviously a float
).
In general, typecheckers coerce type A
into type B
(if required)
whenever they know that A
is a subtype of B
.
3. Importing more Types
By importing from typing
, we get access to some more types.
3.1. Basic data-structures
from typing import (Any, Dict, FrozenSet, List, Set, Tuple) a: List[int] = [1, 2] b: List[str] = ['one', 'two'] c: Dict[int, str] = {1: 'a', 2: 'b'} d: Set[str] = {'a', 'b', 'c'} e: FrozenSet[str] = frozenset({'a', 'b', 'c'}) f1: Tuple[str] = ('a',) f2: Tuple[str, str] = ('a', 'b') f3: Tuple[str, str, str] = ('a', 'b', 'c') f4: Tuple[str, ...] = ('a', 'b', 'c') g1: Any = 1 g2: Any = {'a', 'b'} g3: Any = {1 : 'a', 2 : 'b'} g4: List[Any] = [1, 2, 'a', 'b']
3.2. Understanding Any
Any
is a sort of magic type. I like to think of Any
as the
type-theoretic counterpart of object
in python's class
hierarchy.
Every object in python belongs to a class. And every class inherits
methods from other classes. But what happens when we define a base
class (a class that doesn't inherit from other classes)? Even in
that case, python defaults to inheriting from a class called
object
. Any
is similar. Every type is a subtype of Any
, just
as every class inherits from object
. (I will have more to say
about subtypes in the later sections).
Any
is the precise reason why type hints are optional in
python. When the type-checker can't figure out the type of a
variable on its own (and when no type hints are provided by the
user), python simply defaults to usign Any
as the variable's type.
3.3. Optional and Union
Union
represents what in type-theory jargon are called sum-types.
One freqently needs the union of a given type with
None
. Python's typing
module therefore comes with the handy
Optional
type-constructor. The following code block illustrates
the use of both of these.
from typing import List, Optional, Union a: List[Union[int, str]] = [1, 2, 'a', 3, 'b'] def unsafe_division(a: int, b: int) -> Union[float, Exception]: """Divide in an unsafe manner.""" if not b: raise ValueError, 'Field axioms do not allow this.' return a/b def optional_division(a: int, b: int) -> Optional[float]: """Return quotient is b is nonzero, else return None.""" return a/b if b else None
Note that one might want to use either the unsafe or the optional variant of division (depending on the context). Each approach has its own way of handling the exceptional case. The advantage of using type hints is that we get a good idea of how the function behaves just by looking at the call signature of the function!
Side note: Exception
is an in-built class in python. Almost all
exceptions and errors are inherited from the base class Exception
.
(The ones that aren't are inherited from BaseException
).
3.4. The cast
function
This is a function that lets you do two things:
- with
cast
, you can resolve ambiguities in the type (in case the typechecker needs some help). - you can cheat the typechecker.
from typing import cast, Union def madeup_example(a: Union[int, str]) -> int: if hasattr(a, 'split'): # must be a string a_str = cast(str, a) return ord(a_str[0]) # must be an integer a_int = cast(int, a) return a_int + 1 print(madeup_example('abc')) print(madeup_example(5))