Featured image of post Type Checking: Subtyping

Type Checking: Subtyping

What can be assigned to a variable of type A?

Introduction

At the very least when it comes to types, something of type A\texttt{A} can be used whenever something of type A\texttt{A} is expected.

int a = 5float b = 5.0\texttt{int a = 5}\newline\texttt{float b = 5.0}

Is there any leeway for flexibility? Specifically, does there exist a notion of type substitutability?


float b = 5\texttt{float b = 5}

This makes sense because an int\texttt{int} is a subset of a float\texttt{float}. In other words, a float\texttt{float} can do anything an int\texttt{int} can do.

int b = 5.0\cancel{\texttt{int b = 5.0}}

This won’t work. An int\texttt{int} cannot do everything a float\texttt{float} can do (cannot represent all of its values). A quick way to break this is: int b = 5.5\texttt{int b = 5.5}

Definition

S<:TS <: T means that SS is a subtype of TT.

This means that anything of type SS can be used whenever type TT is expected.

So in the above example: int<:float\texttt{int}<:\texttt{float}

You can think of it as SS is more restrictive (to typecheck) than TT since if you want to do something with TT, you should also be able to do it with SS.

Structures

Width Subtyping

Class Hierarchy

Suppose a class lineage has the following structures.

class Animal:string nameint agevoidβ†’void makeNoiseclass Mammal extends Animal:string furColorclass Human extends Mammal:string occupationint netWorthvoidβ†’int doTaxes \begin{align*} &\texttt{class Animal:}\newline &\texttt{\qquad string name}\newline &\texttt{\qquad int age}\newline &\texttt{\qquad void}\rightarrow\texttt{void makeNoise}\newline &\newline &\texttt{class Mammal extends Animal:}\newline &\texttt{\qquad string furColor}\newline &\newline &\texttt{class Human extends Mammal:}\newline &\texttt{\qquad string occupation}\newline &\texttt{\qquad int netWorth}\newline &\texttt{\qquad void}\rightarrow\texttt{int doTaxes} \end{align*}

By definition of class hierarchy, a child has every characteristic its parent has. For example, Human\texttt{Human} has a furColor\texttt{furColor}, and transitively, since Mammal\texttt{Mammal} has a name\texttt{name} (among other things), so does Human\texttt{Human}.

Since children have everything their parents have (and thus can do anything the parents can do): Human<:Mammal\texttt{Human} <: \texttt{Mammal} Mammal<:Animal\texttt{Mammal} <: \texttt{Animal} βˆ’and transitively βˆ’- \textit{and transitively } - Human<:Animal\texttt{Human} <: \texttt{Animal}

Generally:

Child<:Parent\texttt{Child} <: \texttt{Parent}

This is called width subtyping since supertypes (opposite of subtypes) contain a subset of fields (along the width of the class definition).

Unrelated structures

The notion of width subtyping can be extended to datatypes without hierarchical relationships.

For example, suppose there are two unrelated datatypes with the following definitions.

struct Named:string namestruct User:string namestring age \begin{align*} &\texttt{struct Named:}\newline &\texttt{\qquad string name}\newline &\newline &\texttt{struct User:}\newline &\texttt{\qquad string name}\newline &\texttt{\qquad string age} \end{align*}

They have no explicit relationship. However, it sort of makes sense that whenever a program expects a Named\texttt{Named} entity that we can pass it a User\texttt{User} (since anything a Named\texttt{Named} has, a User\texttt{User} also has).

Example:

func rename(Named entity, string newName):entity.name = newNameUser user = {name: β€œglee”, age: 1000}rename(user, β€œnot glee”) \begin{align*} &\texttt{func rename(Named entity, string newName):}\newline &\texttt{\qquad entity.name = newName}\newline &\text{}\newline &\texttt{User user = }\lbrace\texttt{name: β€œglee”, age: 1000}\rbrace\newline &\texttt{rename(user, β€œnot glee”)} \end{align*}

This can get fairly messy to type check in various type systems.

Depth Subtyping

Definition

Instead of subtyping at the top level, could we subtype at the field level (and thus, their fields recursively)?

class LivingSpace:Animal residentclass Studio:Human resident \begin{align*} &\texttt{class LivingSpace:}\newline &\texttt{\qquad Animal resident}\newline &\newline &\texttt{class Studio:}\newline &\texttt{\qquad Human resident} &\end{align*}

Does Studio<:LivingSpace\texttt{Studio} <: \texttt{LivingSpace}?

Notice that these cannot have hierarchical relationships.

It might make sense logically that if our program requires a LivingSpace\texttt{LivingSpace} that we may provide it with a Studio\texttt{Studio} since the latter can do anything the former can do (in this case, provide resident.name\texttt{resident.name}).

func owner(LivingSpace home)β†’string:return home.resident.nameStudio studio = {resident: {βˆ’some humanβˆ’}}print(owner(studio)) \begin{align*} &\texttt{func owner(LivingSpace home)} \rightarrow \texttt{string:}\newline &\texttt{\qquad return home.resident.name}\newline &\text{}\newline &\texttt{Studio studio = }\lbrace\texttt{resident: }\lbrace- \textit{some human} -\rbrace\rbrace\newline &\texttt{print(owner(studio))} \end{align*}

Caveat

func reassignResident(LivingSpace home, Animal newResident):home.resident = newResidentStudio studio = {resident: {βˆ’some humanβˆ’}}reassignResident(studio, {βˆ’some animal βˆ’}) \begin{align*} &\texttt{func reassignResident(LivingSpace home, Animal newResident):}\newline &\texttt{\qquad home.resident = newResident}\newline &\text{}\newline &\texttt{Studio studio = }\lbrace\texttt{resident: }\lbrace- \textit{some human} -\rbrace\rbrace\newline &\texttt{reassignResident(studio, }\lbrace- \textit{some animal } -\rbrace) \end{align*}

Ah, so here’s where it breaks down. We want to assign our subtyped field. But the true underlying structure requires more!

reassignResident\texttt{reassignResident} is under the impression that an Animal\texttt{Animal} is enough to fill up a LivingSpace\texttt{LivingSpace}, which is logically true. But if we pass reassignResident\texttt{reassignResident} a Studio\texttt{Studio}, it should really expect a Human\texttt{Human} instead of an Animal\texttt{Animal}.

It turns out that the breakdown occurs because of aliasing. Thus, depth subtyping works with immutable structures (records).

Functions

Introduction

Functions are values. So how do we typecheck function assignment? We’ll reuse some types from above.

Mammalβ†’Mammal someFunction=…Mammal input = {…}Mammal output = someFunction(input) \begin{align*} &\texttt{Mammal} \rightarrow \texttt{Mammal someFunction}=\ldots\newline &\texttt{Mammal input = }\lbrace\ldots\rbrace\newline &\texttt{Mammal output = someFunction(input)} \end{align*}

The question is: what types can we assign to someFunction\texttt{someFunction}?

Outputs

Let’s trial and error, and try to reason the solution out.

Mammalβ†’Human functionValue=…Mammalβ†’Mammal someFunction = functionValue \begin{align*} &\texttt{Mammal} \rightarrow \texttt{Human functionValue}=\ldots\newline &\texttt{Mammal} \rightarrow \texttt{Mammal someFunction = functionValue} \end{align*}

functionValue\texttt{functionValue} returns a Human\texttt{Human}. Executions of someFunction\texttt{someFunction} expect a return value that can do everything a Mammal\texttt{Mammal} can do, which a Human\texttt{Human} satisfies. This works.

actualOutput<:expectedOutput\texttt{actualOutput} <: \texttt{expectedOutput}


For sake of clarity, let’s try it the other way.

Mammalβ†’Animal functionValue=…Mammalβ†’Mammal someFunction = functionValue \begin{align*} &\texttt{Mammal} \rightarrow \texttt{Animal functionValue}=\ldots\newline &\cancel{\texttt{Mammal} \rightarrow \texttt{Mammal someFunction = functionValue}} \end{align*}

Clearly, an execution to someFunction\texttt{someFunction} which expects a Mammal\texttt{Mammal} return type will miss out on the furColor\texttt{furColor} attribute.

Inputs

Again, let’s trial and error.

Humanβ†’Mammal functionValue=…Mammalβ†’Mammal someFunction = functionValue \begin{align*} &\texttt{Human} \rightarrow \texttt{Mammal functionValue}=\ldots\newline &\cancel{\texttt{Mammal} \rightarrow \texttt{Mammal someFunction = functionValue}} \end{align*}

Clearly, whatever we provide as input must be able to do whatever someFunction\texttt{someFunction} requires of it. If we provide Mammal\texttt{Mammal}, functionValue\texttt{functionValue} expects a Human\texttt{Human} which might use attributes like occupation\texttt{occupation} which aren’t present in Mammal\texttt{Mammal}.


Animalβ†’Mammal functionValue=…Mammalβ†’Mammal someFunction = functionValue \begin{align*} &\texttt{Animal} \rightarrow \texttt{Mammal functionValue}=\ldots\newline &\texttt{Mammal} \rightarrow \texttt{Mammal someFunction = functionValue} \end{align*}

Providing someFunction\texttt{someFunction} with an input that can do at least what Mammal\texttt{Mammal} can do will allow it to safely be used as input to its value, functionValue\texttt{functionValue}.

expectedInput<:actualInput\texttt{expectedInput} <: \texttt{actualInput}

Putting it All Together

It’s a little counterintuitive, but the inputs and outputs of a function subtype have opposite directionality (as we saw above).

Animalβ†’Human functionValue=…Mammalβ†’Mammal someFunction = functionValue \begin{align*} &\texttt{Animal} \rightarrow \texttt{Human functionValue}=\ldots\newline &\texttt{Mammal} \rightarrow \texttt{Mammal someFunction = functionValue} \end{align*}

I’<:IO<:Oβ€™β€…β€ŠβŸΉβ€…β€ŠIβ†’β€…O<:I’→O’\begin{align*} \texttt{I}&\texttt{’} <:\texttt{I}\newline &\texttt{O} <:\texttt{O’}\newline &\implies\newline \texttt{I}\rightarrow\medspace&\texttt{O} <: \texttt{I’}\rightarrow\texttt{O’} \end{align*}

My heart is in the work ― Andrew Carnegie
Built with Hugo
Theme Stack designed by Jimmy