CS221/321 Lecture 8, Nov 2, 2010

Typed Fun (TFun)

Concrete syntax: We add types to Let and Fun variable bindings:

   let x : Int = 3 in x + 1

   fun(x: Int) = x + 1

We also will treat primitive infix operators differently. Instead of
hard-wiring them into the abstract syntax, we will treat them as
ordinary functions, but functions that are predefined in some initial
environment. In concrete syntax, we will continue to write expressions
like "x + 1", but this will translate into the following abstract
syntax:

   App(App(Var "+", Var "x"), Num 1)

Notice that this is "curried" application, and "+" is treated as a
curried function of type Int → (Int → Int), because we don't yet have
pairs and tuples in the language. [We'll add these later.]

This treatment of primitive operators means that we can add as many as
we like by extending the initial environment, and we don't have to
change the language syntax.

----------------------------------------------------------------------
Figure 4.2: abstract syntax of TFun
----------------------------------------------------------------------
v ::= x, y, z, ...     (alphanumeric variables + symbolic variables)
n ::= 0, 1, 2, ...     (natural numbers)
b ::= True, False

τ ::= Int | Bool | τ → τ

e ::= Num(n) | Bool(b) | Var(v) |     -- atomic expressions
      If(e, e, e) |
      Let(v, τ, e, e) |
      Fun(v, τ, e) |
      App(e, e)
----------------------------------------------------------------------

Now in place of the relative closure judgement "Γ ⊦ e ok" we used
before, we will introduce a well-typing judgement

   Γ ⊦ e : τ

Here Γ is a "type assignment" or "type environment". This is a finite
map from variables to types.

How should we deal with types for the primitive operators (+, *, =,
etc.)? One approach is to view these as "predefined" variables that
are bound in an initial environment. The types of these operator
variables would then be assigned in an initial type environment Γ0:
E.g.
   Γ0(+) = Int → (Int → Int)
   Γ0(=) = Int → (Int → Bool)

----------------------------------------------------------------------
Notation: By convention, → is a right-associative infix operator on
types, so when we write Int → Int → Int, this is parsed as
Int → (Int → Int). So we will usually write the type of + as
"Int → Int → Int", and similarly for other curried functions.
----------------------------------------------------------------------

Treating primitive operators as predefined variables with curried
types has a couple of problematic consequences.

(1) Since + is a bound variable, it can be "shadowed" by another local
    binding of the "+" variable. Then it becomes difficult to
    syntactically identify a valid + redex. E.g. in
    2 + 3 = App(App(+,2),3), is + currently bound to the primitive
    addition operator, or has it been rebound to something else?

(2) Primitive binary operator redexes are now going to be nested
    double applications, which makes things a bit awkward. The old
    form of primop redex Plus(Num(2),Num(3)) becomes
    App(App(Var("+"), Num(2)), Num(3)). For instance, think what this
    means for the transition rules of a CK-style machine. What will a
    redex state for 2+3 look like in such a machine?

We can address these difficulties by introducing a new class of
identifiers, C, separate from variables, which we will call
"constants". The set C (with c ∈ C) is given, for concrete syntax, by:

   c ::= + | * | - | = | < | ...

We have other constants in the language: number constants and boolean
constants. We could well include these with the operator constants.
The abstract syntax would be:

   c ::= Num(n) | True | False | Plus | Times | Minus | Eq | Lt | ...

   e ::= Const(c) | Var(v) |     -- atomic expressions
         If(e, e, e) |
         Let(v, τ, e, e) |
         Fun(v, τ, e) |
         App(e, e)

We will then have to write Const(Num(3)) for the constant 3, but we
can follow earlier informal practice by eliding the Num and even the
Const constructors.
Thus we will feel free to write informally

   App(App(Plus,2),3)      (a Plus redex)

instead of the full version

   App(App(Const(Plus),Const(Num(2))), Const(Num(3))).

To summarize, here is the revised abstract syntax:

----------------------------------------------------------------------
Figure 4.2: abstract syntax of TFun
----------------------------------------------------------------------
v ::= x, y, z, ...     (alphanumeric variables + symbolic variables)
n ::= 0, 1, 2, ...     (natural numbers)
c ::= Num(n) | True | False | Plus | Times | Minus | Eq | Lt | ...

τ ::= Int | Bool | τ → τ

e ::= Const(c) | Var(v) |     -- atomic expressions
      If(e, e, e) |
      Let(v, τ, e, e) |
      Fun(v, τ, e) |
      App(e, e)
----------------------------------------------------------------------

For assigning types to constants, we will use a separate finite map
Σ, called a "signature", that maps constants to their types. E.g.

   Σ(Num(n)) = Int
   Σ(Plus) = Int → Int → Int

Σ is fixed, so we don't need to include it along with the Γ in the
typing judgements.

Corresponding to the rules in Fig 4.1 for Γ ⊦ e ok, we have the
following set of typing rules for Γ ⊦ e: τ.

----------------------------------------------------------------------
Figure 4.3: TFun[Typ] - Rules for the typing judgement Γ ⊦ e: τ
----------------------------------------------------------------------
Rules:

         Σ(c) = τ
(1)  -------------------
     Γ ⊦ Const(c): τ

       Γ(x) = τ
(2)  ----------------
     Γ ⊦ Var(x): τ

     Γ ⊦ e1: Bool   Γ ⊦ e2: τ   Γ ⊦ e3: τ
(3)  ---------------------------------------
     Γ ⊦ If(e1,e2,e3): τ

     Γ ⊦ e1: τ   Γ[x:τ] ⊦ e2: τ'
(4)  ---------------------------------
     Γ ⊦ Let(x,τ,e1,e2) : τ'

     Γ[x: τ1] ⊦ e: τ2
(5)  ---------------------------
     Γ ⊦ Fun(x,τ1,e) : τ1 → τ2

     Γ ⊦ e1: τ1 → τ2   Γ ⊦ e2: τ1
(6)  --------------------------------
     Γ ⊦ App(e1,e2) : τ2
----------------------------------------------------------------------

The goal now is to prove a type soundness theorem for TFun.
Informally, this soundness theorem could be stated as follows:

Theorem: if ⊦ e : τ then e does not get stuck.

I.e. a stuck state like 3 + true won't happen.

The proof follows roughly the same outline as Theorem 4.7.

Lemma 4.12 [Preservation]: ⊦ e : τ & e ↦ e'  ==>  ⊦ e' : τ

Lemma 4.13 [Progress]: ⊦ e : τ  ==>  e is a value or ∃e'. e ↦ e'.

We will need the following basic Lemmas about the typing judgements:

Lemma 4.8 [Inversion]:

(1) Γ ⊦ Const(c) : τ  ==>  Σ(c) = τ

(2) Γ ⊦ Var(x) : τ  ==>  Γ(x) = τ

(3) Γ ⊦ If(e1,e2,e3) : τ  ==>  Γ ⊦ e1 : Bool & Γ ⊦ e2 : τ & Γ ⊦ e3 : τ

(4) Γ ⊦ Let(x,τ,e1,e2) : τ'  ==>  Γ ⊦ e1 : τ & Γ[x:τ] ⊦ e2 : τ'

(5) Γ ⊦ Fun(x,τ1,e) : τ  ==>  ∃τ2. τ = τ1 → τ2 and Γ[x:τ1] ⊦ e : τ2

(6) Γ ⊦ App(e1,e2) : τ  ==>  ∃τ1. Γ ⊦ e1 : τ1 → τ & Γ ⊦ e2 : τ1

Lemma 4.9 [Substitution]:
   Γ ⊦ e1 : τ and Γ[x: τ] ⊦ e2 : τ' (where x ∉ dom(Γ))
   ==>  Γ ⊦ [e1/x]e2 : τ'

The Inversion and Substitution Lemmas are similar to those for the
earlier ok judgement. We also have a new Lemma about the form of value
expressions (Fig. 4.4 below) for various types:

Lemma 4.10 [Canonical Forms]:

(1) v a value (expression) and ⊦ v : Int ==>
      v = Const(Num(n)) for some n ∈ Nat.

(2) v a value and ⊦ v : Bool ==>
      v = Const(True) or v = Const(False).

(3) v a value and ⊦ v : τ1 → τ2 ==>
      v = Fun(x, τ, e) (for some x, τ, e), or
      v = Const(primop), where primop ∈ PrimFun, or
      v = App(Const(primop), v'), where primop ∈ PrimFun and v' is a
          value (assuming primop is always binary)

There is one final utility Lemma about typings, namely that for a
given Γ and e, if there is a derivation of Γ ⊦ e : τ, then that τ is
unique.

Lemma 4.11 [Unique types]
   If Γ ⊦ e : τ and Γ ⊦ e : τ', then τ = τ'.

Proof: Straightforward induction on the derivation of Γ ⊦ e: τ.

----------------------------------------------------------------------

Dynamic Semantics of TFun
-------------------------

For our soundness result, it will be most convenient to work with a
classic small-step dynamic semantics using substitution.
We will use a Call-by-Value semantics, but the proofs differ very
little for a CBN semantics.

First we need to define what expressions are considered "values",
i.e. irreducible, fully evaluated expressions. These are given in
Fig 4.4.

----------------------------------------------------------------------
Figure 4.4: TFun value expressions
----------------------------------------------------------------------
Num = {Num(n) | n ∈ Nat}                        -- numbers
Bool = {True, False}                            -- booleans
PrimFun = {Plus, Times, Minus, Eq, Lt, ...}     -- primitive functions
PrimPA = {App(op,Num(n)) | n ∈ Nat, op ∈ PrimFun}
                                    -- primitive partial applications
Fun = {Fun(x,τ,e) | Fun(x,τ,e) closed}          -- defined functions

Value = Num + Bool + PrimFun + PrimPA + Fun
----------------------------------------------------------------------

We have the usual cases of numbers, boolean constants, and closed
λ-abstractions (i.e. Fun expressions). In addition, the primitive
operators like Plus and Times are now considered values (predefined
functions). The binary primitive operators are now curried, so we also
have to consider partial applications of operators to a single value
argument to be values. Thus, for instance, App(Plus, Num(3)) is a
value expression. It represents a function value.

Note that we are assuming that all our primitive operations are binary
and that they take numbers as arguments. Thus App(Plus, True) is
specifically not a value expression -- it is a stuck expression. Thus
our definition restricts value expressions to include only
"well-typed" partial applications of the primitive operators. If we
were to add binary boolean operators, say Or, then App(Or, True) and
App(Or, False) would be value expressions, but App(Or, Num(3)) would
not.

To be syntactically correct, our value expressions would have to be
written as:

   Const(Num(3)), Const(True), Const(Plus),
   App(Const(Plus),Const(Num(3))), ...

but it is obviously cumbersome to write these fully correct
expressions.
So we are taking the notational liberty of leaving out the Const
constructor, assuming that it is obvious where it needs to be added.

We are still assuming a set of primitive operators that are all
binary. If we had unary primitives (e.g. arithmetic negation, boolean
not), these would be constants, but the issue of partial application
values would not arise for them. If we added ternary primitives, then
both one- and two-argument partial applications of those primitives
would be considered values.

----------------------------------------------------------------------
Figure 4.5: TFun[SSv] - CBV small-step semantics for TFun
----------------------------------------------------------------------
transition: ↦ ⊆ expr * expr

(1) App(App(Const(bop), Const(Num(n1))), Const(Num(n2))) ↦ v
      where v = expOf(prim(bop,n1,n2))     (bop ∈ PrimFun)

(2) App(v1, v2) ↦ [v2/x]e     (v1 = Fun(x,τ,e); v2 ∈ Value)

(3) App(e1,e2) ↦ App(e1', e2)     <=  e1 ↦ e1'

(4) App(v1, e2) ↦ App(v1, e2')    (v1 ∈ Value)  <=  e2 ↦ e2'

(5) Let(x, τ, v1, e2) ↦ [v1/x]e2  (v1 ∈ Value)

(6) Let(x, τ, e1, e2) ↦ Let(x, τ, e1', e2)  <=  e1 ↦ e1'

(7) If(e1,e2,e3) ↦ If(e1',e2,e3)  <=  e1 ↦ e1'

(8) If(True,e2,e3) ↦ e2

(9) If(False,e2,e3) ↦ e3

where prim(bop,x,y) returns the value of binary primop bop on
arguments x and y. expOf translates a number or boolean value to the
corresponding value expression:

   expOf(n) = Const(Num(n))
   expOf(true) = Const(True)
   expOf(false) = Const(False)
----------------------------------------------------------------------

Notes:

(1) Rule (1) obviously assumes that any bop ∈ PrimFun is a binary
    operation on numbers.

(2) There are two kinds of acceptable function values whose
    applications can be reduced. In Rule (1) these are partially
    applied binary primitives, while in Rule (2) they are closed
    λ-abstractions.
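
As a rough illustration, the rules of Figure 4.5 can be transcribed
into a small step function. This is only a sketch in Python, not the
course's prog_4_1.sml. The tuple encoding of expressions, the names
PRIMS, is_num, is_value, subst, step and evaluate, and the particular
primops included are all assumptions made for this example.
Substitution is naive and relies on the substituted value being
closed, as it is when evaluating closed programs.

```python
# Hypothetical encoding: ("Const", c), ("Var", x), ("If", e1, e2, e3),
# ("Let", x, ty, e1, e2), ("Fun", x, ty, e), ("App", e1, e2).
# A constant c is a Python int, a Python bool, or a primop name in PRIMS.
PRIMS = {
    "Plus": lambda a, b: a + b,
    "Times": lambda a, b: a * b,
    "Eq": lambda a, b: a == b,
    "Lt": lambda a, b: a < b,
}

def is_num(e):
    # bool is a subclass of int in Python, so exclude it explicitly
    return e[0] == "Const" and isinstance(e[1], int) and not isinstance(e[1], bool)

def is_value(e):
    """Membership in Value (Figure 4.4); closedness of Fun is not checked."""
    if e[0] in ("Const", "Fun"):
        return True
    if e[0] == "App":  # PrimPA: a primop applied to one number
        return e[1][0] == "Const" and e[1][1] in PRIMS and is_num(e[2])
    return False

def subst(v, x, e):
    """[v/x]e, assuming v is closed so no variable capture can occur."""
    tag = e[0]
    if tag == "Const":
        return e
    if tag == "Var":
        return v if e[1] == x else e
    if tag == "If":
        return ("If", subst(v, x, e[1]), subst(v, x, e[2]), subst(v, x, e[3]))
    if tag == "Let":  # the bound variable y scopes over e2 only
        _, y, ty, e1, e2 = e
        return ("Let", y, ty, subst(v, x, e1), e2 if y == x else subst(v, x, e2))
    if tag == "Fun":
        _, y, ty, body = e
        return e if y == x else ("Fun", y, ty, subst(v, x, body))
    return ("App", subst(v, x, e[1]), subst(v, x, e[2]))

def step(e):
    """One transition e ↦ e' following Figure 4.5, or None if no rule applies."""
    tag = e[0]
    if tag == "App":
        f, a = e[1], e[2]
        if f[0] == "App" and is_value(f) and is_num(a):   # rule (1)
            return ("Const", PRIMS[f[1][1]](f[2][1], a[1]))
        if f[0] == "Fun" and is_value(a):                 # rule (2)
            return subst(a, f[1], f[3])
        if not is_value(f):                               # rule (3)
            f2 = step(f)
            return None if f2 is None else ("App", f2, a)
        a2 = step(a)                                      # rule (4)
        return None if a2 is None else ("App", f, a2)
    if tag == "Let":
        _, x, ty, e1, e2 = e
        if is_value(e1):                                  # rule (5)
            return subst(e1, x, e2)
        e1b = step(e1)                                    # rule (6)
        return None if e1b is None else ("Let", x, ty, e1b, e2)
    if tag == "If":
        _, c, e2, e3 = e
        if c[0] == "Const" and c[1] is True:              # rule (8)
            return e2
        if c[0] == "Const" and c[1] is False:             # rule (9)
            return e3
        c2 = step(c)                                      # rule (7)
        return None if c2 is None else ("If", c2, e2, e3)
    return None

def evaluate(e):
    """Iterate step until a value is reached; raise if the expression is stuck."""
    while not is_value(e):
        nxt = step(e)
        if nxt is None:
            raise RuntimeError("stuck expression")
        e = nxt
    return e

# let x : Int = 3 in x + 1
prog = ("Let", "x", "Int", ("Const", 3),
        ("App", ("App", ("Const", "Plus"), ("Var", "x")), ("Const", 1)))
print(evaluate(prog))  # ('Const', 4)
```

In this sketch App(App(Plus,3),True) has no applicable rule, so step
returns None for it: App(Plus,3) is a value, True is a value, and
rule (1) requires a number in the second argument position.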
----------------------------------------------------------------------
Figure 4.6: TFun[SSn] - CBN small-step semantics for TFun
----------------------------------------------------------------------
transition: ↦ ⊆ expr * expr

(1) App(App(Const(bop), Const(Num(n1))), Const(Num(n2))) ↦ v
      where v = expOf(prim(bop,n1,n2))     (bop ∈ PrimFun)

(2) App(v1, e2) ↦ [e2/x]e     (v1 = Fun(x,τ,e))

(3) App(e1,e2) ↦ App(e1', e2)     <=  e1 ↦ e1'

(4) Let(x, τ, e1, e2) ↦ [e1/x]e2

(5) If(e1,e2,e3) ↦ If(e1',e2,e3)  <=  e1 ↦ e1'

(6) If(True,e2,e3) ↦ e2

(7) If(False,e2,e3) ↦ e3

where prim(bop,x,y) returns the value of binary primop bop on
arguments x and y. expOf translates a number or boolean value to the
corresponding value expression:

   expOf(n) = Const(Num(n))
   expOf(true) = Const(True)
   expOf(false) = Const(False)
----------------------------------------------------------------------

----------------------------------------------------------------------
Program 4.1: Implementation of TFun[BSv] using substitution
----------------------------------------------------------------------
See file prog_4_1.sml.
----------------------------------------------------------------------

======================================================================
Homework 4.2. Give a full and rigorous proof of the Substitution Lemma
(Lemma 4.9), using the proof of the Substitution Lemma for the ok
judgement (Lemma 4.4, Lecture 7) as the template.
======================================================================

----------------------------------------------------------------------
Example 1: Deduce the type of

   e = let f : (Int -> Int) -> Int = Fun(g: Int->Int) => g 3
       in f(Fun(x: Int) => x + 1)

Let e1 = Fun(g: Int->Int) => g 3
and e2 = f(Fun(x: Int) => x + 1)

so  e = let f: (Int->Int)->Int = e1 in e2

We will show:

(1) ⊦ e1 : (Int -> Int) -> Int

(2) Γ1 ⊦ e2 : Int    where Γ1 = [f: (Int -> Int) -> Int]

from which we have

   ⊦ e1 : (Int -> Int) -> Int      Γ1 ⊦ e2 : Int
   -----------------------------------------------4
   ⊦ let f: (Int -> Int) -> Int = e1 in e2 : Int

Thus ⊦ e: Int.

Here is the derivation for subgoal (1):

   --------------------------------2  --------------------------1
   [g: Int -> Int] ⊦ g : Int -> Int   [g: Int -> Int] ⊦ 3 : Int
   -----------------------------------------------------------------6
                 [g: Int -> Int] ⊦ App(g,3) : Int
        ----------------------------------------------------5
        ⊦ Fun(g: Int->Int)=> App(g,3) : (Int -> Int) -> Int

The derivation for subgoal (2) is: (assuming Γ2 = Γ1[x: Int])

                               Σ(Plus) = Int -> Int -> Int      Γ2(x) = Int
                               -------------------------------1 -------------2
                               Γ2 ⊦ Plus : Int -> Int -> Int    Γ2 ⊦ x : Int       Σ(1) = Int
                               --------------------------------------------------6 -------------1
                               Γ2 ⊦ App(Plus,x) : Int -> Int                       Γ2 ⊦ 1 : Int
                               ----------------------------------------------------------------6
                               Γ2 ⊦ App(App(Plus,x),1) : Int
   --------------------------2 ------------------------------------------------5
   Γ1 ⊦ f : (Int->Int)->Int    Γ1 ⊦ Fun(x,Int,App(App(Plus,x),1)) : Int->Int
   ------------------------------------------------------------------------------6
   Γ1 ⊦ App(f,Fun(x,Int,App(App(Plus,x),1))) : Int

----------------------------------------------------------------------

Example 2: Is there a type τ such that a type can be derived for
Fun(x, τ, App(x,x)) (i.e. in the typed λ-calculus, λx:τ.xx)?
The derivation would have to look like

   [x: T] ⊦ x : T2 -> T1     [x: T] ⊦ x : T2
   -------------------------------------------6
             [x: T] ⊦ App(x,x) : T1
       -----------------------------------5
       ⊦ Fun(x, T, App(x,x)) : T -> T1

For this to be true, there must be type expressions T, T1, T2
satisfying

   T = T2 -> T1
   T = T2

and hence

   T2 = T2 -> T1

This is impossible, since type expressions are finite, and clearly the
rhs expression, T2 -> T1, must be larger than the lhs expression, T2.

Thus we cannot find a type τ making Fun(x, τ, App(x,x)) well typed!

What does this mean for the Y combinators? Since they contain
subexpressions of the form xx, we will encounter the same problem
trying to find a typing for them.

What does this mean for recursion, or in general for the power of the
language TFun? Can we define factorial?
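
The two examples can also be checked mechanically. The following is a
minimal sketch in Python of a checker that transcribes the six rules
of Figure 4.3 directly. The tuple encoding of expressions and types,
the names SIGMA, type_of, typable, types_up_to and self_app, and the
choice of primops in the signature are assumptions of this sketch, not
part of the lecture's code. Because every binding carries its type
annotation, the checker never has to guess a type.

```python
# Hypothetical encoding. Types: "Int", "Bool", or ("->", t1, t2).
# Expressions: ("Const", c), ("Var", x), ("If", e1, e2, e3),
# ("Let", x, ty, e1, e2), ("Fun", x, ty, e), ("App", e1, e2).
INT, BOOL = "Int", "Bool"

def arrow(t1, t2):
    return ("->", t1, t2)

# The signature Σ for primitive operators (an assumed selection).
SIGMA = {
    "Plus": arrow(INT, arrow(INT, INT)),
    "Times": arrow(INT, arrow(INT, INT)),
    "Eq": arrow(INT, arrow(INT, BOOL)),
    "Lt": arrow(INT, arrow(INT, BOOL)),
}

class IllTyped(Exception):
    pass

def type_of(gamma, e):
    """Return the τ with Γ ⊦ e : τ, or raise IllTyped if no rule applies."""
    tag = e[0]
    if tag == "Const":                                   # rule (1)
        c = e[1]
        if isinstance(c, bool):                          # test bool before int
            return BOOL
        if isinstance(c, int):
            return INT
        if c in SIGMA:
            return SIGMA[c]
        raise IllTyped(f"unknown constant {c!r}")
    if tag == "Var":                                     # rule (2)
        if e[1] not in gamma:
            raise IllTyped(f"unbound variable {e[1]}")
        return gamma[e[1]]
    if tag == "If":                                      # rule (3)
        if type_of(gamma, e[1]) != BOOL:
            raise IllTyped("condition is not Bool")
        t2, t3 = type_of(gamma, e[2]), type_of(gamma, e[3])
        if t2 != t3:
            raise IllTyped("branches have different types")
        return t2
    if tag == "Let":                                     # rule (4)
        _, x, ty, e1, e2 = e
        if type_of(gamma, e1) != ty:
            raise IllTyped("definition does not match its annotation")
        return type_of({**gamma, x: ty}, e2)
    if tag == "Fun":                                     # rule (5)
        _, x, ty, body = e
        return arrow(ty, type_of({**gamma, x: ty}, body))
    if tag == "App":                                     # rule (6)
        t1, t2 = type_of(gamma, e[1]), type_of(gamma, e[2])
        if isinstance(t1, tuple) and t1[1] == t2:
            return t1[2]
        raise IllTyped("operator and operand types do not match")
    raise IllTyped(f"unknown expression form {tag!r}")

def typable(gamma, e):
    try:
        type_of(gamma, e)
        return True
    except IllTyped:
        return False

# Example 1: let f : (Int->Int)->Int = Fun(g: Int->Int) => g 3
#            in f(Fun(x: Int) => x + 1)
e1 = ("Fun", "g", arrow(INT, INT), ("App", ("Var", "g"), ("Const", 3)))
inc = ("Fun", "x", INT,
       ("App", ("App", ("Const", "Plus"), ("Var", "x")), ("Const", 1)))
example1 = ("Let", "f", arrow(arrow(INT, INT), INT), e1,
            ("App", ("Var", "f"), inc))
print(type_of({}, example1))  # Int

# Example 2: Fun(x, τ, App(x,x)) for a finite sample of candidate types τ.
def types_up_to(depth):
    ts = [INT, BOOL]
    for _ in range(depth):
        ts = [INT, BOOL] + [arrow(a, b) for a in ts for b in ts]
    return ts

def self_app(t):
    return ("Fun", "x", t, ("App", ("Var", "x"), ("Var", "x")))

print(any(typable({}, self_app(t)) for t in types_up_to(2)))  # False
```

The search in the last line only samples finitely many candidate
types, so it illustrates Example 2 rather than proving it. The
argument that T2 = T2 -> T1 has no finite solution is what rules out
every τ.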