CS221/321
Lecture 8, Nov 2, 2010


Typed Fun (TFun)

Concrete syntax:  We add types to Let and Fun variable bindings:

   let x : Int = 3 in x + 1

   fun(x: Int) = x + 1

We also will treat primitive infix operators differently.  Instead
of hard-wiring them into the abstract syntax, we will treat them as
ordinary functions, but functions that are predefined in some 
initial environment.

In concrete syntax, we will continue to write expressions like
"x + 1", but this will translate into the following abstract
syntax:

   App(App(Var "+", Var "x"), Num 1)

Notice that this is "curried" application, and "+" is treated as
a curried function of type Int → (Int → Int), because we don't yet
have pairs and tuples in the language. [We'll add these later.]
This treatment of primitive operators means that we can add as many
as we like by extending the initial environment, and we don't have to
change the language syntax.
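
As a rough illustration, the translation of an infix expression into
curried application can be sketched in Python. The encoding of abstract
syntax as tagged tuples and the helper name `infix` are assumptions of
this sketch, not part of the language definition.

```python
# Abstract syntax as tagged tuples (an encoding assumed for this sketch).
def Var(name):
    return ("Var", name)

def Num(n):
    return ("Num", n)

def App(fun, arg):
    return ("App", fun, arg)

def infix(op, left, right):
    """Translate the concrete form `left op right` into curried application."""
    return App(App(Var(op), left), right)

# "x + 1" becomes App(App(Var "+", Var "x"), Num 1).
x_plus_1 = infix("+", Var("x"), Num(1))
print(x_plus_1)
```

Adding a new operator, say "*", needs no new constructor: only the
operator name passed to `infix` changes.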

----------------------------------------------------------------------
Figure 4.2: abstract syntax of TFun
----------------------------------------------------------------------
   v ::= x, y, z, ...    (alphanumeric variables + symbolic variables)
   n ::= 0, 1, 2, ...    (natural numbers)
   b ::= True, False

   τ ::= Int | Bool | τ → τ

   e ::= Num(n) | Bool(b) | Var(v) |     -- atomic expressions
         If(e, e, e) |
         Let(v, τ, e, e) |
         Fun(v, τ, e) |
         App(e, e)

----------------------------------------------------------------------

Now in place of the relative closure judgement "Γ ⊦ e ok" we used
before, we will introduce a well-typing judgement

    Γ ⊦ e : τ

Here Γ is a "type assignment" or "type environment". This is a finite
map from variables to types.

How should we deal with types for the primitive operators (+, *, =,
etc.)?  One approach is to view these as "predefined" variables that
are bound in an initial environment.

The types of these operator variables would then be assigned in 
an initial type environment Γ0: E.g.

   Γ0(+) = Int → (Int → Int)
   Γ0(=) = Int → (Int → Bool)

----------------------------------------------------------------------
Notation: By convention, → is a right-associative infix operator on
types, so when we write Int → Int → Int, this is parsed as Int → (Int
→ Int).  So we will usually write the type of + as "Int → Int → Int",
and similarly for other curried functions.
----------------------------------------------------------------------

Treating primitive operators as predefined variables with curried
types has a couple of problematic consequences.

(1) Since + is a bound variable, it can be "shadowed" by another
local binding of the "+" variable. Then it becomes difficult to
syntactically identify a valid + redex. E.g. in 2 + 3 =
App(App(+,2),3), is + currently bound to the primitive addition
operator, or has it been rebound to something else?

(2) Primitive binary operator redexes are now going to be nested
double applications, which makes things a bit awkward.
The old form of primop redex

    Plus(Num(2),Num(3))

becomes

    App(App(Var("+"), Num(2)), Num(3)).

For instance, think what this means for the transition rules of a
CK-style machine. What will a redex state for 2+3 look like in such
a machine?
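
The shadowing problem in (1) can be made concrete with a small Python
sketch that models a type environment as a dictionary. The names
GAMMA0 and `extend` are hypothetical, chosen only for this
illustration.

```python
# Types: "Int", "Bool", or ("->", t1, t2) for a function type.
INT_BINOP = ("->", "Int", ("->", "Int", "Int"))

# Initial type environment Γ0, binding operators as ordinary variables.
GAMMA0 = {
    "+": INT_BINOP,
    "=": ("->", "Int", ("->", "Int", "Bool")),
}

def extend(gamma, x, t):
    """Γ[x: t]: a new environment in which x is (re)bound to t."""
    return {**gamma, x: t}

# A local binding of the variable "+" at type Bool rebinds it.
shadowed = extend(GAMMA0, "+", "Bool")

print(GAMMA0["+"])    # the type of primitive addition
print(shadowed["+"])  # no longer the type of primitive addition
```

Inside the scope of the new binding, nothing in the syntax of
App(App(+,2),3) reveals that + is no longer primitive addition.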

We can address these difficulties by introducing a new class of
identifiers, C, separate from variables, which we will call "constants".
The constants c ∈ C are given (in concrete syntax) by:

   c ::= + | * | - | = | < | ...

We have other constants in the language: number constants and boolean
constants. We could well include these with the operator constants.
The abstract syntax would be:

   c ::= Num(n) | True | False | Plus | Times | Minus | Eq | Lt | ...

   e ::= Const(c) | Var(v) |     -- atomic expressions
         If(e, e, e) |
         Let(v, τ, e, e) |
         Fun(v, τ, e) |
         App(e, e)

We will then have to write Const(Num(3)) for the constant 3, but we
can follow earlier informal practice by eliding the Num and even the Const
constructors. Thus we will feel free to write informally

   App(App(Plus,2),3)     (a Plus redex)

instead of the full version

   App(App(Const(Plus),Const(Num(2))), Const(Num(3))).

To summarize, here is the revised abstract syntax:

----------------------------------------------------------------------
Figure 4.2: abstract syntax of TFun
----------------------------------------------------------------------
   v ::= x, y, z, ...    (alphanumeric variables + symbolic variables)
   n ::= 0, 1, 2, ...    (natural numbers)

   c ::= Num(n) | True | False | Plus | Times | Minus | Eq | Lt | ...

   τ ::= Int | Bool | τ → τ

   e ::= Const(c) | Var(v) |     -- atomic expressions
         If(e, e, e) |
         Let(v, τ, e, e) |
         Fun(v, τ, e) |
         App(e, e)

----------------------------------------------------------------------

For assigning types to constants, we will use a separate finite map
Σ, called a "signature", that maps constants to their types.  E.g.

  Σ(Num(n)) = Int
  Σ(Plus) = Int → Int → Int

Σ is fixed, so we don't need to include it along with the Γ in the
typing judgements.
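
A minimal Python sketch of such a signature follows. The function
names `arrow` and `sigma` are hypothetical, and the types given for
Times, Minus and Lt are assumed by analogy with Plus and =, since only
those two are spelled out above.

```python
# Types: "Int", "Bool", or ("->", t1, t2) for a function type.
def arrow(*ts):
    """Right-associated arrow: arrow(a, b, c) is a -> (b -> c)."""
    if len(ts) == 1:
        return ts[0]
    return ("->", ts[0], arrow(*ts[1:]))

# Types of the operator constants (assumed where not stated in the notes).
OPERATOR_TYPES = {
    "Plus": arrow("Int", "Int", "Int"),
    "Times": arrow("Int", "Int", "Int"),
    "Minus": arrow("Int", "Int", "Int"),
    "Eq": arrow("Int", "Int", "Bool"),
    "Lt": arrow("Int", "Int", "Bool"),
}

def sigma(c):
    """Σ(c) for a constant c: ("Num", n), "True", "False", or an operator name."""
    if isinstance(c, tuple) and c[0] == "Num":
        return "Int"
    if c in ("True", "False"):
        return "Bool"
    return OPERATOR_TYPES[c]

print(sigma(("Num", 3)))
print(sigma("Plus"))
```

Because `sigma` is a fixed global function here, a type checker can
consult it directly instead of threading it through each call, which
mirrors leaving Σ out of the judgement.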
 

Corresponding to the rules in Fig 4.1 for Γ ⊦ e ok, we have the
following set of typing rules for Γ ⊦ e: τ.

----------------------------------------------------------------------
Figure 4.3: TFun[Typ] - Rules for the typing judgement Γ ⊦ e: τ 
----------------------------------------------------------------------

Rules:

              Σ(c) = τ 
(1)      -------------------
           Γ ⊦ Const(c): τ 

             Γ(x) = τ 
(2)       --------------
           Γ ⊦ Var(x): τ

           Γ ⊦ e1: Bool    Γ ⊦ e2: τ   Γ ⊦ e3: τ 
(3)      ---------------------------------------
                  Γ ⊦ If(e1,e2,e3): τ 

           Γ ⊦ e1: τ       Γ[x:τ] ⊦ e2: τ'
(4)      ---------------------------------
              Γ ⊦ Let(x,τ,e1,e2) : τ'

               Γ[x: τ1] ⊦ e: τ2
(5)       ---------------------------
           Γ ⊦ Fun(x,τ1,e) : τ1 → τ2

           Γ ⊦ e1: τ1 → τ2    Γ ⊦ e2: τ1
(6)      --------------------------------
              Γ ⊦ App(e1,e2) : τ2

----------------------------------------------------------------------
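
Since each expression form is the conclusion of exactly one rule, the
rules can be read as a recursive type-computing function. The
following Python sketch takes that reading. The tagged-tuple encoding
and the names `type_of`, `typable` and IllTyped are assumptions of the
sketch, as are the types of the operators other than Plus and Eq.

```python
# Types: "Int", "Bool", or ("->", t1, t2).
# Expressions: ("Const", c), ("Var", x), ("If", e1, e2, e3),
#   ("Let", x, t, e1, e2), ("Fun", x, t, e), ("App", e1, e2).
# Constants c: ("Num", n), "True", "False", or an operator name.
ARITH = ("->", "Int", ("->", "Int", "Int"))
COMPARE = ("->", "Int", ("->", "Int", "Bool"))
OPS = {"Plus": ARITH, "Times": ARITH, "Minus": ARITH,
       "Eq": COMPARE, "Lt": COMPARE}

class IllTyped(Exception):
    """Raised when no typing rule applies."""

def sigma(c):
    if isinstance(c, tuple) and c[0] == "Num":
        return "Int"
    if c in ("True", "False"):
        return "Bool"
    if c in OPS:
        return OPS[c]
    raise IllTyped("unknown constant")

def type_of(gamma, e):
    """Return the τ with Γ ⊦ e : τ, or raise IllTyped."""
    tag = e[0]
    if tag == "Const":                                   # rule (1)
        return sigma(e[1])
    if tag == "Var":                                     # rule (2)
        if e[1] not in gamma:
            raise IllTyped("unbound variable " + e[1])
        return gamma[e[1]]
    if tag == "If":                                      # rule (3)
        _, e1, e2, e3 = e
        if type_of(gamma, e1) != "Bool":
            raise IllTyped("condition is not Bool")
        t2, t3 = type_of(gamma, e2), type_of(gamma, e3)
        if t2 != t3:
            raise IllTyped("branches have different types")
        return t2
    if tag == "Let":                                     # rule (4)
        _, x, t, e1, e2 = e
        if type_of(gamma, e1) != t:
            raise IllTyped("definition does not match annotation")
        return type_of({**gamma, x: t}, e2)
    if tag == "Fun":                                     # rule (5)
        _, x, t1, body = e
        return ("->", t1, type_of({**gamma, x: t1}, body))
    if tag == "App":                                     # rule (6)
        _, e1, e2 = e
        t_fun = type_of(gamma, e1)
        if not (isinstance(t_fun, tuple) and t_fun[0] == "->"):
            raise IllTyped("operator is not a function")
        if type_of(gamma, e2) != t_fun[1]:
            raise IllTyped("argument type mismatch")
        return t_fun[2]
    raise IllTyped("unknown expression form")

def typable(e):
    """True if the closed expression e has some type."""
    try:
        type_of({}, e)
        return True
    except IllTyped:
        return False

def num(n):
    return ("Const", ("Num", n))

def plus(a, b):
    return ("App", ("App", ("Const", "Plus"), a), b)

inc = ("Fun", "x", "Int", plus(("Var", "x"), num(1)))
print(type_of({}, inc))
print(typable(plus(num(3), ("Const", "True"))))
```

The two calls at the end check fun(x: Int) = x + 1 and the ill-typed
expression 3 + True.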

The goal now is to prove a type soundness theorem for TFun.
Informally, this soundness theorem could be stated as follows:

Theorem :  if  ⊦ e : τ  then e does not get stuck.

I.e. a stuck state like  3 + true  won't happen.

The proof follows roughly the same outline as Theorem 4.7.

Lemma 4.12 [Preservation]: ⊦ e : τ & e ↦ e' ==> ⊦ e' : τ 

Lemma 4.13 [Progress]: ⊦ e : τ  ==>  e is a value or ∃e'. e ↦ e'.

We will need the following basic Lemmas about the typing judgements:

Lemma 4.8 [Inversion]:
 (1) Γ ⊦ Const(c) : τ   ==>  Σ(c) = τ 
 (2) Γ ⊦ Var(x) : τ  ==>  Γ(x) = τ 
 (3) Γ ⊦ If(e1,e2,e3) : τ  ==>  Γ ⊦ e1 : Bool  &  Γ ⊦ e2 : τ  &  Γ ⊦ e3 : τ
 (4) Γ ⊦ Let(x,τ,e1,e2) : τ'  ==>  Γ ⊦ e1 : τ  &  Γ[x:τ] ⊦ e2 : τ'
 (5) Γ ⊦ Fun(x,τ1,e) : τ  ==>  ∃τ2. τ = τ1 → τ2 and Γ[x:τ1] ⊦ e : τ2
 (6) Γ ⊦ App(e1,e2) : τ  ==>  ∃τ1. Γ ⊦ e1 : τ1 → τ  &  Γ ⊦ e2 : τ1

Lemma 4.9 [Substitution]:
   Γ ⊦ e1 : τ  and Γ[x: τ] ⊦ e2 : τ' (where x ∉ dom(Γ)) ==>  Γ ⊦ [e1/x]e2 : τ' 


The Inversion and Substitution Lemmas are similar to those for the
earlier ok judgement. We also have a new Lemma about the form of 
value expressions (Fig. 4.4 below) for various types:

Lemma 4.10 [Canonical Forms]:
  (1) v a value (expression) and ⊦ v : Int  ==>
      v = Const(Num(n)) for some n ∈ Nat.
  (2) v a value and ⊦ v : Bool ==> v = Const(True) or v = Const(False).
  (3) v a value and ⊦ v : τ1 → τ2  ==>
        v = Fun(x, τ1, e) (for some x, e), or
        v = Const(primop), where primop ∈ PrimFun, or
        v = App(Const(primop), v'),  where primop ∈ PrimFun and v' is a value
            (assuming primop is always binary)

There is one final utility Lemma about typings, namely that for a
given Γ and e, if there is a derivation of Γ ⊦ e : τ, then that τ is
unique.

Lemma 4.11 [Unique types] If Γ ⊦ e : τ and Γ ⊦ e : τ', then τ = τ'.

Proof: Straightforward induction on the derivation of Γ ⊦ e: τ.


----------------------------------------------------------------------

Dynamic Semantics of TFun
-------------------------

For our soundness result, it will be most convenient to work with a
classic small-step dynamic semantics using substitution. We will use
a Call-by-Value semantics, but the proofs differ very little for a
CBN semantics.

First we need to define what expressions are considered "values",
i.e. irreducible, fully evaluated expressions. These are given in
Fig 4.4.

----------------------------------------------------------------------
Figure 4.4: TFun value expressions
----------------------------------------------------------------------

Num = {Num(n) | n ∈ Nat}                        -- numbers
Bool = {True, False}                            -- booleans
PrimFun = {Plus, Times, Minus, Eq, Lt, ...}     -- primitive functions
PrimPA = {App(op,Num(n)) | n ∈ Nat, op ∈ PrimFun}  -- primitive partial applications
Fun = {Fun(x,τ,e) | Fun(x,τ,e) closed}          -- defined functions

Value = Num + Bool + PrimFun + PrimPA + Fun 

----------------------------------------------------------------------

We have the usual cases of numbers, boolean constants, and closed
λ-abstractions (i.e. Fun expressions). In addition, the primitive
operators like Plus and Times are now considered values (predefined
functions). The binary primitive operators are now curried, so we
also have to consider partial applications of operators to a single 
value argument to be values.  Thus, for instance,

   App(Plus, Num(3))

is a value expression. It represents a function value.

Note that we are assuming that all our primitive operations are binary
and that they take numbers as arguments. Thus

   App(Plus, True)

is specifically not a value expression -- it is a stuck expression.
Thus our definition restricts value expressions to include only
"well-typed" partial applications of the primitive operators. If we
were to add binary boolean operators, say Or, then 

   App(Or, True),  App(Or, False)

would be value expressions, but App(Or, Num(3)) would not.

To be syntactically correct, our value expressions would have to
be written as:

   Const(Num(3)), Const(True), Const(Plus),
   App(Const(Plus),Const(Num(3))), ...

but it is obviously cumbersome to write these fully correct
expressions. So we are taking the notational liberty of leaving out
the Const constructor, assuming that it is obvious where it needs
to be added.
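
The definition of Value in Fig. 4.4 can be read as a predicate on
expressions. Here is a Python sketch of that predicate, written with
the Const constructors in place. The tagged-tuple encoding and all
function names are assumptions of the sketch.

```python
# Expressions: ("Const", c), ("Var", x), ("If", e1, e2, e3),
#   ("Let", x, t, e1, e2), ("Fun", x, t, e), ("App", e1, e2).
# Constants c: ("Num", n), "True", "False", or an operator name.
PRIM_FUN = {"Plus", "Times", "Minus", "Eq", "Lt"}

def free_vars(e):
    """The set of variables occurring free in e."""
    tag = e[0]
    if tag == "Const":
        return set()
    if tag == "Var":
        return {e[1]}
    if tag == "If":
        return free_vars(e[1]) | free_vars(e[2]) | free_vars(e[3])
    if tag == "Let":                      # x is bound in e2 only
        _, x, _t, e1, e2 = e
        return free_vars(e1) | (free_vars(e2) - {x})
    if tag == "Fun":
        _, x, _t, body = e
        return free_vars(body) - {x}
    return free_vars(e[1]) | free_vars(e[2])   # App

def is_num(e):
    return e[0] == "Const" and isinstance(e[1], tuple) and e[1][0] == "Num"

def is_bool(e):
    return e[0] == "Const" and e[1] in ("True", "False")

def is_prim_fun(e):
    return e[0] == "Const" and e[1] in PRIM_FUN

def is_prim_pa(e):
    # A binary primitive applied to one numeric argument.
    return e[0] == "App" and is_prim_fun(e[1]) and is_num(e[2])

def is_closed_fun(e):
    return e[0] == "Fun" and not free_vars(e)

def is_value(e):
    return (is_num(e) or is_bool(e) or is_prim_fun(e)
            or is_prim_pa(e) or is_closed_fun(e))

three = ("Const", ("Num", 3))
print(is_value(("App", ("Const", "Plus"), three)))
print(is_value(("App", ("Const", "Plus"), ("Const", "True"))))
```

The last two lines test App(Plus, Num(3)) and App(Plus, True): the
first is a partial application value, the second is not a value.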

We are still assuming a set of primitive operators that are all
binary. If we had unary primitives (e.g. arithmetic negation, boolean
not), these would be constants, but the issue of partial application
values would not arise for them.  If we added ternary primitives, then
both one- and two-argument partial applications of those primitives
would be considered values.

----------------------------------------------------------------------
Figure 4.5: TFun[SSv] - CBV small-step semantics for TFun
----------------------------------------------------------------------

transition: ↦  ⊆ expr * expr

(1)  App(App(Const(bop), Const(Num(n1))), Const(Num(n2)))  ↦  v 
       where v = expOf(prim(bop,n1,n2))
       (bop ∈ PrimFun)

(2)  App(v1, v2) ↦ [v2/x]e   (v1 = Fun(x,τ,e); v2 ∈ Value)

(3)  App(e1,e2) ↦ App(e1', e2)
      <= e1 ↦ e1'

(4)  App(v1, e2) ↦ App(v1, e2')        (v1 ∈ Value)
      <= e2 ↦ e2'

(5)  Let(x, τ, v1, e2) ↦ [v1/x]e2      (v1 ∈ Value)

(6)  Let(x, τ, e1, e2) ↦ Let(x, τ, e1', e2)
      <= e1 ↦ e1'

(7)  If(e1,e2,e3) ↦ If(e1',e2,e3)
      <= e1 ↦ e1'

(8)  If(True,e2,e3) ↦ e2

(9)  If(False,e2,e3) ↦ e3

where
   prim(bop,x,y) returns the value of binary primop bop on
                 arguments x and y.

   expOf translates a number or boolean value to the corresponding
         value expression:

     expOf(n) = Const(Num(n))
     expOf(true) = Const(True)
     expOf(false) = Const(False)

----------------------------------------------------------------------

Notes:
(1) Rule (1) obviously assumes that any bop ∈ PrimFun is a binary
operation on numbers.

(2) There are two kinds of acceptable function values whose
applications can be reduced. In Rule (1) these are partially
applied binary primitives, while in Rule (2) they are closed
λ-abstractions.
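
The rules of Fig. 4.5 can be turned into a one-step function. The
Python sketch below does so under several assumptions: expressions are
tagged tuples, programs are closed (so naive substitution cannot
capture variables and Fun values need no closedness check), and Minus
is left out because the notes do not say how it behaves on naturals
when the result would be negative. All function names are hypothetical.

```python
import operator

# Expressions: ("Const", c), ("Var", x), ("If", e1, e2, e3),
#   ("Let", x, t, e1, e2), ("Fun", x, t, e), ("App", e1, e2).
PRIMS = {"Plus": operator.add, "Times": operator.mul,
         "Eq": operator.eq, "Lt": operator.lt}

def exp_of(r):
    """expOf: turn a Python result into a value expression."""
    if isinstance(r, bool):          # test bool first: bool is a subclass of int
        return ("Const", "True" if r else "False")
    return ("Const", ("Num", r))

def is_num(e):
    return e[0] == "Const" and isinstance(e[1], tuple)

def is_value(e):
    if e[0] in ("Const", "Fun"):
        return True
    return (e[0] == "App" and e[1][0] == "Const"
            and e[1][1] in PRIMS and is_num(e[2]))

def subst(v, x, e):
    """[v/x]e, assuming v is closed."""
    tag = e[0]
    if tag == "Const":
        return e
    if tag == "Var":
        return v if e[1] == x else e
    if tag == "If":
        return ("If",) + tuple(subst(v, x, s) for s in e[1:])
    if tag == "Let":
        _, y, t, e1, e2 = e
        return ("Let", y, t, subst(v, x, e1),
                e2 if y == x else subst(v, x, e2))
    if tag == "Fun":
        _, y, t, body = e
        return e if y == x else ("Fun", y, t, subst(v, x, body))
    return ("App", subst(v, x, e[1]), subst(v, x, e[2]))

def step(e):
    """Return e' with e ↦ e', or None if e is a value or stuck."""
    tag = e[0]
    if tag == "App":
        _, e1, e2 = e
        if is_value(e1) and is_value(e2):
            if e1[0] == "Fun":                                 # rule (2)
                return subst(e2, e1[1], e1[3])
            if e1[0] == "App" and is_num(e2):                  # rule (1)
                return exp_of(PRIMS[e1[1][1]](e1[2][1][1], e2[1][1]))
            return None
        if not is_value(e1):                                   # rule (3)
            e1p = step(e1)
            return None if e1p is None else ("App", e1p, e2)
        e2p = step(e2)                                         # rule (4)
        return None if e2p is None else ("App", e1, e2p)
    if tag == "Let":
        _, x, t, e1, e2 = e
        if is_value(e1):                                       # rule (5)
            return subst(e1, x, e2)
        e1p = step(e1)                                         # rule (6)
        return None if e1p is None else ("Let", x, t, e1p, e2)
    if tag == "If":
        _, e1, e2, e3 = e
        if e1 == ("Const", "True"):                            # rule (8)
            return e2
        if e1 == ("Const", "False"):                           # rule (9)
            return e3
        e1p = step(e1)                                         # rule (7)
        return None if e1p is None else ("If", e1p, e2, e3)
    return None

def evaluate(e):
    """Iterate ↦ until no rule applies."""
    while True:
        nxt = step(e)
        if nxt is None:
            return e
        e = nxt

def num(n):
    return ("Const", ("Num", n))

def prim(op, a, b):
    return ("App", ("App", ("Const", op), a), b)

# let x : Int = 3 in x + 1
print(evaluate(("Let", "x", "Int", num(3), prim("Plus", ("Var", "x"), num(1)))))
```

When `evaluate` returns an expression for which `is_value` is false,
that expression is stuck; 3 + True is an example.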


----------------------------------------------------------------------
Figure 4.6: TFun[SSn] - CBN small-step semantics for TFun
----------------------------------------------------------------------

transition: ↦  ⊆ expr * expr

(1)  App(App(Const(bop), Const(Num(n1))), Const(Num(n2)))  ↦  v 
       where v = expOf(prim(bop,n1,n2))
       (bop ∈ PrimFun)

(2)  App(v1, e2) ↦ [e2/x]e   (v1 = Fun(x,τ,e))

(3)  App(e1,e2) ↦ App(e1', e2)
      <= e1 ↦ e1'

(4)  Let(x, τ, e1, e2) ↦ [e1/x]e2

(5)  If(e1,e2,e3) ↦ If(e1',e2,e3)
      <= e1 ↦ e1'

(6)  If(True,e2,e3) ↦ e2

(7)  If(False,e2,e3) ↦ e3

where
   prim(bop,x,y) returns the value of binary primop bop on
                 arguments x and y.

   expOf translates a number or boolean value to the corresponding
         value expression:

     expOf(n) = Const(Num(n))
     expOf(true) = Const(True)
     expOf(false) = Const(False)

----------------------------------------------------------------------
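
To see how the CBN rules differ in practice, here is a Python sketch
of a one-step function that follows Fig. 4.6 literally. It assumes
tagged-tuple expressions and closed programs, and it omits Minus
because its behaviour on naturals is not specified in the notes. Note
that the figure as written has no rule for evaluating the arguments
of a primitive operator, and the sketch does not add one. All names
are hypothetical.

```python
import operator

# Expressions: ("Const", c), ("Var", x), ("If", e1, e2, e3),
#   ("Let", x, t, e1, e2), ("Fun", x, t, e), ("App", e1, e2).
PRIMS = {"Plus": operator.add, "Times": operator.mul,
         "Eq": operator.eq, "Lt": operator.lt}

def exp_of(r):
    if isinstance(r, bool):          # test bool first: bool is a subclass of int
        return ("Const", "True" if r else "False")
    return ("Const", ("Num", r))

def is_num(e):
    return e[0] == "Const" and isinstance(e[1], tuple)

def subst(e1, x, e):
    """[e1/x]e, assuming e1 is closed so no capture can occur."""
    tag = e[0]
    if tag == "Const":
        return e
    if tag == "Var":
        return e1 if e[1] == x else e
    if tag == "If":
        return ("If",) + tuple(subst(e1, x, s) for s in e[1:])
    if tag == "Let":
        _, y, t, d, body = e
        return ("Let", y, t, subst(e1, x, d),
                body if y == x else subst(e1, x, body))
    if tag == "Fun":
        _, y, t, body = e
        return e if y == x else ("Fun", y, t, subst(e1, x, body))
    return ("App", subst(e1, x, e[1]), subst(e1, x, e[2]))

def step_cbn(e):
    """Return e' with e ↦ e' under Fig. 4.6, or None if no rule applies."""
    tag = e[0]
    if tag == "App":
        _, e1, e2 = e
        if e1[0] == "Fun":                                     # rule (2)
            return subst(e2, e1[1], e1[3])
        if (e1[0] == "App" and e1[1][0] == "Const" and e1[1][1] in PRIMS
                and is_num(e1[2]) and is_num(e2)):             # rule (1)
            return exp_of(PRIMS[e1[1][1]](e1[2][1][1], e2[1][1]))
        e1p = step_cbn(e1)                                     # rule (3)
        return None if e1p is None else ("App", e1p, e2)
    if tag == "Let":                                           # rule (4)
        _, x, _t, e1, e2 = e
        return subst(e1, x, e2)
    if tag == "If":
        _, e1, e2, e3 = e
        if e1 == ("Const", "True"):                            # rule (6)
            return e2
        if e1 == ("Const", "False"):                           # rule (7)
            return e3
        e1p = step_cbn(e1)                                     # rule (5)
        return None if e1p is None else ("If", e1p, e2, e3)
    return None

def num(n):
    return ("Const", ("Num", n))

one_plus_two = ("App", ("App", ("Const", "Plus"), num(1)), num(2))

# let x : Int = 1 + 2 in 5 steps straight to 5; 1 + 2 is never evaluated.
print(step_cbn(("Let", "x", "Int", one_plus_two, num(5))))
```

Under the CBV rules of Fig. 4.5 the same Let expression would first
reduce 1 + 2 to 3 before substituting.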


---------------------------------------------------------------------- 
Program 4.1: Implementation of TFun[BSv] using substitution
---------------------------------------------------------------------- 
See file prog_4_1.sml.
----------------------------------------------------------------------


======================================================================
Homework 4.2. Give a full and rigorous proof of the Substitution Lemma
(Lemma 4.9), using the proof of the Substitution Lemma for the ok
judgement (Lemma 4.4, Lecture 7) as the template.
======================================================================

----------------------------------------------------------------------
Example 1: Deduce the type of

  e = let f : (Int -> Int) -> Int = Fun(g: Int->Int)=> g 3
       in f(Fun(x: Int) => x + 1)

Let e1 = Fun(g: Int->Int) => g 3
and e2 = f(Fun(x: Int) => x + 1)

so e = let f: (Int->Int)->Int = e1 in e2

We will show:

  (1) ⊦ e1 : (Int -> Int) -> Int
  (2) Γ1 ⊦ e2 : Int

where Γ1 = [f: (Int -> Int) -> Int]

from which we have

    ⊦ e1 : (Int -> Int) -> Int    Γ1 ⊦ e2 : Int
  -----------------------------------------------4
   ⊦ let f: (Int -> Int) -> Int = e1 in e2 : Int

Thus ⊦ e: Int. Here is the derivation for subgoal (1):

   --------------------------------2     --------------------------1
   [g: Int -> Int] ⊦ g : Int -> Int       [g: Int -> Int] ⊦ 3 : Int
   -----------------------------------------------------------------6
	          [g: Int -> Int] ⊦ App(g,3) : Int
      ----------------------------------------------------5
       ⊦ Fun(g: Int->Int)=> App(g,3) : (Int -> Int) -> Int


The derivation for subgoal (2) is: (assuming Γ2 = Γ1[x: Int])


			   Σ(Plus) = Int -> Int -> Int        Γ2(x) = Int
			 -------------------------------1    -------------2
			  Γ2 ⊦ Plus : Int -> Int -> Int       Γ2 ⊦ x : Int       Σ(1) = Int
			--------------------------------------------------6    -------------1
			       Γ2 ⊦ App(Plus,x) : Int -> Int                    Γ2 ⊦ 1 : Int
			     ----------------------------------------------------------------6
					    Γ2 ⊦ App(App(Plus,x),1) : Int
   --------------------------2   ------------------------------------------------5
    Γ1 ⊦ f : (Int->Int)->Int       Γ1 ⊦ Fun(x,Int,App(App(Plus,x),1)) : Int->Int
   ------------------------------------------------------------------------------6
	          Γ1 ⊦ App(f,Fun(x,Int,App(App(Plus,x),1))) : Int


---------------------------------------------------------------------------------------------
Example 2: Is there a type τ such that a type can be derived for

   Fun(x, τ, App(x,x))

(i.e. in the typed λ-calculus, λx:τ.xx)?

The derivation would have to look like


    [x: T] ⊦ x : T2 -> T1     [x: T] ⊦ x : T2
   -------------------------------------------
           [x: T] ⊦ App(x,x) : T1
      ---------------------------------
        ⊦ Fun(x, T, App(x,x)) : T -> T1

For this to be true, there must be type expressions T, T1, T2 satisfying

     T = T2 -> T1
     T = T2

and hence

     T2 = T2 -> T1

This is impossible, since type expressions are finite, and clearly the
rhs expression, T2 -> T1, must be larger than the lhs expression, T2.

Thus we cannot find a type τ making Fun(x, τ, App(x,x)) well typed!
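
The size argument can be spot-checked mechanically. The Python sketch
below enumerates all types up to a small nesting depth and confirms
that none of them equals T2 -> T1 with T2 the type itself. It is a
finite check under an assumed tuple encoding of types, with
hypothetical function names, and does not replace the argument above.

```python
from itertools import product

# Types: "Int", "Bool", or ("->", t1, t2).
def types_up_to(depth):
    """All types whose arrows are nested at most `depth` deep."""
    base = ["Int", "Bool"]
    if depth == 0:
        return base
    smaller = types_up_to(depth - 1)
    return base + [("->", a, b) for a, b in product(smaller, repeat=2)]

def size(t):
    """Number of nodes in the type expression t."""
    return 1 if isinstance(t, str) else 1 + size(t[1]) + size(t[2])

def types_self_application(t):
    """Can App(x,x) be typed under [x: t]?  Rule (6) needs x : T2 -> T1
    and x : T2, and rule (2) forces both of these types to be t itself."""
    return isinstance(t, tuple) and t[1] == t

candidates = types_up_to(2)
print(len(candidates))
print(any(types_self_application(t) for t in candidates))
print(all(size(("->", t, "Int")) > size(t) for t in candidates))
```

The last line checks the inequality used in the argument: T2 -> T1 is
strictly larger than T2 for every candidate T2 (with T1 = Int).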

What does this mean for the Y combinators?  Since they contain
subexpressions of the form xx, we will encounter the same problem
trying to find a typing for them.

What does this mean for recursion, or in general for the power of the
language TFun? 

Can we define factorial?