CS221/321 Lecture 11, Nov 23, 2010

Section 6. Mutable Storage

We could consider mutable storage and assignments using three
different approaches:

(1) Move to a new elementary language: a "simple imperative language"
    similar to the Bims language introduced in Huttel, Chapter 3. You
    can read about this approach, including the addition of control
    structures, variable bindings, and procedures, in Huttel Chapters 3
    through 7. A feature of this approach is the separation of
    syntactic constructions into expressions and "statements" or
    "commands".

(2) Add mutable store and assignments using ref values (references) as
    found in Standard ML. This is the most direct way of dealing with
    stores, starting from the language PTFun that we already have.

(3) Model stores and assignments using monads, as is done in "pure"
    functional languages like Haskell.

I'll start with approach (2) and talk about (3) next week in the last
lecture of the course.

----------------------------------------------------------------------
References:

First let's treat Ref as a basic syntactic form, as we treated Fst,
Inl, etc. before introducing polymorphism. This treatment will be
similar to Harper, Chapter 14.

Syntactically, we add three new forms, corresponding to the concrete
syntax

  ref(e)    -- creation of an initialized ref cell
  !e        -- dereferencing, returning the contents of a ref cell
  e1 := e2  -- assignment, updating the contents of a ref cell

For small-step semantics, we also need to add a new syntactic category
of "locations" (or memory addresses).

  l ∈ Locations

These location expressions will not occur in original programs, but
will be introduced by reductions in the small-step semantics.
Locations designate places in the mutable memory where values can be
stored. A memory M will be a finite map from locations to values:

  M: Locations → Values

(Memories are often called "states", as in Huttel.)
Evaluation of expressions will be done with respect to a current
state, and the evaluation of certain expressions, namely those
involving assignments, may modify the current state.

Example: achieving recursion by updating memory

  f = ref(λx:Int. x)                                   (f : Ref(Int→Int))
  fact = λx:Int. if x = 0 then 1 else x * !f(x - 1)    (fact: Int→Int)

  fact(4) = 4 * (λx.x)3 = 12

  f := fact

  fact(4) ==> 24

----------------------------------------------------------------------
Abstract Syntax:

  l ::= _                   (members of Locations)
  τ ::= ... | Ref(τ)
  e ::= ... | Ref(e) | DRef(e) | Set(e1,e2) | l
  v ::= ... | l

----------------------------------------------------------------------
Typing expressions that include locations requires that we be able to
type locations. Locations resemble free variables, and so we will
introduce a new kind of typing environment for locations:

  Λ : Locations → Types

In any memory M compatible with Λ, a given location l can only contain
values of type Λ(l):

  ⊦ M(l) : Λ(l)

Typing judgements must be modified to include a memory typing Λ as
well as a (free) variable typing Γ:

  Λ; Γ ⊦ e : τ

----------------------------------------------------------------------
Typing Rules:

            Λ; Γ ⊦ e : τ
(RT1)  ------------------------
        Λ; Γ ⊦ Ref(e) : Ref(τ)

         Λ; Γ ⊦ e : Ref(τ)
(RT2)  --------------------
        Λ; Γ ⊦ DRef(e) : τ

        Λ; Γ ⊦ e1 : Ref(τ)    Λ; Γ ⊦ e2 : τ
(RT3)  -------------------------------------
             Λ; Γ ⊦ Set(e1,e2) : Unit

            Λ(l) = τ
(RT4)  -------------------
        Λ; Γ ⊦ l : Ref(τ)

Plus modified versions of previous typing rules with Λ added to the
contexts.
For instance:

          Γ(x) = τ
(RT5)  --------------
        Λ; Γ ⊦ x : τ

            Λ; Γ[x:τ1] ⊦ e : τ2
(RT6)  --------------------------------
        Λ; Γ ⊦ Fun(x,τ1,e) : τ1 → τ2

        Λ; Γ ⊦ e1 : τ1 → τ    Λ; Γ ⊦ e2 : τ1
(RT7)  ----------------------------------------
               Λ; Γ ⊦ App(e1,e2) : τ

        Λ; Γ ⊦ e1 : Bool    Λ; Γ ⊦ e2 : τ    Λ; Γ ⊦ e3 : τ
(RT8)  -----------------------------------------------------
                    Λ; Γ ⊦ If(e1,e2,e3) : τ

Note that in a derivation under these rules, the Λ context remains
fixed throughout the derivation. Another way of putting this is that
the scope of Λ is the whole expression.

----------------------------------------------------------------------
Evaluation:
-----------

A small-step dynamic semantics must use a transition relation that
involves memories as well as expressions:

  (M, e) ↦ (M', e')

Transitions will always modify the expression (e' != e), and sometimes
the memory will also be modified.

----------------------------------------------------------------------
Evaluation [CBV]:

These are the new rules involving the new operators Ref, DRef, and Set.
Search rules:

             (M,e) ↦ (M',e')
(RE1)  ------------------------------
        (M, Ref(e)) ↦ (M', Ref(e'))

              (M,e) ↦ (M',e')
(RE2)  --------------------------------
        (M, DRef(e)) ↦ (M', DRef(e'))

               (M,e1) ↦ (M',e1')
(RE3)  -------------------------------------
        (M, Set(e1,e2)) ↦ (M', Set(e1',e2))

               (M,e2) ↦ (M',e2')
(RE4)  -------------------------------------
        (M, Set(v1,e2)) ↦ (M', Set(v1,e2'))

Redex rules:

           (l = fresh(M))
(RE5)  --------------------------
        (M,Ref(v)) ↦ (M[l=v],l)

            (l ∈ dom(M))
(RE6)  -------------------------
        (M,DRef(l)) ↦ (M,M(l))

             (l ∈ dom(M))
(RE7)  ----------------------------
        (M,Set(l,v)) ↦ (M[l=v],())

We also inherit modified versions of the standard transition rules
for TFun, such as these rules for App:

         (v1 = Fun(x,τ,e); v2 ∈ Value)
(RE8)  ---------------------------------
        (M, App(v1,v2)) ↦ (M, [v2/x]e)

               (M,e1) ↦ (M',e1')
(RE9)  -------------------------------------
        (M,App(e1,e2)) ↦ (M',App(e1',e2))

               (M,e2) ↦ (M',e2')
(RE10) -------------------------------------   (v1 ∈ Value)
        (M,App(v1,e2)) ↦ (M',App(v1,e2'))

----------------------------------------------------------------------
Type Soundness
--------------

We need to define a relation between memories M and location typings
Λ that expresses the property that M "conforms to" Λ.

Defn 6.1: ⊦ M : Λ iff
  (1) dom(M) = dom(Λ)
  (2) ∀l ∈ dom(Λ). Λ;∅ ⊦ M(l): Λ(l)

That is, ⊦ M:Λ if they have the same set of locations as their
domains, and at each location, the value stored in M has the type
specified by Λ.

Defn 6.2: Λ ⊦ (M,e) : τ iff ⊦ M:Λ & Λ;∅ ⊦ e: τ

Theorem 6.1 [Preservation]:
  Λ ⊦ (M,e): τ & (M,e) ↦ (M',e') => ∃Λ'. Λ ⊆ Λ' & Λ' ⊦ (M',e'): τ.

Theorem 6.2 [Progress]:
  Λ ⊦ (M,e) : τ => e is a value or
                   ∃M',e'. dom(M) ⊆ dom(M') & (M,e) ↦ (M',e').

Note that in both of these statements, it is assumed that e is closed
w.r.t. variables, but e may contain "free" location names. (In fact,
all location names are free, since there is no construct that "binds"
a location name.)
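The transition rules (RE1)-(RE10) above can be read as a recipe for a
one-step evaluator. The following is a minimal sketch in Python, not
part of the course code: the tagged-tuple encoding of expressions, the
use of natural numbers as locations, and the helper names (is_value,
subst, fresh, loc_of, step, run) are all assumptions made for
illustration. Types are omitted, so Fun carries no type annotation.

```python
from itertools import count

# Assumed encoding (not from the notes): expressions are tagged tuples
#   ("Num", n)  ("Unit",)  ("Var", x)  ("Loc", l)
#   ("Fun", x, body)  ("App", e1, e2)
#   ("Ref", e)  ("DRef", e)  ("Set", e1, e2)
# A memory M is a dict from locations (natural numbers) to values.

def is_value(e):
    return e[0] in ("Num", "Unit", "Loc", "Fun")

def subst(v, x, e):
    """[v/x]e for a closed value v, so no variable capture can occur."""
    tag = e[0]
    if tag == "Var":
        return v if e[1] == x else e
    if tag == "Fun":
        # An inner binder for x shadows the outer one.
        return e if e[1] == x else ("Fun", e[1], subst(v, x, e[2]))
    if tag in ("Num", "Unit", "Loc"):
        return e
    # App, Ref, DRef, Set: substitute in every subexpression.
    return (tag,) + tuple(subst(v, x, sub) for sub in e[1:])

def fresh(M):
    """One possible choice of fresh(M): the least number not in dom(M)."""
    return next(l for l in count() if l not in M)

def loc_of(M, v):
    """Side condition of (RE6)/(RE7): v must be a location in dom(M)."""
    if v[0] != "Loc" or v[1] not in M:
        raise ValueError(f"stuck: {v!r} is not an allocated location")
    return v[1]

def step(M, e):
    """One transition (M, e) |-> (M', e'). M itself is never mutated."""
    tag = e[0]
    if tag == "Ref":
        if is_value(e[1]):                        # (RE5)
            l = fresh(M)
            return {**M, l: e[1]}, ("Loc", l)
        M1, e1 = step(M, e[1])                    # (RE1)
        return M1, ("Ref", e1)
    if tag == "DRef":
        if is_value(e[1]):                        # (RE6)
            return M, M[loc_of(M, e[1])]
        M1, e1 = step(M, e[1])                    # (RE2)
        return M1, ("DRef", e1)
    if tag in ("Set", "App"):
        if not is_value(e[1]):                    # (RE3), (RE9)
            M1, e1 = step(M, e[1])
            return M1, (tag, e1, e[2])
        if not is_value(e[2]):                    # (RE4), (RE10)
            M1, e2 = step(M, e[2])
            return M1, (tag, e[1], e2)
        if tag == "Set":                          # (RE7)
            return {**M, loc_of(M, e[1]): e[2]}, ("Unit",)
        if e[1][0] != "Fun":
            raise ValueError(f"stuck: {e[1]!r} is not a function")
        _, x, body = e[1]                         # (RE8)
        return M, subst(e[2], x, body)
    raise ValueError(f"no transition for {e!r}")

def run(M, e):
    """Iterate step until e is a value (may not terminate in general)."""
    while not is_value(e):
        M, e = step(M, e)
    return M, e

# let r = ref 1 in (r := 2; !r), with let and sequencing encoded
# as applications of Fun.
prog = ("App",
        ("Fun", "r",
         ("App", ("Fun", "_", ("DRef", ("Var", "r"))),
                 ("Set", ("Var", "r"), ("Num", 2)))),
        ("Ref", ("Num", 1)))

print(run({}, prog))  # ({0: ('Num', 2)}, ('Num', 2))
```

Returning a new dict from step, rather than updating M in place, keeps
the code close to the notation M[l=v] in the rules: the old memory M
and the new memory M' are both available, which is what the statements
of Preservation and Progress talk about.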
We will need the usual Inversion Lemma for the new typing judgements,
which we will assume without stating it in detail.

Proof of Preservation:
----------------------

We assume the hypotheses:

  (H1) (M,e) ↦ (M',e')
  (H2) Λ ⊦ (M,e): τ

and proceed by induction on the derivation of (H1).

Base Case: (H1) is derived using rule (RE5). Then
  (1) e = Ref(v) for some value v
  (2) e' = l, where l is a fresh location, i.e.,
  (3) l ∉ dom(M)
  (4) M' = M[l=v]
  (5) Λ ⊦ (M,Ref(v)) : τ               by (1) and (H2)
  (6) τ = Ref(τ') for some τ', and
  (7) Λ ⊦ (M,v): τ'                     by (5), Defn 6.2, and Inversion of (RT1)
  (8) Λ;∅ ⊦ v: τ'                       by (7) and Defn 6.2
  (9) Let Λ' = Λ[l: τ']
  (10) Λ ⊆ Λ'                           by (9), (3), and dom(Λ) = dom(M) (Defn 6.1)
  (11) ⊦ M : Λ                          by (5) and Defn 6.2
  (12) ⊦ M' : Λ'                        by (4), (8), (9), (11), and Defn 6.1
                                        (typings under Λ also hold under Λ', by (10))
  (13) Λ';∅ ⊦ l : Ref(τ')               by (9) and (RT4)
  (14) Λ' ⊦ (M',l): Ref(τ')             by (12), (13), and Defn 6.2
  (15) ∃Λ'. Λ ⊆ Λ' & Λ' ⊦ (M',e'): τ    by (10), (14), (2), and (6). [X]

Ind Case: (H1) is derived using rule (RE1). Then
  (1) e = Ref(e1) for some e1
  (2) e' = Ref(e1') for some e1', where
  (3) (M,e1) ↦ (M',e1')
  (IH) ∀(Λ1,τ1). Λ1 ⊦ (M,e1): τ1 => ∃Λ1'. Λ1 ⊆ Λ1' & Λ1' ⊦ (M',e1'): τ1
  (4) Λ ⊦ (M,Ref(e1)) : τ               by (1) and (H2)
  (5) τ = Ref(τ') for some τ', and
  (6) Λ ⊦ (M,e1): τ'                    by (4), Defn 6.2, and Inversion of (RT1)
  (7) ∃Λ1'. Λ ⊆ Λ1' & Λ1' ⊦ (M',e1'): τ'   by (6) and (IH)
  (8) Let Λ' be a witness for (7), so
  (9) Λ ⊆ Λ' and
  (10) Λ' ⊦ (M',e1'): τ'
  (11) Λ' ⊦ (M',Ref(e1')): Ref(τ')      by (10), Defn 6.2, and (RT1)
  (12) Λ' ⊦ (M',e'): τ                  by (11), (2), and (5)
  (13) ∃Λ'. Λ ⊆ Λ' & Λ' ⊦ (M',e'): τ    by (9) and (12). [X]

The other cases are similar. [XX]

----------------------------------------------------------------------
Proof of Theorem 6.2: Progress
------------------------------

We start with the hypothesis:

  (H) Λ ⊦ (M,e) : τ

By Definition 6.2, this expands into a pair of hypotheses:

  (H1) ⊦ M : Λ
  (H2) Λ; ∅ ⊦ e : τ

The proof proceeds by induction on the derivation of (H2).

Base Case: (H2) by rule (RT4).
  (1) e = l for some location l, by Case Hyp.
  (2) e is a value, by defn of value [X]

Base Case: (H2) by rule (RT5). This is impossible, since Γ = ∅.

Ind. Case: (H2) by rule (RT1).
  (1) e = Ref(e1), and
  (2) τ = Ref(τ1) by Case Hyp., where
  (3) Λ; ∅ ⊦ e1 : τ1
  (IH) e1 is a value or (M,e1) ↦ (M',e1')

  Case (IH1): e1 is a value.
    (4) Let l = fresh(M), so l ∉ dom(M)
    (5) (M, Ref(e1)) ↦ (M[l=e1], l), by (RE5)
    (6) (M, e) ↦ (M',e') where M' = M[l=e1] and e' = l, by (1), (5). [X]

  Case (IH2): (M,e1) ↦ (M',e1').
    (7) (M, Ref(e1)) ↦ (M', Ref(e1')), by (RE1)
    (8) (M, e) ↦ (M',e') where e' = Ref(e1'), by (1), (7). [X]

Ind. Case: (H2) by (RT2). This is similar to the (RT1) case, except
  when e1 is a value. Then e1 = l1 for some location l1 by the
  Canonical Forms Lemma, l1 ∈ dom(Λ) = dom(M) by Inversion of (RT4)
  and (H1), and so (M,DRef(l1)) ↦ (M,M(l1)) by (RE6).

Ind. Case: (H2) by (RT7).
  (1) e = App(e1,e2) by Case Hyp.
  (2) Λ; ∅ ⊦ e1 : τ1 → τ for some τ1, and
  (3) Λ; ∅ ⊦ e2 : τ1 by Inversion of (RT7)
  (IH1) e1 a value or (M,e1) ↦ (M',e1')
  (IH2) e2 a value or (M,e2) ↦ (M',e2')

  Case (IH1a) e1 a value
    (4) e1 = Fun(x,τ1,e3), by Canonical Forms Lemma

    Case (IH2a) e2 a value.
      (5) (M, App(e1,e2)) ↦ (M, [e2/x]e3) by (RE8)
      (6) (M, e) ↦ (M, e') where e' = [e2/x]e3 by (1), (5). [X]

    Case (IH2b) (M,e2) ↦ (M',e2')
      (7) (M, App(e1,e2)) ↦ (M', App(e1,e2')) by (RE10)
      (8) (M, e) ↦ (M', e') where e' = App(e1,e2') by (1), (7). [X]

  Case (IH1b) (M,e1) ↦ (M',e1')
    (9) (M, App(e1,e2)) ↦ (M', App(e1',e2)) by (RE9)
    (10) (M, e) ↦ (M', e') where e' = App(e1',e2) by (1), (9). [X]

Other inductive cases are similar to (RT1) or (RT7). [XX]

----------------------------------------------------------------------
Polymorphic Typings for State primitives.

In PTFun, we can treat Ref, DRef, and Set as primitive functions with
the following polymorphic types:

  Ref  : ∀t. t → Ref(t)
  DRef : ∀t. Ref(t) → t
  Set  : ∀t. Ref(t) * t → Unit

E.g. Ref[Int](Num 3)

Polymorphic typings in ML:

  ref : 'a -> 'a ref
  !   : 'a ref -> 'a
  :=  : 'a ref * 'a -> unit

E.g. ref 3

Example:

  let val r = ref(fn x => x)          [ r : ('a -> 'a) ref ]
   in r := (fn x: int => x + 1);      [ r : (int -> int) ref ]
      !r true                         [ r : (bool -> bool) ref ]
  end

References have introduced unsoundness in the type system!!!
After years of experimentation with fixes for this problem, the ML
community settled on the "value restriction":

  A variable declaration (like "val r = ref(fn x => x)") can only
  have its type generalized (made polymorphic) if the definiens is
  a value expression (which it is not, in this case).

This issue does not affect PTFun with polymorphically typed Ref, DRef,
and Set primitives, because to make r polymorphic we will have to
explicitly abstract over a type parameter, as in:

  let r = Λt.Ref[t → t](λx: t.x)          [ r : ∀t.Ref(t → t) ]
   in Set(r[Int], (λx: Int.x + 1));       [ r[Int] : Ref(Int → Int) ]
      DRef(r[Bool]) true                  [ r[Bool] : Ref(Bool → Bool) ]
  end

Since the Λ-abstraction defining r is a value, the application of the
Ref constructor is suspended, and so the actual allocation of the
ref-cell does not take place until r is applied to a type. There are
two such applications: r[Int] and r[Bool]; these produce two different
ref-cells, one containing a function of type Int → Int and the other a
function of type Bool → Bool. So there is no type conflict.

======================================================================
And now for something different -- monads! See state.sml.
======================================================================

======================================================================
Summary
-------

What have we learned about programming languages?

* Careful, incremental development of basic concepts.

    simple arithmetic expressions -- a (seemingly) trivial language SAE
    let: local variables, bindings, scope, free and bound variables
         substitution, free variable capture
         environments [SAEL]
    functions: abstraction and application [Fun]
    conditional expressions (boolean values, relational operators)
    recursive functions (e.g. factorial)
         the Y combinator
    types: basic types, type checking (TFun)

* Inductive proof techniques

What have we not learned?

* Lots of details about lots of "real" languages.
* Have not surveyed various "programming paradigms" (except the "functional programming" paradigm). No object-oriented, no logic programming, etc.