CMSC 221/321 Fall 2011 Outline of concepts: 1. Concrete syntax 1. lexical analysis defining basic symbols (lexical tokens) of a language 2. grammar and parsing discovering structure in a linear sequence of symbols 3. not covered in depth, but illustrated by tokenizer and parser for fun language 2. Abstract syntax 1. logical structure of programs 2. expressed as a set of types (e.g. ML datatypes) - variables - constants of various sorts e.g. numbers, booleans, primitive operators - compound expressions 3. produced by parsing plus some static analysis 4. each sample language given an abstract syntax (SAE, SAEL, Fun, TFun, PTFun) 3. Variables 1. introduced by several abstraction or "binding" constructs - let (let x = e1 in e2) - fun (or λ: λx.e) - mu (for recursive definitions) - polymorphism (for type variables) (Λt.λx:t.x : ∀t.t → t) 2. free and bound variables - binding occurrences, bound occurrences, free occurrences - association between bound occurrences and binding occurrences - scope of a binding - shadowing of a binding - change of bound variable names (α-conversion, α-equivalence) 3. substitution of expressions for variables - free variable capture (to be avoided by α-conversion) 4. Dynamic Semantics (evaluation of expressions/programs) 1. big-step semantics; evaluators, interpreters - recursing over abstract syntax structure 2. reduction semantics (small-step: SS) - redex (an expression ready to be simplified) - transition relation: e ↦ e' based on reducing a redex subexpression - transition systems in general - inductive definitions of transition relations reduction rules search rules (determining location of next active redex) derivations of transition judgements (source and target attributes) - values as quiescent expressions (constants, λ abstractions) 3. context-redex (CR) systems - CR factoring of expressions into context-redex pairs uniqueness of factoring - redex reduction rules - contexts play the same role as search rules in SS semantics - SS and CR reductions for the same language are in 1-1 correspondence 4. variable binding mechanisms (let, fun/app) - reduction through substitution (let, fun/app redexes, β-reduction) - bind-by-value vs bind-by-name for let (SEAL) - call-by-value (CBV) vs call-by-name (CBN) for functions 5. termination - all computations terminate in SAE and SAEL (easy to prove for SAE, SAEL[SSv], difficult for SEAL[SSn]) - untyped Fun language introduces nontermination (via recursion) - typed TFun and PTFun languages terminate again (unless supplied with a special construct for defining recursive functions) [proofs of termination beyond the scope of this course.] 6. recursion - for Fun, define recursive functions using Y combinators two forms, for CBN and slightly more complicated for CBV - for TFun, Y combinator expressions don't type check make Y a primitive (various notations: letrec, μ, Rec(f,e)) 7. environments - environments represent lazy (delayed) substitutions finite mappings from variables to values they represent (are bound to) - function values as closures, with "closure environment" - big-step environment-based evaluators - no easy way to do small-step transition with environments 5. Dynamic Semantics - abstract machines 1. CK machine - local incremental movement from reduction site to reduction site - stack k a list of frames, representing a context similar to the context in CR reductions - state transition relation directly defined (by cases), without need for induction - let/fun reductions via substitutions, as in SSx and CRx semantics - CBV and CBN variants - states are closed (no free variables) control expressions, frames, stacks all closed 2. CEK machine - use environments, closures instead of performing substitutions 6. Types (static semantics) 1. type expressions 2. types characterize values and value expressions 3. typing judgements: Γ ⊦ e: τ - type contexts (environments mapping variables to their types) 4. typing rules - inductively defined typing derivations (attributes: context (Γ), subject (e), type (τ)) 5. type soundness: relation between typing and evaluation (SS semantics) - Preservation theorem - Progress theorem - well-type expressions don't get stuck 6. type constructs - function types (τ1 → τ2) λ-abstraction and application - cartesian product types (τ1 * τ2) pair expression, fst and snd projections - disjoint sum types (τ1 + τ2) left and right injections, case expressions - recursive types (μt.τ) fold/unfold operations (isorecursive) - polymorphism (abstracting over type variables) (∀t.τ) Λt.e, e[τ] 7. typed version of CK machine - typed frames, typed stacks, typed states - analogues of Preservation and Progress