CS221/321
Lecture 3, Oct 5, 2010

Preview
-------
Continuing to explore dynamic semantics (evaluation) of SAE.
  1. Contextual reductions
  2. Abstract machine (CK)

Adding variables
  1. Let expressions: local variable bindings
    * substitutions
    * free variable capture
    * alpha conversion (change of bound variables)
    * static semantics (forbidding free variables)
      - values still all in Nat
    * environments (lazy substitutions)
    * extension of dynamic semantics (big-step, small-step, etc.)

  2. functions
    * lambda abstraction and function application
    * beta reduction
    * call-by-value and call-by-name
    * recursive functions (tying a knot)
    * static semantics?
      - functions are a new kind of value (heterogeneous values)
    * extension of dynamic semantics (big-step, small-step, etc.)
      - dynamic type checking

Adding types
  1. new kinds of values
     * e.g. booleans, pairs, ...
  2. type errors
  3. a type language
  4. static semantics
     * static type checking
  5. relating static semantics and dynamic semantics
     * type soundness (Progress and Preservation Theorems)

More language features

  * references, stores, side-effects
  * continuations
  * exceptions
  * coroutines and threads
  * objects
  * modules

Proof assistants

======================================================================

SAE dynamic semantics (continued)

Terminology:  An expression that cannot be further evaluated (by
the rules of a given dynamic semantics) is called a "value".  For
SAE, the value expressions are of the form Num(n).


Contextual reductions
---------------------

New reduction rules

Take the reduction rules from small-step semantics:

(1)  Plus(Num n1, Num n2)  ↦  Num p   where p = n1 + n2

(2)  Times(Num n1, Num n2)  ↦  Num p   where p = n1 * n2
  
Recall that expressions matching either of the left-hand-sides
are called "redexes".

We can factor any non-value expression into a redex and a context.
E.g.:

    Plus(2, Times(13,4))  =  C[Times(13,4)]

where 

    C = Plus(2, [])

Here the "[]" is called the hole.  It is expected to be filled
by a redex.

Contexts C are defined by an (abstract) grammar:

    C := []  |  Plus(C, e)  |  Plus(Num(n), C)
             |  Times(C, e) |  Times(Num(n), C)

Given an expression e that is not fully evaluated (i.e. not of the form
Num(n)), there is a unique way of expressing e as C[r] where r is a
redex subexpression.  The hole will identify the leftmost redex when e
contains more than one redex (why?).

Now there is just one general contextual reduction rule instead of the
4 search rules in the earlier small-step semantics.

(3)  e = C[r] ↦ C[r'] = e'    where r ↦ r'

where r is a redex and r' is the expression it reduces to by
rule (1) or (2) above.  Observe that (3) applies even when e itself
is a redex, since in that case the context C will be [].

Exercise: How do we show that the factorization of an expression into
the form C[r] is unique, and that the subterm r is the leftmost redex?


After a given reduction by rule (3), the resultant expression e' may be
further reduced, unless it is a value.  To reduce e' by (3), it has to be
refactored into the form C'[r''] for some context C' and redex r''.

Note that in general, r'' is not r'. In fact for SAE, r' will always
be a value, so r'' will never be the same as r'.  In more interesting
languages, such as the λ calculus, r' may contain a redex, and may even
be a redex itself.

----------------------------------------------------------------------
Figure 1.5: Example for 2 + ((5 + 8) * 4)
----------------------------------------------------------------------

   e = Plus(2,Times(Plus(5,8),4))    [omitting Num constructors]

     = C1[Plus(5,8)]  where C1 = Plus(2,Times([],4))

     ↦ C1[13]

     = Plus(2,Times(13,4))

     = C2[Times(13,4)]  where C2 = Plus(2,[])

     ↦ C2[52]

     = Plus(2,52)
 
     = C3[Plus(2,52)]  where C3 = []

     ↦ C3[54]

     = 54
----------------------------------------------------------------------

See how each transition involves refactoring the expression into
context and redex.

Comparing a reduction context with the corresponding transition derivation,
such as Fig. 1.3, we see that each layer used to construct the context 
corresponds to a search rule instance in the transition derivation.

Question: Is the contextual reduction system given by rules (1), (2),
and (3) equivalent to the earlier big-step and small-step semantics?
If so, how do we prove it?
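
Before attempting a proof, one can at least test the claim on small
cases.  The sketch below is not a proof and is not part of the course
code: it assumes the usual expr datatype and a direct big-step
evaluator (here called bigStep), and the helper names exprs and agrees
are made up for illustration.  It enumerates all expressions up to a
given depth over a few leaves and checks that a candidate evaluator
returns the same value as big-step evaluation on each of them.

```sml
(* SAE expressions; assumed to match the datatype used in these notes *)
datatype expr = Num of int | Plus of expr * expr | Times of expr * expr

(* big-step evaluation written directly (assumption: same meaning as
 * the big-step semantics of the earlier lecture) *)
fun bigStep (Num n) = n
  | bigStep (Plus(e1,e2)) = bigStep e1 + bigStep e2
  | bigStep (Times(e1,e2)) = bigStep e1 * bigStep e2

(* exprs d: all expressions of depth <= d over the leaves 0, 1, 2 *)
fun exprs 0 = [Num 0, Num 1, Num 2]
  | exprs d =
    let val sub = exprs (d-1)
        (* apply constructor f to every ordered pair drawn from sub *)
        fun pairs f = List.concat (map (fn a => map (fn b => f(a,b)) sub) sub)
    in sub @ pairs Plus @ pairs Times
    end

(* agrees ev d: does the candidate evaluator ev : expr -> expr return
 * Num(bigStep e) for every sample expression e of depth <= d? *)
fun agrees (ev : expr -> expr) d =
    List.all (fn e => ev e = Num (bigStep e)) (exprs d)
```

Given an evaluator eval : expr -> expr built from contextual
reduction, agrees eval 2 should return true if the two semantics
coincide on these samples.  (Drop the datatype declaration above if
expr is already in scope.)  A true result only rules out small
counterexamples; the proof still has to be done by induction.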

To Summarize:

----------------------------------------------------------------------
Figure 1.6: SAE[CR] - Contextual Reductions for SAE
----------------------------------------------------------------------
Contexts:

    C := []  |  Plus(C, e)  |  Plus(Num(n), C)
             |  Times(C, e) |  Times(Num(n), C)

Redex rules:

(1)  Plus(Num n1, Num n2)  ↦  Num p   where p = n1 + n2

(2)  Times(Num n1, Num n2)  ↦  Num p   where p = n1 * n2

Contextual reduction:

(3)  C[r] ↦ C[r']  where r is a redex and r ↦ r'
----------------------------------------------------------------------


----------------------------------------------------------------------
Implementation: Evaluation by contextual reduction

This is how we can implement the factoring of an expression into a
context and redex in SML and use that for evaluation.

----------------------------------------------------------------------
Program 1.2: Contextual reduction
----------------------------------------------------------------------
(* representation of contexts
 * -- why can't we just use Plus and Times again? *)
datatype context
  = Hole
  | CPlusL of context * expr
  | CPlusR of expr * context     (* expr should be a value *)
  | CTimesL of context * expr
  | CTimesR of expr * context    (* expr should be a value *)
 
(* impossible: signals a case that cannot occur; defined here so that
 * the program is self-contained *)
fun impossible msg = raise Fail ("impossible: " ^ msg)

(* factor: expr -> (context * expr) option
 * factor a nonvalue expression into a context and redex,
 * returning NONE if expression is a value *)
fun factor (Num _) = NONE
  | factor (e as Plus(Num _, Num _)) = SOME(Hole, e)
  | factor (e as Times(Num _, Num _)) = SOME(Hole, e)
  | factor (Plus(e1 as Num _, e2)) =
    (case factor e2  (* e2 not Num _ *)
       of SOME(c,e') => SOME(CPlusR(e1, c), e')
        | NONE => impossible "factor")
  | factor (Times(e1 as Num _, e2)) =
    (case factor e2  (* e2 not Num _ *)
       of SOME(c,e') => SOME(CTimesR(e1, c), e')
        | NONE => impossible "factor")
  | factor (Plus(e1,e2)) =
    (case factor e1  (* e1 not Num _ *)
       of SOME(c,e') => SOME(CPlusL(c, e2), e')
        | NONE => impossible "factor")
  | factor (Times(e1,e2)) =
    (case factor e1  (* e1 not Num _ *)
       of SOME(c,e') => SOME(CTimesL(c, e2), e')
        | NONE => impossible "factor")

(* recombine a context and redex expr into an expr *) 
fun wrap (Hole,e) = e
  | wrap (CPlusL(c,e'), e) = Plus(wrap(c,e), e')
  | wrap (CPlusR(e',c), e) = Plus(e', wrap(c,e))
  | wrap (CTimesL(c,e'), e) = Times(wrap(c,e), e')
  | wrap (CTimesR(e',c), e) = Times(e', wrap(c,e))

fun reduce (Plus(Num m, Num n)) = Num(m+n)
  | reduce (Times(Num m, Num n)) = Num(m*n)
  | reduce _ = raise Fail "reduce - nonredex"

fun transition e =
    case factor e
      of NONE => NONE
       | SOME(c,r) => SOME(wrap(c,reduce r))

fun eval e =
    case transition e
      of NONE => e  (* e is already a value *)
       | SOME e' => eval e'
----------------------------------------------------------------------
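
To watch the individual transitions rather than only the final value,
one can iterate a one-step function and collect every intermediate
state.  The helper below (the name trace is an assumption, not part of
the course code) is a minimal sketch of that idea; it is polymorphic,
so it works for any one-step function of type 'a -> 'a option.

```sml
(* trace : ('a -> 'a option) -> 'a -> 'a list
 * Repeatedly apply the one-step function, returning every state
 * visited, starting state first.  Diverges if step never returns NONE. *)
fun trace step e =
    case step e
      of NONE => [e]
       | SOME e' => e :: trace step e'
```

With Program 1.2 loaded,

    trace transition (Plus(Num 2, Times(Plus(Num 5, Num 8), Num 4)))

should return the four expressions of Figure 1.5 in order, ending
with Num 54.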


======================================================================


Abstract Machine (the CK machine)

Factoring and rewrapping the whole expression at each transition is
quite inefficient, especially for a large expression.  How can we avoid
all this redundant work?  The answer is to refocus: move incrementally
around the expression tree, maintaining a factoring into focus
expression and context.

Let's view contexts as being built up in layers.  For instance, in
the example above for 2 + ((5 + 8) * 4), the context 

   C1 = Plus(2,Times([],4))

for the (5 + 8) redex consists of two nested layers:

   Plus(2, [])
   Times([], 4)
   ------------
   Plus(5, 8)

The line separates the stack of context layers from the focus 
expression.

Reducing the redex evaluates the first argument of the inner Times
layer.  The next task is to shift the focus to the next redex, which
will be the Times expression.  We can perform this shift
incrementally:

   Plus(2, [])     Plus(2, [])     Plus(2, [])
   Times([], 4)    Times([], 4)    Times*(13, [])
   ------------    ------------    --------------
   Plus(5, 8)      13              4
  
After evaluating the redex Plus(5,8), we shift focus to the right
argument of Times (4), which is already evaluated, but we annotate
the Times constructor with an asterisk to indicate that we know
that the left argument has been evaluated.

In the next step we recognize that the right argument of the Times*
context layer is a value, so the Times* is a redex and can be reduced:

   Plus(2, [])       Plus(2, [])
   Times*(13, [])    -----------
   --------------    52
   4

If we were paying attention as we constructed the context as a stack
of layers, we would have noticed that the left argument of the outer
Plus layer was a value, so we should have used the annotated Plus*
form to indicate this:

   Plus*(2, [])      Plus*(2, [])    54
   Times*(13, [])    -----------
   --------------    52
   4

Here is a set of rules for carrying out the context analysis
and reductions:

----------------------------------------------------------------------
Figure 1.7: SAE[CK] - CK-machine for SAE
----------------------------------------------------------------------
Frames: 

     F ::= Plus([],e2) | Plus*(Num(n), []) |
           Times([], e2) | Times*(Num(n), [])

Stack/Context:

     k ::= nil | F :: k     (nil is also written [])

Machine state:  (e,k), an expression e paired with a stack k

Transition judgement:  (e,k) => (e',k')

Initial states: 

   (e, [])   (where e is an expression to be evaluated)

Final states:

   (Num(n), [])   (where n ∈ Nat is the result value)
 
Transition rules:  (read n as Num(n))
 
(1)  (Plus(e1,e2), k)      =>  (e1, Plus([],e2)::k)
(2)  (Times(e1,e2), k)     =>  (e1, Times([],e2)::k)

(3)  (n, Plus([],e2)::k)   =>  (e2, Plus*(n,[])::k)
(4)  (n, Times([],e2)::k)  =>  (e2, Times*(n,[])::k)

(5)  (n, Plus*(m,[])::k)   =>  (p, k)    where p = m+n
(6)  (n, Times*(m,[])::k)  =>  (p, k)    where p = m*n

----------------------------------------------------------------------

Example: 2 + ((5 + 8) * 4)
(a trailing ! marks a state whose next step is an arithmetic
reduction, i.e. rule (5) or (6))

(Plus(2, (Times(Plus(5,8), 4))), []) =>

(2, Plus([], (Times(Plus(5,8), 4)))::[]) =>

(Times(Plus(5,8),4), Plus*(2,[])::[]) =>

(Plus(5,8), Times([],4)::Plus*(2,[])::[]) =>

(5, Plus([],8)::Times([],4)::Plus*(2,[])::[]) =>

(8, Plus*(5,[])::Times([],4)::Plus*(2,[])::[]) =>   !

(13, Times([],4)::Plus*(2,[])::[]) =>

(4, Times*(13,[])::Plus*(2,[])::[]) =>   !

(52, Plus*(2,[])::[]) =>    !

(54, [])


----------------------------------------------------------------------
Program 1.3: CK machine for SAE  (SAE-context-reduction.sml)
----------------------------------------------------------------------
(* stack frames used to build evaluation contexts *)
datatype frame
  = PlusL of expr
  | PlusR of int
  | TimesL of expr
  | TimesR of int

type context = frame list    (* a stack of frames *)

(* CK abstract machine states *)
type state = expr * context

(* runCK : state -> int *)
fun runCK (Num n, []) = n
  | runCK (Plus(e1,e2), k) = runCK(e1, PlusL(e2)::k)
  | runCK (Num m, PlusL(e)::k) = runCK(e, PlusR(m)::k)
  | runCK (Num n, PlusR(m)::k) = runCK(Num(m+n), k)
  | runCK (Times(e1,e2), k) = runCK(e1, TimesL(e2)::k)
  | runCK (Num m, TimesL(e)::k) = runCK(e, TimesR(m)::k)
  | runCK (Num n, TimesR(m)::k) = runCK(Num(m*n), k)

fun eval e = runCK(e,[])
----------------------------------------------------------------------
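
runCK runs the machine to completion in one recursive loop.  To compare
its behavior with Figure 1.7 rule by rule, it can help to expose a
single transition as its own function.  The following is a sketch
under that assumption; the names stepCK and runCount are made up for
illustration, and the expr and frame datatypes are repeated only so the
fragment stands alone.

```sml
datatype expr = Num of int | Plus of expr * expr | Times of expr * expr

datatype frame
  = PlusL of expr | PlusR of int
  | TimesL of expr | TimesR of int

(* stepCK : expr * frame list -> (expr * frame list) option
 * one transition of Figure 1.7; NONE when the state is final *)
fun stepCK (Num _, []) = NONE
  | stepCK (Plus(e1,e2), k) = SOME (e1, PlusL e2 :: k)          (* rule 1 *)
  | stepCK (Times(e1,e2), k) = SOME (e1, TimesL e2 :: k)        (* rule 2 *)
  | stepCK (Num n, PlusL e2 :: k) = SOME (e2, PlusR n :: k)     (* rule 3 *)
  | stepCK (Num n, TimesL e2 :: k) = SOME (e2, TimesR n :: k)   (* rule 4 *)
  | stepCK (Num n, PlusR m :: k) = SOME (Num (m+n), k)          (* rule 5 *)
  | stepCK (Num n, TimesR m :: k) = SOME (Num (m*n), k)         (* rule 6 *)

(* runCount : expr * frame list -> (expr * frame list) * int
 * run to a final state, also counting the transitions taken *)
fun runCount s =
    let fun loop (s, n) =
            case stepCK s
              of NONE => (s, n)
               | SOME s' => loop (s', n+1)
    in loop (s, 0)
    end
```

For the example 2 + ((5 + 8) * 4), runCount started on the initial
state returns the final state (Num 54, []) after 9 transitions,
matching the ten states listed in the trace above.  Each binary
operator costs three transitions: one to push its frame, one to switch
to the right argument, and one to reduce.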


======================================================================
Homework 2.1: Formally state and then prove the equivalence of
SAE[CR] and SAE[CK]

======================================================================