CS221/321 Lecture 2, 9/29/2011
-------------------------------

[Note on terminology: we use the words "expression" and "term"
interchangeably.]

How do we precisely define how arithmetic expressions are evaluated?

1. By writing a program implementing evaluation (say in ML).

------------------------------------------------------------
Program 1.1: SAE eval (SAE-interp.sml)
------------------------------------------------------------
datatype expr
  = Num of int
  | Plus of expr * expr
  | Times of expr * expr

(* eval : expr -> int *)
fun eval (Num n) = n
  | eval (Plus(e1,e2)) = eval e1 + eval e2
  | eval (Times(e1,e2)) = eval e1 * eval e2

val exp1 = Plus(Num 2, Times(Num 13, Num 4))

eval exp1  ==>  54
------------------------------------------------------------

2. But what if we want to define evaluation "mathematically"?

Define an abstract grammar of simple arithmetic expressions (SAEs)

  n ::= 0 | 1 | 2 | ...   (i.e. n ∈ Nat)
  expr ::= Num n | Plus(expr,expr) | Times(expr,expr)

We define a binary evaluation relation:

----------------------------------------------------------------------
Fig 1.1: SAE[BS] - Eval relation
----------------------------------------------------------------------
Relation: Eval ⊆ expr × Nat

Rules:

  Eval ((Num n), n)

  Eval ((Plus(e1,e2)), p)  <=  Eval (e1, n1) & Eval (e2, n2) & p = n1 + n2

  Eval ((Times(e1,e2)), p) <=  Eval (e1, n1) & Eval (e2, n2) & p = n1 * n2
----------------------------------------------------------------------

Alternatively, we can use an infix down arrow symbol (⇓) to represent
the Eval relation:

----------------------------------------------------------------------
Fig 1.1a: SAE[BS] - ⇓ evaluation relation
----------------------------------------------------------------------
Relation: ⇓ ⊆ expr × Nat

Rules:

(1)  -------------
     (Num n) ⇓ n

      e1 ⇓ n1   e2 ⇓ n2
(2)  ---------------------   where p = n1 + n2
       Plus(e1,e2) ⇓ p

      e1 ⇓ n1   e2 ⇓ n2
(3)  ---------------------   where p = n1 * n2
       Times(e1,e2) ⇓ p
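The big-step rules can be transcribed directly into code. The sketch
below is our Python rendering of Program 1.1 and the ⇓ rules (the
tagged-tuple encoding of expressions and the names Num, Plus, Times,
eval_bs are our assumptions, not part of the course code):

```python
# Big-step evaluator mirroring rules (1)-(3) of Fig 1.1a.
# Expressions are tagged tuples: ("Num", n), ("Plus", e1, e2),
# ("Times", e1, e2), mirroring the ML datatype expr.

def Num(n): return ("Num", n)
def Plus(e1, e2): return ("Plus", e1, e2)
def Times(e1, e2): return ("Times", e1, e2)

def eval_bs(e):
    """Return the unique n such that e ⇓ n."""
    tag = e[0]
    if tag == "Num":                        # rule (1): (Num n) ⇓ n
        return e[1]
    n1, n2 = eval_bs(e[1]), eval_bs(e[2])   # evaluate both premises
    if tag == "Plus":                       # rule (2): p = n1 + n2
        return n1 + n2
    return n1 * n2                          # rule (3): p = n1 * n2

exp1 = Plus(Num(2), Times(Num(13), Num(4)))
print(eval_bs(exp1))  # 54, matching the ML example above
```

Like the ML eval, this is a function, which already suggests (but does
not prove) that the ⇓ relation is deterministic.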
----------------------------------------------------------------------

Is evaluation deterministic? I.e. for any e, is the n such that e ⇓ n
unique? Intuitively, we certainly expect so.

Theorem 1.1: ∀e ∀m ∀n. e ⇓ m & e ⇓ n => m = n

Exercise: How do we prove it?
-----------------------

This Eval (⇓) relation defines what is known as a "big-step" semantics
(hence SAE[BS]), because it relates an expression to its final value,
without indicating how many intermediate steps of computation are
involved in the evaluation.

======================================================================
Note: On the inductive definition of expr

The BNF definition

  expr ::= Num n | Plus(expr,expr) | Times(expr,expr)

defines the set of SAE expressions recursively, or "inductively".
What is the meaning of such a definition?

We are working in basic set theory, augmented with a notion of general
expressions consisting of syntax constructors like Plus and Times
applied to tuples of expressions. These expressions could themselves
be encoded by basic set constructions, but we will just assume them as
primitives. It is assumed that syntax constructors are 1-1 functions,
so, for instance, Plus(e1,e2) = Plus(e1',e2') implies e1 = e1' and
e2 = e2'. Num is assumed to be a 1-1 function mapping Nat to
expressions. Furthermore, the ranges of Num, Plus, and Times are
disjoint: expressions formed by different constructors are always
distinct.

We define a "set closure" function C as follows:

  C(S) = {Num(n) | n ∈ Nat}
       ⋃ {Plus(s1,s2) | s1, s2 ∈ S}
       ⋃ {Times(s1,s2) | s1, s2 ∈ S}

Such a closure function maps sets to sets and is monotonic:
A ⊆ B => C(A) ⊆ C(B).

Define a sequence of sets E(i) as follows:

  E(0) = ∅
  E(n+1) = C(E(n))

For instance,

  E(1) = {Num(n) | n ∈ Nat}
  E(2) = {Num(n) | n ∈ Nat}
       ⋃ {Plus(Num(n1),Num(n2)) | n1, n2 ∈ Nat}
       ⋃ {Times(Num(n1),Num(n2)) | n1, n2 ∈ Nat}

and in general E(n) is the set of terms built from Num, Plus and Times
of depth less than or equal to n.
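The construction E(0) ⊆ E(1) ⊆ E(2) ⊆ ... can be animated directly.
The following Python sketch is ours; it cuts Nat down to {0, 1} so
that every stage E(n) is a finite set (with the full Nat, every stage
from E(1) on is infinite):

```python
# One application of the closure function C, with Nat restricted to
# {0, 1} so each stage is finite. Expressions are tagged tuples,
# which are hashable and so can be elements of Python sets.
NAT = (0, 1)

def C(S):
    return ({("Num", n) for n in NAT}
            | {("Plus", s1, s2) for s1 in S for s2 in S}
            | {("Times", s1, s2) for s1 in S for s2 in S})

def E(n):
    S = set()          # E(0) = empty set
    for _ in range(n):
        S = C(S)       # E(k+1) = C(E(k))
    return S

# Monotonicity of C gives the chain E(1) ⊆ E(2) ⊆ E(3) ⊆ ...
assert E(1) <= E(2) <= E(3)
# E(1) holds exactly the Num leaves; depth-2 terms appear at stage 2.
assert ("Num", 0) in E(1)
assert ("Plus", ("Num", 0), ("Num", 1)) in E(2)
```

With this restricted Nat, E(1) has 2 elements and E(2) has
2 + 4 + 4 = 10, matching the description of E(2) above.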
Note that E(n) ⊆ E(n+1), and each finite expression e in SAE will
appear in E(n), where n = depth(e), defined by:

  depth(Num(n)) = 1
  depth(Plus(e1,e2)) = 1 + max(depth(e1), depth(e2))
  depth(Times(e1,e2)) = 1 + max(depth(e1), depth(e2))

(that is, the "rank" of e ∈ SAE, the stage in the inductive
construction where it is introduced, is equal to depth(e)).

Now define the limit set

  E = ⋃ {E(n) | n ∈ Nat}

Prop: E is the least fixed point of C, i.e.
  (i) E = C(E).
  (ii) If S = C(S), then E ⊆ S.
Proof: see the Induction Tutorial.

We take E, the least fixed point of C, to be the meaning of the
inductive definition of expr, i.e. expr = E.

Note that all the expressions in E are finite, because all the
expressions in each E(n) are finite. This definition excludes
"infinite expressions" such as the expression defined by the
recursive equation

  e = Plus(e,e)

which looks like an infinite binary tree with each node labeled by the
Plus constructor.

  e =      Plus
          /    \
       Plus    Plus
      /  \    /  \
    ...  ... ...  ...

As a consequence, the subexpression relation is a well-founded partial
order on expr. This is the basis for structural induction on
expressions, which can be derived from the general principle of
well-founded induction.

========================================================================
Small-step semantics
--------------------

A more fine-grained definition of evaluation takes into account all
the intermediate steps involved. We define a "transition relation" on
expressions (which is a binary relation on expressions):

  e ↦ e'     ( ↦ ⊆ expr × expr )

where expression e' is obtained from expression e by one basic step of
evaluation (i.e. by a single "reduction" of a redex subexpression).
For definitions and notation for transition systems, see the Appendix
"Transition Systems" at the end of this lecture.

The expressions here are also known as "states" in a transition
system. In this case, the initial states I = expr (i.e.
all expressions are initial states), and the final states are
F = {Num(n) | n ∈ Nat}, i.e. the atomic expressions representing
numbers.

We have to account for the fact that the reduction will generally take
place at a subterm (possibly deeply nested) of the main expression we
are evaluating. We define the transition relation in terms of
derivations of transition "judgements" using a set of "inference
rules". Here are the inference rules defining the small-step
transition relation for evaluating SAEs. This rule system is
designated SAE[SS].

----------------------------------------------------------------------
Fig 1.2: SAE[SS]
----------------------------------------------------------------------

(1)  ----------------------------------   (where p = n1 + n2)
       Plus(Num n1, Num n2) ↦ Num p

(2)  ----------------------------------   (where p = n1 * n2)
       Times(Num n1, Num n2) ↦ Num p

               e1 ↦ e1'
(3)  ---------------------------------
       Plus(e1, e2) ↦ Plus(e1', e2)

                   e2 ↦ e2'
(4)  -----------------------------------------
       Plus(Num n1, e2) ↦ Plus(Num n1, e2')

               e1 ↦ e1'
(5)  ---------------------------------
       Times(e1, e2) ↦ Times(e1', e2)

                   e2 ↦ e2'
(6)  -----------------------------------------
       Times(Num n1, e2) ↦ Times(Num n1, e2')

----------------------------------------------------------------------

Rules (1) and (2) are called "instructions" or axioms. They specify
the basic reductions. A term matching one of the patterns on the left
is called a redex, and the reduced term on the right is called the
"contractum". These rules have a conclusion, but no premises.

Rules (3), (4), (5), and (6) are called "search rules". They are used
to isolate the site of a redex and propagate the effect of reducing
the redex to the larger enclosing expression. The rules mean that if
we can derive the premise transition above the line, then we can
derive the conclusion.

[We will refer to these rules as SAE[SS](1), etc. when we need to be
precise about which rule set we are using.]
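Read operationally, the rules pick out the leftmost redex and contract
it. The following one-step reducer is our Python transcription of
Fig 1.2 (the tagged-tuple encoding and the name step are assumptions
of ours); it returns None on terminal states:

```python
# One-step reduction for SAE[SS] (Fig 1.2). Expressions are tagged
# tuples: ("Num", n), ("Plus", e1, e2), ("Times", e1, e2).
# step(e) returns e' with e ↦ e', or None if e is terminal.
import operator

OPS = {"Plus": operator.add, "Times": operator.mul}

def step(e):
    tag = e[0]
    if tag == "Num":
        return None                           # final states are terminal
    op, e1, e2 = tag, e[1], e[2]
    if e1[0] == "Num" and e2[0] == "Num":     # rules (1), (2): contract redex
        return ("Num", OPS[op](e1[1], e2[1]))
    if e1[0] != "Num":                        # rules (3), (5): search left
        return (op, step(e1), e2)
    return (op, e1, step(e2))                 # rules (4), (6): search right

e = ("Plus", ("Num", 2), ("Times", ("Num", 13), ("Num", 4)))
print(step(e))         # ('Plus', ('Num', 2), ('Num', 52))
print(step(step(e)))   # ('Num', 54)
```

Note that step is a function, which reflects the fact (Question 3
below) that the transition relation is deterministic: at most one rule
applies to any expression.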
How do we use these rules to justify a reduction like:

  2 + (13 * 4)  ↦  2 + 52 ?

We have to "prove" this transition by constructing a derivation from
the rules. (To make terms and derivations more concise in the
examples, we'll allow ourselves a slight abuse of notation and leave
out the Num constructors, just treating the numbers themselves as
though they were basic expressions.)

----------------------------------------------------------------------
Fig 1.3: Example: 2 step derivation
----------------------------------------------------------------------

    ------------------ (2)
     Times(13,4) ↦ 52
  ------------------------------------ (4)
   Plus(2, Times(13,4)) ↦ Plus(2, 52)

----------------------------------------------------------------------

This is called a proof tree or a derivation tree. Each node is labeled
by the rule that justifies the inference. A search rule like (4)
promotes a transition on a subexpression to a transition on the
(immediate) containing expression.

If the redex is nested more deeply, the transition derivation will
also be deeper. In fact, the shape of the derivation exactly
corresponds to the form of the expression in the conclusion. For
instance, the transition

  2 + ((5 + 8) * 4)  ↦  2 + (13 * 4)

has the derivation

----------------------------------------------------------------------
Fig 1.4: Example: 3 step derivation
----------------------------------------------------------------------

      ---------------- (1)
       Plus(5,8) ↦ 13
    ----------------------------------- (5)
     Times(Plus(5,8),4) ↦ Times(13,4)
  --------------------------------------------------- (4)
   Plus(2,Times(Plus(5,8),4)) ↦ Plus(2,Times(13,4))

----------------------------------------------------------------------

Observe that the left-hand sides of the transitions in this derivation
are just the nested sequence of subexpressions along the path from the
root of the full term to the redex subexpression.
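This correspondence between a derivation and the path to the redex can
be made concrete: a one-step reducer can return a derivation term
alongside the contracted expression. The sketch below is our Python
illustration (tagged tuples named R1..R6 after the rule numbers of
Fig 1.2; the name step_d is an assumption of ours):

```python
# One-step reduction instrumented to build the derivation term.
# step_d(e) returns (e', d) where d derives e ↦ e', or None if e
# is terminal. Derivations are tagged tuples ("R1", ...) .. ("R6", ...).

def step_d(e):
    tag = e[0]
    if tag == "Num":
        return None
    e1, e2 = e[1], e[2]
    if e1[0] == "Num" and e2[0] == "Num":       # axioms (1), (2)
        n1, n2 = e1[1], e2[1]
        if tag == "Plus":
            return (("Num", n1 + n2), ("R1", n1, n2))
        return (("Num", n1 * n2), ("R2", n1, n2))
    if e1[0] != "Num":                          # search rules (3), (5)
        e1p, d = step_d(e1)
        rule = "R3" if tag == "Plus" else "R5"
        return ((tag, e1p, e2), (rule, d, e2))
    e2p, d = step_d(e2)                         # search rules (4), (6)
    rule = "R4" if tag == "Plus" else "R6"
    return ((tag, e1, e2p), (rule, d, e1[1]))

# The example of Fig 1.4: Plus(2, Times(Plus(5,8), 4))
e = ("Plus", ("Num", 2),
     ("Times", ("Plus", ("Num", 5), ("Num", 8)), ("Num", 4)))
_, d = step_d(e)
print(d)   # ('R4', ('R5', ('R1', 5, 8), ('Num', 4)), 2)
```

The printed derivation term spells out the nesting of rules (4), (5),
and (1) shown in Fig 1.4, with the axiom at the innermost position.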
This will be true in general, so the structure of any reduction
derivation reflects the path from the root to the redex subexpression
of the expression being reduced.

How do we get full evaluation from the transition relation ↦ ?
We construct sequences of chained transitions:

  e1 ↦ e2 ↦ e3 ↦ ... ↦ e

How do we know we are done? When there are no further transitions
possible, i.e. when there does not exist an e' such that e ↦ e'.
Such an expression/state is known as a terminal state. In this case,
the terminal states are the same as the final states
F = {Num(n) | n ∈ Nat}, the simple, fully evaluated Num terms. A
sequence ending in a final state is called a complete sequence, and
represents a complete computation of the value of the starting
expression.

======================================================================
Induction over Derivations in SAE[SS]

The derivations defined by the rules of SAE[SS] are an inductively
defined set, similar to the inductively defined set of SAE
expressions. Let this inductively defined set of derivations be
denoted by D. Each rule can be viewed as a derivation constructor,
and if we want to formalize them as constructors, we have to define
their parameters:

  R1 : Nat * Nat -> D
  R2 : Nat * Nat -> D
  R3 : D * expr -> D
  R4 : D * Nat -> D
  R5 : D * expr -> D
  R6 : D * Nat -> D

----------------------------------------------------------------------
ML version: the datatype of SAE[SS] derivations

type Nat = int   (* so the datatype below compiles as ML *)

datatype D
  = R1 of Nat * Nat
  | R2 of Nat * Nat
  | R3 of D * expr
  | R4 of D * Nat
  | R5 of D * expr
  | R6 of D * Nat
----------------------------------------------------------------------

We use these constructors, with the appropriate Nat and SAE arguments,
to construct derivation terms representing derivations. For instance,
the derivation term for the derivation in Figure 1.4 is:

  R4(R5(R1(5,8),Num 4),2)

A proof by induction over derivations is essentially a proof by
induction over the set of derivation terms. The source (resp.
target) of a derivation is the source (resp. target) of the final
judgement proved by the derivation. For a transition judgement
e ↦ e', we define

  source(e ↦ e') = e
  target(e ↦ e') = e'

We can also define source and target functions on derivations:

  source(R1(n1,n2)) = Plus(Num n1, Num n2)
  target(R1(n1,n2)) = Num m   (where m = n1 + n2)

  source(R2(n1,n2)) = Times(Num n1, Num n2)
  target(R2(n1,n2)) = Num m   (where m = n1 * n2)

  source(R3(d,e)) = Plus(source d, e)
  target(R3(d,e)) = Plus(target d, e)

  source(R4(d,n)) = Plus(Num n, source d)
  target(R4(d,n)) = Plus(Num n, target d)

  source(R5(d,e)) = Times(source d, e)
  target(R5(d,e)) = Times(target d, e)

  source(R6(d,n)) = Times(Num n, source d)
  target(R6(d,n)) = Times(Num n, target d)

Now we can define the concluding judgement of a derivation by

  conclusion d = source d ↦ target d

A proof by "rule induction" over the rules (1) .. (6) is really a
proof by induction on the set of derivation terms defined by the
constructors R1 .. R6. The Induction Principle for SAE[SS]
derivations is:

    ∀n1. ∀n2. P(R1(n1,n2))
  & ∀n1. ∀n2. P(R2(n1,n2))
  & ∀d. ∀e. P(d) ⇒ P(R3(d,e))
  & ∀d. ∀n. P(d) ⇒ P(R4(d,n))
  & ∀d. ∀e. P(d) ⇒ P(R5(d,e))
  & ∀d. ∀n. P(d) ⇒ P(R6(d,n))
  ⇒ ∀d. P(d)

======================================================================

Here is an example of a useful property of SAE[SS], and its proof,
which is an example of induction on the structure of an SAE
expression.

Theorem 1.2 [Progress]: For any e ∈ SAE, either e ∈ F (e is Final),
or ∃e'. e ↦ e'.

Proof: We prove this by induction on the structure of e.

 (0) P(e) = e ∈ F or ∃e'. e ↦ e'      defn
 (1) ∀e. P(e)                         TBS
 (2) Let e ∈ SAE                      defn
 (3) Cases on structure of e
 (4) [Base] case: e = Num(n).         case defn
 (5) e ∈ F                            defn F
 (6) P(e)                             QED 4
 (7) [Ind] case: e = Plus(e1,e2)      case defn
 (8) P(e1)                            I.H.
 (9) P(e2)                            I.H.
 (10) OR cases on P(e1)                          IH 8
 (11) case: e1 ∈ F                               case assumpt
 (12) e1 = Num(n1), some n1                      defn F
 (13) OR cases on P(e2)                          IH 9
 (14) case e2 ∈ F                                case assumpt
 (15) e2 = Num(n2), some n2                      defn F
 (16) e = Plus(Num n1, Num n2)                   (12,15)
 (17) e ↦ Num (n1+n2)                            SAE[SS] (1)
 (18) ∃e'. e ↦ e'                                ∃-intro, 17
 (19) P(e)                                       QED 11, 14
 (20) case ∃e2'. e2 ↦ e2'                        case assumpt
 (21) Plus(Num n1, e2) ↦ Plus(Num n1, e2')       SAE[SS] (4)
 (22) e ↦ Plus(Num n1, e2')                      (7, 12)
 (23) ∃e'. e ↦ e'                                ∃-intro, 22
 (24) P(e)                                       QED 11, 20
 (25) QED 11
 (26) case ∃e1'. e1 ↦ e1'                        case assumpt
 (27) Plus(e1,e2) ↦ Plus(e1',e2)                 SAE[SS] (3)
 (28) e ↦ Plus(e1',e2)                           7
 (29) ∃e'. e ↦ e'                                ∃-intro, 28
 (30) P(e)                                       QED 26
 (31) P(e)                                       QED 7
      [The case e = Times(e1,e2) is symmetric, using SAE[SS] rules
      (2), (5), and (6) in place of (1), (3), and (4).]
 (32) ∀e.P(e)                                    ∀-intro, 2; QED

======================================================================

There are a number of natural questions to ask about this transition
system.

Question 1. Does evaluation always terminate? That is, for any given
initial expression e1, is there always a complete transition sequence
ending in a final state, that is, in a number expression?

Question 2. Is evaluation deterministic? That is, does the evaluation
of an expression, assuming it terminates, always yield the same,
uniquely determined number?

A question related to Question 2 is:

Question 3. Is the transition relation deterministic, in the sense
that for a given nonfinal expression (i.e. compound expression) e1,
there is a unique expression e2 such that e1 ↦ e2?

An affirmative answer to Question 3 obviously implies an affirmative
answer to Question 2. In fact, the answer to all three questions is
yes.

======================================================================
Exercise 1.2
------------
Prove that small-step (transition) evaluation for SAE always
terminates.

Exercise 1.3
------------
Prove that the transition relation is deterministic: for any
e ∈ expr, there exists at most one e' such that e ↦ e'.
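Progress (Theorem 1.2) and the determinacy claim of Exercise 1.3 can
be sanity-checked empirically on a finite fragment. This is not a
proof, just a mechanical check, in a Python sketch of ours: successors
tries every rule of Fig 1.2 and collects all possible next states, so
finding at most one is a genuine test of determinacy.

```python
# Check Progress and determinacy on all SAE expressions of depth <= 3
# over a cut-down Nat = {0, 1}. Expressions are tagged tuples.

def successors(e):
    """All e' with e ↦ e', gathered by trying every rule of Fig 1.2."""
    if e[0] == "Num":
        return []
    op, e1, e2 = e
    out = []
    if e1[0] == "Num" and e2[0] == "Num":              # rules (1), (2)
        out.append(("Num", e1[1] + e2[1] if op == "Plus" else e1[1] * e2[1]))
    out += [(op, e1p, e2) for e1p in successors(e1)]   # rules (3), (5)
    if e1[0] == "Num":                                 # rules (4), (6)
        out += [(op, e1, e2p) for e2p in successors(e2)]
    return out

def exprs(depth):
    """All expressions of depth <= depth, with Nat cut down to {0, 1}."""
    if depth == 0:
        return set()
    S = exprs(depth - 1)
    return ({("Num", n) for n in (0, 1)}
            | {(op, a, b) for op in ("Plus", "Times") for a in S for b in S})

for e in exprs(3):
    succ = successors(e)
    # Progress: e is final (a Num) or has at least one successor ...
    assert e[0] == "Num" or len(succ) >= 1
    # ... and determinacy: never more than one successor.
    assert len(succ) <= 1
```

The mutually exclusive guards in successors mirror why determinacy
holds: for any compound expression, exactly one of the three rule
groups applies, depending on which subterms are already values.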
======================================================================
Equivalence of Big-Step and Small-Step Evaluation
-------------------------------------------------

Now we have two alternate explanations of evaluation of SAE, using
either the big-step semantics ⇓ or the small-step semantics ↦. We
would hope that they are consistent with one another, and this is
indeed the case.

Theorem 1.3. For all e ∈ expr and n ∈ Nat, e ⇓ n <=> e ↦! Num(n).

Proof:

Intuition: One learns from experience that when you have a double
implication like this, it is usually best to subdivide the problem
into two one-way implications, because it is often the case that the
two implications are best proved by different approaches (different
forms of induction).

So consider first e ⇓ n => e ↦! Num(n). In the assumption e ⇓ n, we
have two elements, expression e and natural number n. It seems
unlikely that induction on n will work (because e ⇓ n is defined
recursively on the structure of e), so we concentrate on e. As when
starting with an assumption like e ↦ e', we have a choice of doing
induction on the structure of expression e or induction on the
derivation of the judgement e ⇓ n. Unlike the ↦ judgements, the case
analyses for the structure of e and for the derivation of e ⇓ n are
the same: e = Num(n), e = Plus(e1,e2), or e = Times(e1,e2). So it
doesn't really matter whether we do induction on the structure of e,
or on the derivation of e ⇓ n.

If e is a value, the case is trivial, since e ↦! Num(n) by the empty
transition sequence. If e = Plus(e1,e2), we will be able to assume by
induction that e1 ↦! Num(n1) and e2 ↦! Num(n2). But how do we take
these two assumed transition sequences and build a transition
sequence for e ↦! Num(n1+n2)? This will take a Lemma.

For e ⇓ n <= e ↦! Num(n), it will be natural to do induction on the
transition sequence implied by e ↦! Num(n), or equivalently, on the
definition of ↦*.

Proof. (Part I) e ⇓ n => e ↦! Num(n).

By induction on the derivation of e ⇓ n (Figure 1.1a).
Base: e ⇓ n by ⇓(1): Then e = Num(n). Then e ↦* Num(n) in zero steps
(by ↦*(1)). [X]

Ind1: e ⇓ n by ⇓(2). Then e = Plus(e1,e2), and by inverting ⇓(2) we
have: ∃n1,n2. n = n1+n2 and e1 ⇓ n1 and e2 ⇓ n2.

  IH1: e1 ↦* Num(n1)
  IH2: e2 ↦* Num(n2)

Lemma 1:
  (1) e ↦* e' => Plus(e,f) ↦* Plus(e',f)
  (2) e ↦* e' => Plus(Num(n),e) ↦* Plus(Num(n),e')
  (3) e ↦* e' => Times(e,f) ↦* Times(e',f)
  (4) e ↦* e' => Times(Num(n),e) ↦* Times(Num(n),e')

Proof:
  (1) e ↦* e' => Plus(e,f) ↦* Plus(e',f)
  Induction on defn of ↦*.
  Base: e ↦* e' by ↦*(1).
     => [inverting ↦*(1)] e = e'
     => Plus(e,f) = Plus(e',f)  [by inductive defn of = on expressions]
     => [↦*(1)] Plus(e,f) ↦* Plus(e',f)
  Ind: e ↦* e' by ↦*(2): Then there exists e0 s.t.
       (1) e ↦ e0 & (2) e0 ↦* e'
     IH: Plus(e0,f) ↦* Plus(e',f).
     [SAE[SS](3), (1)] Plus(e,f) ↦ Plus(e0,f)
     => [↦*(2), IH] Plus(e,f) ↦* Plus(e',f)   [] [X]

  The proof of (2) is similar, using SAE[SS](4) instead of SAE[SS](3).
  The proofs of (3) and (4) are analogous, replacing Plus with Times
  and SAE[SS] rules (3) and (4) with rules (5) and (6).
  [QED Lemma 1]

Now we complete part (I) of the main theorem.

     [IH1, Lemma 1(1)] Plus(e1,e2) ↦* Plus(Num(n1), e2)             (1)
     [IH2, Lemma 1(2)] Plus(Num(n1), e2) ↦* Plus(Num(n1), Num(n2))  (2)
  => [(1), (2), Lemma TS1] Plus(e1,e2) ↦* Plus(Num(n1),Num(n2))     (3)
     [Rule (1)] Plus(Num(n1), Num(n2)) ↦ Num(n1+n2)                 (4)
  => [Defn ↦*] Plus(Num(n1), Num(n2)) ↦* Num(n1+n2)                 (5)
  => [(3),(5),Lemma TS1] Plus(e1,e2) ↦* Num(n1+n2)   [X]

Here we have used a basic lemma about transition sequences:

Lemma TS1 [Concatenation of transition sequences]:
  s1 ↦* s2 & s2 ↦* s3 => s1 ↦* s3.
Proof: Exercise (straightforward induction on s1 ↦* s2).

The Times(e1,e2) case is similar. [XX]

Part (II): e ↦! Num(n) => e ⇓ n.

Proof: By induction on the definition of ↦*.

Base: e = Num(n).
  [⇓(1)] e ⇓ n

Ind: ∃ e1. e ↦ e1 & e1 ↦* Num(n).
  IH: e1 ⇓ n

Lemma 1. e ↦ e' and e' ⇓ n => e ⇓ n.

Proof: By induction on the derivation of e ↦ e'.

Base: e ↦ e' by ↦(1).
Then e = Plus(Num(n1),Num(n2)) and e' = Num(n1+n2), so n = n1+n2.
  [⇓(1)] Num(n1) ⇓ n1                  (1)
  [⇓(1)] Num(n2) ⇓ n2                  (2)
  [(1), (2), ⇓(2)] e ⇓ (n1+n2)   [X]

Ind1: e ↦ e' by ↦(3).
Then ∃ e1, e1', e2. e = Plus(e1,e2) & e' = Plus(e1',e2) & e1 ↦ e1'.
  [Hyp] e' ⇓ n, by ⇓(2) by the form of e'
  [⇓(2) inv] ∃ n1,n2. e1' ⇓ n1 & e2 ⇓ n2 & n = n1 + n2.
  IH: e1 ⇓ n1.   (since e1 ↦ e1' and e1' ⇓ n1)
  => [⇓(2)] e ⇓ n1 + n2 = n   [X].

The cases for Rules (4), (5), (6) are similar, and the case for Rule
(2) is similar to the Base case above for Rule (1). [X]

Now from the case assumption and the IH, we have e ↦ e1 & e1 ⇓ n.
Hence by Lemma 1, e ⇓ n.
[QED]

======================================================================
Appendix: Transition systems
----------------------------

Defn: A transition system is a quadruple (S, ↦, I, F) where

  S : a set of "states"
  ↦ ⊆ S × S : a transition relation
  I ⊆ S : a set of initial states
  F ⊆ S : a set of final states

(other arrow symbols like → or ⇒ are sometimes used for the
transition relation). We typically use a transition system to model a
computational process, where a sequence of transitions takes us from
an initial state to a final state that represents the result of the
computation. When a ↦ b, we call a the "source" of the transition,
and b the "target" of the transition.

Defn: A transition system is determinate (deterministic) if for any
state s ∈ S, if s ↦ s' and s ↦ s'', then s' = s'' (i.e. there is at
most one transition from any given state).

The transition relation may be nondeterministic, meaning that for
some states there may be more than one transition with that state as
the source:

  a ↦ b,  a ↦ c  where b /= c

Defn: A state that is not the source of any transition is called a
"terminal" state (also known as a "stuck" state). We write "s /↦" to
indicate that s is terminal. Generally, the final states will be
terminal states. In general, though, stuck states are not necessarily
final states.

Defn: A transition sequence

  s0 ↦ s1 ↦ ...
↦ sn is a chain of transitions taking us from a state s0 to sn. We
write s0 ↦* sn if there exists such a transition sequence (which may
not be unique). We include the empty transition sequence, so for any
state s, s ↦* s.

The relation ↦* is the transitive, reflexive closure of ↦. It can be
defined inductively by the rules

                          s ↦ s''   s'' ↦* s'
  ---------- (1)         ---------------------- (2)
   s ↦* s                        s ↦* s'

----------------------------------------------------------------------
Alternatively, ↦* is defined by the following inductive defn:
  (i) (s,s) ∈ ↦*
  (ii) (s,s'') ∈ ↦ & (s'', s') ∈ ↦* ==> (s,s') ∈ ↦*
  (iii) nothing else is in ↦*
----------------------------------------------------------------------

We also write s ↦n s' if there is a transition sequence of length
n ∈ Nat from s to s', where s ↦0 s for every state s. This family of
relations can also be defined inductively:

                          s ↦ s''   s'' ↦n s'
  ---------- (1)         ---------------------- (2)
   s ↦0 s                      s ↦(n+1) s'

Prop A1: ↦* = ⋃ {↦n | n ∈ Nat}   (i.e. s ↦* s' iff ∃n. s ↦n s')
Proof: exercise.

Prop A2: ↦* is transitive.
Proof: exercise.

Defn: If s ↦* s' and s' ∈ F, the transition sequence is called a
"complete" transition sequence, and we write s ↦! s'.

Defn: If s ↦* s' where s' is stuck, we say that the transition
sequence is "terminal". Note that a terminal sequence may not be
complete, and a complete sequence may not be terminal, though they
will agree in the usual case where the stuck states are exactly F.

Defn: A "divergent" sequence is an infinite sequence

  s0 ↦ s1 ↦ s2 ↦ ...

that goes on forever. This can represent a nonterminating
computation. Note that the existence of divergent transition
sequences does not imply that S is infinite, since the sequence may
pass through a finite cycle of states. Also, there may be both
terminating and divergent transition sequences starting at the same
state.
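These definitions can be exercised on a small concrete system. The
following Python sketch is our own example (the four-state system,
STEP, step_n, and star are assumptions of ours): it is
nondeterministic at b, has a terminal state c, and a divergent
sequence d ↦ d ↦ ... through a one-state cycle.

```python
# A tiny transition system with states {a, b, c, d} and transitions
# a ↦ b, b ↦ c, b ↦ d, d ↦ d, given as a dict of successor sets.
STEP = {"a": {"b"}, "b": {"c", "d"}, "c": set(), "d": {"d"}}

def step_n(s, n):
    """The set {s' | s ↦n s'}, by n-fold relational composition."""
    frontier = {s}
    for _ in range(n):
        frontier = {t for u in frontier for t in STEP[u]}
    return frontier

def star(s):
    """The set {s' | s ↦* s'}: union of the ↦n images (Prop A1)."""
    seen, frontier = {s}, {s}
    while frontier:              # finite state set, so this terminates
        frontier = {t for u in frontier for t in STEP[u]} - seen
        seen |= frontier
    return seen

assert step_n("a", 0) == {"a"}        # s ↦0 s for every state s
assert step_n("a", 2) == {"c", "d"}   # a ↦ b ↦ c and a ↦ b ↦ d
assert star("a") == {"a", "b", "c", "d"}
assert star("c") == {"c"}             # c is terminal, but still c ↦* c
```

Note that star must track visited states: naive iteration of step_n
would loop forever on the cycle at d, exactly the situation the
remark about divergence through a finite cycle describes.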
Exercise: construct examples illustrating each of these concepts or
terms.

For any transition system (S, ↦, I, F), the states and transition
relation define a directed graph (S, ↦). A transition sequence
represents a path through this graph from its initial source to its
final target.