Previous Up Next

Chapter 2  A Simple Functional Language with Call-by-Value Semantics

In this chapter we specify the semantics of the language which we use to implement transformations.

During recent years, several methodologies to rigorously specify the semantics of formal languages have emerged. We will present the semantics of a simple functional language that is suitable for demonstrating in principle all kinds of transformations that can be applied to more fully featured, “real” functional languages.

The concrete syntax, the one in which the programmer writes his programs, will not be considered: it is less important, how programs look than what they mean and how they are structured. Therefore, only the abstract syntax, which carries the semantics, will be presented. Everything else is just a matter of taste.

Due to practical restrictions on the input character set of computer languages, which disallow convenient and concise mathematical symbols in computer programs, we shall explain the commonly known notation for abstract syntax of functional languages as it is used in the text part of the thesis together with the keywords that appear in the implementation (LambdaProlog terms). This is to prevent confusion of people who want to extend the implementation.

Much care was taken to ensure that the implementation of the language stays conformant to the rigorous basis provided in [] (especially chapter 13 on a strict functional language with recursive types). This should make it much easier to prove properties of the language building upon the strong formal foundations provided in the mentioned book1.

2.1  Description of types, values and informal semantics

Types are sets of values that belong together in a meaningful way. E.g. the set of integer values normally has a corresponding type in most computer languages. More generally, the notion of types extends to terms, whose evaluation may or may not terminate. If a term is regarded to be of type t and if its evaluation terminates, the resulting value must indeed be of the required type. We write t : τ to indicate that term t is of type τ.

Type checking ensures to a high degree that the programmer does not accidently write meaningless programs by combining terms that do not belong together. The implementation part of this work contains both a type inference engine and a type checker for polymorphically typed programs of the given language. Polymorphism allows families of types to be parameterised by others. This is a very expressive and generic way to deal with types.

Type inference2 is a process that can automatically infer which type a program and its constructs have by analysing their structure. Unfortunately, type inference for polymorphically typed languages, which is the case for many functional ones, is undecidable in the general case. Therefore, if we want to be able to assign correct types to all possible terms, we need type annotations provided by the user that enable the compiler to infer correct types: it would otherwise have to reject type correct programs or its type inference algorithm could loop. Both cases are not very satisfactory.

When an algorithm verifies programs with such annotated types, we speak of type checking. The implementation of our system features both a type inference algorithm and an extension for type checking that can be used to verify the type correctness of input programs. Unfortunately, it is still not fully complete: it cannot yet handle recursive types in the general case. The current naive version loops under some circumstances.

The implementation of the type system in our system started out with an example implementation presented in John Hannan’s tutorial [], corrected three small errors, extended the type system to cover sum types and recursive types (see further below) and changed the way recursive functions are handled to a more preferable style for strict functional languages. To conform better to the formal foundation of the example language as presented in [], a few identifiers were renamed, too (tabs ↔ tlam; abs → lam).

The type system of our language is otherwise very simple. In the following section, we will give a detailed specification of the types of our language and, as we go along, also explain the way they are represented in the implementation that is written in LambdaProlog:

2.1.1  Unit type

This type has only a single value:

 Text notationRepresentation in implementation
typeunitunit
value*u

Although this type may seem boring, we need it for e.g. lifting terms to values by creating a function that takes this value of “zero-information” and returns the given term. This is called thunking3 and allows passing around computations as values. This can be used to e.g. simulate call-by-name evaluation in a strict language. Additionally, the unit value may be necessary for specifying datastructures together with sum types (see below).

2.1.2  Product type

Values of this type are pairs of the form (c1, c2), where c1 and c2 are values of type τ1 and τ2 respectively. For terms t11 and t22, we have the pair (t1, t2) : τ1 * τ2. It is possible to project from pairs:

Given a term:

t : τ1 * τ2

We have:

  fst(t):τ1
  snd(t):τ2

where fst(t) returns the first element of the pair, snd(t) the second one.

 Text notationRepresentation in implementation
typeτ1 * τ2TP1 ** TP2
value(c1, c2)pair C1 C2
termfst(t)fst T
termsnd(t)snd T

2.1.3  Sum type

Values of this type have either the form inl(c1) or inr(c2), where c1 is a value of type τ1 and c2 one of type τ2. As before, this extends similarly to terms.

Given terms:

  t:τ1 + τ2
  t1:τ  possibly containing variable  x : τ1
  t2:τ  possibly containing variable  y : τ2

We have:

  case  t  of inl(x) . t1,  inr(y) . t2 : τ

Representation in the implementation:

case T LF RF

The case statement has the intuitive meaning that if t evaluates to inl(x), then this term returns t1, otherwise if it evaluates to inr(y), then t2 is returned.

The LambdaProlog construct demands some explanation4: case is just a normal user-defined data constructor that takes three parameters. The first parameter, the variable T, stands for the term that decides, what case arm should be taken. The other two parameters represent the case arms. They are actually functions in LambdaProlog. E.g. if the variable T matches some pattern inl v, then the left case arm will be chosen. If the function representing this case arm is e.g. (x\ x), which is the anonymous5 identity function in LambdaProlog notation, then x will be bound to v in the function body. The result of the case statement would be v in this example.

 Text notationRepresentation in implementation
typeτ1 + τ2TP1 ++ TP2
valueinl(c)inl C
valueinr(c)inr C

2.1.4  Function type

Function values of type τ1 → τ2 have the form λ xτ1.t, where t is a term of type τ2, which may contain variable x of type τ1. When types are clear from the context, we will omit the type of the variable.

Given terms:

  t1:τ1 → τ2
  t2:τ1

We have:

  t1 t2 : τ2

Terms as described above represent function application. In the expression lam F in the table below, the variable F stands for a function in LambdaProlog notation again.

 Text notationRepresentation in implementation
typeτ1 → τ2TP1 --> TP2
valueλ x . tlam F
termt1 t2app T1 T2

2.1.5  Type of recursively defined functions

Recursive functions are typed similarly to non-recursive ones. Given term t : τ2, which may contain variables x : τ1, we can form (types may be omitted in the text):

rec  fτ1→τ2.(λ xτ1.t) : τ1→τ2

Representation in the implementation:

rec F

Function application works the same as for non-recursive functions.

This time the function F in LambdaProlog representation takes two parameters as, for example, in (f\x\ app f x). The reason is that we need a way to refer to the name of the recursive function in its function body — otherwise we would not be able to call it recursively.

It should be noted that there are other ways to specify recursive functions, as is done, for example, in []: there the recursive definition does not take a parameter, which means that we can create arbitrary recursive values (e.g. cyclic lists), too.

However, this generality comes with some problems in strict languages (not so in call-by-name ones). For example, it is not clear what the value

rec  x.x

should mean. It could be of any type, but if we want to access it, it is not defined! This could lead to looping programs or, even worse, to crashing ones in real implementations, where the runtime system would just try to access an uninitialised memory location. Lazy languages do not suffer from the latter problem, because they evaluate all expressions lazily: the runtime system would not assume initialised locations and would either just keep looping trying to evaluate this example or (in some other cases) just delay evaluation of recursive subexpressions.

On the other hand, if we allow for a list constructor ::, the following could be interpreted as the infinite list of ones:

rec  ones. 1 :: ones

Some strict languages (e.g. OCaml) allow such forms and impose restrictions on the right-hand side of definitions that prevent ill-formed cases. In practice it seems to be of hardly any use to allow such “infinite” datastructures in strict languages6. It might indeed make some termination proofs more difficult, since inductively defined datatypes could then have “infinitely large” values, which does not fit well to their usual properties (they allow inductive proofs).

Therefore, we adopt the style that SML goes and disallow fully general recursive definitions completely. In other terms: evaluation always has to be triggered explicitly by a function application, thus preventing any problems with definiteness.

2.1.6  Recursive types

Recursive types are needed to specify inductively defined datastructures such as lists, binary trees, etc., or co-inductive ones like e.g. streams.

Here, for example, a way to encode lists in our language (text representation):

nilabs(inl())
cons(nl)abs(inr(n,l))

’abs’ abstracts the type of the list representation and yields a recursive type. ’inl’ just takes the unit-value as parameter, which indicates that we cannot get any more information from it (it stands for the empty list ’nil’). ’inr’ takes a pair of values: the first one stands for the contents of a list element, the second one for the tail (rest) of the list7. If we want to implement a function on lists, we would have to use the keyword ’rep’ on list values to “know” whether they represent empty lists ’inl()’ or some cons-element ’inr(value, rest)’.

It is important to note that using simple sum types together with pairs and the unit type, we have the potential to specify isomorphic representations for all kinds of recursive datatypes! This means that it is not necessary to come up with a more elaborate implementation of recursive types: it is not too difficult to translate any kind of existing recursive type with arbitrary data constructors into this representation and back again.

 Text notationRepresentation in implementation
typeµXmu TF
valueabs(c)abs_rtp C
termrep(t)rep_rtp T

The variable TF stands for a function from types to types. As was the case with recursive functions, we need this to bind the recursive type within the body of its definition.

2.2  Abstract syntax

Having given an overview of the language, we can now present the full abstract syntax that will be used throughout the text (please refer to the informal specification in section 2.1 to learn about the representation used in the implementation).


  t::=xτ
 |*
 |(t1t2)
 |fst(t)
 |snd(t)
 |λ xτ.t
 |t1 t2
 |inl(t)
 |inr(t)
 |case t  of inl(x).t1,  inr(y).t2
 |abs(t)
 |rep(t)
 |rec fτ1→τ2.(λ xτ1.t)
Table 2.1: Abstract syntax

2.3  Formal typing rules

The formal typing rules are to be read as follows: each rule consists of premises and a conclusion, the premises being written above and the conclusion below the solid line. To prove that a term is well-typed, we have to derive this using the rules until all derivations end in rules that do not require any premise to be true, these last rules being called axioms.


xτ: τ
           
* : unit
t1 : τ1    t2 : τ2
(t1t2) : τ1 * τ2
          
t : τ1 * τ2
fst(t) : τ1
          
t : τ1 * τ2
snd(t) : τ2
t : τ2
λ xτ1.t : τ1 → τ2
          
t1 : τ1 → τ2    t2 : τ1
t1 t2 : τ2
t1 : τ1
inl(t1) : τ1 + τ2
          
t2 : τ2
inr(t2) : τ1 + τ2
t : τ1 + τ2    t1 : τ    t2 : τ
case  t  of inl(x).t1,  inr(y).t2 : τ
t : τ[µX.τ/X]
abs(t) : µX
          
t : µX
rep(t) : τ[µX.τ/X]
t : τ2
rec  fτ1 → τ2.(λ xτ1.t) : τ1 → τ2
Table 2.2: Formal typing rules

Attentive eyes may have spotted the interesting pattern of the typing rules for function application and abstraction: they look the same as the logic rules for implication introduction and elimination in natural deduction proof systems. This important relation, known as Curry-Howard isomorphism, relates proofs and beta-reduction (i.e. “evaluation”) of typed lambda terms, which turns out to be a very powerful tool in both computer science and logic.

2.4  Operational semantics

There are several ways to specify evaluation of programs, the operational semantics of a language. However, the structural operational semantics has the advantage of being syntax-directed. This ensures that it fills its purpose of providing for a strict guideline of implementation: it directly associates evaluation steps with each piece of abstract syntax in a formal way. Before this kinds of semantics became more widespread in use, it was common practice to specify the operational semantics by providing an abstract machine that interprets it. This is, of course, not so rigorous an approach from a formal point of view.

We will see in chapter 6 that our system implements both “normal” evaluation rules and partial evaluation as well. The declarative reading of the evaluation rules in LambdaProlog corresponds exactly to the rules we are about to specify: this makes it trivial to prove our implementation correct and demonstrates the very high level of programming achievable in LambdaProlog, making it most suitable for such tasks.

Before presenting the evaluation rules, we will give a short explanation of the meaning of values.

2.4.1  Values

Some terms that do not have free variables in them (i.e. they are closed) and if they are well-typed, correspond to basic elements of their types. In other terms: they are values.


  c::=*
 |(c1c2)
 |λ xτ.t    iff the whole lambda term is closed
 |inl(c)
 |inr(c)
 |abs(c)
Table 2.3: Values

* is the only value of type unit. The pair (c1, c2) is a value when its type is τ1 * τ2 and when c1 has type τ1 and c2 has type τ2. Any closed and well-typed lambda term λ xτ.t is a value, as are inl(c1) and inr(c2) when their type is τ1 + τ2 and if c1 has type τ1 and c2 has type τ2. A term abs(c) is a value if c is a value of type τ[µX.τ/X].

Terms which are values have the property that evaluating them does not change them: they are self-describing.

It is worth noting that in our call-by-value language variables always stand for values in the sense that their meaning is always already computed before one can access it: e.g. variables standing for function parameters are already computed before the function body in which they are bound is executed. This property is very important to know in program transformation: it guarantees that accessing variables will not cause any side effects, be it non-termination or others. We will come back to this in chapter 4.

2.4.2  Structural operational semantics

This subsection will present all the derivation rules to evaluate programs of the language. They are given in the form of structural operational semantics, which is sometimes called natural semantics of the language. Structural operational semantics rigorously defines each computational step to be performed after another so that an interpreter can be implemented from the specification in a straightforward manner.

Results derivable from the structural operational semantics include (the arrow → is to be read as “derives to”):

Type consistency If t : τ is closed and tc then c : τ.
Determinacy For any closed t, there is at most one c such that tc.

Proofs can be obtained by induction on the structure of derivations.

Evaluation rules

As in the subsection on typing rules, the derivation rules of the operational semantics consist of premises and a conclusion. The elements of the rules specify evaluation steps. To evaluate a term on the left side of an evaluation arrow, we first have to derive (evaluate) all elements of the premise above it. Doing this may bind results of evaluation to variables, which may be used on the right-hand side of the conclusion. The right-hand side of the derivation is always a value!

It can happen that the derivation tree would be infinite in size, that we never meet a final rule during derivation: in this case we say that the term does not converge or that evaluation does not terminate. It is possible to prove equivalence of programs by structural induction or (in more difficult cases) by showing that they have equivalent derivation trees (induction on derivations). This can be quite tedious to do in this kind of semantics because of the low level (computation steps) it addresses. Denotational semantics, which we will explain in section 2.5, is a more elegant way of establishing equivalences between programs.


* → *
t1 → c1    t2 → c2
(t1t2) → (c1c2)
          
t → (c1c2)
fst(t) → c1
          
t → (c1c2)
snd(t) → c2
λ x.t → λ x.t
          
t1 → λ x.t    t2 → c1    t[c1/x] → c
t1 t2 → c
t → c
inl(t) → inl(c)
          
t → c
inr(t) → inr(c)
t → inl(c1)    t1[c1/x] → c
case  t  of inl(x).t1,  inr(y).t2 → c
t → inr(c1)    t2[c1/y] → c
case  t  of inl(x).t1,  inr(y).t2 → c
t → c
abs(t) → abs(c)
          
t → abs(c)
rep(t) → c
rec  f.(λ x.t) → λ x.(t[rec  f.(λ x.t)/f])
Table 2.4: Structural operational semantics

2.5  Denotational semantics

Denotational semantics provides for a framework in which program equivalences can be established using mathematical (equational) reasoning. The building blocks of this approach can be found in domain theory, which is concerned with complete partial orders (domains). In short, denotational semantics maps syntactic constructs of a language to least fixed points8 in domains, the elements of the domain representing different meanings. Since there can only be exactly one least fixed point (it is defined by its “leastness”), denotational semantics allows us to unambiguously assign meaning to computer programs.

A very important requirement for a denotational semantics is that it agree with the operational semantics on observations of interest. This means that if a term converges under the operational semantics (= evaluation terminates), it also does so under the denotational one and vice versa. Of course, the resulting value should be identical in both cases. Conversely, if evaluation does not terminate, this should be reflected in both semantics. Only in the case that the denotational and operational semantics fully agree can we draw sound conclusions about program equivalences. It should be noted that this does not mean that denotational semantics always allows us to establish (operational) program equivalences in the general case: though it is sound, it is not complete in this respect.

The existence of recursive types in our language requires us to make use of a special form of domains, namely so-called Scott domains, also known as information systems. The theory behind them is quite heavy, and an introductory description alone would exceed the scope of this work9. It suffices for our purposes to point out the intention, which is to find least solutions in recursive domain equations.

It can be shown that information systems are indeed domains containing a least element. All the interesting types of our language, like product types, sum types, function types and, of course, recursive types can be brought into this framework, which allows us to specify an unambiguous denotational semantics for our language.

Readers who are interested in e.g. proving transformations correct are well-advised to make use of this powerful representation for the semantics of formal languages.

Because it would be necessary to explain a large part of the formalism behind the somewhat complex structure of information systems, we refrain from giving a full specification here. It should be pointed out another time that both the examples of this work and the implementation of the system follow exactly the formal specification of a strict language with recursive types of section 13.1 in [], to which the interested reader may refer.


1
The lecture notes [] helped answer some questions, too.
2
See [] for an introduction to polymorphic type inference.
3
See [] for details.
4
We will go into even more details in chapter 6.
5
“Anonymous” functions are functions without a name = lambda abstractions.
6
As we mentioned earlier, however, one can use thunking techniques to simulate this in a strict language.
7
Scheme and LISP programmers would call this a cons-cell.
8
It is not necessarily the case that a domain has a least fixed point, but the denotational interpretation of types usually lifts the domain of the basic types to include a least element (called bottom). This domain is then called a complete pointed partial order. The bottom element stands for the “unknown” value, for example, when a term does not converge.
9
For details, see chapters 12 and 13 in [].

Copyright   ©  2000 Markus Mottl ⟨markus.mottl@gmail.com⟩
Previous Up Next