Chapter 2 A Simple Functional Language with Call-by-Value Semantics
In this chapter we specify the semantics of the language which we use
to implement transformations.
During recent years, several methodologies to rigorously specify the
semantics of formal languages have emerged. We will present the semantics
of a simple functional language that is suitable for demonstrating in
principle all kinds of transformations that can be applied to more fully
featured, “real” functional languages.
The concrete syntax, the one in which the programmer writes his programs,
will not be considered: it is less important, how programs look than
what they mean and how they are structured. Therefore, only the abstract
syntax, which carries the semantics, will be presented. Everything else
is just a matter of taste.
Due to practical restrictions on the input character set of computer
languages, which disallow convenient and concise mathematical symbols
in computer programs, we shall explain the commonly known notation for
abstract syntax of functional languages as it is used in the text part of
the thesis together with the keywords that appear in the implementation
(LambdaProlog terms). This is to prevent confusion of people who want
to extend the implementation.
Much care was taken to ensure that the implementation of the language
stays conformant to the rigorous basis provided in []
(especially chapter 13 on a strict functional language with recursive
types). This should make it much easier to prove properties of the
language building upon the strong formal foundations provided in the
mentioned book1.
2.1 Description of types, values and informal semantics
Types are sets of values that belong together in a meaningful way.
E.g. the set of integer values normally has a corresponding type in most
computer languages. More generally, the notion of types extends to terms,
whose evaluation may or may not terminate. If a term is regarded to be
of type t and if its evaluation terminates, the resulting value must
indeed be of the required type. We write t : τ to indicate that
term t is of type τ.
Type checking ensures to a high degree that the programmer does not
accidently write meaningless programs by combining terms that do not
belong together. The implementation part of this work contains both
a type inference engine and a type checker for polymorphically typed
programs of the given language. Polymorphism allows families of types
to be parameterised by others. This is a very expressive and generic
way to deal with types.
Type inference2 is a process
that can automatically infer which type a program and its constructs
have by analysing their structure. Unfortunately, type inference for
polymorphically typed languages, which is the case for many functional
ones, is undecidable in the general case. Therefore, if we want to be able
to assign correct types to all possible terms, we need type annotations
provided by the user that enable the compiler to infer correct types:
it would otherwise have to reject type correct programs or its type
inference algorithm could loop. Both cases are not very satisfactory.
When an algorithm verifies programs with such annotated types, we speak
of type checking. The implementation of our system features both
a type inference algorithm and an extension for type checking that can be
used to verify the type correctness of input programs. Unfortunately, it
is still not fully complete: it cannot yet handle recursive types in the
general case. The current naive version loops under some circumstances.
The implementation of the type system in our system started out
with an example implementation presented in John Hannan’s tutorial
[], corrected three small errors, extended the type
system to cover sum types and recursive types (see further below) and
changed the way recursive functions are handled to a more preferable
style for strict functional languages. To conform better to the formal
foundation of the example language as presented in [],
a few identifiers were renamed, too (tabs ↔ tlam; abs
→ lam).
The type system of our language is otherwise very simple. In the
following section, we will give a detailed specification of the types
of our language and, as we go along, also explain the way they are
represented in the implementation that is written in LambdaProlog:
This type has only a single value:
| | Text notation | Representation in implementation |
| type | unit | unit |
| value | * | u |
Although this type may seem boring, we need it for e.g. lifting
terms to values by creating a function that takes this value of
“zero-information” and returns the given term. This is called
thunking3 and
allows passing around computations as values. This can be used to e.g.
simulate call-by-name evaluation in a strict language. Additionally,
the unit value may be necessary for specifying datastructures together
with sum types (see below).
2.1.2 Product type
Values of this type are pairs of the form (c1, c2), where c1 and
c2 are values of type τ1 and τ2 respectively. For terms
t1:τ1 and t2:τ2, we have the pair (t1, t2) : τ1 *
τ2. It is possible to project from pairs:
Given a term:
We have:
where fst(t) returns the first element of the pair, snd(t)
the second one.
| | Text notation | Representation in implementation |
| type | τ1 * τ2 | TP1 ** TP2 |
| value | (c1, c2) | pair C1 C2 |
| term | fst(t) | fst T |
| term | snd(t) | snd T |
Values of this type have either the form inl(c1) or inr(c2), where
c1 is a value of type τ1 and c2 one of type τ2. As
before, this extends similarly to terms.
Given terms:
| t | : | τ1 + τ2 |
| t1 | : | τ possibly containing variable x : τ1 |
| t2 | : | τ possibly containing variable y : τ2 |
|
We have:
| case t of inl(x) . t1, inr(y) . t2 : τ |
|
Representation in the implementation:
case T LF RF
The case statement has the intuitive meaning that if t evaluates to
inl(x), then this term returns t1, otherwise if it evaluates to
inr(y), then t2 is returned.
The LambdaProlog construct demands some explanation4:
case is just a normal user-defined data constructor that
takes three parameters. The first parameter, the variable T,
stands for the term that decides, what case arm should be taken. The
other two parameters represent the case arms. They are actually
functions in LambdaProlog. E.g. if the variable T matches some
pattern inl v, then the left case arm will be chosen. If the
function representing this case arm is e.g. (x\ x), which is
the anonymous5 identity function in LambdaProlog notation,
then x will be bound to v in the function body. The result
of the case statement would be v in this example.
| | Text notation | Representation in implementation |
| type | τ1 + τ2 | TP1 ++ TP2 |
| value | inl(c) | inl C |
| value | inr(c) | inr C |
2.1.4 Function type
Function values of type τ1 → τ2 have the form λ
xτ1.t, where t is a term of type τ2, which may contain
variable x of type τ1. When types are clear from the context,
we will omit the type of the variable.
Given terms:
We have:
Terms as described above represent function application. In the
expression lam F in the table below, the variable F stands
for a function in LambdaProlog notation again.
| | Text notation | Representation in implementation |
| type | τ1 → τ2 | TP1 --> TP2 |
| value | λ x . t | lam F |
| term | t1 t2 | app T1 T2 |
2.1.5 Type of recursively defined functions
Recursive functions are typed similarly to non-recursive ones. Given term
t : τ2, which may contain variables x : τ1, we can form
(types may be omitted in the text):
| rec fτ1→τ2.(λ xτ1.t) : τ1→τ2 |
Representation in the implementation:
rec F
Function application works the same as for non-recursive functions.
This time the function F in LambdaProlog representation takes
two parameters as, for example, in (f\x\ app f x). The reason is
that we need a way to refer to the name of the recursive function in its
function body — otherwise we would not be able to call it recursively.
It should be noted that there are other ways to specify recursive
functions, as is done, for example, in []: there the
recursive definition does not take a parameter, which means that we can
create arbitrary recursive values (e.g. cyclic lists), too.
However, this generality comes with some problems in strict languages
(not so in call-by-name ones). For example, it is not clear what the value
should mean. It could be of any type, but if we want to access
it, it is not defined! This could lead to looping programs or, even worse,
to crashing ones in real implementations, where the runtime system would
just try to access an uninitialised memory location. Lazy languages do
not suffer from the latter problem, because they evaluate all expressions
lazily: the runtime system would not assume initialised locations and
would either just keep looping trying to evaluate this example or (in
some other cases) just delay evaluation of recursive subexpressions.
On the other hand, if we allow for a list constructor ::, the following
could be interpreted as the infinite list of ones:
Some strict languages (e.g. OCaml) allow such forms and impose
restrictions on the right-hand side of definitions that prevent
ill-formed cases. In practice it seems to be of hardly any use to allow
such “infinite” datastructures in strict languages6. It might indeed make some termination proofs
more difficult, since inductively defined datatypes could then have
“infinitely large” values, which does not fit well to their usual
properties (they allow inductive proofs).
Therefore, we adopt the style that SML goes and disallow fully general
recursive definitions completely. In other terms: evaluation always has
to be triggered explicitly by a function application, thus preventing
any problems with definiteness.
2.1.6 Recursive types
Recursive types are needed to specify inductively defined datastructures
such as lists, binary trees, etc., or co-inductive ones like e.g.
streams.
Here, for example, a way to encode lists in our language (text
representation):
| nil | ≡ | abs(inl()) |
| cons(n, l) | ≡ | abs(inr(n,l)) |
|
’abs’ abstracts the type of the list representation and yields a
recursive type. ’inl’ just takes the unit-value as parameter, which
indicates that we cannot get any more information from it (it stands
for the empty list ’nil’). ’inr’ takes a pair of values: the first one
stands for the contents of a list element, the second one for the tail
(rest) of the list7. If we want to implement a function on lists, we
would have to use the keyword ’rep’ on list values to “know” whether
they represent empty lists ’inl()’ or some cons-element ’inr(value,
rest)’.
It is important to note that using simple sum types together with
pairs and the unit type, we have the potential to specify isomorphic
representations for all kinds of recursive datatypes! This means
that it is not necessary to come up with a more elaborate implementation
of recursive types: it is not too difficult to translate any kind of
existing recursive type with arbitrary data constructors into this
representation and back again.
| | Text notation | Representation in implementation |
| type | µX.τ | mu TF |
| value | abs(c) | abs_rtp C |
| term | rep(t) | rep_rtp T |
The variable TF stands for a function from types to types. As was
the case with recursive functions, we need this to bind the recursive
type within the body of its definition.
2.2 Abstract syntax
Having given an overview of the language, we can now present the full
abstract syntax that will be used throughout the text (please refer to
the informal specification in section 2.1 to learn
about the representation used in the implementation).
| t | ::= | xτ |
| | | | * |
| | | | (t1, t2) |
| | | | fst(t) |
| | | | snd(t) |
| | | | λ xτ.t |
| | | | t1 t2 |
| | | | inl(t) |
| | | | inr(t) |
| | | | case t of inl(x).t1, inr(y).t2 |
| | | | abs(t) |
| | | | rep(t) |
| | | | rec fτ1→τ2.(λ xτ1.t) |
|
| Table 2.1: Abstract syntax |
2.3 Formal typing rules
The formal typing rules are to be read as follows: each rule consists
of premises and a conclusion, the premises being written
above and the conclusion below the solid line. To prove that a term is
well-typed, we have to derive this using the rules until all derivations
end in rules that do not require any premise to be true, these last
rules being called axioms.
| t1 : τ1 t2 : τ2 |
|
| (t1, t2) : τ1 * τ2 |
|
| |
| |
|
| | t1 : τ1 → τ2 t2 : τ1 |
|
| t1 t2 : τ2 |
|
| t1 : τ1 |
|
| inl(t1) : τ1 + τ2 |
|
| | t2 : τ2 |
|
| inr(t2) : τ1 + τ2 |
|
| t : τ1 + τ2 t1 : τ t2 : τ |
|
| case t of inl(x).t1, inr(y).t2 : τ |
|
| t : τ[µX.τ/X] |
|
| abs(t) : µX.τ |
|
| | t : µX.τ |
|
| rep(t) : τ[µX.τ/X] |
|
| t : τ2 |
|
| rec fτ1 → τ2.(λ xτ1.t) : τ1 → τ2 |
|
| Table 2.2: Formal typing rules |
Attentive eyes may have spotted the interesting pattern of the typing
rules for function application and abstraction: they look the same
as the logic rules for implication introduction and elimination
in natural deduction proof systems. This important relation, known
as Curry-Howard isomorphism, relates proofs and beta-reduction
(i.e. “evaluation”) of typed lambda terms, which turns out to be a
very powerful tool in both computer science and logic.
2.4 Operational semantics
There are several ways to specify evaluation of programs, the operational
semantics of a language. However, the structural operational
semantics has the advantage of being syntax-directed. This
ensures that it fills its purpose of providing for a strict guideline of
implementation: it directly associates evaluation steps with each piece
of abstract syntax in a formal way. Before this kinds of semantics became
more widespread in use, it was common practice to specify the operational
semantics by providing an abstract machine that interprets it. This is,
of course, not so rigorous an approach from a formal point of view.
We will see in chapter 6 that our system
implements both “normal” evaluation rules and partial evaluation as
well. The declarative reading of the evaluation rules in LambdaProlog
corresponds exactly to the rules we are about to specify: this makes it
trivial to prove our implementation correct and demonstrates the very
high level of programming achievable in LambdaProlog, making it most
suitable for such tasks.
Before presenting the evaluation rules, we will give a short explanation
of the meaning of values.
Some terms that do not have free variables in them (i.e. they are
closed) and if they are well-typed, correspond to basic elements
of their types. In other terms: they are values.
| c | ::= | * |
| | | | (c1, c2) |
| | | | λ xτ.t iff the whole lambda term is closed |
| | | | inl(c) |
| | | | inr(c) |
| | | | abs(c) |
|
* is the only value of type unit. The pair (c1, c2) is
a value when its type is τ1 * τ2 and when c1 has type
τ1 and c2 has type τ2. Any closed and well-typed lambda
term λ xτ.t is a value, as are inl(c1) and inr(c2)
when their type is τ1 + τ2 and if c1 has type τ1
and c2 has type τ2. A term abs(c) is a value if c is a
value of type τ[µX.τ/X].
Terms which are values have the property that evaluating them does not
change them: they are self-describing.
It is worth noting that in our call-by-value language variables always
stand for values in the sense that their meaning is always already
computed before one can access it: e.g. variables standing for function
parameters are already computed before the function body in which they
are bound is executed. This property is very important to know in program
transformation: it guarantees that accessing variables will not cause
any side effects, be it non-termination or others. We will come back to
this in chapter 4.
2.4.2 Structural operational semantics
This subsection will present all the derivation rules to evaluate
programs of the language. They are given in the form of structural
operational semantics, which is sometimes called natural
semantics of the language. Structural operational semantics rigorously
defines each computational step to be performed after another so that an
interpreter can be implemented from the specification in a straightforward
manner.
Results derivable from the structural operational semantics include
(the arrow → is to be read as “derives to”):
-
- Type consistency If t : τ is closed and t → c
then c : τ.
- Determinacy For any closed t, there is at most one c
such that t → c.
Proofs can be obtained by induction on the structure of derivations.
Evaluation rules
As in the subsection on typing rules, the derivation rules of
the operational semantics consist of premises and a conclusion. The
elements of the rules specify evaluation steps. To evaluate a term on
the left side of an evaluation arrow, we first have to derive (evaluate)
all elements of the premise above it. Doing this may bind results of
evaluation to variables, which may be used on the right-hand side of
the conclusion. The right-hand side of the derivation is always a value!
It can happen that the derivation tree would be infinite in size,
that we never meet a final rule during derivation: in this case we
say that the term does not converge or that evaluation does not
terminate. It is possible to prove equivalence of programs by
structural induction or (in more difficult cases) by showing that they
have equivalent derivation trees (induction on derivations). This can be
quite tedious to do in this kind of semantics because of the low level
(computation steps) it addresses. Denotational semantics, which
we will explain in section 2.5, is a more
elegant way of establishing equivalences between programs.
| t1 → c1 t2 → c2 |
|
| (t1, t2) → (c1, c2) |
|
| |
| |
|
| | t1 → λ x.t t2 → c1 t[c1/x] → c |
|
| t1 t2 → c |
|
| t → inl(c1) t1[c1/x] → c |
|
| case t of inl(x).t1, inr(y).t2 → c |
|
| t → inr(c1) t2[c1/y] → c |
|
| case t of inl(x).t1, inr(y).t2 → c |
|
|
| rec f.(λ x.t) → λ x.(t[rec f.(λ x.t)/f]) |
|
| Table 2.4: Structural operational semantics |
2.5 Denotational semantics
Denotational semantics provides for a framework in which program
equivalences can be established using mathematical (equational)
reasoning. The building blocks of this approach can be found in
domain theory, which is concerned with complete partial orders
(domains). In short, denotational semantics maps syntactic constructs of
a language to least fixed points8 in
domains, the elements of the domain representing different meanings. Since
there can only be exactly one least fixed point (it is defined by its
“leastness”), denotational semantics allows us to unambiguously assign
meaning to computer programs.
A very important requirement for a denotational semantics is that it
agree with the operational semantics on observations of interest. This
means that if a term converges under the operational semantics (=
evaluation terminates), it also does so under the denotational one
and vice versa. Of course, the resulting value should be identical in
both cases. Conversely, if evaluation does not terminate, this should
be reflected in both semantics. Only in the case that the denotational
and operational semantics fully agree can we draw sound conclusions
about program equivalences. It should be noted that this does not mean
that denotational semantics always allows us to establish (operational)
program equivalences in the general case: though it is sound,
it is not complete in this respect.
The existence of recursive types in our language requires us to make use
of a special form of domains, namely so-called Scott domains,
also known as information systems. The theory behind them is
quite heavy, and an introductory description alone would exceed the
scope of this work9. It suffices for our purposes to point out the
intention, which is to find least solutions in recursive domain
equations.
It can be shown that information systems are indeed domains containing
a least element. All the interesting types of our language, like product
types, sum types, function types and, of course, recursive types can be
brought into this framework, which allows us to specify an unambiguous
denotational semantics for our language.
Readers who are interested in e.g. proving transformations correct
are well-advised to make use of this powerful representation for the
semantics of formal languages.
Because it would be necessary to explain a large part of the formalism
behind the somewhat complex structure of information systems, we
refrain from giving a full specification here. It should be pointed out
another time that both the examples of this work and the implementation of
the system follow exactly the formal specification of a strict language
with recursive types of section 13.1 in [], to
which the interested reader may refer.
Copyright © 2000 Markus Mottl ⟨markus.mottl@gmail.com⟩