This chapter will give an introduction to the techniques we used to implement a framework for doing program transformations. We start out by describing properties of the language LambdaProlog, which is a very recent development and highly suitable for tackling problems in the field of program transformation. This will be followed by interesting design techniques that allow us to achieve a high degree of declarativity and impose an easily extensible structure when implementing such a system. An overview of the components implemented in the system is given in appendix B.
The recently developed logic programming language LambdaProlog is an extension of its cousin Prolog: whereas Prolog is a typical example of a logic language that implements first-order logic in terms of Horn clauses, LambdaProlog takes us to a higher level: higher-order logic. This allows us to reason about even predicates and functions, not only first-order terms. This capability already indicates its usefulness for the problem at hand, because in functional languages programs correspond to functions. Hence, implementing program transformation systems is greatly simplified by such powerful features1. Here is a short overview of the extensions and advantages which LambdaProlog2 has over traditional Prolog:
pi followed by the
name of the variable and a backslash. The keyword sigma is used
to scope existential quantification.The language is otherwise similar to Prolog: first-order unification is a special case of higher-order unification, backtracking gives non-deterministic choice between solutions, it has negation-as-failure and cuts5.
LambdaProlog has its theoretic foundations in an extended version of so-called higher-order hereditary Harrop formulas6. Semantically speaking, it is a true super-set of first-order Horn clauses with additional higher-order capabilities, which we will now describe in more detail:
Since they are part of the language design, we have a very natural way of expressing lambda abstractions. For example:
(x\ pair x x) |
This is an anonymous function that takes a parameter (called x
inside the function body) and builds a term out of it (a pair in this
case). We can apply such functions to arguments by juxtaposition:
(x\ pair x x) (y\ inl y) |
which evaluates by β-reduction to:
pair (y\ inl y) (y\ inl y) |
As we can see, even higher-order functions are supported directly.
An example of higher-order predicates:
type map (A -> B -> o) -> list A -> list B -> o. map P nil nil. map P (H1::T1) (H2::T2) :- P H1 H2, map P T1 T2. |
This predicate map maps a predicate over a list of elements. This
allows P to be a higher-order predicate. If we consider the
following knowledge base:
type parent string -> string -> o. parent "Albert" "Alan". parent "Albert" "Ann". parent "Berta" "Bill". parent "Berta" "Bridget". |
then the variable Parents in the following piece of code:
mapP parent ["Albert", "Berta"] Parents |
could be instantiated to any of the following:
["Alan", "Bill"] ["Alan", "Bridget"] ["Ann", "Bill"] ["Ann", "Bridget"] |
We will make very intense use of higher-order predicates when we introduce monads as an implementation technique in section 6.2.
This is probably the most powerful aspect of LambdaProlog: it allows instantiating variables that stand for (or contain in substructures) functions and predicates by unifying them with other higher-order terms. To give a short example in which one might be interested when doing program transformation (we use the language representation on which we do transformations):
curry (lam (x\ F (fst x) (snd x)))
(lam (x1\ lam (x2\ F x1 x2))).
|
This is the so-called Curry-relation, which relates functions that take pairs as arguments to ones that are curried, i.e. functions that take one argument and return a function which again consumes the second argument (this allows more flexible use of higher-order functions).
The “magic” happens with variable F: it is a higher-order
variable and stands for any kind of program (function) that takes the
required arguments (in this example arguments that are pairs).
An example application is:
curry X (lam (x1\ lam (x2\ x1))). The answer substitution: X = lam (x\ fst x) |
To get an even better impression of the power of higher-order
unification, here is another example that demonstrates this, and also
universal quantification (the keyword pi introduces a scope for a
universally quantified variable that it receives as argument — x
in this case):
contains T ST :- T = F ST, not (pi x\ F x = T). |
The type of this predicate is A -> B -> o, which means that both
T and ST can have arbitrary type. What the predicate does
is that it checks whether some term T contains a subterm ST
— it does not matter whether we are talking about lists, binary trees
or any other kind of datastructure. Taking a look at the code, we see
that first we declare that some function F maps the subterm to
the term7. However, F could be a trivial function that always
returns T without considering its argument ST. Therefore,
we enforce that F depends on its parameter by declaring that
“Not for all x may (F x) yield T”. This can only
succeed if there is an x such that it appears in T —
in other terms: ST must be contained in it as required.
It should be noted that ST need not be instantiated — it can be
a variable. This would allow us to find out about all subterms contained
in T (ST would be instantiated to them by unification
and backtracking).
We take another example from the implementation of our system:
infer_tp (lam F) (TP1 –> TP2) :- pi x\ infer_tp x TP1 => infer_tp (F x) TP2. |
The purpose of predicate infer_tp is to infer the type of terms in
our language. Terms are given as first argument, and the second argument
of the predicate will be instantiated to the type of the term. The rule
given here infers the type of lambda abstractions. What the body of this
rule is saying is that “For all x, inferring type TP1 for
x implies that the return type of (F x) can be inferred as
TP2”. This means that if the inference system requires the type of
x in the function body of F during inference, it will get
TP1. This type may, of course, be further constrained depending
on the way x is used in the function (e.g. as a pair). If its
use conflicts with the type as inferred so far, the predicate will fail,
thus indicating a type error. We see again that LambdaProlog allows us to
specify important properties of programs in a very clear and concise way.
As we have described in chapter 4, partial evaluation in strict languages is faced with tricky correctness issues concerning non-termination and other side effects. Since the core of our framework is a partial evaluator, it is necessary to give a detailed treatment of the design technique applied. We will see that this specific technique achieves the following:
Many different designs had been tried to come up with a general and clear implementation of this component, but the tricky correctness issues concerning side effects made this fairly difficult. It required quite some time until the relations between several concepts that might be applicable became clear. Although knowledge of work concerning handling of side effects in purely functional languages8 was available from the start of the project, the linking idea appeared after having read []: this paper shows that even logic languages as expressive as LambdaProlog can strongly benefit from a specific construct whose introduction to functional languages has opened new ways of programming: monads.
Monads arose in category theory and their usefulness in structuring denotational semantics was discovered by Eugenio Moggi9. Wadler took up this idea and applied it to structuring functional programs. Other people had also noted that monads are a powerful tool for encapsulating the handling of side effects (e.g. exceptions) in a safe and referentially transparent way in functional languages.
Monads allow us to reason about computations in a very abstract way. This
is exactly what we need for our purposes, because computations are the
food of partial evaluation. The combination of the ideas presented
in the works above lead to an implementation of a monadic approach to
partial evaluation. We will not give a full treatment of monads as this
is already presented in the papers mentioned. In short, the minimum
specification of a monad is a triple of a unary monad type constructor,
a unit-operator and a bind-operator. For example, a monad type constructor
m would impose the following types on the operators:
kind m type -> type. type unitM A -> m A -> o. type bindM m A -> (A -> m B -> o) -> m B -> o. |
This is the abstract view of monads: we operate on it using the given operators. The unit-operator lifts some value into a monad, the bind-operator binds a function (in LambdaProlog: predicate) to a value that it gets from the monad and returns a new monad. The monad, however, has a (hidden) internal representation: this representation may carry additional information about the values it controls. This can be, for example, the number of applications of the bind-operator, errors that happened during some monad operation or (as in our case) a classification of terms into values, terminating computations, etc. One can imagine many applications (see []). Because we operate on the monad using the operators only, our code becomes independent of the monad representation: we could add further details to it without breaking code10 outside, for example to support more kinds of side effects in case the language is being extended.
Our specific monad handles computational effects “behind the scenes”
(internally). It can do so, because every time a term is put under
its control, it will be told what kind of term it is: a value, some
(irreducible) computation or a possibly non-terminating computation.
For this purpose we use more than one unit-operator, which does not
violate the properties of monads (effM stands for “effect
monad“):
unit_value_effM — when the result of partial evaluation is a
value.
unit_comp_effM — in the case of a terminating computation.
unit_mnont_effM — when the result of partial evaluation
is a computation that maybe causes non-termination (“mnont”). This is
generally undecidable, hence the “maybe”.
They all have the type A -> effM A -> o as required by
monads. This is interesting to note — the monad does not have the
slightest “idea” of how terms are implemented in our language: hence
the general type A which the monad type constructor accepts as
argument. This would allow us to reuse the implementation of the monad
to handle effects for any kind of representation of a strict functional
language: as we see, an indeed very abstract view11.
Instead of partially evaluating terms to terms, we partially evaluate
terms within the monad internally. But how can we get at terms that result
from computation in the monad when they are required as subterms in some
simplification rule? This can be done with the monadic bind-operator: it
takes a monad as first and a (higher-order!) predicate as second argument
which “binds” to whatever term the monad finds suitable to “leak”
to this predicate. The type of the bind-operator bind_effM is:
effM A -> (A -> effM B -> o) -> effM B -> o |
So the bound predicate must instantiate its second argument to a new monad. This means that computations that can be statically evaluated all happen “within” the monad. Here is an example rule of the partial evaluator. Its type is, of course, specific to terms (type ’tm’), but the monad can handle all types as we mentioned further above (’tm’ is a specific instance):
type part_evalM tm -> tm effM -> o. part_evalM (fst T) Res :- part_evalM T M, bind_effM M simplify_fst Res. |
This predicate takes a term as first argument and returns a monad in the
second one which contains the partially evaluated computation. This
specific rule handles partial evaluation of the pair accessor
fst. To do this, we first partially evaluate its parameter within a
monad M, and then we “bind” a simplification function to whatever
the monad allows us to see: this could be a normal pair (which we could
simplify then), but also just a universally quantified variable that
stands for all possible terms that could be in this place. This would
indicate that either there was no computation that could be simplified as
e.g. in lam (x\ fst x), where no simplification is possible. Or it
could be the case that the monad “lifted away” a computation that might
cause a side effect or represents another computation (not a value). If
it did not do so, a simplification rule might accidently remove this side
effect or “multiply” a computation, which can be costly12. The simplification
rules for fst look as follows:
type simplify_fst tm -> tm effM -> o. simplify_fst (pair V1 _V2) Res :- !, unit_value_effM V1 Res. simplify_fst V Res :- unit_comp_effM (fst V) Res. |
Either the value to be simplified is a pair, in which case we just drop
its second argument (it is a value, of course, as guaranteed by the
monad implementation) and only return the first one, advising the monad
that this is a value. Or the term is something else (only a universally
quantified variable would make sense here). Then we cannot simplify,
so return the computation that takes the first element of the pair:
fst V. The monad learns that this is a computation through the
corresponding unit-operator unit_comp_effM.
The implementation of the bind-operator for the monad acts accordingly depending on the kind of term it gets. If it is a value, the term will be passed unchanged to the bound predicate. Otherwise, the terminating or maybe non-terminating computation is lifted out of its place and a universally quantified variable is left there instead. If it was a “normal” computation and after the bound predicate finishes simplification, the monad would check whether the universally quantified variable is still present in the simplified program: if this is not the case (a simplification rule must have removed it then), the computation is not needed to compute the result of the program — the monad will discard it. Otherwise or when we have a potentially diverging term, the monad will remember the computations in the right order so that no effect is lost.
Of course, at some point of time we will want to get at the result of
partial evaluation. For this we provide a show_effM-function
that reveals the term controlled by the monad: it translates the
internal representation of the monad, its constructors used for lifting
out computations with ’let’-constructs13 as we described in
chapter 4.
Here is a final example of a very crucial component of the partial evaluator that demonstrates within a couple of lines only the elegance of the monadic approach and nearly all of the power of LambdaProlog (explicit universal and existential quantification, intuitionistic implication, higher-order functions and higher-order unification):
part_eval_fun F1 F2 :-
pi x\ sigma Mx\
unit_value_effM x Mx,
part_evalM x Mx =>
sigma M\
part_evalM (F1 x) M,
show_effM M (F2 x).
|
The purpose of this predicate is to partially evaluate a function from
terms to terms (tm -> tm), F1 being the function to be
improved, F2 the result. We already know that in our strict
language all variables stand for values. Thus, no matter to which term
we apply the function to be partially evaluated, this term must be a
value in its body. In LambdaProlog this is expressed as follows:
x” (pi x\) there “exists a monad
Mx” (sigma Mx\) such that this x can be lifted
into the monad as a value (in LambdaProlog: unit_value_effM x Mx).x in monad Mx implies
(part_evalM x Mx =>) that there is another monad M
(sigma M\) such that we can partially evaluate F1 applied
to x within this monad M (part_evalM (F1 x) M).M results in another term: this last term
can be gained by applying the partially evaluated function F2
to x (in LambdaProlog: show_effM M (F2 x)). In this last
part the higher-order variable F2 (it stands for a function)
is inferred by higher-order unification.As we have hopefully managed to demonstrate, the advanced features of LambdaProlog combined with the monadic approach result in a most declarative implementation of what it means to partially evaluate a function that is strict in its argument. Assuming that the monad implementation indeed behaves as we have explained further above, this piece of code could be proved correct easily: its declarative reading should actually correspond to the specification itself.
Currently, the monad only handles non-termination and “irreducible computations”14. It would be very easy, however, to extend its capabilities to handling all kinds of side effects, such as I/O and exceptions. We would not have to change a single rule of the partial evaluator other than the one for function application (which is the only place where effects can be triggered). The monad would have to be extended with more unit-operators that indicate the kind of effect, and the implementation of the binding-operators would need to handle those effects accordingly. This pinpoints the place where extensions can happen and gives us a very modular approach to structuring partial evaluation. The big advantages of the monad approach can also be seen in code size: the partial evaluator only requires somewhere between 150 and 200 lines of code15, effect handling in the monad itself about 30. The level of declarativity is extremely high, and it therefore should not be too difficult to give a proof of correctness of the partial evaluator by just taking the implementation as guideline.
Another aspect of extensibility of our partial evaluator, though not
directly related to monads, is that we can arbitrarily extend its power
to infer termination behaviour: the partial evaluator makes use of a
predicate funcall_terminates, which takes a term as first argument
that stands for a function and another term as second argument which is
the parameter that should be applied to the function. If calling the
predicate succeeds, this should mean that calling the given function
on the given parameter terminates. Currently, we do not provide any
implementation for this predicate, but users who want to try out advanced
tools for termination analysis can “hook” them into our system by just
using this interface. LambdaProlog allows this very conveniently due to
having the property that predicates can be implemented incrementally,
spanning their implementation over arbitrary modules.
It was only after the implementation of the partial evaluator that it became evident that other researchers had already given a full formal treatment of partial evaluation under monads. The interested reader can find such formalisations in [] and []. This should be an indication that our approach has a strong formal basis.
Most advantages of having a partial evaluator that identifies various sorts of computation have already been described in our chapter on transformations, but it is interesting to note that the way in which computations are lifted in this implementation with ’let’-constructs has other advantages, too. As, for example, [] explains, some ways of using ’let’ make it easier for compilers to generate fast code. Indeed, programs transformed by our partial evaluator end up in this form automatically: the computation steps, all of which are lifted out of place, follow each other in a linear fashion.
As a final remark on the monadic approach, it should be pointed out that this is a generic technique for structuring declarative code: it can potentially be used for many kinds of problems, e.g. where parts of the code need the guarantee of certain invariants, which the monad can enforce. It usually does this in such a way that changing the way invariants are maintained does not interfere at all with the rest of the code. This seems to make this approach very suitable for most kinds of transformation and rewrite systems in general. One should not forget, however, that monads require higher-order capabilities (higher-order functions and/or higher-order predicates) as they are provided in most functional and in higher-order logic languages. Their safe application usually requires a static type system that allows hiding of implementation details.
There is another implementation of a partial evaluator available in our framework which maintains type annotations during partial evaluation. The rationale behind it is that fold/unfold techniques and possibly others require type information to work correctly and/or efficiently. For example, when we want to apply the instantiation rule to specialise a call to a function, we must make sure that this call indeed takes an argument that contains a sum type in its type signature. Unfortunately, it is very costly to infer this information whenever it is needed: we would have to infer types for the whole program, which is unacceptable. Therefore, we convert terms to another representation which contains type annotations. This is a one-time effort only. The alternative partial evaluator respects these annotations and therefore allows other transformations to find out the exact type of every node in the abstract syntax tree without having to use type inference. Of course, other transformations must respect the type annotations, too.