Previous Up Next

Chapter 5  Controlling Transformations

Not only do we want to obtain efficient programs by transformation: for practical purposes it is necessary to guide the application of transformations in such a way that the transformation process terminates in acceptable time.

First of all, it is important to note that this has nothing to do with correctness of transformations: either the preconditions for a correct transformation are met and it could be applied — or not. Control of transformations is necessary when the preconditions for several of them can become true at the same time, in which case we say that we have choice between transformations.

Unfortunately, this freedom of choice is rather a burden here: we not only may choose, we must choose! The problem is that once we have decided for a specific transformation and apply it, we may lose the opportunity to apply any of the others — and might even get further away from the goal, an optimal transformation, if our choice was bad. In such cases we would have to take back some or even all of our choices and try others, which means that the transformation effort so far was wasted.

This situation is especially bad if we consider that after each transformation we again have several to choose from. This clearly implies that the number of choices available to us can rise exponentially, which can easily make the transformation process intractable for anything but toy problems.

5.1  Reducing the number of choice points

The first kind of technique we can try to make search more tractable is to eliminate choices that are sure to lead us away from the goal. For our purposes of program transformation it is noteworthy that there are potentially infinitely many programs that have the same meaning. It would be helpful to only have transformation rules that handle some of these cases rather than all and to use specific rules which do not require search that can transform any of the many equivalent variants into ones that can be handled. This might allow us to reduce the number of transformation rules we need.

5.1.1  Partial evaluation

One very useful approach is using a partial evaluator to eliminate as many statically computable computations as possible. Clearly, this redundancy should be eliminated in any case, and partial evaluation transforms many equivalent programs into a common representation, thus also removing the need for having to cope with all of them in other transformations. Because many rules can often be applied to different parts of a program, the partial evaluator is likely to reduce the number of some of these cases.

But does partial evaluation not require any search? Although it requires pattern matching in our implementation, which, of course, needs computation time, once a rule of the partial evaluator matches, it is guaranteed to be the only rule that could match in this place1. This means that there is only one possible choice, which implies that there is no true search involved.

One might raise the argument that there is choice, for example, when partially evaluating pairs: we could choose one component only. This, however, is dangerous: the partial evaluator is supposed to preserve the effect behaviour, the latter depending on evaluation order. Therefore, we must never abuse the partial evaluator to work on subterms in such a way that effect behaviour may be changed. This means that we will always have one node in the abstract syntax tree for which we know that it is safe to apply partial evaluation to it (in the worst case it is the root node). Taking one node beneath it would be unsound — effects might be lost. Taking one above it does not improve the performance of the partial evaluator, rather degrade it: it would have to partially evaluate a larger tree unnecessarily. Therefore, we always only have one rational choice.

One could imagine that there are other transformations that have this property of not introducing alternatives when applied. Because such transformations do not induce nonlinear time behaviour due to the lack of search whereas others do, it is very cheap to apply them: they will only impose a constant factor on the execution time of the transformation system.

Thus, it is advisable to run the partial evaluator every time when statically computable parts could be present. This is definitely a good idea when the initial program is given, but also after each other transformation that might reveal new information to partial evaluation. For example, the partial evaluator never applies recursive functions whose termination it cannot prove2. Other transformations, e.g. unfolding of recursive functions, could allow the partial evaluator to proceed again and optimise the newly transformed part.

Partial evaluation has another fine property: it uncovers useful information that is hidden in complicated constructs. For example, it can find out which parts of the program do not necessarily terminate or yield side effects3. This information could again be used to remove superfluous transformation alternatives, for example, by eliminating duplicated computations (common sub-expression elimination), which is also implemented in our system.

5.1.2  An advanced technique for guiding search — rippling

Viewing transformations as rewrite rules in general, an interesting method for guiding search is rippling4. Its purpose is to maintain a specific pattern in a term during transformations while eliminating other subterms or at least moving them to designated positions where they do not disturb further transformations.

The rationale behind this technique is that it sometimes happens that the application of a promising rule is obstructed by terms which disallow the pattern of the rule to match. By protecting the pattern using special annotations and by only allowing application of rules5 that carry similar annotations which have to match, too, it is guaranteed that only rules that do not destroy the target pattern can be applied.

Due to correctness requirements imposed on the annotations that guarantee a strictly decreasing measure on a well-founded order during transformation steps, the transformation process either succeeds in the target configuration or it ends in a state where no rule is applicable any more. This means that the transformation process is guaranteed to terminate at some point of time. Because the annotations only prevent some rules from matching, but otherwise do not change the way they work, the technique is sound, too.

This heuristic was developed to make inductive proofs more tractable. Inductive proofs play a very important role in program analysis, and one can easily imagine advanced transformation systems that exploit program equivalences which require such proofs.

5.2  Giving priority to choices

Another way to improve efficiency is to give priority to transformations based on some heuristic measure for them. Such a measure may be the size of certain subprograms that we want to transform or the “distance” between interesting subterms in the abstract syntax tree. There should, of course, be some kind of statistical motivation behind such measures that makes it more likely to apply the right transformations to the program first.

For example, when having the choice between unfolding any of a number of recursive function calls, the question may be, which of them is more likely to yield a program which we can further simplify? One such heuristic may be to unfold the outermost recursive function first: because it encompasses a larger part of the program, the partial evaluator may have more opportunities to optimise, because the function parameter might get substituted in many more positions.

Another heuristic would be to prefer recursive function calls that take a value of sum type as argument: this passes information (choice) from terms higher in the tree to deeper ones, possibly giving rise to more simplifications again. We could even define the following measure for the parameter if it represents a nested datastructure containing sum types: it is more likely that a value of sum type which is higher in the datastructure is consulted than one very deep within the datastructure. Thus, trying function applications with such arguments earlier may be beneficial. E.g.:

(*, (inl(v), *))

would be less likely to yield information to the function body than

(inl(v), *)

The rationale behind this heuristic is that to reach a value that is deep within a datastructure we need more code (functions that destruct the datastructure up to the point of the value). Because we want to perform as little work as possible, we might prefer to transform a program part where information is consumed at an earlier step of execution.

5.3  Control and completeness

One aspect that should not be ignored, though it is not relevant to efficiency of the transformation system, is the question of completeness. The search for suitable transformations may in many cases be infinite. What should the system do if it has not found the goal of a transformation method after many iterations? The control part of the system will therefore have to put a reasonable limit on the depth of search so as not to try transformations for an indefinite amount of time.

5.4  Examples of control heuristics

This section will present two control strategies: the classic one by Burstall and Darlington and a still untried one that we believe to be useful in combination with our partial evaluator. The reader might be interested in implementing them using our framework as basis.

5.4.1  A classic control strategy

In their initial paper, Burstall and Darlington propose a useful control strategy that may be applied together with their fold/unfold method. The typical application of the rules of this strategy looks as follows:

  1. Use the definition rule to create necessary definitions of recursive equations (functions).
  2. Instantiate the equations using the instantiation rule.
  3. For each instantiation unfold repeatedly and at each step:

Unfortunately, in their original system the first two steps require help from the user6. The algebraic replacement and where-abstraction rules could be applied automatically, but the system may need the guidance of the user to find solutions faster (due to possibly many alternatives).

Therefore, it may be useful to allow (but not require) user interaction with the system to control rule applications. This might help the user to discover more general principles in guiding the transformation system, which could lead to better automation of the same.

5.4.2  A control strategy with partial evaluation

Since our partial evaluator will never degrade the efficiency of a program and is capable of reducing the number of choice points in the search space as described in this chapter, the first step in this control strategy is to run the partial evaluator on the whole program, followed by the elimination of common computations (sub-expressions). Then we try the following steps:

  1. Find next recursion that takes a sum type as argument.
  2. Choice point:
  3. Unfold the recursive call — this makes it a lambda abstraction.
  4. Partially evaluate program and eliminate redundant computations: this will apply the newly created lambda abstraction to the explicit values of sum type.
  5. Try folding. If this succeeds it depends on what we want to do: we may continue improving the program with this strategy or we could succeed with the transformed one. Otherwise we have two choices again:

If we restrict the search space by a depth limit, this strategy will try out all combinations of unfolding recursion up to this depth, possibly specialising function parameters as required. This may also be combined with other rules if this seems useful (e.g. the algebraic replacement rule, some kind of tupling strategy, etc.). The use of the partial evaluator in this strategy guarantees that we will only try folding at points where folding has a chance to succeed. Without it, folding might fail even if the functions only differ by one statically computable expression in a critical place, which can happen all too easily. Additionally, partial evaluation may eliminate code parts that are not needed any more. If these contain calls to recursive functions, our search space will be made smaller without losing any existing goals.


1
We will describe the rules of the partial evaluator in more detail in chapter 6 and 7.
2
We have noted elsewhere that the unfolding rule is a special case of partial evaluation. However, so as not to cause the partial evaluator to loop when handling recursive functions, we must handle this case of unfolding recursive functions separately.
3
See chapter 6 for details on how the partial evaluator achieves this.
4
This heuristic is presented in [].
5
So-called wave-rules.
6
Discovery of “Eureka”-steps.

Copyright   ©  2000 Markus Mottl ⟨markus.mottl@gmail.com⟩
Previous Up Next