Previous Up Next

Chapter 1  Introduction

Many years ago I watched a famous ventriloquist as he “talked” to his puppet. “I can calculate faster than anybody else in the world!”, it said. “Well, then how much is five times five?”, asked the master. Blindingly fast the figure answered: “23!”. The master complains: “But that was wrong!”. The puppet replies in a superior tone: “Yes, but fast...”.

Somewhat subconsciously we accept the fact that correctness and efficiency of computation are two conflicting goals. The tension between them has never been as obvious as in our time, where the correct and efficient handling of information, of doing computation, has become crucial for everyday life.

If we want to book a flight, we expect that information displayed by a booking computer is correct and that we receive the answer immediately. The engineer who simulated the aerodynamic behaviour of our plane on a computer must have been able to trust the results for the sake of our safety, while his company might have had interest in efficient computer programs for this purpose to reduce development time and costs.

Unfortunately, experience shows that we often do a bad job in implementing information systems. Sometimes they just fail to do their job correctly, which can range from e.g. exploding space rockets (Ariane 5) to crashing of widespread operating systems for the average home user. On the other hand, inefficient computer programs can cause international companies to go bankrupt, as, for example, was the case for a large vendor of pharmaceutical products, who sued a consulting company for having installed a standard enterprise solution that just could not cope with their high data traffic.

How does it happen that it is so difficult to find both correct and efficient solutions to our problems?

If there are several correct solutions to a problem, it is a natural assumption that the simplest one will be found first, or that there is at least a bias towards finding simpler solutions before complex ones. Though the simplest one may indeed be most efficient, it is much more likely that there are more efficient complex ones — simply because complexity means that we have to choose them from a much larger set, which makes it more probable to find efficient solutions there.

The reason is that given no information at all, which element from a set (of solutions) has a specific property (it is the “best”), we have to assign equal probabilities to them so as not to introduce an unjustified bias.

Choosing possibly more sophisticated solutions from a large space is not only more work. In addition, it raises the probability that we accept an incorrect one: “To human is err.”

A great deal of time has been invested in the development of tools that support us in finding solutions or in improving already existing ones automatically. Modern approaches generally use the powerful concept of formal methods to provide for full automation of problem solving. This usually requires formalising the problem domain using formal languages, which results in symbolic representations that are suitable for manipulation by machines.

In fact, modern universal programming languages belong in this group of formal languages, and they claim to be suitable for formalising all computable tasks and functions in general. If we can use them to solve our practical problems, then it is an obvious step to apply their power to the problem at hand: we write programs to develop efficient programs!

Taking up the idea of automating generation of efficient programs, there are basically two applicable techniques. We can try to:

The first task is surely the more challenging one. A system capable of it would not have any hint at all where to start searching. While the second technique looks less powerful, it is still of great interest to the software industry: human programmers often find it nearly as easy (or should we say: difficult?) to implement simple, though possibly inefficient programs as writing correct formal problem specifications in very abstract specification languages. Taking up a human’s initial hint, a transformation system could arrive at good solutions much faster. In our work we will focus on this second technique.

1.1  Purpose of program transformations

What exactly is a program transformation system supposed to do? Although producing more efficient versions of given programs is most likely the primary intention of most users, the most important aspect to the developer of the transformation system itself is something else: correctness. Because it can be extremely difficult to verify the result of the transformation process, we have to design the system in such a way that the meaning of transformed programs is provably the expected one.

It actually only rarely happens that transformations are used to change the meaning of programs, for example to correct previous misbehaviour. This was more frequently the case during the transition period before the year 2000: some companies developed automated tools to transform programs containing insidious Y2K-bugs — with varying success. Though the behaviour of these programs was correct up to this time, encoding years using two digits only would have lead to incorrect handling of dates after the change to the new century. Therefore, the (wrong) meaning of the programs had to be changed by transformation to cover the more general case.

Other transformations involve simplification of programs or bringing them into a canonical form which might be easier to analyse. However, the main intention is most often optimisation: the user can focus on quickly and correctly solving the problem and leaves efficiency considerations mostly to the transformation system.

Before everything else, the purpose of optimising transformations is to preserve the meaning of programs1: otherwise the result would be potentially unusable for any purpose. Additionally, it would be very surprising for the user if his program ran slower rather than faster after the transformation, which should be avoided, too.

1.2  Why transform functional programs?

So far, we have not justified the choice to transform functional programs as indicated in the title of the thesis. In fact, functional programming is only one of several different approaches and by far not the most widespread programming paradigm.

Logic Programming

The most general and convenient way to use machines for solving problems using formal languages would be to just tell them, what problem should be solved. We call this way of implementing programs declarative programming. One of its variants, logic programming, is a very general method to achieve this high level of automated problem solving.

This style abstracts from the need to know how to find solutions (control knowledge). Unfortunately, the software industry has a certain reservation against such high level languages. They are often considered as “too mathematical”, “too abstract” or “too inefficient”, which is generally not justified: if we cannot even specify what problem we want to solve, how can we know how to solve it?

The conciseness of declarative programs often exhibits how difficult the problem to be solved really is, whereas less abstract languages hide the problem complexity in much larger amounts of more concrete code (i.e. each single step can be followed more easily). This may be an explanation for psychological inhibitions against declarative languages. Efficiency of logic languages has vastly improved over the last decades, modern implementations being competitive with low-level languages. This should be a strong enough argument against efficiency concerns.

Examples of logic languages are (Lambda)Prolog, Mercury and Oz. The language we will use to implement some example transformations is LambdaProlog. We will explain in chapter 6 which of its features make this a viable choice.

Imperative Programming

The prevalent programming paradigm, which is strongly concerned about the “how”, is imperative programming, where the user tells the computer each action to be done after another. Due to the von Neumann architecture of most modern computers, which lends itself to this style, these languages are considered as leading to most efficient programs. However, knowing “how” to do something is generally much more difficult than knowing “what” to do. Thus, this style is in practice tedious, error-prone and among the main reasons for failures of large software systems. Widespread languages of this kind are C/C++, Java, Fortran, Basic and Cobol.

Functional Programming

One feature that imperative programs lack is that they cannot be easily transformed into equivalent ones. The reason is that the meaning of imperative constructs heavily depends on the state of computation. But the state is only exactly known at runtime, which makes it nearly impossible to apply equational reasoning to parts of programs. The style which addresses this problem is functional programming. Here, a program is treated as a function, which is usually composed out of other functions. Higher-order functions, ones that can be passed as argument or returned as result, are also an important aspect of functional programming and allow implementation of very generic solutions. Type systems of functional languages are generally very powerful and significantly more advanced (securer, more flexible) than those of their imperative competitors.

Functions are a very basic and well understood mathematical concept, and equational reasoning can be applied to them easily. This means that transformation systems can build on a sound basis. This elegant feature also helps humans to reduce the complexity of programs: the evaluation of different parts of the program cannot change the meaning of other parts. If something does not work, the offending part can in most cases be pinpointed much faster. Well-known languages include Haskell and Clean, SML and OCaml, Lisp and Scheme.

Functional programming is usually also considered as declarative programming due to its strong mathematical foundations, even if logic programming surpasses it in generality (functions are a special case of relations (predicates), the latter being the core elements of logic programs). Still, functions have properties that make them especially amenable to transformations that build on equational reasoning.

It is the functional style on which our transformations will focus, and we will explain the semantics of a language of this kind in more detail in chapter 2.

1.3  Difficulties

1.3.1  Semantic complications

As indicated in the last section, functional languages support transformations in a straightforward manner: we can, for example, derive new programs by substituting parameters of function applications in their corresponding function bodies — almost. One problem that appears here is that the correctness of this very important transformation depends on the evaluation strategy of the language. As we will see later, it is of great necessity to specify an unambiguous formal semantics for the language to guarantee correct transformations.

Call-by-name semantics

Most transformation techniques presented in the literature assume free substitutability, which is only possible in functional languages with so-called call-by-name evaluation: this does not evaluate function arguments before applying them to the function, but substitutes the whole unevaluated expression in the function body. This allows terminating evaluation of a larger class of programs.

Although several languages are defined in terms of such a semantics (e.g. Haskell and Clean), there are also some disadvantages associated with this approach:

Call-by-value semantics

The alternative to the semantics above is call-by-value, sometimes referred to as strict evaluation: here, function parameters are always evaluated before they are substituted within the function body. This is usually more efficient; it may, however, lead to non-termination in cases, where call-by-name semantics would yield a result. Otherwise, reasoning about runtime properties becomes tractable, and languages of this type (e.g. OCaml and SML) can more easily have coexisting pure (purely functional) and impure (imperative) features, which may make them a better choice for beginners.

Due to the mentioned shortcomings of call-by-name evaluation, which may be lifted in the future by more research, and because there has not yet been so much work on transformations of strict languages, we will restrict this work to the latter evaluation strategy. This comes at the expense of losing the full power of equational reasoning due to possible impurities and non-termination. We will see in this work that techniques used in purely functional languages to tame impurity of side-effecting (= imperative) code can be very valuable to achieve our goals.

1.3.2  Complexity of search

Besides semantics related problems that complicate the correct application of program transformations and impact effectiveness, an equally important question concerns the efficiency of transformation systems. The complexity of search for more efficient programs depends on several parameters:

Both the first and especially the second parameter can lead to exponential explosion of the size of the search space. To give an example, a program may contain large mathematical expressions. If we try to simplify them by applying transformation rules that exploit mathematical properties like commutativity and associativity in all possible ways, this may require a significant effort.

There is naturally not much we can do to limit the impact of the first parameter other than not transforming parts of the program. Limiting the influence of the second one by removing rules usually makes the system less powerful and may not allow exploitation of all opportunities to improve programs.

The last parameter, however, gives us some control over the search process: if we find out in which cases specific transformations are more suitable than others, maybe even a “sure bet”, then we can structure their application in such a way that we always only consider the most promising ones. Such techniques of reducing the number of choices (the branching factor of the search space) are called heuristics.

We will see that it is necessary to employ heuristics to make search for more efficient programs tractable. Partial evaluation, a technique that can statically evaluate programs (without knowing their input), will play a major role here to reduce the number of alternatives in the search space3.

It is important to point out that the correctness of the transformations is independent of the heuristic: the intention behind heuristics is only to restrict the number of alternatives to the most promising ones or to impose an order on them in which they are tried. They do not add new (potentially unsound) choices.

As is the case for rewrite systems in general, sets of program transformation rules do not necessarily always allow normal forms: this means that there can be programs such that the transformation process does not terminate in a state in which no transformation rule applies to the program. On the other hand, restricting the number of rule applications to prevent this problem can result in a loss of completeness: it may still be possible that the program could be improved by further transformations.

It is often the case that such situations arise in program transformation systems. The consequences of this problem can be weakened by employing search strategies that can be parameterised in terms of the search depth. This allows the user of such systems to decide, on a per-case basis, how much computation time they are willing to sacrifice to achieve a higher degree of completeness with deeper search.

1.4  Goals and structure of this work

The aim of the project is to implement a basic framework in which various transformations can be tried out easily on a language which has a rigorously defined semantics.

The framework contains functionality for handling abstract syntax trees of the language, inferring types (including type checking of polymorphically typed programs), termination analysis, common sub-expression elimination and, most important, a fairly general partial evaluator that can handle higher-order functions while exactly preserving the meaning of programs, including termination behaviour.

The reader might especially benefit from taking a look at the approach taken in the partial evaluator4. It is implemented in monadic style, which makes the program very declarative in nature: short, clear, and easily extensible. It seems that this approach can be used to conveniently structure transformation systems in the general case. Its implementation relies heavily on higher-order features of the relatively young logic programming language LambdaProlog. The thesis will also outline ways to continue implementation of transformations in this specific framework.

In the chapters to follow we will deal with the following aspects of automating functional program transformation:


1
This issue will be discussed in more detail in chapter 4.
2
See [] for further information on advanced implementation of and reasoning about purely functional programs.
3
This will be discussed in more detail in chapter 4.
4
See section 6.2 for more information.

Copyright   ©  2000 Markus Mottl ⟨markus.mottl@gmail.com⟩
Previous Up Next