13.1 Evaluation of bindings
So far we have introduced some basic functional bindings but we glossed over some issues. We covered a minor prettiness issue in the last chapter. In this chapter we expose a major semantic issue!
Consider:
fun myrand : 1 -> int = "rand()" requires header ' ' ; fun mycond : bool * int * int -> int = "$1 ?? $2 : $3"; fun mytwice : int -> int = "$1+$1"; fun mydiv: int * int -> int = "$1/$2";
This code looks innocuous enough, but it hides subtle issues. Consider now:
var x = mytwice (myrand());
and let me ask the question: is {rand()} called once or twice? If we just substitute the C code would resolve to:
int x = myrand() + myrand();
and so it gets called twice, and the result is sure to be even (if it doesn't overflow). This is the behaviour you'd expect from a macro, not a function. Of course you can fix this as so:
var tmp = myrand(); var x2 = mytwice(tmp);
and this should work. But can you be sure that Felix doesn't do that itself? There's a good chance if you write:
var x3 = x+1 + x+1 + x+1 + x+1;
that a decent compiler will want to lift the common expression {x+1} out. On the other hand, if you write this:
val tmp2 = myrand(); var x4 = mytwice (tmp2);
Felix semantics actually allow the compiler to replace tmp2
with {myrand()}, because unlike var
a val
can be considered
either as a variable or as a macro. Of the two usual evaluation
strategies, substitution is lazy evaluation, whilst assignment
to a parameter is eager evaluation, and as you can see in this
case, it can matter.
In the example we probably want eager evaluation, but the macro like nature of C bindings suggests we may get lazy evaluation.
Now lets look at a second example:
val a = 1; var b = 0; var x5 = mycond ( b==0, 0, mydiv (a,b));
If we follow through the substitutions, we would get this in C:
int x5 = b==0 ? 0 : a / b;
which is what we want: don't do the division if the divisor is zero. This works because Felix does substitution, which is lazy evaluation, and C also does lazy evaluation here: the second argument of a conditional expression is C is not evaluated if the condition is true. This is sometimes called a short-cut operator because it bypasses or short cuts some of the calculations, but the proper name is lazy evaluation.
Now the problem here is that we want to be sure Felix doesn't try to lift the arguments out to variables, and emit this:
int true_case = 0; int false_case = a / b; int x5 = b == 0 ? true_case : false_case;
because if it did, you'd get a divide by zero error before even getting to conditional expression.
The point now is that in some cases we want eager evaluation, to prevent duplicate calls to functions that return different values each call, and in some cases we want lazy evaluation, to prevent evaluating something that might prematurely abort your program, or perhaps even go into an infinite loop.
Again, your nasty teacher glossed over these issues. So now, you want to know, how does Felix actually evaluate expressions. You will hate the answer. In most cases it is unspecified! You have to write your code so it doesn't matter.
The reason for the lack of specification is simple: performance. The compiler tries to generate the fastest possible code and it simply assumes that functional code can be copied about, copies removed, and the time of evaluation doesn't matter
Of course, you have to evaluate an argument before the corresponding parameter is required, but don't forget the call to the function using that parameter may itself be a parameter .. so the time its required can be delayed significantly. In fact in purely lazy functional languages like Haskell the whole computation is delayed right up to the termination point of the program (then it does everything in a great hurry!).
Now, you may think, well, C is an eagerly evaluated language, with a couple of minor exceptions like the three shortcut operators.
You would be utterly and completely wrong. All traditional imperative languages are primarily lazily evaluated. Remember, evaluation of condition operations requires lazy evaluation. You choose one path for control to flow down or the other. The fact is, that control flow is just another name for lazy evaluation! The fact that you do one then and then another is precisely lazy evaluation.
Now, in C, function arguments are evaluated before calling a function. This is eager evaluation. However the rules are not as determinate as you might think! The order of evaluation is not specified. In fact the exact order of calculations, including operators such as assignment, is only partially determinate, by rules which invoke the concept of a sequence point to specify when computations start and end, and when another computation can use the results of a previous one. Statements like:
x[i++]=x[i++];
are not deterministic. Yet people do write working C code. So do not be too alarmed! We will show how Felix provides ways to ensure the desired semantics.