Wednesday, August 17, 2011

Type-checking is compile-time evaluation

I've been thinking a bit about type-checking the last days -- especially how type-checking corresponds to compile-time evaluation (CTE). In C++ templates are often used to implement CTE -- it's a messy way of achieving it but it works.

The idea is a language L where the type-checking is implemented purely by CTE. When a compiler for such language compiles a function f it generates (in addition to compiling the function into executable code) a function fs that type-checks each call to f.

The fact the that type-checking is a function make it possible to do all kinds of fun things, like verifying that an argument is an even integer. All that is needed to do this is to create a function (like fs) that checks whatever one wish to check. For instance, such function can be implemented in L itself (or a subset thereof).

Of course, to be able to type-check this way a lot of type-information needs to be available for the type-checking functions (like fs). For providing that kind of information, the compiler needs to (in addition to compiling) know how various kinds of type-information propagate through the program. For instance, if a is an even integer and b is an odd integer, is a * b even?

It is more less straight-forward to figure out how such type-information propagates through arithmetic functions, like a * b. What complicates things are conditionals, but those can be solved too. The real problem is loops.

And that's were I'm stuck right now -- with loops. Not that I've implemented a full compiler, but I've played around with a instruction set for a stack-based VM and how type-information propagates through those instructions. So far the only compiler is my head, though, but I'm using very mechanical steps to compile.

Anyway, as I said, the problem is loops. Generally, as soon as there is a loop in a program very little type-information can be propagated. However, it is my hope that most practical uses of loops can be expressed in a way such that a reasonable amount of type-information can be propagated through functions containing loops. This might imply that there is a need for other kinds of loops that is traditionally used: for, while, and do-while. Or, that only a certain form of for loops are allowed.

This is where I currently spend my thinking.

No comments: