Today I started working on a new type-checker for python that is based on Prolog. Prolog is a logic programming language with several features that makes it an ideal language to implement type-checking in. Here's an example of a piece of code that it successfully type-checks:
A = 1
B = 1.0
C = "c"
def plus_C(value):
return C + value
def func0(a):
global A, B, C
A = "hello"
result = plus_C(A)
C = 10
result = plus_C(B) + a
return result
The way it works is pretty straight-forward. The type-checker simply disassembles the relevant functions (using the dis module), and outputs Prolog code where each instruction is replaced with a Prolog predicate, as follows:
py_plus_C(Module, V_value, ReturnValue) :-
load_attr(Module, 'C', T0),
binary_add(T0, V_value, T1),
return_value(T1, ReturnValue).
py_func0(Module, V_a, ReturnValue) :-
store_attr(Module, 'A', py_str),
load_attr(Module, 'plus_C', T0),
load_attr(Module, 'A', T1),
call_function(Module, T0, [T1], T2),
store_attr(Module, 'C', py_int),
load_attr(Module, 'plus_C', T3),
load_attr(Module, 'B', T4),
call_function(Module, T3, [T4], T5),
binary_add(T5, V_a, T6),
return_value(T6, ReturnValue).
Here, the type-checker infers the return type of func to be float, and the type of the argument a to be either int or float. Note that the definition of Module is omitted because it's mostly just an internal representation of the python module and not terribly interesting.
In addition to dealing with mutable state (mutation really complicates type-checking), it handles parametrized types (e.g., list) as the following example illustrates:
def func1(lla, lb):
return lla[0] + lb
where it infers that lla, lb, and the return type is:
- list(float), float, float
- list(int), float, float
- list(int), int, int
- list(str), str, str
- list(list(X)), list(X), list(X)
- dict(int, list(X)), list(X), list(X)
Now to the cool part.
The entire thing is 236 lines of code.
Ok, granted it doesn't handle ifs, fors, etc, but still. The type-checker can be found here. Unfortunately, I currently don't have time to continue working on it, but the code is there and proofs the idea.