Some scenic overlooks on the internet:
OcamlCore Planet
Fabulous Adventures...
Room 101
Andrew Kennedy's Blog
To be continued...
Some scenic overlooks on the internet:
OcamlCore Planet
Fabulous Adventures...
Room 101
Andrew Kennedy's Blog
To be continued...
I was wandering around the internet looking for interesting programming languages related blogs and came across the blog of Gilad Bracha, one of the people involved with Newspeak. There were many interesting posts, but an older one caught my eye and got me thinking about object instantiation in Mix.
In Constructors Considered Harmful Mr. Bracha expounds on the problems associated with object constructors, when it comes to both the reusability and extensibility of a piece of code, and the uniformity of the language.
One way of thinking about this first point is to consider object construction to be separate from
object initialization. In this scenario new
is an operation that
first constructs an uninitialized object of a particular class, and then calls
the constructor to initialize it. In this sense one can see that the signature
of the constructor in many object oriented languages is rather arbitrarily
constrained, in that it must return an instance of the class that it is defined
in. This means, for instance, that new X()
must return an instance
of X
, and can't return an instance of a related type Y
.
This means that, if ever your code changes so that you need such functionality,
you'll have to traipse all around your code base replacing calls to new X()
.
That's the first problem; how can we
What might be nice would be for us to separate these two concerns explicitly,
so that new
really does only instantiate an uninitialized object,
and then some method on that object is responsible for initializing the object.
But wait, you say: won't this allow us to accidentally create and then use
uninitialized instances of a particular class? And isn't that frightening and
terrible? Well, sure, as presented thus far it would. Of course, there's nothing
that prevents you (save for good practice, of course) from defining all your classes
with an empty constructor and a method named Init
. But, we can
do a bit better (though we'll still trust ourselves not to amputate our own feet
via firearm).
Let us say that one can only create an uninitialized instance of a particlar class from within
that class, so that a random programmer can't new
up an uninitialized
instance. Then let's restore a notion of constructor (call it an initializer)
that is nothing but a (possibly named) constructor, but one that is obligated to return
the object it is responsibly for creating and initializing. Therefore our
initializer can return instances of a different type if it so desires.
Aside: This is really just like static factory methods as in Java, and in fact that is one technique for avoiding some of the problems with constructors that has found its way into general practice. However in Java we can still write regular constructors with all of their supposed problems.
But aren't we back to regular old constructors? Not quite, because we are not required to return any particular type from our initializer. So we might have the following code:
class Foo { new Foo(i) { var t = new Foo; t.Bar = null; t.Num = i; return t; } new FooBar(i) { var t = new Foo; t.Bar = Bar.Bar(); t.Num = i + 8; return t; } } var f = Foo.Foo(42); var g = Foo.FooBar(34);
Then, if ever we need to modify the constructor of Foo
, so that,
for instance, it returns an access controlled instance of a Foo
,
we can do the following:
class Foo { new Foo(i) { var t = new Foo; t.Bar = null; t.Num = i; return AccessController.New(t); } ... }
Another complaint raised in the aforementioned blog post (the second point I metioned) is that constructors behave differently than other "callable" objects. That is, creating an object using a constructor is fundementally different at the language level (for many languages, like Java, C#, and so on) than calling a function (or static method, or whatever), so that you cannot, in general, pass a constructor around, or store it in a variable; they aren't first class. In the above presentation there is nothing precluding them from being so.
So really, all we have done is introduced some limited form of static methods,
and restricted the use of new
as an object instantiation device
so that it can only be used to create new, uninitialized objects of the same
type as the class in which the call to new
is found. What we
gain is that constructors need not return a particular type, that we still are
"encouraged" by the language to write constructors that return initialized
objects, and that constructors behave in the same way as regular functions, in
that they may be passed around, stored in variables, and so on.
In fact I had never really faced either of the problems that Mr. Bracha described
in my own programming; perhaps this is because I haven't worked on any really large
projects with many other participants. In general I've never needed a constructor
lke that of Foo
above, and if I've ever needed to pass a constructor
as a function I have just wrapped it, thus:
var ctor = function(i, j, k){ return new SomeClass(i, j, k); };
But still, it is an interesting point that merits some thought, especially in the context of large codebases that act as a dependency to many other pieces of software, and whose signature cannot easily be changed.
So, back to the stacks. This time it's about functions and methods and objects and functions and methods and objects.
So, as I mentioned only briefly,
functions are really objects
that implement a particular method, namely operator()
. This is
nice because it means that we can easily define functors and
use them in the same way we use "regular" functions and object methods. For
instance, the following class wraps an "action" (for those of you not from the
C# world, a
unary function returning void
) and
returns an action that keeps track of how many times it has been called:
class CountedAction { private var func; private var count = 0; public operator()(arg) { this.count++; this.func(arg); } public Count() { return this.count; } public Reset() { this.count = 0; return; } }
Now, to be fair there are other ways we could swing this sort of thing, but I'm not as concerned with what is possible so much as I am with what is simple or even elegant.
So, what we'd like to say is that, whenever we call an object, we simply look
up that object's operator()
and call that. Of course, here
be our turtles.
Retraction, start over. Let's say first that a callable object is either
a "real" function, or it is an object which has a member operator()
that is itself callable. Then calling an object consists of either
actually calling it (in terms of typechecking, it means creating a new
environment with bindings for each function argument and then checking the
function's body) if it is a "real" function, or it proceeds by retrieving the
object-to-be-called's member operator()
and calling that
with the given arguments.
That's a bit better, but it means that the programmer has to deal with both
objects and these "real" functions, when the whole point of this mess is to
treat everything as objects. So it remains for us to hide this additional
complexity in the language, so that the programmer can pretend that everything
is an object, and that anything implementing operator()
can be
called.
It's pretty easy to see that any "real" function can be easily treated as
an object that has a single member, operator()
, whose body is the
exact same as the function. So in the end all we have to do is, whenever a
"real" function is used in any way other than being called, promote it
to an object.
That's great, and thus topples our turtle stack. But what's the end result for me? Well, one nice thing is that now all functions and methods can be treated uniformly. For instance, consider the following class definition:
class Foo { BarMethod(a) { return a.Bleck(); } }That's equivalent to the following:
class Foo { private var BarMethod; new() { this.BarMethod = function(a){ return a.Bleck(); }; } }
All because methods are just functions; see this post for more information on how methods are implemented in this way. And now I'll say goodnight.
My good friend RT painted this the other day -- little did he know that today Infinite Turtle Theory would actually be relevant to what I want to talk about:
Anyway, here's the thing: in Mix I want every first class value to be
an object. That is, instances of classes are objects, literals like 7
and
true
are really just instances (of Int
and Bool
,
respectively) and so are objects. Functions are just objects with a method
named op()
. And methods are just functions that hold a reference
to this
-- recall this discussion on implementing methods.
(This is a bit circular, I know; here be our first stack of turtles).
But, remember I said first class values; not everything you deal with in Mix is
first class. Classes themselves (not their instances), control flow constructs
(like if
, for instance), and instance members are all examples of
parts of Mix that are not first class. You can't pass a class around, modify it,
and then create an instance from it. You can't pass a statement to a function
(though it might be fun to try that out; I'll make a note of it). And you
can't write the following code:
class Foo { public var Bar = 7; } getField(o, f) { return o.f; } Int main(String args) { return getField(new Foo(), Bar); }
Again, it might be interesting to allow that kind of first class status for member names in a statically typed language like Mix; I know other languages support such kinds of things.
Back on track, we want everything to be an object but this leaves us with a challenge: how do actually represent things like integers, reals, strings, and so on? For booleans we might try the following:
public Bool { ToString() { if(this.IsTrue() is not null) return "true"; else return "false"; } operator==(other) { var t1 = this.IsTrue(); var t2 = other.IsTrue(); if(t1 is not null && t2 is not null) return new True(); else if(t1 is null && t2 is null) return new True(); return new False(); } } class True : Bool { IsTrue() { return this; } IsFalse() { return null; } } class False : Bool { IsTrue() { return null; } IsFalse() { return this; } }
Here we managed to define booleans in the language, though our technique seems
a bit hackish; what about that "true"
we left in there? Instead
of trying to find a way of implementing each of these "ground types" with
pure Mix, it would be nice if they were somehow "built-in" to the language.
On the other hand, I'd rather not have to special case them all over the place,
and it would be great if we could grow the set of built-ins easily. So, what
is at the bottom of this stack of turtles (our second, for those of you
who are counting).
Enter abstract classes. An abstract class is one whose purpose is to
provide a "binding" to a built-in type (one that is likely implemented in C#).
For instance, we could have an abstract class binding System.Int32
, one
binding System.String
, and so on. The binding is like an interface
between the C# implementation and the Mix type system. An abstract class
generates no code when it is compiled; this "interfacing" is it's only role.
To see how it works, consider the following binding for System.String
:
abstract class String { operator==(o) { return null as Bool; } operator[](index) { index.ToInt() as Int; return null as Bool; } Join(s) { s.ToString() as String; return null as String; } }
Lot's of code, all involving this new operator as
. This operator
isn't the same as the
C# as operator
(and therefore perhaps it should
get a different name, though as you'll see it is somewhat fitting). It's behavior
at compile time is that it checks that the left side argument has only exactly
the type on the right; if so it returns a typeset consisting of a new instance of
the type on the right. At runtime the as
expression just returns
its left side.
Given this description, let's consider the definition of operator[]
.
Type checking an invocation of this method with an argument index
entails checking that index
implements ToInt
, and that
the result of invoking ToInt
is in fact an instance of type Int
;
if this is the case then the return type is Bool
; otherwise an error
is returned.
So, other than making it very easy to bind simple types, what do abstract classes get us? Well, consider binding an array:
class Array { private var elements; new(index) { index.ToInt() as Int; } operator[](index) { index.ToInt() as Int; return this.elements; } operator[]=(index, value) { index.ToInt() as Int; this.elements = value; return; } // Etc. }
What's nice about this binding is that we can use an instance variable to track the type of the array. So, if we have an array, and add a few items to it, we can verify that any object accessed from the array is used in the correct way; this of course restricts the array so that all elements are assumed to be of the same "type", so that we don't allow a particular operation on an element unless all elements can perform that operation (remember our definition of correctness).
Int main(String args) { var a = new Array(10); //Array starts out with null in each position. for(var i = 0; i < a.Length(); i++) a[i] = "Hello!"; a[0] = 7; a[2] = true; foreach(var e in a) print(a.ToString()); //OK: Int, Bool, and String all implement ToString. a[6].ToLowerCase(); //Error: Int and Bool do not implement ToLowerCase. }
So that's how the current incarnation handles the second infinite stack of turtles. We'll see how to deal with the first infinite turtle tower (namely, that a function is an object with a particular method, but a method is just a function -- which is an object with a particular method, but a method is just a function -- this is getting old already) next time.
Windows 7 won't let me do that, at least I never figured out how, even after asking the internet. There were lots of suggestions that one use Window Manager by DeskSoft, which I tried. It seemed a bit flaky at times, but did more or less what I wanted, and then some.
However, I figured that the software was pretty simple, and I knew that if and when the shareware license expired I wouldn't be buying it (too unpolished, and really, it does something Windows should already do). So I thought, "Why not write an some free/Free/open-source/whatever-you-want-to-call-it software that does what I want?"
So I did. You can find an alpha version of WindowsWhere ("windows where you want them", if you follow me) hosted at Google Code. WindowsWhere is licensed under the GPL v3. Hopefully at least one other person has been annoyed by this, and hopefully I didn't just totally miss a "native" solution.
The last couple of posts have been a bit off topic (where topic = Mix), so let's get back to it. These words are about methods in Mix, how they are implemented, and how the implementation choice corresponds closely to the language's semantics (which seems like a bad idea, but you'll see what I mean).
So, first let me note that, in the spirit of object orientationification, in Mix the goal is that every first class entity is an object. By this I mean that numbers and strings are objects, class instances are objects, instance methods are objects, and functions in general are objects. These are all first class entities, so they're objects; since classes are not first class (you can't go passing around classes, unfortunately, at least at this point) they aren't objects, and in fact they don't have any runtime representation.
It's easy to see how regular class instances are objects (it's by definition,
right?), but perhaps slightly less clear what I mean when I state that, say,
a function is really an object. All I mean is that a function is an instance
of a class that has a method op()
. This gets us into a bit of a
loop; if methods are objects, then this method is an object as well.
So, at the bottom we need to have some notion of a callable object; a
"special" object with no methods at all. From the programmers perspective these
can be ignored, because a callable object can always be treated as an object with an
op()
method that is callable.
So, methods are really just function objects; sounds great, very symmetric.
Now, we have a question to answer: do these function objects take
this
as an argument, or do they contain a reference to
this
that is bound once and for all when the function object is
instantiated? It turns out that this choice will dictate the semantics and
correctness of certain kinds of statements, and so we should try to make a
reasonable and (hopefully) reasoned choice.
Consider the following Mix code, which defines a class implementing
ToString
, and then derives from this a class that does not
override ToString
. Finally, it adds a ToString
method
to an instance of the derived class. Assume that print
just
takes a string and writes it to standard output.
class Base { ToString(){ return "Base!"; } } class C : Base { public var Name; new(n) { this.Name = n; } } Int main() { var c = new C("Bob C."); var f = c.ToString; // (1) c.ToString = function(self) { return self.Name + " " + f(); }; // (2) c.ToString = function() { return c.Name + " " + f(); }; // (3) print(c.ToString()); // (4) return 0; }
In this code there are a few issues. First, at point 1 we store a
reference to the original value of c.ToString
(which is more or
less B
's implementation of ToString
). I think it is
clear that we want f
to contain a function object that takes 0
arguments (so that later we can write f()
and get the same result
as if we wrote c.ToString()
); this implies that the function object
implementing c.ToString
has c
(or this
)
bound as a member variable, and does not take it as an argument.
Then, at point 4, we use the new method in the same way as we used the
original one; we aren't obligated to pass c
to the new
ToString
method.
Next, check out points 2 and 3, which highlight the difference
between the two treatments. In the first case we get explicit access to
this
(though without some syntactic sugar we can't name it the
keyword this
, and so give it a different name). In the second
we only have access to this
(actually c
) because it is
contained in an outer scope. While at first glance they seem equivalent,
they aren't quite the same.
To see the difference, consider this function that, given a function object
(which should correspond to a ToString
method), returns a new
method that annotates the output of the given method with the operated-on
object's Name
field:
function annotate(f) { return function(self) { return self.Name + "(" + f() + ")"; }; } o.ToString = annotate(o.ToString);
While this works under the interpretation higlighted by point 2 above, it would not work under the interpretation shown in point 3:
function annotate(f) { return function() { return X.Name + "(" + f() + ")"; }; }
The question is, "What is X?" One answer is that we can work around the issue
by passing this
(or at least, the object that correspond to it) to
the scope enclosing the new method's definition:
function annotate(o, f) { return function() { return o.Name + "(" + f() + ")"; }; } o.ToString = annotate(o, o.ToString);
So we don't gain expressivity going the second route, though we do perhaps lose code clarity. And in fact, this seems to be exactly how it works in Python:
class Foo: __init__(self, i): self.I = i Get(self): return self.i f = Foo(7) f.Get() # = 7 g = g.Get f.Get = lambda: g() + 1 f.Get() # = 8 f.Get = lambda self: self.i f.Get() # Error!
Anyways, if they are kind of the same, but kind of not, then where does the real difference reside?
One important point is that the first route allows us to use a single function
object instance as a method on many instances regardless of whether the function
object needs to access this
, while the second option only lets us
reuse a function object if the object does not access this
. That
is, under the first interpretation many instances could share a single function
object as a method.
However, a caveat: the "workaround" described for the second situation has a small "strangeness", in that because the variable holding the equivalent of this
is captured it could be updated, but only in by the directly enclosing scope.
All of this leads me to a decision.
this
to be passed explicitly,
and since method definition within a class also doesn't require it, then
method definition from outside of a class should not as well.
There is also a second, simpler issue here as well. In point 2
notice that we really do have access to this
; does this mean that
under the first interpretation we should at this point have access to the
private members of this
, and therefore the private members
of c
? I am inclined to believe that we should not: private should
have a lexical meaning, which is that only code lexically within the class, or
lexically within a derived class, whould have access to said members. What's,
more, this is in keeping with the choice made above (that is, as we've picked
the second interpretation, we don't even face such an issue).
Furthermore, it would produce a asymmetry between function objects intended to become methods,
and those intended to be simple function objects, which could only be solved with
more syntax. Namely, it would allow a function object's body to access the
private members of an argument only when the function object was intended
to become a method; we would need to require that in such cases the argument be
made explicitly this
. Anyway, lot's of special casing for an idiom that
could lead to some seriously obfuscated code.
Time has passed quickly, and my posts have been infrequent of late. That's because I have been busy with a couple of things. Apologies all around.
First, I've been working on a method of implementing the dependently typed lambda calculus via compilation. Specifically, I've been looking at how to compile (or translate, if you like) Logical Framework specifications to lambdaProlog, a higher order logic programming language with at least one efficient compiled implementation (Teyjus, which I have worked on during my undergraduate and graduate studies; there are probably more).
Programming with LF is usually achieved by encoding judgments as dependent types and then searching for inhabitants of particular instances of these type families; I won't go into detail, but instead point you towards the excellent LF implementation Twelf, which does many things, including allow you to program in this way. My implementation is organized around compiling an LF specification to a lambdaProlog module, and then querying that module to find inhabitants of particular types.
Second, my friend and colleague David Baelde has recently begun a year long post-doc at the University of Minnesota (which is where I am), so we've been finding him a place to stay and what not, as well as working away (on, amongst other things, a proof that the translation I mention above is correct).
But, now things are settling a bit, and I should have time to write more about Mix -- I'll have to if I want to write a thesis proposal on the ideas I've been describing.
Once I talked about why I think static typing (and more generally static analysis) can be really useful, even if I didn't say much. Today I was working on some code and was reminded yet again that, in some cases, without static typing all would be lost, and I would fall into a dark insanity, never to be lucid again.
The code is that of the Tac system, a "framework for implementing and interacting with interactive theorem provers". Originally it was conceived of as a generic tool with which you could develop first an interactive prover for a linear logic, then one for a classical, then another for an intuitionistic logic, all as easy as something relatively... un-difficult. As it stands we have fully developed one such prover, based on the logic µLJ, a first-order logic with (mutually recursive) definitions, induction and coinduction, and generic quantification.
Like Coq, Tac is a tactics and tacticals based prover, in which you manipulate logical sequents by applying tactics (possibly tactics composed of tacticals and tactics) to them. These tactics can be very low-level (a tactic for right conjunction elimination, say), or more complex (a rule for applying some hypotheses to a lemma). In addition, there's a tactic implementing automatic proof search based on focusing which can finds proofs that require nested inductions and coinductions by automatically generating invariants.
To give you an idea of how working with the system might work, here's a proof script showing the process of
first defining the natural numbers, defining the even numbers, and then proving that 0 is even (yes, a silly proof)
by first unfolding the definition of even
, then picking the left branch, and finally recognizing
that 0 does in fact equal 0.
#define "nat x := x = 0; sigma y\ x = s y, nat y". #define "even x := x = 0; sigma y\ x = s s y, even y". #theorem even_0 "even 0". mu_r. left. eq.
We've also spent a lot of time working on implementing automatic proof search via focusing. Here's a slightly more interesting proof using automation, showing that if x is a natural number then either it is even or its successor is even:
#theorem even_nat "pi x\ nat x => (even x; even (s x))". prove.
Not an interesting script though.
Anyway, one standard way of implementing such a system is through higher order functions and success and failure continuations (that paper, by Elliott and Pfenning, describes it in the context of logic programming search, but as it is powerful way of implementing backtracking search, it's what we want). The idea is that a tactic is just a function that takes a sequent, a success continuation, and a failure continuation. If application of the tactic on the sequent succeeds, the success continuation is called with the results of the application (more sequents). Otherwise, the failure continuation is called. Tacticals are functions taking possibly several tactics (a list of them, even) and returning a single tactic. Sometimes we have tacticals taking tacticals returning tacticals, and on and on. Lots of useful composable tacticals for doing all kinds of stuff -- the automatic proof search tactical is (more or less) just composed out of simpler tactics.
The point is, after all is said and done, we end up with some fairly complex code. Some of the types defined in the source are third and fouth order, to say nothing of those of the various helper functions whose types are inferred. Working with such complicated code can be pretty hairy, but with O'Caml's type system we've found it relatively easy to work with.
Now, you might say "you're doing it wrong" (I mean come on, fourth order functions?) But let me motivate the design of the system as it stands. First, the tactics and tacticals method of interactive theorem proving has demonstrated its merit in several other systems, so we'll not find a serious case of "doing it wrong" here. Second, the success and failure continuation based approach is, as mentioned, a powerful and elegant way of implementing backtracking search; it's efficient, because all calls are tailcalls, once you wrap your head around the technique it's easy to write new tacticals to compose various tactics correctly, and it cuts the crust off your toast.
Anyway, the point is that while I don't bother about types when writing Python scripts for a MUD I wrote, and I don't even use them to great effect in some of my other projects, sometimes they really are a ____-send.
Postscript 1: fill in the blanks as you see fit.
Postscript 2: Here's an example of the kind of proof that Tac can find; this one is of the fact that "less than or equal to" is transitive. The proof is presented in outlined form, where asynchronous phases are compressed. Click it for a better view.
So, last time
I said "the invocation n.Foo()
will raise a null pointer exception", which
it will of course, but in the more complete example involving Length
no such exception could ever be
raised: the body of the while
loop was guarded by a test to determine whether current
was null. Duh.
The point is, it seems that some simple form of nullness analysis would completely eliminate the problems with null
that I talked about previously. But, I also said that such an analysis has its own problems. The thing is, this kind of
analysis isn't simple after all. Consider:
test(n) { if(n is null) return 42; else return 17; } main() { var n = null; if(test().IsPrime()) n.Blah(); }
This is a relatively simple (though convoluted) example, but it should illustrate the idea: to get "perfect" nullness
analysis in we would need to actually run the program (to determine whether the number is prime using the definition
of IsPrime
). This is something we really don't want to do (there goes any hope of a termination proof, for
one thing). What if the code needs to ask the internet whether the number is prime? All kinds of chaos.
Perhaps a better idea is a very naive nullness analysis, one that only recognizes explicit guards of the form
if(something is not null) { ... }
or similar. Under such a system, if something
had an empty typeset
(indicating that it is always null), then the body of the if
statement would not be examined.
Then, unguarded method invocations on objects with empty typesets could raise a warning.
This would yield more accurate types during inference, and would limit the number of spurious warnings. The latter aspect should be obvious, but the former might not be as clear. Consider:
main() { var n = null; var g = 42; if(n is not null) { n.Bleck(); g = new Foo(); } return g + 7; }
Let's assume that Foo
does not implement operator+
. With the simple form of nullness analysis
described above, this code typechecks correctly: g
is never assigned to an instance of Foo
, so its
typeset contains only Int[]
(which does implement operator+
). Without nullness
analysis, the invocation n.Bleck()
would generate a warning, and the expression g + 7
would generate an error. Here nullness analysis (or nullness inference, as the Nice
people call their version of it) is acting as a kind of poor-man's
dataflow analysis.
We still need to analyze such a technique for correctness. For now I'll handwave it away; I've got work to do.
First, null
often gets a bad rap, and it
deserves what it gets. Unfortunately, it can be quite handy, and it is pretty well understood by the average programmer,
so I'm keeping it anyway, at least for now. Fortunately almost no one will use Concrete Mix, so it won't be an
expensive mistake. Sidenote: I had to look this
up while writing the above.
So, last time I said that
null
acts a bit strangely. To understand the how and the why, let's look at some code, and see how we
would typecheck it:
main() { var n = null; n.Foo(); var m = null; m = new Bar(); m.Bleck(); return 0; }
Some totally useless code, but it will prove illustrative, so bear with me. First, let's assume we're playing
with the iterative algorithm I've been harping on about. So, after iterating to a fixed point, what does the
typeset of n
contain? Let's say that it contains nothing (that is, null
indicates
no type, and not an empty type, like void
might indicate): {}
. Then, should
the call n.Foo()
be deemed correct by the typechecker?
It turns out that classifying this call as incorrect has some unforseen consequences. I noticed the problem
first when playing with a simple singly-linked list implementation. Consider the following code for a naive
Length
method:
class List { private var m_Head; public List() {...}; public Append(element) {...}; public Length() { var count = 0; var current = this.Head; while(current is not null) { count += 1; current = current.Next(); } return count; } }
(Syntax note: as operator==
can be overridden, Mix offers the is
and is not
operators for referential equality and inequality, respectively).
This code is a tiny bit more useful, though it's only a sketch. I'm sure you get the idea. The point
is that, if we call Length
on a list that is always empty, and we treat null
as described above, we have a problem: the typeset of current
is always empty, and so the call
current.Next()
would be flagged as incorrect by the typechecker. Now, you might suggest we perform
some kind of nullness analysis and thereby determine that the body of the while
loop cannot
be executed, but this brings its own issues, so let's leave it alone for now.
So maybe our initial decision was wrong, and null
should indicate the empty type. But this is no good,
for consider m
in the first code listing: its typeset would be something like {Null[]; Bar[...]}
,
and no method call on that typeset could ever succeed, as Null
presumably implements no methods.
So it seems like the only solution (until we take a harder look at nullness analysis) is to classify calls on
the empty typeset as correct. And in fact, this doesn't violate our
definition of correctness, because the
invocation n.Foo()
will raise a null pointer exception, not a "method does not exist" exception.
So, taking a brief break from history, let's take yet another look at type inference in Mix. Way back when I mentioned that one of the restrictions we needed to place on the subset of Mix for which the algorithm worked was that no recursive calls we allowed. First, let's recall why; consider the following code:
Factorial(i) { if(i == 0) return 1; else return i * Factorial(i - 1); } main() { var i = Factorial(7); }
A simple factorial, no big deal. So, what should they type of i
be? Well, first let's get our
terminology straight: what concrete types should be in the typeset of i
?
Just by thinking about it we should conclude that it should hold only Int[...]
. But, if we run
the original, or even the iterative
algorithm on this code, what happens? Infinite loopiness. The terror!
(Incidentally; if the call to Factorial
in main
were Factorial(7.0f)
,
i
would hold both Int[...]
and Float[...]
in it's typeset; the base
case should tell you why).
The problem is that, when type checking the right hand side of the assignment in main
, we are
obligated to check the type of Factorial
with an argument of Int[...]
. In so doing,
we check that Int[...]
implements operator==
, operator*
, and operator-
.
And then we check Factorial
with an argument of Int[...]
. In so doing we check that...
And on and on.
However, this is easy enough to fix (or at least, it's easy enough to partially fix): let's keep track of the "call chain", and whenever we start checking a new call, we'll search the call chain for said call. If it is not in the call chain, we'll add it to the call chain and check the type like normal. But if it is in the call chain, we get to stop type checking: we know that checking the call will only get us right back where we started.
Pause for a second: what exactly is a "call", and how do we represent a "call chain"? A call is just the
(unique) name of a function or method, along with the type of each argument. This last bit is
important, because otherwise we might realize that a previous call to AddTwoNumbers
succeeded
with integers, and imagine that it would succeed on files. A call chain is a list of such calls, in the order
we've invoked them.
But, is finding a call in the call chain an error, or success? And anyway, what does it mean? Second question
first: it means that there is a loop in the call graph, just like we saw in the code above, and furthermore that
the recursive call is at the same types: the function is being used in a monomorphically recursive way.
First question second: we'll treat this as a success, because it indicates that along this call chain all
possible method invocations succeed (possibly while looping infinitely). If they succeed, it meets the correctness
requirement we've already talked about.
If it loops infinitely, it still meets this requirement. (Side note: we're essentially saying "it's correct, unless
this other runtime error comes up, but that error will hide the incorrectness"; we'll see this with respect
to null
next time).
Side note: this is rather strange, in that it somehow means that we are interested in a
greatest
fixed point,
whereas normally we are interested in least fixed points. We're taking the view (in a handwavish sort of way)
that, if when trying to prove P
, we eventually have to prove P
, we'll consider this
to be true
, or provable.
So, if you take the above as correct, are we back to a terminating algorithm? Well... not necessarily. Once again we would have termination "for free" if only we had a finite number of types to deal with. Unfortunately, we don't, and so this doesn't terminate so easily. In fact we could force this to terminate by requiring that all recursive calls to a particular function have the same type (that is, make all functions monomorphically recursive), but that's no good -- check out this simple function that isn't monomorphically recursive:
identity(arg) { if(identity(rand() == 42)) return identity(arg); return arg; }
It's just the identity function, but with a little twist: no matter what the type of arg
is,
the function might recursively called with a Bool[...]
argument. In ML the type of the function
is 'a -> 'a
, but it is used in a polymorphically recursive way. Anyway, a strange function, but
they can be found in "real life" (especially when mutual recursion comes into play); Henglein had an interesting
(and approachable) paper on inference
in the presence of generalized polymorphic recursion in an ML-like settings -- check it out. If you don't bother,
let me just mention that inference for such a language can be shown to be undecidable (indeed that's a contribution
of the paper) by showing that it is equivalent to semi-unification. And that means that, in what I'm trying
to do with Mix, I will inevitably face some tough choices to reign in this undecidability somehow; sounds fun!
So, I'm not the first idiot to think about doing type inference in object-oriented languages, obviously. There's plenty of prior work in this area. Here's a list of some relevant research, by project.
So, why don't I just implement some of this previous work and be done with it? Well, other than the fact that it isn't as fun as trying something new, each of the previously described approaches leaves something to be desired. So, let's first examine what we (ok, what I) want from the inference algorithm and language, and then see which criteria each of the aforementioned approaches meets.
C
should be able to contain members that are instances of class C
,
see? And members should be able to call themselves recursively. Oh, and make that mutually
in both cases.
First, row variables a la O'Caml. They handle the structural typing end of things, and recursive
types and functions, just fine. But row variables don't say anything about mutable data (and therefore
data polymorphism), and O'Caml (like most ML-like languages) doesn't treat mutable data
in the way that I wish it would. In O'Caml, a mutable data 'cell' is stored in a reference. When
the reference created it is filled with some initial data, which can later be updated. However, the
type that the reference holds cannot change; if the reference is first filled with a string
,
it cannot later be filled with an int
.
In addition, the structural typing supported by row variables isn't quite as strong as we might like. It can deal with inferring the types of arguments to a function by the way that they are used, but some simple constructs throw the algorithm for a loop. For instance:
let someFun b = if b then new Foo else new Bar
The above will not compile: there is no implicit subsumption rule in O'Caml. The algorithm will not
find a subtype of Foo
and Bar
, and use that in each branch of the if
expression. It will only work if you explicitly "cast" to such a type (note that the cast is safe, and
not a downcast). This is exactly the kind of code we will be writing (and be expecting to work!),
so row variables as implemented in O'Caml aren't going to cut it.
I was going to continue with the rest of my bullets, but I've run out of ammo (and time), so you'll have to wait until next time.
So, does the iterative algorithm we last discussed actually terminate? It's easy enough to see that it would if there were only a finite number of types, but last time I claimed that there are in fact an infinite number of concrete types. Let's see why:
class Pointer { private var obj; public new(o) {this.obj = o;} public Get() {return this.obj;} public Set(o) {this.obj = o;} }
Using just this class (and not even the "built-in" classes like Int
and so on) we can
construct an infinite number of concrete types. We have Pointer[obj: {}; ...]
, and
Pointer[obj: {Pointer[obj: {}; ...]}; ...]
, and so on (the fields Get
and
Set
have been ellided). That is, we could have a pointer to the empty typeset, or a
pointer to a pointer to the empty typeset, or a pointer to a pointer to a pointer, and on and on.
So clearly we can't just rely on there being a finite number of concrete types when iterating to ensure the algorithm eventually terminates. We need to be a bit more clever. First, let's write some Mix code that leads to an infinite number of types as described above:
main() { var p = new Pointer(null); for(...) { p = new Pointer(p); } }
So, what happens here? First, p
's typeset just has a Pointer[...]
to
an empty typeset. After one iteration, it's has a typeset has two elements: Pointer[...]
,
as before, and another Pointer
to it's own typeset. That is, if we could name
p
's typeset (let's call it p
), then p
= {Pointer[obj:{}; ...], Pointer[obj: p; ...]}
.
So, what happens after another iteration? In fact, we try to add the same concrete type to the
typeset again: a Pointer
to p
. As long as we can recognize that this
is the same concrete type as last time, we can stop iterating as no typeset will have changed.
So, that takes care of this example, but what about in general? Is there any way to construct an infinite number of concrete types, without them being cyclic? Clearly it is possible to define a non-cyclic concrete type of arbitrary size, but is it possible to write code that leads to one? We'll see.
Actually, there will be no strings here; right now I'm just going to be talking about loops. Earlier I mentioned that the simple algorithm for infering types couldn't handle loops (amongst other things), so let's see why they are an issue more clearly, and try to solve it.
I gave a simple example last time:
main() { var i = "Hello!"; for(...) { i.SomeStringMethod(); i = new Foo(); } i.SomeFooMethod(); }
The problem is that the call to SomeStringMethod
(which is presumably valid when
i
is a string) is valid if the loop runs only once, but isn't if the loop runs
more than once. So, we'll assume that the loop runs more than once; now how do we build
this assumption into the algorithm?
At first glance we might simply analyse the body of the loop twice; the first time when
we check the call to SomeStringMethod
i
is known only to have type
String[]
, and so the call succeeds. The second time i
's typeset
contains both String[]
and Foo[...]
, and so the call fails
(assuming that Foo
doesn't implement SomeStringMethod
, which is
rather the point of that name).
That sounds good, and it's simple, right? Unfortunately, it's not right, either. Consider this more convoluted piece of code:
main() { var a = new SomeClass(); var b = new SomeOtherClass(); var c = new YetAnotherClass(); for(...) { a = b; b = c; c = a; } a.SomeMethod(); }
It's pretty obvious what's going on here: depending on how many times the loop runs, this call
could be valid, or it could be invalid. Let's say that both SomeClass
and
SomeOtherClass
implement SomeMethod
, but YetAnotherClass
does not. Then after 0 iteration the call would succeed, and after 1 it would succeed, but after
2 it would fail. So, let's analyse the loop body 3 times; 3 is enough for anybody, right?
But wait, you say: we could keep adding variables (d
, e
, and so on)
to cause a greater number of iterations to succeed before failure. Right you are. Instead of
hardcoding the number of times we should check the body of the loop, we need a way to determine
that during analysis.
One way we could approach this is by analysing the code in a separate pass and calculating the number of times, perhaps be checking the number of variables. That would work in the above case, but what about in more complicated cases? Trust me, there are cases in which that sort of simple analysis won't work. Maybe there's a more complicated analysis? I don't know it.
Another way, and indeed the way I will choose, it to iterate to a fixed point. That is we keep on analysing the body of the loop until no typeset changes; at that point every typeset should be complete. Fixed point analyses are really common in Computer Science, and you seem to bump into them quite frequently when working with compilers; for example, the first time I ran into it (I think) was when I was implementing a compiler for Tiger (a simple imperative language) based on Andrew Appel's book Modern Compiler Implementation in ML, which is, incidentally, also the first time I programmed in Standard ML. The specific usage was during liveness analysis, solving dataflow equations.
Anyway, to see how this works, let's run through the last example.
At the outset, each typeset (for a
, b
, and c
) has a single
element in it. After the first iteration, a
has SomeClass[]
and
SomeOtherClass[]
, b
has SomeOtherClass[]
and YetAnotherClass[]
,
and c
has YetAnotherClass[]
and SomeClass
.
After the second iteration, each typeset has all three classes; we could stop at this point, but how would we know to in the general case? Instead we continue.
After the third iteration, each typeset still has all three classes; no typeset has changed. Now we can stop; we know that further iterations will not yield additional information.
There are a number of questions that remain to be answered. First, how do we know that we'll eventually stop adding concrete types to the typesets, and thereby terminate. Clearly there are an infinite number of concrete types, so we could concievably continue to add types indefinitely.
Second, how do we know that the collection of types in the typesets is accurate after we reach a fixed point? With the simple algorithm I didn't talk about correctness because it was obviously correct (for the ridiculously limited subset of Mix to which it applied), but now correctness is less obvious.
Stay with me, and I'll get to answering these questions (or at least trying to) soon enough.
So, I said that some folks might not be interested in static analyses like type correctness, or const correctness, or god knows what. And, to be honest, I can't blame you (too much), because in a lot of popular langauges the available static analyses (the type system, in most cases) isn't that great. Maybe it isn't powerful enough to encode useful properties (C, anybody?), or maybe it is powerful enough if you really want it to be, but everyone sort of ignores it because its more trouble than it's worth. Anyway, lot's of people aren't all about static typing, are all about dynamic typing, and will kill you with an ASCII blade if you disagree. On the other hand, some people love static typing, and static analysis in general, more than they love their own aged grandmothers. They will fight you (and any passing dragons) to the death with the dual weapons of dataflow analysis and syntax directed translation. Or something.
Anyway, the point is that I'm about to jump into the fray, so to speak, and try to convince you that thinking about static typing for a language like Mix is really a good idea. So, away we go!
First, there are a number of reasonable complaints about the static type systems of popular existing languages (clearly this list isn't even close to exhaustive, but it hits a few):
Ok, lot's of arguments against static typing. The first one I'll give away: lot's of languages have got it "wrong" according those voicing this particular complaint. But, others have gotten it right (I'm thinking O'Caml, Haskell, F#): type inference can save a language from failing this early in the game.
The second one we can give away in the same fashion: Java, C, C++, and all kinds of others
share this problem, to varying degrees: in Java you can get a ClassCastException
,
and in C you can cause a core dump. But again, there are plenty of languages whose type system you cannot subvert
to cause runtime problems, so stronger static typing could save us from this failing as well.
The third one is were the complaints start "getting real", to quote someone or another. Basically, the idea is this: when you're programming, especially during the early phases of a project, or when you're really just "exploring the solution space" (buzzword!), or doing rapid development, types can make your code less flexible. It can take longer to twist things, experiment. There's all kinds of inertia to the code, damnit. Well, again this can be true. Here's my argument: if there isn't any manifest typing (in the form of type annotations), and you have a "good" type system, then changing your program around in this manner should only ever irritate the type checker when you've actually done something wrong, which is exactly what you want, right? Of course, the problem becomes defining "good"; that's what this blog is about.
The fourth one is tricky as well. I read about this one in a book edited by Joel Spolsky called The Best Software Writing, in a piece by Bruce Eckel; go read this if you haven't (only takes a minute). Back already?
Ok, so the argument is that you're already writing tests (right?), and they already check all of the things that the type system checks, and more, so why bother with static typing? Good point, certainly. But, this seems to rely on 2 basic facts, at least 1 of which must be true.
Hopefully the above (points 1 through 3) has convinced that, while the first bit can be true, it doesn't have to be. As for the second bit, it's predicated on the assumption that you can't have a statically typed language that saves you the same something in most cases; we'll see about that!
So, if you take all my claims as true, you might be wondering why I went through all that trouble
only to basically show that static and dynamic typing are more or less equivalent. Because that's
the result of point 4 above, right? Well, here's where I'm going out on a limb to give static typing
the edge (under certain conditions, but don't worry, dynamic typing gets to win, too; we're all winners
here). In my personal experience, most of the features of dynamic languages that I use
regularly could (probably) be incorporated in a sbuitably designed static type system (of course,
things like eval
aren't in the "most" I'm talking about; if you need stuff like that
then dynamic takes the game).
Not much content here, but maybe if you were on the fence before you'll hear me out going forward.
Last time I said that my notion of correctness is simple: you should never invoke a method on an object that doesn't implement it. To start off discussion, let's look at a very simple algorithm that statically determines this. The subset of Mix that it handles is one without loops, without recursive calls, and with no inheritance; basically a fairly useless subset of Mix, but let's go with it.
The idea is simple: starting from the programs entrypoint (a free function I'll call main
), we'll interpret
the program, but instead of using regular values like ints, and strings, and objects, we'll use types; that is, we'll
perform an abstract interpretation. Abstract
interpretation, invented and first formalized by Cousot and Cousot, has a long history in static analysis; I'll be more
or less ignoring the formalization from here on in, and just going with an informal approach. Check
this out for a fairly readable introduction to "real" abstract interpretation.
Moving on to the actual algorithm, consider the following trivial code:
class A { public var Member; public new(arg){ this.Member = 1; }; public Method1(){}; public Method2(){ this.Method1() }; }; main() { var a = new A(); a.Method2(); a.Member; return; }
So, starting in main
, let's interpret the program. At the outset, we have a variable declaration; let's
pretend that it was really written as a variable declaration followed by an assigment:
var a; a = new A();
That let's us look at the two steps separately, instead of all at once. So, first we see a variable declaration, so
we add a
to the environment; we associate with a
the empty typeset. A typeset can
be thought of as a set containing all types that the variable is known to hold; by looking at how all of the types
in a typeset intersect, we can understand what operations on a
are guaranteed to be legal.
Next, there's an assignment: the effect of the assignment under the interpretation is that the typeset associated
with a
should be augmented by the right side's value; but how can we determine the right side's value? Let's
continue interpreting. new
(with a particular class name and constructor arguments, of which there are none in
this case) corresponds to constructing a concrete type based on the class and then invoking the class's construction operator.
Concrete types are values that represent the properties of an object, namely it's class name and its fields (both members and methods). Concrete types are what go into typesets. From now on, we can think of the value of a particular expression (under the abstract interpretation) as a typeset containing 0 or more concrete types.
So, first we construct the concrete type: it has name A
, and it has 1 member and 2 methods, namely Method1
and
Method2
; for now we'll indicate this concrete type as follows: A[Member: {}; Method1: {function()}; Method2: {function()}]
.
This should be read as a concrete type "A with field 'Member' having the empty typeset, 'Method1' having a typeset of 1 element,
and 'Method2' having an typeset of 1 element." The elements in the typesets for Method1
and Method2
are callable; I'll talk about callables in some other post, but for now the name should give you enough information
to go on.
Now that we've constructed the concrete type, we interpret the constructor. This means we analyze the body of the
new
operator for class A
, using an environment in which this
is bound to a
typeset consisting of only the just-constructed concrete type. So, we look inside the operator's definition, and the first
thing we see is another assignment.
Once again we add the value of the thing on the right to the typeset of whatever's on the left, and this time the
thing on the right is just an integer; for now we'll say that the concrete type of an integer is Int[]
,
therefore the value of the right side is a typeset containing only 1 concrete type. Essentially we're setting the
typeset of the thing on the left to be the union of itself and the typeset of the thing on the right.
So, the result is that the original concrete type we constructed in the previous paragraph is now the following:
A[Member: {Int[]}; Method1: {function()}; Method2: {function()}]
. Finally we return to main;
the important thing to notice now is that we know that the concrete type has a member Member
that at some
point could be instantiated with an Int[]
, and that a
's intersection contains this
concrete type.
Next we come to an method invocation:
... a.Method2(); ...
It actually consists first of a field lookup and then a call.
In particular, we first check to see that all of the types that a
could be have implement Method1
,
and then we check that each of the fields' types are callable. Again, a
can have only 1 type,
which does indeed implement Method1
, and it is callable. So, we interpret the call, by looking
in the definition of Method1
, passing the relevant concrete type in a
's intersection
as this
.
In Method1
we follow a similar process to check that this
implements Method2
,
which it does. We go into the (vacuous) body of Method2
, and eventually come all the way back to
main
.
Next we check that all of the types in a
's intersection have a field named Member
; again they
do, and in this case its type is dropped on the floor (just as the value of a.Member
would be dropped on the
floor at runtime). Finally, we return, and that's that: all of the code as been interpreted, and no invocation failed,
so the code is deemed correct.
That got really long winded, and we didn't even seem to do much: there were no errors, and a
only ever
had 1 type in its typeset, so it wasn't too interesting. So, let's briefly look at a slightly more interesting
example; we'll ignore some aspects (like where rand
comes from) to keep it simple.
class C { public new(){}; public Foo(){ if(rand() == 0) return 1; else return new A(); }; } main() { var c = new C(); var i = c.Foo(); i.ToString(); }
Interpreting main
we see that, after the call to c.Foo()
, i
's typeset
should have 2 concrete types, one corresponding to an integer, and the other to an instance of A
.
When we check to see if all of these types have a field named ToString
in the last line of main
we encounter an error: C
doesn't implement ToString
.
Hopefully that was pretty clear, and also pretty obvious. You might recall that I pretty severely limited the language at the outset, and now you might be able to guess why: if we allow recursion with this simple algorithm, it will go into an infinite loop, following recursive invocations over and over. If we allow loops, the problem is only slightly more subtle. Consider the following code:
main() { var i = "Hello!"; for(...) { i.ToString(); i = new A(); } }
Our simple algorithm, with no modification, would not see an error here, though there clearly is one if the loop runs
more than once. It will check the invocation i.ToString()
when i
's typeset contains only
Int[]
, and not when it contains also A[...]
.
So, lot's of new stuff (not like there was much beforehand). I'll talk more about typesets, concrete types, callables, and I'll start to address both of these algorithmic issues in the next few posts.
I said that Mix is an object-oriented language after the tradition of C# and Java, which is true enough. There are classes, and the syntax is C-like, and the operational semantics of the language are exactly what the average C# or Java programmer might expect. So, sample code:
class SomeClass { private var someMember; public new(anArg, anotherArg) { if(anArg) { this.someMember = anotherArg.M(anArg); anotherArg.N(); } else { this.someMember = new List(); } } public GetSomething() { for(var i = 0; i < 42; i++) { this.someMember += i / 6; } return this.someMember; }; };
This class really doesn't do anything; it's here to highlight that Mix code really looks like some code you might have written in a language like C#. It has if and for, members and methods, and all kinds of normal stuff; if you don't like that everyday stuff you're not going to like Mix's syntax, unfortunately. In my opinion it is a reasonably usable and certainly widely known syntax, so I went with it.
The only interesting bit is that there aren't any types in the usual places. That is,
you normally expect, in a statically typed language (unless your boss let's you write in
O'Caml, or F#,
or some other language with inference) to have to write types all over the
place. If it were Java we'd need them for each member, for each method (return and argument
types), and each local variable. In C# these days you can
drop the types on locals. But
here we haven't any at all, unless you count SomeClass
itself, which I won't: it's more
like a prototype, in that we use it to create an object, but not to classify objects later
on, as we'll see.
So, this is the language in which I am interested: no types on members, methods, or local variables, looks somewhat like your everyday object-oriented language, simple enough. I could compile this down to Python, say, or Javascript, easy as that (where "that" is snapping your fingers). But I don't only care about compiling and running Mix programs, I care about analysing them for correctness at compile time (and really, all I would have done was give a different syntax to a fragment of one of those languages; that's no fun). Now, you might be the sort of person that isn't interested in static analysis; you probably won't care for Mix, either, so you might want to hang out with the folks that left when I mentioned braces.
Still here? Ok, so, what do I mean by correctness? I'm taking the very simple criteria of "doesn't call methods on objects that don't support those methods" as meaning "correct" with respect to the type system. So, I don't want to call Append on an Int, for example. We'll (eventually) see how this criteria can be used in some interesting and (hopefully) powerful ways, but for now let's leave it (my "correctness" criteria) at that.
To see what I mean, let's look at a (very slightly) more reasonable example, with PETS! (I like pets, in particular my cat, Mortimer). Here we go:
class Cat { public Speak() { return "meow!"; }; public Pounce(target) { ... }; } class Dog { public Speak() { return "arff!"; }; public SearchForBombs() { ... }; }; main() { var pet = null; if(somethingOrAnother) { pet = new Cat(); } else { pet = new Dog(); } print(pet.Speak()); };
So, super simple example, but you probably already get the idea: pet
could be either a Dog
or a Cat
, but
because both of them can Speak
, we're all good. What if the we tried to call Pounce
on pet? According
to Mix (that is, according to me), we'll treat it as an error: there is some path through main
that leads
to pet having an instance (of the class Dog
in this case) that cannot Pounce
. Even if the programmer
knows, somehow, that somethingOrAnother
is always true, the type system will treat this as an error.
One way to think of this, which I'll make more concrete in future posts, is that every target of an
method invocation must always (syntactically, in the source) be an instance of a class that implements
the method. So, in the above example, in the expression pet.Speak()
, pet
will,
no matter what, be set to an instance of a class that implements Speak
, and is therefore correct.
Anyway, that should give you a bit of the flavor of what's to come.