Wednesday, December 30, 2009

.ctor

I was wandering around the internet looking for interesting programming-language blogs and came across the blog of Gilad Bracha, one of the people involved with Newspeak. There were many interesting posts, but an older one caught my eye and got me thinking about object instantiation in Mix.

In "Constructors Considered Harmful", Mr. Bracha expounds on the problems associated with object constructors, both for the reusability and extensibility of a piece of code and for the uniformity of the language.

One way of thinking about the first point is to consider object construction to be separate from object initialization. In this scenario new is an operation that first constructs an uninitialized object of a particular class and then calls the constructor to initialize it. Seen this way, the signature of the constructor in many object-oriented languages is rather arbitrarily constrained: it must produce an instance of the class in which it is defined. For instance, new X() must return an instance of X and can't return an instance of a related type Y. So if your code ever changes such that you need that functionality, you'll have to traipse all around your code base replacing calls to new X(). That's the first problem; how can we

What might be nice would be for us to separate these two concerns explicitly, so that new really does only instantiate an uninitialized object, and then some method on that object is responsible for initializing the object.

But wait, you say: won't this allow us to accidentally create and then use uninitialized instances of a particular class? And isn't that frightening and terrible? Well, sure, as presented thus far it would. Then again, nothing (save for good practice) prevents you from defining all your classes with an empty constructor and a method named Init today. But we can do a bit better (though we'll still trust ourselves not to amputate our own feet via firearm).

Let us say that one can only create an uninitialized instance of a particular class from within that class, so that a random programmer can't new up an uninitialized instance. Then let's restore a notion of constructor (call it an initializer) that is nothing but a (possibly named) constructor, but one that is obligated to return the object it is responsible for creating and initializing. Because it returns that object explicitly, our initializer can return an instance of a different type if it so desires.

Aside: This is really just like static factory methods in Java, and in fact that is one technique for avoiding some of the problems with constructors that has found its way into general practice. However, in Java we can still write regular constructors, with all of their supposed problems.

But aren't we back to regular old constructors? Not quite, because we are not required to return any particular type from our initializer. So we might have the following code:

class Foo
{
  new Foo(i)
  {
    var t = new Foo;
    t.Bar = null;
    t.Num = i;
    return t;
  }
  
  new FooBar(i)
  {
    var t = new Foo;
    t.Bar = Bar.Bar();
    t.Num = i + 8;
    return t;
  }
}

var f = Foo.Foo(42);
var g = Foo.FooBar(34);
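
For comparison, roughly the same shape can be approximated in TypeScript. This is only a sketch, not how Mix works: the private constructor stands in for the rule that new may only be used inside the class, the static methods stand in for named initializers, and Bar, create, and createWithBar are names I made up for the example.

```typescript
// Hypothetical collaborator, present only so the example is self-contained.
class Bar {}

class Foo {
  bar: Bar | null = null;
  num = 0;

  // Private: only code inside Foo may create a bare, not-yet-set-up instance.
  private constructor() {}

  // Named initializer: creates an instance, sets it up, and returns it.
  static create(i: number): Foo {
    const t = new Foo();
    t.bar = null;
    t.num = i;
    return t;
  }

  // A second named initializer with different setup.
  static createWithBar(i: number): Foo {
    const t = new Foo();
    t.bar = new Bar();
    t.num = i + 8;
    return t;
  }
}

const f = Foo.create(42);
const g = Foo.createWithBar(34);
// new Foo();  // rejected at compile time: the constructor is private
```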

Then, if ever we need to modify the constructor of Foo, so that, for instance, it returns an access controlled instance of a Foo, we can do the following:

class Foo
{
  new Foo(i)
  {
    var t = new Foo;
    t.Bar = null;
    t.Num = i;
    return AccessController.New(t);
  }
  ...
}
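
A TypeScript sketch of the same swap, under some assumptions: FooLike and GuardedFoo are hypothetical names, and GuardedFoo is only a stand-in for whatever AccessController.New would actually produce. The point it tries to show is that the initializer can hand back a different concrete type while the call site stays exactly as it was.

```typescript
// Hypothetical interface that callers program against.
interface FooLike {
  readonly num: number;
}

// Hypothetical wrapper; a stand-in for an access-controlled Foo.
class GuardedFoo implements FooLike {
  constructor(private readonly inner: FooLike) {}

  // Reads are forwarded; no setter is exposed, so writes are not possible.
  get num(): number {
    return this.inner.num;
  }
}

class Foo implements FooLike {
  num = 0;

  private constructor() {}

  // The initializer now returns the wrapper instead of the raw Foo.
  static create(i: number): FooLike {
    const t = new Foo();
    t.num = i;
    return new GuardedFoo(t);
  }
}

const f = Foo.create(42); // the call site does not change
```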

Another complaint raised in the aforementioned blog post (the second point I mentioned) is that constructors behave differently from other "callable" objects. That is, creating an object using a constructor is fundamentally different at the language level (for many languages, like Java, C#, and so on) from calling a function (or static method, or whatever), so that you cannot, in general, pass a constructor around or store it in a variable; constructors aren't first class. In the above presentation there is nothing precluding them from being so.
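
As a small illustration in TypeScript (again a sketch, with create and make as made-up names): once the initializer is an ordinary static method, it is an ordinary value, and it can be stored and passed along without a wrapper.

```typescript
class Foo {
  num = 0;

  private constructor() {}

  // Refers to the class by name rather than through `this`,
  // so the method keeps working after it is detached from Foo.
  static create(i: number): Foo {
    const t = new Foo();
    t.num = i;
    return t;
  }
}

// Store the initializer in a variable...
const make: (i: number) => Foo = Foo.create;

// ...and hand it to a higher-order function like any other function.
const foos = [1, 2, 3].map(make);
```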

So really, all we have done is introduce a limited form of static methods and restrict new, as an object instantiation device, so that it can only create new, uninitialized objects of the class in which the call to new is found. What we gain is that constructors need not return a particular type, that we are still "encouraged" by the language to write constructors that return initialized objects, and that constructors behave in the same way as regular functions, in that they may be passed around, stored in variables, and so on.

In fact, I have never really faced either of the problems that Mr. Bracha described in my own programming; perhaps this is because I haven't worked on any really large projects with many other participants. In general I've never needed a constructor like that of Foo above, and if I've ever needed to pass a constructor as a function I have just wrapped it, thus:

  var ctor = function(i, j, k){ return new SomeClass(i, j, k); };

But still, it is an interesting point that merits some thought, especially in the context of large codebases that act as a dependency of many other pieces of software, and whose signatures cannot easily be changed.
