Saturday, October 2, 2010

Oleg's Unusual Polymorphism

I was wandering the internet, and stumbled once again across Oleg's page of O'Caml and other wizardry. One fun example was that of polymorphic selection.

The idea is that we (for whatever reason) want to be able to select data from a heterogenous structure in a particular way. In O'Caml one of our tools for storing such data is a tuple, so we'll take as input a pair of pairs. We want to be able to select either both of the fst elements, or both of the snd, using some function poly along with the selection function (either fst or snd):

    let data = ((1, 2), (3, 4))
        val data : (int * int) * (int * int)
    
    poly fst data
        - : int * int = (1, 3)
    poly snd data
        - : int * int = (2, 4)

So, how do we write poly? Here's our first attempt:

    let poly selector (l, r) = (selector l, selector r)
        val poly : ('a -> 'b) -> 'a * 'a -> 'b * 'b

    poly fst data
        - : int * int = (1, 3)
    poly snd data
        - : int * int = (2, 4)
Success! Or so we hope... but take a look at the type of poly. Not exactly what we were looking for; consider:
    let data' = (("1", 2), (3, "4"))
    
    poly fst data' (* Type error! *)
The essential issue is that, in the definition of poly, we are trying to use selector polymorphically. In ML we can only apply selector monomorphically, that is, to arguments of a single type within the body of poly. We actually want the type of poly to be something like:
    val poly : (forall a. b. (a, b) -> c) -> ((d * e) * (f * g)) -> {d * f when a = c, e * g when b = c}
Or something crazy like that (as Oleg mentioned, even with "real" polymorphism a la System F we cannot type poly the way we would like)... But it turns out that, if we are willing to do a bit of hackery we can get what we want:
    let poly selector (l, r) =
        selector ((fst l, fst r), (snd l, snd r))
             
        val poly : (('a * 'b) * ('c * 'd) -> 'e) -> ('a * 'c) * ('b * 'd) -> 'e
What we've had to do is inline all possible projections that selector can make, so that we can apply it just once (at one type). Incidentally, if you don't like the idea of invoking both fst and snd each time you want just one or the other, you could thunk it:
    let poly selector (l, r) =
        selector (fun () -> (fst l, fst r), fun () -> (snd l, snd r))
             
        val poly : ((unit -> ('a * 'b) * (unit -> 'c * 'd)) -> 'e) -> ('a * 'c) * ('b * 'd) -> 'e

Here you'll have to invoke poly with an extra argument:

    poly fst data' ()
        - : string * int = ("1", 3)

So, that's all just a recap of what Oleg wrote, and which I found quite interesting. But as I looked a bit more, things seemed strange. First, take a look at the type of the first argument to poly:

    val selector : ('a * 'b) * ('c * 'd) -> 'e

What does this mean? It's basically saying that selector can return anything it wants; that is, it's a whole lot less restrictive than the "crazy" type we wanted to write above. So poly works as we'd like when selector is either fst or snd. But its not a whole lot more useful than just writing out the whole thing longhand, as below, because defining any more selectors will require us to understand the internals of how poly operates (and also how our heterogenous data is structured, of course).

    let fst' ((a,b),(c,d)) = (a, c)
        val fst' : (('a * 'b) * ('c * 'd)) -> ('a * 'c)
    let snd' ((a,b),(c,d)) = (b, d)
        val snd' : (('a * 'b) * ('c * 'd)) -> ('b * 'd)

Nevertheless, we have solved the problem that we set out to solve, by first inlining (and thunking) all of the possible projections of our heterogenous data structure, and then applying our selector to the result. Oleg points out that this demonstrates the relationship that inlining has to polymorphism: essentially inlining allows us to "outside" the restrictive type system and check types only after expansion, which means we can seemingly abstract over aspects of a type that the type system itself cannot allow. This is played out to good effect in F# with its notion of inlining and statically resolved type parameters, though of course there the language actually does the inlining for you, and has extended the type system to make sense of it.

So anyway, what happens when we want to apply this to larger structures, perhaps without as much built-in "symmetry"? Let's add some extra data to our structure.

    type 'a 'b 'c 'd 'e heterogenous = ('a * 'b) * ('c * 'd) * 'e
    let poly selector (l, r, extra) = selector ((fst l, fst r, extra), (snd l, snd r, extra))    

So that works, fair enough. But still, it feels like there's something funny about our examples. It's almost like our structure isn't so heterogenous after all. What about this?

    type 'a 'b 'c 'd 'e heterogenous = ('a * 'b * 'c) * ('d * 'e)

Well now, what kind of selectors could we use on this data structure? There really aren't any yet (though of course we can write some). In fact, the only reason that we ran into trouble before is that we didn't want to write selectors for the data at hand, instead opting to try to use existing selectors on different pieces of the structure. We were exploiting some uniformity, some homogeneity, in the structure (namely that there were several pairs in the structure), and in fact we were able to capture this homogeneity using the type system (in spite of it, perhaps) after a fashion. But I wonder: How often you might run into this problem, where you don't, for whatever reason, want to write your own selectors over the structure in its entirety? Consider:

    let fst'' ((a,_,_),(b,_)) = (a, b)
    let snd'' ((_,a,_),(_,b)) = (a, b)
    let thr'' ((_,_,a),(_,_)) = a
    
    let poly selector data = selector data

Here poly just becomes apply, and of course our selectors understand the entirety of the structure. There wasn't any ready-made uniformity in the structure, so we did it all by ourselves. Of course, if could abstract over the length of tuples we would be in business again... Well, enough of that.

Note: I had to run before I could proofread this... apologies.

No comments:

Post a Comment