Thursday, February 18, 2010

Artificial

I remember reading a book by Dan Simmons a long time ago; could have been Endymion or The Rise of Endymion. I've always been a science fiction fan, and the Hyperion series was pretty cool. In this book, whichever one it was, there are artificial intelligences, as tends to be usual for works of science fiction. But what was really interesting to me at the time (must have been 10 years ago, at least) was that, in the book, these artificial intelligences were described as having evolved, in the distant past, from an experiment in evolving programs. The best part was that Simmons actually mentioned an artificial life researcher by name: Thomas Ray, whose work (the Tierra project) I promptly investigated.

Turns out it was The Rise of Endymion; I should read my own links.

I eventually implemented my own little a-life simulator, had fun with it for a while, and then moved on to other things; I'm not a biologist. But recently while on vacation I decided to mess around with it again. I had stumbled across nanopond at some point, and it looked rather neat, so I decided to implement a system based loosely on that project -- it is unimaginatively called "PondLife".

The basic idea behind nanopond, PondLife, and a host of other similar projects is simple: you have a bunch of organisms that consist of some kind of "code", and which can reproduce. And you execute them over and over again. Usually the virtual machine that executes them isn't perfect, and makes mistakes; this leads to mutations. And usually the organisms compete for something (energy, processor time, memory, or other more artificial resources), which leads to a form of natural selection. And so you get evolution.

Nanopond is written in really efficient C, and uses all sorts of twiddling to pack 8 instructions into every int. I decided to implement my system in C# because I long ago lost any desire to program in the hellish twisted misery that is C ever again, you can't make me damnit! Bwaahhh!!!!

Sorry about that. So, not a fan of writing C when I don't have to (although, as we shall see, this could have been one of those times), and I just wanted to get something working immediately, which is easy when you have a huge standard library at your fingertips. Plus, everyone says C# can be just as fast as C, so why not?

Here's the scenario: there's a 2 dimensional grid of "cells", which wraps around like in Asteroids; its a torus. Each cell has a "genome", basically a vector of instructions that are its code. (In nanopond every cell gets the same amount of code; in Pondlife different cells can have different amounts of code). The instruction set is basically that of Brainfuck with a few tweaks: there's a register and a pointer, a cell can increment and decrement the register, and move the pointer forwards and backwards. A cell can read its code into the register, and write the register into the code (that is, PondLife code can self-modify). And there is an "output buffer", again which the cell can write into and read out of. Finally, the cell can "turn", updating the way it is facing in the 2D grid, "kill" a neighbor, or "share" with a neighbor (more on neighbor's in a second).

Every cycle a random cell is selected and executed until it is out of energy. If it has written anything into the output buffer, the buffer is copied into the neighbor's code (where "neighbor" depends on which way the cell is facing). Instructions take energy to execute, so each cell also has its own amount of energy; this is the resource I mentioned. And the instructions don't always go as planned; hence mutation.

A cell can gain energy by killing its neighbor, and from the sun or other operators. See, there are various "operators" that are executed every so often. One is the SunshineOperator, which adds energy to all cells. Another is the SedimentOperator. Remember how I said that each cell can have a different length of code? Well, if you think of that length as the depth of the world, then think of this operator as sediment accumulating beneath living cells. Eventually this sediment can accumulate to the point where the cell doesn't have enough room for its replication code, and therefore cannot reproduce.

To combat sedimentation we have the ErosionOperator, which works as you might expect, decreasing the depths of all cells and smoothing them as well. There are also a number of "randomization" operators; the CosmicRayOperator randomly flips bits in cell code every once in a while, and the RandomizeOperator more aggressively randomizes entire genomes. There are operators for decay, for seasons, and other fun things.

So, that's all well and good, but what do we get? Well, one fun thing we get is abiogenesis: starting with an empty environment and a few operators (sunshine and randomization are all that you really need in PondLife) after only a little while "viable replictors", cells which can copy themselves well enough that the copy can copy itself again, appear in the environment. And PondLife doesn't seem to "get stuck" too quickly, where only one life form exists in the environment; this is especially true if you include the sedimentation and erosion operators, which force cells to compete not only for energy but also for "depth".

"Pics or it didn't happen." I'll do you one better. I ran a simulation in a 128 by 128 world for 8 or so hours and have 2 videos. In the first video, different colors are different lineages; two cells have the same lineage if one was produced by the other, or if their parents have the same lineage. In the second video, depth is indicated in grayscale, with black being deeper (where more code can fit) and white being shallower (where less code can fit). In this simulation I also used an operator that caused less energy to be added to cells in "deeper water" for fun.

That simulation only ran for 2 billion cycles. That's not many, considering it took 8 hours (probably would have been the most intense computation this poor laptop has ever seen, except I did a 17 hour run right after). I need to optimize.

EDIT: Here are the videos:

Tuesday, February 16, 2010

Time Dilation

I haven't had much to say recently. I rather doubt anyone has noticed, and yet nevertheless I feel a bit guilty... Perhaps this indicates some underlying psychological problem? Anyway, I have an excuse: I was working on two different papers with various colleagues. If things work out I may infinitely increase my publication list!

The first, entitled Focused Inductive Theorem Proving describes the theory behind our interactive and automatic theorem prover for an intuitionist logic with support for generic quantification (using nabla, which is nice for reasoning about systems with names and binding) and definitions (via fixed points, which are useful for encoding inductive and coinductive data structures, like lists and streams, respectively). The system is called Tac, and is released under the GPL version 3. One fun note: we have (kind of, call it mostly) solved the POPLMark challenge parts 1A and 2A with Tac.

The second is A Meta-Programming Approach to Realizing Dependently Typed Logic Programming, and describes a technique that we developed to implement logic programming in LF via translation to λProlog, a higher order logic programming with an efficient bytecode based implementation. The tool, named Parinati, is also available under the GPL version 3.

Other than that I've been snowboarding and working on a fun little side project, which will be to topic of my next update, which I am sure you eagerly await.