Cultboundvariable was our team for the 2007 ICFP Programming Contest. This was our sixth time entering; see also web pages for 2001, 2002, 2004 and 2005. (In previous years we used other names, usually "Tycon Mismatch".) In 2006 we were the judges for the contest, so this year we changed our name to "cultboundvariable" (a UNIX-sanitized version of "Cult of the Bound Variable", the organization from the story of our contest) to strike fear into the hearts of our competitors. Alas, we scored pretty badly in this contest, although we don't think our final score adequately reflects our progress.

Our team this year consisted of Tom Murphy VII, Daniel Spoonhower, Jason Reed, Katrina Ligett, Sean McLaughlin, Akiva Leffert, Dan Licata and Austin McDonald. Unfortunately for continuity, most of our type theorists and previous teammates are away for the summer this year, but this was a fun group with a wide variety of skills.

The Task

The goal of the 2007 contest was to generate a specific image using a simple programming language (called "DNA") and graphics language (called "RNA"). The program that we submitted would be prepended to a large (7mb) DNA program that the judges provided, which meant that our program could use anything that already existed in that 7mb program. Tom kept calling this program the "codex" (our 2006 contest also featured a large mysterious program called the Codex). There were four important metrics in scoring: the number of pixels (exactly) correct in our image, worth 10 points each; the length of the submitted prefix in DNA base pairs (worth -1 point per base); and the execution limits on space (25mb) and time (3 billion operations) either of which disqualified the entry if exceeded. The judges nicely provided an online scoreboard that allowed us to validate our entries easily and automatically kept the best one.

What we did

We spent most of the first day implementing the DNA and RNA languages, in two separate teams. The DNA implementation, cleverly called "Polymerase" was implemented straightforwardly using lists in SML, but we found this to be rather slow: some operations require us to have fast random access into the list of base pairs. After lunch we implemented a hybrid vector/list data structure using SML's nice VectorSlice type, and fixed some bugs, and were able to start running the codex in a reasonable amount of time (about 20 minutes to render the endo images).

The RNA language was also pretty easy to implement in SML; we got this right the first time (except for our PNG output which was C code, hah) and didn't have to optimize it much. The only efficiency improvement we made was to cache the value of the color bucket for successive calls to setPixel. (For our later rasterization strategy we had to make some more improvements, which are explained below.)

From this point we were able to find the Repair Manual prefix, which led to the Daylight prefix, and allowed us to get on the board with what was to be our final score (!) of 1160658. This score was very common, although it seems some people were able to get one more point by submitting a prefix with one fewer character that did the same thing. We didn't try to do this because we thought we'd have our other strategies working by the deadline!

After we started unlocking manual pages, we split off into a bunch of smaller projects. Tom wanted to work on compression, because he had an idea for that and figured that any solution could ultimately benefit from compression (except for those 28 byte prefixes, alas). Spoons and Katrina and Sean were working on color mixing, which turned out to be the key trick we need to make our unorthodox solution feasible. Jason and Akiva and (later) Sean and Dan (working from Merry England) were unlocking and rendering more manual pages.

Codex investigation

The codex includes a lot of information and a lot of routines apparently intended to be used to render the correct Endo image. Unfortunately we found the calling conventions for these functions to be odd and difficult to get working. We were able to render a lot of manual pages, but only some of the functions that we found documentation for were callable, and none of these improved our score enough to submit. To facilitate the investigation of the codex, Sean built an interactive debugger, and Tom added features to show intermediate images as the codex ran. (This turned out to be important because a lot of experiments caused the codex to produce images we had already seen, so we wanted to stop those 20 minute excursions early.) Also because there were a lot of 20 minute excursions Sean spent some time optimizing the string matching with some nice fresh algorithms from the CLR book. Jason (and Austin?) also built an assembler that allowed us to conduct these experiments more pleasantly by writing in a slightly higher-level DNA syntax.

It seems that the organizers of the contest wanted us to render (approximations of) the image by figuring out how to call these functions and use them to make various parts of the picture. All in all we unlocked 80 pages or so of documentation. There is a lot of fun stuff in there. Some of them have puzzles of various sorts surrounding them, which we made a lot of progress on (we found embedded image files that contained embedded MP3s, etc.) but we weren't able to string together any function calls in order to make any useful images. We found this a bit frustrating. Sometimes Jason would come into the other office where we were working and just start clutching his chest or stomach and have to lie down. This is why some of us spent most of the weekend trying to solve the problem from scratch, pretty much ignoring the codex.

The road less traveled

Before we understood what the codex contained, we were already trying to come up with ways to generate the image. One obvious thing to do is to just try to draw the image, pixel by pixel, using the RNA drawing commands that are provided. This method doesn't even necessarily need DNA. The problem and the scoring of the problem were clearly designed to thwart this strategy, but some calculations made it seem feasible, so we set off trying to do it.

One big obstacle is that the target image has detailed "watermarking" (like on a check or dollar bill) to make sure that there are no long runs of the same color. This makes the fill and line drawing functions of RNA pretty useless for drawing exactly the pixels as found in the output image. In fact, the codex renders the outdoor scene using a complex overlay of gradients and faint translucent lines, etc. We still thought that clever compression could allow us to render this image, though. As we started to build a naïve but correct rasterizer, we found another problem: the way that colors are specified meant that it was very expensive to generate a very specific color. Basically, colors are made by combining a certain number of drops of various primary colors in a bucket, with the resulting color being a mix of the input colors. To generate a color that is almost—but not quite—red, you might need 200 drops of red and one drop of white, which takes 201 instructions, each 10 bases long (we need to beat 10 bases per pixel to score any points at all, so this is way too much!). Spoons and Katrina and Sean found another clever way to mix colors: by using multiple translucent layers and compositing. With this method (just think about greyscale values here), we look at the binary representation of the desired color, which is 8 bits. We then create 8 layers, each 50% transparent, either containing 255 or 0. Each layer is composited with the bottom layer immediately. At the end, the lowest layer has been diluted to 1/128th of its original strength (so it is the 1s place), the next to 1/64th (so it is the 2s place), then 1/32nd (the 4s place), etc. This means that in generality we only need a few instructions per bit to encode each color, which gets us closer to a reasonable solution. Unfortunately this does not quite work. The fractions in their color mixing algorithms, because of integer rounding, are actually 127/255, which is not quite one half. That means that this method often produces the wrong colors. Katrina in her zest for mathematical beauty tore her hair out trying to find a nice solution to this problem. Tom declared it would be more expedient to just try all reasonable possibilities by computer, and found that despite the rounding errors, it was still possible to generate color values 0–252 and 255 using this 8-layer decomposition. By tweaking the initial condition, it was also possible to generate 253. Tom then tore his meagre hair out trying to figure out how to generate the one last color value, 254. The best he and Spoons could do was a bizarre two-pass scheme where 8 layers with the tweaked initial condition were used to generate color value 253 (a highly tweaked initial condition) and then a mixed binary-ternary layer scheme allowed for the generation of 254 from that. This was a total of 16 layers. But guess what? The difficulty of generating 254 through compositing does not only afflict us: the target image has zero pixels with this value in any color channel. That means our special case is never necessary.

Once we figured this out Spoons implemented a rasterizer that could generate the upper left corner of the image correctly, but it was too large to get any points. It was also extremely slow, because for each pixel we created 8 layers (600x600 RGBA) and composited them. We had to special case our RNA renderer for layers that were empty or contained only one pixel in order to make it run in a reasonable amount of time. (We think the judges did not implement this optimization and we apologize for our very slow entry.)

Ultimately we switched to a layer-by-layer system instead of a pixel-by-pixel method. This has the neat side effect of reducing a lot of the complexity of the image; although adjacent pixels almost never have the same value in the final image, they often share many bits. (This is especially true of gradients.) If you look at the animation below, you'll see that only a few layers are noisy; the rest have mostly large solid color areas. Spoons and Katrina spent a lot of time improving the rasterizer so that it efficiently generated these layers. There are a lot of tricks to employ here, like keeping the same color in the bucket as long as possible, drawing multiple pixels at once with lines, filling the layer with the most common color before starting, and moving efficiently from each scan line to the next.

Compression

Meanwhile Tom was working on compression. DNA is a pattern substitution language similar to regular expressions. A particular kind of pattern that's useful for compression is as follows:

"find b characters, call that 'beginning'.
 find m characters, call that 'match'.
 find s characters, call that 'skip'.
 find e characters, call that 'end'."

This describes a string that looks like [beginning][match][skip][end], where the lengths of those strings is known in advance. (When you know the length of strings like this, pattern substitution is guaranteed to be very fast and cheap in DNA.) We replace the pattern, which is the entire string, with [beginning][match][skip][match][end]. Doing so takes the string 'match' which already occurs in the input and duplicates it at a specific position of our choice. (We actually don't need to match against the end (it will stick around if we don't mention it) and we can do multiple duplications at once, and there are some other good tricks.)

With this technique we can take any DNA program that has a repeated string in it, and compress it (as long as the overhead of the pattern and substitution is not larger than the savings). We can repeat the process, sometimes even compressing the compressor that was deposited in earlier rounds! Tom's first version of the compressor got promising results for the then very naive rasterizing code, but was very slow, so he set out to make it run faster. The basic algorithmic problem is to find the largest "mass" of repeated non-overlapping substrings in the input (3 repetitions of a 1000 character string is better than 2 repetitions of an 1100 character string.) Eventually he used a variant of the rsync algorithm in order to cheaply find repeated blocks at any offset, and then eagerly expanded those to their maximum lengths. This runs fast enough to compress large strings (several megabytes) over hundreds of rounds in an hour or so. (One nice thing about this design is that we can also terminate compression at any point and still have a valid file.) Our later effort in an ad hoc compressor "interp" for the RNA sequences made this compressor less valuable (it originally achieved 20:1 ratios!) but it still shaves 20% off of our best solution. It compresses the codex from the judges by 26%, suggesting that they didn't implement this form of compression. (Note that compressing this way does change the meaning of the program in the case that it is prefixed with some string, so this is not a completely fair statistic.)

Interp

The other thing that's kinda crazy about RNA commands is that there are only 14 or so of them but they are 10 bases long. That's a lot of wasted bases! During lunch/dinner on the last day (we had been putting this off while we made our solutions actually correct) we finally started talking about ways to generate RNA more efficiently than just storing it verbatim. This turns out to be fairly straightforward. We represent RNA commands (which can actually be any sequence of RNAs, such as "move right 128 times") with short prefix-free codes. (Tom and Spoons initially determined codes themselves based on their feelings and emotions, but then Katrina and Sean employed the cold impartial hand of science and used Huffman's algorithm to generate optimal codes.) Immediately ahead of the sequence of codes we put two copies of the decoder. The decoder consists of a sequence of patterns, that look like this:

"find t characters (the 'tail'), and throw them away.
 find s characters, call this 'self'.
 find the specific code c."

If this match succeeds, we replace the match with [rna-commands-for-code][self][self]. The RNA commands are immediately executed and we are ready to run again. If the match fails, it continues with the next match in the tail, looking for a different specific code. There is some subtlety to generating the decoder, because it needs to mention its own length and the lengths of each of its tail segments, but overall it is not too tricky. We need two copies of the decoder just so that we can loop indefinitely, because each decode step consumes the decoder at the front.

Interestingly, we found a manual page in the codex that describes a similar technique, including the duplicate "backup copy." We didn't look at the actual implementation to see how their technique compares to ours, but it claims to search in a table instead of doing a series of constant-time tests like ours does. Thus theirs might do more work than is necessary. They support variable-length codes (but they are represented as integers, which probably wastes information) but not variable-length outputs. So we think our compressor is probably better but then again we basically got 0 points on this contest so maybe we shouldn't brag too much.

Using the codex

Our rasterizer could draw any image you fed it (but watch out for those 254s), so one thing we did was feed it a simpler image. We did this by creating a prefix that ran the daylight string on the codex (producing an image that gives many correct pixels) and then only drawing the difference mask between the daylight image and the target image. We thought this was a nice way to do it because if the codex team improved the rendering of that initial image (we almost had those damn ducks!) then the rasterizer would have to do less work. So, our final solution did use the codex a little bit.

Results

In the last few hours, our combination of clever rasterization techniques, with improving "interp" encoding and compression, was putting us quite close to the point where we'd be able to outstrip the prefix length penalty and start getting points for our correct pixels. Unfortunately (this was maybe 3am) our pixels started becoming not correct, and we had made so many frantic changes recently that we were not sure what was causing it. (It turned out to be a bug in the prefix assignment for "interp".) At the last minute, we blindly turned in what we had, which turned out to produce this image:

You can clearly see our "inkjet printer" rasterzation strategy; unfortunately because it is missing some clearbucket (or something) commands it turns out to produce this ridiculous image even though it does almost all the work necessary to produce the right thing.

We stuck around for a few hours and fixed the bug, but of course then it was too late. Our final score would have been around the same as the daylight string (so not a winner) but it would have made us feel good to get a totally correct image within the deadline and scoring window. Too bad!

Here's that bugfixed prefix: cbv-late.dna.bz2. DNA length: 2031760 bases. bzip2'd: 321k.

Movie

Here is a movie we made after the contest ended, rendering our bug-fixed solution:


An animation of our rasterization solution rendering on top of the daytime image. You need Flash 8 or maybe 9. It generates all pixels correctly. You can see each layer being drawn in sequence, and get a relative sense of the number of instructions necessary by the rendering speed. Note how some layers are much more complex than others. (The end looks fake but that is just a consequence of the way we made the movie. Download our solution and try it!)

Thoughts

We had a lot of fun with this year's contest and look forward to hearing the results. And we will probably see you again, next year!

- Tom 7 for team cultboundvariable