cashto's 2013 ICFP contest writeup

Team cashto is, as it has been since 2007, the work of one Chris Ashton from Seattle.

By the Numbers

The Solver

Like I always say: the key to the ICFP contest is brute force. These problems are always in NP. It's never a matter of finding some clever trick that will lead you directly to a perfect solution every time. It always comes down to a matter of search -- search, plus some insight into the problem structure that enables you to lop off as many search branches as possible.

My solver starts off by enumerating up to 25 million possible solutions to a problem (later reduced to 15 million, due to time pressure), given the maximum size and set of allowed operators. This takes several minutes, but that's fine, as this happens before the first /eval is requested, so the clock hasn't started ticking yet.

Enumeration of solutions is done via a fairly straightforward dynamic programming process. Supposing that I have all (interesting) solutions of size 1 to n, I generate solutions of size n+1 by taking smaller solutions and combining them or extending them with one operator.
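
To make this concrete, here is a sketch of that bottom-up step, written in C# since that's where the generator eventually ended up (see The Code below). The names are mine rather than the actual source, and if0 and fold are left out -- they combine three sub-solutions in the same spirit:

    using System.Collections.Generic;

    // Hypothetical AST node: an operator name plus up to two children.
    record Expr(string Op, Expr? A = null, Expr? B = null);

    static class Enumerator
    {
        static readonly string[] Unary  = { "not", "shl1", "shr1", "shr4", "shr16" };
        static readonly string[] Binary = { "and", "or", "xor", "plus" };

        // table[n] holds every (interesting) solution of exactly n nodes.
        public static List<List<Expr>> Enumerate(int maxSize)
        {
            var table = new List<List<Expr>>
            {
                new(),                                  // size 0: unused
                new() { new("0"), new("1"), new("x") }  // size 1: the atoms
            };

            for (int n = 2; n <= maxSize; ++n)
            {
                var exprs = new List<Expr>();

                // A unary operator costs 1, so wrap every solution of size n-1.
                foreach (var e in table[n - 1])
                    foreach (var op in Unary)
                        exprs.Add(new Expr(op, e));

                // A binary operator also costs 1; split the remaining n-1
                // nodes between the two operands in every possible way.
                for (int i = 1; i < n - 1; ++i)
                    foreach (var a in table[i])
                        foreach (var b in table[n - 1 - i])
                            foreach (var op in Binary)
                                exprs.Add(new Expr(op, a, b));

                table.Add(exprs);
            }
            return table;
        }
    }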

The search space is reduced by exploiting a number of algebraic identities.
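
For example -- and these are my guesses at the kind of identity involved, not a transcript of the solver's actual list -- a pruning predicate over the Expr type above might look like:

    // Anything a rule proves redundant is never added to the table,
    // because an equivalent expression of smaller size already exists.
    static bool IsRedundant(Expr e) => (e.Op, e.A, e.B) switch
    {
        ("not",  { Op: "not" }, _)         => true,  // (not (not e)) = e
        ("and",  var a, var b) when a == b => true,  // (and e e) = e
        ("or",   var a, var b) when a == b => true,  // (or e e)  = e
        ("xor",  var a, var b) when a == b => true,  // (xor e e) = 0
        ("plus", _, { Op: "0" })           => true,  // (plus e 0) = e
        ("and",  _, { Op: "0" })           => true,  // (and e 0) = 0
        ("or",   _, { Op: "0" })           => true,  // (or e 0)  = e
        ("shl1", { Op: "0" }, _)           => true,  // shifting zero is still zero
        _ => false
    };

Commutative operators can also be generated in just one canonical argument order, which halves those branches.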

Additionally, the solver keeps track of any "known" bits in a solution. For example, (lambda (x) (shl1 (shl1 (shl1 (or (shr4 x) 1))))) will always have '0' as the most significant bit and '1000' as the four least significant bits. Since this is known to always be non-zero, the solver won't use it as an "if0" test.

If all the bits are known, then the solution is essentially a constant, and can be replaced by any simpler solution that evaluates to the same constant.
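
One way to represent this tracking is a pair of 64-bit masks: one saying which bits are known, the other holding their values. The struct below is my reconstruction under that assumption, not the shipped code, and shows only a few operators:

    readonly struct Bits
    {
        public readonly ulong Known;  // 1 wherever the bit's value is certain
        public readonly ulong Value;  // that value, meaningful for known bits only

        public Bits(ulong known, ulong value) { Known = known; Value = value & known; }

        public static Bits Constant(ulong v) => new Bits(~0UL, v);
        public static Bits Input            => new Bits(0, 0);  // nothing known

        public Bits Not()  => new Bits(Known, ~Value);
        public Bits Shl1() => new Bits((Known << 1) | 1, Value << 1);             // bit 0 becomes a known 0
        public Bits Shr4() => new Bits((Known >> 4) | (0xFUL << 60), Value >> 4); // top 4 bits known 0

        public Bits Or(Bits o) => new Bits(
            // known if both sides are known, or if either side is a known 1
            (Known & o.Known) | (Known & Value) | (o.Known & o.Value),
            Value | o.Value);

        public bool IsConstant   => Known == ~0UL;         // fold to a constant
        public bool KnownNonZero => (Known & Value) != 0;  // some bit is surely 1
    }

IsConstant is the constant-folding case just described, and KnownNonZero is exactly the test that disqualifies an expression from being used as an "if0" condition.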

The tracking of "known" bits works for all operators, including fold, although tracking bits through "fold" doesn't seem to give much benefit -- especially since all training problems apparently have 'fold' as the topmost operator, and eventually I modified the solver to prune out any solution that builds on top of a fold.

The solver doesn't treat tfold any differently than fold. (Initially I thought 'tfold' referred to programs that contain fold at the very top of the program. Only on day 3 did I realize that tfold programs are those containing a fold that does not reference the input value in its lambda expression.)

Once the corpus of solutions is built up, the solver sends out only one /eval request, containing the numbers FFFF...FFE0 to 0000...001F, all the powers of 2, and 133 randomly generated numbers -- 256 values in all, the most a single /eval allows. On receiving a response, it searches through the corpus of solutions for the first possible solution that generates the same values as received from the server. This usually takes only a few seconds.
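
Assembling that request is straightforward. The sketch below assumes -- as the arithmetic suggests (64 boundary values, 59 new powers of two, 133 random) -- that the point is to fill the 256-argument limit exactly:

    using System;
    using System.Collections.Generic;

    var args = new HashSet<ulong>();
    for (ulong i = 0; i < 32; ++i)
    {
        args.Add(i);                          // 0000...0000 through 0000...001F
        args.Add(0xFFFFFFFFFFFFFFE0 + i);     // FFFF...FFE0 through FFFF...FFFF
    }
    for (int i = 0; i < 64; ++i)
        args.Add(1UL << i);                   // powers of 2 (five are already present)

    var rng = new Random();
    var buf = new byte[8];
    while (args.Count < 256)                  // 123 so far + 133 random = 256
    {
        rng.NextBytes(buf);
        args.Add(BitConverter.ToUInt64(buf, 0));
    }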

If the server accepts the solution, the solver moves on to the next problem. Otherwise, the solver adds the provided counterexample to its list of search criteria and continues searching for the next possible solution from the place it left off.
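
In outline the loop is just a filter over the corpus that picks up extra constraints as it goes; Evaluate and Guess here are hypothetical stand-ins for the evaluator and the /guess round-trip:

    using System.Linq;

    var constraints = new List<(ulong In, ulong Out)>(evalResponse);  // the /eval pairs
    foreach (var candidate in corpus)         // a single pass; it never restarts
    {
        if (!constraints.All(c => Evaluate(candidate, c.In) == c.Out))
            continue;                         // contradicts something we know

        var verdict = Guess(candidate);       // POST /guess
        if (verdict.Win)
            break;                            // solved; on to the next problem

        // Mismatch: remember the counterexample and keep scanning.
        constraints.Add((verdict.Input, verdict.Expected));
    }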

Solutions to bonus problems are generated a little differently. Here, the solver searches for a solution that satisfies at least half of the known outputs. Logically, there must always be one, as one branch of the "if0" must be taken at least half of the time. (However, there may also exist false positives -- programs that return the right answer half the time, but nevertheless cannot be a branch of the if0.)

Once the above solution is found, the solver then looks for a second solution which is correct whenever the first solution is incorrect. (Again, there may be several of these.) Finally, the solver looks for a third solution which evaluates to zero only on inputs where the first solution is correct, and to non-zero only where the second solution is correct.

Once the solver has found a set of three solutions meeting the above criteria, it combines them under an "if0" operator and sends the result to the server.
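
Here is that three-stage search in sketch form, reusing the hypothetical Evaluate plus a hypothetical Print that renders an Expr back into the contest syntax (in reality each First has to resume further down the corpus whenever a false positive is later rejected):

    using System.Linq;

    bool Agrees(Expr e, (ulong In, ulong Out) s) => Evaluate(e, s.In) == s.Out;

    // First solution: right on at least half of the known input/output pairs.
    var a = corpus.First(e => samples.Count(s => Agrees(e, s)) * 2 >= samples.Count);

    // Second solution: right everywhere the first one is wrong.
    var b = corpus.First(e => samples.Where(s => !Agrees(a, s)).All(s => Agrees(e, s)));

    // Third solution: zero only where the first is correct, non-zero only
    // where the second is correct.
    var c = corpus.First(e => samples.All(s =>
        Evaluate(e, s.In) == 0 ? Agrees(a, s) : Agrees(b, s)));

    var guess = $"(lambda (x) (if0 {Print(c)} {Print(a)} {Print(b)}))";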

The Code

The source code for my entry is contained in this ZIP file. (SHA256 hash e0c4ab646393b51f204d0408126356e0da81b36ef467dea63498a11cb5b82b1e)

It contains:

main.coffee: last year, I settled on CoffeeScript as my language of choice. That seemed to go pretty well, so I initially set out to repeat that experience this year.

Originally, main.coffee was the main program, but eventually I split it into two parts. One was the solver, which went to solver.coffee. The remainder was a tool that, among other things, generated the problem sets that harness.coffee would chew on.

solver.coffee: this contained the main solver (as described above), as well as the necessary code to talk to the server.

harness.coffee: this spawned a configurable number of solver.coffee instances to run in parallel and chew on a problem set generated by main.coffee. In the first two days of the competition, I proceeded very carefully, attacking only problems I was guaranteed to generate solutions for. If any solver instance failed to find a solution, harness.coffee would stop launching new solvers so that the failure could be investigated.
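
The real thing is CoffeeScript on node, but the control flow is roughly this, sketched in C# to match the other snippets (problemIds and parallelism are placeholders):

    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Threading;

    var pending = new Queue<string>(problemIds);   // the set picked by main.coffee
    var running = new List<Process>();
    var failed  = false;

    while (running.Count > 0 || (!failed && pending.Count > 0))
    {
        while (!failed && running.Count < parallelism && pending.Count > 0)
            running.Add(Process.Start("coffee", "solver.coffee " + pending.Dequeue()));

        running.RemoveAll(p =>
        {
            if (!p.HasExited) return false;
            if (p.ExitCode != 0) failed = true;    // stop launching; go investigate
            return true;
        });

        Thread.Sleep(250);
    }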

Program.cs: late in the first day of the competition, it was becoming increasingly clear that the performance of the "evaluator" half of my solver (which ran possible programs with a given input) was not going to be acceptable. CoffeeScript, like the JavaScript that it compiles to, does not have a 64-bit integer type, so I was having to make do with two doubles. At ~150k evaluations per second, evaluating 25 million possible programs 256 times each (once for each input) was going to take on the order of ten hours per problem. As a result, I decided to rewrite the evaluator in C#, and called it "BetterEvaluator.exe".
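
The payoff is that with a native 64-bit unsigned type, the evaluator core shrinks to almost nothing. A sketch over the hypothetical Expr type from earlier, once more leaving out if0 and fold:

    static ulong Evaluate(Expr e, ulong x) => e.Op switch
    {
        "0"     => 0,
        "1"     => 1,
        "x"     => x,
        "not"   => ~Evaluate(e.A!, x),
        "shl1"  => Evaluate(e.A!, x) << 1,
        "shr1"  => Evaluate(e.A!, x) >> 1,   // >> on ulong is a logical shift
        "shr4"  => Evaluate(e.A!, x) >> 4,
        "shr16" => Evaluate(e.A!, x) >> 16,
        "and"   => Evaluate(e.A!, x) & Evaluate(e.B!, x),
        "or"    => Evaluate(e.A!, x) | Evaluate(e.B!, x),
        "xor"   => Evaluate(e.A!, x) ^ Evaluate(e.B!, x),
        "plus"  => Evaluate(e.A!, x) + Evaluate(e.B!, x),  // wraps mod 2^64 for free
        _       => throw new System.ArgumentException(e.Op)
    };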

The solution generator still had reasonable performance, but even then I could only enumerate up to a million solutions before node.js would run out of memory. Eventually, I decided to bite the bullet and rewrite the solution generator in C# as well, even though it was considerably more complex than the evaluator.

There is some inefficiency in that there are both "Node" and "Expression" objects, even though an Expression is just a Node annotated with "known bits" information.

In the end, solver.coffee ended up a vestigial husk of its former self -- a sort of "front end" that communicated with the server, parsed JSON, and forwarded requests to and from BetterEvaluator.