Saturday, September 12, 2009

ICFP Programming Contest 2009

Surprisingly, I won the first place in the ICFP programming contest 2009. I'm very appreciated for this result - it was much better than my expectation. I was in 9th place before the validation process.

This is the post to describe my approach of my solution. Here is my source code:

http://shinh.skr.jp/dat_dir/icfpc.zip

* FAQs

Before the detailed descriptions, I put some questions asked by other contestants.

Q. Did you go to the moon?
A. No, I couldn't. I considered if I could go to the moon a bit, but I decided not to go the moon. As it takes a lot of fuel and the score of far debris aren't big, I prioritized improvement of stability and faster arrivals at near debris.

Q. Did you solve traveling salesman problem?
A. No. I chose the next target almost greedy. Please see the following details as well.

Q. Did you solve difficult mathematic formulas?
A. No. Though I tried to find an analytic solutions, the effort was just waste of time. Basically, I only used Hohmann transfer. Even if we had a analytic solutions, we would need some adjustment for errors due to discrete space anyway.

Q. 1 person team?
A. Yes. I always attend this contest because I'd like to do everything by myself.

Q. How long did you sleep?
A. Not sure, but maybe ~20 hours. I cannot write valid code with few sleep. It is the lesson learned from previous contests.

Q. Why C++?
A. I love C++. By the way, I think I'm not offensive to functional programming languages. Actually, I tried some of them a bit (maybe I wrote from ~1k to ~3k lines of toy code for each of them). I thought OCaml is good but it's standard library is poor. Perhaps I should have tried ExtLib or something like this. Haskell was slow when I tried. I read some of assemblies GHC produced, but it was very difficult for me to understand it... I tried Scheme interpreter and it was interpreter after all. Maybe I'll want to try compilers (Stalin?). Commercial common LISP processors sound great, but I only have opensource implementations. So, I'm thinking C++ is the most practical languages to me for now. I know I'm biased - I spent much more time for C++ than other languages. I'm looking forward continuing functional programming funs' great work.

Q. Why not other imperative languages?
A. I feel Java is less free than C++. As for C#, my experience with C# is too small to discuss it. I'm (or was?) a fun of D and actually I used D for this contest three times (2004, 2006, and 2007), but I don't use it these days. Maybe I'll want to use it again for the next year's contest.

* Detail

This post is based on the script of my talk for vidiowiki: http://vidiowiki.com/watch/m844dyn/ (Hmm... it's embarrassing to watch I'm talking with my poor English. But it is good experience to me.)

Overall, I think my approach wasn't so special. Just like many teams might do, I wrote a VM, a visualizer, a physic simulator, and hoahman transfer.

My VM was fast because it translates the input binary into C code. Otherwise it's normal.

The visualizer is also a normal stuff. Nothing to say about it. You can see a YouTube video at (sorry for its poor quality) http://www.youtube.com/watch?v=IUGiiFsnLLs

I also had a physic simulator. It calculates the states of future quickly. Due to two reasons, it was a bit tricky to implement the simulator correctly. The first reason is that the binary organizers provided was wrong. The gravity from the moon didn't affect to our spaceship. Also, the gravity from the moon is calculated by g(t+1) = G*M/(p(t+1)-pm(t))^2 where p is the position of a body and pm is the position of the moon. I think pm(t) should have been pm(t+1). Anyway, with some reverse engineering works, I could implement the simulator correctly. It helped me a lot by its fast simulation.

With these three tools, I implemented my strategy, which was basically, kind of bruteforce.

First my simulator calculates the future positions of a target for 5000 seconds. The track should be an elliptic arc around the earth. And then, it checks if I have a chance to reach a point which is close enough to the arc with ideal Hoahman transfer.

As there are the gravity from the Moon and the program is running in discrete space, there should be some errors. Things are not ideal. Therefore, my program starts adjustment as the next step when it finds it can go close to the track of the target. It runs simulations for various initial velocities again and again. This process takes a lot of time. My fast simulator helped this process.

The last question is how to decide the next target from 10 targets. My approach was a greedy algorithm with a heuristics. If it finds a promissing plan for a target, it starts the journey except for two cases. One case is that it withdraws the plan if the plan takes too much fuel. Another case is that the target is too far. The latter exception is a performance optimization. It is considered before it starts the first step to reduce unnecessary calculations for too far targets (e.g., the debri around the moon isn't good candidate as the first target). I know this is not optimal, but it seemed it was acceptable enough.

So, as I described, I think my approach isn't the smartest way. but I guess the good point of my program was its robustness. I've heard that there are some teams whose program unfortunately crashed with some typical test cases and some teams' solution only work for the given test cases.

The performance might be also the key of my success. It seemed that there are some teams whose solutions are similar to mine. However, most of them don't have fast simulator and they just use VM to simulate the physics. Also, I think C++ helps me to write solid and fast code.


Thanks the organizers for the excellent contest!

Sunday, August 2, 2009

Symbolic Polyglot Quine

Recently, I wrote the following code:

http://shinh.skr.jp/obf/sym_poly_quine.txt

This script is a Quine, so it outputs the program itself into stdout without file input. You can run this script by Ruby, Perl, and JavaScript. This script is a polyglot of these three languages. And, this script was written only with symbolic characters (!"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~). There should be no alphabets, numbers, whitespaces, and 8bit characters.

Perl is OK only with symbolic characters. You can convert any Perl scripts into symbolic style scripts using Acme::EyeDrops. This module is awesome.

http://search.cpan.org/dist/Acme-EyeDrops/

Also, all JavaScript scripts can be converted into symbolic style. jjencode does the magic.

http://utf-8.jp/public/jjencode.html

As for Ruby, Ruby 1.9 is perfect for symbolic programming. kurimura found the way to convert arbitrary Ruby scripts into symbolic format.

http://d.hatena.ne.jp/kurimura/20080824

Unfortunately, I think Ruby 1.8's symbolic programming is very limited. For example, you cannot create a loop. As quine only require substitutions and outputs for stdout, which is doable with symbolic Ruby 1.8, my script can run with Ruby 1.8.

There are some other weird code like this:

http://shinh.skr.jp/obf/

Friday, June 5, 2009

TCC 0.9.25

I worked on porting TCC into x86-64 and the project has recently released a new version which contains my changes.

http://bellard.org/tcc/

I confirmed that my changes pass the tests in TCC and successfully compile some applications such as link and Lua.

As there should be some bugs in the x86-64 support, I'll be really appreciated if you try this release with x86-64 and report bugs.

Regarding implementation... It was more difficult than I expected. Code generation for x86-64 was relatively straightforward because there was support for x86, which is similar to x86-64. The most difficult part was relocation related stuff. As distance of two addresses in 64bit address space can be larger than 32bit, some 32bit relative references need to be fixed by PLT and GOT. TCC's -run mode support was also tricky by the same reason.

Other things by which I was often confused were 64bit system itself. As much code of TCC depends on 32bit system, sometimes it casts pointers into int, which is sometimes incorrect on 64bit system. The following simple code would show the difficulty of 64bit system:


int main() {
printf("%p\n", malloc(10));
printf("%p\n", malloc(1000000));
}

This code produces outputs like


0xfa1010
0x7f3658acd010

on my linux box. When we allocate small memory chunk from heap, it returns addresses which is smaller than INT_MAX. So, even if we cast the returned pointer into 32bit integer, it has no problem as long as we allocate small memory chunks. This hided many bugs from me. This kind of bugs only appears with big programs. It means that it's difficult to create minimal failure cases...

Tuesday, April 7, 2009

TLE

I got the 1st prize in the first TLE programming contest.

http://felicity.iiit.ac.in/tle/

This contest had unique problems. For example, we should have written a C program with no semi-colons in the test run of this contest. The answer was:

main(){if(printf("Hello, world!")){}}

In this contest, the most problems require short coding to get better score. I like short coding and I used a lot of time for this competition (I even took a day off). That's why my performance was surprisingly good in this contest.

Unfortunately, there were some issues in the contest. Since competitors wrote much shorter code than organizers had expected, the score of each problem is imbalanced. For example, the maximum score of problem "CLASS" was 200. On the other hand, some competitors awarded over 2000 points in problem "SHORTEN". Anyway I think this issue didn't break the fun of this contest, and this kind of issues were unavoidable as this was the first attempt for the organizers to run this kind of contest.

Though there were some issues in the contest, it was great fun! I hope I will attend this contest again in the next year. I must thank organizers for the excellent contest.

About Me

http://shinh.skr.jp/ (my website in Japanese)