Wettbewerb

Each exercise sheet includes a competition problem carefully selected by the Master of Competition Senior, Dr. MC "Hammer" Blanchette. It counts like the other homework exercises, but is additionally graded as part of a semester-long competition. The grading criteria vary each time. The best 30 solutions receive 30, 29, ..., 1 points, respectively.

Every week, the Top 30 of the week is published and the Top 30 of the semester is updated. The MC Sr.'s favorite solutions are posted online. At the end of the semester, the five best students are honored with tasteful trophies. So there are no bonus points here, but plenty of fun and honor (and trophies)!

Important: If you do not want your name to appear on the Internet, you can either opt out of the competition (simply by omitting the {-WETT-}...{-TTEW-} tags) or write to the MC.


Top 30 of the Semester
  Place  Competitor  Points
1. David Widmann 312 1/3
2. Martin Wauligmann 297 2/3
3. Christoph Rudolf 259 1/3
4. Chaoran Chen 256
5. Fabio Madge Pimentel 231 2/3
6. Lukas Tenbrink 228
7. Daniel Schubert 201
8. Maximilian Endraß 175
9. Joshua Rombauer 173
10. Sebastian Neubauer 161
11. Julian Spahl 155
12. Julius Imbery 150
13. Jonas Heintzenberg 148
14. Lukas Heinzmann 139
15. Alexander Küchler 129
16. Nathanael Schilling 123
17. Christoph Griesbeck 103
18. Rudolf Hattenkofer 95
19. Dario Beckert 94
Philipp Jutzi 94
21. Michael Hesse 93
22. Michael Kubitza 82
23. Lukas Prantl 79
24. Daniel Kowatsch 77
25. Simon Rehwald 70
26. Anna-Kathrin Kopetzki 69
Sebastian Weiß 69
28. Adrian Philipp 68
Alexander Book 68
30. Felix Thimm 67
Michael Eder 67

The Results of the First Week

Top 30 of the Week
  Place  Competitor  Tokens  Points
1. Dennis Fischer 22 30
2. Julius Imbery 23 29
3. Maximilian Reif 24 28
Björn-Aljoscha Kullmann 24 28
Hans Kirchner 24 28
Maximilian Muth 24 28
Robert Hamsch 24 28
David Widmann 24 28
Simon Zettler 24 28
Jessica Blankenburg 24 28
Christian Feiler 24 28
Rudolf Hattenkofer 24 28
Sebastian Bruhn 24 28
Dario Beckert 24 28
Adrian Jervolino 24 28
Daniel Kowatsch 24 28
Sebastian Wieland 24 28
Patrick Langer 24 28
Kirill Haar 24 28
Daniel Doehring 24 28
Thomas Rößl 24 28
Joshua Rombauer 24 28
Alexander Küchler 24 28
Michael Hesse 24 28
Daniel Schubert 24 28
Simon Rehwald 24 28
Arsatius Regler 24 28
Martin Wauligmann 24 28
Thomas Günzel 24 28
Wolfgang Hobmaier 24 28
Julian Litti 24 28
Roland Würsching 24 28
Niklas Gschwendtner 24 28
Jonas Heintzenberg 24 28
Jakob Huber 24 28
Florian Hölzl 24 28
Nico Metz 24 28
Stefan Wolf 24 28
Matthäus Haselmaier 24 28
Theodor Cheslerean Boghiu 24 28
Maximilian Endraß 24 28
Christian Winter 24 28
Stefan Kreuzer 24 28
Daniel Schneider 24 28
Jan Herlyn 24 28
Julian Schießer 24 28
Markus Ansorge 24 28
Christoph Smutny 24 28
Markus Obrull 24 28
Anna-Kathrin Kopetzki 24 28
Adrian Schultz 24 28
David Frank 24 28
Florian Wiedner 24 28
Philipp Jutzi 24 28
Stefan Cimander 24 28
Alexander Ungar 24 28
Christoph Rudolf 24 28
Dennis Maier 24 28
Kilian Käslin 24 28
Frederic Naumann 24 28
Emanuele Lamanna 24 28
Nathanael Schilling 24 28
Noah Bergbauer 24 28
Lukas Heinzmann 24 28
Michael Eder 24 28
Michael Kubitza 24 28
Lukas Tenbrink 24 28

The Master of Competition Senior was glad to see that approximately 260 students decided to participate in the competition, most of whom actually take the Info2 course this year. The number is approximate because it is not clear what to do with entries where the closing {-TTEW-} tag was missing or misspelled ({-TTEW -} is not OK) and other odd cases like that. Nor was it necessary to decide what to do in those cases, since the Top 30 solutions didn't suffer from such Schönheitsfehler. Here's a good hint nonetheless:

When given instructions, follow them.

The chances are good that the MC Sr. uses advanced computer-based technology (Perl and Haskell) to deal with the large amounts of data. If your entry doesn't meet the expected format, you can expect it to fall through the cracks.

Let's start with the longest correct solution:

        allDistinct a b c d e = not ((elem a [b,c,d,e]) || (elem b [a,c,d,e]) ||
                (elem c [a,b,d,e]) || (elem d [a,b,c,e]) || (elem e [b,c,d,a]))

This entry has a few gratuitous weaknesses: First, why the needless parentheses? When competing for reducing the number of tokens, parentheses should be the first suspects you go to interrogate: What are you doing here, Left Parenthesis? And you, Right Parenthesis? Just hanging around? You certainly don't want your parentheses to go all lazy and just hang around; you want them to do some work or get the heck out of there. Second weakness: There's no point in checking both that a is not equal to e and that e is not equal to a. Equality and disequality are symmetric operations. But the solution is actually correct, which is more than can be said of many other solutions.

Here's a similar idea, at 54 tokens:

        allDistinct a b c d e  =
                if a `elem` [b, c, d, e]
                        then False
                        else if b `elem` [c, d, e]
                                then False
                                else if c `elem` [d, e]
                                        then False
                                        else if d == e
                                                then False
                                                else True

No gratuitous parentheses and exploitation of the inherent symmetry: good. Needless if-then-else expressions: not so good. Here's a general guideline:

If you find yourself writing True or False in one of the branches of an if, this means you're doing something too complicated.

The operators not, &&, ||, and == are your friends. Keep that in mind. Professor Nipkow's Forschungsgebiet is Logik, so if you want to impress him, you have to demonstrate that you can manipulate the logical operators. Imagine if math textbooks were rewritten so that “if m > 0, then n > 0” would become “if m > 0, then if n > 0 then true else false else true”. The MC Sr. is ready to bet that that textbook wouldn't be very popular with students! They wouldn't exactly rush to the bookstore to buy the second edition, would they? And yet—and this really messes with the MC Sr.'s brain—it is the students and not the instructors who insist on producing such material, year after year, Aufgabenblatt after Aufgabenblatt. God moves in mysterious ways (if at all).
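Applied to the 54-token solution above, the guideline performs a purely mechanical rewrite: every `if c then False else rest` becomes `not c && rest`. A sketch:

```haskell
-- The nested if-then-else chain from above, rewritten with not and &&.
-- Each "if c then False else rest" turns into "not c && rest".
allDistinct :: Integer -> Integer -> Integer -> Integer -> Integer -> Bool
allDistinct a b c d e =
  not (a `elem` [b, c, d, e]) &&
  not (b `elem` [c, d, e]) &&
  not (c `elem` [d, e]) &&
  d /= e
```

Same logic, same symmetry trick, no Boolean literals in sight.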

All right, let's move on. The next solution, at 66 tokens, is a quite common one:

        allDistinct a b c d e = (a/=b) && (a/=c) && (a/=d) && (a/=e) && (b/=c) &&
            (b/=d) && (b/=e) && (c/=d) && (c/=e) && (d/=e)

Again, are those parentheses just hanging around? Go away, you silly things!

Here's a kind-of-cool solution at 56 tokens, which uses lists:

        allDistinct a b c d e = and (zipWith (/=) [a, a, a, a, b, b, b, c, c, d] [b, c, d, e, c, d, e, d, e, e])

The canonical solution, at 46 tokens, was also the MC Sr.'s first iteration:

        allDistinct a b c d e =
          a /= b && a /= c && a /= d && a /= e &&
                    b /= c && b /= d && b /= e &&
                              c /= d && c /= e &&
                                        d /= e

Notice how the nonstandard indentation helps visualize the problem. The MC Sr. had another 46-token solution in store, of which he is rather proud:

        distinctLoop :: Integer -> Integer -> Integer -> Integer -> Integer -> Bool
        distinctLoop a b c d e = a /= b && b /= c && c /= d && d /= e && e /= a

        allDistinct a b c d e = distinctLoop a b c d e && distinctLoop a c e b d

The program is inspired by the Haus vom Nikolaus and the Seven Bridges of Königsberg (now Калининград): the ten disequalities to check are exactly the edges of the complete graph K5, which decompose into the two five-cycles traversed by the two calls to distinctLoop.

Let's jump to 24 tokens, which corresponds to the MC Sr.'s best solution as well as that found by many competitors populating the “top 30”:

        allDistinct a b c d e = 5 == length (nub [a, b, c, d, e])

An odd 23-token variant, which exploits some funky operators:

        allDistinct a b c d = ap (==) nub . (:a:b:c:return d)

The winner, Dennis Fischer, has a really crazy 22-token version, using printf!

        allDistinct a b c d e = ap (==) nub . words $ printf "%d %d %d %d %d" a b c d e

A couple of alumni (Maximilian Kirchmeier and Daniel Stüwe) found a 21-token solution:

        allDistinct a b c d e = ap geq nub [a, b, c, d, e]

Comment from Herr Stüwe:

I was a bit too lazy to comment on my solution, because I thought Moritz Sichert would adopt my version, since I simply optimized the solution that he developed together with Andreas Bergmaier, as I still knew the function geq from last year. (See the comment on Moritz Sichert's code.) I would therefore ask you to mention explicitly in the blog that I collaborated with these two; otherwise I'll have a guilty conscience.

Based on Herr Fischer's solution, Master Lars Noschinski derived a 12-token solution, given here with all the imports:

        import Control.Monad (ap)
        import Data.Composition
        import Data.List (nub)
        import Text.Printf (printf)

        allDistinct :: Integer -> Integer -> Integer -> Integer -> Integer -> Bool
        allDistinct = ap (==) nub ∘ words .::. printf "%d %d %d %d %d"

Comment from Herr Noschinski:

On functions, "ap f g x = f x (g x)" holds. Developed further from the solution of a student with the pseudonym "chaosfisch".

Dennis Fischer, a.k.a. Chaosfisch. Here's one name we're likely to see again.
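For readers scratching their heads over `ap`: on functions, `ap f g x = f x (g x)`, so `ap (==) nub` compares a value with its duplicate-free version. A minimal sketch of the trick, applied directly to lists and using only base:

```haskell
import Control.Monad (ap)
import Data.List (nub)

-- ap (==) nub xs unfolds to xs == nub xs:
-- True exactly when xs contains no duplicates.
allDistinctList :: [Integer] -> Bool
allDistinctList = ap (==) nub
```

For example, `allDistinctList [1, 2, 3]` is True, while `allDistinctList [1, 2, 2]` is False.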

One could go all the way down to 10 tokens by replacing (==) by geq.

A couple of failed attempts at glory are worth mentioning here. This awesome 15-token solution is the shortest (and generally awesomest) wrong solution (apart from undefined ones, which are anything but awesome):

        allDistinct a b c d e = length [a..e] == 5

Double you tee eff?!?!?!
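For the record, the bug: `[a..e]` enumerates from a to e and never looks at b, c, or d, so the length test passes for plenty of non-distinct inputs (and fails for distinct ones with a > e). A quick demonstration:

```haskell
-- The wrong 15-token solution, reproduced verbatim for testing.
allDistinct :: Integer -> Integer -> Integer -> Integer -> Integer -> Bool
allDistinct a b c d e = length [a..e] == 5

-- allDistinct 1 3 3 3 5 yields True despite the repeated 3s, and
-- allDistinct 5 1 2 3 4 yields False despite all values being distinct.
```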

It's also possible to use 22 tokens to get to a wrong result:

        allDistinct a b c d e = a < b && b < c && c < d && d < e

So what have we learned this week?

And of course:

Closing remark: You might wonder why the MC Sr. calls himself the MC Sr. and not just The MC. That's because he's actually not the only MC. The next three weeks, other MC-like characters will make their appearance, while the MC Sr. will be on vacation. Then the MC will be back, only to vanish again to greener pastures at the end of the year.

The Results of the Second Week

Top 30 of the Week
  Place  Competitor  Tokens  Points
1. Julius Imbery 42 30
Daniel Schubert 42 30
David Widmann 42 30
4. Maximilian Bandle 43 27
Florian Gratzer 43 27
Maximilian Reif 43 27
Christoph Rudolf 43 27
Julian Spahl 43 27
Lukas Tenbrink 43 27
10. Nathanael Schilling 44 21
11. Theodor Cheslerean Boghiu 46 20
Martin Wauligmann 46 20
13. Sebastian Neubauer 49 18
Anna Sesselmann 49 18
15. Maximilian Endraß 50 16
Christopher Zentgraf 50 16
17. Dennis Fischer 51 14
18. Yinshui Chang 57 13
Lukas Heinzmann 57 13
Daniel Kowatsch 57 13
Lukas Prantl 57 13
Lukas Reitschuster 57 13
23. Joshua Rombauer 58 8
24. Heiko Lengenfelder 60 7
25. Philipp Jutzi 61 6
26. Moritz Issig 63 5
Thomas Rößl 63 5
28. Florian Andres 65 3
Markus Ansorge 65 3
Simon Rieß 65 3

Another week, another exercise, another MC. As was mentioned last week, the old Master of Competition's time with us is almost up, but as you may know, the Master never dies – he simply regenerates into another body (in a process that involves lots of dramatic music and cheap 1970s special effects). And indeed, this week, the MC Sr has regenerated into a new body: younger, somewhat less fashionable, and using British English. However, apart from that, nothing will change in the Wettbewerb.

All right, allons-y! The MC Jr counted a total of 352 student submissions this week. Of these, 157 seem to be real attempts (i.e. not “undefined”). Additionally, the MC Jr received six unofficial submissions from tutors and alumni who apparently have too much time on their hands. As far as the number of tokens is concerned, this week's solutions seem much more diverse than last week's, ranging from minimalistic, concise 42-token solutions up to a colossal (and wrong) 496-token one. (Mind you, the smallest input for which it fails is apparently a list of length 264, but that does not make it any less incorrect.)

Speaking of incorrect solutions: with considerable disappointment, the MC Jr found that despite being warned not to do so by the Meta-Master Tobias Nipkow in the last lecture, two would-be Top 30 competitors attempted to use the logBase function, e.g. like this:

        decodeIntPairs ns = [(s n,s $div n 2) | n<-ns,n>=0]
            where s i = sum [2^k*mod (div i $4^k) 2 
                                | k<-0:[1..(truncate $logBase 2 $fromIntegral i/2)]]

The MC Jr would like to reiterate that using floating point arithmetic in a discrete problem such as this is almost always a very bad idea. For instance, the above fails to decode the encoding of (10^155, 10^155) due to numeric overflow. Since the counterexample is rather large, the automatic test system did not detect the errors; the MC Jr, however, did. Needless to say, the students who submitted these entries received 0 points and a stern look.
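The root cause, for the record: a `Double` has only 53 bits of mantissa, so sufficiently large distinct `Integer`s collapse to the same floating-point value under `fromIntegral`. A small demonstration (with 10^20 instead of 10^155, which is already more than enough):

```haskell
-- Two different Integers that become equal once squeezed into a Double:
-- the gap between adjacent Doubles near 10^20 is far larger than 1.
big, bigger :: Integer
big    = 10 ^ 20
bigger = 10 ^ 20 + 1

collide :: Bool
collide = (fromIntegral big :: Double) == fromIntegral bigger  -- True
```

Any logBase-based digit count inherits this imprecision, which is exactly how the solution above goes wrong.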

Another student submitted the following solution, which also passed all tests.

        decodeIntPairs l = [(sum [shiftR n i .&. bit i | i <- enumFromTo 0 32],
                             sum [shiftR n (i+1) .&. bit i | i <- enumFromTo 0 32]) | n <- l, n >= 0]

Note that it contains the ominous constant 32. Yes, that is 32 as in 32 bits. The MC Jr tested this code and found, to his complete and utter lack of surprise, that decoding the encoded pair (2^33,2^33) returned (0,0).

Now, on to the good bits: The MC Jr was pleased that despite considerable scrutiny on his part, all the other non-undefined entries with a competitive number of tokens seemed to be correct.

The following 65-token solution is very close to the Musterlösung, and around two thirds of the Top 30 solutions are merely variations of it:

        decodeIntPairs a = map decodeIntPair [b | b <- a, b >= 0]
        decodeIntPair a 
              | a <= 0 = (0,0)
              | otherwise = (x*2 +  mod a 2, y*2 + mod (div a 2) 2)
            where (x,y) = decodeIntPair(div a 4)

This solution is simple, understandable, and only uses material covered in the lecture.

In contrast, for the next example, someone dove deep into the depths of Haskell's libraries and tried their luck with printf and the Data.Digits module, which provides functions for converting numbers into lists of digits and back again. It led them to the following 46-token solution and place 11:

        decodeIntPairs = map ((\n -> (head n, last n)) . map (unDigits 2 . digits 10 . read . reverse) . 
                                 transpose . chunksOf 2 . reverse . printf "0%b") . filter (>=0)

This solution is so cutting-edge that the MC Jr had to download and install the very latest version of GHC to compile it. Nevertheless, it is possible to do better with less.

Another student took a very different approach:

        {- 
          Im wesentlicha demmer die decodeLeftInt Funktion pro Int zweimal 
          anwenda, einmal auf das originale Int, und einmal auf ein geshiftes 
          Int. Die decodeLeftInt Funktion isoliert jedes 2i-te bit (des hat 
          den wert 4^i wenn gesetzt), teilt dann durch 2^i und summiert am 
          die summe nemma dädat, aber log2(n) <= n, und tokens send teuer.
        -}
        decodeIntPairs xs= [(decodeLeftInt x, decodeLeftInt $ div x 2) | x <- xs,x>=0]
        decodeLeftInt n = sum [n .&. 4^i `div` 2^i | i<-enumFromTo 0 n]

This solution, while reasonably easy to understand, is very inefficient – whilst the problem allows for an obvious linear-time solution, this implementation requires exponential time. The MC Jr acknowledges the clever idea that underlies this solution and the author's thoughtfulness in writing his comments in Schwäbisch for the benefit of his coworkers at the Stuttgarter Firma. And while the MC agrees that, as the author quite rightly states in his comments, tokens are expensive – so is time. Desch koscht ais a Geld! Remember: the Aufgabenblatt clearly states:

Solutions that are equivalent in token count are compared with respect to their efficiency.

However, seeing as this is the only solution with this particular number of tokens, it lands in tenth place despite its inefficiency. In the latter half of the Wettbewerb, when efficiency will be the key criterion by which solutions are measured, things will be quite different.

This week's winners are David Widmann, Daniel Schubert, and Julius Imbery with reasonably efficient, very down-to-earth 42-token entries that choose simple arithmetic and clever programming over fancy libraries and dirty tricks:

David Widmann:

        decodeIntPair 0 = (0,0)
        decodeIntPair n = (mod n 2 + snd m * 2, fst m)
            where m = decodeIntPair $ div n 2
        decodeIntPairs = map decodeIntPair . filter (>=0) 

Daniel Schubert:

        decodeIntPairs = map decode . filter (>=0)
        decode e = (ti e, ti $ div e 2)
        ti 0 = 0
        ti e = ti (div e 4) * 2 + e .&. 1

Julius Imbery:

        decodeIntPairs z = [(help a ,help$div a 2)|a <-z,a>=0 ]
        help 0 = 0
        help z = 2*help (div z 4)+mod z 2 

David Widmann's solution is the MC Jr's personal favourite due to its well-chosen function names and general conciseness. It is also formatted quite nicely, and the MC Jr ponders whether to deduct points for poor formatting in the future.

Julius Imbery's code also deserves special mention seeing as it only uses material covered in the lecture so far (apart from the “$” operator, which is just a way to save parentheses). This shows that even without perusing Haskell's extensive libraries for hours, one can write a nice, competition-winning entry.

However, this is not to say that fancy libraries and dirty tricks are not useful for the Wettbewerb; tutor Andreas Bergmaier submitted the following 37-token solution using Arrows, which will not be covered in this lecture (and probably in no other lecture you will ever attend):

        decodeIntPairs = map (decodeInt &&& decodeInt . flip shiftR 1) . filter (>= 0)
        decodeInt 0 = 0
        decodeInt n = 1 .&. n .|. decodeInt (n `shiftR` 2) `shiftL` 1

In the same spirit, alumnus and long-term Wettbewerb addict Lorenz Panny privately communicated to the MC Jr the following 36-token solution which uses Arrows, Applicatives and the module GHC.Integer from the very depths of GHC's innards for andInteger:

        decodeInt n = sum $ andInteger <*> div n <$> (2^) <$> enumFromTo 0 n
        decodeIntPairs = map (decodeInt &&& decodeInt . flip div 2) . filter (>=0)

If that was not complicated enough for you yet, the MC Jr was also not immune to the nerd-sniping qualities of the MC Sr's problem and concocted the following 27 tokens of complete gibberish:

        decodeIntPairs = filter (>= 0) >>> map (bimap `join` unDigits 2 <<< unzip <<< 
                             map (read "[(0,0), (1,0), (0,1), (1,1)]" `genericIndex`) <<< digits 4)

The idea that lies hidden beneath all this is simple enough: convert the input number into base 4, convert each digit (from 0 to 3) to a pair of binary digits using a lookup table, concatenate the results to get two binary numbers, and convert these back to integers. The MC Jr expected some of the students to use lookup tables as well and wonders why, apparently, none of them did.
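For the curious, the same lookup-table idea can be written in plain Haskell without the combinator zoo. A sketch with invented helper names (`toBase4`, `fromBits`), not the MC Jr's actual code:

```haskell
-- Decoding via a base-4 lookup table: each base-4 digit of the input
-- carries one bit of each component of the pair.
decodeIntPairs :: [Integer] -> [(Integer, Integer)]
decodeIntPairs = map decodeIntPair . filter (>= 0)

decodeIntPair :: Integer -> (Integer, Integer)
decodeIntPair n = (fromBits (map fst ps), fromBits (map snd ps))
  where
    ps       = [table !! fromIntegral d | d <- toBase4 n]
    table    = [(0, 0), (1, 0), (0, 1), (1, 1)]
    -- reassemble an Integer from its bits, least significant first
    fromBits = foldr (\b acc -> b + 2 * acc) 0

-- digits of n in base 4, least significant first
toBase4 :: Integer -> [Integer]
toBase4 0 = []
toBase4 n = n `mod` 4 : toBase4 (n `div` 4)
```

For instance, the encoding 6 has base-4 digits [2, 1], which the table maps to [(0,1), (1,0)], giving the pair (2, 1) – consistent with the Musterlösung-style decoders above.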

In conclusion, the lessons for this week are:

The Results of the Third Week

Top 30 of the Week
  Place  Competitor  Tokens  Points
1. Lukas Tenbrink 8 25
Martin Wauligmann 8 25
Alexander Küchler 8 25
Christoph Rudolf 8 25
Daniel Schubert 8 25
Lucas Stoffl 8 25
Maximilian Endraß 8 25
Simon Zettler 8 25
9. Joshua Rombauer 15 22
10. Anna-Kathrin Kopetzki 17 21
Thomas Rößl 17 21
12. Lukas Heinzmann 18 19
13. Maximilian Muth 20 18
David Widmann 20 18
Philipp Jutzi 20 18
16. Manuel Burghard 23 15
17. Julius Imbery 24 14
Nathanael Schilling 24 14
19. Daniel Kowatsch 25 12
20. Kordian Bruck 27 11
21. Arsatius Regler 28 10
Thuy Tran 28 10
23. Dario Beckert 29 8
24. Theodor Cheslerean Boghiu 30 7
25. Jonas Heintzenberg 32 6
Chaoran Chen 32 6
27. Simon Rehwald 36 4
28. Hans Kirchner 37 3
29. Christian Winter 43 2
Christoph Smutny 43 2
Rudolf Hattenkofer 43 2
Jakob Huber 43 2

Before we start, a word of caution: a number of submissions this week looked remarkably similar, if not quasi-identical. The MC Jr would like to remind everyone again that Gruppenarbeit is not allowed and will result in the subtraction of one zillion points. The solutions in question were short enough that it may very well have been coincidence, but be advised that the MC always maintains a watchful eye on plagiarism. Do not test his lenience.

This week, the MC asked you to write the coalesce function, which helps him compute the overall sum of points for the Wettbewerb. This was no joke – the MC Jr indeed used this function to compute the very ranking you are reading right now. This problem was a lot easier than last week's; nevertheless, the MC Jr only received 85 submissions (plus 6 unofficial ones). The median solution was 53 tokens long and was basically a somewhat less elegant version of the Musterlösung, which has 50 tokens.

The longest solution this week (at 186 tokens) is considerably shorter than last week's 496-token monster. It also seems to be correct, but that is quite hard to verify:

      coalesce :: [(String, Integer)] -> [(String, Integer)]
      coalesce [] = []
      coalesce xs = hi xs []

      hi :: [(String, Integer)] -> [(String, Integer)] -> [(String, Integer)]
      hi [] re = re
      hi (x:xs) re
          | y && null re = if fst x == fst (head xs) then hi (tail xs) 
                               [((fst x), ((snd x)+(snd ( head xs))))] else hi xs [x]
          | y && fst x == fst r =  hi xs ((init re)++[((fst r), ((snd r)+(snd x)))])
          | length re /= 0 && null xs && fst x == fst r= (init re)++[((fst r), ((snd r)+(snd x)))]
          | null xs = re ++ [x]
          | otherwise = hi xs (re ++ [x])
              where y = length xs /= 0
                    r = last re

Now, on to the Top 30 solutions. Alas, there was a severe lack of diversity this week. All in all, there were roughly three different equivalence classes of solutions:

  1. Class 1: For a Fistful of Tokens

    The solutions in the lowest part of the Top 30 were all essentially pruned versions of the Musterlösung, e.g. like this:

          coalesce (t1:t2:ts) = if fst t1 == fst t2 then 
              coalesce $ (fst t1, snd t1 + snd t2) : ts else t1 : coalesce (t2:ts)
          coalesce ts = ts
    

    This solution is very straightforward, quite readable, and (almost) only uses material from the lecture.

  2. Class 2: For a Few Tokens Less

    Considerably shorter versions are possible with some knowledge of the Data.List and Data.Function modules. The groupBy function takes care of the grouping of adjacent elements, thus solving a huge chunk of the problem, e.g. in this 30-token solution:

          coalesce = map (\tuple -> (fst . head $ tuple, sum $ map snd tuple)) . groupBy ((==) `on` fst)
    

    The largest part of this solution is concerned with transforming the result of groupBy into the actual result. This can be done using the unzip function (which was mentioned in the lecture) to transform the list and Arrows to transform the tuples. Applying the geq trick from last week can save another two tokens, bringing us down to 17 tokens, e.g. this much shorter and nicer solution by Anna-Kathrin Kopetzki:

          coalesce = map (unzip >>> head *** sum) . groupBy (on geq fst)
    

    One student had a variant of Ms Kopetzki's solution in his comments, packaged in a very peculiar fashion, for which the MC Jr would like to award him the Günther Oettinger Award for Creative Use of the English Language:

          {- I heard, you like gibberish code, so here I got a present for you ;) -}
    
          coalesce =
                    -- **.      .** --
                   --     \    /     --
                -- -\\----||--||----//------- -}     {- ------------------------------------------ -}
             --      \\   ||  ||   //      /  -}     {-              LABEL on the Box:             -}
           {- ############################    -}     {- ------------------------------------------ -}
           {- -}                        {-    -}     {-            Mind your brain when            -}
           {- -}          map           {-    -}     {-      Wettbewerbs-syndrom strikes back      -}
           {- -}(head *** sum <<< unzip){-    -}     {- ------------------------------------------ -}
           {- -}       . groupBy        {-    -}     {- To risks and side effects read the hackage -}
           {- -}      (on geq fst)      {-    -}     {-  library and ask your tutor or current MC  -}
           {- -}                        {-    -}     {-         (They know from their own)         -}
           {- ############################ -}        {- ------------------------------------------ -}
    

    While the MC is very touched by the thoughtful gift, one should not take the contestant's comments too seriously. First of all, everyone knows that Haskell does not have any side effects and secondly, the MC would not call this code gibberish at all. Apart from the terrible geq trick, this is a very nice and concise solution, and it is, in fact, quite close to the MC Jr's personal solution using Bifunctors (which has 14 tokens):

          coalesce xs = bimap head sum . unzip <$> on geq fst `groupBy` xs
    

    A related solution that a handful of people found uses functions from Data.Map and the List monad to merge the elements. The shortest one (adapted from that by tutor Andreas Bergmaier) has 13 tokens:

          coalesce = toList . fromListWith plusInteger <=< groupBy (geq  `on` fst)
    
  3. Class 3: The Good, the Bad and the Unspecified

    The MC Jr did not expect 8-token solutions this week, so the fact that eight people handed in one (yes, all eight of them were virtually the same) was quite a surprise to him. He is somewhat ambivalent about these solutions: on the one hand, the MC appreciates unexpected approaches, dirty hacks, and breaking all the rules (at least in the context of the Wettbewerb); on the other hand, the solution is quite a bit too dirty even for the MC Jr's taste:

          coalesce = toAscList . fromAscListWith (+)
    

    This solution undoubtedly works. However, as the name fromAscListWith implies, it expects the list it is given to be ascending. The behaviour of the function is unspecified when given an unsorted list, meaning that this solution relies on unspecified implementation details. The current implementation of fromAscListWith builds a binary search tree that violates the invariants for binary search trees when given an unsorted list. Due to the implementation of toAscList, this still works – but other functions from Data.Map return incorrect results.

    Another issue the MC Jr has with this solution is that after someone claimed on Piazza to have an 8-token solution and gave a very vague description of how he obtained it, eight people (in total) found the same solution, yet all of them failed to see that it could be shortened by another two tokens in a very obvious fashion by using plusInteger instead of (+). The MC Jr suspects that the competitors influenced each other a little too much through this communication and would therefore like to urge everyone (including himself) to wait for the submission deadline in the future before communicating their solutions to anyone – directly or indirectly. That said, the MC Jr is not without blame himself: he could have precluded these solutions by requiring coalesce to work on tuples of type Eq a => (a, Integer) instead of (String, Integer).

    For these reasons, the MC Jr has decided not to award the full 30 points to these 8-token solutions, but only 25 each. This still leaves a well-deserved 3-point gap between them and the 15-token solution in second place, but also reinforces the fact that the MC Jr is not amused by code relying on unspecified behaviour.
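To make the hazard concrete, here is a sketch of what actually happens when the precondition of fromAscListWith is violated (behaviour observed with current containers implementations, not anything the documentation guarantees):

```haskell
import qualified Data.Map as M

-- An *unsorted* association list, as coalesce might receive it:
unsorted :: [(String, Integer)]
unsorted = [("b", 1), ("a", 2), ("b", 3)]

-- toAscList happens to reproduce the insertion order (merging only
-- adjacent equal keys), so the 8-token trick coincidentally computes
-- coalesce correctly here...
coalesced :: [(String, Integer)]
coalesced = M.toAscList (M.fromAscListWith (+) unsorted)

-- ...but the resulting Map violates the search-tree invariant: the key
-- "b" occurs twice, and M.lookup "b" can only ever find one of the two.
```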

What have we learned this week?

The Results of the Fourth Week

Top 30 of the Week
  Place  Competitor  Tokens  Points
1. Nathanael Schilling 21 30
2. Maximilian Endraß 22 29
3. Philipp Jutzi 27 28
David Widmann 27 28
5. Lukas Tenbrink 27 26
6. Lukas Heinzmann 31 25
7. Manuel Burghard 32 24
Michael Eder 32 24
9. Julius Imbery 38 22
10. Arsatius Regler 39 21
11. Anna-Kathrin Kopetzki 48 20
12. Martin Wauligmann 54 19
13. Christian Winter 55 18
14. Julian Labeit (KIT) 55 17
15. Michael Hesse 56 17
16. Rudolf Hattenkofer 57 16
17. Chaoran Chen 66 15
18. Jonas Heintzenberg 79 14
19. Thomas Günzel 79 13
20. Daniel Schubert 91 12
21. Andreas Schaarschmidt (KIT) 92 11
22. Christoph Rudolf 94 11
23. Florian Weber (KIT) 98 10
24. Max Vogler (KIT) 99 10
25. Fabio Madge Pimentel 103 10
26. Joshua Rombauer 118 9
27. Maximilian Reif 121 8
28. Thomas Rahm 124 7
29. Alexander Book 133 6
30. Julian Spahl 136 5
31. Karsten Diekhoff (KIT) 138 4
32. Dario Beckert 143 4
33. Yinshui Chang 151 3
34. Markus Ongyerth 159 2
35. Kirill Haar 162 1

This week, the MC Jr is pleased to be able to welcome the students of the Programming Paradigms lecture by Prof Snelting from Karlsruhe to the Wettbewerb. They can submit solutions until the end of our lecture and they will be ranked among the TUM students in each week's Top 30, identified by a ‘(KIT)’ next to their names. They will, however, not appear in the Semester Top 30, seeing as they did not participate in the first three weeks. Also, for consistency, the Top 30 will be extended in such a way that they always contain at least 30 TUM students and the points for the Semester ranking will be computed as if there were no KIT students.

The MC Jr received 84 student solutions from TUM and 8 from KIT. There were also 6 submissions from tutors and alumni from TUM and 4 from alumni and employees at KIT. A staggering 47 of the TUM solutions and 3 of the KIT solutions did not pass the MC Jr's inspection. Many solutions looped indefinitely for negative numbers in the input or crashed with exceptions and non-exhaustive patterns. A lot of solutions also timed out for input with long lists or large numbers, suggesting non-linear run time. This ubiquity of incorrect and slow (often both) solutions made the MC Jr's life quite hard and he had to employ a wide variety of shell scripts to automate testing enough to get his Top 30.

The variety of solutions was significantly greater than last week. Apart from the already-mentioned ‘Crash Randomly’ and ‘Return Nonsense’, the MC Jr identified roughly six different approaches:

  1. The Pragmatist

    Seeing as we already learned how to encode two integers into one, the obvious approach for this week is to apply this method recursively to encode all list elements and the length of the list itself. Some participants took this approach (plus handling of negative numbers), e.g. like this:

        encodeInts []  = 0
        encodeInts xs  = encodePair (genericLength xs) $ foldr1 encodePair $ map encodeSign xs
    
        encodePair 0 0 = 0
        encodePair x y = mod x 2 + 2 * encodePair y (div x 2)
    
        encodeSign x   = abs x * 2 + div (signum x + 1) 2
    
        decodeInts 0   = []
        decodeInts x   = decodeLoop (decodeLeft x) $ decodeLeft $ div x 2
    
        decodeLoop 1 x = [decodeSign x]
        decodeLoop n x = decodeSign (decodeLeft x) : decodeLoop (n - 1) (decodeLeft $ div x 2)
    
        decodeLeft 0   = 0
        decodeLeft x   = mod x 2 + 2 * decodeLeft (div x 4)
    
        decodeSign x   = div x 2 * (mod x 2 * 2 - 1)
    

    While the MC Jr always appreciates pragmatism, this approach required a lot of tokens and thus only barely made it into the Top 30. It does, however, have the advantage of producing reasonably small encodings, at least if the list elements are roughly of the same order of magnitude.
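The pair encoding used above is simply bit interleaving: bit 0 of the result is bit 0 of x, bit 1 is bit 0 of y, and so on. A minimal, self-contained round-trip sketch of that core idea (decodePair is an illustrative helper, not part of the submission):

```haskell
-- Bit-interleaving pair encoding, as in the solution above.
encodePair :: Integer -> Integer -> Integer
encodePair 0 0 = 0
encodePair x y = mod x 2 + 2 * encodePair y (div x 2)

-- Collect the bits at even positions.
decodeLeft :: Integer -> Integer
decodeLeft 0 = 0
decodeLeft x = mod x 2 + 2 * decodeLeft (div x 4)

-- Illustrative inverse: the components sit at the even and odd bit positions.
decodePair :: Integer -> (Integer, Integer)
decodePair z = (decodeLeft z, decodeLeft (div z 2))

main :: IO ()
main = print (decodePair (encodePair 5 9))  -- (5,9)
```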

  2. All your base

    One of the most popular approaches was to print the absolute of the given numbers in some base, use additional symbols to encode the sign and the separator between the numbers, and interpret the result as a number in a different base. The obvious choice is to print the numbers in the octal system, i.e. digits from 0 to 7, then use the digits 8 and 9 as separators indicating positive and negative numbers, and interpret the result as a decimal number:

          encodeInts = read . encodeHelper
          encodeHelper (a:rest) = showOct (abs a) $ (if a > 0 then '9' else '8') : encodeHelper rest
          encodeHelper _ = "0"
    
          decodeInts = decodeBreak . show
          decodeHelper (a, (b:rest)) = (if b == '9' then 1 else -1) * (fst $ head $ readOct $ '0':a) : 
                                           decodeBreak rest
          decodeHelper _ = []
          decodeBreak = decodeHelper . break (`elem` "89")
    

    The numbers generated by this approach are reasonably small: the encoding of k numbers with n decimal digits each requires only about 1.1⋅k⋅n decimal digits.
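For reference, here is a self-contained, runnable variant of this approach (imports included; the where-bound helper names are the editor's, not the submitter's). A quick round trip, including zeros and negative numbers, confirms it:

```haskell
import Numeric (showOct, readOct)

-- Print magnitudes in octal; '9' terminates a positive number, '8' a non-positive one.
encodeInts :: [Integer] -> Integer
encodeInts = read . enc
  where enc (a:rest) = showOct (abs a) ((if a > 0 then '9' else '8') : enc rest)
        enc []       = "0"

-- Split the decimal representation at the separators and undo the octal printing.
-- A '0' is prepended so that readOct also copes with leading zeros lost by read.
decodeInts :: Integer -> [Integer]
decodeInts = dec . show
  where dec s = case break (`elem` "89") s of
          (a, b:rest) -> (if b == '9' then 1 else -1) * fst (head (readOct ('0' : a))) : dec rest
          _           -> []

main :: IO ()
main = print (decodeInts (encodeInts [0, -7, 123, 8]))  -- [0,-7,123,8]
```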

  3. The Big Spender

    Of course, one does not have to use n-ary encoding. Some students forwent all considerations of efficiency and simply encoded a number x as x occurrences of some fixed character (i.e. unary encoding), followed by some other character as a separator. This, of course, leads to output sizes that grow exponentially in the number of digits of the input. The most peculiar solution of this kind was the following:

          encodeInts :: [Integer] -> Integer
          encodeInts xs = foldl (\h x -> (1 + 2*h) * 2^(if x >= 0  then 2*x else 1 - 2*x)) 0 xs
    
          log2 1 = 0
          log2 i = 1 + log2  (i `div` 2)
    
          decodeInts :: Integer -> [Integer]
          decodeInts 0 = []
          decodeInts i = decodeInts (i `div` 2 `div` 2^a) ++ [if odd a then -a `div` 2 else a `div` 2] where
            a = log2 $ i `xor` (i-1)
    

    In this case, the list is encoded as a binary string where 1 acts as a separator and the number of 0s in between encodes a number from the input list. The log2 $ i `xor` (i-1) part is a very strange and complicated way of obtaining the position of the rightmost 1 bit and suggests that perhaps the author of this solution, like the MC Jr, is an avid reader of Hacker's Delight. The knowledge of low-level binary magic like this is declining rapidly these days, and the MC Jr is very glad to find that it is apparently still alive in some of his students. For the record, a = popCount (i `xor` (i-1)) - 1  or  a = pred . popCount $ ap xor pred i would also have done the trick, and saved quite a few tokens, too.
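To see why the trick works: for i > 0, i `xor` (i-1) sets exactly the rightmost 1 bit of i together with all bits below it, so its population count minus one is the number of trailing zeros. A small sketch comparing three equivalent formulations (the function names are illustrative):

```haskell
import Data.Bits (xor, popCount, (.&.))

-- Number of trailing zero bits of a positive Integer, three equivalent ways.
tzNaive, tzXor, tzAnd :: Integer -> Int
tzNaive i = length (takeWhile even (iterate (`div` 2) i))  -- repeated halving
tzXor   i = popCount (i `xor` (i - 1)) - 1                 -- the trick used above
tzAnd   i = popCount ((i .&. negate i) - 1)                -- isolate the lowest bit instead

main :: IO ()
main = print (all (\i -> tzXor i == tzNaive i && tzAnd i == tzNaive i) [1 .. 2000])
```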

  4. The Number Theorist

    Any student who paid attention in Diskrete Strukturen will know that:

    1. There are countably infinitely many prime numbers
    2. Decomposition into prime numbers is unique

    This, of course, means that it is possible to encode arbitrarily many natural numbers into a single natural number as 2^a0 ⋅ 3^a1 ⋅ 5^a2 ⋅ … . This plus some way to encode negative numbers makes for a nice algebraic solution, especially considering that functions for generating prime numbers and factoring integers are available in the library:

          encodeInts = product . zipWith (^) primes .
                           map (\x -> if x > 0 then 2 * x else 1 - 2 * x)
          decodeInts = map (\x -> if even x then div x 2 else pred x `div` negate 2) .
                           map genericLength . group . primeFactors
    

    Unfortunately, the numbers generated by this approach are naturally quite large, since the worst case of a singleton list will produce exponentially large encodings, similarly to the unary approach.
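Here, primes and primeFactors come from a library; for a self-contained test, naive trial-division stand-ins suffice. In this sketch the sign is folded into the exponent via x > 0 ↦ 2x and x ≤ 0 ↦ 1-2x, so every exponent is at least 1 and no prime position is skipped:

```haskell
import Data.List (genericLength, group)

-- Naive trial-division stand-ins for the library functions.
primes :: [Integer]
primes = filter isPrime [2 ..]
  where isPrime n = all (\d -> n `mod` d /= 0) [2 .. n - 1]

primeFactors :: Integer -> [Integer]
primeFactors = go primes
  where go _ 1 = []
        go (p:ps) m | m `mod` p == 0 = p : go (p:ps) (m `div` p)
                    | otherwise      = go ps m

-- Sign folding: x > 0 becomes the even exponent 2x, x <= 0 the odd exponent 1-2x.
encodeInts :: [Integer] -> Integer
encodeInts = product . zipWith (^) primes . map (\x -> if x > 0 then 2 * x else 1 - 2 * x)

-- Recover the exponents as run lengths of the sorted prime factorisation.
decodeInts :: Integer -> [Integer]
decodeInts = map (unSign . genericLength) . group . primeFactors
  where unSign a = if even a then a `div` 2 else (a - 1) `div` (-2)

main :: IO ()
main = print (decodeInts (encodeInts [3, 0, -2]))  -- [3,0,-2]
```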

  5. Show, Don't Tell

    The laziest students (and also the most successful ones this week) let Haskell do most of the work for them by using show or something similar to convert the list to a string and then convert the resulting string into an integer, e.g. like this nice solution by tutor Johannes Bader from KIT:

          encodeInts = read . concatMap show . map ord . show
          decodeInts = read . map chr . map read . chunksOf 2 . show
    

    A more refined version of this is the following 22-token solution by Maximilian Endraß, which interprets the ASCII string representation of the input list as a number in base 94 (which is possible, because the ASCII codes of [],-0123456789 are all below 94):

          encodeInts = fst . head . readInt 94 isPrint ord . show
          decodeInts x = read $ showIntAtBase 94 chr x ""
    
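For completeness, Mr Endraß's solution in runnable form (the imports are the editor's addition): readInt interprets each character of the shown list as a base-94 digit via ord, and showIntAtBase with chr undoes it.

```haskell
import Numeric (readInt, showIntAtBase)
import Data.Char (chr, isPrint, ord)

-- Interpret the ASCII string representation of the list as a base-94 numeral.
encodeInts :: [Integer] -> Integer
encodeInts = fst . head . readInt 94 isPrint ord . show

-- Print the number back in base 94 and read the resulting string as a list.
decodeInts :: Integer -> [Integer]
decodeInts x = read $ showIntAtBase 94 chr x ""

main :: IO ()
main = print (decodeInts (encodeInts [-12, 0, 345]))  -- [-12,0,345]
```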

    This week's winner, Nathanael Schilling, used the hex package to achieve a 21-token solution. He prints the list as a string, converts the string to hexadecimal, and reads the hexadecimal string back into an integer:

          encodeInts = fst . head . readHex . hex . show
          decodeInts = read . join . unhex . printf "%x"
    

    Incidentally, this is quite close to the MC Jr's personal 18-token solution:

          encodeInts = read . mappend "0x" . hex . show
          decodeInts = unhex . printf "%x" >=> read
    
  6. The Reckless Hacker

    Some men just want to watch the world burn – and alumnus Philipp Meyer is certainly one of them. He reached deep into the abyss of unsafe and unportable functions and produced the following 18-token solution, which is basically Mr Endraß's solution on drugs:

          encodeInts = unDigits 94 . map unsafeCoerce . show
          decodeInts = read . map unsafeCoerce . digits 94
    

    It is technically correct (after a fashion), but morally questionable, seeing as it uses the unsafeCoerce function to coerce Char to Integer and vice versa. This is a dirty trick that relies on a lot of compiler internals, and in fact, it fails horribly with the MC Jr's GHC 7.6.3, but with the latest version (GHC 7.8.3), it seems to work fine. Luckily, Philipp Meyer participates außer Konkurrenz (outside the competition), so the MC does not have to worry about whether to give him points for this (impressive, but horrific) hack.

    However, Mr Meyer's solution seems positively sane when compared to one that came from an actual student:

          encodeInts = unsafeCoerce . encode
          decodeInts = decode . unsafeCoerce
    

    Like a number of students, he uses encode and decode from the binary package to encode the input list as a ByteString. Unlike other students, who then converted this ByteString into an Integer in the usual fashion, this student decided that type safety is not for him and used the tried-and-tested C++ approach (cf. reinterpret_cast): ByteString, Integer – where's the difference? Everything is a string of bytes in the end, right? Well, yes. But not like this. While the QuickCheck test decodeInts (encodeInts xs) runs through happily without complaining, all hell breaks loose if you try to actually look at the ‘integer’ the encodeInts function returns.

    To his credit, he was perfectly aware of the issue and was prudent enough to hand in an alternative 51-token solution for the eventuality that the MC does not find the unsafeCoerce solution to his liking. On the other hand, he also not-so-prudently put his {-WETT-} tags around unrelated, commented-out code by mistake, so the MC will not award any points.

    If you thought that this is the end of the line, you are wrong: it appears that type safety is not taken very seriously in Karlsruhe either, as Denis Lohner, an employee at KIT, handed in the following ‘solution’:

          encodeInts = unsafeCoerce
          decodeInts = unsafeCoerce
    

    As one would expect, this solution works about as well as the previous one. Again, the QuickCheck test runs through fine, but when the MC tried to inspect the actual value returned by encodeInts, he spent the following three hours fending off raptor attacks.

This week's lessons:

Die Ergebnisse der fünften Woche

The MC Sr. is back from his lengthy vacation. He does not want to reveal too much about his private life (that would be soo un-German and he's trying hard to fit in), except for the fact that he now wears a ring on his left ring finger and there is now a Mistress of Competition at his side.

Now to the results of the fifth week:

Top 30 der Woche
  Platz Wettbewerber(in)   Schlechtheit   Laufzeit   Punkte
1. Nathanael Schilling 4 6.92 30
2. Lukas Heinzmann 4 7.21 29
3. Michael Hesse 4 7.49 28
4. Martin Wauligmann 4 8.37 27
5. Julius Imbery 4 8.49 26
6. Julian Labeit (KIT) 4 T/O 2 25
Fabio Madge Pimentel 4 T/O 2 25
8. Florian Weber (KIT) 4 T/O 1 24
9. Manuel Burghard 6 5.17 24
Daniel Kowatsch 6 5.17 24
11. Julian Spahl 6 5.18 22
Rudolf Hattenkofer 6 5.18 22
13. Björn-Aljoscha Kullmann 6 5.20 20
David Widmann 6 5.20 20
15. Lukas Tenbrink 6 5.21 18
16. Christopher Zentgraf 6 5.21 17
Moritz Issig 6 5.21 17
18. Jonas Heintzenberg 6 5.23 15
19. Chaoran Chen 6 5.25 14
20. Christian Winter 6 5.26 13
21. Adrian Philipp 6 5.28 12
Sebastian Bruhn 6 5.28 12
23. Theodor Cheslerean Boghiu 6 5.31 10
24. Daniel Schubert 6 5.32 9
25. Heiko Lengenfelder 6 5.42 8
26. Christoph Griesbeck 6 5.68 7
27. Dmitriy Zayko 6 5.70 6
28. Philipp Jutzi 6 5.77 0
29. Martin Sperr 6 5.79 4
30. Noah Bergbauer 6 5.82 3
31. Sebastian Neubauer 6 5.88 2
32. Karsten Diekhoff (KIT) 6 5.89 1
33. Michael Eder 6 5.93 1

(Five points were subtracted from Herr Philipp Jutzi's score, because he fixed a performance bug 9 minutes past the deadline. Unfortunately, that leaves him at 0 points. The slower solution he had submitted on time would have landed him in 34th place, which is also worth 0 points. Tough luck!)

And the Top 4 of tutors and alumni:

Top 30 der Woche
  Platz Wettbewerber(in)   Schlechtheit   Laufzeit   Punkte
1. MC Jr. 4 5.09 0
2. Martin Hecker (KIT) 4 6.22 0
3. Martin Mohr (KIT) 4 T/O 2 0
4. Daniel Stüwe 6 5.31 0

The results were obtained by considering two criteria: badness and run time. A collection of 11 problems was used to determine the badness of the solutions:

        map (\j -> (j, j)) [1..10]
        map (\j -> (j, j)) [1..20]
        map (\j -> (j, j)) [1..30]
        map (\j -> (j, j)) [81..100]
        (map (\j -> (j, j)) [1..15] ++ map (\j -> (j, j)) [1..15])
        (map (\j -> (j, j)) [1..10] ++ map (\j -> (j, j)) [21..29])
        map (\j -> (j, j)) (take 10 primes)
        map (\j -> (j, j)) (take 20 (drop 20 primes))
        map (\j -> (j, j)) (take 9 primes ++ [99, 4, 100, 101] ++ take 10 primes)
        concatMap (\j -> Data.List.genericReplicate j (j, j)) (reverse (take 5 primes))
        map (\j -> (j, j)) ([1] ++ replicate 15 16 ++ replicate 15 1)

where primes returns the infinite stream of prime numbers, computed using a naive algorithm:

        primes = nextPrimes 2
          where nextPrimes n =
                  (if null [() | j <- [2..n - 1], n `mod` j == 0] then [n] else []) ++ nextPrimes (n + 1)

(That the stream is infinite may seem to imply that the above programs won't terminate. However, thanks to Haskell's lazy evaluation strategy, computing finite segments of the primes function, by dropping and taking elements from the front, can be done in finite time.)
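For instance, with the naive primes definition above, the following program terminates quickly even though primes itself is infinite; only the first ten primes are ever computed:

```haskell
-- Infinite stream of primes via naive trial division (as defined above).
primes :: [Integer]
primes = nextPrimes 2
  where nextPrimes n =
          (if null [() | j <- [2 .. n - 1], n `mod` j == 0] then [n] else []) ++ nextPrimes (n + 1)

main :: IO ()
main = print (take 5 (drop 5 primes))  -- [13,17,19,23,29]
```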

Notice that the longest of these inputs contains 33 elements, which happens to be the number of entries in our Top 30 (= 30 Munchies + 3 Kitties). Submissions that did not terminate within a reasonable time on any of these inputs were simply filtered out. It might be argued that efficiency is only the second criterion, but if the program is unbearably slow, it is impossible for the MC to evaluate the badness! At the limit, one can imagine a program that would take 15 billion years to terminate. Even if this program is correct in a theoretical sense, the MC pragmatically would put it onto the incorrect pile. This is his privilege, especially in the presence of a vast collection of solutions that do terminate in reasonable time.
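For intuition about the badness criterion: on small inputs, the optimal badness can be found by brute force. A sketch (splits and optBadness are illustrative names, not from any submission):

```haskell
-- All ways to split a list into two complementary sublists.
splits :: [a] -> [([a], [a])]
splits []     = [([], [])]
splits (x:xs) = concat [[(x : ys, zs), (ys, x : zs)] | (ys, zs) <- splits xs]

-- Minimal achievable badness: the smallest absolute difference of token sums.
optBadness :: [(a, Integer)] -> Integer
optBadness xs = minimum [abs (sum (map snd ys) - sum (map snd zs)) | (ys, zs) <- splits xs]

main :: IO ()
main = print (optBadness (map (\j -> (j, j)) [1 .. 10]))  -- the total 55 is odd, so the optimum is 1
```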

In general, the context given in the problem description is not just there to embellish it. It gives hints about the input values to expect. From your experience with the competition, you should know that solutions typically have between 10 and 500 tokens and that there are perhaps 100 or 200 students participating. Negative numbers of tokens don't make sense, so you can rely on the MC not being so nasty as to check this case.

To measure the efficiency, the following 11 inputs were tried:

        map (\j -> (j, j)) [1..50]
        map (\j -> (j, j)) [50,49..1]
        map (\j -> (j, j)) [1..200]
        map (\j -> (j, j)) [200,199..1]
        map (\j -> (j, j)) (replicate 200 1000)
        map (\j -> (j, j)) ([1] ++ replicate 25 26 ++ replicate 25 1)
        map (\j -> (j, j)) ([1] ++ replicate 45 46 ++ replicate 45 1)
        map (\j -> (j, j)) (take 500 primes)
        map (\j -> (j, 10 * j)) (take 1000 primes)
        map (\j -> (j, 100 * j)) (take 1000 primes)
        map (\j -> (j, j)) (let ss = take 1000 primes in ss ++ ss ++ ss ++ ss ++ ss ++ ss ++ ss ++ ss ++ ss)

The cumulative time of all these examples (including the computation of the primes function) counted as secondary criterion. The memory aspect and the beauty of the code were entirely omitted from the evaluation; this was the MC Jr.'s idea, but the MC Sr. didn't want to try to measure memory, nor was he inclined to judge the code's elegance. Anyway, it's generally a good idea to use little memory and to write elegant code, irrespective of whether this helps you climb the Top 30 or not.

One submission timed out in the fifth test. A few timed out in the tenth test. Those are marked with T/O 1 and T/O 2, respectively, in the table above.

There were 130 solutions overall, including unofficial ones by tutors, alumni, and the MC Jr. Of the 130 entries, 9 were from the KIT (Karlsruhe Institute of Technology), whereas 121 were from the MIT (Munich Institute of Technology). A few correctness and basic performance tests ruled out a dozen solutions. The MC Sr. was stunned to see that some otherwise correct solutions were so suboptimal as to assign both elements of

        [(1, 1), (1000, 1000)]

to the same MC! These were disqualified right away – the MC Sr. has no time to waste on entries that do not satisfy basic sanity conditions. As they say in Germany, Was der Bauer nicht kennt, frisst er nicht (‘what the farmer doesn't know, he won't eat’).

It's difficult to partition the solutions as neatly as the MC Jr. did in the last couple of weeks, without spending hours studying each solution. Instead, we will focus on the top ones. This is the #1 solution by Herr Schilling:

        {-
         - This is an implementation of the "complete differencing" algorithm described by
         - Mertens in http://arxiv.org/ftp/cond-mat/papers/0310/0310317.pdf and introduced by 
         - Korf in http://ijcai.org/Past%20Proceedings/IJCAI-95-VOL%201/pdf/035.pdf
         -
         - It is an anytime algorithm that gives progressively better results the longer it is allowed
         - to run. This is adjustable. In the worst case, it returns a partition found by the Karmarkar
         - and Karp differencing method. In most cases it does better though, and given enough runtime
         - it always finds the optimal solution. Given that the partition problem is NP-Hard, we stop
         - looking for better solutions after a certain (non-exponential) number of tries.
         -
         - Because the greedy approach usually gives quite good results and is very fast,
         - that is run first, and only if it doesn't return an optimal partition do we actually
         - run the complete differencing algorithm.
         -
         - This means that this implementation returns partitions with a "badness" at least as low as
         - that of both the greedy and the differencing algorithm.
         -
         - The version of GHC that I tested this on was quite good at being lazy, meaning it stopped 
         - searching for better partitions once an optimal partition had been found.
         - I haven't tested whether this behaviour works on GHC with version < 7.8.3
         -
         - It is assumed that token numbers are non-negative.
         -
         - Sorry that this solution is quite long.
         -}
        {-Helper functions and data structures-}
        --Represents two sets of integers. To avoid having to compute the sum each time, we save the sum
        --of each set. The words "larger" and "smaller" refer to the fact that we call a Partition "balanced", 
        --if sumlarger is larger than or equal to sumsmaller
        data Partition a = Partition {
            sumlarger::Integer
            ,larger::[(a,Integer)]
            ,sumsmaller::Integer
            ,smaller::[(a,Integer)]
            } deriving (Show)
        --Partitions can be compared using badness
        instance Eq (Partition a) where
          (==) a b = (badness' a == badness' b)
        instance Ord (Partition a) where
          compare = compare `on` badness'

        --Calculate the badness of a *balanced* partition
        badness':: Partition a -> Integer
        badness' x = sumlarger x - sumsmaller x

        --Balance a partition so that it has a positive badness. Basically exchange the "smaller" set with the
        --"bigger" one if the "smaller" set is actually bigger.
        balance :: Partition a -> Partition a
        balance x | badness' x < 0 = Partition (sumsmaller x) (smaller x) (sumlarger x) (larger x)
        balance x = x

        --Insert a balanced partition into a descending list of balanced partitions.
        ins :: Partition a -> [Partition a] -> [Partition a]
        ins y [] = [y]
        ins y (x:xs) | y < x =  x : ins y xs
        ins y xs = y : xs

        --Minimum that "short-circuits" if one side is 0 or 1. If a partitioning has badness of 0 or 1, then we
        --know that we cannot find a better partitioning, so we stop looking for one.
        --The Integer argument is for the count variable of the findMinPartition function.
        minBad :: (Partition a,Integer) -> (Partition a,Integer) -> (Partition a,Integer)
        minBad x y | badness' (fst x) <= 1 = x
        minBad x y | badness' (fst y) <= 1 = y
        minBad x y | badness' (fst x) < badness' (fst y) =  (fst x, snd y)
        minBad x y = y

        --The names refer to the fact that with combineDifference, the resulting badness is the difference
        --of the badnesses of the arguments, and with combineSum it is the sum of the badnesses of the arguments.
        --Both functions expect balanced partitions as arguments.
        combineDifference:: Partition a ->Partition a-> Partition a
        combineDifference x y = balance $  Partition
          (sumlarger x + sumsmaller y)
          (larger x ++ smaller y)
          (sumsmaller x + sumlarger y)
          (smaller x ++ larger y)
        combineSum::Partition a -> Partition a -> Partition a
        combineSum x y =   Partition
          (sumlarger x + sumlarger y)
          (larger x ++ larger y)
          (sumsmaller x + sumsmaller y)
          (smaller x ++ smaller y)

        {-The actual algorithm-}
        findMinPartition :: Integer -> Integer -> [Partition a] -> (Partition a,Integer)
        --Expects a list of "Partitions" sorted by badness (descending) as the third parameter.
        --Basically we run findMinPartition recursively on two shorter lists
        -- and take the minimum of that.
        --It gets slightly messy because we want to pass around a counter (called c) and stop summing once it
        --reaches a certain value of max. I.e. max gives some indication of how many steps we want to take to try and get
        --a better result than the Karmarkar-Karp differencing algorithm.
        --Setting max to 0 gives the Karmarkar-Karp differencing algorithm for almost all cases.
        --Call this with c = 0.

        --Simplest base case is if we have only one partition.
        findMinPartition c max [x] = (x,c)
        --If we have fewer than 5 elements, differencing is optimal, see
        --http://arxiv.org/ftp/cond-mat/papers/0310/0310317.pdf
        findMinPartition c max (x:y:xs) | length xs < 3 = findMinPartition c max $ ins (combineDifference x y) xs
        --If one partition's badness is at least as large as the combined badness of all the others, differencing
        --it with the sum of the other partitions gives the best result.
        findMinPartition c max (x:xs) | (badness' x) >= (sum $ map badness' xs) = (combineDifference x $ foldl combineSum (Partition 0 [] 0 []) xs,c)
        --If the first number is 0, don't bother taking two branches as both branches will be the same.
        findMinPartition c max (x:y:xs) | badness' x == 0 = findMinPartition c max $ (combineSum x y : xs)
        --The main case. Get the minimum of the differenced and summed lists.
        --We sum only if differencing didn't give an optimal result, and if our counter
        --isn't bigger than max. The counter is only incremented for summing.
        findMinPartition c max (x:y:xs) =
            minBad leftResult rightResult
            where
              leftResult = (findMinPartition c max $ ins (combineDifference x y) xs) 
              nextcounter = succ (snd leftResult)
              rightResult = if (nextcounter < max) then
                findMinPartition nextcounter max ((combineSum x y):xs)
                else
                leftResult
        --The naive greedy algorithm is a lot faster than findMinPartition, and often gives an optimal result. It is computationally
        --also very cheap, so we run it first before attempting to run findMinPartition.
        --Expects its input to be sorted in ascending order.
        greedy::[Partition a] -> Partition a
        greedy [x]= balance x
        greedy (x:xs) = combineDifference res x
          where res = greedy $ xs

        --Turns an Element (a,n) into a Partition n [(a,n)] 0 [] with which the algorithm can work.
        --If we want the algorithm to work with negative numbers too, we would have to balance the partitions,
        --but in the case of only non-negative numbers, the Partitions are balanced like this.
        createPartitionList :: [(a,Integer)] -> [Partition a]
        createPartitionList xs = map (\x -> Partition (snd x) [x] 0 []) xs

        --Replace (genericLength xs + 2000) below for some other expression to limit how
        --long the algorithm should run for. Note that the actual worst case complexity is then also
        --scaled by at least length of xs, meaning that we have at least quadratic worst-case runtime overall (I think)
        --for this choice of max.
        splitSubmissions :: [(a, Integer)] -> ([(a, Integer)], [(a, Integer)])
        splitSubmissions [] = ([],[])
        splitSubmissions xs = (larger minimalResult,smaller minimalResult)
            where   
              minimalResult = fst $ minBad (greedyResult,0) myResult
              greedyResult  = greedy $ sort partitionList
              myResult      = findMinPartition 0 (genericLength xs+ 2000) $ sortBy (flip compare) $ partitionList
              partitionList = createPartitionList xs

The #2 solution, by Herr Heinzmann, is somewhat shorter, even though it implements a different algorithm when given more than 10000 entries (which the MC didn't test, regrettably – but Info 2 is not a MOOC!):

        --Case distinction on input length
        splitSubmissions :: [(a, Integer)] -> ([(a, Integer)], [(a, Integer)])
        splitSubmissions a = if length a >= 10^4 then splitSubmissions' a else splitSubmissions'' a 

        --Simple algorithm -> sort ascending and go greedy
        splitSubmissions' :: [(a, Integer)] -> ([(a, Integer)], [(a, Integer)])
        splitSubmissions' x = let (a,b,c,d) = ins (sortBy (flip (compare `on` snd)) x) ([],0,[],0) in (a,c)

        --Assignment of the values (greedy)
        ins :: [(a,Integer)] -> ([(a,Integer)],Integer,[(a,Integer)],Integer) -> ([(a,Integer)],Integer,[(a,Integer)],Integer)
        ins [] x = x
        ins (x:xs) (a,c,b,d) = if c < d then ins xs (x:a,c+(snd x) ,b,d) else ins xs (a,c,x:b,d + (snd x))

        --More complex algorithm: compute the differences, then rebuild the list  
        splitSubmissions'' :: [(a, Integer)] -> ([(a, Integer)], [(a, Integer)])
        splitSubmissions'' (a:[]) = ([a],[])
        splitSubmissions'' a = build a (conc(reverse(split (sortBy (flip (compare `on` snd)) a))) ([],[])) ([],[])

        -- match the a's back up with the Integers
        build :: [(a,Integer)] -> ([Integer],[Integer]) -> ([(a,Integer)],[(a,Integer)]) -> ([(a,Integer)],[(a,Integer)])
        build [] _ a = a
        build ((a,b):xs) (c,d) (e,f) = if b `elem` c then build xs (delete b c,d) ((a,b):e,f) else build xs (c,delete b d) (e,(a,b):f)

        -- distribution of the values computed in split
        conc :: [(Integer,Integer,Integer)] -> ([Integer],[Integer])-> ([Integer],[Integer])
        conc [] a = a
        conc ((a,b,c):xs) ([],[]) = conc ((a,b,c):xs) ([a],[])
        conc ((a,b,c):xs) (d,e) = conc xs (if a `elem` d then (b:(delete a d),c:e) else (c:d,b:(delete a e)))

        -- compute the minimal difference and store the values for rebuilding
        split  :: [(a,Integer)] -> [(Integer,Integer,Integer)]
        split ((_,x):(_,xs):[]) = [(x - xs,x,xs)]
        split ((a,x):(_,xs):xss) = (x -xs,x,xs):(split (sortIns (a,(x-xs)) xss))

        -- helper function for inserting into a sorted list
        sortIns :: (a,Integer) -> [(a,Integer)] -> [(a,Integer)]
        sortIns a [] = [a]
        sortIns (a,b) ((x,y):xs) = if b > y then (a,b):(x,y):xs else (x,y):(sortIns (a,b) xs)

The third place, by Herr Hesse, shows that it's possible to get good results without too much pain:

        {-
          Representation of partitions/submissions in splitSubmissions:
            (badness, submissionsList1, submissionsList2) :: (Integer, [(a, Integer)], [(a, Integer)])
          badness = number of tokens by which submissionsList1 exceeds submissionsList2 (submissionsList1 is always the larger one)
          submissionsList1 / submissionsList2 = the submissions assigned to MC 1 and MC 2, respectively
        -}
        splitSubmissions :: [(a, Integer)] -> ([(a, Integer)], [(a, Integer)])
        splitSubmissions [] = ([],[])
        splitSubmissions submissions = (resultA,resultB)
          where
            (_, resultA, resultB) = splitSubmissionsHelp (sortBy sorter (map transformToInnerType submissions))

            -- converts a submission into the data format above by initially assigning each submission to MC 1
            transformToInnerType (name, submissionLength) = (submissionLength, [(name,submissionLength)], [])

            -- splitSubmissionsHelp :: [(Integer, [(a, Integer)], [(a, Integer)])] -> (Integer, [(a, Integer)], [(a, Integer)])
            -- in each step, merges the two partitions with the largest badness into a single partition
            -- assumption when splitSubmissionsHelp is called: the list of partitions is already sorted in descending order
            splitSubmissionsHelp [a] = a
            splitSubmissionsHelp (first:second:xs) = splitSubmissionsHelp (sortBy sorter (joinPartitions first second:xs))

            -- joinPartitions :: (Integer, [(a, Integer)], [(a, Integer)]) -> (Integer, [(a, Integer)], [(a, Integer)]) -> (Integer, [(a, Integer)], [(a, Integer)])
            -- merges two partitions A, B by concatenating the larger list of A with the smaller one of B and vice versa
            -- assumption when joinPartitions is called: badness of A > badness of B => badness of the result is positive
            joinPartitions (badnessA, a1, a2) (badnessB, b1, b2) = (badnessA - badnessB, a1++b2, a2++b1)

            -- compares two partitions by their badness (descending order)
            sorter (badnessA, _, _) (badnessB, _, _) = compare badnessB badnessA

Finally, the MC Jr. has an educational solution that condenses his knowledge of the topic. His solution is, according to the tests, both the least bad (badness: 4) and the fastest (5.09 s). He achieved this by using quite a few packages that require cabal install, though. It would be a pity not to show it:

        type Submission a = (a, Integer)

        tokenSum :: [Submission a] -> Integer
        tokenSum = sum . map snd

        badness :: ([Submission a],[Submission a]) -> Integer
        badness (xs,ys) = abs (tokenSum xs - tokenSum ys)

        gcdList :: Integral a => [a] -> a
        gcdList = gcdList' 0
          where gcdList' 1   _      = 1
                gcdList' acc []     = acc
                gcdList' acc (x:xs) = gcdList' (gcd acc x) xs

        -- Pseudo-polynomial algorithm for the optimal solution. Takes O(m*n) time, where m is the length of the input list
        -- and n is its token sum. Basically, a witness-providing version of 
        -- http://en.wikipedia.org/wiki/Partition_problem#Pseudo-polynomial_time_algorithm .
        -- Uses memoisation combinators instead of an explicit table.
        -- An optional threshold allows using the heuristic solution for very large instances to avoid excessively long computations.
        splitSubmissionsThreshold :: Maybe Integer -> [Submission a] -> ([Submission a], [Submission a])
        splitSubmissionsThreshold threshold xs = case threshold of
                                                   Just t  -> if (n + upperB) * genericLength xs > t then upper else optimal
                                                   Nothing -> optimal
          where -- If GCD of list elements is non-trivial, reduce list by GCD and unreduce the result.
                -- In practice, GCD will probably always be 1, but computing it is cheap when it's trivial and 
                -- the reduction saves a huge amount of time when it's not.
                g        = gcdList (map snd xs)
                xs'      = if g == 1 then xs else map (second (`div` g)) xs
                unreduce = if g == 1 then id else join bimap (map (second (*g)))
                n        = tokenSum xs'
                upper    = splitSubmissionsHeuristic xs -- heuristic algorithm gives upper bound
                upperB   = badness upper
                optimal  = unreduce $ head $ catMaybes $ [findPartitionWithBadness b | b <- [0..upperB - 1]] ++ [Just upper]

                -- Try to find a partition with badness exactly b, i.e. ‘tokenSum ys + b == tokenSum zs’
                -- Since ‘tokenSum ys + tokenSum zs ==  n’, this is equivalent to ‘2*tokenSum ys == n - b’
                -- Hence, if n is odd, there are no such ys, zs and if n is even, we only need to find a subset of xs 
                -- that sums to ‘(n - b) `div` 2’.
                findPartitionWithBadness b
                  | odd (n - b) = Nothing
                  | otherwise   = findSubsetSum ((n - b) `div` 2) (0 :: Integer) xs'
        
                -- Find a subset ys of ‘drop i xs’ that sums exactly to a (and its complement zs)
                -- by putting each x in xs either in ys or zs. Memoisation w.r.t. a and i.
                findSubsetSum :: Integer -> Integer -> [Submission a] -> Maybe ([Submission a], [Submission a])
                findSubsetSum = memoize2 findSubsetSum'
                findSubsetSum' a _ [] = if a == 0 then Just ([], []) else Nothing
                findSubsetSum' a i (x:xs)
                  | a < 0     = Nothing
                  | otherwise = fmap (first  (x:)) (findSubsetSum (a - snd x) (i + 1) xs) <|> 
                                fmap (second (x:)) (findSubsetSum a           (i + 1) xs)

        -- Karmarkar-Karp differencing heuristic: remove the largest two elements and add their difference back in.
        -- The idea behind this is that the heap contains partial splittings (ys,zs) sorted by their badness.
        -- We can merge such partial splittings (xs1,ys1) and (xs2,ys2) by combining (xs1++ys2, ys1++xs2) and get a new 
        -- partial splitting whose badness is the difference of the individual badnesses.
        -- Note the use of Data.Sequence instead of lists. Lists have terrible performance for long inputs due to the many ++.
        splitSubmissionsHeuristic :: forall a. [Submission a] -> ([Submission a], [Submission a])
        splitSubmissionsHeuristic xs = bimap toList toList (aux heap)
          where heap = H.fromList [(n, (S.singleton (x,n), S.empty)) | (x,n) <- xs] 
                           :: MaxPrioHeap Integer (Seq (Submission a), Seq (Submission a))
                aux h = case H.splitAt 2 h of
                          ([], _)                                 -> (S.empty, S.empty)
                          ([(_, (ys, zs))], _)                    -> (ys, zs)
                          ([(m, (ys1, zs1)), (n, (ys2, zs2))], h) -> aux $ H.insert ((m - n), (ys1 >< zs2, zs1 >< ys2)) h

        -- The brute force approach: generate all combinations, take minimum.
        splitSubmissionsBrute :: [Submission a] -> ([Submission a],[Submission a])
        splitSubmissionsBrute = fst . minimumBy (compare `on` abs . snd) . combs
          where combs []     = return (([] , []), 0)
                combs [x]    = return (([x], []), snd x)
                combs (x:xs) = let cs = combs xs 
                               in [((x:ys,zs), n + snd x) | ((ys,zs),n) <- cs] ++ [((ys,x:zs), n - snd x) | ((ys,zs),n) <- cs]

        -- Uses splitSubmissionsThreshold with a default threshold of 10^9. Also uses the brute-force approach for short inputs.
        splitSubmissions :: [Submission a] -> ([Submission a], [Submission a])
        splitSubmissions xs = 
          if genericLength xs < 20 then splitSubmissionsBrute xs else splitSubmissionsThreshold (Just (10^9)) xs

Additional comments (from MC Jr. to MC Sr.):

Here's my code for the submission splitting exercise. It's actually quite short and concise, despite the considerable amount of optimisation I did. I also commented it extensively.

If you call ‘splitSubmissionsThreshold Nothing xs’, you will get the optimal splitting of xs in O(length xs * tokenSum xs) time. This may be useful as a reference implementation to test the student submissions against, as it tells you (reasonably) quickly what the best achievable solution is.

My experience so far:

  – For very long lists (> 1000) and numbers that are not ridiculously huge, the differencing heuristic almost always finds a solution with badness ≤ 1, which is optimal. For shorter lists, it often fails miserably.
  – The optimal solution works well with long lists, but does not perform well with very large numbers, which is why I implemented the GCD thing and built in a threshold.
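To see the differencing heuristic at work on plain token counts, here is a simplified sketch of the idea (not the MC Jr.'s code; the submissions are stripped away and only the resulting badness is tracked):

```haskell
import Data.List (insertBy, sortBy)

-- Karmarkar-Karp on bare token counts: repeatedly replace the two
-- largest values by their difference. The final remaining value is the
-- badness of the splitting that the heuristic finds.
kkBadness :: [Integer] -> Integer
kkBadness = go . sortBy (flip compare)
  where go []         = 0
        go [x]        = x
        go (x:y:rest) = go (insertBy (flip compare) (x - y) rest)
```

For example, kkBadness [8, 7, 6, 5, 4] gives 2, even though the split [8, 7] vs. [6, 5, 4] has badness 0; exactly the kind of short list on which, as noted above, the heuristic "fails miserably".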

By the way, when testing, I would advise you not to put negative numbers in the input list. Since they represent numbers of tokens, the problem statement implies that the function will only be called on non-negative numbers. My code, for instance, will not be amused by negative numbers.

To the MC Sr., it looks as though this was a fairly common problem. Those competitors who had the reflex to turn to the literature (starting with, e.g., Wikipedia) had a sizable advantage over the others.

A week without lessons would be like a sundae with no cherry on top (or a wedding cake with no icing):

  1. The MC needs to run your program. As a competitor, you certainly don't want to write programs that overheat the MC's MacBook Pro, because if you overheat his MacBook Pro, this will make him grumpy, and you positively don't want him to be grumpy when he looks at your solution.
  2. The scientific literature is full of little gems (and big diamonds). If the problem sounds standard, chances are that the solution is so, too. As a general rule, do not reinvent the wheel.
  3. Also important: If you get inspired by the literature, remember to acknowledge this by a reference (as our top competitors have done). This will make the MC's job easier, but more importantly this will ensure that nobody accuses you of being a thief.
  4. Submit your solutions on time. The MC Sr. is sometimes slack about a few things, but when it comes to deadlines, he is rather strict.
  5. Karlsruhe, wake up! Last time the MC checked (which was some years ago, he will admit so much), KIT was an Eliteuniversität. The quality of some of the solutions we get from you guys corroborates this hypothesis, but what about the quantity? Next week, the MC Sr. expects more submissions from you guys; otherwise, he will personally show up in your Vorlesung (distances are not an obstacle for him). As they say in Swabia, Jede Woch isch Kehrwoch.
  6. The MC Sr. will sometimes go out of his way to insert some (pseudo-)German Sprüche even in contexts where they obviously don't fit.

Die Ergebnisse der sechsten Woche

Although this week's task was posed by the MC Jr., it is ranked by his Sr. counterpart.

First, there were 91 solutions that were not undefined and that compiled. One solution imported Data.Ranges.Ranges. The MC could not make it work, unfortunately, but he would like to hear from that student: perhaps he or she (one can never be sure with Vietnamese names) could contact the MC to sort this out.

The MC Jr. made the MC Sr.'s job more tedious by allowing solutions that are commented out. That means the MC (Sr., not Jr.) had to go through all solutions manually and check if there was some commented out code that should be uncommented. In the end, there were exactly 0 (zero, Z-E-R-O) solutions that met this criterion. (OK, a few were commented out, but it was because they were broken, not because they imported exotic packages.)

To be fair, Unix has a nice utility, called grep, that helped automate this tedious task. Moreover, grep helped reveal some other comments, such as this one:

        {-
        Joachim Breitner (KIT, Mitarbeiter)

        Dies ist ein einfacher Knotenfärb-Algorithmus, der stets den Knoten mit dem
        größten Grad (also, dem Zeitslot mit den meisten Überschneidungen) zuweist.
        Anscheinend heißt das auch „Welsh–Powell algorithm“.

        Es gibt kein Backtracking, das heißt der Algorithmus kann fehlschlagen, auch
        wenn es eine Lösung gibt. Dafür ist er hoffentlich einigermaßen schnell, dank
        Vektor als Datenstruktur. Die Adjazenzmatrix `adj` wird dabei nicht verändert; 
        wenn Knoten zugewiesen werden wird ihr Grad auf -1 gesetzt.

        Über-den-Daumen-Gepeilte Komplexität: O(max(t·n, n·n)), wobei t die Zahl der
        Tutoren und n die Zahl der Slots ist.
        -}

What? Did the MC really read that “der Algorithmus kann fehlschlagen”? Sonst geht's gut, oder? ;-)

Programs that sometimes fail to do their job, but that do their job correctly when they actually do it, are said to be partially correct. Partial correctness is nice, because it means you can trust the results (e.g., the schedule that comes out). But what is even nicer is total correctness, whereby the program of interest always does its job (given enough resources, e.g., time and memory). In the example above, one could say that returning (False, []) instead of (True, ...) is an instance of partial correctness, just like entering a loop or throwing an exception would be. But strictly speaking, it isn't even that: the program did its job, but it did it wrong.

For the Wettbewerb, what we care about is of course total correctness! Otherwise, solutions like findMapping = undefined would be winning every week, and the whole contest would be a joke. (Which it isn't, right?)

So here's a suggestion for Herr Breitner: When you have an efficient but incomplete algorithm, you can always have it fall back on a naive, brute-force (and complete) algorithm when it fails. Now, the MC Sr. is not going to judge Herr Breitner too harshly, since he himself found no time to attack the problem; in his defense, he was traveling (again) last week.
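The suggested pattern is easy to express in Haskell. A minimal, hypothetical sketch (fastSolve and bruteSolve are toy stand-ins for an incomplete heuristic and a complete brute-force search):

```haskell
import Data.Maybe (fromMaybe)

-- An incomplete heuristic: it may give up and return Nothing.
-- (Toy stand-in: it only handles even inputs.)
fastSolve :: Int -> Maybe Int
fastSolve n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

-- A slow but complete fallback that always produces an answer.
bruteSolve :: Int -> Int
bruteSolve n = n `div` 2

-- The total solver: try the heuristic first; if it fails, fall back.
solve :: Int -> Int
solve n = fromMaybe (bruteSolve n) (fastSolve n)
```

On most inputs the fast path answers; on the rest, correctness is rescued at the price of speed.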

Another comment:

        {- Diese Lösung ist darauf ausgerichtet, möglichst auch in Worst Case Anfragen - insbesondere 
           bei denen ohne Lösung - in aushaltbarer Zeit zu beantworten. Die durchschnittliche Laufzeit
           bei einfachen Anfragen mag zwar nicht top sein, aber die paar Augenblicke, die man mehr warten müsste,
           werden, so hoffe ich, als weniger wichtig erachtet, als dass das Programm mit einer möglichst geringen
           Wahrscheinlichkeit gar nicht in absehbarer Zeit (Stunden, Tage, Jahre ...) terminiert.
        -}

Good principle! The next one looks like a good idea, but the MC Sr. doesn't quite get it:

        {-
         - Basically just try all combinations until something works.
         - What makes it slightly more efficient is that some combinations are discarded.
         - We initially sort the slots by beginning time, and then make use of the following property:
         - Let A be a slot that starts first of all the slots. Let B be the (or a) earliest slot that can be matched with A.
         - Then it is relatively easy to show that if A cannot be matched with B for a big enough matching, then if there is 
         - a big enough matching, there is a big enough matching in which B is matched with a slot that is after B.
         - So once we know that A with B does not give a big enough matching, we mark B in a way
         - that the algorithm will not try matching it with an earlier slot later on.
         -}

What is a “big enough matching”? A definition and an example would have helped here.

Another noteworthy comment:

        {-
        Die Tests funktionieren hier nicht: "[..] excessive memory usage."

        Die Lösung ist sehr ineffizient und evtl. schwer verständlich.
        Um dir die Korrektur zu erleichtern, habe ich den Gedankengang notiert:

        Zur Beschreibung:
        in findMapping werden erst alle möglichen, sich nicht überschneidende Tupel
        herausgesucht und in die liste myTimeSlots geworfen.
        Dann wird für jedes Tupel mit jeweils 2 TimeSlots geprüft rekursiv geprüft, ob es
        eine aus den übrigen Tutoren und den übrigen TimeSlots eine Lösung gibt.
        Es entsteht also ein Baum, deren Äste immer bei einer konkreten Kombination
        der TimeSlots Tupel TimeSlots verzweigen.
        Anschließend wird die Liste aus Verzweigungen und Ergebnisse der durch die Verzweigung
        erreichten Kinder aussortiert, sodass nur noch die gültigen Pfade übrigbleiben.
        head validList bewirkt, dass das erste korrekte Ergebnis entnommen wird.  

        -}

Finally, the last comment could do with a bit of formatting:

        {-a very rushed atempt due to schedule problems. my algorithm is trying to solve the problem by trying to put those numbers that need to be on 
        the leftside of the (timeslot,Timeslot) tuple. it will try to match away the highest snd on the right side so that it wont have to be put on the left sidelater on
        if it fails though dfindMap will ignore that match (put match in another list and combine lists again in the next smMap call
        )
        -}

And surely its author must know that English sentences are capitalized, that “wont” is spelled with an apostrophe, and that there should be a space in “sidelater”. But after all, this was “a very rushed atempt”, so what was the MC Sr. expecting? This is not quite Oettinger-level material, but perhaps it qualifies for the Deutsche Bahn Consolation Prize for Suboptimal Use of the English Language?

Incidentally, the MC Sr.'s blog is kind of a rushed job, too. It would be easy for the MC to spend days studying the solutions in detail, but his financial backers, the Deutsche Forschungsgemeinschaft (DFG) (not to mention the DFG's backers), probably wouldn't be too pleased about that. On the other hand, it's quite obvious that the MC Jr., who has so far been working on a volunteer basis, takes his obligations very seriously. For example, the MC Sr. doesn't bother putting Deutsch words in italics, whereas his Jr. Doppelgänger does.

End of grepping. Now, soundness tests. QuickCheck and its cousin SmallCheck are nice tools, but they often fail to find bugs in the presence of complicated conditions on the input.

First soundness test:

        findMapping [] [] == (True, [])

This is the case where there are 0 tutors and 0 slots (with 0 >= 2 × 0). It's rather disappointing to see 8 solutions “fly out”, with (False, []). There are better ways of getting disqualified.

Second soundness test:

        findMapping [] [(1, 2), (2, 3)] == (True, [])

Two solutions give (False, []), and one gives

        *** Exception: round2_bug2/20409/Exercise_6.hs:(113,1)-(122,74): Non-exhaustive patterns in function findMapping

Third soundness test:

        findMapping ["A"] [(1, 2), (1, 2)] == (False, [])

We lost 7, which either entered a loop (or were ridiculously slow, which is the same to the MC) or produced odd results such as

        (True, [])
        (True, [("A", (1, 2), (1, 2))])

Now at 73 solutions. At this rhythm, we'll reach a top 30 without even looking at the actual criterion! (Update: How prophetic!)

This is the reason why the MC Sr., when he writes the Aufgabentext, normally gives a long list of trivial looking examples. The MC Jr. is of the opinion that the TUM is an Eliteuniversität and shouldn't spoon-feed the students. Fair enough.

Incidentally, the Mistress of C. pointed out to the MC Sr. that KIT has, regrettably, lost its Elitestatus since the MC visited that venerable institution in 2009. Schade. The MC's goal wasn't to pour salt in the wound, nor to rub it in (even though that wouldn't be inconsistent with his persona). But the many lights in Karlsruhe's intellectual chandelier, among which Prof. Snelting, are still shining as brightly as ever, so don't let this temporary setback worry you even for a second. Indeed, with all due respect to his alma mater, the MC himself went to a much more provincial university for his bachelor, and now he's Tha MC! Take that, eh! He reckons that what's much more important than all that Elite stuff is how much energy and interest you put in your studies (including this very competition). No German Beamte will be able to take that away from you.

Fourth soundness test:

        fst (findMapping ["A", "B"] [(1, 11), (2, 12), (11, 21), (11, 22)]) == False

Some odd results:

        (True, [("A", (1, 11), (11, 22)), ("B", (1, 11), (11, 22))])
        (True, [("A", (1, 11), (11, 21)), ("B", (2, 12), (11, 22))])

Fifth soundness test:

        isMapping (findMapping ["A"] [(1, 11), (2, 10), (3, 9), (4, 8), (5, 7), (1, 2)])

(where isMapping just checks that the answer looks OK). 27 solutions failed to find a solution, even though at least three existed:

        [("A", (1, 2), (3, 9))]
        [("A", (1, 2), (4, 8))]
        [("A", (1, 2), (5, 7))]

Among the failures is the one by Herr Breitner, which does no backtracking. Korrektheit ist was anderes. ;-)

Sixth soundness test:

        isMapping (findMapping ["A", "B"] [(1, 11), (2, 10), (3, 9), (4, 8), (5, 7), (1, 2), (2, 5)])

We lost another 13! (That's thirteen, not 6227020800!) And that leaves us with 31, including the MC Jr.'s and probably some Kitties. It looks like we already have a Top 30, and we're not even done finding the bugs yet.

More soundness tests:

        isMapping (findMapping ["A", "B", "C"] [(1, 11), (2, 10), (3, 9), (4, 8), (5, 7), (1, 2), (2, 5), (1, 2)])
        fst (findMapping ["A", "B", "C", "D"] [(1, 11), (2, 10), (3, 9), (4, 8), (5, 7), (1, 2), (2, 5), (1, 2)]) == False
        isMapping (findMapping ["A", "B"] [(1, 10), (10, 11), (10, 20), (11, 12)])
        isMapping (findMapping ["A", "B"] [(1, 4), (2, 5), (3, 6), (2, 3), (3, 10), (4, 5)])
        isMapping (findMapping ["A", "B", "C"] [(20, 30), (20, 30), (1, 4), (3, 5), (4, 10), (6, 11)])
        isMapping (findMapping ["A", "B", "C", "D"] [(20, 30), (20, 30), (20, 30), (20, 30), (1, 4), (3, 5), (4, 10), (6, 11)])
        isMapping (findMapping ["A", "B", "C", "D", "E"] [(31, 32), (20, 30), (20, 30), (20, 30), (20, 30), (20, 30), (1, 4), (3, 5), (4, 10), (6, 11)])
        isMapping (findMapping ["A", "B", "C", "D"] [(1, 2), (1, 2), (1, 20), (21, 30), (2, 3), (2, 3), (1, 2), (2, 3)])

We lost 3+1+8+1+2+1+1+2 more. (Notice how simpler tests catch more culprits.)

At this point, the MC noticed that the solution by Denis Lohner, Mitarbeiter am KIT, was silently failing due to the absence of the dynamic library libglpk.dylib on the test machine. So unfortunately, we won't know how well Herr Lohner is going to score.

This is also where MC-check gave up. There are 12 solutions left that must be evaluated for speed. The test

        findMapping ["A", "B", "C", "D", "E", "F"] [(1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (1, 7), (2, 8), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7)]

caused Herr Bennet Breier (or rather, his solution) to time out (> 10 s). The test

        findMapping ["A", "B", "C", "D", "E", "F"] (replicate 1000 (0, 1) ++ replicate 1000 (0, 2) ++ replicate 6 (1, 2))

caused Herr Johann Frei and Herr Martin Wauligmann to time out. The test

        findMapping ["A", "B", "C", "D", "E", "F"] (replicate 10 (1, 2) ++ replicate 10 (0, 2) ++ replicate 5 (0, 1))

caused Herr Markus Ansorge, Herr Martin Sperr, Herr David Widmann, and Herr Alexander Ungar to time out. The test

        findMapping ["A", "B", "C", "D", "E", "F"] (replicate 50 (1, 2) ++ replicate 50 (0, 2) ++ replicate 5 (0, 1))

caused Herr Thomas Rahm, Herr Daniel Schubert, and Herr Julius Imbery to time out.

This leaves us with Herr or Frau Chaoran Chen as sole winner (sorry, Chinese names aren't the MC's forte any more than Vietnamese ones [*]), ex aequo with the MC Jr.! As for Herr Lohner's solution, we will never know, unless he runs it on his own machine and sends us the results. (Update: The MC Jr. managed to run the example on his machine, but he got a crash and core dump for the empty-list case. Not good.)

[*] Here the MC Sr. is compelled to inject a personal note. As one can imagine, he's used to receiving mail addressed to Frau B. But last week, when talking to the electricity company Energie SaarLorLux on the phone, he was shocked to hear “Vielen Dank für Ihren Anruf, Frau Dr. Blanchette” after a ten-minute conversation.

In conclusion, the Top 30 (!!) of the week is as follows:

Top 30 der Woche
  Platz Wettbewerber(in)   Punkte
1. Chaoran Chen 30
2. Thomas Rahm 29
Daniel Schubert 29
Julius Imbery 29
5. Markus Ansorge 26
Martin Sperr 26
David Widmann 26
Alexander Ungar 26
9. Martin Wauligmann 22
Johann Frei 22
11. Bennet Breier 20

A pure massacre.


The MC Jr. would like to add a thing or two about this week's problem. It seems that all (correct) submissions employed some sort of backtracking, which means that they are worst-case exponential. Indeed, some students even conjectured that this problem (like last week's) was NP-hard and therefore probably does not have a polynomial solution. However, there is a polynomial solution: if we arrange the time slots as nodes in a graph and connect precisely those time slots that do not overlap, the problem simply becomes that of finding a maximum matching in that graph. This problem can be solved efficiently, e.g. by Edmonds' blossom algorithm, and the MC Jr. implemented it here. That said, despite being worst-case exponential, Chaoran Chen's solution works extremely well on most practical inputs (much better than the MC Jr.'s) – but when the problems become ‘conflict-rich’ and backtracking kicks in, things look very different.
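The reduction can be sketched in a few lines (hypothetical code, not the MC Jr.'s implementation; a greedy matching stands in for the blossom algorithm and only gives a lower bound on the maximum matching size):

```haskell
import Data.List (tails)

type Slot = (Integer, Integer)

-- Assumption: slots are half-open intervals, so (1, 2) and (2, 3) do not overlap.
overlaps :: Slot -> Slot -> Bool
overlaps (a, b) (c, d) = a < d && c < b

-- Edges of the compatibility graph: indices of slot pairs that one
-- tutor could take together.
edges :: [Slot] -> [(Int, Int)]
edges ss = [ (i, j) | ((i, s) : rest) <- tails (zip [0 ..] ss)
                    , (j, t) <- rest, not (s `overlaps` t) ]

-- Greedy matching: pick edges whose endpoints are still unused.
-- A real polynomial solver would run Edmonds' blossom algorithm here.
greedyMatching :: [(Int, Int)] -> [(Int, Int)]
greedyMatching = go []
  where go _    []              = []
        go used ((i, j) : es)
          | i `elem` used || j `elem` used = go used es
          | otherwise                      = (i, j) : go (i : j : used) es
```

A mapping of k tutors exists precisely when the compatibility graph has a matching of size at least k, which is why the blossom algorithm makes the problem polynomial.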

Herr Lohner used an interesting approach: he used an external library for integer linear programming, which the MC Jr explicitly encouraged in the problem statement, although ILP was not what he had in mind. The MC expected at least one person to use an external SAT solver (such as picosat) to solve the problem, but ILP works as well. Or it would have, if the ILP library in question did not crash on empty lists.

The MC Jr certainly did not aim to create so brutal an assignment; he was under the impression that the brute force approach was fairly easy to accomplish. It would appear that he was wrong.


Another student, who was as baffled by the poor turnout this week as the MCs were, asked on Piazza whether his solution, which does not have {-WETT-} tags, could be considered for the Wettbewerb after all. Unfortunately, MC Check found that his code was incorrect. For some reason, his solution contains the comment -- The code never bothered me anyway. The MC is not sure what this refers to; he is quite certainly bothered by incorrect code, especially seeing as said student went to such lengths to get it noticed by the MC. Perhaps he should have just let it go.

What are the lessons this week?

Die Ergebnisse der siebten Woche

(Warning: Before you read this week's blog entry, make sure to get all the updates for last week's, by grepping for “pdate”.)

This week, there were exactly 100 non-undefined solutions, all of which compiled (at least after running cabal install). One solution is by a KIT student, another by a KIT tutor; 97 were by Munich students and one by a Munich tutor.

In the first round, the MC Sr. tested the input [(0, 0), (2, 0), (3, 0)], which was basically the running example in the Aufgabenblatt; to his disappointment, 7 solutions were lost at this stage.

He also tested [(0, 0), (1, 1), (2, 0)], wondering what would happen in the presence of an ambiguity (the nearest shelter to (1, 1) is not uniquely defined). Since the Aufgabentext wasn't entirely clear about what to do in that case, solutions that returned all nearest shelters instead of an arbitrary one were tolerated. After all, in a real-life scenario, such ambiguity could be helpful to prevent overcrowding the nearest shelter. It could also lead the refugees to hesitate, thereby wasting precious seconds and giving Godzilla (or Gojira) a chance to crush them.
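The tolerated behaviour can be sketched as follows (a hypothetical helper, not taken from any submission; Manhattan distance is assumed, matching the cost function used further down): keep every shelter at minimum distance instead of picking one arbitrarily.

```haskell
import Data.List (delete)

type Point = (Integer, Integer)

-- Manhattan distance between two points.
dist :: Point -> Point -> Integer
dist (x, y) (x', y') = abs (x - x') + abs (y - y')

-- All nearest shelters to p among ps (excluding p itself): every
-- shelter whose distance equals the minimum is kept, so ties survive.
allNearest :: Point -> [Point] -> [Point]
allNearest p ps = [q | q <- qs, dist p q == m]
  where qs = delete p ps
        m  = minimum (map (dist p) qs)
```

On the ambiguous test above, allNearest (1, 1) [(0, 0), (1, 1), (2, 0)] returns both (0, 0) and (2, 0), each at distance 2.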

Round 2 consisted of the correctness test [(0, 0), (2, 0), (5, 0), (12, 0), (0, 6), (2, 6), (5, 6), (12, 6)], which disqualified one solution, leaving 92.

At this point, the MC Sr. couldn't find more bugs. Either (1) the competitors were more careful than last week or (2) the problem was easier. The MC would like to believe it's (1), but he fears it's (2). We'll see further down. When running efficiency tests, the MC might still run into some bugs. (Update: Prophetic words!)

Most of the speed tests will be based on the mkCircle function:

        mkCircle res scale =
          map (\x -> (round (cos (x * 2 * pi / res) * scale),
                      round (sin (x * 2 * pi / res) * scale)))
              [0..res - 1]

Round 3 tries mkCircle 1000 1000, i.e., 1000 points more or less along the path of a circle of radius 1000. “More or less” because the x- and y-coordinates are rounded to the nearest integers. Solutions that took more than 1 s were kicked out. We lost 8 solutions, plus 1 buggy one, leaving 83.

To detect buggy solutions, the MC simply computed the “cost” of each solution and compared them:

        cost pqs =
          let pqs' = map head (Data.List.groupBy (\(x, _) (x', _) -> x == x') (Data.List.sort pqs))
          in sum (map (\((x1, y1), (x2, y2)) -> abs(x2 - x1) + abs(y2 - y1)) pqs')

Outliers were regarded with suspicion. For example, the optimal cost for mkCircle 100 100 is 736, but the buggy solution gave 738. This test is not complete, but it's hard to imagine that a buggy solution would consistently yield correct costs.

In round 4, the MC tried 10 concentric “circles”, Data.List.concatMap (\n -> mkCircle (100 * n) (100 * n)) [1..10], with a timeout of 10 s. This kicked out 37 solutions, leaving 46.

For round 5, the example becomes Data.List.concatMap (\n -> mkCircle n n) [1..80]. No timeouts this time, but one solution gave a result with a cost of 5749 instead of the expected 5747. The code seemed otherwise quite nice and is due to one of the Top 5 of WS 2012/2013, Herr Bergmaier; but the MC shows kein Pardon when it comes to Korrektheit.

Most programming languages have a rather imperfect notion of integers, by artificially squeezing them into 32 or 64 bits. Haskell is one of the few languages out there where the designers had the courage to bet on correctness, at the expense of efficiency. Haskell's Integers (unlike Ints) are true mathematical integers—God's integers as they are sometimes called (even by people who otherwise don't believe in God). When asked to write a function that takes God's integers as input, we should always strive to preserve this godliness.
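The danger is easy to demonstrate (a standalone illustration, using the large constant from round 6 below, not code from any submission):

```haskell
big :: Integer
big = 123456789123456789123456789

-- With Integer, the square is computed exactly.
square :: Integer
square = big * big

-- Squeezing the same value into an Int silently wraps around on a
-- 64-bit machine; godliness is lost without so much as a warning.
truncated :: Int
truncated = fromInteger square
```

Any solution that routes such coordinates through Int will produce plausible-looking but wrong distances.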

This longish preamble was just setting the stage for round 6:

        mkCircle res scale =
          map (\x -> (123456789123456789123456789 * round (cos (x * 2 * pi / res) * scale),
                      234567891234567891234567891 * round (sin (x * 2 * pi / res) * scale)))
              [0..res - 1]
        let ps = Data.List.concatMap (\n -> mkCircle (100 * n) (10 * n)) [1]

This was enough to kick out 3 more solutions. One of them had some conversions to Int; the other two had hard-coded constants such as 9999 and 3000 that looked highly suspicious. Um ernst zu sein, the MC expected some solutions to use floating-point arithmetic; but it turns out that those few solutions that did use Double already failed some earlier tests.

Round 7, final round! Lots of concentric circles:

        Data.List.concatMap (\n -> mkCircle n (100 * n)) [1..100]

This gives us our Top 30:

Top 30 der Woche
  Platz Wettbewerber(in)   Laufzeit (s)   Punkte
1. Dario Beckert 0.29 30
2. Julian Spahl 0.53 29
3. Christoph Rudolf 0.54 28
4. Rudolf Franz Hattenkofer 4.09 27
5. Dmitriy Zayko 4.13 26
6. Lukas Heinzmann 4.18 25
7. Alexander Küchler 4.30 24
8. Markus Obrull 4.41 23
9. Fabio Madge Pimentel 4.43 22
10. Borys Levkovskyi 4.69 21
11. Michael Hesse 4.77 20
12. Chaoran Chen 4.99 19
13. Felix Wielander 5.14 18
14. Sebastian Weiß 5.18 17
15. David Widmann 5.19 16
16. Jonas Heintzenberg 5.19 15
17. Philipp Jutzi 5.21 14
18. Marla Narazani 5.30 13
19. Johannes Stubenrauch 5.41 12
20. Michael Eder 5.44 11
21. Kirill Haar 5.48 10
22. Armin Hörtreiter 5.76 9
23. Baris Kayalibay 5.97 8
24. Sebastian Josef Neubauer 6.55 7
25. Sebastian Bruhn 6.66 6
26. Stefan Cimander 7.03 5
27. Leander Nachreiner 7.56 4
28. Simon Rieß 7.77 3
29. Lukas Prantl 7.84 2
30. Felix Thimm 8.02 1

To distinguish between Herr Widmann and Herr Heintzenberg, both at 5.19 s, the MC appealed to the second criterion, the token count, where Widmann had 59 tokens:

        distance (x,y) (x',y') = abs (x-x') + abs (y-y')

        nearestNeighbors xs = map (\x -> const x &&& snd . minimumBy (compare `on` fst) . map (distance x &&& id) $ delete x xs) xs

whereas Heintzenberg had 63 (but a similar algorithm):

        nearestNeighbors xs = map (nearest xs) xs

        nearest :: [Point] -> Point -> (Point, Point)
        nearest xs p@(x,y) = (p, snd $ minimumBy (compare `on` fst) $ map (\s@(a,b) -> (abs (a-x) + abs (b-y),s)) (delete p xs))

And what happened to our tutors? Herr Lohner (KIT) got kicked out at round 3 due to correctness. Herr Bergmaier got kicked out at round 5 due to correctness. Herr Brinkop forgot (?) his {-WETT-} and {-TTEW-} tags.

Let's look at the winners, starting with the first place, by Herr Beckert:

        nearestNeighbors :: [Point] -> [(Point, Point)]
        nearestNeighbors = procGrid [] . map (sortBy (compare `on` snd)) . groupBy ((==) `on` fst) . sortBy (compare `on` fst)

        {-Process-all-the-things!-}

        procGrid :: [[Point]] -> [[Point]] -> [(Point, Point)]
        procGrid _   []          = []
        procGrid lft (col : rgt) = map (colFst lft col rgt) col ++ procGrid (col : lft) rgt

        procSide :: [[Point]] -> Point -> Maybe Point -> Maybe Point
        procSide []           _ m = m
        procSide (col : cols) p m
            | closerThan d p m    = m
            | otherwise           = procSide cols p (maybeMin p (procCol col p) m) where
            d = abs (fst p - fst (head col))

        procCol :: [Point] -> Point -> Point
        procCol [q]      _                      = q
        procCol (q : qs) p
            | p `distTo` q < p `distTo` head qs = q
            | otherwise                         = procCol qs p

        procOwn :: [Point] -> Point -> Maybe Point -> Maybe Point
        procOwn [_]      _ m = m
        procOwn (q : qs) p m
            | p == q         = if p == head qs then Just p else maybeMin p (head qs) m
            | otherwise      = procOwn qs p (Just q)

        {-Get-neighbor-or-crash-tryin'-}

        colFst :: [[Point]] -> [Point] -> [[Point]] -> Point -> (Point, Point)
        colFst lft col rgt p
            | closerThan 1 p h = (p, fromJust h)
            | otherwise        = lftSnd lft rgt p h where
            h = procOwn col p Nothing

        lftSnd :: [[Point]] -> [[Point]] -> Point -> Maybe Point -> (Point, Point)
        lftSnd lft rgt p m
            | closerThan 1 p h = (p, fromJust h)
            | otherwise        = rgtLst rgt p h where
            h = procSide lft p m

        rgtLst :: [[Point]] -> Point -> Maybe Point -> (Point, Point)
        rgtLst rgt p m = (p, fromJust (procSide rgt p m))

        {-Utils-}

        distTo :: Point -> Point -> Integer
        distTo (px, py) (mx, my) = abs (px - mx) + abs (py - my)

        closerThan :: Integer -> Point -> Maybe Point -> Bool
        closerThan _ _ Nothing  = False
        closerThan i p (Just m) = p `distTo` m <= i

        maybeMin :: Point -> Point -> Maybe Point -> Maybe Point
        maybeMin _ h Nothing = Just h
        maybeMin p h (Just m)
            | p `distTo` m <= p `distTo` h = Just m
            | otherwise                    = Just h

This looks like an impressive solution, but without documentation or references it is difficult to make much sense out of it. :-S Perhaps Herr Beckert could send the MC Sr. a description, which could then be inserted here? Vielen Dank im Voraus für Ihre Bemühungen!

The second place, by Herr Spahl, is again hard to make sense of because of the lack of documentation; but it seems very smart!

        nearestNeighbors :: [Point] -> [(Point, Point)]
        nearestNeighbors ps = findNearestAll ps (groupMatrix (sortBy (order)ps) 0 [] []) []

        groupMatrix :: [Point] -> Integer -> [Integer]-> [(Integer,[Integer])] -> [(Integer,[Integer])] 
        groupMatrix [] i ls rs = if null ls then rs else (rs++[(i,ls)])
        groupMatrix (p:ps) i ls rs 
          |null ls    = groupMatrix ps (fst p) ([snd p]) rs
          |fst p == i = groupMatrix ps (fst p) (ls++[snd p]) rs
          |otherwise  = groupMatrix ps (fst p) [snd p] (rs++[(i,ls)])

        findNearestAll :: [Point] -> [(Integer,[Integer])] -> [(Point, Point)]-> [(Point, Point)]
        findNearestAll []  _ rs = rs
        findNearestAll a@(p:ps) gs rs = findNearestAll ps gs ((matchx p [] gs) :rs)

        matchx :: Point -> [(Integer,[Integer])] -> [(Integer,[Integer])] ->  (Point,Point)
        matchx (x,y) bs ( (a@(x2,yswy)): gs ) 
          |null gs && null ys = findNearest (x,y) (tail bs, []) (fst(head bs), (findNearInRow y q (head q)))
          |null gs =   findNearest (x,y) (bs, []) (x2, (findNearInRow y (ys) (head ys)))
          |null ys =  matchx (x,y) ( bs) gs  
          |x2 >= x  =  findNearest (x,y) (bs, gs) (x2, (findNearInRow y (ys) (head ys)))
          |otherwise = matchx (x,y) ( a :bs) gs 
          where ys = if x2 == x then delete y yswy else yswy -- rem point !emptylist
        	q = snd (head bs) 

        findNearest :: Point -> ([(Integer,[Integer])], [(Integer,[Integer])])-> Point -> (Point, Point) -- start near
        findNearest s ([], []) e = (s,e) 

        findNearest s (((x1,ys1):xs), []) e 
          |distance s e >= abs (fst s -x1) = findNearest s (xs, []) (nearer s e (x1, findNearInRow (snd s) (tail ys1) (head ys1) ))
          |otherwise = (s,e)
        findNearest s ([], ((x2,ys2):ys)) e 
          |distance s e >= abs (fst s -x2) = findNearest s ([], ys) (nearer s e (x2, findNearInRow (snd s) (tail ys2) (head ys2) ))
          |otherwise = (s,e)
  
        findNearest s (((x1,ys1):xs), ((x2,ys2):ys)) e 
              |distance s e < abs (fst s -x1) = findNearest s ([], ys) n
              |distance s e < abs (fst s -x2) = findNearest s (xs, []) n
              |otherwise = findNearest s (xs, ys) n
          where pu = (x1, findNearInRow (snd s) (tail ys1) (head ys1) )
                pd = (x2, findNearInRow (snd s) (tail ys2) (head ys2) )
                nf = nearer s pu pd
                n = nearer s nf e
  
        findNearInRow :: Integer -> [Integer] -> Integer -> Integer
        findNearInRow y [] r = r
        findNearInRow y a@(x:xs) r
         -- |null a = r doesnt work
          |abs (y-x)> abs (y-r) = r
          |otherwise = findNearInRow y xs x
  
        order (x,a) (y,b)
          |x < y = LT
          |x > y = GT
          |a < b = LT
          |a > b = GT
          |otherwise = EQ
  
        distance :: Point -> Point -> Integer
        distance p q = abs(fst p - fst q) + abs(snd p - snd q)

        nearer :: Point -> Point -> Point -> Point
        nearer s p q 
          |distance s p < distance s q = p
          |otherwise = q

The third place, by Herr Rudolf, is more to the MC's taste in terms of documentation (although he would have appreciated a line wrap at 80 or at most 100 characters):

        {- Version II using a k-d Tree (http://orion.math.iastate.edu/reu/2001/nearest_neighbor/p509-bentley.pdf), which (according to Wikipedia) is a 
           standard approach to the 'nearest neighbor search' and solves the problem reasonably well for few dimensions. In the original paper (as linked above), 
           the NNS is even described as one application for this structure. -}

        --Regarding the structure of the tree, the slides of the lecture provide us with this definition (except for the Leaf), which fits perfectly:
        data Tree a = Empty | Node a (Tree a) (Tree a) | Leaf a deriving (Eq, Show)

        --Cycling through the axis X and Y
        --This cycling leads to a huge amount of duplicate code, as there are two functions for building a tree by splitting elements and either the x or the y coordinate.
        --The same situation occurs when traversing the tree, which leads to two almost identical search functions.
        --I'm curious if there is a design pattern for cases like this?
        --I thought about passing a function like fst and snd as parameter. but this doesn't solve all problems and pattern matching should be faster.

        --Functions for building the k-d-tree
        --If time left: improve the performance by creating a balanced tree (sorting and taking the center element as node)?
        buildTreeX :: [Point] -> Tree Point
        buildTreeX []				= Empty
        buildTreeX [p]				= Leaf p
        buildTreeX ((x,y):points)	= Node (x,y) leftSubTree rightSubTree
        	where leftSubTree = buildTreeY (filter (\(a,b) -> a <= x) points)
        	      rightSubTree = buildTreeY (filter (\(a,b) -> a > x) points)

        buildTreeY :: [Point] -> Tree Point
        buildTreeY []				= Empty
        buildTreeY [p]				= Leaf p
        buildTreeY ((x,y):points)	= Node (x,y) leftSubTree rightSubTree
        	where leftSubTree = buildTreeX (filter (\(a,b) -> b <= y) points)
        	      rightSubTree = buildTreeX (filter (\(a,b) -> b > y) points)
	
	
        --The search function itself
        --I handle multiple coordinates separately, as this doesn't fit too well with my search inside the tree
        --Why would NYC build multiple of those Godzilla-Panic-Rooms in the same location anyway?
        nearestNeighbors :: [Point] -> [(Point, Point)]
        nearestNeighbors points = map mapNeighbor points
        	where tree = buildTreeX points
        	      multiples = map head (filter (\x -> length x > 1) (group $ sort points))	--getting all multiples once
        	      mapNeighbor p = if elem p multiples then (p, p)							--for each, safety room, see if it has a neighbor with the same coordinate
        		                  else (p, nearestNeighborX p tree)


        --Functions solving the problem for a single point:
        --Again, I apologize for the duplicates.
        nearestNeighborX :: Point -> Tree Point -> Point
        nearestNeighborX point (Leaf node) = node
        nearestNeighborX (px,py) (Node (nx,ny) lTree rTree)
        	| lTree == Empty 					= bestInRight
        	| rTree == Empty 					= bestInLeft
        	| px <= nx 							= testOtherBranchX (px,py) (nx,ny) bestInLeft rTree
        	| otherwise							= testOtherBranchX (px,py) (nx,ny) bestInRight lTree
        	where bestInLeft = getBest (px,py) (nx,ny) (nearestNeighborY (px,py) lTree)
        	      bestInRight = getBest (px,py) (nx,ny) (nearestNeighborY (px,py) rTree)
		  
        testOtherBranchX point node best Empty						= best 
        testOtherBranchX (px,py) (nx,ny) (bestx, besty) nextTree 	= if (abs (px - nx)) < getDistance (px, py) (bestx, besty)
        	                                                          then getBest (px,py) (bestx, besty) (nearestNeighborY (px,py) nextTree)
        															  else (bestx,besty)

        nearestNeighborY :: Point -> Tree Point -> Point
        nearestNeighborY point (Leaf node) = node
        nearestNeighborY (px,py) (Node (nx,ny) lTree rTree)
        	| lTree == Empty 					= bestInRight
        	| rTree == Empty 					= bestInLeft
        	| py <= ny							= testOtherBranchY (px,py) (nx,ny) bestInLeft rTree
        	| otherwise							= testOtherBranchY (px,py) (nx,ny) bestInRight lTree
        	where bestInLeft = getBest (px,py) (nx,ny) (nearestNeighborX (px,py) lTree)
        	      bestInRight = getBest (px,py) (nx,ny) (nearestNeighborX (px,py) rTree)

        testOtherBranchY point node best Empty						= best 
        testOtherBranchY (px,py) (nx,ny) (bestx, besty) nextTree 	= if (abs (py - ny)) < getDistance (px, py) (bestx, besty)
        	                                                          then getBest (px,py) (bestx, besty) (nearestNeighborX (px,py) nextTree)
        															  else (bestx,besty)

        --Calculating the 'Manhattan distance'
        getDistance :: Point -> Point -> Integer
        getDistance (x1,y1) (x2,y2) = abs (x2 - x1) + abs (y2 - y1)

        --Get the better one of two possible neighbors
        getBest :: Point -> Point -> Point -> Point
        getBest p n1 n2
        	| p == n1 								= n2	--Ignore the point itself, however this requires multiple
        	| p == n2 								= n1	--safety rooms at the same location to be handled separately
        	| getDistance p n1 <= getDistance p n2	= n1
        	| otherwise								= n2

As an answer to the comment

        --Cycling through the axis X and Y
        --This cycling leads to a huge amount of duplicate code, as there are two functions for building a tree by splitting elements and either the x or the y coordinate.
        --The same situation occurs when traversing the tree, which leads to two almost identical search functions.
        --I'm curious if there is a design pattern for cases like this?
        --I thought about passing a function like fst and snd as parameter. but this doesn't solve all problems and pattern matching should be faster.

the MC would like to point out that you can't have your cake and eat it too. Passing functions as parameters makes the code more generic, and for this you might have to pay a performance penalty (or not).

<comment author="MC Jr">
The MC Jr disagrees. By passing an infinite list of fst and snd to the traversal functions and some other simplifications, the MC Jr was able to cut the size of Mr Rudolf's code in half. Example:

    both :: (a -> b) -> (a, a) -> (b, b)
    both f (x, y) = (f x, f y)

    -- f `is` p tests the property p on the f-projection of a point
    is :: (a -> b) -> (b -> Bool) -> a -> Bool
    is f p = p . f

    -- partition comes from Data.List
    buildTree :: [Point] -> Tree Point
    buildTree = buildTree' (cycle [fst, snd])
      where buildTree' _      []         = Empty
            buildTree' _      [p]        = Leaf p
            buildTree' (f:fs) (p:points) = Node p l r
              where (l, r) = let x = f p in both (buildTree' fs) $ partition (f `is` (<= x)) points

While it is true that additional generality can cost performance, this is often negligible, especially considering the cost of code duplication (decreased maintainability, more prone to bugs). In this case, the MC Jr's version of Mr Rudolf's code was only about 10% slower than the original one; with compiler optimisations enabled, it is even consistently slightly faster than the original version.

Note: an even better solution would have been to encode the splitting dimension (x or y) directly in the type of the tree, e.g.:

    data Dim = X | Y
    data Tree = Node Dim Point Tree Tree | Leaf Point | Empty

Or, if you want to go above and beyond (the MC Jr, for one, does) and encode the fact that the splitting dimension always alternates directly into the type using Generalised Algebraic Datatypes (GADTs):

    data Dim = X | Y

    type family OtherDim (d :: Dim) :: Dim where
      OtherDim X = Y
      OtherDim Y = X

    data Tree :: Dim -> * where
      Leaf :: Tree d
      Node :: Point -> Tree (OtherDim d) -> Tree (OtherDim d) -> Tree d
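To see that this compiles and really enforces the alternation, here is a self-contained sketch; the `size` function and the `example` tree are illustrations only, not part of any submission:

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeFamilies #-}

type Point = (Integer, Integer)

data Dim = X | Y

type family OtherDim (d :: Dim) :: Dim where
  OtherDim 'X = 'Y
  OtherDim 'Y = 'X

data Tree :: Dim -> * where
  Leaf :: Tree d
  Node :: Point -> Tree (OtherDim d) -> Tree (OtherDim d) -> Tree d

-- Counting nodes works uniformly for any splitting dimension.
size :: Tree d -> Int
size Leaf         = 0
size (Node _ l r) = 1 + size l + size r

-- A two-level tree: the root splits on X, so its children necessarily split on Y.
example :: Tree 'X
example = Node (5, 5) (Node (2, 2) Leaf Leaf) (Node (8, 8) Leaf Leaf)
```

A `Node` whose child has the same `Dim` index as its parent is now a type error, so the alternation invariant is checked by GHC instead of by the programmer.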

</comment>

The solution goes on with a naive version and some test data, including this nice pearl:

        --In order to prevent last weeks tragedy, we test the function.

Amen! (But “last week's”, not “last weeks”.)

When compiling the entries, the MC noticed that three competitors used Haskell packages for k-d trees. What happened to them? The following comment, by Herr Schilling, is somewhat illuminating:

        --Here is a somewhat more efficient variant using k-d trees.
        --Admittedly, Data.Trees.KdTree requires the distance metric to be given as a Double,
        --but it is assumed that this is possible without problems, because
        --     a) it fits the problem statement: Manhattan is not that big, so only so many
        --     shelters fit into it; it should therefore be possible to encode them as Integers such that
        --     the squared distance metric can be converted into a Double without losing too much precision.
        --     Realistically, if the distance between two nearest-neighbour shelters is so large
        --     that conversion into a Double is no longer possible, the fleeing people have no chance anyway.
        --     b) it is late, and I don't feel like implementing k-d trees myself. But (a) is, of course,
        --     the motivating reason why Data.Trees.KdTree is used nonetheless.

But his solution got kicked out in round 3 already because it was so slow. Something seems to have gone terribly wrong with the use of the data structure. Furthermore, compromising on correctness is always risky, because the MC is likely to catch it. Indeed, the two other competitors who tried Haskell's KdTrees were kicked out by the correctness tests in rounds 2 and 3 already. The MC tends to follow the spirit of the Problemstellung when testing for performance, but for correctness he sticks to the letter of the law. Also, the MC hadn't specified any units; what if nanometers are used?

What's a k-d tree? The MC doesn't want to repeat what's written on Wikipedia, but you might enjoy reading it more if you replace k by 2 and “hyperplane” by “plane”.

The idea (and much of the wording) for the exercise came from the scientific article “Functional Pearl: Nearest Shelters in Manhattan” by Shin-Cheng Mu and Ting-Wei Chen (Springer). The basic idea, which seems to underlie some of the solutions above, appears to be as follows: If you want to find the nearest Manhattan-distance point from (x, y), it is either going to be to the North-West, North-East, South-West, or South-East. So you can, for each point, find its nearest NW, NE, SW, and SE neighbor, and then select the nearest of those four. This means we can make four sweeps over the data and merge the results at the end. How to perform those sweeps efficiently? The MC would be dishonest if he claimed that he fully understands it; but the details are there on Wikipedia and in the above paper for those who are very curious.

One more thing: The solutions by Herr Levkovskyi and Herr Jutzi returned [((0, 0), (0, 0))] instead of [((0, 0), (0, 0)), ((0, 0), (0, 0))] for nearestNeighbors [(0, 0), (0, 0)]. This was not entirely clear in the Aufgabentext, but there was an example given that should have clarified it. There's no excuse for not trying the examples. Still, since it can be argued that examples are not really part of the Aufgabe, and only serve to complement it, the MC decided to be generous and let that pass. (Indeed, the example hard-coded a specific order for the results, but the text said that the order was unspecified, which shows that the examples should sometimes be taken with a grain of salt.)
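Incidentally, the intended behaviour, duplicates included, is easy to pin down with a naive quadratic reference implementation. Here is a sketch (not any competitor's code; `Point` as in the Aufgabe):

```haskell
import Data.List (minimumBy)
import Data.Ord (comparing)

type Point = (Integer, Integer)

distance :: Point -> Point -> Integer
distance (a, b) (c, d) = abs (a - c) + abs (b - d)

-- For every point, pair it with a nearest *other* point. A duplicated point is
-- its own nearest neighbour, so [(0,0),(0,0)] yields two result pairs.
nearestNeighborsNaive :: [Point] -> [(Point, Point)]
nearestNeighborsNaive ps =
  [ (p, minimumBy (comparing (distance p)) (deleteOne p ps)) | p <- ps ]
  where deleteOne x (y:ys) | x == y    = ys
                           | otherwise = y : deleteOne x ys
        deleteOne _ []                 = []
```

Such a reference is also handy for QuickCheck-testing a fancier O(n log n) solution against.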

What are the lessons this week?

Die Zwischenauswertung der achten und neunten Woche

Hallo, this is the MC Jr speaking. The MC Jr would like to apologise for the delay, but he was rather busy this week, chasing paper deadlines with one hand and writing an IO mocking system for next week's homework exercises with the other. After all, we do not want any old student to run ugly side-effect-bearing IO actions on our test servers, so the MC Jr hacked together a system that encapsulates IO actions in pure Haskell code and effectively simulates a file system, a network, and a user at the console in order to test your homework submissions.

But now let us have a look at the Wettbewerb submissions so far. There were 196 submissions this week which contained {-WETT-} tags and compiled (plus one by Joachim Breitner from KIT and one by the MC Jr). Funnily enough, only six people (including the MC Jr himself) actually used the advanced interface. Even some of the best solutions simply used nextMove. Anyway, the MC Jr used some grep, sed, and bash magic to create a Haskell module that imports all 198 submissions. Some competitors made the MC Jr's job unnecessarily difficult by handing in solutions using ancient and obscure encodings such as ISO-8859-1 and Mac OS Classic-style line breaks, but that was easily rectified, too.

Then things got a bit hairy: because with the advanced strategy interface, every strategy can use a state of an arbitrary type, you cannot simply put different strategies into a list (since they have different types, e.g. Strategy Int vs. Strategy Bool). This can be solved by using an existential type wrapper:

    data MyStrategy = forall u. MyStrategy (Strategy u)

Trying to use those wrapped strategies led to the MC Jr's all-time favourite GHC error message: ‘My brain just exploded’. Anyway, after a bit of fiddling about, the MC Jr found out how to work with those wrapped strategies:

    let r = case (s1, s2) of (MyStrategy s1', MyStrategy s2') -> playGame g (s1', s2')

In the end, the MC Jr required RankNTypes, ExistentialQuantification, and ImpredicativeTypes for his testing code. Okay, now we have a big list of all strategies and a way to let them play against one another, but there are still 198 of them. Let's decrease that number a bit, shall we?
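For readers who have never seen existential types: here is a self-contained toy version of the setup. The `Strategy` type below is a simplified stand-in for the exercise's actual interface, which is not reproduced here:

```haskell
{-# LANGUAGE ExistentialQuantification #-}

type Board = [Int]   -- placeholder for the real game state

-- A strategy carries a start state of type u and a move function that
-- inspects the board, returns a move, and threads its state along.
data Strategy u = Strategy
  { initial :: u
  , move    :: Board -> u -> (Int, u)
  }

-- The existential wrapper hides u, so differently-typed strategies
-- fit into one homogeneous list.
data MyStrategy = forall u. MyStrategy (Strategy u)

-- A strategy with Int state (it counts its own moves) ...
counter :: Strategy Int
counter = Strategy 0 (\_ n -> (n, n + 1))

-- ... and one with no state at all.
stateless :: Strategy ()
stateless = Strategy () (\b _ -> (head b, ()))

-- Pattern matching on MyStrategy brings the hidden u back into scope locally.
firstMoves :: Board -> [MyStrategy] -> [Int]
firstMoves b = map (\(MyStrategy s) -> fst (move s b (initial s)))
```

The crucial restriction is that the hidden type `u` may not escape the pattern match, which is exactly why the `case` trick above works: the result of `playGame` no longer mentions `u`.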

Round 1
In the first round, every strategy plays 50 games against the random strategy, 25 as the red player and 25 as the blue player. Any strategy that throws an exception or attempts to play an invalid move is disqualified immediately. 110 strategies were disqualified for throwing an exception (mostly undefined), 3 were disqualified for playing invalid moves (tut tut!), and another 4 did not terminate within a reasonable amount of time. Of these last four, three were very simple strategies that tried to return an arbitrary valid move and failed, looping infinitely. The fourth was a very sophisticated solution by Joachim Breitner, for which the MC Jr had high hopes after seeing an earlier version of it perform quite well. Mr Breitner, please fix your solution by Friday, it would be quite a pity to see it disqualified. As it says in the Aufgabenstellung, the MC Jr will not wait for more than a few seconds every turn.

Round 2
The remaining 81 strategies play another 100 games against the random strategy. If a strategy does not win at least 60 of these games, it is culled from the list as it is ostensibly not significantly better than just picking moves at random and thus not worth the MC Jr's precious CPU time. This eliminated another 67 strategies, plus one that played an invalid move. Most strategies either win about half of their games (so they are about as good as the random strategy) or over 90 % of their games (which makes them significantly better than the random strategy), but one strategy actually managed to win only 1 (!) out of its 100 games. This was done by filling columns from left to right as the red player and rows from left to right as the blue player, which, of course, results in a shockingly terrible strategy.

Round 3
At this point, a meagre 13 strategies remain. The MC Jr let each of these strategies play against every other strategy 200 times, 100 times in the combination red/blue and 100 times in the combination blue/red. Every time a strategy wins against another, it receives one point. In the end, the strategies are ranked with respect to points, yielding the following ranking:

Vorläufiges Strategienranking
 Platz Wettbewerber(in) Punkte
 1. Fabio Madge Pimentel 2343 
 2. Christoph Rudolf 2097 
 3. MC Jr 1822 
 4. Martin Wauligmann 1550 
 5. Joshua Rombauer 1326 
 6. Chaoran Chen 1184 
 7. Lukas Prantl 1116 
 8. Sebastian Weiß 965 
 9. Christoph Griesbeck 931 
 10. Florian Wiedner 887 
 11. Sebastian Neubauer 626 
 12. Alexander Book 572 
 13. David Widmann 181 
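For reference, the tallying behind such a round-robin ranking is tiny. Here is a sketch, where the hypothetical `winsAgainst` oracle stands in for actually playing the 200 games per pairing:

```haskell
import Data.List (sortBy)
import Data.Ord (comparing, Down(..))

-- One point per game won against any other player; rank by total points.
rankRoundRobin :: Eq name => (name -> name -> Int) -> [name] -> [(name, Int)]
rankRoundRobin winsAgainst players =
  sortBy (comparing (Down . snd))
    [ (p, sum [winsAgainst p q | q <- players, q /= p]) | p <- players ]
```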

Don't forget: these points are the points the strategy achieved; they are not Wettbewerb points. Those will be awarded after the deadline on Friday. This is merely a first unofficial glimpse of the current status of the submissions.

As a special treat, you can have a look at two particular games in which Mr Madge Pimentel's strategy defeats the two runners-up effortlessly.

  1. Christoph Rudolf (Red) v Fabio Madge Pimentel (Blue)
  2. MC Jr (Red) v Fabio Madge Pimentel (Blue)

Considering how little time he invested in his own strategy, the MC Jr is a bit disappointed that it still landed in a respectable third place. Da ist noch Luft nach oben! The MC Jr hopes that the competitors will use the time until Friday to improve their solutions, so until then: Happy Hacking! And remember: there is no such thing as a draw in the Game of Stones.

Die Ergebnisse der achten und neunten Woche

If you're only here to look at some pretty pictures: look here.

After some intense testing, the MC Jr has compiled the results for the eighth week. Apparently, some competitors took the MC Jr's call to improve their solutions before the final deadline to heart; third place, for instance, is occupied by Lukas Tenbrink, who was still absent from the preliminary ranking. Joachim Breitner fixed the stack overflow in his code and actually claims that the code came from fgl, Haskell's Functional Graph Library. Apparently, fgl used foldr instead of foldl in one place, leading to very bad performance – more on that in one of the next lectures. All in all, however, the results look very similar to those from one week ago:

Top 30 der Woche
  Platz  Wettbewerber(in)                       Spielpunkte  Wettbewerbspunkte
  1.     Fabio Madge Pimentel                   1597         60
  2.     Joachim Breitner (KIT-Übungsleitung)   1548         n/a
         Christoph Rudolf                       1416         58
  3.     Lukas Tenbrink                         1237         56
  4.     MC Jr                                  1055         n/a
         Chaoran Chen                           1027         54
  5.     Julian Spahl                           1020         52
  6.     Maximilian Endraß                      1013         50
  7.     David Widmann                          994          48
  8.     Daniel Schubert                        946          46
  9.     Martin Wauligmann                      935          44
  10.    Joshua Rombauer                        517          42
  11.    Sebastian Weiß                         457          40
  12.    Lukas Prantl                           425          38
  13.    Florian Wiedner                        401          36
  14.    Christoph Griesbeck                    318          34
  15.    Sebastian Neubauer                     219          32
  16.    Alexander Book                         175          30

The mode of ranking was similar to the Zwischenauswertung: anything that throws exceptions or does not win against the random strategy at least 60% of the time is disqualified; the remaining strategies play 50 games against each other as each colour. Note, in particular, that the MC Jr did not award any points to competitors whose strategies were no better than the random strategy, so we only have a Top 16 this week. This is due to the fact that when strategies essentially pick moves completely at random, whether or not a particular one lands in the Top 30 is more or less down to luck. Besides, it would mean a significantly higher number of strategies to test, and the MC Jr's time, unlike his fondness for writing moderately witty comments on other people's code, is limited.

The MC Jr wrote a Haskell program to generate this nice tournament table, which shows precisely how often each player won against every other player. (An entry n means that the player in that row won n games out of 100 against the player in that column.)

         FMP  JB   CR   LT   MCJ  CC   JS   ME   DW   DS   MW   JR   SW   LP   FW   CG   SJN  AB
FMP      n/a  50   100  100  99   98   50   100  100  100  100  100  100  100  100  100  100  100
JB       50   n/a  100  100  100  98   50   100  100  100  50   100  100  100  100  100  100  100
CR       0    0    n/a  100  91   93   50   100  90   100  100  100  92   100  100  100  100  100
LT       0    0    0    n/a  82   64   50   50   92   100  100  100  99   100  100  100  100  100
MCJ      1    0    9    18   n/a  52   59   50   54   89   47   90   94   100  97   100  95   100
CC       2    2    7    36   48   n/a  50   42   32   98   15   100  95   100  100  100  100  100
JS       50   50   50   50   41   50   n/a  50   50   50   50   100  79   50   50   50   100  100
ME       0    0    0    50   50   58   50   n/a  4    50   82   100  69   100  100  100  100  100
DW       0    0    10   8    46   68   50   96   n/a  14   86   89   69   100  97   100  61   100
DS       0    0    0    0    11   2    50   50   86   n/a  47   100  100  100  100  100  100  100
MW       0    50   0    0    53   85   50   18   14   53   n/a  50   62   100  100  100  100  100
JR       0    0    0    0    10   0    0    0    11   0    50   n/a  46   50   50   100  100  100
SW       0    0    8    1    6    5    21   31   31   0    38   54   n/a  25   55   32   75   75
LP       0    0    0    0    0    0    50   0    0    0    0    50   75   n/a  50   50   100  50
FW       0    0    0    0    3    0    50   0    3    0    0    50   45   50   n/a  50   100  50
CG       0    0    0    0    0    0    50   0    0    0    0    0    68   50   50   n/a  50   50
SJN      0    0    0    0    5    0    0    0    39   0    0    0    25   0    0    50   n/a  100
AB       0    0    0    0    0    0    0    0    0    0    0    0    25   50   50   50   0    n/a

If you prefer shiny pictures to dry numbers, take a look at the MC Jr's collection of games rendered as GIFs, but consider yourselves warned: each game is between 600 kB and 3.5 MB. They are, however, very pretty.

The many ‘50’ entries typically stem from the situation where two deterministic strategies play against each other: each strategy wins when it is the Red player and loses when it is the Blue player, as the Red player has a small advantage.

The MC Jr can now also finally give the game its proper name: it is called Hex and was invented by the Danish mathematician Piet Hein and (independently) by the famous John Nash. By a standard strategy-stealing argument, it can be shown that the red player has a winning strategy on any rhomboidal board, but for the board size of 11×11, it is not yet known what this strategy is. Some good heuristics for Hex AIs exist, and some competitors implemented some of them to varying success. Let us go through the different approaches:

The last 5: most of the strategies in the last quarter of the Top 18 do very little analysis of the game field and simply pick the first valid move according to a predetermined pattern that is, for varying reasons, slightly better than picking moves at random. As can be seen from the table above, these strategies perform poorly against the more sophisticated strategies in the Top 10, but, mind you, even the last place was worth 30 points this time.

Sebastian Weiß's strategy is already slightly more interesting: in every move, it attempts to place a stone next to one of its other stones along the principal axis (i.e. left-right if red, top-bottom if blue). This strategy is good enough to win against the random strategy nearly every time, but it still does not take into account the opponent's moves at all.

With six lines of code and a solid tenth place, Joshua Rombauer wins the award for the best result with the smallest effort: his strategy, which the MC Jr calls the ‘Goldilocks strategy’, is but a tiny variation of the trivial strategy also realised in the Musterlösung (i.e. performing the first valid move), and that variation is in the order in which the possible positions are traversed: along the principal axis and preferably close to the middle.
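The idea can be sketched in a few lines. This is a reconstruction, not Mr Rombauer's actual code; the board coordinates are assumed to be 0-based on an 11×11 board, with the first component as the principal axis:

```haskell
import Data.List (sortOn, find)

type Pos = (Int, Int)   -- assumed (column, row), 0-based

-- All positions, those nearest to the middle of the principal axis first.
-- sortOn is stable, so within one distance class the original order is kept.
goldilocksOrder :: [Pos]
goldilocksOrder =
  sortOn (\(x, _) -> abs (x - 5)) [ (x, y) | x <- [0 .. 10], y <- [0 .. 10] ]

-- The trivial strategy with that ordering: the first valid move wins.
goldilocksMove :: (Pos -> Bool) -> Maybe Pos
goldilocksMove valid = find valid goldilocksOrder
```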

Martin Wauligmann finally implemented some reaction to the opponent's play, and this leads to a 400-point gap between his ninth place and the tenth place behind him. If he has a winning move, he plays it. If his opponent has a winning move, he tries to block it. (The MC Jr will refer to this as the ‘winning move analysis’ from now on.) If neither is the case, he tries to place a stone along the top-left-to-bottom-right diagonal, which is an interesting choice, but does lead to a strange situation where it beats the much more sophisticated solution by Joachim Breitner in second place. (Wauligmann/Breitner)
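In the abstract, the winning move analysis is a two-step lookahead. Here is a sketch with all game-specific parts factored out; every name below is an assumption for illustration, not the exercise's actual API:

```haskell
-- Play my winning move if one exists; otherwise block the opponent's winning
-- move if one exists; otherwise fall back to some heuristic.
winningMoveAnalysis
  :: (board -> [move])                  -- valid moves on this board
  -> (player -> board -> move -> Bool)  -- would this move win the game for p?
  -> (board -> move)                    -- fallback heuristic
  -> player -> player -> board -> move
winningMoveAnalysis validMoves wins fallback me opponent board
  | (m:_) <- winningFor me       = m
  | (m:_) <- winningFor opponent = m    -- occupy the opponent's winning field
  | otherwise                    = fallback board
  where winningFor p = filter (wins p board) (validMoves board)
```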

Daniel Schubert turned the sophistication knob all the way up to eleven with his solution. He wrote a neural network implementation and a simple AI and then trained the network to play against the AI (for 16 hours (!)). The MC Jr does not know the first thing about neural networks and admires Mr Schubert's enthusiasm and dedication, but since Martin Wauligmann's simple heuristic was nearly as good as the neural network, neural networks are probably not the best approach for this problem. The fact that it was only trained to play against one particular strategy may also have been a factor. He wrote in his comments: ‘As this is my first neural network etc., please don't expect too much from it. I'm pretty sure you will find a better setup for it that gives us way better results.’ That may or may not be the case, but unfortunately, the MC Jr does not have the time to improve the competitors' solutions for them, especially if he does not understand them. Besides, it would very much go against the spirit of the Wettbewerb.

David Widmann implements the winning move analysis plus a very eager bridge-building algorithm. Bridges are very useful, since they can be used to safely connect stones; however, David Widmann's strategy places them on the board somewhat indiscriminately, limiting their effectiveness (Schubert/Widmann). It would have been better to place a small number of bridges along the principal axis and then connect them. Regardless, David Widmann's strategy performs remarkably well.

Similarly, Maximilian Endraß also implements a winning move analysis (function names canIwin and dontLetHimWin) plus something with bridges. It appears that he always first tries to set up a bridge and then close it, but like David Widmann, he does not orient his bridges along the principal axis, so he extends his paths somewhat aimlessly, but with good results.

Julian Spahl employs a strangely asymmetric strategy: as the Blue player, he simply takes the first valid move. As the Red player, he seems to just generally go down and right. Surprisingly, this very simplistic strategy works extremely well, winning against almost all of the other strategies almost every time as the Red player (Spahl/Madge Pimentel), but losing almost all the time as the Blue player. (Madge Pimentel/Spahl)

Chaoran Chen's solution is a bit of a mystery to the MC Jr. There seems to be some sort of bounded exploration of the game tree in there, but the MC Jr is not quite sure how it works exactly.

The MC Jr's solution employs the winning move analysis plus Dijkstra's algorithm to compute a shortest path from one side of the board to the other and places a stone on a random field of some shortest path. This works reasonably well, but could work a lot better if it tried to employ some more defensive tactics and/or bridge-building as well, like some of the other strategies did.

Lukas Tenbrink implemented a complex bridge-building algorithm that, similarly to the previous bridge-building algorithms, tries to set up many bridges along its principal axis and then close them. It appears to also ‘defend’ its bridges by closing them immediately when the opponent endangers them by putting a stone next to a bridge. (Tenbrink/MC Jr) The great weakness of this strategy is that it attempts to construct paths that would need to run through an already existing enemy path. (Tenbrink/Breitner)

Christoph Rudolf's strategy is very similar to Lukas Tenbrink's, but also implements the winning move analysis. Its greatest weakness seems to be that one can easily cause it to divert its bridge-building efforts from the principal axis, effectively ‘running its path into the wall’. (Rudolf/Madge Pimentel)

Joachim Breitner pursues the following interesting idea and seems to do well with it:

    --   * Consider all shortest paths of opponent, measured in number of pieces
    --     that he has to place to connect his two sides.
    --   * Pick the field where _most_ of these paths pass through.
    --   * If there are multiple, pick one where most of our own shortest paths
    --     pass through
    --   * If there are still multiple choices, pick one closest to the center.
    --     (this is mostly relevant when starting the game)

Fabio Madge Pimentel also seems to use something with graph algorithms, shortest paths, and analysing the best paths for both players to follow and where they collide, but his code is too complex and too sparsely commented for the MC Jr to understand in more depth. Another curiosity is that, despite the Red player having an advantage in Hex, when Mr Madge Pimentel plays against Mr Breitner, each of them wins against the other as the Blue player. (Madge Pimentel/Breitner, Breitner/Madge Pimentel)

The top 2 solutions by Fabio Madge Pimentel and Joachim Breitner win against most other strategies almost all of the time, but intriguingly, as the Blue player, they lose against the 7th-ranked Julian Spahl every single time. (Spahl/Madge Pimentel, Spahl/Breitner)

To summarise:

Die Ergebnisse der zehnten und elften Woche

Great. The MC Sr. is on permanent leave from TUM to pursue other endeavours. Additionally, the MC Jr. is on temporary leave to attend some C++ conference or something, speaking about some storms or whatever. With no one left to write the Wettbewerbsblog, some other guy (hereafter referred to as MC) had to bite the bullet, which made him somewhat grumpy. Happy new year, slackers!

Top 30 der Woche
  Platz  Wettbewerber(in)            Punkte
  1.     Felix Thimm                 60
  2.     Christoph Griesbeck         58
  3.     David Widmann               56
  4.     Michael Kubitza             54
  5.     Alexander Küchler           52
  6.     Fabio Madge Pimentel        50
  7.     Chaoran Chen                48
         Sebastian Josef Neubauer    48
         Lukas Tenbrink              48
         Jonas Heintzenberg          48
  11.    Martin Wauligmann           40
  12.    Simon Rehwald               38
  13.    Joshua Rombauer             36
  14.    Aser Abdelrahman            34
  15.    Alexander Book              32
  16.    Christoph Rudolf            30
  17.    Lucas Maximilian Stoffl     28
  18.    Adrian Philipp              26
  19.    Dario Beckert               24

The ranking criterion was primarily efficiency, and secondarily beauty of the code. Of course, only correct solutions have been included in the ranking. To increase confidence in the correctness, the MC not only reran the official QuickCheck tests, but also generated some larger example formulas and compared the results of the contestants with the enterprise-grade SAT solver sat4j. (The MC also tried running other programs submitted to the MaxSAT Competition, but they failed to compile, failed to run or generally failed to exist, which made him grumpy.)

In total, 24 programs have been submitted to the Wettbewerb. Two failed the official tests and three more failed the MC's extended tests. Sadly, Herr Lohner's submission failed the official tests for some corner cases (empty clauses); otherwise, it would've been very competitive. It uses linear programming once again and beats the heck out of all other submissions on almost all example formulas. (For some formulas, it's even faster than the enterprise-grade SAT solver.) The MC therefore awards Herrn Lohner a lobende Anerkennung.

After eliminating the five incorrect submissions, we're left with 19 (hopefully) correct submissions. Let's get ranking.

Benchmark 1: pairs of variables

The MC used the following Haskell code to generate formulas of the shape {{x, y}, {-x, y}, {x, -y}, {-x, -y}}.

    bench1 k = concatMap mkGroup [1..k]
      where mkGroup i = group (2*i) (2*i+1)
            group i j = [[-i,j],[i,j],[-j,-i],[-j,i]]

The parameter k denotes the number of pairs of variables to generate. The result is always a formula with 2 · k variables and 4 · k clauses. Every assignment satisfies exactly 3 · k clauses, since within each group of four clauses, any assignment of the two variables satisfies exactly three. This formula is nice because one of the optimization techniques that could be applied here is the separation of independent subformulas.
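To illustrate the idea (this is the editor's sketch, not code from any submission): the bench1 formula falls apart into k groups that share no variables, and such groups can be found with a simple fixpoint computation and then solved independently, with the per-group optima simply summed up.

```haskell
import Data.List (nub, partition)

type Clause = [Int]

-- Editor's sketch (hypothetical helper, not from any submission):
-- split a formula into groups of clauses that share no variables.
-- Each group can then be solved independently, and the satisfied-clause
-- counts of the groups simply add up.
components :: [Clause] -> [[Clause]]
components []     = []
components (c:cs) = grow [c] cs
  where
    grow comp rest =
      case partition (sharesVar comp) rest of
        ([], _)       -> comp : components rest
        (new, others) -> grow (comp ++ new) others
    sharesVar comp c' = any (`elem` concatMap clauseVars comp) (clauseVars c')
    clauseVars = nub . map abs
```

For the bench1 formula with k = 2, this yields two groups of four clauses each, one over the variable pair (2, 3) and one over (4, 5).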

All but three contestants finished in under 0.5s for k = 8. The MC decided to eliminate those three at this point before continuing, securing them ranks 17 to 19. Funnily enough, one submission features the following extremely optimistic, but also misguided claim:

    -- Not bad...looks like a working solution...not sure if better than brute force, though...looks better in any case
    -- It should be O(|vars|log(|vars|)) (due to sorting), if I see it correctly, while brute force would be O(2^|vars|)(?)

The answer is: No. The MC would like to point out that MaxSAT is an NP-hard problem, making it likely exponential in the general case. Two years ago, the Meta-Master Tobias N. offered a Doktortitel for a polynomial solution to the SAT problem, but to no avail. Looks like he can't give one out this year either, you lazy bums.

The MC had to kill one program after it consumed 10 GB of memory, which made him grumpy. Looking at the source, the author apparently tried to be too clever. Look, the MC – of all people – knows that some of the functions in the standard library are inefficient (he's looking at you, nub), but if you don't even test your reimplementations against large inputs where they're leaking space like they're Edward effin Snowden, that'll only make the MC grumpy. You do not want the MC to be grumpy because of you.

Christoph Rudolf really excelled in this benchmark, even for large values of k. He also beat the enterprise-grade SAT solver by some orders of magnitude. Unfortunately, not in any other benchmark.

Benchmark 2: all but one satisfied

Next, the MC wrote this piece of code to generate formulas with an optimal solution satisfying all but one clauses:

    bench2 k = map mkClause [1..k] ++ [finalClause]
      where mkClause i = i : [-i+1 .. -1]
            finalClause = map negate [1..k]

For example, with k = 3, the resulting formula is {{1},{2,-1},{3,-2,-1},{-1,-2,-3}}. These have also been used two years ago in the SAT competition.

To the MC's surprise, all 16 remaining contestants performed quite well (< 0.5s) in this benchmark for k = 15. He decided to crank up the difficulty to 11, or rather, to 20. In that case, execution times ranged from 0.40s to 27.52s.

Benchmark 3: forged from research

Did the MC already mention the MaxSAT competition? Well, they have a website, and they offer their test data for download. The MC picked the smallest (around 60 variables) example from the crafted category (meaning: formulas designed in a way to exhibit worst-case runtime behaviour). The enterprise-grade SAT solver solved it instantaneously, whereas no contestant could solve it in less than 30s. The MC became grumpy and declared this benchmark a failure.

Benchmark 4: forged from randomness

Lastly, the MC let his computer come up with some formulas. They're not as pretty as artisanal, hand-crafted ones, but they'll do. Of course, all submissions were run on the exact same formulas.

All random formulas consist of n variables and k clauses, with each clause composed of two to seven literals. For the first two formulas with n = 10, all submissions were very fast (< 1s). The MC decided to make it more difficult and produced two more: (n, k) = (15, 200) and (n, k) = (17, 300). The enterprise-grade SAT solver took around 9s for the last one.
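The MC did not publish his generator, but it might have looked roughly like the following sketch (all names hypothetical), built on System.Random:

```haskell
import Control.Monad (replicateM)
import System.Random (randomRIO)

type Clause = [Int]

-- Hypothetical sketch of a random formula generator: k clauses over the
-- variables 1..n, each clause containing between two and seven literals
-- with random polarity. (The MC's actual generator was not published.)
randomFormula :: Int -> Int -> IO [Clause]
randomFormula n k = replicateM k randomClause
  where
    randomClause = do
      len <- randomRIO (2, 7)
      replicateM len randomLiteral
    randomLiteral = do
      v    <- randomRIO (1, n)
      sign <- randomRIO (0, 1 :: Int)
      return (if sign == 0 then v else -v)
```

For instance, `randomFormula 15 200` produces a formula in the shape of the third benchmark instance.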

Benchmarking results

Here are the raw results from the above benchmarks. Every benchmark has two columns: the absolute running time of each submission in seconds and its relative running time compared to the longest-running submission.

                          Bench 1       Bench 2       Bench 4a      Bench 4b
Michael Kubitza           6,90   50%    4,47   16%    1,07   44%    6,64   42%
Martin Wauligmann        11,02   80%    6,72   24%    1,49   61%    9,78   62%
Alexander Küchler         7,56   55%    4,47   16%    1,13   47%    7,25   46%
Fabio Madge Pimentel      8,36   61%    0,40    1%    0,87   36%   10,78   68%
Alexander Book           12,87   94%   10,83   39%    1,88   77%   12,19   77%
Christoph Rudolf          0,01    0%   27,52  100%    2,43  100%   15,80  100%
Sebastian Josef Neubauer  7,84   57%    5,18   19%    1,26   52%    7,87   50%
Felix Thimm               1,27    9%    0,18    1%    0,10    4%    0,57    4%
Chaoran Chen              7,99   58%    5,02   18%    1,23   51%    7,89   50%
Lukas Tenbrink            8,35   61%    5,54   20%    1,24   51%    8,08   51%
Joshua Rombauer          11,76   86%    7,32   27%    1,67   69%   10,89   69%
David Widmann             6,56   48%    4,15   15%    0,97   40%    6,41   41%
Christoph Griesbeck       2,84   21%    0,68    2%    0,37   15%    2,43   15%
Aser Abdelrahman         13,69  100%    7,96   29%    1,83   75%   11,95   76%
Jonas Heintzenberg        8,27   60%    5,29   19%    1,27   52%    8,33   53%
Simon Rehwald            11,77   86%    6,98   25%    1,52   63%   10,13   64%
sat4j                        –             –          0,53         10,72
Denis Lohner              0,01          0,01          0,16          1,21

The MC assigned the ranks 1-16 by comparing the averages of the relative running times. Some of the submissions are really close in that respect, so the secondary criterion kicks in. The four submissions sharing rank 7 are also equal in that respect: concise and readable, but no comments. Don't do that.

Finally, let's have a look at the winning entry by Felix Thimm. It scored best in all benchmarks but the first one (where it scored second-best):

    vars :: [Clause] -> [Var]
    vars clauses = nub [abs l | l <- concat clauses]

    isSatisfied :: Clause -> Assignment -> Bool
    isSatisfied (l:ls) ass | l < 0     = notElem (-l) ass || isSatisfied ls ass
                           | otherwise = elem l ass || isSatisfied ls ass
    isSatisfied _ _ = False

    countSatisfied :: [Clause] -> Assignment -> Int
    countSatisfied (c:cs) ass =
      if isSatisfied c ass then 1 + countSatisfied cs ass else countSatisfied cs ass
    countSatisfied _ _ = 0

    maxSat :: [Clause] -> (Assignment, Int)
    maxSat [] = ([], 0)
    maxSat clauses = (litsToAss a, b)
      where (a, b) = msh clauses (vars clauses)

    litsToAss :: [Literal] -> Assignment
    litsToAss (l:ls) = if l < 0 then litsToAss ls else l : litsToAss ls
    litsToAss _ = []

    msh :: [Clause] -> [Var] -> ([Literal], Int)
    msh _ [] = ([], 0)
    msh cs [v] =
      if snd (removeSatisfied cs v) > snd (removeSatisfied cs (-v))
      then ([v], snd $ removeSatisfied cs v)
      else ([-v], snd $ removeSatisfied cs (-v))
    msh cs (v:vs) =
      if (snd $ removeSatisfied cs v) + y2 > (snd $ removeSatisfied cs (-v)) + n2
      then (v : y1, (snd $ removeSatisfied cs v) + y2)
      else ((-v) : n1, (snd $ removeSatisfied cs (-v)) + n2)
      where (y1, y2) = msh (fst $ removeSatisfied cs v) vs
            (n1, n2) = msh (fst $ removeSatisfied cs (-v)) vs

    removeSatisfied :: [Clause] -> Literal -> ([Clause], Int)
    removeSatisfied [] _ = ([], 0)
    removeSatisfied (c:cs) l = if elem l c then (a, b + 1) else (c : a, b)
      where (a, b) = removeSatisfied cs l

No comments. I repeat: no comments. This makes the MC grumpy. What is even going on there? What does msh even mean? The MC literally can't even

So what about the second-best entry? Please let it have comments. Please. The MC is pleased to see that Christoph Griesbeck put some comments in there. Finally! The author made an interesting optimization for the assignment check, and gave it a funny name:

    --faster than isSatisfied but expects the clause and the assignment sorted by abs
    isSanta :: Clause -> Assignment -> Bool
    isSanta [] _ = False
    isSanta (x:xs) [] = (x<0) || isSanta xs []
    isSanta (x:xs) (y:ys)   | abs x == abs y = h x y || isSanta xs ys
                  | abs x > abs y = isSanta (x:xs) ys
                  | otherwise = (x<0) || isSanta xs (y:ys)
                  where h l v = l + v /= 0

(the formatting is messed up because – get this – the author used tabs in his sources)

Alright, this concludes this week's blog. The most important lesson this week is: For the love of your favoured deity, put some comments in your code! Oh, and don't make false promises in them :-) The bonus lesson is that even with little effort, you can get good results and even beat enterprise-grade SAT solvers (if only in certain cases).

Die Ergebnisse der zwölften Woche

This week, the Wettbewerbsauswertung was in the hands of one of the MC Jr.'s seniors (not to be confused with the MC Sr. or with Junior Senior).

Update: Due to a regrettable mistake, the wrong student got assigned rank 8. (They didn't even participate in the competition.) Apparently, the MC Jr.'s senior made a mistake while juggling submission IDs. Sorry! The rankings have now been updated accordingly.

Top 30 der Woche
  Platz Wettbewerber(in)   Punkte
1. Martin Wauligmann 30
2. David Widmann 29
3. Joshua Rombauer 28
4. Maximilian Endraß 27
5. Lukas Prantl 26
6. Thomas Günzel 25
7. Fabio Madge Pimentel 24
8. Chaoran Chen 23
9. Yinshui Chang 22
Jonas Heintzenberg 22
11. Julian Spahl 20

At first glance, the number of submissions this week was on par with the week before. Twenty students, poised for the trial, all disquietude silenced by the reassuring green of the testing system. The MC Jr.'s Sr., however, was nothing short of flabbergasted when a bit of scrutiny revealed a substantial omission: all of these twenty submissions produced legitimate anagrams, but only half of them were composed carefully enough to always find an anagram if such a thing existed.

An eleventh submission failed this trial, maybe because its author was overly meticulous: Julian Spahl deemed it unbefitting for an anagram to contain any of the fragments making up the original phrase. And he was right about that! But, to adapt the sayings of a most acclaimed fictional detective: „How often have I said to you that when you have eliminated the impossible, whatever remains, however dull, must be the result?“ Assuming that this choice was made deliberately, out of an overwhelming need for style, the MC Jr.'s Sr. mercifully awarded him the eleventh place.

As promised, scalability is of utmost importance to the MC Jr.'s Sr. And while every student ought to expect the unexpected, the MC Jr.'s Sr. felt the surviving submissions deserved a bit of rest after the startling loss of half their competitors. Therefore (and definitely not out of laziness) the MC Jr.'s Sr. decided to start with the examples from the Aufgabenblatt, giving an ample time limit of 120s and, with 8 gibibytes, a considerable amount of memory, to generate English anagrams for „Jasmin Blanchette“:

    "e e n n a a t t i l s m h j b c"
    "a a b c e e h i j l m n n s t t"
    "anaesthetic lbj mn"
    "anaesthetic lbj mn"
    .KILLED.
    "anaesthetic lbj mn"
    "bc haj sentimental"
    "a a b c e e h i j l m n n s t t"
    "a a b c e e h i j l m n n s t t"
    .KILLED.
    

While Julian Spahl will be appalled by the lack of style of the one-letter sequence, and others may wonder why an English word list contains words like „lbj“ and „mn“, a third group will imagine the picture of a sad kitten, missing its cheezburger, bc haj sentimental. Despite this amusing imagery, the MC Jr.'s Sr. knows that this is not the time to wonder (about style), nor to listen to Fury in the Slaughterhouse, and swiftly demotes both candidates failing to deliver a result (Jonas Heintzenberg and Yinshui Chang) to the ninth place in the weekly rating.

Considering the second example, i.e., English anagrams of „Master of Competition“, does not help the MC Jr.'s Sr. to kick out further candidates.

    "e e n a t t t i i o o o r s m m p c f"
    "a c e e f i i m m n o o o p r s t t t"
    "commiseration poet ft"
    "commiseration poet ft"
    "commiseration poet ft"
    "af memo protectionist"
    "a c e e f i i m m n o o o p r s t t t"
    "a c e e f i i m m n o o o p r s t t t"
    

Looking at the result, it becomes clear that two sets of candidates share the same idea of what constitutes a good anagram, although the MC Jr.'s Sr. is quite unsure whether there exists anybody else who considers „a c e e f i i m m n o o o p r s t t t“ to be an exemplary member of the anagram society. This abundant use of single letters is a despicable way to generate anagrams. Therefore, the MC Jr.'s Sr. decided to force the participants to make an honest attempt. If he required anagrams for „Haskell ist eine wunderschoene funktionale Programmiersprace“, consisting only of German words of length 5 or more, how would the contenders hold up?

    "tenne tenne daene seele arosa kairo irrig fromm klipp schuh schwul"
    .KILLED.
    "persoenlichkeitsentwicklung parfuemhaendler rosenheim arosa"
    "persoenlichkeitsentwicklung parfuemhaendler rosenheim arosa"
    "persoenlichkeitsentwicklung parfuemhaendler rosenheim arosa"
    "areal hafer sauer endomorphismen persoenlichkeitsentwicklung"
    "aechten aechten aegide aehre eprom fluor knirps linus worms links"
    "druckfaehig kleptomanisch wahltermin liane epson neuer roesser"
    

Who would have thought? With persuasion and just a little pressure, the results for most of the candidates improved considerably. For some reason, this brings the yearly discussion of whether or not to offer a Hausaufgabenbonus back to the MC Jr.'s Sr.'s mind. Chaoran Chen's submission drops out in this round, earning him the eighth place.

As a last round, the MC Jr.'s Sr. decided on something a tad self-referential: finding an anagram consisting of three words for the word „Dreiwortanagrammsuche“, as fast as possible. This word consists of 21 letters, so the three-word requirement can be enforced by having no words of length five or less in the word list.
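The pruning step is a one-liner, and checking a candidate result is just as simple; the two helpers below are the editor's sketch (not the actual evaluation tooling, and both names are made up):

```haskell
import Data.Char (isAlpha, toLower)
import Data.List (sort)

-- Editor's sketch: drop every word of length five or less from the
-- word list, so that anagrams of a 21-letter phrase need three words.
longWordsOnly :: [String] -> [String]
longWordsOnly = filter ((>= 6) . length)

-- ... and verify that a candidate word combination really is an anagram
-- of the phrase, i.e. uses exactly the same letters with the same
-- multiplicities.
isAnagramOf :: [String] -> String -> Bool
isAnagramOf ws phrase =
  sort (map toLower (concat ws)) == sort (map toLower (filter isAlpha phrase))
```

The winning result below passes the check: `isAnagramOf ["aemter", "waschraum", "dornig"] "Dreiwortanagrammsuche"` evaluates to True.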

    0m00.390s: "aemter waschraum dornig"
    0m00.462s: "armada erogen wurmstich"
    0m01.156s: "daenisch geraum wortarm"
    0m04.151s: "anrede wortarm schaumig"
    0m13.439s: "amsterdam gaucho wirren"
    0m48.074s: "amsterdam gaucho wirren"
    1m33.111s: "amsterdam gaucho wirren"
    

These results (now ordered by time instead of by submission id) directly translate into the Top Seven of this week. It is refreshing to see that the winning submission is both short and well-structured:

    anagram :: [String] -> String -> Maybe String
    anagram ws p | null solutions = Nothing
                 | otherwise      = Just $ space $ head solutions

                 where solutions = anag words phrase
                       words  = map lower ws
                       phrase = lower $ filter isAlpha p
                       space  = intercalate " "
                       lower  = map toLower


    anag _ []         = [[]]
    anag words phrase = [w:a | w <- pWords, a <- anag pWords $ phrase \\ w]

                        where pWords  = filter isSub words
                              isSub w = Map.isSubmapOfBy (<=) (toBag w) pBag
                              toBag w = Map.fromListWith (+) $ zip w $ repeat 1
                              pBag    = toBag phrase
    

The lesson of this week is one which probably does not need to be repeated for the ten hopeful contenders who prematurely dropped out of this week's evaluation: just because the tests succeed, there is no guarantee that your solution is correct. Write your own tests for your code. And if that is not enough for you after this humiliation, prove your code correct! Ask us about Loom Isabelle.
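Such a test does not need to be clever; an obviously-correct brute-force oracle suffices for small inputs. The following is the editor's sketch (hypothetical, not from any submission) of an oracle deciding whether any anagram exists at all, which is precisely the property the eliminated submissions got wrong:

```haskell
import Data.Char (isAlpha, toLower)
import Data.List ((\\), sort)

-- Editor's sketch of a brute-force oracle: does ANY anagram of the
-- phrase exist over the given word list (words may be reused)?  Far too
-- slow for real inputs, but obviously correct, so a clever solution can
-- be cross-checked against it on small cases.
existsAnagram :: [String] -> String -> Bool
existsAnagram ws p = go (canon p)
  where
    canon = sort . map toLower . filter isAlpha
    go [] = True
    go ls = or [ go (ls \\ w') | w <- ws, not (null w)
               , let w' = sort (map toLower w)
               , null (w' \\ ls) ]              -- w fits into the leftover letters
```

A clever solution must return Just an anagram exactly when the oracle says True.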

Die Ergebnisse der dreizehnten Woche

Finally, the MC Jr has returned from his visit to India, and he brought with him not only a nasty cold, but also a renewed appreciation for the Straßenverkehrsordnung, the thousands of behördliche Lebensmittelkontrolleure, and similarly German things.

Anyway, onwards to this week's Wettbewerbsaufgabe, which Lars Hupel stole from the 23rd BWINF (Round 1, Exercise 5). The MC Jr was mildly shocked to find that only 6 (!) submissions passed the automatic tests of the Abgabesystem this week. While somewhat disappointing, this did make the evaluation a lot easier.

The MC Jr decided to use the multi-round elimination approach pioneered by the MC Sr in his evaluation of the SAT exercise two years ago: all the submissions are compiled with -O and thrown at some reasonably interesting problem instance with the hope that their running times differ enough to eliminate some of them, until some ranking is obtained.

Round 1
For the first round, we use a pathological example in which we add n rectangles in such a way that each of them intersects all previous ones, leading to a total of O(n^4) rectangles. (To be precise: OEIS A213840, but do not ask the MC Jr why.)

    test :: Integer -> [Rectangle]
    test n = foo n 1
      where foo 0 _ = []
            foo n h = ((-n, -h), (n, h)) : foo (n - 1) (h + 1)

We start off with n = 20, leading to the following results:

      0.03  Adrian Philipp
      0.05  Lars Hupel
      0.09  Martin Wauligmann
      0.09  Christoph Rudolf
      0.15  Sebastian Neubauer
      1.2   Chaoran Chen
    -----------------------------------------
      6.2   Fabio Madge Pimentel

The MC Jr completely arbitrarily placed the cutoff at 5 seconds, kicking Mr Pimentel out of the race. (But, mind you, even this week's last place was worth a remarkable 25 points.)

Round 2
Seeing as the Info 2 team is often criticised for not being ‘real-world’ enough, the MC Jr decided to next test a property that is very important in real-life programs: how they deal with a large amount of rubbish (i.e. irrelevant data). The following produces 2n rectangles in such a way that every rectangle overlaps with precisely one other rectangle, leading to a total of 6n rectangles in the result.

    foo :: Integer -> [Rectangle]
    foo n = [((2*i,0), (2*i+1, 2)) | i <- [0..n-1]] ++ [((2*i,1), (2*i+1, 3)) | i <- [0..n-1]]

The MC chose n = 1000. Now the results look quite a bit different; two solutions that performed quite well on the pathological example now perform abysmally, namely Sebastian Neubauer's and Lars Hupel's:

       0.5  Martin Wauligmann
       1.2  Adrian Philipp
       2.2  Christoph Rudolf
    -----------------------------------------
      14    Chaoran Chen
      69    Lars Hupel
     193    Sebastian Neubauer
     

Round 3
After this foray into the so-called ‘real world’, let us return to the realm of theory and contrived examples, where the MC Jr feels more at home. The next example is courtesy of Lars Hupel:

    checkerboard :: Integer -> [Rectangle]
    checkerboard n = concatMap item [0..n-1]
      where item k = [((0, k), (n, k + 1)), ((k, 0), (k + 1, n))]

This creates n long horizontal and n long vertical stripes, leading to a checkerboard pattern with a total of (n · (n + 1) / 2)^2 rectangles in the result. (OEIS A000537, this one is actually pretty easy.)
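The count is just "pick two of the n + 1 vertical grid lines and two of the n + 1 horizontal ones"; the little cross-check below (editor's sketch) confirms the closed form.

```haskell
-- Editor's sketch: in the n-by-n checkerboard, every rectangle in the
-- result is determined by two of the n+1 vertical and two of the n+1
-- horizontal grid lines, giving C(n+1, 2)^2 = (n * (n+1) / 2)^2.
countByLines :: Integer -> Integer
countByLines n = toInteger $
  length [ () | x1 <- [0 .. n], x2 <- [x1 + 1 .. n]
              , y1 <- [0 .. n], y2 <- [y1 + 1 .. n] ]

closedForm :: Integer -> Integer
closedForm n = (n * (n + 1) `div` 2) ^ 2
```

For n = 80 the closed form yields 10497600 rectangles.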
We choose n = 80 and get:

      1  Adrian Philipp
      8  Martin Wauligmann
     21  Christoph Rudolf

Seeing as there is a distinct gap between each of the three remaining contestants, the MC Jr concluded that he had reached the end of his evaluation and corroborated this by running the submissions on the pathological example from round 1, but with 80 rectangles:

      5  Adrian Philipp
     21  Martin Wauligmann
     68  Christoph Rudolf

Fair enough.

Top 30 der Woche
  Platz Wettbewerber(in)   Punkte
1. Adrian Philipp 30
2. Martin Wauligmann 29
3. Christoph Rudolf 28
4. Chaoran Chen 27
5. Sebastian Neubauer 26
6. Fabio Madge Pimentel 25

So how do the solutions work? While most of them are almost completely incomprehensible to the MC Jr due to the lack of (helpful) comments, most of them seem to build a list of all the horizontal and vertical lines and then compute all intersections of these lines. Some solutions (the ones that got kicked out in the first two rounds), however, take a more brute-force approach and enumerate all points within the bounding rectangle of the input, resulting in very poor performance for sparse inputs such as the one in the second round.

Interestingly, it is again the case that the most complicated solutions were also the slowest ones (and that is not counting all the complicated ones that did not pass the automated tests due to subtle bugs). In particular, Fabio Madge Pimentel's solution performs some sort of elaborate clustering of rectangles to speed up computations on sparse inputs, but unfortunately, it appears that the remainder of his code has a huge performance bottleneck somewhere – the MC Jr was unable to find out where exactly.

Lessons for this week:

Die Ergebnisse der vierzehnten Woche

For those who missed the glorious Preisverleihung ceremony (for which there really is no excuse), the MC Jr shall present the 11 submissions, nay, works of art he received this week on this blog as well. And, truly, there are some very æsthetically pleasing, but also some very abstract and provocative pieces among them. The MC Jr shall employ his entire knowledge of the arts (i.e. next to none) to break down the essence of the more cryptic works of art for the laypeople.

It is always difficult to put a numeric value on art, but we did it nevertheless. The submissions were judged by the following highly-renowned panel of art critics:

Each critic assigns between 0 and 10 points to each submission for its æsthetic and artistic qualities. Additionally, the MC Jr also assigned between 0 and 10 points for the technique used. (complexity and quality of the code, etc.) The sum of the artistic scores was multiplied with ⅔ and added to the technical score, yielding a total maximum of 30 points, as usual.
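Written out as code (editor's sketch, using exact Rationals to avoid rounding the thirds away), the scoring rule is:

```haskell
import Data.Ratio ((%))

-- Editor's sketch of the scoring rule: the three artistic scores (0-10
-- each) are summed and weighted with 2/3, then the technical score
-- (0-10) is added, for a maximum of 3 * 10 * (2/3) + 10 = 30 points.
score :: [Integer] -> Integer -> Rational
score artistic tech = (2 % 3) * fromInteger (sum artistic) + fromInteger tech
```

This reproduces the totals in the table at the end of this section; e.g. the winner's 10/10/10 artistic and 8 technical points give exactly 28.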

The MC Jr was not entirely satisfied with the names of the contestants; he found that they simply did not sound artsy enough to seriously follow in the footsteps of the great van Gogh, so he took the liberty to fashion them into suitable noms de plume.

So, let us have a look at the submissions themselves.

11th place
The first work of art is a simple one by the artist Michiel van der Eeder

This work clearly shows the author's fixation on the bat motif. For the anagram problem, van der Eeder submitted a function that always returns ‘Nananananana BATMAN’, and this time, too, his code is full of various Batman references.

The meaning of this work is subtle. In European culture, bats have a very negative connotation; they are considered impure animals in the Bible and are associated with demons and hell. In stark contrast to this, the bat in this picture is drawn with very clear and precise lines, perfectly symmetrical, and white, not black – almost angelic in appearance. Considering that other cultures have far more positive associations with bats (e.g. a symbol of luck and success in China), it is clear that the artist's intention was to break with the symbolism of European culture, to do away with Eurocentrism and open the viewer's mind to other cultures' ideas.

The technique used here is a clever (and lazy) one, namely the old gnuplot technique: gnuplot is invoked with the Batman curve as its input and generates the output picture. However, since the Batman curve was not invented by the artist and the implementation is quite simple, the critics' panel could not bring themselves to give out many points.

10th place
The next piece is a very simplistic one by Christoffel Grijsbeck.

This picture obviously draws from the post-war literary movement known as Trümmerliteratur. Just as the Trümmerliteratur authors rejected all but the most basic elements of language, Grijsbeck uses only the simplest of geometric objects: the straight line. He applies this approach even on the technical level, implementing his own rastering functions for lines instead of relying on fancy libraries to get the job done. While the critics found this purism quite admirable, they found this piece too inaccessible, too radical, too provocative and awarded it no points for artistic value. The MC Jr did, however, give it a respectable 4 technical points to reward the artist's effort.

9th place
Felix van Timmerman submitted the following more complex work:

This work is obviously based on the work of Bézier and de Casteljau. The first thing one notices in the picture is the duality of white and black, of inside and outside, of good and bad. This is the key to understanding this work: it is a very subtle criticism of the ‘black and white’ portrayal of issues in the media. The unusual choice of guiding points for the Bézier curves makes the curve bulge awkwardly in some places, emphasising the arbitrary nature of the definition of the borders between good and evil, as do the ‘twists’, where white becomes infinitely thin.

From a technical point of view, this work is, alas, not very interesting. The Bézier curves are plotted with Rasterific and the guiding points are chosen in a very simple (and strange) way. Due to the artistic simplicity of the picture, the critics decided to award 2 points in each category.

8th place
Of course, it was only a matter of time before someone did something with fractals. Sebastiaan van Wijs submitted the following piece based on the work by Wacław Sierpiński.

  

Note that the triangle is brightly-coloured and, in the second picture, highly deformed. The artist takes the ‘clean’ and mathematical original work and does with it whatever he pleases. This playfulness is somewhat reminiscent of the 1950s Pop Art movement.

Van Wijs also submitted another fractal-based picture that is more basic in its appearance.

The interpretation of this picture is a bit more difficult. The author calls it simply ‘Farn’, and it does resemble a plant. The key to understanding the message of this picture is that this fern, this plant, seems to grow infinitely, which is, of course, impossible in the natural world. On the other hand, it resembles a road (the Autobahn?) that extends into infinity. We have therefore concluded that the work is a criticism of wastefulness and consumerism (represented by the long road) and of the infinite growth (as opposed to sustainability) it requires.

On a technical level, van Wijs employs an immense amount of linear algebra to render his points onto a mutable image provided by JuicyPixels. He claims that his transformations must always be affine, but unless the MC Jr misunderstood some fundamental part of the code, he suggests that the artist look up ‘affine’ again. Affine transformations always map lines to lines, and parallels to parallels, neither of which is the case in the second example. Anyway, the critics gave 3 points each for artistic value and the MC Jr rewarded the linear algebra and fractal rendering with 6 points.

7th and 6th place
Next, we have David van Wijdman and Martijn Wouligman with the obligatory Mandelbrot fractal.

     

With their intricate structure and colourfulness, Mandelbrot sets are always nice to look at, but they are a bit clichéd. Everybody has written a Mandelbrot rendering program at some point in their life – case in point: the third picture above is actually generated by a program the MC Jr wrote a year ago or so and which can also be found on his Github page. Still, they are nice to look at and not completely uninteresting on a technical level, so the artists received some 4s and 5s.

5th place
Next in the ranking is Fabio Madge van der Pijmentel:

The approach in this picture is similar to that of Grijsbeck (the one with the lines), but in a way even more radical: it rejects even lines and uses only small dots, in reminiscence of the Neo-Impressionist movement known as Pointillism. The picture is, again, generated by pure mathematics: it arranges the natural numbers in a spiral and places a dot at every prime number, generating an infinite, impregnable desert of points. However, even in this chaos, structure emerges: there are diagonal lines of points visible everywhere in the picture, each of which corresponds to a prime-rich quadratic polynomial sequence.

The Ulam spiral may not be very æsthetically pleasing, but the critics have not previously seen a submission quite like it, so they honoured it with a 6/6/7 rating. Since it is, however, not very interesting from a technical standpoint, it only receives 3 points here.

4th place
Chaoraan van Chen created the following very dynamic and Wettbewerbs-centric piece:

The background shows a number of strings of Haskell function names, draped around the circle in arcs, creating the impression of a globe. Written across this ‘globe’ is, in large letters, a {-WETT-} tag. The obvious message is that the entire world revolves around the Wettbewerb – at least for this artist. The artist also apparently associates the Wettbewerb with an overwhelming flood of Haskell library functions, perhaps criticising the decision of allowing arbitrary Hackage libraries in the submissions, which some competitors found somewhat unwise. The font choice (Comic Sans), on the other hand, implies that this work of art is not to be taken entirely seriously and is somewhat tongue-in-cheek.

From a technical point of view, it is worth noting that the picture is rather dynamic: the placement of the background text is random and changes with every programme invocation, as does its content. The function names are taken from Hackage documentation; the program downloads a number of Haskell documentation pages at runtime and parses the function names from the HTML code. The MC Jr rewarded this amusing idea with 6 technical points, and the critics gave out high marks for æsthetic value as well.

3rd place
We have reached the Top 3 with Daniël Schoebart, who submitted the following colourful picture:

The artist describes the picture as autostereoscopic, i.e. one should be able to see a three-dimensional picture in it without any additional tools (such as 3D glasses). He mentioned during the ceremony that one has to cross one's eyes, but the MC Jr has not been able to see anything in it, no matter how much he crossed his eyes.

Nevertheless, the picture is rather pretty, and the technique behind generating autostereoscopic pictures is also quite interesting. Apparently, the picture one is supposed to see is the Greek letter λ (or, as the Unicode standard amusingly calls it: GREEK SMALL LETTER LAMDA), encoded in the following code segment:

    depthMap = [
        [0, 0, 1, 0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0, 0, 0, 0],
        [0, 1, 0, 1, 0, 0, 0, 0],
        [0, 0, 0, 1, 1, 0, 0, 0],
        [0, 0, 0, 1, 1, 0, 0, 0],
        [0, 0, 1, 1, 1, 1, 0, 0],
        [0, 0, 1, 0, 0, 1, 0, 0],
        [0, 1, 1, 0, 0, 1, 1, 0],
        [0, 1, 1, 0, 0, 1, 1, 0]
      ]

The MC Jr still does not see it. However, the take-away message here is perhaps that functional programming is everywhere, even if you cannot always see it. (Possibly because, unlike some other languages, it does not lead to two hundred lines of stack traces on web sites all the time.)

2nd place
On second place, we have Christoffel van Rudolf with two pictures:

  

The first picture is called ‘Some Kind of Mars Surface with Drop Shadows’. The title suggests that the artist himself wasn't quite sure what this is, and the psychedelic appearance clearly suggests that it was created under the influence of certain chemical substances. The technique used here is a simple one (if the MC Jr understood the code correctly): for each pixel, the left and top neighbour are combined and altered with some random value, giving rise to random structures with ‘shadows’ in the bottom-right direction.

The second picture is much more interesting. The MC Jr thinks that it may very well be the most beautiful picture of this competition, but unfortunately, Mr Rudolf did not hand it in officially because he thought it would take too long to render. He claimed that on his machine, rendering took 16 minutes and rightly concluded that that was too long. The MC Jr ran it on his machine nevertheless and found that it took less than a minute. The MC Jr is unsure what kind of setup Mr Rudolf uses, but maybe he should check for burnt-out thermionic tubes.

The technique of this picture is highly interesting and reminded the MC Jr of Arnold Schönberg's twelve-tone composition technique. In this technique, all twelve notes of the chromatic scale must be used equally often, giving every note the same importance. Van Rudolf applied the same principle (albeit borrowed) to pictures: starting with an arbitrarily-coloured pixel at some initial position, all colours of the colour space are added in random order in such a way that the colour difference of every added pixel to its neighbours is minimal. This creates a vibrant explosion of colours around the starting point with an intriguingly hand-drawn look. The MC Jr considers printing this picture and putting it up in his office. Had Mr van Rudolf submitted this as his official picture, he would probably have gotten first place; this way, he must content himself with second place.

1st place
Sebastiaan Nieuwbouwer is this week's winner, but the critics' panel almost disqualified him entirely because they failed to understand his work. At first, it appeared to them to be a black rectangle (first picture below), but it then transpired that it was, in fact, an animated SVG.

[Pictures: the seemingly black rectangle and the animated SVG]

The animation is a reference to an obscure work by a little-known Russian artist, Aleksei Pajitnov. It takes the form of a game in which geometric shapes must be arranged in a certain way. The animation displays such a game in which a simple ‘AI’ chooses where the pieces appear. You can download the SVG by clicking on the second picture above, or an OGV video if you do not have a player that supports animated SVGs.
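
The piece-choosing ‘AI’ is not described further; purely for illustration, a greedy heuristic in that spirit might look like this (all names invented, one-cell pieces only, and the actual submission is surely more elaborate):

```haskell
import Data.List (minimumBy)
import Data.Ord (comparing)

-- Toy greedy heuristic: always drop the next piece onto the lowest
-- column, keeping the stack as flat as possible.
chooseColumn :: [Int] -> Int
chooseColumn heights = fst (minimumBy (comparing snd) (zip [0 ..] heights))

-- Dropping a one-cell piece raises the chosen column by one.
dropPiece :: [Int] -> [Int]
dropPiece hs = [ if i == c then h + 1 else h | (i, h) <- zip [0 ..] hs ]
  where c = chooseColumn hs
```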

The code comprises a total of 240 lines and therefore receives the Manuel Eberl Award for Going Hopelessly over the Top with an Assignment.

All right, we have our ranking for this week. The following table shows the æsthetics points given by the Meta-Master (MM), the MC Jr, and the Grumpy MC (GrMC), followed by the technical points, also given by the MC Jr, and the weighted sum of all four, which constitutes the final score.

Top 30 of the 3rd Annual Info 2 Vernissage
  Place Competitor Artwork MM MC Jr GrMC Tech   Points
1. Sebastian Neubauer Tetris 10 10 10 8 28
2. Christoph Rudolf Mars/Colour explosion 9 9 8 7 24 1/3
3. Daniel Schubert Coloured squares 7 8 9 6 22
4. Chaoran Chen Text on a globe 8 7 6 6 20
5. Fabio Madge Pimentel Ulam spiral 6 6 7 3 15 2/3
6. Martin Wauligmann Mandelbrot (colour) 4 4 5 5 13 2/3
7. David Widmann Mandelbrot (b/w) 5 5 4 4 13 1/3
8. Sebastian Weiß Sierpinski 3 3 3 6 12
9. Felix Thimm Bézier 2 2 2 2 6
10. Christoph Griesbeck Lines 0 0 0 4 4
11. Michael Eder Batman 1 1 1 1 3

What is a token, anyway?

The MC Sr. is a big fan of token counting as a rough measure of software quality. As the previous years' results have clearly shown, shorter answers tend to be more elegant and (usually) less likely to have subtle bugs.

The MC Sr. is making his token-counting program available to the rest of the world. The program builds on Niklas Broberg's Language.Haskell.Exts.Lexer module, which is part of the haskell-src-exts package and can be installed using Cabal:

        cabal update
        cabal install haskell-src-exts

There appears to be disagreement regarding the status of backquotes in different versions of Haskell (98 vs. 2010). The MC Sr. generously adopted the Haskell 98 definition, with `max` treated as one token rather than three.

Here's how to compile and run the program on Linux or Mac:

        ghc Tokenize.hs -o tokenize
        ./tokenize

At this point, type in some Haskell code, e.g.,

        sum_max_sq x y z = x ^ 2 + y ^ 2 + z ^ 2 - x `min` y `min` z ^ 2

Make sure to type in an empty line afterwards. Press Ctrl+D to terminate. The program then outputs

        24 sxyz=x^2+y^2+z^2-xmymz^2

On Windows, you can run the program with

        .\tokenize

or

        tokenize

and terminate it with Ctrl+C (Windows 8) or Ctrl+Z followed by Enter (Windows 7, and probably also earlier versions).

The first field, “24”, gives the number of tokens. The second field is a short version of the program, where each token is abbreviated as one letter. This can be useful for debugging. The MC Sr. aggregates this information in his top-secret Master View, which looks something like this:

        18 sxyz=mxy^2+mxymz^2 xxx@tum.de
        18 sxyz=mxy^2+xmymz^2 xxx@tum.de
        18 sxyz=mxy^2+zmmxy^2 xxx@mytum.de
        20 sxyz=(mxy)^2+(myz)^2 xxx@mytum.de
        20 sxyz=mxy^2+m(mxy)z^2 xxx@tum.de
        20 sxyz=mxy^2+m(mxy)z^2 xxx@tum.de
        ...
        52 sxyz|x=z&y=z=x*x+y*y|x=y&z=y=x*x+z*z|y=x&z=x=y*y+z*z xxx@mytum.de
        52 sxyz|x=z&y=z=x*x+y*y|x=y&z=y=x*x+z*z|y=x&z=x=y*y+z*z xxx@mytum.de
        52 sxyz|x=z&y=z=x*x+y*y|x=y&z=y=x*x+z*z|y=x&z=x=y*y+z*z xxx@mytum.de
        52 sxyz|x=z&y=z=x^2+y^2|x=y&z=y=x^2+z^2|y=x&z=x=y^2+z^2 xxx@mytum.de
        ...
        252 sxyz|(x+y)>(x+z)&(x+y)>(y+z)=(x*x)+(y*y)|(x+z)>(x+y)&(x+z)>(y+z)=(x*x)+(z*z)|(y+z)>(y+x)&(y+z)>(x+z)=(y*y)+(z*z)|x=z&x>y=(x*x)+(z*z)|x=y&x>z=(x*x)+(y*y)|z=y&z>x=(y*y)+(z*z)|x=z&x<y=(x*x)+(y*y)|x=y&x<z=(x*x)+(z*z)|z=y&z<x=(x*x)+(z*z)|x=y&y=z=(x*x)+(y*y) xxx@mytum.de
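
The count-plus-abbreviation format can be imitated with a toy lexer. The sketch below is only a rough approximation (the real program relies on Language.Haskell.Exts.Lexer, and proper Haskell lexing is far richer), but it handles the backquote rule described above and abbreviates each token to a single character:

```haskell
import Data.Char (isAlpha, isAlphaNum, isSpace)

-- Very rough lexer: identifiers, backquoted identifiers (one token,
-- following the Haskell 98 reading above), and single-character
-- symbols.  Multi-digit literals, multi-character operators such as
-- ==, comments, etc. are not handled correctly.
toyLex :: String -> [String]
toyLex [] = []
toyLex (c : cs)
  | isSpace c = toyLex cs
  | c == '`' =
      let (name, rest) = span (/= '`') cs
      in ('`' : name ++ "`") : toyLex (drop 1 rest)
  | isAlpha c || c == '_' =
      let (body, rest) = span (\x -> isAlphaNum x || x `elem` "_'") cs
      in (c : body) : toyLex rest
  | otherwise = [c] : toyLex cs

-- One character per token; a backquoted token is abbreviated to the
-- first letter of its name, e.g. `min` becomes 'm'.
abbrev1 :: String -> Char
abbrev1 ('`' : c : _) = c
abbrev1 (c : _)       = c
abbrev1 []            = '?'

-- Token count plus abbreviated program, as in the Master View.
summary :: String -> (Int, String)
summary s = (length ts, map abbrev1 ts)
  where ts = toyLex s
```

On the sum_max_sq example above, summary returns (24, "sxyz=x^2+y^2+z^2-xmymz^2"), matching the output shown earlier.
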

The Master View allows the MC Sr. to compile the Top 30 der Woche, but also to quickly spot interesting solutions and potential plagiarism. Plagiarism isn't cool, and occurrences of it will be penalized by subtracting 1 000 000 000 (ONE ZILLION) points from the Semester total.

For the competition, the above program is invoked only on the code snippet included between {-WETT-} and {-TTEW-}, with all type signatures stripped away.
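
The extraction could look roughly like this (a plausible reconstruction only: the actual code is not shown, and the stripping of type signatures is omitted):

```haskell
import Data.List (isPrefixOf)

-- Return the text between the first {-WETT-} and the following
-- {-TTEW-}, or the empty string if the tags are missing.
betweenTags :: String -> String
betweenTags = before "{-TTEW-}" . after "{-WETT-}"
  where
    after tag s
      | tag `isPrefixOf` s = drop (length tag) s
      | null s             = ""
      | otherwise          = after tag (tail s)
    before tag s
      | null s || tag `isPrefixOf` s = ""
      | otherwise                    = head s : before tag (tail s)
```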

Some competitors have tried to confuse tokenize by using German letters (ä etc.) in their source code. Setting

        export LANG=de_DE

before running the program solves this problem.