Wettbewerb
Jedes Übungsblatt beinhaltet eine vom Master of Competition, Herrn Dr. MC „Hammer“ Blanchette, vom Co-Master of Competition, Herrn Co-MC Noschinski, oder vom Contra-Master of Competition, Herrn Contra-MC Hupel, sorgfältig ausgewählte Wettbewerbsaufgabe. Diese zählt so wie die anderen Hausaufgaben, wird aber zusätzlich als Teil eines semesterlangen Wettbewerbs bewertet. Die Bewertungskriterien variieren jedes Mal. Die besten 30 Lösungen bekommen je 30, 29, ..., 1 Punkte.
Jede Woche wird die Top 30 der Woche veröffentlicht und die Top 30 des Semesters aktualisiert. Die Lieblingslösungen des MC werden ins Internet gestellt. Am Semesterende werden die fünf besten Studenten mit geschmackvollen Trophäen geehrt. Es gibt hier also keine Bonuspunkte, dafür aber jede Menge Spaß und Ehre (und Trophäen)!
Wichtig: Wenn Sie nicht möchten, dass Ihr Name im Internet erscheint, können Sie entweder auf den Wettbewerb verzichten (ganz einfach, indem Sie die {-WETT-}...{-TTEW-} Tags weglassen) oder den MC anschreiben.
Gliederung:
- Top 30 des Semesters
- Die Ergebnisse der ersten Woche
- Die Ergebnisse der zweiten Woche
- Die Ergebnisse der dritten Woche
- Die Ergebnisse der (vierten und) fünften Woche(n) (Update)
- Die Ergebnisse der sechsten Woche (Addendum)
- Die Ergebnisse der siebten Woche
- Die Ergebnisse der achten Woche (Update)
- Die Ergebnisse der neunten Woche
- Die Ergebnisse der zehnten Woche
- Die Ergebnisse der (elften und) zwölften Woche (Update)
- Die Ergebnisse der dreizehnten Woche
- Die Ergebnisse der vierzehnten Woche
- Was ist denn ein Token?
Top 30 des Semesters | |||
---|---|---|---|
Platz | Wettbewerber(in) | Punkte | |
1. | Maximilian Haslbeck | 365.333 | |
2. | Simon Roßkopf | 339.333 | |
3. | Daniel Stüwe | 334.333 | |
4. | Stefan Peter Dirix | 321.333 | |
5. | Hauke Brinkop | 304.333 | |
6. | Michael Schreier | 274.333 | |
7. | Atanas Mirchev | 250.000 | |
8. | Maximilian Kirchmeier | 232.500 | |
9. | Julian Biendarra | 223.333 | |
10. | Florian Dreier | 203.000 | |
11. | Alexander Weidinger | 196.667 | |
12. | Nikita Basargin | 181.333 | |
13. | Lyubomir Stoykov | 181.000 | |
14. | Alexander Christian Hefele | 150.000 | |
15. | Christian Ziegner | 148.000 | |
16. | Martin Mihaylov | 147.000 | |
Moritz Sichert | 147.000 | ||
18. | Ludwig Peuckert | 142.000 | |
19. | Alexandra Fritzen | 137.000 | |
20. | David Otter | 132.000 | |
21. | Lukas Michael Stumberg | 127.000 | |
22. | Clemens Jonischkeit | 112.000 | |
23. | Julia Kindelsberger | 97.000 | |
24. | Johannes Ismair | 87.000 | |
25. | Jens Wöhrle | 81.000 | |
26. | Kevin Yu | 78.000 | |
Thomas Pettinger | 78.000 | ||
28. | Thomas Zwickl | 71.000 | |
29. | Hannes Bibel | 67.000 | |
30. | Roland Schmid | 65.000 |
Die Ergebnisse der ersten Woche
Top 30 der Woche | |||
---|---|---|---|
Platz | Wettbewerber(in) | Token | Punkte |
1. | Clemens Jonischkeit | 20 | 30 |
Lukas Michael Stumberg | 20 | 30 | |
Michael Schreier | 20 | 30 | |
Simon Roßkopf | 20 | 30 | |
5. | Florian Dreier | 25 | 26 |
Roland Schmid | 25 | 26 | |
Hauke Brinkop | 25 | 26 | |
Julien Schmidt | 25 | 26 | |
Maximilian Kirchmeier | 25 | 26 | |
10. | David Otter | 26 | 21 |
Stefan Peter Dirix | 26 | 21 | |
Atanas Mirchev | 26 | 21 | |
Dominik Gutermuth | 26 | 21 | |
Daniel Stüwe | 26 | 21 | |
Lars Wüstrich | 26 | 21 | |
Moritz Sichert | 26 | 21 | |
Christian Brosche | 26 | 21 | |
Michael Zellner | 26 | 21 | |
Fabian Weise | 26 | 21 | |
Daniel-Theodor Plop | 26 | 21 | |
Lyubomir Stoykov | 26 | 21 | |
Andrei Vlad | 26 | 21 | |
Jens Wöhrle | 26 | 21 | |
Lawrence Krug | 26 | 21 | |
Maximilian Haslbeck | 26 | 21 | |
Vladislav Stepa | 26 | 21 | |
Aleksander Umov | 26 | 21 | |
Stefan Kreisig | 26 | 21 | |
Michael Benedikt Schwarz | 26 | 21 | |
Martin Mihaylov | 26 | 21 |
The Master of Competition is proud to report that over 200 contestants took part in the first week's challenge! Bravo!
First things first. The MC mentioned in the Aufgabentext that he had a 31-token solution. Here it is:
f124 x y z = a + 2 * b + 4 * c where [a, b, c] = sort [x, y, z]
Decidedly, he didn't try very hard—indeed, he wouldn't have made it into the week's top 20, not even in the top 80! Asche auf sein Haupt! But to tell the truth, he didn't want to set the bar too high and demotivate the contestants. In particular, the coefficients 1, 2, 4 had been chosen because they exhibit nice mathematical properties, but the MC had made no attempt to use these properties.
How does this solution work? First, it builds the three item list [x, y, z] with the three arguments to the function in it. Lists in Haskell are written using a very natural notation. The list is sorted in increasing order, and the result is assigned to the list [a, b, c]. This is called pattern matching. Once this step is performed, a is the least element, b is the middle element, and c is the maximum element. Finally, the formula is computed. Notice how this solution is an almost direct Haskell rendering of the German prose of the Aufgabentext. Hence it is a good compromise between readability and conciseness.
Now let's move on to the shortest solution, with 20 tokens:
f124 x y z = sum $ map maximum $ tail $ subsequences [x, y, z]
This solution is the work of a genius. It was submitted by four competitors, although the MC is not naive enough to believe that it was discovered independently by four geniuses. There's a saying that “great minds think alike,” but still. In any case, the solution is truly amazing. And all the functions involved are from the Haskell standard library.
It works as follows. First, the $ signs are nothing to be afraid of. They're just a cheap trick to get rid of some parenthesis pairs. At the cost of three more tokens, the above function could have been written as
f124 x y z = sum (map maximum (tail (subsequences [x, y, z])))
The subsequences call computes the list of all subsequences of [x, y, z]. A subsequence of a list is a list that contains a subset of the elements of the original list in the same order. For [x, y, z] this means
[[], [x], [y], [z], [x, y], [x, z], [y, z], [x, y, z]]
The tail call removes the first element, [], leaving a 7-element list. Then the map maximum computes, for each list in the list of lists, the maximum. Assuming x > z > y, after the maximum computation we would have
[x, y, z, x, x, z, x]
Finally, the sum function computes
x + y + z + x + x + z + x
How many x's do we have? Four. How many z's? Two. How many y's? One.
Here's the explanation from one of the four competitors who submitted this entry (presumably the genius who thought of it in the first place):
{- Bildet die Potenzmeng, entfernt die fordere leere liste, wendet auf die teillisten x,y,z,xy,xz,yz,xyz jewails maximum an => Der groesste Wert gewinnt 3 vergleiche, der mittlere einen, dazu kommt jedes element noch ein mal => 4 mal das groesste, 2 mal das mittlere, 1 mal das kleine ent. a + 2b + 4c dann noch zusammenzaehlen. done. -}
(Incidentally, the genius's German orthography leaves much to be desired. Potenzmeng? Jewails? Today is a bad day for the German language. And don't come and tell the MC it's the neue Rechtschreibung! ;))
Let's move on. The next best solutions had 25 tokens:
f124h u v = u + v * 2 f124 x y z = foldr1 f124h $ sort [x, y, z]
Again, this solution uses some material that hasn't been shown in class yet. But that's fine for the competition, where everything goes. What's the trick here? First, there's the sort, which we now understand. So let's conceptually replace sort [x, y, z] by [a, b, c] in our minds and get rid of that annoying $ sign. This gives
f124h u v = u + v * 2 f124 x y z = foldr1 f124h [a, b, c]
Next, we have to explain that weird foldr1 call. The function will be explained in a couple of weeks (and you will then understand why its name sounds like the German word for “torture”), but for the moment it is sufficient to understand that it simply applies an operation (here, f124h) repeatedly on a list. So the above is nothing else than
f124h u v = u + v * 2 f124 x y z = f124h a (f124h b c)
Expanding the definition of the helper function f124h, we get
f124 x y z = a + 2 * (b + 2 * c)
which is correct. The solution is clearly inspired by the base-2 version of the Horner rule. [Here the MC would insert a link to an appropriate web site, e.g., Wikipedia, but he's working offline and doesn't know a relevant URL by heart. Please look it up for yourself.]
Another version of this solution inlines the helper function using a so-called λ-abstraction (λ = lambda). It also needs 25 tokens:
f124 x y z = (\u v -> u + 2 * v) `foldr1` sort [x, y, z]
The λ-abstraction is really nothing more than an anonymous function. It is introduced by a backslash (\), an ASCII-friendly approximation of λ, then come the arguments (u and v), then an arrow, and finally the body of the function. Mathematically inclined readers can think of it as being a perverse computer scientist notation for (u, v) ↦ u + 2v.
The other oddity is the `foldr1` token. A two-argument function f is normally invoked by passing the arguments after the function name, as in f x y, but the alternative syntax x `f` y is also allowed. Im Endeffekt, the backticks (`) transform a prefix operator into an infix operator. The infix operator has a fairly low precedence, so this can be used to save some parentheses. Once we name the anonymous function and get rid of the backtick syntax, we fall back on the other 25-token solution. Which goes to show that these tricks don't buy you much!
The 26-token solutions were also rather exciting. They came in two variants:
f124 x y z = (max x y + max x z + max y z) * 2 + min x y `min` z f124 x y z = (x + y + z + x `max` y `max` z) * 2 - x `min` y `min` z
No new Haskell syntaxes need to be introduced to understand these solutions. What's needed is some (non-lethal) dose of mathematics.
It is customary for the MC not only to present the top solutions, but also median and egregious ones. This is a median solution, at 34 tokens:
f124 x y z | x > y = f124 y x z | y > z = f124 x z y | otherwise = x + 2 * y + 4 * z
The self-calls to f124 are a nice way to normalize the order of the arguments without having to appeal to library or helper functions. In fact, the MC had a very similar solution in his drawer. And to offer some contrast, here is the longest working solution, at 185 tokens:
f124 x y z = (minimum [x,y,z]) + 2 * (mid x y z) + 4 * (maximum[x,y,z]) mid x y z | maximum[x,y,z] == x && minimum[x,y,z] == y = z | maximum[x,y,z] == x && minimum[x,y,z] == z = y | maximum[x,y,z] == y && minimum[x,y,z] == x = z | maximum[x,y,z] == y && minimum[x,y,z] == z = x | maximum[x,y,z] == z && minimum[x,y,z] == x = y | maximum[x,y,z] == z && minimum[x,y,z] == y = x
There were 28 solutions with over 100 tokens, so this is not an isolated case. There's also no reason to be ashamed here, since we're at the very beginning of the course. We're here to learn, after all. But what's important to notice is what makes this solutions long: It's the needless case distinctions. Six cases are distinguished and handled separately, with a lot of copy-paste. All the solutions in the top 20 managed to avoid the case distinctions, either by calling sort or by relying on some tricks. The medial solution distinguished three cases, two of which were reduced to the third case through recursive calls. One of the MC's favorite quotes from Donald Knuth goes as follows:
The ability to handle lots of cases is Computer Science's strength and weakness. We are good at dealing with such complexity, but we sometimes don't try for unity when there is unity.
Finally, here's the shortest incorrect solution (27 tokens), to remind us of the need to QuickCheck our programs:
f124 x y z = sum [max x a + a | a <- [0, y, z, max y z]]
It's hard to explain how this function works, especially since it doesn't. It is such a pity, because the same contestant had actually come up with a correct solution as well:
-- schönere Lösung, aber mehr Token: -- f124 x y z = sum (zipWith (*) (sort [x,y,z]) [1,2,4])
Here's another wrong (but plausible-looking) solution:
f124 x y z = min (min x y) z + 2 * min (max x y) z + 4 * max (max x y) z
To conclude, the MC would like to mention a few solutions that were communicated to him privately by last year's winner of the competition, Lorenz Panny (who is apparently bored by his other lectures). Herr Panny has two 21-token solutions, both based on the read function, which translates a string into a Haskell value:
f124 a b c = sum $ uncurry replicate =<< read "[1, 2, 4]" `zip` sort [a, b, c] f124 a b = sum <<< uncurry replicate <=< uncurry zip . fmap sort . read . printf "([1, 2, 4], [%d, %d, %d])" a b
He also has a 23-token solution:
f124 a b c = sum $ uncurry replicate =<< 3 `delete` enumFrom 1 `zip` sort [a, b, c]
No attempts will be made to explain these. Especially the second solution is mysterious even to the MC, because it uses advanced Haskell-specific constructs that won't be taught in class.
So what are the lessons to learn from this week's little experiment?
The first lesson is probably that Mathematics can be used to shorten programs. Nearly all of the shortest solutions exploited mathematical properties of the coefficients 1, 2, 4, namely, 2 × 1 = 2 and 2 × 2 = 4. Whether it's a good thing or not is another queston. For the competition, it's obviously a good thing, because it saves tokens. But it does make the programs harder to understand and harder to maintain. If the specification changes and the program has to work for the coefficients, say, 8, 3, 299, the most optimized solutions have to be rethought from scratch.
The second lesson is that Libraries can be used to shorten programs. And using the library instead of reimplementing functionality is almost invariably a good thing, because
- the library is virtually bug-free;
- the library is well documented;
- the library is likely to be optimized to be faster and/or use less memory than most user applications;
- the library is part of the vocabulary of other Haskell programmers, so they can read your code and understand it quickly.
The third lesson is that Cheap tricks can be used to shorten programs. This is not a very important lesson for programming in the real world, because sometimes it's better to waste a few tokens and to write clearer code. The cheapest trick of all is surely the use of read to transform strings into Haskell values. The competition is an artificial setting. In noncompetitive contexts, use your common sense!
There's a fourth lesson: If you find yourself copy-pasting code, you must be doing something wrong. Copy-paste occurs when you notice some similarity and exploit it. But there are almost always better ways of exploiting the similarity.
Despite the exceptions, there is a fairly strong correlation between program length and quality of code. The ratio above between shortest and longest solution is over 9:1. For large programs, this can make a real difference. It is not unknown from industry and academia that some people achieve more in 2,000 lines of code (LOCs) what others in 20,000 LOCs. And in nearly all case, the 2,000 LOC program is going to be much easier to understand, maintain, and extend. What this week's exercise didn't show, but is certainly true in practice, is that short code tends naturally to be more general and more widely applicable. In fact, there was a bit of this already in the top solutions: Most of them can easily be generalized to compute a + 2b + 4c + 8d + 16e + etc. or even a + 10b + 100c + 1000d + 10000e + etc., because the core algorithm works on lists and implements some variant of the Horner rule.
PS: Somebody who shall remain nameless submitted this “9-token” solution:
f124naiv :: Integer -> Integer -> Integer -> Integer f124naiv x y z |x<=y && y<=z = x + (2 * y) + (4 * z) |x<=z && z<=y = x + (2 * z) + (4 * y) |y<=x && x<=z = y + (2 * x) + (4 * z) |y<=z && z<=x = y + (2 * z) + (4 * x) |z<=x && x<=y = z + (2 * x) + (4 * y) |z<=y && y<=x = z + (2 * y) + (4 * x) f124 :: Integer -> Integer -> Integer -> Integer f124 x y z = f124naiv x y z
Shame on that person! Needless to say, the MC does not let himself be fooled by the wrong placement of the opening {-WETT-} tag.
Die Ergebnisse der zweiten Woche
First, an important announcement: In consultation with the Meta-Master (Prof. Nipkow), the MC has decided to extend the Top 20 to a Top 30 and retrospectively grant 10 additional points to the winners of the first week's contest.
Top 30 der Woche | ||||
---|---|---|---|---|
Platz | Wettbewerber(in) | Token | Lösung (Kurzfassung) | Punkte |
1. | Maximilian Kirchmeier | 22 | par=lgirf(wa=e0(sr+r)) | 30 |
2. | Simon Roßkopf | 23 | par=a(=)rf(wac[0.r+sr]) | 29 |
3. | Hauke Brinkop | 26 | hk=k=rkpar=fh$wac[0.2*r+1] | 28 |
Julian Biendarra | 26 | ix=x=rxpar=fi$wac[0.r*2+1] | 28 | |
5. | Daniel Stüwe | 27 | par=f((=)ar)$w(na)<e0$2*r+1 | 26 |
6. | Florian Dreier | 28 | fa=a=rapar=ff$c(wa)[0.r*2+1] | 25 |
Clemens Jonischkeit | 28 | par=[x|i<e0$r+sr,x<wai,rx=x] | 25 | |
Michael Schreier | 28 | par=[x|y<e0$sr+r,x<way,x=rx] | 25 | |
Stefan Peter Dirix | 28 | par=fp$c$wam[0.r*2+1]px=x=rx | 25 | |
10. | Maximilian Haslbeck | 29 | par=[a|a<c$wam[0.r*2+1],ra=a] | 21 |
Florian Weinberger | 29 | par=[x|x<c(wa)[0.2*r+1],x=rx] | 21 | |
Hannes Bibel | 29 | par=s(cl)[w+a+rw|a<wa1,w<war] | 21 | |
13. | Jonathan Kienzle | 30 | par=[w|n<[0.2*r+1],w<wan,w=rw] | 17 |
Roland Schmid | 30 | par=[w|n<[0.2*r+1],w<wan,w=rw] | 17 | |
Dominik Dechamps | 30 | par=[w|r<[0.2*r+1],w<war,w=rw] | 17 | |
Alexander Weidinger | 30 | par=[w|x<[0.r*2+1],w<wax,w=rw] | 17 | |
Maximilian Weininger | 30 | par=[x|n<[0.2*r+1],x<wan,x=rx] | 17 | |
Sönke Erfkamp | 30 | par=[y|x<[0.r*2+1],y<wax,y=ry] | 17 | |
19. | Tobias Gottwald | 31 | ix=x=rxpar=fi$c$zw(ra)[0.r*2+1] | 12 |
Lars Wüstrich | 31 | par=[x|x<n$c(wa)[0.2*r+1],x=rx] | 12 | |
Martin Mihaylov | 31 | par=c[ifwan|n<[0.2*r+1]]iw=w=rw | 12 | |
Lawrence Krug | 31 | par=f(\x>rx=x)$n$c(wa)[0.2*r+1] | 12 | |
Bogdan Iacob | 31 | par=n[x|x<[0.2*r+1],x<wax,x=rx] | 12 | |
24. | Gernot Brunner | 32 | par=[x|n<[0.(r*2+1)],x<wan,x=rx] | 7 |
Benjamin Schagerl | 32 | par=f(\x>x=rx)$c[war|r<e0$r*2+1] | 7 | |
26. | Zhechko Zhechev | 33 | par=[x|x<c[war|r<[0.2*r+1]],x=rx] | 5 |
Konstantin Bauer | 33 | par=[y|y<c(\x>wax)[0.2*r+1],y=ry] | 5 | |
Daniel-Theodor Plop | 33 | par=c[f(\x>x=rx)$war|r<[0.2*r+1]] | 5 | |
Matthias Grimm | 33 | par=n["+x|z<[0.2*r+1],x<waz,x=rx] | 5 |
This week's Top 30 has a bit more diversity in it than last week. There are more equivalence classes, which suggests that Gruppenarbeit (which is durchaus erlaubt) was not quite as prominent as last week. But in essence, the top solutions are all based on the same idea, namely that the simplest way to generate all palindromes of radius n is to generate all words of at most length 2n + 1 and filter out the words w that do not satisfy the property w == reverse w.
Here is the top solution, with 22 tokens:
palindromesOfRadius alpha radius = liftM2 geq id reverse `filter` (wordsOfLength alpha =<< enumFromTo 0 (succ radius+radius))
This solution is abbreviated as par=lgirf(wa=e0(sr+r)) in the table above, by taking one letter for each token. The MC normally likes to explain solutions that use some constructs that will be seen later in class, but here the liftM2 construct is esoteric enough that he will restrain from doing so and rather focus on simpler solutions. The same comment applies to
palindromesOfRadius alphabet radius = ap (==) reverse `filter` (wordsOfLength alphabet `concatMap` [0..radius + succ radius])
In short, if you find yourself importing any of the modules GHC.Exts, Control.Arrow, and Control.Monad, chances are that (1) you will win the competition and (2) the MC will call your solution cryptic.
Thankfully, the third place is occupied by a less cryptic entry:
h k = k == reverse k palindromesOfRadius a r = filter h $ wordsOfLength a `concatMap` [0..2*r+1]
First, let's rewrite the right-hand side of the main function's definition to make it a bit more readable:
filter h (concatMap (wordsOfLength a) [0 .. 2 * r + 1])
concatMap is a higher-order function, that is, a function that takes a function as argument. The function it takes as argument is wordsOfLength a. That's right: Even though wordsOfLength is supposed to take two arguments, it's OK to call it with a single argument. In that case it returns a function that takes a second argument, and that function is passed to concatMap.
In general, concatMap f [x1, x2, ..., xn] (where f is a function that returns a list and [x1, x2, ..., xn] is an n-element list) returns the list f x1 ++ f x2 ++ ... + f xn. So in the above case, we get
filter h (wordsOfLength a 0 ++ wordsOfLength a 1 ++ ... ++ wordsOfLength a (2 * r + 1))
(except that Haskell doesn't support the ... notation). Finally, the filter call takes a predicate (a function that returns Bool) and filters out all the elements that do not satisfy the predicate. Here, the predicate is h, and it is defined as h k = k == reverse k.
This use of filter and concatMap is anticipating material covered later in the lectures. An alternative that is just as powerful is to use list comprehensions, as in this 28-token solution:
palindromesOfRadius alphabet radius = [x | i <- enumFromTo 0 $ radius + succ radius, x <- wordsOfLength alphabet i, reverse x == x]
What about the MC's solution? Well, even though the instructions were clear that token count is what matters, he couldn't resist writing an efficient implementation. That cost him 49 (!) tokens:
palindromesOfRadius :: [Char] -> Integer -> [[Char]] palindromesOfRadius alphabet radius = (if radius == 0 then [] else palindromesOfRadius alphabet $ radius - 1) ++ [w ++ mid ++ reverse w | mid <- "" : map (: "") (nub alphabet), w <- wordsOfLength alphabet radius]
Some of the MC's friends also participated in the competition on an informal basis. (Actually, the whole competition is informal, but these people are not eligible for the trophies.) First, here's a 27-token solution by Manuel Eberl, who was a tutor last year:
Puisque Lorenz Panny participe à la compétition maintenant, je pense que je pouvais le faire aussi. :) C'est ma solution avec 27 tokens:import Control.Monad import Data.List import Data.Function.Predicate palindromesOfRadius :: [Char] -> Integer -> [[Char]] palindromesOfRadius xs n = filter (reverse >>= equals id) $ [0..2*n+1] >>= sequence . flip genericReplicate xs(Il y a une version plus courte (25 tokens) avec "replicateM" si on a "Int" au lieu de "Integer", mais on ne l'a pas) Je pense qu'il a besoin d'une version actuelle de GHC, parce qu'il utilise l'instance de "Monad" pour ((->) [a]). Ma version est 7.6.2, je sais que ça ne marche pas avec version 7.4.2.
Then the MC got this email from Lorenz Panny (the winner of last year's competition):
PS: Achja, und bevor ich's vergesse ;-)...: (19 Tokens, aber mit bösem GHC.Exts-Import für plusInteger)palindromesOfRadius as = filter (isPrefixOf <*> reverse) . wordsOfLength as <=< enumFromTo 0 <<< plusInteger <*> succ(20 Tokens, dafür aber "sauber" (naja...) mit Standardbibliotheken (Data.List, Control.Monad, Control.Applicative, Data.Bits) -- man beachte "flip subtract <*> complement" für "λn. 2n+1")palindromesOfRadius as = filter (isPrefixOf <*> reverse) . wordsOfLength as <=< enumFromTo 0 <<< flip subtract <*> complement(und noch eine "konstruktive" (d.h. nur Palindrome bauen, statt alle Nicht-Palindrome herauszufiltern) Lösung mit sehr viel Arrow- und Monad-Magie -- diese Lösung ist nur geringfügig auf Tokenzahl optimiert, da das ohnehin aussichtslos schien :-))palindromesOfRadius as = app <=< uncurry mplus . unzip . map (fmap return &&& fmap ((`map` nub as) . flip (:)) <<< map . mplus &&& reverse) . wordsOfLength as <=< enumFromTo 0
And finally, Andreas Bergmaier, who ended up in fifth place last year because of some bad luck (he is truly number 2 in the MC's heart), felt that he had to give Herr Panny a run for his money:
Auch als Tutor tüftel ich gern wieder bei den Wettbewerbsaufgaben mit. Diese Woche bin ich bis auf 24 Token heruntergekommen, mit der ziemlich ineffizienten LösungpalindromesOfRadius alp r = ap (==) reverse `filter` ([0..2*r+1] >>= wordsOfLength alp)(davon ausgehend, dass nub innerhalb von wordsOfLength verwendet wird, um die Duplikatfreiheit zu garantieren). Sie ist äquivalent zu der einfachen List-Comprehension-LösungpalindromesOfRadius alp r = [ p | n<-[0..2*r+1], p<-wordsOfLength alp n, reverse p == p]Eine effizentere Lösung mit detaillierter Konstruktion der Palindrome habe ich nur in 38 Tokens geschafft:palindromesOfRadius alp = genericIndex $ iterate help enalp where help short = enalp ++ do -- prepend empty word and alphabet to p <- short -- for each shorter (previous) solution c <- tail enalp -- and everything from alphabet (but the empty word) return $ c++p++c -- wrap it between the chars enalp = "" : group (nub alp) -- "extended nubbed alphabet", vom Typ [String]Die do-Notation ist tatsächlich 1 Token kürzer als [c++p++c | p <- short, c <- tail enalp] :-)
To quote the Meta-Master: Diese Jungs haben zu viel Energie.
What are this week's lessons?
The first lesson is that Competition is addictive and may have a detrimental effect on your other interests. In the worst-case scenario, you may find yourself addicted for life, just like Herr Panny, Herr Eberl, and Herr Bergmaier, and periodically write emails to a tragicomical figure who calls himself MC Hammer, not to mention the harm caused by constantly refreshing wettbewerb.html, desperately waiting for the next fix.
The second lesson is that Haskell provides many different means to attack problems at an abstract level. Some of these are way beyond the scope of this course, and can lead to cryptic code, but they have nonetheless legitimate uses.
The third lesson is that Simplicity and efficiency are hard to achieve simultaneously. The top solutions had to give up on long examples like saippuakauppias (the Finnish soap merchant). Efficient solutions take about twice as many tokens, because wordsOfLength cannot be reused. In practice, we often have to choose before simplicity and efficiency, and we must use our wits to determine which is more important for a specific application. Even an algorithm guru like Donald Knuth uses inefficient procedures in some of his programs (see paragraph 22).
There's a really terrible language out there that calls itself Perl and whose main claim of fame is that in that language, TMTOWTDI. If the competition has shown anything, it must be that this is even more true of Haskell.
Finally, you can Write emails to the MC in any language supported by Google Translate. And you can use du or Du or Sie or tu or vous or whatever and he'll try to answer in the same style. (Knigge & Co. have been on a slippery slope since the day a journalist said du to the Swedish prime minister back in the 1970s. The barbarians are at the gate.)
Die Ergebnisse der dritten Woche
Here are the results for the third week. The top 30 is effectively a top 43, due to a slew of 14 competitors with nearly identical solutions occupying the last step in the stairway to heaven.
Top 30 der Woche | ||||
---|---|---|---|---|
Platz | Wettbewerber(in) | Lösung (Kurzfassung) | Token | Punkte |
1. | Clemens Jonischkeit | e=d.fd"dk=m(lzk") | 17 | 30 |
Simon Roßkopf | e=d.fd"dk=m(lzk") | 17 | 30 | |
3. | Maximilian Kirchmeier | e=m.fl.z"d=m.fl.fz" | 19 | 28 |
4. | Daniel Stüwe | e=e"d=fe"ekk=c.m(fl$zkk) | 24 | 27 |
Benjamin Schagerl | ekc=m(lz"k)cdkc=m(lzk")c | 24 | 27 | |
6. | Aleksander Umov | ekc=m(k!)$c>fe"dkc=ekic!2 | 25 | 25 |
7. | Alexander Weidinger | ek=mhWha=gk$oa-9dkt=ekitg2 | 26 | 24 |
Hauke Brinkop | ek=mhWha=gk$oa-9dkt=ekitg2 | 26 | 24 | |
9. | Maximilian Haslbeck | ek=m$gk.s9.odk=m$c.s1.h.fek | 27 | 22 |
10. | Stefan Peter Dirix | ek=m$gk.s9.odk=m$c.(+9).f.fek | 29 | 21 |
Julian Biendarra | ek=m$gk.s9.odk=m$c.(9+).h.fek | 29 | 21 | |
12. | Michael Schreier | ekc=dk['.]dcdkc=[c$9+h(eak)|a<c] | 32 | 19 |
13. | Moritz Sichert | ek=m$\c>k!(oc-9)dk=m$\c>c$9+h(eck) | 34 | 18 |
Florian Weinberger | ek=m$\l>gk$ol-9dk=m$\l>c$9+(h$elk) | 34 | 18 | |
Atanas Mirchev | fabw=c[lx$zab|x<w]ekc=f"kcdkc=fk"c | 34 | 18 | |
16. | David Otter | ek=m(\c>k!(oc-9))dk=m(\c>c$h(eck)+9) | 36 | 15 |
Julien Schmidt | ek=m(\x>k!(ox-9))dk=m(c.(+9).h.(ek)) | 36 | 15 | |
18. | Konstantin Bauer | ekc=m((!)k.s9.o)cdkc=[c$9+h(exk)|x<c] | 37 | 13 |
Lyubomir Stoykov | ekl=(!)km(s9mmol)dkl=mc[9+h(etk)|t<l] | 37 | 12 | |
20. | Vladislav Stepa | ek=_k(sk)dk=_(sk)k_kkt=[k!(h$exk)|x<t] | 38 | 11 |
Marco Probst | ekc=[gk$ocm9|c<c]dkc=[c$9+(h$eck)|c<c] | 38 | 11 | |
Valentin Zieglmeier | ekc=[k!(oc-9)|c<c]dkc=[c$9+h(eck)|c<c] | 38 | 11 | |
Jens Wöhrle | ekc=[k!(oc-9)|c<c]dkc=[c$h(eck)+9|c<c] | 38 | 11 | |
Barbara Moser | ekc=[k!(oc-9)|c<c]dkc=[c(eck!0+9)|c<c] | 38 | 11 | |
Christian Ziegner | ekc=[k!(ox-9)|x<c]dkc=[c$9+h(exk)|x<c] | 38 | 11 | |
Maximilian Pudelko | ekc=[k!(ox-9)|x<c]dkc=[c$9+h(exk)|x<c] | 38 | 11 | |
Martin Hartmond | ekc=[k!(ox-9)|x<c]dkc=[c$h(exk)+9|x<c] | 38 | 11 | |
Amar Saljic | ekc=[k!s9(oa)|a<c]dkc=[c$9+h(eak)|a<c] | 38 | 11 | |
Thomas Pettinger | ekk=[k!(ox-9)|x<k]dkc=[c$9+h(exk)|x<c] | 38 | 11 | |
30. | Roland Schmid | ekc=[k!(oa-o')|a<c]dkc=[c$o'+eak!0|a<c] | 39 | 1 |
Thomas Engel | ekc=[k!(oc-9)|c<c]dkc=[c$9+(h$eck)|c<c] | 39 | 1 | |
Axel Fischer | ekc=[k!(oc-9)|c<c]dkc=[c(9+h(eak))|a<c] | 39 | 1 | |
Kwon-Jin Jensen | ekc=[k!(oc-9)|c<c]dkc=[c(h(eck)+9)|c<c] | 39 | 1 | |
Maximilian Mumme | ekc=[k!(ol-9)|l<c]dkc=[c$(h$elk)+9|l<c] | 39 | 1 | |
Adrian Thilo | ekc=[k!(ol-9)|l<c]dkc=[c(h(elk)+9)|l<c] | 39 | 1 | |
Thorsten Fuchs | ekc=[k!(ox-9)|x<c]dkc=[c$9+(exk)!0|x<c] | 39 | 1 | |
Alexander Christian Hefele | ekc=[k!(ox-9)|x<c]dkc=[c$9+(h$exk)|x<c] | 39 | 1 | |
Ludwig Peuckert | ekc=[k!(ox-9)|x<c]dkc=[c(h(exk)+9)|x<c] | 39 | 1 | |
Konrad Weiss | ekc=[k!(ox-9)|x<c]dkc=[c(h(exk)+9)|x<c] | 39 | 1 | |
Bogdan Iacob | ekc=[k!(ox-9)|x<c]dkc=mc[(h$exk)+9|x<c] | 39 | 1 | |
Mihai Dutescu | ekc=[k!(s9$ox)|x<c]dkc=[c$h(eyk)+9|y<c] | 39 | 1 | |
Zhechko Zhechev | ekc=m(\x>k!(ox-9))cdkc=[c$h(eyk)+9|y<c] | 39 | 1 | |
Lukas Holzner | ekk=[k!(ox-9)|x<k]dkg=[c(h(exk)+9)|x<g] | 39 | 1 |
In conjunction with global warming, nonlinear effects have often been mentioned in the media. Now, the MC is not a climatologue, but whatever nonlinear effects affect the climate, they are certainly dwarfed by what we observe in the bottom half of the table: A one token increase (from 38 to 39) results in a 91% drop (from 11 to 1) in the score! Clearly, every token counts. (The same can unfortunately not be said of votes.)
To resolve the ties, the MC is entitled to resort to the Nebenkriterium, namely the readability (“Lesbarkeit, wobei Formatierung und Namensgebung besonders gewichtet werden”). By now it should be clear that this criterion is just an empty threat. Nevertheless, please continue to keep this in mind when writing code. Several solutions this week came with detailed comments, which makes for nice bedtime reading.
All right, let's move to the top solution, with 17 tokens:
--in kooperation mit Simon Rosskopf encode = decode . flip decode "abcdefghijklmnopqrstuvwxyz" -- by decoding the key with the alphabet you get the inverse key -- encode key text = decode (inverse key) text decode key = mapMaybe (`lookup` zip key "abcdefghijklmnopqrstuvwxyz") -- No black magic in use! Due to lookup returns Maybe, mapMaybe is used, but it works similar to a normal map
The first comment, “in kooperation mit Simon Rosskopf,” is a frank acknowledgment that shows intellectual integrity. MC likes! College may feel like an artificial setting, but sooner than you expect you will find yourself in a situation where forgotten acknowledgments will cause you serious trouble. Here it would be nice to have a quote from Donald Knuth (the MC's idol) to emphasize this point, but we are lucky enough to have the second best thing: A quote about Donald Knuth!
Full of youthful enthusiasm at being able to communicate improvements on a previously published algorithm (Don was a Junior then), he failed to mention his co-authors in the paper; Don did the writing but other students contributed illustrations and most of the ideas of the algorithm. At the time he had no notion there was academic prestige to be gained through publication, Don confessed. This is, he said, a common mistake among young authors who frequently overlook proper acknowledgements in their haste to get the news out.
(Quelle: Section 18 of Mathematical Writing.)
Let's close this parenthesis and move on to the solution itself. Let's first massage it a little bit. The original version uses a few cheap trick, notably the function composition operator (.), to save tokens, but here's the essence:
encode key cleartext = decode (decode key "abcdefghijklmnopqrstuvwxyz") cleartext decode key cryptotext = mapMaybe (\c -> lookup c (zip key "abcdefghijklmnopqrstuvwxyz")) cryptotext
As the comment in the code suggests, decoding the string abc...z with the key gives the inverse key, and decoding with the inverse key is the same as encoding with the original key. This works because (well-formed) keys are permutations and permutations are invertible.
The next solution, at 19 token, has some of the same flavor, but no attempt is made to reuse decode to implement encode (or vice versa):
encode = mapMaybe . flip lookup . zip "abcdefghijklmnopqrstuvwxyz" decode = mapMaybe . flip lookup . flip zip "abcdefghijklmnopqrstuvwxyz"
Let's jump directly to Herr(n) Umov in 6th place and his mathematically inspired solution:
encode key cleartext = map (key !!) $ cleartext >>= flip elemIndices "abcdefghijklmnopqrstuvwxyz" decode key cryptotext = encode key `iterate` cryptotext !! 26771144399
What's going on? The encoding is a bit odd, but what's really interesting is the decoding.
Decoding the text is the same as encoding it
{- The encoding algorithms repeats itself after n times, while n = length key. Because of that, if you want to decode a given cryptotext, you can just keep encoding the cryptotext n times and you will have the same cryptotext again. But if you take the result of the (n-1)th encoding, you will get the original cleartext. So a simple solution would be to write: decode key cryptotext = encode key `iterate` cryptotext !! (length key - 1) However, this took up too many tokens. Thus, I wrote "!! 26771144399", since 26771144400 is the smallest number which is dividable by all numbers from 1 to 26. The downside of this solution is that the evaluation takes up quite much time with current CPUs. -}
The explanation is a bit confused. In particular, the claim that “the encoding algorithms repeats itself after n times” does not quite hold, for reasons that will become clear below.
Alternative, more correct explanation, this time in German (mostly):
{- Positive Einflussfaktoren auf die Lösung -> - * ( Bier & Kaffee ... just being honest ;) ) - * Diskrete Strukturen - * Symmetrisches Verschlüsselungsverfahren - - Die Zahl '26771144399' entspricht dem kgV der Zahlen 1 bis 26, -1. - Warum kgV 26? 26 Buchstaben des Alphabets und - somit potentielle Zyklen der Länge 1 bis maximal 26. - Warum -1? Durch Verwendung der encode Methode zum Entschlüsseln - haben wir den Text bereits einmal verschlüsselt und - würden somit beim entschlüsseln eine Stelle "zu weit entschlüsseln". -}
And yet another explanation–the promise of immortality on the MC's weblog seems to arouse the didactive impulse in the competitors!
{- - Ich nehme an, die erste Frage, die sich auftut, wenn man diesen Code liest, ist "WTF sucht diese Zahl da?!?!" - Die Antwort dazu ist ein wenig kompliziert, vorallem, weil man es empirisch nicht zeigen kann (da Haskell - der Speicher ausgeht >.<). [Anmerkung: Eigentlich hatte ich 26!-1, es ist also schon auf Effizienz optimiert xD] - Zuersteinmal: Was ist das für eine Zahl? Diese Zahl ist das Ergebnis des Ausdruckes - kgv `foldr1` [1..26] - 1 - Also das kleinste gemeinsame Vielfache aller Zahlen bis 26 abzüglich der Konstante 1 - Warum 26? Weil das die maximale Alphabetlänge ist und damit auch die maximale Länge der Zyklen in der Permutation - (also im Key!) - Warum KGV? Eine Permutation der Länge l lässt sich durch Zyklen ausdrücken, deren Längen summiert <= 26 sind - Für jeden Zyklus z der Länge l gilt: z^l = id und für 2 Zyklen z1 z2 der Längen l1 l2 (z1 . z2)^(l1*l2) = id - da jede Zykluslänge <= 26 ein Teiler des KGVs ist, gilt hier: (z1 . z2 ...)^(l1*l2*...*k) = id^k = id - wobei k eine natürliche Zahl ist. - Warum -1? Eigentlich ist das Ziel ja, die Abbildung "id" zu erzeugen, ausgehen vom unkodierten Wort. Da es - aber bereits einmal kodiert wurde, sprich jeder Zyklus 1 mal angewandt, muss eben dieser abgezogen werden, - damit diese Funktion eine echte Decodierung durchführt. -}
What about the Jungs who have too much Energie? No worries, they're doing fine.
Lorenz Panny (the eternal winner from last year) also discovered this solution–which just goes to show that new generations aren't necessarily smarter than old ones ;). He wrote a lengthy description on his web page.
Manuel Eberl (tutor emeritus) has an even less efficient solution. Instead of taking the “kgV” (whatever that acronym stands for) of 26, he simply took the factorial (= die Fakultät) of 26, also minus 1.
And Andreas Bergmaier (the 5th place from last year, but #2 in the MC's heart) has a 21-token solution:
encode = codeHelp (=<<) decode = codeHelp ap codeHelp dir = mapMaybe . flip lookup . dir zip sort
He apologized thusly for not winning: Hab allerdings nur eine Dreiviertelstunde investiert, also erwarte nicht zu viel :-)
So what are the lessons to learn this week?
In the first week, we saw that Mathematics can be used to shorten programs. This was confirmed this week. The top solution exploited algebraic properties of permutations, and the five solutions with kgV 26 − 1 or 26! − 1 were basically leveraging group-theoretic results.
But even without mathematics, short solutions were also possible for those who prefer to wear a computer scientist's hat. The third place is taken by a solution that requires no algebraico-group-theoretical mojo.
Perhaps the main lesson is: If you have to perform two (or more) similar tasks, see if there's a way to reuse one to implement the other. This is true in a variety of domains; for example, there exist frameworks that combine parsing and pretty-printing into a single grammar, potentially saving half the work, and half the bugs! The code that doesn't get written doesn't come back to haunt us later.
Die Ergebnisse der (vierten und) fünften Woche(n)
Finally, here are the results of the combined fourth/fifth weeks' contest!
Top 30 der Woche | |||
---|---|---|---|
Platz | Wettbewerber(in) | Punkte | |
1. | Stefan Peter Dirix | 60 | |
2. | Florian Dreier | 58 | |
3. | Kevin Yu | 56 | |
4. | Maximilian Haslbeck | 56 | |
5. | Daniel Stüwe | 52 | |
6. | Atanas Mirchev | 50 | |
Lyubomir Stoykov | 50 | ||
Julian Biendarra | 50 | ||
Lukas Michael Stumberg | 50 | ||
Michael Schreier | 50 | ||
Simon Roßkopf | 50/2 | ||
12. | Sebastian Peter Lehrbaum | 38 | |
Mihai Dutescu | 38/2 | ||
14. | Fabian Volz | 34 | |
Jens Wöhrle | 34 | ||
Moritz Sichert | 34 | ||
David Gaßmann | 34/2 | ||
18. | Christian Haba-Schneider | 26 | |
Nikita Basargin | 26 | ||
Zhechko Zhechev | 26 | ||
21. | Christian Ziegner | 20 | |
David Otter | 20 | ||
Johannes Ismair | 20 | ||
Martin Helfrich | 20 | ||
Martin Mihaylov | 20 | ||
Ludwig Peuckert | 20/2 | ||
27. | Arved Baus | 8 | |
Florian Weinberger | 8 | ||
Hauke Brinkop | 8 | ||
Johannes Klicpera | 8 | ||
Victor Seifert | 8/2 |
Let us start with some practical matters. First, some competitors have a penalty of /2. This penalty is taken into consideration in the Top 30 des Semesters. The reason is given below. Second, following feedback from a competitor, the MC wanted to introduce a new scoring scheme, whereby solutions detected to be Gruppenarbeit count as one solution, so as not to penalize the solutions below it in the table; but this week the MC could not detect any duplicates, so it doesn't change anything one way or the other.
This somewhat longer problem attracted 43 nontrivial submissions (i.e., solutions with the {-WETT-}...{-TTEW-} tags that were not undefined) that passed the QuickCheck tests. In addition, there is one solution by Ex-Tutor Manuel Eberl and one by the MC. The lower number of competitors probably reflects the difficulty of the task.
To rank the solutions, the MC appealed to the following very nice program generously provided by Herr Eberl (and very slightly adapted by the MC):
generateDistanceTable :: Int -> [(String, Integer, String)] generateDistanceTable n = [(show i, dist i j, show j) | i <- [0..n-1], j <- [0..n-1], i /= j] where g = mkStdGen 0 rs = randomRs (-1000000000000, 1000000000000) g positions = V.unfoldrN n (\(x:y:rs) -> Just ((x,y),rs)) rs sqr x = x * x dist i j = case (positions V.! i, positions V.! j) of ((x1,y1), (x2,y2)) -> floor (sqrt (sqr (x2-x1) + sqr (y2-y1)) :: Double)
The program randomly distributes n cities called 0, 1, etc., on a cartesian plane and then computes the Euclidean distances. Normally the result should be a sane table, but due to the use of floating-point arithmetic there are no guarantees. (Using Manhattan distances could have solved this problem.)
Six of the solutions solutions returned False for the case where n = 10, so these were eliminated right away. Five of these six programs failed for the (perfectly sane) distance table
[ ("A", 17, "B"), ("A", 17, "C"), ("A", 9, "D"), ("B", 17, "A"), ("B", 8, "C"), ("B", 8, "D"), ("C", 17, "A"), ("C", 8, "B"), ("C", 9, "D"), ("D", 9, "A"), ("D", 8, "B"), ("D", 9, "C") ]
possibly because they used strict comparisons (< bzw. >) instead of nonstrict ones (<= bzw. >=). If your name is not in the table and your solution is faster than the naive solution, you might want to check this example.
In addition, five solutions failed on the corner case []: four crashed, one returned False. Strictly speaking, they should be ruled out; but since it's a two-week contest with twice the amount of points, this seems a little harsh for a corner case. Hence, these solutions are be ranked as usual, but the score is divided by 2.
This leaves us with 32+5+2 competitors (normal + buggy for [] + nonstudents).
The first rounds were evaluated with a timeout of 10 s.
Round 1: n = 10
Leaves 31+5+2 participants.
Round 2: n = 15
Leaves 26+5+2 participants.
Round 3: n = 20
Leaves 22+4+1 participants. (The MC's solution was hardly optimized at all,
so this is where he “flies out.”)
Round 4: n = 25
Leaves 17+3+1 participants.
Round 5: n = 30
Leaves 14+3+1 participants.
Round 6: n = 50
Leaves 11+2+1 participants.
At this point the timeout is raised to 20 s.
Round 7: n = 100
Leaves 10+1+1 participants.
Round 8: n = 100, but with 100 repetitions of the table
Leaves 5+0+1 participants.
At this point the timeout is cranked up to 60 s, and the absolute times give the top 5 rankings
Round 9: n = 100
8.03 s Stefan Peter Dirix 10.68 s Florian Dreier 12.59 s Manuel Eberl 13.41 s Maximilian Haslbeck 18.81 s Kevin Yu 42.40 s Daniel Stüwe
Update: Looking at the results with disbelief, Manuel Eberl deduced that the MC ran the programs in ghci. Drastically different results are obtained using ghc -O3. In addition, Max Kirchmeier had sent an alternative IO-monad-based version per email, but the MC forgot to run it. So here are the results once again:
0.19 s Manuel Eberl 0.28 s Stefan Peter Dirix 0.29 s Florian Dreier 0.49 s Kevin Yu 0.92 s Maximilian Haslbeck 1.81 s Daniel Stüwe 23.93 s Maximilian Kirchmeier
And for good measure, with n = 300:
1.32 s Manuel Eberl 3.83 s Stefan Peter Dirix 8.75 s Florian Dreier 11.72 s Kevin Yu 16.47 s Maximilian Haslbeck 17.63 s Daniel Stüwe ∞ s Maximilian Kirchmeier
For the competition, this means Herr Yu and Herr Haslbeck have exchanged position. But the MC is not keen on subtracting points he has already given out, so he's going to give Herr Yu his additional 2 points without subtracting them from anybody else. End of update.
So what do the solutions look like? Let's have a look at Herr Dirix's:
data Node = Node String Int (Map.Map String Integer) deriving (Eq, Ord) type NodeMap = Map.Map String Node isDistanceTableSane :: DistanceTable -> Bool isDistanceTableSane tab = checkTab tab /= Nothing where checkTab tab = do map <- buildMap tab Map.empty checkCompleteness map checkTriangles (Map.elems map) buildMap :: DistanceTable -> NodeMap -> Maybe NodeMap buildMap [] map = Just map buildMap (t:tab) map = let newMap = handleEdge t (Just map) in case newMap of Nothing -> Nothing Just newMap -> buildMap tab newMap addNode :: String -> NodeMap -> NodeMap addNode name map = Map.insert name (Node name 0 Map.empty) map handleEdge :: (String, Integer, String) -> Maybe NodeMap -> Maybe NodeMap handleEdge _ Nothing = Nothing handleEdge (nameA, distance, nameB) (Just map) | distance < 0 = Nothing | nameA == nameB = Nothing | otherwise = do (nodeA, map1) <- getNode nameA map (nodeB, map2) <- getNode nameB map1 addEdge nodeA nodeB distance map2 getNode :: String -> NodeMap -> Maybe (Node, NodeMap) getNode name nodeMap = case node of Nothing -> Just (initNode, Map.insert name initNode nodeMap) Just node -> Just (node, nodeMap) where node = Map.lookup name nodeMap initNode = (Node name 0 Map.empty) checkCompleteness :: NodeMap -> Maybe Bool checkCompleteness nodeMap | allLengthsOk = Just (True) | otherwise = Nothing where nodes = Map.elems nodeMap lengthTarget = length nodes - 1 allLengthsOk = and $ map (\(Node name n distanceMap) -> n==lengthTarget) nodes addEdge :: Node -> Node -> Integer -> NodeMap -> Maybe NodeMap addEdge (Node nameA nA mapA) (Node nameB nB mapB) distance map = do updatedNodeA <- updateNode (Node nameA nA mapA) nameB distance updatedNodeB <- updateNode (Node nameB nB mapB) nameA distance Just (Map.adjust (\_->updatedNodeB) nameB $ Map.adjust (\_->updatedNodeA) nameA map) updateNode :: Node -> String -> Integer -> Maybe Node updateNode (Node name n map) nameB distance = case oldDistance of Nothing -> Just (Node name (n+1) newMap) Just oldDistance -> if (oldDistance /= distance) then Nothing else Just (Node name n map) where (oldDistance, newMap) = Map.insertLookupWithKey (\key oldVal newVal -> newVal) nameB distance map checkTriangles :: [Node] -> Maybe Bool checkTriangles [] = Just (True) checkTriangles (node:nodes) | checkTrianglesForNode node nodes = checkTriangles nodes | otherwise = Nothing checkTrianglesForNode :: Node -> [Node] -> Bool checkTrianglesForNode node nodes = and [checkTrianglesForNodes node nodeB nodeC | nodeB <- nodes, nodeC <- nodes, nodeB < nodeC] checkTrianglesForNodes :: Node -> Node -> Node -> Bool checkTrianglesForNodes (Node nameA nA mapA) (Node nameB nB mapB) (Node nameC nC mapC) = (distAB <= distBC + distAC) && (distBC <= distAB + distAC) && (distAC <= distBC + distAB) where distAB = fromJust $ Map.lookup nameB mapA distBC = fromJust $ Map.lookup nameC mapB distAC = fromJust $ Map.lookup nameC mapA
The code adheres to very nice stylistic conventions, but it's still highly nontrivial to understand how it works. But to quote Frederick Brooks (1975),
Show me your flowcharts and conceal your tables, and I shall continue to be mystified.
Show me your tables, and I won't usually need your flowcharts; they'll be obvious.
Using modern terminology, we would probably phrase it more like this:
Show me your algorithms and conceal your data structures, and I shall continue to be mystified.
Show me your data structures, and I won't usually need your algorithms; they'll be obvious.
Here, the data structures are helpfully introduced at the very top:
data Node = Node String Int (Map.Map String Integer) deriving (Eq, Ord) type NodeMap = Map.Map String Node
A Node represents a row (or column) in the distance table. The first argument is the source city, the second argument the number of cities with which the city is connected, and the third argument is a map from target cities to distances. Perhaps the name DistanceTableRow would have been more illuminating.
Then a NodeMap represents an entire distance table. A map is an associative data structure that stores key–value pairs and that allows us to perform quick lookups on the keys.
Once we understand this much the algorithms are indeed quite natural.
Here's the solution by Herr Dreier:
isDistanceTableSane :: DistanceTable -> Bool isDistanceTableSane table = and [ dis>=0 && a/=b | ((a,b),dis)<-distList ] && vollstaendig && triangle_rule where (froms,dists,tos) = unzip3 table cities_list = nub $ froms++tos -- contains all city names of the table cities_len = length cities_list cities_map = fromList $ zip cities_list [1..cities_len] -- maps city name to index createMapRek :: DistanceTable->Map (Int,Int) Integer->Map (Int,Int) Integer createMapRek [] curMap = curMap createMapRek ((from,dist,to):xs) curMap = if notMember tup curMap then createMapRek xs $ Data.Map.insert tup dist curMap else if (findWithDefault (-1) tup curMap) /= dist then Data.Map.insert tup (-1) curMap else createMapRek xs curMap where a = cities_map ! from b = cities_map ! to tup = if a<b then (a,b) else (b,a) distMap = createMapRek table Data.Map.empty distList = toList $ distMap distance (a,b) | a<b = findWithDefault (-1) (a,b) distMap | otherwise = findWithDefault (-1) (b,a) distMap vollstaendig = length distList == cities_len*(cities_len-1) `div` 2 triangle_rule = and [distance (x,y)+distance (y,z)>=distance (x,z)|x<-[1..cities_len-1],y<-[x+1..cities_len],z<-[1..cities_len],y/=z,x/=z]
This solution isn't quite as nice to read, and one is puzzled by the Mischung aus Englisch and German, but it has one nice refinement, namely that cities are mapped to integer indices. This way, if the problem contains very very long and nearly identical city names, they need not be compared over and over. Indeed, had the MC thought of it, he would have added some tests that generate such ridiculously long names.
Finally, for the record, here's Herr Eberl's solution's text. It uses all sorts of evil tricks which won't be covered here or in class. It also uses a so-called state monad, which explains why he won (virtually):
import Data.Maybe import Control.Monad import Control.Monad.ST import qualified Data.Array.IArray as IA import qualified Data.Array.ST as STA import qualified Data.Array as A import qualified Data.Array.Unsafe as AU import qualified Data.HashTable.ST.Basic as HT -- A simple pairing function for unordered pairs {i,j}. pair :: Integral a => a -> a -> a -> a pair n i j = i' * n + j' where i' = max i j j' = min i j -- Monadic fold over a list that aborts if an intermediate result is "Nothing" foldMAbort :: Monad m => (a -> b -> m (Maybe a)) -> a -> [b] -> m (Maybe a) foldMAbort f σ [] = return (Just σ) foldMAbort f σ (x:xs) = f σ x >>= maybe (return Nothing) (\y -> foldMAbort f y xs) -- Collects the town names from the table and produces a HashTable that maps -- names to numbers from 0 to n-1, where n is the number of towns. Returns -- that HashTable and n. buildNameTable :: [(String, Integer, String)] -> ST s (Maybe (HT.HashTable s String Int, Int, Integer)) buildNameTable distTable = do nameTable <- HT.new let ins town n = HT.lookup nameTable town >>= maybe (HT.insert nameTable town n >> return (n, n + 1)) (\i -> (return (i,n))) n <- foldMAbort (\(n, maxDist) (town1, d, town2) -> if d < 0 then return Nothing -- non-negativity violated else do (i, n') <- ins town1 n (j, n'') <- ins town2 n' let maxDist' = max d maxDist return (if i == j then Nothing else Just (n'', maxDist)) -- check irreflexivity ) (0,0) distTable return (fmap (\(n,maxDist) -> (nameTable, n, maxDist)) n) -- Check whether the given packed distance matrix with the given size -- fulfils the triangle inequality checkTriangleInequality :: (IA.IArray a e, Integral e) => Int -> a Int e -> Bool checkTriangleInequality n distArray = null $ do i <- [0..n-1] -- pick "point of departure" k <- [0..i-1] -- pick "destination" (w.l.o.g. assume i > k) let dik = dist i k j <- [0..n-1] -- pick "detour" guard (i /= j && j /= k) -- assume i ≠ j and j ≠ k (i.e. "proper" detour) guard (dist i j + dist j k < dik) -- assume that the detour is "shorter" -- this produces an empty list iff the triangle property holds where dist i j = distArray IA.! pair n i j -- Generic helper function for isDistanceTableSane that can work on any -- kind of mutable array – boxed and unboxed. genDistanceTableSane distTable nameTable n newArray = do let nEntries = n * (n - 1) `div` 2 -- This array contains a distance matrix, indexed by a clever pair function -- Uninitialised distances are set to -1. distArray <- newArray (0, n * n) (-1) -- get a town's ID by its name let getIndex = liftM fromJust . HT.lookup nameTable -- inserts a distance entry into the array and keeps track of the number -- of unoccupied entries in the matrix let insertDist nEmptyEntries (t1, d_, t2) = do i <- getIndex t1 j <- getIndex t2 let d = fromInteger d_ let k = pair n i j d' <- STA.readArray distArray k STA.writeArray distArray k d if d' < 0 then return (Just (nEmptyEntries - 1)) -- decrease unoccupied else if d == d' then return (Just nEmptyEntries) -- duplicate entry, consistent else return Nothing -- duplicate entry, inconsistent if length distTable < nEntries then -- early abort for distance tables that are too small to be complete return False else do return True -- insert the entries into the matrix nEmptyEntries <- foldMAbort insertDist nEntries distTable -- check whether table is consistent and all entries were present case nEmptyEntries of Nothing -> return False -- inconsistent Just 0 -> liftM (checkTriangleInequality n :: Integral e => A.Array Int e -> Bool) (AU.unsafeFreeze distArray) Just _ -> return False -- not all entries occupied isDistanceTableSane :: [(String, Integer, String)] -> Bool isDistanceTableSane distTable = runST $ do r <- buildNameTable distTable -- build mapping from town to ID case r of Nothing -> return False Just (nameTable, n, maxDist) -> -- If this condition is true, the sum of two distances will be representable -- using a machine-length integer, therefore, we can use unboxed arrays -- with Ints. Otherwise, fall back to boxed array with Integers. if maxDist * 2 <= toInteger (maxBound :: Int) then genDistanceTableSane distTable nameTable n (STA.newArray :: (Int,Int) -> Int -> ST s (STA.STUArray s Int Int)) else genDistanceTableSane distTable nameTable n (STA.newArray :: (Int,Int) -> Integer -> ST s (STA.STArray s Int Integer))
Notice how the program chooses an efficient Int-based implementation if the distances are small enough, falling back on “God's integers” otherwise.
This week, there's only one lesson: Choose your data structures with care.
Actually, there's a second lesson, for those who were rewarded a /2 “bonus”: Remember to test the corner cases! And keep in mind that there are no completeness guarantees associated with the QuickCheck suite run by the submission system.
Third lesson: There is no third lesson.
Third lesson: (Updated:) Run ghc -O3 when benchmarking
programs for speed.
Die Ergebnisse der sechsten Woche
Here the results for the sixth week:
Top 30 der Woche | |||
---|---|---|---|
Platz | Wettbewerber(in) | Tiefe | Punkte |
1. | Julian Biendarra | 4374 | 30 |
2. | Daniel Stüwe | 2916 | 29 |
Simon Roßkopf | 2916 | 29 | |
Hauke Brinkop | 2916 | 29 | |
5. | Maximilian Haslbeck | 2187 | 26 |
6. | Maximilian Kirchmeier | 1296 | 25 |
7. | Atanas Mirchev | 1024 | 24 |
Lyubomir Stoykov | 1024 | 24 | |
9. | Florian Dreier | 972 | 22 |
10. | Roland Schmid | 768 | 21 |
11. | Martin Mihaylov | 576 | 20 |
12. | Stefan Peter Dirix | 129 | 20 |
13. | Johannes Klicpera | 22 | 19 |
14. | Alexander Güssow | 12 | 18 |
15. | Christian Ziegner | 6 | 17 |
16. | Fabian Volz | 5 | 16 |
17. | Axel Fischer | 4 | 15 |
18. | Fabian Schöttl | 3 | 14 |
19. | Hannes Bibel | 2 | 13 |
Johannes Ismair | 2 | 13 | |
Arved Baus | 2 | 13 | |
Frédéric Simonis | 2 | 13 | |
Julia Kindelsberger | 2 | 13 | |
24. | David Otter | 2187 | 8 |
25. | Barbara Moser | 2 | 7 |
Max-Philipp Schrader | 2 | 7 | |
Thomas Zwickl | 2 | 7 |
This week, the MC is away to learn a language even more arcane than some of the submissions to the Wettbewerb. The friendly takeover of the Co-MC brings also brings a change to the rules: Usually, we want the job done, as fast and cheap as possible. This time, we were more explorative: How far can we get with a fixed budget?
In this round, we got 87 submissions participated. The Co-MC's mood got a bit somber as it turned out that 56 of these were not even trying. Only index1 and more than 30 tokens to spare? Where is the scholar's spirit, your will to boldly go where no one has gone before? "Vun nix kütt nix" wie der Kölsche sagt, so this leaves 31 programs.
As is customary, each candidate has to survive a few rounds of "Ene, mene, muh und raus bist du" before assuming her or his rightful place in the weekly wall of fame. Early on, two hopeful aspirants got left behind, as the alert Guard of the Haskell Code (GHC) spotted some inaccuracies in their code. Remember, the GHC is willing to give your code as many chances as you need, but once you submit, its rejection is final! This is particulary annoying to the participants, as in both cases, the error seemed to be a simple mispelling of a function name.
It is said that looking into the eyes of a man (or woman, for that matter) gives you a glimpse of the character of said person. Similarly, looking at the types of a program, tells you a whole lot of what this program is about. But as you can get lost in the deep eyes of your companion on a lovely Summer's day, you can get lost in the types of a program in the faint light of an LCD, in particular if they are hundreds or thousands of characters long. Committed to be impartial, the Co-MC cannot afford to get lost, so he makes a Faustian pact with the occult and type-less forces of the Unix to unearth "des Types Kern":
printf ':t %s\n' "$fun" | ghci round5/$student.hs | awk 'BEGIN { pr = 0 }; /^\*Main>/ { pr = 1 - pr; }; { if (pr == 1) print }' | tr '\n' ' ' | tr -s ' ' | sed -e 's/^.*:: *//' | tee round5/types/$student | sed -e 's/^.*=> *//; s/\[.\]/\[a\]/; s/Integer/I/g; s/Int/I/g; s/t1/I/g; s/[cbt]/I/g' | tr -d ' ' | tee round5/types-simp/$student | tr -s '[]' | tee round5/types-norm/$student > /dev/null
By these incantations, a correct submission's type is normalized to [a]->I->([I],I). Unfortunately, one unhappy soul get lost in size of his type and submitted a function with normalized type of (a->I->(I,I))->[a]->I->([I],I). This leaves 28 candidates and it breaks the Co-MCs heart to see that the Top-30 won't be a Top-30 this time.
Even if you correctly judged the type -- Who knows what evil lurks in the hearts of (wo)men? Hence, the Co-MC tests each program on the empty list, and again, two climbers onto the steep rock of fame lose their grip on this treacherous terrain. Apparently, nobody expects the spanish inqui^W^Wthe empty list. After this trivial test-case, the Co-MC started doing more in-depth functional tests, but was stopped short already by the test
indexN [ [..["a","b"]..], [..["c,d"]..] ] 0 == [ [..[0,1]..], [..[2,3]..] ]
where [..[ and ]..] stand for N-1 opening resp. closing brackets. His puny laptop ground to a halt as the GHC desperately tried to grasp the deeply nested types of the top contenders. Only switching to a bigger machine allowed him to test all the programs. Satisfied that all remaining programs passed this test, he set out to rank the survivors.
The ranking criteria is simple: Just compare the nesting depth. And of course, there was the limit of 60 tokens. But seeing the field of competitors so sparse, the Co-MC decided to award points even to those who blew the budget; setting them at the end of this weeks ranking. Again, a bit of black magic sets the contrast to Haskell's beautiful structure:
for file in ../candidates-3/*@*; do user=${file##*/} normsize=$(stat -c '%s' ../types-norm/$user) simpsize=$(stat -c '%s' ../types-simp/$user) depth=$((($simpsize - $normsize) / 4 + 1)) tokens=$(grep $user ../TOKENS | cut -d' ' -f1) [ "$tokens" -le 60 ] && echo $depth $user done | sort -nr > rank for file in ../candidates-3/*@*; do user=${file##*/} normsize=$(stat -c '%s' ../types-norm/$user) simpsize=$(stat -c '%s' ../types-simp/$user) depth=$((($simpsize - $normsize) / 4 + 1)) tokens=$(grep $user ../TOKENS | cut -d' ' -f1) [ "$tokens" -gt 60 ] && echo $depth $user done | sort -nr >> rank
Let's have a look at this week's competitors: Almost all recognized that the function foldlMap from exercises can be put to good use here:
index1 as j = ([j .. (j+(length as)-1)], j+length as) index2 as j = foldlMap index1 as j
or
index1 = foldlMap (\a i->(i,i+1)) index2 = foldlMap index1 index3 = foldlMap index2 index4 = foldlMap index3
This is the key insight for this exercise and also the one which is really useful outside of this competition. It makes one wonder, why both solutions stopped, having still a number of tokens available -- with the same schema, one could achieve a nesting depth of 12. Leaving out the intermediary functions (the MC just asked for the deepest possible nesting, so the nestings in between are not necessary), one can get up to depth 22:
index = foldlMap (foldlMap $ foldlMap $ foldlMap $ foldlMap $ foldlMap $ foldlMap $ foldlMap $ foldlMap $ foldlMap $ foldlMap $ foldlMap $ foldlMap $ foldlMap $ foldlMap $ foldlMap $ foldlMap $ foldlMap $ foldlMap $ foldlMap $ foldlMap $ foldlMap (\_ y -> (y,y+1)))
This looks like it exhausted foldlMap pretty well, doesn't it? But this is only linear behaviour, there is a way to grow exponentially, leading us to depth 1024.
f a b = (b, succ b) foldlMap_4 = foldlMap . foldlMap . foldlMap . foldlMap foldlMap_16 = foldlMap_4 . foldlMap_4 . foldlMap_4 . foldlMap_4 foldlMap_64 = foldlMap_16 . foldlMap_16 . foldlMap_16 . foldlMap_16 foldlMap_256 = foldlMap_64 . foldlMap_64 . foldlMap_64 . foldlMap_64 foldlMap_1024 = foldlMap_256 . foldlMap_256 . foldlMap_256 . foldlMap_256 index1024 = foldlMap_1024 f
Quite neat, isn't it? Can we get better? Yes, but the not by an order of magnitude. Hauke Brinkop gives credit to his partner and notes:
{- - Zusammen mit Stüwe - Dass immer 3 calls pro zeile sind liegt daran, dass - a**b, 2ab + b <= 42 - bei a = 3, b = 6 ein lokales Maximum besitzt -}
So packing three calls in one definition is better than four, and leads them and a third competitor up to a depth of 2916, securing them a nice second place. Careful tuning of the same idea brings us to this weeks winner, Julian Biendarra. He manages a whooping 4374 levels of nesting:
index3N = foldlMap . foldlMap . foldlMap index9N = index3N . index3N . index3N index27N = index9N . index9N . index9N index81N = index27N . index27N . index27N index243N = index81N . index81N . index81N index729N = index243N . index243N . index243N index2187N = index729N . index729N . index729N index4374 = index2187N . index2187N $ const $ id &&& (succ :: Int -> Int)
An interesting tidbit is the (&&&) operation. While it is much more general than that, it is used here to combine the two functions id and succ to a function \x -> (id x, succ x).
So 4374 is the best we can do? While he employed the same scheme, Simon Roßkopf feels that this shouldn't be the end:
I do not really like this solution because of all the helper functions basically contain the same code. I am pretty sure using template haskell or the trick described in this thread (http://stackoverflow.com/questions/5994051/is-there-a-function-to-flatten-a-nested-list-of-elements) it would be possible to construct some function to do the job for arbitrary deep nested lists. But as both methods are not standard haskell(and more importantly, I do not understand them) I decided not to use them.
And indeed, this feeling is right. Again, two of our informal participants, who have the advantage of age, discovered that it is possible not only to go over 9000, but even beyond, up to an arbitrarily deep nesting. Manual Eberl apparently strives towards teaching, so he provided us with a nice write-up of his solution. On the other hand, Lorenz Panny seems to like living on the edge, digging deep into the dark pits of unsafe and unportable behaviour. Do not try this at home or in the exam!
idx :: Int -> [a] -> Int -> ([b], Int) idx 0 = unsafeCoerce const $ id &&& (succ :: Int -> Int) idx n = unsafeCoerce foldlMap $ idx $ pred n index1 :: [a] -> Int -> ([Int], Int) index1 = idx 1 index2 :: [[a]] -> Int -> ([[Int]], Int) index2 = idx 2 index539 :: [[...[[a]]...]] -> Int -> ([[...[[Int]]...]], Int) index539 = idx 539
So, what can we learn from this week?
- Try! This week, even a nesting depth of 2 guaranteed you a place in the TOP-30!
- Between all this "höher, schneller, weiter", one learns bits of Haskell for real use
- Losing due to a small typo is bad. Compile your code!
The Master of Competition's Addendum to Week 6
こんにちは。This is the real MC speaking. Co-MC Noschinski was graceful enough to pass over the MC's official solution, which can be found in Sol_Exercise_6.hs and is reproduced below:
foldlMap4 = foldlMap . foldlMap . foldlMap . foldlMap foldlMap16 = foldlMap4 . foldlMap4 . foldlMap4 . foldlMap4 foldlMap48 = foldlMap16 . foldlMap16 . foldlMap16 foldlMap144 = foldlMap48 . foldlMap48 . foldlMap48 index1 xs j = (enumFromTo j $ pred j', j') where j' = j + length xs index432 = foldlMap144 $ foldlMap144 $ foldlMap144 index1
Note that index432 is a misnomer: This one goes to 433. That leaves the MC at place 11, or 13 if we count Herr Eberl and Herr Panny. Not so impressive!
The reason the MC is bringing this up is that there is an important lesson to learn from his (relative) failure. We computer scientists are successful because we are experts in the Noble Art of Reducing Big Problems to Small Problems: problems of size n to problems of size n − 1; problems of size 2 to problems of size 1; problems of size 1 to problems of size 0. The key to nearly all nontrivial programming is to handle the base case(s) explicitly and use recursion (in Haskell) or iteration (in Java) to perform the reduction.
Now, this reduction presupposes that we can correctly identify the base cases. The MC's mistake, as it were, was to consider the “n = 1” case as the base case, when it's much more convenient to look at “n = 0.” In other words, instead of starting with
index1 :: [a] -> Int -> ([Int], Int) index1 xs j = (enumFromTo j $ pred j', j') where j' = j + length xs
it would have made much more sense to take
index0 :: a -> Int -> (Int, Int) index0 x j = (j, succ j)
as the starting point—like most of the top competitors did. If index1 is desired (which is not the case for the Wettbewerb), it can easily be obtained as follows:
index1 :: [a] -> Int -> ([Int], Int) index1 = foldlMap index0
This brings us to our fourth lesson:
- When identifying base cases, think small!
This will save you tokens aplenty, potential bugs, and certain embarrassment.
Die Ergebnisse der siebten Woche
The results for week 7 are in!
Top 30 der Woche | |||
---|---|---|---|
Platz | Wettbewerber(in) | Punkte | |
1. | Simon Roßkopf | 30 | |
2. | Stefan Peter Dirix | 29 | |
3. | Maximilian Haslbeck | 28 | |
4. | Moritz Sichert | 27 | |
5. | Lyubomir Stoykov | 26 | |
6. | Hauke Brinkop | 25 | |
Daniel Stüwe | 25 | ||
8. | Florian Dreier | 23 | |
9. | Atanas Mirchev | 22 | |
10. | Maximilian Kirchmeier | 21 | |
11. | Ludwig Peuckert | 20 | |
Christian Ziegner | 20 | ||
13. | David Otter | 18 | |
Kevin Falkenstein | 18 | ||
Stefan Hörl | 18 | ||
16. | Alexander Christian Hefele | 15 | |
Alexander Weidinger | 15 | ||
Alexandra Fritzen | 15 | ||
Arved Baus | 15 | ||
Benjamin Holzschuh | 15 | ||
Adrian Thilo | 15 | ||
Jens Wöhrle | 15 | ||
Julian Biendarra | 15 | ||
Konrad Weiss | 15 | ||
Maximilian Strobel | 15 | ||
Michael Schreier | 15 | ||
Nikita Basargin | 15 | ||
Thomas Zwickl | 15 | ||
Valentin Zieglmeier | 15 | ||
30. | Fabian Kovacs | 1 |
We see many of the usual suspects at the top. In particular, Herr Roßkopf is slowly climbing back to the top of the Top 30 des Semesters, after having lost half of his points in the fourth/five weeks due to a failure to handle the [] case correctly. (Trust the MC: He's unlikely to forget the [] case again anytime soon!)
This week, 50 solutions passed the official QuickCheck tests and had the {-WETT-}...{-TTEW-} tags. In addition, 6 meta-solutions are considered: one from Manuel Eberl (tutor emeritus), Andreas Bergmaier (the 5th place from last year, but #2 in the MC's heart), Maximilian Kirchmeier (a competitor this year who wanted to double his chances of winning), plus three solutions by the MC, called Naive, I, and II. The naive solution is as in Sol_Exercise_7.hs:
quasiIdentical :: String -> String -> Bool quasiIdentical [] [] = True quasiIdentical (c : cs) (d : ds) = (c == d && quasiIdentical cs ds) || cs == ds quasiIdentical _ _ = False fixTypo :: [String] -> String -> String fixTypo vocabs word = if length quasis == 1 then head quasis else word where quasis = List.nub (List.filter (quasiIdentical word) vocabs) fixTypos :: [String] -> [String] -> [String] fixTypos = map . fixTypo
A good starting point for ranking the solutions is this snippet from the problem statement:
Exercise_7> import Data.List Exercise_7> import Data.Hashable Exercise_7> let vocabs = take 10000 (permutations "abcdefghi") Exercise_7> let text = take 100000 (permutations "abcxefgih") Exercise_7> :set +s Exercise_7> hash (fixTypos vocabs text) 3652546476494351287 (3.65 secs, 1029321328 bytes)
A few competitors noticed that the hash value is wrong! Indeed, the MC changed the problem before publishing the Aufgabenblatt and neglected to update the hash. Also, it has been observed that the hash function behaves differently on different machines, perhaps due to different versions of Haskell. On the MC's MacBook Pro, the correct hash is 5025742898364772150. Also, the 3.65 s taken by this solution can be drastically reduced by compiling the sources using ghc -O3, which is done for the evaluation below.
All right, enough blah blah, let's get started! All engines go!
Round 1: The above program with 1000 vocables (instead of 10000), 10000 words of text (instead of 100000), and a generous 60 second timeout.
A bug was found in one solution (by somebody with the initials M.B.), and seven solutions timed out. This leaves 42 solutions plus 6 metasolutions. It is quite frankly a mystery to the MC how these solutions could time out, given that the naive solution took less than 1 s. These competitors should take a second look at their brainchildren.
Round 2: 1000 vocables, 100000 words, 20 seconds.
We lost 12 competitors this time. This leaves us with 30+6 (meta)solutions. We have our Top 30 already!
Round 3: 10000 vocables, 10000 words, 20 seconds.
We decupled the number of vocables but reduced the text length accordingly. Most solutions exhibited similar run-time behavior, but one timed out. Its owner is rewarded by 1 point in the table. As the Germans say: Schade, schade, Marmelade! (Ah, the MC and his Schadenfreude!)
Round 4: 5000 vocables, 50000 words, 10 seconds.
At this point, 14 solutions “fly out” (through the window, presumably). Incredibly, we still have the MC's naive solution, which took 6.57 s. This leaves us with 15+6 (meta)solutions.
Round 5: To make the setting less artificial, the MC took the first 50000 words from /usr/share/dict/web2a on his machine, removing all non-a-z characters. As the text, he simply used Data.List.subsequences "abcdefghijkl", which gives 4096 words such as abel, beg, and chi (and plenty of nonsense like aefikl and bdghk). Timeout: 60 s.
Here Naive took 8.76 s. The MC decided that he had enough and kicked it out as well as the two solutions that were slower. Notice how this protocol reveals many highly unscientific behaviors from the MC's part. This reflects a fundamental difference between genuine scientific experiments and informal competitions for €5 trophies picked up at Nanu-Nana. By the same token, this pseudo-blog (pseudo, because there's no way to enter comments at the bottom of this page) is written in a style that is, to say the least, not very scientific. It's Dr. Jekyll and Mr. Hyde all over again. Ha ha ha ha ha HA HA HA HA HA!!
We are left with 10+5 (meta)solutions.
Round 6: Like round 5, but each vocable and each word w is made four times longer by taking w ++ w ++ w ++ w. This is enough to considerably slow down some solutions, while others are doing just fine:
0.28 s Andreas Bergmaier 0.45 s Simon Roßkopf 0.51 s Stefan Peter Dirix 0.62 s Master of Competition II 0.85 s Maximilian Haslbeck ------------------------------------------------ 6.22 s Moritz Sichert 13.40 s Lyubomir Stoykov 14.67 s Hauke Brinkop 14.69 s Master of Competition I 14.96 s Daniel Stüwe 17.22 s Florian Dreier 17.73 s Atanas Mirchev 18.09 s Maximilian Kirchmeier I 19.38 s Manuel Eberl 31.88 s Maximilian Kirchmeier II
We got our Top 10. Here is the solution by our official winner, Herr Roßkopf:
--Map version --Awfully slow(Around 10 secs on my laptop), not sure if it works, --but it exists and that is all required right now. --Update: Compiling with O3 seems to reduce the runtime to about 1s -- --The hash tests produces different results for some reason. --I cannot find any mistake however... Let's give it a shot --I build a 'prefix-tree' from the vocabs list to avoid checking all --possibiilities of all words each time. It is certainly not optimal --as I overlooked some cases while planning this structure and worked --and quickfixed to insert them... data VocabTree = Node Bool (M.Map Char (VocabTree)) deriving (Show) emptyVT = Node False M.empty buildVocabTree :: [String] -> VocabTree buildVocabTree = foldl insertWord emptyVT insertWord :: VocabTree -> String -> VocabTree insertWord (Node _ as) [] = (Node True as) insertWord (Node end as) (w:ws) = case M.lookup w as of Nothing -> Node end $ M.insert w (insertWord emptyVT ws) as Just vt -> Node end $ M.insert w (insertWord vt ws) as fixTypos :: [String] -> [String] -> [String] fixTypos vocabs text = map (flip fixWord vocabTree) text where vocabTree = buildVocabTree vocabs fixWord :: String -> VocabTree -> String fixWord ws vt = if b == 1 then a else ws where (a, b) = goTillMistake ws vt goTillMistake :: String -> VocabTree -> (String, Int) goTillMistake [] (Node True _) = ([], 1) goTillMistake [] (Node False _) = ([], 0) goTillMistake (w:ws) (Node _ as) = (a, b) where (a, b) = if numPossibilities == 1 then head possibilities else ([], numPossibilities) possibilities = filter (\x -> snd x > 0) $ M.elems $ M.mapWithKey (checkNextStep) as numPossibilities = sum $ map snd possibilities checkNextStep key val | key == w = (w:a, b) | otherwise = (key:a', b') where (a, b) = goTillMistake ws val (a', b') = goFromFirstMistake ws val goFromFirstMistake :: String -> VocabTree -> (String, Int) goFromFirstMistake [] (Node True _) = ([], 1) goFromFirstMistake [] (Node False _) = ([], 0) goFromFirstMistake (w:ws) (Node _ as) = case M.lookup w as of Nothing -> ([], 0) Just vt -> helper w vt where helper w vt = (w:a, b) where (a, b) = goFromFirstMistake ws vt --Some quickChecks propFixTypos_lengthOk vocabs text = length text == length (fixTypos vocabs text) propFixTypos_identity vocabs = fixTypos vocabs vocabs == vocabs propFixTypos_wordLength vocabs text = and $ zipWith (\x y -> length x == length y) text $ fixTypos vocabs text
This solution is very similar to MC II. It builds a tree storing all the vocables. The root node indicates whether the empty word is part of the dictionary. Each node has up to 26 child nodes (subtrees), one for each letter of the alphabet. For example, the tree
Node True { 'a' |-> Node True { 'n' |-> Node True {}, 's' |-> Node True {} }, 'b' |-> Node False { 'e' |-> Node True {} } }
(using the meta-notation {k1 |-> v1, ..., kN |-> vN} to denote the map that associates key k1 to value v1 etc.) represents the dictionary [a, an, as, be]. The Boolean False for the node associated with b indicates that b (unlike a) is not part of the dictionary.
The lookup function on this data structure is a bit special because it needs to take potential typos into account. This is reflected in Herr Roßkopf's code by the pair of functions goTillMistake and goFromFirstMistake. Let's be crazy and try to look up the word ae:
We can either take the a branch of the root node (and continue with goTillMistake) or accept that the a is a typo and take the b branch (and continue with goFromFirstMistake). We do both and see where this leads us. In the a branch, we proceed recursively and take the n and s branches. At this point, we've consumed the entire word and the flag is True in both leaf nodes, so an and as are part of the answer. In the b branch, we have already accounted for one typo, so we do a standard lookup of the remaining suffix (e), yielding be. Hence, conceptually, the result of the lookup is the set {an, as, be}.
Since this set is not a singleton set, the typo in ae won't be corrected.
Herr Bergmaier's, Herr Dirik's, and Herr Haslbeck's solutions are conceptially very similar. The MC II solution is a bit different in that it uses a 26-element vector instead of a map. The map makes more sense if the tree is sparsely populated, as is the case when building it with actual English words. In contrast, the vector tends to be superior for more exhaustive dictionaries, such as the permutation-based example from the problem statement.
The next group of solutions—those that tend to be about 10 times slower than the first group—are solutions that generated “patterns” or “substitution masks” for all vocables and stored them in the dictionary. For example:
type SubsMask = String {- produce substition masks for the given string e.g. subsMasks "asdf" == ["\0sdf", "a\0df", "as\0f", "asd\0"] -} subsMasks :: String -> [SubsMask] subsMasks "" = [] subsMasks s = map (\i -> let (a, _:b) = splitAt i s in a ++ ('\0':b)) [0 .. length s - 1] type SubsTable s = HashTable s SubsMask (Maybe String)
This group includes Herr Brinkop, Herr Dreier, Herr Kirchmeier (I and II), Herr Stüwe, and MC I. That Brinkop and Stüwe would land in the same group should not come as a surprise, since they form the so-called Team Nord, whatever that is meant to mean. (Reminder: Gruppenarbeit is durchaus erlaubt.)
These solutions are slower for longer words, because a word of length n gives rise to n patterns of length n, i.e. n^{2}. This quadractic (^{2}) component becomes very expensive as we increase n, which is exactly what the MC did by taking w ++ w ++ w ++ w. The dictionary gets more entries and the number of lookup gets larger as we lengthen the words. In contrast, the solutions from the first group are almost as fast for the quadrupled words as for the original ones.
Here's a representative implementation from that group:
type TypoHash = Map String (Set String) patternsOfWord :: String -> [String] patternsOfWord [] = [] patternsOfWord (c : cs) = ('?' : cs) : List.map (c :) (patternsOfWord cs) insertPattern :: String -> TypoHash -> String -> TypoHash insertPattern word hash pattern = Map.insertWith Set.union pattern (Set.singleton word) hash insertWord :: TypoHash -> String -> TypoHash insertWord hash word = List.foldl (insertPattern word) hash (patternsOfWord word) findPattern :: TypoHash -> String -> Set String findPattern hash pattern = Map.findWithDefault Set.empty pattern hash fixTypo :: TypoHash -> String -> String fixTypo hash word = if numWords == 1 then head (Set.elems wordSet) else word where patterns = patternsOfWord word wordSets = List.map (findPattern hash) patterns wordSet = Set.unions wordSets numWords = Set.size wordSet fixTypos :: [String] -> [String] -> [String] fixTypos vocabs = List.map (fixTypo hash) where hash = List.foldl insertWord Map.empty vocabs
A slight variation, implemented by Herr Eberl, Herr Mirchev, Herr Sichert, and Herr Stoykov, is to represent a pattern as a pair (String, Integer). This has the advantage that we do not have to set aside a special character (e.g., \0 or ?). To quote Herr Eberl's code:
{- Returns all lists that can be obtained from the given list by deleting exactly one character, and the position of the element that was deleted. -} deleteOne :: [a] -> [([a], Integer)] deleteOne xs = zip (deleteOne' [] xs) [0..] where deleteOne' _ [] = [] deleteOne' xs (y:ys) = (xs ++ ys) : deleteOne' (xs++[y]) ys {- The key insight here is that two words w₁, w₂ are quasi-identical iff they can be made the same by deleting the i-th character of w₁ and w₂ for some i. Using this, we can now simply keep a table mapping all correct words with one character deleted (tagged with the position from which it was deleted) to the original word and we can find all correction candidates for a misspelt word by 1. deleting the i-th character (for any i) 2. performing a lookup in our table -}
Herr Sichert's solution is faster than the others in that group. He uses a couple of further tricks. Since his code is very nicely documented, it is worth quoting it in full:
fixTypos :: [String] -> [String] -> [String] fixTypos vocabs = map findVocab where -- vocabMap ist ein Baum mit der Länge der Wörter als key und -- den dazugehörigen vocabCharsList als values: -- 1 -> vocabCharsList für vocabs der Länge 1 -- 2 -> vocabCharsList für vocabs der Länge 2 -- ... vocabMap = M.fromAscList $ map ((\ vs -> (length $ head vs, vocabCharsList vs)) . sortedNub . sort) . groupBy (\ a b -> length a == length b) . sortBy (comparing length) $ vocabs -- vocabCharsList ist eine Liste mit allen vocabCharsMaps, wobei bei jeder -- vocab jeder Buchstabe jeweils einmal weggelassen wird. Dies wird als key -- verwendet. Der weggelassene Buchstabe wird als value verwendet. -- Beispiel vocabCharsList für vocabs der Länge 4 mit den vocabs "abcd" und "efgh": -- [ vocabCharsMap von [("bcd", 'a'), ("fgh", 'e')], -- vocabCharsMap von [("acd", 'b'), ("egh", 'f')], -- vocabCharsMap von [("abd", 'c'), ("efh", 'g')], -- vocabCharsMap von [("abc", 'd'), ("efg", 'h')] ] vocabCharsList vs = map vocabCharsMap [map (\ v -> (deleteIndex i v, v !! i)) vs | i <- [0 .. length (head vs) - 1]] -- Erstellt ein Map aus den Werten, die von vocabCharsList übergeben wurden, -- um effizient darauf zugreifen zu können. -- Da Wörter, für die es mehrere mögliche Ersetzungen gibt, nicht ersetzt werden -- sollen, kann man hier direkt schon alle rausschmeißen, bei denen mehrere -- Lösungen möglich wären. Das erledigen groupBy und filter. vocabCharsMap vs = M.fromDistinctAscList $ concat . filter (\ xs -> length xs == 1) . groupBy (\ a b -> fst a == fst b) . sortBy (comparing fst) $ vs -- Sucht für ein Wort w eine mögliche Ersetzung. -- Wenn kein Map aus vocabMap mit der richtigen Wortlänge gefunden wird, -- wird sofort das Wort selbst zurückgeliefert, ansonsten wird mit der -- "richtigen" vocabCharsList weiter gesucht. findVocab w | isNothing mapOfLengths = w | otherwise = fromMaybe w $ lookupWord w $ fromJust mapOfLengths where mapOfLengths = M.lookup (length w) vocabMap -- Erhält ein Wort w und eine vocabCharsList ms und sucht darin eine gültige -- Ersetzung. -- suitableWords enthält zunächts alle vocabs, die als Ersetzung in Frage kommen. -- Dazu wird von w jeder Buchstabe jeweils einmal weggenommen und damit in den -- dazugehörigen Maps gesucht. Wenn dabei nichts rauskommt, als suitableWords -- leer ist, gibt es keine Ersetzung. -- Dann wird überprüft ob es mehrere Ersetzungen gibt. Wenn ja, soll laut -- Aufgabenstellung nicht ersetzt werden. -- Ansonsten wird die gefunden Ersetzung verwendet. lookupWord w ms | null suitableWords = Nothing | length replacements /= 1 = Nothing | otherwise = Just $ head replacements where suitableWords = map (\ (i, v) -> replaceIndex i (fromJust v) w) $ filter (isJust . snd) [(i, M.lookup (deleteIndex i w) m) | (i, m) <- zip [0..] ms] replacements = filter (\v -> isValidReplacement v w) suitableWords -- Prüft, ob eine vocab v eine gültige Korrektur für ein Wort w ist. -- Es wird davon ausgegangen, dass length v == length w isValidReplacement v w = sum [1 | (vc, wc) <- zip v w, vc /= wc] == 1 -- Wie nub aus Data.List, nur geht davon aus, dass die Liste sortiert ist. -- Hat den Vorteil, dass die Laufzeit (inklusive Sortieren) O(n log n) ist, -- anstatt O(n²), wie bei Data.List.nub. sortedNub :: Eq a => [a] -> [a] sortedNub [] = [] sortedNub (x:[]) = x:[] sortedNub (x:y:xs) | x == y = sortedNub (x:xs) | otherwise = x:sortedNub (y:xs) -- Entfernt aus einer Liste xs den i-ten Eintrag und gibt sie dann wieder zurück. -- z.B. deleteIndex 1 "abc" == "ac" deleteIndex :: Int -> [a] -> [a] deleteIndex i xs = take i xs ++ drop (i+1) xs -- Ersetzt den i-ten Eintrag von xs mit x. -- z.B. replaceIndex 1 'x' "abc" == "axc" replaceIndex :: Int -> a -> [a] -> [a] replaceIndex i x xs = take i xs ++ [x] ++ drop (i+1) xs
(Strangely enough, Herr Sichert didn't feel like using the German word Vokabel in his documentation.)
Wow, this was an exciting week! The MC was a bit disappointed to see that his naive solution outperformed so many other solutions, but he was glad to see that there were so many (apparently) bug-free programs.
A week without some lessons wouldn't be a week. So here are they:
The first lesson is that It is possible to achieve quite respectable performance without fancy optimizations. This was illustrated by the MC's naive solution, which would have landed at place 11, but more convincingly by the authors of solutions revolving around patterns or substitution masks.
The second, complementary lesson is that We have to keep an eye on computational complexity. As the MC increased the length of the words, we saw the gap grow at an alarming rate between the solutions that were linear (n) in the word length and those that were quadratic (n^{2}).
Die Ergebnisse der achten Woche
The results for week 8 are in!
Top 30 der Woche | |||
---|---|---|---|
Platz | Wettbewerber(in) | Punkte | |
1. | Stefan Peter Dirix | 30 | |
2. | Ludwig Peuckert | 29 | |
3. | David Otter | 28 | |
4. | Clemens Jonischkeit | 27 | |
5. | Maximilian Kirchmeier | 26 | |
6. | Daniel Stüwe | 25 | |
Hauke Brinkop | 25 | ||
8. | Atanas Mirchev | 23 | |
9. | Florian Dreier | 22 | |
Simon Roßkopf | 22 | ||
10. | Martin Helfrich | 21 | |
11. | Alexander Christian Hefele | 20 | |
12. | Michael Benedikt Schwarz | 19 | |
13. | Maximilian Haslbeck | 18 | |
14. | Martin Hartmond | 17 | |
Lyubomir Stoykov | 17 | ||
Johannes Klicpera | 17 | ||
Lukas Michael Stumberg | 17 | ||
Nikita Basargin | 17 | ||
Stefan Hörl | 17 | ||
20. | Alexander Weidinger | 11 | |
Hannes Bibel | 11 | ||
Christian Ziegner | 11 | ||
Konstantin Bauer | 11 | ||
Fabian Volz | 11 | ||
Johannes Ismair | 11 | ||
Michael Schreier | 11 |
This week, we had 43 submissions which had the official {-WETT-} tags. Since the MC is away in a multi-million city you have probably never heard of – busy conferencing – and the Co-MC took some days off – when he left on Wednesday, he was mumbling something about painting his apartment –, yet another figure in this never-ending saga of the "Wettbewerb" appears: The Contra-MC. But fear not, for he shall be faithful to the ideals of the Great MC.
Of the 43 submissions, one did not typecheck, three produced wrong results for
the examples given in the exercise sheet, and two failed at runtime with
non-exhaustive pattern matches. The Contra-MC loves correctness and hates
non-exhaustive pattern matches, hence these were removed in the zeroth round.
Additionally, there were two submissions by Manuel E. M. Eberl.
He also considered the Musterlösung, leaving us at 37+3 solutions.
Let's first reproduce the Musterlösung here:
matches :: [RegEx] -> String -> Bool matches [] s = null s matches (Any : _) [] = False matches (Any : rs) (_ : cs) = matches rs cs matches (One _ : rs) [] = False matches (One d : rs) (c : cs) = c == d && matches rs cs matches (OneIn _ : _) [] = False matches (OneIn [] : _) _ = False matches (OneIn ((l, u) : lus) : rs) (c : cs) = if l <= c && c <= u then matches rs cs else matches (OneIn lus : rs) (c : cs) matches (Concat r r' : rs) cs = matches (r : r' : rs) cs matches (Alt r r' : rs) cs = matches (r : rs) cs || matches (r' : rs) cs matches (Repeat _ : _) [] = False matches (Repeat r : rs) cs = matches (r : rs) cs || matches (r : Repeat r : rs) cs match :: RegEx -> String -> Bool match r = matches [r]
The basic idea is that the first argument in matches carries a list of still-to-be-matched regular expressions. Surprisingly, this performs pretty well.
Alright, let's start the actual measurements. In round 1, the Contra-MC noticed that there were still some correctness issues left. Nine more solutions did wrong matches on a regular expression which was supposed to be a test for performance, and two threw runtime errors.
To be fair, one of the runtime errors was introduced by the Contra-MC himself, because one contestant wrote code like if x then y without an else branch, so he just added else undefined in the hope that This Will Never Be Executed™. The other offenders with runtime errors have no excuse, though. Their names and addresses have been given to the Supreme Court for Totality Enforcement where they will face justice. Did the Contra-MC already mention that he hates partiality?
This leaves us with 26+3 solutions, hence there will just be a Top 26 this week.
Anyway, what is this mysterious regular expression which is supposed to do the performance testing? Ideally, it should force the naive implementations to backtrack a lot on a suitably contrived input. The trick is to construct a regular expression with lots of alternatives, where the correct choice is only determined at the very end of the input. Hence, we obtain (a|aa|…|a^{n})a^{n}, and test it against the words a^{n+1}, a^{2n}, a^{2n+1} and a^{2n}b, where the former two match, and the latter two don't. The Contra-MC did these measurements with increasing n, starting with 30 and going up to 300.
In the first round, programs were run with a time limit. 14 submissions were ruled out. After that, the Contra-MC varied the associativity of the concatenation and alternative operators in the regular expression. Two more solutions were lost in this process, leaving us at 10+3 solutions. Note that the quick-and-dirty solution, written by the MC himself, is still doing well here.
Let's have a look at a ranking of the remaining contestants at that point:
< 0.01 s Ludwig Peuckert < 0.01 s David Otter < 0.01 s Daniel Stüwe < 0.01 s Maximilian Kirchmeier < 0.01 s Hauke Brinkop < 0.01 s Clemens Jonischkeit < 0.01 s Stefan Peter Dirix < 0.01 s MC 0.11 s Atanas Mirchev 0.14 s Manuel Eberl II 0.56 s Florian Dreier 4.03 s Martin Helfrich 4.37 s Manuel Eberl I
All right. At this point it makes sense to peek into the implementations to see what different ideas are being used. Some of the sources are very disturbing, since the authors clearly exhibit severe symptoms of the Wettbewerbssyndrom:
-- Dear you, who reads this, i can bearly sleep, -- the last weeks had been really tough. Because of my strong addiction to Haschkell -- i tried cold detoxification. My Hands are trembling, i can't take it any more. -- I need to get my fix. I AM BACK.
The Contra-MC isn't sure what Haschkell is supposed to be, but guesses that it must be a schwäbische Version of Haskell, with keywords like inschtanz and wenn–dann–sonscht. In any case, should you also experience Haskell addiction: the Haskell Anonymous support group in Munich gathers monthly.
Anyway, the Contra-MC was able to group the submissions into four meta-solutions:
- Thompson's algorithm for generating a finite automaton from a regular expression. This one is used in multiple different flavours.
- Compiling the regular expression to a VM – a technique hitherto
unknown to the Contra-MC – which apparently is also based on Thompson's
algorithm. This is used by Team
FischkoppNord. - The nondeterministic approach, which in one way or another produces a (lazy) list of all matches and checks whether this list is empty at the end.
- A variation of the above, where matches for concatenation are only considered on subwords of some pre-computed maximum length. Unfortunately, this seems to cost more than it actually helps.
The implementations for the nondeterministic approach are pretty short, so let's have a look at one of them.
match :: RegEx -> String -> Bool match regEx string = ([], True) `elem` (match' regEx string) match' :: RegEx -> String -> [(String, Bool)] match' Any (_:xs) = [(xs, True)] match' (One c) (x:xs) = [(xs, c == x)] match' (OneIn cs) (x:xs) = [(xs, any (\ (a, b) -> a<=x "" x<=b) cs)] match' (Concat a b) (x:xs) | null bereinigt = [] | otherwise = concatMap (match' b . fst) bereinigt where bereinigt = filter snd $ match' a (x:xs) match' (Alt a b) xs = filter snd $ match' a xs ++ match' b xs match' (Repeat r) (x:xs) | null rep = [] | otherwise = rep ++ concatMap (match' (Repeat r) . fst) rep where rep = filter snd $ match' r (x:xs) match' _ _ = []
As we can see in the One case, it returns a singleton list which compares the current character x with the desired character c, and additionally returns the remainder of the input string. A call to match' hence produces a list of tuples (rem, matches), and we check whether this list contains a match with an empty remainder. This solution could nicely be adapted to a more liberal definition of "match" where a string matches iff a substring strictly matches.
Here's another notable snippet from a contestant:
--Ueberprueft, ob eine Zahl in einem Bereich liegt (0 = Unendlich) isInBounds :: Int -> (Int,Int) -> Bool isInBounds x (min,max) = x >= min "" (max == 0 || x <= max)
The Contra-MC experiences physical pain watching someone assign special meaning to a particular integer value. Remember Java? (If you don't: It's an obscure niche language from the 90s. Its creation has made a lot of people very angry and been widely regarded as a bad move.) The compareTo method returns an int, and the first 2147483648 values mean "less than", the next one means "equal", and the remaining 2147483647 values mean "greater than". Don't do that. Haskell has special data types for these and other purposes. Listen to the BZgA and practice safer programming.
Before continuing with examining more source code, the Contra-MC decided to crank up the difficulty even more by using the regular expression (a|aa|…|a^{n})a^{n}(a|aa|…|a^{n}).
Here's the ranking for n = 200:
0.39 s MC 0.74 s Stefan Peter Dirix 1.17 s Ludwig Peuckert 1.67 s Clemens Jonischkeit 1.96 s David Otter 9.98 s Maximilian Kirchmeier -------------------------------------- > 10.00 s Hauke Brinkop > 10.00 s Daniel Stüwe
Unfortunately, our Team Fischkopp Nord has been kicked out of
the leading group. Here are some excerpts from their code:
-- RegEx ismorph NFA isomorph Parallele VM -- RegEx in VM-Code umwandeln -- VM-Code interpretieren! -- Der Trick dabei: Nicht jeden möglichen Ausführungspfad ausführen, -- sondern nur die jeweiligen Positionen im Code merken -- und ggf. zusammenführen. -- Anzahl der VM-'Threads' == Anzahl der Code-Zeilen -- Anzahl der Code-Zeilen < Anzahl Ausführungspfad im Code -- Also eig. Ken Thompsons Algorithmus ursprüngliche Implementierung. data VMInstruction = IChar Char | IRange (Set.Set (Char , Char)) | IMatch | IJmp Int -- Springe an Adresse | ISplit Int Int -- Teile Programmpfad in zwei Teile auf! | IAny deriving (Show) type Codeline = Int type Code = [(Codeline, VMInstruction)] type Instructions = Vec.Vector VMInstruction type ProgramCounter = Int convert :: RegEx -> Instructions convert regEx = Vec.fromList (map snd code) `Vec.snoc` IMatch where code = regExToVMInstruction regEx 0 regExToVMInstruction :: RegEx -> Codeline -> Code regExToVMInstruction regExTop codeline = case regExTop of Any -> [(codeline, IAny)] One c -> [(codeline, IChar c)] OneIn inters -> [(codeline, IRange $ Set.fromDistinctAscList $ wellform inters)] Concat regEx regEx' -> let code = regExToVMInstruction regEx codeline codeline' = codeline + length code code' = regExToVMInstruction regEx' codeline' in code ++ code' Alt regEx regEx' -> let newCodeline = succ codeline code = regExToVMInstruction regEx newCodeline codeline' = 2 + codeline + length code code' = regExToVMInstruction regEx' codeline' codeline'' = codeline' + length code' in (codeline, ISplit newCodeline codeline') : code ++ ((pred codeline', IJmp codeline'') : code') Repeat regEx -> let code = regExToVMInstruction regEx codeline codeline' = codeline + length code in code ++ [(codeline', ISplit codeline (succ codeline'))]
The Contra-MC acknowledges the clever idea underlying this
implementation, but cannot fathom why it performs so badly.
The other solution which took very long used the ST monad, which in turn was a key performance ingredient in M. Eberl's winning submission for the fifth week. Incidentally, one of his submissions for this week also used ST, but it didn't performed very well either. The Contra-MC concludes that not all advanced Haskell constructs are worth it.
This leaves us with 4+1 solutions standing. We're getting close to the final evaluation.
The Contra-MC was not satisfied with the regular expression he used (which had been contributed by the CoMC Traytel). He decided to concoct a regular expression so sophisticated that advanced number theory is needed to explain the behaviour (left as an exercise to the reader). He humbly claims to have succeeded, and presents you this evil little monster: (a^{53}|a^{59})^{+}(b^{3}|b^{5})^{+} (the b's are just there for added fun-ness). Note that all occuring numbers are prime numbers. The Contra-MC ran the programs against the words a^{1000}b^{12} (no match) and a^{1132}b^{12} (match). The non-matching words are far more interesting here, because they (statistically) require a significantly higher amount of backtracking. All remaining contestants were able to produce the correct results in less than ten seconds.
Still not satisfied, he ran again with somewhere in the range of around 1030 a's, which provoked stack space overflows or timeouts from all contestants. Especially the nondeterminism-based implementations were susceptible of space leaks. Not wanting to unduly discriminate against these solutions, he settled for a smaller number, judging by the execution time.
Finally, he arrived at this ranking:
4.66 s MC 4.71 s Stefan Peter Dirix 6.32 s Ludwig Peuckert 8.64 s David Otter 9.18 s Clemens Jonischkeit
And this concludes this week's evaluation. What did we learn? Firstly, compile with -W. It'll tell you when you use non-exhaustive pattern matches. Secondly, ST is no magic smoke making all Haskell programs perform well. Lastly, make sure you don't run out of memory!
Update: One contestant raised a good point via mail. Indeed, the choice of regular expressions does influence the ranking. Regular expressions of the form (a|aa|…|a^{n})a^{n} actually provoke worst-case performance of automata-based constructions. For completeness' sake, here are the results of running the coin problem regular expression on the solutions which timed out earlier:
< 0.01 s Michael Benedikt Schwarz < 0.01 s Eberl I/II 0.01 s Daniel Stüwe 0.01 s Hauke Brinkop 0.02 s Florian Dreier 0.02 s Maximilian Kirchmeier 0.10 s Simon Roßkopf 0.21 s Lyubomir Stoykov 0.34 s Maximilian Haslbeck 0.42 s Alexander Weidinger
The Contra-MC acknowledges these solutions. They (conceptually) excel on this kind of regular expression.
Furthermore, the Contra-MC forgot to test one submission. It was not his fault, though – the contestant forgot to put the {-WETT-} tags into his file. The leader board has been updated accordingly and as always, no points have been deducted.
Die Ergebnisse der neunten Woche
The “Great” MC is back with a vengeance the results of the ninth week:
Top 30 der Woche | |||
---|---|---|---|
Platz | Wettbewerber(in) | Punkte | |
1. | Atanas Mirchev | 30 | |
2. | Maximilian Kirchmeier | 29 | |
3. | Maximilian Haslbeck | 28 | |
4. | Hauke Brinkop | 27 | |
Daniel Stüwe | 27 | ||
6. | Alexander Weidinger | 25 | |
7. | Michael Schreier | 24 | |
Simon Roßkopf | 24 | ||
Christian Ziegner | 24 | ||
9. | Alexander Christian Hefele | 22 | |
Alexandra Fritzen | 22 | ||
Anna Kropp | 22 | ||
Hannes Bibel | 22 | ||
David Otter | 22 | ||
Frédéric Simonis | 22 | ||
Natalie Reppekus | 22 | ||
Mathias Staudigl | 22 | ||
Ludwig Peuckert | 22 | ||
Barbara Moser | 22 | ||
Florian Dreier | 22 | ||
Sebastian Zeitel | 22 | ||
Johannes Ismair | 22 | ||
Kevin Yu | 22 | ||
Lukas Holzner | 22 | ||
Marco Probst | 22 | ||
Martin Mihaylov | 22 | ||
Moritz Becher | 22 | ||
Moritz Sichert | 22 | ||
Nikita Basargin | 22 | ||
Thomas Engel | 22 | ||
Thomas Pettinger | 22 | ||
Valentin Zieglmeier | 22 | ||
Zhechko Zhechev | 22 |
The criterion is efficiency. Like in previous weeks, the MC is sticking to a round-based approach, ruling out at each round solutions that fall below a threshold. We saw last week that this is dangerous, because this depends very much on the choice of benchmarks. However, when the proper care is taken for selecting the examples, this works well. For the regular expression problem of last week, it would have been better to conjure a diverse benchmark suite and measure the speed of each solution on each benchmark, then compute scores. That's because there are so many ways to implement regular expressions, with various trade-offs.
The above comments shouldn't be construed as a critique of the Contra-MC's work. Quite the opposite. The MC is extremely pleased with the work of his Co(ntra)-MCs. He even found himself rolling on the floor of his hotel room laughing while reading the comments about inschtanz and wenn–dann–sonscht. He has himself a couple of friends from Schwoabia and remembers how, after a few cocktails, one of them repeatedly schtumbled trying to pronounce “Hauschschlüssel”. But even the Contra-MC, who clearly enjoys taking cheap shots at others, is not perfect: He has a slight tendency to lisp when he speaks English, which becomes very interesting when the talks about the Lisp programming language. And for all his humor about Team Nord/Fischkopf, he's from the East/Ossie. ;) (Being a French Canadian like the MC is great, because it means you're at the top of the food chain. Or maybe not. [Disclaimer: The linked page has yellow text on a white background.])
It feels strange for the MC to read this “weblog” as opposted to be writing it. For those who watch the Daily Show with Jon Stewart, the feeling is similar to when Stewart had to watch John Oliver host his own show. His comment is politically incorrect enough that the MC will have to take out his <non-PC> tags:
<non-PC>
“I don't watch it all the time because it's too weird. It's like watching someone have sex with your wife's desk.”
</non-PC>
All right, let's take the invitation offered by the closing </non-PC> tag to move on to more serious stuff. For the quasi-majority element problem, there are basically two classes of solutions: the quadratic ones (O(n^{2})) and the linear ones (O(n)). The quote from Boyer & Moore at the beginning of the Aufgabentext was a “subtle” hint to the existence of a linear solution to the standard majority element (Mehrheitselement) problem, from which it is not too hard to derive a solution to the quasi version of the problem.
First, some stats: 78 solutions were submitted, among which 22 didn't pass the QuickCheck tests. This leaves 56 solutions, plus a solution by Manuel Eberl, plus a naive solution by the MC (in Sol_Exercise_9.hs), plus a very naive solution by the MC (like the naive one but without the nub), plus three real solutions by the MC (!). The various MC solutions help him keep the Überblick over the relative speed of the competitors' entries. The timeout is always 10 seconds.
Round 1: One hundred thousand times the same element:
Data.List.replicate 100000 1
10 solutions took over 10 seconds. The other solutions, including the very naive one, took less than 1 second. This leaves 46+6 entries.
Round 2: Six million times one element followed by six million times another element:
Data.List.replicate 6000000 1 ++ Data.List.replicate 6000000 2
This is enough to kick out 12+1 solution, the +1 being the very naive one as expected. This leaves 34+5.
Round 3: Like the above, but the two elements are interwoven:
Data.List.concat (replicate 2000000 [2, 1, 2, 2, 1, 1])
We lost two more competitors and are left with 32+5.
Round 4: 10000 elements repeated 5 times:
map (`mod` 10000) [(0::Int) .. 50000]
This is where we lose all the quadratic solutions. They just cannot cope with the high number of distinct elements. After this round, only 8+4 solutions survive.
Round 5: Twenty million distinct elements:
[1 .. 20000000]
There's quite some variation among the linear solutions. One solution even causes a stack overflow (denoted SOFL, “sitting on the floor laughing”):
0.85 s Maximilian Haslbeck 1.01 s Alexander Weidinger 1.02 s Maximilian Kirchmeier 1.04 s Master of Competition II 1.26 s Manuel Eberl 1.37 s Master of Competition III 1.65 s Maximilian Haslbeck 1.80 s Master of Competition I 1.92 s Daniel Stüwe 2.07 s Hauke Brinkop 2.13 s Atanas Mirchev ------------------------------------------------ 5.76 s Simon Roßkopf SOFL Michael Schreier
We had to draw the line somewhere, and the factor 2.7 gap between Herr Mirchev and Herr Roßkopf seemed like an appropriate place. For ten million instead of twenty million elements, the solution by Herr Schreier took over 5 seconds, so we can extrapolate that it would have been over the timeout had it not totally SOFL'd out beforehard.
Round 6: Twenty million times the same element:
Data.List.replicate 20000000 123456
This is reminescent of round 1.
0.51 s Maximilian Kirchmeier 0.69 s Master of Competition III 0.76 s Atanas Mirchev 0.86 s Daniel Stüwe 0.91 s Hauke Brinkop 1.00 s Maximilian Haslbeck 1.33 s Manuel Eberl 2.12 s Master of Competition I 2.57 s Master of Competition II ------------------------------------------------ 3.02 s Alexander Weidinger
Before we say farewell to the last competitor, let's quote the nice comment in his code:
{- This code is mainly based on the „Majority Vote Algorithm“ by Boyer and Moore with a little difference, to match the „quasiMajority“ criteria. Instead of just using the algorithm I had to differentiate between two types of Lists. 1. The input list has an odd number of elements: In that way if the majority algorithm finds a majority, it's the only possible quasiMajority and we're done 2. The input list has an even number of elments: In that way, the majority algorithm could find no quasiMajority - of course, because it only works for majoritys (> 50%) and not quasimajoritys (>=50%) - or the algorithm finds a majority, but it's not the only majority, because it's a quasiMajority. To solve the problem with 2. i came up with an (easy) solution, which i longly discussed with my father ( thanks dad ;) ). The idea in that case is, to split the first element from the list and use the majority algorithm on the shorten list, with now an odd number of elements. So, no matter what the algorithm finds, we have to check if the splitted element is also a quasiMajority of the list. Only if the majority algorithm finds the same element as the splitted element we can stop with just one majority element. -}
Let's all say “Bye bye” to Herr Weidinger and his dad!
Round 7: Half of the elements are the same value, the other half is made of unique elements, and the two halves are interwoven: Data.List.concatMap (\x -> [1234567890123456789 + x, 1234567890123456789]) [1 .. 5000000]
To make matters worse, the values chosen are big integers, to penalize the solutions that perform needlessly many comparisons on the element type. This gives us our top 5:
1.45 s Atanas Mirchev 1.48 s Master of Competition II 1.58 s Maximilian Kirchmeier 1.65 s Master of Competition III 1.79 s Manuel Eberl 2.02 s Maximilian Haslbeck 2.15 s Master of Competition I 3.09 s Hauke Brinkop 3.89 s Daniel Stüwe
Interestingly, the two members of Team Fischkopf submitted once again slightly diverging solutions, resulting in some timing differences. So we're counting them as two separate entities this week (Herr Fisch and Herr Kopf?). (It might seem like the MC is recycling the Contra-MC's humor, but it was actually his idea. Actually, it wasn't really his idea but his special lady friend's. For the record, she's from Konschtanz. Then the MC opened his big mouth and the Contra-MC profited from his absence to steal the idea.)
Update: Herr Fisch wrote to complain about the separation of the Fisch and the Kopf, so the MC in his endless magnanimity has decided to give Kopf the same number of points as Fisch, even though the explanation as to whether they had diverging solutions was a bit “fishy”. In the interest of academic transparency, here's what Fisch wrote:
you mentioned in the text that me and Daniel would have slightly different solutions. This is not correct, cause i sent it to him after i pressed "save" and we just had a look: what i sent was the same code as he gab ab ( ;-) ). But im working directly on the server, and it seems the server was down that time so it didnt save the changes. So it would be fair if he would get the same amount of points as i did.
Actually im running gedit on a remote file system (ssh/sftp), mounted with nautilus. So im editing in a local editor, and when i press "save" it sometimes need to remount the remote file system. The server was down a few times, it e.g. freezed my remote shell (and also those of other guys) and i couldn't reconnect (for about 5 minutes). I guess it was just bad timing, That wouldnt be fair to Daniel, because he is aiming for the total win :D
(End of Update)
One competitor found it a pity that the signature of the function didn't have Ord a => as an additional type class constraint. Indeed, if the MC could travel back in time, he would change that, because then we'd have not only O(n) and O(n^{2}) solutions but also O(n log n). Likely, it should be possible to use unsafeCoerce and/or some other type system trick to determine if the concrete instance of a supports Ord and use appropriate code if it does; but the MC's Haskell mojo isn't quite up to the task!
The MC would also like to formally apologize to Herr Eberl. For weeks, compiling his solutions with ghc -O3 might have produced the message
The function `main' is not defined in module `Main'
which the MC failed to notice. Compilation failed as a result, and Herr Eberl's solutions had an unfair disadvantage over the other solutions in that it was interpreted instead of compiled. This having been said (genauer: written), Herr Eberl is not entirely blameless, since he provided the MC with a file that didn't have the module Exercise_9 where heading, which caused the error in the first place.
Herr Eberl's solution is probably the one that best combines conciseness and efficiency:
quasiMajorityElems :: Eq a => [a] -> [a] quasiMajorityElems xs = [x | x <- candidates (1::Integer) xs, occs x * 2 >= n] where n = genericLength xs occs x = genericLength (filter (==x) xs) candidates 0 (x:xs) = candidates 1 xs candidates 1 [x,y] = if x == y then [x] else [x,y] candidates k (x:y:xs) = candidates (if x == y then k+1 else k-1) (x:xs) candidates _ xs = xs
He also went into the extra trouble of formally proving that his solution is correct.
He did his proof using Isabelle/HOL, a theorem prover developed at StuttgartLehrstuhl 21.
Here it is for the record:
theory Wett9 imports Main "~~/src/HOL/Library/Efficient_Nat" begin (* Number of occurrences of an element in a list *) definition occs :: "'a => 'a list => nat" where "occs x xs = length (filter (op = x) xs)" (* Definition of the (nonstrict) majority winners of a list *) definition isQuasiMajorityWinner :: "'a => 'a list => bool" where "isQuasiMajorityWinner x xs <-> (x ∈ set xs & occs x xs * 2 >= length xs)" definition quasiMajorityWinners :: "'a list => 'a set" where "quasiMajorityWinners xs = {x. isQuasiMajorityWinner x xs}" lemma isQuasiMajorityWinner_imp_in_list[dest]: "isQuasiMajorityWinner x xs ==> x ∈ set xs" unfolding isQuasiMajorityWinner_def by simp (* Some lemmas about occs *) lemma occs_replicate_same[simp]: "occs x (replicate k x) = k" by (induction k, simp_all add: occs_def) lemma occs_replicate_different[simp]: "x ≠ y ==> occs x (replicate k y) = 0" by (induction k, simp_all add: occs_def) lemma occs_append[simp]: "occs x (xs @ ys) = occs x xs + occs x ys" by (induction xs, simp_all add: occs_def) lemma occs_Cons[simp]: "occs y (x#xs) = (if x = y then occs y xs + 1 else occs y xs)" by (induction xs, simp_all add: occs_def) lemma occs_in_set[simp]: "x ∈ set xs ==> occs x xs > 0" by (induction xs, auto simp: occs_def) lemma occs_notin_set[simp]: "x ∉ set xs ==> occs x xs = 0" by (induction xs, auto simp: occs_def) (* Narrows the candidates for majority winners of a list down to <= 2 *) fun quasiMajorityCandidates :: "nat => 'a list => 'a list" where "quasiMajorityCandidates 0 (x#xs) = quasiMajorityCandidates 1 xs" | "quasiMajorityCandidates _ [] = []" | "quasiMajorityCandidates _ [x] = [x]" | "quasiMajorityCandidates (Suc 0) [x,y] = (if x = y then [x] else [x,y])" | "quasiMajorityCandidates k (x#y#xs) = quasiMajorityCandidates (if x = y then k + 1 else k - 1) (x#xs)" (* Correctness and distinctness of majorityCandidates *) lemma quasiMajorityCandidates_correct: assumes "xs ≠ []" assumes "isQuasiMajorityWinner y (replicate k (hd xs) @ tl xs)" shows "y ∈ set (quasiMajorityCandidates k xs)" using assms proof (induction k xs rule: quasiMajorityCandidates.induct) case (goal5 k' x1 x2 xs) show ?case proof (cases "x1 = x2") case False hence "isQuasiMajorityWinner y ((replicate (Suc k')) x1 @ xs)" using goal5 by (rule_tac ccontr, simp add: isQuasiMajorityWinner_def split: split_if_asm) with False and goal5(1) show ?thesis by simp qed (insert goal5(1,3), simp add: replicate_app_Cons_same) next case (goal6 k' x1 x2 x3 xs) show ?case proof (cases "x1 = x2") case False hence "isQuasiMajorityWinner y ((replicate k') x1 @ x3 # xs)" using goal6 by (rule_tac ccontr, simp add: isQuasiMajorityWinner_def split: split_if_asm) with False and goal6(1) show ?thesis by simp qed (insert goal6(1,3), simp add: replicate_app_Cons_same) qed (force simp: isQuasiMajorityWinner_def split: split_if_asm)+ lemma quasiMajorityCandidates_distinct: "distinct (quasiMajorityCandidates k xs)" by (induction k xs rule: quasiMajorityCandidates.induct) simp_all (* For nicer generated code (i.e. not wrapped in "Nat" type, define version with int instead of nat *) fun quasiMajorityCandidates_int :: "int => 'a list => 'a list" where "quasiMajorityCandidates_int _ [] = []" | "quasiMajorityCandidates_int k [x] = (if k = 0 then [] else [x])" | "quasiMajorityCandidates_int k (x#y#xs) = (if k = 0 then quasiMajorityCandidates_int 1 (y#xs) else (if k = 1 & xs = [] then (if x = y then [x] else [x,y]) else quasiMajorityCandidates_int (if x = y then k + 1 else k - 1) (x#xs)))" lemma quasiMajorityCandidates_int_correct: "quasiMajorityCandidates_int (int k) xs = quasiMajorityCandidates k xs" by (induction k xs rule: quasiMajorityCandidates.induct) (case_tac xs, auto) (* Definition of the function that returns the list of majority winners *) definition quasiMajorityWinnersList where "quasiMajorityWinnersList xs == (let n = length xs in filter (λx. occs x xs * 2 >= n) (quasiMajorityCandidates_int (int 1) xs))" lemma quasiMajorityWinnersList_correct: "set (quasiMajorityWinnersList xs) = quasiMajorityWinners xs" proof (cases "xs = []") assume "xs = []" thus ?thesis unfolding quasiMajorityWinnersList_def quasiMajorityWinners_def isQuasiMajorityWinner_def by simp next assume "xs ≠ []" show ?thesis unfolding quasiMajorityWinnersList_def quasiMajorityWinners_def using quasiMajorityCandidates_int_correct[of "Suc 0" xs] apply (intro equalityI subsetI) apply (rule ccontr, simp add: isQuasiMajorityWinner_def `xs ≠ []`) apply (auto simp: `xs ≠ []` isQuasiMajorityWinner_def intro!: quasiMajorityCandidates_correct) done qed lemma quasiMajorityWinnersList_distinct: "distinct (quasiMajorityWinnersList xs)" unfolding quasiMajorityWinnersList_def Let_def by (subst quasiMajorityCandidates_int_correct, intro distinct_filter quasiMajorityCandidates_distinct) lemmas [code] = quasiMajorityWinnersList_correct[symmetric] (* Code export *) export_code quasiMajorityWinnersList in Haskell file "/home/manuel/haskell/wett/wett9" end
The MC is almost done babbling and rambling, but he would like to mention one news item he saw this week. No, it's not about the GroKo, although the news did reach him even behind the Chinese firewall (no Facebook though). Rather, it was an article called Exponential algorithm making Windows XP miserable could be fixed that drew the MC's attention. It turns out that Windows XP, a version of Windows that is still very popular in some parts of the world (like China), suffers from a truly debilitating bug. Here are three extracts from the article:
Each time you boot your Windows XP machine, it slows to a crawl. There's a built-in process, svchost.exe, chewing up the entire processor, sometimes for an hour or more at a time. Wait long enough after booting and the machine will eventually return to normalcy. But an hour can be a long time to wait.
Machines using Windows Update retrieve patch information from Microsoft's servers. That patch information contains information about each patch: what software it applies to (for example, systems that have been upgraded to Internet Explorer 7 or 8 don't need Internet Explorer 6 patches), what knowledge base article it relates to, and, critically, what historic patch or patches the current patch supersedes.
Unfortunately, the Windows Update client components used an algorithm with exponential scaling when processing these lists. Each additional superseded patch would double the time taken to process the list. With the operating system now very old, those lists have grown long, sometimes to 40 or more items. On a new machine, that processing appeared to be almost instantaneous. It is now very slow.
But the best part of the article is that Microsoft have, in their own words, “all the right (and smartest) people” working on it. One can only wonder what would happen if they had all the right people working on it, but the right people weren't smart, or if they had the wrong smart people (like Dr. Frankenstein), or, even worse (but more likely perhaps), if they had the wrong dumb people working on it. Hm, one really does wonder who actually is working on it (assuming somebody is)!
Pfui, this was a long week! The main lesson is perhaps to Pay attention; seemingly irrelevant information, like the Boyer & Moore quote about delegates kocking down each other with their placards, can be highly relevant to knock down the competition. By the way, there's a quote in the Wettbewerbsaufgabe for Week 10 as well.
Update (Updated): Christian Ziegner had submitted a wrong solution that didn't pass the QuickCheck tests. He also had an alternative solution that was commented out and put it back in at 15:45 on 17 December. The MC tested this solution and found out that it would have been kicked out in round 5. Since we had wrongly written 18 December as the Abgabetermin (something only a handful of contestants seem to have noticed, thankfully), he now got all his points.
Important message: We checked the submission directories and found no other discrepancy between the submissions on December 17 and 18. However, the results of the contest were already announced a couple of hours before the contest officially closed. In the unlikely event that you have been somehow affected by this mess-up, please contact the MC and he'll see what he can do.
Die Ergebnisse der zehnten Woche
The results are in!
Top 30 der Woche | |||
---|---|---|---|
Platz | Wettbewerber(in) | Score | Punkte |
1. | Atanas Mirchev | 0.145 | 30 |
Lyubomir Stoykov | 0.145 | 30 | |
3. | Maximilian Haslbeck | 0.152 | 28 |
4. | Alexander Weidinger | 0.198 | 27 |
Daniel Stüwe | 0.198 | 27 | |
Hauke Brinkop | 0.198 | 27 | |
7. | Julia Kindelsberger | 3.41 | 24 |
8. | Nikita Basargin | 3.73 | 23 |
9. | Alexandra Fritzen | 3.75 | 22 |
10. | Alexander Güssow | 3.98 | 21 |
Christian Ziegner | 3.98 | 21 | |
Johannes Ismair | 3.98 | 21 | |
Simon Roßkopf | 3.98 | 21 | |
Thomas Engel | 3.98 | 21 | |
Thomas Pettinger | 3.98 | 21 | |
Thomas Zwickl | 3.98 | 21 | |
17. | Mihai Dutescu | 6.64 | 14 |
18. | Alexander Christian Hefele | N/A | 5 |
Stefan Peter Dirix | N/A | 5 | |
Florian Dreier | N/A | 5 | |
21. | Maximilian Kirchmeier | N/A | 2.5 |
As usual, let us start with some stats. Three solutions had failed the official QuickCheck suite. For these, kein Pardon!
In addition, four solutions that had either succeeded on the QuickCheck server or that had timed out failed on additional tests. These get 5 points. (The MC can send the failing inputs to the concerned parties. Just send him an email.) One of the solutions even had the wrong signature, with Int instead of Integer. The MC generously repaired it, but it still failed the tests horribly. Giving 5 points to a solution that does not even compile would be a travesty, so this competitor gets 2.5 points. He himself probably won't be too surprised, since he wrote the following in his code (wie recht er hatte...):
-- this is a grim day for democracy, I know... but congresses and snowboarding -- vacations (which I sacrificed part of for this... construct) ate all my time
The MC has no pity! He sacrificed vacation to prepare the Othello-Aufgabe and now he's sacrificing the Pub Quiz to write this pseudo-weblog entry.
Anyway, this leaves us with 17 real contestants. To rank them, the MC and the Contra-MC joined forces and created a database of 66 randomly generated inputs with anything between 1 and 3000 cities. For each input and each solution, they computed the variance (σ) of the sizes of the Wahlkreise, and divided it by the sum of the variances of all solutions. Then they took the sum over all 66 inputs to obtain a numeric score (displayed in the third column of the Top 30 table above), from which a rank was computed.
The solutions neatly aggregated into clusters of people who implemented the same algorithm. Team Balkan (score: 0.145), consisting of Herren Mirchev and Stoykov, have a nice 250 LOC (lines of code) solution. Unfortunately it is not very well documented; their solution is all Greek (or Bulgarian) to the MC. Herr Stoykov even had a solution to spare:
{-JUST IN CASE THE FIRST SOLUTION IS NOT BUG FREE, HERE IS ANOTHER ONE: [211 lines are omitted here] -}
Update: Herr Stoykov has now explained the solution in an email to the MC:
Hi MC,
just not to leave a bad (balkan) impression - we don't tend to write SOO bad code!!!:D Our solution was a last minute ride (as always). The main idea was at every recursive step to find which alternative is more favorable - "vertical splitting" or "horizontal splitting" line. Of course, we didn't have the time to fully implement the solution, therefore we had to always split horizontally;P We barely managed to submit at ~15.20;-)
All best,
LS
(End of Update)
Then comes Herr Haslbeck (score: 0.152) and his very nicely documented solution:
--------------- -- Naja, das hätte besser sein können -- Im Groben sortiert die Funktion alle Städte und Eckpunkte (Eckpunkte sind -- dann Städte mit 0 Einwohnern, so entstehen hübschere Polygone), bündelt -- dann die Städte immer bis zu einer gewissen Einwohnerzahl (in unserem Fall -- (Bevölkerung / Anzahl der Teile)) und erstellt so eine Liste. -- Die Implementierung findet oft nicht die optimale Lösung und ist -- ziemliches Flickwerk, da ich in den Ferien immer wieder daran -- rumgewurschtelt habe. --------------- split :: District -> Integer -> [Polygon] split (District ps cs) n = bundleCities cityAndPoin aver n where cityAndPoin = sortCities $ cs ++ polygonToCities ps aver = div (population cs) n --------------- -- arbeitet die Städte der Reihe nach ab und erstellt ein Polygon -- wenn gewisse Grenze überschritten. -- Besteht ein Polygon nur aus ein oder zwei -- Punkten, wird noch ein "Wettbewerbspolynom" darausgemacht. -- Sind nur noch drei Städte zum verarbeiten übrig, zerhackt -- er das Dreieck einfach so oft bis wir die gewünschte Anzahl -- an Polygonen haben. (Naja, alles nicht so schön) --------------- bundleCities :: [City] -> Integer -> Integer -> [Polygon] bundleCities ((City _ p popu):cs) aver rest | rest == 1 = [convexHull $ p:(citiesToPoints cs)] | lenRestCities <= 3 = splitTriangleMult (p:(citiesToPoints cs)) rest | otherwise = partDist' : bundleCities restCs aver (rest - 1) where partDist = convexHull $ pointSet lenDist = genericLength partDist partDist' | lenDist == 1 = cheatOnePoint (head partDist) (head $ citiesToPoints restCs) | lenDist == 2 = cheatTwoPoints partDist . | otherwise = partDist pointSet = (p:(citiesToPoints $ formPolygon cs aver popu)) lenPS = genericLength $ pointSet lenRestCities = 1 + genericLength cs restCs = drop (lenPS - 1) cs bundleCities _ _ _ = [] -- nimmt soviel Städte vom Beginn der Liste weg, -- bis Durchschnitt erreicht formPolygon :: [City] -> Integer -> Integer -> [City] formPolygon [_,_,_] _ _ = [] formPolygon (c:cs) aver popu | newPopu < aver = c:(formPolygon cs aver newPopu) | abs (popu - aver) < abs (newPopu - aver) = [] | otherwise = [c] where cityPopu (City _ _ i) = i newPopu = popu + cityPopu c formPolygon _ _ _ = [] -- macht aus einem Dreieck zwei Dreiecke splitTriangle :: Polygon -> [Polygon] splitTriangle [p1,p2,p3] = [[p1,midpoint,p3],[p2,midpoint,p3]] where (x1,y1) = p1 (x2,y2) = p2 midpoint = ((x1+x2)/2,(y1+y2)/2) splitTriangle _ = [] -- macht aus einem Dreieck n Dreiecke splitTriangleMult :: Polygon -> Integer -> [Polygon] splitTriangleMult poly n | n == 1 = [poly] | otherwise = t : (splitTriangleMult t' (n-1)) where (t:t':_) = splitTriangle poly -- macht aus Punkten Städte mit null Einwohnern polygonToCities :: Polygon -> [City] polygonToCities = map (\p -> City [] p 0) -- extrahiert Koordinaten der Städte citiesToPoints :: [City] -> [Point] citiesToPoints = map (\(City _ p _) -> p) -- sortiert Städte anhand ihrer Koordinaten sortCities :: [City] -> [City] sortCities = sortBy compareCities where compareCities (City _ p1 _) (City _ p2 _) = compare p1 p2 -- bestimmt Gesamtbevölkerung einer Liste von Städten population :: [City] -> Integer population = sum . map (\(City _ _ i) -> i) -- macht aus einem Punkt ein gültiges Polygon (eine Linie mit 3 Punkten) cheatOnePoint :: Point -> Point -> Polygon cheatOnePoint (x1,y1) (x2,y2) = [(x1,y1),quarterPoint,midPoint] where quarterPoint = ((3*x1+x2)/4,(3*y1+y2)/4) midPoint = ((x1+x2)/2,(y1+y2)/2) -- macht aus zwei Punkten ein gültiges Polygon (eine Linie mit 3 Punkten) cheatTwoPoints :: [Point] -> [Point] cheatTwoPoints [(x1,y1),(x2,y2)] = [(x1,y1),midPoint,(x2,y2)] where midPoint = ((x1+x2)/2,(y1+y2)/2) cheatTwoPoints _ = [] ---------------- -- Konvexe Hülle Algorithmus -- geklaut von hier: -- https://en.wikibooks.org/wiki/Algorithm_Implementation/Geometry/Convex_hull/Monotone_chain#Haskell --------------- [omitted]
Then comes Team Fischkopf (score: 0.198), this time extended by a third member, Herr Weidinger (likely a Weißwurstfresser). This North-South collaboration seems to have implemented a cool algorithm, but again in the absence of documentation it's hard to see what they really did.
Finally, Frau Kindelsberger's solution (score: 3.41) is the first of many similar short solutions (which could have been made even shorter by removing the spurious first equations for getPointList and getStaedteList). The formatting also leaves much to be desired—to minimize the chances are that the code on the right-hand side will be clipped by your browser, the MC wad obliged to reduce the font size drastically:
split :: District -> Integer -> [Polygon] split _ 0 = [] split district 1 = [getPointList district] split district n = take pointsDivN point : split (District (point \\ deletePoints) staedte) (n-1) where point = getPointList district staedte = getStaedteList district pointsDivN = if div (genericLength point) (fromIntegral n :: Int) <= 3 then 3 else div (genericLength point) (fromIntegral n :: Int) + 1 deletePoints = tail (take (pointsDivN-1) point) getPointList :: District -> [Point] getPointList (District [] []) = [] getPointList (District point _) = point getStaedteList :: District -> [City] getStaedteList (District [] []) = [] getStaedteList (District _ city) = city
The awesome Contra-MC wrote a program to visualize the results. Here's the output on one of the examples from the benchmark set (where larger circles correspond to larger cities):
Team Balkan | Herr Haslbeck | Team Fischkopf–Weißwurst | Everybody else |
---|
We see that Team Balkan and Team Fischkopf didn't hesitate to cut existing boundary segments into two. Both appear to like horizontal and vertical lines. The naive approach, in contrast, simply tries to connect existing vertices, with catastrophic results for the example shown above. The coolest of these solutions is undoubedly Herr Haslbeck's, who even leaves empty territory as a pink no man's land. It is a pity that his solution didn't win on this particular benchmark suite.
The main lesson this week is probably that When you implement a cool algorithm, please document your code! This increases your chances of seeing your solution on this weblog, but also means that you will be able to read it again in n years from now (for large values of n).
Die Ergebnisse der (elften und) zwölften Woche
Update: The results have been multiplied by 2, as promised in the Aufgabentext for a zweiwöchige Wettbewerbsaufgabe; the text below has been modified accordingly. The MC says: Asche auf sein Haupt!
Here they are at last, the results for week 12!
Top 30 der Woche | |||
---|---|---|---|
Platz | Wettbewerber(in) | Punkte | |
1. | Maximilian Haslbeck | 30 × 2 | |
(2. | Maximilian Kirchmeier | (29/2) × 2) | |
2. | Julian Biendarra | 29 × 2 | |
3. | Stefan Peter Dirix | 28 × 2 | |
4. | Alexandra Fritzen | 27 × 2 | |
5. | Martin Mihaylov | 26 × 2 | |
Georgi Dikov | 26 × 2 | ||
7. | Michael Schreier | 24 × 2 | |
Simon Roßkopf | 24 × 2 | ||
9. | Alexander Christian Hefele | 22 × 2 | |
10. | Ludwig Peuckert | (21 − 2) × 2 | |
11. | Alexander Weidinger | 20 × 2 | |
Hauke Brinkop | 20 × 2 | ||
13. | Julia Kindelsberger | 18 × 2 | |
14. | Nikita Basargin | 17 × 2 | |
15. | Atanas Mirchev | 16 × 2 | |
16. | Lukas Michael Stumberg | 15 × 2 | |
17. | Thomas Zwickl | 14 × 2 | |
18. | Daniel Stüwe | 13 × 2 | |
19. | Christian Ziegner | 12 × 2 | |
Martin Wurzer | 12 × 2 | ||
Thomas Pettinger | 12 × 2 |
First, some general comments:
- The MC quickly realized that he didn't have enough CPU power (and/or
patience) to run the tournament on his laptop. So he used a cluster of eight
machines with two cores each. This is slow hardware, reflecting the difficult
financial situation of the TUM (ju$t kidding). The MC found that it's about 2.2
times slower than his own laptop, a three-year-old MacBook Pro. For good
measure, he gave 20 minutes per player per game instead of the advertised 5 to
be on the safe side. Still, some solutions managed to time out.
- One competitor submitted an entry that did not respect the type signature given
in Blatt 12. Two points (× 2) were subtracted.
- Three competitors submitted entries that played illegal moves. Maybe they
were playing Reversi instead of Othello? The MC made a short process and
disqualified them. Their authors are listed in the bottom part of the table
above, also known as the Hall of Shame. (Of course, we all know that the real
Hall of Shame is the list of students who didn't submit any solution and who got
the zero points (× 2) they deserve. Twelve points (× 2) ain't that bad at all for buggy code.)
- The MC performed basic clone detection to identify teams, using ls -l on Unix
and sorting files by size. A few perfectly identical copies were removed, but
most of the members of teams had slightly differing solutions (sometimes just in the
formatting of their files) and are treated as distinct competitors.
- A team of competitors submitted a solution past the deadline. They already had
three solutions, so the MC added it as a fourth one. He was going to subtract 2
points (× 2) for the late submission, but it turns out the fourth solution didn't help
them in any way, so this doesn't affect their score in the end.
- One competitor submitted a very late entry; he informed the MC of its presence on the servers two days after the deadline. At this point (× 2), it was too late to include the entry in the tournament. The MC ran a mini-tournament involving this entry against all the others, computed a score, and divided the number of points (× 2) by 2 to account for the extreme tardiness of the entry. His entry is listed in the first table below.
The MC was quite pleased to see that most competitors wrote detailed descriptions of their programs. He extracted them and linked them. (Just click the name of a competitor in the table below.) He was tempted to subtract points (× 2) to those who didn't write any docs, since the instructions were abundantly clear about the need to do so, but he resisted the temptation, and now the temptation is gone forever.
In addition to the total score, the table also lists the total time taken by the program. It's refreshing to see that some of programs are both fast (bzw. slow) and good (bzw. bad). The MC's program is pretty terrible, but that's because he spent most on his energy on this (pseudo-)weblog instead. The real participants don't have that excuse!
Readers of this page are highly encouraged to click on the above links. Much of the humor you would normally expect from the MC is delegated to this week's participants. Some of them are really funny and/or witty. The MC's personal favorite is Ludwig Peuckert's entry, not so much for his witty comments as for his willingness to go his own way. Oh, and don't worry if the non-ASCII letters come up all screwed up; just use your imagination to fill the gaps. (Hint: Usually, the missing letter will be one of ä ö ü ß, apart from the entry that features elaborate, 18+ ASCII-art.)
The next table is the coolest one. It shows the outcome of every single game (except those involving Herr Kirchmeier). The entries on the left indicate black players; those at the top are the white players. Since the entries are ranked by score, one would expect the upper right triangle to be filled with b's and the lower left triangle with w's (see the legend).
Legend w: white won b: black won W: white won due to timeout B: black won due to timeout =: tie | M a x i m i l i a n H a s l b e c k 2 | M a x i m i l i a n H a s l b e c k 1 | J u l i a n B i e n d a r r a 1 | J u l i a n B i e n d a r r a 2 | M a x i m i l i a n H a s l b e c k 3 | S t e f a n P e t e r D i r i x 1 | S t e f a n P e t e r D i r i x 3 | A l e x a n d r a F r i t z e n 3 | M i h a y l o v & D i k o v 4 | S t e f a n P e t e r D i r i x 2 | M i h a y l o v & D i k o v 3 | M i h a y l o v & D i k o v 2 | M i c h a e l S c h r e i e r 1 | S i m o n R o ß k o p f 1 | A l e x a n d e r C h r i s t i a n H e f e l e | L u d w i g P e u c k e r t | A l e x a n d e r W e i d i n g e r | H a u k e B r i n k o p | A l e x a n d r a F r i t z e n 1 | M i c h a e l S c h r e i e r 3 | J u l i a K i n d e l s b e r g e r | N i k i t a B a s a r g i n 3 | N i k i t a B a s a r g i n 2 | A t a n a s M i r c h e v 1 | A t a n a s M i r c h e v 3 | N i k i t a B a s a r g i n 1 | M a s t e r o f C o m p e t i t i o n | L u k a s M i c h a e l S t u m b e r g | T h o m a s Z w i c k l | A t a n a s M i r c h e v 2 | D a n i e l S t ü w e | A l e x a n d r a F r i t z e n 2 | M i h a y l o v & D i k o v 1 | M i c h a e l S c h r e i e r 2 | S i m o n R o ß k o p f 2 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Maximilian Haslbeck 2 | b | b | b | w | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | B | B | ||
Maximilian Haslbeck 1 | w | b | b | B | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | B | B | ||
Julian Biendarra 1 | w | b | b | w | w | w | b | b | w | w | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | B | B | ||
Julian Biendarra 2 | w | w | w | w | w | w | w | b | w | b | b | b | b | b | b | b | b | w | w | w | b | b | b | b | b | b | b | b | B | b | b | b | B | B | ||
Maximilian Haslbeck 3 | b | W | W | b | b | b | b | W | b | b | b | W | W | b | b | b | b | W | b | W | b | b | b | b | b | b | b | b | b | b | b | b | B | B | ||
Stefan Peter Dirix 1 | b | w | b | = | w | B | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | B | b | b | w | B | B | ||
Stefan Peter Dirix 3 | w | w | b | w | w | w | w | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | B | b | b | b | B | B | ||
Alexandra Fritzen 3 | w | w | b | b | w | w | w | b | w | b | = | w | w | b | w | w | w | b | w | b | b | b | b | b | b | b | b | b | b | b | b | b | B | B | ||
Mihaylov & Dikov 4 | w | w | w | w | w | w | B | w | w | w | b | w | w | b | b | b | b | b | w | b | b | b | b | b | b | b | b | b | b | b | b | b | w | B | ||
Stefan Peter Dirix 2 | w | w | b | b | w | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | B | B | ||
Mihaylov & Dikov 3 | w | w | b | b | w | w | b | b | w | w | w | b | b | b | w | = | = | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | B | B | ||
Mihaylov & Dikov 2 | w | w | w | w | w | w | w | w | w | w | w | b | b | b | w | w | w | b | b | b | b | b | b | b | b | b | b | b | b | b | b | b | B | B | ||
Michael Schreier 1 | w | w | w | b | w | w | w | w | w | w | b | b | b | b | b | b | b | b | b | b | = | = | b | b | b | b | b | b | b | b | b | b | B | B | ||
Simon Roßkopf 1 | w | w | w | b | w | w | w | w | w | w | b | b | b | b | b | b | b | b | b | b | = | = | b | b | b | b | b | b | b | b | b | b | B | B | ||
Alexander Christian Hefele | w | w | w | w | B | w | w | w | b | w | w | b | w | w | w | b | b | b | w | b | b | b | b | b | b | w | b | w | b | w | w | b | B | B | ||
Ludwig Peuckert | w | w | w | w | w | w | B | w | b | w | b | w | b | b | b | b | b | w | b | w | b | b | b | b | b | w | b | b | B | w | w | b | B | B | ||
Alexander Weidinger | w | b | w | w | B | w | w | w | w | w | w | w | w | w | b | b | w | b | b | b | b | b | b | b | w | b | b | w | b | b | b | = | B | B | ||
Hauke Brinkop | w | b | w | w | B | w | w | w | w | w | w | w | w | w | b | b | w | b | b | b | b | b | b | b | w | b | b | w | b | b | b | = | B | B | ||
Alexandra Fritzen 1 | w | w | w | w | w | w | w | b | w | w | w | w | w | w | w | w | w | w | w | b | b | b | b | b | b | b | b | b | b | b | b | b | B | B | ||
Michael Schreier 3 | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | b | w | w | w | w | b | b | b | b | b | b | w | b | b | b | b | b | w | w | ||
Julia Kindelsberger | w | w | w | w | w | w | w | b | w | w | w | w | w | w | w | w | w | w | b | w | b | b | b | b | b | b | b | b | b | b | b | b | w | w | ||
Nikita Basargin 3 | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | b | b | b | w | w | w | b | b | w | b | B | B | ||
Nikita Basargin 2 | w | w | w | w | w | w | w | w | w | w | w | w | b | b | w | w | w | w | w | w | w | w | w | w | w | w | b | w | w | b | = | b | B | B | ||
Atanas Mirchev 1 | B | w | w | w | B | w | w | w | w | = | w | w | w | w | w | w | w | w | w | w | w | b | b | w | b | b | b | b | w | b | b | w | B | B | ||
Atanas Mirchev 3 | B | w | w | w | B | w | w | w | w | = | w | w | w | w | w | w | w | w | w | w | w | b | b | w | b | b | b | b | w | b | b | w | B | B | ||
Nikita Basargin 1 | w | w | w | w | w | w | w | w | w | w | w | w | w | w | b | w | w | w | w | w | w | w | w | b | b | w | w | w | b | b | b | b | B | B | ||
Master of Competition | w | w | w | w | B | w | w | w | w | w | w | w | w | w | b | b | b | b | w | w | w | w | w | b | b | w | w | w | b | b | w | b | B | B | ||
Lukas Michael Stumberg | w | w | w | w | B | w | w | w | w | w | b | w | w | w | w | w | w | w | w | w | w | w | w | b | b | b | b | b | b | b | b | w | B | B | ||
Thomas Zwickl | w | w | w | w | B | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | b | b | w | w | b | B | w | w | b | w | w | ||
Atanas Mirchev 2 | W | w | w | w | W | w | w | w | w | = | W | w | w | w | w | w | w | w | w | w | w | b | b | w | w | b | b | b | b | b | b | w | B | B | ||
Daniel Stüwe | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | b | b | b | b | b | b | w | w | b | b | b | = | B | ||
Alexandra Fritzen 2 | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | b | b | w | w | b | b | w | w | ||
Mihaylov & Dikov 1 | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | w | B | B | ||
Michael Schreier 2 | W | W | W | W | W | W | W | W | W | W | W | W | b | b | W | W | W | W | W | W | W | W | W | W | W | W | b | W | b | W | W | b | W | W | ||
Simon Roßkopf 2 | W | W | W | W | W | W | W | W | W | W | W | W | b | b | W | W | W | W | W | W | W | W | W | W | W | W | b | W | b | W | W | b | W | W |
So we have a clear winner! Congratulations to Herr Haslbeck! With a few exceptions (including losses against himself), he really knocked everybody's socks off! If this gentleman doesn't get a trophy at the end of the term, nobody will.
It would be nice to show the actual games, but the MC wasn't able to make the Java applet that's supposed to do that work. As a sample of what you're missing, here's the log of a game in which the MC takes a real beating from Herr Haslbeck's third solution.
Looking at the solutions, it appears that nobody thought about “cheating,” und zwar in the following way: Submit a real solution A and two fake solutions B and C. The fake solutions play extremely well, except that if they detect that they are playing against A (e.g., by running A for the first 10 or so moves and checking that the moves coincide), then they start playing extremely crappily, thereby raising A's score. This can be improved by recruting, say, 20 students who normally don't submit solutions, offering one can of beer to each of them, in exchange for their submitting further copies of the fake solution. Then just make sure that the solutions look different enough so that the MC doesn't notice what's going on, and your're on your way to victory.
What are this week's lessons?
- It's never too late to submit an entry, but send an email to the MC. Don't expect him to go fishing after late solutions. (Last year, he was better at that sport, as explained here; but he's getting old and gray-haired like his idols.)
- Always document your solutions, even if the MC doesn't ask for it! This makes your code so much more enjoyable to read, and probably to write too!
- In the competition and in real life, you will often have the choice between reusing existing ideas and rolling your own (i.e., reinventing the wheel). Choose judiciously between reusing existing ideas and rolling your own! Criteria that should guide your decision are, e.g., estimated development effort, quality of the result, but also how much fun (bzw. pain) is connected with either approach.
Update: Mister Tardy has this to say:
Well, would you look at that... it seems my caffeine-powered coding marathon did actually pay off. All the more frustrating that my tardiness cost me half the points my stubbornness earned me :(
But that's how it goes, I guess... still, it's nice to see my effort validated :)
Ergebnisse der dreizehnten Woche
The results for week 13 are in!
Top 30 der Woche | |||
---|---|---|---|
Platz | Wettbewerber(in) | Punkte | |
1. | Stefan Peter Dirix | 30 | |
2. | Simon Roßkopf | 29 | |
Michael Schreier | 29 | ||
4. | Maximilian Haslbeck | 27 | |
5. | Daniel Stüwe | 26 | |
6. | Moritz Sichert | 25 | |
7. | Alexandra Fritzen | 24 | |
Julia Kindelsberger | 24 | ||
9. | Alexander Christian Hefele | 22 | |
David Krebs | 22 | ||
Ludwig Peuckert | 22 | ||
David Zhang | 22 | ||
Hauke Brinkop | 22 | ||
Nikita Basargin | 22 | ||
15. | Maximilian Kirchmeier | 16 | |
16. | Alexander Weidinger | 15 |
Unless one of the current top five competitors of the Semester gets disqualified somehow (e.g., for cheating in an egregious fashion), we already have a set of five winners. Which is good news not only for these participants (congrats!), but also for the MC, because he had grabbed the five trophies already and was dreading a fifth-place tie that would force him to take the U-Bahn to the PEP in Neuperlach and hope the Nanu-Nana still has some trophies left. (Technically, a tie is still possible if Michael Schreier gets 30 points and Herr Brinkop gets 0 points, but what are the odds?)
This week was a rather weird week to rank. The MC divided the solutions into classes and made sure that the higher classes received more points than the lower classes.
- First, there were five real submissions, including a duplicate due to Gruppenarbeit (which is durchaus erlaubt, as usual).
- Then there was a solution that just called gzip, bzip2, and/or xz using unsafe I/O. Although the Aufgabentext wasn't explicit about this (something that was later patched under the course web site's “News”), it should have been clear that this is against the very Spirit of the exercise. The MC could barely suppress the temptation of redirecting gzip to gnuchess and bzip2 to KReversi, making the solution unsound/buggy and giving it 0 points (× 1).
- Then there were two solutions that just used the Haskell GZip library. Since the effort involved in doing so is trivial, it deserves even less points than the unsafe-I/O solution (which at least involved some programming).
- Then there were six solutions where compress and decompress were the identity functions. This is trivial and doesn't compress as much as using gzip or GZip, but it works.
- As usual, buggy implementations are heavily penalized for their bugs. There was one buggy solution, for which decompress (compress data) /= data for most values of data that were tried.
- One solution failed to pass the official QuickCheck test.
- Finally, a number of solutions were undefined. These were simply filtered out, because the authors didn't bother changing the template. No pain, no gain!
In the words of one of the competitors:
{- Unfortunately I had not enough time to finish the Wettbewerbsaufgabe this time. :/ But there is still a small propability of getting some points. (Because the other students seem to be quite lazy and won't even change "undefined" to "id") Well, at least the output is never longer than the input. -}
Entirely accurate! (Even the bit about the students being like their favorite programming language—or is this mistaking lethargy for strategy?) The MC was about to write something along those very lines. Great minds think alike! (And Eigenlob st...) But the small “propability” of getting some points materialized into a 100% probability of getting 22 out of 30 points! Not bad!
To make matters more complicated, a couple of occasionally empty-headed, maybe-just-not-working solutions threw exceptions like
Maybe.fromJust: Nothing Data.ByteString.head: empty ByteString
or exhausted the stack space upon decompression. The MC decided to give these
programs the benefit of the doubt—i.e., he assumed
that the compressed data should be
recoverable, using a bug-free program (which exists in principle even though it
has yet to be found). But there's a thin line between giving the benefit of the
doubt and being utterly naïve, which the MC is not ready to cross;
so whenever a decompression function failed, the expansion ratio is taken to
be 200% (and displayed in italics to make it more peinlich for the
offenders), so that this serves as a lesson. As Joe StalinVladimir Ilyich Ulyanov is supposed to maybe
have said: Vertrauen ist gut, Kontrolle ist besser.
(Update des Meta-Masters:
Joe Stalin said “Vertrauen ist schlecht, Gulag ist besser, Hinrichtung ist am
besten”.)
All right, here come the detailed results in all their glory. The MC built an archive of 50 text files and 50 binary files. In each group, he put 10 tiny files (two-digit size), 10 small files (three-digit size), 10 medium files (four-digit size), 10 large files (five-digit size), and 10 huge files (six-digit size). He didn't dare to go into the millions of bytes because Haskell performance is rather hard to predict and debug for these, and the goal was to focus on compression algorithms, not on Haskell performance tuning. Results are displayed as size after : size before ratios, in %. Lower is better.
Fritzen & Kindelsberger | Sichert | Dirix | Schreier & Roßkopf | Haslbeck | Stüwe | ||
---|---|---|---|---|---|---|---|
File | Size (bytes) | GZip | gzip/bzip2/xz | LZW | LZW | LZ77/Huffman | “Arith” |
ccc.html | 30 | 167% | 103% | 103% | 103% | 153% | 117% |
conclusion.tex | 35 | 146% | 103% | 97% | 103% | 157% | 114% |
test.cpp | 44 | 145% | 102% | 109% | 102% | 184% | 111% |
access_3box_innerleft.gif | 45 | 138% | 102% | 96% | 102% | 182% | 107% |
border_top.gif | 50 | 128% | 102% | 90% | 102% | 162% | 104% |
access_border_bottom.gif | 50 | 128% | 102% | 90% | 102% | 158% | 104% |
access_border_right.gif | 52 | 127% | 102% | 94% | 102% | 171% | 104% |
Test055.ML | 58 | 116% | 102% | 88% | 102% | 129% | 109% |
README_en_GB_thes.txt | 63 | 122% | 102% | 98% | 102% | 143% | 108% |
rule.gif | 65 | 125% | 102% | 82% | 102% | 143% | 103% |
i16.gif | 72 | 121% | 101% | 94% | 101% | 194% | 101% |
Test062.ML | 82 | 100% | 101% | 89% | 101% | 116% | 106% |
border_bottom.gif | 82 | 121% | 101% | 93% | 101% | 174% | 106% |
description-om.txt | 87 | 110% | 101% | 97% | 101% | 123% | 106% |
rgb.png | 89 | 116% | 101% | 99% | 101% | 185% | 100% |
ChatInputFill.png | 90 | 111% | 101% | 93% | 101% | 168% | 96% |
EmbeddedManager_impl.cpp | 94 | 106% | 101% | 99% | 101% | 133% | 105% |
ViewingPaneLeftInnerShadow.png | 96 | 111% | 101% | 95% | 101% | 172% | 93% |
version.texi | 99 | 82% | 83% | 82% | 101% | 118% | 102% |
trashed.html | 99 | 106% | 101% | 96% | 101% | 127% | 104% |
version.cpp | 120 | 112% | 101% | 105% | 101% | 143% | 104% |
ctl0.tex | 168 | 76% | 76% | 79% | 101% | 105% | 104% |
Test036.ML | 285 | 56% | 57% | 68% | 100% | 89% | 101% |
unlocked_bw_18.png | 349 | 106% | 100% | 108% | 100% | 160% | 98% |
tictac.html | 421 | 61% | 61% | 72% | 100% | 81% | 100% |
description-ta.txt | 557 | 48% | 44% | 51% | 93% | 66% | 101% |
ROOT.ML | 651 | 60% | 60% | 70% | 100% | 78% | 98% |
packages.html | 655 | 58% | 58% | 74% | 100% | 90% | 100% |
QuestionMark.pdf | 676 | 55% | 55% | 67% | 100% | 78% | 97% |
gdi.exe | 686 | 65% | 65% | 69% | 100% | 82% | 85% |
Pfeil1.pdf | 686 | 65% | 65% | 74% | 100% | 93% | 98% |
user.exe | 690 | 64% | 64% | 69% | 100% | 81% | 86% |
krnl386.exe | 692 | 65% | 65% | 69% | 100% | 82% | 86% |
Firefox3.png | 703 | 103% | 100% | 117% | 100% | 132% | 100% |
mem_alloc.cpp | 713 | 46% | 46% | 67% | 100% | 77% | 100% |
wallet-20.png | 735 | 103% | 100% | 117% | 100% | 131% | 101% |
buttonec.pdf | 788 | 73% | 74% | 81% | 100% | 105% | 99% |
Lambda.pdf | 802 | 103% | 100% | 120% | 100% | 129% | 101% |
translations.tex | 852 | 53% | 53% | 68% | 100% | 68% | 100% |
msg_44.txt | 895 | 54% | 54% | 79% | 100% | 83% | 100% |
Calendar_Alert.pdf | 1167 | 62% | 62% | 73% | 100% | 92% | 98% |
csharpexec-test.exe | 2560 | 22% | 22% | 23% | 40% | 34% | 34% |
preface.tex | 3212 | 48% | 48% | 63% | 97% | 67% | 100% |
write.exe | 3376 | 23% | 22% | 27% | 45% | 35% | 42% |
contact_us.html | 3518 | 43% | 43% | 60% | 93% | 67% | 99% |
esther-ch2.tex | 4978 | 50% | 47% | 61% | 91% | 70% | 98% |
metatex.exe | 6144 | 44% | 42% | 55% | 81% | 64% | 80% |
TODO.txt | 6645 | 48% | 48% | 58% | 85% | 61% | 99% |
doc.txt | 6938 | 39% | 38% | 53% | 77% | 62% | 100% |
backward.png | 7019 | 100% | 100% | 138% | 100% | 103% | 100% |
clasimp.ML | 7182 | 34% | 34% | 52% | 76% | 65% | 99% |
background.pdf | 7261 | 55% | 52% | 89% | 100% | 90% | 99% |
multi_page_logins.html | 7475 | 38% | 38% | 53% | 78% | 64% | 99% |
CAAudioFileRecorder.cpp | 7556 | 35% | 35% | 54% | 79% | 69% | 99% |
DM_Attribute-Number.pdf | 8365 | 52% | 50% | 101% | 100% | 98% | 100% |
AppleSamplePCI.cpp | 8937 | 30% | 30% | 50% | 72% | 65% | 98% |
bnf_fp_def_sugar_tactics.ML | 8982 | 31% | 31% | 48% | 69% | 64% | 99% |
Stop.pdf | 9286 | 24% | 23% | 38% | 56% | 63% | 95% |
SmallTux.png | 9824 | 99% | 99% | 132% | 100% | 102% | 100% |
creditcard-solo.png | 9899 | 100% | 100% | 139% | 100% | 102% | 100% |
end_user_license_agreement.html | 10031 | 39% | 38% | 55% | 78% | 66% | 99% |
Calendar_Publish.pdf | 18748 | 57% | 55% | 78% | 96% | 77% | 98% |
HP_DeviceSettings.cpp | 26299 | 21% | 20% | 43% | 54% | 67% | 100% |
HISTORY.txt | 35981 | 36% | 32% | 51% | 60% | 63% | 99% |
groebner.ML | 37493 | 30% | 29% | 52% | 57% | 61% | 98% |
fun-brain-2.png | 40042 | 100% | 100% | 140% | 100% | 101% | 100% |
root.tex | 41082 | 33% | 31% | 54% | 58% | 61% | 100% |
money_back.png | 42225 | 100% | 100% | 141% | 100% | 101% | 100% |
welcome-keychain-icon.png | 44112 | 100% | 100% | 141% | 100% | 101% | 100% |
clock.exe | 46488 | 17% | 14% | 44% | 43% | 46% | 72% |
Reply_Pressed.pdf | 48569 | 24% | 22% | 56% | 49% | 61% | 94% |
AUBase.cpp | 62705 | 21% | 20% | 45% | 46% | 66% | 100% |
thm.ML | 64040 | 24% | 23% | 46% | 47% | 59% | 98% |
idnchars.txt | 69650 | 22% | 14% | 45% | 38% | 63% | 93% |
diagonal_2_light.pdf | 73031 | 48% | 45% | 88% | 85% | 75% | 99% |
xpidl.exe | 84480 | 39% | 33% | 74% | 65% | 72% | 95% |
dired-ref.pdf | 85646 | 95% | 95% | 137% | 100% | 100% | 100% |
HOL.tex | 86843 | 32% | 29% | 58% | 53% | 63% | 99% |
LispCall.html | 91732 | 11% | 9% | 42% | 35% | 68% | 99% |
xpcshell.exe | 97280 | 40% | 35% | 73% | 65% | 74% | 96% |
safari_import.png | 110941 | 100% | 100% | 141% | 100% | 101% | 100% |
realconv.cpp | 112664 | 25% | 21% | 63% | 42% | 57% | 93% |
welcome-browser-spiral.png | 120703 | 100% | 100% | 141% | 100% | 101% | 100% |
THIRDPARTY.*JAVAFX.txt | 123324 | 26% | 23% | 52% | 47% | 62% | 99% |
cmath_testcases.txt | 136295 | 28% | 22% | 45% | 41% | 56% | 98% |
WhitePiece.png | 137941 | 100% | 100% | 141% | 100% | 101% | 100% |
dictionary.html | 167415 | 15% | 12% | 38% | 32% | 60% | 100% |
PG-adapting.texi | 171954 | 28% | 23% | 53% | 42% | 60% | 100% |
acachetutorial.pdf | 193933 | 93% | 93% | 137% | 100% | 100% | 100% |
PARSE_TREE.ML | 304787 | 21% | 200% | 42% | 32% | 50% | 93% |
1Password.html | 313489 | 47% | 200% | 93% | 73% | 72% | 99% |
xwindows.cpp | 316758 | 17% | 200% | 52% | 32% | 62% | 96% |
texinfo.tex | 320197 | 30% | 200% | 57% | 44% | 62% | 99% |
eground.exe | 391124 | 43% | 200% | 81% | 62% | 73% | 94% |
polyimport.exe | 498080 | 37% | 200% | 89% | 58% | 69% | 93% |
kew-dec2004.pdf | 619253 | 98% | 200% | 141% | 100% | 100% | 100% |
mapping.pdf | 625717 | 39% | 200% | 83% | 59% | 70% | 98% |
metis.ML | 664659 | 16% | 200% | 44% | 29% | 55% | 97% |
TB_JunkTemplate.pdf | 800400 | 5% | 200% | 62% | 21% | 62% | 99% |
SPASS.exe | 819429 | 41% | 200% | 79% | 58% | 72% | 98% |
Average | 66.30% | 79.19% | 79.73% | 81.71% | 93.74% | 97.88% |
(It's frightening how much rubbish there is lying around on the MC's Mac. Especially worrying are all these Windows executable files, like SPASS.exe. Probably all viruses.)
The two “winning” solutions are those that use GZip or gzip etc. These don't really count, but they are shown in the table for reference. Then comes Herr Dirix's solution:
{-Info Compression Algorithm: Lempel<96>Ziv<96>Welch http://en.wikipedia.org/wiki/Lempel%E2%80%93Ziv%E2%80%93Welch Additional info from: http://www.cs.duke.edu/csed/curious/compression/lzw.html Starts with 8-bit codes (to not expand small files) to max 12-bit codes This should work nicely on textfiles. For binary formats a proper bitrange should be set (for example 5-12) depending on the binary (e.g. gif) (This implementation is hardcoded to a range of [8-16] bits) -}
That's a nice choice! The MC remembers implementing a GIF decoder in C back in 1995, and GIF uses LZW. Ah, nostalgie ! quand tu nous tiens...
Then comes another LZW implementation, due to Herr Schreier and Herr Roßkopf. Strangely enough, the numbers are very different from those obtained by Herr Dirix, sometimes for the better, often for the worse:
{- Version: -inf (Pre-alpha) Team: Simon Roßkopf Michael Schreier Translated the LWZ algorithm from here: http://www.cplusplus.com/articles/iL18T05o/ (Version 1) We planned to implement LWZ+Huffman(Similar to DEFLATE). But, no time. So only LZW No time, no comments sadly. Maybe next time. -} --Those check wether the LZW algorithm inflates the input --If yes, just use the input as output
(Perhaps the doubious performance comes from the anachronistic misspelling of LZW as LWZ. Anachronistic because LZW = LZ78 + Welch84.)
Next, Herr Haslbeck and his dual LZ77 + Huffman implementation:
{-- -- Wow, one could spend a lifetime optimising a compression algorithm. -- -- Ich habe hier eine Version von Huffmankodierung und fuer Dateien -- der Groesse 1 kB bis ca. 1 MB wird zusaetzlich noch eine LZ77 Variante -- verwendet (das Ding ist saulangsam, was wahrscheinlich an meinem Code -- und das Haskell nicht ideal für sowas ist liegt). -- -- Huffmankodierung: -- Okay, ich erstelle einen Huffmantree mithilfe der Funktionen von -- letzter Woche und bringe ihn in die sogenannte "canonical form" -- (siehe Wikipedia). Somit kann er platzsparend gespeichert werden -- (max. 256 bytes). Es werden immer 32KB Bloecke komprimiert, also -- fuer jeden dieser Bloecke ein eigener Baum und entsprechner Code -- erstellt. Ein komprimierter Block beginnt mit einem Bit, indem -- gespeichert wird, wie der Baum gespeichert wurde. Dann folgen -- 15 Bits mit der Länge des komprimierten Blocks, dann der Baum -- und dann Huffmancode in ByteString Form. -- -- LZ77: -- Ich habe versucht, so etwas ähnliches wie lz77 zu implementieren. -- (Also hab ich fast schon deflate (huffman + lz77) implementiert :) ) -- Ganz grob, fuer jede Position wird gesucht ob die naechsten 256 Zeichen -- irgendwie schon in den vorigen 4096 Zeichen vorkommen. Falls eine -- Zeichenkette laenger 2 vorkommt, wird statt dem eigentlichen Zeichen -- ein Verweis auf die vorige Zeichenkette + deren Laenge gespeichert. -- Das Ganze ist saulangsam und wird deshalb nur fuer Dateien > 1024 KB -- bzw. < ca. 1MB angewendet. Hauptsache ich klau jmd. die 40 Punkte. -- -- Das Ganze funktioniert hoffentlich. Es ist fuerchterbarer Durcheinander. -- Es sollte besser kommentiert sein, aber die Zeit ging fuer Testen und -- ultrasubtile Fehler finden drauf. -- -- Es ist 15:07. Ich werde jetzt noch ein bisschen testen, wahrscheinlich -- feststellen, dass ein Bock drin ist und dann alles woran ich glaube -- anzweifeln. -}
Amen.
Update: Herr Haslbeck writes:
der Kommentar zu meinem Programm ist nicht ganz korrekt. Ich habe gestern um 15:20 Uhr ein JPEG mit meinem Huffman + LZ77 noch komprimiert und das Programm lieferte ein falschen Ergebnis [sic]. Deshalb komprimiert mein Programm nicht mit LZ77, sondern implementiert eine reine Huffmankodierung.
Auf zur nächsten Challenge: Kunst programmieren.
Und dann gibt es die Lösung von Herrn Stüwe. (Oops, I got a little bit too inspired by the above passage in German, and the sentence is ausgerutscht like that. Oops, now I even forgot that I'm supposed to be the MC.) [Throat clearing sound]
-- http://www.bodden.de/wordpress/wp-content/uploads/2007/11/ac.pdf -- http://www.cs.ox.ac.uk/people/jeremy.gibbons/publications/arith.pdf -- http://www.staff.science.uu.nl/~jeuri101/afp/afp4/arith-4.pdf -- http://www.cs.cmu.edu/~guyb/realworld/compression.pdf -- http://tcs.rwth-aachen.de/lehre/Komprimierung/SS2012/
Lots of links to prestigious colleges, but how come the ratio is only 97.88%? The MC's grandma can compress better than that!
Interestingly, Herr Stüwe (a.k.a. Herr Kopf) apparently dissolved Team Nord. Last week, we had the interesting case, which the MC neglected to comment, that Herr Stüwe had managed to “verschlimmbessern” his solution a few hours before the deadline; as a result, he got far less points than his “lazier” teammates (where “lazy” can mean “studying for the exams”, of course). For better or for worse, competitions don't reward efforts; they reward results (and luck). One of the challenges in life is to achieve a high correlation between efforts and results (and to maximize luck). This can be arbitrarily difficult slash highly nontrivial. It been said of Einstein that he could have gone sailing for the last 30 years of his life (or served his majesty's government and become a Sir, perhaps) instead of spending his time on fruitless research (relatively speaking).
What is the lesson this week? Hint: It's hidden in the previous paragraph!
Oh, the MC has got mention one more thing. If you kept clicking on last week's entries to find some 18+ ASCII-art, you're probably disappointed by now not to have found any. But one reader who took this task a bit too seriously for his own good found a true gem, a perle rare.
This is the 1st version of the game tree that uses the minimax algorithm to find the best possible move. It will take too long.
And how long did the solution take to play 68 games? 0.127 seconds!! Which brings us to the second lesson of the week: You have the right to remain silent. Anything you write in a comment can and will be used against you. ;-)
Ergebnisse der vierzehnten Woche
The results for the final week are in!
Top 30 der Woche | |||
---|---|---|---|
Platz | Wettbewerber(in) | Punkte | |
1. | Maximilian Haslbeck | 29.333 | |
2. | Stefan Peter Dirix | 24.333 | |
3. | Daniel Stüwe | 23.333 | |
. | Hauke Brinkop | 23.333 | |
. | Simon Roßkopf | 23.333 | |
. | Michael Schreier | 23.333 | |
6. | Alexander Weidinger | 22.667 | |
7. | Nikita Basargin | 22.333 | |
8. | Julian Biendarra | 21.333 | |
9. | Alexander Christian Hefele | 21.000 | |
10. | Martin Hartmond | 18.000 |
Was ist denn ein Token?
The MC is a big fan of token counting as a rough measure of software quality. As the previous year's results have clearly shown, shorter answers tend to be more elegant and (usually) less likely to have subtle bugs.
The MC is making his token-counting program available to the rest of the world. The program builds on Niklas Broberg's Language.Haskell.Exts.Lexer module, which can be installed using Cabal:
cabal update cabal install haskell-src-exts
There appears to be disagreement regarding the status of backquotes in different versions of Haskell (98 vs. 2010). The MC generously adopted the Haskell 98 definition, with `max` treated as one token rather than three.
Here's how to compile and run the program on Linux or Mac:
ghc Tokenize.hs -o tokenize ./tokenize
At this point, type in some Haskell code, e.g.,
sum_max_sq x y z = x ^ 2 + y ^ 2 + z ^ 2 - x `min` y `min` z ^ 2
Make sure to type in an empty line afterwards. Press Ctrl+D to terminate. The program then outputs
24 sxyz=x^2+y^2+z^2-xmymz^2
On Windows, you can run the program with
.\tokenize
or
tokenize
and terminate it with Ctrl+C (Windows 8) or Ctrl+Z followed by Enter (Windows 7, probably also for earlier versions).
The first field, “24”, gives the number of tokens. The second field is a short version of the program, where each token is abbreviated as one letter. This can be useful for debugging. The MC aggregates this information in his top-secret Master View, which looks something like this:
18 sxyz=mxy^2+mxymz^2 xxx@tum.de 18 sxyz=mxy^2+xmymz^2 xxx@tum.de 18 sxyz=mxy^2+zmmxy^2 xxx@mytum.de 20 sxyz=(mxy)^2+(myz)^2 xxx@mytum.de 20 sxyz=mxy^2+m(mxy)z^2 xxx@tum.de 20 sxyz=mxy^2+m(mxy)z^2 xxx@tum.de ... 52 sxyz|x=z&y=z=x*x+y*y|x=y&z=y=x*x+z*z|y=x&z=x=y*y+z*z xxx@mytum.de 52 sxyz|x=z&y=z=x*x+y*y|x=y&z=y=x*x+z*z|y=x&z=x=y*y+z*z xxx@mytum.de 52 sxyz|x=z&y=z=x*x+y*y|x=y&z=y=x*x+z*z|y=x&z=x=y*y+z*z xxx@mytum.de 52 sxyz|x=z&y=z=x^2+y^2|x=y&z=y=x^2+z^2|y=x&z=x=y^2+z^2 xxx@mytum.de ... 252 sxyz|(x+y)>(x+z)&(x+y)>(y+z)=(x*x)+(y*y)|(x+z)>(x+y)&(x+z)>(y+z)=(x*x)+(z*z)|(y+z)>(y+x)&(y+z)>(x+z)=(y*y)+(z*z)|x=z&x>y=(x*x)+(z*z)|x=y&x>z=(x*x)+(y*y)|z=y&z>x=(y*y)+(z*z)|x=z&x<y=(x*x)+(y*y)|x=y&x<z=(x*x)+(z*z)|z=y&z<x=(x*x)+(z*z)|x=y&y=z=(x*x)+(y*y) xxx@mytum.de
The Master View allows the MC to compile the Top 30 der Woche, but also to quickly spot interesting solutions and duplicates. Last year, the Wettbewerbsaufgaben were also Prüfungsleistungen and hence plagiarism was a serious issue to deal with. This year there are no specific guidelines regarding team work.
For the competition, the above program is invoked only on the code snippet included between {-WETT-} and {-TTEW-}, with all type signatures stripped away.
Some competitors have tried to confuse tokenize by using German letters (ä etc.) in their source code. Setting
export LANG=de_DE
before running the program solves this problem.