Sunday, February 22, 2009

Hoogle package search

Recently on the Haskell mailing list there has been some discussions of which packages Hoogle searches by default. One person remarked that it was unfortunate that the network package isn't searched by default. There are lots of packages on Hackage, and Hoogle needs to decide how to cope with so much choice. There are a number of questions that I need to answer in Hoogle:


  1. What packages should Hoogle search by default? All of hackage? The base libraries? Only the packages a user has installed? Only packages that make it in to the Haskell Platform?

  2. What groups of packages should Hoogle have available? Each package individually? All packages which compile on Windows? All packages by a certain author? All packages whose minor version number is even?

  3. What UI should Hoogle show? Should there be checkboxes for each os's package? Should their be a checkbox for each compiler/version? Should their be no UI but some documentation?



And these questions present a number of trade offs:


  • The packages have to be divided under sensible and clear lines - I don't want to (and shouldn't) arbitrate divisions like "good" or "popular".

  • The more packages you search, the less relevant the results will be.

  • The fewer packages you search, the more chance that you miss something.

  • The more UI that is added the more confusing things get.

  • My development time for Hoogle derives Bounded, Finite and increasingly also derives Small.



Thoughts and suggestions are very welcome. I've set up a wiki page to track peoples thoughts, please make your view and arguments known: http://haskell.org/haskellwiki/Hoogle/Packages.

(As an aside, I recently found that dolphin friendly tuna is actually really harmful to the environment, far more harmful than dolphin unfriendly tuna. Read more here.)

Tuesday, February 03, 2009

Monomorphism and Defaulting

Haskell has some ugly corners - not many, but a few. One that many people consider exceptionally ugly is the monomorphism restriction. In this post I'm going to discuss three related issues - Constant Applicative Forms (CAFs), the monomorphism restriction and defaulting. But before we start, lets take a simple example.

Computing Pi

Haskell already provides the pi function which represents the value of pi, but lets assume it didn't. Taking a quick look at Wikipedia we can see that one way of computing Pi is the Gregory-Leibniz series. We can calculate pi as:

pi = (4/1) + (-4/3) + (4/5) + (-4/7) + (4/9) + (-4/11) ...

So let's write that as a Haskell program:


pie = sum $ take 1000000 $ zipWith (/) (iterate negate 4) [1,3..]


Here the constant 1000000 gives the accuracy of our approach, increasing this value will give a higher precision. As it currently stands, the Haskell library says pi = 3.14159265358979 and our program says pie = 3.14159165358977. Thirteen matching digits should be suffient for most uses of pi :-)

CAFs

The disadvantage of our pie function is that (under Hugs) it takes about 4 seconds to evaluate. If we are performing lots of calculations with pi, calculating pie each time will be a serious problem. CAFs are the solution!

A CAF is a top-level constant, which doesn't take any arguments, and will be computed at most once per program execution. As a slight subtlety, if the constant has class constraints on it (i.e. is Num a => a, instead of a) then it isn't a CAF because the class constraints act like implicit arguments. Our pie function above doesn't take any arguments, so is a CAF.

Defaulting

While pie doesn't have any class constraints, the right-hand side of pie does! Take a look in Hugs:


Main> :t sum $ take 1000000 $ zipWith (/) (iterate negate 4) [1,3..]
:: (Enum a, Fractional a) => a

Main> :t pie
:: Double


The right-hand side works for any Enum and Fractional type, for example Float, but pie is restricted to Double. The reason is the defaulting mechanism in Haskell - if a type can't be nailed down precisely, but is one of a handful of built-in classes, then it will default to a particular type. This feature is handy for working at an interactive environment, but can sometimes be a little unexpected.

Monomorphism restriction

Without defaulting the compiler would infer the type of pie as ::(Enum a, Fractional a) => a. However, such a definition would be rejected by the monomorphism restriction. The monomorphism restriction states that a function with no explicit arguments, but with class constraints, must be given a type annotation. This rejects functions like:


snub = sort . nub


To fix the problem there are two solutions:


snub i_hate_the_evil_mr = (sort . nub) i_hate_the_evil_mr

snub :: Ord a => [a] -> [a]
snub = sort . nub


For a function like pie only the second approach is applicable. The addition of dummy arguments to avoid the monomorphism restriction is sufficiently common that the HLint tool never suggests eta-reduction if the argument is named mr.

Conclusion

So why was the monomorphism restriction first introducted? For a function with no explicit arguments, the programmer might think they had written a CAF, but class constraints may substantially degrade the performance. Defaulting reduces the number of cases where the monomorphism restriction would otherwise bite, but it is still useful to be aware of the ugly corners.

There are proposals afoot to remove the monomorphism restriction and to increase the power of the default mechanism - hopefully both will be included in to Haskell'.