Apr 3 2010

Ditaa support for gitit

I hacked together a quick plugin for the most excellent gitit wiki today. It’s written in Haskell, so it’s an absolute pleasure to write code for it.

What I added support for is a neat little tool called ditaa (DIagrams Through Ascii Art). Basically, in the markdown source of your Gitit wiki you can now write something like the following:

~~~ {.ditaa}
+--------+   +-------+    +-------+
|        | --+ ditaa +--> |       |
|  Text  |   +-------+    |diagram|
|Document|   |!magic!|    |       |
|     {d}|   |       |    |       |
+---+----+   +-------+    +-------+
    :                         ^
    |       Lots of work      |
    +-------------------------+
~~~

The plugin will then call out to the ditaa command line tool (written in Java, boo!) to render that to a beautiful image:

A ditaa image on a Gitit page
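
For the curious, the call-out step can be sketched roughly as follows. This is not the plugin's actual code: the helper names `ditaaCommand` and `renderDitaa`, the scratch file name and the jar path are all assumptions made for illustration. The only thing relied upon is that ditaa can be invoked as `java -jar <jar> <input> <output>`.

```haskell
import System.Exit (ExitCode)
import System.Process (readProcessWithExitCode)

-- | Build the command line that renders one diagram.
-- The jar path is an assumption: point it at wherever you unzipped ditaa.
ditaaCommand :: FilePath -> FilePath -> FilePath -> (FilePath, [String])
ditaaCommand jar input output = ("java", ["-jar", jar, input, output])

-- | Write the ASCII art to a scratch file, then ask ditaa to render it.
renderDitaa :: FilePath -> String -> FilePath -> IO ExitCode
renderDitaa jar source output = do
  let input = output ++ ".ditaa.txt"  -- hypothetical scratch file name
  writeFile input source
  let (exe, args) = ditaaCommand jar input output
  (code, _out, _err) <- readProcessWithExitCode exe args ""
  return code
```

Keeping the command construction pure means it can be inspected without Java installed; only `renderDitaa` actually spawns a process.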

To get this set up for yourself, try the following from the root of your Gitit wiki:

git clone git://github.com/batterseapower/gitit-plugins.git batterseapower-plugins
wget "http://downloads.sourceforge.net/project/ditaa/ditaa/0.9/ditaa0_9.zip?use_mirror=kent" -O ditaa0_9.zip
unzip ditaa0_9.zip

Now edit your Gitit configuration file so the plugins list includes my plugin:

plugins: batterseapower-plugins/Ditaa.hs

That’s it – restart Gitit and you should be ready to go!


May 11 2009

Constraint families

Various people, notably John Meacham, have proposed adding “context aliases” to Haskell. The basic idea is that you could write declarations like the following in Haskell:

context Num a = (Monoid a, Group a, Multiplicative a, FromInteger a)

Now, what this means is that when you write Num a in a constraint, you really mean all of Monoid a, Group a and so on. This means that the following program is valid, and presumably computes the number 7:

foo :: Num a => a
foo = fromInteger 2 `mappend` fromInteger 5
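
For readers who want something they can compile, a similar effect is available in GHC through the `ConstraintKinds` extension, which arrived after this post was written and uses an ordinary `type` synonym rather than the `context` keyword proposed here. The sketch below is only an approximation of the idea: the alias `ShowNum` and the function `describeSum` are made-up names, and `Show` and `Num` stand in for the finer-grained classes above, which are not in base.

```haskell
{-# LANGUAGE ConstraintKinds #-}

-- | A hypothetical alias: writing 'ShowNum a' means both 'Show a' and 'Num a'.
type ShowNum a = (Show a, Num a)

-- | Uses (+) from 'Num' and 'show' from 'Show' under the single alias.
describeSum :: ShowNum a => a -> a -> String
describeSum x y = show (x + y)
```

With this in scope, `describeSum (2 :: Int) 5` evaluates to `"7"`.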

This lets you write shorter type signatures in programs which make ubiquitous use of type classes. However, in the brave new world of type families an obvious generalisation is to allow class-associated constraints. In particular, this lets us solve the classic problem where you can’t make Set an instance of Monad:

class RMonad m where
    context RMonadElem a

    return :: RMonadElem a => a -> m a
    (>>=) :: (RMonadElem a, RMonadElem b) => m a -> (a -> m b) -> m b

instance RMonad [] where
    context RMonadElem a = ()

    return x = [x]
    (>>=) = concatMap

instance RMonad Set where
    context RMonadElem a = Ord a

    return x = singleton x
    s >>= f = fold (\a s' -> union (f a) s') empty s

A few interesting points:

  1. What is the kind signature of the context synonym? We probably need another “kind” – that of class constraints – which is preserved by n-ary tupling.
  2. Can you provide a default implementation for the context synonym? This would let us change the definition of the Monad type class in a backward compatible way, by defaulting RMonadElem a to ()
  3. I mentioned this idea to Ganesh at Fun In The Afternoon, and he told me about his rmonad package, which essentially does exactly this, but by reifying the dictionaries explicitly as data. This is a nice demonstration that the approach is workable, but I think we could really do without the boilerplate dictionary management.
  4. Amusingly, GHC currently represents type classes internally as a normal data type, with some extra invariants. This means that most of the existing machinery for dealing with associated type synonyms could probably be used unchanged to implement this extension!
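
The restricted monad above can be approximated in GHC as it exists now, by combining `ConstraintKinds` with `TypeFamilies`. Treat the following as a sketch under assumptions rather than the extension as proposed: the associated constraint takes `m` as an explicit parameter, the methods are renamed `rreturn` and `rbind` (names of my choosing) to avoid clashing with the Prelude, and the default of `()` from point 2 is written as an associated type default.

```haskell
{-# LANGUAGE ConstraintKinds, TypeFamilies #-}

import Data.Kind (Constraint)
import qualified Data.Set as Set

class RMonad m where
  -- | The constraint each element type must satisfy for this monad.
  type RMonadElem m a :: Constraint
  type RMonadElem m a = ()  -- default: no restriction on elements

  rreturn :: RMonadElem m a => a -> m a
  rbind   :: (RMonadElem m a, RMonadElem m b) => m a -> (a -> m b) -> m b

-- Lists accept any element type, so the default constraint is enough.
instance RMonad [] where
  rreturn x  = [x]
  rbind xs f = concatMap f xs

-- Sets need 'Ord' on their elements, which the associated constraint supplies.
instance RMonad Set.Set where
  type RMonadElem Set.Set a = Ord a
  rreturn   = Set.singleton
  rbind s f = Set.foldr (\a acc -> Set.union (f a) acc) Set.empty s
```

The list instance never mentions `RMonadElem` at all, which is the backward-compatibility story from point 2: existing instances keep compiling because they pick up the `()` default.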

I don’t think that this introduces any horrendous type checking problems, and I can see how the desugarer has to treat dictionaries arising from such contexts. Nonetheless, there are probably some weirdnesses that I’m forgetting, so I should probably try to come up with a real specification (and implementation!) when I get some time…

(P.S: It looks like some guys at Hac5 were working on adding the simple constraint families to GHC – does anyone know how far they got with that?)


May 10 2009

New paper: Types Are Calling Conventions

I’ve just submitted a paper, coauthored with Simon Peyton Jones, to the Haskell Symposium. In this paper, we outline what we think is an interesting point in the design space of intermediate languages for a lazy functional programming language like Haskell, and show some cool optimisations that we can do with it that are hard or impossible to express in the intermediate language used by GHC today.

Although this mainly represents a potential improvement in GHC’s internals, where I’d really like to go with this is to push the ability to make a distinction between strict and lazy data into the type system of Haskell itself. This would mean that you could, for example, write functions that produce element-strict lists, and document some of the strictness properties of your functions in their types.
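
To make "element-strict list" concrete, here is the closest thing you can write in Haskell today. Note that this is not what the paper proposes: it uses an ordinary strictness annotation on a constructor field, so the strictness lives in a separate data type rather than in the types of functions over regular lists. The names `SList`, `fromList` and `toList` are just illustrative.

```haskell
-- | A list whose elements are forced to weak head normal form whenever a
-- cell is forced; the tail (the spine) remains lazy.
data SList a = Nil | Cons !a (SList a)

-- | Convert from an ordinary lazy list.
fromList :: [a] -> SList a
fromList = foldr Cons Nil

-- | Convert back to an ordinary lazy list.
toList :: SList a -> [a]
toList Nil         = []
toList (Cons x xs) = x : toList xs
```

The cost of this encoding is exactly the duplication you see here: every list function has to be rewritten for `SList`, which is part of why putting the distinction in the type system is attractive.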

If any of this sounds interesting to you, you can obtain the paper from my freshly-minted Computer Lab website. You can leave any comments you may have on the corresponding Wiki page.


Aug 31 2008

Hackage Releases Made Easy

The Haskell community has built up a great resource: the Hackage Haskell package database, where we recently hit the 500-package mark!

One of those 500 packages was mine; I added another to their number just an hour ago, and I’ve got two more in the oven. Given, then, that I’m starting to maintain a few packages, I went to the trouble of automating the Hackage release process, and in this post I’m going to briefly walk through setting up this automated environment.

  1. Install cabal-upload from Hackage. I’m afraid that at the time of writing this is not perfectly simple because it won’t build with GHC 6.8 or above: this can be fixed with a new .cabal file, however, which I’ve made available here. (Edit: I’ve just noticed that this functionality seems to have been added to Cabal itself! You may just be able to use cabal upload. However, I’m not sure what the right config file location is for the next step).
  2. Add a file containing your Hackage username and password in the format ("username","password") called ~/.cabal-upload/auth.
  3. Copy the following shell script into a file called release in the root of your project (the same directory as the Setup.lhs file):
    #!/bin/bash
    #
    
    echo "Have you updated the version number? Type 'yes' if you have!"
    read version_response
    
    if [ "$version_response" != "yes" ]; then
        echo "Go and update the version number"
        exit 1
    fi
    
    sdist_output=`runghc Setup.lhs sdist`
    
    if [ "$?" != "0" ]; then
        echo "Cabal sdist failed, aborting"
        exit 1
    fi
    
    # Want to find a line like:
    # Source tarball created: dist/ansi-terminal-0.1.tar.gz
    
    # Test this with:
    # runghc Setup.lhs sdist | grep ...
    filename=`echo $sdist_output | sed 's/.*Source tarball created: \([^ ]*\).*/\1/'`
    echo "Filename: $filename"
    
    if [ "$filename" = "$sdist_output" ]; then
        echo "Could not find filename, aborting"
        exit 1
    fi
    
    # Test this with:
    # echo dist/ansi-terminal-0.1.tar.gz | sed ...
    version=`echo $filename | sed 's/^[^0-9]*\([0-9\.]*\).tar.gz$/\1/'`
    echo "Version: $version"
    
    if [ "$version" = "$filename" ]; then
        echo "Could not find version, aborting"
        exit 1
    fi
    
    echo "This is your last chance to abort! I'm going to upload in 10 seconds"
    sleep 10
    
    git tag "v$version"
    
    if [ "$?" != "0" ]; then
        echo "Git tag failed, aborting"
        exit 1
    fi
    
    # You need to have stored your Hackage username and password as directed by cabal-upload
    # I use -v5 because otherwise the error messages can be cryptic :-)
    cabal-upload -v5 $filename
    
    if [ "$?" != "0" ]; then
        echo "Hackage upload failed, aborting"
        exit 1
    fi
    
    # Success!
    exit 0
  4. When you’re ready to release something, simply run the shell script! Not only will this package up your project and upload it to Hackage, it will also add a version tag to your Git repository (obviously you should change this bit if you are using another VCS!).
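
The version-extraction step is the most fragile part of the script, since the sed pattern assumes the package name contains no digits. As an alternative sketch (not part of the script above, and with `versionFromTarball` being a name I made up), the same extraction could be written in Haskell by taking everything after the last hyphen of the file name:

```haskell
import Data.Char (isDigit)
import Data.List (isSuffixOf)

-- | Extract the version from a path such as
-- "dist/ansi-terminal-0.1.tar.gz". Returns Nothing when the path does not
-- look like "<name>-<version>.tar.gz".
versionFromTarball :: FilePath -> Maybe String
versionFromTarball path
  | suffix `isSuffixOf` file
  , not (null version)
  , length stem > length version        -- there must be a package name too
  , all (\c -> isDigit c || c == '.') version
  = Just version
  | otherwise = Nothing
  where
    suffix  = ".tar.gz"
    file    = reverse (takeWhile (/= '/') (reverse path))      -- drop directories
    stem    = take (length file - length suffix) file          -- drop ".tar.gz"
    version = reverse (takeWhile (/= '-') (reverse stem))      -- text after last '-'
```

Because it splits on the last hyphen rather than the first digit, a package name containing digits does not confuse it.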

If you would like to follow my continuing adventures in Haskell open source, please check out my GitHub profile! Patches gratefully accepted :-)


Aug 11 2008

Compiler Plugins AngloHaskell Talk

I was given the opportunity to speak on the topic of compiler plugins for GHC at AngloHaskell 2008 last weekend, and the slides and audio are now available here for those interested.

I wish they had taken video instead, as I went forwards and backwards over my slides quite a bit and wrote quite a lot on the whiteboard. I fear this may have left the presentation entirely incomprehensible to those who weren’t there in person (and maybe even some people who were :-)).

The other presenters were great, and I especially enjoyed Neil Mitchell’s talk on how Hoogle works, as type-searching is an area I was totally unfamiliar with. If you’re a local Haskeller, I thoroughly recommend attending the 2009 event!


Jul 21 2008

Compiler Plugins For GHC: Week Six

Are we six weeks into it already? It’s flown by. What did I get up to this week?

Core HTML Output
I bring you the “mystery” project that I promised you last week: an HTML pretty printer for GHC’s Core language! Features of my implementation are:

  • Syntax highlighting
  • Hyperlinked variables: click one to jump to its definition site if it refers to a local name, or to a Hoogle search for it otherwise
  • Mark interesting parts of the Core output with a thick border by clicking on a binder
  • Hover over variable usage sites to highlight their binding sites
  • Handy index of the phases run by the compiler: click a phase name to jump to the Core output by that phase

I’ve put some sample output up here for you to try out, but beware: it’s a 500kB document!

For the inspiration for this project I owe a debt of gratitude to Neil Mitchell’s YHC.Core.HTML. If you are interested in other human readable Core output formats, check out Don Stewart’s ghc-core package, which gives you a syntax-highlighted command line pager for Core.

Spit And Polish
I spent the vast majority of last week tidying up loose ends, hunting and squashing bugs and preening my sample plugin code. This reflects the fact that the torrent of new GHC features I’ve announced here is being bedded down in preparation for finishing the project off and getting them merged into HEAD.

Conclusion
More of the same this week: a focus on tidying up and getting some solid documentation done. I want to get something releasable ready well before the deadline which I can then enhance with discretionary improvements without fear of leaving the Summer of Code period without having something ready for real-world use.

Since this activity is all rather tedious to the casual Summer of Code follower, this will probably be my last weekly blog post. However, I’ll still try and write about any major developments in the project: see you then!


Jul 14 2008

Compiler Plugins For GHC: Week Five

Welcome to the fourth instalment in this ongoing series! How have compiler plugins progressed this week?

Haddocking GHC Internals
If we’re going to have any hope of people other than GHC developers writing plugins, there needs to be accessible documentation about the relevant parts of the compiler. The lion’s share of my time last week was taken up documenting about 40 of GHC’s modules, describing their functions and writing some notes about GHC jargon that is useful to know.

Unfortunately, I don’t yet have an HTML version to show you since Haddock won’t run on GHC’s source code yet due to our use of the C preprocessor. However, I’m assured that this is being resolved.

Annotation System Enhancements
I’ve added the ability for plugins to attach new annotations to Core syntax trees during compilation, rather than just sticking with the ones that were present in the source statically. This means you could now implement e.g. a strictness analysis pass, followed by a separate pass that uses the annotations it generates to perform the worker-wrapper transform.

I’ve additionally added the ability to generate annotations through Template Haskell. This smells like it might be useful to someone.

Phase Aliases
Roman Leshchinskiy replied to my previous email on phase control for GHC and came up with some interesting comments. The upshot of this is that I’ve added the ability to specify phase equality, rather than just inequality:

{-# PHASE Foo = Bar, < Spqr #-}

Pan Optimization Sample Plugin
My mentor for this project, Sean Seefried, worked on pluggable compilers as part of his doctoral studies. One of the things that came out of that was a GHC plugin that performed a domain-specific optimization known as “image lifting” on his reimplementation of the Pan combinator library for functional image construction.

I’ve ported this plugin to my own compiler plugins framework, and extracted considerable amounts of its utility code into GHC itself for use by other plugin authors. I also hope to use its codebase as an exemplar of how to write a large compiler plugin.

API Cleanup
GHC has grown by accretion, and as a result some of the APIs we provide are inconsistently named, are parts of incomplete sets of functions and so on. I’ve been spending some time refactoring the worst offenders so the code is a bit more presentable, and hopefully the code will be a bit easier to get a handle on for users new to GHC.

Conclusion
A good week: I’ve hit all the goals I laid out in the last installment of this series. However, documentation work is a bit tedious and I’m glad I’ve got a large chunk of it out of the way: this leaves me free to work on some more exciting things this week.

This week I’m focusing on polishing off the rough edges in my API and sample plugins, so they are something approaching releasable. I also have numerous annoying to-dos accumulated from the last five weeks that I will need to take another look at. This small stuff aside, I’ve just spent the first day of week 6 working on a rather exciting feature that I think will be very useful even for those who do not plan to be plugin authors: you’ll have to wait until my next post to find out about that!


Jul 5 2008

Compiler Plugins For GHC: Weeks Three and Four

I was attending graduation last weekend so didn’t have time to put a progress post together, which means that this week you get a double helping of GHC-plugins goodness! For those new to the series, you might like to read the first two posts before continuing.

Type Safe Dynamic Loading

This project is all about dynamically loading plugin code into GHC so it can transform the program being compiled. Until now, I blindly trusted that if the plugin exposed a value with the name “plugin” it was indeed a Plugin data structure, but now I’ve pieced together some parts of the GHC typechecker to use the associated .hi files to check whether this is indeed the case.

It might be useful to expose this to users as an alternative to parts of hs-plugins, but as it would need a spot of generalization work to be done first this is not on the agenda at the moment.

Annotations System

What is an annotations system? It is entirely analogous to the annotations system of Java or the attributes of .NET languages in that it allows you to associate bits of user data with elements of the program. In my current design, you can attach annotations to modules, types, data constructors and top level values, like so:

module Example where

import AnnotationTypes ( MyAnnotationType )

{-# MODANN MyAnnotationType { friendliness = 10, friend = "Jim" } #-}

{-# TYPEANN Foo MyAnnotationType { friendliness = 20, friend = "Bob" } #-}
data Foo = Bar

{-# ANN f MyAnnotationType { friendliness = 30, friend = "Jane" } #-}
f x = x

Actually, you can use arbitrary expressions in your annotations, as long as they eventually boil down to something with a Typeable instance. So this rather esoteric expression would be perfectly fine:

{-# ANN f SillyAnnotation { foo = (id 10) + $( [| 20 |]), bar = 'f } #-}

You probably want to avoid non-termination or even expensive computation in those annotations as they are potentially evaluated at compile time! Plugins can see annotations during compilation and hence can use them to guide the transformations they perform on your code, but you can also access the annotations of any module via the GHC API.
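
To show why a Typeable instance is all that is required, here is a toy model of an annotation store. It has nothing to do with how GHC stores annotations internally: `AnnEnv`, `annotate` and `lookupAnns` are hypothetical names, and the whole thing is an ordinary association list of `Dynamic` values. It also assumes a GHC recent enough (7.10 or later) that every data type gets a Typeable instance automatically.

```haskell
import Data.Dynamic (Dynamic, fromDynamic, toDyn)
import Data.Maybe (mapMaybe)
import Data.Typeable (Typeable)

-- | The example annotation type from above.
data MyAnnotationType = MyAnnotationType
  { friendliness :: Int
  , friend       :: String
  } deriving (Eq, Show)

-- | Toy store: each entry pairs an identifier's name with one annotation
-- of arbitrary (but Typeable) type.
type AnnEnv = [(String, Dynamic)]

-- | Attach an annotation of any Typeable type to a name.
annotate :: Typeable a => String -> a -> AnnEnv -> AnnEnv
annotate name value env = (name, toDyn value) : env

-- | Retrieve every annotation on a name that has the requested type;
-- annotations of other types are silently skipped.
lookupAnns :: Typeable a => String -> AnnEnv -> [a]
lookupAnns name env = mapMaybe fromDynamic [d | (n, d) <- env, n == name]
```

A plugin interested only in `MyAnnotationType` would ask for `lookupAnns "f" env :: [MyAnnotationType]` and never see annotations that other plugins attached to the same identifier.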

I’ve previously alluded to the difficulties with an annotations system: I’ll take this opportunity to discuss them a bit further. Consider this program:

module Example2 ( exported ) where

exported = not_exported

{-# ANN not_exported Just "Hello" #-}
not_exported = 10

Because “not_exported” is only used once, its definition will be inlined straight into “exported” (regardless of its size). This means that the annotation on it is entirely useless, as odds are that the plugin will never see the not_exported identifier in the program!

We have the same sort of problem with identifiers in the rules system, and require manual addition of NOINLINE pragmas to the relevant identifiers to circumvent it, but it all feels rather clumsy and I’m not sure what the best solution is.

Note that this is not a problem for other modules accessing the annotation, as by definition they do so on an exported identifier that does not suffer this treatment.

Another problem with annotations is that it’s almost impossible to allow them on non top-level identifiers with the current GHC implementation, as those identifiers get created and destroyed with reckless abandon by compiler passes.

We work around this for things like SCCs by actually making SCCs a kind of expression in the intermediate language, but doing this for annotations doesn’t sit well with the idea of being able to look up the annotations attached to a particular identifier from other modules using the GHC API. So again, I’m not quite sure what the solution is here.

Sample Plugins

To try and produce some sample code for the eventual release and get some experience about the API I need to provide to plugin authors I have implemented some simple compiler plugins. I’ve got two complete so far:

  • A plugin to make Haskell a strict rather than lazy language
  • A plugin that performs GHC’s current common subexpression elimination pass, but outside of the main compiler

There are some more planned: watch this space.

Conclusion
It’s been a month since I began the project, and I’m fairly pleased with my progress up to this point. There’s still a lot left to do, but I’m confident I should have something presentable by the end of the Summer Of Code period.

Next week will probably see some refinements to the annotation and phase control systems, the construction of a few more sample plugins, and perhaps even a start on documenting GHC’s internals to some extent.


Jun 23 2008

Compiler Plugins For GHC: Week Two

I wasn’t quite as productive with my Summer Of Code project this week as I was last week. Let’s take a look at the big ticket items that were accomplished.

Phase Control System Implemented

I covered the design of this in my last post, and most of my work over the week has been on implementing and refining that proposal. It’s now essentially done, with a remaining small but significant wrinkle. The new phase control system lets you write rules that make use of phases above and beyond the existing ontology of phase 0, 1 and 2. An example of such a rule is as follows:

import GHC.Prim ({-# PHASE ConstructorSpecialization #-})

{-# RULES "foldr/build" [~ConstructorSpecialization] foldr c n (build g) = g c n #-}

That’s all very well, but the snag is that we actually have one of these phases for every compiler pass in GHC, so in order to ensure that we always fire the RULEs that may be set up we need to insert a full simplifier pass after almost every compiler pass – yikes! That’s a lot more simplification than we currently do and it can’t be good for compile times. I’m still thinking about how to resolve that one.

Compiler Pipeline Dynamically Constructed From Constraints
This is the reason why we have one phase for every compiler pass: I’ve changed GHC so its entire core-to-core pipeline is now built up from the relative ordering of these phases and the phase tags I’ve attached to every pass. This is a prerequisite for allowing compiler plugin authors to insert their own core-to-core passes by specifying declaratively when they would like them to run.
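
The general technique here is a topological sort over "runs before" constraints. The sketch below illustrates that idea only; it is not GHC's implementation, and `Phase`, `orderPhases` and the use of `Data.Graph` from the containers package are all my own choices for the example.

```haskell
import Data.Graph (SCC (..), flattenSCC, stronglyConnComp)

-- | Phases are identified by name in this toy model.
type Phase = String

-- | Order phases so that every constraint (a, b), read as "a runs before b",
-- is respected. Returns Left with the offending phases if the constraints
-- are cyclic and so cannot all be satisfied.
orderPhases :: [Phase] -> [(Phase, Phase)] -> Either [Phase] [Phase]
orderPhases phases before = traverse unSCC sccs
  where
    -- Each phase "depends on" the phases that must run before it, and
    -- stronglyConnComp lists dependencies ahead of their dependents.
    sccs = stronglyConnComp
      [ (p, p, [a | (a, b) <- before, b == p]) | p <- phases ]
    unSCC (AcyclicSCC p) = Right p
    unSCC scc            = Left (flattenSCC scc)
```

For example, `orderPhases ["SpecConstr", "C"] [("C", "SpecConstr")]` yields `Right ["C", "SpecConstr"]`, while a pair of contradictory constraints comes back as a `Left` naming the phases involved.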

Template Haskell Phase Integration
So we have these phase pragmas, but how do plugins go about referring to phases in their actual code that talks to GHC? The answer is with the new support for phases in Template Haskell!

{-# PHASE MyPhase #-}

... stuff ...

getPluginPasses :: HscEnv -> CoreModule -> IO [PhasedPluginPass]
getPluginPasses hsc_env cm = do
    ... stuff ...
    Just phase_name <- thNameToGhcName hsc_env '''MyPhase
    return [PhasedPluginPass phase_name (VanillaPluginPass pass)]

This code is using the new triple quote notation to get a Template Haskell Name for the compiler phase, which is converted to a GHC Name and finally given to GHC itself. Of course, the Template Haskell support allows a lot more than this, such as generating new phases and splicing them in to your code at compile time.

Conclusion
The project is still coming steadily along. I’m starting this week with ancillary work on the Static Argument Transformation that isn’t directly related to the project, but then I hope to move on to the plugin annotations system that I called out last week as a looming and highly thorny issue: expect to see more on this topic soon!


Jun 15 2008

Compiler Plugins For GHC: The First Week

Things have been coming along very well with my Summer of Code project to add dynamically loaded plugins to the Glasgow Haskell Compiler. In my first week of coding post-finals I’ve got a lot done. I’ll be discussing two of the headline items in this post.

Proof Of Concept Plugin Loading

GHC is capable of dynamically loading plugins specified on the command line from any installed package, and running the compiler phases that they install. To give you an idea of what that looks like, here is the current code for my sample-plugin project:

module Simple.Plugin(plugin) where

import UniqSupply
import DynFlags
import HscTypes
import CoreSyn
import Outputable
import Module

import Plugins (Plugin(..), PluginPass(..))

plugin :: Plugin
plugin = Plugin {
    initializePlugin = initialize,
    getPluginPasses = getPasses
  }

initialize :: IO ()
initialize = do
    putStrLn "Simple Plugin Initialized"

getPasses :: CoreModule -> IO [PluginPass]
getPasses cm = do
    putStrLn "Simple Plugin Passes Inspecting Module"
    let mod_s = showSDoc (pprModule (cm_module cm))
    putStrLn $ "Simple Plugin Passes Queried For " ++ mod_s
    return [PluginPass pass]

pass :: DynFlags -> UniqSupply -> [CoreBind] -> IO [CoreBind]
pass _ _ binds = do
    putStrLn "Simple Plugin Pass Run"
    return binds

There’s a lot of work still to do here: the biggie is allowing “annotations” à la languages like C# that let you mark identifiers or expressions in the language with extra stuff that meta-programs can make use of. For example, you might want to tag which functions you want your compiler plugin to analyse or add instrumenting code to. It’s quite hard to get this feature right, and I’ll probably be posting some more about the issues involved later as I get closer to implementing it.

Phase Control

GHC compiles your programs in a classic pipelined style: the main stages in a typical pipeline would be lexing, parsing, typechecking, desugaring, optimization and finally code generation. Although most of these stages have to run in a particular order, some of the stages can potentially run in multiple orders, most notably those sub-stages that make up the “optimization” stage I mentioned.

This is relevant to plugins because we need to be able to say when any phases you install should run. However, it also turns out that we use this feature to allow compiler users to control when inlining and source code rewrite rules should be applied, as documented in the user guide.

The current system we use is a bit ugly and just establishes a mapping between controlled things and the natural numbers to establish an ordering. This week, I’ve proposed (and partially implemented) a system for phase control that uses phase names that are declared in PHASE pragmas and henceforth exported and imported just like any other Haskell name, so for example:

module Spqr(... {-# PHASE C #-} ...) where

import GHC.Phases({-# PHASE SpecConstr #-})

{-# PHASE C < SpecConstr #-}

{-# RULES "silly" [~C] id = (\x -> x) #-}

This establishes a new phase C that must run before GHC’s constructor specialization phase. This phase is in turn used to control the activation of the “silly” rule, and the phase is exported so it can be referred to by other modules.

If you have any comments about this system, please make yourself heard on glasgow-haskell-users!

Conclusion

I’m fairly pleased with my progress so far and having a great time finally doing some coding again after the long exam period!

Hopefully this coming week will see me complete the implementation of the new phase control system and a refactoring of GHC’s existing pipeline construction to take into account that phase information. I should then be able to move on to some issues more directly related to plugins, such as the rather thorny issue of the annotation system.