Aug 11 2008

Compiler Plugins AngloHaskell Talk

I was given the opportunity to speak on the topic of compiler plugins for GHC at AngloHaskell 2008 last weekend, and the slides and audio are now available here for those interested.

I wish they had taken video instead, as I went forwards and backwards over my slides quite a bit and wrote quite a lot on the whiteboard. I fear this may have left the presentation entirely incomprehensible to those who weren’t there in person (and maybe even some people who were :-) .

The other presenters were great, and I especially enjoyed Neil Mitchell’s talk on how Hoogle works, as type-searching is an area I was totally unfamiliar with. If you’re a local Haskeller, I thoroughly recommend attending the 2009 event!


Jul 21 2008

Compiler Plugins For GHC: Week Six

Are we six weeks into it already? It’s flown by. What did I get up to this week?

Core HTML Output
I bring you the “mystery” project that I promised you last week: an HTML pretty printer for GHC’s Core language! Features of my implementation are:

  • Syntax highlighting
  • Hyperlinked variables: click one to jump to its definition site if it refers to a local name, or to a Hoogle search for it otherwise
  • Mark interesting parts of the Core output with a thick border by clicking on a binder
  • Hover over variable usage sites to highlight their binding sites
  • Handy index of the phases run by the compiler: click a phase name to jump to the Core output by that phase

I’ve put some sample output up here for you to try out, but beware: it’s a 500kB document!

For the inspiration for this project I owe a debt of gratitude to Neil Mitchell’s YHC.Core.HTML. If you are interested in other human readable Core output formats, check out Don Stewart’s ghc-core package, which gives you a syntax-highlighted command line pager for Core.

Spit And Polish
I spent the vast majority of last week tidying up loose ends, hunting and squashing bugs and preening my sample plugin code. This reflects the fact that the torrent of new GHC features I’ve announced here is being bedded down in preparation for finishing the project off and getting them merged into HEAD.

Conclusion
More of the same this week: a focus on tidying up and getting some solid documentation done. I want to get something releasable ready well before the deadline which I can then enhance with discretionary improvements without fear of leaving the Summer of Code period without having something ready for real-world use.

Since this activity is all rather tedious to the casual Summer of Code follower, this will probably be my last weekly blog post. However, I’ll still try and write about any major developments in the project: see you then!


Jul 14 2008

Compiler Plugins For GHC: Week Five

Welcome to the fourth instalment in this ongoing series! How have compiler plugins progressed this week?

Haddocking GHC Internals
If we’re going to have any hope of people other than GHC developers writing plugins, there needs to be accessible documentation about the relevant parts of the compiler. The lion’s share of my time last week was taken up documenting about 40 of GHC’s modules, describing their functions and writing some notes about GHC jargon that is useful to know.

Unfortunately, I don’t yet have an HTML version to show you since Haddock won’t run on GHC’s source code yet due to our use of the C preprocessor. However, I’m assured that this is being resolved.

Annotation System Enhancements
I’ve added the ability for plugins to attach new annotations to Core syntax trees during compilation, rather than just sticking with the ones that were present in the source statically. This means you could now implement e.g. a strictness analysis pass, followed by a separate pass that uses the annotations it generates to perform the worker-wrapper transform.
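To make that two-pass arrangement concrete, here is a minimal sketch on toy data. It is not GHC code: the `Binding` and `Strictness` types, the `forcesArg` flag and both pass functions are hypothetical stand-ins, and the “analysis” is just a lookup of a precomputed flag. The only point is that the first pass writes annotations and the second one reads them.

```haskell
import qualified Data.Map as Map

type Name = String

-- Hypothetical annotation payload produced by the analysis.
data Strictness = Strict | Lazy
  deriving (Eq, Show)

-- Annotations recorded against top-level names.
type Annotations = Map.Map Name Strictness

-- Toy stand-in for a binding: its name, and whether its body forces its argument.
data Binding = Binding { bindName :: Name, forcesArg :: Bool }

-- Pass 1: "analyse" each binding and record the result as an annotation.
strictnessPass :: [Binding] -> Annotations
strictnessPass bs =
  Map.fromList [ (bindName b, if forcesArg b then Strict else Lazy) | b <- bs ]

-- Pass 2: consult the annotations to choose worker-wrapper candidates.
workerWrapperCandidates :: Annotations -> [Binding] -> [Name]
workerWrapperCandidates anns bs =
  [ bindName b | b <- bs, Map.lookup (bindName b) anns == Just Strict ]

main :: IO ()
main = do
  let bindings = [Binding "sumTo" True, Binding "constOne" False]
  print (workerWrapperCandidates (strictnessPass bindings) bindings)
```

Keeping the annotations in a separate map, rather than inside the bindings, mirrors the idea that the second pass only needs the names and whatever the first pass chose to record about them.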

I’ve additionally added the ability to generate annotations through Template Haskell. This smells like it might be useful to someone.

Phase Aliases
Roman Leshchinskiy replied to my previous email on phase control for GHC and came up with some interesting comments. The upshot of this is that I’ve added the ability to specify phase equality, rather than just inequality:

{-# PHASE Foo = Bar, < Spqr #-}

Pan Optimization Sample Plugin
My mentor for this project, Sean Seefried, worked on pluggable compilers as part of his doctoral studies. One of the things that came out of that was a GHC plugin that performed a domain-specific optimization known as “image lifting” on his reimplementation of the Pan combinator library for functional image construction.

I’ve ported this plugin to my own compiler plugins framework, and extracted considerable amounts of its utility code into GHC itself for use by other plugin authors. I also hope to use its codebase as an exemplar of how to write a large compiler plugin.

API Cleanup
GHC has grown by accretion, and as a result some of the APIs we provide are inconsistently named, are parts of incomplete sets of functions and so on. I’ve been spending some time refactoring the worst offenders so the code is a bit more presentable, and hopefully the code will be a bit easier to get a handle on for users new to GHC.

Conclusion
A good week: I’ve hit all the goals I laid out in the last installment of this series. However, documentation work is a bit tedious and I’m glad I’ve got a large chunk of it out of the way: this leaves me free to work on some more exciting things this week.

This week I’m focusing on polishing off the rough edges in my API and sample plugins, so they are something approaching releasable. I also have numerous annoying to-dos accumulated from the last five weeks that I will need to take another look at. This small stuff aside, I’ve just spent the first day of week 6 working on a rather exciting feature that I think will be very useful even for those who do not plan to be plugin authors: you’ll have to wait until my next post to find out about that!


Jul 5 2008

Compiler Plugins For GHC: Weeks Three and Four

I was attending graduation last weekend so didn’t have time to put a progress post together, which means that this week you get a double helping of GHC-plugins goodness! For those new to the series, you might like to read the first two posts before continuing.

Type Safe Dynamic Loading

This project is all about dynamically loading plugin code into GHC so it can transform the program being compiled. Until now, I blindly trusted that if the plugin exposed a value with the name “plugin” it was indeed a Plugin data structure, but now I’ve pieced together some parts of the GHC typechecker to use the associated .hi files to check whether this is indeed the case.

It might be useful to expose this to users as an alternative to parts of hs-plugins, but as it would need a spot of generalization work to be done first this is not on the agenda at the moment.
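The shape of the check can be sketched with `Data.Dynamic` from base. This is only an analogy, not how my GHC patch works: the real check goes through the typechecker and .hi files, whereas here a “loaded module” is just a hypothetical list of names paired with dynamically typed values, and `Plugin` is a stand-in record.

```haskell
import Data.Dynamic (Dynamic, fromDynamic, toDyn)

-- Hypothetical stand-in for the real Plugin record.
-- (Typeable is derived automatically by GHC 7.10 and later.)
newtype Plugin = Plugin { pluginName :: String }

-- A toy "loaded module": exported names paired with dynamically typed values.
type LoadedModule = [(String, Dynamic)]

-- Look up the "plugin" export and refuse it unless it really is a Plugin.
loadPlugin :: LoadedModule -> Either String Plugin
loadPlugin exports =
  case lookup "plugin" exports of
    Nothing  -> Left "module does not export a value named plugin"
    Just dyn ->
      case fromDynamic dyn of
        Nothing -> Left "plugin has the wrong type"
        Just p  -> Right p

goodModule, badModule :: LoadedModule
goodModule = [("plugin", toDyn (Plugin "simple"))]
badModule  = [("plugin", toDyn (42 :: Int))]

main :: IO ()
main = do
  putStrLn (either id pluginName (loadPlugin goodModule))
  putStrLn (either id pluginName (loadPlugin badModule))
```

The important property is the same in both settings: a value with the right name but the wrong type is rejected up front instead of being blindly trusted.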

Annotations System

What is an annotations system? It is entirely analogous to the annotations system of Java or the attributes of .NET languages in that it allows you to associate bits of user data with elements of the program. In my current design, you can attach annotations to modules, types, data constructors and top level values, like so:

module Example where

import AnnotationTypes ( MyAnnotationType )

{-# MODANN MyAnnotationType { friendliness = 10, friend = "Jim" } #-}

{-# TYPEANN Foo MyAnnotationType { friendliness = 20, friend = "Bob" } #-}
data Foo = Bar

{-# ANN f MyAnnotationType { friendliness = 30, friend = "Jane" } #-}
f x = x

Actually, you can use arbitrary expressions in your annotations, as long as they eventually boil down to something with a Typeable instance. So this rather esoteric expression would be perfectly fine:

{-# ANN f SillyAnnotation { foo = (id 10) + $( [| 20 |]), bar = 'f } #-}

You probably want to avoid non-termination or even expensive computation in those annotations as they are potentially evaluated at compile time! Plugins can see annotations during compilation and hence can use them to guide the transformations they perform on your code, but you can also access the annotations of any module via the GHC API.

I’ve previously alluded to the difficulties with an annotations system: I’ll take this opportunity to discuss them a bit further. Consider this program:

module Example2 ( exported ) where

exported = not_exported

{-# ANN not_exported Just "Hello" #-}
not_exported = 10

Because “not_exported” is only used once, its definition will be inlined straight into “exported” (regardless of its size). This means that the annotation on it is entirely useless, as odds are that the plugin will never see the not_exported identifier in the program!

We have the same sort of problem with identifiers in the rules system, and require manual addition of NOINLINE pragmas to the relevant identifiers to circumvent it, but it all feels rather clumsy and I’m not sure what the best solution is.
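For the example above, that manual workaround would look something like the following sketch. It uses only the standard NOINLINE pragma and leaves the annotation itself out, since the ANN syntax is still a moving target. I have put it in a `Main` module with a small `main` purely so that it can be run on its own.

```haskell
module Main ( main, exported ) where

exported :: Int
exported = not_exported

-- Ask GHC not to inline this binding, so the identifier (and anything
-- attached to it) should still exist when later passes run.
{-# NOINLINE not_exported #-}
not_exported :: Int
not_exported = 10

main :: IO ()
main = print exported
```

It works, but the burden is on whoever writes the annotation to remember the extra pragma, which is exactly the clumsiness I would like to avoid.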

Note that this is not a problem for other modules accessing the annotation, as by definition they do so on an exported identifier that does not suffer this treatment.

Another problem with annotations is that it’s almost impossible to allow them on non top-level identifiers with the current GHC implementation, as those identifiers get created and destroyed with reckless abandon by compiler passes.

We work around this for things like SCCs by actually making SCCs a kind of expression in the intermediate language, but doing this for annotations doesn’t sit well with the idea of being able to look up the annotations attached to a particular identifier from other modules using the GHC API. So again, I’m not quite sure what the solution is here.

Sample Plugins

To try and produce some sample code for the eventual release, and to get some experience of the API I need to provide to plugin authors, I have implemented some simple compiler plugins. I’ve got two complete so far:

  • A plugin to make Haskell a strict rather than lazy language
  • A plugin that performs GHC’s current common subexpression elimination pass, but outside of the main compiler

There are some more planned: watch this space.
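To give a flavour of what the strictness plugin is about, here is a minimal sketch of the idea on a toy expression type. It is not the plugin’s code and does not use GHC’s Core data types: `Expr`, `StrictLet` and `strictify` are all hypothetical names. In real Core a strict binding would typically show up as a case expression, which is what forces evaluation there.

```haskell
-- A toy expression language, loosely in the spirit of Core (hypothetical).
data Expr
  = Var String
  | Lit Int
  | App Expr Expr
  | Lam String Expr
  | Let String Expr Expr        -- lazy binding: the right-hand side is a thunk
  | StrictLet String Expr Expr  -- strict binding: evaluate the right-hand side first
  deriving (Eq, Show)

-- Rewrite every lazy let into a strict one, working through the whole tree.
strictify :: Expr -> Expr
strictify expr = case expr of
  Var _                -> expr
  Lit _                -> expr
  App f x              -> App (strictify f) (strictify x)
  Lam v body           -> Lam v (strictify body)
  Let v rhs body       -> StrictLet v (strictify rhs) (strictify body)
  StrictLet v rhs body -> StrictLet v (strictify rhs) (strictify body)

main :: IO ()
main = print (strictify (Lam "y" (Let "x" (App (Var "f") (Var "y")) (Var "x"))))
```

A real pass also has to decide what to do about function arguments and recursive bindings, which this sketch ignores entirely.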

Conclusion
It’s been a month since I began the project, and I’m fairly pleased with my progress up to this point. There’s still a lot left to do, but I’m confident I should have something presentable by the end of the Summer Of Code period.

Next week will probably see some refinements to the annotation and phase control systems, the construction of a few more sample plugins, and perhaps even a start on documenting GHC’s internals to some extent.


Jun 23 2008

Compiler Plugins For GHC: Week Two

I wasn’t quite as productive with my Summer Of Code project this week as I was last week. Let’s take a look at the big ticket items that were accomplished.

Phase Control System Implemented

I covered the design of this in my last post, and most of my work over the week has been on implementing and refining that proposal. It’s now essentially done, with one small but significant wrinkle remaining. The new phase control system lets you write rules that make use of phases above and beyond the existing ontology of phase 0, 1 and 2. An example of such a rule is as follows:

import GHC.Prim ({-# PHASE ConstructorSpecialization #-})

{-# RULES "foldr/build" [~ConstructorSpecialization] forall c n g. foldr c n (build g) = g c n #-}

That’s all very well, but the snag is that we actually have one of these phases for every compiler pass in GHC, so in order to ensure that we always fire the RULEs that may be set up we need to insert a full simplifier pass after almost every compiler pass – yikes! That’s a lot more simplification than we currently do and it can’t be good for compile times. I’m still thinking about how to resolve that one.

Compiler Pipeline Dynamically Constructed From Constraints
This is the reason why we have one phase for every compiler pass: I’ve changed GHC so its entire core-to-core pipeline is now built up from the relative ordering of these phases and the phase tags I’ve attached to every pass. This is a prerequisite for allowing compiler plugin authors to insert their own core-to-core passes by specifying declaratively when they would like them to run.
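Building a pipeline from relative orderings is, at heart, a topological sort. The sketch below shows that idea using `Data.Graph` from the containers package that ships with GHC. It is not the code in my branch: `orderPhases`, the string-based `Phase` type and the phase names used in `main` are assumptions made for illustration. A constraint `(a, b)` reads “a must run before b”, and a cycle of constraints is reported instead of being silently ordered.

```haskell
import Data.Graph (SCC (..), flattenSCC, stronglyConnComp)

type Phase = String

-- (a, b) means "phase a must run before phase b".
type Constraint = (Phase, Phase)

-- Order the phases so that every constraint is respected.
-- Left carries a group of phases whose constraints form a cycle.
orderPhases :: [Phase] -> [Constraint] -> Either [Phase] [Phase]
orderPhases phases constraints = mapM fromSCC sccs
  where
    -- Each phase points at the phases that must run before it.
    -- stronglyConnComp returns components in reverse topological order,
    -- so those prerequisites come out first.
    sccs = stronglyConnComp
             [ (p, p, [ a | (a, b) <- constraints, b == p ]) | p <- phases ]

    fromSCC (AcyclicSCC p) = Right p
    fromSCC scc            = Left (flattenSCC scc)

main :: IO ()
main = do
  let phases = ["SpecConstr", "C", "FloatOut"]
  print (orderPhases phases [("C", "SpecConstr"), ("FloatOut", "C")])
  print (orderPhases phases [("C", "SpecConstr"), ("SpecConstr", "C")])
```

Phases with no constraints between them can come out in any relative order, which is exactly the freedom a plugin author gives the compiler by only stating the orderings they care about.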

Template Haskell Phase Integration
So we have these phase pragmas, but how do plugins go about referring to phases in their actual code that talks to GHC? The answer is with the new support for phases in Template Haskell!

{-# PHASE MyPhase #-}

... stuff ...

getPluginPasses :: HscEnv -> CoreModule -> IO [PhasedPluginPass]
getPluginPasses hsc_env cm = do
    ... stuff ...
    Just phase_name <- thNameToGhcName hsc_env '''MyPhase
    return [PhasedPluginPass phase_name (VanillaPluginPass pass)]

This code is using the new triple quote notation to get a Template Haskell Name for the compiler phase, which is converted to a GHC Name and finally given to GHC itself. Of course, the Template Haskell support allows a lot more than this, such as generating new phases and splicing them in to your code at compile time.
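For comparison, stock Template Haskell already has one-quote and two-quote forms for naming values and types, and the triple quote for phases is modelled on those. Here is a small self-contained example of the two existing forms, using `nameBase` from the template-haskell package that ships with GHC. The names `mapName` and `maybeName` are just for illustration.

```haskell
{-# LANGUAGE TemplateHaskell #-}

import Language.Haskell.TH (Name, nameBase)

-- A single quote gives the Name of a value ...
mapName :: Name
mapName = 'map

-- ... and a double quote gives the Name of a type.
maybeName :: Name
maybeName = ''Maybe

main :: IO ()
main = mapM_ (putStrLn . nameBase) [mapName, maybeName]
```

The nice thing about quoting names this way, rather than passing strings around, is that a misspelt or out-of-scope name is caught when the plugin itself is compiled.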

Conclusion
The project is still coming steadily along. I’m starting this week with ancillary work on the Static Argument Transformation that isn’t directly related to the project, but then I hope to move on to the plugin annotations system that I called out last week as a looming and highly thorny issue: expect to see more on this topic soon!


Jun 15 2008

Compiler Plugins For GHC: The First Week

Things have been coming along very well with my Summer of Code project to add dynamically loaded plugins to the Glasgow Haskell Compiler. In my first week of coding post-finals I’ve got a lot done. I’ll be discussing two of the headline items in this post.

Proof Of Concept Plugin Loading

GHC is capable of dynamically loading plugins specified on the command line from any installed package, and running the compiler phases that they install. To give you an idea of what that looks like, here is the current code for my sample-plugin project:

module Simple.Plugin(plugin) where

import UniqSupply
import DynFlags
import HscTypes
import CoreSyn
import Outputable
import Module

import Plugins (Plugin(..), PluginPass(..))

plugin :: Plugin
plugin = Plugin {
    initializePlugin = initialize,
    getPluginPasses = getPasses
  }

initialize :: IO ()
initialize = do
    putStrLn "Simple Plugin Initialized"

getPasses :: CoreModule -> IO [PluginPass]
getPasses cm = do
    putStrLn "Simple Plugin Passes Inspecting Module"
    let mod_s = showSDoc (pprModule (cm_module cm))
    putStrLn $ "Simple Plugin Passes Queried For " ++ mod_s
    return [PluginPass pass]

pass :: DynFlags -> UniqSupply -> [CoreBind] -> IO [CoreBind]
pass _ _ binds = do
    putStrLn "Simple Plugin Pass Run"
    return binds

There’s a lot of work still to do here: the biggie is allowing “annotations” à la languages like C# that let you mark identifiers or expressions in the language with extra stuff that meta-programs can make use of. For example, you might want to tag which functions you want your compiler plugin to analyse or add instrumenting code to. It’s quite hard to get this feature right, and I’ll probably be posting some more about the issues involved later as I get closer to implementing it.

Phase Control

GHC compiles your programs in a classic pipelined style: the main stages in a typical pipeline would be lexing, parsing, typechecking, desugaring, optimization and finally code generation. Although most of these stages have to run in a particular order, some of the stages can potentially run in multiple orders, most notably those sub-stages that make up the “optimization” stage I mentioned.

This is relevant to plugins because we need to be able to say when any phases you install should run. However, it also turns out that we use this feature to allow compiler users to control when inlining and source code rewrite rules should be applied, as documented in the user guide.
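For reference, here is roughly what that existing numeric phase control looks like in ordinary GHC source. The functions `double` and `quadruple` and the rule itself are made up for illustration. Phases count down (2, then 1, then 0): `[1]` on the INLINE pragma means “not before phase 1”, and `[~1]` on the rule means “only before phase 1”, so the rule gets its chance to fire before `double` is inlined away.

```haskell
module Main ( main, double, quadruple ) where

-- Do not inline double until phase 1, so the rule below can still see it.
{-# INLINE [1] double #-}
double :: Int -> Int
double x = x + x

quadruple :: Int -> Int
quadruple x = 4 * x

-- Only active before phase 1, i.e. while double is still un-inlined.
{-# RULES "double/double" [~1] forall x. double (double x) = quadruple x #-}

main :: IO ()
main = print (double (double 3))
```

Whether the rule actually fires depends on the optimisation level, but the program computes the same answer either way, since both sides of the rule denote the same value.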

The current system we use is a bit ugly and just establishes a mapping between controlled things and the natural numbers to establish an ordering. This week, I’ve proposed (and partially implemented) a system for phase control that uses phase names that are declared in PHASE pragmas and henceforth exported and imported just like any other Haskell name, so for example:

module Spqr(... {-# PHASE C #-} ...) where

import GHC.Phases({-# PHASE SpecConstr #-})

{-# PHASE C < SpecConstr #-}

{-# RULES "silly" [~C] id = (\x -> x) #-}

This establishes a new phase C that must run before GHC’s constructor specialization phase. This phase is in turn used to control the activation of the “silly” rule, and the phase is exported so it can be referred to by other modules.

If you have any comments about this system, please make yourself heard on glasgow-haskell-users!

Conclusion

I’m fairly pleased with my progress so far and having a great time finally doing some coding again after the long exam period!

Hopefully this coming week will see me complete the implementation of the new phase control system and a refactoring of GHC’s existing pipeline construction to take into account that phase information. I should then be able to move on to some issues more directly related to plugins, such as the rather thorny issue of the annotation system.


Apr 24 2008

The Summer Of Code, or Compiler Development for the Masses

I’m very pleased to report that my application for the Google Summer of Code has been accepted! It almost goes without saying that I’ve proposed work on the leading compiler for my language-du-jour: Haskell!

So, what exactly am I working on? Well, I and my mentor, Sean Seefried, think it would be awesome if we could give users of Haskell the ability to extend the compiler with their own code. Of course, they can do this today if they are willing to dig into and try to grok the rather intimidating guts of GHC’s 216KLOC codebase, but we’d really like to let you do it without a source checkout of GHC and in a way that makes it easy to use other people’s extensions too.

How are we going to meet these exacting criteria? The plan is to let people write modules that we can distribute via the existing Cabal packaging/build system infrastructure and load into GHC dynamically! We owe the ability to do this to Don Stewart’s excellent hs-plugins library. I’m also going to rustle up a good chunk of documentation and sample code to make it easy as pie to get into development.

This is going to give the Haskell community a whole new way in which to extend the language: I’m very excited to see what they come up with! However, here is just a taste of some of the more reasonable things I think our plugins are going to be able to do:

  • Selectively make Haskell a strict functional programming language
  • Optimize your code in whatever application-specific way you can come up with
  • Declaratively memoize arbitrary function definitions
  • Compile Haskell code to run on GPUs if available
  • Support simple empirical research on functional programming by letting you write code analysis extensions
  • Cure world hunger

OK, that last one might be a bit optimistic, but I’m still very excited about the possibilities :-) .

What’s more, although nothing is certain, it looks like come October I’ll be working here at the Cambridge Computer Lab as a PhD student! I’ve proposed to investigate some aspects of parallelism in functional programming – more on this as it unfolds. I’ve just got to worry about getting a first in my finals – which are only a month or so away! Gah!