Compiler Plugins For GHC: Weeks Three and Four
I was attending graduation last weekend so didn’t have time to put a progress post together, which means that this week you get a double helping of GHC-plugins goodness! For those new to the series, you might like to read the first two posts before continuing.
Type Safe Dynamic Loading
This project is all about dynamically loading plugin code into GHC so it can transform the program being compiled. Until now, I blindly trusted that if the plugin exposed a value with the name “plugin” it was indeed a Plugin data structure, but now I’ve pieced together some parts of the GHC typechecker to use the associated .hi files to check whether this is indeed the case.
It might be useful to expose this to users as an alternative to parts of hs-plugins, but as it would need a spot of generalization work to be done first this is not on the agenda at the moment.
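To illustrate the difference between blindly trusting a loaded value and checking its type first, here is a small analogy using `Data.Dynamic` from base. This is not the mechanism described above (which consults .hi files through the GHC typechecker); the `Plugin` type and the two "loaded" values are hypothetical stand-ins.

```haskell
import Data.Dynamic (Dynamic, dynTypeRep, fromDynamic, toDyn)

-- Hypothetical stand-in for the real Plugin data structure.
newtype Plugin = Plugin { pluginName :: String }

-- Pretend these came out of two dynamically loaded modules,
-- each exposing a value called "plugin".
goodModule :: Dynamic
goodModule = toDyn (Plugin "strictify")

badModule :: Dynamic
badModule = toDyn "I am just a String"

-- Accept the value only if it really has type Plugin;
-- otherwise report the type that was actually found.
checkPlugin :: Dynamic -> Either String Plugin
checkPlugin dyn = case fromDynamic dyn of
  Just p  -> Right p
  Nothing -> Left ("expected Plugin, found " ++ show (dynTypeRep dyn))
```

Here `checkPlugin goodModule` yields a `Right`, while `checkPlugin badModule` yields a `Left` carrying an error message instead of crashing later when the value is used.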
Annotations System
What is an annotations system? It is entirely analogous to the annotations system of Java or the attributes of .NET languages in that it allows you to associate bits of user data with elements of the program. In my current design, you can attach annotations to modules, types, data constructors and top level values, like so:
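A minimal sketch of what such annotations can look like, assuming the `ANN` pragma syntax that later shipped in GHC (the design described in this post may have differed). The type, the function, and the annotation payloads are all hypothetical, and data constructor annotations are left out:

```haskell
-- Annotate the module itself.
{-# ANN module "experimental" #-}

-- A hypothetical type; the `type` keyword tells GHC that the
-- annotation targets the type rather than a value of the same name.
data Shape = Circle Double | Square Double
{-# ANN type Shape (Just "geometry") #-}

-- Annotate a top level value.
area :: Shape -> Double
area (Circle r) = pi * r * r
area (Square s) = s * s
{-# ANN area (1 :: Int) #-}
```

In the shipped pragma the annotation payload must have a `Data` instance as well as a `Typeable` one, which is why only standard types such as `String`, `Maybe` and `Int` appear here.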
Actually, you can use arbitrary expressions in your annotations, as long as they eventually boil down to something with a Typeable instance. So this rather esoteric expression would be perfectly fine:
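As a hedged illustration (again assuming the `ANN` pragma syntax that later shipped in GHC, with a hypothetical binding and payload), an annotation built from an ordinary computed expression might look like this:

```haskell
-- A hypothetical top level value to hang the annotation on.
answer :: Int
answer = 42

-- The payload is an arbitrary expression of type (Int, String),
-- evaluated only when something asks for the annotation.
{-# ANN answer (length (filter even [1 .. 10 :: Int]), reverse "noitatonna") #-}
```

The expression may only use functions from other modules (here the Prelude), since the shipped pragma applies Template Haskell style staging restrictions to annotation expressions.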
You probably want to avoid non-termination or even expensive computation in those annotations as they are potentially evaluated at compile time! Plugins can see annotations during compilation and hence can use them to guide the transformations they perform on your code, but you can also access the annotations of any module via the GHC API.
I’ve previously alluded to the difficulties with an annotations system: I’ll take this opportunity to discuss them a bit further. Consider this program:
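A minimal sketch of such a program follows. Only the names `exported` and `not_exported` come from the discussion; the function bodies, the annotation payload, and the use of the later `ANN` pragma syntax are assumptions.

```haskell
module Inlining (exported) where

-- The annotation a plugin is supposed to find.
{-# ANN not_exported "strictify" #-}

-- Not in the export list, and used exactly once below.
not_exported :: Int -> Int
not_exported x = x * 2 + 1

-- The only use site of not_exported.
exported :: Int -> Int
exported y = not_exported y + 10
```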
Because “not_exported” is only used once, its definition will be inlined straight into “exported” (regardless of its size). This means that the annotation on it is entirely useless, as odds are that the plugin will never see the not_exported identifier in the program!
We have the same sort of problem with identifiers in the rules system, and require manual addition of NOINLINE pragmas to the relevant identifiers to circumvent it, but it all feels rather clumsy and I’m not sure what the best solution is.
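Applied to an annotated identifier, the manual workaround would look roughly like the sketch below. The function bodies and annotation payload are hypothetical, and the `ANN` syntax is the one that later shipped in GHC; the `NOINLINE` pragma itself is standard.

```haskell
module InliningWorkaround (exported) where

{-# ANN not_exported "strictify" #-}
-- Ask GHC not to inline this binding, so the identifier (and with it
-- the annotation) should still be present when later passes run.
{-# NOINLINE not_exported #-}
not_exported :: Int -> Int
not_exported x = x * 2 + 1

exported :: Int -> Int
exported y = not_exported y + 10
```

The cost is exactly the clumsiness mentioned above: the programmer has to remember the extra pragma, and gives up an optimisation in exchange.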
Note that this is not a problem for other modules accessing the annotation, as by definition they do so on an exported identifier that does not suffer this treatment.
Another problem with annotations is that it’s almost impossible to allow them on non top-level identifiers with the current GHC implementation, as those identifiers get created and destroyed with reckless abandon by compiler passes.
We work around this for things like SCCs by actually making SCCs a kind of expression in the intermediate language, but doing this for annotations doesn’t sit well with the idea of being able to look up the annotations attached to a particular identifier from other modules using the GHC API. So again, I’m not quite sure what the solution is here.
Sample Plugins
To produce some sample code for the eventual release, and to get a feel for the API I need to provide to plugin authors, I have implemented some simple compiler plugins. Two are complete so far:
- A plugin to make Haskell a strict rather than lazy language
- A plugin that performs GHC's current common subexpression elimination pass, but outside of the main compiler
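To make the two transformations concrete, here is a hand-written sketch of their effect at the source level. It is an illustration only, not the plugins' code: the real plugins rewrite GHC's intermediate language rather than source, and every function name below is hypothetical.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Lazy (ordinary Haskell): the second argument is never evaluated.
lazyFirst :: Int -> Int -> Int
lazyFirst x _ = x

-- Strict: the bang patterns force each argument before the body runs,
-- roughly what a "make everything strict" pass would arrange.
strictFirst :: Int -> Int -> Int
strictFirst !x !_ = x

-- A stand-in for some costly computation.
expensive :: Int -> Int
expensive n = sum [1 .. n]

-- Before common subexpression elimination: `expensive n` appears twice.
beforeCse :: Int -> Int
beforeCse n = expensive n + expensive n

-- After: the shared subexpression is bound once and reused.
afterCse :: Int -> Int
afterCse n = let shared = expensive n in shared + shared
```

`lazyFirst 1 undefined` returns 1, whereas `strictFirst 1 undefined` throws, because the strict version evaluates the argument it never uses. `beforeCse` and `afterCse` always agree on their result; only the amount of work differs.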
There are some more planned: watch this space.
Conclusion
It’s been a month since I began the project, and I’m fairly pleased with my progress up to this point. There’s still a lot left to do, but I’m confident I should have something presentable by the end of the Summer Of Code period.
Next week will probably see some refinements to the annotation and phase control systems, the construction of a few more sample plugins, and perhaps even a start on documenting GHC's internals to some extent.
This sounds like fun:
* A plugin to make Haskell a strict rather than lazy language
A `ghc -fstrict` mode for OCaml refugees?