Ugly field selection syntax

OK, the most trivial complaint first. If we have defined a record like this:

> data Bird = Bird { name :: String, wings :: Integer }

How do we go about accessing the name and wings fields of a record instance? If you are used to a language like C#, you might say it would look something like this:

> (Bird { name = "Fred", wings = 2 }).name

Unfortunately, this isn’t the case. Declaring a record actually creates a function named after each field, which uses pattern matching to destructure the record it is passed and return that field’s value. So access actually looks like this:

> name (Bird { name = "Fred", wings = 2 })
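
Since the accessor is an ordinary function, it can be passed around like any other. A minimal sketch, reusing the Bird record from above; the flock list and its contents are made up purely for illustration:

```haskell
data Bird = Bird { name :: String, wings :: Integer }

-- Hypothetical sample data, not part of the original example.
flock :: [Bird]
flock =
  [ Bird { name = "Fred", wings = 2 }
  , Bird { name = "Clucker", wings = 2 }
  ]

main :: IO ()
main = do
  -- Direct application of the generated accessor function.
  putStrLn (name (Bird { name = "Fred", wings = 2 }))
  -- Accessors are plain functions, so they can be mapped over a list.
  print (map name flock)
  print (sum (map wings flock))
```

This is the one upside of the prefix style: name and wings need no special syntax to be used as arguments to map.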

Prefix notation may please the Lisp fans, but for me at least it can get a bit confusing, especially when you are dealing with nested fields. In this case, the code you write looks like:

> (innerRecordField . outerRecordField) record

Which (when read left to right, as is natural) is entirely the wrong order of accessors. However, it is possible to argue this is just a bug in my brain from having spent too long staring at C# code. Anyway, let’s move on to more substantive complaints!
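
To make the ordering issue with nested fields concrete, here is a small sketch. The Coop record and its field names are assumptions invented for this example. It also shows that the reverse application operator (&) from Data.Function can recover a left-to-right reading, if that is what your brain prefers:

```haskell
import Data.Function ((&))

data Bird = Bird { name :: String, wings :: Integer }

-- Hypothetical outer record, introduced only for this example.
data Coop = Coop { resident :: Bird, capacity :: Integer }

exampleCoop :: Coop
exampleCoop = Coop { resident = Bird { name = "Fred", wings = 2 }, capacity = 10 }

main :: IO ()
main = do
  -- Composition reads right to left: resident is applied first, then name.
  putStrLn ((name . resident) exampleCoop)
  -- Reverse application reads left to right, closer to C#'s coop.resident.name.
  putStrLn (exampleCoop & resident & name)
```

Both lines extract the same value; only the reading order differs.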

Namespace pollution

Imagine you’re writing a Haskell program to model poultry farmers who work as programmers in their spare time, so naturally you want to add a Person record alongside the Bird record above:

> data Person = Person { name :: String, knowsHaskell :: Bool }

But I think you’ll find the compiler has something to say about that…

Records.hs:4:23:
Multiple declarations of `Main.name'
Declared at: Records.hs:3:19
Records.hs:4:23

Ouch! This is because of the automatic creation of the name function I alluded to earlier. Let’s see what the Haskell compiler’s desugaring would look like:

> data Bird = Bird String Integer
>
> name :: Bird -> String
> name (Bird value _) = value
>
> wings :: Bird -> Integer
> wings (Bird _ value) = value
>
> data Person = Person String Bool
>
> name :: Person -> String
> name (Person value _) = value
>
> knowsHaskell :: Person -> Bool
> knowsHaskell (Person _ value) = value

As you can see, we have two name functions in the same scope: that’s no good! In particular, this means you can’t have records which share field names. However, using the magic of type classes we can hack up something approaching a solution. Let’s desugar the records as before, but instead of those name functions add this lot:

> class NameField a where
>   name :: a -> String
>
> instance NameField Bird where
>   name (Bird value _) = value
>
> instance NameField Person where
>   name (Person value _) = value

All we have done here is used the happy (and not entirely accidental) fact that the name field is of type String in both records to create a type class with instances to let us extract it from both record types. A use of this would look something like:

> showName :: (NameField a) => a -> IO ()
> showName hasNameField = print ("Name: " ++ name hasNameField)
>
> -- The desugared types have no field labels, so we construct them positionally
> showName (Person "Simon Peyton-Jones" True)
> showName (Bird "Clucker" 2)

Great stuff! Actually, we could use this hack to establish something like a subtype relationship on records, since any record with at least the fields of another could implement all of its field type classes (like the NameField type class, in this example). Another way this could be extended is to make use of the multiparameter type classes and functional dependency extensions to GHC to let the field types differ.
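
As a rough sketch of that last idea, the class below takes the field type as a second parameter, and a functional dependency says that the record type determines it. The class name HasName and the Robot type are hypothetical, made up for this example. FlexibleInstances is switched on because String is a type synonym:

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances #-}

data Bird = Bird String Integer

-- Hypothetical type whose "name" is a serial number rather than a String.
data Robot = Robot Int

-- The dependency r -> a says: the record type fixes the field type.
class HasName r a | r -> a where
  name :: r -> a

instance HasName Bird String where
  name (Bird value _) = value

instance HasName Robot Int where
  name (Robot serial) = serial

main :: IO ()
main = do
  -- Here name returns a String.
  putStrLn (name (Bird "Clucker" 2))
  -- Here name returns an Int, so arithmetic on the result type checks.
  print (name (Robot 42) + 1)
```

The functional dependency is what lets the compiler work out the result type of name from the argument alone; without it, each call would likely need a type annotation.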

Of course, this is all just one hack on top of another. Actually, considerable brainpower has been expended on improving the Haskell record system, such as in a 2003 paper by the aforementioned Simon Peyton-Jones. This proposal would have let you write something like this:

> showName :: (r <: { name :: String }) => r -> IO ()
> showName { name = myName, .. } = print ("Name: " ++ myName)

The r <: { name :: String } indicates that any record which contains at least a field called name with type String can be consumed. The two dots .. in the pattern match likewise indicate that fields other than name may be present. Note also the use of an anonymous record type: no data declaration was required in the code above. This is obviously a lot more concise than having to create the type classes ourselves, as we did, but actually we can make it even more concise by using another of the proposed extensions:

> showName { name, .. } = print ("Name: " ++ name)

Here, we omit the “name = myName” pattern match and make use of so-called “punning” to give us access to the name field: very nice! Unfortunately, all of this record-y goodness is speculative at least until Haskell’ gets off the ground.
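
For comparison, GHC has a NamedFieldPuns language extension that offers this kind of punning for ordinary declared records. It does not provide the anonymous records or the <: constraint of the proposal, so the sketch below still needs a data declaration and a concrete Bird type. The nameLine helper is a name made up for this example:

```haskell
{-# LANGUAGE NamedFieldPuns #-}

data Bird = Bird { name :: String, wings :: Integer }

-- The pattern Bird { name } binds a local variable called name to the field's value.
nameLine :: Bird -> String
nameLine Bird { name } = "Name: " ++ name

showName :: Bird -> IO ()
showName bird = putStrLn (nameLine bird)

main :: IO ()
main = showName (Bird { name = "Clucker", wings = 2 })
```

Inside nameLine, the pun shadows the top-level name accessor, which is exactly the convenience the proposal describes.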

Record update is not first class

Haskell gives us a convenient syntax for record update. Let’s say that one of our chickens strayed too close to the local nuclear reactor and sprouted an extra limb:

> exampleBird = Bird { name = "Son Of Clucker", wings = 2 }
> exampleBird { wings = 3 }

The last line above will return a Bird identical in all respects except that the wings will have been changed to 3. The naïve amongst us at this point might then think we could write something like:

> changeWings :: Integer -> Bird -> Bird
> changeWings x = { wings = x }

The intention here is to return a function that just sets a Bird record’s wings field to x. Unfortunately, this is not even remotely legal. That does make some sense: if it were legal, record update should, to follow the normal function application convention, look more like this:

> { wings = 3 } exampleBird
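
The usual workaround is to wrap each update in a named function by hand, binding the record explicitly. A minimal sketch; setWings and setName are names chosen for this example:

```haskell
data Bird = Bird { name :: String, wings :: Integer } deriving (Show, Eq)

-- One hand-written setter per field: the record must be named explicitly,
-- because { wings = x } on its own is not an expression.
setWings :: Integer -> Bird -> Bird
setWings x bird = bird { wings = x }

setName :: String -> Bird -> Bird
setName n bird = bird { name = n }

exampleBird :: Bird
exampleBird = Bird { name = "Son Of Clucker", wings = 2 }

main :: IO ()
main = do
  print (setWings 3 exampleBird)
  -- Once wrapped, updates compose like any other functions.
  print ((setName "Grandson Of Clucker" . setWings 4) exampleBird)
```

This recovers the compositional style, but at the cost of one boilerplate function per field, which is rather the point of the complaint.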

Right, I think that’s got everything that’s wrong about Haskell records off my chest: do you know of any points I’ve missed?

Edit: Corrected my pattern match syntax (whoops :-). Thanks, Saizan!

Edit 2: Clarified some points in response to jaybee’s comments on the Reddit comments page.

Josef Svenningsson says:

27 April 2007 09:52 UTC

I agree with you that Haskell's record system is not the strongest part of the language. It's a hack that was added because it was easy to do so and gave a way to refer to components of a constructor by name instead of by position. So the power to weight ratio is pretty high but it still is something of a wart.
The best proposal for records that I have ever seen is the one by Daan Leijen, "Extensible records with scoped labels" (http://www.cs.uu.nl/~daan/download/papers/scopedlabels.pdf). I wish some Haskell implementation would implement that so that one could play with it. It would also be very nice to have in Haskell' but that may be too much to hope for.

Neil Mitchell says:

27 April 2007 10:30 UTC

Daan Leijen, "Extensible records with scoped labels" - http://www.cs.ioc.ee/tfp-icfp-gpce05/tfp-proc/21num.pdf
That paper may be of interest to you, it's a very different, and very neat way of defining records.

Logan Capaldo says:

27 April 2007 11:57 UTC

An alternative hack (which I'm sure you are aware of) would be something like:
module Bird where
data Bird = Bird { name :: String, wings :: Integer }
...
module Person where
data Person = Person { name :: String, age :: Integer }
...
module Main where
import qualified Bird
import qualified Person -- heh
...
> Person.name examplePerson
...
> Bird.name exampleBird
...
To make it a little less wordy:
import qualified Bird as B
...
> B.name exampleBird

batterseapower says:

27 April 2007 12:45 UTC

Thanks a lot for all the great comments! I'll be sure to check out those two papers. And yeah, the module system is one (considerably cleaner, it must be said!) way to get around the namespace problem, but it's still not perfect :). Here's hoping Haskell' solves all this for us!

Saizan says:

27 April 2007 12:48 UTC

your desugaring examples are quite wrong:
> name :: Bird -> String
> name (value, _) = value
should be written:
> name (Bird value _) = value
Also you can indeed write a changeWings function:
> changeWings :: Integer -> Bird -> Bird
> changeWings x b = b{wings = x }
the problem is that you can't write something like this:
> changeField field v b = b{field = v}

batterseapower says:

27 April 2007 13:05 UTC

Saizan, thanks for your comment! You are indeed right that my pattern matches were totally off, I've fixed that (maybe it'll teach me not to make posts without a compiler available!).
However, my point about changeWings stands: what I'm trying to say is that { wings = x } is not actually a function, it's something a bit special that has meaning only when a record value is put before it. It's ugly because it breaks the nice functional orthogonality of other Haskell constructions. I probably should have made this clearer!

josh says:

27 April 2007 18:43 UTC

I think the { wings = x } issue would be less of a problem if you could write "update field container value = container { field = value }". But of course you can't.