7.4. Extensions to data types and type synonyms (2022)

7.4.1.Data types with no constructors

With the -fglasgow-exts flag, GHC lets you declarea data type with no constructors. For example:

 data S -- S :: * data T a -- T :: * -> *

Syntactically, the declaration lacks the "= constrs" part. The type can be parameterised over types of any kind, but if the kind isnot * then an explicit kind annotation must be used(see Section7.8.4, “Explicitly-kinded quantification”).

Such data types have only one value, namely bottom.Nevertheless, they can be useful when defining "phantom types".

7.4.2.Data type contexts

Haskell allows datatypes to be given contexts, e.g.

data Eq a => Set a = NilSet | ConsSet a (Set a)

give constructors with types:

NilSet :: Set aConsSet :: Eq a => a -> Set a -> Set a

In GHC this feature is an extension calledDatatypeContexts, and on by default.

7.4.3.Infix type constructors, classes, and type variables

GHC allows type constructors, classes, and type variables to be operators, andto be written infix, very much like expressions. More specifically:

  • A type constructor or class can be an operator, beginning with a colon; e.g. :*:. The lexical syntax is the same as that for data constructors.

  • Data type and type-synonym declarations can be written infix, parenthesised if you want further arguments. E.g.

     data a :*: b = Foo a b type a :+: b = Either a b class a :=: b where ... data (a :**: b) x = Baz a b x type (a :++: b) y = Either (a,b) y
  • Types, and class constraints, can be written infix. For example

    x :: Int :*: Bool f :: (a :=: b) => a -> b 
  • A type variable can be an (unqualified) operator e.g. +. The lexical syntax is the same as that for variable operators, excluding "(.)", "(!)", and "(*)". In a binding position, the operator must be parenthesised. For example:

     type T (+) = Int + Int f :: T Either f = Left 3 liftA2 :: Arrow (~>) => (a -> b -> c) -> (e ~> a) -> (e ~> b) -> (e ~> c) liftA2 = ...
  • Back-quotes work as for expressions, both for type constructors and type variables; e.g. Int `Either` Bool, or Int `a` Bool. Similarly, parentheses work the same; e.g. (:*:) Int Bool.

  • Fixities may be declared for type constructors, or classes, just as for data constructors. However, one cannot distinguish between the two in a fixity declaration; a fixity declaration sets the fixity for a data constructor and the corresponding type constructor. For example:

     infixl 7 T, :*:

    sets the fixity for both type constructor T and data constructor T, and similarly for :*:. Int `a` Bool.

  • Function arrow is infixr with fixity 0. (This might change; I'm not sure what it should be.)

7.4.4.Liberalised type synonyms

Type synonyms are like macros at the type level, but Haskell 98 imposes many ruleson individual synonym declarations.With the -XLiberalTypeSynonyms extension,GHC does validity checking on types only after expanding type synonyms.That means that GHC can be very much more liberal about type synonyms than Haskell 98.

  • You can write a forall (including overloading)in a type synonym, thus:

     type Discard a = forall b. Show b => a -> b -> (a, String) f :: Discard a f x y = (x, show y) g :: Discard Int -> (Int,String) -- A rank-2 type g f = f 3 True
  • If you also use -XUnboxedTuples, you can write an unboxed tuple in a type synonym:

     type Pr = (# Int, Int #) h :: Int -> Pr h x = (# x, x #)
  • You can apply a type synonym to a forall type:

     type Foo a = a -> a -> Bool f :: Foo (forall b. b->b)

    After expanding the synonym, f has the legal (in GHC) type:

     f :: (forall b. b->b) -> (forall b. b->b) -> Bool
  • You can apply a type synonym to a partially applied type synonym:

     type Generic i o = forall x. i x -> o x type Id x = x foo :: Generic Id []

    After expanding the synonym, foo has the legal (in GHC) type:

     foo :: forall x. x -> [x]

GHC currently does kind checking before expanding synonyms (though even thatcould be changed.)

After expanding type synonyms, GHC does validity checking on types, looking forthe following mal-formedness which isn't detected simply by kind checking:

  • Type constructor applied to a type involving for-alls.

  • Unboxed tuple on left of an arrow.

  • Partially-applied type synonym.

So, for example,this will be rejected:

 type Pr = (# Int, Int #) h :: Pr -> Int h x = ...

because GHC does not allow unboxed tuples on the left of a function arrow.

7.4.5.Existentially quantified data constructors

The idea of using existential quantification in data type declarationswas suggested by Perry, and implemented in Hope+ (Nigel Perry, The Implementationof Practical Functional Programming Languages, PhD Thesis, University ofLondon, 1991). It was later formalised by Laufer and Odersky(Polymorphic type inference and abstract data types,TOPLAS, 16(5), pp1411-1430, 1994).It's been in LennartAugustsson's hbc Haskell compiler for several years, andproved very useful. Here's the idea. Consider the declaration:

 data Foo = forall a. MkFoo a (a -> Bool) | Nil

The data type Foo has two constructors with types:

 MkFoo :: forall a. a -> (a -> Bool) -> Foo Nil :: Foo

Notice that the type variable a in the type of MkFoodoes not appear in the data type itself, which is plain Foo.For example, the following expression is fine:

 [MkFoo 3 even, MkFoo 'c' isUpper] :: [Foo]

Here, (MkFoo 3 even) packages an integer with a functioneven that maps an integer to Bool; and MkFoo 'c'isUpper packages a character with a compatible function. Thesetwo things are each of type Foo and can be put in a list.

What can we do with a value of type Foo?. In particular,what happens when we pattern-match on MkFoo?

 f (MkFoo val fn) = ???

Since all we know about val and fn is that theyare compatible, the only (useful) thing we can do with them is toapply fn to val to get a boolean. For example:

 f :: Foo -> Bool f (MkFoo val fn) = fn val

What this allows us to do is to package heterogeneous valuestogether with a bunch of functions that manipulate them, and then treatthat collection of packages in a uniform manner. You can expressquite a bit of object-oriented-like programming this way. existential?

What has this to do with existential quantification?Simply that MkFoo has the (nearly) isomorphic type

 MkFoo :: (exists a . (a, a -> Bool)) -> Foo

But Haskell programmers can safely think of the ordinaryuniversally quantified type given above, thereby avoidingadding a new existential quantification construct. and type classes

An easy extension is to allowarbitrary contexts before the constructor. For example:

data Baz = forall a. Eq a => Baz1 a a | forall b. Show b => Baz2 b (b -> b)

The two constructors have the types you'd expect:

Baz1 :: forall a. Eq a => a -> a -> BazBaz2 :: forall b. Show b => b -> (b -> b) -> Baz

But when pattern matching on Baz1 the matched values can be comparedfor equality, and when pattern matching on Baz2 the first matchedvalue can be converted to a string (as well as applying the function to it).So this program is legal:

 f :: Baz -> String f (Baz1 p q) | p == q = "Yes" | otherwise = "No" f (Baz2 v fn) = show (fn v)

Operationally, in a dictionary-passing implementation, theconstructors Baz1 and Baz2 must store thedictionaries for Eq and Show respectively, andextract it on pattern matching. Constructors

GHC allows existentials to be used with records syntax as well. For example:

data Counter a = forall self. NewCounter { _this :: self , _inc :: self -> self , _display :: self -> IO () , tag :: a }

Here tag is a public field, with a well-typed selectorfunction tag :: Counter a -> a. The selftype is hidden from the outside; any attempt to apply _this,_inc or _display as functions will raise acompile-time error. In other words, GHC defines a record selector functiononly for fields whose type does not mention the existentially-quantified variables.(This example used an underscore in the fields for which record selectorswill not be defined, but that is only programming style; GHC ignores them.)

To make use of these hidden fields, we need to create some helper functions:

inc :: Counter a -> Counter ainc (NewCounter x i d t) = NewCounter { _this = i x, _inc = i, _display = d, tag = t } display :: Counter a -> IO ()display NewCounter{ _this = x, _display = d } = d x

Now we can define counters with different underlying implementations:

counterA :: Counter String counterA = NewCounter { _this = 0, _inc = (1+), _display = print, tag = "A" }counterB :: Counter String counterB = NewCounter { _this = "", _inc = ('#':), _display = putStrLn, tag = "B" }main = do display (inc counterA) -- prints "1" display (inc (inc counterB)) -- prints "##"

Record update syntax is supported for existentials (and GADTs):

setTag :: Counter a -> a -> Counter asetTag obj t = obj{ tag = t }

The rule for record update is this: the types of the updated fields maymention only the universally-quantified type variablesof the data constructor. For GADTs, the field may mention only typesthat appear as a simple type-variable argument in the constructor's resulttype. For example:

data T a b where { T1 { f1::a, f2::b, f3::(b,c) } :: T a b } -- c is existentialupd1 t x = t { f1=x } -- OK: upd1 :: T a b -> a' -> T a' bupd2 t x = t { f3=x } -- BAD (f3's type mentions c, which is -- existentially quantified)data G a b where { G1 { g1::a, g2::c } :: G a [c] }upd3 g x = g { g1=x } -- OK: upd3 :: G a b -> c -> G c bupd4 g x = g { g2=x } -- BAD (f2's type mentions c, which is not a simple -- type-variable argument in G1's result type)

There are several restrictions on the ways in which existentially-quantifiedconstructors can be use.

  • When pattern matching, each pattern match introduces a new,distinct, type for each existential type variable. These types cannotbe unified with any other type, nor can they escape from the scope ofthe pattern match. For example, these fragments are incorrect:

    f1 (MkFoo a f) = a

    Here, the type bound by MkFoo "escapes", because ais the result of f1. One way to see why this is wrong is toask what type f1 has:

     f1 :: Foo -> a -- Weird!

    What is this "a" in the result type? Clearly we don't meanthis:

     f1 :: forall a. Foo -> a -- Wrong!

    The original program is just plain wrong. Here's another sort of error

     f2 (Baz1 a b) (Baz1 p q) = a==q

    It's ok to say a==b or p==q, buta==q is wrong because it equates the two distinct types arisingfrom the two Baz1 constructors.

  • You can't pattern-match on an existentially quantifiedconstructor in a let or where group ofbindings. So this is illegal:

     f3 x = a==b where { Baz1 a b = x }

    Instead, use a case expression:

     f3 x = case x of Baz1 a b -> a==b

    In general, you can only pattern-matchon an existentially-quantified constructor in a case expression orin the patterns of a function definition.The reason for this restriction is really an implementation one.Type-checking binding groups is already a nightmare withoutexistentials complicating the picture. Also an existential patternbinding at the top level of a module doesn't make sense, because it'snot clear how to prevent the existentially-quantified type "escaping".So for now, there's a simple-to-state restriction. We'll see howannoying it is.

  • You can't use existential quantification for newtypedeclarations. So this is illegal:

     newtype T = forall a. Ord a => MkT a

    Reason: a value of type T must be represented as apair of a dictionary for Ord t and a value of typet. That contradicts the idea thatnewtype should have no concrete representation.You can get just the same efficiency and effect by usingdata instead of newtype. Ifthere is no overloading involved, then there is more of a case forallowing an existentially-quantified newtype,because the data version does carry animplementation cost, but single-field existentially quantifiedconstructors aren't much use. So the simple restriction (noexistential stuff on newtype) stands, unless thereare convincing reasons to change it.

  • You can't use deriving to define instances of adata type with existentially quantified data constructors.Reason: in most cases it would not make sense. For example:;

    data T = forall a. MkT [a] deriving( Eq )

    To derive Eq in the standard way we would need to have equalitybetween the single component of two MkT constructors:

    instance Eq T where (MkT a) == (MkT b) = ???

    But a and b have distinct types, and so can't be compared.It's just about possible to imagine examples in which the derived instancewould make sense, but it seems altogether simpler simply to prohibit suchdeclarations. Define your own instances!

7.4.6.Declaring data types with explicit constructor signatures

GHC allows you to declare an algebraic data type by giving the type signatures of constructors explicitly. For example:

 data Maybe a where Nothing :: Maybe a Just :: a -> Maybe a

The form is called a "GADT-style declaration"because Generalised Algebraic Data Types, described in Section7.4.7, “Generalised Algebraic Data Types (GADTs)”, can only be declared using this form.

Notice that GADT-style syntax generalises existential types (Section7.4.5, “Existentially quantified data constructors”). For example, these two declarations are equivalent:

 data Foo = forall a. MkFoo a (a -> Bool) data Foo' where { MKFoo :: a -> (a->Bool) -> Foo' }

Any data type that can be declared in standard Haskell-98 syntax can also be declared using GADT-style syntax.The choice is largely stylistic, but GADT-style declarations differ in one important respect:they treat class constraints on the data constructors differently.Specifically, if the constructor is given a type-class context, thatcontext is made available by pattern matching. For example:

 data Set a where MkSet :: Eq a => [a] -> Set a makeSet :: Eq a => [a] -> Set a makeSet xs = MkSet (nub xs) insert :: a -> Set a -> Set a insert a (MkSet as) | a `elem` as = MkSet as | otherwise = MkSet (a:as)

A use of MkSet as a constructor (e.g. in the definition of makeSet) gives rise to a (Eq a)constraint, as you would expect. The new feature is that pattern-matching on MkSet(as in the definition of insert) makes available an (Eq a)context. In implementation terms, the MkSet constructor has a hidden field that storesthe (Eq a) dictionary that is passed to MkSet; sowhen pattern-matching that dictionary becomes available for the right-hand side of the match.In the example, the equality dictionary is used to satisfy the equality constraint generated by the call to elem, so that the type ofinsert itself has no Eq constraint.

For example, one possible application is to reify dictionaries:

 data NumInst a where MkNumInst :: Num a => NumInst a intInst :: NumInst Int intInst = MkNumInst plus :: NumInst a -> a -> a -> a plus MkNumInst p q = p + q

Here, a value of type NumInst a is equivalent to an explicit (Num a) dictionary.

All this applies to constructors declared using the syntax of Section7.4.5.2, “Existentials and type classes”.For example, the NumInst data type above could equivalently be declared like this:

 data NumInst a = Num a => MkNumInst (NumInst a)

Notice that, unlike the situation when declaring an existential, there is no forall, because the Num constrains thedata type's universally quantified type variable a. A constructor may have both universal and existential type variables: for example,the following two declarations are equivalent:

 data T1 a = forall b. (Num a, Eq b) => MkT1 a b data T2 a whereMkT2 :: (Num a, Eq b) => a -> b -> T2 a

All this behaviour contrasts with Haskell 98's peculiar treatment of contexts on a data type declaration (Section 4.2.1 of the Haskell 98 Report).In Haskell 98 the definition

 data Eq a => Set' a = MkSet' [a]

gives MkSet' the same type as MkSet above. But instead of making available an (Eq a) constraint, pattern-matchingon MkSet' requires an (Eq a) constraint!GHC faithfully implements this behaviour, odd though it is. But for GADT-style declarations,GHC's behaviour is much more useful, as well as much more intuitive.

The rest of this section gives further details about GADT-style datatype declarations.

  • The result type of each data constructor must begin with the type constructor being defined.If the result type of all constructors has the form T a1 ... an, where a1 ... anare distinct type variables, then the data type is ordinary;otherwise is a generalised data type (Section7.4.7, “Generalised Algebraic Data Types (GADTs)”).

  • As with other type signatures, you can give a single signature for several data constructors.In this example we give a single signature for T1 and T2:

     data T a where T1,T2 :: a -> T a T3 :: T a
  • The type signature ofeach constructor is independent, and is implicitly universally quantified as usual. In particular, the type variable(s) in the "data T a where" header have no scope, and different constructors may have different universally-quantified type variables:

     data T a where -- The 'a' has no scope T1,T2 :: b -> T b -- Means forall b. b -> T b T3 :: T a -- Means forall a. T a
  • A constructor signature may mention type class constraints, which can differ fordifferent constructors. For example, this is fine:

     data T a where T1 :: Eq b => b -> b -> T b T2 :: (Show c, Ix c) => c -> [c] -> T c

    When patten matching, these constraints are made available to discharge constraintsin the body of the match. For example:

     f :: T a -> String f (T1 x y) | x==y = "yes" | otherwise = "no" f (T2 a b) = show a

    Note that f is not overloaded; the Eq constraint arisingfrom the use of == is discharged by the pattern match on T1and similarly the Show constraint arising from the use of show.

  • Unlike a Haskell-98-style data type declaration, the type variable(s) in the "data Set a where" header have no scope. Indeed, one can write a kind signature instead:

     data Set :: * -> * where ...

    or even a mixture of the two:

     data Bar a :: (* -> *) -> * where ...

    The type variables (if given) may be explicitly kinded, so we could also write the header for Foolike this:

     data Bar a (b :: * -> *) where ...
  • You can use strictness annotations, in the obvious placesin the constructor type:

     data Term a where Lit :: !Int -> Term Int If :: Term Bool -> !(Term a) -> !(Term a) -> Term a Pair :: Term a -> Term b -> Term (a,b)
  • You can use a deriving clause on a GADT-style data typedeclaration. For example, these two declarations are equivalent

     data Maybe1 a where { Nothing1 :: Maybe1 a ; Just1 :: a -> Maybe1 a } deriving( Eq, Ord ) data Maybe2 a = Nothing2 | Just2 a deriving( Eq, Ord )
  • The type signature may have quantified type variables that do not appearin the result type:

     data Foo where MkFoo :: a -> (a->Bool) -> Foo Nil :: Foo

    Here the type variable a does not appear in the result typeof either constructor. Although it is universally quantified in the type of the constructor, sucha type variable is often called "existential". Indeed, the above declaration declares precisely the same type as the data Foo in Section7.4.5, “Existentially quantified data constructors”.

    The type may contain a class context too, of course:

     data Showable where MkShowable :: Show a => a -> Showable
  • You can use record syntax on a GADT-style data type declaration:

     data Person where Adult :: { name :: String, children :: [Person] } -> Person Child :: Show a => { name :: !String, funny :: a } -> Person

    As usual, for every constructor that has a field f, the type offield f must be the same (modulo alpha conversion).The Child constructor above shows that the signaturemay have a context, existentially-quantified variables, and strictness annotations, just as in the non-record case. (NB: the "type" that follows the double-colonis not really a type, because of the record syntax and strictness annotations.A "type" of this form can appear only in a constructor signature.)

  • Record updates are allowed with GADT-style declarations, only fields that have the following property: the type of the fieldmentions no existential type variables.

  • As in the case of existentials declared using the Haskell-98-like record syntax (Section7.4.5.3, “Record Constructors”),record-selector functions are generated only for those fields that have well-typedselectors. Here is the example of that section, in GADT-style syntax:

    data Counter a where NewCounter { _this :: self , _inc :: self -> self , _display :: self -> IO () , tag :: a } :: Counter a

    As before, only one selector function is generated here, that for tag.Nevertheless, you can still use all the field names in pattern matching and record construction.

7.4.7.Generalised Algebraic Data Types (GADTs)

Generalised Algebraic Data Types generalise ordinary algebraic data types by allowing constructors to have richer return types. Here is an example:

 data Term a where Lit :: Int -> Term Int Succ :: Term Int -> Term Int IsZero :: Term Int -> Term Bool If :: Term Bool -> Term a -> Term a -> Term a Pair :: Term a -> Term b -> Term (a,b)

Notice that the return type of the constructors is not always Term a, as is thecase with ordinary data types. This generality allows us to write a well-typed eval functionfor these Terms:

 eval :: Term a -> a eval (Lit i) = i eval (Succ t) = 1 + eval t eval (IsZero t) = eval t == 0 eval (If b e1 e2) = if eval b then eval e1 else eval e2 eval (Pair e1 e2) = (eval e1, eval e2)

The key point about GADTs is that pattern matching causes type refinement. For example, in the right hand side of the equation

 eval :: Term a -> a eval (Lit i) = ...

the type a is refined to Int. That's the whole point!A precise specification of the type rules is beyond what this user manual aspires to, but the design closely follows that described inthe paper Simpleunification-based type inference for GADTs,(ICFP 2006).The general principle is this: type refinement is only carried out based on user-supplied type annotations.So if no type signature is supplied for eval, no type refinement happens, and lots of obscure error messages willoccur. However, the refinement is quite general. For example, if we had:

 eval :: Term a -> a -> a eval (Lit i) j = i+j

the pattern match causes the type a to be refined to Int (because of the typeof the constructor Lit), and that refinement also applies to the type of j, andthe result type of the case expression. Hence the addition i+j is legal.

These and many other examples are given in papers by Hongwei Xi, andTim Sheard. There is a longer introductionon the wiki,and Ralf Hinze'sFun with phantom types also has a number of examples. Note that papersmay use different notation to that implemented in GHC.

The rest of this section outlines the extensions to GHC that support GADTs. The extension is enabled with -XGADTs. The -XGADTs flag also sets -XRelaxedPolyRec.

  • A GADT can only be declared using GADT-style syntax (Section7.4.6, “Declaring data types with explicit constructor signatures”); the old Haskell-98 syntax for data declarations always declares an ordinary data type.The result type of each constructor must begin with the type constructor being defined,but for a GADT the arguments to the type constructor can be arbitrary monotypes. For example, in the Term datatype above, the type of each constructor must end with Term ty, butthe ty need not be a type variable (e.g. the Litconstructor).

  • It is permitted to declare an ordinary algebraic data type using GADT-style syntax.What makes a GADT into a GADT is not the syntax, but rather the presence of data constructorswhose result type is not just T a b.

  • You cannot use a deriving clause for a GADT; only foran ordinary data type.

  • As mentioned in Section7.4.6, “Declaring data types with explicit constructor signatures”, record syntax is supported.For example:

     data Term a where Lit { val :: Int } :: Term Int Succ { num :: Term Int } :: Term Int Pred { num :: Term Int } :: Term Int IsZero { arg :: Term Int } :: Term Bool Pair { arg1 :: Term a , arg2 :: Term b } :: Term (a,b) If { cnd :: Term Bool , tru :: Term a , fls :: Term a } :: Term a

    However, for GADTs there is the following additional constraint: every constructor that has a field f must havethe same result type (modulo alpha conversion)Hence, in the above example, we cannot merge the num and arg fields above into a single name. Although their field types are both Term Int,their selector functions actually have different types:

     num :: Term Int -> Term Int arg :: Term Bool -> Term Int
  • When pattern-matching against data constructors drawn from a GADT, for example in a case expression, the following rules apply:

    • The type of the scrutinee must be rigid.

    • The type of the entire case expression must be rigid.

    • The type of any free variable mentioned in any ofthe case alternatives must be rigid.

    A type is "rigid" if it is completely known to the compiler at its binding site. The easiestway to ensure that a variable a rigid type is to give it a type signature.For more precise details see Simple unification-based type inference for GADTs. The criteria implemented by GHC are given in the Appendix.

Top Articles

You might also like

Latest Posts

Article information

Author: Rueben Jacobs

Last Updated: 09/25/2022

Views: 5561

Rating: 4.7 / 5 (77 voted)

Reviews: 84% of readers found this page helpful

Author information

Name: Rueben Jacobs

Birthday: 1999-03-14

Address: 951 Caterina Walk, Schambergerside, CA 67667-0896

Phone: +6881806848632

Job: Internal Education Planner

Hobby: Candle making, Cabaret, Poi, Gambling, Rock climbing, Wood carving, Computer programming

Introduction: My name is Rueben Jacobs, I am a cooperative, beautiful, kind, comfortable, glamorous, open, magnificent person who loves writing and wants to share my knowledge and understanding with you.