With the `-fglasgow-exts`

flag, GHC lets you declarea data type with no constructors. For example:

data S -- S :: * data T a -- T :: * -> *

Syntactically, the declaration lacks the "= constrs" part. The type can be parameterised over types of any kind, but if the kind isnot `*`

then an explicit kind annotation must be used(see Section7.8.4, “Explicitly-kinded quantification”).

Such data types have only one value, namely bottom.Nevertheless, they can be useful when defining "phantom types".

Haskell allows datatypes to be given contexts, e.g.

data Eq a => Set a = NilSet | ConsSet a (Set a)

give constructors with types:

NilSet :: Set aConsSet :: Eq a => a -> Set a -> Set a

In GHC this feature is an extension called`DatatypeContexts`

, and on by default.

GHC allows type constructors, classes, and type variables to be operators, andto be written infix, very much like expressions. More specifically:

A type constructor or class can be an operator, beginning with a colon; e.g.

`:*:`

. The lexical syntax is the same as that for data constructors.Data type and type-synonym declarations can be written infix, parenthesised if you want further arguments. E.g.

data a :*: b = Foo a b type a :+: b = Either a b class a :=: b where ... data (a :**: b) x = Baz a b x type (a :++: b) y = Either (a,b) y

Types, and class constraints, can be written infix. For example

x :: Int :*: Bool f :: (a :=: b) => a -> b

A type variable can be an (unqualified) operator e.g.

`+`

. The lexical syntax is the same as that for variable operators, excluding "(.)", "(!)", and "(*)". In a binding position, the operator must be parenthesised. For example:type T (+) = Int + Int f :: T Either f = Left 3 liftA2 :: Arrow (~>) => (a -> b -> c) -> (e ~> a) -> (e ~> b) -> (e ~> c) liftA2 = ...

Back-quotes work as for expressions, both for type constructors and type variables; e.g.

`Int `Either` Bool`

, or`Int `a` Bool`

. Similarly, parentheses work the same; e.g.`(:*:) Int Bool`

.Fixities may be declared for type constructors, or classes, just as for data constructors. However, one cannot distinguish between the two in a fixity declaration; a fixity declaration sets the fixity for a data constructor and the corresponding type constructor. For example:

infixl 7 T, :*:

sets the fixity for both type constructor

`T`

and data constructor`T`

, and similarly for`:*:`

.`Int `a` Bool`

.Function arrow is

`infixr`

with fixity 0. (This might change; I'm not sure what it should be.)

Type synonyms are like macros at the type level, but Haskell 98 imposes many ruleson individual synonym declarations.With the `-XLiberalTypeSynonyms`

extension,GHC does validity checking on types *only after expanding type synonyms*.That means that GHC can be very much more liberal about type synonyms than Haskell 98.

You can write a

`forall`

(including overloading)in a type synonym, thus:type Discard a = forall b. Show b => a -> b -> (a, String) f :: Discard a f x y = (x, show y) g :: Discard Int -> (Int,String) -- A rank-2 type g f = f 3 True

If you also use

`-XUnboxedTuples`

, you can write an unboxed tuple in a type synonym:type Pr = (# Int, Int #) h :: Int -> Pr h x = (# x, x #)

You can apply a type synonym to a forall type:

type Foo a = a -> a -> Bool f :: Foo (forall b. b->b)

After expanding the synonym,

`f`

has the legal (in GHC) type:f :: (forall b. b->b) -> (forall b. b->b) -> Bool

You can apply a type synonym to a partially applied type synonym:

type Generic i o = forall x. i x -> o x type Id x = x foo :: Generic Id []

After expanding the synonym,

`foo`

has the legal (in GHC) type:foo :: forall x. x -> [x]

GHC currently does kind checking before expanding synonyms (though even thatcould be changed.)

After expanding type synonyms, GHC does validity checking on types, looking forthe following mal-formedness which isn't detected simply by kind checking:

Type constructor applied to a type involving for-alls.

Unboxed tuple on left of an arrow.

Partially-applied type synonym.

So, for example,this will be rejected:

type Pr = (# Int, Int #) h :: Pr -> Int h x = ...

because GHC does not allow unboxed tuples on the left of a function arrow.

The idea of using existential quantification in data type declarationswas suggested by Perry, and implemented in Hope+ (Nigel Perry, *The Implementationof Practical Functional Programming Languages*, PhD Thesis, University ofLondon, 1991). It was later formalised by Laufer and Odersky(*Polymorphic type inference and abstract data types*,TOPLAS, 16(5), pp1411-1430, 1994).It's been in LennartAugustsson's **hbc** Haskell compiler for several years, andproved very useful. Here's the idea. Consider the declaration:

data Foo = forall a. MkFoo a (a -> Bool) | Nil

The data type `Foo`

has two constructors with types:

MkFoo :: forall a. a -> (a -> Bool) -> Foo Nil :: Foo

Notice that the type variable `a`

in the type of `MkFoo`

does not appear in the data type itself, which is plain `Foo`

.For example, the following expression is fine:

[MkFoo 3 even, MkFoo 'c' isUpper] :: [Foo]

Here, `(MkFoo 3 even)`

packages an integer with a function`even`

that maps an integer to `Bool`

; and `MkFoo 'c'isUpper`

packages a character with a compatible function. Thesetwo things are each of type `Foo`

and can be put in a list.

What can we do with a value of type `Foo`

?. In particular,what happens when we pattern-match on `MkFoo`

?

f (MkFoo val fn) = ???

Since all we know about `val`

and `fn`

is that theyare compatible, the only (useful) thing we can do with them is toapply `fn`

to `val`

to get a boolean. For example:

f :: Foo -> Bool f (MkFoo val fn) = fn val

What this allows us to do is to package heterogeneous valuestogether with a bunch of functions that manipulate them, and then treatthat collection of packages in a uniform manner. You can expressquite a bit of object-oriented-like programming this way.

What has this to do with *existential* quantification?Simply that `MkFoo`

has the (nearly) isomorphic type

MkFoo :: (exists a . (a, a -> Bool)) -> Foo

But Haskell programmers can safely think of the ordinary*universally* quantified type given above, thereby avoidingadding a new existential quantification construct.

An easy extension is to allowarbitrary contexts before the constructor. For example:

data Baz = forall a. Eq a => Baz1 a a | forall b. Show b => Baz2 b (b -> b)

The two constructors have the types you'd expect:

Baz1 :: forall a. Eq a => a -> a -> BazBaz2 :: forall b. Show b => b -> (b -> b) -> Baz

But when pattern matching on `Baz1`

the matched values can be comparedfor equality, and when pattern matching on `Baz2`

the first matchedvalue can be converted to a string (as well as applying the function to it).So this program is legal:

f :: Baz -> String f (Baz1 p q) | p == q = "Yes" | otherwise = "No" f (Baz2 v fn) = show (fn v)

Operationally, in a dictionary-passing implementation, theconstructors `Baz1`

and `Baz2`

must store thedictionaries for `Eq`

and `Show`

respectively, andextract it on pattern matching.

GHC allows existentials to be used with records syntax as well. For example:

data Counter a = forall self. NewCounter { _this :: self , _inc :: self -> self , _display :: self -> IO () , tag :: a }

Here `tag`

is a public field, with a well-typed selectorfunction `tag :: Counter a -> a`

. The `self`

type is hidden from the outside; any attempt to apply `_this`

,`_inc`

or `_display`

as functions will raise acompile-time error. In other words, *GHC defines a record selector functiononly for fields whose type does not mention the existentially-quantified variables*.(This example used an underscore in the fields for which record selectorswill not be defined, but that is only programming style; GHC ignores them.)

To make use of these hidden fields, we need to create some helper functions:

inc :: Counter a -> Counter ainc (NewCounter x i d t) = NewCounter { _this = i x, _inc = i, _display = d, tag = t } display :: Counter a -> IO ()display NewCounter{ _this = x, _display = d } = d x

Now we can define counters with different underlying implementations:

counterA :: Counter String counterA = NewCounter { _this = 0, _inc = (1+), _display = print, tag = "A" }counterB :: Counter String counterB = NewCounter { _this = "", _inc = ('#':), _display = putStrLn, tag = "B" }main = do display (inc counterA) -- prints "1" display (inc (inc counterB)) -- prints "##"

Record update syntax is supported for existentials (and GADTs):

setTag :: Counter a -> a -> Counter asetTag obj t = obj{ tag = t }

The rule for record update is this: *the types of the updated fields maymention only the universally-quantified type variablesof the data constructor. For GADTs, the field may mention only typesthat appear as a simple type-variable argument in the constructor's resulttype*. For example:

data T a b where { T1 { f1::a, f2::b, f3::(b,c) } :: T a b } -- c is existentialupd1 t x = t { f1=x } -- OK: upd1 :: T a b -> a' -> T a' bupd2 t x = t { f3=x } -- BAD (f3's type mentions c, which is -- existentially quantified)data G a b where { G1 { g1::a, g2::c } :: G a [c] }upd3 g x = g { g1=x } -- OK: upd3 :: G a b -> c -> G c bupd4 g x = g { g2=x } -- BAD (f2's type mentions c, which is not a simple -- type-variable argument in G1's result type)

There are several restrictions on the ways in which existentially-quantifiedconstructors can be use.

When pattern matching, each pattern match introduces a new,distinct, type for each existential type variable. These types cannotbe unified with any other type, nor can they escape from the scope ofthe pattern match. For example, these fragments are incorrect:

f1 (MkFoo a f) = a

Here, the type bound by

`MkFoo`

"escapes", because`a`

is the result of`f1`

. One way to see why this is wrong is toask what type`f1`

has:f1 :: Foo -> a -- Weird!

What is this "

`a`

" in the result type? Clearly we don't meanthis:f1 :: forall a. Foo -> a -- Wrong!

The original program is just plain wrong. Here's another sort of error

f2 (Baz1 a b) (Baz1 p q) = a==q

It's ok to say

`a==b`

or`p==q`

, but`a==q`

is wrong because it equates the two distinct types arisingfrom the two`Baz1`

constructors.You can't pattern-match on an existentially quantifiedconstructor in a

`let`

or`where`

group ofbindings. So this is illegal:f3 x = a==b where { Baz1 a b = x }

Instead, use a

`case`

expression:f3 x = case x of Baz1 a b -> a==b

In general, you can only pattern-matchon an existentially-quantified constructor in a

`case`

expression orin the patterns of a function definition.The reason for this restriction is really an implementation one.Type-checking binding groups is already a nightmare withoutexistentials complicating the picture. Also an existential patternbinding at the top level of a module doesn't make sense, because it'snot clear how to prevent the existentially-quantified type "escaping".So for now, there's a simple-to-state restriction. We'll see howannoying it is.You can't use existential quantification for

`newtype`

declarations. So this is illegal:newtype T = forall a. Ord a => MkT a

Reason: a value of type

`T`

must be represented as apair of a dictionary for`Ord t`

and a value of type`t`

. That contradicts the idea that`newtype`

should have no concrete representation.You can get just the same efficiency and effect by using`data`

instead of`newtype`

. Ifthere is no overloading involved, then there is more of a case forallowing an existentially-quantified`newtype`

,because the`data`

version does carry animplementation cost, but single-field existentially quantifiedconstructors aren't much use. So the simple restriction (noexistential stuff on`newtype`

) stands, unless thereare convincing reasons to change it.You can't use

`deriving`

to define instances of adata type with existentially quantified data constructors.Reason: in most cases it would not make sense. For example:;data T = forall a. MkT [a] deriving( Eq )

To derive

`Eq`

in the standard way we would need to have equalitybetween the single component of two`MkT`

constructors:instance Eq T where (MkT a) == (MkT b) = ???

But

`a`

and`b`

have distinct types, and so can't be compared.It's just about possible to imagine examples in which the derived instancewould make sense, but it seems altogether simpler simply to prohibit suchdeclarations. Define your own instances!

GHC allows you to declare an algebraic data type by giving the type signatures of constructors explicitly. For example:

data Maybe a where Nothing :: Maybe a Just :: a -> Maybe a

The form is called a "GADT-style declaration"because Generalised Algebraic Data Types, described in Section7.4.7, “Generalised Algebraic Data Types (GADTs)”, can only be declared using this form.

Notice that GADT-style syntax generalises existential types (Section7.4.5, “Existentially quantified data constructors”). For example, these two declarations are equivalent:

data Foo = forall a. MkFoo a (a -> Bool) data Foo' where { MKFoo :: a -> (a->Bool) -> Foo' }

Any data type that can be declared in standard Haskell-98 syntax can also be declared using GADT-style syntax.The choice is largely stylistic, but GADT-style declarations differ in one important respect:they treat class constraints on the data constructors differently.Specifically, if the constructor is given a type-class context, thatcontext is made available by pattern matching. For example:

data Set a where MkSet :: Eq a => [a] -> Set a makeSet :: Eq a => [a] -> Set a makeSet xs = MkSet (nub xs) insert :: a -> Set a -> Set a insert a (MkSet as) | a `elem` as = MkSet as | otherwise = MkSet (a:as)

A use of `MkSet`

as a constructor (e.g. in the definition of `makeSet`

) gives rise to a `(Eq a)`

constraint, as you would expect. The new feature is that pattern-matching on `MkSet`

(as in the definition of `insert`

) makes *available* an `(Eq a)`

context. In implementation terms, the `MkSet`

constructor has a hidden field that storesthe `(Eq a)`

dictionary that is passed to `MkSet`

; sowhen pattern-matching that dictionary becomes available for the right-hand side of the match.In the example, the equality dictionary is used to satisfy the equality constraint generated by the call to `elem`

, so that the type of`insert`

itself has no `Eq`

constraint.

For example, one possible application is to reify dictionaries:

data NumInst a where MkNumInst :: Num a => NumInst a intInst :: NumInst Int intInst = MkNumInst plus :: NumInst a -> a -> a -> a plus MkNumInst p q = p + q

Here, a value of type `NumInst a`

is equivalent to an explicit `(Num a)`

dictionary.

All this applies to constructors declared using the syntax of Section7.4.5.2, “Existentials and type classes”.For example, the `NumInst`

data type above could equivalently be declared like this:

data NumInst a = Num a => MkNumInst (NumInst a)

Notice that, unlike the situation when declaring an existential, there is no `forall`

, because the `Num`

constrains thedata type's universally quantified type variable `a`

. A constructor may have both universal and existential type variables: for example,the following two declarations are equivalent:

data T1 a = forall b. (Num a, Eq b) => MkT1 a b data T2 a whereMkT2 :: (Num a, Eq b) => a -> b -> T2 a

All this behaviour contrasts with Haskell 98's peculiar treatment of contexts on a data type declaration (Section 4.2.1 of the Haskell 98 Report).In Haskell 98 the definition

data Eq a => Set' a = MkSet' [a]

gives `MkSet'`

the same type as `MkSet`

above. But instead of *making available* an `(Eq a)`

constraint, pattern-matchingon `MkSet'`

*requires* an `(Eq a)`

constraint!GHC faithfully implements this behaviour, odd though it is. But for GADT-style declarations,GHC's behaviour is much more useful, as well as much more intuitive.

The rest of this section gives further details about GADT-style datatype declarations.

The result type of each data constructor must begin with the type constructor being defined.If the result type of all constructors has the form

`T a1 ... an`

, where`a1 ... an`

are distinct type variables, then the data type is*ordinary*;otherwise is a*generalised*data type (Section7.4.7, “Generalised Algebraic Data Types (GADTs)”).As with other type signatures, you can give a single signature for several data constructors.In this example we give a single signature for

`T1`

and`T2`

:data T a where T1,T2 :: a -> T a T3 :: T a

The type signature ofeach constructor is independent, and is implicitly universally quantified as usual. In particular, the type variable(s) in the "

`data T a where`

" header have no scope, and different constructors may have different universally-quantified type variables:data T a where -- The 'a' has no scope T1,T2 :: b -> T b -- Means forall b. b -> T b T3 :: T a -- Means forall a. T a

A constructor signature may mention type class constraints, which can differ fordifferent constructors. For example, this is fine:

data T a where T1 :: Eq b => b -> b -> T b T2 :: (Show c, Ix c) => c -> [c] -> T c

When patten matching, these constraints are made available to discharge constraintsin the body of the match. For example:

f :: T a -> String f (T1 x y) | x==y = "yes" | otherwise = "no" f (T2 a b) = show a

Note that

`f`

is not overloaded; the`Eq`

constraint arisingfrom the use of`==`

is discharged by the pattern match on`T1`

and similarly the`Show`

constraint arising from the use of`show`

.Unlike a Haskell-98-style data type declaration, the type variable(s) in the "

`data Set a where`

" header have no scope. Indeed, one can write a kind signature instead:data Set :: * -> * where ...

or even a mixture of the two:

data Bar a :: (* -> *) -> * where ...

The type variables (if given) may be explicitly kinded, so we could also write the header for

`Foo`

like this:data Bar a (b :: * -> *) where ...

You can use strictness annotations, in the obvious placesin the constructor type:

data Term a where Lit :: !Int -> Term Int If :: Term Bool -> !(Term a) -> !(Term a) -> Term a Pair :: Term a -> Term b -> Term (a,b)

You can use a

`deriving`

clause on a GADT-style data typedeclaration. For example, these two declarations are equivalentdata Maybe1 a where { Nothing1 :: Maybe1 a ; Just1 :: a -> Maybe1 a } deriving( Eq, Ord ) data Maybe2 a = Nothing2 | Just2 a deriving( Eq, Ord )

The type signature may have quantified type variables that do not appearin the result type:

data Foo where MkFoo :: a -> (a->Bool) -> Foo Nil :: Foo

Here the type variable

`a`

does not appear in the result typeof either constructor. Although it is universally quantified in the type of the constructor, sucha type variable is often called "existential". Indeed, the above declaration declares precisely the same type as the`data Foo`

in Section7.4.5, “Existentially quantified data constructors”.The type may contain a class context too, of course:

data Showable where MkShowable :: Show a => a -> Showable

You can use record syntax on a GADT-style data type declaration:

data Person where Adult :: { name :: String, children :: [Person] } -> Person Child :: Show a => { name :: !String, funny :: a } -> Person

As usual, for every constructor that has a field

`f`

, the type offield`f`

must be the same (modulo alpha conversion).The`Child`

constructor above shows that the signaturemay have a context, existentially-quantified variables, and strictness annotations, just as in the non-record case. (NB: the "type" that follows the double-colonis not really a type, because of the record syntax and strictness annotations.A "type" of this form can appear only in a constructor signature.)Record updates are allowed with GADT-style declarations, only fields that have the following property: the type of the fieldmentions no existential type variables.

As in the case of existentials declared using the Haskell-98-like record syntax (Section7.4.5.3, “Record Constructors”),record-selector functions are generated only for those fields that have well-typedselectors. Here is the example of that section, in GADT-style syntax:

data Counter a where NewCounter { _this :: self , _inc :: self -> self , _display :: self -> IO () , tag :: a } :: Counter a

As before, only one selector function is generated here, that for

`tag`

.Nevertheless, you can still use all the field names in pattern matching and record construction.

Generalised Algebraic Data Types generalise ordinary algebraic data types by allowing constructors to have richer return types. Here is an example:

data Term a where Lit :: Int -> Term Int Succ :: Term Int -> Term Int IsZero :: Term Int -> Term Bool If :: Term Bool -> Term a -> Term a -> Term a Pair :: Term a -> Term b -> Term (a,b)

Notice that the return type of the constructors is not always `Term a`

, as is thecase with ordinary data types. This generality allows us to write a well-typed `eval`

functionfor these `Terms`

:

eval :: Term a -> a eval (Lit i) = i eval (Succ t) = 1 + eval t eval (IsZero t) = eval t == 0 eval (If b e1 e2) = if eval b then eval e1 else eval e2 eval (Pair e1 e2) = (eval e1, eval e2)

The key point about GADTs is that *pattern matching causes type refinement*. For example, in the right hand side of the equation

eval :: Term a -> a eval (Lit i) = ...

the type `a`

is refined to `Int`

. That's the whole point!A precise specification of the type rules is beyond what this user manual aspires to, but the design closely follows that described inthe paper Simpleunification-based type inference for GADTs,(ICFP 2006).The general principle is this: *type refinement is only carried out based on user-supplied type annotations*.So if no type signature is supplied for `eval`

, no type refinement happens, and lots of obscure error messages willoccur. However, the refinement is quite general. For example, if we had:

eval :: Term a -> a -> a eval (Lit i) j = i+j

the pattern match causes the type `a`

to be refined to `Int`

(because of the typeof the constructor `Lit`

), and that refinement also applies to the type of `j`

, andthe result type of the `case`

expression. Hence the addition `i+j`

is legal.

These and many other examples are given in papers by Hongwei Xi, andTim Sheard. There is a longer introductionon the wiki,and Ralf Hinze'sFun with phantom types also has a number of examples. Note that papersmay use different notation to that implemented in GHC.

The rest of this section outlines the extensions to GHC that support GADTs. The extension is enabled with `-XGADTs`

. The `-XGADTs`

flag also sets `-XRelaxedPolyRec`

.

A GADT can only be declared using GADT-style syntax (Section7.4.6, “Declaring data types with explicit constructor signatures”); the old Haskell-98 syntax for data declarations always declares an ordinary data type.The result type of each constructor must begin with the type constructor being defined,but for a GADT the arguments to the type constructor can be arbitrary monotypes. For example, in the

`Term`

datatype above, the type of each constructor must end with`Term ty`

, butthe`ty`

need not be a type variable (e.g. the`Lit`

constructor).It is permitted to declare an ordinary algebraic data type using GADT-style syntax.What makes a GADT into a GADT is not the syntax, but rather the presence of data constructorswhose result type is not just

`T a b`

.You cannot use a

`deriving`

clause for a GADT; only foran ordinary data type.As mentioned in Section7.4.6, “Declaring data types with explicit constructor signatures”, record syntax is supported.For example:

data Term a where Lit { val :: Int } :: Term Int Succ { num :: Term Int } :: Term Int Pred { num :: Term Int } :: Term Int IsZero { arg :: Term Int } :: Term Bool Pair { arg1 :: Term a , arg2 :: Term b } :: Term (a,b) If { cnd :: Term Bool , tru :: Term a , fls :: Term a } :: Term a

However, for GADTs there is the following additional constraint: every constructor that has a field

`f`

must havethe same result type (modulo alpha conversion)Hence, in the above example, we cannot merge the`num`

and`arg`

fields above into a single name. Although their field types are both`Term Int`

,their selector functions actually have different types:num :: Term Int -> Term Int arg :: Term Bool -> Term Int

When pattern-matching against data constructors drawn from a GADT, for example in a

`case`

expression, the following rules apply:The type of the scrutinee must be rigid.

The type of the entire

`case`

expression must be rigid.The type of any free variable mentioned in any ofthe

`case`

alternatives must be rigid.

A type is "rigid" if it is completely known to the compiler at its binding site. The easiestway to ensure that a variable a rigid type is to give it a type signature.For more precise details see Simple unification-based type inference for GADTs. The criteria implemented by GHC are given in the Appendix.