(Not updated very recently)
When programming in other languages, assignment (a.k.a. mutation) can be used to achieve several programming goals, such as performance, convenience, and sloppiness. Consider how far we’ve come without the need to rely on mutation.
Haskell does, however, provide mechanisms for mutation, albeit in a manner distinct from most other languages. In a nutshell, Haskell tracks the “extent” or “scope” of mutable references, using the type system to ensure that they do not “escape” their “computational threads.” As we will see, this allows mutation to be used locally without allowing them to leak outside of the kinds of nice, pure functional boundaries we have grown to know and love.
As a very simple example, implement Fibonacci naively:
fib_slow :: Integer -> Integer
fib_slow 1 = 1
fib_slow 2 = 1
fib_slow n = fib_slow (n-1) + fib_slow (n-2)
> map fib_slow [1..30] -- slow
> map fib_slow [1..100] -- hopeless
A much more efficient version keeps track of previous two answers:
fib_fast :: Integer -> Integer
fib_fast 1 = 1
fib_fast 2 = 1
fib_fast n = foo 1 1 (n-2) where
foo i _ 0 = i
foo i j m = foo (i+j) i (m-1)
> map fib_fast [1..100]
> map fib_fast [1..1000]
Even though we don’t need to, for sloppiness, let’s implement this using mutable variables (like we probably would in C or Java).
There are two components to mutable references in Haskell:
A “thread” of computation, denoted by
Control.Monad.ST s a
, where s
is the
(abstract) “id” of a thread and a
is the type of value
produced; and
A reference cell of type Data.STRef s a
points to a
memory cell that stores a value of type a
and
s
is the thread id.
Might help to remember more informative type variable names:
ST id a
and STRef id a
.
Unlike the “stateful functions”
Control.Monad.State s a ~= s -> (a, s)
we studied
before, this “state monad” really is stateful.
ST s
is a Monad
, so we can sequence
ST s a
actions.
fib_impure :: Integer -> Integer
fib_impure 1 = 1
fib_impure 2 = 1
fib_impure n = runST $ do
x <- newSTRef 1 -- every cell initialized to some value
y <- newSTRef 1
replicateM_ (n-2) $ do -- review Control.Monad!
i <- readSTRef x
j <- readSTRef y
writeSTRef x j
writeSTRef y (i+j)
readSTRef y
> map fib_impure [1..30]
> map fib_impure [1..300]
> map fib_impure [1..300] == map fib_fast [1..300]
Okay, so how do these functions work? Notice that
{new,read,write}STRef
return values “in the ST
monad.”
newSTRef :: a -> ST s (STRef s a)
readSTRef :: STRef s a -> ST s a
writeSTRef :: STRef s a -> a -> ST s ()
runST :: (forall s. ST s a) -> a
Whoa, explicit forall s. ...
. How is this type different
than usual?
All of the polymorphic functions we’ve seen so far are rank-1, or prenex, polymorphic: all of the forall quantifiers are (implicitly) written at the outermost position.
As an aside, there is an option that allows writing the quantifiers explicitly:
> :set -XExplicitForAll
> :t id :: forall a. a -> a
> :t id :: forall b. b -> b
By restricting to prenex polymorphism, the type checker is able to automatically infer polymorphic types and and type applications completely automatically for a (perhaps surprisingly-) large class of functions. But there are other functions where prenex polymorphism is not enough.
For example:
-- foo :: ([a] -> b) -> Bool -> b
-- foo :: forall a. forall b. ([a] -> b) -> Bool -> b
foo :: forall a b. ([a] -> b) -> Bool -> b
foo f True = f "abcd"
foo f False = f [1,2]
This does not type check: foo
cannot be assigned the
declared prenex polymorphic type below, because foo
uses
f
at type [Char] -> b
(in the first call)
and at type Num n => [n] -> b
(in the second call).
f
only has one type, and it is not either of these.
If we move the a
quantifier inside, however, it
typechecks! (Writing forall
s anywhere but all the way to
the left in a type requires the flag RankNTypes
.).
foo :: forall b. (forall a. [a] -> b) -> Bool -> b
By default, all quantifiers are at “the top-level” of a type cannot appear to the left on arrow. “Higher-rank” types allow quantifiers to be nested inside, to the left of arrows
all all
/ \ / \
a all b ------>
/ \ / \
b ------> all ->
/ \ / \ / \
-> -> a -> Bool b
/ \ / \ / \
[a] b Bool b [a] b
We can call foo
with any functions of type
(forall a. [a] -> b)
. For example:
> foo length True
> foo length False
> foo (const 0) True
> foo (const 0) False
A prenex forall
allows (requires) the caller to choose
the type arguments, so language/compiler cannot make any assumptions
about choice.
But a forall
to the left of an arrow prevents
the caller from making any assumptions about the type variable. Used to
encode encapsulation (e.g. objects). Allows compiler to make decisions
about representation.
ST
The higher-rank polymorphic type
runST :: (forall s. ST s a) -> a
prevents output from depending at all on the thread; it has
to work for all thread “ids” s
.
This approach allows us to “locally” use mutation. But only pure values at our function boundaries.
For example:
bad :: Bool
bad = runST $ do
x <- newSTRef True
pure $ runST $ do
writeSTRef x False -- inner thread can't access vars from outer
readSTRef x
Even the following fails “because type variable ‘s
’
would escape its scope” (akin to something like
“x :: exists s. STRef s Bool
”).
bad' :: ()
bad' =
let x = runST $ newSTRef True in
()
Can define aliases:
aliasEx = runST $ do
x <- newSTRef True
let y = x
writeSTRef y False
readSTRef x
References prevented from “escaping” scope. So can use references internally (e.g. within a function), but must be pure at boundaries. Best of both worlds! Haskell can enforce this, whereas most type systems cannot either (a) cannot prevent references from crossing boundaries (b) require different types depending on implementation.
“Impure” features do exist in Haskell, but they are to be used with restraint and caution:
ST
(State Transformer / State Thread) monads and
STRef
IO
(and see unsafe IO
functions)IORef
, in addition to mutation, allows exceptions and
I/O. But cannot (safely) escape IO
like you can
ST
.Control.Concurrent.MVars