Not all computational effects are monads
- 17 November 2008
Lately I've been thinking about computational effects in general, i.e., what is the structure of the “space of all computational effects”. We know very little about this topic. I am not even sure we know what “the space of all computational effects” is. Because Haskell is very popular and in Haskell computational effects are expressed as monads, many people might think that I am talking about the space of all monads. But I am not, and in this post I show two computational effects which are not of the usual Haskell monad kind. They should present a nice programming challenge to Haskell fans.
Computational effects are things like non-termination, exceptions, state, input/output, continuations, non-determinism, and others. In Haskell these are represented as monads, and in fact there is a whole industry of Haskell monads. But how about the following two examples?
First we fix a particular operational semantics for (pure) Haskell so that the order of execution is determined (of course, from within pure Haskell we cannot detect the order). We now adjoin to Haskell a new constant
catch :: ((a -> b) -> c) -> Maybe a
which allows us to detect the fact that a function is evaluated at an argument. More precisely, given any
f :: (a -> b) -> c
catch f inspects evaluation of
f g where the argument
g :: a -> b is specially crafted so that:
fattempts to evaluate
gat an argument
catch fevaluates to
- otherwise, if
fevaluates to a result without ever evaluating
gat any argument then
catch fevaluates to
fdiverges without evaluating
catch fdiverges, also.
catch (\g -> 42)evaluates to
catch (\g -> \x -> g (g x))evaluates to
catch (\g -> g 1 + g 2)evaluates to
Just 1, assuming that + is evaluated from left to right.
With access to the underlying Haskell compiler or interpreter we could implement
catch. We can also simulate
catch within Haskell using exceptions. The idea is quite simple:
catch f passes to
f a function which raises an exception when it is evaluated, then
catch intercepts the exception. Like this:
module Katch where
-- In order to avoid conflicts with Haskell's Prelude.catch we -- call our function "katch"
-- To keep things self-contained, we define our own mini exception monad
data Exception e a = Success a | Failure e deriving Show
instance Monad (Exception e) where return = Success (Failure x) >>= _ = Failure x (Success x) >>= f = f x
throw :: e -> Exception e a throw = Failure
intercept :: Exception e a -> (e -> a) -> a intercept (Success x) _ = x intercept (Failure x) h = h x
-- Now we may simulate catch by throwing an exception and intercepting it. -- Of course, the type of katsch reflects the fact that exceptions are used -- under the hood.
katsch :: ((a -> Exception a b) -> Exception a c) -> Maybe a katsch f = intercept (do y <- f (\x -> throw x) return Nothing) (\x -> Just x)
-- Examples (must now be written in monadic style) a = katsch (\g -> return 42) -- Nothing b = katsch (\g -> return (\x -> do y <- g x -- Nothing z <- g y return z)) c = katsch (\g -> do x <- g 1; -- Just 1 y <- g 2; return (x + y))</pre>
This works but is unsatisfactory. I don't want to simulate
catch with exceptions. Is there a way to do
catch directly? I do not know, since it is not even clear to me that we have a monad.
The second example is easier to understand than the first one. Assume we have an operational semantics of Haskell in which it is possible to count steps of execution. Exactly what is “one step” does not matter. The important thing is that a diverging computation has infinitely many steps, whereas a terminating one has finitely many steps. Define a special operation
timeout :: Integer -> a -> Maybe a
timeout k e evaluates to
Just v if
e evaluates to
v in at most
k steps of execution, and to
Nothing if evaluation of
e takes more than
k steps. This could be implemented in an interpreter or compiler by keeping a counter of execution steps (we would actually need a stack of counters, one for each invocation of timeout).
Here is an attempt to implement timeout as a Haskel monad (think of the clock “ticking” when you read the code):
module Timeout where
-- We represent values together with number of steps needed to compute them data Timeout a = Ticks (Integer, a)
tick x = Ticks (1, x)
instance Monad (Timeout) where return = tick Ticks (j, x) >>= f = let Ticks (k, y) = f x in Ticks (1+j+k, y)
timeout :: Integer -> Timeout a -> Maybe a timeout n a = let Ticks (k, v) = a in if k <= n then Just v else Nothing
-- Examples a = timeout 4 (do x <- tick 7 -- Nothing y <- tick 5 return (x + y))
b = timeout 5 (do x <- tick 7 -- Just 12 y <- tick 5 return (x + y))
-- This example should evaluate to Nothing c = timeout 5 (omega 0) where omega n = do m <- tick (n+1) omega m</pre>
It is understandable, albeit annoying, that we have to “tick” basic computation steps explicitly. But the real trouble is that the last example diverges when it should evaluate to
Nothing. This is happening because the monad just counts steps of execution without ever aborting evaluation. What we really need is a monad which stops the execution when the allowed number of steps has been reached. I think this can be done, and I hope someone will tell me how, myabe with a comonad or some such.
Maybe I'm missing some deep theoretical aspect of your post, but practically I see no reason why 'timeout' cannot be implemented in monad. It suffices to replace the definition 'data Timeout a = Ticks (Integer, a)' with 'data Timeout a = Ticks (Nat, a)' assuming we have some well-behaved lazy Nat datatype, after this change evaluation of 'c' no longer diverges. The other possible way to go would be analogous to Maybe or Error monads: we could check the number of steps performed so far in the definition of 'bind'.
Unfortunately, Timeout is not a valid Monad: it doesn't satisfy the monad laws. To see this, note that the monad laws prohibit anything which counts the number of times that >>= is used. For example, the first law says that you need
return x >>= f == f x
For Timeout, if k is the number of "ticks" in "f x" then the lhs will have k+2 ticks and the rhs will have only k. I think that you can't implement the equivalent as an arrow or a comand for the same reasons.
It might work out, however, (but not necessarily as a monad) if you only count ticks in "primitive computations", not in return or >>=.
The boring way to do it is to make Timeout a = Integer -> Integer -> (Maybe a, Integer), or in Haskell nomenclature, Timeout a = MaybeT (ReaderT Integer (State Integer)) a. The idea being that tick checks the current counter against a maximum and returns Nothing if it's too big.
But that's complicated. The fun way to do it is with lazy naturals. Just define:
data Nat = Z | S Nat
greaterThan :: Nat -> Integer -> Bool greaterThan Z _ = False greaterThan (S _) 0 = True greaterThan (S n) m = greaterThan n (m-1)
And standard addition, then define Timeout just how you did in your post. The timeout function is then:
timeout :: Integer -> Timeout a -> Maybe a timeout n a = let Ticks (k,v) = a in if k
greaterThann then Nothing else Just v
And all the examples work fine :-). The Haskellian way to do it is to define Nat as a Monoid under addition then use the Writer monad. Same diff.
Oh, and I wouldn't have defined return = tick, because it violates a monad law.
Do we get anything by reducing monads to first-class continuations plus a piece of state?
I am referring to Filinski's completeness result for monads from POPL 1994: http://www.diku.dk/hjemmesider/ansatte/andrzej/papers/RM-abstract.html
Answers to Gleb: the lazy natural numbers idea is neat. I need to see if we can make the monad laws hold, and at the same time make sure that diverging computations consume infinitely many ticks. I think it can be done.
Answer to luqui: thanks for your comment. I am a bit worried about your first suggestion which uses state, but feel free to prove me wrong. I like the idea with lazy natural numbers.
Answer to speleologic: thank you for the reference, that is useful to know. In the context of computational effects there are other, more powerful theorems. There is in fact "the mother of all effects", namely the quote/unqoute computational effect. Every other (computable) effect can be simulated by it.
I conjecture that Haskell semantics are invariant under permutation of case statements in some sense, meaning that if all branches of a case statement A have the same case statement B in them then this B case statement can be pulled outside (and putting the same case statement A inside each branch of B) without changing the semantics (under some suitable set of hypotheses that include the assumption that the expressions being cased are not diverging). Adding catch to Haskell would violate this property that I'm conjecturing (as your example using (+) illustrates), and hence catch cannot be defined in Haskell. Therefore catch must be modelled in some way, such as with monads as you do.
Catch definitely breaks Haskell's purity, I am well aware of that since it obviously breaks beta reduction, a corner-stone of Haskell's evaluation strategy. Isn't the the whole point of computational effects to break purity? I am interested to see if you can simulate catch in terms of a monad. Is it a monad?
Andrej, could you give a reference to the more powerful theorems above? I am unable to find anything about the quote effect, although I think I know what you are referring to. The self-evaluator of lambda calculus, ie Scheme's quote/unquote. Thanks!
Timeout is an instance of an effect handler, a generalisation of exception handlers to effects that can be represented by algebraic theories (single-sorted equational theories). Time has such a representation if one takes a single unary operation 'tick' and no equations. Each such theory gives rise to a monad, and for time, it is the one that jeremy has suggested.
You can take a look at http://matija.pretnar.info/handlers-of-algebraic-effects.pdf for more detail.
What more precisely is the "challenge" you have in mind? If it is to "break purity" or "breaks beta reduction", then state and non-determinism are just as hard as your two examples. If it is to "express" effects as is typically done in Haskell, then your two examples are just as easy as state and non-determinism.
Reply to ccshan: the challenge here is to express catch and timeout in Haskell as is typically done in Haskell. I have failed to do so for catch (because simulating catch with the exception monad is an overkill). I disagree that catch is as easy as state and non-determinism. Show me how easy it is to directly express catch (and only catch) in Haskell as a monad, comonad, or whatever.
Timeout can be presented as a monad, which follows from the general theory of effect that Matija has referred to, but I would still like to see a concrete Haskell implementation that results from Matija's theory.
I wrote a note (in 1998) that considers ticks, and has a monad for that.
I would define, in Haskell, corresponding to the above note,
data T a = Stop a | Delay(T a)
instance Monad T where return = eta xs >>= f = mu(t f xs)
eta :: a -> T a eta x = Stop x
mu :: T(T a) -> T a mu(Stop xs) = xs mu(Delay xss) = Delay(mu xss)
t :: (a -> b) -> (T a -> T b) t f (Stop x) = Stop(f x) t f (Delay xs) = Delay(t f xs)
timeout :: Int -> T a -> Maybe a timeout n (Stop x) = Just x timeout 0 (Delay xs) = Nothing timeout (n+1) (Delay xs) = timeout n xs
timeof :: T a -> Int timeof (Stop x) = 0 timeof (Delay xs) = 1 + timeof xs
timedRecursion :: (T a -> T a) -> T a timedRecursion f = Delay (f (timedRecursion f))
The last one ticks once for each recursion unfolding. Notice the similarity (and difference) with lazy natural numbers. But, as the paper discusses, it is perhaps more sensible, in a call-by-name language, to tick only at ground types. Then you define, by induction on types, a new delay function (starting with delay=Delay at ground types) pointwise. The paper proves some properties of this.
Notice that an infinite computation returns, rather than bottom, infty, where
infty = Delay infty.
The output type of timeof can of course (and sensibly) be replaced by the lazy natural numbers.
This is a first try of a simple lift of Delay to functions spaces.
timedRecursion :: ((a -> T b) -> (a -> T b)) -> (a -> T b) timedRecursion f = \ a -> Delay (f (timedRecursion f) a)
rfac :: (Int -> T Int) -> (Int -> T Int) rfac f 0 = Stop 1 rfac f n = t (n*) (f (n-1))
fac :: Int -> T Int fac = timedRecursion rfac
Hi Sebastian. Yes, what you wrote is what you would get as the (denotational) metric semantics of the usual factorial program, or as the (operational) chattering semantics I describe in the above paper.