How does the Haskell do notation know which value to take when it isn't defined by a return?

I have this monadic object.

data Parser a = Parser (String -> Maybe (a, String))

instance Functor Parser where
  -- fmap :: (a -> b) -> Parser a -> Parser b
  fmap f (Parser pa) =  Parser $ \input -> case pa input of
                                             Nothing -> Nothing
                                             Just (a, rest) -> Just (f a, rest)

instance Applicative Parser where
  pure = return
  (<*>) = ap

instance Monad Parser where
  --return :: a -> Parser a
  return a =  Parser $ \input -> Just (a, input)

  --(>>=) :: Parser a -> (a -> Parser b) -> Parser b
  (Parser pa) >>= f  = Parser $ \input -> case pa input of
                                            Nothing -> Nothing
                                            Just (a,rest) -> parse (f a) rest

And I have this definition of an item which I am told "reads in a character" but I don't really see any reading going on.

item :: Parser Char
item = Parser $ \ input -> case input of ""    -> Nothing
                                         (h:t) -> Just (h, t)

But ok, fine, maybe I should just relax about how literal to take the word "read" and jibe with it. Moving on, I have

failParse :: Parser a
failParse = Parser $ \ input -> Nothing

sat :: (Char -> Bool) -> Parser Char
sat p = do c <- item
           if p c
           then return c
           else failParse

And this is where I get pretty confused. What is getting stored in the variable c? Since item is a Parser with parameter Char, my first guess is that c is storing such an object. But after a second of thought I know that's not now the do notation works, you don't get the monad you get the content of the monad. Great, but then that tells me c is then the function

\ input -> case input of ""    -> Nothing
                         (h:t) -> Just (h, t)

But clearly that's wrong since the next line of the definition of sat treats c like a character. Not only is that not what I expect, but it's about three levels of structure down from what I expected! It's not the function, it's not the Maybe object, and it's not the tuple, but it's the left coordinate of the Just tuple buried inside the function! How is that little character working all that way outside? What is instructing the <- to extract this part of the monad?

2 answers

  • answered 2018-11-08 07:05 assembly.jc

    As comment mentioned, <- just be do notation syntax sugar and equivalent to:

    item >>= (\c->if p c 
                  then return c 
                  else failParse)
    

    Okay, let see what is c? consider the definition of (>>=)

    (>>=) :: Parser a -> (a -> Parser b) -> Parser b
    

    or more readable way:

    Parser a >>= (a -> Parser b)
    

    And Now, matches it with above expression item >>= (\c->if p c then return c else failParse) give:

    Parer a = item
    

    and

    (a->Parser b) = (\c->if p c then return c else failParse) 
    

    and item has type:

    item :: Parser Char
    

    so, we can now replace a in (>>=) by Char, gives

    Parser Char >>= (Char -> Parser b)
    

    and now \c->if p c then return c else failParse also have type: (Char -> Parser b)

    and so c is a Char, and the whole expression can be extended to:

    sat p =
    item >>= (\c->...) = 
    Parser pa >= (\c->...) = Parser $ \input -> case pa input of
                                                Nothing -> Nothing
                                                Just (a,rest) -> parse (f a) rest
                             where f c =  if p c
                                          then return c
                                          else failParse
                                   pa input = case input of ""   -> Nothing
                                                           (h:t) -> Just (h, t)
    

  • answered 2018-11-08 07:59 Will Ness

    TL;DR: In general, by Monad laws,

    do { item }
    

    is the same as

    do { c <- item
       ; return c
       }
    

    so it is defined by a return, in a sense. Details follow.


    It does take one character out from the input string which is being "read", so in this sense it "reads" that character:

    item :: Parser                                   Char
    item = Parser $ \ input ->   -- input :: [Char]   
                 case input of { ""    -> Nothing
                               ; (h:t) -> Just (h, t)  -- (h:t) :: [Char]
                               }             -- h :: Char    t :: [Char]
    

    and I bet there's a definition

    parse (Parser pa) input = pa input
    

    defined there somewhere; so

    parse item input = case input of { "" -> Nothing ; (h:t) -> Just (h, t) }
    

    Next, what does (>>=) mean? It means that

    parse (Parser pa >>= f) input = case (parse (Parser pa) input) of
                         Nothing -> Nothing
                         Just (a, leftovers) -> parse (f a) leftovers
    

    i.e.

    parse (item      >>= f) input = case (parse  item  input) of
                         Nothing -> Nothing
                         Just (a, leftovers) -> parse (f a) leftovers
          = case (case input of { "" -> Nothing ; (h:t) -> Just (h, t) }) of
                         Nothing -> Nothing
                         Just (a, leftovers) -> parse (f a) leftovers
          = case input of 
               ""    -> Nothing 
               (h:t) -> case Just (h, t) of {
                             Just (a, leftovers) -> parse (f a) leftovers }
          = case input of 
               ""    -> Nothing 
               (h:t) -> parse (f h) t
    

    Now,

    -- sat p: a "satisfies `p`" parser 
    sat :: (Char -> Bool) -> Parser Char
    sat p = do { c <- item           -- sat p :: Parser Char
               ; if p c              -- item :: Parser Char,   c :: Char
                    then return c    -- return c :: Parser Char
                    else failParse   -- failParse :: Parser Char
               }
          = item >>= (\ c ->
                        if p c then return c else failParse)
    

    (by unraveling the do syntax), and so

    parse (sat p) input 
     = parse (item >>= (\ c ->
                        if p c then return c else failParse)) input
       -- parse (item >>= f) input
       --  = case input of { "" -> Nothing ; (h:t) -> parse (f h) t }
     = case input of 
           "" -> Nothing 
           (h:t) -> parse ((\ c -> if p c then (return c)
                                           else failParse) h) t
     = case input of 
           "" -> Nothing 
           (c:t) -> parse (if p c then (return c) 
                                   else failParse) t
     = case input of 
           "" -> Nothing 
           (c:t) -> if p c then parse (return c) t
                                   else parse failParse t
     = case input of 
           "" -> Nothing 
           (c:t) -> if p c then Just (c, t)
                                   else Nothing
    

    Now the meaning of sat p should be clear: for c produced by item (which is the first character in the input, if input is non-empty), if p c holds, c is accepted and the parse succeeds, otherwise the parse fails:

    sat p = for c from item:                 -- do { c <- item 
               if p c                        --    ; if p c
                    then return c            --        then return c 
                    else failParse           --        else failParse }