Two Guys Arguing

One fn to bind them

Posted in clojure, haskell by youngnh on 11.21.10

I had a chance to work on my parsec port a little this weekend. Say hello to one of the most important and ubiquitous parsers in the parsec arsenal, parser-bind.

The idea behind parser-bind is that it should squish two parsers together. It represents parsing one thing after another. The only other parser we’ve built that squishes two parsers together is parser-plus, which operates more like “or” in that if the first one fails, it tries the second. This parser will quit immediately whenever either fails. If this parser succeeds, it’s because each matched successive input.

(defn parser-bind [m n]
  (fn [state cok cerr eok eerr]
    (letfn [(mcok [item state]
              (n state cok cerr cok cerr)))
            (meok [item state]
              (n state cok cerr eok eerr)))]
      (m state mcok cerr meok eerr))))

If the first parser, m, consumes ok, but the second one, n, does not consume, our combined parser will still call the cok continuation. Conversely, if the first one is empty and ok, but the second one consumes, we will also escape via the cok continuation. parser-bind does not override any of the error handling continuations because if something goes wrong, we use them to exit immediately.

The useful part of parser-bind isn’t in the the above implementation. It isn’t how parsec implements the idea. Parsec’s implementation does take the first parser, m, but for it’s second argument, it takes a function that, when executed, returns the second parser.

This is a neat idea because the unlike a parser that has to be fully specified at write-time, a function can bind intermediate, runtime results. Those intermediate results, once bound and named can be used to create further parsers. It allows us to write let-like forms:

(p-let [c (one-of "abc")]
  (char c))

Where each binding form in the parser let has to be a destructuring form and parser pair. The above is a parser that parses a character, and then looks for a duplicate of what it just parsed, similar to capture groups in regular expressions. p-let uses parser-bind under the covers:

(defmacro p-let [[& bindings] & body]
  (let [[bind-form p] (take 2 bindings)]
    (if (= 2 (count bindings))
      `(parser-bind ~p (fn [~bind-form] ~@body))
      `(parser-bind ~p (fn [~bind-form] (p-let ~(drop 2 bindings) ~@body))))))

Given only a single binding pair, we make the parser in it the first argument to parser-bind, and wrap a function with it’s destructing form as args, returning the body. In longer binding forms, we produce a recursive structure that macroexpand will continue to expand one binding form at a time.

Tagged with:

2 Responses

Subscribe to comments with RSS.

  1. jimduey said, on 11.22.10 at 8:51 am

    Don’t know if you’ve seen the work Konrad Hinson has done with monads in Clojure, but you can do something like parsec using them. Checkout the parser described here:

    • youngnh said, on 11.22.10 at 10:46 am

      Wow, I had read through Konrad’s four-part monad tutorial, but I hadn’t seen your page. It looks like you really bring the full possibilities of monads to bear on the problem.

      I started this project, though, struck by how little of formal monad theory you needed to know to get a fully working implementation. I liked that the heavily monadic Haskell code could be approximated by vanilla defns and a couple of letfns.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s