Sunday, August 3, 2008

Fear and Loathing in Parse Vegas

Martin Fowler writes about parser fear and I have to say "guilty as charged". I've done a few DSL but all have be so simple that I could hand-write a parser using regular expressions and other string manipulations. To be honest, the resulting parser would probably be easier to understand, maintain, etc, if it was developed using a proper grammar and a parser generator. Despite (knowing) this, I kept writing those convoluted hand-written parsers.

I did the compiler class at the university and I'm intressted in most things programming langage related, e.g., compilers and parsers. Despite this I never actually done a parser (with proper grammar) by my self. Why? I had parser fear.

Fowler writes:
So why is there an unreasonable fear of writing parsers for DSLs? I think it boils down to two main reasons.
  • You didn't do the compiler class at university and therefore think parsers are scary.
  • You did do the compiler class at university and are therefore convinced that parsers are scary.

I think the last bullet explains why I never did a proper-grammar-parser by my self.

However, the last time I had to write a parser I (finally) realized that a proper-grammar-parser was a better idea than trying to hand-write something convoluted. The parser should be implemented in Ruby, so I googled (is that a verb now?) and found a generic recursive decent parser -- all I had to do was to write the grammar, which was straight forward.

There were several resons that finally made me take the step to use a proper parser:

  • The language was complex enough to make my old approach unsuitable
  • The parser was really easy to integrate with my other Ruby code
  • No separate step for generating the parser (i.e. short turn-around time, and less complexity because there is no code generation)
In essense: it was easy to use and test. That was the cure for my (irrational) anxiety towards parsers.

3 comments:

Greg said...

But generally speaking, if you're operating at the level of abstraction where you're writing a DSL, why wouldn't you be working in a language that makes an embedded DSL an easy natural fit?

I don't think parsing complex DSLs is really the way to go.

Togge said...

Sure, I agree.
However, if you don't have the freedom of choosing the syntax of the DSL, then there is not much you can do other than writing a parser. Also, if can't choose the(implementation) language, then you may be stuck with C#, Java, or whatever.

Togge said...

In my case I didn't have the freedom of choosing the syntax.