Sunday, August 3, 2008

Fear and Loathing in Parse Vegas

Martin Fowler writes about parser fear and I have to say "guilty as charged". I've done a few DSLs, but all have been so simple that I could hand-write a parser using regular expressions and other string manipulations. To be honest, the resulting parsers would probably have been easier to understand and maintain if they had been developed using a proper grammar and a parser generator. Despite knowing this, I kept writing those convoluted hand-written parsers.

I took the compiler class at university and I'm interested in most things related to programming languages, e.g., compilers and parsers. Despite this, I had never actually written a parser (with a proper grammar) by myself. Why? I had parser fear.

Fowler writes:
So why is there an unreasonable fear of writing parsers for DSLs? I think it boils down to two main reasons.
  • You didn't do the compiler class at university and therefore think parsers are scary.
  • You did do the compiler class at university and are therefore convinced that parsers are scary.

I think the last bullet explains why I never wrote a proper-grammar-parser by myself.

However, the last time I had to write a parser I (finally) realized that a proper-grammar-parser was a better idea than trying to hand-write something convoluted. The parser had to be implemented in Ruby, so I googled (is that a verb now?) and found a generic recursive descent parser -- all I had to do was write the grammar, which was straightforward.
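
To give a feel for what "recursive descent" means, here is a minimal sketch in Ruby. It is not the generic parser mentioned above (with that one, only the grammar had to be written); it is a hand-coded illustration for a made-up arithmetic grammar, and the class name TinyParser and its methods are hypothetical. The point is simply that each grammar rule becomes one method, which is why the technique is less scary than it sounds.

```ruby
require "strscan"

# Hypothetical grammar, for illustration only:
#   expr   := term   (('+' | '-') term)*
#   term   := factor (('*' | '/') factor)*
#   factor := NUMBER | '(' expr ')'
class TinyParser
  def initialize(source)
    @tokens = tokenize(source)
    @pos = 0
  end

  # Parses the whole input and returns the evaluated integer result.
  def parse
    value = expr
    raise ArgumentError, "unexpected token #{peek.inspect}" unless peek.nil?
    value
  end

  private

  # Splits the input into number and operator tokens, skipping whitespace.
  def tokenize(source)
    scanner = StringScanner.new(source)
    tokens = []
    until scanner.eos?
      scanner.skip(/\s+/)
      break if scanner.eos?
      token = scanner.scan(/\d+|[-+*\/()]/)
      raise ArgumentError, "unexpected character at #{scanner.pos}" if token.nil?
      tokens << token
    end
    tokens
  end

  # expr := term (('+' | '-') term)*
  def expr
    value = term
    while ["+", "-"].include?(peek)
      op = advance
      rhs = term
      value = op == "+" ? value + rhs : value - rhs
    end
    value
  end

  # term := factor (('*' | '/') factor)*   (integer division)
  def term
    value = factor
    while ["*", "/"].include?(peek)
      op = advance
      rhs = factor
      value = op == "*" ? value * rhs : value / rhs
    end
    value
  end

  # factor := NUMBER | '(' expr ')'
  def factor
    token = advance
    if token == "("
      value = expr
      raise ArgumentError, "expected ')'" unless advance == ")"
      value
    elsif token && token.match?(/\A\d+\z/)
      token.to_i
    else
      raise ArgumentError, "unexpected token #{token.inspect}"
    end
  end

  # Returns the current token without consuming it (nil at end of input).
  def peek
    @tokens[@pos]
  end

  # Consumes and returns the current token.
  def advance
    token = @tokens[@pos]
    @pos += 1
    token
  end
end
```

Calling TinyParser.new("2 + 3 * (4 - 1)").parse walks the rules top-down and evaluates as it goes. A real DSL parser would more likely build a syntax tree than compute a value directly, but the structure stays the same.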

There were several reasons that finally made me take the step to use a proper parser:

  • The language was complex enough to make my old approach unsuitable
  • The parser was really easy to integrate with my other Ruby code
  • No separate step for generating the parser (i.e. short turn-around time, and less complexity because there is no code generation)
In essence: it was easy to use and test. That was the cure for my (irrational) anxiety towards parsers.


Greg said...

But generally speaking, if you're operating at the level of abstraction where you're writing a DSL, why wouldn't you be working in a language that makes an embedded DSL an easy natural fit?

I don't think parsing complex DSLs is really the way to go.

Togge said...

Sure, I agree.
However, if you don't have the freedom of choosing the syntax of the DSL, then there is not much you can do other than write a parser. Also, if you can't choose the (implementation) language, then you may be stuck with C#, Java, or whatever.

Togge said...

In my case I didn't have the freedom of choosing the syntax.