User login

Parsing's not just for languages anymore

I look back at the history of computer languages and I am truly amazed and delighted by what the field has accomplished. Sixty years ago, we had nothing more than an array of switches to program a computer. Now we have programming languages that let us abstract much further toward the problem domain.

And I am fascinated by ways to improve programming language expressiveness. Managed memory, structured programming, object-orientation---all of these free the programmer from the minutiae of registers and jumps. And what's even more wonderful is that despite all of these, we are still at the very beginning of what is possible.

We know next to nothing about what is possible. Programming doesn't have to be the tedious, error-prone, repetitive chore it is today. I look forward to a day when I can code up a new idea in a few lines--just to try it out. A simple idea should have a simple representation in code. I'm currently putting a lot of hope in the project headed by Alan Kay.

I just posted about why people love to talk about DSL's. A little parsing can increase your productivity in gobs. But the folks at Viewpoint Research Institute are putting parsers to work in a different way---with impressive results. In this post, I'll explain what they're doing and why I like it.

But first, a little background. Martin Fowler says he prefers to get his DSL's working like this:

Develop Domain Model

Define the domain of the problem using classes, functions, etc.

Create a parser/AST Translator

Define a language that maps well to the domain

Write a solution to the problem in the new language

Your solution should now be easily expressed

Wow. That sounds great and powerful. Read his article to learn more about his technique.

But that's not what I want to talk about in this post.

What I do want to talk about is this: instead of writing a language that gets translated into domain objects, you model the problem as a language to be parsed! You can model many problems in computer science as a sequence of inputs. The computer needs to interpret them---figure out what they mean. Then the computer needs to figure out what to do with them---how to respond or react. You can look at network traffic, user input, and disk I/O that way. Modeling the problem as a protocol---a structured language---allows you to use a parser to define how to interpret the patterns and what actions need to be taken.

So all you have to do is define a grammar that "solves" the problem. In a DSL, you write the solution in the mini-language according to the rules of your grammar. In the new system, the grammar is the solution.

What's the use of that, you say?

Well, the year-end report details a few proofs-of-concept, including:

  • TCP/IP -- implemented in less than 200 lines
  • Animated Postscript Rendering -- in realtime in a few thousand lines of code

The parsing system is called OMeta. Read the report. You might be surprised.

Why is this such a powerful approach?

Declarative

don't worry about "how" it's done

Powerful, recursive pattern matching

like regular expressions but more powerful

Non-deterministic with backtracking

it figures out which rule to apply automatically

Grammar rules

define value of an expression and side effects

There's still a lot of work to do on the project. Efficiency is still an issue. And how far can the idea be stretched?

I'd say quite far. And I'm looking forward to hearing more from them. They've only just scratched the surface.

And could it be? OMeta in Lisp? Yes!

Comments

Post new comment

The content of this field is kept private and will not be shown publicly.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Allowed HTML tags: <a> <em> <strong> <cite> <code> <ul> <ol> <li> <dl> <dt> <dd>
  • Lines and paragraphs break automatically.

More information about formatting options

CAPTCHA
This question is for testing whether you are a human visitor and to prevent automated spam submissions.