Parsing wrongness and language value

NOTE HXA7241 2012-05-06T09:07Z

Parsing is a poor choice of problem. Programming languages should help us see and help us move.

MJS has recently been mulling over these kinds of things, so here is some more.

Why is parsing wrong?

“Is parsing not an illusory problem caused by using the wrong data structure?”

First, let us be clear about this message by separating it from its expression. The quote is actually logically degenerate – it is, in essence, asking: “If something is bad, is it bad?” Of course the answer is always yes – it is a tautology.

But that accidental rhetorical form merely lends some appealing force. The suggestion is that the thing in question is indeed significantly bad.


Parsing is sometimes said to be a solved problem, but that is only from a high-level academic computer-science perspective. Practically it is not. If you have to write a parser, there is some work – and for anything of moderate complexity or more, a serious amount of it (C++ anyone?).

The meaning of a data structure is the algorithms that use it (to paraphrase Wittgenstein). A choice of data structure really says what we want to do with it. So what of programming languages and text-files of code? We would have to deduce that our number-one priority is ease of typing-in code char-by-char. Does that sound like a sensible priority?

The normal text-file code data structure is biased toward making unimportant things easier, and so imposes significant barriers to the more important things we might want to do.

What should programming languages do?

As the bottom line, programming languages have only psychological value (very broadly speaking), not computational value – because they do no computation. All we value in computers and computation is orthogonal to language: changing languages will not improve it, or change it at all. (This is just the other face of the Church-Turing thesis coin.)


A programming language is a data structure with a presentation – that is the only constraint or limit on what it could possibly be. To clarify: a language is a data structure in that it fixes a general format and allows various particular instances: the grammar is the type, programs are the values. The common presentation is simply text.
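
As a minimal sketch of that reading, assume a toy arithmetic language invented purely for illustration (the names `Num`, `Add`, `Mul`, `render` and `evaluate` are hypothetical, not taken from any real language). The type definitions play the part of the grammar, a program is simply a value of that type, and text is one presentation derived from it.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Mul:
    left: "Expr"
    right: "Expr"


# The 'grammar' is the type: every well-formed program is a value of Expr.
Expr = Union[Num, Add, Mul]


def render(e: Expr) -> str:
    """One possible presentation of a program: fully parenthesised text."""
    if isinstance(e, Num):
        return str(e.value)
    symbol = "+" if isinstance(e, Add) else "*"
    return f"({render(e.left)} {symbol} {render(e.right)})"


def evaluate(e: Expr) -> int:
    """An algorithm over the structure itself; no parsing step is involved."""
    if isinstance(e, Num):
        return e.value
    if isinstance(e, Add):
        return evaluate(e.left) + evaluate(e.right)
    return evaluate(e.left) * evaluate(e.right)


# A program is a value of the type, built directly rather than typed in as text.
program = Add(Num(1), Mul(Num(2), Num(3)))
```

Here `render(program)` gives the text `(1 + (2 * 3))`, but nothing else depends on that text: `evaluate` and any other tool work on the value directly. A different presentation would be another function beside `render`, with the data structure left unchanged.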

We could picture it as an articulated mechanism with several freedoms of movement. Or a game like chess with various legal moves. (These map to software, rather than being only analogous, though they are simpler logically.)

So what should the presentation do? Perhaps the presentation for a programming language (hence programs) should show nearby, interesting, possible movements. (We also want the data structure, the ‘grammar’/type, to facilitate a good set of freedoms or moves, but that can be left aside for the moment.)
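
One rough way to picture 'nearby possible movements' is to enumerate every program reachable by a single rewrite. The sketch below assumes a hypothetical toy form (an integer, or a tuple of operator and two operands) and just two illustrative moves, swapping operands and folding constants. A real language would have a far richer set; this only shows the shape of the idea.

```python
from typing import Iterator, Tuple, Union

# Hypothetical toy program form: an int, or ("+" | "*", left, right).
Expr = Union[int, Tuple[str, "Expr", "Expr"]]


def local_moves(e: Expr) -> Iterator[Tuple[str, Expr]]:
    """Yield the rewrites that apply at the root of e."""
    if isinstance(e, int):
        return
    op, left, right = e
    # Both operators here are commutative, so swapping is always legal.
    yield ("swap operands", (op, right, left))
    if isinstance(left, int) and isinstance(right, int):
        value = left + right if op == "+" else left * right
        yield ("fold constants", value)


def moves(e: Expr) -> Iterator[Tuple[str, Expr]]:
    """Yield every program reachable from e by one rewrite, anywhere in the tree."""
    yield from local_moves(e)
    if isinstance(e, int):
        return
    op, left, right = e
    for name, new_left in moves(left):
        yield (f"left: {name}", (op, new_left, right))
    for name, new_right in moves(right):
        yield (f"right: {name}", (op, left, new_right))


program = ("+", 1, ("*", 2, 3))
nearby = list(moves(program))
```

For this program `nearby` holds three neighbours: the whole sum with its operands swapped, the product with its operands swapped, and the product folded to `6`. A presentation in the sense above would put such a list in front of the programmer, instead of leaving it to be worked out by reading text.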

The intent is that a language should present a rich, clear, view of the extended/implicit form of the software artifact – the kinds of patterns, or regularities, composing it or relating to it. A good language communicates not only the program itself, but its ‘context of extension’ – the ways it relates to other things, and the ways it could be extended, advanced, or changed. If we want to understand a mechanism or game, we diagram its envelopes of motions etc. or the tactical look-ahead, and so on.

We want a view of possible change because the main thing we do with software is to change it – to develop it, re-do it, extend it, modify it. The most essential requirement of languages is to support the most essential activities of software engineering.


How can we get a clearer idea of what this could amount to specifically, concretely, in practice? Ask two questions: what kinds of transforms do we want to make on code, and what would be interesting to know about their results (that is, what is interesting to know about any program or part of a program)?

The first can be pursued by looking carefully at what we do, as above. We want to support the transforms of normal development: adding, removing, and modifying functionality.

The second can be pursued by recalling the two basic aspects/questions of software engineering: the computational properties and the developmental properties – how much time/space/energy does computation take, and how much effort/resources does development take. (Do any current dev-environments/languages help with such information? They should.)
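
To make those two kinds of property slightly more concrete, here is a hedged sketch over a hypothetical toy program form (an integer, or a tuple of operator and two operands). Both measures are crude stand-ins chosen only for illustration: counting operations approximates computational cost, and counting nodes approximates how much there is to read or change.

```python
from typing import Tuple, Union

# Hypothetical toy program form: an int, or ("+" | "*", left, right).
Expr = Union[int, Tuple[str, "Expr", "Expr"]]


def operation_count(e: Expr) -> int:
    """Number of arithmetic operations: a crude stand-in for running time."""
    if isinstance(e, int):
        return 0
    _, left, right = e
    return 1 + operation_count(left) + operation_count(right)


def node_count(e: Expr) -> int:
    """Number of nodes: a crude stand-in for the effort of reading or changing it."""
    if isinstance(e, int):
        return 1
    _, left, right = e
    return 1 + node_count(left) + node_count(right)


# The same program before and after one transform (folding 2 * 3 into 6).
before = ("+", 1, ("*", 2, 3))
after = ("+", 1, 6)
```

Comparing `operation_count` and `node_count` for `before` and `after` shows what the transform bought on each axis. An environment could report such differences for every candidate change, which is the kind of information the question above asks for.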

Immediate feedback, a direct view of execution state and effect – as recent demos have shown – is just a start. It is good to see what we are doing, but it would be better to see a little further too: to see what we might do next.


How we interact with the material we work with can be of great importance. It leads us to discover more about that part of the world, and leads us to new thoughts about it. If humans were blind, would we have achieved so much?

Programming languages are the eyes through which we see the computational universe. Are they doing that job well?

A good programming language shows us possible places to go and helps us step toward them.