People talk a lot about programming language expressivity. Why is it important?

Simplifying a bit, a highly expressive language lets you write the same program in fewer lines of code (LOC). Fewer lines of code are easier to write, easier to extend, and easier to maintain. Software cost is not directly proportional to LOC; it grows faster than that. If you want to change a feature, a bigger codebase means looking through more lines that aren't the code you want to change. It means more lines to change. It means more time spent understanding how the system works before you change anything. So it's vitally important to reduce the LOC of your app. One way to do that is to use a more expressive language.
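To get a feel for what "not linear" means, here is a toy sketch. The power-law shape and the exponent of 1.5 are assumptions picked purely for illustration, not measured values; the only point is that any exponent above 1 makes doubling the code more than double the cost.

```python
def relative_cost(loc, exponent=1.5):
    """Toy cost model: cost grows faster than linearly with lines of code.

    The exponent is an assumption for illustration, not a measured value.
    """
    return loc ** exponent


small, large = 1_000, 2_000
ratio = relative_cost(large) / relative_cost(small)
print(f"Doubling LOC multiplies cost by {ratio:.2f}")
```

Under this assumed model, doubling the code size multiplies the cost by about 2.83 rather than 2.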

So, let me give a little formal theory about what I mean by expressivity. First of all, expressivity is relative to a given feature (piece of functionality). There is no general or absolute expressivity. So we might say Java is more expressive for web service programming than Common Lisp is. That's debatable (though very possible). Visual Basic is (potentially) more expressive for GUIs than Erlang. The point is that the language we use can help us solve a particular problem better than other languages.
Second of all, expressivity has a metric: some quantitative or qualitative comparison. Very often, people treat expressivity as inversely proportional to the number of lines of code. As long as the feature is the same, you can use the metric to compare the implementations in two different languages.
Lots of people on the internet show one single piece of functionality to prove the expressive superiority of their language. For example, a Scheme guy will say "my language can express a lambda in two lines, but Java needs at least 10." That's an expression of Scheme's expressivity versus Java's for the domain of anonymous functions.
This approach is flawed for at least two reasons. First, a program containing only one feature isn't very useful, so the Scheme versus Java example isn't very practical. It doesn't talk about real-world programming. Second, these single features are often the ones the language was designed for. Scheme was designed to make lambda easy to do. The Java designers optimized other things.
Now that I think about it, another flaw is that this kind of comparison annoys people. But that's beside the point.
I propose a better measure of expressivity. It's better because it's fairer. It still has the disadvantage of being pretty unlikely to actually be used. Here it is:

The number of features n used in the expressivity comparison needs to be arbitrarily large. The choice of metric is still open. So, for example, in the Scheme vs. Java example above, n = 1 (creating a lambda) and the metric is LOC.
The real measure of expressivity is how well it helps you write large, real-world programs. That means n=100, n=1,000, or n=1,000,000, etc. As long as the apps implemented in both languages are matched feature for feature, you can compare them with the same metric.
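A minimal sketch of that comparison follows. It assumes a hypothetical table mapping each feature to the LOC it took in each language. The lambda figures echo the example above; the other numbers and the names (`compare_loc`, `http_handler`, `gui_button`) are invented for illustration. Only features present in both implementations are counted, so the apps stay matched feature for feature.

```python
def compare_loc(impl_a, impl_b):
    """Return (n, total_a, total_b) over the features both implementations share."""
    shared = sorted(set(impl_a) & set(impl_b))
    total_a = sum(impl_a[f] for f in shared)
    total_b = sum(impl_b[f] for f in shared)
    return len(shared), total_a, total_b


# Invented numbers, purely illustrative.
java = {"lambda": 10, "http_handler": 12, "gui_button": 8}
scheme = {"lambda": 2, "http_handler": 30, "gui_button": 25}

# n = 1: the single-feature comparison from the lambda example.
print(compare_loc({"lambda": 10}, {"lambda": 2}))

# n = 3: the picture can flip once more features are included.
print(compare_loc(java, scheme))
```

With these made-up numbers the n = 1 comparison favors Scheme (2 lines against 10), while the n = 3 comparison favors Java (30 lines against 57), which is exactly why a single feature proves little.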

In the graph, I've taken a very conservative curve for the two languages. Java's LOC increases linearly with the number of features. Really, I don't think Java can do better than this. Scheme, on the other hand, starts off at a disadvantage because of all of the Java libraries. Scheme has to catch up. But as the Schemers develop macros to abstract away their code, their code can actually get smaller. Adding a new feature can actually be done with fewer lines of code than the last feature. Therefore, Scheme's growth is less than linear.
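The shape of those two curves can be sketched with a toy model. Every number here is an assumption chosen for illustration (lines per Java feature, Scheme's upfront catch-up cost, and the rate at which each Scheme feature gets cheaper); none of them come from real projects.

```python
def java_loc(n, per_feature=50):
    """Linear growth: every feature costs the same number of lines."""
    return per_feature * n


def scheme_loc(n, startup=2000, first_feature=60, decay=0.99):
    """Upfront catch-up cost, then each feature is slightly cheaper than the last."""
    total = startup
    cost = first_feature
    for _ in range(n):
        total += cost
        cost *= decay  # macros make the next feature a little cheaper
    return total


def crossover(max_n=10_000):
    """Smallest n at which the Scheme total drops below the Java total, or None."""
    for n in range(1, max_n + 1):
        if scheme_loc(n) < java_loc(n):
            return n
    return None


print("Scheme overtakes Java at n =", crossover())
```

For small n the Scheme total is larger because of the startup cost. Past the crossover point the gap keeps widening in Scheme's favor, since the Java total keeps climbing at a constant rate while each additional Scheme feature costs less than the one before.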
So, since software cost increases more than linearly, it is extremely important to improve expressivity to avoid skyrocketing software costs. Bottom line: use the most expressive language that can get the job done.
So that's the framework I propose. Of course, I don't really expect any real comparison like this to happen. I mean, the real way to figure out who is more expressive, Java or Scheme, is to implement something big in both, like a web browser. And they both have to work exactly the same way. And to be fair, you need to include features that are considered easy in Java and features that are considered easy in Scheme. But I think it's a good way to think about it.
Now what's left is to pick a metric.
Comments
Thank you! You have
Thank you! You have expressed better than I ever could why I think Forth and Lisp would, ultimately, be better languages for nearly every real-world project I can think of, and your observations seem to give additional evidence to why I consider frameworks and other forms of middleware on the market today evil, both open-source and commercial alike.
In particular, the organization will need to make a tangible investment in language support infrastructure (macros, proprietary libraries, etc.), but once that's created, everything tends to be smooth sailing from there.
I used to scoff at this "Not Invented Here" syndrome, but after significant exposure to real-world programming and QA roles in major corporations, I've come to the conclusion that in-house-developed software is very nearly *always* preferable to stuff you pull off the shelf.
I have another metric which I use personally: if it takes me longer to read an API reference document than it takes for me to write it myself from first concepts (which, as it turns out, is usually the case), I just write it myself.
Expressivity != whole story?
I like this article but I'm not sure I really agree with the conclusion in all cases.
Often, less expressive languages (e.g., Java vs. Lisp) come with other advantages like static typing, additional compile-time checks, and refactoring tools that are of great benefit to large projects. There are various levels at which languages can trade expressivity for "safety" (meaning the extent to which the compiler can find errors in your code); consider pure functional languages like Haskell as an extreme example of this.
There's also the issue that code in more expressive languages can be harder to predict or analyse; here I'm thinking of a Ruby program with ORM classes generated at runtime, vs. a Java program where the ORM classes are generated at or before compile time.
So even though you will need to type more to produce a Java program than an equivalent Lisp program, expressivity doesn't tell the whole story.
Although if I'm programming myself I'll take Lisp over Java any day, unless the libraries I'm going to rely on are larger than my program's code :)
I disagree
I've never really found that type safety helped me develop "safer" code. The most important thing for me is to be able to keep the whole system in my head. And that means to be able to reduce the lines of code.
I find that I can remember what functions need what types---my own internal "type checking"---just fine while I'm coding, debugging, and refactoring. That is, as long as I can see most of the code at a glance. Maybe about a page or two per feature. Once it extends beyond that, I want to reorganize so that I can understand it all.
What I find with static languages is that you are forced to write more code to work around the static parts. You need a method to deal with type A. You need another method to deal with type B. Et cetera. More duplicated code. More errors.
What I find with static
"What I find with static languages is that you are forced to write more code to work around the static parts. You need a method to deal with type A. You need another method to deal with type B."
Then you are doing it wrong. Either you are not using an expressive language as this article recommends (no structural typing), or your data structures and functions are not properly factored.
@bob84123 Haskell (and _some_
@bob84123
Haskell (and _some_ other pure functional languages, you can't really generalize - there even exist dynamically typed ones!) don't really trade any expressivity to gain safety - although you do tend to express things in different ways once you have support for it. You need to learn a few extra things in order to write imperative-style code as quickly/cleanly/expressively, but those are merely Haskell's codifications of policies you (hopefully!) already adhere to in other languages.
Focus on the beginning of the curve!
I think the part of your Java vs Scheme curve that is the most interesting is the beginning of the curve. That is the part that determines the adoption of the languages. If you were to accurately graph the curves of all languages together, I would posit that the language with the lowest integral for the first 10 features would be the language that gets the most adoption. If you want your language to have lots of users you need to focus on that section of the curve, and not the end of it.
that would explain
That would explain why it is so important to have a simple install for your language. That's the first feature of every application.