When I want to develop an application, I choose the best language for the job. But when I want to develop a web application, no matter what language I choose, I feel restricted by the current selection of database technology.
Standard industry practice is to use a database as the storage mechanism for your web application. That lets you take advantage of ACID properties and scale up well.
But I always have problems with it. The types rarely correspond to the types I have available in my programming language. And a row in a table does not map well to an object. Let's not even go into how you have to define your row format ahead of time. All in all, relational databases are so fundamental a piece of your application that they drag your coding practices down to their lowest common denominator. I hate it, and I've searched far and wide for alternatives.
One alternative is AllegroCache. It's a persistent object database that you query using Prolog. Slava Akhmechet did a great job explaining the how and why of AllegroCache. AllegroCache looks great. It's a heavy-duty database, with full transactions, and it lets you store plain-old CLOS objects. That's really nice, since you never have to build a bridge to a table-based database.
I've never used AllegroCache since it's expensive. I try to use open source whenever possible (because it's free). But I haven't found an open-source persistent object database. There are some attempts at open-source equivalents. Elephant and PLOB! come to mind. I think there are licensing issues with PLOB!. I have heard good things about Elephant, but I have never used it myself. I just might try it for my next project!
Common Lisp has a very long lineage. The large standard library in the :COMMON-LISP package is sort of an akashic record of that long history. It records programming wisdom and knowledge passed down through the generations of Lisp programmers. There is often a very clean, clear, and general facility for performing a task built into the language.
Other languages were written for different purposes and have different (usually shorter) histories. So they probably won't have the same idioms that Lisp enjoys. When you're translating Java code (or any other language) to Common Lisp, it's all too easy to miss some opportunities for writing clear, lispy code. Simon Alexander has a great post detailing a lispier translation of code that Mike Ajemian translated last week.
This is something I struggle with, too. The learning curve of Lisp is steeper because of the breadth and depth of its library. But the richness that Lisp provides in the end is beautiful to behold. I have learned that before I write a function that seems very useful, I stop and consult the HyperSpec to see if it already exists. Very often it does, and with nice optional parameters that anticipate my needs. Truly, the Lisp gods must be praised.
My last post was called Java is not Object-Oriented. It's principally about where most of the abstraction in Java takes place and where it will take place in the future. There is one trend, in particular, that I would like to focus on.
Let's call it the paradigm cycle. It goes like this: programmers program using the features of a language. Best practices emerge as to how to use those features to write better code---a paradigm emerges. Programmers discipline themselves to follow the best practices. Then the paradigm becomes integrated into the language itself. Linguistic constructs for the concepts of the best practices start showing up. Programmers get used to their language having those features. Then they take the paradigm for granted.

Take structured programming, for example. People came to realize that GOTOs scattered all over your code made it hard to read and maintain. The languages of the day started to emphasize conditionals and formal loop statements instead of GOTO. Now we take structured programming for granted. It is so natural to assume that conditionals exist that we don't go around calling C structure-oriented, even though it is.
And this has happened with OOP, functional, and procedural programming. They have been well-integrated into their respective languages---to the point of becoming ordinary. The languages have assumed the paradigms as given. A Java programmer uses classes, methods, and polymorphism as casually as a C programmer writes new functions. Writing object-oriented code in C is non-obvious. You have to write your own object system and stick to a rigorous discipline. But in Java, the language takes care of the details.
If we project this cycle from the past, through the present, and into the future, what do we see? We see that a new paradigm, built on top of Object-Oriented, on top of functional, and on top of procedural is inevitable. And those changes are already happening.
Ronen mentions that the JCP is creating a module system for Java. They want to make a new package system with versioning support, integrity checks, and a distribution mechanism. These are ways to directly support library-oriented programming in the language itself.

Martin Ward coined the term "Language-oriented programming" to describe the paradigm where DSLs are a main abstraction. That shows that people are already building a new abstraction paradigm on top of other languages. Ruby Gems are a neat way to define, recombine, and distribute mini-languages.
So, what is now mainly a discipline of the programmer is being integrated into the language. New features are being experimented with to relieve the programmer of the burden of library and language maintenance. Very soon, programming languages will take care of most of the details of the new abstractions---just as they do now with the old ones. And programmers will come up with a new abstraction to build on top of that.
Now is our chance to ride the coming wave. We know it's going to happen as it has happened many times in the past. Become an early adopter. Evolve the language to include the new paradigms. Nature rewards those who further her aims (Buckminster Fuller). The best way to predict the future is to invent it (Alan Kay). You might as well ride the curve out in front of it instead of getting smashed by it and wondering what happened.
So what needs to be done? Abelson and Sussman (SICP) say that every powerful language has three mechanisms for combining simple ideas into more complex ones:
* primitive expressions, which represent the simplest entities the language is concerned with,
* means of combination, by which compound elements are built from simpler ones, and
* means of abstraction, by which compound elements can be named and manipulated as units.
A library-oriented language needs libraries as primitives, a way to combine the libraries, and a way to name and manipulate those combinations. The Java Module System defines how libraries are defined (primitive) and a way to combine them (through dependencies). It might define more---the project is not yet complete. But it is obviously a start in this direction. There is still so much to be done.
But has it already happened? I blogged about OMeta a while ago. The language for defining languages. Languages being a primitive of abstraction in the language? Grammars are first-class objects? Grammars can inherit from other grammars (combination)? That sounds like language-oriented programming to me. From the guy who brought you Object-oriented.
New programming paradigms will inevitably emerge. We can already see it happening in Java and other languages. Language-oriented and Library-oriented programming will likely be built on top of the current object-oriented and functional programming paradigms. They will become commonplace. The question is this: what will your favorite language do to participate in this monumental change? Further, what new possibilities will emerge from the new paradigms?
The Java programming language is one of the most taught, learned, written about, and programmed in programming languages today. Beginning its life in 1995, it rode in on the Object-Oriented Programming hype-wave of the nineties. Although some might argue that Java's primary means of abstraction is the class---and therefore Java is primarily object-oriented, the huge number of available Java libraries indicates otherwise. In this essay, I will argue that Java's most powerful means of abstraction is the library. I will also explore whether a new term---namely "Library-Oriented Programming"---is warranted to describe programming using libraries as the main abstraction.
A library (in OO languages) is a collection of classes designed and made to be used together in an organized way. There are certain conventions that must be followed to use the library. And within those conventions, you can solve a problem. It may be a foreign concept to some to see libraries described as a form of abstraction, so I might need to borrow the credibility of an outside source. According to FOLDOC, the definition of abstraction is:
A library is meant to hide the details of a particular problem (say reading the contents of c:\directory\file.txt) by generalizing it to some more abstract problem (like creating a File object, then creating a FileInputStream, then wrapping it in an InputStreamReader). In general, though, classes of a library must be used together for them to work. This means that the library is an atomic unit. One could define a programming paradigm as a way to define atomic units (and their interactions) that model a problem in an abstract way. Java, therefore, relies on libraries as one of its means of abstraction.
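The same layering can be sketched outside Java. The following is only a rough Python analogue of the File, FileInputStream, InputStreamReader chain, built from the standard `io` module. The in-memory bytes stand in for a file on disk, and the helper name `read_lines` is made up for illustration. The point is that none of the three classes is very useful alone; the abstraction is in how they fit together.

```python
import io


def read_lines(data: bytes, encoding: str = "utf-8"):
    """Read text lines from raw bytes by stacking three cooperating io classes.

    `read_lines` is a hypothetical helper; the io classes are standard library.
    """
    raw = io.BytesIO(data)                # the byte source (stands in for a file)
    buffered = io.BufferedReader(raw)     # adds buffering on top of the source
    reader = io.TextIOWrapper(buffered, encoding=encoding)  # decodes bytes to text
    return [line.rstrip("\n") for line in reader]


lines = read_lines("first line\nsecond line\n".encode("utf-8"))
print(lines)
```

Each layer only makes sense in terms of the one below it, which is the sense in which the library, rather than any single class, is the atomic unit.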
But are libraries the principal means of abstraction? When you need to read in a weird file format or speak some well-defined protocol, do you ask yourself "I wonder if someone has written a library to do this?" or do you ask yourself "I wonder if someone has written a class to do this?" My intuition is to ask for the library. And I suspect most others do the same.
On the other side, you don't create an object to solve the problem, you create a framework--another word for a library. On the whole, solitary classes are not powerful enough to abstract away the complexity of real-world problems. Classes make great calculators, cash registers, and rectangle simulators---examples that teachers and books use to teach OOP principles---but they won't abstract very much more than that before they start to get brittle.
Since many classes are needed, and those classes, in general, are intertwined with the specifics of the library, the classes themselves do not abstract away problems atomically. The meat of the abstraction's ability to solve a problem is in the whole of the library.
One could argue that, although libraries are very commonly used to model problems, the main source of the power of the libraries is from the object-oriented nature of Java. It is therefore unfair to say that libraries are "principal", even though they are used as abstractions. Do not the bread and butter of OOP---object encapsulation, message passing, and polymorphism---come into play?
Object-oriented programming was touted as the solution to the code reuse problem. However, experience has shown the programming community that classes, by themselves, are not reused. When was the last time you downloaded a single .class file? It is, in fact, not the objects that are important, but the interactions and relationships between objects that are important. Their interactions are where they get their power. Hence libraries, which define a set of classes and their interactions, are the principal means of code reuse. And it is this reuse, because libraries can solve many specific problems in a general way, that defines their place as principal.
As to whether a new name should be used, well, that's really not up to me. The community will decide if it needs "library-oriented programming". I will probably use it---it seems a useful concept. I would say it's at least as useful as "language-oriented programming" as applied to Lisp. And don't forget that you heard it here first if others start adopting it.
From day to day, Java programmers write classes and methods. And they write code that instantiates, initializes, and operates on objects. But in the long view, those classes and methods and their patterns of use principally constitute libraries. Libraries have their own means of encapsulation (through interface classes) and their own means of combination (also through the interface classes). Library-oriented programming parallels Lisp's Language-oriented programming---where programmers write functions and macros then take a step back and realize they have written a whole new language. Could this be one of the reasons Java libraries flourish while Common Lisp's libraries don't? On another note: should language designers try to see the long view of their language as they design it? Can a language be built that makes the library the basic form of abstraction? What kind of abstraction would be superimposed on top?
One measure of the health and vitality of an online community is the number of people actively participating in that community. The online communities of many programming languages have grown tremendously with the advent of the Internet despite any technical limitations in the programming languages themselves. Although it has been argued for years that Common Lisp is technically superior to other (some would say all) languages, Lisp has failed to capitalize on the boom in Internet users to captivate the same scale of user base that other languages have. I will argue that the focus on technical merit is futile for attracting users and that a "killer-app" is needed to grow a sizable community.
One of Paul Graham's principal arguments for the adoption of Lisp (especially on the server side) is that Lisp will let you build powerful applications more quickly. The basic argument is that Lisp's powerful abstractions (macros and first-class functions) were key to Viaweb's success in beating its competition. His numerous essays expounding the benefits of a Lispy company have brought many people to consider Common Lisp as both a venerable repository of programming wisdom and an elegant tool for creating web programs.
But his essays have failed to bring the massive numbers of users the community needs and deserves. One of the most common complaints one hears about Common Lisp is its lack of libraries. This obviously wasn't the case in the past (there are gigantic repositories of old Lisp code buried in FTP sites). People complain that even the simplest of tasks have no decently supported library. It used to be the case that Lisp had most of the libraries. But Common Lisp has failed to keep up with the burgeoning number of file types, protocols, algorithms, and APIs that have exploded with the adoption of ubiquitous computing.
While some may blame Common Lisp's technical qualities (overeager use of parentheses, age, difficulty of use), I believe that these are not the cause of its lack of popularity. Many of the popular languages today began life much worse off than Lisp is today. Java was hideously slow when it first emerged from its corporate womb. Despite Ruby-on-Rails' admitted scalability issues, the popularity of Ruby continues to grow. The list goes on. But if it's not technical merit, how does a language become popular? How does anything become popular on the Internet?
In order to understand what draws and keeps users online, I would like to use YouTube as an example, though many other websites would do as well. YouTube has built its popularity on three main functions: consume content, share content, and create content. These are the three purposes of social media that Clay Shirky expounds on. Users can browse YouTube's website for videos, send cool videos they see to their friends, and upload their own videos.
The triad of functionality (consume ("This looks cool"), share ("Hey, look what I found"), create ("Hey, look what I made")) creates a virtuous circle. People who browse to the site find cool videos that they want to send to their friends. Those friends then tell more friends, who tell more friends. People want to upload their videos. Some want to impress people and get lots of views. Others just want a place to put their video and get feedback. But for whatever reason they do it, the community grows in size and strength for each video seen, shared, and created.
But there's something more to it: there are extremely compelling reasons to want to participate. Video allows for Narrative. Humans have a deep need to tell and listen to stories. Stories help us navigate the difficulties of life and keep us feeling connected to each other. Video allows for Novelty. New "viral" videos appear all the time on YouTube. Though they don't necessarily tell a deeply compelling story, they are new and creative enough that people send them to their friends a lot. Lastly, video allows for Community. At least the way YouTube does it. People post not only text comments but video responses as well. YouTubers want to participate, find common ground, and express themselves. It was not the technical aspects of streaming video on the web that made it so compelling to use. It was the way it let people interact.
It is this same aspect that brought Perl users together. Perl has the same triad that made YouTube a success: consume, share, create. Perl poetry brought narrative to the community. Perl one-liners in email signatures spread a kind of novelty. People love to compete to write the shortest Perl program that sings Happy Birthday (community). There is something that made people exclaim "Hey, look what I made!" on IRC. Some people collected them on their web pages under a heading equivalent to "Hey, look what I found!". It was compelling to learn how others did what they did. There was a kind of implicit narrative about how extremely obfuscated programs worked. Legends were born. Perl was a social programming language. CPAN (Perl's module repository) acts the same way. People consume modules, tell others about a cool module, and upload their own modules.
I would argue that Ruby is the same. People are excited to find some cool feature, some new idiom. Excited enough to exclaim it to enough people to get the buzz rolling. The RubyGem system allows for sharing and creation of new code in a social way. Ruby's features are novel enough to make people want to share. People feel compelled to try it (consume) because of the ease of creating a new web app with Rails. Before Rails, Ruby was not very well-known. Rails was its killer app.
And if Common Lisp wants a bigger, more vital community, that's what it needs. Common Lisp needs a killer-app. A social application that lets people consume code, share code, and create code. An application that makes it compelling to participate. Technical merits are not enough.
I don't think there is a "magic formula" for creating such an application. I would guess that a lot of experimentation would be needed before anything has the "killer app effect". But I can suggest a domain that I think has a good chance of working. It combines narrative, novelty, and community into its medium (and therefore message). That domain is video games.
Video games are a great medium for narrative. People can tell very compelling, interactive stories with simple games. Novelty also plays a big role in games. Players want new graphics, new gameplay, and new levels. And consuming good games is inherently compelling because it is fun. Sharing cool video games is natural.
The last part of the triad needed to make it compelling is the community. And that has to be done right. People need to feel that they belong and that they want to contribute. It would be a trivial exercise to emulate YouTube and make a website that had pages for each video game, score boards for high scores, and comments.
And although on the surface people are creating and playing video games, more fundamentally they are writing and sharing Lisp code. A cool video game will most certainly stoke the curiosity of programmers. And when they ask "How was this written?", the answer is but a click away. People will post their code to their blogs and on IRC and in their email. Much as JavaScript one-liners make their way around the web-development world, Lisp one-liners would allow the prowess of coders to be known by all. "Run this line at the REPL and watch my cool fireworks display." "Pong in 10 lines." I think it speaks for itself.
The technical hurdles are small. At base, one needs to be able to create, consume, and share the games. A good Common Lisp video game framework would let users interactively develop small, fun games. People could develop the game interactively---they could see the results of their coding as they code. A good framework would make it simple to create a simple game. The SDL bindings in Lisp seem to have a good reputation, so it shouldn't be hard to bootstrap something. Consuming should be as easy as downloading a file and pointing their Lisp image at it. Sharing is trivial with the ubiquity of email.
I believe that this could work, if executed well. And there are certainly other applications that have potential. But I am certain that Common Lisp needs a compelling application to draw new users. Espousing its technical merit (even superiority) will not grab a decent share of the programmosphere. Can we encourage an irresistible urge to show off Lisp code? What would it be like to commonly see Lisp expressions in people's email signatures?
We've all heard Greenspun's Tenth and only rule. It's often applied to non-Lisp programming languages as well as to larger applications. It prognosticates that, eventually, all languages will become Lisp. One assertion that follows from Greenspun's rule is that if all languages are going to turn into Lisp anyway, why not just start with Lisp? But judging from the available features in languages today, perhaps it is too late to "just start with Lisp". Inspired by a post that found its way to my trusty feed reader, I think it may already be too late.
Greenspun's Tenth rule, from the Wikipedia entry, reads as follows:
The basic idea is simple: as a program grows in complexity, the developers need to start reining in the resulting chaos. Automatic memory management is added. Functional utilities start to pile up. Pretty soon, they fire up a parser generator and add some DSLs and a prompt. Eventually, it's just a hacked-together version of Lisp without the parentheses.
The same, it can be argued, happens with programming languages. At first, they can only do basic scripting language stuff. Then the intrepid programmers discover the utility of a map operation. Maybe they implement a debugger. Or they really need anonymous functions. And, like before, it's not-quite-Lisp.
When faced with the prospect of having to manage a buggy, home-built codebase when all of the features you will have to develop anyway are already well implemented in Lisp, it seems like a no-brainer what you'd choose.
And I think that for a long time, this was true. But is it still true? What do we have now? The big players in the field are Java, C#, and Python. You might add Javascript to that list, too, depending on how you count it.
All of those languages have managed memory. Most have a decent debugger and feature-laden development environments. They run on different platforms. Python has decent functional-style programming. C# has some cool language development tools. They've got all these features that Lisp had long ago. But three things have changed---they're not ad hoc. They're not informally specified. And they're not buggy. The only parts of the rule that still hold are "slow", for some of them, and "half".
So, really, I think that Lisp, or its brainchildren, are, yes, mainstream. So long did I think that my obsessive desires would lead me away from the mainstream. No, they only bring me closer to Java. Or, put another way:
That quote is from Guy Steele. Halfway to Common Lisp? That seems awfully close to voluntarily following Greenspun's Tenth Rule. James Hague over at Programming in the 21st Century has a similar view on the matter. He says Functional Programming is Mainstream---and has been for a while. Other programming languages have half of Lisp---and more that Lisp doesn't have. Lisp, or its essence, is mainstream. And after bringing the fire of Lispy programming to the world of languages, it is forced to face the onslaughts of a growing amount of weenie-bashing for all of eternity.
Ok. Maybe that was melodramatic. My point is that it may be too late to start with Lisp so you don't have to reimplement all of its features. Because all of those new languages have already implemented them. At least what most people consider the important ones.
I imagine people will disagree with this view. People might say that although Java and C# have many of the features that made Lisp great, they don't have the essence that makes Lisp still the best choice for discriminating programmers. That essence might include meta-programming facilities, or first-class closures, or macros.
Macros let you subsume more code into less code. Macros let you write more functionality with fewer lines. Macros let you abstract away boilerplate into new syntax.
But the corporate manager will say: if everyone writes their own syntax, my programmers can't read each other's code. So instead of having to learn a language once, they will have to learn a new language each time they approach a program for the first time. And the value of macros is lessened.
Code as data lets you manipulate code at runtime. It means you can optimize it, count it, store it, send it somewhere, and more importantly, write it in itself. The possibilities are endless.
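To make "code as data" a little more concrete, here is a minimal sketch in Python using the standard `ast` module. It is not how Lisp does it: in Lisp the code already is a list, while here the program has to be parsed into a tree first, which is noticeably heavier. The class name `SwapAddForMul` and the helper `count_nodes` are invented for this example. The sketch counts the nodes of an expression, rewrites every addition into a multiplication, and then runs the rewritten code.

```python
import ast


class SwapAddForMul(ast.NodeTransformer):
    """Hypothetical rewrite pass: turn every `a + b` into `a * b`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)          # rewrite nested expressions first
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node


def count_nodes(tree):
    """Count every node in a syntax tree (an example of measuring code)."""
    return sum(1 for _ in ast.walk(tree))


source = "2 + 3 + 4"
tree = ast.parse(source, mode="eval")     # code becomes a data structure
original_value = eval(compile(tree, "<original>", "eval"))

rewritten = ast.fix_missing_locations(SwapAddForMul().visit(tree))
rewritten_value = eval(compile(rewritten, "<rewritten>", "eval"))

print(count_nodes(tree), original_value, rewritten_value)
```

The tree is ordinary data the whole time: it can be walked, counted, changed, and compiled again, which is the property the paragraph above is describing.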
But the corporate manager again has an answer: Java is already written. Why would I want to rewrite it? I have a program to develop---and you're worried about optimization? Let the folks at Sun worry about that. We're not language developers!
And so each of the features falls like a domino. Either it hinders some unforeseen corporate best practice, or it just isn't as powerful in that environment as its expressive purity would lead one to hope.
Python really seemed to be playing out Greenspun's Tenth for a while. I mean, really. Cool functional programming features. Strong support for a list-like basic type. Functions as the primary means of abstraction. And people were talking about it. They still do. And there was a sparkle of hope that one day, all of the features of Lisp would make it into Python. And that day would come when people realized that those other features were great. The Pythonists just didn't know how useful those features were. They just never used them before.
Check out this story as witnessed by Kenny Tilton.
When he finished Peter took questions and to my surprise called first on the rumpled old guy who had wandered in just before the talk began and eased himself into a chair just across the aisle from me and a few rows up.
This guy had wild white hair and a scraggly white beard and looked hopelessly lost as if he had gotten separated from the tour group and wandered in mostly to rest his feet and just a little to see what we were all up to. My first thought was that he would be terribly disappointed by our bizarre topic and my second thought was that he would be about the right age, Stanford is just down the road, I think he is still at Stanford -- could it be?
"Yes, John?" Peter said.
I won't pretend to remember Lisp inventor John McCarthy's exact words which is odd because there were only about ten but he simply asked if Python could gracefully manipulate Python code as data.
"No, John, it can't," said Peter and nothing more, graciously assenting to the professor's critique, and McCarthy said no more though Peter waited a moment to see if he would and in the silence a thousand words were said.
I quote this somewhat long story here because it really shows the dynamic. One person claims that they're close enough to be called a Lisp. Another guy pointing out "but you don't have this feature". It's just perfect. Python seems to be following that solitary Tenth Rule.
But the bomb was dropped, folks! Python, in its newest incarnation, is breaking the Rule. After absorbing all sorts of functional goodness from the Lispy womb, Python is rejecting its uterine confines! The news that Python 3000 (its projected release date) will reject lambda (!), map, filter, and reduce shocked the Functional Programming World. They later recanted some of it. But it is a bold statement against the tyrannical rule of Greenspun.
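For readers who want to see what the recanting amounted to, here is a small sketch of how these tools look in the Python 3 that eventually shipped: lambda, map, and filter stayed as built-ins, while reduce moved into the standard `functools` module. The list of numbers is arbitrary example data.

```python
from functools import reduce  # reduce is no longer a built-in in Python 3

numbers = [1, 2, 3, 4, 5, 6]

# filter and map return lazy iterators in Python 3, so wrap them in list().
evens = list(filter(lambda n: n % 2 == 0, numbers))
squares = list(map(lambda n: n * n, evens))

# reduce folds the list into a single value, starting from 0.
total = reduce(lambda acc, n: acc + n, squares, 0)

print(evens, squares, total)
```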
And Peter Norvig? The author of Paradigms of Artificial Intelligence Programming? He knows about those other features of Lisp. Trust me. He knows what he's missing. And he chose Python.
I guess my point, through all this meandering, is that other languages did borrow a lot from Lisp. About half of it. And now those features are out there, in the world. And in the meantime, while they were borrowing, they got some new features of their own. Features like giant user bases, gazillions of libraries, corporate support, standards bodies. So Lisp has half of the features of Python. Java and Python are far from my ideal language---but so is Common Lisp. The idea that I would have to implement so much of Lisp on my own is a little overblown these days. And speaking of reimplementation: How much of Python's standard library does a complex Lisp program reimplement? How much of Python would you have to reimplement before you regret choosing Common Lisp?

But the message I was trying to convey still rings true. With some hindsight, I've thought about a better way to express it. And in the great blog tradition, I'll write it as a list!
And there's also a big idea embedded in it: in this world of open-source software, you're not just worried about how good your code is. You need to attract users and more importantly developers. An open-source project can't survive without loving, passionate people. You need to make your users love the library.
Remember your library has two kinds of users: the programmers and the other code that uses your library. They have somewhat different needs, but they are both extremely important to think about.
This is the basis of everything! A library that doesn't make your users more powerful is not worth using!
Your job as a library designer is to make the programmer kick butt. Give the programmer the tools he needs to learn how to use your library for all it's worth. Show him with examples. Name your functions so that they're easier to find. Make it easy to get started and make it easy to learn more.

A good tool will make the user happy to work with it. Someone who is happy with a tool will send his friends to use it too. More users means more bug testing and more patches and more feature requests. An active library is a happy library.

Increasing the amount of feedback your library gives to the programmer will help the user learn faster and more correctly. The worst thing you can actually do is to make the programmer totally unaware of what has happened after calling a function or performing an action.
People want to see, hear, or feel the results of their actions. It helps them tune their own mental models of what the library can and will do for them. Feedback makes the experience much more enjoyable. Programmers can more easily enter flow. And when people are in flow, they enjoy themselves. Then they tell their friends.
But don't forget about the other kind of user. The program that uses your library also needs to have information about the library's state. Programs using your library can be much more compact and intuitively written when they don't need to keep track of your library's state. Keeping track of the library's state is just as complex as rewriting the whole thing. The program needs feedback to know if something went wrong, and perhaps even how to fix it. Which brings us to . . .
Both the programmer and the user code will make mistakes. That's normal. Nothing is 100%. What your library needs to do is to let the user quickly and easily understand the state of the system. That way, the user can learn from the mistake and fix the problem.
You can also provide complementary functions. So if you have an add function, implement a remove just in case it's not what the user wants. That opens the door to experimentation because your mistakes aren't permanent. Exploratory programming!
Mistakes are great opportunities to learn how to use the library better. Make mistakes easily undoable and increase the feedback!
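A minimal sketch of what feedback plus an undo operation might look like. The registry and the names ADD-ENTRY, REMOVE-ENTRY, and ENTRY-NAMES are all hypothetical, made up for this illustration:

```lisp
;; Hypothetical registry, used only to illustrate feedback and undo.
(defvar *registry* (make-hash-table :test #'equal)
  "Maps names to values.")

(defun add-entry (name value)
  "Store VALUE under NAME. Returns VALUE and, as a second value,
T if an older entry was replaced, NIL otherwise."
  (let ((replaced (and (nth-value 1 (gethash name *registry*)) t)))
    (setf (gethash name *registry*) value)
    (values value replaced)))

(defun remove-entry (name)
  "Undo ADD-ENTRY. Returns true if NAME was present, false otherwise."
  (remhash name *registry*))

(defun entry-names ()
  "Return the stored names, sorted, so callers never have to track state."
  (sort (loop for name being the hash-keys of *registry* collect name)
        #'string<))
```

Every call tells the caller what actually happened, and every add has a matching remove, so a mistake at the REPL costs one more function call.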
Give the user as few options as possible. Make it as hard as possible not to get a working program, even if it's not exactly what they would want in the end. You want your library to work with defaults and no options. When that's not possible, crash gracefully with a decent error message---one that explains what the user should have done.
Your code should not require me, the programmer, to read pages and pages of documentation to understand every option before I have even decided whether I want to use the system or not. This is actually a crucial moment that most designers don't think about: when the programmer is deciding which library to use. Usually, the programmer picks the one that does something with little or no effort.
I'm a big fan of writing very clear, mathematically modeled software. That's great for maintainability and checking the correctness. But it is unreasonable to assume that your users are going to read a few papers on the subject before diving into your Support Vector Regression library. They want it to work out of the box with the ideas they already have. It is important to support users that know what gamma and epsilon mean so they can tweak it to their liking. But what about the other guys? Why make them set parameters that they don't even understand?
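In Common Lisp, keyword arguments with defaults are one way to get this. The sketch below is hypothetical: TRAIN-SVR is not a real library function, the default values are placeholders rather than recommendations, and it returns a description of the run instead of doing any training.

```lisp
(defun train-svr (samples &key (gamma 0.1) (epsilon 0.01))
  "Hypothetical entry point. SAMPLES is a list of (x . y) pairs.
GAMMA and EPSILON are optional; the defaults here are placeholders."
  (when (null samples)
    (error "TRAIN-SVR needs at least one sample. ~
            Pass a non-empty list of (x . y) pairs."))
  (unless (and (realp gamma) (plusp gamma))
    (error "GAMMA must be a positive real number, but got ~S. ~
            Omit :GAMMA to use the default." gamma))
  ;; Stand-in for the real work: report what would be used.
  (list :sample-count (length samples) :gamma gamma :epsilon epsilon))
```

A newcomer calls (train-svr data) and gets something that works. The expert writes (train-svr data :gamma 2.0) and tweaks away. Both error messages say what to do next.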

Consider building your library as a series of layers. Your bottom layer defines the low-level operations. Your upper layers define operations in terms of the lower layers. Each layer adds functionality and abstraction.
When a new programmer wants to use the library, point them at one of the upper layers. They'll be able to start quickly and they will appreciate the library earlier. As they become more proficient, their curiosity can lead them down the layers. And as they approach the bottom layer, they are more likely to understand the inner workings of the code---and thus more likely to contribute to your project.
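Here is a tiny sketch of the layering idea. The three functions are hypothetical and only meant to show the shape: each layer is written purely in terms of the one below it.

```lisp
;; Bottom layer: one primitive operation on raw strings.
(defun split-on (char string)
  "Split STRING at every CHAR and return the pieces as a list of strings."
  (loop with start = 0
        for pos = (position char string :start start)
        collect (subseq string start pos)
        while pos
        do (setf start (1+ pos))))

;; Middle layer: one record, built on the primitive.
(defun parse-record (line)
  "Split LINE on commas and trim spaces around each field."
  (mapcar (lambda (field) (string-trim " " field))
          (split-on #\, line)))

;; Top layer: the function a newcomer is pointed at first.
(defun parse-table (text)
  "Parse TEXT, one comma-separated record per line, into a list of lists."
  (mapcar #'parse-record (split-on #\Newline text)))
```

Someone who starts with PARSE-TABLE and later needs a different separator can follow the definitions down to SPLIT-ON without leaving the library.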
The user should never be surprised. Functions should do what they seem like they should do. Name your functions well. Name the parameters well.
Also, provide as much information as is needed to the user. Make programmatic access to important data easy.
I don't know if I can stress this enough. Naming is very important. Firstly, it lets programmers learn what they need faster. Well-named functions are easier to find by guessing, and their behavior is easier to understand. Secondly, there are fewer bugs, because good names tell the user what to expect.
Really, I know I can't stress this enough. Without an interface that makes sense from the very beginning, the user will be frustrated. Frustration prevents flow state. No flow state means no love. You don't get love, and neither does your code. Name everything well.
So that's my take on what I wrote. Some of it's new, some of it's old. And there's a lot left out. If you want, you should go take a look at a more didactic view of the same stuff. Happy hacking!
PS Watch this awesome video: How To Design A Good API and Why it Matters
It seems to me that if you wanted to really put an end to the old "C is faster than X" (where X is your favorite High Level Language) snickers and arguments you'd start taking advantage of the vast amounts of information contained in a High Level Language program as opposed to a Low Level Language program.
What do I mean? Well, because C programs are so low-level---they're all about byte manipulation, basically---the compiler doesn't really know anything more than byte operations. There is no high-level semantics. The constraints and meanings embedded in the semantics of a high-level language allow for lots and lots of optimizations.
Let's look at some successes.
By far, I think the one that comes to mind the most easily is Haskell. Part of the semantics defined by the language is that there are no visible side-effects (at least in a large portion of the code). This means that the compiler can optimize out lots of unnecessary computation. How does it do it? No side effects means that you can do lazy evaluation. Basically, that means don't calculate anything you don't need to, and if you do calculate it, you might as well remember it so you won't have to calculate it ever again.
So, automatically, all of this code you wrote that would, using a naive version, create billions of conses now only creates a few hundred. Automatic optimization. You can't do that in C because there's no way to know what the program is trying to do---because anything is allowed. And all of that information is used at run-time. Haskell does some other cool optimizations, too. I just don't know enough about them.
Here's another example: tail-call removal in Scheme. Most Common Lisp implementations do it, too. But in Scheme it's part of the standard. What does it mean? When you write a recursive function so that the recursive call is the very last thing it does (a tail call), the compiler automatically turns it into a loop that computes the same value without growing the stack.
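To make that concrete, here is a small sketch with two hypothetical functions: a tail-recursive sum and the loop that a tail-call-eliminating compiler would effectively turn it into. Whether a given Common Lisp implementation performs the conversion depends on the implementation and its optimization settings.

```lisp
(defun sum-list (numbers &optional (total 0))
  "Tail-recursive: the recursive call is the last thing the function does,
so nothing is left to do in this frame once the call returns."
  (if (null numbers)
      total
      (sum-list (rest numbers) (+ total (first numbers)))))

(defun sum-list-loop (numbers)
  "The same computation written as the loop the compiler can derive."
  (let ((total 0))
    (loop until (null numbers)
          do (setf total (+ total (first numbers))
                   numbers (rest numbers)))
    total))
```

The arguments of the recursive call become the assignments in the loop body. That is all the transformation does.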
Steve Yegge has a whole talk on how run-time information trumps compile-time information. He says there are techniques now that can let a VM adapt to dynamic typing information so that it approaches the speed of static typing. But only if you have some kind of runtime environment.
These are all great. But they're not enough. So many optimizations are done at a low level. But they only gain so much. Here's a question: would you rather optimize a function so that it takes 10 instructions instead of 15 OR optimize the whole algorithm to call the function half the number of times? High-level, algorithmic optimizations trump low-level optimizations.
Here's an optimization I wish my Lisp implementation did (if it does, let me know; I'll jump for joy): convert (= 10 (length list)) to (length= 10 list).
Huh?
This is what I mean: it should rearrange the AST at compile-time to semantically equivalent programs that are more efficient.
If I write (length list), it will walk down the list and add 1 for each cons in it. But if I only want to know if there are exactly ten items in the list, the LENGTH function is going to go to the end of the list anyway. Even if there are 10 billion items in the list. That is not smart. What most people say is that you should do it yourself, changing (= 10 (length list)) to (length= 10 list), which stops after roughly ten conses no matter how long the list is.
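LENGTH= is not part of the standard, so here is a minimal sketch of one possible definition for lists. It takes the count first to match the call above. That argument order, and the restriction to lists, are assumptions made for this illustration.

```lisp
(defun length= (n list)
  "Return true if LIST has exactly N elements.
Stops as soon as the answer is known, so it looks at no more than
N+1 conses, even for a huge or circular list."
  (loop for tail = list then (rest tail)
        for remaining = n then (1- remaining)
        do (cond ((null tail) (return (zerop remaining)))   ; list ran out
                 ((zerop remaining) (return nil)))))        ; list too long
```

Because it gives up as soon as the list turns out to be too long, it even terminates on a circular list, where LENGTH does not.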
But what I say is that the compiler can do it. LENGTH is part of the standard language, and so is =. They should get compiled away. Further, I think there should be an entire framework that lets you define and manage your own optimizations. Since you know the semantics of your own programs, you should be able to define all sorts of great optimizations without having to mess up your clean code.
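As a rough sketch of the idea, here is a purely syntactic rewriter that turns (= N (LENGTH X)) into (LENGTH= N X) when N is a literal integer. The function names are hypothetical, and LENGTH= is assumed to be defined elsewhere as a list-only helper. The rewrite is only safe if X really is a list and = and LENGTH have not been shadowed, and a real compiler pass would have to check both.

```lisp
(defun length-call-p (form)
  "True if FORM looks like (LENGTH <one-argument>)."
  (and (consp form)
       (eq (first form) 'length)
       (consp (rest form))
       (null (cddr form))))

(defun rewrite-length-test (form)
  "Walk FORM and replace (= N (LENGTH X)) or (= (LENGTH X) N),
with N a literal integer, by (LENGTH= N X). Quoted data is left alone."
  (cond ((atom form) form)
        ((eq (first form) 'quote) form)
        ((and (eq (first form) '=)
              (consp (cdr form)) (consp (cddr form)) (null (cdddr form)))
         (destructuring-bind (a b) (rest form)
           (cond ((and (integerp a) (length-call-p b))
                  `(length= ,a ,(rewrite-length-test (second b))))
                 ((and (integerp b) (length-call-p a))
                  `(length= ,b ,(rewrite-length-test (second a))))
                 (t (list '= (rewrite-length-test a) (rewrite-length-test b))))))
        (t (mapcar #'rewrite-length-test form))))
```

It only fires on literal integers so that the rewrite cannot change the order in which argument expressions are evaluated. Standard Common Lisp does offer DEFINE-COMPILER-MACRO for attaching rewrites like this to your own functions, but a conforming program may not define one for a standard function such as =, which is part of why a more general framework would be nice.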
Oh, and why is tail-call removal the only recursion-to-iteration technique in common use? There are lots more! Read the paper called "A system which automatically improves programs". I love old papers! So full of untapped goodness.
That paper is so chock full of good stuff that I'll reserve giving it a thorough review here. However, this quote will explain nearly everything:
A programmer is able to present his algorithms to the system in a clear and abstract language. The system converts them to efficient but probably not transparent versions.
For example, here are two versions of one program which reverses lists.
[note: I have converted the pseudo-code in the paper to Common Lisp]
(defun reverse (x)
  (if (null x)
      nil
      (append (reverse (rest x)) (list (first x)))))

(defun reverse (x)
  (let ((result nil)
        (temp nil))
    (loop while (not (null x))
          do (setf temp (rest x))
             (setf (rest x) result)
             (setf result x)
             (setf x temp))
    result))
One is clear and abstract, the other more tortuous but efficient. Given the first as a definition, a competent programmer should be able to produce the second. Our system can do this for him.
The system is built around the concept of abstract programming, and we hope to encourage a user to formulate his algorithms in abstract terms appropriate to his problem domain and leave the system the task of implementing them efficiently.
The second version in the quote above is destructive: it modifies the list. But the paper shows that the destructive step is optional. The Lisp implementation they wrote makes that an interactive step. But the conversion of the recursion to iteration is just as safe as tail-call elimination---and clearly defined in the paper. Why aren't we doing this?
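For comparison, here is a hand-written sketch of an iterative reverse that does not modify its argument. It is not the output of the paper's system, and the name ITERATIVE-REVERSE is made up for this example.

```lisp
(defun iterative-reverse (list)
  "Reverse LIST with a loop. Conses a fresh list and leaves LIST untouched."
  (let ((result nil))
    (dolist (item list result)
      (push item result))))
```

It does the same number of steps as the destructive loop, but pays for one new cons per element instead of reusing the old ones.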
I would love to see a system where you can define your own optimizations as transformations of the AST. Transformations that are applied automatically or optionally turned on or off if you like. Simple pattern matching would do. The paper I cited is not a simple pattern matching operation. It's more complex---but definitely doable.
While the C coders are busy digging through their optimized spaghetti code, the Common Lisp compiler is intelligently rewriting your highly abstract program for you. I love it!
I've been working on a project on the side. I'm implementing a refactoring framework for Common Lisp. Eventually I'll get it hooked into Slime.
It's based on a paper I read recently: A Formal Pattern Language for Refactoring of Lisp Programs. You should read it, too.
But here's the book report version: It defines a pattern language for describing transformations between different s-expressions. You can then define patterns that have semantic equivalence. The system will then let you transform one form into another.
Example time!
Say you wanted an easy way to transform between a COND and an IF form.
(defequiv (cond (?test ?conseq) (t ?alt)) (if ?test ?conseq ?alt))
That defines the two forms as equivalent. Meaning you can freely refactor in both directions. It lets you convert
(cond ((evenp x) x) (t (* 2 x)))
into
(if (evenp x) x (* 2 x))
and vice-versa.
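To show the kind of machinery this needs, here is a minimal sketch of a matcher for simple patterns like the one above. It is not the actual implementation and not the paper's full pattern language: the function names are hypothetical, and it only handles variables that stand for a single form (no ?+ segments).

```lisp
(defun pattern-variable-p (x)
  "True for symbols whose name starts with ?, such as ?TEST."
  (and (symbolp x)
       (plusp (length (symbol-name x)))
       (char= (char (symbol-name x) 0) #\?)))

(defun match-pattern (pattern form &optional bindings)
  "Match FORM against PATTERN. Returns two values: an alist of
variable bindings, and T on success or NIL on failure."
  (cond ((pattern-variable-p pattern)
         (let ((known (assoc pattern bindings)))
           (cond ((null known) (values (acons pattern form bindings) t))
                 ((equal (cdr known) form) (values bindings t))
                 (t (values nil nil)))))
        ((and (consp pattern) (consp form))
         (multiple-value-bind (new-bindings ok)
             (match-pattern (car pattern) (car form) bindings)
           (if ok
               (match-pattern (cdr pattern) (cdr form) new-bindings)
               (values nil nil))))
        ((equal pattern form) (values bindings t))
        (t (values nil nil))))

(defun fill-template (template bindings)
  "Replace every pattern variable in TEMPLATE with its bound form."
  (cond ((pattern-variable-p template) (cdr (assoc template bindings)))
        ((consp template)
         (cons (fill-template (car template) bindings)
               (fill-template (cdr template) bindings)))
        (t template)))

(defun apply-equivalence (from to form)
  "If FORM matches FROM, return it rewritten in the shape of TO.
Otherwise return FORM unchanged."
  (multiple-value-bind (bindings ok) (match-pattern from form)
    (if ok (fill-template to bindings) form)))
```

Since an equivalence is just a pair of patterns, going the other way is the same call with FROM and TO swapped.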
You can also define
(defequiv (cond (?test (?+ ?conseq)) (t (?+ ?alt))) (if ?test (progn ?conseq) (progn ?alt)))
to take care of the pesky progn case.
So that part is done. The paper also defines some transformation rules that can only go one way. Those are more powerful and allow for maps over structured lists. Those shouldn't be too hard to implement on top of what I've already got.
I've got a simple integration into Slime. It's not what I want it to be in the end. Right now, you position point over an expression, you hit a command in Emacs, which asks you for the name of the refactoring you want to do. It then tries to perform it. Don't type in the wrong name or try one that won't work. It'll just delete your code.
What I really want is for you to position point over an expression, hit a command, then you are presented with a buffer containing a list of all possible transformations. You can scroll through it and select one. It will then replace your code.
I've never done any Emacs hacking, so that's way beyond me. If anyone wants to volunteer to help me, I'd appreciate it. Maybe it could go in the Slime contrib section.
I use Eclipse a lot at work. I love its refactoring features in Java. Especially rename. They've done a lot of work to make that work really smoothly. For those of you that have never used it, Eclipse's rename lets you position the cursor over an identifier, hit Alt-R, and edit the name which is surrounded by a box for a little visual cue. You can see all of the places in the code where that name is referenced changing as you type. If you want to keep your changes, hit Enter. What's most magical about it is that the refactoring is scope-aware. It needs to be, but the fact that it works is great. If you change the name of a public method, Eclipse goes through all of the classes that call that method and changes the name.
Remember: a refactoring operation needs to be atomic and not alter the semantics of the program. So Eclipse's rename works perfectly for that.
But there are tons of refactorings that are not at such a high level. One, for instance, is the cond<->if refactoring I defined above. It doesn't require global knowledge like renaming does. It's a local change. And Eclipse doesn't have anything like that.
Maybe it's because the Java syntax is a little more complex than Lisp. Lisp's AST is simple. An expression is either an atom or a list of expressions. Pretty much that's it. I don't even want to get into Java's AST.
At least not much. Let's take, for instance, the semantics-preserving refactoring of changing a nested if statement to a single if with a compound test.
if(x>10){
    if(y>100){
        doStuff(x, y);
    }
}
transforms into
if(x>10 && y>100){
    doStuff(x, y);
}
That looks simple enough. But I would bet that the number of corner cases when performing this AST transformation would make it so difficult that no one would write it.
Of course, in Lisp, with the system described in the paper above, a simple equivalence rule will suffice. Note that you'd probably define the above conditionals as WHEN's because they don't have an else.
(defequiv (when ?test1 (when ?test2 (?+ ?conseq))) (when (and ?test1 ?test2) (?+ ?conseq)))
A large library of these transformations could be built (by its users as they need them) and shared. Everybody would benefit.
What would it take to do the high-level refactorings that Eclipse does? You would need a code-walker that understood a subset of the semantics of Common Lisp. To perform a global rename, the code-walker would need to determine the scope of the variable or function name and figure out what other symbols refer to the same variable within that scope. This is where syntax-heavy languages win: a lot of that information is embedded in the tree. In Lisp, you've got the global scope, scopes created by LET's and DEFUN's and DEFMACRO's. But let's not forget also that macros can hide a LET. It's complicated and requires a very smart code-walker. But not to worry: I'm sure some enterprising young lisper will figure it out.
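Just to show where the difficulty starts, here is a deliberately naive sketch of a scope-aware rename. RENAME-VARIABLE is a hypothetical name. It understands only QUOTE and LET, treats every other list as a function call, does not expand macros, and does not check whether the new name is already in use. It falls well short of the very smart code-walker described above.

```lisp
(defun rename-variable (form old new)
  "Rename free occurrences of the variable OLD to NEW in FORM.
Occurrences shadowed by a LET that rebinds OLD are left alone."
  (cond ((eq form old) new)
        ((atom form) form)
        ((eq (first form) 'quote) form)          ; data, not code
        ((eq (first form) 'let)
         (destructuring-bind (bindings &rest body) (rest form)
           (let ((new-bindings
                   ;; Init forms are evaluated outside the new scope.
                   (mapcar (lambda (binding)
                             (if (consp binding)
                                 (list (first binding)
                                       (rename-variable (second binding) old new))
                                 binding))
                           bindings))
                 (shadowed
                   (member old bindings
                           :key (lambda (b) (if (consp b) (first b) b)))))
             `(let ,new-bindings
                ,@(if shadowed
                      body
                      (mapcar (lambda (f) (rename-variable f old new)) body))))))
        ;; Function call: keep the operator, rename inside the arguments.
        (t (cons (first form)
                 (mapcar (lambda (f) (rename-variable f old new))
                         (rest form))))))
```

Even this toy has to know that a LET's init forms live in the outer scope while its body lives in the inner one. Multiply that by every special operator and every macro and you can see why the walker has to be smart.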
Before I go, I'll extend another invitation to help me with the Slime front-end to the refactoring engine. Thanks.
I've been disturbed recently by comments I've read saying that if people can't learn to install an implementation of Common Lisp and get some packages working, they probably don't have what it takes to learn the language. There's also complaining about people complaining.
This is snobbery pure and simple.
And it's disturbing. And it's endemic in the Common Lisp community. Yeah, I know that it's not everybody's job to sit around helping people learn something eminently worth learning. Oh, wait. Maybe it is. Maybe it is your duty to use what we've learned to help others. To spread knowledge. To frickin' help people.
Teaching requires patience. So does learning. Sometimes people say something hastily when they're frustrated. Sometimes people get defensive of something they care about and react badly. It's important to look past the seemingly negative remarks. It's so easy to be misunderstood online.
And what do you get out of teaching? You get more knowledgeable people using your language. You get more libraries. More minds developing the implementation. More press. More nice people to talk to. That's what you get!
The problem with being elitist about the language is that it only attracts more elitists. If you sell the language to snobs, snobs are going to flock to it. If you flaunt your snobbery, other snobs will smell it and drop by. And they might not leave.
We have a lot of people in the Lisp community trying to help each other. That's the basis of a community. People are developing libraries, implementations, and documentation. It's a lot of work---hard work---and they're doing a good job. It's a relatively small community with lots to do. It's only natural that not all of it is going to get done.
But shunning people because they vent frustration (which everyone has experienced) goes against the message I see people communicating all the time through helpful advice, thorough tutorials, and useful libraries. Lisp has always been a language of easy starts (bootstrapped), easy development (incremental dev in the REPL), easy learning (doc strings), easy change (dynamic recompilation), and easy debugging. Unless all of those easies are unimportant, why not add easy download, easy install, and easy going?