I've been working on a project on the side. I'm implementing a refactoring framework for Common Lisp. Eventually I'll get it hooked into Slime.
It's based on a paper I read recently: A Formal Pattern Language for Refactoring of Lisp Programs. You should read it, too.
But here's the book report version: It defines a pattern language for describing transformations between different s-expressions. You can then define patterns that have semantic equivalence. The system will then let you transform one form into another.
Example time!
Say you wanted an easy way to transform between a COND and an IF form.
(defequiv (cond (?test ?conseq) (t ?alt)) (if ?test ?conseq ?alt))
That defines the two forms as equivalent. Meaning you can freely refactor in both directions. It lets you convert
(cond ((evenp x) x) (t (* 2 x)))
into
(if (evenp x) x (* 2 x))
and vice-versa.
You can also define
(defequiv (cond (?test (?+ ?conseq)) (t (?+ ?alt))) (if ?test (progn ?conseq) (progn ?alt)))
to take care of the pesky progn case.
So that part is done. The paper also defines some transformation rules that can only go one way. Those are more powerful and allow for maps over structured lists. Those shouldn't be too hard to implement on top of what I've already got.
I've got a simple integration into Slime. It's not what I want it to be in the end. Right now, you position point over an expression, you hit a command in Emacs, which asks you for the name of the refactoring you want to do. It then tries to perform it. Don't type in the wrong name or try one that won't work. It'll just delete your code.
What I really want is for you to position point over an expression, hit a command, then you are presented with a buffer containing a list of all possible transformations. You can scroll through it and select one. It will then replace your code.
I've never done any Emacs hacking, so that's way beyond me. If anyone wants to volunteer to help me, I'd appreciate it. Maybe it could go in the Slime contrib section.
I use Eclipse a lot at work. I love its refactoring features in Java. Especially rename. They've done a lot of work to make that work really smoothly. For those of you that have never used it, Eclipse's rename lets you position the cursor over an identifier, hit Alt-R, and edit the name which is surrounded by a box for a little visual cue. You can see all of the places in the code where that name is referenced changing as you type. If you want to keep your changes, hit Enter. What's most magical about it is that the refactoring is scope-aware. It needs to be, but the fact that it works is great. If you change the name of a public method, Eclipse goes through all of the classes that call that method and changes the name.
Remember: a refactoring operation needs to be atomic and not alter the semantics of the program. So Eclipse's rename works perfectly for that.
But there are tons of refactorings that are not at such a high level. One, for instance, is the cond<->if refactoring I defined above. It doesn't require global knowledge like renaming does. It's a local change. And Eclipse doesn't have anything like that.
Maybe it's because the Java syntax is a little more complex than Lisp. Lisp's AST is simple. An expression is either an atom or a list of expressions. Pretty much that's it. I don't even want to get into Java's AST.
At least not much. Let's take, for instance, the semantics preserving refactoring of changing a nested if statement to a single if with a compound test.
if(x>10){
if(y>100){
doStuff(x, y);
}
}transforms into
if(x>10 && y>100){
doStuff(x, y);
}That looks simple enough. But I would bet that the number of corner cases when performing this AST transformation would make it so difficult that no one would write it.
Of course, in Lisp, with the system described in the paper above, a simple equivalence rule will suffice. Note that you'd probably define the above conditionals as WHEN's because they don't have an else.
(defequiv (when ?test1 (when ?test2 (?+ ?conseq))) (when (and ?test1 ?test2) (?+ ?conseq)))
A large library of these transformations could be built (by its users as they need them) and shared. Everybody would benefit.
What would it take to do the high-level refactorings that Eclipse does? You would need a code-walker that understood a subset of the semantics of Common Lisp. To perform a global rename, the code-walker would need to determine the scope of the variable or function name and figure out what other symbols refer to the same variable within that scope. This is where syntax-heavy languages win: a lot of that information is embedded in the tree. In Lisp, you've got the global scope, scopes created by LET's and DEFUN's and DEFMACRO's. But let's not forget also that macros can hide a LET. It's complicated and requires a very smart code-walker. But not to worry: I'm sure some enterprising young lisper will figure it out.
Before I go, I'll extend out another invitation to help me with the Slime front-end to the refactoring engine. Thanks.
Comments
A large library of these
A large library of these transformations could be built (by its users as they need them) and shared. Everybody would benefit.
This strikes me as particularly awesome. Say you could define transformations form "Lame other programming idioms" to "Lispy idioms" then your program could have all the right idioms for free.... it would be a great learning tool.
IntelliJ IDEA
the semantics preserving refactoring of changing a nested if statement to a single if with a compound test
IntelliJ IDEA has this and many other transformations. It calls them "intentions". Here is a list of the ones available: http://www.jetbrains.com/idea/documentation/intentions.jsp
They are not very hard to write, although the code is a lot more verbose than your LISP examples. "High-level" refactorings are quite a bit more complicated to implement.
Redshank
I started redshank with similar aims. It can do some refactoring, even without an elaborate infrastructure (see for example redshank-extract-to-defun. However, this is only an approximation, and a 100% solution would likely require to integrate IDE and CL compiler more tightly.
The few patterns supported by redshank are currently hand-coded, as they work on a text representation, and there are aspects like minimal code re-indentation to keep in mind. I came across Leitao's paper some time ago, but I think first one needs a (non-abstract!) syntax tree representation of a buffer, a la Eclipse.
Indentation
I'm having indentation issues now, too.
The refactorings work fine, but they're all capitalized (printing symbols) and on one line. You then have to go through and reindent.
Maybe it can be done in a whitespace- (and capitalization-) preserving way. It would be nice, also, to make it completely in Emacs. But as I said before, I've never hacked Emacs. I worry that textual refactorings might have lots of corner cases.
One of the benefits of doing it in the Lisp image is that you get package management for free.
But if you did do it textually, ie, as in structured text editing, you could preserve whitespace and capitalization. Your matcher would operate not on s-expressions, but on Lisp code (in text format). Parentheses would be matched, when needed. I'm sure it's possible to do a regular-expression-like language that can recognize s-expr syntax. I'm just not aware of how to do it right off the bat.
Indentation?
Hmm, works for me, both indentation and lower-case symbols. Can you provide a test case and your setup?
I worry that textual refactorings might have lots of corner cases.
Yes, that's why I am not very fond of them. However, I also want to respect what the user typed. Paredit goes to some lengths to preserve as much of what the user typed as possible, and I think it's a good strategy in the presence of partial input. In fact, that's what makes it so pleasant to use, I believe.
I'm sure it's possible to do a regular-expression-like language that can recognize s-expr syntax. I'm just not aware of how to do it right off the bat.
Ugh, that does not sound right. The Eclipse approach seems better to me: incremental parsing, preserving comments and other whitespace (hence my mentioning of syntax tree earlier), inserting error nodes when something cannot be parsed to synchronize the parser. The matcher then works on the "DOM", with the matching ignoring whitespace and the transformations doing something intelligent with it.
Indentation problems
The indentation problems are in my own system---not redshank. By the way, I like redshank.
I think it would work well to have a pattern-aware parser. It would try to build the tree expressed in the antecedent of the transformation rule. But the tree would be based on textual elements, not symbols, conses, and literals.
But I'll defer to you since you've written more refactoring code than I have.
Refactoring and lisp movie collaboration.
Hello sir,
I enjoy your blog and am interested in working on refactoring tools for emacs and in the generation of lisp videos. Would you be willing to engage in a dialogue of potentialities?
Thank you much,
Will
Dialogue!
Yes, of course. However, I have put the refactoring framework way on the back burner for now. I'll let you know anything you want to know about the movies.
Eric
Post new comment