Hindley-Milner in Clojure

Typing Comic from Cartesian Closed Comics

All sarcasm aside, the above diagram has a kernel of truth. The important thing to note is that the intersection between "Proponents of dynamic typing" and "People familiar with type theory" is very small.

In an effort to increase the size of that intersection, I decided to familiarize myself with a little more type theory. I developed an implementation of Hindley-Milner which operates on a simple Lisp-like λ-calculus with let polymorphism. Everything you need for Hindley-Milner Algorithm W.

Background

Hindley-Milner is a type system that is used by ML and Haskell. Algorithm W is a fast algorithm for inferencing Hindley-Milner which is syntax-directed (meaning it is based on the type of expression) and recursively defined. In this way, it is similar to Lisp's eval.

Implementation

I based my implementation on a paper called Algorithm W Step by Step which implemented it in Haskell. My implementation started very similar to that implementation and diverged as I refactored it into a more Clojure-esque style.

This implementation uses no in-place mutation, instead using a "substitution style" which is slightly harder to read. It is, however, easier to write and prove correct.

In addition to the type inferencer, I wrote an interpreter for the same language, just because. My original intent was to expose (to myself) the similarities between syntax-driven type inference and eval. There might be some, but the full clarity I desire is yet many refactorings away. Note that the interpreter and type inferencer are completely independent, except that both apply to the same set of expressions.

I added a couple of minor luxuries to the language having to do with currying. Writing fully parentheisized function applications for curried functions is a pain, as is writing a hand-curried function with multiple arguments. I added two syntax transformations which transform them into a more Lispy style. For example:

(fn [a b c d e f] 1) => (fn a (fn b (fn c (fn d (fn e (fn f 1))))))

and

(+ 1 2) => ((+ 1) 2)

I'm pretty sure the syntactic transformation is completely safe. All of my tests still type check.

The final luxury is that it is a lazily-evaluated language. That's not strictly necessary, but it is strictly cool. It builds up thunks (Clojure delays and promises) and a trampoline is used to get the values out. This lets me define if as a function. The only special forms are let and fn.

Where to find it

You can find the code in the ericnormand/hindley-milner Github repo. I don't promise that it has no bugs. But it does have a small and growing test suite. Pull requests are welcome and I will continue to expand and refactor it.

What I learned

Type unification is why the error messages of most type inferencers are so bad. Unification by default only has local knowledge and is commutative (unify a b == unify b a). No preference is given to either argument. A lot of work must have gone into making the error messages of Haskell as good as they are.

Let polymorphism is useful and I'm glad that Haskell has it.

Hindley-Milner is powerful, but it does not by itself work magic on a languageq. A language still requires a lot of good design and a well-chosen set of types.

Your turn

I think you should implement Hindley-Milner in the language of your choice for a small toy λ-calculus. There is a lot to learn from it, even if you never program using a Hindley-Milner language. At the very least, you'll know what the fuss is about.

If you think it would be helpful, have a look at my implementation. Like I said, pull requests are welcome.

For more inspiration, history, interviews, and trends of interest to Clojure programmers, get the free Clojure Gazette.

Learn More

Clojure pulls in ideas from many different languages and paradigms, and also from the broader world, including music and philosophy. The Clojure Gazette shares that vision and weaves a rich tapestry of ideas from the daily flow of library releases to the deep historical roots of computer science.

You might also like

React: Another Level of Indirection

Any problem in computer science can be solved with another level of indirection. -- David Wheeler

The React library from Facebook makes DOM programming functional by using a Virtual DOM. The Virtual DOM is an indirection mechanism that solves the difficult problem of DOM programming: how to deal with incremental changes to a stateful tree structure. By abstracting away the statefulness, the Virtual DOM turns the real DOM into an immediate mode GUI, which is perfect for functional programming. Further, the Virtual DOM provides the last piece of the Web Frontend Puzzle for ClojureScript.

David Nolen has written up a short explanation of how the Virtual DOM works, as well as some amazing performance benchmarking of React in a ClojureScript context. His work is important, so you should read it. I'd like to focus a bit more on the expressivity and why I would use React even if it were not so fast.

One can view MVC frameworks as an attempt to impose some structure on the code that has to interface with the DOM. The bargain they propose is this: wrap the DOM in View objects, so that subtrees of DOM nodes are managed by a View object. Wrap your state in Model objects, which will notify the View objects of changes. You thereby keep a layer of indirection between Models and Views, which inherently need different structure and need to change independently.

The layer of indirection (usually an event or observer system) solves a coupling problem. It decouples your state from the DOM, while leaving you to deal with all of the difficulties of the DOM. You are essentially making a new type of DOM that is better suited to your GUI domain than plain HTML, but is still a PITA. It is little more than a coat of paint. The DOM is still stateful.

React takes a different approach. It provides a level of indirection which solves the actual problem with the DOM--statefulness. The DOM has become a smart canvas. Paint the whole picture again, but only the different parts get wet.

Instead of wrapping the DOM in View objects, you create a Virtual DOM (completely managed by React) which mirrors the real DOM. When the model changes, you generate a new Virtual DOM. The differences are calculated and converted into batch operations to the real DOM. In essence, React is a function which takes two DOMs and generates a list of DOM operations, i.e., it's referentially transparent.

It's easy to imagine how this changes the game. You no longer need an initializer to set up the DOM and observers to modify it. The first Virtual DOM rendering is like the second one, in fact like any other! Your "View", if you want to call it that, is simply a function from state to Virtual DOM nodes. If the state and DOM nodes are immutable, all the better. There's less work to know if it has changed.

This also means that your Views can be composed functionally. Define a component (as a function) and your subcomponents, and build them up functionally. All of the functional abstraction and refactoring that you're used to is available. If you're doing it right, your code should get shorter, easier to read, and more fun to maintain.

My (short) experience rewriting a program to use React converted me. It was the only library I have used that actually made DOM programming fun and functional--and dare I say my code now works!

Which gets me to my last point, which is that React is the final puzzle piece for ClojureScript web frontend development.

  • Problem: Global state management
  • Solution: Atoms and persistent data structures

  • Problem: Client-server communication
  • Solution: EDN (also solved pretty well by JSON)

  • Problem: Callback hell
  • Solution: core.async

  • Problem: Stateful DOM
  • Solution: React

Any other problems left? I can't think of any. That's something to discuss on Twitter.

I suggest you try React. Om by David Nolen is the most mature React ClojureScript library I know of. It does a bit more than I've described here (Om manages your state tree for you, among other things) and is evolving quickly. I have some code that generates React Virtual DOM using hiccup style with macros (so the work is done at compile time), but other than that, contains only half-baked implementations of what's in Om. If you're interested in the hiccup macros, let me know what you'd use it for and I'll put it on github.

For more inspiration, history, interviews, and trends of interest to Clojure programmers, get the free Clojure Gazette.

Learn More

Clojure pulls in ideas from many different languages and paradigms, and also from the broader world, including music and philosophy. The Clojure Gazette shares that vision and weaves a rich tapestry of ideas from the daily flow of library releases to the deep historical roots of computer science.

You might also like

Deconstructing Functional Programming

Because if you think about it, the stack itself is just an optimization. Right? There are these frames which contain information about each invocation. Each stack frame. Each activation record. And that's what they are--they're activation records. They're sort of objects. If you really have objects on the brain, like I do, then you realize that they're all just objects.

And they should be treated uniformly. You can even build a language that works this way--as I have. If that's the case, this is really a garbage collection problem. Right? Your stack traces might go away if you don't need them. You do need them when it's not a tail call because you need to go back there and use that information. But if it's a tail call, you don't need them, you don't need a tail call back to that frame. You need a pointer back to the last frame that wasn't a tail call. And they might get collected.

Which doesn't mean you actually have to implement it that way. Erlang sort of does. And there are many implementations that do that. They are not noted for their speed. If you can have real stacks, what happens when you run out of space? If you don't run out of space, it didn't really matter if you optimized the tail calls or not.

When you run out of space, you should GC the damned stack. You shouldn't just throw up your hands and say you're dead. And for all the normal programs that people write where it didn't matter, they won't care. And for your tail recursive programs, well, they might be a bit slower, but they will work. And then it's a matter of flags to the garbage collector if in production you don't want to debug it if you're sure it's all going to be fine--then go ahead and tell it to don't bother and just slam it and overwrite those frames directly.

Why is this so hard for implementers to do? Optimization is an optimization and should be optional.

It's a cool algorithm, but there's nothing nice that I'm prepared to say about Hindley-Milner.

He nails a lot of things I didn't like about Haskell.

All in all, this talk gets a lot of things right.

You might also like

How to Use New Relic with Clojure on Heroku

Heroku is a great service, especially for the lone developer who wants free hosting for an app.1 The free hosting works just like the paid service except that your server will be "spun down", meaning that after five minutes of no activity, your server is stopped. It will be "spun up" again when there is a request. This spin up process can take a while and certainly does not give a good user experience.

Luckily there is a recommended way of avoiding this delay. The solution is to run New Relic monitoring, which periodically polls your server, avoiding the five minutes of no activity and hence keeping your server running.

In addition to keeping your server from falling asleep, it also gathers lots of profiling information that could help you understand your server as it runs in production. Luckily, both Heroku and New Relic offer free tiers. This brings us to my favorite financial formula:

Free + Free = Free

You can also use this process with the paid versions. Note also that Heroku itself recommends setting this up, so don't feel guilty!

Instructions

I am assuming you are already using Heroku and have a working app hosted there, using an uberjar deploy. You also have the Heroku Toolbelt installed.

1. Add the New Relic add-on to your app.

$ heroku addons:add newrelic:stark

I recommend the stark plan for free apps. You can choose any of the New Relic tiers.

2. Find and download the latest New Relic Java API release.

This download page lists all of the versions. Find the latest one and download the ZIP file.

3. Unzip the ZIP file into the base of your app.

$ cd projects/my-app
$ unzip ~/Downloads/newrelic-java-3.2.0.zip

It should unzip into a newrelic/ directory.

4. Check your .gitignore file for *.jar

New Relic includes its own JAR file which needs to be deployed with your app in the git repo. My .gitignore included a line *.jar which would exclude all JAR files. Remove this line if you see it.

5. Add the .gitignore and the newrelic/ directory to your repo.

$ git add .gitignore
$ git add newrelic
$ git commit -m "Add New Relic monitoring agent."

Make sure the file newrelic/newrelic.jar was added.

6. Release to Heroku.

$ git push heroku master

7. Configure your app to use New Relic.

We need to add a new JVM option. There is an environment variable called JVM_OPTS which is typically used to do this. Find out what value it has now.

$ heroku config

Find the line starting with JVM_OPTS:. Mine says "-Xmx400m". Now we add this to the variable: "-javaagent:newrelic/newrelic.jar".

$ heroku config:set JVM_OPTS="-Xmx400m -javaagent:newrelic/newrelic.jar"

The app should restart with the new options. Visit your Heroku dashboard, find your app, and click on the New Relic addon to see the New Relic Dashboard for your app. The first load might take some time, but subsequent loads will be at full speed and it won't spin down. load!

References:

You might also like


  1. We use (and pay for) Heroku at work and we like the paid service, too!

On Type Unity

It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures. -- Alan Perlis

In his essay titled "Dynamic languages are static languages", Robert Harper states:

the so-called untyped (that is "dynamically typed") languages are, in fact, unityped. Rather than have a variety of types from which to choose, there is but one!

The notion that a language with no types also has one type is absurd, so we must proceed and allow the perspective that a language with dynamic typing actually has a single static type. The argument is very rhetorical, but it does seem to be a valuable perspective1. However, I disagree with the final conclusion:

And this is precisely what is wrong with dynamically typed languages: rather than affording the freedom to ignore types, they instead impose the bondage of restricting attention to a single type! Every single value has to be a value of that type, you have no choice!

I believe that a well-chosen type is freeing.

In Smalltalk, "everything is an object". This statement is one of the first things you might hear about the language--from people who use it and like it. What is that statement but a declaration of having one type? That one type, with its single operation (send message), is what made Smalltalk so powerful.

Smalltalk's power comes from adhering to a small, firm discipline that others also adhere to. Much like the laws of a government might be binding in a certain sense, in another they give rise to the freedom and privilege citizens enjoy in society. The prohibition of murder allows us the freedom to move about without fear of death. The rules governing market transactions ensure a minimum of fairness. Everyone benefits.

In the same way, Smalltalk's message passing presents a very small point of agreement in order to participate in the bounty of the runtime. Powerful tools, services, and metaprogramming facilities were available for the small price of conforming to a very simple type.

Far from being restrictive, a single type is freeing. The rhetoric of "bondage" does not hold, since there is an existential proof of a single type being the source of freedom and power. I wonder if a paucity of perspectives is part of the reason such an obvious flaw can arise in an intelligent essay. There are no sides in the search for better tools. We will want to combine the powerful elements from as many different perspectives as possible.

You might also like


  1. Which I will gladly add to my repertoire, thank you very much!