Annotated map

July 29, 2015

Summary: map is one of the staples of functional programming. It's totally useful and also surprisingly simple. Let's look at some examples and annotated code.

About a week ago I showed some examples of using reduce, a very commonly used function. This time, I'm going to give some examples of map, which is probably even more common.

map is one of those things that's so useful and so straightforward that it finds its way into every language. Javascript has a map in the newer versions, but people couldn't live without it before that, so it also shows up in a lot of Javascript libraries (for example, in Underscore).

Let's imagine you're walking down a list [0 1 2 3 4 5]. Your job is to increment each one. As you pass each number on your right, you pick it up, add one to it, and put it down on your left. Boom, a new list on your left.

(map inc            ;; add 1
     [0 1 2 3 4 5]) ;; to each of these

  => (1 2 3 4 5 6)  ;; it returns a new list with the numbers incremented

Ok, next one. Let's say you're walking down a list . . . of lists. Your job? Write down the sizes of those lists. Let's do it! Walk down the list, pick up each list as you go, and drop the size to your left. You just made a new list!

(map count          ;; get the size
     [[]            ;; of each of these
      [1]
      [1 1]
      [1 1 1]
      [1 1 1 1]
      [1 1 1 1 1]])

  => (0 1 2 3 4 5)  ;; a list of the sizes

Alright, let's get fun with this one. You walk down a list of maps. Your job? Figure out what's under the :a key. Drop the answers on the left. Remember, if the map doesn't have the key, it gives you nil.

(map :a        ;; keywords are functions, too
     [{:a 1}   ;; look at these little maps, just waiting there!
      {:a 2}
      {:a 3}
      {:b 4}])

  => (1 2 3 nil)  ;; look, the last one was nil

Ok, here's a good one. Someone wrote a bunch of sentences, but you want to make them angry. ALL CAPS!! Walk down the list, apply this epic function to each, and make a new list!

(map (fn [x] (str (.toUpperCase x) "!!")) ;; our epic function
     ["I am angry"                        ;; make these angry!!!
      "don't yell at me"
      "stop yelling"])

  => ("I AM ANGRY!!"    ;; LOOK AT THEM!!
      "DON'T YELL AT ME!!"
      "STOP YELLING!!")

Conclusions

Yep, map is useful. It's one of the staples of functional programming. Once you start using it, you'll use it everywhere.
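One bonus, in the same walking-down-a-list spirit (this goes a little beyond the examples above, so consider it a sketch): map can walk down several lists at once, passing one element from each list to the function.

```clojure
;; walk down two lists at once, adding the pairs as you go
(map + [1 2 3] [10 20 30])

;; => (11 22 33)
```

It stops when the shortest list runs out, so you don't have to worry about matching lengths.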

If you liked the code with annotations, the physical metaphors, the focus on the basics, you will love Lispcast Introduction to Clojure. Visuals, metaphors, exercises, annotated code, and lots of code in a repo. You learn the basics of Clojure down to your fingertips, writing code right away.

Learn Functional Programming using Clojure with screencasts, visual aids, and interactive exercises
Learn more

You might also like

Some Annotated clojure.core/reduce Examples

July 21, 2015

Summary: reduce is a very useful function. You can use it for many calculations over a collection. Code annotations are useful, as are physical metaphors.

reduce is used a lot in Clojure. I've heard a lot of people get scared by reduce, like it's something deep and mysterious. It is deep, but it's not mysterious. It's also very loveable for its easy mechanistic application.

I love to watch my toddler doing mechanistic stuff. She picks up a bean and puts it in the cup. She picks up a bean and puts it in the cup. Over. And over. And then I think: that's reduce!

Let's look at a first example. Let's say we want to add up a list of numbers.

(reduce +            ;; let's add
        0            ;; start with zero
        [1 2 3 4 5]) ;; add these numbers to it, one at a time

Imagine holding 0 in your left hand and walking down the list of numbers 1, 2, 3, 4, 5. You approach the number 1. You grab it with your right hand. You have two numbers in your hands. You add them together (+) and hold the answer in your left hand. Now proceed to the next number. Repeat until you're done with the list. At the end, what number are you holding in your left hand? That's the answer.

We can add stuff to a set.

(reduce conj         ;; let's add things to a collection
        #{}          ;; start with an empty set
        [1 2 3 4 5]) ;; add these numbers to it, one at a time

Imagine holding an empty bucket (like the empty set) in your left hand and walking down the list of numbers. When you get to 1, you pick it up in your right hand. Now drop it in the bucket (conj). Proceed to the next number and repeat down the list. What is in your left hand at the end? A bucket with the numbers 1-5. That's the value of this expression.

Alright, we can do something more complicated. This time, let's use an anonymous function. We'll calculate the maximum number from a list.

(reduce (fn [a b]     ;; we can use an anonymous function
          (if (> a b) ;; return the larger of the two
            a
            b))
        0             ;; start with zero
        [1 2 3 4 5])  ;; compare these numbers to it, one at a time

Hold 0 in your left hand and walk down the list. When you get to 1, pick it up. Which is bigger, what's in your left hand or what's in your right hand? Whichever it is, put that in your left hand and proceed to the next number, and so on down the line. The answer is in your left hand.

Hmm. What about averages? We think of averages as the sum of a bunch of numbers divided by the number of numbers. We can keep the sum and the count separate so we can operate on them.

(reduce (fn [[n d] b] ;; n is the numerator, d is the denominator
          [(+ n b)    ;; add each number to the numerator
           (inc d)])  ;; add 1 to the denominator
        [0 0]         ;; start here
        [1 2 3 4 5])  ;; average these numbers, one at a time

Hold [0 0] in your left hand. Walk down the list. When you get to 1, pick it up in your right hand. Now, add what's in your right hand to the first number in your left hand, and add 1 to the second number in your left hand. Hold the two new numbers in your left hand. Proceed down the line, picking each number up in your right hand, until the end. The answer is in your left hand.
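By the way, the reduce above leaves you holding the pair [15 5], not the average itself. There's one last step, which we can sketch like this: divide what's in your left hand.

```clojure
;; run the reduce, then divide the numerator by the denominator
(let [[n d] (reduce (fn [[n d] b]
                      [(+ n b)
                       (inc d)])
                    [0 0]
                    [1 2 3 4 5])]
  (/ n d))

;; => 3
```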

Now, one final time, the general pattern:

(reduce (fn [left right]            ;; a function of 2 arguments;
                                    ;; the arguments correspond to your hands
          (dosomething left right)) ;; the return value of this function will be
                                    ;; the next value in your left hand
        starting-value              ;; what you start with in your left hand
        collection)                 ;; the items you grab with your right hand

Conclusions

Well, that was fun. reduce is not hard. You just need a good way of thinking about it. Physical metaphors can help a ton. I wish I could have made this visual with a cartoon, but that would take more time and skill than I have. And I think the code annotations really help add context to what would otherwise be very terse code. Context is so important.

If you liked the code annotations and this style of teaching the basics, you will love Lispcast Introduction to Clojure. It has lots of visuals, lots of metaphors, exercises, and code annotations. It teaches the basics of Clojure and functional programming in a fun way. Check out a preview.


You might also like

Reduce Complexity with Variants

July 08, 2015

Summary: The structure of our data should match the relevant structures in the real world. And to ensure that our data is structured well, we should reduce the potential for incorrect structure. Variants provide a great solution for it.

Introduction

Jeanine Adkisson's talk at the Conj last year was my favorite talk of the Conj. Her presentation was well-delivered, her slides were just quirky enough, and of course the material was so important. You should go watch the talk before you read on.

The problem

In Clojure we often represent things as maps with some implicit rules. For instance, there might be three different ways to represent an image: an array of pixels in memory, a filename for a jpeg file on disk, or a URL of a jpeg on the web. We might have maps that look like this:

{:type :image/in-memory
 :pixels [...]}

{:type :image/on-disk
 :filename "/cats.jpg"}

{:type :image/web
 :url "http://cats.com/cats.jpg"}

This is all well and good. Except it's rife with incidental complexity. We have an implicit rule that if the :type is :image/in-memory, then the pixels will be at :pixels, but if the :type is :image/on-disk, we are expecting a string containing a filename at :filename, and so on. The rule might seem obvious, but it's actually kind of cumbersome.

If it's a rule, are we going to enforce it? Where? How strictly? How often? Enforcement is great, but what do we do if we find an :image/web without a :url? Decisions, decisions. Plus, it's a hashmap! It looks like you can have multiple things. Some lazy programmer is going to reuse those keys. For instance, there could be a save-to-disk function that takes an image and writes it to disk. It returns the original image with the filename where it was saved attached at :filename. Now you've got :filename and :pixels. It's unclear whether that's what we want, but it's totally allowed by the data structure.
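To make that concrete, here's the kind of ambiguous value I mean. The map below is totally legal data, even though no rule tells us which representation wins (the pixel vector is a stand-in for real pixel data):

```clojure
;; what a save-to-disk like the one described could hand back:
;; both :pixels and :filename present -- which representation is it?
{:type     :image/in-memory
 :pixels   [255 0 255]
 :filename "/cats.jpg"}
```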

Actually, if you analyze it, you can quantify the complexity pretty well. We'll make some simplifying assumptions. Let's just say we have the type key and the three "value" keys. The type key can have four possible values (:image/in-memory, :image/on-disk, :image/web, and nil (or missing)) and the value keys either have a valid value or are missing (or nil). The number of states is 4 * 2 * 2 * 2 = 32. So instead of the three cases we want, we have 32 cases to deal with.

Is there a way to reduce this complexity? Is there a way to nip it in the bud before we've got code enforcing these assumptions all over the place? Do we really need to drown in assertions and nil checks?

A great solution

Jeanine showed us a well-suited solution. Her solution, as I mentioned before, is called "variants". We have three variants of image representation. Instead of representing them in a map, we represent them in a vector, like this:

[:image/in-memory [...]]

[:image/on-disk "/cats.jpg"]

[:image/web "http://cats.com/cats.jpg"]

How does this help?

Well, let's do the math. There are now two positions where data can be (instead of four in the hashmap representation). If we make the same simplifications we made above, we have 4 * 2 = 8. This is cheating slightly because we're only considering vectors of two elements or less. But then again, we cheated above because we never considered adding arbitrary keys to the hashmap.

Okay, so it's 32 vs. 8. But what happens when we add a new kind of image? In the hashmap version, we're adding a new key and a new :type, so the new states become 5 * 2 * 2 * 2 * 2 = 80. Woah! And in the vector version? 5 * 2 = 10. Wow! Much better. The variant solution actually grows in complexity more slowly than the hashmap solution.

But we've gained something less quantifiable: the data is now easier to write, easier to read, and most importantly, easier to get right. It looks a lot like the tuple pattern we're used to in Clojure. The first value of the tuple is a tag telling you what kind of image it is. As Jeanine pointed out, it's the same pattern employed by hiccup to represent HTML. The first element of a hiccup vector tells you what kind of HTML tag it is.

A plot twist

Now, Jeanine's solution works really well for deeply nested structures like ASTs or HTML trees. However, I always get a little scared when I see vectors being overloaded.

It works okay in hiccup because when you're writing hiccup, you are mostly making big chunks of hiccup. You're not passing hiccup around. Hiccup in your code is dense and will typically only be returned (as opposed to passed to another function). But notice the problem with having to use lists within hiccup to represent sequences. Even in hiccup, people get tripped up. But it's still a great solution.

However, once you start shipping these things around as units of data, they get to be a problem. This has happened to me in the past. You start accumulating values, you have a vector of variants, and all of a sudden you're counting nesting levels to know how to interpret each level. You've got a vector of vectors of vectors. It happens anywhere vectors are overloaded.

It happens in Pedestal routes. Here's a sample Pedestal route data structure. Notice it starts with three nesting levels. Each of those vectors has a different meaning. Perhaps putting keywords in the front would help, but I suspect that's not a good solution here.

Pedestal routes from Hello, World! sample

A different solution

I do have a solution to the case of variants that need to get passed around and collected into sequences. I propose that you use a hashmap with a single key, which is the variant's tag. The value for that key is a vector of data.

{:image/in-memory [[...]]} ;; ellipsis represents lots of pixel data

{:image/on-disk ["/dogs.jpg"]}

{:image/web ["http://doggy.com/dogs.jpg"]}

This can still be checked by core.typed and used in core.match. It's natural in Clojure for a hashmap to represent a self-contained value. It's also only slightly harder to type than the vector version, and just as easy to read. I think it's easy, at least, because this is clearly a different pattern from the original hashmap pattern. And it's still easy to get right. I also recommend adding a namespace to the tag to show that they are related.
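And taking one of these apart is still one-liner territory. Here's a sketch: calling first on a single-entry map gives you its [key value] pair, which destructures right into the tag and the data.

```clojure
;; pull the tag and the filename out of the single-key hashmap variant
(let [[tag [filename]] (first {:image/on-disk ["/dogs.jpg"]})]
  [tag filename])

;; => [:image/on-disk "/dogs.jpg"]
```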

The takeaway

When do you use variants?

Whenever you have different data that have the same operations (for instance, all three kinds of images can be displayed to the screen).

When do you use the vector version of variants?

If you have trees like HTML documents or ASTs after parsing.

When do you use the single key hashmap version of variants?

If you are planning on collecting values into a vector, or the nesting is not obvious.

Other ways to reduce complexity

Jeanine went over core.typed and core.match for variants.

By using core.typed, you can encode the exact data structures you want, eliminate nils, and enforce enumerations (like the :types). You can eliminate all of the erroneous cases statically and be left with only the three correct cases (or your code won't compile).

core.match can be used, too: for convenience (comparing and defining locals in one go), for encoding some more complex rules (like extra keys are allowed but ignored), and for collapsing all of the erroneous cases into a single catch-all case. That's still four cases instead of three, but that's way nicer. And core.match works for all of the solutions I've presented (including the original hashmap version).
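Here's a sketch of what that looks like for the vector variants (assuming core.match is on your classpath): three real cases, plus the single catch-all that soaks up every erroneous shape.

```clojure
(require '[clojure.core.match :refer [match]])

(defn describe-image [image]
  (match image
    [:image/in-memory pixels]   (str "in memory: " (count pixels) " pixels")
    [:image/on-disk   filename] (str "on disk: "   filename)
    [:image/web       url]      (str "on the web: " url)
    :else                       "not a valid image")) ;; the one catch-all case
```

So (describe-image [:image/on-disk "/cats.jpg"]) gives "on disk: /cats.jpg", and anything malformed falls through to the catch-all.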

Conclusion

Variants are a great tool that we should keep up front for lots of jobs. Their complexity grows much more slowly than the hashmap-with-:type solution we saw at first. We should all be considerate of how much complexity our choice of data representations is adding to our system. Don't forget the talk. There is also an interview with Jeanine I did before the Conj.

You may like the Clojure Gazette

For more inspiration, history, interviews, and trends of interest to Clojure programmers, get the free Clojure Gazette.

Learn More

Clojure pulls in ideas from many different languages and paradigms, and also from the broader world, including music and philosophy. The Clojure Gazette shares that vision and weaves a rich tapestry of ideas from the daily flow of library releases to the deep historical roots of computer science.

You might also like

Avoid Naming at All Costs

July 05, 2015

Summary: If naming is one of the two hardest things in programming, it follows that every other possible solution (except those few involving cache invalidation) should be attempted before naming something. As a corollary, bad names are a code smell.

Phil Karlton (attributed):

There are only two hard things in Computer Science: cache invalidation and naming things.

Programs used to be written in binary. That is, the only names we had were those the computer understood directly. Over time, we've improved programming languages so that they are better for people to read and write. A lot of that improvement is building in higher-level concepts, such as functions, garbage-collection, etc. But the majority of the improvement comes from the ability to name things.

Naming things helps us organize our ideas about the software[1]. A program has to deal with many levels of abstraction. We write about how data gets represented in the machine, how that relates to domain concepts, and what the user is intending to do. Naming things helps us organize those, just like good headings in an outline help us organize ideas about a topic.

And yet it is one of the hardest problems we solve regularly. There are times when I have looked for a good name for hours, only to find none. A bad name can cost a lot. Someone coming in later could be confused, wasting precious cognitive resources.

Naming is hard because of a fundamental property of abstraction: the name does not have to relate at all to what it is naming. Names are just a string of letters. They're not meaningful to the machine, just to us. Names can lie, and that's a fundamental part of carrying meaning. If you could not lie, you could not convey new truthful information, either. And even truthful names can diverge from the code they name over time.

Naming is hard because it's a different kind of thinking from the rest of programming. We are coding along, in a nice engineering flow, and all of a sudden, we need a nice, human-readable name. We need to find compassion for the reader from within our cold, calculating programmer trance. This is very difficult.

Naming is hard because names need to be at the right abstraction level. Are you doing a low-level trie operation? Or is it a concept from the problem domain? Another choice to make. But it gets worse! Domain experts invent new words all the time. They're called jargon. And they're very useful. Maybe you should invent a name, instead of trying to find a name. Another difficult choice.

When I'm having trouble naming something, there is often an easy change to the code that makes the name unnecessary. If we can avoid having to name something (while also keeping the code readable), we've avoided a very costly and error-prone process. Here are a few alternatives I use a lot:

You'll notice these all play with the means of combination instead of naming. Recombine to avoid naming when naming is hard.

Since there are so many alternatives to naming that are easier than naming, it follows that if there is a bad name in our code, it means there might be a better way to organize it that we overlooked. That makes it a code smell. A little (re)factoring can get rid of that name.


  1. Abelson and Sussman in SICP 1.1:

    A powerful programming language is more than just a means for instructing a computer to perform tasks. The language also serves as a framework within which we organize our ideas about processes. Thus, when we describe a language, we should pay particular attention to the means that the language provides for combining simple ideas to form more complex ideas. Every powerful language has three mechanisms for accomplishing this:

    primitive expressions, which represent the simplest entities the language is concerned with,

    means of combination, by which compound elements are built from simpler ones, and

    means of abstraction, by which compound elements can be named and manipulated as units.

You might also like

Mastering ClojureScript Routing with Secretary and goog.History

June 24, 2015

Summary: The Google Closure Library provides a nice interface to the HTML5 History API. Coupling it with Secretary is very easy. But not all browsers support HTML5 History. In this post I'll talk about one way to make sure you have client-side routing in all browsers.

Background

About a year ago I was working for a company of three people. Two coders and one business person. I was developing a consumer product and the other programmer was building a related B2B product. We were as agile as could be: no planning meetings, no prioritized list of features, just a shared vision. I was working in Clojure and ClojureScript and getting paid to do it.

That job eventually disappeared. But the amount of code I produced and the dark corners of features I explored still surprise me. I discovered (uncovered?) a lot of gems of ClojureScript in that time. This post is about one of them.

Update: Andre Rauh pointed out that I was using a require when I should use an import for goog.history.EventType. I fixed it in the code. Thanks!

Browser History

In a project I did about a year ago, we wanted the speed of a single page application but we wanted the back button to work and we wanted the URL to reflect where the reader was in the app. We turned to the HTML5 History API.

The HTML5 History API is an API for manipulating the browser's history without making a request to the server and loading a new page. The idea is that your Javascript application can keep all of its state in memory, but still change the URLs and keep the back button working. You have to code it up yourself, but it gives you fine-grained control over what exactly the back button does.

Luckily (and not surprisingly), the Google Closure Library has a nice way to access the History API. It's in a class called goog.history.Html5History. That gives you events about when the URL changes. We used that along with Secretary to parse, interpret, and dispatch on the URL.

The code

First, we set up our ns declaration.

(ns history.core
  (:require
   [secretary.core :as secretary :refer-macros [defroute]]
   [goog.events])
  (:import
   [goog.history Html5History EventType]))

We need a function that will get the current path fragment to switch on. We'll just use the path and the query string.

(defn get-token []
  (str js/window.location.pathname js/window.location.search))

Now we define how to instantiate the history object.

(defn make-history []
  (doto (Html5History.)
    (.setPathPrefix (str js/window.location.protocol
                         "//"
                         js/window.location.host))
    (.setUseFragment false)))

Let's make a couple of simple routes. I won't go into how to make routes with Secretary in this post.

(defroute home-page "/" []
  (js/console.log "Homepage!"))

(defroute default-route "*" []
  (js/console.log (str "unknown route: " (get-token))))

Now a handler for what to do when the URL changes.

(defn handle-url-change [e]
  ;; log the event object to console for inspection
  (js/console.log e)
  ;; and let's see the token
  (js/console.log (str "Navigating: " (get-token)))
  ;; we are checking whether this event is due to user action,
  ;; such as clicking a link, pressing the back button, etc.,
  ;; as opposed to programmatically setting the URL with the API
  (when-not (.-isNavigation e)
    ;; in this case, we're setting it
    (js/console.log "Token set programmatically")
    ;; let's scroll to the top to simulate a navigation
    (js/window.scrollTo 0 0))
  ;; dispatch on the token
  (secretary/dispatch! (get-token)))

Now we set up our global history object. We use defonce so we can hot reload the code.

(defonce history (doto (make-history)
                   (goog.events/listen EventType.NAVIGATE
                                       ;; wrap in a fn to allow live reloading
                                       #(handle-url-change %))
                   (.setEnabled true)))

And we will want a function to programmatically change the URL (and add to the history).

(defn nav! [token]
  (.setToken history token))

Incidentally, my links look like this in Om:

(dom/a
  #js {:href "/some/page"
       :onClick #(do
                   (.preventDefault %)
                   (nav! "/some/page"))}
  "some page")

That is, I try to follow the principle of graceful fallback. If Javascript fails for some reason, the href is still valid. It will make a request to the server and fetch the page. But if Javascript is working, we override it.

On the server side, I make sure that the same routes exist and that they return valid pages that include this script. When the page loads, the EventType.NAVIGATE event will fire, and so Secretary will route it. This usually means a repaint, but it's very quick and acceptable.

Add the requires:

   [om.core :as om]
   [om.dom :as dom]

And the Om code to render and get it started:

(defonce state (atom {}))

(defn cmp-link [cursor owner]
  (reify
    om/IRender
    (render [_]
      (dom/a
       #js {:href "/some/link"
            :onClick #(do
                        (.preventDefault %)
                        (nav! "/some/link"))}
       "some link"))))

(om/root cmp-link state
         {:target (. js/document (getElementById "app"))})

When you click the link, you should see a message in the console saying it's navigating to /some/link.

A hitch

I was using this for a while when I got a message about it not working for someone. After a little investigation, it turned out they were using an older version of IE. :( IE <= 9 does not support HTML5 History. In fact, according to caniuse.com, only 88.2% of users have a browser with HTML5 History support. That means that about 12 out of every 100 visitors can't use what we just wrote.

What a lot of people would do at this point is just to use the hash-based history wrangling that 93% of the internet supports. But I wanted to do better without punishing people who upgrade their browsers.

Here's what I did: the server still serves content at URLs as normal. The routes on the client stay the same. But I used feature detection to determine if the browser supports HTML5 History. If it does support it, it runs the code above. If it doesn't, it uses the hash API. Lucky for me, Google Closure has a class called goog.History that is interface-compatible with goog.history.Html5History. So 90% of the work was done.

First, we need to add this import:

  [goog History]

goog.history.Html5History required a tiny little patch to work.

;; Replace this method:
;;  https://closure-library.googlecode.com/git-history/docs/local_closure_goog_history_html5history.js.source.html#line237
(aset js/goog.history.Html5History.prototype "getUrl_"
      (fn [token]
        (this-as this
          (if (.-useFragment_ this)
            (str "#" token)
            (str (.-pathPrefix_ this) token)))))

I was very reluctant to do that, but it was the only solution I found to making it work consistently with the query string. Unfortunately, it was done a year ago and I don't remember the exact reason.

Now we need to modify get-token so it works in both cases. When HTML5 History is not supported, the token is everything after the #, as long as we're on /.

(defn get-token []
  (if (Html5History.isSupported)
    (str js/window.location.pathname js/window.location.search)
    (if (= js/window.location.pathname "/")
      (.substring js/window.location.hash 1)
      (str js/window.location.pathname js/window.location.search))))

make-history is different, too. If we don't support HTML5 History, we check if we're on /. If not, we redirect to / with the token. If we are, we construct an instance of goog.History.

(defn make-history []
  (if (Html5History.isSupported)
    (doto (Html5History.)
      (.setPathPrefix (str js/window.location.protocol
                           "//"
                           js/window.location.host))
      (.setUseFragment false))
    (if (not= "/" js/window.location.pathname)
      (aset js/window "location" (str "/#" (get-token)))
      (History.))))

Everything else is the same! You can even test out what happens without the HTML5 History API by replacing the (Html5History.isSupported) with false in both places in the code above. You'll see it start to use the # fragment when you click the link!

Conclusions

I figured out all of this stuff incrementally by experimentation. I wanted to share this with you because I think it's valuable. The biggest lesson to take away is that the Google Closure Library is very complete and well-built. We should lean on it as much as we can from ClojureScript.

If you're interested in learning some ClojureScript, Om, and how to make Single Page Applications, I have to recommend my LispCast Single Page Applications with ClojureScript and Om course. It's interactive with lots of animations, exercises, screencasts, and code. It's designed to get you up and running with a smooth dev process all the way through deploying code to production. It won't teach you everything about ClojureScript and Om, but it will get you over lots of the major hurdles we all encounter.

You might also like

How I made my Clojure database tests 5x faster

June 17, 2015

Summary: Setting up and tearing down a test database can be slow. Use a rolled back transaction to quickly reset the database to a known state. You can do that in an :each fixture to run each test in isolation.

On one of my projects, I wrote a bunch of tests that had to hit the database. There was a :once fixture to create all of the tables anew and an :each fixture to delete everything in the tables before each test. That ensured that I was always working with a known empty database. Overall, the tests took about 10 seconds. Woah! That's a long time. But I lived with it.

(defn clear
  "Delete all rows before and after, just for good measure."
  [test]
  (cleardb db) ;; delete all rows from all tables
  (try
    (test)
    (finally
      (cleardb db))))

(defn setupdb [tests]
  (initdb db) ;; create the tables
  (try
    (tests)
    (finally
      (teardown db)))) ;; drop the tables

(use-fixtures :each clear)
(use-fixtures :once setupdb)

Then I remembered a technique someone once mentioned where you use a transaction that you roll back instead of starting with a fresh db each time. It's supposed to be a lot faster.

After a little experimentation, I came up with this:

(defn clear [test]
  (sql/with-db-transaction [db db]
    (sql/db-set-rollback-only! db)
    (binding [db db] ;; rebind dynamic var db, used in tests
      (test))))

We open a transaction, immediately set it to rollback (which it will do when the transaction closes). Then we have to rebind our dynamic db var, which holds the current connection. And inside of that we run the test. Inside of the transaction, anything you write to the database will be available to read. When the test ends, the transaction closes and it rolls back all of the changes, leaving the database empty again.

The result? Running the tests went from 10 seconds to 2 seconds. They still start and end with a clean database, but it's done faster with a transaction.

The one gotcha that I ran into was that the PostgreSQL function now() was always returning the same time within the transaction. I had made an assumption (that was true before) that different calls would happen at different times. That assumption was no longer true inside the transaction. I had to fix the code to not rely on time.

The other part of this technique, which I did not really have to use, was that you can set up your database with test data in the :once fixture. It's costly to set up the test data, but because you're rolling back transactions, once it's set up it's quick to reset it.

If you'd like to learn more about testing in Clojure, you might be interested in my LispCast Intro to clojure.test. In it, we cover test namespaces, assertions, running your tests, and of course fixtures. It's an interactive course with exercises, screencasts, animations, and code. You should also check out the free cheatsheet below.

You might also like

TDD Workflow in Clojure using Emacs with CIDER

June 08, 2015

Summary: TDD is about fast feedback. CIDER tightens the feedback loop with quick commands for running tests and a powerful test reporting system.

Introduction

I've always been into flow. One of the key aspects of flow is a short feedback loop. Test Driven Development (TDD) is partially based on flow, too. You write a new test, then you write code to satisfy the test, then you refactor. You cycle quickly with very small steps. Great for flow!

Now, I'm not going to be a pedant in this post about what is and what is not TDD. Sometimes I like to adhere to a strict discipline of TDD. And sometimes I like to code fast and loose. But as a working definition, for the purposes of this article, I'll define TDD as writing code and tests incrementally, and running the tests fairly often. The last thing you want is to have to wait for those tests to run.

Luckily, CIDER1 has been optimized to make the whole process smooth, fast, and feedbacky. You can learn about installing CIDER here, and where to get help if you run into trouble.

What you need to know

First, you'll need CIDER connected to the REPL (usually just C-c M-j).

Besides the basic commands for switching buffers (C-c b), I use just one command a ton while I'm TDD'ing: C-c , (run the tests in the current namespace).

That will give you just the feedback you need: a green status report in the status bar if everything passes. And a new buffer with a failure and error report if it's not passing. You can do some cool stuff in that buffer, like jumping to the test definition, rerunning individual tests, and seeing diffs of actual vs. expected output. Look here for a quick reference to the available key bindings.

But mostly I'm editing code, compiling it (with C-c C-k), and running the tests (C-c ,) to make sure they pass now. Unlike with running lein test at the command line, this command only runs the tests for that specific namespace. This is usually what you want while you're editing code in the namespace. After you're done, you'll want to rerun all of the tests in a fresh JVM at the command line.

Conclusions

Getting a productive workflow set up is really important. It's hard on our nerves to be waiting through long feedback cycles. CIDER tightens those loops down to human scale so we can focus on the work of making the world better.

If you'd like to learn more about testing in Clojure, including how to write tests so that they work seamlessly with Leiningen and CIDER, I have to recommend my LispCast Intro to clojure.test course. It covers the most important and fundamental concepts and skills for testing in Clojure. You should not miss the free clojure.test cheat sheet below.



  1. CIDER is an Emacs package for rocking Clojure code.

Lambda Abstraction

May 17, 2015

Summary: Lambda abstractions are always leaky, but some are leakier than others. Clojure programmers recommend keeping most of your functions pure and containing the leaks as much as possible.

Lambda abstraction refers to something we do all of the time. Let's say I have some code:

(+ 1 2)

I'm adding the number 2 to a number, in this case, 1. I could abstract that into a lambda:

(defn add2 [x] (+ x 2))

Now it's a function, which I can apply to 1. (add2 1). I can apply it to any number I want. The actual thing I am adding 2 to is abstracted away and replaced by the variable x. Lambda abstractions are just functions.
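To see the abstraction in action at the REPL, add2 can be applied directly or passed anywhere a function is expected:

```clojure
(defn add2 [x] (+ x 2))

(add2 1)            ;; => 3
(add2 40)           ;; => 42
(map add2 [1 2 3])  ;; => (3 4 5)
```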

Functional programming is at its best when lambda abstractions are referentially transparent. That means that given the same arguments, a function will always return the same value. Being referentially transparent makes a software function more like a mathematical function. And that lets you reason about your code.

But there's a very real difference between software functions and mathematical functions: mathematical functions take no time or energy to "compute". They are defined abstractly, with no notion of computation. In contrast, software functions always take some time to compute. Sometimes the clearest way to write a function takes enough time that the illusion of mathematical functions is shattered. The abstraction is leaky.

So software functions are already a leaky abstraction, even if they are referentially transparent. Clojure (like most programming languages) opens the leak even further: you can put stuff that's not referentially transparent right in your function. For instance, you can write a "function" that reads from the disk or makes a web request. Making the same request twice can obviously return different values.

What most Clojure programmers recommend is to write mostly pure functions (that is, referentially transparent ones). You still have to deal with time, but that's way easier than dealing with the chaos of the world outside. That leaves a small bit of your code to deal with mutation, input/output, and the disk. It's still a lambda abstraction (a function), just a leakier one. Clojure simply leaves the decision of where to draw the line up to you, and it tries to make pure functions easy even when not everything fits into them.
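One common shape for this (the names here are hypothetical) is a pure function that does the real work, wrapped by a thin impure function that touches the world:

```clojure
(require '[clojure.string :as str])

;; Pure: same lines in, same report out. Easy to test.
(defn summarize [lines]
  {:count   (count lines)
   :longest (apply max 0 (map count lines))})

;; Impure shell: reads the disk, kept as small as possible.
(defn summarize-file [path]
  (summarize (str/split-lines (slurp path))))
```

All of the interesting logic lives in summarize, which is referentially transparent; only summarize-file leaks.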

The takeaway of functional programming is the same recommendation: write most of your code as referentially transparent functions. The degree to which a language helps you do that is how "functional" the language is.

If you'd like to learn more about Clojure and pure functions, check out LispCast Introduction to Clojure. It's 2.5 hours of high quality video. You probably haven't seen anything like this! There's animations, exercises, characters, and screencasts. It takes you from no knowledge to a deep experience, all while having fun!



Infinite Application

May 17, 2015

Summary: Function application is a key concept in lambda calculus. While it is commonly expressed using parentheses in Clojure, it is also reified into a function which itself can be applied to another function.

Function application in Clojure is easily expressed with a couple of parentheses:

(foo 1 2 3)

That's the function foo applied to the arguments 1, 2, and 3. But let's say we have those numbers in a vector somewhere, and we want to call foo on them.

(def some-numbers [1 2 3])

We could manually pull out the arguments from the vector like this:

(foo (some-numbers 0) (some-numbers 1) (some-numbers 2))

Great! That should work. But more commonly you see this:

(apply foo some-numbers)

apply means take the function (the first argument) and apply it to the arguments which are in the list (the last argument). apply pulls out the values from the list internally so you don't have to.
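Here it is with functions we all know:

```clojure
(apply + [1 2 3])      ;; => 6, same as (+ 1 2 3)
(apply max [3 7 2])    ;; => 7
(apply str ["a" "b"])  ;; => "ab"
```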

apply is a function you'll see in many Lisps. It plays a key role in the meta-circular evaluator as defined in The Structure and Interpretation of Computer Programs (SICP). In the meta-circular evaluator, eval and apply are defined in terms of each other.

The way eval is defined classically, (foo 1 2 3) gets turned into (apply foo [1 2 3]) internally. This means that you can replace (foo 1 2 3) with (apply foo [1 2 3]) in the program without changing the meaning.

But! Since apply is a function, (apply foo [1 2 3]) is equivalent to (apply apply [foo [1 2 3]]), which is equivalent to (apply apply [apply [foo [1 2 3]]]). And you can expand that out forever. (Please don't!).
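You can check the first couple of expansions at the REPL; they all evaluate to the same thing:

```clojure
(+ 1 2 3)                  ;; => 6
(apply + [1 2 3])          ;; => 6
(apply apply [+ [1 2 3]])  ;; => 6
```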

apply is something I really love about Lisp. It takes one of the main parts of lambda calculus (function application) and reifies it. Function application is available as a function, which can be passed around, composed, etc, just like any other value. I love it!

If you're in love with this idea, too, you might want to check out LispCast Introduction to Clojure. It's my video course about, you guessed it, Clojure. It takes you from absolute parenthesis-phobe to I-never-knew-it-could-be-this-way lisper by using animations, exercises, and screencasts.



But the World is Mutable

May 11, 2015

Summary: The world may be mutable, but people have been using the notion of immutability to build reliable systems for a long time.

Immutability is a hard topic to broach. As a programmer used to modeling the world, you might object to immutable data structures. How do you model a changing world? Why would you choose to use immutable data structures when everything in the world is changeable?

Let's do a little thought experiment. Let's look at a nice mutable system: paper and pencil. You can write, erase, and write again. It's very convenient. It lets you correct mistakes. And when you don't need something anymore, you can easily erase it.

Now answer this: would you trust a bank that used pencils to record transactions? It would be easy: whenever you would withdraw money, they would erase the old balance and write the new balance. And if you transferred money from one account to another, they'd erase two balances and write the new ones in. It may sound great, but there's a reason banks don't use pencils: they want to be sure nothing has changed. That sounds like immutability.

Bank ledger (photo credit)

This is a bank ledger. Each transaction gets its own line. Always done in pen. It's an example of an append-only data structure. You can answer questions about the past like "How much money was in the account at the close of last Tuesday?" by going up lines until you find the last entry for Tuesday. And you can do that because you never modify existing entries. You only add new entries on blank lines.
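In code, a ledger is just an append-only sequence: you only ever conj new entries, and questions about the past become queries over the entries. (The transactions here are made up.)

```clojure
(def ledger [{:day 1 :amount 100}
             {:day 2 :amount -30}
             {:day 3 :amount 55}])

;; "Recording a transaction" returns a new ledger with one more line;
;; the old entries are never modified.
(defn record [ledger entry]
  (conj ledger entry))

;; "Balance at close of day d" sums the entries up to that day.
(defn balance-at [ledger d]
  (reduce + (map :amount (filter #(<= (:day %) d) ledger))))

(balance-at ledger 2)  ;; => 70
```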

Medical record system (photo credit)

This is another example of an append-only data structure in the real world: medical records. Each patient gets a file that everything is added to. You never modify old records. That way, everything is recorded, even the wrong diagnoses (mistakes) of the doctor.

It turns out that traditional systems that need a high degree of reliability create immutable records out of mutable paper. Even though you could in theory scratch out some data and write it again, or white it out, or find some other way to mutate the document, a mark of professionalism in the job is to discipline yourself to adhere to strict append-only behaviors.

Wouldn't it be nice if the machine took care of the discipline for us? Even though RAM and disk are mutable like paper and pen, we can impose a discipline inside of our program. We could rely on the programmer to never accidentally overwrite existing data. But that's just shifting the burden. Instead, we can build immutability into our data structures and make a paper that cannot be overwritten.

That's how immutable data structures work. All new pieces of information are written to new locations in memory. Only when it is proven that a location is never going to be used again is it reused.
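Clojure's data structures work exactly this way: "modifying" a value gives you a new value, and the original is untouched:

```clojure
(def v [1 2 3])

(conj v 4)  ;; => [1 2 3 4], a new vector
v           ;; => [1 2 3], the original is unchanged
```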

Reliable paper-based systems use immutable data. There was a time when computer memory was so expensive that we had to reuse storage, so we couldn't make immutable systems. But RAM is cheap now! We should be using immutable data, just as banks have done for hundreds of years. Ready to join the 13th century?1

If you're interested in a language with a very cool set of powerful immutable data structures, probably the most cutting edge immutable data structures in any language, you're in luck! You can get LispCast Introduction to Clojure. It's a video course with animations, exercises, and screencasts that teaches you Clojure so you'll learn it and remember it.




  1. The Double-entry method of accounting can trace its history back to 13th century Florence.