Pre-Conj Interview: Steve Miner

October 19, 2014


I talked with Steve Miner about generating test.check generators. He's giving a talk on that topic at Clojure/conj. Read the background to his talk.


LispCast: How did you get into Clojure?

Steve Miner: I think it was around 2008 when Paul Graham announced his new programming language, Arc. I was reading about Arc when I came across a comment about Clojure and decided to take a look. I was very impressed with Rich Hickey's intro video. The immutability and concurrency features really resonated with what I was hoping to find in a new language. The Java integration made Clojure a practical language with a huge ecosystem of tools and libraries. I was using Java at work and I managed to do a bit of Clojure for a side project, but mostly I was just dabbling. A couple of years later, I decided to work on my own with Clojure full-time.

LC: Can you describe Herbert for those who have never used it? Why would someone be interested?

SM: Herbert is a schema language for edn. The goal is to have a convenient language for describing the shape of Clojure data. I started out with an informal notation that I used for my internal documentation. For example, describing a map with certain required keys and the corresponding value types. I think you can guess what {:name str :num int} means as a schema. It turned out that with a little work, that informal notation could be used as a pattern language with a simple API for testing conformance. A Herbert schema is itself just edn data, open to all your Clojure tools. More recently, I added the ability to generate test.check generators from Herbert schemas, which makes it easy to generate test data.
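To give a feel for what "testing conformance" against a schema like {:name str :num int} means, here is a toy sketch in plain Clojure. It is not Herbert's implementation or API: the names type-preds and conforms-toy? are made up for this illustration, and it handles only flat maps with the two type names from Steve's example.

```clojure
;; Toy illustration only; NOT Herbert's implementation or API.
;; `type-preds` and `conforms-toy?` are hypothetical names.

;; Map each schema type symbol to an ordinary Clojure predicate.
(def type-preds
  {'str string?
   'int integer?})

(defn conforms-toy?
  "True if m is a map in which every key named by the schema is present
  and its value satisfies the predicate for the schema's type symbol."
  [schema m]
  (and (map? m)
       (every? (fn [[k type-sym]]
                 (let [pred (get type-preds type-sym (constantly false))]
                   (and (contains? m k)
                        (pred (get m k)))))
               schema)))

;; The schema is quoted, so it is plain edn-style data.
(conforms-toy? '{:name str :num int} {:name "Steve" :num 42})   ;=> true
(conforms-toy? '{:name str :num int} {:name "Steve" :num "42"}) ;=> false
(conforms-toy? '{:name str :num int} {:name "Steve"})           ;=> false
```

Because the schema is just quoted data, it can be stored, printed, and transformed like any other Clojure value, which is the property Steve highlights.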

LC: That sounds nice. Would you mind explaining test.check generators a bit for those who don't know?

SM: Reid Draper ported QuickCheck from Haskell to Clojure, and it became a contrib library called test.check.

Test.check is a property-based testing tool. A property is basically an invariant that should hold true over a range of input values. The test.check library gives you combinators that allow you to define generator functions which create data of specified types with optional constraints. The idea is to think about the whole range of possible inputs that your system should handle. Test.check then automatically tests across a random sample of generated data, probably generating example data that you might not have considered in your typical unit testing. If it finds a failure case where the desired property does not hold, test.check is smart enough to regenerate test cases so as to shrink the failure example to a reasonable size. That helps you isolate the cause of the failure.

Clojure/West had two excellent talks about property-based testing last year. Reid Draper covered test.check, and John Hughes talked about QuickCheck.
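For readers who have not seen test.check, a minimal sketch of the workflow Steve describes might look like the following. It assumes the org.clojure/test.check library is on the classpath. The two properties are illustrative examples chosen for this post, not anything taken from Steve's talk or from Herbert.

```clojure
;; Assumes org.clojure/test.check is available on the classpath.
(require '[clojure.test.check :as tc]
         '[clojure.test.check.generators :as gen]
         '[clojure.test.check.properties :as prop])

;; A property that should hold for every input:
;; sorting an already sorted vector changes nothing.
(def sort-idempotent
  (prop/for-all [v (gen/vector gen/int)]
    (= (sort v) (sort (sort v)))))

;; Run the property against 100 randomly generated vectors.
(:result (tc/quick-check 100 sort-idempotent)) ;=> true

;; A deliberately wrong property: it claims the smallest element is
;; always strictly less than the largest, which is false for any
;; one-element vector.
(def first-less-than-last
  (prop/for-all [v (gen/not-empty (gen/vector gen/int))]
    (let [s (sort v)]
      (< (first s) (last s)))))

;; On failure, the result map includes the original failing input
;; under :fail and a shrunken counterexample under :shrunk.
(tc/quick-check 100 first-less-than-last)
```

The second call is where shrinking shows up: rather than reporting whatever random vector first broke the property, test.check searches for a smaller input that still fails.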

LC: So you're able to automatically create generators from your schemas, which can also be used as contracts on your function arguments. Has having both improved your bug rate?

SM: I don't have any numbers, but subjectively I think it's helped. For me, Herbert schemas are primarily documentation tools, which help me to keep track of my data. That being said, I often test schema conformance in preconditions or asserts, especially with new code or when I'm trying to debug a problem.

Of course, I still make errors in specifying schemas and sometimes my properties aren't exactly correct the first time. Particularly with new property-based tests, I have to look carefully at failures in case the bug is actually in the test. My hope is that schema-based generators will make property-based testing easier to use.

Using test.check definitely improves my confidence that I'm finding bugs in testing and avoiding regression errors. It's been a great way to catch bugs in my own Herbert library.
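The precondition style Steve mentions can be sketched with Clojure's built-in :pre support. In this sketch the hand-written predicate person? stands in for a schema conformance check. Both person? and greeting are hypothetical names invented for the example, and no Herbert functions are used.

```clojure
;; `person?` is a hand-written stand-in for a schema conformance check.
;; Both names here are hypothetical and exist only for this sketch.
(defn person?
  "True if m is a map with a string :name and an integer :num."
  [m]
  (and (map? m)
       (string? (:name m))
       (integer? (:num m))))

(defn greeting
  "Builds a greeting, refusing arguments that fail the person? check."
  [person]
  {:pre [(person? person)]}
  (str "Hello, " (:name person) " (#" (:num person) ")"))

(greeting {:name "Steve" :num 1}) ;=> "Hello, Steve (#1)"

;; With assertions enabled (the default), a non-conforming argument
;; throws an AssertionError before the function body runs:
;; (greeting {:name "Steve" :num "one"})
```

A check like this fails at the function boundary, close to where the bad data entered, which is what makes it handy for new code and for debugging.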

LC: So you have a runtime check to make sure the function arguments conform to certain schemas. And you have a generative test that exercises a large space of that schema. Sounds pretty good to me!

But it sounds like you're saying the primary benefit is more for you or other readers of your code. Can you elaborate on that?

SM: My approach started with a notation designed to help me keep Clojure data structures straight in my mind. I wanted something simple and terse, what I called a "whiteboard compatible" notation. My goal was that the notation should look something like the data it was supposed to represent as opposed to code or a type system.

So I began with documentation in mind, and I still think of that as the primary benefit. Once I got the idea of implementing conformance testing against formal schemas, the project became more about the code.

LC: In what other areas do you see schemas playing a part? The first thing that comes to mind is writing core.typed type annotations. Anything else?

SM: There's some conceptual overlap between schemas and type systems, but I see core.typed as a much more ambitious project. Herbert schemas only cover edn data and don't deal with function types, for example. The Datomic database naturally has a schema language, so it would be interesting to see if Herbert could be useful for data modeling. In the near term, I plan to extend Herbert so that it supports the Transit datatypes.

LC: What resources would you recommend to a beginner who wanted to make the most of your talk?


LC: Where can people follow your adventures online?


LC: One last question: If Clojure were a food, what would it be?

SM: I'll say "pizza" because it's a food that is made by composing fancy toppings on a classic crust. And hackers like it while coding.

LC: Thanks, Steve, for a great interview.

This post is one of a series called Pre-conj Prep.

Clojure/conj is a conference organized and hosted by Cognitect. This information is in no way official. It is not sponsored by nor affiliated with Clojure/conj or Cognitect. It is simply me curating and organizing public information about the conference.