jank development update - Dynamic bindings and more!

For the past couple of months, I have been focused on tackling dynamic var bindings and meta hints, which grew into much, much more. Along the way, I've learned some neat things and have positioned jank to be ready for a lot more outside contributions. Grab a seat and I'll explain it all! Much love to Clojurists Together, who have sponsored some of my work this quarter.

Vars and roots

I estimated that dynamic bindings would take me a while to implement, mainly because, even after several years of using Clojure, I still didn't know how they worked. It's because of this that I'll now explain it to all of you. We can break it down into two parts:

Var roots
Var bindings (overrides)

So, firstly, each var is a container for an object (though vars may start out unbound, they still hold a special unbound object regardless). That object is known as the var's root. When you define a var with def or defn, you're generally providing its root.

; person is a var and its root is "sally"
(def person "sally")

; add is a var and its root is an instance of the fn
(defn add [x y]
  (+ x y))

When we refer to a var by its name, we're implicitly dereferencing that var in order to get its root.

; You write this.
(println person)

; Implicitly, it does something more like this.
(println (deref #'person))

When we use a REPL client to re-evaluate a file, which evaluates the def and defn forms again, we end up changing the root of those vars. The old root is gone entirely. There's another way we can alter the root of a var, which is using the alter-var-root function. This is very similar to just redefining it. The old root is gone.

However, for some vars, we want to change the root for only a short period of time, after which we'll want the old value replaced. We can do this with with-redefs.

(println person) ; => sally

; Within the scope of this form, we redefine our var.
(with-redefs [person "joe"]
  (println person)) ; => joe

; Now it's back to being sally.
(println person) ; => sally

An important thing to note is that changing a var root affects every running thread of the program. The var root is global and synchronized. This is great, for some things, but sometimes we only want to change a var on our current thread. For example, in Clojure there is a var called *out* which holds a java.io.Writer object. Any core functions which write to stdout will write to that object. Let's say we have a particular bit of code where we want to write to a file.

; Open a file, redefine *out* to point to the writer, and then write to *out*.
(with-open [log (clojure.java.io/writer "system.log" :append true)]
  (with-redefs [*out* log]
    (println "this should go into a file")))

This works! However, if you have any other threads which are also writing to *out*, you may find that what they write ends up going into your log file, too! This is because every thread is affected by the changing of a var's root. So how would we avoid this problem? Dynamic bindings!

Dynamic bindings

With dynamic bindings, we can supply an "override" for a var, without changing its root. Even better, that override is tied only to our current thread. The way that this works is that, separate from the var itself, Clojure keeps track of a map of overrides. The map is from var to value. Something like this:

{#'*out* log}

There's one extra trick here, though. Rather than just a single map of overrides, we have a stack. Each time a var is overridden, a new map is pushed onto the stack. We always pull from the top of the stack to the get latest value for a var. Let's take a look at an example.

(def ^:dynamic *foo* 42)

(binding [*foo* 0]
  (binding [*foo* 100]
    (println *foo*)) ; => 100
  (println *foo*)) ; => 0

Here, we have a var which has a root of 42. We note this var as dynamic, which means we can provide these overrides, known as bindings. There's a binding macro which helps us do this, but all it does it push the new override and then pop at the end. Our stack (which is a linked list) of overrides might look something like this:

; Initially, no overrides, let's say. An empty stack.
'()

; When we bind *foo* to 0, we add an override.
'({#'*foo* 0})

; When we bind *foo* to 100, we add an override.
'({#'*foo* 100} {#'*foo* 0})

; At the end of each binding form, we pop from the top of the stack.

With this stack of overrides, we just need to update the deref implementation for vars to not just return the root. Instead, they check if the var has any overrides and then returns the latest override, falling back on the root if there is none. In jank's C++ code, that looks like this:

object_ptr var::deref() const
{
  auto const binding(get_thread_binding());
  if(binding)
  { return binding->value; }
  return *root.rlock();
}

Binding conveyance

There's one other neat thing about dynamic bindings which can't go unmentioned. When you make a new thread in Clojure, the current var overrides you have get copied over to the new thread. The same applies to async blocks in core.async. This allows you to set up some var state and push it to another thread without needing to rebuild everything. With core.async, each time your task becomes unparked, bindings are re-established to what they were before. This ends up being super convenient and can feel magical.

Thanks to Tim Baldridge for his post on this stuff, which helped put it more into terms I could understand. My goal here has been to reiterate and elucidate.

Status report

Ok, so all of this is implemented in jank now. I could've just said that from the beginning, I know, but I get paid by the character still, I swear. jank doesn't have future yet, but everything is in place for binding conveyance. To go hand in hand with this, I tackled meta hints next, allowing us to do things like ^:dynamic on a var.

Meta hints

This one doesn't get a big explanation. jank now supports hints in the form of ^:keyword, as well as ^{:keyword true}. Multiple hints can be specified, they can be nested, etc. Not many parts of jank are using these hints yet, since I didn't go through all existing code to update usages, but we can do that iteratively. See? I can be brief.

What else?

Well, my quarter was booked for the following tasks:

🗹 Dynamic vars
🗹 Meta hints
☐ Syntax quoting
☐ Normalized interpolation syntax

The first two are done and the latter two remain, to be done in the next month. But is that all I did in two months? Just dynamic vars and meta hints? Nah.

Support for exceptions

I've added support for the special throw form, as well as for try/catch/finally. Previously, I was using inline C++ to throw exceptions, but the work on the dynamic vars gave me a good excuse to get proper exception support in there. I needed to ensure bindings were gracefully handled in exceptional scenarios!

My plan is to do a compiler deep dive, next post, going into how exceptions were implemented in jank. Stay tuned for that.

Escaped string literals

jank previously didn't properly support strings like "foo \"bar\" spam", since they require mutation in order to unescape them. This is an interesting edge case, because all of the other tokens within jank's lexer work based on memory mapped string views. String views allow jank's lexer to run without any dynamic allocations at all. However, for escaped strings, we need to allocate a string, since we need to mutate it when unescaping. It's straight-forward, but just something I hadn't tackled yet.

Reader macros

Since I was improving the reader to support meta hints, I figured I'd add reader macro support. In jank, you can now use #_ to comment out forms. Also, you can use #{} to create sets. Finally, you can use #?(:jank foo) and #?@(:jank []) to conditionally read in jank code. The :default option is supported, too. Support for shorthand functions, regex values, and var quotes will be coming soon.

New core functions and macros

While implementing the above features, I also added support for the following vars in clojure.core:

assert
when-not
comment
zipmap
binding
with-redefs
drop (not lazy)
take-nth (not lazy)

Since jank doesn't have lazy sequences yet, some of these functions are a little hacky. Lazy sequences come next quarter, though, as well as loop support!

Community involvement

But wait, there's more! jank received its first non-trivial code contributions this quarter, from a helpful man named Saket Patel. He's working on a jank plugin for leiningen and it's currently in a state where you can set up a multi-file jank project and use lein jank run to make magic happen. 😁 He's also submitted some C++ improvements to aid in that quest and to help jank compile on Apple M1 hardware. I'll have a more detailed demonstration of this in my next development update.

On top of that, in order to get him set up for contributing, I've done the following:

Added a Code of Conduct
Added a CLA, which is managed by a bot that will prompt you to sign upon submitting a PR with a large enough
Added automated code formatting and re-formatted the whole codebase, using clang-format. I don't like the look of it, but at least it's consistent

Would you like to join in?

Join the community on Slack
Join the design discussions or pick up a ticket on GitHub
Considering becoming a Sponsor
Hire me full-time to work on jank!