(clj 10) The mapify function of (clj 9) revisited
In my previous Clojure post I covered the code example at the end of Chapter 4 of “Clojure for the Brave and True”. Or rather, I focused on a single function in the example, called mapify
, and described how I figured out what it does. At the end of that post I shared my disappointment:
Having done all this work (and writing) I can’t help but feel a bit disappointed. In the end, the function does not seem to do a lot.
And I shared a Python version of that same function, claiming that it basically does the same thing1.
However, less than a week after publishing the post, I got a very kind email by someone named Martin. And “in defence of clojure and to maybe reduce your disappointment” Martin pointed out the two versions are not as equivalent as I claimed, because the Clojure version is more general than the Python version. So I looked at the two version again and the way they are different turned out to be quite interesting - interesting enough to write a follow-up post.
The Cojure version of mapify
The Clojure version looks like this:
(defn mapify [rows] (map (fn [unmapped-row] (reduce (fn [row-map [vamp-key value]] (assoc row-map vamp-key (convert vamp-key value))) {} (map vector vamp-keys unmapped-row))) rows))
It expects an argument similar to:
(["Edward Cullen" "10"] ["Bella Swan" "0"])
and transforms it into a sequence of maps:
[{:name "Edward Cullen" :glitter-index 10} {:name "Bella Swan" :glitter-index 0}]
The main thing to note about the Clojure version are the vamp-keys
and the convert
, which were defined earlier in the example.
The vamp-keys
lists the keys that are used to transform each sequence of the input argument to a map:
(def vamp-keys [:name :glitter-index])
So this variable describes our expectations of the input: a name and a glitter index.
The convert
function uses the conversions
variable to convert a value in our map:
(def conversions {:name identity :glitter-index str->int}) (defn convert [vamp-key value] ((get conversions vamp-key) value))
If the key is :name
, it applies the identity
function, i.e. the value is unchanged. If they key is :glitter-index
, it converts the value from a string to an integer. That’s needed, because as you can see above, the argument to mapify
has the glitter index still as a string. So the conversions
variable describes if/how we want to convert the values in our map.
The result of having vamp-keys
and convert
defined separately in this way is that, as Martin writes:
The clojure code is more general and allows to add new columns/keys by just adding the key to the vamp-keys and the column to the data - and probably a conversion to conversions.
The Python version of mapify
My Python version looks like this:
vamp_keys = ["name", "glitter-index"] def mapify(rows): return [ { vamp_keys[0]: row[0], vamp_keys[1]: int(row[1]) } for row in rows ]
Let’s start with the conversion: there is no separate convert
function in my Python version. The conversion happens inside the function. (Inside the return
statement even!)
There is a separate vamp_keys
variable as in the Clojure version, but that variable is not used in the same way. The Python version refers to the items in vamp_keys
by their index, e.g. vamp_keys[0]
. The Clojure version on the other hand, uses map
to loop through the vamp-keys
and the unmapped-row
.
What this means is that if you’d want to add for instance “age” to the data, in the Clojure version you’d only need to add "age"
to vamp-keys
and :age str->int
to conversions
. The mapify
function can remain exactly the same. That would not be the case in the Python version. There you’d need to update both vamp_keys
and the mapify
function. A truly equivalent version in Python would provide that same flexibility, instead of taking the shortcut as in my version2.
Why does this difference matter?
So the Clojure version is more general and can accommodate changes in the data structure without any changes to the mapify
function. Does that matter?
In a weird sense it doesn’t. The whole code example is meant as a summary of chapter 4 of “Clojure for the Brave and True”. So it doesn’t need to be general, it needs to illustrate the contents of that chapter.
And yet I find the difference between the two versions intriguing. Not as in my previous post where I was a little disappointed that my simpler Python version basically does the same thing as the Clojure version3. It’s clear to me now that that’s not true. (Unless you want to make the “basically” do a lot of work.) Rather, I’m intrigued by the difference in what information is stored where in the two versions.
The Clojure version has the information about what kind of argument the mapify
function expects, in vamp-keys
and conversions
. This means that once you understand the code, you can forget most of it again. All you have to remember is what mapify
does on a very high level and that changes to the function’s argument require changes to vamp-keys
and conversions
.
That’s not the case with the Python version. There you do have to understand what the mapify
function does in detail, so you can update it together with the vamp_keys
. For this particular code example that might not be that big a deal, because the function is rather straightforward. Yet it does still illustrate an important question about structuring your code: where do you put the key elements?
-
I did also add this footnote: “The Python code does do the
convert
in a separate function. It would also return a different result than the Clojure code if the data would have more than two items per row.” ↩ -
In my defense, this difference did cross my mind very briefly. I dismissed it without giving it much thought, though. Which is a valuable reminder to not just dismiss that little voice in the back of your head. ↩
-
Martin also shared a Clojure code equivalent to my Python version, so without that flexibility, and lo and behold, they look very similar. ↩