Typed Configuration Files with Malli & Aero

Toni Väisänen

Most projects start by defining an application configuration file, along with schemas or code to validate it. The configuration can be as simple as a database connection string, and sometimes it is read directly from an environment variable in the code. In Clojure, an EDN file usually describes the required configuration explicitly. The file is loaded into memory and, depending on the requirements, transformed further into the right format: a port number needs to be an integer instead of a string, and so on. But there's an alternative way to do this: write your configuration files as Malli schemas, then validate and transform when reading.

If you are familiar with Aero and Malli, jump to the end to see how to write configurations as Malli schemas.

Let's go through the basics of Aero and Malli first before putting them together.

Aero

Aero is a popular Clojure library from JUXT that lets us reference environment variables in an EDN configuration file with the #env tag literal. It has other features, but for our purposes here, this is enough. Let's see what this means by creating a configuration file for an application that needs to know the user and the shell of the system the application runs on.

Without Aero, we might write something like

(System/getenv "USER") ;; => "tvaisanen"
(System/getenv "SHELL");; => "/bin/zsh"

But we prefer the declarative way

;; sample-01.edn
{:user #env USER
 :shell #env SHELL}

(ns core
  (:require [aero.core :as aero]
            [clojure.java.io :as io]))

(aero/read-config (io/resource "sample-01.edn"))
;; => 
{:user "tvaisanen" :shell "/bin/zsh"}

Now that the application can access the configuration data, we'd usually do the validation step.

Let's say that we expect the username to be at least five characters and the shell to be bash instead of zsh and we want to throw an error when the requirements are not met.

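(require '[clojure.string :as str])

(defn validate-config [config]
  ;; the user name must have at least five characters
  (when-not (and (>= (count (get config :user)) 5)
                 ;; and the shell must be bash
                 (str/ends-with? (get config :shell) "bash"))
    (throw (ex-info "invalid config" config))))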

(-> (io/resource "sample-01.edn")
    (aero/read-config)
    (validate-config))

1. Unhandled clojure.lang.ExceptionInfo
   invalid config
   {:user "tvaisanen", :shell "/bin/zsh"}
                      REPL:   47  core/validate-config
                      REPL:   43  core/validate-config
                      REPL:   51  core/eval17603
                      REPL:   49  core/eval17603

As a developer, I'm already thinking that this approach will be a pain. I'll probably need to add many different configuration parameters, validate for all of them, and then provide descriptive error messages. It'd still be manageable for a small toy application like ours, but it's most likely not the case for larger projects.
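
To make the concern concrete, here is a rough sketch of where the hand-written approach tends to go once each key needs its own message. The function name `config-errors` and the message texts are made up for illustration; they are not part of the original example.

```clojure
(require '[clojure.string :as str])

(defn config-errors
  "Returns a map of key -> message for every check that fails.
  Hypothetical helper: every new configuration key needs another clause here."
  [{:keys [user shell]}]
  (cond-> {}
    (not (and (string? user) (>= (count user) 5)))
    (assoc :user "should be a string of at least 5 characters")

    (not (and (string? shell) (str/ends-with? shell "bash")))
    (assoc :shell "should be a string ending with `bash`")))

(config-errors {:user "tvaisanen" :shell "/bin/zsh"})
;; => {:shell "should be a string ending with `bash`"}
```

Every rule is spelled out twice, once as a predicate and once as a message, which is exactly the bookkeeping a schema library can take over.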

Validation With Malli

Malli to the rescue.

If you're unfamiliar with Malli, refer to my earlier post or the official documentation for the basics.

Let's start by defining the expected values and their types. At this point, we'll assume both values are strings and tackle the constraints in a moment.

(require '[malli.core :as m])

(def schema
  [:map
   [:user :string]
   [:shell :string]])

(->> (io/resource "sample-01.edn")
     (aero/read-config)
     (m/validate schema))
;; => 
true

Everything is OK so far. Next, add the constraints from before.

(def schema
  [:map
   [:user [:string {:min 5}]]
   [:shell [:re "^.*bash$"]]])

(->> (io/resource "sample-01.edn")
     (aero/read-config)
     (m/validate schema))
;; => 
false

Now we are at the same stage as with our validate-config function. Note that :min goes in the properties of the :string schema; a map placed directly after the key holds entry properties, which the validator does not read as constraints. Let's take it a step further. Remember the descriptive error messages we talked about? Malli provides out-of-the-box tools for them.

(->> (io/resource "sample-01.edn")
     (aero/read-config)
     (m/explain schema))
;; =>
{:schema
 [:map
  [:user [:string {:min 5}]]
  [:shell [:re "^.*bash$"]]],
 :value {:user "tvaisanen", :shell "/bin/zsh"},
 :errors
 ({:path [:shell],
   :in [:shell],
   :schema [:re "^.*bash$"],
   :value "/bin/zsh"})}

The output is still somewhat cryptic for the casual reader. We are not satisfied until the resulting error reads like English.

(require '[malli.error :as me])

(->> (io/resource "sample-01.edn")
     (aero/read-config)
     (m/explain schema)
     (me/humanize))
;; => 
{:shell ["should match regex"]}

Now we have the error in English, but we still don't know what the regex part contains.

(def schema
  [:map
   [:user [:string {:min 5}]]
   [:shell
    [:re
     {:error/message "String needs to end with `bash`"}
     "^.*bash$"]]])

(->> (io/resource "sample-01.edn")
     (aero/read-config)
     (m/explain schema)
     (me/humanize))
;; =>
{:shell ["String needs to end with `bash`"]}
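
Going back to the `:min` constraint for a moment: in Malli's vector syntax, a map placed between the key and the schema holds properties of the map entry, while a constraint such as `:min` is read from the properties of the `:string` schema itself. The following is a minimal sketch of the difference, assuming a reasonably recent Malli version; the three-letter user name is made up to trigger the failure.

```clojure
(require '[malli.core :as m]
         '[malli.error :as me])

;; :min sits on the map entry, so the :string schema never sees it
(def entry-props [:map [:user {:min 5} :string]])

;; :min sits on the :string schema and is enforced
(def schema-props [:map [:user [:string {:min 5}]]])

(m/validate entry-props {:user "abc"})  ;; => true
(m/validate schema-props {:user "abc"}) ;; => false

;; humanize reports the problem under the :user key
(me/humanize (m/explain schema-props {:user "abc"}))
```

The first form fails silently, which is easy to overlook when the real user name happens to be long enough.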

One last thing before we move forward.

Usually, we need to read some values as specific types.

Assume that we expect the application's features as a comma-separated string and want to use them as a set of strings. Again, we can use Malli's decode for this at load time.

Let's see how decoding works first.

(require '[malli.transform :as mt])

(m/decode [:map [:int int?]]
          {:int "1"}
          mt/string-transformer)
;; =>
{:int 1}

Then, we adapt the example to our use case.

(m/decode [:map [:features [:set :string]]]
          {:features "foo,bar,fizz"}
          mt/string-transformer)
;; =>
{:features "foo,bar,fizz"}

It didn't work as expected! But of course, how could Malli know what string encoding we use? We can fix this with custom decoders.

(m/decode
 [:map
  [:features
   {:decode/string (fn [v]
                     (set (clojure.string/split v #",")))}
   [:set :string]]]
 {:features "foo,bar,fizz"}
 mt/string-transformer)
;; =>
{:features #{"foo" "bar" "fizz"}}
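
An inline decoder like this can be made more forgiving about its input. The sketch below uses a hypothetical helper, `csv->set` (the name is not from the original code), that trims whitespace, drops empty items, and passes non-string values through untouched, so a value that is already a set survives decoding.

```clojure
(require '[clojure.string :as str])

(defn csv->set
  "Turns \"a, b,,c\" into #{\"a\" \"b\" \"c\"}; returns non-strings unchanged."
  [v]
  (if (string? v)
    (->> (str/split v #",")
         (map str/trim)
         (remove str/blank?)
         set)
    v))

(csv->set "foo, bar,,fizz ")
;; => #{"foo" "bar" "fizz"}
```

It could be used as the `:decode/string` value in place of the anonymous function.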

Let's wrap all of this in a function to load our configuration. I also updated the schema: the expected shell is now zsh so the validation step won't throw, and there is a new :features entry with the decoder from above.

(defn load-config [filename]
  (let [config (aero/read-config (io/resource filename))
        decoded (m/decode schema config mt/string-transformer)]
    (when-not (m/validate schema decoded)
      (throw (ex-info "invalid schema"
                      (->> decoded
                           (m/explain schema)
                           (me/humanize)))))
    decoded))
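
The `schema` that load-config refers to is not shown at this point. The following is only a guess at what it looks like, pieced together from the earlier snippets (the zsh regex, the `:features` entry with its decoder); treat it as an assumption rather than the exact definition.

```clojure
(require '[clojure.string :as str]
         '[malli.core :as m]
         '[malli.transform :as mt])

;; assumed shape of the schema used by load-config
(def schema
  [:map
   [:user [:string {:min 5}]]
   [:shell
    [:re
     {:error/message "String needs to end with `zsh`"}
     "^.*zsh$"]]
   [:features
    ;; split only strings; leave already-decoded values alone
    {:decode/string (fn [v] (if (string? v) (set (str/split v #",")) v))}
    [:set :string]]])

;; a configuration without :features decodes fine but does not validate
(m/validate schema
            (m/decode schema
                      {:user "tvaisanen" :shell "/bin/zsh"}
                      mt/string-transformer))
;; => false
```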

;; let's see what happens
(load-config "sample-01.edn")
1. Unhandled clojure.lang.ExceptionInfo
   invalid schema
   {:features ["missing required key"]}
                      REPL:  132  core/load-config
                      REPL:  128  core/load-config
                      REPL:  138  core/eval17775

The error shows that we are missing the :features key. Update the configuration file to fix this.

{:user #env USER
 :shell #env SHELL
 :features "foo,bar,fizz"}

And try again.

(load-config "sample-01.edn")
;; =>
{:user "tvaisanen"
 :shell "/bin/zsh"
 :features #{"foo" "bar" "fizz"}}

It works as expected! This saves a lot of work when the configuration changes. Going forward, we only have to keep the schema up to date with the configuration file. Simple.

Keep Schema Up-To-Date

Keeping the config and schema in sync is easier said than done. We are people, after all. To make sure the schema is updated whenever the configuration changes, we can remove every key that the schema doesn't define from the loaded configuration. A new key then has no effect until someone adds it to the schema.

We can do this by adding mt/strip-extra-keys-transformer.

(def transformer
  (mt/transformer mt/strip-extra-keys-transformer
                  mt/string-transformer))

(defn load-config [filename]
  (let [config (aero/read-config (io/resource filename))
        decoded (m/decode schema config transformer)]
    (when-not (m/validate schema decoded)
      (throw (ex-info "invalid schema"
                      (->> decoded
                           (m/explain schema)
                           (me/humanize)))))
    decoded))

Now developers are forced to add their changes to the schema: the strip-extra-keys-transformer drops any key the schema doesn't define, and load-config throws if a value doesn't match the schema.
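
A small sketch of what stripping does, next to an alternative that is not used in this post: marking the map as `{:closed true}`, which turns an undeclared key into a validation failure instead of dropping it silently. The `:editor` key and both schema names are made up for the example.

```clojure
(require '[malli.core :as m]
         '[malli.transform :as mt])

(def open-schema   [:map [:user :string]])
(def closed-schema [:map {:closed true} [:user :string]])

;; :editor is not declared in either schema
(def config {:user "tvaisanen" :editor "emacs"})

;; stripping removes the undeclared key without complaining
(m/decode open-schema config mt/strip-extra-keys-transformer)
;; => {:user "tvaisanen"}

;; open maps accept extra keys, closed maps reject them
(m/validate open-schema config)   ;; => true
(m/validate closed-schema config) ;; => false
```

Which one fits better depends on whether an unknown key should be ignored or should stop the application from starting.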

But why stop here?

Configuration File as Malli Schema

We can define Malli schemas as EDN files.

Aero reads EDN files.

We can write our configurations and type definitions in the same file. As long as we reference the decoder functions in the EDN, if we have any, we should be good to go. Let's update our config file.

;; sample-02.edn
[:map
 [:user
  {:value #env USER}
  [:string
   {:error/message "At least 5 characters long"
    :min 5}]]
 [:shell
  [:re
   {:error/message "String needs to end with `zsh`"
    :value #env SHELL}
   "^.*zsh$"]]
 [:features
  {:value "foo,bar,bizz"}
  [:set :string]]]

Now we have the schema with the default values stored under the :value key.

(aero/read-config (io/resource "sample-02.edn"))
;; => 
[:map
 [:user
  {:value "tvaisanen"}
  [:string
   {:error/message "At least 5 characters long"
    :min 5}]]
 [:shell
  [:re
   {:error/message "String needs to end with `zsh`"
    :value "/bin/zsh"}
   "^.*zsh$"]]
 [:features {:value "foo,bar,bizz"} [:set :string]]]

The last remaining step is to transform this into a configuration map. We'll do this by creating a transformer using the default-value-transformer.

(def defaults-transformer
  (mt/default-value-transformer
   {:key :value
    :defaults {:map    (constantly {})
               :string (constantly "")
               :vector (constantly [])}}))

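We must define defaults for maps to populate nested maps in the configuration. Depending on your case, doing the same for vectors or even strings might be necessary. The :key option tells the transformer to look for default values under :value instead of the standard :default key.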

(m/decode [:map
           [:int {:value 1} :int]
           [:map :map]
           [:nested-map [:map [:a :string]]]
           [:vector [:vector :any]]
           [:string :string]]
          nil
          defaults-transformer)
;; =>
{:int 1
 :map {}
 :nested-map {:a ""}
 :vector []
 :string ""}

Then, we use this to load our configuration schema with the defaults.


(defn read-config
  ([filename]
   (read-config filename {:transformer defaults-transformer}))
  ([filename {:keys [transformer]}]
   (let [schema-data (aero/read-config (io/resource filename))
         schema (try (m/schema schema-data)
                     (catch Exception e
                       (throw (ex-info "Invalid configuration schema"
                                       {:malli/error (ex-data e)}))))
         config (m/decode schema nil transformer)]
     (when-not (m/validate schema config)
       (throw (ex-info "Invalid configuration"
                       {:value config
                        :error  (me/humanize (m/explain schema config))})))
     config)))

The last problem we need to solve is including the decoder functions. Without them, :features stays a string, and reading the configuration fails.

1. Unhandled clojure.lang.ExceptionInfo
   Invalid configuration
   {:value {:user "tvaisanen", :shell "/bin/zsh", :features "foo,bar,bizz"},
    :error {:features ["invalid type"]}}
                      REPL:  158  core/read-config
                      REPL:  148  core/read-config
                      REPL:  167  core/eval17934

Extend Aero Readers

Let's create our reader #decoder ... to reference the custom decoder functions.

(require '[aero.core :as aero])

(defmethod aero/reader 'decoder
  [opts tag symbol-pointing-to-a-fn]
  (if-let [decoder-fn (resolve symbol-pointing-to-a-fn)]
    decoder-fn
    (throw (ex-info "Can't load decoder function"
                    {:fn symbol-pointing-to-a-fn}))))
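
The reader only turns a symbol into a var, so the function it points at has to exist in a loaded namespace before the file is read. Below is a sketch of that lookup without Aero. `decode-set-of-strings` is the decoder that appears in the final listing; `resolve-decoder` is a hypothetical helper that mirrors the reader's resolve-or-throw logic.

```clojure
(require '[clojure.string :as str])

;; the decoder the configuration file will point at
(defn decode-set-of-strings [string-value]
  (set (str/split string-value #",")))

;; hypothetical helper: same lookup as the reader, minus the Aero plumbing
(defn resolve-decoder [sym]
  (or (resolve sym)
      (throw (ex-info "Can't load decoder function" {:fn sym}))))

;; syntax-quote qualifies the symbol with the current namespace
(def decoder (resolve-decoder `decode-set-of-strings))

(var? decoder)           ;; => true
(decoder "foo,bar,bizz") ;; => #{"foo" "bar" "bizz"}
```

Note that `resolve` returns a var rather than a plain function. Vars can be called like functions; if some code insists on `fn?`, dereferencing the var yields the function itself.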

Now, we can add #decoder config/decode-set-of-strings to the schema file.

[:map
 [:user
  {:value #env USER}
  [:string
   {:error/message "At least 5 characters long"
    :min 5}]]
 [:shell
  [:re
   {:error/message "String needs to end with `zsh`"
    :value #env SHELL}
   "^.*zsh$"]]
 [:features
  {:value "foo,bar,bizz"
   :decode/string #decoder config/decode-set-of-strings}
  [:set :string]]]

And that's it! We can now load the configuration from the schema file by decoding the values.

(read-config "sample-02.edn")
;; => 
{:user "tvaisanen"
 :shell "/bin/zsh"
 :features #{"foo" "bar" "bizz"}}

The Final Result

Here's everything put together.

(ns config
  (:require [aero.core :as aero]
            [clojure.java.io :as io]
            [clojure.string :as str]
            [malli.core :as m]
            [malli.error :as me]
            [malli.transform :as mt]))

(def defaults-transformer
  (mt/transformer
   (mt/default-value-transformer
    ;; look up default values under the key :value
    {:key :value
     :defaults {:map    (constantly {})
                :string (constantly "")
                :vector (constantly [])}})
   ;; use string-transformer to "enable" string decoders
   (mt/string-transformer)))

(defn decode-set-of-strings [string-value]
  (set (str/split string-value #",")))

(defmethod aero/reader 'decoder
  [opts tag symbol-pointing-to-a-fn]
  (if-let [decoder-fn (resolve symbol-pointing-to-a-fn)]
    decoder-fn
    (throw (ex-info "Can't load decoder function"
                    {:fn symbol-pointing-to-a-fn}))))

(defn read-config
  ([filename]
   (read-config filename {:transformer defaults-transformer}))
  ([filename {:keys [transformer]}]
   (let [schema-data (eval (aero/read-config (io/resource filename)))
         schema (try (m/schema schema-data)
                     (catch Exception e
                       (throw (ex-info "Invalid configuration schema"
                                       {:malli/error (ex-data e)}))))
         config (m/decode schema nil transformer)]
     (when-not (m/validate schema config)
       (throw (ex-info "Invalid configuration"
                       {:value config
                        :error  (me/humanize (m/explain schema config))})))
     config)))

(read-config "sample-02.edn")
;; => 
{:user "tvaisanen" 
 :shell "/bin/zsh" 
 :features #{"foo" "bar" "bizz"}}

Conclusions

The benefit of having the validation and the configuration in the same file is that the definition, the defaults, and the validation rules all live in one place. It also makes it easier to keep validation practices consistent, since Malli takes care of the validation process. On the other hand, this can make the configuration more difficult to read, which might be a problem if the files need to be understood by people who don't know Clojure, for example, in DevOps.

If your project is already all in on Malli, this makes sense. The developers should already be familiar with the schema syntax, so there shouldn't be too much adoption friction.

I've never used this approach in production, but I'll try it out next time I need to configure a new project.

Once again, thanks for reading.

Feel free to reach out and let me know what you think—social links in the menu.
