Jack-in-the-box code
In my previous article I explained why it's time to move away from using string as the type of things in your code just because they're meant to contain text. That article was more of an appetizer; most of what matters is repeated here, so feel free to continue reading even if you haven't read the previous one.
Jack-in-the-box, the toy
Jack-in-the-box is a 14th-century children's toy that looks like a box with a crank that can be turned to play music. Keep turning the crank and eventually something pops out to startle you. The good ones pop out randomly, as opposed to at the end of the song.
Jack-in-the-box, the code
I like to think of loosely typed variables (such as an email field typed as string) as little jacks-in-the-box in your code. Most of the time the box plays a beautiful song, but turn that crank for long enough and you hit an edge case, and a nasty exception springs out of the box because that email string turned out to be an empty string. After all, that's why you take your laptop when going on vacation.
Your vacation deserves better
Just like the better quality versions of jack-in-the-box, invalid content in your domain usually breaks at unexpected places. This is not a problem with your architecture. Most (if not all) of your code assumes that an email field is not an empty string, is not multiline, does not contain spaces, etc., because assuming the opposite means validating everywhere, which is just not realistic.
Of course you can just validate the known entry points such as user input, but start adding shady APIs and morally flexible document DBs into the mix, and soon enough you'll have more "validation" code than "code" code in your solution.
Goodbye Jack, you won't be missed
Saying goodbye to Jack is easy, and the technique even has a name: making illegal states unrepresentable. It's a mouthful, but as far as mouthfuls go, it's one of the most important ones I've come across. It means that if a thing is not supposed to be a different thing, design your domain so it can't possibly be that different thing.
Think about it: you type integer fields as int and text fields as string precisely so that an integer doesn't turn out to be a string when you need to use it. But why settle for this rudimentary safety? If an email is always different from a country code, why give them the same type? They are different types in real life; it's time for them to be different types in your code.
Don't just do it in the one place where you expect errors, do it everywhere. A string is never a string. Look closely and that endpoint that expects a string probably actually expects a string no longer than 32,768 characters with no tabs, so why use string when there's a readily available 32KnoTabString type for you to use, or at least there will be after you create it.
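Here is a rough sketch of what such a type could look like in F#. The name NoTabString32K is my own stand-in (an F# identifier cannot start with a digit, so 32KnoTabString would not compile as written), and the send function is a hypothetical endpoint that exists only to show the type in a signature. Treat it as an illustration of the idea under those assumptions, not as a finished design.

```fsharp
// Hypothetical sketch: NoTabString32K stands in for the 32KnoTabString mentioned above,
// because an F# identifier cannot start with a digit.
type NoTabString32K = private NoTabString32K of string with
    // Returns Some only for non-null text of at most 32,768 characters with no tabs.
    static member New (s: string) =
        if isNull s then None
        elif s.Length > 32768 then None
        elif s.Contains("\t") then None
        else Some (NoTabString32K s)
    member x.Value = let (NoTabString32K v) = x in v

// Hypothetical endpoint: the signature now says what it accepts,
// so the body does not need to re-check length or tabs.
let send (payload: NoTabString32K) = payload.Value.Length
```

Because the union case is private, code outside the defining module can only obtain a value through New, so every NoTabString32K that reaches send has already passed the checks.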
Ok, it requires some code
So the idea is to create types. Lots of types. Creating lots of types might not sound appealing depending on your programming language of choice, but this is absolutely a language limitation, not a technical one.
This is how little code is necessary to declare a new type that represents non-empty strings in F#, ready to replace all your strings that were probably never meant to be empty anyway. Good riddance.
// Non-empty string, single-case union style
open System

type ActualText = private ActualText of string with
    static member New = function
        | s when String.IsNullOrEmpty s -> None
        | s -> Some (ActualText s)
    member x.Value = let (ActualText s) = x in s
// DISCLAIMER:
// Meant to illustrate the point above, it's not particularly good code
While this is more verbose than not declaring anything and using strings everywhere, think of all the exceptions these 5 lines of code prevent, and all the content-checking and validation code they render useless.
It's usually cheaper (both in lines of code and in potential errors) to fix the problem at the source, and the source is your domain, where you define what a thing is.
Tell it like it isn't
So it all boils down to making domains more explicit about what things are (an email is a string) and aren't (an email is not a multiline string), and this is done using types with all the necessary validation embedded into them. The concept is simple, and so is the code. Let's take a look at one way (of many possible ways) to define an Email type using object oriented style:
// embedding validation in the type itself (object oriented style)
open System
open System.Text.RegularExpressions

type Email private (s: string) =
    static member Validate = function
        | s when String.IsNullOrWhiteSpace s ->
            Error "input is empty"
        // anything other than word characters, '+', '.', '@' and '-' is rejected
        | s when Regex.IsMatch(s, @"[^\w+.@-]") ->
            Error "input contains invalid characters"
        // something@name.tld, with the dot escaped so it only matches a literal dot
        | s when Regex.IsMatch(s, @"^[^@]+@\w+\.\w+$") ->
            Ok (Email(s))
        | _ -> Error "invalid email"
    member x.Value = s
Note that here a conscious choice was made to have more validation cases than necessary in order to have more meaningful errors. If brevity is your concern you can always have a single validation case with a universal "invalid" error message. It's still safer than using strings, albeit not particularly user friendly.
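For comparison, the single-case variant could look something like the sketch below. TerseEmail and its regular expression are my own assumptions for illustration, not a recommended email pattern; the regex is deliberately simplistic and uses \z so that a trailing line break cannot sneak past the end anchor.

```fsharp
open System
open System.Text.RegularExpressions

// Hypothetical terse variant: one check, one universal error message.
type TerseEmail private (s: string) =
    static member Validate (input: string) =
        // Simplistic pattern, assumed for illustration only.
        if not (isNull input) && Regex.IsMatch(input, @"^[\w+.-]+@\w+\.\w+\z") then
            Ok (TerseEmail input)
        else
            Error "invalid"
    member x.Value = s
```

The caller gets the same guarantee as before (a TerseEmail always holds text that passed the check), only with less information about what went wrong.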
Using your shiny new types
These types require some care. Remember, we got rid of jack-in-the-box, but we still have a box, and we still don't know what's in it until we open it. Call it an appropriately labeled container. Or call it Result. You may still find Result less convenient to use than string bindings, but consider the following two things:
- Result will never blow up in your face (this is a good thing)
- Functional languages have things that start with "M..." but shall not be named that allow you to write almost exactly the same code that you would using strings (also a good thing)
I'm not going to go deeper into the topic of Result; it's a huge topic and beyond the scope of this article. For now we'll just use a plain old match expression (POME) to unpack that box. Turns out this particular one would've blown up in our face. Bloody linebreakers...
// safely create and consume an email using embedded validation
// (consumeValidEmail stands for whatever function of yours needs the address)
let result = Email.Validate "do\n@syme.fs"
match result with
| Ok x -> consumeValidEmail x.Value
| e -> printfn "%A" e
// OUTPUT> Error "input contains invalid characters"
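If the match expression feels heavy, the standard Result.map and Result.bind functions from FSharp.Core give a taste of the string-like convenience mentioned earlier. The sketch below is self-contained, so it uses a deliberately tiny stand-in for Email, and domainOf, requireCompanyDomain and the "example.org" rule are all hypothetical names and rules I made up for the example.

```fsharp
open System

// Hypothetical stand-in for the Email type, kept tiny on purpose.
type Email = private Email of string with
    static member Validate (s: string) =
        if String.IsNullOrWhiteSpace s then Error "input is empty"
        elif s.Contains("@") then Ok (Email s)
        else Error "invalid email"
    member x.Value = let (Email v) = x in v

// Hypothetical consumers: they take a validated Email, never a raw string.
let domainOf (email: Email) =
    email.Value.Substring(email.Value.IndexOf('@') + 1)

let requireCompanyDomain (email: Email) =
    if domainOf email = "example.org" then Ok email
    else Error "not a company address"

// bind chains a step that can fail, map chains one that cannot.
// The first Error short-circuits everything after it.
let describe (input: string) =
    Email.Validate input
    |> Result.bind requireCompanyDomain
    |> Result.map domainOf
```

The pipeline reads almost like the code you would write against plain strings, except that nothing downstream runs on input that failed validation.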
There will be blocks
There's not a lot of code in this article, and it's not particularly good code either, but hopefully it's enough to illustrate the concept of designing with types. The next article has more and better code.