Replacing booleans and enums with dates

It's imperative for a domain model to contain enough data to enforce domain invariants. A good domain model may capture even more information related to the business processes being handled by the system (it's a bit different when you use event sourcing, but it's out of the scope of this post). I believe that it is very useful to gather as much data as possible – as long as the related investment is not too high.

Additional data might be a valuable asset for analysis (e.g. data mining jobs). What's more, business rules may change over time. What is extra information today may become the basis of a new business rule tomorrow. If you persist such information from the beginning, you can use it to enforce such emerging invariants not only for new data, but also for historical data.

The value of temporal data

A stereotypical domain model is focused on the current state of the entities. Unless modeled explicitly, the history of the state changes is lost. In many cases it's perfectly OK, but you should never underestimate the value of temporal data.

You can apply various methods to keep track of changes in your domain model. Today I'd like to share a tiny little trick that can enrich your domain model at virtually no cost. Let's consider the following business domain fragment:

After registration, a customer has to fill in some additional details. After that, they can submit their account for activation. Administrators see all of the submitted customer accounts in their panel. After verification, they activate the account.

A typical implementation could look like the following:

public class Account {  
   private boolean active;

   // ...

   public void activate() {
       this.active = true;
   }

   public boolean isActive() {
       return active;
   }
}

What I have found valuable in many cases is replacing the boolean with a temporal data type. With proper encapsulation, it doesn't change the contract and is invisible to the collaborators of the entity:

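import java.util.Date;

public class Account {
   // null means the account has not been activated yet
   private Date activationDate;

   // ...

   public void activate() {
       // record when the transition happened, not just that it happened
       this.activationDate = new Date();
   }

   public boolean isActive() {
       return activationDate != null;
   }
}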

And that's it. Simple change, similar amount of code. But now you store more information in your database tables. Even if you don't have a real need for using it in the code, it can become a time-saver when troubleshooting some production issues. Whenever a business rule includes a state transition, I consider it a good candidate for using temporal types.
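To show how the stored timestamp can serve a rule that did not exist when the model was written, here is a minimal sketch. It is not the code from the system described here: it assumes `java.time.Instant` instead of `java.util.Date` and an injected `Clock` so that time can be controlled in tests. The `hasBeenActiveFor` method is a hypothetical example of such a later rule, and keeping the first activation time on repeated calls is my own choice for the sketch.

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;

class Account {
    // null means "not activated yet"; the field never leaves the class
    private Instant activatedAt;
    // Assumption: the clock is injected so tests can control the current time
    private final Clock clock;

    Account(Clock clock) {
        this.clock = clock;
    }

    void activate() {
        // Sketch-only choice: keep the first activation time on repeated calls
        if (activatedAt == null) {
            activatedAt = clock.instant();
        }
    }

    boolean isActive() {
        return activatedAt != null;
    }

    // Hypothetical rule introduced later: the account must have been
    // active for at least the given period
    boolean hasBeenActiveFor(Duration minimumAge) {
        if (activatedAt == null) {
            return false;
        }
        Duration age = Duration.between(activatedAt, clock.instant());
        return age.compareTo(minimumAge) >= 0;
    }
}
```

Because the activation time was persisted from day one, a rule like this would also work for accounts activated long before the rule was introduced. With a plain boolean, that information would simply not exist.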

Since storage space is very cheap nowadays, I wouldn't worry that a DATETIME column requires more space than a boolean. The real cost comes from writing slightly different queries (null checks instead of simple boolean comparisons). It's not a big deal, and in many situations the return on investment is very high.

What's more, you can always relatively easily migrate from date to boolean if you really need to. It's much harder (or often impossible) when you start with boolean and at some point realize that you need some additional information.

Unexpected insight

I know a great example of a situation where applying this simple trick paid off significantly. In that system, users could request an instance of the analytic platform to be installed for them. After receiving a request from the user, administrators would start the installation process. It could take some time, during which the end user would see an "in progress" status.

The initial domain model consisted of an Instance entity with a status field. It could have one of the following values: REQUESTED, IN_PROGRESS, DONE.

After learning the trick with using dates, the team converted status into three date fields: installationRequestedAt, installationStartedAt and installationFinishedAt.
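I don't have the team's actual code, so the following is only a sketch of what such a conversion might look like. The three field names come from the description above; the constructor, the method names and the guard clauses are my assumptions. The point is that the old status can still be derived from the dates, so collaborators of the entity need not notice the change.

```java
import java.time.Instant;
import java.util.Objects;

class Instance {
    enum Status { REQUESTED, IN_PROGRESS, DONE }

    private final Instant installationRequestedAt;
    private Instant installationStartedAt;  // null until the installation starts
    private Instant installationFinishedAt; // null until the installation finishes

    Instance(Instant requestedAt) {
        this.installationRequestedAt = Objects.requireNonNull(requestedAt);
    }

    // Hypothetical transition method; the caller supplies the current time
    void startInstallation(Instant now) {
        if (installationStartedAt != null) {
            throw new IllegalStateException("installation already started");
        }
        installationStartedAt = Objects.requireNonNull(now);
    }

    // Hypothetical transition method; only a started installation can finish
    void finishInstallation(Instant now) {
        if (installationStartedAt == null) {
            throw new IllegalStateException("installation not started");
        }
        if (installationFinishedAt != null) {
            throw new IllegalStateException("installation already finished");
        }
        installationFinishedAt = Objects.requireNonNull(now);
    }

    // The former enum field becomes a value derived from the timestamps
    Status status() {
        if (installationFinishedAt != null) {
            return Status.DONE;
        }
        if (installationStartedAt != null) {
            return Status.IN_PROGRESS;
        }
        return Status.REQUESTED;
    }
}
```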

After some time, a new idea emerged. Some changes were made to the infrastructure, and the team expected them to reduce the time of the installation process. It was valuable to compare statistics for installations before and after the changes. With the enhanced model in place, it was easy to calculate the mean time between submitting a request and the end of the installation. Such statistics could be calculated for both new and historical data. It gave valuable insight into the real effect of the introduced changes.
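A calculation of this kind could be sketched as follows. This is an illustration under my own assumptions, not the team's implementation: `InstallationStats`, the `Installation` holder and `meanRequestToFinish` are hypothetical names, and in a real system the numbers would more likely come from a database query.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.Optional;

final class InstallationStats {

    // Hypothetical read-only view of the two timestamps needed here
    static final class Installation {
        final Instant requestedAt;
        final Instant finishedAt; // null while the installation is not finished

        Installation(Instant requestedAt, Instant finishedAt) {
            this.requestedAt = requestedAt;
            this.finishedAt = finishedAt;
        }
    }

    // Mean time from request to finish; empty if nothing has finished yet
    static Optional<Duration> meanRequestToFinish(List<Installation> installations) {
        long totalMillis = 0;
        int finished = 0;
        for (Installation installation : installations) {
            if (installation.finishedAt == null) {
                continue; // unfinished installations do not count
            }
            totalMillis += Duration.between(installation.requestedAt, installation.finishedAt).toMillis();
            finished++;
        }
        if (finished == 0) {
            return Optional.empty();
        }
        return Optional.of(Duration.ofMillis(totalMillis / finished));
    }
}
```

To compare the periods before and after an infrastructure change, you could split the installations by their request time and run the same calculation on each group.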

What started as a simple refactoring eventually turned out to bring a great reward for the developers:

I will never use a boolean in my model again ;) — Paweł Wacławczyk (@pawaclawczyk)

Of course, there is more code to write when converting an enum field with multiple values into several dates than when converting a single boolean value into a single date. In some cases the cost might be too high, but I believe that you'll find many places where the cost-benefit ratio is favorable.

Summary

The amount of business information you gather in your system is one of the metrics of how good the domain model is. Whenever a business rule involves some state transitions, you may find it valuable to persist the time of the event. In many cases it can be done at virtually no additional cost. Such temporal data becomes an asset when analyzing production issues. What's more, it can enable additional ways for data analysis and bring some unexpected value in the future. I hope you will find this trick useful in your projects!

Update (07 Jun 2015)

The original message was that it's worth keeping track of changes over time (as opposed to storing just the current state).

The actual way of storing such data is a separate concern. For example, with event sourcing you can take those concepts to another level. But even without it you can introduce an explicit model that would contain all the data.

The solution described in the post is just a tip for the simplest cases. The bottom line is that you can keep the model relatively simple and still record more data. If the scenario gets more complex, though, you will definitely need a more elaborate approach to modeling your domain (such as introducing additional entities).

One of the readers expressed concern about using null values. Please note that I put emphasis on strict encapsulation. The nullable field is invisible from outside the class. Of course, nulls are still present in the database. I have never seen that become a real problem, though. But if it is an issue for you or your operations team, you could either give up storing the date or prepare a more complex model. As always, we need to evaluate the investment against the potential benefit.
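If some collaborator eventually needs the date itself, one possible way to keep the null inside the class is to expose it as an `Optional`. This is only a sketch under my own assumptions (`java.time.Instant` instead of `Date`, the time passed in by the caller, a hypothetical `activatedAt()` accessor), not something the original model contained.

```java
import java.time.Instant;
import java.util.Objects;
import java.util.Optional;

class Account {
    // Nullable on purpose, but never handed out as a raw reference
    private Instant activatedAt;

    void activate(Instant now) {
        this.activatedAt = Objects.requireNonNull(now);
    }

    boolean isActive() {
        return activatedAt != null;
    }

    // Hypothetical accessor: callers get an Optional and never see a null
    Optional<Instant> activatedAt() {
        return Optional.ofNullable(activatedAt);
    }
}
```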

