Care and Worry about State Transitions

Posted on July 16, 2012

Ever had to try to explain why a certain part of a system is in a particular state and not been able to say what, why or when the data change happened? I have. Too many times. And it is all my own fault. I have contributed to the ‘update culture’ of software development by writing systems that overwrite state without adequate supervision or control.

As developers we care enough about state transitions to develop fairly complex systems to keep track of all the changes, additions and deletions we make to our own data: the source code. These systems allow us to control, monitor and roll back and forwards any changes we want. We call them source control systems and everyone likes and relies on them.

What is surprising is that many of the systems we write and keep in source control show a flippant, almost reckless disregard for state transitions. We have a (now) derogatory acronym for data-centric software that is primarily designed to store and change data freely: CRUD. CRUD applications are often just a thin layer on top of some kind of database, and I think a lot of the blame for the ‘update culture’ lies with databases like SQL Server that encourage developers to store and mutate data. In addition to allowing free mutation, these systems rarely capture the intent of the user, so answering the ‘why’ question is almost always impossible. Capturing intent needs to be done at the code level, in the domain, in the language of the domain.

Why aren’t the systems we keep in source control as concerned with state changes as the source control systems we use? We want to know the when-what-who’s of every change that is applied to the code base. We want to look at the history and understand the present, how we got there, who made the changes and what they were. Why don’t all systems, particularly those with a collaborative domain, do a similar thing? Ignorance? Laziness? Resistance to change? It has got to be something like that.

How to show that you care about state transitions

Ok so you are convinced that the ‘update culture’ is evil and you want to start paying a bit more attention to your state transitions. How can you do this? Where do you start?

Command Query Separation

CQS (Command Query Separation, Bertrand Meyer) is a principle that states that a method on a class should not change the state of the application if it returns a value and, inversely, should not return a value if it changes state. Methods that change state are called ‘Commands’ and methods that return results are called ‘Queries’.

By applying this principle to object-oriented code you are forced to pay special attention to the consequences of your code, which in turn reduces the chance of accidentally introducing a state transition. It also makes your code easier to read and use, as the separation is evident in the structure of the code.

The simplest C# example would be to rewrite the humble property using ‘normal’ methods. A getter is a query and a setter is a command. Because the syntactic sugar of C# properties creates this separation for us it may be that C# developers aren’t as tuned into making other methods adhere to the CQS principle (pure speculation).

public class CommandQuerySeparation
{
    private string state = string.Empty;

    // Command: changes state, returns nothing.
    public void SetState(string value)
    {
        this.state = value;
    }

    // Query: returns a value, changes nothing.
    public string GetState()
    {
        return this.state;
    }
}
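The property example above already obeys the principle, so it may help to see a method that does not. What follows is only a sketch: the `TicketCounter` class and its member names are made up for illustration. `TakeNext` both returns a value and changes state, while `PeekNext` and `Advance` split the same behaviour into a query and a command.

```csharp
using System;

// Hypothetical example: a counter that hands out ticket numbers.
public class TicketCounter
{
    private int next = 1;

    // Violates CQS: returns a value AND changes state.
    public int TakeNext()
    {
        return next++;
    }

    // Query: reports the upcoming number without changing anything.
    public int PeekNext()
    {
        return next;
    }

    // Command: changes state and returns nothing.
    public void Advance()
    {
        next++;
    }
}
```

With the split version, calling `PeekNext` twice in a row gives the same answer, and the only place the state can move is the explicit call to `Advance`.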

Versioning

The easiest way to avoid losing earlier state in a database is never to overwrite the data of any record. For example, every time a persistent object is modified and saved, a completely new record is added to the database table instead of overwriting the old one. The new record is marked as current and is used for all view queries, and the old record is left primarily for audit purposes. If you ever have to roll back to a previous state you only change which record carries the current flag. You also have a full change history of the entire record.

This requires some care to ensure the correct record is marked as current, but that is a fairly trivial task and is probably best handled by your persistence abstraction. In a domain where records change a lot you may end up with a significantly larger database than you would otherwise have had, but this is rarely a massive problem. At least, having a large database is probably a smaller problem than not having the entire change history readily at hand when things go wrong.
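To make the idea concrete, here is a minimal in-memory sketch of the versioning approach described above. Everything in it is an assumption made for illustration: the `CustomerVersion` row shape, the `VersionedCustomerTable` class and its method names are hypothetical, and a real implementation would sit behind your persistence layer and talk to an actual database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row shape: one row per saved version of a customer.
public class CustomerVersion
{
    public int CustomerId { get; set; }
    public int Version { get; set; }
    public string Name { get; set; }
    public bool IsCurrent { get; set; }
    public DateTime SavedAtUtc { get; set; }
}

// In-memory stand-in for a database table. Rows are added, never removed or
// rewritten; the only thing that ever changes on a row is the IsCurrent flag.
public class VersionedCustomerTable
{
    private readonly List<CustomerVersion> rows = new List<CustomerVersion>();

    // Command: append a new version and mark it as the current one.
    public void Save(int customerId, string name)
    {
        var history = rows.Where(r => r.CustomerId == customerId).ToList();
        foreach (var old in history) old.IsCurrent = false;

        rows.Add(new CustomerVersion
        {
            CustomerId = customerId,
            Version = history.Count + 1,
            Name = name,
            IsCurrent = true,
            SavedAtUtc = DateTime.UtcNow
        });
    }

    // Command: roll back by moving the current flag, not by deleting rows.
    public void RollBackTo(int customerId, int version)
    {
        var history = rows.Where(r => r.CustomerId == customerId).ToList();
        if (!history.Any(r => r.Version == version))
            throw new ArgumentException("No such version.", nameof(version));

        foreach (var row in history) row.IsCurrent = (row.Version == version);
    }

    // Query: the version used for all normal reads.
    public CustomerVersion GetCurrent(int customerId)
    {
        return rows.Single(r => r.CustomerId == customerId && r.IsCurrent);
    }

    // Query: the full change history, oldest first.
    public IReadOnlyList<CustomerVersion> GetHistory(int customerId)
    {
        return rows.Where(r => r.CustomerId == customerId)
                   .OrderBy(r => r.Version)
                   .ToList();
    }
}
```

Saving the same customer twice and then calling `RollBackTo(id, 1)` leaves both rows in place; only the flag moves, so the history still answers the 'what' and 'when' questions afterwards.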

Event Sourcing

Event Sourcing is an approach to persistent state that has gained traction primarily in the last few years. It is the one that is most similar to the source control systems discussed above and could also be seen as a more mature variant of ‘Versioning’. Instead of applying updates to the persisted data, each state transition is modelled as an event containing the data required to apply that transition. This event object is what is persisted, to an append-only event stream. Mistakes are rectified by taking compensating action, much like a double-entry accounting ledger (reverse and repost).

The state of a particular object is re-hydrated from the stored event stream each time it is required. Reads are typically handled separately (see CQRS) from some form of persistent view store.
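The mechanics can be sketched in a few lines. This is a toy illustration under my own assumptions rather than a recommended design: the bank account domain, the `Deposited` and `Withdrawn` events, and the `EventStream` and `Account` classes are all hypothetical names, the stream lives in memory, and the code assumes C# 6 or later.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical events; each one carries the data needed to apply it.
public abstract class AccountEvent { }

public sealed class Deposited : AccountEvent
{
    public Deposited(decimal amount) { Amount = amount; }
    public decimal Amount { get; }
}

public sealed class Withdrawn : AccountEvent
{
    public Withdrawn(decimal amount) { Amount = amount; }
    public decimal Amount { get; }
}

// Append-only stream: events can be added and read, never edited or removed.
public class EventStream
{
    private readonly List<AccountEvent> events = new List<AccountEvent>();

    public void Append(AccountEvent e) { events.Add(e); }

    public IEnumerable<AccountEvent> ReadAll() { return events.AsReadOnly(); }
}

public class Account
{
    public decimal Balance { get; private set; }

    // Re-hydrate: start from an empty account and replay every event in order.
    public static Account Rehydrate(IEnumerable<AccountEvent> history)
    {
        var account = new Account();
        foreach (var e in history) account.Apply(e);
        return account;
    }

    private void Apply(AccountEvent e)
    {
        var deposited = e as Deposited;
        if (deposited != null) { Balance += deposited.Amount; return; }

        var withdrawn = e as Withdrawn;
        if (withdrawn != null) { Balance -= withdrawn.Amount; return; }

        throw new InvalidOperationException("Unknown event type: " + e.GetType().Name);
    }
}
```

If a deposit of 50 was recorded by mistake when it should have been 5, nothing is edited: you append a `Withdrawn(50)` to reverse it and a `Deposited(5)` to repost it. Replaying the stream gives the corrected balance, and the mistake and its correction both remain visible in the history.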

Use Functional Languages

In Functional Programming (FP), state transitions (or side-effects) already have an elevated status. When creating side-effects the order of execution is extremely important, as the outcome of a state transition potentially depends on all the previous states of the system. Some functional languages require special constructs to produce and sequence side-effects (see the IO monad in Haskell). Functions in FP are normally (and preferably) pure, i.e. they do not produce any side-effects, and this encourages you to keep the majority of your code pure and to cause side-effects only when you really need to. Functional languages that do allow mutable state often require a special keyword for it (see the mutable keyword in F#).

Functional languages force you to think about state transitions and about how state is managed, and more often than not they discourage mutation altogether.
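C# is not a functional language, but some of the same discipline can be approximated in it. The sketch below is my own illustration, assuming C# 6 or later; the `Temperature` type is a hypothetical name. It has no setters, so an instance cannot change after construction, and `Add` is pure: it returns a new value instead of mutating the existing one.

```csharp
using System;

// Hypothetical immutable value: no setters, so no state transition
// can happen to an instance after it has been constructed.
public sealed class Temperature
{
    public Temperature(double celsius) { Celsius = celsius; }

    public double Celsius { get; }

    // Pure: the result depends only on the inputs and nothing is modified.
    public Temperature Add(double delta)
    {
        return new Temperature(Celsius + delta);
    }
}
```

Unlike in F#, nothing in C# makes this the default, so it remains a convention you have to choose and keep to.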

Code Contracts

C# Code Contracts (see Design by Contract, Bertrand Meyer) can be used to control certain aspects of state to ensure that it never enters a state that the system considers invalid. In particular, code contract invariants can be declared that prevent certain properties of an object from being assigned invalid values. For example, an invariant can be used to stop a particular field from ever becoming null.

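using System.Diagnostics.Contracts;

public class InvariantExample
{
    private string data = string.Empty;

    public string Data
    {
        get { return data; }
        set { data = value; }
    }

    // Holds all invariants for this class. They are only checked at
    // runtime when the Code Contracts rewriter is enabled for the build.
    [ContractInvariantMethod]
    private void Invariants()
    {
        // The data field must never become null.
        Contract.Invariant(data != null);
    }
}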

Code Contracts encourage you to think about what the permitted values of state need to be, and to enforce the policies you create in code.

So?

We need state transitions, as without them there would be no systems. But because state is the defining characteristic of most systems, it should be treated with a massive amount of care and attention and should only be allowed to change when absolutely necessary. I generally don’t like gold-plating code, but when it comes to state transitions it can almost be justified, if for no other reason than to make people notice that it is happening.

There are plenty of ways you can elevate the status of the humble state transition and make your code less error-prone and easier to reason about. I hope this post has given you some options, and some reasons why we need to care more.

Posted in: Patterns