Status is one of the most common concepts you'll think about when designing any system. Whether you're managing users, products, orders, flights, patients, job applications, emails - I could go on forever - you can probably think of at leat a couple of statuses that can be applied to any of those; and odds are, any database you used or had access to in the past, had at least one column named status
in some table. It just sounds natural. What is the current status to this particular thing?
The thing is, a status usually requires extra context to make sense within the system it's being used in and that status is often used as a condition for a certain piece of logic.
Take the following example of a table where specific fruits are being stored.
id (int) | fruit_id (int) | status (varchar) |
---|---|---|
1 | 1 (apple) | ripe |
2 | 2 (banana) | underripe |
3 | 2 (banana) | expired |
4 | 3 (pear) | ripe |
NOTE: I've added the fruit names to the fruit_id
column just so they're easier to tell apart. In practice, this column would simply contain a foreign key to a fruits
table.
A 'status' column in this scenario seems pretty useful. You can easily check how many fruit items are ripe, expired, etc, or a breakdown by status for each fruit. So far, so good. We'll come back to this in a bit.
Let's check another example. Imagine a table for users that have signed up to your platform. The thing is, it's a very exclusive platform and only you get to approve who can get access to it. You've also implemented a simple script to automatically archive any users that haven't logged in in a long time. So you've ended up with some like this.
id (int) | name (varchar) | status (varchar) | last_login_at (timestamp) |
---|---|---|---|
1 | Grant | approved | 2025-03-22 22:16:41 |
2 | Ellie | pending | 2025-03-16 12:16:41 |
3 | Hammond | banned | 2023-03-17 10:16:01 |
4 | Malcolm | pending | 2025-02-12 05:16:41 |
5 | Gennaro | archived | 2024-12-12 05:16:41 |
In this scenario, a 'status' column seems pretty useful too, right? You can grant access to users by checking if their status is approved
; or automatically redirect a user to a specific page if they're pending
. And this would work. Great. But what if you need to revert to a previous status?
By definition, the status of something will eventually change; that's why you have different statuses. But depending on what you're dealing with, the way a status can change can also vary. Let's go back to the fruit scenario.
For any given fruit, you know that there's basically a linear timeline for all the statuses. Each fruit starts at Underripe
, eventually becomes Ripe
and finally Expired
(let's keep it simple). You can never revert to a previous status because this timeline is set in stone (that's life for you...). So in this particular case, a status column is all you need.
In the user scenario though, statuses can be reverted. An archived
user could get in touch with you and ask you to reactivate their account so you do that. But then... how do you know which status to revert to? If they were approved
than they should regain immediate access to the platform but if they were pending
then some other logic should apply. Maybe you could save their previous status in another column. And maybe that's all you'd need here but in a more complex system you'd quickly find yourself in a scenario where you need to revert to the previous status again. And now what? Another option could be to simply define a safe fallback status that "won't cause any trouble", like pending
(or so you're hoping...). But then if the user was approved, they'll have to wait to get approved again. And that just sounds like poor user experience to me.
The real issue in this example is that by only having a status column you are losing information. Two different users could have the same exact status but have different logic applied to them because of previous statuses. How do we fix this then?
Let's imagine you've decided that, in your platform, bans are for life so if a user gets banned, they'll lose access no matter what. They may have done something really bad, maybe even ilegal, therefore, it is very important to know if a user has been banned regardless of their current status. However unlikely, in a scenario where a user account that had previously been banned now has the approved
status (e.g. maybe it was human error, maybe a script updated the wrong acounts) the system has no way of knowing that and will grant them access. Their account's status is approved
after all, so that makes sense, right? And to tie in the previous example, what if the owner of an archived
account that was previously pending
(but we don't have that information) requests a reactivation of the account?
Let's start with the obvious and add the missing information. We'll add new columns to reflect whether a user has been approved and/or banned. We'll call it approved_at
and banned_at
and we'll make them timestamps instead of a boolean so we can also store the when. So we end up with something like this.
id (int) | name (varchar) | status (varchar) | approved_at (timestamp) | banned_at (timestamp) |
---|---|---|---|---|
1 | Grant | approved | 2025-03-01 19:18:10 | null |
2 | Ellie | pending | null | null |
3 | Hammond | banned | 2025-02-12 05:16:41 | 2025-02-13 15:11:31 |
4 | Malcolm | archived | null | null |
5 | Gennaro | archived | 2024-12-12 05:16:41 | null |
We've just introduced two new variables in the system that will always have to be checked from now on and that represents extra logic. True. But we've also solved the main issue. We've guaranteed that a banned user will always be banned no matter what as well as preventing any user from being assigned a status that they're not supposed to have.
But it gets better. Because the two new columns provide all the contex that we need you no longer need the status column. You now have essentially computed statuses. Regardless of a user's current status you can check at any time whether they've been banned before (e.g. in case you need to keep a blacklist of sorts?), or whether they've been archived at some point, etc. You're no longer losing any information and all your statuses can now be calculated using the new columns. "Creating" a new status is also possible: if, for instance, you want to permantely prevent an approved user from accessing something if they've ever been banned before, you can do so; because you have the data. However, if all you have is a status column then what are you going to call this status, approved_but_previously_banned? That doesn't sound right...