14 Oct 2019

How Do Software Systems Become Complex and What Can You Do to Prevent It

Why does everything have to be so complicated?
Wouldn't it be nice, for once, to have a simple and clean system that you can run or make small changes to?

But how do systems get complex to begin with, and how can we keep them from reaching that stage?

To answer that, we need a common language, so let me explain a concept called "crow epistemology". Epistemology is the branch of philosophy that deals with how knowledge is acquired. The crow part refers to an experiment done with crows many decades ago:
The experiment was conducted to ascertain the extent of the ability of birds to deal with numbers. A hidden observer watched the behavior of a flock of crows gathered in a clearing in the woods. When a man came into the clearing and went on into the woods, the crows hid in the tree tops and would not come out until he returned and left the way he had come. When three men went into the woods and only two returned, the crows would not come out: they waited until the third one had left. But when five men went into the woods and only four returned, the crows came out of hiding. Apparently, their power of discrimination did not extend beyond three units--and their perceptual-mathematical ability consisted of a sequence such as: one-two-three-many.

We humans are also limited in the number of things we can hold in our heads at any one time. Here lies the (human) issue with complexity. To make computer systems less complex, we need to take steps so that only a small number of things or concepts take up space in our brains at any one time.

Let's look at a few types of complexities:

Cyclomatic complexity

Cyclomatic Complexity is a quantitative measure of the number of linearly independent paths through a program's source code. It was developed by Thomas J. McCabe, Sr. in 1976.
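
To make the measure a bit more concrete, here is a minimal sketch that approximates it by counting decision points in a piece of Python source and adding one. The function name `approximate_complexity`, the choice of which node types count as decisions, and the `ship` sample are all assumptions made for illustration; real tools use more detailed rules.

```python
import ast

# Node types treated as decision points in this simplified count (an assumption).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def approximate_complexity(source: str) -> int:
    """Return 1 plus the number of decision points found in the source."""
    tree = ast.parse(source)
    decisions = 0
    for node in ast.walk(tree):
        if isinstance(node, DECISION_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' has three operands, so it adds two extra branches.
            decisions += len(node.values) - 1
    return decisions + 1


# Hypothetical sample: one if, one elif (parsed as a nested if) and one 'and'.
SAMPLE = '''
def ship(order):
    if order.paid and order.in_stock:
        return "ship"
    elif order.paid:
        return "backorder"
    return "wait"
'''

print(approximate_complexity(SAMPLE))  # prints 4
```

Three decision points give a score of 4, which is already close to the number of things a reader can comfortably keep in mind.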

  1. IF statements. Too many IF statements make an application complex. Each 'path' of an IF statement needs to be 'rendered' in your brain before you have an overview of what the application will do. In extreme cases, you reach the infamous 'pyramid of doom'. If your brain can only hold (on average) 5-7 things, then the limit is either two variables in the IF statements or, at most, three boolean variables (2 options, true/false, to the power of 3 is 8 paths). Anything more than that can be considered complex and will force people to stare at the screen for many minutes whenever they go over that code.
    1. Decision Tables. An exception to this may be a decision table, where those paths are 'pre-rendered' and are therefore slightly easier to understand. But even with decision tables, once there are too many options you find yourself going down one path at a time, tracing the screen with your finger.
  2. Error Handling. Error handling can be seen as a subset of IF statements. Usually a class or function has a default way in which it expects to receive and process requests; once you add error handling to it, you get too many 'paths' and the code becomes messy.
    1. Personally, I am interested to see how 'design by contract' works for this use case, by off-loading error handling into other parts of the code. If my predictions are right, this could be what replaces a large chunk of unit tests in the future.
    2. It is also philosophically more in line with the original intent of object-oriented programming. A human can 'run' and 'eat', but a human also has limits on what they can eat and on which surfaces they can run. Specifying those constraints helps reduce complexity in other parts of the application.
  3. A function has too many lines of code. It is simply difficult to understand what is going on. Maybe the person who originally wrote it can understand it, but not anyone else who needs to make changes to that code.
  4. A class has too many functions
    1. A class has too much logic in it. Try using a Decision Object
  5. A class has too many dependencies. This overloads your cognition in a similar way to IFs, because you need to 'render' the dependencies in your brain to get an overview of what is going on.
    1. Too many parts in your system. As a general rule, too many moving parts make it difficult to figure out where issues are.
  6. Too many options for communicating with an API, interface or CLI. This isn't very obvious, but too many options are difficult to develop and maintain, and they also make it difficult for the user to understand how to use the interface. They also make it a more complicated dependency to interact with and test.
    1. Too many buttons on your website. A corollary of point 6 is a busy website that is too complicated for users to figure out.
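
As a rough illustration of the decision table idea from point 1, the sketch below replaces a nest of IF statements with a lookup in which every path is written out ahead of time. The shipping rules, the `DECISION_TABLE` name and the `decide` function are hypothetical and only meant to show the shape of the technique.

```python
# Hypothetical shipping rules keyed by (paid, in_stock, express).
DECISION_TABLE = {
    (True, True, True): "ship today",
    (True, True, False): "ship standard",
    (True, False, True): "backorder, notify customer",
    (True, False, False): "backorder",
}


def decide(paid: bool, in_stock: bool, express: bool) -> str:
    """Look up the outcome instead of walking through nested IF statements."""
    # Any combination that is not listed (here: every unpaid order)
    # falls back to a single default outcome.
    return DECISION_TABLE.get((paid, in_stock, express), "wait for payment")


print(decide(True, True, False))   # prints: ship standard
print(decide(False, True, True))   # prints: wait for payment
```

The paths are still all there, but they sit side by side in one place, so the reader scans rows instead of simulating branches.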

As a side note, trying to solve a problem that has too many possible decisions also counts as complexity for your brain. In cognitive science (but really more in math), this is called combinatorial explosion.
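
A small sketch of how quickly this grows, assuming each decision is an independent true/false flag (the `count_paths` helper is made up for this example): three flags give the 8 combinations mentioned earlier, and ten flags already give over a thousand.

```python
from itertools import product


def count_paths(num_flags: int) -> int:
    """Count every true/false combination of num_flags independent booleans."""
    return sum(1 for _ in product([True, False], repeat=num_flags))


for n in (1, 3, 5, 10):
    print(n, count_paths(n))
# prints:
# 1 2
# 3 8
# 5 32
# 10 1024
```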

Castles on Quicksand 

Now that we have covered what humans might consider complexity, let us consider other, non-human forms of complexity. Apart from making our code 'clean and simple', we sometimes need to factor in more parts of the terrain. Specifically, we cannot limit ourselves to making the code aesthetically pleasing without questioning how the code will run on the metal underneath. How do the physics of it work, at least in principle? How do we move 0s and 1s as fast as possible without causing bottlenecks?

In philosophy, we call this 'evasion of reality', and it's becoming more and more common in the age of cloud computing - although, to be fair, the cloud pretty much plays a 'cha-ching' sound whenever you do this.

Let's take a couple of examples:

  1. Flooding your database with single-insert connections instead of batching writes to it. Batching is the database equivalent of multi-threading.
  2. Array of Structs versus Struct of Arrays for performance. The gaming industry takes a more data-oriented design approach to get better performance and to run on less powerful machines such as mobile phones. Using Structs of Arrays, they are able to render more moving units on a screen with far fewer CPU cache misses.
  3. The Rust programming language uses a set of principles for memory management instead of a garbage collector. Rust uses innovative methods that help humans code without a garbage collector while making it a lot easier to manage lifetimes and avoid data races.
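
To illustrate the first example, here is a minimal sketch that uses Python's built-in `sqlite3` module with an in-memory database. The `events` table and both function names are assumptions made for this example; the same idea applies to other databases, although the exact calls differ.

```python
import sqlite3

INSERT_SQL = "INSERT INTO events (id, name) VALUES (?, ?)"


def insert_one_by_one(conn: sqlite3.Connection, rows) -> None:
    """The 'flooding' pattern: one statement and one commit per row."""
    for row in rows:
        conn.execute(INSERT_SQL, row)
        conn.commit()


def insert_batched(conn: sqlite3.Connection, rows) -> None:
    """All rows go through one executemany call inside one transaction."""
    with conn:  # commits once at the end, rolls back on an exception
        conn.executemany(INSERT_SQL, rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, name TEXT)")

rows = [(i, f"event-{i}") for i in range(1000)]
insert_batched(conn, rows)

count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(count)  # prints 1000
```

Both functions produce the same rows; the difference is how many round trips and commits the database has to absorb along the way.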

I would like to focus on the last example: in order to build efficient and clean computer systems, we need to use the principles discussed to make our code less complex AND integrate them with principles about how computers work best under the hood. Similar to how Rust does it - integration is the key.

Once you have principles that consider both code complexity and system performance, you develop much faster, your code is simpler and you do not need to revisit it in the future to make 50 more commits just to make it go fast. After 50 additional commits, nothing looks simple anyway.

Pragmatic Entropy

Lastly, we need to make sure that our system has integrity. By this I mean that it was built on the right core values and principles, and that those were not deviated from during the development process. In other words, do not let too many cooks spoil the broth, in particular with exciting spices from online blogs or conferences.

If you do not keep a watchful eye out for this, your system can get complex and random very quickly. If you have ever attended a meeting where someone asked "why don't we do this? It's what all the cool kids use", then you will know what I mean.

The problem with these situations is that if you compromise (are 'pragmatic'), you have already lost. You try to rush things out without hurting other people's feelings, and soon you get those annoying conversations with QA or the people who review your code that go "why didn't you just do X instead? It is a lot simpler".

The other cooks need you to compromise or they can't do anything. What do I mean?

  1. If you were to say "this is a bad idea, I won't do it." - they have lost and the integrity of the software system is safe (assuming you are following the right values and principles)
  2. If you were to say "I agree to do this idea, but tell me exactly how to implement it" - they have lost, because they have no idea how. They need to persuade you to do it and for you to implement their idea.
  3. If you compromise, they have fully won. They get the latest technology into the system and you have to implement it, test it and maintain it going forward. At that point, you have introduced complexity into the system.

If you ever read someone's code and ask "why did they do it this way?" or the more common "wtf", it was probably because someone compromised.

In Conclusion

For those of you who may have missed it, we have covered complexity in computer systems, but we have also done so within a complete philosophical framework. We have Epistemology (Cyclomatic Complexity), Metaphysics (Castles on Quicksand) and Ethics (Pragmatic Entropy) - we also have a slight mention of aesthetics.

I hope you have enjoyed it and that it helps you in your implementations.
