Up with cohesion, down with coupling

We all love cohesion and hate coupling, don't we?

The standard advice on cohesion and coupling is that a design should strive to maximise cohesion and minimise coupling. This is a fine mantra, but, as is so often the case, without a good understanding of what is really intended, it becomes either misguidance or perceived as academic and irrelevant.

A simple characterisation is that coupling is the degree of interconnectedness of parts in a system and cohesion is the degree of intraconnectedness of those parts.

Getting cohesive you want components — in the general sense of the word rather than the more specialised component-based development sense of the word — to be focused and crisply defined rather than diluted with multiple unrelated responsibilities.

For example, the root of an inheritance hierarchy should normally present an interface and represent a responsibility that all subclasses in the hierarchy can fulfil meaningfully. You should try to avoid having optional features in the root interface of the hierarchy that are only relevant to some subclasses but not others: this makes both using and writing classes in the hierarchy awkward.

For hierarchy users, they cannot confidently program to an interface without either feature testing or receiving some kind of "feature not supported" error.

For class implementers, if a superclass or base interface offers features that don't make sense for a particular subclass, either they have to make up some kind of fictional behaviour that sort of makes sense or they have to raise a "feature not supported" error. Either way, it's hard working in and around such hierarchies.

As an aside, you may already recognise this particular example as the principle of substitution (the Liskov Substitution Principle, or LSP, to be precise) for class hierarchies. It is commonly taught in its somewhat abridged sound-bite form: only use inheritance for "is-a" relationships.

It is interesting to note that rather than being a principle separate from maximise cohesion, LSP arises naturally from maximising cohesion, so it can be considered a specific application of maximise cohesion in the context of type hierarchies.

Anyway, what about coupling? You want components to be loosely coupled rather than tightly coupled. If they are tightly coupled, the internal structure of the resulting code base is intertwined, subtle, difficult to comprehend, hard to change, and many other costly and awkward etceteras.

For example, having one part of the code base depend on an incidental data representation decision in another part of the code base is a pain when you need to change the representation, even if the usage is stable. Hence, the common recommendation to keep data representation decisions private is a guideline that reduces coupling.

However, don't get too carried away: low coupling does not mean no coupling. The goal is reduction rather than elimination of coupling: a system with no coupling is, by definition, not a system.

There are many forms of coupling, and some are explicit whereas others are implicit. For example, the inheritance relationship expressed with extends in Java or the file dependency expressed using #include in C or C++ are examples of dependencies that are directly declared and visible in code.

But don't assume that the only form of coupling is the explicit stuff. Many years ago I encountered a team that did not appreciate this implicit aspect of coupling, and drew completely the wrong conclusions about what to do with a C++ header file that contained an enum with all the error codes for a whole program. Bundling all the error codes for a whole system in one place smacks of coincidental cohesion and couples unrelated parts of a system to a relatively unstable component.

There are many different and reasonable ways to cut this dependency knot, should you chose to do so. Hopefully you'll agree that replacing the enum with an int and replacing named constants with magic numbers to represent different errors is not one of them. The reasoning was that because they had eliminated the type and all those pesky named constants, there was no longer any need to have a header file, which meant, by definition, that different parts of the program were no longer coupled to a common header. Hey presto! Or not.

Instead, they had a program littered with magic numbers that were implicitly coupled to one another by ad hoc usage and convention and they had thrown away the ability of the compiler to perform any checking: "OK everyone, team meeting, listen in. Please try to be consistent in your use of the number three to indicate file writing errors (it was two last week, but two is now for file reading errors), 17 to indicate dropped connections...".

Sponsored: How to determine if cloud backup is right for your servers