Let's get cohesive

Biting into sound principles

To sharpen our concept of cohesion a little more, we can draw on another criterion, which is that of common use: if you're going to use one feature of a cohesive module, you should be just as likely to use another. In other words, move things together that are used together and separate those that are not.

This is not a question of whether or not an application as a whole would or would not use a set of features together; but whether one of its parts would. You may also recognise this co-dependency approach by another name: normalisation. An application may use both threading and file I/O functionality, but it is not inevitable that a class in that application that depends on using threading, will also depend on using file I/O. Threading and file I/O stand out as obvious and separate candidates for common use, and so each represents a separate cohesive concept.

By contrast, a not uncommon habit on C projects is to put all the typedefs in one header file ("typedefs.h") or all the constants in a header file ("constants.h"). While it is clear that all the features in the header file have something in common, it is not clear that this is actually useful except in the most trivial sense.

This question of common use (also documented, a little misleadingly, as the Common Reuse Principle) highlights another consideration as well: the criteria used for separating or combining parts should be visible and positively defined.

Consider, by way of counterexample, the java.util package, which appears to contain as ragtag an assortment of unrelated classes as you are ever likely to find. What is the common theme that binds the collection classes together in the same package as, for instance, the calendar facilities? It isn't that they are utilities, because this theme cannot be applied consistently or meaningfully — everything in software is ultimately a utility to something else.

The implication of the name util fails to explain just why the many collection classes are merged in a package of unrelated utilities that clearly do not measure up against a yardstick of common use; but the BigInteger and BigDecimal utility classes get a package — albeit inaccurately named — pretty much to themselves.

By way of contrast, the criterion used for the partitioning of the System.Collections namespace in the .NET library is clear. The criterion used to define the content of java.util is a somewhat accidental one that can be summarised as "it contains utility classes that are not defined in other packages".

Yes, it's some kind of design reasoning, but one that's difficult to employ constructively. But this is not (just) a rant about things named "util" or similar. This problem of arbitrary cohesion is found in libraries and applications that many programmers commonly work with. The presence or absence of a particular feature in such libraries and classes is more a matter of lottery than of reasoning. For example, the C library's <stdlib.h> header can be characterised as "holding all the standard library features that are not already defined in other standard headers (except for the couple that are)".

A lot of learning is from example. Lessons are absorbed, often passively and unconsciously, from our environment. For programmers, this environment includes the standard APIs they work with, which is why we should value a good understanding of their developmental strengths and weaknesses, and focus on more than just their functional behaviour. If it has sufficient functionality, it is always possible to work with uncohesive code. But effective design is more than just affording the basic "possibility of use": ease of use, errors in use, cost of use, delight in use, and so on, are all part of the picture. ®

Sponsored: VersaStack for data center with direct attached storage