Rethink code cohesion
Emergent Design: time to relate
It is an assignment of responsibilities, but in the software sense. This comes from the procedural days, from what we called functional decomposition, and it essentially robs us of most of the power of object orientation. Object orientation allows us to model systems by discovering the entities that exist in the problem domain itself, and then assign them responsibilities that make sense for the way the business runs or the game plays, and so on.
What are the entities in the problem domain "banking"? Account, Customer, Statement: these are a few of them, and each of these should be its own class. There may be a BankingSystem class, but it will use these other classes to accomplish the responsibilities that are rightly assigned to them. Why is this important?
- One of the powerful concepts of object orientation is that software should model the problem directly, so that its structure and behavior are logical in terms of the issues being solved. Just as you would not organize a physical bank by having one big drawer in a file cabinet marked "Stuff", you should not model the problem in software this way either.
- Breaking the problem into classes with responsibilities makes the system easier to test, debug, and therefore maintain. If the statements are not printing properly, I know to look at the Statement class, or one of the classes it collaborates with, to find the problem. Mentally, I can focus myself in a very narrow way on the issue, rather than wade through a lot of unrelated material that will distract and confuse me.
- If I have to add something later, a new kind of customer or a new kind of account, then having discrete classes that deal with these issues gives me a place to make the changes and additions without affecting the rest of the system. We will see how this works later when we refactor to the open-closed principle; without breaking the problem down into entities with responsibilities, this would be much harder or even impossible to do.
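To make the banking example concrete, here is a minimal sketch in Python of how those responsibilities might be divided. The class names come from the text above; every method name, field, and the statement layout are assumptions made purely for illustration, not the book's own design.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Customer:
    """Knows who the customer is; nothing about balances or printing."""
    name: str


@dataclass
class Account:
    """Owns the balance and the rules for changing it."""
    owner: Customer
    balance: int = 0  # whole currency units, an assumption to keep this simple
    history: List[int] = field(default_factory=list)

    def deposit(self, amount: int) -> None:
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.balance += amount
        self.history.append(amount)

    def withdraw(self, amount: int) -> None:
        if amount <= 0:
            raise ValueError("withdrawal must be positive")
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount
        self.history.append(-amount)


class Statement:
    """Formats an account's activity; the only class that knows the layout."""

    def __init__(self, account: Account) -> None:
        self.account = account

    def render(self) -> str:
        lines = [f"Statement for {self.account.owner.name}"]
        lines += [f"{amount:+d}" for amount in self.account.history]
        lines.append(f"Balance: {self.account.balance}")
        return "\n".join(lines)


class BankingSystem:
    """Coordinates the entities but delegates the real work to them."""

    def __init__(self) -> None:
        self.accounts: List[Account] = []

    def open_account(self, customer_name: str) -> Account:
        account = Account(Customer(customer_name))
        self.accounts.append(account)
        return account

    def statement_for(self, account: Account) -> str:
        return Statement(account).render()
```

If statements print incorrectly in this sketch, the place to look is Statement.render; if a balance is wrong, the place to look is Account. BankingSystem itself holds no formatting or balance rules.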
How cohesive is cohesive enough?
Assuming that class cohesion is important and that it will benefit me in terms of easy maintenance and extensibility in my project, how do I determine if my classes are cohesive? First, simply being aware that this is important is half the battle. If I look at my classes with the "is this cohesive?" question firmly in my mind, I can probably make a pretty good determination. That said, there are some good metrics to use as well.
- A strongly cohesive class will have methods that access the same data members of the class frequently. If you count up the number of methods that access a key data member of the class, and then divide that by the total number of methods the class has, you have a metric that will allow you to compare one class to another, in terms of cohesion.
- A strongly cohesive class is difficult to break up into multiple classes. Ask yourself how hard it would be to break up a class: how much of each component class would have to be exposed through public methods to the other component classes? If the amount is high, then the class is likely cohesive to begin with.
- As with methods, a strongly cohesive class should be easy to name, and the name should accurately and completely describe what the class is. Vague, generic, or lengthy class names can be an indication that the class is weakly cohesive and should be broken up.
- If your unit test is significantly larger than the class, or can fail in the same way for multiple reasons, this can be an indicator of weak cohesion. When a class contains multiple, unrelated issues, the test must test them all, and all their possible combinations. This is because once you are "inside" a class, you lose much of your opportunity to encapsulate.
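The first metric in the list above (methods that touch a key data member, divided by total methods) can be approximated mechanically. The following is a rough sketch in Python using the standard ast module; the function name cohesion_ratio, the sample Account source, and the choice to count the constructor as a method are all assumptions of this sketch. It only detects direct self.member access, so treat the result as a basis for comparing classes, not as a precise measure.

```python
import ast


def cohesion_ratio(class_source: str, member: str) -> float:
    """Fraction of a class's methods that read or write self.<member>."""
    class_node = ast.parse(class_source).body[0]
    if not isinstance(class_node, ast.ClassDef):
        raise ValueError("expected source that starts with a class definition")

    # Every function defined directly in the class body counts as a method,
    # including __init__ (an assumption; the text does not say either way).
    methods = [n for n in class_node.body if isinstance(n, ast.FunctionDef)]
    if not methods:
        return 0.0

    def touches(method: ast.FunctionDef) -> bool:
        # True if the method body contains an attribute access self.<member>.
        return any(
            isinstance(node, ast.Attribute)
            and node.attr == member
            and isinstance(node.value, ast.Name)
            and node.value.id == "self"
            for node in ast.walk(method)
        )

    return sum(touches(m) for m in methods) / len(methods)


# Hypothetical class: three methods work with the balance, one does not.
ACCOUNT_SOURCE = '''
class Account:
    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount

    def withdraw(self, amount):
        self.balance -= amount

    def format_address_label(self, customer):
        return customer.name.upper()
'''

print(cohesion_ratio(ACCOUNT_SOURCE, "balance"))  # prints 0.75
```

Here format_address_label never touches the balance, which lowers the ratio and hints that the method may belong in another class.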
This last point is, perhaps, my best answer today. When someone asks me: "How small should my objects be?", "When should I stop pulling things out of an object?" or any of a dozen questions that are similar to these, I find the answer is nearly always: "Once you can write a good unit test for it, you've probably gone far enough."
In part three of Reg Dev's serialization, Scott examines everyone's favorite subject: coupling.
This chapter is excerpted from the new book, Emergent Design: The Evolutionary Nature of Professional Software Development by Scott Bain, published by Addison-Wesley Professional, March 2008, ISBN 0-321-50936-6. Copyright (c) 2008 Pearson Education, Inc. For more information, please see informIT.com and Register Books.