Original URL: https://www.theregister.com/2008/04/07/emergent_design_part_two/

Rethink code cohesion

Emergent Design: time to relate

By Gavin Clarke

Posted in Channel, 7th April 2008 11:02 GMT

Book extract, part two: Scott Bain's book Emergent Design: The Evolutionary Nature of Professional Software Development, published by Addison-Wesley, looks at how to deliver and maintain robust, reliable, and cost-effective systems. In this, the second of five Reg Dev extracts, Scott tackles the complex subject of cohesion as a step to building simple and maintainable code.

Cohesion is an often misunderstood term. I think this is because it sounds a lot like adhesion, and so people think it means "how well things are stuck together." I have heard people describing a team of developers as a "good, cohesive team," meaning that they get along well together and work closely without conflict.

As nice as this may be for the team, it has nothing to do with cohesion. Cohesion refers to how much (or how little) the internal parts of something are working on the same issue, and how well they relate to each other. It is the quality of single mindedness or singleness of purpose, and it makes entities (classes, methods) easier to name and understand.

For example, a team is strongly cohesive if it has established an identity and all of its members are working toward the same goal, in a consistent manner, regardless of their personal feelings for each other. One clear sign of cohesion is how easy it is to put a name to something. If you can call a team "the GUI team," then likely everyone in it is working on the graphical user interface, which may indicate or help to bring about strong cohesion. If you have to refer to them as "the team that does the GUI, the database proxies, and some of the business logic," then the team is going to have a tougher time being cohesive (1), even if the members of the team are best buddies.

Cohesion in our code is much like this. One can consider cohesion at the method level, the class level, or even at higher levels like package, application, system, or solution. For my purposes, method and class cohesion are all I will need.

Method cohesion

Consider the following code:


public class Application {
  public void process(String[] words) {
    // Loop through the array of Strings
    for(int i=0; i<words.length; i++) {
      String argument = "";
      // Reverse the characters in each String
      for(int j=words[i].length(); j>0; j--){
        argument += words[i].substring(j-1,j);
      }
      System.out.println(argument);
    }
    // Test for two particular Strings
    if(words.length == 2){
      if(words[0].toLowerCase().equals("mighty") &&
         words[1].toLowerCase().equals("mouse"))
        System.out.println("...here he comes to save the day.");
    }
  }
        
  public static void main(String[] args){
    Application myApp = new Application();
    myApp.process(args);
  }
}



This is a simple little application that takes any parameters passed on the command line, reverses the characters, tests for the name of a remarkable fellow, and then makes an appropriate comment.

But it has weak cohesion.

Why?

A clue lies in the generic quality of the method name process(). It does not tell us what the method does because to name the method properly it would have to be something like:

reverseCharactersAndTestForMightyMouse()

Difficult-to-name methods are a good sign that you have weak method cohesion. In essence, the method process() is doing too much, and it is doing things that are not related to each other. Reversing the order of characters in each string parameter and testing them all together for a particular name are activities that have nothing to do with one another.

We could fix this by putting these different steps into their own methods, then calling those methods from process(), such as in the following code:


public class Application {

  public void process(String[] words) {
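    for(int i=0; i<words.length; i++) {
      // Print the reversed copy; Strings are immutable, so the
      // value returned by reverseCharacters() must be used
      System.out.println(reverseCharacters(words[i]));
    }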
    if(isMightyMouse(words)) {
      System.out.println("...here he comes to save the day.");
    }
  }

  private String reverseCharacters(String forward){
    String reverse = "";
    for(int j=forward.length(); j>0; j--){
      reverse += forward.substring(j-1,j);
    }
    return reverse;
  }

  private boolean isMightyMouse(String[] names){
    boolean rval = false;
    if(names.length == 2){        
      if(names[0].toLowerCase().equals("mighty") && 
         names[1].toLowerCase().equals("mouse"))
         rval = true;
    }
    return rval;
  }
        
  public static void main(String[] args){
    Application myApp = new Application();
    myApp.process(args);
  }
}

When I read the process() method, I am reading a series of steps, each step accomplished by another method. The process() method has become an organizing method, a scaffold that creates the general shape of the behavior, but then delegates the actual steps to other methods.

1. A team of human beings, of course, can be quite effective but lack cohesion. A cross-functional team, for example, can be a good thing. I just want to make it clear what this word means, and to suggest that software entities (which are not intelligent) should be cohesive.

The methods reverseCharacters() and isMightyMouse() are easy to name because they each do a single, identifiable thing. It also helps with debugging. If the characters are not reversing properly, I know exactly where to look to find the bug, because the responsibility for doing this is clearly assigned to the properly named method, and only that method.
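The debugging benefit can be made concrete. The following is a minimal sketch, not part of the book's listing: it assumes the two helper methods are lifted into a hypothetical WordTools class and made static and package-visible so that each can be exercised in isolation. The class name, the changed visibility, and the use of StringBuilder.reverse() and equalsIgnoreCase() in place of the original loop and toLowerCase() calls are all assumptions made for illustration.

```java
// Hypothetical helper class: the two single-purpose methods from the
// listing above, made static and package-visible so each one can be
// exercised on its own.
class WordTools {

  // Returns the characters of 'forward' in reverse order.
  // StringBuilder.reverse() is swapped in for the original loop.
  static String reverseCharacters(String forward) {
    return new StringBuilder(forward).reverse().toString();
  }

  // True only when there are exactly two words and they are
  // "mighty" followed by "mouse", ignoring case.
  static boolean isMightyMouse(String[] names) {
    return names.length == 2
        && names[0].equalsIgnoreCase("mighty")
        && names[1].equalsIgnoreCase("mouse");
  }

  public static void main(String[] args) {
    // If reversal were broken, this line alone would show it.
    System.out.println(reverseCharacters("mouse")); // prints "esuom"
    // The name test can be checked without reversing anything.
    System.out.println(isMightyMouse(new String[] {"Mighty", "Mouse"})); // prints "true"
  }
}
```

Because each method does one thing, a wrong result from either call points at one method body and nowhere else.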

Cohesion of perspective level

Another aspect of cohesion you should be aware of is the level of perspective at which a method operates. Put simply, methods tend to accomplish their functionality in one of two ways: by containing the code that does the work, or by calling other methods that do it.

The preceding process() method has a little bit of logic in it, but it is purely for sequencing the other methods and organizing the results of their actions. Mostly, process() calls reverseCharacters() and isMightyMouse(), where the actual work is done. This aggregation of behavior that process() does is at a level of perspective we call specification.

Levels of perspective

In UML Distilled: A Brief Guide to the Standard Object Modeling Language, Martin Fowler refers to the levels of perspective that Steve Cook and John Daniels identified in their book Designing Object Systems. The three types of perspective are conceptual, specification, and implementation.

The reverseCharacters() and isMightyMouse() methods are implementation-level methods; they have the code that does the dirty work.

It would be overstating things to suggest that I always write methods that are purely at one level of perspective or another; even here, process() has a little bit of logic in it, not just a series of method calls. But my goal is to be as cohesive as possible in terms of levels of perspective, mostly because it makes the code easier to read and understand.

It would not be overstating things to suggest that I always strive to write methods that are cohesive in the general sense, that they contain code that is all about the same issue or purpose. When I find poorly cohesive methods, I am going to change them, every time, because I know that method cohesion is a principle that will help me create maintainable code, and that is something I want for myself, let alone my team and my customer.
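Returning to levels of perspective, here is a sketch of what a stricter separation might look like. It moves the remaining loop and conditional out of process() so that the method reads purely as a sequence of steps. The class name LayeredApplication and the helpers reverseEach() and mightyMouseComment() are hypothetical names introduced here, not taken from the book, and the methods return their output lines rather than printing them, which is an assumption made only to keep the result easy to inspect.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical variant of Application. It returns the output lines
// instead of printing them so that the result is easy to inspect.
class LayeredApplication {

  // Specification level: names the steps and their order.
  static List<String> process(String[] words) {
    List<String> lines = new ArrayList<>();
    lines.addAll(reverseEach(words));
    lines.addAll(mightyMouseComment(words));
    return lines;
  }

  // Implementation level: the loop that used to sit in process().
  static List<String> reverseEach(String[] words) {
    List<String> reversed = new ArrayList<>();
    for (String word : words) {
      reversed.add(reverseCharacters(word));
    }
    return reversed;
  }

  // Implementation level: the conditional that used to sit in process().
  static List<String> mightyMouseComment(String[] words) {
    List<String> comment = new ArrayList<>();
    if (isMightyMouse(words)) {
      comment.add("...here he comes to save the day.");
    }
    return comment;
  }

  // Implementation level: reverses the characters of one word.
  static String reverseCharacters(String forward) {
    String reverse = "";
    for (int j = forward.length(); j > 0; j--) {
      reverse += forward.substring(j - 1, j);
    }
    return reverse;
  }

  // Implementation level: tests for the two particular words.
  static boolean isMightyMouse(String[] names) {
    return names.length == 2
        && names[0].toLowerCase().equals("mighty")
        && names[1].toLowerCase().equals("mouse");
  }

  public static void main(String[] args) {
    for (String line : process(args)) {
      System.out.println(line);
    }
  }
}
```

Whether the extra methods are worth it for a program this small is a judgment call; the point is only that process() now stays at one level of perspective.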

Class cohesion

Classes themselves also need to be cohesive. The readability, maintenance, and clarity issues that, in part, drive the need for method cohesion also motivate class cohesion.

In addition, we know we want our classes to define objects that are responsible for themselves, that are well-understood entities in the problem domain. A typical mistake that developers who are new to object orientation make is to define their classes in terms of the software itself, rather than in terms of the problem being solved. For instance, consider the following code:


public class BankingSystem {
  // No "method guts" are provided; this is just a
  // conceptual example. The return statements are
  // placeholders so that the class compiles.
  public void addCustomer(String cName, String cAddress,
                          String accountNumber,
                          double balance) {}
  public void removeCustomer(String accountNumber) {}
  public double creditAccount(String accountNumber,
                              double creditAmount) { return 0.0; }
  public double debitAccount(String accountNumber,
                             double debitAmount) { return 0.0; }
  public boolean checkSufficientFunds(String accountNumber,
                                      double checkAmount) { return false; }
  public void sendStatement(String accountNumber) {}
  public boolean qualifiesForFreeToaster(String accountNumber) { return false; }
  public boolean transferFunds(String fromAccount,
                               String toAccount,
                               double transferAmount) { return false; }
}

It is easy to imagine the thought process that leads to code like this: "I am writing a banking system, and so I will name the class for what it is, a BankingSystem. What does a banking system do? Well, I need to be able to add and remove customers, manage their accounts by adding to them and withdrawing from them," and so on.

It is an assignment of responsibilities, but in the software sense. This comes from the procedural days, from what we called functional decomposition, and it essentially robs us of most of the power of object orientation. Object orientation allows us to model systems by discovering the entities that exist in the problem domain itself, and then assigning them responsibilities that make sense for the way the business runs or the game plays, and so on.

What are the entities in the problem domain "banking"? Account, Customer, and Statement are a few of them, and each of these should be its own class. There may be a BankingSystem class, but it will use these other classes to accomplish the responsibilities that are rightly assigned to them. Why is this important?
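One way to picture that split is sketched below. It is an illustration under stated assumptions rather than the book's design: the Account methods, the SlimBankingSystem name, and the open() and find() helpers are all hypothetical, and Customer and Statement are left out to keep the example short. The account takes responsibility for its own balance, and the system class only coordinates.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical domain class: an account is responsible for its own balance.
class Account {
  private final String number;
  private double balance;

  Account(String number, double openingBalance) {
    this.number = number;
    this.balance = openingBalance;
  }

  String getNumber() { return number; }

  double getBalance() { return balance; }

  boolean hasSufficientFunds(double amount) {
    return balance >= amount;
  }

  // Adds the amount and returns the new balance.
  double credit(double amount) {
    balance += amount;
    return balance;
  }

  // Removes the amount if funds allow; otherwise leaves the balance
  // untouched and returns false.
  boolean debit(double amount) {
    if (!hasSufficientFunds(amount)) {
      return false;
    }
    balance -= amount;
    return true;
  }
}

// Hypothetical coordinating class: it finds the right accounts and
// delegates the actual work to them.
class SlimBankingSystem {
  private final Map<String, Account> accounts = new HashMap<>();

  void open(Account account) {
    accounts.put(account.getNumber(), account);
  }

  Account find(String accountNumber) {
    return accounts.get(accountNumber);
  }

  boolean transferFunds(String fromAccount, String toAccount,
                        double transferAmount) {
    Account source = accounts.get(fromAccount);
    Account target = accounts.get(toAccount);
    if (source == null || target == null || !source.debit(transferAmount)) {
      return false;
    }
    target.credit(transferAmount);
    return true;
  }
}
```

In this sketch the rule about sufficient funds lives in exactly one place, Account, instead of being one more method on a class that also adds customers and sends statements.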

How cohesive is cohesive enough?

Assuming that class cohesion is important and that it will benefit me in terms of easy maintenance and extensibility in my project, how do I determine if my classes are cohesive? First, simply being aware that this is important is half the battle. If I look at my classes with the "is this cohesive?" question firmly in my mind, I can probably make a pretty good determination. That said, there are some good metrics to use as well.

This last point is, perhaps, my best answer today. When someone asks me: "How small should my objects be?", "When should I stop pulling things out of an object?" or any of a dozen questions that are similar to these, I find the answer is nearly always: "Once you can write a good unit test for it, you've probably gone far enough."
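As a rough illustration of that rule of thumb, suppose the free-toaster decision from the earlier BankingSystem listing were pulled out into its own small class. Everything specific here is an assumption made for the example: the FreeToasterPolicy name, the minimum-balance rule, and the threshold are invented, and a plain main() method with a hand-written check() helper stands in for a real unit-testing framework.

```java
// Hypothetical class pulled out of BankingSystem: it has exactly one job.
// The minimum-balance rule is invented for this example.
class FreeToasterPolicy {
  private final double minimumBalance;

  FreeToasterPolicy(double minimumBalance) {
    this.minimumBalance = minimumBalance;
  }

  boolean qualifies(double balance) {
    return balance >= minimumBalance;
  }
}

// Plain-Java stand-in for a unit test; no test framework is assumed.
class FreeToasterPolicyCheck {

  // Fails loudly with the given message when the condition is false.
  static void check(boolean condition, String message) {
    if (!condition) {
      throw new AssertionError(message);
    }
  }

  public static void main(String[] args) {
    FreeToasterPolicy policy = new FreeToasterPolicy(1000.0);
    check(policy.qualifies(1000.0), "balance at the threshold should qualify");
    check(policy.qualifies(2500.0), "balance above the threshold should qualify");
    check(!policy.qualifies(999.99), "balance below the threshold should not qualify");
    System.out.println("all checks passed");
  }
}
```

The test needs no customers, statements, or transfers to set up, which is a reasonable hint that the class has been pulled out far enough.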

In part three of Reg Dev's serialization, Scott examines everyone's favorite subject: coupling.

This chapter is excerpted from the new book, Emergent Design: The Evolutionary Nature of Professional Software Development by Scott Bain, published by Addison-Wesley Professional, March 2008, ISBN 0-321-50936-6. Copyright (c) 2008 Pearson Education, Inc. For more information, please see informIT.com and Register Books.