To iterate is human
Why it’s three, not four – or five
Column In the previous article, I made the following observation:
A collection that holds its elements but doesn't allow you to traverse them is unlikely to prove popular. There are many ways to offer traversal, but if the caller needs to be able to know the position of elements in some way there are essentially only three general designs that keep the collection's internal representation hidden from the caller.
And a reader made the following comment : A fourth option, which is in the general case superior to all three of your options, is the internal iterator. It delegates management of the iteration to the collection, rather than adding repetitive boilerplate to your functional code.
In this particular case, the design pattern mentioned is also known as the Enumeration Method pattern. The use of this pattern was mentioned in the article, along with a link to a corresponding article, although it appears that it was overlooked. However, the posted comment touches on a whole topic area that deserves more attention than there was either space or topic focus for in the previous article. In fact, it deserves at least a whole article!
In essence, where Iterator is normally characterised by the introduction of an additional kind of object that performs iteration over some kind of collection, Enumeration Method introduces a method on the collection that calls back on a piece of supplied code for each element in the collection. The key benefit here is that you are only dealing with a single unit of design responsibility. All the mechanics of iteration are contained within the collection. Very compact, very cohesive. Nice — just what we want.
However, there are some other considerations at play: some are related to matters of perception; others are related to questions of practice; the rest are related to requirements and design goals; all are related to understanding the context that frame, and the forces that drive, a particular pattern's application.
The Gang of Four tried to shoehorn this additional iteration pattern, in the guise of Internal Iterator, into their write up of Iterator, but it must be said with limited success. Its presence is fairly low key, with most of the detail tucked away towards the end of the pattern write up, the last item in the Sample Code section. The unification of these two different design approaches was something of a compromise to try to accommodate a single iteration pattern for all OO-related languages. Specifically, an attempt to square a C++ view of the world with a Smalltalk view of the world... which is always a challenge.
However, the fundamentally different design structures, philosophies and trade-offs of the two approaches undermines any claim that they can be considered the same pattern. A pattern represents a recurring design solution, with an associated set of consequences, to a recurring problem whose forces are understood and arise within a specific context. It turns out that although both can be characterised in the general sense as iteration patterns the similarity ends there: the two approaches have almost nothing in common; the consequences of applying each one have almost nothing in common; indeed, even the problem forces that they resolve differ in the detail.
As a distinct pattern, Enumeration Method was first properly documented by Kent Beck in Smalltalk Best Practice Patterns. However, it is not a pattern that is restricted to Smalltalk: it can be applied in C, using function pointers, such as EnumChildWindows in the Windows API; it can be used in Java, based on the common Command pattern and the specific use of inner classes to achieve a sense of closure; it is the common form of iteration in Ruby, which supports blocks as objects directly, and where these are commonly (but confusingly, for our purposes) also known as iterators; for the functional programmers amongst you, it is in essence the map function from an object-centred perspective.
Of course, the ease with which Enumeration Method can be implemented and used, and therefore its applicability, is an important consideration. It would be too simple to claim that Iterators are necessarily more verbose than Enumeration Methods and that Enumeration Methods are generally superior, for the simple reason that such a claim needs to be made and measured against a specific context. Understanding the role context plays in design is perhaps one of the most important, but most overlooked, aspects of successful pattern application.
In a language where blocks are supported natively as objects, such as Smalltalk and Ruby, implementing Iterators without appropriate cause might be considered quite curious and more than a little gratuitous. However, although it is said that "what's good for the goose is good for the gander", it doesn't follow that it's always such good source for the mongoose.
In a language that doesn't support closures, it turns out that even though implementing the Enumeration Method itself is normally easy, using it can be something of a pain, shifting the complexity from the collection writer to the collection user. This applies to a greater or lesser degree depending on what other features a language supports and what its native library style is. For example, assuming that closures are adopted in Java 6, Enumeration Method will become easier to implement and use in Java. For the moment, however, although anonymous inner classes make a block-like approach possible, the resulting syntactic overhead is somewhat cumbersome if you aren't getting any obvious additional benefit. So unless there is a specific reason to do otherwise, such as recursive traversal or synchronized traversal, it is far wiser to favour Iterator as the default approach in Java: both the language and the library are geared up to support it, and writing an Iterator correctly is not a significant challenge.
We could go on to talk about Python's approach to iteration, or the diversity of styles that can be conveniently supported in C# 2.0, or the style of iteration used in C++ that supports the concept of generic programming, or understand how simple and effective map is in Scheme, or the relationship between iteration in Ruby and CLU, and so on. But, for the sake of brevity, I'll stop the language listing there. Hopefully you get the general idea: there is no single option that is best across all languages.