This article is more than 1 year old

To iterate is human

Why it’s three, not four – or five

So, whatever its merits elsewhere, if a particular pattern cuts across the grain of a language's features and its received idioms, it is normally easier to go with the flow than against it. You don't want to be writing code that is unnecessarily complex by virtue of an unquestioned idiom import from elsewhere; an idiom that fails to add any noticeable advantage over the more native idiom for a typical case of application.

But let's be clear: that is "normally easier", not "always easier". It pays to be a polyglot: you want your design gestalt to contain more than just a nice set of unquestioned defaults informed only by a single language. To be able to select the mot juste, whatever the situation, you want your design vocabulary to be able to draw on multiple sources. You need more than one idea.

I've already mentioned that for recursive data structures, Enumeration Method is much easier to implement than Iterator. Likewise, where you have a collection shared between threads and you want to support uninterrupted traversals, it is much easier to use. These situations sound quite specific, but having explained why Enumeration Method is a less appropriate approach than Iterator for most uses of iteration in certain of languages, I would like to ensure that not too much design territory is ceded.

Enumeration Method is a general design pattern, not just a language-specific idiom. In the right situation, it offers a number of significant benefits. For example, if you have a relatively stable set of actions you want to perform during an iteration, the fact that a language does not support closures becomes less of an issue. Passing blocks as objects is particularly effective when iteration is ad hoc but, when the kinds of loop action you have are well characterised and bounded, common actions can be wrapped up and provided as predefined Command classes.

Alternatively, when traversing aggregate data structures holding different kinds of elements, such as objects that represent the syntax structure of source code or a document, Enumeration Method with a Visitor interface offers a far simpler programming model than working multiple nested loops, multiple Iterator types and explicit runtime type checking. It is also much easier to write unit tests against such structures by passing Mock Objects to the Enumeration Method.

But what of the original observation that kicked off this article? If you recall:

There are many ways to offer traversal, but if the caller needs to be able to know the position of elements in some way there are essentially only three general designs that keep the collection's internal representation hidden from the caller.

Whenever you mention magic numbers — three in this case — you implicitly invite others to check your working (in a discipline where we don't do nearly as much reviewing and checking as we should, this is no bad thing). The claim is that I was off by one and that, including Enumeration Method (or Internal Iterator), there are four. Given that I seem to be quite a fan of Enumeration Method, have written about it going back many years, and even mention it in the very same article, is this correction correct? Not quite, and it is worth understanding why.

Had the problem to be solved just been how to provide some mechanism for traversal that kept the collection's internal representation hidden from the caller, three would have been right out. In fact, so would four. Here's a summary of the three approaches from the previous article:

First, it is possible to use an index into the collection. [...] The second approach is to internalise a cursor within the collection. [...] And the third option is to introduce an Iterator object.

And then add Enumeration Method. And then add Batch Method. And then you can choose whether or not to count separately variations of these or combinations that draw on more than one approach, such as Batch Iterator (also known as Chunky Iterator).

However, the phrasing of the design issue was more constrained:

... if the caller needs to be able to know the position of elements in some way...

This constraint is important because not only does it subset the available solutions, it also highlights a significant distinction between Iterator and Enumeration Method. While many choices are driven by idiom and context, keep in mind that the detail of the problem being solved casts an important vote (or veto) in choosing an appropriate solution. In this particular case, one of Enumeration Method's strengths — that of fully encapsulating any concept of position and iteration mechanism from the caller — is a mismatch for the problem we want to solve. What are needed are solutions that have the property of both traversal and persistent position, hence the inclusion of Iterator and the exclusion of Enumeration Method.

So, to summarise, Iterator supports traversal by encapsulating position and offering traversal control whereas Enumeration Method supports traversal by encapsulating the whole loop and all of its associated mechanisms. Specific requirements and general context help to determine which is the more appropriate solution in a given situation. ®

More about

TIP US OFF

Send us news


Other stories you might like