Unequal equivalence

When is a number not a number?

Kevlin Henney

Column In my previous column I put the contract for object equality under the microscope, in most detail for Object.equals in Java but also with a brief look at Object.Equals in C#.

That column also examined the idea that equality can be assessed by relational comparison between two objects, looking again in most detail at the contract for Comparable.compareTo in Java and then briefly at some differences in IComparable.CompareTo in C#. Certain obvious similarities between the two languages in this respect make direct comparison easy.

However, one point that was mentioned briefly, but otherwise glossed over, deserves more attention. The idea of determining equality from a total ordering is essentially based on the idea that if two objects are neither greater than nor less than one another they can be considered equal, i.e. when compareTo in Java returns a value of zero rather than a positive or negative value. What may come as a surprise is that this does not necessarily — and is not required to — give the same measure of equality as the direct notion of equality comparison embodied in the equals method.

When are equivalent objects not equal?

The leeway that allows equality according to equals to be inconsistent with equality determined by compareTo may seem strange at first, but there are cases where a strict total ordering and a notion of exact equality do not reach the same conclusion. Of course, the wording of Sun's JDK documentation strongly encourages the two concepts to be consistent, but that is a recommendation rather than a hard and fast contractual requirement.

And you don't have to look very far for an everyday example. In Britain, an alphabetically sorted listing of names, such as a telephone directory, should order names beginning "Mc" and "Mac" together, filing both as if they were spelled "Mac". However, although the two spellings are equivalent with respect to ordering, there is no question of "McDonald" being equal to "MacDonald".
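The directory rule can be sketched in Java as a comparator. This is a minimal illustration, not a real collation scheme: the class DirectoryOrder, the helper filingKey and the constant BY_FILING_KEY are hypothetical names, and the only rule modelled is that a leading "Mc" is filed as "Mac".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical directory ordering: a leading "Mc" is filed as if spelled "Mac".
class DirectoryOrder {
    // Produces the sort key only; the stored name is never changed.
    static String filingKey(String name) {
        return name.startsWith("Mc") ? "Mac" + name.substring(2) : name;
    }

    // Orders names by their filing key rather than by their exact spelling.
    static final Comparator<String> BY_FILING_KEY =
        Comparator.comparing(DirectoryOrder::filingKey);

    public static void main(String[] args) {
        // Equivalent with respect to ordering...
        System.out.println(BY_FILING_KEY.compare("McDonald", "MacDonald")); // 0
        // ...but not equal.
        System.out.println("McDonald".equals("MacDonald")); // false

        List<String> names =
            new ArrayList<>(Arrays.asList("Mason", "McDonald", "MacDonald", "Mabbott"));
        names.sort(BY_FILING_KEY); // stable: equivalent names keep their input order
        System.out.println(names); // [Mabbott, McDonald, MacDonald, Mason]
    }
}
```

The comparator reports 0 for the two spellings, so they land in the same place in the listing, while equals still tells them apart.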

A similar case can be made for case-insensitive ordering of strings: "Mongoose" should order after "aardvark" but before "ZEBRA". However, "mongoose", "MONGOOSE" and "Mongoose" should be considered equivalent with respect to ordering even though they are not strictly equal.
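In Java this ordering is available as the standard comparator String.CASE_INSENSITIVE_ORDER. The short sketch below shows it alongside equals; the class name CaseFreeOrder and the helper sorted are illustrative only.

```java
import java.util.Arrays;

class CaseFreeOrder {
    // Returns a copy of the words sorted without regard to case.
    static String[] sorted(String... words) {
        String[] copy = words.clone();
        Arrays.sort(copy, String.CASE_INSENSITIVE_ORDER);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sorted("ZEBRA", "Mongoose", "aardvark")));
        // [aardvark, Mongoose, ZEBRA]

        // Equivalent with respect to the case-insensitive ordering...
        System.out.println(String.CASE_INSENSITIVE_ORDER.compare("mongoose", "MONGOOSE")); // 0
        // ...but not equal.
        System.out.println("mongoose".equals("MONGOOSE")); // false
    }
}
```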

A practical application of this concept can be seen in the syntax of identifiers in CORBA's Interface Definition Language. Because of its intended role as an interoperability standard, IDL cannot reasonably favour either case-sensitive naming, as found in the C family of languages, or case-free naming, as found in Pascal, Fortran and other languages. The compromise is to ensure that spelling must be unique — so "Mongoose" has the same spelling as "mongoose", and in a sorted table would map to the same place — but case is preserved — so "Mongoose" is not considered equal to "mongoose", so you cannot redeclare one as the other or use one as the other. This is a good compromise that works with both case-sensitive and case-free languages, and a middle path that should be considered by more language designers.
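The compromise can be pictured as a symbol table that locates names without regard to case but remembers the declared spelling. The Java sketch below is only an illustration of that idea under simplified assumptions; CasePreservingScope, declare and canUse are hypothetical names and are not taken from any CORBA implementation.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical scope: names are looked up case-insensitively, spelling is preserved.
class CasePreservingScope {
    // Each key maps to the exact spelling used when it was declared.
    private final Map<String, String> declared =
        new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

    // Returns false if the name collides with an existing declaration, whatever its case.
    boolean declare(String name) {
        if (declared.containsKey(name)) {
            return false;
        }
        declared.put(name, name);
        return true;
    }

    // A use is accepted only if it matches the declared spelling exactly.
    boolean canUse(String name) {
        return name.equals(declared.get(name));
    }

    public static void main(String[] args) {
        CasePreservingScope scope = new CasePreservingScope();
        System.out.println(scope.declare("Mongoose")); // true
        System.out.println(scope.declare("mongoose")); // false: same place in the table
        System.out.println(scope.canUse("Mongoose"));  // true
        System.out.println(scope.canUse("mongoose"));  // false: not equal to the declaration
    }
}
```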

A further example can be found in the java.math package. In contrast to floating-point numbers, a BigDecimal holds a scaled, arbitrary precision integer, where the scale indicates the number of digits to the right of the decimal point. A strict interpretation of the notion of equality between two objects suggests that a representation of 2.5 (25 with a scale of 1) and one of 2.50 (250 with a scale of 2) are not truly equal, and therefore equals returns false. However, no matter what the representation, the values represented by BigDecimal are reasonably subject to a total and natural strict ordering. In that case, 2.5 is neither greater nor less than 2.50, so they are considered equivalent and compareTo returns 0.
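The difference is easy to observe directly. The following sketch uses only java.math.BigDecimal and the standard collections; the class name ScaleDemo is illustrative. It also shows one practical consequence: a hash-based set relies on equals, whereas a sorted set relies on compareTo.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class ScaleDemo {
    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("2.5");  // unscaled value 25, scale 1
        BigDecimal b = new BigDecimal("2.50"); // unscaled value 250, scale 2

        System.out.println(a.equals(b));    // false: different representations
        System.out.println(a.compareTo(b)); // 0: same position in the ordering

        // HashSet uses equals and hashCode, so both representations are kept.
        Set<BigDecimal> hashed = new HashSet<>(Arrays.asList(a, b));
        // TreeSet uses compareTo, so the second value is treated as a duplicate.
        Set<BigDecimal> sorted = new TreeSet<>(Arrays.asList(a, b));

        System.out.println(hashed.size()); // 2
        System.out.println(sorted.size()); // 1
    }
}
```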

A similar model exists in C++. Equality is expressed through the == operator and bound by the EqualityComparable requirements, which require conforming implementations of == to be reflexive, symmetric and transitive. Reasonably enough, inequality is normally considered to be defined with respect to equality, so a != b is equivalent to !(a == b). This is not just a good logical relationship: it's good implementation advice as well. Rather than duplicating the concept of equality comparison in both operator== and operator!= functions, it exists in only a single place and therefore, should it need to be changed to correct a defect or to modify representation, the change is needed in only one place. This leads to a more stable design with fewer hidden dependencies.

The C++ standard also defines LessThanComparable requirements to govern the < operator. To satisfy the LessThanComparable contract an implementation of the operator must define an ordering on its arguments and it must be irreflexive, i.e. !(a < a) must be true. Comparison in the C++ standard library is based solely on this operator. However, one would also expect, in the name of consistency, convention and reason [2, 3], the other relational operators to be defined for a given type, and defined according to logical relationships in terms of operator< and not operator==, e.g. a <= b is defined as !(b < a). Therefore, again, only a single piece of code defines the concept of ordering rather than having it subtly duplicated over four functions.

The LessThanComparable requirements are also used to define an equivalence relation along the lines of two values being considered equivalent if neither is less than the other, i.e. !(a < b) && !(b < a). And, as we have seen with Java and C#, this notion of equivalence with respect to ordering does not imply consistency with equivalence defined in terms of EqualityComparable. Such consistency may be recommended but, as we have seen, making it a rule may sometimes be too strong an imposition.
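The same derivation can be sketched in Java, the language of the earlier examples. Java has no operator overloading, so the sketch assumes a hypothetical class DerivedRelations that is given a single strict "less than" predicate and derives every other relation from it, including the equivalence described above. None of these names come from the C++ standard or the JDK.

```java
import java.util.function.BiPredicate;

// Hypothetical helper: all relations are derived from one strict "less than" predicate.
class DerivedRelations<T> {
    private final BiPredicate<T, T> less;

    DerivedRelations(BiPredicate<T, T> less) {
        this.less = less;
    }

    boolean less(T a, T b)           { return less.test(a, b); }
    boolean greater(T a, T b)        { return less.test(b, a); }
    boolean lessOrEqual(T a, T b)    { return !less.test(b, a); }
    boolean greaterOrEqual(T a, T b) { return !less.test(a, b); }

    // Equivalent when neither value is less than the other.
    boolean equivalent(T a, T b)     { return !less.test(a, b) && !less.test(b, a); }

    public static void main(String[] args) {
        // Assumed ordering for the demonstration: strings compared without regard to case.
        DerivedRelations<String> order =
            new DerivedRelations<>((a, b) -> a.compareToIgnoreCase(b) < 0);

        System.out.println(order.less("aardvark", "Mongoose"));       // true
        System.out.println(order.less("Mongoose", "Mongoose"));       // false: irreflexive
        System.out.println(order.equivalent("Mongoose", "mongoose")); // true
        System.out.println("Mongoose".equals("mongoose"));            // false
    }
}
```

Because the ordering logic lives only in the predicate, a change to it is automatically reflected in all five derived relations.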
