Prejudiced humans = prejudiced algorithms, and it's not an easy fix

Building bad practices in ML can turn out awkward

It is half a century since the days when London B&Bs would welcome guests with notices stating "No Irish, no blacks, no dogs."

It is twenty years since, as marketing manager for a major UK financial institution, I had to inform our underwriting department that adding "does not apply to disabled persons" to an ad for motor insurance was no longer legal.

Both, in the UK, are unlawful direct discrimination: that is, treating individuals differently on the basis of a "protected characteristic". The full list (there are nine) can be found in the Equality Act 2010, and includes sex, race, religion, sexual orientation and gender reassignment.

Other jurisdictions deal with discrimination differently, though in general it is frowned upon.

Most businesses are pretty good when it comes to direct discrimination nowadays. The harder cases – the ones that get written up as "political correctness gone mad" – usually involve indirect discrimination. This occurs when preconditions for doing business are set up that disproportionately affect individuals according to their possession of a protected characteristic.

So offering a job only to someone over six foot tall or refusing credit to inhabitants of Bradford could both fall foul of discrimination law. The first, because women are on average shorter than men, so the condition disproportionately excludes them. The second, because it is likely to discriminate against individuals from a particular ethnic group.

The key word here is "could". Because even if discrimination is found to exist, the law provides a ready-made exemption: it is permitted where it is a reasonable means to achieve a necessary objective. So if machinery could only be used safely if the operator was over six foot, the employer is in the clear. But, as you would imagine, this is lawyer heaven. What is necessary? What is reasonable?

It has generated some seriously costly cases. All too often, IT focuses on the end goal – account security, for instance – and assumes that, since the aim is necessary, any and every solution is reasonable.

This fails, twice over. First, because use of any protected characteristic to determine individual service level is always going to be legally risky.

For instance, redlining – the practice of using differential pricing to deny services to residents of certain areas based on the racial or ethnic profile of those areas – is simply unlawful, in the UK and USA alike. Differential pricing of motor insurance according to gender is likewise unlawful in the EU, although it still happens.

Second, such practices also call into question the underlying business competence. Use of voice-recognition software to detect fraud frequently causes problems for trans customers, and justifying it "because security" not only undercuts the fitness of other security processes, but will fail if it can be shown that alternative processes exist.

In other words, it is a minefield. But at least where decisions are made by humans, businesses can trade in a law-aware fashion.

Problems arise with machine learning. Related to but not quite the same as AI, machine learning (ML) involves giving data to the processing machines and letting them draw their own conclusions about the relationship between one set of (independent) variables, and a second set of dependent ones. ML is methodologically agnostic: behind the learning facade could be anything, from classical statistical techniques such as regression analysis and decision tree learning to neural nets or genetic algorithms.
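To make that concrete, here is a minimal sketch of the simplest case: a one-variable least-squares regression, written in plain Python. The data is made up purely for illustration, and the names (fit_line, predict, the two lists) are hypothetical rather than taken from any real system. The point is only that the machine is handed past examples and works out the relationship for itself.

```python
# Illustrative sketch only: synthetic data, hypothetical names.

def fit_line(xs, ys):
    """Ordinary least squares for one independent variable.

    Returns (slope, intercept) such that y is approximated by
    slope * x + intercept.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalised).
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Independent variable: some attribute of past customers (made up).
attribute = [1.0, 2.0, 3.0, 4.0, 5.0]
# Dependent variable: the decision or score each customer was given (made up).
past_outcome = [3.0, 5.0, 7.0, 9.0, 11.0]

# The "learning" step: nobody tells the machine the rule,
# it infers one from the examples it was given.
slope, intercept = fit_line(attribute, past_outcome)


def predict(x):
    """Apply the learned relationship to a new case."""
    return slope * x + intercept
```

Whatever pattern sits in the past outcomes, sensible or not, is the pattern the fitted line reproduces for new cases. Swapping the regression for a decision tree or a neural net changes the machinery but not that basic arrangement.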


Biting the hand that feeds IT © 1998–2017