Use of big data can lead to 'harmful exclusion, discrimination' – FTC

Data model should take account of biases


Businesses should take steps to avoid causing "harmful exclusion or discrimination" when using aggregated consumer data that they have analysed, a US regulator has said.

In a new report, the Federal Trade Commission (FTC) called on companies to check how representative their data sets are, whether their "data model" takes account of biases, and how accurate their predictions based on big data are.
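As a rough illustration of the first of those checks, the sketch below compares each group's share of a data set with its share of a reference population. The group labels, counts, reference shares and the 0.8 threshold are all hypothetical and do not come from the FTC report; a real review would need to choose them with care.

```python
from collections import Counter


def representation_ratios(sample_groups, population_shares):
    """Return each group's sample share divided by its population share.

    A ratio near 1.0 means the group appears in the sample about as often
    as in the reference population; well below 1.0 suggests it is
    under-represented.
    """
    if not sample_groups:
        raise ValueError("sample_groups must not be empty")
    counts = Counter(sample_groups)
    total = len(sample_groups)
    return {
        group: (counts.get(group, 0) / total) / share
        for group, share in population_shares.items()
    }


def under_represented(sample_groups, population_shares, threshold=0.8):
    """List groups whose representation ratio falls below the threshold."""
    ratios = representation_ratios(sample_groups, population_shares)
    return sorted(group for group, ratio in ratios.items() if ratio < threshold)


# Hypothetical data set of 100 records and hypothetical population shares.
sample = ["A"] * 70 + ["B"] * 25 + ["C"] * 5
population = {"A": 0.60, "B": 0.25, "C": 0.15}

ratios = representation_ratios(sample, population)
flagged = under_represented(sample, population)
print(flagged)
```

In this made-up example group C makes up 5% of the records but 15% of the reference population, so it is the only group flagged.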

"Companies should remember that while big data is very good at detecting correlations, it does not explain which correlations are meaningful," the 'Big Data: A Tool for Inclusion or Exclusion?' report (50-page / 600KB PDF) said.

The FTC also said that businesses should review whether their "reliance on big data raise[s] ethical or fairness concerns".

"Companies should assess the factors that go into an analytics model and balance the predictive value of the model with fairness considerations," the FTC said. "For example, one company determined that employees who live closer to their jobs stay at these jobs longer than those who live farther away. However, another company decided to exclude this factor from its hiring algorithm because of concerns about racial discrimination, particularly since different neighbourhoods can have different racial compositions."
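The kind of screening the FTC describes can be sketched in a few lines. The example below measures how strongly a candidate factor (here, commute distance) is associated with membership of a protected group, as a first hint that the factor might act as a proxy. The figures, the variable names and the 0.5 cut-off are hypothetical, and a correlation of this kind is only a screening aid, not a legal test of discrimination.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical applicants: commute distance in km, and a 0/1 flag for
# membership of a protected group (illustrative values only).
commute_km = [2, 3, 4, 5, 12, 14, 15, 18]
in_group = [0, 0, 0, 0, 1, 1, 1, 1]

r = pearson(commute_km, in_group)

# Hypothetical cut-off: a strong association means the factor deserves a
# closer fairness review before it is used in a hiring model.
is_possible_proxy = abs(r) > 0.5
print(round(r, 2), is_possible_proxy)
```

A factor flagged this way is not necessarily unlawful to use; the point is that its predictive value then has to be weighed against the fairness concern, which is the balance the FTC asks companies to strike.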

"The Commission encourages companies to apply big data analytics in ways that provide benefits and opportunities to consumers, while avoiding pitfalls that may violate consumer protection or equal opportunity laws, or detract from core values of inclusion and fairness. For its part, the Commission will continue to monitor areas where big data practices could violate existing laws … and will bring enforcement actions where appropriate," it said.

The FTC's report also highlighted some of the benefits that have stemmed from the use of big data. It said big data had helped to improve student attainment, improve access to credit facilities and deliver more personalised health care, among other examples. However, the report also raised concerns that predictions based on big data analytics can "exclude certain populations from the benefits society and markets have to offer".

"For example, one academic has argued that hidden biases in the collection, analysis, and interpretation stages present considerable risks," the FTC's report said. "If the process that generated the underlying data reflects biases in favour of or against certain types of individuals, then some statistical relationships revealed by that data could perpetuate those biases. When not recognised and addressed, poor data quality can lead to inaccurate predictions, which in turn can lead to companies erroneously denying consumers offers or benefits."

"Although the use of inaccurate or biased data and analysis to justify decisions that have harmed certain populations is not new, some commenters worry that big data analytics may lead to wider propagation of the problem and make it more difficult for the company using such data to identify the source of discriminatory effects and address it," it said.

The FTC said that concerns have also been raised about the prospect of reading too much into "meaningless correlations" that can be present in large data sets. One example cited in the report showed how data about a group of people could harm the credit standing of individual consumers.

"Several commenters explained that some credit card companies have lowered a customer’s credit limit, not based on the customer’s payment history, but rather based on analysis of other customers with a poor repayment history that had shopped at the same establishments where the customer had shopped," the FTC said. "Using this type of a statistical model might reduce the cost of credit for some individuals, but may also result in some creditworthy consumers being denied or charged more for credit than they might otherwise have been charged."

Technology law expert Luke Scanlon of law firm Pinsent Masons said that data protection laws in the EU already account for the profiling of people through big data analytics and will continue to do so under the forthcoming General Data Protection Regulation (GDPR).

"While the existing Data Protection Directive places restrictions on the extent to which businesses can rely on automated decisions which have legal effects on individuals, the GDPR will more directly require businesses to think in terms of whether their data analytics processes may lead to discrimination," Scanlon said.

"Consideration will need to be given as to whether mathematical and statistical procedures used are adequate enough to deal with the risk of data inaccuracies and other errors that could have the consequence of classes of individuals being treated less favourably than others," he said.

Scanlon said "there is good reason therefore for EU businesses to keep up-to-date with international guidance on the topic of discrimination by data".

Copyright © 2015, is part of international law firm Pinsent Masons.
