Humanity needs you... to build an AI bot that can finger rotten headlines
Identifying fake news is too hard at the moment, say developers, but we can spot lies in headings
Two AI researchers are behind a daring open challenge to fight the spread of outrageous headlines that are completely detached from reality. (As if anyone would write such things, tut-tut.)
The Fake News Challenge (FNC) is organized by Dean Pomerleau, an entrepreneur and adjunct professor at Carnegie Mellon University, and Delip Rao, CEO at Joostware. The aim is to explore how AI – particularly machine learning and natural language processing – might be used to combat the negative effects of false information.
The problem of fake news has been bubbling away for some time, but reached a climax as Donald Trump was sworn in as the 45th President of the United States. People were quick to blame dodgy websites for pushing lies – such as the Pope backing Donald – that potentially skewed the election results in the telly celebrity's favor.
As machine learning advances, the scope of problems it’s being applied to has expanded. Classification algorithms are particularly useful in computer vision and healthcare, helping doctors diagnose diseases.
But curing fake news is not as simple as telling apart cancerous moles from noncancerous ones. There is no AI system powerful enough to spit out the words “FAKE NEWS” with a red flashing light as an output. The FNC admits that truth labeling is “virtually impossible” with existing AI and natural language processing knowledge.
Truth labeling also poses several large technical and logistical challenges for a contest like the FNC:
- There exists very little labeled training data of fake vs real news stories.
- The data that does exist (eg, fact checker website archives) is almost all copyright protected.
- The data that does exist is extremely diverse and unstructured, making it hard to train on.
- Any dataset containing claims with associated “truth” labels is going to be contested as biased.
Instead, the goal is to help human fact checkers with “stance detection.” The headline and contents of a story are pitted against each other.
The stance of the body text is judged relative to the claim made in the headline. The output will be one of four categories:
- Agrees: The body text agrees with the headline.
- Disagrees: The body text disagrees with the headline.
- Discusses: The body text discusses the same topic as the headline, but does not take a position.
- Unrelated: The body text discusses a different topic than the headline.
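To make the four categories concrete, here is a deliberately crude Python sketch of a stance labeler. It is not the FNC's method or anyone's competition entry: the cue-word lists, the stopword list, the `toy_stance` function, and the `min_overlap` threshold are all hypothetical, chosen only to show the shape of the input (headline plus body) and the output (one of four labels). A real entry would learn this from the training data.

```python
import re
from enum import Enum


class Stance(Enum):
    AGREES = "agrees"
    DISAGREES = "disagrees"
    DISCUSSES = "discusses"
    UNRELATED = "unrelated"


# Hypothetical cue words, picked for illustration only.
REFUTING = {"hoax", "false", "fake", "denies", "denied", "debunked"}
HEDGING = {"reportedly", "allegedly", "claims", "unconfirmed", "rumor"}
STOPWORDS = {"the", "a", "an", "to", "of", "in", "is", "and", "has", "for", "on"}


def tokens(text):
    """Lowercase word set with a few common stopwords removed."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def toy_stance(headline, body, min_overlap=0.5):
    """Label the body's stance toward the headline with naive word matching."""
    head, text = tokens(headline), tokens(body)
    # Fraction of headline words that also appear in the body.
    overlap = len(head & text) / len(head) if head else 0.0
    if overlap < min_overlap:
        return Stance.UNRELATED
    if text & REFUTING:
        return Stance.DISAGREES
    if text & HEDGING:
        return Stance.DISCUSSES
    return Stance.AGREES


headline = "Denmark Stops Issuing Travel Visas to US Citizens"
print(toy_stance(headline, "Denmark has denied that it stops issuing travel visas to US citizens."))
print(toy_stance(headline, "The weather in Lisbon was sunny all week."))
```

Word matching like this is easily fooled by negation, sarcasm, and paraphrase, which is rather the point of running a contest.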
The idea is that human fact checkers could then quickly scan through the article looking for arguments for and against the claim to judge the article’s accuracy.
Registration for the competition closes in May and 72 teams have signed up so far. Teams must stick to the supplied training dataset, as using extra data would jeopardize the chances of judging each system’s performance fairly. Winners will be announced in June. The financial details of the prize are still to be determined.
Building a stance detection system may lead to a false news labeling system that takes into account the credibility of news organizations, the FNC said.
“For example, if several high-credibility news outlets run stories that disagree with a claim (eg, ‘Denmark Stops Issuing Travel Visas to US Citizens’), the claim would be provisionally labeled as False. Alternatively, if a highly newsworthy claim (eg, ‘British Prime Minister Resigns in Disgrace’) only appears in one very low-credibility news outlet, without any mention by high-credibility sources despite its newsworthiness, the claim would be provisionally labeled as False by such a truth labeling system.
“In this way, the various stances (or lack of a stance) news organizations take on a claim, as determined by an automatic stance detection system, could be combined to tentatively label the claim as True or False. While crude, this type of fully-automated approach to truth labeling could serve as a starting point for human fact checkers, eg, to prioritize which claims are worth further investigation.”
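The combination rule the FNC describes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the FNC's system: the `provisional_label` function, the credibility scores, the `high` and `min_sources` thresholds, the "True" rule for agreeing outlets, and the "Unverified" fallback are all hypothetical. Only the two "False" rules come from the quote above.

```python
def provisional_label(reports, high=0.8, min_sources=2):
    """Tentatively label a claim from outlet stances.

    reports: list of (credibility, stance) pairs, one per outlet that
    mentions the claim. credibility is a float in [0, 1]; stance is one
    of "agrees", "disagrees", "discusses".
    """
    # Stances taken by outlets we treat as high-credibility (assumed cutoff).
    trusted = [stance for cred, stance in reports if cred >= high]
    agree = trusted.count("agrees")
    disagree = trusted.count("disagrees")

    # Several high-credibility outlets dispute the claim.
    if disagree >= min_sources and disagree > agree:
        return "False"
    # Assumed mirror-image rule: several high-credibility outlets back it.
    if agree >= min_sources and agree > disagree:
        return "True"
    # The claim appears only in low-credibility outlets.
    if reports and not trusted:
        return "False"
    # Assumed fallback: not enough signal, leave it to the humans.
    return "Unverified"


print(provisional_label([(0.9, "disagrees"), (0.95, "disagrees"), (0.2, "agrees")]))
print(provisional_label([(0.1, "agrees")]))
```

The awkward part is left out: someone still has to decide which outlets earn which credibility score, and that decision is open to the same bias complaints as truth labels.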
The FNC isn’t the only project that hopes to use machine intelligence to combat fake news. Google awarded a total of €150,000 to two British fact checking companies – Full Fact and Factmata – and The Ferret, a Scottish investigative journalism site. ®