Real talk: Machine learning is not there yet. Some assembly required
So you wanna get into AI? Get ready for a lot of handholding and coddling
Despite the vendor-driven hype, machine learning is not a miracle cure for businesses. In fact, it's something of a problem child, one that requires quite a bit of handholding and coddling.
In a paper published this week, Ilias Flaounas, senior data scientist at developer tools biz Atlassian, argues that companies should be careful not to embrace the technology without the people to make it work or the planning necessary for it to provide meaningful value.
Flaounas is undoubtedly a fan and believes machine learning has been helpful to Atlassian. "We have deployed ML techniques to tackle many different problems," he said in an email to The Register.
"For example, we use it to predict future values of key business metrics. Machine learning helps us find patterns in product usage and thus expose users to specific features that will bring the greatest benefits to them."
Flaounas explained that his company uses machine learning to assess whether a user who tests a product will become a paying customer. "We analyze sign-up information from new customers and their activities inside the product to make predictions," he said.
That data helps Atlassian improve the signup process for new customers through A/B testing, Flaounas said.
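For readers who want to see what an A/B comparison of two signup flows involves, here is a minimal sketch using a standard two-proportion z-test. It is an illustration only, not Atlassian's method: the function name and the conversion counts are hypothetical, and the article does not say which statistical test the company uses.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for H0: both variants convert at the same rate.

    conv_a / n_a: conversions and visitors for the existing signup flow.
    conv_b / n_b: conversions and visitors for the candidate signup flow.
    All names and numbers here are hypothetical.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Made-up example: 100 of 1,000 evaluators convert on flow A, 130 of 1,000 on flow B.
z, p = two_proportion_z_test(100, 1000, 130, 1000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A small p-value suggests the difference between the two flows is unlikely to be chance, though it says nothing about why one flow converts better.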
At the same time, it isn't always clear how to interpret the data produced by machine learning. Flaounas said questions of causality may arise: "For example, does the creation of multiple projects in JIRA Software lead an evaluator to become a paying customer, or does the decision to convert precede the activity of creating projects in the product?"
Data scientists wrestle with that sort of issue. But data scientists turn out to be rather scarce, and that's a problem for companies that think they can just put money into machine learning and get business advice out.
"Non-experts can easily believe that a plug-n-play deployment of the latest ML tool will solve the problems at hand," Flaounas explains in his paper. "The expectations are raised and morale is degraded if the prototypes they build underperform."
The humans overseeing machine learning turn out to be crucial. Flaounas says successful implementations begin with figuring out how many users will benefit from the proposed system and assessing the feasibility of the project's goals.
Privacy, particularly as it applies to regulatory compliance, needs to be considered from the outset. "Some hard questions include how to debug a ML system at scale when there is no access to the raw data generated by users?" Flaounas observes in his paper.
"Can users be profiled based on their sex, age, experience, their role? If not, can these attributes indirectly be inferred from other known variables and fed into a ML system?"
Machine learning systems may generate output that needs further systems to process it. Atlassian, for example, wanted to automate the monitoring of business metrics and of alerts raised in response to detected anomalies.
Its development team considered that a minor challenge compared to the difficulty of designing the algorithm for anomaly detection. For the data scientists, the issue was the tuning of the system rather than creating the algorithm.
"They advised that the system will require full re-training, for example in cases of data loss/delays, or annotation of data points that represent one-off well understood events, and in cases of more permanent changes a mechanism for the system to become less sensitive until a new baseline is established," Flaounas says in his paper.
Companies that commit to machine learning need to understand that the technology requires ongoing commitment.
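To make the tuning concerns concrete, here is a rough sketch of a metric monitor that tolerates missing data points and honours annotations for one-off, well-understood events. It assumes a simple trailing-window z-score, which is a stand-in technique and not the algorithm described in the paper; the function, its parameters and the sample figures are all hypothetical.

```python
import statistics


def detect_anomalies(values, window=7, threshold=3.0, known_events=frozenset()):
    """Flag indices whose value deviates sharply from a trailing baseline.

    values: list of numbers, with None marking lost or delayed data points.
    known_events: indices annotated as one-off, well-understood events;
        they are never flagged and never enter the baseline.
    """
    anomalies = []
    baseline = []  # most recent trusted observations, at most `window` long
    for i, v in enumerate(values):
        if v is None or i in known_events:
            continue  # skip gaps and annotated events entirely
        if len(baseline) == window:
            mean = statistics.fmean(baseline)
            stdev = statistics.stdev(baseline)
            if stdev > 0 and abs(v - mean) / stdev > threshold:
                anomalies.append(i)
                continue  # keep the outlier out of the baseline
        baseline.append(v)
        if len(baseline) > window:
            baseline.pop(0)  # drop the oldest observation
    return anomalies


# Hypothetical daily metric: a spike at index 8, a data gap at index 10,
# and an annotated one-off event (say, a planned launch) at index 11.
metric = [100, 102, 98, 101, 99, 103, 97, 100, 250, 101, None, 400, 99]
print(detect_anomalies(metric, known_events={11}))
```

This sketch does not model the third requirement, becoming less sensitive after a permanent change until a new baseline is established. A lasting level shift would keep raising alerts here, which is exactly the kind of ongoing tuning the data scientists warned about.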
"Personally, I was surprised by the continuous effort and resources needed to handle data at scale, and do so reliably," said Flaounas. "Nevertheless, that’s a necessary investment that unlocks the creativity of our data scientists." ®