Working Successfully with Algorithms That Can Outperform Humans

Matthew Baggott, Director of Data Science and Engineering, People Analytics, Genentech

In recent years, a modern form of machine learning called deep learning has delivered stunning performance in domains including image classification and text processing. Big data, simple parametric learning methods, and lots of number crunching can often produce models that surpass human performance. Deep learning models power much of what the press calls artificial intelligence (AI). However, fully leveraging these models and deploying them into production involves challenges that leaders need to be aware of.

The nature of these challenges depends on whether the project is central to the company's business model. Companies that have deep learning or AI as a core part of their product face full-stack, systems-level concerns: how to efficiently and simultaneously architect deep learning, software, and hardware systems. These are challenging problems that few individuals have the breadth of knowledge to handle alone.

Yet this situation, where the company's attention and resources are focused on making AI succeed, may have a greater probability of success than cases where a project is a later add-on to existing processes or products. In these add-on cases, deep learning projects (and indeed many other data science projects) often fail because they are not successfully integrated into the process they are trying to improve.

One common reason for this failure is that the project did not account for the people who are integral to the process. At many companies, few processes are both fully automated and important enough to be worth improving with deep learning. More often, important processes are only partly automated and have people at their center.

This means that models need to be integrated into human workflows to truly succeed. For example, it is straightforward for data scientists in HR to create a model that predicts employee attrition, as sketched below. Yet it is much harder to deliver predictions and suggested actions to the right people at the right time to prevent unwanted attrition. The difference between a successful and a failed deep learning project is often how smoothly it has been integrated into existing workflows and systems.
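To illustrate how low the modeling barrier is, here is a minimal sketch of such an attrition model. The file name, feature columns, and label are hypothetical, and a simple logistic regression stands in for whatever model family a team would actually choose:

```python
# Minimal sketch of an attrition model on a hypothetical HR extract.
# Column names (tenure_years, salary_band, engagement_score, left_company)
# are illustrative, not from any real system.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

df = pd.read_csv("hr_snapshot.csv")  # hypothetical data extract

features = ["tenure_years", "salary_band", "engagement_score"]
X, y = df[features], df["left_company"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Scoring is the easy part; routing these scores to the right managers
# at the right time is where most projects stall.
risk_scores = model.predict_proba(X_test)[:, 1]
print(f"Holdout AUC: {roc_auc_score(y_test, risk_scores):.2f}")
```

Everything around this block is the hard part: deciding who sees the scores, when, and with what recommended action.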

Leaders should also expect that even the most successful projects will need to be monitored and adjusted. This is because the best machine learning systems are designed to keep learning from new data, and, unfortunately, data from complex systems change over time. For example, one study found that it took only four months for clinical data to lose half its predictive value when anticipating clinical test orders.
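One lightweight way to see this kind of decay, continuing the hypothetical attrition sketch above, is to freeze a trained model and score it against successive monthly snapshots; the snapshot file and column names are again illustrative:

```python
# Sketch: score a frozen model against successive monthly snapshots and
# watch for decay. Assumes the model, features, and label from the
# earlier attrition sketch; hr_history.csv is a hypothetical file with
# an added snapshot_month column.
import pandas as pd
from sklearn.metrics import roc_auc_score

history = pd.read_csv("hr_history.csv")

for month, snapshot in history.groupby("snapshot_month"):
    scores = model.predict_proba(snapshot[features])[:, 1]
    auc = roc_auc_score(snapshot["left_company"], scores)
    print(f"{month}: AUC = {auc:.2f}")  # a steady slide signals drift
```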

It is, therefore, crucial to set up robust processes to monitor the ongoing performance of in-production models. For example, newly updated models should be automatically checked to ensure their performance meets some minimum threshold for key cases.
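A minimal version of such a check might look like the following, again assuming the hypothetical attrition setup; the AUC floor, file names, and deployment hook are all placeholders a real pipeline would define:

```python
# Sketch of an automated promotion gate: a retrained candidate must clear
# a minimum score on a curated set of key cases before it replaces the
# production model. Thresholds, files, and hooks are illustrative.
import pandas as pd
from sklearn.metrics import roc_auc_score

MIN_AUC = 0.75  # illustrative floor, set from business requirements

def passes_gate(candidate, key_cases: pd.DataFrame, features, label) -> bool:
    """Return True if the candidate clears the AUC floor on key cases."""
    scores = candidate.predict_proba(key_cases[features])[:, 1]
    return roc_auc_score(key_cases[label], scores) >= MIN_AUC

# In a retraining pipeline this check runs before any deployment step:
# if passes_gate(candidate_model, key_cases, features, "left_company"):
#     promote(candidate_model)  # hypothetical deployment hook
```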

Perhaps the deepest reason that deep learning projects fail is that models often fail to reflect the true causal structure of reality. Deep learning excels at finding correlations between inputs and outcomes in a given data set. However, unless one is dealing with a simple closed system such as a video game, it is difficult to move from these correlations to correctly recommended actions unless the model incorporates human knowledge about causality. Because of this, the most successful projects are likely to be those that combine deep learning with causal modeling.
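A toy simulation shows why acting on raw correlations can backfire. In this entirely synthetic example, workload drives both who attends training and who leaves, so training looks correlated with attrition until the confounder is adjusted for:

```python
# Entirely synthetic illustration of confounding: workload drives both
# training attendance and attrition, so training correlates with leaving
# even though it has no causal effect here.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 10_000
workload = rng.normal(size=n)
training = (workload + rng.normal(size=n) > 0).astype(float)
attrition = (workload + rng.normal(size=n) > 1).astype(float)

# Naive model: the coefficient on training is clearly positive...
naive = sm.Logit(attrition, sm.add_constant(training)).fit(disp=0)

# ...but adjusting for workload shrinks it toward its true value, zero.
X = sm.add_constant(np.column_stack([training, workload]))
adjusted = sm.Logit(attrition, X).fit(disp=0)

print("naive coefficients:", naive.params.round(2))
print("adjusted coefficients:", adjusted.params.round(2))
```

Acting on the naive model, say by cutting training to reduce attrition, would accomplish nothing; encoding the causal structure is what makes the recommendation sound.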
