Balancing Accuracy With Interpretability in Machine Learning Models
Joshua Mabry, Expert, Advanced Analytics practice, and Fernando Beserra, Specialist, Advanced Analytics practice, Bain & Company
To achieve a high level of accuracy, analysts train intricate black box models on large data sets that capture complex underlying relationships. The unfortunate trade-off traditionally has come in model interpretability, but concerns about bias, safety and auditability have sparked a cascade of research in this area. Recently, robust model interpretation methodologies, such as SHAP (Shapley additive explanations) and LIME (local interpretable model-agnostic explanations), have gained adoption in data science circles and have been incorporated into most of the commonly used software packages. One selling point is the ability to explain decisions at the level of a single prediction. This has been a massive advance for imbuing trust into predictive analytics applications and creating explanations that fit with human intuition.
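The idea behind single-prediction explanation can be sketched with the exact Shapley computation that SHAP approximates: each feature's attribution is its weighted average marginal contribution across all subsets of the other features, measured against a baseline. The toy "demand" model, instance, and baseline below are illustrative, not from the engagement described in this article.

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley attributions for one prediction.

    f        : model taking a list of feature values
    x        : the instance to explain
    baseline : reference values standing in for "absent" features
    """
    n = len(x)

    def v(subset):
        # features in `subset` take their observed values; the rest the baseline
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return f(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(len(others) + 1):
            for S in combinations(others, k):
                # classic Shapley weight |S|! (n-|S|-1)! / n!
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[i] += weight * (v(set(S) | {i}) - v(set(S)))
    return phi

# hypothetical demand model: price hurts, promotion helps, with an interaction
model = lambda z: 100 - 8 * z[0] + 5 * z[1] + 2 * z[0] * z[1]
x = [3.0, 1.0]     # observed price and promotion flag
base = [2.0, 0.0]  # background (average) values
phi = shapley_values(model, x, base)
```

By construction the attributions sum to `model(x) - model(base)`, which is why these explanations feel intuitive: the full predicted change in demand is divided fairly among the inputs. Production libraries such as `shap` use efficient approximations of this same quantity rather than the exponential enumeration shown here.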
We recently built an ML pipeline to forecast demand for generic products sold in a national retail chain. This retailer suffered from significant pricing competition among nimble competitors in an emerging market and needed a way to identify products most at risk without waiting to see long-term changes in market share.
Sales demand was affected by a large number of complex factors, including weather, marketing activities and substitution effects, and it needed to be predicted for hundreds of stores, each subject to different market conditions.
The scale and heterogeneity of the data led us to devise an ML solution based on an ensemble of models rather than taking a more traditional forecasting approach, as shown in the chart. By including thousands of additional variables in the model, this strategy delivered a significant increase in accuracy, with the downside being a loss of explainability.
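In spirit, the ensemble strategy combines forecasts from diverse base models rather than relying on any single one. The base models and equal weighting below are a minimal sketch, not the engagement's actual pipeline, which drew on far richer inputs.

```python
# minimal ensemble sketch: combine forecasts from diverse base models
def naive_last(series):
    """Persistence baseline: tomorrow looks like today."""
    return series[-1]

def moving_average(series, k=3):
    """Smooths out recent demand fluctuations."""
    window = series[-k:]
    return sum(window) / len(window)

def linear_trend(series):
    """Extrapolates the most recent period-over-period change."""
    return series[-1] + (series[-1] - series[-2])

def ensemble_forecast(series, models):
    # equal-weight average; in practice weights would be tuned
    # per store and product on held-out data
    preds = [m(series) for m in models]
    return sum(preds) / len(preds)

sales = [100, 104, 103, 108, 110]  # made-up weekly unit sales
forecast = ensemble_forecast(sales, [naive_last, moving_average, linear_trend])
```

Averaging diverse models tends to cancel their individual errors, which is one reason ensembles often beat a single traditional forecasting model at this scale, at the cost of being harder to explain.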
We found it useful for the retailer to use business expertise to group the model inputs into natural hierarchies and then compute variable importance for these high-level features. This approach allowed the analysts to focus on the overall effect of catalysts such as price rather than trying to interpret the raw output of our explanatory algorithm (SHAP) as provided by many off-the-shelf solutions. Analysts were quickly able to flag predicted declines in sales and the main reasons behind them without raising too many false alarms. That yielded both the benefit of black box model accuracy and the explanatory power usually associated with a simpler model.
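The grouping step can be sketched as follows: per-feature attributions for a single prediction are rolled up into a handful of business-level drivers by summing their magnitudes. The feature names, groupings, and attribution values here are illustrative placeholders, not the retailer's actual inputs.

```python
# roll per-feature attributions up into business-level drivers
# (feature names and groupings are hypothetical)
feature_groups = {
    "price":     ["own_price", "competitor_price", "discount_depth"],
    "marketing": ["tv_spend", "flyer", "display"],
    "weather":   ["temperature", "precipitation"],
}

def grouped_importance(attributions, groups):
    """Sum absolute per-feature attributions into one score per group."""
    return {g: sum(abs(attributions[f]) for f in feats)
            for g, feats in groups.items()}

# per-feature SHAP-style attributions for one store/week (made-up numbers)
attributions = {
    "own_price": -4.1, "competitor_price": 2.3, "discount_depth": 0.8,
    "tv_spend": 1.2, "flyer": 0.4, "display": -0.1,
    "temperature": 0.9, "precipitation": -0.3,
}
scores = grouped_importance(attributions, feature_groups)
top_driver = max(scores, key=scores.get)
```

Summing absolute values measures a group's overall influence; signed sums could instead be used when the question is specifically which driver pushed a forecast down. Either way, an analyst reviews a handful of named drivers such as "price" instead of dozens of raw model inputs.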
We highlight this case because it met our acceptance criteria for a black box model: First, the model accuracy is significantly higher than for simpler models, and second, the cost of a wrong answer is low.
However, we advise caution when setting policy based on this type of post hoc analysis and remain strong advocates of a test-and-learn approach, in which these types of insights inform rigorously controlled in-market tests. Nonetheless, we are seeing business leaders successfully use data science and ML methodologies. What once was viewed as the domain of the specialist now is better informing critical decisions throughout the corporate enterprise.