Model Interpretability: The Conversation Continues

Published on November 14, 2019

This Domino Data Science Field Note covers a proposed definition of interpretability and a distilled overview of the PDR framework. Insights are drawn from Bin Yu, W. James Murdoch, Chandan Singh, Karl Kumbier, and Reza Abbasi-Asl’s recent paper, “Definitions, methods, and applications in interpretable machine learning”.

Introduction

Model interpretability continues to spark public discourse across industry. We have covered model interpretability previously, including a proposed definition of machine learning (ML) interpretability. Yet Bin Yu, W. James Murdoch, Chandan Singh, Karl Kumbier, and Reza Abbasi-Asl argue in their recent paper, “Definitions, methods, and applications in interpretable machine learning”, that earlier definitions do not go far enough. Yu et al. advocate for “defining interpretability in the context of machine learning” and for using a Predictive, Descriptive, Relevant (PDR) framework because there is “considerable confusion about the notion of interpretability”.

Data science work is experimental, iterative, and at times, confusing. Yet, despite the complexity (or because of it), data scientists and researchers curate and use different languages, tools, packages, techniques, and frameworks to tackle the problems they are trying to solve. Industry is constantly assessing potential components to determine whether integrating them will help or hamper existing workflows. While we offer a platform-as-a-service, where industry can use their choice of languages, tools, and infrastructure to support model-driven workflows, we cover practical techniques and research in this blog to help industry make their own assessments. This blog post provides a distilled overview of the proposed PDR framework as well as some additional resources for industry to consider.

Why Iterate on the Definition of Interpretable ML?

While researchers including Finale Doshi-Velez and Been Kim have proposed and contributed definitions of interpretability, Yu et al. argue in the recent paper that prior definitions do not go far enough and

“This has led to considerable confusion about the notion of interpretability. In particular, it is unclear what it means to interpret something, what common threads exist among disparate methods, and how to select an interpretation method for a particular problem/ audience.”

and advocate

“Instead of general interpretability, we focus on the use of interpretations to produce insight from ML models as part of the larger data–science life cycle. We define interpretable machine learning as the extraction of relevant knowledge from a machine-learning model concerning relationships either contained in data or learned by the model. Here, we view knowledge as being relevant if it provides insight for a particular audience into a chosen problem. These insights are often used to guide communication, actions, and discovery. They can be produced in formats such as visualizations, natural language, or mathematical equations, depending on the context and audience.”

Yu et al. also argue that prior definitions focus on subsets of ML interpretability rather than treating it holistically, and that the Predictive, Descriptive, Relevant (PDR) Framework, coupled with a vocabulary, aims to “fully capture interpretable machine learning, its benefits, and its applications to concrete data problems.”

Predictive, Descriptive, Relevant (PDR) Framework

Yu et al. indicate that there is a lack of clarity regarding “how to select and evaluate interpretation methods for a particular problem and audience” and that the PDR Framework aims to address this challenge. The PDR framework consists of

“3 desiderata that should be used to select interpretation methods for a particular problem: predictive accuracy, descriptive accuracy, and relevancy”.

Yu et al. also argue that, for an interpretation to be trustworthy, practitioners should seek to maximize both predictive and descriptive accuracy. Yet there are trade-offs to consider when selecting a model. For example,

“the simplicity of model-based interpretation methods yields consistently high descriptive accuracy, but can sometimes result in lower predictive accuracy on complex datasets. On the other hand, in complex settings such as image analysis, complicated models can provide high predictive accuracy, but are harder to analyze, resulting in a lower descriptive accuracy.”

Predictive Accuracy

Yu et al. define predictive accuracy, in the context of interpretation, as how well the model approximates the underlying relationships in the data. If the approximation is poor, any insights extracted from the model are compromised as well. Errors of this kind can arise when the model is being constructed.

“Evaluating the quality of a model’s fit has been well studied in standard supervised ML frameworks, through measures such as test-set accuracy. In the context of interpretation, we describe this error as predictive accuracy. Note that in problems involving interpretability, one must appropriately measure predictive accuracy. In particular, the data used to check for predictive accuracy must resemble the population of interest. For instance, evaluating on patients from one hospital may not generalize to others. Moreover, problems often require a notion of predictive accuracy that goes beyond just average accuracy. The distribution of predictions matters. For instance, it could be problematic if the prediction error is much higher for a particular class.”
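The point about going beyond average accuracy can be illustrated with a small sketch. The labels, predictions, and hospital assignments below are hypothetical toy data, and the helper names `accuracy` and `accuracy_by_group` are illustrative assumptions rather than anything taken from the paper. The sketch only shows how a single overall score can hide weaker performance on one class or one subpopulation.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy_by_group(y_true, y_pred, groups):
    """Accuracy computed separately for each distinct value in `groups`."""
    hits = defaultdict(int)
    counts = defaultdict(int)
    for t, p, g in zip(y_true, y_pred, groups):
        counts[g] += 1
        hits[g] += int(t == p)
    return {g: hits[g] / counts[g] for g in counts}


# Hypothetical toy data: ten patients, two classes, two hospitals.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
hospital = ["A"] * 5 + ["B"] * 5

overall = accuracy(y_true, y_pred)
per_class = accuracy_by_group(y_true, y_pred, y_true)       # grouped by true class
per_hospital = accuracy_by_group(y_true, y_pred, hospital)  # grouped by hospital

print(overall)       # 0.9
print(per_class)     # {0: 1.0, 1: 0.5}
print(per_hospital)  # {'A': 1.0, 'B': 0.8}
```

In this toy case the overall accuracy of 0.9 looks healthy, yet half of the class 1 cases are misclassified, which is the kind of gap the quoted passage warns about.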

Descriptive Accuracy

Yu et al. define descriptive accuracy,

“in the context of interpretation, as the degree to which an interpretation method objectively captures the relationships learned by machine-learning models.”

Yu et al. indicate that descriptive accuracy is a challenge for complex black-box models or neural networks, where the learned relationship is not obvious.
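One rough way to make this concrete is to compare what an interpretation claims a model does against what the model actually outputs. The sketch below is an assumption-laden illustration, not a measure proposed by Yu et al.: `black_box`, `additive_explanation`, `full_explanation`, and `fidelity_gap` are all hypothetical names, and the mean absolute gap is just one possible proxy for how faithfully an interpretation describes a model.

```python
def black_box(x1, x2):
    """Hypothetical stand-in for a trained model that includes an interaction term."""
    return 2.0 * x1 + 0.5 * x1 * x2


def additive_explanation(x1, x2):
    """Hypothetical interpretation claiming the model depends on x1 alone."""
    return 2.0 * x1


def full_explanation(x1, x2):
    """Hypothetical interpretation that also captures the x1 * x2 interaction."""
    return 2.0 * x1 + 0.5 * x1 * x2


def fidelity_gap(model, explanation, points):
    """Mean absolute difference between model outputs and what the interpretation implies."""
    return sum(abs(model(*p) - explanation(*p)) for p in points) / len(points)


# Evaluate both interpretations on a small grid of hypothetical inputs.
points = [(x1, x2) for x1 in (0.0, 1.0, 2.0) for x2 in (0.0, 1.0, 2.0)]

gap_additive = fidelity_gap(black_box, additive_explanation, points)
gap_full = fidelity_gap(black_box, full_explanation, points)

print(gap_additive)  # 0.5
print(gap_full)      # 0.0
```

The simpler interpretation is easier to communicate but misses the interaction the model has learned, so under this proxy it describes the model less accurately than the fuller one.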

Relevancy

Yu et al. consider an interpretation relevant “if it provides insight for a particular audience into a chosen domain problem.” Yu et al. also indicate that relevancy informs trade-off decisions between the two accuracies, and they emphasize that the audience is a human one.

“Depending on the context of the problem at hand, a practitioner may choose to focus on one over the other. For instance, when interpretability is used to audit a model’s predictions, such as to enforce fairness, descriptive accuracy can be more important. In contrast, interpretability can also be used solely as a tool to increase the predictive accuracy of a model, for instance, through improved feature engineering.”

Conclusion and Resources

This Domino Data Science Field Note provided a distilled overview of Yu et al.’s definition of ML interpretability and the PDR framework to help researchers and data scientists assess whether to integrate specific techniques or frameworks into their existing workflow. For more information on interpretability, check out the following resources

Domino Data Science Field Notes provide highlights of data science research, trends, techniques, and more, that help data scientists and data science leaders accelerate their work. If you are interested in your data science work being covered in this blog series, please send us an email at writeforus(at)dominodatalab(dot)com.
