Model Interpretability with TCAV (Testing with Concept Activation Vectors)

This Domino Data Science Field Note provides very distilled insights and excerpts from Been Kim’s recent MLConf 2018 talk and research about Testing with Concept Activation Vectors...

Creating Multi-language Pipelines with Apache Spark or Avoid Having to Rewrite spaCy into Java

In this guest post, Holden Karau, Apache Spark Committer, provides insights on how to create multi-language pipelines with Apache Spark and avoid rewriting spaCy into Java. She...

Making PySpark Work with spaCy: Overcoming Serialization Errors

In this guest post, Holden Karau, Apache Spark Committer, provides insights on how to use spaCy to process text data. Karau is a Developer Advocate at Google, as well...

Collaboration Between Data Science and Data Engineering: True or False?

This blog post includes candid insights about addressing tension points that arise when people collaborate on developing and deploying models. Domino’s Head of Content sat down...

Growing Data Scientists Into Manager Roles

In this post, Ricky Chachra, Research Science Manager at Lyft, provides insight for companies looking to home-grow their promising individual contributors (ICs) into effective managers. He...

Justified Algorithmic Forgiveness?

Last week, Paco Nathan referenced Julia Angwin’s recent Strata keynote that covered algorithmic bias. This Domino Data Science Field Note dives a bit deeper into some...

Trust in LIME: Yes, No, Maybe So? 

In this Domino Data Science Field Note, we briefly discuss an algorithm and framework for generating explanations, LIME (Local Interpretable Model-Agnostic Explanations), that may help data...

Item Response Theory in R for Survey Analysis

In this guest blog post, Derrick Higgins, of American Family Insurance, covers item response theory (IRT) and how data scientists can apply it within a project. As...

Make Machine Learning Interpretability More Rigorous

This Domino Data Science Field Note covers a proposed definition of machine learning interpretability, why interpretability matters, and the arguments for considering a rigorous evaluation of...

Learn from the Reproducibility Crisis in Science

Key highlights from Clare Gollnick’s talk, “The limits of inference: what data scientists can learn from the reproducibility crisis in science”, are covered in this Domino...

Feature Engineering: A Framework and Techniques 

This Domino Field Note provides highlights and excerpted slides from Amanda Casari’s “Feature Engineering for Machine Learning” talk at QCon Sao Paulo. Casari is the Principal...

Three Simple Worrying Stats Problems

In this guest post, Sean Owen, writes about three data situations that provide ambiguous results and how causation helps clarifies the interpretation of data. A version...

The Past/Present/Future + Myths of Data Science

Sivan Aldor-Noiman, VP of Data Science at Wellio, presented “The Past/Present/Future + Myths of Data Science” at Domino. This blog post provides a few highlights from...