Latest

Themes and Conferences per Pacoid, Episode 4

Paco Nathan's latest column covers themes that include data privacy, machine ethics, and yes, Don Quixote. Introduction Welcome back to our monthly series about data science....

SHAP and LIME Python Libraries: Part 1 – Great Explainers, with Pros and Cons to Both

This blog post provides a brief technical introduction to the SHAP and LIME Python libraries, followed by code and output to highlight a few pros and...

Making PySpark Work with spaCy: Overcoming Serialization Errors

In this guest post, Holden Karau, Apache Spark Committer, provides insights on how to use spaCy to process text data. Karau is a Developer Advocate at Google, as well...

Themes and Conferences per Pacoid, Episode 3

Paco Nathan‘s column covers themes that include open source, "intelligence is a team sport", and "implications of massive latent hardware". Introduction Welcome to our monthly series...

Justified Algorithmic Forgiveness?

Last week, Paco Nathan referenced Julia Angwin’s recent Strata keynote that covered algorithmic bias. This Domino Data Science Field Note dives a bit deeper into some...

Themes and Conferences per Pacoid, Episode 2

Paco Nathan's column covers themes of data science for accountability, reinforcement learning challenges assumptions, as well as surprises within AI and Economics. Introduction Welcome back to...

Trust in LIME: Yes, No, Maybe So? 

In this Domino Data Science Field Note, we briefly discuss an algorithm and framework for generating explanations, LIME (Local Interpretable Model-Agnostic Explanations), that may help data...

Benchmarking NVIDIA CUDA 9 and Amazon EC2 P3 Instances Using Fashion MNIST

In this post, Josh Poduska, Chief Data Scientist at Domino Data Lab, writes about benchmarking NVIDIA CUDA 9 and Amazon EC2 P3 Instances Using Fashion MNIST....

Themes and Conferences per Pacoid, Episode 1

Introduction: New Monthly Series! Welcome to a new monthly series! I’ll summarize highlights from recent industry conferences, new open source projects, interesting research, great examples, amazing...

Make Machine Learning Interpretability More Rigorous

This Domino Data Science Field Note covers a proposed definition of machine learning interpretability, why interpretability matters, and the arguments for considering a rigorous evaluation of...

Learn from the Reproducibility Crisis in Science

Key highlights from Clare Gollnick’s talk, “The limits of inference: what data scientists can learn from the reproducibility crisis in science”, are covered in this Domino...

Feature Engineering: A Framework and Techniques 

This Domino Field Note provides highlights and excerpted slides from Amanda Casari’s “Feature Engineering for Machine Learning” talk at QCon Sao Paulo. Casari is the Principal...

Three Simple Worrying Stats Problems

In this guest post, Sean Owen, writes about three data situations that provide ambiguous results and how causation helps clarifies the interpretation of data. A version...