Tag: Python

Themes and Conferences per Pacoid, Episode 4

Paco Nathan's latest column covers themes that include data privacy, machine ethics, and yes, Don Quixote. Introduction: Welcome back to our monthly series about data science....

SHAP and LIME Python Libraries: Part 1 – Great Explainers, with Pros and Cons to Both

This blog post provides a brief technical introduction to the SHAP and LIME Python libraries, followed by code and output to highlight a few pros and...

Making PySpark Work with spaCy: Overcoming Serialization Errors

In this guest post, Holden Karau, Apache Spark Committer, provides insights on how to use spaCy to process text data. Karau is a Developer Advocate at Google, as well...

On the Importance of Community-Led Open Source

Wes McKinney, Director of Ursa Labs and creator of the pandas project, presented the keynote, "Advancing Data Science Through Open Source," at Rev. McKinney's keynote covered open...

Building a Domino Web App with Dash

Randi R. Ludwig, Data Scientist at Dell EMC and an organizer of Women in Data Science ATX, covers how to build a Domino web app with...

Intel’s Python Distribution is Smoking Fast, and Now it’s in Domino

Domino just finished benchmarking Intel’s Python Distribution, and it is fast, very fast. Intel’s Python Distribution is available for use in Domino. Intel’s Python Distribution: People...

Reproducible Machine Learning with Jupyter and Quilt

In this guest blog post, Aneesh Karve, Co-founder and CTO of Quilt, demonstrates how Quilt works in conjunction with Domino's Reproducibility Engine to make Jupyter notebooks...

Reproducible Dashboards and Other Great Things to do with Jupyter

Mac Rogers, Research Engineer at Domino, presented best practices for creating Jupyter dashboards at a recent Domino Data Science Pop-Up. Session Summary: In this Data Science...

Horizontal Scaling for Parallel Experimentation

The amount of time data scientists spend waiting for experiment results is the difference between making incremental improvements and making significant advances. With parallel experimentation, data...

Multicore Data Science with R and Python

This article is an excerpt from the full video on Multicore Data Science in R and Python. Watch the full video to learn how to leverage...

Imbalanced Datasets

Imagine you are a medical professional who is training a classifier to detect whether an individual has an extremely rare disease. You train your classifier, and...

Fitting Gaussian Process Models in Python

Written by Chris Fonnesbeck, Assistant Professor of Biostatistics, Vanderbilt University Medical Center. You can view, fork, and play with this project on the Domino data science...

Achieving Reproducibility with Conda and Domino Environments

Managing “environments” (i.e., the set of packages, configuration, etc.) is a critical capability of any Data Science Platform. Not only does environment setup waste time on-boarding...