Recommender Systems through Collaborative Filtering

This is a technical deep dive of the collaborative filtering algorithm and how to use it in practice. From Amazon recommending products you may be interested...

Imbalanced Datasets

Imagine you are a medical professional who is training a classifier to detect whether an individual has an extremely rare disease. You train your classifier, and...

Density-Based Clustering

Cluster Analysis is an important problem in data analysis. Data scientists use clustering to identify malfunctioning servers, group genes with similar expression patterns, or various other...

A/B Testing with Hierarchical Models in Python

In this post, I discuss a method for A/B testing using Beta-Binomial Hierarchical models to correct for a common pitfall when testing multiple hypotheses. I will...

Faster deep learning with GPUs and Theano

Domino recently added support for GPU instances. To celebrate this release, I will show you how to: Configure the Python library Theano to use the GPU...

Social network analysis with NetworkX

Many types of real-world problems involve dependencies between records in the data. For example, sociologist are eager to understand how people influence the behaviors of their...