    Latest

    Density-Based Clustering

    Original content by Manojit Nandi - Updated by Josh Poduska. Cluster analysis is an important problem in data analysis. Data scientists use clustering to identify...

    Clustering in R

    This article covers clustering, including K-means and hierarchical clustering. A complementary Domino project is available. Clustering is...

    Time Series with R

    This article delves into methods for analyzing multivariate and univariate time series data. A complementary Domino project is available.

    Can Data Science Help Us Make Sense of the Mueller Report?

    This blog post provides insights on how to apply Natural Language Processing (NLP) techniques. The Mueller Report, officially...

    Manipulating Data with dplyr

    Special thanks to Addison-Wesley Professional for permission to excerpt the following "Manipulating data with dplyr" chapter from the book, ...

    Item Response Theory in R for Survey Analysis

    In this guest blog post, Derrick Higgins, of American Family Insurance, covers item response theory (IRT) and how data scientists can apply it within...

    Three Simple Worrying Stats Problems

    In this guest post, Sean Owen writes about three data situations that provide ambiguous results and how causation helps clarify the interpretation...

    On the Importance of Community-Led Open Source

    Wes McKinney, Director of Ursa Labs and creator of the pandas project, presented the keynote, "Advancing Data Science Through Open Source" at Rev....

    Large Visualizations in canvasXpress

    Dr. Connie Brett is the owner of Aggregate Genius. Dr. Brett provides custom visualization tool development and support for the Translational...

    Data Scientist? Programmer? Are They Mutually Exclusive?

    This Domino Data Science Field Note blog post provides highlights of Hadley Wickham’s ACM Chicago talk, “You Can’t Do Data Science in a GUI”. In his...

    Summertime Analytics: Predicting E. Coli and West Nile Virus

    Gene Leynes (Senior Data Scientist) and Nick Lucius (Advanced Analytics) from the City of Chicago discussed two predictive analytics projects that...

    Model Deployment Powered by Kubernetes

    In this article we explain how we’re using Kubernetes to enable data scientists to deploy predictive models as production-grade APIs. ...

    Horizontal Scaling for Parallel Experimentation

    The amount of time data scientists spend waiting for experiment results is the difference between making incremental improvements and making...

    Multicore Data Science with R and Python

    This article is an excerpt from the full video on [Multicore Data Science in R and Python]. Watch the full video to learn how to leverage multicore...

    Using Monte Carlo Simulations in R to Test Methodological Advances in Social Policy Research

    This is a guest post written by Kristin Porter, Senior Research Associate at MDRC. MDRC is a nonprofit, nonpartisan education and social policy...

    R vs. Python for Data Science

    While the elections are over, some debates continue. R and Python are both popular programming languages for data scientists. Each has its advantages...