Analyzing Large P Small N Data – Examples from Microbiome

Guest Post by Bill Shannon, Co-Founder and Managing Partner of BioRankings. Introduction: High-throughput screening technologies have been developed to measure all the molecules of interest...

Bringing ML to Agriculture: Transforming a Millennia-old Industry

Guest post by Jeff Melching from The Climate Corporation. At The Climate Corporation, we aim to help farmers better understand their operations and make better decisions...

The Curse of Dimensionality

Guest Post by Bill Shannon, Founder and Managing Partner of BioRankings. Danger of Big Data: Big data is the rage. This could be lots of rows...

Evaluating Generative Adversarial Networks (GANs)

This article provides concise insights into GANs to help data scientists and researchers assess whether to investigate GANs further. If you are interested in a tutorial...

Techniques for Collecting, Prepping, and Plotting Data: Predicting Social Media-Influence in the NBA

This article provides insight into the mindset, approach, and tools to consider when solving a real-world ML problem. It covers questions to consider as well as...

Clustering in R

This article covers clustering, including K-means and hierarchical clustering. A complementary Domino project is available. Introduction: Clustering is a machine learning technique that enables researchers and...

Understanding Causal Inference

This article covers causal relationships and includes a chapter excerpt from the book Machine Learning in Production: Developing and Optimizing Data Science Workflows and Applications by...

Time Series with R

This article delves into methods for analyzing multivariate and univariate time series data. A complementary Domino project is available. Introduction: Conducting exploratory analysis and extracting meaningful...

Announcing Trial and Domino 3.5: Control Center for Data Science Leaders

Even the most sophisticated data science organizations struggle to keep track of their data science projects. Data science leaders want to know, at any given moment,...

Announcing Domino 3.4: Furthering Collaboration with Activity Feed

Our last release, Domino 3.3, saw the addition of two major capabilities: Datasets and Experiment Manager. “Datasets”, a high-performance, revisioned data store, offers data scientists the...

Comparing the Functionality of Open Source Natural Language Processing Libraries

In this guest post, Maziyar Panahi and David Talby provide a cheat sheet for choosing open source NLP libraries. What do natural language processing libraries do?...

Manipulating Data with dplyr

Special thanks to Addison-Wesley Professional for permission to excerpt the following "Manipulating data with dplyr" chapter from the book, Programming Skills for Data Science: Start Writing...

Announcing Domino 3.3: Datasets and Experiment Manager

Our mission at Domino is to enable organizations to put models at the heart of their business. Models are so different from software — e.g., they...
