Tag: Data Science

Improving Zillow’s Zestimate with 36 Lines of Code

Zillow and Kaggle recently started a $1 million competition to improve the Zestimate. We are releasing a public Domino project that uses H2O’s AutoML to generate...

Data Scientists are Analysts are Software Engineers

In this Data Science Popup session, W. Whipple Neely, Director of Data Science at Electronic Arts, explains why data scientists have responsibilities beyond just data science....

Horizontal Scaling for Parallel Experimentation

The amount of time data scientists spend waiting for experiment results is the difference between making incremental improvements and making significant advances. With parallel experimentation, data...

What Data Scientists Should Know About Hiring, Sharing, and Collaborating

In this post we summarize some of our most recent and favorite answers on Quora to questions from the community about hiring junior data scientists, sharing...

Multicore Data Science with R and Python

This article is an excerpt from the full video on Multicore Data Science in R and Python. Watch the full video to learn how to leverage...

Imbalanced Datasets

Imagine you are a medical professional who is training a classifier to detect whether an individual has an extremely rare disease. You train your classifier, and...

The Cost of Doing Data Science on Laptops

At the heart of the data science process are the resource-intensive tasks of modeling and validation. During these tasks, data scientists will try and discard...

‘Lean’ Data Science

In this Data Science Popup session, Noelle Sio, Principal Data Scientist at Pivotal, explains how to apply Lean methodology to data science....

Benchmarking Predictive Models

It's been said that debugging is harder than programming. If we, as data scientists, are developing models ("programming") at the limits of our understanding, then we're...

Data Science on AWS: Benefits and Common Pitfalls

More than two years ago, we wrote about the misguided fear of the cloud among many enterprise companies. How quickly things change! Today, every enterprise we...

Numenta Anomaly Benchmark: A Benchmark for Streaming Anomaly Detection

Written by Subutai Ahmad, VP Research at Numenta. With sensors invading our everyday lives, we are seeing an exponential increase in the availability of streaming, time-series...

Principles of Collaboration in Data Science

Data science is no longer a specialization of a single person or small group. It is now a key source of competitive advantage, and as a...

Building a Model is the Least Important Part of Your Job

In this Data Science Popup session, Kimberly Shenk, Director of Data Science Solutions at Domino Data Lab, explains why building models is the least important part...