Tag: R

Data Scientist? Programmer? Are They Mutually Exclusive?

This Domino Data Science Field Note blog post provides highlights of Hadley Wickham’s ACM Chicago talk, “You Can’t Do Data Science in a GUI”. In his talk,...

Summertime Analytics: Predicting E. Coli and West Nile Virus

Gene Leynes (Senior Data Scientist) and Nick Lucius (Advanced Analytics) from the City of Chicago discussed two predictive analytics projects that forecasted potential risk involved with...

Model Deployment Powered by Kubernetes

In this article we explain how we’re using Kubernetes to enable data scientists to deploy predictive models as production-grade APIs. Background: Domino lets users publish R...

Horizontal Scaling for Parallel Experimentation

The amount of time data scientists spend waiting for experiment results is the difference between making incremental improvements and making significant advances. With parallel experimentation, data...

Multicore Data Science with R and Python

This article is an excerpt from the full video on Multicore Data Science in R and Python. Watch the full video to learn how to leverage...

Using Monte Carlo Simulations in R to Test Methodological Advances in Social Policy Research

This is a guest post written by Kristin Porter, Senior Research Associate at MDRC. MDRC is a nonprofit, nonpartisan education and social policy research organization dedicated...

R vs. Python for Data Science

While the elections are over, some debates continue. R and Python are both popular programming languages for data scientists. Each has its advantages for performing data...

A Quick Benchmark of Hashtable Implementations in R

UPDATE: I am humbled and thankful to have had so much feedback on this post! It started out as a quick and dirty benchmark but I...

High-performance Computing with Amazon’s X1 Instance – Part II

When you have at your disposal 128 cores and 2TB of RAM, it’s hard not to experiment and attempt to find ways to leverage the amount...

A Summary of Using k-NN in Production

This week, Domino’s Chief Data Scientist, Eduardo Ariño de la Rubia, presented a webinar: An Introduction to Using k-NN in Production. If you missed the live...

Join Us: An Introduction to Using k-NN in Production

Join us next Wednesday, October 5, for a webinar hosted by our Chief Data Scientist covering best practices for using k-NN in production. The k-Nearest Neighbors...

An Introduction to Model-Based Machine Learning

This guest post was written by Daniel Emaasit, a Ph.D. student of Transportation Engineering at the University of Nevada, Las Vegas. Daniel's research interests include the...

Providing Digital Provenance: from Modeling through Production

At last week's useR! R User conference, I spoke on digital provenance, the importance of reproducible research, and how Domino has solved many of the challenges...