    About Eduardo Ariño de la Rubia

    Improving Zillow's Zestimate with 36 Lines of Code

    Zillow and Kaggle recently started a $1 million competition to improve the Zestimate. We used H2O’s AutoML to generate a solution. The new Kaggle...

    Horizontal Scaling for Parallel Experimentation

    The amount of time data scientists spend waiting for experiment results is the difference between making incremental improvements and making...

    What Data Scientists Should Know About Hiring, Sharing, and Collaborating

    In this post we summarize some of our most recent and favorite answers on Quora to questions from the community about hiring junior data scientists,...

    Multicore Data Science with R and Python

    This article is an excerpt from the full video on Multicore Data Science in R and Python. Watch the full video to learn how to leverage multicore...

    Git Integration in Domino

    We recently released new functionality that provides first-class integration between Domino and git. This post describes the new feature, and...

    The Cost of Doing Data Science on Laptops

    At the heart of the data science process are the resource intensive tasks of modeling and validation. During these tasks, data scientists will try...

    Benchmarking Predictive Models

    It's been said that debugging is harder than programming. If we, as data scientists, are developing models ("programming") at the limits of our...

    Principles of Collaboration in Data Science

    Data science is no longer a specialization of a single person or small group. It is now a key source of competitive advantage, and as a result, the...

    Achieving Reproducibility with Conda and Domino Environments

    Managing “environments” (i.e., the set of packages, configuration, etc.) is a critical capability of any Data Science Platform. Not only does...

    Gain Shell Access To Your Domino Instances

    Domino offers a managed, scalable compute environment that provides push-button convenience to data scientists, whether they're interested in...

    A Quick Benchmark of Hashtable Implementations in R

    UPDATE: I am humbled and thankful to have had so much feedback on this post! It started out as a quick and dirty benchmark but I had some great...

    High-performance Computing with Amazon's X1 Instance - Part II

    When you have at your disposal 128 cores and 2TB of RAM, it’s hard not to experiment and attempt to find ways to leverage the amount of power that is...

    High-performance Computing with Amazon's X1 Instance

    We’re excited to announce support for Amazon’s X1 instances. Now in Domino, you can do data science on machines with 128 cores and 2TB of RAM — with...

    Providing Digital Provenance: from Modeling through Production

    At last week's useR! R User conference, I spoke on digital provenance, the importance of reproducible research, and how Domino has solved many of the...

    Announcing Enhanced Apache Spark Support

    Domino now offers data scientists a simple yet incredibly powerful way to conduct quantitative work using Apache Spark. Apache Spark has captured...

    Ugly Little Bits of the Data Science Process

    This morning there was a great conversation on Twitter, kicked off by Hadley Wickham, about one of the ugly little bits of the data science process.