Better interactive data science with Beaker and Rodeo

Domino has supported IPython/Jupyter for a while, but we recently added support for two newer, up-and-coming tools for interactive data science: Beaker Notebook and Rodeo. This post gives a brief overview of each tool and describes how to use them on Domino.

Power tools at your fingertips

The motivation behind Domino is twofold: to make data scientists more productive by letting them focus on their analysis without worrying about infrastructure and configuration, and to facilitate collaboration and sharing among teams by keeping work organized and tracked in a central place.

To that end, Domino now lets you spin up Rodeo or Beaker sessions on big machines, and keeps your files and notebooks stored centrally so it's easier to track, share, and comment on them.

Beaker

Beaker Notebook is a notebook application from the team at Two Sigma Open Source, in some ways similar to Jupyter/IPython notebooks. But in addition to supporting inline code, documentation and visualization in many different languages, Beaker lets you mix languages. That's right: one notebook can mix code from any of the languages it supports, and Beaker's slick interop capabilities seamlessly translate data between languages. This even works for DataFrames and more complex types.

There's a lot going on under the hood to make that work — it's pretty magical.

This makes Beaker the ultimate weapon for those who believe in "using the best tool for the job": one single analytical workflow can use Python for data prep, R for sophisticated statistical analysis, and HTML with D3 or LaTeX for beautiful visualization and presentation.

Beaker supports R, Python, Julia, Scala, HTML, JavaScript, NodeJS, LaTeX, Java, Groovy, Clojure, Ruby, and Kdb, although right now Domino's support for Beaker includes only a few of those. Let us know if you want to see others!

You can watch a video of one of Beaker's creators speaking about it at SciPy 2015. You can also play with Beaker yourself, without any installation or setup, on Domino. You can create your own project to do this, or use the public project we've shared.

  1. Start a Beaker session by clicking on the "Notebook" menu on your "Runs" dashboard.

  2. When the server is ready, click the "Open session" button in the right pane.

  3. Create a new notebook, import one of Beaker's examples, or use the file menu to browse to "/mnt" and choose one of the files in our project (viz.bkr or interop.bkr).

The viz.bkr notebook in the project uses Python to compute a graph, then HTML/D3/JavaScript to visualize it in the notebook.
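
To give a feel for the Python half of that kind of workflow, here is a minimal sketch of computing a graph and shaping it as the node-link structure that D3 examples commonly consume. This is not the code from viz.bkr: the `build_graph` helper and the sample `groups` data are made up for illustration, and the closing comment about handing the result to other cells through a shared `beaker` object is our assumption about how you would wire it up inside Beaker.

```python
import json
from itertools import combinations


def build_graph(groups):
    """Turn {group_name: [member, ...]} into a D3-style node-link dict."""
    # One node per distinct member, in a stable (sorted) order.
    names = sorted({m for members in groups.values() for m in members})
    index = {name: i for i, name in enumerate(names)}
    nodes = [{"name": name} for name in names]

    # Edge weight = number of groups a pair of members has in common.
    weights = {}
    for members in groups.values():
        for a, b in combinations(sorted(set(members)), 2):
            weights[(a, b)] = weights.get((a, b), 0) + 1

    # Links refer to nodes by position in the `nodes` list.
    links = [
        {"source": index[a], "target": index[b], "value": w}
        for (a, b), w in sorted(weights.items())
    ]
    return {"nodes": nodes, "links": links}


# Hypothetical sample data, purely for illustration.
groups = {
    "prep": ["ann", "bob"],
    "stats": ["bob", "cat"],
    "viz": ["ann", "bob", "cat"],
}
graph = build_graph(groups)
graph_json = json.dumps(graph)  # plain JSON that a JavaScript cell could parse

# Assumption: inside a Beaker Python cell you would publish `graph` to other
# languages via the notebook's shared `beaker` object (e.g. beaker.graph = graph).
# It is left as a comment because that object only exists inside Beaker.
```

Keeping the result as plain dicts and lists is deliberate: structures like that are the easiest thing to translate between languages, whether through Beaker's interop or a JSON string.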

The interop.bkr notebook shows some nice examples of Beaker's flexibility for translating data between languages.

Rodeo

Rodeo is an open source Python IDE from the folks at yHat. It answers the question, "is there anything like RStudio for Python?"

Rodeo is just that: it's a web-based IDE for editing Python files that gives you a code editor along with a plot viewer and a file browser in one interface. Unlike Python editors designed for building large software systems, Rodeo is tailored for doing data science in Python — especially with its built-in plot viewer.

You can read more about our support for Rodeo on our help site, and you can give Rodeo a try — without any setup or configuration — by visiting that same Beaker project on Domino and starting a Rodeo session: