by Dan Enthoven on December 27th, 2017
Domino just finished benchmarking Intel’s Python Distribution, and it is fast, very fast. Intel’s Python distribution is available for use in Domino.
Intel’s Python Distribution
You may not know that Intel has its own Python Distribution. It is based on the Anaconda Distribution, and the engineers at Intel have optimized popular math and statistical packages such as NumPy, SciPy and scikit-learn using the Intel® Math Kernel Library and Intel® Data Analytics Acceleration Library. Intel continues to collaborate with all major Python Distributions, including Anaconda, to make these Intel Architecture (IA) optimizations and performance accelerations widely available to all Python users.
Intel’s benchmark results indicate that batch runs that might have taken over an hour can now complete in as little as two minutes. In a Jupyter Notebook, the resulting speedups mean that cells that used to take minutes to compute now finish in seconds.
Domino Benchmarks Intel’s Python Distribution
At Domino, we wanted to run the benchmarks in a few different scenarios to see how the speedups would affect real-world data science programs. For each benchmark, we ran identical experiments in which the only variable we changed was the Python Distribution being used.
It is easy to change environments and hardware in Domino, so changing environments to run benchmarks takes just a few seconds.
Once we had an environment with Intel Python, we could kick off all the benchmarks at the same time, and know we were isolating the environment as the variable. Also, we could run complex jobs on smaller and larger machines to see how that changed the results.
The first benchmark we ran used scikit-learn to compute the distance matrix from a vector array. Each benchmark was run three times on a machine with 16 cores and 120 GB of RAM.
Distribution          Average Time to Complete
Standard Python 2.7   12m 17s
Intel Python 3.6      2m 17s
Intel Python consistently completed the runs in less than 20% of the time that the Standard Python Distribution took.
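To make the shape of this first benchmark concrete, here is a minimal sketch of a distance-matrix computation wrapped in a three-run timing harness. It is not the code Domino ran: the original used scikit-learn, whereas this stand-in uses plain NumPy so it has a single dependency, and the function names, the Euclidean metric and the problem size are all assumptions made for illustration.

```python
import time

import numpy as np


def euclidean_distance_matrix(X):
    """Return the (n, n) matrix of Euclidean distances between the rows of X."""
    # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b, so the heavy work is a single
    # matrix product, which NumPy hands to its linear algebra backend.
    sq_norms = np.einsum("ij,ij->i", X, X)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    # Rounding can leave tiny negative values; clip them before the square root.
    np.maximum(sq_dists, 0.0, out=sq_dists)
    return np.sqrt(sq_dists)


def average_runtime(func, *args, repeats=3):
    """Call func(*args) `repeats` times and return the mean wall-clock seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        timings.append(time.perf_counter() - start)
    return sum(timings) / len(timings)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.random((2000, 100))  # hypothetical size, not the one used in the post
    seconds = average_runtime(euclidean_distance_matrix, X)
    print(f"average over 3 runs: {seconds:.3f} s")
```

Running the same script unchanged under two different Python environments, on the same hardware, is what keeps the distribution as the only variable.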
The second test we ran used a Black Scholes benchmark on a smaller, shared box. The server had four CPUs and 16 GB of RAM.
Distribution          Average Time to Complete
Standard Python 2.7   9m 21s
Intel Python 3.6      2m 50s
Again, the time savings from using Intel’s Python Distribution were substantial. Even saving six to ten minutes per experiment, as in the two benchmarks here, leads to a substantial improvement in research results. Faster results allow for more iterations, and also make it less likely that researchers will be distracted and pulled away in the middle of their work. When runs are shortened from hours to just a few minutes, the difference is even more valuable.
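For readers unfamiliar with the second workload, a Black Scholes benchmark typically prices a large batch of European options with array math. The sketch below shows that kind of kernel under stated assumptions: it is not the benchmark code Domino ran, the function and parameter names are illustrative, and the error function is built from the standard library so the example needs only NumPy (scipy.special.erf is the usual vectorized alternative).

```python
import math
import time

import numpy as np

# Element-wise error function built from math.erf so the sketch depends only
# on NumPy. This wrapper loops in Python, so it is a portable stand-in rather
# than a fast implementation.
_erf = np.vectorize(math.erf)


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + _erf(x / math.sqrt(2.0)))


def black_scholes(spot, strike, maturity, rate, vol):
    """Return (call, put) prices of European options; inputs may be arrays."""
    sqrt_t = np.sqrt(maturity)
    d1 = (np.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    discount = np.exp(-rate * maturity)
    call = spot * norm_cdf(d1) - strike * discount * norm_cdf(d2)
    put = strike * discount * norm_cdf(-d2) - spot * norm_cdf(-d1)
    return call, put


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 100_000  # hypothetical batch size
    spot = rng.uniform(10.0, 50.0, n)
    strike = rng.uniform(10.0, 50.0, n)
    maturity = rng.uniform(1.0, 2.0, n)
    start = time.perf_counter()
    black_scholes(spot, strike, maturity, rate=0.1, vol=0.2)
    print(f"priced {n} options in {time.perf_counter() - start:.3f} s")
```

Most of the arithmetic here is element-wise logarithms, exponentials and square roots over large arrays, which is the kind of work an optimized math library is designed to speed up.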
Intel Python Environments Available in Domino
Domino customers can benefit from Intel’s Python Distribution right away. We’ve already created Intel Python environments in both our trial environment and our cloud production environment. Anyone who wants to see how the Black Scholes benchmarks were run, or to try them, is welcome to view and run the code.

