Subject archive for "parallel-processing"

Python

Snowflake and RAPIDS For On-Demand Computing by a Storm

With data being described as the oil of the 21st century and data science labeled the sexiest job of the century, we're seeing a sharp rise in data science and machine learning applications in every field. In IT, finance, and business alike, predictive analytics is disrupting entire industries.

By Richard Ecker · 8 min read

Code

Parallel Computing with Dask: A Step-by-Step Tutorial

It’s now normal for computational power to improve continuously. New devices with better features and more processing power appear monthly, or at times even weekly. However, these enhancements require intensive hardware resources. The question, then, is whether you can use all the computational resources that a device provides. In most cases, the answer is no, and you’ll get an out-of-memory error. So how can you make use of all the computational resources without changing the underlying architecture of your project?

By Gourav Singh Bais · 15 min read

Machine Learning

Speeding up Machine Learning with parallel C/C++ code execution via Spark

The C programming language was introduced over 50 years ago, and it has consistently ranked among the most used programming languages ever since. With the introduction of the C++ extension in 1985 and the addition of classes and objects, the C/C++ pair keeps a central role in the development of all major operating systems, databases, and performance-critical applications in general. Because of its efficiency, C/C++ underpins a large number of machine learning libraries (e.g. TensorFlow, Caffe, CNTK) and widely used tools (e.g. MATLAB, SAS). C++ may not be the first thing that springs to mind when thinking about Machine Learning and Big Data, but it is present wherever lightning-fast computations are needed, from Google's Bigtable and GFS to pretty much everything GPU related (CUDA, OCCA, OpenCL, etc.).

By Nikolay Manchev · 12 min read

Data Science

Polars - A lightning fast DataFrames library

We have previously talked about the challenges that the latest SOTA models present in terms of computational complexity. We've also talked about frameworks like Spark, Dask, and Ray, and how they help address this challenge using parallelization and GPU acceleration.

By Nikolay Manchev · 7 min read
