Subject archive for "code-featured"

Data Science

The Importance of Structure, Coding Style, and Refactoring in Notebooks

Notebooks are increasingly crucial in the data scientist's toolbox. Although considered relatively new, their history traces back to systems like Mathematica and MATLAB. This form of interactive workflow was introduced to help data scientists document their work, facilitate reproducibility, and promote collaboration with team members. Recently there has been an influx of newcomers, and data scientists now have a wide range of implementations to choose from, such as Jupyter Notebook, Zeppelin, R Markdown, Spark Notebook, and Polynote.

By Nikolay Manchev, 26 min read

Data Science

Natural Language Processing in Python using spaCy: An Introduction

This article provides a brief introduction to natural language processing using spaCy and related libraries in Python.

By Paco Nathan, 15 min read

Data Science

Data Scientist? Programmer? Are They Mutually Exclusive?

This Domino Data Science Field Note provides highlights of Hadley Wickham’s ACM Chicago talk, “You Can’t Do Data Science in a GUI”. In his talk, Wickham argues that code, unlike a GUI, provides reproducibility, data provenance, and the ability to track changes, so that data scientists can see how an analysis has evolved. As the creator of ggplot2, Wickham unsurprisingly also advocates using visualizations and models together to help data scientists find the real signals within their data. This blog post also provides clips from the original video and follows the Creative Commons license affiliated with the original video recording.

By Ann Spencer, 7 min read
