    Latest

    The Curse of Dimensionality

    Guest Post by Bill Shannon, Founder and Managing Partner of BioRankings. Danger of Big Data: Big data is the rage. This could be lots of rows (samples) and few columns...

    The importance of structure, coding style, and refactoring in notebooks

    Notebooks are increasingly crucial in the data scientist's toolbox. Although considered relatively new, their history traces back to systems like...

    Natural Language in Python using spaCy: An Introduction

    This article provides a brief introduction to natural language processing using spaCy and related libraries in Python. The complementary Domino project is also...

    Creating Multi-language Pipelines with Apache Spark or Avoid Having to Rewrite spaCy into Java

    In this guest post, Holden Karau, Apache Spark Committer, provides insights on how to create multi-language pipelines with Apache Spark and avoid...

    Data Scientist? Programmer? Are They Mutually Exclusive?

    This Domino Data Science Field Note blog post provides highlights of Hadley Wickham’s ACM Chicago talk, “You Can’t Do Data Science in a GUI”. In his...