Subject archive for "reproducibility"

Reproducibility

Why AI reproducibility is the holy grail of good governance

True reproducibility means anyone can return to any point in time, anywhere in the AI/ML lifecycle, and see how a model was built and understand its purpose and KPIs. Yet most AI models are built outside of controlled environments and systems of record. Enterprise AI platforms like Domino solve this by automatically capturing and unifying all model provenance and all artifacts across teams, users, tools, and environments, removing the need for manual detective work, which can produce mixed results.

By Leila Nouri, 7 min read


Perspective

The Case for Reproducible Data Science

Reproducibility is a cornerstone of the scientific method: it ensures that tests and experiments can be repeated by different teams using the same method. In the context of data science, reproducibility means that everything needed to recreate the model and its results, such as data, tools, libraries, frameworks, programming languages, and operating systems, has been captured, so that identical results can be produced with little effort regardless of how much time has passed since the original project.

By Sundeep Teki, 8 min read

Perspective

Seeking Reproducibility within Social Science: Search and Discovery

Julia Lane, NYU professor, economist, and cofounder of the Coleridge Initiative, presented “Where’s the Data: A New Approach to Social Science Search & Discovery” at Rev. Lane described how the Coleridge Initiative is addressing the reproducibility challenge in science: it provides government analysts and researchers with remote access to confidential data in a secure data facility, and it builds analytical capacity and collaborations through an Applied Data Analytics training program. This article provides a distilled summary and a written transcript of Lane’s talk at Rev. Many thanks to Julia Lane for providing feedback on this post prior to publication.

By Ann Spencer, 25 min read

Data Science

MNIST Expanded: 50,000 New Samples Added

This post provides a distilled overview of the rediscovery of 50,000 samples within the MNIST dataset.

By Ann Spencer, 5 min read

Data Science

Addressing Irreproducibility in the Wild

This Domino Data Science Field Note provides highlights and excerpted slides from Chloe Mawer’s "The Ingredients of a Reproducible Machine Learning Model" talk at a recent WiMLDS meetup. Mawer is a Principal Data Scientist at Lineage Logistics and an Adjunct Lecturer at Northwestern University. Special thanks to Mawer for permission to excerpt the slides in this Domino Data Science Field Note. The full deck is available here.

By Ann Spencer, 7 min read

Data Science

Learn from the Reproducibility Crisis in Science

Key highlights from Clare Gollnick’s talk, “The limits of inference: what data scientists can learn from the reproducibility crisis in science”, are covered in this Domino Data Science Field Note. The full video is available for viewing here.

By Domino, 5 min read

