
    Using k-Nearest Neighbors (k-NN) in Production

    on October 8, 2016

    What is k-Nearest Neighbors (k-NN)?

    k-Nearest Neighbors is a simple algorithm that stores all available cases and classifies new cases based on a similarity measure (e.g., a distance function). k-NN is a "lazy", instance-based algorithm: it does not build a generalized model from the training data, but defers all computation until a prediction is requested. Training a k-NN model is therefore insanely fast! For basic k-NN, training takes only as long as reading all the training data and saving it in a data structure.
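
    To make the "lazy" behavior concrete, here is a minimal sketch of a brute-force k-NN classifier in plain Python. It is an illustration only, not the implementation shown in the webinar: the class name `SimpleKNN`, the Euclidean distance, the majority vote, and the toy data are all assumptions chosen for this example.

```python
import math
from collections import Counter


class SimpleKNN:
    """Brute-force k-NN classifier (hypothetical, for illustration only)."""

    def __init__(self, k=3):
        self.k = k
        self.points = []
        self.labels = []

    def fit(self, points, labels):
        # "Training" only stores the data; no model is built here.
        self.points = list(points)
        self.labels = list(labels)
        return self

    def predict(self, query):
        # All the work happens at prediction time: measure the Euclidean
        # distance from the query to every stored point and sort.
        distances = sorted(
            (math.dist(query, point), label)
            for point, label in zip(self.points, self.labels)
        )
        nearest = [label for _, label in distances[: self.k]]
        # Majority vote among the k nearest neighbors.
        return Counter(nearest).most_common(1)[0][0]


# Toy data (made up for this example): two well-separated clusters.
train_x = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_y = ["a", "a", "a", "b", "b", "b"]

model = SimpleKNN(k=3).fit(train_x, train_y)
prediction = model.predict((1.1, 1.0))
```

    Note where the cost lands: `fit` is a copy, while every call to `predict` scans the entire training set. That trade-off is what matters most when the training set is large and predictions must be served quickly.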

    Domino’s Chief Data Scientist, Eduardo Ariño de la Rubia, presented a webinar: An Introduction to Using k-NN in Production.

    If you missed the live webinar or would like to watch it again, you can find a recording below:

     

    Watch the webinar to learn:

    • Different ways to implement k-NN in production;
    • The pros and cons of using the algorithm with production data sets;
    • How to use R and Python packages to get the most out of your k-NN model;
    • A demonstration of training models on the Domino platform.

    If you’d like to benchmark the predictive performance of k-NN against other algorithms, contact us for a personalized demo of the Domino Data Science platform.
