
    Machine Learning Project

    The goal of the project was to apply decision trees, neural networks, k-nearest neighbor (kNN), and/or SVMs to a data set and train the best possible model for it. As the report outlines, we applied all of these techniques, along with cross-validation, feature selection, bagging and boosting, and model averaging, to train our final model and compute predictions for the 20,000 test data points.

    The Tools Used

    • To develop decision trees, we used the IND3 Package for Linux. To train it on our data set, we wrote a set of TCSH scripts.
    • For kNN and neural networks, we used MATLAB. MATLAB has direct support for neural networks, but we had to code kNN from scratch.
    • Finally, for SVMs, we used the popular SVMlight package.
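
    To illustrate what coding kNN from scratch involves, here is a minimal sketch in Python. It is not the MATLAB code from the project: the function name, the Euclidean distance metric, and the majority-vote tie-breaking are all assumptions made for the example.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    Hypothetical helper for illustration; assumes numeric feature vectors
    of equal length and Euclidean distance.
    """
    if not 1 <= k <= len(train_points):
        raise ValueError("k must be between 1 and the number of training points")

    # Distance from the query to every training point, paired with its label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]

    # Sort by distance only; the sort is stable, so equal distances keep
    # their original training order.
    distances.sort(key=lambda pair: pair[0])

    # Majority vote over the k closest labels. On a tied vote, the label
    # that appears first among the neighbours (the nearer one) wins.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]
```

    In a sketch like this, k and the distance metric are the main tuning knobs, and every prediction scans the whole training set, which is the usual cost of a brute-force kNN.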

    The Plan and Approach

    Our plan was to first identify, train, and tune the best possible model for each technique: decision trees, kNN, neural networks, and SVMs. This would give us a baseline performance for each model before we further optimized the models or combined their predictions.
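
    One common way to compare tuned candidates on equal footing is k-fold cross-validation. The sketch below is a hedged Python illustration of that idea, not the project's actual scripts: `k_fold_splits`, `cross_val_accuracy`, `select_best`, and the `fit_predict(train_X, train_y, test_X)` callable interface are all hypothetical names introduced for the example.

```python
import random


def k_fold_splits(n_samples, n_folds, seed=0):
    """Yield (train_indices, validation_indices) pairs for k-fold cross-validation."""
    if not 2 <= n_folds <= n_samples:
        raise ValueError("n_folds must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the folds reproducible
    # Deal the shuffled indices round-robin into n_folds disjoint folds.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


def cross_val_accuracy(fit_predict, X, y, n_folds=5, seed=0):
    """Mean validation accuracy of one candidate across all folds.

    `fit_predict(train_X, train_y, test_X)` is an assumed interface: it trains
    on the training split and returns one predicted label per test row.
    """
    scores = []
    for train, validation in k_fold_splits(len(X), n_folds, seed):
        predictions = fit_predict(
            [X[i] for i in train],
            [y[i] for i in train],
            [X[i] for i in validation],
        )
        correct = sum(p == y[i] for p, i in zip(predictions, validation))
        scores.append(correct / len(validation))
    return sum(scores) / len(scores)


def select_best(candidates, X, y, n_folds=5):
    """Return (name, score) for the candidate with the highest mean accuracy."""
    scored = {
        name: cross_val_accuracy(fn, X, y, n_folds)
        for name, fn in candidates.items()
    }
    best = max(scored, key=scored.get)
    return best, scored[best]
```

    Each candidate here would be one model with one hyperparameter setting, so the same function can serve both for tuning within a technique and for comparing the tuned baselines across techniques.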

    To see the graphs and the full project report, download the PDF below.

    VIEW PDF



    Contact

    Venkat Krishnaraj
    venkat.krishnaraj@gmail.com
    Masters, Computer Science
    Cornell University