GPA Classification of Incoming UCSC Students
We were given a sample of admission data from the University of California, Santa Cruz, for the Fall 2013 academic year. Each student in this data set was described by 14 features, any of which could be missing. We examined several data preprocessing methods in conjunction with classification algorithms from the Weka and scikit-learn machine learning packages. Our goal was to find a classification model that would predict a student's GPA range as accurately as possible from a given set of features. On an unlabeled test set, our model reached 26% prediction accuracy, which was the highest accuracy in the class.
Each group was tasked with developing a prediction model from a supplied training set of previously admitted UCSC students. Many records in this training set had missing attributes, which made it difficult to develop prediction models. Working with this inconsistent data, we achieved the highest prediction accuracy in the class.
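To make the missing-attribute problem concrete, here is a minimal sketch of one common preprocessing approach: filling each missing value with the mean of its column, then mapping a GPA to a range label. This is an illustration only, not necessarily the method we used. The function names `impute_column_means` and `gpa_to_range`, the use of `None` for missing values, and the 0.5-point bin width are all assumptions made for the example.

```python
from statistics import mean


def impute_column_means(rows):
    """Replace None entries with the mean of the observed values in that column.

    Assumes every row has the same number of numeric features and that
    missing values are encoded as None (an assumption for this sketch).
    """
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        # Fall back to 0.0 if a column has no observed values at all.
        col_means.append(mean(observed) if observed else 0.0)
    return [
        [col_means[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]


def gpa_to_range(gpa, width=0.5):
    """Map a GPA on a 0.0-4.0 scale to a bin index (the class label).

    The bin width is hypothetical; a GPA of exactly 4.0 is placed in the top bin.
    """
    top_bin = int(4.0 / width) - 1
    return min(int(gpa / width), top_bin)


# Two students with two features each; the second feature is missing once.
features = [[1.0, None], [3.0, 4.0]]
filled = impute_column_means(features)
labels = [gpa_to_range(g) for g in (0.0, 3.2, 4.0)]
```

Mean imputation keeps every record in the training set at the cost of flattening variation in sparse columns. Dropping incomplete records is the main alternative, but it can discard a large share of the data when many attributes are missing.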
Below you will find links to our presentation slides and our final report.