Carleton University - School of Computer Science Honours Project
Winter 2022
Predicting Reddit Upvotes using Machine Learning
ABSTRACT
Predicting the number of upvotes of a Reddit post appears to be a difficult problem. This project compares machine learning algorithms to find the model that most accurately predicts the number of upvotes of a Reddit post. The models considered are linear regression, random forest, k-nearest neighbours (KNN), and a multi-layer perceptron (MLP) neural network. Accuracy is judged using R-squared, mean absolute error (MAE), mean squared error (MSE), and root mean squared error (RMSE). The experiment collects Reddit posts using the Reddit API, prepares the datasets as CSV files, trains the machine learning models, and predicts upvotes for the testing set while measuring the performance metrics. The main finding is that the random forest model is the most accurate.
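To make the four performance metrics concrete, the following is a minimal sketch of how they could be computed for a set of predictions. It uses only the Python standard library and is not the project's actual code; the function name `regression_metrics` and the upvote counts are hypothetical and chosen purely for illustration.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R-squared, MAE, MSE and RMSE for two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # Mean absolute error: average size of the error, in upvotes.
    mae = sum(abs(e) for e in errors) / n
    # Mean squared error: penalises large errors more heavily.
    mse = sum(e * e for e in errors) / n
    # Root mean squared error: MSE brought back to the unit of upvotes.
    rmse = math.sqrt(mse)

    # R-squared: 1 minus the ratio of residual to total sum of squares.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"r2": r2, "mae": mae, "mse": mse, "rmse": rmse}


# Hypothetical upvote counts, for illustration only (not project data).
actual = [10, 20, 30, 40]
predicted = [12, 18, 33, 37]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

A lower MAE, MSE, or RMSE and an R-squared closer to 1 indicate a more accurate model, so the same function can be applied to each model's test-set predictions to compare them on equal terms.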