The goal of this Master's thesis is to examine the advantages and disadvantages of advanced machine learning algorithms compared to traditional econometric methods. More specifically, the predictive performance, interpretability, and potential for causal inference of various tree-based methods are compared with those of well-established linear regression models. The empirical part of the thesis uses the stock market participation puzzle, originally examined by van Rooij, Lusardi, and Alessie (2007) using OLS and IVGMM regressions. The performance of each model is assessed using the ROC curve and the corresponding AUC value. In addition, variable importance measures such as feature importance and permutation feature importance are used; they indicate a substantial role of financial literacy and income in the decision to invest. Although the Decision Tree and Random Forest models perform similarly to the linear models even after optimization, the optimized XGBoost model appears to outperform them in the majority of cases. This is confirmed by the Diebold-Mariano test and cross-validation.
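As a rough illustration of the AUC criterion mentioned above, the following minimal sketch computes the AUC as the share of (participant, non-participant) pairs in which the model assigns the higher score to the participant, with ties counted as one half. The function name `roc_auc`, the variable names, and the toy labels and scores are assumptions made up for this example; they are not taken from the thesis or its data.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    labels: iterable of 0/1 outcomes (1 = participates in the stock market)
    scores: iterable of predicted probabilities or scores, same length
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, invented purely for illustration.
participates = [1, 0, 1, 0, 1, 0]
model_a = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1]  # ranks every participant on top
model_b = [0.6, 0.5, 0.4, 0.7, 0.8, 0.3]  # makes some ranking mistakes

auc_a = roc_auc(participates, model_a)
auc_b = roc_auc(participates, model_b)
print(f"AUC model A: {auc_a:.3f}")
print(f"AUC model B: {auc_b:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why the value can be compared directly across linear and tree-based classifiers. The pairwise loop is quadratic in the sample size; for real survey data a rank-based implementation from a statistics library would be the more practical choice.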
- Institution
- Ruprecht-Karls-Universität Heidelberg – Wirtschafts- und Sozialwissenschaften
- Machine Learning, Financial Literacy, Stock Market Participation, Stock Market, Finance, Statistics