Literature Review

Data-Driven Investment Strategies for Peer-to-Peer Lending

This paper addresses a problem very similar in nature and form to the one we investigate in this project. Cohen, Guetta, Jiao, and Provost analyze various attempts to build model-driven investment strategies that maximize returns on Lending Club and, potentially, on other peer-to-peer lending services. They find that, because loans vary widely and many of the factors behind successful and unsuccessful investments cannot be measured, it is practically impossible to build anything close to a perfect model that far surpasses Lending Club’s existing standards for investment returns. In their quantitative analysis, they achieve an ROC AUC of only approximately 0.68, even when using over 200 random seeds for sample selection and model building, which reveals little to no advantage over existing practical and empirical analyses of Lending Club investment portfolios. Additional analyses using metrics such as Kendall’s tau showed a negligible difference from the null hypothesis of no association, underscoring the difficulty of building models that predict investment success on Lending Club. We nevertheless aim to improve on the work of Cohen, Guetta, Jiao, and Provost through more extensive predictor filtering and through the use of several predictors that the researchers were not able to interpret and parse in their research.

Cohen, Maxime C., C. Daniel Guetta, Kevin Jiao, and Foster Provost. “Data-Driven Investment Strategies for Peer-to-Peer Lending.” Big Data, Sep. 2018, ahead of print, http://doi.org/10.1089/big.2018.0092.
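To make the two evaluation metrics discussed above concrete, the following is a minimal sketch of how ROC AUC and Kendall’s tau can be computed and averaged over several random seeds. It is not the procedure used by the cited authors: the synthetic data, the function names, and the `signal` parameter are all our own illustrative assumptions. An uninformative score should give an AUC near 0.5 and a tau near 0, while a weak signal gives a modest AUC.

```python
import random

def roc_auc(labels, scores):
    """ROC AUC as the probability that a random positive example
    receives a higher score than a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs.
    Pairs tied in either variable contribute zero."""
    n = len(xs)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (xs[i] - xs[j]) * (ys[i] - ys[j])
            total += (prod > 0) - (prod < 0)
    return total / (n * (n - 1) / 2)

def simulate(seed, n=200, signal=0.0):
    """Hypothetical experiment: synthetic binary outcomes and noisy scores.
    `signal` (an assumption for illustration) shifts scores of positives."""
    rng = random.Random(seed)
    labels = [rng.randint(0, 1) for _ in range(n)]
    scores = [signal * y + rng.gauss(0, 1) for y in labels]
    return roc_auc(labels, scores), kendall_tau(labels, scores)

def mean_over_seeds(num_seeds=20, **kwargs):
    """Average both metrics over several seeds to reduce sampling noise."""
    results = [simulate(seed, **kwargs) for seed in range(num_seeds)]
    aucs, taus = zip(*results)
    return sum(aucs) / len(aucs), sum(taus) / len(taus)

if __name__ == "__main__":
    print("no signal:  ", mean_over_seeds(signal=0.0))
    print("weak signal:", mean_over_seeds(signal=0.66))
```

Averaging over seeds, as in this sketch, guards against drawing conclusions from a single lucky or unlucky train/test split, which is why a modest AUC that persists across many seeds is more convincing than one high value from a single run.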

Risk Assessment in Social Lending via Random Forests

In our final project, we seek to build a successful investment strategy using the machine learning and data analysis techniques we have learned over the course of the semester. Our goal is not only to produce a strategy that yields promising returns, but also to train a model that recognizes and responds to different indicators of risk within a dataset. In this sense, the work by Malekipirbazari and Aksakalli on assessing risk with Random Forest models is central to our investment strategy. In their paper, they use Random Forest models to analyze the risk of different investment portfolios of loans with a reasonable degree of success. Inspired by their working risk assessment model, we use Random Forests both as a preliminary and as a follow-up method for powering our investment strategy and for evaluating the feasibility of beating Lending Club’s existing standards for good investments and portfolios.

Malekipirbazari, Milad, and Vural Aksakalli. “Risk Assessment in Social Lending via Random Forests.” Expert Systems with Applications, vol. 42, no. 10, 2015, pp. 4621–4631, doi:10.1016/j.eswa.2015.02.001.

Works Cited

[1] Cohen, Maxime C., C. Daniel Guetta, Kevin Jiao, and Foster Provost. “Data-Driven Investment Strategies for Peer-to-Peer Lending.” Big Data, Sep. 2018, ahead of print, http://doi.org/10.1089/big.2018.0092.

[2] Malekipirbazari, Milad, and Vural Aksakalli. “Risk Assessment in Social Lending via Random Forests.” Expert Systems with Applications, vol. 42, no. 10, 2015, pp. 4621–4631, doi:10.1016/j.eswa.2015.02.001.