CS109A Lending Club Final Project

Group #23: Mike Bao, Genty Daku, Jonathan Huang, Andrew Rittenhouse

Background:

Founded in 2007 and headquartered in San Francisco, California, Lending Club is the world’s largest peer-to-peer lending platform. In 2008, it registered its offerings with the Securities and Exchange Commission, and it has since received numerous awards and global recognition, including the World Economic Forum 2012 Technology Pioneer Award, a place among Forbes’s America’s 20 most promising companies in 2011 and 2012, and a place on CNBC’s Disruptor 50 in 2013 and 2014.

How Lending Club Works

Through peer-to-peer lending, Lending Club provides a matching platform that gives borrowers a new way to access credit and helps investors make money through successful loans. It operates on a simple principle: borrowers submit a loan application to the market, and investors browse the applications and accept loans based on the borrower information provided on the platform. Loans range from $1,000 to $40,000. Information such as loan amount, interest rate, loan purpose, and applicant income is available to investors, as is each applicant’s zip code.

Fair Lending

Lending Club’s status as an “Equal Housing Lender” means that it “makes loans without regard to race, color, religion, national origin, sex, handicap, or familial status.” [3] Historically, however, discrimination against loan applicants of color, especially African American applicants, has persisted. One possible reason is that Lending Club can consider zip codes when deciding which loans to accept, which can in effect harm minority applicants. As a result, Lending Club’s methodology for choosing which loans to accept may introduce unintentional discrimination against certain demographic groups on the basis of income, zip code, or race.

Motivation:

Lending Club releases quarterly and annual data on the loans that it has facilitated, providing large databases of past results and parameters from successful and unsuccessful loans. However, because these databases are so large, it is impractical for a human investor to make efficient use of the data unaided.

Nevertheless, this abundance of data presents an opportunity to build highly lucrative models by employing data science techniques. Investors on Lending Club are eager to determine which loans are the most promising, that is, which loans will have high returns. Such models would predict which loans are profitable to invest in, taking into account both the risk of default and the potential interest returns from a loan. Successful and accurate models would allow us to invest with relatively high confidence and would encourage investor participation in Lending Club.

Problem Statement:

We approach the problem as data scientists advising a specific Lending Club investor. Our goal is to build an investment strategy that advises our client on which loans to invest in. We define our strategy as choosing the applications that have the highest expected interest rate return; these are the loans that will provide the most benefit to our client.

To address the concern of racial discrimination or bias in our strategy, we also investigate possible fairness concerns in how we power our investment strategy. To do this, we examine whether the loans we select differ significantly across zip codes and ethnic groups.
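
One simple way to look for such differences is to compare the share of applications selected within each group. The sketch below is only an assumption about how such a check might look, not the project's actual method; the function name `selection_rates`, the zip code prefixes, and the sample data are all hypothetical.

```python
from collections import Counter


def selection_rates(applications, selected_ids):
    """Return {group: fraction of that group's applications that were selected}.

    `applications` is a list of (loan_id, group) pairs, where a group could be
    a zip code prefix; `selected_ids` is the set of loan ids the strategy chose.
    """
    totals = Counter(group for _, group in applications)
    chosen = Counter(
        group for loan_id, group in applications if loan_id in selected_ids
    )
    return {group: chosen[group] / totals[group] for group in totals}


# Made-up data, for illustration only.
applications = [("A", "021xx"), ("B", "021xx"), ("C", "606xx"), ("D", "606xx")]
rates = selection_rates(applications, selected_ids={"A", "B", "C"})
print(rates)
```

A large gap between groups would not by itself prove bias, but it would flag where a closer statistical comparison is needed.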

To power our investment strategy, we first take as input the data provided by Lending Club. Since we are acting as data scientists advising investors, we consider only the data that is available to investors when they decide which loans to invest in. For each application, we first predict the probability that the loan will be fully paid, as opposed to being charged off. We then multiply this probability by the interest rate assigned by Lending Club to get our “expected interest rate return”. This metric allows us to sort all the applications from highest to lowest. Our investment strategy, then, is to invest in the top-ranked applications from this sorted list. We determine the optimal number of investments by analyzing our portfolio’s returns versus the number of selected loans in the portfolio.
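
The ranking step described above can be sketched as follows. This is a minimal illustration rather than the project's actual code: the `Application` record, the field names `int_rate` and `p_fully_paid`, the helper functions, and the sample numbers are all assumptions, and in practice the probabilities would come from a fitted classifier.

```python
from dataclasses import dataclass


@dataclass
class Application:
    loan_id: str
    int_rate: float      # interest rate assigned by Lending Club, in percent
    p_fully_paid: float  # predicted probability of full repayment (hypothetical model output)


def expected_return(app):
    """Expected interest rate return: P(fully paid) * interest rate."""
    return app.p_fully_paid * app.int_rate


def select_top(apps, n):
    """Sort applications by expected return, highest first, and keep the top n."""
    ranked = sorted(apps, key=expected_return, reverse=True)
    return ranked[:n]


# Made-up example values, for illustration only.
apps = [
    Application("A", int_rate=8.0, p_fully_paid=0.95),
    Application("B", int_rate=20.0, p_fully_paid=0.60),
    Application("C", int_rate=14.0, p_fully_paid=0.80),
]
portfolio = select_top(apps, n=2)
print([a.loan_id for a in portfolio])
```

Here the safest loan (A) is not selected, because its low interest rate gives it the lowest expected return of the three. Varying `n` and recomputing the portfolio's return is one way to study how returns change with the number of selected loans.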

Acknowledgements:

We would like to thank our teaching fellow, Julia Shea, who provided valuable advice and feedback on all stages of this project.

References:

[1] Lending Club
https://www.lendingclub.com/

[2] Lending Club Statistics and Data Dictionary
https://www.lendingclub.com/info/download-data.action

[3] FDIC Equal Housing Lender
https://www.fdic.gov/regulations/laws/rules/2000-6000.html

[4] Lending Club Awards
https://www.forbes.com/lists/2011/28/most-promising-companies-11_Lending-Club_1GYD.html