Loan Default Prediction for a Credit Platform Based on Classification Algorithms (Foreign-Literature Translation)
With the advent of Web 2.0, it has become easy to create online markets and virtual communities with convenient accessibility and strong collaboration. One of the emerging Web 2.0 applications is the online Peer-to-Peer (P2P) lending marketplace, where lenders and borrowers can meet virtually for loan transactions. Such marketplaces provide a platform service that introduces borrowers to lenders, which offers advantages to both. Borrowers can get microloans directly from lenders and might pay lower rates than commercial credit alternatives. Lenders, on the other hand, can earn higher rates of return than on other types of lending such as corporate bonds, bank deposits or certificates of deposit.
One of the problems in online P2P lending is information asymmetry between the borrower and the lender. That is, the lender does not know the borrower's credibility as well as the borrower does. Such information asymmetry might result in adverse selection (Akerlof, 1970) and moral hazard (Stiglitz and Weiss, 1981). Theoretically, some of these problems can be alleviated by regular monitoring, but this approach poses a challenge in the online environment because the borrowers and the lenders do not physically meet. Fostering and enhancing the lender's trust in the borrower can also mitigate adverse selection and moral hazard problems. In traditional bank-lending markets, banks can use collateral, certified accounts, regular reporting, and even the presence of the board of directors to enhance trust in the borrower. However, such mechanisms are difficult to implement in the online environment, where they would incur a significant transaction cost.
To reduce lending risks associated with information asymmetry, current online P2P lending has the following arrangements. First, the Lending Club screens out potential high-risk borrowers based on the FICO score. The minimum FICO score required to participate is 640. Second, the typical loan in this market is small: under $35 000 at the Lending Club. These loans are therefore essentially microloans, which pose a relatively small loss in case of default. Third, the market maker offers matchmaking systems which can be used to generate portfolio recommendations and minimize lending risks. Fourth, if a borrower fails to pay, the market maker will report the case to a credit agency and hire a collection agency to collect the funds on behalf of the lender. Although certain structures imposed in online P2P lending help to minimize the risk, this form of lending is inherently associated with a greater amount of risk than traditional lending. The purpose of this article is to evaluate the credit risk of borrowers from one of the largest P2P platforms in the United States, the Lending Club, which can help lenders make more informed decisions about the risk and return efficiency of loans based on the borrowers' grade. This article addresses two related research questions: (1) What are some of the borrowers' characteristics that help determine the default risk? and (2) Is the higher return generated from the riskier borrower large enough to compensate for the incremental risk? Lenders can allocate their investments more efficiently if they know which characteristics of the borrower affect the default risk. Each borrower is classified by credit grade, with a corresponding borrowing rate assigned by the Lending Club. To make an efficient allocation, a lender should know whether the higher interest rates set for high-risk borrowers are sufficient to compensate the lender for the higher probability of a potential loss.
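Research question (2) can be made concrete with a one-period break-even calculation. The sketch below is not the article's method. It assumes a single-period loan, a hypothetical recovery rate on default and a hypothetical risk-free rate; the function names `break_even_rate` and `is_compensated` and all default parameter values are illustrative only.

```python
def break_even_rate(p_default, risk_free=0.02, recovery=0.0):
    """Return the loan rate r that solves
    (1 - p) * (1 + r) + p * recovery = 1 + risk_free,
    i.e. the rate at which the expected payoff per dollar lent
    equals that of a risk-free investment (one-period assumption)."""
    if not 0.0 <= p_default < 1.0:
        raise ValueError("p_default must be in [0, 1)")
    return (1.0 + risk_free - p_default * recovery) / (1.0 - p_default) - 1.0


def is_compensated(actual_rate, p_default, risk_free=0.02, recovery=0.0):
    """True if the rate actually charged covers the break-even rate."""
    return actual_rate >= break_even_rate(p_default, risk_free, recovery)
```

Under these assumptions, a grade whose charged rate falls below `break_even_rate` for its estimated default probability does not pay the lender enough for the extra risk. A multi-period loan with amortizing payments would need a cash-flow model instead of this single-period shortcut.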
This study contributes to the literature on this new and fast-growing P2P marketplace in several ways. While a few studies have explored the credit screening problem in P2P lending platforms (see, for example, Iyer et al., 2009 and Lin et al., 2013), this research differs from the prior research in various aspects. First, this research extends risk analysis research in online P2P lending by utilizing new data from the Lending Club, in contrast to many prior studies which utilize data from one of the biggest P2P platforms (Prosper). Second, this study estimates the default risk of loan applicants based on their significant demographic and characteristic factors, which enables potential lenders to determine an optimal allocation strategy. Third, this research addresses the issue of selection bias by examining whether there is a significant difference in default risk between borrowers from the whole US population and those from the Lending Club, which yields an important implication for risk minimization for the Lending Club. Finally, this research relates the default risk of borrowers to the returns generated by the lenders by comparing the calculated theoretical interest rate with the actual interest rate charged by the Lending Club for each credit grade category. This provides important information regarding the risk and return efficiency of the Lending Club.
Our findings suggest that borrowers with a high FICO score, high credit grade, low revolving line utilization and low debt-to-income ratio are associated with low default risk. This finding is consistent with the study by Duarte et al. (2012), who report that borrowers with a trustworthy characteristic have better credit scores and a lower probability of default. This result also suggests that, besides the loan applicants' social ties and friendship as reported by Freedman and Jin (2014) and Lin et al. (2013), the four factors discussed above are also important in explaining the default risk. A comparison with US national borrowers shows that the Lending Club should continue to screen out borrowers with lower FICO scores and attract the borrowers with the highest FICO scores in order to significantly reduce the default risk. In relating the risk to the return, the results show that the higher interest rate charged to the riskier borrower is not large enough to justify the higher default probability. Our finding here is consistent with the study by Berkovich (2011), who reports that high-quality loans offer excess return.
The remainder of the article is organized as follows. In the next section, we review the literature on online P2P lending. Section III describes our data and summarizes the descriptive statistics of online P2P lending at the Lending Club. In Section IV, we present the methodologies and empirical results for evaluating the credit risk and measuring the risk and return efficiency of the Lending Club. The issue of selection bias is also addressed in that section. Section V offers some concluding remarks.
Data
In this section, the loan applicant data is described first, followed by the loan distribution by loan purpose, credit grade and loan status, and finally the detailed descriptive statistics of the loan applicants. This study uses 61 451 loan applications in the Lending Club from May 2007 to June 2012, obtained from www.lendingclub.com. Over the study period, the Lending Club lent about $713 million to borrowers. To understand borrowers' behaviour in online P2P lending, we first examine the main reasons for borrowing money from others. Table 1 lists the borrowers' self-claimed reasons as summarized by the Lending Club. Almost 70% of the loans requested are related to debt consolidation or credit card debt, with total loan amounts requested of approximately $387 million and $108 million, respectively. Loan applications for education, renewable energy and vacation contribute less than 1% of total loans, with the total amounts requested ranging from $1 million to $3 million. The borrowers state that their reasons for preferring to borrow from the Lending Club are the lower borrowing rate and the inability to borrow enough money from credit cards. The second most common purpose for borrowing is to pay a home mortgage or to remodel a home. The Lending Club uses the borrower's FICO credit score along with other information to assign each loan a credit grade ranging from A1 to G5, in descending order of credit rank. The detailed procedure is as follows: after assigning a base score based on FICO ratings, the Lending Club makes adjustments depending on the requested loan amount, number of recent credit inquiries, credit history length, total open credit accounts, currently open credit accounts and revolving line utilization to determine the final grade, which in turn determines the interest rate on the loan.
Table 2 reports the loan distribution by credit grade. The majority of borrowing requests have grades between A1 and E5. The highest loan amounts requested are from borrowers with a 'B' credit grade, who contribute 29.56% of the total amount of loans requested. The total number of applicants in this 'B' credit grade group is 18 707, which represents total loans of approximately $210 million. The lowest loan amounts requested are from borrowers with the lowest credit grade, 'G', which accounts for 1.53% of total loans. There are only 608 loan applicants in this lowest 'G' group, representing approximately $11 million in total loan value. According to the Lending Club's policy, a loan credit grade is used to determine the interest rate and the maximum amount of money that a borrower can request. The higher the loan grade, the lower the interest rate. A borrowing request with a low grade carries a higher interest rate as compensation for the higher risk borne by lenders.
Finally, Panel A of Table 3 shows the loan status for all the loan requests as of 20 July 2012. Overall, the default rate is 4.60%, with total losses of approximately $29 million. Another 2.45% of total loan requests, which constitute $18.6 million, could potentially be lost because the borrowers are late in making payment (within 30 days or 120 days) and are not paying the normal instalments. 17.98% of the loans are fully paid, with an approximate value of $108 million. Loans in current status, worth $557 million, account for 74.91% of total loans. Naturally, loans with a lower grade demonstrate a higher default rate. Therefore, the study of risk management in P2P lending is relevant for lenders who want to optimize their investment portfolios. Panel B of Table 3 reports the loan status for the matured loans. The overall loss rate is much higher for matured loans. Among 4904 matured loans, 914 loans are charged off, which represents 18.6%. The total loss is $5.5 million, which represents 13% of the total matured loan amount. Less than 1% of the matured loans are late in making payment, with an unpaid balance of approximately $27 000. 80.77%, or $33 million, of matured loans are fully paid.
Empirical Results
Evaluation of credit risks
From a lender's perspective, the most important concern is whether a borrower will default or not. A lender will benefit if borrowers' characteristics help to predict whether a particular borrower is more or less likely to default. Based on the loan distribution by loan status reported earlier, in the past 5 years the Lending Club provided 61 451 loans. 2848 of these loans are not paid back (fully or partially). Although these charged-off loans translate to a default rate of 4.6%, this figure is biased downward since the default rate generally increases with maturity. The default rate for the matured loans, as indicated earlier, is 18.60%. In this section, we examine the factors that determine the likelihood of loan default. We first implement nonparametric tests to examine whether there is a significant difference in the variables between defaulted loans and good-status loans. Then, we model the default risk of the loan applicants by employing a binary logit regression.
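A binary logit regression models the default probability as P(default = 1 | x) = 1 / (1 + exp(-(b0 + b·x))). The following is a minimal sketch of how such a model can be fitted, not the article's estimation code. It uses plain gradient ascent on the log-likelihood, and the data are synthetic: the two covariates (a standardized FICO score and a standardized revolving line utilization), the "true" coefficients and the helper names `fit_logit` and `make_synthetic_loans` are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logit(X, y, lr=0.5, epochs=500):
    """Fit a binary logit by batch gradient ascent on the mean
    log-likelihood. Returns [intercept, coef_1, ..., coef_k]."""
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], xi)))
            err = yi - p  # gradient of the log-likelihood w.r.t. the linear index
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def make_synthetic_loans(n=500, seed=0):
    """Generate SYNTHETIC loans (not Lending Club data): default is made
    less likely by a high FICO score and more likely by high utilization."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        fico = rng.gauss(0.0, 1.0)  # standardized FICO score (assumed)
        util = rng.gauss(0.0, 1.0)  # standardized revolving utilization (assumed)
        p = sigmoid(-2.0 - 1.0 * fico + 0.8 * util)
        X.append([fico, util])
        y.append(1 if rng.random() < p else 0)
    return X, y


X, y = make_synthetic_loans()
beta = fit_logit(X, y)
print("intercept, FICO coef, utilization coef:", beta)
```

A negative coefficient on the FICO score and a positive one on utilization would match the direction of the effects the article reports. In practice a statistics package would also supply standard errors and significance tests, which this sketch omits.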
Defaulted loans are loans that are charged off or late in payment. Good-status loans are loans that are fully paid or current in their payment schedule. Table 5 reports the results of the nonparametric test and summarizes the differences between defaulted and good loans. These two groups are significantly different in terms of loan and borrower characteristics. The chi-square statistic values of the Kruskal-Wallis test show that interest rate, credit grade, home ownership, FICO score, revolving line utilization, total funds and monthly income differ between the two groups at the 1% significance level. Specifically, we find that the interest rate on a defaulted loan is higher and the amount borrowed is lower. The borrowers of such defaulted loans tend to have a low FICO score and low credit grade but higher revolving line utilization. In addition, they have lower monthly income and are less likely to own a home.
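The Kruskal-Wallis test ranks all observations together and asks whether the rank sums of the groups differ by more than chance would allow; its H statistic is approximately chi-square distributed with (number of groups - 1) degrees of freedom. The sketch below computes H with the standard tie correction. It is an illustration only: the function names are hypothetical and the FICO scores in the usage note are made up, not taken from Table 5.

```python
def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic with tie correction."""
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    # Sum of (rank sum)^2 / group size over the groups.
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * total - 3.0 * (n_total + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N).
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    ties = sum(t ** 3 - t for t in counts.values())
    correction = 1.0 - ties / (n_total ** 3 - n_total)
    return h / correction if correction > 0 else float("nan")
```

For example, with made-up FICO scores `[660, 665, 670, 675, 680]` for defaulted loans and `[700, 710, 720, 730, 740]` for good loans, `kruskal_wallis_h` returns about 6.82, which exceeds 6.635, the 1% critical value of the chi-square distribution with one degree of freedom. With samples this small the chi-square approximation is rough; the article's groups are far larger.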
Conclusions
Credit risk is an important concern for P2P loans. This study employs data from the Lending Club to evaluate the credit risk of online P2P loans. We find that credit grade, debt-to-income ratio, FICO score and revolving line utilization play an important role in determining loan default. The credit categorization used by the Lending Club successfully predicts the default probability, with the one exception of the next-to-lowest credit grade. In general, a higher credit grade is associated with lower default risk.
The mortality risk also increases with the maturity of the loans. Loans with a lower credit grade and longer duration are associated with a high mortality rate. The Cox Proportional Hazard Test results show that as the credit risk of the borrowers increases, so does the likelihood of a loan defaulting. However, the higher interest rate currently charged to the riskier borrower is not large enough to justify the higher default probability. This suggests that lenders would be better off lending only to the safest borrowers, those in the highest grade category of 7 or Grade A. Increasing spreads on riskier borrowers may lead to more severe adverse selection, resulting in higher default risk.
The Lending Club lenders should either extend credit only to the highest-grade borrowers or try to find more creative ways to lower the default rate among current borrowers. Compared with US national consumers, borrowers with relatively higher income and potentially higher FICO scores do not participate in the P2P market. Creating incentives to attract these types of borrowers would have significant potential to decrease the default risk in this market.