Harnessing Machine Learning in Credit Underwriting

The digital lending market is expected to grow from USD 38.2 billion in 2021 to USD 515 billion in 2030¹, a CAGR of 33.5%. As the market continues to grow, fast mechanisms for screening borrowers and disbursing loans will become crucial.
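The growth rate quoted above can be checked with the standard CAGR formula. The short Python sketch below is only an illustration; the function name `cagr` is chosen here and does not come from the cited source.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.335 means 33.5%)."""
    return (end_value / start_value) ** (1 / years) - 1

# Figures quoted in the text: USD 38.2 billion (2021) to USD 515 billion (2030).
growth = cagr(38.2, 515.0, 2030 - 2021)
print(f"{growth:.1%}")  # prints 33.5%
```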

One of the major technological innovations that has found its application in the credit markets is machine learning. According to Oracle, ‘machine learning focuses on building systems that learn – or improve performance – based on the data they consume’. It is a subset of artificial intelligence. Machine learning algorithms are pieces of code that help people explore, analyse, and find meaning in complex data sets. Each algorithm is a finite set of unambiguous step-by-step instructions that a machine can follow to achieve a certain goal². Machine learning algorithms use parameters that are based on training data—a subset of data that represents the larger set. As the training data expand to represent the world more realistically, the algorithm calculates results that are more accurate³.

Figure 1: How Machine Learning Works

Source: LinkedIn

Machine learning has the potential to enhance credit allocation, enabling lenders to extend credit more effectively. However, according to the research paper ‘Predictably Unequal? The Effects of Machine Learning on Credit Markets’, published in 2022 in The Journal of Finance, machine learning lending models would charge higher interest rates to underprivileged racial groups. The gains from the new technology are skewed in favour of racial groups that have a socio-economic advantage.

The paper used a dataset of 10 million loans disbursed between 2009 and 2013 in the US mortgage market to compare the predicted outcome, the probability of defaulting on a loan, under traditional statistical models and under machine learning models. The results indicate that 65% of White Non-Hispanic and Asian borrowers benefit from the new technology, compared with 50% of Black and Hispanic borrowers. Notably, race is not included in the set of variables used to predict the default probabilities.
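A comparison of this kind can be sketched in a few lines of Python. The sketch assumes that a borrower "benefits" when the machine learning model predicts a lower default probability than the traditional model; that definition, the group labels, and every number in `toy_records` are made up for illustration and are not taken from the paper. The group label is used only to summarise the results, not as a model input.

```python
from collections import defaultdict

def share_benefiting(records):
    """Return, per group, the share of borrowers whose ML-predicted default
    probability is lower than the traditional model's prediction."""
    totals = defaultdict(int)
    winners = defaultdict(int)
    for group, pd_traditional, pd_ml in records:
        totals[group] += 1
        if pd_ml < pd_traditional:
            winners[group] += 1
    return {group: winners[group] / totals[group] for group in totals}

# Made-up illustrative records: (group label, traditional PD, ML PD).
toy_records = [
    ("A", 0.040, 0.030),
    ("A", 0.020, 0.015),
    ("A", 0.050, 0.060),
    ("A", 0.030, 0.025),
    ("B", 0.040, 0.050),
    ("B", 0.060, 0.045),
    ("B", 0.030, 0.040),
    ("B", 0.050, 0.035),
]
shares = share_benefiting(toy_records)
print(shares)  # {'A': 0.75, 'B': 0.5}
```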

Figure 2 shows the results of the comparison. Moving from left to right along the x-axis, the default probabilities predicted by the machine learning models become higher relative to those predicted by the traditional statistical models. Moving from bottom to top along the y-axis, the share of borrowers rises. Borrowers to the left of the solid black line are therefore ‘low risk borrowers’ according to the machine learning methods.

Figure 2: High Risk and Low Risk Borrowers Predicted by Machine Learning and Traditional Statistical Tools

Source: Fuster, A., et al. (2022). Predictably Unequal? The Effects of Machine Learning on Credit Markets, The Journal of Finance, Vol. LXXVII (1).

One reason for these results might be that, when the variables in machine learning models interact with each other to predict outcomes, they bring out the vulnerabilities of the less privileged in society even when race is not included as an explanatory variable.

Another possibility is that, because of societal structure, the data itself is skewed against the less privileged, increasing their chances of default and resulting in a higher interest rate being charged. According to the authors’ analysis, because machine learning algorithms are more efficient, they amplify the societal bias in the historical data used to predict default probabilities, further increasing the default probabilities of Black and Hispanic borrowers. In comparison, the traditional statistical techniques do not significantly affect the default probabilities.
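The following Python sketch illustrates, under strong simplifying assumptions, how predictions can differ by group even when the group is never a model input. The toy "model" is just the observed default rate per income band; the feature `income_band`, the groups "A" and "B", and all records are hypothetical and are not the variables or methods used in the paper.

```python
from collections import defaultdict

# Made-up history: (group, income_band, defaulted). The group label is never
# given to the model; it is kept only to audit the predictions afterwards.
history = [
    ("A", "high", 0), ("A", "high", 0), ("A", "high", 0), ("A", "low", 1),
    ("B", "low", 1), ("B", "low", 0), ("B", "low", 0), ("B", "high", 0),
]

def fit_band_rates(rows):
    """'Train' a toy model: the observed default rate for each income band."""
    counts, defaults = defaultdict(int), defaultdict(int)
    for _group, band, defaulted in rows:
        counts[band] += 1
        defaults[band] += defaulted
    return {band: defaults[band] / counts[band] for band in counts}

def mean_prediction_by_group(rows, band_rates):
    """Average predicted default probability for each audited group."""
    sums, counts = defaultdict(float), defaultdict(int)
    for group, band, _defaulted in rows:
        sums[group] += band_rates[band]
        counts[group] += 1
    return {group: sums[group] / counts[group] for group in counts}

band_rates = fit_band_rates(history)
by_group = mean_prediction_by_group(history, band_rates)
print(band_rates)  # {'high': 0.0, 'low': 0.5}
print(by_group)    # {'A': 0.125, 'B': 0.375}
```

Because group "B" is concentrated in the band with the worse repayment history, its average predicted default probability is higher, although the model only ever saw the income band.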

Implications of the Results

Private banks in India have reached 50%-60% automated decision making⁴. If banks buy machine learning models from vendors without thoroughly reviewing them, fairness concerns can worsen. A machine learning algorithm learns from a large amount of data that represents the real world. If the training data contains biased information, the resulting algorithm will incorporate and reinforce that bias. To address this concern, developers need a deep understanding of the input and output variables and of how the chosen algorithm works. However, not all Fintech firms serving the financial industry have this expertise at the moment. As a result, some third-party solutions may include machine learning algorithms that appear to be unbiased but actually produce biased predictions, and only experts with the right market knowledge and technical skills can recognize this. Additionally, if the bank doesn’t have direct insight into how the features for the algorithm are selected, the algorithm’s predictive accuracy may come at the expense of it being too narrowly focused on specific factors, potentially leading to incorrect decisions⁵.

Although qualified analysts and data scientists with full access to the model can explain how a machine learning algorithm works, it is harder to understand how individual decisions are made. This lack of transparency raises additional concerns about fair lending practices and regulatory compliance for banks. Compliance specialists cannot review independent variables and coefficients for machine learning models in the same way as they can for regression models, because machine learning models lack explicit specifications. This concern becomes particularly relevant when alternative data sources are used to enhance prediction accuracy. Unlike traditional financial data from credit bureaus, alternative data sources introduce more unknowns and uncertainties, making it challenging to understand their impact on decisions.
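To make the contrast concrete, the Python sketch below shows what a reviewable regression specification can look like: a logistic regression whose coefficients can be read directly. The variable names and coefficient values are hypothetical and chosen only for illustration; they do not describe any real lender's model.

```python
import math

# Hypothetical coefficients of a logistic regression default model.
coefficients = {"intercept": -3.0, "loan_to_value": 2.0, "missed_payments": 0.7}

def predict_default_probability(loan_to_value: float, missed_payments: int) -> float:
    """Logistic model: probability = 1 / (1 + exp(-z)), with z linear in the inputs."""
    z = (
        coefficients["intercept"]
        + coefficients["loan_to_value"] * loan_to_value
        + coefficients["missed_payments"] * missed_payments
    )
    return 1 / (1 + math.exp(-z))

# Each coefficient has a direct reading: exp(beta) is the factor by which the
# odds of default change when that variable rises by one unit.
odds_ratios = {
    name: math.exp(beta)
    for name, beta in coefficients.items()
    if name != "intercept"
}

baseline = predict_default_probability(loan_to_value=0.0, missed_payments=0)
print(round(baseline, 4))
print({name: round(ratio, 2) for name, ratio in odds_ratios.items()})
```

A reviewer can inspect every term in such a model. Many machine learning models offer no comparable table of coefficients, which is the gap the paragraph above describes.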

Conclusion

Artificial intelligence and machine learning algorithms are designed to find hidden patterns in data, allowing computers to make decisions quickly and often more accurately than humans. Unlike traditional models based on economic or social science theories, machine learning models are valued for their predictive abilities rather than their ability to confirm existing relationships⁶. With the growing volume of data, machine learning will play a significant role for banks and financial services providers. However, integrating machine learning into business processes requires careful consideration. It will be a challenging task, and it is essential to be adequately prepared and to have confidence in the quality of the data.

Sources:
1. BFSI.com
2. Machine learning algorithms can help people explore, analyze and find meaning in complex data sets
3. Microsoft.com
4. Business Today
5. Brotcke, L. (2022). Time to Assess Bias in Machine Learning Models for Credit Decisions, Journal of Risk and Financial Management, 15(4)
6. BLDS LLC, Discover Financial Services, and H2O.ai

Approach with Caution: Harnessing the Power of Machine Learning in Credit Underwriting 11 Jul 2023