Retail banking encompasses several crucial analytics tasks, among which credit risk assessment takes top priority. Credit scores underlie a range of retail banking activities; loan approvals and individual interest rate calculations are two key applications. Poor lending decisions can set off ripple effects within a bank and may even affect the wider financial sector, which is why credit scores need to be calculated with high precision.
Credit scores for a given individual are calculated by considering a range of variables such as income, employment, debt-to-income ratio, and payment history. Each of these factors indicates the individual’s ability to repay a loan, and hence determines whether the bank should offer the loan and on what conditions. In general, the more relevant variables are considered in the credit risk calculation, the more accurate the assessment tends to be. The process used to determine an individual’s creditworthiness should also be credible, transparent, and free of bias.
However, processing large amounts of data during credit risk assessment can be extremely challenging. It can consume considerable financial, human, and computational resources, and may still produce inefficient processes and inaccurate outputs.
A retail bank reached out to TurinTech to improve its credit risk calculation process. The bank was using in-house, custom-built machine learning models for credit scoring but had noticed inefficiencies that it wanted to mitigate.
The bank held vast amounts of customer data, collated as part of its usual operations.
🛠 State of the existing system
The bank had custom-built machine learning models, which were used to calculate credit scores from the customer data described above.
🚫 Bottlenecks to successful AI implementation
The bank faced three specific challenges in its credit scoring process:
- Inability to draw insights from alternative data: The bank wanted to use alternative data sources (such as information on social media and online activity) to increase the precision of its credit risk assessment. However, the existing credit scoring models were not able to use this information, which created a significant gap between what was being achieved and what could be achieved.
- Models not complex enough to capture complex patterns in data: The bank’s existing models were linear regression models, which failed to capture more complex patterns in the data.
- Problems with imbalanced data: In credit score calculations, datasets often contain far more instances with good credit scores than instances with bad credit scores, which makes the dataset imbalanced. A model trained on imbalanced data can appear to perform well on the training dataset yet perform poorly on new data, because it has not accurately captured the patterns of the under-represented class. You can read more on this in our article: What Is Imbalanced Data and How to Handle It? – TurinTech AI. The bank’s machine learning models were affected by both imbalanced data and model overfitting.
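To make the imbalanced data problem concrete, here is a minimal Python sketch. It uses made-up toy labels (not the bank's data) to show why accuracy alone can be misleading on an imbalanced dataset, and how inverse-frequency class weights, one common mitigation, can be computed. The function names and the 95/5 split are assumptions chosen for illustration; this is not a description of how evoML handles imbalance.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred, positive):
    """Fraction of true `positive` instances that were predicted as `positive`."""
    preds_for_positive = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(p == positive for p in preds_for_positive) / len(preds_for_positive)

def balanced_class_weights(y):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(y)
    n_samples, n_classes = len(y), len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}

# Hypothetical toy dataset: 95 "good" borrowers and 5 "bad" borrowers.
y_true = ["good"] * 95 + ["bad"] * 5

# A useless model that always predicts the majority class.
y_pred = ["good"] * len(y_true)

majority_accuracy = accuracy(y_true, y_pred)   # looks strong on paper
bad_recall = recall(y_true, y_pred, "bad")     # but no bad borrower is caught
weights = balanced_class_weights(y_true)       # rare class gets a larger weight

print(f"accuracy={majority_accuracy:.2f}, recall(bad)={bad_recall:.2f}")
print(weights)
```

In this toy case the always-"good" model reaches 95% accuracy while identifying none of the bad borrowers, which is why metrics such as recall on the minority class are usually checked alongside accuracy.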
evoML for better credit scores
evoML integrates the entire data science pipeline onto a single platform, automating the development and optimisation of machine learning models. With evoML, the bank was able to successfully mitigate its bottlenecks to AI implementation.
✅ evoML generated a decision tree model to boost prediction accuracy from 77% to 92%
During the model development process, evoML builds several models, evaluates their performance on user-defined metrics, and suggests the best model to the user. In this specific case, evoML suggested a decision tree model as the best model for the machine learning task, which, when implemented, led to a prediction accuracy of 92% (up from 77%), along with a 4x improvement in prediction speed.
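The build, evaluate, and suggest loop described above can be sketched roughly as follows. This is an illustrative outline only, not evoML's API: the two candidate "models" are hand-written rules standing in for trained models, and the feature names, thresholds, and validation rows are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Row = Tuple[float, float]      # (debt_to_income, missed_payments): hypothetical features
Model = Callable[[Row], int]   # returns 1 for "predicted default", 0 otherwise
Metric = Callable[[Sequence[int], Sequence[int]], float]

def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """User-defined metric: fraction of correct predictions."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def select_best(candidates: Dict[str, Model], X: List[Row], y: List[int],
                metric: Metric) -> Tuple[str, Dict[str, float]]:
    """Score every candidate on held-out data and return the top scorer."""
    scores = {name: metric(y, [model(row) for row in X])
              for name, model in candidates.items()}
    best = max(scores, key=lambda name: scores[name])
    return best, scores

def linear_like(row: Row) -> int:
    """Stand-in for a linear model: a single weighted sum and threshold."""
    dti, missed = row
    return 1 if 2.0 * dti + 0.1 * missed > 1.0 else 0

def tree_like(row: Row) -> int:
    """Stand-in for a decision tree: nested if/else splits."""
    dti, missed = row
    if missed >= 3:
        return 1
    return 1 if dti > 0.6 else 0

# Hypothetical validation set, hand-picked for illustration.
X_val = [(0.2, 0), (0.25, 4), (0.7, 0), (0.4, 1), (0.1, 5), (0.55, 2)]
y_val = [0, 1, 1, 0, 1, 0]

best_name, scores = select_best(
    {"linear_like": linear_like, "tree_like": tree_like}, X_val, y_val, accuracy
)
print(best_name, scores)
```

Because the metric is passed in as a function, the same loop could rank candidates by recall, prediction time, or any other user-defined criterion instead of accuracy.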
✅ evoML’s feature engineering functionality effectively mitigated model overfitting issues
evoML offers a wide range of feature engineering functionalities. For a typical model development task, evoML will conduct feature selection, feature transformation, and feature generation. This reduces model complexity, retains the most impactful features, and generates more useful features from those available in the dataset, ultimately reducing issues with model overfitting. During model evaluation, the bank found that evoML-generated models significantly reduced the problems associated with imbalanced data, ultimately leading to more accurate credit scores.
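As a rough illustration of the three feature engineering steps named above, the following Python sketch generates a debt-to-income feature, applies a log transformation to income, and then selects the features most correlated with the target. The column names, the toy rows, and the choice of Pearson correlation as the selection criterion are all assumptions made for this example; they do not describe evoML's internal methods.

```python
import math

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def engineer(rows):
    """Feature generation (debt_to_income) and transformation (log_income)."""
    return [
        {
            **row,
            "debt_to_income": row["monthly_debt"] / row["monthly_income"],
            "log_income": math.log1p(row["monthly_income"]),
        }
        for row in rows
    ]

def select_top_k(rows, target, k):
    """Feature selection: keep the k features with the largest |correlation|."""
    strength = {
        name: abs(pearson([row[name] for row in rows], target))
        for name in rows[0]
    }
    return sorted(strength, key=lambda name: strength[name], reverse=True)[:k]

# Hypothetical toy records; branch_code is constant and carries no signal.
raw_rows = [
    {"monthly_income": 5000, "monthly_debt": 500, "branch_code": 7},
    {"monthly_income": 3000, "monthly_debt": 1800, "branch_code": 7},
    {"monthly_income": 4000, "monthly_debt": 800, "branch_code": 7},
    {"monthly_income": 2000, "monthly_debt": 1400, "branch_code": 7},
]
defaulted = [0, 1, 0, 1]  # hypothetical target: 1 means the customer defaulted

engineered = engineer(raw_rows)
selected = select_top_k(engineered, defaulted, k=2)
print(selected)
```

Dropping uninformative columns such as the constant `branch_code` leaves the model with fewer inputs to fit, which is one way feature selection can help limit overfitting.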
✅ evoML took just 2 weeks from ideation to production
Deploying a machine learning model often requires months of effort and significant resources. For instance, the bank’s in-house models took over a year to be deployed. In contrast, evoML provided the bank with a production-ready model within two weeks, while also achieving higher prediction accuracy.
“The model we got in two weeks has 20% higher accuracy and 4x faster speed than a model our team worked on for a year.”