Data Scientist Hiring Challenge
Important Note - please do NOT put “Synpulse” or “Synpulse8” in your code or documents
Level: intermediate
Background
A well-established bank in Asia Pacific wants to use data to improve its nonperforming loan ratio.
The bank aims to predict whether a loan will be paid back. To do so, it needs to gather the right data, analyze it, and build a predictive model. Since this is the first time the bank is attempting to use data actively, it will need help finding the right data to predict nonperforming loans and whether a client can pay the installment due. At the moment, the bank depends entirely on the credit risk rating and its past experience. As a result, it cannot accurately predict on a continuous, month-to-month basis whether a client can repay their loan, and so it cannot serve its clients better or prevent them from defaulting.
The bank's goals are to:
- Predict if a client is able to pay the installment in any given month
- Predict how much a client may be able to pay in any given period
- Accurately predict the risk of a client defaulting
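One possible way to frame the three goals above as modeling targets is sketched below: a binary label (was the installment paid in full), a numeric target (amount paid), and a default flag. The record layout, the field names, and the rule "default after three consecutive missed installments" are all illustrative assumptions; the brief does not define default, so the threshold should be replaced by whatever definition the chosen data supports.

```python
from dataclasses import dataclass


@dataclass
class MonthlyRecord:
    # Hypothetical layout of one loan-month observation.
    loan_id: str
    month: str          # "YYYY-MM"
    amount_due: float
    amount_paid: float


def build_targets(records, default_after_missed=3):
    """Derive one row of targets per month for a single loan.

    default_after_missed is an assumed threshold, not given in the brief.
    """
    rows = []
    missed_streak = 0
    for r in sorted(records, key=lambda rec: rec.month):
        paid_in_full = r.amount_paid >= r.amount_due
        # Count consecutive months in which the installment was not fully paid.
        missed_streak = 0 if paid_in_full else missed_streak + 1
        rows.append({
            "loan_id": r.loan_id,
            "month": r.month,
            "paid_installment": int(paid_in_full),                     # goal 1
            "amount_paid": r.amount_paid,                              # goal 2
            "in_default": int(missed_streak >= default_after_missed),  # goal 3
        })
    return rows


history = [
    MonthlyRecord("L001", "2023-01", 500.0, 500.0),
    MonthlyRecord("L001", "2023-02", 500.0, 200.0),
    MonthlyRecord("L001", "2023-03", 500.0, 0.0),
    MonthlyRecord("L001", "2023-04", 500.0, 0.0),
]
targets = build_targets(history)
```

With targets in this shape, the first goal can be treated as classification, the second as regression, and the third as classification or a time-to-event problem, depending on the data found.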
Objective
Describe the problem according to the stated goals and solve it.
To solve the problem, you will need to find data. At a minimum, the data should include:
- client file, including diverse demographic data
- loan data, including loan tenure, amount, payments, loan status, and a loan identifier
- account transactions
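As a rough sketch of how the three sources listed above might be combined, the example below joins a client file, loan data, and account transactions into one row per loan and month. All field names and sample values are hypothetical, and the only derived feature is an assumed monthly net cash flow; a real dataset will dictate its own keys and columns.

```python
from collections import defaultdict

# Hypothetical sample data; real sources will have different fields.
clients = {"C1": {"age": 41, "employment": "salaried"}}
loans = [{"loan_id": "L001", "client_id": "C1", "tenure_months": 24,
          "amount": 12000.0, "status": "active"}]
transactions = [
    {"client_id": "C1", "date": "2023-01-05", "amount": 3000.0},
    {"client_id": "C1", "date": "2023-01-20", "amount": -1200.0},
    {"client_id": "C1", "date": "2023-02-05", "amount": 3000.0},
]


def monthly_net_flow(transactions):
    """Sum transaction amounts per (client_id, "YYYY-MM")."""
    flow = defaultdict(float)
    for t in transactions:
        month = t["date"][:7]  # ISO dates assumed, so the first 7 chars are the month
        flow[(t["client_id"], month)] += t["amount"]
    return flow


def build_panel(clients, loans, transactions):
    """Return one row per loan and month, with client and loan attributes attached."""
    flow = monthly_net_flow(transactions)
    panel = []
    for loan in loans:
        client_id = loan["client_id"]
        months = sorted(m for (cid, m) in flow if cid == client_id)
        for month in months:
            panel.append({**loan, **clients[client_id],
                          "month": month,
                          "net_flow": flow[(client_id, month)]})
    return panel


panel = build_panel(clients, loans, transactions)
```

A loan-month table like this is one convenient input for month-by-month predictions, since each row carries the demographic, loan, and transaction information available for that period.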
Deliverables
For every loan analyzed, the submission files should contain the prediction as well as the predicted contribution on a monthly basis.
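The brief does not prescribe a file format, so the following is only one plausible layout: a CSV with one row per loan and month. The column names are assumptions, and "predicted contribution" is interpreted here as the predicted payment amount for that month.

```python
import csv
import io


def write_submission(predictions, out):
    """Write one row per loan and month to a file-like object as CSV."""
    # Assumed column names; the brief does not specify them.
    fieldnames = ["loan_id", "month", "will_pay", "predicted_payment"]
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    # Sort so each loan's months appear together in chronological order.
    for row in sorted(predictions, key=lambda r: (r["loan_id"], r["month"])):
        writer.writerow({
            "loan_id": row["loan_id"],
            "month": row["month"],
            "will_pay": int(row["will_pay"]),
            "predicted_payment": f'{row["predicted_payment"]:.2f}',
        })


predictions = [
    {"loan_id": "L001", "month": "2023-02", "will_pay": False, "predicted_payment": 180.5},
    {"loan_id": "L001", "month": "2023-01", "will_pay": True, "predicted_payment": 500.0},
]
buffer = io.StringIO()
write_submission(predictions, buffer)
csv_text = buffer.getvalue()
```

To write to disk instead of memory, the same function can be passed a file opened with `open(path, "w", newline="")`.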