Bondora Investments Using Decision Trees – Review of Progress – Part 6

This is part 6 of a series of guest posts by British Bondora p2p lending investor ‘ParisinGOC’. Please read part 1, part 2, part 3, part 4 and part 5 first.

Plan Your Change And Change Your Plan!

As stated in the previous article (see part 1-3) and revealed in the graphs of performance, I started using the Decision Trees in response to the rapid rise in defaults in my portfolio. Except for very small numbers of “opportunistic” purchases, I have maintained a strict discipline on purchase in order to ensure that my progress could be monitored and assessed. As my confidence has grown, I have modified this discipline to take advantage of the Bondora environment to achieve the demanding personal goals I had set myself when I first started. These included only purchasing Loan parts that should accrue 50% interest over the forecast life of the loan – i.e. should turn 5 Euro into 7.5 Euro over the original loan period.

Since early June, I have modified this discipline further and now purchase loans that, whilst still meeting my overarching rule of looking for 5% to 7% historical default levels, do not have a high enough interest rate to meet my earlier profitability goal. I intend to try and sell these loan parts on the Secondary Market with a short-term profit goal, after Purchase/Sale costs.

This further leg of my overall strategy is still in its infancy, but the results from my use of Decision Trees in my initial selection of Loan Applications suggest I am buying the best performing loans available. This means that should other investors not share this view, I will at least be left with Loan Parts that will perform well for me for the time I hold them.

Given the latest changes at Bondora mentioned earlier, if I can only acquire “good” loans (as defined by the Decision Tree analysis) from the Secondary Market, this buy-to-sell tactic may not be possible in future.

Tree development

Tree Analysis

In the previous article (see part 1-3) on the construction of the Decision Trees, I explained how I had made adjustments to the overall analysis process to give more weight to factors such as “Total Income” in the actual Decision Tree analysis. I have kept the included data under constant review and have added a few further fields to the analysis process, in particular the field showing the “Total Monthly Income/ New Repayment”. As stated in the first article, this needed to be converted from an infinitely variable (continuous) value into 20 ranges, each containing an equal number of samples.
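The conversion into 20 equally populated ranges can be sketched as follows. This is a minimal illustration in plain Python, not the tool actually used for the analysis; the function name `equal_frequency_bins` and the sample ratios are hypothetical.

```python
def equal_frequency_bins(values, n_bins=20):
    """Assign each value a bin index 0..n_bins-1 so bins hold near-equal counts."""
    # Rank the positions by value; Python's stable sort keeps tied values
    # in their original order.
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        # The lowest-ranked values go to bin 0, the highest to bin n_bins-1.
        bins[idx] = rank * n_bins // len(values)
    return bins


# Hypothetical income/repayment ratios, purely for illustration.
ratios = [i * 0.37 % 11 for i in range(200)]
bins = equal_frequency_bins(ratios, n_bins=20)
counts = [bins.count(b) for b in range(20)]
print(counts)
```

One caveat of this simple approach is that identical values sitting on a range boundary can land in neighbouring ranges, so real data with many ties may need the boundaries adjusted.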

I mention this particular field as, since January 2015, it appears as an important feature in both the Estonia and Finland Trees and continues to appear more often in these Trees.

Volume and confidence

It is a fact that Estonia has been the largest market for Bondora from its days as Isepankur. In simple volume terms, the data I use (from 1/1/2013) shows that Estonia accounts for c.50% of the total loans, with Finland and Spain making up about 25% each. Slovakia is simply no longer mentioned in polite Bondora society, so I will pretend it never happened!

Whilst it is true that Estonia has a lower historical default rate, in the dataset that I use, defaults do occur and are presently running at around 11.986% (1009 out of 8418), compared with exactly 18% (576 out of 3200) for Finland and 27.059% (1022 out of 3777) for Spain.

The above figures carry several implications as follows:

The Estonian Tree is fairly static with few changes at the highest levels. Estonian Loans within Bondora bring with them a richness in the data, by which I mean that the original Credit Scores are well represented across the Loan Applications compared to Finland and Spain, which are almost entirely populated with examples with a Credit Score of “1000”. What this means for Estonia is that the Decision Tree neatly shows that the Bondora Credit Score is relatively accurate, with higher numbers of defaults at lower Credit Scores. Thus it is that the historical record shows that Loan Applications with a Credit Score of “1000” (the highest and most sought after) make for good hunting when searching for segments having a default rate of less than 5%. Indeed, it is not uncommon for the Decision Trees to reveal segments of 50+ examples with NO defaults over the last 2.5 years.
Finland and Spain, however, have very few historical Loan Applications with a Credit Score of anything other than “1000”, default rates 50% and over 100% higher respectively than Estonia, AND volumes less than half that of Estonia. Together, these provide pitifully few obvious segments with a sub-5% default rate AND sufficient numbers of examples to support anything like the confidence levels of Estonia.
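The percentages quoted above can be checked with a few lines of arithmetic. The sketch below uses only the default and loan counts given in the text; the variable names are illustrative.

```python
# Figures quoted in the text: (defaults, loans) per country since 1/1/2013.
portfolio = {
    "Estonia": (1009, 8418),
    "Finland": (576, 3200),
    "Spain": (1022, 3777),
}

# Default rate per country.
rates = {country: d / n for country, (d, n) in portfolio.items()}

# How much higher each default rate is relative to Estonia (0.5 means 50% higher).
relative = {country: rates[country] / rates["Estonia"] - 1 for country in portfolio}

# Loan volume relative to Estonia.
volume_ratio = {country: n / portfolio["Estonia"][1] for country, (_, n) in portfolio.items()}

for country in portfolio:
    print(f"{country}: default rate {rates[country]:.3%}, "
          f"{relative[country]:+.0%} vs Estonia, "
          f"volume {volume_ratio[country]:.0%} of Estonia")
```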

I believe that the lack of richness in the Finnish and Spanish data is revealed in the overall structure of the different Trees.

Estonia

The top-most branch in the Estonian Tree is based upon the Employment Status of Estonian Applicants. This field takes 5 different values: Full Employment (c.90%), Entrepreneur (c.4%), Self-Employed, Retired and, finally, Partially Employed (the last three at c.2% each).

The Credit Score generally appears at the 2nd, 3rd or 4th level below this and, as stated above, provides a firm “fault line” between >5% and <5% default rates in most of the segmentation below these levels.

As noted earlier, for those in Full Employment, initially Income and latterly the ratio of cost to income (which I refer to subsequently as “Affordability”) is the next most significant differentiator, followed by Credit Score. Somewhat below this level, the paths diverge and different data elements become significant.

A strange (in my eyes) feature of what I call “Affordability” in the Estonian Tree for those in Full Employment is the apparent truth that the more easily someone can afford to cover the cost of the loan, the less likely they are actually to do so and the more likely it is that default will occur! 17.333% (65 out of 375) of those in Full Employment who appear to be most able to afford their loans go on to default, whereas only 6.54% (24 out of 367) of those in Full Employment showing the lowest affordability have defaulted. So it seems that, in Estonia, the higher the ability to pay, the less likely repayment is to occur!

Finland

The lack of richness in the Credit Scores provided by Bondora for Finnish (and Spanish) Loan Applicants is revealed, as the Credit Score is the primary determinant at the top level. This is, however, an almost totally useless determinant, as just over 98% (just under 98% for Spain) of all Finnish Loan Applications carry a Credit Score of “1000”. Below this level, Employment Status is the prime determinant, as in Estonia, but there any resemblance ends: lacking a useful Credit Score and with lower overall volumes, there is no common thread to the analysis.

Latterly the ratio of cost to income (what I have termed “Affordability”) has crept in at lower levels, but the Tree has not settled into any discernible pattern at these levels, with changes occurring at every iteration.

Such are the problems with low volumes and high default rates that I have changed the parameters for the Decision Trees for Finland and Spain to force the analysis to work with higher volumes in the nodes and leaves (end points) in an attempt to increase confidence levels. This has the unfortunate side effect of there being few leaves with a sub-5% default rate, the notable exception being a leaf of 23 examples with a 0% default rate.
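To illustrate what forcing higher volumes into the nodes and leaves can mean in practice, here is a minimal ID3-style sketch in plain Python with a minimum leaf size. It is not the software or the parameter settings used for the analysis described here, which are not given in this article. The `min_leaf` parameter, the attribute names and the synthetic loans are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(rows):
    """Shannon entropy of the 'default' label over a list of loan dicts."""
    counts = Counter(r["default"] for r in rows)
    return -sum(c / len(rows) * math.log2(c / len(rows)) for c in counts.values())


def split(rows, attr):
    """Group rows by the value of one categorical attribute."""
    groups = {}
    for r in rows:
        groups.setdefault(r[attr], []).append(r)
    return groups


def build_tree(rows, attrs, min_leaf=20):
    """ID3-style tree: pick the attribute with the highest information gain,
    but only accept a split if every child keeps at least min_leaf rows."""
    best_attr, best_gain, best_groups = None, 0.0, None
    for attr in attrs:
        groups = split(rows, attr)
        if len(groups) < 2 or min(len(g) for g in groups.values()) < min_leaf:
            continue  # this split would create an undersized node
        remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
        gain = entropy(rows) - remainder
        if gain > best_gain:
            best_attr, best_gain, best_groups = attr, gain, groups
    if best_attr is None:
        # No acceptable split: report this node as a leaf.
        defaults = sum(r["default"] for r in rows)
        return {"leaf": True, "n": len(rows), "default_rate": defaults / len(rows)}
    rest = [a for a in attrs if a != best_attr]
    children = {v: build_tree(g, rest, min_leaf) for v, g in best_groups.items()}
    return {"leaf": False, "attr": best_attr, "children": children}


def leaves(node):
    """Collect every leaf below a node."""
    if node["leaf"]:
        return [node]
    return [leaf for child in node["children"].values() for leaf in leaves(child)]


# Synthetic, hypothetical loans: NOT Bondora data.
loans = []
for i in range(400):
    employment = "Full" if i % 4 else "Other"
    score = "1000" if i % 5 else "900"
    default = 1 if (score == "900" and i % 2 == 0) or (employment == "Other" and i % 3 == 0) else 0
    loans.append({"employment": employment, "credit_score": score, "default": default})

tree = build_tree(loans, ["employment", "credit_score"], min_leaf=20)
for leaf in leaves(tree):
    print(leaf["n"], f"{leaf['default_rate']:.1%}")
```

Raising `min_leaf` trades detail for confidence: each leaf rests on more examples, but small pockets of very low default rates are merged into their larger, riskier neighbours, which matches the side effect described above.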

Spain

As noted above, Spain shares with Finland the feature of Credit Score and Employment Status being the top 2 levels, but for Spanish Loan Applicants in Full Employment, the number of Dependants appears to be the most important factor and has remained so for over 6 months of analysis. This data element does appear occasionally in both other trees, but only at much lower levels.

Other than this notable difference, the overriding feature of the Spanish Decision Tree is the lack of leaves showing a sub-5% default rate. Even where sub-5% default rates can be found, the leaves contain so few examples, with so little in the way of trend or discernible pattern, that they cannot support confidence at any instinctive level.

The best sub-5% default rate is a leaf of 21 examples, being 4.76% (1 out of 21), for fully employed, divorced people with 1 dependant living in Pre-Furnished property! All other leaves with a sub-5% default rate are based on fewer than 10 examples. Many are only single examples.
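One hedged way to put a number on how little a small leaf can tell us is a Wilson score interval for its default rate. This is a standard statistical technique swapped in here for illustration, not part of the analysis described in this article. The sketch assumes the Spanish leaf of 21 examples contains a single default, and compares it with the Finnish leaf of 23 examples with no defaults and with the overall Estonian figures.

```python
import math


def wilson_interval(defaults, n, z=1.96):
    """Wilson score interval for a default rate; z=1.96 gives roughly 95% coverage."""
    p = defaults / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


segments = [
    ("Spain leaf (assumed 1 default in 21)", 1, 21),
    ("Finland leaf (0 defaults in 23)", 0, 23),
    ("Estonia overall", 1009, 8418),
]
for label, defaults, n in segments:
    low, high = wilson_interval(defaults, n)
    print(f"{label}: observed {defaults / n:.2%}, plausible range {low:.2%} to {high:.2%}")
```

For both small leaves the upper end of the range sits well above 5%, whereas the range for Estonia as a whole is narrow, which is the numerical counterpart of the lack of confidence described above.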

A competent statistician (which I am not!) may be able to pry some hidden gems from this Tree, but I fear not.

Conclusion

The Decision Trees themselves, whilst changing over time, now appear to have settled down and changes that occur do so at finer levels of granularity with only occasional changes in the overall structure of any particular tree.

The numbers of samples (the complete Bondora dataset) entering the process have now reached the level where the Trees for Finland and Spain required modification of the actual Decision Tree analysis (known as an “ID3” tree) to increase the sample sizes at the lowest level. This has increased my confidence in the output, even though the levels of default are so high that identifying sub-5% default levels leaves me rejecting many more Loan Applications than I actually invest in.

My initial, restricted purchasing at the start of my new strategy has opened out over the course of the period under review. After an initial period where my cash reserves grew to over 25% of my initial investment at Bondora, I am now confidently pursuing new avenues of activity with a view to maximising my returns within the opportunities suggested by the Decision Tree analysis.

This success in using manual selection of investment opportunities comes in the face of constant change at Bondora, change that is trying to move the investment process towards a passive, easy-to-use activity – an understandable business logic.

I take some comfort that my total efforts to date (which include aggressive management of non-performing loans) appear to be returning better than average results. In conclusion, I believe that my change from instinct- to numbers-led investing has improved my portfolio performance when measured by this admittedly coarse scale of default level. Furthermore, this process has allowed me to start to take a wider view of the opportunities available on the Bondora platform and I hope to be steering my returns back to the levels that initially drew me to this platform.

In terms of performance over the past 9 months, I have experienced severely reduced default levels compared to those that triggered my realisation that a new investment strategy had to be formulated. I am now seeing levels similar to those last observed almost 2 years ago, on purchasing volumes approximately double those from that time. I will be the first to admit that the loans purchased over the last 9 months have yet to “mature” to the level of those from nearly 2 years ago, but I have a renewed confidence in the future performance of my portfolio at Bondora.

P2P-Banking.com thanks the author for sharing his experiences and strategy in detail.

Back in March an investor from Luxembourg wrote an article sharing his experiences in applying machine learning to peer-to-peer lending at Bondora.
