Paper Complete: Predicting The Outcome of Chapter 13 Bankruptcy Filings

Can we predict the success rate of Chapter 13 bankruptcy filings? Over the summer, we discussed the work of Warren Agin on bankruptcy, machine learning, and prediction models. One of the ultimate goals of legal analytics research is to increase positive outcomes for both civil and criminal court litigants, and to do so more efficiently. In his paper, now available for full download at SSRN, Warren examines whether, and to what extent, we can predict that a Chapter 13 bankruptcy filing will succeed.

(For more data and details, visit the GitHub page as well)


Available Chapter 13 case information is vast enough to be able to generate accurate machine learning models for large numbers of cases. Warren’s research confronts whether information known before a case begins is enough for a machine learning model to identify meaningful relationships in the data. Furthermore, would a model built from pre-case data be able to predict whether an individual bankruptcy case will succeed in the future?

This is a powerful notion, but so far it has not been within reach. Many groups of researchers have already dipped their toes into the waters of studying these sorts of bankruptcy cases, but not for the purposes of prediction. Some research has found that pro se filings (filings without attorney assistance), failure to pay filing fees, failure to complete necessary schedules, and failure of a previous filing are all strong predictors of case failure. Other research has used logistic regression to develop some meaningful equations to evaluate relationships between various factors present in a Chapter 13 case.

This prior research has established a good jumping-off point, but has been limited in its examination of prediction accuracy. The logistic regression research, for example, did not use a large enough data set, nor did it cast a wide enough net for its data (which came from only one U.S. state), to create a generally applicable model.

Warren’s Research

Warren set out to predict case outcomes, rather than provide more of the post-mortem assessments offered by prior research. To give himself a reasonable chance of establishing a viable prediction model, Warren gathered a much larger sample data set than previous researchers in the field had gathered. Taking all Chapter 13 cases from the years 2008 and 2009, with some exclusions and deduplication, provided a data set of about three-quarters of a million cases, a far heftier sample than those used in the research mentioned above.

With the sample data set defined, Warren looked for consistent patterns that could be used to predict future results, and also looked for problems in the data that could water down a machine learning algorithm’s predictive power. Warren tried a variety of statistical techniques, finding that the random forest method was the most viable for a Chapter 13 data set. Warren used the random forest, and lessons learned from prior studies, to build his machine learning model.
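For readers who have not met the technique, here is a minimal sketch of the random forest idea in plain Python. It is not Warren’s model or code: the three features (pro se flag, prior filing flag, income-to-expense ratio), the eight cases, and every function name are invented for illustration, and each "tree" is cut down to a single split. The point is only the two ingredients that define a random forest: each tree sees a bootstrap sample of the cases and a random subset of the features, and the forest predicts by majority vote.

```python
import random
from collections import Counter

def majority(labels):
    """Most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]

def gini(labels):
    """Gini impurity of a non-empty list of 0/1 labels."""
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def fit_stump(rows, labels, features):
    """Pick the single (feature, threshold) split with the lowest weighted Gini."""
    best = None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t, majority(left), majority(right))
    if best is None:
        # No candidate feature varies in this sample: always predict the majority.
        m = majority(labels)
        return (features[0], float("inf"), m, m)
    return best[1:]

def fit_forest(rows, labels, n_trees=25, n_features=2, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]          # sample cases with replacement
        feats = rng.sample(range(len(rows[0])), n_features)  # random subset of features
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats))
    return forest

def predict(forest, row):
    """Majority vote across all stumps in the forest."""
    votes = [(left if row[f] <= t else right) for f, t, left, right in forest]
    return majority(votes)

# Invented toy cases: [pro_se, prior_filing, income_to_expense_ratio]
ROWS = [
    [1, 1, 0.70], [1, 0, 0.80], [0, 1, 0.90], [1, 1, 0.95],
    [0, 0, 1.05], [0, 0, 1.10], [0, 0, 1.20], [0, 0, 1.40],
]
LABELS = [0, 0, 0, 0, 1, 1, 1, 1]  # 0 = case failed, 1 = discharge obtained

forest = fit_forest(ROWS, LABELS)
print(predict(forest, [1, 1, 0.5]))  # a pro se repeat filer with a low ratio
print(predict(forest, [0, 0, 1.6]))  # a represented first-time filer with a high ratio
```

A production forest would grow much deeper trees and re-draw the feature subset at every split; in practice one would reach for a library implementation rather than hand-rolled code like this.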

Example of Random Forest

The Results

One key insight that likely will not shock any lawyers reading this: pro se filers of Chapter 13 cases are far less successful than filers who have the assistance of counsel. In addition, cases where debtors had a prior filing also had a reduced chance of success. Many of these prior filings had been pro se, as well.

These results feel intuitive. But what is integral to studies like this, and like the research we conduct with FantasySCOTUS, is that no matter how intuitive a conclusion may seem, it may not have any actual evidence behind it. Most people are not surprised to learn that pro se litigants must climb steeper hills to achieve favorable outcomes. But this is a conclusion often reached via “common sense”, rather than empirical data. Studies like Warren’s bring hard data into the legal arena, in order to shore up insights that are often still reached via common sense alone.

At the computational level, Warren also found that the random forest method is superior to neural networks, the latter often overfitting to the training data. Additionally, the random forest method achieved accuracy scores above the baseline, a notable benchmark for machine learning models.
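To make "above the baseline" concrete, here is a small sketch. It assumes the baseline is the accuracy you would get by always guessing the most common outcome; this post does not say which baseline the paper used, so treat that as our assumption. The outcomes and predictions below are invented.

```python
from collections import Counter

def majority_baseline_accuracy(labels):
    """Accuracy of always guessing the most common outcome."""
    return Counter(labels).most_common(1)[0][1] / len(labels)

def accuracy(y_true, y_pred):
    """Share of cases where the prediction matches the actual outcome."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Invented outcomes: 0 = case failed, 1 = discharge obtained.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 1, 0]  # hypothetical model output

baseline = majority_baseline_accuracy(y_true)  # always guessing "failed"
model_acc = accuracy(y_true, y_pred)
print(f"baseline={baseline:.2f} model={model_acc:.2f}")
```

A model only earns its keep to the extent that its accuracy clears this kind of naive guess.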

The final random forest decision tree model can predict whether a debtor will obtain a Chapter 13 discharge with 70% accuracy. However, the performance of the model differs depending on the prediction: it is better at predicting failure than it is at predicting success.
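One way to see how a single accuracy figure can hide that asymmetry is to split the results by actual outcome. The counts below are invented (they are not from the paper) and were chosen only so that overall accuracy lands on 70%; the variable names are ours.

```python
def recall(correct, missed):
    """Share of cases with a given true outcome that the model got right."""
    return correct / (correct + missed)

# Invented counts for 100 hypothetical held-out cases.
failed_predicted_failed = 45
failed_predicted_success = 15
success_predicted_failed = 15
success_predicted_success = 25

total = (failed_predicted_failed + failed_predicted_success
         + success_predicted_failed + success_predicted_success)
overall_accuracy = (failed_predicted_failed + success_predicted_success) / total
failure_recall = recall(failed_predicted_failed, failed_predicted_success)
success_recall = recall(success_predicted_success, success_predicted_failed)

print(overall_accuracy, failure_recall, success_recall)
```

With these made-up numbers the model catches three out of four failures but a smaller share of the successes, even though the headline figure is 70%.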

Three metrics had an outsize effect on Chapter 13 outcomes, according to the machine learning model:

  • The success of bankruptcy cases in the particular judicial district
  • The ratio of income to expenses for a filer
  • A filer’s unsecured debt

These factors do not imply causation, and in the case of district location likely don’t even demonstrate a real correlation. The data also don’t show whether the machine learning model found these metrics relevant to case success versus case failure. The model only shows the importance of that data to its own calculations; it does not necessarily point a causal arrow in any particular direction. The model “can identify cases that are highly likely to fail, those almost certain to succeed, and the large number of cases that have, and should be given, a chance for success.”
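The point that importance is about the model’s own calculations can be illustrated with permutation importance: shuffle one feature’s values across cases and see how much accuracy drops. This is one common way to measure importance, swapped in here for illustration; this post does not say how the paper computed it. The model, feature names, and cases below are all hypothetical.

```python
import random

def toy_model(row):
    """Hypothetical stand-in for a trained model; it looks at one feature only."""
    return 1 if row["income_to_expense"] >= 1.0 else 0

def accuracy(model, rows, labels):
    """Share of rows where the model's prediction matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(model, rows, labels, feature, repeats=20, seed=0):
    """Average drop in accuracy when one feature's column is shuffled."""
    rng = random.Random(seed)
    base = accuracy(model, rows, labels)
    total_drop = 0.0
    for _ in range(repeats):
        column = [r[feature] for r in rows]
        rng.shuffle(column)
        # Copy each row, overwriting only the shuffled feature.
        shuffled = [dict(r, **{feature: v}) for r, v in zip(rows, column)]
        total_drop += base - accuracy(model, shuffled, labels)
    return total_drop / repeats

# Invented cases; labels are taken from the toy model so base accuracy is 1.0.
ROWS = [
    {"income_to_expense": 0.70, "filing_month": 1},
    {"income_to_expense": 0.80, "filing_month": 4},
    {"income_to_expense": 0.90, "filing_month": 7},
    {"income_to_expense": 0.95, "filing_month": 10},
    {"income_to_expense": 1.05, "filing_month": 2},
    {"income_to_expense": 1.10, "filing_month": 5},
    {"income_to_expense": 1.20, "filing_month": 8},
    {"income_to_expense": 1.40, "filing_month": 11},
]
LABELS = [toy_model(r) for r in ROWS]

ratio_importance = permutation_importance(toy_model, ROWS, LABELS, "income_to_expense")
month_importance = permutation_importance(toy_model, ROWS, LABELS, "filing_month")
print(ratio_importance, month_importance)
```

Note what the score does and does not say: a large drop tells you the model leans on that feature, but nothing about whether higher values push toward success or failure, and nothing about cause.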

Case Volume (size) and Success Rate (shade)

The results of this research raise more questions than they answer (frankly, all the best research does). However, this work does represent a large step forward in predicting Chapter 13 bankruptcy outcomes, and it highlights which factors matter, and which matter relatively little, to the success or failure of Chapter 13 filings.

This prediction model, for example, may demonstrate usefulness in the following ways:

  • Consumer debt attorneys could use the model to help counsel their clients, so clients can make better-informed decisions on whether or not to file a Chapter 13 case
  • The possibility of an unfavorable outcome may spur a potential filer to re-strategize, thus improving their final result
  • Less experienced associates can handle certain cases with more certain odds, while more experienced attorneys can lend oversight to cases with less certain forecasted outcomes
  • Creditors could use these models to assist in financial modeling and risk assessment
  • Resources designed to help Chapter 13 debtors could be more effectively implemented after incorporating the data from this model and others like it

Prediction in Bankruptcy and in SCOTUS

This research has a narrow focus, much like the narrow focus of FantasySCOTUS and our {Marshall}+ algorithm. But the aggregation of prediction data from multiple studies on machine learning models is contributing to a better understanding of multiple aspects of the law. {Marshall}+, for example, operates in a similar way to Warren’s bankruptcy model. Factors present in Supreme Court decisions – such as a Justice’s past voting behavior, Constitutional vs. appellate jurisdiction, etc. – can be weighted and accounted for by models similar to the Chapter 13 bankruptcy model demonstrated in this paper.

There are a few caveats, of course. Much as the {Marshall}+ algorithm might forecast based on certain quirks of a case that a human observer would disregard as irrelevant, the model in this study cannot easily “explain” how it arrived at its conclusions. Nevertheless, the accuracy of the model is high enough to be noteworthy.

What makes machine learning an exciting field is that much of the research is brand new. Warren has taken another step toward greater understanding of Chapter 13 bankruptcy, and by taking that step we now know a bit more about how to build a better machine learning predictive model.

More Info

Warren Agin’s full paper is now available for download from SSRN. You can also check out more data from the paper on GitHub. FantasySCOTUS is currently underway for the 2018 Term. Visit the FantasySCOTUS website to learn more about the competition, the cash prizes on offer, and how to build a custom league for you and your friends and colleagues.

To learn more about LexPredict’s product and service offerings, visit our products page. For additional research papers on machine learning, legal analytics, Supreme Court prediction, and more, visit our Research Projects page. If you would like to speak with LexPredict staff about software licensing, support services for our open source products, consulting services, professional training, legal data strategy, or any other questions pertaining to improving technology, people, and processes at your legal organization, drop us a line at
