Looking at credit scores only tells part of the story – cashflow data may tell another part
Most loan underwriting in the United States makes use of credit reporting data to evaluate repayment risk. Lenders frequently use third-party credit scores, and many also develop their own proprietary models. Credit reporting data include individuals’ performance on a variety of credit products, such as mortgages, credit cards, auto loans, and student loans, as well as certain public records and some other forms of lending.1
Cashflow data (broadly defined as various inflows, outflows, and accumulated amounts in checking and savings accounts) may provide lenders with more information about how applicants manage current obligations than they could learn from applicants’ credit repayment histories alone. More research is needed to understand the extent to which cashflow data might enable better predictions about a person’s ability to repay their loans.2
In this blog post, we use the CFPB’s Making Ends Meet survey and the linked Consumer Credit Panel (CCP) to show that three self-reported proxies for cashflow appear predictive of serious delinquency, even when analyzing people with similar traditional credit scores.3 The three proxies we use are high accumulated savings, regularly saving and no overdrafts, and paying bills on time. Though our results suggest that using cashflow data may improve underwriting, there are two major caveats: our effective sample size is small (only hundreds of consumers), and the cashflow measures are self-reported proxies.4 We focus on people with credit scores under 720 (below the superprime segment) because people with superprime credit scores are very unlikely to become seriously delinquent, so there is a smaller margin for cashflow data to be informative.
Our analysis suggests that people who self-report positive cashflow perform considerably better than people who self-report less positive cashflow, even when holding credit scores constant. Depending on the cashflow proxy used, consumers with positive self-reported cashflow are at least roughly 20 percent less likely to become seriously delinquent. While only one of our three cashflow proxy results is statistically significant, each of the results is large in magnitude.5 The finding that cashflow data are predictive of serious delinquencies (even for people with the same credit scores) is consistent with other reports and provides evidence on which measures of cashflow appear to have more stable effects.
Credit scores may in part reflect the unequal circumstances that people face, and there are ongoing debates regarding equity and fairness.6 Similar issues might arise with cashflow metrics.7 Given the small sample sizes of some demographic groups in our data, we do not break down results by self-reported race, ethnicity, or any other demographics. However, the evidence in this blog suggests that cashflow data may help lenders better identify borrowers with a low likelihood of serious delinquency, even if these borrowers’ credit scores might otherwise have prevented them from receiving credit. Using positive cashflow data in underwriting may improve access to credit for populations with historically low credit scores. The CFPB is undertaking rulemaking to help consumers share such data with lenders willing to use it in their underwriting.
We conduct this analysis using responses from the CFPB’s Making Ends Meet survey, which is sampled from our CCP. The CCP is a 1-in-48 deidentified sample of credit reports from one of the three nationwide consumer reporting agencies. In the survey, the CFPB asks consumers questions about various aspects of financial well-being, such as challenges faced in making various payments and perceptions of financial stability. The Making Ends Meet survey was sent out in March 2019 and responses were received by May 2019.8 Our sample includes only people who have a credit score that we can control for in our analyses, and it uses cashflow data from the March 2019 survey (the first wave of the first sample, which is sufficiently long ago to observe two years of performance). Our analysis does not consider changes to people’s cashflow over time.
Cashflow proxy 1: high accumulated savings
Our first cashflow proxy looks at people with relatively high accumulated savings. We define this group as people who reported at least $3,000 across their checking and savings accounts, or who could cover at least three months’ worth of expenses if their household lost its primary source of income.9,10
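As a minimal sketch, the definition above can be written as a simple either/or rule. The class and field names (`SavingsAnswers`, `liquid_balance`, `months_of_expenses_covered`) are hypothetical stand-ins for the survey answers, not the survey's actual variable names.

```python
from dataclasses import dataclass


@dataclass
class SavingsAnswers:
    # Hypothetical fields; the actual survey items are worded differently.
    liquid_balance: float              # dollars across checking and savings
    months_of_expenses_covered: float  # months covered if primary income is lost


def has_high_accumulated_savings(answers: SavingsAnswers) -> bool:
    """Proxy 1: at least $3,000 on hand OR at least three months of expenses."""
    enough_dollars = answers.liquid_balance >= 3000
    enough_months = answers.months_of_expenses_covered >= 3
    return enough_dollars or enough_months


# Either condition alone is enough to count as high accumulated savings.
print(has_high_accumulated_savings(SavingsAnswers(3000, 1)))
print(has_high_accumulated_savings(SavingsAnswers(500, 3)))
print(has_high_accumulated_savings(SavingsAnswers(2999, 2.5)))
```

Because the two conditions are joined with "or", a respondent with a modest balance but low monthly expenses can still qualify.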
In Table 1, we show that people with higher credit scores are more likely to report high accumulated savings. High accumulated savings appear to protect against serious delinquency: people with high accumulated savings and credit scores below 720 are less likely to become seriously delinquent within the next two years than others in their credit score bin.
Table 1: Proxy 1: High Accumulated Savings
Credit scores | Number of survey respondents | Reported high accumulated savings | Percent delinquent who reported high accumulated savings | Percent delinquent who did not report high accumulated savings |
---|---|---|---|---|
Under 620 | 306 | 15% | 21% | 41% |
Between 620 and 659 | 123 | 29% | 9% | 33% |
Between 660 and 719 | 279 | 48% | 4% | 13% |
720 and over | 1354 | 81% | 1% | 2% |
Note: The sample includes all respondents to the Making Ends Meet March 2019 survey who answered the survey questions that were used to construct the cashflow proxies. Not all respondents answered all questions, so sample sizes are slightly different across the cashflow proxies.
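To make the unadjusted gaps in Table 1 concrete, the short calculation below computes how much lower the delinquency rate is, within each bin under 720, for respondents who reported high accumulated savings. These are raw comparisons of the rounded table values, not regression-adjusted estimates.

```python
# Rounded values copied from Table 1:
# (percent delinquent with the proxy, percent delinquent without the proxy)
table1 = {
    "Under 620": (21, 41),
    "Between 620 and 659": (9, 33),
    "Between 660 and 719": (4, 13),
}


def relative_reduction(with_proxy: float, without_proxy: float) -> float:
    """Share by which the delinquency rate is lower for the proxy group."""
    return 1 - with_proxy / without_proxy


reductions = {
    bin_name: relative_reduction(with_proxy, without_proxy)
    for bin_name, (with_proxy, without_proxy) in table1.items()
}

for bin_name, reduction in reductions.items():
    print(f"{bin_name}: {reduction:.0%} lower")
# Under 620: 49% lower
# Between 620 and 659: 73% lower
# Between 660 and 719: 69% lower
```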
The average serious delinquency rates shown in Table 1 indicate the importance of high accumulated savings in preventing serious delinquency. To show how this proxy alone is associated with serious delinquency, we run a logistic regression so we can hold other factors related to delinquency constant. We use credit score as the best available measure of a person’s likelihood of delinquency and include linear and quadratic terms to account for nonlinearities in how credit score predicts delinquency. This regression allows us to ask whether, holding credit scores constant, people with positive cashflow are less likely to be seriously delinquent in the future. We do not include consumers with credit scores of 720 and over in our regression samples because, as shown in Table 1, fewer than one percent of these consumers are seriously delinquent in our sample, so cashflow is unlikely to be as informative for them.
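The sketch below illustrates the kind of regression described above: a weighted logistic regression of delinquency on a binary cashflow proxy plus linear and quadratic credit score terms. It is not the CFPB's code. The data are synthetic, the "true" coefficients and weights are invented for illustration, and the fit uses a plain Newton-Raphson loop in standard-library Python so the example is self-contained. It produces point estimates only; survey-appropriate standard errors would need additional work.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_weighted_logit(X, y, w, iterations=25):
    """Newton-Raphson fit of a logistic regression with observation weights."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for xi, yi, wi in zip(X, y, w):
            p = 1 / (1 + math.exp(-sum(b * v for b, v in zip(beta, xi))))
            for j in range(k):
                grad[j] += wi * (yi - p) * xi[j]
                for l in range(k):
                    hess[j][l] += wi * p * (1 - p) * xi[j] * xi[l]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


# Synthetic respondents with scores under 720 (all values are made up).
random.seed(0)
X, y, w = [], [], []
for _ in range(2000):
    score = random.uniform(500, 719)
    s = (score - 620) / 100  # rescaled score keeps the fit numerically stable
    proxy = 1 if random.random() < 0.3 else 0
    true_logit = -1.0 - 1.2 * s - 1.0 * proxy
    delinquent = 1 if random.random() < 1 / (1 + math.exp(-true_logit)) else 0
    X.append([1.0, float(proxy), s, s * s])  # intercept, proxy, score, score^2
    y.append(delinquent)
    w.append(random.uniform(0.5, 2.0))  # stand-in for survey weights

beta = fit_weighted_logit(X, y, w)
print("proxy coefficient (log odds):", round(beta[1], 2))
```

A negative proxy coefficient means that, at a given credit score, respondents with the proxy have lower odds of serious delinquency in the synthetic data.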
Our regression-adjusted difference in average delinquency shows that people with high accumulated savings are close to 70 percent less likely to experience serious delinquency in the two years following the survey.11 Given the limitations of our data and analysis, we consider this average an indication that these cashflow proxies are qualitatively important, as opposed to a precise estimate of the effect. The small sample of respondents results in a wide confidence interval, so we cannot rule out an effect as low as about 45 percent less likely to experience a serious delinquency.12
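The post does not spell out how a "percent less likely" figure is derived from the regression. One common approach, shown here purely as an assumption, is to average the predicted delinquency probability over the sample with the proxy switched on for everyone, do the same with it switched off, and compare the two. The coefficients, rescaled scores, and weights below are hypothetical.

```python
import math


def predicted_prob(beta, proxy, s):
    """Logistic prediction from intercept, proxy, score, and score squared."""
    z = beta[0] + beta[1] * proxy + beta[2] * s + beta[3] * s * s
    return 1 / (1 + math.exp(-z))


def adjusted_relative_difference(beta, scores, weights):
    """Weighted average risk with the proxy on versus off for every person."""
    total = sum(weights)
    risk_on = sum(
        w * predicted_prob(beta, 1, s) for s, w in zip(scores, weights)
    ) / total
    risk_off = sum(
        w * predicted_prob(beta, 0, s) for s, w in zip(scores, weights)
    ) / total
    return 1 - risk_on / risk_off


# Hypothetical coefficients and rescaled credit scores, for illustration only.
beta = [-1.0, -1.2, -1.0, 0.1]
scores = [-1.0, -0.5, 0.0, 0.5, 0.9]
weights = [1.0, 1.5, 0.8, 1.2, 1.0]

diff = adjusted_relative_difference(beta, scores, weights)
print(f"{diff:.0%} less likely to be seriously delinquent")
```

Because both averages are taken over the same set of credit scores, the comparison holds the score distribution fixed and varies only the proxy.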
Cashflow proxy 2: regularly saving and no overdrafts
Our second cashflow proxy looks at people who reported saving money each month and who did not have any bank overdrafts.13
In Table 2, we again show that people in higher credit score bins are more likely to report positive savings and no overdrafts. In the two lowest credit score bins, people are less likely to be seriously delinquent if they report regularly saving and not experiencing any overdrafts; in the two higher bins, the delinquency rates of the two groups are about the same.
Table 2: Proxy 2: Regularly Saving and No Overdrafts
Credit scores | Number of survey respondents | Reported regular savings and no overdrafts | Percent delinquent who reported regular savings and no overdrafts | Percent delinquent who did not report regular savings or reported overdrafts |
---|---|---|---|---|
Under 620 | 295 | 17% | 24% | 40% |
Between 620 and 659 | 120 | 37% | 10% | 35% |
Between 660 and 719 | 277 | 50% | 9% | 8% |
720 and over | 1329 | 75% | 1% | 1% |
Note: The sample includes all respondents to the Making Ends Meet March 2019 survey who answered the survey questions that were used to construct the cashflow proxies. Not all respondents answered all questions, so sample sizes are slightly different across the cashflow proxies.
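The cells in a table like this are group-level delinquency rates. As a rough sketch of how such cells could be computed with survey weights, the code below bins respondents by credit score and takes a weighted share of delinquent respondents in each bin and proxy group. The record layout and the four example records are hypothetical.

```python
from collections import defaultdict


def score_bin(score: int) -> str:
    """Assign a credit score to the bins used in the tables."""
    if score < 620:
        return "Under 620"
    if score < 660:
        return "Between 620 and 659"
    if score < 720:
        return "Between 660 and 719"
    return "720 and over"


def weighted_delinquency_rates(records):
    """records: iterable of (score, has_proxy, delinquent, weight) tuples."""
    weighted_delinquent = defaultdict(float)
    weighted_total = defaultdict(float)
    for score, has_proxy, delinquent, weight in records:
        key = (score_bin(score), has_proxy)
        weighted_total[key] += weight
        weighted_delinquent[key] += weight * delinquent
    return {key: weighted_delinquent[key] / weighted_total[key]
            for key in weighted_total}


# Made-up respondents: (score, has_proxy, delinquent, survey weight)
records = [
    (600, True, 0, 1.0),
    (610, True, 1, 3.0),
    (605, False, 1, 2.0),
    (640, False, 0, 1.0),
]
rates = weighted_delinquency_rates(records)
print(rates[("Under 620", True)])  # 0.75
```

In this toy example the delinquent respondent carries three times the weight of the non-delinquent one, so the weighted rate is 75 percent rather than the unweighted 50 percent.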
Our regression-adjusted difference shows that people with the second proxy for positive cashflow are close to 40 percent less likely to experience serious delinquency in the two years following the survey. Though this estimate is large, it is not statistically significant, so we cannot rule out that there is no effect.14
Cashflow proxy 3: paying bills on time
Our third cashflow proxy, paying bills on time, considers people who reported no trouble with paying their rent, mortgage, utilities, and regular household expenses as having positive cashflow.15 We combine payment for these expenses into one variable because the sample is too small to study each one independently.
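A minimal sketch of how the four bill types could be collapsed into the single variable described above follows. The dictionary keys are hypothetical labels, and treating a missing answer as "no trouble" (for example, a renter with no mortgage) is an assumption of this sketch, not something the survey documentation confirms.

```python
# Hypothetical labels for the four expense categories named in the text.
BILL_TYPES = ("rent", "mortgage", "utilities", "regular_household_expenses")


def pays_bills_on_time(trouble_paying: dict) -> bool:
    """Proxy 3: True only if no trouble was reported for any bill type.

    Assumption: a bill type that is absent from the answers counts as
    no trouble reported.
    """
    return not any(trouble_paying.get(bill, False) for bill in BILL_TYPES)


print(pays_bills_on_time({"rent": False, "utilities": False}))
print(pays_bills_on_time({"rent": False, "utilities": True}))
```

Trouble with any one bill is enough to remove a respondent from the positive-cashflow group, which is why the combined variable is stricter than any single bill-payment question.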
In Table 3, we once again show the share of respondents reporting positive cashflow by wide credit score bands. People in higher credit score bins are more likely to report that they have no trouble paying these bills. Reporting no trouble paying bills is associated with a lower likelihood of serious delinquency mainly in the two middle credit score bins; the rates are nearly identical for people with credit scores under 620 and for those with scores of 720 and over.16
Table 3: Proxy 3: Paying Bills on Time
Credit scores | Number of survey respondents | Reported paying bills on time | Percent delinquent who reported paying bills on time | Percent delinquent who did not report paying bills on time |
---|---|---|---|---|
Under 620 | 284 | 47% | 37% | 38% |
Between 620 and 659 | 116 | 64% | 17% | 42% |
Between 660 and 719 | 257 | 77% | 7% | 12% |
720 and over | 1272 | 90% | 1% | 0% |
Note: The sample includes all respondents to the Making Ends Meet March 2019 survey who answered the survey questions that were used to construct the cashflow proxies. Not all respondents answered all questions, so sample sizes are slightly different across the cashflow proxies.
Our regression-adjusted difference shows that people who reported no trouble paying their bills are close to 20 percent less likely to experience serious delinquency in the two years following the survey. Given our small sample, we cannot rule out no effect.17
Our small sample of survey respondents (under 1,000 respondents with credit scores below 720) and other caveats limit what else we can learn from this analysis. However, researchers who can link checking account (and other cashflow) data with credit bureau records could have millions of consumer records at their disposal. This could allow for a variety of analyses, for example:
- Whether underwriting models solely based on cashflow data are more predictive than underwriting models solely based on credit reporting data;
- Whether underwriting models solely based on cashflow data are more equitable than underwriting models solely based on credit reporting data;
- Which cashflow variables and proxies matter more for prediction;
- Which people benefit more from cashflow data; and
- Whether the combination of credit reporting and cashflow data adds much in either predictiveness or equity over and above either the credit reporting or cashflow data in and of itself.
Endnotes
- Not all lines of credit are reported by lenders (for example, some auto lenders do not report to the credit bureaus) and some of the relevant information might not be reported (for example, some credit card issuers do not report actual monthly payments). See, e.g., https://www.consumerfinance.gov/about-us/blog/enhancing-public-data-on-auto-lending/ and https://www.consumerfinance.gov/about-us/blog/why-the-largest-credit-card-companies-are-suppressing-actual-payment-data-on-your-credit-report/.
- See, e.g., https://finreglab.org/wp-content/uploads/2019/07/FRL_Research-Report_Final.pdf. See also on including rental and/or utility payment information: https://www.fanniemae.com/newsroom/fannie-mae-news/fannie-mae-introduces-new-underwriting-innovation-help-more-renters-become-homeowners , https://freddiemac.gcs-web.com/news-releases/news-release-details/freddie-mac-announces-underwriting-innovation-help-lenders , https://www.urban.org/sites/default/files/2022-10/Reducing%20the%20Black-White%20Homeownership%20Gap%20through%20Underwriting%20Innovations.pdf (and references within) and https://finreglab.org/wp-content/uploads/2022/03/utility-telecommunications-and-rental-data-in-underwriting-credit_0.pdf. See https://www.formfree.com/news-and-insights/formfree-releases-residual-income-knowledge-index/ on including more cashflow data in mortgage underwriting in general. See https://www.urban.org/sites/default/files/2023-06/Using%20Mortgage%20Reserves%20to%20Advance%20Black%20Homeownership.pdf on the possibility of mortgage reserve accounts and showing that higher reserves are correlated with better performance in Fannie Mae and Freddie Mac data, see also https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3559794 showing the performance of the Hardest Hit Fund recipients. On the other hand, see, e.g., https://www.fico.com/blogs/icing-cake-how-fico-score-and-alternative-data-work-best-together arguing that while cashflow data is predictive, it might not add much more over and above credit scores.
- In this blog, we define serious delinquency as a 90+ day delinquency and use these terms interchangeably with default and being less likely to repay credit obligations.
- We are testing the effect of self-reported data on future performance, adjusting for credit scores. Performance may differ for cashflow measures based on checking and savings account data.
- See, e.g., https://journals.sagepub.com/doi/pdf/10.1177/1745691614551642 on designing future statistical studies given our suggestive, yet high-variance, results.
- See, e.g., https://www.nclc.org/wp-content/uploads/2022/09/Past_Imperfect.pdf , but also see https://onlinelibrary.wiley.com/doi/10.1111/j.1540-6229.2012.00348.x .
- See, e.g., https://www.nclc.org/resources/even-the-catch-22s-come-with-catch-22s-potential-harms-drawbacks-of-rent-reporting/ for potential issues of, for example, rent reporting.
- In order to exclude consumers with insufficient records in the CCP or inaccurate credit scores, we applied several filters, including: (a) consumer’s CCP records must contain both major credit scores at the time of the survey; (b) consumer must have had at least three open tradelines (records of unique debts in the CCP) at the time of the survey; and (c) consumer must have returned the Making Ends Meet survey and have answered either the gender or age demographic questions.
- This proxy might have a counterpart in checking and savings account data, which could be used to measure whether the minimum balance was at least $3,000 over the last few months, or whether the minimum balance was over three times the person’s average monthly spending over the last few months.
- Our proxy definitions are somewhat ad-hoc and are driven by the survey questions asked in the first wave (2019) of our Making Ends Meet survey. More continuous cashflow metrics and being able to experiment with various proxies and their combinations might add even more predictive power.
- Throughout this blog, we report results from a logistic regression with a binary outcome variable (whether the person was seriously delinquent in the two years following the survey), a binary treatment variable (whether the person self-reported the positive cashflow proxy), and linear and quadratic credit score terms as controls. We only use data from people with credit scores under 720. Unless we explicitly say otherwise, our regressions are weighted by the survey weights we used to make Making Ends Meet respondents representative of the general population with a credit record in the Consumer Credit Panel. Throughout the text we informally discuss 95 percent confidence intervals.
- Weights are intended to reduce bias due to nonresponse or differences in sampling probabilities, but weights may increase the variance of estimates. Our qualitative results are similar using unweighted data. The same regression with people weighted equally shows that people with the first positive proxy are 40 percent less likely to be seriously delinquent, and we cannot rule out effects as low as 6 percent less likely. See, e.g., https://projecteuclid.org/journals/statistical-science/volume-22/issue-2/Struggles-with-Survey-Weighting-and-Regression-Modeling/10.1214/088342306000000691.full , https://www150.statcan.gc.ca/n1/en/catalogue/12-001-X20050029046 , and https://pubmed.ncbi.nlm.nih.gov/32882755/ discussing various issues around and techniques for estimating weights.
- A counterpart in checking and savings account data may be whether the balance is trending up over time. We included the no overdraft clause as a consistency check on self-reported regular savings. Our results did not change substantially when the cashflow definition instead only relied on reported savings each month (ignoring any reported overdrafts).
- As with other proxies, our qualitative results are similar using unweighted data. The same regression with people weighted equally shows that people with the second positive proxy are 50 percent less likely to become seriously delinquent, and we cannot rule out effects as low as 20 percent less likely.
- This proxy might have a counterpart in checking and savings account data – whether people are paying similar amounts to the same providers around the same time. One might not classify this as strictly cashflow data. See, e.g., https://singlefamily.fanniemae.com/originating-underwriting/faqs-positive-rent-payment-history-desktop-underwriter on Fannie Mae identifying recurring rental payments and using that for underwriting.
- Table 3 further shows that people with credit scores between 620 and 659 who did not report paying bills on time are more likely to become seriously delinquent than people with credit scores under 620 who did not report paying bills on time, even though delinquency risk generally falls when credit scores increase. This anomaly is not present when we exclude sample weights, so we attribute the nonmonotonicity to the increased variance that results from these weights.
- As with other proxies, our qualitative results are similar using unweighted data. The same regression with people weighted equally shows that people with the third positive proxy are 40 percent less likely to become seriously delinquent, and we cannot rule out effects as low as 5 percent less likely.
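The endnotes mention possible counterparts of the self-reported proxies in checking and savings account data: a minimum balance threshold for proxy 1 and a balance that trends up over time for proxy 2. The sketch below shows one way such measures might be computed. The function names and inputs are hypothetical, the $3,000 threshold follows the proxy 1 definition in the main text, and using a least-squares slope to define "trending up" is an assumption of this sketch.

```python
def high_savings_from_balances(daily_balances, avg_monthly_spending):
    """Proxy 1 counterpart: the lowest recent balance clears either threshold."""
    lowest = min(daily_balances)
    return lowest >= 3000 or lowest > 3 * avg_monthly_spending


def balance_trending_up(month_end_balances):
    """Proxy 2 counterpart: positive least-squares slope across months."""
    n = len(month_end_balances)
    mean_month = (n - 1) / 2
    mean_balance = sum(month_end_balances) / n
    # The slope's sign equals the sign of this covariance-style sum.
    slope_numerator = sum(
        (i - mean_month) * (balance - mean_balance)
        for i, balance in enumerate(month_end_balances)
    )
    return slope_numerator > 0


print(high_savings_from_balances([3500, 3200, 4100], 2000))
print(balance_trending_up([100, 200, 300]))
```

Using the minimum rather than the average balance makes the first measure conservative: a single low point in the window is enough to fail the dollar threshold.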