Executive Vice President & General Counsel Fair, Isaac and Company, Inc.
During the 1970s and 1980s, credit scoring and automated underwriting became widely accepted for most forms of consumer lending, other than mortgages. Mortgage lenders began using credit scoring much later, starting around 1995. Lenders have widely accepted scoring technology because it allows for expanded lending while maintaining or even reducing loss rates. During the years that credit scoring technology was being developed, there were few, if any, serious concerns on the part of regulators or consumer activists that scoring might somehow restrict access to credit for any significant subset of the population. However, during the past four or five years, such concerns have been raised more and more frequently.
Most regulators and consumer activists accept the claims of lenders and scoring-system developers that credit scoring provides an effective and cost-efficient decision tool for the general population of borrowers. But, when it comes to traditionally underserved segments of the population, they may become very skeptical. Most of these concerns can be grouped into a few broad categories:
How can a statistically based system deal with segments of the population that are unrepresented or underrepresented in the historical data?
This is a reasonable question, but it is premised on a hidden assumption. The assumption is that when underrepresented groups seek mainstream credit, the factors that predict good and bad performance will be different for them than what has proved predictive for past borrowers. Clearly, there are some differences in what is predictive for various subpopulations. However, more than 40 years of experience in developing credit scoring systems for lenders in 60 countries have demonstrated that the similarities in what is predictive of credit performance outweigh the differences. The same question can be applied to individual applicants: "If an applicant has little or no mainstream credit history, how can a scoring system evaluate such an applicant?" Again, the question has a hidden premise that satisfactory performance with nontraditional obligations will predict satisfactory performance with traditional credit obligations. Since there is little, if any, systematic collection of nontraditional credit histories, no one really knows whether that premise is correct.
Credit bureau-based scoring systems require a minimum amount of reported credit history in order to produce a score. An "unable to score" code should trigger a judgmental evaluation, but that may not always happen. Bureau scoring systems also may employ separate scorecards for "thin file" populations, and special application scorecards have been developed for "no hit" populations—those with no credit bureau history.
Don't inaccuracies in credit bureau data result in inaccurate scores?
Of course inaccurate data will cause inaccurate scores, but inaccurate data also affect judgmental credit decisions. However, the current use of scoring in mortgage lending does produce some real differences. For example, prior to the use of credit scores in mortgage origination, when an applicant disputed information in the credit report, the underwriter could choose to disregard that information. Alternatively, the provider of the merged credit report usually used in mortgage lending might have been willing to change the data in that report, even though the credit repositories had not made a corresponding change.
Now that the credit bureau-based score is the primary tool for evaluating the credit history of mortgage applicants, the score will not change unless and until the data in the underlying repository report are changed. The major secondary market lenders—principally Fannie Mae and Freddie Mac—as well as scoring developers have advised originators that they can and should ignore scores based on inaccurate data. However, some underwriters may not make the effort needed to document such cases to satisfy a potential investor.
Aren't there inequities in overrides, quality of assistance and so on?
Even in a situation where a scoring system encompasses substantially all of the available information and can account for most of the final decisions, there is still room for human intervention. An override occurs when the final decision is contrary to that indicated by the scoring system. Scoring developers would argue that overrides are not a scoring problem but rather a problem caused by ignoring the scoring system. The September 1999 complaint and consent decree by the U.S. Department of Justice against Deposit Guaranty National Bank supports the argument of scoring developers that overrides (that is, judgmental decisions) may be more vulnerable to discrimination claims than decisions that follow the scoring system.
Similarly, there have been many claims that the "quality of assistance" offered to minority borrowers is systematically inferior to the assistance offered to white borrowers. While substantively that issue is no different in a scored environment than in a judgmental environment, the scoring system, nevertheless, may be perceived as the culprit by rejected minority borrowers.
Don't scoring systems reject many applicants who would have performed well and accept many who go delinquent?
The short answer to the question is, "Yes." But the question should be whether credit scoring or human judgment does a better job of accepting "good" borrowers and turning away those who would, if accepted, eventually perform badly. Here the evidence is clear: The use of scoring consistently produces 20 to 30 percent improvements (either in reduced delinquency rates or increased acceptance rates) compared with judgmental evaluation. In addition, the available data suggest that similar or even greater improvements can be obtained by applying scoring to traditionally underserved segments of the population.
Doesn't scoring result in higher reject rates for certain minorities than for whites?
Again, the short answer is, "Yes," but it is the wrong question. The question ought to be: "Does credit scoring produce an accurate assessment of credit risk regardless of race, national origin, etc.?" Studies conducted by Fair, Isaac and Company, Inc. (discussed in more detail below) strongly suggest that scoring is both fair and effective in assessing the credit risk of lower-income and/or minority applicants.
Unfortunately, income, property, education and employment are not distributed equally by race/national origin in the United States. Since all of these factors influence a borrower's ability to meet financial obligations, it is unreasonable to expect an objective assessment of credit risk to result in equal acceptance and rejection rates across socioeconomic or race/national origin lines. By definition, low-income borrowers are economically disadvantaged, so one would not expect their score distributions to mirror those of higher-income borrowers.
Is Scoring "Fair" to Minority and Low-Income Borrowers?
Since scoring systems are designed to provide the most accurate possible assessment of credit risk—regardless of race, national origin and so on—they will never satisfy critics who believe "fair" means the elimination of all discrepancies in both acceptance and rejection rates. If, however, fair is defined as "assesses credit risk consistently regardless of race, national origin or income" then the available data strongly suggest that credit scoring systems are fair when applied to these borrowers. Two research studies conducted by Fair, Isaac and Company, Inc. early in 1996 support this finding.
The first study used data from more than 20 credit portfolios to look at score distributions and differences in characteristics between low- and moderate-income ("LMI") applicants and the general population. This study (hereinafter, the "LMI study") also compared the acceptance and default rates for LMI segments that resulted from actual judgmental underwriting on eight of these portfolios with the results that could have been obtained using scoring.
Not surprisingly, the score distribution of the LMI segment was lower than that of the general population. Thus, at any given cutoff score, the LMI population would have a lower acceptance rate. However, the score-to-odds relationships¹ of the LMI and general populations were virtually identical (especially in the range where most cutoff scores would be set). To the extent there were any differences in the score-to-odds relationships, those discrepancies consistently favored the LMI applicants. That is, at any given score, the risk for LMI applicants is the same as, or slightly greater than, the risk for other applicants.
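The difference between a score distribution and a score-to-odds relationship can be illustrated with a small Python sketch. The data are invented purely for illustration and are not drawn from the Fair, Isaac studies. The function names (`bad_rate_by_band`, `acceptance_rate`, `make_group`), the 20-point band width and the cutoff of 680 are likewise assumptions. The two hypothetical groups have the same bad rate at each score but different shares of applicants above the cutoff.

```python
from collections import defaultdict


def bad_rate_by_band(records, band_width=20):
    """Return {band_floor: bad_rate} for (score, went_bad) records."""
    totals = defaultdict(int)
    bads = defaultdict(int)
    for score, went_bad in records:
        band = (score // band_width) * band_width  # e.g. 660..679 -> 660
        totals[band] += 1
        bads[band] += int(went_bad)
    return {band: bads[band] / totals[band] for band in sorted(totals)}


def acceptance_rate(records, cutoff):
    """Share of applicants whose score is at or above the cutoff."""
    return sum(1 for score, _ in records if score >= cutoff) / len(records)


def make_group(n_700, n_660):
    """Hypothetical group: 1% bad at score 700, 3% bad at score 660."""
    bad_700, bad_660 = n_700 // 100, 3 * n_660 // 100
    return (
        [(700, True)] * bad_700 + [(700, False)] * (n_700 - bad_700)
        + [(660, True)] * bad_660 + [(660, False)] * (n_660 - bad_660)
    )


general = make_group(n_700=300, n_660=100)  # invented, skewed to higher scores
lmi = make_group(n_700=100, n_660=300)      # invented, skewed to lower scores

print(bad_rate_by_band(general))
print(bad_rate_by_band(lmi))
print(acceptance_rate(general, cutoff=680), acceptance_rate(lmi, cutoff=680))
```

In this invented example both groups show a bad rate of 1 percent in the 700 band and 3 percent in the 660 band, yet 75 percent of the first group and 25 percent of the second clear the cutoff.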
The second half of the LMI study produced some very interesting results. For the eight different portfolios, we compared acceptance and delinquency rates for LMI borrowers that had resulted from judgmental underwriting with the results that would have been obtained if credit scoring had been used to evaluate the same applicants. In every case, scoring could have produced a significant increase in the acceptance rate for LMI applicants if the bad rate were held constant, or a significant decrease in the bad rate if the acceptance rate were held constant.
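The kind of comparison described here, holding the bad rate constant and asking how many applicants could be accepted, might be sketched in Python as follows. Every number is invented for illustration and none comes from the LMI study. The helper names (`accepted_stats`, `lowest_cutoff_within`) are hypothetical, and the sketch assumes that eventual performance is known for every applicant, including those the judgmental process rejected, which a real portfolio would not directly observe.

```python
def accepted_stats(applicants, accept):
    """Return (acceptance_rate, bad_rate) among applicants where accept(a) is true."""
    accepted = [a for a in applicants if accept(a)]
    acceptance_rate = len(accepted) / len(applicants)
    bad_rate = sum(a["bad"] for a in accepted) / len(accepted) if accepted else 0.0
    return acceptance_rate, bad_rate


def lowest_cutoff_within(applicants, max_bad_rate):
    """Lowest score cutoff whose accepted pool has bad rate <= max_bad_rate."""
    for cutoff in sorted({a["score"] for a in applicants}):
        rate, bad = accepted_stats(applicants, lambda a: a["score"] >= cutoff)
        if bad <= max_bad_rate:
            return cutoff, rate, bad
    return None


# Invented applicants: (score, turned out bad, accepted by judgmental review).
rows = [
    (600, 1, 1), (620, 1, 0), (640, 0, 0), (660, 1, 0), (680, 0, 0),
    (700, 0, 1), (720, 0, 1), (740, 0, 1), (760, 0, 0), (780, 0, 1),
]
applicants = [{"score": s, "bad": b, "judgmental": j} for s, b, j in rows]

# Judgmental outcome, then the most permissive cutoff that does no worse on bad rate.
judg_accept, judg_bad = accepted_stats(applicants, lambda a: a["judgmental"] == 1)
cutoff, score_accept, score_bad = lowest_cutoff_within(applicants, judg_bad)

print(judg_accept, judg_bad)
print(cutoff, score_accept, score_bad)
```

With these made-up applicants, judgmental review accepts half of the pool at a 20 percent bad rate, while a cutoff of 640 accepts 80 percent of the pool at a 12.5 percent bad rate. The sketch shows only the mechanics of the comparison, not its likely magnitude.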
The second study (hereinafter, the "HMA study") compared credit bureau scores and characteristics of consumers living in ZIP codes with high concentrations of blacks and Hispanics (the "HMA ZIP codes") against those of consumers living in other ZIP codes. ZIP code was used as a surrogate for race/national origin simply because direct race/national origin information was not available. The average household income (as indicated by census data) in HMA ZIP codes was only about two-thirds that for the non-HMA ZIP codes. Once again, while the score distribution for the HMA ZIP codes was lower than for the non-HMA ZIP codes, the score-to-odds relationships were very similar across populations. As in the LMI study, what discrepancies did exist in the score-to-odds relationships consistently favored the HMA population: At any given score, HMA borrowers present the same risk as, or greater risk than, non-HMA borrowers receiving the same score.
In short, these studies indicate that scoring is both fair and effective when applied to LMI and minority populations. These findings are consistent with results reported by others, including Fannie Mae and Freddie Mac (where direct race/national origin information is available from HMDA data). Moreover, the LMI study indicates that scoring can produce substantial improvements in the quality of decisions when compared with judgmental underwriting.
Despite guidance from secondary market investors and scoring developers, at least some mortgage lenders are overly reliant on credit scores. The scores most often used in mortgage lending are generic bureau-based scores that consider only credit history information, and were not designed specifically to assess mortgage risk. Ignoring other relevant information in the mortgage decision process is not in the best interests of either borrowers or lenders. And, in cases where the lender is satisfied that inaccuracies exist in the underlying credit information on which the score is based, it is irrational to continue to rely on the score. But, there is evidence that many lenders do not make the effort to manually review and document these cases.
These problems may be exacerbated if overrides and assistance also are not dispensed evenly; higher-income white borrowers may be approved despite marginal credit scores, while low-income and minority borrowers with similar scores are turned away. Such practices would better be described as the misuse of scoring, but the rejected applicant is still left with the perception that the credit scoring system is unfair.
The response from Fair, Isaac and Company, Inc. made reference to specific studies that supported its claim that minorities were not unfairly disadvantaged by credit scoring systems. Since Fair, Isaac is asserting that its research is sound in a statistical and social science context, one needs to assess whether its studies measure up to these standards.
For example, in the above-referenced LMI study, we are told only that the data are from several unnamed lenders for some unnamed type of installment loans from 1992 to 1994. Are these mortgage loans, auto loans, personal loans, home equity loans or student loans? Different loan types attract different types of applicants. The study reviews characteristics taken from credit applications and credit bureau information, but it provides no definitions of any of these characteristics. We are not told if all the lenders used compatible application forms with common definitions for each characteristic. We are provided with tables (in the referenced LMI study) that indicate which applicant and credit bureau characteristics made "large differences," "moderate differences" and "negligible differences." We are given numbers, but we do not know if these numbers are from tests of significance, differences in raw percentages or some other collection of measures.
The comparison of the outcomes for the judgmental and credit scoring system was actually done in a separate study based on data from lenders seeking to replace their judgmental system. This is a clearly biased sample. Were these judgmental systems among the most subjective and least structured in the industry? The indication is that the lenders already saw them as failures.
The above-referenced HMA study of minority differences was based on ZIP codes, where all residents of the ZIP code were treated as either minority or not. Yet, the minority composition of the ZIP codes ranged from 40 percent to 90 percent, with the report data based on ZIP codes that were more than 70 percent black and Hispanic. We are not told what percent of all minorities live in such ZIP codes. Such a grouping is not specific with respect to the race of individuals. Only large segregated minority populations would be included in such definitions. This is likely to exclude the majority of Hispanics and most higher-income minorities. We are not told the time period for the data in this study. The markets are constantly changing. Subprime lending, which was seen in these studies as related to personal finance companies, now relates to a large and rapidly growing industry of subprime lenders providing everything from home purchase loans to auto title loans. Therefore, one historical study is not adequate, even if it was sound at the time.
Fair, Isaac's response emphasizes the need for a broad range of studies by researchers from different perspectives and disciplines. Until this happens, the Fair, Isaac claims of a neutral, or even favorable, treatment of minorities should be treated with skepticism. Fair, Isaac, like Freddie Mac, needs to seek out a broader range of perspectives for its own reviews. The true test for credit scoring, however, will lie in the continuing review of many different systems by many different researchers.
This concludes the first installment of Perspectives on Credit Scoring and Fair Lending: A Five-Installment Series. The Federal Reserve System's Mortgage Credit Partnership Credit Scoring Committee would like to thank the respondents for their participation. The next article will explore the interrelated issues of lending policy, credit scoring model development and model maintenance.
1 Editor's Note: The term score-to-odds relationship refers to the relationship between any given credit score and the degree to which applicants with that score are likely to exhibit the risk that the scoring system is designed to predict. For example, in a system designed to predict the likelihood (or "odds") that an applicant will default on a loan within two years, a score of 700 might relate to or predict a 1 percent likelihood of default, while a score of 660 might relate to a 3 percent likelihood of default. In such an example, the default risk odds would be 1 in 100 for a score of 700 and 3 in 100 for a score of 660.
Federal Reserve Bank of Cleveland