By Melissa Koide
Introduction
Artificial intelligence and machine learning analyses are driving critical decisions that affect our lives and the economic structure of our society. These complex analytical techniques—powered by sophisticated math, computational power, and often vast amounts of data—are deployed in a variety of high-stakes applications, from making healthcare decisions to evaluating job applications to informing parole and probation decisions to determining eligibility and pricing for insurance and other financial services.
The risk that these algorithms will make unreliable, unfair, or exclusionary predictions is a foundational concern for a range of highly sensitive use cases. It also raises core questions about whether we can sufficiently understand and manage these models in both the immediate and longer term. Yet artificial intelligence (AI) and machine learning (ML), if carefully overseen and deployed with representative data, also have the potential to improve accuracy and fairness by identifying data relationships that current models cannot detect. Whether AI and ML techniques realize those benefits and mitigate the risks depends on how they are chosen, deployed, governed, and regulated.
Financial services is an important case study, both because of the role that credit and other financial services play in wealth creation and economic mobility and because the sector already has relatively robust regulatory and governance frameworks for managing model fairness, reliability, and transparency. Nevertheless, important questions remain about whether those frameworks need to evolve to address the potential benefits and risks of AI and ML adoption. Answering these questions could be instructive for other high-sensitivity AI and ML applications.
FinRegLab has been conducting empirical research and creating platforms for stakeholder dialogue on critical questions about the adoption of AI and ML techniques and new data sources for credit underwriting. In partnership with Professors Laura Blattner and Jann Spiess at the Stanford Graduate School of Business, we are currently assessing the performance of diagnostic tools for analyzing and managing machine learning underwriting models to satisfy reliability, fairness, and transparency objectives. As explored in this paper, our findings to date suggest that these technologies hold promise: for particular tasks, automated approaches performed better than traditional methods of managing fairness considerations. However, the results overall underscore that thoughtful human oversight, at both the firm and regulator levels, is even more critical in managing complex models than it was for prior generations of predictive algorithms.
Our findings lead us to call for stakeholders to engage in a dialogue centered on three core issues: the consumer experience, fairness and inclusion, and model risk management. These conversations should help advance how public policy and market practice leverage the accuracy and fairness benefits of machine learning techniques while deploying the technology in ways that are sufficiently transparent. Research and stakeholder dialogue will inform a roadmap for evolving market practices and public policy toward an era of more inclusive and fair credit underwriting.
Download the full policy brief here.