The Problem With Amazon’s AI Recruiter

Resume-Reader Was Biased Against Women. Here's Why.

Oct 31, 2018

SMITH BRAIN TRUST – Amazon last year abandoned an AI recruiting tool that it had built, after the company’s machine-learning experts discovered the tool was discriminating against female job applicants.

The e-commerce giant was wise to do so, says P.K. Kannan, the Dean’s Chair in Marketing Science at the University of Maryland’s Robert H. Smith School of Business. His research into artificial intelligence and machine learning has underscored the dangers of blindly relying on algorithms to make important decisions. “Small errors,” he explains, “like a flaw in the attribution scheme, can have a major impact. With machine learning, if an algorithm is used day in and day out, the mistake builds up.”

According to news reports, the company’s computer models had been programmed to judge applicants based on patterns in the resumes that had been submitted to the company over a 10-year period. However, the vast majority of applicants over the period had been men, a fact that mirrors the demographic makeup of the tech industry as a whole. Words associated with women, such as “women’s basketball,” would cause the model to discard the candidate.

Kannan, who was recently named one of six UMD Distinguished Scholar-Teacher award recipients for the 2018-19 academic year, focuses much of his research on digital marketing, specifically on modeling that applies statistical and econometric methods to online and offline marketing data. His current research stream focuses on freemium models, attribution modeling, media mix modeling, new product/service development and customer relationship management.

We asked him to share some of his insights. 

Q: Do you believe that Amazon can correct the failings of its earlier tool? And, given that errors build upon themselves, what must Amazon do to guard against future failures?

A: Amazon’s problem is with the data and not necessarily with the algorithms themselves. Machine learning and AI tools are very good at learning from past data and using it to develop rules for winnowing through the applications and selecting the right candidates. But when the past data is based on human selection criteria and a smaller pool of women candidates, there could be problems.

For example, when the number of women applicants in the past data is small, it is difficult for these algorithms to discriminate between acceptable and unacceptable women candidates. Given this and the tolerance levels set for Type 1 and Type 2 errors, the algorithms may end up tagging more men candidates as acceptable.
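The sample-size point above can be illustrated with a minimal sketch. It assumes the model is estimating an acceptance rate for each group from past resumes; the counts and the rate are hypothetical, not Amazon's figures.

```python
import math

def standard_error(p: float, n: int) -> float:
    """Standard error of a proportion p estimated from n independent samples."""
    return math.sqrt(p * (1 - p) / n)

# Hypothetical numbers for illustration only.
acceptance_rate = 0.2
n_men, n_women = 9000, 1000

se_men = standard_error(acceptance_rate, n_men)
se_women = standard_error(acceptance_rate, n_women)

print(f"uncertainty for men:   +/- {se_men:.4f}")
print(f"uncertainty for women: +/- {se_women:.4f}")
```

With nine times fewer records, the estimate for women is three times less precise, so a fixed error tolerance is harder to meet for that group.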

Also, human biases tainting the past data may play a role in compounding the error. If, for example, past recruiters deliberately downplayed the importance of some criteria to avoid selecting women candidates, these biases will persist in the AI-based recommendations. And if the biased output of the AI is then included in the data for the next cycle of selection, such errors can compound.
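A toy simulation can show how such a loop might compound. This is a sketch under strong assumptions: the function name, the `penalty` factor (how much the model under-selects women relative to their share in its training data), and the batch sizes are all hypothetical.

```python
def simulate_feedback(share: float, penalty: float, cycles: int,
                      history_size: float = 1000.0,
                      batch: float = 1000.0) -> list[float]:
    """Track the share of women in the training data across selection cycles.

    Each cycle, the model selects women at `penalty` times their current
    share in the training data, and the selections are added back to it.
    """
    women = share * history_size
    total = history_size
    shares = [share]
    for _ in range(cycles):
        selected_share = penalty * (women / total)
        women += selected_share * batch
        total += batch
        shares.append(women / total)
    return shares

shares = simulate_feedback(share=0.3, penalty=0.8, cycles=5)
for cycle, value in enumerate(shares):
    print(f"cycle {cycle}: share of women in training data = {value:.3f}")
```

Under these assumptions the share falls every cycle, because each round of biased selections dilutes the data the next round learns from.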

These failings can be corrected with some effort. Similar to goal programming, where some criteria are optimized subject to attaining some specific goals, recruiting tools can set goals for the percentage of minority or underrepresented candidates to be retained in the selected pool. Monitoring the algorithms to check whether they work in the intended way would also help.
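The goal-based idea can be sketched as a selection that maximizes score subject to a minimum number of underrepresented candidates in the final pool. This is an illustrative simplification, not a full goal-programming formulation; the `Candidate` type, the scores, and the function name are hypothetical.

```python
from typing import NamedTuple

class Candidate(NamedTuple):
    name: str
    score: float
    underrepresented: bool

def select_with_goal(candidates, k, min_underrepresented):
    """Pick k candidates by score, keeping at least the goal number
    of underrepresented candidates when enough of them are available."""
    goal = min(min_underrepresented, k)
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    # Reserve the top-scoring underrepresented candidates first.
    reserved = [c for c in ranked if c.underrepresented][:goal]
    reserved_names = {c.name for c in reserved}
    # Fill the remaining slots purely by score.
    rest = [c for c in ranked if c.name not in reserved_names]
    chosen = reserved + rest[: k - len(reserved)]
    return sorted(chosen, key=lambda c: c.score, reverse=True)

pool = [
    Candidate("A", 0.9, False),
    Candidate("B", 0.8, False),
    Candidate("C", 0.7, False),
    Candidate("D", 0.6, True),
    Candidate("E", 0.5, True),
]
selected = select_with_goal(pool, k=3, min_underrepresented=1)
print([c.name for c in selected])
```

Here the third slot goes to the best underrepresented candidate rather than the third-highest score overall, which is the trade-off the goal makes explicit.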

Q: Machine-learning resume-screening tools have been found to be problematic for other firms as well. Is resume-sorting a compatible task for artificial intelligence? If so, how long until AI resume-screening becomes the norm?

A: As I indicated above, the problem is with the training data. If the training data is clean and unbiased, then these algorithms would work most of the time. Collecting a large volume of training data may take some time, but if the process of selecting the candidates who are part of the training data is faulty, you will have the same problem all over again.

Researchers are working on techniques to de-bias and give more weight to certain criteria to make the training dataset more unbiased. The important thing to note here is that we assume that the algorithm creators are aware of the biases they have. Biases that originate subconsciously cannot be de-biased. Therein lies the problem with bias in AI-based recommendations. 

While the algorithms can be used, the solutions they provide need to be monitored regularly to ensure that there are no major biases.
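One simple form such monitoring could take is comparing selection rates across groups after each screening round. The sketch below assumes decisions are logged as (group, selected) pairs; the data, the function names, and the 0.8 alert threshold are assumptions for illustration.

```python
from collections import Counter

def selection_rates(decisions):
    """Return the fraction selected per group from (group, selected) pairs."""
    applied = Counter()
    selected = Counter()
    for group, was_selected in decisions:
        applied[group] += 1
        if was_selected:
            selected[group] += 1
    return {group: selected[group] / applied[group] for group in applied}

def rate_ratio(rates):
    """Ratio of the lowest to the highest group selection rate (1.0 = parity)."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest > 0 else 1.0

# Hypothetical log of one screening round.
decisions = (
    [("men", True)] * 30 + [("men", False)] * 70
    + [("women", True)] * 15 + [("women", False)] * 85
)
rates = selection_rates(decisions)
ratio = rate_ratio(rates)

ALERT_THRESHOLD = 0.8  # assumed cutoff; choose per policy
if ratio < ALERT_THRESHOLD:
    print(f"Review needed: selection-rate ratio is {ratio:.2f}")
```

In this made-up round, women are selected at about half the rate of men, so the check would flag the model for review.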




About the Expert(s)


P. K. Kannan is the Dean’s Chair in Marketing Science at the Robert H. Smith School of Business at the University of Maryland. In January 2021 he was appointed associate dean for strategic initiatives. His research expertise is on marketing modeling, applying statistical, econometric, machine learning, and AI methods to marketing data. His current research stream focuses on digital marketing - mobile marketing, attribution modeling, media mix modeling, new product/service development and customer relationship management (CRM).
