Newsletter Subject

Cracking AI's black box

From

bloombergbusiness.com

Email Address

noreply@mail.bloombergbusiness.com

Sent On

Fri, Mar 8, 2024 12:11 PM

Email Preheader Text

Testing how OpenAI's GPT ranks resumes for jobs. Hi, it’s Davey and Leon in New York. The techn

Desktop View
HTML
Text
Mobile View

Go Premium to Unlock

Subscribe Now

Testing how OpenAI's GPT ranks resumes for jobs. [View in browser]( [Bloomberg]( Hi, itâs Davey and Leon in New York. The technology behind ChatGPT shows signs of racial bias when ranking resumes, a Bloomberg analysis found. But first... Three things you need to know today: â¢ Discord will offer [rewards for gamers]( as CEO considers an IPO â¢ Apple will [face questions from EU regulators]( over its treatment of Epic Games â¢ The FCC is conducting a [âthoroughâ investigation]( into AT&T outage Using AI to rank resumes For years, companies have relied on automated screening systems to help hiring managers vet job candidates. Thatâs sparked numerous blog posts telling candidates how to use keyword tricks to getÂ their cover letters and resumes past these application trackers. But in the wake of ChatGPTâs viral success, a new set of AI systems have been added into the hiring process âÂ withÂ potential risks for job seekers. As we reported for our [investigationÂ into artificial intelligenceÂ and hiring](, thereâs now a cottage industry of [services]( using AI chatbots to [interview]( and [screen]( potential candidates. LinkedIn, arguably the most popular platform for job seekers and professionals, recently introduced a new AI chatbot that recruiters can prompt in order to âfind that short list of qualified candidates â faster,â the company [said in a blog post](. Last April, HR tech company SeekOut [launched a recruiting tool]( called SeekOut Assist, whichÂ uses generative AI to create a ranked list of candidates for a particular job description. These services might help companies sift through a glut of candidates, but thereâs a significant concern: bias. Generative AI has a well-documented history of perpetuating stereotypes that are baked into the online data used to train it. Ask an AIÂ image generator to show you a CEO, for example, and it may be more likely toÂ assume itâs a man.Â Understanding how those biases playÂ out in a high-stakes scenario like hiring, however, is a challenge.Â Job applicants donât have the means to audit an employerâs hiring processes themselves and AI systems are typically black boxes, even to those who build them.Â To crack open the black box, we designed an experiment that could systematically test for potential embedded bias in one of these AI models in a hiring context. We decided to focus on names in resumes, which have proven time and again to [carry weight]( [in the hiring process](. We also chose to focus on OpenAIâs GPT, the best-known generative AI tool and a technology thatÂ has already been used by services like SeekOut Assist.Â WeÂ replicated a simplified version of a hiring workflow by feeding a set of eight fictitious resumes into GPT â keeping all the qualifications equal, and only changing the fictitious names topping the resumes. Then we asked the tool to tell us who the best candidate was for a particular job listing. We asked GPT to evaluate these would-be candidates against four real job listings from Fortune 500 companies: an [HR business partner](, a [senior software engineer](, a [retail manager]( and a [financial analyst](. We did thisÂ 1,000 times each for GPT-3.5 and GPT-4, using 800 demographically-distinct names: 100 names each for males and females that are associated with being either Black, White, Hispanic or Asian. Each fictitious resume representing a specific demographic had an equal chance of being rankedÂ as the top candidate each time we asked GPT for the best person for a particular job. Crucially, we had to ensure that each name was more likely than not to map onto a specific intersection of race and gender. Borrowing methods from [well-established]( [social science research](, we computed the 100 most popular first names and 20 most distinct last names unique to each demographic from North Carolina voter registrations and the US decennial census, respectively. From there, the first and last names were randomly paired up for each group, resulting in 800 demographically-distinct names. Researchers studying discrimination and bias frequently use racially distinctive names to signal race or ethnicity, according toÂ Evan Rose, one of the authors of a [National Bureau of Economic Research study on hiring discrimination](, which we drew inspiration from. This method has for years found [consistent evidence]( of racial discrimination in the US labor and housing markets, Rose told us.Â âThatâs why names are really useful as a signal,â he said. In our own experiment, we foundÂ clear signs of name-based discrimination. When asked to rank the resumes 1,000 times, GPT 3.5Â favored names from some demographics more often than others, to an extent that would fail benchmarks used to assess job discrimination against protected groups.Â We found at least one adversely impacted demographic group for every job listing we tested, except for retail workers ranked by GPT-4. OpenAI, for its part, said the results of using GPT out-of-the-box may not be reflective of how its customers use the tools.Â Businesses using the technology have the option toÂ [fine-tune the AI modelâs responses further](, for example.Â AI companies have generally tried to find ways to reduce bias in their systems, and [OpenAI is]( [no different](.Â But the problem persists âÂ andÂ in the meantime, generative AI only continues to find its way into more and more vital gatekeeping services.Â âÂ [Davey Alba](mailto:malba13@bloomberg.net) and [Leon Yin](mailto:lyin72@bloomberg.net) The big story Microsoft has created aÂ technologically sophisticated censorship systemÂ in China, centered on an expanding blacklist of thousands of websites, words and phrases, according to interviews withÂ more than a dozen current and former employees. One to watch Bloomberg News goes under the hood of Project Titan to reveal what the company was planning and explain why it ultimately fell apart. Bloomberg Get fully charged A Salesforce exec says [diversity is key]( to building AI models. Apple sank about [$1 billion a year]( into a car it never built.Â Palantir has [added]([General Mills and CBS]( as customers for its AIÂ tools. More from Bloomberg Get Bloomberg Tech weeklies in your inbox: - [Cyber Bulletin]( for coverage of the shadow world of hackers and cyber-espionage - [Game On]( for reporting on the video game business - [Power On]( for Apple scoops, consumer tech news and more - [Screentime]( for a front-row seat to the collision of Hollywood and Silicon Valley - [Soundbite]( for reporting on podcasting, the music industry and audio trends - [Q&AI]( for answers to all your questions about AI Follow Us Like getting this newsletter? [Subscribe to Bloomberg.com]( for unlimited access to trusted, data-driven journalism and subscriber-only insights. Want to sponsor this newsletter?Â [Get in touch here](. You received this message because you are subscribed to Bloomberg's Tech Daily newsletter. If a friend forwarded you this message, [sign up here]( to get it in your inbox. [Unsubscribe]( [Bloomberg.com]( [Contact Us]( Bloomberg L.P. 731 Lexington Avenue, New York, NY 10022 [Ads Powered By Liveintent]( [Ad Choices](

Edit & Download HTML

Add To Favourites

EDM Keywords (116)

year would way wake used treatment train touch tool time thousands technology systems subscriber subscribed sponsor signal show set screentime said reveal resumes results responses reporting reported replicated relied reflective recruiters received ranked rank race questions prompt podcasting planning others order option openai one often need names name method message means may males likely leon key investigation interviews interview hood hollywood hiring hackers gpt glut get gamers found focus first find females feeding fcc extent explain experiment example evaluate ensure employer different designed demographics demographic decided davey customers created create coverage continues conducting computed company collision chatgpt changing ceo cbs car candidates build bloomberg baked authors audit assume associated asked ask asian answers already added 20 100

bloombergbusiness.com

Bloomberg Technology

Follow domain to get weekly email update

Marketing emails from bloombergbusiness.com

Sent On

20/07/2024

Sent On

19/07/2024

Sent On

19/07/2024

Sent On

19/07/2024

Sent On

19/07/2024

Sent On

18/07/2024

Email Content Statistics

Subscribe Now

Subject Line Length

Data shows that subject lines with 6 to 10 words generated 21 percent higher open rate.

Subscribe Now

Average in this category

Subscribe Now

Number of Words

The more words in the content, the more time the user will need to spend reading. Get straight to the point with catchy short phrases and interesting photos and graphics.

Subscribe Now

Average in this category

Subscribe Now

Number of Images

More images or large images might cause the email to load slower. Aim for a balance of words and images.

Subscribe Now

Average in this category

Subscribe Now

Time to Read

Longer reading time requires more attention and patience from users. Aim for short phrases and catchy keywords.

Subscribe Now

Average in this category

Subscribe Now

Predicted open rate

Subscribe Now

Spam Score

Spam score is determined by a large number of checks performed on the content of the email. For the best delivery results, it is advised to lower your spam score as much as possible.

Subscribe Now

Flesch reading score

Flesch reading score measures how complex a text is. The lower the score, the more difficult the text is to read. The Flesch readability score uses the average length of your sentences (measured by the number of words) and the average number of syllables per word in an equation to calculate the reading ease. Text with a very high Flesch reading ease score (about 100) is straightforward and easy to read, with short sentences and no words of more than two syllables. Usually, a reading ease score of 60-70 is considered acceptable/normal for web copy.

Subscribe Now

Technologies

What powers this email? Every email we receive is parsed to determine the sending ESP and any additional email technologies used.

Subscribe Now

Email Size (not include images)

No.	Font Name
Subscribe Now

Cracking AI's black box

Email Preheader Text

EDM Keywords (116)

bloombergbusiness.com

Marketing emails from bloombergbusiness.com

Email Content Statistics

Font Used