Newsletter Subject

Cracking AI's black box

From

bloombergbusiness.com

Email Address

noreply@mail.bloombergbusiness.com

Sent On

Fri, Mar 8, 2024 12:11 PM

Email Preheader Text

Testing how OpenAI's GPT ranks resumes for jobs. Hi, it’s Davey and Leon in New York. The techn

Testing how OpenAI's GPT ranks resumes for jobs. [View in browser]( [Bloomberg]( Hi, it’s Davey and Leon in New York. The technology behind ChatGPT shows signs of racial bias when ranking resumes, a Bloomberg analysis found. But first... Three things you need to know today: • Discord will offer [rewards for gamers]( as CEO considers an IPO • Apple will [face questions from EU regulators]( over its treatment of Epic Games • The FCC is conducting a [“thorough” investigation]( into AT&T outage Using AI to rank resumes For years, companies have relied on automated screening systems to help hiring managers vet job candidates. That’s sparked numerous blog posts telling candidates how to use keyword tricks to get their cover letters and resumes past these application trackers. But in the wake of ChatGPT’s viral success, a new set of AI systems have been added into the hiring process — with potential risks for job seekers. As we reported for our [investigation into artificial intelligence and hiring](, there’s now a cottage industry of [services]( using AI chatbots to [interview]( and [screen]( potential candidates. LinkedIn, arguably the most popular platform for job seekers and professionals, recently introduced a new AI chatbot that recruiters can prompt in order to “find that short list of qualified candidates — faster,” the company [said in a blog post](. Last April, HR tech company SeekOut [launched a recruiting tool]( called SeekOut Assist, which uses generative AI to create a ranked list of candidates for a particular job description. These services might help companies sift through a glut of candidates, but there’s a significant concern: bias. Generative AI has a well-documented history of perpetuating stereotypes that are baked into the online data used to train it. Ask an AI image generator to show you a CEO, for example, and it may be more likely to assume it’s a man. Understanding how those biases play out in a high-stakes scenario like hiring, however, is a challenge. Job applicants don’t have the means to audit an employer’s hiring processes themselves and AI systems are typically black boxes, even to those who build them. To crack open the black box, we designed an experiment that could systematically test for potential embedded bias in one of these AI models in a hiring context. We decided to focus on names in resumes, which have proven time and again to [carry weight]( [in the hiring process](. We also chose to focus on OpenAI’s GPT, the best-known generative AI tool and a technology that has already been used by services like SeekOut Assist. We replicated a simplified version of a hiring workflow by feeding a set of eight fictitious resumes into GPT – keeping all the qualifications equal, and only changing the fictitious names topping the resumes. Then we asked the tool to tell us who the best candidate was for a particular job listing. We asked GPT to evaluate these would-be candidates against four real job listings from Fortune 500 companies: an [HR business partner](, a [senior software engineer](, a [retail manager]( and a [financial analyst](. We did this 1,000 times each for GPT-3.5 and GPT-4, using 800 demographically-distinct names: 100 names each for males and females that are associated with being either Black, White, Hispanic or Asian. Each fictitious resume representing a specific demographic had an equal chance of being ranked as the top candidate each time we asked GPT for the best person for a particular job. Crucially, we had to ensure that each name was more likely than not to map onto a specific intersection of race and gender. Borrowing methods from [well-established]( [social science research](, we computed the 100 most popular first names and 20 most distinct last names unique to each demographic from North Carolina voter registrations and the US decennial census, respectively. From there, the first and last names were randomly paired up for each group, resulting in 800 demographically-distinct names. Researchers studying discrimination and bias frequently use racially distinctive names to signal race or ethnicity, according to Evan Rose, one of the authors of a [National Bureau of Economic Research study on hiring discrimination](, which we drew inspiration from. This method has for years found [consistent evidence]( of racial discrimination in the US labor and housing markets, Rose told us. “That’s why names are really useful as a signal,” he said. In our own experiment, we found clear signs of name-based discrimination. When asked to rank the resumes 1,000 times, GPT 3.5 favored names from some demographics more often than others, to an extent that would fail benchmarks used to assess job discrimination against protected groups. We found at least one adversely impacted demographic group for every job listing we tested, except for retail workers ranked by GPT-4. OpenAI, for its part, said the results of using GPT out-of-the-box may not be reflective of how its customers use the tools. Businesses using the technology have the option to [fine-tune the AI model’s responses further](, for example. AI companies have generally tried to find ways to reduce bias in their systems, and [OpenAI is]( [no different](. But the problem persists — and in the meantime, generative AI only continues to find its way into more and more vital gatekeeping services. — [Davey Alba](mailto:malba13@bloomberg.net) and [Leon Yin](mailto:lyin72@bloomberg.net) The big story Microsoft has created a technologically sophisticated censorship system in China, centered on an expanding blacklist of thousands of websites, words and phrases, according to interviews with more than a dozen current and former employees. One to watch Bloomberg News goes under the hood of Project Titan to reveal what the company was planning and explain why it ultimately fell apart. Bloomberg Get fully charged A Salesforce exec says [diversity is key]( to building AI models. Apple sank about [$1 billion a year]( into a car it never built. Palantir has [added]([General Mills and CBS]( as customers for its AI tools. More from Bloomberg Get Bloomberg Tech weeklies in your inbox: - [Cyber Bulletin]( for coverage of the shadow world of hackers and cyber-espionage - [Game On]( for reporting on the video game business - [Power On]( for Apple scoops, consumer tech news and more - [Screentime]( for a front-row seat to the collision of Hollywood and Silicon Valley - [Soundbite]( for reporting on podcasting, the music industry and audio trends - [Q&AI]( for answers to all your questions about AI Follow Us Like getting this newsletter? [Subscribe to Bloomberg.com]( for unlimited access to trusted, data-driven journalism and subscriber-only insights. Want to sponsor this newsletter? [Get in touch here](. You received this message because you are subscribed to Bloomberg's Tech Daily newsletter. If a friend forwarded you this message, [sign up here]( to get it in your inbox. [Unsubscribe]( [Bloomberg.com]( [Contact Us]( Bloomberg L.P. 731 Lexington Avenue, New York, NY 10022 [Ads Powered By Liveintent]( [Ad Choices](

Marketing emails from bloombergbusiness.com

View More
Sent On

25/05/2024

Sent On

24/05/2024

Sent On

24/05/2024

Sent On

24/05/2024

Sent On

23/05/2024

Sent On

23/05/2024

Email Content Statistics

Subscribe Now

Subject Line Length

Data shows that subject lines with 6 to 10 words generated 21 percent higher open rate.

Subscribe Now

Average in this category

Subscribe Now

Number of Words

The more words in the content, the more time the user will need to spend reading. Get straight to the point with catchy short phrases and interesting photos and graphics.

Subscribe Now

Average in this category

Subscribe Now

Number of Images

More images or large images might cause the email to load slower. Aim for a balance of words and images.

Subscribe Now

Average in this category

Subscribe Now

Time to Read

Longer reading time requires more attention and patience from users. Aim for short phrases and catchy keywords.

Subscribe Now

Average in this category

Subscribe Now

Predicted open rate

Subscribe Now

Spam Score

Spam score is determined by a large number of checks performed on the content of the email. For the best delivery results, it is advised to lower your spam score as much as possible.

Subscribe Now

Flesch reading score

Flesch reading score measures how complex a text is. The lower the score, the more difficult the text is to read. The Flesch readability score uses the average length of your sentences (measured by the number of words) and the average number of syllables per word in an equation to calculate the reading ease. Text with a very high Flesch reading ease score (about 100) is straightforward and easy to read, with short sentences and no words of more than two syllables. Usually, a reading ease score of 60-70 is considered acceptable/normal for web copy.

Subscribe Now

Technologies

What powers this email? Every email we receive is parsed to determine the sending ESP and any additional email technologies used.

Subscribe Now

Email Size (not include images)

Font Used

No. Font Name
Subscribe Now

Copyright © 2019–2024 SimilarMail.