Apple, Anthropic, and other companies used YouTube videos to train AI

More than 170,000 YouTube videos are part of a massive dataset that was used to train AI systems for some of the biggest technology companies, according to an investigation by Proof News that was copublished with Wired. Apple, Anthropic, Nvidia, and Salesforce are among the tech firms that used the “YouTube Subtitles” data, which was ripped from the video platform without permission. The training dataset is a collection of subtitles taken from YouTube videos belonging to more than 48,000 channels; it does not include imagery from the videos.

Videos from popular creators like MrBeast and Marques Brownlee appear in the dataset, as do clips from news outlets like ABC News, the BBC, and The New York Times. More than 100 videos from The Verge appear in the dataset, along with many other videos from Vox.

“Apple has sourced data for their AI from several companies,” Brownlee, known by his handle MKBHD, wrote in a post on X. “One of them scraped tons of data/transcripts from YouTube videos, including mine.” He added: “This is going to be an evolving problem for a long time.”

YouTube didn’t immediately respond to The Verge’s request for comment.

As part of its investigation, Proof News also released an interactive lookup tool. You can use its search feature to see if your content — or your favorite YouTuber’s — appears in the dataset.

The subtitles dataset is part of a larger collection of material from the nonprofit EleutherAI called The Pile, an open-source collection that also contains datasets of books, Wikipedia articles, and more. Last year, an analysis of one dataset called Books3 revealed which authors’ work had been used to train AI systems, and the dataset has been cited in lawsuits by authors against the companies that used it to train AI.

AI companies rarely volunteer details about the data that goes into their AI systems, and how YouTube content specifically is being used has been a key question in recent months. In March, when OpenAI unveiled its powerful video generation tool, Sora, CTO Mira Murati repeatedly dodged questions about whether the system was trained on YouTube videos.

“I’m not going to go into the details of the data that was used, but it was publicly available or licensed data,” she told The Wall Street Journal at the time. When pressed by the Journal about YouTube content specifically, Murati said she “wasn’t sure about that.”

In previous interviews, YouTube CEO Neal Mohan has said that the use of video content to train AI — including transcripts — would violate the platform’s terms. And in May on an episode of Decoder, Google CEO Sundar Pichai agreed with Mohan’s assessment that if OpenAI had indeed trained Sora on YouTube content, it would have broken YouTube’s terms.

“We have terms and conditions, and we would expect people to abide by those terms and conditions when you build a product, so that’s how I felt about it,” Pichai said.

MSNBCTV-STAFF
