OpenAI's models were trained on ebooks from a private ebook torrent tracker leec...

harry8 · on Jan 30, 2025

Have you got some support for this claim?

There are a lot of wild claims going around, so while this is plausible, it would be great if there were some evidence backing it.

naet · on Jan 30, 2025

NYT claims that OpenAI trained on their material. They allege copyright violation, although I think another argument might be breach of TOS for scraping the material from their website or archive.

The complaint filing references some of the other training material used by OpenAI, but I didn't dig deeply into what all of it was:

https://nytco-assets.nytimes.com/2023/12/NYT_Complaint_Dec20...

throwaway314155 · on Jan 30, 2025

What's that got to do with this books claim?

iinnPP · on Jan 30, 2025

Relevant similar behavior.

OsrsNeedsf2P · on Jan 30, 2025

He could be confusing it with Llama: https://www.wired.com/story/new-documents-unredacted-meta-co...