This is about physically bought, scanned books. Not covered by this case is what they’re allowed to do with that model, e.g. charge people for access to it.
Maybe controversial, but compared to Meta pirating books, claiming it makes no difference, and that each book is individually worthless to the model (but the model is of course worth billions), is it wrong that I’m like “hmm, at least they’re buying books”?
As others say, there should be specific licensing, so they actually need to pay a cost per book, set by the publisher, specifically to legally include it in their model, rather than shopping as if they were a human when the buyer is really an LLM in a skin suit.
Then I too can just download books, videos, music, whatever the fuck I want.
Alsup also said, however, that Anthropic’s copying and storage of more than 7 million pirated books in a “central library” infringed the authors’ copyrights and was not fair use. The judge has ordered a trial in December to determine how much Anthropic owes for the infringement.
US copyright law says that willful copyright infringement can justify statutory damages of up to $150,000 per work.
This is pretty much what we expected from the decision last week: training on books is legal; pirating books is still piracy… you can train on books you own without asking permission (and I assume books/ebooks where you don’t have to circumvent DRM, as that’s illegal in a different way).
If I can read books and learn, why can’t AI?
Just because you own a CD doesn’t mean you have a license to play it in a club.
It’s a good thing they are not playing at a club then.
In this analogy, the AI uses books like a remix DJ would use bits and pieces of songs from different tracks to splice together their output. Except in the case of AI, it will be much harder to identify the original source.
Have you never used bits and pieces of what other people say or what you’ve read in books or riffs you’ve heard or styles seen/heard/read when communicating or creating?
Of course. But I’m not a machine churning out an endless spew of those bits and pieces with no further creative input. I’d be on the side of giving any truly conscious entity rights (including creative ones), but LLMs are not, and I don’t think ever could be, conscious. That’s just not how they work, to my understanding anyway.
If LLMs aren’t conscious, who is using them to churn out an endless spew of those bits and pieces with no further creative input?
Someone has to be doing it. I guess it could be these newfangled AI Agents I’ve been hearing about, but as far as I’m aware, at least, they still require input and/or editing (depending on the medium) from a human.
Okay let’s take a break here cuz I think we need to point something out. They are absolutely not conscious. By any definition of the word. By any stretch of the imagination. It’s important to me that you understand this. What you are describing here is a tool. Not something with consciousness.
Under this definition, it is illegal to summarize news articles behind a paywall.
If you made money doing that, it probably would be illegal. You would certainly get sued, in any case.
People make a lot of money summarizing articles behind paywalls and it is generally considered legal as long as it is a summary and not copied text.
Who are you paying for that?