Which is more important, the rights of the infringed or of the funded and hyped business?
And don’t forget: ‘if we have to respect copyright law, then CHINA WINS!’
How can China simultaneously be incredibly non-exceptional and weak, yet strong enough to influence our copyright law from across an ocean?
Too big to fail 2: Electric Boogaloo
A new company could buy the bankrupt companies' intellectual property and train a new LLM on the entire internet, but only on licensed books. This would represent a transfer of funds from investors to publishers.
Perhaps a large publishing company could buy these AI bankrupts…
Nice product you have there, shame if it got sued for copyright infringement.
It's a valid question whether the rights were actually infringed here. Nothing stops you from reading books and then writing based on what you learned. Just because this is done at scale doesn't mean the output violates copyright.
IP is a legal fiction. It has no basis in nature or reality. If it did, we would have a multi-millennia history of IP. Instead we have only a couple hundred years since the legal fiction began.
Can't you say that about anything, though? And why would having a multi-millennia history have anything to do with it?
Do people really own land? Or is it just a legal fiction to make things easier?
Is a contract really a thing? Or is it a legal fiction?
Times change. IP is a thing now. It's kinda funny to see somebody on a software developer forum arguing IP doesn't exist.
By this argument, money is a social fiction too, so I can just stop by your place and take yours.
Money is an analogy.
Property is a legal construct. (“Fiction” implies that legal constructs are intended to say something about external reality, such that there could also be contrasting “legal nonfiction”, but all the constructs are equally constructed and equally not dictated by reality but instead by what someone thinks will have desirable effects on reality.)
The question for any category of property is whether it is a socially useful construct or not, not whether it has a basis in nature.
OK, but all the software code the AI companies write is also IP, so they're probably not going to argue that all of their stuff should be fair game too.
Without IP, though, you lose the incentive to invest in new things. IP is made up in the same sense that murder is made up: both are just concepts, but necessary concepts to keep society on track.
That being said, I don't know if I agree that the AI companies are violating IP rights.
Without IP, information becomes controlled by centralized online services with DRM, which show ads, meter access, monitor your activity, make specializations, and can take it all away at any time. That's the system we ended up with after humanity failed to voluntarily respect IP holders' rights under the old system that let you buy/own content. Every nation gets the government it deserves.
This doesn't make sense. The mechanism by which "information becomes controlled by centralized online services with DRM" is copyright law. Without copyright law, "DRM" wouldn't be a thing. Without some concept of "intellectual property", there is nothing for copyright law to protect.
My point is that you don't need the legal concept of intellectual property if people can't access the information in the first place. For example, people tried to sell software, but consumers didn't respect software IP and just pirated it. If IP isn't respected and isn't feasible to enforce, then it de facto doesn't exist. So now software developers run the software as a service and only let you interact with it through a web browser. Since copyright law was effectively worthless for protecting developer interests, developers found a commercialization model that didn't require distributing copies of the software at all.
IP exists in your head. You have knowledge that no one else has and it is your property.
IP law recognizes this ground truth and creates a legal framework that allows IP to be traded in the economy, which creates an incentive for people to share their IP.
What do you mean by "in your head"? A book I have printed, a photograph I have printed, a trademark that can be printed: these are tangible artifacts and the original IP.
Well, it's your property until you decide to share it with someone else.
Ownership would be the fiction. Copyright started after things began to be printed and the church/royalty lost their monopoly on writing.
Aviation rules are a legal fiction. They have no basis in nature or reality. If they did, we would have had a multi-millennia history of flying.
This is true, but I don't care. Anything that's bad for AI is good.
Classic HN: ignoring the billions of users who find ChatGPT, Gemini, and Claude not only useful but life-changing, including some of the poorest, so that they can fight for Disney, record labels, and trust-fund Williamsburg friends.
Every author should have the right to have their work remembered and immortalized by AI. We should have the right to influence how AI thinks by publishing our content. AI should remember our names, and the stories of how our work was produced, so it can remember who we are and how we helped it. This is how AI democracy works. The people trying to financially ruin the AI industry by demanding unreasonable amounts of money for themselves are threatening these fundamental human rights. If the legal risk becomes too large, then the AI labs will respond by training only on synthetic content. That means only AI will get to shape AI's future, and humanity will be erased from the book of life.
> Every author should have the right to have their work remembered and immortalized by AI.
Equally, every author should also have the right to not have their work ingested by AI.
That's what robots.txt does.
However you'd have to delist yourself from search engines to fully prevent AIs from reading the content on your website.
> That's what robots.txt does
It most certainly does not. robots.txt is almost totally worthless against genAI crawlers. Even being unindexed from search engines doesn't keep you safe.
That's a naive statement about robots.txt; nothing about it is binding or enforceable. It is a request that well-behaved crawlers heed. Other crawlers treat the Disallow section as a list of targets.
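This advisory nature is visible in how robots.txt is actually consumed. A minimal sketch using Python's standard-library parser and a hypothetical site (the rules and URLs here are made up for illustration): the file is just text that a crawler must *choose* to consult — nothing prevents fetching a disallowed path.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt, parsed from memory. Nothing about these
# rules is enforced; they are only answered when a crawler asks.
rules = """\
User-agent: *
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# A well-behaved crawler checks before fetching; a bad one simply doesn't,
# and may even read the Disallow list to find "interesting" paths.
print(rp.can_fetch("AnyBot", "https://example.com/private/page"))  # False
print(rp.can_fetch("AnyBot", "https://example.com/public/page"))   # True
```

The check is entirely client-side: the server never learns whether a crawler consulted the file, which is why the protocol can only keep out crawlers that already intend to cooperate.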
> Confronted with such extreme potential damages, Anthropic may lose its rights to raise valid defenses of its AI training, deciding it would be more prudent to settle, the company argued.
“Our case is so weak that a trial could pose a huge risk to our finances. So we’ll choose to forgo our day in court and settle instead. And that means you’ve violated our right to defend ourselves because LOOK WHAT YOU MADE ME DO.”
This amount of spin is breathtaking. Puts every politician to shame, really.
If the companies are profiting profoundly off an extremely diffuse reaping of intellectual property, doesn’t it make sense to distribute funds diffusely back over the whole of the society they are profiting from?
Which in a way is basically the UBI they claim to want anyway.
Great news. I thought all politicians and judges were asleep at the wheel, but apparently some are still awake.
Their defense seems to be that AI requires crime to exist, so they should be allowed to do crime. This is not a defense permitted in any body of law. They're going to need to work on that.
AI training on copyrighted materials for scientific purposes is borderline legal. Scientific progress is generally considered a decent reason to skirt around copyright issues. So AI can exist in a legal sense (assuming nobody torrents the content, thereby distributing it rather than merely downloading it).
The problem with the AI industry is that it took the scientific exception intended for the betterment of mankind and tried to turn it into a profit model. Many of their base models are borderline legal, but using them to make money requires some kind of regulation or deal.
I expect this to get resolved by having mandatory AI royalty fees that are added to every AI subscription, to be handed out to the creative industry. It's how my country and a few others responded when tape made it possible to record songs from the radio. It'll satisfy the huge companies (because they'll be paid more the more works they have) and spit in the face of smaller creatives or mere hobbyists.
It'll cost a couple of billion to grease the wheels, but if these companies work the same way the cryptocurrency industry paid off the Trump campaign, it won't be a difficult problem to solve.
well, if they're not allowed to do crime in the US, they will outsource it to other places.
They got the federal govt, they’ll be fine
I just don't trust the system any more. I bet somebody pulled strings and lobbied to get it certified, knowing it would be too broad and get defeated later, just so the AI companies can operate with impunity later. They'll point at this and say, "Hey, we're fine. Look, the case was thrown out"
It would be nice to see some egos checked on the hype train, but I agree with you that this has a good likelihood of backfiring.
Paraphrased: "Confronted with such extreme potential damages, The Pirate Bay might need to adjust its strategy. It would set an alarming precedent."
It is amazing how shamelessly these LLM thieves argue.
> It is amazing how shamelessly these LLM thieves argue.
Paraphrasing that: it is amazing how much money these shameless LLM thieves have.
Patent Troll: "You wrote your CRUD program using an AI that used War and Peace."
Yup, it's an ethical AI trained on out-of-copyright material such as War and Peace.
So, if I scraped thousands of WEB sites, would it be legal? I remember a person being charged for doing that to just a few WEB sites. The poor person ended up committing suicide.
If I recorded every song played on the radio for years to a digital file, would I be charged? You know I would be.
How is this different from what AI is doing? In the US, companies are considered persons, so to me, off to jail these companies go. Why is this turning into such a big deal? We know AI is stealing copyrighted data, so I hope they get what they deserve.
I beg your pardon, but do you think "web" is an acronym...?
WEB - Who Even Believes this :)
And that worked well against Google, right? Get over yourselves. You are fucked and just don't want to admit it. Think about it: I'm Google. I have literally more money than you, and I can hire 10 lawyers for each of yours to look for every technicality, loophole, stalling tactic, and typo in anything any of you do.
Guess who wins? Not John and Jane Schmoe. So yeah, enjoy being bent over, and just ask for lube first.
The best possible outcome is for both to fall apart under extreme logical scrutiny, and for the laws protecting them to be heavily changed to better serve mankind. One can hope.
Copyright is a deal society has made to advance the arts and sciences. We get more innovative media of all kinds, and in exchange, we pay for the government's policing of the right to copy.
I'm not an AI optimist or booster, but AI is an advancement in the arts and sciences, despite all the risks and downsides. AI is a derivative work of all digitized media.
There's an argument to be made that the AI companies should borrow a copy of every book, rent every movie, etc. But the money accruing to the owners of those copyrights would be marginal, and even summing over all those individual copyright claims, I'd say that the societal benefit to AI is greater.
Maybe it's time that we have compulsory licensing the way that radio can license music to play. As training data for AI, a compulsory license, in principle, should be quite cheap, on par with renting that media.
The bigger question is to what extent AI will tend to make other media obsolete. We are already seeing this with AI summaries in web search undermining the search results themselves. I don't have an answer to that, except to say that severely restricting the training data available to AI is not very helpful.
> Maybe it's time that we have compulsory licensing the way that radio can license music to play.
I fear this will be forced on us all.
I fear it because right now, it's already true that if you object to your works being used to train AI, then you can't publish your works (especially not on the web). A growing number of people are going that route, reducing the works available to us. But there is still a sliver of hope that a solution could be found at some point and it would be safe to publish again. If compulsory licensing happens, then even that small hope is lost.
> Copyright is a deal society has made to advance the arts
Corn subsidies are a deal society made to advance the consumption of sugar water.
Who really makes these deals and who benefits from them?
Maybe having so much new content all the time is making us too content while the backroom deal makers laugh their way to the bank and fund more wars.
> and sciences
People en masse want cheaper energy, better machines and better medicine regardless of profitability of the producers. I don't buy into the idea that the only way to incentivize new inventions is with fame and wealth. People like to do good work they are proud of - not enough people, arguably, but they exist.
Some inventions may be too powerful to go without strict regulations like nuclear energy. I'd argue AI is in the same basket. I believe the internet, and by extension internet connected AI, should be considered a public utility and governed by the public.
Compulsory licensing with reasonable standard rates would be a better solution for creators than what's happening now, which is essentially just compulsory giving.
But that's unlikely to happen, because any kind of compulsory licensing scheme that could allow creators to actually survive would still be cripplingly expensive for AI companies, given the number of works they have devoured under the assumption that all the world is theirs to take...and they clearly have the ear of the current administration.
This is a gross oversimplification. I honestly don't know where to start.