What a bad judge.
Why? Basically, he simply stated that you can use whatever material you want to train your model, as long as you ask the author (or copyright holder) for permission to use it (and presumably pay for it).
“Fair use” is the exact opposite of what you’re saying here. It says that you don’t need to ask for any permission. The judge ruled that obtaining illegitimate copies was unlawful, but that use without the creator’s consent is perfectly fine.
If I understand correctly, they are ruling that you can buy a book once and redistribute the information to as many people as you want without consequences. In other words, one student should be able to buy a textbook and redistribute it to all the other students for free. (Yet the rules apparently only work for companies, as the students would still be committing a crime.)
They may be trying to put safeguards in place so that it doesn’t happen directly, but here is an example showing that the text is there word for word:
Well, it would be interesting if this case were used as precedent in a case involving a single student who does the same thing. But you are right.
This was my understanding also, and why I think the judge is bad at their job.
I suppose someone could develop an LLM that digests textbooks, rewords the text, and spits it back out, then distribute the result for free, page for page. I don’t think you can copyright the math problems… so if the wording of the text is what gives it credence, that would have been changed.
If a human did that it’s still plagiarism.
Oh, I agree it should be, but following the judge’s ruling, I don’t see how it could be. You trained an LLM on textbooks that were purchased, not pirated, and the LLM distributed the responses.
(Unless you mean the human reworded them, then yeah, we aren’t special apparently)
Yes, on the second part. Just rearranging or replacing words in a text is not transformative, which is a requirement. There is an argument that ‘AI’ is capable of doing transformative work, but the tokenizing and weighting process is not magic, and in my use of multiple LLMs they do not have an understanding of the material any more than a dictionary understands the material printed on its pages.
An example was the wine glass problem. Art ‘AI’s were unable to display a wine glass filled to the top. No matter how it was prompted, or what style it aped, it would fail to do so and report back that the glass was full. But it could render a full glass of water. It didn’t understand what a full glass was, not even for the water. How was this possible? Well, there was very little art of a full wine glass, because society has an unspoken rule that a full wine glass is the epitome of gluttony, and wine is to be savored, not drunk. Whereas references for full glasses of water were abundant. It doesn’t know what full means, just that pictures of a full glass of water are tied to the words full, glass, and water.
Yeah, we had a fun example a while ago, let me see if I can still find it.
We would ask it to create a photo of a cat with no tail.
And then tell it there was indeed a tail, and ask it to draw an arrow pointing to it.
It just points to where the tail most commonly is, or to where it was said to be, in a picture it was not actually referencing.
Edit: granted, now it shows a picture of a cat where you just can’t see the tail.
Huh? Didn’t Meta skip asking for permission and pirate a lot of books to train their model?
True. And I will be happy if someone sues them and the judge says the same thing.