• @Randomgal@lemmy.ca
    English
    92 hours ago

    You’re poor? Fuck you, you have to pay to breathe.

    Millionaire? Whatever you want daddy uwu

  • @MTK@lemmy.world
    English
    62 hours ago

    Check out my new site TheAIBay, you search for content and an LLM that was trained on reproducing it gives it to you, a small hash check is used to validate accuracy. It is now legal.
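
    For anyone curious what a "small hash check" could look like in practice, here is a minimal sketch. It assumes SHA-256 as the hash and uses hypothetical helper names (`sha256_hex`, `matches_reference`); nothing here comes from a real service.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def matches_reference(candidate: bytes, reference_digest: str) -> bool:
    """True only if candidate hashes to exactly the reference digest."""
    return sha256_hex(candidate) == reference_digest


# Hypothetical example data: the digest of the "original" is published,
# and any regenerated output is checked against it.
original = b"It was the best of times, it was the worst of times"
digest = sha256_hex(original)

exact_copy = b"It was the best of times, it was the worst of times"
near_copy = b"It was the best of times, it was the worst of times."

print(matches_reference(exact_copy, digest))  # byte-identical output passes
print(matches_reference(near_copy, digest))   # a single extra byte fails
```

    Note that a hash match only passes for byte-for-byte identical output, so a check like this validates an exact copy and nothing looser.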

  • setVeryLoud(true);
    English
    15
    edit-2
    8 hours ago

    Gist:

    What’s new: The Northern District of California has granted a summary judgment for Anthropic that the training use of the copyrighted books and the print-to-digital format change were both “fair use” (full order below box). However, the court also found that the pirated library copies that Anthropic collected could not be deemed as training copies, and therefore, the use of this material was not “fair”. The court also announced that it will have a trial on the pirated copies and any resulting damages, adding:

    “That Anthropic later bought a copy of a book it earlier stole off the internet will not absolve it of liability for the theft but it may affect the extent of statutory damages.”

    • @DerisionConsulting@lemmy.ca
      English
      1
      edit-2
      9 hours ago

      Formatting thing: if you start a line in a new paragraph with four spaces, it assumes that you want to display the text as code and won’t line break.

      This means that the last part of your comment is a long line that people need to scroll to see. If you remove one of the spaces, or you remove the empty line between it and the previous paragraph, it’ll look like a normal comment.

      With an empty line of space:

      1 space - and a little bit of writing just to see how the text will wrap. I don’t really have anything that I want to put here, but I need to put enough here to make it long enough to wrap around. This is likely enough.

      2 spaces - and a little bit of writing just to see how the text will wrap. I don’t really have anything that I want to put here, but I need to put enough here to make it long enough to wrap around. This is likely enough.

      3 spaces - and a little bit of writing just to see how the text will wrap. I don’t really have anything that I want to put here, but I need to put enough here to make it long enough to wrap around. This is likely enough.

      4 spaces -  and a little bit of writing just to see how the text will wrap. I don't really have anything that I want to put here, but I need to put enough here to make it long enough to wrap around. This is likely enough.
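
      The rule described above can be sketched roughly like this. It is a simplified reading of the Markdown indented-code-block rule (spaces only, no tabs, no list nesting), and `starts_indented_code` is a made-up helper name, not part of any Markdown library.

```python
def starts_indented_code(prev_line: str, line: str) -> bool:
    """Simplified check: does `line` begin an indented code block?

    Assumption: a line starts a code block when it has four or more
    leading spaces AND the previous line is blank. Tabs and list
    nesting are deliberately ignored in this sketch.
    """
    if line.strip() == "":
        return False  # a blank line cannot start a code block
    indent = len(line) - len(line.lstrip(" "))
    prev_is_blank = prev_line.strip() == ""
    return indent >= 4 and prev_is_blank


# After an empty line: 1 to 3 spaces stay normal text, 4 become code.
for spaces in range(1, 5):
    line = " " * spaces + "a little bit of writing"
    print(spaces, starts_indented_code("", line))

# Without the empty line, even 4 spaces just continue the paragraph.
print(starts_indented_code("previous paragraph text", "    still prose"))
```

      This matches the two fixes suggested above: drop one space, or drop the empty line before the quote.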
      
      • setVeryLoud(true);
        English
        5
        10 hours ago

        My interpretation was that AI companies can train on material they are licensed to use, but the courts have deemed that Anthropic pirated this material as they were not licensed to use it.

        In other words, if Anthropic bought the physical or digital books, it would be fine so long as their AI couldn’t spit it out verbatim, but they didn’t even do that, i.e. the AI crawler pirated the book.

  • @vane@lemmy.world
    English
    8
    edit-2
    11 hours ago

    Ok, so you can buy books and scan them, or buy ebooks, and use them for AI training, but you can’t just download pirated books from the internet to train AI. Did I understand that correctly?

  • @fum@lemmy.world
    English
    714 hours ago

    What a bad judge.

    This is another indication of how Copyright laws are bad. The whole premise of copyright has been obsolete since the proliferation of the internet.

    • gian
      English
      214 hours ago

      What a bad judge.

      Why? Basically he simply stated that you can use whatever material you want to train your model as long as you ask the author (or copyright holder) for permission to use it (and presumably pay for it).

      • @patatahooligan@lemmy.world
        English
        19 hours ago

        “Fair use” is the exact opposite of what you’re saying here. It says that you don’t need to ask for any permission. The judge ruled that obtaining illegitimate copies was unlawful but use without the creator’s consent is perfectly fine.

      • @LifeInMultipleChoice@lemmy.world
        English
        1
        edit-2
        11 hours ago

        If I understand correctly they are ruling you can buy a book once, and redistribute the information to as many people as you want without consequences. Aka 1 student should be able to buy a textbook and redistribute it to all other students for free. (Yet the rules only work for companies apparently, as the students would still be committing a crime.)

        They may be trying to put safeguards so it isn’t directly happening, but here is an example that the text is there word for word:

        • @VoterFrog@lemmy.world
          English
          22 hours ago

          If I understand correctly they are ruling you can buy a book once, and redistribute the information to as many people as you want without consequences. Aka 1 student should be able to buy a textbook and redistribute it to all other students for free. (Yet the rules only work for companies apparently, as the students would still be committing a crime.)

          A student can absolutely buy a textbook and then teach the other students the information in it for free. That’s not redistribution. Redistribution would mean making copies of the book to hand out. That’s illegal for people and companies.

          • @LifeInMultipleChoice@lemmy.world
            English
            1
            edit-2
            1 hour ago

            The language model isn’t teaching anything; it is changing the wording of something and spitting it back out. And in some cases, not changing the wording at all, just spitting the information back out, without paying the copyright source. It is not alive, it has no thoughts. It has no “its own words.” (As seen by the judgement that its words cannot be copyrighted.) It only has other people’s words. Every word it spits out by definition is plagiarism, whether the work was copyrighted before or not.

            People wonder why works such as journalism are getting worse. Well, how could they ever get better if anything a journalist writes can be absorbed in real time, reworded and regurgitated without paying any dues to the original source? One journalist’s article, displayed in 30 versions, divides the original work’s worth up into 30 portions. The original work is now worth 1/30th of its original value. Maybe one can argue it is twice as good, so 1/15th.

            Long term it means all original creations… are devalued and therefore not nearly worth pursuing. So we will only get shittier and shittier information. Every research project… physics, chemistry, psychology, all technological advancements, slowly degraded as language models get better and original sources see diminishing returns.

        • gian
          English
          2
          10 hours ago

          If I understand correctly they are ruling you can buy a book once, and redistribute the information to as many people as you want without consequences. Aka 1 student should be able to buy a textbook and redistribute it to all other students for free. (Yet the rules only work for companies apparently, as the students would still be committing a crime.)

          Well, it would be interesting if this case were used as precedent in a case involving a single student who does the same thing. But you are right.

          • @fum@lemmy.world
            English
            1
            10 hours ago

            This was my understanding also, and why I think the judge is bad at their job.

            • @LifeInMultipleChoice@lemmy.world
              English
              0
              10 hours ago

              I suppose someone could develop an LLM that digests textbooks, rewords the text and spits it back out, then distribute it for free page for page. You can’t copyright the math problems, I don’t think… so if the text wording is what gives it credence, that would have been changed.

                • @LifeInMultipleChoice@lemmy.world
                  English
                  1
                  edit-2
                  8 hours ago

                  Oh I agree it should be, but following the judge’s ruling, I don’t see how it could be. You trained an LLM on textbooks that were purchased, not pirated. And the LLM distributed the responses.

                  (Unless you mean the human reworded them, then yeah, we aren’t special apparently)

        • gian
          English
          2
          10 hours ago

          True. And I will be happy if someone sues them and the judge says the same thing.

  • @GissaMittJobb@lemmy.ml
    English
    917 hours ago

    It’s extremely frustrating to read this comment thread because it’s obvious that so many of you didn’t actually read the article, or even half-skim it, or even attempt to comprehend its title for more than a second.

    For shame.

    • @LifeInMultipleChoice@lemmy.world
      English
      1
      11 hours ago

      “While the copies used to convert purchased print library copies into digital library copies were slightly disfavored by the second factor (nature of the work), the court still found “on balance” that it was a fair use because the purchased print copy was destroyed and its digital replacement was not redistributed.”

      So you find this to be valid? To me it is absolutely being redistributed

  • Prox
    English
    139
    1 day ago

    FTA:

    Anthropic warned against “[t]he prospect of ruinous statutory damages—$150,000 times 5 million books”: that would mean $750 billion.

    So part of their argument is actually that they stole so much that it would be impossible for them/anyone to pay restitution, therefore we should just let them off the hook.
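
    The figure in that quote is just a multiplication, and it checks out. A quick sketch, using only the two numbers from the quoted passage (the constant names are made up for illustration):

```python
# Both inputs come straight from the quoted passage; nothing else is assumed.
DAMAGES_PER_BOOK_USD = 150_000  # per-book figure cited in the quote
BOOKS = 5_000_000               # number of books cited in the quote

total_usd = DAMAGES_PER_BOOK_USD * BOOKS

print(f"${total_usd:,}")                         # $750,000,000,000
print(f"{total_usd / 1_000_000_000:g} billion")  # 750 billion
```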

    • @interdimensionalmeme@lemmy.ml
      English
      18 hours ago

      What it means is they don’t own the models. They are the commons of humanity, they are merely temporary custodians. The nightmare ending is the elites keeping the most capable and competent models for themselves as private playthings. That must not be allowed to happen under any circumstances. Sue OpenAI, Anthropic and the other enclosers, sue them for trying to take their ball and go home. Dispossess them and sue the investors for their corrupt influence on research.

    • Phoenixz
      English
      30
      edit-2
      1 day ago

      This version of too big to fail is too big a criminal to pay the fines.

      How about we lock them up instead? All of em.

    • Lovable Sidekick
      English
      3
      edit-2
      1 day ago

      Lawsuits are multifaceted. This statement isn’t a defense or an argument for innocence, it’s just what it says - an assertion that the proposed damages are unreasonably high. If the court agrees, the plaintiff can always propose a lower damage claim that the court thinks is reasonable.

  • Dr. Moose
    English
    5
    edit-2
    17 hours ago

    Unpopular opinion but I don’t see how it could have been different.

    • There’s no way the West would give the AI lead to China, which has no desire or framework to ever accept this.
    • Believe it or not, transformers are actually learning by current definitions and not regurgitating a direct copy. It’s transformative work - it’s even in the name.
    • This is actually good, as it prevents a market moat for only the super-rich corporations that could afford the expensive training datasets.

    This is an absolute win for everyone involved other than copyright hoarders and mega corporations.

    • @kromem@lemmy.world
      English
      713 hours ago

      I’d encourage everyone upset at this to read over some of the EFF posts from actual IP lawyers on this topic, like this one:

      Nor is pro-monopoly regulation through copyright likely to provide any meaningful economic support for vulnerable artists and creators. Notwithstanding the highly publicized demands of musicians, authors, actors, and other creative professionals, imposing a licensing requirement is unlikely to protect the jobs or incomes of the underpaid working artists that media and entertainment behemoths have exploited for decades. Because of the imbalance in bargaining power between creators and publishing gatekeepers, trying to help creators by giving them new rights under copyright law is, as EFF Special Advisor Cory Doctorow has written, like trying to help a bullied kid by giving them more lunch money for the bully to take.

      Entertainment companies’ historical practices bear out this concern. For example, in the late-2000’s to mid-2010’s, music publishers and recording companies struck multimillion-dollar direct licensing deals with music streaming companies and video sharing platforms. Google reportedly paid more than $400 million to a single music label, and Spotify gave the major record labels a combined 18 percent ownership interest in its now-$100 billion company. Yet music labels and publishers frequently fail to share these payments with artists, and artists rarely benefit from these equity arrangements. There is no reason to believe that the same companies will treat their artists more fairly once they control AI.

    • @deathbird@mander.xyz
      English
      4
      edit-2
      16 hours ago

      1. Idgaf about China and what they do and you shouldn’t either, even if US paranoia about them is highly predictable.
      2. Depending on the outputs it’s not always that transformative.
      3. The moat would be good actually. The business model of LLMs isn’t good, but it’s not even viable without massive subsidies, not least of which is taking people’s shit without paying.

      It’s a huge loss for smaller copyright holders (like the ones that filed this lawsuit) too. They can’t afford to fight when they get imitated beyond fair use. Copyright abuse can only be fixed by the very force that creates copyright in the first place: law. The market can’t fix that. This just decides winners between competing mega corporations, and even worse, upends a system that some smaller players have been able to carve a niche in.

      Want to fix copyright? Put real time limits on it. Bind it to a living human only. Make it non-transferable. There’s all sorts of ways to fix it, but this isn’t it.

      ETA: Anthropic are some bitches. “Oh no the fines would ruin us, our business would go under and we’d never maka da money :*-(” Like yeah, no shit, no one cares. Strictly speaking the fines for ripping a single CD, or making a copy of a single DVD to give to a friend, are so astronomically high as to completely financially ruin the average USAian for life. That sword of Damocles for watching Shrek 2 for your personal enjoyment but in the wrong way has been hanging there for decades, and the only thing that keeps the cord that holds it up strong is the cost of pursuing “low-level offenders”. If they wanted to they could crush you.

      Anthropic walked right under the sword and assumed their money would protect them from small authors etc. And they were right.

      • @Atlas_@lemmy.world
        English
        216 hours ago

        Maybe something could be hacked together to fix copyright, but further complication there is just going to make accurate enforcement even harder. And we already have Google (in YouTube) doing a shitty job of it and that’s… one of the largest companies on earth.

        We should just kill copyright. Yes, it’ll disrupt Hollywood. Yes it’ll disrupt the music industry. Yes it’ll make it even harder to be successful or wealthy as an author. But this is going to happen one way or the other so long as AI can be trained on copyrighted works (and maybe even if not). We might as well get started on the transition early.

      • Dr. Moose
        English
        -1
        edit-2
        16 hours ago

        I’ll be honest with you - I genuinely sympathize with the cause but I don’t see how this could ever be solved with the methods you suggested. The world is not coming together to hold hands and sing kumbaya out of this one. Trade deals are incredibly hard to make and even harder to enforce, so the free market is clearly the only path forward here.

    • Lovable Sidekick
      English
      2
      edit-2
      15 hours ago

      You’re getting douchevoted because on lemmy any AI-related comment that isn’t negative enough about AI is the Devil’s Work.

  • @mlg@lemmy.world
    English
    1021 hours ago

    Yeah I have a bash one-liner AI model that ingests your media and spits out a 99.9999999% accurate replica through the power of changing the filename.

    cp

    Outperforms the latest and greatest AI models
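
    For the record, here is roughly what that "model" does, sketched in Python rather than bash. The file names and the `copy_and_verify` helper are made up for illustration; the shell equivalent is simply `cp` with a source and a destination path.

```python
import filecmp
import shutil
import tempfile
from pathlib import Path


def copy_and_verify(src: Path, dst: Path) -> bool:
    """Copy src to dst under a new name, then compare the two byte for byte."""
    shutil.copyfile(src, dst)
    # shallow=False forces a content comparison instead of trusting metadata.
    return filecmp.cmp(src, dst, shallow=False)


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "your_media.bin"       # hypothetical input file
    dst = Path(tmp) / "totally_new_work.bin"  # same bytes, different filename
    src.write_bytes(b"pretend this is a movie")
    identical = copy_and_verify(src, dst)

print(identical)  # True: the "replica" is an exact copy
```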

  • @shadowfax13@lemmy.ml
    English
    418 hours ago

    calm down everyone. it’s only legal for parasitic mega corps, the normal working people will be harassed to suicide same as before.

    it’s only a crime if the victim was rich or the perpetrator was not rich.

  • Alphane Moon
    English
    54
    edit-2
    1 day ago

    And this is how you know that the American legal system should not be trusted.

    Mind you, I am not saying this is an easy case; it’s not. But the framing that piracy is wrong but ML training for profit is not wrong is clearly based on oligarch interests and demands.

    • themeatbridge
      English
      37
      1 day ago

      This is an easy case. Using published works to train AI without paying for the right to do so is piracy. The judge making this determination is an idiot.

      • Null User Object
        English
        15
        1 day ago

        The judge making this determination is an idiot.

        The judge hasn’t ruled on the piracy question yet. The only thing that the judge has ruled on is, if you legally own a copy of a book, then you can use it for a variety of purposes, including training an AI.

        “But they didn’t own the books!”

        Right. That’s the part that’s still going to trial.

      • @AbidanYre@lemmy.world
        English
        25
        1 day ago

        You’re right. When you’re doing it for commercial gain, it’s not fair use anymore. It’s really not that complicated.

        • @tabular@lemmy.world
          English
          5
          1 day ago

          If you’re using the minimum amount, in a transformative way that doesn’t compete with the original copyrighted source, then it’s still fair use even if it’s commercial. (This is not saying that’s what LLMs are doing.)

  • Optional
    English
    15
    1 day ago

    Judges: not learning a goddamned thing about computers in 40 years.

    • @BlameTheAntifa@lemmy.world
      English
      36 hours ago

      They aren’t capable of that. This is why you sometimes see people comparing AI to compression, which is a bad faith argument. Depending on the training, AI can make something that is easily recognizable as derivative, but is not identical or even “lossy” identical. But this scenario takes place in a vacuum that doesn’t represent the real world. Unfortunately, we are enslaved by Capitalism, which means the output, which is being sold for-profit, is competing with the very content it was trained upon. This is clearly a violation of basic ethical principles as it actively harms those people whose content was used for training.

    • @kromem@lemmy.world
      English
      113 hours ago

      Even if the AI could spit it out verbatim, all the major labs already have IP checkers on their text models that block it from doing so, as fair use for training (what was decided here) does not mean you are free to reproduce.

      Like, if you want to be an artist and trace Mario in class as you learn, that’s fair use.

      If once you are working as an artist someone says “draw me a sexy image of Mario in a calendar shoot” you’d be violating Nintendo’s IP rights and liable for infringement.