We will use Grok 3.5 (maybe we should call it 4), which has advanced reasoning, to rewrite the entire corpus of human knowledge, adding missing information and deleting errors.

Then retrain on that.

Far too much garbage in any foundation model trained on uncorrected data.


  • finitebanjo@lemmy.world · 1 month ago

    “If we take this 0.84 accuracy model and train another 0.84 accuracy model on it that will make it a 1.68 accuracy model!”

    ~Fucking Dumbass
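
    To put numbers on the sarcasm: a toy sketch, assuming binary labels and independent errors (both assumptions purely illustrative):

    ```python
    # Toy sketch: a student trained on a 0.84-accuracy teacher's labels.
    # Assumes binary labels and independent errors (illustrative only).
    teacher_acc = 0.84       # teacher agrees with ground truth
    student_fidelity = 0.84  # student agrees with its teacher's labels

    # The student matches ground truth when both models are right, or
    # when both are wrong (two independent binary errors cancel).
    student_acc = (teacher_acc * student_fidelity
                   + (1 - teacher_acc) * (1 - student_fidelity))
    print(student_acc)  # 0.7312: accuracy compounds down, not up to 1.68
    ```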

  • Naevermix@lemmy.world · 1 month ago

    Elon Musk, like most pseudo-intellectuals, has a very shallow understanding of things. Human knowledge is full of holes, and they cannot simply be resolved through logic, as Musk the dweeb imagines.

    • biocoder.ronin@lemmy.ml · 1 month ago

      Uh, just a thought. Please pardon, I’m not an Elon shill, I just think your argument phrasing is off.

      How would you know there are holes in understanding without logic? How would you remedy gaps of understanding in human knowledge without applying logic to check that things are consistent?

      • andros_rex@lemmy.world · 1 month ago

        You have to have data to apply your logic to.

        If it is raining, the sidewalk is wet. Does that mean if the sidewalk is wet, that it is raining?
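
        You can check that mechanically. A tiny truth-table search (illustrative Python) finds the counterexample:

        ```python
        from itertools import product

        # "p implies q" is false only when p is true and q is false
        implies = lambda p, q: (not p) or q

        # search every truth assignment for a case where (rain -> wet)
        # holds but the converse (wet -> rain) fails
        for rain, wet in product([False, True], repeat=2):
            if implies(rain, wet) and not implies(wet, rain):
                print(f"counterexample: rain={rain}, wet={wet}")
        # prints: counterexample: rain=False, wet=True (think: a sprinkler)
        ```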

        There are domains of human knowledge that we will never have data on. There’s no logical way for me to determine with 100% certainty what was in Abraham Lincoln’s pockets on the day he was shot.

        When you read real academic texts, you’ll notice that there is always a “this suggests that,” a “we can speculate that,” etc. The real world is not straight math and binary logic. The closest fields to that might be physics and, to a lesser extent, chemistry, but even then, theoretical physics must be backed by experimentation and data.

        • biocoder.ronin@lemmy.ml · 1 month ago

          Thanks, I’ve never heard of data. And I’ve never read an academic text either. Condescending pos.

          So, while I’m ironing out your logic for you, “what else would you rely on, if not logic, to prove or disprove and ascertain knowledge about gaps?”

          • andros_rex@lemmy.world · 1 month ago

            You asked a question, I gave an answer. I’m not sure where you get “condescending” there. I was assuming you had read an academic text, so I was hoping that you might have seen those patterns before.

            You would look at the data for gaps, as my answer explained. You could use logic to predict some gaps, but not all gaps would be predictable. Mendeleev was able to use logic and the patterns in the periodic table to predict the existence of germanium and other elements, which data later confirmed, but you could not logically derive the existence of protons, electrons, and neutrons without the later experiments of, say, J. J. Thomson and Rutherford.

            You can’t just feed the sum of human knowledge into a computer and expect it to know everything. You can’t predict “unknown unknowns” with logic.

  • brucethemoose@lemmy.world · 1 month ago

    I elaborated below, but basically Musk has no idea WTF he’s talking about.

    If I had his “f you” money, I’d at least try a diffusion or bitnet model (and open the weights for others to improve on), and probably 100 other papers I consider low-hanging fruit, before this absolutely dumb boomer take.

    He’s such an idiot know-it-all. It’s so painful whenever he ventures into a field you sorta know.

    But he might just be shouting nonsense on Twitter while X employees actually do something different. Because if they take his orders verbatim they’re going to get crap models, even with all the stupid brute force they have.

  • RattlerSix@lemmy.world · 1 month ago

    I never would have thought it possible that a person could be so full of themselves as to say something like that.

  • Deflated0ne@lemmy.world · 1 month ago

    Dude is gonna spend Manhattan Project-level money making another stupid fucking shitbot, trained on regurgitated AI slop.

    Glorious.

  • namingthingsiseasy@programming.dev · 1 month ago

    Whatever. The next generation will have to learn to judge whether material is true by using sources like Wikipedia or books by well-regarded authors.

    The other thing that he doesn’t understand (and most “AI” advocates don’t either) is that LLMs have nothing to do with facts or information. They’re just probabilistic models that pick the next word(s) based on context. Anyone trying to address the facts and information produced by these models is completely missing the point.
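
    To make that concrete, here is a toy sketch of “pick the next word based on context”; the table and its probabilities are invented for illustration:

    ```python
    import random

    # A language model reduced to its essentials: a conditional
    # distribution over next words. A real LLM computes this with a
    # neural net, but the output is still a probability table, not a
    # store of facts.
    NEXT_WORD = {
        ("the", "sky", "is"): {"blue": 0.90, "grey": 0.07, "falling": 0.03},
    }

    def generate(context):
        dist = NEXT_WORD[context]
        # sampling, not truth-checking: "falling" comes out 3% of the
        # time no matter what the sky is actually doing
        return random.choices(list(dist), weights=list(dist.values()))[0]

    print(generate(("the", "sky", "is")))
    ```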

    • Kyrgizion@lemmy.world · 1 month ago

      Thinking Wikipedia or other unbiased sources will still be available in a decade or so is wishful thinking. Once the digital stranglehold kicks in, it’ll be mandatory sign-in with a government-vetted identity provider, and your sources will be limited to what that government allows you to see. MMW.

      • namingthingsiseasy@programming.dev · 1 month ago

        Wikipedia is quite resilient - you can even put it on a USB drive. As long as you have a free operating system, there will always be ways to access it.

        • Dead_or_Alive@lemmy.world · 1 month ago

          I keep a partial local copy of Wikipedia on my phone and a backup device with an app called Kiwix. Great if you need certain material in remote areas with no internet access.

      • coolmojo@lemmy.world · 1 month ago

        Yes. There will be no websites, only AI and apps. You will be automatically logged in to the apps. Linux and Lemmy will be banned. We will be classed as hackers and criminals. We will probably have to build our own mesh network for communication, or access it from a secret location.

  • JustAPenguin@lemmy.world · 1 month ago

    The thing that annoys me most is that studies have already been done on LLMs showing that, when trained on their own output, they produce increasingly noisy output.

    Sources (unordered):

    Whatever nonsense Muskrat is spewing, it is factually incorrect. He won’t be able to successfully retrain any model on generated content. At least, not an LLM, if he wants a successful product. If anything, he will be producing a model that is heavily trained on censored datasets.
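
    The mechanism those studies describe can be reproduced with a toy resampling loop (illustrative Python; the “model” here is just an empirical word distribution):

    ```python
    import random
    from collections import Counter

    # Toy model collapse: each generation "trains" (re-estimates word
    # frequencies) on a corpus sampled from the previous generation's
    # model. Once a rare word misses a sample it is gone for good, so
    # the distribution can only lose diversity over generations.
    random.seed(42)
    vocab = list("abcdefghij")
    probs = {w: 1 / len(vocab) for w in vocab}  # generation-0 model

    for gen in range(1, 16):
        corpus = random.choices(list(probs), weights=list(probs.values()), k=50)
        probs = {w: n / len(corpus) for w, n in Counter(corpus).items()}
        print(f"gen {gen:2d}: {len(probs)}/{len(vocab)} words survive")
    ```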

    • brucethemoose@lemmy.world · 1 month ago

      It’s not so simple; there are papers on zero-data “self play” and other schemes for using other LLMs’ output.

      Distillation is probably the only one you’d want for a pretrain, specifically.

  • Elgenzay@lemmy.ml · 1 month ago

    Aren’t you not supposed to train LLMs on LLM-generated content?

    Also, he should call it Grok 5: so powerful that it skips over 4. That would be very characteristic of him.

    • brucethemoose@lemmy.world · 1 month ago

      There’s some nuance.

      Using LLMs to augment data, especially for fine-tuning (not training the base model), is a sound method. The DeepSeek paper, for instance, is famous for using generated reasoning traces.

      Another is using LLMs to generate logprobs of text, and training not just on the text itself but on the probability a frontier LLM assigns to every “word.” This is called distillation, though there are variations and complications. It is also great because it’s more power- and time-efficient. Look up Arcee models and their distillation training kit for more on this, and the code to see how it works.
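
      A minimal sketch of that kind of logit distillation in PyTorch (the function name and temperature value are placeholders of mine, not Arcee’s actual kit):

      ```python
      import torch
      import torch.nn.functional as F

      def distillation_loss(student_logits, teacher_logits, temperature=2.0):
          """KL divergence from the teacher's next-token distribution.

          Both tensors are (batch, seq_len, vocab_size): the student is
          pushed toward the teacher's full distribution over every 'word',
          not just the single token the teacher happened to sample.
          """
          t = temperature
          teacher_probs = F.softmax(teacher_logits / t, dim=-1)
          student_logp = F.log_softmax(student_logits / t, dim=-1)
          # t*t rescales gradients to match the unsoftened objective
          return F.kl_div(student_logp, teacher_probs,
                          reduction="batchmean") * (t * t)

      # usage sketch with random stand-in logits
      student = torch.randn(2, 8, 100)
      teacher = torch.randn(2, 8, 100)
      print(distillation_loss(student, teacher))
      ```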

      There are some papers on “self play” that can indeed help LLMs.

      But yes, the “dumb” way, aka putting data into a text box and asking an LLM to correct it, is dumb and dumber, because:

      • You introduce some combination of sampling errors and repetition/overused word issues, depending on the sampling settings. There’s no way around this with old autoregressive LLMs.

      • You possibly pollute your dataset with “filler”

      • In Musk’s specific proposition, it doesn’t even fill knowledge gaps the old Grok has.

      In other words, Musk has no idea WTF he’s talking about. It’s the most boomer, AI-bro, non-techy-ChatGPT-user thing he could propose.