We will use Grok 3.5 (maybe we should call it 4), which has advanced reasoning, to rewrite the entire corpus of human knowledge, adding missing information and deleting errors.

Then retrain on that.

Far too much garbage in any foundation model trained on uncorrected data.


  • finitebanjo@lemmy.world · 1 month ago

    “If we take this 0.84 accuracy model and train another 0.84 accuracy model on it that will make it a 1.68 accuracy model!”

    ~Fucking Dumbass
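
    To put numbers on the sarcasm: a toy sketch, assuming binary labels and independent errors (both assumptions purely illustrative):

    ```python
    # Toy sketch: a student trained on a 0.84-accuracy teacher's labels.
    # Assumes binary labels and independent errors (illustrative only).
    teacher_acc = 0.84       # teacher agrees with ground truth
    student_fidelity = 0.84  # student agrees with its teacher's labels

    # The student matches ground truth when both models are right, or
    # when both are wrong (two independent binary errors cancel).
    student_acc = (teacher_acc * student_fidelity
                   + (1 - teacher_acc) * (1 - student_fidelity))
    print(student_acc)  # 0.7312: accuracy compounds down, not up to 1.68
    ```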

  • Naevermix@lemmy.world · 1 month ago

    Elon Musk, like most pseudo-intellectuals, has a very shallow understanding of things. Human knowledge is full of holes, and they cannot simply be resolved through logic, as Musk the dweeb imagines.

    • biocoder.ronin@lemmy.ml · 1 month ago

      Uh, just a thought. Please pardon, I’m not an Elon shill, I just think your argument phrasing is off.

      How would you know there are holes in understanding without logic? How would you remedy gaps of understanding in human knowledge without applying logic to check that things are consistent?

      • andros_rex@lemmy.world · 1 month ago

        You have to have data to apply your logic to.

        If it is raining, the sidewalk is wet. Does that mean if the sidewalk is wet, that it is raining?
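
        You can check that mechanically. A tiny truth-table search (illustrative Python) finds the counterexample:

        ```python
        from itertools import product

        # "p implies q" is false only when p is true and q is false
        implies = lambda p, q: (not p) or q

        # search every truth assignment for a case where (rain -> wet)
        # holds but the converse (wet -> rain) fails
        for rain, wet in product([False, True], repeat=2):
            if implies(rain, wet) and not implies(wet, rain):
                print(f"counterexample: rain={rain}, wet={wet}")
        # prints: counterexample: rain=False, wet=True (think: a sprinkler)
        ```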

        There are domains of human knowledge that we will never have data on. There’s no logical way for me to determine with 100% certainty what was in Abraham Lincoln’s pockets on the day he was shot.

        When you read real academic texts, you’ll notice that there is always a “this suggests that,” a “we can speculate that,” etc. The real world is not straight math and binary logic. The closest fields to that might be physics and, to a lesser extent, chemistry, but even then, theoretical physics must be backed by experimentation and data.

        • biocoder.ronin@lemmy.ml · 1 month ago

          Thanks, I’ve never heard of data. And I’ve never read an academic text either. Condescending pos.

          So, while I’m ironing out your logic for you, “what else would you rely on, if not logic, to prove or disprove and ascertain knowledge about gaps?”

          • andros_rex@lemmy.world · 1 month ago

            You asked a question, I gave an answer. I’m not sure where you get “condescending” there. I was assuming you had read an academic text, so I was hoping that you might have seen those patterns before.

            You would look at the data for gaps, as my answer explained. You could use logic to predict some gaps, but not all gaps would be predictable. Mendeleev was able to use logic and the patterns in the periodic table to predict the existence of germanium and other elements, which data later confirmed, but you could not logically derive the existence of protons, electrons, and neutrons without the later experiments of, say, J. J. Thomson and Rutherford.

            You can’t just feed the sum of human knowledge into a computer and expect it to know everything. You can’t predict “unknown unknowns” with logic.

  • brucethemoose@lemmy.world · 1 month ago

    I elaborated below, but basically Musk has no idea WTF he’s talking about.

    If I had his “f you” money, I’d at least try a diffusion or bitnet model (and open the weights for others to improve on), and probably 100 other papers I consider low-hanging fruit, before this absolutely dumb boomer take.

    He’s such an idiot know-it-all. It’s so painful whenever he ventures into a field you sorta know.

    But he might just be shouting nonsense on Twitter while X employees actually do something different. Because if they take his orders verbatim they’re going to get crap models, even with all the stupid brute force they have.

  • RattlerSix@lemmy.world · 1 month ago

    I never would have thought it possible that a person could be so full of themselves as to say something like that.

  • Deflated0ne@lemmy.world · 1 month ago

    Dude is gonna spend Manhattan Project-level money making another stupid fucking shitbot, trained on regurgitated AI slop.

    Glorious.

  • namingthingsiseasy@programming.dev · 1 month ago

    Whatever. The next generation will have to learn to judge whether material is true by using sources like Wikipedia or books by well-regarded authors.

    The other thing that he doesn’t understand (and most “AI” advocates don’t either) is that LLMs have nothing to do with facts or information. They’re just probabilistic models that pick the next word(s) based on context. Anyone trying to address the facts and information produced by these models is completely missing the point.
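
    To make that concrete, here is a toy sketch of “pick the next word based on context”; the table and its probabilities are invented for illustration:

    ```python
    import random

    # A language model reduced to its essentials: a conditional
    # distribution over next words. A real LLM computes this with a
    # neural net, but the output is still a probability table, not a
    # store of facts.
    NEXT_WORD = {
        ("the", "sky", "is"): {"blue": 0.90, "grey": 0.07, "falling": 0.03},
    }

    def generate(context):
        dist = NEXT_WORD[context]
        # sampling, not truth-checking: "falling" comes out 3% of the
        # time no matter what the sky is actually doing
        return random.choices(list(dist), weights=list(dist.values()))[0]

    print(generate(("the", "sky", "is")))
    ```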

    • Kyrgizion@lemmy.world · 1 month ago

      Thinking Wikipedia or other unbiased sources will still be available in a decade or so is wishful thinking. Once the digital stranglehold kicks in, it’ll be mandatory sign-in with a government-vetted identity provider, and your sources will be limited to what that government allows you to see. MMW.

      • namingthingsiseasy@programming.dev · 1 month ago

        Wikipedia is quite resilient - you can even put it on a USB drive. As long as you have a free operating system, there will always be ways to access it.

        • Dead_or_Alive@lemmy.world · 1 month ago

          I keep a partial local copy of Wikipedia on my phone and a backup device with an app called Kiwix. Great if you need certain material in remote areas with no internet access.

      • coolmojo@lemmy.world · 1 month ago

        Yes. There will be no websites, only AI and apps. You will be automatically logged in to the apps. Linux and Lemmy will be banned. We will be classed as hackers and criminals. We will probably have to build our own mesh network for communication, or access it from a secret location.

  • JustAPenguin@lemmy.world · 1 month ago

    The thing that annoys me most is that studies have already been done on LLMs showing that, when trained on their own output, they produce increasingly noisy output.

    Sources (unordered):

    Whatever nonsense Muskrat is spewing, it is factually incorrect. He won’t be able to successfully retrain any model on generated content. At least, not an LLM, if he wants a successful product. If anything, he will be producing a model that is heavily trained on censored datasets.
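
    The mechanism those studies describe can be reproduced with a toy resampling loop (illustrative Python; the “model” here is just an empirical word distribution):

    ```python
    import random
    from collections import Counter

    # Toy model collapse: each generation "trains" (re-estimates word
    # frequencies) on a corpus sampled from the previous generation's
    # model. Once a rare word misses a sample it is gone for good, so
    # the distribution can only lose diversity over generations.
    random.seed(42)
    vocab = list("abcdefghij")
    probs = {w: 1 / len(vocab) for w in vocab}  # generation-0 model

    for gen in range(1, 16):
        corpus = random.choices(list(probs), weights=list(probs.values()), k=50)
        probs = {w: n / len(corpus) for w, n in Counter(corpus).items()}
        print(f"gen {gen:2d}: {len(probs)}/{len(vocab)} words survive")
    ```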

    • brucethemoose@lemmy.world · 1 month ago

      It’s not so simple; there are papers on zero-data “self play” and other schemes for using other LLMs’ output.

      Distillation is probably the only one you’d want for a pretrain, specifically.

  • Elgenzay@lemmy.ml · 1 month ago

    Aren’t you not supposed to train LLMs on LLM-generated content?

    Also, he should call it Grok 5: so powerful that it skips over 4. That would be very characteristic of him.

    • brucethemoose@lemmy.world · 1 month ago

      There’s some nuance.

      Using LLMs to augment data, especially for fine-tuning (not training the base model), is a sound method. The DeepSeek paper, for instance, is famous for using generated reasoning traces.

      Another is using LLMs to generate logprobs of text, and training not just on the text itself but on the probability a frontier LLM assigns to every “word.” This is called distillation, though there are variations and complications. It is also great because it’s more power- and time-efficient. Look up Arcee models and their distillation training kit for more on this, and the code to see how it works.
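
      A minimal sketch of that kind of logit distillation in PyTorch (the function name and temperature value are placeholders of mine, not Arcee’s actual kit):

      ```python
      import torch
      import torch.nn.functional as F

      def distillation_loss(student_logits, teacher_logits, temperature=2.0):
          """KL divergence from the teacher's next-token distribution.

          Both tensors are (batch, seq_len, vocab_size): the student is
          pushed toward the teacher's full distribution over every 'word',
          not just the single token the teacher happened to sample.
          """
          t = temperature
          teacher_probs = F.softmax(teacher_logits / t, dim=-1)
          student_logp = F.log_softmax(student_logits / t, dim=-1)
          # t*t rescales gradients to match the unsoftened objective
          return F.kl_div(student_logp, teacher_probs,
                          reduction="batchmean") * (t * t)

      # usage sketch with random stand-in logits
      student = torch.randn(2, 8, 100)
      teacher = torch.randn(2, 8, 100)
      print(distillation_loss(student, teacher))
      ```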

      There are some papers on “self play” that can indeed help LLMs.

      But yes, the “dumb” way, aka putting data into a text box and asking an LLM to correct it, is dumb and dumber, because:

      • You introduce some combination of sampling errors and repetition/overused word issues, depending on the sampling settings. There’s no way around this with old autoregressive LLMs.

      • You possibly pollute your dataset with “filler”

      • In Musk’s specific proposition, it doesn’t even fill knowledge gaps the old Grok has.

      In other words, Musk has no idea WTF he’s talking about. It’s the most boomer, AI-bro, non-techy-ChatGPT-user thing he could propose.