Coding with LLMs (Claude Code, OpenAI Codex) is often presented as the ‘killer app’ for Generative AI. But looking at data, it seems the one piece of the puzzle missing is actual cost. …
So are we assuming here that LLMs won’t become more efficient over time? GPT-3 was a frontier model just a few years ago, and its performance blew everyone’s mind at the time. I can now run an equivalent LLM on my personal computer. Why can’t we expect that in a few years Claude Sonnet-level capability will be possible locally?
What’s the cost of the compute you need to run something locally?
The majority of people don’t have 32 GB of VRAM to run something remotely as capable.
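For a rough sense of where a figure like 32 GB comes from, here is a back-of-the-envelope sketch. It assumes memory is dominated by the weights (parameter count times bits per weight) and ignores the KV cache, activations, and runtime overhead, so real usage will be higher. The model sizes and the helper names `weight_memory_gb` and `fits` are made up for illustration, not taken from any particular model or tool.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


def fits(params_billions: float, bits_per_weight: float, vram_gb: float) -> bool:
    """True if the weights alone fit in the given VRAM (optimistic lower bound)."""
    return weight_memory_gb(params_billions, bits_per_weight) <= vram_gb


if __name__ == "__main__":
    # Illustrative sizes only: 7B, 32B and 70B parameters at 16-, 8- and 4-bit weights.
    for params in (7, 32, 70):
        for bits in (16, 8, 4):
            gb = weight_memory_gb(params, bits)
            print(f"{params}B @ {bits}-bit: ~{gb:.1f} GB, "
                  f"weights fit in 32 GB: {fits(params, bits, 32)}")
```

Under these assumptions, quantization matters about as much as parameter count: halving the bits per weight halves the memory needed for the weights.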
I’m still pretty new to Lemmy and the fediverse, although I really enjoy it. I’ve noticed some strong dislike of anything and everything AI, to the point that I think it’s clouding some people’s ability to really see the situation at hand. That said, I get a lot of people’s skepticism: a lot of AI projects are nonsense and things have been overpromised. On top of that, there’s the more than problematic issue of data centers and the environment. But I think people don’t fully grasp how insane some of the achievements of neural nets are, or how fast the field is developing. Having models that pretty much pass the Turing test was pure sci-fi just a few years ago, let alone models that solve legitimate mathematical conjectures and other hard problems in science.
They could, but what’s the plan here, exactly? That all these for-profit companies that are currently publishing models for free, like Qwen, will continue to do so in the future?
Why not? Why does Microsoft develop its .NET ecosystem? Why does Google develop Go/Dart? It costs them a lot of money and they give it away for free.
The answer is: they don’t earn money on it directly, but these tools are a way to tie programmers to their cloud services. If you use .NET, you’ll probably end up on Azure. If you use Go, you’ll probably use GCP.
So I suspect the same will happen with LLMs. At some point they will say: “Hey, you can use this LLM however you want, but since you’re already using it, you may want to know our platform is optimized for it.”
That’s not an accurate analogy.
LLM providers are SaaS providers, meaning that even if they were to give you the source of all the tools they use, there’s a fundamental limit to how much you can self-host.
A better comparison would be Google giving away its indexed search data: you might be able to run an infinitesimal portion of it on your hardware, but you will never match the results Google offers on its website, and since it’s a monopoly, you would be at a permanent disadvantage.
The same goes for all these AI companies. They are an oligopoly that gives away free models that are subpar compared to their cloud offerings. Self-hosted LLMs will never stand a chance.
Why can’t we expect that in a few years Claude Sonnet-level capability will be possible locally?
Because when you’re old enough to remember what AIM chatbots could do 25 years ago, it stops being impressive what today’s chatbots can do…
It seems “new” because everyone hated it and it was just a novelty back then.
But if you read up on them, they did 90% of what modern ones do. And if they had access to today’s computing, the only explanation for why they still suck so much is that no one has ever wanted them.
The oligarchs just decided it didn’t matter.