Something to handle code, text and math.

  • thingsiplay@lemmy.ml
    4 days ago

    I run local LLMs with 8 GB of VRAM and 32 GB of system RAM, thanks to Vulkan support. My GPU is an RX 7600. I can run qwen/qwen3.6-35B-A3B-Q4_K_M.gguf and gemma-4-26B-A4B-it-Q4_K_M.gguf, for example. The model fills the GPU first and the rest goes into system RAM, which is slower, but at least bigger models fit and run. I just need to lower the context length, which has a big impact (my current custom value is 64k, for anyone who wants to know).
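
    To make the GPU-first, RAM-second split concrete, here is a rough Python sketch. It assumes all layers are the same size and that a fixed chunk of VRAM is held back for context and buffers. The function name, the reserve value, and the example numbers are all made up for illustration; this is not how any particular runtime actually decides the split.

```python
def split_layers(model_gb: float, n_layers: int, vram_gb: float,
                 reserve_gb: float = 1.5) -> tuple[int, int]:
    """Estimate how many layers fit in VRAM; the rest go to system RAM.

    Assumptions (hypothetical, for illustration only):
    - every layer takes model_gb / n_layers of memory
    - reserve_gb of VRAM is kept free for context and buffers
    """
    per_layer_gb = model_gb / n_layers
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    gpu_layers = min(n_layers, int(usable_gb // per_layer_gb))
    return gpu_layers, n_layers - gpu_layers


# Hypothetical 20 GB model with 40 layers on an 8 GB card.
gpu, cpu = split_layers(model_gb=20.0, n_layers=40, vram_gb=8.0)
print(f"{gpu} layers on GPU, {cpu} layers in system RAM")
```

    A longer context needs more memory of its own, which in this sketch would mean a bigger reserve_gb and fewer layers on the GPU.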

    But this is still highly limited and not competitive at all. I mostly play around with it and occasionally ask a question here or there, and that’s it. So if you are serious about running local models, you need something faster and with more than just 8 GB of VRAM.