I run local LLMs and they cost me $0 per query. I don’t plan to charge myself more than that at any point, even if the AI bubble bursts.
Which ones are you running?
I’ve found Qwen3-30B-A3B-Thinking-2507 to be the best all-around “do stuff for me” model that fits on my hardware. I’ve mostly been using it for analyzing and summarizing documents I’ve got on my local hard drive: meeting transcripts, books, and so forth. It’s done surprisingly well on those transcripts; I daresay its summaries tease out patterns that a human wouldn’t have had an easy time spotting.
When it comes to creative writing I mix it up with Llama-3.3-70B-Instruct to enrich the text; using multiple models helps keep the output from becoming repetitive and too recognizable in style.
I’ve got Qwen3-Coder-30B-A3B-Instruct kicking around as a programming assistant, but while it’s competent at its job I’ve been finding that the big online models do better (unsurprisingly), so I use those more. Perhaps if I were focusing on code analysis and cleanup I’d use it instead, but when it comes to writing big new classes or applications in one swoop it pays to go with the best right off the bat. Maybe once the IDEs get a little better at integrating LLMs it might catch up.
I’ve been using Ollama as the framework for running them; it’s got a nice simple API, and since it runs in the background it claims and releases memory as demand comes and goes. I used to use KoboldCPP, but I had to manually start and stop it a lot and that got tedious.
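If anyone’s curious what “nice simple API” means in practice, here’s a rough sketch of a one-off summarization call against Ollama’s local HTTP endpoint. The model tag and file name are placeholders rather than my actual setup; adjust both for whatever you’ve pulled locally.

```python
# Minimal sketch: ask a locally hosted model (via Ollama's default HTTP API
# on port 11434) to summarize a transcript. The model tag and file name are
# placeholders -- substitute whatever you actually have installed.
import requests

transcript = open("meeting.txt", encoding="utf-8").read()

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "qwen3:30b-a3b",  # placeholder tag
        "prompt": "Summarize the key points and decisions in this transcript:\n\n" + transcript,
        "stream": False,
    },
    timeout=600,  # local generation on a long document can take a while
)
print(resp.json()["response"])
```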
Really? I get what you’re saying, but you’ll always be paying for the power the machine running the model draws. Depending on your energy price, that is not $0 per query.
It’s so close to zero it makes no difference. It’s not a noticeable factor in deciding whether or not to use it for any given task.
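To put a rough number on “near zero,” here’s the back-of-envelope version. Every figure below is an assumption; plug in your own GPU draw, generation time, and energy price.

```python
# Back-of-envelope electricity cost for a single local query.
# All three inputs are assumptions -- substitute your own numbers.
gpu_draw_watts = 350       # rough full-load draw while generating
seconds_per_query = 60     # generous for a long summarization job
price_per_kwh = 0.15       # USD per kWh; varies a lot by region

kwh_per_query = gpu_draw_watts * seconds_per_query / 3_600_000
cost = kwh_per_query * price_per_kwh
print(f"~${cost:.4f} per query")  # about $0.0009 with these numbers
```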
Training a brand-new model is expensive, but once the model has been created it’s cheap to run. If OpenAI went bankrupt tomorrow and shut down, the models it had trained would just be sold off to other companies, and they’d run them instead, free from the debt burden OpenAI accrued from the research and training costs that went into producing them. That’s actually a fairly common pattern for first movers: they spend a lot of money blazing the trail, and then other companies follow along afterwards and eat their lunch.
It’s cheap to run for one person. It isn’t cheap for a service running it for a large number of users.
That’s great if they actually work. But my experience with the big, corporate-funded models has been pretty freaking abysmal after more than a year of trying to adopt them into my daily workflow. I can’t imagine the performance of local models is better when they’re running on much, much smaller datasets and with much, much less computing power.
I’m happy to be proven wrong, of course, but I just don’t see how it’s possible for local models to compete with the Big Boys in terms of quality… and the quality of the largest models is only middling at best.
You’re free to not use them. Seems like an awful lot of people are using them, though, including myself. They must be getting something out of them, or they’d stop.
Just because a lot of people are using them does not necessarily mean they are actually valuable. Your claim assumes that people are acting rationally about them, and that’s an erroneous assumption to make.
People are falling in “love” with them. Asking them for advice about mental health. Treating them like some kind of all-knowing oracle (or as if they had any intelligence whatsoever), when in reality they know nothing and cannot reason at all.
Ultimately they are immensely effective at creating a feedback loop that preys on human psychology and reinforces dependency on them. It’s a bit like addiction in that way.
Turns out very few people use it that way. Most people use it for far more practical things.
And even if they were mostly using it for that, who are you to decide what is “valuable” for other people? I happen to think that sports are a huge waste of time; does that mean that stadiums are not valuable?