I don’t have many specific requirements, and GPT4All has mostly been working well for me so far. That said, my latest use case for GPT4All is helping me plan a new Python-based project with examples as code snippets, and it lacks one specific quality-of-life feature: a “Copy Code” button.
There is an open issue on GPT4All’s GitHub, but since there is no guarantee that the feature will ever be implemented, I thought I’d take this opportunity to explore whether there are other tools out there like GPT4All that offer a ChatGPT-like experience in a local environment. I’m neither a professional developer nor a sysadmin, so a lot of self-hosting guides go over my head, which is what drew me to GPT4All in the first place: it’s very accessible to non-developers like me. That said, I’m open to suggestions and willing to learn new skills if that’s what it takes.
I’m running on Linux w/ AMD hardware: Ryzen 7 5800X3D processor + Radeon RX 6750 XT.
Any suggestions? Thanks in advance!
OpenWebUI is a superb front-end and supports just about any backend you can think of (including Ollama for locally hosted LLMs), and it has some really nice features like pipelines that can extend its functionality however you might need. It definitely has the “Copy Code” feature built in, and it outputs markdown for regular documentation purposes.
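If you want to try it, here’s a minimal sketch of a Docker-based setup, adapted from OpenWebUI’s own Docker instructions. It assumes Ollama is already running on the host machine at its default port (11434); adjust accordingly if yours lives elsewhere:

# Run OpenWebUI in Docker, pointing at an Ollama instance on the host machine
$ docker run -d -p 3000:8080 \
    --add-host=host.docker.internal:host-gateway \
    -v open-webui:/app/backend/data \
    --name open-webui --restart always \
    ghcr.io/open-webui/open-webui:main

Then browse to http://localhost:3000 to use the chat UI.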
OpenWebUI is also my go-to. It works nicely with RunPod’s vLLM template, so I can run local models but also use heavier ones at minimal cost when it suits me.
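For context: as far as I understand, that template is basically vLLM’s OpenAI-compatible server under the hood, so OpenWebUI can talk to it like any other OpenAI endpoint. A roughly equivalent local invocation would look something like this (the model name and context length here are just illustrative placeholders, not what the template ships with):

# Start vLLM's OpenAI-compatible server with a model from Hugging Face
$ vllm serve Qwen/Qwen2.5-14B-Instruct --max-model-len 8192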
Thanks for the tip about OpenWebUI. After watching a video about its features, I want to learn more.
Would you mind sharing a little bit about your setup? For example, do you have a home lab or do you just run OpenWebUI w/ Ollama on a spare laptop or something? I thought I saw some documentation suggesting that this stack can be run on any system, but I’m curious how other people run it in the real world. Thanks!
Sure, I run OpenWebUI in a docker container from my TrueNAS SCALE home server (it’s one of their standard packages, so basically a 1-click install). From there I’ve configured API use with OpenAI, Gemini, Anthropic and DeepSeek (part of my job involves evaluating the performance of these big models for various in-house tasks), along with pipelines for some of our specific workflows and MCP via mcpo.
I previously had my Ollama installation in another Docker container but didn’t like having a big GPU in my NAS box, so I moved it to its own machine. I’m mostly interested in testing small/tiny models there. I again have Ollama running in a Docker container (just the official Docker image), but this time on a bare-metal Debian server, and I configured another OpenWebUI pipeline to point to it (OpenWebUI lets you select which LLM(s) you want to use on a conversation-by-conversation basis, so there’s no problem having a bunch of them hooked up at the same time).
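In case it’s useful, a minimal sketch of that Ollama container, taken from the official image’s docs. The :rocm tag is what you’d want for an AMD GPU like the OP’s, and the device flags assume a standard ROCm setup on the host (the model pulled at the end is just an example):

# Run the official Ollama image with AMD GPU (ROCm) passthrough
$ docker run -d --device /dev/kfd --device /dev/dri \
    -v ollama:/root/.ollama -p 11434:11434 \
    --name ollama ollama/ollama:rocm

# Pull a small model to test with
$ docker exec -it ollama ollama pull llama3.2:3b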
Thank you, this is really helpful to inform my setup!
If you’re looking for a web UI and a simple way to host one yourself, nothing beats the “llama.cpp” project. It includes a “llama-server” program that hosts a simple web server (with a chat webapp) and an OpenAI-compatible API endpoint. It now also supports multimodality (with models that support it), meaning you can, for example, upload an image and ask the assistant to describe it. An example command to set up such a web server would be:
$ llama-server --threads 6 -m /path/to/model.gguf
Or, for multimodality support (like asking an AI to describe an image), use:
$ llama-server --threads 6 --mmproj /path/to/model/mmproj-F16.gguf -m /path/to/model/model.gguf
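Since the server speaks the OpenAI API, you can also query it from any OpenAI-compatible client or plain curl. This assumes the default port 8080 (adjust if you passed --port), and the prompt is just an illustration:

# Ask the locally hosted model a question via the OpenAI-compatible endpoint
$ curl http://localhost:8080/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"messages": [{"role": "user", "content": "Show me how to read a CSV file in Python."}]}'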
LM Studio, although I’ve never tried the Linux version.
I have. Shipping it only as an AppImage is a weird choice, but it works well.
You can squeeze a lot more performance out of your hardware with a newer framework and a model tailored to your GPU and task.
I’d recommend:
Kobold.cpp ROCm; follow the quick-install guide here: https://github.com/YellowRoseCx/koboldcpp-rocm/?tab=readme-ov-file#quick-linux-install
Download this quantization, which fits nicely in your VRAM pool and is specifically tuned for coding and planning, then select it in Kobold.cpp: https://huggingface.co/mradermacher/Qwen3-14B-Esper3-i1-GGUF/blob/main/Qwen3-14B-Esper3.i1-IQ4_NL.gguf
Use the “corporate” UI in Kobold.cpp in your browser. If that doesn’t work well, Kobold.cpp also works as a generic OpenAI endpoint, which you can access from pretty much any app, like https://openwebui.com/
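For example, once the server is running you can hit its OpenAI-compatible API directly. This sketch assumes Kobold.cpp’s default port of 5001, and the prompt is just an illustration:

# Query Kobold.cpp's OpenAI-compatible endpoint
$ curl http://localhost:5001/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"messages": [{"role": "user", "content": "Outline a plan for a small Python CLI tool."}]}'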
Page Assist, which runs in your browser and interfaces with Ollama.
VS Code with the open-source Cline extension. Easily the best open option, and it works everywhere. It’s an excellent coding and planning agent; I use it for everything.