Magiwarriorx@lemmy.world to LocalLLaMA@sh.itjust.works · English · 1 year ago

Guide on setting up a local GGML model?

I’ve been messing around with GPTQ models with ExLlama in ooba and have gotten 33b models running smoothly at 3k context, but I was looking to try something bigger than my VRAM can hold.

However, I’m clearly doing something wrong, and the koboldcpp.exe documentation isn’t clear to me. Does anyone have a good setup guide? My understanding is koboldcpp.exe is preferable for GGML, as ooba’s llama.cpp doesn’t support GGML at >4k context yet.

actually-a-cat@sh.itjust.works
1 year ago
Those are OpenCL platform and device identifiers; you can use clinfo to find out which numbers correspond to which hardware on your system.
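To make the clinfo step concrete, here is a rough Python sketch that lists platform/device number pairs. It assumes clinfo is on your PATH and that your build supports the `-l` (list) option, which prints lines like `Platform #0: ...` followed by `Device #0: ...`; the function names are made up for illustration, and the exact output layout may differ between clinfo versions.

```python
import re
import subprocess

# Assumed layout of `clinfo -l` output (may vary by clinfo version):
#   Platform #0: <platform name>
#    `-- Device #0: <device name>
PLATFORM_RE = re.compile(r"Platform #(\d+):\s*(.+)")
DEVICE_RE = re.compile(r"Device #(\d+):\s*(.+)")


def parse_clinfo_list(text):
    """Return (platform_id, device_id, platform_name, device_name) tuples."""
    pairs = []
    platform = None  # most recently seen (id, name) platform header
    for line in text.splitlines():
        match = PLATFORM_RE.search(line)
        if match:
            platform = (int(match.group(1)), match.group(2).strip())
            continue
        match = DEVICE_RE.search(line)
        if match and platform is not None:
            pairs.append(
                (platform[0], int(match.group(1)), platform[1], match.group(2).strip())
            )
    return pairs


def list_opencl_devices():
    """Run `clinfo -l` and parse its output (hypothetical helper)."""
    result = subprocess.run(
        ["clinfo", "-l"], capture_output=True, text=True, check=True
    )
    return parse_clinfo_list(result.stdout)
```

Calling `list_opencl_devices()` gives you one tuple per device; the first two numbers in the tuple for your GPU are the platform and device IDs, in that order. As far as I know, those are what koboldcpp's `--useclblast` option expects, but check `--help` for your version.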

Also note that if you’re building kobold.cpp yourself, you need to build with LLAMA_CLBLAST=1 for OpenCL support to exist in the first place, or with LLAMA_CUBLAS=1 for CUDA.
