Hi,
Just like the title says:
I’m trying to run:
With:
- koboldcpp:v1.43 using HIPBLAS on a 7900XTX / Arch Linux
Running:
--stream --unbantokens --threads 8 --usecublas normal
I get very limited output with lots of repetition.
I mostly didn’t touch the default settings:
Does anyone know how I can make things run better?
EDIT: Sorry for multiple posts, Fediverse bugged out.
Yeah, I think you need to set the contextsize and ropeconfig. The documentation isn’t completely clear, and in some places it sort of implies that these should be autodetected from the model when using a recent version. Still, the first thing I would try is setting them explicitly, as this definitely looks like an encoding issue.
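For reference, here is a rough sketch of what setting both explicitly could look like, reusing the flags from the original post. The model path and the numbers are placeholders I made up, not known-good values for this model: check the model card for its trained context length and RoPE settings. I am also assuming koboldcpp is started via `python koboldcpp.py`, and that `--ropeconfig` takes the frequency scale followed by the frequency base.

```shell
# Placeholder values: replace with what the model card says for your model.
MODEL="model.gguf"        # hypothetical path, not the model from the post
CONTEXT_SIZE=4096         # assumed context length
ROPE_FREQ_SCALE=1.0       # assumed RoPE frequency scale (1.0 = unscaled)
ROPE_FREQ_BASE=10000      # assumed RoPE frequency base

# Same flags as the original command, plus explicit context and RoPE settings.
ARGS="--stream --unbantokens --threads 8 --usecublas normal --contextsize $CONTEXT_SIZE --ropeconfig $ROPE_FREQ_SCALE $ROPE_FREQ_BASE"

# Dry run: prints the command for review. Drop "echo" to actually launch.
echo python koboldcpp.py "$MODEL" $ARGS
```

If the output improves with explicit values, the autodetection was probably picking the wrong ones for this model.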