PR by Xuan-Son Nguyen for `llama.cpp`:

> This PR provides a big jump in speed for WASM by leveraging SIMD instructions for `qX_K_q8_K` and `qX_0_q8_0` dot product functions.
>
> …
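For context on what the PR speeds up, here is a rough scalar sketch of the kind of block-wise dot product those functions compute. The block size, struct layout, and names below are illustrative assumptions, not the actual llama.cpp definitions; the real kernels (and this PR's WASM versions) vectorize the inner loop with SIMD.

```c
#include <stdint.h>

#define BLOCK_SIZE 32  /* illustrative block size, not llama.cpp's */

/* One quantized block: a shared scale plus small integer values. */
typedef struct {
    float  scale;
    int8_t q[BLOCK_SIZE];
} block_q8_sketch;

/* Dot product of two rows stored as n_blocks quantized blocks:
 * accumulate integer products per block, then apply both scales. */
static float dot_q8_q8_sketch(const block_q8_sketch *a,
                              const block_q8_sketch *b,
                              int n_blocks) {
    float sum = 0.0f;
    for (int i = 0; i < n_blocks; ++i) {
        int32_t acc = 0;
        for (int j = 0; j < BLOCK_SIZE; ++j) {
            acc += (int32_t)a[i].q[j] * (int32_t)b[i].q[j];
        }
        sum += (float)acc * a[i].scale * b[i].scale;
    }
    return sum;
}
```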
This is a quantization function. It’s a fairly “math brained” name, I agree, but the function is called `qX_K_q8_K` because it quantizes a value with a quantization index of X (unknown) to one with a quantization index of 8 (bits), which correlates with the memory usage. The 0 vs. K portions are how it does rounding: 0 means it rounds by equal distribution (without offset), and K means it creates a distribution that is more fine-grained around more common values and coarser around the least common values. E.g., I have a data set with a lot of values between 4 and 5 but not a lot of 10s, so I have, let’s say, 10 brackets between 4 and 5 but only 3 between 5 and 10.
Basically it’s lossy compression of a data set into a specific enumeration (which roughly correlates with size): given 1,000,000 numbers from 1 to 1,000,000, it’s a way of putting their values into a smaller range of numbers based on the q level. How using different functions affects the output of models is more voodoo than anything else. You get better “quality” output from more memory, but quality is a complex metric and doesn’t necessarily map to factual accuracy in the output, just statistical correlation with the model’s data set.
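A minimal sketch of the “0”-style (uniform, offset-free) rounding described above; the function name and the 8-bit target are assumptions for illustration, not llama.cpp’s actual code:

```c
#include <math.h>
#include <stdint.h>

/* Round every value in a block onto an evenly spaced grid defined by a
 * single scale, with no offset ("equal distribution"). Returns the scale
 * so that x[i] is approximately out[i] * scale. */
static float quantize_block_uniform(const float *x, int n, int8_t *out) {
    float amax = 0.0f;                        /* largest magnitude in the block */
    for (int i = 0; i < n; ++i) {
        float a = fabsf(x[i]);
        if (a > amax) amax = a;
    }
    float scale = amax / 127.0f;              /* equal-width steps, centered on 0 */
    float inv   = scale != 0.0f ? 1.0f / scale : 0.0f;
    for (int i = 0; i < n; ++i) {
        out[i] = (int8_t)roundf(x[i] * inv);  /* plain rounding, no offset */
    }
    return scale;
}
```

A K-style scheme would instead spend its levels unevenly, putting more of them where the data is dense, as in the 4-to-5 bracket example above.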
An example of a common quantizer is an analog-to-digital converter: it takes continuous values from a wave that ranges from 0 to 1 and transforms them into digital values of 0 and 1 at a specific sample rate.
Taking a 32-bit float and copying the value into a 32-bit float is an identity quantizer.
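Those two analogies as a throwaway sketch (nothing here comes from llama.cpp; it is just the idea in code):

```c
#include <stdint.h>

/* The ADC analogy: sample a continuous signal in [0, 1] at a fixed rate
 * and keep only a 1-bit level per sample. */
static void adc_1bit(const float *signal, int n_samples, uint8_t *bits) {
    for (int i = 0; i < n_samples; ++i) {
        bits[i] = signal[i] >= 0.5f ? 1 : 0;  /* threshold into 0 or 1 */
    }
}

/* The identity "quantizer": a 32-bit float in, the same 32-bit float out. */
static float quantize_identity(float x) {
    return x;
}
```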
My question is: who is naming these functions `qX_K_q8_K`?
C devs love cryptic names :)
Writing Lisp:
(defun generate-eight-new-magic-numbers-for-system-x ()...)
Writing C:
struct mnums* g8_nmn_sx () {...}
s-exps are the one true syntax and every other syntax was a mistake, I will die on this hill
s-exps is when you fuck your Playstation
🤣