DeepSeek R1 just got a 2X speed boost, the code for the boost was written by R1 itself!

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · 2 days ago

DeepSeek R1 just got a 2X speed boost, the code for the boost was written by R1 itself!

piggy [they/them]@hexbear.net · edit-2 1 day ago

I’m going to say 2 things that are going to be very unpopular but kinda need to be heard.

DeepSeek is turning this place into /r/OpenAI but red, which is incredibly lame
If LLMs are significantly helping your development workflow, you are doing grunt work, you’re not improving your skills, and you’re not working on problems that have any significant difficulty beyond memorizing multiplication tables type recall but for tech.

This optimization is actually grunt work, it’s not a new discovery, it’s simply using SIMD instructions on matrices something that should have been done in the first place either by hand or by a compiler.

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · 1 day ago

The reality is that most code is very boring, and a lot of optimizations are a result of doing really basic things like this. A model being able to look through the code, notice patterns, and then let you know these kinds of obvious improvements is in fact very useful. It’s not different than using a profiler to find bottlenecks. Having done development for over two decades, I don’t feel like combing through the code to find these kinds of things is a really good use of my time or that it’s improving my skills in any way.

piggy [they/them]@hexbear.net · edit-2 1 day ago

This type of tooling isn’t new and doesn’t require AI models. Performance linters exist in many languages. Rubocop perf, perlint in python, eslint perf rules etc. For C++, clang-tidy and cpp-perf exist.

The only reason LLMs are in this space is because there is a lack of good modern tooling in many languages. Jumping straight to LLMs is boiling the ocean (literally and figuratively).

Not only that but if we’re really gonna argue that “most code is very boring”. That already negates your premise, most boring code isn’t really highly perf sensitive and unique enough to be treated individually through LLMs. Directly needing to point out SIMD instructions in your C++ code basically shows that your compiler tool chain sucks or you’re writing your code in such a “clever” way that it isn’t getting picked up. This is an optimization scenario from 1999.

Likewise if you’re not looking through the code you’re not actually understanding what the performance gaps are or if the LLM is making new ones by generating sub-optimal code. Sometimes the machine spirits react to the prayer protocols and sometimes they don’t. That’s the level of technology you’re arguing at. These aren’t functional performance translations being applied. Once your system is full of this kind of junk, you won’t actually understand what’s going on or how things practically work. Standard perf linters are already not side effects free in some cases but they publish their side effects. LLMs cannot do this by comparison. That’s software engineering, it’s mostly risk management and organization. Yes it’s boring.

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · 1 day ago

While this type of tooling isn’t new, what the LLM can do is qualitatively different from your basic linter. The power of the tool comes from being able to identify specific patterns in a particular code base. The linters, as the name implies, simply look for a set of common that were encoded in them.

Meanwhile power consumption, which is the one legitimate criticism of LLMs, is precisely what DeepSeek architecture addresses. And there’s every reason to expect that we’ll be seeing further progress here. In fact that’s already happening as we speak https://www.reuters.com/technology/artificial-intelligence/alibaba-releases-ai-model-it-claims-surpasses-deepseek-v3-2025-01-29/

That already negates your premise, most boring code isn’t really highly perf sensitive and unique enough to be treated individually through LLMs.

No, that doesn’t negate my premise at all. Just because code is boring doesn’t mean it’s easy to optimize or to notice problems. The thing that LLMs do well is looking at large volumes of data and identifying patterns within it.

Likewise if you’re not looking through the code you’re not actually understanding what the performance gaps are or if the LLM is making new ones by generating sub-optimal code.

That’s just a straw man, because there’s no reason why you wouldn’t be looking through your code. What LLM does is help you find areas of the code that are worth looking at.

It’s so weird to me how people always have this reaction whenever new technology shows up. Yes, there is a lot of hype around LLMs right now, they’re not a panacea, but that doesn’t mean we should throw the baby out with the bath water. There are legitimate uses for this tech, and it can save you time. Understanding what good uses for it are and the limitations of the tech is far more productive than simply rejecting it entirely. You do you of course.

piggy [they/them]@hexbear.net · edit-2 24 hours ago

That’s just a straw man, because there’s no reason why you wouldn’t be looking through your code. What LLM does is help you find areas of the code that are worth looking at.

It’s not a strawman because classifying unperformant code is a different task than generating performant replacement code. LLM can only generate code via it’s internal weights + input it doesn’t guarantee that that code is compilable, performant, readable, understandable, self documenting or much of anything.

The performance gain here is coincidental simply because the generated code uses functions that call processor features directly rather than get optimized into processor features by a compiler. LLM classifiers are also statistically analyzing the AST for performance they aren’t actually performing real static analysis of the AST or it’s compiled version. It doesn’t calculate a BigO or really know how to reason through this problem, it’s just primed that when you write the for loop to sum, that’s “slower” than using _mm_add_ps. It doesn’t even know which cases of the for loop compile down to a _mm_add_ps instruction on which compilers and which optimization levels.

Lastly you injected this line of reasoning when you basically said “why would I do this boring stuff as a programmer when I can get the LLM to do it”. It’s nice that there’s a tool that you can politely ask to parse your garbage and replace with other garbage that happens to use a function that’s more performant. But not only is this not Software Engineering, but a performant dot product is a solved problem at EVERY level of abstraction. This programming equivalent of tech bros reinventing the train every 5 years.

The fact that this is needed is a problem in and of itself with how people are building this software. This is machine spirit communion with technojargon. Instead of learning how to vectorize algorithms you’re feeding your garbage code through a LLM to produce garbage code with SIMD instructions in it. That is quite literally stunting your growth as a Software Engineer. You are choosing to ignore learning how things actually work because it’s too hard to parse through the existing garbage. A SIMD dot product algo is literally a 2 week college junior homework assignment.

Understanding what good uses for it are and the limitations of the tech is far more productive than simply rejecting it entirely.

I quite literally pointed several limitations in the post you replied to and in this post from a Software Engineering perspective.

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · 24 hours ago

It’s not a strawman because not only is classifying unperformant code is a different task than generating performant replacement code. LLM can only generate code via it’s internal weights + input it doesn’t guarantee that that code is compilable, performant, readable, understandable, self documenting or much of anything.

Yet, nothing stops you as a developer from reading this code. Furthermore, the real value of DeepSeek R1 is that it’s explaining its reasoning for how it arrived at a solution. That itself can be useful to read through and can give you ideas on how you might want to tackle it. Here’s concrete toy example for you:

<think> Alright, so I need to write a Clojure function that prints the current time. Hmm, where do I start? I’ve heard Clojure is a functional programming language that runs on the Java Virtual Machine. Okay, so maybe I can leverage some Java libraries for this.

First, I remember that in Java, there’s a class called java.util.Date which represents dates and times. Maybe I can use that in Clojure as well. But wait, how do I call Java classes from Clojure? Oh right, you just refer to them using their fully qualified names.

So, I think I’ll need to import java.util.Date or refer to it directly in my function. Then, maybe create a Date object which represents the current time. That should give me something to work with.

But Clojure has its own way of handling dates and times too, right? There’s the clojure.java-time library that provides a more idiomatic way to handle dates in Clojure. I wonder if I should use that instead for better readability and modern features.

Assuming I decide to stick with the built-in Java Date class, how do I get the current time? Oh, there’s a method called System.currentTimeMillis() which returns the current time in milliseconds since the epoch. Maybe I can use that to create a Date object.

So, putting that together, I might write something like (new java.util.Date()). But then, how do I print this date in a human-readable format? Java has SimpleDateFormat for formatting dates. So, I’d need to import that as well.

Wait, Clojure allows me to use Java’s SimpleDateFormat directly. So, I can create a formatter and then format the Date object with it. The pattern string would determine how the date is displayed. For example, “HH:mm:ss” would give me hours, minutes, and seconds in 24-hour format.

Putting this all together into a Clojure function: I’ll define a function that creates a Date object representing now, then use SimpleDateFormat to format it, and finally print it out using println.

But what about dependencies? Do I need to add any additional libraries beyond the standard Java libraries? For java.util.Date and SimpleDateFormat, I think those are part of the standard JDK, so no extra dependencies should be needed.

Alternatively, if I choose to use clojure.java-time, which is a more modern approach, I would need to include that library in my project. But for simplicity, maybe sticking with Java’s utilities is better since they’re already available.

So, step by step:

Create a Date object representing the current time.
Create a SimpleDateFormat with the desired format string.
Format the Date object using the formatter.
Print the formatted string to the console.

I should also consider any potential issues, like thread safety with SimpleDateFormat, but since this is a simple function and not part of a highly concurrent application, I think it’s acceptable for now.

Testing the function in my mind: when I call it, it should print out the current time in HH:mm:ss format. Does that cover everything? Yes, I think so.

Wait, maybe I can also include some examples or documentation on how to use this function. That way, anyone reading the code will know what it does and how to invoke it.

So, putting it all together, here’s what I have in mind for the function: </think>

;; A Clojure function to print the current time
(defn print-current-time []
  ;; Create a Date object representing the current time
  ;; Create a SimpleDateFormat with HH:mm:ss format
  (let [current-date (java.util.Date.)
          formatter (new java.text.SimpleDateFormat "HH:mm:ss")]
    ;; Format the date and print it
    (println (.format formatter current-date))))

Example usage:

(print-current-time)

This function will output the current time in the “HH:mm:ss” format when called. You can invoke it by simply calling print-current-time in your Clojure environment.

Lastly you injected this line of reasoning when you basically said “why would I do this boring stuff as a programmer when I can get the LLM to do it”. It’s nice that there’s a tool that you can politely ask to parse your garbage and replace with other garbage that happens to use a function that’s more performant. But not only is this not Software Engineering, a dot product is a fucking solved problem at EVERY level of abstraction from bits all the way up to your favorite interpreted language.

Once again, what I actually said was that LLM can help you identify bits of code that might be interesting, and this is valuable in a large project. This is exactly the same task you’d use a profiler for.

I quite literally pointed several limitations in the post you replied to and in this post from a Software Engineering perspective.

And I quite literally explained why the tool is still useful. In particular, the argument you keep making that you would just blindly copy/paste code the LLM produces is a complete straw man. A competent engineer will read what the LLM says and use it to inform a solution they understand.

piggy [they/them]@hexbear.net · 24 hours ago

Okay let me ask this question:

Who is this useful for? Who is the target audience for this?

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · 23 hours ago

It’s useful for me, I’m the target audience for this. I’m working on a React project right now, and I haven’t touched Js in close to a decade. I know what I want to do conceptually, and I have plenty of experience designing applications. However, I’m not familiar with the nitty gritty of how React works and how to do what I want with it. This tool saves me a ton of time googling these things and wasting hours on sites like stack overflow.

Tachanka [comrade/them]@hexbear.net · 2 days ago

It is 2025

AI writes code to update itself

I still have to load the dishwasher by hand

I still have to change the baby’s diapers

I still have to go to work tomorrow

Do things ever happen?

Some say nothing ever happens.

Others argue that everything always happens.

I love the real movement.

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · 2 days ago

Hohsia [he/him]@hexbear.net · 2 days ago

A thing I’ve noticed with deepseek is that it operates in a very system-oriented manner (it carefully plans out how to answer your question when you use thinking mode and it’s actually quite interesting) whereas chatgpt just tells you how long it “thought” and ultimately regurgitates an output that it is statistically likely. So we actually get to see a bit of the black box in my view

QuillcrestFalconer [he/him]@hexbear.net · 2 days ago

ChatGPT o1 hides it’s chain of thought so you don’t even really know what the ‘reasoning’ is

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · 2 days ago

yeah it’s fascinating to see how the sausage is made

D61 [any]@hexbear.net · 2 days ago

Ask ChatGPT what to do and it will ask to have a gallon of movie concession stand nacho cheese flavored sauce poured directly onto the server rack.

REgon [they/them]@hexbear.net · 2 days ago

Tbh that would be for the best, so I’ll give it to chatGPT on a technicality

Nakoichi [they/them]@hexbear.net · 2 days ago

I have no mouth and I must cream

dat_math [they/them]@hexbear.net · edit-2 2 days ago

the code for the boost was written by R1 itself!

Pretty neat, but this kind of thing will impress me a lot more when it’s genuinely new and creative output, not just the result of being prompted to optimize an existing routine in a prescribed way (using simd instructions to calculate inner products)

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · edit-2 2 days ago

I’m still impressed because it was able to look at the existing solution, recognize a bottleneck, and write the code to address it. Most code is very boring, you don’t need genius solutions in it. And this could be of huge help for developers as well where you could have it analyze code and suggest where improvements can be made. It could be faster than profiling things.

QuillcrestFalconer [he/him]@hexbear.net · 2 days ago

My question is who is naming these functions ‘qX_K_q8_K’

piggy [they/them]@hexbear.net · edit-2 20 hours ago

This is a quantization function. It’s a fairly “math brained” name I agree, but the function is called qX_K_q8_K because it quantizes a value with a quantization index of X (unknown) to one with a quantization index of 8 (bits) which correlates to the memory usage. The 0 vs K portions are how it does rounding, 0 means it does rounding by equal distribution (without offset), and K means it creates a distribution that is more fine grained around more common values and is more rough around least common values. e.g. I have a data set that has a lot of values between 4 and 5 but not a lot of 10s. I have lets say 10 brackets between 4 and 5 but only 3 between 5 and 10.

Basically it’s a lossy compression for a data set into a specific enumeration (roughly correlates with size), so it’s a way to given 1,000,000 numbers from 1-1000000, of putting their values into a range of numbers based on the q level How using different functions affects the output of models is more voodoo than anything else. You get better “quality” output from higher memory space, but quality is a complex metric and doesn’t necessarily map to factual accuracy in the output, just statistical correlation with the model’s data set.

An example of a common quantizer is an analog to digital converter. It must take continuous values from a wave that goes 0 to 1 and transform them into digital values of 0 and 1 with a specific sample rate.

Taking a 32 bit float and copying the value into 32 bit float is an identity quantizer.

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · 2 days ago

C devs love cryptic names :)

lilypad [she/her, null/void]@hexbear.net · edit-2 2 days ago

Writing lisp:

(defun generate-eight-new-magic-numbers-for-system-x ()...)

Writing c:

struct mnums* g8_nmn_sx () {...}

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · 2 days ago

s-exps are the one true syntax and every other syntax was a mistake, I will die on this hill

lilypad [she/her, null/void]@hexbear.net · 2 days ago

Nakoichi [they/them]@hexbear.net · 2 days ago

s-exps is when you fuck your Playstation

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · 2 days ago

🤣

TheDrink@hexbear.net · edit-2 2 days ago

deleted by creator

Chump [he/him]@hexbear.net · 2 days ago

Have you tried getting it from ollama? Don’t know the details of your error, but I suspect it can at least pick up after a discount e

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · 2 days ago

ollama doesn’t seem to be smart enough to continue if it’s interrupted and just restarts the whole download

Chump [he/him]@hexbear.net · 1 day ago

Damn that’s irritating

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · 1 day ago

Particularly so given that client’s whole job is to download multi gig files.

TheDrink@hexbear.net · edit-2 2 days ago

deleted by creator

DeepSeek R1 just got a 2X speed boost, the code for the boost was written by R1 itself!

DeepSeek R1 just got a 2X speed boost, the code for the boost was written by R1 itself!

ggml : x2 speed for WASM by optimizing SIMD