DeepSeek R1 just got a 2X speed boost, the code for the boost was written by R1 itself!

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · 2 days ago

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · 1 day ago

While this type of tooling isn’t new, what the LLM can do is qualitatively different from your basic linter. The power of the tool comes from being able to identify specific patterns in a particular code base. Linters, as the name implies, simply look for a set of common patterns that were encoded in them.

Meanwhile, power consumption, which is the one legitimate criticism of LLMs, is precisely what the DeepSeek architecture addresses. And there’s every reason to expect that we’ll be seeing further progress here. In fact, that’s already happening as we speak: https://www.reuters.com/technology/artificial-intelligence/alibaba-releases-ai-model-it-claims-surpasses-deepseek-v3-2025-01-29/

That already negates your premise, most boring code isn’t really highly perf sensitive and unique enough to be treated individually through LLMs.

No, that doesn’t negate my premise at all. Just because code is boring doesn’t mean it’s easy to optimize or to notice problems. The thing that LLMs do well is looking at large volumes of data and identifying patterns within it.

Likewise if you’re not looking through the code you’re not actually understanding what the performance gaps are or if the LLM is making new ones by generating sub-optimal code.

That’s just a straw man, because there’s no reason why you wouldn’t be looking through your code. What LLM does is help you find areas of the code that are worth looking at.

It’s so weird to me how people always have this reaction whenever new technology shows up. Yes, there is a lot of hype around LLMs right now, and they’re not a panacea, but that doesn’t mean we should throw the baby out with the bath water. There are legitimate uses for this tech, and it can save you time. Understanding what the good uses for it are, and what the limitations of the tech are, is far more productive than simply rejecting it entirely. You do you of course.

piggy [they/them]@hexbear.net · edit-2 1 day ago

That’s just a straw man, because there’s no reason why you wouldn’t be looking through your code. What LLM does is help you find areas of the code that are worth looking at.

It’s not a strawman, because classifying unperformant code is a different task than generating performant replacement code. An LLM can only generate code via its internal weights plus its input; it doesn’t guarantee that the code is compilable, performant, readable, understandable, self-documenting, or much of anything.

The performance gain here is coincidental: the generated code simply uses functions that call processor features directly rather than relying on a compiler to optimize the code into those features. LLM classifiers are also only statistically analyzing the AST for performance; they aren’t performing real static analysis of the AST or its compiled version. An LLM doesn’t calculate a Big O or really know how to reason through this problem; it’s just primed that when you write a for loop to sum, that’s “slower” than using _mm_add_ps. It doesn’t even know which cases of the for loop compile down to a _mm_add_ps instruction on which compilers and at which optimization levels.
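For readers who haven’t seen the intrinsic, here is a minimal sketch of the contrast being described: a plain for loop that sums floats next to a hand-written version that calls _mm_add_ps. The function names sum_scalar and sum_simd are made up for illustration and are not taken from the ggml PR, and the SSE path is only an assumption about the target (it falls back to the scalar loop elsewhere).

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>  /* SSE intrinsics: _mm_loadu_ps, _mm_add_ps, ... */
#endif

/* Plain scalar loop: one float addition per iteration. */
float sum_scalar(const float *x, size_t n) {
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++) {
        acc += x[i];
    }
    return acc;
}

/* Hand-vectorized sum (hypothetical helper): each _mm_add_ps call adds
 * four floats at once. Without SSE it simply calls the scalar loop. */
float sum_simd(const float *x, size_t n) {
#if defined(__SSE__)
    __m128 acc = _mm_setzero_ps();
    size_t i = 0;
    /* Main loop: consume four elements per iteration. */
    for (; i + 4 <= n; i += 4) {
        acc = _mm_add_ps(acc, _mm_loadu_ps(x + i));
    }
    /* Reduce the four lanes to a single value. */
    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    float total = (lanes[0] + lanes[1]) + (lanes[2] + lanes[3]);
    /* Tail: leftover elements when n is not a multiple of four. */
    for (; i < n; i++) {
        total += x[i];
    }
    return total;
#else
    return sum_scalar(x, n);
#endif
}
```

Note that the two versions add the elements in a different order, so their results can differ in the last bits for general floating-point input. As far as I know, that reordering is also why compilers usually need a flag such as -ffast-math before they will vectorize the scalar loop themselves.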

Lastly, you injected this line of reasoning when you basically said “why would I do this boring stuff as a programmer when I can get the LLM to do it”. It’s nice that there’s a tool that you can politely ask to parse your garbage and replace it with other garbage that happens to use a function that’s more performant. But not only is this not Software Engineering, a performant dot product is a solved problem at EVERY level of abstraction. This is the programming equivalent of tech bros reinventing the train every 5 years.

The fact that this is needed is a problem in and of itself with how people are building this software. This is machine spirit communion with technojargon. Instead of learning how to vectorize algorithms, you’re feeding your garbage code through an LLM to produce garbage code with SIMD instructions in it. That is quite literally stunting your growth as a Software Engineer. You are choosing to ignore learning how things actually work because it’s too hard to parse through the existing garbage. A SIMD dot product algo is literally a 2 week college junior homework assignment.
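To give a sense of the scale of that assignment, a SIMD dot product can be sketched roughly as follows. This is an illustrative version under the assumption of an x86 target with SSE, not the code from the ggml PR; dot_scalar and dot_simd are hypothetical names, and a scalar fallback is used when SSE is unavailable.

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>  /* _mm_loadu_ps, _mm_mul_ps, _mm_add_ps */
#endif

/* Reference implementation: multiply and accumulate one pair at a time. */
float dot_scalar(const float *a, const float *b, size_t n) {
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++) {
        acc += a[i] * b[i];
    }
    return acc;
}

/* Vectorized sketch (hypothetical helper): four products per iteration,
 * accumulated lane-wise, then reduced to one float at the end. */
float dot_simd(const float *a, const float *b, size_t n) {
#if defined(__SSE__)
    __m128 acc = _mm_setzero_ps();
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);  /* unaligned loads are allowed */
        __m128 vb = _mm_loadu_ps(b + i);
        acc = _mm_add_ps(acc, _mm_mul_ps(va, vb));
    }
    /* Horizontal reduction of the four partial sums. */
    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    float total = (lanes[0] + lanes[1]) + (lanes[2] + lanes[3]);
    /* Scalar tail for the remaining 0 to 3 elements. */
    for (; i < n; i++) {
        total += a[i] * b[i];
    }
    return total;
#else
    return dot_scalar(a, b, n);
#endif
}
```

The structure is the usual one: a wide main loop, a horizontal reduction, and a scalar tail. Production kernels add unrolling, alignment handling, and per-architecture variants on top of this.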

Understanding what good uses for it are and the limitations of the tech is far more productive than simply rejecting it entirely.

I quite literally pointed out several limitations in the post you replied to and in this post from a Software Engineering perspective.

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · 1 day ago

It’s not a strawman, because classifying unperformant code is a different task than generating performant replacement code. An LLM can only generate code via its internal weights plus its input; it doesn’t guarantee that the code is compilable, performant, readable, understandable, self-documenting, or much of anything.

Yet nothing stops you as a developer from reading this code. Furthermore, the real value of DeepSeek R1 is that it explains its reasoning for how it arrived at a solution. That itself can be useful to read through and can give you ideas on how you might want to tackle the problem. Here’s a concrete toy example for you:

<think> Alright, so I need to write a Clojure function that prints the current time. Hmm, where do I start? I’ve heard Clojure is a functional programming language that runs on the Java Virtual Machine. Okay, so maybe I can leverage some Java libraries for this.

First, I remember that in Java, there’s a class called java.util.Date which represents dates and times. Maybe I can use that in Clojure as well. But wait, how do I call Java classes from Clojure? Oh right, you just refer to them using their fully qualified names.

So, I think I’ll need to import java.util.Date or refer to it directly in my function. Then, maybe create a Date object which represents the current time. That should give me something to work with.

But Clojure has its own way of handling dates and times too, right? There’s the clojure.java-time library that provides a more idiomatic way to handle dates in Clojure. I wonder if I should use that instead for better readability and modern features.

Assuming I decide to stick with the built-in Java Date class, how do I get the current time? Oh, there’s a method called System.currentTimeMillis() which returns the current time in milliseconds since the epoch. Maybe I can use that to create a Date object.

So, putting that together, I might write something like (new java.util.Date()). But then, how do I print this date in a human-readable format? Java has SimpleDateFormat for formatting dates. So, I’d need to import that as well.

Wait, Clojure allows me to use Java’s SimpleDateFormat directly. So, I can create a formatter and then format the Date object with it. The pattern string would determine how the date is displayed. For example, “HH:mm:ss” would give me hours, minutes, and seconds in 24-hour format.

Putting this all together into a Clojure function: I’ll define a function that creates a Date object representing now, then use SimpleDateFormat to format it, and finally print it out using println.

But what about dependencies? Do I need to add any additional libraries beyond the standard Java libraries? For java.util.Date and SimpleDateFormat, I think those are part of the standard JDK, so no extra dependencies should be needed.

Alternatively, if I choose to use clojure.java-time, which is a more modern approach, I would need to include that library in my project. But for simplicity, maybe sticking with Java’s utilities is better since they’re already available.

So, step by step:

1. Create a Date object representing the current time.
2. Create a SimpleDateFormat with the desired format string.
3. Format the Date object using the formatter.
4. Print the formatted string to the console.

I should also consider any potential issues, like thread safety with SimpleDateFormat, but since this is a simple function and not part of a highly concurrent application, I think it’s acceptable for now.

Testing the function in my mind: when I call it, it should print out the current time in HH:mm:ss format. Does that cover everything? Yes, I think so.

Wait, maybe I can also include some examples or documentation on how to use this function. That way, anyone reading the code will know what it does and how to invoke it.

So, putting it all together, here’s what I have in mind for the function: </think>

;; A Clojure function to print the current time
(defn print-current-time []
  ;; Create a Date object representing the current time
  ;; Create a SimpleDateFormat with HH:mm:ss format
  (let [current-date (java.util.Date.)
        formatter (new java.text.SimpleDateFormat "HH:mm:ss")]
    ;; Format the date and print it
    (println (.format formatter current-date))))

Example usage:

(print-current-time)

This function will output the current time in the “HH:mm:ss” format when called. You can invoke it by simply calling print-current-time in your Clojure environment.

Lastly you injected this line of reasoning when you basically said “why would I do this boring stuff as a programmer when I can get the LLM to do it”. It’s nice that there’s a tool that you can politely ask to parse your garbage and replace with other garbage that happens to use a function that’s more performant. But not only is this not Software Engineering, a dot product is a fucking solved problem at EVERY level of abstraction from bits all the way up to your favorite interpreted language.

Once again, what I actually said was that LLM can help you identify bits of code that might be interesting, and this is valuable in a large project. This is exactly the same task you’d use a profiler for.

I quite literally pointed out several limitations in the post you replied to and in this post from a Software Engineering perspective.

And I quite literally explained why the tool is still useful. In particular, the argument you keep making that you would just blindly copy/paste code the LLM produces is a complete straw man. A competent engineer will read what the LLM says and use it to inform a solution they understand.

piggy [they/them]@hexbear.net · 24 hours ago

Okay let me ask this question:

Who is this useful for? Who is the target audience for this?

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · 24 hours ago

It’s useful for me; I’m the target audience for this. I’m working on a React project right now, and I haven’t touched JS in close to a decade. I know what I want to do conceptually, and I have plenty of experience designing applications. However, I’m not familiar with the nitty-gritty of how React works and how to do what I want with it. This tool saves me a ton of time googling these things and wasting hours on sites like Stack Overflow.

piggy [they/them]@hexbear.net · edit-2 22 hours ago

I know what I want to do conceptually, and I have plenty of experience designing applications.

How does AI help you actually traverse the concepts of React that you admit you don’t have nitty gritty knowledge of how they work in terms of designing your application? React is a batteries included framework that has specific ways of doing things that impact the design and concepts that are technically feasible within React itself.

For example React isn’t really optimized to crunch a ton of data performantly so if you’re getting constant data updates over a web socket from multiple points and you want some or all the changes to be reflected you’re gonna have a bad time vs something that has finer grained change controls out of the box such as Angular.

How does AI help you choose between functional and class based React components? How much of your application is doing typical developer copy-pasta instead of creating HOCs for similar functionalities? How did AI help you with that? How is AI helping apply concepts like SOLID into the design of your component tree? How does AI help you decide how to architect components and their children that need to have a lifecycle outside of the typical change-binding flow?

This in my opinion is the crux of the issue, AI cannot solve this problem for you nor can it reasonably explain it in a technical way beyond parroting the vagaries of what I said above. It cannot confer understanding of complex abstract concepts that are fuzzy and have grey areas. It can tell you something may not work explicitly but it cannot educate you realistically on the tradeoffs.

It seems to me that your answer boils down to “code monkey stuff”. AI might help you swing a pickaxe, but it’s not good at explaining where the mine is going to collapse based on the type of rock you’re digging in. Another way of thinking about it is that you could build a building to the “building code” but it will still collapse. AI can explain the building code and loosely verify that you built something to it, but it cannot validate that your building is going to stay standing nor can it practically tell you what you need to change.

My problem with AI tools boils down to this. Software is a medium of communication. It communicates the base of a problem and the technical process of solving it. Software Engineering is a field that attempts to create strong patterns of communication and practices in order to efficiently organize the production of Software. The software industry at large (where most programmers get exposed to the process of building software) often eschews this discipline because of scientific management (the idea you can simply manage a process through fiduciary/managerial knowledge rather than domain knowledge) and the need for instant development to maintain fictional competitive advantage and fictional YoY growth. The industry welcomes AI for 2 reasons:

1. It can code monkey… eventually. Why pay programmers when you can ask CahpGBT to do it?
2. It can fix the problem of needing to deliver without knowing what you’re doing… eventually. It fixes the problem of communication without relying on building up the knowledge and practice of Software Engineering. In essence, why have people know this discipline and its practical application when you can continue to have the blind leading the blind because ChadGTP can see for us?

This is a disservice to programmers everywhere, especially younger ones, because it destroys the social reproduction of the capacity to build scalable software and replaces it with, you guessed it, machine rites. In practice it’s the apotheosis of Conway’s Law in the software industry. We build needlessly complex software that works coincidentally, and soon that software will be analyzed, modified, and ultimately created by a tool that is an overly complex statistical model that also works through the coincidence of statistical approximations.

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · 22 hours ago

How does AI help you actually traverse the concepts of React that you admit you don’t have nitty gritty knowledge of how they work in terms of designing your application?

It helps me by showing me the syntax and patterns that map to what I’m trying to do conceptually. By pointing me in the right direction, it saves me time searching for these things. I don’t know why that’s so difficult for you to understand.

For example React isn’t really optimized to crunch a ton of data performantly so if you’re getting constant data updates over a web socket from multiple points and you want some or all the changes to be reflected you’re gonna have a bad time vs something that has finer grained change controls out of the box such as Angular.

That’s not a problem I’m solving, and in practice most UIs don’t actually deal with a lot of data because the human user is the limiting factor. I’m working on an application that’s doing fairly vanilla things here.

How does AI help you choose between functional and class based React components? How much of your application is doing typical developer copy-pasta instead of creating HOCs for similar functionalities? How did AI help you with that? How is AI helping apply concepts like SOLID into the design of your component tree? How does AI help you decide how to architect components and their children that need to have a lifecycle outside of the typical change-binding flow?

That’s not a problem it’s solving for me. As I’ve explained to you, I already have plenty of experience and I know how I like to structure applications. I’m used to using re-frame in Clojure, and I’m just looking how to do similar patterns in React. The AI does an excellent job of helping me discover them.

This in my opinion is the crux of the issue, AI cannot solve this problem for you nor can it reasonably explain it in a technical way beyond parroting the vagaries of what I said above. It cannot confer understanding of complex abstract concepts that are fuzzy and have grey areas. It can tell you something may not work explicitly but it cannot educate you realistically on the tradeoffs.

I don’t need it to confer understanding of abstract concepts to me. I need it to show me common patterns within a particular library that map to the concepts I’m already familiar with. I don’t need it to educate me on any trade offs.

Meanwhile, the problems you’re fixating on are not inherent to AI in any way and have always existed in the software industry. Cargo culting is a term for a reason: no AI has been necessary for people to do that, nor does the absence of AI prevent it from happening. So your whole argument is completely misdirected, because AI is not the problem here. People who were going to cargo cult were gonna do that regardless of the tooling.

This is a disservice to programmers everywhere, especially younger ones, because it destroys the social reproduction of the capacity to build scalable software and replaces it with, you guessed it, machine rites.

That’s absolute nonsense. It doesn’t destroy the capacity to build scalable software any more than Stack Overflow does.

We build needlessly complex software that works coincidentally, and soon that software will be analyzed, modified, and ultimately created by a tool that is an overly complex statistical model that also works through the coincidence of statistical approximations.

You’re saying this as if it wasn’t the case long before AI showed up on the scene. You’re making up a giant straw man of how you pretend software development works which is utterly divorced from what we see happening in the real world. The AI doesn’t change this one bit.

piggy [they/them]@hexbear.net · edit-2 21 hours ago

You’re making up a giant straw man of how you pretend software development works which is utterly divorced from what we see happening in the real world. The AI doesn’t change this one bit.

Commenting this under a post where an AI has spit out a dot product function optimization for an existing dot product function that’s already ~150-250 lines long depending on the architectural implementation, of which there are about 6. The PR for which has an interaction that is two devs finger-pointing about who is responsible for writing tests. The PR for which notes that the original and new function often don’t give the correct answer. Just an amazing response. Chef’s kiss.

What a wonderful way to engage with my post. You win bud. You’re the smartest. This industry would never mystify a basic concept that’s about 250 years old with a 716 line PR through its inability to communicate, organize and follow an academic discipline.

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · 21 hours ago

What a wonderful way to engage with my post. You win bud. You’re the smartest.

Amazing counterpoint you’ve mustered there when presented with the simple fact that all the problems you’re describing have already been happening long before AI showed up on the scene. Way to engage in good faith dialogue. Bravo!

ggml : x2 speed for WASM by optimizing SIMD