Challenges in Deciphering the Secrets of Large Language Models

Squire1039@lemm.ee · edit-2 4 months ago


orclev@lemmy.world · 4 months ago

Yeah, pretty much this. My understanding of how LLMs function is that they operate on statistical associations between words, which loosely amount to categories in the Category Theory sense. Basically, the training phase classifies words into categories based on the examples in the training input. Then, when you feed it a prompt, it just uses those categories to parse and “solve” your prompt. It’s not “mysterious”; it’s just opaque because it’s an incredibly complicated model. That’s exactly the sort of thing that people are really bad at working with, but that computers are really good at.
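To make the “statistical associations” part a bit more concrete, here’s a toy sketch in Python. It is nowhere near how a real LLM works (those learn vector representations with a neural network rather than keeping count tables), and the function names `train_bigrams` and `most_likely_next` plus the tiny corpus are made up for illustration. It only shows the basic idea of learning which words tend to follow which from example text, then using those counts to answer a “prompt”.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count how often each word is followed by each other word."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def most_likely_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical training input, chosen only for illustration.
corpus = "the cat sat on the mat the cat ate the fish"
model = train_bigrams(corpus)

print(most_likely_next(model, "the"))  # prints "cat"
print(most_likely_next(model, "dog"))  # prints "None" (never seen in training)
```

Even in this tiny version the “model” is just a table of counts, and you can still read it directly. Scale that up to billions of learned parameters and you get something that’s opaque in practice without being mysterious in principle.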