Joined 11 months ago
Cake day: October 17th, 2023

  • There are already companies out there generating what they term small language models - basically hybrid models of, say, GPT-3.5 plus a large volume of corporate data - but they are all cloud based.

    I think you will find most of these are not small language models, but are instead the thing I described above: an LLM like GPT plus a search engine. Even small language models require millions of texts to train, and they only perform very specialised tasks.



  • That’s because LLMs don’t do that.

    The companies that offer those services basically do some tricks behind the curtain.

    Let’s say you want an LLM to learn your corporate docs. LLMs can’t do that, because they need millions of texts from across the internet just to learn to speak English… You can’t feed in your 1,000 docs and 10,000 emails, point to them, and say “Forget the billion documents you ingested and pay attention to this… but also retain the ability to speak English.”

    What they actually implement is a standard text search engine that returns matching paragraphs from the relevant documents, then prompts the LLM with something like "This paragraph may contain an answer to user question X. If it does, please paraphrase it."
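
To make that concrete, here is a rough sketch of the search-then-prompt pattern. Everything in it is an assumption for illustration: the function names (`search`, `build_prompt`), the toy keyword scoring, and the sample documents are made up, and a real product would use a proper search index and would send the prompt to whatever LLM API it uses. The sketch stops at building the prompt text.

```python
import re
from collections import Counter

# Hypothetical stopword list, just enough for the toy example below.
STOPWORDS = {"the", "a", "an", "is", "are", "of", "and", "on", "to", "in", "when", "be", "by"}


def tokenize(text):
    """Lowercase the text and split it into words, dropping stopwords."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]


def search(question, paragraphs, top_k=1):
    """Toy keyword search: rank paragraphs by how often they contain question words."""
    query_terms = set(tokenize(question))
    scored = []
    for paragraph in paragraphs:
        counts = Counter(tokenize(paragraph))
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            scored.append((score, paragraph))
    # Highest score first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [paragraph for _, paragraph in scored[:top_k]]


def build_prompt(question, paragraph):
    """Wrap the retrieved paragraph in the kind of prompt described above."""
    return (
        f'This paragraph may contain an answer to the user question "{question}". '
        "If it does, please paraphrase it.\n\n"
        f"Paragraph: {paragraph}"
    )


# Made-up stand-ins for the corporate docs.
docs = [
    "Expense reports must be submitted by the fifth working day of each month.",
    "The office is closed on public holidays and during the last week of December.",
]

question = "When are expense reports due?"
hits = search(question, docs)
prompt = build_prompt(question, hits[0]) if hits else None
print(prompt)
```

The point is that the LLM never learns the documents. The search step picks the text and the model only rephrases what it is handed in the prompt.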