Cazzah@alien.top to Self-Hosted Main@selfhosted.forum • Early Santa: what is missing in selfhosted
1 year ago
That’s because LLMs don’t do that.
The companies that offer those services basically do some tricks behind the curtain.
Like, let’s say you want an LLM to learn your corporate docs. LLMs can’t do that, because they need millions of texts from across the internet just to learn to speak English. You can’t feed in your 1,000 docs and 10,000 emails, point to them, and say “Forget the billion documents you ingested and pay attention to this, but also retain the ability to speak English.”
What they actually implement is a standard text search engine that returns matching paragraphs from the relevant documents, and then prompts the LLM with something like: "This paragraph may contain an answer to user question X. If it does, please paraphrase it."
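To make that concrete, here is a rough sketch of the search-then-prompt idea. It's only an illustration: `tokenize`, `search` and `build_prompt` are names I made up, the "search engine" is just a toy word-overlap ranking, the sample docs are invented, and there's no actual LLM call. In a real setup the finished prompt would be sent to whatever model you're using.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search(question, paragraphs, top_k=1):
    """Toy keyword search (hypothetical helper): rank paragraphs by how
    often they contain words from the question."""
    question_words = set(tokenize(question))
    scored = []
    for paragraph in paragraphs:
        counts = Counter(tokenize(paragraph))
        score = sum(counts[word] for word in question_words)
        if score > 0:
            scored.append((score, paragraph))
    # Highest score first; paragraphs with no shared words are dropped.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [paragraph for _, paragraph in scored[:top_k]]


def build_prompt(question, paragraph):
    """Wrap the retrieved paragraph in the kind of instruction described above."""
    return (
        f'This paragraph may contain an answer to the user question "{question}". '
        "If it does, please paraphrase it.\n\n"
        f"Paragraph: {paragraph}"
    )


# Invented sample data standing in for "your corporate docs".
docs = [
    "Expense reports must be submitted by the fifth business day of each month.",
    "The VPN client is installed from the internal software portal.",
]
question = "How do I install the VPN client?"

hits = search(question, docs)
prompt = build_prompt(question, hits[0])
print(prompt)  # this string is what would be handed to the LLM
```

Note the model never "learns" the docs here. It only ever sees the one paragraph the search step picked out, which is why the quality of the search matters so much.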
I think you will find most of these are not small language models, but are instead the thing I described above: an LLM like GPT plus a search engine. Even small language models require millions of texts and only perform very specialised tasks.