I’ve installed koboldcpp on a ThinkPad X1 with 32 GB RAM and an i7-1355U, no GPU. Sure, it’s only around 1 token/s, but for chat it is still usable (about 15 s per reply). The setup was easier than expected.
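
In case it helps anyone, the CPU-only launch is essentially a single command. A minimal sketch, assuming a quantized GGUF model file (the filename below is just a placeholder) and koboldcpp's standard flags; the thread count and context size are the knobs you'd tune for your own machine:

    # CPU-only launch: without any GPU offload flags koboldcpp runs on the CPU
    # model filename is hypothetical, use whatever GGUF you downloaded
    python koboldcpp.py --model mistral-7b-instruct.Q4_K_M.gguf \
        --threads 8 --contextsize 4096 --port 5001

The web UI then comes up at http://localhost:5001 and you can chat from the browser.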

    • KinkyThoughts@lemmynsfw.com · 8 days ago

      I mean the actual context size to be processed for the message, based on chat history, character cards, world info, etc. And which model?