- cross-posted to:
- emacs@communick.news
I really like the functionality you offer here, very interesting!
You may be interested in the llm package in GNU ELPA, which handles the LLM connection and lets users use whichever LLM they wish. It already supports Ollama (as of last week). I’m trying to convince authors of packages like yours to make the switch, so packages can focus on functionality instead of LLM connection details.
Very interesting offer. I need to see if your package can provide all the low-level functionality needed to implement my ideas. Does your package provide streaming with the Ollama LLM provider? Can I filter results with streaming enabled? Can I insert generation results into any buffer at any point with streaming enabled? Can I change the model, temperature, and system message to implement agent-like functions? Thank you for your response :)
Yes, we can stream with Ollama and, in general, stream with any model to a point in a buffer.
I’m not sure what you mean by filtering results, can you give an example?
You can change the model, temperature, and system messages. There’s no more sophisticated support for agent workflows, but if there’s something you think might be a good fit, and it is sufficiently general, I’m open to implementing it.
I’m not sure what you mean by filtering results, can you give an example?
Not all models can reply with code alone, without surrounding quotes and explanation. So to use such a reply in a code file, we need to filter out only the code part before inserting it into the code buffer.
Thank you. I will see if I can switch to your package as a backend.
Ah, got it. That is indeed a problem I’d like to solve. If you look at the OpenAI integration, I already have code to solve it there, but how to extend it to everything else has been an open question that I’ll eventually have to figure out. The interfaces involved are also not clear. Any insight you’ve come up with is likely to be helpful, so please don’t hesitate to share.
My solution is to ask the LLM to return the result in some known format, such as a markdown code block, and then process the model output line by line, checking whether each line matches the prefix/suffix patterns and changing the state of a simple parser state machine accordingly. All data before and including the prefix, and after and including the suffix, is dropped. Line-by-line processing is the key here.
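The state machine described above can be sketched in a few lines. This is a minimal, language-agnostic illustration in Python (an Emacs Lisp version would follow the same logic); it assumes the model wraps code in standard triple-backtick markdown fences, which serve as the prefix/suffix patterns:

```python
def extract_code(lines):
    """Line-by-line state machine: keep only lines inside the
    markdown code fence, dropping the fence lines themselves and
    everything outside them."""
    in_code = False  # parser state: are we inside a fenced block?
    kept = []
    for line in lines:
        if line.strip().startswith("```"):
            # Fence line (prefix or suffix): flip the state and
            # drop the fence line itself.
            in_code = not in_code
            continue
        if in_code:
            kept.append(line)
    return kept

# Example model reply, processed line by line as it would arrive
# when streaming:
reply = [
    "Here is the function you asked for:",
    "```python",
    "def add(a, b):",
    "    return a + b",
    "```",
    "Let me know if you need anything else.",
]
print(extract_code(reply))
```

Because the state lives in a single flag and each line is handled independently, the same loop works incrementally on streamed output: feed each completed line as it arrives and insert only the lines kept while inside the fence.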