ollama
Ollama inference engine backend.
Classes
OllamaEngine
Bases: InferenceEngine
Ollama backend via its native HTTP API.
Source code in src/openjarvis/engine/ollama.py
Functions
stream_full
async
stream_full(messages: Sequence[Message], *, model: str, temperature: float = 0.7, max_tokens: int = 1024, **kwargs: Any) -> AsyncIterator[StreamChunk]
Yield StreamChunk objects, including any tool_calls.
Unlike the default stream_full in the base class (which wraps
stream() and drops tools), this implementation posts to /api/chat with
the tools taken from kwargs and parses tool_calls out of the streamed
response. On an HTTP 400 it retries without tools, mirroring
generate()'s behaviour for models that don't support tool calling.
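The parsing step described above can be sketched as follows. The JSON line shape matches Ollama's native /api/chat streaming format; StreamChunk here is a hypothetical stand-in for openjarvis's real dataclass, and its field names are assumptions.

```python
import json
from dataclasses import dataclass, field
from typing import Any

# Hypothetical stand-in for openjarvis's StreamChunk; field names are assumed.
@dataclass
class StreamChunk:
    content: str = ""
    tool_calls: list[dict[str, Any]] = field(default_factory=list)

def parse_chat_line(line: str) -> StreamChunk:
    """Parse one newline-delimited JSON line from Ollama's /api/chat stream."""
    payload = json.loads(line)
    msg = payload.get("message", {})
    return StreamChunk(
        content=msg.get("content", ""),
        # tool_calls is absent on plain text chunks, so default to an empty list.
        tool_calls=msg.get("tool_calls", []) or [],
    )

# Example: a streamed line carrying a tool call (shape per Ollama's native API).
line = (
    '{"message": {"role": "assistant", "content": "", '
    '"tool_calls": [{"function": {"name": "get_weather", '
    '"arguments": {"city": "Paris"}}}]}, "done": false}'
)
chunk = parse_chat_line(line)
```

In the real engine these lines arrive incrementally over the streaming HTTP response, and each parsed chunk is yielded to the caller as it is decoded.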