forked from third-party-mirrors/ollama
- only reload the running llm if the model has changed, or the options for loading the running model have changed - rename loaded llm to runner to differentiate from loaded model image - remove logic which keeps the first system prompt in the generation context