ollama/llm

Latest commit: a897e833b8 by Bruce MacDonald, 2024-01-16 13:48:05 -05:00
    do not cache prompt (#2018)
    - prompt cache causes inference to hang after some time
Name                     Last commit                                                               Date
ext_server               Disable mmap with lora layers (#1985)                                     2024-01-13 23:36:31 -05:00
generate                 Merge pull request #1966 from fpreiss/fpreiss/gen_linux_cuda_detection   2024-01-14 18:00:11 -08:00
llama.cpp @ 328b83de23   revert submodule back to 328b83de23b33240e28f4e74900d1d06726f5eb1        2024-01-10 18:42:39 -05:00
dyn_ext_server.c         Always dynamically load the llm server library                           2024-01-11 08:42:47 -08:00
dyn_ext_server.go        do not cache prompt (#2018)                                              2024-01-16 13:48:05 -05:00
dyn_ext_server.h         Always dynamically load the llm server library                           2024-01-11 08:42:47 -08:00
ggml.go                  add max context length check                                             2024-01-12 14:54:07 -08:00
gguf.go                  add max context length check                                             2024-01-12 14:54:07 -08:00
llama.go                 remove unused fields and functions                                       2024-01-09 09:37:40 -08:00
llm.go                   add max context length check                                             2024-01-12 14:54:07 -08:00
payload_common.go        Merge pull request #1935 from dhiltgen/cpu_fallback                      2024-01-11 15:52:32 -08:00
payload_darwin.go        Always dynamically load the llm server library                           2024-01-11 08:42:47 -08:00
payload_linux.go         Always dynamically load the llm server library                           2024-01-11 08:42:47 -08:00
payload_test.go          Fix up the CPU fallback selection                                        2024-01-11 15:27:06 -08:00
payload_windows.go       Always dynamically load the llm server library                           2024-01-11 08:42:47 -08:00
utils.go                 partial decode ggml bin for more info                                    2023-08-10 09:23:10 -07:00
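The commit messages above hint at the mechanics of this directory; the sketches below illustrate a few of them under stated assumptions. The head commit, "do not cache prompt (#2018)", stops setting a prompt-cache option in dyn_ext_server.go because a cached prompt could cause inference to hang over time. A minimal sketch of what that amounts to, assuming llama.cpp-server-style field names; predictRequest and its JSON tags here are illustrative, not ollama's actual types:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Sketch of the completion request body sent to the embedded llama.cpp
// server. The #2018 fix amounts to no longer setting a cache_prompt
// option, so the server re-evaluates the full prompt on every call
// instead of reusing a cached prefix that could wedge inference.
type predictRequest struct {
	Prompt      string  `json:"prompt"`
	Temperature float32 `json:"temperature"`
	NPredict    int     `json:"n_predict"`
	// A cache_prompt field used to be set to true here; after #2018
	// it is omitted entirely. (Field names are assumptions.)
}

func main() {
	body, _ := json.Marshal(predictRequest{
		Prompt:      "Why is the sky blue?",
		Temperature: 0.8,
		NPredict:    128,
	})
	fmt.Println(string(body))
}
```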
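Several entries (dyn_ext_server.c/.go/.h and the payload_*.go files) share the message "Always dynamically load the llm server library": the GPU/CPU-variant server code ships as shared libraries and the matching one is loaded at runtime. A sketch of the idea, assuming a POSIX dlopen-based loader on Linux; the library path and function names are hypothetical:

```go
package main

/*
#cgo linux LDFLAGS: -ldl
#include <dlfcn.h>
#include <stdlib.h>
*/
import "C"

import (
	"fmt"
	"unsafe"
)

// loadServerLib opens a shared library at runtime, mirroring the
// "always dynamically load" approach: pick the variant library that
// matches the detected hardware, then resolve its entry points.
func loadServerLib(path string) (unsafe.Pointer, error) {
	cPath := C.CString(path)
	defer C.free(unsafe.Pointer(cPath))

	handle := C.dlopen(cPath, C.RTLD_NOW)
	if handle == nil {
		return nil, fmt.Errorf("dlopen %s: %s", path, C.GoString(C.dlerror()))
	}
	return handle, nil
}

func main() {
	h, err := loadServerLib("/tmp/libext_server.so") // hypothetical variant library
	if err != nil {
		fmt.Println("load failed:", err)
		return
	}
	defer C.dlclose(h)
	fmt.Println("server library loaded")
}
```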
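ggml.go, gguf.go, and llm.go all carry "add max context length check". A sketch of such a check, assuming the model's trained context length is available from GGML/GGUF metadata; the constant and function names are illustrative, not ollama's actual API:

```go
package main

import "fmt"

// Assumed trained context length, as read from model metadata.
const maxSupportedCtx = 4096

// checkContextLength validates a requested context size against the
// model's maximum, clamping rather than failing hard. An alternative
// design would return an error instead of clamping.
func checkContextLength(requested int) (int, error) {
	if requested <= 0 {
		return 0, fmt.Errorf("context length must be positive, got %d", requested)
	}
	if requested > maxSupportedCtx {
		fmt.Printf("warning: requested num_ctx %d exceeds model maximum %d, clamping\n",
			requested, maxSupportedCtx)
		return maxSupportedCtx, nil
	}
	return requested, nil
}

func main() {
	ctx, _ := checkContextLength(8192)
	fmt.Println("using context length:", ctx)
}
```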
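Finally, utils.go's "partial decode ggml bin for more info" points at header-only decoding: reading just the start of a model file to learn its format without loading tensor data. A sketch that inspects only the leading magic (plus the version for GGUF), assuming little-endian files; the file path is hypothetical:

```go
package main

import (
	"encoding/binary"
	"fmt"
	"os"
)

// sniffModel partially decodes a model file: it reads the 4-byte magic
// (and, for GGUF, the version) and stops there, leaving tensors unread.
func sniffModel(path string) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	var magic uint32
	if err := binary.Read(f, binary.LittleEndian, &magic); err != nil {
		return err
	}
	switch magic {
	case 0x46554747: // "GGUF" read as a little-endian uint32
		var version uint32
		if err := binary.Read(f, binary.LittleEndian, &version); err != nil {
			return err
		}
		fmt.Println("GGUF model, version", version)
	case 0x67676d6c: // legacy "ggml" magic
		fmt.Println("legacy GGML model")
	default:
		fmt.Printf("unknown magic 0x%08x\n", magic)
	}
	return nil
}

func main() {
	if err := sniffModel("model.bin"); err != nil { // hypothetical path
		fmt.Println(err)
	}
}
```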