third-party-mirrors / ollama
ollama / llm

Latest commit 06b31e2e24 by Michael Yang (2024-05-03 13:18:28 -07:00):
quantize any fp16/fp32 model

- FROM /path/to/{safetensors,pytorch}
- FROM /path/to/fp{16,32}.bin
- FROM model:fp{16,32}
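The commit message above lists three new `FROM` source forms for importing fp16/fp32 models. A minimal Modelfile sketch illustrating them (the paths and the model tag below are hypothetical examples, not taken from the repository):

```
# Modelfile: the three source forms from the commit message.
FROM /path/to/model-f16.bin      # a local fp16/fp32 .bin file
# FROM /path/to/safetensors-dir  # or a safetensors/pytorch checkpoint directory
# FROM llama3:fp16               # or an existing fp16/fp32 model tag
```

With such a Modelfile, the quantize path this commit adds would be exercised when creating a model from a full-precision source, e.g. via `ollama create` with a quantization type selected (the exact flag accompanying this change is an assumption of this sketch).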
ext_server/               llm: add back check for empty token cache                                   2024-04-30 17:38:44 -04:00
generate/                 Do not build AVX runners on ARM64                                           2024-04-26 23:55:32 -06:00
llama.cpp @ 952d03dbea    update llama.cpp commit to 952d03d                                          2024-04-30 17:31:20 -04:00
patches/                  Fix clip log import                                                         2024-04-26 09:43:46 -07:00
filetype.go               quantize any fp16/fp32 model                                                2024-05-03 13:18:28 -07:00
ggla.go                   …
ggml.go                   quantize any fp16/fp32 model                                                2024-05-03 13:18:28 -07:00
gguf.go                   fixes for gguf (#3863)                                                      2024-04-23 20:57:20 -07:00
llm_darwin_amd64.go       …
llm_darwin_arm64.go       …
llm_linux.go              …
llm_windows.go            Move nested payloads to installer and zip file on windows                   2024-04-23 16:14:47 -07:00
llm.go                    quantize any fp16/fp32 model                                                2024-05-03 13:18:28 -07:00
memory.go                 gpu: add 512MiB to darwin minimum, metal doesn't have partial offloading overhead (#4068)    2024-05-01 11:46:03 -04:00
payload.go                Move nested payloads to installer and zip file on windows                   2024-04-23 16:14:47 -07:00
server.go                 Removing go routine calling .wait from load.                                2024-05-01 18:51:10 +00:00
status.go                 …