While a model is loading, VRAM metrics fluctuate, so prefer loading onto a GPU that has no model actively loading, or wait for in-flight loads to finish; otherwise a race on the free-memory reading can lead to OOM errors.
github.com/ollama/ollama