Blake Mizerany
c95f97689b
utils/upload: init
2024-04-02 14:15:21 -07:00
Blake Mizerany
618eb5b909
registry: multipart push
2024-04-02 13:40:23 -07:00
Daniel Hiltgen
841adda157
Fix windows lint CI flakiness
2024-04-02 12:22:16 -07:00
Daniel Hiltgen
0035e31af8
Bump to b2581
2024-04-02 11:53:07 -07:00
Blake Mizerany
eb75418be9
build/blob: test ParseRef round-trip
2024-04-02 11:45:01 -07:00
Blake Mizerany
9959da05de
build/blob: break out test refs for other tests/fuzzing
2024-04-02 11:38:10 -07:00
Daniel Hiltgen
c863c6a96d
Merge pull request #3218 from dhiltgen/subprocess
...
Switch back to subprocessing for llama.cpp
2024-04-02 10:49:44 -07:00
Blake Mizerany
aff7970628
build: remove superfluous parseCompleteRef
2024-04-01 23:41:42 -07:00
Blake Mizerany
628f1feb36
build: back to taking manifests as []byte
...
It's nicer to have the manifests be an opaque []byte, rather than a
struct. This way users of the build package don't need to know about the
internal structure of the manifests. The registry can interpret the
manifests as it sees fit, while letting build keep its own Go type of
manifest which is easier to work with in the build package.
2024-04-01 23:18:58 -07:00
Blake Mizerany
ce3125afd5
registry: add New and take a minio client as argument
2024-04-01 22:53:49 -07:00
Blake Mizerany
f488652ba7
build: make Build accept only refs without builds
2024-04-01 22:12:43 -07:00
Blake Mizerany
2318ed2919
build: remove unused manifest()
2024-04-01 21:59:38 -07:00
Blake Mizerany
b1b8be33d9
build: cleanup error names and other things
2024-04-01 21:57:34 -07:00
Blake Mizerany
876f7eab81
build: move Manifest from internal/blobstore to build
...
It was getting confusing to have the arbitrary handling of manifests in
the blobstore. It also prevented us from using model.Ref in the
blobstore because of cyclic dependencies.
This is much easier to grok now.
2024-04-01 21:43:30 -07:00
Blake Mizerany
7cfc8a0838
build/blob: fix awkward Ref type
2024-04-01 21:25:18 -07:00
Daniel Hiltgen
1f11b52511
Refined min memory from testing
2024-04-01 16:48:33 -07:00
Daniel Hiltgen
526d4eb204
Release gpu discovery library after use
...
Leaving the cudart library loaded kept ~30m of memory
pinned in the GPU in the main process. This change ensures
we don't hold GPU resources when idle.
2024-04-01 16:48:33 -07:00
Daniel Hiltgen
0a74cb31d5
Safeguard for noexec
...
We may have users that run into problems with our current
payload model, so this gives us an escape valve.
2024-04-01 16:48:33 -07:00
Daniel Hiltgen
10ed1b6292
Detect too-old cuda driver
...
"cudart init failure: 35" isn't particularly helpful in the logs.
2024-04-01 16:48:33 -07:00
Daniel Hiltgen
4fec5816d6
Integration test improvements
...
Cleaner shutdown logic, a bit of response hardening
2024-04-01 16:48:18 -07:00
Daniel Hiltgen
0a0e9f3e0f
Apply 01-cache.diff
2024-04-01 16:48:18 -07:00
Daniel Hiltgen
58d95cc9bd
Switch back to subprocessing for llama.cpp
...
This should resolve a number of memory leak and stability defects by allowing
us to isolate llama.cpp in a separate process and shut down when idle, and
gracefully restart if it has problems. This also serves as a first step to be
able to run multiple copies to support multiple models concurrently.
2024-04-01 16:48:18 -07:00
Patrick Devine
3b6a9154dd
Simplify model conversion (#3422)
2024-04-01 16:14:53 -07:00
Michael Yang
d6dd2ff839
Merge pull request #3241 from ollama/mxyng/mem
...
update memory estimations for gpu offloading
2024-04-01 13:59:14 -07:00
Michael Yang
e57a6ba89f
Merge pull request #2926 from ollama/mxyng/decode-ggml-v2
...
refactor model parsing
2024-04-01 13:58:13 -07:00
Michael Yang
12ec2346ef
Merge pull request #3442 from ollama/mxyng/generate-output
...
fix generate output
2024-04-01 13:56:09 -07:00
Michael Yang
1ec0df1069
fix generate output
2024-04-01 13:47:34 -07:00
Michael Yang
91b3e4d282
update memory calculations
...
count each layer independently when deciding gpu offloading
2024-04-01 13:16:32 -07:00
Michael Yang
d338d70492
refactor model parsing
2024-04-01 13:16:15 -07:00
Philipp Gillé
011bb67351
Add chromem-go to community integrations (#3437)
2024-04-01 11:17:37 -04:00
Saifeddine ALOUI
d124627202
Update README.md (#3436)
2024-04-01 11:16:31 -04:00
Jesse Zhang
b0a8246a69
Community Integration: CRAG Ollama Chat (#3423)
...
Corrective Retrieval Augmented Generation Demo, powered by Langgraph and Streamlit 🤗
Support:
- Ollama
- OpenAI APIs
2024-04-01 11:16:14 -04:00
Blake Mizerany
fd411b3cf6
registry: commit Manifest
2024-03-31 18:20:19 -07:00
Blake Mizerany
04f38cf3f4
registry: commit manifest on successful /v1/push
2024-03-31 15:09:24 -07:00
Blake Mizerany
c0eddb10fd
registry: use exact match on path
2024-03-31 15:01:26 -07:00
Blake Mizerany
60ef0e6b4a
oweb: remove Fault
...
Also, fix typo in the comment.
2024-03-31 15:00:25 -07:00
Blake Mizerany
48c60c01e2
registry: move req/resp types to registry/apitype
2024-03-31 12:23:10 -07:00
Blake Mizerany
eb2c442a01
oweb: make DecodeUserJSON take a field name
...
This allows for better error messages when decoding fails. For example,
instead of:
{"code":"invalid_json","message":"unexpected end of JSON input"}
We now get:
{"code":"invalid_json","field":"manifest","message":"unexpected end of JSON input"}
2024-03-31 11:36:51 -07:00
Blake Mizerany
c87fe7df48
client/ollama: make Error.Message optional
2024-03-31 11:12:50 -07:00
Blake Mizerany
5182a1dfb1
client/ollama: document Do
2024-03-31 11:04:20 -07:00
Blake Mizerany
a32e7857b2
client/ollama: docs for Error type
2024-03-31 11:00:07 -07:00
Blake Mizerany
6acc205de0
client/ollama: install and use (*Client).HTTPClient
2024-03-31 10:54:17 -07:00
Blake Mizerany
f6e02d4bc7
client/ollama: Do take a *Client
2024-03-31 10:52:56 -07:00
Yaroslav
e6fb39c182
Update README.md (#3378)
...
Plugins list updated
2024-03-31 13:10:05 -04:00
Blake Mizerany
e1d457c73e
client/ollama: report invalid server error response with raw bytes
2024-03-31 09:43:03 -07:00
Blake Mizerany
cd5df121a5
client: include Status in json Error response for symmetry.
...
Also, remove RawBody from error, which was previously used for
debugging.
2024-03-31 09:30:01 -07:00
Blake Mizerany
112ffed189
oweb: move Error and Do to client/ollama
...
This lets users of the ollama client library import only the
client/ollama package, rather than the oweb package as well, when
inspecting errors.
2024-03-31 09:25:07 -07:00
Blake Mizerany
c49947dcf5
init
2024-03-31 09:24:53 -07:00
sugarforever
e1f1c374ea
Community Integration: ChatOllama (#3400)
...
* Community Integration: ChatOllama
* fixed typo
2024-03-30 22:46:50 -04:00
Jeffrey Morgan
06a1508bfe
Update 90_bug_report.yml
2024-03-29 10:11:17 -04:00