Jesse Gross 34a75102f7 prompt: Use a single token when estimating mllama context size
Currently we assume that images take 768 tokens of context for
the purposes of clipping old messages that exceed the context window.
However, our mllama implementation stores the full image embedding
in a single token, so this estimate wastes significant context
space.

Ideally, we would handle this more generically and have the
implementation report the number of tokens. However, at the moment
this would just result in a similar set of 'if' conditions in the
runner plus APIs to report it back. So for now, we just keep this
simple.
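A minimal sketch of the idea described above. The constant values come from the message (768 tokens previously assumed per image, one token under mllama); the function and variable names are illustrative, not the actual ollama code.

```go
package main

import "fmt"

const (
	genericImageTokens = 768 // previous per-image assumption for all models
	mllamaImageTokens  = 1   // mllama stores the whole image embedding in one token
)

// estimateImageTokens returns the context-size estimate for a set of
// images, special-casing mllama as the commit describes. Hypothetical
// helper for illustration only.
func estimateImageTokens(isMllama bool, numImages int) int {
	if isMllama {
		return numImages * mllamaImageTokens
	}
	return numImages * genericImageTokens
}

func main() {
	fmt.Println(estimateImageTokens(true, 3))  // mllama
	fmt.Println(estimateImageTokens(false, 3)) // generic estimate
}
```

With this accounting, three images under mllama cost 3 tokens rather than 2304, so far fewer old messages need to be clipped.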
2024-11-05 10:11:50 -08:00