Jesse Gross
0c2f95f3de
runner: Initialize numPredict
...
numPredict is used to enforce a limit on the number of tokens to
generate. It is passed in from Ollama but it is never stored to
be checked.
2024-09-03 21:15:14 -04:00
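To illustrate the kind of fix described in the commit above, the following is a minimal sketch that stores a token limit when a sequence is created and checks it during generation. The `sequence` type, `newSequence`, `limitReached`, and `generate` are hypothetical names for this sketch, not the runner's actual code, and treating a non-positive limit as "no limit" is an assumption.

```go
package main

import "fmt"

// sequence is a hypothetical stand-in for a runner's per-request state.
type sequence struct {
	numPredict   int // maximum tokens to generate; <= 0 means no limit (assumption)
	numPredicted int // tokens generated so far
}

// newSequence stores the limit so that it can be checked later.
// Forgetting this assignment is the class of bug the commit describes.
func newSequence(numPredict int) *sequence {
	return &sequence{numPredict: numPredict}
}

// limitReached reports whether the stored limit has been hit.
func (s *sequence) limitReached() bool {
	return s.numPredict > 0 && s.numPredicted >= s.numPredict
}

// generate pulls tokens from next until the limit is reached
// or next reports that there are no more tokens.
func generate(s *sequence, next func() (string, bool)) []string {
	var out []string
	for !s.limitReached() {
		tok, ok := next()
		if !ok {
			break
		}
		out = append(out, tok)
		s.numPredicted++
	}
	return out
}

func main() {
	// An endless token source: only the stored limit stops generation.
	next := func() (string, bool) { return "tok", true }
	fmt.Println(len(generate(newSequence(3), next)))
}
```

If the limit were never stored, `numPredict` would stay at its zero value and this loop would only stop when the token source ran out.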
Jesse Gross
ebdf781397
server: Fix double free on runner subprocess error.
...
If the runner subprocess encounters an error, it will close the HTTP
connection, which causes Ollama to free the instance of the model that it has
open. When Ollama exits, it will again try to free the models for all
of the runners that were open, resulting in a double free.
2024-09-03 21:15:14 -04:00
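The log does not show how the double free above was fixed. One common way to guard against this class of bug in Go is to make the release idempotent, sketched here with `sync.Once`. The `model` type and its `freeCount` field are hypothetical and exist only to make the behavior observable.

```go
package main

import (
	"fmt"
	"sync"
)

// model is a hypothetical wrapper around a native resource.
type model struct {
	once      sync.Once
	freeCount int // how many times the underlying free actually ran
}

// Close releases the resource at most once, however many times it is called.
func (m *model) Close() {
	m.once.Do(func() {
		m.freeCount++ // stand-in for the real native free
	})
}

func main() {
	m := &model{}
	m.Close() // e.g. triggered when the runner closes its connection
	m.Close() // e.g. triggered again during shutdown
	fmt.Println(m.freeCount)
}
```

With this pattern the second `Close` call is a no-op, so both cleanup paths can run without coordinating with each other.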
Jesse Gross
23c7c1326e
llm: Fix lint
2024-09-03 21:15:14 -04:00
Daniel Hiltgen
8fe30d161c
Fix filename for non darwin arm builds
2024-09-03 21:15:14 -04:00
jmorganca
a483a4c4ed
lint
2024-09-03 21:15:14 -04:00
Daniel Hiltgen
b267ab92b0
Add missing vendor headers to ggml sync
2024-09-03 21:15:14 -04:00
Daniel Hiltgen
189ca38f1d
Wire up native source file dependencies
...
This should make sure incremental builds correctly identify
when to rebuild components based on which native files
are modified.
2024-09-03 21:15:14 -04:00
Daniel Hiltgen
80db43b7b4
Bump llama sync to 1e6f65
2024-09-03 21:15:14 -04:00
Daniel Hiltgen
47b0e81219
fix dolphin-mistral
2024-09-03 21:15:14 -04:00
Daniel Hiltgen
21947d5c1b
harden integration tests
2024-09-03 21:15:14 -04:00
Daniel Hiltgen
751009a5d7
Runtime selection of new or old runners
...
This adjusts the new runners to comingle with existing runners so we can use an
env var to toggle the new runners on.
2024-09-03 21:15:14 -04:00
Daniel Hiltgen
8527028bf4
Implement timings response in Go server
...
This implements the fields necessary for `run --verbose`
to generate timing information.
2024-09-03 21:15:14 -04:00
Daniel Hiltgen
e0241118d0
Get embeddings working
...
Truncation doesn't pass, but the other embeddings tests pass
2024-09-03 21:15:14 -04:00
Daniel Hiltgen
f97ee8c506
Fix parallel requests
2024-09-03 21:15:13 -04:00
Daniel Hiltgen
e9dd656ff5
Update sync with latest llama.cpp layout, and run against b3485
2024-09-03 21:15:13 -04:00
Daniel Hiltgen
6c0d892498
Prefix all build artifacts with an OS/ARCH dir
...
This will help keep incremental builds from stomping on each other and make it
easier to stitch together the final runner payloads
2024-09-03 21:15:13 -04:00
Daniel Hiltgen
13348e3629
Get linux building
...
Still needs a bit more refinement to (auto)detect cuda/hip and fallback
gracefully if not detected.
2024-09-03 21:15:13 -04:00
jmorganca
3d5a08c315
add note in readme
2024-09-03 21:15:13 -04:00
jmorganca
a29851bc9b
clean up metal code
2024-09-03 21:15:13 -04:00
jmorganca
8dda9293fa
fix Makefile
on windows
2024-09-03 21:15:13 -04:00
jmorganca
b3c62dcafd
remove printing
2024-09-03 21:15:13 -04:00
jmorganca
9b8b7cd9b5
don't apply license to stb_image.h
and json.hpp
2024-09-03 21:15:13 -04:00
jmorganca
1da6c40f4f
lint
2024-09-03 21:15:13 -04:00
jmorganca
76ca2de06e
update sync header
2024-09-03 21:15:13 -04:00
jmorganca
0eabc2e34d
remove unused script
2024-09-03 21:15:13 -04:00
jmorganca
dded27dcfa
fix metal
2024-09-03 21:15:13 -04:00
jmorganca
080b600865
add header to not edit
2024-09-03 21:15:13 -04:00
jmorganca
d6b6de9a5a
add header to not edit
2024-09-03 21:15:13 -04:00
jmorganca
24a741424f
fix build on windows
2024-09-03 21:15:13 -04:00
jmorganca
4d476d894e
fix Makefile
2024-09-03 21:15:13 -04:00
jmorganca
bd94ddfc56
fix README.md
2024-09-03 21:15:13 -04:00
jmorganca
f1f54c5bd5
fix README.md
2024-09-03 21:15:13 -04:00
jmorganca
18662d1180
consistent whitespace
2024-09-03 21:15:13 -04:00
jmorganca
3d1f3569cf
update .gitattributes
2024-09-03 21:15:13 -04:00
jmorganca
083a9e9b4e
link metal
2024-09-03 21:15:13 -04:00
jmorganca
d0703eaf44
wip
2024-09-03 21:15:13 -04:00
jmorganca
ce00e387c3
wip meta
2024-09-03 21:15:13 -04:00
jmorganca
763d7b601c
sync
2024-09-03 21:15:13 -04:00
jmorganca
4d0e6c55b0
remove perl docs
2024-09-03 21:15:13 -04:00
jmorganca
3375b82c56
remove build scripts
2024-09-03 21:15:13 -04:00
jmorganca
b8c1065ab6
remove need for perl
2024-09-03 21:15:13 -04:00
jmorganca
a632a04426
fix output
2024-09-03 21:15:13 -04:00
jmorganca
110f37ffb0
arch build
2024-09-03 21:15:13 -04:00
jmorganca
f2f03ff7f2
add temporary makefile
2024-09-03 21:15:13 -04:00
jmorganca
ba0ff1c46a
fix cuda and rocm builds
2024-09-03 21:15:13 -04:00
jmorganca
9966a055e5
fix cgo flags for darwin amd64
2024-09-03 21:15:13 -04:00
jmorganca
7aa7a3c1e5
remove -fPIC
from build_hipblas.sh
2024-09-03 21:15:13 -04:00
jmorganca
de634b7fd7
fix issues with runner
2024-09-03 21:15:13 -04:00
jmorganca
795753be7e
move sync script back in for now
2024-09-03 21:15:13 -04:00
jmorganca
0eed68fed4
llama: sync
2024-09-03 21:15:13 -04:00