jmorganca
a9884ae136
llama: add clip dependencies
2024-09-03 21:15:12 -04:00
jmorganca
e37651cca0
add clip and parallel requests to the todo list
2024-09-03 21:15:12 -04:00
jmorganca
593d6836ab
fix cuda build
2024-09-03 21:15:12 -04:00
jmorganca
533a7e7d50
fix build on windows
2024-09-03 21:15:12 -04:00
jmorganca
0873d28b16
fix ggml-metal.m build constraints
2024-09-03 21:15:12 -04:00
jmorganca
bb795faa6c
fix ggml-metal.m
2024-09-03 21:15:12 -04:00
jmorganca
e86db9381a
avx2 should only add avx2
2024-09-03 21:15:12 -04:00
jmorganca
4a5633e4bc
fix sync script
2024-09-03 21:15:12 -04:00
jmorganca
86f453252b
fix ggml-metal.m
2024-09-03 21:15:12 -04:00
jmorganca
dfd8f34806
fix ggml-metal.m
2024-09-03 21:15:12 -04:00
jmorganca
beb847b40f
add license headers
2024-09-03 21:15:12 -04:00
jmorganca
785f76d390
pre-patch
2024-09-03 21:15:12 -04:00
jmorganca
9fe48978a8
move runner package down
2024-09-03 21:15:12 -04:00
jmorganca
01ccbc07fe
replace static build in llm
2024-09-03 21:15:12 -04:00
jmorganca
ec09be97e8
fix build
2024-09-03 21:15:12 -04:00
jmorganca
6129f30479
wip...
2024-09-03 21:15:12 -04:00
jmorganca
eb1aa97961
rename server to runner
2024-09-03 21:15:12 -04:00
Jeffrey Morgan
5e921e06ac
Update README.md
2024-09-03 21:15:12 -04:00
Jeffrey Morgan
02089baf70
Update README.md
2024-09-03 21:15:12 -04:00
Jeffrey Morgan
870e91be76
Update README.md
2024-09-03 21:15:12 -04:00
Jeffrey Morgan
7ecc8e86c4
Update README.md
2024-09-03 21:15:12 -04:00
jmorganca
b1696e308e
Add missing hipcc flags
2024-09-03 21:15:12 -04:00
jmorganca
0110994d06
Initial llama Go module
2024-09-03 21:15:12 -04:00
jmorganca
2ef3a217d1
add sync of llama.cpp
2024-09-03 21:15:12 -04:00
Michael Yang
fccf8d179f
partial decode ggml bin for more info
2023-08-10 09:23:10 -07:00
Bruce MacDonald
984c9c628c
fix embeddings invalid values
2023-08-09 16:50:53 -04:00
Bruce MacDonald
09d8bf6730
fix build errors
2023-08-09 10:45:57 -04:00
Bruce MacDonald
7a5f3616fd
embed text document in modelfile
2023-08-09 10:26:19 -04:00
Michael Yang
f2074ed4c0
Merge pull request #306 from jmorganca/default-keep-system
automatically set num_keep if num_keep < 0
2023-08-08 09:25:34 -07:00
Bruce MacDonald
a6f6d18f83
embed text document in modelfile
2023-08-08 11:27:17 -04:00
Jeffrey Morgan
5eb712f962
trim whitespace before checking stop conditions
Fixes #295
2023-08-08 00:29:19 -04:00
Michael Yang
4dc5b117dd
automatically set num_keep if num_keep < 0
num_keep defines how many tokens to keep in the context when truncating
inputs. if left to its default value of -1, the server will calculate
num_keep to be the length of the system instructions
2023-08-07 16:19:12 -07:00
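The num_keep behavior described in this commit message can be illustrated with a minimal Go sketch. It is not the code from the commit: the helper names resolveNumKeep and truncate, the plain []int token representation, and the "keep the head plus the most recent tail" policy are assumptions made only for illustration.

```go
package main

import "fmt"

// resolveNumKeep is a hypothetical helper: a negative numKeep (the default
// of -1) falls back to the number of tokens in the system instructions.
func resolveNumKeep(numKeep, systemTokens int) int {
	if numKeep < 0 {
		return systemTokens
	}
	return numKeep
}

// truncate is a hypothetical helper: when the input exceeds numCtx tokens,
// it keeps the first numKeep tokens and fills the remaining space with the
// most recent tokens. The real truncation policy may differ.
func truncate(tokens []int, numCtx, numKeep int) []int {
	if len(tokens) <= numCtx {
		return tokens
	}
	// Clamp numKeep so the kept prefix always fits in the context.
	if numKeep < 0 {
		numKeep = 0
	}
	if numKeep > numCtx {
		numKeep = numCtx
	}
	tail := numCtx - numKeep
	out := make([]int, 0, numCtx)
	out = append(out, tokens[:numKeep]...)
	out = append(out, tokens[len(tokens)-tail:]...)
	return out
}

func main() {
	tokens := []int{1, 2, 3, 4, 5, 6, 7, 8}
	// Assume the system instructions occupy the first 2 tokens.
	keep := resolveNumKeep(-1, 2)
	fmt.Println(truncate(tokens, 5, keep)) // prints [1 2 6 7 8]
}
```

With a context of 5 tokens and a 2-token system prefix, the sketch preserves the prefix and drops the oldest non-system tokens.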
Michael Yang
b9f4d67554
configurable rope frequency parameters
2023-08-03 22:11:58 -07:00
Michael Yang
c5bcf32823
update llama.cpp
2023-08-03 11:50:24 -07:00
Michael Yang
0e79e52ddd
override ggml-metal if the file is different
2023-08-02 12:50:30 -07:00
Michael Yang
74a5f7e698
no gpu for 70B model
2023-08-01 17:12:50 -07:00
Michael Yang
7a1c3e62dc
update llama.cpp
2023-08-01 16:54:01 -07:00
Michael Yang
319f078dd9
remove -Werror
there are compile warnings on Linux which -Werror elevates to errors,
preventing compile
2023-07-31 21:45:56 -07:00
Jeffrey Morgan
7da249fcc1
only build metal for darwin,arm target
2023-07-31 21:35:23 -04:00
Bruce MacDonald
184ad8f057
allow specifying stop conditions in modelfile
2023-07-28 11:02:04 -04:00
Jeffrey Morgan
dffc8b6e09
update llama.cpp to d91f3f0
2023-07-28 08:07:48 -04:00
Michael Yang
3549676678
embed ggml-metal.metal
2023-07-27 17:23:29 -07:00
Michael Yang
fadf75f99d
add stop conditions
2023-07-27 17:00:47 -07:00
Michael Yang
ad3a7d0e2c
add NumGQA
2023-07-27 14:05:11 -07:00
Michael Yang
18ffeeec45
update llama.cpp
2023-07-27 14:05:11 -07:00
Michael Yang
cca61181cb
sample metrics
2023-07-27 09:31:44 -07:00
Michael Yang
c490416189
lock on llm.lock(); decrease batch size
2023-07-27 09:31:44 -07:00
Michael Yang
f62a882760
add session expiration
2023-07-27 09:31:44 -07:00
Michael Yang
3003fc03fc
update predict code
2023-07-27 09:31:44 -07:00
Michael Yang
35af37a2cb
session id
2023-07-27 09:31:44 -07:00