Commit Graph

  • 685a53534b manifest: Don't prune layers if we can't open a manifest file Jesse Gross 2024-08-01 15:05:16 -07:00
  • 32e0dca416 change approach, set BasicAuthKey to only password instead of username:password pair, with ollama as user gin account kemalelmizan 2024-08-07 12:42:34 +07:00
  • 1283f413f8 feat: add gin BasicAuth for username:password setup in env kemalelmizan 2024-08-07 12:04:11 +07:00
  • 1db7062806 address comments jmorganca 2024-08-07 00:41:40 -04:00
  • 8ec65f01d8 address comments jmorganca 2024-08-07 00:40:41 -04:00
  • c823cd7615 Merge branch 'main' into feature/kv-quant Sam 2024-08-07 16:36:16 +12:00
  • 2d8b07279c use errgroup jmorganca 2024-08-07 00:34:23 -04:00
  • 08bba06d50 fix linter jmorganca 2024-08-07 00:24:04 -04:00
  • 696e69986d fix linter jmorganca 2024-08-07 00:19:24 -04:00
  • 5f20cc8a59 fix prompt count jmorganca 2024-08-07 00:09:49 -04:00
  • 7583d1a6c6 fix linter jmorganca 2024-08-06 23:22:53 -04:00
  • e16a46b541 batch embeddings in server jmorganca 2024-08-06 23:18:12 -04:00
  • de4fc29773 llm: reserve required number of slots for embeddings (#6219) v0.3.4 Jeffrey Morgan 2024-08-06 23:20:49 -04:00
  • 6d1dbda2c9 llm: reserve required number of slots for embeddings jmorganca 2024-08-06 22:49:51 -04:00
  • a38902413e Bump llama sync to 1e6f65 Daniel Hiltgen 2024-08-06 16:50:34 -07:00
  • f183b98307 fix dolphin-mistral Daniel Hiltgen 2024-08-01 14:47:00 -07:00
  • b3444223dc harden integration tests Daniel Hiltgen 2024-08-01 14:41:23 -07:00
  • 8181b54d08 Runtime selection of new or old runners Daniel Hiltgen 2024-08-01 08:54:44 -07:00
  • 544774623d Implement timings response in Go server Daniel Hiltgen 2024-07-29 14:09:55 -07:00
  • 0e5df64772 Get embeddings working Daniel Hiltgen 2024-07-31 11:08:09 -07:00
  • 4d1ee5971e Fix parallel requests Daniel Hiltgen 2024-07-31 15:02:58 -07:00
  • f06354456a Update sync with latest llama.cpp layout, and run against b3485 Daniel Hiltgen 2024-07-29 16:21:09 -07:00
  • c79c69577b Prefix all build artifacts with an OS/ARCH dir Daniel Hiltgen 2024-06-24 09:23:34 -07:00
  • 547ce3088b Get linux building Daniel Hiltgen 2024-06-23 12:07:41 -07:00
  • c0d42c1ab7 add note in readme jmorganca 2024-06-21 16:22:27 -04:00
  • 9bd149452b clean up metal code jmorganca 2024-06-15 10:06:36 -07:00
  • 4610ada320 fix Makefile on windows jmorganca 2024-06-20 21:52:10 -04:00
  • 7e50e5bc04 remove printing jmorganca 2024-06-13 18:41:12 -07:00
  • 02b9a80643 dont apply license to stb_image.h and json.hpp jmorganca 2024-06-13 14:35:11 -07:00
  • 8907a0f685 lint jmorganca 2024-06-13 14:21:55 -07:00
  • b5ec385bd3 update sync header jmorganca 2024-06-13 14:12:23 -07:00
  • c36f7ff5c0 remove unused script jmorganca 2024-06-13 14:07:05 -07:00
  • 18345cf6e4 fix metal jmorganca 2024-06-12 12:18:40 -07:00
  • 5ae1086cb0 add header to not edit jmorganca 2024-06-12 11:40:13 -07:00
  • 544f4d4776 add header to not edit jmorganca 2024-06-12 11:38:42 -07:00
  • fbc3170066 fix build on windows jmorganca 2024-06-12 02:47:12 -04:00
  • 9c8ce6dbb1 fix Makefile jmorganca 2024-06-11 23:18:07 -07:00
  • 2402720440 fix README.md jmorganca 2024-06-11 22:54:45 -07:00
  • 2b234f7448 fix README.md jmorganca 2024-06-11 22:54:31 -07:00
  • e5b8f662af consistent whitespace jmorganca 2024-06-11 22:50:10 -07:00
  • 20e6ada09e update .gitattributes jmorganca 2024-06-11 22:48:06 -07:00
  • 2071d1cac9 link metal jmorganca 2024-06-11 22:46:14 -07:00
  • 3f4613459d wip jmorganca 2024-06-11 18:53:48 -07:00
  • 6c2f7e52a3 wip meta jmorganca 2024-06-11 11:12:00 -07:00
  • 07088ebe21 sync jmorganca 2024-06-10 17:23:09 -07:00
  • 19f8053179 remove perl docs jmorganca 2024-06-10 09:26:19 -07:00
  • 8743873440 remove build scripts jmorganca 2024-06-10 02:56:37 -04:00
  • 83f74ac09b remove need for perl jmorganca 2024-06-10 00:04:21 -04:00
  • c043c41ba4 fix output jmorganca 2024-06-09 23:53:40 -04:00
  • cf34dae340 arch build jmorganca 2024-06-09 20:19:11 -07:00
  • 66714163e8 add temporary makefile jmorganca 2024-06-09 22:33:31 -04:00
  • 17950f79cb fix cuda and rocm builds jmorganca 2024-06-09 19:49:22 -04:00
  • cc77eb9527 fix cgo flags for darwin amd64 jmorganca 2024-06-09 14:30:41 -07:00
  • 5b7e70e3d5 remove -fPIC from build_hipblas.sh jmorganca 2024-06-07 12:52:49 -04:00
  • 31f9e21e2f fix issues with runner jmorganca 2024-06-07 09:32:52 -07:00
  • bdf75e6ceb move sync script back in for now jmorganca 2024-06-07 09:26:44 -07:00
  • c7db1a97c7 llama: sync jmorganca 2024-06-07 00:27:24 -07:00
  • 6dad8ae335 update to d5c938cd jmorganca 2024-06-07 00:15:58 -07:00
  • 460b0395fc add patches jmorganca 2024-06-06 23:55:47 -07:00
  • e02149a3ca cleanup stop code jmorganca 2024-06-04 00:58:58 -07:00
  • 755d3653e6 fix example jmorganca 2024-06-04 00:43:03 -07:00
  • 0c3c7571f5 revert llm changes jmorganca 2024-06-04 00:40:19 -07:00
  • 6277276c20 num predict jmorganca 2024-05-28 23:38:44 -07:00
  • 89f43310c5 basic progress jmorganca 2024-05-28 23:11:48 -07:00
  • 00370c562d add more runner params jmorganca 2024-05-28 00:02:01 -07:00
  • 0203e041c2 truncate stop properly jmorganca 2024-05-27 23:09:56 -07:00
  • a1f3a90153 wip stop tokens jmorganca 2024-05-27 14:38:44 -07:00
  • 8b48f15ff6 embeddings jmorganca 2024-05-27 11:33:47 -07:00
  • d0a55ec7bf remove dependency on llm jmorganca 2024-05-26 23:23:09 -07:00
  • fb969125cd grammar jmorganca 2024-05-26 23:14:44 -07:00
  • c8abab220d sampling jmorganca 2024-05-26 23:01:05 -07:00
  • 4f4261564a better example module, add port jmorganca 2024-05-25 20:11:57 -07:00
  • 60a9588a54 wip jmorganca 2024-05-24 10:09:35 -07:00
  • 3709a098d8 add llava to runner jmorganca 2024-05-23 18:22:15 -07:00
  • 36fb5d988c fix output in build_hipblas.sh jmorganca 2024-05-20 16:43:53 -07:00
  • fe3e5babd5 mods to build_hipblas.sh for linux jmorganca 2024-05-20 16:15:16 -07:00
  • b01e43b347 wip jmorganca 2024-05-20 15:27:10 -07:00
  • afc2586f5a improve cuda and hipblas build scripts jmorganca 2024-05-20 16:17:13 -04:00
  • 8945f0d78b cuda linux jmorganca 2024-05-19 23:11:30 -07:00
  • 9fae49d466 Update README.md Jeffrey Morgan 2024-05-19 16:47:50 -07:00
  • d0319d2d08 Update README.md Jeffrey Morgan 2024-05-19 16:47:19 -07:00
  • abfd121135 disable log file jmorganca 2024-05-19 16:36:32 -07:00
  • f28d1c0aa7 fix readme for llava jmorganca 2024-05-19 16:33:37 -07:00
  • ce82d7c615 add llava jmorganca 2024-05-19 16:30:11 -07:00
  • 0a23d7ee64 llama: add clip dependencies jmorganca 2024-05-19 14:06:46 -07:00
  • dcc3898a60 add clip and parallel requests to the todo list jmorganca 2024-05-19 14:01:52 -07:00
  • aa34ad4cea fix cuda build jmorganca 2024-05-19 03:34:24 -04:00
  • 04721061c4 fix build on windows jmorganca 2024-05-19 03:19:41 -04:00
  • 4e837e2aa1 fix ggml-metal.m build constraints jmorganca 2024-05-19 00:10:15 -07:00
  • 2bd087c928 fix ggml-metal.m jmorganca 2024-05-19 00:06:26 -07:00
  • 920287a25e avx2 should only add avx2 jmorganca 2024-05-18 23:53:29 -07:00
  • 0590208a2a fix sync script jmorganca 2024-05-18 23:50:50 -07:00
  • 275a5dd747 fix ggml-metal.m jmorganca 2024-05-18 23:34:58 -07:00
  • 4efb2154de fix ggml-metal.m jmorganca 2024-05-18 23:31:41 -07:00
  • 6fee1b8d1f add license headers jmorganca 2024-05-18 23:30:28 -07:00
  • 36621eddf5 pre-patch jmorganca 2024-05-18 23:27:01 -07:00
  • d1e8692cd9 move runner package down jmorganca 2024-05-18 23:15:51 -07:00
  • 19a1cccfdc replace static build in llm jmorganca 2024-05-18 22:22:46 -07:00
  • 9c332bab90 fix build jmorganca 2024-05-18 21:23:53 -07:00
  • fd6e4723c4 wip... jmorganca 2024-05-16 13:52:38 -07:00