Commit Graph

  • 8477c61857
    Merge 9594b778894be014632c3f3e5c019d77d659cef0 into b8d5036e3328ebfd16dcc94c60beb46496ba8112 Phil Wornath 2024-11-03 06:26:39 +08:00
  • 1e94171d00
    Merge 940fffea4908053d2498b8ceea3f9f347b2c2222 into b8d5036e3328ebfd16dcc94c60beb46496ba8112 奶茶叔叔 2024-11-03 06:26:11 +08:00
  • 27253cc657
    Merge 58bf80dd85cf434f8cd7355f74b6a86068035da1 into b8d5036e3328ebfd16dcc94c60beb46496ba8112 Edwin.JH.Lee 2024-11-03 06:24:12 +08:00
  • b8d5036e33
    CI: omit unused tools for faster release builds (#7432) Daniel Hiltgen 2024-11-02 13:56:54 -07:00
  • 713096ee32 CI: omit unused tools for faster release builds Daniel Hiltgen 2024-10-30 16:33:33 -07:00
  • 312d9de1d1 llama: Improve error handling Jesse Gross 2024-11-01 15:50:53 -07:00
  • a103dae01e runner.go: Only allocate 1 element embedding batches for mllama Jesse Gross 2024-11-01 14:29:57 -07:00
  • a0f564b389 llama: Improve error handling Jesse Gross 2024-11-01 15:50:53 -07:00
  • d12be39c16
    ollama run: allow /clearscreen to clear the screen
      - add /clearscreen command to clear the screen (print ESC[H ESC[2J)
      - change description of keyboard shortcuts to say that Ctrl+L is the same as /clearscreen
      - It's actually not, but I don't want to deal with manipulating the buffer
    Henry 2024-11-02 01:07:08 -05:00
  • 9513082bcd runner.go: Only allocate 1 element embedding batches for mllama Jesse Gross 2024-11-01 14:29:57 -07:00
  • d07cf41a97 refactor kv estimation Michael Yang 2024-10-31 13:46:30 -07:00
  • 8c238e70ab mllama cross attention Michael Yang 2024-10-31 13:40:06 -07:00
  • 50369c3100 Workaround buggy P2P ROCm copy on windows Daniel Hiltgen 2024-11-01 13:05:22 -07:00
  • 3be04f1403 Align rocm compiler flags Daniel Hiltgen 2024-11-01 11:12:23 -07:00
  • e8696cf97c
    Update Dockerfile kavita-rane2 2024-11-01 17:26:10 +05:30
  • e3801389e7
    Update Dockerfile kavita-rane2 2024-11-01 17:06:20 +05:30
  • 767aa4e5c2
    Update Dockerfile kavita-rane2 2024-11-01 14:32:24 +05:30
  • 4b7d46854f
    Update Dockerfile kavita-rane2 2024-11-01 13:37:28 +05:30
  • dbb0fe2ff7
    Fix FAQ link to download Ollama Rose Liverman 2024-11-01 01:39:45 -06:00
  • 4ef1ae1096
    Update Dockerfile kavita-rane2 2024-11-01 13:04:38 +05:30
  • c1ac2eba71
    Update rh_linux_deps.sh kavita-rane2 2024-11-01 10:43:09 +05:30
  • 8a9bb0d000
    Add basic mllama integration tests (#7455) Daniel Hiltgen 2024-10-31 17:25:48 -07:00
  • 26acdcf44e runner.go: Don't set cross attention before sending embeddings Jesse Gross 2024-10-31 10:55:31 -07:00
  • 42e5133d9b runner.go: Don't set cross attention before sending embeddings Jesse Gross 2024-10-31 10:55:31 -07:00
  • cb6fa9c5db Add basic mllama integration tests Daniel Hiltgen 2024-10-31 13:26:28 -07:00
  • 921779bb10
    Give unicode test more time to run (#7437) Daniel Hiltgen 2024-10-31 13:35:31 -07:00
  • 10f4b0bc70 Give more time for concurrency test Daniel Hiltgen 2024-10-31 09:51:35 -07:00
  • a0b3ce064c Give unicode test more time to run Daniel Hiltgen 2024-10-30 19:58:10 -07:00
  • 91e5823e7b
    Added ollama-haskell library Tushar Adhatrao 2024-10-31 22:20:03 +05:30
  • cb5676db6c
    Update auth.go Rekt Developer 2024-10-31 12:36:05 +06:00
  • 30cc6cab23 feat: allow setting KV cache type Sam 2024-10-31 08:36:32 +11:00
  • cc0c5c686b
    Add Powershell Community Tool Rodrigo Ribeiro Gomes 2024-10-31 00:38:23 -03:00
  • 37648f29bf
    Add package managers to readme Henry 2024-10-30 21:27:03 -05:00
  • 3ed04536ec Support customized CPU flags for runners Daniel Hiltgen 2024-10-18 15:58:34 -07:00
  • 0b8f215856 Add Perfect Memory AI as community integrations Darius Kocar 2024-10-30 15:18:29 -07:00
  • 16f4eabe2d
    Refine default thread selection for NUMA systems (#7322) v0.4.0-rc6 Daniel Hiltgen 2024-10-30 15:05:45 -07:00
  • c826e57475 runner.go: Better abstract vision model integration Jesse Gross 2024-10-11 15:34:01 -07:00
  • 7e2ed8410b runner.go: Better abstract vision model integration Jesse Gross 2024-10-11 15:34:01 -07:00
  • 712e99d477
    Soften windows clang requirement (#7428) Daniel Hiltgen 2024-10-30 12:28:36 -07:00
  • d2ec289ac7 Soften windows clang requirement Daniel Hiltgen 2024-10-30 11:10:14 -07:00
  • b754f5a6a3
    Remove submodule and shift to Go server - 0.4.0 (#7157) Daniel Hiltgen 2024-10-30 10:34:28 -07:00
  • d24f0b12b2 boost embed endpoint Liu Yuan 2024-10-30 15:54:56 +08:00
  • ee4d1907f3 CI: install msys and clang gcc on win Daniel Hiltgen 2024-10-28 20:46:28 -07:00
  • 47f3bed4f6 Remove llama.cpp submodule and shift new build to top Daniel Hiltgen 2024-10-09 13:52:36 -07:00
  • a805e5947e
    Move windows app out of preview (#7347) Daniel Hiltgen 2024-10-30 09:24:59 -07:00
  • 91dfbb1bba
    windows: Support alt install paths, fit and finish (#6967) Daniel Hiltgen 2024-10-30 09:24:31 -07:00
  • 93f6d39b76 Refine default thread selection for NUMA systems Daniel Hiltgen 2024-10-21 16:37:08 -07:00
  • ac30c6411c
    Merge 1643e01eb3e039744f0caab8d86ee06bd2c8232e into db1842b9e1272237947d427c852c38e48688dd02 Vyacheslav 2024-10-30 12:11:50 +00:00
  • 11caf01cf1
    add terminal App Joey Zheng 2024-10-30 20:04:42 +08:00
  • 129411bb6f
    Update Dockerfile kavita-rane2 2024-10-30 16:22:03 +05:30
  • b015f24948
    Update Dockerfile kavita-rane2 2024-10-30 15:52:11 +05:30
  • 3ba4d500f5 Merge branch 'main' into handle-custom-template-capability-check Kyle Milner 2024-10-30 20:52:09 +11:00
  • 71c2b63615
    Update Dockerfile kavita-rane2 2024-10-30 15:02:45 +05:30
  • 3be00e5120
    Update rh_linux_deps.sh kavita-rane2 2024-10-30 12:38:39 +05:30
  • f88854ed3d
    Update rh_linux_deps.sh kavita-rane2 2024-10-30 11:54:17 +05:30
  • 000f46db79
    Update rh_linux_deps.sh kavita-rane2 2024-10-30 11:32:57 +05:30
  • 58e0cdc343
    Update Dockerfile kavita-rane2 2024-10-30 11:32:08 +05:30
  • 39810b6ced Fit and finish improvements for windows app Daniel Hiltgen 2024-10-24 14:56:37 -07:00
  • cd4c89706b windows: Support alt install paths Daniel Hiltgen 2024-09-25 14:55:54 -07:00
  • b3ecf87b3b Implement tokenize and de-tokenize endpoints Jackson 2024-10-29 21:27:11 -04:00
  • 5bda820a18
    Merge ab2fd8434043bc2bd3ada4baed0d94d8496e6d26 into db1842b9e1272237947d427c852c38e48688dd02 R0CKSTAR 2024-10-30 09:09:31 +08:00
  • db1842b9e1
    add more tests for getting the optimal tiled canvas (#7411) Patrick Devine 2024-10-29 16:28:02 -07:00
  • d6479e0555
    Merge 4ef972bab0bd64d65f382d96710759851512ae2e into c9ca386131881f1fde3ab173e4eb47c8db8d475b Drew Paettie 2024-10-29 16:20:10 -07:00
  • adfd4bd6cf add more tests for getting the optimal tiled canvas Patrick Devine 2024-10-29 16:01:06 -07:00
  • d5e5c30999 fix multiple image inputs Michael Yang 2024-10-28 10:28:01 -07:00
  • c9ca386131
    Switch windows to clang (#7407) Daniel Hiltgen 2024-10-29 13:15:04 -07:00
  • 3136d52ef3 Fail fast with wrong compiler on windows Daniel Hiltgen 2024-10-29 12:10:15 -07:00
  • 3bea335680 Switch over to clang for deepseek on windows Daniel Hiltgen 2024-10-28 20:10:51 -07:00
  • 923b329481 llama: wire up builtin runner Daniel Hiltgen 2024-09-26 15:21:33 -07:00
  • f70833ef32
    Finally got it working. The problem was calling LlamaServer.Embedding in parallel from a goroutine. When doing that, the embeddings sometimes come back as [], the documents get swapped around and assigned random scores, it's just chaos. No idea why, but it just doesn't work. Removing the goroutine wrapper and just calling sequentially works 100% reliably. Craig R. Hughes 2024-10-29 03:50:30 -07:00
  • e4f7236603
    Copy embeddings out of C memory space, otherwise they get corrupted. Craig R. Hughes 2024-10-29 03:42:15 -07:00
  • bfd2dd67fa
    Merge 5013279dec8d7d0575b68ff2ea63114fa3247c84 into 078f666f73422edc1a3819332c03b6f467d064f4 Alexander Heisler 2024-10-29 03:28:30 -05:00
  • de62ee96e0
    Trim whitespace that Decode likes to stick in there. Craig R. Hughes 2024-10-29 01:04:49 -07:00
  • f6ad996f11
    When reranking, only copy a single embedding element out, because there is only one. Craig R. Hughes 2024-10-29 01:04:32 -07:00
  • 104c853267
    Update Dockerfile kavita-rane2 2024-10-29 12:48:16 +05:30
  • f149ace61d
    Update release.yaml kavita-rane2 2024-10-29 12:26:07 +05:30
  • 8f0214ad0c
    Update Dockerfile kavita-rane2 2024-10-29 11:44:54 +05:30
  • fb08dbe622
    Update rh_linux_deps.sh kavita-rane2 2024-10-29 11:44:17 +05:30
  • dae2d1ae08
    Update env.sh kavita-rane2 2024-10-29 11:43:39 +05:30
  • 925811fd2e
    Update rh_linux_deps.sh kavita-rane2 2024-10-29 11:33:47 +05:30
  • 9ad00e82ab
    This is not ideal - if llama_get_embeddings_ith randomly fails to find the embeddings then we will drop this document. Would be better to retry, but that could go on failing. Better would be to figure out the underlying bug -- why is llama_get_embeddings_ith sometimes failing? Craig R. Hughes 2024-10-28 19:44:16 -07:00
  • 46cf842723
    Merge branch 'main' into feature/kv-quant Sam 2024-10-29 13:14:36 +11:00
  • 46df792139
    Add --reranking flag for runner and update server.loadModel() to use it when appropriate. Craig R. Hughes 2024-10-28 18:44:11 -07:00
  • 9d481c4cb7
    Add debug output to see what is being passed; fix prompt concatenation Craig R. Hughes 2024-10-28 16:31:46 -07:00
  • c9897b01df
    Implement /api/rerank in go server Craig R. Hughes 2024-10-28 15:05:06 -07:00
  • 93688efa78
    Update to latest llama.cpp and fix all vendor patches and sampling_ext Craig R. Hughes 2024-10-28 13:49:22 -07:00
  • ab2fd84340 Add doc for Moore Threads GPU Xiaodong Ye 2024-10-29 09:26:09 +08:00
  • 18d3826170 Support docker build for MUSA Xiaodong Ye 2024-10-29 09:25:52 +08:00
  • e0dda32620 Update gen_linux.sh to support MUSA Xiaodong Ye 2024-10-29 09:25:15 +08:00
  • 8aec5fe5a3 Support Moore Threads GPU Xiaodong Ye 2024-10-29 09:24:55 +08:00
  • 078f666f73 tests: Add test for Unicode processing Jesse Gross 2024-10-23 15:28:30 -07:00
  • de1557a0dc runner.go: Better handle return NULL values from llama.cpp Jesse Gross 2024-10-22 14:57:46 -07:00
  • 084929c293
    add mllama image processing to the generate handler (#7384) Patrick Devine 2024-10-28 13:51:19 -07:00
  • c63733f434
    Update rh_linux_deps.sh kavita-rane2 2024-10-28 19:01:04 +05:30
  • 9e1228d9c9 forgot api.md janpf 2024-01-08 13:15:18 +01:00
  • df090634f1 added working n_probs pass through janpf 2024-01-08 13:13:32 +01:00
  • 0daf29e985
    Update rh_linux_deps.sh kavita-rane2 2024-10-28 18:46:06 +05:30
  • 3eb6a1454a
    Update rh_linux_deps.sh kavita-rane2 2024-10-28 18:12:04 +05:30
  • a1d18f6321
    Update rh_linux_deps.sh kavita-rane2 2024-10-28 18:00:22 +05:30
  • 8fb936b9e6
    Update rh_linux_deps.sh kavita-rane2 2024-10-28 17:22:14 +05:30