Commit Graph

  • 27331ae3a8 download: add inactivity monitor Michael Yang 2024-01-08 11:44:59 -08:00
  • b6c0ef1e70
    Merge pull request #1961 from jmorganca/mxyng/rm-double-newline Michael Yang 2024-01-12 15:18:19 -08:00
  • 356d178f6e
    Merge pull request #1971 from jmorganca/mxyng/max-context-length Michael Yang 2024-01-12 15:10:25 -08:00
  • eaed6f8c45 add max context length check Michael Yang 2024-01-12 14:54:01 -08:00
  • 6a5bfc2ed6 update actions/setup-go purificant 2024-01-12 17:03:06 +00:00
  • 61cd7dc6aa feat: add flag for specifying port number p3rtang 2024-01-12 23:04:18 +01:00
  • 141afb4c20 reduce image size Ismael 2024-01-12 18:03:02 -04:00
  • cf29bd2d72 fix: request retry with error Michael Yang 2024-01-12 13:32:24 -08:00
  • 87eeb4e885 Update images_test.go Bruce MacDonald 2024-01-12 16:09:43 -05:00
  • 905862e17b improve cuda detection (rel. issue #1704) Fabian Preiss 2024-01-09 21:55:36 +01:00
  • 04ec4f340b trim chat prompt based on llm context size Bruce MacDonald 2024-01-12 15:24:55 -05:00
  • 565f8a3c44
    Convert the REPL to use /api/chat for interactive responses (#1936) Patrick Devine 2024-01-12 12:05:52 -08:00
  • 5121b7ac9c remove double newlines in /set parameter Michael Yang 2024-01-12 11:21:08 -08:00
  • 44beb5b89c add ollama sync command puffo 2024-01-12 12:46:43 -06:00
  • a70262c6b2
    Update README.md Michael Yang 2024-01-12 09:43:04 -08:00
  • 15c2d4be7a First cut at some distributed test rigging Daniel Hiltgen 2023-12-14 11:22:24 -08:00
  • 1296675498
    Added MindMac to Community Integrations -> Web & Desktop section Hoang Nguyen 2024-01-13 00:20:09 +07:00
  • 1ace16ac8d
    Add twinny vscode extension to Extensions and Plugins Richard Macarthy 2024-01-12 13:31:05 +00:00
  • 2ce8a9df48
    Add Dify.AI to community integrations Chenhe Gu 2024-01-12 13:18:04 +08:00
  • 40a0a90a88
    Add group delete to uninstall instructions (#1924) Tristram Oaten 2024-01-12 05:07:00 +00:00
  • f20bf95b33 address the linter Patrick Devine 2024-01-11 15:46:13 -08:00
  • a3507b90cc dry out the display response Patrick Devine 2024-01-11 15:12:22 -08:00
  • 7efd582a83 change interactive mode to use /api/chat instead of generate Patrick Devine 2024-01-11 14:43:30 -08:00
  • cbe20c4375 update readme Michael Yang 2024-01-11 16:24:37 -08:00
  • 5ffbbea1d7 remove client.py Michael Yang 2024-01-11 15:51:47 -08:00
  • 3773fb6465
    Merge pull request #1935 from dhiltgen/cpu_fallback Daniel Hiltgen 2024-01-11 15:52:32 -08:00
  • 7427fa1387 Fix up the CPU fallback selection Daniel Hiltgen 2024-01-11 14:43:16 -08:00
  • f84537e0e0
    Merge pull request #1934 from jmorganca/mxyng/fix-slices Michael Yang 2024-01-11 14:36:20 -08:00
  • 48d3259ace
    Merge branch 'main' into fpreiss/cuda_detection Fabian Preiß 2024-01-11 23:28:24 +01:00
  • d2be6387c9 fix typo Michael Yang 2024-01-11 14:25:21 -08:00
  • d7af35d3d0 import fmt Michael Yang 2024-01-11 14:22:32 -08:00
  • defc1dbd6e use x/exp/slices Michael Yang 2024-01-11 14:20:13 -08:00
  • de2fbdec99
    Merge pull request #1819 from dhiltgen/multi_variant Daniel Hiltgen 2024-01-11 14:00:48 -08:00
  • ed4b3e0b32
    Merge pull request #407 from anuraagdjain/feat/parallel-model-downloads Timothy Jaeryang Baek 2024-01-11 12:53:21 -08:00
  • 12fbe9a491 Update .gitignore Timothy J. Baek 2024-01-11 12:52:13 -08:00
  • bf6685d887 chore: conflict Timothy J. Baek 2024-01-11 12:51:46 -08:00
  • a63507c21e feat: custom port for server Anuraag Jain 2024-01-11 21:54:14 +02:00
  • f5faf79aa1
    Add semantic kernel to Readme (#1931) Eduard van Valkenburg 2024-01-11 20:40:23 +01:00
  • 38cb7d2861
    Add semantic kernel to Readme Eduard van Valkenburg 2024-01-11 20:35:40 +01:00
  • f4f939de28
    Merge pull request #1552 from jmorganca/mxyng/lint-test Michael Yang 2024-01-11 09:37:45 -08:00
  • 39928a42e8 Always dynamically load the llm server library Daniel Hiltgen 2024-01-09 20:29:58 -08:00
  • d88c527be3 Build multiple CPU variants and pick the best Daniel Hiltgen 2024-01-07 15:48:05 -08:00
  • 3bc8b9832b
    fix gpu_test.go Error (same type) uint64->uint32 (#1921) Fabian Preiß 2024-01-11 14:22:23 +01:00
  • 5c5bde3b85
    Merge pull request #451 from goecho/main Timothy Jaeryang Baek 2024-01-11 03:57:38 -08:00
  • f9f8b2ce44
    Add group delete to uninstall instructions Tristram Oaten 2024-01-11 10:24:46 +00:00
  • 99c22ed3f5 fix gpu_test.go Error (same type) uint64->uint32 Fabian Preiss 2024-01-11 09:26:53 +01:00
  • ab6be852c7 revisit memory allocation to account for full kv cache on main gpu v0.1.20 Jeffrey Morgan 2024-01-11 01:45:31 -05:00
  • 567bea537e Add copilot for obsidian plugin to community integration Logan Yang 2024-01-10 22:43:00 -08:00
  • 74f91bc74d Fix bug: Header attributes (Host, Authorization, Origin, Referer) not sanitized goecho 2024-01-11 14:36:34 +08:00
  • 052b33b81b DRY out the Dockefile.build Daniel Hiltgen 2024-01-06 16:46:55 -08:00
  • 8da7bef05f Support multiple variants for a given llm lib type Daniel Hiltgen 2024-01-05 12:13:08 -08:00
  • b24e8d17b2
    Increase minimum CUDA memory allocation overhead and fix minimum overhead for multi-gpu (#1896) Jeffrey Morgan 2024-01-10 19:08:51 -05:00
  • 32c40a644b fixed only includes graph alloc Jeffrey Morgan 2024-01-10 19:06:29 -05:00
  • 65c0dec811 allocate fixed amount before layers Jeffrey Morgan 2024-01-10 18:22:58 -05:00
  • f208251898 better wording Jeffrey Morgan 2024-01-10 11:34:49 -05:00
  • 6675989fea limit overhead to 10% of all gpus Jeffrey Morgan 2024-01-10 11:34:00 -05:00
  • d81944c5cd fix multi gpu overhead Jeffrey Morgan 2024-01-10 11:31:50 -05:00
  • db53635c97 increase minimum cuda overhead and fix minimum overhead for multi-gpu Jeffrey Morgan 2024-01-10 08:48:33 -05:00
  • f83881390f revert submodule back to 328b83de23b33240e28f4e74900d1d06726f5eb1 Jeffrey Morgan 2024-01-10 18:42:39 -05:00
  • ac70ab6761
    Merge pull request #1914 from dhiltgen/smarter_cuda_detection Daniel Hiltgen 2024-01-10 15:21:56 -08:00
  • 3c49c3ab0d Harden GPU mgmt library lookup Daniel Hiltgen 2024-01-10 14:39:51 -08:00
  • 9754ae4c89 Support optional override of the target archictures Daniel Hiltgen 2024-01-10 14:41:02 -08:00
  • 224fbf2795 update submodule to commit 1fc2f265ff9377a37fd2c61eae9cd813a3491bea until its main branch is fixed Jeffrey Morgan 2024-01-10 17:03:11 -05:00
  • 2c6e8f5248
    Update submodule to 6efb8eb30e7025b168f3fda3ff83b9b386428ad6 (#1885) Jeffrey Morgan 2024-01-10 16:48:38 -05:00
  • ec405af756
    Merge pull request #3 from kris-hansen/rebase Kris Hansen 2024-01-10 15:49:40 -05:00
  • b1f29aacd8
    Merge pull request #448 from ollama-webui/doc-update Timothy Jaeryang Baek 2024-01-10 12:40:10 -08:00
  • fc06443e83 fix: updates based on PR comments Kris Hansen 2023-12-27 19:47:08 -05:00
  • f53dbfb802 feat: add list-remote to view remote models Kris Hansen 2023-12-27 18:28:20 -05:00
  • 482b4be1f4 doc: line break removed Timothy J. Baek 2024-01-10 12:39:00 -08:00
  • 7dc45054db fix: updates based on PR comments Kris Hansen 2023-12-27 19:47:08 -05:00
  • 08694208d0 feat: add list-remote to view remote models Kris Hansen 2023-12-27 18:28:20 -05:00
  • 256ea52819
    enh: add ollero.nvim to community applications Marco Antônio 2024-01-10 14:46:06 -03:00
  • be721ca0df add more search paths for cuda libs cuda-search Jeffrey Morgan 2024-01-10 09:35:19 -05:00
  • 524841f097
    Merge pull request #443 from ollama-webui/doc-update Timothy Jaeryang Baek 2024-01-09 23:13:57 -08:00
  • 9ce8bb7c6b
    doc: update Timothy Jaeryang Baek 2024-01-10 02:13:49 -05:00
  • 2d9830b2c2
    Merge pull request #442 from ollama-webui/many-models Timothy Jaeryang Baek 2024-01-09 23:11:10 -08:00
  • a63b8c13f0 refac Timothy J. Baek 2024-01-09 23:10:02 -08:00
  • 737928e861 feat: better prompt gen template Timothy J. Baek 2024-01-09 23:06:33 -08:00
  • 9087aa0e30 fix: only ollama models Timothy J. Baek 2024-01-09 22:56:43 -08:00
  • de5c02db5b doc: features Timothy J. Baek 2024-01-09 22:53:22 -08:00
  • 70029d9bed feat: @model group convo Timothy J. Baek 2024-01-09 22:47:31 -08:00
  • 0c7167e64d feat: load ~/.ollama/.env using godotenv Nicholas Dudfield 2024-01-10 13:45:42 +07:00
  • 38d6a3712b unblock condition variable in update_slots when closing server Jeffrey Morgan 2024-01-10 01:18:20 -05:00
  • 1633ba4443
    Merge pull request #441 from ollama-webui/rag Timothy Jaeryang Baek 2024-01-09 21:10:12 -08:00
  • ffba59dc3a Update requirements.txt Timothy J. Baek 2024-01-09 21:09:28 -08:00
  • a24341fb9b update submodule to 6efb8eb30e7025b168f3fda3ff83b9b386428ad6 Jeffrey Morgan 2024-01-09 21:08:33 -05:00
  • 896793ead2 improve cuda detection (rel. issue 1704) Fabian Preiss 2024-01-09 21:55:36 +01:00
  • c1ec604f21 feat: rag md support Timothy J. Baek 2024-01-09 15:24:53 -08:00
  • 358f79f533
    Merge pull request #439 from ollama-webui/rag-context-management Timothy Jaeryang Baek 2024-01-09 14:34:37 -08:00
  • bf1c026666 feat: better rag context management Timothy J. Baek 2024-01-09 14:33:04 -08:00
  • 34344d801c clean up cmake build directory when cross compiling macOS builds v0.1.19 Jeffrey Morgan 2024-01-09 17:13:51 -05:00
  • 51eb2645b7
    Merge pull request #436 from justinh-rahb/patch-1 Timothy Jaeryang Baek 2024-01-09 13:37:33 -08:00
  • dcf16df166
    Merge pull request #438 from ollama-webui/fix Timothy Jaeryang Baek 2024-01-09 13:34:04 -08:00
  • 4f1be8eda5
    Merge pull request #433 from Contribution-Tracking/shell-scripts Timothy Jaeryang Baek 2024-01-09 13:33:54 -08:00
  • 76d37393ee feat: gguf upload Timothy J. Baek 2024-01-09 13:25:42 -08:00
  • e868c8a5c7
    Update api.md (#1878) Robin Glauser 2024-01-09 22:21:17 +01:00
  • 27a74e0313
    Update api.md Robin Glauser 2024-01-09 22:03:19 +01:00
  • c336693f07
    calculate overhead based number of gpu devices (#1875) Jeffrey Morgan 2024-01-09 15:53:33 -05:00
  • 0c84400477 calculate overhead based number of gpu devices Jeffrey Morgan 2024-01-09 15:04:03 -05:00
  • e89dc1d54b
    Merge pull request #1874 from dhiltgen/correct_cuda_min Daniel Hiltgen 2024-01-09 11:37:22 -08:00