Commit Graph

  • e890be4814 Revert "More parallelism on windows generate" Daniel Hiltgen 2024-06-17 13:32:46 -07:00
  • b2799f111b Move libraries out of users path Daniel Hiltgen 2024-06-15 13:17:20 -07:00
  • 152fc202f5 llm: update llama.cpp commit to 7c26775 (#4896) v0.1.45-rc2 Jeffrey Morgan 2024-06-17 15:56:16 -04:00
  • 4ad0d4d6d3 Fix a build warning (#5096) Lei Jitang 2024-06-18 02:47:48 +08:00
  • 9b5b69c00f llm: update llama.cpp submodule to 7c26775 jmorganca/llama-cpp-7c26775 jmorganca 2024-06-17 13:46:02 -04:00
  • 163cd3e77c gpu: add env var for detecting Intel oneapi gpus (#5076) Jeffrey Morgan 2024-06-16 20:09:05 -04:00
  • 4c2c8f93dd Merge pull request #5080 from dhiltgen/debug_intel_crash Daniel Hiltgen 2024-06-16 14:42:41 -07:00
  • fd1e6e0590 Add some more debugging logs for intel discovery Daniel Hiltgen 2024-06-16 07:42:52 -07:00
  • 89c79bec8c Add ModifiedAt Field to /api/show (#5033) royjhan 2024-06-15 20:53:56 -07:00
  • c7b77004e3 docs: add missing powershell package to windows development instructions (#5075) Jeffrey Morgan 2024-06-15 23:08:09 -04:00
  • 07d143f412 Merge pull request #5058 from coolljt0725/fix_build_warning Daniel Hiltgen 2024-06-15 11:52:36 -07:00
  • a12283e2ff Implement custom github release action Daniel Hiltgen 2024-06-15 08:26:54 -07:00
  • 4b0050cf0e Merge pull request #5037 from dhiltgen/faster_win_build v0.1.45-rc1 Daniel Hiltgen 2024-06-15 08:03:05 -07:00
  • 0577af98f4 More parallelism on windows generate Daniel Hiltgen 2024-06-13 17:13:01 -07:00
  • 17ce203a26 Merge pull request #4875 from dhiltgen/rocm_gfx900_workaround Daniel Hiltgen 2024-06-15 07:38:58 -07:00
  • d76555ffb5 Merge pull request #4874 from dhiltgen/rocm_v6_bump Daniel Hiltgen 2024-06-15 07:38:32 -07:00
  • 2786dff5d3 Merge pull request #4264 from dhiltgen/show_gpu_visible_settings Daniel Hiltgen 2024-06-15 07:33:52 -07:00
  • 225f0d1219 gpu: Fix build warning Lei Jitang 2024-06-15 14:26:23 +08:00
  • 532db58311 Merge pull request #4972 from jayson-cloude/main Daniel Hiltgen 2024-06-14 17:04:40 -07:00
  • 9357570d59 OpenAI Delete Endpoint royh-openai-delete Roy Han 2024-06-14 16:28:22 -07:00
  • 6be309e1bd Centralize GPU configuration vars Daniel Hiltgen 2024-05-08 11:11:50 -07:00
  • da3bf23354 Workaround gfx900 SDMA bugs Daniel Hiltgen 2024-05-31 16:15:21 -07:00
  • 26ab67732b Bump ROCm linux to 6.1.1 Daniel Hiltgen 2024-06-06 10:43:55 -07:00
  • 45cacbaf05 Merge pull request #4517 from dhiltgen/gpu_incremental Daniel Hiltgen 2024-06-14 15:35:00 -07:00
  • 17df6520c8 Remove mmap related output calc logic Daniel Hiltgen 2024-06-13 09:59:36 -07:00
  • 6f351bf586 review comments and coverage Daniel Hiltgen 2024-06-05 12:07:20 -07:00
  • ff4f0cbd1d Prevent multiple concurrent loads on the same gpus Daniel Hiltgen 2024-06-04 14:08:36 -07:00
  • fc37c192ae Refine CPU load behavior with system memory visibility Daniel Hiltgen 2024-06-03 19:09:23 -07:00
  • 434dfe30c5 Reintroduce nvidia nvml library for windows Daniel Hiltgen 2024-06-03 15:07:50 -07:00
  • 4e2b7e181d Refactor intel gpu discovery Daniel Hiltgen 2024-05-29 16:37:34 -07:00
  • 48702dd149 Harden unload for empty runners Daniel Hiltgen 2024-05-30 16:43:40 -07:00
  • 68dfc6236a refined test timing Daniel Hiltgen 2024-05-31 14:28:02 -07:00
  • 5e8ff556cb Support forced spreading for multi GPU Daniel Hiltgen 2024-05-08 14:32:42 -07:00
  • 6fd04ca922 Improve multi-gpu handling at the limit Daniel Hiltgen 2024-05-18 12:34:31 -07:00
  • 206797bda4 Fix concurrency integration test to work locally Daniel Hiltgen 2024-05-23 13:12:14 -07:00
  • 43ed358f9a Refine GPU discovery to bootstrap once Daniel Hiltgen 2024-05-15 15:13:16 -07:00
  • b32ebb4f29 Use DRM driver for VRAM info for amd Daniel Hiltgen 2024-05-14 16:18:42 -07:00
  • fb9cdfa723 Fix server.cpp for the new cuda build macros Daniel Hiltgen 2024-05-18 16:02:13 -07:00
  • efac488675 Revert "Limit GPU lib search for now (#4777)" Daniel Hiltgen 2024-06-03 08:31:48 -07:00
  • 6b800aa7b7 openai: do not set temperature to 0 when setting seed (#5045) Jeffrey Morgan 2024-06-14 13:43:56 -07:00
  • dd7c9ebeaf server: longer timeout in TestRequests (#5046) Jeffrey Morgan 2024-06-14 09:48:25 -07:00
  • 4dc7fb9525 update 40xx gpu compat matrix (#5036) Patrick Devine 2024-06-13 20:10:33 -04:00
  • c39761c552 Merge pull request #5032 from dhiltgen/actually_skip v0.1.44 Daniel Hiltgen 2024-06-13 13:26:09 -07:00
  • aac367636d Actually skip PhysX on windows Daniel Hiltgen 2024-06-13 13:17:19 -07:00
  • 15a687ae4b Merge pull request #5031 from ollama/mxyng/fix-multibyte-utf16 Michael Yang 2024-06-13 13:14:55 -07:00
  • d528e1af75 fix utf16 for multibyte runes Michael Yang 2024-06-13 11:39:01 -07:00
  • cd234ce22c parser: add test for multibyte runes Michael Yang 2024-06-13 11:09:22 -07:00
  • 94618b2365 add OLLAMA_MODELS to envconfig (#5029) Patrick Devine 2024-06-13 15:52:03 -04:00
  • 1fd236d177 server: remove jwt decoding error (#5027) Jeffrey Morgan 2024-06-13 11:21:15 -07:00
  • e87fc7200d Merge pull request #5025 from ollama/mxyng/revert-parser-scan Michael Yang 2024-06-13 10:31:25 -07:00
  • 20b9f8e6f4 Revert "proper utf16 support" Michael Yang 2024-06-13 10:22:16 -07:00
  • c69bc19e46 move OLLAMA_HOST to envconfig (#5009) Patrick Devine 2024-06-12 18:48:16 -04:00
  • 12209bd021 Remove Latest at API Level Roy Han 2024-06-12 14:43:17 -07:00
  • bba5d177aa Merge pull request #5004 from ollama/mxyng/fix-templates Michael Yang 2024-06-12 14:39:29 -07:00
  • c16f8af911 fix: multiple templates when creating from model Michael Yang 2024-06-12 13:30:08 -07:00
  • 217f60c3d9 Merge pull request #4987 from ollama/mxyng/revert-byte-order v0.1.43 Michael Yang 2024-06-11 16:04:20 -07:00
  • 7bdcd1da94 Revert "Merge pull request #4938 from ollama/mxyng/fix-byte-order" Michael Yang 2024-06-11 15:55:44 -07:00
  • ead259d877 llm: fix seed value not being applied to requests (#4986) Jeffrey Morgan 2024-06-11 14:24:41 -07:00
  • 2ff45d571d Add Ollama-hpp to Community Libraries in README. (#4983) James Montgomery 2024-06-11 14:15:05 -04:00
  • 157f09acdf fix: "Skip searching for network devices" jayson-cloude 2024-06-11 16:11:35 +08:00
  • be2c5fd71a Remove :latest from Ollama List Roy Han 2024-06-10 16:39:44 -07:00
  • 0f3cf1d42e Merge pull request #4715 from ollama/mxyng/utf16-parser Michael Yang 2024-06-10 11:41:29 -07:00
  • 5bc029c529 Merge pull request #4921 from ollama/mxyng/import-md Michael Yang 2024-06-10 11:41:09 -07:00
  • e9a9c6a8e8 Merge pull request #4965 from ollama/mxyng/skip-layer-remove Michael Yang 2024-06-10 11:40:03 -07:00
  • 515f497e6d fix: skip removing layers that no longer exist Michael Yang 2024-06-10 11:15:03 -07:00
  • b27268aaef add test Michael Yang 2024-06-10 11:31:34 -07:00
  • f5f245cc15 Merge pull request #4938 from ollama/mxyng/fix-byte-order Michael Yang 2024-06-10 09:38:12 -07:00
  • d63e1f5b34 Lint royh-show-rigid Roy Han 2024-06-10 09:36:05 -07:00
  • 94d37fdcae fix: examples/langchain-python-rag-privategpt/requirements.txt (#3382) Jim Scardelis 2024-06-09 10:58:09 -07:00
  • b84aea1685 Critical fix from llama.cpp JSON grammar to forbid un-escaped escape characters inside strings, which breaks parsing. (#3782) Craig Hughes 2024-06-09 13:57:09 -04:00
  • 896495de7b Add instructions to easily install specific versions on faq.md (#4084) Napuh 2024-06-09 19:49:03 +02:00
  • 5528dd9d11 Error handling load_single_document() in ingest.py (#4852) dcasota 2024-06-09 19:41:07 +02:00
  • 943172cbf4 Update api.md Jeffrey Morgan 2024-06-08 23:04:32 -07:00
  • d8b3e09fb7 llm: enable flash attention by default jmorganca/enable-fa jmorganca 2024-06-08 22:55:22 -07:00
  • 85169e8d6f Added headless-ollama (#4612) Nischal Jain 2024-06-09 07:21:16 +05:30
  • 34f142797a llm: always add bos token to prompt (#4941) Jeffrey Morgan 2024-06-08 18:47:10 -07:00
  • 46a7f1e74a Update README.md with LangChainRust (#4854) Erhan 2024-06-09 03:29:36 +03:00
  • 620d5c569e fix parsing big endian gguf Michael Yang 2024-06-08 12:32:02 -07:00
  • 239a994c47 Initial Functionality Roy Han 2024-06-07 17:44:24 -07:00
  • b9ce7bf75e update import.md Michael Yang 2024-06-07 16:45:15 -07:00
  • cddc63381c Merge pull request #4909 from dhiltgen/oneapi_disable Daniel Hiltgen 2024-06-07 14:07:15 -07:00
  • 385a32ecb5 Merge pull request #4910 from ollama/mxyng/detect-chat-template v0.1.42 Michael Yang 2024-06-07 11:07:39 -07:00
  • 030e765e76 fix create model when template detection errors Michael Yang 2024-06-07 08:55:46 -07:00
  • 05f79602f0 server: dont error on missing tokenizer.chat_template jmorganca/no-error-template jmorganca 2024-06-07 09:12:08 -07:00
  • ab8c929e20 Add ability to skip oneapi generate Daniel Hiltgen 2024-06-07 08:32:49 -07:00
  • ce0dc33cb8 llm: patch to fix qwen 2 temporarily on nvidia (#4897) Jeffrey Morgan 2024-06-06 23:14:33 -07:00
  • ebbaa8b513 Descriptive arg error messages and other fixes Roy Han 2024-06-06 17:24:14 -07:00
  • 78f81fc0e5 Merge pull request #4800 from ollama/mxyng/detect-chat-template Michael Yang 2024-06-06 16:17:18 -07:00
  • cdbda76fa9 Clean Up Roy Han 2024-06-06 16:17:07 -07:00
  • 9b6c2e6eb6 detect chat template from KV Michael Yang 2024-06-03 11:06:29 -07:00
  • 30f7064363 Initial Draft of Information Roy Han 2024-06-06 16:00:46 -07:00
  • 1a29e9a879 API app/browser access (#4879) royjhan 2024-06-06 15:19:03 -07:00
  • ccd624ca44 API Show Extended Roy Han 2024-06-06 14:43:05 -07:00
  • 4bf1da4944 Separate ListResponse and ModelResponse for api/tags vs api/ps (#4842) royjhan 2024-06-06 10:11:45 -07:00
  • de5beb06b3 server: skip blob verification for already verified blobs Blake Mizerany 2024-05-24 08:40:40 -07:00
  • 98e65929dc docs(tools): add gollama (#4829) Sam 2024-06-06 09:13:39 +12:00
  • 66ab48772f proper utf16 support Michael Yang 2024-05-29 21:37:07 -07:00
  • 22fcf8f7de Merge pull request #3737 from ollama/mxyng/modelname-4 Michael Yang 2024-06-05 12:05:05 -07:00
  • 28c7813ac4 API PS Documentation (#4822) royjhan 2024-06-05 11:06:53 -07:00
  • 1d8616d30f docs: update to add LLocal.in to web & desktop integrations (#4719) Kartikeya Mishra 2024-06-05 03:13:59 +05:30