Commit Graph

  • 7e571f95f0 trimspace test case Josh Yan 2024-07-01 11:07:48 -07:00
  • da8e2a0447 use kvs to detect embedding models Michael Yang 2024-06-14 14:57:49 -07:00
  • a30915bde1 add capabilities Michael Yang 2024-06-11 14:03:42 -07:00
  • 58e3fff311 rename templates to template Michael Yang 2024-06-10 14:54:42 -07:00
  • 3f0b309ad4 remove ManifestV2 Michael Yang 2024-06-10 08:47:13 -07:00
  • e70610ef06 Merge pull request #5410 from dhiltgen/ctx_cleanup Daniel Hiltgen 2024-07-01 09:54:20 -07:00
  • dfded7e075 Merge pull request #5364 from dhiltgen/concurrency_docs Daniel Hiltgen 2024-07-01 09:49:48 -07:00
  • 173b550438 Remove default auto from help message Daniel Hiltgen 2024-07-01 09:48:05 -07:00
  • cff3f44f4a Fix case for NumCtx Daniel Hiltgen 2024-07-01 09:43:59 -07:00
  • 26e4e66faf updated parsefile test Josh Yan 2024-07-01 09:43:49 -07:00
  • b7860f12ad reference license, template, system as files mxyng/layers-from-files Michael Yang 2024-07-01 08:53:25 -07:00
  • e401a23d62 cmd build context Michael Yang 2024-06-30 10:24:31 -07:00
  • 97c9e11768 Switch use_mmap to a pointer type Daniel Hiltgen 2024-06-28 09:57:10 -07:00
  • d46f6ea512 err on insecure path Michael Yang 2024-06-30 11:10:40 -07:00
  • 3518aaef33 Merge pull request #4218 from dhiltgen/auto_parallel Daniel Hiltgen 2024-07-01 08:32:29 -07:00
  • 7893ccb68c introduce build.go for controlling distribution builds build_dist Blake Mizerany 2024-04-19 14:23:27 -07:00
  • 1963c00201 Update README.md (#5214) bmizerany/noseek RAPID ARCHITECT 2024-06-30 21:00:57 -05:00
  • 27402cb7a2 Update gpu.md (#5382) Eduard 2024-07-01 03:48:51 +02:00
  • c1218199cf Update api.md Jeffrey Morgan 2024-06-29 16:22:49 -07:00
  • 717f7229eb Do not shift context for sliding window models (#5368) v0.1.48 Jeffrey Morgan 2024-06-28 19:39:31 -07:00
  • 80c1a3f812 playing around with truncate stuff Roy Han 2024-06-28 18:17:09 -07:00
  • c111d8bb51 normalization Roy Han 2024-06-28 17:19:04 -07:00
  • 5213c12354 clean up Roy Han 2024-06-28 15:26:58 -07:00
  • b9c74df37b check normalization Roy Han 2024-06-28 15:10:58 -07:00
  • 49e341147d add server function Roy Han 2024-06-28 15:03:53 -07:00
  • c406fa7a4c api/embed draft Roy Han 2024-06-28 14:54:21 -07:00
  • 22458c573a mock up notes Roy Han 2024-06-28 14:21:45 -07:00
  • 0ac5cbc00e separate deprecation changes royh-ls Roy Han 2024-06-28 13:22:37 -07:00
  • aae56abb7c Document concurrent behavior and settings Daniel Hiltgen 2024-06-28 13:15:57 -07:00
  • 5f034f5b63 Include Show Info in Interactive (#5342) royjhan 2024-06-28 13:15:52 -07:00
  • 1071e17626 lint royh-name Roy Han 2024-06-28 13:14:49 -07:00
  • c9fd7a730a changes Roy Han 2024-06-28 13:10:39 -07:00
  • 01ecaf95fe deprecate Roy Han 2024-06-28 13:07:44 -07:00
  • b910fa9010 Ollama Show: Check for Projector Type (#5307) royjhan 2024-06-28 11:30:16 -07:00
  • 6d4219083c Update docs (#5312) royjhan 2024-06-28 09:58:14 -07:00
  • d77a174eb4 defaut timeout timeout Roy Han 2024-06-27 14:58:31 -07:00
  • 1ed4f521c4 Merge pull request #5340 from ollama/mxyng/mem Michael Yang 2024-06-27 14:26:49 -07:00
  • de2163dafd gemma2 graph Michael Yang 2024-06-27 10:52:25 -07:00
  • 9bd00041fa trim all params Josh Yan 2024-06-27 11:18:38 -07:00
  • 4e986a823c unquote, trimp space Josh Yan 2024-06-27 10:59:15 -07:00
  • 2cc7d05012 update readme for gemma 2 (#5333) Michael 2024-06-27 12:45:16 -04:00
  • 123a722a6f zip: prevent extracting files into parent dirs (#5314) v0.1.47 Michael Yang 2024-06-26 21:38:21 -07:00
  • 4d311eb731 llm: architecture patch (#5316) Jeffrey Morgan 2024-06-26 21:38:12 -07:00
  • 02169f3e60 Update docs Roy Han 2024-06-26 14:30:28 -07:00
  • ff191d7cba Initial Draft Roy Han 2024-06-25 13:29:47 -07:00
  • 70d31c1e9a use timestamp from challenge, fallback to local time mxyng/server-timestamp Michael Yang 2024-06-25 10:12:02 -07:00
  • cb42e607c5 llm: speed up gguf decoding by a lot (#5246) v0.1.46 Blake Mizerany 2024-06-24 21:47:52 -07:00
  • 2aa91a937b cmd: defer stating model info until necessary (#5248) Blake Mizerany 2024-06-24 20:14:03 -07:00
  • 0f87628b6d Revert "Initial Batch Embedding" Roy Han 2024-06-24 15:26:05 -07:00
  • c71698426c Separate Rounding Functions Roy Han 2024-06-24 11:09:08 -07:00
  • f93cdfdfae Standardize with ollama.com Roy Han 2024-06-24 10:53:15 -07:00
  • acbffa59e9 llm: suppress large allocations for GGUF arrays bmizerany/nosillyggufslurps Blake Mizerany 2024-06-23 13:55:48 -07:00
  • ccef9431c8 Merge pull request #5205 from dhiltgen/modelfile_use_mmap Daniel Hiltgen 2024-06-21 16:30:36 -07:00
  • 642cee1342 Sort the ps output Daniel Hiltgen 2024-06-21 15:59:41 -07:00
  • 9a9e7d83c4 Docs (#5149) royjhan 2024-06-21 15:52:09 -07:00
  • 9929751cc8 Disable concurrency for AMD + Windows Daniel Hiltgen 2024-06-19 13:35:38 -07:00
  • 17b7186cd7 Enable concurrency by default Daniel Hiltgen 2024-05-06 17:47:52 -07:00
  • 189a43caa2 Merge pull request #5206 from ollama/mxyng/quantize Michael Yang 2024-06-21 13:44:34 -07:00
  • e835ef1836 fix: quantization with template Michael Yang 2024-06-21 13:30:43 -07:00
  • 7e7749224c Fix use_mmap parsing for modelfiles Daniel Hiltgen 2024-06-21 12:27:19 -07:00
  • c7c2f3bc22 Merge pull request #5194 from dhiltgen/linux_mmap_auto Daniel Hiltgen 2024-06-20 11:44:08 -07:00
  • 54a79d6a8a Merge pull request #5125 from dhiltgen/fedora39 Daniel Hiltgen 2024-06-20 11:27:24 -07:00
  • 5bf5aeec01 Refine mmap default logic on linux Daniel Hiltgen 2024-06-20 11:07:04 -07:00
  • e01e535cbb Merge pull request #5192 from ollama/mxyng/kv v0.1.45-rc5 v0.1.45 Michael Yang 2024-06-20 10:46:24 -07:00
  • 0195d6a2f8 Merge pull request #5188 from ollama/jyan/tmpdir2 Josh 2024-06-20 10:40:59 -07:00
  • af370ac178 Parameter Precision Roy Han 2024-06-20 10:38:31 -07:00
  • 8e0641a9bf handle asymmetric embedding KVs Michael Yang 2024-06-20 09:40:17 -07:00
  • 662568d453 err!=nil check Josh Yan 2024-06-20 09:30:59 -07:00
  • 4ebb66c662 reformat error check Josh Yan 2024-06-20 09:23:43 -07:00
  • c494aea5c8 Strip stop strings royh-params Roy Han 2024-06-20 09:06:08 -07:00
  • 23e899f32d skip os.removeAll() if PID does not exist Josh Yan 2024-06-20 08:51:35 -07:00
  • fedf71635e Extend api/show and ollama show to return more model info (#4881) v0.1.45-rc4 royjhan 2024-06-19 14:19:02 -07:00
  • 97c59be653 Merge pull request #5074 from dhiltgen/app_log_rotation Daniel Hiltgen 2024-06-19 13:02:24 -07:00
  • 9d8a4988e8 Implement log rotation for tray app Daniel Hiltgen 2024-06-15 16:30:37 -07:00
  • 1ae0750a21 Merge pull request #5147 from ollama/mxyng/cleanup Michael Yang 2024-06-19 12:50:31 -07:00
  • 9d91e5e587 remove confusing log message Michael Yang 2024-06-19 11:14:11 -07:00
  • 96624aa412 Merge pull request #5072 from dhiltgen/windows_path Daniel Hiltgen 2024-06-19 09:13:39 -07:00
  • 10f33b8537 Merge pull request #5146 from dhiltgen/backout Daniel Hiltgen 2024-06-19 09:12:45 -07:00
  • 4a633cc295 Merge pull request #5145 from dhiltgen/bad_loads Daniel Hiltgen 2024-06-19 09:12:33 -07:00
  • d34d88e417 Revert "Revert "gpu: add env var for detecting Intel oneapi gpus (#5076)"" Daniel Hiltgen 2024-06-19 08:57:41 -07:00
  • 52ce350b7a Fix bad symbol load detection Daniel Hiltgen 2024-06-19 08:39:07 -07:00
  • 2abebb2cbe Merge pull request #5128 from zhewang1-intc/fix_levelzero_empty_symbol_detect Daniel Hiltgen 2024-06-19 08:33:16 -07:00
  • 380e06e5be types/model: remove Digest Blake Mizerany 2024-06-18 13:29:38 -07:00
  • badf975e45 get real func ptr. Wang,Zhe 2024-06-19 09:00:51 +08:00
  • 755b4e4fc2 Revert "gpu: add env var for detecting Intel oneapi gpus (#5076)" Wang,Zhe 2024-06-19 08:59:58 +08:00
  • c22d54895a Initial Batch Embedding Roy Han 2024-06-18 17:34:36 -07:00
  • 1a1c99e334 Bump latest fedora cuda repo to 39 Daniel Hiltgen 2024-06-18 17:13:54 -07:00
  • 21adf8b6d2 Merge pull request #5121 from ollama/mxyng/deepseekv2 v0.1.45-rc3 Michael Yang 2024-06-18 16:30:58 -07:00
  • 784bf88b0d Wire up windows AMD driver reporting Daniel Hiltgen 2024-06-18 16:22:47 -07:00
  • e873841cbb deepseek v2 graph Michael Yang 2024-06-18 12:42:37 -07:00
  • 26d0bf9236 Merge pull request #5117 from dhiltgen/fix_prediction Daniel Hiltgen 2024-06-18 11:36:51 -07:00
  • 359b15a597 Handle models with divergent layer sizes Daniel Hiltgen 2024-06-18 11:05:34 -07:00
  • b55958a587 Merge pull request #5106 from dhiltgen/clean_logs Daniel Hiltgen 2024-06-18 09:24:38 -07:00
  • 7784ca33ce Tighten up memory prediction logging Daniel Hiltgen 2024-06-17 18:39:48 -07:00
  • c9c8c98bf6 Merge pull request #5105 from dhiltgen/cuda_mmap Daniel Hiltgen 2024-06-17 17:07:30 -07:00
  • 171796791f Adjust mmap logic for cuda windows for faster model load Daniel Hiltgen 2024-06-17 12:14:42 -07:00
  • 176d0f7075 Update import.md Jeffrey Morgan 2024-06-17 19:44:14 -04:00
  • 8ed51cac37 Merge pull request #5103 from dhiltgen/faster_win_build Daniel Hiltgen 2024-06-17 14:23:18 -07:00
  • c9e6f0542d Merge pull request #5069 from dhiltgen/ci_release Daniel Hiltgen 2024-06-17 13:59:37 -07:00
  • b0930626c5 Add back lower level parallel flags Daniel Hiltgen 2024-06-17 13:44:46 -07:00