Commit Graph

  • 3ecae420ac
    Update api.md (#3945) Darinka 2024-05-07 00:39:58 +03:00
  • a23ed714f4
    Update docs/api.md Jeffrey Morgan 2024-05-06 14:39:54 -07:00
  • 4cbbf0e13b
    Merge pull request #4090 from dhiltgen/rocm_paths Daniel Hiltgen 2024-05-06 14:33:41 -07:00
  • 380378cc80 Use our libraries first Daniel Hiltgen 2024-05-05 17:45:43 -07:00
  • 0963c65027
    Merge pull request #4208 from dhiltgen/fix_sched_test Daniel Hiltgen 2024-05-06 14:23:12 -07:00
  • ed740a2504
    Fix no slots available error with concurrent requests (#4160) Jeffrey Morgan 2024-05-06 14:22:53 -07:00
  • c9f98622b1
    Skip scheduling cancelled requests, always reload unloaded runners (#4189) Jeffrey Morgan 2024-05-06 14:22:24 -07:00
  • 0a954e5066 Fix stale test logic Daniel Hiltgen 2024-05-06 14:15:37 -07:00
  • aa93423fbf
    docs: pbcopy on mac (#3129) Adrien Brault 2024-05-06 22:47:00 +02:00
  • 01c9386267
    Add BrainSoup to compatible clients list (#3473) Nurgo 2024-05-06 22:42:16 +02:00
  • e232fe27ed Add BrainSoup to compatible clients list Grégory Journé 2024-04-03 11:10:23 +02:00
  • af9eb36f9f
    Merge pull request #4135 from dhiltgen/no_physx Daniel Hiltgen 2024-05-06 13:34:00 -07:00
  • 06093fd396
    Merge pull request #4067 from dhiltgen/cudart Daniel Hiltgen 2024-05-06 13:30:27 -07:00
  • d5d821e11f
    Update format/format.go Bruce MacDonald 2024-05-06 12:59:08 -07:00
  • 86b7fcac32
    Update README.md with StreamDeploy (#3621) Tony Loehr 2024-05-06 19:14:41 +01:00
  • 6440150e11
    Merge branch 'main' into patch-1 Bruce MacDonald 2024-05-06 11:14:27 -07:00
  • fb8ddc564e
    chore: delete HEAD (#4194) Hyden Liu 2024-05-07 01:32:30 +08:00
  • 242efe6611
    👌 IMPROVE: add portkey library for production tools (#4119) Saif 2024-05-06 22:55:23 +05:30
  • 2befddf132 log instead of fail David Carreto Fidalgo 2024-05-06 08:57:58 +02:00
  • b27bbaa318 chore: delete HEAD Hyden Liu 2024-05-06 13:26:28 +08:00
  • 62be4e3ff0 cleanup jmorganca 2024-05-05 21:03:30 -07:00
  • 401859b94a cleanup jmorganca 2024-05-05 20:56:51 -07:00
  • 1b0e6c9c0e
    Fix llava models not working after first request (#4164) Jeffrey Morgan 2024-05-05 20:50:31 -07:00
  • 80580ebef5 Skip scheduling cancelled requests, always reload unloaded runners jmorganca 2024-05-05 18:42:45 -07:00
  • e99a4339a4 still check server status in case of hangs jmorganca 2024-05-05 18:14:57 -07:00
  • ba9ff6455c fix linter error jmorganca 2024-05-05 17:59:59 -07:00
  • aa236720df fix build jmorganca 2024-05-05 17:57:59 -07:00
  • e0c64e573e remove retry on completion jmorganca 2024-05-05 17:53:02 -07:00
  • e10299bc6e Don't check server status as it will queue the request anyways jmorganca 2024-05-05 17:35:19 -07:00
  • 26dd10c987 individual requests only for llava models jmorganca 2024-05-05 15:59:17 -07:00
  • 4acf8599c6 fix llava models not working after first request jmorganca 2024-05-05 00:41:32 -07:00
  • dfa2f32ca0
    unload in critical section (#4187) Jeffrey Morgan 2024-05-05 17:18:27 -07:00
  • 840424a2c4
    Merge pull request #4154 from dhiltgen/central_config Daniel Hiltgen 2024-05-05 17:08:26 -07:00
  • 8a51d4a367 unload in critical section jmorganca 2024-05-05 17:00:05 -07:00
  • f56aa20014 Centralize server config handling Daniel Hiltgen 2024-05-04 11:46:01 -07:00
  • 6707768ebd
    chore: format go code (#4149) alwqx 2024-05-06 07:08:09 +08:00
  • c78bb76a12
    update libraries for langchain_community + llama3 changed from llama2 (#4174) Lord Basil - Automate EVERYTHING 2024-05-05 19:07:04 -04:00
  • 942c979232
    allocate a large enough kv cache for all parallel requests (#4162) Jeffrey Morgan 2024-05-05 15:59:32 -07:00
  • 06164911dd
    Update README.md (#4111) Bernardo de Oliveira Bruning 2024-05-05 18:45:32 -03:00
  • e8f64e38a2 move around some of the projects Patrick Devine 2024-05-05 14:40:32 -07:00
  • 588e2fcca2
    Merge remote-tracking branch 'upstream/main' Gamunu Balagalla 2024-05-06 00:22:36 +05:30
  • 2a21363bb7
    validate the format of the digest when getting the model path (#4175) Patrick Devine 2024-05-05 11:46:12 -07:00
  • 76c4af4d0e
    Update gpu.go alecvern 2024-05-05 21:18:34 +03:00
  • 4f1c2a5cdf validate the format of the digest when getting the model path Patrick Devine 2024-05-05 11:09:49 -07:00
  • 026869915f
    Merge pull request #4144 from dhiltgen/max_queue Daniel Hiltgen 2024-05-05 10:53:44 -07:00
  • 45d61aaaa3 Add integration test to push max queue limits Daniel Hiltgen 2024-05-05 10:06:33 -07:00
  • bc01529fe9
    update libraries for langchain_community + llama3 changed from llama2 Lord Basil - Automate EVERYTHING 2024-05-05 12:45:23 -04:00
  • c3e93f4f8c
    more OCD ordering Maurice Nonnekes 2024-05-05 16:40:11 +02:00
  • 798d01fccf
    reorder because OCD Maurice Nonnekes 2024-05-05 16:12:50 +02:00
  • 2c65b8d503
    Add support for BSD Maurice Nonnekes 2024-05-05 16:11:18 +02:00
  • 05b8d1b834
    Merge c9a159b357c7d91adb99744b5ce7a5e521625d20 into 371f5e52aa84b08ec94896f9bd91c9f2064b1288 Neko Ayaka 2024-05-05 22:04:25 +08:00
  • 12fb17bb29 allocate a large enough kv cache for all parallel requests jmorganca 2024-05-04 21:42:24 -07:00
  • 20f6c06569 Make maximum pending request configurable Daniel Hiltgen 2024-05-03 16:25:57 -07:00
  • 371f5e52aa
    Merge pull request #4141 from dhiltgen/win_docs Daniel Hiltgen 2024-05-04 12:50:16 -07:00
  • e006480e49 Explain the 2 different windows download options Daniel Hiltgen 2024-05-03 14:07:38 -07:00
  • 6d589df9e8
    fix: incorrect driver index in oneapi_device_info Gamunu Balagalla 2024-05-04 21:07:21 +05:30
  • 0a2ab4b718
    fix: improve GPU information retrieval Gamunu Balagalla 2024-05-04 19:02:55 +05:30
  • f6a7c51eb3 chore: format go code alwqx 2024-05-04 17:33:29 +08:00
  • a716c4f16a
    Merge branch 'ollama:main' into main 1feralcat 2024-05-04 13:32:09 +10:00
  • 34506e3559 'ollama serve' can host static files under $OLLAMA_WEBSITE via '/website' Linh Le 2024-05-04 13:12:42 +10:00
  • aed545872d
    Merge pull request #4143 from ollama/mxyng/final-response Michael Yang 2024-05-03 17:39:49 -07:00
  • 44869c59d6 omit prompt and generate settings from final response Michael Yang 2024-05-03 16:11:49 -07:00
  • 52663284cf
    Merge pull request #4145 from dhiltgen/fix_lint Daniel Hiltgen 2024-05-03 16:53:17 -07:00
  • 42fa9d7f0a Fix lint warnings Daniel Hiltgen 2024-05-03 16:44:19 -07:00
  • d40497b9a2 Add support for IQ1_S, IQ3_S, IQ2_S, IQ4_XS. IQ4_NL bruce/iq-quants Bruce MacDonald 2024-05-03 14:51:07 -07:00
  • 43285321b2 Add log to file flag for server Daniel Hiltgen 2024-05-03 13:29:49 -07:00
  • c9a159b357 update entry Patrick Devine 2024-05-03 16:24:57 -04:00
  • 54d4f01311 docs: added Ollama Operator into README.md as one of community projects Neko Ayaka 2024-04-21 14:12:30 +08:00
  • 828e4bf101 s/DisplayLongest/String/ Michael Yang 2024-05-01 10:34:39 -07:00
  • 05105903d8 only quantize language models Michael Yang 2024-04-25 09:01:20 -07:00
  • abf3b1fb34 no iterator Michael Yang 2024-04-25 08:53:08 -07:00
  • 82fcc0601d rebase Michael Yang 2024-04-24 15:06:47 -07:00
  • 185a927210 comments Michael Yang 2024-04-23 15:18:45 -07:00
  • 096ea2c8c3 update tests Michael Yang 2024-04-16 15:37:28 -07:00
  • 06b31e2e24 quantize any fp16/fp32 model Michael Yang 2024-04-12 13:55:12 -07:00
  • b7a87a22b6
    Merge pull request #4059 from ollama/mxyng/parser-2 Michael Yang 2024-05-03 13:01:22 -07:00
  • e8aaea030e
    Update 'llama2' -> 'llama3' in most places (#4116) Dr Nic Williams 2024-05-04 05:25:04 +10:00
  • f87855f0e0 updated files Patrick Devine 2024-05-03 14:56:51 -04:00
  • ef2c29172b Update 'llama2' -> 'llama3' in most places Patrick Devine 2024-05-03 15:03:14 -04:00
  • b1ad3a43cb Skip PhysX cudart library Daniel Hiltgen 2024-05-03 11:55:32 -07:00
  • 267e25a750
    Merge pull request #4129 from dhiltgen/unit_tests Daniel Hiltgen 2024-05-03 11:10:26 -07:00
  • dc5c4c85d8
    Merge remote-tracking branch 'upstream/main' into dev Gamunu Balagalla 2024-05-03 23:15:07 +05:30
  • 4a40ed2513
    feat: add device count function to init Gamunu Balagalla 2024-04-24 20:14:26 +05:30
  • d33089fa99
    Merge branch 'ollama:main' into main Jim Scardelis 2024-05-03 09:39:31 -07:00
  • 9a32c514cb Soften timeouts on sched unit tests Daniel Hiltgen 2024-05-03 09:08:33 -07:00
  • 0905a7b9c5 Add entry in FAQ David Carreto Fidalgo 2024-05-03 11:40:21 +02:00
  • 19bf235ee7 👌 IMPROVE: add portkey library for production tools Saif Ali Shaik 2024-05-03 12:30:36 +05:30
  • 61b287cf25 types/model: make Name.Filepath substitute colons in host with ("%") bmizerany/filepathwithcoloninhost Blake Mizerany 2024-05-02 14:36:46 -07:00
  • a489ec7d90
    Update README.md josc146 2024-05-03 13:16:43 +08:00
  • 76c3108984
    Merge branch 'ollama:main' into main Jesse C. Lin 2024-05-03 10:48:20 +08:00
  • 3e7e3a7c1d
    Merge 9bfd21935dd03fb3175885d8d5f83b4601f26ae0 into e9ae607ece49384f391e7e52e32a58addb336095 James Braza 2024-05-03 09:54:11 +08:00
  • 7a5caad2cd Update README.md Bernardo Bruning 2024-05-02 21:31:23 -03:00
  • 65cc940b69
    Merge dc474f9b83c3a7fb9624c3cf38ce46a807127ca5 into 122b35c7840b96fdacb74e616f6816151c5aa01e Michael Yang 2024-05-02 17:06:24 -07:00
  • dc474f9b83 handle intermediate blobs mxyng/split-bin Michael Yang 2024-05-01 15:19:33 -07:00
  • 41ae232e10 split model layer into metadata and data layers Michael Yang 2024-05-01 11:08:04 -07:00
  • 122b35c784 s/DisplayLongest/String/ Michael Yang 2024-05-01 10:34:39 -07:00
  • 3244a25c79 only quantize language models Michael Yang 2024-04-25 09:01:20 -07:00
  • b535afe35c no iterator Michael Yang 2024-04-25 08:53:08 -07:00
  • fd071eab8b rebase Michael Yang 2024-04-24 15:06:47 -07:00
  • da0bb5d772 comments Michael Yang 2024-04-23 15:18:45 -07:00