Roy Han
53e9576f46
testing clean up
2024-07-11 20:20:14 -07:00
Roy Han
cdb9fe9b06
test values
2024-07-10 09:57:36 -07:00
Roy Han
bcb63e6e0e
touches
2024-07-09 13:37:00 -07:00
Roy Han
c0fa2236cf
integration float32
2024-07-03 12:47:57 -07:00
Roy Han
1a0c8b363c
Truncation Integration Tests
2024-07-01 16:26:30 -07:00
Roy Han
e068e7f698
Integration Test Template
2024-07-01 15:24:26 -07:00
Daniel Hiltgen
6f351bf586
review comments and coverage
2024-06-14 14:55:50 -07:00
Daniel Hiltgen
68dfc6236a
refined test timing
...
adjust timing on some tests so they don't time out on small/slow GPUs
2024-06-14 14:51:40 -07:00
Daniel Hiltgen
6fd04ca922
Improve multi-gpu handling at the limit
...
Still not complete: our prediction needs some refinement to account for the
free space on each discrete GPU, so we can see how many layers fit in each one.
Since we can't split one layer across multiple GPUs, we can't treat free space
as one logical block.
2024-06-14 14:51:40 -07:00
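Note on the commit above: the point about not treating free space as one logical block can be illustrated with a small sketch. This is not the project's actual prediction code. The function names `layersThatFit` and `layersIfPooled`, the uniform layer size, and the byte figures are all hypothetical, chosen only to show why a pooled estimate overcounts when a layer must sit entirely on one GPU.

```go
package main

import "fmt"

// layersThatFit counts how many layers can be placed when each layer must
// live entirely on a single GPU. freeBytes holds the free memory per GPU;
// layerBytes is an assumed uniform per-layer size (hypothetical simplification).
func layersThatFit(freeBytes []uint64, layerBytes uint64) int {
	if layerBytes == 0 {
		return 0
	}
	total := 0
	for _, free := range freeBytes {
		// Integer division drops the leftover space that is too small
		// to hold another whole layer on this GPU.
		total += int(free / layerBytes)
	}
	return total
}

// layersIfPooled is the naive estimate that treats all free memory as one
// logical block, which is what the commit message warns against.
func layersIfPooled(freeBytes []uint64, layerBytes uint64) int {
	if layerBytes == 0 {
		return 0
	}
	var sum uint64
	for _, free := range freeBytes {
		sum += free
	}
	return int(sum / layerBytes)
}

func main() {
	free := []uint64{700, 700} // two GPUs, made-up free space
	const layer = 400          // made-up layer size

	fmt.Println("pooled estimate:", layersIfPooled(free, layer))
	fmt.Println("per-GPU estimate:", layersThatFit(free, layer))
}
```

With these made-up numbers the pooled estimate counts one more layer than can actually be placed, because each GPU has leftover space smaller than a single layer.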
Daniel Hiltgen
206797bda4
Fix concurrency integration test to work locally
...
This worked remotely but wound up trying to spawn multiple servers
locally, which doesn't work
2024-06-14 14:51:40 -07:00
Daniel Hiltgen
7f2fbad736
Skip max queue test on remote
...
This test needs to be able to adjust the queue size down from
our default setting for a reliable test, so it needs to skip on
remote test execution mode.
2024-05-16 16:24:18 -07:00
Daniel Hiltgen
074dc3b9d8
Integration fixes
2024-05-10 14:20:10 -07:00
Michael Yang
a7248f6ea8
update tests
2024-05-06 15:24:01 -07:00
Daniel Hiltgen
45d61aaaa3
Add integration test to push max queue limits
2024-05-05 10:46:25 -07:00
Daniel Hiltgen
f2ea8470e5
Local unicode test case
2024-04-22 19:29:12 -07:00
Daniel Hiltgen
34b9db5afc
Request and model concurrency
...
This change adds support for multiple concurrent requests, as well as
loading multiple models by spawning multiple runners. The default
settings are currently set at 1 concurrent request per model and only 1
loaded model at a time, but these can be adjusted by setting
OLLAMA_NUM_PARALLEL and OLLAMA_MAX_LOADED_MODELS.
2024-04-22 19:29:12 -07:00
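Note on the commit above: a minimal sketch of how a program might read the two settings with the stated defaults of 1. Only the variable names OLLAMA_NUM_PARALLEL and OLLAMA_MAX_LOADED_MODELS and the default values come from the commit message. The helper `parseLimit` and its fallback behavior for empty, non-numeric, or non-positive values are assumptions made for illustration, not the server's actual parsing logic.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// parseLimit converts a raw environment value to a positive integer.
// It falls back to def when the value is empty, not a number, or less
// than 1. This fallback policy is an assumption for illustration only.
func parseLimit(raw string, def int) int {
	n, err := strconv.Atoi(raw)
	if err != nil || n < 1 {
		return def
	}
	return n
}

func main() {
	// Defaults of 1 match the commit message: one concurrent request
	// per model and one loaded model at a time.
	numParallel := parseLimit(os.Getenv("OLLAMA_NUM_PARALLEL"), 1)
	maxLoaded := parseLimit(os.Getenv("OLLAMA_MAX_LOADED_MODELS"), 1)

	fmt.Printf("parallel requests per model: %d\n", numParallel)
	fmt.Printf("max loaded models: %d\n", maxLoaded)
}
```

Running it with, for example, OLLAMA_NUM_PARALLEL=4 set in the environment would report 4 parallel requests per model while leaving the loaded-model limit at its default.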
Daniel Hiltgen
aeb1fb5192
Add test case for context exhaustion
...
Confirmed this fails on 0.1.30 with known regression
but passes on main
2024-04-04 07:42:17 -07:00
Jeffrey Morgan
cd135317d2
Fix macOS builds on older SDKs (#3467)
2024-04-03 10:45:54 -07:00
Daniel Hiltgen
4fec5816d6
Integration test improvements
...
Cleaner shutdown logic, a bit of response hardening
2024-04-01 16:48:18 -07:00
Patrick Devine
1b272d5bcd
change github.com/jmorganca/ollama to github.com/ollama/ollama (#3347)
2024-03-26 13:04:17 -07:00
Daniel Hiltgen
7b6cbc10ec
Integration tests conditionally pull
...
If images aren't present, pull them.
Also fixes the expected responses
2024-03-25 08:57:45 -07:00
Daniel Hiltgen
949b6c01e0
Revamp go based integration tests
...
This uplevels the integration tests to run the server, which allows
testing against an existing server or a remote server.
2024-03-23 14:24:18 +01:00