ollama

Author	SHA1	Message	Date
Josh Yan	309307c8f9	update test, remove comments	2024-07-17 10:46:50 -07:00
Josh Yan	f378058b51	whitespace	2024-07-16 16:45:41 -07:00
Josh	d069cf753b	Merge branch 'main' into jyan/reord-g	2024-07-16 16:42:49 -07:00
Josh Yan	64405525b4	clean up	2024-07-16 16:40:38 -07:00
Josh Yan	dea2204b82	rmv comments	2024-07-16 16:37:50 -07:00
Josh Yan	6ee22d5080	clean	2024-07-16 16:35:15 -07:00
Josh Yan	703ecccc6b	clean	2024-07-16 14:17:44 -07:00
Josh Yan	873f334783	IT WORKS	2024-07-16 14:12:07 -07:00
Josh Yan	fa49bfc0bd	FIXED TESTS	2024-07-16 12:14:10 -07:00
Josh Yan	fc1b3ee9bf	test	2024-07-16 11:21:13 -07:00
Michael Yang	4a565cbf94	add chat and generate tests with mock runner	2024-07-16 09:39:31 -07:00
Josh Yan	25be20949c	test	2024-07-15 15:08:24 -07:00
royjhan	b9f5e16c80	Introduce `/api/embed` endpoint supporting batch embedding (#5127) * Initial Batch Embedding * Revert "Initial Batch Embedding" This reverts commit c22d54895a280b54c727279d85a5fc94defb5a29. * Initial Draft * mock up notes * api/embed draft * add server function * check normalization * clean up * normalization * playing around with truncate stuff * Truncation * Truncation * move normalization to go * Integration Test Template * Truncation Integration Tests * Clean up * use float32 * move normalize * move normalize test * refactoring * integration float32 * input handling and handler testing * Refactoring of legacy and new * clear comments * merge conflicts * touches * embedding type 64 * merge conflicts * fix hanging on single string * refactoring * test values * set context length * clean up * testing clean up * testing clean up * remove function closure * Revert "remove function closure" This reverts commit 55d48c6ed17abe42e7a122e69d603ef0c1506787. * remove function closure * remove redundant error check * clean up * more clean up * clean up	2024-07-15 12:14:24 -07:00
Josh Yan	903e9df46f	test	2024-07-15 11:46:49 -07:00
Josh Yan	40c0f9612e	unneccesary	2024-07-14 18:41:16 -07:00
Jeffrey Morgan	ef98803d63	llm: looser checks for minimum memory (#5677)	2024-07-13 09:20:05 -07:00
Josh Yan	15a0215203	running	2024-07-12 16:49:57 -07:00
Josh Yan	faa3c937cf	writeto	2024-07-12 15:37:27 -07:00
Josh Yan	cf57246aba	write	2024-07-12 12:59:51 -07:00
Josh Yan	6fafe4f753	gguf	2024-07-12 12:58:00 -07:00
Josh Yan	d7c8d4f3f4	ggufwritekv	2024-07-12 12:25:13 -07:00
Josh Yan	3d0fd31f0e	TensorWriter	2024-07-12 12:18:46 -07:00
Josh Yan	e75fb73839	types	2024-07-12 09:42:10 -07:00
Josh Yan	2fdebffc8d	sawp	2024-07-11 18:18:26 -07:00
Josh Yan	29ecfe493b	write	2024-07-11 17:56:51 -07:00
Josh	10e768826c	fix: quant err message (#5616)	2024-07-11 17:24:29 -07:00
Jeffrey Morgan	c4cf8ad559	llm: avoid loading model if system memory is too small (#5637) * llm: avoid loading model if system memory is too small * update log * Instrument swap free space On linux and windows, expose how much swap space is available so we can take that into consideration when scheduling models * use `systemSwapFreeMemory` in check --------- Co-authored-by: Daniel Hiltgen <daniel@ollama.com>	2024-07-11 16:42:57 -07:00
Jeffrey Morgan	791650ddef	sched: only error when over-allocating system memory (#5626)	2024-07-11 00:53:12 -07:00
Jeffrey Morgan	efbf41ed81	llm: dont link cuda with compat libs (#5621)	2024-07-10 20:01:52 -07:00
Michael Yang	37a570f962	Merge pull request #5612 from ollama/mxyng/mem chatglm graph	2024-07-10 14:18:33 -07:00
Michael Yang	5a739ff4cb	chatglm graph	2024-07-10 13:43:47 -07:00
Jeffrey Morgan	4e262eb2a8	remove `GGML_CUDA_FORCE_MMQ=on` from build (#5588)	2024-07-10 13:17:13 -07:00
Daniel Hiltgen	b50c818623	Merge pull request #5607 from dhiltgen/win_rocm_v6 Bump ROCm on windows to 6.1.2	2024-07-10 12:47:10 -07:00
Daniel Hiltgen	1f50356e8e	Bump ROCm on windows to 6.1.2 This also adjusts our algorithm to favor our bundled ROCm. I've confirmed VRAM reporting still doesn't work properly so we can't yet enable concurrency by default.	2024-07-10 11:01:22 -07:00
Daniel Hiltgen	22c81f62ec	Remove duplicate merge glitch	2024-07-10 09:01:33 -07:00
Daniel Hiltgen	2d1e3c3229	Merge pull request #5503 from dhiltgen/dual_rocm Workaround broken ROCm p2p copy	2024-07-09 15:44:16 -07:00
Daniel Hiltgen	b51e3b63ac	Statically link c++ and thread lib This makes sure we statically link the c++ and thread library on windows to avoid unnecessary runtime dependencies on non-standard DLLs	2024-07-09 11:34:30 -07:00
Michael Yang	9bbddc37a7	Merge pull request #5126 from ollama/mxyng/messages update message processing	2024-07-09 09:20:44 -07:00
Daniel Hiltgen	0bacb30007	Workaround broken ROCm p2p copy Enable the build flag for llama.cpp to use CPU copy for multi-GPU scenarios.	2024-07-08 09:40:52 -07:00
Jeffrey Morgan	53da2c6965	llm: remove ambiguous comment when putting upper limit on predictions to avoid infinite generation (#5535)	2024-07-07 14:32:05 -04:00
Jeffrey Morgan	d8def1ff94	llm: allow gemma 2 to context shift (#5534)	2024-07-07 13:41:51 -04:00
Jeffrey Morgan	571dc61955	Update llama.cpp submodule to `a8db2a9c` (#5530)	2024-07-07 13:03:09 -04:00
Jeffrey Morgan	0e09c380fc	llm: print caching notices in debug only (#5533)	2024-07-07 12:38:04 -04:00
Jeffrey Morgan	4607c70641	llm: add `-DBUILD_SHARED_LIBS=off` to common cpu cmake flags (#5520)	2024-07-06 18:58:16 -04:00
jmorganca	a08f20d910	release: remove unwanted mingw dll.a files	2024-07-06 15:21:15 -04:00
jmorganca	6cea036027	Revert "llm: only statically link libstdc++" This reverts commit 5796bfc4013f4ebe26cdbf13554332a25c405027.	2024-07-06 15:10:48 -04:00
jmorganca	5796bfc401	llm: only statically link libstdc++	2024-07-06 14:06:20 -04:00
jmorganca	f1a379aa56	llm: statically link pthread and stdc++ dependencies in windows build	2024-07-06 12:54:02 -04:00
jmorganca	9ae146993e	llm: add `GGML_STATIC` flag to windows static lib	2024-07-06 03:27:05 -04:00
Jeffrey Morgan	e0348d3fe8	llm: add `COMMON_DARWIN_DEFS` to arm static build (#5513)	2024-07-05 22:42:42 -04:00

603 Commits