5 Commits

Author SHA1 Message Date
jmorganca
e1dfc757b3 revert llm changes 2024-09-03 21:15:13 -04:00
jmorganca
01ccbc07fe replace static build in llm 2024-09-03 21:15:12 -04:00
Bruce MacDonald
d6f692ad1a
Add support for IQ1_S, IQ3_S, IQ2_S, IQ4_XS. IQ4_NL (#4322)
Co-authored-by: ManniX-ITA <20623405+mann1x@users.noreply.github.com>
2024-05-23 13:21:49 -07:00
Michael Yang
01811c176a comments 2024-05-06 15:24:01 -07:00
Michael Yang
9685c34509 quantize any fp16/fp32 model
- FROM /path/to/{safetensors,pytorch}
- FROM /path/to/fp{16,32}.bin
- FROM model:fp{16,32}
2024-05-06 15:24:01 -07:00