Configure the systemd service via a separate file.

Instead of changing the systemd unit file, keep the configuration in a
separate file that the unit loads via `EnvironmentFile`.

refs #3516
Yuri Khrustalev 2024-07-10 13:15:00 +00:00
parent 22c5451fc2
commit 14d22902a7
3 changed files with 24 additions and 12 deletions

docs/faq.md

@@ -73,23 +73,17 @@ If Ollama is run as a macOS application, environment variables should be set usi
 ### Setting environment variables on Linux
-If Ollama is run as a systemd service, environment variables should be set using `systemctl`:
+If Ollama is run as a systemd service, environment variables should be set via the file specified in the `EnvironmentFile` option of the systemd unit:
-1. Edit the systemd service by calling `systemctl edit ollama.service`. This will open an editor.
-2. For each environment variable, add a line `Environment` under section `[Service]`:
+1. Edit the file `/etc/ollama/serve.conf`:
     ```ini
-    [Service]
-    Environment="OLLAMA_HOST=0.0.0.0"
+    OLLAMA_HOST=0.0.0.0
     ```
-3. Save and exit.
-4. Reload `systemd` and restart Ollama:
+2. Restart Ollama:
     ```bash
-    systemctl daemon-reload
     systemctl restart ollama
     ```
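
With this change, exposing Ollama on all interfaces becomes an edit-and-restart of the config file rather than a unit override. A minimal sketch, assuming the `serve.conf` created by the install script (third file below) is in place:

```bash
# Append a variable to the EnvironmentFile-backed config, then restart.
# No `systemctl daemon-reload` is needed: the unit file itself is unchanged,
# and EnvironmentFile is re-read each time the service starts.
echo 'OLLAMA_HOST=0.0.0.0' | sudo tee -a /etc/ollama/serve.conf
sudo systemctl restart ollama
```
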
@@ -272,4 +266,4 @@ The following server settings may be used to adjust how Ollama handles concurren
 - `OLLAMA_NUM_PARALLEL` - The maximum number of parallel requests each model will process at the same time. The default will auto-select either 4 or 1 based on available memory.
 - `OLLAMA_MAX_QUEUE` - The maximum number of requests Ollama will queue when busy before rejecting additional requests. The default is 512.
-Note: Windows with Radeon GPUs currently default to 1 model maximum due to limitations in ROCm v5.7 for available VRAM reporting. Once ROCm v6.2 is available, Windows Radeon will follow the defaults above. You may enable concurrent model loads on Radeon on Windows, but ensure you don't load more models than will fit into your GPUs VRAM.
+Note: Windows with Radeon GPUs currently default to 1 model maximum due to limitations in ROCm v5.7 for available VRAM reporting. Once ROCm v6.2 is available, Windows Radeon will follow the defaults above. You may enable concurrent model loads on Radeon on Windows, but ensure you don't load more models than will fit into your GPUs' VRAM.
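
Under the new layout these server settings go in the same config file. An illustrative snippet (the values are examples only, not recommendations):

```bash
# Illustrative values only: append concurrency settings to the config
# file read by the systemd unit, then restart to apply them.
printf '%s\n' 'OLLAMA_NUM_PARALLEL=2' 'OLLAMA_MAX_QUEUE=256' | sudo tee -a /etc/ollama/serve.conf
sudo systemctl restart ollama
```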

docs/linux.md

@@ -102,7 +102,7 @@ sudo chmod +x /usr/bin/ollama
 ## Installing specific versions
-Use `OLLAMA_VERSION` environment variable with the install script to install a specific version of Ollama, including pre-releases. You can find the version numbers in the [releases page](https://github.com/ollama/ollama/releases).
+Use the `OLLAMA_VERSION` environment variable with the install script to install a specific version of Ollama, including pre-releases. You can find the version numbers on the [releases page](https://github.com/ollama/ollama/releases).
 For example:
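
The example itself falls outside this hunk's context; a representative invocation (the version number here is illustrative) looks like:

```bash
# Pin the installer to a specific release instead of the latest build.
curl -fsSL https://ollama.com/install.sh | OLLAMA_VERSION=0.1.48 sh
```
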
@@ -141,3 +141,8 @@ sudo rm -r /usr/share/ollama
 sudo userdel ollama
 sudo groupdel ollama
 ```
+
+Remove the config:
+```bash
+sudo rm -rf /etc/ollama
+```

scripts/install.sh

@@ -99,6 +99,18 @@ configure_systemd() {
     status "Adding current user to ollama group..."
     $SUDO usermod -a -G ollama $(whoami)

+    if $SUDO test -f "/etc/ollama/serve.conf" ; then
+        status "Skipping existing ollama serve config file..."
+    else
+        status "Creating ollama serve config file..."
+        $SUDO mkdir -p /etc/ollama/
+        cat <<EOF | $SUDO tee /etc/ollama/serve.conf >/dev/null
+# The list of supported env variables: https://github.com/ollama/ollama/blob/main/envconfig/config.go
+#OLLAMA_DEBUG=1
+#OLLAMA_HOST=0.0.0.0:11434
+EOF
+    fi
+
     status "Creating ollama systemd service..."
     cat <<EOF | $SUDO tee /etc/systemd/system/ollama.service >/dev/null
 [Unit]
@@ -112,6 +124,7 @@ Group=ollama
 Restart=always
 RestartSec=3
 Environment="PATH=$PATH"
+EnvironmentFile=/etc/ollama/serve.conf

 [Install]
 WantedBy=default.target
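
With both pieces in place, the wiring can be sanity-checked after an install. A quick sketch, assuming a systemd host with the service running:

```bash
# Confirm the unit references the config file.
systemctl show ollama --property=EnvironmentFiles

# Inspect the live environment of the service process (requires root);
# prints nothing if no OLLAMA_* variables are set in serve.conf.
sudo cat "/proc/$(systemctl show -p MainPID --value ollama)/environ" | tr '\0' '\n' | grep '^OLLAMA'
```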