Configure the systemd service via a separate file.

Instead of changing the systemd unit file, keep the configuration in a
separate file that the unit loads via `EnvironmentFile`.

refs #3516
Yuri Khrustalev 2024-07-10 13:15:00 +00:00
parent 22c5451fc2
commit 14d22902a7
3 changed files with 24 additions and 12 deletions

docs/faq.md

@@ -73,23 +73,17 @@ If Ollama is run as a macOS application, environment variables should be set usi
 ### Setting environment variables on Linux
-If Ollama is run as a systemd service, environment variables should be set using `systemctl`:
+If Ollama is run as a systemd service, environment variables should be set via the file specified in the `EnvironmentFile` option of the systemd unit:
-1. Edit the systemd service by calling `systemctl edit ollama.service`. This will open an editor.
-2. For each environment variable, add a line `Environment` under section `[Service]`:
+1. Edit the file `/etc/ollama/serve.conf`:
     ```ini
-    [Service]
-    Environment="OLLAMA_HOST=0.0.0.0"
+    OLLAMA_HOST=0.0.0.0
     ```
-3. Save and exit.
-4. Reload `systemd` and restart Ollama:
+2. Restart Ollama:
     ```bash
-    systemctl daemon-reload
     systemctl restart ollama
     ```
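
With this change, exposing Ollama on all interfaces becomes an edit-and-restart of the config file rather than a unit override. A minimal sketch, assuming the `serve.conf` created by the install script (third file below) is in place:

```bash
# Append a variable to the EnvironmentFile-backed config, then restart.
# No `systemctl daemon-reload` is needed: the unit file itself is unchanged,
# and EnvironmentFile is re-read each time the service starts.
echo 'OLLAMA_HOST=0.0.0.0' | sudo tee -a /etc/ollama/serve.conf
sudo systemctl restart ollama
```
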
@@ -272,4 +266,4 @@ The following server settings may be used to adjust how Ollama handles concurren
 - `OLLAMA_NUM_PARALLEL` - The maximum number of parallel requests each model will process at the same time. The default will auto-select either 4 or 1 based on available memory.
 - `OLLAMA_MAX_QUEUE` - The maximum number of requests Ollama will queue when busy before rejecting additional requests. The default is 512.
-Note: Windows with Radeon GPUs currently default to 1 model maximum due to limitations in ROCm v5.7 for available VRAM reporting. Once ROCm v6.2 is available, Windows Radeon will follow the defaults above. You may enable concurrent model loads on Radeon on Windows, but ensure you don't load more models than will fit into your GPUs VRAM.
+Note: Windows with Radeon GPUs currently default to 1 model maximum due to limitations in ROCm v5.7 for available VRAM reporting. Once ROCm v6.2 is available, Windows Radeon will follow the defaults above. You may enable concurrent model loads on Radeon on Windows, but ensure you don't load more models than will fit into your GPUs' VRAM.
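
Under the new layout these server settings go in the same config file. An illustrative snippet (the values are examples only, not recommendations):

```bash
# Illustrative values only: append concurrency settings to the config
# file read by the systemd unit, then restart to apply them.
printf '%s\n' 'OLLAMA_NUM_PARALLEL=2' 'OLLAMA_MAX_QUEUE=256' | sudo tee -a /etc/ollama/serve.conf
sudo systemctl restart ollama
```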

docs/linux.md

@@ -102,7 +102,7 @@ sudo chmod +x /usr/bin/ollama
 ## Installing specific versions
-Use `OLLAMA_VERSION` environment variable with the install script to install a specific version of Ollama, including pre-releases. You can find the version numbers in the [releases page](https://github.com/ollama/ollama/releases).
+Use the `OLLAMA_VERSION` environment variable with the install script to install a specific version of Ollama, including pre-releases. You can find the version numbers on the [releases page](https://github.com/ollama/ollama/releases).
 For example:
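
The example itself falls outside this hunk's context; a representative invocation (the version number here is illustrative) looks like:

```bash
# Pin the installer to a specific release instead of the latest build.
curl -fsSL https://ollama.com/install.sh | OLLAMA_VERSION=0.1.48 sh
```
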
@@ -141,3 +141,8 @@ sudo rm -r /usr/share/ollama
 sudo userdel ollama
 sudo groupdel ollama
 ```
+
+Remove the config:
+```bash
+sudo rm -rf /etc/ollama
+```

scripts/install.sh

@@ -99,6 +99,18 @@ configure_systemd() {
     status "Adding current user to ollama group..."
     $SUDO usermod -a -G ollama $(whoami)

+    if $SUDO test -f "/etc/ollama/serve.conf" ; then
+        status "Skipping existing ollama serve config file..."
+    else
+        status "Creating ollama serve config file..."
+        $SUDO mkdir -p /etc/ollama/
+        cat <<EOF | $SUDO tee /etc/ollama/serve.conf >/dev/null
+# The list of supported env variables: https://github.com/ollama/ollama/blob/main/envconfig/config.go
+#OLLAMA_DEBUG=1
+#OLLAMA_HOST=0.0.0.0:11434
+EOF
+    fi
+
     status "Creating ollama systemd service..."
     cat <<EOF | $SUDO tee /etc/systemd/system/ollama.service >/dev/null
 [Unit]
@@ -112,6 +124,7 @@ Group=ollama
 Restart=always
 RestartSec=3
 Environment="PATH=$PATH"
+EnvironmentFile=/etc/ollama/serve.conf

 [Install]
 WantedBy=default.target
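
With both pieces in place, the wiring can be sanity-checked after an install. A quick sketch, assuming a systemd host with the service running:

```bash
# Confirm the unit references the config file.
systemctl show ollama --property=EnvironmentFiles

# Inspect the live environment of the service process (requires root);
# prints nothing if no OLLAMA_* variables are set in serve.conf.
sudo cat "/proc/$(systemctl show -p MainPID --value ollama)/environ" | tr '\0' '\n' | grep '^OLLAMA'
```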