diff --git a/README.md b/README.md
new file mode 100644
index 0000000..5e2ee0f
--- /dev/null
+++ b/README.md
@@ -0,0 +1,49 @@
+# Nomad Homelab Stacks
+
+This repository contains HashiCorp Nomad job definitions for services deployed in a homelab environment, along with a Gitea Actions workflow for continuous deployment of these stacks.
+
+## Repository Structure
+
+The `stacks/` directory contains subdirectories for different service categories, with each subdirectory holding one or more Nomad job files (`.nomad`). Each stack also has its own `README.md` with detailed information; a minimal sketch of what such a job file looks like appears at the end of this README.
+
+- [`stacks/ai/ai-backend/README.md`](stacks/ai/ai-backend/README.md): Documentation for the Ollama AI backend.
+- [`stacks/ai/ai-frontend/README.md`](stacks/ai/ai-frontend/README.md): Documentation for the Open WebUI and LobeChat AI frontends.
+- [`stacks/networking/newt/README.md`](stacks/networking/newt/README.md): Documentation for the Project Newt networking agent.
+
+## Gitea Actions Deployment Workflow
+
+The `.gitea/workflows/deploy.yaml` file defines a Gitea Actions workflow that automates the deployment of Nomad jobs from this repository.
+
+### Workflow: `Deploy to Nomad`
+
+- **Trigger**: Manually dispatched via `workflow_dispatch`.
+- **Inputs**: Requires `stack_name` (a choice of `ai-backend`, `ai-frontend`, or `newt`). This corresponds to the name of the Nomad job file (without the `.nomad` extension).
+- **Jobs**:
+  - **`deploy`**: Runs on `ubuntu-latest`.
+    1. **Checkout**: Clones the repository.
+    2. **Install Nomad CLI (Universal)**: Detects the runner architecture (amd64 or arm64), installs `unzip` and `curl`, then downloads and installs Nomad CLI version 1.9.2.
+    3. **Run Deploy**:
+       - Sets the `NOMAD_ADDR` environment variable to `http://192.168.1.133:4646`.
+       - Finds the specified `STACK.nomad` file within the `stacks/` directory (including subfolders).
+       - Executes `nomad job run` against the located file to deploy the selected Nomad job.
+
+### How to use the Gitea Workflow
+
+1. Navigate to the "Actions" tab in your Gitea repository.
+2. Select the "Deploy to Nomad" workflow.
+3. Click "Run workflow" and choose the `stack_name` you wish to deploy (e.g., `ai-backend`, `ai-frontend`, or `newt`).
+4. Confirm the deployment. The workflow installs the Nomad CLI, locates the job file, and deploys it to your configured Nomad server.
+
+This workflow provides a consistent, automated way to deploy and update services in the homelab.
+
+## Projects Involved (Overview)
+
+- **[HashiCorp Nomad](https://www.nomadproject.io/)**: A workload orchestrator.
+- **[Gitea Actions](https://docs.gitea.io/en-us/actions/)**: A CI/CD solution integrated with Gitea.
+- **[Podman](https://podman.io/)**: A daemonless container engine (used by most Nomad jobs in this repo).
+- **[Traefik](https://traefik.io/traefik/)**: An open-source edge router (used for some services in this repo).
+- **[HashiCorp Consul](https://www.consul.io/)**: A service mesh solution (used for service discovery).
+- **[Ollama](https://ollama.com/)**: A tool to run large language models locally.
+- **[Open WebUI](https://docs.openwebui.com/)**: A user-friendly, open-source web interface for LLMs.
+- **[LobeChat](https://github.com/lobehub/lobe-chat)**: An open-source, high-performance, extensible LLM chatbot framework.
+- **[Project Newt](https://github.com/fosrl/newt)**: A project for secure and resilient overlay networking.
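+
+### Appendix: what a stack job file looks like
+
+Below is a minimal, illustrative sketch of the general shape of a `.nomad` job file in this repository. The job, group, and task names and all values here are placeholders (only the datacenter name matches the real stacks); see the per-stack READMEs for the actual configurations.
+
+```hcl
+# Illustrative skeleton only - not an actual stack from this repository.
+job "example-stack" {
+  datacenters = ["Homelab-PTECH-DC"]
+
+  group "example-group" {
+    network {
+      port "http" {
+        static = 8080
+      }
+    }
+
+    task "example" {
+      driver = "podman"
+
+      config {
+        image = "docker.io/library/nginx:latest"
+      }
+    }
+  }
+}
+```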
\ No newline at end of file
diff --git a/stacks/ai/ai-backend/README.md b/stacks/ai/ai-backend/README.md
new file mode 100644
index 0000000..d3c5d27
--- /dev/null
+++ b/stacks/ai/ai-backend/README.md
@@ -0,0 +1,50 @@
+# AI Backend (Ollama) Nomad Job
+
+This Nomad job deploys an Ollama server, which provides a local large language model (LLM) serving environment. It is configured to run on a specific host with GPU acceleration via Vulkan.
+
+## What is this file?
+
+The [`ai-backend.nomad`](stacks/ai/ai-backend.nomad) file is a HashiCorp Nomad job specification written in HCL (HashiCorp Configuration Language). It describes how to deploy and manage the Ollama service.
+
+Key configurations (a condensed HCL sketch of how these blocks fit together follows the deployment steps below):
+- **`job "ai-backend"`**: The main job definition.
+- **`datacenters = ["Homelab-PTECH-DC"]`**: Specifies the datacenter where this job should run.
+- **`group "ollama-group"`**: Defines the task group.
+- **`constraint { attribute = "${meta.device}"; value = "p52-laptop" }`**: Ensures the job runs on the node whose `meta.device` attribute is set to `p52-laptop`.
+- **`network { port "api" { static = 11434 } }`**: Exposes port 11434 for the Ollama API.
+- **`task "ollama"`**: The task running the Ollama container.
+  - **`driver = "podman"`**: Uses Podman to run the container.
+  - **`env`**: Environment variables for the Ollama container:
+    - `OLLAMA_HOST = "0.0.0.0:11434"`: Binds Ollama to all network interfaces on port 11434.
+    - `OLLAMA_ORIGINS = "*"`: Allows requests from any origin (CORS).
+    - `OLLAMA_VULKAN = "1"`: Enables Vulkan for GPU acceleration.
+    - `HSA_OVERRIDE_GFX_VERSION = "10.3.0"`: Fallback for ROCm, though Vulkan takes priority.
+  - **`config`**: Podman-specific configuration:
+    - `image = "docker.io/ollama/ollama:latest"`: Uses the latest Ollama Docker image.
+    - `privileged = true`: Grants extended privileges to the container, required for direct GPU hardware access.
+    - `volumes`: Mounts for persistent data and GPU devices:
+      - `"/mnt/local-ssd/nomad/stacks/ai/ai-backend/ollama:/root/.ollama"`: Persistent storage for Ollama models and data.
+      - `"/dev/kfd:/dev/kfd"` and `"/dev/dri:/dev/dri"`: Direct access to the AMD GPU kernel driver and DRM (Direct Rendering Manager) devices for Vulkan.
+- **`service "ollama"`**: Registers the Ollama service with Consul.
+  - `tags = ["traefik.enable=true"]`: Enables Traefik ingress for this service.
+
+## How to use it
+
+To deploy this AI backend:
+
+1. Ensure you have a Nomad cluster with a client node whose `meta.device` is set to `p52-laptop`, with Podman and appropriate GPU drivers installed.
+2. Make sure the directory `/mnt/local-ssd/nomad/stacks/ai/ai-backend/ollama` exists on the host for persistent data.
+3. Execute the following command on your Nomad server (or a machine with Nomad CLI access configured to connect to your server):
+
+   ```bash
+   nomad job run stacks/ai/ai-backend.nomad
+   ```
+
+After deployment, Ollama is accessible on port 11434 on the host machine, and via Traefik if properly configured.
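+
+The key blocks above fit together roughly as follows. This is a condensed sketch reconstructed from the summary, not a verbatim copy of `ai-backend.nomad`; in particular, the placement of the `constraint` and `service` blocks and the `port = "api"` reference are assumptions.
+
+```hcl
+job "ai-backend" {
+  datacenters = ["Homelab-PTECH-DC"]
+
+  group "ollama-group" {
+    # Pin the group to the GPU host (block placement is assumed).
+    constraint {
+      attribute = "${meta.device}"
+      value     = "p52-laptop"
+    }
+
+    network {
+      port "api" {
+        static = 11434
+      }
+    }
+
+    service {
+      name = "ollama"
+      port = "api"
+      tags = ["traefik.enable=true"]
+    }
+
+    task "ollama" {
+      driver = "podman"
+
+      env {
+        OLLAMA_HOST              = "0.0.0.0:11434"
+        OLLAMA_ORIGINS           = "*"
+        OLLAMA_VULKAN            = "1"
+        HSA_OVERRIDE_GFX_VERSION = "10.3.0"
+      }
+
+      config {
+        image      = "docker.io/ollama/ollama:latest"
+        privileged = true
+        volumes = [
+          "/mnt/local-ssd/nomad/stacks/ai/ai-backend/ollama:/root/.ollama",
+          "/dev/kfd:/dev/kfd",
+          "/dev/dri:/dev/dri",
+        ]
+      }
+    }
+  }
+}
+```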
+
+## Projects Involved
+
+- **[HashiCorp Nomad](https://www.nomadproject.io/)**: A workload orchestrator that enables an organization to easily deploy and manage any containerized or non-containerized application.
+- **[Ollama](https://ollama.com/)**: A tool to run large language models locally.
+- **[Podman](https://podman.io/)**: A daemonless container engine for developing, managing, and running OCI containers on your Linux system.
+- **[Traefik](https://traefik.io/traefik/)**: An open-source edge router that receives incoming requests and routes them to the services responsible for handling them.
diff --git a/stacks/ai/ai-frontend/README.md b/stacks/ai/ai-frontend/README.md
new file mode 100644
index 0000000..109f09f
--- /dev/null
+++ b/stacks/ai/ai-frontend/README.md
@@ -0,0 +1,61 @@
+# AI Frontend Nomad Job
+
+This Nomad job deploys two AI frontend applications: Open WebUI and LobeChat. Both frontends are designed to interact with an Ollama backend, such as the one defined in `ai-backend.nomad`.
+
+## What is this file?
+
+The [`ai-frontend.nomad`](stacks/ai/ai-frontend.nomad) file is a HashiCorp Nomad job specification written in HCL. It describes how to deploy and manage the Open WebUI and LobeChat services.
+
+Key configurations:
+
+### Open WebUI Group
+- **`group "openwebui"`**: Defines the task group for Open WebUI.
+- **`constraint { attribute = "${attr.unique.hostname}"; value = "hp1-home" }`**: Ensures Open WebUI runs on the node with hostname `hp1-home`.
+- **`network { port "http" { static = 8080; to = 8080 } }`**: Exposes port 8080 for Open WebUI.
+- **`service "openwebui"`**: Registers the service with Consul.
+  - `tags = ["traefik.enable=true"]`: Enables Traefik ingress.
+- **`task "server"`**: The Open WebUI container.
+  - **`driver = "podman"`**: Uses Podman.
+  - **`env { OLLAMA_BASE_URL = "http://ollama:11434" }`**: Configures Open WebUI to connect to the Ollama service.
+  - **`config { image = "ghcr.io/open-webui/open-webui:main" }`**: Uses the official Open WebUI image.
+  - **`volumes = ["/mnt/local-ssd/nomad/stacks/ai/ai-frontend/openwebui:/app/backend/data"]`**: Persistent storage for Open WebUI data.
+
+### LobeChat Group
+- **`group "lobechat"`**: Defines the task group for LobeChat.
+- **`constraint { attribute = "${attr.unique.hostname}"; value = "hp1-home" }`**: Ensures LobeChat runs on the node with hostname `hp1-home`.
+- **`network { port "http" { static = 3210; to = 3210 } }`**: Exposes port 3210 for LobeChat.
+- **`service "lobechat"`**: Registers the service with Consul.
+  - *No Traefik tags*: This service is not exposed via Traefik by default.
+- **`task "server"`**: The LobeChat container.
+  - **`driver = "podman"`**: Uses Podman.
+  - **`env { OLLAMA_PROXY_URL = "http://ollama.service.consul:11434" }`**: Configures LobeChat to connect to the Ollama service via Consul DNS.
+  - **`config { image = "lobehub/lobe-chat:latest" }`**: Uses the official LobeChat image.
+  - **`volumes = ["/mnt/local-ssd/nomad/stacks/ai/ai-frontend/lobechat/data:/data"]`**: Persistent storage for LobeChat data.
+
+## How to use it
+
+To deploy these AI frontend applications:
+
+1. Ensure you have a Nomad cluster with a client node whose hostname is `hp1-home` and that has Podman installed.
+2. Make sure the following directories exist on the host for persistent data:
+   - `/mnt/local-ssd/nomad/stacks/ai/ai-frontend/openwebui`
+   - `/mnt/local-ssd/nomad/stacks/ai/ai-frontend/lobechat/data`
+3. Ensure your Ollama backend is deployed and accessible (e.g., via the `ai-backend.nomad` job).
+4. Execute the following command on your Nomad server (or a machine with Nomad CLI access configured to connect to your server):
+
+   ```bash
+   nomad job run stacks/ai/ai-frontend.nomad
+   ```
+
+After deployment:
+- Open WebUI will be accessible on port 8080 on the host machine, and via Traefik if properly configured.
+- LobeChat will be accessible on port 3210 on the host machine. If you wish to expose LobeChat externally, you will need to add appropriate Traefik tags to its `service` block (see the sketch below).
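+
+To expose LobeChat through Traefik, the `service "lobechat"` block would need tags similar to the Open WebUI one. A hedged sketch follows; only `traefik.enable=true` is taken from this repository, while the router rule and entrypoint are placeholders to adapt to your own Traefik setup.
+
+```hcl
+service {
+  name = "lobechat"
+  port = "http"
+
+  tags = [
+    "traefik.enable=true",
+    # Placeholder router rule and entrypoint - replace with your own hostname and entrypoint names.
+    "traefik.http.routers.lobechat.rule=Host(`lobechat.home.example`)",
+    "traefik.http.routers.lobechat.entrypoints=websecure",
+  ]
+}
+```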
+
+## Projects Involved
+
+- **[HashiCorp Nomad](https://www.nomadproject.io/)**: A workload orchestrator.
+- **[Open WebUI](https://docs.openwebui.com/)**: A user-friendly, open-source web interface for LLMs.
+- **[LobeChat](https://github.com/lobehub/lobe-chat)**: An open-source, high-performance, extensible LLM chatbot framework.
+- **[Podman](https://podman.io/)**: A daemonless container engine.
+- **[Traefik](https://traefik.io/traefik/)**: An open-source edge router (used by Open WebUI).
+- **[HashiCorp Consul](https://www.consul.io/)**: A service mesh solution providing service discovery, configuration, and segmentation (used by LobeChat for service discovery of Ollama).
\ No newline at end of file
diff --git a/stacks/networking/newt/README.md b/stacks/networking/newt/README.md
new file mode 100644
index 0000000..a7d097e
--- /dev/null
+++ b/stacks/networking/newt/README.md
@@ -0,0 +1,41 @@
+# Networking (Newt Agent) Nomad Job
+
+This Nomad job deploys a Newt agent, part of the Project Newt networking solution. It runs the agent on a Nomad client and connects it to a Pangolin endpoint.
+
+## What is this file?
+
+The [`newt.nomad`](stacks/networking/newt.nomad) file is a HashiCorp Nomad job specification written in HCL. It describes how to deploy and manage the Newt agent service.
+
+Key configurations:
+- **`job "networking"`**: The main job definition.
+- **`datacenters = ["Homelab-PTECH-DC"]`**: Specifies the datacenter where this job should run.
+- **`group "newt"`**: Defines the task group.
+- **`network { mode = "bridge" }`**: Runs the task in bridge network mode.
+- **`task "newt-agent"`**: The task running the Newt agent container.
+  - **`driver = "podman"`**: Uses Podman to run the container.
+  - **`config { image = "docker.io/fosrl/newt:latest" }`**: Uses the latest Newt agent Docker image.
+  - **`env`**: Environment variables for the Newt agent:
+    - `PANGOLIN_ENDPOINT = "https://proxy.prestonhunter.space"`: The endpoint of the Pangolin proxy.
+    - `NEWT_ID = "jr0r2x7cujxkipq"`: The ID that identifies this Newt agent to the Pangolin endpoint.
+    - `NEWT_SECRET = "agj92hbufuoehq8etfbndgt9htkigkr3vnh0imq82xaz591b"`: The secret used to authenticate this agent; treat it like a password.
+
+## How to use it
+
+To deploy the Newt agent:
+
+1. Ensure you have a Nomad cluster with a client node that has Podman installed.
+2. Obtain your `NEWT_ID` and `NEWT_SECRET` from the Project Newt service.
+3. Update the `NEWT_ID` and `NEWT_SECRET` environment variables in the [`newt.nomad`](stacks/networking/newt.nomad) file with your own values (a sketch of the relevant `env` block appears at the end of this README).
+4. Execute the following command on your Nomad server (or a machine with Nomad CLI access configured to connect to your server):
+
+   ```bash
+   nomad job run stacks/networking/newt.nomad
+   ```
+
+After deployment, the Newt agent registers with the Pangolin endpoint, allowing it to participate in the Project Newt network.
+
+## Projects Involved
+
+- **[HashiCorp Nomad](https://www.nomadproject.io/)**: A workload orchestrator.
+- **[Project Newt](https://github.com/fosrl/newt)**: A project for secure and resilient overlay networking.
+- **[Podman](https://podman.io/)**: A daemonless container engine.
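+
+For reference when editing step 3, the relevant task stanza looks roughly like the following. This is a sketch reconstructed from the summary above, not a verbatim copy of `newt.nomad`; the credentials are placeholders and must be replaced with the values issued for your own agent.
+
+```hcl
+task "newt-agent" {
+  driver = "podman"
+
+  config {
+    image = "docker.io/fosrl/newt:latest"
+  }
+
+  env {
+    PANGOLIN_ENDPOINT = "https://proxy.prestonhunter.space"
+    # Placeholder credentials - replace with the ID and secret issued for your agent.
+    NEWT_ID     = "your-newt-id"
+    NEWT_SECRET = "your-newt-secret"
+  }
+}
+```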