Blender Render Farms
Manage homogeneous and heterogeneous render fleets for 3D animation and VFX production — all supporting infrastructure included
Rendering 3D scenes is embarrassingly parallel: every frame is independent. This makes render farms conceptually simple — split the frame range across workers, collect the output — but operationally messy. You have to provision workers consistently, share scene files across machines, route frames to the right render engine, and keep the farm healthy during a production crunch.
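The frame-split step can be sketched in a few lines — a hypothetical helper to show the idea, not PodWarden's actual scheduler:

```python
def split_frames(start: int, end: int, workers: int) -> list:
    """Split an inclusive frame range into near-even chunks, one per worker."""
    total = end - start + 1
    base, extra = divmod(total, workers)
    chunks, cursor = [], start
    for i in range(workers):
        size = base + (1 if i < extra else 0)
        if size == 0:
            break  # more workers than frames; leave the rest idle
        chunks.append((cursor, cursor + size - 1))
        cursor += size
    return chunks

# 240 frames across 5 workers -> [(1, 48), (49, 96), (97, 144), (145, 192), (193, 240)]
print(split_frames(1, 240, 5))
```

Because frames are independent, each chunk can render on a different machine with no coordination beyond collecting the output.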
PodWarden manages the render workers and every supporting service they depend on. If you already have shared storage, a render manager, or a monitoring stack, bring it. If you don't, every piece is in the Hub catalog.
What You Need
| Component | Bring your own | Or deploy from Hub |
|---|---|---|
| Shared storage | Existing NFS server, S3 bucket | NFS-Ganesha (scene files + frames over NFS) or MinIO / RustFS (S3 object storage) |
| Render manager | Deadline, OpenCue, Flamenco, custom | Flamenco — open-source Blender render manager, deploys as a workload |
| Database | Existing PostgreSQL or SQLite | PostgreSQL — from Hub, used by Flamenco and other services |
| Monitoring | Existing Prometheus + Grafana | Prometheus + Grafana — cluster and GPU utilization dashboards |
| SSO / access control | Existing identity provider | Keycloak — from Hub, SSO for the render manager web UI and PodWarden itself |
Stack Architecture
Building the Foundation
1. Shared Storage
Render workers need to read scene files and write rendered frames. All workers must reach the same storage location.
If you have an NFS server or S3 bucket, register it as a storage connection in PodWarden. PodWarden checks that every node in your cluster can reach the storage before you deploy workers that depend on it.
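PodWarden's reachability check can be approximated with a plain TCP probe — this standalone sketch only tests whether the storage port answers, nothing more:

```python
import socket

def reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False

# NFSv4 listens on TCP 2049; substitute your own storage endpoint
# reachable("nfs.internal", 2049)
```

Running a probe like this from every node before deploying workers catches the classic failure mode: storage that is reachable from the control plane but firewalled off from half the fleet.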
If you don't:
- NFS-Ganesha — Import from Hub. Deploy to a node with fast local disk (or attach a large volume). Workers mount /scenes and /output via NFS. Every node sees the same filesystem.
- MinIO or RustFS — If your render pipeline supports S3, deploy either as a workload. Use separate buckets for inbound scene archives and outbound frame output.
For large teams with many render nodes, NFS is simpler for Blender (it reads .blend files with relative paths). For distributed pipelines or cloud burst nodes, S3 works well for archival and transfer.
2. Render Manager
The render manager assigns frame ranges to workers and collects output.
If you use Deadline or OpenCue, PodWarden manages the worker containers that connect to your existing manager.
If you don't have one: import Flamenco from Hub. Flamenco is the Blender Foundation's own render manager — it understands Blender jobs natively, runs as a server + worker model, and stores state in PostgreSQL.
- Deploy Flamenco Manager (one instance, persistent deployment) pointing at your NFS storage
- Deploy Flamenco Worker (the workload that runs on render nodes) as a DaemonSet or deployment
- Import PostgreSQL from Hub for Flamenco's database if you don't have one
3. Monitoring
Import Prometheus and Grafana from Hub. For GPU nodes, add DCGM Exporter as a DaemonSet — it runs automatically on every node and exposes per-GPU metrics: utilization, VRAM usage, temperature.
The Grafana dashboard shows at a glance which nodes are rendering, which are idle, and which are struggling (thermal throttle, VRAM pressure, stalled jobs).
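DCGM Exporter serves metrics in the Prometheus text exposition format. DCGM_FI_DEV_GPU_UTIL is the exporter's real per-GPU utilization metric; the sample payload and the parser below are an illustrative sketch, not part of the stack:

```python
import re

SAMPLE = """\
# HELP DCGM_FI_DEV_GPU_UTIL GPU utilization (in %).
# TYPE DCGM_FI_DEV_GPU_UTIL gauge
DCGM_FI_DEV_GPU_UTIL{gpu="0",Hostname="node-a"} 97
DCGM_FI_DEV_GPU_UTIL{gpu="1",Hostname="node-a"} 3
"""

def gpu_util(exposition: str) -> dict:
    """Map (hostname, gpu index) -> utilization % from Prometheus text output."""
    util = {}
    pattern = re.compile(r'DCGM_FI_DEV_GPU_UTIL\{([^}]*)\}\s+([\d.]+)')
    for labels, value in pattern.findall(exposition):
        attrs = dict(kv.split("=", 1) for kv in labels.split(","))
        util[(attrs["Hostname"].strip('"'), attrs["gpu"].strip('"'))] = float(value)
    return util

print(gpu_util(SAMPLE))  # the idle GPU stands out immediately
```

In practice Prometheus scrapes and stores these series for you; a parse like this is only useful for quick ad-hoc checks against the exporter endpoint.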
Adding Render Nodes
Any machine with SSH access becomes a PodWarden host: gaming PCs, workstations, VMs, cloud instances. PodWarden detects GPU model and VRAM automatically at provisioning time.
Provision one host as the cluster control plane. Then join additional nodes — one click per node, PodWarden installs K3s and the GPU runtime via SSH.
Homogeneous farms
All nodes have the same hardware. Frame distribution is predictable and even. Create one render worker template, deploy to the cluster. The scheduler places workers across all nodes.
Heterogeneous farms
Mixed hardware — some A100 nodes, some older GTX 1080 workstations, some CPU-only machines. Two approaches:
Node selectors — Tag nodes with hardware labels during provisioning:
{ "gpu": "a100" }
{ "gpu": "rtx-4090" }
{ "cpu-only": "true" }Create separate stacks for each hardware class, each with a matching node_selector. Deploy all definitions to the same cluster. Blender workers with CUDA targets land on GPU nodes; CPU workers land on everything else.
Separate clusters — Group high-end nodes into a "fast" cluster and older hardware into a "slow" cluster. Deploy different Flamenco worker templates to each. Your render manager submits heavy frames to the fast cluster and lighter frames to the slow one.
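The routing decision for the two-cluster setup can be sketched as a simple threshold — the per-frame cost estimate and cutoff here are placeholders, not something PodWarden or Flamenco computes for you:

```python
def route_job(frame_cost_minutes: float, threshold: float = 10.0) -> str:
    """Send frames expected to exceed the threshold to the fast cluster."""
    return "fast-cluster" if frame_cost_minutes > threshold else "slow-cluster"

# Hypothetical shots with estimated minutes per frame
jobs = {"hero_shot_040": 45.0, "bg_plate_012": 4.0}
assignments = {name: route_job(cost) for name, cost in jobs.items()}
print(assignments)  # heavy hero frames -> fast, light plates -> slow
```

Real render managers let you express this as job pools or tags; the point is only that the estimate-then-route logic is small enough to keep in your submission script.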
Render Worker Templates
GPU worker (CUDA/OptiX)
Kind: Deployment (or DaemonSet for one-per-node)
Image: linuxserver/blender:latest
GPU count: 1
VRAM: 16Gi
CPU: 16
Memory: 32Gi
Node selector: { "accelerator": "nvidia" }

| Variable | Example | Description |
|---|---|---|
| RENDER_DEVICE | OPTIX | Render device: OPTIX, CUDA, HIP, CPU |
| RENDER_ENGINE | CYCLES | Blender engine |
| FLAMENCO_MANAGER | http://flamenco.mesh:8080 | Render manager URL |
| WORKER_NAME | (hostname) | Worker display name in Flamenco |
| TILE_SIZE | 256 | Tile size — larger for GPU, smaller for CPU |
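A worker wrapper typically turns those variables into a headless Blender command line. The flags used here (-b, -E, -o, -s, -e, -a, and --cycles-device after the -- separator) are real Blender CLI options; the wrapper function itself is a hypothetical sketch:

```python
import os

def blender_cmd(scene: str, start: int, end: int) -> list:
    """Build a headless Blender render command from worker environment variables."""
    engine = os.environ.get("RENDER_ENGINE", "CYCLES")
    device = os.environ.get("RENDER_DEVICE", "OPTIX")
    cmd = ["blender", "-b", scene, "-E", engine,
           "-o", "/output/frame_####",           # #### expands to the frame number
           "-s", str(start), "-e", str(end),
           "-a"]                                  # render the whole start..end range
    if device != "CPU":
        # Engine-specific args go after "--", as in Blender's documented examples
        cmd += ["--", "--cycles-device", device]
    return cmd
```

Flamenco's own Blender job type handles this translation for you; a wrapper like this is mainly useful for custom pipelines that launch Blender directly.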
CPU worker
Kind: Deployment
Image: linuxserver/blender:latest
GPU count: 0
CPU: 32
Memory: 64Gi
Node selector: { "cpu-only": "true" }

CPU workers are useful for older hardware still capable of rendering, nodes without NVIDIA GPUs, or scenes that don't run efficiently on GPU (volumes, certain shaders).
Volume mounts
| Mount path | Volume type | Contents |
|---|---|---|
| /scenes | NFS | Source .blend files |
| /output | NFS | Rendered frame output |
| /cache | emptyDir | Blender texture and shader cache |
DaemonSet Mode
Use kind: DaemonSet to run exactly one render worker per node, automatically. Add a node to the cluster and it gets a worker within seconds — no manual assignment. Remove a node and the worker is gone. Ideal when you want maximum utilization with zero scheduling overhead.
Scaling for Deadline Crunches
When a production schedule compresses:
- Rent cloud GPU instances and join them to the cluster over SSH — the same one-click flow as any other node
- With workers deployed as a DaemonSet, each new node picks up a render worker automatically
- The render manager starts assigning frames to the new workers as soon as they register
When the crunch passes, remove the rented nodes. Your permanent studio machines stay in the cluster. Billed time on cloud nodes: only the hours they were rendering.
End-to-End Production Sequence
Hub Templates for This Stack
| Template | Role |
|---|---|
| Blender render worker (CUDA/OptiX) | GPU render worker |
| Blender render worker (CPU) | CPU render worker |
| Flamenco Manager | Render job manager |
| Flamenco Worker | Alternative to native Blender worker |
| Deadline Worker | Worker for AWS Thinkbox Deadline |
| OpenCue Worker | Worker for Google OpenCue |
| NFS-Ganesha | NFS server for scene files and frames |
| MinIO | S3 object storage for archives and transfers |
| RustFS | High-performance S3 object storage |
| PostgreSQL | Database for Flamenco Manager |
| Prometheus | Metrics collection |
| Grafana | Render farm dashboards |
| DCGM Exporter | Per-GPU metrics (DaemonSet) |
| Keycloak | SSO for render manager and PodWarden access |
All of these are standard stacks. They live in the same cluster as your render workers, managed the same way, monitored together. A complete render farm — storage, render manager, workers, monitoring — deployed from a single catalog.