# GPU Guide
NeoRun supports NVIDIA GPU passthrough for ML inference, image generation, fine-tuning, and more.
## Free Tier GPU Limits
| Limit | Value |
|---|---|
| Max VRAM | 8 GB |
| GPU jobs per day | 3 |
| Max runtime per session | 2 hours |
| Idle auto-stop | 30 minutes |
## Enabling GPU
1. In the deployment wizard, toggle the GPU switch
2. Select your VRAM requirement: 8 GB, 16 GB, or 24 GB
3. Deploy — NeoRun will schedule your job on a GPU-equipped worker
## Supported GPU Types
NeoRun workers are equipped with NVIDIA GPUs. GPU detection uses `pynvml` with an `nvidia-smi` fallback:
| GPU | VRAM | Compute Capability |
|---|---|---|
| RTX 3060 | 12 GB | 8.6 |
| RTX 3090 | 24 GB | 8.6 |
| RTX 4090 | 24 GB | 8.9 |
| A100 | 40/80 GB | 8.0 |
| H100 | 80 GB | 9.0 |
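The `pynvml`-first detection with an `nvidia-smi` fallback might look like the following minimal sketch. The helper name and the returned fields are illustrative, not NeoRun's actual code:

```python
import subprocess


def detect_gpus():
    """Detect NVIDIA GPUs via pynvml, falling back to parsing nvidia-smi output."""
    try:
        import pynvml
        pynvml.nvmlInit()
        gpus = []
        for i in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(i)
            name = pynvml.nvmlDeviceGetName(handle)
            if isinstance(name, bytes):  # older pynvml versions return bytes
                name = name.decode()
            mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
            gpus.append({"name": name, "vram_mb": mem.total // 2**20})
        pynvml.nvmlShutdown()
        return gpus
    except Exception:
        pass  # pynvml missing or NVML init failed -- try nvidia-smi

    try:
        out = subprocess.check_output(
            ["nvidia-smi", "--query-gpu=name,memory.total",
             "--format=csv,noheader,nounits"],
            text=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return []  # no driver / no GPU on this worker
    return [
        {"name": name.strip(), "vram_mb": int(mem)}
        for name, mem in (line.split(",") for line in out.strip().splitlines() if line)
    ]
```

On a worker without drivers, both paths fail and the function returns an empty list, which maps to the "No GPU available" error described under Troubleshooting.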
## GPU + Docker
Containers get NVIDIA GPU access via the NVIDIA Container Toolkit. NeoRun automatically:
- Adds `--gpus all` device requests to the container
- Sets `NVIDIA_VISIBLE_DEVICES=all` and `NVIDIA_DRIVER_CAPABILITIES=compute,utility`
- Allocates larger shared memory (`--shm-size=2g`) for PyTorch DataLoaders
- Scales CPU/memory limits for GPU workloads
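Expressed as an equivalent `docker run` invocation, those settings amount to flags like the following. This is a sketch only; the helper name is hypothetical and NeoRun's internal orchestration may differ:

```python
def gpu_run_args(image, command):
    """Assemble `docker run` flags matching the GPU settings listed above."""
    return [
        "docker", "run", "--rm",
        "--gpus", "all",                                    # GPU device request
        "--shm-size", "2g",                                 # /dev/shm for DataLoaders
        "-e", "NVIDIA_VISIBLE_DEVICES=all",
        "-e", "NVIDIA_DRIVER_CAPABILITIES=compute,utility",
        image, *command,
    ]


print(" ".join(gpu_run_args("nvidia/cuda:12.1.0-runtime-ubuntu22.04", ["nvidia-smi"])))
```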
### Container Isolation
GPU containers use the default `runc` runtime because gVisor does not support GPU passthrough.
To compensate, NeoRun applies a seccomp profile that blocks dangerous syscalls
(module loading, mount operations, namespace creation, `ptrace`).
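A Docker/OCI-style seccomp profile blocking those syscall groups could look like this fragment. It is illustrative only; NeoRun's actual profile, default action, and full syscall list may differ:

```json
{
  "defaultAction": "SCMP_ACT_ALLOW",
  "syscalls": [
    {
      "names": [
        "init_module", "finit_module", "delete_module",
        "mount", "umount2",
        "unshare", "setns",
        "ptrace"
      ],
      "action": "SCMP_ACT_ERRNO"
    }
  ]
}
```

A profile like this is passed to Docker with `--security-opt seccomp=profile.json`; the blocked calls fail with an error instead of reaching the kernel.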
## Example GPU Projects
These templates are pre-configured for GPU use:
- Stable Diffusion WebUI — AUTOMATIC1111 image generation
- Open WebUI + Ollama — Local LLM chat interface
- ComfyUI — Node-based diffusion workflows
- FastAPI + PyTorch — ML model serving API
- Jupyter + CUDA — GPU-accelerated notebooks
Find them in the Template Gallery under the GPU category.
## Idle Detection
GPU pods are expensive, so NeoRun monitors network I/O and automatically stops idle pods:
- Network tracking: Measures incoming/outgoing bytes per 60-second interval
- Idle threshold: Less than 1 KB of traffic per interval (filters out DNS and health checks)
- Warning: After 25 minutes idle, a notification is sent
- Auto-stop: At 30 minutes idle, `desired_state` is set to `stopped`
- Max runtime: Hard cutoff at 2 hours (free tier)
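The policy above boils down to a per-interval idle counter. A minimal sketch, with constant and function names that are illustrative rather than NeoRun's internals:

```python
IDLE_BYTES_THRESHOLD = 1024   # < 1 KB per interval counts as idle
WARN_AFTER_MIN = 25           # minutes idle before the warning notification
STOP_AFTER_MIN = 30           # minutes idle before desired_state = "stopped"


def idle_actions(bytes_per_interval):
    """Given one traffic sample per 60-second interval, return the policy
    actions ('warn', 'stop') in the order they would fire."""
    idle_minutes = 0
    actions = []
    for traffic in bytes_per_interval:
        # Any interval with >= 1 KB of traffic resets the idle counter.
        idle_minutes = idle_minutes + 1 if traffic < IDLE_BYTES_THRESHOLD else 0
        if idle_minutes == STOP_AFTER_MIN:
            actions.append("stop")    # set desired_state = "stopped"
        elif idle_minutes == WARN_AFTER_MIN:
            actions.append("warn")    # send the idle notification
    return actions
```

Note that even a single interval with 1 KB or more of traffic resets the counter, which is why keeping the pod actively serving requests prevents auto-stop.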
To prevent auto-stop, keep your pod actively serving requests.
## Troubleshooting GPU
### "No GPU available"

- Check that the worker machine has NVIDIA drivers installed
- Verify that `nvidia-smi` returns GPU information
- Ensure `nvidia-container-toolkit` is installed
### "CUDA out of memory"

- Select a higher VRAM tier in the deployment wizard
- Reduce batch size or model precision in your code
- Use `torch.cuda.empty_cache()` to free unused GPU memory
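The "reduce batch size" tip can be automated with a retry-and-halve wrapper. A hedged sketch using a stand-in exception so it runs without a GPU (with real PyTorch you would catch `torch.cuda.OutOfMemoryError` instead):

```python
class CudaOOM(RuntimeError):
    """Stand-in for torch.cuda.OutOfMemoryError so this sketch runs anywhere."""


def run_with_smaller_batches(step, batch_size, min_batch=1):
    """Call step(batch_size); on OOM, halve the batch size and retry."""
    while batch_size >= min_batch:
        try:
            return step(batch_size)
        except CudaOOM:
            batch_size //= 2  # back off and try again with a smaller batch
    raise CudaOOM("model does not fit even at the minimum batch size")
```

The `step` callable and wrapper are hypothetical helpers, not part of any NeoRun or PyTorch API; the pattern itself (catch OOM, shrink, retry) is what matters.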
### GPU container starts but model doesn't load

- Ensure your `requirements.txt` includes CUDA-compatible PyTorch: `torch --index-url https://download.pytorch.org/whl/cu121`
- Check that the base image has the CUDA runtime (NeoRun uses `nvidia/cuda:12.1.0-runtime-ubuntu22.04` for GPU builds)