Stacks
A stack (mantle) is a validated composition of machine, runtime, model, and optional layers.
A stack — called a mantle in the UI — is the central object in Mantler. It describes a complete inference setup: which machine to run on, which inference runtime to use, which model to load, and optionally how to expose it and wrap it with tooling.
Layers
| Layer | Role | Required |
|---|---|---|
| Machine | The target machine | Yes |
| Runtime | Inference runtime (Ollama, vLLM, llama.cpp, MLX, TensorRT-LLM, …) | Yes |
| Model | The language model | Yes |
| Harness | Prompt router or tool wrapper | Optional |
| Orchestrator | Multi-step orchestrator (LangChain, AutoGen, etc.) | Optional |
| Endpoint | OpenAI-compatible endpoint exposure | Optional |
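As a rough illustration of the table above, a stack can be thought of as a record with three required fields and three optional ones. The sketch below is not Mantler's actual schema: the `Stack` class, its field names, and the example values are all hypothetical and chosen only to mirror the layer names.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stack:
    """Hypothetical model of a stack; names are illustrative, not Mantler's schema."""

    machine: str                         # required: the target machine
    runtime: str                         # required: inference runtime, e.g. "Ollama"
    model: str                           # required: the language model
    harness: Optional[str] = None        # optional: prompt router or tool wrapper
    orchestrator: Optional[str] = None   # optional: multi-step orchestrator
    endpoint: Optional[str] = None       # optional: OpenAI-compatible endpoint exposure

    def __post_init__(self) -> None:
        # Reject a stack whose required layers are empty.
        for name in ("machine", "runtime", "model"):
            if not getattr(self, name):
                raise ValueError(f"missing required layer: {name}")

    def layers(self) -> dict[str, str]:
        """Return only the layers that are set, in the order used by the table."""
        ordered = ("machine", "runtime", "model", "harness", "orchestrator", "endpoint")
        return {name: getattr(self, name) for name in ordered if getattr(self, name)}


# Example values are placeholders, not real machine or model identifiers.
stack = Stack(machine="workstation-01", runtime="Ollama", model="example-model", endpoint="openai")
print(list(stack.layers()))
```

Leaving the optional layers as `None` keeps the smallest valid stack at exactly machine, runtime, and model, which matches the Required column.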
How stacks are built
Stacks are composed in the Forge, the visual builder in the Mantler web app. You select a combination of layers, and the compatibility engine checks in real time whether that combination will work on your hardware.
No AI is in the composition loop. Compatibility is determined by curated rules, learned recipes from successful deployments, and community outcome telemetry.
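To make the idea of curated rules concrete, here is a minimal sketch of a deterministic, rule-based check. It is an assumption about the general shape of such a system, not a description of Mantler's engine: `Machine`, `Rule`, `CURATED_RULES`, and `check_compatibility` are hypothetical names, the two rules are simple examples, and learned recipes and outcome telemetry are not modeled.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Machine:
    """Hypothetical hardware description; fields are illustrative."""

    name: str
    accelerator: str  # e.g. "nvidia", "apple-silicon", "cpu"


# A rule inspects (machine, runtime) and returns a reason string if it fails, else None.
Rule = Callable[[Machine, str], Optional[str]]


def mlx_needs_apple_silicon(machine: Machine, runtime: str) -> Optional[str]:
    if runtime == "MLX" and machine.accelerator != "apple-silicon":
        return "MLX requires Apple silicon"
    return None


def tensorrt_needs_nvidia(machine: Machine, runtime: str) -> Optional[str]:
    if runtime == "TensorRT-LLM" and machine.accelerator != "nvidia":
        return "TensorRT-LLM requires an NVIDIA GPU"
    return None


# Illustrative rule set; a real curated list would be far larger.
CURATED_RULES: list[Rule] = [mlx_needs_apple_silicon, tensorrt_needs_nvidia]


def check_compatibility(
    machine: Machine, runtime: str, rules: list[Rule] = CURATED_RULES
) -> list[str]:
    """Return every failure reason; an empty list means no rule objected."""
    reasons = []
    for rule in rules:
        reason = rule(machine, runtime)
        if reason is not None:
            reasons.append(reason)
    return reasons


print(check_compatibility(Machine("laptop", "apple-silicon"), "MLX"))
print(check_compatibility(Machine("server", "nvidia"), "MLX"))
```

Because each rule is a plain function with a fixed outcome for a given input, the same combination always yields the same verdict, and every rejection comes with a human-readable reason.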
In this section
- Building stacks — how to use the Forge
- Compatibility — how compatibility resolution works
- Deploying — deploy a stack and monitor its status