Deploying

Note: This page is a stub. Full documentation is coming soon.

Once a stack (mantle) is created in the Forge, you deploy it from the mantle detail page.

Deploy flow

Clicking Deploy triggers the following sequence:

The control plane sends commands to mantlerd on the target machine.
The runtime is installed (if not already present).
The model is pulled to the machine.
If a Endpoint layer is included, the endpoint becomes active.

You can monitor the deployment on the machine detail page in the web app, or with the CLI:

mantler runtime list
mantler model list
mantler info

Endpoint activation

When the Endpoint layer is active and the model is loaded, the mantle's model ID becomes available in /v1/models and accepts chat completions requests.

Redeploying and updating

To update a running stack — for example to change the model or runtime version — edit the mantle in the Forge and redeploy. The daemon handles the transition, stopping the old model and starting the new one.

Telemetry

mantlerd reports runtime and model telemetry back to the control plane during each check-in:

Runtime version and status
Model availability and load state
GPU memory usage
Tokens per second and time-to-first-token from recent requests

This data is visible on the machine and mantle detail pages in the web app.

Deploying

Deploy flow

Endpoint activation

Redeploying and updating

Telemetry

On this page