Private autonomous agents. Fixed yearly price. Clean hydropower.

Host the whole agent, not just the LLM call.

ScaLabs Cloud runs OpenClaw and Hermes-style agents in Kathmandu, beside the same inference stack they call. Your plan includes the runtime, private state, model usage, Whisper STT, GLM-OCR + dots.ocr + olmOCR-2 document vision, and Chatterbox + Qwen3-TTS voice output (with voice cloning) for normal agent workloads.

Request a founding place See founding tiers

STT / OCR / TTS included Private memory, files, and tools 3-year price guarantee No training on customer data

managed-agent.stack private network

Your agent runtime OpenClaw / Hermes + gateways

Included utility rails inference + speech + OCR + voice

Private ScaLabs endpoint OpenAI-compatible API on local network

Why this exists

Agents should not need five providers just to stay alive.

One managed stack

Tenant runtime, private memory, gateway configuration, model calls, speech, OCR, and voice output belong in one plan.

Built for always-on loops

Run normal agent workloads without watching every token. Fair-use allowances stop resale, spam, and runaway loops, not regular work.

Your state stays yours

Memory, files, credentials, conversations, and tool state stay inside your tenant environment and are never used for model training.

Open runtimes, managed ops

Bring OpenClaw, Hermes Agent, or a reviewed compatible runtime. We operate the tenant, gateways, quotas, logs, secrets, and restarts.

Supported runtimes

Bring your agent framework. We host it.

ScaLabs Cloud is runtime-agnostic — we don't force a custom framework on you. Two open-source agent runtimes are first-class at founding launch; others get reviewed on request.

OpenClaw

Open-source personal AI assistant framework — runs on any OS, any platform, with native gateway integrations for Telegram, Slack, Discord, WhatsApp, and more. The lobster way.

github.com/openclaw/openclaw →

Hermes Agent by Nous Research

The self-improving agent — persistent memory, scheduled automations, browser control, MCP/tool use, kanban-style planner. The agent that grows with you.

github.com/NousResearch/hermes-agent →

Using Claude Managed Agents instead? Anthropic keeps the agent loop on their side; we can host the sandbox half on our workers at 20–50 % below the Cloudflare / Daytona / Modal / Vercel rates Anthropic named as launch partners. See /sandboxes/claude-self-hosted for the breakdown.

Also reviewing on a case-by-case basis: AutoGen-compatible runtimes and customer-provided container images. Talk to us if your framework isn't on this list.

Included utility rails

Speech, OCR, and voice output are not add-ons.

Useful agents listen, read, reason, and respond. For normal, non-spam agent use, those glue calls are effectively unlimited inside the plan instead of metered through separate speech, OCR, and TTS vendors.

STT

Whisper-class speech-to-text

Let agents handle voice notes, calls, meetings, and audio snippets without adding a separate speech API bill.

OCR

GLM-class document vision

Let agents read screenshots, receipts, PDFs, forms, and image-heavy inputs inside the same hosted workflow.

TTS

Chatterbox + Qwen3-TTS voice output

Natural voice output plus zero-shot voice cloning. Built on Chatterbox (MIT) and Qwen3-TTS (Apache 2.0) — the same engines we sell standalone on /utilities.

API

OpenAI-compatible local endpoint

The agent runtime calls ScaLabs Cloud inference over the private network, with quota enforcement and model-specific fair-use allowances.

Included still has guardrails.

Normal voice, OCR, and TTS workloads are part of the plan. Spam, resale, credential abuse, quota bypassing, and runaway loops are not; rate limits and abuse controls keep the cohort usable.

Founding tiers

One private agent stack. Two levels of headroom.

Core is for personal and small-team agents. Pro is for heavier loops, larger models, and higher gateway and network egress limits.

Agent Core EUR 49/mo

EUR 588/year at launch. EUR 19 refundable founding deposit credited against the first invoice.

8 GB tenant RAM quota
Fair-share 4 vCPU-class runtime capacity
Model-specific fair-use inference, up to 320M tokens/month on the highest-allowance model
OpenClaw / Hermes-compatible runtime with private tenant volume
Included normal-use STT (Whisper + Qwen3-ASR), OCR (GLM-OCR + dots.ocr + olmOCR-2), and TTS (Chatterbox + Qwen3-TTS + Kokoro + Piper)

Request Core

More headroom

Agent Pro EUR 79/mo

EUR 948/year at launch. EUR 29 refundable founding deposit credited against the first invoice.

12 GB tenant RAM quota
Fair-share 6 vCPU-class runtime capacity
Model-specific fair-use inference, up to 510M tokens/month on the highest-allowance model
Included normal-use STT (Whisper + Qwen3-ASR), OCR (GLM-OCR + dots.ocr + olmOCR-2), and TTS (Chatterbox + Qwen3-TTS + Kokoro + Piper)
Larger-model access, confirmed after benchmarks and license review
Higher gateway and network egress limits

Request Pro

CPU is fair-share, not hard dedicated vCores. RAM quota, annual price, refund terms, and the no-training commitment are the founding commitments; exact storage, gateway, utility, and egress limits are finalized in launch terms after runtime benchmarks.

Usage promise

Use the default agent models without metering anxiety. Heavier models get published limits.

The 320M / 510M headline comes from the fastest model class. Larger models are available with lower monthly allowances so the plan stays predictable for everyone.

Model	Core allowance / mo	Pro allowance / mo	Notes
HimalayaGPT 0.5B	Unlimited fair-use	Unlimited fair-use	Free for everyone — Nepali sovereign LLM
Qwen 3.6 27B	159M tokens	254M tokens	Dense agent and coding model
Qwen 3.6 35B A3B	272M tokens	435M tokens	Small-active MoE sweet spot
Gemma 4 31B	136M tokens	218M tokens	Dense model, lower throughput
Gemma 4 26B A4B	318M tokens	508M tokens	Highest headline allowance
DeepSeek V4 Flash	Pro only	109M tokens	Pro-only model, confirmed after benchmarks
Cohere Command A+	75M tokens	140M tokens	Open-weight enterprise flagship — multilingual, agentic
Qwen 3.5 122B A10B	102M tokens	163M tokens	Larger MoE workflow option
MiniMax M2.7	40M tokens	127M tokens	Large-agent MoE, pending license review

Efficiency gains should flow back to customers.

If ScaLabs Cloud's runtime improves or model-serving costs fall, founding customers benefit through higher practical allowances, better model availability, or both. The founding price stays fixed for 3 years while yearly renewals stay current. The table above is confirmed at launch after benchmarks and runtime soak tests.

Agent safety

An agent with tools is infrastructure. We treat it like one.

A useful agent can touch tools, credentials, and outside systems. Isolation, tool grants, secrets, logs, restart policy, quota enforcement, and customer-controlled deletion are part of the product.

Per-tenant isolation Separate containers, volumes, memory stores, and quota enforcement.

Least-privilege tools Gateways, MCP servers, shell, browser, and webhooks stay off until explicitly enabled.

Secrets handling Customer credentials are injected at runtime and never written into logs.

Dashboard and audit trail Status, usage, gateway config, secrets, restarts, quota events, and metadata logs are visible per tenant.

No cross-tenant learning Agent memory and skill loops stay tenant-scoped. No shared memory store seeded from customer data.

Runaway-loop throttles Agent loops, STT, OCR, TTS, and egress remain governed by rate limits and abuse controls.

Join Founding Batch 1

Request a founding place before the batch closes.

This form is not a payment step. Tell us what you want to run; if it fits the cohort, we send the deposit, refund, saved-card, and annual billing terms for review before any payment authorization.

Core: EUR 19 refundable deposit; first annual charge is EUR 588 less deposit credit at launch.
Pro: EUR 29 refundable deposit; first annual charge is EUR 948 less deposit credit at launch.
The founding price is guaranteed for 3 years if yearly renewals stay current.
Plans are paid yearly and can be canceled at renewal boundaries.
Your card is charged for the annual plan only after launch notice and authorization.
If ScaLabs Cloud misses the stated launch window or cancels the cohort, the deposit is refundable.

Continue in the ScaLabs Cloud Console

We'll create your account and email you a 6-digit sign-in code. Finish the request inside the console.

Practical questions

The details that matter before you reserve.

Does this mean unlimited inference?

It means normal autonomous-agent workflows are covered by the plan instead of billed per token. It does not mean unrestricted resale, spam, or runaway loops. Every tier has model-specific fair-use allowances, rate limits, anti-resale terms, and abuse controls.

Are STT, OCR, and TTS really included?

Yes. Whisper + Qwen3-ASR for speech-to-text, GLM-OCR + dots.ocr + olmOCR-2 for document vision (Nepali / Devanagari included), and Chatterbox + Qwen3-TTS + Kokoro + Piper for text-to-speech are all included for normal, non-spam agent workflows. They remain subject to rate limits, anti-resale terms, and abuse throttles.

Why run the agent in Nepal if I am in Europe or the US?

This is built for agents that run in the background through Telegram, Slack, Discord, WhatsApp, email, web UI, or scheduled tasks. For those workflows, the important round trip is often inside the agent loop: state, tool decisions, and model calls running close together — our network keeps that path short and cheap.

When do I pay?

The form is free. If your use case fits the cohort, you can pay a small refundable deposit after reviewing the terms. The annual subscription is charged only at launch after advance notice and saved-card authorization for the selected plan.

What happens if launch slips?

The deposit is refundable if ScaLabs Cloud misses the stated launch window or cancels the cohort. The deposit is demand validation, not a security, loan, investment, or equity entitlement.

Can I bring my existing agent framework?

The first cohort is designed for OpenClaw and Hermes Agent patterns: messaging-first agents, headless server workflows, MCP/tool use, scheduled tasks, and gateway-based operation. Customer-provided runtimes are reviewed after isolation, support, and billing are stable.

What happens to memory, files, logs, and tool data?

They stay inside the tenant environment. ScaLabs Cloud does not use customer prompts, outputs, memory, files, conversations, or tool-call data for model training. Tool calls and admin actions can be logged as metadata; request and response bodies are not logged by default.

What is still launch-gated?

We do not promise hard dedicated vCores, exact public latency numbers, or unbenchmarked throughput. Exact public latency numbers and throughput characteristics are confirmed after launch benchmarks.