Inference model · GEMMA
Gemma 4 26B A4B
The cheapest model in the catalog. 4B active parameters, 26B total. Built for volume.
Gemma 4 26B A4B is what we recommend when an agent is going to call inference hundreds of thousands of times a day and you can tolerate some per-call quality variance. The cost ceiling stays predictable; the floor stays useful.
When to pick it
- Background research agents, daily-digest workflows
- High-throughput document processing pipelines
- Anything where you’d otherwise want to “use a cheap model and re-rank”
When to look elsewhere
- Strict output-quality floors → use a dense model (Qwen 3.6 27B or Gemma 4 31B)
- Function-calling heavy workflows that need very stable schemas → Qwen 3.6 27B
Request Gemma 4 26B A4B access
Get an API key for this model.
Pay-per-use, no deposit, no commitment. We'll send your API key and the OpenAI-compatible endpoint URL within one working day.
Request received. We'll follow up with founding terms.
Please complete the required fields and try again.