Inference model · GEMMA

Gemma 4 26B A4B

The cheapest model in the catalog. 4B active parameters, 26B total. Built for volume.

Gemma 4 26B A4B is what we recommend when an agent will call inference hundreds of thousands of times a day and you can tolerate some per-call quality variance. The cost ceiling stays predictable, and the quality floor stays useful.

When to pick it

  • Background research agents, daily-digest workflows
  • High-throughput document processing pipelines
  • Anything where you’d otherwise want to “use a cheap model and re-rank”
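
The "use a cheap model and re-rank" pattern from the last bullet can be sketched roughly as follows. This is a minimal illustration, not a catalog API: `generate` and `score` are hypothetical callables you would supply yourself (for example, a wrapper around your inference client and a re-ranker of your choice), and `n_candidates` is an assumed tuning knob.

```python
from typing import Callable, Tuple


def generate_and_rerank(
    prompt: str,
    generate: Callable[[str], str],       # hypothetical: one call to the cheap model
    score: Callable[[str, str], float],   # hypothetical: re-ranker, higher is better
    n_candidates: int = 4,                # assumed default, tune for your budget
) -> Tuple[str, float]:
    """Sample several candidates from a cheap model and keep the best-scoring one."""
    if n_candidates < 1:
        raise ValueError("n_candidates must be at least 1")

    # Draw n independent candidates for the same prompt.
    candidates = [generate(prompt) for _ in range(n_candidates)]

    # Score each candidate against the prompt and pick the highest score.
    scored = [(score(prompt, candidate), candidate) for candidate in candidates]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score


def make_stub_generator(outputs):
    """Stand-in for a model call: returns canned outputs in order (demo only)."""
    iterator = iter(outputs)

    def generate(prompt: str) -> str:
        return next(iterator)

    return generate


def length_score(prompt: str, candidate: str) -> float:
    """Toy re-ranker that prefers longer candidates (demo only)."""
    return float(len(candidate))
```

Per-call variance matters less in this setup because only the best of several samples is kept. The trade-off is that each request costs `n_candidates` generation calls plus the scoring calls.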

When to look elsewhere

  • Strict output-quality floors → use a dense model (Qwen 3.6 27B or Gemma 4 31B)
  • Function-calling-heavy workflows that need very stable schemas → use Qwen 3.6 27B
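
If it helps to encode the guidance above in a pipeline, one possible sketch is a small routing helper. The `Workload` fields and the `pick_models` function are hypothetical names introduced for illustration, and the strings are the display names used on this page, not confirmed API identifiers.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of what a workload requires."""
    strict_quality_floor: bool = False
    needs_stable_function_schemas: bool = False


def pick_models(workload: Workload) -> List[str]:
    """Return candidate models following the page's pick/look-elsewhere guidance."""
    # Stable function-calling schemas point to a single recommendation,
    # which also appears in the strict-quality list.
    if workload.needs_stable_function_schemas:
        return ["Qwen 3.6 27B"]
    # A strict output-quality floor points to the dense models.
    if workload.strict_quality_floor:
        return ["Qwen 3.6 27B", "Gemma 4 31B"]
    # Otherwise, default to the volume-oriented model.
    return ["Gemma 4 26B A4B"]
```

For example, `pick_models(Workload())` falls through to the volume model, while setting either flag routes to a dense model instead.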