Models and Providers
Glue ships with a curated catalog. Selected models are always written as provider/model — for example anthropic/claude-sonnet-4-6. Credentials never live in project config; they come from env vars, ~/.glue/credentials.json, or an OS keychain layer later.
Canonical source: docs/reference/models.yaml.
Recommended coding models
| Model | Provider | Capabilities | Notes |
|---|---|---|---|
anthropic/claude-sonnet-4-6★ | anthropic | 💬🔧👁📎{}🧠⌨ | Default high-quality coding model. |
anthropic/claude-haiku-4-5★ | anthropic | 💬🔧👁{}⌨ | Good small_model candidate for titles, summaries, and quick checks. |
anthropic/claude-opus-4-7★ | anthropic | 💬🔧👁📎{}🧠⌨ | Most capable Claude model — agentic coding and long-horizon work. |
openai/gpt-5.5★ | openai | 💬🔧👁📎{}🧠⌨ | Frontier general-purpose and agentic coding model. |
openai/gpt-5.5-pro★ | openai | 💬🔧👁📎{}🧠⌨ | High-compute reasoning variant. Use for hard architecture, proofs, and long-horizon planning. |
openai/gpt-5.4-mini★ | openai | 💬🔧👁{}⌨ | Default small_model candidate. |
openai/gpt-5.4-nano★ | openai | 💬🔧{}⌨ | Cheapest GPT-5.4-class model for high-volume tasks. |
openai/o3★ | openai | 💬🔧{}🧠⌨ | Reasoning-specialized flagship. Use for hard debugging, proofs, and multi-step planning. |
gemini/gemini-3.5-flash★ | gemini | 💬🔧👁📎{}🧠⌨🌐 | GA Flash (May 2026). Most intelligent Flash-tier model for agentic and coding tasks. Supports Google Search grounding. |
gemini/gemini-3.1-pro-preview★ | gemini | 💬🔧👁📎{}🧠⌨🌐 | Pro-class preview. Strongest Gemini model for complex reasoning, coding, and research. Supports Google Search grounding. |
gemini/gemini-3-flash-preview★ | gemini | 💬🔧👁📎{}⌨ | Fast, balanced, multimodal preview. Strong agentic-coding candidate. |
gemini/gemini-3.1-flash-lite★ | gemini | 💬🔧👁{}⌨ | GA Flash-Lite (May 2026). Cost-efficient, fastest performance for high-frequency, lightweight tasks. |
mistral/devstral-latest★ | mistral | 💬🔧{}⌨ | Mistral's agentic coding model. Strong tool use. |
mistral/mistral-large-latest★ | mistral | 💬🔧👁{}⌨ | Flagship multimodal general-purpose model. |
mistral/mistral-small-latest★ | mistral | 💬🔧👁{}🧠⌨ | Unified reasoning + coding + vision (Small 4, March 2026). Strong small-model default; cheaper than Medium with better agentic behavior. |
mistral/magistral-medium-latest★ | mistral | 💬🔧{}🧠 | Pure reasoning model. Use when you need explicit long-chain thinking; not a general chat pick. |
mistral/mistral-medium-latest★ | mistral | 💬🔧👁{} | Mistral Medium 3.5 (April 2026). Mid-tier general-purpose; multimodal with tool use. |
groq/llama-4-scout-17b-16e-instruct★ | groq | 💬🔧👁{}⌨ | Native multimodal MoE in production on Groq. Strong speed/quality default as of 2026-04. |
groq/gpt-oss-120b★ | groq | 💬🔧{}🧠⌨ | OpenAI's open-weight reasoning + coding model. 120B weights at Groq speed. |
groq/gpt-oss-20b★ | groq | 💬🔧{}⌨ | Coding-optimized, faster/cheaper than 120B. |
copilot/claude-sonnet-4-6★ | copilot | 💬🔧👁{}⌨ | Uses your GitHub Copilot subscription. |
copilot/claude-opus-4-7★ | copilot | 💬🔧👁📎{}🧠⌨ | Top-tier Anthropic model via Copilot subscription. |
copilot/gpt-5.5★ | copilot | 💬🔧👁📎{}🧠⌨ | Uses your GitHub Copilot subscription. |
copilot/gpt-5.3-codex★ | copilot | 💬🔧{}⌨ | OpenAI's agentic coding model via Copilot subscription. |
openrouter/claude-sonnet-4-6★ | openrouter | 💬🔧👁📎{}🧠⌨ | Useful when users want one router key. |
openrouter/gpt-5.4-mini★ | openrouter | 💬🔧👁{}⌨ | Small-model fallback through a router. |
openrouter/gemini-3.5-flash★ | openrouter | 💬🔧👁📎{}🌐 | Good for research and extraction workflows. |
Local models
| Model | Provider | Capabilities | Notes |
|---|---|---|---|
ollama/qwen3-coder:30b★ | ollama | 💬🔧{}⌨🖥 | Consensus local coding agent. 30B/3.3B MoE. ~20 GB VRAM at Q4_K_M. |
ollama/qwen3-coder-next:latest★ | ollama | 💬🔧{}⌨🖥 | 2026-04 flagship local coder. Qwen3-Next-80B-A3B; ~52 GB at Q4_K_M, ~85 GB at Q8_0. Needs a workstation GPU or 64 GB+ Mac. |
ollama/qwen3.6:35b★ | ollama | 💬🔧👁{}🧠⌨🖥 | Latest Qwen generalist with agentic coding upgrades. Vision + thinking + tools. ~24 GB at Q4_K_M. |
ollama/gemma4:26b★ | ollama | 💬🔧👁{}⌨🖥 | Google's latest with native function-calling. Multimodal, 256K context. ~18 GB at Q4_K_M. |
ollama/devstral:24b★ | ollama | 💬🔧{}⌨🖥 | Mistral's agentic coding model. ~14 GB at Q4_K_M. |
ollama/qwen3:8b★ | ollama | 💬🔧{}🖥 | Low-end floor for tool use. ~5 GB at Q4_K_M. Fits a 16 GB laptop. |
OpenAI-compatible endpoints
Any endpoint that speaks the OpenAI wire format can be added with adapter: openai. That includes Groq, Ollama, vLLM, LM Studio, and OpenRouter — each listed separately in the catalog because their base URLs and auth differ, not because the wire format does.
providers:
local-vllm:
adapter: openai
base_url: http://localhost:8000/v1
auth:
api_key: noneMinimal config
active_model: anthropic/claude-sonnet-4-6Credentials come from the environment in this example (ANTHROPIC_API_KEY). To override any catalog entry or add a provider of your own, drop a models.yaml into ~/.glue/ — it merges on top of the bundled catalog.