Models and Providers

Glue ships with a curated catalog. Selected models are always written as provider/model — for example anthropic/claude-sonnet-4-6. Credentials never live in project config; they come from env vars, ~/.glue/credentials.json, or an OS keychain layer later.

Canonical source: docs/reference/models.yaml.

shipping

Recommended coding models

Anthropic, OpenAI, Gemini, Mistral, Groq, and other hosted providers.
Model	Provider	Capabilities	Notes
`anthropic/claude-sonnet-4-6`★	anthropic	💬🔧👁📎{}🧠⌨	Default high-quality coding model.
`anthropic/claude-haiku-4-5`★	anthropic	💬🔧👁{}⌨	Good small_model candidate for titles, summaries, and quick checks.
`anthropic/claude-opus-4-7`★	anthropic	💬🔧👁📎{}🧠⌨	Most capable Claude model — agentic coding and long-horizon work.
`openai/gpt-5.5`★	openai	💬🔧👁📎{}🧠⌨	Frontier general-purpose and agentic coding model.
`openai/gpt-5.5-pro`★	openai	💬🔧👁📎{}🧠⌨	High-compute reasoning variant. Use for hard architecture, proofs, and long-horizon planning.
`openai/gpt-5.4-mini`★	openai	💬🔧👁{}⌨	Default small_model candidate.
`openai/gpt-5.4-nano`★	openai	💬🔧{}⌨	Cheapest GPT-5.4-class model for high-volume tasks.
`openai/o3`★	openai	💬🔧{}🧠⌨	Reasoning-specialized flagship. Use for hard debugging, proofs, and multi-step planning.
`gemini/gemini-3.5-flash`★	gemini	💬🔧👁📎{}🧠⌨🌐	GA Flash (May 2026). Most intelligent Flash-tier model for agentic and coding tasks. Supports Google Search grounding.
`gemini/gemini-3.1-pro-preview`★	gemini	💬🔧👁📎{}🧠⌨🌐	Pro-class preview. Strongest Gemini model for complex reasoning, coding, and research. Supports Google Search grounding.
`gemini/gemini-3-flash-preview`★	gemini	💬🔧👁📎{}⌨	Fast, balanced, multimodal preview. Strong agentic-coding candidate.
`gemini/gemini-3.1-flash-lite`★	gemini	💬🔧👁{}⌨	GA Flash-Lite (May 2026). Cost-efficient, fastest performance for high-frequency, lightweight tasks.
`mistral/devstral-latest`★	mistral	💬🔧{}⌨	Mistral's agentic coding model. Strong tool use.
`mistral/mistral-large-latest`★	mistral	💬🔧👁{}⌨	Flagship multimodal general-purpose model.
`mistral/mistral-small-latest`★	mistral	💬🔧👁{}🧠⌨	Unified reasoning + coding + vision (Small 4, March 2026). Strong small-model default; cheaper than Medium with better agentic behavior.
`mistral/magistral-medium-latest`★	mistral	💬🔧{}🧠	Pure reasoning model. Use when you need explicit long-chain thinking; not a general chat pick.
`mistral/mistral-medium-latest`★	mistral	💬🔧👁{}	Mistral Medium 3.5 (April 2026). Mid-tier general-purpose; multimodal with tool use.
`groq/llama-4-scout-17b-16e-instruct`★	groq	💬🔧👁{}⌨	Native multimodal MoE in production on Groq. Strong speed/quality default as of 2026-04.
`groq/gpt-oss-120b`★	groq	💬🔧{}🧠⌨	OpenAI's open-weight reasoning + coding model. 120B weights at Groq speed.
`groq/gpt-oss-20b`★	groq	💬🔧{}⌨	Coding-optimized, faster/cheaper than 120B.
`copilot/claude-sonnet-4-6`★	copilot	💬🔧👁{}⌨	Uses your GitHub Copilot subscription.
`copilot/claude-opus-4-7`★	copilot	💬🔧👁📎{}🧠⌨	Top-tier Anthropic model via Copilot subscription.
`copilot/gpt-5.5`★	copilot	💬🔧👁📎{}🧠⌨	Uses your GitHub Copilot subscription.
`copilot/gpt-5.3-codex`★	copilot	💬🔧{}⌨	OpenAI's agentic coding model via Copilot subscription.
`openrouter/claude-sonnet-4-6`★	openrouter	💬🔧👁📎{}🧠⌨	Useful when users want one router key.
`openrouter/gpt-5.4-mini`★	openrouter	💬🔧👁{}⌨	Small-model fallback through a router.
`openrouter/gemini-3.5-flash`★	openrouter	💬🔧👁📎{}🌐	Good for research and extraction workflows.

Local models

Ollama runs on localhost; capability depends on hardware.
Model	Provider	Capabilities	Notes
`ollama/qwen3-coder:30b`★	ollama	💬🔧{}⌨🖥	Consensus local coding agent. 30B/3.3B MoE. ~20 GB VRAM at Q4_K_M.
`ollama/qwen3-coder-next:latest`★	ollama	💬🔧{}⌨🖥	2026-04 flagship local coder. Qwen3-Next-80B-A3B; ~52 GB at Q4_K_M, ~85 GB at Q8_0. Needs a workstation GPU or 64 GB+ Mac.
`ollama/qwen3.6:35b`★	ollama	💬🔧👁{}🧠⌨🖥	Latest Qwen generalist with agentic coding upgrades. Vision + thinking + tools. ~24 GB at Q4_K_M.
`ollama/gemma4:26b`★	ollama	💬🔧👁{}⌨🖥	Google's latest with native function-calling. Multimodal, 256K context. ~18 GB at Q4_K_M.
`ollama/devstral:24b`★	ollama	💬🔧{}⌨🖥	Mistral's agentic coding model. ~14 GB at Q4_K_M.
`ollama/qwen3:8b`★	ollama	💬🔧{}🖥	Low-end floor for tool use. ~5 GB at Q4_K_M. Fits a 16 GB laptop.

OpenAI-compatible endpoints

Any endpoint that speaks the OpenAI wire format can be added with adapter: openai. That includes Groq, Ollama, vLLM, LM Studio, and OpenRouter — each listed separately in the catalog because their base URLs and auth differ, not because the wire format does.

~/.glue/models.yaml — OpenAI-compatible provider

yaml

providers:
  local-vllm:
    adapter: openai
    base_url: http://localhost:8000/v1
    auth:
      api_key: none

Minimal config

~/.glue/config.yaml — quickest path to a running agent

yaml

active_model: anthropic/claude-sonnet-4-6

Credentials come from the environment in this example (ANTHROPIC_API_KEY). To override any catalog entry or add a provider of your own, drop a models.yaml into ~/.glue/ — it merges on top of the bundled catalog.

Configuration reference →

Models and Providers ​

Recommended coding models ​

Local models ​

OpenAI-compatible endpoints ​

Minimal config ​

Models and Providers

Recommended coding models

Local models

OpenAI-compatible endpoints

Minimal config