Web Tools

Glue includes web-oriented tools for fetching pages and searching the web. All are configured under the web: block in your config.

web_fetch

Fetches a URL and returns its content as clean markdown. The pipeline:

  1. HTML pages — extracts main content, strips boilerplate, converts to markdown
  2. PDF documents — extracts text directly; if the result is empty (scanned PDF), falls back to OCR via Mistral or OpenAI vision

When JINA_API_KEY is set, Jina AI is available as an optional fallback for difficult pages.
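The routing described above can be sketched roughly as follows. This is a minimal illustration, not Glue's actual implementation: `fetch_to_markdown` and its callable parameters (`html_to_markdown`, `pdf_to_text`, `ocr`) are hypothetical names, and the content-type check is an assumption about how a page is classified. The Jina fallback is left out.

```python
from typing import Callable


def fetch_to_markdown(
    content_type: str,
    body: bytes,
    html_to_markdown: Callable[[bytes], str],
    pdf_to_text: Callable[[bytes], str],
    ocr: Callable[[bytes], str],
    enable_ocr_fallback: bool = True,
) -> str:
    """Route a fetched response body by content type (hypothetical helper)."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    kind = content_type.split(";")[0].strip().lower()

    if kind == "application/pdf":
        text = pdf_to_text(body)
        # An empty result suggests a scanned PDF with no text layer,
        # so hand the document to the OCR callable if that is allowed.
        if not text.strip() and enable_ocr_fallback:
            text = ocr(body)
        return text

    # Everything else is treated as HTML in this sketch.
    return html_to_markdown(body)
```

Passing the extractors in as callables keeps the sketch independent of any particular HTML, PDF, or OCR library; stub functions are enough to exercise the fallback path.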

Web search

Searches the web and returns structured results. Four providers are supported:

| Provider | API Key Env Var |
| --- | --- |
| DuckDuckGo | None |
| Brave Search | BRAVE_API_KEY |
| Tavily | TAVILY_API_KEY |
| Firecrawl | FIRECRAWL_API_KEY |

If no explicit provider is configured, Glue auto-detects the first available configured provider and falls back to DuckDuckGo when no API-backed provider is available.
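The auto-detection rule might look something like the sketch below. The function name `pick_provider` is hypothetical, and the probe order (Brave, then Tavily, then Firecrawl) is an assumption: the text only says "first available", so the real order may differ.

```python
import os
from typing import Mapping, Optional

# Provider name -> environment variable holding its API key.
# The iteration order here is assumed, not documented.
PROVIDER_KEYS = {
    "brave": "BRAVE_API_KEY",
    "tavily": "TAVILY_API_KEY",
    "firecrawl": "FIRECRAWL_API_KEY",
}


def pick_provider(
    explicit: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Return the search provider to use (hypothetical helper)."""
    # An explicitly configured provider always wins.
    if explicit:
        return explicit
    # Otherwise take the first provider whose key is set and non-empty.
    for name, var in PROVIDER_KEYS.items():
        if env.get(var):
            return name
    # DuckDuckGo needs no key, so it serves as the final fallback.
    return "duckduckgo"
```

For example, `pick_provider(env={"TAVILY_API_KEY": "..."})` selects `"tavily"`, while an empty environment yields `"duckduckgo"`.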

Configuration

```yaml
web:
  fetch:
    timeout_seconds: 30
    max_bytes: 5242880
    allow_jina_fallback: true

  search:
    provider: "brave" # brave | tavily | firecrawl | duckduckgo
    max_results: 10

  pdf:
    enable_ocr_fallback: true
    ocr_provider: "mistral" # mistral | openai
```
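As a rough aid for checking such a block before use, the sketch below validates an already-parsed `web:` mapping against the values listed in the comments above. `validate_web_config` is a hypothetical helper, not part of Glue, and the rule that numeric limits must be positive is an assumption. Note that the `max_bytes` value shown, 5242880, equals 5 * 1024 * 1024 (5 MiB).

```python
from typing import Any, Dict, List

# Allowed values, taken from the inline comments in the config example.
SEARCH_PROVIDERS = {"brave", "tavily", "firecrawl", "duckduckgo"}
OCR_PROVIDERS = {"mistral", "openai"}


def validate_web_config(web: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a parsed web: block."""
    problems: List[str] = []

    # Numeric limits: only checked when the key is present.
    for section, key in (
        ("fetch", "timeout_seconds"),
        ("fetch", "max_bytes"),
        ("search", "max_results"),
    ):
        value = web.get(section, {}).get(key)
        if value is not None and value <= 0:
            problems.append(f"{section}.{key} must be positive")

    provider = web.get("search", {}).get("provider")
    if provider is not None and provider not in SEARCH_PROVIDERS:
        problems.append(f"search.provider is not recognised: {provider}")

    ocr = web.get("pdf", {}).get("ocr_provider")
    if ocr is not None and ocr not in OCR_PROVIDERS:
        problems.append(f"pdf.ocr_provider is not recognised: {ocr}")

    return problems


# The example configuration above, written as a Python mapping.
example = {
    "fetch": {"timeout_seconds": 30, "max_bytes": 5 * 1024 * 1024, "allow_jina_fallback": True},
    "search": {"provider": "brave", "max_results": 10},
    "pdf": {"enable_ocr_fallback": True, "ocr_provider": "mistral"},
}
```

Calling `validate_web_config(example)` returns an empty list, since every value matches the options shown in the configuration.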

Released under the MIT License.