Self-Hosted SearXNG: Bypass Claude Code Web Search Rate Limits

Why I replaced Claude Code's built-in WebSearch with a local SearXNG container, and how I wired it up in about ten minutes.

The Problem: Built-In Web Search Hits a Wall

Claude Code ships with two web tools: WebSearch and WebFetch. Both are useful, both are rate-limited.

In a normal Q&A session I barely notice. In a real research-heavy session — comparing libraries, hunting docs across half a dozen domains, validating a CVE, surveying competitor pricing pages — the limits show up fast. The agent burns through its allowance, then either stalls or starts working with stale training data instead of current sources.

The failure mode is quiet but expensive. The agent does not loudly say "I am out of searches." It just stops checking, makes a confident claim, and I find out later it was wrong.

That is the problem worth solving.

Why SearXNG

SearXNG is a privacy-respecting metasearch engine. It does not have its own index — it queries Google, Bing, DuckDuckGo, Brave, Wikipedia, GitHub, Stack Overflow, and dozens of other backends, then aggregates the results.

Three properties matter for an AI coding agent:

No rate limits you do not impose yourself. It runs on your machine. The only limit is whatever the upstream engines push back with, and SearXNG rotates across them.
JSON output mode. A single ?format=json flag gives back clean, parseable results — perfect for piping into a tool that an LLM calls.
Self-hosted and private. My queries do not get logged, profiled, or fed back into someone else's ad model. As a developer pasting half-finished code into search bars all day, that matters.

The combination — fast, free, JSON-native, private — is exactly what Claude Code needs to research without hitting a wall.

The Setup: Docker Compose + Bash Wrapper

The whole integration is two pieces: a SearXNG container and a small shell script Claude Code can call as a Bash tool.

Step 1: Run SearXNG Locally

A minimal docker-compose.yml:

services:
  searxng:
    image: searxng/searxng:latest
    container_name: searxng
    ports:
      - "8888:8080"
    volumes:
      - ./searxng/settings.yml:/etc/searxng/settings.yml:ro
      - ./searxng/limiter.toml:/etc/searxng/limiter.toml:ro
    environment:
      - SEARXNG_BASE_URL=http://localhost:8888/
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:8080/"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 15s

Enable JSON output by editing searxng/settings.yml:

search:
  formats:
    - html
    - json

Disable SearXNG's own bot detection so my agent does not throttle itself. Without this, rapid queries from the same IP trip the built-in limiter and searches start failing — exactly the problem I am trying to escape. Create searxng/limiter.toml:

# Disable rate limiting for local use
[botdetection.ip_limit]
link_token = false

This matters more than it looks. SearXNG ships bot detection on by default because public instances get hammered by scrapers, and upstream engines will block the instance if it looks like one. For a single-user local setup, that protection just gets in my way.

Start it once:

docker compose up -d searxng

Restart, and SearXNG is reachable at http://localhost:8888 with a JSON API and no internal rate limiting.

Step 2: A Bash Wrapper Claude Code Can Call

The wrapper does three things: check that SearXNG is up, encode the query, and format results so the LLM can read them efficiently.

#!/usr/bin/env bash
# SearXNG web search wrapper for Claude Code
# Usage: websearch "query" [num_results]

set -euo pipefail

SEARXNG_URL="${SEARXNG_URL:-http://localhost:8888}"
QUERY="${1:?Usage: websearch \"query\" [num_results]}"
NUM_RESULTS="${2:-10}"

if ! curl -sf "${SEARXNG_URL}/" >/dev/null 2>&1; then
    echo "ERROR: SearXNG not running at ${SEARXNG_URL}"
    echo "Start it: docker compose up -d searxng"
    exit 1
fi

ENCODED_QUERY=$(python3 -c "import urllib.parse; print(urllib.parse.quote('$QUERY'))")

curl -sf "${SEARXNG_URL}/search?q=${ENCODED_QUERY}&format=json" | python3 -c "
import json, sys
data = json.load(sys.stdin)
results = data.get('results', [])[:${NUM_RESULTS}]
if not results:
    print('No results found.')
    sys.exit(0)
print(f'Found {len(results)} results for: ${QUERY}\n')
for i, r in enumerate(results, 1):
    print(f'{i}. {r[\"title\"]}')
    print(f'   URL: {r[\"url\"]}')
    content = r.get('content', '').strip()
    if content:
        print(f'   {content[:300]}')
    print()
"

Drop it at ~/.claude/bin/websearch, chmod +x it, and that is the integration.

Step 3: Tell Claude Code to Prefer It

A short note in ~/.claude/CLAUDE.md flips the default. Tell the agent to prefer the local wrapper over the built-in tools, and point it at the script:

Prefer local SearXNG over WebSearch/WebFetch — the built-in tools are rate-limited.
Use ~/.claude/bin/websearch "your query" [num_results] instead.

From that point on, the agent reaches for the wrapper first and only falls back to the built-in tools when it has a specific reason — for example, fetching a page's full body text, which is curl's job once websearch has surfaced a URL.

What Changes in Practice

Three things got noticeably better for me.

Research sessions stop stalling. A long debugging or comparison session can fire fifty searches without hitting any limit. The agent stays in "check the source" mode instead of drifting back to whatever its training data remembers.

Latency drops. My local SearXNG instance answers in ~100-300ms once warm. The built-in tools have to round-trip through additional infrastructure. Faster searches mean the agent is willing to verify more often, which is exactly the behavior I want.

Privacy is mine again. Searches like "how to fix X in our internal Y service" never leave the machine for any party that was not strictly needed to answer the query. SearXNG even strips the most obvious identifying headers before forwarding to upstream engines.

What It Does Not Solve

This is not a substitute for Claude Code's WebFetch when I want the full content of a single known URL — curl plus a markdown converter handles that better anyway. It is also not a replacement for documentation MCP servers like Context7 when I need version-accurate API references; metasearch is best for breadth, MCP docs servers for depth.

And it does require keeping a small Docker container alive. On any developer machine I already have running databases or services in Docker, that is a non-issue. On a laptop I spin up rarely, I sometimes need to remember to start the container.

The Pattern Underneath

The bigger lesson is not really about SearXNG. It is about how I think when an AI tool's built-in capability has a ceiling that interferes with real work.

My instinct used to be to live with the limit. The better move, I have found, is to look at what the tool is doing — in this case, "issue web searches and return summarized results" — and ask whether the same shape of capability exists in an open, self-hostable form. Often it does. Wiring it in is a one-evening job. The agent gets meaningfully more useful, and the limit stops being a thing I work around.

Rate limits are a product decision someone made about a shared service. My local machine is not subject to that decision. When the work calls for unlimited search, I run unlimited search.

Self-Hosted SearXNG for Claude Code

Quick Answer