OPEN SOURCE · MIT · FREE

Free LLMs,
on your machine.

FreeIA Gateway aggregates 10 free LLMs (9 cloud + Ollama) behind an OpenAI and Anthropic-compatible API. Automatic fallback, smart cache, native MCP server. No subscription.

Open source · MIT OpenAI compatible Anthropic compatible Self-hosted Zero subscription
⚡ ApiDelta today
Loading digest…
    Read full digest →

    Everything you need, free

    Nine providers aggregated, one endpoint, zero cost.

    ROUTING
    Automatic fallback

    Cerebras → Groq → Sambanova → Gemini → HuggingFace → NVIDIA NIM → Cloudflare AI → OpenRouter → Mistral → Ollama. 429 error or outage? The next one silently takes over.

    API
    Compatible OpenAI

    Endpoint /v1/chat/completions compatible OpenAI. Fonctionne avec tout client compatible : AnythingLLM, OpenCode, LibreChat, Continue.dev, Jan, LM Studio, Cursor…

    RESILIENCE
    Circuit breaker

    If a provider goes down or saturates, automatic cooldown (30s). Silent retry. Zero manual intervention.

    QUOTAS
    Quota manager

    SQLite tracking of requests and tokens per provider. Automatic daily reset. You never exceed free tier limits.

    DOCUMENTS
    RAG via AnythingLLM

    Upload your docs, query them with your free LLMs. No cloud subscription required.

    PRIVACY
    Self-hosted

    Nothing passes through our servers. Only requests to the free public APIs leave your network.

    MCP
    Native MCP server

    Connect Claude Code, Cursor or Cline directly to your free LLMs. MCP-compatible HTTP endpoint. 3-line config.

    ROUTING
    Smart routing

    Long context → Cerebras skipped automatically. Image detected → Gemini only. Explicit provider hint → direct routing. Zero added latency.

    CACHE
    Semantic cache

    Exact cache (SHA-256) or semantic (fastembed, cosine ≥ 0.90). Similar requests served from SQLite (TTL 1h). Preserves your quotas.

    SDK
    Compatible SDK Anthropic

    Point the Anthropic SDK at http://localhost:8002. Native /v1/messages endpoint — same format, same fields, free LLMs behind. stream=False only (streaming not supported).

    Ten LLMs, zero cost

    Create an account on each platform and generate your API key. No credit card required on any of them.

    Priority 1 · ~2 000 tok/s
    Cerebras, Llama 3.3 70B
    5 000 req/jour · 1 000 000 tokens
    Priority 2 · ~700 tok/s
    Groq, Llama 3.3 70B
    14 400 req/jour · 500 000 tokens
    Priority 3 · ~400 tok/s
    Sambanova, Llama 3.3 70B
    1 000 req/jour · 1 000 000 tokens
    + Llama 3.1 405B · Qwen 2.5 72B via model hint
    Priority 4
    Gemini Flash
    1 500 req/jour · 1 000 000 tokens
    Priority 5
    HuggingFace, Llama 3.1 70B
    1 000 req/jour · 500 000 tokens
    Priority 6 · 40 RPM
    NVIDIA NIM 100+ MODELS
    40 RPM · Llama, Mistral, DeepSeek, Qwen…
    Priority 7 · free
    Cloudflare AI
    10 000 req/jour · Llama, Mistral, Qwen…
    Priority 8 · 33 free models
    OpenRouter FREE
    200 req/jour · Qwen3 480B, DeepSeek R1, Mistral…
    Priority 9 · reserve
    Mistral Large
    100 req/jour · 200 000 tokens
    Priority 10 · local · no key
    Ollama LOCAL
    Unlimited · all installed models

    Up and running in 4 steps

    10 minutes, no experience required. If you can copy-paste, you can do this.

    Before you start — install these 2 free tools
    Once both are installed and Docker Desktop is open, open PowerShell (Windows) or Terminal (Mac/Linux) and continue below.
    1
    Download the code

    Two options. The easiest: download the ZIP, unzip it, no Git needed.

    Download ZIP
    No tools required · ~130 KB
    OR
    With Git (to get updates easily)
    git clone https://github.com/MAXIAWORLD/freeiaforge
    cd freeiaforge

    With the ZIP: right-click the downloaded file → Extract All → you get a freeiaforge-master folder. Rename it to freeiaforge if you want, then open it.

    2
    Get a free API key

    One key is enough to start. Easiest: Cerebras — Google login, no credit card, 5000 requests/day.

    1. Go to cloud.cerebras.ai and sign in with Google.
    2. In the menu, click API KeysCreate API Key.
    3. Copy the key (starts with csk-…) — keep it handy for the next step.

    You can add more keys later to increase your quota: Groq, Gemini, Mistral, OpenRouter… All optional.

    3
    Start FreeIA

    One command. The script creates the config file, lets you paste your key, then starts everything.

    Windows — double-click start.bat in the freeiaforge folder, OR in the terminal:
    .\start.bat
    Mac / Linux (Terminal)
    ./start.sh

    What happens?

    1. The script creates backend/.env and asks you to paste your key.
    2. Open backend/.env in Notepad (or your editor), find the CEREBRAS_API_KEY= line, paste your key after the =, save, close.
    3. Go back to the terminal and press Enter. Docker download starts (5-10 min the first time).
    4. When you see Uvicorn running on http://0.0.0.0:8002, you're ready.
    4
    Connect your client

    In any OpenAI-compatible client (AnythingLLM, OpenCode, LibreChat, Cursor, Jan, LM Studio…), set these 3 values:

    Base URL  : http://localhost:8002/v1
    API Key   : freeai
    Model     : freeai-gateway

    Test directly in the browser: http://localhost:8002/docs (interactive Swagger UI).

    Something broken? Common fixes
    "docker: command not found"

    Docker Desktop is not installed or not running. Install it, open the app, wait until the whale shows "Running", then retry.

    "git: command not found"

    Install Git for Windows from git-scm.com. Close and reopen PowerShell after installation.

    "env file backend/.env not found"

    You ran docker compose up directly instead of the .\start.bat script. Fix: type copy backend\.env.example backend\.env then retry.

    "script execution is disabled" on PowerShell

    Easiest fix: use start.bat instead (double-click or .\start.bat) — no Windows block. Or in PowerShell: Set-ExecutionPolicy -Scope CurrentUser RemoteSigned, confirm with Y, then retry.

    "port 8002 already in use"

    Another app is already using that port. Close it, or edit docker-compose.yml and change "8002:8002" to "8003:8002" (then use localhost:8003).

    Open source · MIT

    No account.
    Zero middleman.

    Open source code, no sign-up, no subscription. Your data never passes through our servers — only through the APIs of the providers you choose.

    GitHub → View the code Questions? Email us