On a typical Claude Code coding turn we measured ~2,400 input tokens and ~700 output tokens. Multiply that by Claude 3.5 Sonnet's direct API price and you get $0.0177 per turn. Run the same turn through qlaud → DeepSeek V3 and you pay $0.00152 — roughly 12× less. Across a 100-turn debugging session, that's the difference between $1.77 and 15 cents.
## The math
The numbers below use the token counts above (2,400 input / 700 output — a representative coding turn: Claude Code reads a few files, writes a patch, runs a test). All prices are the customer-facing rate (upstream cost × qlaud's 1.07 markup).
| Model (via qlaud) | Input $/1M | Output $/1M | Per-turn cost | 100-turn session |
|---|---|---|---|---|
| claude-3-5-sonnet-20241022 | $3.21 | $16.05 | $0.0189 | $1.89 |
| deepseek-v3 | $0.29 | $1.18 | $0.00152 | $0.15 |
| deepseek-r1 (reasoning) | $0.59 | $2.34 | $0.00305 | $0.31 |
| llama-3.3-70b via Groq | $0.63 | $0.85 | $0.00211 | $0.21 |
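The per-turn figures are straight token arithmetic; here's a short script that reproduces the table from the listed rates, if you want to plug in your own token counts:

```python
# Reproduce the per-turn costs above. Prices are the qlaud
# customer-facing rates from the table, in $ per 1M tokens.
PRICES = {
    "claude-3-5-sonnet-20241022": (3.21, 16.05),
    "deepseek-v3": (0.29, 1.18),
    "deepseek-r1": (0.59, 2.34),
    "llama-3.3-70b": (0.63, 0.85),
}

INPUT_TOKENS = 2_400   # representative coding turn
OUTPUT_TOKENS = 700

def per_turn_cost(model: str) -> float:
    """Cost in dollars for one turn at the measured token counts."""
    in_rate, out_rate = PRICES[model]
    return (INPUT_TOKENS * in_rate + OUTPUT_TOKENS * out_rate) / 1_000_000

for model in PRICES:
    cost = per_turn_cost(model)
    print(f"{model}: ${cost:.5f}/turn, ${cost * 100:.2f}/100 turns")
```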
## Why the gap is so big
Anthropic prices Claude 3.5 Sonnet at the top of the closed-frontier-model tier — those models pay for the largest training runs and the biggest research teams. DeepSeek prices V3 at the floor of what their inference economics actually cost, because the open-weights distribution model means they don't need to amortize an exclusive product moat through API margins.
For most coding tasks you don't need the marginal frontier capability — you need decent reasoning + good code synthesis + tool-call reliability. DeepSeek V3 hits that bar today.
## The setup, end-to-end
Two environment variables in your shell, then launch Claude Code:

```shell
export ANTHROPIC_BASE_URL=https://api.qlaud.ai
export ANTHROPIC_API_KEY=ak_live_<your_key>
claude
```

Claude Code reads those two vars and hits us instead of api.anthropic.com. Internally we translate the Anthropic Messages format → OpenAI Chat Completions (DeepSeek's native shape) → Cloudflare AI Gateway → DeepSeek. The streaming response comes back through the same pipe. Tool calls, system prompts, max_tokens, stop_sequences, reasoning content — all preserved.
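To make the translation step concrete, here's a minimal sketch of the Anthropic Messages → OpenAI Chat Completions mapping. The field handling is illustrative, not qlaud's actual code — the real pipeline also maps tool calls, reasoning content, and streaming chunks:

```python
# Sketch: map an Anthropic Messages request body to an
# OpenAI Chat Completions request body. Illustrative only.
def anthropic_to_openai(req: dict) -> dict:
    messages = []
    # Anthropic carries the system prompt as a top-level field;
    # OpenAI expects it as the first message.
    if "system" in req:
        messages.append({"role": "system", "content": req["system"]})
    for m in req["messages"]:
        content = m["content"]
        # Anthropic content may be a list of typed blocks; keep the text ones.
        if isinstance(content, list):
            content = "".join(
                b["text"] for b in content if b.get("type") == "text"
            )
        messages.append({"role": m["role"], "content": content})
    return {
        "model": "deepseek-chat",  # upstream model name (assumed)
        "messages": messages,
        "max_tokens": req.get("max_tokens", 4096),
        "stop": req.get("stop_sequences", []),
        "stream": req.get("stream", False),
    }
```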
## When you still want Claude
DeepSeek V3 isn't Claude 3.5 Sonnet. The places we've seen Claude pull ahead in real Claude Code sessions:
- Agent loops with 20+ tool calls in a row — Claude is more reliable about re-reading the spec each turn.
- Subtle multi-file refactors where one change implies edits in three other files — Claude tends to find the implied changes; DeepSeek sometimes stops at the first file.
- Anything that benefits from the larger Claude context window (200K vs 64K).
The clean answer: route by task. Use DeepSeek V3 for the 80%, fall back to Claude 3.5 Sonnet for the 20%. Both run through qlaud with the same API key — no SDK juggling, no rebuilding your agent loop.
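If you drive the API programmatically, routing by task is one `if` statement, since both models sit behind the same endpoint and key. The heuristics below are illustrative placeholders keyed to the failure modes listed above, not a tested policy:

```python
# Sketch: pick a model per request. Thresholds are illustrative.
def pick_model(task: dict) -> str:
    long_agent_loop = task.get("tool_calls_so_far", 0) >= 20
    wide_refactor = len(task.get("files", [])) > 3
    big_context = task.get("context_tokens", 0) > 60_000  # near DeepSeek's 64K cap
    if long_agent_loop or wide_refactor or big_context:
        return "claude-3-5-sonnet-20241022"  # the hard 20%
    return "deepseek-v3"                     # the cheap 80%
```

Only the `model` string changes between requests; the base URL and key stay the same.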
## Get started
Sign up for qlaud, top up $5, point Claude Code at us. The first $5 will run you a few days of DeepSeek V3 sessions. If you don't love it, top-ups don't auto-renew — just walk away.