# DeepSeek Provider

Drop-in replacement for `openai.OpenAI` pointed at DeepSeek's API. Supports DeepSeek-V3 (`deepseek-chat`) and DeepSeek-R1 (`deepseek-reasoner`), with full reasoning-token tracking.
## Usage

```python
# Before
from openai import OpenAI
client = OpenAI(api_key="sk-...", base_url="https://api.deepseek.com")

# After — one import change
from kostrack import DeepSeek
client = DeepSeek(
    tags={
        "project": "openmanagr",
        "feature": "journal-classification",
    }
)

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Classify this transaction..."}],
)
```
DeepSeek's API is fully OpenAI-compatible. The `DeepSeek` wrapper is built on `openai.OpenAI` with `base_url="https://api.deepseek.com"` — so all OpenAI SDK options (timeouts, retries, proxies) work unchanged.
## Constructor parameters

| Parameter | Type | Description |
|---|---|---|
| `tags` | `dict` | Attribution tags applied to every call from this client. |
| `api_key` | `str` | DeepSeek API key. Defaults to the `$DEEPSEEK_API_KEY` env var. |
| `pricing_model` | `str` | `"per_token"` (default). DeepSeek does not yet offer batch pricing. |
| `**openai_kwargs` | any | All other kwargs are passed through to `openai.OpenAI()` (`timeout`, `max_retries`, etc.). |
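One way to picture the pass-through behavior is a kwargs split: kostrack-specific options stay with the wrapper, and everything else is forwarded to `openai.OpenAI()`. This is a minimal sketch of that idea — the `split_kwargs` helper and its defaulting logic are assumptions for illustration, not kostrack's actual implementation:

```python
import os

# Options the wrapper keeps for itself (per the table above)
KOSTRACK_PARAMS = {"tags", "pricing_model"}

def split_kwargs(**kwargs):
    """Separate kostrack options from kwargs forwarded to openai.OpenAI()."""
    ours = {k: v for k, v in kwargs.items() if k in KOSTRACK_PARAMS}
    ours.setdefault("pricing_model", "per_token")
    theirs = {k: v for k, v in kwargs.items() if k not in KOSTRACK_PARAMS}
    # Fall back to the env var and DeepSeek endpoint, matching the table
    theirs.setdefault("api_key", os.environ.get("DEEPSEEK_API_KEY"))
    theirs.setdefault("base_url", "https://api.deepseek.com")
    return ours, theirs

ours, theirs = split_kwargs(
    tags={"project": "openmanagr"}, timeout=30.0, max_retries=3
)
# `theirs` now carries timeout/max_retries plus api_key and base_url,
# exactly the shape openai.OpenAI() accepts
```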
## DeepSeek-R1 (reasoning model)

Use `deepseek-reasoner` to access DeepSeek-R1. Reasoning tokens are tracked in `token_breakdown.thinking` and billed as output tokens — exactly the same pattern as OpenAI o1/o3:
```python
reasoner = DeepSeek(
    tags={"project": "openmanagr", "feature": "ifrs-interpretation"}
)

response = reasoner.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{"role": "user", "content": "Interpret IAS 29..."}],
)
# token_breakdown.thinking = reasoning tokens used
```
## Supported models and pricing

| Model | Input / 1M tokens | Output / 1M tokens | Cache read / 1M |
|---|---|---|---|
| `deepseek-chat` (V3) | $0.27 | $1.10 | $0.07 |
| `deepseek-reasoner` (R1) | $0.55 | $2.19 | $0.14 |
Cache read tokens are charged at the discounted rate when your prompt hits DeepSeek's KV cache. Cache miss tokens (uncached) are charged at the standard input rate.
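Putting the table and the cache rule together, a call's cost splits the prompt into cached and uncached portions. A minimal sketch of that arithmetic (the `estimate_cost` helper is illustrative, not a kostrack API; rates are copied from the table above):

```python
# Per-1M-token rates from the pricing table (USD)
RATES = {
    "deepseek-chat": {"input": 0.27, "output": 1.10, "cache_read": 0.07},
    "deepseek-reasoner": {"input": 0.55, "output": 2.19, "cache_read": 0.14},
}

def estimate_cost(model, input_tokens, output_tokens, cached_tokens=0):
    """Cache hits bill at the cache-read rate; the remaining prompt
    tokens bill at the standard input rate."""
    r = RATES[model]
    uncached = input_tokens - cached_tokens
    return (
        uncached * r["input"]
        + cached_tokens * r["cache_read"]
        + output_tokens * r["output"]
    ) / 1_000_000

# e.g. 10k prompt tokens (8k cache hits) + 2k completion on deepseek-chat:
# (2,000 x $0.27 + 8,000 x $0.07 + 2,000 x $1.10) / 1M = $0.0033
cost = estimate_cost("deepseek-chat", 10_000, 2_000, cached_tokens=8_000)
```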
## Token breakdown

| Field | Description |
|---|---|
| `input_tokens` | Standard prompt tokens |
| `output_tokens` | Completion tokens (includes reasoning tokens for R1) |
| `cached_tokens` | Cache hit tokens — charged at the discounted rate |
| `token_breakdown.cache_write` | Cache miss tokens — charged at the full input rate |
| `token_breakdown.thinking` | Reasoning tokens (R1 only) — a subset of `output_tokens` |
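The fields above obey two accounting identities: cache hits plus cache misses make up the prompt, and thinking tokens are contained in the output count. A small illustrative check (the function and the numbers are hypothetical, not part of kostrack):

```python
def check_breakdown(input_tokens, cached_tokens, cache_write,
                    output_tokens, thinking=0):
    """Sanity-check the identities implied by the token breakdown table."""
    prompt_ok = cached_tokens + cache_write == input_tokens  # hits + misses = prompt
    thinking_ok = 0 <= thinking <= output_tokens             # thinking is a subset of output
    return prompt_ok and thinking_ok

# Hypothetical R1 call: 5k prompt (3k cached, 2k misses),
# 1.5k completion including 1.2k reasoning tokens
check_breakdown(5_000, 3_000, 2_000, 1_500, thinking=1_200)
```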
## Multi-model tracing

DeepSeek integrates with the same `kostrack.trace()` system as other providers. When you mix DeepSeek and Anthropic calls inside a trace, `cost_breakdown()` shows per-model attribution:
```python
with kostrack.trace(tags={"feature": "month-end-close"}) as t:
    with kostrack.span("reason", parent=t):
        reasoner.chat.completions.create(model="deepseek-reasoner", ...)
    with kostrack.span("extract", parent=t):
        anthropic_client.messages.create(model="claude-haiku-4-5-20251001", ...)
    with kostrack.span("post", parent=t):
        anthropic_client.messages.create(model="claude-sonnet-4-6", ...)

for item in t.cost_breakdown():
    print(f"{item['model']}: ${item['cost_usd']:.6f} ({item['pct']}%)")
# deepseek-reasoner: $0.0142 (71.0%)
# claude-haiku-4-5-20251001: $0.0038 (19.0%)
# claude-sonnet-4-6: $0.0020 (10.0%)
```