# DeepSeek Provider

Drop-in replacement for `openai.OpenAI` pointed at DeepSeek's API. Supports DeepSeek-V3 (`deepseek-chat`) and DeepSeek-R1 (`deepseek-reasoner`), with full reasoning-token tracking.
## Usage

```python
# Before
from openai import OpenAI
client = OpenAI(api_key="sk-...", base_url="https://api.deepseek.com")

# After — one import change
from kostrack import DeepSeek
client = DeepSeek(
    tags={
        "project": "openmanagr",
        "feature": "journal-classification",
    }
)

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Classify this transaction..."}],
)
```
DeepSeek's API is fully OpenAI-compatible. The `DeepSeek` wrapper is built on `openai.OpenAI` with `base_url="https://api.deepseek.com"` — so all OpenAI SDK options (timeouts, retries, proxies) work unchanged.
## Constructor parameters

| Parameter | Type | Description |
|---|---|---|
| `tags` | `dict` | Attribution tags applied to every call from this client. |
| `api_key` | `str` | DeepSeek API key. Defaults to the `$DEEPSEEK_API_KEY` env var. |
| `pricing_model` | `str` | `"per_token"` (default). DeepSeek does not yet offer batch pricing. |
| `**openai_kwargs` | any | All other kwargs are passed through to `openai.OpenAI()` (`timeout`, `max_retries`, etc.). |
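One way to picture the pass-through behavior is a kwargs split: kostrack-specific options stay with the wrapper, and everything else is forwarded to `openai.OpenAI()`. This is a minimal sketch of that idea — the `split_kwargs` helper and its defaulting logic are assumptions for illustration, not kostrack's actual implementation:

```python
import os

# Options the wrapper keeps for itself (per the table above)
KOSTRACK_PARAMS = {"tags", "pricing_model"}

def split_kwargs(**kwargs):
    """Separate kostrack options from kwargs forwarded to openai.OpenAI()."""
    ours = {k: v for k, v in kwargs.items() if k in KOSTRACK_PARAMS}
    ours.setdefault("pricing_model", "per_token")
    theirs = {k: v for k, v in kwargs.items() if k not in KOSTRACK_PARAMS}
    # Fall back to the env var and DeepSeek endpoint, matching the table
    theirs.setdefault("api_key", os.environ.get("DEEPSEEK_API_KEY"))
    theirs.setdefault("base_url", "https://api.deepseek.com")
    return ours, theirs

ours, theirs = split_kwargs(
    tags={"project": "openmanagr"}, timeout=30.0, max_retries=3
)
# `theirs` now carries timeout/max_retries plus api_key and base_url,
# exactly the shape openai.OpenAI() accepts
```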
## DeepSeek-R1 (reasoning model)

Use `deepseek-reasoner` to access DeepSeek-R1. Reasoning tokens are tracked in `token_breakdown.thinking` and billed as output tokens — exactly the same pattern as OpenAI o1/o3:
```python
reasoner = DeepSeek(
    tags={"project": "openmanagr", "feature": "ifrs-interpretation"}
)

response = reasoner.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{"role": "user", "content": "Interpret IAS 29..."}],
)
# token_breakdown.thinking = reasoning tokens used
```
## Supported models and pricing

| Model | Input / 1M tokens | Output / 1M tokens | Cache read / 1M |
|---|---|---|---|
| `deepseek-chat` (V3) | $0.27 | $1.10 | $0.07 |
| `deepseek-reasoner` (R1) | $0.55 | $2.19 | $0.14 |
Cache read tokens are charged at the discounted rate when your prompt hits DeepSeek's KV cache. Cache miss tokens (uncached) are charged at the standard input rate.
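Putting the table and the cache rule together, a call's cost splits the prompt into cached and uncached portions. A minimal sketch of that arithmetic (the `estimate_cost` helper is illustrative, not a kostrack API; rates are copied from the table above):

```python
# Per-1M-token rates from the pricing table (USD)
RATES = {
    "deepseek-chat": {"input": 0.27, "output": 1.10, "cache_read": 0.07},
    "deepseek-reasoner": {"input": 0.55, "output": 2.19, "cache_read": 0.14},
}

def estimate_cost(model, input_tokens, output_tokens, cached_tokens=0):
    """Cache hits bill at the cache-read rate; the remaining prompt
    tokens bill at the standard input rate."""
    r = RATES[model]
    uncached = input_tokens - cached_tokens
    return (
        uncached * r["input"]
        + cached_tokens * r["cache_read"]
        + output_tokens * r["output"]
    ) / 1_000_000

# e.g. 10k prompt tokens (8k cache hits) + 2k completion on deepseek-chat:
# (2,000 x $0.27 + 8,000 x $0.07 + 2,000 x $1.10) / 1M = $0.0033
cost = estimate_cost("deepseek-chat", 10_000, 2_000, cached_tokens=8_000)
```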
## Token breakdown

| Field | Description |
|---|---|
| `input_tokens` | Standard prompt tokens |
| `output_tokens` | Completion tokens (includes reasoning tokens for R1) |
| `cached_tokens` | Cache hit tokens — charged at the discounted rate |
| `token_breakdown.cache_write` | Cache miss tokens — charged at the full input rate |
| `token_breakdown.thinking` | Reasoning tokens (R1 only) — a subset of `output_tokens` |
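The fields above obey two accounting identities: cache hits plus cache misses make up the prompt, and thinking tokens are contained in the output count. A small illustrative check (the function and the numbers are hypothetical, not part of kostrack):

```python
def check_breakdown(input_tokens, cached_tokens, cache_write,
                    output_tokens, thinking=0):
    """Sanity-check the identities implied by the token breakdown table."""
    prompt_ok = cached_tokens + cache_write == input_tokens  # hits + misses = prompt
    thinking_ok = 0 <= thinking <= output_tokens             # thinking is a subset of output
    return prompt_ok and thinking_ok

# Hypothetical R1 call: 5k prompt (3k cached, 2k misses),
# 1.5k completion including 1.2k reasoning tokens
check_breakdown(5_000, 3_000, 2_000, 1_500, thinking=1_200)
```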
## Multi-model tracing

DeepSeek integrates with the same `kostrack.trace()` system as other providers. When you mix DeepSeek and Anthropic calls inside a trace, `cost_breakdown()` shows per-model attribution:
```python
with kostrack.trace(tags={"feature": "month-end-close"}) as t:
    with kostrack.span("reason", parent=t):
        reasoner.chat.completions.create(model="deepseek-reasoner", ...)
    with kostrack.span("extract", parent=t):
        anthropic_client.messages.create(model="claude-haiku-4-5-20251001", ...)
    with kostrack.span("post", parent=t):
        anthropic_client.messages.create(model="claude-sonnet-4-6", ...)

for item in t.cost_breakdown():
    print(f"{item['model']}: ${item['cost_usd']:.6f} ({item['pct']}%)")
# deepseek-reasoner: $0.0142 (71.0%)
# claude-haiku-4-5-20251001: $0.0038 (19.0%)
# claude-sonnet-4-6: $0.0020 (10.0%)
```