SDK Reference

DeepSeek Provider

Drop-in replacement for openai.OpenAI pointed at DeepSeek's API. Supports DeepSeek-V3 (deepseek-chat) and DeepSeek-R1 (deepseek-reasoner), with full reasoning token tracking.

Usage

# Before
from openai import OpenAI
client = OpenAI(api_key="sk-...", base_url="https://api.deepseek.com")

# After — one import change
from kostrack import DeepSeek

client = DeepSeek(
    tags={
        "project": "openmanagr",
        "feature": "journal-classification",
    }
)

response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Classify this transaction..."}],
)

OpenAI-compatible API

DeepSeek's API is fully OpenAI-compatible. The DeepSeek wrapper is built on openai.OpenAI with base_url="https://api.deepseek.com" — so all OpenAI SDK options (timeouts, retries, proxies) work unchanged.
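
For example, standard OpenAI client options can be passed straight through the constructor. A minimal sketch (the timeout and retry values here are only illustrative):

from kostrack import DeepSeek

client = DeepSeek(
    tags={"project": "openmanagr"},
    timeout=30.0,     # forwarded to openai.OpenAI()
    max_retries=3,    # forwarded to openai.OpenAI()
)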

Constructor parameters

| Parameter | Type | Description |
| --- | --- | --- |
| tags | dict | Attribution tags applied to every call from this client. |
| api_key | str | DeepSeek API key. Defaults to the $DEEPSEEK_API_KEY env var. |
| pricing_model | str | "per_token" (default). Batch pricing is not yet offered by DeepSeek. |
| **openai_kwargs | any | All other kwargs are passed to openai.OpenAI(): timeout, max_retries, etc. |

DeepSeek-R1 (reasoning model)

Use deepseek-reasoner to access DeepSeek-R1. Reasoning tokens are tracked in token_breakdown.thinking and billed as output tokens — exactly the same pattern as OpenAI o1/o3:

reasoner = DeepSeek(
    tags={"project": "openmanagr", "feature": "ifrs-interpretation"}
)

response = reasoner.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{"role": "user", "content": "Interpret IAS 29..."}],
)
# token_breakdown.thinking = reasoning tokens used

Supported models and pricing

| Model | Input / 1M tokens | Output / 1M tokens | Cache read / 1M tokens |
| --- | --- | --- | --- |
| deepseek-chat (V3) | $0.27 | $1.10 | $0.07 |
| deepseek-reasoner (R1) | $0.55 | $2.19 | $0.14 |

Cache read tokens are charged at the discounted rate when your prompt hits DeepSeek's KV cache. Cache miss tokens (uncached) are charged at the standard input rate.
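
As a worked example of the cache pricing, here is the arithmetic for a hypothetical deepseek-chat call (the token counts are made up; the rates come from the table above):

# Hypothetical deepseek-chat (V3) call: 8,000 prompt tokens,
# of which 5,000 hit the KV cache, plus 1,200 completion tokens.
cache_hit_tokens = 5_000     # cache read rate: $0.07 / 1M
cache_miss_tokens = 3_000    # standard input rate: $0.27 / 1M
output_tokens = 1_200        # output rate: $1.10 / 1M

cost = (
    cache_hit_tokens / 1e6 * 0.07
    + cache_miss_tokens / 1e6 * 0.27
    + output_tokens / 1e6 * 1.10
)
print(f"${cost:.6f}")  # $0.002480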

Token breakdown

| Field | Description |
| --- | --- |
| input_tokens | Standard prompt tokens |
| output_tokens | Completion tokens (includes reasoning tokens for R1) |
| cached_tokens | Cache hit tokens, charged at the discounted rate |
| token_breakdown.cache_write | Cache miss tokens, charged at the full input rate |
| token_breakdown.thinking | Reasoning tokens (R1 only), a subset of output_tokens |

Multi-model tracing

DeepSeek integrates with the same kostrack.trace() system as other providers. When you mix DeepSeek and Anthropic calls inside a trace, cost_breakdown() shows per-model attribution:

import kostrack

# `reasoner` is the DeepSeek client from above; `anthropic_client` is a
# kostrack-tracked Anthropic client set up elsewhere.
with kostrack.trace(tags={"feature": "month-end-close"}) as t:
    with kostrack.span("reason", parent=t):
        reasoner.chat.completions.create(model="deepseek-reasoner", ...)
    with kostrack.span("extract", parent=t):
        anthropic_client.messages.create(model="claude-haiku-4-5-20251001", ...)
    with kostrack.span("post", parent=t):
        anthropic_client.messages.create(model="claude-sonnet-4-6", ...)

for item in t.cost_breakdown():
    print(f"{item['model']}: ${item['cost_usd']:.6f} ({item['pct']}%)")
# deepseek-reasoner:        $0.0142 (71.0%)
# claude-haiku-4-5-20251001: $0.0038 (19.0%)
# claude-sonnet-4-6:         $0.0020 (10.0%)