Tracing & Spans
Wrap multi-step agentic workflows to get total cost per workflow run — not just per individual API call.
The problem
A single user action in a LangGraph agent might trigger 8–20 LLM calls across multiple models. Without tracing, you see 20 disconnected rows in your call log. With tracing, you see one workflow run costing $0.0047 — attributable to a feature, user, and timestamp.
trace()
Opens a root trace span. All LLM calls on the same thread inside the block are linked by trace_id:
```python
import kostrack
from kostrack import Anthropic

client = Anthropic(tags={"project": "openmanagr"})

with kostrack.trace(tags={"feature": "month-end-close"}) as t:
    result_1 = client.messages.create(...)
    result_2 = client.messages.create(...)

print(f"Total: ${t.total_cost_usd:.6f} in {t.call_count} calls")
```
span()
Opens a child span for named sub-workflows. Costs roll up from child spans to the root trace on exit:
```python
with kostrack.trace(tags={"feature": "invoice-pipeline"}) as root:
    with kostrack.span("validate", parent=root):
        client.messages.create(...)              # validation call
    with kostrack.span("classify", parent=root):
        oai_client.chat.completions.create(...)  # OpenAI call
    with kostrack.span("post", parent=root):
        client.messages.create(...)              # posting call

# root.total_cost_usd = sum of all three spans
```
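The exit-time roll-up can be pictured with plain context managers. This is an illustrative simplification, not kostrack's actual implementation — the `Span` class and its `record()` method here are hypothetical:

```python
class Span:
    """Sketch of exit-time cost roll-up (illustrative only)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.total_cost_usd = 0.0

    def record(self, cost_usd):
        # Called once per LLM call attributed to this span.
        self.total_cost_usd += cost_usd

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        # On exit, fold this span's total into its parent.
        if self.parent is not None:
            self.parent.total_cost_usd += self.total_cost_usd
        return False

with Span("root") as root:
    with Span("validate", parent=root) as s:
        s.record(0.0010)
    with Span("classify", parent=root) as s:
        s.record(0.0025)

print(f"{root.total_cost_usd:.4f}")  # 0.0035
```

Accumulating on exit (rather than writing to the root directly) is what lets each span also report its own subtotal.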
Multi-model cost breakdown
When a trace spans multiple providers or models, cost_breakdown() returns per-model attribution sorted by spend:
```python
with kostrack.trace(tags={"feature": "month-end-close"}) as t:
    with kostrack.span("reason", parent=t):
        deepseek.chat.completions.create(model="deepseek-reasoner", ...)
    with kostrack.span("extract", parent=t):
        anthropic.messages.create(model="claude-haiku-4-5-20251001", ...)

print(f"Total: ${t.total_cost_usd:.6f} across {t.call_count} calls")
for item in t.cost_breakdown():
    print(f"  {item['model']}: ${item['cost_usd']:.6f} ({item['pct']}%)")
#   deepseek-reasoner:         $0.0142 (78.9%)
#   claude-haiku-4-5-20251001: $0.0038 (21.1%)
```
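Under the hood, a breakdown like this is just a grouped sum over per-call records, sorted by spend. A self-contained sketch of that computation (the record field names are assumptions, not kostrack's internals):

```python
from collections import defaultdict

def cost_breakdown(calls):
    """Group per-call costs by model, sorted by spend (illustrative)."""
    by_model = defaultdict(float)
    for call in calls:
        by_model[call["model"]] += call["cost_usd"]
    total = sum(by_model.values())
    return [
        {"model": m, "cost_usd": c, "pct": round(100 * c / total, 1)}
        for m, c in sorted(by_model.items(), key=lambda kv: -kv[1])
    ]

calls = [
    {"model": "deepseek-reasoner", "cost_usd": 0.0142},
    {"model": "claude-haiku-4-5-20251001", "cost_usd": 0.0038},
]
for item in cost_breakdown(calls):
    print(f"{item['model']}: ${item['cost_usd']:.6f} ({item['pct']}%)")
# deepseek-reasoner: $0.014200 (78.9%)
# claude-haiku-4-5-20251001: $0.003800 (21.1%)
```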
Schema
| Column | Description |
|---|---|
| trace_id | UUID shared by all calls in a workflow run |
| span_id | UUID for this specific call |
| parent_span_id | UUID of the parent span (NULL for root calls) |
Querying traces
```sql
-- Total cost per workflow run, most expensive first
SELECT * FROM trace_costs
ORDER BY total_cost_usd DESC
LIMIT 20;

-- Calls in a specific trace
SELECT time, provider, model, cost_usd, tags->>'span_name'
FROM llm_calls
WHERE trace_id = 'your-trace-uuid'
ORDER BY time;
```
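The per-trace roll-up behind `trace_costs` is a plain `GROUP BY` over the schema above. A self-contained sketch using in-memory SQLite with made-up data (the real backing store may differ — the `tags->>` operator above suggests Postgres):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE llm_calls (
    trace_id TEXT,
    span_id TEXT,
    parent_span_id TEXT,
    model TEXT,
    cost_usd REAL
);
INSERT INTO llm_calls VALUES
    ('t1', 's1', NULL, 'deepseek-reasoner', 0.0142),
    ('t1', 's2', NULL, 'claude-haiku-4-5-20251001', 0.0038),
    ('t2', 's3', NULL, 'claude-haiku-4-5-20251001', 0.0009);
""")

# Per-trace roll-up, most expensive first (what trace_costs provides)
rows = conn.execute("""
    SELECT trace_id,
           ROUND(SUM(cost_usd), 6) AS total_cost_usd,
           COUNT(*) AS call_count
    FROM llm_calls
    GROUP BY trace_id
    ORDER BY total_cost_usd DESC
""").fetchall()
print(rows)  # [('t1', 0.018, 2), ('t2', 0.0009, 1)]
```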
Thread safety
The trace stack is thread-local — each thread has its own active trace. Concurrent requests in a FastAPI or async application will never interfere with each other's traces.
Wrap your LangGraph graph.invoke() call inside a kostrack.trace() block. Every LLM node in the graph will automatically inherit the trace ID.