Cost Tracking & Logging
The Visibility Problem
LLM API costs can escalate quickly and unpredictably:
- No visibility: Without tracking, you discover cost issues after the invoice arrives
- Attribution difficulty: Hard to know which features, users, or prompts drive costs
- Debugging challenges: Can't replay or analyze requests without logs
- Compliance needs: Some use cases require audit trails of AI interactions
Two Layers of Cost Awareness
majordomo-llm provides two layers of cost awareness:
- Built-in usage metrics: Every response includes token counts and calculated costs
- Optional logging subsystem: Async, non-blocking persistence of all requests with full request/response bodies
The logging system uses pluggable adapters:
| Component | Local Development | Production |
|---|---|---|
| Metadata DB | SqliteAdapter | PostgresAdapter, MySQLAdapter |
| Body Storage | FileStorageAdapter | S3Adapter |
Basic Cost Tracking
Every response includes usage metrics, so no logging setup is required:
response = await llm.get_response("Hello!")
print(f"Input tokens: {response.input_tokens}")
print(f"Output tokens: {response.output_tokens}")
print(f"Total cost: ${response.total_cost:.6f}")
print(f"Response time: {response.response_time:.2f}s")
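Because these fields arrive on every response, per-feature attribution can be done in memory before any logging is configured. The following is a minimal sketch, not part of majordomo-llm: `CostTracker` and the feature labels are hypothetical, and `Usage` is a stand-in dataclass that mirrors only the `input_tokens`, `output_tokens`, and `total_cost` fields shown above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Usage:
    """Stand-in for the usage fields on a response (hypothetical type)."""
    input_tokens: int
    output_tokens: int
    total_cost: float


class CostTracker:
    """Hypothetical helper: accumulates tokens and cost per feature label."""

    def __init__(self) -> None:
        self.totals = defaultdict(
            lambda: {"calls": 0, "input_tokens": 0, "output_tokens": 0, "cost": 0.0}
        )

    def record(self, feature: str, response) -> None:
        # Works with any object exposing the three usage attributes.
        entry = self.totals[feature]
        entry["calls"] += 1
        entry["input_tokens"] += response.input_tokens
        entry["output_tokens"] += response.output_tokens
        entry["cost"] += response.total_cost

    def report(self) -> list[tuple[str, float]]:
        """Return (feature, cost) pairs, most expensive first."""
        pairs = ((name, t["cost"]) for name, t in self.totals.items())
        return sorted(pairs, key=lambda pair: pair[1], reverse=True)


tracker = CostTracker()
tracker.record("summarize", Usage(1200, 300, 0.006))
tracker.record("chat", Usage(200, 50, 0.001))
tracker.record("summarize", Usage(800, 200, 0.004))

for feature, cost in tracker.report():
    print(f"{feature}: ${cost:.6f}")
```

In real use, the `Usage` objects would be replaced by the responses returned from `get_response`. Totals kept this way are lost on restart, which is the gap the logging subsystem is meant to fill.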
Costs are calculated using per-model rates from llm_config.yaml:
gpt-4o:
  input_cost_per_million: 2.5
  output_cost_per_million: 10.0
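To make the arithmetic concrete, here is a small sketch of how a cost could be derived from these rates. It assumes the rates are in US dollars per million tokens and that the cost is a simple linear sum. `estimate_cost` is a hypothetical helper for illustration, and the library's own calculation may account for details not covered here.

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_cost_per_million: float,
    output_cost_per_million: float,
) -> float:
    """Assumed formula: tokens / 1,000,000 * rate, summed over both directions."""
    input_cost = input_tokens / 1_000_000 * input_cost_per_million
    output_cost = output_tokens / 1_000_000 * output_cost_per_million
    return input_cost + output_cost


# gpt-4o rates from the config above: 2.5 (input) and 10.0 (output)
cost = estimate_cost(1_200, 300, 2.5, 10.0)
print(f"${cost:.6f}")  # $0.006000
```

With these rates, 1,200 input tokens and 300 output tokens each contribute $0.003, so output tokens dominate quickly even when responses are short.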
Full Request Logging
To persist requests for later analysis, wrap an LLM instance in LoggingLLM:
from majordomo_llm import get_llm_instance
from majordomo_llm.logging import LoggingLLM, SqliteAdapter, FileStorageAdapter
# Create adapters
db = await SqliteAdapter.create("llm_logs.db")
storage = await FileStorageAdapter.create("./request_logs")
# Wrap any LLM instance
llm = get_llm_instance("openai", "gpt-4o", api_key_alias="prod-key-1")
logged_llm = LoggingLLM(llm, db, storage)
# Use normally - logging happens async in background
response = await logged_llm.get_response("Hello!")
# Ensure logs are written before shutdown
await logged_llm.flush()
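The reason `flush()` matters is easier to see with a toy model of background logging. The sketch below is not the LoggingLLM implementation: `BackgroundLogger` and its `sink` parameter are hypothetical, and it only shows one common way to get non-blocking writes with an `asyncio.Queue` and a worker task.

```python
import asyncio


class BackgroundLogger:
    """Hypothetical sketch: queue records, persist them in a worker task."""

    def __init__(self, sink) -> None:
        self._sink = sink  # async callable that persists one record
        self._queue: asyncio.Queue = asyncio.Queue()
        self._worker = None

    def log(self, record: dict) -> None:
        """Enqueue without awaiting, so the caller never waits on I/O."""
        if self._worker is None:
            self._worker = asyncio.create_task(self._run())
        self._queue.put_nowait(record)

    async def _run(self) -> None:
        while True:
            record = await self._queue.get()
            try:
                await self._sink(record)
            except Exception as exc:  # keep the worker alive after a bad write
                print(f"log write failed: {exc}")
            finally:
                self._queue.task_done()

    async def flush(self) -> None:
        """Wait until every queued record has been handed to the sink."""
        await self._queue.join()

    async def close(self) -> None:
        await self.flush()
        if self._worker is not None:
            self._worker.cancel()


async def main():
    written = []

    async def slow_sink(record: dict) -> None:
        await asyncio.sleep(0.01)  # simulate a database write
        written.append(record)

    logger = BackgroundLogger(slow_sink)
    logger.log({"model": "gpt-4o", "total_cost": 0.006})
    logger.log({"model": "gpt-4o", "total_cost": 0.001})
    pending = len(written)  # the worker has not had a chance to run yet
    await logger.close()
    return pending, written


pending, written = asyncio.run(main())
print(pending, len(written))  # 0 2
```

In this model, records sit in the queue until the worker gets a turn on the event loop. Exiting without a flush would drop whatever is still queued, which is why the flush call belongs in shutdown handling.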
What Gets Logged
- Timestamp, provider, model
- Token counts and costs
- Response time
- API key hash and alias (for attribution)
- Full request/response bodies (stored separately in S3/filesystem)
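Logged fields like these make attribution a matter of a single query. The example below is a sketch under stated assumptions: the `llm_requests` table, its column names, and the `body_key` pointer are hypothetical and do not reflect the schema that SqliteAdapter actually creates. SHA-256 is also an assumption, since the list above says only that a hash of the API key is stored.

```python
import hashlib
import sqlite3


def hash_api_key(api_key: str) -> str:
    """Assumed hashing scheme: store a SHA-256 digest, never the raw key."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE llm_requests (          -- hypothetical schema
        created_at     TEXT,
        provider       TEXT,
        model          TEXT,
        input_tokens   INTEGER,
        output_tokens  INTEGER,
        total_cost     REAL,
        response_time  REAL,
        api_key_hash   TEXT,
        api_key_alias  TEXT,
        body_key       TEXT              -- pointer to the body in file/S3 storage
    )
    """
)

rows = [
    ("2024-01-01T10:00:00Z", "openai", "gpt-4o", 1200, 300, 0.006, 1.2,
     hash_api_key("sk-example-1"), "prod-key-1", "bodies/0001.json"),
    ("2024-01-01T10:05:00Z", "openai", "gpt-4o", 800, 200, 0.004, 0.9,
     hash_api_key("sk-example-1"), "prod-key-1", "bodies/0002.json"),
    ("2024-01-01T10:07:00Z", "openai", "gpt-4o", 200, 50, 0.001, 0.4,
     hash_api_key("sk-example-2"), "dev-key-1", "bodies/0003.json"),
]
conn.executemany("INSERT INTO llm_requests VALUES (?,?,?,?,?,?,?,?,?,?)", rows)

# Attribute spend to each API key alias and model, most expensive first.
summary = conn.execute(
    """
    SELECT api_key_alias, model, COUNT(*) AS calls, SUM(total_cost) AS cost
    FROM llm_requests
    GROUP BY api_key_alias, model
    ORDER BY cost DESC
    """
).fetchall()

for alias, model, calls, cost in summary:
    print(f"{alias} {model}: {calls} calls, ${cost:.6f}")
```

Keeping bodies out of the metadata table, as the last bullet describes, keeps aggregate queries like this one small and fast, because they never touch the large request and response payloads.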
Next Steps
See the Cost Tracking & Logging recipe for production configurations with Postgres and S3.