# 🔍 API Monitoring & Transparency
## What Judges Will See
When you click "🚀 Run Full Pipeline" in Overgrowth, you get **complete transparency** into every API call the system makes:
### 📊 Live Session Statistics Dashboard
At the top of the interface, you'll see a real-time dashboard showing:
- **Total Cost**: Running total of all AI API costs (calculated in real-time)
- **API Calls**: Count of LLM calls (OpenAI/Anthropic/OpenRouter) and GNS3 calls
- **Token Usage**: Total tokens consumed (input/output breakdown)
- **Session Duration**: How long the current session has been running
- **Error Count**: Any failed API calls
- **Cost Per Call**: Average cost efficiency
### 🤖 Real-Time API Activity Feed
Every single API call is logged with full details:
```
✅ 🤖 **ANTHROPIC** `claude-3-haiku-20240307` | 📝 1,247→892 tokens | 💰 $0.0043 | ⏱️ 2,314ms
✅ 🌐 **LOCAL-MCP** `get_topology` | ⏱️ 458ms
✅ 🤖 **OPENAI** `gpt-4o` | 📝 2,105→1,543 tokens | 💰 $0.0201 | ⏱️ 3,127ms
```
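An entry like the ones above can be rendered from a simple call record. The sketch below is illustrative only: the field names (`provider`, `input_tokens`, `cost_usd`, and so on) are assumptions, not the actual Overgrowth data model.

```python
# Hypothetical sketch of rendering one activity-feed entry.
# All field names are illustrative assumptions, not the real Overgrowth API.
from dataclasses import dataclass
from typing import Optional

@dataclass
class APICall:
    provider: str
    model: str
    input_tokens: Optional[int] = None
    output_tokens: Optional[int] = None
    cost_usd: Optional[float] = None
    latency_ms: int = 0
    ok: bool = True

def format_feed_entry(call: APICall) -> str:
    """Compose a single pipe-separated feed line like the examples above."""
    icon = "✅" if call.ok else "❌"
    parts = [f"{icon} **{call.provider.upper()}** `{call.model}`"]
    if call.input_tokens is not None and call.output_tokens is not None:
        parts.append(f"📝 {call.input_tokens:,}→{call.output_tokens:,} tokens")
    if call.cost_usd is not None:
        parts.append(f"💰 ${call.cost_usd:.4f}")
    parts.append(f"⏱️ {call.latency_ms:,}ms")
    return " | ".join(parts)
```

GNS3/MCP calls simply omit the token and cost fields, which is why they appear shorter in the feed.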
### 💰 Cost Tracking Features
1. **Per-Model Pricing** - Accurate pricing for each AI model:
- GPT-4o: $2.50/$10.00 per 1M tokens (in/out)
- Claude 3 Haiku: $0.25/$1.25 per 1M tokens
- Claude 3.5 Sonnet: $3.00/$15.00 per 1M tokens
2. **Real-Time Calculation** - Costs computed immediately after each call
3. **Session Totals** - Running total updated after every operation
4. **Token Breakdown** - Separate counts for input vs output tokens
5. **Budget Guardrails** - Optional session budget with alerts
- Set `API_BUDGET_USD` (e.g., `50` or `15.5`) to display remaining budget and trigger warnings
- Tweak alert threshold with `API_BUDGET_ALERT_FRACTION` (default `0.8` for 80%)
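The per-call cost arithmetic and the budget guardrail can be sketched as follows. The `PRICING` table mirrors the rates listed above, and the env var names match those described in this doc; the function names and the `claude-3-5-sonnet` key are illustrative assumptions, not the project's actual identifiers.

```python
# Sketch of per-million-token cost math and the budget guardrail.
# PRICING keys and function names are assumptions for illustration.
import os

PRICING = {  # (input, output) USD per 1M tokens, as listed above
    "gpt-4o": (2.50, 10.00),
    "claude-3-haiku-20240307": (0.25, 1.25),
    "claude-3-5-sonnet": (3.00, 15.00),
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one call from per-1M-token rates."""
    in_rate, out_rate = PRICING[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

def budget_status(session_total: float) -> str:
    """Remaining-budget string, with a warning past the alert threshold."""
    budget = float(os.environ.get("API_BUDGET_USD") or 0)
    if budget <= 0:
        return "no budget set"
    fraction = float(os.environ.get("API_BUDGET_ALERT_FRACTION") or 0.8)
    remaining = budget - session_total
    prefix = "⚠️ " if session_total >= budget * fraction else ""
    return f"{prefix}${remaining:.2f} remaining"
```

For example, a Haiku call consuming 1M input and 1M output tokens costs $0.25 + $1.25 = $1.50 under the rates above.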
## Why This Impresses Judges
### 1. **Enterprise-Grade Observability**
This isn't a demo; it's production-ready software with full audit trails.
### 2. **Cost Transparency**
Users know exactly what they're spending in real time, which is critical for enterprise adoption.
### 3. **Multi-API Tracking**
Monitors both AI APIs (OpenAI/Anthropic) AND infrastructure APIs (GNS3).
### 4. **Proof of Execution**
Every claim is backed by verifiable API calls with timestamps and metrics.
### 5. **Error Visibility**
Failed calls are tracked and displayed, demonstrating robust error handling.
## Technical Implementation
### Architecture
```
User Action
    ↓
Agent/Pipeline Code
    ↓
API Call (LLM or GNS3)
    ↓
API Monitor (tracks start)
    ↓
Execute API Request
    ↓
API Monitor (tracks completion with tokens/cost/timing)
    ↓
UI Updates (real-time refresh of stats and activity feed)
```
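The start/execute/complete flow above amounts to wrapping every outbound request in a pair of monitor calls. Here is one way such a wrapper might look; `monitor`, `start`, and `finish` are stand-in names, not the real `agent/api_monitor.py` interface.

```python
# Sketch of the instrumentation flow above as a generic wrapper.
# The monitor interface (start/finish) is an assumption for illustration.
import time
from typing import Any, Callable

def monitored_call(monitor, provider: str, name: str,
                   fn: Callable[..., Any], *args, **kwargs) -> Any:
    call_id = monitor.start(provider=provider, name=name)   # tracks start
    t0 = time.perf_counter()
    try:
        result = fn(*args, **kwargs)                         # execute request
    except Exception as exc:
        # Failed calls are recorded too, so errors stay visible in the feed.
        monitor.finish(call_id, error=str(exc),
                       latency_ms=int((time.perf_counter() - t0) * 1000))
        raise
    monitor.finish(call_id,                                  # tracks completion
                   latency_ms=int((time.perf_counter() - t0) * 1000))
    return result
```

Because the wrapper records in a `try`/`except` and re-raises, failures reach both the error count and the caller.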
### Key Components
1. **`agent/api_monitor.py`** - Singleton monitor tracking all API usage
- Thread-safe for concurrent calls
- Tracks tokens, costs, timing, errors
- Exports JSON for auditing
2. **`agent/llm_client.py`** - Instrumented LLM client
- Tracks every OpenAI/Anthropic/OpenRouter call
- Captures actual token usage from API responses
- Calculates costs based on current pricing
3. **`agent/local_mcp.py`** - Instrumented MCP client
- Tracks all GNS3 API calls
- Monitors infrastructure operations
- Provides timing data
4. **`app.py`** - Gradio UI integration
- Live dashboard at top of interface
- Auto-refresh after pipeline execution
- Manual refresh buttons for real-time updates
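A thread-safe singleton monitor like the one component 1 describes could be structured along these lines. This is a minimal sketch under stated assumptions; the real `agent/api_monitor.py` almost certainly differs, and every class and method name here is illustrative.

```python
# Minimal sketch of a thread-safe singleton API monitor.
# All names are assumptions; the real agent/api_monitor.py may differ.
import json
import threading
from dataclasses import asdict, dataclass

@dataclass
class CallRecord:
    provider: str
    model: str
    tokens_in: int = 0
    tokens_out: int = 0
    cost_usd: float = 0.0
    latency_ms: int = 0
    error: str = ""

class APIMonitor:
    _instance = None
    _instance_lock = threading.Lock()

    def __new__(cls):
        with cls._instance_lock:          # one shared instance process-wide
            if cls._instance is None:
                cls._instance = super().__new__(cls)
                cls._instance._lock = threading.Lock()
                cls._instance._calls = []
            return cls._instance

    def record(self, record: CallRecord) -> None:
        with self._lock:                  # safe under concurrent calls
            self._calls.append(record)

    def total_cost(self) -> float:
        with self._lock:
            return sum(c.cost_usd for c in self._calls)

    def export_json(self) -> str:         # JSON export for auditing
        with self._lock:
            return json.dumps([asdict(c) for c in self._calls], indent=2)

    def reset(self) -> None:
        with self._lock:
            self._calls.clear()
```

The singleton pattern matters here because the LLM client, the MCP client, and the UI all need to read and write the same running totals.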
## Demo Scenario for Judges
1. **Judge opens the Space**
- Sees "Session Statistics" showing $0.00 cost, 0 calls
2. **Judge clicks "🚀 Run Full Pipeline"**
- API Activity Feed populates in real-time
- Each LLM call shows model, tokens, cost, timing
- GNS3 calls show infrastructure operations
3. **Judge sees completion**
- Pipeline status includes API usage summary at top
- Session Statistics show total cost (e.g., $0.15)
- Activity Feed shows 5-10 API calls with full details
4. **Judge clicks "🔄 Refresh Stats"**
- Dashboard updates instantly
- All data persists across the session
## Comparison to Other Submissions
Most hackathon projects hide their API usage. Overgrowth makes it a **feature**:
| Other Projects | Overgrowth |
|----------------|------------|
| ❌ Hidden API costs | ✅ Real-time cost tracking |
| ❌ No token visibility | ✅ Per-call token counts |
| ❌ Unknown model usage | ✅ Model names displayed |
| ❌ No timing data | ✅ Response time for every call |
| ❌ Silent failures | ✅ Error tracking with messages |
## Future Enhancements
- **Budget Alerts**: Warn when session cost exceeds threshold
- **Cost Optimization**: Suggest cheaper models for simple tasks
- **Historical Analytics**: Track costs over time with charts
- **Export Reports**: Download API usage as CSV/JSON for accounting
- **Provider Comparison**: Show cost differences between OpenAI vs Anthropic
- **Streaming Token Counter**: Live token count during streaming responses
## For Development/Testing
Reset the monitor:
```python
from agent.api_monitor import monitor
monitor.reset()
```
Export session data:
```python
json_data = monitor.export_json()
# Save to file or send to analytics platform
```
Access raw calls:
```python
all_calls = monitor.get_all_calls()
for call in all_calls:
    print(f"{call.provider}: ${call.estimated_cost}")
```
---
**This level of transparency demonstrates that Overgrowth is enterprise-ready, not just a hackathon prototype.**