# Debugging Guide
This guide provides comprehensive debugging strategies for troubleshooting issues in the AI Assistant System.
## Overview
Debugging AI systems requires a systematic approach to identify and resolve issues. This guide covers tools, techniques, and best practices for effective debugging.
## Common Issues and Solutions

### 1. Model Not Responding

**Symptoms:**

- Timeouts when making requests
- Empty or null responses
- Connection errors

**Debugging Steps:**

1. **Check API keys and configuration:**

```python
from app.core.config import get_settings

settings = get_settings()
print(f"API Key configured: {bool(settings.OPENAI_API_KEY)}")
print(f"Base URL: {settings.OPENAI_BASE_URL}")
```
2. **Test connectivity:**

```python
import httpx

async def test_connection():
    try:
        async with httpx.AsyncClient() as client:
            response = await client.get(f"{settings.OPENAI_BASE_URL}/models")
            print(f"Connection status: {response.status_code}")
    except Exception as e:
        print(f"Connection failed: {e}")
```
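Hangs are often caused by a client with no explicit timeout. A minimal sketch of setting explicit timeouts with httpx (the 5-second connect and 30-second read values are illustrative, not recommendations):

```python
import httpx

# 30s overall budget, but fail fast (5s) if the connection cannot be established.
timeout = httpx.Timeout(30.0, connect=5.0)

async def test_connection_with_timeout():
    async with httpx.AsyncClient(timeout=timeout) as client:
        response = await client.get(f"{settings.OPENAI_BASE_URL}/models")
        print(f"Connection status: {response.status_code}")
```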
3. **Check rate limits:**

```python
from app.core.monitoring import RateLimitMonitor

monitor = RateLimitMonitor()
current_usage = monitor.get_current_usage()
print(f"Current usage: {current_usage}")
```
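If rate limiting turns out to be the cause, retrying with exponential backoff usually smooths over intermittent failures. A minimal sketch using only asyncio (`generate_response` is the helper used throughout this guide):

```python
import asyncio

async def generate_with_backoff(prompt, max_retries=5):
    for attempt in range(max_retries):
        try:
            return await generate_response(prompt)
        except Exception as e:
            if attempt == max_retries - 1:
                raise
            delay = 2 ** attempt  # 1s, 2s, 4s, 8s between attempts
            print(f"Attempt {attempt + 1} failed ({e}); retrying in {delay}s")
            await asyncio.sleep(delay)
```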
### 2. Poor Response Quality

**Symptoms:**

- Irrelevant responses
- Inconsistent output format
- Low-quality content

**Debugging Steps:**

1. **Analyze the prompt:**

```python
from app.core.debugging import PromptAnalyzer

analyzer = PromptAnalyzer()
prompt_analysis = analyzer.analyze_prompt(prompt)
print(f"Prompt clarity: {prompt_analysis.clarity}")
print(f"Prompt specificity: {prompt_analysis.specificity}")
```
2. **Check model parameters:**

```python
def debug_model_parameters(temperature, max_tokens, top_p, frequency_penalty):
    print(f"Temperature: {temperature}")
    print(f"Max tokens: {max_tokens}")
    print(f"Top-p: {top_p}")
    print(f"Frequency penalty: {frequency_penalty}")
```
3. **Test with different models:**

```python
models = ["gpt-3.5-turbo", "gpt-4", "claude-3-opus"]
for model in models:
    response = await generate_response(prompt, model=model)
    print(f"{model}: {response.text[:100]}...")
```
### 3. Performance Issues

**Symptoms:**

- Slow response times
- High resource usage
- Memory leaks

**Debugging Steps:**

1. **Profile the request:**

```python
import time

import psutil

from app.core.debugging import PerformanceProfiler

with PerformanceProfiler() as profiler:
    start_time = time.time()
    start_memory = psutil.Process().memory_info().rss
    response = await generate_response(prompt)
    end_time = time.time()
    end_memory = psutil.Process().memory_info().rss
    print(f"Response time: {end_time - start_time:.2f}s")
    print(f"Memory used: {(end_memory - start_memory) / 1024 / 1024:.2f}MB")
```
2. **Check token usage:**

```python
def analyze_token_usage(response):
    print(f"Input tokens: {response.usage.prompt_tokens}")
    print(f"Output tokens: {response.usage.completion_tokens}")
    print(f"Total tokens: {response.usage.total_tokens}")
```
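To estimate token counts before a request is sent, for instance to check whether a prompt fits the context window, the tiktoken library can help. A minimal sketch (this assumes an OpenAI model; other providers tokenize differently):

```python
import tiktoken

def count_tokens(text, model="gpt-4"):
    # Look up the tokenizer that the given OpenAI model uses.
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(text))

print(f"Prompt tokens: {count_tokens('Summarize the quarterly report.')}")
```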
3. **Monitor system resources:**

```python
import psutil

def check_system_resources():
    cpu_percent = psutil.cpu_percent()
    memory = psutil.virtual_memory()
    disk = psutil.disk_usage('/')
    print(f"CPU usage: {cpu_percent}%")
    print(f"Memory usage: {memory.percent}%")
    print(f"Disk usage: {disk.percent}%")
```
## Debugging Tools

### 1. Logging
Configure comprehensive logging:
```python
import logging

from app.core.debugging import configure_debug_logging

# Configure debug logging
configure_debug_logging(level=logging.DEBUG)

# Create logger
logger = logging.getLogger(__name__)

async def debug_generate_response(prompt):
    logger.debug(f"Generating response for prompt: {prompt[:50]}...")
    try:
        response = await model.generate(prompt)
        logger.debug(f"Response generated: {response.text[:50]}...")
        return response
    except Exception as e:
        logger.error(f"Error generating response: {e}", exc_info=True)
        raise
```
### 2. Request Tracing
Trace requests through the system:
```python
from app.core.debugging import RequestTracer

tracer = RequestTracer()

async def traced_generate_response(prompt):
    trace_id = tracer.start_trace("generate_response")
    try:
        # Log request details
        tracer.log_event(trace_id, "request_started", {"prompt_length": len(prompt)})
        # Generate response
        response = await model.generate(prompt)
        # Log response details
        tracer.log_event(trace_id, "response_received", {
            "response_length": len(response.text),
            "token_usage": response.usage.total_tokens
        })
        return response
    finally:
        tracer.end_trace(trace_id)
```
### 3. Error Analyzer
Analyze errors for patterns:
```python
import time

from app.core.debugging import ErrorAnalyzer

error_analyzer = ErrorAnalyzer()

@error_analyzer.analyze_errors
async def debug_with_error_handling(prompt):
    try:
        return await model.generate(prompt)
    except Exception as e:
        error_analyzer.record_error(e, {
            "prompt": prompt,
            "model": model.name,
            "timestamp": time.time()
        })
        raise
```
## Advanced Debugging Techniques

### 1. Response Comparison
Compare responses across models:
```python
from app.core.debugging import ResponseComparator

comparator = ResponseComparator()

async def compare_responses(prompt):
    models = ["gpt-3.5-turbo", "gpt-4", "claude-3-opus"]
    responses = {}
    for model in models:
        response = await generate_response(prompt, model=model)
        responses[model] = response
    # Compare responses
    comparison = comparator.compare(responses)
    print(f"Quality scores: {comparison.quality_scores}")
    print(f"Response lengths: {comparison.lengths}")
    print(f"Similarity matrix: {comparison.similarity_matrix}")
```
### 2. A/B Testing
Debug by testing variations:
```python
from app.core.debugging import ABTestDebugger

ab_debugger = ABTestDebugger()

async def debug_prompt_variations(original_prompt):
    variations = [
        original_prompt,
        f"Please {original_prompt.lower()}",
        f"Could you {original_prompt.lower()}?",
        f"I need you to {original_prompt.lower()}."
    ]
    results = {}
    for i, variation in enumerate(variations):
        response = await generate_response(variation)
        results[f"variation_{i}"] = {
            "prompt": variation,
            "response": response.text,
            "quality": evaluate_quality(response.text)
        }
    return results
```
### 3. Memory Profiling
Profile memory usage:
```python
from app.core.debugging import MemoryProfiler

memory_profiler = MemoryProfiler()

@memory_profiler.profile_memory
async def profile_memory_usage(prompt):
    # Take memory snapshot
    snapshot1 = memory_profiler.take_snapshot()
    response = await model.generate(prompt)
    # Take another snapshot
    snapshot2 = memory_profiler.take_snapshot()
    # Compare snapshots
    diff = memory_profiler.compare_snapshots(snapshot1, snapshot2)
    print(f"Memory difference: {diff}")
    return response
```
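The standard library's tracemalloc module supports the same snapshot-and-compare workflow without any project-specific tooling, which is useful for cross-checking the profiler's numbers. A minimal sketch:

```python
import tracemalloc

tracemalloc.start()
snapshot1 = tracemalloc.take_snapshot()

data = ["x" * 100 for _ in range(10_000)]  # stand-in for the real workload

snapshot2 = tracemalloc.take_snapshot()
# Show the top allocation sites that grew between the snapshots.
for stat in snapshot2.compare_to(snapshot1, "lineno")[:5]:
    print(stat)
```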
## Debugging Checklist

### Before Debugging
1. **Reproduce the Issue**
    - Can you consistently reproduce the problem?
    - What are the exact steps to reproduce?
    - What are the expected vs. actual results?
2. **Gather Information**
    - Error messages and stack traces (see the sketch below)
    - System logs and metrics
    - Configuration details
    - Recent changes
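For the stack traces mentioned above, capture the full traceback rather than just `str(e)`. A minimal standard-library sketch (`risky_operation` is a hypothetical stand-in for the failing call):

```python
import logging
import traceback

logger = logging.getLogger(__name__)

try:
    risky_operation()  # hypothetical stand-in for the failing call
except Exception:
    # format_exc() preserves the complete stack trace for later analysis.
    logger.error("Operation failed:\n%s", traceback.format_exc())
```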
### During Debugging
1. **Isolate the Problem**
    - Narrow down the component causing the issue
    - Test with minimal inputs
    - Disable non-essential features
2. **Form Hypotheses**
    - What could be causing this issue?
    - How can you test each hypothesis?
3. **Test Systematically**
    - Change one variable at a time
    - Document each test and result
    - Use control cases for comparison
### After Debugging
1. **Verify the Fix**
    - Ensure the issue is resolved
    - Check for regressions
    - Test edge cases
2. **Document the Solution**
    - What was the root cause?
    - How was it fixed?
    - How can it be prevented?
## Debugging Best Practices

### 1. Use Version Control
Track changes to isolate when issues were introduced:
```bash
git bisect start
git bisect bad                         # Current version with the issue
git bisect good [commit_before_issue]  # Last known good commit
# Git will check out commits for testing
git bisect good  # or bad, depending on the test result
```
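If an automated test reproduces the issue, git can drive the entire bisection itself (the test path here is hypothetical):

```bash
# Runs the test at each step; exit code 0 marks a commit good, non-zero bad.
git bisect run pytest tests/test_model_response.py
```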
### 2. Implement Health Checks
Add health checks for early detection:
```python
from app.core.debugging import HealthChecker

health_checker = HealthChecker()

@health_checker.check_health
async def check_model_health():
    try:
        await model.generate("Test prompt", max_tokens=5)
        return True
    except Exception:
        return False
```
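To surface the check operationally, it can be exposed over HTTP. A minimal sketch assuming the service runs on FastAPI (an assumption, not confirmed by this guide):

```python
from fastapi import FastAPI

app = FastAPI()

@app.get("/health")
async def health():
    # Reuse the model health check defined above.
    healthy = await check_model_health()
    return {"status": "ok" if healthy else "degraded"}
```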
### 3. Create Debug Utilities
Build reusable debugging tools:
```python
class DebugUtils:
    @staticmethod
    def print_request_details(request):
        print(f"Method: {request.method}")
        print(f"URL: {request.url}")
        print(f"Headers: {request.headers}")
        print(f"Body: {request.body}")

    @staticmethod
    def print_response_details(response):
        print(f"Status: {response.status_code}")
        print(f"Headers: {response.headers}")
        print(f"Body: {response.text[:100]}...")
```
### 4. Use Automated Testing
Catch issues early with tests:
```python
import pytest

@pytest.mark.asyncio
async def test_model_response():
    response = await model.generate("Test prompt")
    assert response.text is not None
    assert len(response.text) > 0
    assert response.usage.total_tokens > 0
```
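When a test does fail, running pytest with live log output surfaces the debug logging configured earlier directly in the terminal:

```bash
# -x stops at the first failure; --log-cli-level streams log records live.
pytest -x --log-cli-level=DEBUG tests/
```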
## Troubleshooting Specific Components

### 1. Tool Integration
Debug tool execution:
```python
from app.core.debugging import ToolDebugger

tool_debugger = ToolDebugger()

async def debug_tool_execution(tool_name, parameters):
    # Log tool call
    tool_debugger.log_tool_call(tool_name, parameters)
    try:
        result = await tool_registry.execute(tool_name, parameters)
        tool_debugger.log_tool_result(tool_name, result)
        return result
    except Exception as e:
        tool_debugger.log_tool_error(tool_name, e)
        raise
```
### 2. Caching Issues
Debug cache behavior:
```python
from app.core.debugging import CacheDebugger

cache_debugger = CacheDebugger()

async def debug_cache_behavior(key):
    # Check if key exists in cache
    exists = cache_debugger.key_exists(key)
    print(f"Key exists in cache: {exists}")
    if exists:
        # Get cache metadata
        metadata = cache_debugger.get_metadata(key)
        print(f"Cache metadata: {metadata}")
    # Monitor cache operations
    with cache_debugger.monitor_cache_operations():
        result = await cache.get_or_set(key, lambda: expensive_operation())
    return result
```
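Stale entries are a frequent cache pitfall. If the cache is backed by Redis (an assumption; adapt to your backend), the remaining TTL of a key can be inspected directly:

```python
import redis

r = redis.Redis()  # assumes a local Redis instance

ttl = r.ttl("my-cache-key")  # hypothetical key name
# -1 means the key never expires; -2 means it does not exist.
print(f"TTL: {ttl}s")
```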
### 3. Agent Behavior
Debug agent decision-making:
```python
from app.core.debugging import AgentDebugger

agent_debugger = AgentDebugger()

async def debug_agent_decision(agent, input_data):
    # Log agent state
    agent_debugger.log_agent_state(agent)
    # Trace decision process
    with agent_debugger.trace_decision():
        decision = await agent.make_decision(input_data)
    # Log reasoning
    agent_debugger.log_reasoning(agent, decision)
    return decision
```
## Conclusion
Effective debugging is essential for maintaining a reliable AI system. By using the tools and techniques outlined in this guide, you can systematically identify and resolve issues.
Remember that debugging is an iterative process. Start with the most likely causes, test your hypotheses systematically, and document your findings. Over time, you'll build intuition for quickly identifying and resolving common issues.
The key to successful debugging is maintaining a curious, methodical mindset and leveraging the right tools for the job.