# Caching Architecture
The AI Assistant System includes a multi-layer caching architecture designed to improve response times and reduce API costs.
## Overview
The caching system is built around several key components:

- Cache Layers: Multiple storage backends (memory, Redis)
- Compression: Reduces memory usage for cached content
- Batching: Groups similar operations to cut per-operation overhead
- Integration: Caching hooks for agent and tool responses
## Cache Layers
### Memory Cache
The memory cache is the fastest layer, storing data directly in the application's process memory.
```python
from app.core.caching.layers.memory import MemoryCache

# Hold up to 1,000 entries, each expiring after 3600 seconds (one hour).
memory_cache = MemoryCache(max_size=1000, ttl=3600)
```
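Continuing from the snippet above, a minimal usage sketch might look like the following; the `set`/`get` method names and the `load_profile` helper are assumptions used for illustration, not the verified `MemoryCache` API.

```python
# Hypothetical usage: set()/get() are assumed method names; check MemoryCache
# for the real interface before relying on them.
memory_cache.set("user:42:profile", {"name": "Ada", "plan": "pro"})

profile = memory_cache.get("user:42:profile")
if profile is None:
    # Miss: the entry expired (ttl) or was evicted (max_size); rebuild and re-cache.
    profile = load_profile(user_id=42)  # hypothetical loader
    memory_cache.set("user:42:profile", profile)
```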
### Redis Cache
The Redis cache is a distributed layer that lets multiple application instances share cached entries.
```python
from app.core.caching.layers.redis_cache import RedisCache

# Connect to database 0 of a Redis server running on localhost.
redis_cache = RedisCache(redis_url="redis://localhost:6379/0")
```
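Because Redis lives outside the application process, every instance pointed at the same URL shares the same entries. The sketch below continues from the snippet above and assumes the same `set`/`get` style interface, which is not verified against the actual `RedisCache` class.

```python
# Hypothetical: an entry written by one application instance...
redis_cache.set("session:abc123:summary", "User asked about billing limits.")

# ...is visible to any other instance connected to the same Redis database.
summary = redis_cache.get("session:abc123:summary")
```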
## Compression
The caching system includes compression to reduce the memory footprint of large cached values:
```python
from app.core.caching.compression.compressor import CacheCompressor

compressor = CacheCompressor()

# Shrink a large payload before it is written to a cache layer.
compressed_data = compressor.compress(large_data)
```
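A roundtrip sketch follows; `decompress` is assumed to be the counterpart of `compress`, and `large_data` here is only a stand-in payload.

```python
# Stand-in for a sizeable payload, e.g. a long model response.
large_data = "example response text " * 5000

compressed_data = compressor.compress(large_data)

# decompress() is an assumed counterpart to compress(); verify it against
# CacheCompressor before depending on it.
restored = compressor.decompress(compressed_data)
assert restored == large_data
```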
## Batching
Batch processing groups similar operations to reduce overhead:
```python
from app.core.caching.batching.batch_processor import BatchProcessor

# batch_size: operations grouped per batch; timeout: seconds a partial batch
# waits before being flushed (assumed semantics).
processor = BatchProcessor(batch_size=10, timeout=5.0)
```
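How work gets enqueued is sketched below; `add` and `flush` are assumed method names used only to illustrate the batching idea.

```python
# Hypothetical usage: add()/flush() are assumed names, not the verified API.
for key in ("doc:1", "doc:2", "doc:3"):
    processor.add(("invalidate", key))  # queued rather than executed immediately

# Force any partially filled batch through, e.g. during shutdown.
processor.flush()
```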
## Integration with Agents
The agent system includes automatic caching of responses:
```python
from app.core.caching.integration.agent_cache import AgentCache

# Back agent-response caching with the shared Redis layer configured above.
agent_cache = AgentCache(cache_layer=redis_cache)
```
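Conceptually, the wrapper checks the cache before invoking the agent and stores fresh responses afterwards. The sketch below illustrates that flow; `get_response`, `store_response`, and `agent.run` are illustrative names, not the verified interfaces.

```python
# Hypothetical flow; the method names are illustrative only.
prompt = "Summarize today's deployment logs."

response = agent_cache.get_response(prompt)
if response is None:
    response = agent.run(prompt)                  # assumed agent entry point
    agent_cache.store_response(prompt, response)  # reuse for identical prompts
```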
## Integration with Tools
Tool responses are automatically cached when enabled:
```python
from app.core.caching.integration.tool_cache import ToolCache

# Back tool-response caching with the in-process memory layer configured above.
tool_cache = ToolCache(cache_layer=memory_cache)
```
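Conceptually, a tool result is keyed by the tool's name and arguments. The sketch below illustrates that idea with assumed `get`/`store` methods and a hypothetical `run_tool` helper; none of these names are verified against the real `ToolCache` interface.

```python
# Hypothetical illustration of keying a tool result by name and arguments.
tool_name = "web_search"
tool_args = {"query": "redis eviction policies"}

result = tool_cache.get(tool_name, tool_args)        # assumed method
if result is None:
    result = run_tool(tool_name, **tool_args)        # hypothetical tool runner
    tool_cache.store(tool_name, tool_args, result)   # assumed method
```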
## Configuration
Caching is configured through environment variables:
```bash
CACHE_ENABLED=true
CACHE_TTL=3600
REDIS_URL=redis://localhost:6379/0
CACHE_COMPRESSION=true
```
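If you need the same values inside Python, a sketch like the one below reads them with the standard library; how the application actually loads its settings (for example through a dedicated settings class) may differ.

```python
import os

# Read the caching settings from the environment; defaults mirror the values above.
cache_enabled = os.getenv("CACHE_ENABLED", "true").lower() == "true"
cache_ttl = int(os.getenv("CACHE_TTL", "3600"))
redis_url = os.getenv("REDIS_URL", "redis://localhost:6379/0")
compression_enabled = os.getenv("CACHE_COMPRESSION", "true").lower() == "true"
```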
## Monitoring
Monitor cache performance using the built-in metrics:
```python
from app.core.caching.monitoring.metrics import CacheMetrics

metrics = CacheMetrics()

# Fraction of lookups served from the cache rather than recomputed.
hit_rate = metrics.get_hit_rate()
```
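One way to surface this is to log the hit rate periodically. The sketch below reuses the `metrics` instance from above and assumes `get_hit_rate()` returns a float between 0 and 1.

```python
import logging
import time

logger = logging.getLogger("cache.monitoring")

def log_cache_hit_rate(interval_seconds: float = 60.0) -> None:
    """Log the hit rate once per interval (assumes a 0-1 float from get_hit_rate())."""
    while True:
        logger.info("cache hit rate: %.1f%%", metrics.get_hit_rate() * 100)
        time.sleep(interval_seconds)
```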
## Best Practices
- Choose appropriate TTL: Set expiration times based on data volatility
- Use compression: Enable compression for large cached objects
- Monitor hit rates: Track cache effectiveness
- Layer appropriately: Use the memory cache for hot data and Redis for data shared across instances; see the layered lookup sketch below
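As an illustration of the last point, a read-through lookup across both layers might look like the sketch below. It reuses the `memory_cache` and `redis_cache` instances from earlier and assumes a `get`/`set` style API, which is not verified against the real cache layer classes.

```python
from typing import Any, Optional

def layered_get(key: str) -> Optional[Any]:
    """Check process memory first, then Redis; backfill memory on a Redis hit."""
    value = memory_cache.get(key)        # assumed method: fast, per-process lookup
    if value is not None:
        return value

    value = redis_cache.get(key)         # assumed method: shared, slower lookup
    if value is not None:
        memory_cache.set(key, value)     # promote shared data into the hot layer
    return value
```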