#
014 - Caching with Valkey
Last Modified: May 12th, 2026
Title: Adoption of AWS Valkey for Platform-Wide Caching
Status: Proposed
#
Context
Our platform requires a centralized, low-latency data store to serve several distinct needs across our OutSystems and Angular environments. Beyond simple session persistence, we face growing complexity in AI state management, API security and rate limiting, and real-time metadata search.
The Drivers:
- Session Continuity: Synchronizing user state between OutSystems (IdP) and Angular modules.
- AI Token Costs: High overhead in re-sending context to LLMs for agentic workflows.
- Security & Scale: The need for instantaneous token invalidation and granular rate-limiting across 10,000+ concurrent users.
- Metadata Discovery: Fast, filtered searching of platform resources without hitting the primary SQL/NoSQL databases.
#
Decision
We will implement a shared AWS Valkey cluster (Amazon ElastiCache Serverless for Valkey) as the platform's Tier-1 in-memory data plane. Services will reach this cluster either through a standardized VPC-attached API layer (Lambda/API Gateway) or, for high-performance use cases, through direct SDK integration.
#
Rationale
#
1. Cross-Platform Session Persistence
- The Problem: OutSystems and Angular need to know the user's `last_path` and `company_id`.
- The Valkey Solution: Acts as the "source of truth" bridge. OutSystems hydrates the session upon login; Angular updates it on every route change.
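The hydrate-on-login, update-on-route-change flow can be sketched as follows. This is a minimal illustration, not the platform's implementation: the `InMemoryStore` class is a dict-backed stand-in for a Valkey client, and the `session:<user_id>` key shape, the 30-minute TTL, and the function names are all assumptions made for the example.

```python
import json
import time
from dataclasses import dataclass, asdict

SESSION_TTL_SECONDS = 30 * 60  # assumed idle timeout; not specified in this ADR


@dataclass
class SessionState:
    user_id: str
    company_id: str
    last_path: str
    updated_at: float


class InMemoryStore:
    """Dict-backed stand-in for a Valkey client (set with expiry, get)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value, ttl):
        self._data[key] = (value, time.time() + ttl)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None or entry[1] <= time.time():
            return None
        return entry[0]


def session_key(user_id):
    # Hypothetical key shape, used only for this sketch.
    return f"session:{user_id}"


def hydrate_session(store, user_id, company_id, landing_path="/"):
    """Login side (OutSystems in this ADR): write the initial session record."""
    state = SessionState(user_id, company_id, landing_path, time.time())
    store.set(session_key(user_id), json.dumps(asdict(state)), SESSION_TTL_SECONDS)
    return state


def record_route_change(store, user_id, new_path):
    """Angular side: update last_path and refresh the TTL on each route change."""
    raw = store.get(session_key(user_id))
    if raw is None:
        return None  # session expired or was never hydrated
    state = SessionState(**json.loads(raw))
    state.last_path = new_path
    state.updated_at = time.time()
    store.set(session_key(user_id), json.dumps(asdict(state)), SESSION_TTL_SECONDS)
    return state
```

The read-modify-write in `record_route_change` is not atomic. Against a real cluster, storing the session as a hash and updating a single field would likely avoid lost updates when both front ends write at once.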
#
2. Agentic AI & Memory Caching
- The Problem: Agentic workflows (like our document processing engines) are expensive if we re-prompt the entire context.
- The Valkey Solution: Use Valkey Vector Search to store and retrieve "Agent Memory" (embeddings). By caching previous reasoning steps and document metadata, we can reduce LLM token consumption by up to 90%.
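To make the "Agent Memory" idea concrete, the sketch below shows the lookup logic a vector index would perform: find the closest cached embedding and reuse its result only if it is similar enough. It is an in-process stand-in, not a call into Valkey's search module. The `AgentMemoryCache` class, the cosine-similarity metric, and the 0.9 threshold are assumptions for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class AgentMemoryCache:
    """In-process stand-in for a vector index over cached reasoning steps."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold  # assumed cutoff; tune per workload
        self._entries = []  # list of (embedding, cached_result)

    def put(self, embedding, result):
        self._entries.append((list(embedding), result))

    def lookup(self, embedding):
        """Return the closest cached result, or None if nothing is similar enough."""
        best_score, best_result = -1.0, None
        for stored, result in self._entries:
            score = cosine_similarity(embedding, stored)
            if score > best_score:
                best_score, best_result = score, result
        return best_result if best_score >= self.threshold else None
```

A cache hit means the agent can skip re-sending that context to the LLM; a miss falls through to the normal prompt path and then calls `put` with the new result. The actual token savings depend on how often real workloads produce hits above the threshold.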
#
3. Token Lifecycle & Invalidation
- The Problem: Relying solely on JWT expiration is a security risk. We need to revoke access immediately if a user logs out or is deactivated.
- The Valkey Solution:
  - Blacklisting: Store `JTI` (JWT IDs) with a TTL matching the token's life.
  - Rate Limiting: Implement "Leaky Bucket" algorithms to prevent API abuse.
  - Bloom Filters: Use Valkey's native Bloom filters for ultra-fast, memory-efficient checks on whether a token has been invalidated.
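Two of the mechanisms above are simple enough to sketch directly: deriving the TTL for a revoked JTI, and a leaky bucket used as a request meter. This is a single-process illustration under assumed names (`denylist_ttl`, `LeakyBucket`). In the shared cluster, the bucket state would need to live in Valkey and be updated atomically, which this sketch does not attempt.

```python
import time


def denylist_ttl(token_exp, now=None):
    """Seconds a revoked JTI must stay listed: until the token's own `exp` claim."""
    now = time.time() if now is None else now
    return max(0, int(token_exp - now))


class LeakyBucket:
    """Leaky bucket as a meter: each request adds one unit; units drain at a fixed rate."""

    def __init__(self, capacity, leak_rate_per_sec):
        self.capacity = capacity
        self.leak_rate = leak_rate_per_sec
        self.level = 0.0
        self.last_checked = None

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        if self.last_checked is not None:
            # Drain the bucket for the time elapsed since the last request.
            elapsed = now - self.last_checked
            self.level = max(0.0, self.level - elapsed * self.leak_rate)
        self.last_checked = now
        if self.level + 1 <= self.capacity:
            self.level += 1
            return True
        return False  # bucket full: reject or delay the request
```

A TTL of 0 from `denylist_ttl` means the token has already expired, so there is nothing to store. On the Bloom filter path, note that Bloom filters can report false positives but never false negatives, so a "possibly revoked" answer should be confirmed against the exact JTI key before rejecting a request.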
#
4. Real-Time Search & Aggregations
- The Problem: SQL queries for "Company Search" or "Active Task Search" are slow during peak load.
- The Valkey Solution: Utilize Valkey Search to index platform metadata. This allows for full-text and faceted search across text, tags, and numeric ranges directly in-memory, bypassing the primary database.
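As an illustration of a faceted query over tags and numeric ranges, the helper below assembles a filter string. It assumes the RediSearch-style filter syntax (`@field:{tag}` and `@field:[min max]`), which should be verified against the search module version actually deployed. The field names and the `build_filter_query` helper are hypothetical.

```python
import re

# Only plain tokens are accepted; anything else would need escaping rules
# that this sketch does not implement.
_SAFE_TAG = re.compile(r"^[A-Za-z0-9_]+$")


def build_filter_query(tags=None, numeric_ranges=None):
    """Build a filter string such as '@status:{active} @employees:[100 500]'."""
    parts = []
    for field, values in (tags or {}).items():
        for value in values:
            if not _SAFE_TAG.match(value):
                raise ValueError(f"tag value needs escaping: {value!r}")
        # Multiple values for one tag field are OR-ed together.
        parts.append("@%s:{%s}" % (field, " | ".join(values)))
    for field, (low, high) in (numeric_ranges or {}).items():
        parts.append(f"@{field}:[{low} {high}]")
    if not parts:
        raise ValueError("at least one filter is required")
    return " ".join(parts)
```

For example, `build_filter_query(tags={"status": ["active"]}, numeric_ranges={"employees": (100, 500)})` yields `@status:{active} @employees:[100 500]`, which would be passed as the query argument of a search command against a pre-built index.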
#
Implications
- People/Training: We must establish a Global Key Namespacing Standard (e.g., `app:feature:tenant:user_id`) to prevent collisions in the shared cluster.
- Process Adjustments: Terraform modules must be updated to include Valkey security group ingress for all platform subnets.
- Tooling: Develop a shared Python/TypeScript library that wraps Valkey GLIDE (the official open-source Valkey client) for consistent error handling and retry logic.
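A rough sketch of what the shared library might provide for the namespacing standard and retry logic is shown below. The function names, the allowed character set, and the backoff parameters are assumptions. The built-in `ConnectionError` stands in for whatever exception types the real client raises, and no GLIDE APIs are called here.

```python
import re
import time

# Segments may not contain ':' so the four-part structure stays unambiguous.
_SEGMENT = re.compile(r"^[A-Za-z0-9_.-]+$")


def build_key(app, feature, tenant, user_id):
    """Join the four segments of the app:feature:tenant:user_id pattern."""
    segments = [str(s) for s in (app, feature, tenant, user_id)]
    for segment in segments:
        if not _SEGMENT.match(segment):
            raise ValueError(f"invalid key segment: {segment!r}")
    return ":".join(segments)


def with_retry(operation, attempts=3, base_delay=0.05, sleep=time.sleep):
    """Run operation(), retrying on ConnectionError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * (2 ** attempt))
```

Rejecting `:` inside a segment is the design choice that matters most here: it keeps every key parseable back into its four parts, which in turn makes per-tenant cleanup and auditing by key prefix feasible.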
#
Trade-Offs
- Benefits: A projected 60%+ cost reduction vs. ElastiCache for Redis; expected sub-millisecond p99 latency; significant reduction in database IOPS.
- Drawbacks: Introduces a "Single Point of Failure" risk, since every service depends on one shared cluster (mitigated by AWS Serverless High Availability); requires stricter governance over TTLs to prevent memory exhaustion.
#
Key Evaluation Metrics
- Token Savings: Monitor the reduction in LLM input tokens for Agentic AI workflows.
- Invalidation Latency: Ensure token revocation propagates globally in < 100 ms.
- Search Performance: Target < 10 ms for complex metadata search queries.
- Cloud Spend: Maintain a total platform caching cost of < $1,500/mo for 10k users.
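The two latency targets above could be checked against collected samples roughly as follows. The ADR does not say which percentile the targets apply to; this sketch assumes p99 and uses the nearest-rank method. The `TARGETS_MS` table and function names are illustrative.

```python
import math

# Targets taken from the metrics above, in milliseconds.
TARGETS_MS = {"invalidation": 100.0, "search": 10.0}


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def check_latency(metric, samples_ms, pct=99):
    """Return (observed, ok), where ok means the observed percentile is under target."""
    observed = percentile(samples_ms, pct)
    return observed, observed < TARGETS_MS[metric]
```

With small sample sets the p99 is simply the maximum, so a meaningful check needs at least a few hundred measurements per metric.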
#
Conclusion
Standardizing on AWS Valkey transforms our caching layer into a strategic asset. It not only solves the immediate OutSystems/Angular session gap but also provides the infrastructure needed to scale our AI and security features while maintaining a strict FinOps posture.