# 014 - Caching with Valkey

Alexander Zabielski

Last Modified: May 12th, 2026

Title: Adoption of AWS Valkey for Platform-Wide Caching

Status: Proposed

# Context

Our platform requires a centralized, low-latency data store to serve several distinct needs across our OutSystems and Angular environments. Beyond simple session persistence, we face growing complexity in AI state management, API security and rate limiting, and real-time metadata search.

The Drivers:

- Session Continuity: Synchronizing user state between OutSystems (IdP) and Angular modules.
- AI Token Costs: High overhead in re-sending context to LLMs for agentic workflows.
- Security & Scale: The need for instantaneous token invalidation and granular rate-limiting across 10,000+ concurrent users.
- Metadata Discovery: Fast, filtered searching of platform resources without hitting the primary SQL/NoSQL databases.

# Decision

We will implement a shared Amazon ElastiCache Serverless for Valkey cache (referred to in this record as AWS Valkey) as the platform's Tier-1 in-memory data plane. Services will access it either through a standardized VPC-attached API layer (Lambda/API Gateway) or, for latency-sensitive use cases, through direct SDK integration.

# Rationale

# 1. Cross-Platform Session Persistence

- The Problem: OutSystems and Angular both need access to the user's last_path and company_id.
- The Valkey Solution: Valkey acts as the shared "source of truth" between the two. OutSystems hydrates the session at login; Angular updates it on every route change.
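
The hydrate/update flow above can be sketched as follows. This is a minimal illustration, not the agreed design: the key layout, the 30-minute TTL, and all function names are assumptions, and `InMemoryStore` is a dict-backed stand-in for a real client so the sketch runs without a cluster. The three methods it exposes mirror the HSET, HGETALL, and EXPIRE commands.

```python
SESSION_TTL_SECONDS = 1800  # assumed idle timeout; not specified in this ADR


class InMemoryStore:
    """Dict-backed stand-in for a Valkey client (TTL is recorded, not enforced)."""

    def __init__(self) -> None:
        self.data: dict = {}
        self.ttls: dict = {}

    def hset(self, key: str, mapping: dict) -> None:
        # Like HSET: merge fields into the hash stored at key.
        self.data.setdefault(key, {}).update(mapping)

    def hgetall(self, key: str) -> dict:
        # Like HGETALL: return every field of the hash (empty if missing).
        return dict(self.data.get(key, {}))

    def expire(self, key: str, seconds: int) -> None:
        # Like EXPIRE: remember the TTL that would be set on the key.
        self.ttls[key] = seconds


def session_key(tenant: str, user_id: str) -> str:
    # Hypothetical layout following the app:feature:tenant:user_id convention.
    return f"platform:session:{tenant}:{user_id}"


def hydrate_session(store, tenant, user_id, company_id, last_path="/"):
    """Called by the login flow (OutSystems side) to create the session."""
    key = session_key(tenant, user_id)
    store.hset(key, {"company_id": company_id, "last_path": last_path})
    store.expire(key, SESSION_TTL_SECONDS)


def update_last_path(store, tenant, user_id, path):
    """Called on each route change (Angular side); also refreshes the TTL."""
    key = session_key(tenant, user_id)
    store.hset(key, {"last_path": path})
    store.expire(key, SESSION_TTL_SECONDS)


def read_session(store, tenant, user_id):
    return store.hgetall(session_key(tenant, user_id))
```

Storing the session as a hash lets Angular overwrite last_path without re-sending company_id, and refreshing the TTL on each write gives a sliding expiry.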

# 2. Agentic AI & Memory Caching

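- The Problem: Agentic workflows (such as our document processing engines) become expensive when the entire context is re-sent to the LLM at every step.
- The Valkey Solution: Use Valkey vector search to store and retrieve "Agent Memory" as embeddings. Caching previous reasoning steps and document metadata lets an agent send only the relevant prior context instead of the full history. We target a reduction in LLM token consumption of up to 90%; this figure is an estimate to be validated against the Token Savings metric. Vector search availability on the serverless offering should be confirmed before implementation.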
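
To make the lookup logic concrete, here is a brute-force sketch of the recall step: find the stored memory whose embedding is closest to the query, and reuse it only if it is similar enough. It is pure Python and does not call Valkey; in the proposed design the nearest-neighbour search would be delegated to the vector index. The class name, the 0.9 threshold, and the two-dimensional embeddings are illustrative assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class AgentMemory:
    """Brute-force stand-in for a vector index holding cached reasoning steps."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold  # assumed cut-off; would need tuning
        self.entries = []           # list of (embedding, payload) pairs

    def remember(self, embedding, payload):
        self.entries.append((list(embedding), payload))

    def recall(self, query):
        """Return the most similar payload, or None if nothing is close enough."""
        best_score, best_payload = 0.0, None
        for embedding, payload in self.entries:
            score = cosine_similarity(query, embedding)
            if score > best_score:
                best_score, best_payload = score, payload
        return best_payload if best_score >= self.threshold else None
```

A `None` result is the cache-miss path: the agent falls back to sending full context to the LLM and then stores the new result with `remember`.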

# 3. Token Lifecycle & Invalidation

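- The Problem: Relying solely on JWT expiration is a security risk, because a token remains valid until it expires. We need to revoke access immediately when a user logs out or is deactivated.
- The Valkey Solution:
  - Blacklisting: Store each revoked token's JTI (JWT ID) with a TTL equal to the token's remaining lifetime, so entries expire on their own.
  - Rate Limiting: Implement a "Leaky Bucket" algorithm to prevent API abuse.
  - Bloom Filters: Use Valkey's Bloom filter support as a fast, memory-efficient pre-check for invalidated tokens. A Bloom filter can return false positives but never false negatives, so a "not present" answer means the token has not been invalidated, while a "possibly present" answer must be confirmed against the JTI blacklist before the request is rejected.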
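
Two of the mechanisms above reduce to small calculations, sketched here under stated assumptions. `revocation_ttl` computes the TTL for a revoked JTI from the token's `exp` claim, and `LeakyBucket` is the "meter" form of the leaky bucket algorithm. Both names are hypothetical, and the bucket keeps its state in process memory for illustration only; in a shared cluster the level and timestamp would live in Valkey and be updated atomically (for example in a server-side script).

```python
import math
import time
from typing import Optional


def revocation_ttl(exp_epoch: float, now: Optional[float] = None) -> int:
    """Seconds until the token expires; 0 means it is already expired
    and no revocation entry needs to be stored."""
    current = time.time() if now is None else now
    return max(0, math.ceil(exp_epoch - current))


class LeakyBucket:
    """Leaky bucket as a meter: each request adds one unit, and the
    bucket drains at a constant rate. Requests that would overflow
    the bucket are rejected."""

    def __init__(self, capacity: float, leak_rate_per_sec: float) -> None:
        self.capacity = capacity
        self.leak_rate = leak_rate_per_sec
        self.level = 0.0
        self.last_seen: Optional[float] = None

    def allow(self, now: float) -> bool:
        # Drain the bucket for the time elapsed since the previous request.
        if self.last_seen is not None:
            elapsed = now - self.last_seen
            self.level = max(0.0, self.level - elapsed * self.leak_rate)
        self.last_seen = now
        # Admit the request only if one more unit still fits.
        if self.level + 1 <= self.capacity:
            self.level += 1
            return True
        return False
```

With `capacity=2` and `leak_rate_per_sec=1.0`, a client can burst two requests and is then held to one request per second.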

# 4. Real-Time Search & Aggregations

- The Problem: SQL queries backing "Company Search" and "Active Task Search" are slow during peak load.
- The Valkey Solution: Use Valkey Search to index platform metadata in memory. Queries can then combine full-text matching with filters on tags and numeric ranges, bypassing the primary database.
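
The kind of query this enables can be illustrated with a toy in-memory index. The sketch below is pure Python and is not the Valkey Search API; it only shows how a text match, a tag filter, and a numeric range combine into one lookup. The field names (`name`, `tags`, `open_tasks`) and the sample records are invented for the example.

```python
from collections import defaultdict


class MetadataIndex:
    """Toy inverted index over a text field, tags, and one numeric field."""

    def __init__(self):
        self.docs = {}
        self.by_tag = defaultdict(set)
        self.by_token = defaultdict(set)

    def add(self, doc_id, name, tags, open_tasks):
        self.docs[doc_id] = {"name": name, "tags": set(tags), "open_tasks": open_tasks}
        for tag in tags:
            self.by_tag[tag].add(doc_id)
        for token in name.lower().split():
            self.by_token[token].add(doc_id)

    def search(self, text=None, tag=None, min_open_tasks=None, max_open_tasks=None):
        """Return sorted ids of documents matching every supplied criterion."""
        candidates = set(self.docs)
        if text is not None:
            # Every word of the query must appear in the name.
            for token in text.lower().split():
                candidates &= self.by_token.get(token, set())
        if tag is not None:
            candidates &= self.by_tag.get(tag, set())
        matches = []
        for doc_id in candidates:
            count = self.docs[doc_id]["open_tasks"]
            if min_open_tasks is not None and count < min_open_tasks:
                continue
            if max_open_tasks is not None and count > max_open_tasks:
                continue
            matches.append(doc_id)
        return sorted(matches)
```

For example, `search(text="logistics", tag="active", min_open_tasks=5)` corresponds to a "Company Search" filtered to active companies with at least five open tasks.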

# Implications

- People/Training: We must establish a Global Key Namespacing Standard (e.g., app:feature:tenant:user_id) and train teams on it to prevent key collisions in the shared cluster.
- Process Adjustments: Terraform modules must be updated to allow ingress to the Valkey security group from all platform subnets.
- Tooling: We will develop a shared Python/TypeScript library that wraps Valkey GLIDE (the official Valkey client) to provide consistent error handling and retry logic.
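
As a rough sketch of what the shared library might contain, the block below pairs a key builder for the app:feature:tenant:user_id convention with a generic retry helper using exponential backoff. It deliberately does not call GLIDE: `operation` stands in for any client call, and the function names, the choice to retry only on `ConnectionError`, and the default delays are all assumptions for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def build_key(app: str, feature: str, tenant: str, user_id: str) -> str:
    """Join the four segments with ':' and reject segments that would
    break the namespace (empty, or containing the separator)."""
    parts = [app, feature, tenant, user_id]
    for part in parts:
        if not part or ":" in part:
            raise ValueError(f"invalid key segment: {part!r}")
    return ":".join(parts)


def with_retry(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.05,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying on ConnectionError with exponential backoff.

    The delay doubles after each failure: base_delay, 2 * base_delay, ...
    The final failure is re-raised to the caller.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the helper testable without real delays; a production version would likely also add jitter and map client-specific exceptions.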

# Trade-Offs

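- Benefits: A targeted 60%+ cost reduction vs. ElastiCache for Redis OSS (an estimate to be validated against actual usage; AWS's list price for ElastiCache Serverless for Valkey is 33% below the serverless Redis OSS price); sub-millisecond latency for typical reads (p99 to be confirmed under production load); a significant reduction in primary database IOPS.
- Drawbacks: The shared cluster introduces a "Single Point of Failure" risk (mitigated by ElastiCache Serverless multi-AZ high availability), and it requires stricter governance over TTLs to prevent memory exhaustion.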

# Key Evaluation Metrics

- Token Savings: Monitor the reduction in LLM input tokens for Agentic AI workflows against a pre-adoption baseline.
- Invalidation Latency: Ensure a token revocation is visible to all consuming services in < 100 ms.
- Search Performance: Target < 10 ms for complex metadata search queries.
- Cloud Spend: Maintain a total platform caching cost of < $1,500/mo for 10k users (about $0.15 per user per month).
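
The two latency targets can be checked mechanically from collected samples. The sketch below uses the nearest-rank percentile method; evaluating the targets at p99 is an assumption, since this record does not state which percentile the thresholds apply to, and the metric names are hypothetical.

```python
import math

# Thresholds from the metrics above, in milliseconds.
TARGETS_MS = {"invalidation": 100.0, "search": 10.0}


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def meets_target(metric, samples_ms, pct=99):
    """True if the chosen percentile of the samples is under the target."""
    return percentile(samples_ms, pct) < TARGETS_MS[metric]
```

A check such as `meets_target("search", samples)` could run against load-test output before the decision moves from Proposed to Accepted.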

# Conclusion

Standardizing on AWS Valkey transforms our caching layer into a strategic asset. It not only solves the immediate OutSystems/Angular session gap but also provides the infrastructure needed to scale our AI and security features while maintaining a strict FinOps posture.