Stateless agents are useless in production. Users expect AI to remember their name, their preferences, and what they discussed five minutes ago. AWS Bedrock AgentCore's memory layer solves this—but most teams configure it wrong. Here's how to get it right.
What is the AgentCore Memory Layer?
The AgentCore Memory Layer is a managed memory system that persists conversation context across agent interactions. It stores messages in DynamoDB, handles session isolation automatically, and provides configurable context windows to manage token costs. Memory can be scoped to sessions (temporary), users (persistent), or globally (shared across all users).
Why Memory Matters
Without memory, every message is the start of a new conversation:
User: "My name is Alex and I'm building a fintech app."
Agent: "Nice to meet you, Alex! I can help with fintech..."
User: "What tech stack should I use?"
Agent: "I'd be happy to help! What kind of project are you working on?"
# ❌ Agent forgot Alex and the fintech context
With memory:
User: "My name is Alex and I'm building a fintech app."
Agent: "Nice to meet you, Alex! I can help with fintech..."
User: "What tech stack should I use?"
Agent: "For your fintech app, Alex, I'd recommend..."
# ✅ Agent remembers context
Memory transforms agents from frustrating to useful.
Memory Scopes: Session vs User vs Global
AgentCore supports three memory scopes:
Session Memory (Default)
memory:
  type: session
  ttl_hours: 4
- Lifespan: One conversation (expires with TTL)
- Use case: Customer support, one-off queries
- Isolation: Each session is independent
- Cost: Low (short-lived data)
User Memory
memory:
  type: user
  ttl_days: 30
- Lifespan: Persists across sessions for same user
- Use case: Personal assistants, ongoing relationships
- Isolation: Each user has separate memory
- Cost: Medium (longer retention)
Global Memory
memory:
  type: global
  ttl_days: 90
- Lifespan: Shared across all users
- Use case: Company knowledge, shared context
- Isolation: None—everyone sees same memory
- Cost: Higher (must manage carefully)
Choosing the Right Scope
| Use Case | Recommended Scope |
|---|---|
| Customer support chat | Session |
| Personal finance advisor | User |
| Internal company assistant | User + Global |
| Quick Q&A bot | Session |
| Health/medical assistant | User (with encryption) |
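If you select scopes in code rather than by hand, the table above can be captured as a small lookup. This is an illustrative sketch only: the MemoryConfig parameters mirror the examples later in this post, and the TTL values are assumptions to tune per workload, not recommended defaults.

from bedrock_agentcore import MemoryConfig  # parameter names follow this post's examples

# Illustrative presets derived from the table above (TTLs are assumptions)
SCOPE_PRESETS = {
    "customer_support_chat": MemoryConfig(type="session", ttl_hours=4),
    "personal_finance_advisor": MemoryConfig(type="user", ttl_hours=24 * 30),
    "quick_qa_bot": MemoryConfig(type="session", ttl_hours=4),
    "health_assistant": MemoryConfig(type="user", ttl_hours=24 * 30, encryption="AES-256"),
}

def memory_for(use_case: str) -> MemoryConfig:
    # Fall back to the cheapest, shortest-lived scope when the use case is unknown
    return SCOPE_PRESETS.get(use_case, MemoryConfig(type="session", ttl_hours=4))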
Basic Memory Configuration
Minimal Setup
# agent_config.py
from bedrock_agentcore import Agent, MemoryConfig

memory = MemoryConfig(
    type="session",
    storage="dynamodb",
    table_name="agentcore-memory",
    ttl_hours=24
)

agent = Agent(
    name="MyAgent",
    model="anthropic.claude-3-sonnet",
    memory=memory
)
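With that in place, each request only needs a stable session identifier so the memory layer can load and persist the right context. The invoke method below is a hypothetical call shape for illustration; this post doesn't show the runtime API, so check your SDK's actual entry point.

# Hypothetical invocation shape -- the shared session_id ties both turns to the same memory record
session_id = "session-123"

reply_1 = agent.invoke("My name is Alex and I'm building a fintech app.", session_id=session_id)
reply_2 = agent.invoke("What tech stack should I use?", session_id=session_id)
# Because both calls share session_id, the second turn can reference Alex and the fintech context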
Full Configuration
memory = MemoryConfig(
    # Scope settings
    type="user",                      # session, user, or global
    ttl_hours=168,                    # 7 days

    # Storage settings
    storage="dynamodb",
    table_name="agentcore-memory",
    region="us-east-1",

    # Context management
    max_messages=100,                 # Keep last 100 messages
    context_window_tokens=8000,       # Max tokens to include
    summarize_after=50,               # Summarize older messages

    # Security
    isolation="user",                 # Isolate by user ID
    encryption="AES-256",             # Encrypt at rest

    # Performance
    cache_ttl_seconds=300,            # Cache recent lookups
    batch_write=True                  # Batch DynamoDB writes
)
Context Window Strategies
LLMs have token limits. You can't send the entire conversation history. Here's how to manage it:
Strategy 1: Sliding Window (Simple)
Keep the last N messages:
memory = MemoryConfig(
    context_window_strategy="sliding",
    max_messages=20
)
Messages 1-20: [dropped]
Messages 21-40: Included in context
- Pros: Simple, predictable
- Cons: Loses early context
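Conceptually, the sliding strategy is just a trim on the stored history before it is sent to the model. A minimal, library-free sketch of that behavior:

def sliding_window(messages: list[dict], max_messages: int = 20) -> list[dict]:
    """Keep only the most recent max_messages turns; everything older is dropped."""
    return messages[-max_messages:]

history = [{"role": "user", "content": f"message {i}"} for i in range(1, 41)]
context = sliding_window(history, max_messages=20)  # messages 21-40 survive, 1-20 are gone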
Strategy 2: Summarization (Recommended)
Summarize old messages, keep recent ones:
memory = MemoryConfig(
    context_window_strategy="summarize",
    summarize_after=20,
    summary_model="anthropic.claude-3-haiku"  # Cheap model for summaries
)
Messages 1-20: Summarized → "User Alex discussed fintech app. Decided on React + Node stack."
Messages 21-40: Included in full
- Pros: Retains key context, manages tokens
- Cons: Summary may lose nuance
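The same idea in plain Python: once the history passes the threshold, older turns are collapsed into a single summary message and only the recent tail is kept verbatim. The summarize_messages helper here is a stand-in for a call to a cheap summarization model such as Haiku.

def summarize_messages(messages: list[dict]) -> str:
    # Stand-in: in practice this would call a small, cheap model with the older turns
    return "Summary of earlier conversation: " + "; ".join(m["content"][:40] for m in messages)

def summarized_context(messages: list[dict], summarize_after: int = 20) -> list[dict]:
    """Collapse everything except the last summarize_after turns into one summary message."""
    if len(messages) <= summarize_after:
        return messages
    old, recent = messages[:-summarize_after], messages[-summarize_after:]
    summary = {"role": "system", "content": summarize_messages(old)}
    return [summary] + recent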
Strategy 3: Semantic Retrieval (Advanced)
Store messages in vector DB, retrieve relevant ones:
memory = MemoryConfig(
    context_window_strategy="semantic",
    vector_store="pinecone",
    retrieval_k=10  # Retrieve 10 most relevant messages
)
User: "What stack did we decide on?"
Retrieved: Messages about tech stack decisions (semantically similar)
- Pros: Always retrieves relevant context
- Cons: More complex, requires a vector DB
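Under the hood, semantic retrieval boils down to "embed the query, rank stored messages by similarity, take the top k." A minimal sketch of that flow; the embed function below is a deliberately toy stand-in so the example runs, and in production it would be a real embedding model (e.g. a Titan embedding model) backed by a vector store such as Pinecone.

import math

def embed(text: str) -> list[float]:
    # Toy stand-in so the sketch runs; swap in a real embedding model in production
    return [float(sum(ord(c) for c in text) % 97), float(len(text))]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def retrieve(query: str, stored: list[tuple[str, list[float]]], k: int = 10) -> list[str]:
    """Return the k stored messages most similar to the query."""
    q = embed(query)
    ranked = sorted(stored, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]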
For most production agents, Strategy 2 (Summarization) offers the best balance.
Memory Isolation for Multi-Tenant Apps
If your agent serves multiple companies, memory isolation is critical.
Problem: Cross-Tenant Leakage
Company A user: "Our revenue is $5M"
Company B user: "What's my company's revenue?"
Agent (wrong): "Your revenue is $5M" # ❌ Leaked Company A's data
Solution: Tenant Isolation
memory = MemoryConfig(
    type="user",
    isolation="tenant",
    tenant_id_source="jwt.claims.org_id"  # Extract from auth token
)
Memory key structure:
tenant:company-a:user:alex:session:123
tenant:company-b:user:bob:session:456
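That key layout can be built directly from the decoded auth token. A minimal sketch, assuming the JWT has already been verified and decoded into a claims dict containing an org_id claim:

def memory_key(claims: dict, user_id: str, session_id: str) -> str:
    """Compose the isolation key so every read and write is scoped to one tenant."""
    tenant_id = claims["org_id"]  # assumes a verified JWT with an org_id claim
    return f"tenant:{tenant_id}:user:{user_id}:session:{session_id}"

memory_key({"org_id": "company-a"}, user_id="alex", session_id="123")
# -> "tenant:company-a:user:alex:session:123"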
DynamoDB Table Design for Multi-Tenancy
# Primary key includes tenant
table_schema = {
    "partition_key": "tenant_id#user_id",
    "sort_key": "session_id#timestamp",
    "attributes": {
        "messages": "list",
        "summary": "string",
        "metadata": "map"
    }
}
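With that key design, every query is naturally fenced to a single tenant and user because the tenant ID is baked into the partition key. A sketch using boto3; the attribute names pk and sk are assumptions for illustration, not taken from the post.

import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("agentcore-memory")

def load_session(tenant_id: str, user_id: str, session_id: str) -> list[dict]:
    """Fetch one session's items; the tenant prefix makes cross-tenant reads impossible."""
    resp = table.query(
        KeyConditionExpression=(
            Key("pk").eq(f"{tenant_id}#{user_id}")        # partition key: tenant_id#user_id
            & Key("sk").begins_with(f"{session_id}#")     # sort key: session_id#timestamp
        )
    )
    return resp["Items"]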
Advanced: Hybrid Memory with Vector Search
For complex use cases, combine session memory with long-term vector retrieval.
Architecture
Recent turns live in short-term session memory backed by DynamoDB, while older interactions are embedded into a vector store and retrieved on demand; the two are merged into the prompt at query time.
Implementation
from bedrock_agentcore import Agent, MemoryConfig, VectorMemory

# Short-term memory (recent context)
short_term = MemoryConfig(
    type="session",
    storage="dynamodb",
    max_messages=20
)

# Long-term memory (historical knowledge)
long_term = VectorMemory(
    storage="pinecone",
    index_name="agent-memory",
    embedding_model="amazon.titan-embed-text-v1",
    retrieval_k=5
)

agent = Agent(
    name="AdvancedAgent",
    model="anthropic.claude-3-sonnet",
    memory={
        "short_term": short_term,
        "long_term": long_term,
        "merge_strategy": "concat"  # or "summarize"
    }
)
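The config alone doesn't show how the two stores come together at query time. Conceptually, the "concat" merge strategy retrieves relevant long-term memories, prepends them to the recent short-term turns, and sends the combined context to the model. A rough sketch of that merge as a pure function; how the long-term hits are retrieved depends on your vector store.

def build_context(recent_messages: list[dict], long_term_hits: list[str],
                  max_long_term: int = 5) -> list[dict]:
    """'concat' merge: retrieved long-term memories go in first, recent turns follow."""
    memory_block = [
        {"role": "system", "content": f"Relevant history: {hit}"}
        for hit in long_term_hits[:max_long_term]
    ]
    return memory_block + recent_messages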
This pattern works well for:
- Personal assistants that remember months of history
- Enterprise agents that recall past projects
- Support agents that reference previous tickets
Common Memory Mistakes
Mistake 1: No TTL (Memory Grows Forever)
# ❌ Bad
memory:
  type: user
  # No TTL - data accumulates forever

# ✅ Good
memory:
  type: user
  ttl_days: 30
  max_messages: 500
Mistake 2: Missing Isolation
# ❌ Bad - users can see each other's data
memory:
  type: global

# ✅ Good - user-level isolation
memory:
  type: user
  isolation: user
Mistake 3: No Encryption for Sensitive Data
# ❌ Bad for healthcare/finance
memory:
  type: user
  encryption: false

# ✅ Good
memory:
  type: user
  encryption: AES-256
  kms_key_id: arn:aws:kms:us-east-1:123:key/abc
Mistake 4: Ignoring Context Window
# ❌ Bad - may exceed token limits
memory:
  max_messages: 1000
  # No context window limit

# ✅ Good
memory:
  max_messages: 100
  context_window_tokens: 8000
  summarize_after: 50
Production Memory Checklist
Before deploying, verify:
Security
- Memory isolation configured (user or tenant level)
- Encryption at rest enabled
- Encryption in transit (HTTPS/TLS)
- KMS key rotation scheduled
Performance
- TTL set to prevent unbounded growth
- Context window strategy configured
- DynamoDB on-demand or provisioned appropriately
- Cache enabled for frequent lookups
Compliance
- Data retention policy documented
- User data deletion process tested
- Audit logging enabled
- GDPR/CCPA requirements met
Monitoring
- DynamoDB metrics in CloudWatch
- Memory retrieval latency tracked
- Storage costs monitored
- Context window usage logged
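For the "memory retrieval latency tracked" item in the checklist above, a lightweight approach is to wrap every memory lookup and publish a custom CloudWatch metric. A minimal sketch with boto3; the namespace and metric name are arbitrary choices for illustration, not AgentCore defaults.

import time
import boto3

cloudwatch = boto3.client("cloudwatch")

def timed_memory_fetch(fetch_fn, *args, **kwargs):
    """Wrap a memory lookup and publish its latency as a custom CloudWatch metric."""
    start = time.perf_counter()
    result = fetch_fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000
    cloudwatch.put_metric_data(
        Namespace="AgentCore/Memory",  # namespace and metric name are assumptions
        MetricData=[{
            "MetricName": "RetrievalLatency",
            "Value": elapsed_ms,
            "Unit": "Milliseconds",
        }],
    )
    return result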
Next Steps
Memory is just one piece of production agents. Continue with:
- Getting Started with AgentCore → Set up your first agent if you haven't already
- Multi-Agent Orchestration → Share memory across multiple agents
- AgentCore Observability → Debug memory issues in production
- AgentCore vs ADK → Compare memory capabilities across platforms
Building stateful agents for enterprise?
At Cognilium, we've built agents that remember months of context while maintaining strict data isolation. Let's discuss your memory architecture →
Muhammad Mudassir
Founder & CEO, Cognilium AI
Mudassir Marwat is the Founder & CEO of Cognilium AI, where he leads the design and deployment of pr...
