
AgentCore Memory Layer: How to Build Stateful AI Agents


Muhammad Mudassir

Founder & CEO, Cognilium AI

[Hero image: AWS Bedrock AgentCore memory layer architecture with DynamoDB integration and session isolation]
Master AgentCore's memory layer for stateful AI agents. Session memory, user persistence, and context management patterns with production code examples.
Tags: stateful AI agents, agent memory management, DynamoDB agent memory, session memory AI, context window management, AWS agent persistence

Stateless agents are useless in production. Users expect AI to remember their name, their preferences, and what they discussed five minutes ago. AWS Bedrock AgentCore's memory layer solves this—but most teams configure it wrong. Here's how to get it right.

What is the AgentCore Memory Layer?

The AgentCore Memory Layer is a managed memory system that persists conversation context across agent interactions. It stores messages in DynamoDB, handles session isolation automatically, and provides configurable context windows to manage token costs. Memory can be scoped to sessions (temporary), users (persistent), or globally (shared across all users).

Why Memory Matters

Without memory, every message is the start of a new conversation:

User: "My name is Alex and I'm building a fintech app."
Agent: "Nice to meet you, Alex! I can help with fintech..."

User: "What tech stack should I use?"
Agent: "I'd be happy to help! What kind of project are you working on?"
# ❌ Agent forgot Alex and the fintech context

With memory:

User: "My name is Alex and I'm building a fintech app."
Agent: "Nice to meet you, Alex! I can help with fintech..."

User: "What tech stack should I use?"
Agent: "For your fintech app, Alex, I'd recommend..."
# ✅ Agent remembers context

Memory transforms agents from frustrating to useful.

Memory Scopes: Session vs User vs Global

AgentCore supports three memory scopes:

Session Memory (Default)

memory:
  type: session
  ttl_hours: 4
  • Lifespan: One conversation (expires with TTL)
  • Use case: Customer support, one-off queries
  • Isolation: Each session is independent
  • Cost: Low (short-lived data)

User Memory

memory:
  type: user
  ttl_days: 30
  • Lifespan: Persists across sessions for same user
  • Use case: Personal assistants, ongoing relationships
  • Isolation: Each user has separate memory
  • Cost: Medium (longer retention)

Global Memory

memory:
  type: global
  ttl_days: 90
  • Lifespan: Shared across all users
  • Use case: Company knowledge, shared context
  • Isolation: None—everyone sees same memory
  • Cost: Higher (must manage carefully)

Choosing the Right Scope

Use Case                        Recommended Scope
Customer support chat           Session
Personal finance advisor        User
Internal company assistant      User + Global
Quick Q&A bot                   Session
Health/medical assistant        User (with encryption)

Basic Memory Configuration

Minimal Setup

# agent_config.py
from bedrock_agentcore import Agent, MemoryConfig

memory = MemoryConfig(
    type="session",
    storage="dynamodb",
    table_name="agentcore-memory",
    ttl_hours=24
)

agent = Agent(
    name="MyAgent",
    model="anthropic.claude-3-sonnet",
    memory=memory
)
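To see session memory in action, here's a hedged usage sketch of calling the agent twice within the same session. The invocation method name (invoke) and its session_id parameter are assumptions for illustration, not the confirmed AgentCore SDK surface, so check the SDK docs for the actual call signature.

# Hypothetical usage sketch -- method name and parameters are assumptions
session_id = "session-123"

# First turn: the exchange is stored under session-123
reply_1 = agent.invoke(
    "My name is Alex and I'm building a fintech app.",
    session_id=session_id,
)

# Second turn: because both calls share session_id, the memory layer
# loads the earlier messages into the prompt context automatically
reply_2 = agent.invoke(
    "What tech stack should I use?",
    session_id=session_id,
)
print(reply_2)  # Should reference Alex and the fintech app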

Full Configuration

memory = MemoryConfig(
    # Scope settings
    type="user",                    # session, user, or global
    ttl_hours=168,                  # 7 days
    
    # Storage settings
    storage="dynamodb",
    table_name="agentcore-memory",
    region="us-east-1",
    
    # Context management
    max_messages=100,               # Keep last 100 messages
    context_window_tokens=8000,     # Max tokens to include
    summarize_after=50,             # Summarize older messages
    
    # Security
    isolation="user",               # Isolate by user ID
    encryption="AES-256",           # Encrypt at rest
    
    # Performance
    cache_ttl_seconds=300,          # Cache recent lookups
    batch_write=True                # Batch DynamoDB writes
)

Context Window Strategies

LLMs have token limits. You can't send the entire conversation history. Here's how to manage it:

Strategy 1: Sliding Window (Simple)

Keep the last N messages:

memory = MemoryConfig(
    context_window_strategy="sliding",
    max_messages=20
)
Messages 1-20: [dropped]
Messages 21-40: Included in context

Pros: Simple and predictable.
Cons: Loses early context.
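If you want to reason about what the sliding strategy does under the hood, here's a minimal, framework-free sketch in plain Python (not AgentCore internals):

def sliding_window(messages: list[dict], max_messages: int = 20) -> list[dict]:
    """Keep only the most recent max_messages messages.

    Older messages are dropped entirely -- this is what makes the
    strategy cheap but forgetful.
    """
    return messages[-max_messages:]

history = [{"role": "user", "content": f"message {i}"} for i in range(1, 41)]
context = sliding_window(history, max_messages=20)
# context now holds messages 21-40; messages 1-20 are gone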

Strategy 2: Summarization (Recommended)

Summarize old messages, keep recent ones:

memory = MemoryConfig(
    context_window_strategy="summarize",
    summarize_after=20,
    summary_model="anthropic.claude-3-haiku"  # Cheap model for summaries
)
Messages 1-20: Summarized → "User Alex discussed fintech app. Decided on React + Node stack."
Messages 21-40: Included in full

Pros: Retains key context while managing tokens.
Cons: Summaries may lose nuance.
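Conceptually, the summarize strategy folds older messages into one synthetic summary message and keeps recent ones verbatim. A minimal sketch of that flow, with the summarization call stubbed out (summarize_with_haiku is a placeholder, not an SDK function):

def summarize_with_haiku(messages: list[dict]) -> str:
    # Placeholder -- in practice, send these messages to a cheap model
    # (e.g. Claude Haiku via Bedrock) and return its summary.
    return "User Alex discussed a fintech app and chose a React + Node stack."

def build_context(messages: list[dict], summarize_after: int = 20) -> list[dict]:
    """Summarize everything older than the last `summarize_after` messages."""
    if len(messages) <= summarize_after:
        return messages
    old, recent = messages[:-summarize_after], messages[-summarize_after:]
    summary = {"role": "system",
               "content": "Conversation so far: " + summarize_with_haiku(old)}
    return [summary] + recent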

Strategy 3: Semantic Retrieval (Advanced)

Store messages in vector DB, retrieve relevant ones:

memory = MemoryConfig(
    context_window_strategy="semantic",
    vector_store="pinecone",
    retrieval_k=10  # Retrieve 10 most relevant messages
)
User: "What stack did we decide on?"
Retrieved: Messages about tech stack decisions (semantically similar)

Pros: Always retrieves the most relevant context.
Cons: More complex; requires a vector DB.
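The semantic strategy boils down to embedding each stored message, embedding the incoming query, and pulling back the k nearest matches. A minimal sketch of the retrieval step in plain Python (in production the embeddings come from a real model and the search runs in a vector DB like Pinecone):

import math

def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity between two embedding vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec: list[float],
             stored: list[tuple[list[float], str]],
             k: int = 10) -> list[str]:
    """Return the k stored messages whose embeddings are closest to the query."""
    ranked = sorted(stored, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    return [text for _, text in ranked[:k]]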

For most production agents, Strategy 2 (Summarization) offers the best balance.

Memory Isolation for Multi-Tenant Apps

If your agent serves multiple companies, memory isolation is critical.

Problem: Cross-Tenant Leakage

Company A user: "Our revenue is $5M"
Company B user: "What's my company's revenue?"
Agent (wrong): "Your revenue is $5M"  # ❌ Leaked Company A's data

Solution: Tenant Isolation

memory = MemoryConfig(
    type="user",
    isolation="tenant",
    tenant_id_source="jwt.claims.org_id"  # Extract from auth token
)

Memory key structure:

tenant:company-a:user:alex:session:123
tenant:company-b:user:bob:session:456
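One way to picture the isolation guarantee: every read and write is keyed by the tenant and user pulled from the verified auth token, so a query from Company B can never touch Company A's partition. A hedged sketch of that key construction (the claim names org_id and sub are assumptions matching the config above):

def memory_key(claims: dict, session_id: str) -> str:
    """Build a tenant-scoped memory key from verified JWT claims."""
    tenant_id = claims["org_id"]  # assumed claim name, per tenant_id_source above
    user_id = claims["sub"]       # assumed user-id claim
    return f"tenant:{tenant_id}:user:{user_id}:session:{session_id}"

print(memory_key({"org_id": "company-a", "sub": "alex"}, "123"))
# -> tenant:company-a:user:alex:session:123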

DynamoDB Table Design for Multi-Tenancy

# Primary key includes tenant
table_schema = {
    "partition_key": "tenant_id#user_id",
    "sort_key": "session_id#timestamp",
    "attributes": {
        "messages": "list",
        "summary": "string",
        "metadata": "map"
    }
}
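If you manage the table yourself, a minimal boto3 sketch matching this schema might look like the following. Table, key, and TTL attribute names are just the examples from above; adjust billing mode and the KMS key to your account, and wait for the table to become ACTIVE before enabling TTL.

import boto3

dynamodb = boto3.client("dynamodb", region_name="us-east-1")

dynamodb.create_table(
    TableName="agentcore-memory",
    KeySchema=[
        {"AttributeName": "pk", "KeyType": "HASH"},   # tenant_id#user_id
        {"AttributeName": "sk", "KeyType": "RANGE"},  # session_id#timestamp
    ],
    AttributeDefinitions=[
        {"AttributeName": "pk", "AttributeType": "S"},
        {"AttributeName": "sk", "AttributeType": "S"},
    ],
    BillingMode="PAY_PER_REQUEST",
    SSESpecification={"Enabled": True, "SSEType": "KMS"},  # encryption at rest
)

# After the table is ACTIVE, let DynamoDB purge expired items automatically
dynamodb.update_time_to_live(
    TableName="agentcore-memory",
    TimeToLiveSpecification={"Enabled": True, "AttributeName": "ttl"},
)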

Advanced: Hybrid Memory with Vector Search

For complex use cases, combine session memory with long-term vector retrieval.

Architecture

[Architecture diagram]

Implementation

from bedrock_agentcore import Agent, MemoryConfig, VectorMemory

# Short-term memory (recent context)
short_term = MemoryConfig(
    type="session",
    storage="dynamodb",
    max_messages=20
)

# Long-term memory (historical knowledge)
long_term = VectorMemory(
    storage="pinecone",
    index_name="agent-memory",
    embedding_model="amazon.titan-embed-text-v1",
    retrieval_k=5
)

agent = Agent(
    name="AdvancedAgent",
    model="anthropic.claude-3-sonnet",
    memory={
        "short_term": short_term,
        "long_term": long_term,
        "merge_strategy": "concat"  # or "summarize"
    }
)
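To make the merge step concrete, here's a minimal sketch of what a "concat" merge could look like: recent session messages stay verbatim, and relevant long-term snippets are prepended as a system note. The helper is illustrative, not part of the SDK.

def merge_contexts(short_term_messages: list[dict],
                   long_term_snippets: list[str]) -> list[dict]:
    """Concatenate long-term recall with the recent session transcript."""
    if not long_term_snippets:
        return short_term_messages
    recall = {
        "role": "system",
        "content": "Relevant history:\n" + "\n".join(f"- {s}" for s in long_term_snippets),
    }
    return [recall] + short_term_messages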

This pattern works well for:

  • Personal assistants that remember months of history
  • Enterprise agents that recall past projects
  • Support agents that reference previous tickets

Common Memory Mistakes

Mistake 1: No TTL (Memory Grows Forever)

# ❌ Bad
memory:
  type: user
  # No TTL - data accumulates forever
# ✅ Good
memory:
  type: user
  ttl_days: 30
  max_messages: 500

Mistake 2: Missing Isolation

# ❌ Bad - users can see each other's data
memory:
  type: global
# ✅ Good - user-level isolation
memory:
  type: user
  isolation: user

Mistake 3: No Encryption for Sensitive Data

# ❌ Bad for healthcare/finance
memory:
  type: user
  encryption: false
# ✅ Good
memory:
  type: user
  encryption: AES-256
  kms_key_id: arn:aws:kms:us-east-1:123:key/abc

Mistake 4: Ignoring Context Window

# ❌ Bad - may exceed token limits
memory:
  max_messages: 1000
  # No context window limit
# ✅ Good
memory:
  max_messages: 100
  context_window_tokens: 8000
  summarize_after: 50
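A rough but workable guard against blowing the token limit: estimate tokens (roughly 4 characters per token for English text) and trim the oldest messages until the transcript fits the budget. A minimal sketch, not the AgentCore implementation:

def estimate_tokens(text: str) -> int:
    # Crude heuristic: ~4 characters per token for English text
    return max(1, len(text) // 4)

def fit_to_budget(messages: list[dict], budget_tokens: int = 8000) -> list[dict]:
    """Drop the oldest messages until the estimated total fits the budget."""
    kept: list[dict] = []
    used = 0
    for msg in reversed(messages):          # newest first
        cost = estimate_tokens(msg["content"])
        if used + cost > budget_tokens:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))             # restore chronological order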

Production Memory Checklist

Before deploying, verify:

Security

  • Memory isolation configured (user or tenant level)
  • Encryption at rest enabled
  • Encryption in transit (HTTPS/TLS)
  • KMS key rotation scheduled

Performance

  • TTL set to prevent unbounded growth
  • Context window strategy configured
  • DynamoDB on-demand or provisioned appropriately
  • Cache enabled for frequent lookups

Compliance

  • Data retention policy documented
  • User data deletion process tested
  • Audit logging enabled
  • GDPR/CCPA requirements met

Monitoring

  • DynamoDB metrics in CloudWatch
  • Memory retrieval latency tracked
  • Storage costs monitored
  • Context window usage logged

Next Steps

Memory is just one piece of production agents. Continue with:

  1. Getting Started with AgentCore → Set up your first agent if you haven't already

  2. Multi-Agent Orchestration → Share memory across multiple agents

  3. AgentCore Observability → Debug memory issues in production

  4. AgentCore vs ADK → Compare memory capabilities across platforms


Building stateful agents for enterprise?

At Cognilium, we've built agents that remember months of context while maintaining strict data isolation. Let's discuss your memory architecture →
