How to Add Context to AI Agents: A Step-by-Step Implementation Guide
AI agents without context are like employees who forget everything after each conversation. They ask the same questions repeatedly, miss important details, and fail to personalize interactions. This tutorial shows you exactly how to add different types of context to your AI agents—with working code you can use today.
Why Your AI Agent Needs Context
Consider this scenario:
Without context:
User: "Book me a table at my usual restaurant"
Agent: "I'd be happy to help! Which restaurant would you like? And for how many people?"
User: "The one I went to last week. For my anniversary."
Agent: "Could you tell me the name of the restaurant? And when is your anniversary?"
With context:
User: "Book me a table at my usual restaurant"
Agent: "I'll book Osteria Francescana for 2 on February 28th at 7 PM—same time as your last visit. Should I request your preferred corner table?"
The difference is context. The second agent knows the user's favorite restaurant, party size, anniversary date, and seating preference. That knowledge transforms a frustrating interrogation into a delightful interaction.
The Four Types of Context Every Agent Needs
Before diving into implementation, understand the four context types:
| Type | What It Contains | Persistence | Example |
|---|---|---|---|
| System Context | Instructions, personality, capabilities | Static | "You are a helpful assistant that speaks casually" |
| Conversation Context | Current session's message history | Session | The last 10 messages exchanged |
| User Context | Preferences, history, profile data | Persistent | "User is vegetarian, prefers morning meetings" |
| Environmental Context | Time, location, external data | Dynamic | Current weather, calendar events |
Let's implement each one.
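Before writing the full agent, it helps to see the four stores side by side. This is a minimal sketch with illustrative names; the steps below build each store out properly.

```python
from dataclasses import dataclass, field

# Illustrative container for the four context types. Each field maps to one
# row of the table above; the tutorial implements each one in turn.
@dataclass
class AgentContext:
    system: str = "You are a helpful assistant."       # static instructions
    conversation: list = field(default_factory=list)   # this session's messages
    user: dict = field(default_factory=dict)           # persistent profile data
    environment: dict = field(default_factory=dict)    # time, location, weather

ctx = AgentContext()
ctx.user["diet"] = "vegetarian"   # persists across sessions
ctx.conversation.append({"role": "user", "content": "Book my usual table"})
```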
Step 1: Setting Up Your Base Agent
We'll use Python with the Anthropic API, but these patterns work with any LLM provider.
import anthropic
from datetime import datetime
from typing import Optional
import json
class ContextAwareAgent:
def __init__(self, api_key: str):
self.client = anthropic.Anthropic(api_key=api_key)
self.model = "claude-sonnet-4-20250514"
# Context stores
self.system_context = ""
self.conversation_history = []
self.user_profile = {}
self.environmental_context = {}
def chat(self, message: str) -> str:
# Build full context (we'll expand this)
system_prompt = self._build_system_prompt()
messages = self._build_messages(message)
response = self.client.messages.create(
model=self.model,
max_tokens=1024,
system=system_prompt,
messages=messages
)
assistant_message = response.content[0].text
# Store in conversation history
self.conversation_history.append({"role": "user", "content": message})
self.conversation_history.append({"role": "assistant", "content": assistant_message})
return assistant_message
def _build_system_prompt(self) -> str:
return self.system_context
def _build_messages(self, current_message: str) -> list:
return self.conversation_history + [{"role": "user", "content": current_message}]
Step 2: Adding System Context
System context defines your agent's personality and capabilities. Make it specific but not brittle.
def set_system_context(self, context: str):
"""Set the base personality and instructions for the agent."""
self.system_context = context
# Usage
agent = ContextAwareAgent(api_key="your-key")
agent.set_system_context("""
You are a helpful personal assistant with these characteristics:
- Communication style: Friendly and concise
- When uncertain, ask clarifying questions
- Always confirm important actions before executing
- Reference previous conversations when relevant
Capabilities:
- Calendar management
- Restaurant recommendations and bookings
- Travel planning
- General knowledge queries
""")
Best Practice: Don't include timestamps or other dynamic data at the start of your system prompt. Any change at the beginning of the prompt invalidates the cached prefix for everything after it, killing KV-cache efficiency. Put dynamic information later in the context, or in environmental context.
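One simple way to honor this is to assemble the prompt in fixed-to-dynamic order. A sketch (the section names and helper are illustrative, not part of the agent class above):

```python
def build_system_prompt(static_instructions: str, user_context: str = "",
                        environmental: str = "") -> str:
    # Stable content first, so the provider can reuse the cached prefix;
    # per-call dynamic content (time, weather) goes last.
    sections = [static_instructions]      # identical on every call
    if user_context:
        sections.append(user_context)     # changes rarely (once per session)
    if environmental:
        sections.append(environmental)    # changes on every call
    return "\n\n".join(sections)

prompt = build_system_prompt(
    "You are a helpful assistant.",
    user_context="User is vegetarian.",
    environmental="Current time: 19:05",
)
```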
Step 3: Implementing Conversation Context (Short-Term Memory)
Conversation context maintains continuity within a session. The key challenge: context windows are finite.
Basic Implementation
class ConversationManager:
def __init__(self, max_turns: int = 20):
self.max_turns = max_turns
self.history = []
def add_turn(self, role: str, content: str):
self.history.append({
"role": role,
"content": content,
"timestamp": datetime.now().isoformat()
})
self._trim_if_needed()
def _trim_if_needed(self):
"""Keep only the most recent turns."""
if len(self.history) > self.max_turns * 2: # user + assistant = 2 per turn
self.history = self.history[-(self.max_turns * 2):]
def get_messages(self) -> list:
"""Return messages in API-compatible format."""
return [{"role": m["role"], "content": m["content"]} for m in self.history]
Advanced: Context Compaction
For long-running sessions, summarize older messages instead of dropping them:
class SmartConversationManager:
def __init__(self, client, max_recent: int = 10, summary_threshold: int = 20):
self.client = client
self.max_recent = max_recent
self.summary_threshold = summary_threshold
self.history = []
self.summary = None
def add_turn(self, role: str, content: str):
self.history.append({"role": role, "content": content})
# Check if we need to compact
if len(self.history) > self.summary_threshold:
self._compact()
def _compact(self):
"""Summarize older messages, keep recent ones verbatim."""
# Split: old messages to summarize, recent to keep
old_messages = self.history[:-self.max_recent]
recent_messages = self.history[-self.max_recent:]
# Summarize old messages
messages_text = "\n".join([f"{m['role']}: {m['content']}" for m in old_messages])
summary_response = self.client.messages.create(
model="claude-sonnet-4-20250514",
max_tokens=500,
messages=[{
"role": "user",
"content": f"""Summarize this conversation, focusing on:
- Key decisions made
- User preferences revealed
- Important information shared
- Open action items
Conversation:
{messages_text}
Provide a concise summary:"""
}]
)
# Update state
new_summary = summary_response.content[0].text
if self.summary:
self.summary = f"{self.summary}\n\nMore recent: {new_summary}"
else:
self.summary = new_summary
self.history = recent_messages
    def get_messages(self) -> list:
        """Return recent messages in API-compatible format."""
        return [{"role": m["role"], "content": m["content"]} for m in self.history]

    def get_context(self) -> str:
        """Return full context with summary + recent messages."""
        context = ""
        if self.summary:
            context += f"Previous conversation summary:\n{self.summary}\n\n"
        context += "Recent messages:\n"
        for m in self.history:
            context += f"{m['role'].capitalize()}: {m['content']}\n"
        return context
Step 4: Adding User Context (Long-Term Memory)
User context persists across sessions. This is where personalization happens.
Option A: Build Your Own User Store
import sqlite3
from typing import Dict, Any
import json
class UserContextStore:
def __init__(self, db_path: str = "user_context.db"):
self.conn = sqlite3.connect(db_path)
self._init_db()
def _init_db(self):
self.conn.execute("""
CREATE TABLE IF NOT EXISTS user_profiles (
user_id TEXT PRIMARY KEY,
profile JSON,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)
""")
self.conn.execute("""
CREATE TABLE IF NOT EXISTS user_facts (
id INTEGER PRIMARY KEY AUTOINCREMENT,
user_id TEXT,
fact_type TEXT,
content TEXT,
confidence REAL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)
""")
self.conn.commit()
def get_profile(self, user_id: str) -> Dict[str, Any]:
cursor = self.conn.execute(
"SELECT profile FROM user_profiles WHERE user_id = ?",
(user_id,)
)
row = cursor.fetchone()
return json.loads(row[0]) if row else {}
def update_profile(self, user_id: str, updates: Dict[str, Any]):
current = self.get_profile(user_id)
current.update(updates)
self.conn.execute("""
INSERT OR REPLACE INTO user_profiles (user_id, profile, updated_at)
VALUES (?, ?, CURRENT_TIMESTAMP)
""", (user_id, json.dumps(current)))
self.conn.commit()
def add_fact(self, user_id: str, fact_type: str, content: str, confidence: float = 1.0):
self.conn.execute("""
INSERT INTO user_facts (user_id, fact_type, content, confidence)
VALUES (?, ?, ?, ?)
""", (user_id, fact_type, content, confidence))
self.conn.commit()
def get_relevant_facts(self, user_id: str, query: str, limit: int = 10) -> list:
        # Naive substring match on the query - upgrade to embeddings for production
        cursor = self.conn.execute("""
            SELECT fact_type, content, confidence FROM user_facts
            WHERE user_id = ? AND content LIKE ?
            ORDER BY confidence DESC, created_at DESC
            LIMIT ?
        """, (user_id, f"%{query}%", limit))
return cursor.fetchall()
Option B: Use an External Context API (Recommended)
Building robust context infrastructure is complex. External APIs like Dytto handle the hard parts—collection, storage, extraction, and retrieval:
import requests
class DyttoContextProvider:
def __init__(self, api_key: str):
self.api_key = api_key
self.base_url = "https://api.dytto.app/v1"
def get_context(self, user_id: str) -> dict:
"""Get comprehensive user context in one call."""
response = requests.get(
f"{self.base_url}/context",
headers={"Authorization": f"Bearer {self.api_key}"},
params={"user_id": user_id}
)
return response.json()
def search_context(self, user_id: str, query: str) -> list:
"""Semantic search over user's context."""
response = requests.get(
f"{self.base_url}/search",
headers={"Authorization": f"Bearer {self.api_key}"},
params={"user_id": user_id, "query": query}
)
return response.json().get("results", [])
def store_fact(self, user_id: str, category: str, content: str):
"""Store a new fact learned during conversation."""
response = requests.post(
f"{self.base_url}/facts",
headers={"Authorization": f"Bearer {self.api_key}"},
json={
"user_id": user_id,
"category": category,
"content": content
}
)
return response.json()
Integrating User Context Into Your Agent
class ContextAwareAgent:
def __init__(self, api_key: str, context_provider):
self.client = anthropic.Anthropic(api_key=api_key)
self.context_provider = context_provider
self.current_user_id = None
def start_session(self, user_id: str):
self.current_user_id = user_id
self.user_context = self.context_provider.get_context(user_id)
def _build_system_prompt(self) -> str:
base_prompt = """You are a helpful personal assistant."""
# Add user-specific context
if self.user_context:
preferences = self.user_context.get("preferences", {})
if preferences:
base_prompt += f"\n\nUser preferences:\n{json.dumps(preferences, indent=2)}"
patterns = self.user_context.get("patterns", {})
if patterns:
base_prompt += f"\n\nBehavioral patterns:\n{json.dumps(patterns, indent=2)}"
return base_prompt
Step 5: Adding Environmental Context
Environmental context includes dynamic information like time, location, and external data:
import requests
from datetime import datetime
import pytz
class EnvironmentProvider:
def __init__(self, user_timezone: str = "UTC"):
self.timezone = pytz.timezone(user_timezone)
def get_temporal_context(self) -> dict:
now = datetime.now(self.timezone)
return {
"current_time": now.strftime("%H:%M"),
"current_date": now.strftime("%Y-%m-%d"),
"day_of_week": now.strftime("%A"),
"time_of_day": self._get_time_of_day(now.hour)
}
def _get_time_of_day(self, hour: int) -> str:
if 5 <= hour < 12:
return "morning"
elif 12 <= hour < 17:
return "afternoon"
elif 17 <= hour < 21:
return "evening"
else:
return "night"
def get_weather(self, lat: float, lon: float) -> dict:
# Using Open-Meteo (free, no API key needed)
response = requests.get(
"https://api.open-meteo.com/v1/forecast",
params={
"latitude": lat,
"longitude": lon,
"current_weather": True
}
)
return response.json().get("current_weather", {})
Integrating Environmental Context
def _build_environmental_context(self) -> str:
env = self.environment_provider.get_temporal_context()
context = f"""Current context:
- Time: {env['current_time']} ({env['time_of_day']})
- Date: {env['current_date']} ({env['day_of_week']})"""
if self.user_location:
weather = self.environment_provider.get_weather(*self.user_location)
        context += f"\n- Weather: {weather.get('temperature')}°C (WMO weather code {weather.get('weathercode')})"
return context
Step 6: Putting It All Together
Here's a complete implementation that ties all four context types together:
import anthropic
from datetime import datetime
from typing import Optional, Dict, Any
import json
class ProductionContextAgent:
def __init__(
self,
anthropic_api_key: str,
context_api_key: Optional[str] = None
):
self.client = anthropic.Anthropic(api_key=anthropic_api_key)
self.model = "claude-sonnet-4-20250514"
# Context providers
self.conversation = SmartConversationManager(self.client)
self.user_store = DyttoContextProvider(context_api_key) if context_api_key else None
self.environment = EnvironmentProvider()
# Current session state
self.user_id = None
self.user_context = {}
self.system_prompt = "You are a helpful assistant."
def start_session(self, user_id: str, timezone: str = "UTC"):
"""Initialize a session for a specific user."""
self.user_id = user_id
self.environment = EnvironmentProvider(timezone)
# Load user context
if self.user_store:
self.user_context = self.user_store.get_context(user_id)
# Reset conversation for new session
self.conversation = SmartConversationManager(self.client)
def chat(self, message: str) -> str:
"""Process a message with full context awareness."""
# Build the complete system prompt with all context
full_system_prompt = self._build_full_system_prompt()
# Get conversation history plus new message
messages = self.conversation.get_messages()
messages.append({"role": "user", "content": message})
# Make the API call
response = self.client.messages.create(
model=self.model,
max_tokens=1024,
system=full_system_prompt,
messages=messages
)
assistant_response = response.content[0].text
# Update conversation history
self.conversation.add_turn("user", message)
self.conversation.add_turn("assistant", assistant_response)
# Extract and store any new insights (async in production)
self._extract_insights(message, assistant_response)
return assistant_response
def _build_full_system_prompt(self) -> str:
"""Combine all context sources into the system prompt."""
sections = [self.system_prompt]
# Add user context if available
if self.user_context:
sections.append(self._format_user_context())
# Add environmental context
sections.append(self._format_environmental_context())
# Add conversation summary if exists
if self.conversation.summary:
sections.append(f"Conversation summary:\n{self.conversation.summary}")
return "\n\n".join(sections)
def _format_user_context(self) -> str:
"""Format user context for the prompt."""
parts = ["About this user:"]
if prefs := self.user_context.get("preferences"):
parts.append(f"Preferences: {json.dumps(prefs)}")
if patterns := self.user_context.get("patterns"):
parts.append(f"Behavioral patterns: {json.dumps(patterns)}")
if insights := self.user_context.get("insights"):
parts.append(f"Key insights: {json.dumps(insights[:5])}")
return "\n".join(parts)
def _format_environmental_context(self) -> str:
"""Format environmental context for the prompt."""
env = self.environment.get_temporal_context()
return f"Current time: {env['current_time']} on {env['day_of_week']}, {env['current_date']}"
def _extract_insights(self, user_message: str, assistant_response: str):
"""Extract and store insights from the conversation."""
if not self.user_store:
return
# Use a lightweight extraction (could be async/batched in production)
        extraction_prompt = f"""Analyze this exchange and extract any facts about the user worth remembering.
User: {user_message}
Assistant: {assistant_response}
Respond with ONLY a JSON array of objects like {{"category": "...", "content": "..."}}; return [] if there is nothing to remember:"""
try:
response = self.client.messages.create(
model=self.model,
max_tokens=200,
messages=[{"role": "user", "content": extraction_prompt}]
)
            facts = json.loads(response.content[0].text)
            for fact in facts:
                if isinstance(fact, dict) and fact.get("content"):  # skip malformed entries
                    self.user_store.store_fact(
                        self.user_id,
                        fact.get("category", "general"),
                        fact["content"]
                    )
except (json.JSONDecodeError, KeyError):
pass # Extraction failed, not critical
Step 7: Testing Your Context-Aware Agent
# Initialize the agent
agent = ProductionContextAgent(
anthropic_api_key="your-anthropic-key",
context_api_key="your-dytto-key" # Optional
)
# Configure the base personality
agent.system_prompt = """You are a helpful personal assistant.
You're friendly, efficient, and always reference context when available.
When you learn something new about the user, acknowledge it naturally."""
# Start a session
agent.start_session(user_id="user_123", timezone="America/New_York")
# Have a conversation
print(agent.chat("Hi, I'm planning a birthday dinner for my wife next week"))
# Agent responds knowing nothing about preferences yet
print(agent.chat("She's vegetarian and loves Italian food"))
# Agent acknowledges and stores this information
print(agent.chat("Any restaurant recommendations?"))
# Agent recommends vegetarian-friendly Italian restaurants
# Later session...
agent.start_session(user_id="user_123", timezone="America/New_York")
print(agent.chat("I need restaurant ideas again"))
# Agent remembers: "Since your wife enjoys vegetarian Italian, here are some options..."
Common Mistakes to Avoid
1. Putting Timestamps at the Start of System Prompts
Bad:
system_prompt = f"Current time: {datetime.now()}\nYou are a helpful assistant..."
Good:
system_prompt = "You are a helpful assistant..."
environmental_context = f"Current time: {datetime.now()}"
# Add environmental context at the end, after stable instructions
2. Modifying Previous Context Mid-Conversation
Don't retroactively edit conversation history. It breaks KV-cache and confuses the model.
Bad:
def correct_previous_response(self, index, new_content):
self.history[index]["content"] = new_content # Don't do this
Good:
def add_correction(self, correction):
self.history.append({
"role": "user",
"content": f"Correction to previous: {correction}"
})
3. Overloading Context with Raw Data
Bad:
user_context = get_all_user_data(user_id) # Everything they've ever done
system_prompt = f"User data: {json.dumps(user_context)}"
Good:
relevant_context = get_relevant_context(user_id, current_query)
system_prompt = f"Relevant user info: {format_concisely(relevant_context)}"
4. Not Handling Context Gracefully When Missing
Bad:
user_prefs = context["preferences"]["dining"]["cuisine"] # Crashes if missing
Good:
user_prefs = context.get("preferences", {}).get("dining", {}).get("cuisine", "not specified")
Performance Optimization Tips
- Cache aggressively: User context that doesn't change often should be cached locally with appropriate TTLs.
- Use async extraction: Don't block the response to extract insights. Run extraction in the background.
- Batch similar operations: If you're storing multiple facts, batch them into a single API call.
- Profile your context pipeline: If context retrieval takes 500ms before you even call the LLM, you have a problem.
- Consider edge caching: For environmental context like weather, cache responses for 15-30 minutes.
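The caching tips can be covered by one small utility. A sketch of an in-memory TTL cache (class name and defaults are illustrative; production systems would typically reach for Redis or similar):

```python
import time

class TTLCache:
    """Tiny in-memory cache with per-entry expiry, suitable for
    rarely-changing user context or short-lived weather data."""
    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() > expires_at:
            del self._store[key]  # expired: evict and report a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)

cache = TTLCache(ttl_seconds=600)  # 10-minute TTL for user profiles
cache.set("user_123:profile", {"diet": "vegetarian"})
```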
Frequently Asked Questions
How much context should I include?
Start minimal and expand based on observed failures. Most agents work better with 2,000-4,000 tokens of highly relevant context than 20,000 tokens of everything. Measure whether context is actually being used.
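Enforcing a budget like this can be as simple as greedily keeping the most relevant facts until the token limit is hit. A sketch, using a rough characters-per-token heuristic (swap in your provider's real tokenizer for accurate counts):

```python
def trim_to_budget(facts: list, max_tokens: int = 2000) -> list:
    # Rough heuristic: ~4 characters per token. This only illustrates
    # the budgeting idea; use a real tokenizer in production.
    kept, used = [], 0
    for fact in facts:                    # assumes facts are sorted by relevance
        cost = max(1, len(fact) // 4)
        if used + cost > max_tokens:
            break                         # budget exhausted; drop the rest
        kept.append(fact)
        used += cost
    return kept
```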
Should I build my own context system or use an API?
For MVPs and early-stage products, use external APIs like Dytto—you'll ship faster and avoid infrastructure headaches. Build custom only when you have specific compliance requirements or need deep integration with proprietary data.
How do I handle multiple users with shared context?
Use namespace isolation. Every context query should include user_id. For shared contexts (like team knowledge), create explicit "team" namespaces that users can access.
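A minimal sketch of such namespacing (the key format and scope names are illustrative):

```python
def context_key(namespace: str, owner_id: str, item: str) -> str:
    """Build a namespaced storage key so one user's (or team's) context can
    never collide with another's, and shared scopes stay explicit."""
    return f"{namespace}:{owner_id}:{item}"

user_key = context_key("user", "user_123", "preferences")   # private to one user
team_key = context_key("team", "team_42", "style_guide")    # shared team knowledge
```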
What if my context exceeds the model's limit?
Implement progressive disclosure: give the agent tools to retrieve more context on demand rather than front-loading everything. The agent can call search_user_history("restaurants") when it needs that information.
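One way to sketch this is a retrieval tool the model can call on demand. The tool name, schema, and dispatcher below are illustrative, in the JSON-schema style most LLM APIs accept; the `get_relevant_facts` call assumes the fact store from Step 4:

```python
# Illustrative tool definition for on-demand context retrieval.
search_history_tool = {
    "name": "search_user_history",
    "description": "Search the user's stored context for facts matching a query. "
                   "Call this instead of guessing when past details are needed.",
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "e.g. 'restaurants'"},
            "limit": {"type": "integer", "default": 5},
        },
        "required": ["query"],
    },
}

def handle_tool_call(store, user_id: str, tool_input: dict) -> list:
    # Dispatch the model's tool call to the fact store from Step 4.
    return store.get_relevant_facts(user_id, tool_input["query"],
                                    limit=tool_input.get("limit", 5))
```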
How do I measure if context is working?
Track: (1) Repeated questions—is the agent asking things it should know? (2) Personalization accuracy—are recommendations relevant? (3) User satisfaction—do users feel understood? (4) Context utilization—is the model actually referencing provided context?
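Metric (4) can get a crude first-pass signal from substring matching. A sketch; a real pipeline would use semantic matching rather than literal substrings:

```python
def context_utilization(response: str, supplied_facts: list) -> float:
    """Fraction of supplied context facts that the reply actually mentions.
    Crude by design: substring matching is a floor, not a real measurement."""
    if not supplied_facts:
        return 0.0
    hits = sum(1 for fact in supplied_facts if fact.lower() in response.lower())
    return hits / len(supplied_facts)
```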
How do I handle context conflicts?
When old information conflicts with new, the model should prefer recent information. Make recency clear in your context formatting: "Previous preference (6 months ago): Italian. Recent preference (last week): Japanese."
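Labeling each fact with its age when formatting context is one way to make that recency explicit. A sketch (the age buckets are arbitrary choices, not a recommendation):

```python
from datetime import date

def format_fact_with_recency(fact: str, recorded: date, today: date) -> str:
    """Append a recency label so the model can weigh newer facts higher."""
    age_days = (today - recorded).days
    if age_days <= 14:
        label = "last two weeks"
    elif age_days < 60:
        label = f"{age_days} days ago"
    else:
        label = f"about {age_days // 30} months ago"
    return f"{fact} ({label})"
```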
Conclusion
Adding context to AI agents transforms them from stateless question-answerers into intelligent assistants that understand and remember their users. The key principles:
- Separate context types: System, conversation, user, and environmental context serve different purposes.
- Keep context relevant: Don't dump everything in—curate what's likely useful for the current interaction.
- Make it append-only: Avoid modifying previous context to maintain cache efficiency.
- Extract continuously: Learn from every interaction and store insights for future use.
- Use the right tools: External APIs like Dytto handle the infrastructure complexity so you can focus on your agent's core value.
Start simple, measure what works, and iterate. The best context systems evolved from basic prototypes that learned from real user interactions.
Ready to add context to your agents? Explore Dytto's context API to get sophisticated user context without building the infrastructure yourself.