Mem0: Universal Memory Layer for AI Agents

Open-source intelligent memory layer that gives AI agents persistent, personalized memory capabilities with multi-level context retention, semantic retrieval, and support for 20+ vector databases.

  1. Step 1

    What is Mem0?

    Mem0 (pronounced "mem-zero") is a universal memory layer for AI agents that enhances AI assistants with persistent, personalized memory capabilities. Unlike traditional context windows that forget everything between sessions, Mem0 enables AI systems to remember user preferences, adapt to individual needs, and continuously learn over time.

    With 56,891 stars on GitHub at the time of writing, Mem0 is one of the most widely adopted open-source projects for adding memory to AI applications. The project reports 26% higher accuracy than OpenAI's built-in memory on the LoCoMo benchmark, 91% faster responses than full-context approaches, and 90% lower token usage than keeping the full conversation history in context.

    Mem0 is ideal for customer support chatbots, AI assistants, autonomous agents, healthcare applications, and any AI system that benefits from personalized, context-aware interactions across sessions.

  2. Step 2

    Technology Stack & Architecture

    Mem0 uses a hybrid graph, vector, and key-value store architecture to achieve superior performance. The system operates through an intelligent extraction and retrieval pipeline:

    Core Components:

    • LLM Processing: GPT-4o-mini is the default for fact extraction and memory updates. Also supports OpenAI, Anthropic (Claude), HuggingFace, AWS Bedrock, Azure OpenAI, Gemini, Groq, Ollama, and any LiteLLM-compatible model
    • Embedding Models: The default is text-embedding-3-small (1536 dimensions). Qwen 600M+ or GTE models are recommended for hybrid search scenarios
    • Vector Storage: 20+ supported backends including Qdrant (default), Pinecone, ChromaDB, Weaviate, Milvus, pgvector, Supabase, Redis, Elasticsearch, MongoDB, and Faiss
    • History Database: SQLite for local development, PostgreSQL for production

    Memory Types:

    1. User Memory: Long-term preferences and facts about specific users
    2. Session Memory: Temporary context for individual conversations
    3. Agent Memory: System-level patterns and learned behaviors

    Retrieval Strategy:

    Mem0 employs multi-signal retrieval combining:

    • Semantic similarity (vector search)
    • BM25 keyword matching
    • Entity matching and linking
    • Temporal reasoning for time-aware retrieval

    This hybrid approach outperforms any single retrieval signal and enables accurate fact recall even at scale.
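
    To make the idea of combining signals concrete, here is a minimal sketch of weighted score fusion. It is not Mem0's implementation: term-count cosine similarity stands in for embedding similarity, a simple token-overlap ratio stands in for BM25, and the function names and the 0.7/0.3 weights are illustrative assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyword_overlap(query_tokens, doc_tokens):
    """Fraction of distinct query tokens that also appear in the document."""
    q = set(query_tokens)
    return len(q & set(doc_tokens)) / len(q) if q else 0.0


def hybrid_rank(query, memories, w_semantic=0.7, w_keyword=0.3):
    """Rank memories by a weighted sum of two signals (weights are illustrative)."""
    q_tokens = tokenize(query)
    q_vec = Counter(q_tokens)
    scored = []
    for text in memories:
        d_tokens = tokenize(text)
        score = (w_semantic * cosine(q_vec, Counter(d_tokens))
                 + w_keyword * keyword_overlap(q_tokens, d_tokens))
        scored.append((score, text))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


ranked = hybrid_rank(
    "python preferences",
    ["Alice loves Python", "Alice prefers dark mode", "Bob works at Acme Corp"],
)
print(ranked[0][1])
```

    A real system would add further terms to the sum, such as an entity-match score and a recency score, following the same pattern.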

    Mem0 Architecture:
    
    ┌─────────────────────────────────────────┐
    │  User Interaction / Agent Input         │
    └──────────────┬──────────────────────────┘
                   │
                   ▼
    ┌─────────────────────────────────────────┐
    │  LLM Extraction (GPT-4o-mini default)   │
    │  • Extract atomic facts                 │
    │  • Deduplicate against existing memory  │
    └──────────────┬──────────────────────────┘
                   │
                   ▼
    ┌─────────────────────────────────────────┐
    │  Embedding (text-embedding-3-small)     │
    │  • Convert facts to 1536-dim vectors    │
    └──────────────┬──────────────────────────┘
                   │
                   ▼
    ┌─────────────────────────────────────────┐
    │  Vector Store (Qdrant/Pinecone/etc)     │
    │  • Index by user_id, run_id, agent_id   │
    │  • Store embeddings + metadata          │
    └──────────────┬──────────────────────────┘
                   │
            [Retrieval Phase]
                   │
                   ▼
    ┌─────────────────────────────────────────┐
    │  Multi-Signal Retrieval                 │
    │  • Semantic similarity search           │
    │  • BM25 keyword matching                │
    │  • Entity matching                      │
    │  • Temporal filtering                   │
    └──────────────┬──────────────────────────┘
                   │
                   ▼
    ┌─────────────────────────────────────────┐
    │  Inject into LLM Context Window         │
    │  • Top-k relevant memories              │
    │  • Minimal token overhead               │
    └─────────────────────────────────────────┘
  3. Step 3

    Prerequisites

    Before installing Mem0, ensure your system meets these requirements:

    Required:

    • Python 3.10 or higher - Mem0 requires modern Python features
    • OpenAI API key - Default LLM and embedding provider (obtain from OpenAI Platform)

    Optional (for self-hosted vector databases):

    • Docker - For running Qdrant, Milvus, or other containerized vector stores
    • PostgreSQL - For pgvector or production history storage
    • Redis - For Redis-based vector storage

    For production deployments:

    • API keys for your chosen LLM provider (OpenAI, Anthropic, etc.)
    • Vector database credentials (Pinecone, Weaviate, etc.)
    • Sufficient storage - Memory scales with usage; plan for growth

    Mem0 offers three deployment tiers:

    1. Library (pip/npm) - Local testing and prototyping
    2. Self-Hosted Server - Docker-based deployment with authentication
    3. Cloud Platform - Zero-ops managed service at app.mem0.ai
    # Verify Python version (must be 3.10+)
    python --version
    
    # Set OpenAI API key (required for default configuration)
    export OPENAI_API_KEY="your-openai-api-key"
    
    # Optional: Verify Docker (if using self-hosted vector stores)
    docker --version
    
    # Optional: Check available memory and disk space
    free -h
    df -h
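
    The same checks can be scripted in Python, which is handy in CI or at application startup. This is a small sketch using only the standard library; the function name check_prerequisites is a hypothetical helper, not part of Mem0.

```python
import os
import sys


def check_prerequisites(version_info=sys.version_info, env=os.environ):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if tuple(version_info[:2]) < (3, 10):
        problems.append(
            f"Python 3.10+ required, found {version_info[0]}.{version_info[1]}"
        )
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(f"WARNING: {problem}")
```

    Passing the version and environment as arguments keeps the check easy to test without changing the real environment.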
  4. Step 4

    Installation - Python SDK

    The Python SDK is the primary way to integrate Mem0 into your applications. Installation is straightforward:

    # Basic installation
    pip install mem0ai
    
    # Install with vector store support (includes Chroma, Pinecone, etc.)
    pip install "mem0ai[vector_stores]"
    
    # Install with NLP enhancements for better extraction
    pip install "mem0ai[nlp]"
    
    # Install all optional dependencies
    pip install "mem0ai[all]"
    
    # Verify installation
    python -c "from mem0 import Memory; print('Mem0 installed successfully')"
  5. Step 5

    Quick Start - Default Configuration

    Get started with Mem0 in under 5 minutes using the default configuration. This setup uses:

    • GPT-4o-mini for fact extraction
    • text-embedding-3-small for embeddings (1536 dimensions)
    • Qdrant vector database (local storage at /tmp/qdrant)
    • SQLite history database at ~/.mem0/history.db

    The default configuration is perfect for development and prototyping. For production, you'll want to customize the vector store and LLM providers (covered in later steps).

    from mem0 import Memory
    
    # Initialize with default configuration
    m = Memory()
    
    # Add memories from a conversation
    # Mem0 automatically extracts and stores atomic facts
    messages = [
        {"role": "user", "content": "Hi, I'm Alice. I'm a software engineer."},
        {"role": "assistant", "content": "Hello Alice! Nice to meet you."},
        {"role": "user", "content": "I love Python and prefer dark mode in my IDE."},
    ]
    
    # Store memories with user_id for personalization
    # Recent Mem0 releases return a dict of the form {"results": [...]}
    result = m.add(messages, user_id="alice")
    print(f"Stored {len(result['results'])} memories for alice")
    
    # Retrieve relevant memories
    query = "What programming languages does Alice like?"
    memories = m.search(query, user_id="alice")
    
    # Display retrieved memories
    for mem in memories["results"]:
        print(f"Memory: {mem['memory']}")
        print(f"Score: {mem['score']}")
        print(f"Created: {mem['created_at']}")
        print("---")
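
    Retrieved memories are only useful once they reach the model. The sketch below shows one plausible way to fold them into a prompt. The helper name build_prompt and the prompt wording are assumptions for illustration, and the sample data is hand-written rather than real search output; only the "memory" key used above is relied on.

```python
def build_prompt(question, memories, max_memories=5):
    """Prepend up to max_memories remembered facts to a user question."""
    facts = [m["memory"] for m in memories[:max_memories]]
    if not facts:
        return f"User: {question}"
    bullet_list = "\n".join(f"- {fact}" for fact in facts)
    return f"Known facts about the user:\n{bullet_list}\n\nUser: {question}"


# Hand-written stand-ins for search results; only the "memory" key is used.
sample = [{"memory": "Loves Python"}, {"memory": "Prefers dark mode"}]
print(build_prompt("Which IDE theme should I use?", sample))
```

    Capping the number of injected facts keeps token overhead predictable as a user's memory grows.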
  6. Step 6

    Custom Configuration - Production Setup

    For production deployments, you'll want to customize Mem0 to use your preferred LLM provider, vector database, and embedding model. Mem0's configuration system is flexible and supports all major providers.

    Configuration Structure:

    The config dictionary has several main sections:

    • llm - Language model for fact extraction and updates
    • embedder - Embedding model for vector generation
    • vector_store - Vector database for memory storage
    • history_db - Database for operation history

    Common Production Configurations:

    1. OpenAI + Pinecone - Fully managed, zero-ops
    2. Anthropic + Qdrant - Claude for extraction, self-hosted vectors
    3. Ollama + ChromaDB - Fully local, no external APIs
    4. Azure OpenAI + pgvector - Enterprise PostgreSQL setup
    from mem0 import Memory
    
    # Example 1: Custom OpenAI + Qdrant (self-hosted on localhost)
    config = {
        "llm": {
            "provider": "openai",
            "config": {
                "model": "gpt-4o-mini",
                "temperature": 0.1,
                "max_tokens": 2000,
            }
        },
        "embedder": {
            "provider": "openai",
            "config": {
                "model": "text-embedding-3-small",
                "embedding_dims": 1536
            }
        },
        "vector_store": {
            "provider": "qdrant",
            "config": {
                "host": "localhost",
                "port": 6333,
                "collection_name": "memories",
                # Must match the embedder's embedding_dims above
                "embedding_model_dims": 1536
            }
        }
    }
    
    m = Memory.from_config(config)
    
    # Example 2: Anthropic Claude + Pinecone
    config_claude = {
        "llm": {
            "provider": "anthropic",
            "config": {
                "model": "claude-sonnet-4-6",
                "temperature": 0.0,
                "max_tokens": 4000,
            }
        },
        "vector_store": {
            "provider": "pinecone",
            "config": {
                "api_key": "your-pinecone-api-key",
                "index_name": "mem0-memories",
                "environment": "us-west1-gcp"
            }
        }
    }
    
    # Example 3: Fully local with Ollama + ChromaDB
    config_local = {
        "llm": {
            "provider": "ollama",
            "config": {
                "model": "llama3.2",
                "ollama_base_url": "http://localhost:11434"
            }
        },
        "embedder": {
            "provider": "ollama",
            "config": {
                "model": "nomic-embed-text"
            }
        },
        "vector_store": {
            "provider": "chroma",
            "config": {
                "collection_name": "memories",
                "path": "./chroma_db"
            }
        }
    }
    ⚠ Heads up: Never commit API keys to version control. Use environment variables or a secrets manager. For production, enable authentication and encrypt data at rest.
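
    One way to follow the advice above is to build the config at runtime from environment variables instead of hard-coding keys. This is a sketch: the variable name PINECONE_API_KEY and the helper name are assumptions, and the config keys simply mirror the placeholder values in Example 2.

```python
import os


def pinecone_config_from_env(env=os.environ):
    """Build a vector_store section with the API key read from the environment."""
    api_key = env.get("PINECONE_API_KEY")  # assumed variable name
    if not api_key:
        raise RuntimeError("PINECONE_API_KEY is not set")
    return {
        "provider": "pinecone",
        "config": {
            "api_key": api_key,
            "index_name": "mem0-memories",  # same placeholder as Example 2
        },
    }
```

    Failing fast with a clear error is usually preferable to passing an empty key through to the provider and debugging an authentication failure later.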
  7. Step 7

    Setting Up Vector Databases

    Mem0 supports 20+ vector databases. Here's how to set up the most popular options:

    Qdrant (Default - Recommended for Self-Hosted):

    Qdrant is the default and offers excellent performance with Docker deployment.

    Pinecone (Managed Cloud Service):

    Zero-ops vector database with automatic scaling.

    pgvector (PostgreSQL Extension):

    Ideal if you're already using PostgreSQL.

    ChromaDB (Simple Local Option):

    Perfect for development and small-scale deployments.

    Weaviate, Milvus, Redis, Supabase:

    All supported with similar configuration patterns. Check the official Mem0 docs for specific configuration examples.

    # Qdrant - Docker setup
    docker run -d -p 6333:6333 -p 6334:6334 \
      -v $(pwd)/qdrant_storage:/qdrant/storage \
      qdrant/qdrant
    
    # Verify Qdrant is running
    curl http://localhost:6333/
    
    # pgvector - Enable in PostgreSQL
    # First, install the extension (Ubuntu/Debian):
    sudo apt-get install postgresql-16-pgvector
    
    # Then in PostgreSQL:
    # CREATE EXTENSION vector;
    
    # ChromaDB - No setup needed, runs in-process
    # Just install: pip install chromadb
    
    # Pinecone - Cloud service, no local setup
    # Sign up at https://www.pinecone.io/
    # Get API key and environment name from dashboard
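
    The curl check above can also be done from Python, for example in an application's startup routine. The sketch below uses only the standard library; service_is_up is a hypothetical helper, and the opener parameter exists only so the function can be exercised without a running server.

```python
import urllib.error
import urllib.request


def service_is_up(url, opener=urllib.request.urlopen, timeout=3.0):
    """Return True if an HTTP GET to url answers with status 200."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, DNS failures, timeouts, and HTTP errors.
        return False
```

    With the Docker container from this step running, service_is_up("http://localhost:6333/") should return True; it returns False rather than raising when the port is closed.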
  8. Step 8

    Core Operations - Add, Search, Update, Delete

    Mem0 provides four core operations for managing memories:

    1. Add Memories: Extract and store facts from conversations or raw text.

    2. Search Memories: Retrieve relevant memories using semantic search with multi-signal ranking.

    3. Update Memories: Modify existing memories when information changes.

    4. Delete Memories: Remove memories by ID or clear all memories for a user.

    Memory Levels:

    You can scope memories to different levels:

    • user_id - Long-term user preferences and facts
    • agent_id - Agent-specific learned behaviors
    • run_id - Temporary session context (this is the parameter the Python SDK uses for sessions)

    Combine multiple IDs for hierarchical memory structures.

    from mem0 import Memory
    
    m = Memory()
    
    # 1. ADD - Store memories from messages
    messages = [
        {"role": "user", "content": "My name is Bob and I work at Acme Corp."},
        {"role": "user", "content": "I prefer email updates over phone calls."}
    ]
    
    result = m.add(messages, user_id="bob")
    print(f"Added memories: {result}")
    
    # ADD - Store from raw text
    m.add("Bob's favorite color is blue.", user_id="bob")
    
    # 2. SEARCH - Retrieve relevant memories
    # Recent Mem0 releases wrap results as {"results": [...]}
    memories = m.search(
        query="What is Bob's communication preference?",
        user_id="bob",
        limit=5  # Top 5 results
    )["results"]
    
    for mem in memories:
        print(f"{mem['memory']} (score: {mem['score']})")
    
    # 3. UPDATE - Modify existing memory
    memory_id = memories[0]['id']
    m.update(
        memory_id=memory_id,
        data="Bob now prefers Slack messages over email."
    )
    
    # GET - Retrieve all memories for a user
    all_memories = m.get_all(user_id="bob")["results"]
    print(f"Total memories for Bob: {len(all_memories)}")
    
    # MULTI-LEVEL - Combine user and session context
    # The Python SDK scopes a session with run_id
    m.add(
        "Currently discussing project budget",
        user_id="bob",
        run_id="meeting-2026-05-27"
    )
    
    # Search with multiple filters
    session_memories = m.search(
        query="project discussion",
        user_id="bob",
        run_id="meeting-2026-05-27"
    )["results"]
    
    # 4. DELETE - Remove specific memory
    m.delete(memory_id=memory_id)
    
    # DELETE - Clear all memories for a user (run last: this is irreversible)
    m.delete_all(user_id="bob")
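
    To illustrate how combining identifiers narrows a scope, here is a small standalone sketch. It does not call Mem0; build_scope and in_scope are hypothetical helpers, and the records are hand-written dictionaries that only mimic the idea of memories tagged with IDs.

```python
def build_scope(user_id=None, agent_id=None, run_id=None):
    """Collect the identifiers that were supplied into a filter dict."""
    candidates = (("user_id", user_id), ("agent_id", agent_id), ("run_id", run_id))
    scope = {key: value for key, value in candidates if value is not None}
    if not scope:
        raise ValueError("Provide at least one of user_id, agent_id, or run_id")
    return scope


def in_scope(record, scope):
    """True when the record carries every identifier required by scope."""
    return all(record.get(key) == value for key, value in scope.items())


records = [
    {"memory": "Prefers email", "user_id": "bob"},
    {"memory": "Discussing budget", "user_id": "bob", "run_id": "meeting-1"},
]
session_scope = build_scope(user_id="bob", run_id="meeting-1")
session_only = [r for r in records if in_scope(r, session_scope)]
print(session_only)
```

    Adding an identifier can only shrink the result set: the user-level scope matches both records, while the user-plus-session scope matches one.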
  9. Step 9

    CLI Tool - Command Line Interface

    Mem0 provides a CLI for quick memory operations without writing Python code. This is useful for testing, debugging, and simple memory management tasks.

    The CLI supports all core operations and can be integrated into shell scripts or automation workflows.

    # Install CLI
    pip install mem0-cli
    
    # Initialize configuration (creates ~/.mem0/config.yaml)
    mem0 init
    
    # Add a memory
    mem0 add "Alice prefers Python over JavaScript" --user-id alice
    
    # Search memories
    mem0 search "programming preferences" --user-id alice
    
    # List all memories for a user
    mem0 list --user-id alice
    
    # Delete a specific memory
    mem0 delete <memory-id>
    
    # Clear all memories for a user
    mem0 clear --user-id alice
    
    # View CLI help
    mem0 --help
    mem0 add --help
  10. Step 10

    Integration with AI Frameworks

    Mem0 integrates seamlessly with popular AI frameworks and platforms:

    LangGraph Integration:

    Use Mem0 as a memory layer for LangGraph customer support bots and autonomous agents.

    CrewAI Integration:

    Give CrewAI agents persistent memory across tasks and conversations.

    Vercel AI SDK:

    Add memory to Next.js AI applications.

    ChatGPT Integration:

    Mem0 provides a browser extension for Chrome that stores memories across ChatGPT, Claude, and Perplexity conversations.

    AI Coding Assistants:

    Integrate with Claude Code, Cursor, Windsurf, and other coding assistants for project-specific context retention.

    # LangGraph Integration Example
    # Assumes the langchain-openai package is installed; any LangChain chat model works
    from langchain_openai import ChatOpenAI
    from langgraph.graph import StateGraph
    from mem0 import Memory
    
    memory = Memory()
    llm = ChatOpenAI(model="gpt-4o-mini")
    
    def chatbot_node(state):
        user_id = state["user_id"]
        message = state["message"]
        
        # Retrieve relevant memories (recent Mem0 releases return {"results": [...]})
        memories = memory.search(message, user_id=user_id, limit=3)["results"]
        context = "\n".join(m['memory'] for m in memories)
        
        # Add context to LLM prompt; invoke() returns a message object, so read .content
        response = llm.invoke(f"Context: {context}\n\nUser: {message}").content
        
        # Store new interaction
        memory.add([
            {"role": "user", "content": message},
            {"role": "assistant", "content": response}
        ], user_id=user_id)
        
        return {"response": response}
    
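    # CrewAI Integration Example (schematic: how Mem0 is attached varies by CrewAI version, so check the CrewAI docs for the current parameters)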
    from crewai import Agent, Task, Crew
    from mem0 import Memory
    
    memory = Memory()
    
    researcher = Agent(
        role="Researcher",
        goal="Research topics thoroughly",
        backstory="Expert researcher with access to memory",
        memory=memory  # Attach Mem0 to agent
    )
    
    task = Task(
        description="Research AI agent memory systems",
        agent=researcher,
        user_id="research-session-001"  # Memory scoping
    )
    
    crew = Crew(agents=[researcher], tasks=[task])
    result = crew.kickoff()
  11. Step 11

    Memory Management Best Practices

    To get the most out of Mem0 in production, follow these best practices:

    1. Memory Scoping Strategy:

    • Use user_id for long-term user preferences and facts
    • Use run_id for temporary conversation context
    • Use agent_id for system-level learned patterns
    • Combine IDs for hierarchical organization

    2. Fact Granularity:

    • Let Mem0's LLM extract atomic facts automatically
    • One fact per memory yields better retrieval than compound statements
    • Example: "Alice likes Python" + "Alice works at Acme" is better than "Alice likes Python and works at Acme"

    3. Retrieval Optimization:

    • Use specific queries for better semantic matching
    • Adjust limit parameter based on context window size
    • Consider reranking for critical applications

    4. Performance & Scalability:

    • Enable asynchronous writes for non-blocking operations
    • Use metadata filtering to scope searches
    • Monitor token usage and adjust extraction prompts if needed
    • Plan for ~6,000-7,000 tokens per operation in production

    5. Privacy & Compliance:

    • Implement user consent before storing memories
    • Provide clear deletion mechanisms
    • Use metadata to track retention policies
    • Consider encryption for sensitive data

    6. Memory Lifecycle:

    • Regularly audit memory quality and relevance
    • Implement staleness detection for time-sensitive facts
    • Archive old sessions to prevent context pollution
    • Version memories when facts change (e.g., job changes)
    # Hierarchical memory scoping
    memory = Memory()
    
    # Long-term user facts
    memory.add(
        "Alice is a senior engineer at Acme Corp",
        user_id="alice",
        metadata={"category": "profile", "retention_days": 365}
    )
    
    # Session-specific context (the Python SDK scopes sessions with run_id)
    memory.add(
        "Currently debugging authentication issue",
        user_id="alice",
        run_id="support-ticket-1234",
        metadata={"category": "session", "retention_days": 30}
    )
    
    # Agent-level patterns
    memory.add(
        "Users from this domain prefer technical explanations",
        agent_id="support-bot-v2",
        metadata={"category": "pattern", "learned_from": "alice"}
    )
    
    # Scoped retrieval with metadata filtering
    memories = memory.search(
        query="authentication",
        user_id="alice",
        run_id="support-ticket-1234",
        filters={"category": "session"}  # Only session memories
    )
    
    # Memory cleanup based on metadata
    import datetime
    
    def cleanup_old_sessions(user_id):
        # get_all needs a scope; recent releases return {"results": [...]}
        all_memories = memory.get_all(user_id=user_id)["results"]
        for mem in all_memories:
            metadata = mem.get('metadata') or {}
            if metadata.get('category') == 'session':
                created = datetime.datetime.fromisoformat(mem['created_at'])
                # Use the timestamp's own timezone so aware and naive values are never mixed
                age_days = (datetime.datetime.now(created.tzinfo) - created).days
                if age_days > metadata.get('retention_days', 30):
                    memory.delete(mem['id'])
                    print(f"Deleted expired session memory: {mem['id']}")
    
    cleanup_old_sessions("alice")
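
    The "asynchronous writes" advice under Performance & Scalability can be approximated with a thread pool when the write call itself is blocking. The sketch below is a generic pattern, not a Mem0 feature: BackgroundWriter is a hypothetical class, and a lambda that appends to a list stands in for a real write such as Memory.add.

```python
from concurrent.futures import ThreadPoolExecutor


class BackgroundWriter:
    """Run a blocking add function on worker threads so callers do not wait."""

    def __init__(self, add_fn, max_workers=2):
        self._add_fn = add_fn
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._pending = []

    def submit(self, *args, **kwargs):
        """Queue one write and return its Future immediately."""
        future = self._pool.submit(self._add_fn, *args, **kwargs)
        self._pending.append(future)
        return future

    def flush(self):
        """Wait for all queued writes; re-raises the first failure, if any."""
        results = [future.result() for future in self._pending]
        self._pending.clear()
        return results

    def close(self):
        """Finish outstanding writes, then stop the worker threads."""
        self.flush()
        self._pool.shutdown(wait=True)


# Stand-in for a real write; it just records its arguments.
stored = []
writer = BackgroundWriter(lambda text, user_id: stored.append((user_id, text)))
writer.submit("Prefers dark mode", user_id="alice")
writer.submit("Works at Acme Corp", user_id="alice")
writer.close()
print(sorted(stored))
```

    Calling flush() or close() before shutdown matters: without it, a failed write would go unnoticed and queued memories could be lost when the process exits.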
  12. Step 12

    Self-Hosted Deployment with Docker

    For production workloads, deploy Mem0 as a self-hosted server with authentication and persistent storage. This setup includes:

    • Docker containers for Mem0 server and vector database
    • Authentication enabled by default
    • Persistent volumes for data retention
    • API access via HTTP endpoints

    The self-hosted server provides the same API as the cloud platform but runs entirely on your infrastructure.

    # docker-compose.yml for Mem0 self-hosted deployment
    version: '3.8'
    
    services:
      qdrant:
        image: qdrant/qdrant:latest
        ports:
          - "6333:6333"
          - "6334:6334"
        volumes:
          - qdrant_storage:/qdrant/storage
        restart: unless-stopped
    
      mem0-server:
        image: mem0ai/mem0:latest
        ports:
          - "8000:8000"
        environment:
          - OPENAI_API_KEY=${OPENAI_API_KEY}
          - QDRANT_HOST=qdrant
          - QDRANT_PORT=6333
          - AUTH_ENABLED=true
          - SECRET_KEY=${SECRET_KEY}  # Generate with: openssl rand -base64 32
        depends_on:
          - qdrant
        restart: unless-stopped
    
      postgres:
        image: postgres:16-alpine
        environment:
          - POSTGRES_USER=mem0
          - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
          - POSTGRES_DB=mem0_history
        volumes:
          - postgres_data:/var/lib/postgresql/data
        restart: unless-stopped
    
    volumes:
      qdrant_storage:
      postgres_data:
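
    The compose file interpolates OPENAI_API_KEY, SECRET_KEY, and POSTGRES_PASSWORD from the environment. As a convenience, the sketch below renders matching .env content with freshly generated secrets, as a Python alternative to the openssl command in the comment above. The helper name render_env_file is hypothetical.

```python
import secrets


def render_env_file(openai_api_key):
    """Return .env text for the variables the compose file interpolates."""
    lines = [
        f"OPENAI_API_KEY={openai_api_key}",
        f"SECRET_KEY={secrets.token_urlsafe(32)}",
        f"POSTGRES_PASSWORD={secrets.token_urlsafe(24)}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Redirect to a file that is listed in .gitignore, e.g.: python make_env.py > .env
    print(render_env_file("your-openai-api-key"), end="")
```

    token_urlsafe produces only letters, digits, hyphens, and underscores, so the values need no quoting in a .env file.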
  13. Step 13

    Monitoring & Performance Optimization

    Monitor your Mem0 deployment to ensure optimal performance and catch issues early:

    Key Metrics to Track:

    • Token Usage: ~6,000-7,000 tokens per add operation (LLM extraction + deduplication)
    • Latency: p50 should be 0.88-1.09s for retrieval
    • Memory Count: Track growth rate and storage usage
    • Retrieval Quality: Monitor relevance scores and user feedback
    • API Errors: Track failed operations and retry patterns

    Performance Optimization:

    1. Reduce Token Usage: Use smaller LLMs for extraction (e.g., GPT-4o-mini vs GPT-4)
    2. Improve Latency: Enable caching, use local embeddings (FastEmbed), implement async writes
    3. Scale Vector Search: Use managed vector DBs (Pinecone, Weaviate Cloud) or scale Qdrant clusters
    4. Memory Quality: Regularly audit extracted facts, tune extraction prompts

    Benchmarks (v3.0 Algorithm):

    • LoCoMo benchmark: 91.6 score
    • LongMemEval: 94.8 score
    • Temporal reasoning improvement: +29.6 points
    • Multi-hop task improvement: +23.1 points
    # Performance monitoring example
    import time
    import logging
    from mem0 import Memory
    
    logging.basicConfig(level=logging.INFO)
    logger = logging.getLogger(__name__)
    
    memory = Memory()
    
    def monitor_operation(operation_name, func, *args, **kwargs):
        """Wrapper to monitor Mem0 operations"""
        start = time.time()
        try:
            result = func(*args, **kwargs)
            duration = time.time() - start
            logger.info(f"{operation_name} completed in {duration:.2f}s")
            
            # Log to metrics system (Prometheus, DataDog, etc.)
            # metrics.record_latency(operation_name, duration)
            
            return result
        except Exception as e:
            duration = time.time() - start
            logger.error(f"{operation_name} failed after {duration:.2f}s: {e}")
            # metrics.record_error(operation_name, str(e))
            raise
    
    # Monitor add operation
    messages = [{"role": "user", "content": "I prefer dark mode"}]
    result = monitor_operation(
        "memory_add",
        memory.add,
        messages,
        user_id="alice"
    )
    
    # Monitor search operation
    # Recent Mem0 releases return {"results": [...]}; keep only the list
    memories = monitor_operation(
        "memory_search",
        memory.search,
        "preferences",
        user_id="alice"
    )["results"]
    
    # Analyze retrieval quality
    def analyze_retrieval_quality(memories, threshold=0.7):
        """Check if retrieved memories meet quality threshold"""
        if not memories:
            logger.warning("No memories retrieved")
            return False
        
        avg_score = sum(m['score'] for m in memories) / len(memories)
        logger.info(f"Average relevance score: {avg_score:.3f}")
        
        if avg_score < threshold:
            logger.warning(f"Low relevance score: {avg_score:.3f}")
            return False
        
        return True
    
    analyze_retrieval_quality(memories)
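
    To compare measured latency against a p50 target like the one listed under Key Metrics, the recorded durations need to be summarized as percentiles. A minimal sketch with the standard library follows; latency_summary is a hypothetical helper and the sample durations are made up.

```python
import statistics


def latency_summary(durations):
    """Return p50 and p95 (in seconds) for a list of recorded durations."""
    if len(durations) < 2:
        raise ValueError("need at least two samples")
    # quantiles(n=100) returns the 99 cut points p1..p99.
    cuts = statistics.quantiles(durations, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94]}


samples = [0.82, 0.91, 0.95, 1.02, 1.10, 1.35, 2.40]
summary = latency_summary(samples)
print(f"p50={summary['p50']:.2f}s p95={summary['p95']:.2f}s")
```

    Tracking p95 alongside p50 is worthwhile because a single slow outlier (2.40s here) barely moves the median but dominates the tail.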
  14. Step 14

    Troubleshooting Common Issues

    Here are solutions to common problems when working with Mem0:

    Issue 1: "No memories retrieved" / Low recall

    • Check that memories were added with correct user_id
    • Verify vector database is running and accessible
    • Try broader search queries
    • Check embedding dimensions match between add and search

    Issue 2: High latency / Slow responses

    • Enable async writes if blocking on add operations
    • Use local embedding models (FastEmbed) instead of API calls
    • Reduce limit parameter in search
    • Check vector database performance and indexing

    Issue 3: Memory staleness / Outdated facts

    • Implement regular memory audits
    • Use timestamps to filter recent memories
    • Update changed facts explicitly with update() method
    • Consider archiving old memories to separate storage

    Issue 4: High token usage / API costs

    • Use smaller LLMs for extraction (GPT-4o-mini, Claude Haiku)
    • Implement client-side deduplication before adding
    • Batch add operations when possible
    • Consider local LLMs (Ollama) for non-critical extractions

    Issue 5: Vector database connection errors

    • Verify database is running: docker ps or service status
    • Check firewall rules and port accessibility
    • Validate credentials and connection strings
    • Review database logs for specific errors
    # Troubleshooting commands
    
    # Check Qdrant is running and accessible
    curl http://localhost:6333/
    curl http://localhost:6333/collections
    
    # View Qdrant logs
    docker logs <qdrant-container-id>
    
    # Check Mem0 local storage
    ls -lh /tmp/qdrant/
    ls -lh ~/.mem0/
    
    # Test Mem0 installation
    python -c "from mem0 import Memory; m = Memory(); print('OK')"
    
    # Debug Python dependencies
    pip show mem0ai
    pip list | grep -E '(mem0|qdrant|openai|chromadb)'
    
    # Clear local Qdrant data (reset)
    rm -rf /tmp/qdrant/
    
    # Clear Mem0 history database
    rm ~/.mem0/history.db
    
    # Enable debug logging (add these two lines to your Python script, not the shell)
    #   import logging
    #   logging.basicConfig(level=logging.DEBUG)
    
    # Check OpenAI API key (prints only the first 8 characters)
    echo "$OPENAI_API_KEY" | head -c 8; echo
    curl https://api.openai.com/v1/models -H "Authorization: Bearer $OPENAI_API_KEY"
    ⚠ Heads up: Clearing vector database storage will delete all memories permanently. Always backup production data before running destructive operations.
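
    One cause listed under Issue 1, mismatched embedding dimensions, can be caught before any data is written. The sketch below compares the two dimension settings in a config dictionary shaped like Example 1 in the custom configuration step. The helper name is hypothetical and it checks only those two keys.

```python
def embedding_dims_match(config):
    """Compare embedder and vector-store dimensions in a config dict.

    Returns True when both are set and equal, False when both are set and
    differ, and None when either value is missing (nothing to compare).
    """
    embed_dims = config.get("embedder", {}).get("config", {}).get("embedding_dims")
    store_dims = (
        config.get("vector_store", {}).get("config", {}).get("embedding_model_dims")
    )
    if embed_dims is None or store_dims is None:
        return None
    return embed_dims == store_dims


mismatched = {
    "embedder": {"config": {"embedding_dims": 768}},
    "vector_store": {"config": {"embedding_model_dims": 1536}},
}
print(embedding_dims_match(mismatched))
```

    A mismatch typically appears when the embedding model is swapped (for example, to a local model with smaller vectors) while an existing collection still expects the old size.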
  15. Step 15

    Migration from OpenAI Memory or Other Solutions

    If you're migrating from OpenAI's built-in memory, LangChain memory, or other solutions, here's how to transition to Mem0:

    From OpenAI Memory:

    OpenAI's built-in memory (the memory feature in ChatGPT) stores unstructured conversation context. According to the project's published comparisons, Mem0 offers:

    • 26% better accuracy on memory benchmarks
    • 90% lower token usage
    • Multi-signal retrieval (semantic + keyword + entity)
    • Self-hosted options for data control

    Migration Steps:

    1. Export existing memories (if possible)
    2. Reformat to Mem0 conversation format
    3. Bulk import using add() method
    4. Verify retrieval quality
    5. Update application code to use Mem0 API

    From LangChain Memory:

    LangChain's memory is tied to chain/agent execution. Mem0 provides:

    • Standalone memory layer (framework-agnostic)
    • Better scaling for multi-user applications
    • Advanced retrieval with hybrid search

    Mem0 can be used alongside LangChain or as a replacement.

    # Migration example: Bulk import from existing system
    from mem0 import Memory
    import json
    
    memory = Memory()
    
    # Example: Import from JSON export
    with open('existing_memories.json', 'r') as f:
        data = json.load(f)
    
    # Bulk import with progress tracking
    imported_users = []
    for user_data in data['users']:
        user_id = user_data['id']
        imported_users.append(user_id)
        conversations = user_data['conversations']
        
        for conv in conversations:
            # Convert to Mem0 message format
            messages = [
                {"role": msg['role'], "content": msg['content']}
                for msg in conv['messages']
            ]
            
            try:
                result = memory.add(
                    messages,
                    user_id=user_id,
                    metadata={
                        "imported_from": "openai-memory",
                        "original_timestamp": conv['timestamp']
                    }
                )
                # Recent Mem0 releases return {"results": [...]}
                print(f"Imported {len(result['results'])} memories for user {user_id}")
            except Exception as e:
                print(f"Error importing for {user_id}: {e}")
    
    # Verify migration (get_all and search need a scope such as user_id)
    total_memories = sum(
        len(memory.get_all(user_id=uid)["results"]) for uid in imported_users
    )
    print(f"Total memories after migration: {total_memories}")
    
    # Quality check: Test retrieval for the first imported user
    test_query = "What are user preferences?"
    if imported_users:
        results = memory.search(test_query, user_id=imported_users[0], limit=10)["results"]
        print(f"Retrieval test returned {len(results)} results")
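
    Exports from other systems are rarely clean, so it can help to normalize and de-duplicate conversations before the bulk import (step 2 of the migration steps). The sketch below assumes the same export shape as the example above, with a "messages" list of role/content entries. The helper names and the set of allowed roles are illustrative assumptions.

```python
import hashlib

ALLOWED_ROLES = {"user", "assistant", "system"}  # assumed set of valid roles


def to_messages(conversation):
    """Convert one exported conversation into a list of role/content dicts.

    Entries with an unknown role or empty content are skipped.
    """
    messages = []
    for msg in conversation.get("messages", []):
        role = msg.get("role")
        content = (msg.get("content") or "").strip()
        if role in ALLOWED_ROLES and content:
            messages.append({"role": role, "content": content})
    return messages


def fingerprint(messages):
    """Stable hash of a message list, used to skip conversations already seen."""
    joined = "\n".join(f"{m['role']}:{m['content'].lower()}" for m in messages)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def unique_conversations(conversations):
    """Yield normalized message lists, dropping empty and duplicate ones."""
    seen = set()
    for conv in conversations:
        messages = to_messages(conv)
        key = fingerprint(messages)
        if messages and key not in seen:
            seen.add(key)
            yield messages
```

    Skipping duplicates before calling add() also saves the LLM extraction cost for conversations that would contribute no new facts.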
  16. Step 16

    Advanced Use Cases

    Mem0 enables sophisticated AI applications with persistent memory:

    1. Multi-Agent Collaboration: Share memories across agents for coordinated task execution.

    2. Personalized Customer Support: Remember customer history, preferences, and past issues across all interactions.

    3. Healthcare Patient Tracking: Maintain long-term patient preferences, medical history, and care plans, hosted in an environment that meets HIPAA requirements.

    4. Educational Adaptive Learning: Track student progress, learning style, and knowledge gaps to personalize instruction.

    5. Smart Home Assistants: Learn household routines, preferences, and patterns over time.

    6. Code Review Assistants: Remember project coding standards, past discussions, and team preferences.

    7. Research Assistants: Accumulate domain knowledge across research sessions and papers.

    The key is combining user-level, session-level, and agent-level memories to create rich, context-aware experiences.

    # Advanced use case: Multi-agent customer support system
    from mem0 import Memory
    
    class CustomerSupportSystem:
        def __init__(self):
            self.memory = Memory()
        
        def handle_customer_query(self, customer_id, query, agent_id):
            """
            Multi-level memory retrieval:
            - Customer history (user_id)
            - Current session context (session_id)
            - Agent best practices (agent_id)
            """
            # Retrieve customer history
            # Recent Mem0 releases return {"results": [...]}
            customer_memories = self.memory.search(
                query,
                user_id=customer_id,
                limit=5
            )["results"]
            
            # Retrieve agent-level best practices
            agent_memories = self.memory.search(
                query,
                agent_id=agent_id,
                limit=3
            )["results"]
            
            # Combine context
            context = {
                "customer_history": [m['memory'] for m in customer_memories],
                "agent_knowledge": [m['memory'] for m in agent_memories]
            }
            
            # Generate response with full context
            response = self.generate_response(query, context)
            
            # Store new interaction
            self.memory.add(
                [
                    {"role": "user", "content": query},
                    {"role": "assistant", "content": response}
                ],
                user_id=customer_id,
                metadata={
                    "agent_id": agent_id,
                    "resolved": False,
                    "category": "support"
                }
            )
            
            return response
        
        def learn_from_interaction(self, customer_id, agent_id, outcome):
            """Store learned patterns for future use"""
            if outcome == "successful":
                self.memory.add(
                    "This approach worked well for customer issues",
                    agent_id=agent_id,
                    metadata={"learned_from": customer_id, "outcome": outcome}
                )
        
        def generate_response(self, query, context):
            # Placeholder: replace with a real LLM call that uses the context.
            # Returning a string keeps handle_customer_query runnable as-is.
            return f"(placeholder reply to: {query})"
    
    # Usage
    support = CustomerSupportSystem()
    response = support.handle_customer_query(
        customer_id="customer-123",
        query="My API key isn't working",
        agent_id="support-bot-v2"
    )
    
    support.learn_from_interaction(
        customer_id="customer-123",
        agent_id="support-bot-v2",
        outcome="successful"
    )
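
    When memories come from several levels at once, the same fact can surface in more than one result list. The sketch below merges lists into a single ranked, de-duplicated context. It is an illustration rather than a Mem0 feature: merge_context is a hypothetical helper, the sample scores are made up, and it assumes a higher score means a more relevant memory.

```python
def merge_context(*result_lists, limit=6):
    """Merge several lists of scored memories into one ranked, de-duplicated list.

    Each item is a dict with "memory" (str) and "score" (float) keys. When the
    same text appears more than once, the highest-scoring copy is kept.
    """
    best = {}
    for results in result_lists:
        for item in results:
            text = item["memory"]
            if text not in best or item["score"] > best[text]["score"]:
                best[text] = item
    ranked = sorted(best.values(), key=lambda item: item["score"], reverse=True)
    return ranked[:limit]


customer = [{"memory": "Uses the Pro plan", "score": 0.81},
            {"memory": "Rotated API key last week", "score": 0.92}]
agent = [{"memory": "Ask for the request ID first", "score": 0.64},
         {"memory": "Rotated API key last week", "score": 0.70}]
for item in merge_context(customer, agent, limit=3):
    print(f"{item['score']:.2f}  {item['memory']}")
```

    A single merged list with a hard limit makes the token budget for injected context explicit, regardless of how many memory levels are queried.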
