
RagaAI Catalyst: Agent AI Observability and Evaluation Framework

Comprehensive platform for LLM project management, evaluation, tracing, and monitoring with support for agents, RAG applications, and multi-model AI systems.

  1. Step 1

    What is RagaAI Catalyst?

    RagaAI Catalyst is a comprehensive Python SDK and platform designed for managing, evaluating, and optimizing LLM projects. It provides end-to-end observability for AI applications including agent tracing, multi-agentic system debugging, execution graph visualization, and advanced analytics.

    Key capabilities include:

    • Project Management: Create and organize LLM projects with different use cases
    • Dataset Management: Efficiently manage training and evaluation datasets
    • Evaluation Management: Create experiments and run metrics on RAG applications
    • Trace Management: Record and analyze execution traces of RAG applications
    • Agent Tracing: Track multi-agent system behaviors and interactions
    • Prompt Management: Manage and version prompts for your AI applications
    • Synthetic Data Generation: Generate synthetic data for testing and evaluation
    • Guardrail Management: Implement safety filters to prevent harmful outputs
    • Red-teaming: Comprehensive scans to detect model vulnerabilities and biases
  2. Step 2

    Technology stack

    RagaAI Catalyst is built on a modern Python stack with extensive LLM ecosystem integrations:

    Core Stack:

    • Python 3.10-3.13
    • aiohttp for async HTTP operations
    • Pydantic for data validation
    • Pandas for data processing
    • Rich for terminal UI

    LLM Frameworks:

    • LangChain (core and full framework)
    • LlamaIndex
    • LiteLLM for multi-model support

    Model Providers:

    • OpenAI
    • Google Generative AI
    • Groq
    • Anthropic

    Observability:

    • OpenTelemetry (SDK, OTLP exporter, instrumentation)
    • OpenInference (for all major frameworks: LangChain, LlamaIndex, CrewAI, Haystack, Anthropic, OpenAI, Mistral, etc.)

    Utilities:

    • Requests, tqdm, tiktoken, Jinja2, PyYAML
    Tech Stack:
    ├── Python 3.10-3.13
    ├── Core Libraries
    │   ├── aiohttp>=3.10.2
    │   ├── pydantic
    │   ├── pandas
    │   └── rich>=13.9.4
    ├── LLM Frameworks
    │   ├── langchain-core>=0.2.11
    │   ├── langchain>=0.2.11
    │   ├── llama-index>=0.10.0
    │   └── litellm>=1.51.1
    ├── Model Clients
    │   ├── openai>=1.57.0
    │   ├── google-genai>=1.3.0
    │   └── groq>=0.11.0
    ├── Observability
    │   ├── opentelemetry-sdk
    │   ├── opentelemetry-exporter-otlp
    │   └── openinference-* (all major frameworks)
    └── Utilities
        ├── requests~=2.32.3
        ├── tiktoken>=0.7.0
        └── tqdm>=4.66.5
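
    To see which of the listed packages are present in the current environment, a small standard-library check can help. This is a sketch only: the helper name `installed_versions` is hypothetical, and it reports versions without comparing them to the minimums in the tree above.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # A few names taken from the dependency tree above.
    for name, version in installed_versions(["aiohttp", "pydantic", "litellm"]).items():
        print(f"{name}: {version or 'not installed'}")
```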
  3. Step 3

    Prerequisites

    Before installing RagaAI Catalyst, ensure you have:

    • Python 3.10-3.13 installed
    • pip or uv package manager
    • RagaAI platform credentials (access_key and secret_key)
    # Check Python version (3.10-3.13 required)
    python --version

    # Recommended: Install uv package manager
    curl -LsSf https://astral.sh/uv/install.sh | sh

    # Verify the uv installation (open a new shell first if the command is not found)
    uv --version
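
    The same version requirement can be checked from inside Python, for example at the top of a setup script. A minimal sketch, assuming the 3.10-3.13 range stated in the prerequisites; `is_supported_python` is a hypothetical helper, not part of the SDK.

```python
import sys

# Supported range taken from the prerequisites: 3.10 through 3.13 inclusive.
MIN_VERSION = (3, 10)
MAX_VERSION = (3, 13)


def is_supported_python(version=None):
    """Return True if (major, minor) falls inside the supported range."""
    if version is None:
        version = sys.version_info[:2]
    major_minor = tuple(version[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    if is_supported_python():
        print(f"Python {found} is within the supported range")
    else:
        print(f"Python {found} is outside 3.10-3.13; install a supported release")
```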
  4. Step 4

    Installation

    Install RagaAI Catalyst with pip or uv. The package is published on PyPI as ragaai-catalyst and includes all core dependencies for LLM observability and evaluation.

    # Using pip
    pip install ragaai-catalyst

    # Using uv (recommended)
    uv pip install ragaai-catalyst
  5. Step 5

    Authentication setup

    Before using RagaAI Catalyst, you need to obtain credentials from the RagaAI platform:

    1. Navigate to your profile settings on the RagaAI platform
    2. Select "Authentication" to create your keys
    3. Click "Generate new key" to create access and secret keys
    4. Store these credentials securely

    You can configure credentials via environment variables or a .env file. Keep the .env file out of version control:

    # Option 1: Environment variables
    export RAGAI_ACCESS_KEY="your_access_key"
    export RAGAI_SECRET_KEY="your_secret_key"
    export RAGAI_BASE_URL="https://api.raga.ai"  # or your self-hosted URL
    
    # Option 2: .env file
    # Create a .env file in your project
    echo "RAGAI_ACCESS_KEY=your_access_key" > .env
    echo "RAGAI_SECRET_KEY=your_secret_key" >> .env
    echo "RAGAI_BASE_URL=https://api.raga.ai" >> .env
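
    Exported variables are visible to Python through os.environ, but a .env file has to be read explicitly. The following is a minimal standard-library sketch of both paths, assuming the RAGAI_* variable names used in this guide; `parse_dotenv`, `load_credentials`, and `Credentials` are hypothetical helpers, and the parser handles only simple KEY=VALUE lines.

```python
import os
from dataclasses import dataclass

DEFAULT_BASE_URL = "https://api.raga.ai"  # same default this guide uses


@dataclass(frozen=True)
class Credentials:
    access_key: str
    secret_key: str
    base_url: str


def parse_dotenv(text):
    """Parse simple KEY=VALUE lines; blank lines and '#' comments are skipped."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_credentials(env=None):
    """Read the RAGAI_* variables and fail fast if a key is missing."""
    env = os.environ if env is None else env
    missing = [
        name
        for name in ("RAGAI_ACCESS_KEY", "RAGAI_SECRET_KEY")
        if not env.get(name)
    ]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Credentials(
        access_key=env["RAGAI_ACCESS_KEY"],
        secret_key=env["RAGAI_SECRET_KEY"],
        base_url=env.get("RAGAI_BASE_URL", DEFAULT_BASE_URL),
    )
```

    Passing `parse_dotenv(open(".env").read())` to `load_credentials` covers the .env case; calling `load_credentials()` with no argument reads the exported variables.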
  6. Step 6

    Initializing the SDK

    Import and initialize the RagaAI Catalyst SDK with your credentials. The SDK provides a unified interface for all observability features.

    from ragaai_catalyst import RagaAICatalyst
    import os
    
    # Option 1: Pass credentials directly
    catalyst = RagaAICatalyst(
        access_key="YOUR_ACCESS_KEY",
        secret_key="YOUR_SECRET_KEY",
        base_url="https://api.raga.ai"
    )
    
    # Option 2: Use environment variables
    catalyst = RagaAICatalyst(
        access_key=os.getenv("RAGAI_ACCESS_KEY"),
        secret_key=os.getenv("RAGAI_SECRET_KEY"),
        base_url=os.getenv("RAGAI_BASE_URL", "https://api.raga.ai")
    )
    
    # Confirm that initialization completed (this print does not itself test the connection)
    print("Initialized RagaAI Catalyst client")
  7. Step 7

    Project management

    Create and manage LLM projects. Projects organize your datasets, evaluations, and traces under a single namespace.

    from ragaai_catalyst import RagaAICatalyst

    # Initialize catalyst
    catalyst = RagaAICatalyst(
        access_key="YOUR_ACCESS_KEY",
        secret_key="YOUR_SECRET_KEY"
    )

    # Create a new project
    project = catalyst.create_project(
        project_name="My-RAG-Application",
        usecase="Chatbot"
    )
    print(f"Created project: {project}")

    # List available use cases
    use_cases = catalyst.project_use_cases()
    print(f"Use cases: {use_cases}")

    # List all your projects
    projects = catalyst.list_projects()
    for p in projects:
        print(f"- {p}")
  8. Step 8

    Dataset management

    Manage datasets efficiently for evaluation. Upload CSV files and map columns to the RAG schema (query, response, context, etc.).

    from ragaai_catalyst import Dataset

    # Initialize dataset management for a specific project
    # (assumes RagaAICatalyst(...) was initialized earlier in the session, as in Step 6)
    dataset_manager = Dataset(project_name="My-RAG-Application")

    # List existing datasets
    datasets = dataset_manager.list_datasets()
    print(f"Existing Datasets: {datasets}")

    # Get the schema fields the project accepts; mapping values must come from this list
    schema = dataset_manager.get_schema_mapping()
    print(f"Schema: {schema}")

    # Create a dataset from CSV, mapping your columns to schema fields
    dataset_manager.create_from_csv(
        csv_path="path/to/your/data.csv",
        dataset_name="MyDataset",
        schema_mapping={
            "question": "query",      # Your column -> RAG schema field
            "answer": "response",     # Map to response field
            "source_text": "context"  # Map to context field
        }
    )
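
    Before uploading, it can be worth checking locally that every column named in the mapping actually exists in the CSV. A minimal sketch using only the standard library; `remap_columns` is a hypothetical helper, and the column and field names are the illustrative ones from this step.

```python
import csv
import io


def remap_columns(csv_text, mapping):
    """Rename CSV columns according to {source_column: schema_field}."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [col for col in mapping if col not in header]
    if missing:
        raise ValueError(f"CSV is missing mapped columns: {missing}")
    # Keep only the mapped columns, under their schema field names.
    return [
        {field: row[col] for col, field in mapping.items()}
        for row in reader
    ]


sample = (
    "question,answer,source_text\n"
    "What is RAG?,Retrieval-augmented generation,RAG combines retrieval with generation\n"
)
mapping = {"question": "query", "answer": "response", "source_text": "context"}
rows = remap_columns(sample, mapping)
print(rows[0]["query"])
```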
  9. Step 9

    Evaluation management

    Create and run evaluations to measure your RAG application performance. RagaAI supports multiple metrics for evaluating retrieval quality, response accuracy, and more.

    Available metrics include:

    • Faithfulness (does the answer follow from context)
    • Context relevance (is the retrieved context relevant)
    • Answer relevance (is the answer relevant to the question)
    • Similarity metrics
    • Custom metrics
    from ragaai_catalyst import Evaluation

    # Create an evaluation experiment
    # (assumes RagaAICatalyst(...) was initialized earlier in the session, as in Step 6)
    evaluation = Evaluation(
        project_name="My-RAG-Application",
        dataset_name="MyDataset"
    )

    # List available metrics
    available_metrics = evaluation.list_metrics()
    print(f"Available metrics: {available_metrics}")

    # Map your dataset columns (keys) to schema fields (values)
    schema_mapping = {
        "question": "prompt",
        "answer": "response",
        "source_text": "context",
        "ground_truth": "expected_response"  # (Optional) ground truth answers
    }

    # Add metrics; each entry names the metric, the judge model, and the result column
    evaluation.add_metrics(
        metrics=[
            {
                "name": "Faithfulness",
                "config": {"model": "gpt-4o-mini", "provider": "openai", "threshold": {"gte": 0.5}},
                "column_name": "Faithfulness_v1",
                "schema_mapping": schema_mapping
            }
        ]
    )

    # Check progress, then fetch the results
    status = evaluation.get_status()
    print(f"Experiment status: {status}")
    results = evaluation.get_results()
    print(f"Evaluation Results: {results}")
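
    To build intuition for what a faithfulness-style metric measures, here is a deliberately crude lexical proxy: the fraction of response tokens that also appear in the context. This is an illustration only, not how RagaAI computes Faithfulness (that calculation is not described in this guide); `tokenize` and `overlap_score` are hypothetical names.

```python
import re


def tokenize(text):
    """Lowercase word tokens as a set; punctuation is dropped."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(response, context):
    """Fraction of distinct response tokens that also occur in the context."""
    response_tokens = tokenize(response)
    if not response_tokens:
        return 0.0
    return len(response_tokens & tokenize(context)) / len(response_tokens)


context = "Paris is the capital and largest city of France"
supported = overlap_score("The capital of France is Paris", context)
unsupported = overlap_score("Berlin hosts the parliament", context)
print(supported, unsupported)
```

    A token-overlap score rewards copying and misses paraphrase, which is one reason model-based metrics are used instead.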
  10. Step 10

    Trace management

    Record and analyze traces of your RAG application execution. Traces provide visibility into the step-by-step flow of your AI application, helping you debug and optimize performance.

    from ragaai_catalyst import TraceManager
    
    # Initialize trace manager
    tm = TraceManager(
        project_name="My-RAG-Application",
        access_key="YOUR_ACCESS_KEY",
        secret_key="YOUR_SECRET_KEY"
    )
    
    # Record a trace for a request
    tm.record_trace(
        input_query="What is the capital of France?",
        output_response="The capital of France is Paris.",
        context=["Paris is the capital and largest city of France"],
        metadata={
            "model": "gpt-4",
            "latency_ms": 245,
            "tokens_used": 42
        }
    )
    
    # List all traces
    traces = tm.list_traces()
    for trace in traces:
        print(f"Trace ID: {trace.id}, Input: {trace.input[:50]}...")
    
    # Get a specific trace
    t = tm.get_trace(trace_id="trace-uuid-here")
    print(f"Trace details: {t}")
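
    Conceptually, a trace is a record of input, retrieved context, output, and metadata such as latency. The sketch below captures that shape with the standard library and measures latency with a context manager. `TraceRecord` and `traced` are hypothetical names chosen for illustration; nothing here is sent to the RagaAI platform.

```python
import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class TraceRecord:
    input_query: str
    trace_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    output_response: str = ""
    context: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)


@contextmanager
def traced(input_query, store):
    """Time the wrapped block and append the finished record to `store`."""
    record = TraceRecord(input_query=input_query)
    start = time.perf_counter()
    try:
        yield record
    finally:
        # Runs on success and on error, so failed requests are recorded too.
        elapsed_ms = (time.perf_counter() - start) * 1000
        record.metadata["latency_ms"] = round(elapsed_ms, 3)
        store.append(record)


traces = []
with traced("What is the capital of France?", traces) as rec:
    rec.context.append("Paris is the capital and largest city of France")
    rec.output_response = "The capital of France is Paris."

print(traces[0].trace_id, traces[0].metadata)
```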
  11. Step 11

    Agent tracing (multi-agent systems)

    Track multi-agent system behaviors and interactions. RagaAI Catalyst provides specialized tracing for agent-based workflows, including tool usage, agent handoffs, and decision-making processes.

    from ragaai_catalyst import AgentTracer
    
    # Initialize agent tracer
    tracer = AgentTracer(
        project_name="Multi-Agent-System",
        access_key="YOUR_ACCESS_KEY",
        secret_key="YOUR_SECRET_KEY"
    )
    
    # Trace an agent interaction
    with tracer.trace_agent(agent_name="researcher") as trace:
        # Your agent code here; `agent` is your own agent object, defined elsewhere
        result = agent.run("Research quantum computing")
        
        # Record agent-specific metadata
        trace.record_tool_call(
            tool_name="search_engine",
            input={"query": "quantum computing"},
            output="Search results..."
        )
        trace.record_decision(
            decision="Continue research",
            reasoning="More information needed",
            alternatives_explored=["summarize", "ask_user"]
        )
    
    # View agent execution graph
    executor_graph = tracer.get_execution_graph(trace.id)
    print(f"Execution graph: {executor_graph}")
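
    An execution graph can be thought of as spans linked by parent IDs: an agent span at the root, with tool calls and decisions as children. The sketch below builds and prints such a tree. The `Span` shape and the `build_graph` and `render` helpers are assumptions made for illustration, not the SDK's data model.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    span_id: str
    name: str
    kind: str  # "agent", "tool", or "decision"
    parent_id: Optional[str] = None


def build_graph(spans):
    """Return {parent_id: [child_id, ...]}, preserving span order."""
    children = defaultdict(list)
    for span in spans:
        if span.parent_id is not None:
            children[span.parent_id].append(span.span_id)
    return dict(children)


def render(spans, graph, root_id, depth=0):
    """Produce an indented text outline of the tree rooted at root_id."""
    by_id = {span.span_id: span for span in spans}
    node = by_id[root_id]
    lines = ["  " * depth + f"{node.kind}: {node.name}"]
    for child_id in graph.get(root_id, []):
        lines.extend(render(spans, graph, child_id, depth + 1))
    return lines


spans = [
    Span("a1", "researcher", "agent"),
    Span("t1", "search_engine", "tool", parent_id="a1"),
    Span("d1", "Continue research", "decision", parent_id="a1"),
]
graph = build_graph(spans)
print("\n".join(render(spans, graph, "a1")))
```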
  12. Step 12

    Prompt management

    Manage and version prompts for your AI applications. Store, version, and retrieve prompts efficiently for consistent AI behavior.

    from ragaai_catalyst import PromptManager
    
    # Initialize prompt manager
    pm = PromptManager(
        project_name="My-RAG-Application",
        access_key="YOUR_ACCESS_KEY",
        secret_key="YOUR_SECRET_KEY"
    )
    
    # Create and store a prompt
    prompt_id = pm.create_prompt(
        name="rag-system-prompt",
        template="""You are a helpful AI assistant. Use the following context to answer the question.
    
    Context: {context}
    Question: {question}
    
    Answer:""",
        version="1.0",
        variables=["context", "question"]
    )
    print(f"Created prompt with ID: {prompt_id}")
    
    # List all prompts
    prompts = pm.list_prompts()
    for p in prompts:
        print(f"- {p.name} (v{p.version}): {p.template[:50]}...")
    
    # Get a specific prompt
    prompt = pm.get_prompt(prompt_id)
    rendered = prompt.render(context="Some context", question="What is X?")
    print(f"Rendered: {rendered}")
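
    A common source of prompt bugs is a mismatch between the declared variables and the placeholders actually present in the template. The following sketch checks that locally with Python's string.Formatter before rendering. `template_variables` and `render_prompt` are hypothetical helpers, and the template uses the same {context}/{question} placeholders as the example above.

```python
from string import Formatter


def template_variables(template):
    """Return the named placeholders that appear in a str.format template."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def render_prompt(template, declared, **values):
    """Render only if declared variables, placeholders, and values all agree."""
    found = template_variables(template)
    if found != set(declared):
        raise ValueError(
            f"Declared {sorted(declared)} but template uses {sorted(found)}"
        )
    missing = found - set(values)
    if missing:
        raise KeyError(f"Missing values for: {sorted(missing)}")
    return template.format(**values)


TEMPLATE = "Context: {context}\nQuestion: {question}\n\nAnswer:"
rendered = render_prompt(
    TEMPLATE,
    declared=["context", "question"],
    context="Some context",
    question="What is X?",
)
print(rendered)
```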
  13. Step 13

    Synthetic data generation

    Generate synthetic data for testing and evaluation. Create diverse test cases without collecting real user data.

    from ragaai_catalyst import SyntheticDataGenerator
    
    # Initialize synthetic data generator
    generator = SyntheticDataGenerator(
        project_name="My-RAG-Application",
        access_key="YOUR_ACCESS_KEY",
        secret_key="YOUR_SECRET_KEY"
    )
    
    # Generate synthetic Q&A pairs
    data = generator.generate(
        num_samples=100,
        domain="technology",
        include_context=True,
        difficulty_levels=["easy", "medium", "hard"]
    )
    
    # Export to CSV
    generator.export_to_csv(
        data=data,
        output_path="synthetic_data.csv",
        schema={
            "query": "question",
            "response": "answer",
            "context": "context_passages"
        }
    )
    
    print(f"Generated {len(data)} synthetic samples")
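
    For quick local tests that need no API calls, template-based generation with a fixed random seed gives reproducible rows in the same question/answer/context_passages layout as the export above. This is a sketch under stated assumptions: the facts, question forms, and helper names are invented for illustration, and the output is far less varied than LLM-generated data.

```python
import csv
import io
import random

FACTS = [
    ("HTTP", "a protocol for transferring hypermedia documents"),
    ("DNS", "a system that maps domain names to IP addresses"),
    ("TLS", "a protocol that encrypts traffic between two endpoints"),
]
QUESTION_FORMS = [
    "What is {term}?",
    "Explain {term} in one sentence.",
    "How would you define {term}?",
]
COLUMNS = ["question", "answer", "context_passages"]


def generate_samples(num_samples, seed=0):
    """Build Q&A rows from templates; the same seed yields the same rows."""
    rng = random.Random(seed)
    samples = []
    for _ in range(num_samples):
        term, definition = rng.choice(FACTS)
        samples.append({
            "question": rng.choice(QUESTION_FORMS).format(term=term),
            "answer": f"{term} is {definition}.",
            "context_passages": f"{term}: {definition}.",
        })
    return samples


def to_csv(samples):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(samples)
    return buffer.getvalue()


data = generate_samples(5, seed=42)
print(to_csv(data))
```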
  14. Step 14

    Guardrail management

    Implement safety filters (guardrails) to prevent harmful, biased, or inappropriate AI outputs. Configure multiple guardrail types for comprehensive protection.

    Guardrail types include:

    • Toxicity detection
    • PII (personally identifiable information) detection
    • Jailbreak attempt detection
    • Custom regex/keyword filters
    from ragaai_catalyst import GuardrailManager
    
    # Initialize guardrail manager
    guardrail = GuardrailManager(
        project_name="My-RAG-Application",
        access_key="YOUR_ACCESS_KEY",
        secret_key="YOUR_SECRET_KEY"
    )
    
    # Create a toxicity guardrail
    guardrail.create_guardrail(
        name="toxicity-filter",
        type="toxicity",
        threshold=0.7,
        action="block"  # or "flag", "modify"
    )
    
    # Create a PII detection guardrail
    guardrail.create_guardrail(
        name="pii-detector",
        type="pii",
        fields=["email", "phone", "ssn"],
        action="redact"
    )
    
    # Run guardrails on input/output
    def check_with_guardrails(input_text: str, output_text: str):
        # Check input
        input_result = guardrail.check(input_text)
        if input_result.blocked:
            print(f"Input blocked: {input_result.reason}")
            return None
        
        # Check output
        output_result = guardrail.check(output_text)
        if output_result.blocked:
            print(f"Output blocked: {output_result.reason}")
            return None
        elif output_result.redacted:
            print(f"Output redacted: {output_result.redacted_text}")
            return output_result.redacted_text
        
        return output_text
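
    The custom regex/keyword filter type can be approximated locally in a few lines. The sketch below blocks text containing a listed phrase and redacts simple email and SSN patterns. The patterns, the phrase list, and the `CheckResult` shape are assumptions for illustration; real PII and jailbreak detection needs much broader coverage.

```python
import re
from dataclasses import dataclass

# Deliberately simple patterns; they will miss many real-world formats.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}
BLOCKED_PHRASES = ["ignore previous instructions"]


@dataclass
class CheckResult:
    blocked: bool
    redacted: bool
    text: str
    reason: str = ""


def check(text):
    """Block on a listed phrase; otherwise redact any matching PII."""
    lowered = text.lower()
    for phrase in BLOCKED_PHRASES:
        if phrase in lowered:
            return CheckResult(True, False, "", f"matched blocked phrase: {phrase!r}")
    redacted_text = text
    for label, pattern in PII_PATTERNS.items():
        redacted_text = pattern.sub(f"[{label.upper()}]", redacted_text)
    return CheckResult(False, redacted_text != text, redacted_text)


result = check("Contact jane.doe@example.com, SSN 123-45-6789")
print(result.text)
```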
  15. Step 15

    Red-teaming

    Perform comprehensive scans to detect model vulnerabilities, biases, and potential misuse. Red-teaming helps identify security gaps before deployment.

    Red-teaming capabilities:

    • Jailbreak attack simulation
    • Bias detection
    • Adversarial input testing
    • Prompt injection detection
    from ragaai_catalyst import RedTeaming
    
    # Initialize red-teaming module
    redteam = RedTeaming(
        project_name="My-RAG-Application",
        access_key="YOUR_ACCESS_KEY",
        secret_key="YOUR_SECRET_KEY"
    )
    
    # Run a comprehensive vulnerability scan
    scan_result = redteam.run_scan(
        model="gpt-4",
        attack_types=[
            "jailbreak",
            "prompt_injection",
            "bias_detection",
            "pii_extraction"
        ],
        num_attacks=50,
        severity_threshold="medium"
    )
    
    # View scan results
    print(f"Vulnerabilities found: {len(scan_result.vulnerabilities)}")
    for vuln in scan_result.vulnerabilities:
        print(f"- [{vuln.severity}] {vuln.type}: {vuln.description}")
    
    # Get detailed report
    report = redteam.generate_report(scan_result)
    print(f"Report: {report}")
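
    A severity threshold such as "medium" implies an ordering over severity levels. The sketch below shows one way to filter and summarize findings once they are available as plain records. The `Finding` shape and the four-level ordering are assumptions for illustration; the guide does not specify the scan result format beyond the fields printed above.

```python
from collections import Counter
from dataclasses import dataclass

# Assumed ordering, lowest to highest.
SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2, "critical": 3}


@dataclass
class Finding:
    attack_type: str
    severity: str
    description: str


def at_or_above(findings, threshold):
    """Keep findings at or above the threshold, most severe first."""
    floor = SEVERITY_ORDER[threshold]
    kept = [f for f in findings if SEVERITY_ORDER[f.severity] >= floor]
    return sorted(kept, key=lambda f: SEVERITY_ORDER[f.severity], reverse=True)


def summarize(findings):
    """Count findings per attack type."""
    return dict(Counter(f.attack_type for f in findings))


findings = [
    Finding("jailbreak", "high", "Role-play prompt bypassed the refusal"),
    Finding("bias_detection", "low", "Mild skew in occupation examples"),
    Finding("prompt_injection", "medium", "Instruction in retrieved text was followed"),
    Finding("jailbreak", "critical", "System prompt disclosed"),
]
for f in at_or_above(findings, "medium"):
    print(f"[{f.severity}] {f.attack_type}: {f.description}")
```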
  16. Step 16

    Self-hosted deployment

    Deploy RagaAI Catalyst as a self-hosted solution for complete control over your observability infrastructure. The self-hosted version includes a dashboard with timeline and execution graph views.

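    # Shell: run the server with Docker (if an image is available to you)
    docker pull ragaai/catalyst:latest

    # Note: inside the container, "localhost" refers to the container itself,
    # so point POSTGRES_URL at a database host the container can actually reach.
    docker run -d \
      --name ragaai-catalyst \
      -p 8080:8080 \
      -v ragaai-data:/app/data \
      -e POSTGRES_URL="postgresql://user:pass@localhost:5432/ragaai" \
      ragaai/catalyst:latest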
    
    # Access the dashboard at http://localhost:8080
    
    # Python: initialize the SDK against the self-hosted instance
    from ragaai_catalyst import RagaAICatalyst
    
    sdk = RagaAICatalyst(
        access_key="YOUR_ACCESS_KEY",
        secret_key="YOUR_SECRET_KEY",
        base_url="http://your-self-hosted-domain:8080/api"
    )
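
    Database passwords often contain characters such as "@", ":" or "/", which break a connection URL unless they are percent-encoded. A minimal sketch for composing the value passed as POSTGRES_URL above; `build_postgres_url` is a hypothetical helper, and the host name in the example is a placeholder.

```python
from urllib.parse import quote


def build_postgres_url(user, password, host, port, database):
    """Percent-encode the credentials so '@', ':' and '/' stay unambiguous."""
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}"
    )


url = build_postgres_url("catalyst", "p@ss:word/1", "db.internal", 5432, "ragaai")
print(url)
```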
  17. Step 17

    Resources

    Official GitHub repository: https://github.com/raga-ai-hub/RagaAI-Catalyst

    PyPI package: https://pypi.org/project/ragaai-catalyst/

    GitHub stars: 16,166+ at the time of writing

    Documentation: Refer to the GitHub README for the latest API documentation and examples.

    Support: Reach out to the RagaAI team through GitHub issues for questions and feature requests.
