RagaAI Catalyst: Agent AI Observability and Evaluation Framework
Comprehensive platform for LLM project management, evaluation, tracing, and monitoring with support for agents, RAG applications, and multi-model AI systems.
- Step 1
What is RagaAI Catalyst?
RagaAI Catalyst is a comprehensive Python SDK and platform designed for managing, evaluating, and optimizing LLM projects. It provides end-to-end observability for AI applications including agent tracing, multi-agentic system debugging, execution graph visualization, and advanced analytics.
Key capabilities include:
- Project Management: Create and organize LLM projects with different use cases
- Dataset Management: Efficiently manage training and evaluation datasets
- Evaluation Management: Create experiments and run metrics on RAG applications
- Trace Management: Record and analyze execution traces of RAG applications
- Agent Tracing: Track multi-agent system behaviors and interactions
- Prompt Management: Manage and version prompts for your AI applications
- Synthetic Data Generation: Generate synthetic data for testing and evaluation
- Guardrail Management: Implement safety filters to prevent harmful outputs
- Red-teaming: Comprehensive scans to detect model vulnerabilities and biases
- Step 2
Technology stack
RagaAI Catalyst is built on a modern Python stack with extensive LLM ecosystem integrations:
Core Stack:
- Python 3.10-3.13
- aiohttp for async HTTP operations
- Pydantic for data validation
- Pandas for data processing
- Rich for terminal UI
LLM Frameworks:
- LangChain (core and full framework)
- LlamaIndex
- LiteLLM for multi-model support
Model Providers:
- OpenAI
- Google Generative AI
- Groq
- Anthropic
Observability:
- OpenTelemetry (SDK, OTLP exporter, instrumentation)
- OpenInference (for all major frameworks: LangChain, LlamaIndex, CrewAI, Haystack, Anthropic, OpenAI, Mistral, etc.)
Utilities:
- Requests, tqdm, tiktoken, Jinja2, PyYAML
Tech Stack: ├── Python 3.10-3.13 ├── Core Libraries │ ├── aiohttp>=3.10.2 │ ├── pydantic │ ├── pandas │ └── rich>=13.9.4 ├── LLM Frameworks │ ├── langchain-core>=0.2.11 │ ├── langchain>=0.2.11 │ ├── llama-index>=0.10.0 │ └── litellm>=1.51.1 ├── Model Clients │ ├── openai>=1.57.0 │ ├── google-genai>=1.3.0 │ └── groq>=0.11.0 ├── Observability │ ├── opentelemetry-sdk │ ├── opentelemetry-exporter-otlp │ └── openinference-* (all major frameworks) └── Utilities ├── requests~=2.32.3 ├── tiktoken>=0.7.0 └── tqdm>=4.66.5 - Step 3
Prerequisites
Before installing RagaAI Catalyst, ensure you have:
- Python 3.10-3.13 installed
- pip or uv package manager
- RagaAI platform credentials (access_key and secret_key)
# Check Python version (3.10-3.13 required) python --version # Recommended: Install uv package manager curl -LsSf https://astral.sh/uv/install.sh | sh # Verify installation python --version # Should be 3.10-3.13 - Step 4
Installation
Install RagaAI Catalyst via pip. The package includes all core dependencies for LLM observability and evaluation.
# Using pip pip install ragai-catalyst # Using uv (recommended) uv pip install ragai-catalyst - Step 5
Authentication setup
Before using RagaAI Catalyst, you need to obtain credentials from the RagaAI platform:
- Navigate to your profile settings on the RagaAI platform
- Select "Authentication" to create your keys
- Click "Generate new key" to create access and secret keys
- Store these credentials securely
You can configure credentials via environment variables or directly in code:
# Option 1: Environment variables export RAGAI_ACCESS_KEY="your_access_key" export RAGAI_SECRET_KEY="your_secret_key" export RAGAI_BASE_URL="https://api.raga.ai" # or your self-hosted URL # Option 2: .env file # Create a .env file in your project echo "RAGAI_ACCESS_KEY=your_access_key" > .env echo "RAGAI_SECRET_KEY=your_secret_key" >> .env echo "RAGAI_BASE_URL=https://api.raga.ai" >> .env - Step 6
Initializing the SDK
Import and initialize the RagaAI Catalyst SDK with your credentials. The SDK provides a unified interface for all observability features.
from ragai_catalyst import RagaAICatalyst import os # Option 1: Pass credentials directly catalyst = RagaAICatalyst( access_key="YOUR_ACCESS_KEY", secret_key="YOUR_SECRET_KEY", base_url="https://api.raga.ai" ) # Option 2: Use environment variables catalyst = RagaAICatalyst( access_key=os.getenv("RAGAI_ACCESS_KEY"), secret_key=os.getenv("RAGAI_SECRET_KEY"), base_url=os.getenv("RAGAI_BASE_URL", "https://api.raga.ai") ) # Verify connection print("Connected to RagaAI Catalyst!") - Step 7
Project management
Create and manage LLM projects. Projects organize your datasets, evaluations, and traces under a single namespace.
from ragai_catalyst import RagaAICatalyst # Initialize catalyst catalyst = RagaAICatalyst( access_key="YOUR_ACCESS_KEY", secret_key="YOUR_SECRET_KEY" ) # Create a new project project = catalyst.create_project( project_name="My-RAG-Application", usecase="Chatbot" ) print(f"Created project: {project.name}") # List available use cases use_cases = project.use_cases() print(f"Use cases: {use_cases}") # List all your projects projects = catalyst.list_projects() for p in projects: print(f"- {p.name}: {p.usecase}") - Step 8
Dataset management
Manage datasets efficiently for evaluation. Upload CSV files and map columns to the RAG schema (query, response, context, etc.).
from ragai_catalyst import DatasetManager # Initialize Dataset manager for a specific project dataset_manager = DatasetManager( project_name="My-RAG-Application", access_key="YOUR_ACCESS_KEY", secret_key="YOUR_SECRET_KEY" ) # List existing datasets datasets = dataset_manager.list_datasets() print(f"Existing Datasets: {datasets}") # Create a dataset from CSV with custom schema mapping dataset_manager.create_from_csv( csv_path="path/to/your/data.csv", mappings={ "question": "query", # Your column -> RAG schema field "answer": "response", # Map to response field "source_text": "context" # Map to context field }, schema_mapping="custom" ) # Get the default schema mapping schema = dataset_manager.get_schema_mapping() print(f"Schema: {schema}") - Step 9
Evaluation management
Create and run evaluations to measure your RAG application performance. RagaAI supports multiple metrics for evaluating retrieval quality, response accuracy, and more.
Available metrics include:
- Faithfulness (does the answer follow from context)
- Context relevance (is the retrieved context relevant)
- Answer relevance (is the answer relevant to the question)
- Similarity metrics
- Custom metrics
from ragai_catalyst import Evaluation # Create an evaluation experiment evaluation = Evaluation( project_name="My-RAG-Application", dataset_name="MyDataset", access_key="YOUR_ACCESS_KEY", secret_key="YOUR_SECRET_KEY" ) # List available metrics available_metrics = evaluation.list_metrics() print(f"Available metrics: {available_metrics}") # Configure the schema mapping for your dataset schema_mapping = { "query": "prompt", # Your CSV column for queries "response": "response", # Your CSV column for responses "context": "context_passages", # Your CSV column for context "reference": "ground_truth" # (Optional) ground truth answers } # Run evaluation with specific metrics evaluation.add_metrics( metrics=["faithfulness_v2", "relevance_v2"], schema=schema_mapping ) # Get evaluation results results = evaluation.evaluate() print(f"Evaluation Results: {results}") - Step 10
Trace management
Record and analyze traces of your RAG application execution. Traces provide visibility into the step-by-step flow of your AI application, helping you debug and optimize performance.
from ragai_catalyst import TraceManager # Initialize trace manager tm = TraceManager( project_name="My-RAG-Application", access_key="YOUR_ACCESS_KEY", secret_key="YOUR_SECRET_KEY" ) # Record a trace for a request tm.record_trace( input_query="What is the capital of France?", output_response="The capital of France is Paris.", context=["Paris is the capital and largest city of France"], metadata={ "model": "gpt-4", "latency_ms": 245, "tokens_used": 42 } ) # List all traces traces = tm.list_traces() for trace in traces: print(f"Trace ID: {trace.id}, Input: {trace.input[:50]}...") # Get a specific trace t = tm.get_trace(trace_id="trace-uuid-here") print(f"Trace details: {t}") - Step 11
Agent tracing (multi-agent systems)
Track multi-agent system behaviors and interactions. RagaAI Catalyst provides specialized tracing for agent-based workflows, including tool usage, agent handoffs, and decision-making processes.
from ragai_catalyst import AgentTracer # Initialize agent tracer tracer = AgentTracer( project_name="Multi-Agent-System", access_key="YOUR_ACCESS_KEY", secret_key="YOUR_SECRET_KEY" ) # Trace an agent interaction with tracer.trace_agent(agent_name="researcher") as trace: # Your agent code here result = agent.run("Research quantum computing") # Record agent-specific metadata trace.record_tool_call( tool_name="search_engine", input={"query": "quantum computing"}, output="Search results..." ) trace.record_decision( decision="Continue research", reasoning="More information needed", alternatives_explored=["summarize", "ask_user"] ) # View agent execution graph executor_graph = tracer.get_execution_graph(trace.id) print(f"Execution graph: {executor_graph}") - Step 12
Prompt management
Manage and version prompts for your AI applications. Store, version, and retrieve prompts efficiently for consistent AI behavior.
from ragai_catalyst import PromptManager # Initialize prompt manager pm = PromptManager( project_name="My-RAG-Application", access_key="YOUR_ACCESS_KEY", secret_key="YOUR_SECRET_KEY" ) # Create and store a prompt prompt_id = pm.create_prompt( name="rag-system-prompt", template="""You are a helpful AI assistant. Use the following context to answer the question. Context: {context} Question: {question} Answer:""", version="1.0", variables=["context", "question"] ) print(f"Created prompt with ID: {prompt_id}") # List all prompts prompts = pm.list_prompts() for p in prompts: print(f"- {p.name} (v{p.version}): {p.template[:50]}...") # Get a specific prompt prompt = pm.get_prompt(prompt_id) rendered = prompt.render(context="Some context", question="What is X?") print(f"Rendered: {rendered}") - Step 13
Synthetic data generation
Generate synthetic data for testing and evaluation. Create diverse test cases without collecting real user data.
from ragai_catalyst import SyntheticDataGenerator # Initialize synthetic data generator generator = SyntheticDataGenerator( project_name="My-RAG-Application", access_key="YOUR_ACCESS_KEY", secret_key="YOUR_SECRET_KEY" ) # Generate synthetic Q&A pairs data = generator.generate( num_samples=100, domain="technology", include_context=True, difficulty_levels=["easy", "medium", "hard"] ) # Export to CSV generator.export_to_csv( data=data, output_path="synthetic_data.csv", schema={ "query": "question", "response": "answer", "context": "context_passages" } ) print(f"Generated {len(data)} synthetic samples") - Step 14
Guardrail management
Implement safety filters (guardrails) to prevent harmful, biased, or inappropriate AI outputs. Configure multiple guardrail types for comprehensive protection.
Guardrail types include:
- Toxicity detection
- PII (personally identifiable information) detection
- Jailbreak attempt detection
- Custom regex/keyword filters
from ragai_catalyst import GuardrailManager # Initialize guardrail manager guardrail = GuardrailManager( project_name="My-RAG-Application", access_key="YOUR_ACCESS_KEY", secret_key="YOUR_SECRET_KEY" ) # Create a toxicity guardrail guardrail.create_guardrail( name="toxicity-filter", type="toxicity", threshold=0.7, action="block" # or "flag", "modify" ) # Create a PII detection guardrail guardrail.create_guardrail( name="pii-detector", type="pii", fields=["email", "phone", "ssn"], action="redact" ) # Run guardrails on input/output def check_with_guardrails(input_text: str, output_text: str): # Check input input_result = guardrail.check(input_text) if input_result.blocked: print(f"Input blocked: {input_result.reason}") return None # Check output output_result = guardrail.check(output_text) if output_result.blocked: print(f"Output blocked: {output_result.reason}") return None elif output_result.redacted: print(f"Output redacted: {output_result.redacted_text}") return output_result.redacted_text return output_text - Step 15
Red-teaming
Perform comprehensive scans to detect model vulnerabilities, biases, and potential misuse. Red-teaming helps identify security gaps before deployment.
Red-teaming capabilities:
- Jailbreak attack simulation
- Bias detection
- Adversarial input testing
- Prompt injection detection
from ragai_catalyst import RedTeaming # Initialize red-teaming module redteam = RedTeaming( project_name="My-RAG-Application", access_key="YOUR_ACCESS_KEY", secret_key="YOUR_SECRET_KEY" ) # Run a comprehensive vulnerability scan scan_result = redteam.run_scan( model="gpt-4", attack_types=[ "jailbreak", "prompt_injection", "bias_detection", "pii_extraction" ], num_attacks=50, severity_threshold="medium" ) # View scan results print(f"Vulnerabilities found: {len(scan_result.vulnerabilities)}") for vuln in scan_result.vulnerabilities: print(f"- [{vuln.severity}] {vuln.type}: {vuln.description}") # Get detailed report report = redteam.generate_report(scan_result) print(f"Report: {report}") - Step 16
Self-hosted deployment
Deploy RagaAI Catalyst as a self-hosted solution for complete control over your observability infrastructure. The self-hosted version includes a dashboard with timeline and execution graph views.
# Using Docker (if available) docker pull ragaai/catalyst:latest docker run -d \ --name ragaai-catalyst \ -p 8080:8080 \ -v ragaai-data:/app/data \ -e POSTGRES_URL=postgresql://user:pass@localhost:5432/ragaai \ ragaai/catalyst:latest # Access the dashboard at http://localhost:8080 # Initialize SDK with self-hosted instance from ragai_catalyst import RagaAICatalyst sdk = RagaAICatalyst( access_key="YOUR_ACCESS_KEY", secret_key="YOUR_SECRET_KEY", base_url="http://your-self-hosted-domain:8080/api" ) - Step 17
Resources
Official GitHub repository: https://github.com/raga-ai-hub/RagaAI-Catalyst
PyPI package: https://pypi.org/project/ragai-catalyst/
Stars: 16,166+ (indicating strong community adoption)
Documentation: Refer to the GitHub README for the latest API documentation and examples.
Support: Reach out to the RagaAI team through GitHub issues for questions and feature requests.
GitHub: https://github.com/raga-ai-hub/RagaAI-Catalyst PyPI: https://pypi.org/project/ragai-catalyst/ Stars: 16,166+ Documentation: See GitHub README Support: GitHub Issues
Feature requests
Sign in to suggest features or vote on existing ones.
No feature requests yet.
Discussion
Sign in to join the discussion.
No comments yet.