KrillinAI: Video Translation and Dubbing Setup
Complete setup guide for KrillinAI - an AI-powered video translation and dubbing tool that supports 100 languages. Includes installation for Windows/Linux/macOS, configuration of Whisper speech recognition, LLM translation, voice cloning with TTS, and Docker deployment options.
- Step 1
System Prerequisites
KrillinAI is a Go-based video translation and dubbing tool that integrates speech recognition, LLM translation, and voice synthesis. It supports desktop and server deployment modes across all major platforms. The tool automatically handles dependency installation, but you'll need API access for speech recognition and translation services.
# Check system compatibility
uname -a      # Linux/macOS
systeminfo    # Windows

# Verify internet connection (required for API calls and model downloads)
ping -c 3 api.openai.com    # Linux/macOS (on Windows: ping -n 3 api.openai.com)

# Check available disk space (models can be 1-3GB)
df -h                                          # Linux/macOS
wmic logicaldisk get size,freespace,caption    # Windows

⚠ Heads up: KrillinAI requires stable internet for API calls to speech recognition and LLM services. Local Whisper models (FasterWhisper, WhisperCpp) will download automatically on first use and require 1-3GB storage per model.
- Step 2
Choose Your Deployment Mode
KrillinAI offers two deployment modes: Desktop (GUI application with built-in browser interface) and Server (lightweight web service accessed via browser). Desktop mode is recommended for most users as it provides a sleek interface with no additional configuration. Server mode is ideal for headless deployments, Docker containers, or remote access scenarios.
Desktop Mode:
- Pre-packaged with web UI
- One-click launch
- Automatic port management
- Ideal for: Local development, content creators

Server Mode:
- Minimal footprint
- Manual config.toml setup required
- Browser access at http://127.0.0.1:8888
- Ideal for: Production deployments, Docker, remote servers

- Step 3
Download KrillinAI Executable
Download the pre-compiled executable for your operating system from the GitHub releases page. KrillinAI provides native builds for Windows (x64 and x86), Linux (x64 and ARM64), and macOS (Intel and Apple Silicon). Choose the desktop version (filename contains 'desktop') for GUI mode, or the standard version for server mode.
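The platform-to-filename choice described above can be sketched as a small shell helper. This is an illustration only: the function name `krillinai_asset_name` is hypothetical, and the filenames it produces assume the naming pattern listed in this step, so confirm the actual asset names on the releases page before downloading.

```shell
# krillinai_asset_name OS ARCH [MODE]
#   OS   : output of `uname -s` (Linux, Darwin) or the literal "Windows"
#   ARCH : output of `uname -m` (x86_64, amd64, arm64, aarch64)
#   MODE : "desktop" (default) or "server"
# ASSUMPTION: release assets follow the naming pattern listed in this guide.
krillinai_asset_name() {
  case "$1" in
    Linux)   os=linux ;;
    Darwin)  os=darwin ;;
    Windows) os=windows ;;
    *) echo "unsupported OS: $1" >&2; return 1 ;;
  esac
  case "$2" in
    x86_64|amd64)  arch=amd64 ;;
    arm64|aarch64) arch=arm64 ;;
    *) echo "unsupported architecture: $2" >&2; return 1 ;;
  esac
  # Server builds are assumed to use the same name without "desktop".
  if [ "${3:-desktop}" = "desktop" ]; then
    prefix="krillinai-desktop"
  else
    prefix="krillinai"
  fi
  # Only Windows builds carry a file extension.
  if [ "$os" = "windows" ]; then ext=".exe"; else ext=""; fi
  printf '%s-%s-%s%s\n' "$prefix" "$os" "$arch" "$ext"
}
```

On Linux or macOS, `krillinai_asset_name "$(uname -s)" "$(uname -m)"` prints the desktop filename to look for on the releases page.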
# Visit the releases page
# https://github.com/krillinai/KrillinAI/releases

# Download for your platform:
# - Windows Desktop: krillinai-desktop-windows-amd64.exe
# - Linux Desktop: krillinai-desktop-linux-amd64
# - macOS Desktop (Intel): krillinai-desktop-darwin-amd64
# - macOS Desktop (M-series): krillinai-desktop-darwin-arm64
# - Server versions: Same names without "desktop"

# Linux/macOS: Make executable
chmod +x krillinai-desktop-linux-amd64

# Optional: Move to system PATH
sudo mv krillinai-desktop-linux-amd64 /usr/local/bin/krillinai
krillinai --version
- Step 4
macOS Security Configuration
macOS blocks unsigned executables by default. You must manually trust the KrillinAI binary before first launch. This is a standard security measure for applications distributed outside the Mac App Store.
# Remove quarantine attribute from the downloaded executable
xattr -d com.apple.quarantine krillinai-desktop-darwin-arm64

# Alternative: Trust via System Settings
# 1. Try to open the app (will be blocked)
# 2. System Settings → Privacy & Security
# 3. Scroll to "Security" section
# 4. Click "Open Anyway" next to the KrillinAI message

# Verify executable is trusted
spctl -a -v krillinai-desktop-darwin-arm64

⚠ Heads up: This step is ONLY required on macOS. Windows and Linux users can skip directly to launching the application.
- Step 5
Launch Desktop Version (Recommended)
For desktop mode, simply double-click the executable. KrillinAI will automatically start a local web server and open the interface in your default browser. No configuration file is required for basic usage with cloud-based services.
# Windows: Double-click the .exe file
# Or from Command Prompt:
krillinai-desktop-windows-amd64.exe

# Linux:
./krillinai-desktop-linux-amd64

# macOS:
./krillinai-desktop-darwin-arm64

# The web interface will automatically open at:
# http://127.0.0.1:8888
- Step 6
Server Version Setup (Optional)
For server deployments, create a configuration directory and populate config.toml from the example template. The configuration file defines API credentials, model choices, and service endpoints. Server mode requires manual configuration before first launch.
# Create configuration directory
mkdir -p config
cd config

# Download example configuration
curl -O https://raw.githubusercontent.com/krillinai/KrillinAI/master/config-example.toml

# Rename and edit
mv config-example.toml config.toml
nano config.toml    # or vim, code, etc.
- Step 7
Configure Speech Recognition (Transcribe)
KrillinAI supports multiple speech recognition engines: cloud-based OpenAI Whisper (all platforms), local FasterWhisper (Windows/Linux), WhisperKit (macOS M-series only), WhisperCpp (all platforms), and Alibaba Cloud ASR. Cloud providers require API keys; local engines download models automatically on first use.
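The platform restrictions above can be expressed as a short lookup. The sketch below is illustrative: the function name `local_transcribe_providers` is hypothetical, and the provider identifiers are assumed to match the `provider` values used in this guide's configuration.

```shell
# local_transcribe_providers OS ARCH
# Prints the local (offline) engines this guide lists for a platform.
#   OS   : Linux, Darwin, or Windows
#   ARCH : output of `uname -m` (only consulted on macOS)
# ASSUMPTION: identifiers match the provider names used in config.toml.
local_transcribe_providers() {
  case "$1" in
    Linux|Windows)
      echo "faster-whisper whisper-cpp" ;;
    Darwin)
      # WhisperKit is listed as M-series (arm64) only.
      if [ "$2" = "arm64" ]; then
        echo "whisper-kit whisper-cpp"
      else
        echo "whisper-cpp"
      fi ;;
    *)
      # WhisperCpp is described as cross-platform.
      echo "whisper-cpp" ;;
  esac
}
```

For example, `local_transcribe_providers "$(uname -s)" "$(uname -m)"` lists the local candidates for the current machine. The cloud providers (OpenAI Whisper, Alibaba Cloud ASR) are not platform-restricted and are left out.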
[transcribe]
provider = "openai"    # Options: openai, faster-whisper, whisper-kit, whisper-cpp, aliyun

# OpenAI Whisper (cloud)
[transcribe.openai]
api_key = "sk-your-openai-api-key-here"
model = "whisper-1"    # Default and only option

# FasterWhisper (local, Windows/Linux)
[transcribe.faster-whisper]
model = "large-v2"    # Options: tiny, medium, large-v2
# Auto-downloads on first use; requires 1-3GB disk space

# WhisperKit (local, macOS M-series only)
[transcribe.whisper-kit]
model = "default"

# WhisperCpp (local, cross-platform)
[transcribe.whisper-cpp]
model = "base"    # Lightweight, fast

# Alibaba Cloud ASR (requires separate setup)
[transcribe.aliyun]
access_key_id = "your-access-key"
access_key_secret = "your-secret"
app_key = "your-app-key"

⚠ Heads up: OpenAI Whisper requires a paid API key. Free accounts have limited quota. Local Whisper engines (FasterWhisper, WhisperCpp) run offline but require significant CPU/GPU resources.
- Step 8
Configure LLM Translation
KrillinAI is compatible with any LLM provider that implements the OpenAI API specification. This includes OpenAI GPT models, Google Gemini, DeepSeek, Alibaba Tongyi Qianwen, and locally-hosted models via OpenAI-compatible servers (e.g., LM Studio, Ollama with OpenAI compatibility layer).
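Before putting a provider into the configuration, it can help to confirm that its endpoint really speaks the OpenAI chat completions protocol. The following is a minimal sketch: the helper names `chat_endpoint` and `check_llm` are hypothetical, and it assumes the provider serves the standard `/chat/completions` path under its base URL. Whether KrillinAI itself calls exactly this path is not stated in this guide.

```shell
# chat_endpoint BASE_URL
# Joins an OpenAI-compatible base URL with the chat completions path,
# tolerating a trailing slash on the base URL.
chat_endpoint() {
  printf '%s/chat/completions\n' "${1%/}"
}

# check_llm BASE_URL API_KEY MODEL
# Sends a one-line translation prompt and prints the raw JSON response.
# ASSUMPTION: the provider implements the standard chat completions route.
check_llm() {
  curl -sS "$(chat_endpoint "$1")" \
    -H "Authorization: Bearer $2" \
    -H "Content-Type: application/json" \
    -d "{\"model\": \"$3\", \"messages\": [{\"role\": \"user\", \"content\": \"Translate to French: Hello\"}]}"
}
```

For a local Ollama server this might be run as `check_llm http://localhost:11434/v1 not-needed llama3.1:70b`. A JSON body containing a `choices` array suggests the endpoint is compatible, while an error object usually points to a wrong key, model name, or base URL.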
[llm]
provider = "openai"    # Any OpenAI-compatible API

# OpenAI GPT
[llm.openai]
api_key = "sk-your-openai-api-key-here"
model = "gpt-4"    # Options: gpt-4, gpt-4-turbo, gpt-3.5-turbo
base_url = "https://api.openai.com/v1"    # Default

# DeepSeek (OpenAI-compatible)
[llm.deepseek]
api_key = "your-deepseek-api-key"
model = "deepseek-chat"
base_url = "https://api.deepseek.com/v1"

# Local model via Ollama (with OpenAI compatibility)
[llm.local]
api_key = "not-needed"    # Placeholder
model = "llama3.1:70b"
base_url = "http://localhost:11434/v1"    # Ollama OpenAI endpoint

# Google Gemini (via OpenAI-compatible proxy)
[llm.gemini]
api_key = "your-gemini-api-key"
model = "gemini-pro"
base_url = "https://generativelanguage.googleapis.com/v1beta/openai/"
- Step 9
Configure Text-to-Speech (TTS)
TTS is optional but recommended for complete video dubbing. KrillinAI supports OpenAI TTS (simple, high-quality), Alibaba Cloud Voice Service (more voice options), and CosyVoice (advanced voice cloning). Voice cloning preserves the original speaker's characteristics in the target language.
[tts]
provider = "openai"    # Options: openai, aliyun, cosyvoice

# OpenAI TTS (recommended for simplicity)
[tts.openai]
api_key = "sk-your-openai-api-key-here"
model = "tts-1-hd"    # Options: tts-1, tts-1-hd
voice = "alloy"       # Options: alloy, echo, fable, onyx, nova, shimmer

# Alibaba Cloud Voice Service
[tts.aliyun]
access_key_id = "your-access-key"
access_key_secret = "your-secret"
app_key = "your-app-key"
voice = "xiaoyun"    # See Alibaba Cloud docs for full list

# CosyVoice (voice cloning)
[tts.cosyvoice]
endpoint = "http://localhost:50000"    # Self-hosted CosyVoice server
voice_preset = "default"               # Or path to voice sample for cloning

⚠ Heads up: Voice cloning with CosyVoice requires self-hosting a separate inference server. See the CosyVoice documentation for setup instructions.
- Step 10
Launch Server Version
With configuration complete, start the KrillinAI server. The application will bind to port 8888 by default and serve the web interface. Access the UI in your browser to begin translating videos.
# Ensure config/config.toml exists
ls config/config.toml

# Start the server
./krillinai      # Linux/macOS
krillinai.exe    # Windows

# The server will output:
# Server listening on http://127.0.0.1:8888

# Open in browser
open http://127.0.0.1:8888        # macOS
xdg-open http://127.0.0.1:8888    # Linux
start http://127.0.0.1:8888       # Windows
- Step 11
Docker Deployment (Alternative)
KrillinAI supports Docker for isolated, reproducible deployments. The Docker image includes all dependencies and can be configured via environment variables or a mounted config.toml file. This is the recommended approach for production deployments and CI/CD pipelines.
# Clone the repository to access docker-compose.yml
git clone https://github.com/krillinai/KrillinAI.git
cd KrillinAI

# Option 1: Docker Compose (recommended)
docker-compose up -d

# Option 2: Manual Docker run
# (paths are quoted so directories containing spaces still mount correctly)
docker run -d \
  -p 8888:8888 \
  -v "$(pwd)/config:/app/config" \
  -v "$(pwd)/output:/app/output" \
  --name krillinai \
  krillinai/krillinai:latest

# View logs
docker-compose logs -f
# or
docker logs -f krillinai

# Access the interface
# http://localhost:8888

⚠ Heads up: Mount the config directory as a volume to persist configuration. Mount an output directory to access translated videos outside the container.
- Step 12
Translate Your First Video
KrillinAI provides a web interface for uploading videos or fetching from URLs (including YouTube via yt-dlp integration). The workflow is: upload video → select source/target languages → choose output format (landscape/portrait) → start translation. Progress updates appear in real-time.
1. Open http://127.0.0.1:8888 in your browser
2. Click "New Translation Task"
3. Upload a video file OR paste a YouTube/video URL
4. Select source language (or auto-detect)
5. Select target language(s) - supports 100+ languages
6. Choose output format:
   - Landscape (16:9) for YouTube
   - Portrait (9:16) for TikTok, Instagram Reels
   - Auto-detect from source video
7. Configure options:
   - Enable/disable voice dubbing (TTS)
   - Enable/disable subtitle burning
   - Adjust subtitle styling
8. Click "Start Translation"
9. Monitor progress in the task list
10. Download completed video from the output panel

- Step 13
Understanding the Translation Pipeline
KrillinAI executes a four-stage pipeline: Transcribe (speech to text via Whisper), Translate (text to text via LLM with context preservation), Synthesize (text to speech via TTS), and Compose (final video output). Each stage can be configured independently. The LLM translation stage uses context-aware chunking to maintain semantic coherence across subtitle boundaries.
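To make the idea of context-aware chunking concrete, the sketch below splits subtitle lines into fixed-size chunks in which each chunk repeats the last few lines of the previous one. It is not KrillinAI's actual implementation, which this guide does not show. The function name `chunk_with_overlap` and its two parameters are hypothetical and only illustrate the general overlap technique.

```shell
# chunk_with_overlap SIZE OVERLAP < subtitles.txt
# Reads one subtitle per line from stdin and prints chunks of up to SIZE
# lines. Every chunk after the first starts with the last OVERLAP lines
# of the previous chunk, so the translator sees some preceding context.
# ASSUMPTION: illustrative only; not taken from the KrillinAI source.
chunk_with_overlap() {
  awk -v size="$1" -v overlap="$2" '
    { lines[NR] = $0 }
    END {
      step = size - overlap
      if (step < 1) step = 1          # always advance by at least one line
      chunk = 0
      for (start = 1; start <= NR; start += step) {
        chunk++
        printf "--- chunk %d ---\n", chunk
        last = start + size - 1
        if (last > NR) last = NR
        for (i = start; i <= last; i++) print lines[i]
        if (last == NR) break         # final chunk reached the end
      }
    }'
}
```

With five subtitle lines, a size of 3, and an overlap of 1, the first chunk holds lines 1 to 3 and the second holds lines 3 to 5. Line 3 appears in both, giving the model context on either side of the boundary.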
Pipeline Stages:
1. Transcribe (Audio → Text)
   - Extract audio from video
   - Run Whisper speech recognition
   - Generate timestamped subtitles
   - Intelligent segmentation for natural breaks
2. Translate (Text → Text)
   - Chunk subtitles with context overlap
   - LLM translates with semantic awareness
   - Preserve timing and formatting
   - Adjust subtitle length for target language
3. Synthesize (Text → Audio)
   - Generate dubbed audio via TTS
   - Optional: Clone original voice characteristics
   - Align audio timing with video
   - Mix dubbed audio with video
4. Compose (Final Output)
   - Burn subtitles into video (optional)
   - Adjust aspect ratio for target platform
   - Export in optimized format

- Step 14
Configure API Keys via Web Interface
The desktop version allows configuring API keys directly in the web interface without editing config.toml. Navigate to Settings → API Configuration to input your OpenAI, Alibaba Cloud, or other provider credentials. Changes are saved to the application's data directory.
Web UI Configuration Path:
1. Click the ⚙️ Settings icon (top-right)
2. Select "API Configuration" tab
3. Choose your provider:
   - OpenAI: Enter API key for Whisper + GPT + TTS
   - Alibaba Cloud: Enter AccessKey, SecretKey, AppKey
   - Custom: Enter base_url for OpenAI-compatible APIs
4. Test connection: Click "Validate" button
5. Save configuration
6. Restart if prompted (desktop version may require restart)

Settings are stored in:
- Windows: %APPDATA%\krillinai\config.toml
- macOS: ~/Library/Application Support/krillinai/config.toml
- Linux: ~/.config/krillinai/config.toml

- Step 15
Troubleshooting Common Issues
Common issues include model download failures (check internet connection and disk space), API authentication errors (verify keys are valid and have sufficient quota), and video processing errors (ensure input video is not corrupted). Check the application logs for detailed error messages.
# View application logs
# Desktop version: Check the web UI console (F12 in browser)
# Server version: Logs print to terminal

# Check API connectivity
curl https://api.openai.com/v1/models \
  -H "Authorization: Bearer sk-your-api-key"

# Verify model downloads (local Whisper)
# Models stored in:
# - Windows: %USERPROFILE%\.cache\whisper
# - Linux/macOS: ~/.cache/whisper
ls ~/.cache/whisper

# Test Whisper locally (debug; requires the separate openai-whisper Python package)
python -c "import whisper; whisper.load_model('base')"

# FFmpeg not found (rare, should auto-install)
# Manually install:
sudo apt install ffmpeg    # Ubuntu/Debian
brew install ffmpeg        # macOS
choco install ffmpeg       # Windows

# Port 8888 already in use
# Edit config.toml:
[server]
port = 8889    # Change to available port

# Docker container fails to start
docker-compose logs
docker inspect krillinai

⚠ Heads up: OpenAI API rate limits apply. Free tier accounts have strict quotas (3 requests/min). Upgrade to a paid account for production use.
- Step 16
Performance Optimization
For faster processing, use local Whisper models (FasterWhisper on GPU-enabled systems), batch multiple videos, and choose smaller LLM models (gpt-3.5-turbo instead of gpt-4). GPU acceleration significantly improves Whisper transcription speed.
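Choosing between GPU and CPU settings can be scripted. The helper below is a rough sketch, not part of KrillinAI: the function name `suggest_faster_whisper_settings` is hypothetical, and it assumes that a working `nvidia-smi` is a reasonable proxy for a usable CUDA setup. The keys it prints are the ones used in this step's configuration.

```shell
# suggest_faster_whisper_settings
# Prints suggested [transcribe.faster-whisper] values for this machine.
# ASSUMPTION: a successful `nvidia-smi` run implies CUDA can be used.
suggest_faster_whisper_settings() {
  if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
    # NVIDIA GPU detected: half precision is typically faster on GPU.
    printf 'device = "cuda"\ncompute_type = "float16"\n'
  else
    # No GPU: use quantized inference and one thread per logical CPU.
    threads=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    printf 'device = "cpu"\ncompute_type = "int8"\nnum_threads = %s\n' "$threads"
  fi
}
```

Running `suggest_faster_whisper_settings` prints lines that can be pasted under the `[transcribe.faster-whisper]` table. Note that `getconf` reports logical CPUs, so on machines with hyper-threading a lower `num_threads` may perform better.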
# Enable GPU acceleration for FasterWhisper (NVIDIA)
# Requires CUDA toolkit installed
nvidia-smi    # Verify GPU is detected

# In config.toml:
[transcribe.faster-whisper]
model = "large-v2"
device = "cuda"             # Options: cpu, cuda
compute_type = "float16"    # Faster on GPU

# CPU optimization (multi-threading)
# Alternative to the GPU block above: keep only one
# [transcribe.faster-whisper] table in config.toml
[transcribe.faster-whisper]
device = "cpu"
compute_type = "int8"    # Quantized for speed
num_threads = 8          # Match your CPU core count

# Batch processing via API (future feature)
# For now, queue multiple tasks in the web UI

# Use smaller models for faster turnaround:
# - Whisper: tiny (fast, lower accuracy) vs large-v2 (slow, high accuracy)
# - LLM: gpt-3.5-turbo (fast, cheap) vs gpt-4 (slow, expensive, best quality)
- Step 17
Production Deployment Checklist
For production deployments, enable HTTPS (via reverse proxy like Nginx or Caddy), implement authentication (basic auth or OAuth), set up monitoring and logging, configure automatic backups of configuration and output files, and use Docker Compose with health checks and restart policies.
# docker-compose.prod.yml
version: '3.8'
services:
  krillinai:
    image: krillinai/krillinai:latest
    restart: always
    ports:
      - "127.0.0.1:8888:8888"    # Bind to localhost only
    volumes:
      - ./config:/app/config:ro    # Read-only config
      - ./output:/app/output
      - ./logs:/app/logs
    environment:
      - LOG_LEVEL=info
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8888/health"]
      interval: 30s
      timeout: 10s
      retries: 3
    mem_limit: 8g
    cpus: 4
  nginx:
    image: nginx:alpine
    restart: always
    ports:
      - "443:443"
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
      - ./ssl:/etc/nginx/ssl:ro
    depends_on:
      - krillinai

⚠ Heads up: Never expose the KrillinAI web interface directly to the internet without authentication. Use a reverse proxy with HTTPS and implement access controls.
- Step 18
Community and Support Resources
KrillinAI maintains active community channels for troubleshooting and feature requests. The project documentation is hosted on DeepWiki, and the GitHub issues tracker is monitored by maintainers. For real-time support, join the QQ group (primarily Chinese-language community).
Official Resources:
- GitHub Repository: https://github.com/krillinai/KrillinAI
- Documentation: https://deepwiki.com/krillinai/KlicStudio
- Issue Tracker: https://github.com/krillinai/KrillinAI/issues
- QQ Group: 754069680 (Chinese community)

Related Projects:
- Whisper: https://github.com/openai/whisper
- FasterWhisper: https://github.com/guillaumekln/faster-whisper
- CosyVoice: https://github.com/FunAudioLLM/CosyVoice

Useful Documentation:
- Docker Deployment: See docker.md in repository
- Alibaba Cloud Setup: See aliyun.md in repository
- API Reference: Coming soon (project is actively developed)