
HivisionIDPhotos: AI-Powered ID Photo Generator

A lightweight, offline-capable AI tool for automatic ID photo generation with portrait matting, background replacement, and layout photo creation.

  1. Step 1

    Project Overview

    HivisionIDPhotos is an intelligent algorithm system for producing ID photos. It uses AI models for face detection, portrait matting, and automatic photo adjustments to generate standard ID photos from user images.

    Key capabilities:

    • Lightweight matting (purely offline, CPU-only inference)
    • Generates standard ID photos and layout photos in various sizes
    • Supports pure offline or edge-cloud inference
    • Multiple matting model options for different use cases
    • FastAPI and Gradio interfaces for API and web usage
  2. Step 2

    Environment Requirements

    Ensure your system meets the following requirements before installation:

    • Python: >= 3.7 (primarily tested on Python 3.10)
    • Operating System: Linux, Windows, or macOS
    • Memory: At least 4GB RAM (16GB+ recommended for beast mode)
    • Disk Space: ~500MB for the project and model weights
    python --version
    # Should be 3.7 or higher
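
    The same check can be scripted so that a setup script fails early. This is a minimal sketch; the helper name `python_is_supported` is hypothetical and not part of the project.

```python
import sys

MIN_VERSION = (3, 7)  # minimum Python version stated in the requirements above


def python_is_supported(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the given interpreter version meets the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if python_is_supported() else "too old"
    print(f"Python {current}: {status}")
```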
  3. Step 3

    Clone the Repository

    Clone the HivisionIDPhotos repository from GitHub:

    git clone https://github.com/Zeyi-Lin/HivisionIDPhotos.git
    cd HivisionIDPhotos
  4. Step 4

    Set Up Python Virtual Environment

    Create and activate a Python virtual environment. Using conda or venv is recommended to isolate dependencies.

    # Using conda (recommended)
    conda create -n hivision python=3.10
    conda activate hivision
    
    # Or using venv
    python -m venv hivision_env
    source hivision_env/bin/activate  # On Linux/macOS
    # hivision_env\Scripts\activate  # On Windows (run without "source")
  5. Step 5

    Install Dependencies

    Install the core dependencies and application dependencies. The project is split into base requirements (core functionality) and app requirements (Gradio/FastAPI for web interface).

    pip install -r requirements.txt
    pip install -r requirements-app.txt
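
    After installation, a quick import check can confirm that the main packages are visible to the active environment. This is a sketch only: the module names below are assumptions based on the technologies this guide lists (ONNX Runtime, OpenCV, NumPy, Gradio, FastAPI), not a copy of the requirements files, and `missing_modules` is a hypothetical helper.

```python
import importlib.util

# Assumed top-level import names; adjust to match what the requirements
# files actually install in your environment.
EXPECTED_MODULES = ["numpy", "cv2", "onnxruntime", "gradio", "fastapi"]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(EXPECTED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All expected modules are importable")
```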
  6. Step 6

    Download Model Weights

    The project requires pre-trained model weights for matting and face detection. You can either download them using a script or manually.

    # Method 1: Use the download script
    python scripts/download_model.py --models all
    
    # Method 2: Download manually and place in hivision/creator/weights/
  7. Step 7

    Manual Model Download (Optional)

    If the download script does not work, download these models manually and save them to hivision/creator/weights/:

    | Model File | Size | Source | Description |
    |---|---|---|---|
    | modnet_photographic_portrait_matting.onnx | 24.7MB | MODNet | Official matting weights |
    | hivision_modnet.onnx | 24.7MB | Release | Improved matting for color backgrounds |
    | rmbg-1.4.onnx | 176.2MB | BRIA AI | High-quality matting |
    | birefnet-v1-lite.onnx | 224MB | BiRefNet | Best-quality matting, GPU required |

    ⚠ Heads up: At least one matting model is required to run the project. `hivision_modnet` is the default and is recommended for CPU-only inference.
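
    To see which matting weights are already in place, the directory can be scanned for the file names from the table above. This is a minimal sketch; `installed_matting_weights` is a hypothetical helper, and the default path assumes the script runs from the repository root.

```python
from pathlib import Path

# File names taken from the model table in this step.
MATTING_WEIGHTS = [
    "modnet_photographic_portrait_matting.onnx",
    "hivision_modnet.onnx",
    "rmbg-1.4.onnx",
    "birefnet-v1-lite.onnx",
]


def installed_matting_weights(weights_dir="hivision/creator/weights"):
    """Return the known matting weight files that exist in weights_dir."""
    folder = Path(weights_dir)
    return [name for name in MATTING_WEIGHTS if (folder / name).is_file()]


if __name__ == "__main__":
    found = installed_matting_weights()
    if found:
        print("Found matting weights:", ", ".join(found))
    else:
        print("No matting weights found; download at least one model")
```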
  8. Step 8

    Face Detection Model Setup (Optional)

    By default, the project uses MTCNN for face detection (offline, fast, works on CPU). You can also use RetinaFace (offline, higher accuracy, slower on CPU) or Face++ (online API, highest accuracy).

    RetinaFace Setup (Offline, High Accuracy, Moderate CPU Speed):

    # Download RetinaFace weights and place in hivision/creator/retinaface/weights/
    curl -L https://github.com/Zeyi-Lin/HivisionIDPhotos/releases/download/pretrained-model/retinaface-resnet50.onnx \
      -o hivision/creator/retinaface/weights/retinaface-resnet50.onnx
  9. Step 9

    GPU Acceleration Setup (Optional)

    For NVIDIA GPU acceleration with the birefnet-v1-lite model (~16GB VRAM recommended), install CUDA-enabled libraries:

    For CUDA 12.x and cuDNN 8:

    pip install onnxruntime-gpu==1.18.0
    pip install torch --index-url https://download.pytorch.org/whl/cu121
    ⚠ Heads up: CUDA 12.x installations can generally run builds that target an earlier 12.x minor version. If you have CUDA 12.6 but the available torch build targets 12.4, you can still install the 12.4 build.
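
    To confirm that the GPU build of ONNX Runtime is active, you can list its execution providers. The sketch below uses `onnxruntime.get_available_providers()`; the helper `cuda_provider_listed` is hypothetical. A listed CUDA provider is a necessary sign rather than a guarantee, since a missing cuDNN library can still cause a fallback to CPU when a session is created.

```python
def cuda_provider_listed(providers):
    """Return True if ONNX Runtime reports the CUDA execution provider."""
    return "CUDAExecutionProvider" in providers


if __name__ == "__main__":
    try:
        import onnxruntime as ort
    except ImportError:
        print("onnxruntime is not installed in this environment")
    else:
        providers = ort.get_available_providers()
        print("Available providers:", providers)
        print("CUDA provider listed:", cuda_provider_listed(providers))
```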
  10. Step 10

    Run Gradio Demo

    Launch the interactive web interface for generating ID photos. This is the simplest way to use the tool.

    python app.py
  11. Step 11

    Using the Gradio Interface

    After running app.py, open http://127.0.0.1:7860 in your browser. The interface allows you to:

    • Upload a photo
    • Choose output size (standard sizes for various countries)
    • Select matting model
    • Choose background color
    • Apply beauty effects
    • Generate layout photos (6-inch, A4, etc.)
    • Enable face alignment and rotation correction
    ⚠ Heads up: If you see an error about missing models, ensure you downloaded at least one matting model and placed it in the `hivision/creator/weights/` directory.
  12. Step 12

    Python Inference CLI

    Use the command-line interface for batch processing or scripting:

    1. Create ID Photo:

    python inference.py \
      -i demo/images/test0.jpg \
      -o ./output.png \
      --height 413 \
      --width 295
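
    For batch processing, the command above can be driven from a small Python script. This is a sketch under stated assumptions: it only uses the flags shown in this step (`-i`, `-o`, `--height`, `--width`), the helper names are hypothetical, and the folder names in the usage note are placeholders.

```python
import subprocess
import sys
from pathlib import Path


def build_idphoto_command(image, output, height=413, width=295):
    """Build the inference.py command line for one input image."""
    return [
        sys.executable, "inference.py",
        "-i", str(image),
        "-o", str(output),
        "--height", str(height),
        "--width", str(width),
    ]


def batch_commands(input_dir, output_dir, pattern="*.jpg"):
    """Return one command per matching image, writing PNGs to output_dir."""
    out = Path(output_dir)
    return [
        build_idphoto_command(image, out / (image.stem + ".png"))
        for image in sorted(Path(input_dir).glob(pattern))
    ]


def run_batch(input_dir, output_dir, pattern="*.jpg"):
    """Run every command in sequence; stops on the first failure."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    for command in batch_commands(input_dir, output_dir, pattern):
        subprocess.run(command, check=True)
```

    Calling `run_batch("demo/images", "output")` from the repository root would process every `.jpg` file in `demo/images`, one `inference.py` invocation per file.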
  13. Step 13

    CLI Inference Options

    2. Portrait Matting Only (extract person from background):

    python inference.py -t human_matting \
      -i demo/images/test0.jpg \
      -o ./matting.png \
      --matting_model hivision_modnet
  14. Step 14

    Add Background to Transparent Image

    3. Add Background Color to Transparent PNG:

    python inference.py -t add_background \
      -i ./output.png \
      -o ./output_colored.jpg \
      -c 4f83ce \
      -k 30 \
      -r 1
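
    The `-c 4f83ce` argument is a hexadecimal RGB color. The sketch below shows how such a value maps to channel values and how a background color combines with a matted pixel through its alpha value. It illustrates standard alpha compositing and is not taken from the project's code; both function names are hypothetical.

```python
def hex_to_rgb(color):
    """Convert a 6-digit hex color such as '4f83ce' to an (R, G, B) tuple."""
    color = color.lstrip("#")
    if len(color) != 6:
        raise ValueError("expected a 6-digit hex color")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def composite_pixel(foreground_rgb, alpha, background_rgb):
    """Blend one foreground pixel over a background; alpha is in [0, 1]."""
    return tuple(
        round(alpha * fg + (1 - alpha) * bg)
        for fg, bg in zip(foreground_rgb, background_rgb)
    )


if __name__ == "__main__":
    background = hex_to_rgb("4f83ce")
    print(background)  # (79, 131, 206)
    # A fully transparent pixel takes the background color.
    print(composite_pixel((255, 255, 255), 0.0, background))  # (79, 131, 206)
```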
  15. Step 15

    Generate Layout Photo

    4. Create Six-Inch Layout Photo (for ID card printing):

    python inference.py -t generate_layout_photos \
      -i ./output_colored.jpg \
      -o ./layout.jpg \
      --height 413 \
      --width 295 \
      -k 200
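
    A layout photo tiles copies of one ID photo onto a print sheet. As a rough illustration of the arithmetic, the sketch below counts how many 295x413 photos fit on a sheet. The sheet size of 1800x1200 pixels (6x4 inches at 300 DPI) is an assumption for this example, and the project's own layout logic may use different dimensions and margins.

```python
def grid_capacity(sheet_w, sheet_h, photo_w, photo_h, gap=0):
    """Count photos that fit in a simple grid with `gap` pixels between them."""
    cols = (sheet_w + gap) // (photo_w + gap)
    rows = (sheet_h + gap) // (photo_h + gap)
    return cols * rows


def best_layout(sheet_w, sheet_h, photo_w, photo_h, gap=0):
    """Compare upright and 90-degree rotated photos; return the better option."""
    upright = grid_capacity(sheet_w, sheet_h, photo_w, photo_h, gap)
    rotated = grid_capacity(sheet_w, sheet_h, photo_h, photo_w, gap)
    return ("rotated", rotated) if rotated > upright else ("upright", upright)


if __name__ == "__main__":
    # Assumed sheet: 6x4 inches at 300 DPI.
    print(grid_capacity(1800, 1200, 295, 413))  # 12
    print(best_layout(1800, 1200, 295, 413))    # ('rotated', 16)
```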
  16. Step 16

    Deploy FastAPI Backend

    Run the project as an API service for programmatic access:

    python deploy_api.py
  17. Step 17

    API Service Features

    The FastAPI backend provides RESTful endpoints for:

    • ID photo generation
    • Portrait matting
    • Background color addition
    • Layout photo creation
    • Batch processing support

    CURL Request Example:

    curl -X POST "http://127.0.0.1:8080/idphoto" \
      -F "input_image=@path/to/photo.jpg" \
      -F "height=413" \
      -F "width=295" \
      -F "human_matting_model=hivision_modnet"
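
    Each `-F` option in the curl command becomes one part of a `multipart/form-data` request body. The sketch below builds such a body with the standard library only, which can be useful where curl is unavailable. The function names are hypothetical; the URL, form field names, and file field name should be copied from the curl command above.

```python
import urllib.request
import uuid


def encode_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        header = f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n'
        parts.append(header.encode() + str(value).encode() + b"\r\n")
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    parts.append(file_header.encode() + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def post_form(url, fields, file_field, image_path):
    """POST the form to url and return the raw response body."""
    with open(image_path, "rb") as handle:
        image_bytes = handle.read()
    body, content_type = encode_multipart(fields, file_field, image_path, image_bytes)
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```

    With the API service running, `post_form(url, {"height": 413, "width": 295}, file_field, "path/to/photo.jpg")` sends the same kind of request as the curl example.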
  18. Step 18

    Docker Deployment

    Deploy the application using Docker for consistent environments across systems.

    # Pull the official image
    docker pull linzeyi/hivision_idphotos
    
    # Or build from local Dockerfile (after placing model weights)
    docker build -t linzeyi/hivision_idphotos .
  19. Step 19

    Run Docker Containers

    Run both services (the Gradio demo and the API backend) simultaneously with Docker Compose:

    docker compose up -d
  20. Step 20

    Environment Variables

    The project supports several configuration options via environment variables:

    | Variable | Type | Description |
    |---|---|---|
    | FACE_PLUS_API_KEY | Optional | Face++ API key for online face detection |
    | FACE_PLUS_API_SECRET | Optional | Face++ API secret |
    | RUN_MODE | Optional | Set to beast for faster inference (models stay in memory) |

    docker run -d -p 7860:7860 \
      -e FACE_PLUS_API_KEY=your_key \
      -e FACE_PLUS_API_SECRET=your_secret \
      -e RUN_MODE=beast \
      linzeyi/hivision_idphotos
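
    Inside the container, these values are ordinary environment variables. The sketch below shows one way a script could read them; it is an illustration rather than the project's own configuration code, and `read_settings` is a hypothetical helper.

```python
import os


def read_settings(env=os.environ):
    """Summarize the optional settings from an environment mapping."""
    has_key = bool(env.get("FACE_PLUS_API_KEY"))
    has_secret = bool(env.get("FACE_PLUS_API_SECRET"))
    return {
        # Beast mode is enabled only by the exact value "beast".
        "beast_mode": env.get("RUN_MODE", "") == "beast",
        # Face++ needs both the key and the secret.
        "face_plus_configured": has_key and has_secret,
    }


if __name__ == "__main__":
    print(read_settings())
```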
  21. Step 21

    Performance Reference

    Benchmark results on a Mac M1 Max with 64GB of memory (CPU only, no GPU acceleration):

    | Model Combination | Memory Usage | Inference Time (512x715) | Inference Time (764x1146) |
    |---|---|---|---|
    | MODNet + MTCNN | 410MB | 0.207s | 0.246s |
    | MODNet + RetinaFace | 405MB | 0.571s | 0.971s |
    | BiRefNet-lite + RetinaFace | 6.20GB | 7.063s | 7.128s |

    ⚠ Heads up: BiRefNet-lite requires significant memory (~6GB) and works best with GPU acceleration.
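
    To compare timings on your own hardware, a small helper can average several runs after a warm-up call. This is a generic sketch; the guide does not state how the figures above were measured, and `average_seconds` is a hypothetical name.

```python
import time


def average_seconds(func, *args, repeats=5, warmup=1):
    """Return the mean wall-clock time of func(*args) over `repeats` runs."""
    for _ in range(warmup):
        func(*args)  # warm-up runs are excluded (model loading, caches)
    start = time.perf_counter()
    for _ in range(repeats):
        func(*args)
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    # Stand-in workload; replace with a call into your inference code.
    print(f"{average_seconds(sum, range(100_000)):.6f} s per call")
```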
  22. Step 22

    Technology Stack

    Key technologies and frameworks used:

    | Category | Technology | Purpose |
    |---|---|---|
    | Language | Python 3.7+ | Core implementation |
    | Deep Learning | ONNX Runtime | Model inference |
    | Image Processing | OpenCV | Image manipulation |
    | Matting Models | MODNet, RMBG-1.4, BiRefNet | Portrait extraction |
    | Face Detection | MTCNN, RetinaFace, Face++ | Face localization |
    | Web Framework | FastAPI | REST API backend |
    | UI Framework | Gradio | Web interface |
    | Containerization | Docker | Deployment |
    | Machine Learning | NumPy, PIL | Numerical operations |

  23. Step 23

    Model Architecture Details

    The matting models use deep neural networks:

    MODNet (Hivision Variant):

    • Lightweight architecture optimized for real-time performance
    • Runs efficiently on CPU
    • Good balance of speed and quality

    RMBG-1.4 (BRIA AI):

    • Vision Transformer (ViT)-based architecture
    • Higher quality matting
    • Slower inference (176.2MB model)

    BiRefNet V1-lite:

    • Bidirectional refinement network
    • State-of-the-art matting quality
    • Requires GPU for practical inference speed
    # View available models in the codebase
    from hivision.creator.choose_handler import HUMAN_MATTING_MODELS
    print(HUMAN_MATTING_MODELS)
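
    The trade-offs above can be summarized as a simple selection rule. This is a sketch of the guidance in this step, not project code: `suggest_matting_model` is hypothetical, and the returned names are assumed to match the weight file names without the `.onnx` extension.

```python
def suggest_matting_model(has_gpu, prefer_quality=False):
    """Suggest a matting model name from the hardware and quality preference."""
    if has_gpu:
        # Best quality, but needs a GPU for practical inference speed.
        return "birefnet-v1-lite"
    if prefer_quality:
        # Higher quality on CPU at the cost of slower inference.
        return "rmbg-1.4"
    # Lightweight default for CPU-only inference.
    return "hivision_modnet"


if __name__ == "__main__":
    print(suggest_matting_model(has_gpu=False))
```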
  24. Step 24

    Troubleshooting

    Issue 3: CUDA/GPU not working
    Solution: Verify that cuDNN is installed, or fall back to CPU-only mode by installing onnxruntime instead of onnxruntime-gpu.

    python app.py --port 8080 --host 0.0.0.0
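
    Before passing a `--port` value as in the command above, you can check whether that port is free. This is a minimal sketch using the standard `socket` module; the helper names are hypothetical, and 7860 is the Gradio address used earlier in this guide.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can bind to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(start=7860, attempts=20, host="127.0.0.1"):
    """Return the first free port in [start, start + attempts), or None."""
    for port in range(start, start + attempts):
        if port_is_free(port, host):
            return port
    return None


if __name__ == "__main__":
    print("Suggested port:", first_free_port())
```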
  25. Step 25

    Advanced Customization

    2. Modify preset colors: Edit demo/assets/color_list_EN.csv (name, hex)

    3. Add custom watermark fonts: Place font files in hivision/plugin/font/ and update hivision/plugin/watermark.py.

    Standard,413,295
    One inch,567,413
    Two inches,626,413
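
    Rows like the ones above can be read with the standard `csv` module. The sketch below assumes the column order is name, height, width, which matches the `--height 413 --width 295` pairing used earlier in this guide; `parse_size_rows` is a hypothetical helper.

```python
import csv
import io


def parse_size_rows(text):
    """Parse 'name,height,width' rows into {name: (height, width)}."""
    sizes = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) != 3:
            continue  # skip blank or malformed rows
        name, height, width = (cell.strip() for cell in row)
        try:
            sizes[name] = (int(height), int(width))
        except ValueError:
            continue  # skip a header row or non-numeric values
    return sizes


if __name__ == "__main__":
    sample = "Standard,413,295\nOne inch,567,413\nTwo inches,626,413\n"
    print(parse_size_rows(sample))
```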
  26. Step 26

    Community Projects and Extensions

    Several community-built extensions exist:

    • HivisionIDPhotos-ComfyUI: ComfyUI workflow for ID photo processing
    • HivisionIDPhotos-cpp: C++ version for better performance
    • HivisionIDPhotos-windows-GUI: Windows desktop app
    • HivisionIDPhotos-wechat-weapp: WeChat mini program
    # Explore community projects at:
    # https://github.com/Zeyi-Lin/HivisionIDPhotos
