
Hugging Face Transformers: State-of-the-Art Machine Learning

Get started with Hugging Face Transformers, the leading library for state-of-the-art machine learning models in PyTorch, TensorFlow, and JAX. Learn installation, configuration, and practical use cases for NLP, vision, audio, and multimodal tasks.

  1. Step 1

    Check Python version

    Make sure Python 3.10 or higher is installed; Transformers is tested on Python 3.10+. On some systems the interpreter is named python3 rather than python.

    python --version
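
The same check can be made from inside Python, which is convenient at the top of a setup script. This is a minimal sketch: `meets_minimum` and `MIN_VERSION` are illustrative names, and the 3.10 floor is simply the version stated in this step.

```python
import sys

MIN_VERSION = (3, 10)  # minimum stated in this guide


def meets_minimum(version=None, minimum=MIN_VERSION):
    """Return True if `version` (default: the running interpreter) is >= `minimum`."""
    if version is None:
        version = sys.version_info
    # Compare only (major, minor); patch releases do not matter here.
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if meets_minimum() else "too old"
    print(f"Python {current}: {status}")
```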
  2. Step 2

    Set up a virtual environment (recommended)

    Creating a virtual environment helps manage dependencies and avoid conflicts between packages. This is especially important for ML projects with many dependencies.

    # Create virtual environment
    python -m venv transformers-env
    
    # Activate on Linux/Mac
    source transformers-env/bin/activate
    
    # Activate on Windows
    transformers-env\Scripts\activate
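
To confirm that the environment is actually active before installing anything, you can compare `sys.prefix` with `sys.base_prefix`. The sketch below uses a hypothetical helper name, `in_virtual_environment`, and detects environments created with `venv`; it is not meant to cover conda environments.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter is running inside a venv.

    Inside a venv, sys.prefix points at the environment directory while
    sys.base_prefix still points at the base Python installation.
    """
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    if in_virtual_environment():
        print(f"Active environment: {sys.prefix}")
    else:
        print("No virtual environment is active; pip would install into the base interpreter.")
```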
  3. Step 3

    Install Transformers with PyTorch

    Install the core transformers library along with PyTorch as the backend. PyTorch is the recommended choice for research and experimentation with Transformers.

    pip install transformers torch
  4. Step 4

    Install Transformers with TensorFlow (alternative)

    If you prefer TensorFlow or need it for production deployment, install Transformers with TensorFlow backend instead. Note that PyTorch has more seamless integration with Transformers.

    pip install transformers tensorflow
  5. Step 5

    Install additional dependencies

    For production use and advanced features, install additional recommended packages including accelerate for distributed training, datasets for loading ML datasets, tokenizers for fast tokenization, and sentencepiece for certain model tokenizers.

    pip install accelerate datasets tokenizers sentencepiece
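
After the install commands, it can be useful to see which packages ended up in the environment and at which versions. A minimal sketch using only the standard library; `installed_versions` is an illustrative helper, and the package list is taken from the pip commands in the steps above.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names from the pip commands in the steps above.
PACKAGES = ["transformers", "torch", "accelerate", "datasets", "tokenizers", "sentencepiece"]


def installed_versions(packages=PACKAGES):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name:15} {ver or 'not installed'}")
```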
  6. Step 6

    Verify installation

    Test that Transformers is installed correctly by running a simple sentiment analysis pipeline. This will download a pre-trained model and run inference.

    python -c "from transformers import pipeline; print(pipeline('sentiment-analysis')('Hugging Face is amazing!'))"
  7. Step 7

    Understand the Pipeline API

    The Pipeline API is the simplest way to use Transformers. It provides a high-level abstraction for common ML tasks including text generation, image segmentation, automatic speech recognition, and more. Here are several practical examples.

    from transformers import pipeline
    
    # Sentiment analysis
    sentiment = pipeline('sentiment-analysis')
    result = sentiment('I love using Transformers!')
    print(result)
    
    # Text generation
    generator = pipeline('text-generation', model='gpt2')
    text = generator('Once upon a time', max_length=50)
    print(text)
    
    # Question answering
    qa = pipeline('question-answering')
    context = 'Transformers is a library by Hugging Face.'
    question = 'Who created Transformers?'
    answer = qa(question=question, context=context)
    print(answer)
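
Each pipeline returns plain Python structures: classification and generation return a list of dictionaries, while question answering returns a single dictionary. The sketch below shows one way to pull out the useful parts. The helper names and the 0.5 threshold are assumptions for illustration, and the sample values are hand-written stand-ins, not real model output.

```python
def best_label(classification_output):
    """Return (label, score) for the highest-scoring entry of a classification result."""
    top = max(classification_output, key=lambda item: item["score"])
    return top["label"], top["score"]


def generated_texts(generation_output):
    """Return just the strings from a text-generation result."""
    return [item["generated_text"] for item in generation_output]


def answer_if_confident(qa_output, threshold=0.5):
    """Return the answer string, or None when the score is below `threshold`."""
    return qa_output["answer"] if qa_output["score"] >= threshold else None


# Hand-written stand-ins with the same structure as real pipeline outputs.
# The scores and text are made up for illustration, not model results.
sample_sentiment = [{"label": "POSITIVE", "score": 0.99}]
sample_generation = [{"generated_text": "Once upon a time ..."}]
sample_qa = {"score": 0.42, "start": 29, "end": 41, "answer": "Hugging Face"}

print(best_label(sample_sentiment))
print(generated_texts(sample_generation))
print(answer_if_confident(sample_qa, threshold=0.3))
```

With real pipelines, you would pass `result`, `text`, and `answer` from the example above in place of the sample values.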
  8. Step 8

    Use the AutoModel API for custom workflows

    For more control beyond pipelines, use the AutoModel API. This allows you to load any model from the Hugging Face Hub and use it with custom preprocessing and postprocessing.

    from transformers import AutoTokenizer, AutoModel
    import torch
    
    # Load model and tokenizer
    model_name = 'bert-base-uncased'
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    
    # Tokenize input
    text = 'Hello, how are you?'
    inputs = tokenizer(text, return_tensors='pt')
    
    # Run inference
    with torch.no_grad():
        outputs = model(**inputs)
        last_hidden_states = outputs.last_hidden_state
    
    print(f'Output shape: {last_hidden_states.shape}')
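
`last_hidden_state` holds one vector per token, with shape (batch, sequence length, hidden size). A common piece of custom postprocessing is to average those vectors into a single sentence vector while skipping padding. The sketch below does this on plain Python lists so that it runs without PyTorch. Mean pooling is one possible choice, not something this guide prescribes, and `mean_pool` is an illustrative name.

```python
def mean_pool(token_vectors, attention_mask):
    """Average the vectors of real tokens (mask == 1), skipping padding (mask == 0).

    token_vectors: one sequence, as a list of seq_len vectors of equal length
    attention_mask: list of seq_len ints, 1 for real tokens and 0 for padding
    """
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    hidden_size = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(hidden_size)]


# Toy input: 3 token positions, hidden size 2, last position is padding.
toy_vectors = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
toy_mask = [1, 1, 0]
print(mean_pool(toy_vectors, toy_mask))
```

With the real model output, the equivalent call would be `mean_pool(last_hidden_states[0].tolist(), inputs['attention_mask'][0].tolist())`; for larger batches the same computation is better done with tensor operations.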
  9. Step 9

    Fine-tune a model with the Trainer API

    The Trainer API provides a complete training and evaluation loop, abstracting away boilerplate code. It supports features like mixed precision, distributed training, and gradient accumulation.

    from transformers import AutoModelForSequenceClassification, AutoTokenizer, TrainingArguments, Trainer
    from datasets import load_dataset
    
    # Load dataset, tokenizer, and model
    dataset = load_dataset('imdb')
    tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
    model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)
    
    # Tokenize dataset
    def tokenize_function(examples):
        return tokenizer(examples['text'], padding='max_length', truncation=True)
    
    tokenized_datasets = dataset.map(tokenize_function, batched=True)
    
    # Configure training
    training_args = TrainingArguments(
        output_dir='./results',
        num_train_epochs=3,
        per_device_train_batch_size=8,
        per_device_eval_batch_size=8,
        warmup_steps=500,
        weight_decay=0.01,
        logging_dir='./logs',
    )
    
    # Create trainer and train
    trainer = Trainer(
        model=model,
        args=training_args,
        # Shuffle before taking a subset so both labels are represented
        train_dataset=tokenized_datasets['train'].shuffle(seed=42).select(range(1000)),
        eval_dataset=tokenized_datasets['test'].shuffle(seed=42).select(range(100))
    )
    
    trainer.train()
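
When choosing `warmup_steps`, it helps to compare it with the total number of optimizer steps in the run. The sketch below estimates that total for the 1,000-example training subset used above, assuming a single device and no gradient accumulation. `total_optimizer_steps` is an illustrative helper, not part of the Trainer API.

```python
import math


def total_optimizer_steps(num_examples, per_device_batch_size, num_epochs,
                          num_devices=1, gradient_accumulation_steps=1):
    """Estimate how many optimizer updates a training run performs."""
    examples_per_update = per_device_batch_size * num_devices * gradient_accumulation_steps
    updates_per_epoch = math.ceil(num_examples / examples_per_update)
    return updates_per_epoch * num_epochs


# Values from the step above: 1,000 training examples, batch size 8, 3 epochs.
steps = total_optimizer_steps(num_examples=1000, per_device_batch_size=8, num_epochs=3)
warmup_steps = 500
print(f"total steps: {steps}, warmup steps: {warmup_steps}")
if warmup_steps >= steps:
    print("Warmup is longer than the run; the learning rate never reaches its peak.")
```

Under these assumptions the run has 375 optimizer steps, so a 500-step warmup does not finish. For a subset this small, a much shorter warmup (or training on more data) is probably the better fit.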
  10. Step 10

    Work with vision models

    Transformers supports vision tasks using Vision Transformer (ViT) and other vision models. You can perform image classification, object detection, and image segmentation.

    from transformers import pipeline
    from PIL import Image
    import requests
    
    # Load image classifier
    classifier = pipeline('image-classification', model='google/vit-base-patch16-224')
    
    # Download and classify an image
    url = 'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg'
    image = Image.open(requests.get(url, stream=True).raw)
    
    results = classifier(image)
    for result in results:
        print(f"{result['label']}: {result['score']:.4f}")
  11. Step 11

    Use translation models

    Transformers includes powerful translation models for translating text between languages. The library supports hundreds of language pairs.

    from transformers import pipeline
    
    # English to French translation
    translator = pipeline('translation_en_to_fr', model='Helsinki-NLP/opus-mt-en-fr')
    text = 'Hello, how are you doing today?'
    translation = translator(text, max_length=100)
    print(translation[0]['translation_text'])
    
    # For other language pairs, use models like:
    # Helsinki-NLP/opus-mt-en-de (English to German)
    # Helsinki-NLP/opus-mt-en-es (English to Spanish)
    # Helsinki-NLP/opus-mt-fr-en (French to English)
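
The model ids listed above follow a `Helsinki-NLP/opus-mt-{source}-{target}` pattern, and the pipeline task name follows `translation_{source}_to_{target}`. A small sketch that builds both from language codes; the helper names are illustrative, and not every language pair is guaranteed to have a published checkpoint, so check the Hub before relying on a generated id.

```python
def opus_mt_model_id(source_lang, target_lang):
    """Build a Hub model id following the Helsinki-NLP/opus-mt-{src}-{tgt} pattern."""
    return f"Helsinki-NLP/opus-mt-{source_lang}-{target_lang}"


def translation_task(source_lang, target_lang):
    """Build a pipeline task name following the translation_{src}_to_{tgt} pattern."""
    return f"translation_{source_lang}_to_{target_lang}"


for src, tgt in [("en", "fr"), ("en", "de"), ("fr", "en")]:
    print(translation_task(src, tgt), "->", opus_mt_model_id(src, tgt))
```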
  12. Step 12

    Configure model cache directory

    By default, Transformers caches downloaded models under ~/.cache/huggingface/hub in your home directory. Set HF_HOME to move the whole Hugging Face cache, or HF_HUB_CACHE to move only the cache of repositories downloaded from the Hub.

    # Set cache directory (Linux/Mac)
    export HF_HOME=/path/to/your/cache
    
    # Set cache directory (Windows)
    set HF_HOME=C:\path\to\your\cache
    
    # Or set in Python (do this before importing transformers)
    import os
    os.environ['HF_HOME'] = '/path/to/your/cache'
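
To see which directory the two variables resolve to, the precedence can be sketched as follows. This is a simplified model of the documented defaults, not the library's own code: it ignores `XDG_CACHE_HOME`, and `resolve_hub_cache` is an illustrative name.

```python
import os
from pathlib import Path


def resolve_hub_cache(env=None):
    """Return the directory where downloaded models would be cached.

    Simplified precedence: HF_HUB_CACHE wins, otherwise "<HF_HOME>/hub",
    with HF_HOME defaulting to ~/.cache/huggingface.
    """
    env = os.environ if env is None else env
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"]).expanduser()
    hf_home = env.get("HF_HOME") or "~/.cache/huggingface"
    return Path(hf_home).expanduser() / "hub"


if __name__ == "__main__":
    print(resolve_hub_cache())
```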
  13. Step 13

    Use models offline

    After downloading models, you can use them offline by setting the HF_HUB_OFFLINE environment variable. Models are loaded from the local cache.

    # Enable offline mode (Linux/Mac)
    export HF_HUB_OFFLINE=1
    
    # Enable offline mode (Windows)
    set HF_HUB_OFFLINE=1
  14. Step 14

    Explore the Hugging Face Hub

    The Hugging Face Hub hosts over 1 million pre-trained model checkpoints across various tasks. Browse models, datasets, and spaces at huggingface.co.

    from huggingface_hub import list_models
    
    # List popular text classification models
    models = list_models(task='text-classification', sort='downloads', limit=5)
    for model in models:
        print(f'{model.id} - {model.downloads} downloads')
    
    # Search for specific models
    bert_models = list_models(search='bert', limit=10)
    for model in bert_models:
        print(model.id)
  15. Step 15

    Use Flash Attention for faster inference

    Flash Attention can significantly speed up inference for large language models. Install flash-attn and enable it in your model configuration.

    # Install Flash Attention (requires CUDA)
    pip install flash-attn --no-build-isolation
  16. Step 16

    Enable Flash Attention in models

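    Load models with Flash Attention 2 enabled to speed up inference on compatible hardware (NVIDIA GPUs with compute capability 8.0+, that is, Ampere or newer). Speedups of 2-4x are possible, but the actual gain depends on the model, sequence length, and batch size. Flash Attention 2 runs only in half precision (float16 or bfloat16).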

    from transformers import AutoModelForCausalLM
    
    # Load model with Flash Attention
    model = AutoModelForCausalLM.from_pretrained(
        'meta-llama/Llama-2-7b-hf',
        attn_implementation='flash_attention_2',
        torch_dtype='auto',
        device_map='auto'
    )
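
Because Flash Attention depends on both the GPU and the `flash-attn` package, it can be convenient to pick the `attn_implementation` value at runtime. The sketch below is one way to do that. `choose_attn_implementation` is a hypothetical helper, and falling back to `'sdpa'` (PyTorch's built-in scaled dot-product attention) is a choice made here, not something this guide prescribes.

```python
import importlib.util


def choose_attn_implementation(compute_capability, flash_attn_installed=None):
    """Pick a value for `attn_implementation`.

    compute_capability: (major, minor) of the GPU, or None when no CUDA GPU is present
    flash_attn_installed: override for testing; detected automatically when None
    """
    if flash_attn_installed is None:
        flash_attn_installed = importlib.util.find_spec("flash_attn") is not None
    gpu_ok = compute_capability is not None and tuple(compute_capability) >= (8, 0)
    if gpu_ok and flash_attn_installed:
        return "flash_attention_2"
    return "sdpa"  # PyTorch's built-in scaled dot-product attention


print(choose_attn_implementation((8, 6), flash_attn_installed=True))
print(choose_attn_implementation((7, 5), flash_attn_installed=True))
print(choose_attn_implementation(None))
```

With PyTorch installed, the capability tuple can be obtained from `torch.cuda.get_device_capability()` when `torch.cuda.is_available()` returns True.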
  17. Step 17

    Use quantization for memory efficiency

    Quantization reduces model memory usage by using lower precision weights. This allows you to run larger models on consumer hardware.

    from transformers import AutoModelForCausalLM, BitsAndBytesConfig
    
    # 8-bit quantization
    quantization_config = BitsAndBytesConfig(load_in_8bit=True)
    model = AutoModelForCausalLM.from_pretrained(
        'meta-llama/Llama-2-7b-hf',
        quantization_config=quantization_config,
        device_map='auto'
    )
    
    # 4-bit quantization (even more memory efficient)
    quantization_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype='float16',
        bnb_4bit_quant_type='nf4'
    )
    model = AutoModelForCausalLM.from_pretrained(
        'meta-llama/Llama-2-13b-hf',
        quantization_config=quantization_config,
        device_map='auto'
    )
    ⚠ Heads up: Quantization requires the bitsandbytes library: pip install bitsandbytes
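
To get a feel for the savings, the memory needed for the weights alone can be estimated as parameter count times bytes per parameter. The sketch below uses nominal parameter counts of 7 billion and 13 billion as assumptions. It ignores quantization overhead, activations, and the key-value cache, so real usage will be higher.

```python
BYTES_PER_PARAM = {
    "float32": 4.0,
    "float16": 2.0,
    "int8": 1.0,
    "4bit": 0.5,
}


def weight_memory_gb(num_params, precision):
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


# Nominal parameter counts; actual checkpoints differ slightly.
for name, params in [("7B model", 7e9), ("13B model", 13e9)]:
    for precision in BYTES_PER_PARAM:
        print(f"{name} in {precision}: ~{weight_memory_gb(params, precision):.1f} GB")
```

By this estimate a 7B model needs about 14 GB in float16, about 7 GB in 8-bit, and about 3.5 GB in 4-bit, which is why 4-bit loading can bring a 13B model (about 6.5 GB of weights) within reach of a consumer GPU.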
  18. Step 18

    Generate text with advanced decoding strategies

    Control text generation quality using different decoding strategies including greedy search, beam search, top-k sampling, and top-p (nucleus) sampling.

    from transformers import AutoModelForCausalLM, AutoTokenizer
    
    model_name = 'gpt2'
    model = AutoModelForCausalLM.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    
    prompt = 'In the future, AI will'
    inputs = tokenizer(prompt, return_tensors='pt')
    
    # Greedy search (deterministic)
    output = model.generate(**inputs, max_length=50)
    print(tokenizer.decode(output[0]))
    
    # Beam search (better quality)
    output = model.generate(**inputs, max_length=50, num_beams=5)
    print(tokenizer.decode(output[0]))
    
    # Top-k sampling (creative)
    output = model.generate(**inputs, max_length=50, do_sample=True, top_k=50)
    print(tokenizer.decode(output[0]))
    
    # Top-p (nucleus) sampling (balanced)
    output = model.generate(**inputs, max_length=50, do_sample=True, top_p=0.95, temperature=0.9)
    print(tokenizer.decode(output[0]))
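
To make the difference between these strategies concrete, here is a pure-Python sketch of greedy selection, top-k filtering, and top-p filtering applied to a single next-token distribution. The distribution is made up for illustration, and the functions are simplified stand-ins rather than the library's implementation; temperature and beam search are left out.

```python
import random


def top_k_filter(probs, k):
    """Keep the k most likely tokens and renormalize."""
    kept = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:k]
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


def top_p_filter(probs, p):
    """Keep the smallest set of most likely tokens whose cumulative
    probability reaches p, then renormalize."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in kept)
    return {token: prob / total for token, prob in kept}


def sample(probs, rng=random):
    """Draw one token according to its probability."""
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]


# Made-up next-token distribution for illustration.
next_token_probs = {"be": 0.5, "change": 0.3, "help": 0.15, "banana": 0.05}

greedy = max(next_token_probs, key=next_token_probs.get)
print("greedy:", greedy)
print("top-k (k=2):", top_k_filter(next_token_probs, 2))
print("top-p (p=0.9):", top_p_filter(next_token_probs, 0.9))
print("sampled:", sample(top_p_filter(next_token_probs, 0.9)))
```

Greedy search always picks the single most likely token. Top-k keeps a fixed number of candidates, while top-p keeps a variable number depending on how concentrated the distribution is, so here it drops only the unlikely "banana".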
  19. Step 19

    Work with audio models

    Transformers supports audio tasks including automatic speech recognition (ASR), audio classification, and text-to-speech. Models such as Whisper and Wav2Vec2 cover speech recognition and audio classification.

    from transformers import pipeline
    
    # Automatic speech recognition with Whisper
    asr = pipeline('automatic-speech-recognition', model='openai/whisper-base')
    audio_file = 'path/to/audio.wav'
    transcription = asr(audio_file)
    print(transcription['text'])
    
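    # Audio classification (use a checkpoint fine-tuned for classification;
    # facebook/wav2vec2-base-960h is a speech recognition checkpoint)
    classifier = pipeline('audio-classification', model='superb/wav2vec2-base-superb-ks')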
    result = classifier(audio_file)
    print(result)
    ⚠ Heads up: Audio tasks require librosa or soundfile: pip install librosa soundfile
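
Whisper and Wav2Vec2 checkpoints expect 16 kHz audio. Before transcribing, you may want to inspect a WAV file's sample rate, which matters most if you pass raw arrays to the pipeline rather than a file path. A minimal sketch using the standard-library `wave` module; it handles uncompressed PCM WAV files only, and the helper names are illustrative.

```python
import wave

TARGET_SAMPLE_RATE = 16_000  # rate expected by the Whisper and Wav2Vec2 checkpoints above


def wav_info(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        duration = wav.getnframes() / rate
    return rate, channels, duration


def needs_resampling(path, target_rate=TARGET_SAMPLE_RATE):
    """True when the file's sample rate differs from the model's expected rate."""
    return wav_info(path)[0] != target_rate
```

For example, `needs_resampling('path/to/audio.wav')` returns True for a 44.1 kHz recording, in which case a library such as librosa can be used to resample it.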
  20. Step 20

    Install from source for latest features

    Installing from the GitHub repository gives you access to the latest features and bug fixes before they are released on PyPI.

    # Clone the repository
    git clone https://github.com/huggingface/transformers.git
    cd transformers
    
    # Install in editable mode
    pip install -e .
    
    # Or install directly from GitHub
    pip install git+https://github.com/huggingface/transformers.git
  21. Step 21

    Common use cases and applications

    Transformers powers a wide range of applications across industries. Here are some common use cases:

    NLP tasks: text classification (sentiment analysis, spam detection), named entity recognition, question answering, text summarization, and translation.

    Vision tasks: image classification, object detection, image segmentation, and image captioning.

    Audio tasks: speech recognition, audio classification, and text-to-speech.

    Multimodal tasks: visual question answering, image-to-text generation, and document understanding.

    Production applications: chatbots and virtual assistants, content generation and copywriting, code generation and completion, sentiment analysis for customer feedback, and language translation services.
