TL;DR
The transformers library now supports Gemma 4 Unified, a multimodal model without dedicated vision/audio towers, alongside new models like Sapiens2 (human-centric vision), DeepSeek-OCR-2 (OCR tasks), and Mellum (code generation).
Breaking
- The Gemma4 vision pooler now casts inputs to float32 to prevent potential overflow errors when using float16 precision. (This affects models using the Gemma4 vision component.)
New
- Gemma 4 Unified: A simplified multimodal model offering strong performance without separate vision and audio encoders. (Multimodal models process both text and images/audio.)
- Sapiens2: New vision transformers designed for human-centric tasks like pose estimation. (Pose estimation identifies body positions.)
- Mellum: A code-focused Mixture-of-Experts model for code generation. (Mixture-of-Experts models use multiple sub-models.)
Fixes Worth Knowing
- Fixed a potential float16 overflow issue in the Gemma4 vision pooler, improving stability.
Before You Upgrade
- If you are using the Gemma4 vision pooler with float16 precision, be aware of the change to float32 casting, which may slightly alter results.
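Why the cast matters can be sketched with the standard library alone. The example below is a minimal illustration of float16 overflow during mean pooling and is not the library's actual pooler: the helper names are hypothetical, the half-precision rounding is emulated with `struct`, and Python's 64-bit floats stand in for float32.

```python
import struct


def round_to_float16(x: float) -> float:
    """Round x to the nearest float16 value; overflow becomes +/-inf."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses finite values beyond the float16 range (max 65504).
        return float("inf") if x > 0 else float("-inf")


def mean_pool_float16(values):
    """Mean pooling where every intermediate sum is stored as float16."""
    total = 0.0
    for v in values:
        total = round_to_float16(total + round_to_float16(v))
    return round_to_float16(total / len(values))


def mean_pool_with_upcast(values):
    """Mean pooling that accumulates in higher precision, then casts back.

    Assumption: Python floats (64-bit) stand in for float32 here.
    """
    total = sum(round_to_float16(v) for v in values)
    return round_to_float16(total / len(values))


activations = [300.0] * 400  # the running sum reaches 120000, above 65504
overflowed = mean_pool_float16(activations)
stable = mean_pool_with_upcast(activations)
```

With these inputs the float16 accumulation saturates to infinity, while the upcast version returns the expected mean. Small differences in the last digits between the two paths are the kind of result change the note above refers to.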
TL;DR
The transformers library expands model support with GLM-4.7, GLM-Image, LWDetr, LightOnOCR, and MiniMax-M2, alongside numerous bug fixes and performance improvements focused on generation and stability.
Breaking
- Deprecated classes have been removed.
- dtype per sub config is deprecated.
- Unsafe torch.load() has been fixed, potentially impacting custom loading procedures (security fix).
New
- Added support for GLM-4.7 and GLM-Image models.
- Expanded model coverage with LWDetr and LightOnOCR.
Fixes Worth Knowing
- Resolved generation length issues with qwen2_5_omni and DiT models.
- Corrected bugs in Fuyu processor width calculation.
- Fixed failing tests for several models including Bart, llava, Pix2Struct, and others.
- Addressed a crash when using FSDP2 with Tensor Parallelism.
- Improved stability with FlashAttention and quantized models.
- Resolved UTF-8 encoding issues on Windows.
Before You Upgrade
- Review
TL;DR
Qwen models (image and language) now load and function correctly, resolving issues with model type recognition and cached tokenizers.
Fixes Worth Knowing
- Grouped beam search (advanced decoding) now correctly uses configuration parameters.
- Offline tokenizers (pre-downloaded vocabularies) now load properly for Mistral models.
- Learning rate scheduler parsing is more robust.
TL;DR
The transformers library now supports Vault-Gemma, a new 1B-parameter text generation model from Google that offers a privacy-preserving alternative to existing models.
New
- Vault-Gemma Support: Added the google/vaultgemma-1b model, trained with differential privacy for enhanced data security.
- Chat Interface: Interact with Vault-Gemma directly using the transformers chat command-line tool.
Before You Upgrade
Install Vault-Gemma specifically using pip install git+https://github.com/huggingface/[email protected] as it’s a preview release and doesn’t follow standard versioning.
TL;DR
Aya Vision, a new state-of-the-art multilingual multimodal model (handles images & text), is now available, enabling image understanding and text generation in 23 languages.
New
- Aya Vision Models: Added 8B and 32B parameter models for multimodal tasks.
- Multilingual Support: Supports 23 languages for both visual and textual understanding.
Before You Upgrade
Install using pip install git+https://github.com/huggingface/[email protected] to access the Aya Vision models.
TL;DR
The transformers library now uses Git repositories for model storage, enabling versioning, access control, and scalability, fundamentally changing how models are downloaded and shared.
Breaking
- Model uploads using the previous system are no longer supported; upgrade to this release or use the new CLI tools.
- TensorFlow users: pinned sentencepiece to 0.1.91 to resolve build issues.
New
- Git-backed Model Storage: Models are now stored in Git repositories (with S3 for large files), providing versioning via tags, branches, or commit hashes (e.g., AutoTokenizer.from_pretrained("model", revision="v2.0.1")). You can even clone model repositories locally.
- TensorFlow 2.0 Support: Added functionality for state-of-the-art sequence-to-sequence transformers in TensorFlow.
- Seq2Seq Trainer: A specialized Trainer for sequence-to-sequence models is available, improving API support and performance.
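Pinning a revision might look like the sketch below. It assumes the transformers package is installed and that the repository and tag you pass actually exist. The wrapper name `load_pinned_tokenizer` is hypothetical, and only the `revision` argument comes from the release note.

```python
def load_pinned_tokenizer(model_id: str, revision: str):
    """Load a tokenizer at a fixed tag, branch, or commit hash.

    The import is local so this module can be imported even where
    transformers is not installed; the call itself needs the library
    and network (or cache) access.
    """
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(model_id, revision=revision)


# Example call (placeholder identifiers, not run here):
# tokenizer = load_pinned_tokenizer("model", "v2.0.1")
```

Pinning to a commit hash rather than a branch name keeps downloads reproducible even if the repository is updated later.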
Fixes Worth Knowing
- Fixed issues with pipelines (text generation, QA) and tokenizers, improving stability and functionality.
- Improved error messages
TL;DR
The release introduces Longformer, a new model for processing long sequences of text, alongside several community notebooks demonstrating its use and other models.
Breaking
- Model instantiation for BART, Flaubert, Japanese BERT variants, Finnish BERT variants, Dutch BERT, and ALBERT from TensorFlow now requires the full model ID (e.g., "cl-tohoku/bert-base-japanese") instead of relying on hardcoded URLs.
New
- Longformer Support: Added the Longformer model architecture, tokenizer, and pre-trained weights for tasks like question answering and sequence classification.
- Community Notebooks: Several new notebooks are available demonstrating fine-tuning and pre-training techniques for various models, including Longformer, BART, and T5.
Fixes Worth Knowing
- Corrected tokenizer behavior for summarization pipelines and fast tokenizers.
- Fixed issues with MNLI and SST-2 datasets.
- Improved robustness of the max_len attribute and added deprecation warnings.
- Fixed tokenization of extra ID symbols in the T5 tokenizer.
Before You Upgrade
- Update your code to use the full model ID when instantiating
TL;DR
The transformers library now supports DistilBERT, a faster and lighter version of BERT, alongside new checkpoints for GPT-2 Large and XLM, significantly expanding model options for various natural language processing (NLP) tasks.
Breaking
- A new dependency, sacremoses (a Moses tokenizer port), is required for XLM support.
- XLM tokenization in Thai, Japanese, and Chinese may require additional, optional dependencies (pythainlp, kytea, jieba) which must be installed separately.
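One way to check which of these packages are present before upgrading is sketched below, using only the standard library. It assumes the installed distribution names match the names listed above, which may not hold for every package (kytea in particular). The function names are hypothetical.

```python
from importlib import metadata

REQUIRED = ("sacremoses",)                  # needed for XLM support
OPTIONAL = ("pythainlp", "kytea", "jieba")  # Thai, Japanese, Chinese


def is_installed(dist_name: str) -> bool:
    """Return True if a distribution with this name is installed."""
    try:
        metadata.version(dist_name)
        return True
    except metadata.PackageNotFoundError:
        return False


def xlm_dependency_report() -> dict:
    """Map each required or optional package name to its install status."""
    return {name: is_installed(name) for name in REQUIRED + OPTIONAL}


report = xlm_dependency_report()
for name, present in report.items():
    kind = "required" if name in REQUIRED else "optional"
    print(f"{name} ({kind}): {'installed' if present else 'missing'}")
```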
New
- DistilBERT: A distilled version of BERT offering improved speed and efficiency.
- GPT-2 Large: The 774M parameter GPT-2 model is now available.
- AutoModels: Generic classes for easier model instantiation using from_pretrained().
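As a rough illustration of the generic classes, the sketch below loads a tokenizer and model from a single identifier. It assumes the current package name `transformers` (the package name at the time of this release may have differed) and uses `distilbert-base-uncased` as an example checkpoint. The wrapper function is hypothetical.

```python
def load_auto_model(model_id: str = "distilbert-base-uncased"):
    """Instantiate a tokenizer and model from one identifier.

    The Auto classes pick the concrete architecture from the checkpoint,
    so the caller does not name a model-specific class. The import is
    local so this module imports without transformers installed.
    """
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    return tokenizer, model


# Example call (downloads weights, not run here):
# tokenizer, model = load_auto_model()
```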
Fixes Worth Knowing
- Improved multi-GPU training stability.
- Corrected saving and reloading of models with pruned heads.
- Fixed issues with GPT-2 and RoBERTa tokenizers related to sentence spacing.
- Enhanced XLM tokenization for multilingual inputs.
- Added shortcuts for accessing special token IDs (e
TL;DR
This release updates the transformers library with improved model saving/loading and replaces the old learning rate warmup with more flexible scheduling options.
Breaking
- warmup_linear in OpenAIAdam and BertAdam is removed; use the new schedule classes instead (learning rate adjustments).
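For readers unfamiliar with the idea, a linear warmup schedule can be sketched in plain Python as below. This is an illustrative formula under stated assumptions, not the library's schedule classes: the function name, the default warmup fraction, and the linear decay after warmup are all choices made for this example.

```python
def linear_warmup_then_decay(progress: float, warmup: float = 0.1) -> float:
    """Learning-rate multiplier for a given fraction of training completed.

    progress: fraction of total training steps done, in [0, 1].
    warmup:   fraction of training spent ramping up, in (0, 1).
    Returns a multiplier in [0, 1] to apply to the base learning rate.
    """
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must be in [0, 1]")
    if not 0.0 < warmup < 1.0:
        raise ValueError("warmup must be in (0, 1)")
    if progress < warmup:
        # Ramp linearly from 0 up to 1 during the warmup phase.
        return progress / warmup
    # Decay linearly from 1 down to 0 over the remaining steps.
    return max((1.0 - progress) / (1.0 - warmup), 0.0)


# Multiplier at a few points of a run with 10% warmup.
multipliers = [linear_warmup_then_decay(p) for p in (0.0, 0.05, 0.1, 0.55, 1.0)]
```

The multiplier is applied to the base learning rate at each step, so the optimizer starts near zero, peaks at the end of warmup, and reaches zero at the final step.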
New
- BERT language model fine-tuning scripts are added (scripts for training).
- GLUE task support is expanded in run_classifier.py (natural language understanding benchmark).
Fixes Worth Knowing
- Tokenizers now support sequences longer than 512 tokens (input length).
- GPT-2 loss computation and FP16 training stability are improved (generation quality).
- Model serialization is more reliable (saving/loading models).