Release Notes
model : support granite multilingual embeddings R2 (ibm-granite/granite-embedding-{97,311}m-multilingual-r2) (#22716)
Add support for the ibm-granite/granite-embedding-{97m,311m}-multilingual-r2 embedding models:
For the 97m model, added a variant of the gpt4o tokenizer with a fixed regex (better handling of marks) and a different token-merging setting
Reused gemma4 tokenizer for the 311m model
granite-embedding-*-multilingual-r2 : add SwiGLU FFN support
added new GGUF key .hidden_activation (LLM_KV_HIDDEN_ACT) + writer
added a forward declaration of llm_ffn_op_type to llama-hparams.h
added llm_ffn_op in hparams
added LLM_FFN_NONE = 0 sentinel to llm_ffn_op_type (the value that value-initialization yields)
modern-bert: explicitly assigns LLM_FFN_GEGLU before reading GGUF (existing behavior unchanged)
centralized hidden_act mapping in llama-model.cpp, added llm_ffn_op_type_from_string() helper, mirroring rope_scaling_type/llama_rope_scaling_type_from_string()
modern-bert reads the GGUF key (when present) and uses the resulting op in its FFN graph
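A minimal sketch of how such a string-to-op helper and the modern-bert default could fit together. The enum below is a reduced stand-in for llm_ffn_op_type, and the name table, the strings "silu" and "gelu", and the resolve_modern_bert_ffn_op() function are all assumptions made for illustration, not taken from the change itself:

```cpp
#include <map>
#include <string>

// Reduced stand-in for llm_ffn_op_type; the real enum has more members.
enum llm_ffn_op_type {
    LLM_FFN_NONE = 0, // sentinel: what value-initialization produces
    LLM_FFN_SWIGLU,
    LLM_FFN_GEGLU,
};

// Hypothetical name table; the actual strings accepted may differ.
static const std::map<llm_ffn_op_type, const char *> FFN_OP_NAMES = {
    { LLM_FFN_SWIGLU, "silu" },
    { LLM_FFN_GEGLU,  "gelu" },
};

// Look the name up in the table; unknown names map to the sentinel.
static llm_ffn_op_type llm_ffn_op_type_from_string(const std::string & name) {
    for (const auto & kv : FFN_OP_NAMES) {
        if (kv.second == name) {
            return kv.first;
        }
    }
    return LLM_FFN_NONE;
}

// Hypothetical helper: hidden_activation is nullptr when the GGUF key is absent.
static llm_ffn_op_type resolve_modern_bert_ffn_op(const std::string * hidden_activation) {
    llm_ffn_op_type op = LLM_FFN_GEGLU; // default assigned before reading GGUF
    if (hidden_activation != nullptr) {
        const llm_ffn_op_type parsed = llm_ffn_op_type_from_string(*hidden_activation);
        if (parsed != LLM_FFN_NONE) {
            op = parsed; // assumption: unrecognized strings keep the default
        }
    }
    return op;
}
```

In this sketch, a model whose GGUF lacks the key keeps the GEGLU path, while one that carries a recognized value switches the FFN op accordingly.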
Added granite-embedding-{97m,311m}-multilingual-r2 to the converter code
Added the hashes for the granite embedding multilingual R2 models
Set the hidden_activation in the GGUF if the field is present in config.json (such as for the granite embedding models)
macOS/iOS:
- macOS Apple Silicon (arm64)
- macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED
- macOS Intel (x64)
- iOS XCFramework
Linux:
- Ubuntu x64 (CPU)
- Ubuntu arm64 (CPU)
- Ubuntu s390x (CPU)
- Ubuntu x64 (Vulkan)
- Ubuntu arm64 (Vulkan)
- Ubuntu x64 (ROCm 7.2)
- Ubuntu x64 (OpenVINO)
- Ubuntu x64 (SYCL FP32) DISABLED
Android:
Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12) - CUDA 12.4 DLLs
- Windows x64 (CUDA 13) - CUDA 13.3 DLLs
- Windows x64 (Vulkan)
- Windows x64 (SYCL) DISABLED
- Windows x64 (HIP)
openEuler:
- DISABLED
- openEuler x86 (310p)
- openEuler x86 (910b, ACL Graph)
- openEuler aarch64 (310p)
- openEuler aarch64 (910b, ACL Graph)
UI: