b9626
Published: Jun 13, 2026

Release Notes

Add arch support for cohere2-MoE (#24260)

  • Add arch support for cohere2-MoE

  • Removed redundant gating_func checks

  • Changed ffn lookup to prefer prefix_dense_intermediate_size

  • Renamed arch to cohere2moe

  • Removed redundant lmhead check and chat template changes

  • Removed the lm_head.weight check from modify tensors; the output tensor is loaded as not required, with fallback to token_embd.weight

  • Changed the shared expert combination to the average (routed+shared)*0.5

  • Fixed sliding_window_pattern issue and pattern

  • Fixed transformers crash 'first_k_dense_replace' error

  • Remove comment

  • Removed cohere2-moe as a tokenizer type and kept as tiny_aya. Renamed North-Mini-Code-1.0.

  • Fixed MTP fail, changed to use iSWA

  • Fixed remaining todos: cohere2moe renamed, changed swa parsing to use get_key_or_arr, removed extra get_arr use

  • Force metadata usage

Co-authored-by: Sigbjørn Skjæret [email protected]

  • Remove Cohere2 checkpoint comment

Co-authored-by: Sigbjørn Skjæret [email protected]

  • Remove MTP comment

Co-authored-by: Sigbjørn Skjæret [email protected]

  • Regenerate cohere2moe tokenizer hash

  • Add cohere2moe to Llama Model Saver supported list

  • Check for zero-bias tensors and add support for Command to use LayerNorm

  • Map expert_selection_fn to sigmoid in base.py instead of command.py

  • Use bools for foundnorm/foundnormrms

Co-authored-by: Sigbjørn Skjæret [email protected]
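
The two routing-related notes above (mapping expert_selection_fn to sigmoid, and combining as (routed+shared)*0.5) can be pictured with a small, self-contained sketch. This is not the code from the PR: the function name `combine_moe_output`, the top-k selection, and the normalization of the selected gate weights are all assumptions made for illustration. Only the sigmoid gating and the final averaging step come from the notes.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def combine_moe_output(router_logits, expert_outputs, shared_output, top_k):
    """Hypothetical sketch of a sigmoid-gated MoE layer with a shared expert.

    router_logits:  one logit per routed expert
    expert_outputs: one output vector per routed expert
    shared_output:  output vector of the shared expert
    """
    # Sigmoid gating: each expert is scored independently (no softmax).
    scores = [sigmoid(z) for z in router_logits]

    # Pick the top_k highest-scoring experts.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    selected = order[:top_k]

    # Assumption: normalize the selected gate weights so they sum to 1.
    total = sum(scores[i] for i in selected)

    routed = [0.0] * len(shared_output)
    for i in selected:
        weight = scores[i] / total
        for d, value in enumerate(expert_outputs[i]):
            routed[d] += weight * value

    # From the release notes: average the routed and shared contributions.
    return [(r + s) * 0.5 for r, s in zip(routed, shared_output)]
```

With two experts scored equally, outputs `[2.0]` and `[4.0]`, and a shared output of `[1.0]`, the routed part is `3.0` and the combined result is `(3.0 + 1.0) * 0.5`.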


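
Two of the conversion notes describe "prefer X, fall back to Y" lookups: the ffn size prefers prefix_dense_intermediate_size, and the output tensor falls back to token_embd.weight. A minimal sketch of that pattern follows. The helper names are hypothetical, the `intermediate_size` fallback key is an assumption, and `output.weight` is used here only as an illustrative name for the optional output tensor; none of this is taken from the PR itself.

```python
def pick_dense_ffn_size(hparams: dict) -> int:
    """Return the dense FFN width, preferring prefix_dense_intermediate_size."""
    # The first key comes from the release notes; the second is an assumed fallback.
    for key in ("prefix_dense_intermediate_size", "intermediate_size"):
        value = hparams.get(key)
        if value is not None:
            return int(value)
    raise KeyError("no dense FFN size found in hparams")


def pick_output_tensor(tensor_names: set) -> str:
    """Return the tensor name to use for the output projection.

    The output tensor is treated as optional; when it is absent, the token
    embedding is reused (tied embeddings).
    """
    if "output.weight" in tensor_names:  # illustrative name, an assumption
        return "output.weight"
    return "token_embd.weight"
```

For example, `pick_dense_ffn_size({"prefix_dense_intermediate_size": 8, "intermediate_size": 4})` selects the prefix key, and `pick_output_tensor({"token_embd.weight"})` selects the embedding.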

macOS/iOS:

Linux:

Android:

Windows:

openEuler:

  • DISABLED
  • openEuler x86 (310p)
  • openEuler x86 (910b, ACL Graph)
  • openEuler aarch64 (310p)
  • openEuler aarch64 (910b, ACL Graph)

UI: