b9745
Published: Jun 21, 2026

Release Notes

spec : Support Step3.5/3.7 flash mtp3 (#24340)

  • add mtp_layer_offset + include nextn flags in graph reuse

  • add llama_set_mtp_layer_offset + llama_model_n_nextn_layer API

  • offset head select + require all MTP blocks

  • speculative multi-head process()

  • speculative multi-head draft()

  • gather outputs via inp_out_ids

  • cleanup

  • fix core

  • minor cleanup

  • merged draft_multi_head into draft()

  • mtp rename nextn

  • Apply suggestions from code review

Co-authored-by: Aman Gupta

  • clean-up comments

  • fix for multi seq

  • apply suggestions && chain-heads comment

  • add a reference for chain_heads discussion



macOS/iOS:

Linux:

Android:

Windows:

openEuler:

  • DISABLED
  • openEuler x86 (310p)
  • openEuler x86 (910b, ACL Graph)
  • openEuler aarch64 (310p)
  • openEuler aarch64 (910b, ACL Graph)

UI: