b9310
Published: May 25, 2026

Release Notes

server: fix checkpoints creation (#22929)

  • common : add common_chat_split_by_role

  • cont : fix spans to reach end of message

  • server: fix checkpoints creation

      • extract message_spans from chat templates
      • find the prompt token position before the latest user message
      • split prompt batching at that position
      • create a context checkpoint before the latest user input
      • avoid periodic mid-prompt checkpoints when that position is known
      • handle multimodal prompts when mapping text/template positions to server prompt tokens
      • add --checkpoint-min-step to control minimum spacing between checkpoints

  • cont : clean-up

  • Support autoparser detection for message barriers

  • server: fix message span delimiter and update docs
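
The checkpoint placement described in the notes above can be illustrated with a small, self-contained Python sketch. It is not the server's C++ implementation: `MessageSpan`, `latest_user_start`, `plan_batches`, and `plan_checkpoints` are hypothetical names invented for this example, and `min_step` only stands in for the spacing that `--checkpoint-min-step` is described as controlling. The sketch assumes text-only prompts and that each message's token span is already known.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class MessageSpan:
    """Hypothetical record: one chat message and its prompt token range."""
    role: str
    start: int  # index of the first prompt token of the message
    end: int    # one past the last prompt token of the message


def latest_user_start(spans: List[MessageSpan]) -> Optional[int]:
    """Return the token position where the latest user message begins."""
    for span in reversed(spans):
        if span.role == "user":
            return span.start
    return None


def plan_batches(n_tokens: int, batch_size: int,
                 split_at: Optional[int] = None) -> List[Tuple[int, int]]:
    """Split [0, n_tokens) into batches of at most batch_size tokens.

    When split_at is given, a batch boundary is forced at that position.
    """
    batches = []
    pos = 0
    while pos < n_tokens:
        end = min(pos + batch_size, n_tokens)
        if split_at is not None and pos < split_at < end:
            end = split_at  # cut the batch short so it ends at split_at
        batches.append((pos, end))
        pos = end
    return batches


def plan_checkpoints(batches: List[Tuple[int, int]],
                     split_at: Optional[int], min_step: int) -> List[int]:
    """Choose checkpoint positions on batch boundaries.

    Known split position: a single checkpoint there, no periodic ones.
    Unknown: mid-prompt boundaries spaced at least min_step tokens apart.
    """
    if split_at is not None:
        return [split_at] if split_at > 0 else []
    checkpoints = []
    last = 0
    for _, end in batches[:-1]:  # mid-prompt boundaries only
        if end - last >= min_step:
            checkpoints.append(end)
            last = end
    return checkpoints


# Usage: a 75-token prompt whose latest user message starts at token 60.
spans = [
    MessageSpan("system", 0, 10),
    MessageSpan("user", 10, 25),
    MessageSpan("assistant", 25, 60),
    MessageSpan("user", 60, 75),
]
split = latest_user_start(spans)                       # 60
batches = plan_batches(75, 32, split)                  # [(0, 32), (32, 60), (60, 75)]
checkpoints = plan_checkpoints(batches, split, 64)     # [60]
```

Forcing a batch boundary at the start of the latest user message means the state saved there covers everything before the new input, which is presumably the prefix a follow-up request would reuse.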


Co-authored-by: Alde Rojas
Co-authored-by: Georgi Gerganov
Co-authored-by: Piotr Wilkin
