Question about tool-calling order in chat_template.jinja
Hi, I have a question about the tool-calling order in the current chat_template.jinja.
From reading the template, assistant messages seem to be rendered in this order:
reasoning/reasoning_contenttool_callstool_responses(if present, including forward-scannedrole: "tool"messages)content
In other words, the template effectively does:
thinking_text
tool_calls
tool_responses
content
However, in many tool-calling setups, I would normally expect the order to be:
reasoningcontenttool_call
What makes this especially confusing is that the model itself seems to generate in that latter order as well — i.e. reasoning -> content -> tool_call during generation — while the template renders it as reasoning -> tool_call -> (tool_response) -> content.
Because of that, I’m wondering whether this ordering in the template is intentional and part of a Gemma-specific canonical format, or whether it might be an unintended bug in the template.
The relevant rendering order appears to come from the fact that the template outputs:
thinking_textfirst- then
message['tool_calls'] - then
tool_responses/ scanned tool messages - and only after that
message['content']
My main concern is round-trip consistency between:
- the order the model seems to generate, and
- the order the chat template serializes.
So my question is:
Is this ordering intentional, or should content come before tool_calls in assistant messages that contain both?
If this is intended, could you clarify what the expected canonical order is for Gemma tool-calling turns?
Thanks.
Hi @json0 ,
Thanks for reporting this issue. I was able to reproduce this issue for this particular edge case and have shared it with Engineering team for further review.