Question about tool-calling order in chat_template.jinja

#67
by json0 - opened

Hi, I have a question about the tool-calling order in the current chat_template.jinja.

From reading the template, assistant messages seem to be rendered in this order:

  1. reasoning / reasoning_content
  2. tool_calls
  3. tool_responses (if present, including forward-scanned role: "tool" messages)
  4. content

In other words, the template effectively does:

thinking_text
tool_calls
tool_responses
content

However, in many tool-calling setups, I would normally expect the order to be:

  1. reasoning
  2. content
  3. tool_call

What makes this especially confusing is that the model itself seems to generate in that latter order as well — i.e. reasoning -> content -> tool_call during generation — while the template renders it as reasoning -> tool_call -> (tool_response) -> content.

Because of that, I’m wondering whether this ordering in the template is intentional and part of a Gemma-specific canonical format, or whether it might be an unintended bug in the template.

The relevant rendering order appears to come from the fact that the template outputs:

  • thinking_text first
  • then message['tool_calls']
  • then tool_responses / scanned tool messages
  • and only after that message['content']

My main concern is round-trip consistency between:

  • the order the model seems to generate, and
  • the order the chat template serializes.

So my question is:

Is this ordering intentional, or should content come before tool_calls in assistant messages that contain both?

If this is intended, could you clarify what the expected canonical order is for Gemma tool-calling turns?

Thanks.

Google org

Hi @json0 ,

Thanks for reporting this issue. I was able to reproduce this issue for this particular edge case and have shared it with Engineering team for further review.

Sign up or log in to comment