
@mmastrangelo

Summary

Prevent ActionStep.to_messages() from emitting empty assistant messages when model_output is empty or whitespace-only.

Motivation / Problem

In tool-calling flows, models may return tool calls without any assistant text (tool-call-only responses). In these cases ActionStep.model_output can be "" or whitespace. The previous implementation always appended an assistant ChatMessage built from model_output.strip(), which could result in an assistant message with empty text content.
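For illustration, a tool-call-only response in the common OpenAI-style chat-completions shape looks roughly like this (field names vary by provider; the get_weather tool is a made-up example):

```python
# Illustrative tool-call-only assistant response in the common
# chat-completions shape: the model returns tool calls but no assistant
# text. Field names vary by provider; "get_weather" is hypothetical.
tool_call_only_response = {
    "role": "assistant",
    "content": None,  # no assistant text at all
    "tool_calls": [
        {
            "id": "call_1",
            "type": "function",
            "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
        }
    ],
}
```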

This causes:

  • Noise / unnecessary tokens in the replayed conversation history.
  • Provider compatibility issues: in a live agent setup using xAI Grok (grok-code-fast-1), consecutive empty assistant messages were correlated with a reproducible provider-side token counting failure (“maximum prompt length” / bogus token count), despite otherwise reasonable payload sizes.

Reproduction

  1. Run a tool-calling agent against a model/provider that sometimes emits tool calls with no assistant text.
  2. On those steps, ActionStep.model_output may be "" or whitespace.
  3. When converting memory back to chat messages, ActionStep.to_messages() appended an assistant message using model_output.strip().
  4. If strip() produced an empty string, an empty assistant message was added to the conversation history (see the sketch after this list).
  5. In the reported grok-code-fast-1 setup, these empty assistant messages were sufficient to reliably trigger a provider token-counting failure.
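A minimal, self-contained sketch of the pre-change behavior (simplified for illustration; the real code in src/smolagents/memory.py builds ChatMessage objects rather than plain dicts):

```python
# Simplified model of the old ActionStep.to_messages() behavior: the
# assistant message was appended unconditionally, so a blank
# model_output produced an empty assistant message.
def to_messages_old(model_output: str) -> list[dict]:
    messages: list[dict] = []
    messages.append({"role": "assistant", "content": model_output.strip()})
    return messages

print(to_messages_old("   \n"))
# [{'role': 'assistant', 'content': ''}]  <- empty assistant message
```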

Change

In src/smolagents/memory.py, ActionStep.to_messages() now:

  • computes text = self.model_output.strip()
  • only appends the assistant ChatMessage when text is non-empty

This preserves behavior for normal non-empty outputs while preventing content-free assistant messages from entering the replayed history.
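In the same simplified terms, the new behavior is (a sketch, not the actual smolagents source):

```python
# Simplified model of the new behavior: strip first, then append the
# assistant message only when real text remains.
def to_messages_new(model_output: str) -> list[dict]:
    messages: list[dict] = []
    text = model_output.strip()
    if text:  # guard: skip blank/whitespace-only outputs
        messages.append({"role": "assistant", "content": text})
    return messages

assert to_messages_new("   \n") == []  # blank -> no assistant message
assert to_messages_new("Thought: call the tool") == [
    {"role": "assistant", "content": "Thought: call the tool"}
]
```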

Tests

  • Added unit test: tests/test_memory.py::test_action_step_to_messages_ignores_blank_model_output
    • Ensures blank/whitespace-only model_output does not produce an assistant message.
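The test is roughly of this shape (a hedged sketch: the exact ActionStep constructor fields and the message format returned by to_messages() are assumptions and may differ between smolagents versions):

```python
# Hedged sketch of the added test; ActionStep's constructor fields and
# the shape of the messages returned by to_messages() are assumptions.
from smolagents.memory import ActionStep

def test_action_step_to_messages_ignores_blank_model_output():
    step = ActionStep(step_number=1, model_output="   \n")
    messages = step.to_messages()
    # Blank output must not yield any assistant message.
    assert all(m["role"] != "assistant" for m in messages)
```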

Validation / Test Plan

  • Run unit tests (make test) and quality checks (make quality).
  • Confirmed locally with a live agent setup using grok-code-fast-1 that the previously reproducible token-counting error no longer occurs after this change.
