Description
When streaming responses from LLMs contain multi-byte UTF-8 characters (Turkish, emoji, CJK), they can be corrupted when an HTTP chunk boundary falls in the middle of a multi-byte sequence. This causes the LLM to detect malformed JSON in structured output and restart generation, resulting in double-generated content within a single response.
Symptoms
- Malformed JSON in agent responses: Response contains nested/duplicated JSON structures
- LLM double-generation: The LLM starts generating, gets cut off mid-character, then restarts and generates again
- Affects specific languages: Turkish characters (ü, ö, ş, ç, ğ), emoji (🚀, 🎯), CJK (中, 日)
- Memory block errors: Downstream blocks fail with "field doesn't exist" because the response wasn't properly formatted
Example of corrupted response:

```json
{
  "model": "google/gemini-3-flash-preview",
  "content": "{\"response\": \"First part of response... Ö{\"response\": \"Complete response...\"}",
  "toolCalls": { "list": [] }
}
```

The first "response" field is cut off at a multi-byte character boundary, then the JSON restarts.
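To confirm the doubled payload is unparseable, the `content` value above (with its escapes removed) can be fed to `JSON.parse`. This is an illustration, not project code: the restarted object begins inside the first string literal, so parsing fails.

```javascript
// The inner "content" value from the corrupted response, escapes removed.
// The second {"response": ...} starts mid-string, so the first string
// literal closes early and the next token is invalid.
const content =
  '{"response": "First part of response... Ö{"response": "Complete response..."}';

let error = null;
try {
  JSON.parse(content);
} catch (e) {
  error = e;
}
console.log(error instanceof SyntaxError); // true
```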
Root Cause
When HTTP streaming splits at byte boundaries within multi-byte UTF-8 sequences:
- Example: Turkish "Ö" is 2 bytes in UTF-8 (0xC3 0x96)
- If split between chunks: Chunk 1 gets 0xC3, Chunk 2 gets 0x96
- `TextDecoder.decode(chunk)` without `{ stream: true }` doesn't maintain state
- Each chunk decoded separately outputs a replacement character (U+FFFD)
- Result: JSON becomes malformed → LLM detects error → restarts generation
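The failure is easy to reproduce with nothing but the standard `TextDecoder` (no project code assumed). This sketch splits "Ö" across two chunks and compares stateless decoding with the stateful `{ stream: true }` mode:

```javascript
// "Ö" is 0xC3 0x96 in UTF-8; simulate an HTTP stream splitting between the bytes.
const chunk1 = new Uint8Array([0xc3]); // lead byte only
const chunk2 = new Uint8Array([0x96]); // orphaned continuation byte

// Broken: decoding each chunk independently loses the pending byte,
// so each half becomes a U+FFFD replacement character.
const broken =
  new TextDecoder().decode(chunk1) + new TextDecoder().decode(chunk2);
console.log(broken === "\uFFFD\uFFFD"); // true

// Correct: one decoder, created outside the loop, buffers the partial
// sequence until the next chunk completes it.
const decoder = new TextDecoder();
const fixed =
  decoder.decode(chunk1, { stream: true }) +
  decoder.decode(chunk2, { stream: true });
console.log(fixed === "Ö"); // true
```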
Impact
This affects:
- All workflows using agents with structured output (responseFormat)
- Any language with multi-byte UTF-8 characters
- Streaming responses from any LLM provider (OpenAI, Anthropic, Google, OpenRouter, etc.)
Example Scenario
A support bot configured with structured output (strict JSON schema) receives a query in a language with multi-byte characters. The agent's response gets split during streaming:
- First chunk arrives with incomplete byte sequence of a multi-byte character
- `TextDecoder` without `{ stream: true }` outputs a replacement character; the JSON becomes invalid
- LLM detects malformed JSON (syntax error)
- LLM regenerates response within same call
- Both the partial and the regenerated responses are concatenated, producing corrupt output
- Downstream memory/processing blocks fail with "field doesn't exist" errors
Workaround
Use `TextDecoder.decode(value, { stream: true })`, where the `TextDecoder` is created once outside the streaming loop. This lets the decoder maintain internal state and buffer incomplete multi-byte sequences until the next chunk arrives.
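A minimal sketch of that pattern, assuming the stream is a standard `ReadableStream` of `Uint8Array` chunks (e.g. `response.body` from `fetch`); the function name here is illustrative, not the project's actual code:

```javascript
// Drain a byte stream into a string without corrupting multi-byte characters.
async function readUtf8Stream(body) {
  const decoder = new TextDecoder("utf-8"); // created ONCE, outside the loop
  const reader = body.getReader();
  let text = "";
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    // { stream: true } buffers a trailing partial sequence for the next chunk
    text += decoder.decode(value, { stream: true });
  }
  return text + decoder.decode(); // final call flushes any buffered bytes
}
```

Creating a fresh `TextDecoder` (or calling `decode` without `{ stream: true }`) inside the loop reintroduces the bug, because each call then treats its chunk as a complete document.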