[RHIDP-11647] Add interrupted user query to conversation #1217

Open
Jdubrick wants to merge 1 commit into lightspeed-core:main from Jdubrick:add-persisted-convo-on-interrupt

Conversation

@Jdubrick (Contributor) commented Feb 24, 2026

Description

In #1176, a new endpoint was added for interrupting in-flight (streaming) queries. When a query was interrupted, however, the user's message was not added to the conversation. This change persists the user query and adds a generic "you interrupted this" response in place of the LLM's reply, so that clients rendering the conversation in a chatbot window are not missing data.

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement
  • Benchmarks improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: Claude Opus 4.6
  • Generated by:

Related Tickets & Documents

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • New Features

    • Interrupted streaming requests now automatically persist user queries and partial responses, allowing you to review them later
    • Interrupted responses are clearly marked to indicate cancellation
  • Improvements

    • Enhanced error handling and recovery during request cancellation
    • Optimized token consumption tracking to exclude interruption scenarios

Signed-off-by: Jordan Dubrick <jdubrick@redhat.com>
@coderabbitai (bot, Contributor) commented Feb 24, 2026

Walkthrough

Introduced per-stream interrupt callbacks and improved persistence of interrupted streaming requests. Modified streaming query interrupt handling to register callbacks that persist user queries and interrupted responses when streams are cancelled, with new constants and comprehensive test coverage.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Interrupt Callback Infrastructure: src/utils/stream_interrupts.py | Added optional on_interrupt callback parameter to ActiveStream and register_stream(). Modified cancel_stream() to capture and schedule the interrupt callback as a separate asyncio task on stream cancellation. |
| Streaming Query Interrupt Persistence: src/app/endpoints/streaming_query.py | Introduced _persist_interrupted_turn() to persist user query and interrupted response on cancellation. Added _register_interrupt_callback() to register cancellation callback with persistence guard. Integrated callback registration in generate_response() and updated cancellation error handling to mark response as INTERRUPTED and invoke persistence without masking original exception. |
| Constants: src/constants.py | Added public constant INTERRUPTED_RESPONSE_MESSAGE with value "You interrupted this request." for interrupted streaming request handling. |
| Test Coverage: tests/unit/app/endpoints/test_streaming_query.py | Renamed test to reflect interrupt persistence. Extended test scenarios to verify append_turn_to_conversation and store_query_results are invoked during cancellation, interrupted events are emitted, and failure paths are handled correctly while preserving cleanup semantics. |
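
As a rough illustration of the callback flow summarized above, the following is a minimal sketch, not the code from this PR. `SketchRegistry`, its boolean return value, and the `demo()` helper are hypothetical; only the names `ActiveStream`, `register_stream`, `cancel_stream`, and `on_interrupt`, and the message text, come from the summary. It assumes the registry cancels the stream's task and then schedules the callback as a separate asyncio task.

```python
import asyncio
from collections.abc import Awaitable, Callable
from dataclasses import dataclass
from typing import Optional

InterruptCallback = Callable[[], Awaitable[None]]


@dataclass
class ActiveStream:
    """One in-flight stream: its task plus an optional interrupt callback."""

    task: asyncio.Task
    on_interrupt: Optional[InterruptCallback] = None


class SketchRegistry:
    """Hypothetical stand-in for the registry in src/utils/stream_interrupts.py."""

    def __init__(self) -> None:
        self._streams: dict[str, ActiveStream] = {}
        # Keep references so scheduled callback tasks are not garbage collected.
        self._callback_tasks: set[asyncio.Task] = set()

    def register_stream(
        self,
        request_id: str,
        task: asyncio.Task,
        on_interrupt: Optional[InterruptCallback] = None,
    ) -> None:
        self._streams[request_id] = ActiveStream(task, on_interrupt)

    def cancel_stream(self, request_id: str) -> bool:
        stream = self._streams.get(request_id)
        if stream is None or stream.task.done():
            return False
        stream.task.cancel()
        if stream.on_interrupt is not None:
            # Run the callback as its own task so it is not tied to the
            # cancelled stream task.
            cb_task = asyncio.get_running_loop().create_task(stream.on_interrupt())
            self._callback_tasks.add(cb_task)
            cb_task.add_done_callback(self._callback_tasks.discard)
        return True


async def demo() -> list[str]:
    persisted: list[str] = []
    callback_done = asyncio.Event()

    async def stream_body() -> None:
        await asyncio.sleep(3600)  # stands in for a long-running LLM stream

    async def on_interrupt() -> None:
        persisted.append("You interrupted this request.")
        callback_done.set()

    registry = SketchRegistry()
    task = asyncio.create_task(stream_body())
    registry.register_stream("req-1", task, on_interrupt)
    await asyncio.sleep(0)  # let the stream task start
    assert registry.cancel_stream("req-1")
    await asyncio.gather(task, return_exceptions=True)
    await asyncio.wait_for(callback_done.wait(), timeout=5)
    return persisted
```

Running `asyncio.run(demo())` exercises the register, cancel, and callback path end to end. The sketch keeps a reference to each callback task, since the event loop only holds weak references to tasks.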

Sequence Diagram

sequenceDiagram
    participant Client
    participant StreamQuery as Streaming<br/>Query Endpoint
    participant Registry as StreamInterrupt<br/>Registry
    participant Callback as Interrupt<br/>Callback
    participant Persistence as Persistence<br/>Layer

    Client->>StreamQuery: start_streaming_request()
    StreamQuery->>Registry: register_stream(on_interrupt=callback)
    Registry->>Registry: store ActiveStream with callback
    
    Client-->>Client: cancel_request()
    Client->>Registry: cancel_stream(request_id)
    Registry->>Registry: retrieve on_interrupt callback
    Registry->>Callback: schedule callback as async task
    Registry-->>Client: return CANCELLED
    
    Callback->>StreamQuery: _persist_interrupted_turn()
    StreamQuery->>Persistence: append_turn_to_conversation()
    StreamQuery->>Persistence: store_query_results(INTERRUPTED)
    Persistence-->>StreamQuery: success
    
    StreamQuery->>StreamQuery: yield interrupted_event
    StreamQuery->>Registry: deregister_stream()

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • lightspeed-stack#1176: Introduces StreamInterruptRegistry and request_id-based interruption mechanisms that this PR extends with per-stream callback support and interrupt persistence logic.
  • lightspeed-stack#866: Modifies the same files (src/app/endpoints/streaming_query.py, src/constants.py, src/utils/stream_interrupts.py) with streaming query and interrupt handling changes that this PR builds upon.

Suggested reviewers

  • tisnik
  • eranco74
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title directly describes the main change: adding persistence of interrupted user queries to conversations, which is the core objective of this PR. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


@coderabbitai (bot, Contributor) left a comment

🧹 Nitpick comments (4)
tests/unit/app/endpoints/test_streaming_query.py (2)

1558-1592: Test uses the real Singleton StreamInterruptRegistry despite class-level mock fixture.

This test directly instantiates StreamInterruptRegistry() (line 1563), which returns the real Singleton — bypassing the isolate_stream_interrupt_registry autouse fixture that mocks get_stream_interrupt_registry. This is intentional since the test is exercising the registry's actual cancel_stream + callback behavior, but a comment clarifying this would prevent confusion when someone sees this coexisting with the mock fixture.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/app/endpoints/test_streaming_query.py` around lines 1558 - 1592,
The test test_cancel_stream_callback_persists_when_error_hits_outside_generator
intentionally instantiates the real StreamInterruptRegistry() (bypassing the
isolate_stream_interrupt_registry autouse fixture that mocks
get_stream_interrupt_registry) to exercise real cancel_stream + on_interrupt
behavior; add a short clarifying comment above the StreamInterruptRegistry()
instantiation explaining that this is deliberate and that the autouse fixture
exists but is intentionally not used here so future readers won't mistake it for
an oversight.

1492-1556: Real task cancellation test may be flaky on slow CI runners.

The asyncio.sleep(0.05) delays at lines 1547 and 1549 assume the event loop will schedule and propagate the cancellation within 50ms. On heavily loaded CI machines, this can occasionally fail.

Consider using a more deterministic synchronization approach, such as an asyncio.Event set by the generator after it yields the first item, so the test knows the task is blocked before cancelling.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/app/endpoints/test_streaming_query.py` around lines 1492 - 1556,
The test test_generate_response_task_cancel_persists_results is flaky because it
uses fixed asyncio.sleep delays; replace those sleeps with a deterministic
Event-based handshake: add an asyncio.Event called started_event, set
started_event.set() in slow_generator immediately after yielding the first
token, then in the test await started_event.wait() before calling task.cancel();
keep cancel_event to control whether the generator would produce the second
token (so the generator still blocks until you want it to), and otherwise assert
the same conditions (result contains interrupted event,
append_turn_to_conversation and store_query_results called, and
isolate_stream_interrupt_registry.deregister_stream called with
test_request_id).
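
The Event-based handshake suggested above could look roughly like the sketch below. It is a standalone illustration rather than the actual test: `consume()` and `demo()` are hypothetical stand-ins for the code under test, while `started_event`, `cancel_event`, and `slow_generator` follow the names used in the review comment.

```python
import asyncio


async def demo() -> list[str]:
    started_event = asyncio.Event()  # set once the first token has been consumed
    cancel_event = asyncio.Event()   # never set here, so the generator stays blocked
    received: list[str] = []

    async def slow_generator():
        yield "first-token"
        # Execution resumes here only after the consumer asks for the next item.
        started_event.set()
        await cancel_event.wait()
        yield "second-token"

    async def consume() -> None:
        # Hypothetical stand-in for the streaming code under test.
        try:
            async for token in slow_generator():
                received.append(token)
        except asyncio.CancelledError:
            received.append("interrupted")
            raise

    task = asyncio.create_task(consume())
    # Wait for the handshake instead of sleeping for a fixed interval.
    await asyncio.wait_for(started_event.wait(), timeout=5)
    task.cancel()
    await asyncio.gather(task, return_exceptions=True)
    return received
```

Because the test waits on `started_event` rather than a 50 ms sleep, the cancellation is only issued once the generator is known to be blocked, regardless of how slow the runner is. The timeout only bounds the wait if something goes wrong.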
src/app/endpoints/streaming_query.py (1)

376-425: Guard mechanism is sound; consider documenting the threading model.

The mutable list[bool] guard relies on asyncio's cooperative single-threaded execution to avoid data races between the _on_interrupt task and the in-generator CancelledError handler. This is correct but subtle.

A brief inline comment noting why a plain list is safe (single-threaded event loop — no lock needed) would help future maintainers.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/app/endpoints/streaming_query.py` around lines 376 - 425, The guard list
used in _register_interrupt_callback (the variable guard and the nested
_on_interrupt coroutine) is safe without locks because asyncio runs handlers
cooperatively on a single-threaded event loop; add a short inline comment by the
guard definition (or just above _on_interrupt) explaining that the mutable
one-element list is intentionally used as a shared guard and is safe from data
races due to the single-threaded cooperative execution model (and that the
CancelledError handler and _on_interrupt will not run concurrently on multiple
OS threads). Keep the comment concise and reference guard, _on_interrupt, and
the in-generator CancelledError handler so future maintainers understand the
threading assumption.
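
To make the guard pattern concrete, here is a minimal sketch under stated assumptions. `persist_once`, `stream`, and `demo` are hypothetical names, and the persistence call is replaced by an in-memory list; only the one-element `guard` list idea and the `INTERRUPTED_RESPONSE_MESSAGE` value are taken from this PR's description.

```python
import asyncio

INTERRUPTED_RESPONSE_MESSAGE = "You interrupted this request."  # value quoted in the PR summary


async def demo() -> list[tuple[str, str]]:
    stored: list[tuple[str, str]] = []
    # One-element list shared by both code paths as an "already persisted" flag.
    guard = [False]

    async def persist_once(query: str) -> None:
        # There is no await between the check and the set, so on a
        # single-threaded event loop no other coroutine can run in between.
        if guard[0]:
            return
        guard[0] = True
        await asyncio.sleep(0)  # stand-in for real persistence I/O
        stored.append((query, INTERRUPTED_RESPONSE_MESSAGE))

    async def stream(query: str) -> None:
        try:
            await asyncio.sleep(3600)  # stands in for the streaming response
        except asyncio.CancelledError:
            await persist_once(query)  # in-generator cancellation path
            raise

    task = asyncio.create_task(stream("what is a pod?"))
    await asyncio.sleep(0)  # let the stream reach its first await
    task.cancel()
    # Separate callback path, as scheduled by the registry on interrupt.
    callback = asyncio.create_task(persist_once("what is a pod?"))
    await asyncio.gather(task, callback, return_exceptions=True)
    return stored
```

Both paths call `persist_once`, but only the first one to reach the check persists the turn. The guarantee depends on the check and the assignment not being separated by an `await`; it would not hold if the two paths ran on different OS threads.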
src/utils/stream_interrupts.py (1)

108-109: Fire-and-forget task may silently swallow unexpected errors.

create_task returns a Task that is immediately discarded. If on_interrupt raises an unhandled exception, Python will emit "Task exception was never retrieved" to stderr. While the current _persist_interrupted_turn implementation catches broadly, a future refactor could break that invariant.

Consider naming the task and adding a done-callback for defensive logging:

♻️ Suggested improvement
         if on_interrupt is not None:
-            asyncio.get_running_loop().create_task(on_interrupt())
+            task = asyncio.get_running_loop().create_task(
+                on_interrupt(), name=f"on_interrupt-{request_id}"
+            )
+            task.add_done_callback(
+                lambda t: logger.error(
+                    "on_interrupt callback failed: %s", t.exception()
+                )
+                if t.exception()
+                else None
+            )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/stream_interrupts.py` around lines 108 - 109, The fire-and-forget
asyncio.create_task call discards the Task and can lead to "Task exception was
never retrieved" if on_interrupt raises; change the call that invokes
on_interrupt() (where create_task is used) to capture the Task, set a
descriptive name (via create_task(..., name="persist_interrupted_turn") or
task.set_name()) and attach a done-callback that checks task.exception() and
logs any exception (use the module/process logger), and reference the current
_persist_interrupted_turn and on_interrupt symbols so the task name and callback
clearly indicate this operation.
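
One caveat with a lambda-based done-callback is that `Task.exception()` raises `CancelledError` when the task itself was cancelled, so a callback that calls it unconditionally can fail in that case. A slightly more defensive variant is sketched below; `log_task_failure`, the logger name, and `demo()` are hypothetical and not part of this PR.

```python
import asyncio
import logging

logger = logging.getLogger("stream_interrupts_sketch")  # hypothetical logger name


def log_task_failure(task: asyncio.Task) -> None:
    """Done-callback that logs unexpected failures of a fire-and-forget task."""
    if task.cancelled():
        # Task.exception() would raise CancelledError here, so return early.
        return
    exc = task.exception()
    if exc is not None:
        logger.error("task %s failed: %r", task.get_name(), exc)


async def demo() -> bool:
    async def failing_callback() -> None:
        raise RuntimeError("persistence failed")

    loop = asyncio.get_running_loop()
    failing = loop.create_task(failing_callback(), name="on_interrupt-req-1")
    failing.add_done_callback(log_task_failure)

    # A cancelled task goes through the same callback without raising.
    cancelled = loop.create_task(asyncio.sleep(3600), name="on_interrupt-req-2")
    cancelled.add_done_callback(log_task_failure)
    await asyncio.sleep(0)
    cancelled.cancel()

    await asyncio.gather(failing, cancelled, return_exceptions=True)
    return failing.done() and cancelled.cancelled()
```

Retrieving the exception inside the callback also marks it as retrieved, which avoids the "Task exception was never retrieved" message for the failing task.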

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 051c06c and 8d0e142.

📒 Files selected for processing (4)
  • src/app/endpoints/streaming_query.py
  • src/constants.py
  • src/utils/stream_interrupts.py
  • tests/unit/app/endpoints/test_streaming_query.py

@tisnik (Contributor) left a comment

LGTM
