Improve tool efficiency: connection pooling, parallel execution, dedu… #305
Open
Varahiskillhub wants to merge 1 commit into usestrix:main from
…p pre-filter, memory compression

- Add string-similarity pre-filter to vulnerability deduplication to limit LLM comparisons to the top 10 most similar reports instead of all reports
- Replace per-request httpx.AsyncClient with persistent connection pool per sandbox, eliminating repeated TCP/TLS handshake overhead
- Execute independent tools concurrently via asyncio.gather while keeping state-modifying tools sequential
- Lower memory compression threshold from 100K to 60K tokens and cache token counts to avoid redundant litellm.token_counter calls
- Double compression chunk size from 10 to 20 messages to halve LLM calls
- Replace asyncio.sleep(0.5) polling with event-based wake signaling in agent state for immediate response to state changes

https://claude.ai/code/session_012JYGtxVh4zRbzXKarNmb11
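The deduplication pre-filter described in the first bullet could look roughly like the sketch below. The PR text does not say which similarity measure it uses, so `difflib.SequenceMatcher` is an assumption here, and `top_similar_reports` is a hypothetical name rather than a function from the PR. The idea is only to rank existing reports cheaply and hand the top few to the LLM comparison step.

```python
import heapq
from difflib import SequenceMatcher


def top_similar_reports(candidate: str, existing: list[str], limit: int = 10) -> list[str]:
    """Return up to `limit` existing reports most similar to `candidate`.

    Hypothetical helper: the similarity measure (SequenceMatcher ratio on
    lowercased text) is an assumption, not taken from the PR.
    """

    def score(report: str) -> float:
        # ratio() is in [0.0, 1.0]; higher means more similar.
        return SequenceMatcher(None, candidate.lower(), report.lower()).ratio()

    # nlargest avoids sorting the full list when only the top few are needed.
    return heapq.nlargest(limit, existing, key=score)
```

Only the reports returned by this filter would then be passed to the more expensive LLM comparison, which bounds the number of LLM calls per new report at `limit`.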
Contributor
Greptile Overview
Greptile Summary
This PR implements several performance optimizations to reduce latency and LLM API costs:
Issues found:
Confidence Score: 3/5
Important Files Changed
Comment on lines +48 to +52
+async def close_sandbox_client(sandbox_id: str) -> None:
+    """Close and remove the HTTP client for a sandbox when it's torn down."""
+    client = _sandbox_clients.pop(sandbox_id, None)
+    if client:
+        await client.aclose()
Contributor
close_sandbox_client is defined but never called in the codebase. Connection pool clients accumulate without cleanup when sandboxes are torn down, leading to resource leaks.
Path: strix/tools/executor.py
Line: 48:52
Comment on lines +47 to +62
+_token_cache: dict[int, int] = {}
+
+
 def _count_tokens(text: str, model: str) -> int:
+    cache_key = hash(text)
+    if cache_key in _token_cache:
+        return _token_cache[cache_key]
+
     try:
-        count = litellm.token_counter(model=model, text=text)
-        return int(count)
+        count = int(litellm.token_counter(model=model, text=text))
     except Exception:
         logger.exception("Failed to count tokens")
-        return len(text) // 4  # Rough estimate
+        count = len(text) // 4  # Rough estimate
+
+    _token_cache[cache_key] = count
+    return count
Contributor
_token_cache grows unbounded. For long-running agents with many unique messages, this will consume increasing memory. Consider adding LRU eviction or size limits.
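A minimal sketch of the suggested LRU eviction, using the standard library's `functools.lru_cache`. The `maxsize` of 4096 is an arbitrary illustrative value, and `_raw_token_count` is a hypothetical stand-in for the real `litellm.token_counter` call so the example runs without that dependency.

```python
from functools import lru_cache


def _raw_token_count(text: str, model: str) -> int:
    """Stand-in for the real tokenizer call (litellm.token_counter in the PR)."""
    return len(text) // 4  # same rough estimate the PR uses as its fallback


@lru_cache(maxsize=4096)  # assumed limit; least recently used entries are evicted
def count_tokens(text: str, model: str) -> int:
    """Token count with a bounded cache keyed on (text, model)."""
    return _raw_token_count(text, model)
```

Because `lru_cache` keys on the full `(text, model)` arguments, this variant also distinguishes between models and is not affected by hash collisions, whereas the diff keys on `hash(text)` alone. The trade-off is that the cache holds references to the cached strings, so `maxsize` should be chosen with typical message sizes in mind.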
Path: strix/llm/memory_compressor.py
Line: 47:62