fix: delete hottier, query cache metric #1563

Merged
nikhilsinhaparseable merged 1 commit into parseablehq:main from parmesant:misc-fixes
Mar 1, 2026

@parmesant parmesant commented Mar 1, 2026

Fixes #XXXX.

Description


This PR has:

  • been tested to ensure log ingestion and log query works.
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added documentation for new or modified features or behaviors.

Summary by CodeRabbit

  • Improvements
    • Enhanced multi-tenant behavior for hot-tier file and metadata paths so tenant-scoped data is resolved correctly.
    • Metrics now include tenant identification alongside stream for improved observability.
    • Simplified object store path resolution for non-hot-tier flows to ensure consistent storage access.

coderabbitai bot commented Mar 1, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aca604f and 3737343.

📒 Files selected for processing (2)
  • src/hottier.rs
  • src/query/stream_schema_provider.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/query/stream_schema_provider.rs

Walkthrough

Introduces tenant-aware path resolution for hot-tier storage operations and updates query cache hit metrics to include tenant context; object store URL selection in stream schema provider is simplified to always use the glob storage URL.
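
The tenant-aware path resolution described in this walkthrough can be sketched roughly as follows. This is a minimal illustration, not the code in src/hottier.rs: the helper names `stream_hot_tier_dir` and `hot_tier_file_path` are hypothetical, tenant and stream are taken as plain string slices, and the `.hot_tier.json` file is assumed to sit directly inside the stream directory.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: directory that holds a stream's hot-tier data.
/// With a tenant the layout is `<base>/<tenant>/<stream>`;
/// without one it falls back to `<base>/<stream>`.
fn stream_hot_tier_dir(base: &Path, tenant_id: Option<&str>, stream: &str) -> PathBuf {
    match tenant_id {
        Some(tenant) => base.join(tenant).join(stream),
        None => base.join(stream),
    }
}

/// Hypothetical helper: path of the `.hot_tier.json` metadata file.
/// Assumption: the file lives directly under the stream directory.
fn hot_tier_file_path(base: &Path, tenant_id: Option<&str>, stream: &str) -> PathBuf {
    stream_hot_tier_dir(base, tenant_id, stream).join(".hot_tier.json")
}
```

Routing both the deletion path and the metadata path through one helper keeps the two from drifting apart when the tenant is present.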

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Hot Tier Path Tenant Awareness: src/hottier.rs | Made hot-tier path construction tenant-aware: deletion paths, hot-tier file paths (e.g., .hot_tier.json) and related path-resolution logic now include tenant_id when provided; removed legacy commented path code and adjusted related doc comment formatting. |
| Query Cache Metrics & Object Store: src/query/stream_schema_provider.rs | QUERY_CACHE_HIT metric now records both stream and tenant_id labels. Simplified object store URL logic by always using glob_storage.store_url() (removed conditional tenant-subpath handling). |
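
To illustrate what recording both stream and tenant_id labels changes, here is a small stand-in counter built only on the standard library. It is a sketch under assumptions: `LabeledCounter` is a hypothetical type, not the project's QUERY_CACHE_HIT metric, whose actual type and label handling are not shown in this review.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a labeled counter metric.
/// Each distinct (stream, tenant_id) pair gets its own count.
#[derive(Default)]
struct LabeledCounter {
    counts: HashMap<(String, String), u64>,
}

impl LabeledCounter {
    /// Increment the counter for one (stream, tenant_id) label pair.
    fn inc(&mut self, stream: &str, tenant_id: &str) {
        let key = (stream.to_owned(), tenant_id.to_owned());
        *self.counts.entry(key).or_insert(0) += 1;
    }

    /// Read the current count for a label pair (0 if never incremented).
    fn get(&self, stream: &str, tenant_id: &str) -> u64 {
        let key = (stream.to_owned(), tenant_id.to_owned());
        self.counts.get(&key).copied().unwrap_or(0)
    }
}
```

With only a stream label, cache hits for two tenants that share a stream name would be merged into one series; keying on the pair keeps them apart.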

Sequence Diagram(s)

(omitted — changes are localized path/metric updates and do not introduce a new multi-component control flow requiring visualization)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • fix hot tier path #1562: Makes hottier.rs path resolution tenant-aware — directly related changes to hot-tier metadata and deletion path logic.

Poem

🐰 I hopped through paths both new and neat,
Tenant names tucked in each file's seat,
Metrics now sing with tenant and stream,
A tidy burrow for storage's dream! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The PR description uses the template but is entirely incomplete: the issue reference is a placeholder, all description sections are empty comments, and none of the three checklist items are checked. | Fill in the actual issue number, provide descriptions of the goal and key changes, and check off the relevant checklist items (testing, comments, documentation) that apply to this PR. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'fix: delete hottier, query cache metric' is specific and directly reflects the main changes made to hottier.rs and stream_schema_provider.rs regarding hot tier deletion and query cache metrics. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 85.71% which is sufficient. The required threshold is 80.00%. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/hottier.rs (1)

106-114: ⚠️ Potential issue | 🟠 Major

Logic error: condition excludes valid streams from hot tier size calculation.

The condition stream != current_stream && tenant_id != *current_tenant_id uses AND, which incorrectly excludes streams that match either the current stream name or the current tenant. The intent is to exclude only the specific (stream, tenant) pair.

For example, with current_stream="A" and current_tenant="T1":

  • stream="A", tenant="T2" → excluded (wrong - different tenant, should include)
  • stream="B", tenant="T1" → excluded (wrong - different stream, should include)
🐛 Fix: Use OR or negated equality
                 if self.check_stream_hot_tier_exists(&stream, &tenant_id)
-                    && stream != current_stream
-                    && tenant_id != *current_tenant_id
+                    && !(stream == current_stream && tenant_id == *current_tenant_id)
                 {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/hottier.rs` around lines 106 - 114, The loop condition incorrectly uses
&& to exclude streams, causing any stream that matches either current_stream or
current_tenant_id to be skipped; change the predicate in the for loop (where
PARSEABLE.streams.list, check_stream_hot_tier_exists, and get_hot_tier are used)
so that only the exact pair (stream == current_stream && tenant_id ==
*current_tenant_id) is excluded — e.g., replace the current_stream/tenant check
with its negation (not (stream == current_stream && tenant_id ==
*current_tenant_id)) or use || (stream != current_stream || tenant_id !=
*current_tenant_id) so you still call get_hot_tier for other streams/tenants.
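
The difference between the flagged predicate and the suggested one can be checked in isolation. The sketch below is illustrative only: the function names are hypothetical, and tenant IDs are modeled as plain string slices, which may not match the types used in src/hottier.rs.

```rust
/// Predicate as flagged in the review: a stream is kept only when BOTH
/// its name and its tenant differ from the current pair.
fn include_flagged(stream: &str, tenant: &str, cur_stream: &str, cur_tenant: &str) -> bool {
    stream != cur_stream && tenant != cur_tenant
}

/// Suggested predicate: a stream is skipped only when it is exactly the
/// current (stream, tenant) pair. By De Morgan's law this is equivalent to
/// `stream != cur_stream || tenant != cur_tenant`.
fn include_suggested(stream: &str, tenant: &str, cur_stream: &str, cur_tenant: &str) -> bool {
    !(stream == cur_stream && tenant == cur_tenant)
}
```

For the current pair ("A", "T1"), the flagged predicate drops ("A", "T2") and ("B", "T1") while the suggested one keeps both; the two agree on ("A", "T1"), which is dropped, and on ("B", "T2"), which is kept.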
🧹 Nitpick comments (1)
src/hottier.rs (1)

216-223: Consider cleaning up empty tenant directories after stream deletion.

After deleting hot_tier_path/tenant_id/stream, the hot_tier_path/tenant_id/ directory may remain empty. While this doesn't cause functional issues, you could reuse the existing delete_empty_directory_hot_tier() helper (used in cleanup_hot_tier_old_data) to clean up the parent tenant directory if it becomes empty.

♻️ Optional cleanup
         let path = if let Some(tenant_id) = tenant_id.as_ref() {
             self.hot_tier_path.join(tenant_id).join(stream)
         } else {
             self.hot_tier_path.join(stream)
         };
         fs::remove_dir_all(&path).await?;
+        // Clean up empty parent directories
+        if let Some(parent) = path.parent() {
+            let _ = delete_empty_directory_hot_tier(parent.to_path_buf()).await;
+        }
 
         Ok(())
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/hottier.rs` around lines 216 - 223, When deleting the stream directory,
ensure you also remove the parent tenant directory if it becomes empty: after
calling fs::remove_dir_all(path).await? for the stream, if tenant_id.as_ref() is
Some(...), call the existing helper delete_empty_directory_hot_tier() (the same
helper used by cleanup_hot_tier_old_data) with the tenant path
(self.hot_tier_path.join(tenant_id)) to clean up the empty tenant directory;
keep the existing error handling and only invoke the helper for the tenant
branch.
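
A synchronous, standard-library sketch of the cleanup described in this nitpick is shown below. It is written under assumptions: `remove_dir_if_empty` is a hypothetical stand-in for `delete_empty_directory_hot_tier`, whose real signature is not shown in this review, and `delete_stream_dir` is likewise a hypothetical name; the real code uses async filesystem calls.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical stand-in for the project's empty-directory helper.
/// Removes `dir` only if it contains no entries; returns whether it was removed.
fn remove_dir_if_empty(dir: &Path) -> io::Result<bool> {
    if fs::read_dir(dir)?.next().is_none() {
        fs::remove_dir(dir)?;
        Ok(true)
    } else {
        Ok(false)
    }
}

/// Deletes `<base>/<tenant>/<stream>` (or `<base>/<stream>` when there is no
/// tenant), then prunes the tenant directory if the deletion left it empty.
fn delete_stream_dir(base: &Path, tenant_id: Option<&str>, stream: &str) -> io::Result<()> {
    let path = match tenant_id {
        Some(tenant) => base.join(tenant).join(stream),
        None => base.join(stream),
    };
    fs::remove_dir_all(&path)?;

    // Prune only in the tenant branch, so the hot-tier root itself is never removed.
    if let Some(tenant) = tenant_id {
        remove_dir_if_empty(&base.join(tenant))?;
    }
    Ok(())
}
```

Restricting the pruning step to the tenant branch follows the wording of the prompt above. Pruning `path.parent()` unconditionally would, in the non-tenant case, target the hot-tier root directory when it happens to be empty.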
ℹ️ Review info

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b2fae6f and aca604f.

📒 Files selected for processing (2)
  • src/hottier.rs
  • src/query/stream_schema_provider.rs

coderabbitai bot previously approved these changes Mar 1, 2026
@nikhilsinhaparseable nikhilsinhaparseable merged commit 3bf5673 into parseablehq:main Mar 1, 2026
12 checks passed