[Draft, Not for review] Sync changes from CDB_DiskANN repo #748
Conversation
- Refactored recall utilities in diskann-benchmark
- Updated tokio utilities
- Added attribute and format parser improvements in label-filter
- Updated ground_truth utilities in diskann-tools
Pull request overview
Syncs changes aimed at integrating label-filtered search into the benchmarking/tooling flow, primarily by enhancing ground-truth label evaluation and aligning benchmark recall logic with diskann-benchmark-core.
Changes:
- Extend `diskann-tools` ground-truth label filtering to handle array-valued label structures and add diagnostics/logging.
- Refactor `diskann-benchmark` recall utilities to delegate to `diskann-benchmark-core`.
- Adjust `diskann-label-filter` JSON flattening and minor formatting/import changes; update `diskann-tools` dependencies.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| diskann-tools/src/utils/ground_truth.rs | Adds array-expansion support for label evaluation and extra tracing diagnostics during bitmap + ground-truth generation. |
| diskann-tools/Cargo.toml | Updates dependencies (adds serde_json, removes some unused deps) and adjusts dev-deps/features. |
| diskann-label-filter/src/utils/flatten_utils.rs | Modifies how arrays are flattened into pointer paths. |
| diskann-label-filter/src/parser/format.rs | Minor formatting-only change. |
| diskann-label-filter/src/attribute.rs | Adds an import (currently unused). |
| diskann-benchmark/src/utils/tokio.rs | Tweaks doc comment and removes unit tests. |
| diskann-benchmark/src/utils/recall.rs | Replaces local recall/AP implementations with diskann-benchmark-core wrappers/types. |
[package]
name = "diskann-tools"
edition = "2021"
version.workspace = true
authors.workspace = true
description.workspace = true
documentation.workspace = true
license.workspace = true
This crate no longer declares license.workspace = true, while the rest of the workspace crates do. This makes the crate metadata inconsistent and can break workspace-wide tooling (e.g., publish checks) that expects license metadata to be inherited from the workspace. Re-add license.workspace = true under [package] for consistency.
    results: &dyn benchmark_core::recall::Rows<T>,
    groundtruth: &dyn benchmark_core::recall::Rows<T>,
    recall_k: &[usize],
    recall_n: &[usize],
compute_multiple_recalls changed from taking StridedView + an enhanced_metrics flag to taking &dyn benchmark_core::recall::Rows<T> and no longer accepts enhanced_metrics. There are existing call sites in diskann-benchmark/src/backend/exhaustive/* that still call compute_multiple_recalls(..., false) and pass StridedView conversions, which will not compile. Either update those call sites in this PR, or provide a backwards-compatible wrapper/overload that preserves the old signature and adapts into benchmark_core::recall::Rows.
Suggested change:
    recall_n: &[usize],
    _enhanced_metrics: bool,
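If a compatibility shim is preferred over touching the call sites, a minimal sketch is below; the `StridedView` type, the `RecallMetrics` return type, and the assumption that `StridedView` implements `benchmark_core::recall::Rows<T>` are all unverified placeholders for whatever the crate actually exposes.

```rust
// Hypothetical compatibility wrapper, not the crate's actual API. It keeps the
// old StridedView-based signature and flag, then delegates to the new
// core-backed compute_multiple_recalls.
pub(crate) fn compute_multiple_recalls_compat<T>(
    results: &StridedView<T>,
    groundtruth: &StridedView<T>,
    recall_k: &[usize],
    recall_n: &[usize],
    _enhanced_metrics: bool, // retained for source compatibility, currently ignored
) -> RecallMetrics
where
    StridedView<T>: benchmark_core::recall::Rows<T>,
{
    // &StridedView<T> coerces to &dyn Rows<T> thanks to the bound above.
    compute_multiple_recalls(results, groundtruth, recall_k, recall_n)
}
```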
            average_precision: m.average_precision,
        }
    }
}
This module removed APMetrics and compute_average_precision, but there are existing call sites (e.g. diskann-benchmark/src/backend/index/range_search.rs and .../result.rs) that still reference utils::recall::compute_average_precision and utils::recall::APMetrics. As-is, the benchmark crate will not compile. Either restore these items as thin wrappers/type aliases around benchmark_core::recall equivalents, or update the call sites in this PR to use the new AveragePrecisionMetrics API.
Suggested change:
}

/// Backwards-compatible alias for the core average precision metrics.
/// This keeps existing `utils::recall::APMetrics` call sites compiling.
pub(crate) type APMetrics = benchmark_core::recall::AveragePrecisionMetrics;

/// Re-export the core `compute_average_precision` function so existing
/// `utils::recall::compute_average_precision` call sites continue to work.
pub(crate) use benchmark_core::recall::compute_average_precision;
// Print per-query match counts
for (i, bitmap) in query_bitmaps.iter().enumerate() {
    if i < 10 || bitmap.is_empty() {
        tracing::debug!(
            "Query {}: {} base vectors matched the filter",
            i,
            bitmap.len()
        );
    }
}
The per-query debug logging loop will emit a debug! line for every query with an empty bitmap (if i < 10 || bitmap.is_empty()), which can explode log volume when filters are selective or misconfigured (especially if debug logging is enabled). Consider limiting this to the first N queries only, or logging aggregate statistics plus a small sample of empty/non-empty queries.
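One way to bound the output is to log aggregate statistics once and cap the per-query lines at a fixed count; a sketch based on the diff above (assuming `query_bitmaps` is a slice of bitmaps exposing `is_empty()` and `len()`):

```rust
// Sketch: summarize empty bitmaps in one line, then log only the first 10 queries.
let empty_count = query_bitmaps.iter().filter(|b| b.is_empty()).count();
tracing::debug!(
    "{} of {} queries matched no base vectors",
    empty_count,
    query_bitmaps.len()
);
for (i, bitmap) in query_bitmaps.iter().take(10).enumerate() {
    tracing::debug!("Query {}: {} base vectors matched the filter", i, bitmap.len());
}
```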
/// Expands a JSON object with array-valued fields into multiple objects with scalar values.
/// For example: {"country": ["AU", "NZ"], "year": 2007}
/// becomes: [{"country": "AU", "year": 2007}, {"country": "NZ", "year": 2007}]
///
/// If multiple fields have arrays, all combinations are generated.
fn expand_array_fields(value: &Value) -> Vec<Value> {
    match value {
        Value::Object(map) => {
            // Start with a single empty object
            let mut results: Vec<Map<String, Value>> = vec![Map::new()];

            for (key, val) in map.iter() {
                if let Value::Array(arr) = val {
                    // Expand: for each existing result, create copies for each array element
                    let mut new_results: Vec<Map<String, Value>> = Vec::new();
                    for existing in results.iter() {
                        for item in arr.iter() {
                            let mut new_map: Map<String, Value> = existing.clone();
                            new_map.insert(key.clone(), item.clone());
                            new_results.push(new_map);
                        }
                    }
                    // If array is empty, keep existing results without this key
                    if !arr.is_empty() {
                        results = new_results;
                    }
                } else {
                    // Non-array field: add to all existing results
                    for existing in results.iter_mut() {
                        existing.insert(key.clone(), val.clone());
                    }
                }
            }

            results.into_iter().map(Value::Object).collect()
        }
        // If not an object, return as-is
        _ => vec![value.clone()],
    }
}

/// Evaluates a query expression against a label, expanding array fields first.
/// Returns true if any expanded variant matches the query.
fn eval_query_with_array_expansion(query_expr: &ASTExpr, label: &Value) -> bool {
    let expanded = expand_array_fields(label);
    expanded.iter().any(|item| eval_query_expr(query_expr, item))
}
expand_array_fields clones the full JSON object and generates a Cartesian product of array-valued fields, and it's invoked inside the inner loop over base_labels for every query. This can become extremely expensive (exponential in the number/size of array fields, multiplied by num_queries * num_base_labels). Consider avoiding full expansion/allocation by evaluating the predicate directly against arrays (e.g., treat equality as "any element matches" for array-valued fields), or precomputing any needed expansions once per base label rather than per query evaluation.
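For the "any element matches" alternative, a minimal self-contained sketch with serde_json is shown below; the helper name and the idea of calling it per field inside the evaluator are assumptions, not the crate's existing API.

```rust
use serde_json::Value;

// Hypothetical helper: equality against an array-valued field succeeds when any
// element matches, avoiding the Cartesian-product expansion and per-query cloning.
fn field_matches(field_value: &Value, expected: &Value) -> bool {
    match field_value {
        Value::Array(items) => items.iter().any(|item| item == expected),
        other => other == expected,
    }
}
```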
use std::fmt::Display;
use std::hash::{Hash, Hasher};
use std::io::Write;
use std::io::Write; is unused in this module (no Write methods are called; the write_u8/write_i64/etc calls are on std::hash::Hasher). This will trigger unused_imports warnings under typical lint settings; remove the import.
Suggested change: remove the `use std::io::Write;` line.
@@ -18,21 +18,3 @@ pub(crate) fn block_on<F: std::future::Future>(future: F) -> F::Output {
        .expect("current thread runtime initialization failed")
        .block_on(future)
}
The runtime helper previously had a unit test validating runtime(num_threads) creates the expected number of workers, but the entire test module was removed. If the old test was removed due to tokio runtime metrics being unstable/unsupported in this build, consider replacing it with a stable assertion (or moving coverage to an integration test) so runtime()/block_on() behavior remains exercised by CI.
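If the removal was motivated by unstable runtime metrics, a stable smoke test could still exercise block_on; a sketch assuming the signature shown in the hunk above:

```rust
#[cfg(test)]
mod tests {
    use super::block_on;

    // Sketch: assert that block_on drives a simple future to completion,
    // without touching tokio runtime metrics.
    #[test]
    fn block_on_runs_future_to_completion() {
        let value = block_on(async { 21 * 2 });
        assert_eq!(value, 42);
    }
}
```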
for (query_idx, npq) in ground_truth.iter().enumerate() {
    let neighbors: Vec<_> = npq.iter().collect();
    let neighbor_count = neighbors.len();

    if query_idx < 10 {
        // Print top K IDs and distances for first 10 queries
        let top_ids: Vec<u32> = neighbors.iter().take(10).map(|n| n.id).collect();
        let top_dists: Vec<f32> = neighbors.iter().take(10).map(|n| n.distance).collect();
        tracing::debug!(
The added ground-truth debug section allocates and clones neighbor data for every query (let neighbors: Vec<_> = npq.iter().collect();) even though detailed logging is only done for the first 10 queries. For large benchmarks this adds avoidable overhead. Prefer using npq.iter().count()/npq.len() for counts and only collecting the top-N neighbors for the small set of queries you actually log.
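A possible shape for that change, assuming `npq.iter()` yields neighbor records with `id` and `distance` fields as in the diff (the exact types are assumptions):

```rust
for (query_idx, npq) in ground_truth.iter().enumerate() {
    // Count without materializing the whole neighbor list.
    let neighbor_count = npq.iter().count();

    if query_idx < 10 {
        // Only collect the top 10 neighbors for the queries that are actually logged.
        let top: Vec<(u32, f32)> = npq.iter().take(10).map(|n| (n.id, n.distance)).collect();
        tracing::debug!(
            "Query {}: {} neighbors, top (id, distance): {:?}",
            query_idx,
            neighbor_count,
            top
        );
    }
}
```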
Value::Array(arr) => {
    for (i, item) in arr.iter().enumerate() {
        flatten_recursive(item, stack.push(&i, separator), out, separator);
        flatten_recursive(item, stack.push(&String::from(""), separator), out, separator);
Array flattening is now pushing an empty string segment for every array element, which drops the index from the generated path (e.g., "/tags/0" becomes something like "/tags/") and can also create duplicate keys. This breaks the JSON Pointer semantics documented in this file and will fail the existing tests that assert "/tags/0" and "/tags/1". Use the numeric index (i) when building JSON Pointer paths, or implement the existing include_array_indices config flag (currently unused) to conditionally include the index.
Suggested change:
        let index = i.to_string();
        flatten_recursive(item, stack.push(&index, separator), out, separator);
Integrating filter search with diskann-benchmark.