Skip to content

(RFC) Refactor search interface with unified SearchDispatch trait#773

Open
narendatha wants to merge 9 commits intomainfrom
users/narendatha/search_refactor
Open

(RFC) Refactor search interface with unified SearchDispatch trait#773
narendatha wants to merge 9 commits intomainfrom
users/narendatha/search_refactor

Conversation

@narendatha
Copy link

@narendatha narendatha commented Feb 12, 2026

Refactor search interface with unified SearchDispatch trait

Summary

This PR refactors the search interface in diskann to provide a unified entry point via the SearchDispatch trait, improving code organization and API clarity. This also removes (un)popular SearchParams and search specific structs. This PR also removes (un)popular SearchParamsand replaces with search specific structs, so it is clear what params are used and which ones are ignored.

Changes

New Architecture

  • SearchDispatch trait - A unified dispatch interface that all search types implement, enabling a single index.search() entry point
  • Separate search type files:
    • dispatch.rs - Core trait definition
    • graph_search.rs - Standard k-NN graph search (GraphSearch, RecordedGraphSearch)
    • range_search.rs - Range-based search (RangeSearch, RangeSearchOutput)
    • multihop_search.rs - Label-filtered search (MultihopSearch)
    • diverse_search.rs - Diversity-aware search (DiverseSearch, feature-gated)

API Changes

Before After
index.execute_search(..., parameters, ...) index.search(..., search_params, ...)
index.search_recorded(...) index.debug_search(...)
Multiple search methods Single search() with dispatch

Code Organization

  • Moved internal implementations (range_search_internal, multihop_search_internal) to their respective search type files
  • Moved NotInMutWithLabelCheck helper to multihop_search.rs
  • Added From impls for SearchParams -> GraphSearch and RangeSearchParams -> RangeSearch
  • Removed convenience methods (graph_search, multihop_search, range_search) from DiskANNIndex to avoid encouraging direct use
  • Added local test helper functions where needed for backwards compatibility

Other

  • Renamed search_recorded -> debug_search to clearly signal debug-only intent
  • Bumped version to 0.46.0

Usage Example

use diskann::graph::{GraphSearch, RangeSearch, MultihopSearch};

// Standard k-NN search
let params = GraphSearch::new(10, 100, None)?;
let stats = index.search(&strategy, &context, &query, &params, &mut output).await?;

// Range search
let params = RangeSearch::new(100, 0.5)?;
let result = index.search(&strategy, &context, &query, &params, &mut ()).await?;

// Filtered search
let params = MultihopSearch::new(GraphSearch::new(10, 100, None)?, &filter);
let stats = index.search(&strategy, &context, &query, &params, &mut output).await?;

Why flat_search is not included

flat_search remains a separate method because it has unique constraints incompatible with SearchDispatch: (1) it requires an additional generic bound SearchAccessor<'a>: IdIterator<I> that forces the strategy's accessor to support iteration over all IDs - a capability not all strategies provide, and (2) it takes an extra vector_filter closure parameter that other search types don't need. Since flat_search is fundamentally a linear scan with filtering rather than a graph traversal, keeping it as a distinct method with its own signature is cleaner than complicating the trait to accommodate it.

Testing

  • All existing tests pass
  • cargo build --workspace
  • cargo test -p diskann -p diskann-providers -p diskann-disk

- Add SearchDispatch trait for unified search entry point
- Split search types into separate files (graph_search, range_search, multihop_search, diverse_search)
- Rename execute_search to search, parameters to search_params
- Rename search_recorded to debug_search to signal debug-only intent
- Move internal implementations to respective search type files
- Add From impls for SearchParams/RangeSearchParams to new types
- Add test helper functions in test modules for backwards compat
- Bump version to 0.46.0
@narendatha narendatha requested review from a team and Copilot February 12, 2026 16:55
@narendatha narendatha changed the title Refactor search interface with unified SearchDispatch trait (RFC) Refactor search interface with unified SearchDispatch trait Feb 12, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the search interface in the diskann crate to provide a unified entry point via the SearchDispatch trait. The refactoring improves code organization by moving search implementations to dedicated files and consolidates the API around a single index.search() method that dispatches to type-specific implementations.

Changes:

  • Introduces SearchDispatch trait as the unified search interface, with separate implementations for standard graph search, range search, multihop (label-filtered) search, and diversity-aware search
  • Replaces multiple search methods (execute_search, multihop_search, range_search) with a single search() method that delegates to SearchDispatch::dispatch()
  • Renames search_recorded to debug_search to clearly signal debug-only intent
  • Maintains backward compatibility by implementing SearchDispatch for the legacy SearchParams type

Reviewed changes

Copilot reviewed 16 out of 17 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
diskann/src/graph/search/dispatch.rs Core SearchDispatch trait definition
diskann/src/graph/search/graph_search.rs Standard k-NN graph search implementation (GraphSearch, RecordedGraphSearch)
diskann/src/graph/search/range_search.rs Range-based search implementation and range_search_internal helper
diskann/src/graph/search/multihop_search.rs Label-filtered search and multihop_search_internal helper, includes NotInMutWithLabelCheck predicate
diskann/src/graph/search/diverse_search.rs Feature-gated diversity-aware search implementation
diskann/src/graph/search/mod.rs Module organization with public re-exports
diskann/src/graph/mod.rs Top-level re-exports of search types
diskann/src/graph/index.rs New unified search() method, renamed debug_search(), removed old search methods, made internal helpers pub(crate)
diskann/src/graph/test/cases/grid.rs Updated to use GraphSearch::new() and params.k
diskann-providers/src/index/diskann_async.rs Test helper functions for backward-compatible multihop and range search calls
diskann-providers/src/index/wrapped_async.rs Converts SearchParams to GraphSearch for unified API
diskann-disk/src/search/provider/disk_provider.rs Updated to use GraphSearch and renamed debug_search
diskann-benchmark-core/src/search/graph/*.rs Updated benchmark code to use new search dispatch API
Cargo.toml, Cargo.lock Version bump to 0.46.0 across all workspace crates

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

narendatha and others added 4 commits February 12, 2026 22:39
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
This enables RecordedGraphSearch to implement SearchDispatch properly,
since it holds &mut recorder which requires mutable access.

- Change dispatch signature from &self to &mut self
- Update index.search to take &mut P for search params
- Implement SearchDispatch for RecordedGraphSearch
- Remove apologetic comment about trait limitations
- Update all callers to use &mut params
- RecordedGraphSearch in graph_search.rs
- MultihopSearch in multihop_search.rs
@narendatha
Copy link
Author

narendatha commented Feb 12, 2026 via email

Copy link
Contributor

@hildebrandmw hildebrandmw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @narendatha! I think this does a good job of tying together the disparate APIs we've gathered. I'm pretty strongly in favor of getting rid of the legacy SearchParams and RangeSearchParams to avoid clutter, and updating the structs defined in this PR to not expose all the methods as public and therefore render the constructor invariant checks invalid.

- Rename SearchDispatch trait to Search and move to mod.rs
- Remove unnecessary doc comment about &mut self (can be inferred)
- Remove 'feature-gated' mention from diverse_search.rs module doc
- Fix banner styles to use ///// format instead of //===
- Make create_diverse_scratch an inherent method on DiverseSearch
- Remove flat_search mention from index.search docs
- Redirect debug_search to use RecordedGraphSearch internally
- Revert version bump (0.46.0 -> 0.45.0)

Co-authored-by: hildebrandmw <hildebrandmw@users.noreply.github.com>
@narendatha
Copy link
Author

@microsoft-github-policy-service agree company="Microsoft"

…roUsize

- Rename GraphSearch to KnnSearch for clarity (k-NN search)
- Change k_value and l_value from usize to NonZeroUsize (compile-time zero check)
- Rename RecordedGraphSearch to RecordedKnnSearch
- Move RangeSearch validation from misc.rs to range_search.rs
- Remove deprecated SearchParams, SearchParamsError, RangeSearchParams, RangeSearchParamsError
- Update all callers across diskann, diskann-providers, diskann-disk, diskann-benchmark crates
- Remove ensure_positive helper (no longer needed with NonZeroUsize)
- Fix duplicated cfg attribute in diverse_search.rs
@codecov-commenter
Copy link

codecov-commenter commented Feb 13, 2026

Codecov Report

❌ Patch coverage is 97.68786% with 16 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.00%. Comparing base (ea4078e) to head (5e81b35).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
diskann/src/graph/search/range_search.rs 97.69% 5 Missing ⚠️
diskann-providers/src/index/wrapped_async.rs 0.00% 3 Missing ⚠️
diskann/src/graph/search/multihop_search.rs 98.05% 3 Missing ⚠️
diskann-benchmark/src/inputs/async_.rs 0.00% 2 Missing ⚠️
diskann-benchmark/src/backend/index/result.rs 66.66% 1 Missing ⚠️
diskann-benchmark/src/backend/index/search/knn.rs 80.00% 1 Missing ⚠️
...iskann-benchmark/src/backend/index/search/range.rs 0.00% 1 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #773      +/-   ##
==========================================
- Coverage   90.29%   89.00%   -1.30%     
==========================================
  Files         428      431       +3     
  Lines       78420    78496      +76     
==========================================
- Hits        70809    69865     -944     
- Misses       7611     8631    +1020     
Flag Coverage Δ
miri 89.00% <97.68%> (-1.30%) ⬇️
unittests 89.00% <97.68%> (-1.30%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
diskann-benchmark-core/src/search/graph/knn.rs 94.97% <100.00%> (+0.08%) ⬆️
...iskann-benchmark-core/src/search/graph/multihop.rs 98.70% <100.00%> (+0.01%) ⬆️
diskann-benchmark-core/src/search/graph/range.rs 93.70% <100.00%> (+0.18%) ⬆️
diskann-disk/src/search/provider/disk_provider.rs 91.05% <100.00%> (+0.06%) ⬆️
diskann-providers/src/index/diskann_async.rs 96.36% <100.00%> (+0.01%) ⬆️
diskann/src/error/ann_error.rs 96.57% <ø> (-0.04%) ⬇️
diskann/src/graph/index.rs 95.91% <100.00%> (-0.55%) ⬇️
diskann/src/graph/misc.rs 80.00% <ø> (-14.60%) ⬇️
diskann/src/graph/search/knn_search.rs 100.00% <100.00%> (ø)
diskann/src/graph/test/cases/grid.rs 95.69% <100.00%> (+0.24%) ⬆️
... and 7 more

... and 43 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

@hildebrandmw hildebrandmw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @narendatha, I'm really liking how this is coming together!

/// - [`search::KnnSearch`]: Standard k-NN graph-based search
/// - [`search::MultihopSearch`]: Label-filtered search with multi-hop expansion
/// - [`search::RangeSearch`]: Range-based search within a distance radius
/// - [`search::DiverseSearch`]: Diversity-aware search (feature-gated)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick naming/export observation, I'd like to hear your thoughts. The links here scope the search parameters to the search module, but they are also re-exported through graph.

There is a world where we structure it like

diskann::graph::search::KNN;
diskann::graph::search::MultiHop;
diskann::graph::search::Range;
diskann::graph::search::Diverse;

That is, we can drop the Search suffixes, use the module for namespacing, and do not reexport through graph (the Search trait itself should probably be reexported though).

What do you think?

/// let result = index.search(&strategy, &context, &query, &mut params, &mut ()).await?;
/// // result.ids and result.distances contain the matches
/// ```
pub fn search<'a, S, T, O: 'a, OB, P>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, search_params should probably come first now since it's the most important argument at this level for determining what kind of search is about to get run.

strategy: &'a S,
context: &'a DP::Context,
query: &'a T,
search_params: &'a mut P,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taking search_params by mutable reference is a bit unfortunate for the caller. I know why we did this, but it has an unfortunate side-effect of making it harder to work with in cases where mutability is not needed.

To that end, what if we do this:

  1. Take search_params by value.
  2. Implement Search for both KnnSearch and &KnnSearch - making sure that KnnSearch by value simply delegates to the &KnnSearch implementation.

It does make generic plumbing when mutability is required more tricky (i.e., we'd need HRTB all over the place), but I think that might be preferable to always forcing mutability?

/// # Errors
///
/// Returns an error if there is a failure accessing elements or computing distances.
fn dispatch<'a>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can drop the explicit lifetime 'a here. Additionally, when you drop the lifetime, you will be able to get rid of the explicit lifetime in DiskANNIndex::search as well.

/// # Errors
///
/// Returns an error if there is a failure accessing elements or computing distances.
fn dispatch<'a>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small preference for calling this method search, following the convention of traits having a single method named after themselves.

///
/// let mut params = KnnSearch::new(10, 100, None)?;
/// let stats = index.search(&strategy, &context, &query, &mut params, &mut output).await?;
/// ```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say that 90% of the time, I'm a fan of attaching documentation to the trait implementations and allowing that to be discovered while perusing rustdoc.

However, this feels like a situation where the algorithmic level documentation would be more discoverable if attached to the KnnSearch struct (and other searchers) themselves or as module-level documentation. The trait-impl can then include a cross-reference to the struct level documentation.

Basically - these structs exist almost exclusively to implement Search. So struct-level documentation feels appropriate.

Thinking of how this would look for users of the docs, I think it's cleaner. What do you think?

}

impl From<KnnSearchError> for ANNError {
fn from(err: KnnSearchError) -> Self {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommend attaching #[track_caller] to all conversions to ANNError as it will improve the quality of the error message if backtraces are not collected.

/// Search list size - controls accuracy vs speed tradeoff.
l_value: NonZeroUsize,
/// Optional beam width for parallel graph exploration.
beam_width: Option<usize>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since beam_width cannot be zero, we can embed it in an Option<NonZeroUsize> to formalize the guarantee.

diskann::graph::SearchParams::new(run.search_n, *search_l, None).unwrap();
let k = NonZeroUsize::new(run.search_n).expect("search_n must be non-zero");
let l = NonZeroUsize::new(*search_l).expect("search_l must be non-zero");
let search_params = diskann::graph::KnnSearch::new(k, l, None).unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small thing I noticed: since KnnSearch::new already checks the k <= l invariant, we could also check for non-zero there and simplify the caller's life.

This is especially relevant because I'm seeing expect being applied to the NonZeroUsize constructors where-as before those paths wouldn't necessarily panic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants