Add API to load pre-built index in wrapped_async by Sanhaoji2 · Pull Request #749 · microsoft/DiskANN

Sanhaoji2 · 2026-02-10T09:25:31Z

Does this PR have a descriptive title that could go in our release notes?
Does this PR add any new dependencies?
Does this PR modify any existing APIs?
Is the change to the API backwards compatible?
Should this result in any changes to our documentation, either updating existing docs or adding new ones?

Reference Issues/PRs

What does this implement/fix? Briefly explain your changes.

Any other comments?

Copilot

Pull request overview

Adds a synchronous API in wrapped_async for loading a pre-built in-memory index from files by constructing a DataProvider via the storage loading traits, then wrapping it in the sync DiskANNIndex wrapper.

Changes:

Introduces SyncLoadWith + create_data_provider() in diskann-providers/src/index/wrapped_async.rs to load an index from a path/config.
Adds futures-executor dependency to synchronously block_on the async LoadWith future.
Updates Cargo.lock for the new dependency.

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 6 comments.

File	Description
diskann-providers/src/index/wrapped_async.rs	Adds synchronous load API and data-provider construction for pre-built index loading.
diskann-providers/Cargo.toml	Adds `futures-executor` dependency.
Cargo.lock	Records dependency resolution changes for `futures-executor`.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-02-10T09:29:58Z

diskann-providers/src/index/wrapped_async.rs

+    let data_provider = DP::load_with(provider, &pq_context);
+
+    block_on(data_provider)
+}


This introduces futures_executor::block_on + a new futures-executor dependency just to synchronously wait for DP::load_with. Since this module already depends on Tokio and constructs runtimes/handles, prefer using a Tokio runtime/handle to block_on the load (or otherwise avoid adding an extra executor dependency).

Copilot · 2026-02-10T09:29:58Z

diskann-providers/Cargo.toml

 prost = "0.14.1"
 futures-util.workspace = true
+futures-executor = "0.3"


New dependency is added with an inline version (futures-executor = "0.3") instead of using [workspace.dependencies] like most shared third-party crates in this repo. Consider adding it to the workspace dependency list and referencing it via workspace = true to keep versions consistent across crates.

Copilot · 2026-02-10T09:29:58Z

diskann-providers/src/index/wrapped_async.rs

+pub trait SyncLoadWith: Sized {
+
+    fn load_with(path: &str, index_config: IndexConfiguration) -> Self;
+}
+
+// Load static memory index from pre-built files synchronously
+impl<DP> SyncLoadWith for DiskANNIndex<DP>
+where
+    DP: DataProvider<InternalId = u32> + LoadWith<AsyncQuantLoadContext, Error = ANNError>,
+{
+    fn load_with(path: &str, index_config: IndexConfiguration) -> Self {
+        let storage = FileStorageProvider;
+        let data_provider = create_data_provider(&storage, path, &index_config);
+
+        match data_provider {
+            Ok(dp) => DiskANNIndex::<DP>::new_with_current_thread_runtime(index_config.config, dp),
+            Err(e) => panic!("Failed to create data provider: {}", e),
+        }
+    }
+}
+
+pub fn create_data_provider<'a, P, DP>(
+    provider: &P,
+    path: &'a str,
+    index_config: &'a IndexConfiguration,
+) -> ANNResult<DP>
+where
+    P: StorageReadProvider,
+    DP: DataProvider + LoadWith<AsyncQuantLoadContext, Error = ANNError>,
+{
+    let pq_context = AsyncQuantLoadContext {
+        metadata: AsyncIndexMetadata::new(path),
+        num_frozen_points: index_config.num_frozen_pts,
+        metric: index_config.dist_metric,
+        prefetch_lookahead: index_config.prefetch_lookahead.map(|x| x.get()),
+        is_disk_index: false, // only support in-memory index loading here
+        prefetch_cache_line_level: index_config.prefetch_cache_line_level,
+    };
+
+    let data_provider = DP::load_with(provider, &pq_context);
+
+    block_on(data_provider)
+}


The new synchronous load path (SyncLoadWith::load_with / create_data_provider) isn’t covered by tests. Given the number of moving parts (context creation, file layout, error propagation), please add a unit/integration test that round-trips an index to storage and loads it through this wrapper API (ideally using VirtualStorageProvider to avoid filesystem side effects).

Copilot · 2026-02-10T09:29:58Z

diskann-providers/src/index/wrapped_async.rs

+
+use crate::{model::IndexConfiguration,
+     storage::{StorageReadProvider, LoadWith, AsyncQuantLoadContext,AsyncIndexMetadata, file_storage_provider::FileStorageProvider}};
+


The new use crate::{... storage::{...}} import block is not formatted consistently (extra blank line, missing spaces after commas, long single line). Please run rustfmt / reformat this use to match the rest of the crate’s import style for readability and to avoid CI format/lint churn.

Copilot · 2026-02-10T09:29:59Z

diskann-providers/src/index/wrapped_async.rs

+    fn load_with(path: &str, index_config: IndexConfiguration) -> Self {
+        let storage = FileStorageProvider;
+        let data_provider = create_data_provider(&storage, path, &index_config);
+
+        match data_provider {
+            Ok(dp) => DiskANNIndex::<DP>::new_with_current_thread_runtime(index_config.config, dp),
+            Err(e) => panic!("Failed to create data provider: {}", e),
+        }


SyncLoadWith::load_with panics on data-provider load failure. Since this is a public API for loading an index, it should return a Result (e.g., ANNResult<Self>) and propagate the underlying error instead of crashing the process.

Copilot · 2026-02-10T09:29:59Z

diskann-providers/src/index/wrapped_async.rs

+    let pq_context = AsyncQuantLoadContext {
+        metadata: AsyncIndexMetadata::new(path),
+        num_frozen_points: index_config.num_frozen_pts,
+        metric: index_config.dist_metric,
+        prefetch_lookahead: index_config.prefetch_lookahead.map(|x| x.get()),
+        is_disk_index: false, // only support in-memory index loading here
+        prefetch_cache_line_level: index_config.prefetch_cache_line_level,
+    };


create_data_provider duplicates the existing storage::create_load_context() helper (diskann-providers/src/storage/index_storage.rs) that already builds an AsyncQuantLoadContext from IndexConfiguration. Reusing the shared helper would reduce drift and keep load semantics consistent across the crate.

codecov-commenter · 2026-02-10T11:59:45Z

Codecov Report

❌ Patch coverage is 0% with 27 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.97%. Comparing base (a7aa13c) to head (e3001a4).

Files with missing lines	Patch %	Lines
diskann-providers/src/index/wrapped_async.rs	0.00%	27 Missing ⚠️

❌ Your patch status has failed because the patch coverage (0.00%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #749      +/-   ##
==========================================
- Coverage   89.01%   88.97%   -0.04%     
==========================================
  Files         428      428              
  Lines       78294    78321      +27     
==========================================
- Hits        69691    69687       -4     
- Misses       8603     8634      +31

Flag	Coverage Δ
miri	`88.97% <0.00%> (-0.04%)`	⬇️
unittests	`88.97% <0.00%> (-0.04%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines	Coverage Δ
diskann-providers/src/index/wrapped_async.rs	`0.00% <0.00%> (ø)`

... and 5 files with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

hildebrandmw · 2026-02-11T17:56:42Z

diskann-providers/src/index/wrapped_async.rs

+            Err(e) => Err(e),
+        }
+    }
+}


I feel like this would be an endless game of whack-a-mole, trying to introduce sync interfaces for all the async interfaces.

What if we did this instead:

Somewhere high in the stack of diskann-providers, we can make a function like block_on that can run any future. Then for these situations where we have an async function/trait that doesn't really need a multi-threaded runtime and is not performance sensitive, block_on can be used and we don't have to duplicate our interfaces.

Add API to load pre-built index to wrapped_async

ba7a263

Sanhaoji2 requested review from a team and Copilot February 10, 2026 09:25

Copilot started reviewing on behalf of Sanhaoji2 February 10, 2026 09:26 View session

Copilot AI reviewed Feb 10, 2026

View reviewed changes

Sanhaoji2 added 3 commits February 10, 2026 19:40

Fix compile issue

498622a

Fix compile issue

124e0f9

fix compile issue

e3001a4

hildebrandmw requested changes Feb 11, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add API to load pre-built index in wrapped_async#749

Add API to load pre-built index in wrapped_async#749
Sanhaoji2 wants to merge 4 commits intomainfrom
jegao/MemoryIndexLoadingAPI

Sanhaoji2 commented Feb 10, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Feb 10, 2026

Uh oh!

Copilot AI Feb 10, 2026

Uh oh!

Copilot AI Feb 10, 2026

Uh oh!

Copilot AI Feb 10, 2026

Uh oh!

Copilot AI Feb 10, 2026

Uh oh!

Copilot AI Feb 10, 2026

Uh oh!

codecov-commenter commented Feb 10, 2026

Uh oh!

hildebrandmw Feb 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants


		use crate::{model::IndexConfiguration,
		storage::{StorageReadProvider, LoadWith, AsyncQuantLoadContext,AsyncIndexMetadata, file_storage_provider::FileStorageProvider}};

Conversation

Sanhaoji2 commented Feb 10, 2026

Reference Issues/PRs

What does this implement/fix? Briefly explain your changes.

Any other comments?

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Feb 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 10, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Feb 10, 2026

Choose a reason for hiding this comment

Uh oh!

codecov-commenter commented Feb 10, 2026

Codecov Report

Uh oh!

hildebrandmw Feb 11, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants