Populate per-asset dataStandard for BIDS, HED, NWB, and extensions #1811

Open
yarikoptic wants to merge 7 commits into master from hed

@yarikoptic (Member) commented Feb 24, 2026

Summary

  • Populate per-asset dataStandard field (new in dandischema) for all standards:
    • BIDS: from dataset_description.json with BIDSVersion
    • HED: from HEDVersion in dataset_description.json, including library schemas (e.g. "sc:1.0.0") as extensions
    • NWB: from NWB file metadata, with ndx-* extensions extracted from h5file["specifications"] group
    • OME/NGFF: for .ome.zarr assets in BIDS datasets
  • Centralize backward-compat guard as _SCHEMA_BAREASSET_HAS_DATASTANDARD in bases.py — single bool gates all new dandischema features (dataStandard, version, extensions), with RuntimeError if dandischema >= 0.12.2 unexpectedly lacks the field
  • Add get_nwb_extensions() to pynwb_utils.py — reads HDF5 specifications group, filters core namespaces, returns {name: latest_version} for each extension
  • All new code is no-op against released dandischema (< 0.12.2)
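
The BIDS and HED bullets above can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name `standards_from_description` and the plain-dict return shape are hypothetical (the PR builds dandischema objects whose constructors are not shown here), and treating any `prefix:version` entry as a library schema is inferred from the `"sc:1.0.0"` example rather than taken from the PR's code.

```python
import json
from typing import Any


def standards_from_description(text: str) -> list[dict[str, Any]]:
    """Return plain-dict stand-ins for the standards declared in a
    dataset_description.json document (hypothetical helper)."""
    desc = json.loads(text)
    # BIDS is always reported; BIDSVersion may be absent in this sketch.
    standards: list[dict[str, Any]] = [
        {"name": "BIDS", "version": desc.get("BIDSVersion")}
    ]
    hed = desc.get("HEDVersion")
    if hed is None:
        return standards
    # HEDVersion may be a single string or a list of strings.
    entries = [hed] if isinstance(hed, str) else list(hed)
    version = None
    extensions = []
    for entry in entries:
        if ":" in entry:
            # Assumed "prefix:version" form for library schemas.
            name, _, ext_version = entry.partition(":")
            extensions.append({"name": name, "version": ext_version})
        elif version is None:
            # First unprefixed entry is taken as the base HED version.
            version = entry
    standards.append(
        {"name": "HED", "version": version, "extensions": extensions}
    )
    return standards
```

With a list-valued `HEDVersion` such as `["8.2.0", "sc:1.0.0"]`, the sketch reports HED 8.2.0 with a single `sc` extension; with no `HEDVersion` key, only the BIDS entry is returned.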

Depends on: dandi/dandischema#XXXX

Test plan

  • TestBIDSDatasetDescriptionDataStandard — 5 tests covering BIDS always set, HED detected/absent, HED list form, HED library schemas as extensions
  • test_get_nwb_extensions — verifies ndx-* extraction with version sorting, core namespace filtering
  • test_get_nwb_extensions_no_specs — empty dict when no specifications group
  • 460 tests pass with dev dandischema, 426 pass with released dandischema 0.12.1 (new feature tests correctly skip)

PR in dandi-schema:

🤖 Generated with Claude Code

yarikoptic and others added 7 commits February 18, 2026 21:05
- BIDSDatasetDescriptionAsset.get_metadata(): always sets BIDS standard
  (with BIDSVersion), adds HED when HEDVersion present in JSON, warns
  on read failure
- NWBAsset.get_metadata(): sets NWB standard
- ZarrBIDSAsset.get_metadata(): sets OME/NGFF for .ome.zarr assets
- Guard for older dandischema without dataStandard on BareAsset;
  RuntimeError if dandischema >= 0.12.2 lacks it
- Register ai_generated pytest marker in tox.ini

Requires dandischema with per-asset dataStandard support (0.12.2+).
Works silently with older dandischema.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add get_nwb_extensions() to pynwb_utils that reads h5file["specifications"]
  to discover ndx-* namespaces and their versions (filtering out core/hdmf)
- NWBAsset.get_metadata() populates StandardsType.extensions with ndx-*
  extensions found in the NWB file
- BIDSDatasetDescriptionAsset.get_metadata() extracts HED library schemas
  from list-valued HEDVersion (e.g. ["8.2.0", "sc:1.0.0"]) into extensions
- Add tests for both NWB extension extraction and HED library schema parsing
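
The NWB extension discovery described in this commit could look roughly like the sketch below. It is not the PR's `get_nwb_extensions()`: the function name `latest_extension_versions`, the `CORE_NAMESPACES` set, and the version sort key are all assumptions made for illustration. The function accepts any mapping shaped like `{namespace: {version: ...}}`, so nested dicts work for experimentation.

```python
from collections.abc import Mapping

# Namespaces treated as "core" in this sketch (an assumption; the PR's
# actual filter list is not shown).
CORE_NAMESPACES = frozenset({"core", "hdmf-common", "hdmf-experimental"})


def _version_key(version: str) -> tuple:
    """Sort key that compares dotted versions numerically where possible."""
    parts = []
    for piece in version.split("."):
        # Numeric pieces sort as ints; anything else sorts after them as text.
        if piece.isdigit():
            parts.append((0, int(piece), ""))
        else:
            parts.append((1, 0, piece))
    return tuple(parts)


def latest_extension_versions(specs: Mapping) -> dict:
    """Map each non-core namespace to its highest listed version."""
    found = {}
    for name in specs:
        if name in CORE_NAMESPACES:
            continue
        versions = list(specs[name])
        if versions:
            found[name] = max(versions, key=_version_key)
    return found
```

An `h5py.Group` is also mapping-like, so passing `h5file["specifications"]` should work in the same way, returning an empty dict first when the group is absent. Sorting numerically matters because a plain string comparison would rank `"0.2.0"` above `"0.10.0"`.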

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…S_DATASTANDARD

Single bool in bases.py guards all dataStandard/version/extensions
features that ship together in dandischema >= 0.12.2, replacing
scattered "field in Model.model_fields" checks.
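
A minimal sketch of such a single-flag guard is shown below, assuming the check is a `model_fields` membership test plus a version comparison. The function name `has_datastandard`, the stand-in model classes, and the crude version parsing are hypothetical and only illustrate the idea; the real flag in bases.py is computed against dandischema's `BareAsset`.

```python
import re


def _release_tuple(version: str) -> tuple:
    """Crude (major, minor, patch) parse; ignores pre-release suffixes."""
    return tuple(int(n) for n in re.findall(r"\d+", version)[:3])


def has_datastandard(model, library_version: str, minimum: str = "0.12.2") -> bool:
    """Return True if the model exposes dataStandard.

    Raises RuntimeError when the library is new enough that the field
    should exist but does not.
    """
    if "dataStandard" in model.model_fields:
        return True
    if _release_tuple(library_version) >= _release_tuple(minimum):
        raise RuntimeError(
            f"schema library {library_version} unexpectedly lacks dataStandard"
        )
    return False


# Stand-ins for a pydantic model's `model_fields` mapping (hypothetical).
class OldBareAsset:
    model_fields = {"path": None}


class NewBareAsset:
    model_fields = {"path": None, "dataStandard": None}
```

Computing the flag once at import time and reusing it everywhere keeps the compatibility logic in one place, which is the consolidation this commit describes.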

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move json, h5py, get_nwb_extensions, and _SCHEMA_BAREASSET_HAS_DATASTANDARD
imports to module top level in tests.  Keep pynwb_utils import deferred in
bases.py (heavy transitive deps: h5py/pynwb/hdmf/numpy) per existing
convention.  Add import guidance to CLAUDE.md.
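
For readers unfamiliar with the deferred-import convention mentioned here, the pattern is simply an import placed inside the function that needs it. The sketch below uses the standard-library `json` module as a stand-in for a heavy dependency; the function name `top_level_keys` is hypothetical.

```python
def top_level_keys(text: str) -> list:
    """Return the sorted top-level keys of a JSON object."""
    # Deferred import: the module is loaded on first call rather than when
    # the enclosing module is imported. `json` stands in for a heavy
    # dependency such as the h5py/pynwb stack.
    import json

    return sorted(json.loads(text))
```

The trade-off is a slightly less discoverable dependency in exchange for cheaper module import, and the import cost is paid only by callers that reach this code path.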

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
hed_standard does not exist in dandischema < 0.12.2, so gate the import
on _SCHEMA_BAREASSET_HAS_DATASTANDARD (all new symbols ship together).
HED detection is skipped when unavailable.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
  from dandi.bids_validator_deno import bids_validate

- from .bases import GenericAsset, LocalFileAsset, NWBAsset
+ from .bases import GenericAsset, LocalFileAsset, NWBAsset, _SCHEMA_BAREASSET_HAS_DATASTANDARD

Check notice: Code scanning / CodeQL

Cyclic import (Note): Import of module dandi.files.bases begins an import cycle.

codecov bot commented Feb 24, 2026

Codecov Report

❌ Patch coverage is 48.73418% with 81 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.79%. Comparing base (2d10167) to head (d000be8).
⚠️ Report is 45 commits behind head on master.

| Files with missing lines  | Patch % | Lines         |
|---------------------------|---------|---------------|
| dandi/files/bids.py       | 21.15%  | 41 Missing ⚠️ |
| dandi/files/bases.py      | 5.26%   | 18 Missing ⚠️ |
| dandi/tests/test_files.py | 62.22%  | 17 Missing ⚠️ |
| dandi/pynwb_utils.py      | 75.00%  | 5 Missing ⚠️  |

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1811      +/-   ##
==========================================
- Coverage   75.11%   74.79%   -0.32%     
==========================================
  Files          84       84              
  Lines       11920    12079     +159     
==========================================
+ Hits         8954     9035      +81     
- Misses       2966     3044      +78     

| Flag      | Coverage Δ                    |
|-----------|-------------------------------|
| unittests | 74.79% <48.73%> (-0.32%) ⬇️   |
