Add remote-read support to read_nwb #2190
Open
h-mayorquin wants to merge 2 commits into
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff            @@
##              dev    #2190     +/-   ##
=========================================
+ Coverage   95.99%   96.00%   +0.01%
=========================================
  Files          30       30
  Lines        2970     2982     +12
  Branches      431      433      +2
=========================================
+ Hits         2851     2863     +12
  Misses         67       67
  Partials       52       52
pynwb.read_nwbread_nwb
read_nwb now accepts remote URLs (s3://, gs://, abfs://, https://, etc.) and
dispatches by URL shape: .zarr suffixes and DANDI Zarr assets under /zarr/ go
to NWBZarrIO, everything else to NWBHDF5IO. Remote files are opened through
fsspec using the URL's actual scheme, replacing the hardcoded
fsspec.filesystem("http") that mishandled non-HTTP schemes.
Adds integration tests covering local HDF5/Zarr reads and anonymous public
remote reads over HTTPS for both backends.
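
The URL-shape dispatch described in this commit message can be sketched roughly as below. This is a minimal illustration, not the code in this PR: `guess_backend` is a hypothetical helper name, the returned strings simply name the IO class that would be chosen, and the all-zero UUID in the usage note is a made-up placeholder.

```python
from urllib.parse import urlsplit


def guess_backend(path: str) -> str:
    """Pick a backend name from the shape of a path or URL, without opening it.

    Hypothetical helper for illustration only; pynwb's real logic may differ.
    """
    parts = urlsplit(path)
    # Look only at the path component so query strings do not hide the suffix,
    # and drop trailing slashes because Zarr stores are directories.
    url_path = parts.path.rstrip("/")
    if url_path.lower().endswith(".zarr"):
        return "NWBZarrIO"
    # Assumed DANDI convention: Zarr assets live under /zarr/<uuid>/.
    # Only applied to URLs (non-empty scheme), not to local paths.
    if parts.scheme and "/zarr/" in parts.path:
        return "NWBZarrIO"
    # Everything else falls through to the HDF5 backend.
    return "NWBHDF5IO"
```

For example, `guess_backend("s3://bucket/recording.zarr")` and `guess_backend("https://dandiarchive.s3.amazonaws.com/zarr/00000000-0000-0000-0000-000000000000/")` both select the Zarr backend, while a plain `.nwb` URL selects HDF5. Because nothing is opened, a mislabeled file is only detected later, when the chosen IO class tries to read it.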
0ccff58 to 6536cda
oruebel
reviewed
Jul 1, 2026
oruebel
previously approved these changes
Jul 1, 2026
oruebel
reviewed
Jul 1, 2026
Other than moving the changelog entry, this looks good to me. I'll let @rly handle merge.
oruebel
approved these changes
Jul 2, 2026
Closes #2150.
The main contribution of this PR is enabling pynwb.read_nwb(path) to read NWB files in remote public locations, such as public DANDI Archive assets served over HTTPS (e.g. https://dandiarchive.s3.amazonaws.com/...) or any publicly readable S3, GCS, or HTTPS object store.

To move this forward I have decided not to use the can_read methods for dispatch, as I did in #1994 for local file support. What we have here is simple URL pattern matching: routing based on conventions like the .zarr suffix and DANDI's /zarr/<uuid>/ URL layout, without opening the file. This avoids the performance cost of reading the file twice and also does not overpromise (can_read implies a strong contract that is hard to deliver given all the complexity of remote files).

Private files can still be accessed in many cases. Examples include a private S3 bucket read with s3:// when AWS credentials are already configured in the environment (AWS_PROFILE or ~/.aws/credentials), a gs:// object when GOOGLE_APPLICATION_CREDENTIALS is set, or an abfs:// path via Azure managed identity; fsspec picks up each backend's default credential chain automatically.

That said, I did not want to make the signature more complex by adding login configuration parameters, as that would go against the original spirit of this function. The original read_nwb design from #1974 was as simple as possible, without config. Power-user scenarios (e.g. a forced h5py ROS3 driver or custom S3-compatible endpoints) continue to require dropping down to NWBHDF5IO or NWBZarrIO directly, and I think this is where private access should be handled.

The remote-Zarr test depends on the resolve_ref self-reference fix for fsspec stores from hdmf-dev/hdmf-zarr#348, which I opened upstream. That fix is now released in hdmf-zarr 0.13.0, so no special pin is needed and the test runs against a normal hdmf-zarr install.

I am also fixing a pre-existing scheme bug in NWBHDF5IO.read_nwb's streaming branch: the fsspec filesystem was hard-coded to "http" regardless of URL scheme, so s3://, gs://, and abfs:// paths silently failed.

Checklist
ruff check . && codespell from the source directory.
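
The scheme fix mentioned in the description (deriving the fsspec protocol from the URL instead of always passing "http") might look roughly like this. It is a sketch under stated assumptions, not the PR's code: `filesystem_protocol` is a hypothetical helper, and the fsspec call appears only in a comment so the block runs without fsspec installed.

```python
from urllib.parse import urlsplit


def filesystem_protocol(url: str, default: str = "file") -> str:
    """Return the protocol name to hand to fsspec for this URL.

    Hypothetical helper: paths without a scheme fall back to `default`.
    """
    scheme = urlsplit(url).scheme.lower()
    return scheme or default


# Buggy pattern described above (protocol fixed regardless of the URL):
#     fs = fsspec.filesystem("http")
# Scheme-aware pattern (assumes fsspec plus the matching backend package,
# e.g. s3fs for s3://, gcsfs for gs://, adlfs for abfs://, is installed):
#     fs = fsspec.filesystem(filesystem_protocol(url))
```

fsspec also provides `fsspec.core.url_to_fs(url)`, which infers the filesystem from the URL directly; whether this PR uses that function or an explicit scheme lookup is not stated here.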