Support sharding through config and raster_write_kwargs #1106
Open
melonora wants to merge 30 commits into scverse:main
- Added additional settings
- Allow environment variables that overwrite config
Failing at the moment because the required ome-zarr version has not been released yet. You can test locally with ome-zarr-py from main. We also need to add support for zarrs to improve the speed of shard I/O.
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1106      +/-   ##
==========================================
+ Coverage   91.93%   91.95%   +0.02%
==========================================
  Files          51       51
  Lines        7772     7881     +109
==========================================
+ Hits         7145     7247     +102
- Misses        627      634       +7
Only these versions are supported because they use the zarr API correctly inside dask and also make it possible to set the tune optimization. The latter is required to prevent errors caused by dask partitions collapsing when data is read back in from Parquet.
Should we also allow control of sharding for anndata?
Yes, but not as part of this PR. I will adjust the config to accommodate it, though.
fix: pass raster_write_kwargs recursively when writing list of elements
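The behaviour this commit message describes can be pictured with a minimal sketch. The function names (`write_element`, `_write_raster`) and the dict passed as `raster_write_kwargs` are hypothetical stand-ins, not the actual spatialdata implementation; the only point is that the keyword arguments are forwarded on every recursive call instead of being dropped when a list of elements is passed.

```python
from typing import Any

# Hypothetical stand-in for a low-level raster writer. It only records
# what it was called with, so the forwarding behaviour is easy to inspect.
written: list[tuple[str, dict[str, Any]]] = []


def _write_raster(name: str, raster_write_kwargs: dict[str, Any]) -> None:
    written.append((name, dict(raster_write_kwargs)))


def write_element(element_name, raster_write_kwargs=None) -> None:
    """Write one element, or recurse over a (possibly nested) list of elements."""
    kwargs = raster_write_kwargs or {}
    if isinstance(element_name, (list, tuple)):
        for name in element_name:
            # Forward the same kwargs on each recursive call.
            write_element(name, raster_write_kwargs=kwargs)
        return
    _write_raster(element_name, kwargs)


# The kwargs content here is illustrative only.
write_element(
    ["image_a", ["image_b", "image_c"]],
    raster_write_kwargs={"shards": (512, 512)},
)
```

After the call, every entry in `written` carries the same keyword arguments, including the elements that sat inside the nested list.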
This PR adds the following:
- raster_write_kwargs for io functions like .write and .write_element. This also adds the ability to write sharded arrays. Support for anndata sharding is to be added in a follow-up PR.
- raster_write_kwargs argument.
- raster_chunks and raster_shards. The config can now be stored in a default location or a custom location. Additionally, environment variables can be set to temporarily override the values.

Additional changes
@LucaMarconato
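
The precedence described in the PR description (defaults, then a config file, then environment variables as temporary overrides) could look roughly like the sketch below. Everything in it is an assumption made for illustration: the JSON file format, the `EXAMPLE_SDATA_` prefix, and the helper names `load_settings` and `chunks_per_shard` are hypothetical and are not taken from spatialdata. The second helper illustrates the general Zarr v3 sharding constraint that a shard holds a whole number of chunks along each axis.

```python
import json
import os
import tempfile
from pathlib import Path

# Hypothetical defaults and environment-variable prefix, for illustration only.
DEFAULTS = {"raster_chunks": None, "raster_shards": None}
ENV_PREFIX = "EXAMPLE_SDATA_"


def load_settings(config_path=None, environ=None):
    """Resolve settings: defaults < config file < environment variables."""
    environ = os.environ if environ is None else environ
    settings = dict(DEFAULTS)
    # 1. Values from the config file (default or custom location), if present.
    if config_path is not None and Path(config_path).is_file():
        settings.update(json.loads(Path(config_path).read_text()))
    # 2. Environment variables temporarily override the file values.
    for key in DEFAULTS:
        raw = environ.get(ENV_PREFIX + key.upper())
        if raw is not None:
            settings[key] = json.loads(raw)
    return settings


def chunks_per_shard(chunks, shards):
    """Number of chunks in one shard; each shard axis must be a multiple of the chunk axis."""
    if len(chunks) != len(shards):
        raise ValueError("chunks and shards must have the same number of dimensions")
    if any(s % c for c, s in zip(chunks, shards)):
        raise ValueError("each shard dimension must be a multiple of the chunk dimension")
    count = 1
    for c, s in zip(chunks, shards):
        count *= s // c
    return count


with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp) / "settings.json"
    cfg.write_text(json.dumps({"raster_chunks": [256, 256], "raster_shards": [1024, 1024]}))
    from_file = load_settings(cfg, environ={})
    overridden = load_settings(cfg, environ={"EXAMPLE_SDATA_RASTER_SHARDS": "[2048, 2048]"})
```

Passing `environ` explicitly keeps the sketch testable without touching the real process environment; in real use the override would come from `os.environ`.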