Updates from upstream repository. #108
Merged
douglowe merged 47 commits into eScienceLab:develop from Apr 29, 2026
…and corresponding test
… supporting Link header alternate
… remote context retrieval
…olved via alternate Link header
… case-insensitively for remote context retrieval
Fixed check for link between root data entity and data entities
…iguration parameters
feat: extend and improve caching support
feat: allow context-prefixed terms in validation
feat: support remote JSON-LD context retrieval via alternate link headers
…etadata descriptor
…ontology and use sh:targetClass
fix: improve SHACL violation parsing
fix(core): 💄 fix output formatting
ci(gh-actions): ⬆️ update outdated GitHub Actions
prepare release v0.9.0
Pull request overview
Syncs in upstream updates focused on improving SHACL validation robustness (especially for SPARQL constraints), expanding RO-Crate test coverage/data, and adding configurable HTTP caching for remote context retrieval.
Changes:
- Add HTTP caching configuration (CLI + settings) and improve remote JSON-LD `@context` retrieval logic (content-type handling + Link rel=alternate fallback).
- Improve SHACL validation handling for violations where `sourceShape` is a BNode (e.g., SPARQL constraints) and align ontology parsing `publicID` using JSON-LD `@base`.
- Add/extend unit + integration tests and new test profiles/crates (including `@base` use and cyclic/unlinked datasets).
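The Link rel=alternate fallback mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name `find_jsonld_alternate` and the example URLs are hypothetical, and the parsing is deliberately simplified (it does not handle semicolons inside quoted parameter values). It compares `rel` and `type` case-insensitively and resolves relative targets against the URL that was requested.

```python
import re
from typing import Optional
from urllib.parse import urljoin

JSONLD_TYPE = "application/ld+json"


def find_jsonld_alternate(link_header: str, base_url: str) -> Optional[str]:
    """Return the absolute URL of the first rel="alternate" JSON-LD link, or None.

    Hypothetical helper for illustration only.
    """
    # Split only on commas that are followed by a new "<target>", so that
    # commas inside parameter values do not break the parse.
    for part in re.split(r",\s*(?=<)", link_header):
        match = re.match(r"\s*<([^>]*)>(.*)", part, re.DOTALL)
        if not match:
            continue
        target, raw_params = match.groups()

        # Collect parameters with lower-cased keys and values so the
        # comparison below is case-insensitive.
        params = {}
        for param in raw_params.split(";"):
            if "=" in param:
                key, value = param.split("=", 1)
                params[key.strip().lower()] = value.strip().strip('"').lower()

        rels = params.get("rel", "").split()
        if "alternate" in rels and params.get("type") == JSONLD_TYPE:
            # Relative targets are resolved against the requested URL.
            return urljoin(base_url, target)
    return None
```

A caller would typically try this only after the primary response turned out not to be JSON-LD, then fetch the returned URL instead.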
Reviewed changes
Copilot reviewed 29 out of 32 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| tests/unit/test_remote_context_retrieval.py | New unit tests for remote context retrieval logic. |
| tests/unit/requirements/test_shacl_checks.py | New unit tests covering SHACL check description fallbacks + parent-shape resolution. |
| tests/ro_crates.py | Adds new test crate path helpers. |
| tests/integration/test_sparql_constraints.py | New integration coverage for SPARQL constraint violations and BNode sourceShape. |
| tests/integration/profiles/test_metadata_only.py | Excludes the new @base crate from generic metadata-only test set. |
| tests/integration/profiles/ro-crate/test_valid_ro-crate.py | Adds integration test for a valid crate with @base set (with skip). |
| tests/integration/profiles/ro-crate/test_data_entity_metadata.py | Adds integration test for datasets not linked to the root. |
| tests/data/profiles/sparql_test/profile.ttl | Adds a dedicated test profile for SPARQL constraint handling. |
| tests/data/profiles/sparql_test/must/agent_project_intersection.ttl | Adds a shape with an always-failing SPARQL constraint to trigger violations. |
| tests/data/crates/valid/rocrate-with-custom-terms/ro-crate-metadata.json | Extends custom-terms crate with rdfs prefix usage. |
| tests/data/crates/valid/rocrate-with-at-base-set/ro-crate-metadata.json | Adds a large “valid” crate using JSON-LD @base. |
| tests/data/crates/valid/rocrate-with-at-base-set/index.html | Adds accompanying file for the @base crate. |
| tests/data/crates/valid/rocrate-with-at-base-set/example1/.gitkeep | Keeps an empty example directory tracked. |
| tests/data/crates/invalid/4_data_entity_metadata/cyclic_datasets/ro-crate-metadata.json | Adds an invalid crate with cyclic/unlinked datasets for negative testing. |
| rocrate_validator/utils/rdf.py | Adds helper to extract JSON-LD @base from metadata. |
| rocrate_validator/utils/io_helpers/output/text/formatters.py | Adjusts Rich padding for text output formatting. |
| rocrate_validator/utils/http.py | Enhances HttpRequester singleton with cache configuration and cleanup. |
| rocrate_validator/requirements/shacl/validator.py | Uses @base as ontology parse publicID when available. |
| rocrate_validator/requirements/shacl/utils.py | Adds resolve_parent_shape helper to map BNode constraint nodes to owning shapes. |
| rocrate_validator/requirements/shacl/models.py | Removes debug serialization side-effect when processing groups. |
| rocrate_validator/requirements/shacl/checks.py | Handles SHACL violations with BNode sourceShape via resolve_parent_shape; improves description fallback. |
| rocrate_validator/profiles/ro-crate/ontology.ttl | Adds ROCrateMetadataFileDescriptor class and an individual for ro-crate-metadata.json. |
| rocrate_validator/profiles/ro-crate/must/4_data_entity_metadata.ttl | Updates “linked to root” constraint to allow indirect linking via path expressions. |
| rocrate_validator/profiles/ro-crate/must/1_file-descriptor_metadata.ttl | Adds SPARQL-based existence check + adjusts descriptor identification and targeting. |
| rocrate_validator/profiles/ro-crate/must/0_file_descriptor_format.py | Improves remote context retrieval and compaction key checking (compact IRI prefixes). |
| rocrate_validator/models.py | Adds cache settings to ValidationSettings and initializes HttpRequester cache. |
| rocrate_validator/constants.py | Renames/adjusts default HTTP cache setting constant. |
| rocrate_validator/cli/commands/validate.py | Adds CLI options for cache configuration and wires them into ValidationSettings. |
| pyproject.toml | Bumps package version to 0.9.0. |
| .github/workflows/testing.yaml | Updates GitHub Actions versions for checkout/setup-python. |
| .github/workflows/release.yaml | Updates GitHub Actions versions for checkout/setup-python in release pipeline. |
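
As a rough illustration of the `@base` handling listed for `rocrate_validator/utils/rdf.py` and `rocrate_validator/requirements/shacl/validator.py`, the sketch below pulls `@base` out of a parsed `ro-crate-metadata.json` document. The function name `extract_base` is hypothetical and the actual helper in this PR may differ. It assumes `@context` may be a string, an object, or an array of either, and that later array entries override earlier ones.

```python
from typing import Optional


def extract_base(metadata: dict) -> Optional[str]:
    """Return the JSON-LD @base declared in the document's @context, or None.

    Hypothetical helper for illustration only.
    """
    context = metadata.get("@context")
    # Normalise to a list so string, object, and array contexts share one path.
    entries = context if isinstance(context, list) else [context]

    base = None
    for entry in entries:
        # Only object-valued context entries can declare @base; a later
        # entry (including an explicit null) replaces an earlier value.
        if isinstance(entry, dict) and "@base" in entry:
            base = entry["@base"]
    return base
```

The returned value could then be passed as the `publicID` argument when parsing with rdflib, falling back to the default when it is `None`. Resolving a relative `@base` against an earlier one is left out of this sketch.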
This merges in developments made upstream. Copilot summary below.
=========================================
This pull request introduces several new features and improvements, primarily focused on enhancing HTTP cache configurability, improving JSON-LD context retrieval and validation, and updating dependencies and metadata. The most significant changes include new command-line options for HTTP cache control, more robust handling of JSON-LD contexts (including alternate links), and various codebase and workflow updates.
HTTP Cache Configuration Enhancements:
- Added new options (`--cache-max-age`, `--cache-path`, `--no-cache`) to the `validate` CLI command, allowing users to control HTTP cache behavior for remote resource fetching. These options are wired through to the validation settings and used to initialize the cache.
JSON-LD Context Retrieval & Validation:
- Improved remote context retrieval in `FileDescriptorJsonLdFormat` to handle alternate links via HTTP `Link` headers, ensure correct content types, and robustly resolve relative URLs when necessary. This improves compatibility with various JSON-LD context hosting setups.
Dependency and Workflow Updates:
- Updated GitHub Actions workflows to newer versions of `actions/checkout` and `actions/setup-python` for improved reliability and security.
- Bumped the package version to `0.9.0` in `pyproject.toml`.
Validation and Model Improvements:
RO-Crate Profile Updates: