Updates from upstream repository. #108

Merged
douglowe merged 47 commits into eScienceLab:develop from crs4:develop
Apr 29, 2026

Conversation

@douglowe douglowe commented Apr 29, 2026

This merges in developments made upstream. Copilot summary below.

=========================================

This pull request introduces several new features and improvements, primarily focused on enhancing HTTP cache configurability, improving JSON-LD context retrieval and validation, and updating dependencies and metadata. The most significant changes include new command-line options for HTTP cache control, more robust handling of JSON-LD contexts (including alternate links), and various codebase and workflow updates.

HTTP Cache Configuration Enhancements:

  • Added new command-line options (--cache-max-age, --cache-path, --no-cache) to the validate CLI command, allowing users to control HTTP cache behavior for remote resource fetching. These options are wired through to the validation settings and used to initialize the cache. [1] [2] [3] [4]
  • Updated the default HTTP cache max age to 300 seconds and improved how cache settings are handled and logged throughout the codebase. [1] [2]
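As a rough illustration of how the three options might interact, here is a minimal sketch using the standard-library `argparse`. The parser, the `resolve_cache_settings` helper, and the returned dictionary keys are hypothetical and are not the validator's actual implementation; only the option names and the 300-second default come from the summary above. The rule that `--no-cache` overrides the other two options is also an assumption.

```python
import argparse

DEFAULT_CACHE_MAX_AGE = 300  # seconds; the default stated in the summary above


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the three cache options named in this PR."""
    parser = argparse.ArgumentParser(prog="validate-sketch")
    parser.add_argument("--cache-max-age", type=int, default=DEFAULT_CACHE_MAX_AGE,
                        help="Maximum age of cached HTTP responses, in seconds")
    parser.add_argument("--cache-path", default=None,
                        help="Location used to persist the HTTP cache")
    parser.add_argument("--no-cache", action="store_true",
                        help="Disable HTTP caching entirely")
    return parser


def resolve_cache_settings(argv):
    """Turn CLI arguments into a plain settings dict (illustrative keys only)."""
    args = build_parser().parse_args(argv)
    if args.no_cache:
        # Assumption: --no-cache takes precedence over the other two options.
        return {"enabled": False, "max_age": None, "path": None}
    return {"enabled": True, "max_age": args.cache_max_age, "path": args.cache_path}
```

Calling `resolve_cache_settings([])` in this sketch yields caching enabled with a 300-second max age and no explicit path.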

JSON-LD Context Retrieval & Validation:

  • Refactored JSON-LD context retrieval in FileDescriptorJsonLdFormat to handle alternate links via HTTP Link headers, ensure correct content types, and robustly resolve relative URLs when necessary. This improves compatibility with various JSON-LD context hosting setups. [1] [2]
  • Enhanced the logic for checking unexpected keys in JSON-LD entities, properly handling reserved keywords and compact IRIs, and improved debug logging for these checks.
  • Fixed minor issues and improved error messages in JSON-LD key compaction checks.
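The alternate-link fallback described above can be sketched roughly as follows. This is not the code in `FileDescriptorJsonLdFormat`; the function name `find_alternate_jsonld` is hypothetical, and the header parsing is a deliberate simplification of RFC 8288 (it does not handle commas or semicolons inside quoted parameter values). It assumes the server advertises the context with `rel="alternate"` and `type="application/ld+json"`, and that relative targets resolve against the requested URL.

```python
import re
from typing import Optional
from urllib.parse import urljoin

JSONLD_TYPE = "application/ld+json"


def find_alternate_jsonld(link_header: str, base_url: str) -> Optional[str]:
    """Return the absolute URL of the first rel="alternate" JSON-LD link, if any."""
    # Split on commas that are followed by a new <target> (simplified parsing).
    for part in re.split(r",\s*(?=<)", link_header):
        match = re.match(r"\s*<([^>]*)>(.*)", part, flags=re.S)
        if not match:
            continue
        target, raw_params = match.groups()
        params = {}
        for param in raw_params.split(";"):
            if "=" in param:
                key, value = param.split("=", 1)
                params[key.strip().lower()] = value.strip().strip('"')
        # Compare relation types and media types case-insensitively.
        rels = params.get("rel", "").lower().split()
        if "alternate" in rels and params.get("type", "").lower() == JSONLD_TYPE:
            # Relative targets are resolved against the URL that was requested.
            return urljoin(base_url, target)
    return None
```

A caller would typically try this only after the first response came back with a non-JSON-LD content type, then fetch the returned URL instead.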

Dependency and Workflow Updates:

  • Updated GitHub Actions to use the latest versions of actions/checkout and actions/setup-python for improved reliability and security. [1] [2] [3]
  • Bumped the project version to 0.9.0 in pyproject.toml.

Validation and Model Improvements:

  • Improved logging and error handling in the validation models and context, including better reporting of unmatched profiles and more robust statistics output. [1] [2] [3]

RO-Crate Profile Updates:

  • Added a new SHACL constraint to the RO-Crate profile to ensure the existence of a metadata file descriptor entity, strengthening conformance checks.

floWetzels and others added 30 commits February 11, 2026 14:21
… case-insensitively for remote context retrieval
Fixed Check for link between root data entity and data entities
feat: extend and improve caching support
feat: allow context-prefixed terms in validation
feat: support remote JSON-LD context retrieval via alternate link headers
@douglowe douglowe self-assigned this Apr 29, 2026
Copilot AI review requested due to automatic review settings April 29, 2026 14:31

Copilot AI left a comment

Pull request overview

Syncs in upstream updates focused on improving SHACL validation robustness (especially for SPARQL constraints), expanding RO-Crate test coverage/data, and adding configurable HTTP caching for remote context retrieval.

Changes:

  • Add HTTP caching configuration (CLI + settings) and improve remote JSON-LD @context retrieval logic (content-type handling + Link rel=alternate fallback).
  • Improve SHACL validation handling for violations where sourceShape is a BNode (e.g., SPARQL constraints) and align ontology parsing publicID using JSON-LD @base.
  • Add/extend unit + integration tests and new test profiles/crates (including @base use and cyclic/unlinked datasets).
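To make the `@base` point concrete, here is a minimal sketch of pulling a `@base` value out of RO-Crate metadata so it could be passed on as a parse-time base IRI. The function name `extract_base` is hypothetical and this is not the helper added in this PR. It assumes only the top-level `@context` is inspected, and it does not resolve a relative `@base` against an earlier one as a full JSON-LD processor would.

```python
from typing import Any, Optional


def extract_base(metadata: dict) -> Optional[str]:
    """Return the @base in effect after reading the top-level @context, or None."""
    context: Any = metadata.get("@context")
    # Normalise to a list so a single string or dict is handled the same way.
    entries = context if isinstance(context, list) else [context]
    base = None
    for entry in entries:
        if isinstance(entry, dict) and "@base" in entry:
            value = entry["@base"]
            # Later entries override earlier ones; a null @base clears it.
            base = value if isinstance(value, str) else None
    return base
```

With a crate whose context is `["https://w3id.org/ro/crate/1.1/context", {"@base": "https://example.org/crate/"}]`, the sketch returns the second entry's `@base` string; the example IRI is made up for illustration.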

Reviewed changes

Copilot reviewed 29 out of 32 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| tests/unit/test_remote_context_retrieval.py | New unit tests for remote context retrieval logic. |
| tests/unit/requirements/test_shacl_checks.py | New unit tests covering SHACL check description fallbacks + parent-shape resolution. |
| tests/ro_crates.py | Adds new test crate path helpers. |
| tests/integration/test_sparql_constraints.py | New integration coverage for SPARQL constraint violations and BNode sourceShape. |
| tests/integration/profiles/test_metadata_only.py | Excludes the new @base crate from generic metadata-only test set. |
| tests/integration/profiles/ro-crate/test_valid_ro-crate.py | Adds integration test for a valid crate with @base set (with skip). |
| tests/integration/profiles/ro-crate/test_data_entity_metadata.py | Adds integration test for datasets not linked to the root. |
| tests/data/profiles/sparql_test/profile.ttl | Adds a dedicated test profile for SPARQL constraint handling. |
| tests/data/profiles/sparql_test/must/agent_project_intersection.ttl | Adds a shape with an always-failing SPARQL constraint to trigger violations. |
| tests/data/crates/valid/rocrate-with-custom-terms/ro-crate-metadata.json | Extends custom-terms crate with rdfs prefix usage. |
| tests/data/crates/valid/rocrate-with-at-base-set/ro-crate-metadata.json | Adds a large “valid” crate using JSON-LD @base. |
| tests/data/crates/valid/rocrate-with-at-base-set/index.html | Adds accompanying file for the @base crate. |
| tests/data/crates/valid/rocrate-with-at-base-set/example1/.gitkeep | Keeps an empty example directory tracked. |
| tests/data/crates/invalid/4_data_entity_metadata/cyclic_datasets/ro-crate-metadata.json | Adds an invalid crate with cyclic/unlinked datasets for negative testing. |
| rocrate_validator/utils/rdf.py | Adds helper to extract JSON-LD @base from metadata. |
| rocrate_validator/utils/io_helpers/output/text/formatters.py | Adjusts Rich padding for text output formatting. |
| rocrate_validator/utils/http.py | Enhances HttpRequester singleton with cache configuration and cleanup. |
| rocrate_validator/requirements/shacl/validator.py | Uses @base as ontology parse publicID when available. |
| rocrate_validator/requirements/shacl/utils.py | Adds resolve_parent_shape helper to map BNode constraint nodes to owning shapes. |
| rocrate_validator/requirements/shacl/models.py | Removes debug serialization side-effect when processing groups. |
| rocrate_validator/requirements/shacl/checks.py | Handles SHACL violations with BNode sourceShape via resolve_parent_shape; improves description fallback. |
| rocrate_validator/profiles/ro-crate/ontology.ttl | Adds ROCrateMetadataFileDescriptor class and an individual for ro-crate-metadata.json. |
| rocrate_validator/profiles/ro-crate/must/4_data_entity_metadata.ttl | Updates “linked to root” constraint to allow indirect linking via path expressions. |
| rocrate_validator/profiles/ro-crate/must/1_file-descriptor_metadata.ttl | Adds SPARQL-based existence check + adjusts descriptor identification and targeting. |
| rocrate_validator/profiles/ro-crate/must/0_file_descriptor_format.py | Improves remote context retrieval and compaction key checking (compact IRI prefixes). |
| rocrate_validator/models.py | Adds cache settings to ValidationSettings and initializes HttpRequester cache. |
| rocrate_validator/constants.py | Renames/adjusts default HTTP cache setting constant. |
| rocrate_validator/cli/commands/validate.py | Adds CLI options for cache configuration and wires them into ValidationSettings. |
| pyproject.toml | Bumps package version to 0.9.0. |
| .github/workflows/testing.yaml | Updates GitHub Actions versions for checkout/setup-python. |
| .github/workflows/release.yaml | Updates GitHub Actions versions for checkout/setup-python in release pipeline. |

Comment thread rocrate_validator/profiles/ro-crate/must/4_data_entity_metadata.ttl
Comment thread tests/unit/test_remote_context_retrieval.py
Comment thread tests/integration/test_sparql_constraints.py
Comment thread rocrate_validator/cli/commands/validate.py
Comment thread rocrate_validator/cli/commands/validate.py
Comment thread rocrate_validator/utils/http.py
@douglowe douglowe merged commit b4c33bd into eScienceLab:develop Apr 29, 2026
6 checks passed
4 participants