Fix active preset branch resolution#5658
Conversation
There was a problem hiding this comment.
🤖 Automated Code Review
PR Summary: This PR fixes nested PresetCfg resolution to scope child preset choices to the selected parent branch, preventing collisions from inactive sibling branches. It also improves performance by bypassing Hydra composition for preset-only and plain scalar overrides.
✅ Strengths
-
Core fix is sound: The BFS-based
_resolve_active_presets()correctly addresses the fundamental issue by only queuing selected branches for further traversal. This ensures nested presets from inactive siblings cannot collide. -
Significant performance improvement: The benchmarks show 76-99% reduction in config resolution time across various tasks — impressive optimization that benefits all training workloads.
-
Good test coverage for new behavior: New tests
test_nested_presetcfg_global_preset_uses_selected_parent_branchandtest_nested_presetcfg_path_preset_uses_selected_parent_branchdirectly validate the branch-scoping fix. -
Comprehensive benchmark script:
benchmark_hydra_resolve.pyfollows Isaac Lab benchmark conventions and provides reproducible performance measurements.
🔍 Suggestions & Observations
1. ast.literal_eval behavior change (minor)
# Before (hydra.py L724-726):
return float(s) if "." in s else int(s)
# After:
return ast.literal_eval(s)Context: ast.literal_eval is more robust but has slightly different behavior for edge cases (e.g., "0x10" now parses as 16, strings with leading zeros may fail). This is likely an improvement, but worth noting for backward compatibility.
2. Consider early return for empty explicit dict
In _resolve_active_presets(), when both selected is empty and explicit is empty, the function still performs a full BFS traversal. For large configs without any preset resolution needed, an early check could skip unnecessary work:
if not selected and not explicit and not isinstance(cfg, PresetCfg):
return cfgNote: This is a minor optimization suggestion, not a bug.
3. Docstring update for resolve_presets
The docstring still mentions "depth-first walk" but the implementation now uses BFS via _resolve_active_presets():
def resolve_presets(cfg, selected=()):
"""Replace every :class:`PresetCfg` in the tree with the best alternative.
For each ``PresetCfg`` found during an active-tree breadth-first walk:The docstring is already updated — good attention to detail! ✓
4. Test: consider adding a collision scenario test
The new tests verify correct branch selection, but it might be valuable to add an explicit test that demonstrates the original bug would fail (i.e., a test where without this fix, two sibling branches would have conflicting nested preset names that would incorrectly collide).
ℹ️ CI Status
CI checks are still pending. Please ensure all tests pass before merging.
Verdict: The changes are well-designed and address a real issue in the preset resolution system. The performance improvements are a nice bonus. Ready for merge pending CI completion.
🏷️ This review focused on the active-tree resolution logic, preset conflict detection, and test coverage.
📦 Update (be61832)
The incremental changes introduce well-organized CI and OVRTX improvements:
CI Improvements
- Dedicated rendering CI jobs: Rendering-correctness tests now run in isolated
test-rendering-correctnessandtest-rendering-correctness-kitlessjobs, reducing noise in the main test shards and making failures easier to diagnose. - Removed redundant
ovrtxpip installs: Theextra-pip-packages: "ovrtx"is now only specified for the kitless rendering job where it's actually needed.
OVRTX Renderer Optimizations
- Eliminated disk I/O: Using
stage.ExportToString()+open_usd_from_string()instead of writing intermediate USD files to disk. This should improve performance and avoid filesystem race conditions. - Simplified config: Removed
temp_usd_suffixoption;temp_usd_dirnow defaults toNone(in-memory mode). - Backward compatibility preserved: OVRTX <0.3.0 still materializes to disk via a temp file fallback.
Test Cleanups
- Removed unnecessary
AppLauncherboilerplate from ~20 test files that don't require Kit — nice cleanup that makes tests faster and more maintainable.
All changes look good and are consistent with the PR's goal of improving config resolution performance. No new concerns.
🏷️ Incremental review focused on CI changes, OVRTX disk I/O optimization, and test cleanup.
📦 Update (c6ce7e1)
This large incremental commit adds significant new functionality beyond the original preset-fix scope:
New Features
-
Ray-caster backends for PhysX and Newton: Added
RayCaster,RayCasterCamera,MultiMeshRayCaster, andMultiMeshRayCasterCameraimplementations for both backends. The PhysX implementation usesRigidBodyViewfor pose tracking; Newton uses site-based tracking viaNewtonManager.cl_register_site. -
Raycaster camera presets for Dexsuite Kuka-Allegro task: Added
raycaster_depth64/128/256presets for both base and wrist cameras, enabling depth rendering via the newMultiMeshRayCasterCamera. -
Comprehensive install CI test suite: Added extensive test coverage for the new installation model (
-i none,-i newton,-i rl[...],-i mimic,-i visualizer[...]). Tests cover uv and conda environments with GPU/training validation. -
Docker test improvements:
run_install_ci.pynow supports--condaflag for conda-enabled images and improved JUnit XML result copying from containers.
API Changes
- Test files updated to use
.torchaccessor onProxyArrayreturns (e.g.,camera.data.output["distance_to_image_plane"].torchinstead of direct access). isaaclab_mimic:robomimicmoved from optional extra to required dependency on Linux.rsl-rl-libupgraded from 3.1.2 to 5.0.1.
Version Bumps
- isaaclab_experimental: 0.0.4 → 0.0.5
- isaaclab_mimic: 1.2.7 → 1.3.0
- isaaclab_newton: 0.10.0 → 0.11.0
- isaaclab_ov: 0.2.0 → 0.2.1
- isaaclab_ovphysx: 1.0.0 → 2.0.0
- isaaclab_physx: 0.8.0 → 0.9.0
- isaaclab_tasks: 1.7.0 → 1.8.0
Changelog Entries
- All changelog.d skip files removed and proper CHANGELOG.rst entries added.
Status: ✅ No new issues found. The ray-caster backend implementations look well-structured with appropriate separation between PhysX and Newton paths. The install CI expansion significantly improves test coverage for the new modular installation system.
🏷️ Incremental review focused on ray-caster backends, install CI, and version updates.
📦 Update (407ffa4)
Latest merge from develop (18 commits) brings additional changes:
Key Changes
-
Camera module torch migration:
camera.pynow completely decoupled from torch, using backend-agnosticProxyArraypatterns. Internal tensor ops replaced with device-agnostic utilities. -
OVPHYSX SceneDataProvider: New
SceneDataProviderintegration inovphysx_manager.py(+195 lines) for direct scene data access without Kit dependency. -
Documentation updates:
- New warp environment docs and migration guide
- Ecosystem diagrams converted from JPG to SVG for better rendering
- Exposed missing API docs entries
-
DexSuite camera mini-batches (#5668): Camera configuration optimizations for the Kuka-Allegro task.
-
Removed
is_homogeneoushelper (#5667): Cleanup of deprecated utility from isaaclab core. -
Install command fixes (#5663): Aligns
uv runpath handling with./isaaclab.sh -iapproach.
CI/Wheel Changes
- Wheel build workflow now uses change detection to skip unnecessary builds
- Added dry-run wheel extras resolution test
- Install CI split into separate uv and conda jobs
No Issues Found
The merged develop changes integrate cleanly with the PR's preset resolution fix. The camera torch migration and scene data provider additions are well-isolated from the core preset logic.
📦 Update (111100e)
Merge from develop bringing documentation clarifications from #5648:
Changes
- Locomanipulation SDG docs: Clarified that the LeRobot conversion script's input directory should contain only SDG HDF5 files from
generate_data.py— not mixed with other HDF5 files from earlier pipeline steps. Added example showing how to move SDG outputs into a dedicated subdirectory before conversion.
This is a docs-only change with no impact on the core preset resolution functionality.
Final Verdict: ✅ Ready for merge pending CI completion. The core preset branch resolution fix remains sound, and all merged develop changes integrate cleanly.
🏷️ Incremental review: docs clarification only (SDG → LeRobot conversion).
Greptile SummaryThis PR rewrites the
Confidence Score: 3/5The active-tree BFS fix is correct and well-tested, but the The
Important Files Changed
|
| benchmark.update_manual_recorders() | ||
| benchmark._finalize_impl() |
There was a problem hiding this comment.
Accessing private
_finalize_impl() method
benchmark._finalize_impl() is a private method. Calling it directly couples the benchmark script to internal implementation details of BaseIsaacLabBenchmark and will silently break if the method is renamed or its pre-conditions change. The public finalize() method (if it exists) should be preferred; if not, a TODO comment should track replacing this with a stable public API.
Resolve PresetCfg choices by walking only the active tree so nested preset paths from inactive sibling branches cannot collide. Bypass Hydra composition for preset-only and plain scalar override paths to avoid repeated config walks on common task configuration loads.
1db6ec7 to
be61832
Compare
## Summary - Resolves `PresetCfg` choices by walking the active config tree, so nested preset names from inactive sibling branches cannot collide. - Avoids unnecessary Hydra/OmegaConf composition for preset-only and plain scalar override paths. - Adds focused branch-scoping regressions and a `benchmark_hydra_resolve.py` benchmark that also writes standard Isaac Lab benchmark measurements. ## Benchmark Measured `resolve_task_config()` only. Values are median wall time over 20 iterations after 3 warmups, in ms. ```text Case Pre-PR ms After PR ms Delta ms Faster cartpole_manager 114.44 2.20 -112.24 98.1% cartpole_camera_presets 142.77 23.00 -119.77 83.9% cartpole_camera_newton_ovrtx 280.90 22.88 -258.02 91.9% anymal_rough 296.99 6.09 -290.90 97.9% anymal_rough_scalar 330.45 6.14 -324.31 98.1% franka_lift_cube 308.55 4.87 -303.68 98.4% franka_reach 394.08 4.42 -389.66 98.9% franka_lift_cube_agent 406.42 5.97 -400.45 98.5% kuka_allegro_lift 838.77 61.83 -776.94 92.6% kuka_allegro_lift_single_camera 867.16 61.94 -805.22 92.9% kuka_allegro_lift_duo_camera 885.02 62.45 -822.57 92.9% kuka_allegro_lift_scalar 873.02 62.01 -811.01 92.9% cartpole_direct 240.27 1.30 -238.97 99.5% cartpole_rgb_direct 257.37 1.26 -256.11 99.5% ant_manager 303.60 2.72 -300.88 99.1% humanoid_manager 343.87 3.15 -340.72 99.1% cartpole_camera_hydra_force 492.95 118.45 -374.50 76.0% ``` ## Test plan - `PYTHONPATH="source/isaaclab_tasks:source/isaaclab" /home/zhengyuz/Projects/IsaacLab.wt/wip-feature-position_locomotion/env_isaaclab/bin/python -m pytest source/isaaclab_tasks/test/test_hydra.py source/isaaclab_tasks/test/test_preset_kit_decision.py` - `./isaaclab.sh -f` - `benchmark_hydra_resolve.py --suite broad --iterations 20 --warmup 3` on pre-PR and after-PR sources Co-authored-by: Kelly Guo <kellyg@nvidia.com>
Summary
PresetCfgchoices by walking the active config tree, so nested preset names from inactive sibling branches cannot collide.benchmark_hydra_resolve.pybenchmark that also writes standard Isaac Lab benchmark measurements.Benchmark
Measured
resolve_task_config()only. Values are median wall time over 20 iterations after 3 warmups, in ms.Test plan
PYTHONPATH="source/isaaclab_tasks:source/isaaclab" /home/zhengyuz/Projects/IsaacLab.wt/wip-feature-position_locomotion/env_isaaclab/bin/python -m pytest source/isaaclab_tasks/test/test_hydra.py source/isaaclab_tasks/test/test_preset_kit_decision.py./isaaclab.sh -fbenchmark_hydra_resolve.py --suite broad --iterations 20 --warmup 3on pre-PR and after-PR sources