[Fix] Exclude numpy 2.3.5 from every IsaacLab install path#5656
Conversation
There was a problem hiding this comment.
Automated Review Summary
PR #5656: [Fix] Force numpy>=2.4.1 in --install to escape OpenBLAS atfork SIGSEGV
✅ Overview
This PR addresses a critical SIGSEGV crash caused by numpy 2.3.5's vendored OpenBLAS (libscipy_openblas64_-fdde5778.so) during SimulationApp startup. The fix forces numpy >= 2.4.1 after the pin-pink installation step.
🔍 Key Findings
✅ Strengths
-
Correct Root Cause Identification: The PR correctly identifies that
pip install --upgrade --force-reinstall pin pin-pink==3.1.0 daqp==0.8.5resolves numpy independently, landing on 2.3.5 due to cmeel-boost'snumpy<2.4cap. -
Strategic Placement: Adding the numpy upgrade after
_ensure_pink_ik_dependencies_installedensures it's the final pip operation, preventing subsequent steps from downgrading numpy. -
Graceful Error Handling: Using
check=Falsewith a clear warning allows the installation to complete even if the numpy upgrade fails, while informing users of the risk. -
Good Documentation: The inline comment references upstream issues (
numpy/numpy#30092,OpenMathLib/OpenBLAS#5520) and internal tracking (OMPE-92261). -
Comprehensive Testing: 54/54 Pink IK unit tests verified locally against numpy 2.4.5.
⚠️ Minor Observations
-
Early Return Addition (line 226): The
returnstatement added after the pin-pink install failure is correct but represents a subtle behavior change - previously the function would continue (doing nothing useful) whereas now it explicitly returns. This is actually an improvement. -
Dependency Warning: Users may see a pip resolver warning about cmeel-boost's
numpy<2.4constraint. This is documented in the PR description as cosmetic and expected.
📋 Files Changed
| File | Changes | Assessment |
|---|---|---|
install.py |
+18 lines | ✅ Clean implementation |
jichuanh-force-numpy-install.rst |
+13 lines | ✅ Proper changelog format |
🧪 CI Status
Pre-commit and wheel build passing. Installation tests pending.
📌 Verdict
LGTM - This is a well-reasoned fix for a critical crash. The implementation follows existing patterns, handles errors gracefully, and is well-documented. The approach of forcing numpy >= 2.4.1 as the final pip step is appropriate given the constraint resolution complexities.
🤖 Review generated by isaaclab-review-bot
Update (1499d08): Reviewed new changes since beab776:
📁 New/Updated Files
| File | Status | Assessment |
|---|---|---|
docker/Dockerfile.curobo |
Added numpy upgrade | ✅ Applies same fix to cuRobo Docker image |
changelog.d/jichuanh-force-numpy-install.rst |
Added | ✅ Comprehensive changelog entry |
install.py |
Enhanced | ✅ Better documentation with upstream issue refs |
🔍 Changes Analysis
-
Docker cuRobo Fix: Adds
RUN ${ISAACLAB_PATH}/isaaclab.sh -p -m pip install --upgrade "numpy>=2.4.1"after curobo and isaaclab_teleop installs. This is necessary becausenvidia-curoboandisaaclab_teleoppull index-retargeting→pin→ cmeel-boost, which would otherwise drag numpy back to 2.3.5. -
Changelog Entry: Well-written documentation covering both the base
--installfix and the Dockerfile.curobo fix. Explains the root cause clearly. -
install.py Enhancement: The
_ensure_numpy_above_openblas_atfork_bugfunction now has comprehensive docstring explaining the technical details and rationale.
✅ Verdict
Changes look good. The cuRobo Docker fix is a necessary addition to ensure the numpy upgrade is applied consistently across all installation paths. LGTM.
🤖 Update by isaaclab-review-bot
Update (6c1ad76): Reviewed new changes since 1499d08:
📁 New/Updated Files
| File | Status | Assessment |
|---|---|---|
source/isaaclab/setup.py |
Updated | ✅ Added numpy>=2,!=2.3.5 constraint |
source/isaaclab_rl/setup.py |
Updated | ✅ Added numpy>=2,!=2.3.5 constraint |
source/isaaclab_tasks/setup.py |
Updated | ✅ Added numpy>=2,!=2.3.5 constraint |
source/isaaclab_visualizers/setup.py |
Updated | ✅ Added numpy>=2,!=2.3.5 constraint |
isaaclab_rl/changelog.d/... |
Added | ✅ Changelog entry |
isaaclab_tasks/changelog.d/... |
Added | ✅ Changelog entry |
isaaclab_visualizers/changelog.d/... |
Added | ✅ Changelog entry |
🔍 Changes Analysis
-
Defense in Depth: The
!=2.3.5exclusion in all setup.py files adds another layer of protection by explicitly blocking the broken numpy version from pip's intermediate resolutions, not just forcing an upgrade at the end. -
Consistent Constraint Pattern: All subpackages now use consistent
numpy>=2,!=2.3.5specification with comments pointing to the main explanation insource/isaaclab/setup.py. -
Comprehensive Changelogs: Each affected package has its own changelog entry documenting the fix.
✅ Verdict
Excellent additions. This defense-in-depth approach is a good improvement over relying solely on the post-install upgrade. By excluding numpy 2.3.5 from the package requirements, pip will actively avoid it during resolution rather than potentially installing it and then upgrading. This makes the fix more robust across different installation scenarios. LGTM.
🤖 Update by isaaclab-review-bot
Update (eae5a01): Reviewed changes since 6c1ad76:
🔄 Strategy Change: From Forced Upgrade to Exclusion
The approach has been simplified from forcing numpy>=2.4.1 upgrade to exclusively using numpy!=2.3.5 exclusions across all install paths.
📁 Changes Summary
| File | Change | Assessment |
|---|---|---|
docker/Dockerfile.base |
ARM pre-install: numpy → numpy!=2.3.5 |
✅ Consistent exclusion |
docker/Dockerfile.curobo |
Removed post-install upgrade, added numpy!=2.3.5 to curobo install |
✅ Cleaner approach |
source/isaaclab/install.py |
Passes numpy!=2.3.5 in pip commands instead of forcing upgrade |
✅ Less aggressive |
source/isaaclab_mimic/setup.py |
Added numpy!=2.3.5 |
✅ New coverage |
source/isaaclab_teleop/setup.py |
Added numpy!=2.3.5 |
✅ New coverage |
source/isaaclab/changelog.d/... |
Updated with comprehensive list of exclusion sites | ✅ Thorough docs |
🔍 Analysis
This is a cleaner approach compared to the previous force-upgrade strategy:
- Less Intrusive: Exclusion (
!=2.3.5) lets pip choose any valid numpy version rather than forcing a specific minimum (>=2.4.1) - Better Compatibility: Won't conflict with other constraints that might cap numpy below 2.4
- Consistent Pattern: All sites now use the same
numpy!=2.3.5pattern - Comprehensive Coverage: Added exclusions to previously-missed packages (
isaaclab_mimic,isaaclab_teleop)
✅ Verdict
LGTM - The simplified exclusion-only approach is more robust and less likely to cause resolver conflicts. Good decision to pivot from the upgrade-forcing strategy. All install paths are now covered with the !=2.3.5 exclusion.
🤖 Update by isaaclab-review-bot
beab776 to
1499d08
Compare
1499d08 to
6c1ad76
Compare
numpy 2.3.5 ships a vendored OpenBLAS (libscipy_openblas64_-fdde5778.so) whose pthread_atfork handler crashes Kit's libomni.platforminfo fork() during SimulationApp startup. The release is excluded at every site that pulls numpy directly or transitively, so no pip resolve during isaaclab.sh --install or any Docker image build can land on it -- even transiently: source/isaaclab/setup.py source/isaaclab_tasks/setup.py source/isaaclab_rl/setup.py source/isaaclab_visualizers/setup.py source/isaaclab_teleop/setup.py (transitive via dex-retargeting) source/isaaclab_mimic/setup.py (transitive via h5py) isaaclab.cli.commands.install._ensure_pink_ik_dependencies_installed isaaclab.cli.commands.install._maybe_preinstall_arm_nlopt docker/Dockerfile.base (ARM nlopt prep) docker/Dockerfile.curobo (ARM nlopt prep + nvidia-curobo install) Each touchpoint adds only the ``!=2.3.5`` exclusion; no other version constraints are introduced. Validated: - env_isaaclab_test smoke test (numpy 2.4.5 + cmeel pinocchio + pink + daqp + qpsolvers all import; toy IK solve OK). - IsaacLab Pink IK unit tests: 54/54 pass against numpy 2.4.5. - PR isaac-sim#5655 worst-case run (diagnostic imports numpy before pytest spawns Kit, the order that originally crashed): 36 pass / 0 fail. The isaaclab_physx surface gripper SIGSEGV is gone. Related: numpy/numpy#30092, OpenMathLib/OpenBLAS#5520
6c1ad76 to
eae5a01
Compare
TL;DR
Force-upgrade numpy to
>=2.4.1at the end of every install path that touchespin-pink/ cmeel:isaaclab.sh --install— new unconditional_ensure_numpy_above_openblas_atfork_bug()step at end ofcommand_install. Runs regardless of whether the pin-pink probe passes/fails (covers both fresh-install and upgrade-on-existing-env paths).docker/Dockerfile.curobo— applies the samepip install --upgrade numpy>=2.4.1after the cuRobo image's post---installsteps (nvidia-curobo+isaaclab_teleopeditable install), which otherwise drag numpy back to 2.3.5 viadex-retargeting→pin→ cmeel-boost.numpy 2.4.1+ ships the upstream OpenBLAS atfork fix (OpenMathLib/OpenBLAS#5520), so the entire 2.3.x risk class (incl. the broken
libscipy_openblas64_-fdde5778.sobundled with numpy 2.3.5) is bypassed.Complements #5642 but does not depend on it; the two can land independently. This PR alone is sufficient to keep numpy 2.3.5 out of the test image.
Why the setup.py pin alone isn't enough
isaaclab.sh --installrunspip install -e <submodule>per submodule, then finishes with_ensure_pink_ik_dependencies_installeddoingpip install --upgrade --force-reinstall pin pin-pink==3.1.0 daqp==0.8.5. That final force-reinstall is a fresh pip resolve that sees only pin-pink's deps (numpy>=1.19) plus cmeel-boost's transitivenumpy<2.4cap, lands on numpy 2.3.5, and overrides every prior install. The per-packagenumpy!=2.3.5constraints in #5642 never get re-evaluated at that point.Evidence: PR #5655's diagnostic captured numpy 2.3.5 +
libscipy_openblas64_-fdde5778.soin every test container afterisaaclab.sh --installon the #5642 branch.How the fix works
pip prints one resolver-warning line about cmeel-boost's
numpy<2.4cap, then installs numpy 2.4.5. numpy's stable C ABI (numpy ≥ 2.0) keeps cmeel's compiled extensions (libpinocchio, libcoal, …) working at runtime.Dockerfile.curobodoes the same upgrade after its own post---installpip steps to keep the cuRobo image consistent.Validation
Local (Python 3.12,
env_isaaclab_test, numpy force-upgraded to 2.4.5):import numpy, pinocchio, pink, daqp, qpsolvers— all OKlibscipy_openblas64_-32a4b2a6.so(≠ broken-fdde5778)pin.integrate+ numpy interop): OKPinkKinematicsConfigurationagainst test URDF: OKIsaacLab Pink IK unit tests against numpy 2.4.5:
test_pink_ik_components.pytest_local_frame_task.pytest_null_space_posture_task.py54/54 IsaacLab Pink IK unit tests green against numpy 2.4.5.
CI on companion diagnostic PR #5655:
numpy 2.4.5+libscipy_openblas64_-32a4b2a6.so(clean) in the dep-manifest diagnostic.isaaclab_physx::test_surface_gripper(the canary that originally SIGSEGV'd on numpy 2.3.5) passes cleanly.Files touched
Type of change
Risk
--install. Cosmetic.numpy>=2.4.1,<2.5) if a future minor numpy bump regresses.Related
Checklist
pre-commitchecks with./isaaclab.sh --format