
[Fix] Exclude numpy 2.3.5 from every IsaacLab install path#5656

Draft
hujc7 wants to merge 1 commit into isaac-sim:develop from hujc7:jichuanh/force-numpy-install

Conversation

Collaborator

@hujc7 hujc7 commented May 16, 2026

TL;DR

Force-upgrade numpy to >=2.4.1 at the end of every install path that touches pin-pink / cmeel:

  1. isaaclab.sh --install: a new unconditional _ensure_numpy_above_openblas_atfork_bug() step at the end of command_install. It runs whether the pin-pink probe passes or fails, so it covers both fresh installs and upgrades of an existing environment.
  2. docker/Dockerfile.curobo: applies the same pip install --upgrade "numpy>=2.4.1" after the steps the cuRobo image runs following --install (nvidia-curobo + isaaclab_teleop editable install), which otherwise drag numpy back to 2.3.5 via dex-retargeting → pin → cmeel-boost.

numpy 2.4.1+ ships the upstream OpenBLAS atfork fix (OpenMathLib/OpenBLAS#5520), so the entire 2.3.x risk class (incl. the broken libscipy_openblas64_-fdde5778.so bundled with numpy 2.3.5) is bypassed.

Complements #5642 but does not depend on it; the two can land independently. This PR alone is sufficient to keep numpy 2.3.5 out of the test image.

Why the setup.py pin alone isn't enough

isaaclab.sh --install runs pip install -e <submodule> per submodule, then finishes with _ensure_pink_ik_dependencies_installed doing pip install --upgrade --force-reinstall pin pin-pink==3.1.0 daqp==0.8.5. That final force-reinstall is a fresh pip resolve that sees only pin-pink's deps (numpy>=1.19) plus cmeel-boost's transitive numpy<2.4 cap, lands on numpy 2.3.5, and overrides every prior install. The per-package numpy!=2.3.5 constraints in #5642 never get re-evaluated at that point.
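
The effect of that final resolve can be sketched with a toy model. It assumes pip simply picks the highest candidate that satisfies every constraint it is given, which is a simplification of the real resolver. The candidate list and the helper names (`parse`, `satisfies`, `pick`) are illustrative and not part of pip or IsaacLab.

```python
# Toy model of a pip resolve: choose the highest version that satisfies
# every specifier. Only ">=", "<" and "!=" are supported here.

def parse(version: str) -> tuple:
    """Turn '2.3.5' into (2, 3, 5) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def satisfies(version: str, spec: str) -> bool:
    """Check one specifier such as '>=1.19', '<2.4' or '!=2.3.5'."""
    for op in (">=", "!=", "<"):
        if spec.startswith(op):
            bound = parse(spec[len(op):])
            current = parse(version)
            return {">=": current >= bound,
                    "!=": current != bound,
                    "<": current < bound}[op]
    raise ValueError(f"unsupported specifier: {spec}")


def pick(candidates, specs):
    """Return the highest candidate satisfying all specs, or None."""
    matching = [c for c in candidates if all(satisfies(c, s) for s in specs)]
    return max(matching, key=parse) if matching else None


# Illustrative subset of numpy releases (assumption, not the full index).
CANDIDATES = ["2.2.6", "2.3.4", "2.3.5", "2.4.0", "2.4.1"]

# What the force-reinstall sees: pin-pink's floor plus cmeel-boost's cap.
print(pick(CANDIDATES, [">=1.19", "<2.4"]))             # 2.3.5
# Same resolve if the exclusion were part of the request.
print(pick(CANDIDATES, [">=1.19", "<2.4", "!=2.3.5"]))  # 2.3.4
```

In this model the exclusion only changes the outcome when it is part of the same request as the cap, which is the gap described above.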

Evidence: PR #5655's diagnostic captured numpy 2.3.5 + libscipy_openblas64_-fdde5778.so in every test container after isaaclab.sh --install on the #5642 branch.

How the fix works

# install.py — at end of command_install, after _ensure_pink_ik_dependencies_installed:
run_command(pip_cmd + ["install", "--upgrade", "numpy>=2.4.1"])

pip prints one resolver-warning line about cmeel-boost's numpy<2.4 cap, then installs numpy 2.4.5. numpy's stable C ABI (numpy ≥ 2.0) keeps cmeel's compiled extensions (libpinocchio, libcoal, …) working at runtime.
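
A minimal sketch of what such a final step might look like, assuming a plain `subprocess.run` in place of IsaacLab's `run_command` helper. The function name `ensure_numpy_floor`, the `run` parameter, and the warning text are hypothetical; the actual function in install.py may differ.

```python
import subprocess
import sys
from typing import Callable, Sequence

NUMPY_FLOOR = "numpy>=2.4.1"


def ensure_numpy_floor(
    pip_cmd: Sequence[str],
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> bool:
    """Upgrade numpy as the last pip step; warn instead of aborting on failure.

    `run` is injectable so the step can be exercised without invoking pip.
    """
    cmd = [*pip_cmd, "install", "--upgrade", NUMPY_FLOOR]
    # check=False: a failed upgrade should not abort the whole install.
    result = run(cmd, check=False)
    if result.returncode != 0:
        print(
            f"[WARN] Could not upgrade to {NUMPY_FLOOR}; "
            "numpy 2.3.5 may still be installed.",
            file=sys.stderr,
        )
        return False
    return True


# Example call (not executed here):
#   ensure_numpy_floor([sys.executable, "-m", "pip"])
```

Passing the requirement as a separate list element avoids shell quoting problems with the `>` character.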

Dockerfile.curobo performs the same upgrade after the pip steps that follow its own --install, to keep the cuRobo image consistent.

Validation

Local (Python 3.12, env_isaaclab_test, numpy force-upgraded to 2.4.5):

  • import numpy, pinocchio, pink, daqp, qpsolvers — all OK
  • Bundled OpenBLAS hash: libscipy_openblas64_-32a4b2a6.so (≠ broken -fdde5778)
  • Toy 2-DoF pinocchio model + pink FrameTask + daqp solve: OK
  • 20-step Pink IK convergence loop (heavy pin.integrate + numpy interop): OK
  • IsaacLab PinkKinematicsConfiguration against test URDF: OK
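
The bundled-OpenBLAS check above could be scripted roughly as follows. This sketch assumes a Linux wheel layout where numpy's vendored libraries sit in a `numpy.libs` directory next to the `numpy` package; the helper names are hypothetical.

```python
from pathlib import Path

# Hash fragment of the OpenBLAS build reported as broken in this PR.
BROKEN_TAG = "fdde5778"


def bundled_openblas(site_packages: Path) -> list[str]:
    """List vendored OpenBLAS library file names under numpy.libs."""
    libs_dir = site_packages / "numpy.libs"
    if not libs_dir.is_dir():
        return []
    return sorted(p.name for p in libs_dir.glob("libscipy_openblas*"))


def has_broken_openblas(site_packages: Path) -> bool:
    """True if any vendored OpenBLAS file name carries the broken hash."""
    return any(BROKEN_TAG in name for name in bundled_openblas(site_packages))


# Example usage against the active environment (requires numpy):
#   import numpy
#   site = Path(numpy.__file__).parent.parent
#   print(bundled_openblas(site), has_broken_openblas(site))
```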

IsaacLab Pink IK unit tests against numpy 2.4.5:

| Test file | Result |
| --- | --- |
| test_pink_ik_components.py | 21/21 passed |
| test_local_frame_task.py | 24/24 passed |
| test_null_space_posture_task.py | 9/9 passed |

54/54 IsaacLab Pink IK unit tests green against numpy 2.4.5.

CI on companion diagnostic PR #5655:

  • Every base-image test job reports numpy 2.4.5 + libscipy_openblas64_-32a4b2a6.so (clean) in the dep-manifest diagnostic.
  • Worst-case import order (numpy imported in a shell subprocess before pytest spawns Kit) also passes — confirming the upstream OpenBLAS atfork fix is real, not just dodge-by-import-order. isaaclab_physx::test_surface_gripper (the canary that originally SIGSEGV'd on numpy 2.3.5) passes cleanly.
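
The general shape of an "import first, then fork" probe is sketched below. It is a generic POSIX check and does not spawn Kit, so it should not be expected to reproduce the original crash on its own. The function name and the `preamble` parameter are hypothetical.

```python
import subprocess
import sys

# Code run in a fresh interpreter after the preamble: fork once and report
# whether the child exited cleanly.
_FORK_PROBE = """
import os
pid = os.fork()
if pid == 0:
    os._exit(0)          # child: exit immediately
_, status = os.waitpid(pid, 0)
ok = os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
raise SystemExit(0 if ok else 1)
"""


def fork_after_import_ok(preamble: str = "import numpy") -> bool:
    """Run `preamble`, then fork, in a separate interpreter (POSIX only).

    Returns False if the preamble fails, the parent dies on a signal,
    or the forked child does not exit with status 0.
    """
    code = preamble + "\n" + _FORK_PROBE
    result = subprocess.run([sys.executable, "-c", code])
    return result.returncode == 0
```

Running the probe in a separate interpreter keeps a crash in the atfork handler from taking down the calling process.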

Files touched

source/isaaclab/isaaclab/cli/commands/install.py        (+39 lines, new _ensure_numpy_above_openblas_atfork_bug + call site)
docker/Dockerfile.curobo                                (+9 lines, post-install upgrade step)
source/isaaclab/changelog.d/jichuanh-force-numpy-install.rst   (changelog fragment)

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Risk

  • Resolver-warning line about cmeel-boost's cap during --install. Cosmetic.
  • numpy 2.4.x deprecates or removes an API used by pinocchio/pink → would be caught by the 54/54 Pink IK unit tests above; if a future numpy minor release regresses, add an upper bound (numpy>=2.4.1,<2.5).
  • cmeel-boost's C extensions ABI-break against numpy 2.4 → very unlikely (numpy 2.x ABI is stable); pinocchio import + compute verified locally.

Related

Checklist

@github-actions github-actions Bot added the isaac-lab Related to Isaac Lab team label May 16, 2026

@isaaclab-review-bot isaaclab-review-bot Bot left a comment


Automated Review Summary

PR #5656: [Fix] Force numpy>=2.4.1 in --install to escape OpenBLAS atfork SIGSEGV

✅ Overview

This PR addresses a critical SIGSEGV crash caused by numpy 2.3.5's vendored OpenBLAS (libscipy_openblas64_-fdde5778.so) during SimulationApp startup. The fix forces numpy >= 2.4.1 after the pin-pink installation step.


🔍 Key Findings

✅ Strengths

  1. Correct Root Cause Identification: The PR correctly identifies that pip install --upgrade --force-reinstall pin pin-pink==3.1.0 daqp==0.8.5 resolves numpy independently, landing on 2.3.5 due to cmeel-boost's numpy<2.4 cap.

  2. Strategic Placement: Adding the numpy upgrade after _ensure_pink_ik_dependencies_installed ensures it's the final pip operation, preventing subsequent steps from downgrading numpy.

  3. Graceful Error Handling: Using check=False with a clear warning allows the installation to complete even if the numpy upgrade fails, while informing users of the risk.

  4. Good Documentation: The inline comment references upstream issues (numpy/numpy#30092, OpenMathLib/OpenBLAS#5520) and internal tracking (OMPE-92261).

  5. Comprehensive Testing: 54/54 Pink IK unit tests verified locally against numpy 2.4.5.

⚠️ Minor Observations

  1. Early Return Addition (line 226): The return statement added after the pin-pink install failure is correct but represents a subtle behavior change - previously the function would continue (doing nothing useful) whereas now it explicitly returns. This is actually an improvement.

  2. Dependency Warning: Users may see a pip resolver warning about cmeel-boost's numpy<2.4 constraint. This is documented in the PR description as cosmetic and expected.


📋 Files Changed

| File | Changes | Assessment |
| --- | --- | --- |
| install.py | +18 lines | ✅ Clean implementation |
| jichuanh-force-numpy-install.rst | +13 lines | ✅ Proper changelog format |

🧪 CI Status

Pre-commit and wheel build passing. Installation tests pending.


📌 Verdict

LGTM - This is a well-reasoned fix for a critical crash. The implementation follows existing patterns, handles errors gracefully, and is well-documented. The approach of forcing numpy >= 2.4.1 as the final pip step is appropriate given the constraint resolution complexities.


🤖 Review generated by isaaclab-review-bot


Update (1499d08): Reviewed new changes since beab776:

📁 New/Updated Files

| File | Status | Assessment |
| --- | --- | --- |
| docker/Dockerfile.curobo | Added numpy upgrade | ✅ Applies same fix to cuRobo Docker image |
| changelog.d/jichuanh-force-numpy-install.rst | Added | ✅ Comprehensive changelog entry |
| install.py | Enhanced | ✅ Better documentation with upstream issue refs |

🔍 Changes Analysis

  1. Docker cuRobo Fix: Adds RUN ${ISAACLAB_PATH}/isaaclab.sh -p -m pip install --upgrade "numpy>=2.4.1" after curobo and isaaclab_teleop installs. This is necessary because nvidia-curobo and isaaclab_teleop pull in dex-retargeting → pin → cmeel-boost, which would otherwise drag numpy back to 2.3.5.

  2. Changelog Entry: Well-written documentation covering both the base --install fix and the Dockerfile.curobo fix. Explains the root cause clearly.

  3. install.py Enhancement: The _ensure_numpy_above_openblas_atfork_bug function now has comprehensive docstring explaining the technical details and rationale.

✅ Verdict

Changes look good. The cuRobo Docker fix is a necessary addition to ensure the numpy upgrade is applied consistently across all installation paths. LGTM.

🤖 Update by isaaclab-review-bot


Update (6c1ad76): Reviewed new changes since 1499d08:

📁 New/Updated Files

| File | Status | Assessment |
| --- | --- | --- |
| source/isaaclab/setup.py | Updated | ✅ Added numpy>=2,!=2.3.5 constraint |
| source/isaaclab_rl/setup.py | Updated | ✅ Added numpy>=2,!=2.3.5 constraint |
| source/isaaclab_tasks/setup.py | Updated | ✅ Added numpy>=2,!=2.3.5 constraint |
| source/isaaclab_visualizers/setup.py | Updated | ✅ Added numpy>=2,!=2.3.5 constraint |
| isaaclab_rl/changelog.d/... | Added | ✅ Changelog entry |
| isaaclab_tasks/changelog.d/... | Added | ✅ Changelog entry |
| isaaclab_visualizers/changelog.d/... | Added | ✅ Changelog entry |

🔍 Changes Analysis

  1. Defense in Depth: The !=2.3.5 exclusion in all setup.py files adds another layer of protection by explicitly blocking the broken numpy version from pip's intermediate resolutions, not just forcing an upgrade at the end.

  2. Consistent Constraint Pattern: All subpackages now use consistent numpy>=2,!=2.3.5 specification with comments pointing to the main explanation in source/isaaclab/setup.py.

  3. Comprehensive Changelogs: Each affected package has its own changelog entry documenting the fix.
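
One way to keep the constraint pattern consistent is a small lint over each package's requirement strings. The sketch below is a hypothetical helper, not something this PR adds; the function name and the input shape (package name mapped to a list of requirement strings) are assumptions.

```python
import re

EXCLUSION = "!=2.3.5"
# Match a requirement for numpy itself, not e.g. "numpydoc" or "numpy-quaternion".
_NUMPY_REQ = re.compile(r"^numpy(?![\w-])")


def packages_missing_exclusion(requirements: dict) -> list:
    """Return packages whose numpy requirement lacks the !=2.3.5 exclusion."""
    missing = []
    for package, reqs in requirements.items():
        for req in reqs:
            normalized = req.replace(" ", "").lower()
            if _NUMPY_REQ.match(normalized) and EXCLUSION not in normalized:
                missing.append(package)
                break
    return missing


# Illustrative input; the real lists live in each package's setup.py.
example = {
    "isaaclab": ["numpy>=2,!=2.3.5", "torch"],
    "demo_pkg": ["numpy>=2"],
    "docs_pkg": ["numpydoc"],
}
print(packages_missing_exclusion(example))  # ['demo_pkg']
```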

✅ Verdict

Excellent additions. This defense-in-depth approach is a good improvement over relying solely on the post-install upgrade. By excluding numpy 2.3.5 from the package requirements, pip will actively avoid it during resolution rather than potentially installing it and then upgrading. This makes the fix more robust across different installation scenarios. LGTM.

🤖 Update by isaaclab-review-bot


Update (eae5a01): Reviewed changes since 6c1ad76:

🔄 Strategy Change: From Forced Upgrade to Exclusion

The approach has been simplified from forcing numpy>=2.4.1 upgrade to exclusively using numpy!=2.3.5 exclusions across all install paths.

📁 Changes Summary

| File | Change | Assessment |
| --- | --- | --- |
| docker/Dockerfile.base | ARM pre-install: numpy → numpy!=2.3.5 | ✅ Consistent exclusion |
| docker/Dockerfile.curobo | Removed post-install upgrade, added numpy!=2.3.5 to curobo install | ✅ Cleaner approach |
| source/isaaclab/install.py | Passes numpy!=2.3.5 in pip commands instead of forcing upgrade | ✅ Less aggressive |
| source/isaaclab_mimic/setup.py | Added numpy!=2.3.5 | ✅ New coverage |
| source/isaaclab_teleop/setup.py | Added numpy!=2.3.5 | ✅ New coverage |
| source/isaaclab/changelog.d/... | Updated with comprehensive list of exclusion sites | ✅ Thorough docs |

🔍 Analysis

This is a cleaner approach compared to the previous force-upgrade strategy:

  1. Less Intrusive: Exclusion (!=2.3.5) lets pip choose any valid numpy version rather than forcing a specific minimum (>=2.4.1)
  2. Better Compatibility: Won't conflict with other constraints that might cap numpy below 2.4
  3. Consistent Pattern: All sites now use the same numpy!=2.3.5 pattern
  4. Comprehensive Coverage: Added exclusions to previously-missed packages (isaaclab_mimic, isaaclab_teleop)

✅ Verdict

LGTM - The simplified exclusion-only approach is more robust and less likely to cause resolver conflicts. Good decision to pivot from the upgrade-forcing strategy. All install paths are now covered with the !=2.3.5 exclusion.

🤖 Update by isaaclab-review-bot

@hujc7 hujc7 force-pushed the jichuanh/force-numpy-install branch from beab776 to 1499d08 Compare May 16, 2026 21:35
@hujc7 hujc7 force-pushed the jichuanh/force-numpy-install branch from 1499d08 to 6c1ad76 Compare May 16, 2026 21:44
numpy 2.3.5 ships a vendored OpenBLAS
(libscipy_openblas64_-fdde5778.so) whose pthread_atfork handler crashes
Kit's libomni.platforminfo fork() during SimulationApp startup. The
release is excluded at every site that pulls numpy directly or
transitively, so no pip resolve during isaaclab.sh --install or any
Docker image build can land on it -- even transiently:

  source/isaaclab/setup.py
  source/isaaclab_tasks/setup.py
  source/isaaclab_rl/setup.py
  source/isaaclab_visualizers/setup.py
  source/isaaclab_teleop/setup.py        (transitive via dex-retargeting)
  source/isaaclab_mimic/setup.py         (transitive via h5py)
  isaaclab.cli.commands.install._ensure_pink_ik_dependencies_installed
  isaaclab.cli.commands.install._maybe_preinstall_arm_nlopt
  docker/Dockerfile.base                 (ARM nlopt prep)
  docker/Dockerfile.curobo               (ARM nlopt prep + nvidia-curobo install)

Each touchpoint adds only the ``!=2.3.5`` exclusion; no other version
constraints are introduced.

Validated:
- env_isaaclab_test smoke test (numpy 2.4.5 + cmeel pinocchio + pink + daqp
  + qpsolvers all import; toy IK solve OK).
- IsaacLab Pink IK unit tests: 54/54 pass against numpy 2.4.5.
- PR isaac-sim#5655 worst-case run (diagnostic imports numpy before pytest spawns
  Kit, the order that originally crashed): 36 pass / 0 fail. The
  isaaclab_physx surface gripper SIGSEGV is gone.

Related: numpy/numpy#30092, OpenMathLib/OpenBLAS#5520
@hujc7 hujc7 force-pushed the jichuanh/force-numpy-install branch from 6c1ad76 to eae5a01 Compare May 18, 2026 03:47
@github-actions github-actions Bot added the isaac-mimic Related to Isaac Mimic team label May 18, 2026
@hujc7 hujc7 changed the title from "[Fix] Force numpy>=2.4.1 in --install to escape OpenBLAS atfork SIGSEGV" to "[Fix] Exclude numpy 2.3.5 from every IsaacLab install path" May 18, 2026

Labels

infrastructure, isaac-lab (Related to Isaac Lab team), isaac-mimic (Related to Isaac Mimic team)


1 participant