Release GIL when calling mkl_lapack::orgqr #2850

Merged
antonwolfy merged 4 commits into master from add-gil-release-to-orgqr
Apr 21, 2026
Contributor

@antonwolfy antonwolfy commented Apr 13, 2026

This PR fixes a deadlock in the QR decomposition tests by releasing the GIL before the `mkl_lapack::orgqr` call.

The hang occurred because:

  1. dpnp/dpctl submits a `host_task` to manage Python object lifetimes.
  2. That `host_task` must acquire the GIL to decrement reference counts.
  3. If the main thread holds the GIL while it blocks waiting on the queue submission, the `host_task` can never acquire it → deadlock.
  4. `orgqr` is currently implemented in oneMKL as a GPU-to-host reverse offload, so the call itself blocks on a `host_task`:
```cpp
exec_q.submit([&](sycl::handler& cgh) {
    cgh.depends_on(depends);
    cgh.host_task([=]() { orgqr_host(...); });
}).wait();
```
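
The hang described in the list above can be illustrated without SYCL or Python. The following is only a simulation under stated assumptions: `fake_gil`, `host_task_gets_gil`, and `submit_and_wait` are hypothetical names, a `std::timed_mutex` stands in for the GIL, and a timeout is used so that the sketch reports the would-be deadlock instead of actually hanging.

```cpp
#include <chrono>
#include <future>
#include <mutex>

// Hypothetical stand-in for the Python GIL. A timed mutex is used only so
// the demo can detect the blocked state instead of hanging forever.
static std::timed_mutex fake_gil;

// Plays the role of the host_task: it needs the "GIL" to finish its work.
// Returns true if the lock could be taken within the timeout.
bool host_task_gets_gil(std::chrono::milliseconds timeout) {
    if (!fake_gil.try_lock_for(timeout)) {
        return false;  // in the real scenario this thread would block forever
    }
    fake_gil.unlock();
    return true;
}

// Plays the role of the main thread calling into the extension.
// It starts with the "GIL" held (as code called from Python does) and then
// waits for the host task, mirroring `exec_q.submit(...).wait()`.
bool submit_and_wait(bool hold_gil_while_waiting) {
    std::unique_lock<std::timed_mutex> gil(fake_gil);
    if (!hold_gil_while_waiting) {
        gil.unlock();  // release before blocking on the submitted work
    }
    std::future<bool> done = std::async(std::launch::async,
                                        host_task_gets_gil,
                                        std::chrono::milliseconds(100));
    return done.get();  // analogous to .wait()
}
```

Calling `submit_and_wait(true)` models the original behaviour: the worker times out because the waiting thread still owns the lock. Calling `submit_and_wait(false)` models the fixed behaviour, where the worker can take the lock and finish.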

As a solution, this PR releases the GIL before calling the oneMKL operations. The GIL is automatically reacquired when the function returns (RAII).
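
The RAII pattern mentioned above can be sketched with the standard library alone. This is an illustration, not the code from the PR: `sim_gil`, `ScopedGilRelease`, `call_blocking_op`, and `run_demo` are hypothetical names, and a plain `std::mutex` stands in for the GIL. pybind11 offers `pybind11::gil_scoped_release` as a guard of the same shape; the text here does not name the guard the PR actually uses.

```cpp
#include <mutex>
#include <thread>

// Hypothetical stand-in for the Python GIL.
static std::mutex sim_gil;

// RAII guard: releases the "GIL" on construction and reacquires it on
// destruction, so every exit path from the enclosing scope restores it.
// Precondition: the constructing thread currently holds sim_gil.
class ScopedGilRelease {
public:
    ScopedGilRelease() { sim_gil.unlock(); }
    ~ScopedGilRelease() { sim_gil.lock(); }
    ScopedGilRelease(const ScopedGilRelease&) = delete;
    ScopedGilRelease& operator=(const ScopedGilRelease&) = delete;
};

// Hypothetical wrapper standing in for the extension function that calls
// the blocking oneMKL routine. The caller must hold sim_gil.
int call_blocking_op() {
    int refcount = 1;  // pretend Python-owned state the host task touches
    {
        ScopedGilRelease release;  // "GIL" is dropped for this scope
        std::thread host_task([&refcount] {
            // The host task can take the "GIL" because the caller let go.
            std::lock_guard<std::mutex> gil(sim_gil);
            --refcount;
        });
        host_task.join();  // analogous to submit(...).wait()
    }  // guard destructor reacquires the "GIL" here
    return refcount;
}

// Simulates Python calling into the extension with the "GIL" held.
int run_demo() {
    std::lock_guard<std::mutex> gil(sim_gil);
    return call_blocking_op();
}
```

In this sketch `run_demo()` completes because the worker is able to lock `sim_gil` while the caller waits. Removing the `ScopedGilRelease` line would make `host_task.join()` block forever, which is the hang the PR describes.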

  • Have you provided a meaningful PR description?
  • Have you added a test, reproducer or referred to an issue with a reproducer?
  • Have you tested your changes locally for CPU and GPU devices?
  • Have you made sure that new changes do not introduce compiler warnings?
  • Have you checked performance impact of proposed changes?
  • Have you added documentation for your changes, if necessary?
  • Have you added your changes to the changelog?

@antonwolfy antonwolfy added this to the 0.20.0 release milestone Apr 13, 2026
@antonwolfy antonwolfy self-assigned this Apr 13, 2026
Contributor

github-actions bot commented Apr 13, 2026

View rendered docs @ https://intelpython.github.io/dpnp/index.html

Contributor

github-actions bot commented Apr 13, 2026

Array API standard conformance tests for dpnp=0.20.0dev6=py313h509198e_13 ran successfully.
Passed: 1356
Failed: 4
Skipped: 16

Collaborator

coveralls commented Apr 13, 2026

Coverage Status

coverage: 78.429%, remained the same (add-gil-release-to-orgqr into master)

@antonwolfy antonwolfy marked this pull request as ready for review April 13, 2026 20:37
Comment thread CHANGELOG.md Outdated
Comment thread dpnp/backend/extensions/lapack/orgqr.cpp
Co-authored-by: vlad-perevezentsev <vladislav.perevezentsev@intel.com>
@vlad-perevezentsev vlad-perevezentsev self-requested a review April 14, 2026 14:00
Contributor

@vlad-perevezentsev vlad-perevezentsev left a comment

LGTM
Thank you @antonwolfy

@antonwolfy antonwolfy merged commit 3ab1216 into master Apr 21, 2026
74 of 75 checks passed
@antonwolfy antonwolfy deleted the add-gil-release-to-orgqr branch April 21, 2026 09:13
github-actions bot added a commit that referenced this pull request Apr 21, 2026
3 participants