Skip to content

Add patch and regression test for invalid SPIR-V produced from duplicate-predecessor phi nodes #1306

Draft
colleeneb wants to merge 4 commits into
mainfrom
spirv_patch
Draft

Add patch and regression test for invalid SPIR-V produced from duplicate-predecessor phi nodes #1306
colleeneb wants to merge 4 commits into
mainfrom
spirv_patch

Conversation

@colleeneb

Copy link
Copy Markdown
Contributor

TLDR: This is a patch for the SPIRV translator to avoid producing invalid SPIRV. A C++ reproducer and test is included. Feedback on the patch is needed since it was done with the help of Claude and although it seems correct I am not a SPIRV expert.

Background: The SPIRV translator produces invalid SPV from LLVM IR for the case where LLVM IR has a phi node with multiple entries that have the same predecessor.
For example:

 %r = phi i32 [ 7, %sw ], [ 7, %sw ], [ 9, %def ]    

In LLVM IR, duplicate entries are legal only if in the value in the entries which have the same predecessor are all the same (https://github.com/llvm/llvm-project/blob/ca7933e47d3a3451d81e72ac174dcb5aa28b59d1/llvm/lib/IR/Verifier.cpp#L3406 ). LLVM may generate IR with the entries with the same predecessor and the same value in code with switch/case statements (see the C++ example in tests/compiler/TestSpirvDuplicatePhiHip.hip)

However, if the SPIRV translator encounters code like this, it currently produces:

%retval_0_i = OpPhi %uint %uint_9 %sw_default_i %uint_7 %entry %uint_7 %entry %uint_7 %entry 

This SPIRV is invalid since SPIRV does not allow multiple entries in the OpPhi with the same predecessor (https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#OpPh "There must be exactly one Parent i for each parent block of the current block in the CFG.”), and running spirv-val fails on it (https://github.com/KhronosGroup/SPIRV-Tools/blob/58fe144fdc8847b303be51d4f8fcc9e7da17056e/source/val/validate_cfg.cpp#L88)

> spirv-val p.spv
error: line 59: OpPhi's number of incoming blocks (4) does not match block's predecessor count (4).
  %retval_0_i = OpPhi %uint %uint_9 %sw_default_i %uint_7 %entry %uint_7 %entry %uint_7 %entry

This is also discussed in KhronosGroup/SPIRV-LLVM-Translator#2702 . The PR that closed it (KhronosGroup/SPIRV-LLVM-Translator#2736) is only for the translation of SPIRV to LLVM IR, not the LLVM IR to SPIRV direction. (This PR partially fixes issue #2702 in the part that is responsible for SPIR-V to LLVM IR translation.).

This patch collapses the duplicate entries into one entry. In my understanding, this shouldn’t lose any information as all the duplicate entries are the same.

@colleeneb

Copy link
Copy Markdown
Contributor Author

/run-aurora-ci

@pvelesko

pvelesko commented Jun 21, 2026

Copy link
Copy Markdown
Collaborator

nice. how did you find this?

@colleeneb

Copy link
Copy Markdown
Contributor Author

Ah, it showed up in one of the applications on Aurora -- the JIT was failing, so I ran the validator on it and found it. The code has -O3 + -optnones like the example here so I think triggered something I hadn't seen before.

@pvelesko

Copy link
Copy Markdown
Collaborator

Need to investigate this further. HipVerify should have caught the invalid IR. Did that happen>

@colleeneb

Copy link
Copy Markdown
Contributor Author

I don't think HipVerify caught it, at least I didn't see an error at compile or runtime. But if you run the example code, dump the SPIRV, and then run spir-val on it, you can see that it fails validation:

> hipcc -O3 main.cpp 
> CHIP_BE=level0 CHIP_DUMP_SPIRV=ON ./a.out  
> spirv-val hip-spirv-a.out-cfea77.spv
error: line 57: OpPhi's number of incoming blocks (4) does not match block's predecessor count (4).
  %retval_0_i = OpPhi %uint %uint_9 %sw_default_i %uint_7 %entry %uint_7 %entry %uint_7 %entry

The code in main.cpp:

  #include <hip/hip_runtime.h>
  #include <cstdio>

  __device__ int grad(int sel) {
    int v = 7;
    switch (sel) {
      case 0:
      case 4:
      case 10:
        return v;      // 3 cases share one return -> switch block has 3 edges to the merge
      default:
        v = 9;
    }
    return v;
  }

  __global__ __attribute__((optnone))
  void k(int *out, int sel) { out[0] = grad(sel); }

  // Host reference matching grad().
  static int grad_host(int sel) {
    switch (sel) {
      case 0:
      case 4:
      case 10: return 7;
      default: return 9;
    }
  }

  int main() {
    int *d_out = nullptr;
    hipMalloc(&d_out, sizeof(int));
    k<<<1, 1>>>(d_out, 7);
    hipDeviceSynchronize();
    return 0;
  }

Let me know if I missed something!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants