feat(godbolt): add Compiler Explorer integration for rust-cuda kernels #394
g-ampo wants to merge 4 commits into
Conversation
LegNeato left a comment
Thanks for working on this! I am not super familiar with Compiler Explorer, so apologies if my review questions are dumb.
Thinking about this some more: instead of shell, perhaps it would be better to write this tool in Rust (a normal binary that can compile with any version of Rust). Then we can add tests and such to our CI to make sure we don't inadvertently break the integration. As it is, I am worried that we land something and don't know it is broken until we go to update the version in Compiler Explorer. Thoughts?

Yes, that would make sense. The main thing the Rust binary would still do is shell out to `cargo build` under the hood, so the breakage surface is similar to shell for the glue parts. But if it imports `cuda_std` or hooks into the build logic directly, that gives compile-time detection when APIs change, which shell can't do. :) Would be (very) happy to rewrite it in Rust.

Up to you!
/// Rustflags applied on every build. These mirror the flags `cuda_builder`
/// passes when invoking rustc directly, so PTX produced through the wrapper
/// matches what a normal rust-cuda build would emit.
const STATIC_RUSTFLAGS: [&str; 6] = [
Can the wrapper just use `cuda_builder` directly, so these can't get out of sync? It looks like you can set `DEP_RUSTC_CODEGEN_NVVM_OUT_DIR` or put it in the proper place expected by
rust-cuda/crates/cuda_builder/src/lib.rs (line 494 in 103a8d5)
There is some drift risk, but unfortunately a direct `cuda_builder` dependency doesn't work here:
- `cuda_builder` -> `nvvm` -> `cust_raw`, and `cust_raw` (`links = "cuda"`) requires a CUDA SDK at build time. The wrapper would then no longer build without a full toolchain, which is exactly what the Rust rewrite was meant to avoid.
- `CudaBuilder::build()` writes `cargo:` lines to stdout (it's a build.rs API), while the wrapper needs stdout to be PTX only for CE. The flag logic in `invoke_rustc` is private anyway, and `DEP_RUSTC_CODEGEN_NVVM_OUT_DIR` only changes where the backend `.so` is found, so it addresses neither blocker.

The real fix here would be to extract the flag-building from `invoke_rustc` into a small dependency-free crate that both `cuda_builder` and the wrapper call, so there is one source of truth. If that's fine with you, I would do this in this PR.
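To illustrate the stdout blocker mentioned above: if the wrapper did drive a build.rs-style API, it would have to capture the build output and strip the `cargo:` directive lines so that only PTX reaches stdout. A minimal sketch of that filtering (a hypothetical helper for illustration, not code from this PR):

```rust
/// Split captured build output into `cargo:`/`cargo::` directive lines and
/// everything else, so only the non-directive output (e.g. the PTX text)
/// gets forwarded to stdout. Hypothetical helper, not part of this PR.
fn strip_cargo_directives(captured: &str) -> (Vec<String>, Vec<String>) {
    let mut directives = Vec::new();
    let mut rest = Vec::new();
    for line in captured.lines() {
        // Both the old `cargo:` and the newer `cargo::` directive syntax.
        if line.starts_with("cargo:") {
            directives.push(line.to_string());
        } else {
            rest.push(line.to_string());
        }
    }
    (directives, rest)
}
```

This keeps the directives around (they may still matter, e.g. link search paths) while guaranteeing a clean stdout for CE.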
Summary
Add Compiler Explorer integration so users can compile rust-cuda kernels to PTX in the browser.
Closes #177, partially addresses #353.
Building on the discussion between @LegNeato and @Forsworns, this takes the "wrapper script that replicates `cuda_builder`'s rustc invocation" route that @LegNeato suggested (to avoid the need for cargo on the CE side).
Changes
- contrib/godbolt/rust-cuda-wrapper.sh - wrapper that creates a temp Cargo project with the same rustflags as `cuda_builder::invoke_rustc()`, builds to PTX, and extracts the output from cargo's JSON messages
- contrib/godbolt/install.sh - installs the pinned nightly, builds `rustc_codegen_nvvm`, and lays out a self-contained prefix
- contrib/godbolt/rust-cuda.defaults.properties - CE language/compiler config
- contrib/godbolt/test-kernel.rs - sample kernels for smoke-testing
- contrib/godbolt/README.md - local testing and upstream submission steps

Testing
- `cargo check` passes
- `cargo clippy --workspace` passes
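For context on the "extracts output from cargo JSON" step: with `cargo build --message-format=json`, each `compiler-artifact` message carries a `filenames` array, and the wrapper picks out the `.ptx` entry. A rough stdlib-only sketch of that extraction (the actual implementation here is the shell script; a Rust rewrite would more likely parse the JSON properly with `serde_json` or `cargo_metadata` rather than scan strings):

```rust
/// Pull the first `.ptx` artifact path out of one line of
/// `cargo build --message-format=json` output. String-scanning sketch only,
/// for illustration; a real implementation should parse the JSON.
fn find_ptx_path(json_line: &str) -> Option<String> {
    // Artifact paths appear as quoted strings inside the "filenames" array.
    let after_filenames = json_line.split("\"filenames\":").nth(1)?;
    for piece in after_filenames.split('"') {
        if piece.ends_with(".ptx") {
            return Some(piece.to_string());
        }
    }
    None
}
```

Lines without a `filenames` key (e.g. `build-finished` messages) simply yield `None`.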