Skip to content

[Mono.Android] CoreCLR: file/line in stack traces with FastDev#11702

Open
jonathanpeppers wants to merge 8 commits into
mainfrom
jonathanpeppers/upgraded-succotash
Open

[Mono.Android] CoreCLR: file/line in stack traces with FastDev#11702
jonathanpeppers wants to merge 8 commits into
mainfrom
jonathanpeppers/upgraded-succotash

Conversation

@jonathanpeppers

Copy link
Copy Markdown
Member

Why

On CoreCLR with FastDev, runtime-rendered managed stack traces (Environment.StackTrace, Exception.StackTrace) print method names only, even though portable PDBs are deployed alongside the assemblies in files/.__override__/<arch>/. No in File.cs:line N.

Root cause: we resolve FastDev assemblies through host_runtime_contract.external_assembly_probe, which reads the .dll into a heap buffer and hands the bytes to the runtime. The CLR never opens the file itself, so Assembly.Location is empty and StackTraceSymbols has no anchor for finding the sibling .pdb.

dotnet/macios does not have this problem because it lists assemblies in TRUSTED_PLATFORM_ASSEMBLIES with full paths, so the CLR opens them from disk and Assembly.Location is populated.

Approach

In Debug startup, append TRUSTED_PLATFORM_ASSEMBLIES to the properties passed to coreclr_initialize, listing the full on-disk path of every assembly from the build-time typemap (type_map_unique_assemblies) that is also present in the override directory. The CLR then mmap's those files itself via PEImage::OpenImage, which records m_path, populates Assembly.Location, and lets StackTraceSymbols.TryOpenAssociatedPortablePdb find the matching .pdb via simple sibling-file lookup.

  • No DebugType change, no opt-in property, no impact on Release builds.
  • TPA list is bounded by the build-time typemap, so we never pass arbitrary files from the override directory.
  • BCL/framework assemblies (not in the typemap, never pushed via FastDev) keep flowing through the existing external_assembly_probe / AssemblyStore path unchanged.
  • Storage for the TPA string and the augmented const char** arrays uses function-local static to satisfy CoreCLR's requirement that the pointers outlive coreclr_initialize.

Test

Adds StackTraceContainsLineNumbers to InstallAndRunTests (CoreCLR, Debug, FastDev): runs an app emitting Environment.StackTrace and asserts logcat contains a frame matching at ...MainActivity.OnCreate... in ...MainActivity.cs:line N.

Related to dotnet/runtime#124087 — this is the dotnet/android-side workaround that does not require changes to the CLR's StackFrameHelper code path.

Checklist

  • Useful description of why the change is necessary.
  • Links to issues fixed
  • Unit tests

When CoreCLR runs an Android app with FastDev, app assemblies live on disk
in `files/.__override__/<arch>/` with their portable PDBs alongside. But
we resolved them through `host_runtime_contract.external_assembly_probe`,
which reads the .dll into a heap buffer and hands the bytes to the runtime.
The CLR never opens the file itself, so `Assembly.Location` is empty and
`StackTraceSymbols` has no anchor for finding the sibling .pdb. The result:
`Console.WriteLine(Environment.StackTrace)` and `Exception.StackTrace`
print method names only, no `in File.cs:line N` info.

Fix: in Debug startup, append `TRUSTED_PLATFORM_ASSEMBLIES` to the
properties passed to `coreclr_initialize`, listing the full on-disk path
of every assembly from the typemap that is also present in the override
directory. The CLR then mmap's those files itself via `PEImage::OpenImage`,
which records `m_path`, populates `Assembly.Location`, and lets
`StackTraceSymbols.TryOpenAssociatedPortablePdb` find the matching .pdb
via simple sibling-file lookup. No DebugType change required, no opt-in
property, no impact on Release.

The TPA list is bounded by `type_map_unique_assemblies` (the build-time
set of assemblies contributing typemap entries), so we never pass arbitrary
files. BCL assemblies still flow through the existing FastDev/AssemblyStore
probe path unchanged.

Test: adds `StackTraceContainsLineNumbers` to `InstallAndRunTests`
(CoreCLR, Debug, FastDev) which runs an app emitting `Environment.StackTrace`
and asserts logcat contains a frame like
`at ...MainActivity.OnCreate ... in ...MainActivity.cs:line N`.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings June 18, 2026 20:09

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Improves CoreCLR Debug + FastDev developer experience by enabling portable PDB lookup for runtime-rendered managed stack traces (so Environment.StackTrace / Exception.StackTrace include in File.cs:line N) when assemblies are loaded from the FastDev override directory.

Changes:

  • Appends a TRUSTED_PLATFORM_ASSEMBLIES (TPA) list during CoreCLR initialization in Debug+FastDev so CoreCLR opens override assemblies from disk and populates Assembly.Location.
  • Adds FastDevAssemblies::build_tpa_list() to construct a bounded TPA list from typemap-known assemblies present in .__override__/<abi>/.
  • Adds a device integration test asserting file/line info appears in stack traces under CoreCLR Debug FastDev.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
tests/MSBuildDeviceIntegration/Tests/InstallAndRunTests.cs Adds a CoreCLR Debug FastDev device test that validates stack traces contain File.cs:line N.
src/native/clr/include/host/fastdev-assemblies.hh Declares FastDevAssemblies::build_tpa_list() (Debug) with a Release stub.
src/native/clr/host/host.cc Extends CoreCLR init properties (Debug) to include TPA when FastDev override assemblies are present.
src/native/clr/host/fastdev-assemblies.cc Implements build_tpa_list() by enumerating typemap-known assemblies and including those found on disk in the override directory.

Comment thread tests/MSBuildDeviceIntegration/Tests/InstallAndRunTests.cs
Comment thread src/native/clr/host/fastdev-assemblies.cc
- Bail out cleanly if dirfd() fails
- Assert #STACKTRACE-END# marker to catch truncated output

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@jonathanpeppers

Copy link
Copy Markdown
Member Author

/review

@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown

Android PR Reviewer completed successfully!

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Android PR Reviewer — ⚠️ Needs Changes

The production change is correct and nicely scoped. A few things I verified that look good:

  • build_tpa_list is compiled only in DEBUG_BUILD (src/native/clr/host/CMakeLists.txt), so the type_map* symbols it dereferences are always present — no Release link/ODR hazard.
  • Augmentation is gated behind if constexpr (Constants::is_debug_build), and TRUSTED_PLATFORM_ASSEMBLIES is a reserved runtime property (Microsoft.Android.Sdk.RuntimeConfig.targets), so it's filtered out of the generated property list → no duplicate key passed to coreclr_initialize.
  • The static storage for the TPA string and the augmented const char** arrays correctly outlives coreclr_initialize; lifetime reasoning is sound.
  • TPA is bounded by the build-time typemap and name_length excludes the trailing NUL, so <name> + ".dll" is right.
  • Both earlier review notes — asserting #STACKTRACE-END# and checking dirfd() failure — are already addressed in the current diff. 👍

My one blocking concern is device-test determinism (the test reads logcat.log after a capture that stops at Displayed, which races the async stdout→logcat flush of the trace), plus a minor regex-specificity suggestion. Both are inline.

Severity Count
❌ error 0
⚠️ warning 1
💡 suggestion 1

CI: the Azure DevOps build (dotnet-android) is still in progress at review time — not yet green — so the PR isn't mergeable regardless of the above.

👍 Clear root-cause writeup linking Assembly.LocationStackTraceSymbols portable-PDB sibling lookup, and good instinct to bound the TPA list by the typemap rather than trusting arbitrary files in the override directory.

Generated by Android PR Reviewer for issue #11702 · 1.1K AIC · ⌖ 45.9 AIC · ⊞ 37.8K
Comment /review to run again

Comment thread tests/MSBuildDeviceIntegration/Tests/InstallAndRunTests.cs
Comment thread tests/MSBuildDeviceIntegration/Tests/InstallAndRunTests.cs Outdated
- Use MonitorAdbLogcat to wait for #STACKTRACE-END# marker instead of
  WaitForActivityToStart, which races the async stdout->logcat flush
  and could finish capturing before the trace lands.
- Drop RegexOptions.Singleline so the OnCreate frame and file:line
  must appear on the same line, ensuring the assertion is specific to
  the OnCreate frame.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@jonathanpeppers

Copy link
Copy Markdown
Member Author

/review

@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown

Android PR Reviewer completed successfully!

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 Code Review — CoreCLR: file/line in stack traces with FastDev

Nice, well-scoped change. The design is sound and security-conscious: build_tpa_list derives the TPA set from the trusted build-time typemap (type_map_unique_assemblies) intersected with what's actually present in the per-ABI .__override__ dir, rather than blindly enumerating the override directory — so there's no path-traversal / arbitrary-file surface. Things I verified and liked:

  • Correctly gated to Debug. fastdev-assemblies.cc is DEBUG_BUILD-only (CMake) with inline non-DEBUG stubs in the header, and host.cc wraps the new prop-array logic in if constexpr (Constants::is_debug_build). No ODR/link surprises.
  • No duplicate runtime property. TRUSTED_PLATFORM_ASSEMBLIES is listed in _RuntimeConfigReservedProperties, so appending it once in host.cc won't collide with a property already passed to coreclr_initialize.
  • Resource-clean. build_tpa_list closedirs on every return path; the function-local static prop storage in host.cc is intentional (it must outlive coreclr_initialize) and thread-safe to initialize.
  • Small cleanup. Tightened a C-style cast to static_cast<int>(...) along the way.
  • Good regression test. StackTraceContainsLineNumbers forces FastDev (EmbedAssembliesIntoApk=false) and asserts a real ...MainActivity.cs:line <N> frame from logcat.

Two minor 💡 suggestions posted inline (no blockers):

  1. Document that the feature no-ops when the native typemap is empty (managed/trimmable typemap), since it depends on type_map_unique_assemblies.
  2. Use a raw string literal for the injected source, to match the sibling tests in the same file.

Verdict: No blocking issues from me. I'm holding off on a ✅ LGTM only because the internal Azure DevOps PR builds were still in progress at review time (no failures observed) — and this is a device test that needs CI to actually exercise it. Please confirm those go green before merging.

— Issues: 0 ❌ · 0 ⚠️ · 2 💡

Generated by Android PR Reviewer for issue #11702 · 2.4K AIC · ⌖ 47.2 AIC · ⊞ 37.8K
Comment /review to run again

Comment thread src/native/clr/host/fastdev-assemblies.cc Outdated
Comment thread tests/MSBuildDeviceIntegration/Tests/InstallAndRunTests.cs Outdated
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment on lines +139 to +144
// NOTE: The TPA list is sourced from `type_map_unique_assemblies`, which is
// only populated when `_AndroidTypeMapImplementation=llvm-ir` (the Debug
// default). With `managed` or `trimmable` typemaps the native typemap is
// empty, so no TPA paths are added and stack frames won't carry file/line
// info even under FastDev.
uint64_t expected_count = type_map.unique_assemblies_count;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we even using type_map_unique_assemblies as the source of all the assemblies? don't we have a better way of listing the assemblies? Alternatively, should we change how we generate the assemblies list so that it works also for the trimmable type map?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. Switched to enumerating *.dll in the override directory directly (commit 69f51db), which makes this work for managed and trimmable typemaps too — not just llvm-ir.

No additional attack surface: the override dir is the app's private storage, only writable by adb push from the developer's build, and the existing clr_external_assembly_probe already loads anything CoreCLR requests from this same directory unconditionally.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was also looking if we could do DebugType=embedded by default on Android, but the SDK sets it so early we don't have a good way to change it from a workload:

We could override portable -> embedded always, but there might be some cases Android class libraries still want pdbs.

jonathanpeppers and others added 4 commits June 24, 2026 11:02
Drops the dependency on type_map_unique_assemblies (only populated for
_AndroidTypeMapImplementation=llvm-ir) and enumerates *.dll in the
override directory instead. The feature now works for managed and
trimmable typemaps too.

The override dir is the app's private storage, only writable via
adb push from the developer's build, and the existing
clr_external_assembly_probe already loads anything CoreCLR asks for
from this same directory unconditionally - so TPA-ing every *.dll
adds no attack surface.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
CoreCLR's BindByTpaList in assemblybindercommon.cpp gives the external
assembly probe precedence over TPA. When our clr_external_assembly_probe
returns true with the FastDev assembly bytes, CoreCLR constructs the
PEImage's m_path from just the bare simple name + .dll (since the app
is not a bundle), so Assembly.Location ends up as 'Foo.dll' with no
directory. StackTraceSymbols.TryOpenAssociatedPortablePdb then probes a
non-existent 'Foo.pdb' and stack frames render without file/line info.

Fix: when build_tpa_list successfully passes the override directory's
DLLs to coreclr_initialize via TRUSTED_PLATFORM_ASSEMBLIES, set a flag
and short-circuit the FastDev probe so CoreCLR falls through to the TPA
path. The TPA path calls PEImage::OpenImage(fullPath, ...) and sets
Assembly.Location to the full disk path, which makes sibling PDB
discovery work.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Passing TRUSTED_PLATFORM_ASSEMBLIES to coreclr_initialize changes the
CLR binder mode such that System.Private.CoreLib.dll is resolved via
the TPA / external probe path instead of CoreCLR's built-in bootstrap.
On partial FastDev deployments (e.g. MonoAndroidExportReferencedAppStarts,
which only syncs ~29 user assemblies and has no assembly store and no
CoreLib in .__override__/), the binder cannot find CoreLib and the
process aborts with 'Failed to initialize CoreCLR. Error code: 80070002'.

Gate TPA on the presence of System.Private.CoreLib.dll in the override
directory. Full-deploy FastDev scenarios (such as the StackTrace
file/line test) continue to get TPA and the resulting Assembly.Location
fix; partial deployments retain the pre-PR behavior.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
CoreCLR loads System.Private.CoreLib.dll via the external assembly probe
during runtime startup (before the regular TPA-aware binder is online).
When tpa_in_use was set we returned nullptr for every probe call, so the
CoreLib bootstrap could not find CoreLib and coreclr_initialize aborted
with 0x80070002 even on full FastDev deployments where CoreLib was sitting
right there in .__override__/<arch>/.

Special-case 'System.Private.CoreLib.dll' so the probe returns its bytes
even with TPA active. CoreLib has no user code to symbolicate, so its
bare-filename Assembly.Location is irrelevant for the file/line work.
All other assemblies still yield to TPA, get their full disk paths, and
let StackTraceSymbols find sibling .pdb files.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants