[Mono.Android] CoreCLR: file/line in stack traces with FastDev#11702
[Mono.Android] CoreCLR: file/line in stack traces with FastDev#11702jonathanpeppers wants to merge 8 commits into
Conversation
When CoreCLR runs an Android app with FastDev, app assemblies live on disk in `files/.__override__/<arch>/` with their portable PDBs alongside. But we resolved them through `host_runtime_contract.external_assembly_probe`, which reads the .dll into a heap buffer and hands the bytes to the runtime. The CLR never opens the file itself, so `Assembly.Location` is empty and `StackTraceSymbols` has no anchor for finding the sibling .pdb. The result: `Console.WriteLine(Environment.StackTrace)` and `Exception.StackTrace` print method names only, no `in File.cs:line N` info. Fix: in Debug startup, append `TRUSTED_PLATFORM_ASSEMBLIES` to the properties passed to `coreclr_initialize`, listing the full on-disk path of every assembly from the typemap that is also present in the override directory. The CLR then mmap's those files itself via `PEImage::OpenImage`, which records `m_path`, populates `Assembly.Location`, and lets `StackTraceSymbols.TryOpenAssociatedPortablePdb` find the matching .pdb via simple sibling-file lookup. No DebugType change required, no opt-in property, no impact on Release. The TPA list is bounded by `type_map_unique_assemblies` (the build-time set of assemblies contributing typemap entries), so we never pass arbitrary files. BCL assemblies still flow through the existing FastDev/AssemblyStore probe path unchanged. Test: adds `StackTraceContainsLineNumbers` to `InstallAndRunTests` (CoreCLR, Debug, FastDev) which runs an app emitting `Environment.StackTrace` and asserts logcat contains a frame like `at ...MainActivity.OnCreate ... in ...MainActivity.cs:line N`. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Improves CoreCLR Debug + FastDev developer experience by enabling portable PDB lookup for runtime-rendered managed stack traces (so Environment.StackTrace / Exception.StackTrace include in File.cs:line N) when assemblies are loaded from the FastDev override directory.
Changes:
- Appends a
TRUSTED_PLATFORM_ASSEMBLIES(TPA) list during CoreCLR initialization in Debug+FastDev so CoreCLR opens override assemblies from disk and populatesAssembly.Location. - Adds
FastDevAssemblies::build_tpa_list()to construct a bounded TPA list from typemap-known assemblies present in.__override__/<abi>/. - Adds a device integration test asserting file/line info appears in stack traces under CoreCLR Debug FastDev.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/MSBuildDeviceIntegration/Tests/InstallAndRunTests.cs | Adds a CoreCLR Debug FastDev device test that validates stack traces contain File.cs:line N. |
| src/native/clr/include/host/fastdev-assemblies.hh | Declares FastDevAssemblies::build_tpa_list() (Debug) with a Release stub. |
| src/native/clr/host/host.cc | Extends CoreCLR init properties (Debug) to include TPA when FastDev override assemblies are present. |
| src/native/clr/host/fastdev-assemblies.cc | Implements build_tpa_list() by enumerating typemap-known assemblies and including those found on disk in the override directory. |
- Bail out cleanly if dirfd() fails - Assert #STACKTRACE-END# marker to catch truncated output Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
/review |
|
✅ Android PR Reviewer completed successfully! |
There was a problem hiding this comment.
🤖 Android PR Reviewer — ⚠️ Needs Changes
The production change is correct and nicely scoped. A few things I verified that look good:
build_tpa_listis compiled only inDEBUG_BUILD(src/native/clr/host/CMakeLists.txt), so thetype_map*symbols it dereferences are always present — no Release link/ODR hazard.- Augmentation is gated behind
if constexpr (Constants::is_debug_build), andTRUSTED_PLATFORM_ASSEMBLIESis a reserved runtime property (Microsoft.Android.Sdk.RuntimeConfig.targets), so it's filtered out of the generated property list → no duplicate key passed tocoreclr_initialize. - The
staticstorage for the TPA string and the augmentedconst char**arrays correctly outlivescoreclr_initialize; lifetime reasoning is sound. - TPA is bounded by the build-time typemap and
name_lengthexcludes the trailing NUL, so<name> + ".dll"is right. - Both earlier review notes — asserting
#STACKTRACE-END#and checkingdirfd()failure — are already addressed in the current diff. 👍
My one blocking concern is device-test determinism (the test reads logcat.log after a capture that stops at Displayed, which races the async stdout→logcat flush of the trace), plus a minor regex-specificity suggestion. Both are inline.
| Severity | Count |
|---|---|
| ❌ error | 0 |
| 1 | |
| 💡 suggestion | 1 |
CI: the Azure DevOps build (dotnet-android) is still in progress at review time — not yet green — so the PR isn't mergeable regardless of the above.
👍 Clear root-cause writeup linking Assembly.Location → StackTraceSymbols portable-PDB sibling lookup, and good instinct to bound the TPA list by the typemap rather than trusting arbitrary files in the override directory.
Generated by Android PR Reviewer for issue #11702 · 1.1K AIC · ⌖ 45.9 AIC · ⊞ 37.8K
Comment /review to run again
- Use MonitorAdbLogcat to wait for #STACKTRACE-END# marker instead of WaitForActivityToStart, which races the async stdout->logcat flush and could finish capturing before the trace lands. - Drop RegexOptions.Singleline so the OnCreate frame and file:line must appear on the same line, ensuring the assertion is specific to the OnCreate frame. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
/review |
|
✅ Android PR Reviewer completed successfully! |
There was a problem hiding this comment.
🤖 Code Review — CoreCLR: file/line in stack traces with FastDev
Nice, well-scoped change. The design is sound and security-conscious: build_tpa_list derives the TPA set from the trusted build-time typemap (type_map_unique_assemblies) intersected with what's actually present in the per-ABI .__override__ dir, rather than blindly enumerating the override directory — so there's no path-traversal / arbitrary-file surface. Things I verified and liked:
- Correctly gated to Debug.
fastdev-assemblies.ccisDEBUG_BUILD-only (CMake) with inline non-DEBUG stubs in the header, andhost.ccwraps the new prop-array logic inif constexpr (Constants::is_debug_build). No ODR/link surprises. - No duplicate runtime property.
TRUSTED_PLATFORM_ASSEMBLIESis listed in_RuntimeConfigReservedProperties, so appending it once inhost.ccwon't collide with a property already passed tocoreclr_initialize. - Resource-clean.
build_tpa_listclosedirs on every return path; the function-localstaticprop storage inhost.ccis intentional (it must outlivecoreclr_initialize) and thread-safe to initialize. - Small cleanup. Tightened a C-style cast to
static_cast<int>(...)along the way. - Good regression test.
StackTraceContainsLineNumbersforces FastDev (EmbedAssembliesIntoApk=false) and asserts a real...MainActivity.cs:line <N>frame from logcat.
Two minor 💡 suggestions posted inline (no blockers):
- Document that the feature no-ops when the native typemap is empty (
managed/trimmabletypemap), since it depends ontype_map_unique_assemblies. - Use a raw string literal for the injected source, to match the sibling tests in the same file.
Verdict: No blocking issues from me. I'm holding off on a ✅ LGTM only because the internal Azure DevOps PR builds were still in progress at review time (no failures observed) — and this is a device test that needs CI to actually exercise it. Please confirm those go green before merging.
— Issues: 0 ❌ · 0
Generated by Android PR Reviewer for issue #11702 · 2.4K AIC · ⌖ 47.2 AIC · ⊞ 37.8K
Comment /review to run again
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
| // NOTE: The TPA list is sourced from `type_map_unique_assemblies`, which is | ||
| // only populated when `_AndroidTypeMapImplementation=llvm-ir` (the Debug | ||
| // default). With `managed` or `trimmable` typemaps the native typemap is | ||
| // empty, so no TPA paths are added and stack frames won't carry file/line | ||
| // info even under FastDev. | ||
| uint64_t expected_count = type_map.unique_assemblies_count; |
There was a problem hiding this comment.
Why are we even using type_map_unique_assemblies as the source of all the assemblies? don't we have a better way of listing the assemblies? Alternatively, should we change how we generate the assemblies list so that it works also for the trimmable type map?
There was a problem hiding this comment.
Good call. Switched to enumerating *.dll in the override directory directly (commit 69f51db), which makes this work for managed and trimmable typemaps too — not just llvm-ir.
No additional attack surface: the override dir is the app's private storage, only writable by adb push from the developer's build, and the existing clr_external_assembly_probe already loads anything CoreCLR requests from this same directory unconditionally.
There was a problem hiding this comment.
I was also looking if we could do DebugType=embedded by default on Android, but the SDK sets it so early we don't have a good way to change it from a workload:
We could override portable -> embedded always, but there might be some cases Android class libraries still want pdbs.
Drops the dependency on type_map_unique_assemblies (only populated for _AndroidTypeMapImplementation=llvm-ir) and enumerates *.dll in the override directory instead. The feature now works for managed and trimmable typemaps too. The override dir is the app's private storage, only writable via adb push from the developer's build, and the existing clr_external_assembly_probe already loads anything CoreCLR asks for from this same directory unconditionally - so TPA-ing every *.dll adds no attack surface. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
CoreCLR's BindByTpaList in assemblybindercommon.cpp gives the external assembly probe precedence over TPA. When our clr_external_assembly_probe returns true with the FastDev assembly bytes, CoreCLR constructs the PEImage's m_path from just the bare simple name + .dll (since the app is not a bundle), so Assembly.Location ends up as 'Foo.dll' with no directory. StackTraceSymbols.TryOpenAssociatedPortablePdb then probes a non-existent 'Foo.pdb' and stack frames render without file/line info. Fix: when build_tpa_list successfully passes the override directory's DLLs to coreclr_initialize via TRUSTED_PLATFORM_ASSEMBLIES, set a flag and short-circuit the FastDev probe so CoreCLR falls through to the TPA path. The TPA path calls PEImage::OpenImage(fullPath, ...) and sets Assembly.Location to the full disk path, which makes sibling PDB discovery work. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Passing TRUSTED_PLATFORM_ASSEMBLIES to coreclr_initialize changes the CLR binder mode such that System.Private.CoreLib.dll is resolved via the TPA / external probe path instead of CoreCLR's built-in bootstrap. On partial FastDev deployments (e.g. MonoAndroidExportReferencedAppStarts, which only syncs ~29 user assemblies and has no assembly store and no CoreLib in .__override__/), the binder cannot find CoreLib and the process aborts with 'Failed to initialize CoreCLR. Error code: 80070002'. Gate TPA on the presence of System.Private.CoreLib.dll in the override directory. Full-deploy FastDev scenarios (such as the StackTrace file/line test) continue to get TPA and the resulting Assembly.Location fix; partial deployments retain the pre-PR behavior. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
CoreCLR loads System.Private.CoreLib.dll via the external assembly probe during runtime startup (before the regular TPA-aware binder is online). When tpa_in_use was set we returned nullptr for every probe call, so the CoreLib bootstrap could not find CoreLib and coreclr_initialize aborted with 0x80070002 even on full FastDev deployments where CoreLib was sitting right there in .__override__/<arch>/. Special-case 'System.Private.CoreLib.dll' so the probe returns its bytes even with TPA active. CoreLib has no user code to symbolicate, so its bare-filename Assembly.Location is irrelevant for the file/line work. All other assemblies still yield to TPA, get their full disk paths, and let StackTraceSymbols find sibling .pdb files. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Why
On CoreCLR with FastDev, runtime-rendered managed stack traces (
Environment.StackTrace,Exception.StackTrace) print method names only, even though portable PDBs are deployed alongside the assemblies infiles/.__override__/<arch>/. Noin File.cs:line N.Root cause: we resolve FastDev assemblies through
host_runtime_contract.external_assembly_probe, which reads the.dllinto a heap buffer and hands the bytes to the runtime. The CLR never opens the file itself, soAssembly.Locationis empty andStackTraceSymbolshas no anchor for finding the sibling.pdb.dotnet/macios does not have this problem because it lists assemblies in
TRUSTED_PLATFORM_ASSEMBLIESwith full paths, so the CLR opens them from disk andAssembly.Locationis populated.Approach
In Debug startup, append
TRUSTED_PLATFORM_ASSEMBLIESto the properties passed tocoreclr_initialize, listing the full on-disk path of every assembly from the build-time typemap (type_map_unique_assemblies) that is also present in the override directory. The CLR then mmap's those files itself viaPEImage::OpenImage, which recordsm_path, populatesAssembly.Location, and letsStackTraceSymbols.TryOpenAssociatedPortablePdbfind the matching.pdbvia simple sibling-file lookup.DebugTypechange, no opt-in property, no impact on Release builds.external_assembly_probe/AssemblyStorepath unchanged.const char**arrays uses function-localstaticto satisfy CoreCLR's requirement that the pointers outlivecoreclr_initialize.Test
Adds
StackTraceContainsLineNumberstoInstallAndRunTests(CoreCLR, Debug, FastDev): runs an app emittingEnvironment.StackTraceand asserts logcat contains a frame matchingat ...MainActivity.OnCreate... in ...MainActivity.cs:line N.Related to dotnet/runtime#124087 — this is the dotnet/android-side workaround that does not require changes to the CLR's
StackFrameHelpercode path.Checklist