ParparVM: index native-symbol and class lookups in the dead-code cull (O(N^2) -> ~O(N))#5236
shai-almog wants to merge 3 commits into
…e cull
The dead-code optimizer ("unused Method cull") asks, for every Java method
and class, "is this referenced by native code?" by substring-scanning the
entire native source corpus per query (isMethodUsedByNative ->
for each native source: s.contains(symbol)). That is O(methods x native_bytes),
so apps that pull in a lot of native source (many cn1libs: camera, scanners,
maps, ...) pay a large, growing cull cost.
Native symbols are CN1 mangled identifiers, and a query X matches iff X is a
substring of some maximal identifier token in the native text. So we build,
once per run, a suffix automaton over the DISTINCT native identifier tokens
(NativeSymbolIndex) and answer each query in O(|symbol|). Tokenizing into the
distinct set dedups repeated symbols across files, bounding the structure.
Semantics are identical to the old String.contains scan because the query
strings are themselves delimiter-free identifiers.
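The approach described above can be sketched roughly as follows. This is an illustrative sketch only, not the PR's actual NativeSymbolIndex: the class name NativeSymbolIndexSketch, the identifier character set ([A-Za-z0-9_$]) and the trick of joining tokens with a '\n' separator before building one ordinary suffix automaton are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; not the NativeSymbolIndex shipped in the PR. */
final class NativeSymbolIndexSketch {
    /** One automaton state: transitions, suffix link, longest-substring length. */
    private static final class State {
        final Map<Character, Integer> next = new HashMap<>();
        int link = -1;
        int len = 0;
    }

    private final List<State> states = new ArrayList<>();
    private int last = 0;

    NativeSymbolIndexSketch(Iterable<String> nativeSources) {
        states.add(new State()); // initial state
        // Collect the DISTINCT maximal identifier tokens (assumed charset).
        Set<String> tokens = new LinkedHashSet<>();
        for (String src : nativeSources) {
            int start = -1;
            for (int i = 0; i <= src.length(); i++) {
                boolean ident = i < src.length() && isIdentChar(src.charAt(i));
                if (ident && start < 0) start = i;
                if (!ident && start >= 0) {
                    tokens.add(src.substring(start, i));
                    start = -1;
                }
            }
        }
        // Feed tokens separated by '\n'. A delimiter-free query can never
        // cross a separator, so it matches iff it lies inside one token.
        for (String t : tokens) {
            for (int i = 0; i < t.length(); i++) extend(t.charAt(i));
            extend('\n');
        }
    }

    private static boolean isIdentChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_' || c == '$';
    }

    /** Standard online suffix-automaton extension by one character. */
    private void extend(char c) {
        int cur = states.size();
        State curState = new State();
        curState.len = states.get(last).len + 1;
        states.add(curState);
        int p = last;
        while (p != -1 && !states.get(p).next.containsKey(c)) {
            states.get(p).next.put(c, cur);
            p = states.get(p).link;
        }
        if (p == -1) {
            curState.link = 0;
        } else {
            int q = states.get(p).next.get(c);
            if (states.get(p).len + 1 == states.get(q).len) {
                curState.link = q;
            } else {
                // Split q: the clone keeps q's transitions with a shorter length.
                int clone = states.size();
                State cl = new State();
                cl.len = states.get(p).len + 1;
                cl.next.putAll(states.get(q).next);
                cl.link = states.get(q).link;
                states.add(cl);
                while (p != -1 && states.get(p).next.getOrDefault(c, -1) == q) {
                    states.get(p).next.put(c, clone);
                    p = states.get(p).link;
                }
                states.get(q).link = clone;
                curState.link = clone;
            }
        }
        last = cur;
    }

    /** True iff symbol is a substring of some indexed token; O(|symbol|). */
    boolean contains(String symbol) {
        int s = 0;
        for (int i = 0; i < symbol.length(); i++) {
            Integer n = states.get(s).next.get(symbol.charAt(i));
            if (n == null) return false;
            s = n;
        }
        return true;
    }
}
```

Each query walks at most |symbol| transitions regardless of how much native source was indexed, and duplicated .m files add no new tokens, so in this sketch the automaton stops growing once every distinct identifier has been seen.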
Measured on a real failing iOS build (5476 classes held fixed, native corpus
scaled by duplicating .m files):
native files   cull (before)   cull (after)   methods removed
62             15.2s           14.3s          6968 / 6968
462            24.2s           15.5s          6968 / 6968
1362           48.2s           16.3s          6968 / 6968
The cull removes the exact same 6968 methods at every size (correctness gate),
and the native-scan cost (~32s at 1362 files) collapses to a sub-second
one-time index build. The residual flat ~15s is the separate class-graph
lookup cost (findClass/getClassObject linear scans), addressed in a follow-up.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The cull's other O(N^2): getClassObject, getClassByName and ByteCodeClass.findClass each did an O(N) linear scan of the whole class list, and they're called per dependency per class during markDependencies / updateAllDependencies, which run up to ~5 times per build. That's the dominant cost once the native-scan (previous commit) is removed.

Replace the three scans with a shared name -> class HashMap, rebuilt lazily when `classes` changes (tracked by the (reference, size) pair, which is sufficient because `classes` is only ever reassigned, grown via add(), or cleared -- never mutated to a same-ref/same-size/different-content state). First-match semantics are preserved (first-wins on duplicate class names).

SpotBugs: the two lazily-built static index caches in Parser (this one and the native-symbol index) are excluded for LI_LAZY_INIT_STATIC / LI_LAZY_INIT_UPDATE_STATIC -- the translator runs single-threaded across one translation run, the same rationale as the existing exclusions in that file.

Measured on the same harness (5476 classes fixed, native corpus scaled), with the native-symbol index from the previous commit in place:

native files   cull (before both)   cull (after both)   methods removed
62             15s                  7s                  6968 / 6968
462            24s                  7s                  6968 / 6968
1362           48s                  7s                  6968 / 6968

Cull time is now flat in both class count and native-file count, removing the exact same 6968 methods.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
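A minimal sketch of the lazily rebuilt name index described in this commit message, under stated assumptions: ClassNameIndexSketch and Clazz are hypothetical stand-ins (the real code lives in Parser and ByteCodeClass and is not shown on this page), and only the (reference, size) guard and the first-wins rule are taken from the text.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a name -> class index with a (reference, size) guard. */
final class ClassNameIndexSketch {
    /** Stand-in for ByteCodeClass; only the name matters here. */
    static final class Clazz {
        final String name;
        Clazz(String name) { this.name = name; }
    }

    /** The shared class list; reassigned or grown via add() during a run. */
    static List<Clazz> classes = new ArrayList<>();

    // Lazily built cache plus the state of `classes` it was built from.
    private static Map<String, Clazz> index;
    private static List<Clazz> indexedRef;
    private static int indexedSize = -1;

    static Clazz getClassByName(String name) {
        // Rebuild when the list was reassigned (reference changed) or grew /
        // was cleared (size changed); otherwise reuse the cached map.
        if (index == null || indexedRef != classes || indexedSize != classes.size()) {
            Map<String, Clazz> rebuilt = new HashMap<>();
            for (Clazz c : classes) {
                rebuilt.putIfAbsent(c.name, c); // first match wins on duplicates
            }
            index = rebuilt;
            indexedRef = classes;
            indexedSize = classes.size();
        }
        return index.get(name); // null when no class has that name
    }
}
```

`putIfAbsent` keeps the earliest entry for a duplicated name, which is what a front-to-back linear scan returning the first hit would also produce.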
cleanup() clears `classes` in place (same List reference), so the name index's (reference, size) staleness guard cannot detect a subsequent same-size refill when multiple translation runs share one JVM -- e.g. ParserTest, which calls Parser.cleanup() in @BeforeEach and then parses a single class per test, so every test after the first saw (same ref, size 1) and got the previous test's cached index. The single-run translator never hits this.

Null the index in cleanup() so each run rebuilds from its own classes.

Verified: ParserTest 10/10 green locally.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
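The failure mode above can be reproduced in a few lines. This is a hedged illustration, not the Parser code: StaleIndexDemo, isKnown, cleanupBuggy and cleanupFixed are hypothetical names, and a Set of names stands in for the name -> class map.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical demo of the same-ref / same-size stale-cache problem. */
final class StaleIndexDemo {
    static final List<String> classes = new ArrayList<>();

    private static Set<String> index;
    private static List<String> indexedRef;
    private static int indexedSize;

    static boolean isKnown(String name) {
        // Same (reference, size) guard as described in the commit message.
        if (index == null || indexedRef != classes || indexedSize != classes.size()) {
            index = new HashSet<>(classes);
            indexedRef = classes;
            indexedSize = classes.size();
        }
        return index.contains(name);
    }

    /** Clears in place only: the guard still sees the same reference. */
    static void cleanupBuggy() {
        classes.clear();
    }

    /** Also drops the cache, so the next lookup rebuilds it. */
    static void cleanupFixed() {
        classes.clear();
        index = null;
    }

    public static void main(String[] args) {
        classes.add("A");
        System.out.println(isKnown("A")); // index built from [A]

        cleanupBuggy();
        classes.add("B"); // same list reference, size is 1 again
        System.out.println(isKnown("B")); // stale index still holds only A

        cleanupFixed();
        classes.add("C");
        System.out.println(isKnown("C")); // index rebuilt from [C]
    }
}
```

The middle lookup is the ParserTest situation: one class per test, the list cleared in place between tests, so the guard sees an unchanged (reference, size) pair and serves the previous test's entries.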
Problem
The ParparVM dead-code optimizer ("unused Method cull") had two quadratic "scan the same thing repeatedly" costs that made large / native-heavy iOS apps slow to translate (and on the build server, slow enough to hit the translator timeout):
- `isMethodUsedByNative` -> for each native source: `s.contains(symbol)`: O(methods × native_bytes). Apps pulling in many cn1libs with native code (camera, scanners, maps, ...) pay heavily.
- `getClassObject` / `getClassByName` / `ByteCodeClass.findClass` each linear-scanned the whole class list, called per dependency per class during `markDependencies` / `updateAllDependencies` (run ~5× per build): O(N²).

Found while investigating a real iOS cloud build hitting the translator timeout. (Heap was ruled out: the cost is deterministic/algorithmic, not GC.)
Fix
Two indexes, each built once and then queried in time independent of the corpus size (`O(|symbol|)` per native-symbol query, ~O(1) per class lookup):
- `NativeSymbolIndex`: a suffix automaton over the distinct native identifier tokens. Native symbols are CN1 mangled identifiers, and a query matches iff it's a substring of some token, so this is semantically identical to the old `String.contains`. Tokenizing into the distinct set dedups across files, bounding the structure. `O(methods × native_bytes)` -> `O(native_bytes)` once + `O(|symbol|)` per query.
- `HashMap` (name -> class): replaces the three linear scans. Rebuilt lazily when `classes` changes (tracked by the `(reference, size)` pair, which is sufficient within a single run because `classes` is only ever reassigned / grown via `add()` / cleared). `cleanup()` also nulls the index, so multiple translation runs sharing one JVM each rebuild it from their own classes. First-match semantics preserved.

Both run single-threaded across one translation run; the lazily-built static caches are excluded for `LI_LAZY_INIT_*` in `spotbugs-exclude.xml`, matching the file's existing convention.

Measurements
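Real failing iOS build's classes (5476, held fixed); native corpus scaled by duplicating `.m` files:

native files   cull (before both)   cull (after both)   methods removed
62             15s                  7s                  6968 / 6968
462            24s                  7s                  6968 / 6968
1362           48s                  7s                  6968 / 6968

🤖 Generated with Claude Code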