fix: make Rive data binding thread-safe on iOS & Android (#297)#298
Open
mfazekas wants to merge 20 commits into
Open
fix: make Rive data binding thread-safe on iOS & Android (#297)#298mfazekas wants to merge 20 commits into
mfazekas wants to merge 20 commits into
Conversation
The Rive iOS runtime (RiveViewModel/RiveView, the Obj-C RiveDataBindingViewModelInstance caches, and the C++ ViewModelInstanceRuntime maps) is not thread-safe and must be driven from the main thread, where the state machine advances and property listeners fire. The Nitro layer accessed it from three threads with no synchronization: - main render thread (state machine advance / afterUpdate reconfigure) - JS thread (synchronous property accessors) - Swift cooperative pool (every `Promise.async` block) Concurrent access corrupts those caches, producing the crashes in #297 (NSMutableDictionary mutation, unordered_map free-of-unallocated, artboard instancing). Thread Sanitizer flagged ~25 data races at the data-binding boundary. Route all data-binding access through the main thread: - add `MainThread.run` (runs inline on main, hops via `DispatchQueue.main.sync` otherwise) and `Promise.onMain` (does the work on main and returns an already-resolved Promise, so Nitro's `.then` never races a background resolve) - apply to HybridViewModelInstance, HybridViewModel, every HybridViewModel*Property, and the property listener helper After: TSan reports 0 races under the same stress; data binding still renders.
- example/src/__tests__/issue297.databinding.test.ts: on-device harness test that drives `viewModelAsync` (resolved off-main) concurrently with synchronous property access on the same ViewModelInstance. It stays green; built with Thread Sanitizer it reports the data race (unfixed) or is clean (fixed). Turn CONCURRENCY/ROUNDS up to reproduce the hard crash. - rn-harness.config.mjs: HARNESS_TSAN-gated TSAN_OPTIONS via appLaunchOptions, and HARNESS_METRO_PORT override (defaults unchanged). - Issue297ThreadRace.tsx: manual reproducer screen (same stress in-app). - gitignore the .harness/ build cache.
…no-deprecated in repro)
Renders a live RiveView (state machine on the render thread) and churns the same property from the JS thread through useRiveNumber (writes + addListener via remounts). Under a TSan build it flags the listener race on the unfixed runtime, clean on the fixed one.
…ad (#297) setNumberInputValue/getNumberInputValue, setBooleanInputValue/getBooleanInputValue, triggerInput, and setTextRunValue/getTextRunValue touch the artboard/state machine from the JS thread. Funnel them through MainThread.run, same as the data-binding API.
…modification (#297) onChanged() iterates the listener map on a Dispatchers.Default flow collector while addListener/removeListener mutate it from the JS/native thread, throwing ConcurrentModificationException and crashing the app (the Android analogue of the data-binding race fixed on iOS). Use ConcurrentHashMap and iterate over a toList() snapshot so dispatch is safe even if a callback adds/removes a listener mid-iteration.
…under TSan in one step The harness only drives a prebuilt app, so Thread Sanitizer (a compile flag) must be baked into the app rather than passed to the harness. This script builds the example with -enableThreadSanitizer YES, installs it on the booted simulator, and runs the harness against it.
TSAN_OPTIONS is only read by the TSan runtime and ignored on a non-TSan build, so the HARNESS_TSAN gate was pointless. Always set it; running under TSan is now just 'yarn test:harness:ios' once the app is built with -enableThreadSanitizer YES.
Auto-detect the booted simulator in rn-harness.config.mjs (override with DEVICE_MODEL / IOS_VERSION; falls back to the previous fixed default). Now 'yarn ios' + 'yarn test:harness:ios' work without naming a device. Drop the now- redundant device detection from the TSan script.
Booted-sim auto-detection is ambiguous with more than one booted device (it and 'simctl install booted' can disagree). Collect all booted sims and warn which one is used, pointing at DEVICE_MODEL / IOS_VERSION to pin it.
The 3-step TSan flow (build:ios --extra-params, simctl install booted, test:harness:ios) is simple enough and the harness now auto-targets the booted simulator, so the wrapper script just duplicated it (via a different build path).
92cbb5d to
69526cd
Compare
There was a problem hiding this comment.
Pull request overview
This PR addresses intermittent crashes related to Rive data-binding thread-safety by enforcing single-threaded access patterns across iOS and Android, and adds harness-based regression probes for issue #297.
Changes:
- iOS: serialize Rive data-binding/property access onto the main thread (
MainThread.run,Promise.onMain) across ViewModel/property wrappers and legacy sync RiveView input/text-run APIs. - Android: make property listener dispatch safe under concurrent add/remove by switching to
ConcurrentHashMapand iterating a snapshot. - Tooling/tests: add harness tests + repro screen for issue #297, improve harness simulator selection/TSan options, and bump
react-native-nitro-modulesto0.35.9.
Reviewed changes
Copilot reviewed 24 out of 27 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| yarn.lock | Updates lockfile for react-native-nitro-modules@0.35.9. |
| package.json | Bumps react-native-nitro-modules dependency + peer range. |
| ios/Utils.swift | Adds MainThread.run and Promise.onMain helpers for main-thread serialization. |
| ios/HybridViewModelTriggerProperty.swift | Routes trigger calls through MainThread.run. |
| ios/HybridViewModelStringProperty.swift | Main-thread wraps sync access; async uses Promise.onMain. |
| ios/HybridViewModelNumberProperty.swift | Main-thread wraps sync access; async uses Promise.onMain. |
| ios/HybridViewModelListProperty.swift | Main-thread wraps list access/mutation; async uses Promise.onMain. |
| ios/HybridViewModelInstance.swift | Main-thread wraps instance/property lookups + replace operations. |
| ios/HybridViewModelImageProperty.swift | Main-thread wraps image value setting. |
| ios/HybridViewModelEnumProperty.swift | Main-thread wraps sync access; async uses Promise.onMain. |
| ios/HybridViewModelColorProperty.swift | Main-thread wraps sync access; async uses Promise.onMain. |
| ios/HybridViewModelBooleanProperty.swift | Main-thread wraps sync access; async uses Promise.onMain. |
| ios/HybridViewModelArtboardProperty.swift | Main-thread wraps artboard binding. |
| ios/HybridViewModel.swift | Main-thread wraps ViewModel operations; async uses Promise.onMain. |
| ios/HybridRiveView.swift | Main-thread wraps legacy sync input/text-run APIs. |
| ios/BaseHybridViewModelProperty.swift | Main-thread wraps listener add/remove operations. |
| expo55-example/package.json | Bumps example’s react-native-nitro-modules to 0.35.9. |
| expo-example/package.json | Bumps example’s react-native-nitro-modules to 0.35.9. |
| example/src/reproducers/Issue297ThreadRace.tsx | Adds manual in-app stress reproducer for #297. |
| example/src/tests/issue297.hooks.test.tsx | Adds harness test reproducing the race via public hooks. |
| example/src/tests/issue297.databinding.test.ts | Adds harness test probe for concurrent data-binding access. |
| example/rn-harness.config.mjs | Improves simulator targeting and configures TSan env options. |
| example/package.json | Bumps example’s react-native-nitro-modules to 0.35.9. |
| example/ios/Podfile.lock | Updates iOS pods (NitroModules/RNRive/etc.) reflecting dependency bump. |
| eslint.config.mjs | Ignores harness cache directory. |
| android/src/main/java/com/margelo/nitro/rive/BaseHybridViewModelPropertyImpl.kt | Uses ConcurrentHashMap + snapshot iteration to avoid ConcurrentModificationException. |
| .gitignore | Ignores .harness/ build cache. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
52
to
+56
| func swap(index1: Double, index2: Double) throws -> Bool { | ||
| let idx1 = UInt32(index1) | ||
| let idx2 = UInt32(index2) | ||
| guard idx1 < property.count && idx2 < property.count else { | ||
| return false | ||
| MainThread.run { | ||
| let idx1 = UInt32(index1) | ||
| let idx2 = UInt32(index2) | ||
| guard idx1 < property.count && idx2 < property.count else { |
Comment on lines
17
to
+21
| func getInstanceAt(index: Double) throws -> (any HybridViewModelInstanceSpec)? { | ||
| guard let instance = property.instance(at: Int32(index)) else { return nil } | ||
| return HybridViewModelInstance(viewModelInstance: instance) | ||
| MainThread.run { () -> (any HybridViewModelInstanceSpec)? in | ||
| guard let instance = property.instance(at: Int32(index)) else { return nil } | ||
| return HybridViewModelInstance(viewModelInstance: instance) | ||
| } |
Comment on lines
38
to
41
| func addInstanceAt(instance: any HybridViewModelInstanceSpec, index: Double) throws -> Bool { | ||
| let viewModelInstance = try requireViewModelInstance(instance) | ||
| return property.insert(viewModelInstance, at: Int32(index)) | ||
| return MainThread.run { property.insert(viewModelInstance, at: Int32(index)) } | ||
| } |
Comment on lines
48
to
50
| func removeInstanceAt(index: Double) throws { | ||
| property.remove(at: Int32(index)) | ||
| MainThread.run { property.remove(at: Int32(index)) } | ||
| } |
Comment on lines
17
to
+22
| func createInstanceByIndex(index: Double) throws -> (any HybridViewModelInstanceSpec)? { | ||
| guard index >= 0 else { return nil } | ||
| guard let viewModel = viewModel, | ||
| let vmi = viewModel.createInstance(fromIndex: UInt(index)) else { return nil } | ||
| return HybridViewModelInstance(viewModelInstance: vmi) | ||
| MainThread.run { () -> (any HybridViewModelInstanceSpec)? in | ||
| guard index >= 0 else { return nil } | ||
| guard let viewModel = viewModel, | ||
| let vmi = viewModel.createInstance(fromIndex: UInt(index)) else { return nil } | ||
| return HybridViewModelInstance(viewModelInstance: vmi) |
Comment on lines
134
to
138
| func setNumberInputValue(name: String, value: Double, path: String?) throws { | ||
| try getRiveView().setNumberInputValue(name: name, value: Float(value), path: path) | ||
| try MainThread.run { | ||
| try getRiveView().setNumberInputValue(name: name, value: Float(value), path: path) | ||
| } | ||
| } |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes the occasional crashes in #297. The Rive data-binding objects aren't thread-safe and must be driven from a single thread, but the Nitro layer accessed them from several threads with no synchronization:
RiveDataBindingViewModelInstance/ C++ViewModelInstanceRuntimecaches are main-thread-only, yet they were touched from the JS thread (sync accessors), the Swift cooperative pool (everyPromise.async), and the main render loop. Concurrent access corrupts those caches → the three crash stacks in the issue.BaseHybridViewModelPropertyImpliterated its listener map on aDispatchers.Defaultflow collector whileaddListener/removeListenermutated it from the JS thread →ConcurrentModificationException(this is what crashed thetest-harness-androidCI job).The fix
iOS — route all data-binding access through the main thread:
MainThread.run— runs inline on main, else hops viaDispatchQueue.main.sync.Promise.onMain— dispatches work to the main queue asynchronously (non-blocking) and returns aPromisethat resolves once the work completes on main. ReplacesPromise.async, which ran closures on the Swift cooperative pool off-main.HybridViewModelInstance,HybridViewModel, everyHybridViewModel*Property, the property listener helper, and the deprecated imperativeRiveViewinput/text-run methods (set/getNumberInputValue,set/getBooleanInputValue,triggerInput,set/getTextRunValue).react-native-nitro-modulesto0.35.9, which fixes a Swift-level Promise listener race (fix: FixPromise<void> was destroyed!race condition by synchronizing SwiftPromiseoperations mrousavy/nitro#1338). Full C++ object-bridging coverage lands in fix(core): synchronize C++ Promise<T> state-inspection accessors mrousavy/nitro#1398.Android —
BaseHybridViewModelPropertyImplnow uses aConcurrentHashMapand iterates atoList()snapshot inonChanged, so dispatch is safe even if a callback adds/removes a listener mid-iteration.Verification (Thread Sanitizer)
Built the iOS example with
-enableThreadSanitizer YESand ran a concurrent data-binding stress (viewModelAsync+ synchronous property access on the same instance). TSan reports data races on the unfixed build and none after the fix; under heavy load the unfixed build crashes, the fixed one doesn't.On Android, the harness test hits the
ConcurrentModificationExceptioncrash before the fix and passes after.Tests
react-native-harness tests — they always pass, the sanitizer is the real assertion (unfixed runtime → race reported / crash; fixed → clean):
example/src/__tests__/issue297.databinding.test.ts—viewModelAsync(off-main) concurrent with synchronous property access. RaiseCONCURRENCY/ROUNDSto reproduce the hard crash (what took down the Android CI job).example/src/__tests__/issue297.hooks.test.tsx— the same bug via the publicuse*hooks: a liveRiveViewadvances the state machine on the render thread whileuseRiveNumberchurns the property (writes +addListener) on the JS thread.example/src/reproducers/Issue297ThreadRace.tsxis a manual in-app reproducer.Running the harness tests under Thread Sanitizer (iOS)
TSan is a compile flag, and the harness only drives a prebuilt app (it has no build step), so it can't be a harness option — it's the normal build plus one xcodebuild flag.
yarn iosbuilds and installs the app, then the harness drives it:--no-packagerstopsrun-iosfrom starting Metro (the harness starts its own). The harness auto-targets the booted simulator (seern-harness.config.mjs). TSan reports land in/tmp/tsan_harness.<pid>; grep forWARNING: ThreadSanitizer: data race— present on the unfixed runtime, absent after the fix.Prefer an explicit build + install (what CI does)? Note
build:iosneeds--buildFolderto land inios/build:yarn build:ios --buildFolder build --extra-params "-enableThreadSanitizer YES" xcrun simctl install booted ios/build/Build/Products/Debug-iphonesimulator/RiveExample.app yarn test:harness:ios --testPathPattern issue297Or enable Thread Sanitizer in the Xcode scheme (Run → Diagnostics) and
⌘Ronce to build+install.