iOS: AudioDeviceModuleObserver's DISPATCH_TIME_FOREVER waits can deadlock the JS thread (total UI freeze) under default config #89

@chen-rn

Follow-up to livekit/client-sdk-react-native#389, where @jankosecki traced this deadlock and offered to file it here. We hit the same bug in production with the default configuration, so we're filing it with our data. All file/line references below are against the 144.1.0 tag.

Environment

  • @livekit/react-native-webrtc 144.1.0, @livekit/react-native 2.11.0
  • React Native 0.85.3 (New Architecture), Expo SDK 56
  • iOS, physical device, production build
  • Default audio setup: stock registerGlobals(), no autoConfigureAudioSession changes, no manual audio session configuration

Mechanism

Three pieces interact:

  1. ios/RCTWebRTC/AudioDeviceModuleObserver.m blocks the native thread driving RTCAudioDeviceModule delegate callbacks with dispatch_semaphore_wait(..., DISPATCH_TIME_FOREVER) while waiting for a JS reply — six sites: L61 (didCreateEngine), L82 (willEnableEngine), L107 (willStartEngine), L128 (didStopEngine), L149 (didDisableEngine), L161 (willReleaseEngine). The reply only arrives if the JS thread is free to run the event listener and call the corresponding audioDeviceModuleResolve* method (WebRTCModule+RTCAudioDeviceModule.m:232-258).
  2. The observer is installed unconditionally in WebRTCModule.m:98-99, so every app pays the blocking-wait cost even if it never registers JS-side audio engine hooks.
  3. Several peer-connection methods are blocking-synchronous JS→native calls that dispatch_sync onto the serial workerQueue — notably peerConnectionAddTransceiver (WebRTCModule+RTCPeerConnection.m:534, dispatch_sync at L539), which holds the JS thread until the queue block returns.

If an AVAudioEngine stop/restart (e.g. a mic publish flipping the engine from playout-only to duplex) is in flight when JS enters peerConnectionAddTransceiver, three things hold at once: the workerQueue block stalls inside libwebrtc on state held by the restart, the restart's delegate callback waits forever for a JS reply, and JS is parked inside the synchronous bridge call. The result is a circular wait (JS → workerQueue → libwebrtc/ADM state → delegate semaphore → JS), and nothing ever proceeds.
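To make the cycle concrete, here is a minimal wait-for-graph sketch in plain C. It is an illustration only: the party names, the `waits_on` arrays, and `has_cycle_from` are hypothetical and do not exist in the library. Each entry records which party a given party is blocked on, and the walk reports whether following those edges leads back to the start.

```c
#include <assert.h>

/* Hypothetical labels for the four parties in the circular wait. */
enum { JS_THREAD, WORKER_QUEUE, ADM_STATE, DELEGATE_SEMAPHORE, PARTY_COUNT };

/* waits_on[i] is the party that i is blocked on, or -1 if i can make progress. */

/* Default behaviour: the delegate semaphore waits forever for JS. */
static const int deadlock_edges[PARTY_COUNT] = {
    [JS_THREAD]          = WORKER_QUEUE,        /* dispatch_sync in addTransceiver */
    [WORKER_QUEUE]       = ADM_STATE,           /* stalled on state held by the restart */
    [ADM_STATE]          = DELEGATE_SEMAPHORE,  /* restart is inside a delegate callback */
    [DELEGATE_SEMAPHORE] = JS_THREAD,           /* unbounded wait for the JS reply */
};

/* With a bounded wait the semaphore eventually gives up, so it blocks on nothing. */
static const int bounded_edges[PARTY_COUNT] = {
    [JS_THREAD]          = WORKER_QUEUE,
    [WORKER_QUEUE]       = ADM_STATE,
    [ADM_STATE]          = DELEGATE_SEMAPHORE,
    [DELEGATE_SEMAPHORE] = -1,
};

/* Follow the wait-for edges from `start`; return 1 if they lead back to it. */
static int has_cycle_from(const int waits_on[PARTY_COUNT], int start) {
    assert(start >= 0 && start < PARTY_COUNT);
    int cur = start;
    for (int steps = 0; steps < PARTY_COUNT; steps++) {
        cur = waits_on[cur];
        if (cur < 0) return 0;       /* someone can make progress: no deadlock */
        if (cur == start) return 1;  /* back where we began: circular wait */
    }
    return 0;
}
```

With `deadlock_edges` the walk from `JS_THREAD` returns to `JS_THREAD`. With `bounded_edges` it ends at the semaphore, which is the edge a timeout would break.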

Reproduction shape

Publish a microphone track and then a camera track back-to-back right after Connected (a voice+video call against a LiveKit agent, in our case). The mic publish triggers the playout-only → duplex engine restart; the camera publish issues the synchronous addTransceiver. The race window is a few milliseconds, so it's intermittent — most calls are fine, then one freezes. livekit/client-sdk-react-native#389 has the mirror-image trace from the subscribe side (remote audio+video tracks on join).

Production occurrence

UTC, from our backend records of the frozen call:

| Time (UTC) | Event |
| --- | --- |
| 01:46:33.116 | Last JS-side analytics write; JS thread provably alive |
| 01:46:34.642 | Agent subscribed our microphone track (mic publish completed) |
| | Camera track never published |
| | room.disconnect() never ran; user force-killed the app |

In a healthy call by the same user two minutes earlier, the mic→camera publish gap was 35 ms.

Impact

Permanent app-wide UI freeze: the JS thread never returns, every React Native touchable is dead, and no crash report is generated since it's a deadlock rather than a crash. The user's only recourse is force-killing the app, and because room.disconnect() can never run, the session also lingers server-side.

Proposed fix

Bound the six waits (e.g. dispatch_time(DISPATCH_TIME_NOW, 2 * NSEC_PER_SEC)), drain any stale signal left on the long-lived semaphore by a previously timed-out round before sending the event, and on timeout emit an os_log error and return the default 0 so the engine operation degrades gracefully instead of freezing the app. We're running exactly this as a local patch in production. Happy to open a PR if the approach is acceptable.
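The three steps (drain, bounded wait, default on timeout) can be sketched in portable C. This is a model under stated assumptions, not the patch itself. It uses POSIX `sem_trywait`/`sem_timedwait` as stand-ins for the libdispatch semaphore, logs with `fprintf` where the real code would use os_log, and every name in it (`engine_hook_t`, `hook_round_trip`, `hook_resolve`, the two `js_*` callbacks) is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical model of one observer hook; not the real RCTWebRTC types. */
typedef struct {
    sem_t reply;  /* long-lived, like the observer's semaphore */
    int result;   /* value handed over by the resolve call */
} engine_hook_t;

typedef void (*send_event_fn)(engine_hook_t *hook);

static void hook_init(engine_hook_t *hook) {
    int rc = sem_init(&hook->reply, 0, 0);
    assert(rc == 0);
    (void)rc;
    hook->result = 0;
}

/* Stand-in for an audioDeviceModuleResolve* call: store the answer, wake the waiter. */
static void hook_resolve(engine_hook_t *hook, int result) {
    hook->result = result;
    sem_post(&hook->reply);
}

/* Returns the JS reply, or the default 0 if none arrives within timeout_ms. */
static int hook_round_trip(engine_hook_t *hook, send_event_fn send_event, long timeout_ms) {
    /* 1. Drain signals left behind by replies that arrived after an earlier timeout. */
    while (sem_trywait(&hook->reply) == 0) { }
    hook->result = 0;

    /* 2. Emit the event that JS is expected to answer. */
    send_event(hook);

    /* 3. Wait with an absolute deadline instead of forever. */
    struct timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }
    int rc;
    do {
        rc = sem_timedwait(&hook->reply, &deadline);
    } while (rc == -1 && errno == EINTR);

    if (rc == -1) {  /* timed out: log and degrade instead of blocking */
        fprintf(stderr, "engine hook: no JS reply within %ld ms, using default 0\n", timeout_ms);
        return 0;
    }
    return hook->result;
}

/* Demo callbacks: a JS side that answers immediately, and one that is blocked. */
static void js_replies_with_7(engine_hook_t *hook) { hook_resolve(hook, 7); }
static void js_is_blocked(engine_hook_t *hook) { (void)hook; }
```

On iOS the wait in step 3 would be `dispatch_semaphore_wait` with the `dispatch_time(DISPATCH_TIME_NOW, 2 * NSEC_PER_SEC)` deadline mentioned above, which returns non-zero on timeout. The drain in step 1 matters because a reply that lands after a timed-out round leaves the semaphore signalled. Without the drain, the next round would return immediately with a stale answer.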
