Expose event queue depth as orchestration-visible state #9

@affandar

Problem

Duroxide event queues are effectively opaque to orchestration code and external callers.

When an orchestration accumulates unmatched events on a queue such as "messages", there is no way to inspect how many are pending. The only signal today is a runtime warning when continue_as_new() drops carry-forward events beyond the hard cap:

Dropping carry-forward event beyond limit of 20 during continue-as-new

That is not sufficient because:

  • it appears only in worker logs
  • external processes polling orchestration status cannot see it
  • it fires after messages have already been dropped
  • it does not help orchestrations make better decisions before overflow happens

Concrete use case

PilotSwarm has parent orchestrations that receive a mixed stream of child updates, user prompts, and management messages on the same "messages" queue.

When child updates get noisy, the parent queue can build up enough unmatched events that carry-forward overflow drops user prompts during continue_as_new().

We can see the warning in logs, but clients, operators, and the orchestration itself cannot observe queue pressure directly.

What we want is visibility into queue depth so that:

  1. external callers can tell that their messages are pending behind backlog
  2. orchestrations can avoid risky checkpoints when a queue is near overflow
  3. operators can debug "message ignored" incidents without log forensics

Proposal

Expose queue depth as orchestration-visible/readable state.

Useful options would be:

Option A: expose in getStatus()

Include queue depths in orchestration status:

{
  status: "Running",
  customStatus: ...,
  queueDepths: {
    messages: 12,
    child_updates: 3
  }
}
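
As a rough sketch of how an external caller might consume Option A: the `queueDepths` field is the proposed addition and does not exist today, and the `OrchestrationStatus` interface, the `backloggedQueues` helper, and the threshold value are hypothetical names used only for illustration.

```typescript
// Hypothetical status shape based on Option A; `queueDepths` is the proposed
// field, not an existing part of the status payload.
interface OrchestrationStatus {
  status: string;
  customStatus?: unknown;
  queueDepths?: Record<string, number>;
}

// Return the names of queues whose pending-event count is at or above the threshold.
// A status without `queueDepths` is treated as having no backlog information.
function backloggedQueues(status: OrchestrationStatus, threshold: number): string[] {
  const depths: Record<string, number> = status.queueDepths ?? {};
  return Object.keys(depths).filter((name) => depths[name] >= threshold);
}

// Example payload mirroring the one shown above.
const sampleStatus: OrchestrationStatus = {
  status: "Running",
  queueDepths: { messages: 12, child_updates: 3 },
};

console.log(backloggedQueues(sampleStatus, 10)); // [ 'messages' ]
```

A client polling status could use a check like this to tell a user that their prompt is queued behind backlog rather than lost.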

Option B: expose on orchestration context

Allow orchestration code to inspect queue depth:

const depth = ctx.queueDepth("messages");

This would let orchestrations skip continue_as_new() when queue pressure is high, or publish queue health into custom status.
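
A minimal sketch of that checkpoint decision, assuming `ctx.queueDepth()` returns a plain number and that the limit of 20 from the warning applies to the queue being checked. The `QueueAwareContext` interface, the `safeToContinueAsNew` helper, and the headroom value are hypothetical and not part of any existing API.

```typescript
// Hypothetical context surface from Option B; the real signature is undecided.
interface QueueAwareContext {
  queueDepth(queueName: string): number;
}

// Limit taken from the warning text quoted in this issue.
const CARRY_FORWARD_LIMIT = 20;

// True when every pending event would fit under the carry-forward limit,
// leaving `headroom` slots for events that arrive before the checkpoint.
function safeToContinueAsNew(
  ctx: QueueAwareContext,
  queueName: string,
  headroom: number = 5
): boolean {
  return ctx.queueDepth(queueName) + headroom <= CARRY_FORWARD_LIMIT;
}

// Stub context standing in for a real orchestration context.
const stubCtx: QueueAwareContext = { queueDepth: () => 18 };

console.log(safeToContinueAsNew(stubCtx, "messages")); // false, since 18 + 5 > 20
```

An orchestration could drain or process more events while this returns false, and only checkpoint once it returns true.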

Option C: expose carry-forward counts in history/status

If full live queue depth is hard to provide, at minimum expose how many unmatched events existed before carry-forward truncation during continue_as_new().
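
One possible shape for that record, sketched under the assumption that truncation keeps events up to the limit and drops the rest. The `CarryForwardSummary` type and the `summarizeCarryForward` helper are hypothetical; only the limit of 20 comes from the warning quoted above.

```typescript
// Hypothetical record that could be written to history or status at
// continue-as-new time.
interface CarryForwardSummary {
  unmatchedBefore: number; // unmatched events present before truncation
  carried: number;         // events carried into the next execution
  dropped: number;         // events discarded because of the limit
}

// Compute the summary for a given unmatched-event count and limit.
function summarizeCarryForward(
  unmatchedBefore: number,
  limit: number = 20
): CarryForwardSummary {
  const carried = Math.min(unmatchedBefore, limit);
  return { unmatchedBefore, carried, dropped: unmatchedBefore - carried };
}

console.log(summarizeCarryForward(27));
// { unmatchedBefore: 27, carried: 20, dropped: 7 }
```

Even this after-the-fact count would let callers and operators see that events were dropped without reading worker logs.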

Why this matters

  • log warnings are too late and too hidden
  • queue pressure is a first-class operational signal
  • parent/child orchestration topologies need to know when backlog is forming
  • status visibility would make "why was my message ignored?" diagnosable without worker log access

Relationship to tryDequeueEvent()

This is complementary to a non-blocking dequeue API.

The tryDequeueEvent() issue is about exact drain-until-empty behavior inside orchestrations.

This issue is about visibility: allowing orchestration code and external callers to observe queue backlog and understand when messages are pending or at risk.
