fix(telemetry): make ThreadingInstrumentor opt-in with env-var precedence#2167
Open
jpr5 wants to merge 1 commit intostrands-agents:mainfrom
Open
fix(telemetry): make ThreadingInstrumentor opt-in with env-var precedence#2167jpr5 wants to merge 1 commit intostrands-agents:mainfrom
jpr5 wants to merge 1 commit intostrands-agents:mainfrom
Conversation
…ence
strands-agents>=1.35.0 unconditionally calls ThreadingInstrumentor().
instrument() on Tracer construction. When another OTel setup (Azure
Monitor, opentelemetry-distro, Langfuse autoloader, etc.) already
wraps concurrent.futures.ThreadPoolExecutor.submit, or when the host
sets OTEL_PYTHON_DISABLED_INSTRUMENTATIONS=threading, strands still
installs its own wrapper — defeating the documented disable mechanism
and imposing global Python state mutation on every strands user.
Tracer now accepts an instrument_threading: bool | None kwarg. Resolver
precedence (highest → lowest):
1. OTEL_PYTHON_DISABLED_INSTRUMENTATIONS containing 'threading' →
always disabled (matches OpenTelemetry's auto-loader semantics).
2. Explicit Tracer(instrument_threading=...) kwarg.
3. STRANDS_INSTRUMENT_THREADING env var ('1'/'true'/'yes').
4. Default: disabled.
_maybe_instrument_threading checks the underscore-prefixed private flag
_is_instrumented_by_opentelemetry to avoid stacking wrappers when the
host has already installed the instrumentor. instrument() failures are
caught and logged (error level when the user explicitly requested,
warning level otherwise) so telemetry failure never crashes the host.
Regression tests run in isolated subprocesses because BaseInstrumentor
is a process-wide singleton. Tests cover default-off, env-var opt-in,
kwarg opt-in, env-var-disabled-precedence-over-kwarg-opt-in,
kwarg-false-overrides-env-opt-in, idempotency when already instrumented,
graceful failure on instrument() exceptions, and user-requested failures
logged at ERROR while auto-enabled failures log at WARNING.
Downstream workaround in CopilotKit PR #4083 (commit 9227bc27d,
showcase/packages/strands/src/agent_server.py) neutralized the
instrumentor at import time; this removes the need for that workaround.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
strands-agents>=1.35.0 unconditionally calls ThreadingInstrumentor().instrument() on Tracer construction. When another OTel setup (Azure Monitor, opentelemetry-distro, Langfuse autoloader, etc.) already wraps concurrent.futures.ThreadPoolExecutor.submit, or when the host sets OTEL_PYTHON_DISABLED_INSTRUMENTATIONS=threading, strands still installs its own wrapper — defeating the documented disable mechanism and imposing global Python state mutation on every strands user.
Tracer now accepts an instrument_threading: bool | None kwarg. Resolver precedence (highest → lowest):
_maybe_instrument_threading checks the underscore-prefixed private flag _is_instrumented_by_opentelemetry to avoid stacking wrappers when the host has already installed the instrumentor. instrument() failures are caught and logged (error level when the user explicitly requested, warning level otherwise) so telemetry failure never crashes the host.
Regression tests run in isolated subprocesses because BaseInstrumentor is a process-wide singleton. Tests cover default-off, env-var opt-in, kwarg opt-in, env-var-disabled-precedence-over-kwarg-opt-in, kwarg-false-overrides-env-opt-in, idempotency when already instrumented, graceful failure on instrument() exceptions, and user-requested failures logged at ERROR while auto-enabled failures log at WARNING.
Downstream workaround in CopilotKit PR #4083 (commit 9227bc27d, showcase/packages/strands/src/agent_server.py) neutralized the instrumentor at import time; this removes the need for that workaround.