Trim `import tractor` 0.42s -> 0.15s (gh #470) #478
Open
goodboy wants to merge 5 commits into
`get_caller_mod()` (nested in `get_logger()`) walks the WHOLE call-stack via `inspect.stack()`, which also resolves src-file info for every frame and scans all of `sys.modules` per frame via `inspect.getmodule()`. During nested imports (deep importlib stacks) each module-level `get_logger()` call costs ~5-10ms, making the ~39 such calls dominate `import tractor` wall-time: ~244ms of the ~420ms total (see gh #470).

Deats,
- resolve the caller frame with `sys._getframe(frames_up)` and map its `f_globals['__name__']` through `sys.modules`: O(1) vs. O(stack x sys.modules).
- guard `ValueError` (stack too shallow) -> `None`, matching the existing null-caller handling at all use-sites.
- drop the now-unused `inspect` imports; pull `FrameType` from `types` instead.

Results: `import tractor` drops 0.42s -> ~0.155s; sequential `.start_actor()` spawn latency ~0.42 -> ~0.18s/actor.

Prompt-IO: ai/prompt-io/claude/20260702T155626Z_65bf9df5_prompt_io.md

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
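The frame-lookup approach described in this commit can be sketched roughly as follows. This is a minimal standalone illustration, not the code in `tractor/log.py`: the default value of `frames_up`, the use of `.get()` instead of direct indexing, and the free-standing (non-nested) function are all assumptions made for the example.

```python
import sys
from types import FrameType, ModuleType
from typing import Optional


def get_caller_mod(frames_up: int = 1) -> Optional[ModuleType]:
    """Return the module owning the frame `frames_up` levels above this call.

    Hypothetical standalone sketch; the real helper is nested in
    `get_logger()` and may differ in signature and defaults.
    """
    try:
        # Frame 0 is this function itself; no src-file resolution and
        # no scan over `sys.modules` happens here.
        frame: FrameType = sys._getframe(frames_up)
    except ValueError:
        # The call stack is shallower than `frames_up`.
        return None

    # Every module's globals carry its dotted name, so a single dict
    # lookup maps the frame back to its module object.
    mod_name = frame.f_globals.get('__name__')
    return sys.modules.get(mod_name) if mod_name else None
```

Called at module level with `frames_up=1`, this returns the calling module itself; asking for more frames than the stack holds returns `None` instead of raising.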
Move every import-time-only-by-accident dep off the eager `import tractor` path so cold child-actor boots only pay for what they actually use:
- `bidict` -> `TYPE_CHECKING` in `discovery._addr` (annotation-only; `_address_types` is a plain `dict` literal).
- `multiaddr` -> `TYPE_CHECKING` + fn-local imports in `discovery._multiaddr.mk_maddr()`/`parse_maddr()`; also `TYPE_CHECKING` the `Multiaddr` annots in `ipc._tcp`/`._uds` (adds future-annots to `._multiaddr`).
- `colorlog` -> fn-local in `log.get_console_log()`.
- `pdbp` + `wrapt` -> fn-local in `devx._frame_stack.hide_runtime_frames()`/`api_frame()`.
- `platformdirs` -> fn-local in `runtime._state.get_rt_dir()`.

Still eager (documented follow-ups),
- `pdbp` via `devx.debug._repl` class-bases (`PdbREPL(pdbp.Pdb)`) + the module-lvl `@pdbp.hideframe` in `._tty_lock`; needs a `._repl` restructure.
- `platformdirs` via the `UDSAddress.def_bindspace: ClassVar` class-body eval of `get_rt_dir()`; needs an `Address`-proto rework.
- `stackscope` is already fn-local; `setproctitle` is not imported anywhere.

Prompt-IO: ai/prompt-io/claude/20260702T155626Z_65bf9df5_prompt_io.md

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
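The `TYPE_CHECKING` + fn-local import pattern used throughout this commit looks roughly like the sketch below. The stdlib `fractions` module stands in for a heavier third-party dep such as `multiaddr`, and `parse_ratio()` is a hypothetical function invented for the illustration.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only static type checkers evaluate this branch; at runtime
    # `TYPE_CHECKING` is False so the import never executes here.
    from fractions import Fraction


def parse_ratio(text: str) -> 'Fraction':
    """Parse `text` (e.g. '3/4') into a `Fraction`.

    The return annotation is a string so it is never evaluated at
    definition time; `from __future__ import annotations` gives the
    same effect for every annotation in a module.
    """
    # fn-local import: the import cost is paid on the first call
    # instead of when the enclosing module is imported.
    from fractions import Fraction
    return Fraction(text)
```

Subsequent calls hit the `sys.modules` cache, so the local import costs little after the first call. The pattern does not work for names needed at class-definition or decorator time, which is why the class-base and class-body cases above remain eager.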
`asyncio` (~5ms) only matters for infected-aio actors yet gets imported by every cold `import tractor` via module-lvl `.to_asyncio` imports in the debug-REPL + spawn-entry mods.

Deats,
- `devx.debug._trace`/`._tty_lock`: mv `import asyncio` under `TYPE_CHECKING` + fn-local it at the two `asyncio.current_task()` call-sites; fn-local the `run_trio_task_in_future` imports in the infected-aio-only branches.
- `spawn._entry`: fn-local `run_as_asyncio_guest` inside the `infect_asyncio=True` branches of `_mp_main()`/`_trio_main()`.
- `tractor/__init__.py`: add a PEP-562 module `__getattr__` lazy-loading `.to_asyncio` on first attr-access so the public `tractor.to_asyncio.<attr>` API (e.g. `LinkedTaskChannel` annots in `test_child_manages_service_nursery.py` + downstream users) keeps working unchanged.

Prompt-IO: ai/prompt-io/claude/20260702T155626Z_65bf9df5_prompt_io.md

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
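A PEP 562 module-level `__getattr__` of the kind described for `tractor/__init__.py` might look like the following sketch. To stay self-contained it builds a throwaway module object instead of a real package; `make_lazy_pkg()`, `demo_pkg` and the `to_colors` -> `colorsys` mapping are all hypothetical stand-ins, and the actual hook in tractor may be written differently.

```python
import importlib
import types
from typing import Dict


def make_lazy_pkg(name: str, lazy_submods: Dict[str, str]) -> types.ModuleType:
    """Build a module whose listed attributes are imported on first access."""
    pkg = types.ModuleType(name)

    def __getattr__(attr: str) -> types.ModuleType:
        # PEP 562: only invoked after normal attribute lookup on the
        # module has failed, so eager attributes never pay for it.
        target = lazy_submods.get(attr)
        if target is None:
            raise AttributeError(
                f'module {name!r} has no attribute {attr!r}'
            )
        mod = importlib.import_module(target)
        # Cache on the module so later lookups bypass this hook.
        setattr(pkg, attr, mod)
        return mod

    # In a real package this is simply a top-level
    # `def __getattr__(name): ...` in `__init__.py`.
    pkg.__getattr__ = __getattr__
    return pkg


# `colorsys` plays the role of the deferred submodule.
demo_pkg = make_lazy_pkg('demo_pkg', {'to_colors': 'colorsys'})
```

Accessing `demo_pkg.to_colors` triggers the import once; unknown names still raise `AttributeError`, so `hasattr()` and tooling that probes attributes behave as usual.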
Log the AI-assisted session per the NLNet generative-AI policy: prompt, profiling findings, per-file diff pointers, measured results and the unimplemented `pdbp`/`platformdirs` deferral follow-ups.

(this patch was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Pull request overview
This PR reduces `import tractor` wall time (and downstream actor boot latency) by removing expensive eager imports and optimizing logger caller-module detection.
Changes:
- Reworks `tractor.log.get_logger()` caller-module resolution to avoid `inspect.stack()`/`inspect.getmodule()` overhead; moves `colorlog` to a function-local import.
- Introduces PEP 562 lazy submodule loading for `tractor.to_asyncio` to keep `asyncio` off the eager `import tractor` path.
- Converts several imports (`platformdirs`, `multiaddr`, `bidict`, `wrapt`, `pdbp`, and some `asyncio`/`tractor.to_asyncio` call sites) to `TYPE_CHECKING` or local imports to reduce import-time cost.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| tractor/spawn/_entry.py | Defers .to_asyncio import to infect_asyncio=True branches to avoid eager asyncio cost. |
| tractor/runtime/_state.py | Lazy-imports platformdirs inside get_rt_dir() to reduce eager imports. |
| tractor/log.py | Replaces inspect.stack()-based caller detection with sys._getframe(); lazy-imports colorlog in console handler setup. |
| tractor/ipc/_uds.py | Moves multiaddr.Multiaddr import to TYPE_CHECKING to avoid eager import cost. |
| tractor/ipc/_tcp.py | Same as _uds.py: Multiaddr import moved under TYPE_CHECKING. |
| tractor/discovery/_multiaddr.py | Adds postponed annotations + local multiaddr imports in parsing/formatting helpers. |
| tractor/discovery/_addr.py | Moves bidict import under TYPE_CHECKING (annotation-only). |
| tractor/devx/debug/_tty_lock.py | Defers asyncio / .to_asyncio imports to runtime branches that require them. |
| tractor/devx/debug/_trace.py | Defers asyncio / .to_asyncio imports to runtime branches that require them. |
| tractor/devx/_frame_stack.py | Defers pdbp and wrapt imports to their use sites to reduce eager import cost. |
| tractor/\_\_init\_\_.py | Adds `__getattr__` to lazily load tractor.to_asyncio on first access. |
| ai/prompt-io/claude/20260702T155626Z_65bf9df5_prompt_io.raw.md | Adds raw prompt/output capture related to the work on gh #470. |
| ai/prompt-io/claude/20260702T155626Z_65bf9df5_prompt_io.md | Adds summarized prompt/output capture related to the work on gh #470. |
The #470 boot-latency example hard-coded spawning each `worker_<i>` subactor concurrently from a bg `trio.Task` (so each child's cold `import tractor` overlaps). Add a `main()` `spawn_subs_in_bg_tasks` flag so the serial-spawn path can be demo'd/compared too: flip it `False` to `start_actor()` each sub inline in the loop before handing the ready `Portal` to the bg task.

Deats,
- factor an `open_ep(ptl, i)` helper out of `spawn_and_open_ep()`: just the `Portal.open_context()` + `wait_for_result()` half, now that the spawn step is caller-optional.
- `spawn_and_open_ep()` grows a `maybe_ptl: Portal|None = None` param: spawn the subactor itself when unset (bg-task path), OW reuse the pre-spawned one (serial path).
- move the "overlap cold imports" rationale comment onto the new `main()` param where the toggle now lives.

(this commit msg was generated in some part by [`claude-code`][claude-code-gh])
[claude-code-gh]: https://github.com/anthropics/claude-code
Trim `import tractor` 0.42s -> 0.15s (gh #470)

Motivation

`import tractor` measured ~0.42s cold and, on the default `trio` spawn backend, that import ~100% dominates per-actor spawn latency: a pure `start_actor()` (spawn + boot + register) came in at ~0.40-0.44s/actor, so sequential subactor-per-core spawns cost N × ~0.4s. Surfaced while reworking the landing-page example in #460; issue #470 proposed lazy-importing the heavy/optional deps. The practical payoff: cheap enough cold boots that a tree can spawn subactors serially (inline `start_actor()`) without the bg-`trio.Task` import-overlap trick just to stay responsive.

Profiling (`-X importtime` + `cProfile`) showed the dep-list only accounted for ~20ms of the total; the dominant ~244ms was `log.get_logger()`'s nested `get_caller_mod()` calling `inspect.stack()` at module level in ~39 tractor modules, which builds src-file info for EVERY (import-time-deep) stack frame and scans all of `sys.modules` per frame via `inspect.getmodule()`.

Summary of changes
- `get_caller_mod()` now resolves the caller frame via `sys._getframe(frames_up)` + a `f_globals['__name__']` -> `sys.modules` lookup instead of the O(stack × sys.modules) `inspect.stack()` walk.
- Per the `import tractor` cost to cut actor spawn latency #470 checklist: `bidict`, `multiaddr` -> `TYPE_CHECKING`/fn-local (`discovery._addr`/`._multiaddr`, `ipc._tcp`/`._uds`); `colorlog` -> fn-local in `log.get_console_log()`; `pdbp` + `wrapt` -> fn-local in `devx._frame_stack`; `platformdirs` -> fn-local in `runtime._state.get_rt_dir()`.
- Keeps `asyncio` off the import path entirely for `trio`-only apps: fn-local the `.to_asyncio` imports in `devx.debug._trace`/`._tty_lock` + `spawn._entry`'s infected-aio branches, with a PEP-562 `__getattr__` in `tractor/__init__.py` keeping the public `tractor.to_asyncio.<attr>` access working unchanged.
- Logs the AI-assisted session per the NLNet generative-AI policy.
- `we_are_processes` example: a `main()` `spawn_subs_in_bg_tasks` toggle so the serial spawn path (inline `start_actor()` per sub) can be shown alongside the bg-`trio.Task` overlap path; now that a cold child boots in ~0.18s the overlap trick is no longer needed to keep tree spawn-time sane. Factors an `open_ep()` helper out of `spawn_and_open_ep()` (which grows a `maybe_ptl` param) so the spawn step is caller-optional.

Results: `import tractor` 0.42s -> ~0.145s (-65%); sequential `start_actor()` latency ~0.42 -> ~0.18s/actor. Full suite green under both `--tpt-proto` backends (`tcp` 403 passed, `uds` 401 passed, 0 failures each).
Future follow up
See the remaining follow-ups in issue #470: the `pdbp` (~10ms, needs a `devx.debug._repl` restructure) and `platformdirs` (~1.5ms, needs an `Address`-proto rework of `UDSAddress.def_bindspace`) deferrals stay open there.

Links
- `import tractor` cost to cut actor spawn latency #470 (left open to track the follow-ups above)
- `main_thread_forkserver` backend #463) which amortize the import cost but don't remove it for the `trio` backend

(this pr content was generated in some part by `claude-code`)