fix: normalize shell background commands#7859
fix: normalize shell background commands#7859zouyonghe wants to merge 3 commits intoAstrBotDevs:masterfrom
Conversation
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- The quote/comment parsing state machine is duplicated between
_command_effective_endand_is_unquoted_char_at; consider extracting a shared helper or reusing one of them to avoid drift and make the background-operator handling easier to reason about. _is_self_detached_commandnow overlaps semantically with_uses_explicit_background_launcherand the new_prepare_shell_backgroundflow; it might be worth either inlining/deprecating it or aligning its behavior and callers with the new helpers to reduce confusion about which function should be used for background detection.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The quote/comment parsing state machine is duplicated between `_command_effective_end` and `_is_unquoted_char_at`; consider extracting a shared helper or reusing one of them to avoid drift and make the background-operator handling easier to reason about.
- `_is_self_detached_command` now overlaps semantically with `_uses_explicit_background_launcher` and the new `_prepare_shell_background` flow; it might be worth either inlining/deprecating it or aligning its behavior and callers with the new helpers to reduce confusion about which function should be used for background detection.
## Individual Comments
### Comment 1
<location path="astrbot/core/tools/computer_tools/shell.py" line_range="135" />
<code_context>
- return True
- return tokens[-1] == "&"
+
+def _strip_plain_trailing_background_operator(command: str) -> str | None:
+ effective_end = _command_effective_end(command)
+ tail = command[:effective_end].rstrip()
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying background handling by relying solely on shlex-based token parsing and a single classifier helper instead of custom character-level parsing utilities.
You can drop the custom mini‑parser and keep all behavior by using `shlex` as the single source of truth for both detection and stripping of trailing `&`.
Concretely:
1. **Use only `shlex` for parsing/background detection.**
2. **Strip a trailing `&` at the token level**, instead of walking the raw string with quote/escape logic.
This lets you remove `_command_effective_end` and `_is_unquoted_char_at`, and simplify `_prepare_shell_background`.
### 1. Simplify `_strip_plain_trailing_background_operator`
Instead of scanning characters, reuse `_command_tokens_before_comment` and work purely on tokens:
```python
def _strip_plain_trailing_background_operator(command: str) -> str | None:
tokens = _command_tokens_before_comment(command)
if not tokens or tokens[-1] != "&":
return None
# Drop the final '&' token; comments after '&' are already removed
stripped_tokens = tokens[:-1]
stripped = " ".join(stripped_tokens)
# NOTE: join() will normalize spacing and drop inline comments after '&',
# but that content would not be executed anyway.
return stripped if stripped else ""
```
Because `shlex` respects quotes, this only matches an **unquoted** trailing `&` token. `&` inside quotes will be part of another token and will not be equal to `"&"`.
You can then delete:
```python
def _command_effective_end(...): ...
def _is_unquoted_char_at(...): ...
```
### 2. Unify background classification logic
You can centralize the classification into a single helper and keep `_prepare_shell_background` very small:
```python
from enum import Enum, auto
class BackgroundKind(Enum):
EXPLICIT_LAUNCHER = auto()
TRAILING_AMP = auto()
NONE = auto()
def _background_kind(command: str) -> BackgroundKind:
tokens = _command_tokens_before_comment(command)
if not tokens:
return BackgroundKind.NONE
if tokens[0].lower() in {"nohup", "setsid", "disown", "start", "start-process"}:
return BackgroundKind.EXPLICIT_LAUNCHER
if tokens[-1] == "&":
return BackgroundKind.TRAILING_AMP
return BackgroundKind.NONE
```
```python
def _prepare_shell_background(command: str, background: bool) -> tuple[str, bool]:
if not background:
# Preserve old behavior when background flag is False
return command, False
kind = _background_kind(command)
if kind is BackgroundKind.EXPLICIT_LAUNCHER:
# Command already backgrounds itself
return command, False
if kind is BackgroundKind.TRAILING_AMP:
stripped = _strip_plain_trailing_background_operator(command)
# stripped is never None here, but keep defensive fallback
return stripped or command, True
# No explicit backgrounding; use the flag
return command, True
```
This:
- Keeps all the semantics you added: supports `background` flag, respects self‑detaching launchers, avoids double‑backgrounding, and honors quoted `&`.
- Removes the custom character‑level parser and overlapping helpers.
- Makes reasoning about background handling revolve around a single, `shlex`‑based path.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Code Review
This pull request updates the shell execution tool and system prompts to handle background tasks by instructing the LLM to avoid background operators and automatically stripping trailing ampersands when the background flag is set. The review identifies opportunities to optimize command parsing, remove dead code, and fix potential path-handling issues on Windows.
| def _strip_plain_trailing_background_operator(command: str) -> str | None: | ||
| effective_end = _command_effective_end(command) | ||
| tail = command[:effective_end].rstrip() | ||
| if not tail.endswith("&"): | ||
| return None | ||
| if not _is_unquoted_char_at(tail, len(tail) - 1): | ||
| return None | ||
|
|
||
| stripped = tail[:-1].rstrip() | ||
| return stripped if stripped != tail else None |
There was a problem hiding this comment.
This function performs two full passes over the command string (one in _command_effective_end and one in _is_unquoted_char_at). Additionally, the manual parsing logic for escapes (\\) and quotes is inconsistent with the shlex usage in _command_tokens_before_comment (which uses posix=False). On Windows, treating \\ as an escape character will break paths (e.g., C:\\temp).
Consider unifying the parsing logic into a single pass that identifies both the comment start and the trailing ampersand, while respecting the platform's shell rules.
| def _is_self_detached_command(command: str) -> bool: | ||
| tokens = _command_tokens_before_comment(command) | ||
| if not tokens: | ||
| return False | ||
|
|
||
| if _uses_explicit_background_launcher(command): | ||
| return True | ||
| return tokens[-1] == "&" |
There was a problem hiding this comment.
The function _is_self_detached_command appears to be dead code in the production logic. It was previously used in the call method but has been replaced by _prepare_shell_background. Unless it is intended to be part of a public API (which seems unlikely for a private helper), it should be removed to maintain code cleanliness. Note that you will also need to update or remove the corresponding unit tests in tests/unit/test_func_tool_manager.py.
| def _uses_explicit_background_launcher(command: str) -> bool: | ||
| tokens = _command_tokens_before_comment(command) | ||
| if not tokens: | ||
| return False | ||
| return tokens[0].lower() in {"nohup", "setsid", "disown", "start", "start-process"} |
There was a problem hiding this comment.
This function tokenizes the entire command string just to check the first token. For long commands, this is inefficient. You can use shlex.shlex.get_token() to retrieve only the first token.
| def _uses_explicit_background_launcher(command: str) -> bool: | |
| tokens = _command_tokens_before_comment(command) | |
| if not tokens: | |
| return False | |
| return tokens[0].lower() in {"nohup", "setsid", "disown", "start", "start-process"} | |
| def _uses_explicit_background_launcher(command: str) -> bool: | |
| lex = shlex.shlex(command, posix=False) | |
| lex.whitespace_split = True | |
| lex.commenters = "" | |
| try: | |
| first_token = lex.get_token() | |
| except ValueError: | |
| return False | |
| if not first_token: | |
| return False | |
| return first_token.lower() in {"nohup", "setsid", "disown", "start", "start-process"} |
| return None | ||
|
|
||
| stripped = tail[:-1].rstrip() | ||
| return stripped if stripped != tail else None |
|
@sourcery-ai review |
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- The semantics of
_is_self_detached_commandchanged to rely on_background_kindand now treat any command with a trailing&as self-detached; if this helper is used externally, this could be a subtle behavior change, so consider either inlining/removing it or keeping its original semantics and using a separate helper for the new classification. - _strip_plain_trailing_background_operator
can return an empty string when the command is just&, which then gets passed tosb.shell.exec`; it may be safer to detect this case earlier and either no-op or treat it as a foreground/background flag error rather than executing an empty command.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The semantics of `_is_self_detached_command` changed to rely on `_background_kind` and now treat any command with a trailing `&` as self-detached; if this helper is used externally, this could be a subtle behavior change, so consider either inlining/removing it or keeping its original semantics and using a separate helper for the new classification.
- _strip_plain_trailing_background_operator` can return an empty string when the command is just `&`, which then gets passed to `sb.shell.exec`; it may be safer to detect this case earlier and either no-op or treat it as a foreground/background flag error rather than executing an empty command.
## Individual Comments
### Comment 1
<location path="astrbot/core/tools/computer_tools/shell.py" line_range="144-149" />
<code_context>
+ return tokens
+
+
+def _strip_plain_trailing_background_operator(command: str) -> str | None:
+ tokens = _command_tokens_before_comment(command)
+ if not tokens or tokens[-1] != "&":
+ return None
- first = tokens[0].lower()
- if first in {"nohup", "setsid", "disown", "start", "start-process"}:
- return True
- return tokens[-1] == "&"
+ stripped = " ".join(tokens[:-1])
+ return stripped if stripped else ""
</code_context>
<issue_to_address>
**issue (bug_risk):** Avoid returning an empty command string when the input is only a background operator.
For input like just `"&"`, `_command_tokens_before_comment` returns `['&']`, so this helper yields `""` and `_prepare_shell_background` ends up calling `sb.shell.exec` with an empty command. Instead, when `tokens[:-1]` is empty, return `None` so `_prepare_shell_background` can treat this as "no valid command" and fall back to the original `command` or handle the case explicitly, rather than executing an empty command.
</issue_to_address>
### Comment 2
<location path="astrbot/core/tools/computer_tools/shell.py" line_range="21" />
<code_context>
}
+class BackgroundKind(Enum):
+ EXPLICIT_LAUNCHER = auto()
+ TRAILING_AMP = auto()
</code_context>
<issue_to_address>
**issue (complexity):** Consider replacing the enum- and helper-based background handling with a single classifier function reused by both `_prepare_shell_background` and `_is_self_detached_command` to simplify control flow and tokenization.
You can keep the new behavior and reduce indirection by:
- Removing `BackgroundKind` and `_strip_plain_trailing_background_operator`.
- Doing a single tokenization/classification pass.
- Reusing that result for both `_prepare_shell_background` and `_is_self_detached_command`.
For example:
```python
def _command_tokens_before_comment(command: str) -> list[str]:
lex = shlex.shlex(command, posix=False)
lex.whitespace_split = True
lex.commenters = ""
try:
tokens = list(lex)
except ValueError:
return []
comment_index = next(
(index for index, token in enumerate(tokens) if token.startswith("#")),
None,
)
if comment_index is not None:
tokens = tokens[:comment_index]
return tokens
```
Add a single classifier:
```python
def _classify_background(command: str) -> tuple[list[str], bool, bool]:
"""
Returns (tokens_before_comment, has_explicit_launcher, has_trailing_amp).
"""
tokens = _command_tokens_before_comment(command)
if not tokens:
return tokens, False, False
first = tokens[0].lower()
has_explicit_launcher = first in {"nohup", "setsid", "disown", "start", "start-process"}
has_trailing_amp = tokens[-1] == "&"
return tokens, has_explicit_launcher, has_trailing_amp
```
Then `_prepare_shell_background` becomes a single, straightforward branching point without enum/extra helper:
```python
def _prepare_shell_background(command: str, background: bool) -> tuple[str, bool]:
if not background:
return command, False
tokens, has_explicit_launcher, has_trailing_amp = _classify_background(command)
# Respect explicit background launchers: treat as already-detached, do not
# request separate background execution.
if has_explicit_launcher:
return command, False
# If the caller passed a trailing '&', strip it and run in background.
if has_trailing_amp:
stripped = " ".join(tokens[:-1])
return stripped if stripped else "", True
# No launcher or '&': honor the background flag as-is.
return command, True
```
And `_is_self_detached_command` can delegate to the same classifier without an enum:
```python
def _is_self_detached_command(command: str) -> bool:
_, has_explicit_launcher, has_trailing_amp = _classify_background(command)
return has_explicit_launcher or has_trailing_amp
```
This:
- Keeps all existing behavior (explicit launchers, trailing `&`, stripping `&`, background flag handling).
- Eliminates the enum and double tokenization.
- Centralizes background-related logic into a single classifier and a single decision function.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
@sourcery-ai review |
There was a problem hiding this comment.
Hey - I've left some high level feedback:
- The new async shell tests all reimplement the same
FakeShell/FakeBooter/FakeConfig/FakeWrapperscaffolding; consider extracting a small factory/helper to reduce duplication and make future behavior changes to the fake environment easier to keep in sync. - The empty-command check for
background=Trueis currently done inExecuteShellTool.callafter_prepare_shell_background; consider moving that validation into_prepare_shell_backgroundso all background normalization and related error handling are centralized in one place.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The new async shell tests all reimplement the same `FakeShell`/`FakeBooter`/`FakeConfig`/`FakeWrapper` scaffolding; consider extracting a small factory/helper to reduce duplication and make future behavior changes to the fake environment easier to keep in sync.
- The empty-command check for `background=True` is currently done in `ExecuteShellTool.call` after `_prepare_shell_background`; consider moving that validation into `_prepare_shell_background` so all background normalization and related error handling are centralized in one place.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Summary
This fixes CUA desktop launches where the model passes both
background=trueand a shell-level trailing&, for examplefirefox &.The CUA booter has its own background launcher that detaches GUI apps through a Python
subprocess.Popenwrapper. The generic shell tool previously treated any trailing&as already detached and disabled the tool-levelbackgroundflag, which bypassed the CUA wrapper and could leave GUI launch commands waiting until timeout.Changes
background=truewithout appending&,nohup, or other shell background operators.astrbot_execute_shellbackgroundparameter description.&commands whenbackground=true, sofirefox &is executed asfirefoxwith tool-managed background execution.nohup ... &.Verification
uv run pytest tests/unit/test_func_tool_manager.py tests/unit/test_astr_main_agent.py -quv run pytest tests/unit/test_func_tool_manager.py tests/unit/test_astr_main_agent.py tests/unit/test_cua_computer_use.py -quv run ruff format .uv run ruff check .Summary by Sourcery
Normalize shell background handling for CUA desktop launches to rely on the tool-level background flag while preserving explicit self-detaching commands.
Bug Fixes:
astrbot_execute_shellstill go through the CUA wrapper by stripping plain trailing&when the background flag is set and rejecting empty commands.Enhancements:
astrbot_execute_shellbackground parameter description and the CUA desktop control prompt to discourage manual shell background operators.Documentation:
background=truewithout&or other shell background operators.Tests:
&rejection, quoted ampersands, and updated prompt/schema expectations.