[codex] integrate extended-data package fabric#6
Conversation
|
Important Review skippedToo many files! This PR contains 223 files, which is 73 over the limit of 150. To get a review, narrow the scope: ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (223)
You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Review Summary
I've reviewed this major refactoring PR that consolidates extended-data into a tiered Python package. The changes are extensive (128 files, 2669 additions, 592 deletions) and represent a significant architectural improvement with first-class primitives, containers, IO, inputs, logging, connectors, secrets, and workflow surface.
Critical Issue Found
Logic Error in CLI (src/extended_data/connectors/cli.py lines 170-172):
The cmd_methods function computes a docstring but never uses or displays it. Lines 171-172 execute attr.__doc__.split("\n")[0].strip()[:50] without assigning the result to a variable or outputting it. This causes the "methods" command to silently iterate through methods but display nothing to the user.
Fix needed:
# Get first line of docstring
doc = attr.__doc__.split("\n")[0].strip()[:50] if attr.__doc__ else "No description"
_write_stdout(f" {name:<30} {doc}")Security & Architecture Assessment
Positive aspects:
- The
VendorConnectorBaseclass provides good security primitives including rate limiting, automatic retries, and proper error handling - The
SecretsConnectorproperly handles both native and CLI fallback modes for the secrets sync functionality - Type annotations are comprehensive throughout the codebase
- The connector registry pattern provides good separation of concerns with lazy loading of optional dependencies
No blocking security vulnerabilities found in the reviewed files, but the logic error in the CLI blocks functionality.
Recommendation
Please fix the critical logic error in cmd_methods before merging. The rest of the changes appear well-structured and properly implemented. The architectural consolidation improves maintainability and the tiered package structure provides clear separation of concerns.
You can now have the agent implement changes and create commits directly on your pull request's source branch. Simply comment with /q followed by your request in natural language to ask the agent to make changes.
There was a problem hiding this comment.
Code Review
This pull request introduces a new 'Tier 2' container layer to the extended-data package, including ExtendedDict, ExtendedList, ExtendedSet, and ExtendedString classes that provide ergonomic, chainable methods for data manipulation. It also refactors the package structure to separate pure primitives (Tier 1) from these containers (Tier 2) and higher-order processors (Tier 3), while updating the ConnectorFabric and various vendor connectors to support these new containers. The review identified a critical issue in ExtendedDict where the UserDict.update method bypasses the overridden __setitem__, preventing proper promotion of nested data; an actionable fix was provided to override update to ensure all data is correctly processed through the extend_data factory.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
Summary
Validation
uv run --extra tests pytest testsuv run --extra typing mypy src/extended_datauvx ruff check src tests examples README.md docs/package-surface.mduv buildNotes