Skip to content

feat(flow): add ranking context assets#604

Merged
LinoGiger merged 2 commits into
mainfrom
feat(flow)/add-ranking-context-assets
Jun 2, 2026
Merged

feat(flow): add ranking context assets#604
LinoGiger merged 2 commits into
mainfrom
feat(flow)/add-ranking-context-assets

Conversation

@RapidPoseidon
Copy link
Copy Markdown
Contributor

Summary

  • add optional context_assets to create_new_flow_batch
  • upload context asset files via the existing AssetUploader path and send contextAsset on flow item creation
  • update generated create-flow-item input model for the optional contextAsset field

Validation

  • Not run: python/uv are unavailable in this container
  • Not run: pyright requires node, and node is unavailable in this container

🔗 Session: https://session-f0bf5a67.poseidon.rapidata.internal/

@claude
Copy link
Copy Markdown

claude Bot commented Jun 2, 2026

Code Review

Overview

This PR adds an optional context_assets parameter to create_new_flow_batch(), allowing callers to supply image/video/audio context alongside the instruction. The implementation uploads assets via the existing AssetUploader path and threads the result through the new contextAsset field on CreateFlowItemEndpointInput. The approach is consistent with how the existing media_context pattern works elsewhere in the SDK.


Issues

1. Mustache templates not updated (required by CLAUDE.md)

create_flow_item_endpoint_input.py is an auto-generated file. CLAUDE.md explicitly states:

Note that if there are any changes that have to be made in those files you MUST also update the mustache files under openapi/templates.

No mustache template changes are included in this PR. On the next OpenAPI codegen run the changes to create_flow_item_endpoint_input.py will be overwritten.

2. Documentation not updated (required by CLAUDE.md)

CLAUDE.md states:

when updating any interfaces make sure you update the docs and examples.

docs/flows.md documents context with a code example but context_assets is not mentioned at all. The new parameter should be shown alongside the existing context example.

3. Pyright not run (required by CLAUDE.md)

The PR description explicitly notes pyright was not run. CLAUDE.md requires running pyright src/rapidata/rapidata_client and ensuring no errors before submitting.


Minor observations

Validation placement — The context_assets length check happens before the expensive upload work, which is correct. However, context_assets=[] (empty list) is blocked by not 1 <= len(...) <= 10, so the error message "must contain between 1 and 10 assets" could be split into two clearer messages ("must not be empty" vs "must not exceed 10 assets"), but this is a style nit.

Upload ordercontext_asset_input is uploaded after the datapoints in create_new_flow_batch. If the context upload fails, the dataset and uploaded datapoints are already created and become orphaned. This is an existing structural concern (not new here), but worth noting for future hardening.

upload_and_map_asset with a list — Passing list[str] to upload_and_map_asset produces a MultiAssetInput (via AssetMapper.create_existing_asset_input(uploaded_names)). This is the correct type for multi-asset context and reuses the existing codepath cleanly.

No test coverage — There are no tests for the new parameter path. Given the PR description also notes the code was not run locally, test coverage is especially important here.


Summary

The core logic (validation, upload, model field wiring) is correct and follows existing patterns. Before merging:

  1. Update the mustache template(s) for CreateFlowItemEndpointInput to include the contextAsset field.
  2. Add context_assets documentation and a code example to docs/flows.md.
  3. Run pyright src/rapidata/rapidata_client and resolve any errors.
  4. Add at least one test covering the context_assets path.

RapidPoseidon and others added 2 commits June 2, 2026 09:55
Co-Authored-By: luca <luca@rapidata.ai>
Regenerated against the merged backend flow contract (api.rabbitdata.ch).
Adds contextAsset to the flow item query/get output models that the
manual edit had missed, and refreshes the merged schema specs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: luca <luca@rapidata.ai>
@RapidPoseidon RapidPoseidon force-pushed the feat(flow)/add-ranking-context-assets branch from 197b913 to 9c3c919 Compare June 2, 2026 10:46
@claude
Copy link
Copy Markdown

claude Bot commented Jun 2, 2026

PR Review: feat(flow): add ranking context assets

Overall this is a clean addition that follows existing patterns well. A few issues worth addressing before merging.


🔴 Critical

Mustache templates not updated

CLAUDE.md explicitly requires: "if there are any changes that have to be made in those files you MUST also update the mustache files under openapi/templates."

The three auto-generated model files were modified directly:

  • src/rapidata/api_client/models/create_flow_item_endpoint_input.py
  • src/rapidata/api_client/models/get_flow_item_by_id_endpoint_output.py
  • src/rapidata/api_client/models/query_flow_items_endpoint_output.py

Without corresponding openapi/templates updates, these changes will be silently overwritten the next time the client is regenerated from the OpenAPI spec.


🟡 Medium

Documentation not updated (docs/flows.md)

CLAUDE.md requires docs updates for any interface change. The create_new_flow_batch method gained a new context_assets parameter, but docs/flows.md only shows context. A usage example (similar to the existing context example) is missing:

flow_item = flow.create_new_flow_batch(
    datapoints=[...],
    context_assets=["https://example.com/reference.jpg"],  # not documented
)

Pyright not run

CLAUDE.md requires pyright src/rapidata/rapidata_client to pass before merging. The PR notes this couldn't be run in the container — this should be verified in CI or locally before the PR lands.

Missing nullable: true on contextAsset in the OpenAPI schemas

All sibling optional fields carry nullable: true (e.g., context, timeToLiveInSeconds, drainDurationInSeconds). The new contextAsset field is missing it in all three schema files (flow.openapi.json, rapidata.openapi.json, rapidata.filtered.openapi.json). If the server ever needs to explicitly null the field, this will require another schema update.


🔵 Minor

No guard against passing both context and context_assets

A caller can currently supply both simultaneously with no feedback. The docstring doesn't clarify the relationship between the two. At minimum, the docstring should say whether they are independent or mutually exclusive.

Inconsistent to_dict() null-handling vs existing contextAsset fields

Existing models with a contextAsset field (e.g., i_workflow_config_ranking_workflow_config.py) include an explicit null-propagation guard:

if self.context_asset is None and "context_asset" in self.model_fields_set:
    _dict['contextAsset'] = None

The three new models omit this. If contextAsset is intentionally non-nullable on the wire that omission is fine, but it should be consistent or the asymmetry should be noted.

AssetUploader instantiated per call

# rapidata_flow.py:96
context_asset_input = (
    AssetUploader(self._openapi_service).upload_and_map_asset(context_assets)
    if context_assets
    else None
)

AssetUploader uses class-level caches so this is functionally correct, but it creates a throwaway instance. The rest of the upload path in create_new_flow_batch could share a single uploader instance (like how RapidataDataset.add_datapoints is called on an existing dataset instance).

Dataset left orphaned on context-asset upload failure

If the dataset and datapoints upload succeed but upload_and_map_asset raises (e.g., network error on the context asset), the dataset is already created with no cleanup. This is consistent with the existing accept_failed_uploads behavior, but worth noting in the docstring or handling explicitly.


✅ Looks Good

  • Validation for 1–10 context assets is correct and fires before the upload.
  • The upload_and_map_asset list path correctly wraps multiple assets in MultiAssetInput and a single-element list in ExistingAssetInput (consistent with AssetMapper).
  • Schema additions in all three OpenAPI files are consistent with each other.
  • from __future__ import annotations and TYPE_CHECKING conventions are followed correctly.

@LinoGiger LinoGiger merged commit a19b58a into main Jun 2, 2026
3 checks passed
@LinoGiger LinoGiger deleted the feat(flow)/add-ranking-context-assets branch June 2, 2026 11:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants