Enable internal text embedding API #3441
Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
…configuration Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
… and telemetry integration Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
…ate empty embeddings Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
…oint/health sub-objects Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
…orization Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
…lthCheckConfig in converter
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…dding controller tests, switching the dab schema for the embedding system to default to false
embeddings endpoint is now permanently fixed to /embed with no user-configurable path option. This removes unnecessary configuration surface since the feature has not been released yet, eliminating the need for backward compatibility.
Changes:
- Remove path property from dab.draft.schema.json
- Remove Path, UserProvidedPath, and EffectivePath from EmbeddingsEndpointOptions
- Remove EffectiveEndpointPath from EmbeddingsOptions
- Remove path deserialization from EmbeddingsOptionsConverterFactory
- Remove --runtime.embeddings.endpoint.path CLI option
- Remove path configuration logic from ConfigGenerator
- Remove endpoint path validation from RuntimeConfigValidator
- Update Startup.cs logging to use DEFAULT_PATH constant
- Update all tests to remove path references
Commenter does not have sufficient privileges for PR 3441 in repo Azure/data-api-builder
…om/ajtiwari07/data-api-builder into add-internal-text-embedding-system
}

/// <inheritdoc/>
public async Task<EmbeddingBatchResult> TryEmbedBatchAsync(string[] texts, CancellationToken cancellationToken = default)
nit: This has the checks duplicated inline, which are also present in the method ValidateEmbedBatchRequest. Can we use the validation method here instead?
There is a difference in validations between these methods. I have extracted the validation logic into methods wherever applicable.
/azp run
Azure Pipelines successfully started running 6 pipeline(s).
Aniruddh25 left a comment
The main question is around the need for embedding cache options. Can't we reuse the runtime cache options to decide what we do for embeddings? Does it need to be granular?
"type": "string",
"description": "Cache level (L1 for in-memory only, L1L2 for in-memory + distributed). Defaults to L1.",
"enum": ["L1", "L1L2"],
"default": "L1"
Shouldn't the default cache be L1L2, since we would like the embeddings to be stored in Redis Cache, and Redis is L2?
In this phase we only support L1 via FusionCache. In phase 2 we will enable support for L2 with Redis, which would default to L1 or L1L2 as you have suggested.
"size-chars": {
"type": "integer",
"description": "The size of each chunk in characters.",
"default": 800,
The PR description says the default is 1000. What is the correct default?
Modified the PR description. You can refer to the PRD as the source of truth (SOT): #3331
RuntimeEmbeddingsTimeoutMs = runtimeEmbeddingsTimeoutMs;
// Embeddings Endpoint
RuntimeEmbeddingsEndpointEnabled = runtimeEmbeddingsEndpointEnabled;
RuntimeEmbeddingsEndpointRoles = runtimeEmbeddingsEndpointRoles;
Is EmbeddingsEndpointPath not configurable through dab configure? Why not?
Is EmbeddingsCache not configurable? Same question for the chunking options.
Dimensions: dimensions,
TimeoutMs: timeoutMs,
Endpoint: endpointOptions,
Health: healthOptions);
This doesn't mention cache options, but the schema has cache options. I think we don't need cache options at the embeddings level. The global runtime cache options should be sufficient.
// Copyright (c) Microsoft Corporation.
// Licensed under the MIT License.

using System.Text.Json;
Cache options are not deserialized either. It would be better to simply remove them from the dab.draft.schema.
// Register embedding service if configured and enabled.
// NOTE: IEmbeddingService is only registered when enabled to avoid constructor
// failures when config has empty/placeholder values for disabled embeddings.
// TODO: To support hot-reload for embeddings (toggling enabled on/off at runtime),
Has a separate issue been created to track this?
}

// Check if embedding service is available
if (_embeddingService is null || !_embeddingService.IsEnabled)
Whether embeddingService is enabled depends on whether embeddingOptions.Enabled is set, so why check it here again?
private readonly HttpClient _httpClient;
private readonly EmbeddingsOptions _options;
private readonly ILogger<EmbeddingService> _logger;
private readonly IFusionCache _cache;
Do we plan to use Redis Cache in a future PR?
Summary
This PR adds configurable text chunking capabilities to the embeddings API, enabling automatic text segmentation before embedding generation. This feature supports both single-text and multi-document batch processing with runtime configuration and query parameter overrides.
Changes
Configuration
- Added EmbeddingsChunkingOptions.cs - Configuration model for chunking behavior
  - Enabled (bool) - Enable/disable chunking
  - SizeChars (int) - Chunk size in characters (default: 800)
  - OverlapChars (int) - Overlap between chunks (default: 100)
  - EffectiveSizeChars property ensures minimum valid chunk size
- Modified EmbeddingsOptions.cs - Added Chunking property and IsChunkingEnabled helper
- Removed EmbeddingsCacheOptions.cs - Simplified configuration by removing unused cache feature
API Enhancements
- Modified Controllers/EmbeddingController.cs
  - Auto-detects request type (single text vs. document array)
  - Implements overlapping text chunking algorithm
  - Supports query parameter overrides: $chunking.enabled, $chunking.size-chars, $chunking.overlap-chars
  - Returns multiple embeddings per document when chunking is enabled
- Added Models/EmbedDocumentRequest.cs - Request model for document arrays
- Added Models/EmbedDocumentResponse.cs - Response model with chunked embeddings
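The overlapping chunking mentioned above can be illustrated with a minimal sketch. This is not the controller's C# implementation: it is a Python illustration, the function name `chunk_text` is hypothetical, and clamping an overlap that is larger than the chunk size is an assumption about how that edge case might be handled. Only the defaults (800 and 100 characters) come from the PR description.

```python
def chunk_text(text: str, size_chars: int = 800, overlap_chars: int = 100) -> list[str]:
    """Split text into fixed-size chunks where consecutive chunks share an overlap.

    Hypothetical helper for illustration only; not the PR's C# code.
    """
    if not text:
        return []

    # Assumption: a non-positive size falls back to a minimum of 1 character.
    size = max(1, size_chars)
    # Assumption: the overlap is clamped below the chunk size so the window always advances.
    overlap = min(max(0, overlap_chars), size - 1)
    step = size - overlap

    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        # Stop once the current window has reached the end of the text.
        if start + size >= len(text):
            break
        start += step
    return chunks


if __name__ == "__main__":
    for chunk in chunk_text("abcdefghij", size_chars=4, overlap_chars=1):
        print(chunk)
```

Each chunk starts `size - overlap` characters after the previous one, so the last `overlap` characters of a chunk reappear at the start of the next. Note that Python counts Unicode code points while .NET strings count UTF-16 code units, so chunk boundaries can differ for text outside the Basic Multilingual Plane.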
Schema (schemas)
Modified dab.draft.schema.json - Added chunking configuration schema with validation rules
Testing (UnitTests)
Added EmbeddingsChunkingOptionsTests.cs (13 tests) - Configuration validation
Added ChunkTextTests.cs (21 tests) - Chunking algorithm validation including edge cases
Modified EmbeddingControllerTests.cs (+18 tests) - API endpoint tests for chunking and document arrays
Total Test Coverage: 72 tests (48 existing + 24 new) - All passing
Testing
All 72 unit tests passing
Edge cases covered: empty text, very small chunks, overlap larger than chunk size, Unicode text
Query parameter parsing validated
Backward compatibility verified
Breaking Changes
None - This is a backward-compatible addition. Existing single-text requests continue to work without modification.