5 changes: 5 additions & 0 deletions apigw-lambda-opensearch-serverless-nextgen/.cfnlintrc
@@ -0,0 +1,5 @@
configure_rules:
  E3030:
    # python3.14 is valid but not yet in cfn-lint's schema
    exceptions:
      - python3.14
10 changes: 10 additions & 0 deletions apigw-lambda-opensearch-serverless-nextgen/.checkov.yaml
@@ -0,0 +1,10 @@
# Checkov suppressions for sample application
# These controls are intentionally skipped as this is a demonstration/sample project.
# Production workloads should implement these controls.

skip-checks:
- CKV_AWS_115 # Lambda reserved concurrency — not required for sample application
- CKV_AWS_116 # Lambda DLQ — not required for sample application with synchronous API handlers
- CKV_AWS_117 # Lambda in VPC — not required for sample application
- CKV_AWS_158 # CloudWatch LogGroup KMS encryption — not required for sample application log data
- CKV_AWS_173 # Lambda env var encryption — no secrets stored, only configuration values in sample application
49 changes: 49 additions & 0 deletions apigw-lambda-opensearch-serverless-nextgen/.gitignore
@@ -0,0 +1,49 @@
# AWS SAM
.aws-sam/
packaged.yaml

# Python
__pycache__/
*.py[cod]
*$py.class
*.so
*.egg-info/
*.egg
dist/
build/
.eggs/

# Virtual environments
.venv/
venv/
ENV/

# IDE
.idea/
.vscode/
*.swp
*.swo
*~

# OS
.DS_Store
Thumbs.db

# Environment variables
.env
.env.local

# Coverage & testing
htmlcov/
.coverage
.coverage.*
.pytest_cache/
.mypy_cache/

# Distribution
*.whl

# Local Config
mise.local.toml
.kiro
blog/
176 changes: 176 additions & 0 deletions apigw-lambda-opensearch-serverless-nextgen/README.md
@@ -0,0 +1,176 @@
# Amazon API Gateway to AWS Lambda to Amazon OpenSearch Serverless NextGen

This pattern deploys a serverless semantic search API using Amazon API Gateway, AWS Lambda, and Amazon OpenSearch Serverless with the NextGen architecture. All three services operate on a pay-per-use model with no minimum baseline cost, meaning the entire stack incurs zero compute charges when idle. You pay only for storage of indexed data.

Learn more about this pattern at Serverless Land Patterns: https://serverlessland.com/patterns/apigw-lambda-opensearch-serverless-nextgen

Important: this application uses various AWS services, and there are costs associated with these services beyond Free Tier usage. See the [AWS Pricing page](https://aws.amazon.com/pricing/) for details. You are responsible for any AWS costs incurred. No warranty is implied in this example.

## Requirements

* [Create an AWS account](https://portal.aws.amazon.com/gp/aws/developer/registration/index.html) if you do not already have one and log in. The IAM user that you use must have sufficient permissions to make necessary AWS service calls and manage AWS resources.
* [AWS CLI installed and configured](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html)
* [Git installed](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git)
* [AWS SAM CLI](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/serverless-sam-cli-install.html) installed
* [Python 3.14](https://www.python.org/downloads/)

**Region availability:** This pattern uses OpenSearch Serverless AI connectors and hybrid search, which are available in the following regions: US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Mumbai), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Ireland), Europe (Spain), and Europe (Stockholm). See the [launch announcement](https://aws.amazon.com/about-aws/whats-new/2025/08/amazon-opensearch-serverless-ai-connectors-hybrid-search/) for details.


## Deployment Instructions

1. Create a new directory, navigate to that directory in a terminal and clone the GitHub repository:
    ```
    git clone https://github.com/aws-samples/serverless-patterns
    ```
1. Change directory to the pattern directory:
    ```
    cd serverless-patterns/apigw-lambda-opensearch-serverless-nextgen
    ```
1. Build the application:
    ```
    sam build
    ```
1. Deploy the application:
    ```
    sam deploy --guided
    ```
1. During the prompts:
    * Enter a stack name
    * Enter the desired AWS Region
    * Accept the default parameter values or customize them
    * Allow SAM CLI to create IAM roles with the required permissions

After you have run `sam deploy --guided` once and saved the arguments to a configuration file (`samconfig.toml`), you can run `sam deploy` in future to reuse those defaults.


## How it works

![Architecture diagram](images/architecture.png)

Figure 1 - Architecture

This pattern creates a REST API backed by three AWS Lambda functions that interact with an OpenSearch Serverless NextGen collection configured for vector search:

1. The client sends an HTTPS request (SigV4-signed) to Amazon API Gateway with IAM authorization.
2. API Gateway routes the request to the appropriate Lambda function based on path: Search (`POST /search`), Index (`POST /index`), or Delete (`DELETE /documents`).
3. The Lambda function calls the OpenSearch Serverless collection — performing a neural/lexical/hybrid query, bulk indexing via the ingest pipeline, or a bulk delete.
4. For semantic and hybrid search, and during document indexing, the OpenSearch ML model calls Amazon Bedrock (Amazon Titan Text Embeddings V2) to generate 1024-dimensional embeddings server-side.
5. For hybrid search, the search pipeline applies min-max score normalization to combine BM25 (lexical) and k-NN (semantic) results with configurable weights (0.3 lexical / 0.7 semantic).
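
The score combination in step 5 can be illustrated with a small, self-contained Python sketch. It is not the code OpenSearch runs: it assumes a weighted sum of min-max-normalized scores with the 0.3 / 0.7 weights quoted above, and the function names and sample scores are hypothetical. OpenSearch's own normalization may differ in details such as how the lowest-scoring document is treated.

```python
def min_max_normalize(scores):
    """Rescale a {doc_id: raw_score} mapping into the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal; treat every document as a full match.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (score - lo) / (hi - lo) for doc_id, score in scores.items()}


def combine_hybrid(lexical, semantic, w_lexical=0.3, w_semantic=0.7):
    """Merge BM25 and k-NN scores into one ranked list (illustrative only)."""
    lex = min_max_normalize(lexical)
    sem = min_max_normalize(semantic)
    combined = {}
    for doc_id in set(lex) | set(sem):
        # A document missing from one result set contributes 0 for that signal.
        combined[doc_id] = (
            w_lexical * lex.get(doc_id, 0.0) + w_semantic * sem.get(doc_id, 0.0)
        )
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical raw scores: BM25 is unbounded, k-NN similarity is not.
    bm25 = {"a": 10.0, "b": 5.0, "c": 0.0}
    knn = {"a": 0.2, "b": 0.8, "d": 0.5}
    for doc_id, score in combine_hybrid(bm25, knn):
        print(doc_id, round(score, 3))
```

Normalizing first matters because BM25 and k-NN scores live on different scales; without it, the weights would not express the intended 30/70 balance.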

The OpenSearch collection lives inside a NextGen collection group, which enables scale-to-zero behavior. When idle, both indexing and search OCUs (OpenSearch Compute Units) drop to zero. When a request arrives, capacity provisions in approximately 10 seconds. Requests are queued (not dropped) during this window.

The NextGen collection group is created using a Lambda-backed custom resource since CloudFormation doesn't yet natively support the `Generation` parameter.
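
A Lambda-backed custom resource follows a fixed request/response protocol with CloudFormation. The sketch below shows only that protocol, using the standard library; it is not the handler shipped with this pattern. `create_collection_group` is a hypothetical placeholder, because the actual API call used to create the collection group is not shown in this README, and Update and Delete handling is omitted.

```python
import json
import urllib.request


def build_cfn_response(event, status, physical_id, reason="", data=None):
    """Build the JSON document CloudFormation expects from a custom resource."""
    return {
        "Status": status,  # "SUCCESS" or "FAILED"
        "Reason": reason or "See CloudWatch Logs for details",
        "PhysicalResourceId": physical_id,
        "StackId": event["StackId"],
        "RequestId": event["RequestId"],
        "LogicalResourceId": event["LogicalResourceId"],
        "Data": data or {},
    }


def send_cfn_response(event, body):
    """PUT the response to the pre-signed URL supplied in the event."""
    payload = json.dumps(body).encode("utf-8")
    request = urllib.request.Request(
        event["ResponseURL"],
        data=payload,
        method="PUT",
        headers={"Content-Type": "", "Content-Length": str(len(payload))},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status


def create_collection_group(properties):
    """HYPOTHETICAL placeholder: call the service API and return the group ID."""
    raise NotImplementedError("replace with the real collection group call")


def handler(event, context, create=create_collection_group, send=send_cfn_response):
    """Entry point; `create` and `send` are injectable so the flow can be tested."""
    physical_id = event.get("PhysicalResourceId", "collection-group-pending")
    try:
        if event["RequestType"] == "Create":
            physical_id = create(event["ResourceProperties"])
        # Update and Delete are acknowledged without action in this sketch.
        body = build_cfn_response(event, "SUCCESS", physical_id)
    except Exception as exc:
        # Always report failures; otherwise the stack waits until it times out.
        body = build_cfn_response(event, "FAILED", physical_id, reason=str(exc))
    send(event, body)
    return body
```

The important design point is that the handler always sends a response, including on failure, so the stack operation does not hang waiting for a reply.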

### Scale-to-zero in action

The chart below shows OCU metrics from CloudWatch during a test run:

![CloudWatch metrics showing Search and Indexing OCUs scaling from zero, handling traffic, then returning to zero](images/search-acu-scaling.png)

*Figure 2 — Two test runs separated by a period of no activity. Search OCUs scale 0 → 2 during queries, Indexing OCUs scale 0 → 1 during document ingestion. Both return to 0 after the idle timeout.*

The pattern:
1. **Idle** — Both indexing and search OCUs sit at 0. No compute cost.
2. **Traffic arrives** — First request triggers provisioning (~10 seconds). Requests are queued during this window.
3. **Active** — OCUs scale up to match demand, up to the configured maximum.
4. **Traffic subsides** — After 10 minutes of no requests, OCUs scale back to 0.

## Testing

Install the test dependencies:

```bash
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
pip install -r tests/requirements.txt
```

### Unit tests

Run the unit tests (no deployed stack or AWS credentials required):

```bash
pytest tests/unit/ -v
```

### Integration tests

The repository includes integration tests that exercise all three search modes against a 50-product outdoor equipment catalog:

```bash
# Set your stack name and region
export STACK_NAME="your-stack-name"
export AWS_REGION="your-region"

# Run integration tests (requires a deployed stack)
pytest tests/integration/ -v -s
```

The tests demonstrate semantic understanding: `"shoes for the beach"` matches "Summer Beach Sandals" (no keyword overlap), `"charging phone while camping"` matches "Solar Power Bank" (intent matching), and hybrid mode combines both signals for queries like `"waterproof bag for kayaking"` → "Dry Bag 20L".
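
For context, a hybrid request of the kind these tests exercise could be expressed in the OpenSearch query DSL roughly as follows. This is a sketch, not the query built by the Search function: the field names `content` and `content_embedding`, the helper name, and the model ID value are assumptions. The lexical sub-query is listed first on the assumption that pipeline weights are applied in sub-query order (0.3 lexical, then 0.7 semantic).

```python
import json


def build_hybrid_query(text, model_id, k=10,
                       text_field="content", vector_field="content_embedding"):
    """Return a hybrid query body combining a BM25 match and a neural query.

    Field names are hypothetical; adjust them to the index mapping in use.
    """
    return {
        "size": k,
        "query": {
            "hybrid": {
                "queries": [
                    # Lexical (BM25) signal on the raw text field.
                    {"match": {text_field: {"query": text}}},
                    # Semantic signal: the query text is embedded server-side.
                    {
                        "neural": {
                            vector_field: {
                                "query_text": text,
                                "model_id": model_id,
                                "k": k,
                            }
                        }
                    },
                ]
            }
        },
    }


if __name__ == "__main__":
    body = build_hybrid_query("waterproof bag for kayaking", "example-model-id")
    print(json.dumps(body, indent=2))
```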

### Manual testing with awscurl

Install the project dependencies (includes `awscurl`):

```bash
pip install -r requirements.txt
```

Set your stack name and region (if not already set):

```bash
STACK_NAME="your-stack-name"
AWS_REGION="your-region"
```

Index a document:

```bash
awscurl --service execute-api --region $AWS_REGION -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "documents": [{
      "id": "doc-1",
      "title": "OpenSearch Serverless NextGen",
      "content": "The next generation architecture scales to zero and provisions in seconds."
    }]
  }' \
  "$(aws cloudformation describe-stacks --stack-name $STACK_NAME --region $AWS_REGION --query 'Stacks[0].Outputs[?OutputKey==`IndexApiUrl`].OutputValue' --output text)"
```
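
The command substitution at the end of that call resolves the endpoint URL from the stack outputs. The same lookup can be sketched in Python. The helper names are hypothetical, and `boto3` is a third-party dependency that is imported lazily so the pure lookup function can be used without it.

```python
def find_stack_output(describe_stacks_response, output_key):
    """Return the OutputValue for `output_key` from a describe_stacks response."""
    outputs = describe_stacks_response["Stacks"][0].get("Outputs", [])
    for output in outputs:
        if output["OutputKey"] == output_key:
            return output["OutputValue"]
    raise KeyError(f"Stack output {output_key!r} not found")


def get_stack_output(stack_name, region, output_key):
    """Fetch a stack output from CloudFormation (requires AWS credentials)."""
    import boto3  # third-party; imported here so find_stack_output stays stdlib-only

    client = boto3.client("cloudformation", region_name=region)
    response = client.describe_stacks(StackName=stack_name)
    return find_stack_output(response, output_key)
```

For example, `get_stack_output("your-stack-name", "your-region", "IndexApiUrl")` mirrors the JMESPath query used with the AWS CLI.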

Search for it:

```bash
awscurl --service execute-api --region $AWS_REGION -X POST \
-H "Content-Type: application/json" \
-d '{"query": "serverless scaling", "mode": "hybrid"}' \
"$(aws cloudformation describe-stacks --stack-name $STACK_NAME --region $AWS_REGION --query 'Stacks[0].Outputs[?OutputKey==`SearchApiUrl`].OutputValue' --output text)"
```
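
The same search request can be issued from Python instead of `awscurl`. The following is a minimal sketch, assuming `boto3`/`botocore` are installed and credentials are available; the helper names are hypothetical, and the response is assumed to be JSON. The 60-second timeout is an assumption chosen to leave room for the roughly 10-second provisioning delay after an idle period.

```python
import json
import urllib.request


def build_search_body(query, mode="hybrid"):
    """Serialize the request body used by POST /search in this README."""
    return json.dumps({"query": query, "mode": mode})


def signed_post(url, body, region, timeout=60):
    """Send a SigV4-signed POST to an IAM-authorized API Gateway endpoint."""
    # Third-party imports are kept local so build_search_body stays stdlib-only.
    import boto3
    from botocore.auth import SigV4Auth
    from botocore.awsrequest import AWSRequest

    payload = body.encode("utf-8")
    credentials = boto3.Session().get_credentials()
    aws_request = AWSRequest(
        method="POST",
        url=url,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    # "execute-api" is the signing name for API Gateway, as in the awscurl calls.
    SigV4Auth(credentials, "execute-api", region).add_auth(aws_request)

    http_request = urllib.request.Request(
        url,
        data=payload,
        headers=dict(aws_request.headers.items()),
        method="POST",
    )
    with urllib.request.urlopen(http_request, timeout=timeout) as response:
        return json.loads(response.read())
```

A call such as `signed_post(search_url, build_search_body("serverless scaling"), "your-region")` corresponds to the hybrid search shown above, where `search_url` is the `SearchApiUrl` stack output.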

Delete a document:

```bash
awscurl --service execute-api --region $AWS_REGION -X DELETE \
-H "Content-Type: application/json" \
-d '{"ids": ["doc-1"]}' \
"$(aws cloudformation describe-stacks --stack-name $STACK_NAME --region $AWS_REGION --query 'Stacks[0].Outputs[?OutputKey==`DeleteApiUrl`].OutputValue' --output text)"
```

> **Note:** The first request after an idle period takes approximately 10 seconds while OpenSearch provisions compute from zero. Subsequent requests respond at normal latency.

## Cleanup

> **Warning:** This will permanently delete all indexed documents in the OpenSearch collection. Back up any data you need to retain before proceeding.

1. Delete the stack:
```bash
sam delete
```

This removes all resources including the API Gateway, Lambda functions, OpenSearch collection, collection group, security policies, IAM roles, and CloudWatch log groups.

----
Copyright 2026 Amazon.com, Inc. or its affiliates. All Rights Reserved.

SPDX-License-Identifier: MIT-0
68 changes: 68 additions & 0 deletions apigw-lambda-opensearch-serverless-nextgen/example-pattern.json
@@ -0,0 +1,68 @@
{
"title": "API Gateway to Lambda to OpenSearch Serverless NextGen",
"description": "Deploy a serverless semantic search API with zero baseline compute cost using Lambda and OpenSearch Serverless NextGen (scale-to-zero).",
"language": "Python",
"level": "300",
"framework": "SAM",
"introBox": {
"headline": "How it works",
"text": [
"This pattern deploys an Amazon API Gateway REST API backed by three AWS Lambda functions that perform semantic, lexical, and hybrid search against an Amazon OpenSearch Serverless NextGen collection.",
"Amazon OpenSearch Serverless NextGen scales compute to zero when idle and provisions in approximately 10 seconds when traffic arrives. Combined with Lambda's own scale-to-zero, the entire stack incurs zero compute cost when not in use.",
"Embeddings are generated server-side by an OpenSearch ML model connected to Amazon Bedrock (Amazon Titan Text Embeddings V2) — Lambda functions send and receive plain text only.",
"A hybrid search pipeline applies min-max score normalization to combine BM25 (lexical) and k-NN (semantic) results with configurable weights."
]
},
"gitHub": {
"template": {
"repoURL": "https://github.com/aws-samples/serverless-patterns/tree/main/apigw-lambda-opensearch-serverless-nextgen",
"templateURL": "serverless-patterns/apigw-lambda-opensearch-serverless-nextgen",
"projectFolder": "apigw-lambda-opensearch-serverless-nextgen",
"templateFile": "template.yaml"
}
},
"resources": {
"bullets": [
{
"text": "Introducing the next generation of Amazon OpenSearch Serverless",
"link": "https://aws.amazon.com/blogs/aws/introducing-the-next-generation-of-amazon-opensearch-serverless-for-building-your-agentic-ai-applications/"
},
{
"text": "Amazon OpenSearch Serverless",
"link": "https://docs.aws.amazon.com/opensearch-service/latest/developerguide/serverless.html"
},
{
"text": "OpenSearch neural search",
"link": "https://opensearch.org/docs/latest/search-plugins/neural-search/"
},
{
"text": "Amazon Titan Text Embeddings V2",
"link": "https://docs.aws.amazon.com/bedrock/latest/userguide/titan-embedding-models.html"
}
]
},
"deploy": {
"text": [
"sam build",
"sam deploy --guided"
]
},
"testing": {
"text": [
"See the GitHub repo for detailed testing instructions."
]
},
"cleanup": {
"text": [
"Delete the stack: <code>sam delete --stack-name STACK_NAME</code>."
]
},
"authors": [
{
"name": "Pete Davis",
"image": "https://github.com/pjdavis-aws.png",
"bio": "Senior Partner Solution Architect at AWS",
"linkedin": "peter-davis-2676585"
}
]
}