Draft
8 changes: 8 additions & 0 deletions runpodctl/reference/runpodctl-serverless.mdx
@@ -65,6 +65,10 @@ Create a new Serverless endpoint from a template or from a Hub repo:
# Create from a template
runpodctl serverless create --name "my-endpoint" --template-id "tpl_abc123"

# Create from a template with a model reference (GPU compute only)
runpodctl serverless create --template-id "tpl_abc123" --gpu-id ADA_24 \
--model-reference https://example.com/models/llama:v1

# Create from a Hub repo
runpodctl hub search vllm # Find the hub ID
runpodctl serverless create --hub-id cm8h09d9n000008jvh2rqdsmb --name "my-vllm"
@@ -159,6 +163,10 @@ Execution timeout in seconds. Jobs that exceed this duration are terminated. The
Environment variable in `KEY=VALUE` format. Use multiple `--env` flags to set multiple variables. When deploying from `--hub-id`, these values override the Hub release defaults.

Added documentation for the `--model-reference` flag, based on PR #276 in runpodctl. That PR adds the flag to `cmd/serverless/create.go` with two constraints: it works only with `--template-id` (not `--hub-id`), and it requires the GPU compute type.

Source: runpod/runpodctl#276

</ResponseField>

<ResponseField name="--model-reference" type="string">
URL of a model reference to attach to the endpoint. Repeat the `--model-reference` flag to attach multiple models. Supported only with `--template-id` (not `--hub-id`), and the endpoint must use the GPU compute type.
</ResponseField>
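
The following is a minimal sketch of repeating `--model-reference` for several models. Only the flag names come from this page. The endpoint name, the second model URL, and the combination of `--name` with `--gpu-id` are assumptions made for illustration. The script builds the argument list in a Bash array and prints the resulting command instead of running it, so you can review it first:

```bash
#!/usr/bin/env bash
# Sketch only: the model URLs and endpoint name are placeholders, not real resources.
set -euo pipefail

models=(
  "https://example.com/models/llama:v1"
  "https://example.com/models/mistral:v1"   # hypothetical second model
)

# Base command. --model-reference is documented as working only with
# --template-id (not --hub-id) and a GPU compute type.
args=(serverless create --name "my-endpoint" --template-id "tpl_abc123" --gpu-id ADA_24)

# Append one --model-reference flag per model.
for model in "${models[@]}"; do
  args+=(--model-reference "$model")
done

# Print the command for review. To execute it, run: runpodctl "${args[@]}"
cmd="runpodctl ${args[*]}"
echo "$cmd"
```

Keeping the arguments in an array avoids quoting problems if a model URL ever contains characters that the shell would otherwise interpret.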

### Update an endpoint

Update endpoint configuration: