Pydantic Agents

Render Developer Q&A Assistant showcasing observable AI with Pydantic Agents, Pydantic Embedder, Logfire, and Render Workflows

This is an intelligent question-answering system that demonstrates real-world AI observability patterns. The example project shows how to build, instrument, and monitor a multi-stage LLM pipeline with full cost tracking, quality evaluation, and performance monitoring.

What This App Does

This is an AI-powered Q&A assistant for Render documentation. Users can ask questions about Render's platform, and the app provides accurate, well-researched answers backed by the official documentation.

User Experience

Ask a question - "How do I deploy a Node.js app on Render?" or "What database plans are available?"
Watch the pipeline - Track progress as the run moves through 8 stages (embedding → retrieval → generation → verification)
Get accurate answers - Receive detailed responses with sources from Render docs
Quality checked - Every answer is verified against the documentation and rated by two AI evaluators (one OpenAI model, one Anthropic model)

Key Features

Hybrid search - Combines semantic understanding with keyword matching for better retrieval
Multi-stage verification - Extracts claims, verifies against docs, checks technical accuracy
Iterative refinement - Automatically regenerates low-quality answers with feedback
Cost tracking - See exactly how much each question costs to answer
Parallel fan-out - The pipeline runs on Render Workflows, fanning out the heaviest stages (technical accuracy + dual-model evaluation) across instances so they execute concurrently
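
To illustrate the hybrid search idea in the list above, here is a minimal sketch that merges a semantic ranking and a keyword ranking with reciprocal rank fusion (RRF). RRF is an assumption chosen for illustration: this README does not say which fusion method the project uses (docs/HYBRID_SEARCH.md covers the actual approach), and the function name and document IDs are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in
    (rank starts at 1), so documents ranked well by both retrievers rise.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score, highest first; break ties by ID for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the vector search and the keyword search.
semantic_hits = ["deploy-node", "web-services", "scaling"]
keyword_hits = ["pricing", "deploy-node", "web-services"]
fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
```

Documents that appear in both lists ("deploy-node", "web-services") outrank documents that only one retriever found, which is the behavior a hybrid retriever is after.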

What This Demonstrates

Render Capabilities

Render Workflows - The Q&A pipeline and ingestion run as durable workflow tasks with per-task retries, timeouts, and cross-instance parallel fan-out
PostgreSQL with pgvector + full-text - Managed hybrid search database
Web Service + Static Site - FastAPI gateway + Next.js frontend
Cron Jobs - Scheduled ingestion refresh that triggers the workflow fan-out
Blueprint deploy + env groups - render.yaml provisions everything; shared config lives in two env groups

Logfire Features

LLM Traces - Complete visibility into every AI call (OpenAI + Anthropic auto-instrumented)
HTTP Tracing - FastAPI auto-instrumentation for request/response tracking
Database Monitoring - AsyncPG auto-instrumentation for query performance
Cost Tracking - Per-stage and per-execution cost attribution with custom metrics
Multi-Model Evals - Dual-rater quality assessment (OpenAI + Anthropic)
Session Tracking - End-to-end user journey with distributed tracing
Custom Metrics - Business-specific metrics (cost, quality, iterations)
SQL Queries - Custom analytics on AI performance

Pydantic Stack

This project is built end-to-end on the Pydantic ecosystem:

Pydantic AI Agents — every pipeline stage (generation, claims extraction, accuracy check, dual-rater evaluation) is a pydantic_ai.Agent with a typed output_type. Multi-provider orchestration (Claude + GPT) runs through OpenAIProvider / AnthropicProvider in a single pipeline. See backend/pipeline/.
Pydantic Embedder — pydantic_ai.Embedder with OpenAIEmbeddingModel powers question embedding (embed_query) and batch claim embedding (embed_documents) for verification. Auto-instrumented by logfire.instrument_pydantic_ai(). See backend/pipeline/embeddings.py and backend/pipeline/verification.py.
Pydantic Models — Claims, accuracy scores, eval dimensions, and pipeline state are parsed directly into Pydantic models (e.g. ClaimsOutput, EvaluationOutput). pydantic-settings manages config in backend/config.py.
Pydantic GenAI Prices — model pricing is loaded dynamically from the pydantic/genai-prices registry, then combined with per-agent token counts from result.usage() to produce per-stage cost attribution. See backend/prices.py.
Logfire — distributed traces, custom metrics, dual-model evals, and cost attribution. Auto-instruments FastAPI, AsyncPG, HTTPX, and Pydantic AI. See backend/observability.py.
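
The per-stage cost attribution described in the Pydantic GenAI Prices item comes down to multiplying token counts by per-token prices and grouping by stage. The sketch below shows only that arithmetic. It does not call genai-prices or Pydantic AI; the model names, prices, and the StageUsage type are hypothetical stand-ins for the registry data and for the token counts from result.usage().

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class StageUsage:
    """Token counts for one pipeline stage (hypothetical shape)."""
    stage: str
    model: str
    input_tokens: int
    output_tokens: int


# HYPOTHETICAL prices in USD per million tokens: (input, output).
PRICES = {
    "model-a": (Decimal("3.00"), Decimal("15.00")),
    "model-b": (Decimal("0.50"), Decimal("2.00")),
}
MILLION = Decimal(1_000_000)


def stage_cost(usage: StageUsage) -> Decimal:
    """Cost of one stage: tokens times price, input and output priced separately."""
    input_price, output_price = PRICES[usage.model]
    return (usage.input_tokens * input_price + usage.output_tokens * output_price) / MILLION


def cost_by_stage(usages: list[StageUsage]) -> dict[str, Decimal]:
    """Sum costs per stage name, so repeated iterations of a stage accumulate."""
    costs: dict[str, Decimal] = {}
    for usage in usages:
        costs[usage.stage] = costs.get(usage.stage, Decimal(0)) + stage_cost(usage)
    return costs


run = [
    StageUsage("generation", "model-a", input_tokens=2_000, output_tokens=500),
    StageUsage("claims_extraction", "model-b", input_tokens=1_000, output_tokens=200),
]
costs = cost_by_stage(run)
total = sum(costs.values(), Decimal(0))
```

Decimal is used instead of float so that small per-request amounts add up without rounding drift.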

Architecture

The frontend connects to a backend FastAPI gateway that triggers a Render Workflows run and polls it for the result. The 8-stage pipeline and ingestion execute as workflow tasks that fan out across instances.

┌─────────────────────────────────────────────────────────────┐
│  Frontend (Next.js + TypeScript)                            │
│  Deployed as: Render Static Site                            │
│  - Question input UI                                        │
│  - Progress via polling (POST /ask → poll GET /ask/{id})    │
│  - Answer display with metrics                              │
└─────────────────────────────────────────────────────────────┘
                          ↓ HTTPS
┌─────────────────────────────────────────────────────────────┐
│  API Gateway (FastAPI + Logfire)                            │
│  Deployed as: Render Web Service (Python 3.13)              │
│  - POST /ask        → start_task("…/run_qa_pipeline")       │
│  - GET  /ask/{id}   → get_task_run(id) (poll status/result) │
│  - /health, /history, /stats, /sessions/{id}/logs           │
└─────────────────────────────────────────────────────────────┘
            ↓ Render SDK (start_task / get_task_run)
┌─────────────────────────────────────────────────────────────┐
│  Render Workflows service  (Python 3.13)                    │
│  Orchestrator: run_qa_pipeline                              │
│  ┌────────────────────────────────────────────────────────┐ │
│  │ [1] Question Embedding      (OpenAI)        in-process  │ │
│  │ [2] RAG Document Retrieval  (pgvector+BM25) in-process  │ │
│  │ [3] Answer Generation       (Claude)        ⟶ subtask   │ │
│  │ [4] Claims Extraction       (GPT)           ⟶ subtask   │ │
│  │ [5] Claims Verification     (RAG again)     ⟶ subtask   │ │
│  │ [6] Technical Accuracy      (Claude)    ┐               │ │
│  │ [7] Quality Rating          (OpenAI+    ├─ 3 parallel   │ │
│  │                              Anthropic) ┘   subtasks    │ │
│  │ [8] Quality Gate            (Pass or Iterate) in-process │ │
│  └────────────────────────────────────────────────────────┘ │
│  Ingestion: ingest_all → ingest_core, then 6 add_* in       │
│             parallel (replaces the old serial preDeploy)    │
└─────────────────────────────────────────────────────────────┘
            ↓                                    ↓
┌──────────────────────┐           ┌───────────────────────────┐
│  PostgreSQL          │           │  Logfire                  │
│  (Render Managed)    │           │  (Pydantic)               │
│  - pgvector ext      │           │  - Distributed traces     │
│  - RAG embeddings    │           │  - Cost attribution       │
│  - Full-text search  │           │  - Quality metrics        │
└──────────────────────┘           │  - Custom dashboards      │
                                   └───────────────────────────┘

  Cron (daily) ─ start_task("…/ingest_all") ─▶ Workflows service
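
The trigger-then-poll flow in the diagram (POST /ask, then repeated GET /ask/{id}) can be sketched as a small polling loop. This is an illustration under assumptions, not the frontend's or gateway's code: the status strings ("running", "completed", "failed") and the response shape are hypothetical, and the status fetcher is passed in as a function so the sketch runs without a network.

```python
import time
from typing import Callable


def poll_until_done(
    fetch_status: Callable[[], dict],
    interval_seconds: float = 1.0,
    timeout_seconds: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status until the run reaches a terminal state or times out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        run = fetch_status()
        if run["status"] in {"completed", "failed"}:
            return run
        if time.monotonic() >= deadline:
            raise TimeoutError("workflow run did not finish in time")
        sleep(interval_seconds)


# Simulated responses standing in for repeated GET /ask/{id} calls.
responses = iter([
    {"status": "running"},
    {"status": "running"},
    {"status": "completed", "answer": "example answer"},
])
result = poll_until_done(lambda: next(responses), sleep=lambda _: None)
```

In a real client, fetch_status would wrap the HTTP call, and the interval would usually be long enough to avoid hammering the gateway.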

Why hybrid? Workflows aren't HTTP-facing, so a client (the gateway) triggers tasks via the SDK and reads run status. Stages 1, 2, and 8 are cheap/data-dependent and stay in-process on the orchestrator; only the heavy, independently-retryable LLM stages are promoted to their own tasks. Stages 6 + 7 run as three concurrent subtasks on separate instances. See workflows/app.py.
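
As a local analogy for the three concurrent subtasks in stages 6 + 7, the sketch below runs three independent evaluations at once with asyncio.gather. In the project itself these are Render Workflows subtasks on separate instances (see workflows/app.py); the function names, the sleep standing in for an LLM call, and the scores are all hypothetical.

```python
import asyncio


async def check_technical_accuracy(answer: str) -> dict:
    await asyncio.sleep(0.01)  # stands in for an LLM call
    return {"stage": "technical_accuracy", "score": 0.9}


async def rate_quality(answer: str, rater: str) -> dict:
    await asyncio.sleep(0.01)  # stands in for an LLM call
    return {"stage": f"quality_{rater}", "score": 0.8}


async def run_evaluations(answer: str) -> list[dict]:
    # gather() runs the three coroutines concurrently and returns
    # their results in the order the coroutines were passed in.
    return await asyncio.gather(
        check_technical_accuracy(answer),
        rate_quality(answer, "openai"),
        rate_quality(answer, "anthropic"),
    )


results = asyncio.run(run_evaluations("draft answer"))
```

Because the three evaluations do not depend on each other, total latency is roughly that of the slowest one rather than the sum of all three.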

Project Structure

render-qa-assistant/
├── backend/
│   ├── main.py                    # FastAPI gateway (triggers + polls workflow runs)
│   ├── api/
│   │   └── logs.py                # Logfire logs API endpoint
│   ├── pipeline/                  # 8-stage pipeline implementation (reused by workflows)
│   ├── models.py                  # Pydantic models
│   ├── database.py                # PostgreSQL + pgvector
│   ├── observability.py           # Logfire configuration
│   └── config.py                  # Settings management
├── workflows/                     # Render Workflows service
│   ├── app.py                     # Workflows() instance + all @app.task defs
│   ├── serialization.py           # JSON boundary helpers (model_dump/model_validate)
│   └── trigger_ingest.py          # Cron entrypoint → start_task("…/ingest_all")
├── frontend/
│   ├── src/                       # Next.js + TypeScript UI
│   └── package.json
├── data/
│   ├── embeddings/                # Pre-embedded documentation
│   └── scripts/                   # Data ingestion scripts
├── docs/
│   ├── PIPELINE.md                # Detailed pipeline guide
│   ├── OBSERVABILITY.md           # Logfire instrumentation guide
│   ├── CONFIGURATION.md           # Configuration reference
│   └── HYBRID_SEARCH.md           # Hybrid search deep-dive
├── pyproject.toml                 # Python dependencies (uv)
├── uv.lock                        # Locked dependency versions
├── .python-version                # Pins Python to 3.13
├── render.yaml                    # Infrastructure as code
├── .env.example                   # Environment variables template
└── README.md                      # This file

Quick Start

Prerequisites

uv (manages Python 3.13 automatically)
Node.js 18+
PostgreSQL 16+ (with pgvector extension)
OpenAI API key
Anthropic API key
Logfire account — sign in at logfire.pydantic.dev, create a project (US region), then:
1. Settings → Write Tokens → create a token → LOGFIRE_TOKEN in .env
2. Settings → Read Tokens → create a token → LOGFIRE_READ_TOKEN in .env
3. View traces in the Live panel under your project

Local Development (with Make)

# 1. Install everything (uv installs Python 3.13 automatically)
make install

# 2. Set up .env file (copy from example and fill in your keys)
cp .env.example .env

# 3. Start database
make db-start

# 4. Load documentation (this step might take a while!)
make ingest

# 5. Run backend (in one terminal)
make run-backend

# 6. Run frontend (in another terminal)
make run-frontend

Asking questions locally requires a running Workflows service. POST /ask delegates to it; with nothing to delegate to, the gateway returns 503 WORKFLOW_SLUG is not configured. The service does not have to be deployed, though: you can run the whole stack locally, with no Render cloud resources and no API key, by starting a local workflow dev server:
# Terminal 1 — local workflow dev server (loads .env, listens on :8120)
render workflows dev -- uv run render-workflows workflows.app:app

# Terminal 2 — gateway pointed at the local dev server
RENDER_USE_LOCAL_DEV=true WORKFLOW_SLUG=local \
  uv run uvicorn backend.main:app --reload --port 8000

# Terminal 3 — frontend
cd frontend && npm run dev
RENDER_USE_LOCAL_DEV=true points the SDK at http://localhost:8120 instead of Render's cloud (no token needed); WORKFLOW_SLUG=local just satisfies the gateway's guard. Set both in .env to skip the prefixes. The workflow runs against your local DATABASE_URL, so History populates normally. (Alternatively, set RENDER_API_KEY + the real WORKFLOW_SLUG to target a deployed cloud Workflows service, which writes to its own database.)
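
The choice between the local dev server and a deployed Workflows service can be summarized as a small decision function. This is a hypothetical illustration of the rules described above, not the gateway's or the SDK's code: the function name, the returned dictionary, the accepted truthy strings, and the use of RuntimeError (the gateway itself responds with a 503) are assumptions.

```python
def resolve_workflow_target(env: dict[str, str]) -> dict[str, str]:
    """Decide where workflow tasks would be sent, based on environment values."""
    slug = env.get("WORKFLOW_SLUG", "").strip()
    if not slug:
        # Mirrors the gateway's guard described in the text.
        raise RuntimeError("WORKFLOW_SLUG is not configured")

    use_local = env.get("RENDER_USE_LOCAL_DEV", "").strip().lower() in {"1", "true", "yes"}
    if use_local:
        # Local dev server: no API key needed.
        return {"mode": "local", "base_url": "http://localhost:8120", "slug": slug}

    if not env.get("RENDER_API_KEY", "").strip():
        raise RuntimeError("RENDER_API_KEY is required to target a deployed Workflows service")
    return {"mode": "cloud", "slug": slug}


local = resolve_workflow_target({"RENDER_USE_LOCAL_DEV": "true", "WORKFLOW_SLUG": "local"})
cloud = resolve_workflow_target(
    {"RENDER_API_KEY": "placeholder", "WORKFLOW_SLUG": "pydantic-agents-workflow"}
)
```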

Local config → deployed env groups. Locally every process reads one .env (copied from .env.example). On deploy that same config splits into the two Render env groups in render.yaml — see Deploy → Environment groups. Local-only knobs (RENDER_USE_LOCAL_DEV, the Docker DATABASE_URL) aren't in any group; in the cloud the SDK uses the platform socket and DATABASE_URL is injected from the database.

make ingest runs the full ingestion pipeline: bulk doc embeddings, plus the curated "special pages" that are explicitly injected into the RAG context (pricing, AI agent, autoscaling, Node.js). To reload just one of those after editing its script, use the per-target shortcuts:

make add-pricing      # render.com/pricing tables
make add-ai-agent     # render.com/tutorials/agents-on-render-workflows (AI agents → Render Workflows)
make add-autoscaling  # render.com/docs/scaling
make add-nodejs       # render.com/docs/deploy-node-express-app

Access locally: the gateway listens on http://localhost:8000 when started with the uvicorn command above.

Deploy to Render

1. Set up a Logfire account

Before clicking the deploy button, sign in at logfire.pydantic.dev, create a project (US region), and generate two tokens:

Preferences → Write Tokens → create token → save as LOGFIRE_TOKEN
Preferences → Read Tokens → create token → save as LOGFIRE_READ_TOKEN

You'll paste both into the Render Dashboard in step 3.

2. One-click deploy

Render reads render.yaml and provisions:

PostgreSQL database with pgvector (pydantic-agents-workflows-db)
API gateway web service (pydantic-agents-workflows-api, FastAPI + Logfire)
Ingestion refresh cron (pydantic-agents-workflows-ingest, triggers the workflow daily)
Frontend static site (pydantic-agents-workflows-frontend, Next.js)
Two environment groups that hold all shared config (see below)

On Apply, Render prompts once for the secret values in the env groups (OPENAI_API_KEY, ANTHROPIC_API_KEY, both Logfire tokens, and — left blank for now — RENDER_API_KEY / WORKFLOW_SLUG). You fill these at the group level, not per service.

Environment groups

render.yaml defines two reusable env groups so config lives in one place instead of being duplicated across services:

| Group | Contents | Linked to |
| --- | --- | --- |
| `pydantic-agents-workflows-pipeline` | LLM/Logfire secrets + all pipeline, RAG, and model config (~20 vars) | API gateway and the Workflows service (step 3) |
| `pydantic-agents-workflows-pipeline-trigger` | `RENDER_API_KEY`, `WORKFLOW_SLUG` | API gateway and the ingest cron |

The payoff is pydantic-agents-workflows-pipeline: the gateway and the Workflows service both run the same backend.config.Settings, so they need identical config. Linking the group to the hand-created Workflows service (step 3) replaces pasting ~20 variables by hand. DATABASE_URL stays per-service (it's injected from the database, which can't live in a group), and the frontend's NEXT_PUBLIC_API_URL stays inline (unique, build-time).

3. Create the Workflows service

Blueprints (render.yaml) can't create Workflows yet, so do this once in the Dashboard.

3a. Open the create form. In the Render Dashboard, click New → Workflow. Connect this GitHub repo (or your fork) when prompted.

3b. Fill in every field exactly as below:

| Field | Value |
| --- | --- |
| Name | `pydantic-agents-workflow` (this becomes the workflow slug) |
| Project / Environment | Same project + `production` environment as the rest of the stack |
| Language / Runtime | `Python 3` |
| Branch | `main` (or the branch you deploy) |
| Region | `Oregon` (must match `pydantic-agents-workflows-db`) |
| Root Directory | (leave blank: the repo root) |
| Build Command | `pip install uv && uv sync --no-dev --frozen` |
| Start Command | `uv run render-workflows workflows.app:app` |
| Instance Type | `Standard` (the tasks are I/O-bound; no need for Pro) |

uv: command not found? A hand-created Workflow service doesn't get uv pre-installed (unlike Blueprint services), so the build command installs it first with pip install uv.

Pin Python to 3.13. The build may default to a newer Python (e.g. 3.14) and ignore the repo's .python-version. Add an env var PYTHON_VERSION = 3.13 in step 3c so the build matches uv.lock and the rest of the stack.

3c. Link config and add the database. The Workflows service runs the same backend.config.Settings as the gateway, so instead of re-typing every variable, link the pydantic-agents-workflows-pipeline env group the Blueprint already created:

Under Environment → Environment Groups, click Link Existing Group → pydantic-agents-workflows-pipeline. This pulls in both API keys, both Logfire tokens, and all pipeline/RAG/model config in one step.

Add the two variables that can't come from the group (env groups hold only plain key: value pairs — no database links):

| Variable | Required? | Value / Source |
| --- | --- | --- |
| `DATABASE_URL` | ✅ Required | Click Add from Database → `pydantic-agents-workflows-db` (already provisioned by step 2's Blueprint; you are not creating a new database, just linking the existing one). Use the same database as the gateway so the History tab populates. |
| `PYTHON_VERSION` | Recommended | `3.13` (see the build note above) |

Bind DATABASE_URL, don't hardcode it. Add from Database injects the managed internal connection string and auto-updates if creds rotate. Pasting a literal URL (into the service or the group) is a static snapshot that breaks on rotation — avoid it.

The group's four secrets (OPENAI_API_KEY, ANTHROPIC_API_KEY, LOGFIRE_TOKEN, LOGFIRE_READ_TOKEN) are set once when applying the Blueprint (step 2) and reused here by linking the group — the first three are required and the service crashes on startup without them (no defaults in backend/config.py).
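
A preflight check for the three required secrets could look like the sketch below. It is a stand-alone illustration: the project relies on backend.config.Settings to fail at startup, and the function name here is hypothetical.

```python
REQUIRED_SECRETS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "LOGFIRE_TOKEN")


def missing_required(env: dict[str, str]) -> list[str]:
    """Return the required variable names that are absent or blank."""
    return [name for name in REQUIRED_SECRETS if not env.get(name, "").strip()]


# Example: one secret is missing entirely and one is only whitespace.
example_env = {"OPENAI_API_KEY": "placeholder", "LOGFIRE_TOKEN": "  "}
missing = missing_required(example_env)
```

Running a check like this against os.environ before starting the service reports every missing name at once, rather than stopping at the first failure.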

End state — the Workflows service environment:

[linked group]  pydantic-agents-workflows-pipeline   # 4 secrets + pipeline/RAG/model config
DATABASE_URL    → from pydantic-agents-workflows-db   # per-service bind, not in any group
PYTHON_VERSION  = 3.13

3d. Create the service and wait for the first deploy to finish. Then copy the service's slug (shown on its Dashboard page / in its URL, e.g. pydantic-agents-workflow) — you'll set it as WORKFLOW_SLUG in the pydantic-agents-workflows-pipeline-trigger group in step 4, which the gateway and cron both inherit.

4. Fill in the env-group values

Because the gateway and cron read everything from the two env groups, you set values on the groups, not on each service — every linked service picks them up automatically.

pydantic-agents-workflows-pipeline (drives the gateway + Workflows service) — set the four secrets once, when you apply the Blueprint in step 2:

| Variable | Source |
| --- | --- |
| `OPENAI_API_KEY` | platform.openai.com |
| `ANTHROPIC_API_KEY` | console.anthropic.com |
| `LOGFIRE_TOKEN` | Logfire write token from step 1 |
| `LOGFIRE_READ_TOKEN` | Logfire read token from step 1 |

pydantic-agents-workflows-pipeline-trigger (shared by the gateway + cron) — set these after the Workflows service exists (step 3):

| Variable | Source |
| --- | --- |
| `RENDER_API_KEY` | Render Account Settings → API Keys |
| `WORKFLOW_SLUG` | The Workflows service slug from step 3 (e.g. `pydantic-agents-workflow`) |

Edit a group under Dashboard → Env Groups → <group>. Saving re-deploys every service linked to it, so the gateway and cron both pick up WORKFLOW_SLUG from a single edit.

Auto-filled, no action needed: DATABASE_URL (injected from the database service) and the rest of pydantic-agents-workflows-pipeline's config (QUALITY_THRESHOLD, ACCURACY_THRESHOLD, AGREEMENT_THRESHOLD, MAX_ITERATIONS, MAX_TOKENS, TIMEOUT_SECONDS, RAG_TOP_K, SIMILARITY_THRESHOLD, VERIFICATION_THRESHOLD, EMBEDDING_MODEL, EMBEDDING_DIMENSIONS, the model-selection vars, ENABLE_CACHING, LOG_LEVEL) ship with sensible defaults in render.yaml.
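
To show how thresholds such as QUALITY_THRESHOLD, ACCURACY_THRESHOLD, and MAX_ITERATIONS might drive the stage 8 quality gate, here is a minimal sketch. The decision rule, the 0 to 1 score scale, and the default values are assumptions for illustration; the real defaults live in render.yaml and the real logic in backend/pipeline/.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateConfig:
    # HYPOTHETICAL values; the project's defaults are defined in render.yaml.
    quality_threshold: float = 0.8
    accuracy_threshold: float = 0.8
    max_iterations: int = 3


def quality_gate(quality: float, accuracy: float, iteration: int, config: GateConfig) -> str:
    """Return "pass", "iterate", or "stop" for the current answer.

    iteration is 1-based: the first generated answer is iteration 1.
    """
    if quality >= config.quality_threshold and accuracy >= config.accuracy_threshold:
        return "pass"
    if iteration >= config.max_iterations:
        return "stop"  # out of attempts; keep the latest answer
    return "iterate"  # regenerate the answer with evaluator feedback


config = GateConfig()
first_try = quality_gate(quality=0.6, accuracy=0.9, iteration=1, config=config)
last_try = quality_gate(quality=0.6, accuracy=0.9, iteration=3, config=config)
good = quality_gate(quality=0.85, accuracy=0.9, iteration=2, config=config)
```

Capping iterations bounds both latency and cost per question, which matters when every regeneration triggers another round of LLM calls.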

5. Wire the frontend to the backend

After the gateway deploys, copy its public URL from the service's Dashboard page and set it as the NEXT_PUBLIC_API_URL env var on the frontend service, then redeploy the frontend so the value takes effect. For this deploy that's:

NEXT_PUBLIC_API_URL=https://pydantic-agents-workflows-api.onrender.com

Use the base origin only — no trailing slash and no /api path (the frontend appends /ask, /health, etc. itself). If your service name isn't globally unique, Render adds a random suffix (…-api-xxxx.onrender.com), so always copy the exact URL shown in the Dashboard.
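
The base-origin rule above can be checked before deploying with a few lines of Python. This is a hypothetical helper, not part of the repository; it only encodes the constraints stated in the text (absolute URL, no trailing slash, no path).

```python
from urllib.parse import urlsplit


def validate_api_base_url(value: str) -> str:
    """Return the URL unchanged if it is a bare origin, otherwise raise ValueError."""
    parts = urlsplit(value)
    if parts.scheme not in {"http", "https"} or not parts.netloc:
        raise ValueError("expected an absolute http(s) URL")
    if parts.path or parts.query or parts.fragment:
        # A trailing slash shows up as path "/", and "/api" as path "/api".
        raise ValueError("use the base origin only: no trailing slash, path, or query")
    return f"{parts.scheme}://{parts.netloc}"


ok = validate_api_base_url("https://pydantic-agents-workflows-api.onrender.com")
```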

6. Seed the corpus, then done

The Workflows service has no documents until ingestion runs. Trigger it once to seed the DB (the cron will keep it fresh afterward):

render workflows start ingest_all   # or trigger from the Dashboard

7. (Optional) Smoke-test the pipeline from the Dashboard

Once the corpus is seeded, you can run the Q&A pipeline directly in the Render dashboard — no frontend needed. In the Workflows service → Tasks, start the run_qa_pipeline task with this input:

{ "question": "How do I deploy an AI agent on Render?" }

Documentation

Core Guides

docs/PIPELINE.md - Detailed breakdown of the 8-stage pipeline
docs/OBSERVABILITY.md - Comprehensive Logfire instrumentation guide
docs/CONFIGURATION.md - All configuration options and tuning
docs/HYBRID_SEARCH.md - Technical deep-dive on hybrid search

External Resources

Logfire Documentation: https://docs.pydantic.dev/logfire/
Pydantic AI Documentation: https://ai.pydantic.dev/
Render Documentation: https://docs.render.com/

Contributing

This is a demo project, but improvements are welcome!

Fork the repository
Create a feature branch
Make your changes
Add tests
Submit a pull request

License

MIT License - see LICENSE file for details

Acknowledgments

Built to showcase:

Logfire by Pydantic - AI observability platform
Render - Modern cloud platform
Pydantic AI - Type-safe AI agent framework
OpenAI & Anthropic - LLM providers

Ready to build observable AI? Fork this repo and deploy to Render to get started!
