Render Developer Q&A Assistant showcasing observable AI with Pydantic Agents, Pydantic Embedder, Logfire, and Render Workflows
An intelligent question-answering system that demonstrates real-world AI observability patterns. This example project shows how to build, instrument, and monitor a multi-stage LLM pipeline with full cost tracking, quality evaluation, and performance monitoring.
- What This App Does
- What This Demonstrates
- Architecture
- Quick Start
- Deploy to Render
- Example Metrics
- Documentation
- Contributing
- License
This is an AI-powered Q&A assistant for Render documentation. Users can ask questions about Render's platform, and the app provides accurate, well-researched answers backed by the official documentation.
- Ask a question - "How do I deploy a Node.js app on Render?" or "What database plans are available?"
- Watch the pipeline - Track progress as the run moves through 8 stages (embedding → retrieval → generation → verification)
- Get accurate answers - Receive detailed responses with sources from Render docs
- Quality guaranteed - Every answer is verified for accuracy and rated by dual AI evaluators
- Hybrid search - Combines semantic understanding with keyword matching for better retrieval
- Multi-stage verification - Extracts claims, verifies against docs, checks technical accuracy
- Iterative refinement - Automatically regenerates low-quality answers with feedback
- Cost tracking - See exactly how much each question costs to answer
- Parallel fan-out - The pipeline runs on Render Workflows, fanning out the heaviest stages (technical accuracy + dual-model evaluation) across instances so they execute concurrently
- Render Workflows - The Q&A pipeline and ingestion run as durable workflow tasks with per-task retries, timeouts, and cross-instance parallel fan-out
- PostgreSQL with pgvector + full-text - Managed hybrid search database
- Web Service + Static Site - FastAPI gateway + Next.js frontend
- Cron Jobs - Scheduled ingestion refresh that triggers the workflow fan-out
- Blueprint deploy + env groups - render.yaml provisions everything; shared config lives in two env groups
- LLM Traces - Complete visibility into every AI call (OpenAI + Anthropic auto-instrumented)
- HTTP Tracing - FastAPI auto-instrumentation for request/response tracking
- Database Monitoring - AsyncPG auto-instrumentation for query performance
- Cost Tracking - Per-stage and per-execution cost attribution with custom metrics
- Multi-Model Evals - Dual-rater quality assessment (OpenAI + Anthropic)
- Session Tracking - End-to-end user journey with distributed tracing
- Custom Metrics - Business-specific metrics (cost, quality, iterations)
- SQL Queries - Custom analytics on AI performance
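The hybrid search bullet above combines a semantic ranking with a keyword ranking. One common way to merge two ranked lists is reciprocal rank fusion (RRF). The sketch below assumes RRF purely for illustration; the fusion method this project actually uses is described in docs/HYBRID_SEARCH.md and may differ. The document IDs and the constant k=60 are illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of doc IDs into one list, best first.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a conventional default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score descending; break ties by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from the two retrievers.
semantic = ["deploy-node", "web-services", "databases"]
keyword = ["databases", "deploy-node", "cron-jobs"]
fused = reciprocal_rank_fusion([semantic, keyword])
```

Documents that rank well in both lists rise to the top, while a document found by only one retriever still makes the merged list.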
This project is built end-to-end on the Pydantic ecosystem:
- Pydantic AI Agents - every pipeline stage (generation, claims extraction, accuracy check, dual-rater evaluation) is a pydantic_ai.Agent with a typed output_type. Multi-provider orchestration (Claude + GPT) runs through OpenAIProvider / AnthropicProvider in a single pipeline. See backend/pipeline/.
- Pydantic Embedder - pydantic_ai.Embedder with OpenAIEmbeddingModel powers question embedding (embed_query) and batch claim embedding (embed_documents) for verification. Auto-instrumented by logfire.instrument_pydantic_ai(). See backend/pipeline/embeddings.py and backend/pipeline/verification.py.
- Pydantic Models - Claims, accuracy scores, eval dimensions, and pipeline state are parsed directly into Pydantic models (e.g. ClaimsOutput, EvaluationOutput). pydantic-settings manages config in backend/config.py.
- Pydantic GenAI Prices - model pricing is loaded dynamically from the pydantic/genai-prices registry, then combined with per-agent token counts from result.usage() to produce per-stage cost attribution. See backend/prices.py.
- Logfire - distributed traces, custom metrics, dual-model evals, and cost attribution. Auto-instruments FastAPI, AsyncPG, HTTPX, and Pydantic AI. See backend/observability.py.
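Per-stage cost attribution, as described in the GenAI Prices bullet, comes down to multiplying token counts by per-token prices and grouping by stage. The sketch below shows that arithmetic with the standard library only. It does not call the genai-prices package; the model names, prices, stage names, and class names are all placeholders invented for the example.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class ModelPrice:
    """USD per one million tokens (placeholder values, not real prices)."""
    input_per_mtok: Decimal
    output_per_mtok: Decimal


@dataclass(frozen=True)
class StageUsage:
    stage: str
    model: str
    input_tokens: int
    output_tokens: int


def stage_cost(usage, prices):
    """Cost in USD of one agent call, from its token counts."""
    price = prices[usage.model]
    tokens_cost = (usage.input_tokens * price.input_per_mtok
                   + usage.output_tokens * price.output_per_mtok)
    return tokens_cost / Decimal(1_000_000)


def cost_by_stage(usages, prices):
    """Sum costs per stage; a stage may contain several agent calls."""
    totals = {}
    for usage in usages:
        totals[usage.stage] = totals.get(usage.stage, Decimal(0)) + stage_cost(usage, prices)
    return totals


# Hypothetical prices and usage records.
PRICES = {
    "model-a": ModelPrice(Decimal("3"), Decimal("15")),
    "model-b": ModelPrice(Decimal("1"), Decimal("2")),
}
USAGES = [
    StageUsage("generation", "model-a", 2000, 500),
    StageUsage("claims_extraction", "model-b", 1000, 200),
]
costs = cost_by_stage(USAGES, PRICES)
total = sum(costs.values(), Decimal(0))
```

Decimal is used instead of float so that many small per-call amounts add up without rounding drift.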
The frontend connects to a backend FastAPI gateway that triggers a Render Workflows run and polls it for the result. The 8-stage pipeline and ingestion execute as workflow tasks that fan out across instances.
┌─────────────────────────────────────────────────────────────┐
│ Frontend (Next.js + TypeScript) │
│ Deployed as: Render Static Site │
│ - Question input UI │
│ - Progress via polling (POST /ask → poll GET /ask/{id}) │
│ - Answer display with metrics │
└─────────────────────────────────────────────────────────────┘
↓ HTTPS
┌─────────────────────────────────────────────────────────────┐
│ API Gateway (FastAPI + Logfire) │
│ Deployed as: Render Web Service (Python 3.13) │
│ - POST /ask → start_task("…/run_qa_pipeline") │
│ - GET /ask/{id} → get_task_run(id) (poll status/result) │
│ - /health, /history, /stats, /sessions/{id}/logs │
└─────────────────────────────────────────────────────────────┘
↓ Render SDK (start_task / get_task_run)
┌─────────────────────────────────────────────────────────────┐
│ Render Workflows service (Python 3.13) │
│ Orchestrator: run_qa_pipeline │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ [1] Question Embedding (OpenAI) in-process │ │
│ │ [2] RAG Document Retrieval (pgvector+BM25) in-process │ │
│ │ [3] Answer Generation (Claude) ⟶ subtask │ │
│ │ [4] Claims Extraction (GPT) ⟶ subtask │ │
│ │ [5] Claims Verification (RAG again) ⟶ subtask │ │
│ │ [6] Technical Accuracy (Claude) ┐ │ │
│ │ [7] Quality Rating (OpenAI+ ├─ 3 parallel │ │
│ │ Anthropic) ┘ subtasks │ │
│ │ [8] Quality Gate (Pass or Iterate) in-process │ │
│ └────────────────────────────────────────────────────────┘ │
│ Ingestion: ingest_all → ingest_core, then 6 add_* in │
│ parallel (replaces the old serial preDeploy) │
└─────────────────────────────────────────────────────────────┘
↓ ↓
┌──────────────────────┐ ┌───────────────────────────┐
│ PostgreSQL │ │ Logfire │
│ (Render Managed) │ │ (Pydantic) │
│ - pgvector ext │ │ - Distributed traces │
│ - RAG embeddings │ │ - Cost attribution │
│ - Full-text search │ │ - Quality metrics │
└──────────────────────┘ │ - Custom dashboards │
└───────────────────────────┘
Cron (daily) ─ start_task("…/ingest_all") ─▶ Workflows service
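The diagram shows the client starting a run with POST /ask and then polling GET /ask/{id}. A minimal polling loop might look like the sketch below. The fetch function is injected so the example runs without a network, and the status values ("running", "completed", "failed") are assumptions, since the real response shape is not shown here.

```python
import time


def poll_until_done(fetch_status, run_id, interval=1.0, timeout=60.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status(run_id) until it reports a terminal state.

    fetch_status stands in for GET /ask/{id}; it must return a dict with a
    "status" key. The status values used here are assumed, not documented.
    """
    deadline = clock() + timeout
    while True:
        run = fetch_status(run_id)
        if run["status"] in ("completed", "failed"):
            return run
        if clock() >= deadline:
            raise TimeoutError(f"run {run_id} still {run['status']} after {timeout}s")
        sleep(interval)


# Fake backend: two "running" responses, then a result.
_responses = iter([
    {"status": "running"},
    {"status": "running"},
    {"status": "completed", "answer": "example answer"},
])
result = poll_until_done(lambda run_id: next(_responses), "run-123",
                         sleep=lambda seconds: None)
```

Passing sleep and clock as parameters keeps the loop testable; production code would leave them at their defaults.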
Why hybrid? Workflows aren't HTTP-facing, so a client (the gateway) triggers tasks via the SDK and reads run status. Stages 1, 2, and 8 are cheap/data-dependent and stay in-process on the orchestrator; only the heavy, independently retryable LLM stages are promoted to their own tasks. Stages 6 + 7 run as three concurrent subtasks on separate instances. See workflows/app.py.
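The fan-out of stages 6 and 7 into three concurrent subtasks has a simple fan-out/fan-in shape. The sketch below illustrates that shape with asyncio.gather on local coroutines. It is not the Render Workflows API: on Render the subtasks run on separate instances, and the function names, scores, and averaging here are hypothetical.

```python
import asyncio


async def technical_accuracy(answer):
    await asyncio.sleep(0.01)  # stands in for an LLM call
    return {"stage": "accuracy", "score": 0.9}


async def rate_quality(answer, rater):
    await asyncio.sleep(0.01)  # stands in for an LLM call
    return {"stage": f"quality:{rater}", "score": 0.8}


async def evaluate(answer):
    # Fan out: start all three evaluations at once, then wait for all of them.
    results = await asyncio.gather(
        technical_accuracy(answer),
        rate_quality(answer, "openai"),
        rate_quality(answer, "anthropic"),
    )
    # Fan in: gather preserves argument order, so unpacking is safe.
    accuracy, rating_a, rating_b = results
    return {
        "accuracy": accuracy["score"],
        "quality": (rating_a["score"] + rating_b["score"]) / 2,
    }


evaluation = asyncio.run(evaluate("draft answer"))
```

Because the three calls are independent, total latency is roughly that of the slowest call rather than the sum of all three.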
render-qa-assistant/
├── backend/
│ ├── main.py # FastAPI gateway (triggers + polls workflow runs)
│ ├── api/
│ │ └── logs.py # Logfire logs API endpoint
│ ├── pipeline/ # 8-stage pipeline implementation (reused by workflows)
│ ├── models.py # Pydantic models
│ ├── database.py # PostgreSQL + pgvector
│ ├── observability.py # Logfire configuration
│ └── config.py # Settings management
├── workflows/ # Render Workflows service
│ ├── app.py # Workflows() instance + all @app.task defs
│ ├── serialization.py # JSON boundary helpers (model_dump/model_validate)
│ └── trigger_ingest.py # Cron entrypoint → start_task("…/ingest_all")
├── frontend/
│ ├── src/ # Next.js + TypeScript UI
│ └── package.json
├── data/
│ ├── embeddings/ # Pre-embedded documentation
│ └── scripts/ # Data ingestion scripts
├── docs/
│ ├── PIPELINE.md # Detailed pipeline guide
│ ├── OBSERVABILITY.md # Logfire instrumentation guide
│ ├── CONFIGURATION.md # Configuration reference
│ └── HYBRID_SEARCH.md # Hybrid search deep-dive
├── pyproject.toml # Python dependencies (uv)
├── uv.lock # Locked dependency versions
├── .python-version # Pins Python to 3.13
├── render.yaml # Infrastructure as code
├── .env.example # Environment variables template
└── README.md # This file
- uv (manages Python 3.13 automatically)
- Node.js 18+
- PostgreSQL 16+ (with pgvector extension)
- OpenAI API key
- Anthropic API key
- Logfire account - sign in at logfire.pydantic.dev, create a project (US region), then:
  - Settings → Write Tokens → create a token → LOGFIRE_TOKEN in .env
  - Settings → Read Tokens → create a token → LOGFIRE_READ_TOKEN in .env
  - View traces in the Live panel under your project
# 1. Install everything (uv installs Python 3.13 automatically)
make install
# 2. Set up .env file (copy from example and fill in your keys)
cp .env.example .env
# 3. Start database
make db-start
# 4. Load documentation (this step might take a while!)
make ingest
# 5. Run backend (in one terminal)
make run-backend
# 6. Run frontend (in another terminal)
make run-frontend
Asking questions locally requires a deployed Workflow for now. POST /ask delegates to a Workflows service; with nothing to delegate to, it returns 503 WORKFLOW_SLUG is not configured. But you can run the rest of the stack locally (no Render cloud resources, no API key) with a local dev server:
# Terminal 1 - local workflow dev server (loads .env, listens on :8120)
render workflows dev -- uv run render-workflows workflows.app:app
# Terminal 2 - gateway pointed at the local dev server
RENDER_USE_LOCAL_DEV=true WORKFLOW_SLUG=local \
  uv run uvicorn backend.main:app --reload --port 8000
# Terminal 3 - frontend
cd frontend && npm run dev
RENDER_USE_LOCAL_DEV=true points the SDK at http://localhost:8120 instead of Render's cloud (no token needed); WORKFLOW_SLUG=local just satisfies the gateway's guard. Set both in .env to skip the prefixes. The workflow runs against your local DATABASE_URL, so History populates normally. (Alternatively, set RENDER_API_KEY + the real WORKFLOW_SLUG to target a deployed cloud Workflows service, which writes to its own database.)
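The interaction between RENDER_USE_LOCAL_DEV, WORKFLOW_SLUG, and RENDER_API_KEY can be sketched as one small decision function. This is an illustration of the behavior described above, not the gateway's actual code: the function name, exception class, and return shape are hypothetical, and the RENDER_API_KEY check is an assumption inferred from the text.

```python
LOCAL_DEV_URL = "http://localhost:8120"  # port taken from the text above


class WorkflowNotConfigured(Exception):
    """Raised where the gateway would answer with HTTP 503."""
    status_code = 503


def resolve_workflow_target(env):
    """Return (base_url, slug) for the workflow client.

    base_url is None when the client should talk to Render's cloud,
    which (by assumption) requires RENDER_API_KEY.
    """
    slug = env.get("WORKFLOW_SLUG", "").strip()
    if not slug:
        # Same message the text quotes for the unconfigured case.
        raise WorkflowNotConfigured("WORKFLOW_SLUG is not configured")
    if env.get("RENDER_USE_LOCAL_DEV", "").lower() == "true":
        return LOCAL_DEV_URL, slug  # no token needed locally
    if not env.get("RENDER_API_KEY"):
        # Assumed guard; the real gateway may report this differently.
        raise WorkflowNotConfigured("RENDER_API_KEY is required for cloud runs")
    return None, slug


local_target = resolve_workflow_target(
    {"RENDER_USE_LOCAL_DEV": "true", "WORKFLOW_SLUG": "local"}
)
```

In real use the mapping passed in would be os.environ; a plain dict keeps the example self-contained.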
Local config → deployed env groups. Locally every process reads one .env (copied from .env.example). On deploy that same config splits into the two Render env groups in render.yaml; see Deploy → Environment groups. Local-only knobs (RENDER_USE_LOCAL_DEV, the Docker DATABASE_URL) aren't in any group; in the cloud the SDK uses the platform socket and DATABASE_URL is injected from the database.
make ingest runs the full pipeline: bulk doc embeddings, plus the curated "special pages" that get explicit-injection into RAG context (pricing, AI agent, autoscaling, Node.js). To re-load just one of those after editing its script, use the per-target shortcuts:
make add-pricing # render.com/pricing tables
make add-ai-agent # render.com/tutorials/agents-on-render-workflows (AI agents → Render Workflows)
make add-autoscaling # render.com/docs/scaling
make add-nodejs # render.com/docs/deploy-node-express-app
Access locally:
- Frontend: http://localhost:3000
- API docs: http://localhost:8000/docs
- Logfire: https://logfire.pydantic.dev
Before clicking the deploy button, sign in at logfire.pydantic.dev, create a project (US region), and generate two tokens:
- Preferences → Write Tokens → create token → save as LOGFIRE_TOKEN
- Preferences → Read Tokens → create token → save as LOGFIRE_READ_TOKEN
You'll paste both into the Render Dashboard in step 2, when Render prompts for the env group secrets.
Render reads render.yaml and provisions:
- PostgreSQL database with pgvector (pydantic-agents-workflows-db)
- API gateway web service (pydantic-agents-workflows-api, FastAPI + Logfire)
- Ingestion refresh cron (pydantic-agents-workflows-ingest, triggers the workflow daily)
- Frontend static site (pydantic-agents-workflows-frontend, Next.js)
- Two environment groups that hold all shared config (see below)
On Apply, Render prompts once for the secret values in the env groups
(OPENAI_API_KEY, ANTHROPIC_API_KEY, both Logfire tokens, and — left blank for now —
RENDER_API_KEY / WORKFLOW_SLUG). You fill these at the group level, not per service.
render.yaml defines two reusable env groups
so config lives in one place instead of being duplicated across services:
| Group | Contents | Linked to |
|---|---|---|
| pydantic-agents-workflows-pipeline | LLM/Logfire secrets + all pipeline, RAG, and model config (~20 vars) | API gateway and the Workflows service (step 3) |
| pydantic-agents-workflows-pipeline-trigger | RENDER_API_KEY, WORKFLOW_SLUG | API gateway and the ingest cron |
The payoff is pydantic-agents-workflows-pipeline: the gateway and the Workflows service both run the same
backend.config.Settings, so they need identical config. Linking the group to the
hand-created Workflows service (step 3) replaces pasting ~20 variables by hand. DATABASE_URL
stays per-service (it's injected from the database, which can't live in a group), and the
frontend's NEXT_PUBLIC_API_URL stays inline (unique, build-time).
Blueprints (render.yaml) can't create Workflows yet, so do this once in the Dashboard.
3a. Open the create form. In the Render Dashboard, click New → Workflow. Connect this GitHub repo (or your fork) when prompted.
3b. Fill in every field exactly as below:
| Field | Value |
|---|---|
| Name | pydantic-agents-workflow (this becomes the workflow slug) |
| Project / Environment | Same project + production environment as the rest of the stack |
| Language / Runtime | Python 3 |
| Branch | main (or the branch you deploy) |
| Region | Oregon (must match pydantic-agents-workflows-db) |
| Root Directory | (leave blank — the repo root) |
| Build Command | pip install uv && uv sync --no-dev --frozen |
| Start Command | uv run render-workflows workflows.app:app |
| Instance Type | Standard (the tasks are I/O-bound; no need for Pro) |
uv: command not found? A hand-created Workflow service doesn't get uv pre-installed (unlike Blueprint services), so the build command installs it first with pip install uv.
Pin Python to 3.13. The build may default to a newer Python (e.g. 3.14) and ignore the repo's .python-version. Add an env var PYTHON_VERSION=3.13 in step 3c so the build matches uv.lock and the rest of the stack.
3c. Link config and add the database. The Workflows service runs the same
backend.config.Settings as the gateway, so instead of re-typing every variable, link the
pydantic-agents-workflows-pipeline env group the Blueprint already created:
- Under Environment → Environment Groups, click Link Existing Group → pydantic-agents-workflows-pipeline. This pulls in both API keys, both Logfire tokens, and all pipeline/RAG/model config in one step.
- Add the two variables that can't come from the group (env groups hold only plain key: value pairs, no database links):

| Variable | Required? | Value / Source |
|---|---|---|
| DATABASE_URL | ✅ Required | Click Add from Database → pydantic-agents-workflows-db (already provisioned by step 2's Blueprint; you are not creating a new database, just linking the existing one). Use the same database as the gateway so the History tab populates. |
| PYTHON_VERSION | Recommended | 3.13 (see the build note above) |

Bind DATABASE_URL, don't hardcode it. Add from Database injects the managed internal connection string and auto-updates if creds rotate. Pasting a literal URL (into the service or the group) is a static snapshot that breaks on rotation, so avoid it.
The group's four secrets (OPENAI_API_KEY, ANTHROPIC_API_KEY, LOGFIRE_TOKEN,
LOGFIRE_READ_TOKEN) are set once when applying the Blueprint (step 2) and reused here by linking
the group — the first three are required and the service crashes on startup without them
(no defaults in backend/config.py).
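The fail-fast behavior for the three required secrets can be illustrated with a few lines of standard-library code. The project itself uses pydantic-settings in backend/config.py; the sketch below is a stand-in that shows the same idea, and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# The three secrets the text above calls required.
REQUIRED = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "LOGFIRE_TOKEN")


@dataclass(frozen=True)
class Secrets:
    openai_api_key: str
    anthropic_api_key: str
    logfire_token: str
    logfire_read_token: Optional[str] = None  # not among the required three


def load_secrets(env):
    """Build Secrets from a mapping, raising if any required value is missing."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        # Failing at startup surfaces a bad deploy immediately instead of
        # on the first request that needs the key.
        raise RuntimeError("missing required settings: " + ", ".join(missing))
    return Secrets(
        openai_api_key=env["OPENAI_API_KEY"],
        anthropic_api_key=env["ANTHROPIC_API_KEY"],
        logfire_token=env["LOGFIRE_TOKEN"],
        logfire_read_token=env.get("LOGFIRE_READ_TOKEN") or None,
    )
```

At startup this would be called once as load_secrets(os.environ), so a missing key stops the deploy with a message naming every absent variable.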
End state — the Workflows service environment:
[linked group] pydantic-agents-workflows-pipeline # 4 secrets + pipeline/RAG/model config
DATABASE_URL → from pydantic-agents-workflows-db # per-service bind, not in any group
PYTHON_VERSION = 3.13
3d. Create the service and wait for the first deploy to finish. Then copy the service's
slug (shown on its Dashboard page / in its URL, e.g. pydantic-agents-workflow) — you'll
set it as WORKFLOW_SLUG in the pydantic-agents-workflows-pipeline-trigger group in step 4, which the gateway and cron
both inherit.
Because the gateway and cron read everything from the two env groups, you set values on the groups, not on each service — every linked service picks them up automatically.
pydantic-agents-workflows-pipeline (drives the gateway + Workflows service) — set the four secrets once, when
you apply the Blueprint in step 2:
| Variable | Source |
|---|---|
| OPENAI_API_KEY | platform.openai.com |
| ANTHROPIC_API_KEY | console.anthropic.com |
| LOGFIRE_TOKEN | Logfire write token from step 1 |
| LOGFIRE_READ_TOKEN | Logfire read token from step 1 |
pydantic-agents-workflows-pipeline-trigger (shared by the gateway + cron) — set these after the Workflows service
exists (step 3):
| Variable | Source |
|---|---|
| RENDER_API_KEY | Render Account Settings → API Keys |
| WORKFLOW_SLUG | The Workflows service slug from step 3 (e.g. pydantic-agents-workflow) |
Edit a group under Dashboard → Env Groups → <group>. Saving re-deploys every service linked to it, so the gateway and cron both pick up WORKFLOW_SLUG from a single edit.
Auto-filled, no action needed: DATABASE_URL (injected from the database service) and the
rest of pydantic-agents-workflows-pipeline's config (QUALITY_THRESHOLD, ACCURACY_THRESHOLD, AGREEMENT_THRESHOLD,
MAX_ITERATIONS, MAX_TOKENS, TIMEOUT_SECONDS, RAG_TOP_K, SIMILARITY_THRESHOLD,
VERIFICATION_THRESHOLD, EMBEDDING_MODEL, EMBEDDING_DIMENSIONS, the model-selection vars,
ENABLE_CACHING, LOG_LEVEL) ship with sensible defaults in render.yaml.
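To make the role of QUALITY_THRESHOLD, ACCURACY_THRESHOLD, and MAX_ITERATIONS concrete, the sketch below shows one plausible reading of the stage 8 quality gate: regenerate with feedback until both scores pass or the iteration budget runs out. The real gate logic lives in the pipeline code and may differ; the function signature, scores, and threshold values here are made up for the example.

```python
def run_with_quality_gate(generate, evaluate, *, quality_threshold,
                          accuracy_threshold, max_iterations):
    """Regenerate until both scores pass or the iteration budget runs out.

    generate(feedback) -> answer
    evaluate(answer) -> (quality, accuracy, feedback)
    Returns (answer, iterations_used, passed).
    """
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    feedback = None
    for iteration in range(1, max_iterations + 1):
        answer = generate(feedback)
        quality, accuracy, feedback = evaluate(answer)
        if quality >= quality_threshold and accuracy >= accuracy_threshold:
            return answer, iteration, True
    # Budget exhausted: return the last attempt, flagged as not passing.
    return answer, max_iterations, False


# Fake stages: the first draft scores low, the second passes.
drafts = iter(["draft v1", "draft v2"])
scores = {
    "draft v1": (0.6, 0.9, "cite the docs"),
    "draft v2": (0.9, 0.95, ""),
}
answer, used, passed = run_with_quality_gate(
    lambda feedback: next(drafts),
    lambda candidate: scores[candidate],
    quality_threshold=0.8, accuracy_threshold=0.8, max_iterations=3,
)
```

Capping iterations bounds the cost of a question: raising the thresholds improves answers only up to the point where the budget is spent.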
After the gateway deploys, copy its public URL from the service's Dashboard page and set it as
the NEXT_PUBLIC_API_URL env var on the frontend service, then redeploy the frontend so the
value takes effect. For this deploy that's:
NEXT_PUBLIC_API_URL=https://pydantic-agents-workflows-api.onrender.com
Use the base origin only — no trailing slash and no /api path (the frontend appends
/ask, /health, etc. itself). If your service name isn't globally unique, Render adds a random
suffix (…-api-xxxx.onrender.com), so always copy the exact URL shown in the Dashboard.
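The "base origin only" rule is easy to get wrong when copying a URL by hand. A small helper like the one below can reduce a pasted value to the form the frontend expects. It is an optional convenience sketched for illustration, not part of the project, and the function name is hypothetical.

```python
from urllib.parse import urlsplit


def base_origin(url):
    """Reduce a URL to scheme://host[:port], dropping path, query, and trailing slash."""
    parts = urlsplit(url.strip())
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute http(s) URL: {url!r}")
    return f"{parts.scheme}://{parts.netloc}"


api_url = base_origin("https://pydantic-agents-workflows-api.onrender.com/api/")
```

A value without a scheme (for example a bare hostname) raises ValueError rather than being silently accepted.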
The Workflows service has no documents until ingestion runs. Trigger it once to seed the DB (the cron will keep it fresh afterward):
render workflows start ingest_all # or trigger from the Dashboard
Once the corpus is seeded, you can run the Q&A pipeline directly in the Render Dashboard, with no frontend needed. In the Workflows service → Tasks, start the run_qa_pipeline task with this input:
{ "question": "How do I deploy an AI agent on Render?" }
- docs/PIPELINE.md - Detailed breakdown of the 8-stage pipeline
- docs/OBSERVABILITY.md - Comprehensive Logfire instrumentation guide
- docs/CONFIGURATION.md - All configuration options and tuning
- docs/HYBRID_SEARCH.md - Technical deep-dive on hybrid search
- Logfire Documentation: https://docs.pydantic.dev/logfire/
- Pydantic AI Documentation: https://ai.pydantic.dev/
- Render Documentation: https://docs.render.com/
This is a demo project, but improvements are welcome!
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
MIT License - see LICENSE file for details
Built to showcase:
- Logfire by Pydantic - AI observability platform
- Render - Modern cloud platform
- Pydantic AI - Type-safe AI agent framework
- OpenAI & Anthropic - LLM providers
Ready to build observable AI? Fork this repo and deploy to Render to get started!