Vigil

CI | License: Apache 2.0 | Python 3.12+ | Azure Functions

Vigil is an open-source pipeline monitoring agent for Azure Synapse Analytics. It delivers automated daily health reports so data engineering teams in regulated industries have a timestamped audit trail of every pipeline run — without building and maintaining custom monitoring infrastructure.

Deploy it once. Every morning your team receives a report like this:

Pipeline Health Summary — 2026-03-27 08:00 EST
─────────────────────────────────────────────
✅ Succeeded    12
❌ Failed        1
🔄 In Progress   0
─────────────────────────────────────────────
Total runs: 13  |  Success rate: 92.3%

Each failed pipeline is listed by name with its run ID and duration, so on-call engineers go straight to the problem.
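The figures in the sample report can be reproduced with a few lines of Python. This is a minimal sketch rather than Vigil's actual code: the `summarise` helper and the status strings (`Succeeded`, `Failed`, `InProgress`) are assumptions made for illustration.

```python
from collections import Counter


def summarise(statuses):
    """Count runs per status and compute the success rate as a percentage."""
    counts = Counter(statuses)
    total = len(statuses)
    succeeded = counts.get("Succeeded", 0)
    # Guard against an empty window so the rate is defined even with no runs.
    rate = round(100 * succeeded / total, 1) if total else 0.0
    return {
        "succeeded": succeeded,
        "failed": counts.get("Failed", 0),
        "in_progress": counts.get("InProgress", 0),
        "total": total,
        "success_rate": rate,
    }


# Same mix as the sample report: 12 succeeded, 1 failed.
summary = summarise(["Succeeded"] * 12 + ["Failed"])
print(f"Total runs: {summary['total']}  |  Success rate: {summary['success_rate']}%")
```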


Why Vigil

Managed-identity runtime by design. Vigil uses Azure managed identity for Synapse, reports blob storage, and email delivery at runtime — no service connection strings or shared access keys for those integrations are required in app settings. The Azure Functions runtime storage account uses a standard access key, which is an Azure Functions platform requirement for deployment via Kudu.

Minimal footprint. Serverless Consumption plan. Zero infrastructure to manage between runs. The entire monitoring loop — fetch, summarise, archive, email — fits in a single Azure Function.

Audit-ready. Every run archives a timestamped CSV snapshot to blob storage. Structured logs flow to Application Insights with custom dimensions. Nothing is ephemeral.
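To make the snapshot idea concrete, here is a hedged sketch of serialising run records to CSV text using only the standard library. The column names and the `runs_to_csv` helper are hypothetical; the real snapshot schema may differ.

```python
import csv
import io

# Assumed columns, chosen to match what the report lists for each run.
FIELDS = ["pipeline_name", "run_id", "status", "duration_ms"]


def runs_to_csv(runs):
    """Serialise run records (dicts keyed by FIELDS) to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(runs)
    return buffer.getvalue()


csv_text = runs_to_csv([
    {"pipeline_name": "load_orders", "run_id": "run-001",
     "status": "Succeeded", "duration_ms": 42000},
])
```

Building the text in memory first keeps the serialisation step independent of where the snapshot is finally written.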


Designed for Regulated Environments

Vigil's technical choices were made with compliance requirements in mind:

  • No shared-key credentials for data service integrations — Synapse, ACS, and reports storage all use managed identity; the only key in app settings is AzureWebJobsStorage for the Azure Functions runtime storage, which is a platform requirement for Kudu zip deployment
  • Immutable daily snapshots — one CSV per run, stored in blob storage, provides an audit trail for data governance, SOC 2, or internal compliance reviews
  • Least-privilege RBAC by default — the function holds Synapse Monitoring Operator (read-only), never Synapse Administrator
  • Structured logging to Application Insights — every execution emits JSON-structured log entries with custom dimensions (pipeline counts, success rates, blob paths) suitable for automated alerting and incident reconstruction

Quick Start

Five steps to your first daily pipeline report:

1. Provision Azure resources

Follow AZURE_SETUP.md to create the required Azure resources and configure RBAC. Estimated time: 15–20 minutes. Estimated cost: ~$1–5/month.

2. Configure email domain

In Azure Portal, open your Azure Communication Services resource → Email → Domains → add a verified domain. Copy the sender address (DoNotReply@<domain>.azurecomm.net).

3. Set environment variables

cp local.settings.json.example local.settings.json
# Fill in SYNAPSE_ENDPOINT, ACS_ENDPOINT, SENDER_ADDRESS, RECIPIENT_ADDRESSES

4. Authenticate

az login

5. Run locally

func start

Then trigger and verify using the steps in Testing → Local below.

For production: push to main — GitHub Actions runs tests and deploys automatically. See Testing → Production for trigger and verification steps.


Configuration

All configuration is loaded from environment variables via config.py.

Required Variables

| Variable | Example | Description |
| --- | --- | --- |
| SYNAPSE_ENDPOINT | https://myworkspace.dev.azuresynapse.net | Synapse workspace endpoint |
| ACS_ENDPOINT | https://<acs-resource>.communication.azure.com | Azure Communication Services endpoint |
| SENDER_ADDRESS | DoNotReply@domain.azurecomm.net | Email sender (from verified ACS domain) |
| RECIPIENT_ADDRESSES | user1@example.com,user2@example.com | Comma-separated recipient list |

Optional Variables

| Variable | Default | Description |
| --- | --- | --- |
| MANAGED_IDENTITY_CLIENT_ID | (empty) | Client ID for user-assigned managed identity |
| HOURS_BACK | 24 | Hours to look back for pipeline runs (1–720) |
| TIMEZONE | US/Eastern | Timezone for report timestamps (tz database name) |
| MONITOR_SCHEDULE | 0 0 6,8 * * * | NCRONTAB schedule for the timer trigger |
| BLOB_STORAGE_ACCOUNT_URL | (empty) | Storage account URL for CSV archiving. Required in production mode |
| BLOB_CONTAINER_NAME | reports | Blob container for CSV snapshots |
| BLOB_FOLDER_PATH | dailysnapshots | Folder within the container |

Archiving is mandatory. In local mode (AZURE_FUNCTIONS_ENVIRONMENT=Development) Vigil always writes to local data/. In production mode, Vigil always writes to blob storage and requires BLOB_STORAGE_ACCOUNT_URL.
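As an illustration of how these variables might be read and validated, here is a minimal sketch. It is not the contents of config.py: the `Settings` class and `load_settings` function are hypothetical, and only the variable names, defaults, and the 1–720 range come from the tables above.

```python
import os
from dataclasses import dataclass

REQUIRED = ("SYNAPSE_ENDPOINT", "ACS_ENDPOINT", "SENDER_ADDRESS", "RECIPIENT_ADDRESSES")


@dataclass(frozen=True)
class Settings:  # hypothetical container, not Vigil's actual class
    synapse_endpoint: str
    acs_endpoint: str
    sender_address: str
    recipients: tuple
    hours_back: int
    timezone: str


def load_settings(env=None):
    """Read and validate settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name, "").strip()]
    if missing:
        raise ValueError("Missing required environment variable: " + ", ".join(missing))

    hours_back = int(env.get("HOURS_BACK", "24"))
    if not 1 <= hours_back <= 720:
        raise ValueError("HOURS_BACK must be between 1 and 720")

    # Split the comma-separated list and drop empty entries.
    recipients = tuple(
        address.strip()
        for address in env["RECIPIENT_ADDRESSES"].split(",")
        if address.strip()
    )
    return Settings(
        synapse_endpoint=env["SYNAPSE_ENDPOINT"],
        acs_endpoint=env["ACS_ENDPOINT"],
        sender_address=env["SENDER_ADDRESS"],
        recipients=recipients,
        hours_back=hours_back,
        timezone=env.get("TIMEZONE", "US/Eastern"),
    )
```

Accepting a plain mapping instead of reading os.environ directly makes the validation easy to exercise in tests.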

Authentication

The function uses DefaultAzureCredential from azure-identity:

  • Local development: az login (recommended)
  • Azure deployment: system-assigned managed identity (automatic, no configuration needed)
  • User-assigned identity: set MANAGED_IDENTITY_CLIENT_ID
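The three cases above can be handled by one credential constructor. The sketch below assumes the azure-identity package is installed; `credential_kwargs` and `build_credential` are hypothetical helper names, while `managed_identity_client_id` is a keyword argument accepted by `DefaultAzureCredential`.

```python
import os


def credential_kwargs(env=None):
    """Return keyword arguments for DefaultAzureCredential based on the environment."""
    env = os.environ if env is None else env
    client_id = env.get("MANAGED_IDENTITY_CLIENT_ID", "").strip()
    # Only pass a client ID when a user-assigned identity is configured.
    return {"managed_identity_client_id": client_id} if client_id else {}


def build_credential(env=None):
    """Create the credential; falls back to az login locally or system-assigned identity in Azure."""
    # Imported lazily so credential_kwargs can be used without azure-identity installed.
    from azure.identity import DefaultAzureCredential

    return DefaultAzureCredential(**credential_kwargs(env))
```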

Local Development

Prerequisites

The setup commands below need Python 3.12+, uv, the Azure CLI (az), and Azure Functions Core Tools (func).

Setup

git clone https://github.com/AtLongLastAnalytics/vigil.git
cd vigil
uv sync --group dev
cp local.settings.json.example local.settings.json
# Edit local.settings.json with your values
az login
func start

Tests and Linting

# Run all tests
uv run pytest tests/ -v

# Lint
uv run ruff check .
uv run ruff check --fix .

Testing

Local (Development)

Prerequisites: az login authenticated, local.settings.json filled in with real values, func start running in one terminal.

Trigger the function:

# Linux/macOS
curl -X POST http://localhost:7071/admin/functions/vigil_monitor \
  -H "Content-Type: application/json" -d "{}"

# Windows PowerShell
curl.exe -X POST http://localhost:7071/admin/functions/vigil_monitor `
  -H "Content-Type: application/json" -d "{}"

The Content-Type: application/json header and {} body are required — the Functions host returns 415 Unsupported Media Type without them.
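If you would rather trigger the function from Python than from curl, the same request can be built with the standard library. This is a sketch under assumptions: `build_trigger_request` is a hypothetical helper, and the optional `host_key` parameter is only needed when calling a deployed Function App rather than the local host.

```python
import urllib.request


def build_trigger_request(base_url, function_name="vigil_monitor", host_key=None):
    """Build the POST request for the admin endpoint with the required header and body."""
    headers = {"Content-Type": "application/json"}
    if host_key:
        # Deployed apps require a host key; the local host does not.
        headers["x-functions-key"] = host_key
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/admin/functions/{function_name}",
        data=b"{}",  # empty JSON object, as in the curl examples
        headers=headers,
        method="POST",
    )


request = build_trigger_request("http://localhost:7071")

# To send it while `func start` is running:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```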

Verify:

| Check | Where |
| --- | --- |
| Email arrived | Your recipient inbox (RECIPIENT_ADDRESSES) |
| CSV snapshot | data/pipeline_runs_<date>.csv in the project root |
| Function logs | Terminal running func start |

In local mode (AZURE_FUNCTIONS_ENVIRONMENT=Development), the archive writes to data/ on the local filesystem instead of blob storage.
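The choice between the local folder and blob storage can be pictured as a small function. This is a hedged sketch, not the logic in archive_service.py: `archive_target` is a hypothetical helper, and the ISO date in the file name is an assumption, since the README only shows `<date>`.

```python
from datetime import date


def archive_target(env, run_date):
    """Return ("local", path) in Development mode, otherwise ("blob", url)."""
    filename = f"pipeline_runs_{run_date.isoformat()}.csv"  # date format is assumed
    if env.get("AZURE_FUNCTIONS_ENVIRONMENT") == "Development":
        return ("local", f"data/{filename}")

    account_url = env.get("BLOB_STORAGE_ACCOUNT_URL", "")
    if not account_url:
        raise ValueError("BLOB_STORAGE_ACCOUNT_URL is required in production mode")
    container = env.get("BLOB_CONTAINER_NAME", "reports")
    folder = env.get("BLOB_FOLDER_PATH", "dailysnapshots")
    return ("blob", f"{account_url.rstrip('/')}/{container}/{folder}/{filename}")


target = archive_target({"AZURE_FUNCTIONS_ENVIRONMENT": "Development"}, date(2026, 3, 27))
```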

Production (Azure)

One-time GitHub Actions setup (see Azure Deployment below for full detail):

  1. Create a production GitHub environment with environment secret AZURE_PUBLISH_PROFILE (publish profile XML from Azure Portal)
  2. Create repository variable AZURE_FUNCTIONAPP_NAME with your Function App name

Deploy:

git push origin main  # CI runs tests, then deploys on pass

Watch the run at: GitHub repo → Actions → deploy workflow

Trigger the deployed function:

# Linux/macOS
curl -X POST "https://<func-name>.azurewebsites.net/admin/functions/vigil_monitor" \
  -H "x-functions-key: <host-key>" \
  -H "Content-Type: application/json" \
  -d "{}"

# Windows PowerShell
curl.exe -X POST "https://<func-name>.azurewebsites.net/admin/functions/vigil_monitor" `
  -H "x-functions-key: <host-key>" `
  -H "Content-Type: application/json" `
  -d "{}"

Get <host-key> from: Azure Portal → Function App → App keys → default

Verify:

| Check | Where |
| --- | --- |
| Email arrived | Your recipient inbox |
| CSV uploaded | Reports storage account → Containers → reports/dailysnapshots/ |
| Structured logs | Function App → Application Insights → Logs → query traces |
| No errors | Application Insights → Failures |

Azure Deployment

Step 0 — Provision the Function App (Terraform)

The infra/ directory contains lightweight Terraform that creates only the Function App and its environment variables. All other resources (Synapse, ACS, storage accounts, App Insights) are assumed to exist — see AZURE_SETUP.md for those.

What it creates: runtime storage account, Consumption hosting plan, Linux Function App with system-assigned managed identity and all app settings pre-populated.

cd infra
cp terraform.tfvars.example terraform.tfvars
# Fill in terraform.tfvars with your values

terraform init
terraform plan
terraform apply

After apply, Terraform outputs the managed_identity_principal_id — use that object ID when assigning RBAC roles (AZURE_SETUP.md Step 6).

app_insights_connection_string is optional so you can proceed without it, but strongly recommended for production diagnostics and alerting.

For team/shared environments, prefer a remote Azure Storage Terraform backend instead of local state files.

Prefer the portal? Skip this step and follow AZURE_SETUP.md Steps 5–7 manually instead.

Option 1: GitHub Actions (Recommended)

The workflow runs tests on every push and PR, then deploys to Azure only on pushes to main. Deployment credentials are scoped to a GitHub environment, which lets you require manual approval before any deploy reaches Azure and keeps the publish profile secret isolated to the deploy job.

CI/CD Security Notes

  • Runtime authentication uses managed identity. CI/CD deployment uses a publish profile secret scoped to the production GitHub environment.
  • The workflow requires one environment secret (AZURE_PUBLISH_PROFILE) and one repository variable (AZURE_FUNCTIONAPP_NAME).
  • uv is installed in CI via pinned package version (uv==0.11.2) instead of piping a remote installer script to shell.

One-time setup:

  1. In your GitHub repo → Settings → Environments → New environment → name it production
  2. Under Environment secrets → Add secret:
     • Name: AZURE_PUBLISH_PROFILE. Paste the full XML contents of the publish profile downloaded from your Function App in Azure Portal (Overview → Get publish profile)
  3. Optionally add yourself under Required reviewers to manually approve every deploy before it runs
  4. In Settings → Secrets and variables → Actions → Variables, create AZURE_FUNCTIONAPP_NAME to match your Function App name

Fork and Deploy in ~10 Minutes

  1. Fork this repository.
  2. Create a Linux Python 3.12 Function App in Azure (or use Terraform — see Step 0 above).
  3. In Azure Portal → your Function App → Overview → Get publish profile → download the file.
  4. In GitHub, create environment production and add AZURE_PUBLISH_PROFILE as an environment secret (paste the full XML).
  5. Add repository variable AZURE_FUNCTIONAPP_NAME.
  6. Push to main and watch the deploy workflow run.

Deploy:

git push origin main  # tests run, then deploys automatically on pass

Option 2: Manual

uv export --no-hashes --no-editable -o requirements.txt
func azure functionapp publish <function-app-name>

Post-Deployment: Application Settings

In Azure Portal → Function App → Configuration → Application settings, add all required variables. Ensure System Assigned Managed Identity is enabled and the following RBAC roles are assigned to the managed identity:

| Resource | Role |
| --- | --- |
| Synapse Workspace | Synapse Monitoring Operator |
| Communication Services | Azure Communication Service Email Sender |
| Storage Account | Storage Blob Data Contributor |

See AZURE_SETUP.md for step-by-step provisioning instructions.


Project Structure

vigil/
├── function_app.py              # Azure Function entry point (orchestration)
├── config.py                    # Environment variable loading and validation
├── monitor.py                   # Synapse pipeline run fetching
├── email_service.py             # HTML report generation and email delivery
├── archive_service.py           # CSV snapshot archiving to blob storage
├── constants.py                 # Shared constants
├── host.json                    # Azure Functions runtime config
├── pyproject.toml               # Project metadata and dependencies (uv)
├── local.settings.json.example  # Local development settings template
├── AZURE_SETUP.md               # Azure resource provisioning and RBAC guide (portal)
├── infra/                       # Terraform — creates Function App + app settings only
│   ├── main.tf                  #   Provider config
│   ├── variables.tf             #   Input variables (mirrors config.py env vars)
│   ├── function_app.tf          #   Storage account, hosting plan, Function App
│   ├── outputs.tf               #   Function App name, URL, managed identity principal ID
│   └── terraform.tfvars.example #   Copy to terraform.tfvars and fill in values
├── tests/                       # pytest test suite with shared fixtures
└── .github/workflows/           # CI/CD pipeline (test → deploy on push to main)

requirements.txt is auto-generated by uv — do not edit it directly. To regenerate after updating pyproject.toml, run uv export --no-hashes --no-editable -o requirements.txt.


Monitoring and Observability

Vigil emits structured log entries with custom_dimensions to Application Insights on every execution:

  • Pipeline summary — total runs, succeeded, failed, in-progress counts, success rate
  • Archive operations — blob path, row count, overwrite warnings
  • Email delivery — recipient count, send confirmation
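
A log entry of the kind listed above might be emitted as follows. This is a sketch only: the logger name, the `log_summary` helper, and the dimension keys are hypothetical, and how the `custom_dimensions` attribute reaches Application Insights depends on the logging exporter in use, which is not shown here.

```python
import logging

logger = logging.getLogger("vigil.example")  # hypothetical logger name


def log_summary(summary):
    """Emit one structured entry; `extra` attaches custom_dimensions to the log record."""
    logger.info(
        "Pipeline summary",
        extra={
            "custom_dimensions": {
                "total_runs": summary["total"],
                "succeeded": summary["succeeded"],
                "failed": summary["failed"],
                "success_rate": summary["success_rate"],
            }
        },
    )
```

Keeping the counts as separate keys, rather than formatting them into the message, is what lets an alert rule filter on a single field.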

Set up an Application Insights alert on logged errors to get paged when monitoring itself fails.


Troubleshooting

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| Missing required environment variable | Env var not set | Check local.settings.json / Function App config |
| Auth error locally | Not logged in | az login |
| Auth error in Azure | RBAC not assigned | Verify managed identity role assignments |
| No pipeline runs | Wrong time window | Increase HOURS_BACK; check TIMEZONE |
| Email not delivered | ACS domain not verified | Complete domain verification in ACS portal |
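
When diagnosing the "No pipeline runs" row, it can help to print the window a given HOURS_BACK value covers. The sketch below is an assumption about how such a window could be computed, not Vigil's implementation; `lookback_window` and `to_report_time` are hypothetical helpers.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def lookback_window(now, hours_back=24):
    """Return (start, end) in UTC covering the last `hours_back` hours before `now`."""
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    end = now.astimezone(timezone.utc)
    return end - timedelta(hours=hours_back), end


def to_report_time(moment, tz_name="US/Eastern"):
    """Convert a UTC timestamp to the TIMEZONE used for display (needs tz data installed)."""
    return moment.astimezone(ZoneInfo(tz_name))


start, end = lookback_window(datetime(2026, 3, 27, 12, 0, tzinfo=timezone.utc))
print(start.isoformat(), "to", end.isoformat())
```

In this sketch the timezone only changes how timestamps are displayed; the width of the window is controlled by HOURS_BACK alone.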

Contributing

See CONTRIBUTING.md for code style, testing requirements, and pull request guidelines.

Security

See SECURITY.md for supported versions, vulnerability reporting, and hardening guidance.


License

Apache 2.0 — see LICENSE.


Built by AtLongLast Analytics

Vigil is built and maintained by AtLongLast Analytics, a data engineering consultancy specialising in production-grade data systems for compliance-driven organisations in finance, healthcare, and pharma.

Need custom monitoring, alerting, or regulated data infrastructure? Get in touch →
