slmatthiesen/README.md

AI Engineer building agentic systems


LinkedIn · San Francisco


👋 /me

const steven = {
  role:    "AI Engineer · CTO @ INTU",
  focus:   ["LLM agents", "agentic issue-fix pipelines", "RAG",
            "multimodal doc + image ingestion", "evals", "observability", "fully agentic systems"],
  stack:   ["React", "Node", "Python", "Rust", "Postgres", "GraphQL"],
  web3:    ["MPC", "DKG", "EVM", "Solana", "Solidity"],
  shipping: "production agent systems",
};

I build agent systems that survive contact with production: tool-using LLMs wired through MCP, grounded by RAG, gated by eval harnesses, and instrumented end-to-end so failures are observable instead of mysterious. Before AI I spent four years deep in Web3, leading an MPC wallet-infrastructure team across cryptography, smart contracts, and Rust.


🔁 /method

   ideas ──▶ evals ──▶ guardrails ──▶ build ──▶ review ──▶ production
     ▲                                                          │
     └─────────── observe · measure · iterate · harden ◀────────┘

Design intent before code: I write the evals and guardrails before a line ships, then let observability close the loop, so every production failure feeds the next iteration instead of disappearing.


🧠  /stack

AI / ML

MCP · RAG · Evals · LangSmith · pgvector · Google ADK

Languages & Core

TypeScript · Node.js · React · Python · Go · Rust · GraphQL · PostgreSQL · MySQL · PHP

Platforms

AWS · Google Cloud · Docker · Firebase · Vertex AI · BigQuery · Cloud Run · Gemini API · GitHub Actions · DigitalOcean

Web3

Solidity · Ethers · MPC · The Graph


🚀 /review

🔐 INTU: Web3 onboarding via MPC

CTO · Lead Engineer

Open-source npm package orchestrating distributed key generation (DKG) and multi-party computation, removing seed phrases from the onboarding flow. Cross-chain transaction flows across EVM networks, bridged to Solana: sending a Solana tx authorized by an EVM signature. Self-hosted The Graph indexers for chains without hosted support.

Rust · Solidity · MPC · EVM · TypeScript

🤖 Agentic GitHub Issues Fixer

Autonomous coding agent

An agent that triages open GitHub issues, reproduces the bug, drafts a fix, and opens a PR, closing the loop from issue to reviewable change. Proof: medplum/medplum#9293, an upstream OSS fix landed fully agentically (working branch).

Agents · Tool Use · GitHub API · OSS

▶️ Watch the demo

📈 Algorithmic Futures Trading

Quant Research · WIP

Backtest harness and execution research for systematic futures strategies, applying the same eval + observability discipline I use on AI agents to strategy selection, slippage modeling, and live risk.

Python · Quant · Backtesting · WIP

▶️ Watch the demo

🩺 OpenEMR Clinical Agent

Selected Project Β· 2026

LLM agent layered onto an open-source EHR that reads patient charts and relays clinical context on demand. Lab-report ingestion pipeline produces summaries with source-page citations, so clinicians can verify any agent-surfaced claim β€” a RAG pattern tuned for high-stakes clinical use.

RAG Β· LLM Agents Β· Citations Β· Healthcare
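
A minimal sketch of the citation check described above, assuming each summary claim carries a supporting quote and a page number. The `ChartPage` and `SummaryClaim` shapes, the `unverifiedClaims` helper, and the sample lab values are all hypothetical and not taken from the project.

```typescript
// Hypothetical shapes: not the project's actual data model.
interface ChartPage { pageNumber: number; text: string; }
interface SummaryClaim { statement: string; quote: string; pageNumber: number; }

// Normalize case and whitespace so line breaks in extracted text
// do not cause false misses.
function normalize(s: string): string {
  return s.toLowerCase().replace(/\s+/g, " ").trim();
}

// Return the claims whose cited page is missing or does not contain the quote.
function unverifiedClaims(claims: SummaryClaim[], pages: ChartPage[]): SummaryClaim[] {
  const byNumber: Record<number, ChartPage> = {};
  for (let i = 0; i < pages.length; i++) byNumber[pages[i].pageNumber] = pages[i];
  return claims.filter((c) => {
    const page = byNumber[c.pageNumber];
    if (page === undefined) return true;
    return normalize(page.text).indexOf(normalize(c.quote)) === -1;
  });
}

// Made-up sample data, for illustration only.
const pages: ChartPage[] = [
  { pageNumber: 1, text: "Hemoglobin A1c:\n 7.2 %" },
  { pageNumber: 2, text: "LDL cholesterol: 131 mg/dL" },
];
const claims: SummaryClaim[] = [
  { statement: "A1c is elevated.", quote: "Hemoglobin A1c: 7.2 %", pageNumber: 1 },
  // Cites the wrong page, so it should be flagged.
  { statement: "LDL is high.", quote: "LDL cholesterol: 131 mg/dL", pageNumber: 1 },
];
const flagged = unverifiedClaims(claims, pages);
```

In this sketch a flagged claim would be dropped or sent back for regeneration rather than shown to a clinician; exact substring matching is a deliberately strict stand-in for whatever matching a real pipeline uses.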

▶️ Watch the demo

🍻 Happy Hour Friends β€” Crowdsourced happy hour finder

Live Β· 2026

Fully agent-operated site: every update β€” parsed automatically from the web or submitted by users β€” passes strict agentic moderation gates (classify β†’ verify, versioned prompts, audited apply path) before going live. The test: can my agent safeguards run the site without my intervention? The product itself is dead-simple β€” venues and deals in one sortable, filterable view, kept current by crowdsourcing.

Agents Β· Crowdsourcing Β· Moderation Gates Β· Next.js
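
The gate sequence above (classify → verify, versioned prompts, audited apply path) can be sketched roughly as follows. This is an illustration under stated assumptions, not the site's code: the `DealUpdate` and `ModerationGate` types, the prompt version strings, and the rule-based `check` functions (standing in for LLM calls) are all hypothetical.

```typescript
// Hypothetical sketch: gate names, types, and rules are assumptions.
type Verdict = "accept" | "reject";

interface DealUpdate { venue: string; deal: string; source: "crawler" | "user"; }
interface AuditEntry { update: DealUpdate; gate: string; promptVersion: string; verdict: Verdict; }

interface ModerationGate {
  name: string;
  promptVersion: string;             // lets a verdict be traced to the prompt that produced it
  check: (u: DealUpdate) => Verdict; // stand-in for a model call
}

// Stub gates: simple rules in place of model-backed classification and verification.
const gates: ModerationGate[] = [
  { name: "classify", promptVersion: "classify-v1",
    check: (u) => (/happy hour|\$\d/i.test(u.deal) ? "accept" : "reject") },
  { name: "verify", promptVersion: "verify-v1",
    check: (u) => (u.venue.trim().length > 0 ? "accept" : "reject") },
];

// Run gates in order, stop at the first rejection, and record every decision.
function moderate(u: DealUpdate, auditLog: AuditEntry[]): boolean {
  for (let i = 0; i < gates.length; i++) {
    const g = gates[i];
    const verdict = g.check(u);
    auditLog.push({ update: u, gate: g.name, promptVersion: g.promptVersion, verdict });
    if (verdict === "reject") return false;
  }
  return true;
}

const auditLog: AuditEntry[] = [];
const live: DealUpdate[] = [];
const incoming: DealUpdate[] = [
  { venue: "Example Bar", deal: "$5 drafts, 4-6pm", source: "user" },
  { venue: "Example Bar", deal: "buy followers cheap", source: "user" },
];
for (let i = 0; i < incoming.length; i++) {
  // Single apply path: nothing reaches `live` without passing every gate.
  if (moderate(incoming[i], auditLog)) live.push(incoming[i]);
}
```

Keeping one apply path and logging the prompt version with each verdict is what makes a bad publish traceable after the fact in this sketch.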

🍽️ GURUPass / Pass Rewards β€” Restaurant AI Agents

Lead AI & Blockchain Engineer

Tool-using LLM agents handling order intake and menu Q&A, wired through MCP with structured-output validation. Curated eval set + offline regression harness catches failures before deploy; production traces drive failure-mode analysis. Personalization layer surfaces targeted coupons from purchase history.

MCP Β· Agents Β· Evals Β· Personalization
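
To make the structured-output validation and offline regression idea concrete, here is a small sketch. It assumes the agent replies with a JSON order and that recorded outputs are replayed against expected results; the `OrderIntent` shape, `parseOrder`, `runRegression`, and the eval cases are hypothetical, not the production harness.

```typescript
// Hypothetical sketch: the order shape, parser, and eval cases are illustrative assumptions.
interface OrderIntent { item: string; quantity: number; }

// Validate raw model output (a JSON string) against the expected structure.
function parseOrder(raw: string): OrderIntent | null {
  let data: unknown;
  try { data = JSON.parse(raw); } catch { return null; }
  if (typeof data !== "object" || data === null) return null;
  const o = data as { item?: unknown; quantity?: unknown };
  if (typeof o.item !== "string" || o.item.length === 0) return null;
  // Quantity must be a positive whole number.
  if (typeof o.quantity !== "number" || o.quantity % 1 !== 0 || o.quantity < 1) return null;
  return { item: o.item, quantity: o.quantity };
}

interface EvalCase { name: string; recordedOutput: string; expected: OrderIntent | null; }

// Replay recorded outputs offline and return the names of cases that regressed.
function runRegression(cases: EvalCase[]): string[] {
  const failures: string[] = [];
  for (let i = 0; i < cases.length; i++) {
    const c = cases[i];
    const got = parseOrder(c.recordedOutput);
    if (JSON.stringify(got) !== JSON.stringify(c.expected)) failures.push(c.name);
  }
  return failures;
}

const cases: EvalCase[] = [
  { name: "valid order", recordedOutput: '{"item":"margherita","quantity":2}',
    expected: { item: "margherita", quantity: 2 } },
  { name: "non-JSON reply is rejected", recordedOutput: "Sure! Two pizzas coming up.",
    expected: null },
  { name: "zero quantity is rejected", recordedOutput: '{"item":"margherita","quantity":0}',
    expected: null },
];
const failures = runRegression(cases);
```

A deploy step could refuse to ship whenever `failures` is non-empty, which is one simple way to gate releases on the eval set.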


⚑  Efficiency > token-maxing

I burn a lot of tokens, on purpose. But spending them to look busy is waste.
The craft is signal per token: tight context, sharp evals, and failure modes that are observable instead of mysterious.
Every system above was designed, built, and shipped on a ~$100/month plan.

LinkedIn

Popular repositories

  1. architecture-decision-record (Public)

    Forked from architecture-decision-record/architecture-decision-record

    Architecture decision record (ADR) examples for software planning, IT leadership, and template documentation

  2. worker-comfyui (Public)

    Forked from runpod-workers/worker-comfyui

    ComfyUI as a serverless API on RunPod

    Python

  3. metriport (Public)

    Forked from metriport/metriport

    Metriport is an open-source universal API for healthcare data.

    JavaScript

  4. medplum (Public)

    Forked from medplum/medplum

    Medplum is a healthcare platform that helps you quickly develop high-quality compliant applications.

    TypeScript

  5. slmatthiesen (Public)