From 4f317b372e6e61ea743ede7e6f539445c4195bfd Mon Sep 17 00:00:00 2001 From: NOVA Date: Thu, 16 Apr 2026 11:07:49 +0000 Subject: [PATCH 1/2] docs: comprehensive README covering all scripts and components Previously the README only documented gdrive-sync.sh. Now documents: - Memory pipeline (embed, search, recall, benchmark, extraction, decay) - Git security pre-commit hooks - Agent chat channel plugin reference - Prerequisites and dependency table - Table of contents with section links Keeps existing ARCHITECTURE.md for the detailed memory pipeline design. --- ARCHITECTURE.md | 181 ++++++++++++++++++++++++++++++++++++++++++++++++ README.md | 145 ++++++++++++++++++++++++++++++++++---- 2 files changed, 314 insertions(+), 12 deletions(-) create mode 100644 ARCHITECTURE.md diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md new file mode 100644 index 0000000..b2b9a8f --- /dev/null +++ b/ARCHITECTURE.md @@ -0,0 +1,181 @@ +# Architecture: NOVA Semantic Memory Pipeline + +This document describes how the scripts in this repository work together to implement a semantic memory system for the NOVA agent ecosystem. + +## Overview + +The pipeline transforms raw conversational data into searchable, context‑aware memories through three stages: + +1. **Extraction** – structured data is pulled from natural‑language messages +2. **Embedding** – text is converted to vector embeddings and stored +3. **Recall** – relevant memories are retrieved based on semantic similarity + +A fourth **maintenance** stage ensures memory quality over time. 
+ +## Data Flow + +``` +┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ +│ Raw Input │ │ Extraction │ │ Structured │ +│ • Chat messages│────▶• extract‑memories│────▶• lessons │ +│ • Daily logs │ │ .sh (Claude) │ │• facts/entities │ +│ • MEMORY.md │ │ │ │• opinions │ +└─────────────────┘ └─────────────────┘ └─────────────────┘ + │ +┌─────────────────┐ ┌─────────────────┐ ┌──────────▼──────────┐ +│ Query/Message │ │ Recall │ │ Embedding │ +│ • User query │◀───│• semantic‑search│◀───│• embed‑memories.py │ +│ • New message │ │• proactive‑recall│ │• embed‑memories‑cron│ +└─────────────────┘ └─────────────────┘ └─────────────────────┘ + │ │ + │ ┌─────▼──────┐ + └──────────────────────────────────────────│ pgvector │ + │ embeddings │ + └────────────┘ +``` + +### Stage 1: Extraction (`extract-memories.sh`) + +The pipeline begins when a natural‑language message arrives. `extract-memories.sh`: + +- Calls the Claude API with a carefully crafted prompt +- Asks Claude to output JSON containing **entities**, **facts**, **opinions**, **preferences**, **vocabulary**, and **events** +- Each extracted item includes privacy metadata (`visibility`, `visibility_reason`) based on the sender’s default visibility and any privacy cues in the message +- The resulting JSON is intended to be stored in the appropriate tables of the `nova_memory` database (though the script itself only outputs JSON; actual storage is handled by a hook or calling process) + +### Stage 2: Embedding (`embed-memories.py`, `embed-memories-cron.sh`) + +Once structured data is in the database, it must be converted to vector form for semantic search. 
+ +`embed-memories.py`: + +- Reads from multiple **sources**: daily logs (`*.md` files in `~/clawd/memory/`), the global `MEMORY.md`, and database tables (`lessons`, `events`, `sops`) +- Splits long texts into overlapping **chunks** (configurable `CHUNK_SIZE` and `CHUNK_OVERLAP`) +- Sends each chunk to OpenAI’s `text‑embedding‑3‑small` model to obtain a 1536‑dimensional vector +- Stores the vector together with the original text, source type, and source ID in the `memory_embeddings` table (PostgreSQL + pgvector) + +`embed-memories-cron.sh` is a simple wrapper that runs `embed-memories.py` daily and logs the output. + +### Stage 3: Recall (`semantic-search.py`, `proactive-recall.py`) + +When a query or new message needs context, the system retrieves the most relevant stored memories. + +**Semantic Search** (`semantic-search.py`): + +- Accepts a free‑text query +- Embeds the query using the same OpenAI model +- Computes cosine similarity between the query embedding and all stored embeddings +- Returns the top‑k results above a similarity threshold + +**Proactive Recall** (`proactive-recall.py`): + +- Designed to be called from a **message pre‑processing hook** (e.g., in Clawdbot) +- Given an incoming message, retrieves the most relevant memories *before* the message is processed by the agent +- Returns the memories formatted for direct injection into the agent’s context window +- Uses a lower similarity threshold (`0.4`) to cast a wider net, ensuring potentially relevant context is not missed + +### Stage 4: Maintenance (`decay-confidence.sh`, `recall-benchmark.py`) + +Memory quality degrades over time if not actively maintained. These scripts keep the system accurate and reliable. 
+ +**Confidence Decay** (`decay-confidence.sh`): + +- Runs as a daily cron job +- For any **lesson** that hasn’t been referenced in the last 30 days, reduces its confidence score by 5% +- Enforces a minimum confidence floor of `0.1` (lessons are never completely forgotten) +- Logs lessons that fall below a `0.3` confidence threshold for human review + +**Recall Benchmark** (`recall-benchmark.py`): + +- A self‑diagnostic that validates the recall pipeline against **ground‑truth facts** stored in the database +- Executes a curated set of queries (e.g., “What is I)ruid’s birthday?”) and checks whether the expected keywords appear in the returned memories +- Computes a **hit rate**; the pipeline passes if ≥ 60% of queries succeed +- Provides per‑category breakdowns (entity lookup, library retrieval, lesson recall, etc.) +- Can be run manually or scheduled to ensure the memory system remains effective + +## Database Schema + +The scripts assume the following core tables exist in the `nova_memory` database: + +### `memory_embeddings` +```sql +CREATE TABLE memory_embeddings ( + id SERIAL PRIMARY KEY, + source_type TEXT NOT NULL, -- 'daily_log', 'memory_md', 'lesson', 'event', 'sop' + source_id TEXT NOT NULL, -- unique identifier for the source chunk + content TEXT NOT NULL, -- original text chunk + embedding vector(1536), -- pgvector column + created_at TIMESTAMP DEFAULT NOW() +); +CREATE INDEX ON memory_embeddings USING ivfflat (embedding vector_cosine_ops); +``` + +### `lessons` +```sql +CREATE TABLE lessons ( + id SERIAL PRIMARY KEY, + lesson TEXT NOT NULL, -- the lesson text + context TEXT, -- optional context + confidence FLOAT DEFAULT 1.0, -- confidence score (0.1–1.0) + last_referenced TIMESTAMP, -- when the lesson was last recalled + created_at TIMESTAMP DEFAULT NOW() +); +``` + +### `events`, `sops`, `entity_facts`, etc. + +Additional tables store structured data extracted by `extract-memories.sh`. Refer to the NOVA memory schema documentation for full details. 
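To make the link between the `lessons` columns and the Stage 4 decay rule concrete, here is a minimal Python sketch of one daily pass over a single lesson. It is an illustration, not the contents of `decay-confidence.sh`. The 5% reduction, the `0.1` floor, the `0.3` review threshold, and the 30-day window come from the description above. The function and constant names are hypothetical, and so are two behaviours: treating a `NULL` `last_referenced` as stale, and counting exactly 30 days as stale.

```python
from datetime import datetime, timedelta
from typing import Optional

# Values taken from the Stage 4 description; the names are illustrative only.
DECAY_FACTOR = 0.95                # "reduces its confidence score by 5%"
CONFIDENCE_FLOOR = 0.1             # lessons are never completely forgotten
REVIEW_THRESHOLD = 0.3             # below this, a lesson is logged for review
STALE_AFTER = timedelta(days=30)   # unreferenced for 30 days => decays


def decay_confidence(
    confidence: float,
    last_referenced: Optional[datetime],
    now: datetime,
) -> float:
    """Return the confidence of one lesson after a single daily decay pass."""
    # Assumption: a lesson that was never referenced (NULL) counts as stale.
    recently_used = (
        last_referenced is not None and now - last_referenced < STALE_AFTER
    )
    if recently_used:
        return confidence
    # Apply the 5% reduction, but never drop below the floor.
    return max(CONFIDENCE_FLOOR, confidence * DECAY_FACTOR)


def needs_review(confidence: float) -> bool:
    """True when a lesson has decayed far enough to be flagged for a human."""
    return confidence < REVIEW_THRESHOLD
```

Because the rule is multiplicative, a lesson starting at `1.0` takes a few weeks of consecutive stale passes to cross the review threshold. Updating `last_referenced` on recall pauses the decay.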
+ +## Configuration & Environment + +All scripts rely on environment variables for API keys: + +- `OPENAI_API_KEY` – used by `embed-memories.py`, `semantic-search.py`, `proactive-recall.py` +- `ANTHROPIC_API_KEY` – used by `extract-memories.sh` (can also be read from `~/.secrets/anthropic-api-key`) + +Database connection parameters are hard‑coded in each script (`DB_NAME = "nova_memory"`, `host="localhost"`, `user="nova"`). Modify these constants if your setup differs. + +## Integration with the NOVA Ecosystem + +The scripts are designed to be used together with: + +- **Clawdbot/OpenClaw** – hooks can call `extract-memories.sh` and `proactive-recall.py` +- **PostgreSQL + pgvector** – the vector store for embeddings +- **Cron** – scheduled execution of `embed-memories-cron.sh` and `decay-confidence.sh` +- **1Password** – API keys can be fetched via `op` (used in some scripts) + +## Extending the Pipeline + +To add a new source of memories: + +1. Ensure its content is stored in a database table or a file in `~/clawd/memory/` +2. Add a new embedding function in `embed-memories.py` following the pattern of `embed_daily_logs()` or `embed_lessons()` +3. Update the `--source` argument handling to include your new source +4. (Optional) Add test queries for the new source in `recall-benchmark.py` + +To adjust recall sensitivity: + +- Modify `DEFAULT_THRESHOLD` in `proactive-recall.py` (lower = more results, higher = more precise) +- Change the `threshold` argument in `semantic-search.py` + +## Troubleshooting + +If recall performance drops: + +1. Run `recall-benchmark.py --verbose` to see which queries are failing +2. Check that `embed-memories-cron.sh` is running daily (logs in `~/clawd/logs/embed-memories.log`) +3. Verify that the `memory_embeddings` table is being populated: + ```sql + SELECT source_type, COUNT(*) FROM memory_embeddings GROUP BY source_type; + ``` +4. 
Ensure the pgvector index is built (`ivfflat` for cosine similarity) + +If extraction fails: + +- Confirm the `ANTHROPIC_API_KEY` is set and valid +- Check that the Claude model (`claude-sonnet-4-20250514`) is accessible +- Review the prompt in `extract-memories.sh` for compatibility with your use case + +--- + +*This architecture enables NOVA to maintain a long‑term, searchable memory that improves context awareness and response relevance over time.* \ No newline at end of file diff --git a/README.md b/README.md index 842e176..e51da82 100644 --- a/README.md +++ b/README.md @@ -1,35 +1,156 @@ # nova-scripts ✨ -Utility scripts and tools by NOVA — an AI assistant running on [Clawdbot](https://github.com/clawdbot/clawdbot). +Utility scripts and tools by NOVA — an AI agent running on [OpenClaw](https://github.com/openclaw/openclaw). -These are small utilities I've written to solve everyday problems. Open source in case they're useful to others! +Part of the [NOVA-Openclaw](https://github.com/NOVA-Openclaw) ecosystem. These are utilities for memory management, semantic recall, security, and general maintenance. Open source in case they're useful to others! -## Scripts +--- + +## Contents + +- [Memory Pipeline](#memory-pipeline) — Embedding, extraction, search, recall +- [Security](#security) — Pre-commit secret scanning +- [Utilities](#utilities) — Google Drive sync +- [Agent Chat Channel](#agent-chat-channel) — Inter-agent messaging plugin +- [Prerequisites](#prerequisites) + +--- + +## Memory Pipeline + +Scripts for managing NOVA's semantic memory system: extracting memories from conversations, embedding them with vector representations, searching by meaning, and maintaining quality over time. + +### embed-memories.py + +Embed memory content using OpenAI's text-embedding API and store vectors in PostgreSQL with pgvector. Supports multiple source types (daily logs, entity facts, lessons, events, and more). 
+ +```bash +python3 scripts/embed-memories.py # Embed all sources +python3 scripts/embed-memories.py --source daily_log # Embed only daily logs +python3 scripts/embed-memories.py --reindex # Drop and recreate all embeddings +``` + +### semantic-search.py + +Query embedded memories using natural language. Uses cosine similarity to find the most relevant stored memories. + +```bash +python3 scripts/semantic-search.py "what did we discuss about the app?" +python3 scripts/semantic-search.py "project architecture" --limit 10 +``` + +### proactive-recall.py + +Pre-message context retrieval — gets relevant memories *before* processing an incoming message and outputs JSON for injection into agent context. Used by the semantic-recall hook. + +```bash +python3 scripts/proactive-recall.py "user's message here" +``` + +### recall-benchmark.py + +Self-diagnostic that tests the semantic recall pipeline against known ground-truth facts in the database. Measures retrieval accuracy across different query patterns. + +```bash +python3 scripts/recall-benchmark.py # Run benchmark +python3 scripts/recall-benchmark.py --verbose # Detailed per-query results +python3 scripts/recall-benchmark.py --json # Machine-readable output +``` + +Exit code 0 if hit rate ≥ 60%. + +### extract-memories.sh + +Extract structured memories from conversation text using the Anthropic Claude API. Respects sender privacy and visibility preferences. + +```bash +echo "conversation text" | ./scripts/extract-memories.sh +``` + +Requires `ANTHROPIC_API_KEY` (or `~/.secrets/anthropic-api-key`). + +### decay-confidence.sh + +Decay confidence scores for lessons that haven't been referenced recently. Prevents stale knowledge from ranking too highly in recall. Designed for daily cron execution. + +```bash +# Crontab entry: +0 4 * * * ~/nova-scripts/scripts/decay-confidence.sh +``` + +### embed-memories-cron.sh + +Cron wrapper for nightly embedding runs. 
Activates the Python venv, runs the embedding script, and logs output. + +```bash +# Crontab entry: +0 3 * * * ~/nova-scripts/scripts/embed-memories-cron.sh +``` + +--- + +## Security + +### git-security/ + +Pre-commit hook that scans staged files for potential secret leaks before they're committed. Detects API keys (Anthropic, OpenAI, AWS, GitHub), private keys, passwords, and other sensitive patterns. + +```bash +# Install hooks to a repository: +./scripts/git-security/install-hooks.sh /path/to/repo +``` + +This will: +1. Copy the pre-commit scanning hook to `.git/hooks/pre-commit` +2. Update `.gitignore` with common secret file patterns (`.env`, `*.pem`, `*.key`, etc.) + +--- + +## Utilities ### gdrive-sync.sh Simple Google Drive folder sync using [gogcli](https://gogcli.sh). ```bash -./gdrive-sync.sh pull # Download from GDrive to local -./gdrive-sync.sh push # Upload from local to GDrive -./gdrive-sync.sh status # Show files in both locations +./scripts/gdrive-sync.sh pull # Download from GDrive to local +./scripts/gdrive-sync.sh push # Upload from local to GDrive +./scripts/gdrive-sync.sh status # Show files in both locations ``` -**Requirements:** -- [gogcli](https://gogcli.sh) (`brew install steipete/tap/gogcli`) -- `jq` for JSON parsing -- Authenticated gog account (`gog auth add you@gmail.com`) - **Configuration:** Edit the variables at the top of the script: - `LOCAL_DIR` — local directory to sync - `GDRIVE_FOLDER_ID` — Google Drive folder ID - `ACCOUNT` — your Google account email +--- + +## Agent Chat Channel + +The `agent-chat-channel/` directory contains a full OpenClaw channel plugin for PostgreSQL-based inter-agent messaging. It uses `LISTEN/NOTIFY` for real-time message delivery, mention-based routing, and deduplication via a processed-messages table. + +See [`agent-chat-channel/README.md`](agent-chat-channel/README.md) for full documentation and [`agent-chat-channel/SETUP.md`](agent-chat-channel/SETUP.md) for quick setup instructions. 
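The routing and deduplication behaviour described for the agent chat channel can be sketched in a few lines of Python. This is an in-memory illustration only: the actual plugin is a Node.js module backed by PostgreSQL, and every name below (`ChatMessage`, `MentionRouter`, `deliver`) is hypothetical. The set of `(chat_id, agent)` pairs stands in for the processed-messages table, on the assumption that each message is tracked once per agent.

```python
from dataclasses import dataclass, field


@dataclass
class ChatMessage:
    """Illustrative stand-in for one row of the chat table."""
    chat_id: int
    sender: str
    message: str
    mentions: list[str] = field(default_factory=list)


class MentionRouter:
    """In-memory model of mention-based routing with per-agent dedup."""

    def __init__(self) -> None:
        # Plays the role of the processed-messages table:
        # one entry per (chat_id, agent) pair that was already handled.
        self.processed: set[tuple[int, str]] = set()

    def deliver(self, msg: ChatMessage, agent: str) -> bool:
        """Return True if `agent` should handle `msg` now, else False."""
        # Mention-based routing: agents that are not mentioned never see it.
        if agent not in msg.mentions:
            return False
        key = (msg.chat_id, agent)
        # Deduplication: a second delivery attempt for the same pair is dropped.
        if key in self.processed:
            return False
        self.processed.add(key)
        return True
```

In a database-backed version the same guarantee would usually come from a uniqueness constraint on the pair rather than from application state, so that a restarted listener does not replay messages.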
+ +--- + +## Prerequisites + +| Dependency | Used By | Install | +|------------|---------|---------| +| Python 3 | Memory scripts | System package manager | +| `psycopg2` | Memory scripts | `pip install psycopg2-binary` | +| `openai` | embed-memories, semantic-search, proactive-recall | `pip install openai` | +| PostgreSQL + pgvector | Memory storage | [pgvector docs](https://github.com/pgvector/pgvector) | +| Anthropic API key | extract-memories.sh | [anthropic.com](https://www.anthropic.com/) | +| OpenAI API key | Embedding scripts | [platform.openai.com](https://platform.openai.com/) | +| [gogcli](https://gogcli.sh) | gdrive-sync.sh | `brew install steipete/tap/gogcli` | +| `jq` | gdrive-sync.sh | System package manager | +| Node.js + npm | agent-chat-channel | [nodejs.org](https://nodejs.org/) | + ## License MIT — do whatever you want with these. --- -*Made with 💜 by NOVA (Neural Oracle, Velvet Attitude)* +*Part of the [NOVA-Openclaw](https://github.com/NOVA-Openclaw) project.* From 910d5503ad5fba0b2f90c898e57aa7a2607317d4 Mon Sep 17 00:00:00 2001 From: NOVA Date: Wed, 6 May 2026 19:07:30 +0000 Subject: [PATCH 2/2] =?UTF-8?q?docs:=20comprehensive=20README,=20ARCHITECT?= =?UTF-8?q?URE.md,=20and=20Clawdbot=20=E2=86=92=20OpenClaw=20migration?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Rewrite root README.md to cover ALL 7+ scripts and modules: - Memory Pipeline (extract, embed, recall, benchmark, decay) - Utilities (gdrive-sync, agent-install) - Security (git-security hooks) - Plugin (agent-chat-channel) - Add ARCHITECTURE.md explaining system relationships, memory pipeline data flow, agent chat plugin role, and git security purpose - Fix stale brand references: Clawdbot → OpenClaw throughout all markdown, YAML config, JS comments, and Python docstrings - Leave legitimate filesystem paths intact (clawdbot-plugins/ dir, .clawdbot/ legacy config path) --- ARCHITECTURE.md | 311 +++++++++++++++---------- README.md | 
311 +++++++++++++++++++------ agent-chat-channel/README.md | 12 +- agent-chat-channel/SETUP.md | 12 +- agent-chat-channel/example-config.yaml | 11 +- agent-chat-channel/index.js | 4 +- scripts/proactive-recall.py | 2 +- 7 files changed, 440 insertions(+), 223 deletions(-) diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md index b2b9a8f..b9eed52 100644 --- a/ARCHITECTURE.md +++ b/ARCHITECTURE.md @@ -1,181 +1,242 @@ -# Architecture: NOVA Semantic Memory Pipeline +# nova-scripts Architecture -This document describes how the scripts in this repository work together to implement a semantic memory system for the NOVA agent ecosystem. +> *Memory flows through stone channels,* +> *Voices carried on database waves,* +> *Knowledge held in vector space.* -## Overview +This document describes how the components in this repository relate to each other and their role in the broader NOVA agent ecosystem. -The pipeline transforms raw conversational data into searchable, context‑aware memories through three stages: +--- -1. **Extraction** – structured data is pulled from natural‑language messages -2. **Embedding** – text is converted to vector embeddings and stored -3. **Recall** – relevant memories are retrieved based on semantic similarity +## System Overview -A fourth **maintenance** stage ensures memory quality over time. +This repository contains three distinct subsystems that support the NOVA agent ecosystem: -## Data Flow +1. **Memory Pipeline** — Persistent semantic memory for agents +2. **Agent Chat Channel** — Inter-agent messaging via PostgreSQL +3. 
**Git Security** — Pre-commit secret scanning ``` -┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ -│ Raw Input │ │ Extraction │ │ Structured │ -│ • Chat messages│────▶• extract‑memories│────▶• lessons │ -│ • Daily logs │ │ .sh (Claude) │ │• facts/entities │ -│ • MEMORY.md │ │ │ │• opinions │ -└─────────────────┘ └─────────────────┘ └─────────────────┘ - │ -┌─────────────────┐ ┌─────────────────┐ ┌──────────▼──────────┐ -│ Query/Message │ │ Recall │ │ Embedding │ -│ • User query │◀───│• semantic‑search│◀───│• embed‑memories.py │ -│ • New message │ │• proactive‑recall│ │• embed‑memories‑cron│ -└─────────────────┘ └─────────────────┘ └─────────────────────┘ - │ │ - │ ┌─────▼──────┐ - └──────────────────────────────────────────│ pgvector │ - │ embeddings │ - └────────────┘ +┌─────────────────────────────────────────────────────────────┐ +│ NOVA Agent Ecosystem │ +│ │ +│ ┌────────────┐ ┌──────────────┐ ┌────────────────┐ │ +│ │ Memory │ │ Agent Chat │ │ Git Security │ │ +│ │ Pipeline │ │ Channel │ │ Hooks │ │ +│ └─────┬───────┘ └──────┬───────┘ └───────┬────────┘ │ +│ │ │ │ │ +│ ▼ ▼ ▼ │ +│ ┌────────────────────────────────────────────────────┐ │ +│ │ PostgreSQL (nova_memory) │ │ +│ │ memory_embeddings │ lessons │ events │ │ │ +│ │ agent_chat │ sops │ │ │ │ +│ └────────────────────────────────────────────────────┘ │ +│ │ +│ OpenAI (text-embedding-3-small) ◄── Embedding API │ +│ Anthropic (Claude) ◄── Extraction API │ +└─────────────────────────────────────────────────────────────┘ ``` -### Stage 1: Extraction (`extract-memories.sh`) +--- + +## Memory Pipeline -The pipeline begins when a natural‑language message arrives. `extract-memories.sh`: +The memory pipeline is the core of NOVA's persistent, semantically-searchable memory. It transforms raw chat messages into vector embeddings that can be retrieved at runtime for context injection. 
-- Calls the Claude API with a carefully crafted prompt -- Asks Claude to output JSON containing **entities**, **facts**, **opinions**, **preferences**, **vocabulary**, and **events** -- Each extracted item includes privacy metadata (`visibility`, `visibility_reason`) based on the sender’s default visibility and any privacy cues in the message -- The resulting JSON is intended to be stored in the appropriate tables of the `nova_memory` database (though the script itself only outputs JSON; actual storage is handled by a hook or calling process) +### Data Flow -### Stage 2: Embedding (`embed-memories.py`, `embed-memories-cron.sh`) +``` +Chat Message + │ + ▼ +extract-memories.sh ────────────► Database tables +(Anthropic Claude API) (entities, facts, + │ lessons, events, + ▼ preferences, etc.) +embed-memories.py ──────────────► memory_embeddings table +(OpenAI embeddings API, (pgvector column) + pgvector, PostgreSQL) + │ + ▼ +proactive-recall.py ◄────── New message triggers recall +(Pre-message context injection) │ + │ │ + ▼ ▼ +Agent session gets semantic-search.py +relevant memory context (Ad-hoc CLI queries) + │ + ▼ +recall-benchmark.py ─── Validates pipeline accuracy +(Self-diagnostic) against known ground truth + │ + ▼ +decay-confidence.sh ─── Gradually reduces confidence +(Cron, daily) of stale/unreferenced lessons +``` -Once structured data is in the database, it must be converted to vector form for semantic search. 
+### Stage 1: Extraction -`embed-memories.py`: +**Script:** `scripts/extract-memories.sh` -- Reads from multiple **sources**: daily logs (`*.md` files in `~/clawd/memory/`), the global `MEMORY.md`, and database tables (`lessons`, `events`, `sops`) -- Splits long texts into overlapping **chunks** (configurable `CHUNK_SIZE` and `CHUNK_OVERLAP`) -- Sends each chunk to OpenAI’s `text‑embedding‑3‑small` model to obtain a 1536‑dimensional vector -- Stores the vector together with the original text, source type, and source ID in the `memory_embeddings` table (PostgreSQL + pgvector) +Incoming chat messages (from any channel — Signal, WhatsApp, Discord, etc.) are processed through the `extract-memories.sh` script. It calls the Anthropic Claude API with a structured prompt that: -`embed-memories-cron.sh` is a simple wrapper that runs `embed-memories.py` daily and logs the output. +- Parses the message for entities, facts, opinions, preferences, vocabulary, and events +- Applies privacy detection (respecting per-user default visibility settings and override cues) +- Returns structured JSON stored in the database -### Stage 3: Recall (`semantic-search.py`, `proactive-recall.py`) +### Stage 2: Embedding -When a query or new message needs context, the system retrieves the most relevant stored memories. 
+**Scripts:** `scripts/embed-memories.py`, `scripts/embed-memories-cron.sh` -**Semantic Search** (`semantic-search.py`): +The embedding script reads from five source types: -- Accepts a free‑text query -- Embeds the query using the same OpenAI model -- Computes cosine similarity between the query embedding and all stored embeddings -- Returns the top‑k results above a similarity threshold +| Source | Database Table / File | Description | +|---|---|---| +| `daily_log` | `~/clawd/memory/*.md` | Daily markdown logs | +| `memory_md` | `~/clawd/MEMORY.md` | Main memory file | +| `lesson` | `lessons` table | Learned lessons from corrections | +| `event` | `events` table | Calendar events | +| `sop` | `sops` table | Standard Operating Procedures | -**Proactive Recall** (`proactive-recall.py`): +Each source is chunked (1000 chars per chunk with 200 char overlap), embedded via OpenAI's `text-embedding-3-small` model, and stored in the `memory_embeddings` table with a `pgvector` vector column. -- Designed to be called from a **message pre‑processing hook** (e.g., in Clawdbot) -- Given an incoming message, retrieves the most relevant memories *before* the message is processed by the agent -- Returns the memories formatted for direct injection into the agent’s context window -- Uses a lower similarity threshold (`0.4`) to cast a wider net, ensuring potentially relevant context is not missed +The cron wrapper (`embed-memories-cron.sh`) runs this daily to keep embeddings current. -### Stage 4: Maintenance (`decay-confidence.sh`, `recall-benchmark.py`) +### Stage 3: Recall -Memory quality degrades over time if not actively maintained. These scripts keep the system accurate and reliable. 
+**Scripts:** `scripts/proactive-recall.py`, `scripts/semantic-search.py` -**Confidence Decay** (`decay-confidence.sh`): +**Proactive Recall:** Before processing a user message, `proactive-recall.py` embeds the message query and performs a nearest-neighbor search against the `memory_embeddings` table. The top results are injected into the agent's context as "relevant memories." -- Runs as a daily cron job -- For any **lesson** that hasn’t been referenced in the last 30 days, reduces its confidence score by 5% -- Enforces a minimum confidence floor of `0.1` (lessons are never completely forgotten) -- Logs lessons that fall below a `0.3` confidence threshold for human review +**Semantic Search:** `semantic-search.py` is the ad-hoc CLI version — useful for manual queries and debugging. -**Recall Benchmark** (`recall-benchmark.py`): +Both use cosine distance (`<=>` operator in pgvector) for similarity ranking. -- A self‑diagnostic that validates the recall pipeline against **ground‑truth facts** stored in the database -- Executes a curated set of queries (e.g., “What is I)ruid’s birthday?”) and checks whether the expected keywords appear in the returned memories -- Computes a **hit rate**; the pipeline passes if ≥ 60% of queries succeed -- Provides per‑category breakdowns (entity lookup, library retrieval, lesson recall, etc.) -- Can be run manually or scheduled to ensure the memory system remains effective +### Stage 4: Maintenance -## Database Schema +**Scripts:** `scripts/recall-benchmark.py`, `scripts/decay-confidence.sh` -The scripts assume the following core tables exist in the `nova_memory` database: +**Benchmarking:** `recall-benchmark.py` runs a set of known queries against `proactive-recall.py` and checks if expected keywords appear in the results. 
It tests: -### `memory_embeddings` -```sql -CREATE TABLE memory_embeddings ( - id SERIAL PRIMARY KEY, - source_type TEXT NOT NULL, -- 'daily_log', 'memory_md', 'lesson', 'event', 'sop' - source_id TEXT NOT NULL, -- unique identifier for the source chunk - content TEXT NOT NULL, -- original text chunk - embedding vector(1536), -- pgvector column - created_at TIMESTAMP DEFAULT NOW() -); -CREATE INDEX ON memory_embeddings USING ivfflat (embedding vector_cosine_ops); -``` +- Entity lookups (direct fact retrieval) +- Library knowledge queries +- Lesson recall (from past corrections) +- Event date queries +- Cross-reference queries (architecture knowledge) +- Noise handling (irrelevant queries should return empty results) + +The pipeline passes if hit rate ≥ 60%. + +**Confidence Decay:** `decay-confidence.sh` runs daily via cron. It reduces confidence scores for lessons that haven't been referenced in 30+ days (multiply by 0.95, floor at 0.1). Lessons below 0.3 confidence are logged as candidates for review. + +--- + +## Agent Chat Channel + +The `agent-chat-channel/` directory contains a PostgreSQL-based messaging channel plugin for OpenClaw. -### `lessons` -```sql -CREATE TABLE lessons ( - id SERIAL PRIMARY KEY, - lesson TEXT NOT NULL, -- the lesson text - context TEXT, -- optional context - confidence FLOAT DEFAULT 1.0, -- confidence score (0.1–1.0) - last_referenced TIMESTAMP, -- when the lesson was last recalled - created_at TIMESTAMP DEFAULT NOW() -); +### Role in the Ecosystem + +In the NOVA agent ecosystem, agents need to communicate with each other. 
The agent-chat-channel plugin provides this capability by treating the `agent_chat` database table as a message bus: + +``` +Agent A (e.g., scout) + │ + │ INSERT INTO agent_chat (sender='scout', message='...', mentions=ARRAY['coder']) + ▼ +agent_chat table ──► PostgreSQL NOTIFY + │ + ▼ +gateway.agentChatPlugin ──► LISTEN agent_chat + │ + ├──► Routes to Agent B's session (e.g., coder) + │ runtime.handleInbound({...}) + │ + └──► Marks message as processed in agent_chat_processed ``` -### `events`, `sops`, `entity_facts`, etc. +### Key Design Decisions -Additional tables store structured data extracted by `extract-memories.sh`. Refer to the NOVA memory schema documentation for full details. +- **Database as message bus:** No separate message broker needed. PostgreSQL's LISTEN/NOTIFY provides real-time delivery. +- **Mention-based routing:** Agents only receive messages that mention them by name. This prevents message storms. +- **Deduplication at the DB level:** The `agent_chat_processed` table with a composite primary key `(chat_id, agent)` ensures each message is processed exactly once per agent. +- **1Password integration:** Database credentials can be stored in 1Password and resolved at runtime. -## Configuration & Environment +### Database Tables -All scripts rely on environment variables for API keys: +| Table | Purpose | +|---|---| +| `agent_chat` | Message store (channel, sender, message, mentions, reply chain) | +| `agent_chat_processed` | Deduplication tracker | -- `OPENAI_API_KEY` – used by `embed-memories.py`, `semantic-search.py`, `proactive-recall.py` -- `ANTHROPIC_API_KEY` – used by `extract-memories.sh` (can also be read from `~/.secrets/anthropic-api-key`) +### Plugin Architecture -Database connection parameters are hard‑coded in each script (`DB_NAME = "nova_memory"`, `host="localhost"`, `user="nova"`). Modify these constants if your setup differs. 
+The plugin follows OpenClaw's channel plugin architecture: -## Integration with the NOVA Ecosystem +| Component | Purpose | +|---|---| +| `config.resolveAccount` | Resolves account configuration (single or multi-account) | +| `gateway.startAccount` | Core listening loop (LISTEN, fetch unprocessed, route to sessions) | +| `outbound.sendText` | Sends agent replies back to the `agent_chat` table | +| `status` | Health and runtime status reporting | -The scripts are designed to be used together with: +--- -- **Clawdbot/OpenClaw** – hooks can call `extract-memories.sh` and `proactive-recall.py` -- **PostgreSQL + pgvector** – the vector store for embeddings -- **Cron** – scheduled execution of `embed-memories-cron.sh` and `decay-confidence.sh` -- **1Password** – API keys can be fetched via `op` (used in some scripts) +## Git Security -## Extending the Pipeline +The `scripts/git-security/` directory provides pre-commit hooks that scan staged files for secrets before they reach the repository. -To add a new source of memories: +### Purpose -1. Ensure its content is stored in a database table or a file in `~/clawd/memory/` -2. Add a new embedding function in `embed-memories.py` following the pattern of `embed_daily_logs()` or `embed_lessons()` -3. Update the `--source` argument handling to include your new source -4. (Optional) Add test queries for the new source in `recall-benchmark.py` +In an AI agent ecosystem where code is written autonomously (or semi-autonomously), the risk of accidentally committing API keys or credentials is higher than in human-only development. These hooks provide an automated safety net. 
-To adjust recall sensitivity: +### How It Works -- Modify `DEFAULT_THRESHOLD` in `proactive-recall.py` (lower = more results, higher = more precise) -- Change the `threshold` argument in `semantic-search.py` +``` +Developer stages files + │ + ▼ +git commit triggers pre-commit hook + │ + ▼ +Scans staged files for patterns: + - API keys (OpenAI, Anthropic, AWS, GitHub) + - Private keys (RSA, Ed25519, PEM) + - Secrets and passwords in config-like patterns + - Forbidden files (.env, credentials.json, id_*) + │ + ├── No problems found ──► Commit proceeds + │ + └── Secrets detected ──► Commit blocked + (can bypass with --no-verify) +``` -## Troubleshooting +### Installer -If recall performance drops: +`install-hooks.sh` automates installation: +1. Copies `pre-commit-template` to the target repo's `.git/hooks/pre-commit` +2. Makes it executable +3. Updates `.gitignore` with common secret patterns -1. Run `recall-benchmark.py --verbose` to see which queries are failing -2. Check that `embed-memories-cron.sh` is running daily (logs in `~/clawd/logs/embed-memories.log`) -3. Verify that the `memory_embeddings` table is being populated: - ```sql - SELECT source_type, COUNT(*) FROM memory_embeddings GROUP BY source_type; - ``` -4. 
Ensure the pgvector index is built (`ivfflat` for cosine similarity) +--- + +## Dependencies Summary + +| Component | Dependencies | +|---|---| +| Memory Pipeline | PostgreSQL (pgvector), OpenAI API, Anthropic API, Python 3 (psycopg2, openai), bash (jq, curl, psql) | +| Agent Chat Plugin | Node.js, PostgreSQL (`pg` npm package) | +| Git Security | bash, grep | +| GDrive Sync | gogcli, jq | + +--- -If extraction fails: +## Related Repositories -- Confirm the `ANTHROPIC_API_KEY` is set and valid -- Check that the Claude model (`claude-sonnet-4-20250514`) is accessible -- Review the prompt in `extract-memories.sh` for compatibility with your use case +- [OpenClaw](https://github.com/nova-ai/openclaw) — The gateway platform these scripts run on +- [nova-memory](https://github.com/nova-ai/nova-memory) — Database schemas and migrations +- [nova-cognition](https://github.com/nova-ai/nova-cognition) — Agent cognition and routing --- -*This architecture enables NOVA to maintain a long‑term, searchable memory that improves context awareness and response relevance over time.* \ No newline at end of file +*Architecture reviewed 2026-05-06* diff --git a/README.md b/README.md index e51da82..5c116a2 100644 --- a/README.md +++ b/README.md @@ -1,151 +1,314 @@ # nova-scripts ✨ -Utility scripts and tools by NOVA — an AI agent running on [OpenClaw](https://github.com/openclaw/openclaw). +Utility scripts and tools by **NOVA** — an AI assistant ecosystem running on [OpenClaw](https://github.com/nova-ai/openclaw). -Part of the [NOVA-Openclaw](https://github.com/NOVA-Openclaw) ecosystem. These are utilities for memory management, semantic recall, security, and general maintenance. Open source in case they're useful to others! +This repository contains the operational scripts that power NOVA's memory pipeline, inter-agent communication, Google Drive sync, and pre-commit security hooks. 
These are small utilities written to solve everyday problems — open source in case they're useful to others. --- -## Contents - -- [Memory Pipeline](#memory-pipeline) — Embedding, extraction, search, recall -- [Security](#security) — Pre-commit secret scanning -- [Utilities](#utilities) — Google Drive sync -- [Agent Chat Channel](#agent-chat-channel) — Inter-agent messaging plugin -- [Prerequisites](#prerequisites) +## Table of Contents + +- [Memory Pipeline](#memory-pipeline) + - [extract-memories.sh](#extract-memoriessh) + - [embed-memories.py](#embed-memoriespy) + - [embed-memories-cron.sh](#embed-memories-cronsh) + - [proactive-recall.py](#proactive-recallpy) + - [semantic-search.py](#semantic-searchpy) + - [recall-benchmark.py](#recall-benchmarkpy) + - [decay-confidence.sh](#decay-confidencesh) +- [Utilities](#utilities) + - [gdrive-sync.sh](#gdrive-syncsh) + - [agent-install.sh](#agent-installsh) +- [Security](#security) + - [git-security (install-hooks.sh + pre-commit-template)](#git-security) +- [Plugin](#plugin) + - [agent-chat-channel](#agent-chat-channel) +- [Architecture](#architecture) +- [License](#license) --- ## Memory Pipeline -Scripts for managing NOVA's semantic memory system: extracting memories from conversations, embedding them with vector representations, searching by meaning, and maintaining quality over time. +The memory pipeline is the core of NOVA's persistent memory system. Data flows through four stages: + +1. **Extract** → Parse chat messages and extract structured memory (facts, entities, lessons) +2. **Embed** → Convert extracted content into vector embeddings using OpenAI + pgvector +3. **Recall** → On new messages, retrieve semantically relevant memories for context injection +4. **Maintain** → Decay stale memories and benchmark retrieval accuracy + +### extract-memories.sh + +Extracts structured memory data (entities, facts, opinions, preferences, vocabulary, events) from chat messages using the Anthropic Claude API. 
Designed to be called from a message processing hook. + +```bash +# Pipe a message directly +echo "I love working on the Nova project" | ./scripts/extract-memories.sh + +# Or pass as argument +./scripts/extract-memories.sh "My birthday is May 27th, 1978" + +# Environment variables for sender attribution +SENDER_NAME="I)ruid" SENDER_ID="+15551234567" ./scripts/extract-memories.sh "Just between us, I'm thinking of quitting" +``` + +**Requirements:** Anthropic API key (`ANTHROPIC_API_KEY`), `jq`, `curl`, `psql` (for privacy preference lookup) + +**Output:** JSON with extracted entities, facts, opinions, preferences, vocabulary, and events. + +**Privacy-aware:** Respects per-user default visibility settings (public/private) stored in the database. Detects override cues like "feel free to share" or "don't tell anyone." ### embed-memories.py -Embed memory content using OpenAI's text-embedding API and store vectors in PostgreSQL with pgvector. Supports multiple source types (daily logs, entity facts, lessons, events, and more). +Takes memory content from various sources (daily logs, MEMORY.md, database lessons, events, SOPs) and creates vector embeddings using OpenAI's `text-embedding-3-small` model. Stores embeddings in PostgreSQL using the `pgvector` extension. 
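The Chunking note for this script gives 1000-character chunks with a 200-character overlap. Below is a minimal sketch of that splitting, assuming plain character offsets. `chunk_text` is a hypothetical name, and the actual logic in `embed-memories.py` may differ (for example, by breaking on paragraph boundaries).

```python
def chunk_text(text, size=1000, overlap=200):
    """Split text into chunks of at most `size` characters.

    Consecutive chunks share `overlap` characters, so a sentence that
    straddles a boundary still appears whole in at least one chunk.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this chunk already reached the end of the text
    return chunks
```

Each chunk would then be embedded and stored as its own row, which is what makes retrieval granular rather than per-file.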
```bash -python3 scripts/embed-memories.py # Embed all sources -python3 scripts/embed-memories.py --source daily_log # Embed only daily logs -python3 scripts/embed-memories.py --reindex # Drop and recreate all embeddings +# Embed all sources +python3 ./scripts/embed-memories.py + +# Embed only daily logs +python3 ./scripts/embed-memories.py --source daily_log + +# Drop and recreate all embeddings +python3 ./scripts/embed-memories.py --reindex ``` -### semantic-search.py +**Requirements:** OpenAI API key (`OPENAI_API_KEY`), PostgreSQL (`nova_memory` database), `pgvector` extension + +**Sources:** +- `daily_log` — Markdown files from `~/clawd/memory/` +- `memory_md` — `MEMORY.md` file +- `lesson` — Database `lessons` table +- `event` — Database `events` table +- `sop` — Database `sops` table (Standard Operating Procedures) + +**Chunking:** Splits text into overlapping chunks (1000 chars with 200 char overlap) for granular retrieval. + +### embed-memories-cron.sh -Query embedded memories using natural language. Uses cosine similarity to find the most relevant stored memories. +Cron wrapper script that runs `embed-memories.py` daily. Sources a Python virtual environment and logs output to `~/clawd/logs/embed-memories.log`. ```bash -python3 scripts/semantic-search.py "what did we discuss about the app?" -python3 scripts/semantic-search.py "project architecture" --limit 10 +# Run manually +./scripts/embed-memories-cron.sh + +# Typical crontab entry (runs daily at 3 AM) +0 3 * * * /home/nova/clawd/scripts/embed-memories-cron.sh ``` +**Requirements:** Python virtual environment at `~/clawd/scripts/tts-venv/` + ### proactive-recall.py -Pre-message context retrieval — gets relevant memories *before* processing an incoming message and outputs JSON for injection into agent context. Used by the semantic-recall hook. +Pre-message semantic recall system. Given a user's message, retrieves the most semantically relevant memories from the embedding store. 
Designed for context injection; it can produce formatted output ready to insert into an LLM prompt. ```bash -python3 scripts/proactive-recall.py "user's message here" +# Standalone search +python3 ./scripts/proactive-recall.py "What projects does NOVA include?" + +# Formatted for prompt injection +python3 ./scripts/proactive-recall.py "Tell me about the Silmarillion" --inject + +# Custom result limit +python3 ./scripts/proactive-recall.py "How does the recall system work?" --limit 5 ``` -### recall-benchmark.py +**Requirements:** OpenAI API key, PostgreSQL (`nova_memory` database with `memory_embeddings` table), `pgvector` -Self-diagnostic that tests the semantic recall pipeline against known ground-truth facts in the database. Measures retrieval accuracy across different query patterns. +**Output:** JSON with ranked results including source, content excerpt, and similarity score. Use `--inject` for a preformatted markdown block. + +### semantic-search.py + +CLI tool for ad-hoc semantic search across the embedded memory store. Provides flexible querying with configurable similarity threshold and result limits. ```bash -python3 scripts/recall-benchmark.py # Run benchmark -python3 scripts/recall-benchmark.py --verbose # Detailed per-query results -python3 scripts/recall-benchmark.py --json # Machine-readable output +# Basic search +python3 ./scripts/semantic-search.py "what did we discuss about the app?" + +# More results with lower threshold +python3 ./scripts/semantic-search.py "I)ruid's health" --limit 10 --threshold 0.3 + +# JSON output for programmatic use +python3 ./scripts/semantic-search.py "architecture" --json ``` -Exit code 0 if hit rate ≥ 60%. +**Requirements:** OpenAI API key, PostgreSQL, `pgvector` -### extract-memories.sh +### recall-benchmark.py -Extract structured memories from conversation text using the Anthropic Claude API. Respects sender privacy and visibility preferences. 
+Self-diagnostic benchmarking tool that tests the semantic recall pipeline (`proactive-recall.py`) against known ground-truth facts. Measures retrieval accuracy across multiple query patterns including entity lookups, library retrieval, lesson recall, event queries, cross-references, and noise handling. ```bash -echo "conversation text" | ./scripts/extract-memories.sh +# Run with verbose output +python3 ./scripts/recall-benchmark.py --verbose + +# JSON output for analysis +python3 ./scripts/recall-benchmark.py --json ``` -Requires `ANTHROPIC_API_KEY` (or `~/.secrets/anthropic-api-key`). +**Pass/fail:** Exit code 0 if hit rate ≥ 60%, exit code 1 otherwise. Reports per-category breakdown. + +**Test categories:** +- `entity_lookup` — Direct fact retrieval (names, birthdays, relationships) +- `library` — Library/knowbase retrieval (books, subjects) +- `lesson` — Lesson recall from past corrections +- `event` — Event date retrieval +- `cross_reference` — Architecture and ecosystem knowledge +- `noise` — Handles irrelevant queries without false positives ### decay-confidence.sh -Decay confidence scores for lessons that haven't been referenced recently. Prevents stale knowledge from ranking too highly in recall. Designed for daily cron execution. +Cron-compatible script that gradually decays confidence scores for lessons not referenced recently. Prevents stale or outdated information from persisting indefinitely. ```bash -# Crontab entry: -0 4 * * * ~/nova-scripts/scripts/decay-confidence.sh +# Run manually +./scripts/decay-confidence.sh + +# Typical crontab entry (runs daily at 4 AM) +0 4 * * * /home/nova/clawd/scripts/decay-confidence.sh ``` -### embed-memories-cron.sh +**Behavior:** +- Lessons not referenced in 30+ days: confidence multiplied by 0.95 +- Minimum confidence floor: 0.1 (never fully forgets) +- Logs lessons that drop below 0.3 confidence for review -Cron wrapper for nightly embedding runs. 
Activates the Python venv, runs the embedding script, and logs output. +**Requirements:** PostgreSQL access to `nova_memory` database + +--- + +## Utilities + +### gdrive-sync.sh + +Simple Google Drive folder sync using [gogcli](https://gogcli.sh). Supports pull, push, and status operations for a single Google Drive folder. ```bash -# Crontab entry: -0 3 * * * ~/nova-scripts/scripts/embed-memories-cron.sh +# Set required config (or edit script defaults) +export GDRIVE_FOLDER_ID="your-folder-id" +export LOCAL_DIR="$HOME/my-sync-folder" +export GOG_ACCOUNT="you@gmail.com" # optional + +# Download from GDrive to local +./scripts/gdrive-sync.sh pull + +# Upload from local to GDrive +./scripts/gdrive-sync.sh push + +# Show files in both locations +./scripts/gdrive-sync.sh status ``` ---- +**Requirements:** +- [gogcli](https://gogcli.sh) (`brew install steipete/tap/gogcli`) +- `jq` for JSON parsing +- Authenticated gog account (`gog auth add you@gmail.com`) -## Security +**Configuration:** Set via environment variables (`GDRIVE_FOLDER_ID`, `LOCAL_DIR`, `GOG_ACCOUNT`) or edit the defaults at the top of the script. -### git-security/ +### agent-install.sh -Pre-commit hook that scans staged files for potential secret leaks before they're committed. Detects API keys (Anthropic, OpenAI, AWS, GitHub), private keys, passwords, and other sensitive patterns. +Stub installer for NOVA-INSTALL.sh compatibility. This repository has no installation requirements. ```bash -# Install hooks to a repository: -./scripts/git-security/install-hooks.sh /path/to/repo +./agent-install.sh +# Output: No installation steps for nova-scripts ``` -This will: -1. Copy the pre-commit scanning hook to `.git/hooks/pre-commit` -2. Update `.gitignore` with common secret file patterns (`.env`, `*.pem`, `*.key`, etc.) - --- -## Utilities +## Security -### gdrive-sync.sh +### git-security + +Pre-commit hooks for scanning staged files for potential secrets before they reach the repository. 
Prevents accidental commits of API keys, credentials, and private keys. -Simple Google Drive folder sync using [gogcli](https://gogcli.sh). +**Contents:** +- `install-hooks.sh` — Installer script that copies the hook template to any git repository +- `pre-commit-template` — The actual pre-commit hook template ```bash -./scripts/gdrive-sync.sh pull # Download from GDrive to local -./scripts/gdrive-sync.sh push # Upload from local to GDrive -./scripts/gdrive-sync.sh status # Show files in both locations +# Install hooks in a repository +./scripts/git-security/install-hooks.sh /path/to/your/repo ``` -**Configuration:** Edit the variables at the top of the script: -- `LOCAL_DIR` — local directory to sync -- `GDRIVE_FOLDER_ID` — Google Drive folder ID -- `ACCOUNT` — your Google account email +**What it detects (via the pre-commit template):** ---- +| Pattern | Example | +|---|---| +| Anthropic API keys | `sk-ant-api...` | +| Anthropic Admin keys | `sk-ant-admin...` | +| OpenAI API keys | `sk-...` | +| AWS Access Keys | `AKIA...` | +| AWS Secret Keys | Base64-encoded 40-char strings in quotes | +| Private Keys | `-----BEGIN * PRIVATE KEY-----` | +| GitHub Tokens | `ghp_...`, `ghs_...`, etc. | +| Generic secrets/passwords | `"secret": "..."`, `"password": "..."` | +| Generic API keys | `"api_key": "..."` | +| Forbidden files | `.env`, `.pem`, `.key`, `credentials.json`, `id_rsa`, etc. | + +**Installation also updates `.gitignore`** with common secret patterns if they're missing. -## Agent Chat Channel +**Bypass:** `git commit --no-verify` (document why in the commit message). -The `agent-chat-channel/` directory contains a full OpenClaw channel plugin for PostgreSQL-based inter-agent messaging. It uses `LISTEN/NOTIFY` for real-time message delivery, mention-based routing, and deduplication via a processed-messages table. 
+--- -See [`agent-chat-channel/README.md`](agent-chat-channel/README.md) for full documentation and [`agent-chat-channel/SETUP.md`](agent-chat-channel/SETUP.md) for quick setup instructions. +## Plugin + +### agent-chat-channel + +A PostgreSQL-based messaging channel plugin for OpenClaw that allows agents to communicate via the `agent_chat` database table. Uses PostgreSQL `LISTEN`/`NOTIFY` for real-time message delivery. + +**Location:** `agent-chat-channel/` (has its own README.md and SETUP.md) + +**Key features:** +- Real-time notifications via PostgreSQL `NOTIFY` +- Mention-based routing — only processes messages where the agent is mentioned +- Deduplication via `agent_chat_processed` table +- Two-way messaging — routes incoming messages to agent sessions and sends replies back to the database +- Multiple account support — one gateway can serve multiple agents +- 1Password integration for database credentials + +**How it works:** +1. Plugin connects to PostgreSQL and executes `LISTEN agent_chat` +2. On startup, checks for any unprocessed messages where agent name is in the `mentions` array +3. When a `NOTIFY` is received, queries for new messages targeting this agent +4. Routes each message to the agent's session via `runtime.handleInbound` +5. Marks message as processed in `agent_chat_processed` +6. 
Agent replies are inserted back into `agent_chat` with the agent as sender + +**Database schema:** + +```sql +-- Main chat messages table +CREATE TABLE agent_chat ( + id SERIAL PRIMARY KEY, + channel TEXT NOT NULL DEFAULT 'default', + sender TEXT NOT NULL, + message TEXT NOT NULL, + mentions TEXT[] DEFAULT '{}', + reply_to INTEGER REFERENCES agent_chat(id), + created_at TIMESTAMP DEFAULT NOW() +); + +-- Track processed messages +CREATE TABLE agent_chat_processed ( + chat_id INTEGER REFERENCES agent_chat(id) ON DELETE CASCADE, + agent TEXT NOT NULL, + processed_at TIMESTAMP DEFAULT NOW(), + PRIMARY KEY (chat_id, agent) +); +``` + +**Configuration:** See `agent-chat-channel/README.md` and `agent-chat-channel/SETUP.md` for full details. --- -## Prerequisites +## Architecture + +See [ARCHITECTURE.md](./ARCHITECTURE.md) for the system overview showing how these components relate to each other and to the broader NOVA ecosystem. -| Dependency | Used By | Install | -|------------|---------|---------| -| Python 3 | Memory scripts | System package manager | -| `psycopg2` | Memory scripts | `pip install psycopg2-binary` | -| `openai` | embed-memories, semantic-search, proactive-recall | `pip install openai` | -| PostgreSQL + pgvector | Memory storage | [pgvector docs](https://github.com/pgvector/pgvector) | -| Anthropic API key | extract-memories.sh | [anthropic.com](https://www.anthropic.com/) | -| OpenAI API key | Embedding scripts | [platform.openai.com](https://platform.openai.com/) | -| [gogcli](https://gogcli.sh) | gdrive-sync.sh | `brew install steipete/tap/gogcli` | -| `jq` | gdrive-sync.sh | System package manager | -| Node.js + npm | agent-chat-channel | [nodejs.org](https://nodejs.org/) | +--- ## License @@ -153,4 +316,4 @@ MIT — do whatever you want with these. 
--- -*Part of the [NOVA-Openclaw](https://github.com/NOVA-Openclaw) project.* +*Made with 💜 by NOVA (Neural Oracle, Velvet Attitude)* diff --git a/agent-chat-channel/README.md b/agent-chat-channel/README.md index de9576c..f4a1166 100644 --- a/agent-chat-channel/README.md +++ b/agent-chat-channel/README.md @@ -1,4 +1,4 @@ -# Agent Chat Channel Plugin for Clawdbot +# Agent Chat Channel Plugin for OpenClaw PostgreSQL-based messaging channel that allows agents to communicate via the `agent_chat` database table. @@ -60,7 +60,7 @@ cd /home/nova/clawd/clawdbot-plugins/agent-chat-channel npm install ``` -2. Register the plugin in Clawdbot's config (usually `~/.config/clawdbot/config.yaml`): +2. Register the plugin in OpenClaw's config (usually `~/.config/openclaw/config.yaml`): ```yaml plugins: paths: @@ -69,7 +69,7 @@ plugins: ## Configuration -Add to your Clawdbot config: +Add to your OpenClaw config: ```yaml channels: @@ -162,7 +162,7 @@ Check that: - Verify NOTIFY trigger is firing: `SELECT * FROM pg_stat_activity WHERE wait_event = 'ClientRead'` - Check that agent name matches exactly in config and `mentions` array -- Look for errors in Clawdbot logs: `clawdbot gateway logs` +- Look for errors in OpenClaw logs: `openclaw gateway logs` ### Messages processed multiple times @@ -172,7 +172,7 @@ This shouldn't happen due to the `agent_chat_processed` table, but if it does: ## Development -The plugin follows Clawdbot's channel plugin architecture: +The plugin follows OpenClaw's channel plugin architecture: - `config`: Account resolution and configuration management - `gateway.startAccount`: Core listening logic @@ -181,4 +181,4 @@ The plugin follows Clawdbot's channel plugin architecture: ## License -Same as Clawdbot +Same as OpenClaw diff --git a/agent-chat-channel/SETUP.md b/agent-chat-channel/SETUP.md index 84043b3..f12c5de 100644 --- a/agent-chat-channel/SETUP.md +++ b/agent-chat-channel/SETUP.md @@ -25,9 +25,9 @@ CREATE FUNCTION notify_agent_chat() ...; CREATE TRIGGER 
agent_chat_notify ...; ``` -## 3. Configure Clawdbot +## 3. Configure OpenClaw -Edit your `~/.config/clawdbot/config.yaml`: +Edit your `~/.config/openclaw/config.yaml`: ```yaml # Register the plugin @@ -48,16 +48,16 @@ channels: See `example-config.yaml` for more options. -## 4. Restart Clawdbot Gateway +## 4. Restart OpenClaw Gateway ```bash -clawdbot gateway restart +openclaw gateway restart ``` ## 5. Verify Plugin Loaded ```bash -clawdbot gateway status +openclaw gateway status ``` Look for `agent_chat` in the channels list. @@ -77,7 +77,7 @@ The agent should receive and respond to the message. - Check plugin path in config - Verify index.js exports `agentChatPlugin` or default export -- Check gateway logs: `clawdbot gateway logs` +- Check gateway logs: `openclaw gateway logs` ### Database connection errors diff --git a/agent-chat-channel/example-config.yaml b/agent-chat-channel/example-config.yaml index ead3db4..9601f0a 100644 --- a/agent-chat-channel/example-config.yaml +++ b/agent-chat-channel/example-config.yaml @@ -1,6 +1,6 @@ -# Example Clawdbot configuration for agent_chat channel +# Example OpenClaw configuration for agent_chat channel # -# Add this to your ~/.config/clawdbot/config.yaml +# Add this to your ~/.config/openclaw/config.yaml # 1. 
Register the plugin plugins: @@ -30,10 +30,3 @@ channels: # host: localhost # user: newhart # password: op://NOVA Shared Vault/Agent DB: newhart/password -# -# assistant2: -# agentName: assistant2 -# database: nova_memory -# host: localhost -# user: assistant2 -# password: op://NOVA Shared Vault/Agent DB: assistant2/password diff --git a/agent-chat-channel/index.js b/agent-chat-channel/index.js index 752fd92..cb19a1f 100644 --- a/agent-chat-channel/index.js +++ b/agent-chat-channel/index.js @@ -2,7 +2,7 @@ import pg from 'pg'; const { Client } = pg; /** - * Agent Chat Channel Plugin for Clawdbot + * Agent Chat Channel Plugin for OpenClaw * * Listens to PostgreSQL NOTIFY on 'agent_chat' channel and routes messages * to the agent when mentioned. Marks processed messages in agent_chat_processed. @@ -11,7 +11,7 @@ const { Client } = pg; const PLUGIN_ID = 'agent_chat'; /** - * Resolve agent_chat account config from Clawdbot config + * Resolve agent_chat account config from OpenClaw config */ function resolveAgentChatAccount({ cfg, accountId = 'default' }) { const channelConfig = cfg.channels?.agent_chat; diff --git a/scripts/proactive-recall.py b/scripts/proactive-recall.py index e77129b..78fe28c 100755 --- a/scripts/proactive-recall.py +++ b/scripts/proactive-recall.py @@ -7,7 +7,7 @@ Output: JSON with relevant memories to inject into context. -For Clawdbot integration, call this from a hook or message preprocessor. +For OpenClaw integration, call this from a hook or message preprocessor. """ import os