Replit Agent Prompt: Content Operations API
What you are building
A FastAPI web application that serves as the deterministic operations layer for a large markdown knowledge repository (~3,000 pages). The application replaces expensive cloud LLM inference with pipelines of Python scripts and local LLM calls (via Ollama, OpenAI-compatible API).
The repository lives at the root of this project. The content/
directory is a git submodule containing all the markdown files. The
scripts/ directory contains Python scripts that already do most of the
individual operations — your job is to compose them into a web
application with a REST API and a simple dashboard.
Why this matters
Currently, an expensive cloud AI agent (Claude) does all content operations: classifying files, finding gaps, validating frontmatter, enriching metadata, scoring quality, deciding what to work on next. Each of these operations costs thousands of tokens of inference. Most of them can be done deterministically by scripts or cheaply by local 3-7B parameter models running on the user’s machine via Ollama.
The application you build will:
- Replace ~80% of per-session cloud inference with deterministic script calls and local model inference
- Provide a dashboard showing content health at a glance
- Expose pipeline endpoints that chain multiple operations
- Make the repository’s state queryable without reading thousands of files
Architecture
FastAPI Application
├── /api/v1/ — REST API endpoints
│ ├── /content/ — Content queries (search, stats, gaps)
│ ├── /validate/ — Validation endpoints (frontmatter, manifests)
│ ├── /triage/ — Triage pipeline (enrich, classify, promote)
│ ├── /graph/ — Predicate graph queries (triplets, satisfaction)
│ ├── /fragments/ — Fragment computation (closures, deltas)
│ ├── /plans/ — Plan management (list, status, board view)
│ ├── /pipelines/ — Composed multi-step operations
│ └── /llm/ — Local model delegation
├── /dashboard — Simple HTML dashboard (Jinja2 templates)
├── core/ — Business logic (wraps existing scripts)
│ ├── frontmatter.py — YAML frontmatter parser (reuse from repo)
│ ├── graph.py — Predicate graph builder and querier
│ ├── fragments.py — Fragment calculus implementation
│ ├── validation.py — Content and manifest validation
│ ├── triage.py — Triage operations (enrich, classify, index)
│ ├── local_llm.py — Local model client (adapt from repo)
│ └── pipelines.py — Multi-step pipeline composition
└── templates/ — Jinja2 HTML templates for dashboard
Existing code to study and reuse
IMPORTANT: This repository contains extensive prior work. Study these files before writing any new code — reuse their logic, patterns, and data structures.
Current production scripts (in scripts/)
These are the scripts your application wraps. Read each one:
- scripts/local_llm.py (~354 lines) — Unified client for local LLM backends (Ollama + Foundry Local). Supports backend discovery, model suggestion by task type (classification, generation, extraction, reasoning, long-context), and OpenAI-compatible /v1/chat/completions calls. Adapt this directly — it handles all local model communication.
- scripts/mcp-server.py (~566 lines) — The existing MCP server that exposes repository tools to Claude Code. Has implementations of: find_in_repo, query_triage_index, rebuild_triage_index, enrich_triage, list_plans, list_skills, validate_frontmatter, delegate_task, infer_triage_frontmatter, mine_triage_relevance, llm_status. Your REST API supersedes this — expose the same operations as HTTP endpoints.
- scripts/infer-triage-frontmatter.py (~678 lines) — Local LLM enrichment pipeline. Key patterns to reuse:
  - Model trust ordering (MODEL_TRUST_ORDER) — skip re-enrichment by weaker models
  - Provenance stamping (triage-enriched-by field)
  - Large-file summarization before enrichment (SUMMARY_THRESHOLD)
  - Body-preservation verification (never modify body content)
  - YAML key validation (remove keys with whitespace)
- scripts/enrich-triage.py (~308 lines) — Mechanical (no-LLM) frontmatter enrichment. Derives title from headings, date-created from git history, fixes deprecated field names. This is the deterministic first pass before LLM inference.
- scripts/index-triage.py — Builds the triage index (.triage-index.json), tracking enrichment state per file.
- scripts/mine-triage-relevance.py — Pre-classifies triage files by relevance to a focus topic using local LLM scoring (0-3).
Analysis scripts (in content/.../scripts/)
These implement the semantic analysis layer:
- content/technology/specifications/agential-semioverse-repository/scripts/predicate-graph.py (~1153 lines) — Builds the repository's predicate graph from frontmatter. Key operations:
  - Graph building: extracts (subject, predicate, object) triplets from all pages
  - Triplet queries: page interaction surface, incoming references
  - Transitive relation following
  - Axiom satisfaction checking: evaluates each page against its type's axiom registry (e.g., a theorem MUST have proven-by)
  - Repository-wide satisfaction report (currently 91.3% satisfaction, 37 errors, 308 warnings)
- content/.../scripts/fragment.py (~588 lines) — Fragment calculus. Fragment F = <A, K, Π> where:
  - A (assumptions) = requires + extends (dependencies)
  - K (claims) = defines + teaches (contributions)
  - Π (provenance) = cites + authors (support)
  - Closure computation: transitive dependency resolution
  - Delta computation: gaps between fragments
  - Directory summaries: statistics for content areas
- content/.../scripts/validate-content.py (~624 lines) — Domain-specific validation. Checks frontmatter against rules for mathematics (theorems need proofs), philosophy (claims need arguments), sociology, games, education. Reports MUST/SHOULD violations.
- content/.../scripts/validate-manifest.py (~444 lines) — Validates SKILL.md files against the signal-dispatch specification. Six checks: required fields, id matching, dependency resolution, path validity, type validation, trigger requirements.
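The closure computation from the fragment calculus is the core of /fragments/&lt;page&gt;/closure. A minimal sketch, assuming an in-memory mapping from each page to its A-component (requires + extends, per the frontmatter spec); the real implementation lives in fragment.py:

```python
def closure(page: str, deps: dict[str, list[str]]) -> set[str]:
    """Transitive dependency closure: every page reachable from `page`
    via requires/extends edges, excluding the starting page itself."""
    seen: set[str] = set()
    stack = list(deps.get(page, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against cycles in the dependency graph
        seen.add(current)
        stack.extend(deps.get(current, []))
    return seen

# Illustrative dependency map (page -> requires + extends):
deps = {
    "calculus": ["limits", "functions"],
    "limits": ["functions"],
    "functions": ["sets"],
    "sets": [],
}
```

The delta between two fragments then falls out as set differences over their closures.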
Prior application code (in content/triage/)
Study these for architectural patterns and reusable components:
- content/triage/wsl-backup/emsemioverse/engine/ — A previous FastAPI application attempt. Key files:
  - api/main.py — FastAPI app with lifespan manager, MCP mount, CORS, health endpoints, skill manifest loading
  - api/api_v1.py — Versioned REST routes with Pydantic models (SkillSummary, SkillListResponse, APIError envelope)
  - api/skill_runner.py — Subprocess-based skill execution with JSON output parsing, timeout handling
  - api/app_state.py — Application state management
  - api/mcp_sdk_adapter.py — MCP SDK integration
  Reuse these patterns — especially the Pydantic models, error envelope, and skill execution approach.
- content/triage/library/deprecated-relational-research/ — An earlier FastAPI knowledge graph application with:
  - SQLite database (relational cells, edges, provenance)
  - Modular API routers (web, stats, git, query, graph, recognition, relation, metadata_discovery)
  - Entity recognition and relationship management
  Study for: modular router organization, graph query patterns.
- content/triage/engine/contracts/semiotic-markdown/ — Python library for parsing semiotic markdown. Has:
  - parse.py — YAML frontmatter parser
  - vault.py — Recursive vault indexing with duplicate detection
  - model.py — DocumentRecord data model
  - slug.py — Deterministic path-to-ID normalization
  Reuse the parser and vault indexer — they are tested and correct.
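The skill_runner.py pattern — subprocess execution with a timeout and JSON stdout parsing — can be sketched as follows; the exact function name and error envelope here are illustrative, not the file's actual API:

```python
import json
import subprocess

def run_script(args: list[str], timeout: int = 120) -> dict:
    """Run a repo script as a subprocess and parse its JSON stdout.
    Returns a small envelope: {"ok": bool, "data"/"error": ...}."""
    try:
        proc = subprocess.run(
            args, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return {"ok": False, "error": f"timed out after {timeout}s"}
    if proc.returncode != 0:
        return {"ok": False, "error": proc.stderr.strip()}
    try:
        return {"ok": True, "data": json.loads(proc.stdout)}
    except json.JSONDecodeError:
        return {"ok": False, "error": "script did not emit valid JSON"}
```

Prefer importing script logic directly where possible (see Technical requirements); this wrapper is the fallback for scripts that only expose a CLI.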
Endpoints to implement
Phase 1: Wrap existing scripts as REST endpoints
These are straightforward — call the existing scripts and return JSON.
GET /api/v1/content/search?q=<query>&discipline=<disc>&type=<type>
GET /api/v1/content/stats
GET /api/v1/plans?status=<status>&priority=<priority>
GET /api/v1/plans/board
GET /api/v1/skills?kind=<kind>&search=<term>
POST /api/v1/validate/frontmatter {path: "..."}
POST /api/v1/validate/content {path: "..."}
POST /api/v1/validate/manifest {path: "..."}
GET /api/v1/triage/index?enrichment=<level>&discipline=<disc>
POST /api/v1/triage/rebuild-index
POST /api/v1/triage/enrich {batch: N, dry_run: bool}
GET /api/v1/llm/status
POST /api/v1/llm/delegate {task: "...", context: "...", model: "..."}
Phase 2: Semantic analysis endpoints
These require importing logic from the analysis scripts:
GET /api/v1/graph/stats
GET /api/v1/graph/triplets/<page>
GET /api/v1/graph/incoming/<page>
GET /api/v1/graph/query?predicate=<pred>
GET /api/v1/graph/satisfy/<page>
GET /api/v1/graph/satisfy-all?gaps_only=true
GET /api/v1/fragments/<page>
GET /api/v1/fragments/<page>/closure
GET /api/v1/fragments/delta?a=<page1>&b=<page2>
GET /api/v1/fragments/summary/<directory>
GET /api/v1/content/undefined-terms
GET /api/v1/content/weakest?limit=20
The /content/weakest endpoint is the most important — it composes
satisfaction deficit + fragment analysis + validation to rank pages by
how much improvement they need. This is the “what should I work on?”
answer.
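The ranking behind /content/weakest might look like the sketch below. The input fields and weights are assumptions for illustration — the real signals come from the satisfaction report, fragment deltas, and validation output, and the weights should be tuned against the actual content:

```python
def improvement_score(page: dict) -> float:
    """Higher score = more in need of work. Field names and weights
    are illustrative assumptions, not the repo's actual schema."""
    return (
        3.0 * page.get("axiom_errors", 0)        # MUST violations weigh most
        + 1.0 * page.get("axiom_warnings", 0)    # SHOULD violations
        + 2.0 * page.get("unresolved_deps", 0)   # fragment closure gaps
        + 1.5 * page.get("validation_issues", 0) # domain validation findings
    )

def weakest(pages: list[dict], limit: int = 20) -> list[dict]:
    """The ?limit=N query: pages sorted by descending deficit."""
    return sorted(pages, key=improvement_score, reverse=True)[:limit]
```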
Phase 3: Pipeline endpoints
These chain multiple operations into single calls:
POST /api/v1/pipelines/triage-enrich-full
Runs: mechanical enrich → LLM inference → validate → update index
Params: {batch: N, model: "...", dry_run: bool}
POST /api/v1/pipelines/gap-fill
Runs: find undefined terms → generate stubs via local LLM → validate
Params: {limit: N, model: "...", dry_run: bool}
POST /api/v1/pipelines/assess-directory
Runs: fragment summary + satisfaction check + validation → report
Params: {directory: "...", depth: N}
POST /api/v1/pipelines/classify-triage
Runs: mine-relevance → infer-frontmatter → classify → suggest destination
Params: {focus: "...", batch: N, model: "..."}
GET /api/v1/pipelines/session-briefing
Runs: plan status + satisfaction deficit + weakest files + triage stats
Returns: everything an agent needs to know at session start
(This single endpoint replaces ~5,000 tokens of Claude inference
per session)
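The briefing is a pure composition of independent report sections, which keeps the endpoint simple and makes the &lt;2s budget measurable. A sketch, where the section names match the pipeline description above but the callables are placeholders for the real plan/graph/triage logic:

```python
import time

def session_briefing(sources: dict) -> dict:
    """Compose independent report sections into one payload, recording
    per-section latency so slow sections are easy to spot."""
    briefing: dict = {"timings_ms": {}}
    for name, fetch in sources.items():
        start = time.perf_counter()
        briefing[name] = fetch()
        briefing["timings_ms"][name] = round(
            (time.perf_counter() - start) * 1000, 1
        )
    return briefing

# Illustrative wiring; real callables read the cached indexes.
briefing = session_briefing({
    "plans": lambda: {"active": 2, "draft": 5},
    "satisfaction": lambda: {"rate": 0.913, "errors": 37},
    "weakest": lambda: ["a.md", "b.md"],
    "triage": lambda: {"unenriched": 40},
})
```

Because each section only reads cached JSON indexes, sequential composition should be well under the latency budget; parallelize only if measurements say otherwise.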
Phase 4: Dashboard
A simple server-rendered HTML dashboard (Jinja2, no JavaScript framework) with these views:
- Overview: Total pages, satisfaction rate, triage stats, active plans
- Board: Plan board view (Draft / Shaped / Active / Closed columns)
- Weakest Files: Table of pages ranked by improvement need
- Triage: Enrichment status breakdown, process buttons
- Graph: Predicate graph statistics, type distribution
- Validation: Recent validation results, common errors
Frontmatter specification
Every markdown file in the repo has YAML frontmatter between ---
delimiters. The minimum required fields are:
---
title: "Page title"
date-created: 2026-03-08T00:00:00
---

Common semantic fields:
- type: term | topic | concept | text | lesson | skill | index | question | person | school | babble | letter
- tags: CamelCase list (e.g., [Anarchism, PoliticalTheory])
- defines: list of terms/concepts this page defines
- requires: list of prerequisite pages (paths)
- extends: list of pages this builds on
- cites: list of intellectual references
- teaches: list of concepts taught
- part-of: parent structure
- description: one-sentence summary
- authors: list of author identifiers
Domain-specific extensions add fields like proven-by, axiom,
argued-by, etc.
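Splitting frontmatter from body can be sketched as below — but note this is a minimal illustration with PyYAML; the application should reuse the tested parse.py from the semiotic-markdown library instead:

```python
import yaml  # PyYAML; the repo's own parse.py is the preferred implementation

def split_frontmatter(text: str) -> tuple[dict, str]:
    """Split a markdown document into (frontmatter dict, body string).
    Documents without a leading --- block yield an empty dict."""
    if not text.startswith("---\n"):
        return {}, text
    try:
        header, body = text[4:].split("\n---\n", 1)
    except ValueError:
        # Unterminated frontmatter: treat the whole file as body.
        return {}, text
    meta = yaml.safe_load(header) or {}
    return (meta if isinstance(meta, dict) else {}), body
```

Whatever parser is used, the body-preservation rule from infer-triage-frontmatter.py applies: enrichment may rewrite the header, never the body.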
Local LLM integration
The application delegates text-forming tasks to local models running
via Ollama (HTTP API at localhost:11434). The existing
local_llm.py client handles:
- Backend discovery (Ollama + Foundry Local)
- OpenAI-compatible /v1/chat/completions calls
- Model suggestion by task type
- Automatic fallback between backends
Key principle: Local models are text-forming tools, not knowledge sources. All knowledge is supplied as context. The model transforms, classifies, or formats — it never generates facts from training data.
Task types and recommended models:
- classification (type/tag inference): qwen2.5:3b (fast, sufficient)
- generation (definitions, descriptions): qwen2.5:7b (quality)
- extraction (structured output from text): qwen2.5:7b
- long-context (summarizing large files): phi-3-mini-128k
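The task-type mapping plus an OpenAI-compatible call to Ollama can be sketched with the standard library alone. This is a simplified stand-in for local_llm.py (no backend discovery or fallback); the temperature choice is an assumption:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/v1/chat/completions"

# Task-type -> model mapping from the recommendations above.
TASK_MODELS = {
    "classification": "qwen2.5:3b",
    "generation": "qwen2.5:7b",
    "extraction": "qwen2.5:7b",
    "long-context": "phi-3-mini-128k",
}

def build_request(task: str, system: str, user: str) -> dict:
    """Build an OpenAI-compatible chat payload. All knowledge goes in
    as context (the user message); the model only transforms it."""
    return {
        "model": TASK_MODELS.get(task, "qwen2.5:7b"),
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
        "temperature": 0.1,  # assumed: near-deterministic output for pipelines
    }

def chat(task: str, system: str, user: str, timeout: int = 120) -> str:
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_request(task, system, user)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```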
The model trust ordering from infer-triage-frontmatter.py should be
respected: don’t re-enrich a file if it was already enriched by an
equal or higher-trust model.
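The trust check itself is a rank comparison. A sketch — the model list below is illustrative; the authoritative ordering is MODEL_TRUST_ORDER in infer-triage-frontmatter.py:

```python
# Illustrative ordering, weakest -> strongest (the real list lives in
# scripts/infer-triage-frontmatter.py as MODEL_TRUST_ORDER).
MODEL_TRUST_ORDER = ["qwen2.5:3b", "qwen2.5:7b", "qwen2.5:14b"]

def should_reenrich(previous_model: str | None, candidate_model: str) -> bool:
    """Re-enrich only when the candidate strictly outranks the model
    recorded in the file's triage-enriched-by field."""
    if previous_model is None:
        return True  # never enriched
    try:
        prev_rank = MODEL_TRUST_ORDER.index(previous_model)
        cand_rank = MODEL_TRUST_ORDER.index(candidate_model)
    except ValueError:
        return True  # unknown model in the chain: treat as untrusted, redo
    return cand_rank > prev_rank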
Plan data structure
Plans are markdown files at
content/technology/specifications/agential-semioverse-repository/plans/NNNN-slug.md
with frontmatter:
---
title: "Plan title"
date-created: 2026-03-08T00:00:00
status: draft | proposed | accepted | active | completed | abandoned | deferred
priority: critical | high | medium | low
depends-on: [4, 17] # plan numbers
milestone: "milestone-name"
appetite: small | medium | large
goal: "goal-id"
---

The board view groups plans into columns:
- Draft: status=draft
- Shaped: status=proposed or accepted
- Active: status=active (WIP limit: 3)
- Closed: status=completed, abandoned, or deferred
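The column grouping above maps directly to code. A sketch of the /plans/board logic, assuming plans are already parsed into dicts with title and status fields:

```python
# Status -> column mapping, exactly as specified above.
BOARD_COLUMNS = {
    "Draft": {"draft"},
    "Shaped": {"proposed", "accepted"},
    "Active": {"active"},
    "Closed": {"completed", "abandoned", "deferred"},
}

def board_view(plans: list[dict], wip_limit: int = 3) -> dict:
    """Group plans into board columns by status and flag WIP overflow."""
    board: dict[str, list[str]] = {name: [] for name in BOARD_COLUMNS}
    for plan in plans:
        for column, statuses in BOARD_COLUMNS.items():
            if plan.get("status") in statuses:
                board[column].append(plan["title"])
                break
    return {
        "columns": board,
        "wip_exceeded": len(board["Active"]) > wip_limit,
    }
```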
Technical requirements
- Python 3.11+, FastAPI, Uvicorn
- Jinja2 for dashboard templates
- No database — all data comes from the filesystem (markdown files) and cached JSON indexes
- No JavaScript frameworks — dashboard is server-rendered HTML with minimal vanilla JS for interactivity
- The application reads from the content/ directory (read-only for query endpoints, write for enrichment/pipeline endpoints)
- Scripts are in scripts/ — import their logic directly rather than shelling out to subprocess where possible
- For Ollama integration, adapt scripts/local_llm.py directly
- Respect the project structure: keep the application code in a new top-level app/ directory
What “done” looks like
- uvicorn app.main:app starts the server
- /docs shows the OpenAPI schema with all endpoints
- /dashboard shows the content health overview
- GET /api/v1/pipelines/session-briefing returns a complete briefing in <2 seconds
- GET /api/v1/content/weakest?limit=10 returns the 10 pages most needing improvement
- POST /api/v1/pipelines/triage-enrich-full processes a batch of triage files through the full pipeline
- The predicate graph and fragment computations work correctly against the actual content
Build order
Start with Phase 1 (script wrapping) — get the API skeleton working with real data. Then Phase 2 (semantic analysis) — this is the highest-value layer. Phase 3 (pipelines) composes Phase 1+2. Phase 4 (dashboard) visualizes everything.
The /api/v1/pipelines/session-briefing endpoint is the single
most valuable endpoint — prioritize it. An agent calling this one
endpoint at session start replaces 5,000+ tokens of file reading and
inference.