Objective
The semantic pipeline is the path from human-authored prose to machine-actionable knowledge: prose is encoded in frontmatter, frontmatter generates TTL/RDF, TTL is queryable through MCP tools, and agents use those tools to reason about and improve the repository.
The pipeline runs in both directions. The encoding direction (prose→frontmatter→TTL→MCP→Agent) makes repository knowledge available to agents. The action direction (Agent→skills→MCP→scripts→repository) lets agents improve the repository. Improving the encoding direction also improves the action direction: agents that can query the predicate graph make better decisions about what to improve.
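The frontmatter→TTL step of the encoding direction can be illustrated with a minimal sketch. This is not the existing generate-rdf.py; the namespace URI, the function name, and the predicate names (requires, extends) are assumptions made for illustration, and the real frontmatter schema may differ.

```python
# Hypothetical sketch: the real generate-rdf.py and frontmatter schema may differ.
BASE = "https://example.org/repo/"  # assumed namespace, not taken from the repository


def frontmatter_to_turtle(page_id: str, relations: dict[str, list[str]]) -> str:
    """Serialize one page's typed relations as Turtle triples."""
    lines = [f"@prefix repo: <{BASE}> ."]
    # Sort predicates so the output is stable across runs.
    for predicate, targets in sorted(relations.items()):
        for target in targets:
            lines.append(f"repo:{page_id} repo:{predicate} repo:{target} .")
    return "\n".join(lines)


example = frontmatter_to_turtle(
    "graph-theory",
    {"requires": ["set-theory"], "extends": ["discrete-math"]},
)
print(example)
```

Each typed relation in the frontmatter becomes one subject-predicate-object statement, which is what makes the page queryable further down the pipeline.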
What exists
- Prose: extensive content across disciplines
- Frontmatter: semantic-frontmatter spec with typed relations, partially populated (8,130 triples across 2,991 pages)
- TTL: RDF generation script exists (generate-rdf.py), SHACL shapes and OWL ontologies written for 5 domains
- MCP: ASR MCP server with 9 tools (find, triage, enrich, validate, plans, skills, frontmatter enrichment via Ollama)
- Agent: skills and policies for agent work
- Predicate graph: satisfaction checking against domain axiom registries (91.3% satisfaction rate, 37 errors, 308 warnings)
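As a rough illustration of satisfaction checking against an axiom registry, the sketch below treats a missing required predicate as an error and a missing recommended predicate as a warning, then reports the share of axioms satisfied. The registry layout, the domain name, the predicate names, and the rate formula are all assumptions; the repository's actual checker may define them differently.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    page: str
    predicate: str
    severity: str  # "error" or "warning"


# Hypothetical axiom registry: predicates a page in a domain must or should carry.
AXIOMS = {
    "mathematics": {"required": ["defines"], "recommended": ["requires"]},
}


def check_page(page: str, domain: str, predicates: set[str]) -> tuple[int, list[Finding]]:
    """Return (number of axioms checked, findings for unsatisfied axioms)."""
    axioms = AXIOMS[domain]
    findings = []
    for severity, key in (("error", "required"), ("warning", "recommended")):
        for predicate in axioms[key]:
            if predicate not in predicates:
                findings.append(Finding(page, predicate, severity))
    checked = len(axioms["required"]) + len(axioms["recommended"])
    return checked, findings


def satisfaction_rate(results: list[tuple[int, list[Finding]]]) -> float:
    """Fraction of checked axioms that were satisfied (assumed definition)."""
    checked = sum(count for count, _ in results)
    failed = sum(len(findings) for _, findings in results)
    return 1.0 if checked == 0 else (checked - failed) / checked


results = [
    check_page("sets.md", "mathematics", {"defines", "requires"}),
    check_page("graphs.md", "mathematics", {"defines"}),
]
rate = satisfaction_rate(results)
print(f"satisfaction rate: {rate:.1%}")
```

Keeping the findings per page, rather than only the aggregate rate, is what would let an agent later ask which files are weakest.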
What is missing
- TTL generation is not integrated into the build
- SHACL validation is not automated or available via MCP
- Predicate graph satisfaction is not available via MCP
- SPARQL querying is not available for the main repository (only in the separate rdf-cms prototype)
- Frontmatter enrichment (Ollama) covers files in triage but not published files
- The 37 errors and 308 warnings from predicate graph satisfaction checking are known but are not being addressed systematically
Key results
- Predicate graph satisfaction checking available as an MCP tool
- SHACL validation available as an MCP tool
- At least one frontmatter enrichment skill works on published files (not just triage)
- Agent can query “what are the weakest files?” and get actionable results from the predicate graph
- Agent can run an improvement cycle: query weakness → enrich frontmatter → verify satisfaction improved
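The improvement cycle in the last key result (query weakness → enrich frontmatter → verify satisfaction improved) could be sketched as follows. The function names, the per-file score dictionary, and the enrich and recheck callables are hypothetical stand-ins for whatever the MCP tools eventually expose; they are injected as parameters so the sketch makes no claim about the real tool interfaces.

```python
from typing import Callable


def weakest_files(scores: dict[str, float], limit: int = 3) -> list[str]:
    """Rank files by ascending satisfaction score, breaking ties by path."""
    return sorted(scores, key=lambda path: (scores[path], path))[:limit]


def improvement_cycle(
    scores: dict[str, float],
    enrich: Callable[[str], None],
    recheck: Callable[[str], float],
    limit: int = 3,
) -> dict[str, tuple[float, float]]:
    """Enrich the weakest files and record (before, after) scores for each."""
    report = {}
    for path in weakest_files(scores, limit):
        before = scores[path]
        enrich(path)            # stand-in for a frontmatter enrichment tool
        after = recheck(path)   # stand-in for a satisfaction-check tool
        report[path] = (before, after)
    return report


# Stub tools for demonstration only: enrichment always fully satisfies a file.
scores = {"a.md": 0.9, "b.md": 0.4, "c.md": 0.6}
enriched: set[str] = set()


def enrich(path: str) -> None:
    enriched.add(path)


def recheck(path: str) -> float:
    return 1.0 if path in enriched else scores[path]


report = improvement_cycle(scores, enrich, recheck, limit=2)
print(report)
```

Recording the score before and after each enrichment gives the agent a direct way to verify that a change helped, rather than assuming it did.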
Constraints
- Existing infrastructure is substantial — build on it, do not replace
- Progressive automation: each improvement should make the next one easier for a less capable agent
- Ollama must be running for inference-based operations
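Because inference-based operations depend on Ollama, a tool could check availability before starting work. The sketch below assumes Ollama's default local address (http://localhost:11434) and its model-listing endpoint (/api/tags); the function name is hypothetical, and a deployment on another host or port would need a different base URL.

```python
import urllib.request


def ollama_is_running(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers 200 at base_url's /api/tags endpoint."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers connection refused, timeouts, and HTTP error responses.
        return False


if __name__ == "__main__":
    if not ollama_is_running():
        print("Ollama is not reachable; skipping inference-based operations.")
```

Failing fast with a clear message is easier for a less capable agent to handle than a timeout partway through an enrichment run.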