Draft
Semiotic Markdown Specification
This specification describes how Markdown files, as ingested by the semiotic publisher [ @semiotic-publisher ], are interpreted as concepts in the Semiotic Universe [ @semiotic-universe ] and its Stewardable Semiotic Concept Universe extension [ @stewardable-semiotic-concept-universe ], and how their structure is mapped into atoms, fragments, and publication views.
Normative statements in this document use "MUST", "SHOULD", and "MAY" in the conventional requirements-language sense.
1. File Model
- Throughout this specification, the loader denotes the Markdown‑to‑concept stage of the semiotic publisher [ @semiotic-publisher ].
- A concept file is a UTF‑8 `*.md` file under the configured `vault_dir`.
- Each file is parsed as:
  - an optional YAML frontmatter block delimited by `---` at the top, and
  - a Markdown body.
- The loader treats one file as one concept.
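The file model above can be sketched as a small splitting step. This is a minimal illustration, not the loader's actual code: the function name `split_frontmatter` is hypothetical, and the sketch assumes the frontmatter is closed by a second `---` line and leaves YAML parsing to a separate step.

```python
from typing import Optional, Tuple


def split_frontmatter(text: str) -> Tuple[Optional[str], str]:
    """Split a concept file into (frontmatter_text, body).

    Returns (None, text) when the file does not start with a `---` line
    or when no closing `---` line is found (an assumption of this sketch).
    """
    lines = text.splitlines(keepends=True)
    if not lines or lines[0].strip() != "---":
        return None, text
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            frontmatter = "".join(lines[1:index])
            body = "".join(lines[index + 1:])
            return frontmatter, body
    return None, text
```

The returned frontmatter text would then be handed to a YAML parser, and the body to the Markdown stage.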
2. Identification and Slugs
2.1 Concept identifiers
- Frontmatter field `id` (string) MAY be provided.
- If `id` is absent, the concept id is derived from the file path:
  - take the path relative to `vault_dir`,
  - drop the file extension,
  - apply the slugification function (below).
2.2 Slugification
The slugification function `slugify` is applied to ids, folder names, link targets, and some metadata values:
- Convert to lowercase and trim surrounding whitespace.
- Replace `/` and `\` with `-`.
- Replace any character not in `[a-z0-9_-]` with `-`.
- Collapse consecutive `-` into a single `-`.
- Trim leading and trailing `-`.
- If the result is empty, use `"item"`.
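The slugification steps, together with the path-based id derivation from section 2.1, can be sketched as follows. This is an illustrative reading of the rules rather than the publisher's implementation; `concept_id` and its parameters are hypothetical names, and the sketch assumes POSIX-style paths.

```python
import re
from pathlib import PurePosixPath
from typing import Optional


def slugify(value: str) -> str:
    """Apply the slugification steps in the order listed above."""
    slug = value.strip().lower()
    slug = slug.replace("/", "-").replace("\\", "-")
    slug = re.sub(r"[^a-z0-9_-]", "-", slug)
    slug = re.sub(r"-{2,}", "-", slug)
    slug = slug.strip("-")
    return slug or "item"


def concept_id(path: str, vault_dir: str, explicit_id: Optional[str] = None) -> str:
    """Use the frontmatter id if given, else derive the id from the file path."""
    if explicit_id:
        return slugify(explicit_id)
    relative = PurePosixPath(path).relative_to(vault_dir)
    return slugify(str(relative.with_suffix("")))
```

For example, under these assumptions a file at `vault/Math/Category Theory.md` with no explicit `id` receives the id `math-category-theory`.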
3. YAML Frontmatter Fields
This section describes the frontmatter schema recognized by the loader and how each field contributes to the universe semantics.
3.1 Core identity
- `title` (string)
  - Human‑readable title for the concept.
  - If absent, defaults to the file’s stem (basename without extension).
- `id` (string, optional)
  - Explicit concept identifier.
  - If present, it is slugified and used as the concept id.
3.2 Classification and status
- `type` (string, optional)
  - Concept type (e.g. `"doc"`, `"tag"`, `"status"`).
  - Semantics:
    - Generates an annotation term `HasType(type)`.
    - Interns a `type` atom for this value.
    - May induce tag atoms via `type_tag_rules` in configuration.
- `status` (string, optional)
  - Editorial or publication status (e.g. `"draft"`, `"sketch"`, `"public"`, `"deprecated"`).
  - Semantics:
    - Generates an annotation term `HasStatus(status)`.
    - Interns a `status` atom for this value.
    - May induce tag atoms via `status_tag_rules` in configuration.
    - May be used by publishing and theming (e.g. draft styling, selection filters).
- `tags` (list of strings, optional)
  - Free‑form topical tags.
  - Semantics:
    - For each tag value `t`, generates `HasTag(t)`.
    - Interns a `tag` atom with value `t`.
3.3 Names, aliases, and folders
- `aliases` (list of strings, optional)
  - Alternate names for the concept.
  - Semantics:
    - For each alias, generates `AliasFor(alias)`.
    - Interns an `alias` atom.
    - Aliases participate in concept resolution (e.g. citation and link resolving).
- Folders (implicit)
  - The file’s path segments under `vault_dir` are used as folder annotations.
  - All path components except the filename are collected and slugified.
  - Semantics:
    - For each folder segment `f`, generates `InFolder(f)`.
    - Interns a `folder` atom.
    - `folder_tag_rules` in configuration MAY map folders to tags via j‑relations.
3.4 Temporal metadata
- `created` (string, optional)
  - Creation timestamp or date string (no enforced format).
  - Semantics:
    - Used by the temporal trace operator to create:
      - a `created` atom with the full string.
      - a `created_year` atom with the first four characters.
- `updated` (string, optional)
  - Last‑update timestamp or date string.
  - Semantics:
    - Used by the temporal trace operator to create:
      - an `updated` atom with the full string.
      - an `updated_year` atom with the first four characters.
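The temporal atoms described above can be sketched as follows. This is a minimal illustration: representing an atom as a `(kind, value)` tuple and the name `temporal_atoms` are assumptions of this sketch, not part of the specification.

```python
from typing import Set, Tuple

Atom = Tuple[str, str]  # (kind, value); a hypothetical representation


def temporal_atoms(frontmatter: dict) -> Set[Atom]:
    """Create full-string and year atoms for `created` and `updated`."""
    atoms: Set[Atom] = set()
    for field in ("created", "updated"):
        value = frontmatter.get(field)
        if value is None:
            continue
        text = str(value)  # YAML parsers may yield date objects; normalise to text
        atoms.add((field, text))
        atoms.add((f"{field}_year", text[:4]))  # first four characters, no validation
    return atoms
```

Because no date format is enforced, the year atom is only meaningful when the string begins with a four-digit year.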
3.5 Licensing
- `license` (string, optional)
  - Human‑readable declaration of the license for this concept (e.g. `"CC BY-SA 4.0"`, `"All rights reserved"`, `"CC0"`).
  - Semantics:
    - Stored as `concept.license` and surfaced in rendered HTML metadata.
    - By default, if `license` is absent, the site’s global policy applies (no reuse or derivatives without permission).
    - The license field does not currently generate atoms.
3.6 Publication routing
- `publish_to` (string or list of strings, optional)
  - Project selectors indicating which publication projects the concept belongs to (e.g. `"emsenn-net"`, `"math-papers"`).
  - Semantics:
    - Stored verbatim in frontmatter metadata.
    - Used by project selection (`project_universe`) to build sub‑universes specified in configuration (`projects.*.publish_to`).
    - Does not create atoms.
3.7 Subjects and facts (relational metadata)
- `subjects` (mapping from string → string, optional)
  - Defines named subject aliases for use in `facts`.
  - Schema:

        subjects:
          alias1: target-id-1
          alias2: target-id-2

  - Semantics:
    - Internally, each alias maps to a slugified target id.
    - Subjects themselves do not create atoms; they parameterise facts.
- `facts` (mapping from subject alias → predicate → object(s), optional)
  - Declarative relational metadata.
  - Schema:

        facts:
          subject-alias-or-id:
            predicate-1: object-or-alias
            predicate-2:
              - object-or-alias-a
              - object-or-alias-b

  - Semantics:
    - For each triple `(predicate, subject, object)`:
      - Resolve `subject` via `subjects` if present; otherwise slugify the subject key.
      - Resolve `object` via `subjects` if present; otherwise slugify the object value.
      - Append a fact tuple `(predicate, subject_id, object_id)` to the concept.
      - Generate an annotation term: `Fact(predicate, subject_id, object_id)`.
      - Intern a `fact` atom whose value is the string `"{predicate}:{subject_id}->{object_id}"`.
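The resolution of `subjects` and `facts` into triples can be sketched as follows. This is a minimal illustration under stated assumptions: `resolve_facts` and `fact_atom_value` are hypothetical names, predicates are kept verbatim (the specification does not say they are slugified), and `slugify` is repeated here only to keep the example self-contained.

```python
import re
from typing import Dict, List, Tuple


def slugify(value: str) -> str:
    """Slugification as described in section 2.2."""
    slug = value.strip().lower().replace("/", "-").replace("\\", "-")
    slug = re.sub(r"[^a-z0-9_-]", "-", slug)
    slug = re.sub(r"-{2,}", "-", slug).strip("-")
    return slug or "item"


def resolve_facts(subjects: Dict[str, str], facts: dict) -> List[Tuple[str, str, str]]:
    """Turn the `facts` mapping into (predicate, subject_id, object_id) tuples."""
    alias_to_id = {alias: slugify(target) for alias, target in subjects.items()}

    def resolve(name: str) -> str:
        # Use the subject alias if one is declared; otherwise slugify the raw value.
        return alias_to_id.get(name, slugify(name))

    triples = []
    for subject_key, predicates in facts.items():
        subject_id = resolve(str(subject_key))
        for predicate, objects in predicates.items():
            if not isinstance(objects, list):
                objects = [objects]  # a single object is treated as a one-item list
            for obj in objects:
                triples.append((predicate, subject_id, resolve(str(obj))))
    return triples


def fact_atom_value(triple: Tuple[str, str, str]) -> str:
    """Format the value of the interned `fact` atom."""
    predicate, subject_id, object_id = triple
    return f"{predicate}:{subject_id}->{object_id}"
```

Each resulting triple corresponds to one `Fact(predicate, subject_id, object_id)` annotation term and one `fact` atom.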
3.8 Citation metadata for Pandoc (optional)
- `bibliography` (string, optional)
  - Path to a bibliography file (e.g. `content/ref/refs.bib`).
- `csl` (string, optional)
  - Path to a CSL style file for Pandoc.
- Semantics:
  - Currently preserved in metadata for use by Pandoc export and future citation tooling.
  - Does not yet generate atoms directly.
4. Body Syntax
4.1 General Markdown
- The body is parsed as CommonMark‑style Markdown using `markdown-it`.
- Headings, paragraphs, lists, code blocks, and inline emphasis are rendered as usual; they do not, by themselves, create semantic atoms.
4.2 Wikilinks
- Syntax:
  - `[[Label]]`
  - `[[concept-id-or-slug]]`
- Resolution:
  - `Label` is slugified to derive a target id.
  - The HTML renderer rewrites wikilinks into standard Markdown links to `docs/<target-id>.html` (respecting the current base href).
- Semantics:
  - For each wikilink target id `k`, the loader:
    - Generates `LinksTo(k)` as an annotation term.
    - Interns a `link` atom with value `k`.
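Wikilink extraction and rewriting can be sketched as follows. This is a minimal illustration that assumes the double-bracket form `[[Label]]` with no alias or anchor syntax; the regular expression, the function names, and the `base` parameter are assumptions of this sketch.

```python
import re
from typing import List

# Matches [[Label]]; nested brackets are not supported in this sketch.
WIKILINK = re.compile(r"\[\[([^\[\]]+)\]\]")


def slugify(value: str) -> str:
    """Slugification as described in section 2.2."""
    slug = value.strip().lower().replace("/", "-").replace("\\", "-")
    slug = re.sub(r"[^a-z0-9_-]", "-", slug)
    slug = re.sub(r"-{2,}", "-", slug).strip("-")
    return slug or "item"


def wikilink_targets(body: str) -> List[str]:
    """Return the slugified target id of every wikilink, in document order."""
    return [slugify(match.group(1)) for match in WIKILINK.finditer(body)]


def rewrite_wikilinks(body: str, base: str = "") -> str:
    """Rewrite each wikilink into a standard Markdown link to docs/<target-id>.html."""
    def replace(match: re.Match) -> str:
        label = match.group(1)
        return f"[{label}]({base}docs/{slugify(label)}.html)"

    return WIKILINK.sub(replace, body)
```

Each id returned by `wikilink_targets` would yield one `LinksTo(k)` term and one `link` atom.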
4.3 Markdown links
- Syntax:
  - `[text](relative/path/to/other.md)`
  - `[text](relative/path/to/other)`
- Resolution:
  - Links whose targets start with a URI scheme (`^[a-z]+://`) are treated as external and do not participate in concept resolution.
  - For other targets:
    - The path is normalised relative to the current file, the extension (if any) is removed, and the result is slugified.
    - The slugified value is used as a link target id.
  - HTML rendering uses the original link markup; no additional rewriting is performed beyond standard Markdown rendering.
- Semantics:
  - For each internal target id, generates `LinksTo(target_id)` and a corresponding `link` atom.
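The resolution of Markdown link targets can be sketched as follows. This is one possible reading of "normalised relative to the current file": the target is joined to the current file's directory (itself relative to `vault_dir`) and normalised with POSIX rules. The function name `link_target_id` is hypothetical, and fragment or query suffixes are not handled.

```python
import posixpath
import re
from typing import Optional


def slugify(value: str) -> str:
    """Slugification as described in section 2.2."""
    slug = value.strip().lower().replace("/", "-").replace("\\", "-")
    slug = re.sub(r"[^a-z0-9_-]", "-", slug)
    slug = re.sub(r"-{2,}", "-", slug).strip("-")
    return slug or "item"


def link_target_id(current_file: str, target: str) -> Optional[str]:
    """Return the link target id, or None for external links."""
    if re.match(r"^[a-z]+://", target):
        return None  # external link: no concept resolution
    current_dir = posixpath.dirname(current_file)
    normalised = posixpath.normpath(posixpath.join(current_dir, target))
    stem, _extension = posixpath.splitext(normalised)
    return slugify(stem)
```

Under these assumptions, a link to `../ref/other.md` from `notes/a.md` resolves to the target id `ref-other`.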
4.4 Inline citations (Pandoc‑style)
- Recognised citation groups:
  - `[@key]`
  - `[@key; @other]`
  - `[@key, @other]` (commas are also accepted as separators).
- Parsing:
  - A citation group is detected when text matches `[...]` whose contents contain one or more `@key` tokens separated by commas or semicolons.
  - `@key` tokens are extracted; `-@key` (author‑suppressed form) is recognised for parsing but rendered the same at present.
- HTML rendering:
  - Each `key` is resolved using:
    - concept id,
    - slugified concept title,
    - aliases and their slugified forms.
  - If resolution succeeds:
    - The citation is rendered as `<a class="citation" href="...">@key</a>` pointing to the concept page.
  - If resolution fails:
    - The citation is rendered as `<span class="citation">@key</span>` (external or unknown reference).
  - Multiple keys in one group are rendered inside a single bracketed citation, separated by `"; "`.
- Semantics:
  - Inline citations currently affect HTML presentation but do not create atoms or j‑relations.
  - They are intended as a trace/provenance layer that can be lifted into the semiotic universe in future revisions.
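Citation parsing and rendering can be sketched as follows. This is a minimal illustration, not the publisher's parser: the set of characters allowed in a key, the function names, and the `resolve` callback (which stands in for lookup by concept id, slugified title, and aliases) are all assumptions of this sketch.

```python
import re
from html import escape
from typing import Callable, List, Optional

# One or more @key tokens (optionally prefixed with "-"), separated by ";" or ",".
# The key character class is an assumption.
CITATION_GROUP = re.compile(
    r"\[\s*(-?@[\w:.-]+(?:\s*[;,]\s*-?@[\w:.-]+)*)\s*\]"
)


def parse_citation_groups(text: str) -> List[List[str]]:
    """Return the keys of each citation group found in the text."""
    groups = []
    for match in CITATION_GROUP.finditer(text):
        tokens = re.split(r"\s*[;,]\s*", match.group(1))
        # Drop the optional "-" prefix and the leading "@".
        groups.append([token.lstrip("-")[1:] for token in tokens])
    return groups


def render_group(keys: List[str], resolve: Callable[[str], Optional[str]]) -> str:
    """Render one group as a single bracketed citation, joined by '; '."""
    parts = []
    for key in keys:
        href = resolve(key)
        if href:
            parts.append(f'<a class="citation" href="{escape(href)}">@{escape(key)}</a>')
        else:
            parts.append(f'<span class="citation">@{escape(key)}</span>')
    return "[" + "; ".join(parts) + "]"
```

A resolver could be as simple as a dictionary lookup, for example `render_group(["semiotic-universe"], known_pages.get)` with a hypothetical `known_pages` mapping from keys to page URLs.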
5. Derived Semantic Structure
5.1 Annotations and atoms
Given a loaded concept, the loader constructs a multiset of annotation terms from:
- `tags` → `HasTag(tag)`
- `status` → `HasStatus(status)`
- `type` → `HasType(type)`
- `aliases` → `AliasFor(alias)`
- folders → `InFolder(folder)`
- links (wikilinks and internal Markdown links) → `LinksTo(target_id)`
- `facts` → `Fact(predicate, subject_id, object_id)`
Each annotation interpreter maps its term into a finite feature set:
- `HasTag(t)` → one `tag` atom with value `t`.
- `HasStatus(s)` → one `status` atom with value `s`.
- `HasType(t)` → one `type` atom with value `t`.
- `AliasFor(a)` → one `alias` atom with value `a`.
- `InFolder(f)` → one `folder` atom with value `f`.
- `LinksTo(k)` → one `link` atom with value `k`.
- `Fact(p, s, o)` → one `fact` atom with value `"{p}:{s}->{o}"`.
The union of all such atoms for a concept forms its base feature set.
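The mapping from annotation terms to atoms can be sketched as follows. This is a minimal illustration: encoding a term as a tuple whose first element is the term name, and the names `Atom`, `interpret`, and `base_feature_set`, are assumptions of this sketch.

```python
from typing import Iterable, NamedTuple, Set


class Atom(NamedTuple):
    kind: str
    value: str


# Term name -> atom kind for the single-argument annotation terms.
SIMPLE_TERMS = {
    "HasTag": "tag",
    "HasStatus": "status",
    "HasType": "type",
    "AliasFor": "alias",
    "InFolder": "folder",
    "LinksTo": "link",
}


def interpret(term: tuple) -> Set[Atom]:
    """Map one annotation term, e.g. ("HasTag", "math"), to its feature set."""
    name, *args = term
    if name in SIMPLE_TERMS:
        return {Atom(SIMPLE_TERMS[name], args[0])}
    if name == "Fact":
        predicate, subject_id, object_id = args
        return {Atom("fact", f"{predicate}:{subject_id}->{object_id}")}
    raise ValueError(f"unknown annotation term: {name}")


def base_feature_set(terms: Iterable[tuple]) -> Set[Atom]:
    """Union the atoms of all terms; repeated terms collapse to one atom."""
    atoms: Set[Atom] = set()
    for term in terms:
        atoms |= interpret(term)
    return atoms
```

Although the terms form a multiset, the base feature set is a plain set, so a tag listed twice contributes a single `tag` atom.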
5.2 j‑relations and closure
Configuration MAY specify relational closure rules:
- `folder_tag_rules[folder] = tag`
  - Creates a j‑relation from `folder` atom `folder` to `tag` atom `tag`.
- `tag_synonyms[a] = b`
  - Creates symmetric j‑relations between `tag` atoms `a` and `b`.
- `type_tag_rules[type] = tag`
  - Creates a j‑relation from `type` atom `type` to `tag` atom `tag`.
- `status_tag_rules[status] = tag`
  - Creates a j‑relation from `status` atom `status` to `tag` atom `tag`.
The semantic fragment and contribution for each concept are then computed by iterated application of:
- temporal enrichment (trace operator `G`) via `created`/`updated` fields, and
- graph closure (`j`) following the configured j‑relations.
This section is descriptive: implementations MUST preserve extensivity, monotonicity, and idempotence of the combined closure but MAY refine the set of j‑relations in future versions of this specification. The full categorical treatment of fragments, j/G‑closure, and stewardship operators is given in the universe constructions [ @semiotic-universe; @stewardable-semiotic-concept-universe ].
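The graph-closure half of this computation can be sketched as a reachability fixpoint over the configured j‑relations. This is a minimal illustration, not the categorical construction from the cited universe documents: atoms are `(kind, value)` tuples, the configuration is a plain dictionary, the function names are hypothetical, and temporal enrichment (`G`) is left out.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

Atom = Tuple[str, str]  # (kind, value); a hypothetical representation


def build_relations(config: dict) -> Dict[Atom, Set[Atom]]:
    """Collect j-relations from the four configuration rule tables."""
    relations: Dict[Atom, Set[Atom]] = defaultdict(set)
    rule_tables = (
        ("folder", "folder_tag_rules"),
        ("type", "type_tag_rules"),
        ("status", "status_tag_rules"),
    )
    for kind, key in rule_tables:
        for source, tag in config.get(key, {}).items():
            relations[(kind, source)].add(("tag", tag))
    for a, b in config.get("tag_synonyms", {}).items():
        relations[("tag", a)].add(("tag", b))  # synonyms are symmetric
        relations[("tag", b)].add(("tag", a))
    return relations


def j_closure(atoms: Set[Atom], relations: Dict[Atom, Set[Atom]]) -> Set[Atom]:
    """Add every atom reachable through j-relations until nothing changes."""
    closed = set(atoms)
    frontier = list(closed)
    while frontier:
        atom = frontier.pop()
        for target in relations.get(atom, ()):
            if target not in closed:
                closed.add(target)
                frontier.append(target)
    return closed
```

A reachability closure of this kind is extensive (the input is contained in the output), monotone (a larger input never yields a smaller output), and idempotent (closing twice changes nothing), which are the properties required above.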