Markdown syntax

This Markdown code:

## Foo

Lorem ipsum.

### `bar`s

- Hello
- There

...parses to this syntax tree:

Document
    Heading         level=2
        Text        text="Foo"
    Paragraph
        Text        text="Lorem ipsum."
    Heading         level=3
        CodeInline  text="bar"
        Text        text="s"
    List
        ListItem
            Text    text="Hello"
        ListItem
            Text    text="There"

...which can be rendered to this HTML fragment:

<h2>Foo</h2>
<p>Lorem ipsum.</p>
<h3><code>bar</code>s</h3>
<ul>
<li>Hello</li>
<li>There</li>
</ul>
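
The tree-to-HTML step above can be sketched in a few lines of Python. This is only an illustration: the `Node` dataclass and the `render` function are hypothetical names made up for this example (not the API of any existing library), and only the node types used in the example are handled.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class Node:
    # Hypothetical minimal node shape: a type name, properties, and children.
    type: str
    props: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def render(node: Node) -> str:
    """Render the handful of node types used in the example above."""
    inner = "".join(render(child) for child in node.children)
    if node.type == "Document":
        return inner
    if node.type == "Heading":
        level = node.props["level"]
        return f"<h{level}>{inner}</h{level}>\n"
    if node.type == "Paragraph":
        return f"<p>{inner}</p>\n"
    if node.type == "Text":
        return escape(node.props["text"])
    if node.type == "CodeInline":
        return f"<code>{escape(node.props['text'])}</code>"
    if node.type == "List":
        return f"<ul>\n{inner}</ul>\n"
    if node.type == "ListItem":
        return f"<li>{inner}</li>\n"
    raise ValueError(f"unsupported node type: {node.type}")


doc = Node("Document", children=[
    Node("Heading", {"level": 2}, [Node("Text", {"text": "Foo"})]),
    Node("Paragraph", children=[Node("Text", {"text": "Lorem ipsum."})]),
    Node("Heading", {"level": 3}, [
        Node("CodeInline", {"text": "bar"}),
        Node("Text", {"text": "s"}),
    ]),
    Node("List", children=[
        Node("ListItem", children=[Node("Text", {"text": "Hello"})]),
        Node("ListItem", children=[Node("Text", {"text": "There"})]),
    ]),
])

html_out = render(doc)
print(html_out)
```

Keeping the renderer as one recursive function over a generic node shape is a choice made for brevity here; a real renderer would likely dispatch per node type and handle ordered lists, attributes, and so on.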

section: a Heading plus all subsequent nodes, up to but not including the next Heading whose level is at least as significant (that is, the same level or a higher-ranking one).
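
As a rough sketch of that definition, the following Python function slices one section out of a flat list of top-level nodes. It assumes that a smaller `level` number means a more significant Heading (as with `<h1>` through `<h6>`); the dict-based node shape and the name `section_of` are hypothetical and exist only for this example.

```python
def section_of(nodes, start):
    """Return the section that begins at nodes[start], which must be a Heading.

    Nodes are plain dicts here (a hypothetical shape for this sketch), e.g.
    {"type": "Heading", "level": 2} or {"type": "Paragraph"}.
    """
    heading = nodes[start]
    if heading["type"] != "Heading":
        raise ValueError("a section must start at a Heading")
    end = start + 1
    while end < len(nodes):
        node = nodes[end]
        # A smaller or equal level number means "at least as significant".
        if node["type"] == "Heading" and node["level"] <= heading["level"]:
            break
        end += 1
    return nodes[start:end]


top_level = [
    {"type": "Heading", "level": 2},    # 0: ## Foo
    {"type": "Paragraph"},              # 1
    {"type": "Heading", "level": 3},    # 2: ### `bar`s
    {"type": "List"},                   # 3
    {"type": "Heading", "level": 2},    # 4: a hypothetical later level-2 Heading
]

print(len(section_of(top_level, 0)))  # 4 nodes: the level-3 Heading stays inside
print(len(section_of(top_level, 2)))  # 2 nodes: stops at the level-2 Heading
```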

This page is released under CC BY-SA 4.0.

This page seeks to (1) introduce Markdown syntax and (2) define another "15th standard" for Markdown ASTs (with language extensions).

This builds upon:

Disclaimer: I am also the author of MarkdownMesh, and this is the syntax tree schema that I am using in that project.

Properties from language extensions are indicated via: (+ foo)

| Node | Properties |
| --- | --- |
| Document | children, (+ frontmatter_props) |
| Heading | children, level, (+ is_details_section, open) |
| Paragraph | children |
| Text | text |
| Html | text |
| Comment | text |
| CodeBlock | text, lang, meta |
| CodeInline | text, (+ lang, meta) |
| HorizontalRule | ( ↦ HTML:<hr />) |
| Link | url, title, ref_id, ref_label, children |
| Image | url, title, ref_id, ref_label, alt |
| LinkRefDef | url, title, def_id, def_label |
| Emphasis | children |
| Strong | children |
| Blockquote | children |
| List | children, spread, ordered, start |
| ListItem | children, spread, (+ checked) |

Nodes for extensions beyond CommonMark:

| Name | Properties |
| --- | --- |
| FootnoteRef | ref_id, ref_label |
| FootnoteDef | def_id, def_label, children (content) |
| Table | children ([ head, body ]), align |
| TableHead | children ([ row ]) |
| TableBody | children ([ ...rows ]) |
| TableRow | children ([ ...cells ]) |
| TableCell | children (content) |
| TabNavigator | children ([ ...tabs ]) |
| Tab | children (content), label |
| LayoutRow | children ([ ...columns ]) |
| LayoutColumn | children (content) |

e.g. for (MD:$A \cap B$) ↦ (HTML:<math>…</math> or HTML:<span class="katex">…</span>)

Instead of defining MathBlock/MathInline nodes, math expressions are simply syntactic sugar for CodeBlock/CodeInline nodes where lang="math".
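
To make the "syntactic sugar" idea concrete, here is a minimal Python sketch that turns inline `$...$` spans into CodeInline nodes with lang="math". The regular expression is a deliberate simplification (real math extensions have stricter delimiter rules), and the function name `parse_inline` and the dict-based node shape are assumptions for this example only.

```python
import re

# Simplistic pattern for this sketch only: a "$...$" span on one line with no
# nested dollar signs. Real math extensions have stricter delimiter rules.
INLINE_MATH = re.compile(r"\$([^$\n]+)\$")


def parse_inline(text):
    """Split text into Text nodes and CodeInline nodes with lang="math"."""
    nodes = []
    pos = 0
    for match in INLINE_MATH.finditer(text):
        if match.start() > pos:
            nodes.append({"type": "Text", "text": text[pos:match.start()]})
        # No dedicated MathInline node: reuse CodeInline and set lang.
        nodes.append({
            "type": "CodeInline",
            "text": match.group(1),
            "lang": "math",
            "meta": None,
        })
        pos = match.end()
    if pos < len(text):
        nodes.append({"type": "Text", "text": text[pos:]})
    return nodes


nodes = parse_inline(r"Let $A \cap B$ be empty.")
for node in nodes:
    print(node)
```

A renderer can then branch on `lang == "math"` when it meets a CodeInline or CodeBlock node, instead of needing separate node types.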

See also:

For some node types, CommonMark offers multiple forms of syntax for expressing them.

This page will focus on the forms to which Prettier normalizes. For the rest, see the CommonMark spec.

In addition, there are many language extensions with new types of nodes. This page covers some of them.

Document is the root node.

Children: the top-level block nodes of your document (Paragraphs, Tables, etc)

Attributes:

WARNING: never parse YAML without security-auditing your parser! 😱

(MD:## Foo) ↦ (HTML:<h2>Foo</h2>)

Children: inline nodes

Standard attributes:

Non-standard attributes re: <details>/<summary>-ifying sections

Non-standard optional attributes re: <details>/<summary>-ifying this Heading's section:

Extensions/plugins related to that:

(MD: text/etc surrounded by blank lines) ↦ (HTML:<p>…</p>)

The Text node wraps a string of text.

Attributes: text

The Html node wraps a string of raw HTML.

Attributes: text

The Comment node wraps a comment (<!-- ... -->)

Attributes: text, which holds the trimmed interior of the comment (the ... part)

Attributes:

A CodeBlock always has an info string; it is the zero-or-more characters to the right of the opening triple-backticks, trimmed.

lang is the non-empty first word of the info string, or null-or-missing if the info string is blank.

meta is the non-empty trimmed remainder of the info string, or null-or-missing if the trimmed remainder of the info string is blank.
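
The lang/meta rules above can be sketched as a small Python helper. The name `parse_info_string` is hypothetical, and `None` stands in for "null-or-missing"; the sample info strings in the usage lines are made up for illustration.

```python
def parse_info_string(info):
    """Split a CodeBlock info string into (lang, meta).

    lang is the first word, meta is the trimmed remainder; either one is
    None when there is nothing non-blank to put there.
    """
    parts = info.strip().split(maxsplit=1)
    lang = parts[0] if parts else None
    meta = parts[1].strip() if len(parts) > 1 else None
    return lang, meta or None


print(parse_info_string(""))                       # (None, None)
print(parse_info_string("py"))                     # ('py', None)
print(parse_info_string("py  linenos start=3 "))   # ('py', 'linenos start=3')
```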

See "§ What about math nodes?"

```py ▶︎code ; result
[f(x) for x in arr]
```

→

(MD:`...`) ↦ (HTML:<code>...</code>)

If you need to express backticks within the code, there are options:

(MD:`` ...`... ``)
↦ (HTML:<code>...`...</code>)
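
Going the other way (serializing a CodeInline node back to Markdown), one way to pick a safe delimiter is to use a backtick run one longer than the longest run inside the code. The sketch below follows the CommonMark code span rules as I understand them; `wrap_code_inline` is a hypothetical helper name, and non-empty input is assumed.

```python
import re


def wrap_code_inline(code):
    """Wrap non-empty code in a backtick run long enough not to clash with it."""
    runs = re.findall(r"`+", code)
    longest = max((len(run) for run in runs), default=0)
    fence = "`" * (longest + 1)
    # Pad with one space on each side when the code starts or ends with a
    # backtick (so it does not merge with the delimiter), or when it both
    # starts and ends with a space (a parser would otherwise strip those).
    needs_pad = (
        code.startswith("`")
        or code.endswith("`")
        or (code.startswith(" ") and code.endswith(" ") and code.strip(" ") != "")
    )
    pad = " " if needs_pad else ""
    return f"{fence}{pad}{code}{pad}{fence}"


print(wrap_code_inline("foo"))    # `foo`
print(wrap_code_inline("a`b"))    # ``a`b``
print(wrap_code_inline("`x`"))    # `` `x` ``
```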

Attributes:

Language extensions that can lead to non-null lang/meta attributes:

(MD:---) ↦ (HTML:<hr />)

e.g. (MD:[About `Foo`](#foo "bar")) ↦ (HTML:<a href="#foo" title="bar">About <code>Foo</code></a>)

Note: don't confuse the title with the "link text" / content being linked.

Attributes:

Either both of { ref_id, ref_label } are specified or neither is. These join exactly one LinkRefDef on link.ref_id = lrd.def_id.

When a Link uses a LinkRefDef, a copy of the url and title properties from that LinkRefDef gets included in the Link node.
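
The join described above can be sketched as a lookup table keyed on def_id. This is an illustrative sketch only: the function name `resolve_links`, the dict-based node shape, and the sample ids are assumptions, and how labels get normalized into ids is left out.

```python
def resolve_links(links, link_ref_defs):
    """Copy url and title from the matching LinkRefDef into each reference Link."""
    defs_by_id = {}
    for lrd in link_ref_defs:
        # The text above assumes exactly one definition per def_id; if
        # duplicates slip through, this sketch keeps the first one.
        defs_by_id.setdefault(lrd["def_id"], lrd)
    for link in links:
        ref_id = link.get("ref_id")
        if ref_id is None:
            continue  # inline link: url and title were given directly
        lrd = defs_by_id[ref_id]  # raises KeyError if no definition matches
        link["url"] = lrd["url"]
        link["title"] = lrd["title"]
    return links


defs = [{"type": "LinkRefDef", "def_id": "foo", "def_label": "Foo",
         "url": "https://example.org/", "title": "title"}]
links = [
    {"type": "Link", "ref_id": "foo", "ref_label": "Foo"},   # reference link
    {"type": "Link", "url": "#foo", "title": "bar"},         # inline link
]

for link in resolve_links(links, defs):
    print(link["url"], link["title"])
```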

(MD:![](foo.png)) ↦ (HTML: <p><img src="foo.png" /></p>)

(MD:![desc](foo.png "title")) ↦ (HTML: <p><img alt="desc" title="title" src="foo.png" /></p>)

Note: don't confuse the title with the alt text.

Children: zero or more inline nodes whose combined text content, if any, forms the alt="…" attribute value.

Attributes:

Either both of { ref_id, ref_label } are specified or neither is. These join exactly one LinkRefDef on image.ref_id = lrd.def_id.

When an Image uses a LinkRefDef, a copy of the url and title properties from that LinkRefDef gets included in the Image node.

e.g. (MD:[Foo]: https://example.org/ "title") ↦ (AST: LinkRefDef{ url, title, def_id, def_label } node), for use by other Markdown nodes

Attributes:

Emphasis{} → content

e.g. (MD:_Lorem ipsum_) ↦ (HTML:<em>Lorem ipsum</em>), which usually gets styled as italic.

Strong{} → content

e.g. (MD:**Lorem ipsum**) ↦ (HTML:<strong>Lorem ipsum</strong>), which usually gets styled as bold.

Blockquote{} → nodes

e.g. (MD:> ...) ↦ (HTML:<blockquote>...</blockquote>)

List{ children, spread, ordered, start } → ListItem-list

ListItem{ spread, checked } → content

Table{ align } → [TableHead, TableBody]

TableHead{} → TableRow-list with exactly one element

TableBody{} → TableRow-list with zero or more elements

TableRow{} → TableCell-list with one element per column

TableCell{} → content
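
The Table shape described above can be checked mechanically. The Python sketch below assumes that `align` is a list with one entry per column (the text does not spell out its shape); the alignment values, the dict-based node shape, and the helper names `check_table`, `cell`, and `row` are all hypothetical.

```python
def check_table(table):
    """Raise ValueError if a Table does not have the shape described above.

    Returns the column count, taken from the (assumed) per-column align list.
    """
    if [child["type"] for child in table["children"]] != ["TableHead", "TableBody"]:
        raise ValueError("Table children must be [TableHead, TableBody]")
    head, body = table["children"]
    if len(head["children"]) != 1:
        raise ValueError("TableHead must hold exactly one TableRow")
    columns = len(table["align"])
    for table_row in head["children"] + body["children"]:
        if table_row["type"] != "TableRow" or len(table_row["children"]) != columns:
            raise ValueError("each TableRow needs one TableCell per column")
    return columns


def cell(text):
    return {"type": "TableCell", "children": [{"type": "Text", "text": text}]}


def row(*texts):
    return {"type": "TableRow", "children": [cell(t) for t in texts]}


table = {
    "type": "Table",
    "align": ["left", None],  # assumed: one alignment entry per column
    "children": [
        {"type": "TableHead", "children": [row("Name", "Qty")]},
        {"type": "TableBody", "children": [row("Foo", "1"), row("Bar", "2")]},
    ],
}

print(check_table(table))  # 2
```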

e.g. (MD:...lorem ipsum[^foo].) ↦ (HTML:TODO)

e.g. (MD:[^foo]: et other stuff.) ↦ (HTML:TODO)

TODO

TabNavigator{} node → Tab nodes

Tab{ label } node → content

TabNavigator
    Tab     label="Foo"
        ...
    Tab     label="Bar"
        ...

TODO: (extended) Markdown code + notes

When generating a web page that includes a TabNavigator, please don't use any JavaScript; you don't need it!

With HTML/CSS there are plenty of options:

LayoutRow node → LayoutColumn nodes

LayoutColumn node → content

LayoutRow
    LayoutColumn
        ...
    LayoutColumn
        ...

TODO: (extended) Markdown code + notes