chapter.of: UI and Rendering Pipeline created: 20140717174120958 modified: 20140717202324456 sub.num: 2 tags: doc title: Parser The first stage of WikiText processing is the parser. A Parser is provided by a module with ``module-type: parser`` and is responsible to transform block of text to a parse-tree. The parse-tree consists of nested nodes like ```js {type: "element", tag: , attributes: {}, children: []} - an HTML element {type: "text", text: } - a text node {type: "entity", entity: } - an HTML entity like © for a copyright symbol {type: "raw", html: } - raw HTML ``` The core plug-in provides a recursive descent WikiText parser which loads it's individual rules from individual modules. Thus a developer can provide additional rules by using ``module-type: wikirule``. Each rule can produce a list of parse-tree nodes. A simple example for a wikirule producing a ``
`` from ``---`` can be found in [[horizrule.js|$:/core/modules/parsers/wikiparser/rules/horizrule.js]] HTML tags can be embedded into WikiText because of the [[html rule|$:/core/modules/parsers/wikiparser/rules/html.js]]. This rule matches HTML tag syntax and creates ``type: "element"`` nodes. But the html-rule has another special purpose. By parsing the HTML tag syntax it implicitly parses WikiText widgets. It the recognises them by the $ character at the beginning of the tag name and instead of producing "element" nodes it uses the tag name for the type: ```js {type: "list", tag: "$list", attributes: {}, children: []} - a list element ``` The [[Widgets]] part will reveal why this makes sense and how each node is transformed into a widget. Another special characteristic of the html-rule or the parse nodes in general is the attributes property. Attributes in the parse-tree are not stored as simple strings but they are nodes of its own to make indirect text references available as attributes as described in [[Widgets]]: ```js {type: "string", value: } - literal string {type: "indirect", textReference: } - indirect through a text reference ```