The engine will be coded entirely in JavaScript, giving unique mobility advantages.
The engine consists of rules, each of which has a regular expression describing what it matches, and a list of children that should be applied in recursive descent to its contents.
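A minimal sketch of this structure, not the engine's actual code: each rule is assumed to be a plain object with hypothetical fields tag, regex and children, and capture group 1 is assumed to hold the contents that are handed on to the children. The two emphasis patterns are illustrative only.

```javascript
// Escape text that no rule claimed, so the output stays well-formed.
function escapeText(s) {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Apply a list of rules to a string by recursive descent.
// The earliest match wins; on a tie, the rule listed first wins.
function applyRules(rules, text) {
  let best = null;
  for (const rule of rules) {
    const m = rule.regex.exec(text); // non-global regex: always searches from index 0
    if (m !== null && m[0].length > 0 &&
        (best === null || m.index < best.match.index)) {
      best = { rule, match: m };
    }
  }
  if (best === null) return escapeText(text);

  const { rule, match } = best;
  const before = text.slice(0, match.index);
  const after = text.slice(match.index + match[0].length);
  // Assumption: group 1, when present, is the rule's contents.
  const inner = match[1] !== undefined ? match[1] : match[0];

  return escapeText(before) +
    "<" + rule.tag + ">" + applyRules(rule.children, inner) + "</" + rule.tag + ">" +
    applyRules(rules, after); // continue with the same rules on the remainder
}

// Illustrative rules: strong may contain em, em has no children.
const em = { tag: "em", regex: /''(.*?)''/, children: [] };
const strong = { tag: "strong", regex: /'''(.*?)'''/, children: [em] };

console.log(applyRules([strong, em], "a '''b ''c'' d''' e"));
```

Because every match is wrapped in one opening and one closing tag before recursion continues, the tags stay balanced regardless of the input.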
The use of regular expressions allows great flexibility in matching a particular class of markup rules. MeatballWiki's TextFormattingRules fall largely within this category, with the exception of overlapping double- and triple-quote emphasis. Regular expressions are also very quick to apply on most JavaScript engines.
Each rule typically corresponds to a single XHTML node type, and always wraps its contents in balanced pairs of tags; thus, well-formed XML output is guaranteed. By limiting each rule's children to a subset of the children that XHTML permits for the corresponding element, valid XHTML output is guaranteed.
As an example, the table rule corresponds to the table element, and matches groups of consecutive lines that each start and end with ||. More specifically, the regex is:
(^|\n)(\|\|.*\|\|[ \t]*(\n|$))+
The only child of the table rule is the tr rule, which matches a single line, wraps it in a tr tag, and strips off the final ||. Finally, the only child of the tr rule is the td rule, which matches a single pipe-separated block.
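The table, tr and td steps can be sketched as three plain functions. This is only an illustration under stated assumptions: the function names are hypothetical, the regex from the text is used with the "g" flag added so that every table in the input is found, and cell contents are simply escaped rather than formatted further.

```javascript
// The table regex from the text, with "g" added to find every table.
const TABLE = /(^|\n)(\|\|.*\|\|[ \t]*(\n|$))+/g;

function escapeText(s) {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// td step: each pipe-separated block becomes one cell.
function renderCells(rowBody) {
  return rowBody
    .split("||")
    .map((cell) => "<td>" + escapeText(cell.trim()) + "</td>")
    .join("");
}

// tr step: one line, with the leading and final || stripped off.
function renderRow(line) {
  const body = line.trim().replace(/^\|\|/, "").replace(/\|\|$/, "");
  return "<tr>" + renderCells(body) + "</tr>";
}

// table step: wrap each group of matching lines in a table element.
function renderTables(text) {
  return text.replace(TABLE, (match, lead) => {
    const rows = match
      .split("\n")
      .filter((line) => line.trim() !== "")
      .map(renderRow)
      .join("");
    // Keep the newline that preceded the table, and the one that ended it.
    const tail = match.endsWith("\n") ? "\n" : "";
    return lead + "<table>" + rows + "</table>" + tail;
  });
}

console.log(renderTables("intro\n||a||b||\n||c||d||\nend"));
```

Each function only ever sees the text that its parent matched, which is why the same characters are scanned once per nesting level.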
This structure can thus be verified against a DTD, guaranteeing that all future output will be strict XHTML.
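One way such a check might look, as a sketch only: the ALLOWED map below is a hand-written, deliberately simplified stand-in for the content models in a DTD, and the rule objects with tag and children fields are hypothetical. A real check would read the permitted children from the DTD itself.

```javascript
// Simplified, hand-written excerpt of which child elements each element may hold.
const ALLOWED = { table: ["tr"], tr: ["td"], td: [] };

// Walk the rule tree and report every parent/child pair the map does not permit.
// The "seen" set stops the walk from looping if rules refer to each other.
function findViolations(rule, allowed, seen = new Set()) {
  if (seen.has(rule)) return [];
  seen.add(rule);
  const violations = [];
  for (const child of rule.children) {
    const permitted = allowed[rule.tag] || [];
    if (!permitted.includes(child.tag)) {
      violations.push(rule.tag + " > " + child.tag);
    }
    violations.push(...findViolations(child, allowed, seen));
  }
  return violations;
}

const td = { tag: "td", children: [] };
const tr = { tag: "tr", children: [td] };
const table = { tag: "table", children: [tr] };
const brokenTable = { tag: "table", children: [td] }; // td directly inside table

console.log(findViolations(table, ALLOWED));
console.log(findViolations(brokenTable, ALLOWED));
```

Because the check runs on the rule definitions rather than on any particular page, it only needs to be repeated when the rules change.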
One disadvantage of this technique is that it scans over the same portion of the string several times. In table cells with complex formatting (something that UseModWiki does not support, but which is useful), for instance, this might get expensive. However, most of that scanning is simple loop-and-compare work that the underlying regular expression engine can run in a relatively tight loop, so it may still be less expensive than a state machine that expends thousands of instructions per byte. That said, different JavaScript engines have grossly different speeds for string manipulation, so experimentation will determine whether the performance is acceptable.
While this technique is currently limited to XHTML nodes, it can be extended to include any balanced string, which will be important in the future.