MeatballWiki | RecentChanges | Random Page | Indices | Categories

This page is here to form a ConsensusGroup about the markup-specific portions of the MeatballWiki engine.

1. ConsensusGroup
2. WikiCreole
3. Extended Unicode Support
4. Smart Hyphenation
5. Regular Language–Recursive Descent Parser
6. Windows Punctuation Macros

1. ConsensusGroup

Put your name down if you want to be an active member of MB's markup committee. This solely affects whether you want to be consulted before any and all decision-making; everyone is welcome to contribute to this discussion even if not "active". No rapid decisions can be made before 4th November.

Interested parties:

Inactive members:

2. WikiCreole

(From WikiCreole) Perhaps as my last bit of BenevolentDictatorship, I've added MeatballWiki and InfiniteMonkey as potential supporters. Tacit support is not the same as actual implementation, but it's better to be as helpful as we can be than a stick in the mud. -- SunirShah

3. Extended Unicode Support

I would like to add a rule to MeatballWiki that automatically translates such characters to the relevant Unicode versions when the page is saved (rather than rendered), and similarly for all other HTML entities, including &amp;, &lt; and &gt;, as these are occasionally produced by non-compliant editors. A better way of writing e.g. a literal &amp; than &amp;amp; is thus needed; for consistency, I suggest using two double-quotes after the ampersand: &""amp;.

HTML entities may potentially be emitted to user agents that do not accept UTF-8–encoded text. They are terribly ugly in wiki text. -- ChrisPurcell
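A minimal sketch of what such a save-time rule could look like, written in JavaScript purely for illustration. The function name `normalizeEntitiesOnSave` and the small entity table are assumptions, not part of the MeatballWiki engine; a real rule would need the full HTML entity list. Because the pattern requires the entity name to follow the ampersand directly, the proposed escape &""amp; passes through a save untouched.

```javascript
// Hypothetical, deliberately incomplete table of named entities.
const NAMED_ENTITIES = {
  amp: "&", lt: "<", gt: ">", quot: '"',
  ndash: "\u2013", mdash: "\u2014", hellip: "\u2026",
  lsquo: "\u2018", rsquo: "\u2019", ldquo: "\u201C", rdquo: "\u201D",
};

// Replace named and numeric character references with the characters
// they stand for. Unknown or invalid references are left as typed.
function normalizeEntitiesOnSave(wikiText) {
  const reference = /&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);/g;
  return wikiText.replace(reference, (match, body) => {
    if (body[0] === "#") {
      const isHex = body[1] === "x" || body[1] === "X";
      const codePoint = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      const valid = codePoint > 0 && codePoint <= 0x10FFFF;
      return valid ? String.fromCodePoint(codePoint) : match;
    }
    return Object.prototype.hasOwnProperty.call(NAMED_ENTITIES, body)
      ? NAMED_ENTITIES[body]
      : match;
  });
}

console.log(normalizeEntitiesOnSave("dash: &mdash; and &#x2014;"));
console.log(normalizeEntitiesOnSave('literal: &""amp;')); // escape survives the save
```

Note that a rule like this is not idempotent on doubly encoded text: &amp;lt; becomes &lt; on one save and < on the next, which is why some escape for a literal ampersand is needed at all.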

4. Smart Hyphenation

Are you suggesting this is entirely for automatic conversions? Automatically converting a character entity into its associated Unicode character is one thing, but smart quotes (or smart-hyphens, or what have you) are another. 'Smart' anything is usually very frustrating, because as it turns out, computers aren't very smart, and when they are wrong it is too aggravating to 'negotiate' with them. Also bear in mind that people will be copying and pasting rendered text directly into the edit box, so multiplying the quote characters will complicate the underlying regular expressions. -- SunirShah

No smart quotes, no magic. Never fear :) -- ChrisPurcell

Coming back to this, ‘smart hyphenation’ might be a nice idea to improve the presentation of pages following the PunctuationConventions. Most keyboards cannot type proper hyphens, allowing only the ASCII hyphen-minus, but many computers can display them. Hyphens are also only legitimate in specific places, a different set of places from the minus, figure dash and quotation dash, the other uses for hyphen-minus in the above. It is thus technically feasible to allow the editor to specify that smart hyphenation is a good idea on a particular page, one on which all other conflicting uses of the hyphen-minus have been corrected to the relevant dash.

One problem with enabling smart hyphens on a page would be edits by someone unfamiliar with the system; perhaps the option should be similar to the WikiWikiWeb "I can't type tabs" system, i.e. a one-off translation of the hyphen-minus to the hyphen where appropriate. This would also allow any accidents to be corrected.
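The one-off translation could be as small as the following JavaScript sketch. It assumes, purely for illustration, that "where appropriate" means a hyphen-minus with a letter directly on both sides; the function name `convertHyphensOnce` is hypothetical, and the real PunctuationConventions may call for a different test.

```javascript
const HYPHEN = "\u2010"; // U+2010 HYPHEN

// One-off pass: turn a hyphen-minus into a true hyphen only when it
// sits between two letters. Minus signs, number ranges and "--" are
// left alone because they fail the letter test on at least one side.
function convertHyphensOnce(text) {
  return text.replace(/(?<=\p{L})-(?=\p{L})/gu, HYPHEN);
}

console.log(convertHyphensOnce("a well-known mother-in-law, pages 10-12, 5 - 3"));
```

This naive test would also rewrite hyphens inside URLs or code samples, so a real pass would have to skip those regions first.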

Displaying the hyphen code point is a more fraught proposition, as we don't want users of older browsers to see garbage. Automatically replacing hyphen, minus, figure dash and quotation dash code points with the hyphen-minus would solve the display issue, though obviously significant user testing would be needed to ensure this happens if and only if the browser needs it. This is equally difficult when editing the page: simply replacing hyphens with hyphen-minus characters will inadvertently corrupt the original page on saving. Substituting the relevant entity is probably the least of three evils here.
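The two substitutions described above might be sketched like this in JavaScript. The function names are hypothetical, the choice of U+2015 for the quotation dash is an assumption, and deciding which browsers need the fallback is left out entirely.

```javascript
// U+2010 hyphen, U+2012 figure dash, U+2015 horizontal bar
// (assumed here to be the quotation dash), U+2212 minus sign.
const DASH_LIKE = /[\u2010\u2012\u2015\u2212]/g;

// For display in a browser that cannot render these code points:
// fall back to the ASCII hyphen-minus. This loses information.
function toLegacyDisplay(text) {
  return text.replace(DASH_LIKE, "-");
}

// For the edit box in such a browser: substitute a numeric entity so
// the original character can be restored when the page is saved.
function toLegacyEditText(text) {
  return text.replace(DASH_LIKE, (ch) =>
    "&#x" + ch.codePointAt(0).toString(16).toUpperCase() + ";");
}

console.log(toLegacyDisplay("well\u2010known, \u22125"));
console.log(toLegacyEditText("well\u2010known"));
```

The entity form only round-trips if entities are translated back to Unicode on save, as proposed under Extended Unicode Support.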

Finally, what is the advantage of introducing all this extra bloat? In the end, wikis are, visually, always going to be less than pretty; users concentrate on good text, not colourful formatting. Using the correct punctuation character gives a professional feel to the text that helps remedy the lack of visual panache: we look like a serious, considered site, with a correspondingly serious, considered opinion. New users don't need to worry about the issue as the natural use of hyphen-minus and apostrophe-quote can be cleaned up by copy-editors. -- ChrisPurcell

5. Regular Language–Recursive Descent Parser

I've designed a markup parser based on a regular language–recursive descent (RLRD) design. That is, rules are applied in a recursive descent matching the XML output, and at each level of the descent, a set of regular expressions define the rules applied to the text. Thus, the rules that define when, say, a code block begins and ends are determined by a single regular expression, and then other regular expressions determine what markup is applied within that block.

If that means nothing to you, this is simply a way of ensuring that different engines apply the same set of rules identically. Current designs often break down at edge cases, in unpredictable ways.
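To make the idea concrete, here is a toy JavaScript sketch of a parser in that general shape. It is not the engine described above: the grammar, the names `parseLevel` and `GRAMMAR`, and the tiny Creole-like syntax are all assumptions chosen for illustration. Each level of the descent has its own list of regular expressions; a matching rule emits one XML element and names the level used to parse its contents.

```javascript
function escapeXml(s) {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Each level is an ordered list of rules. `inner` names the level used
// for the captured contents, or null when no markup applies inside.
const GRAMMAR = {
  block: [
    { tag: "pre", inner: null, pattern: /\{\{\{\n([\s\S]*?)\n\}\}\}/g },
    { tag: "p", inner: "inline", pattern: /([^\n]+(?:\n[^\n]+)*)/g },
  ],
  inline: [
    { tag: "strong", inner: "inline", pattern: /\*\*([\s\S]+?)\*\*/g },
    { tag: "em", inner: "inline", pattern: /\/\/([\s\S]+?)\/\//g },
  ],
};

// Parse `text` using the rules of one level, recursing into matches.
function parseLevel(text, level, grammar) {
  let out = "";
  let pos = 0;
  while (pos < text.length) {
    // Pick the rule that matches earliest; list order breaks ties.
    let best = null;
    for (const rule of grammar[level]) {
      rule.pattern.lastIndex = pos;
      const m = rule.pattern.exec(text);
      if (m && m[0].length > 0 && (best === null || m.index < best.m.index)) {
        best = { rule, m };
      }
    }
    if (best === null) break;
    out += escapeXml(text.slice(pos, best.m.index));
    const body = best.rule.inner === null
      ? escapeXml(best.m[1])
      : parseLevel(best.m[1], best.rule.inner, grammar);
    out += `<${best.rule.tag}>${body}</${best.rule.tag}>`;
    pos = best.m.index + best.m[0].length;
  }
  return out + escapeXml(text.slice(pos));
}

const sample = "Some **bold //nested//** text\n\n{{{\n**not** <markup>\n}}}";
console.log(parseLevel(sample, "block", GRAMMAR));
// <p>Some <strong>bold <em>nested</em></strong> text</p>
//
// <pre>**not** &lt;markup&gt;</pre>
```

Because where a block starts and ends is decided by one expression, and what happens inside it by a separate rule list, two engines sharing the same grammar table should agree on edge cases such as markup inside a code block.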

I've successfully coded a [MeatballWiki-style ‘live preview’] in Javascript using this design. Future revisions of MB may use this engine internally, if nobody objects.

Additionally, I intend to transpose the WikiCreole rules into a RLRD engine, and make it available to the WC team as a benchmark, or potentially a golden standard if they approve. -- ChrisPurcell

Putting aside the merits of this approach for the moment, and speaking only about the question of our CommunityMembers' acceptance, it's not a matter of no one objecting, since I doubt most fully understand the impact of the change. It is a matter of painstakingly demonstrating that any new rendering engine would have either a negligible impact or an improved outcome. That means a lot of unit testing. However, I recognize that the most likely strategy is bundling the engine with a better WikiEngine that would essentially force conversion if people agree to adopt the better engine for other features. I don't accept that. I want to see a gauntlet of syntax acceptance tests, even (especially) for WikiCreole, which is more complex. -- SunirShah

Absolutely. I'd have to insist on suspending the idea of rapid decisions, et cetera, when it actually comes to making changes to the MB engine. I see this committee as about directing the development of future solutions, not imposing them on the rest of the community. Unfortunately, given the priorities of most of our community (i.e. not on the minutiae of our engine, and quite rightly so), to get a proper, considered response from those we represent, we're likely to have to make live changes at some point to elicit proper feedback. Still, that's quite a way off.

Do you mean unit acceptance tests, or community acceptance tests? I'm not sure what the latter would mean, but it sounds promising. -- ChrisPurcell

The method used with the recursive regular expressions is definitely one of the better ways of parsing WikiMarkup. But I have some reservations as to whether it should be done in JavaScript. Performing TransClusions, or even displaying a different style for WantedPages, would require additional HTTP requests, which may cost more than parsing on the server.

Also, I quite like using a DOM implementation for building the output, as this guarantees at least XML compliance while still allowing the parser to run in a single pass. It also makes it easier to handle situations where content is moved from its relative place in the WikiMarkup (e.g. MediaWiki <ref></ref> <references/>).

PS. If you rename ITRender.html to ITRender.xhtm (or set the Content-Type to application/xhtml+xml) to trigger Mozilla's XHTML parsing, it'll throw up two minor problems: a missing closing / on an <img> (line 192) and on an <hr> (line 592). -- JaredWilliams

It doesn't have to be done in Javascript, since the method is language-neutral. Anything supporting regexen should do fine. The Javascript markup's main benefit is the ability to play live, as it were, in the browser. Nevertheless, you could still do things client-side and not use extra HTTP requests: simply determine the backlinks server-side, and send out a list of which ones are WantedPages along with the page. Alternatively, you could install server-side JS, and simply use the engine for convenience, not off-loading work to the client.
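As a rough JavaScript illustration of the first option, the server could ship the page text together with the names of linked pages that do not exist yet, and the client could style links from that list without further requests. The payload shape, the `renderLinks` name, the CSS class names and the bare CamelCase link pattern are all assumptions for the sake of the sketch.

```javascript
// Hypothetical payload sent along with the page.
const payload = {
  text: "See WikiCreole and SmartHyphenation for details.",
  wantedPages: ["SmartHyphenation"], // assumed: linked pages that do not exist yet
};

// Turn CamelCase words into links, marking wanted pages with a class.
function renderLinks(text, wantedPages) {
  const wanted = new Set(wantedPages);
  return text.replace(/\b(?:[A-Z][a-z]+){2,}\b/g, (name) => {
    const cls = wanted.has(name) ? "wanted" : "existing";
    return `<a class="${cls}" href="${name}">${name}</a>`;
  });
}

console.log(renderLinks(payload.text, payload.wantedPages));
```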

Haven't yet added movable content: MB's <toc> macro is another good example of this. Also, can't remember if I missed the closing / on purpose or not :) -- ChrisPurcell

For those interested in this project, I've made a crack at a Creole 0.4 RLRD parser; see [MeatballSociety:Creole/0.4]. It uses the same basic engine as the MB parser; only the rules have changed. -- ChrisPurcell

I enhanced it to comply with the [WikiCreole:Creole1.0] standard and cleaned up the source a little, too. See it here: [JavaScript Creole 1.0 Wiki Markup Parser]. BTW, what happened to the above site? -- IvanFomichev

6. Windows Punctuation Macros

Using JavaScript to provide user-configurable macros seems possible, so the user could bind common or difficult sequences to, say, an Alt+digit key combination. -- JaredWilliams

I had a quick stab at implementing it [1]. It's Gecko-specific, but it works. Alt+[1-9] runs a macro, and Control+[1-9] starts and stops recording. It compiles a macro by capturing keypress events, and then replays them when the macro is run. Unfortunately, not all keypress events seem equal; most notably, the cursor keys do not seem to do anything when replayed. It would have been nice to be able to position the cursor relatively: a macro for '«  »' (bound by default to Alt+1) should leave the cursor in the center, so typing or pasting the quotation would be easier. You can run macros whilst recording; the output of the run macro is inserted literally into the macro recording, though just recording the keypress event required to invoke it may be more desirable. -- JaredWilliams
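One way around the cursor-key problem, sketched here in JavaScript as an assumption rather than as a description of the implementation above, would be to store a macro as a list of editing steps instead of raw keypress events and apply them to the text and caret position directly. The step format and the `runMacro` name are hypothetical.

```javascript
// Apply a macro to an editor state { text, caret }. A step either
// inserts text at the caret or moves the caret by a relative amount.
function runMacro(state, steps) {
  let { text, caret } = state;
  for (const step of steps) {
    if (step.insert !== undefined) {
      text = text.slice(0, caret) + step.insert + text.slice(caret);
      caret += step.insert.length;
    } else if (step.move !== undefined) {
      caret = Math.min(text.length, Math.max(0, caret + step.move));
    }
  }
  return { text, caret };
}

// Guillemets with the caret left between the two inner spaces.
const GUILLEMETS = [{ insert: "\u00AB  \u00BB" }, { move: -2 }];

console.log(runMacro({ text: "He said ", caret: 8 }, GUILLEMETS));
```

In a browser, the resulting text and caret could be written back to a textarea through its `value` and `setSelectionRange`, which avoids replaying cursor-key events altogether.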

