LinkPattern

MeatballWiki | RecentChanges | Random Page | Indices | Categories

In a wiki, the LinkPattern is the RegularExpression that the script matches against page text to decide which text becomes a link.

See LinkPatternSuggestions for suggested changes or additions to this wiki's LinkPatterns.

CategoryWikiTechnology CategoryIndexingScheme


For those who know Perl, this wiki's LinkPattern(s) are:

# Current for UseModWiki/MeatballWiki 0.8.8
$UseSubpage  = 0;       # 1 = use subpages,       0 = do not use subpages
$SimpleLinks = 1;       # 1 = only letters,       0 = allow _ and numbers
$NonEnglish  = 1;       # 1 = extra link chars,   0 = only A-Za-z chars

sub InitLinkPatterns {
  my ($UpperLetter, $LowerLetter, $AnyLetter, $LpA, $LpB, $QDelim);

  $UpperLetter = "[A-Z";
  $LowerLetter = "[a-z";
  $AnyLetter   = "[A-Za-z";

  if ($NonEnglish) {
    $UpperLetter .= "\xc0-\xde";
    $LowerLetter .= "\xdf-\xff";
    $AnyLetter   .= "\xc0-\xff";
  }
  if (!$SimpleLinks) {
    $AnyLetter .= "_0-9";
  }
  $UpperLetter .= "]"; $LowerLetter .= "]"; $AnyLetter .= "]";

  # Main link pattern: lowercase between uppercase, then anything
  $LpA = $UpperLetter . "+" . $LowerLetter . "+" . $UpperLetter
         . $AnyLetter . "*";
  # Optional subpage link pattern: uppercase, lowercase, then anything
  $LpB = $UpperLetter . "+" . $LowerLetter . "+" . $AnyLetter . "*";

  if ($UseSubpage) {
    # Loose pattern: If subpage is used, subpage may be simple name
    $LinkPattern = "((($LpA)?\\/$LpB)|$LpA)";
    # Strict pattern: both sides must be the main LinkPattern
    # $LinkPattern = "((($LpA)?\\/)?$LpA)";
  } else {
    $LinkPattern = "($LpA)";
  }
  $QDelim = '("")?';     # Optional quote delimiter (not in output)
  $LinkPattern .= $QDelim;

  # Url-style links are delimited by one of:
  #   1.  Whitespace (kept in output)
  #   2.  Left angle-bracket (<)  (kept in output)
  #   3.  A single double-quote (")  (kept in output)
  #   4.  A double double-quote ("") (removed from output)

  # Inter-site convention: sites must start with uppercase letter
  # (Uppercase letter avoids confusion with URLs)
  $InterSitePattern = $UpperLetter . $AnyLetter . "+";
  $InterLinkPattern = "(($InterSitePattern:[^\\s\"<]+)$QDelim)";
  $UrlProtocols = "(http|ftp|afs|news|nntp|mid|cid|mailto|wais|"
                  . "prospero|telnet|gopher)";
  $UrlPattern = "((($UrlProtocols):[^\\s\"<]+)$QDelim)";
  $ImageExtensions = "(gif|jpg|png|bmp|jpeg)";
  $RFCPattern = "RFC\\s?(\\d+)";
  $ISBNPattern = "ISBN:?([0-9- xX]{10,})";
}
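The delimiter rules in the comments above can be tried out with a small sketch. This is a Python transliteration (not the wiki's Perl) of $UrlPattern and $QDelim only; the helper name find_urls and the example.org address are assumptions made for illustration. It shows that a double double-quote is consumed by the match while whitespace is left in the text.

```python
import re

# Transliteration of $QDelim and $UrlPattern from the Perl above.
QDELIM = r'("")?'  # optional "" delimiter, consumed by the match
URL_PROTOCOLS = ("(http|ftp|afs|news|nntp|mid|cid|mailto|wais|"
                 "prospero|telnet|gopher)")
URL_PATTERN = re.compile(rf'((({URL_PROTOCOLS}):[^\s"<]+){QDELIM})')


def find_urls(text):
    """Return (url, consumed_text) pairs for each URL-style link.

    find_urls is a hypothetical helper, not part of UseModWiki.
    Group 2 is the URL itself; group 1 also includes a trailing ""
    delimiter when one is present.
    """
    return [(m.group(2), m.group(1)) for m in URL_PATTERN.finditer(text)]


if __name__ == "__main__":
    sample = 'See http://usemod.com/""s and ftp://example.org now'
    for url, consumed in find_urls(sample):
        print(repr(url), "consumed as", repr(consumed))
```

In the first link the "" lets the author attach a trailing "s" without it becoming part of the URL; in the second, the space ends the URL and stays in the output.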


For the other 98% who don't read Perl regular expressions, the Meatball pattern is:

One or more uppercase letters, then one or more lowercase letters, then one uppercase letter, then "any letters" (either upper or lowercase). (For non-Meatball wikis using UseModWiki, "any letters" can include underscores and numbers.)
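As a rough illustration of that description, here is a Python sketch (not the wiki's Perl) of the main pattern $LpA under the settings shown above ($SimpleLinks = 1, $NonEnglish = 1, $UseSubpage = 0). The name is_link is an assumption for illustration, and the sketch works on Unicode strings whose code points coincide with Latin-1, whereas the Perl script works on bytes.

```python
import re

# Character classes as built by InitLinkPatterns with $NonEnglish = 1
# and $SimpleLinks = 1: Latin-1 letters allowed, no digits or underscores.
UPPER = "[A-Z\xc0-\xde]"
LOWER = "[a-z\xdf-\xff]"
ANY = "[A-Za-z\xc0-\xff]"

# $LpA: uppercase+, lowercase+, one uppercase, then any letters.
LINK_PATTERN = re.compile(f"{UPPER}+{LOWER}+{UPPER}{ANY}*")


def is_link(word: str) -> bool:
    """Return True if the whole word matches the main link pattern.

    is_link is a hypothetical helper, not a UseModWiki function.
    """
    return LINK_PATTERN.fullmatch(word) is not None


if __name__ == "__main__":
    for word in ["LinkPattern", "JohnQPublic", "Link", "Link2Page"]:
        print(word, is_link(word))
```

"Link" fails because there is no second uppercase letter, and "Link2Page" fails because digits are only allowed when $SimpleLinks is 0.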

UseModWiki's LinkPattern is purposefully looser than WikiWiki's, mostly to accommodate names with middle initials, and a few cases like titles with "A" in the middle. Many users have been confused by the Wiki rules--watch Wiki:RecentVisitors for frequent examples.


WikiWiki's LinkPattern is: \b([A-Z][a-z]+){2,}\b

This is considered to be the CamelCase link pattern and is the "standard" format. (More like the reference format.) Most wiki sites with automatic linking will create links on this pattern.
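To see how this differs from the looser UseModWiki pattern, the following Python sketch checks a few words against both. The UseModWiki pattern here is an ASCII-only simplification of the main pattern, and the helper name accepts is an assumption for illustration.

```python
import re

# WikiWiki's pattern exactly as quoted above.
WIKIWIKI = re.compile(r"\b([A-Z][a-z]+){2,}\b")

# ASCII-only simplification of UseModWiki's main pattern (assumption:
# $NonEnglish and $SimpleLinks details are ignored here).
USEMOD = re.compile(r"[A-Z]+[a-z]+[A-Z][A-Za-z]*")


def accepts(pattern: re.Pattern, word: str) -> bool:
    """Return True if the pattern matches the whole word."""
    return pattern.fullmatch(word) is not None


if __name__ == "__main__":
    for word in ["CamelCase", "JohnQPublic", "ThisIsATitle", "Wiki"]:
        print(word, accepts(WIKIWIKI, word), accepts(USEMOD, word))
```

Names with a middle initial and titles containing a lone "A" are rejected by the WikiWiki pattern because every capital must be followed by at least one lowercase letter; the UseModWiki pattern only requires that after the first run of capitals.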


LinkPattern usually refers to just the pattern for normal page links. There are other patterns which match URLs and images. (These patterns are now listed in the code above.)

I was talking to OriFolger, who has set up a wiki in Hebrew at http://www.rashreshet.org (BrokenLink dec 2003). Since Hebrew has no minuscule/majuscule distinction, these "bumpy" link patterns aren't applicable. He decided on the "standard" free-form link pattern à la "[link pattern]". On the other hand, YasushiIwata has a Japanese MoinMoin at http://www.sh.rim.or.jp/~yasusii (BrokenLink dec 2003). There, as far as I can tell, the page names follow the standard MoinMoin pattern, essentially CamelCase in the Latin alphabet. Consequently, they use English page names with Japanese content. This further reinforces the need to tailor the LinkPattern to the site's particular needs rather than enforce a "standard" pattern. -- SunirShah


Thank you, Cliff, for the elegant external link pattern in brackets, e.g. [homepage of usemod]. Now MeatballWiki can rather easily refer to arbitrary WikiPages on external wikis, using their native names. Could this be combined with InterWiki prefixes, looking like [Tcl:Tcl community projects] in browse mode and [Tcl:Tcl community projects] in edit mode? -- FridemarPache


Some notes

This page is obsolete and bogus. I'll be rewriting it over time, as well as trying to develop better WikiSyntaxSemantics for links.

The PatternLanguage-based LinkPattern that we have here, known as CamelCase, is possibly an in-between case if your brain isn't already written in Smalltalk, but it worked so well for the PortlandPatternRepository simply because it was very natural for those authors to name concepts (i.e. patterns) as single tokens written in CamelCase.

The GaGaParser takes implicit links to a new level.

InterWiki links are explicit links. TwinPages are implicit links.

I secretly believe that a well designed implicit link system is much better than an explicit link system.

Some people have done \(very_strange_things) with free links, such as combining the two types whilst adding extraneous (and therefore useless) syntax for the hell of it. Don't do this.

The external form of the link syntax also makes link verification very complex because it demands two versions of the LinkPattern: one for the syntax with the begin/end markers, and one for the page nym matching the WikiNameCanonicalization format.

The internal form loses a character from the domain.

_link_pattern_ is also an internal form, even with begin and end markers, because the essential linking is done by connecting words together. This pattern allows single word _links_.

At the very least, it's probably a good idea to canonicalize to a single, invariant nym format. If you have multiple link patterns, accept them all on the URL, but redirect to a canonicalized version of the nym. For instance, suppose MeatballWiki moved to the link_pattern format, so this page would become http://usemod.com/cgi-bin/mb.pl?link_pattern. For backward compatibility, we would still accept the old LinkPattern, but canonicalize it. So, when someone goes to http://usemod.com/cgi-bin/mb.pl?LinkPattern, LinkPattern canonicalizes to link_pattern, and the script then redirects to http://usemod.com/cgi-bin/mb.pl?link_pattern.
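A minimal Python sketch of that accept-then-redirect idea follows. The function names canonicalize and resolve, the choice of a 301 status, and the rule "insert an underscore wherever a lowercase letter is followed by an uppercase one" are all assumptions for illustration; the text above does not specify how the canonical form would be computed.

```python
import re

BASE_URL = "http://usemod.com/cgi-bin/mb.pl?"  # script URL from the text above


def canonicalize(nym: str) -> str:
    """Map LinkPattern or link_pattern to the single form link_pattern.

    Assumed rule: put an underscore at each lowercase-to-uppercase
    boundary, then lowercase everything.
    """
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", "_", nym)
    return spaced.lower()


def resolve(requested: str):
    """Return (status, url): serve canonical nyms, redirect the rest."""
    canonical = canonicalize(requested)
    if requested != canonical:
        # Old-style nym: still accepted, but redirected (301 is an assumption).
        return 301, BASE_URL + canonical
    return 200, BASE_URL + canonical


if __name__ == "__main__":
    print(resolve("LinkPattern"))
    print(resolve("link_pattern"))
```

Both spellings end up at the same URL, so the PageDatabase only ever needs to know one nym per page.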

Then again, suppose we made the underscore (_) a linkable but non-significant character, so that Link_Pattern would link the same as LinkPattern. If you go to http://usemod.com/cgi-bin/mb.pl?Link_Pattern, the script could be intelligent enough to space the nym as Link Pattern. On the other hand, if you go to http://usemod.com/cgi-bin/mb.pl?LinkPattern, the script would emit the nym as LinkPattern. This would allow Marty_McFly to render properly, whilst still remaining MartyMcFly from the PageDatabase's point of view. This doesn't break WYSIWYG, though it does add the mystifying part about WikiNameCanonicalization to the mix.
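The underscore idea can be sketched in a few lines of Python. The names page_key and display_name are assumptions for illustration: one gives the nym as the PageDatabase would see it, the other the nym as it would be rendered.

```python
def page_key(nym: str) -> str:
    """Nym as stored in the PageDatabase: underscores are not significant."""
    return nym.replace("_", "")


def display_name(nym: str) -> str:
    """Nym as rendered to the reader: underscores become spaces."""
    return nym.replace("_", " ")


if __name__ == "__main__":
    for nym in ["Link_Pattern", "LinkPattern", "Marty_McFly"]:
        print(nym, "->", page_key(nym), "/", display_name(nym))
```

Link_Pattern and LinkPattern share one page key, but only the underscored spelling is displayed with a space.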

Speaking about disambiguators, is there any way I can configure UseMod to accept Shiva(Skansen) as a link without brackets? [[Dan Koehl]]

--- New to MeatBall, not sure where to post this. :-/

FreeLinks cover most of my needs, except for losing some of the convenience of CamelCase-ing links. I would like to get feedback on the social and technical ramifications of a linking rule for camel cased links that, after validating a potential wiki link, matched it against that word regardless of case. Thus PythOn would match PyThon but not Python. The latter case can be covered by a bracket syntax. GaGaParser is too liberal for my needs. - ZWiki:DeanGoodmanson
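One possible reading of that rule, sketched in Python: a word must first validate as a camel-cased link, and only then is it looked up among existing page names without regard to case. The function name resolve_link and the ASCII-only pattern are assumptions for illustration, not part of any existing wiki engine.

```python
import re
from typing import Iterable, Optional

# ASCII-only simplification of a camel-case link pattern (assumption).
CAMEL = re.compile(r"[A-Z]+[a-z]+[A-Z][A-Za-z]*")


def resolve_link(word: str, pages: Iterable[str]) -> Optional[str]:
    """Return the existing page a word links to, or None.

    Step 1: the word itself must validate as a camel-cased link.
    Step 2: it is matched against page names regardless of case.
    """
    if CAMEL.fullmatch(word) is None:
        return None  # e.g. "Python" is not camel-cased, so no implicit link
    by_folded = {name.casefold(): name for name in pages}
    return by_folded.get(word.casefold())


if __name__ == "__main__":
    pages = ["PyThon"]
    for word in ["PythOn", "PyThon", "Python"]:
        print(word, "->", resolve_link(word, pages))
```

Under this reading "Python" never links implicitly, which is why the bracket syntax would still be needed for it.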


So, I'd love to hear about experiments with other patterns for ImplicitLinks. Sure, CamelCase is all well and good, but what about some other ideas?

--EvanProdromou

ProWiki supports

-- HelmutLeitner

Discussion

PatrickAnderson -- Thu Sep 3 16:43:38 2009

RE: ImplicitLinks - EvanProdromou, you may be interested in http://CommunityWiki.org/en/PlainLink

