InternalBackLink


An "InternalBackLink of page X" is a link from page Y to page X, where X and Y are on the same web site.

See also BackLink.

Implementing BackLinking on wikis

The simplest way to find backlinks in the PageDatabase is to search the text of each page for the title. One can do better in Perl by anchoring the link pattern at word boundaries, à la /\bLinkPattern\b/gs. That way, searching for pages like ModWiki won't falsely bring up references to UseModWiki.
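A minimal sketch of that search in Perl, assuming a %pages hash (title => raw text) as a hypothetical stand-in for the real PageDatabase:

    use strict;
    use warnings;

    # Find backlinks to $title by plain text search, anchored at word
    # boundaries so that "ModWiki" does not also match "UseModWiki".
    sub backlinks {
        my ($title, $pages) = @_;
        my $re = qr/\b\Q$title\E\b/;   # \Q..\E escapes any metacharacters
        return grep { $pages->{$_} =~ $re } sort keys %$pages;
    }

    my %pages = (
        'FrontPage'  => 'See ModWiki and UseModWiki for details.',
        'UseModWiki' => 'UseModWiki is a descendant of ModWiki.',
        'SandBox'    => 'Nothing to see here.',
    );
    print "$_\n" for backlinks('ModWiki', \%pages);   # FrontPage, UseModWiki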

Still, this isn't perfect. True backlinking finds exactly the pages that contain the link, not merely the text. Although the word-boundary regex is much more restrictive than a full-text search, it would still produce false positives for small words and phrases. Consider someone creating a new page called "the" (using FreeLinks) just for kicks. You want to delete not just the page but also the dangling links to it, so other people don't come along and click on the NoSuchPageSyntax. Without true backlinking, this becomes much more difficult. However, one could still extend the backlink search to be sensitive to the FreeLinkPattern, which is commonly [[link]]. Thus /\[\[the\]\]/gs would do it.
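Extending the earlier sketch to the FreeLinkPattern is a one-line change to the regular expression (again using the hypothetical %pages from above):

    # Match [[the]] literally; \Q..\E protects titles that happen to
    # contain regex metacharacters.
    sub freelink_backlinks {
        my ($title, $pages) = @_;
        my $re = qr/\[\[\Q$title\E\]\]/;
        return grep { $pages->{$_} =~ $re } sort keys %$pages;
    }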

Nonetheless, one might still incorrectly count a match of the LinkPattern that isn't actually a link, for instance because it is escaped with the <nowiki/> tag. So we might want a more powerful backlink search that isn't merely a text search, but one genuinely sensitive to the entire syntax of the wiki. This generally means either parsing every page when searching it (very slow) or generating a BackLink database.
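Short of a full parse, one partial remedy is to strip escaped spans before matching. A sketch, assuming <nowiki>...</nowiki> and the self-closing <nowiki/> are the only escape syntax in play:

    # Remove escaped regions so they no longer match the LinkPattern.
    sub strip_escapes {
        my ($text) = @_;
        $text =~ s/<nowiki>.*?<\/nowiki>//gs;  # drop paired nowiki spans
        $text =~ s/<nowiki\s*\/>/ /g;          # keep Mod<nowiki/>Wiki split apart
        return $text;
    }

A real implementation would have to track every escaping rule the wiki's parser knows about, which is why the argument points toward parsing or a BackLink database.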

You don't need to parse every page when searching, only those pages that match the regexp. This is typically one to three orders of magnitude fewer pages.
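In code, that is a cheap text pass to pick candidates, then a real parse of just those pages. A sketch; the stand-in parse_links below treats CamelCase words outside <nowiki> spans as links, where a real wiki would call its own parser:

    # First pass: fast regex over every page. Second pass: parse only
    # the candidates and keep those whose link list contains $title.
    sub exact_backlinks {
        my ($title, $pages) = @_;
        my @candidates = grep { $pages->{$_} =~ /\b\Q$title\E\b/ }
                         keys %$pages;
        return sort grep {
            my %links = map { $_ => 1 } parse_links($pages->{$_});
            $links{$title};
        } @candidates;
    }

    # Stand-in parser: CamelCase words outside <nowiki> spans.
    sub parse_links {
        my ($text) = @_;
        $text =~ s/<nowiki>.*?<\/nowiki>//gs;
        return $text =~ /\b([A-Z][a-z]+(?:[A-Z][a-z]+)+)\b/g;
    }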

WikiWiki uses a BackLink database to reduce the load on the server. This can become infuriating to RecentChangesJunkies because the BackLink database is updated only once a day at best, so new pages, or new uses of existing pages, appearing on RecentChanges often lack context when one searches the BackLinks. Also, you may miss a referral when deleting a page.

One could always update the backlink database every time a page is saved. AtisWiki did this. It became intractable because, when saving a new page, one had to search across the whole PageDatabase to discover which existing pages already linked to it. One solution is to store BackLinks even to WantedPages, so that a new page's backlink list already exists the moment the page is created. This may be reasonable if you have the storage space available for this database and a fast way to load and store the backlinks from and to it.

An examination of the Meatball:LinkDatabase shows that there are 10,208 links (including links to wanted pages) spread over 2,184 pages. I then dumped the link listing output into Excel, moved the first word (the page name) into a new column, multiplied the lengths of the two columns together (the length of the page name times the length of all the link names), and did a sum(): a total of 2.31 MB would be required for the entire backlinks database, including delimiters. Hardly onerous.
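For comparison, a flat file of one "PageName link1 link2 ..." line per page costs exactly the byte length of the listing itself; a sketch, assuming that hypothetical format on standard input:

    # Size of a flat backlink database, delimiters included: just the
    # total byte length of the listing.
    my $bytes = 0;
    while (my $line = <STDIN>) {
        $bytes += length $line;
    }
    printf "%.2f MB\n", $bytes / (1024 * 1024);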

MonkieMonkie takes this approach. Each page (including WantedPages) has a separately maintained list of backlinks. When a page is saved, the script checks for any links that were removed or added, and updates the backlink lists for those pages only. Listing the backlinks for a page then requires no searching because you just read out the backlink list for that page.
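A sketch of that save-time update, with an in-memory hash standing in for however MonkieMonkie actually stores its per-page lists:

    my %backlinks;  # target page => array ref of referring pages
    sub read_backlinks  { @{ $backlinks{$_[0]} || [] } }
    sub write_backlinks { my $t = shift; $backlinks{$t} = [@_] }

    # On save, diff the page's old and new link lists and touch only
    # the backlink lists of targets whose status changed.
    sub update_backlinks {
        my ($page, $old_links, $new_links) = @_;
        my %old = map { $_ => 1 } @$old_links;
        my %new = map { $_ => 1 } @$new_links;
        for my $target (grep { !$new{$_} } keys %old) {   # links removed
            write_backlinks($target,
                            grep { $_ ne $page } read_backlinks($target));
        }
        for my $target (grep { !$old{$_} } keys %new) {   # links added
            write_backlinks($target, read_backlinks($target), $page);
        }
    }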


CategoryWebTechnology CategoryWikiTechnology
