It's certainly possible to create an HTML cache; however, the problem becomes one of bookkeeping. If a page contains links to WantedPages, it will be cached with the NoSuchPageSyntax. However, if one of those wanted pages gets created in the meantime, the HTML cache becomes invalid. Similar problems occur if a referenced page is deleted.
Solutions to this problem are varied. One could store the list of links (separated into "existing" and "non-existing") in the cache database, and do a quick check each time the cached HTML is fetched to see if the sets have changed. This check need not be inefficient, because it can be much faster to search the PageDatabase for many titles at once than to issue a separate query for each one. Even if the cache turns out to be invalid, the translator can use this information to determine more efficiently what type of link to generate.
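The validity check above could look something like this minimal sketch (the cache record layout and `page_exists_bulk` are my own illustrative names, not any wiki's actual API):

```python
# Sketch: validate a cached page by re-checking its stored link sets.
# A record holds the HTML plus the "existing" and "wanted" link sets
# observed when the page was rendered.

def page_exists_bulk(db, titles):
    """Return the subset of titles that exist, conceptually one bulk query."""
    return {t for t in titles if t in db}  # stand-in for a single SELECT ... IN (...)

def cache_is_valid(db, record):
    """The cache is valid only if the same links still exist (and no more)."""
    existing_now = page_exists_bulk(db, record["existing"] | record["wanted"])
    return existing_now == record["existing"]

# Usage: at render time store the sets alongside the HTML; at fetch time
# serve record["html"] only if cache_is_valid(db, record).
db = {"FrontPage", "RecentChanges"}
record = {"html": "<p>...</p>", "existing": {"FrontPage"}, "wanted": {"WantedPage"}}
```

A creation of "WantedPage" (or a deletion of "FrontPage") changes the existing set and so marks the cache stale, covering both failure cases from the paragraph above.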
Another solution is to maintain a BackLinkDatabase?, including backlinks to wanted pages. When a page's status changes between existing and non-existing, simply invalidate the caches of all its backlinks. This has the additional advantage of speeding up backlink searches (and also making them accurate), as well as allowing more interesting graph analysis. However, the number of race conditions now grows as O(N^2), not to mention the amount of computation required on the server for each and every update. The result would likely be a less stable system, and certainly a slower one.
The best practice is to store the links (or any other potentially mutable portion of the page) as an unparsed section that gets reparsed every time. Links don't take long to look up, so that won't be too costly. Sections like RssInclusion may be slower, but by their very definition they will be slow anyway. Caching the RSS would be useful as well.
OddMuse uses two strategies: caching of partial HTML fragments, and support for HTTP/1.1 caching.
Oddmuse uses a cache of HTML and raw text fragments. Assume the raw text is "This is a WikiLink." The parser will split this into three fragments. "<p>This is a " is cached as HTML, "WikiLink" is cached as raw text (and will be reparsed whenever the cache is used), and finally "." is cached as HTML. For every text formatting rule, Oddmuse knows whether the output can change in the future without the page itself changing. This is true for all sorts of local links, for example.
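The fragment split can be sketched roughly like this (Oddmuse is written in Perl; this is a Python approximation with made-up names, keeping WikiLinks as raw text so they are re-resolved on every request):

```python
import re

# Two-hump CamelCase words are treated as WikiLinks.
WIKILINK = re.compile(r'\b([A-Z][a-z]+(?:[A-Z][a-z]+)+)\b')

def split_fragments(raw):
    """Split raw wiki text into ('html', ...) and ('raw', ...) fragments."""
    frags, pos = [], 0
    for m in WIKILINK.finditer(raw):
        if m.start() > pos:
            frags.append(('html', raw[pos:m.start()]))  # safe to cache as HTML
        frags.append(('raw', m.group(1)))               # re-parse at serve time
        pos = m.end()
    if pos < len(raw):
        frags.append(('html', raw[pos:]))
    return frags

def render(frags, page_exists):
    """Re-resolve only the raw fragments against the current page database."""
    out = []
    for kind, text in frags:
        if kind == 'html':
            out.append(text)
        elif page_exists(text):
            out.append('<a href="/wiki/%s">%s</a>' % (text, text))
        else:
            out.append('%s<a href="/wiki/%s">?</a>' % (text, text))
    return ''.join(out)
```

With the example text, `split_fragments("This is a WikiLink.")` yields the three fragments described above: two cached HTML pieces and one raw "WikiLink" piece.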
That's semi-caching, basically -- render text formatting into HTML and leave the wiki links that depend on the state of the entire database till later.
In addition to that, you can store the time of the last change to the page database somewhere, and send that in every response using the Last-Modified header. When the client (browser or cache) then requests a page a second time using an If-Modified-Since header, we can just reply with 304 Not Modified if no edits have happened since then. See RFC 2616 for details.
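A minimal sketch of that conditional-response logic, framework-agnostic (`last_db_change` is assumed to be the stored Unix timestamp of the most recent edit anywhere in the page database):

```python
from email.utils import formatdate, parsedate_to_datetime

def respond(last_db_change, if_modified_since=None):
    """Return (status, headers) for a GET, honoring If-Modified-Since."""
    if if_modified_since is not None:
        client_time = parsedate_to_datetime(if_modified_since).timestamp()
        if last_db_change <= client_time:
            return 304, {}  # Not Modified: empty body, client reuses its copy
    # Full response, stamped so the client can revalidate next time.
    return 200, {'Last-Modified': formatdate(last_db_change, usegmt=True)}
```

Note that using the whole database's last-change time (rather than each page's) is deliberately coarse: it trades a few unnecessary re-sends for never having to track staleness per page.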
It'd also be possible to use 404 handlers to do HTML caching. Apache lets you define scripts as 404 handlers; some other Web servers do, too.
When a page is saved, it's only necessary to delete the file from the cache directory, as well as the files for any related pages (to update broken links, for example). The next request will trigger the 404 handler (file isn't there, it got deleted) which will re-render the page.
Note also that you can use Apache's (or other servers') Multiviews feature to leave off the ".html" at the end of the file name. You could even play around with serving XML+XSL (or XML+CSS) and HTML transparently. And that you don't have to worry about rendering links differently for existing/non-existing pages. OK, well, you have to show them differently, but the URL you use can be the same ("/wiki/ExistingPage?", "/wiki/NonExistingPage?").
The whole thing is predicated on the assumption that a web server serving a static file will be much faster and more resource-efficient than one firing off any dynamic page machinery (CGI, PHP, ASP, whatever). This is usually the case, so it's worth doing.
The cache may grow pretty big. You can either have a scheduled task go in and reap the LRU pages, or the 404 handler can do that before writing out the page it's rendering. The first is fast but potentially risky (the cache may overflow before the reaper gets to it); the second is safer but requires some housework on the part of the 404 generator, which should probably be heavily optimized. A paranoid admin might just want to reap everything in the cache every N minutes or so.
You could extend this strategy for some page-info pages, like a PageHistory feature. Just map a different directory ("/history") for page histories, and have a separate 404 generator there. This means some more careful cache invalidation at edit time, but, hey, if you need some speed, it could help significantly. --EvanProdromou
I like it. Not just for the speed, but because I like programs to use the filesystem as their data representation as much as possible (because that allows more unforeseen interoperability with standard tools). My only complaint is that you aren't really following the semantics of the idea of a missing-page handler. I wonder if there is any "less surprising" way to do this? -- BayleShanks
ErrorHandler? is for errors, and a cache miss is not a server-level error; it's rather a condition. You should simply use tools like mod_rewrite to handle that condition and check whether the file exists (or an entry in a database, and maybe some other conditions; you get more options and more control that way).
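A minimal mod_rewrite sketch of that idea (the paths and script name are illustrative):

```apache
# Serve the static cache file if it exists; otherwise fall through to the wiki script.
RewriteEngine On
RewriteCond %{DOCUMENT_ROOT}/cache/$1.html -f
RewriteRule ^wiki/(.*)$ /cache/$1.html [L]
RewriteRule ^wiki/(.*)$ /cgi-bin/wiki.pl?$1 [L]
```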
Clearly, maintaining a list of backlinks is the best solution. With the help of a database it would be quick and simple. You just have to mark cached pages stale on edit, which is a comparatively rare occurrence.
And for best performance you could put a caching gateway server (a Squid proxy?) on top of the inner wiki cache. It will be useful if you get a lot of reads from anonymous users (like Wikipedia). Of course, for the HTTP cache to work efficiently, proper HTTP headers need to be generated (Last-Modified, Expires, no cookies).
If you ask me, the cost in resources of generating the HTML from data with WikiLink?s is lower than the cost in resources of creating and storing an HTML cache, plus the cost in development of designing and coding it, plus the extra debugging and support afterwards, especially on web servers that have plenty of CPU power and memory but still a relatively slower read-from-disk speed (even if only a little). -- StijnSanders
MoinMoin compiles pages into Python byte code. The code consists of request.write("HTML") and formatter.dynamic_item(params) calls. This has sped up rendering by a factor >> 10 if you don't have expensive macros in the page. The implementation uses the separation between parser and formatter by implementing a kind of "meta formatter". See MoinMoin:MoinMoinIdeas/WikiApplicationServerPage for details.
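The spirit of that approach can be sketched as follows (this is not MoinMoin's actual code; `compile_page` and the fragment tuples are my own illustrative names): static HTML becomes literal `write()` calls, while dynamic items become formatter calls re-evaluated on every request.

```python
def compile_page(fragments):
    """Compile ('html'|'dyn', text) fragments into a render function."""
    lines = ['def render(request, formatter):']
    for kind, text in fragments:
        if kind == 'html':
            lines.append('    request.write(%r)' % text)          # static HTML
        else:
            lines.append('    request.write(formatter.dynamic_item(%r))' % text)
    namespace = {}
    exec('\n'.join(lines), namespace)  # Python compiles this to byte code once
    return namespace['render']
```

The compiled function can then be cached (or its byte code marshalled to disk), so serving a page skips parsing entirely and pays only for the dynamic items.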