GenerateStaticPages

Many dynamic sites generate the output pages each time a page is fetched, say through a CgiScript. However, in many cases the set of output pages doesn't change until some infrequent event occurs. For instance, on a wiki, pages only change when someone edits them. Generating the same output over and over again is a waste of processing time and browsing time (it's sloooow). Moreover, most indexing agents will not index dynamic pages--or at least CGI scripts.

Therefore, generate the output page when the infrequent event occurs (in this case, a page edit) and save it to a file (e.g. GenerateStaticPages.html). Use these files as the main interface to the site, resorting to the CGI script only when something special needs to happen--like a page edit.
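A minimal sketch of that flow, assuming a Perl wiki engine; write_database(), render_html(), and the web-root path are placeholders for whatever your own script uses:

 use strict;
 use warnings;

 # Hypothetical sketch: regenerate the static copy whenever a page is saved.
 # write_database(), render_html() and the web-root path are stand-ins for
 # whatever your own engine provides.
 sub save_page {
     my ($page_name, $raw_text) = @_;
     write_database($page_name, $raw_text);          # the normal edit-save
     my $html = render_html($page_name, $raw_text);  # full page render, once
     open my $fh, '>', "/var/www/wiki/$page_name.html"
         or die "cannot write static page: $!";
     print $fh $html;
     close $fh;
 }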

Not only will you skip expensive processing, but Apache and Unix will be able to cache the pages, thus giving a much faster site. Also, your site is now capable of being indexed by the search engines.

But you lose the ability to customize the pages for each user, or to do any "just in time" generation. For instance, some users might like to set the colour scheme of the site to something they can easily read. Unfortunately, with static pages you will be forcing everyone to use the same interface.

Nonetheless, if a large segment of your readership prefers the defaults, you can still use static pages for them and let advanced users use the dynamic portion of the site. Besides, the site will still be indexable by search engines.

Or, if you can hack the DHTML, you might even be able to make large portions of your page static and small portions dynamic, giving you a "best of both worlds" solution. However, this latter solution is truly difficult, and only likely to work in InternetExplorer (but we all knew that anyway).

By the way, with some magic Apache hackery (or an application server like Zope), you can fool indexing agents into thinking your page is static when it is really dynamic. This requires some clever behind-the-scenes work.

-- SunirShah


I've been thinking about similar ideas recently with regard to wikis. UseModWiki currently takes about 0.3 seconds to serve a page (on a fast server shared with hundreds of other sites). This includes the time to start a new Perl process, load and compile the script (about 55Kb) and the CGI library (over 200Kb), and do all the processing for each page.

UseModWiki allows semi-dynamic generation of pages. If the "HTML cache" option is enabled, dynamically generated pages save a copy of their output into the cache area. When a page is requested, the cache is checked first. If the page is in the cache, it is simply dumped to the user (which currently takes about 0.1 seconds). If not, the page is generated as a dynamic request.
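In outline, the request path looks something like this; serve_page(), generate_page() and $cache_dir are invented names, not UseModWiki's actual internals:

 use strict;
 use warnings;

 # Placeholder sketch of the cache check described above.
 sub serve_page {
     my ($page_name, $cache_dir) = @_;
     my $cache_file = "$cache_dir/$page_name.html";
     if (-f $cache_file) {
         open my $fh, '<', $cache_file or die "cache read failed: $!";
         print while <$fh>;                     # dump the cached copy as-is
         close $fh;
     } else {
         my $html = generate_page($page_name);  # full dynamic render
         print $html;
         if (open my $out, '>', $cache_file) {  # save it for next time
             print $out $html;
             close $out;
         }
     }
 }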

Wikis have a caching problem with the undefined page links (like SampleUndefinedPage?). When the target page is defined, all the links which reference the new page need to be changed to regular page links. UseModWiki handles this problem by removing *all* cached pages when a new page is created.
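That blunt invalidation amounts to roughly this (the cache directory name is an assumption):

 # Assumed sketch: when a new page is created, drop every cached copy so
 # stale "undefined page" links get re-rendered on the next visit.
 sub flush_html_cache {
     my ($cache_dir) = @_;
     unlink glob("$cache_dir/*.html");
 }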

RecentChanges is another problem area, since it is usually one of the largest and most active pages (both reading and updating) in a wiki.

Recently I've been considering another solution: partial rendering at edit time. Instead of doing all the rendering at display time, the partially-rendered version would do most of it when the page is edited. For instance, the basic markup like escaping <>& characters, bold/italics, and lists could be completely done at edit-time. Page links would be replaced by tokens (containing indexes into an array of names). At display-time, all that is necessary is to search for the tokens (which would be a trivial/fast regular expression), and replace them with the appropriate link (either a normal or edit link).
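Here is one way the token scheme might look in Perl; the token format, the CamelCase pattern, and the page_exists() helper are all invented for illustration, not taken from UseModWiki:

 use strict;
 use warnings;

 # Edit time: render the cheap markup fully, turn page links into
 # "\x01<index>\x01" tokens, and keep the name table with the page.
 sub render_partial {
     my ($text) = @_;
     my @names;
     $text =~ s/&/&amp;/g;  $text =~ s/</&lt;/g;  $text =~ s/>/&gt;/g;
     $text =~ s{'''(.*?)'''}{<b>$1</b>}g;   # bold (italics, lists similar)
     $text =~ s{\b([A-Z][a-z]+(?:[A-Z][a-z]+)+)\b}
               { push @names, $1; "\x01" . $#names . "\x01" }gex;
     return ($text, \@names);
 }

 # Display time: one cheap regular expression swaps tokens for links.
 sub render_final {
     my ($partial, $names) = @_;
     $partial =~ s{\x01(\d+)\x01}{ link_for($names->[$1]) }gex;
     return $partial;
 }

 sub link_for {
     my ($page) = @_;
     return page_exists($page)                 # placeholder existence check
         ? qq{<a href="$page.html">$page</a>}
         : qq{$page<a href="wiki.pl?action=edit&amp;id=$page">?</a>};
 }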

On the other hand, the fully-dynamic behavior is easy to work with and extend. The fully-dynamic code would probably work well for communities the size of the C2 wiki (a few hundred active writers, and a few thousand active readers).

For search-engine indexing and offline reading, I plan to write code to generate a directory of HTML files from a wiki database. One could make this directory available for indexing, and/or create archive files with all the pages from the wiki. The biggest problem with static-appearing pages is that they often won't update unless the user specifically reloads the page. --CliffordAdams

''Wikis have a caching problem with the undefined page links (like SampleUndefinedPage?).''

Oi, that hurts. I forgot about that. Nonetheless, you can still win by creating a multimap of WantedPages and their BackLinks, storing it server-side, and updating it during the edit-save. Like all good speed optimizations, it costs more storage. As a bonus, though, this structure makes it really easy to save the list of BackLinks with each page, because you no longer have to search the whole PageDatabase for references to a new page. (And saving the BackLinks would make clicking the page header more useful and faster.)
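A hedged sketch of that structure: the file name and helper names below are made up, and Storable is just one convenient way to keep the multimap on disk between requests.

 use strict;
 use warnings;
 use Storable qw(retrieve nstore);

 # Multimap from each undefined ("wanted") page to the pages linking to it.
 # Updated at edit-save; the file name is an arbitrary example.
 my $map_file = 'wanted_pages.db';

 sub add_wanted_link {
     my ($wanted_page, $referring_page) = @_;
     my $map = -f $map_file ? retrieve($map_file) : {};
     push @{ $map->{$wanted_page} }, $referring_page
         unless grep { $_ eq $referring_page } @{ $map->{$wanted_page} || [] };
     nstore($map, $map_file);
 }

 # When a wanted page is finally created, only its referrers need their
 # cached copies regenerated, instead of flushing the whole cache.
 sub pages_to_refresh {
     my ($new_page) = @_;
     my $map  = -f $map_file ? retrieve($map_file) : {};
     my $refs = delete $map->{$new_page};
     nstore($map, $map_file);
     return @{ $refs || [] };
 }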

By the way, you can make an attempt to get the browser to stop caching the page with judicious use of MetaTags and the HttpHeader?. Not like InternetExplorer cares. It seems to cache completely randomly. Of course, MSIE will also cache CGI pages. -- SunirShah
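For reference, these are the usual no-cache headers, printed from the CGI script before the page body; as noted above, browsers honour them inconsistently, MSIE especially.

 # No-cache response headers; the <meta http-equiv> tags mirror the same names.
 print "Content-Type: text/html\r\n";
 print "Cache-Control: no-cache, must-revalidate\r\n";
 print "Pragma: no-cache\r\n";
 print "Expires: Thu, 01 Jan 1970 00:00:00 GMT\r\n";
 print "\r\n";   # blank line ends the headers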


Discussion
