DistributedCacheConsistency

MeatballWiki | RecentChanges | Random Page | Indices | Categories

Systems like FreeNet employ distributed caching to limit server bogging, DenialOfService attacks and for ensuring content is available even if its master server is down.

A distributed cache works by moving frequently requested pages closer to the nodes making the request. Say, everytime a page is requested, the closest server caches the page, chucking out the least recently used page to make room.

However, if the page from the source site has changed or if it is dynamically generated, how do you ensure the client site gets the latest version? This is trivial in something like a computer's virtual memory caching system because the entire universe is controlled by a central processing unit, but on a distributed fault-prone network, this can become complex.

The alternative would be no caching but that leaves us with slow bottlenecks, denial of service attacks, the SlashdotEffect and server failure preventing content distribution.

Surely the only way this can work is with a smart-cache which can upload condensed information about what has changed? We can argue about all kinds of ways to send the list of changes, as a bit-array, as list of changed pages, other alternatives... But they amount to different ways of compressing the information about what pages have changed. Just as with any compression algorithm, the best compression will depend on the data being compressed. As a first cut, send a gzip of 'RecentChanges'.

With network failures etc, the cache won't really be able to control its 'distance' from the true source of data. Therefore it won't be possible to set a hard upper limit on the out-of-datedness of the cache, rather that limit will be inherited from the physical structure of the internet. So the cache should inform clients of the expected delay in information it holds, and if they don't like that the client goes to a closer source.

As I see it this kind of smart cache would be superimposing another virtual structure on the internet, in many ways very similar to DNS name resolution. Worth looking in RFCs on DNS implementation problems for issues and solutions?? --JamesCrook

DistributedCacheConsistency

Discussion