Wikis work if the energy of the community vastly outweighs the energy of malfeasants; this is a pillar of SoftSecurity. Traditionally, however, we have assumed malfeasant motivations are unstable, either inherently (e.g. teenage vandals who quickly get bored) or as a result of community action (e.g. peace-making, et cetera). The new "threat" to wikis, spam, is simply one with a stable motivation: profit.
Initially, we relied on the community to combat spam; however, the spam energy has grown to the point where this is destabilising SoftSecurity. By targeting energy alone, we risk starting an arms race - one we cannot ultimately win - because we are simply escalating our own side of the motivation-energy balance. Instead, we have to return to SoftSecurity's first pillar: ensuring malfeasant motivation is unstable.
People have suggested several ways of killing motivation; without global adoption on all wikis, these will never prevent spam, as there now exist 'bots that are blind to motivation-sapping. We must win the arms race and sap motivation simultaneously. If an anti-energy weapon is deployed without an anti-motivation one then, like giving last-resort antibiotics to farmyard animals, we risk losing the weapon; if an anti-motivation weapon is deployed locally without an anti-energy one, it will only be partially effective.
Never think of motivation and energy as separate.
Having won the energy war on MB - for now - I suggest that in 2007 we move on to brainstorming ways of sapping the motivation to spam wikis, to present the wiki community with a complete package. A little early I am, but here are some ideas:
Our minor victory in the spam wars has restored some of my personal energy. Lay on! -- ChrisPurcell
One amusing idea is to replace the text in link spam. If enough sites did it, it could create a [Google Bomb]. Then anyone could retrieve a list of spam URLs by searching Google with the replacement text. -- JaredWilliams
Indeed, that is what chongqed.org is all about. -- SunirShah
Ah, they approach it from the other end, as it were, replacing the URLs.
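To make the replacement idea concrete, here is a minimal Python sketch, not taken from any wiki engine: it assumes bracketed external links of the form [http://url label], a known set of spam domains, and an illustrative shared canary phrase that participating sites would all agree on.

 import re

 # Illustrative canary phrase; every participating site would need to use the
 # same one so that a search-engine query for it turns up the spammed pages.
 CANARY = "wiki-spam-canary"

 def defang_spam_links(wiki_text, spam_domains):
     """Replace the visible text of external links that point at known spam domains."""
     def replace(match):
         url = match.group(1)
         domain = re.sub(r"^https?://", "", url).split("/")[0]
         if domain in spam_domains:
             return "[%s %s]" % (url, CANARY)
         return match.group(0)
     # Assumes bracketed external links of the form [http://url label].
     return re.sub(r"\[(https?://\S+)\s+([^\]]+)\]", replace, wiki_text)

 # Example: defang_spam_links("[http://spam.example cheap pills]", {"spam.example"})
 # yields "[http://spam.example wiki-spam-canary]".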
Another is to use a Bloom filter for publishing the whitelist of domains. I think the false-positive nature is outweighed by the compactness of storing each domain in a few bytes. For instance, 10,000 domains could be stored in around 14KB, using 4 hash functions with a 1% false positive rate [1]. This means that when parsing a post, one can keep the filter in memory, and it also reduces the bandwidth needed to publish it. Other sites could simply retrieve it and OR it with their own filter (assuming they use the same hash functions and the same number of hashes). -- JaredWilliams
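As a rough illustration of the mechanics, here is a minimal Bloom filter sketch in Python; the class, the salted-MD5 hashing, and the exact sizes are assumptions for the example, not the scheme of any particular wiki engine.

 import hashlib

 class BloomFilter:
     """Minimal Bloom filter: k salted hashes over a fixed-size bit array."""

     def __init__(self, size_bits, num_hashes=4):
         self.size_bits = size_bits
         self.num_hashes = num_hashes
         self.bits = bytearray((size_bits + 7) // 8)

     def _positions(self, item):
         # Derive k bit positions from salted MD5 digests of the item.
         for salt in range(self.num_hashes):
             digest = hashlib.md5(("%d:%s" % (salt, item)).encode("utf-8")).digest()
             yield int.from_bytes(digest[:8], "big") % self.size_bits

     def add(self, item):
         for pos in self._positions(item):
             self.bits[pos // 8] |= 1 << (pos % 8)

     def __contains__(self, item):
         return all(self.bits[pos // 8] & (1 << (pos % 8))
                    for pos in self._positions(item))

     def merge(self, other):
         # OR in another site's published filter; only meaningful when both
         # filters use the same size and the same hashing scheme.
         assert (self.size_bits, self.num_hashes) == (other.size_bits, other.num_hashes)
         self.bits = bytearray(a | b for a, b in zip(self.bits, other.bits))

 # 14KB of bits with 4 hash functions gives roughly a 1% false-positive
 # rate for about 10,000 entries, matching the figure quoted above.
 whitelist = BloomFilter(size_bits=14 * 1024 * 8, num_hashes=4)
 whitelist.add("example.org")
 print("example.org" in whitelist)   # True
 print("spam.example" in whitelist)  # almost certainly False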
After some experimentation, I think my previous idea is probably not that workable, due to having to deal with sub-domains, sub-sub-domains, etc. Another possibility is using a Bloom filter to record the entire set of external URLs on a whole wiki. This would make it possible to limit HumanVerification to edits that add new URLs, which might reduce the hassle of refactoring pages. -- JaredWilliams
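A hedged sketch of how the whole-wiki URL set might gate HumanVerification, reusing the BloomFilter class above; the URL regex and helper names are invented for illustration.

 import re

 URL_RE = re.compile(r"https?://[^\s\"'<>\]]+")

 def new_urls_in_edit(edit_text, known_urls):
     # known_urls is a BloomFilter of every external URL already on the wiki;
     # an empty result means the edit can skip HumanVerification entirely.
     return [url for url in URL_RE.findall(edit_text) if url not in known_urls]

 def record_edit_urls(edit_text, known_urls):
     # Once an edit is accepted, remember its URLs so that later refactorings
     # which merely move existing links around are not challenged again.
     for url in URL_RE.findall(edit_text):
         known_urls.add(url)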