BanLists, or blackhole lists as they are also called, should not be centralized, as that leads to a number of failure modes. First, LimitTemptation. Anyone who maintains such a list attracts spammers' attacks, as the list is a ConcentrationOfPower and PowerIsCriticism. Second, if one bad link goes into the list, say from one angry individual seeking revenge upon another, every site dependent on that list will suffer.
Instead, a better solution would be decentralized. At one extreme, each site would maintain its own BanList. This will DevolvePower to individuals, who can then control what they want to ban. However, the burden of maintaining your own BanList is extreme. It will DivideAndConquer? the individual OnlineCommunitiesAreCityStates and allow the OrganizedCrime? of spammers to overwhelm each community individually. It is more efficient to collaborate.
Therefore, the best solution would be PeerToPeer. Based on the protocol we developed for exchanging the InterMapTxt files, there is no central authority with power over the rest. Rather, individual communities will import data from close, trusted neighbours. This relationship need not be bilateral, and it may even be anonymous. All that a community need do is publish its BanList.
Once published, all a subscribing community has to do is scrape the list and merge it with its own. With a large network of subscriptions, a single spam event will result in a ban across a wide number of communities. Further, the quality of those links will be at least marginally guaranteed by the TrustMetric formed implicitly by the SocialNetwork? of the communities. If one errant GodKing decides to ban a person out of anger and thus poison the well, downstream communities may seek to remove that community from the pool.
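As a rough sketch of the scrape-and-merge step (in Python, assuming a hypothetical plain-text format of one banned pattern per line; the URLs, file names, and format here are illustrative, not part of any fixed standard):

 import urllib.request

 def fetch_banlist(url):
     """Scrape a neighbour's published BanList (one pattern per line)."""
     with urllib.request.urlopen(url) as response:
         text = response.read().decode("utf-8")
     return {line.strip() for line in text.splitlines() if line.strip()}

 def merge_banlists(local, *neighbours):
     """Union the lists; patterns only accumulate in this scheme."""
     merged = set(local)
     for neighbour in neighbours:
         merged |= neighbour
     return merged

 # Example: subscribe to two trusted neighbours.
 local = {r"spamdomain\.example"}
 merged = merge_banlists(local,
                         fetch_banlist("http://alpha.example/banlist.txt"),
                         fetch_banlist("http://beta.example/banlist.txt"))

Since patterns only accumulate under this protocol, a plain set union is all the merge requires.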
The only obligation on the part of a community is to publish its list and maintain some integrity over it. The only requirement to join the network is to find some trusted BanLists to import into your own.
See the TINSEL protocol on DailyMe for a similar approach for news. This is the same protocol we proposed for the InterMapTxt.
Individual communities may want to ForgiveAndForget the banned links after a certain period of time, for two reasons. One, the server load of checking hundreds of patterns against each save will become excessive, so pruning the list will eventually be necessary. Two, some of the banned IPs or regexes will no longer be controlled by spammers or CommunityExiles after a while, and so they should be given a second chance.
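One minimal way to implement this pruning, assuming each pattern is stored along with the time it was added (the 90-day window below is an illustrative choice, not part of the protocol):

 from datetime import datetime, timedelta

 RETENTION = timedelta(days=90)  # illustrative forgiveness window

 def prune(banlist, now):
     """Drop patterns whose ban has expired, giving them a second chance.
     banlist maps each pattern to the datetime when it was added."""
     return {pattern: added for pattern, added in banlist.items()
             if now - added < RETENTION}

 # A pattern banned more than 90 days ago is forgiven and drops out.
 fresh = prune({r"spamdomain\.example": datetime(2004, 6, 1)},
               now=datetime(2004, 9, 14))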
Because the protocol is based on accumulating banned patterns, you cannot simply delete patterns from your own list. The deleted pattern will be resurrected the next time you synchronize with a list that still has it. If the two lists are mutually subscribed, it will be impossible to remove a pattern unless both sites do so simultaneously.
Instead, each local site will also need to maintain a list of retired or forgiven patterns that it removes from the actual BanList. This list should not be published, or if it is, it should not be imported. If it were published and imported just like the BanList, an attacker could quickly seed the network with his own pattern and thus disable the network against his attack.
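In effect, the retired list acts as a local set of tombstones. A minimal sketch, continuing the hypothetical set-based merge above:

 def merge_with_tombstones(local, retired, *neighbours):
     """Union the lists, then subtract the local (unpublished) retired set,
     so a forgiven pattern stays dead even when neighbours still publish it."""
     merged = set(local)
     for neighbour in neighbours:
         merged |= neighbour
     return merged - retired

Subtracting the retired set after the union, rather than deleting from the local list beforehand, is what keeps a forgiven pattern from being resurrected on the next synchronization.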
Some people will be keen to use their BanList to eliminate certain sociopaths who are troubling them. This is dangerous, although predictable: one community's problem user is another's friend. As stated above, the best solution is to drop subscriptions to sites that create these kinds of social problems for you.
Another solution is to maintain two separate BanLists: one for spam and another for CommunityExiles. Only the spam list would be spread around the network.
The problem with all encoded SocialNetwork?s is that they lag behind the reality of the social organization. If one site in the network begins publishing bad patterns, it will take a long time before that site is removed from the network, as first someone must notice the problem and track down its source. Then either the source must be convinced it is errant, or its neighbours must be convinced to unsubscribe from it. Failing that, the neighbours' upstream subscribers must be convinced to unsubscribe.
Identifying all of a source's neighbours is of course impossible, since the model is based on HTTP fetches, which are essentially anonymous. You are unlikely to be given access to the full list of IPs that have loaded that BanList, and even if you were, it would not help you identify all the subscribers, since second-degree subscribers never fetch from the source directly.
Finally, most people don't care about security because security is boring. Ensuring that their BanLists and their PeerToPeer network are high quality and coherent will not be a priority for them.
Therefore, a very large network will always be low quality and probably useless. Rather, the only trustworthy sources are those with whom you have a very strong PersonalRelationship, strong enough to exert PeerPressure, or major nodes that actively maintain their integrity. The latter will be subject to attacks from spammers.
The problem is that untrusted nodes two or three degrees away are impossible for you to control because they are invisible to you: you don't know who was responsible for the data.
A slightly more complex solution that will help alleviate some of the network lag problems is to embed an AuditTrail in the PeerToPeerBanList. For each banned pattern, list the path that it took to get to you. For instance, if the Alpha community published the pattern, and it found its way to you through the chain Beta-Gamma-Delta, then you might list the pattern as, say,

 Alpha,Beta,Gamma,Delta: pattern
And downstream subscribers could add your name to the end of that chain. Then at least you will know that Alpha is the source of the errant links. You can then create another (local) list to blackhole patterns that originate from errant nodes on the network. That is, you can create an untrusted-node list and add Alpha to it; then all patterns with Alpha in their path will be rejected on import.
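A sketch of how an importer might handle such audit-trailed entries, assuming the hypothetical "node1,node2,...: pattern" line format above (the node names here are illustrative):

 UNTRUSTED = {"Alpha"}   # local list of errant origin nodes
 MY_NAME = "Epsilon"     # this community's (hypothetical) name in the network

 def import_entry(line):
     """Parse one audit-trailed entry. Reject it if any node on its path
     is untrusted; otherwise append our own name for downstream subscribers."""
     path, pattern = line.split(":", 1)
     nodes = [node.strip() for node in path.split(",")]
     if UNTRUSTED.intersection(nodes):
         return None  # blackhole patterns that passed through errant nodes
     return ",".join(nodes + [MY_NAME]) + ":" + pattern

 import_entry(r"Alpha,Beta,Gamma,Delta: spam\.example")  # rejected, returns None
 import_entry(r"Beta,Gamma: ok\.example")  # "Beta,Gamma,Epsilon: ok\.example"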
SMTP works like this in a way (compare the Received: headers on an e-mail), but as you may have noticed, it doesn't work to curb e-mail spam, since the paths can be spoofed. The difference is that in this network, you will only accept data from nodes that you explicitly trust. With e-mail, you have to accept data from every node on the Internet.
At least by subscribing only to sources that publish an AuditTrail, you can avoid the network lag in your own community, even if you cannot fix the entire network.
The above text is PrimarilyPublicDomain.
Contributors: SunirShah AlexSchroeder LionKimbro BrianTempleton? BjornLindstrom?
CategorySpam CategoryWikiTechnology
I'd just like to reaffirm that I am not overly in favour of BanLists. -- SunirShah
This whole idea, when realized, would be a great example of [Social Routing]. -- ZbigniewLukasiak
4:41am UTC September 14, 2004. The EmacsWiki, CommunityWiki, OddMuse triad has banned the term 'usa' somehow, showing the problem with this strategy. Care must be taken when banning anything. -- SunirShah
I must somehow try to measure how many edits were rejected due to the current scheme in order to assess how useful it has been. Then we can at least compare it with that "usa" blunder... Generally speaking, I try to solve the problem by spreading admin privileges. I still haven't found how to do this the WikiWay... -- AlexSchroeder
We've had Chinese spam of the form foo.bla.blarg -- without the preceding http:// -- what do you suggest we do in this case? Somebody claimed at the time that Google would take these domain names into account for page ranking even though there was no actual link. I find it both hard to believe and hard to disprove. -- AlexSchroeder