A SurgeProtector could be as simple as blocking a troublesome IP from the site. Or, it could prevent a given IP from hitting the site more than five times a second. It could simply DelayAction.
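The "five times a second" rule above can be sketched as a per-IP sliding-window limiter. This is a minimal illustration, not any particular wiki's implementation; the names MAX_HITS and WINDOW are made up for the example.

```python
import time
from collections import defaultdict, deque

MAX_HITS = 5   # illustrative: at most five hits...
WINDOW = 1.0   # ...per one-second window

hits = defaultdict(deque)  # IP -> timestamps of recent requests

def allow(ip, now=None):
    now = time.monotonic() if now is None else now
    q = hits[ip]
    # Forget timestamps older than the window.
    while q and now - q[0] > WINDOW:
        q.popleft()
    if len(q) >= MAX_HITS:
        return False  # block here, or DelayAction instead of refusing
    q.append(now)
    return True
```

Instead of returning False, a gentler variant could sleep briefly (DelayAction) so a bursty human barely notices while a bot is slowed to the permitted rate.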
All in all, a TechnologySolution that is reasonable, effective, and has minimal to no effect on society, because it's aimed at bots, not people. Contrast EditThrottling, which is aimed at difficult people (actually, difficult behaviour).
People can read warning messages, and bots cannot. So give warning messages, and then people can slow down a little, which reduces the amount of collateral damage. It's critical to provide a gradual warning system so you do not lock out a klutz. For example, HelmutLeitner recounts:
That isn't how the SurgeProtector actually works, though, and without OpenProcess and automated warnings, users may think you have elected to HardBan them. That's the problem of not heeding the lesson of OpenProcess#badvogato.
We model bots as a continuous flow of page reads, and people as more bursty, so if you average behaviour over an hour or more, you'll catch the one but not the other. (I'm certainly faster than 1 page every three seconds in bursts, but I don't continue that rate for 5 minutes. -- CliffordAdams)
If you lock out an IP, don't lock it for too long, due to the joys of dynamic IPs and such. ForgiveAndForgetInSoftware. Still, if you give a friendly error message, people will understand.
I have also been locked out of Ward's Wiki a number of times during the last few months. Ward changed his strategy over this time but I don't think that he documented this anywhere. Currently (Jan 2000) it seems that there is an hourly hit limit and when you reach it you are denied access till the end of the hour. This happens to me sometimes, usually at about xx:50. This seems quite reasonable though I sometimes would prefer a higher limit (I suppose it to be in the 200-400/hour range). -- HelmutLeitner
The limit is based on activity, not time. If you are responsible for more than 75% of the last 100 hits to the site, your further requests are refused. Consequently, it's easier to lock yourself out at 8-9am UTC, or on Christmas, when there is little other traffic to dilute your share. Also, in a recent vandalism attack, when multiple white hats were defending WikiWiki, the attacker could continue to change pages because the white hats' own activity was pushing the attacker's entries out of the abuse log. -- SunirShah
WikiWiki has a surge protector whose goal is to deny access to robots that ignore the RobotsExclusionStandard. One clever aspect of its solution is to display the same error page with the same URL, with no HTML links, whenever it decides an IP is overloading its wiki script. You can see the "Cannot sustain request." error by loading Wiki:SandBox and hitting refresh rapidly. You can probably lock yourself out in under a minute.
I have been temporarily locked out of Ward's Wiki when browsing innocently. This could probably be fixed by tuning some parameters, but it shows how hard it can be for an automated system to distinguish between innocence and guilt. It gets worse if the guilty parties are trying to appear innocent; a kind of ArmsWar? develops: ever better discrimination against ever better camouflage. Remember also that my IP address is allocated dynamically. If you use addresses to try to permanently lock me out, you will actually lock out some other randomly chosen customer of my ISP. -- DaveHarris
ProWiki uses the following strategy for SlurpBlocking?:
It has been in use since Jan03 and has proved to be very effective and selective.
KuroShin tracks usage (especially submissions) of an IP over time. When it crosses some threshold, it locks the IP out for a certain amount of time, s. If the offender hits KuroShin again during the timeout, it doubles the timeout period to 2s, and doubles it again to 4s on the next offense. At a certain point, the login is disabled permanently.
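The escalating lockout can be sketched like so. This is a hedged illustration of the doubling scheme described, not Kuro5hin's actual code; BASE_TIMEOUT and PERMANENT_AFTER are assumed values.

```python
BASE_TIMEOUT = 60.0     # s, the initial lockout in seconds (assumed)
PERMANENT_AFTER = 5     # offenses before a permanent ban (assumed)

state = {}  # IP -> (locked_until, current_timeout, offenses)

def lock(ip, now):
    # Called when the usage threshold is first crossed.
    _, timeout, offenses = state.get(ip, (0.0, BASE_TIMEOUT, 0))
    state[ip] = (now + timeout, timeout, offenses + 1)

def check(ip, now):
    locked_until, timeout, offenses = state.get(ip, (0.0, BASE_TIMEOUT, 0))
    if offenses >= PERMANENT_AFTER:
        return "banned"
    if now < locked_until:
        # Hit during an active lockout: double the penalty (s -> 2s -> 4s).
        timeout *= 2
        offenses += 1
        state[ip] = (now + timeout, timeout, offenses)
        return "locked"
    return "ok"
```

The key property is that the penalty only escalates while the offender keeps hitting the site; an offender who waits out the timeout gets back in, in the spirit of ForgiveAndForgetInSoftware.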
WikiPedia limits the number of page reads per IP to 20(?) reads in the last minute.
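A per-minute cap like that is typically a fixed-window counter rather than a sliding window. A minimal sketch, with MAX_READS set to the 20 the text guesses at:

```python
import time

MAX_READS = 20  # the text's guess at WikiPedia's limit
WINDOW = 60     # seconds

counters = {}  # IP -> (window_start, count)

def allow_read(ip, now=None):
    now = time.time() if now is None else now
    start, count = counters.get(ip, (now, 0))
    if now - start >= WINDOW:
        start, count = now, 0  # a new minute begins; reset the counter
    if count >= MAX_READS:
        return False
    counters[ip] = (start, count + 1)
    return True
```

A fixed window is cheaper to track than a full hit log, at the cost of letting a client burst up to twice the limit across a window boundary.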
On July 24, 2001, usemod.com received its highest ever hit count: 32458 hits. Considering that the day before and after saw 3073 and 2592 hits respectively, something was definitely fishy. Indeed, it was a webrobot that went insane. Fortunately, it wasn't brutally heavy on the server. -- SunirShah
I once aimed the bot from atomz.com at allmyfaqs.com, hoping for a nicely indexed 360 pages, well under the 500-page limit for freebie use. The bot got quite excited by RecentChanges and category tags however, and looped around like crocheted Irish lace. I did an abort after it hit the 1200-page level, figuring it would never stop. Google, however, does not get so wild, and does a good job of indexing a wiki, at least for allmyfaqs. -- JerryMuelver
EmacsWiki got disabled twice around 2002/2003 because of robots slamming the site. I have now implemented a surge protector. -- AlexSchroeder