Open proxies are used for several reasons. Some people believe an open proxy will help protect their privacy, but because most open proxies are transparent, your IP address is revealed anyway. An AnonymousProxy can provide some degree of privacy, if you can be sure it is not a HoneyPot. Others need them to skirt a firewall, say at a company. Usually this works because the company network routes all traffic through a transparent proxy outside the company firewall, but the proxy is not properly configured and so acts as an open proxy for everyone in the world to use. The overwhelming majority of use, though, is for attacks, such as spam or DistributedDenialOfService? attacks.
As such, it's helpful to detect open proxies in order to deny them access to your server. To detect an open proxy, simply use it: send the string "GET http://www.yahoo.com HTTP/1.0\n\n" to the IP and port and see whether it returns Yahoo!'s front page. However, we can be more clever and have open proxies ban themselves automatically. Using a SelfBan URL, say http://example.com/cgi-bin/wiki?action=selfban, send the string "GET http://example.com/cgi-bin/wiki?action=selfban HTTP/1.0\n\n" to the suspected open proxy. As with spiders, any IP that hits this URL is blocked for some time (cf. DynamicValue) and shown an informative error message. For example, the MediaWiki implementation is described on WikiPedia:user:proxy_blocker.
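The probe strings described above can be built and judged with a few lines of code. This is a minimal sketch in Python, not the detector used on this site; the function names `build_probe_request` and `relayed_successfully` are hypothetical. It sends nothing itself. It uses CRLF line endings, which is what HTTP specifies, although many servers also tolerate a bare "\n".

```python
def build_probe_request(target_url: str) -> bytes:
    """Return the raw bytes of a proxy-style HTTP/1.0 GET for target_url."""
    # A request sent to a proxy carries the absolute URL in the request line.
    # The empty line after the request line ends the (empty) header section.
    return f"GET {target_url} HTTP/1.0\r\n\r\n".encode("ascii")


def relayed_successfully(response: bytes) -> bool:
    """Heuristic: True if the reply starts with an HTTP 2xx status line."""
    status_line = response.split(b"\r\n", 1)[0].split()
    return (
        len(status_line) >= 2
        and status_line[0].startswith(b"HTTP/")
        and status_line[1][:1] == b"2"
    )
```

The bytes from `build_probe_request` would be written to a TCP connection opened to the suspect IP and port. With the SelfBan URL the reply does not need to be inspected at all, because the ban is triggered by the request arriving at the wiki from the proxy's address.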
OpenProxies often appear in an OpenProxyRing. They are often used in RotatingProxy attacks, in which case an EditHash is a cheap and reasonably effective defence.
The OpenProxy detector below is fine for basic detection, but it is very slow and provides a bad user experience. We probably lose potential editors who do not understand why the website freezes when they go to edit. Also, they do not know that their machine is being scanned, why it is being scanned, or whom to complain to if the algorithm turns out to be wrong. Finally, because it takes so long, it holds up httpd processes in the Apache queue, which is not very good for server load.
Therefore, first move the OpenProxy detection out of line, into a process separate from the main script. Through a shared, synchronized queue, the main script should enqueue IPs to scan, and the OpenProxy detector should dequeue them from within its own process.
When the user requests a change to the model, the main script should ask the OpenProxy detector if the user's host is banned. There can be one of three responses, with the associated reactions from the main script:
If a potential change is blocked because the host is unknown, the change event might be lost even though the IP passes the scan. Therefore, provide a SingleDispatch event object with the request. The OpenProxy detector can resend the event when the scan completes. If the resent event results in a conflict, the main script should log the event object to a public page (e.g. DeferredChangeConflict?) for manual reconciliation. To make this change in response detectable, the detector should add the flag deferred=1 to the event object.
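The hand-off between the main script and the detector might look like the following. This is a minimal sketch under stated assumptions: the class names `ChangeEvent` and `Detector`, the status labels, and the `scan` and `resend` callbacks are all hypothetical, and a thread-safe in-process queue stands in for the shared queue between two processes.

```python
import queue
from dataclasses import dataclass

# Assumed labels for the three possible answers.
ACCEPTABLE, BANNED, UNKNOWN = "acceptable", "banned", "unknown"


@dataclass
class ChangeEvent:
    ip: str
    payload: str
    deferred: int = 0  # set to 1 by the detector when it resends the event


class Detector:
    def __init__(self):
        self.status = {}              # ip -> ACCEPTABLE or BANNED
        self.pending = queue.Queue()  # events waiting for a scan

    def check(self, event):
        """Main-script side: answer at once, never scan in line."""
        known = self.status.get(event.ip)
        if known is not None:
            return known
        self.pending.put(event)       # unknown host: queue it for scanning
        return UNKNOWN

    def run_once(self, scan, resend):
        """Detector side: scan one queued host, then resend its event."""
        event = self.pending.get()
        is_open = scan(event.ip)
        self.status[event.ip] = BANNED if is_open else ACCEPTABLE
        if not is_open:
            event.deferred = 1        # mark the change as a deferred resend
            resend(event)
```

A main script built this way would accept the change on `ACCEPTABLE`, reject it on `BANNED`, and tell the user that the change is pending on `UNKNOWN`. When the resent event arrives with `deferred` set, a conflict is logged for manual reconciliation rather than shown to a user who may have left.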
While the script should scan for open proxies for every change to the model, the most common case is page editing. In that case, the main script should request a scan while the user is editing, so it is happening in parallel behind the scenes. By the time the edit is complete, the scan would have likely completed.
Finally, the open proxy detector maintains a database of scanned IPs and their statuses (acceptable, banned, enqueued) as well as the timestamp of their last scan. Scans should recur periodically as open proxies are DynamicValues (e.g. CryptoNauts may install an open proxy in defense of privacy ideals).
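The database of scanned IPs can be as small as one record per address. The sketch below is illustrative only: `ScanRecord`, `needs_scan`, and the 30-day `RESCAN_INTERVAL` are assumptions, and a plain dictionary stands in for whatever storage the detector really uses.

```python
import time
from dataclasses import dataclass

RESCAN_INTERVAL = 30 * 24 * 3600  # seconds between scans; an assumed value


@dataclass
class ScanRecord:
    status: str       # "acceptable", "banned", or "enqueued"
    last_scan: float  # Unix timestamp of the most recent scan


def needs_scan(db, ip, now=None):
    """Return True if ip has never been scanned or its last scan is stale."""
    now = time.time() if now is None else now
    record = db.get(ip)
    if record is None:
        return True                   # never seen before
    if record.status == "enqueued":
        return False                  # a scan is already on its way
    return now - record.last_scan >= RESCAN_INTERVAL
```

Because a banned record also goes stale, a host that stops being an open proxy gets its ban lifted at the next scan instead of being banned forever.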
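    use strict;
    use warnings;
    use LWP::UserAgent;

    # Modified from HTTP::CheckProxy
    # Returns the first port on $ip that relays $target_url, or nothing if none does.
    sub proxy_port {
        my ($ip, $target_url) = @_;
        $target_url = "http://www.yahoo.com" unless $target_url;
        # Numeric ports, plus service names that are looked up in the services database.
        my @ports = qw/23 80 81 1080 3128 8080 8081 scx-proxy dproxy sdproxy funkproxy dpi-proxy proxy-gateway ace-proxy plgproxy csvr-proxy flamenco-proxy awg-proxy trnsprntproxy castorproxy ttlpriceproxy privoxy ezproxy ezproxy-2/;
        my $browser = LWP::UserAgent->new(
            timeout               => 10,
            max_size              => 2048,
            requests_redirectable => [],
        );
        foreach my $port (@ports) {
            # A proxy URL needs a port number, so resolve service names
            # and skip any name this host does not know.
            my $number = $port =~ /^\d+$/ ? $port : (getservbyname($port, 'tcp'))[2];
            next unless defined $number;
            $browser->proxy("http", "http://$ip:$number");
            my $response = $browser->head($target_url);
            last unless defined $response;
            return $number unless $response->is_error;
        }
        return;
    }

    my $ip = $ARGV[0];
    print proxy_port($ip);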
Under the GeneralPublicLicense v. 2
In the first six weeks of being installed here on MeatballWiki, it caught 18 open proxies. -- SunirShah
The Tor network is an OpenProxy network. It's automatically blocked by this patch.
There is an unconfirmed Tor blacklist at http://proxy.org/tor_blacklist.txt
An Oddmuse extension based on the old Meatball code is also available. [1] I've rewritten the code so that it will check once every month. Can you explain why your code wants to check it thrice every month? Should we have a random component in there, e.g. check with a 5% probability but at least once a month? -- AlexSchroeder
While we were at Wikimania, Frankfurt, August 4-8 2005, we were hit by dozens of OpenProxy connections (a DistributedDenialOfService? attack) until one managed to get through, and their bot then hit the site until it was blocked by the SurgeProtector. What happens is that sometimes the connection times out before the proxy can respond properly. By checking multiple times, the probability of an open proxy getting through to an edit is greatly reduced. Even checking twice would often be enough to block the change, since we check on edit and then check again on post, and the scan is unlikely to fail twice. However, it is still possible for the proxy to succeed, so three times was my fuzzy good-feeling number, picked out of the air.
A random component is unnecessary since we will ban the IP of the attacker within 24 hours. I recheck the IP because I was afraid of false positives, and of the chance that people using an IP that was formerly an open proxy (e.g. because they screwed up their computer's configuration) would be banned forever. We need to be more transparent that a) we are portscanning them, and b) we are banning them because they are an OpenProxy. -- SunirShah
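The arithmetic behind checking several times can be made explicit. This is a rough sketch that assumes each scan misses an open proxy independently with the same probability, which real timeouts may not do; the function name and the 20% figure are illustrative, not measured.

```python
def miss_probability(p_single_miss: float, checks: int) -> float:
    """Chance that every one of `checks` independent scans misses the proxy."""
    return p_single_miss ** checks


# With an assumed 20% chance that one scan times out:
for checks in (1, 2, 3):
    print(checks, miss_probability(0.2, checks))
```

Under that assumption, two checks let through about 4% of open proxies and three checks under 1%, which is the sense in which the third check is insurance rather than necessity.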
It should be considered that categorically banning open proxies practically destroys freedom of speech in restrictive environments.
This statement is incoherent. It's tautological that someone who can install server-side defenses can be a GodKing. The original statement is worth more serious consideration, as it is a major trade off.
Historically, the OpenProxy detector was meant to discourage particular attackers, not spammers. Blocking spammers was a nice side effect. -- SunirShah
As it turns out, our spam has increased quite dramatically since disabling the detector. -- SunirShah
I have been trying to get proxycheck and proxybuster running for the past week, but they don't make any sense to me. -- SunirShah
I've rewritten the open proxy detector to use a threaded system, which is much faster. It still takes about 8-10 seconds on usemod.com, however. You can reuse the code (new-style BSD license) available at http://sunir.org/meatball/OpenProxy/OpenProxy.pm -- SunirShah
I've rewritten the open proxy detector again to
-- SunirShah
There are already many lists of open proxies. Why not use one? Possibly it would be even more reliable, as some of the lists catch even multi-point proxies (different addresses for in and out connections). --JK
The same reason that having a list of known criminals won't help you while you're being mugged. We need an active defense. The DNS-based blacklists are useful because they help keep costs down since a DNS look up is much faster than a port scan. I'd need to know more, though, before I trusted them. -- SunirShah
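For reference, a DNS-based blacklist is queried by reversing the octets of the IPv4 address and appending the list's zone. The sketch below only builds the query name; the function name is hypothetical and the zone `dnsbl.example.org` is a placeholder, not a real list.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    # IPv4Address validates the input and raises ValueError on bad addresses.
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
# prints 99.2.0.192.dnsbl.example.org
```

Resolving that name, for instance with `socket.gethostbyname`, is a single DNS lookup. By the usual DNSBL convention, an answer in 127.0.0.0/8 means the address is listed and a failed lookup means it is not.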
Question: At WardsWiki it is mentioned on their Wiki:TheOnionRouter page that OpenProxy use has increased dramatically. I personally find it safer to edit at that site using an OpenProxy, so the "contributions" can be assessed on their own merits, or lack thereof. Does the Meatball community have any nice things to say about the use of OpenProxy?
Whether or not a contribution is assessed on its own merits is a function of the people, not any technical means to strip information out of the stream. Jerks will be jerks to your text on a wiki, no matter how you post it. The only thing to do is to expel the jerks and encourage the leaders. In fact, people will be more open to what you say if you identify yourself (via your UserName on RecentChanges), don't sign contributions that are genuine contributions, and clean up messes that you and others have left behind in the course of normal life.