Distributed denial of service (DDoS) attacks are the worst nightmare of every web presence. Common wisdom has it that there is nothing you can do to protect yourself when a DDoS attack hits you. Nothing? Well, unless you have Stingray Traffic Manager. In this article we'll describe how Stingray helped a customer keep their site available to legitimate users when they came under massive attack from the "dark side".
What is a DDoS attack?
DDoS attacks have risen to considerable prominence even in mainstream media recently, especially after the BBC published a report on how botnets can be used to send spam or take web-sites down, and another story detailing that even computers of UK government agencies have been taken over by botnets.
A botnet is an interconnected group of computers, often home PCs running MS Windows, normally used by their legitimate owners but actually under the control of cyber-criminals who can send commands to programs running on those computers. The machines fall under somebody else's control through operating system or application vulnerabilities, usually without the knowledge of their unsuspecting owners. When such a botnet member goes online, the malicious program starts to receive its orders. One such order can be to send spam emails, another to flood a certain web-site with requests and so on.
There are quite a few scary reports about online extortions in which web-site owners are forced to pay money to avoid disruption to their services.
Why are DDoS attacks so hard to defend against?
The reason DDoS attacks are so hard to counter is that they are using the service a web-site is providing and wants to provide: its content should be available to everybody. An individual PC connected to the internet via DSL usually cannot take down a server, because servers tend to have much more computing power and more networking bandwidth. By distributing the requests to as many different clients as possible, the attacker solves three problems in one go:
They get more bandwidth to hammer the server.
The victim cannot thwart the attack by blocking individual IP addresses: that will only reduce the load by a negligible fraction. Also, clever DDoS attackers gradually change the clients sending the requests. It's impossible to keep up with this by manually adapting the configuration of the service.
It's much harder to identify that a client is part of the attack because each individual client may be sending only a few requests per second.
How to Protect against DDoS Attacks?
There is an article on how to ban IP addresses of individual attackers here: Dynamic Defense Against Network Attacks. The mechanism described there involves a Java Extension that modifies Stingray Traffic Manager's configuration via a SOAP call to add an offending IP address to the list of banned IPs in a Service Protection Class. In principle, this could be used to block DDoS attacks as well. In reality it can't, because a SOAP call is a rather heavyweight operation with far too much overhead to run hundreds of times per second. (The bottleneck is SOAP, not Java: Stingray's Java support is highly optimized and can handle tens of thousands of requests per second.)
The performance of the Java/SOAP combination can be improved by leveraging the fact that all SOAP calls in the Stingray API are array-based. So a list of IP addresses can be gathered in TrafficScript and added to Stingray's configuration in one call. But still, the blocking of unwanted requests would happen too late: at the application level rather than at the OS (or, preferably, the network) level. Therefore, the attacker could still inflict the load of accepting a connection, passing it up to Stingray Traffic Manager, checking the IP address inside Stingray Traffic Manager etc. It's much better to find a way to block the majority of connections before they reach Stingray Traffic Manager.
Introducing iptables
Linux offers an extensive framework for controlling network connections called iptables. One of its features is that it allows an administrator to block connections based on many different properties and conditions. We are going to use it in a very simple way: to ignore connection initiations based on the IP address of their origin. iptables can handle thousands of such conditions, but of course it has an impact on CPU utilization. However, this impact is still much lower than having to accept the connection and cope with it at the application layer.
iptables checks its rules against each and every packet that enters the system (potentially also against packets that are forwarded by and created in the system, but we are not going to use that aspect here). What we want to impede are new connections from IP addresses that we know are part of the DDoS attack. No expensive processing should be done on packets belonging to connections that have already been established and on which data is being exchanged. Therefore, the first rule to add lets through all TCP packets that do not establish a new connection, i.e. that do not have the SYN flag set:
# iptables -I INPUT -p tcp \! --syn -j ACCEPT
Once an IP address has been identified as 'bad', it can be blocked with the following command:
# iptables -A INPUT -s [ip_address] -j DROP
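The two commands above can also be assembled programmatically. The following is a minimal Python sketch, not part of the original setup: the function names fast_path_rule and drop_rule are hypothetical, and the address 192.0.2.17 is a documentation placeholder. It only builds the argument lists; validating the address first ensures that nothing but a well-formed IPv4 address ever reaches the command line.

```python
import ipaddress


def fast_path_rule():
    """Argument list for the rule that accepts TCP packets without the SYN flag."""
    return ["iptables", "-I", "INPUT", "-p", "tcp", "!", "--syn", "-j", "ACCEPT"]


def drop_rule(ip):
    """Argument list that appends a DROP rule for one validated IPv4 source address."""
    # IPv4Address raises ValueError for anything that is not a plain IPv4 address.
    addr = ipaddress.IPv4Address(ip)
    return ["iptables", "-A", "INPUT", "-s", str(addr), "-j", "DROP"]


if __name__ == "__main__":
    # Print the commands instead of running them; executing them requires root.
    print(" ".join(fast_path_rule()))
    print(" ".join(drop_rule("192.0.2.17")))
```

Passing the lists to a process launcher as separate arguments (rather than as one shell string) would avoid shell quoting issues altogether.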
Using Stingray Traffic Manager and TrafficScript to detect and block the attack
The rule that protects the service from the attack consists of two parts: Identifying the offending requests and blocking their origin IPs.
Identifying the Bad Guys: The Attack Signature
A gift shopping site that uses Stingray Traffic Manager to manage the traffic to their service recently noticed a surge of requests to their home page that threatened to take the web site down. They contacted us, and upon investigation of the request logs it became apparent that there were many requests with unconventional 'User-Agent' HTTP headers. A quick web search revealed that this was indicative of an automated distributed attack.
The first thing for the rule to do is therefore to look up the value of the User-Agent header in a list of agents that are known to be part of the attack:
sub isAttack()
{
   $ua = http.getHeader( "User-Agent" );
   if ( $ua == "" || $ua == " " ) {
      #log.info("Bad Agent [null] from ". request.getRemoteIP());
      counter.increment(1,1);
      return 1;
   } else {
      $agentmd5 = resource.getMD5( "bad-agents.txt" );
      if ( $agentmd5 != data.get( "agentmd5" ) ) {
         reloadBadAgentList( $agentmd5 );
      }
      if ( data.get( "BAD" . $ua ) ) {
         #log.info("Bad agent ".$ua." from ". request.getRemoteIP());
         counter.increment(2,1);
         return 1;
      }
   }
   return 0;
}
The rule fetches the relevant header from the HTTP request and quickly checks whether the client sent an empty User-Agent or one consisting of a single space. If so, it increments a counter that can be used in the UI to track how many such requests were found, and returns 1 to indicate that this is indeed an unwanted request.
If a non-trivial User-Agent has been sent with the request, the list is queried. If the user-agent string has been marked as 'bad', another counter is incremented and again 1 is returned to the calling function. The techniques used here are similar to those in the more detailed HowTo: Store tables of data in TrafficScript article; when needed, the resource file is parsed and an entry in the system-wide data hash-table is created for each black-listed user agent.
This is accomplished by the following sub-routine:
sub reloadBadAgentList( $newmd5 )
{
   # do this first to minimize race conditions:
   data.set( "agentmd5" , $newmd5 );
   $badagents = resource.get( "bad-agents.txt" );
   $i = 0;
   data.reset( "BAD" ); # clear old entries
   while ( ( $j = string.find( $badagents , "\n" , $i ) ) != -1 ) {
      $line = string.substring( $badagents , $i , $j -1 );
      $i = $j +1;
      $entry = "BAD" .string.trim( $line );
      log.info( "Adding bad UA '" . $entry . "'" );
      data.set( $entry , 1 );
   }
}
Most of the time, however, it won't be necessary to read from the file system because the list of 'bad' agents does not change often (if ever for a given botnet attack). You can download the file with the black-listed agents here and there is even a web-page dedicated to user-agents, good and bad.
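For readers who do not know TrafficScript, the lookup logic can be sketched in Python. This is an illustration under stated assumptions, not the rule itself: the class BadAgentList and the sample agent strings are hypothetical, and an in-memory set stands in for Stingray's data hash-table. As in the rule, the list is only re-parsed when the MD5 digest of the file contents changes. Two deliberate differences: the sketch treats any whitespace-only User-Agent as empty, and it also accepts a final line that lacks a trailing newline.

```python
import hashlib


class BadAgentList:
    """Caches black-listed User-Agent strings, re-parsing only when the file changes."""

    def __init__(self):
        self._digest = None   # MD5 of the file contents parsed last
        self._agents = set()  # trimmed, non-empty lines of the file

    def refresh(self, file_text):
        """Re-parse the list if its MD5 digest differs from the cached one."""
        digest = hashlib.md5(file_text.encode("utf-8")).hexdigest()
        if digest != self._digest:
            self._digest = digest
            self._agents = {
                line.strip() for line in file_text.splitlines() if line.strip()
            }

    def is_attack(self, user_agent):
        """True for a missing or blank User-Agent, or one found in the list."""
        if user_agent is None or user_agent.strip() == "":
            return True
        return user_agent in self._agents


if __name__ == "__main__":
    agents = BadAgentList()
    agents.refresh("EvilBot/1.0\nScraper 2\n")  # hypothetical list contents
    print(agents.is_attack("EvilBot/1.0"))
    print(agents.is_attack("Mozilla/5.0"))
```

Matching is exact, as in the rule: a User-Agent that merely contains a listed string is not flagged.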
Configuring iptables from TrafficScript
Now that TrafficScript 'knows' that it is dealing with a request whose IP has to be blocked, this address must be added to the iptables 'INPUT' chain with target 'DROP'. The most lightweight way to pass this information from inside Stingray Traffic Manager to another process is the HTTP client functionality in TrafficScript, provided by the function http.request.get(). Since many such 'evil' IP addresses are expected per second, it is a good idea to buffer up a certain number of IPs before making an HTTP request. The first request incurs some overhead from TCP's three-way handshake (though far less than forking a new process would); subsequent requests are made over the kept-alive connection.
Here is the rule that accomplishes the task:
if ( isAttack() ) {
   $ip = request.getRemoteIP();
   $iplist = data.get( "badiplist" );
   if ( string.count( $iplist , "/" )+1 >= 10 ) {
      data.remove( "badiplist" );
      $url = "http://127.0.0.1:44252" . $iplist . "/" . $ip ;
      http.request.get( $url , "" , 5);
   } else {
      data.set( "badiplist" , $iplist . "/" . $ip );
   }
   # $sleep (in milliseconds) is one of the tunables set in the complete rule
   connection.sleep( $sleep );
   connection.discard();
}
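The buffering step is easy to get wrong by one, so here is the same counting logic as a small Python sketch. The class name IpBatcher is hypothetical and the addresses are documentation placeholders; the port and batch size are the values used in the rule. Pending addresses are kept as a single "/"-separated string, exactly like the "badiplist" entry, and a URL is produced once the current address completes a batch.

```python
class IpBatcher:
    """Collects addresses and emits one URL per full batch."""

    def __init__(self, batch_size=10, port=44252):
        self.batch_size = batch_size
        self.port = port
        self._pending = ""  # e.g. "/192.0.2.1/192.0.2.2"

    def add(self, ip):
        """Return the request URL when this address completes a batch, else None."""
        # One "/" per pending address, plus one for the address being added now.
        if self._pending.count("/") + 1 >= self.batch_size:
            url = "http://127.0.0.1:%d%s/%s" % (self.port, self._pending, ip)
            self._pending = ""
            return url
        self._pending += "/" + ip
        return None


if __name__ == "__main__":
    batcher = IpBatcher(batch_size=3)
    for address in ["192.0.2.1", "192.0.2.2", "192.0.2.3"]:
        print(batcher.add(address))
```

With a batch size of 3, the first two calls return None and the third returns a URL whose path lists all three addresses.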
A simple 'Web Server' that Adds Rules for iptables
Now who is going to handle all those funny HTTP GET requests? We need a simple web-server that reads the URL, splits it up into the IPs to be blocked and adds each of them to iptables (unless it is already blocked). On startup this process checks which addresses are already in the black-list so that they are not added again (which would be a waste of resources), makes sure that a fast path is taken for packets that do not correspond to new connections, and then listens for requests on a configurable port (in the rule above we used port 44252).
This daemon doesn't fork one iptables process per IP address to block. Instead, it uses the 'batch-mode' of the iptables framework, iptables-restore. With this tool, you compile a list of rules and send all of them down to the kernel with a single commit command.
A lot of details (like IPv6 support, throttling etc) have been left out because they are not specific to the problem at hand, but can be studied by downloading the Perl code (attached) of the program.
To start this server you have to be root and invoke the following command:
# iptablesrd.pl
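The core of such a daemon can be sketched in a few lines of Python. This is not the attached Perl program, whose internals are not shown here; the function names parse_blocklist_path and restore_input are hypothetical, and the sketch covers IPv4 only. It splits a request path into addresses, skips malformed and already blocked entries, and renders the new rules in the text format that iptables-restore reads.

```python
import ipaddress


def parse_blocklist_path(path, already_blocked):
    """Split "/ip1/ip2/..." into new, valid, de-duplicated IPv4 addresses."""
    new_ips = []
    for part in path.split("/"):
        if not part:
            continue  # leading "/" or an empty segment
        try:
            ip = str(ipaddress.IPv4Address(part))
        except ValueError:
            continue  # ignore anything that is not a well-formed address
        if ip not in already_blocked and ip not in new_ips:
            new_ips.append(ip)
    return new_ips


def restore_input(ips):
    """Render DROP rules for the filter table, committed in a single batch."""
    lines = ["*filter"]
    lines += ["-A INPUT -s %s -j DROP" % ip for ip in ips]
    lines.append("COMMIT")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    blocked = {"192.0.2.2"}
    fresh = parse_blocklist_path("/192.0.2.1/bogus/192.0.2.2/192.0.2.1", blocked)
    print(restore_input(fresh), end="")
```

The rendered text would be written to the standard input of iptables-restore. Its --noflush option keeps the rules that are already loaded; without it, the existing contents of the table are replaced.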
Preventing too many requests with Stingray Traffic Manager's Rate Shaping
As it turned out when dealing with the DDoS attack that plagued our client, the bottleneck in the whole process described up until now was the addition of rules to iptables. This is not surprising as the kernel has to lock some of its internal structures before each such manipulation. On a moderately-sized workstation, for example, a few hundred transactions can be committed per second when starting from an empty rule set. Once there are, say, 10,000 IP addresses in the list, adding more becomes slower and slower, down to a few dozen per second at best. If we keep sending requests to the 'iptablesrd' web-server at a high rate, it won't be able to keep up with them. Basically, we have to take into account that this is the place where processing is channeled from a massively parallel, highly scalable process (Stingray) into the sequential, one-at-a-time mechanism that is needed to keep the iptables configuration consistent across CPUs.
Queuing up all these requests is pointless, as it will only eat resources on the server. It is much better to let Stingray Traffic Manager sleep on the connection for a short time (to slow down the attacker) and then close it. If the IP address continues to be part of the botnet, the next request will come soon enough and we can try and handle it then.
Luckily, Stingray comes with rate-shaping functionality that can be used in TrafficScript. A 'Rate' class is set up in the 'Catalog' tab:
[Screenshot: settings of the Rate class in the Catalog tab]
The Rate Class can now be used in the rule to restrict the number of HTTP requests Stingray makes per second:
if ( rate.getBackLog( "DDoS Protect" ) < 1 ) {
   $url = "http://localhost:44252" . $iplist . "/" . $ip ;
   rate.use( "DDoS Protect" );
   # our 'webserver' never sends a response
   http.request.get( $url , "" , 5);
}
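Note that we simply don't do anything if the rate class already has a back-log, i.e. there are outstanding requests to block IPs. If there is no request queued up, we impose the rate limitation on the current connection and then send out the request. The class name passed to rate.getBackLog() and rate.use() must match the name of the Rate class defined in the Catalog; this snippet uses "DDoS Protect", whereas the complete rule below uses "ddos_protect".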
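The "skip instead of queue" decision can be illustrated with a small Python gate. This is only an analogy under stated assumptions: it does not reproduce Stingray's Rate class, which queues connections, and the class name DropWhenBusyGate is hypothetical. The gate lets one caller through per interval and tells every other caller to give up immediately, which is the effect the back-log check has in the rule.

```python
import time


class DropWhenBusyGate:
    """Admits at most max_per_second callers; everyone else is turned away at once."""

    def __init__(self, max_per_second, clock=time.monotonic):
        self._interval = 1.0 / max_per_second
        self._clock = clock                # injectable so the gate can be tested
        self._next_free = float("-inf")    # earliest time the next caller may pass

    def try_acquire(self):
        """Return True if the caller may send its request now, False to skip it."""
        now = self._clock()
        if now < self._next_free:
            return False  # busy: drop this batch rather than queue it
        self._next_free = now + self._interval
        return True


if __name__ == "__main__":
    gate = DropWhenBusyGate(max_per_second=10)
    print(gate.try_acquire())  # first caller passes
    print(gate.try_acquire())  # an immediate second caller is turned away
```

Dropping a batch is acceptable here because an address that keeps attacking will show up again in a later request.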
The Complete Rule
To wrap this section up, here is the rule in full:
$sleep = 300; # in milliseconds
$maxbacklog = 1;
$ips_per_httprequest = 10;
$http_timeout = 5; # in seconds
$port = 44252; # keep in sync with argument to iptablesrd.pl

if ( isAttack() ) {
   $ip = request.getRemoteIP();
   $iplist = data.get( "badiplist" );
   if ( string.count( $iplist , "/" )+1 >= $ips_per_httprequest ) {
      data.remove( "badiplist" );
      if ( rate.getBackLog( "ddos_protect" ) < $maxbacklog ) {
         $url = "http://127.0.0.1:" . $port . $iplist . "/" . $ip ;
         rate.use( "ddos_protect" );
         # our 'webserver' never sends a response
         http.request.get( $url , "" , $http_timeout );
      }
   } else {
      data.set( "badiplist" , $iplist . "/" . $ip );
   }
   connection.sleep( $sleep );
   connection.discard();
}

$rawurl = http.getRawURL();
if ( $rawurl == "/" ) {
   counter.increment(3, 1);
   # Small delay - shouldn't annoy real users, will at least slow down attackers
   connection.sleep(100);
   http.redirect( "/temp-redirection" );
   # Attackers will probably ignore the redirection. Real browsers will come back
}

# Re-write the URL before passing it on to the web servers
if ( $rawurl == "/temp-redirection" ) {
   http.setPath( "/" );
}

sub isAttack()
{
   $ua = http.getHeader( "User-Agent" );
   if ( $ua == "" || $ua == " " ) {
      counter.increment(1,1);
      return 1;
   } else {
      $agentmd5 = resource.getMD5( "bad-agents.txt" );
      if ( $agentmd5 != data.get( "agentmd5" ) ) {
         reloadBadAgentList( $agentmd5 );
      }
      if ( data.get( "BAD" . $ua ) ) {
         counter.increment(2,1);
         return 1;
      }
   }
   return 0;
}

sub reloadBadAgentList( $newmd5 )
{
   # do this first to minimize race conditions:
   data.set( "agentmd5" , $newmd5 );
   $badagents = resource.get( "bad-agents.txt" );
   $i = 0;
   data.reset( "BAD" );
   while ( ( $j = string.find( $badagents , "\n" , $i ) ) != -1 ) {
      $line = string.substring( $badagents , $i , $j -1 );
      $i = $j +1;
      $entry = "BAD" .string.trim( $line );
      data.set( $entry , 1 );
   }
}
Note that there are a few tunables at the beginning of the rule. Also, since in the particular case of the gift shopping site all attack requests went to the home page ("/"), a small slowdown and subsequent redirect was added for that page.
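The home-page detour amounts to a three-way decision on the raw URL, sketched here in Python. The function name route and the returned action labels are hypothetical; the paths are the ones used in the rule. A request for "/" is redirected, a request for the temporary path is rewritten back to "/" before it reaches the web servers, and everything else passes through unchanged.

```python
def route(raw_url):
    """Return (action, path) for a request that was not flagged as an attack."""
    if raw_url == "/":
        # Clients that follow redirects come back on the temporary path;
        # simple attack scripts typically do not.
        return ("redirect", "/temp-redirection")
    if raw_url == "/temp-redirection":
        # Hide the detour from the back-end: it still sees a request for "/".
        return ("forward", "/")
    return ("forward", raw_url)


if __name__ == "__main__":
    for url in ["/", "/temp-redirection", "/gifts/list"]:
        print(url, route(url))
```

The comparison is against the raw URL, so a home-page request carrying a query string is not redirected.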
Further Advice
The method described here can help mitigate the server-side effect of DDoS attacks. It is important, however, to adapt it to the particular nature of each attack and to the system Stingray Traffic Manager is running on. The most obvious adjustment is to change the isAttack() sub-routine to reliably detect attacks without blocking legitimate requests.
Beyond that, a careful eye has to be kept on the system to make sure Stingray strikes the right balance between adding bad IPs (which is expensive but keeps further requests from that IP out) and throwing away connections the attackers have managed to establish (which is cheap but won't block future connections from the same source). After a while, the rules for iptables will block all members of the botnet. However, botnets are dynamic and change over time: new nodes are added while others drop out.
A useful improvement to the iptablesrd.pl process described above would therefore be to speculatively remove blocks that were added a long time ago and/or once the number of entries crosses a certain threshold (which will depend on the hardware available).
Most DDoS attacks are short-lived, however, so it may suffice to just wait until the attack is over.
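The suggested clean-up of old blocks could look like the following Python sketch. It is an assumption about how such pruning might work, not a feature of iptablesrd.pl: the function name select_expired, its parameters and the sample values are hypothetical. Given the time at which each address was blocked, it selects every entry older than a maximum age and, if the list is still too long, the oldest of the remaining entries.

```python
def select_expired(blocked_at, now, max_age, max_entries):
    """Return the addresses whose blocks should be removed.

    blocked_at maps each address to the time (in seconds) at which it was blocked.
    """
    # Rule 1: anything older than max_age goes.
    expired = [ip for ip, ts in blocked_at.items() if now - ts > max_age]
    # Rule 2: if too many entries remain, drop the oldest of them first.
    remaining = sorted((ts, ip) for ip, ts in blocked_at.items() if now - ts <= max_age)
    overflow = len(remaining) - max_entries
    if overflow > 0:
        expired += [ip for _, ip in remaining[:overflow]]
    return expired


if __name__ == "__main__":
    blocks = {"192.0.2.1": 0, "192.0.2.2": 50, "192.0.2.3": 90, "192.0.2.4": 95}
    print(select_expired(blocks, now=100, max_age=60, max_entries=2))
```

Each selected address could then be unblocked with iptables -D INPUT -s [ip_address] -j DROP, or the removals could be batched in the same way as the additions.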
The further upstream in the network the attack can be blocked, the better. With the current approach, blocking occurs at the machine Stingray Traffic Manager is running on. If the upstream router can be remote-controlled (e.g. via SNMP), it would be preferable to do the blocking there. The web server we are using in this article can easily be adapted to such a scenario.
A word of warning and caution: The method presented here is no panacea that can protect against arbitrary attacks. A massive DDoS attack can, for instance, saturate the bandwidth of a server with a flood of SYN packets and such an attack can only be handled further upstream in the network. But Stingray Traffic Manager can certainly be used to scale down the damage inflicted upon a web presence and take a great deal of load from the back-end servers.
Footnote
The image at the top of the article is a graphical representation of the distribution of nodes on the internet produced by the opte project. It is protected by the Creative Commons License.