DoS Madness! Part 1: Memory Management / Behind Your Firewall
Alright, so here’s the bottom line. Your site is down. It’s your job not only to fix it this time, but also, to keep this kind of stuff from happening. Well, a DoS Attack in any of its varieties will rely on a single tenet: your service (www.yoursite.com) relies on a successive chain of processes and devices to generate, package, and deliver its content to your clients. A DoS attack is, in practice, any sort of attack that exploits one or more of these subsystem’s vulnerability to being overwhelmed. It then follows, that to reduce the potentiality for a successful DoS attack, one must reduce the ways in which your site’s subsystems will be overwhelmed.
With that thought in mind, let’s take a look at a couple different DoS attacks and the subsystem they target. The next series of articles will break down DoS types by the general subsystem they exploit.
Part 1: Memory Management / Behind Your Firewall
One type of DoS attack that takes advantage of poorly configured web servers is slowloris. This attack opens as many connections as your web server can handle, then keeps the connections open, with occasional
slowloris: hey.
webserver: what?
slowloris: ...
webserver: ...
webserver: are you still there?
slowloris: hey...
webserver: what?
slowloris: ...
statements.
Slowloris can be mitigated by rate limiting incoming connections to Apache and / or using a non-vulnerable front end web server, such as nginx.
Another tool that can be used maliciously is Google’s skipfish. This is a security scanner that can double as a load tester. Basically skipfish will form URLs based off of crawled data and dictionary based guesses. Used maliciously, it can overwhelm a poorly deployed LAMP stack. These servers may run out of memory and crash.
When it does, there may be an OOM_Killer message in your /var/log/messages file as a final, “OMG-help-me dump” right before reboot. Memory is a scarce resource in a system, and to improve performance, the Linux kernel will overcommit RAM to the numerous processes that request an allocation of memory space. Read more about the Linux kernel’s overcommit behavior here. In summary, it’s kind of like simultaneously sitting at five different $5 dollar black jack games with $20 in chips. Most times, a couple games will cash out, releasing chips back into the resource pool, and there is no net loss. The kernel runs faster, and everybody wins. That is, until traffic increases beyond what the system can handle, and everybody loses. When this happens, the kernel has to run from at least one table (and an angry pit boss). OOM_Killer is a program that was designed to handle exceptions to a linux kernel’s overcommit behavior and uses algorithms to determine which process to kill.
A particularly amusing analogy by Andries Brouwer describes OOM_Killer thus:
An aircraft company discovered that it was cheaper to fly its planes with less fuel on board. The planes would be lighter and use less fuel and money was saved. On rare occasions however the amount of fuel was insufficient, and the plane would crash. This problem was solved by the engineers of the company by the development of a special OOF (out-of-fuel) mechanism. In emergency cases a passenger was selected and thrown out of the plane. (When necessary, the procedure was repeated.) A large body of theory was developed and many publications were devoted to the problem of properly selecting the victim to be ejected. Should the victim be chosen at random? Or should one choose the heaviest person? Or the oldest? Should passengers pay in order not to be ejected, so that the victim would be the poorest on board? And if for example the heaviest person was chosen, should there be a special exception in case that was the pilot? Should first class passengers be exempted? Now that the OOF mechanism existed, it would be activated every now and then, and eject passengers even when there was no fuel shortage. The engineers are still studying precisely how this malfunction is caused.
So if you went to the win.tue.nl link I showed you, you’ll note that like a scared gambler, the kernel (after 2.5.30) can be instructed to be more reserved when allocating memory. The values in the below files will moderate this behavior.
echo 2 > /proc/sys/vm/overcommit_memory #this will tell the kernel to never commit memory over the swap space and a certain fraction of physical memory.*
echo 0 > /proc/sys/vm/overcommit_memory #(default) the kernel will do what it thinks is best
echo 1 > /proc/sys/vm/overcommit_memory #this will tell the kernel to go hog-wild and never refuse an app memory
*This fraction is defined in “/proc/sys/vm/overcommit_ratio” – the default value is 50 (=50%
But that’s not the cure-all to memory management. That just dictates kernel behavior. App behavior is another story. Here, you have to ensure your app never asks for more memory than your system can afford to give out. Now this is going to differ based on what type of server it is, but the most commonly exploited servers (from my perspective anyways) are LAMP stacks. And of the LAMP stack, the two most common reasons why OOM_Killer is invoked is:
Too many Apache worker processes. Take a look at your MaxClients declaration in your httpd.conf file. Now, looking at the MPM (by default, it’s usually prefork, which means 1 thread per process), multiply the number of processes for your MaxClients by the average size of your Apache processes, and you’ll have the total memory allocation for Apache at full load. Naturally, you do not want this figure to be anywhere close to your physical RAM limit….
For more on LAMP stack tuning, please refer to the following links:
http://httpd.apache.org/docs/2.0/misc/perf-tuning.html
http://dev.mysql.com/doc/refman/5.0/en/server-parameters.html
http://www.interworx.com/forums/showthread.php?p=2346
http://www.ibm.com/developerworks/views/linux/…
Finally, you may wish to consider using a lighter web server, or a reverse proxy cacher to optimize performance. The bottom line is, when faced against an array of incoming attempts to overwhelm and disable your backend applications, you want your system to be robust enough to gracefully handle what it can, and redirect or ignore the rest.
In Part 2: Socket Management / Behind Your NIC, we’ll be looking at SYN Floods and other, non-port saturating attacks. Now as this series progresses, there will be some overlap in mitigation techniques that will impact more than one type of DoS attack. As long as these correlations are mutually beneficial, we’ll be just hunky-dory.