Or: How big can my stagers be, really?
This post is also available at the Striker Security Blog
The idea here is to parse through the Metasploit Project’s available exploits to determine what the distribution of payload sizes is, with an eye to deciding whether that super cool stager idea you had the other day is worth pursuing.
If you’re familiar with the concepts of vulnerabilities, exploits, and stagers, go ahead and skip to the graphs below. Otherwise, read on:
Certain types of software bugs can lead to what are known as vulnerabilities - situations which can be exploited to cause the software to fail in specific ways. While it’s more common to be able to simply crash the software, some vulnerabilities can lead to what’s known as arbitrary code execution, which is exactly what it sounds like: anyone who is able to exploit that vulnerability can execute whatever commands they wish on the machine running the vulnerable software.
In keeping with the terms above, a “vulnerability” is a situation which could lead to a problem, if someone figures out how to take advantage of it, and an “exploit” is a piece of software which takes advantage of a vulnerability - exploits it, as it were.
However, in most cases exploits can’t simply run anything they want - the specific bug they take advantage of imposes a size limit, allowing only a certain amount of data (code) to be executed on the vulnerable host. This has led to the advent of stagers: small stubs of code which exist to reach out and retrieve a larger, more fully-featured piece of malware, which is then used to accomplish whatever the attacker has in mind now that they control the system. Since a stager is only useful if it fits into the small amount of payload space that many exploits have available, most are handwritten in assembly to keep them as small as possible.
So, with that in mind - this post explores the payload space in the exploits available as a part of the Metasploit Project, a large, publicly-accessible offensive security tool, with the assumption that these are reasonably representative of exploits in general.
This can help make decisions for stager size optimization - if I have a great idea for a stager (or other exploit payload), but can’t make it any smaller than 1k, is it worth it? What if it’s 2k? And so on.
As it turns out, payloads over 2 KB work with less than 20% of available exploits, and payloads just under 1 KB only work with about 60% - if you can’t make your stager smaller than 2 KB, you shouldn’t expect to be able to use it very often at all.
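For a sense of how payload space might be pulled out of the modules: Metasploit exploit modules are Ruby files, and many declare their limit as a 'Space' entry inside a 'Payload' hash. The sketch below is not the code used for this post. It's a minimal Python illustration which assumes the value is written as a plain integer literal; the function name `extract_payload_space` and the sample snippet are made up for the example.

```python
import re

# Assumption: the limit appears as 'Space' => <decimal integer>.
# Modules that compute the value, or write it in hex, are not matched.
SPACE_RE = re.compile(r"""['"]Space['"]\s*=>\s*(\d+)""")


def extract_payload_space(module_source):
    """Return every literal 'Space' value found in one module's source text."""
    return [int(value) for value in SPACE_RE.findall(module_source)]


# Hypothetical fragment in the style of a Metasploit module definition.
sample = """
'Payload' =>
  {
    'Space'    => 1024,
    'BadChars' => "\\x00",
  },
"""

print(extract_payload_space(sample))  # [1024]
```

Running something like this over every file in the exploits directory would give one list of sizes to work with, with the caveat that any module it can't parse is silently left out.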
Let’s dive right in with some graphs:
We’ll start by displaying payload size (the space in which a stager must fit) against the fraction of exploits which will work (or not work) for that size. It looks like any payload over 2048 bytes will only work with about 20% of exploits - a little less, in fact! If any of your stagers are just barely above 1024 bytes, it’s well worth the effort to trim those last few bytes. Almost a quarter of exploits available in Metasploit have a payload size cutoff at 1024 bytes.
The three vertical lines serve to highlight these major cutoff points - they’re at 512, 1024, and 2048 bytes.
This chart can also be read as a probability: if I want to send a 1000-byte payload, and I pick an exploit at random (or I find a vulnerable host at random), I have about a 60% chance that the exploit I end up with will be able to accommodate that payload. If my payload is 500 bytes, that probability becomes more than 90%.
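The "fraction of exploits that can accommodate a payload" reading boils down to counting how many exploits have at least that much space. Here's a rough sketch of that calculation; `fraction_accommodating` is a hypothetical helper and the numbers in `toy_spaces` are invented for illustration, not taken from the Metasploit data.

```python
from bisect import bisect_left


def fraction_accommodating(spaces, payload_size):
    """Fraction of exploits whose payload space is >= payload_size."""
    if not spaces:
        raise ValueError("need at least one exploit")
    ordered = sorted(spaces)
    # bisect_left returns the first index whose value is >= payload_size,
    # so everything from that index onward can hold the payload.
    fits = len(ordered) - bisect_left(ordered, payload_size)
    return fits / len(ordered)


# Invented payload-space values, one per exploit.
toy_spaces = [256, 512, 512, 1024, 1024, 1024, 2048, 4096]

for size in (500, 1000, 1025, 2049):
    print(size, fraction_accommodating(toy_spaces, size))
# 500 0.875
# 1000 0.625
# 1025 0.25
# 2049 0.125
```

Note how the toy result falls off sharply between 1000 and 1025 bytes: a cluster of exploits capped at exactly 1024 bytes produces the same kind of cliff described above.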
The humble histogram finishes out our exploration of the data as a whole - note that it’s on a log scale. This is significantly less useful than the above charts, since payload space acts as an upper bound (i.e. smaller payloads still work in exploits with more than enough space for them), but this view is interesting in that it shows us where there are large clusters of exploits accepting a certain payload size.
Next, we’ll move to a sanity check on platform-specific sizes. Do exploits for different platforms have roughly the same payload size characteristics?
As it turns out, not quite! It appears that if you’ve got a large payload (stager) you want to use, your best bet is to find a multi exploit to use it with, followed by unix, then linux, then Windows. For whatever reason, Windows does a better job than the rest at preventing the injection of large payloads.
Take note: if you’re designing a stager that’ll end up on the larger side, making it Windows-specific is the worst decision you can make. If it’s not under 1024 bytes, the best-case scenario is that it’s useful on about 20% of hosts (although for payloads under 1024 bytes, the platforms are essentially all the same).
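The per-platform comparison is the same calculation done once per group. As a sketch, assuming each exploit has already been tagged with a platform label (how that tagging is done is an assumption here, as are the function name and the toy records):

```python
from collections import defaultdict


def fraction_by_platform(records, payload_size):
    """Map platform -> fraction of its exploits with space >= payload_size."""
    by_platform = defaultdict(list)
    for platform, space in records:
        by_platform[platform].append(space)
    return {
        platform: sum(space >= payload_size for space in spaces) / len(spaces)
        for platform, spaces in by_platform.items()
    }


# Invented (platform, payload space) pairs, not the real data.
toy_records = [
    ("windows", 400), ("windows", 1024), ("windows", 1024), ("windows", 2048),
    ("linux", 1024), ("linux", 4096),
    ("multi", 2048), ("multi", 8192),
]

print(fraction_by_platform(toy_records, 1500))
# {'windows': 0.25, 'linux': 0.5, 'multi': 1.0}
```

Sweeping `payload_size` over a range of values and plotting one line per platform would give a chart of the kind discussed in this section.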
In summary: consider 2048 bytes to be a realistic maximum when developing payloads intended for widespread use! If you want them to be really useful, though, 1024 (or even 1000) bytes is a much better target. Platform differences can make your payloads more or less useful when they’re relatively large, but under 1024 bytes the differences are fairly small - under 512 bytes, they’re almost entirely gone.
If you’d like to see the raw data, code, and analysis, you can find it on GitHub.