So, having never heard of shellcode before, I assumed it was just another way of saying shell script. But I scroll to the bottom and see
(*(void(*)()) shellcode)();
and suddenly this looks way more interesting. Is it really possible to literally just execute random bytes stored in a string like that? I mean, sure, you'd have to guarantee that it's running on the right platform with the right type of assembly, but still. This is fascinating! Plus, I don't even see execve actually pushed anywhere! Is that because of int 0x80? Wow this stuff is neat! I'm whelmed.
As an aside, is there any way, using HN syntax, to write that code inline (i.e. not in its own paragraph) without the asterisks insisting that I mean italics?
Take a look at Microcorruption [0]. If this post intrigued you, I think you'd really enjoy it.
It's a series of challenges by tptacek that task you with exploiting the firmware for a digital smart lock. It starts out by assuming no knowledge, with the first level literally just requiring that you read a memory dump, but by the last level you'll be reverse engineering custom heap implementations, injecting shellcode into ASLR'd binaries, and bypassing memory protections.
It really is the best introduction to this sort of material that I've ever come across.
"shell code" comes from back in the day when such code would usually "spawn a shell", essentially calling /bin/sh to yield a shell with elevated privileges (i.e., the privileges of the program being exploited; imagine being a normal user, able to call a program that runs with root privileges, then if you can trick the program into running arbitrary code, you can run code as root although you're a normal user). Ideally, when called across a network (say you could get Apache to execute random code you put into a GET header), such a shell should be accessible on some port so that you could telnet to that port and be 'logged on remotely'.
This 'shell code' was the 'payload' of other code that exploited e.g. a buffer overflow; the goal of such an exploit was always to trick some program into executing that payload, i.e. the shell code. So there were/are essentially two parts to writing an exploit: first, getting the software to actually execute random code; second (if you actually want to use the exploit), writing shell code that does something useful to an attacker, like spawning a shell or (as in the OP) dropping firewall rules.
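For readers who have not seen what "spawn a shell" means at the system-call level, here is a minimal C sketch, assuming a POSIX system with a shell at /bin/sh. The helper name `run_in_shell` is hypothetical and not from the thread. Classic shellcode performs the equivalent of `execve("/bin/sh", ...)` in hand-written machine code to get an interactive shell; this sketch passes `-c` and a command instead, only so the result can be checked.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `cmd` under /bin/sh in a child process.
 * Returns the shell's exit status, or -1 if something went wrong. */
int run_in_shell(const char *cmd)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: replace this process image with the shell. */
        char *const argv[] = { "sh", "-c", (char *)cmd, NULL };
        char *const envp[] = { NULL };
        execve("/bin/sh", argv, envp);
        _exit(127); /* only reached if execve itself failed */
    }

    /* Parent: wait for the shell and report how it exited. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, `run_in_shell("exit 7")` reports the shell's exit status of 7. A real payload would skip the `fork` and simply turn the exploited process into the shell, inheriting its privileges.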
As a less objective side note, writing 'shell code' was always considered (in the circles I hung around in) less honorable and prestigious. The 'exploit writers' (exploiting the primary bug) were the samurais, the shell code guys the ninjas, as it were. Writing shell code was considered somewhat dirty grunt work, and only script kiddies needed it anyway, because after the PoC for the actual exploit was done, the 'interesting' part was over. (I never 100% agreed with this; for some exploits you need some serious trickery to make your PoC actually do something useful.)
Correct. Once you successfully cast to a function pointer, the compiled code will do all of the various register bookkeeping for stack management, then jump to the instruction specified. If it's legit assembly, then you're in business. If it's not, then you probably crash. It doesn't matter whether the instructions are the garden variety compiler-generated kind or some hand-crafted artisanal limited edition bytes. Instructions are instructions at that point.
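To make the cast concrete, here is a minimal sketch, assuming a POSIX system (tested mentally against Linux) on x86 or AArch64. The helper name `run_bytes` and the sample bytes are illustrative and not from the original post; the bytes encode a function that does nothing but return 42. Since ordinary data pages are usually not executable, the sketch copies the bytes into a fresh page and switches it to read+execute before casting.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Machine code for "return 42". Encodings are architecture-specific. */
#if defined(__x86_64__) || defined(__i386__)
static const unsigned char code[] = {
    0xb8, 0x2a, 0x00, 0x00, 0x00, /* mov eax, 42 */
    0xc3                          /* ret */
};
#elif defined(__aarch64__)
static const unsigned char code[] = {
    0x40, 0x05, 0x80, 0x52, /* mov w0, #42 */
    0xc0, 0x03, 0x5f, 0xd6  /* ret */
};
#else
#error "no sample machine code for this architecture"
#endif

/* Hypothetical helper: copy the bytes into a fresh page, make the page
 * executable, and call it through a function pointer.
 * Returns the callee's result, or -1 if the memory could not be set up. */
int run_bytes(void)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    if (pagesize <= 0)
        return -1;
    size_t len = (size_t)pagesize;

    unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    memcpy(buf, code, sizeof code);

    /* Drop write permission and add execute permission. */
    if (mprotect(buf, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(buf, len);
        return -1;
    }

    /* No-op on x86; needed on AArch64 so the CPU sees the new code. */
    __builtin___clear_cache((char *)buf, (char *)buf + sizeof code);

    /* The cast from the thread: treat the address as a function. */
    int (*fn)(void) = (int (*)(void))buf;
    int result = fn();

    munmap(buf, len);
    return result;
}
```

On a typical x86-64 Linux system `run_bytes()` returns 42. A hardened system that refuses to make formerly writable memory executable would make it return -1 instead. `__builtin___clear_cache` is a GCC/Clang builtin, so the sketch also assumes one of those compilers.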
> So, having never heard of shellcode before, I assumed it was just another way of saying shell script. But I scroll to the bottom and see
(*(void(*)()) shellcode)();
> and suddenly this looks way more interesting. Is it really possible to literally just execute random bytes stored in a string like that?
Well it's just a sequence of bytes somewhere in-memory, no? If the C compiler allows the cast, then it'd compile fine, and the CPU wouldn't know the difference; it'd just execute whatever instructions the sequence of bytes listed.
Because DEP aka W^X will mark the section of memory containing the string as non-executable, so it will segfault when the instruction pointer hits that address. You can disable that security feature though with a compiler option.
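The effect described here can be observed without crashing your own process. The following is a sketch under a few assumptions: a POSIX system, a kernel and CPU that actually enforce non-executable data pages, and a toolchain that does not mark data executable. The helper name `jump_into_data_crashes` is made up for illustration. It forks a child, has the child jump into a byte array that lives in ordinary writable data, and reports whether the child died.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* A single x86 "ret" instruction, stored as plain writable data.
 * With DEP/NX enforced, the page holding it is not executable, so the
 * contents never get a chance to run on any architecture. */
static unsigned char data_bytes[] = { 0xc3 };

/* Hypothetical helper: returns 1 if jumping into data_bytes killed the
 * child (or made it exit abnormally), 0 if the bytes actually ran and
 * returned, and -1 if fork/waitpid failed. */
int jump_into_data_crashes(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: same cast as in the thread, aimed at a data page. */
        ((void (*)(void))data_bytes)();
        _exit(0); /* only reached if the data page was executable */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
        return 0;
    return 1;
}
```

On a current Linux system the child is normally killed by SIGSEGV, so the helper returns 1. A result of 0 would indicate that data pages are executable in that environment.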
> Because DEP aka W^X will mark the section of memory containing the string as non-executable, so it will segfault when the instruction pointer hits that address. You can disable that security feature though with a compiler option.
DEP is a kernel-level security feature no? Since the kernel is what sets up the .text and .data sections.
Not necessarily. There are software emulations; examples would be W^X on OpenBSD[1] and Grsecurity/PaX on Linux[2]. Ubuntu[3] and RedHat[4] also have (partial) NX emulation thanks to ExecShield.
As for OpenBSD and Linux without grsec/pax, one can bypass NX (whether the CPU has the NX-bit or not) by marking the region with the shellcode as executable, eg:
mprotect((void *)((uintptr_t)shellcode & ~((uintptr_t)pagesize - 1)), len, PROT_EXEC);
((void(*)()) shellcode)();
In an exploit, this could be accomplished with ROP.
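The `mprotect` call needs a little care in practice: it requires a page-aligned start address, so the buffer address has to be rounded down and the length extended by the same amount. A minimal sketch of that arithmetic follows, assuming a POSIX system; `page_floor` and `make_executable` are hypothetical helper names, and keeping read and write permission alongside execute is my assumption about what a caller would want, not something stated in the thread.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round addr down to the start of its page.
 * pagesize must be a power of two. */
uintptr_t page_floor(uintptr_t addr, uintptr_t pagesize)
{
    return addr & ~(pagesize - 1);
}

/* Hypothetical helper: add execute permission to every page that
 * overlaps [addr, addr + len). Read and write are kept so that other
 * data sharing those pages stays usable (an assumption of this sketch).
 * Returns 0 on success, -1 on failure. */
int make_executable(const void *addr, size_t len)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    if (pagesize <= 0)
        return -1;

    uintptr_t start = page_floor((uintptr_t)addr, (uintptr_t)pagesize);
    /* Extend the length by however far we moved the start back. */
    size_t span = (size_t)((uintptr_t)addr - start) + len;

    return mprotect((void *)start, span,
                    PROT_READ | PROT_WRITE | PROT_EXEC);
}
```

With a 4096-byte page, `page_floor(0x12345, 0x1000)` gives `0x12000`. Systems that enforce a strict W^X policy may refuse the combined write+execute request, in which case `make_executable` returns -1.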
Right, but it doesn't change my statement that W^X is enforced by the kernel (meaning the kernel sets all of the program pages to have the right bitset). Not to mention that it was (is?) emulated in software for some kernels.
Every single program that your computer runs is nothing but a 'pile of bytes' that is carefully set up by your compiler. There is no technical reason that you cannot manually set up said byte piles and execute them, and it is actually not hard at all. (The only 'gotcha' is that some parts of memory are not marked as executable, but this can be changed at runtime or compile-time).
execve does nothing more than swap out which bytes are loaded into the address space of a program (okay, it does some other book-keeping too, but from 100 feet...). It is not necessary for continuing execution outside of what the program has set up at compile-time. In fact, you can dynamically generate code on the fly, as JIT compilers do.
Yes. Things get more complicated with a modern OS on good hardware, but otherwise it works fine. The "int 0x80" does a Linux system call. This is how things get hacked into.
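This also explains why the name execve never shows up in the bytes: the kernel identifies a system call by number, not by symbol. On 32-bit x86 Linux the number goes into eax (11 for execve) and `int 0x80` traps into the kernel. The sketch below illustrates the by-number idea in C using the libc `syscall` wrapper rather than a raw `int 0x80`, so it is a stand-in for the technique and not the instruction sequence from the post. It assumes Linux, and `raw_getpid` is a made-up name.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: ask the kernel for our process ID by system call
 * number instead of going through the named getpid() wrapper. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}
```

The result matches what `getpid()` returns, since both end up making the same numbered request to the kernel.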
This kind of thing works on most non-Linux embedded devices. The OS on these is commonly VxWorks or ThreadX.
It works on Linux if you use a 486, Pentium, or 32-bit PowerPC. (and weirder stuff) It works on Linux if you use a kernel that is 10 years old. It works on Linux if you add a *.s file to your build. (an assembly file; empty is fine)
It works on Windows if the program wasn't built with some compiler option I forget. The security is opt-in, so most developers don't bother.
A properly configured firewall wouldn't allow anything except, for example, inbound traffic on port 80. If you wanted to start some kind of reverse shell, you would first need a hole in the firewall.
http://www.pentesteracademy.com/course?id=3
http://www.pentesteracademy.com/course?id=7
There are many free videos in this collection.