NorthSec 2021 CTF write-up – part 2

by | Jun 15, 2021 | CTF write-ups, Hack, Information Technology, InfoSec

If you didn’t read it, I wrote a “part 1” which addresses easier challenges: https://www.tristandostaler.com/northsec-ctf-write-up-part-1/
This post will be the part 2 of my write ups.

Hymn

This challenge was interesting for me because of my bias towards challenges of the past. As soon as I saw a CAPTCHA, my whole body fully and completely assumed it was a challenge to logically bypass the captcha, as opposed to a crypto challenge. That assumption was my main error and I intend to learn from it. Like a friend of mine says: Never assume anything (except breach)!

CAPTCHA challenges tend to always be around the following possibilities:

  1. There is a vulnerability around how the captcha is generated, allowing to predict the value
  2. There is a vulnerability around how the captcha is stored, allowing to either bypass the validation, query the current value to be able to submit it, etc.
  3. The values are always in a closed and small set, allowing to send always the same response and having a good chance of falling on the good answer (imagine a case where there are only 20 possible values, you have a 1/20 chance of submitting the right one on each try).

The context

We are provided with a ZIP file and a link to a website. In the ZIP file there is a file named “backup_app.py”, which is the backup of the application with have access to. When analyzing the content of the file, we can see that it generates CAPTCHA challenges that we need to solve 10000000000000001 time to receive the flag:

if (current_tries > 10000000000000000):
    return "THE FLAG IS XXXXXXXXXXXXX" #TODO: Add real flag

Of course, in the real challenge, the flag has the correct value. You can find the complete file content in this gist: https://gist.github.com/tristandostaler/899c85561e022fbf95465d2c2cdeabe9

The wrong vulnerability

As explained before, my initial analysis of the challenge was biased. Because of this, my initial analysis of the challenge was wrong. I will still explain it as I think we can all learn from it as it would still have been an interesting, but easier, challenge.

So starting from the line where the flag is printed, we can analyse the path the code needs to follow from a request I send:

@app.route('/',methods=['GET','POST'])
def index():
    if request.method == "POST":
        if ('CT' not in request.cookies):
            return redirect('/')
        else:
            if (int(request.cookies.get('CT').split("|")[1]) != int(cache['foo']-1)):
                
                resp = make_response(render_template('index.html', tries=0))
                resp.set_cookie('CT',make_struct(get_random_string(),0,cache['foo']))
                cache['foo'] = cache['foo'] +1
                return resp
            gh = parse_struct(request.cookies.get('CT'))
            verse = request.form.get('verse')
            original_verse = gh.split('|')[0]
            current_tries = int(gh.split('|')[1])
            # SUCCESsS
            if (current_tries > 10000000000000000):
                return "THE FLAG IS XXXXXXXXXXXXX" #TODO: Add real flag

Starting from the line just above the flag, and analyzing one line at a time, we have:

if (current_tries > 10000000000000000):

We can understand from this line that we need to have the variable “current_tries” set to a value larger than 10000000000000000. Because I was biased, I was reading a rather small number, like 1000, or 10000, which could definitely be done with a script if we can somehow bypass the CAPTCHA validation. So let’s find how to bypass the validation

current_tries = int(gh.split('|')[1])

Keeping in mind my wrong interpretation of the point 1, I thought that was interesting because we can see that the value current_tries comes from user controlled data:

# (...)
gh = parse_struct(request.cookies.get('CT'))
# (...)
current_tries = int(gh.split('|')[1])

As we can see, current_tries directly comes from the cookies, which is something we can control and this value is not modified in any way. The only restriction we have is from the following line:

if (int(request.cookies.get('CT').split("|")[1]) != int(cache['foo']-1)):

This line makes sure that we can’t send a value that is not in the cache (the server’s local memory). We already have the first vulnerability: because the value is a local variable, it is not restricted to the session. That means we can influence it from multiple computers and/or processes at the same time, aka we can do a hyper aggressive bruteforce (in a setup where the cache is bound to a session, we would be limited to whatever number of parallel sessions we can have on the server, often 1). If we can do an agressive bruteforce, we can definetly reach 1000 or 10000 really quickly (keep in mind my bias).

We only need to make sure that the value is never reset, and we need to find how to increment it. To find this, we can search in the code for the text “cache[‘foo’] =”:

# (...)
cache['foo'] = 1
# (...)
cache['foo'] = cache['foo'] +1
# (...)
cache['foo'] = cache['foo'] + 1
# (...)
cache['foo'] = cache['foo'] + 1
# (...)

cache['foo'] = cache['foo'] +1
# (...)
cache['foo'] = cache['foo'] + 1

Looking at this, we understand that the value is never reset, it is ever only increased in some conditions. The bruteforce looks more and more promising! So we only need to find the path to increment the value and we can then code a script to send the right values to increment current_tries to 1000 or 10000. After some analysis, I was confident that by simply sending a POST request where the cookie had a value like “something|-1”, where -1 is never the same as the value in the cache, and after some time, send “something|10000”, we would get the flag. This is from the following lines:

if (int(request.cookies.get('CT').split("|")[1]) != int(cache['foo']-1)):
    resp = make_response(render_template('index.html', tries=0))
    resp.set_cookie('CT',make_struct(get_random_string(),0,cache['foo']))
    cache['foo'] = cache['foo'] +1
    return resp

The wrong exploit

With this wrong analysis in my mind, together with my misplaced confidence, I fired up Burp Suite’s intruder plugin, always repeating 10000 times the same request with the cookie set to “something|-1” (in reality I simply took an existing request I had where the cookie was set, and change the value to -1). While intruder was working, I started to explain the challenge to a teammate. When we passed over the line “if (current_tries > 10000000000000000):”, his response was something like “dude, that’s like 10 trillion, not 10 thousands….”

Oh shit…

The real vulnerability

After realizing my bias and how I completely overlooked the number “10000000000000000”, we need to start the analysis from the start.

So we know we need to send a POST request, and that we need to send a cookie where the value of the number is the same as the cache:

# (...)
if request.method == "POST":
# (...)
  if (int(request.cookies.get('CT').split("|")[1]) != int(cache['foo']-1)):
  # We don't want to fall in this condition, so the value needs to match the cache
# (...)

But is there a path where was can send a value that would be the same as the cache in the beginning of the request, but change to 10000000000000000 just before it is evaluated by the if?

it turns out there is one, and that it’s the challenge. If you remember, the variable current_tries indirectly comes from the cookie:

current_tries = int(gh.split('|')[1])

Well, it turns out that the variable “gh” comes from:

gh = parse_struct(request.cookies.get('CT'))

where parse_struct is:

def parse_struct(ct):
    ctr = ct.split("|")[1]
    try:
        zz = base64.b64decode(ct.split("|")[0])
    except:
        zz= make_struct(get_random_string,0,cache['foo']).split('|')[0]
    if (ctr != cache['foo']-1):
        rr = decrypt(cache['foo']-1,zz).decode('UTF-8')
        if (int(rr.split('|')[2]) != int(cache['foo']-1)):
            print(rr.split('|')[2])
            raise ValueError

        if (int(rr.split('|')[2]) != int(ctr)):
            raise ValueError
            return False
        return rr
    else: 
        return get_random_strings + "|" + str(0).zfill(24) + "|" + cache['foo']

After some analysis, we can see that the return value, when the conditions are correct, is “rr”, which comes from decrypting the value where I was sending “something” in “something|1” (at the index 0 when splitting the cookie on “|”). We can also see that this “rr” variables is the decrypted value of following function:

def make_struct(letters, tries, ctr):
    x = letters + "|" + str(tries).zfill(24) + "|" + str(ctr)
    y = base64.b64encode(encrypt(ctr,x)).decode('UTF-8')+"|" + str(ctr)
    return y

From this, we understand that, when splitting on “|”, the first index is the response of the CAPTCHA (the letters we needed to send), the second index is the number of tries we want to set at a high number, and the third index is the number of tries the server has in it’s cache, which need to match what we send in the cookie, because of the line “if (int(rr.split(‘|’)[2]) != int(ctr)):”. So how can we force the second index to be the number we want if it’s encrypted. If we looked at how the encrypt function works, we can see there is a warning and an explanation in the documentation of the “ARC4” module:

def decrypt(ctr, msg):
    x = int(cache['foo']-1)
    key = b'BRING YOUR OWN KEY'
    nonce = str(x%2048).encode('UTF-8')
    tempkey = SHA.new(key+nonce).digest() # this is the derived key
    cipher = ARC4.new(tempkey)
    msg =cipher.decrypt(msg)
    return msg

One problem of ARC4 is that it does not take a nonce or an IV. If it is required to encrypt multiple messages with the same long-term key, a distinct independent nonce must be created for each message, and a short-term key must be derived from the combination of the long-term key and the nonce. Due to the weak key scheduling algorithm of RC2, the combination must be carried out with a complex function (e.g. a cryptographic hash) and not by simply concatenating key and nonce.

https://pycryptodome.readthedocs.io/en/latest/src/cipher/arc4.html

But, we could also read that ARC4 is a stream cipher. Without going too much in the details, a major problem with ARC4 it that it calculates a strong and secure number from the key and the nonce, and this number is “simply” XOR’ed with the message we want to encrypt or decrypt. And if you know your XOR math, you know that XORing any number with 0, gives this number back:

So if we were to XOR the derived key with a bunch of zeros, we would essentially leak that part of the derived key when sending the result to the client (hacker). But because the derived key is built with the nonce and then hashed, the attack would be really difficult. Doable, but difficult. So we looked for an easier solution before going that route. It turns out it is much simpler than that!

When looking at how the cookie is first generated, we can see it comes from this line from the function make_struct:

x = letters + "|" + str(tries).zfill(24) + "|" + str(ctr)

and make_struct is called when we initially have no cookies or when we do a GET request. What we receive as a cookie can be understood as: “<base64 of the encrypted string>|<the number of tries in the cache>” where the value of <base64 of the encrypted string> is where we want to control data and where is the content of the variable “x” when encoded. As we know, a bunch of 0, when xored, is the derived key, and it turns out there are a bunch of 0: “str(tries).zfill(24)” (zfill fills the number to have a length of 24 characters and fills the rest with 0).

The real exploit

We now understand we can simply do a Bit-flipping attack to change the value of the number to get 10000000000000000. Because I sent a bunch of crap in my initial try, we know the lower bits are already at some value, and if we can flip the most significant bit to be a 1, we would get something like 10000000000001303, assuming the cache has 1303 as the number of tries, which is greater than 10000000000000000 and would spit the flag.

At the moment I don’t have the code we used to make the attack, but I will try and find it, and if I do I will update this post. But we would essentially do the following:

  1. Make a first GET request to receive the cookie encrypted with the actual temporary key
  2. Base64 decode the cookie the get the raw bytes
  3. Increment the correct bit, in the correct byte to get a 1 at the right place
    1. The bytes in the start of the cookie are always of the same length, so the byte to modify is always at the same position
    2. We would probably simply XOR the byte from the cookie with a number like ‘001000’ where the 1 would match the location where we want the 1 in the cookie.
      You can find an example of how and where the XOR would be done here: https://gist.github.com/tristandostaler/1e2aadf8746ec1aa166fb38d33e3a333
  4. Base64 encode the result and to a POST request with the new cookie

Conclusion

That’s it for today, I don’t know when I’ll write the next write-up, but I’ll try and add more of the challenges I found interesting!

I know this challenge was a bit difficult so if you have any question, don’t hesitate to ask!

If you’re interested in other write-ups, you can find them here:
https://nsec.io/competition-write-ups/

Feel free to leave your comment down here for any questions or comments.

Subscribe!

See more Posts: