PicoCTF – BitPuzzle

Hi guys 🙂

This time, I am not posting a binary exploitation write-up but rather a reverse-engineering one.

The challenge is called “bitpuzzle” and is basically a crack-me. It asks for a string, read it from the standard input and perform several operations to check if it matches a particular expected string. The binary is a stripped 32-bit ELF executable.

pico4180@shell:/home/bitpuzzle$ ./bitpuzzle
Bet you can't solve my puzzle!
Give me a string.
this_is_string

Sorry, this_is_string is not the right string!
pico4180@shell:/home/bitpuzzle$

In order to resolve the challenge, I did a static analysis using the demo version of IDA. The only inconvenient is that you can not save your current work but it is not that annoying when the challenge is solvable in a reasonable amount of time.

1) Reading the standard input

The first step for this kind of crack-me is to understand how the input string is read, in order to identify the length and location of the buffer.

You can see that the fgets() function is called:

  • The first parameter is the input buffer, located on the stack at esp + 0x1C
  • The second parameter, 0x50, is the maximum number of bytes (less one) that fgets() will read on the specified stream
  • The third parameter is the global pointer stdin (FILE *) included from the header file stdio.h and stored in the bss section

We know from that function call that the flag will not be longer than 79 bytes (0x50 – 1).

2) Removing the new line character

The next step is to identify operations performed on the input string and to understand their purpose.

Here, theses instructions are used to replace the last character from then string (usually ‘\n’) by a null byte. The first step is to compute the length of the string using the repne scasb instruction.

The scasb instruction (scan as byte) is commonly used to compute strings length. It compares the byte in al with the byte located at es:edi. If they are equals, it sets the status flag (SF) in the EFLAGS register. It then increments edi and decrements ecx (when used with a rep family instruction). The repne (repeat while not equal) is used to repeat the scasb instruction while SF is not set.
So basically, when you leave this instruction, the length of the string is equal to (~ecx – 1).
Since the input string is located at esp + 0x1C, moving the null byte at [esp + 0x1A + ecx] will replace the last character (‘\n’) with a null byte.

3) Checking whether the string length is valid

We continue to look after the following instructions related to our input string.

Well, just after removing the ‘\n’ character, it executes again the repne scasb instruction. It then checks whether ecx equals 0xFFFFFFDE. Since ecx is originally set to 0xFFFFFFFF, it has the value 0xFFFFFFDE if the input string is 32 bytes long (0xFFFFFFFF – 0xFFFFFFD – 1 = 32).
If this condition is true, it jumps to .text:08048582, to continue its execution flow. Otherwise it prints a failure message and leaves the program by calling the _exit() function.

Well, we know that our string must be exactly 32 bytes long !

4) Learning constraints applied on the input string

Here, things are starting to be interesting since arithmetics operations are applied on the input string.

The 12 first bytes of the input string are split into 3 dword. Since the input string is 32 bytes long, we know that we might have up to 8 dword. From theses instructions, we can learn 2 constraints:

  • dword_3 + dword_2 = 0x0C0DCDFCE
  • dword_2 + dword_1 = 0x0D5D3DDDC

If theses two comparisons are true, the ecx register is set to 1 (.text:080485AA), otherwise it keeps the value 0 (.text:08048591). At the end of the execution flow (after checking all the constraint), the ecx register is checked and if its value is 1 we found the right string 🙂

Let’s look at the next constraint:

From theses instructions, we can learn one more constraint:

  • (dword_2 * 5) + (dword_1 * 3) = 0x404A7666

Such as for the previous (and following) checks, if the comparison fails, the ecx register is set to 0 (.text:080485C1), otherwise it is left unchanged.

5) Listing all the constraints

I will not explain each constraint, instruction by instruction, since it would be long (more than 70 instructions) and may be a bit annoying… However I can tell you that the constraints are based on basics x86 instructions (xor, imul, and, div, sub, add), so it is not really hard to find them all.

Well, here is the list of the constraints after showing a little patience:

  • (dword_3 + dword_2 = 0x0C0DCDFCE)
  • (dword_2 + dword_1 = 0x0D5D3DDDC)
  • (dword_2 * 5) + (dword_1 * 3) = 0x404A7666
  • ((dword_4 ^ dword_1) == 0x18030607)
  • ((dword_1 & dword_4) == 0x666C6970)
  • (dword_2 * dword_5 == 0xB180902B)
  • (dword_5 * dword_3 == 0x3E436B5F)
  • (dword_5 + (dword_6 * 2) == 0x5C483831)
  • ((dword_6 & 0x70000000) == 0x70000000)
  • (dword_6 / dword_7 == 1)
  • (dword_6 % dword_7 == 0x0E000CEC)
  • ((dword_5 * 3) + (dword_8 * 2) == 0x3726EB17)
  • ((dword_8 * 7) + (dword_3 * 4) == 0x8B0B922D)
  • ((dword_8 * 3) + dword_4 == 0xB9CF9C91)

6) Writing a program handling all the constraints

Once we are aware of all the constraints applied on the input string, we can write a program used to guess what this string might be. We know from the previous challenges that the flag is likely to be composed of characters within the range [a-z] or ‘_’. Therefore, we know that each dword can have 531441 different values (27 ^ 4) and that each character of the string must be between 0x61 (‘a’) and 0x7A (‘z’) or equals to 0x5F (‘_’).

In order to be the most efficient, I tried to look at the most constraining operation on a dword to see if it could first reduce considerably the number of tries and then allow the deduction of the others dwords. I found these 3 operations pretty interesting:

((dword_6 & 0x70000000) == 0x70000000)
(dword_6 / dword_7 == 1)
(dword_6 % dword_7 == 0x0E000CEC)

From the first one, we know that the higher-byte of dword_6 (i.e last “string” character of this dword) must be from 0x70 (‘p’) to 0x7A (‘z’), that leaves only 11 choices. Therefore we can reduce for this dword the number of tries from 531441 to 216513 (11 * 27^3 rather then 27^4). Then, the second one tells us that dword_6 is bigger than dword_7 and finally, the third one tells us that dword_6 is bigger than dword_7 from 0x0E000CEC bytes (i.e dword_7 = dword_6 – 0x0x0E000CEC).

Starting from this point, i wrote the following program where the deduction of a dword allows the deduction of another one, and the only number on which iterations are done (brute-force) is the most constraining:

Well let’s compile and test this piece of C code:

user@debian:~$ gcc -O2 solver.c -o solver
user@debian:~$ time ./solver
The flag is: solving_equations_is_lots_of_fun

real 0m0.005s
user 0m0.003s
sys 0m0.002s
user@debian:~$

And here is our flag after 5ms of execution: solving_equations_is_lots_of_fun 🙂

If you want to play with this binary it can be downloaded here.

EDIT 24/07/2015:

After solving this challenge, I took a look at the hint section to see what advices where given. It said: “You will probably want to disassemble this program and find the constraints on the input. Then, you might want a constraint solver!

Well, brute-forcing, even on a reduced set, was may be not the more elegant solution… So I tried to do something with Z3 which is “a high-performance theorem prover being developed at Microsoft Research“.

While my poor CPU was burning compiling Z3, I took a look at this nice tutorial which helped me to use the python bindings and to write the following little script:


user@debian:~$ time python z3_solver.py
solving_equations_is_lots_of_fun

real 0m0.370s
user 0m0.338s
sys 0m0.032s
user@debian:~$

As you can see, it is also pretty fast ! The execution time is a little bit longer but this is certainly due to the creation of the python process which obviously heavier than our simple program of less than 100 lines. Moreover the code is simpler since Z3 is solving all the constraints for us ! I will probably use Z3 again in the future for crack-me challenges 🙂

remi

Security Engineer / Malware Analyst, interested in reverse engineering, vulnerability exploitation, OS architecture & software developpement.

Leave a Reply

Your email address will not be published. Required fields are marked *