There are two versions of the CTF challenge, as the first version of the program accepted false positives. When this kind of thing happens in CTF, it's usually beneficial to diff between the versions: on a file containing a lot of useless info, you can save hours of research by seeing which part has been reworked. Here the diff is not really useful, it just corrects an integer overflow in a modulo.
We load the file in IDA Pro (or any other free tool, thanks to my friend D.L. for the IDA license):
The challenge is made up of about twenty short functions, let's start with the main
Ransomware is malicious computer programs that render files on a computer unreadable, and offer the owner of the machine to recover them in exchange for payment of a ransom. The files are said to be encrypted, and the decryption tool can be obtained by paying a sum in virtual currency such as Bitcoin.
Recently, an evolved form of ransomware has been observed. The attacker demands a ransom not only to give the owner access to his files, but also not to distribute the files publicly. This form of attack mainly affects companies, which are more likely to possess sensitive files such as intellectual property documents, R&D projects or financial documents. This is called extortionware.
The flag is a ransomware that has been expanding , which has claimed more than a dozen victims among French companies. This article presents an analysis of the execution of this somewhat particular strain, recovered during an incident response and forensic mission at one of our customers.
The main simply reads 20 characters from the standard input, and sends this string to a checker function. The latter may seem very complex at first glance with its nested ifs, but the logic is actually simple: (There are 14 nested ifs in all) This function can be separated into sub-parts.
The dropper is the program responsible for depositing the malicious program on a machine. As part of the CTF ransomware, it takes the form of a PowerShell command that executes the contents of a base64-encoded string:
First, a pre-processing step that will set up the variables for the execution of the rest of the program. We're going to focus on the sub_1355 function which transforms 4-character blocks of user input into 32-bit integers, using some weird logical operations.
To understand the behavior of this function, you can copy and paste the code into Python (the operations here use the same symbols between C and Python), and play around with it a bit.
It can be seen that these operations are used to invert the position of the bytes in the number provided. But what is that for ? This is actually a fix related to how the processor represents data:
In x64 architectures, which use little endian, the order of bytes in memory is not the same if we consider int or char. It is therefore necessary to reverse the position of the bytes to fall back on our feet during a conversion from one to the other.
Then the last line of initialization will copy a long section of memory into a local variable v2.
Then the program enters an infinite loop of execution. It first retrieves the 3 successive values v2[ptr], v2[ptr + 1] and v2[ptr + 2].
Depending on the value of v2[ptr], the program will choose (with all the nested ifs seen above) one function among 14 which will be called with the arguments v2[ptr + 1] and v2[ptr + 2]. Reverse experts will have recognized a typical VM challenge, where the challenge program has a custom format and runs in a homemade interpreter. It’s sort of meta-assembler!
Let's take a closer look at some of the instructions at our disposal:
This function compares the two values pointed to by its arguments, and updates a bit in memory that notes whether these values are equal or not. This type of statement is generally followed by a conditional statement to reproduce an if, for or while. Here is one of these conditional jumps implemented in this VM:
If the two values are equal, the ptr variable will be modified. This corresponds to a conditional jump to another instruction.
Several other classic instructions are implemented such as xor, add or exit. However, one of the functions is more complex than the others:
A wise eye will recognize here the algorithm, equivalent to the Python function. The number and the modulo are given as an argument, but the power is fixed (dword_4074) and is 65537. It's crypto lovers' turn to wake up, since 65537 (also written 0x10001) is the most common exponent in the RSA cryptosystem.
Reconstruct of the application
Once all the instructions have been identified, we can move on to disassembling the program from the VM. We start by retrieving the content of the memory section that interests us, it's very simple with IDAPython for example:
Then, we reconstitute the instructions and the arguments in a readable format. Here we only rely on the first character of the opcode because it is unique to each instruction:
We get the following pseudocode:
It is a very simple script that can be summarized in 5 conditions that the input blocks must verify:
Resolution of conditions
In order to verify these conditions, two options are available to us. We could test the 2^32 possible values for each block, which would take about 1h of calculation on a good CPU with an intelligent bruteforce that eliminates non-printable characters.
The other option is to use RSA properties to instantly retrieve the expected values. We factor each module (the naive factorization by testing all the factors from 1 to sqrt(n) is very fast), which helps us find the decryption exponent. Then, we apply the RSA decryption function which directly gives us the piece of the expected flag:
And the result is instantly displayed:
We can now validate with the flag to earn 1337 points!
This article exclusively reflects the opinion of its authors and does not commit Consultingit in any way. I hope you liked it. Your comments/remarks are welcome.