<-- Flare-On 2015 Index / FLARE-On 2015 Challenge #11 
FLARE-On 2015 Challenge #11

Date: Sep 8, 2015


filename:    CryptoGraph    DOWNLOAD
size210 k (215,040 bytes)
typeWin32 Console App
Original FLARE AuthorClaudiu Teodorescu
tool:    IDA / Disassembler    Visit Website
tool:    OllyDbg 2.01 / Debugger    Visit Website

NOTE: This challenge proved a little too difficult for me to solve before the contest deadline. I put this tutorial together after reading the official solution and the solutions posted by others. The official solution explained most of what was going on, but it wasn't until I read about the finer details from Acidshout and Ghetto Forensics that I understood what was going on.

This final challenge was designed to decrypt a PE-embedded resource block and save off the result as the file "secret.jpg". The app was essentially broken, and you had to study the code to figure out how to make it do its job. Although this challenge was probably the most complex of the 2015 series, there were also plenty of fruitless areas to focus your efforts and overthink the problem. :)

A hex dump of the first few bytes of the supplied file reveals a Windows PE executable image. Renaming to CryptoGraph.exe and launching it from the command line resulted in:
The number of parameters passed in is incorrect.
Pulling the program up in IDA reveals a main() function that couldn't be more straightforward. No anti-debugging or anti-disassembling tricks to see here. We can see this function creates the file "secret.jpg" and passes the open handle to another function, presumably so the file's contents can be filled-in, but not before something is done with the first command line argument. IDA couldn't identify the atoi()-like function, but inspection of its internals revealed it did the same basic thing. It converts the passed base-10 string (the first command line argument) to a number. Although the atoi()-like function returns a 32-bit number, the caller only cared about the lower byte portion. So right away, we know the command line argument can take a number between 0 and 255. The byte and the file handle are then passed to another routine to perform the remainder of the work.

Flare-On 2015 Challenge #11 - Main() function

Before we spend any more time in static analysis, we now have enough information to do a trial run. Let's see what happens when we guess a byte value:
cryptograph.exe 37
Waiting... waiting... and we wait some more. Nothing happens. For all we can tell, the program is hung. After a couple minutes with no console activity, we kill it and go back to IDA. We aren't in any position to brute-force this byte yet.

The worker routine proceeds to load resource IDs #120 (48 bytes) and #121 into two memory buffers. Buffer #120 then undergoes a transformation where it is divided into 3 16-byte values. The 2nd and 3rd are XORed with each other, then that result is XORed over the first, with the result overwriting the first 16 bytes. The official solution describes resource #120 as the "MasterKey" from which all other keys are derived from, and resource #121 as a structure that describes the parameters of the RC5 decryption to perform on 32 more keys. These parameters include iterations, rounds, and a salt value.

Both buffers are then passed to two-more worker routines. The first of which I call cruncher() because it contained the bulk of the crypto work. So much that its main loop never seemed to finish. There was decryption and hashing going on and plenty complex initialization prior to the start of the main outer loop. Cruncher()'s outer loop was designed to decrypt a table of 32 keys. If you could figure out how to get this part to finish, the the 2nd worker routine would be called.

The 2nd worker routine, which I happened to call file_writer(), picks up where cruncher() left off. Another embedded resource (ID# 122) is loaded and depending on whether or not cruncher() properly decrypted its 32-key table (and no traps set along the way were triggered), a key is pulled from this table and combined with one of the keys from the 32-key table. The resulting key is then used to decrypt Resource ID #124 (the contents of "secret.jpg"). Both indexes to obtain this key might as well been encrypted along with the 32-key table, because their determination was indirectly based on those decryption results.

From start to finish, you are kind of in the dark with this challenge. Very few error paths stood out preventing you from easily confirming the correctness of one output so you could move on to the next. Whether certain branches executed or not, it was never very clear in certain areas if you were dealing with a success branch, a failure branch, or a bogus branch. Certain conditions in one part of the program causes drastic results in another seemingly disconnected part making the flow difficult to analyze.

Problem #1: The program hangs, so we can't even get a badly decrypted JPEG out of it. This may lead some to think this is the first problem to try to address. The cruncher() function happens to hang no matter what byte you plug in from the command line, so I focused the majority of my efforts to plug this byte into a different spot and skip the execution of the seemingly infinite decryption loop altogether; I thought it was bogus as we clearly had other keys that were being used in the parts of the program that followed. But, had I left the byte to be plugged-in as is, I might have focused my efforts down to the CMP instruction at 4016D4, which only executes a specific branch if the first DWORD of a resultant derived key is 0 (this derived key was decrypted by the MasterKey to decrypt the key table, and was based on the correct command line byte). This is in the initialization area before the main outer loop begins executing decryption "rounds". One clue I missed was a counter I had already correctly determined to be an "error" counter. If any part of the program increments it, it throws off the key indexes chosen in the 2nd half of the program, resulting in an incorrectly decrypted JPEG. If this branch did not execute, the error counter was incremented at 401723, so that was the indication that this branch needed to execute. Brute forcing the command line until I hit a breakpoint in this branch would have gotten me to the next step. The correct command line byte to "unlock" this branch happens to be 205 (decimal).

Flare-On 2015 Challenge #11 - cruncher() function

Problem #2 is that the loop still hangs, even with the correct command line byte. The difference is the correct byte allows the loop to finish in about 24 hours (depending on the speed of your computer), still resulting in a garbage JPEG. The loop executes much longer (perhaps months? years?) when an incorrect byte is used. The part of the loop consuming the majority of the execution time was the function call at 4017A9. This call was responsible for performing a single round of RC5 decryption using the number of iterations passed as the 6th argument. This value began at 0x10000 and doubled with each iteration of the outer loop, also doubling the time to complete each loop round. The kicker is that this 32-key table was encrypted 32 times, but not 32 times in its entirety: to make each round of decryption reveal the next subsequent decrypted key, the first round of the table's encryption must have encrypted the last key only. Then it would encrypt the last two keys as round #2 (with the last key having two layers of encryption), the last three keys as round #3 and so on, until it has encrypted the whole table down to the first key - the first key having only one layer of encryption, the rest having increasing layers of encryption at each step. The side effect of this algorithm allows you to have decryption that could take years, if it is not manually stopped; if you know where your key is (or just want to guess them one by one), you can manually stop the loop at the point where the encryption layer for the desired key has just been peeled away. This is my understanding of the author's solution in combination with the code's behavior.

So we understand at this point that we need to let this outer loop execute only as long as needed to expose whatever key is later combined with its "mate" key from resource #122. Since resource #122 has only 16 "mate" keys (which wasn't apparent to me during the challenge), you could guess that you only needed the loop to execute 16 times to reveal a maximum of 16 decrypted keys, rather than the full 32. By the way, this takes about 37 minutes on my machine and it results in a corrupted JPEG. I didn't investigate why (as it should have worked), but I likely triggered another one of the "traps" present in the challenge.

Flare-On 2015 Challenge #11 - key derivation loop in cruncher()

If we manually break out of the loop by skipping the JB instruction at 4018C3 at the right time, this unlocks the remainder of the program. Therefore you can brute force how many times you need to let the loop execute. Break on the JB instruction at 4018C3 at each iteration, and try increasing numbers of rounds (in the EAX register) in separate debugging sessions until the resulting "secret.jpg" becomes a valid JPEG. You'll eventually find this loop needs to be run 10 times to decrypt the 9th key which is later referenced. The patch to fix the executable is:

004018C0 83F8 20 CMP EAX, 20 ;replace this 004018C0 83F8 0A CMP EAX, 0A ;with this
    bytepatch -pa 0x4018C0 CryptoGraph.exe 83F80A
As others have pointed out, we get a hint that AT LEAST 10 iterations must execute by looking at function 401B60 (called near the top of file_writer() after resource #122 is loaded). The function call at 4017A9 from the outer loop in cruncher(), is initially called with at iterations count of 0x10000. This is a 32-bit number of all zeros except with the 17th bit set. With each iteration, this value is left-shifted by 1 (doubling it). By the time the outer loop's iteration count is 10, this bit is in the 25th position. The check at function 401B60 (specifically 401BED) shown below, only executes a special block if the iterations count is nonzero after shifting it 24 positions to the right. In other words, this block only executes if the outer loop has executed at least 10 times. The block then, in an obfuscated way uses a divide-by-zero exception handler to hide the selection of key index #9.

Flare-On 2015 Challenge #11 - Iterations check in 401B60

If you let cruncher()'s outer loop run many more times (such as 15 rounds), the decryption stops working once again.

If the program is fixed in the two spots described above, you'll get the properly decrypted "secret.jpg" file as shown below. The solution e-mail address is embedded at the bottom as viewable text.
Flare-On 2015 Challenge #11 - Properly Decrypted secret.jpg

I'm no soccer fan, so I didn't know who this guy was. Apparently it is a picture of Lionel Messi, an internationally acclaimed soccer player. Besides his recent high-profile tax evasion case, I'm not sure why cryptologists would hate him.

<< Flare-On 2015 Index