FLARE-On 2015 Challenge #8

Date: Aug 20, 2015


filename: gdssagh
size: 818 k (837,120 bytes)
type: Win32 Console App
Original FLARE Author: Dimiter Andonov
tool: Your favorite C/C++ compiler
tool: IDA / Disassembler

Although gdssagh begins as a standard Win32 console application, this challenge soon becomes more about what's embedded within the application than about the application itself.

A hex dump of the first few bytes of the file reveals it is a Windows PE executable image. Renaming the file to gdssagh.exe and launching it from the command line results in the message:

Flare-On 2015 Challenge #8 - The One Who Seeks...

And seeking we are... Running the executable under OllyDbg results in the process being terminated before we can get to the entry point, and I was unclear as to why. I noticed the section characteristics seemed correct, but the image checksum was wrong... an indication the file may have been altered after it was linked. Fixing the checksum to the correct value or zeroing it out, however, didn't make it debuggable.

Anyway, let's see what IDA can tell us about the executable. At the executable entry point (401000), we can see the 3 instructions responsible for printing the "seeks" message followed by an unconditional jump. The code jumps over the next 834,962 bytes (from 401013 up to 4CCDA5) to the end of the function where ExitProcess() is called. The "code" being jumped over doesn't disassemble well, as it contains instructions like OUTS that are normally never seen in 32-bit User Mode code. We can safely guess this is a chunk of data. An executable with nothing to hide would normally store data in the .rsrc or .data sections, so seeing a chunk of data of more than 800 KB in between code (in the .text section) is certainly suspicious.
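The size of the skipped region follows directly from the two addresses in the listing. Here is a quick sanity check, assuming the jump at 40100E is a 5-byte near jmp (opcode E9 plus a 32-bit offset), which is consistent with the data starting at 401013:

```python
# Addresses taken from the IDA listing described in this write-up.
JMP_ADDR = 0x40100E   # address of the unconditional jmp
JMP_LEN = 5           # assumption: near jmp = E9 opcode + 4-byte relative offset
TARGET = 0x4CCDA5     # loc_4CCDA5, the first instruction past the data

data_start = JMP_ADDR + JMP_LEN   # first byte being jumped over
skipped = TARGET - data_start     # number of bytes between jmp and its target

print(hex(data_start))  # 0x401013
print(skipped)          # 834962
```

That is nearly the whole 837,120-byte file, so almost everything in this executable is the embedded blob.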

Flare-On 2015 Challenge #8 - Embedded Data Buried within Code

Let's save off the embedded data into a separate file to have a better look. Click on the line at the beginning of the data at 401013, press the <HOME> key to go to the beginning of the line and press ALT+l (lowercase L) to begin a selection. With the selection started, double-click the jmp's target "loc_4CCDA5" at address 40100E (a few lines above). This quickly brings us to the first instruction past the end of the data. Now use the arrow keys to ensure that ONLY the bytes being jumped over are included within the selection and none of the code, so that the selection ends in the byte value 0x3D. Now go to "Edit" -> "Export data..." to enter the export dialog. Select the "Raw bytes" radio button and type a filename of your choosing to save the data.

Viewing this data in your favorite hex editor shows only alphanumeric characters, the plus and forward-slash symbols, and CRLFs. Seeing one or more trailing "=" character(s) at the end of the data is a strong indicator that this is Base64 data.
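If you'd rather not eyeball the hex editor, the same check can be scripted. This is only a sketch: the function names are my own, and the sample string is just the Base64 form of a PNG header broken up with CRLFs, not the actual challenge data:

```python
import base64
import string

# The 64 Base64 symbols plus the "=" padding character.
B64_ALPHABET = set((string.ascii_letters + string.digits + "+/=").encode())

def strip_line_breaks(blob: bytes) -> bytes:
    """Remove the CR/LF pairs that break the Base64 text into lines."""
    return blob.replace(b"\r", b"").replace(b"\n", b"")

def looks_like_base64(blob: bytes) -> bool:
    """True if blob holds only Base64 characters and its length is a multiple of 4."""
    body = strip_line_breaks(blob)
    if not body or len(body) % 4 != 0:
        return False
    return all(byte in B64_ALPHABET for byte in body)

def decode_base64(blob: bytes) -> bytes:
    """Decode Base64 text after discarding the line breaks."""
    return base64.b64decode(strip_line_breaks(blob), validate=True)

# Illustrative sample only (not the exported challenge data).
sample = b"iVBORw0KGgoAAAANSUhE\r\nUgAAAlgAAAHg\r\n"
print(looks_like_base64(sample))
print(decode_base64(sample)[:4])
```

For the real thing, read the file exported from IDA with open(path, "rb") and pass its contents to decode_base64().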

Flare-On 2015 Challenge #8 - Embedded Data viewed with a Hex Editor

Running the file through a Base64 decoder yields binary data whose first 24 bytes in a hex dump look like this:
0000000: 89 50 4e 47 0d 0a 1a 0a 00 00 00 0d 49 48 44 52 00 00 02 58 00 00 01 e0  .PNG........IHDR...X....
Even if you didn't know much about image file headers, the fact that we can see the ASCII characters "PNG" provides a clue this is a PNG (Portable Network Graphics) image file. Renaming the file to .PNG and loading it in your favorite image viewer or internet browser results in the exact image shown here:
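Those 24 bytes already tell us more than just "PNG". As a small sketch (the function name is mine), the 8-byte signature is followed by the IHDR chunk, whose first two big-endian 32-bit fields are the image width and height:

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def parse_png_header(data: bytes) -> tuple:
    """Return (width, height) from the signature and IHDR chunk of a PNG."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # Each chunk starts with a 4-byte big-endian length and a 4-byte type.
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("first chunk is not a valid IHDR")
    width, height = struct.unpack(">II", data[16:24])
    return width, height

# The 24 bytes from the hex dump above.
header = bytes.fromhex(
    "89 50 4e 47 0d 0a 1a 0a 00 00 00 0d 49 48 44 52 00 00 02 58 00 00 01 e0"
)
print(parse_png_header(header))  # (600, 480)
```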

Flare-On 2015 Challenge #8 - Embedded Data is a PNG image

Pretty. This is certainly an indication we are on the right track, but no flare-on.com e-mail address or any text for that matter is visible, so we'll have to keep on digging. But we're going to rule out the easy stuff first.

Since PNG files can contain metadata chunks using predefined fields such as "Comment", "Title", "Author", etc., we'll try ImageMagick's "identify" tool (commonly available on Linux), which can dump metadata from image files:
    identify -verbose <file>
Unfortunately, the only properties we can gather from this otherwise standard-looking PNG image are that its dimensions are 600 x 480 and the color depth is 24-bit RGB (TrueColor). Since this image has many shades of varying forest colors, maybe something is camouflaged and would be viewable by adjusting the brightness, contrast and/or color balance? Loading the image in a photo editing tool like GIMP to play around with the colors doesn't make anything obvious stand out.
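If you don't have identify handy, the metadata check can be approximated by walking the PNG chunk list yourself. The sketch below is my own helper code, not part of any tool mentioned here. It only reads uncompressed tEXt chunks (it ignores zTXt and iTXt) and does not verify the chunk CRCs:

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def iter_png_chunks(data: bytes):
    """Yield (chunk_type, chunk_data) for every chunk in a PNG byte string."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos + 8 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        yield chunk_type, data[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC
        if chunk_type == b"IEND":
            break

def text_metadata(data: bytes) -> dict:
    """Collect keyword/value pairs from uncompressed tEXt chunks only."""
    found = {}
    for chunk_type, chunk_data in iter_png_chunks(data):
        if chunk_type == b"tEXt":
            # A tEXt chunk is "keyword", a NUL separator, then the text.
            keyword, _, value = chunk_data.partition(b"\x00")
            found[keyword.decode("latin-1")] = value.decode("latin-1")
    return found
```

Calling text_metadata() on the decoded file's bytes returns a dictionary of any "Comment", "Title" or similar fields; an empty dictionary means there is no tEXt metadata to find.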

The next logical place to look is the pixels themselves! For years, programs have been developed that allow data to be hidden directly in images. In fact, an entire field of research known as steganography is dedicated to this endeavor; the term has been around for over 500 years and the concept dates back millennia. The basic concept is to hide data in plain sight, so as not to arouse suspicion or curiosity. We are suspicious of this particular image because there is nowhere else to look, but we might not be if we were looking at this image amongst an entire website of text and multiple images, for example. Steganography allows an embedded secret to be concealed and pass for some other innocuous purpose. One example is to hide a word in a paragraph by using the first letter of each sentence in that paragraph to make up a new word. The encoding pattern is application defined, as it can be anything. The specific knowledge of how the data is hidden becomes the key or at least a component of the key, but keys like this fall under the "Security through Obscurity" principle.

When I first decoded and viewed the forest image in this challenge, I had never dealt with steganography before. A quick internet search turned up the term as well as the basic concept of dedicating one or more bits from the original pixel color values to bits of the data being concealed. As you can guess, if the secret data is just "plopped" where the pixel data goes, the original image is destroyed at least where it was overwritten, and tampering becomes obvious. If we instead "stripe" the data bits amongst many more bits of the pixel data, we might have a shot at keeping the image looking mostly the same. The fewer bits of data hidden per pixel, the less likely the human eye is to detect color alterations in the resulting image. The tradeoff is that a single byte of the secret data must span several bytes of pixel values, but that isn't a problem for your average digital picture. At this resolution, the challenge's forest scene image stores 288,000 pixels which, using a simple scheme of one hidden bit per pixel, would be enough to hide about 35k of data directly on top of the data already present (slight pixel changes might slightly alter the final zlib-compressed pixels, but not significantly).
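The capacity arithmetic is easy to check. This sketch uses the 600 x 480 dimensions reported for the image; the function name is hypothetical:

```python
WIDTH, HEIGHT = 600, 480   # dimensions reported for the forest image
CHANNELS = 3               # 24-bit RGB: one byte each for red, green and blue

def lsb_capacity_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Whole bytes that fit when hiding bits_per_pixel bits in every pixel."""
    return width * height * bits_per_pixel // 8

pixels = WIDTH * HEIGHT
one_bit = lsb_capacity_bytes(WIDTH, HEIGHT, 1)            # 1 hidden bit per pixel
three_bits = lsb_capacity_bytes(WIDTH, HEIGHT, CHANNELS)  # 1 bit per color byte

print(pixels, one_bit, three_bits)  # 288000 36000 108000
```

So one bit per pixel gives 36,000 bytes (about 35k), while using the low bit of every color byte triples that to 108,000 bytes.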

For example, you could dedicate the lowest bit of every color component in every pixel to store the secret data. Because a 24-bit RGB pixel consists of three distinct 8-bit values (one each for the intensity of the red, green and blue components of the combined pixel color), you could hide 3 bits of data within each 24-bit pixel, amounting to 3 complete bytes per 24 bytes of pixel data (8 pixels). The positions of the secret data bits are denoted by a "D" below, in the least significant bit position of each of the RGB color components for 8 pixels:
/--------pixel #1--------\ /--------pixel #2--------\ /--------pixel #3--------\ /--------pixel #4--------\ /--------pixel #5--------\ /--------pixel #6--------\ /--------pixel #7--------\ /--------pixel #8--------\
       D        D        D        D        D        D        D        D        D        D        D        D        D        D        D        D        D        D        D        D        D        D        D        D
This seems like it would only retain 7/8ths (87.5%) of the original image since 1 bit was stolen/overwritten for every 8 bits, but that assumption is based on every bit being treated equally. Because only the "least significant" bit is being utilized, the color component's value in reality is only being altered by at most 1 out of 256 possible values. This would seem to retain 99.6% of the color values from the original image. Additionally, we can assume as a general rule that there is a 50/50 chance that the bit being used to encode the data is the same as the original bit from the unaltered image! Given that the 3 color components combine to provide 16.7 million possible pixel colors, we can use this scheme to hide data in just about any image without a human being able to detect it.
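To see how small the damage is, here is a minimal sketch of the embedding side. The carrier values are made up, and the bit order (most significant secret bit first) is my assumption rather than anything the challenge dictates:

```python
def embed_lsb(carrier: bytes, secret: bytes) -> bytes:
    """Overwrite the LSB of each carrier byte with successive bits of secret."""
    # Assumption: secret bits are written most significant bit first.
    bits = [(byte >> shift) & 1 for byte in secret for shift in range(7, -1, -1)]
    if len(bits) > len(carrier):
        raise ValueError("carrier too small for secret")
    out = bytearray(carrier)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # clear the low bit, then set it to the data bit
    return bytes(out)

carrier = bytes(range(40, 64))       # 24 color bytes = 8 RGB pixels (made-up values)
stego = embed_lsb(carrier, b"MZ\x90")  # hide 3 bytes in 8 pixels

max_delta = max(abs(a - b) for a, b in zip(carrier, stego))
unchanged = sum(a == b for a, b in zip(carrier, stego))
print(max_delta, unchanged, len(carrier))
```

No color byte ever moves by more than 1, and some of them don't change at all because the original low bit already matched the data bit.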

Although my own research turned up additional, more elaborate steganography schemes, it makes sense to start simple and go elaborate if you fail to find anything. I began with the simplest method of unconditionally extracting every least significant bit from the pixel data and combining those bits for analysis in a hex editor, starting with pixel #1. If I didn't get meaningful results, I would have attempted more elaborate schemes, looking for size markers and/or starting the bit gathering from different positions, since the wrong starting bit position would throw every byte afterwards out of sync. Luckily for me, however, the simplest method was indeed how the secret was encoded. The resulting binary data definitely stood out, but before I get into that, my C++ source code is HERE. I used libpng to handle the conversion of PNG file data to raw pixel data, so I could just loop through the color bytes accumulating the bits. You must link the code with the free libpng and zlib libraries. I used libpng 1.2.52 (11/2014) and zlib 1.2.4 (3/2010). Contact me if you want to download the entire "standalone" project, including all the files necessary so you don't have to hunt them all down and fiddle with the different versions of the dependencies.
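For readers who don't want to build the C++ project, the core loop can be sketched in a few lines. This is not my pngsteggo source, just an illustration under two assumptions: the PNG has already been decoded to raw color bytes in R, G, B order by some library, and the first extracted bit is the most significant bit of each output byte:

```python
def extract_lsb(pixel_bytes: bytes) -> bytes:
    """Pack the low bit of every color byte into output bytes (first bit = MSB)."""
    out = bytearray()
    usable = len(pixel_bytes) - len(pixel_bytes) % 8  # ignore a trailing partial byte
    for i in range(0, usable, 8):
        value = 0
        for byte in pixel_bytes[i:i + 8]:
            value = (value << 1) | (byte & 1)  # shift in the next hidden bit
        out.append(value)
    return bytes(out)

# Made-up demo data: 16 color bytes whose low bits spell out "MZ".
demo_pixels = bytes(
    0x80 | ((byte >> shift) & 1) for byte in b"MZ" for shift in range(7, -1, -1)
)
print(extract_lsb(demo_pixels))  # b'MZ'
```

If the output had looked like noise, the next things to vary would be the bit order, the channel order, and the starting offset.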

Running the pngsteggo tool (built from the source code above) to produce "decoded.bin" looked like this:

Flare-On 2015 Challenge #8 - Running Custom LSB Pixel Extraction Tool

Dumping the first 128 bytes of decoded.bin:
00000000  4d 5a 90 00 03 00 00 00 04 00 00 00 ff ff 00 00 b8 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00  MZ......................@.......
00000020  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b0 00 00 00  ................................
00000040  0e 1f ba 0e 00 b4 09 cd 21 b8 01 4c cd 21 54 68 69 73 20 70 72 6f 67 72 61 6d 20 63 61 6e 6e 6f  ........!..L.!This program canno
00000060  74 20 62 65 20 72 75 6e 20 69 6e 20 44 4f 53 20 6d 6f 64 65 2e 0d 0d 0a 24 00 00 00 00 00 00 00  t be run in DOS mode....$.......
This should look familiar to you by now. The author embedded an executable inside the PNG data! Since we unconditionally extracted the data from every byte of the PNG image, we probably extracted more than just the PE image. Unless this extra data was specifically encoded for yet another purpose, it is just the low bits of the remaining image data that was never used to hide anything. Rather than manually parsing the PE headers, I recommend a PE tool that will show you whether you have data past the image-end of the file, and how much. The non-free PE Explorer tool will do it for you, and my bytepatch tool will also calculate it (running in info mode). Running decoded.bin through bytepatch reveals that we have extracted 105,440 bytes of data beyond the executable itself:

Flare-On 2015 Challenge #8 - Identifying extraneous image data past hidden PE

To "trim the fat", select every byte from and including offset 0xA00 to the end of the file using your favorite hex editor, delete the selection, and save. Renaming the trimmed file to decoded.exe, and running it yields the following output:

Flare-On 2015 Challenge #8 - Running decoded EXE solution

The solution was buried 4 layers deep, offering a fun exercise in basic steganography. The solution's plaintext string was also visible in a hex dump of decoded.exe. We may have paid more attention to the trailing image data had we hit a brick wall with the PE image (such as if it was a decoy).

Note: You can skip the step of trimming the file in the hex editor, as the executable will still run fine, but I like to keep things clean. A reverse engineer should know how to manually find the boundaries of a PE image. There is no on-disk file size in the headers (SizeOfImage describes the in-memory size), but it can be calculated from the section table: take the section with the largest file offset (usually the last one) and add its raw data size to that offset. The Windows loader (at least on XP) doesn't map any data into memory outside of the PE image boundaries, regardless of the executable's size. Even if it did, the extra data wouldn't break anything that didn't expect it to be there. I'd guess Microsoft retained this behavior for Windows 7 and up.
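The boundary calculation described in the note can be sketched as follows. The function name is mine and this is not how bytepatch is implemented. It only considers section raw data, so it would under-report a file that carries a certificate table or other legitimate trailing data:

```python
import struct

def pe_file_end(data: bytes) -> int:
    """Return the file offset just past the last section's raw data."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)  # offset of the PE header
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    (num_sections,) = struct.unpack_from("<H", data, e_lfanew + 6)
    (opt_size,) = struct.unpack_from("<H", data, e_lfanew + 20)
    # Section table follows the 4-byte signature, 20-byte COFF header
    # and the optional header.
    table = e_lfanew + 24 + opt_size
    end = 0
    for i in range(num_sections):
        # SizeOfRawData and PointerToRawData sit at offsets 16 and 20
        # of each 40-byte section header.
        raw_size, raw_ptr = struct.unpack_from("<II", data, table + 40 * i + 16)
        end = max(end, raw_ptr + raw_size)
    return end
```

With the whole of decoded.bin read into a variable called data, data[:pe_file_end(data)] is the trimmed executable and len(data) - pe_file_end(data) is the amount of trailing image residue.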

The official solution describes the same basic approach taken in this tutorial, but as usual, a Python script was used to do the data extraction as opposed to my C++ solution. It's obvious the FLARE guys really like Python!

The author also stressed an important point I hadn't realized in my ad-hoc steganography research. The PNG format is good for steganography because of its lossless algorithm, meaning the stored pixel values are retained exactly after being compressed, decompressed and displayed once again. Steganography (at least the variation described here) doesn't work well for the more common JPEG format because JPEG employs a lossy algorithm, and losing a bit "here and there" is probably unacceptable for storing most secrets, at least without more redundancy built into the encoding method. The take-away from all of this is that a reverse engineer might be more likely to suspect steganography while staring down a PNG rather than a JPEG.

The significance of this challenge's name was never apparent to me, even after it was solved. The letters all fall within the same row on a typical US keyboard which could mean nothing. The least significant bits of each character produce the caret "^" character in one direction and the "=" character in the other direction. Anyone know what it means?
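If you want to reproduce that observation about the name, it takes only a few lines:

```python
name = "gdssagh"

# Low bit of each character's ASCII code, in the order the letters appear.
bits = "".join(str(ord(ch) & 1) for ch in name)

forward = chr(int(bits, 2))         # read the 7 bits left to right
backward = chr(int(bits[::-1], 2))  # read the same bits right to left

print(bits, forward, backward)  # 1011110 ^ =
```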

Response from Im_in_ur_p1cs@flare-on.com:
Subject: FLARE-On Challenge #8 Completed!
From: Im_in_ur_p1cs@flare-on.com
To: <HIDDEN>
Date: Thu, 20 Aug 2015 04:54:44 -0400

I find it troubling that you are still solving these with ease. I hope you use your powers for good and not evil. I have attached the next challenge. I am obligated to inform you by law that the password to the zip archive is "flare". You are so close. If you can believe it, you can achieve it!

-FLARE

attachment_filename="4568CB1948CCD11DDB9B90359F7DC79A.zip"
