Although gdssagh begins as a standard Win32 console application, this challenge soon becomes more about
what's embedded within the application than the application itself.
A hex dump of the first few bytes of the file reveals it is a Windows PE executable image.
Renaming it to .EXE and launching gdssagh.exe from the command line results in the message:
STATIC ANALYSIS OF EXECUTABLE:
And seeking we are... Running the executable under OllyDbg results in the process being
terminated before we can get to the entry point, and I was unclear as to why. I noticed the section
characteristics seemed correct, but the image checksum was wrong... an indication the file may have been altered after it
was linked. Fixing the checksum to the correct value or zeroing it out, however, didn't make it debuggable.
Anyway, let's see what IDA can tell us about the executable. At the executable
entry point (401000), we can see the 3 instructions responsible for printing the "seeks" message followed by an
unconditional jump. The code jumps over the next 834,962 bytes (4CCDA5 minus 401013) to the end of the function where ExitProcess()
is called. The "code" being jumped over doesn't disassemble well, as it
contains instructions like OUTS that are normally never seen in 32-bit User Mode code. We can safely guess this
is a chunk of data.
An executable with nothing to hide would normally store data in the .rsrc or
.data sections, so seeing a chunk of data of almost 1 MB in between code (in the .text section) is certainly suspicious.
Let's save off the embedded data into a separate file to have a better look. Click on the line at the beginning
of the data at 401013,
press the <HOME> key to go to the beginning of the line and press ALT+l (lowercase L) to begin a selection. With the
selection started, double-click the jmp's target "loc_4CCDA5" at address 40100E (a few lines above). This quickly brings us to
the first instruction past the end of the data. Now use the arrow keys to ensure that ONLY the bytes being
jumped over are included within the selection and none of the code, ending in the byte value 0x3D. Now go to "Edit" ->
"Export data..." to enter the export dialog. Select the "Raw bytes" radio-button and type a filename of your
choosing to save the data.
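A quick sanity check on the export: the saved file should be exactly as long as the distance between the first data byte and the jump target. This is only a minimal sketch using the two addresses quoted above; the helper name is my own, and comparing the result against the size of your exported file is left to you.

```python
# Addresses as quoted from the IDA listing in this write-up.
DATA_START = 0x401013   # first byte after the jmp at 0x40100E
JMP_TARGET = 0x4CCDA5   # loc_4CCDA5, first instruction past the data


def region_size(start: int, end: int) -> int:
    """Number of bytes in the half-open address range [start, end)."""
    return end - start


expected = region_size(DATA_START, JMP_TARGET)
print(f"exported file should be {expected:,} bytes (0x{expected:X})")
```

If the exported file is a byte or two longer or shorter than this, the selection probably picked up part of the surrounding code or missed the trailing 0x3D.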
WHAT KIND OF DATA IS THIS?
Viewing this data in your favorite hex editor shows only alphanumeric
characters, the plus and forward-slash symbols and CRLFs. Seeing one or more trailing "=" character(s)
at the end of the data is a strong indicator that this is Base64 data.
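If you'd rather not eyeball the character set, the same check and the decode step can be sketched with Python's standard base64 module. The function names and the sample payload are mine and purely illustrative; the only assumption about the exported data is what is described above (standard alphabet, CRLF line breaks, optional "=" padding).

```python
import base64
import re

# Standard Base64 alphabet plus line breaks, then up to two '=' of padding.
_B64_RE = re.compile(rb"[A-Za-z0-9+/\r\n]*={0,2}[\r\n]*")


def looks_like_base64(blob: bytes) -> bool:
    """True if every byte fits the Base64 alphabet/padding/line-break pattern."""
    return bool(blob) and _B64_RE.fullmatch(blob) is not None


def decode_base64(blob: bytes) -> bytes:
    """Decode Base64 text that may be wrapped with CR/LF line breaks."""
    # Strip line breaks first so validate=True can reject any other stray byte.
    compact = blob.replace(b"\r", b"").replace(b"\n", b"")
    return base64.b64decode(compact, validate=True)


def hex_dump(data: bytes, count: int = 24) -> str:
    """Space-separated hex of the first `count` bytes."""
    return " ".join(f"{b:02X}" for b in data[:count])


# Illustrative payload only; substitute the bytes of your exported file.
sample = base64.encodebytes(b"any binary payload").replace(b"\n", b"\r\n")
if looks_like_base64(sample):
    print(hex_dump(decode_base64(sample)))
```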
Running the file through a Base64 decoder yields binary data whose first 24 bytes in a hex dump look like this:
Even if you didn't know much about image file headers, the fact that we can see the ASCII characters
"PNG" provides a clue this is a PNG (Portable Network Graphics) image file. Renaming the file to .PNG and loading it in your
favorite image viewer or internet browser results in the exact image shown here:
SOMETHING IS HIDDEN HERE:
Pretty. This is certainly an indication we are on the right track but no flare-on.com e-mail address or any
text for that matter is visible, so we'll have to keep on digging. But we're going to rule out the easy stuff first.
Since PNG files can contain metadata chunks using predefined keywords such as "Comment", "Title", "Author", etc., we'll
try ImageMagick's "identify" tool on Linux, which dumps metadata from image files:
identify -verbose <file>
Unfortunately, the only properties we can gather from this otherwise standard-looking PNG image
are that its dimensions are 600 x 480 and the color depth is 24-bit RGB (TrueColor). Since this image has many shades
of varying forest colors, maybe something is camouflaged and would be viewable by
adjusting the brightness, contrast and/or color balance? Loading the image in a photo editing tool like GIMP
to play around with the colors doesn't make anything obvious stand out.
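The dimensions and color depth reported by identify come straight from the PNG's IHDR chunk, and any "Comment"-style metadata would live in tEXt chunks, so both can be double-checked by walking the chunk list by hand. The sketch below is my own illustration, not part of any tool mentioned here: the function names are hypothetical, CRCs are not verified, and the demo input is a synthetic header-only stand-in rather than the real forest image.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_chunks(png: bytes):
    """Yield (type, data) for each chunk in a PNG byte string (CRC not checked)."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos + 8 <= len(png):
        length, ctype = struct.unpack(">I4s", png[pos:pos + 8])
        yield ctype, png[pos + 8:pos + 8 + length]
        pos += 12 + length          # length + type + data + CRC
        if ctype == b"IEND":
            break


def describe(png: bytes) -> dict:
    """Collect the IHDR basics and any tEXt keyword/value pairs."""
    info = {"text": {}}
    for ctype, data in iter_chunks(png):
        if ctype == b"IHDR":
            width, height, depth, color = struct.unpack(">IIBB", data[:10])
            info.update(width=width, height=height,
                        bit_depth=depth, color_type=color)
        elif ctype == b"tEXt":
            key, _, value = data.partition(b"\x00")
            info["text"][key.decode("latin-1")] = value.decode("latin-1")
    return info


def make_chunk(ctype: bytes, data: bytes) -> bytes:
    """Build one chunk (length, type, data, CRC) for the synthetic demo below."""
    crc = zlib.crc32(ctype + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + ctype + data + struct.pack(">I", crc)


# Synthetic stand-in: IHDR fields as reported above (600 x 480, 8 bits per
# sample, color type 2 = RGB) plus an invented Comment to exercise tEXt.
demo = (PNG_SIGNATURE
        + make_chunk(b"IHDR", struct.pack(">IIBBBBB", 600, 480, 8, 2, 0, 0, 0))
        + make_chunk(b"tEXt", b"Comment\x00demo only")
        + make_chunk(b"IEND", b""))
print(describe(demo))
```

On the real file you would pass in the decoded PNG bytes instead of `demo`; an empty "text" dictionary would agree with identify finding nothing of interest.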
The next logical place to look is the pixels themselves! For years, there have been programs that have
been developed to allow data to be hidden directly in images. In fact, an entire field of
research is dedicated to this endeavor known as steganography, a term that's been around for over 500 years and
a concept dating back millennia.
The basic concept is to hide data in plain sight, so as not to arouse suspicion or curiosity. We are
suspicious of this particular image because there is nowhere else to look, but we might not be if we were
looking at this image amongst an entire website of text and multiple images for example.
Steganography allows for
an embedded secret to be concealed and pass for some other innocuous purpose. One example is to hide a word in a paragraph,
by using the first letter of each sentence in that paragraph to make up a new word. The encoding pattern is
application defined, as it can be anything. The specific knowledge of how the data is hidden
becomes the key or at least a component of the key, but keys like this fall under the "Security through Obscurity" principle.
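To make the first-letter example concrete, here is a tiny sketch of the decoding side. The cover paragraph is invented for illustration, and the sentence-splitting rule (split after ".", "!" or "?") is my own simplification.

```python
import re


def acrostic(paragraph: str) -> str:
    """Join the first letter of every sentence in the paragraph."""
    sentences = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return "".join(s[0] for s in sentences if s)


# Made-up cover text; it reads as a harmless description of a forest.
cover = ("Forests are quiet. Leaves fall slowly. All paths wind north. "
         "Rivers run cold. Every trail ends.")
print(acrostic(cover))  # prints "FLARE"
```

Anyone who knows the rule recovers the word instantly; anyone who doesn't just sees a paragraph, which is exactly the "obscurity" part.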
When I first decoded and viewed the forest image in this challenge, I had never dealt with steganography
before. A quick internet search turned up the term as well as the basic
concept of dedicating one or more bits from the original pixel color values to bits of the data.
As you can guess, if the secret data is just "plopped" where the pixel data goes, the original image is
destroyed at least where it was overwritten, and
tampering becomes obvious. If we instead "stripe" the data bits amongst many more bits of the pixel data, we might
have a shot at keeping the image looking mostly the same.
The fewer bits of data hidden per pixel, the less likely the human eye is to detect color alterations in the image.
The tradeoff is that a single byte of the secret data must span several bytes of pixel values, but that isn't a
problem for your average digital picture. This challenge's forest scene image stores 288,000
pixels which, using a simple scheme, would be enough to hide about 35k of data directly on top of the data
already present (slight pixel changes might slightly alter the size of the final zlib-compressed pixel data, but not significantly).
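The capacity arithmetic is easy to reproduce. In this sketch I'm assuming that "a simple scheme" means one hidden bit per pixel, which is what lands at roughly 35k; hiding one bit in each of the three color components triples that. The function name is hypothetical.

```python
def lsb_capacity_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Whole bytes that fit when `bits_per_pixel` bits are hidden in each pixel."""
    return (width * height * bits_per_pixel) // 8


pixels = 600 * 480                                  # 288,000 pixels
one_bit = lsb_capacity_bytes(600, 480, 1)           # assumption: 1 bit per pixel
three_bits = lsb_capacity_bytes(600, 480, 3)        # 1 bit per color component

print(f"{pixels:,} pixels")
print(f"1 bit/pixel : {one_bit:,} bytes ({one_bit / 1024:.1f} KiB)")
print(f"3 bits/pixel: {three_bits:,} bytes ({three_bits / 1024:.1f} KiB)")
```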
For example, you could dedicate the least significant bit of every color component in every pixel
to store the secret data. Because a 24-bit RGB pixel consists of three distinct 8-bit values (one for the
intensity of each of the red, green and blue components of the combined pixel color), you could hide 3 bits of data within each 24-bit pixel,
amounting to 3 complete bytes per 24 bytes of pixel data (8 pixels). That is, the positions of the secret data bits are
denoted by a "D" below in the least significant bit position of each of the RGB color components for 8 pixels:
/--------pixel #1--------\ /--------pixel #2--------\ /--------pixel #3--------\ /--------pixel #4--------\ /--------pixel #5--------\ /--------pixel #6--------\ /--------pixel #7--------\ /--------pixel #8--------\
BBBBBBBB GGGGGGGG RRRRRRRR BBBBBBBB GGGGGGGG RRRRRRRR BBBBBBBB GGGGGGGG RRRRRRRR BBBBBBBB GGGGGGGG RRRRRRRR BBBBBBBB GGGGGGGG RRRRRRRR BBBBBBBB GGGGGGGG RRRRRRRR BBBBBBBB GGGGGGGG RRRRRRRR BBBBBBBB GGGGGGGG RRRRRRRR
D D D D D D D D D D D D D D D D D D D D D D D D
This seems like it would only retain 7/8ths (87.5%) of
the original image since 1 bit was stolen/overwritten for every 8 bits, but that assumption is based on every bit
being treated equally. Because only the "least significant" bit is being utilized, the color component's value
in reality is only being altered by one out of 256. This would seem to retain 99.6% of the color
values from the original image. Additionally we can assume a
general rule that there is a 50/50 chance that the bit being used to encode the data is the same as the original
bit from the unaltered image! When the 3 color components combine to provide 16.7 million possible pixel
colors, we can safely hide data in just about any image using this scheme that a human won't be able to detect.
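Here is a minimal sketch of that scheme operating on a flat run of color bytes, to show both directions and to confirm that no component ever moves by more than 1. The function names are mine, and the bit order (most significant bit of each secret byte first) is an assumption made for the illustration; a real encoder could just as well pick the opposite order.

```python
def embed_lsb(pixels: bytes, secret: bytes) -> bytes:
    """Overwrite the LSB of successive color bytes with the bits of `secret`."""
    if len(secret) * 8 > len(pixels):
        raise ValueError("cover data too small for the secret")
    out = bytearray(pixels)
    for i in range(len(secret) * 8):
        bit = (secret[i // 8] >> (7 - i % 8)) & 1   # assumption: MSB first
        out[i] = (out[i] & 0xFE) | bit
    return bytes(out)


def extract_lsb(pixels: bytes, nbytes: int) -> bytes:
    """Rebuild `nbytes` bytes from the LSBs of the first nbytes * 8 color bytes."""
    out = bytearray(nbytes)
    for i in range(nbytes * 8):
        out[i // 8] = (out[i // 8] << 1) | (pixels[i] & 1)
    return bytes(out)


# 8 pixels' worth of made-up color bytes (24 bytes) carry exactly 3 secret bytes.
cover = bytes(range(40, 64))
secret = b"\x12\x34\x56"
stego = embed_lsb(cover, secret)

changed = sum(a != b for a, b in zip(cover, stego))
worst = max(abs(a - b) for a, b in zip(cover, stego))
print(extract_lsb(stego, 3) == secret, changed, worst)
```

The `changed` count illustrates the 50/50 point: only the components whose original low bit differed from the secret bit are touched at all, and `worst` never exceeds 1.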
Although my own research turned up additional more elaborate steganography schemes, it makes sense to start
simple and go elaborate if you fail to find anything. I began with
the simplest method of unconditionally extracting every least significant bit from the pixel data and combining
those bits for analysis in a hex editor, starting with pixel #1.
If I didn't get meaningful results, I would have attempted more elaborate schemes,
looking for size markers and/or starting the bit gathering from different positions, since the wrong
starting bit position would throw every byte afterwards out of sync.
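The "different starting positions" fallback could be sketched as a brute-force scan: gather the low bits starting at each candidate color byte in turn and stop when the output begins with a marker you expect. Everything here is hypothetical scaffolding of my own (names, the 64-position search limit, MSB-first bit order), and the demo data is synthetic.

```python
def collect_lsbs(pixels: bytes, start: int = 0) -> bytes:
    """Pack the LSB of each color byte from `start` onward into bytes, MSB first."""
    bits = [b & 1 for b in pixels[start:]]
    out = bytearray()
    for i in range(0, len(bits) - 7, 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


def find_start(pixels: bytes, magic: bytes, max_offset: int = 64):
    """Return the first starting position whose output begins with `magic`."""
    for offset in range(max_offset):
        if collect_lsbs(pixels, offset).startswith(magic):
            return offset
    return None


# Synthetic cover: 40 color bytes with the marker b"MZ" hidden from byte 5 on.
demo = bytearray(b"\x80" * 40)
marker_bits = [(byte >> (7 - k)) & 1 for byte in b"MZ" for k in range(8)]
for n, bit in enumerate(marker_bits):
    demo[5 + n] |= bit

print(find_start(bytes(demo), b"MZ"))
```

Starting even one position early or late shifts every output byte, which is why a wrong guess produces garbage all the way down rather than a slightly damaged result.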
Lucky for me however, the simplest method was indeed how the secret was encoded. The resulting binary data
definitely stood out, but before I get into that, my C++ source code is HERE.
I used libpng to handle the conversion of PNG file data to raw pixel data, so I could just loop through the
color bytes accumulating the bits. You must link the code with the free libpng and zlib libraries. I used libpng 1.2.52 (11/2014) and zlib 1.2.4
(3/2010). Contact me if you want to download the entire "standalone" project, including all the files necessary
so you don't have to hunt them all down and fiddle with the different versions of the dependencies.
FINDING THE PASSWORD:
Running the pngsteggo tool (from built source code above) to produce "decoded.bin" looked like this:
This should look familiar to you by now. The author embedded an executable inside the PNG data! Since we
unconditionally extracted the data from every byte of the PNG image, we probably extracted more than just the PE
image. If this extra data wasn't specifically encoded for yet another purpose, it represents bits of the remaining
image data never used and was just extra space.
Rather than manually parsing the PE headers, I recommend a PE tool that will show you if and how much data
you have past the image-end of the file. The non-free PE Explorer tool will do it for you, and my
bytepatch tool will also calculate it (running in info
mode). Running decoded.bin through bytepatch reveals that we have extracted 105440 bytes of data beyond the
end of the PE image.
To "trim the fat", select every byte from and including offset 0xA00 to the end of the file using your favorite
hex editor, delete the selection, and save. Renaming the trimmed file to decoded.exe, and running it yields the following output:
The solution was buried 4 layers deep, offering a fun exercise in basic steganography.
The solution's plaintext string was also visible in a hex dump of decoded.exe.
We may have paid more attention to the trailing image data had we hit a brick wall with the PE image (such as if
it was a decoy).
Note: You can skip the step of trimming the file in the hex editor, as the executable will still
run fine, but I like to keep things clean. A reverse engineer should know how to manually find the
boundaries of a PE image. That is, there is no on-disk image size in the headers, but it can be calculated from the section table: take the section with the largest
(usually the last) raw file offset and add the size of its raw data. The Windows loader (at least on XP)
doesn't map any data into memory outside of the PE image boundaries, regardless of the executable's size.
Even if it did, it wouldn't break anything that didn't expect it to be there. I'd guess Microsoft
retained this behavior for Windows 7 and up.
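For anyone who wants to try the manual calculation, here is a rough sketch of it. This is not how bytepatch or PE Explorer work internally (I'm only illustrating the idea), the function names are hypothetical, and it deliberately ignores things like an attached certificate table. The demo headers are a synthetic toy with an invented layout, not the headers of decoded.exe.

```python
import struct


def pe_file_image_end(data: bytes) -> int:
    """File offset just past the raw data of the section that ends last."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ header")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    (num_sections,) = struct.unpack_from("<H", data, e_lfanew + 6)
    (opt_size,) = struct.unpack_from("<H", data, e_lfanew + 20)
    table = e_lfanew + 24 + opt_size            # first section header
    end = table + 40 * num_sections             # at minimum, the headers
    for i in range(num_sections):
        # SizeOfRawData and PointerToRawData sit at +16 and +20 in each header.
        raw_size, raw_ptr = struct.unpack_from("<II", data, table + 40 * i + 16)
        if raw_size:
            end = max(end, raw_ptr + raw_size)
    return end


def toy_pe() -> bytes:
    """Synthetic header set for the demo (invented layout, not runnable)."""
    buf = bytearray(0x600)
    buf[0:2] = b"MZ"
    struct.pack_into("<I", buf, 0x3C, 0x80)             # e_lfanew
    buf[0x80:0x84] = b"PE\x00\x00"
    struct.pack_into("<H", buf, 0x80 + 6, 1)            # NumberOfSections
    struct.pack_into("<H", buf, 0x80 + 20, 0xE0)        # SizeOfOptionalHeader
    table = 0x80 + 24 + 0xE0
    buf[table:table + 8] = b".text\x00\x00\x00"
    struct.pack_into("<II", buf, table + 16, 0x400, 0x200)  # raw size, raw ptr
    return bytes(buf)


padded = toy_pe() + b"\xAA" * 100               # toy image plus trailing junk
end = pe_file_image_end(padded)
print(f"image ends at 0x{end:X}, {len(padded) - end} extra bytes")
```

Slicing the file at the returned offset (`data[:end]`) is the scripted equivalent of the hex-editor trim.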
The official solution describes the same basic approach taken in
this tutorial, but as usual, a Python script was used to do the data extraction as opposed to my C++ solution.
It's obvious the FLARE guys really like Python!
The author also stressed an important point I hadn't realized in my ad-hoc steganography research. The PNG format is good for steganography
because of its lossless algorithm, meaning the stored pixel values are retained exactly after being compressed,
decompressed and displayed once again. Steganography (at least
the variation described here) doesn't work well for the more common JPEG format because JPEG employs a lossy
algorithm and losing a bit "here and there" is probably unacceptable to store most secrets, at least without more
redundancy built-in to the encoding method. The take-away from all of this
is that a reverse engineer might be more likely to suspect steganography while staring-down a PNG rather than a JPEG.
The significance of this challenge's name was never apparent to me, even after it was solved. The
letters all fall within the same row on a typical US keyboard which could mean nothing.
The least significant bits of each character produce the caret "^" character in one direction and the
"=" character in the other direction. Anyone know what it means?
Response from Im_in_ur_p1cs@flare-on.com:
Subject: FLARE-On Challenge #8 Completed!
Date: Thu, 20 Aug 2015 04:54:44 -0400
I find it troubling that you are still solving these with ease. I hope you use your powers for good and not evil. I have attached the next challenge. I am obligated to inform you by law that the password to the zip archive is "flare".
You are so close. If you can believe it, you can achieve it!