This blog post was authored by Maria Rigaki, on 2018-09-20
During DEFCON 26 the AI Village hosted a Jeopardy-style CTF with challenges related to AI/ML and security. I thought it would be fun to create a challenge for it, and I had an idea that revolved around Denoising Autoencoders (DA). The challenge was named “Too much noise”, but unfortunately nobody solved it during the CTF. In this post I would like to present the idea behind it and how one could go about solving it.
In typical CTF style, one was given a file and was expected to find a flag and enter it to get the points. Here is the challenge image (on the left) and the original that produced it (on the right):
The challenge was produced by adding Gaussian noise to a clean image that contained the secret message and the flag. To make sure that the message was not easily readable by the naked eye, I set the noise factor to 0.80:
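    import numpy as np

    # flag: the clean image as a float array (assumed here to be scaled to [0, 1])
    noise_factor = 0.80
    noisy_base = flag + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=flag.shape)
    # Clip so that every pixel stays inside the valid [0, 1] range
    noisy_base = np.clip(noisy_base, 0., 1.)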
In order to ensure that it was not trivially easy to use other ways to get the flag, I tried a few online de-noising tools and a variety of techniques, none of which gave a good result.
Given the title of the challenge and the reference to noise, I was hoping that people would see that the challenge required de-noising the image to retrieve the flag. I was also hopeful that, given the AI theme, people would realize that they needed to use a Denoising Autoencoder. That was the first critical point to solving the challenge.
An Autoencoder in its most basic form is a compression algorithm that consists of two components: an encoder and a decoder. The encoder takes an input X and produces a compressed (or latent) representation z. The decoder takes as input the compressed representation z and produces X’ that aims to be a close reproduction of the original input. Usually the encoder and the decoder are neural networks.
The denoising version of the autoencoder takes a noisy version of the input and tries to reconstruct its non-noisy version.
The following figure shows a set of noisy images in the first row, their reconstructions by the autoencoder in the second row, and the original images in the third row. The noisy images were produced by adding Gaussian noise to the originals with a factor of 0.5.
I will not go into more details on how autoencoders work since there is plenty of material on the internet, but I would highly recommend the Keras blog on the subject, since it comes with very nice code examples. As a matter of fact, if one followed the Keras blog they would have had a very good basis for solving the challenge.
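To make the encoder/decoder description a bit more concrete, here is a toy denoising autoencoder. It is only a sketch under several assumptions of mine: it uses plain NumPy with a single dense hidden layer instead of the convolutional Keras model from the Keras blog, and it trains on synthetic 8x8 images (one bright row each) rather than on real characters. The names make_clean, add_noise and forward are made up for this illustration. The key point it shows is that the network is fed the noisy input but is scored against the clean one.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_clean(n, size=8):
    """Synthetic stand-in data: size x size images with one bright row each."""
    images = np.zeros((n, size, size))
    images[np.arange(n), rng.integers(0, size, n), :] = 1.0
    return images.reshape(n, -1)

def add_noise(x, noise_factor=0.5):
    """Add Gaussian noise and clip back to [0, 1], as in the challenge."""
    noisy = x + noise_factor * rng.normal(size=x.shape)
    return np.clip(noisy, 0.0, 1.0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

clean = make_clean(512)
n_in, n_hidden = clean.shape[1], 16
W1 = rng.normal(scale=0.1, size=(n_in, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, n_in))
b2 = np.zeros(n_in)

def forward(x):
    z = np.tanh(x @ W1 + b1)         # encoder: compressed representation z
    return z, sigmoid(z @ W2 + b2)   # decoder: reconstruction X'

learning_rate = 0.3
for step in range(3000):
    noisy = add_noise(clean)         # fresh noise at every step
    z, recon = forward(noisy)
    # Gradient of the binary cross-entropy between the reconstruction and
    # the CLEAN target, taken w.r.t. the decoder pre-activation.
    d_out = (recon - clean) / len(clean)
    d_hidden = (d_out @ W2.T) * (1.0 - z ** 2)
    W2 -= learning_rate * (z.T @ d_out)
    b2 -= learning_rate * d_out.sum(axis=0)
    W1 -= learning_rate * (noisy.T @ d_hidden)
    b1 -= learning_rate * d_hidden.sum(axis=0)

# Compare the error of the raw noisy input with the error after de-noising.
test_clean = make_clean(200)
test_noisy = add_noise(test_clean)
_, test_recon = forward(test_noisy)
noisy_mse = np.mean((test_noisy - test_clean) ** 2)
recon_mse = np.mean((test_recon - test_clean) ** 2)
print(f"MSE of noisy input: {noisy_mse:.4f}, after de-noising: {recon_mse:.4f}")
```

For the actual challenge one would swap the synthetic data for EMNIST characters and the dense layers for something closer to the convolutional model in the Keras blog, but the training loop keeps the same shape.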
The second critical point of the challenge was the dataset required to train the autoencoder. For those with some familiarity with machine learning, the MNIST handwritten digit dataset and the Extended MNIST (EMNIST) dataset are very popular in computer vision experiments. While MNIST contains only digits, EMNIST contains both characters and digits.
EMNIST images are 28x28 pixels and the input image was 336x280 pixels (width x height), which means 336 / 28 = 12 characters per line and 280 / 28 = 10 lines, for a total of 120 characters. So if one managed to train a Denoising Autoencoder, split the image into 120 characters and passed each noisy character through the trained model, they could get a de-noised version of the image.
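The splitting step can be sketched with a couple of NumPy reshapes. This is an illustration rather than the code from the solution notebook: split_into_tiles and join_tiles are names I made up, the random array is only a stand-in for the challenge image, and the commented-out model.predict call assumes a trained Keras-style model that is not defined here.

```python
import numpy as np

TILE = 28  # EMNIST character size in pixels

def split_into_tiles(image, tile=TILE):
    """Split a (height, width) image into a (rows * cols, tile, tile) stack.

    Tiles are ordered row by row, left to right.
    """
    height, width = image.shape
    rows, cols = height // tile, width // tile
    return (image[:rows * tile, :cols * tile]
            .reshape(rows, tile, cols, tile)
            .swapaxes(1, 2)
            .reshape(rows * cols, tile, tile))

def join_tiles(tiles, rows, cols):
    """Inverse of split_into_tiles: stitch the tiles back into one image."""
    tile = tiles.shape[1]
    return (tiles.reshape(rows, cols, tile, tile)
            .swapaxes(1, 2)
            .reshape(rows * tile, cols * tile))

# Stand-in for the challenge image: 280 pixel rows by 336 pixel columns.
noisy = np.random.default_rng(0).random((280, 336))

tiles = split_into_tiles(noisy)          # 120 noisy characters
# denoised = model.predict(tiles[..., np.newaxis])[..., 0]  # hypothetical model
restored = join_tiles(tiles, rows=10, cols=12)
print(tiles.shape, restored.shape)
```

Stitching the de-noised tiles back together with join_tiles gives an image of the same size as the challenge, which makes it easy to read the message line by line.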
The notebook with the full solution can be found in  and if we de-noise the bottom line of the challenge we get something similar to the image below which would hopefully lead to the flag.
While I enjoyed creating the challenge it was a bit disappointing that nobody managed to solve it. I think that with a couple of hints people would have been able to find their way to the solution and that is something that I regret not having in place. In any case, I hope the solution notebook will answer most questions around the challenge and if not, feel free to leave a comment below.
References & links
 EMNIST dataset