Steganalysis for Reversible Data Hiding Based on Neural Networks and Convolutional Neural Networks

Lossless data hiding is an attractive family of techniques that includes many reversible information hiding schemes. It makes it possible to restore the original image after the information has been extracted from the stego image. The stego image (the image carrying the secret data) is hard to detect. Many studies in this field have been published. Secret information is hidden in the pixel space, the frequency (cosine, wavelet) coefficient space, or the difference-image coefficient space. However, by meticulously comparing the cover image and the stego image in these spaces, one can detect abnormal signs. In a previous work, we produced a steganalytic technique based on analysis of the transform coefficient histogram, with a correct detection ratio between 88% and 92%. In this article, we propose another method, based on Neural Networks (NNs) and Convolutional Neural Networks (CNNs), to improve the detection ratio. The test results show correct detection rates of 96% for NNs and 94% for CNNs, which is better than our previous method. The proposed approach can also be applied to detect stego images in the spatial domain and in other frequency domains.


Introduction
Steganography is the science of hiding the existence of information. Its purpose is to convey a message secretly, so that no one except the sender and the receiver knows the message exists. Steganography is in effect a form of security by concealment. A digital image, video, sound, or any other file can be used as a "cover" to carry secret messages; once the message is embedded, the file is called a stego object.
Steganalysis is the opposite of steganography. It is the science of detecting messages hidden using steganography. Its main task is to distinguish between cover images and stego images.
In recent years, many lossless data hiding techniques have been proposed for stego images. Data can be embedded losslessly in the spatial domain [1,2,3] or in the transform domain [4,5]. Xuan et al. [5] proposed a method based on histogram shifting in the integer wavelet transform domain (the IWH method). This method hides messages in the high-frequency sub-bands of the integer wavelet coefficients.
In [7], we offered a new steganalytic method based on the integer wavelet transform that can detect stego images produced by Xuan's method. Besides detecting the stego image, the algorithm can reliably estimate the length of the embedded data. To increase the detection ratio, we also investigate another method based on Neural Networks (NNs) and Convolutional Neural Networks (CNNs). Recent CNN work has focused on computer vision problems such as recognition of 3D objects, natural images, and traffic signs [8,9,10], image denoising [11], and image segmentation [12]. Convolutional architectures also seem to benefit unsupervised learning algorithms applied to image data [13,14,15].
In the next section, we describe Xuan's steganography method. In sections 3 and 4, we review our previously proposed steganalytic method and introduce the new method based on NNs and CNNs. Our experimental results are presented in section 5. Finally, the conclusion is given in section 6.

Lossless Data Hiding based on Integer Wavelet Histogram Shifting
In this section, we describe Xuan's IWH algorithm introduced in [5]. This algorithm avoids distortion by hiding information in the integer wavelet domain. After the integer wavelet transform, the image is divided into four sub-bands. Xuan et al. hid the secret information in the three high-frequency sub-bands. The details of the IWH algorithm are summarized as follows. Suppose M bits of secret information are to be hidden in a high-frequency sub-band. The IWH algorithm performs the following steps:
Step 1: Choose the threshold T (T > 0) so that the number of coefficients in the range [-T, T] is greater than M. Set Peak = T.
Step 2: In the histogram of the wavelet coefficients, shift the columns whose value is greater than Peak to the right by one unit to create a zero column at position Peak + 1. Secret information is hidden at this location. Scan all the coefficients of the high-frequency sub-band: if a coefficient is equal to Peak and the secret bit is 1, add 1 to the coefficient so that it becomes Peak + 1; if the secret bit is 0, the coefficient does not change.
Step 3: Keeping the secret information already hidden, set Peak = -Peak and shift the columns whose value is smaller than Peak to the left by one unit to create a zero column at position Peak - 1 (that is, -Peak - 1 in terms of the previous Peak). Information is hidden at this location.
Step 4: If all M bits have been hidden, the algorithm stops and records the stop position S = Peak. Otherwise, set Peak = -Peak - 1 and go back to Step 2 to hide the remaining secret bits.
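The four steps above can be sketched in Python as follows. This is a minimal illustration and not the implementation of [5]: the function name iwh_embed, the use of NumPy, and the scan order of the coefficients are our assumptions. The sketch also checks for completion after every peak (which is consistent with positive stop positions such as S = 3), and it does not handle the case Peak = 0, which the steps above do not describe.

```python
import numpy as np

def iwh_embed(coeffs, bits, T):
    """Hide `bits` in integer coefficients; return (stego coefficients, stop position S)."""
    shape = np.shape(coeffs)
    flat = np.asarray(coeffs, dtype=np.int64).ravel().copy()
    bits = np.asarray(bits, dtype=np.int64)
    # Step 1: the range [-T, T] must hold more coefficients than there are bits.
    if T <= 0 or np.count_nonzero(np.abs(flat) <= T) <= bits.size:
        raise ValueError("threshold T is too small for this payload")
    peak, used = T, 0
    while True:
        sign = 1 if peak > 0 else -1
        # Steps 2 and 3: shift the columns beyond Peak outward by one unit,
        # which empties the column at Peak + sign.
        if sign > 0:
            flat[flat > peak] += 1
        else:
            flat[flat < peak] -= 1
        # Each coefficient equal to Peak carries one bit: a 1 moves it into
        # the empty column, a 0 leaves it unchanged.
        carriers = np.flatnonzero(flat == peak)
        n = min(carriers.size, bits.size - used)
        flat[carriers[:n]] += sign * bits[used:used + n]
        used += n
        # Step 4: stop once every bit is hidden and record S = Peak.
        if used == bits.size:
            return flat.reshape(shape), peak
        # Otherwise continue with -Peak, then with -Peak - 1.
        peak = -peak if peak > 0 else -peak - 1
        if peak == 0:
            # Assumption: the Peak = 0 case is outside this sketch.
            raise ValueError("payload exceeds the capacity handled by this sketch")
```

For example, embedding the bits 1, 0, 1 into the coefficients 0, 1, 2, 2, 3, -2, -2, -3, 1 with T = 2 uses the columns at 2 and -2 and stops with S = -2.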

Previous Steganalytic Methods
The IWH algorithm is a lossless data hiding algorithm, but if you compare the integer wavelet coefficient histograms of the original image and the IWH stego image, you will see a difference. This difference is the key to estimating the information hidden in the stego image.
We first analyze what happens during the embedding process through the following three experiments. In the first experiment, we use the Lena image of size 512 x 512 pixels (see Fig. 1 (b)) and a Logo image of size 128 x 56 pixels (see Fig. 1 (a)). After the integer wavelet transform, we calculate the histogram of the high-frequency sub-bands (see Fig. 2 (a)). We then embed the payload (the binary sequence of the Logo image) into the high-frequency sub-bands with T = 2 using the IWH method. We get S = -2 and recalculate the histogram of the high-frequency sub-bands, which is shown in Fig. 2 (d). The embedding proceeds in several steps: the first and second steps embed data at the points 2 and -2 (see Fig. 2 (b), (c)); because data still remain to be embedded, the process performs the third and fourth steps with T = -2 to embed data.
In the second experiment, we again use the original Lena image and the Logo watermark, now with T = 4, and get S = 3. In this case the histogram changes considerably, as shown in Fig. 3 (a).
In the third experiment, we use the same input with T = 6 and get S = -5. In this case the histogram changes clearly, as shown in Fig. 3 (b).
From the three experiments above, we find that the wavelet coefficient histogram changes slightly. In a natural image, the columns of each pair (h_i, h_-i) are approximately equal and symmetric about the h_0 column, and the column heights usually decrease as |i| grows (Fig. 2 (a)). Embedding breaks this natural property. From this analysis we can estimate the length of the information hidden in a stego image. The algorithm begins as follows:
Step 1: Set L = 0 (the length of the message) and scan the columns h_i (i > 0 and i <= the maximum over all integer wavelet coefficients of the high-frequency sub-bands). At the first column where the natural property is broken, stop scanning and set Peak = i as the first location for estimating the data length.
Applying the proposed algorithm to the three experiments, we can estimate the length of the information hidden in the image, as given in Table 1.
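The symmetry property used in this analysis can be checked numerically. The sketch below is only an illustration of that property and is not the estimation algorithm itself: the function name pair_asymmetry and the relative-difference measure are our assumptions, and no threshold for deciding that a column is abnormal is implied.

```python
import numpy as np

def pair_asymmetry(coeffs, max_i):
    """Relative difference |h_i - h_-i| / (h_i + h_-i) for i = 1..max_i."""
    flat = np.asarray(coeffs, dtype=np.int64).ravel()
    result = {}
    for i in range(1, max_i + 1):
        h_pos = int(np.count_nonzero(flat == i))    # column h_i
        h_neg = int(np.count_nonzero(flat == -i))   # column h_-i
        total = h_pos + h_neg
        # An empty pair is reported as perfectly symmetric.
        result[i] = abs(h_pos - h_neg) / total if total else 0.0
    return result
```

Values near 0 indicate the near-symmetric pairs expected in a natural image; larger values indicate pairs whose balance has been disturbed.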

Steganalysis based on Neural Networks
Based on the analysis in sections 2 and 3, we classify images with Artificial Neural Networks (NNs) and Convolutional Neural Networks (CNNs).
For NNs, the same architecture is used in all experiments, on 1000 cover images downloaded from [16,17] and 500 stego images created from the 1000 cover images. We transform all pixels of each cover image and stego image to the integer wavelet frequency domain, then calculate the histogram of the wavelet coefficients h = [h_-128, h_-127, ..., h_0, ..., h_126, h_127]. Each h_i is divided by max(h) so that it lies in the range [0, 1], which improves the accuracy of the network training process. We take the vector h as the feature vector for training the network, with each h_i feeding one input neuron. The network uses one hidden layer with 20 neurons and an output layer with two neurons. The sigmoid function produces the output value of each class, and the weights of the neurons are set so that the output error is as small as possible (Fig. 4).
For CNNs, we use CNNs as base classifiers [11]. The network consists of 6 layers: 1 input layer, 2 convolution layers, 2 max-pooling layers, and 1 output layer. The input layer consists of 16 x 16 neurons (corresponding to the 16 x 16 = 256 values of the wavelet coefficient histogram). The first hidden layer (the first convolution layer) consists of 6 maps of 12 x 12 neurons with 5 x 5 filter windows. The next hidden layer (max-pooling) takes the output of the first convolution layer and consists of 6 maps of 6 x 6 neurons using a 2 x 2 filter kernel. The next convolution layer consists of 12 maps of 6 x 6 neurons with a 5 x 5 kernel. The second max-pooling layer consists of 12 maps of 2 x 2 neurons using a 2 x 2 filter kernel. The output layer has one neuron per class (corresponding to the cover image class and the stego image class) (Fig. 5). We pick the trained CNN with the lowest validation error and evaluate it on the corresponding test set.
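The feature vector and the layer sizes described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names are ours, coefficients outside [-128, 127] are simply ignored, and the size computation assumes convolutions without padding and non-overlapping pooling. Under those assumptions the sizes 16, 12, and 6 quoted above are reproduced for the first convolution and pooling layers; the sizes quoted for the later layers would correspond to a different padding or kernel choice, which the text does not specify.

```python
import numpy as np

def histogram_features(coeffs):
    """256-bin histogram of coefficients in [-128, 127], scaled by its maximum."""
    flat = np.asarray(coeffs, dtype=np.int64).ravel()
    # Assumption: coefficients outside the 256-bin range are ignored.
    flat = flat[(flat >= -128) & (flat <= 127)]
    h = np.bincount(flat + 128, minlength=256).astype(np.float64)
    top = h.max()
    return h / top if top > 0 else h

def valid_size(size, kernel, stride=1):
    """Output width of an unpadded convolution or pooling window."""
    return (size - kernel) // stride + 1

def cnn_map_sizes(input_size=16, conv_kernel=5, pool_kernel=2):
    """Map widths after each of the two convolution/pooling stages."""
    sizes = [input_size]
    for _ in range(2):
        sizes.append(valid_size(sizes[-1], conv_kernel))
        sizes.append(valid_size(sizes[-1], pool_kernel, stride=pool_kernel))
    return sizes
```

The 256-element vector returned by histogram_features can be fed directly to the 256 input neurons of the NN, or reshaped to 16 x 16 with h.reshape(16, 16) for the CNN input layer.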

Experimental Results
Our image set contains 2088 images. They were downloaded from [15], [16] and taken with our digital camera; all images were then converted to grayscale with Photoshop CS2. From this set we build two data sets, which are used to test the detection and classification methods proposed in the sections above.
The first data set, used for detection by the estimation method, includes:
+ 2088 cover images
+ 2088 stego images, each hiding a randomly generated binary string of 2000 bits.
The second data set, used for the two neural networks, includes:
+ The training set: 1500 images, comprising 500 cover images and 1000 stego images obtained by embedding random secret binary sequences of 2000 bits and 6000 bits into the corresponding 500 original images with the IWH method.
+ The testing set: 3676 cover images and stego images, the latter embedded with random secret binary sequences of 2000 bits or 6000 bits.
We tested three scenarios for the two proposed approaches to detecting stego images produced by the IWH hiding technique:
+ Case 1: We estimate the information hidden in the wavelet coefficient domain of the images in the first data set, each of which hides 2000 bits, and obtain the estimation results shown in Fig. 5. The horizontal axis represents the image number and the vertical axis represents the embedded data length for the corresponding image.
According to Table 3, the average detection rate for the data set is 93.05% and the classification time is 0.38 seconds. However, the training time of CNNs is higher than that of NNs. Across the three test cases, the message estimation method gives lower detection results than the neural networks. In general, CNNs are rated better than NNs, but in this case NNs detect better than CNNs. This may be because the number of input layers of the CNN in this case is much lower than in [11]. In addition, the training time of CNNs is many times higher than that of NNs.

Conclusions
In this paper, we introduced two methods, using NNs and CNNs, to obtain better detection results. The correct detection rates of 96% for NNs and 94% for CNNs indicate the reliability of the methods and improve on the old method [7]. The old method [7] and the new method can be combined to classify cover images and stego images produced by the IWH method: first the new method (NNs or CNNs) is used for detection, then the old method is used to estimate the hidden information.
However, it is hard to detect stego images under the two factors shown in section 5. Note that many elements of these algorithms can be changed or replaced with other elements. This research can be used to detect stego images in the spatial, frequency, or other domains.