Summary: | With the development of multimedia processing technology, it has become much easier to manipulate and tamper with digital video without leaving visual clues. Motion JPEG (MJPEG) is one of the most popular video formats, in which each video frame or interlaced field of a digital video sequence is compressed separately as a JPEG image. Because compression is so common in digital video, a forger can split an MJPEG video into JPEG frames and apply powerful deblocking methods to cover up the traces of tampering. To the best of our knowledge, there is no existing method for the forensics of deblocking. In this paper, we propose a novel method to detect deblocking, which automatically learns feature representations within a deep learning framework. We first train a supervised convolutional neural network (CNN) to learn hierarchical features of deblocking operations from labeled patches in the training datasets. The first convolutional layer of the CNN serves as a preprocessing module that efficiently exposes the tampering artifacts. Then, we extract features for a whole image patch by patch, applying the CNN in a patch-sized sliding window that scans the image. The resulting image representation is condensed by a simple feature fusion technique, regional pooling, to obtain the final discriminative feature. Experimental results on several public datasets demonstrate the superiority of the proposed scheme.
|
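The sliding-window extraction and regional pooling steps described in the summary can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The patch size, stride, and number of regions are assumed values. The function names `placeholder_features`, `image_feature_map`, and `regional_pooling` are hypothetical. The per-patch extractor is a simple gradient-statistics stand-in for the trained CNN, which the summary does not specify. The pooling layout (a regular grid of regions, max or mean per region, then concatenation) is one plausible reading of "regional pooling", not a confirmed detail.

```python
import numpy as np


def placeholder_features(patch_pixels):
    """Stand-in for the trained CNN (assumption): gradient statistics of one patch."""
    p = patch_pixels.astype(np.float64)
    dh = np.diff(p, axis=1)  # horizontal first differences (crude high-pass filter)
    dv = np.diff(p, axis=0)  # vertical first differences
    return np.array([np.abs(dh).mean(), np.abs(dv).mean(), dh.std(), dv.std()])


def image_feature_map(image, patch=32, stride=16, extractor=placeholder_features):
    """Scan a 2-D grayscale image with a patch-sized sliding window.

    Returns an array of shape (n_rows, n_cols, d): one d-dimensional
    feature vector per window position.
    """
    h, w = image.shape
    rows = range(0, h - patch + 1, stride)
    cols = range(0, w - patch + 1, stride)
    return np.array(
        [[extractor(image[r:r + patch, c:c + patch]) for c in cols] for r in rows]
    )


def regional_pooling(fmap, regions=2, mode="max"):
    """Pool patch features over a regions x regions grid and concatenate the results."""
    n_rows, n_cols, d = fmap.shape
    if n_rows < regions or n_cols < regions:
        raise ValueError("feature map is smaller than the pooling grid")
    row_edges = np.linspace(0, n_rows, regions + 1).astype(int)
    col_edges = np.linspace(0, n_cols, regions + 1).astype(int)
    pooled = []
    for i in range(regions):
        for j in range(regions):
            block = fmap[row_edges[i]:row_edges[i + 1], col_edges[j]:col_edges[j + 1]]
            flat = block.reshape(-1, d)
            pooled.append(flat.max(axis=0) if mode == "max" else flat.mean(axis=0))
    return np.concatenate(pooled)


# Usage on a synthetic frame; a real pipeline would decode an MJPEG frame instead.
rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(128, 128)).astype(np.uint8)
fmap = image_feature_map(frame)
feature = regional_pooling(fmap, regions=2, mode="max")
```

The pooled vector has a fixed length (regions × regions × d) regardless of how many window positions the image yields, so a standard classifier can consume it for images of different sizes.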